






Blogs are different. They are made of blog-posts and not web-pages, so they have to be treated differently. The correct units when dealing with blogs are the blog-posts and their permalinks. Blog Post Analysis (BPA) is an attempt to build a platform for blog analytics by identifying and presenting the fundamental unit of blogs, the blog-post.


One marked difference between websites and weblogs is that while the fundamental unit of a website is a webpage, that of a weblog is a blog-post. But until now, all the tools and utilities that deal with weblogs have treated them as websites. For the various blog-related services, permalinks simply don't exist. This makes it harder to get to the information on a weblog than on a website. It's not right for a reference to point to the blog homepage or an archive page when it should point to the permalink.

The Problem:
This problem stems from the peculiar nature of weblogs. Blogs are websites with a difference. Unlike those of ordinary websites, their homepages are transient: they are updated frequently and hold content in reverse chronological order. Thus what you see there today may not be there tomorrow. When blog tools like news indexes, link trackers and search engines treat blogs as websites, they assume that the content they found on a blog's homepage will remain there, which of course doesn't happen.

For example:

  • A search engine indexes a blog's homepage and records that it has found certain links and keywords on that page.
  • Eventually the content of the homepage changes, and all the old content that the search engine had indexed is now available only in the archive pages.
  • A user performs a search on the search engine.
  • The results returned by the search engine still point to the blog homepage.
  • The user lands on the blog homepage expecting to find what they searched for, but doesn't find it there.
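The sequence above can be reproduced with a small simulation. This is only an illustrative sketch: the `Blog` class, the `crawl` function and the three-post homepage size are assumptions made for the example and are not part of Blog Post Analysis.

```python
from collections import deque

HOMEPAGE_SIZE = 3  # assumed number of posts a homepage shows at once


class Blog:
    """Toy blog whose homepage holds only the newest posts."""

    def __init__(self, homepage_url):
        self.homepage_url = homepage_url
        # Newest post first; the oldest post drops off when the deque is full.
        self.homepage = deque(maxlen=HOMEPAGE_SIZE)

    def publish(self, text):
        self.homepage.appendleft(text)

    def homepage_text(self):
        return " ".join(self.homepage).lower()


def crawl(blog):
    """Index the homepage as if it were a static webpage: keyword -> URL."""
    return {word: blog.homepage_url for word in blog.homepage_text().split()}


blog = Blog("http://example.com/")
blog.publish("permalinks matter")
index = crawl(blog)  # the search engine visits once

for text in ("first update", "second update", "third update"):
    blog.publish(text)  # the homepage moves on

# The index still sends the user to the homepage,
# but the keyword is no longer on that page.
stale = index["permalinks"] == blog.homepage_url
missing = "permalinks" not in blog.homepage_text()
```

Both `stale` and `missing` end up true: the index points at a page that no longer contains the match.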

The Solution:
To address this problem we need to understand it first. The content of a webpage resides at the URL of that webpage, but the content of a blog-post resides at its permalink. To associate a blog-post with the blog's homepage is erroneous.

The required correction is to associate a blog-post with its permalink. But for this to happen we need to sit up and acknowledge that weblogs are different from websites. We need to realise that when we index blogs we need to index the blog-posts and not merely webpages. The first step in this direction is to identify the posts of a blog and then associate these posts with their respective permalinks. This break-up of blogs into their fundamental units and handling them using these units will not only correct the error but will also open up new ways of leveraging and making sense of the blogworld.
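As a rough illustration of breaking a page into posts, the sketch below splits a blog page at each permalink and pairs the preceding text with that permalink. It is not the algorithm Blog Post Analysis uses. It assumes, hypothetically, that every post is followed by an anchor marked `class="permalink"`; real blog templates vary widely, and the names `Post`, `PostSplitter` and `split_posts` are invented for the example.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Post:
    permalink: str
    text: str


class PostSplitter(HTMLParser):
    """Collect text until a permalink anchor, then emit one Post."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._buffer = []
        self._in_permalink = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumed convention: the permalink anchor closes the post before it.
        if tag == "a" and attrs.get("class") == "permalink":
            text = " ".join(" ".join(self._buffer).split())
            if text and attrs.get("href"):
                self.posts.append(Post(attrs["href"], text))
            self._buffer = []
            self._in_permalink = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_permalink = False

    def handle_data(self, data):
        # Skip the permalink's own label (for example "#" or "link").
        if not self._in_permalink:
            self._buffer.append(data)


def split_posts(html):
    parser = PostSplitter()
    parser.feed(html)
    parser.close()
    return parser.posts
```

For example, `split_posts('<div>Hello. <a class="permalink" href="http://example.com/#p1">#</a></div>')` yields a single `Post` whose `permalink` is `http://example.com/#p1` and whose `text` is `Hello.`.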

We have built two applications using Blog Post Analysis. Both of these are available on the homepage.

Blog Post Search:
It searches blog-posts and returns them along with their permalinks, pointing you to the exact blog-post where the keywords matched. This type of search has two advantages. It makes it easier for you to locate information. And, more importantly, the blog search engine doesn't remain valid just for the current period - it can search the archives too. We have close to 300 blogs indexed, with a total of 6000+ posts to search. This number is increasing rapidly.
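A post-level search of this kind can be sketched as an inverted index keyed by permalink instead of by page URL. The code below is a minimal illustration and not the BlogStreet implementation; `tokenize`, `build_index` and `search` are hypothetical names, and matching is a plain all-words-must-appear lookup.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts):
    """posts: iterable of (permalink, text) pairs. Returns word -> permalinks."""
    index = defaultdict(set)
    for permalink, text in posts:
        for word in tokenize(text):
            index[word].add(permalink)
    return index


def search(index, query):
    """Return the permalinks of posts that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

Because each hit is a permalink, a result stays valid after the post has scrolled off the homepage and into the archives.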

RSS Generator:
Since we have a handle on each blog-post and its associated permalink, we can generate RSS feeds for blogs. This service is especially useful for non-techie blog owners hosting on services such as Blogspot that don't provide an RSS feed. There are other services which generate a dynamic RSS feed but require you to change the template code, which not many people are comfortable doing. It can also be useful if you are not the owner of a blog but want an RSS feed for it.
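Once posts and permalinks are known, producing a feed is mostly a matter of serialisation. The sketch below builds a small RSS 2.0 document with Python's standard `xml.etree.ElementTree`. The choice of RSS 2.0, the `build_rss` function and the generated channel description are assumptions made for illustration; the feeds the RSS Generator actually emits may differ.

```python
import xml.etree.ElementTree as ET


def build_rss(blog_title, blog_url, posts):
    """posts: iterable of (permalink, text) pairs. Returns the feed as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = blog_title
    ET.SubElement(channel, "link").text = blog_url
    ET.SubElement(channel, "description").text = f"Generated feed for {blog_title}"
    for permalink, text in posts:
        item = ET.SubElement(channel, "item")
        # Each item links to the post's permalink, never to the homepage.
        ET.SubElement(item, "link").text = permalink
        ET.SubElement(item, "guid", isPermaLink="true").text = permalink
        ET.SubElement(item, "description").text = text
    return ET.tostring(rss, encoding="unicode")
```

Note that nothing here requires access to the blog's template, which is why a feed can be built for a blog you do not own.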

Updates to blogs that are indexed in the search engine or have generated RSS feeds are tracked by checking pings to weblogs.com and blo.gs. In any case, they are updated at least once a day. For blogs that can't ping these notification services, there is an update form available to notify BlogStreet.

Please note that this is a beta version of Blog Post Analysis. This means a few existing things will be broken and a few features are still in the pipeline. This version won't be able to correctly identify the posts for blogs which have:

  • Permalinks before their posts
  • Permalinks in the title of their posts.

Therefore these kinds of blogs aren't eligible for inclusion in the search engine or RSS Generator for now.

The two applications, Blog Post Search and RSS Generator, can be looked at as proofs of concept. A set of XML-RPC web services is available for developers to build on the platform. They can use it to refine news indexes and link trackers and make them more relevant by basing them on blog-posts.
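For developers unfamiliar with XML-RPC, the sketch below shows what a request to such a service looks like on the wire, using Python's standard `xmlrpc.client`. The method name `example.getPosts` and its single blog-URL parameter are purely hypothetical placeholders; this page does not list the real method names, so consult the actual service documentation before calling anything.

```python
import xmlrpc.client

# Hypothetical method name, used only to illustrate the request format.
HYPOTHETICAL_METHOD = "example.getPosts"


def build_request(blog_url):
    """Serialise an XML-RPC call that passes one blog URL as its parameter."""
    return xmlrpc.client.dumps((blog_url,), methodname=HYPOTHETICAL_METHOD)


def parse_request(payload):
    """Decode a request payload back into (params, method_name)."""
    return xmlrpc.client.loads(payload)


payload = build_request("http://example.com/")
params, method = parse_request(payload)
```

In a real client, `xmlrpc.client.ServerProxy(endpoint_url)` would send the same kind of payload over HTTP; the endpoint URL is likewise something the service documentation would have to supply.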

One of the reasons behind this release is to test it against user feedback and incorporate what we learn into the final release. Your comments on the concept and the applications will be appreciated.


Update: Some Responses

  • Microdoc News - Blogstreet Takes Content Management to a New Level.
  • K-Praxis - "superimposing" an RSS feed onto an information source could lead to new ways of information extraction and discovery from various structured and semi-structured information sources
  • Scripting News
  • Anil Dash - anything that recognizes posts as the fundamental unit of weblogs is on a good track
  • Joi Ito - can sell it to Google so they can filter blog posts. ;-)
  • Steven Cohen - Now, blog software users have no excuse not to have an RSS Feed.
  • Moss Collum - Blog Post Analysis, from BlogStreet, is the most beautiful hack I have seen in... well, in a good long time.
  • Roland Tanglao - Complete with an API for developers. Very cool.
  • Paul Victor Novares - I'll be poking at this one for sure.
  • Diego Doval - the finer-grained view clearly will be more useful when doing analysis (and then, potentially, search). Very cool.
  • Austin Bloggers
  • mesh on mx

