The basic mechanism used in track // microsoft to cluster articles is similar to that used by Techmeme. A fixed set of blogs is crawled and clustered based on specific features such as link structure and content (and, in the case of Techmeme, additional human input). But what about blogs that aren't known to the system?
I recently added a feature to track // microsoft which analyses each cluster for popular URLs and adds them to the bottom of the cluster. The title of the web page is used as a simple description of the popular page.
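As a rough illustration only (this is not the actual track // microsoft code), the idea can be sketched in Python. The `Post` class, the `popular_links` and `page_title` functions, and the `min_posts` and `top_n` parameters are all hypothetical names. The sketch assumes that a link counts as "popular" when at least `min_posts` different posts in the cluster link to it, and that links pointing at posts already in the cluster are ignored.

```python
from collections import Counter
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Post:
    """A crawled blog post (hypothetical structure for this sketch)."""
    url: str
    links: list = field(default_factory=list)  # outbound URLs found in the post


def popular_links(cluster, min_posts=2, top_n=3):
    """Return (url, count) pairs for links shared by several posts in a cluster.

    Assumption: popularity is the number of distinct posts linking to a URL.
    """
    member_urls = {post.url for post in cluster}
    counts = Counter()
    for post in cluster:
        # set() so that a post linking to the same URL twice is counted once
        for link in set(post.links):
            if link not in member_urls:  # skip posts already in the cluster
                counts[link] += 1
    # Most linked first; ties broken alphabetically for a stable order
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(url, n) for url, n in ranked if n >= min_posts][:top_n]


class _TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def page_title(html):
    """Use the page's <title> as a simple description of a popular link."""
    parser = _TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

Counting each post once per URL keeps a single post that repeats a link from inflating its popularity. Fetching the page for `page_title` is left out of the sketch.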
In the recent story about Nuno Silva's mistaken comment regarding the future of Windows Phone devices, there were many links to Nuno's own blog post. In addition to the large cluster of known blogs that were determined to be talking about the story, track // microsoft also surfaced Nuno's post through analysing the popular links discovered within the cluster.
This can be seen in the screenshot of the cluster as it currently appears on the site.


You could also track a news aggregator, e.g. http://news.search.yahoo.com/rss?ei=UTF-8&p=microsoft&fr=sfp
Posted by: Viliam Kanis | April 20, 2012 at 03:37 AM