Current systems for ranking blogs are largely about inlinks. Technorati and BlogPulse both use this basic measure of citation to create their lists; TechMeme - whose new list created plenty of discussion on the topic - takes the algorithm it uses for placing stories on its home page (essentially, another citation-based approach) and aggregates visibility information. Additional features to consider include the number of feed subscribers and the number of visitors to the blog site. However, there are plenty of alternative approaches to creating a list of important blogs.
The above approaches are motivated by some (vague) notion of influence - a term that is central to the analysis of social media, and of blogs in particular, but one which has not really been given a full, well-grounded definition in the space. However, there is also the issue of reader efficiency - ensuring that the consumer of blog data maximises the value they get from reading blogs.
A group of researchers at CMU have been considering a notion of blog importance based on how likely a set of blogs is to ensure that you will be informed of topics bursting in the blogosphere. By analogy, they consider a graph of water pipelines. Their paper, "Cost-Effective Outbreak Detection in Networks" (Leskovec, Krause, Guestrin, Faloutsos, VanBriesen, Glance), poses the problem:
Given a water distribution network, where should we place sensors to quickly detect contaminants? Or, which blogs should we read to avoid missing important stories? These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a virus or information as quickly as possible.
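To make the node-selection framing concrete, here is a minimal sketch that assumes a much simpler objective than the paper's: pick a fixed number of blogs so that as many stories as possible appear in at least one of them. The function name pick_blogs, the blog names and the story IDs are all hypothetical, and this plain greedy loop is not the authors' algorithm; it ignores how quickly a story is detected and how much each blog costs to read.

```python
from typing import Dict, List, Set


def pick_blogs(coverage: Dict[str, Set[str]], budget: int) -> List[str]:
    """Greedily pick up to `budget` blogs, each time adding the blog that
    covers the most stories not yet covered by the blogs already chosen."""
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(budget):
        best, best_gain = None, 0
        for blog in sorted(coverage):  # sorted for deterministic tie-breaking
            if blog in chosen:
                continue
            gain = len(coverage[blog] - covered)
            if gain > best_gain:
                best, best_gain = blog, gain
        if best is None:  # no remaining blog adds a new story
            break
        chosen.append(best)
        covered |= coverage[best]
    return chosen


# Hypothetical toy data: which stories each blog picked up.
toy = {
    "blog_a": {"s1", "s2", "s3"},
    "blog_b": {"s3", "s4"},
    "blog_c": {"s1", "s2"},
    "blog_d": {"s5"},
}
print(pick_blogs(toy, 2))  # ['blog_a', 'blog_b']
```

Note that blog_c carries as many stories as blog_b, yet it is never chosen, because everything it covers is already covered by blog_a. That is the sense in which a "what should I read" list can differ from a popularity ranking.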
As a result of this work, the authors have published some blog lists which answer a fundamentally important question in terms of weblog reading habits: Which weblogs should I read to be most up to date? The lists answering this question - generated by the approach described in their paper - come in a number of varieties to be found on the project's page.
Highlights from the work include the top 10 and bottom 10 from the list of blogs to read to be the most up to date on stories if you only have time to read 100 blogs. It must be noted that this work is a theoretical exploration - the dataset mined to create the list is not a live corpus of blogs; thus some of the blogs may be stale or even abandoned.
Note that another view of the data - which blogs to read if you can only read 500 posts - generates quite a different list of blogs.
Interestingly, the #3 spot is a blog that hasn't been updated in a month.
Posted by: Jason Adams | October 23, 2007 at 11:40 AM
Jason - note that the data used for this work is not a live corpus of weblog posts, but a (recent) historical set. I've updated the post to underline this point (the publication I link to makes this clear). Thanks!
Posted by: Matthew Hurst | October 23, 2007 at 12:07 PM
Wow...what a great list. I will have to take the time to read them all. Thanks for putting them all together for me!
Posted by: JC Carvill | October 23, 2007 at 12:24 PM
Calculating a blog's influence is important for the ROI of potential advertisements from companies looking to get their products out there, as well as for companies working in brand management. The problem is that right now there is no standard set of metrics from which to build an algorithm that gives a raw influence number for each site.
I like this approach because it gets away from old metrics like inlink and outlink counts, which do not speak to the topics being posted. Thanks
Posted by: the constant skeptic | October 23, 2007 at 08:44 PM
To me as a user, I am much more interested in the interaction between blogs. In other words, which blogs like to read mine and interact with it, and which do I like and interact with. Instead of a "who has the biggest..." contest we would be able to see how blogs interact with each other and (re-)discover information in new ways. So, it is time for the master of attraction to take the podium. You know him, there is only one, Sir Isaac Newton's Universal Law of BLOG attraction:
http://vanelsas.wordpress.com/2007/10/09/newtons-universal-law-of-blog-attraction-better-than-a-techmeme-leaderboard/
Posted by: Alexander van Elsas | October 24, 2007 at 01:29 AM
Something's wrong with either the conception or execution of that research because there isn't a Ron Paul Revolution blog on that list.
How can anyone be more influential than Ron Paul?
Posted by: Hipple, Rev. Paul T. | October 25, 2007 at 02:02 PM