With the proliferation of memetrackers, I'm starting to wonder more about the algorithms they use to determine the relationships between blog posts. Initially, I had assumed that they were purely link-based: if A links to B, then A is about B. I think this is how TailRank works - so the differentiator that TailRank is after is mostly to do with coverage.
However, I don't think that this is how the Memeorandum suite of trackers works. Or, at least, they have this plus some other approach. The reason I believe this is that a post I wrote is currently listed as being in the discussion set associated with a post from Venture Beat about Powerset. However, my post didn't link to the Venture Beat post. So why is it in the discussion? In fact, my post isn't even about the Venture Beat post. The only real explanation is that Venture Beat linked to the Powerset website and so did I.
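To make the two candidate rules concrete, here is a minimal sketch. It assumes each post is reduced to the set of URLs it links to. The function names and the URLs are hypothetical placeholders, and nothing here is known to be what Memeorandum or TailRank actually does.

```python
def direct_links(seed, outlinks):
    """Rule 1: posts that link straight to the seed post."""
    return {post for post, targets in outlinks.items()
            if post != seed and seed in targets}


def shared_targets(seed, outlinks):
    """Rule 2: posts that share at least one outbound link with the seed post."""
    seed_targets = outlinks.get(seed, set())
    return {post for post, targets in outlinks.items()
            if post != seed and targets & seed_targets}


# Hypothetical placeholder URLs, not the real permalinks.
outlinks = {
    "venturebeat.example/powerset": {"powerset.example"},
    "myblog.example/powerset-thoughts": {"powerset.example"},
    "other.example/unrelated": {"somewhere.example"},
}
seed = "venturebeat.example/powerset"

print(direct_links(seed, outlinks))
print(shared_targets(seed, outlinks))
```

With this toy data the direct-link rule finds nothing, while the shared-target rule pulls in the second post, which matches the behaviour described above: two posts grouped only because both link to the same third page.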
Unfortunately, I can't link to the article on TechMeme - permalinks there seem to provide a view of the discussion that is different from that given on the front page.
While it is interesting (and vital) that services like Memeorandum explore new approaches, cases like this bring into focus issues of accuracy - how accurate is the description of the 'discussion' around a topic (which, here, means a post or news article)?


It could be using a combination of links and shared words.
Also, how do they work out which posts are actually contributing to the discussion and which are just 'links of the day' type posts?
Posted by: Stew | November 06, 2006 at 01:55 PM
I can't speak for Techmeme of course but Tailrank is a lot more complicated than just link counting. Sometimes I wish this stuff was a lot easier. We're actually heads down working on another significant algorithm update now... as well as some other fun stuff.
I have seen Techmeme do this before. It seems to get confused but then again everyone has bugs from time to time.
Posted by: Kevin Burton | November 07, 2006 at 01:20 AM
Kevin,
So - are links necessary but not sufficient? Or not even necessary? I'm going to start looking for cases in both systems where there is no link up the tree.
Posted by: Matthew Hurst | November 07, 2006 at 02:21 AM
Links are necessary of course. We use other variables as well which are somewhat outside of your control. Other people's linking behavior is important as well.
Kevin
Posted by: Kevin Burton | November 08, 2006 at 04:27 AM