
March 23, 2006

Comments

Steven Cohen

What an interesting post. Thanks. Any sort of statistical analysis of blogs, whether it covers just inlinks, outlinks, or the quality of the posts, can be very difficult and controversial.

We work hard to provide the best statistics possible. We are attentive to user issues with our stats and have worked to fix any problems. I'd like to help out if possible. Would you be able to send me some examples of what you see as "out of whack"? Please feel free to contact me.

In addition, we do provide a feed for any blog's stats. While you are right that there is no archive, I've found it useful to have the feed sent via e-mail (I've had success with RSSfwd). This way, there is some sort of archive present.

Steven Cohen
PubSub Concepts, Inc
scohen@pubsub.com

pwb

I don't know how Technorati can call itself a service. Besides the fact that its results are simply horrendous, the bold colors and awkward typeface make it literally hard to read.

mark wagner

Hey Matt

Interesting post. I do have several comments:
1) You state "there is no mechanism to ensure that data will be missed". I am hoping that you intended to add a "not" in there. I think the current structure (or lack of it) of the blogosphere does its best to ensure that data will be missed.

2) I think one of the main problems in link counting is that there is no reliable mechanism to match a feed to its blog. In fact, my previous work experience counting links has shown that there is typically a one-to-many relationship between blogs and feeds. The 'good' ones map directly based on URL: just add a feed.rss and you're good to go. Now throw in FeedBurner and similar schemes, and it takes some extra programming and/or human intervention to accurately map a feed to a site.
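A rough sketch of that mapping problem, in Python. Everything here is an assumption for illustration: the function names, the `feed.rss` suffix rule taken from the sentence above, and the single override entry standing in for the "human intervention" step are all hypothetical, not anyone's actual crawler.

```python
from urllib.parse import urlsplit

# Hypothetical hand-maintained overrides for feeds hosted away from the blog
# (for example on FeedBurner). The entry below is illustrative, not real data.
MANUAL_OVERRIDES = {
    "http://feeds.feedburner.com/exampleblog": "http://blog.example.com/",
}


def blog_for_feed(feed_url):
    """Return the blog URL a feed belongs to, or None if it cannot be guessed."""
    if feed_url in MANUAL_OVERRIDES:
        return MANUAL_OVERRIDES[feed_url]
    parts = urlsplit(feed_url)
    # The 'good' case: the feed lives next to the blog as .../feed.rss,
    # so stripping the file name gives the blog URL.
    if parts.path.endswith("/feed.rss"):
        base_path = parts.path[: -len("feed.rss")]
        return f"{parts.scheme}://{parts.netloc}{base_path}"
    return None


def group_feeds_by_blog(feed_urls):
    """Build the one-to-many blog -> feeds map; collect feeds that need a human."""
    blogs = {}
    unmapped = []
    for url in feed_urls:
        blog = blog_for_feed(url)
        if blog is None:
            unmapped.append(url)
        else:
            blogs.setdefault(blog, []).append(url)
    return blogs, unmapped
```

Anything that lands in `unmapped` is exactly the part that still needs extra programming or a person to look at it.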

For even more fun, throw in a couple of random blogs a directory level or two down from a domain in the URL. However, there is also 'valid' website content that lives at those same directory levels. Remember, you are being graded on accuracy!

And don't get me started on the sites that can be accessed via http://username.example.com or http://users.example.com/username or http://www.example.com/users/username. All of these refer to the same blog. The fun part comes in when trying to accurately count links.
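To make the three-URLs-one-blog case concrete, here is a minimal Python sketch that folds those forms into a single key before counting. It assumes the site is `example.com` and that only the three layouts listed above exist; `canonical_blog_key` and `count_links` are hypothetical names, and a real host would need its own rules.

```python
from collections import Counter
from urllib.parse import urlsplit

SITE = "example.com"  # assumption: the one hosting domain being normalised


def canonical_blog_key(url):
    """Map the three URL layouts for a user's blog to the bare username."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]

    # http://users.example.com/username
    if host == f"users.{SITE}" and segments:
        return segments[0].lower()
    # http://www.example.com/users/username (or without the www)
    if host in (SITE, f"www.{SITE}") and len(segments) >= 2 and segments[0] == "users":
        return segments[1].lower()
    # http://username.example.com
    if host.endswith(f".{SITE}"):
        sub = host[: -len(SITE) - 1]
        if sub not in ("www", "users") and "." not in sub:
            return sub
    return None  # not recognisably a user blog on this site


def count_links(urls):
    """Count inbound links per blog after collapsing equivalent URLs."""
    keys = (canonical_blog_key(u) for u in urls)
    return Counter(k for k in keys if k is not None)
```

With this, links to all three forms add up under one `username` entry instead of being split three ways, while pages such as `http://www.example.com/about` are left out of the count.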

Oh well, sorry for the rant...
Flexibility keeps life interesting...

(Note, comments/thoughts/emotions are mine and are not intended to reflect the views of any current or previous employers)

Fred

Hi Matt, Great post!

Hi Mark, you are completely right, and that is exactly why I am sure (at least, I hope) that people will annotate their blogs/websites with FOAF-like files. That way, crawlers/agents/systems will be able to easily bring order to that mess, otherwise....


Take care,

Salutations,

Fred

The comments to this entry are closed.
