December 02, 2006

A Round Trip To Google

If you publish a blog and subscribe to Google's blog alerting, you can, by using the real-time option for alerting, get an idea of how long it takes Google to pick up each post.

  • I posted Weekends At The Movies at 5:18 AM on December 1st. The alert from Google came at 3:48 PM, a delay of 10 hours and 30 minutes.
  • Now With More Matterhorn was published at 7:25 PM on November 30th; the alert arrived at 2:32 AM on December 1st, a delay of 7 hours and 7 minutes.
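
For anyone who wants to repeat the arithmetic on their own posts, here is a minimal Python sketch. It assumes the publish time and the alert time are recorded in the same time zone and that the year is 2006 (taken from the date of this post); the `delay` helper and the `observations` table are names made up for illustration.

```python
from datetime import datetime

# Timestamp layout used below, e.g. "2006-12-01 05:18 AM".
# Assumption: both timestamps are in the same time zone.
FMT = "%Y-%m-%d %I:%M %p"


def delay(published, alerted):
    """Return the publish-to-alert delay as (hours, minutes)."""
    delta = datetime.strptime(alerted, FMT) - datetime.strptime(published, FMT)
    total_minutes = int(delta.total_seconds()) // 60
    return divmod(total_minutes, 60)


# The two posts listed above: (published, alert received).
observations = {
    "Weekends At The Movies": ("2006-12-01 05:18 AM", "2006-12-01 03:48 PM"),
    "Now With More Matterhorn": ("2006-11-30 07:25 PM", "2006-12-01 02:32 AM"),
}

for title, (published, alerted) in observations.items():
    hours, minutes = delay(published, alerted)
    print(f"{title}: {hours} h {minutes:02d} min")
```

Working from full dates rather than clock times alone keeps the second case, which crosses midnight, from coming out negative.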

These figures suggest a substantial round-trip time for blog publishing, indexing and alerting. Given that Technorati claims a mean time to index of 5 minutes (something I personally doubt), Google seems pretty slow. Of course, I've only looked at two data points from a single blog. Perhaps other bloggers out there can share their stats on this.

Update: This post was published at 9:23 AM on December 2nd; the alert from Google was time-stamped 3:22 AM on December 3rd, a minute shy of 18 hours.

TrackBack

Listed below are links to weblogs that reference A Round Trip To Google:

» Google's blog indexing timed (7 - 10 hours, it turns out) from Open (finds, minds, conversations)...
* * UPDATED * * Matthew Hurst at Data Mining has been timing how long it takes Google to index a blog post. His test shows 7 - 10 hours, which puts it way behind Technorati which can pick up [Read More]

Comments

Matt -- I've noticed much faster times when I checked for a blog post via their search. I'd expect the alerting service to introduce some lag, and maybe it's significant.

Also, it seems like links are quickly moving from Google's blog system into the main Google index. I first noticed this just a few months ago. Through informal sampling, it seems like our blog posts are getting into Google's index in about twelve hours.

Of course, it's only my speculation that this is the mechanism: ping -> Google Blog Search -> Google. It could be that The Google thinks our blog so significant that it crawls it daily, but I think that less likely. It makes a lot of sense for Google to take advantage of the push from feeds to get pages indexed more rapidly.

Do you also subscribe to the RSS feed from GBS (Google Blog Search)? I've seen stuff get into their web page results pretty fast, and I get stuff quickly via the RSS feeds too. I haven't noticed as long a delay with the emails, but I don't use that stuff as much.

There's some more interesting stuff with GBS, such as the fact that they don't retain full archives of posts (try a date sort for something less common, and notice the gaps in dates); also, Google in general seems to be doing some experiments in detecting duplicate content, especially as regards blogs, and this seems to have hit GBS as well.

Nice test--I mentioned it to a Google blogsearch colleague. It would also be neat to decouple this and measure the time-to-index and time-to-alert separately. Just do a post, then search for it every so often to see how long until it's searchable. The alert will presumably follow later, giving you the time-to-alert info. Repeat 3-5 times and post what you find. :)
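
A rough sketch of how that decoupled measurement could be automated is below. It is only a polling timer: `time_until` is a hypothetical helper, and the two checks you would pass to it (one that reports whether the post is searchable, one that reports whether the alert has arrived) are left to the reader, since no particular Google interface is assumed here.

```python
import time


def time_until(check, interval_s=300.0, timeout_s=86400.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True.

    Returns the elapsed seconds at the first successful check, or None if
    timeout_s passes first. The result is only accurate to within one
    polling interval.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

Called right after publishing, once with each check, this gives a time-to-index and a time-to-alert for the same post. The `clock` and `sleep` parameters exist only so the loop can be exercised without actually waiting.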

Matt - Tim Finin did something like this on his blog (see his comment above). The problem really is that anything we do of this nature is purely anecdotal. One would need to do many tests with many blogs in different circumstances (e.g. different languages, different platforms, different ping channels, etc.) to get a real picture. I suppose, however, that one thing individuals do care about is the mean time to index (MTTI) for their own blogs.

This may just be me, but I get an alert for my posts (I likely do far fewer than you) within minutes, not hours, and the feed goes out within an hour or two. I use Blogger.

J - you are probably getting some benefit from the fact that Blogger is part of Google. I may also be running up against some artifact of FeedBurner, which delivers my feed - more on that later.
