A Round Trip To Google
If you publish a blog and subscribe to Google's blog alerting, you can, by using the real-time option for alerting, get an idea of how long it takes Google to get each post.
- I posted Weekends At The Movies at 5:18 AM on December the first. The alert from Google came at 3:48 PM, a delay of 10 hours and 30 minutes.
- Now With More Matterhorn was published at 7:25 PM on November the 30th. The alert arrived at 2:32 AM on December the first, a delay of 7 hours and 7 minutes.
These figures suggest a substantial round trip time for blog publishing, indexing and alerting. Given that Technorati claims to have a mean time to index of 5 minutes (something which I personally doubt), Google seems to be pretty slow. Of course, I've only looked at two data points from a single blog. Perhaps there are other bloggers out there who can share their own stats on this.
Update: This post was published at 9:23 AM on December 2nd, and the alert from Google was timestamped 3:22 AM on December 3rd, a minute shy of 18 hours.
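For anyone who wants to repeat the arithmetic with their own posts, here is a minimal sketch in Python. It assumes the publish time and the alert time are in the same time zone, and it takes the year (2006) from the comment dates on this post. The names `POSTS`, `delay_minutes` and `format_delay` are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# (title, published, alert received), taken from the post above.
# Assumption: both timestamps share one time zone; year taken from the comment dates.
POSTS = [
    ("Weekends At The Movies", "2006-12-01 05:18", "2006-12-01 15:48"),
    ("Now With More Matterhorn", "2006-11-30 19:25", "2006-12-01 02:32"),
    ("A Round Trip To Google", "2006-12-02 09:23", "2006-12-03 03:22"),
]


def delay_minutes(published: str, alerted: str) -> int:
    """Whole minutes between publishing a post and receiving its alert."""
    delta = datetime.strptime(alerted, FMT) - datetime.strptime(published, FMT)
    return int(delta.total_seconds() // 60)


def format_delay(minutes: int) -> str:
    """Render a minute count as hours and minutes."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d}m"


if __name__ == "__main__":
    delays = []
    for title, published, alerted in POSTS:
        minutes = delay_minutes(published, alerted)
        delays.append(minutes)
        print(f"{title}: {format_delay(minutes)}")
    # With only three samples the mean is anecdotal, not a real MTTI.
    print(f"Mean delay: {format_delay(sum(delays) // len(delays))}")
```

Three samples from one blog are still only an anecdote, so treat the mean as a rough indication at best.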



Matt -- I've noticed much faster times when I checked for a blog post via their search. I'd expect some lag to be introduced via the alerting service, maybe it's significant.
Also, it seems like links are quickly moving from Google's blog system into the main Google index. I first noticed this just a few months ago. Through informal sampling, it seems like our blog posts are getting into Google's index in about twelve hours.
Of course, it's only my speculation that this is the mechanism: ping -> Google Blog Search -> Google. It could be that The Google thinks our blog so significant that it crawls it daily, but I think that less likely. It makes a lot of sense for Google to take advantage of the push from feeds to get pages indexed more rapidly.
Posted by: tim finin | December 02, 2006 at 10:27 AM
Do you also subscribe to the RSS feed from GBS as well? I've seen stuff get into their web page results pretty fast, as well as get stuff via the RSS feeds. I haven't noticed as long as a delay with the emails, but I don't use that stuff as much.
There's some more interesting stuff with GBS, such as the fact that they don't retain full archives of posts (try a date sort for something less common, and notice the gaps in dates); also, Google in general seems to be doing some experiments in detecting duplicate content, especially as regards blogs, and this seems to have hit GBS as well.
Posted by: Greg G. | December 03, 2006 at 09:03 AM
Nice test--I mentioned it to a Google blogsearch colleague. It would also be neat to decouple this and measure the time-to-index and time-to-alert separately. Just do a post, then search for it every so often to see how long until it's searchable. The alert will presumably follow later, giving you the time-to-alert info. Repeat 3-5 times and post what you find. :)
Posted by: Matt Cutts | December 03, 2006 at 06:58 PM
Matt - Tim Finin did something like this on his blog (see his comment above). The problem really is that anything we do of this nature is purely anecdotal. One would need to do many tests with many blogs in different circumstances (e.g. different languages, different platforms, different ping channels, etc.) to get a real picture. I suppose, however, that one thing individuals do care about is the MTTI for their own blogs.
Posted by: Matthew Hurst | December 03, 2006 at 07:10 PM
This may just be me, but I get an alert for my posts (I likely do much fewer than you) within minutes, not hours and the feed goes out within an hour or two, and I use Blogger.
Posted by: J | December 04, 2006 at 06:09 PM
J - you are probably getting some benefit from the fact that Blogger is part of Google. There may be some artifact of feedburner, which delivers my feed, which I'm running up against - more on that later.
Posted by: Matthew Hurst | December 04, 2006 at 07:13 PM