I'm writing a paper with some colleagues on estimating the number of bloggers from various countries. We wanted to compare our statistics with those published elsewhere online. In my mind, this is a great example of the Data Web.
Imagine the following interaction. A user asks an interface: How many Chinese bloggers are there? Instead of the usual ranked list of web pages which match some or all of these words, and which may or may not contain content that associates their meaning in the way intended, we get the following output:
This graph shows the various estimates for the size of the Chinese blogosphere (actually, the number of Chinese bloggers) published online against the date associated with the estimate. This was gathered by hand from 14 different published estimates in 10 different documents (with some pain, I might add).
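To give a rough idea of what an application that knows about data could do once estimates like these are gathered, here is a minimal sketch that fits a least-squares line to (date, estimate) pairs and reports R squared. The numbers in it are made-up placeholders, not the 14 estimates described above, and the names `linear_fit` and `r_squared` are hypothetical helpers chosen for illustration.

```python
from datetime import date


def linear_fit(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# PLACEHOLDER data: (publication date, estimated bloggers in millions).
# These values are invented for the example and are not real estimates.
placeholder_estimates = [
    (date(2005, 1, 15), 1.0),
    (date(2005, 6, 1), 4.0),
    (date(2005, 11, 20), 2.5),
    (date(2006, 4, 10), 9.0),
    (date(2006, 9, 5), 6.0),
]

# Convert dates to day numbers so they can be used as the x axis.
xs = [d.toordinal() for d, _ in placeholder_estimates]
ys = [value for _, value in placeholder_estimates]

slope, intercept = linear_fit(xs, ys)
print("growth per day (millions):", slope)
print("R squared:", r_squared(xs, ys, slope, intercept))
```

A low R squared from a fit like this would say that a straight line explains little of the spread between the published estimates, which is a separate question from whether the estimates can be brought together in one place.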
An R squared of 0.399 indicates that the linear estimate is not suitable for the data.
Posted by: Rands | December 08, 2006 at 07:11 AM
Rands,
Thanks for the comment. Yes, I'm aware that this is not a good linear fit. However, this is not really the point of the post. The point was to show how disparate data could be brought together in an application that knew about data (as opposed, say, to applications that almost know about documents but not really, such as web browsers ;-)
Posted by: Matthew Hurst | December 08, 2006 at 07:16 AM