I love this new feature in Google's book search product which allows you to look at the time series trends for terms according to the publication dates of books. The example below shows the trend for the tokens 'colour' and 'color'.
This type of statistical analysis raises lots of questions, both about the occurrence of terms in books in general and about the distribution of books in Google's collection. Does it show a decline in the ratio of British to American publications? Or a decline in the British spelling of colour? Or a bias in the corpus towards recent American publications and earlier British publications? Hard to say, and it's interesting to ask whether one could find out by probing with this tool alone.
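To make the ambiguity concrete, here is a minimal toy sketch in Python. Every number in it is invented for illustration and has nothing to do with Google's actual data; the function name `colour_share` and its parameters are my own. It assumes each book contributes one occurrence of the word, and shows that a shift in the corpus mix and a shift in British spelling habits can produce exactly the same aggregate trend.

```python
# Toy model only: all counts and rates are invented, not taken from Google's corpus.

def colour_share(british_books, american_books,
                 british_colour_rate, american_colour_rate=0.05):
    """Fraction of occurrences spelled 'colour' in a mixed corpus,
    assuming one occurrence of the word per book."""
    colour = (british_books * british_colour_rate
              + american_books * american_colour_rate)
    return colour / (british_books + american_books)

# Scenario A: spelling habits stay fixed, but the corpus shifts
# from mostly British books to mostly American books.
mix_early = colour_share(600, 400, british_colour_rate=0.95)
mix_late = colour_share(400, 600, british_colour_rate=0.95)

# Scenario B: the corpus mix stays fixed, but British authors
# use 'colour' less often over time.
habit_early = colour_share(600, 400, british_colour_rate=0.95)
habit_late = colour_share(600, 400, british_colour_rate=0.65)

print(round(mix_early, 2), round(mix_late, 2))      # 0.59 0.41
print(round(habit_early, 2), round(habit_late, 2))  # 0.59 0.41
```

Both scenarios fall from 0.59 to 0.41, so the aggregate curve alone cannot tell them apart. Separating them would need something extra, such as counts broken down by place of publication.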
Update: I'm actually quite serious about wanting to understand both the distribution of terms in our language and the nature of the collection. While this article on ReadWriteWeb rightly celebrates the insights that this data set can bring, it never questions how representative the underlying data set is.
Update: It seems Chomsky throws the tool off. Searching for 'manufacturing consent', I'm shown this artwork:
This is probably a positive sign, indicating a lot of interest in the tool, though it is surprising to see this in a Google product.