Search Trends: Data Mining, Text Mining
Below is the graph generated by Google Trends for the terms 'data mining', 'text mining', 'social media' and 'visualization'. How I wish this graph were labeled. The 'about' page suggests that the graph shows 'search volume'. It talks about normalization in the context of the auxiliary data (regions, cities and languages). Should we assume, then, that the graph shows absolute counts? If so, how can we interpret the below? The total number of searches for 'data mining' and 'visualization' is decreasing over time - i.e. fewer people are searching for these terms; while 'social media' and 'text mining' are increasing or staying constant. So why would searches for 'data mining', for example, be decreasing?
[Thanks to Ron Kass for inspiring this post.]
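To make the absolute-versus-relative question concrete, here is a minimal sketch of how the same term could rise in absolute counts yet fall on a normalized graph. All of the numbers are invented for illustration, the function names are hypothetical, and I am only assuming that the graph might divide by total search volume and rescale to a peak of 100; the 'about' page does not confirm this.

```python
# Hypothetical yearly counts, invented for illustration (not Google data).
term_searches = [1000, 1100, 1200, 1300]               # searches for one term
total_searches = [100_000, 150_000, 220_000, 330_000]  # all searches that year


def relative_share(term_counts, total_counts):
    """Return each period's term count as a fraction of all searches."""
    return [t / n for t, n in zip(term_counts, total_counts)]


def scale_to_peak(values, peak=100):
    """Rescale values so that the largest one equals `peak`."""
    top = max(values)
    return [round(peak * v / top) for v in values]


# Assumed normalization: share of total volume, then rescaled to a 0-100 index.
shares = relative_share(term_searches, total_searches)
index = scale_to_peak(shares)

for year, (count, value) in enumerate(zip(term_searches, index), start=1):
    print(f"year {year}: absolute={count}, relative index={value}")
```

In this toy data the absolute count grows every year while the relative index falls, because total search volume grows faster than searches for the term. An unlabeled axis cannot tell us which of the two readings we are looking at.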




Is the y-axis absolute or relative? If the y-axis is relative, that would explain the trend (more non-tech users are searching); if it is absolute, the explanation could be that tech searchers bypass search.
Without knowing what the y-axis depicts, the information is worthless.
Anjo.
Posted by: Anjo Anjewierden | September 06, 2007 at 06:14 PM
Anjo - wasn't that the point I was making?
Posted by: Matthew Hurst | September 06, 2007 at 07:29 PM
It's the students: whether the x-axis is absolute or not, the spikes are at the beginning and the end of the autumn semester and in the middle of the spring semester ... my uneducated guess is that data mining lost its hype value and moved to the mainstream, which means better curricula and more books in the libraries, while "social media" is still hyped ... "text mining" is for the humanities and social sciences what "data mining" is for business or computer science ... so there the curricula might be less well developed.
My other uneducated guess is that the number of Google searches says less about the interest in a subject and more about the difficulty of finding relevant info on it: seven years ago I went to Google to find info on Perl, now I go to perlmonks, use perl and cpan.
Posted by: Emil Per. | September 07, 2007 at 06:06 AM
I don't have an answer to this issue, but I agree that interpreting Google Trends results is very difficult. Many factors can influence the results. As written on the Google Trends page, we should "Keep in mind that instead of measuring overall interest in a topic, Google Trends shows users' propensity to search for that topic on Google on a relative basis". I think that interpreting even the basic results (i.e. the search volume) of Google Trends is already a challenge. I have a related post on my blog about interpreting the results of Google Trends when used with data mining terms.
By the way, your blog is excellent! It's a pleasure to read it.
Posted by: Sandro | September 10, 2007 at 06:47 AM