Paul Kedrosky (whose opinion and blog I greatly enjoy) posted yesterday about the correlation between Microsoft stock prices and searches on Google for "google interview questions". These time series are correlated at r=0.6361.
To replicate his experiment, follow these steps:
- Visit Google Finance and pull up the MSFT stock data.
- Visit the 'historical prices' link and edit the date range so that it includes a good chunk of data (say 10 years).
- Click the update button to update the data on the page (note that the graph won't update - some sort of bug).
- Click on the download option on the right-hand side (this is not available for all data on Google Finance, btw).
- Once the data is downloaded, upload it to Google Documents as a spreadsheet and delete all but the first column (dates) and the second-to-last column (closing price). Note that you might see a server error when deleting columns; just ignore it and keep going.
- Now download the result, open it in a text editor, and delete the first line.
- Next, upload this data to Google Correlate.
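If you would rather skip the spreadsheet round trip, the column trimming above can be sketched in a few lines of Python. This is only a rough sketch: it assumes the downloaded file is a plain CSV with a header row, dates in the first column, and the closing price in the second-to-last column, as described in the steps. The function name and the sample values are made up for illustration.

```python
import csv
import io


def prepare_for_correlate(raw_csv: str) -> str:
    """Keep only the date and closing-price columns and drop the header row."""
    reader = csv.reader(io.StringIO(raw_csv))
    next(reader, None)  # skip the first line (assumed to be the header)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        # first column = date, second-to-last column = closing price
        writer.writerow([row[0], row[-2]])
    return out.getvalue()


# Made-up placeholder rows, not real prices.
sample = (
    "Date,Open,High,Low,Close,Volume\n"
    "01-Jan-00,10.00,10.75,9.90,10.50,1000\n"
    "02-Jan-00,10.50,11.40,10.45,11.25,2000\n"
)

print(prepare_for_correlate(sample))
```

The output is a two-column, header-free file, which is what the manual steps above produce.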
I did this for a couple of other stocks: Google and IBM. For IBM, the correlated terms are as shown below:
While there are various views on what value of Pearson's r constitutes a good or strong correlation, it is clear from the above that these terms are more strongly correlated with IBM's stock than the terms Paul showed are with Microsoft's and, importantly, that there is no intuitive explanation for them. What explains the relationship between 'mi novio' (my boyfriend) and IBM?
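For reference, Pearson's r between two equal-length series can be computed as in the sketch below. This is the textbook formula, not necessarily how Google Correlate computes its score internally, and the function name and input values are purely illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only: a perfectly linear and a perfectly inverse pair.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(rising, falling)
```

A value near 1 or -1 means the two series move together (or in opposite directions) almost linearly; it says nothing about why they do.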
You can see the relationship between the top correlated term and the stock below:


Indeed, it is important to remember that correlation does not imply causation. I also think that this service is supposed to find events that cause searches, not the other way around.
Perhaps if you move the dates around a bit (i.e. push the dates of stock closing prices back a day or two) it could find some real correlations of news items about IBM causing searches about related issues, but I can't imagine a straight relation between _the volume_ of searches and the stock price. Any event, good or bad, will cause a rise in volume of searches. Perhaps you should try to correlate the volume of trading stock, not the closing prices.
Posted by: Michał Tatarynowicz | May 29, 2011 at 05:00 AM
I haven't checked by doing a proof of concept, but I think you can probably pull the Google Finance data directly into a Google spreadsheet using the =googlefinance() formula?
http://docs.google.com/support/bin/answer.py?answer=54198
Posted by: Tony Hirst | May 29, 2011 at 06:16 AM
I agree with Michał Tatarynowicz's opinion.
Posted by: JacopoPT | May 29, 2011 at 08:19 AM