These days, I find myself executing many complex queries and - like everyone who is employed by a search engine - keeping my search behaviour as far from the norm as possible. My focus of late has been on exploring the world of Excel documents available online. Consequently, I've been looking at the difference in search experience between Google and Bing.
Firstly, to find Excel documents, one can execute a 'filetype:xls' query in the search box. This works on both Bing and Google. Right away, a major difference shows up. Bing estimates it has 38,100 hits, peanuts compared with Google's claim of 18,000,000.
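For anyone who would rather script these lookups than type them, here is a minimal sketch of how the same query could be turned into a search URL. The endpoint URLs and the helper name `filetype_query_url` are my own assumptions for illustration; only the 'filetype:xls' syntax comes from the experiment described here.

```python
from urllib.parse import urlencode

# Assumed public search endpoints; not taken from the post itself.
SEARCH_ENDPOINTS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}


def filetype_query_url(engine, extension, terms=""):
    """Build a search URL restricted to one file type (hypothetical helper)."""
    # Combine the filetype operator with any extra search terms.
    query = f"filetype:{extension} {terms}".strip()
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return f"{SEARCH_ENDPOINTS[engine]}?{urlencode({'q': query})}"


if __name__ == "__main__":
    for engine in SEARCH_ENDPOINTS:
        print(filetype_query_url(engine, "xls"))
```

Opening either URL in a browser should be equivalent to typing the query into the search box by hand.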
Both Google and Bing provide some ability to inspect documents without having to load the actual Excel data. The semantics are, however, a little different. Bing's 'cached' page (which all results appear to have) is actually an HTML version of the Excel document. This is very useful for making quick judgments about the value of the hit. Google has both 'cached' and HTML versions of the document.
From the point of view of intuition, I find the Google design to be a little simpler, being explicit about cache and HTML viewing.
However, the problem with Google's implementation is that its cached pages are not always present.
In addition, not all Google results have a 'View as HTML' option. So on the one hand, with Bing, you have fewer results in a less intuitive but consistently implemented interface, and on the other, with Google, a far larger collection with clearer semantics but an inconsistent implementation.
Comparing the HTML views of the spreadsheets, I find Bing's to be a little better, with Google's rendering often appearing with reams of whitespace and strangely repeated elements.
Unfortunately, I don't even believe all of the above summary. If you include a search term in the query - 'filetype:xls per' - you get back from Bing an estimate of 2,810,000 results, and from Google 828,000 results. Who knows which of these numbers is correct - but adding a term can only narrow a query, so Bing's suggestion that the narrower query matches a superset of the more general 'filetype:xls' query (2,810,000 hits against 38,100) is troubling.
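The sanity check I am applying here is simple enough to write down. The sketch below hard-codes the estimates quoted in this post (nothing is fetched live) and tests whether the narrower query reports no more hits than the broader one; the function name `is_consistent` is just an illustrative choice.

```python
# Result-count estimates as quoted in this post; not fetched from either engine.
ESTIMATES = {
    "bing": {"filetype:xls": 38_100, "filetype:xls per": 2_810_000},
    "google": {"filetype:xls": 18_000_000, "filetype:xls per": 828_000},
}


def is_consistent(broad_count, narrow_count):
    """A query with an extra term should never match more documents
    than the same query without that term."""
    return narrow_count <= broad_count


if __name__ == "__main__":
    for engine, counts in ESTIMATES.items():
        ok = is_consistent(counts["filetype:xls"], counts["filetype:xls per"])
        print(engine, "consistent" if ok else "inconsistent")
```

On these figures Google's estimates pass the check and Bing's do not, which says nothing about which engine's absolute numbers are closer to the truth.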
Much of the evaluation of search engine technology focuses on the 'head' - do they get searches for movies right, do navigational searches for well-known brands work, etc. However, I think that by looking in the niche areas one can get an understanding of the rich opportunity that lies in the data available online, as well as the manner in which this opportunity challenges the popular optimizations of search engines.