While mining Twitter data for business and marketing intelligence (trend/buzz analysis, sentiment/opinion mining, authority/influence analysis) looks like a compelling path to explore for a business model, it is important to consider the proposition from the point of view of the customer. Enterprises have been working with vendors in this space (mining social media content for BI) for well over 5 years and already have expectations regarding the features and quality of reports that these analytics need to deliver to be useful (actionable). Those expectations fall along several dimensions:
- Domain coverage: how broad is the topical space available in the solution? Crawling all data sources is the way to win here.
- Demographic coverage: the broader the demographic coverage (and the accuracy with which the demographic features of the content authors can be determined) the better.
- Content Analysis/Text Mining: how well does the solution take all the unstructured content and deliver structured interpretations that can then act as the input for further data mining? This is generally a matter of applied research (taking the current state of the art in text mining and making it work with the greater variety and complexity of social media content).
- Timeliness: how timely is the analysis? This is generally a function of how quickly the data is collected. Blog data, for example, can be gathered in a very timely manner thanks to the ping/feed mechanism. However, the reality of real-time mining is that the consumer of the data is the real calibrator: real time may mean every four hours, not second by second.
If the business model for Twitter is going to be mining the Twitter stream for BI/MI, then they will be competing with companies that gather very large data sets (weblogs, usenet, message boards, reviews, groups, mailing lists, etc.). Seth Grimes suggested that the short texts of the Twitter stream may make hard problems like sentiment mining simpler, since the limited space requires the author to be concise. However, this is a double-edged sword, as it means that the depth of analysis will be far shallower.
I believe that mining Twitter data will be a very exciting experiment, but I think that if Twitter goes down this path, it will have to either provide analytics over the other data sets or partner with an existing company (say Visible Technologies). In fact, such a partnership would take the burden of building out an analytics engine away from the small Twitter team, allowing them to continue to focus on infrastructure and on ensuring the flow of this valuable data stream.