SentiMetrix, which I've covered before, now has a website. There is a bit of new information available there, including more background on the company. However, the thing I find most interesting, and which perhaps has the biggest impact on the market, is the set of claims regarding different languages:
Our system was designed from the ground up to support multiple languages. At this time, we can handle 8 languages: English, French, Spanish, Italian, Chinese, Russian, Korean and Arabic. While accuracy is still best for English documents, we have followed French elections and have successfully predicted Sarkozy's victory.
Generally, any company offering a sentiment analysis service is asked (right after 'yes, but does it really work?') 'can you do this in these 5 other languages?' My understanding is that SentiMetrix's technology requires lexical resources, so they face a one-time cost per language to create that resource (not a huge deal perhaps, but it certainly requires native speakers). If they also use grammatical information (something which I believe to be important) then there are additional costs involved.
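To make the per-language cost concrete, here is a minimal sketch of a lexicon-based scorer. It is my own illustration and not SentiMetrix's method: the `LEXICONS` table, the `score` function and the tiny word lists are all hypothetical, and a real system would need far larger lexicons plus proper tokenisation.

```python
# Hypothetical per-language polarity lexicons: word -> score in [-1, 1].
# The entries are illustrative only, not taken from any real resource.
LEXICONS = {
    "en": {"good": 1.0, "great": 1.0, "bad": -1.0, "awful": -1.0},
    "fr": {"bon": 1.0, "excellent": 1.0, "mauvais": -1.0, "nul": -1.0},
}


def score(text, lang):
    """Average the polarity of the known words in text; 0.0 if none match."""
    lexicon = LEXICONS.get(lang)
    if lexicon is None:
        # No lexicon yet: building one is the one-time, per-language cost.
        raise KeyError(f"no lexicon for language {lang!r}")
    # Naive whitespace tokenisation (an assumption; it will not work for
    # languages such as Chinese that do not separate words with spaces).
    hits = [lexicon[word] for word in text.lower().split() if word in lexicon]
    return sum(hits) / len(hits) if hits else 0.0
```

The scoring code is shared across languages; only the lexicon changes. Supporting a new language means adding another entry to `LEXICONS`, which is where the native speakers come in. Anything grammatical, such as handling negation, would need further per-language work on top of this.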
Anyway - congratulations to the team for getting this far: onwards and upwards.
"we have followed French elections and have successfully predicted Sarkozy's victory"
I think there is some difference between the buzz and the people's opinion. I'm sure these tools use the official media, and in France, M. Sarkozy manages all the official media. It's the same in Italy with Berlusconi.
For me, "predicted Sarkozy's victory" is easy if you analyse the official media. Just remember the French vote on the European constitution (2005). I'm sure the prediction would have been wrong if you had analysed only the official media. It's very important to include blogs. Do you agree?
Posted by: trebormat | July 12, 2007 at 05:29 AM
SentiMetrix has a patent-pending technology for multilingual opinion analysis that leverages existing linguistic resources (e.g. parsers) available for the language. The cost and time involved in building an opinion analyzer for a given language varies, based on what resources of this nature are available for that language, and the cost and ease of use of those resources.
Vadim Kagan, SentiMetrix, Inc.
Posted by: Vadim Kagan | July 13, 2007 at 08:20 AM