Farecast (recently acquired by Microsoft) does just what you want when buying an airplane ticket: it predicts whether the price is going to go up, go down or stay level, and advises whether you should buy now or wait.
Timberpost is a small company founded by Peter Ross and Tim Taylor (Peter was a professor at Edinburgh University when I was there studying AI, and Tim completed his PhD in the same department). Timberpost’s product, TRAITS, takes a crack at a real chestnut of an AI problem: predicting the stock market. The difference with this solution appears to be that it actually does a good job of it.
The graph below shows the performance of a hedge fund run (in simulation) by TRAITS compared against the FTSE EuroFund 300 Index.
This portfolio is currently showing an annualised return of +23%, which would rank it 6th out of 200 peer funds according to the latest performance data on real European Long/Short Equity hedge funds published by EuroHedge magazine.
Timberpost describes their technology as follows:
Many machine learning techniques have been applied in finance, including neural nets, genetic algorithms, reinforcement methods and rule induction. We are developing a new approach that is inspired by ideas about how the human immune system functions. Like the immune system, our software can not only discover effective responses to new conditions (in our case, potential trading opportunities), it also adapts to remember past successes in order to be able to re-activate them quickly when conditions change.
In biological systems, recognition happens by molecular binding. In our software, recognition is based on elaborate mathematical expressions that describe features of the behaviour of stocks. The system is designed to be efficient; it can look at many thousands of elaborate expressions per second.
Amazon's version of the Mechanical Turk is a service which distributes human judgment problems to the crowd. Dolores Labs is a new enterprise which plans to make it easier to manage, collect, interpret and leverage distributable judgment problems and their answers. One can get a good idea of what they do, and for whom, from their examples page, which includes sentiment analysis, search relevance and classification tasks.
One of the nicest illustrations of the problem space is the labeling of colours. A full description appears on their blog, which includes a pointer to their released data set.
When I joined WhizBang!Labs, one of the most impressive sights was the room full of labelers - part-time workers whose job was to create labeled data. This data was used to both train and test classifiers, as is usual in supervised machine learning. One thing you rapidly learn is the value of good labeled data. Part of creating a labeling exercise is developing good criteria for determining the classes - the labels that need to be given to the data. Even with good criteria, it is always quite remarkable to see how often a room full of people can disagree.
In a recent paper by Alm, Roth and Sproat, 'Emotions from text: machine learning for text-based emotion predictions', a pair of labelers was given the task of identifying the emotion found in the text of children's fairy tales. They report that the labelers agreed between 45% and 64% of the time. In other words, when asked to determine whether a piece of text indicated ANGER, DISGUST, FEAR, HAPPINESS, SADNESS, SURPRISE or no emotion, there was only moderate agreement.
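To make the agreement figure concrete, here is a minimal sketch of how raw percent agreement between two labelers could be computed. The function name, the NEUTRAL label standing in for "no emotion", and the sample labels are all made up for illustration; they are not taken from the paper or its data.

```python
from typing import Sequence

# Label set from the task described above; "NEUTRAL" is an assumed
# stand-in for the "no emotion" class.
EMOTIONS = ("ANGER", "DISGUST", "FEAR", "HAPPINESS",
            "SADNESS", "SURPRISE", "NEUTRAL")


def percent_agreement(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Return the percentage of items on which two labelers chose the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelers must label the same number of items")
    if not labels_a:
        raise ValueError("need at least one labeled item")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical labels for five sentences (invented data).
labeler_1 = ["FEAR", "HAPPINESS", "NEUTRAL", "SADNESS", "ANGER"]
labeler_2 = ["FEAR", "SURPRISE", "NEUTRAL", "SADNESS", "DISGUST"]

print(percent_agreement(labeler_1, labeler_2))  # 3 of 5 match -> 60.0
```

Raw agreement like this does not correct for matches that would happen by chance, which is one reason chance-corrected measures such as Cohen's kappa are often reported alongside it.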
Having multiple annotators, and the ability to detect the intersection of their decisions, is part of an established process by which a gold standard is reached. Those labels which are not common to all (both) labelers are then reviewed and resolved. Emotion may be one of the hardest things to label for. Defining emotions is tricky, and penetrating the textual fog that mixes direct and indirect emotional cues with the use of metaphor (does "I loved that movie" really mean I had an emotional relationship with it?) is highly valuable yet challenging.
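The intersect-then-review step can be sketched roughly as follows. This is only an illustration under simple assumptions (two labelers, one label per item); the function name and the sample sentences are hypothetical and do not describe any particular team's tooling.

```python
from typing import Dict, List, Sequence, Tuple


def split_by_agreement(
    items: Sequence[str],
    labels_a: Sequence[str],
    labels_b: Sequence[str],
) -> Tuple[Dict[str, str], List[Tuple[str, str, str]]]:
    """Separate items both labelers agree on from those needing review.

    Returns (gold, disputed): gold maps item -> agreed label, and disputed
    lists (item, label_a, label_b) for every disagreement.
    """
    if not (len(items) == len(labels_a) == len(labels_b)):
        raise ValueError("items and both label lists must have the same length")
    gold: Dict[str, str] = {}
    disputed: List[Tuple[str, str, str]] = []
    for item, a, b in zip(items, labels_a, labels_b):
        if a == b:
            gold[item] = a                  # agreed: goes straight into the gold standard
        else:
            disputed.append((item, a, b))   # disagreed: queue for review and resolution
    return gold, disputed


# Invented example sentences and labels.
sentences = ["I loved that movie", "The wolf growled", "She opened the door"]
first = ["HAPPINESS", "FEAR", "NEUTRAL"]
second = ["NEUTRAL", "FEAR", "NEUTRAL"]

gold, disputed = split_by_agreement(sentences, first, second)
print(gold)      # agreed items and their labels
print(disputed)  # items a reviewer still has to resolve
```

Keeping both original labels with each disputed item gives the reviewer the context for why the labelers diverged, which is often a sign that the labeling criteria themselves need tightening.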