Charles Armstrong, CEO of Trampoline Systems, pointed me to Enron Explorer, a technology demonstration. Email data is hard to publish and share because of obvious privacy issues. Consequently, it is hard to create data sets that can be used openly for text mining, natural language processing and other language and social research, and that can be passed around the scientific community. Thankfully, the boys and girls of Enron came to our rescue and, via their prosecution, inadvertently provided public access to a huge amount of email data.
Since then, researchers have been annotating this data, running experiments on it and publishing papers.
Enron Explorer provides a number of interfaces into the data. You can browse the mailboxes of individual employees, search the data set, explore the data visually and (one of the most interesting applications) provide commentary on the juicy data you've found.
The visualization, shown below, provides a localized graph view of an individual - showing their immediate social network as recorded by the emails they have sent and received. The graph uses the thickness of the arcs to indicate the strength of relationships.
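To make the idea of a localized graph concrete, here is a minimal sketch of how such a view could be derived from raw email records. It is not how Enron Explorer actually works: the post does not say how relationship strength is computed, so the sketch assumes it is simply the number of messages exchanged, and the function names (`ego_network`, `arc_widths`), the width range and the sample addresses are all hypothetical.

```python
from collections import Counter

def ego_network(messages, person):
    """Count messages exchanged between `person` and each direct contact.

    `messages` is an iterable of (sender, recipients) pairs.
    Assumption: relationship strength = number of messages sent or received.
    """
    weights = Counter()
    for sender, recipients in messages:
        if sender == person:
            # Mail sent by the person strengthens the tie to every recipient.
            for recipient in recipients:
                if recipient != person:
                    weights[recipient] += 1
        elif person in recipients:
            # Mail received by the person strengthens the tie to the sender.
            weights[sender] += 1
    return dict(weights)

def arc_widths(weights, min_width=1.0, max_width=8.0):
    """Scale message counts linearly into an arc thickness for drawing."""
    if not weights:
        return {}
    top = max(weights.values())
    return {
        contact: min_width + (max_width - min_width) * count / top
        for contact, count in weights.items()
    }

# Hypothetical sample data, not taken from the Enron corpus.
messages = [
    ("alice", ["bob", "carol"]),
    ("bob", ["alice"]),
    ("carol", ["dave"]),   # does not involve alice, so it is ignored
    ("alice", ["bob"]),
]

network = ego_network(messages, "alice")
widths = arc_widths(network)
```

Only the selected person's direct contacts appear in the result, which is what keeps the view local instead of showing the entire network.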
I like the visualization because it doesn't try to show the entire network, which would make it impossible to see detailed information about an individual. However, there is a little too much animation going on in it for my taste. When the user clicks on a new node (person), the whole thing whizzes around before finally landing in a new configuration. The transition could have been less complex and consequently less confusing.


