I wrote briefly about an article by Google researchers called The Unreasonable Effectiveness of Data – saying it was worth a read, but making three points to keep in mind when reading it. Fernando responded quite enthusiastically, and I want to clarify two of the points here (I’ll be following up with clarifications on the third in a later post).
Data may be unreasonably effective, but effective at what?
In asking this, I was really drawing attention to two things: firstly, the ability of large volumes of data (and not much else) to deliver interesting and useful results, and secondly, the inability of that data to tell us how humans produce and interpret it. One of the original motivations for AI was not simply to create machines that play chess better than people, but to actually understand how people's minds work.
Despite all the ontology naysayers, a big chunk of our world is structured, due to the well-organized, systematic and predictable ways in which industry, society and even biology create stuff.
Here, I want to draw attention to the skepticism around ontologies. Yes, they come at a cost, but it is also the case that they offer true grounding for interpretations of textual data. Let me give an example. "The Lord of the Rings" is a string used to refer to a book (in three parts), a sequence of films, various video games, board games, and so on. The ambiguity of the phrase means a plurality of interpretations must be available to it. This is a 1-many mapping. The 1 is a string, but what is the type of the many? I actually see the type of work described in the paper as being wholly complementary to categorical knowledge structures.
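To make the 1-many point concrete, here is a minimal sketch in Python. All the names and category labels below are hypothetical illustrations, not drawn from any particular ontology; the point is just that the "many" side of the mapping carries ontological types, which is exactly what the raw string cannot supply on its own.

```python
# Hypothetical sketch: one surface string resolves to several typed
# referents, and an ontological category answers "what is the type
# of the many?"
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    name: str   # canonical label for the referent
    kind: str   # ontological category, e.g. "Book", "FilmSeries"

# One string, many typed referents.
referents = {
    "The Lord of the Rings": [
        Entity("The Lord of the Rings (novel)", "Book"),
        Entity("The Lord of the Rings (film trilogy)", "FilmSeries"),
        Entity("The Lord of the Rings (video game)", "VideoGame"),
        Entity("The Lord of the Rings (board game)", "BoardGame"),
    ],
}

def kinds_of(phrase: str) -> set[str]:
    """Return the set of ontological categories a phrase can refer to."""
    return {e.kind for e in referents.get(phrase, [])}

print(sorted(kinds_of("The Lord of the Rings")))
```

The string alone gives you the key; only the categorical structure on the right-hand side tells you what kinds of thing it can denote.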