[in response to Fernando’s response to Strings are not Meanings part 2.]
Fernando points out that
[t]he data we were discussing in the original paper tells us a lot about how people “produce and interpret” it. Links, clicks, and markup, together with proximity, syntactic relations, and textual similarities and parallelisms, are powerful traces of how writers and readers organize and relate pieces of information to each other and to the environments in which they live.
Fernando is right – these observations are powerful traces of how writers and readers organize and relate pieces of information. Just as a film of Kasparov is a trace of his playing chess.
In this dialectic, I’m beginning to get some coordinates on where our beliefs, methods and reasoning overlap and where they don’t. Certainly there is text ‘out there’, created by intelligent agents. We might also agree that neural activity occurs in response to consuming that text and as a precursor to creating it. My interests lie in getting a handle on a level of abstraction of that process which will allow two things: first, for a computer to act in the same way, and second, for a human to understand what it is the computer is doing, and how that relates to the workings of the human mind, in order to achieve that behaviour.
I think that Fernando approaches this space from a more behaviourist mindset – accepting the input, output and context but with no requirements for stuff happening ‘inside’. There is no discredit in this approach – we are just interested in different things, and those differences lead down different paths in terms of our beliefs and assertions.
I owe a debt to a lecturer in the philosophy of AI at Edinburgh who introduced ‘ontology’ not in its now more common meaning (a graph of nodes and labels), but in its original philosophical sense: an ontological statement is a statement about something that (the speaker believes) is actually the case. In addition to their taxonomic meaning, ontologies have come to refer to a requirement for communication – that the stuff I refer to maps to the same stuff for you. The relationship between these two uses, together with semantics and inference, provides a basis for communication between agents and for taking action in the world. It seems to me that this offers a very practical framework for AI, and that our efforts should not be spent rejecting that paradigm, but in figuring out graceful approaches to all the messy stuff that happens at the edges.
I recall that when Kitano won the Computers and Thought Award at IJCAI in 1992, he declared machine translation done, because example-based translation could now operate at a scale capable of handling anything we could say.
I disagree that Fernando has no requirements for the stuff that happens "inside." Indeed, I see that as the very essence of what he is trying to understand. While your "agents", as you mention, read and write semantic meaning that is formally specified according to some ontology, Fernando's approach reads and responds to semantic meaning specified in the form of massive amounts of web text. Web text is inherently less formal and more imprecise... hence large-scale consumption, to capture all of the edge cases and better "understand" the inexplicit context. Agents consuming formal ontological statements have the advantage of explicitly defined contexts, since those contexts are formally specified. What matters is not what is consumed, but the action that can be taken upon consumption (see my blog post for perhaps a better description: http://ivsoftware.blogspot.com/2009/04/its-all-just-semantics.html)
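The contrast here can be sketched in miniature: an agent consuming a formally specified statement gets an exact, explicitly defined context, while a statistical reader of web text infers relations from repeated co-occurrence, with no formal assertion anywhere. This is only a toy illustration under my own assumptions; all of the names and data below are hypothetical and stand in for neither side's actual systems.

```python
# Toy contrast between the two approaches discussed above.
# Everything here is a hypothetical illustration, not anyone's real system.

# 1. The "formal ontology" route: meaning is an explicit, agreed-upon triple.
triples = {("Kasparov", "plays", "chess")}

def agent_knows(subject, predicate, obj):
    """An agent acting on formally specified statements: the context is
    explicit, so lookup is exact -- but it only covers what was asserted."""
    return (subject, predicate, obj) in triples

# 2. The "large-scale text" route: meaning is inferred from co-occurrence
# across many informal documents, with no explicit assertion anywhere.
corpus = [
    "Kasparov plays chess at a grandmaster level",
    "watching Kasparov play chess is a trace of intelligence",
    "chess engines now beat strong human players",
]

def cooccurs(term_a, term_b):
    """Statistical reading: count documents in which both terms appear,
    as weak evidence that the terms are related."""
    return sum(term_a in doc and term_b in doc for doc in corpus)

print(agent_knows("Kasparov", "plays", "chess"))  # True: explicitly asserted
print(cooccurs("Kasparov", "chess"))              # 2: inferred from raw text
```

The formal lookup is precise but brittle (it knows nothing it was not told), while the statistical count degrades gracefully over messy text but never yields an explicit statement, which is roughly the trade-off being argued about.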
Further, I see the large-scale data processing described by Fernando as supporting human cognitive models well. Much of what we are is determined by the large volumes of data we have processed since childhood. There is a very real aspect to social meaning derived from repeated exposure to some "truth." This, in large part, is what is so interesting about the social web: it opens the horizons of what we can experience and understand socially. While I am no expert, I see a fundamentally complementary role between these two approaches. There are aspects of human cognition that seem extremely difficult, if not impossible, to model with a purely descriptive formal specification. At the same time, higher levels of rational thought would be difficult to model successfully by a purely statistical/social approach.
Posted by: Robert Butler | April 14, 2009 at 08:38 AM