The fallibility of introspection as a means to understanding consciousness is well known. Tempting as it may be to refer to the 'voice' inside our head - the inner monologue, or the thought process - one can never win an argument by playing this card. While we may, at most, use this common experience as an indication of the separation between conscious and subconscious thought, we can't claim that intelligence works this way or that way by summarizing a thought process.
If we could do that, then we would simply declare that intelligence involves inference, self-awareness, symbolic reasoning, etc. This argument can be brushed aside by reasoning that we have no evidence that our inner monologue, or stream of consciousness, is a prime mover - rather, it may well be a post hoc phenomenon.
However, when it comes to communicating with other agents in what we perceive to be the real world, we have created an interface that does appear to have all of these nice qualities: symbols, structure, stereotypes and so on are all used to externalize our thoughts and serve as an input mechanism for grasping the inner workings of our fellow beings. And while it is attractive to believe in the emergence of intelligence via huge data sets and massive but simple processing power - that intelligence will arise from the simplest machines if only we throw enough data at them - the fact of the matter is that much of what we learn as humans, we learn by consuming structured symbols of various types.
Fernando asks:
How do you know [that] the power to generalize [doesn't come from massive scale]?
Fernando's post is somewhat confusing. He argues that scientific discovery is perhaps the most celebrated example of the qualities of intelligence that I require for AI. Science is perhaps the most formal, structured, symbolic and hierarchical form of communication that society has created. Fernando's example of scientists creating machines to mine genomic data for repeated structures is, he claims, one that supports the use of scale for AI. But how did we get from the genomic data - represented as simple sequence - to the problem of finding patterns in it? That requires all of the symbolic, hierarchical, structured knowledge: the genetic model.
In (partial) answer to Fernando's question - clearly the parallelism of the brain is considerable, but that is not the type of scale that Larry Page is talking about (that is to say, the symbols - or units/mechanisms of representation - and operations involved are quite different).
You write: "Fernando's post is somewhat confusing. He argues that scientific discovery is perhaps the most celebrated example of the qualities of intelligence that I require for AI. [...] Fernando's example of scientists creating machines to mine genomic data for repeated structures is, he claims, one that supports the use of scale for AI. But how did we get from the genomic data - represented as simple sequence - to the problem of finding patterns in it?"
This seems disingenuous.
In your prior post you indicated that "The reason [Google's weak vision of AI] upsets me is that driving for scale of this type sidesteps the fundamental power to generalize."
Fernando then responded, "The obvious question to ask Matt is 'How do you know?' Where does the power to generalize come from? The hypothesis that it is based on a very large associative memory is at least as credible as any of the alternatives. It's certainly the case that all of the advances in information retrieval, speech recognition, machine translation, image understanding, and genomics, of the last twenty years are basically advances in extracting statistical regularities from very large data sets. No other approach has had even a teeny fraction of that impact."
He then elaborates on the genomic example: "Computational biologists developed a variety of statistical pattern-matching methods that discover the specific conserved and mutated elements by comparing the genomes of related species. These methods are able to find generalizations that lead to experimentally testable predictions of, for instance, important conserved regulatory modules. That's just one among many discoveries that would be totally impossible without statistical generalization from huge data sets. These statistical methods are discovering new facts about biological evolution and function."
As a response to your strong claim "that driving for scale of this type sidesteps the fundamental power to generalize", this seems to me to be a quite clear and valid critique.
Posted by: joyrexus | April 03, 2007 at 10:42 AM
Joy
Thanks for your comment.
Let's take a simple example. There are a number of applications of statistics to language which involve n-grams (machine translation - an area in which Google has invested and hired heavily in recent years - is a good example). The problem with n-grams is that there are never enough. It always seems that you could do better if you just had more, and longer, n-grams. A data-scale approach would be to do just that - create an even bigger set of n-grams.
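To make the sparsity point concrete, here is a toy sketch in Python (the corpus is a stand-in - any large text collection would do) of why "just build a bigger set of n-grams" is the natural scale-driven response: as n grows, the number of distinct n-grams explodes while the evidence for each one thins out.

    # Toy sketch of the n-gram sparsity problem: as n grows, the number of
    # distinct n-grams explodes while the evidence for each one thins out.
    # The corpus below is a stand-in; any large text collection would do.
    from collections import Counter

    corpus = ("the cat sat on the mat the dog sat on the rug "
              "the cat lay on the rug").split()

    def ngram_counts(tokens, n):
        """Count all contiguous n-grams in a token sequence."""
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    for n in (1, 2, 3, 4):
        counts = ngram_counts(corpus, n)
        singletons = sum(1 for c in counts.values() if c == 1)
        print(f"n={n}: {len(counts)} distinct n-grams, {singletons} seen only once")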
This large-data approach is also evident in example-based MT, in which one uses a set of example pairs to translate from A to B. Hiroaki Kitano, in his acceptance speech for IJCAI's Computers and Thought Award, indicated that MT was pretty much solved by this approach (plus massive parallelism). The thinking being: the more examples, the better the translation.
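For the same reason, a minimal sketch of the example-based idea - translation by analogy to stored example pairs rather than explicit linguistic rules - makes the dependence on coverage obvious. The pairs and the crude character-level similarity below are purely illustrative.

    # Minimal sketch of example-based MT: translate by analogy to stored
    # example pairs rather than by explicit linguistic rules. The pairs and
    # the character-level similarity measure are purely illustrative.
    from difflib import SequenceMatcher

    examples = [
        ("the cat is black", "le chat est noir"),
        ("the dog is small", "le chien est petit"),
        ("the house is old", "la maison est vieille"),
    ]

    def translate(sentence):
        """Return the target side of the most similar stored example."""
        best = max(examples,
                   key=lambda pair: SequenceMatcher(None, sentence, pair[0]).ratio())
        return best[1]

    print(translate("the cat is old"))
    # -> "le chat est noir": the closest stored pair wins, right or wrong,
    #    which is exactly why coverage - more examples - matters so much here.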
I confess that my beliefs regarding AI often relate to some notion of an elegant solution rather than an engineering one - which is why I'm happy to engage in debates around it. In the above discussion, I'm characterizing Google's approach as being one that first resorts to scale rather than something more interesting. This is again where personal preferences come in (i.e. the notion of what is interesting).
In re-reading Fernando's comment on my post (his post), I realise that I misinterpreted his question. I think he means: how do I know that Google's approach sidesteps generalisation? Take a look at the video of Larry Page declaring that AI is an issue of data and processing power, not algorithms, and see what you think.
Posted by: Matthew Hurst | April 03, 2007 at 11:20 AM