
April 02, 2007



You write: "Fernando's post is somewhat confusing. He argues that scientific discovery is perhaps the most celebrated example of the qualities of intelligence that I require for AI. [...] Fernando's example of scientists creating machines to mine genomic data for repeated structures is, he claims, one that supports the use of scale for AI. But how did we get from the genomic data - represented as simple sequence - to the problem of finding patterns in it?"

This seems disingenuous.

In your prior post you indicated that "The reason [Google's weak vision of AI] upsets me is that driving for scale of this type sidesteps the fundamental power to generalize."

Fernando then responded, "The obvious question to ask Matt is 'How do you know?' Where does the power to generalize come from? The hypothesis that it is based on a very large associative memory is at least as credible as any of the alternatives. It's certainly the case that all of the advances in information retrieval, speech recognition, machine translation, image understanding, and genomics, of the last twenty years are basically advances in extracting statistical regularities for very large data sets. No other approach has had even a teeny fraction of that impact."

He then elaborates on the genomic example: "Computational biologists developed a variety of statistical pattern-matching methods that discover the specific conserved and mutated elements by comparing the genomes of related species. These methods are able to find generalizations that lead to experimentally testable predictions of, for instance, important conserved regulatory modules. That's just one among many discoveries that would be totally impossible without statistical generalization from huge data sets. These statistical methods are discovering new facts about biological evolution and function."

As a response to your strong claim "that driving for scale of this type sidesteps the fundamental power to generalize", this seems to me to be a quite clear and valid critique.

Matthew Hurst


Thanks for your comment.

Let's take a simple example. There are a number of applications of statistics to language which involve n-grams (machine translation - an area in which Google has invested and hired heavily in recent years - is a good example). The problem with n-grams is that there are never enough. It always seems that you could do better if you just had more and longer n-grams. A data scale approach would be to do simply that - create an even bigger set of n-grams.
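The sparsity problem can be seen directly by counting n-grams in a toy corpus: as n grows, almost every n-gram becomes unique, so the counts that statistical models rely on vanish. A minimal sketch (the corpus and helper function here are illustrative, not from any particular MT system):

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

corpus = "the cat sat on the mat and the dog sat on the rug".split()

for n in (1, 2, 3, 4):
    counts = ngram_counts(corpus, n)
    repeated = sum(1 for c in counts.values() if c > 1)
    print(f"n={n}: {len(counts)} distinct n-grams, {repeated} seen more than once")
```

Even in this tiny corpus, by n=4 no n-gram repeats at all; the "just add more data" response is to grow the corpus until the longer n-grams do start repeating.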

This large data approach is also evident in example-based MT, in which one uses a set of example pairs to translate from A to B. Hiroaki Kitano, in his acceptance speech for IJCAI's Computers and Thought Award, indicated that MT was pretty much solved by this approach (plus massive parallelism). The thinking being - the more examples, the better the translation.
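At its simplest, the example-based idea is retrieval: find the stored source sentence most similar to the input and return its paired translation. A minimal sketch (the similarity measure and example pairs are my own illustrative choices, not Kitano's system):

```python
def overlap(a, b):
    """Jaccard word-overlap similarity between two sentences."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

# A tiny translation memory of (source, target) example pairs.
examples = [
    ("the weather is nice", "il fait beau"),
    ("the train is late", "le train est en retard"),
]

def translate(sentence, examples):
    """Return the target side of the closest source example."""
    best = max(examples, key=lambda pair: overlap(sentence, pair[0]))
    return best[1]

print(translate("is the train late", examples))  # -> "le train est en retard"
```

The quality of the output depends entirely on the coverage of the example set, which is exactly why the approach invites "more examples" as the answer to every failure.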

I confess that my beliefs regarding AI often relate to some notion of an elegant solution rather than an engineering one - which is why I'm happy to engage in debates around it. In the above discussion, I'm characterizing Google's approach as being one that first resorts to scale rather than something more interesting. This is again where personal preferences come in (i.e. the notion of what is interesting).

In re-reading Fernando's comment on my post (his post) I realise that I misinterpreted his question. I think he means: how do I know that Google sidesteps generalisation? Take a look at the video of Larry Page declaring that AI is an issue of data and processing power, not algorithms, and see what you think.
