
May 12, 2008

Powerset Factz, Star Wars

One of the best things about Powerset is its Factz feature. If you look at a page for a movie, you can see a pretty neat, completely automated summary of the plot. Have a look at Star Wars.

[Image: Powerset_starwars]

Powerset Launches!

Powerset, which provides a new relationship with web data via innovative interfaces and natural language processing, launched this evening. Take a look at this video:

I'll write more later, but for now, check out other posts I've made on Powerset and NLP. I'll try to keep abreast of the commentary as it comes in. Meanwhile, I'm waiting for Fernando to pounce.

Update: OK, some comments. There are a couple of things that people are going to get hung up on. Firstly, writers seem to be referring to the technology as context or contextual search - why not call it NLP? Not sure where that is coming from. Secondly (actually, this is more important), pundits are going to write about the Wikipedia-only issue. They're not getting it. 90% of search results come from a tiny fraction of web pages, due to the huge redundancy on the web and the differences between searcher needs and author/publisher intents. The task isn't to always search that huge set, but to get the answers to the user.

June 14, 2007

Powerset, Powerlabs and Powerbuzz

Powerset is proving that it is not only chock full of NLP and CompLing ninjas, but is also not too shabby when it comes to online promotion. Powerlabs is a program for developers which gives them access to Powerset capabilities - it is also a channel through which anticipation can be built. Yesterday, Steve Newcomb posted a screenshot of the Powerlabs interface. Before he did this, he pushed some pictures out to the Powerlabs group.

[Images: Ps1_2, Ps2_2, Ps3_2]

February 14, 2007

Back To The Future: NLP, Search, Google and Powerset

Those following the parallel debates concerning NLP and search (the NLP discussion from the technical side, in parallel to the thou-shalt-not-hype discussion from the Web 2.0 pundit-sphere) may be interested in this post by John Battelle from October 12th, 2004 (!). In the context of recent discussion, one hardly knows where to begin quoting:

"Named entity extraction" is a relatively new project [] which Norvig said Google had been working on for about six months. As Norvig explained the concept - essentially identifying semantically important concepts and the meaning wrapped around them[.]

This is in the context of a technology demo which Google gave around that time. Battelle continues, quoting Norvig in an eWeek story:

For example, Norvig said, researchers are looking for ways to break down sentences by looking for a phrase like "such as" and grabbing the names that follow it. The goal is to not only pull out the name but also its clusters, so that a name such as "Java" can be associated both with the computer language and with language in general, Norvig said.

"We want to be able to search and find these [entities] and the relationships between them, rather than you typing in the words specifically," Norvig said.

Battelle then goes on to speculate about how these capabilities might surface in the Google UI. The last sentence in the above quote seems so close - at least in terms of vision - to some of the current wave of NLP search debate that it provokes the question: what happened to this project? Did Google try and fail? If you read it closely, you'll see that Norvig is talking about some key NLP concepts:

  • Entities (typed concepts expressed in short spans of text, generally noun phrases)
  • Ontologies (Java IS_A programming language)
  • Relationships (between entities)

I mean - couldn't you build a next gen search engine on such wonderful ideas?
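
To make the "such as" idea from the Norvig quote a little more concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not Google's or Powerset's method: the regular expression, the `extract_is_a` name and the use of capitalised words as a stand-in for entities are all hypothetical choices made for illustration.

```python
import re

# Assumption: a capitalised token stands in for a named entity.
# A real system would use a proper entity recogniser instead.
NAME = r"[A-Z][\w+#]*"

# Assumption: "<category> such as <Name>, <Name> and <Name>" is the only
# cue handled here; the category is taken to be the single word before it.
PATTERN = re.compile(
    rf"(\w+) such as ({NAME}(?:(?:, | and | or |, and |, or ){NAME})*)"
)


def extract_is_a(text):
    """Return (entity, "IS_A", category) triples found via the "such as" cue."""
    triples = []
    for match in PATTERN.finditer(text):
        category = match.group(1)
        # Split the matched name list on commas and on "and" / "or".
        names = re.split(r",? (?:and|or) |, ", match.group(2))
        triples.extend((name, "IS_A", category) for name in names)
    return triples


if __name__ == "__main__":
    sentence = "He writes in languages such as Java, Python and Haskell."
    for triple in extract_is_a(sentence):
        print(triple)
```

Each triple pairs an entity with a category through an IS_A relationship, which is the entity / ontology / relationship split listed above in miniature. It says nothing about the harder part Norvig mentions, telling Java the language apart from other senses of the word.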

February 09, 2007

Powerset In PARC Deal

VentureBeat (once a critic of Powerset, now more of a believer) covers the story. In summary, Powerset's technology is not some rushed-together start-up demo, but the result of many man-years of research and development at PARC.

VentureBeat's post contains an interview around the topic of NLP with Google's Peter Norvig. While Norvig gives some insight into the work on NLP at Google, he doesn't mention one of their main areas of focus: machine translation (MT). It is interesting to learn of his caution in the area of NLP for search while they have a number of scientists working on MT, possibly the hardest AI problem known to man.

Update: It took this to have John Battelle blog about the story...

November 26, 2006

Powerset In The Observer

Briefly, Powerset gets a writeup in The Observer:

Barney Pell is pursuing the dream of a 'natural language' search engine. He says that today's products, such as Google, search only for keywords and cannot, for example, distinguish between 'book for children', 'book by children', and 'book about children'. But a natural language search could identify 'function' words, understand that word order means something and respect the importance of small 'stop words'. Pell, 38, believes Powerset's search engine, soon to be launched, will be a catalyst for the 'semantic web'.
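
The stop word point in that quote can be shown with a small Python sketch. This is only an illustration under assumptions: the `STOP_WORDS` list and both function names are hypothetical, and no real search engine is claimed to work this way.

```python
# Hypothetical stop list, kept tiny for the example.
STOP_WORDS = {"a", "about", "an", "by", "for", "of", "the"}


def keyword_view(query):
    """Bag-of-keywords view: stop words dropped, word order ignored."""
    return frozenset(w for w in query.lower().split() if w not in STOP_WORDS)


def function_word_view(query):
    """Keep every token in order, so 'for', 'by' and 'about' still count."""
    return tuple(query.lower().split())


queries = ["book for children", "book by children", "book about children"]

if __name__ == "__main__":
    # All three queries collapse to the same keyword set...
    print(len({keyword_view(q) for q in queries}))
    # ...but stay distinct once function words and order are kept.
    print(len({function_word_view(q) for q in queries}))
```

Under the keyword view the three queries become the same set of words, which is the failure the quote describes. Keeping the function words is only the first step, since something still has to interpret what 'for', 'by' and 'about' mean.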

November 05, 2006

Powerset Update

I'm tickled by the new Powerset site. Powerset is doing something that really excites me - applying natural language processing to the problem of search. The reason the new site is of interest to this blog is partly due to that, but also due to the fact that they redesigned the site in part to be better positioned to respond to blog conversations about their company. A little over a week ago, Powerset was at the centre of a moderate blogstorm in which various pundits traded views on the value and promise of natural language search. At that time Barney Pell (CEO) responded on his own blog, as did a few other Powerseters. With the new site, they have both a press release section and a corporate blog at the ready.

November 04, 2006

Powerset, Amazon: Standing On The Shoulders of (a) Giant's Infrastructure

Barney posts about how Powerset is taking advantage of Amazon's EC2 (Elastic Compute Cloud). He quotes from a Business Week article:

Consider Powerset, the secretive search startup backed by A-list angel investors, including PayPal Inc. (EBAY ) co-founder Peter Thiel and veteran tech analyst Esther Dyson. Co-founder and CEO Barney Pell harbors ambitions of out-Googling Google with technology that he says would let people use more natural language than terse keywords to do their searches. By analyzing the underlying meaning of search queries and documents on the Web, Powerset aims to produce much more relevant results than the current search king's.

Problem is, Powerset's technology eats computing power like a child munches Halloween candy. The little 22-person company would have to spend more than $1 million on computer hardware, two-thirds of that just to handle occasional spikes in visitor traffic, plus a bunch of people to staff a massive data center and write software to run it. That's when Pell heard about Elastic Compute Cloud. He was sold. Based on tests so far, using the Amazon site for part of the company's computing power could cut its first-year capital costs alone by more than half.

Cool!


October 11, 2006

Powerset: Blogstorm t+7

Briefly, Barney summarises the discussion (very diplomatically ;-) and Lorenzo writes a great first post on the potential and promise of natural language search. Missing from the discussion so far is a broad view of the index side component of NLP search infrastructure - the really interesting part. It is unfortunate that critics are focusing on the behaviour of users (will they or won't they write natural queries) and the ability of the search service to interpret those queries (can they automatically understand and encode the subtle differences?) rather than the more interesting issue of a system that can ingest all the data on the web in a manner that could be called interpretation. Perhaps the only relevant writing on this aspect so far has been the mention of clustering technology such as Vivisimo's, which gives the appearance of interpretation (of a certain class - that of word sense) post hoc.

You can see the conversation graph rooted at Barney's original post here.

October 05, 2006

Powerset: Update

Update: TechCrunch picks up the story. One thing that is getting a little fuzzy here is where NLP ought to be applied in search. Understanding the query is only one part - the other part (to me, the more interesting part) is in understanding the text in the documents. This also requires smarts for understanding the structure and layout of documents and the function of different document areas (navigational, content bearing, etc.). This point appears to be missed in the TechCrunch post (and in the comments by readers there).

Briefly, Barney has written up in detail his vision regarding Powerset, NLP in search and the restrictions of keyword-based search. This is worth reading - and I'll post more on it later. I wanted to link this article to a recent post by Paul Kedrosky which expressed satisfaction with current search engines:

I have no idea anymore what "better" search would mean. I find pretty much everything I want now, and while natural-language processing always sounds great, improvements in how I submit searches do diddly for me.

My analogy is as follows: imagine interacting with a reference librarian, but being able to speak only in 2-3 word statements. The reason this analogy is useful is that it points out that the search problem, when not constrained by the assumptions and expectations of the text box, is far richer and more complex than we've been blinkered to believe.

Oh yes - I also wanted to quote Barney on a most excellent turn of phrase for the language of keyword-based queries, which he describes as: a grunting pidgin language. Excellent!
