
January 28, 2007


Comments

I did a little poking around with the map app in Book Search based on your earlier post. Being an O'Brian fan, I did a search on the Aubrey/Maturin books. As it turns out, there's a location in Venezuela named "Maturin." The Google search also finds Berkeley, California, in a passage about "a young attaché called Berkeley..."

Don't get me wrong, I see a real use for this thing. We just have to remember its limitations.
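To make that limitation concrete, here is a minimal sketch of the kind of false positive described above. It assumes a naive approach in which every capitalised word is looked up in a gazetteer; this is not a claim about how Book Search actually works, and the `GAZETTEER` table and `naive_place_matches` function are hypothetical names invented for illustration.

```python
import re

# Tiny hypothetical gazetteer; the entries mirror the two examples in this comment.
GAZETTEER = {"Maturin": "Venezuela", "Berkeley": "California, USA"}


def naive_place_matches(text):
    """Return (name, region) for every capitalised word found in the gazetteer."""
    words = re.findall(r"[A-Z][a-z]+", text)
    return [(word, GAZETTEER[word]) for word in words if word in GAZETTEER]


# A surname and a character name are both reported as places,
# because the lookup has no notion of context.
print(naive_place_matches("a young attaché called Berkeley"))
print(naive_place_matches("Stephen Maturin dined with Aubrey"))
```

Telling a person called Berkeley apart from the city needs context around the word, which a bare lookup like this never sees.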

This reminds me of another project, GutenKarte, which maps place names found in free books using open source tools and MetaCarta's API: http://gutenkarte.org/

They have a tool here which will attempt to do the same for any web page: http://labs.metacarta.com/PageMapper/

Before Flickr had its own geotags, my colleagues built Mappr to plot photos with place names in their tags onto a map of the USA: http://www.mappr.com/

All these projects suffer from the same problems identified in the post and first comment. Mappr was interesting, though, because it didn't require a "place" to be blessed by the big geocoding databases: tourist trails like Route 66, or events like Burning Man, could emerge as "places" in their own right.

I wonder if the book search tools will develop in this direction, and also how they will deal with historical locations that no longer exist, or that change name. An exciting challenge!

Speaking as someone currently working on a geocoder: it's hard enough to parse addresses reliably when they're already identified as addresses and you can assume a roughly consistent format. How does one determine that "4th Ave Bypass" has two suffixes ("Ave" and "Bypass"), while "Lyttelton Close Road" has only one? Worse, "Lyttelton Close" could refer to a street named "Lyttelton" with the suffix "Close", or to a street named "Lyttelton Close" with its suffix omitted. Address parsing is littered with ambiguities like this, and it only gets worse when you want to recognise addresses in free text.
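The ambiguity can be sketched in a few lines. This is only an illustration under stated assumptions, not the geocoder mentioned above: the `SUFFIXES` table and the `candidate_parses` function are hypothetical, and a real suffix list would be far longer.

```python
# Hypothetical suffix table; a real geocoder would use a much larger one.
SUFFIXES = {"ave", "avenue", "bypass", "close", "road", "rd", "st", "street"}


def candidate_parses(street):
    """Return every (name, suffixes) reading in which trailing tokens are known suffixes."""
    tokens = street.split()
    parses = [(street, ())]  # reading in which the suffix was omitted entirely
    # Peel suffix tokens off the end one at a time, keeping at least one name token.
    for i in range(len(tokens) - 1, 0, -1):
        if tokens[i].lower() not in SUFFIXES:
            break
        name = " ".join(tokens[:i])
        suffixes = tuple(token.lower() for token in tokens[i:])
        parses.append((name, suffixes))
    return parses


for street in ("4th Ave Bypass", "Lyttelton Close Road", "Lyttelton Close"):
    print(street, "->", candidate_parses(street))
```

Every input comes back with more than one plausible reading, and nothing in the string itself says which is right. Choosing between them needs outside knowledge, such as a list of streets that actually exist.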
