September 10, 2007

Word Trees

Neoformix posts about a new corpus visualization available on Many Eyes called Word Trees. It is fun to play with, and is a spin on a more traditional tool called KWIC (key word in context).

However, let's consider the utility here. The visualization uses font size to indicate frequency. This certainly gives one an intuitive feeling for the relative distribution of patterns, but no quantitative information. Also, given the abundance of localized variation in language, it would be very useful to see this interface extended to include patterns of some sort, or at least wild cards.
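To make that wish concrete, here is a minimal sketch of a KWIC listing that accepts a wild card and reports actual counts instead of font sizes. It is not how Word Trees works; the function names (`kwic`, `next_word_counts`), the shell-style `inform*` pattern, and the sample text are all made up for illustration.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

def kwic(text, pattern, width=3):
    """Return (left, keyword, right) for every token matching a wild-card pattern."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    rows = []
    for i, tok in enumerate(tokens):
        if fnmatchcase(tok, pattern):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

def next_word_counts(rows):
    """Count the word immediately following each match (the quantitative part)."""
    return Counter(right.split()[0] for _, _, right in rows if right)

# Hypothetical sample text, invented for this sketch.
sample = ("Information of value was integrated. Information of interest "
          "was informally introduced. The informed reader wants information now.")

rows = kwic(sample, "inform*")
for left, kw, right in rows:
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>30} | {kw:<12} | {right}")
print(next_word_counts(rows).most_common())
```

The counts give the numbers that font size only hints at, and the wild card pulls "information", "informally" and "informed" into one listing.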

For an interesting variation on this, take a look at this paper by Futrelle et al. which includes this display:

[Image: the display from the Futrelle et al. paper]

Comments

The key elements of the KWIC plus display from Futrelle et al. are:

- readable real examples (slightly compromised by the need to squeeze the middle
column of black in some way)
- aligning corresponding elements of the lines so as to show relationships
- sorting so as to bring together the lines that are related in interesting
ways (all the "of's" after "information", but also "integrated" and "introduced"
leading to interpolations in the otherwise contiguous sequences of "of's")

The quantitative information is implicit in the visible long sequence of "of's" rather
than made numerical. This is a good choice perceptually if the examples
are few enough to allow it while maintaining readability. I wonder if the tool ever
does vertical ellipsis, somehow indicating how many examples were left out because it
thought they were dull.
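A rough sketch of what such a vertical ellipsis could look like, assuming the lines are already (left, keyword, right) triples. Here "dull" is simply taken to mean "shares its following word with lines already shown", which is an assumption of this sketch and not anything the Futrelle et al. tool is known to do; `collapse` and `max_show` are hypothetical names.

```python
from itertools import groupby

def first_right_word(row):
    """First word of the right-hand context, or '' if there is none."""
    words = row[2].split()
    return words[0] if words else ""

def collapse(rows, max_show=2):
    """Sort by right context, show at most max_show lines per following word,
    and replace the rest of each run with a vertical-ellipsis line and a count."""
    ordered = sorted(rows, key=lambda r: r[2].split())
    out = []
    for _, group in groupby(ordered, key=first_right_word):
        group = list(group)
        for left, kw, right in group[:max_show]:
            out.append(f"{left:>20} {kw} {right}")
        hidden = len(group) - max_show
        if hidden > 0:
            out.append(f"{'':>20} ⋮ ({hidden} more)")
    return out

# Invented example lines, for illustration only.
example = [
    ("the", "information", "of value"),
    ("more", "information", "integrated here"),
    ("some", "information", "of interest"),
    ("new", "information", "of course"),
    ("that", "information", "introduced later"),
]
lines = collapse(example)
print("\n".join(lines))
```

The run of "of's" stays visible, but only up to `max_show` lines, and the count on the ellipsis line says how many were left out.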

If we had general AI, a modest consequence would be to make generalized KWIC
displays better. This deals completely with the long-standing difficulty of selling
AI to lexicographers.
samples it shouldn't show
