April 17, 2006

Rexa

Rexa, which shares a pedigree with Cora from Just Systems and which offers a product similar to CiteSeer and Google Scholar, is now live, according to John Langford's Machine Learning blog. Andrew McCallum, whom I worked with back at WhizBang, is the PI for this project and comments:

Rexa is a digital library covering the computer science research literature and the people who create it. Rexa is a sibling to CiteSeer, Google Scholar, Academic.live.com and the ACM Portal. Its chief enhancement is that Rexa knows about more first-class, de-duplicated, cross-referenced object types: not only papers and their citation links, but also people, grants, topics, and in the future universities, conferences, journals, research communities, and more.

Rexa currently provides:
* Keyword search on over 7 million papers (mostly in computer science)
* Cross-linked pages for papers, authors, topics and NSF grants
* Browsing by citations, authors, co-authors, cited authors, citing authors;
  (find who cites you most by clicking “Citing authors” on your home page)
* Web-2.0-style “tagging” to bookmark papers
* Automatically-gathered contact info and photos of authors’ faces
* Analysis of research topics, their impact, and how they relate.

Coming soon:
* Much improved coverage of recent CS papers (it’s a little weak now)
* Ability to make corrections to extracted data

Coming later:
* Improved extraction and co-reference accuracy
* Much more data mining
* Broader coverage of more research fields

Rather than seeing our siblings as competitors, we believe that such services are like “newspapers for the research community”, and, just as it is tremendously important that there is not just one national newspaper, we think there should be many such services. This is especially true since increasingly they will do more than simply supply raw information, but also provide subjective analysis, pattern discovery, and predictions.

One of the key challenges for this type of vertical search is recognizing variations of named entities. A search for 'Andrew McCallum' is a good example: the first four results refer to the same person but are listed individually. Citation analysis offers some strong clues for this problem. Papers written by the same person tend to share topics, so strong topical affinity among the papers attached to 'R. Andrew McCallum', 'Andrew McCallum' and 'A. McCallum' suggests that these variations are in fact references to the same person.
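To make the idea concrete, here is a minimal sketch of how name compatibility and topical affinity could be combined. It is not how Rexa does it; the function names, the 0.2 threshold and the paper titles are all made up for illustration, and a real system would use much richer evidence (co-authors, venues, citation links).

```python
import math
import re
from collections import Counter


def parse_name(name):
    """Split a non-empty name into (given-name tokens, surname), lowercased."""
    tokens = [t.strip(".").lower() for t in name.split() if t.strip(".")]
    return tokens[:-1], tokens[-1]


def names_compatible(a, b):
    """True if two name strings could plausibly denote the same person.

    Surnames must match exactly. Every given-name token of the shorter name
    must match a token of the longer one, in full or as an initial.
    """
    given_a, last_a = parse_name(a)
    given_b, last_b = parse_name(b)
    if last_a != last_b:
        return False
    short, long_ = sorted([given_a, given_b], key=len)

    def token_match(x, y):
        if len(x) == 1 or len(y) == 1:  # one side is just an initial
            return x[0] == y[0]
        return x == y

    return all(any(token_match(s, l) for l in long_) for s in short)


def topic_vector(titles):
    """Crude bag-of-words topic profile built from paper titles."""
    words = re.findall(r"[a-z]+", " ".join(titles).lower())
    return Counter(w for w in words if len(w) > 3)  # drop very short words


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def same_author(name_a, titles_a, name_b, titles_b, threshold=0.2):
    """Hypothetical merge rule: compatible names AND similar topics."""
    return (
        names_compatible(name_a, name_b)
        and cosine(topic_vector(titles_a), topic_vector(titles_b)) >= threshold
    )


# Toy titles, invented for this example only.
titles_a = [
    "information extraction with conditional random fields",
    "text classification with naive bayes",
]
titles_b = ["conditional random fields for citation extraction"]
titles_c = ["compiler optimization for embedded processors"]

print(same_author("R. Andrew McCallum", titles_a, "A. McCallum", titles_b))
print(same_author("Andrew McCallum", titles_a, "A. McCallum", titles_c))
```

The name check alone would merge any two 'A. McCallum' entries, so the topic comparison acts as the tie-breaker: two compatible names are only merged when their papers look like they come from the same area.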

Something I've been thinking about, and would love to see built on top of a system like this, is a social network and commentary space that can continue the discussion a paper starts. Imagine looking up one of Andrew's papers and being able to attach follow-up questions to it. Perhaps one of the authors will answer, or perhaps someone cited by that paper will make a comment.

At any rate, congratulations to Andrew on getting this out the door. The feature set looks impressive and I'm looking forward to exploring it more.

