

December 06, 2006

Comments

Kevin Burton

Having .csv upload would be simple enough.

It would be awesome to have them work on a plugin so I can import from google analytics or sitemeter....

Big source of data...

Kevin

Matthew Hurst

They do have csv upload as standard (though when I tried to upload a single dimension of data it failed). The import from other sources is an idea I passed along - imagine access to blogpulse data so you could mix it with stock prices or average temperatures, etc.

Andrew Hitchcock

I uploaded a dataset from a website. I didn't notice it had a bunch of references until after I uploaded it, and they caused the formatting and everything to be all messed up. I couldn't find a way to change the input dataset. I'm trying to upload it again as a new set (the site is being hammered), but I don't see how to delete the old one.

visnu

ooh, those papers should be an interesting read. but personally, pulling meaningful data out of html web pages consistently and correctly all the time is, i have to admit, impossible. add on top of that, that the table will likely be digested into a more human readable format (totals rows, aggregated data values) and it becomes even more intractable. swivel happens to shine at near-raw formatted data, the type that people look at and instantly need to put into a tool anyway.

now, what would you say to a data input api, free for anyone to use or write plugins for? hotness.

visnu
swivel eng #2

visnu

witches, i didn't read everyone's comments before commenting myself. kevin is insightful.

Matthew Hurst

Visnu - o ye of little faith. The problem is not to provide an algorithm that is perfect for every table, but to provide an algorithm that can correctly pull data from some tables and, importantly, know when it is being accurate.
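
To make the "know when it is being accurate" idea concrete, here is a minimal sketch in Python. It is not the algorithm from the papers and not anything Swivel runs; the names `TableGrid` and `extract_table`, the 0.9 threshold, and the three checks (no nested tables or spanning cells, no totals rows, consistent row widths and column types) are all assumptions chosen for illustration. It also assumes the page holds a single data table with a header row.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Collect the cell text of top-level <table> rows (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.suspicious = False  # nested tables or spanning cells were seen
        self._depth = 0
        self._row = None
        self._cell = None

    def _flush_cell(self):
        # Close the open cell, if any, collapsing runs of whitespace.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _flush_row(self):
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            self.suspicious = self.suspicious or self._depth > 1
        elif self._depth != 1:
            return
        elif tag == "tr":
            self._flush_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()
            self._cell = []
            if any(name in ("rowspan", "colspan") for name, _ in attrs):
                self.suspicious = True

    def handle_endtag(self, tag):
        if tag == "table":
            if self._depth == 1:
                self._flush_row()
            self._depth = max(0, self._depth - 1)
        elif self._depth != 1:
            return
        elif tag == "tr":
            self._flush_row()
        elif tag in ("td", "th"):
            self._flush_cell()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def _is_number(text):
    try:
        float(text.replace(",", "").rstrip("%"))
    except ValueError:
        return False
    return True


def extract_table(html, min_confidence=0.9):
    """Return (rows, confidence); rows is None when the parser abstains."""
    parser = TableGrid()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if len(rows) < 2 or parser.suspicious:
        return None, 0.0
    header, body = rows[0], rows[1:]
    # Aggregate rows (totals, averages) suggest a table digested for humans.
    if any(r[0].lower().startswith(("total", "average")) for r in body):
        return None, 0.0
    full = [r for r in body if len(r) == len(header)]
    if not full:
        return None, 0.0
    # Check 1: share of body rows that are as wide as the header row.
    rectangular = len(full) / len(body)
    # Check 2: each column should be mostly numeric or mostly text.
    scores = []
    for col in zip(*full):
        numeric = sum(_is_number(v) for v in col) / len(col)
        scores.append(max(numeric, 1.0 - numeric))
    confidence = rectangular * sum(scores) / len(scores)
    return (rows if confidence >= min_confidence else None), confidence
```

Calling `extract_table(page_html)` hands back the rows only when the score clears the threshold; otherwise it returns `None` along with the score, so the caller can fall back to asking the user for a CSV. The point of the design is that abstaining is a valid answer: the parser does not need to handle every table, only to avoid being confidently wrong.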

visnu

matthew - that's what i'd expect to deliver for html table parsing. i'd be satisfied if it works for 80+% of the cases, but falls flat on a good 1% or more. i actually cut my teeth on machine learning by parsing resumes into richer xml formats for an old company, and resumes fall in the scary territory of nlp-type things.

now, microformats. those would be fun to support too. point swivel at a page whose data isn't even in a table, but in more semantic html markup, and have it bring back something useful. that's web _3.0_. aw yeah.

but again, my lazy self would love a gang of plugin/api coders doing the work for me.

The comments to this entry are closed.
