
April 30, 2008

Photosynth on CSI

Briefly, Photosynth was used in an episode of CSI. Very cool!

Update: CBS, which produces CSI, has a pretty rich online experience for the show. Apparently you can watch the latest episode including Photosynth right here. I'm waiting for the episode to start streaming, but as I'm in Japan, I suspect it isn't going to happen. Let me know if you have any luck!

Below is a pic from the episode provided by USA Today.


History Commons

William forwarded me a link to this awesome site: Cooperative Research History Commons. The site is a cooperative approach to history and presents data in timelines (here is the list of events in the Nixon Administration and Watergate timeline). I like this vertical approach to wiki data as it has the potential to focus both expertise and data structures, making the data more valuable in a number of dimensions.

The website is a tool for open-content participatory journalism. It allows people to investigate important issues by providing a space where people can collaborate on the documentation of past and current events, as well as the entities associated with those events. The website can be used to investigate topics at the local, regional, or global level. The data is displayed on the website in the form of dynamic timelines and entity profiles, and is exportable into XML so it can be shared with others for non-commercial purposes.

I'm sure we are only moments away from seeing a slick visualization of this site.

(It would have been interesting to see the interaction between an objective project like this and historical memories such as those which the BBC's The Time When project aggregated - it is now, sadly, closed).

April 29, 2008

The Future of News

I'm still making my way back from Beijing, and have plenty of notes to write up for WWW 2008. However, I thought I'd point ahead to my next trip: The Future of News, a workshop being held at Princeton's Center for Information Technology Policy. I'm really looking forward to this event as it ties in some recent work I've been doing at Microsoft (including supporting MSR's Blews team), some broader thoughts about the relationship between politics, broadcast media and social media, and my current (belated) reading of Lessig's Code (v 2.0).

Some thought starters:

  • Mainstream news is increasingly driven by simple economic forces. I suspect that the increase in personality-driven news is due to a person's ability to forge emotional connections with a homophilic (read 'loyal') audience.
  • Social media has demonstrated its ability to act as a fifth estate.
  • More and more algorithmic editors are appearing (e.g. Google News, Silobreaker, European Media Monitor).
  • Despite claims to the contrary, algorithmic editors (that is to say, programs which automate the selection and presentation of news stories) do encode certain biases, intended or otherwise (witness the lack of Dalai Lama quotes in Google News' quotes feature).
  • The designers of algorithmic news systems have an opportunity to do social good (e.g. acting against homophily).
  • Seeing data (visualization) is a necessary first step in understanding what could be done with it.
  • Principles of data representation are required to keep automation from being blind to certain qualities or opportunities in the data.
  • Context provides a user with greater choice in media consumption, but adds to the cognitive effort involved.

April 27, 2008

The Future of Wired

I've been a loyal Wired subscriber for many years. I've suffered through its affairs with neon ink, but am getting more and more frustrated with its dependence on noxious substances (adverts, bits of cardboard, subscription postcards, ...). Then along came Seed. If you are a Wired reader, you will probably like Seed. If your Wired reading experience ends in navigational frustration as you hop over adverts, get stuck with cardboard popups and have trouble stitching together the article you are reading (continued on page X), you should certainly give it a go.

When thinking about writing about this, I started wondering: how much would I pay for Wired without the adverts? Connect that thought with the slow rise of on-demand printing and binding for paperback books, and one can imagine a market where you could choose to remove some percentage of adverts, or, say, only those adverts which include page-flipping interruptions (i.e. bits of cardboard designed to force you to read the advert).

Is the future of physical magazines to go out in an orgy of adverts (as the economics of the model force more and more adverts to be included, as appears to be the case with Wired), or is it something which better blends the online and offline experiences, user preferences and on-demand printing and binding technologies? Perhaps Chris has the answer.

April 24, 2008

Four Challenges in Social Network Analysis

There are plenty of papers here at WWW 2008 which touch on some form of (social) network analysis. I see four challenges emerging in this space:

  • Overlapping community analysis - many approaches to deriving communities from the larger graph assume that the communities are distinct. This is convenient but not intuitive (consider the different circles of friends that you move in). Lada pointed me to some work in this area by Palla et al.
  • Edge semantics - much work assumes a single relation (and often a single weight) is represented by the edges in the graph. Again, this is convenient but not intuitive. Even the links within the workplace are different (a colleague with whom you work closely versus a person that you work with due to organizational structure).
  • Modeling edge creation/maintenance cost - the cost of creating a link in the real world is far higher than creating one in an online social network (one or two clicks). How can graph models include this aspect?
  • Cross network analysis - many data sets that are explored in this space come from a single source. This may be an IM system or an online social network (e.g. Facebook, MySpace, etc.). However, if we consider the links between different networks, and even different types of networks, I believe we will observe some powerful features. For example, consider links from Usenet to the blogosphere, links between tweets and news articles, etc.
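To make the first challenge concrete, here is a minimal sketch of overlapping community detection in the spirit of the clique percolation idea from Palla et al., restricted to triangles (k=3). The function name, the toy graph and the people in it are all hypothetical, and this is an illustration under those assumptions rather than the authors' implementation.

```python
from itertools import combinations

def k3_clique_communities(edges):
    """Clique percolation for k=3: a community is the union of triangles
    that are chained together by shared edges (two nodes in common)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Enumerate every triangle once, as a frozenset of three nodes.
    triangles = set()
    for a, b in edges:
        for c in adj[a] & adj[b]:
            triangles.add(frozenset((a, b, c)))

    # Group triangles that share an edge, using a simple union-find.
    parent = {t: t for t in triangles}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for t1, t2 in combinations(triangles, 2):
        if len(t1 & t2) == 2:
            parent[find(t1)] = find(t2)

    # Each group of linked triangles becomes one community of nodes.
    groups = {}
    for t in triangles:
        groups.setdefault(find(t), set()).update(t)
    return list(groups.values())

# Hypothetical friendship graph: "me" sits in two otherwise separate circles.
edges = [
    ("me", "ann"), ("me", "bob"), ("ann", "bob"),  # work circle
    ("me", "cat"), ("me", "dan"), ("cat", "dan"),  # family circle
]
communities = k3_clique_communities(edges)
```

In this toy graph the two circles come out as separate communities and "me" belongs to both, which is exactly the overlap that a strict partition of the graph cannot express.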

April 23, 2008

Marc Smith et al Weblog

Marc Smith, Tom Lento, Eric Gleave, Itai Himelboim and Ted Welser are blogging at Connected Action. For internet sociologists' PoV of the social media space (and other stuff) subscribe here.


April 22, 2008

Freebase and Data Visualization

This is a nice post [via Cool Infographics] which uses Freebase as a data source to create an animation charting the growth of Wal-Mart over time. Toby says:

Freebase has a topic for every zip code, along with it’s longitude and latitude. Here’s one example. One query pulls out all the ZIP codes along with their longitudes and latitudes. You can turn longitudes and latitudes into graphical coordinates with some simple transformations (which will vary based on the region you’re plotting and how big your image is) — here are the ones I used:

x=(longitude+127)*16
y=(50-latitude)*20

If you plot all the ZIP codes using a library like PIL, you get a nice map with dots that roughly match population density, which has the advantage of looking a little bit like a night-time satellite photo of the United States.
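As a rough sketch of the transformation Toby quotes, the two formulas can be wrapped in a small Python function. The function names, the image size and the sample coordinates are my own assumptions for illustration (they are not from Toby's post or from Freebase), and the pixels are collected in a plain set rather than drawn with PIL so the snippet has no dependencies.

```python
def to_pixel(longitude, latitude):
    """Map a (longitude, latitude) pair to image coordinates
    using the transform quoted above."""
    x = (longitude + 127) * 16
    y = (50 - latitude) * 20
    return round(x), round(y)

def plot_points(points, width=1000, height=520):
    """Return the set of in-bounds pixel positions for (lon, lat) points.

    The default width and height are assumed values, chosen only so that
    the continental United States roughly fits with this transform.
    """
    pixels = set()
    for lon, lat in points:
        x, y = to_pixel(lon, lat)
        if 0 <= x < width and 0 <= y < height:
            pixels.add((x, y))
    return pixels

# Hypothetical sample coordinates (approximate city locations),
# standing in for the ZIP code data pulled from Freebase.
sample = [(-122.3, 47.6), (-74.0, 40.7), (-87.6, 41.9)]
pixels = plot_points(sample)
```

Each pixel in the resulting set could then be drawn as a dot with an imaging library such as PIL; note that y grows downward, which is why latitude is subtracted from 50.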

Freebase also contains a list of Wal-mart locations, along with their addresses and the year that they opened. Here’s an example. One query pulls all of these out of Freebase.

April 21, 2008

Write Only Blogosphere

While it's great to be here in Beijing at WWW, I'm frustrated to see that Typepad weblogs are not visible behind the Great Firewall. This weblog and some other Typepad hosted weblogs that I've hit (including Typepad's own weblog) simply don't load. I can read the content via a feed reader (Bloglines). Even if you've mapped a domain to your Typepad hosted blog it doesn't appear to be visible.

Interestingly, I can write to this blog, though I can't upload images (otherwise you'd see what Tiananmen square looks like on a slightly overcast day).

Note - this could be something to do with the ISP, though I've tried to see Typepad content through 2 different systems with identical (lack of) results.

Here Comes Identity

I blogged recently about Google News' new feature which provides quotes from an individual in the news (e.g. a list of quotes made by Barack Obama). I just did a search for my name on Google and found that the top result was a site summary for this weblog.

Not much new here, but we will see more and more examples like this in which individuals, via various data types found online, become first-order data types in search results (as opposed to HTML documents).

April 18, 2008

He Said, She Said

Matt Cutts points to a neat new feature in Google News search which extracts quotes by individuals and displays them at the top of the result set. You can click through to more quotes by the same person. Regardless of what you might think of the value of this, it does expose some key capabilities on the linguistic side.

  • Disambiguation: a search for Hillary Clinton produces quotes like the following, which would require the system to resolve 'Clinton' to 'Hillary Clinton' : "When it comes to finishing the fight, Rocky and I have a lot in common. I never quit," Clinton said recently.
  • Pronoun resolution: the same search produces quotes qualified by 'she': She said last week that she knows, "what it means to get knocked down, but I've never stayed down."

I'm guessing that the product has been tuned highly for precision (that is, after all, what web search companies are all about). Thus, a search for just 'clinton' on the front end only presents results for 'Hillary Rodham Clinton', and a search for 'Bill Clinton' produces no quote results. My guess is that there is some general technology underneath this, but there is a strong editorial layer designed to ensure that all the results are of high quality at the expense of recall. This is not surprising and quite reasonable.

It'd be interesting to know who is on the list of people that get passed through. I see Gordon Brown, but not Tony Blair. No sign of the Dalai Lama saying anything quotable even though the top news search result has this very quotable passage:

"From the very beginning I have supported the Olympics," said the Dalai Lama. "We must support China's desires. Even after this sad situation in Tibet, today I support the Olympics." Still, he said he fully understands why people would express frustration and protest.
