It's interesting to see an article on BlogPulse's top news stories for yesterday about Intelliseek/BuzzMetrics (and by implication, BlogPulse). It's also interesting to hear that BuzzMetrics is referred to as a 'giant' in this space:
To capture the chatter, Nielsen BuzzMetrics, a giant in the industry,
uses software that collects hundreds of thousands of comments a day.
The technology can scan for specific companies, products, brands,
people -- anything searchable. It can slice data into a range of
categories to quantify the number of times a subject was discussed
online, the individuals who mentioned it and the communities where it
Hundreds of thousands is probably an underestimate. Also, 'anything searchable' undervalues our technology: the expectations of search are far weaker than what true text mining, NLP and categorization technology can do. Anyway, nice article.
I posted recently about the Murdoch/Vivisimo story reported in the Pittsburgh Post-Gazette. This post on a web-based message board (Yahoo! Message Boards: LOOK) generated most of the traffic to that post. As we think more and more about the conversational structure of the blogosphere, rather than thinking about it as a walled garden, we need to think more and more about how it integrates with other spaces on the web.
I've been saying for a long time that those interested in analysis over the blog space are going to be knocked out when they see the potential of other similar spaces, most notably the message board space. They are going to see topics of conversation which are more richly represented in other spaces, they are going to see different (and stronger) community structures, and then they are going to have fun integrating analytics from both.
Here are a couple of examples. The first trends 'murdoch', the second trends 'xbox'.
Notice that while there are echoed trends (the large peak to the left), there are also trends which are unique to each source. For example, Usenet appears to show an upward trend towards the end, whereas Boards and Blogs show a steady flat trend.
Here we can see that a higher percentage of Boards are focused on the XBox. In addition, there is an initial peak - an anticipatory burst - which precedes the main launch peak. Notice too the leap in Usenet posts.
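A trend line of this kind boils down to one question: for each source and each day, what share of posts mention the term? Here is a minimal sketch, assuming posts arrive as (source, date, text) tuples. The function name, the tuple layout and the sample data are all invented for illustration, and this says nothing about how BlogPulse actually computes its graphs.

```python
from collections import defaultdict


def trend_by_source(posts, term):
    """Return {source: {date: percent of that day's posts mentioning term}}."""
    totals = defaultdict(lambda: defaultdict(int))
    hits = defaultdict(lambda: defaultdict(int))
    needle = term.lower()
    for source, date, text in posts:
        totals[source][date] += 1
        # Plain substring match: crude, but enough to show the idea.
        if needle in text.lower():
            hits[source][date] += 1
    return {
        source: {
            date: 100.0 * hits[source][date] / count
            for date, count in sorted(days.items())
        }
        for source, days in totals.items()
    }


# Invented sample data, for illustration only.
SAMPLE_POSTS = [
    ("blogs", "day-1", "Queueing for the Xbox launch tonight"),
    ("blogs", "day-1", "Some notes on text mining"),
    ("boards", "day-1", "xbox bundle prices"),
    ("boards", "day-1", "Xbox launch titles thread"),
    ("usenet", "day-1", "A thread about compilers"),
]

if __name__ == "__main__":
    for source, days in trend_by_source(SAMPLE_POSTS, "xbox").items():
        print(source, days)
```

Normalising by each source's own daily volume is what makes Blogs, Boards and Usenet comparable on one chart even though their raw post counts differ widely.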
To give full credit to everyone involved in the conversation, visualizations like those posted on Anjo's blog (just been updated, I see) need to include links at least to external (non-blog) pages. The fact that a message board concerning LookSmart is discussing a story I'm interested in (Vivisimo) is certainly a thread I'd like to follow.
Yesterday, Corante launched its first series of hubs: marketing, media and web. The hubs bring together expert bloggers in the various fields, aggregate their feeds and top it off with an editorial blogger-in-chief. I'm very excited to be involved in this new endeavour, and - I have to say - somewhat flattered to be counted in such good company.
If you've not already been there, take a look at Xooglers - a new blog written by ex-Googlers recounting life at the search company. There are now two contributors, Doug Edwards (the original Xoogler) and Ron Garret (who also writes at Rondam Ramblings).
What I find interesting about this blog is both the no-holds-barred expose with detailed descriptions of events and opinions (certainly well beyond anything a current employee could get away with) and the occasional bouts of, well, admiration:
First, I accept that Larry and Sergey really are brilliant. I'm sure
that on IQ tests, they're off the charts, but that's not the kind of
brilliance I mean. I mean brilliant in the sense that they have a
vision that burns so brightly within them it scorches everything that
stands in its way. The truth is so obvious to them that they have no
patience for the niceties of polite society when bringing that vision
Sphere will soon be providing RSS feeds for their blog search. This is a must-have feature, of course, but I'm starting to get the feeling that Sphere is going down quite a different path from the one I originally imagined. Mary Hodder, one of the reasons I suspected Sphere was going to deliver something disruptive, has voiced some very interesting points:
I believe that blog search, at this point, is a baseline for any
company in the space. You have to do it. But it's not so interesting to
me, compared to making a leap forward toward something like topic
browsing of communities or sophisticated weighting of bloggers. I'm
less interested in 'yet another blog search' tool. The ones we had
already were fine. But I've very interested in what Sphere is really
here for: changing the ways we can view small topic communities and the
bloggers within them in sophisticated ways that take us ahead of where
we are now, which I equate to the place websearch was in in 1997,
I get the impression that Mary is a little disappointed with Sphere. Perhaps Mary should take TailRank for a spin...
Steve Rubel posted about the Audible file format issue. In his analysis, he posted a trend from IceRocket's Blogs+RSS search system:
So Audible largely won in the eyes of the media - the sole basis for how we used to define PR.
Of course, we no longer live and die by traditional media alone. In the blogosphere it was an entirely different story, as this IceRocket trend graph shows.
I can't quite figure out what Steve means here: the trend shows what? That Audible didn't win in the eyes of the blogosphere? How does this graph show that? Another point of interest is the difference between the IceRocket trend and the BlogPulse trend (note that BlogPulse only shows hits for Blogs, not arbitrary RSS, so there are going to be some differences.)
I see two major differences here. The first is that IceRocket shows two large spikes - one in late August, the other in mid October - while BlogPulse shows a single spike in early September. The second is the two smaller spikes in the BlogPulse trend (note that IceRocket has a couple of zero data points around the same time). BlogPulse allows you to drill down on specific dates. IceRocket doesn't have this, and as their custom date range query from the advanced search page is broken (it doesn't appear to work in either Firefox or IE), I can't yet determine what the spikes in the IceRocket trend actually are.
The large spike found in BlogPulse is accounted for by quotes of a post by Brian Williams. The other two spikes appear to be associated with the Audible Inc. story - as this trend graph shows:
Here is the same trend on IceRocket:
I get a big kick out of trend graphs. However, it takes some care to work with them in the blogosphere. For example, not every mention of the word 'audible' is a mention of Audible Inc. - in fact, it is probably the reverse. In addition, depending on what you trend over (blogs in the case of BlogPulse, RSS in the case of IceRocket) you see quite different results. If you are going to make any detailed use of trend graphs, you need the ability to inspect interesting features (either by drilling down on the graph or by date-specific queries).
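To make the 'audible' problem concrete, here is a rough sketch of a date drill-down that separates likely company mentions from ordinary uses of the word. The context-word heuristic, the word list, the function names and the sample posts are all assumptions made up for illustration; neither BlogPulse nor IceRocket is known to work this way.

```python
import re

# Hypothetical context words hinting that 'audible' means the company.
COMPANY_CONTEXT = {"inc", "audiobook", "audiobooks", "podcast", "format"}


def tokens(text):
    """Lower-cased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def mentions_company(text, name="audible", context=COMPANY_CONTEXT):
    """True if the name co-occurs with at least one context word."""
    words = tokens(text)
    return name in words and bool(words & context)


def drill_down(posts, date, name="audible"):
    """Split one day's posts containing the name into (company, other)."""
    company, other = [], []
    for post_date, text in posts:
        if post_date != date or name not in tokens(text):
            continue
        (company if mentions_company(text, name) else other).append(text)
    return company, other


# Invented sample posts, for illustration only.
SAMPLE_POSTS = [
    ("day-1", "The alarm was barely audible over the storm"),
    ("day-1", "Audible Inc. announces a new podcast file format"),
    ("day-2", "An audible gasp from the crowd"),
]

if __name__ == "__main__":
    company, other = drill_down(SAMPLE_POSTS, "day-1")
    print("company:", company)
    print("other:", other)
```

Even a crude filter like this shows why a spike needs inspecting before it is attributed to a company: both posts on the first day would count towards a raw keyword trend, but only one is about Audible Inc.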
The conventional wisdom is to keep track of the most visited (popular) posts on your blog and highlight them in some way. The trend graph from my original post regarding the rise and rise of PostSecret brings lots of traffic as it is on the first page of search results for Google's image search.
So here is an update. Unlike the previous graph, we now see PostSecret's trend declining with no additional bursts.
BlogWatcher is a Japanese blog search and analysis tool. It comes from an academic research group and is, therefore, less constrained by traditional consumer-facing UI issues. The result is an interface that throws up lots of interesting data:
The top graph shows message volume as a histogram and burst as a red line. The bottom graph shows counts of positive and negative messages including the topic searched for. The graphics are interactive. Note that the data spans 6 years. You can drill down into the graph and get more constrained views of any window.
Search results also provide details of positive and negative language within actual posts as well as classifying blogs and online diaries (which have traditionally been a separate class of content in Japan - predating blogs).
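I don't know how BlogWatcher computes its burst line, but the general idea of laying a burst score over a volume histogram can be sketched with a simple stand-in: compare each day's count with the average of the preceding few days. The window size, the threshold, the smoothing constant and the sample counts below are arbitrary assumptions, not BlogWatcher's method.

```python
def burst_scores(counts, window=3):
    """Score each day as its count relative to the mean of the previous days.

    Days without a full window of history score 0.0.
    """
    scores = []
    for i, count in enumerate(counts):
        history = counts[max(0, i - window):i]
        if len(history) < window:
            scores.append(0.0)
            continue
        baseline = sum(history) / len(history)
        # Add 1 to the baseline so quiet periods do not divide by zero.
        scores.append(count / (baseline + 1.0))
    return scores


def burst_days(counts, window=3, threshold=2.0):
    """Indices of days whose burst score reaches the threshold."""
    scores = burst_scores(counts, window)
    return [i for i, score in enumerate(scores) if score >= threshold]


if __name__ == "__main__":
    daily_counts = [4, 5, 4, 20, 6, 5]  # invented message volumes
    print(burst_scores(daily_counts))
    print(burst_days(daily_counts))
```

Plotting `burst_scores` as a line over the raw counts as bars gives roughly the layout described above. A real system would more likely use a principled burst model than this moving-average ratio.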
I want to write a review of MeasureMap, but the more I think about what weblog visitor monitoring systems should provide, the more I get distracted. For example, looking at the stats provided by TypePad, which, I might add, are far too narrow a window into what is going on, I saw a link to https://www.webstats4u.com/s?tab=1&link=4&id=3747494. Following this link, I realised that this was probably Scott Nowson tracking down who was coming to visit his page by looking at his web stats. This in turn made me think: hey, I should really be looking at his stats as well in order to find out other interesting stuff. In so doing, I found a link to Creepy Lesbo who had posted about Scott as she is mentioned in his thesis (I mean, who isn't going to follow a link with that name in the context of a PhD thesis on blogging?) This click-fest leads to two observations:
Firstly, there is a lot of data out there about visitors that is unprotected.
Secondly, wouldn't it be great if someone could provide a service that mined all this stuff - which is a sort of back-end data set to the front-end blogosphere - to provide insights about who is visiting your blog, who their visitors are and what the community of readers is (not just the community of blog writers which is what people mostly write about).
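As a very rough sketch of what such a service might do with that back-end data, the snippet below groups referrer URLs by blog and looks for hosts that send visitors to more than one blog. The record layout, the function names and the example.com addresses are all hypothetical, and no real stats service's export format is assumed.

```python
from collections import defaultdict
from urllib.parse import urlparse


def referrer_hosts(visits):
    """Map each blog to the set of hosts that sent it visitors."""
    hosts = defaultdict(set)
    for blog, referrer_url in visits:
        host = urlparse(referrer_url).netloc.lower()
        if host:  # skip visits with no usable referrer
            hosts[blog].add(host)
    return dict(hosts)


def shared_referrers(hosts, blog_a, blog_b):
    """Hosts that refer visitors to both blogs: a crude shared readership."""
    return hosts.get(blog_a, set()) & hosts.get(blog_b, set())


# Invented visit records, for illustration only.
SAMPLE_VISITS = [
    ("blog-a", "http://boards.example.com/thread/1"),
    ("blog-a", "http://reader.example.org/feeds"),
    ("blog-b", "http://boards.example.com/thread/9"),
    ("blog-b", ""),
]

if __name__ == "__main__":
    hosts = referrer_hosts(SAMPLE_VISITS)
    print(shared_referrers(hosts, "blog-a", "blog-b"))
```

Overlap in referring hosts is only a proxy for a shared community of readers, but it is the sort of signal that public stats pages already expose.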