I think of the blog space as comprising three elements:
- content creators
- technology applications enabled by that content
- service applications enabled by the technology
I use the term "elements" deliberately, to avoid implying that any group or person owns one of these themes; the expertise of many groups and people overlaps.
I believe content creators generate two kinds of content: object content (not concerning blogs and social software) and meta content (concerning blog theory, blog punditification, predictions and manipulation).
The technology applications enabled by the content (in other words, the things you can do with large bodies of text and network structure) include:
- search (which requires crawling/spidering, XML parsing, document analysis, indexing, query serving, etc.)
- natural language processing (NLP) or computational linguistics (CL), which includes data-driven methods to determine topics (phrase finding), sentiment analysis to identify favourable and unfavourable author statements, topic classification and discovery, semantic interpretation, etc.
- machine learning and other approaches to classification and segmentation, for example, determining the demographic breakdown of the authors who created a particular data set, based on observations made about previously seen data and the current data.
- data mining methods and data visualization, for example, the use of time series (a visualization) to display trends, or the mining of terms or topics within a data set that have a specific temporal pattern.
- social analysis via explicit and implicit relationships.
The service applications enabled by this technology include:
- marketing intelligence
- consumer engagement
- political analysis
- demographic analysis
Each of these areas has its experts. Some of these experts have deep knowledge about a number of areas. The fascinating thing about the blog space is that these areas have come together so rapidly and in such a tightly intertwined manner that the applicability of expertise has become very broad.
In other words, the products that are being built, the manner in which they are marketed, and the strategies that define the short and medium term for the companies involved are all informed by (or at least should be informed by) the three areas outlined above.
If we think about the recent discussion on top blog rank lists and their pros and cons, we can see how:
- the voice of the content creator must be heard, as they control the data that is necessary for the ranking;
- the voice of the technologist must be heard, as they define what is possible, what the time window is, and how accurate the results are;
- the voice of the application service owner must be heard, as they define the role of the list, either in a consumer-facing situation or in an enterprise situation.
I believe that this situation is unique for all involved (content, technology, services). That is partly due to the velocity at which things are happening, but also due to the fact that the syntax of the content is very constrained, which makes it an incredibly rich area to explore algorithmically.