Sifry reports that spam is at 5% of post volume; BlogPulse (credit to Natalie, Robert, via Pete's ClickZ column) reports 30%. The splog issue is getting more and more visible. I'd like to know why these numbers are so different. One way to find out would be to report the percentage of spam per host - at least for BlogSpot, LiveJournal, Xanga, Typepad, Spaces and AOL. These stats, together with a breakdown of how much data comes from each source, would give everyone an idea of where the problems are, as well as throwing some light on what is going to be an index-size claims race amongst blog search engines.
Google has already been outed in many places as a source of spam. I suspect that some of the hosts that contribute major volumes of blog data have relatively low spam percentages (I'm thinking of LiveJournal and Spaces).
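To make the arithmetic concrete, here is a minimal sketch of how per-host spam rates combine with each index's host mix into a single headline figure. Every number in it is invented for illustration, and the names `SPAM_RATE`, `overall_spam_rate`, `index_a` and `index_b` are hypothetical; none of this is data from Sifry or BlogPulse. It only shows that two indexes seeing identical per-host spam rates could, in principle, report very different overall percentages.

```python
# Hypothetical per-host spam rates (fraction of posts that are spam).
# These values are made up for illustration; they are not measurements.
SPAM_RATE = {
    "BlogSpot": 0.40,
    "LiveJournal": 0.02,
    "Xanga": 0.05,
    "Typepad": 0.05,
    "Spaces": 0.02,
    "AOL": 0.05,
}


def overall_spam_rate(posts_per_host, spam_rate=SPAM_RATE):
    """Return the volume-weighted spam fraction for one index.

    posts_per_host maps host name -> number of posts indexed from it.
    """
    total = sum(posts_per_host.values())
    if total == 0:
        raise ValueError("index contains no posts")
    spam = sum(count * spam_rate[host] for host, count in posts_per_host.items())
    return spam / total


# Two hypothetical indexes of the same size but with different host mixes.
index_a = {"BlogSpot": 100, "LiveJournal": 500, "Xanga": 100,
           "Typepad": 100, "Spaces": 150, "AOL": 50}
index_b = {"BlogSpot": 700, "LiveJournal": 100, "Xanga": 50,
           "Typepad": 50, "Spaces": 50, "AOL": 50}

if __name__ == "__main__":
    for name, index in (("A", index_a), ("B", index_b)):
        print(f"index {name}: {overall_spam_rate(index):.1%} spam")
```

With these made-up inputs the two indexes land far apart even though the per-host rates are the same, which is why publishing both the per-host percentages and the per-host volumes would be more informative than a single number.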
One can think of spam as a complex ecology involving:
- products
- people wanting to make a quick buck
- people who actually buy products in part due to spam
- search engines
- web site genres
The last element - types of web site - turns out to be key. We have, it appears, become very sensitive to blog spam in part because of the way search results are presented in blog space: along the time axis. Matt Siegler - a colleague of mine, and master in the art of message board discovery - has turned over to me a list of spam message boards. The use of boards for spam has a somewhat different ecosystem dynamic, as boards don't appear in search engines that use time as a major ranking axis. That is to say, they are hidden to some extent in the static rankings of major search engines like Google.
We are sensitive to blog spam because it appears in our search results; we are not sensitive to spam of other sorts because we don't search those genres of web site in a time-dependent manner. However, the effect in terms of boosting the rank of pages linked to by these sites is approximately the same.
Perhaps when we have more board-specific search engines, like BoardTracker, we will start to throw up our arms about this longer-lived but well-hidden pocket of spam. First dibs on 'sboards'!


You wrote:
"One can think of spam as a complex ecology"
I agree with your list, but you forgot the most important, most influential group, the one that makes the most money in the spamming industry and stands to benefit the most from any increase in spam: the spam protection industry.
They need spam just like Symantec needs malware. They are the ones with the power to prevent any serious antispam legislation.
It won't be long before commercial blog antispam software becomes as common as email antispam software, and when that happens, there will be a strong corporate interest in maintaining a high level of blog spam.
Spam blurs the lines between the supposedly "good guys" and the bad ones ...
Posted by: a guy | February 11, 2007 at 04:08 PM