Google: Enabling & Profiting from Information Pollution

The most recent blog meme is that Google's Blogger is a mass spam system.

David Sifry says he thinks that between 2% and 8% of blogs are spam, but I think that, just like people, his systems are not good at detecting much of the spam.

Google maintains that they care:

When spam goes up, it directly affects the quality of those results. I'm exceedingly sympathetic with these folks because, well, we run one of those services ourselves.

But do they really care?

Think of blog search as a form of vertical search. If spam makes blog search less useful, and filtering through the spam

  • kills profit margins

  • slows blog search innovation

then more people will opt to use general search.

While Jeff Jarvis thinks Google should share its tricks for not indexing blog spam, I don't see why they would want to. Since Google has not put much effort into making their blog search anywhere near as good as their regular search, I don't think they mind if nearly all blog search engines are full of spam.

Blog search full of spam = user may as well use general search = $ for Google. And, on another front, that helps Google ensure blog search sucks really bad until they create the solution, and then they get credit for doing right what their competitors could not :)

Just out of curiosity, how hard would it be to attenuate trust, only trusting new blogs if they were co-cited by multiple trusted sites? There has to be an algorithmic way to do it. If you were worried about new sites being locked out, then you could offer multiple search options:

  • the filtered trusted version

  • the unfiltered version

  • perhaps people could even enter their own trusted friends, levels of trust, minimum trusted citations, or make trust a slidable scale & use AJAX to reorder the results as the trust score is adjusted
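
To make the idea a bit more concrete, here is a rough sketch in Python. Everything in it is hypothetical and made up for illustration: the function names, the decay factor, the number of rounds, and the example sites. It is not how Google or any blog search engine actually works. It assumes you already have a list of who links to whom and a hand-picked set of trusted seed sites. A site only earns trust if enough already-trusted sites cite it, and the trust it earns is weaker than the trust of the sites vouching for it.

```python
def propagate_trust(links, seeds, decay=0.5, min_citations=2, rounds=3):
    """Attenuated trust from co-citation (illustrative sketch only).

    links: iterable of (source, target) citation pairs.
    seeds: set of sites trusted outright (trust 1.0), e.g. a user's
           own list of trusted friends.
    decay, min_citations, rounds: assumed tuning knobs, not real values.
    """
    # Who cites each site? Self-links are ignored.
    inbound = {}
    for src, dst in links:
        if src != dst:
            inbound.setdefault(dst, set()).add(src)

    trust = {site: 1.0 for site in seeds}
    for _ in range(rounds):
        updated = dict(trust)
        for site, citers in inbound.items():
            if site in seeds:
                continue
            # Trust scores of the citers that are already trusted, best first.
            scores = sorted((trust[c] for c in citers if c in trust), reverse=True)
            if len(scores) >= min_citations:
                # Limited by the weakest of the required citers, then attenuated.
                earned = decay * scores[min_citations - 1]
                updated[site] = max(updated.get(site, 0.0), earned)
        trust = updated
    return trust


def filter_results(results, trust, threshold):
    """Keep results at or above the trust threshold, most trusted first.

    threshold=0.0 gives the unfiltered version; raising it plays the
    role of the slidable trust scale.
    """
    kept = [r for r in results if trust.get(r, 0.0) >= threshold]
    return sorted(kept, key=lambda r: trust.get(r, 0.0), reverse=True)


# Hypothetical example: "a" and "b" are trusted seeds.
links = [
    ("a", "new1"), ("b", "new1"),            # new1 is co-cited by two seeds
    ("a", "spam1"),                          # spam1 has only one trusted citation
    ("spam1", "spam2"), ("spam2", "spam1"),  # spam sites vouching for each other
    ("new1", "new2"), ("a", "new2"),         # new2 is cited by a seed and by new1
]
trust = propagate_trust(links, seeds={"a", "b"})
results = ["spam1", "new2", "new1", "a"]
filtered = filter_results(results, trust, threshold=0.3)
unfiltered = filter_results(results, trust, threshold=0.0)
```

In this toy example the two spam sites never pick up any trust, because linking to each other does not count and only one trusted site cites either of them. Sliding the threshold down to zero brings them back into the results, just ranked last.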

On top of owning general search, Google also wants to be the first port of call for vertical search. Just look at their recent desires on the real estate front.

By monetizing spam production with AdSense and making publishing free and easy, Google pollutes competing information systems for its own profit.

The same thing that is going on in the vertical search scene is also going on in general search. Google has an algorithmic probationary period for most new sites. The same sites tend to rank in MSN Search & Yahoo! Search more quickly and easily.

By paying search spammers via AdSense, Google is funding the information pollution that undermines the usefulness of competing search products. As I have stated in the past, Google generally does not give a shit if AdSense is on spam sites or sites that make money stealing others' copyrighted work.

Now what happens if Google ends up indexing AdSense spam sites? Well, then it is suddenly a real issue, and they pull out the "we care" card. Matt Cutts recommends you report it to Google, but the hidden message there is that Google cares only when the spam ranks in Google.

Meanwhile, all the A-list bloggers are asking Google to fix the problem while failing to realize the profits this problem brings Google.

Maybe a large part of being the company that organizes the world's information is encouraging entrepreneurs to stuff garbage in rich competitors' databases.

Published: October 19, 2005 by Aaron Wall in google


October 20, 2005 - 5:46pm

Well it appears as if the mass blog spam this weekend was a test phase.

Today I received an announcement from a vendor whose software I have bought previously, announcing their program for automatically generating Blogger-based blogs.

I don't want to reveal the URL here as I don't want to encourage people to do this.... But if Aaron wants it I'll be glad to reveal it privately.

- Ben Fitts

