The backfill content business model has had a great run over the past 5 years, but with today's announcement of Yahoo! acquiring Associated Content, it certainly feels like it is getting toward the beginning of the end for that model for most folks.
Demand Media has grown eHow aggressively & struck partnerships with the likes of USA Today, and has recently been in the news for planning a roughly $1.5 billion IPO. If you look at Richard Rosenblatt's past sales you will see that he is quite good at selling right at the top.
Former Googler Tim Armstrong rebuilt Aol around their internal SEED platform which targets content at longtail arbitrage opportunities & leverages their premium Google ad feed.
Associated Content struck deals with companies like Thomson Reuters, Cox Newspapers, Hachette Filipacchi and USA Today. And they just landed a $90 million payday in the sale to Yahoo!
Yahoo! still has north of 10% search marketshare and can probe new & trending content ideas in real-time, while also using their huge distribution to market the new features. The fast data and instant distribution likely double the value of the business model for them. Take average content, tie it to a trusted brand, and immediately give it huge distribution and you have a winning formula. Assuming Yahoo! does a good job of integration this is probably one of their better acquisitions.
About a year ago a friend told me he bought some Yahoo! stock and I told him I thought he was nuts, but if I see signs of decent integration of this content then I think they just increased the longevity of their company, probably by a decade or more. And the part of this model which works great is that they view this content not as a replacement for their premium content, but as a backfill for the keywords they would like to target which don't have enough demand to pay for premium content creation. Some of the smarter independent webmasters have long understood that part of publishing profitably online means having featured content which loses money but builds awareness, and a second bucket of content which leverages that reputation to profit. That understanding is where the term "linkbait" came from, but now the big companies are playing the same game.
Here is a list of Aol properties, and as soon as they show strong profit growth you can bet they will use their stock to purchase more sites
You could put up similar network maps for the likes of Expedia, BankRate, Yahoo! subdomains, Monster.com, etc. etc. etc.
If Google continues to keep the algorithm fairly similar over the next couple years (ie: overall domain authority = relevancy panacea) it is pretty obvious what is going to happen to a lot of online categories. They will get watered down search by search as these publishing companies reinvest profits into creating a second, third, or fifth site in profitable categories.
If many people are using the same approach that will often create opportunities for other approaches. The good news for the average webmaster is that as the bland one size fits all approach (based on domain authority) gains momentum, it will likely force Google to adjust. And it will make people become more loyal to great sites when they find them. As such general purpose sites grow I almost think it adds value to sites which look a bit unpolished and look like they were created by an amateur hobbyist. Thoughts? What say you?
A common practice in the marketing space is for people to diminish what you do, state that it is below them, help rebrand your stuff in a negative light, and then at some point in the future basically clone the idea (maybe with a few new features, maybe not) and then push their clone job aggressively as though it is revolutionary.
Another shady practice is when you ask people for advice and they say "no don't do that" and then as soon as they hang up the phone they send off emails to their workers telling them to do that which they told you was a bad idea.
I don't think that the average person or the average marketer is inherently sleazy. But I think when you look at the people who are the most successful certainly a larger than average percent of them engaged in shady behavior at some point.
To keep building yield and returns at some point short cuts start to look appealing. And so you get
None of the above is a cynical take or an opinion at this point. That was simply a list of 3 stated facts.
Create a large enough organization with enough people and you can always make something shady seem like it was due to the efforts of a rogue individual, rather than as company policy. A key to doing this effectively within a large organization is to publish public thoughts that are the exact opposite of your internal business practices.
Recently the Google public policy blog published a post titled Celebrating Copyright. Around the same time Viacom leaked the following internal Google document
You can't get any clearer than that!
In the past when I claimed Google operated as per the above I was accused of being cynical or having sour grapes. But when you tie together a lot of experiences and observations others lack and you are not conflicted by corporate business interests you have the ability to speak truth. You are not always going to be right, but the lack of needing to cater to advertiser interests and filter means you will typically catch a lot of the emerging trends before they show up in the media - whatever that is worth.
If you're ever confused as to the value of newspaper editors, look at the blog world. That's all you need to see. - Eric Schmidt
Speaking of the media, have you heard about the Middle American Information Bureau
Buying links is considered spammy by Google because it is a ranking short cut which subverts search relevancy algorithms.
And so Google considers it a black hat SEO practice.
Links are somewhat hard to scale because (outside of those who create a network of spam) it is time intensive to find the right sites, negotiate a price, and then ensure appropriate placement. It requires interacting with many webmasters & going through a lot of rejections to get a few yes responses. Due to scale limitations, paid links typically only exert a slight influence on core industry keywords and common variations, limiting any potential relevancy damage.
Further, when a person buys a link, the relevancy is almost always guaranteed (as one would go broke fast if they rented links targeting irrelevant keywords).
Even still, Google hates paid links because they can lower result diversity & bias the organic search results away from being informational and towards being commercial (which in turn means that Google AdWords ads get fewer clicks).
Policing Paid Links
To make link building efforts easier to police, Google created nofollow, which aimed to disrupt the flow of link equity across certain links. Initially the alleged purpose was blocking comment spam. And then after it was in place, comment spam never went away, but the role of rel=nofollow quickly expanded to be a cure-all to be placed on any paid link.
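For readers unfamiliar with the mechanics, nofollow is just a value placed in a link's rel attribute. Below is a minimal Python sketch, using only the standard library, that sorts the links in a snippet of HTML into those with and without rel=nofollow. The sample markup and URLs are made up for illustration, and the sketch says nothing about how any search engine actually weighs either kind of link.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect link hrefs, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return  # anchors without an href are not links
        # rel is a space-separated token list, e.g. "external nofollow"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html):
    """Return (followed, nofollowed) lists of hrefs found in the HTML."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollowed


# Hypothetical markup: an editorial link, a paid link, and a comment link.
SAMPLE_HTML = """
<p>An editorial mention: <a href="http://example.com/guide">a useful guide</a></p>
<p>A sponsored mention: <a href="http://example.org/offer" rel="nofollow">an offer</a></p>
<p>A blog comment: <a href="http://example.net/" rel="external nofollow">commenter</a></p>
"""

followed, nofollowed = audit_links(SAMPLE_HTML)
print("followed:", followed)
print("nofollowed:", nofollowed)
```

The point of the sketch is how little is involved on the publisher's side: one attribute token decides which bucket a link lands in, which is why the tag was so easy to extend from comment spam to paid links.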
Google encouraged spam reports that highlight paid links. SEO blogs highlighted people that were buying links. Firms like Text Link Ads were eradicated from the Google index. And all was well in GoogleLand.
The Rise of Content Farms
Over the past few years people realized that Google had dialed up the weight on domain authority & that links are now much harder to get. So companies started placing lead generation forms on trusted sites & firms like Demand Media purchased highly trusted websites like eHow (which already had a ton of links in place from back when links were easier to obtain).
This type of strategy attacks the longtail of search, and given how many unique search queries there are each day, that amounts to a lot of opportunity!
Corporate Content Farming: The Art of Informationless Information
Anyone who has watched The Meatrix is likely afraid of factory farms. The content created by these content farms isn't much better. When I highlighted how bad one of the pieces was their solution was to delete it and hide it from the site, then write a memo about how they do "quality" content at scale.
eHow is a content publisher known for “How To..” articles. Lately, it seems eHow visits other websites, scrapes their instructional content (on whatever topic), and republishes it as a How To article on eHow. Sometimes the entire step-by-step process is “copied” for the eHow article. I’ve noticed a few times this week, how eHow articles are basically copies of existing content from other sites, worse than Wikipedia rewrites. That’s pretty much “scraping”, even if done by poorly-paid human workers.
We are no longer in an “Information Age.” We are in the Age of Noise. Falsehoods, half-truths, talking points, out-of-context video edits, plagiarism, rewriting of history (U.S. was founded as a Christian nation, for example), flip-flops, ignoring facts (Cheney and torture for example), neatly packaged code words and phrases, media ratings focus, dysfunctional government (filibusters have more than doubled, but most don’t realize Republicans are blocking everything), mainstreaming fringe causes….I could go on and on.
Is it any wonder why so many who are struggling with kids, jobs, rising medical costs, etcetera have such a tough time wading through all the crap? - source
Paranoid About Links
As building up your own profile has grown harder (since links are harder to get) many new web 2.0 websites provide free outbound links to help encourage participation and get links back into their websites. But then after they reach a critical mass they claim that spam is an issue and strip away the links by using nofollow, stealing that hard work people did to build up the network, offering nothing in return for it!
If Google is the one who wants that web link nofollowed because some twitter profile pages may be automated bots or spammers, then it is time they realize that THEY are responsible for determining which of those individual pages is authoritative, trusted and legitimate enough to pass link popularity, by a method other than demanding that other websites and social networks change the ways they do business to help Google stop links being used as a form of currency and to manipulate their algorithm – an issue Google and Google alone created and profited from.
Any Form of Payment = Not Trustworthy
A few years back a well known SEO joined our training program, read our tip about using self-hosted affiliate programs as a link building tool, and then promptly outed us directly to Matt Cutts, in a video, and on their blog. Google quickly blocked our affiliate program from passing link juice. Later a Google engineer publicly stated affiliate links should count.
Since then affiliate links have been a gray area (it works for some companies and doesn't work for others, based on 100% arbitrary choices inside Google). Looking for clarification on the issue, Eric Enge recently asked Matt Cutts: "If Googlebot sees an affiliate link out there, does it treat that link as an endorsement or an ad?"
Matt Cutts responded with: "Typically, we want to handle those sorts of links appropriately. A lot of the time, that means that the link is essentially driving people for money, so we usually would not count those as an endorsement."
So links which are driven by payment should not count as endorsements, even if the affiliate does endorse & believe in the product. The fact that there is a monetary relationship there means the link *should not count*
The Elephant in the Room at the GooglePlex
Ignoring links for a moment, let's get back to the content mill business model. It was fine that Demand Media bought trusted (well linked) sites like eHow for their trust to pour low-end content into, even though those pre-existing links were bought by the new owner.
And here is where the content mill business model gets really shady, in terms of "what is good for the user" ... Demand Media is now licensing backfill content to be hosted on USAToday.com on a revenue share basis. Describing the relationship, Dave Panos, Demand Media's CMO said "It's an opportunity for us to get in front of the audience that's already congregating around very well-known brands."
But you won't find that content on the USAToday.com homepage.
When he said "already congregating around very well-known brands" what he meant was "will rank well on Google." And so, what we have is a paid content partnership which subverts search relevancy algorithms.
If affiliate links shouldn't count, then why would affiliate content?
If Google doesn't stop it from day 1 then the media companies are going to quickly become addicted to the risk-free money like crack. And if Google tries to stop it *after* it is in place then they are going to find themselves lambasted in the media with talks of anti-trust concerns.
Something to think about before heading too far down that path.
Two Roads Diverged in a Wood...
How is a content exchange network any different than a link exchange network? The intent is exactly the same, even if the mechanics and payment terms differ slightly.
If a paid link that subverts search relevancy algorithms shouldn't count on the web graph, then why should Google trust paid content that subverts search relevancy algorithms?
Will the search results start filling up with similar sounding misinformed content ranking for 1 then 3 then 8 of the top 10 search results? Do the search results slowly get dumbed down 1 article and 1 topic at a time?
This trend *will* harm both the accuracy and diversity of content ranking in the search results. And it will grow progressively worse as people begin to quote the misinformed garbage on other websites (because hey, if it ranks in Google and is on USA Today it is *probably* true). Or is it?
Some questions worth thinking about:
Google is willing to truth police SEOs. Will they do the same for media outlets publishing backfill "content"?
How will Google be able to filter out the Demand Media content without filtering out the rest of the media sites?
Does Google care if the quality & diversity of the search results is diminished, even if/when most searchers will not be savvy enough to recognize it? I guess it depends on who has the last word on the issues inside Google, because most garbitrage content is wrapped in AdSense ads.
A content mill is a site that publishes cheap content. The content is either user-contributed, paid, or a mix of the two. The term content mill is obviously pejorative, the implication being that the content is only published to pump content into search engines, and is typically of low value in terms of quality.
The problem is that some sites that publish cheap content may well provide value, but it depends who is reading it. For example, a forum might be considered a content mill, as it contains cheap, user-generated content of little value to a disinterested visitor, or a forum might be a valuable, regularly updated resource provided by a community of enthusiasts!
Depends who you ask.
As Aaron says, content mills are all the rage in 2010. Let's take a closer look.
Why Are SEOs Interested In Content Mills?
This idea is nothing new. It's actually a white-hat SEO strategy, and has been used for years.
Write content about those keywords
Publish content and attempt to rank that content in search engine results
If you can publish a page at a lower cost than your advertising return, then you simply repeat the process over and over, and you're golden. Think Adsense, affiliate, and similar means to monetize pages. Take a look at Demand Media.
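The arithmetic behind that repeat-the-process loop can be sketched in a few lines of Python. Every number below (production cost, visits, revenue per thousand visits) is a hypothetical placeholder, not a figure from Demand Media or any other publisher, and the function names are invented for this illustration.

```python
def monthly_revenue(monthly_visits, revenue_per_thousand_visits):
    """Ad revenue one page earns per month."""
    return monthly_visits / 1000 * revenue_per_thousand_visits


def page_profit(cost_to_produce, monthly_visits, revenue_per_thousand_visits, months):
    """Revenue earned over the given number of months, minus production cost."""
    earned = monthly_revenue(monthly_visits, revenue_per_thousand_visits) * months
    return earned - cost_to_produce


def months_to_break_even(cost_to_produce, monthly_visits, revenue_per_thousand_visits):
    """Months until the page has paid for itself, or None if it never will."""
    per_month = monthly_revenue(monthly_visits, revenue_per_thousand_visits)
    if per_month <= 0:
        return None
    return cost_to_produce / per_month


# Hypothetical page: $15 to produce, 200 visits a month, $10 per 1,000 visits.
cost, visits, rpm = 15.0, 200, 10.0

print("break-even months:", months_to_break_even(cost, visits, rpm))
print("profit after 24 months:", page_profit(cost, visits, rpm, months=24))
```

Under these made-up inputs the page pays for itself in well under a year and everything after that is margin, which is why the model pushes so hard on lowering the cost per page.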
The Problem With Content Mills
One of the problems with content mills is that in an attempt to drive the production cost of content below the predicted return, some site owners are producing garbage content, usually by facilitating free contributions from users.
At the low end, Q&A sites proliferate wherein people ask questions and a community of people with opinions, informed or otherwise, provide their two cents worth. Unfortunately, many of the answers are worth somewhat less than two cents, resulting in pages of little or no value to an end reader. I'm sure you've seen such pages, as such pages often rank well in search engines if they are published on a domain with sufficient authority.
At the other end of the spectrum, we have sites that publish higher-cost, well researched content sourced from paid writers. A traditional publishing model, in other words. Generally speaking, such pages are of higher value to the end user, but the problem is that the search engines don't appear able to tell the difference between these pages and the junk opinion pages. If the content mill has sufficient authority, then the junk gets promoted.
And there are many examples in between, of course.
As Tedster mentioned, "the problem here is that every provider of freelance content is NOT providing junk - though some are. As far as I know, there is no current semantic processing that can sort out the two. It's tough to see how this could be quickly and effectively reined in, at least not by algorithm. I assume that this kind of empty filler content is not very useful for visitors - it certainly isn't for me. So I also assume it must be on Google's radar."
The Future Of Content Mills
I think Tedster is right - such sites will surely appear on Google's radar, because junk, low value content doesn't help their end users.
It must be a difficult problem to solve, else Google would have done so by now, but I think it's reasonable to assume Google will try to relegate the lowest of the low-value content sites at some point. If you are following a content mill strategy, or considering starting one, it's reasonable to prepare for such an eventuality.
The future, I suspect, is not to be a content mill, in the pejorative sense of the word. Aim for quality.
Arbitrary definitions of quality are difficult enough, as we've discussed above. Objective measurement is impossible, because what is relevant to one person may be irrelevant to the next. The field of IQ (information quality) may provide us some clues regarding Google's approach. IQ is a form of research in systems information management that deals specifically with information quality.
Here are some of the metrics they use:
Authority- Authority refers to the expertise or recognized official status of a source. Consider the reputation of the author and publisher. When working with legal or government information, consider whether the source is the official provider of the information.
Scope of coverage - Scope of coverage refers to the extent to which a source explores a topic. Consider time periods, geography or jurisdiction and coverage of related or narrower topics.
Composition and Organization - Composition and Organization has to do with the ability of the information source to present its particular message in a coherent, logically sequential manner.
Objectivity - Objectivity is the bias or opinion expressed when a writer interprets or analyzes facts. Consider the use of persuasive language, the source’s presentation of other viewpoints, its reason for providing the information and advertising.
Validity - Validity of some information has to do with the degree of obvious truthfulness which the information carries.
Uniqueness - As much as ‘uniqueness’ of a given piece of information is intuitive in meaning, it also significantly implies not only the originating point of the information but also the manner in which it is presented and thus the perception which it conjures. The essence of any piece of information we process consists to a large extent of those two elements.
Timeliness - Timeliness refers to information that is current at the time of publication. Consider publication, creation and revision dates.
Any of this sound familiar? It should, as the search landscape is rife with this terminology. This is not to say Google look at all these aspects, but they have used similar concepts, starting with PageRank.
As conventional SEO wisdom goes, Google may have tried to solve the relevancy problem partly by focusing on authority, on the premise that a trusted authority must publish trusted content, so the pages of a domain with a high degree of authority receive a boost over those with lower authority levels. But this situation may not last, as some trusted sources, in terms of having authority, do, at times, publish auto-gen garbage content. Google may well start looking at composition metrics, if they aren't doing so already.
This is speculation, of course.
I think a good rule of thumb, for the time being, should be "will this page pass human inspection?". If it looks like junk to a human reviewer in terms of organization, and reads like junk in terms of composition, it probably is junk, and Google will likely feed such information back into their algorithms. Check out Google's Quality Rater Document from 2007 which should give you a feel for Google's editorial policy.
We're very skeptical about the scale argument, as you might expect. There's a lot of aspects to this subject that are not very well understood.
So in all of this stuff, the scale arguments are pretty bogus in our view because it's not the quantity or quality of the ingredients that make a difference, it's the recipes. We think we're where we are today because we've got better recipes and we have better recipes because we spent 10 years working on search improving the performance of the algorithm.
We don't have better algorithms than anyone else. We just have more data.
And this is why you see so many hucksters hyping trash, committing fraud, scamming users, cutting corners, and working legal loopholes at launch time to try to grow marketshare *at any cost*
Build the scale and you have the cashflow and feedback mechanisms in place to test viral marketing strategies, improve conversion rates, increase real (and perceived) relevancy, and lock in users.
"In a July 19, 2005 e-mail to YouTube co-founders Chad Hurley and Jawed Karim, YouTube co-founder Steve Chen wrote: 'jawed, please stop putting stolen videos on the site. We’re going to have a tough time defending the fact that we’re not liable for the copyrighted material on the site because we didn’t put it up when one of the co-founders is blatantly stealing content from other sites and trying to get everyone to see it.'"
"Our dirty little secret... is that we actually just want to sell out quickly," said Karim at one point. In an e-mail, Chen talked about “concentrat[ing] all of our efforts in building up our numbers as aggressively as we can through whatever tactics, however evil.” - Ars Technica
Welcome to the exciting world of innovation in online media!
Without brand you have nothing.
With brand even a wounded duck full of unauthorized scraped content like YouTube or Mahalo somehow manages flight, at least for a while. Then you only need to find someone dumb enough to buy the growth story and purchase the bag of smoke before the fire emerges.
Of course people don't have to cut corners, lie, cheat, and steal to build a real business. Those are the strategies employed by people trying to sell value where none exists. You can do just fine by dominating a small niche THEN leveraging data to grow. It is not sexy. You probably can't hype it to the media. It might not lead to an 8 or 9 figure payday. But then you won't have to describe your strategy as "whatever tactics, however evil.”
Relevancy is a good thing. It makes search and the world more efficient. Many attempts at relevancy, like making search more social, may just create more noise. But the fact that computers are getting better at understanding language is a good thing: "our measurements show that synonyms affect 70 percent of user searches across the more than 100 languages Google supports."
But it seems each increase in relevancy justifies additional increases in irrelevancy to increase monetization.
Each individual piece sounds useful and helpful, but the end effect (and goal) is hijacking and misdirecting traffic to display more ads.
Even when you claim your own business listing, Google will show your customers recommendations of other competing businesses on your business profile page. One of the best advertising based business models is extortion. And while the sum of the pieces may amount to that, certain ad networks are clever in how they tie it all together to *appear* innocent, even when acting like a shark.
What does a spam site do? Scrape content, misdirect visitors, and hope to get an ad click. Look at the above sequence through the same lens. It is the same thing - eeeeeeeeeevil.
SEO is Evil, Except When I Am Selling It!!!!
And yet a lot of the largest online spam publishers / scraper websites are taking a page out of Google's book...call SEO professionals scammers selling snake oil, while building search arbitrage businesses based on stealing third party content and wrapping it in ads. Perhaps the goal of charlatan douchebags like Dave Sifry and Jason Calacanis is to promote the Google anti-SEO public relations messaging in the hope that Google will not burn their sites to the ground. It may well work.
A popular SEO figure who sold a content management system based on cloaking mentioned at a secret meeting amongst Google's spam team and top SEOs that he loves turning in spammers. If he didn't promote Google's misinformed view he probably wouldn't get away with a business model built on cloaking.
What are Technorati and Mahalo but glorified scraper websites? And yet to promote such trash they claim to be search evangelists fighting for the purity of the search results (while they scrape scrape scrape).
While publicly those people trash SEO, they sell SEO services, and a friend told me that they are even using high pressure telemarketing and email spam to pitch "services" ... one such message I was forwarded stated:
Thanks for taking the time to review our new and improved demo. I'm glad you liked it and I'm forwarding you the PowerPoint version for you to truly experience the animation. Once you've distributed to the right parties I can always hop on a quick call to go through the demo really quick to really emphasize the value as an SEO component which is what the end result really is. Along the way you reap the benefits of having great content, a social media platform that all work to SEO and drive traffic. So even if up front the value is hard to fit into the normal SEO purchase, think of it as SEO with bells and whistles.
And as long as Google continues to rank the main scraper websites from such companies, that provides the proof of value which sells the garbage content to big brands. And so the above pitch was made by you-know-who, and Demand Media is going to start selling content to old media sites: "One example Kydd mentioned was Demand’s partnership with the travel section of the Atlanta Journal-Constitution, which, like most newspapers, is strapped for cash."
Quick question: what is to prevent Demand Media from partnering with hundreds of such media sites to leverage the combination of cheap labor, keyword earnings data, the media site's PageRank, and really just doing some serious damage to the search results? Unless the trend is altered, within 3 years almost any midtail to longtail keyword of value will have at least 7 of the top 10 results recycling the same poorly researched semi-legible informationless information.
All of the top Google search results say it is true. SO IT MUST BE!!!
“The basic idea of this contract,” he writes, “is that authors, journalists, musicians and artists are encouraged to treat the fruits of their intellects and imaginations as fragments to be given without pay to the hive mind. Reciprocity takes the form of self-promotion. Culture is to become precisely nothing but advertising.”
The above has been highlighted many times on this blog, but its damage has been far faster and far more widespread than even I anticipated.
The lingering effects of the economic recession, coupled with an expanding supply of efficient, and highly targeted online advertising networks, is reshaping the way big advertisers and agencies perceive the value of online media outlets. The result has been a pronounced polarization of the online advertising marketplace, with perceived demand rising for both the high-end of the most premium publishers and the low-end of ad networks and aggregators. This has caused perceived advertising value for the muddled middle of the marketplace - all but the most premium publishing sites, and the major online portals like AOL, Microsoft and Yahoo - to erode, as the ad industry focuses its attention on the top and the bottom players.
Those ad networks are (of course) full of fraudulent distribution which helps make them seem cheaper than they are, while leeching off the legitimate publishers and driving down CPM rates on legitimate media.
But as Demand Media saturates their site the returns lower and they are in need of more links to get more "content" indexed. And so they are promoting a business model based on incentivized publishing, which includes both "The more high quality links to your article there are on the web, the more highly a search engine will rank it" and "Your family and friends are probably curious about what you are writing anyway. Send them links and invite them to take a look!"
Given that those authors' articles are hidden in the bowels of a large site (and that they are already being encouraged to build exposure), how big of a jump is it to assume that some of them will search for this or this? How many of them will create unofficial click rings? How many will ask friends to click an ad while they view it? How will Google be able to detect such activity given the big smokescreen such a large site provides? They can't.
Who does the rise of content scrapers help? Those who are involved in the manufacturing of bulk misinformation, search companies which pay people to steal content and wrap it in their ads, and those who sell subscription content (well, up until some of the above outfits buy subscriptions to those sites to re-write and dumb down the content). In some markets (where the market leader is clear and obvious and often referenced on the garbitrage websites) the backfill junk content might also help develop a competitive moat between the top brands and weaker competitors. It might also help some people involved in analytics, as more businesses need to squeeze every ounce of profit to stay alive.
Success from scratch in many polluted markets will require more grit, more scars, and better differentiation. As robotic content fills the search results, people will likely gravitate toward the expression of emotions. At the same time some employers are trying to prevent employees from having the opportunity to get their hands dirty, leaving an opportunity for competing businesses who want the additional exposure.
Mark Cuban recently talked about how search engines and content aggregators are vampires.
There is no reason to be indexed in Google. ... You haven’t gotten anything back
But he failed to disclose how his Mahalo investment loots content.
If Google is a vampire (while sending away billions of Dollars of traffic for free) then what does that make Mahalo (which borrows your titles and abstracts as content to pull search traffic into their ad cluttered pages, while placing your content below the fold (while using nofollow on attribution links))?
Is the following accurate?
If you think otherwise, then please explain. ;)
Danny Sullivan TORE UP Mark Cuban in a must read article which only Danny could have written. It is well worth a read for anyone who wants to understand the hypocrisy behind the Mahalo position on content scraping / vampiring.
I was talking to a friend yesterday who was at a conference where Demand Media's CEO spoke, and he stated that nobody asked the big question: "what if Google decides they don't like you anymore?"
Then I got thinking about how Google torched Squidoo after Jason Calacanis went on his public campaign to rebrand it as spam. But today under the same level of scrutiny, how is Mahalo (which scrapes millions of 3rd party content listings *without any editorial filter*) not spam? Squidoo at least donates $10,000 a month to charity. Mahalo just "borrows" your content without permission and keeps all the cash.
In the chat room, I said hello to teeceo, but I know the stuff that he was doing and it’s shoot-on-sight. I think anyone who is blackhat knows (or should know) that I’m happy to talk to anyone, but that we’ll still take action on the spam we find.
Imagine taking that approach to hunting search spam all day long, and then ignoring the *fact* that Mahalo is scraping millions of third party listings and using them as content with no editorial filtering.
Then I started thinking about why the Google spam team could ignore something as outrageous as Mahalo, especially when it was built by a guy who was a false anti-spam evangelist. Is it because Jason is a good guy? No. Is it because there is some actual editorial vetting of the content? No. Is it because Google is getting a cut of the AdSense revenues? Google doesn't need the short term cash flow (look at all the affiliate AdWords advertisers they just torched), so that is too cynical of a view.
Yes, Google wants display inventory (their biggest opportunity for 2010 according to the quarterly call), and these "content" websites have already given themselves over to Google as inventory. But it must be something deeper than that. So I started thinking about it from a long-term strategic level...
Google won't penalize sites like Mahalo (even though they blatantly violate Google's guidelines) because Google *wants* to use the works of companies like Mahalo, Demand Media, and Aol to lower the value of other content and bankrupt a lot of the traditional media companies.
Why would Google want to do that?
There is excessive duplication in the marketplace. The faster that duplication is driven out of the marketplace the more desperate companies will be to cut deals with Google. And while there is a down market Google can drive companies out of the market and just claim that it was the economy that did it (much like how Mahalo used the down economy as an excuse to fire most of their editorial staff and replace them with content scraping robots).
Once a lot of media companies are bankrupted, the market is far more efficient and there are fewer mouths to feed, which means Google can squeeze greater profit margins out of the media ecosystem by getting a fatter cut of the ad revenue.
Once it starts harming the Google brand, I expect them to act quickly and decisively. And sites like Mahalo will see a sharp drop in traffic. Jason better milk it while he can. The clock is ticking.
On Hacker News, Melvin, from Web Design Company, had a great analogy on the Mahalo business model:
Let's use a different industry to illustrate what is happening.
Let's say a band named The Beatles records a new album. The local radio station gets a copy of their album and plays their song. The listeners love it so they play it more often, but they don't mention who the band is and on their website, they put up a link to download the song... but without any credits. Their audience grows. They get advertisers to advertise to their audience. They say, "hey, playing good songs gets us more listeners and more listeners gets us more advertisers, which gets us more $$. Let's do this more often." So they go do this 500,000 times, and each time never mentioning who the artist is. They grow and prosper while the artists starve.
Oh, in the mean time they call the artist scum.
In the above metaphor, the artists are the bloggers whose content Mahalo is using. The radio station ripping off the artist is Mahalo. The Federal Communication Commission is like Google, who is allowing all this to continue because the radio station is giving them a cut from the advertising revenue.
Hope this helps make it a little more clear why what they are doing is wrong, needed to get exposed and needs to get fixed.
The analogy isn't 100% perfect...but it *is* pretty darn close. :D
[Update: And it was so bad that Demand Media removed it from YouTube after I highlighted it, proving my point]
So remarkably bad, that I had to share it! :D
I don't know who Demand(ed) that Media, but could I please get my minute and six seconds back?
With 50,000+ views, that 1 video has wasted over a month of human life, so far. How many man-years are wasted watching such garbage? And yet they are just getting started! Demand Media's goal is to create a million pieces of "content" each month.
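For anyone who wants to check that "over a month" figure, here is a rough back-of-the-envelope sketch. It assumes exactly 50,000 views (the real count was somewhat higher) and that every viewer sat through the full minute and six seconds, which is surely generous. The function name is just for illustration.

```python
# Back-of-the-envelope check of the "over a month of human life" claim.
# Assumptions (not measured data): exactly 50,000 views, and every view
# ran for the video's full length of 1 minute 6 seconds.
VIEWS = 50_000
VIDEO_SECONDS = 1 * 60 + 6      # 66 seconds
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def wasted_days(views: int, seconds_per_view: int) -> float:
    """Total viewing time across all views, expressed in days."""
    return views * seconds_per_view / SECONDS_PER_DAY


days = wasted_days(VIEWS, VIDEO_SECONDS)
print(f"{days:.1f} days")  # prints "38.2 days"
```

Under those assumptions the total comes out a bit over 38 days, so "over a month" holds up.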
What do YouTube users think of that "content"?
Hmm....not impressed. If Google hates cloaking and machine generated content then why is trash that is handmade seen as being any better?
Does Google realize what they are funding? Do they even care if the web turns into a pile of junk? What will come of it?