Google Kills eHow Competitors, eHow Rankings Up

Economics Drive Literally Everything

Media is all about profit margins. eHow was originally founded in 1998 & had $36 million in venture capital behind it. But the original cost structure was flawed due to high content costs. The site failed so badly that it was sold in 2004 for $100,000. The original site owners had GoogleBot blocked. Simply by unblocking GoogleBot and doing basic SEO on existing content, the site reached a revenue run rate of $4 million within 2 years, which allowed the site to be flipped for a 400-fold profit.

Demand Media bought it in 2006 and has pushed to scale it while making cheaper and lower quality content. Demand Media has since gone public with a $1.5+ billion valuation based largely on eHow. Just prior to the DMD IPO Google's Matt Cutts wrote a warning about content mills on the official Google blog.

Seeing Value Where Others do Not

People are arguing over whether Demand Media is overvalued at its current valuation, and at one point the NYT was debating buying into Demand Media by rolling About.com into Demand & owning 49% of the combined company. But the salient point to me is that we are talking about something that was bought for only $100,000 7 years ago. Sure the opportunities today may be smaller scale and different, but if you see value where others do not then recycling something that has been discarded can be quite valuable.

In this 6-page article about the eHow way on page 6 there was a recycling tip any SEO can do with the help of some OCR software:

The key, he said, was to keep costs low. If possible, don't pay for the intellectual content. Look for material, he urged, on which the copyright has expired. Any book published in the U.S. before 1923 was available.
He said he was in the process of turning vanloads of old books into websites. With a few hours of labor, for example, you could take a turn-of-the-century Creole cookbook and transform it into the definitive site for vintage Creole recipes. Google's AdSense program would then load the thing up with ads for shrimp and cooking pots and spices and direct people looking for shrimp recipes to your website.

As a spin on the above, LoveToKnow has published a 1911 encyclopedia online. And ArticlesBase (an article farm which built up its popularity by linking to contributor sites) now slaps nofollow on all outbound links & is pulling in a cool $500k per month!

How did ArticlesBase grow to that size? It and Ezinearticles were a couple of the "selected few" which lasted through the last major burn down of article directories about 3 or 4 years ago. But it seems their model has peaked after this last Google update.

Search is Political

Content farms are proving to be a political issue in search. They are beginning to replace "the evil SEO" in the eye of the press as "what creates spam." Rich Skrenta created a spam clock which stated that a million spam pages are created every hour. He then followed up by banning 20 content farms from the Blekko search results & burning spam man. ;)

Microsoft also got in on the bashing, with Harry Shum highlighting that Google was funding web pollution. When Blekko's model is based on claiming Google polluted the web with crap, Microsoft says the same thing, and there is a rash of end user complaints, there are few independent experts the media can call upon to talk about Google - unless they decide to talk to SEOs, who tend to be quite happy to highlight Google's embarrassing hypocrisy. Freelance writers may claim that marketing is what screwed up the web, but ultimately Google has nobody credible and well known to defend them at this point. The only people who can defend Google's approach are those who have a taste of the revenue stream. Hence why Google had to act against content farms.

Always Be Selling

Demand Media's CEO is the consummate sales professional. When Google first warned about content farms, Mr. Rosenblatt used the above to claim that Google means "duplicate content" when they write about content farms. Then Demand quickly scrambled after they were caught publishing plagiarized content the following day. :)

Google stepping up their public relations smear campaigns against Bing and others is leaving Google looking either hypocritical or ignorant in many instances, like when a Google engineer lambasted an ad network without realizing Google was serving the scam ad.

Social Answers?

While on the content farm topic, it is worth mentioning that Answers.com was bought for $127 million & there is also a bunch of news about Ask's strategy in the Ask section near the bottom of this newsletter. On the social end of the answer farm model, Facebook was rumored to be looking into the space & Twitter bought a social answer service called Fluther. Even Groupon seems to be looking at the space. Quora is well hyped on TechCrunch, but will have a hard time expanding beyond the tech core it has developed.

High Quality Answer Communities?

At first glance StackExchange's growth looks exciting, but it has basically gone nowhere outside of the programming niche. In my opinion they are going to need to find subject matter experts to lead some of their niche sites & either pay those experts or give them equity in the sites if they want to lead in other markets. Worse yet, few people are as well educated about online schemes as programmers, so the other sites will not only lack leadership, but will also be much harder to police. Just look at the junk on Yahoo! Answers! There are Wordpress themes and open source CMS tools for QnA sites, but I would pick a tight niche early if I was going to build one as the broader sites seem to be full of spam and the niche sites struggle to cross the chasm. As of writing this, fewer than 50 Mahalo answers pages currently indexed in Google have over 100 views. It flat out failed, even with financial bribes and a PR spin man behind it.

A Warning

A Google engineer nicknamed moultano stated the following on Hacker News:

At the organizational level, Google is essentially chaos. In search quality in particular, once you've demonstrated that you can do useful stuff on your own, you're pretty much free to work on whatever you think is important. I don't think there's even a mechanism for shifting priorities like that.
We've been working on this issue for a long time, and made some progress. These efforts started long before the recent spat of news articles. I've personally been working on it for over a year. The central issue is that it's very difficult to make changes that sacrifice "on-topic-ness" for "good-ness" that don't make the results in general worse. You can expect some big changes here very shortly though.

A good example of the importance of padding out results with junk on-topic content to aid perceived relevancy can be seen by looking at the last screenshot of a search result here. Blekko banned the farms, but without them there is not much relevant content that is precisely on-topic. (In other words, content farms may be junk, but it is hard to have the same level of perceived relevancy without them).

New Signals

Google created a Chrome plugin to solicit end user feedback on content mills, but that will likely only reach tech savvy folks & the feedback is private. Google can claim to use any justification for removing sites they do not like though, just like they do with select link buying engagements. Look the other way where it is beneficial, and remove those which you personally dislike.

In a recent WSJ article Amit Singhal was quoted as saying new signals have been added to Google's relevancy algorithms:

Singhal did say that the company added numerous “signals,” or factors it would incorporate into its algorithm for ranking sites. Among those signals are “how users interact with” a site. Google has said previously that, among other things, it often measures whether users click the “back” button quickly after visiting a search result, which might indicate a lack of satisfaction with the site.

In addition, Google got feedback from the hundreds of people outside the company that it hires to regularly evaluate changes. These “human raters” are asked to look at results for certain search queries and questions such as, “Would you give your credit card number to this site?” and “Would you take medical advice for your children from those sites,” Singhal said.
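To make the "back button" signal from the quote above concrete, here is a minimal sketch of how a quick-back rate per site could be computed from click logs. Everything in it is an assumption for illustration: the `Visit` record, the `quick_back_rate` helper, the example domains, and the 10 second cutoff are all hypothetical, and nothing here is Google's actual method.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff: a return to the results page faster than this counts as a "quick back".
QUICK_BACK_SECONDS = 10.0


@dataclass
class Visit:
    site: str
    # Seconds until the searcher hit "back"; None means they never came back.
    seconds_before_back: Optional[float]


def quick_back_rate(visits):
    """Return {site: share of visits that ended with a quick click on "back"}."""
    totals = defaultdict(int)
    quick = defaultdict(int)
    for visit in visits:
        totals[visit.site] += 1
        came_back_fast = (
            visit.seconds_before_back is not None
            and visit.seconds_before_back < QUICK_BACK_SECONDS
        )
        if came_back_fast:
            quick[visit.site] += 1
    return {site: quick[site] / totals[site] for site in totals}


# Made-up log data for two made-up sites.
visits = [
    Visit("farm.example", 3.0),
    Visit("farm.example", 5.0),
    Visit("farm.example", None),
    Visit("book.example", 90.0),
    Visit("book.example", None),
]
rates = quick_back_rate(visits)
print(rates)
```

In this toy data the farm site gets bounced on two visits out of three while the book site never does. A real signal would presumably need to be normalized by query type, since some searches are answered in a few seconds even by a good page.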

Evolving the Model

One interesting way to evolve the content farm model is through the use of tight editorial focus, a core handful of strong editors, and wiki software. WikiHow was launched by a former eHow owner, and when you consider how limited their relative resources are, their traffic level and perceived editorial quality are quite high. Jack Herrick has struck how-to gold twice!

Going Political?

AOL purchased The Huffington Post for $315 million. Here are some reviews of that purchase. The following analysis is a bit rough, but I still think it is spot on - contrary to popular belief, most of Huffington Post's pageviews are still driven by their professionally sourced content.

Editors who have a distaste for pageview journalism are already quitting AOL. But if you are interested in the content farm business model, AOL's business plan was leaked publicly. Oooops. :D

Conflating Scraper Sites vs Content Farms

In addition to general content farms, Google is fighting a war against scraper websites. One such algorithmic update has already been done against sites repurposing content, and the content farm algorithm just recently went live & whacked a bunch of content farms. Check out the top losers from Sistrix's data.

Notice any content farms missing from the above list? Maybe the biggest one? Here is a list of some of eHow's closest competing sites (based on keywords, from SEM Rush). The ones in red got pummeled, the ones in yellow dropped as well & were fellow Demand Media sites, and the ones in green gained traffic.

Getting Hammered

Jason "will do anything for ink" Calacanis recently gave an about face speech claiming people need to step away from the content farm business model, and in doing so admitted that roughly everything he said about Mahalo over the past couple years was a complete lie. Surprise, surprise. The interesting bit is that the start up community - which used to fawn over his huckster PR driven ploys - no longer buys them. Jason claimed to have "pivoted" his business model again, but once again we see more garbage content. His credibility has been spent. And so have his rankings! Sistrix shows that not only is he ranking for fewer keywords, but that the graph has skewed downward to worse average positions.

After the Crash, What is Next?

The biggest content farms like Ask & eHow will still do well in the short run. Over the long run I see Google bringing the results of content farms to the attention of book publishers & then working to slowly rotate out from farmed content to published book content. Most readers do not know that most book writers are lucky if they earn $10,000 writing a 300 or 400 page tome. Publishers tell book authors that with the additional exposure they can often sell lots more other things, but unless the content is highly targeted that might not back out well for the author. But that cheap content is far better structured and far more vetted than the mill stuff is.

Over the past week I have been seeing more ebooks in the search results, though I am not entirely sure if that is just because I am searching for more rare technical stuff that simply might not be online.

The Question Nobody is Asking

I highlighted Google's hypocritical position in judging intent with links while claiming they need an algorithmic approach to content farms. But nobody is thinking beyond the obvious question. Everyone wants to know who Google punished the most, but nobody is asking who gained the most from this update.

Demand Media put out a statement that their traffic profile did not change materially.

But what they didn't mention is that eHow's rankings are actually up! In fact, their new distribution chart looks just like their old one, only skewed a bit to the left with higher rankings. eHow's profile is 15% better than it was before the update & the only site which gained more traffic from this update than eHow did was Youtube.

How Did We Get Here?

People may have been sorta aware of garbage content & saw it ranking, but were apathetic about it. Most people are far more passive consumers of search than they believe themselves to be - when the default orders switch people still tend to click the top ranked result. It was only when eHow started branding itself as a cheap and disposable answer factory that people started to become outraged with their business model.

Demand Media further benefited from flagrant spammy guideline violations, like 301 redirecting expired domains into deep eHow pages. People I know who have done similar have seen their sites torched in Google. But eHow is different!

If you listen to Richard's interviews, you would never know him to be the type to redirect expired domains:

We really want to let Google speak for themselves. Whatever Matt Cutts and Google want to (say) about quality we totally support that because again that’s their corporate interest. What we said and would have said is we applaud Google removing duplicate content ... removing shallow, low quality content because it clogs the search results. Both we and Google are 100 percent focused on making the consumer happy. It’s the right thing to do and it’s good for our business.

If you syndicate Google's spin you can get away with things that a normal person can't. Which is why eHow renounced the content farm label even faster than they created it.

Article directories & topical hub sites have been online since before eHow was created. But eHow's marketing campaign was so caustic & so toxic that it literally destroyed the game for most of their competitors.

And now that Google has "fought content farms" (while managing to somehow miss eHow with TheAlgorithm) most of Demand Media's competitors are dead & Richard Rosenblatt gets to ride off into the sunset with another hundred million Dollars, as eHow is the chosen one. :D

Long live the content farm!

I am Long Mahalo...

...too bad Google is not!

Google just did their first content farm update & Mahalo appears to have taken a swan dive in the search results, freeing up space for higher quality websites.

Google's Amit Singhal & Matt Cutts wrote:

Many of the changes we make are so subtle that very few people notice them. But in the last day or so we launched a pretty big algorithmic improvement to our ranking—a change that noticeably impacts 11.8% of our queries—and we wanted to let people know what’s going on. This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.

Currently this update is US only, so if you are outside of the United States you may need to get a US VPN or add &gl=us to the end of your Google search URL (like so). Recent updates have had a variety of impacts and implications outside of content mills.
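For anyone checking a batch of queries, the &gl=us trick mentioned above can be scripted. This is a small sketch using only the Python standard library; the helper name `with_us_results` is made up for illustration, and it assumes a normal Google search URL where `gl` is the only country parameter in play.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_us_results(search_url: str) -> str:
    """Return search_url with gl=us set, replacing any existing gl value."""
    parts = urlsplit(search_url)
    # Keep every existing parameter except gl, then append gl=us at the end.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "gl"
    ]
    params.append(("gl", "us"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_us_results("https://www.google.com/search?q=how+to+tie+a+tie"))
# prints https://www.google.com/search?q=how+to+tie+a+tie&gl=us
```

Replacing an existing gl value rather than blindly appending avoids ending up with two conflicting country parameters in the same URL.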

But it seems other large content farms are still doing well

What sets Mahalo apart then? Perhaps it was karma. ;)

I suppose we should "pivot" this post with some featured video content

Google Chrome May Remove Address Bar?

Feb 21st

The Amazing Power of Domains!

A couple days ago there was a blog post on TheDomains about "how stupid SEOs are" and "the amazing power of domaining" where they highlighted how awesome domaining was because a guy registered a domain name he saw in a comic and it sent a bunch of traffic.

What that article failed to mention was:

  • That traffic wasn’t from the power of the domain name…that was the power of free advertising & the distribution of the comic strip.
  • The same domain name likely received ~ 0 traffic until it was featured in the comic strip. If it had an organic traffic stream for years before being highlighted it most likely would have already been purchased.
  • As that comic strip falls into the archives & into obscurity the organic traffic it was driving will drop back to roughly where it started at: 0.
  • The flood of new found traffic was hardly a goldmine anyhow. It was entirely irrelevant to his main business, and thus entirely worthless. The only exception there would be unless the person was offering information about comics, installing malware, pushing reverse billing scams, etc.

Being Ignorant Doesn't Create Profit

The laughable (and ignorant) thing about the comments on that post was that some of the people who were commenting were equating SEOs to misdirection & scams that sell traffic off to the highest bidder. Sorry, but that is what PPC domain parking is all about...the ad networks optimize yield & the publishers agnostically push whatever generates the most yield: often scams!

Stating that all SEOs are dumb spammers is precisely the same as stating that all domainers are cybersquatters. Neither is true, and neither serves much purpose, other than aiding the spread of ignorance.

Why Domainers View SEOs Dimly

Many domainers who try to hire SEOs fail badly because they are too cheap & buy from lousy service providers. They feel that since they bought domain x (and sat on it while literally doing nothing for a decade) that they somehow deserve to be the top ranked result. To be fair, it is pretty easy to become lazy and not want to change things when you register domain names & then literally watch them spit money at you. ;)

Against that approach, the smart SEOs (the ones actually worth hiring) realize that it is more profitable to buy their own domain names and keep all the cashflow for their efforts rather than doing 95% of the job for 5% of the revenues. Yes a good domain name is helpful, but with the right attitude you can still do quite well even on a hyphenated nasty looking .info domain name. ;)

Why SEOs View Domainers Dimly

A combination of squandered opportunity & arrogance.

I frequently tell myself that in 3 years or 5 years the web will be so competitive that it will no longer be as profitable as it is today. And every year I have pushed that mindset back another year while we grew. But who knows how long that will last? Sure as long as there are signals there will be ways to influence them, but if you are not one of the favored parties then at some point it will be challenging to compete.

The Real Challenge: the Search Duopoly vs Publishers

At the end of the day, a lot of us are small players trying to carve out our own niches in a network that is increasingly dominated by a few large companies.

When Google got into the web browser game, one of the big "innovations" was the Omnibox. They integrated search right into the address bar to help drive incremental search volume.

As they were a new browser it was not a big risk or big concern to domainers (as most people who use direct navigation are either people revisiting a website they already visited or people new to the web who are likely to use the most common default web browser - Internet Explorer). Nonetheless, address bar as search box highlighted things to come & a way the web would change.

When Google announced their Chrome OS they decided to do away with the CAPS LOCK BUTTON AND REPLACE IT WITH A SEARCH BUTTON. OOPS SORRY ABOUT THAT. Again, it is not a big deal today, but if that ever became standard the future would grow more challenging.

The big problem with Google doing such innovations is that whatever they do, they also give Microsoft permission to do. Google can't complain about what Microsoft is doing if Microsoft is only following Google's lead.

Let me take that back.

Google can complain, but they come off looking like douchebags when they do.

Hey look, Google will recommend *any* browser so long as it is monetized by promoting Google Search. Internet Explorer (the most popular browser) need not apply.

Rather than fighting Google's approach, Microsoft is riding on the coattails. Google's Toolbar sniffs end user data to help make search more relevant. So does Microsoft's web browser.

Google allows ads on trademarks. So then will Microsoft.

Just like the Omnibox, Internet Explorer 9 integrates search into the address bar.

As soon as IE9 rolls out, domainers can count on losing traffic month after month. This trend is non reversible in well established markets like the United States & Europe, and in 3rd world markets the ad markets pay crumbs.

More recently Google has suggested dumping the address bar from the browser. Everything goes through the Google front door! A front door which increasingly is 100% ads above the fold.

If that happens, it won't impact domainers much, but if Microsoft copies it, then look out below on domain prices. You wouldn't be able to get to a domain name without first being intercepted by a search engine toll booth. In that environment, a PPC park page produces ~ $0. And even established sites that are generic might not be a great strategy for creating *sustainable* profits if/when the organic results are below the fold. People who invest in brand have some protection against pricing pressures & irrelevant search results, but those who are generic don't typically have much brand to protect their placement nor profits.

This dominance over the search channel is even more fierce when you get on the mobile platform, as there are often only 1 or 2 results visible. Google's Get scraped or go to hell TM approach to review websites is all about extending their platform dominance onto the mobile phone. It has little screen space & they want to be able to suck in as much content as they can to slow down search market fragmentation into custom apps.

Google despised how Microsoft bundled services & believes all other competitors should win market by market based on the merit of the product. Google does not believe this line of thinking should be applied to TheGoogle though, as you need to be a seriously dominant market player to stay in the lead position while opting out of appearing in the search results of the default search engine.

Even on the regular web staying competitive is growing increasingly challenging due to these moves to lock up and redirect normal user behaviors to shift it through an increasingly ad dominated search channel.

Somewhere at MountainView

Feb 15th

Somewhere at MountainView.

Late afternoon.

Google Guy: Sigh. Our algo really does suck, sometimes.....

Other Google Guy: How so, dude?

Google Guy: It keeps returning low-quality farmer garbage

Other Google Guy: Mahalo!

Google Guy: Huh?

Other Google Guy: Sorry, just shouting out "Thanks!" to Marissa. She left me a cup cake this morning. You were saying?

Google Guy: Our algo, it keeps returning low-quality farmer garbage

Other Google Guy: Ah, right. We've gone all "Alta Vista" a bit lately, huh. People are noticing....

Google Guy: Hey! No one mentions the AV word around here, OK!

Other Google Guy: Sorry dude. So, what shall we do?

Google Guy: We could invent a cool new algorithm, like Sergey and Larry did all those years ago

Other Google Guy: Hahahaha....you ain't Sergey or Larry, dude. Anyway, they're more concerned with self-drive cars these days, aren't they? Search is so 2001.....

Google Guy: Look, we've got to do something. The technorati are getting uppity. They're writing blog posts. Tweets. Everything. And let's not forget the JC Penney debacle. The shareholders could get angry about this. Well, they would if they understood it.....

Other Google Guy: Do they?

Google Guy: Probably not.

Other Google Guy: So, what's the problem? My data is showing most of our users couldn't give a toss about the farmer stuff. Some of them like learning about how to pour a glass of milk. It's just the valleywags getting grumpy, and no one listens to them.

Google Guy: Right, but this has the potential to filter out. It might get on FOX! Too many people might get the wrong end of the stick, and suddenly we're not cool anymore.

Other Google Guy: But we're not cool n.......

Google Guy: Shut it. We're still cool, OK.

Other Google Guy: Anything you say, boss

Google Guy: Hmmm.......what we could do is go "social media". So hot right now. We could crowdsource it! We'd look very cool with the hipsters.

Other Google Guy: Mmmmmm.....sauce.....

Google Guy: We'll give 'em a Chrome extension. Yes! Make them do all the work. At very least, it's going to shut them up. They won't have to look at anything they don't want to look at. It will make them feel superior, and we can collect some data about what sites techno dudes don't like

Other Google Guy: Brilliant! Superb! One problem - won't content farmers use this against each other in order to take each other out?

Google Guy: Nah, it's just a "ranking signal". We have hundreds of 'em we apply to every search, don't you know ;)

Other Google Guy: Hahahah..."ranking signal". Nice one, Google Guy. You can add it to the other two hundred! Or was it three hundred? Shareholders love that stuff.

Google Guy: Laughs. Oh...kay.....almost finished this extension. I'll push it out there.....

Ten seconds pass.....

Google Guy: Hey! The first data is in already!

Other Google Guy: People use Chrome? Oops...I mean "People use Chrome!" Which sites are they blocking?

Google Guy: Wikipedia....

Other Google Guy: It figures.....

Google Guy: Oh, and Google.....sigh......

Satire. It never happened. Not really :)

Two Diametrically Opposed Google Editorial Philosophies

An "Algorithmic" Approach

When it comes to buying links, Google not only fights it with algorithms, but also ran a 5-year long FUD campaign, introduced nofollow as a proprietary filter, encouraged webmasters to rat on each other, and has engineers hunting for paid links. On top of that, Google's link penalties range from subtle to overt.

Google claims that they do not want to police low quality content by trying to judge intent, that doing so would not be scalable enough to solve the problem, & that they need to do it algorithmically. At the same time, Google is willing to manually torch some sites and basically destroy the associated businesses. Talk to enough SEOs and you will find stories of carnage - complete decimation.

Economics Drive Everything

Content farms are driven by economics. Make them unprofitable (rather than funding them) and the problem solves itself - just like Google AdWords does with quality scores. Sure you can show up on AdWords where you don't belong and/or with a crappy scam offer, but you are priced out of the market so losses are guaranteed. Hello $100 clicks!
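To put rough numbers on the "priced out" point, here is a sketch of the arithmetic under the simplified auction model that is often used to explain AdWords, where an ad's rank is its bid multiplied by its quality score. That model, the function name `min_bid_to_outrank`, and the example figures are assumptions for illustration; the real auction has more inputs than this.

```python
def min_bid_to_outrank(rival_bid: float, rival_quality: float, own_quality: float) -> float:
    """Bid at which own rank (bid * quality) matches the rival's rank.

    Assumes the simplified model rank = bid * quality score.
    """
    return rival_bid * rival_quality / own_quality


# A legitimate advertiser bids $10 with a quality score of 10.
print(min_bid_to_outrank(10.0, 10, 10))  # 10.0 -> an equally good ad matches at $10
print(min_bid_to_outrank(10.0, 10, 1))   # 100.0 -> a junk offer needs $100 per click
```

Under those assumptions a junk advertiser pays ten times as much for the same position, which is the kind of economic pressure the paragraph above argues could be applied to content farms.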

How many content farms would Google need to manually torch to deter investment in the category? 5? Maybe 10? 20 tops? Does that really require a new algorithmic approach on a web with 10's of millions of websites?

When Google nuked a ton of article banks a few years back the damage was fairly complete and lasted a long time. When Google nuked a ton of web directories a few years back the damage was fairly complete and lasted a long time. These were done in sweeps where one day you would see 50 sites lose their toolbar PageRank & see a swan dive in traffic. Yet content farms are a sacred cow that needs an innovative "algorithmic" approach.

One Bad Page? TORCHED

If they feel an outright ban would be too much, then they could even dial the sites down over time if they desired to deter them without immediately killing them. Some bloggers who didn't know any better got torched based on a single blog post:

The Forrester report discusses a recent “sponsored conversation” from Kmart, but I doubt whether mentions that even in that small test, Google found multiple bloggers that violated our quality guidelines and we took corresponding action. Those blogs are not trusted in Google’s algorithms any more.

One post and the ENTIRE SITE got torched.

An Endless Sea of Garbage

How many garbage posts have you seen on content farms?

When you look at garbage content there are hundreds of words on the page screaming "I AM EXPLOITATIVE TRASH." Yet when you look at links they are often embedded inline, and there is little context to tell whether the link was an organic reference or something that was paid for.

Why is it that Google is comfortable implying intent with links, but must look the other way when it comes to content?

Purchasing Distribution

Media is a game of numbers, and so content companies have various layers of quality they mix in to make it harder for Google to find signal from noise. Yahoo! has fairly solid content in their sports category, but then fluffs it out with top 10 lists and such from Associated Content. Now Yahoo! is hoping they can offset lower quality with a higher level of personalization:

The Yahoo platform aims to draw from a user’s declared preferences, search items, social media and other sources to find and highlight the most relevant content, according to the people familiar with the matter. It will be available on Yahoo’s Web site, but is optimized to work as an app on tablets and smartphones, and especially on Google Android and Apple devices, they said.

AOL made a big splash when they bought TechCrunch for $25 million. When AOL's editorial strategy was recently leaked it highlighted how they promoted cross linking their channels to drive SEO strategy. And, since acquisition, TechCrunch has only scaled up on the volume of content they produce. In the last 2 days I have seen 2 advertorials on TechCrunch where the conflicting relationship was only mentioned *after* you read the post. One was a Google employee suggesting Wikipedia needs ads, and the other was some social commerce platform guy promoting the social commerce revolution occurring on Facebook.

Being at the heart of technology is a great source of link equity to funnel around their websites. TechCrunch.com already has over 25% as many unique linking domains as AOL.com does. One of the few areas that is more connected on the social graph than technology is politics. AOL just bought Huffington Post for $315 million. The fusion of political bias, political connections, celebrity contributors, and pushing a guy who promoted (an ultimately empty) promise of hope and change quickly gave the Huffington Post even more link equity than TechCrunch has.

Thus they have the weight to do all the things that good online journalism is known for, like ads so deeply embedded in content you can't tell them apart, off-topic paginated syndicated duplicate content and writing meaningless posts devoid of content based on Google Trends data. As other politically charged mainstream media outlets have shown, you don't need to be factually correct (or even attempt honesty) so long as your bias is consistent.

Ultimately this is where Google's head in the sand approach to content farms backfired. When content farms were isolated websites full of trash Google could have nuked them without much risk. But now that there is a blended approach and content farms are part of public companies backed by politically powerful individuals, Google can't do anything about them. Their hands are tied.

Trends in Journalism

Much like the middle class has been gutted in the United States, Ireland (and pretty much everywhere that is not Iceland) by economic policies that gut the average person to promote banking criminals, we are seeing the same thing happen online to the value of any type of online journalism. As we continue to ask people to do more for less we suffer through a lower quality user experience with more half-content that leaves out the essential bits.

How to build a brick wall:

  • step 1: get some bricks
  • step 2: stack them in your workplace
  • step 3: build the brick wall

The other thing destroying journalism is not only lean farms competing against thick and inefficient organizations for distribution, but also Google pushing to control more distribution via their various data grabs: Youtube video & music, graphical CPA ads in the search results, lead generation ads in the search results, graphic AdSense ads on publisher sites that drive searches into those lead generation funnels, grouping like data from publishers above the organic search results, offering branded navigational aids above the organic search results, acquiring manufacturer data, scraping 3rd party reviews, buying sentiment analysis tools, promoting Google maps everywhere, Google product pages & local review pages, extended ad units, etc. If most growth in journalism is based on SEO & Google is systematically eating the search results, then at some point that bubble will get pricked and there will be plenty of pain to go around.

My guess is that in 3 to 4 years the search results become so full of junk that Google pushes hard to rank chunks of ebooks wrapped in Google ads directly in the search results. Books are already heavily commoditized (it's amazing how much knowledge you can get for $10 or $20), and given that Google already hard-codes their ebooks in the search results, it is not a big jump for them to work on ad deals that pull publishers in. It follows the trend elsewhere: "Free Music Can Pay as Well as Paid Music, YouTube Says."

It's Not All Bad

The silver lining there is that if you are the employer your margins may grow, but if you are an employee & are just scraping by on $10 an hour then it increases the importance of doing something on the side to lower your perceived risk & increase your influence. A few years back Marshall Kirkpatrick started out on AOL's content farms. The tips he shared to stand out would be a competitive advantage in almost any vertical outside of technology & politics:

one day Michael Arrington called and hired me at TechCrunch. "You keep beating us to stories," he told me. I was able to do that because I was getting RSS feeds from key vendors in our market delivered by IM and SMS. That's standard practice among tech bloggers now, but at the time no one else was doing it, so I was able to cover lots of news first.
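For anyone who wants to try the same trick, here is a minimal sketch of the feed-watching half of it. This is my own illustration, not Marshall's actual setup: the sample feed, the `new_items` helper, and the `notify` stub are all hypothetical, and a real watcher would fetch each vendor's feed on a timer and hand new items to an IM or SMS gateway.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real watcher would download each vendor's feed URL.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Vendor News</title>
  <item><guid>1</guid><title>Vendor ships version 2</title><link>http://example.com/v2</link></item>
  <item><guid>2</guid><title>Vendor acquires startup</title><link>http://example.com/acq</link></item>
</channel></rss>"""


def new_items(feed_xml, seen):
    """Return (title, link) pairs for items whose guid is not yet in `seen`."""
    fresh = []
    for item in ET.fromstring(feed_xml).iter("item"):
        # Fall back to the link when a feed omits <guid>.
        key = item.findtext("guid") or item.findtext("link")
        if key and key not in seen:
            seen.add(key)
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


def notify(title, link):
    # Stand-in for an IM or SMS gateway, which is outside the scope of this sketch.
    print(f"NEW: {title} -> {link}")


seen_guids = {"1"}  # pretend item 1 was already seen on an earlier poll
for title, link in new_items(SAMPLE_FEED, seen_guids):
    notify(title, link)
```

The only state worth keeping is the set of item ids you have already seen, which is what lets you react to a story the moment it shows up rather than when you next happen to check the site.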

Three big tips from the "becoming a well known writer" front for new writers are...

  • if short form junk content is the standard then it is easier to stand out by creating long form well edited content
  • it is easier to be a big fish in a small pond than to try to get well known in a saturated area, so it is sometimes better to start working for niche publishers that have a strong spot in a smallish niche
  • if you want to target the bigger communities the most important thing to them (and the thing they are most likely to talk about) is themselves

Another benefit to publishers is that as the web becomes more polluted people will become far more likely to pay to access better content and smaller + tighter communities.

Prioritizing User Feedback?

On a Google blog post about web spam they state the following:

Spam reports are prioritized by looking at how much visibility a potentially spammy site has in our search results, in order to help us focus on high-impact sites in a timely manner. For instance, we’re likely to prioritize the investigation of a site that regularly ranks on the first or second page over that of a site that only gets a few search impressions per month.

Given the widely echoed complaints on content farms, it seems Google has a different approach on content farms, especially considering that the top farms are seen by millions of searchers every month.
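To make the stated policy concrete, here is a rough sketch of what "prioritize by visibility" would look like if taken at face value. This is my illustration only, not Google's code: the `SpamReport` type, the site names, and the impression counts are all made up.

```python
from dataclasses import dataclass


@dataclass
class SpamReport:
    site: str
    monthly_impressions: int  # hypothetical visibility figure


def prioritize(reports):
    """Order reports so the most visible sites are investigated first."""
    return sorted(reports, key=lambda r: r.monthly_impressions, reverse=True)


reports = [
    SpamReport("tiny-spam-blog.example", 40),
    SpamReport("big-content-farm.example", 5_000_000),
    SpamReport("mid-size-scraper.example", 12_000),
]

for report in prioritize(reports):
    print(f"{report.monthly_impressions:>9,} {report.site}")
```

Under that rule a farm seen by millions of searchers sorts to the top of the queue, which is exactly why the lack of action stands out.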

Implying Intent

If end users can determine when links are paid (with limited context) then why not trust their input on judging the quality of the content as well? The Google Toolbar has a PageRank meter for assessing link authority. Why not add a meter for publisher reputation & content quality? I can hear people saying "people will use it to harm competitors" but I have also seen websites torched in Google because a competitor went on a link buying spree on behalf of their fellow webmaster. At least if someone gives you a bad rating for great content then the content still has a chance to defend its own quality.

With link stuff there is a final opinion and that is it. Not only are particular techniques of varying levels of risk, but THE prescribed analysis of intent depends on who is doing it!

A Google engineer saw an SEO blog about our affiliate program passing link juice and our affiliate links stopped passing weight. (I am an SEO so the appropriate intent is spam). Then something weird happened. A few months later a Google engineer *publicly* stated that affiliate links should count. A few years later Google invested in a start up which turns direct links into affiliate links while hiding the paid compensation in the background. (Since Google is doing it the intent is NOT spam).

Implying Ignorance

Some of the content mills benefit from the benefit-of-doubt. Jason Calacanis lied repeatedly about "experimental pages" and other such nonsense. But when his schemes were highlighted he was offered the benefit of the doubt. eHow also enjoys that benefit of the doubt. It doesn't matter that Demand Media's CEO was the chairman of an SEO consulting company which sold for hundreds of millions of Dollars. What matters is the benefit of the doubt (even if his company flagrantly violates quality guidelines by doing bulk 301 redirects of EXPIRED domains into eHow ... something where a lesser act can put you up for vote on a Google engineer's blog for public lynching).

The algorithm. They say. It has opinions.

What Other Search Engines Are Doing

A Bing engineer accused Google of funding web pollution. Blekko invites end users to report spam in their index, and the first thing end-users wanted booted out was the content mills.

But Google need to be "algorithmic" when the problems are obvious and smack them in the face. And they need to "imply intent" where the problems are less problematic & nowhere near as overt.

Makes sense, almost!

Google vs Bing

Feb 2nd

Quite Similar Results

A few months back I was running Advanced Web Ranking and noticed that Google and Bing were really starting to come in line on some keywords.

Major Differences

Of course there are still differences between Bing and Google.

Google has far more usage data built up over the years & a huge market share advantage over Bing in literally every global market. Microsoft's poor branding in search meant they had roughly 0 leverage in the marketplace until they launched the Bing brand. That longer experience in search is likely what gives Google the confidence to have a much deeper crawl.

That head start also means that Google has been working on understanding word meanings and adjusting their vocabulary far longer, which also gives them the confidence to be able to use word relationships more aggressively (when Bing came to market part of their ad campaign was built on teasing Google for this). The last big difference from an interface perspective would be that Google forces searchers down certain paths with their Google Instant search suggestions.

Who Copied Who?

But the similarities between the search engines are far greater than their differences.

  • At the core of Google's search relevancy algorithm is PageRank and link analysis. Bing places a lot of weight on those as well.
  • Google also factors in the domain name into their relevancy algorithms. So does Bing.
  • Google has long had universal search & Bing copied it.
  • Google has tried to innovate by localizing search results. Bing localizes results as well.
  • Bing moved the right rail ads closer to the organic search results. Google copied them.
  • Bing put a fourth ad above the organic search results. Google began listing vertical CPA ad units for mortgages and credit cards above the organic search results - a fourth ad unit.
  • Bing has a homepage background image. Google copied them by allowing you to upload a personalized homepage logo.
  • Bing offers left rail navigation to filter the search results. Google copied them by offering the same.
  • Bing innovated in travel search. Google is trying to buy the underlying data provider ITA Software.
  • Bing included Freebase content in their search results. Google bought Metaweb, which owns Freebase.
  • Bing offered infinite scroll and a unique image search experience that highlights the images. Google copied it.

Oh, The Outrage

Off the start Bing was playing catch up, but almost anything they have ever tried which has truly differentiated their experience ended up copied by Google. Recently Google conducted a black PR campaign to smear Bing for using usage data across multiple search engines to improve their relevancy. The money quote would be:

Those results from Google are then more likely to show up on Bing. Put another way, some Bing results increasingly look like an incomplete, stale version of Google results—a cheap imitation.

Perhaps why Google finds this so annoying is that it allows Microsoft to refine their "crawl" & relevancy process on tail keywords, which are the hardest ones to get right (because as engines get deeper into search they have fewer signals to work with and a lot more spam). It allows Microsoft to conduct tests which compare their own internal algorithms against Google's top listings on the fly & learn from them. It takes away some of Google's economies of scale advantages.

Of course, Google uses the same sort of data to help refine their own search algorithms. The brand update was all about ranking related resources from subsequent related searches higher in the initial result set. YouTube's video recommendation engine was built on ideas from Amazon's recommendation algorithm (but Google *accidentally* left off the citation).

Is Google Eating Its Own Home Cooking (And Throwing UP?)

Here is what I don't get about Google's complaints though. Google had no problem borrowing a half-dozen innovations from Bing. But this is how Google describes Bing's "nefarious" activities:

“It’s cheating to me because we work incredibly hard and have done so for years but they just get there based on our hard work,” said Singhal. “I don’t know how else to call it but plain and simple cheating. Another analogy is that it’s like running a marathon and carrying someone else on your back, who jumps off just before the finish line.”

Yet when popular vertical websites (that have invested a decade and millions of Dollars into building a community) complain about Google disintermediating them by scraping their reviews, Google responds by telling those webmasters to go pound sand & that if they don't want Google scraping them then they should just block Googlebot & kill their search rankings.

When a content site compiles reviews, creates editorial features to highlight the best reviews (and best reviewers), and works to create algorithms to filter out junk and spam then Google is fine with Google eating all that work for free. Google then jumps off their backs just before the finish line and throws the repurposed reviews in front of Google searchers.

But if Bing looks at the data generated by searchers who are performing the searches on Google and uses it as 1 of 1,000 different relevancy signals then Google is outraged.

Clickstream Data & You

This public blanket admission of Microsoft using clickstream data for relevancy purposes is helpful. But outside of the PR smear campaign from Google there wasn't much new to learn here, as this has been a bit of an open secret amongst those in the know in the search space for well over a year now.

But the idea of using existing traffic stream data as a signal increases the value of having a strong diversified traffic flow which leverages:

  • search advertising (to get your foot in the door)
  • other forms of advertising that lead to exposure
  • making noise on social sites
  • cross promotion of featured network content
  • ranking in search verticals (which are sometimes incorporated into Google's core result set)
  • beautiful web design which looks good...

Recently we tested adding ads on one of our websites that had a fairly uninspired design on it. After adding the ads (which make the site feel a bit less credible) the new design was so much better fitting than the old one that the site now gets 26% more pageviews per visit. Anytime you can put something on your website which increases monetization, sends visitors away & yet still gets more user engagement you are making a positive change!

If You Can't Beat Em, Filter

I was being a bit of a joker when I created this, but the point remains that as larger search engines force feed junk (content mills and vertical search results) down end users' throats, one of the best ways for upstart search engines to compete is to filter that stuff out. Both DuckDuckGo and Blekko have done just that.

The Future of (In)Organic Content Farming

Demand Media is currently worth $1.74 billion, but it remains to be seen what happens to the efficacy of the content farm business model if & when Google makes promised changes. And given that Yahoo! is Bing's biggest source of search distribution, it would be hard for Microsoft to crack down too hard without potentially harming that relationship (since Yahoo! owns Associated Content), but Google & Microsoft are the only game in town with search ads. The DOJ already blocked Yahoo!-Google. Trying to win marketshare from Google, Microsoft is burning over $2 billion a year.

Search as a Wedge to Influence & Corrupt Other Markets

Search can be used as a wedge in a variety of ways. Most are perhaps poorly understood by the media and market regulators.

Woot! Check Out Our Bundling Discounts

When Google Checkout rolled out, it was free. Not only was it free, but it came with a badge that appears near AdWords ads to make the ads stand out. That boosts ad clickthrough rates, which feeds into ad quality score & acts as a discount for advertisers who used Google Checkout. If you did not use Google's bundled services you were stuck paying above market rates to compete with those who accepted Google's bundling discounts.

And there is no conspiracy theory to the above bundling. Here is a video on how quality score works & this Google Checkout page states it plain as day.
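To see why a quality score bump works like a discount, here is a back-of-the-envelope sketch. It assumes the commonly cited simplified auction model (ad rank = bid × quality score, and you pay just enough to beat the ad rank of the advertiser below you, plus a cent). The real system has more inputs, and the bids, quality scores, and the size of the lift from the badge are hypothetical numbers I picked for illustration.

```python
def price_to_hold_position(next_ad_rank, quality_score):
    """Simplified model: pay just enough to beat the ad rank of the advertiser below."""
    return round(next_ad_rank / quality_score + 0.01, 2)


# Hypothetical numbers: the competitor below has a $2.00 bid and a quality score of 5.
next_ad_rank = 2.00 * 5

without_badge = price_to_hold_position(next_ad_rank, quality_score=5)
with_badge = price_to_hold_position(next_ad_rank, quality_score=7)  # assumed lift from the badge

print(f"cost per click without badge: ${without_badge:.2f}")
print(f"cost per click with badge:    ${with_badge:.2f}")
print(f"effective discount:           {1 - with_badge / without_badge:.0%}")
```

Same position, same competitor, lower price per click for the advertiser with the higher clickthrough rate. Anyone not using the bundled service is the one paying the higher figure.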

And all the while Google was doing the above bundling (as they still are to this day) they were also lobbying in Australia about Paypal's dominant market position. eBay (which owns Paypal) is one of Google's 5 largest advertisers.

This Brand is Your Brand, This Brand is My Brand

Companies spend billions of Dollars every year building their trademarked brands. But if they don't pay Google for existing brand equity then Google sells access to that stream of branded traffic to competitors, even though internal Google studies have shown it causes confusion in the marketplace.

The Right to Copy

Copyright protects the value of content. To increase the cost of maintaining that value, DoubleClick and AdSense fund a lot of copy and paste publishing, even of the automated variety. Sure you can hide your content behind a paywall, but if Google is paying people to steal it and wrap it in ads how do you have legal recourse if those people live in a country which doesn't respect copyright?

You can see how LOOSE Google's AdSense standards are when it comes to things like copyright and trademarks by searching for something like "bulk PageRank checker" and seeing how many sites that violate Google's TOS multiple ways are built on cybersquatted domain names that contain the word "PageRank" in them. There are also sites dedicated to turning Youtube videos into MP3's which are monetized via AdSense.

Universal Youtube Search

Google bought Youtube and then swiftly rolled out universal search, which dramatically increased the exposure of Youtube. Only recent heat & regulatory review has caused Google to add more prominent links to competing services, nearly a half-decade later.

A Unit of Obscurity

Knol was pushed as a way to revolutionize how people share information online. But it went nowhere. Why? Google got caught with their hand in the cookie jar, so they couldn't force the market to eat it.

Scrape You Very Much

Philosophically Google believes in (and delivers regular sermons about) an open web where companies should compete on the merit of their products. And yet when Google enters a new vertical they *require* you to let them use your content against you. If you want to opt out of competing against yourself Google say that is fine, but the only way they will allow you to opt out is if you block them from indexing your content & kill your search traffic.

“Google has also advised that if we want to stop content from appearing on Google Places we would have to reduce/stop Google’s ability to scan the TripAdvisor site,” said Kaufer. “Needless to say, this would have a significant impact on TripAdvisor’s ranking on natural search queries through Google and, as such, we are not blocking Google from scanning our site.”

From a public relations standpoint & a legal perspective I don't think it is a good idea for Google to deliver all-or-nothing ultimatums. Ultimately that could cause people in positions of power to view their acts as a collection which has to be justified on the whole, rather than on an individual basis.

Lucky for publishers, technology does allow them to skirt Google's suggestions. If I ran an industry-leading review site and wanted to opt out of Google's all-or-nothing scrape job scam, my approach would be to selectively post certain types of content. Some of it would be behind a registration wall, some of it would be publicly accessible in iframes, and maybe just a sliver of it would be fully accessible to Google. That way Google indexes your site (and you still rank for the core industry keywords), but they can't scrape the parts you don't want them to. Of course that means losing out on some longtail search traffic (as the hidden content is invisible to search engines), but it is better than the alternatives of killing all search traffic or giving away the farm.
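One way to picture that split is with robots.txt rules that leave the category pages crawlable while blocking the URLs the iframes load from. The sketch below is only an illustration under my own assumptions: the `/review-frames/` and `/members/` paths and the example.com domain are hypothetical, and it uses Python's standard `urllib.robotparser` to check what a well-behaved crawler would be allowed to fetch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: category pages stay crawlable, while review bodies are
# served from /review-frames/ (loaded via iframes) and member pages sit behind login.
ROBOTS_TXT = """\
User-agent: *
Disallow: /review-frames/
Disallow: /members/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in ("/hotels/paris/", "/review-frames/12345", "/members/profile"):
    allowed = parser.can_fetch("Googlebot", "http://example.com" + path)
    print(f"{path:<24} {'crawlable' if allowed else 'blocked'}")
```

The pages that rank for the core keywords stay in the index, while the review text that lives under the blocked paths never gets handed over. Keep in mind robots.txt only restrains crawlers that choose to honor it, so the registration wall is still the stronger lock.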

Google Gearing Up for Relevancy Changes

Jan 22nd

Over the past year or 2 there have been lots of changes with Google pushing vertical integration, but outside of localization and verticalization, core relevancy algorithms (especially in terms of spam fighting) haven't changed too much recently. There have been a few tricky bits, but when you consider how much more powerful Google has grown, their approach to core search hasn't been as adversarial as it was a few years back (outside of pushing more self promotion).

There has been some speculation as to why Google has toned down their manual intervention, including:

  • anti-trust concerns as Google steps up vertically driven self-promotion (and an endless well of funding for anyone with complaints, courtesy Microsoft)
  • a desire to create more automated solutions as the web scales up
  • spending significant resources fighting site hacking (the "bigger fish to fry" theory)

Matt Cutts recently made a blog post on the official Google blog, which highlighted that indeed #3 was a big issue:

As we’ve increased both our size and freshness in recent months, we’ve naturally indexed a lot of good content and some spam as well. To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual web pages, e.g., repeated spammy words—the sort of phrases you tend to see in junky, automated, self-promoting blog comments. We’ve also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content.

It sounds like Google was mainly focused on fighting hacked sites and auto-generated & copied content. And now that hacked *GOVERNMENT* websites are available for purchase for a few hundred Dollars (and perhaps millions in personal risk when a government comes after you) it seems like Google's pushing toward fighting off site hacking was a smart move! Further, there are a wide array of start ups built around leveraging the "domain authority" bias in Google's algorithm, which certainly means that looking more at page by page metrics was a needed strategy to evolve relevancy. And with page-by-page metrics it will allow Google to filter out the cruddy parts of good sites without killing off the whole site.

As Google has tackled many of the hard core auto-generated spam issues it allows them to ramp up their focus on more vanilla spam. Due to a rash of complaints (typically from web publishers & SEO folks) content mills are now a front and center issue:

As “pure webspam” has decreased over time, attention has shifted instead to “content farms,” which are sites with shallow or low-quality content. In 2010, we launched two major algorithmic changes focused on low-quality sites. Nonetheless, we hear the feedback from the web loud and clear: people are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content. We take pride in Google search and strive to make each and every search perfect. The fact is that we’re not perfect, and combined with users’ skyrocketing expectations of Google, these imperfections get magnified in perception.

Demand Media (DMD) is set to go public next week, and Richard Rosenblatt has a long history of timing market tops (see iMall or MySpace).

But what sort of sites are the content mills that Google is going to ramp up action on?

The tricky part with vanilla spam is the subjective nature of it. End users (particularly those who are not web publishers & online advertisers) might not complain much about sites like eHow because they are aesthetically pleasing & well formatted for easy consumption. The content might be at a low level, but maybe Google is willing to let a few of the bigger players slide. And there is a lot of poorly formatted expert content which end users would view worse than eHow, simply because it is not formatted for online consumption.

If you recall the Mayday update, Richard Rosenblatt said that increased their web traffic. And Google's October 22nd algorithm change last year saw many smaller websites careen into oblivion, only to re-appear on November 9th. That update did not particularly harm sites like eHow.

However, in a Hacker News thread about Matt's recent blog post he did state that they have taken action against Mahalo: "Google has taken action on Mahalo before and has removed plenty of pages from Mahalo that violated our guidelines in the past. Just because we tend not to discuss specific companies doesn't mean that we've given them any sort of free pass."

My guess is that sites that took a swan dive in the October 23rd timeframe might expect to fall off the cliff once more. Where subjective search relevancy gets hard is that issues rise and fall like ocean waves crashing ashore. Issues that get fixed eventually create opportunities for other problems to fester. And after an issue has been fixed long enough it becomes a non-issue to the point of being a promoted best practice, at least for a while.

Anyone who sees opportunity as permanently disappearing from search is looking at a half-empty glass, rather than seeing that opportunities which died are reborn again and again.

That said, I view Matt's blog post as a bit of a warning shot. What types of sites do you think he is coming after? What types of sites do you see benefiting from such changes? Discuss. :)

Google's Missing Disclosure

Dec 15th

Netflix's Risky Position

One of the fundamental keys to monetizing third party content is finding a way to do it while keeping your earnings data abstract. A huge problem that hits pure plays like Netflix is that as soon as companies see the profits the cost structures change.

  • Comcast is looking to get some funds from Level 3 (for distribution of Netflix content), and
  • Partners who license video content to Netflix want a bigger piece of the action as well: "Now many of the companies that make the shows and movies that Netflix delivers to mailboxes, computer screens and televisions — companies whose stocks have not enjoyed the same frothy rise, and whose chief executives have not won the same accolades — are pushing back, arguing that the company is overhyped, and vowing to charge much more to license their content."

Making big money on someone else's content makes the content owner look stupid. As soon as you let big media know you are making money on their content they get pissed and feel they rightfully earned that money. As they sense a shift in power any edge cases become the standard against which all other deals are compared.

How Youtube Differs From Netflix

If you study Google & listen to their quarterly conference calls you will always come away with the following: YouTube is operating at an amazing scale, Youtube's growth is accelerating, and YouTube might not be profitable. In the most recent quarterly call Google highlighted that their display network was a $2.5 billion business, but we never hear specific revenue or cost stats from YouTube. Hiding that business within the larger Google enterprise allows Google to print money and gain leverage without evoking the wrath of big media.

Sure there is the Viacom lawsuit, but Youtube streams over 2 billion videos a day with roughly 1 in 7 of those views being monetized. The growth trends keep accelerating, with revenues more than doubling each year, but Google doesn't have to deal with the Netflix issue of margin collapse from partners - because they don't break out profits.

Legislating Profits

Many large scammy/criminal corporations (like the too big to fail bankers & the huge pharma companies) have their 'profits' legislated, even as they destroy the economy. Their political kickbacks to politicians are so strong that in spite of committing multiple repeated felonies, they have enough political sway that third parties create scammy non-profits promoting these brigands to win political favor.

Google claims they are not dominant, but they do not sit in an area where they can legislate their own profits. So they must operate in the gray area elsewhere to sustain and grow their profits.

Alternate Paths to Endless Cash

Cashing Out Brand Equity

Have a trademark? Are you not buying your own brand? Don't worry, a competitor will. Prior Google research (and Google sales material) has shown consumer confusion from some of these activities.

But Google has a great legal team & has managed to grow profits by forcing brands to buy their own existing brand equity, even if it adds 0 revenues & significant costs for the advertiser.

Cloaking + DRM = Win

Remember how Google doesn't like cloaking? But they will DRM manage your media for you & if someone views it outside of the appropriate area they will get a "screw you" page, like so:

(If you are from the US you can see how content is cloaked in various countries by using web proxies or VPN services.)

Copyright is for Suckers

Is Google a more authoritative book seller than Barnes & Noble? Other than lying & taking a few legal shortcuts, what puts Google in a superior position as a book seller?

At least their (lack of) respect for copyright is consistent.

You Need to Disclose, but Google Does NOT

Remember back when Google claimed that anyone buying or selling links needed to do it in a way that is both machine readable & human readable? Well, Google invested in VigLink, which is certainly 100% counter to that spirit. Further, consider Google's recent hard coding of ebook promotions in their search results. There is no ad label in a machine readable or human readable format, but they mix it right in their 'organic' search results.

Remember how paid links were bad?

"Search engine guidelines require machine-readable disclosure of paid links in the same way that consumers online and offline appreciate disclosure of paid relationships (for example, a full-page newspaper ad may be headed by the word 'Advertisement')" - Google.

If you do the same thing Google does, then you are violating their guidelines. Sorta hard to compete with them while staying inside their guidelines then, eh?

If Google expects you to label your paid ads in machine and human readable ways, then why are they fine with blending their ads directly into the organic search results with no disclaimer? Do they actually believe that manipulating end users (to promote their own business deals) is less evil than potentially manipulating a search tool?

The absurdity reminds me of a quote from You Are Not a Gadget:

If you want to know what's really going on in a society or ideology, follow the money. If money is flowing to advertising instead of musicians, journalists, and artists, then a society is more concerned with manipulation than truth or beauty. If the content is worthless, then people will start to become empty-headed and contentless.

The combination of hive mind and advertising has resulted in a new kind of social contract. The basic idea of this contract is that authors, journalists, musicians, and artists are encouraged to treat the fruits of their intellects and imaginations as fragments to be given without pay to the hive mind. Reciprocity takes the form of self-promotion. Culture is to become precisely nothing but advertising.

Google Launches MILLIONS of Doorway Pages

Dec 13th

I mentioned this in our last post but it probably deserves a post of its own. ;)

Google has long claimed that search results inside search results are a poor user experience. They also claim their use of your content is fair use because it is only for ranking and distribution purposes.

Take a look at Google's deskbar subdomain. Google has created MILLIONS of pages on this subdomain:

These pages ARE ranking in the search results:

Google's quest to become the web is leading them to produce a lot of half done products (is eHow's content written at a higher level than Matt Cutts writes?) & an increasing variety of bugs. These of course create opportunity for some folks, but a whole lot of pain for many folks who have done nothing wrong other than trusting Google to be competent & fair.

I understand ready, fire, aim on beta tests or things for start ups, but should Google be doing this sort of silliness with a search service millions depend on?

So much depends on their originality algorithms determining the true source of content on the internet; the moment bugs like this appear, that trustworthiness is tarnished, and the people who poured blood, sweat, and tears into a product can be wiped out with the flip of a deskbar.google.com launch.
