Here is a fun webmaster help video from March 10th of 2010, answering the following question:
"If Google crawls 1,000 pages/day, Googlebot crawling many dupe content pages may slow down indexing of a large site. In that scenario, do you recommend blocking dupes using robots.txt or is using META ROBOTS NOINDEX,NOFOLLOW a better alternative?"
The answer kinda jumps around a bit, but here is a quote:
I believe if you were to talk to our crawl and index team, they would normally say "look, let us crawl all the content, we'll figure out what parts of the site are dupe (so which sub-tree are dupes) and we'll combine that together.
Whereas if you block something with robots.txt we can't ever crawl it, so we can't ever see that it's a dupe. And then you can have the full page coming up, and then sometimes you'll see these uncrawled URLs where we saw the URL but we weren't able to crawl them and see that it's a dupe.
I would really try to let Google crawl the pages & see if we can figure out the dupes on our own.
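The mechanics behind that quote can be sketched in a few lines of Python. This is only an illustration of the general idea, not how Googlebot actually works: the robots.txt rules, the `/print/` path, the example URLs and the `crawler_view` helper are all made up. It shows why a META ROBOTS NOINDEX tag on a page blocked by robots.txt is never seen: a well-behaved crawler stops at the robots.txt check and never fetches the HTML.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the /print/ path is made up for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


class RobotsMetaParser(HTMLParser):
    """Records whether a page carries a meta robots tag containing noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        is_robots = attr.get("name", "").lower() == "robots"
        if is_robots and "noindex" in attr.get("content", "").lower():
            self.noindex = True


def crawler_view(url, html, user_agent="Googlebot"):
    """Return what a polite crawler could learn about the page at `url`."""
    rules = RobotFileParser()
    rules.parse(ROBOTS_TXT.splitlines())
    if not rules.can_fetch(user_agent, url):
        # Blocked by robots.txt: the HTML is never fetched, so neither
        # duplicate content nor a noindex tag can be detected.
        return "uncrawled"
    page = RobotsMetaParser()
    page.feed(html)
    return "noindex" if page.noindex else "crawled"
```

In this sketch a blocked URL always comes back as "uncrawled", no matter what the page itself says, which is the "uncrawled URLs" case from the quote.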
Trust in GoogleBot
The key point here is that before you consider discarding any of your waste you should give GoogleBot a chance to see if they can just figure it out on their end. Then, without updating said advice, Google rolled out the Panda update & torched tens of thousands of webmasters for following what was up to then a Google best practice. Only after months of significant pain did Google formally suggest on their blog that you should now block them from indexing such low value pages.
Matt's video also suggested some of the other workaround options webmasters could use (like re-architecting their site or using parameter handling in Webmaster Tools), but made it sound like Google getting it right by default was anything but an anomaly. What such advice didn't take into account was the future.
What Does a Search Engineer Do?
The problem with Google is that no matter what they trust, it gets abused. Which is why they keep trying to fold more signals into search & why they are willing to make drastic changes that often seem both arbitrary & unjust.
Search engineers are well skilled at public relations. A big part of what search engineers do is managing the market through FUD. If you can get someone else to do your work for you for free then that is way more profitable than trying to sort everything out on your end.
Search engineers are great at writing code. A lot of what the search engineers do is reactionary. Some things get out of control and are so obvious that FUD won't work, so they need to stomp on them with new algorithms. Most search engine signals are created through tracking people, so they usually follow people. Even when it seems like they are trying to change the game drastically, a lot of that data still comes from following people.
What to Do as an SEO?
The ignorant SEO waits until they are told by Google to do something & starts following "best practices" after most of the potential profits have been commoditized, both by algorithmic changes & a market that has become less receptive to a marketing approach which has since lost its novelty.
The *really* ignorant SEO only listens to official Google advice & trusts some of the older advice even after it has become both stale & inaccurate. As recently as 2 years ago I saw a published author in the SEO space handing out a tip on Twitter to use the Google toolbar as your primary backlink checking tool. Sad!
The search guidelines are very much a living, breathing document. If search engines are to remain relevant they must change with the web. Those blazing new paths & changing the landscape of internet marketing often operate in ways that are not yet commonplace & thus not yet covered by guidelines that are based on last year's ecosystem. Their individual campaigns fail often, because they are trying something new or different. For any individual marketing campaign the expected outcome is failure. However, the trail blazers generally win the war. Those who follow behind remain in their footprints (unless they operate in less competitive markets).
The savvy SEO is a trail blazer who is pushing & probing to test some of the boundaries. They are equally a person who watches the evolution of the web through the lens of history, attempting to predict where search may lead. If you can predict where search is going you are not as likely to get caught with your pants down as the person who waits around for Google telling them what to do next. It may still happen in some cases, but it is less common & you are more likely to be able to adjust quickly if you are looking at the web through Google's perspective (rather than through the perspective they suggest you use).
Google's Noble Respect for Copyright
Google has a history of challenging the law & building a business through wildcatting in a gray hat/black hat manner.
They repeatedly broke the law with their ebook scanning project. Their ebook store is already open in spite of a judge requiring them to rework their agreements.
They bought Youtube, a den of video piracy & then spent $100 million on legal bills after the fact. When they were competing with Youtube they suggested that they could force copyright holders to pay Google for lost ad revenues if they didn't give Google access to the premium content. :D
They sold ads against trademarks where it was generally viewed as illegal and awaited the court's decisions after the fact.
They tried doing an illegal search tie-up with Yahoo & only withdrew after they were warned that it would be challenged. They later slid through a similar deal with Yahoo Japan that was approved.
They "accidentally" collected personally identifiable information while getting router information & scanning streets (and we later learn via internal emails in court documents how important some of this "accidental" data collection was to them).
They pushed Buzz onto Gmail users and paid the fine.
Google torched UK finance comparison sites for buying links. Then Google bought one of the few they didn't torch (in spite of its spammy links). After getting flamed on an SEO blog they penalized that site, but then it was ranking again 2 weeks later *without* cleaning up any of the spammy links.
When the Panda update torched one of your sites Google AdSense was probably already paying someone else to steal it & outrank you. Google itself scrapes user reviews & then replaces the original source with Google Places pages. The only way to opt out of that Google scrape is to opt out of Google search traffic.
For years Google recommended warez and keygens and serials to searchers, all while building up a stable of over 50,000 advertisers peddling counterfeit goods. That only stopped when the US government applied pressure, and then Google painted themselves as the good guys for fighting piracy.
Google is reportedly about to launch their music service, once again without permission of the copyright holders they are abusing.
Those were examples of how Google interpreted "the guidelines" in modern societies.
Google doesn't wait for permission.
What are you doing right now?
Are you sitting around hoping that GoogleBot sorts everything out?
If so, grab a newspaper & pull out the "help wanted" section. You're going to need it!
If you want to win in Google's ecosystem you must behave like Google does, rather than behaving how they claim to & tell you to.
In October of 2008 Eric Schmidt announced that SEO was about to get really ugly for anyone who doesn't own a brand. He didn't word it that way though. Rather, he stated:
"Brands are how you sort out the cesspool. Brand affinity is clearly hard wired. It is so fundamental to human existence that it's not going away. It must have a genetic component." - Eric Schmidt
In response to that comment (& some of Google's pro-brand algorithmic updates) I created the following video.
Google's Brand Promotion History
Ultimately Google promotes brands for the same reason they promote Wikipedia: it is (generally) safe & easy.
Here is a history of how brand promotion became part of "the algorithm"
In 2003 Google did the infamous Florida update & ever since then they have generally trended toward placing more weight on domain authority (about the only big counter point to this would be Google's recent localization push)
Google's sandbox took it one step further by making it harder for new smaller sites to break through & giving them what amounts to a purgatory period
In 2006 BMW was caught spamming (after years of increased search traffic from spamming) & got a slap on the wrist from Google that lasted a couple of days. Smaller webmasters who were caught doing similar were penalized for far longer periods of time.
In 2005, shortly after announcing rel=nofollow, Google stepped up a campaign promoting FUD against link buying & promoting snitching (when combined with preferential treatment toward brands, this further favored big business at the expense of smaller webmasters)
Over the years Google built increasingly sophisticated algorithmic filters to detect & demote aggressive link strategies (which, when coupled with brand promotion algorithms, made it even harder for small businesses to compete online)
In April of 2007 Google bought DoubleClick, highlighting Google's aspirations to move from demand fulfillment direct marketing ads into the lucrative brand advertising market
When the Google Panda update happened Matt Cutts stated "we actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side." That algorithm allowed doorway pages & scraper sites to rank while killing off lots of smaller legitimate websites.
Sleazy Outing for Self-Promotion
In the latter half of 2010 & the first few months of 2011 Google was getting beat up in the press about content farm spam (created by a combination of loose AdSense standards & Google putting too much weight on domain authority). To help deflect some of the bad press & show "who is boss" Google penalized both J.C. Penney & Overstock.com for using manipulative links.
This past week the folks from Digital Due Diligence tipped off a NYT reporter for another hit piece. A lot of the top flower sites increase their ad budget around their busiest times of year, so coinciding with Mother's Day the New York Times highlighted how sites like ProFlowers, 1800Flowers, Teleflora & FTD were buying seedy links. I won't link to the NYT article because doing so would only promote more sleazy pageview journalism.
A Googler named Jake Hubert was quoted in the above mentioned article as saying the following:
"None of the links shared by The New York Times had a significant impact on our rankings, due to automated systems we have in place to assess the relevance of links. As always, we investigate spam reports and take corrective action where appropriate."
(Even Big Brands) Can't Rank Higher than #1
What is hilarious about that official Google comment is that sometimes Google has whacked websites based on perceived intent rather than results, & when I searched Google those 4 sites owned 6 first page results for that search query (along with the NYT article being listed as a 7th result, and 8th if you count the Google News result).
Google hard coded the algorithm to favor big brands (not once, but twice), promoted the big brands to the top of the search results, watched those brands violate their guidelines (in spite of said promotion) and then claimed that there is no corrective action needed for the violation since they already rank #1.
Well of course the paid links can't further improve a #1 ranking. You can't get any better than first place.
The good news for brands is that Googlers feel the sleazy outing angle is getting tired after J.C. Penney & Overstock.com & Google changed webmaster perception of their results with Panda (by making the common smaller webmaster pay for eHow's sins). Soon reporters won't justify wasting ink or bits on another sleazy SEO outing article because the pageviews won't be there.
At this point it is safe to say that Googlers don't really need to think of brands. All they have to do is search for *any* commercial keyword and click on the first result. The brand takes care of itself. :D
Would you trust the information presented in this article?
Sounds like this is all about promoting perceived authority.
Did you (or anyone you know) trust this outlet to give you a complete worldview at any point in the last couple decades & end up bankrupt, utterly decimated, or destitute because you followed its advice (on say internet stocks or the housing bubble or using excessive credit because "this time may actually be different")?
A lot of news sites are given additional distribution through services like Google News, which start them from a position of authority (because if you go to search to find something & Google promotes their news vertical right away, then sites in that news vertical will rank highly instantly & accrue backlinks from that early exposure). The education system itself is partly a propaganda tool to teach you to trust and obey authority. If the banking crisis taught us nothing else it should have taught us that many authorities are not worthy of our trust as they act in self-interested ways at the expense of the whole.
Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
Mainstream media sites saw a $1 billion lift in annual ad revenue from the Panda update. Most mainstream media articles are *not* written by true subject matter experts, but rather by devout generalists who grab a couple quotes to fill out the shallow piece & make it feel more informed.
A lot of the "official" quotes are from officials who represent industry trade organizations. That means those folks support the interests of folks in that trade, even if/when that trade is working against the interest of the common man.
The problem is, you don't get to see who is a whore until *after* they already ____ed you. See for example David Lereah: "Ahhh, so he admits to being nothing more than a paid shill whose mouth was available for a price. How does that job description vary from the Trannies who hang out by the West Side Highway? In my book, not by very much. A whore is a whore is a whore."
Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
*Cough* Google Video vs Youtube vs Vevo.
Aren't most AP articles by their definition redundant duplication?
How are some of Google's late-to-the-party services like their ebook store or their places pages justified if we seek to minimize redundancy?
Would you be comfortable giving your credit card information to this site?
Some sites aim to sell, while others aim to tell.
If a passionate hobbyist desires to share but isn't selling something (and thus uses a quirky site design or a more personal formatting structure) should they be dinged for putting their passion ahead of getting an unneeded SSL certification & paying firms like TRUSTe, McAfee & VeriSign?
Sites which don't go out of their way to sell you something are more likely to be built on passion.
Does this article have spelling, stylistic, or factual errors?
"Correct spelling, indeed, is one of the arts that are far more esteemed by school ma'ams than by practical men, neck-deep in the heat and agony of the world." - Henry Louis Mencken
Further, the error of omission is one that is constantly made in the mainstream media, which is precisely why you have to read fringe rags like the Rolling Stone to get an honest look at how bankers are robbing the country blind. Of course you will read the same article in the mainstream media in 6 or 7 years, after the statute of limitations runs out. And they will sell it as "new" news, even though the story at that point is nearly a decade old.
Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
I can tell you sure as hell that the auto-generated spam stub pages on the mainstream media sites (driven by services like DayLife or Truveo) which scrape blogs like mine are not driven by passion. You can't program a bot to have "passion."
Does the article provide original content or information, original reporting, original research, or original analysis?
Further, I have had a client featured in a well read trade magazine where they wrote an entire article on the client. They were unwilling to link to the client's site (even though the client was the only source & entire purpose for the article) because they said they felt it would be too promotional. How warped is it that they will do a photo shoot at your house & make you the feature of an article, yet they are afraid to link because that might be seen as being too promotional!
Does the page provide substantial value when compared to other pages in search results?
This is actually a bit of a bait and switch styled topic. Let me explain. In an ideal world every single page would be great.
Most articles individually are failures that do not pay for themselves. It is the rare success that helps carry the failures. You do not know which is which in advance, but you hope that with some level of effort and scale you are marginally profitable out the other end.
This is how literally all forms of publishing work: online, music, movies, books, etc.
In terms of a money loser, take for instance this article. I am already rather well known, have a wide following, spent hours writing that article, and ultimately it garnered 1 comment & 0 inbound links (once you back out scraper sites, automated links, and links with nofollow on them).
Making things worse, you not only compete against others who will copy anything of yours that is successful, but if Google does decide to whack your site with a penalty then a scraper site (which Google paid with AdSense money) that steals your content will outrank you for your own work. How exactly do you provide a unique substantial value add when Google is paying others to steal & republish your work wholesale?
Things like source attribution issues, brand bias, and Google competing against publishers with scraper pages have a very real and significant impact on profit margins. A good sustainable company is generally lucky to have 20% profit margins. When Google introduced their places pages that scraped TripAdvisor Google instantly redirected 10% of TripAdvisor's search traffic.
Ultimately the above issue with content is not down to cost or effort, but if what you are doing is profitable. If it is not, then it is simply unsustainable.
And even when you are profitable, you can count on Google helping others subvert that position.
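To put the 20% margin and 10% traffic figures above side by side, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not from any real company's books: revenue is taken to scale directly with traffic, costs are treated as fixed, and the function name and the round numbers are purely illustrative. Under those assumptions a 10% drop in revenue does not trim profit by 10%; it cuts it in half.

```python
def profit_after_revenue_loss(revenue, margin_pct, lost_revenue_pct):
    """Profit left after a revenue drop, assuming costs stay fixed.

    All inputs are hypothetical; percentages are whole numbers (20 means 20%).
    """
    costs = revenue * (100 - margin_pct) / 100            # fixed cost base
    new_revenue = revenue * (100 - lost_revenue_pct) / 100
    return new_revenue - costs


# Illustrative numbers only: 100 units of revenue at a 20% margin.
profit_before = profit_after_revenue_loss(100, 20, 0)
profit_after = profit_after_revenue_loss(100, 20, 10)
print(profit_before, profit_after)
```

A real site would not lose revenue in exact proportion to lost search traffic, so treat this as a way of seeing why thin margins make even modest traffic shifts painful, not as a forecast.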
On the topic of value add, I have even seen people buying AdSense ads to redistribute 3rd party works, where the only value "add" was lowering the retail price!
How much quality control is done on content?
A lot of the high ranking and much hyped social media networks like MySpace, Friendster, Twitter & Facebook are almost exclusively spam. A couple days ago I deleted over 75% of my Facebook "friends" because I was sick of getting daily email updates about how some dirtbag wanted to promote some autowealth MLM blaster unlimited downstream product on my wall.
That is not to say that everyone I deleted did anything wrong (most of them are likely good people) but there was no opportunity cost to spamming. The spammers who automate drive everything toward the tragedy of the commons. A paywall is perhaps the single best filter for quality, but if you use a paywall expect to deal with a lot of freetard rage & expect Google to pay some folks to steal it.
Google polices the web, but anything goes in their ad programs.
You can see how ridiculous the double standard is by simply considering that Google let their counterfeiting advertisers count grow to 50,000 strong before finally axing them when the US government pressured Google. Bizarrely, Google had the audacity to position themselves as good doers who were cracking down on spammers, when in fact they were taking their own longtime business partners out to the wood shed!
Does the article describe both sides of a story?
Mainstream media sources often like to share "both sides of a story" to seem unbiased. But the truth is that media by its very nature is biased toward the interest of advertisers & away from consumers. See, for example, either Manufacturing Consent or the BGH lawsuit.
Further, some well known corporations (LIKE GOOGLE) blackball media outlets that question them in certain ways. Google would never give exclusives to SEOBook & the sites that they do give exclusives to would lose the relationship if they were as blunt as we are.
Is the site a recognized authority on its topic?
Lots of recognized authorities have conflicting funding sources - something that was well highlighted in early Google research, and has been consistently exposed (years or decades after the fact) in the medical space.
Honesty is more important than authority, but then being bland & honest is not quite as remarkable (or profitable) as putting on a coat of spin.
Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
How would Google's efforts stand up when graded against this suggestion? Why does Google have Google Video, Youtube & Vevo?
Further, most market leaders do have large networks and multiple branded sites for purposes of branding, segmentation, and double dipping in the marketplace. Remember when Bankrate (which already owned Bankrate, Nationwide Card Services, Credit Card Search Engine, Bankaholic, etc.) bought out CreditCardsGuide.com & it got temporarily penalized for the spammy links it had? Well it ranks again & of course since then they have also bought out CreditCards.com. You see this sort of behavior amongst almost any big brand: from Amazon.com to Zappos. (Oh wait, Amazon.com now owns Zappos!)
Was the article edited well, or does it appear sloppy or hastily produced?
A lot of the best content comes from people who are subject matter experts. But those people may have only mastered their subject & may be new to: writing, website design, online publishing, etc.
For a health related query, would you trust information from this site?
Let's put it this way: the media people consume is in part responsible for the current state of health in the US where there is an obesity epidemic. Further, a lot of the leading health authority sites (like WebMD) run special advert sections in their site where it looks just like content but you have to read the small print to see it is an ad.
Going one step further on this front, it is worth mentioning that a number of the large pharmaceutical corporations have repeatedly sold drugs for off label purposes & yet none of their packaging is required to highlight those ill deeds, which have literally killed millions of people.
Would you recognize this site as an authoritative source when mentioned by name?
A lot of "authoritative" sites are simply sites with large ad budgets.
Quick, tell me which company advertises a clever gecko with a British accent. Other than as a mascot (& perhaps alliteration), how relevant is that gecko (or the accent) to their business? Not at all. But they do spend nearly a billion dollars a year on ads.
Does this article provide a complete or comprehensive description of the topic?
Would users complain when they see pages from this site?
I'll complain about something I just saw. ;)
While searching for a link for a blog post I was writing today, the #1 Google result (not voted up by social circle stuff) was a Tweet linking to a HootSuite framed page linking to a music industry site which posts RSS feed content and linked to a BusinessInsider article that referenced the TechCrunch article I was looking for.
If we want to get rid of unneeded duplication & noise then why is Google tying their bonus system to promoting more social media noise? After Amazon.com has done a great job with Kindle why is there a need for Google's ebook marketplace? After Yelp has created a strong community review site (with real editorial expenses) why is there a need for Google Places to scrape & displace its reviews?
If you look at what actually happens in reality (rather than what folks claim to support in their "ideals") it is anarchy. The bankers stole what they could and moved on. The pharmaceutical corporations create fear-driven propaganda about the dangers of drug re-importation, all the while pushing drugs for off label purposes. Google pays people to steal your content, then tells you to suck it up & it is your fault you are not a big brand.
Anarchy is here.
The only difference is that it is dressed up in suits and fancy language, whereas people picture anarchists as pyros wearing ripped jeans & Megadeth shirts.
You see, tricking people is bad. Unless you are Google. In which case you have to hit the quarterly numbers.
Everyone else needs to read Google platitudes, create deep content, and pray to turn the corner before bankruptcy hits.
Matt Cutts stated that you should make your products like Apple products by packaging them nicely.
For illustrative purposes:
It was easy for Google to speak from a moral high ground when their growth was above 50% a year, but now that growth has slowed over the past couple years they have been willing to do things they wouldn't have. In November of 2009 when I saw the following I knew the writing was on the wall.
When Google Instant launched, we got to test Google's 50% content theory. And they hit the numbers perfectly. A full 50% of web users could see 2 organic listings above the fold when instant was extended (the other half of folks could only see one or none).
As if the massive Youtube promotion & the magically shrinking search results for everyone else were not bad enough, with Panda they suck at determining the original content source.
This site you are reading wasn't hit by Panda, which makes us lucky, as it allows us to rank as high as #3 for our own content (while Google pays dozens of other webmasters to snag it wholesale and wrap it in AdSense).
We got lucky though. If we had been hit by Panda (like tens of thousands of other webmasters) we probably wouldn't even rank on the first page of the search results for our own content.
When Google screws up source attribution they are working counter to open culture, because they are having you bear 100% of the cost of content production, and then they are immediately paying someone else for your work. Do that long enough and the quality content disappears & we get a web full of eHow-like sites.
And yet Google tells us the secret recipe (which may or may not work at some unknown time) is to pour more money into content development.
The solution to this problem is more deep content. Keep feeding Google (and their AdSense scraper partners) and hope that after you pour $50,000 into your site some small fraction of it ends up back in your bank account (while the larger share winds up in the accounts of Google & their AdSense partners).
As bad as all that is, I recently got selected as a lucky beta user for the next version of Google's search results. Notice the horizontal spacing that drives down the organic search results. After the top AdWords listings the organic listings start off 88 pixels lower on the screen.
I have a huge monitor. Less than 10% of people have a monitor as large as mine. Before this new search result I saw 8 organic search results above the fold on my large monitor. Now it is down to 5 (and that is with no Google video ad, no Google vertical comparison ad like the above credit card one, no browser toolbars, no browser status bar, and only 1 of the advertisers having ad sitelinks).
So how does Google score now on their ad to content ratio?
When Google's new search results roll out, there are some keywords where less than 1 in 3 searchers will be able to see a single organic listing above the fold! And lest you think that spacing is about improving user experience, notice how wide the spacing in the left column is, and how narrow the right rail AdWords spacing is. This is all about juicing revenues & hitting the number.
Which leads me to the Google Panda loophole I mentioned in the headline. It is an easy (but painful) one-step process.
All of Google's propaganda about the horrors of paid inclusion looks absurd when compared against the search result with 0 organic listings above the fold for half of desktop computer users.
The only "exploit" here is how Google is paying people to steal others' content, then ranking the stolen stuff above the original source.
The #1 goal for any organization is self-preservation. When people feel things are fairly just & they are just getting by they are fine with squeezing out more efficiency in what they do and figuring out ways to pay the bills. But when people feel the table is tilted at some point they stop caring and do whatever it takes.
Ex Post Facto
Some longtime AdWords advertisers have recently been punished for affiliate ads they ran 8 years ago, because some of the sites they promoted at some point fell out of Google's graces. The ad system never allows you to delete your history & applies ex post facto regulations that arbitrarily turn a regular advertiser into a spammer.
In 3 weeks it will have been 3 months since Google first launched Panda. Outside of bloggers with 50,000 RSS subscribers few (if any) reports of recovery from Panda have been seen. Some of the theories floating around what caused Panda attempt to tie it to AdSense & many of Google's AdSense case studies are now highlighting best practices to follow if you want to be just like the sites Google torched.
As if that wasn't conflicting enough, some of the webmasters that were torched by Panda received automated messages that they were missing out on revenues by not using the maximum allotted number of ad units. After the huge fall off from Panda, Google has been pushing AdSense so hard that many webmasters have been receiving unsolicited emails from Google suggesting they sign up for AdSense.
I won't run AdSense on our main sections of this site because it would be tacky and destroy perceived credibility (having a "submit your site to 2000 search engines for $29" ad next to the content doesn't inspire trust on an SEO site). I could create a content farm answers section of the site that mirrors Ask's strategy, but with a higher level of quality. I won't though, because it would be viewed as spam because I am me. Once again, SEOs should be held to a higher standard than search engines. ;)
That Which You Consume, Consumes You
Where this rubs wrong is not only the overt brand push, but also that some of Google's pushes at expansion down the search funnel have looked a lot like the spam they claim to fight.
In the Wall Street Journal there was an article about the Panda update highlighting that many small businesses were laying off their employees. The same article highlighted numerous costly, desperate marketing measures the firms were taking which may or may not work. Google didn't disclose much in the article other than:
The Google spokesman says the company doesn't disclose details about changes it makes to its algorithms because doing so "would give bad actors a way to game our systems."
Nobody likes bad actors, but most of the webmasters that were hit were not bad actors. Rather, most of them were naive & simply followed the Google guidelines thinking that was in their best interests and perhaps would allow them to stay competitive. Unfortunately, it wasn't.
If you adhere to guidelines, get beat down, are not told why, and are told that generally sites need to "improve their quality" that can be a pretty infuriating message. The presumption that your stuff isn't good enough when 3rd grade rewrites of your content now outrank you is both smug and obnoxious. What is worse about the update though now is that many scraper websites are outranking the original content sources, so the message is that your content is plenty good enough, but it is just not good enough when it is on your site. A large portion of those scraper sites are monetized via Google AdSense & would not even exist if it were not for AdSense.
So Google whacks your site, tells you to clean up your act (& increase your operating costs while decreasing your margins), lumps you in the bad actors group, offers no information about when the pain will (or even could) end, pays someone to steal your content, then ranks that stolen copy of your content above you in the search results.
Make Your Move
If a person has the pleasure of experiencing the above, it doesn't take much critical thinking to develop a different perspective on search.
Ultimately this is going to lead to a "why not" approach to search for many folks in the search space.
If Google already dinged your website why wouldn't you remove AdSense & replace it with competing ad programs? Why not test those affiliate programs you have been meaning to test? If you have to rework your content anyway, why not move past AdSense/webmaster welfare?
If your AdWords budget was marginally profitable & you were buying ads to complement your organic exposure, why wouldn't you stop buying ads with Google & test running ads on other websites? Google is fine funding an affiliate network that uses direct links, so why not use clean links on your ad buys? If you like, run it through a self-hosted affiliate program so that you are just like Google.
If your site is already whacked why wouldn't you buy links to help boost its ranking back?
If your site earns nothing from search, why wouldn't you sell links if you have to do whatever it takes to cover costs?
If your site gets penalized & someone copying your content & wrapping it in AdSense outranks you why wouldn't you create new mirror sites? Why wouldn't you create scraper websites to pollute Google with?
If rankings are unpredictable & one site is no longer enough, why wouldn't you create backup sites & projects of various levels of quality & effort? At this point diversity simply serves as a needed form of insurance.
If while running these purely scientific experiments you accidentally run into something that works really well that shouldn't, why not scale it to the moon?
I am not convinced that the search results are any cleaner today than they were a few months ago. However, I am fairly certain things will soon head south. I am not advocating going out of your way to be extra spammy, but am just highlighting the cost-benefit analysis which is going through the heads of thousands of webmasters who Google just torched.
Google is betting that anonymous strangers will behave more kindly than Google has, but when an animal is backed into a corner it often acts in unpredictable (and even uncontrollable) ways.
The big problem for Google is this: "when innocence itself, is brought to the bar and condemned, especially to die, the subject will exclaim, it is immaterial to me whether I behave well or ill, for virtue itself is no security." - John Adams
Like a good neighbor, State Farm is there and there and there and there and there.
The Struggle Real Businesses Face
The big problem with this IMHO is that everyone but the spammer (who is now busy working on "local" signals) loses. Legit online-only pure plays are simply wiped off the result set. The searcher gains nothing by seeing State Farm agents 5 times in the search results. Even the local business which has a new windfall of business is simply overwhelmed with leads, meaning they likely have (at least relatively) poor customer service until they hire up.
To a small business, a sharp rise in demand can be every bit as damaging as a sharp fall in demand.
But should small local businesses hire aggressively, they could be only 1 algorithmic update away from needing to prune staff. Maybe some day Google decides to limit the results to show 1 agent per parent company, and then the agents end up fighting it out with each other (much like affiliates had to fight each other on bids in AdWords to be the 1 that shows up).
Given that some of the agents ranking page 1 have less than a dozen inbound links & links from only a few unique domains, it won't take long for some new "local" players to come online.
What Makes a Search Result Good?
A lot can be said for getting users where they need to be quickly. When it works it has great value. But when it doesn't work, it makes the market less efficient. Value chains exist for a reason. Sometimes a brand (or an individual agent of brand x) is not in the best position to act as an unbiased advisor.
As a consumer buying car insurance, I don't care that my agent is local. In fact, if I live in an expensive area I may want my insurance provided by someone who lives in an area with a lower cost of living so they can provide the services (while making a comfortable living) for less. For the last decade I have been insured by a company in another state (USAA in Texas). Location had precisely 0 impact on my decision making.
What mattered to me was that they had great rates. Which is precisely what almost all insurance commercials promote.
Geico spends nearly a billion Dollars a year pounding that message into the minds of consumers.
The problem is that almost all the big brands promote the exact same message. They are the cheapest. Save with them. Etc. Online pure plays that provide quote comparisons provide a valuable & value-add function in this marketplace, but they have simply disappeared from Google. They aren't local enough to hit the local signal, they aren't brand enough to hit the brand signal, and since they are not the end brands they can't justify buying $30 AdWords clicks thinking that what they don't get back in direct ROI can be written off to "brand."
Ultimately the end user loses (or at least until Google creates their insurance flavor of "comparison ads.")
This Stuff is Everywhere
This stuff is even happening on search queries where there is absolutely no implied local intent & no need for a local provider. General discovery & topical queries like "web designer" or even informational background searches like "SEO" now bring up service based sites with a local presence.
Leaving Off On a Positive Note
1 day doesn't make a trend, but if this stuff sticks, ranking local sites for big keywords just got really easy.
If you know SEO and live near a big city, a second office location might soon be a profitable decision.
If you are a local business who thought SEO was too complex or expensive, that excuse may have just been removed from the marketplace.
If you run a bespoke consulting styled business & ran into a windfall of demand, don't forget to increase your rates & be more selective with who you work with. Working all the time leads to burn out. Trust me, I know that all too well. ;)
This is another example of why it can be a great idea to mix and match your businesses, such that if one jumps out of nowhere or another one tanks you are still fine. Having multiple projects is one of the few ways you can really protect yourself from the likes of Panda & updates like this one. Running multiple businesses allows you to lean into your side gigs when your main one drops off, and push harder on your main gig when it is really humming along.
In the last post I mentioned how the US government tried to change the cost benefit analysis for some sleazy executives at pharmaceutical corporations which continue to operate as criminal enterprises that simply view repeated fines as a calculable cost of doing business.
If you think about what Google's Panda update did, it largely changed the cost-benefit analysis of many online publishing business models. Some will be frozen with fear, others will desperately throw money at folks who may or may not have solutions, while others who gained will buy additional marketshare for pennies on the Dollar.
"We actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side." - Matt Cutts
Now that Google is picking winners and losers the gap between winners & losers rapidly grows as the winners reinvest.
And that word invest is key to understanding the ecosystem.
Beware of Scrapers
To those who are not yet successful with search, the idea of spending a lot of money building on a strategy becomes a bit more risky when you see companies like Demand Media that have spent $100's of millions growing an empire only to see 40% of the market value evaporate in a couple weeks due to a single Google update. There are literally thousands of webmasters furiously filing DMCA reports to Google after Panda, because Google decided that the content quality was fine if it was on a scraper site, but the exact same content lacked quality when on the original source site.
And even some sites that were not hit by Panda (even some which have thousands of inbound links) are still getting outranked by mirroring scrapers. Geordie spent hours sharing tips on how to boost lifetime customer value. For his efforts, Google decided to rank a couple scrapers as the original source & filter out PPCBlog as duplicate content, in spite of one of the scrapers even linking to the source site.
Outstanding work Google! Killer algo :D
Even if the thinking is misguided or the headline is out of context, Reuters articles like "Is SEO DOA as a core marketing strategy?" do nothing to build confidence in making large investments in the search channel. Which only further aids people trying to do it on the cheap. Which gets harder to do as SEO grows more complex. Which only further aids the market for lemons effect.
At the opposite end of the spectrum, there are currently some search results which look like this:
All of the colored boxes are the same company. You need quite a large monitor to get any level of result diversity above the fold. The company that was on the right side of the classifier can keep investing to build a nearly impenetrable moat, while others who fell back will have a hard time justifying the investment. Who wants to scale up on costs while revenues are down & the odds of success are lower? Few will. But the company with the top 3 (or top 6) results is collecting the data, refining their pitch, and re-investing into locking down the market.
Much like the Gini coefficient shows increasing wealth consolidation in the United States, search results where winners and losers are chosen by search engines create a divide where doing x will be very profitable for company A, while doing the exact same thing will be a sure money loser for company B.
Thin Arbitrary Lines in the Sand
The lines between optimization & spam blur as some trusted sites are able to rank a doorway page or a recycled tweet. Once site owners know they are trusted, you can count on them green lighting endless content production.
Scraping the Scrape of the Scrape
Many mainstream media websites have topics subdomains where they use services like DayLife or Truveo to auto-generate a near endless number of "content pages." To appreciate how circular it all is, consider the following:
a reporter makes a minimally informing Tweet
Huffington Post scrapes that 3rd party Tweet and ranks it as a page
I write a blog post about how outrageous that Huffington Post "page" was
SFGate.com has an auto-generated "Huffington Post" topics page (topics.sfgate.com/topics/The_Huffington_Post) which highlighted my blog post
some of the newspaper scraper pages rank in the search results for keywords
sites like Mahalo scrape the scrape of the scrape
At some point in some such loops I am pretty certain the loops start feeding back into themselves & create a near-infinite cycle :D
An Endless Sea of "Trustworthy" Content
The OPA mentioned a billion dollar shift in revenues which favors large newspapers. But those "pure" old-school media sites now use services like DayLife or Truveo to auto-generate content pages. And it is fine when they do it.
The newspapers call others scammy agents of piracy and copyright violators for doing far less at lower scale, all while wanting to still be ranked highly (even while putting their own original content behind a paywall), and then go out and do the exact same scraping that they complain about others doing. It is the tragedy of the commons played out on an infinite web where the cost of an additional page is under a cent & everyone is farming for attention.
And the piece of pie everyone is farming for is shrinking as:
competition increases faster than the growth of the market
Aware that consumers spend someplace between eight and 10 hours researching cars before they contact a dealer, auto markers and dealers are vectoring ever-greater portions of their marketing budgets into intercepting consumers online.
As but one example, Ford is so keen about capturing online tire-kickers that its website gives side-by-side comparisons between its Fiesta and competing brands. While you are on the Ford site, you can price the car of your dreams, investigate financing options, estimate your payment, view local dealer inventories and request a quote from a dealer.
Search Ads Replacing the Organic Search Results
AdWords is eating up more of the value chain by pushing big brands
comparison ads = same brands that were in AdWords appearing again
bigger adwords ads with more extensions = less diversity above the fold
additional adwords ad formats (like product ads) = less diversity (most of the advertisers who first tried it were big box stores, and since it is priced on a CPA profit share basis the biggest brands that typically have more pricing power with manufacturers win)
Other search services like Ask.com and Yahoo! Search are even more aggressive with nepotistic self promotion.
Small Businesses Walking a Tightrope (or, the Plank)
Not only are big brands being propped up with larger ad units (and algorithmically promoted in the organic search results) but the unstable nature of Google's results further favors big business at the expense of small businesses via the following:
more verticals & more ad formats = show the same sources multiple times over
less stability = more opportunities for spammers (they typically have high margins & lots of test projects in the works...when one site drops another one is ready to pop into the game...really easy for scrapers to do...just grab content & wait for the original source to be penalized, or scrape from a source which is already penalized)
less stability = lowers multiples on site sales, making it easier for folks like WebMD, Quinstreet, BankRate, and Monster.com to buy out secondary & tertiary competing sites
If you are a small business primarily driven by organic search you either need to have a big brand, a big ego, big balls, or a lack of common sense to stay in the market in the years to come, as the market keeps getting consolidated. ;)
Google ignored our page title, ignored our on-page header, and then used the 'comments' count as the lead in the clickable link. Then they followed it with the site's homepage page title. The problem here is that if the eye is scanning the results for a discriminating factor to re-locate a vital piece of information, there is no discriminating factor; nothing memorable stands out. Luckily we are not using breadcrumbs & that post at least had a somewhat memorable page URL, otherwise I would not have been able to find it.
For what it is worth, the search I was doing didn't have the word "comments" in it & Google just flat out missed on this one. Given that some huge % of the web's pages have the word "comments" on them (according to the number of search results returned for "comments" it is about 1/6th as popular online as the word "the") one might think that they could have programmed their page title modification feature to never select 'comments' as the lead.
Google has also sometimes been using link anchor text with this new feature, so it may be a brutal way to Google-bomb someone. It is sure to be fun when the political bloggers give it a play. ;)
But just like the relevancy algorithms these days, it seems like this is one more feature where Google ships & then leaves it up to the SEOs to tell them what they did wrong. ;)
You can learn a lot about how search has improved over the years by reading Matt Cutts. Recently he highlighted how search was irrelevant in the past due to a lack of diversity:
Seven of the top 10 results all came from one domain, and the urls look a little… well, let’s say fishy. In 1999 and early 2000, search engines would often return 50 results from the same domain in the search results. One nice change that Google introduced in February 2000 was “host crowding,” which only showed two results from each hostname. ... Suddenly, Google’s search results were much cleaner and more diverse! It was a really nice win–we even got email fan letters.
Thanks to those kinds of improvements, in 2011 we never have to look at search results like this.*
* And by never, I mean, unless the results are linking to fraternal Google pages, in which case, game on!
Why should Google result crowding not apply to Google.com? Sure they can say those books are from different authors, but many websites are run by organizations with multiple authors. Some websites are even built through the partnerships of multiple different business organizations. Who knows, maybe some searchers are uncomfortable with every other listing being an out of context book highlight.
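For anyone who has not thought about what "host crowding" means mechanically, here is a rough sketch of the idea as described in the quote above: walk the ranked results in order and keep no more than two per hostname. This is only an illustration of the concept, not Google's actual implementation; the function name `host_crowd`, the `max_per_host` parameter, and the example.com URLs are all made up for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_crowd(ranked_urls, max_per_host=2):
    """Keep at most max_per_host results per hostname, preserving rank order.

    Hypothetical helper: a toy version of the "two results from each
    hostname" rule, not how any real search engine implements it.
    """
    seen = defaultdict(int)  # hostname -> number of results kept so far
    kept = []
    for url in ranked_urls:
        host = urlsplit(url).hostname or ""
        if seen[host] < max_per_host:
            seen[host] += 1
            kept.append(url)
    return kept


# Made-up ranked result list where one host dominates.
results = [
    "http://books.example.com/a",
    "http://books.example.com/b",
    "http://books.example.com/c",
    "http://other.example.org/page",
    "http://books.example.com/d",
    "http://third.example.net/",
]

crowded = host_crowd(results)
for url in crowded:
    print(url)
```

Applied evenly, a rule like this would cut the one dominant host in the sample list down to two listings, no matter who owns it.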
In the past I have been called cynical for highlighting stuff like the following image
I saw it as part of a trend toward home cooking promotions. And I still view it that way. The above books promotion is simply further proof of concept.
other Google owned and operated sites
a branded website ranking for its own brand
Can you show me *any* occurrence of a result where a site is listed 5 times in the search result? Bonus points if you can find it where the 5 times are not grouped into 1 bunch via result crowding.
As a thought experiment, ask yourself if that Google ranking accident would happen if the content archive being served up was promoting media hosted on Microsoft servers.
A friend of mine summed it up nicely with:
well, it's not everyday you see that kind of power and the fact that other sites aren't afforded the same opportunity makes me think that they are being anti-competitive. Google literally wrote the book (ok scraped it) on anti-competitive practices.
If you live outside the United States and were unscathed by the Panda Update, a world of hurt may await soon. Or you may be in for a pleasant surprise. It is hard to say where the chips may fall for you without looking.
Due to Google having multiple algorithms running right now, you can get a peek at the types of sites that were hit, and if your site is in English you can see if it would have been hit by comparing your Google.com rankings in the United States versus in foreign markets by using the Google AdWords ad preview tool.
In most foreign markets Google is not likely to be as aggressive with this type of algorithm as they are in the United States (because foreign ad markets are less liquid and there is less of a critical mass of content in some foreign markets), but I would be willing to bet that Google will be pretty aggressive with it in the UK when it rolls out.
The keywords where you will see the most significant ranking changes will be those where there is a lot of competition, as keywords with less competition generally do not have as many sites to replace them when they are whacked (since there were fewer people competing for the keyword). Another way to get a glimpse of the aggregate data is to look at your Google Analytics search traffic from the US and see how it has changed relative to seasonal norms. Here is a "look out below" example, highlighting how Google traffic dropped. ;)
What is worse is that on most impacted sites revenue declined faster than traffic, because search traffic monetizes so well & the US ad market is so much deeper than most foreign markets. Thus a site that had 50% profit margins might have just gone to break even or losing money after this update. :D
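To make that arithmetic concrete, here is a quick back-of-the-envelope sketch. Every number in it is hypothetical (the visit counts, the revenue per visit, and the fixed cost base are invented for illustration), but it shows how losing your best-monetizing visitors can take a 50% margin to break even while total traffic falls by less than a third.

```python
# All figures below are hypothetical, chosen only to illustrate the point.
# Money is tracked in whole cents so the arithmetic stays exact.
US_SEARCH_CENTS_PER_VISIT = 160  # assumed: US search visitors monetize best
OTHER_CENTS_PER_VISIT = 40       # assumed: all other traffic earns far less
FIXED_COSTS_CENTS = 5_000_000    # assumed: costs do not shrink with traffic


def revenue_cents(us_search_visits, other_visits):
    """Total revenue for a given traffic mix."""
    return (us_search_visits * US_SEARCH_CENTS_PER_VISIT
            + other_visits * OTHER_CENTS_PER_VISIT)


def profit_margin(revenue):
    """Profit as a fraction of revenue."""
    return (revenue - FIXED_COSTS_CENTS) / revenue


# Before the update: 50,000 US search visits plus 50,000 other visits.
visits_before = 50_000 + 50_000
rev_before = revenue_cents(50_000, 50_000)

# After the update: US search visits fall to 18,750; other traffic is flat.
visits_after = 18_750 + 50_000
rev_after = revenue_cents(18_750, 50_000)

traffic_drop = 1 - visits_after / visits_before
revenue_drop = 1 - rev_after / rev_before

print(f"traffic drop:  {traffic_drop:.2%}")
print(f"revenue drop:  {revenue_drop:.2%}")
print(f"margin before: {profit_margin(rev_before):.0%}")
print(f"margin after:  {profit_margin(rev_after):.0%}")
```

The exact split will differ for every site, but the shape is the same: when the traffic you lose is the traffic that pays the bills, revenue falls faster than the visitor count suggests.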
When Google updates the US content farmer algorithm again (likely soon, since it has already been over a month since the update happened) it will likely roll out across other large global markets, because Google does not like running (and maintaining) 2 sets of ranking algorithms for an extended period of time, as doing so is more cost intensive and helps people reverse engineer the algorithm.