Identity vs Irrelevance

“Within search results, information tied to verified online profiles will be ranked higher than content without such verification, which will result in most users naturally clicking on the top (verified) results. The true cost of remaining anonymous, then, might be irrelevance.” - Eric Schmidt

Authoritarian Regimes & Data Mining

One wonders how Mr. Schmidt can reconcile the above statement with his warnings about authoritarian governments.

And the risks from such data mining operations are not just in "those countries over there." The ad networks that hire lobbyists to change foreign privacy laws do so such that they can better track people the globe over and deliver higher paying ads. (No problem so long as they don't catch you on a day you are down and push ads for a mind numbing psychotropic drug with suicidal or homicidal side effects.)

And defense contractors are fast following with mining these social networks. (No problem so long as your name doesn't match someone else's that is on some terrorist list or such.)

Large & Anonymous

What's crazy is when we get to the other end of the spectrum. Want to know if your hamburger has pink slime in it? Best of luck with that.

Then you get the mainstream media sites that get a free pass (size = trust) and it doesn't matter if their content is created through...

  • a syndicated partnership with eHow-styled content (Demand Media)
  • a syndicated partnership of scraped/compiled data (FindTheBest)
  • auto-generated content from a bot (Narrative Science)
  • scrape + outsourcing + plagiarism + fake bylines (Journatic)
  • top 10 ways to regurgitate top 10 lists from 10 different angles (BuzzFeed)
  • hatchet job that was written before manufacturing the "conforming" experience (example)
  • factually incorrect hate bait irrelevant article with no author name, wrapped in ads for get rich quick scams (example)

... no matter how it is created, it is fine, so long as you have political influence. Not only will it rank, but it will be given a ranking boost based on being part of a large site, even if it is carpet bombed with irrelevant ads.

Coin Operated Ideals

But then the companies that claim this transparency is vital for society pull a George Costanza & "Do The Opposite" with their own approach.

Whenever they manipulate markets to their own benefit they claim the need for secrecy to stop spammers or protect privacy. But then they collect the same data & pass it along without consent to those who pay for the data.

When Google was caught vandalizing OpenStreetMap or lying to businesses listed in Mocality, those were the acts of anonymous contractors. When Google got caught in a sting operation pushing ads for illegal steroids from Mexico, they claimed that behavior didn't reflect their current policies and that we need to move on.

Then of course there are the half dozen (or more) times that Google has violated their own search quality guidelines. So often that is due yet again to "outsourcing" or a partner of some sort. And they do that in spite of the ability to arbitrarily hardcode themselves in the result set.

If we don't examine the faux ideals being pushed to shift cultural norms, we will end up with a crappier world to live in. Some Googlers (or Google fanbois) who read this will claim I am a broken record stuck in the past on this stuff. But those same people will be surprised x years down the road when something bizarre surfaces from an old deranged contact or prior life.

Anyone who has done anything meaningful has also done some things that are idiotic.

Is that sort of stuff always forever relevant or does it make sense at some point to move on?

When that person is Eric Schmidt, the people he pontificates to are blackballed for following his ideals.

After all, his ideals don't actually apply to him.

No Effort Longtail SEO Revenues, from FindTheBest

In our infographic about the sausage factory that is online journalism, we had a throw away line about how companies were partnering with FindTheBest to auto-generate subdomains full of recycled content. Apparently, a person named Brandon who claims to work for FindTheBest didn't think our information was accurate:

Hi Aaron,
My name is Brandon. I have been with FindTheBest since 2010 (right after our launch), and I am really bummed you posted this Infographic without reaching out to our team. We don't scrape data. We have a 40 person+ product team that works very closely with manufacturers, companies, and professionals to create useful information in a free and fair playing field. We some times use whole government databases, but it takes hundreds-of-thousands of hours to produce this content. We have a product manager that owns up to all the content in their vertical and takes the creation and maintenance very seriously. If you have any questions for them about how a piece of content was created, you should go to our team page and shoot them a email. Users can edit almost any listing, and we spend a ton of time approving or rejecting those edits. We do work with large publishers (something I am really proud of), but we certainly do not publish the same exact content. We allow the publishers to customize and edit the data presentation (look, style, feel) but since the majority of the content we produce is the factual data, it probably does look a little similar. Should we change the data? Should we not share our awesome content with as many users as possible? Not sure I can trust the rest of your "facts", but great graphics!

I thought it was only fair that we aired his view on the main blog.

...but then that got me into doing a bit of research about FindTheBest...

In the past when searching for an issue related to our TV I saw a SERP that looked like this

Those mashed sites were subdomains on trusted sites like VentureBeat & TechCrunch.

Graphically the comparison pages appear appealing, but how strong is the editorial?

How does Find The Best describe their offering?

In a VentureBeat post (a FindTheBest content syndication partner) FTB's CEO Kevin O’Connor was quoted as saying: “‘Human’ is dirty — it’s not scalable.”

Hmm. Is that a counter view to the above claimed 40 person editorial research team? Let's dig in.

Looking at the top listed categories on the homepage of Find The Best I counted 497 different verticals. So at 40 people on the editorial team that would mean that each person managed a dozen different verticals (if one doesn't count all the outreach and partnership building as part of editorial & one ignores the parallel sites for death records, grave locations, find the coupons, find the company & find the listing).

Google shows that they have indexed 35,000,000 pages from FindTheBest.com, so this would mean each employee has "curated" about 800,000 pages (which is at least 200,000 pages a year over the past 4 years). Assuming they work 200 days a year that means they ensure curation of at least 1,000 "high quality" pages per day (and this is just the stuff in Google's index on the main site...not including the stuff that is yet to be indexed, stuff indexed on 3rd party websites, or stuff indexed on FindTheCompanies.com, FindTheCoupons.com, FindTheListing, FindTheBest.es, FindTheBest.or.kr, or the death records or grave location sites).
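For anyone who wants to check the math, here is a quick back-of-envelope sketch. The inputs are the figures quoted above (497 verticals, a 40 person team, 35,000,000 indexed pages, 4 years, 200 working days a year); the function and variable names are just illustrative.

```python
# Back-of-envelope workload estimate using the figures quoted in the post.
# The 4-year span and 200 working days per year are the post's own assumptions.
VERTICALS = 497
TEAM_SIZE = 40
INDEXED_PAGES = 35_000_000
YEARS = 4
WORK_DAYS_PER_YEAR = 200


def workload(verticals, team, pages, years, days_per_year):
    """Split the site's verticals and indexed pages evenly across the team."""
    pages_per_person = pages / team
    pages_per_year = pages_per_person / years
    pages_per_day = pages_per_year / days_per_year
    return {
        "verticals_per_person": verticals / team,
        "pages_per_person": pages_per_person,
        "pages_per_person_per_year": pages_per_year,
        "pages_per_person_per_day": pages_per_day,
    }


result = workload(VERTICALS, TEAM_SIZE, INDEXED_PAGES, YEARS, WORK_DAYS_PER_YEAR)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

Run with the numbers above, the exact split is 875,000 pages per person, a bit over the rounded-down figures used in the text, so the "at least" estimates hold.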

Maybe I am still wrong to consider it a bulk scrape job. After all, it is not unreasonable to expect that a single person can edit 1,000 pages of high quality content daily.

Errr....then again...how many pages can you edit in a day?

Where they lost me though was with the "facts" angle. Speaking of not trusting the rest of the "facts" ... how crappy is the business information for SEO Book on FindTheBest, which says that our site launched in 2011, that we have $58,000 in sales, and that we are a book wholesaler?

I realize I am afforded the opportunity to work for free to fix the errors of the scrape job, but if a page is full of automated incorrect trash then maybe it shouldn't exist in the first place.

I am not saying that all pages on these sites are trash (some may be genuinely helpful), but I know if I automated content to the extent FTB does & then mass emailed other sites for syndication partnerships on the duplicate content (often full of incorrect information) that Google would have burned it to the ground already. They likely benefit from their CEO having sold DoubleClick to Google in the past & are exempt from the guidelines & editorial discrimination that the independent webmaster must deal with.

One of the ways you can tell if a company really cares about their product is by seeing if they dogfood it themselves.

Out of curiosity, I looked up FindTheBest on their FindTheCompany site.

They double-list themselves and neither profile is filled out.

That is like having 2 sentences of text on your "about us" page surrounded by 3 AdSense blocks. :D

I think they should worry about fixing the grotesque errors before worrying about "sharing with as many people as possible" but maybe I am just old fashioned.

Certainly they took a different approach ... one that I am sure would get me burned if I tried it. An example sampling of some partner sites...

  • analytics-software.businessknowhow.com "BusinessKnowHow ended the relationship with find the best as soon as we realized how spammy they were." - Janet Attard
  • accountants.entrepreneur.com
  • acronyms.sciencedaily.com
  • alternative-fuel.cleantechnica.com
  • antivirus.betanews.com
  • apps.edudemic.com
  • atvs.agriculture.com
  • autopedia.com/TireSchool/
  • autos.nydailynews.com
  • backup-software.venturebeat.com
  • bags.golfdigest.com
  • beer.womenshealthmag.com
  • best-run-states.247wallst.com
  • bestcolleges.collegenews.com
  • bikes.cxmagazine.com
  • bikes.triathlete.com
  • birds.findthelisting.com
  • birth-control.shape.com
  • brands.goodguide.com
  • breast-pumps.parenting.com
  • broker-dealers.minyanville.com
  • businessschools.college-scholarships.com
  • camcorders.techcrunch.com
  • cars.pricequotes.com
  • cats.petharbor.com
  • catskiing.tetongravity.com
  • chemical-elements.sciencedaily.com
  • comets-astroids.sciencedaily.com
  • companies.findthecompany.com
  • companies.goodguide.com
  • compare-video-editing-software.burnworld.com
  • compare.consumerbell.com
  • compare.guns.com
  • compare.roadcyclinguk.com
  • comparemotorbikes.motorbike-search-engine.co.uk
  • congressional-lookup.nationaljournal.com
  • courses.golfdigest.com
  • crm.venturebeat.com
  • cyclocross-bikes.cyclingdirt.org
  • dealers.gundigest.com
  • death-record.com
  • debt.humanevents.com
  • design-software.underworldmagazines.com
  • destination-finder.fishtrack.com
  • diet-programs.shape.com
  • digital-cameras.techcrunch.com
  • dinosaurs.sciencedaily.com
  • dirt-bikes.cycleworld.com
  • dogbreeds.petmd.com
  • dogs.petharbor.com
  • donors.csmonitor.com
  • e-readers.techcrunch.com
  • earmarks.humanevents.com
  • earthquakes.sciencedaily.com
  • ehr-software.technewsworld.com
  • fallacies.sciencedaily.com
  • fec-candidates.theblaze.com
  • fec-committees.theblaze.com
  • federal-debt.nationaljournal.com
  • fha-condos.realtor.org
  • fha.nuwireinvestor.com
  • financial-advisors.minyanville.com
  • findthebest.com
  • findthebest.motorcycleshows.com
  • findthecoupons.com
  • findthedata.com
  • firms.privateequity.com
  • franchises.fastfood.com
  • ftb.cebotics.com
  • game-consoles.tecca.com
  • game-consoles.venturebeat.com
  • gin.drinkhacker.com
  • golf-courses.bunkershot.com
  • gps-navigation.techcrunch.com
  • gps-navigation.venturebeat.com
  • green-cars.cleantechnica.com
  • guns.dailycaller.com
  • ham-radio.radiotower.com
  • hdtv.techcrunch.com
  • hdtv.venturebeat.com
  • headphones.techcrunch.com
  • headphones.venturebeat.com
  • high-chairs.parenting.com
  • highest-mountains.sciencedaily.com
  • hiv-stats.realclearworld.com
  • horsebreeds.petmd.com
  • hospital-ratings.lifescript.com
  • hr-jobs.findthelistings.com
  • inventors.sciencedaily.com
  • investment-advisors.minyanville.com
  • investment-banks.minyanville.com
  • iv-housing.dailynexus.com
  • laptops.mobiletechreview.com
  • laptops.techcrunch.com
  • laptops.venturebeat.com
  • lawschool.lawschoolexpert.com
  • locategrave.org
  • mammography-screening-centers.lifescript.com
  • mba-programs.dealbreaker.com
  • medigap-policies.findthedata.org
  • military-branches.nationaljournal.com
  • motorcycles.cycleworld.com
  • mountain-bikes.outsideonline.com
  • nannies.com
  • nobel-prize-winners.sciencedaily.com
  • nursing-homes.caregiverlist.com
  • nursing-homes.silvercensus.com
  • onlinecolleges.collegenews.com
  • phones.androidauthority.com
  • pickups.agriculture.com
  • planets.realclearscience.com
  • planets.sciencedaily.com
  • plants.backyardgardener.com
  • presidential-candidates.theblaze.com
  • presidents.nationaljournal.com
  • privateschools.parentinginformed.com
  • processors.betanews.com
  • project-management-software.venturebeat.com
  • projectors.techcrunch.com
  • pushcarts.golfdigest.com
  • recovery-and-reinvestment-act.theblaze.com
  • religions.theblaze.com
  • reviews.creditcardadvice.com
  • saving-accounts.bankingadvice.com
  • sb-marinas.noozhawk.com
  • sb-nonprofits.noozhawk.com
  • scheduling-software.venturebeat.com
  • scholarships.savingforcollege.com
  • schools.nycprivateschoolsblog.com
  • scooters.cycleworld.com
  • smartphones.techcrunch.com
  • smartphones.venturebeat.com
  • solarpanels.motherearthnews.com
  • sports-drinks.flotrack.org
  • stables.thehorse.com
  • state-economic-facts.nationaljournal.com
  • steppers.shape.com
  • strollers.parenting.com
  • supplements.womenshealthmag.com
  • tablets.androidauthority.com
  • tablets.techcrunch.com
  • tablets.venturebeat.com
  • tabletsandstuff.com/tablet-comparison-chart
  • tallest-buildings.sciencedaily.com
  • technology.searchenginewatch.com
  • telescopes.universetoday.com
  • tequila.proof66.com
  • texas-golf-courses.texasoutside.com
  • tires.agriculture.com
  • tractors.agriculture.com
  • tsunamies.sciencedaily.com
  • us-hurricanes.sciencedaily.com
  • video-cameras.venturebeat.com
  • volcanic-eruptions.com
  • waterheaters.motherearthnews.com
  • wetsuits.swellinfo.com
  • whiskey.cocktailenthusiast.com
  • whiskey.drinkoftheweek.com
  • white-house-visitors.theblaze.com
  • wineries.womenshealthmag.com



we have seen search results where a search engine didn't robots.txt something out, or somebody takes a cookie cutter affiliate feed, they just warm it up and slap it out, there is no value add, there is no original content there and they say search results or some comparison shopping sites don't put a lot of work into making it a useful site. They don't add value. - Matt Cutts

That syndication partnership network also explains part of how FTB is able to get so many pages indexed by Google, as each of those syndication sources is linking back at FTB on (what I believe to be) every single page of the subdomains, and many of these subdomains are linked to from sitewide sidebar or footer links on the PR7 & PR8 tech blogs.

And so the PageRank shall flow ;)

Hundreds of thousands of hours (eg 200,000+) for 40 people is 5,000 hours per person. Considering that there are an average of 2,000 hours per work year, this would imply each employee spent 2.5 full years of work on this single aspect of the job. And that is if one ignores the (hundreds of?) millions of content pages on other sites.
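The same sort of sanity check works for the hours claim. This sketch takes "hundreds of thousands of hours" at the 200,000 floor used above and assumes the same 40 person team and a 2,000 hour work year; the variable names are illustrative.

```python
# Hours-per-person check, using the floor of the "hundreds of thousands" claim.
CLAIMED_HOURS = 200_000       # lower bound assumed in the post
TEAM_SIZE = 40                # team size from Brandon's email
HOURS_PER_WORK_YEAR = 2_000   # the post's rough figure for a full work year

hours_per_person = CLAIMED_HOURS / TEAM_SIZE
years_per_person = hours_per_person / HOURS_PER_WORK_YEAR

print(f"hours per person: {hours_per_person:,.0f}")
print(f"full work years per person: {years_per_person:.1f}")
```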

How does TechCrunch describe the FTB partnership?

Here’s one reason to be excited: In its own small way, it combats the recent flood of crappy infographics. Most TechCrunch writers hate the infographics that show up in our inboxes— not because infographics have to be terrible, but because they’re often created by firms that are biased, have little expertise in the subject of the infographic, or both, so they pull random data from random sources to make their point.

Get that folks? TechCrunch hosting automated subdomains of syndicated content means less bad infographics. And more cat lives saved. Or something like that.

How does FTB describe this opportunity for publishers?

The gadget comparisons we built for TechCrunch are sticky and interactive resources comprised of thousands of SEO optimized pages. They help over 1 million visitors per month make informed decisions by providing accurate, clear and useful data.

SEO optimized pages? Hmm.

Your comparisons will include thousands of long-tail keywords and question/answer pages to ensure traffic is driven by a number of different search queries. Our proprietary Data Content Platform uses a mesh linking structure that maximizes the amount of pages indexed by search engines. Each month—mainly through organic search—our comparisons add millions of unique visitors to our partner’s websites.

Thousands of long-tail keyword & QnA pages? Mesh linking structure? Hmm.

If we expand the "view more" section at the footer of the page, what do we find?

Holy Batman.

Sorry that font is so small. The text needed to be reduced multiple sizes in order to fit on my extra large monitor, and then reduced again to fit the width of our blog.

Each listing in a comparison has a number of associated questions created around the data we collect.

For example, we collect data on the battery life of the Apple iPad.

An algorithm creates the question “How long does the Apple iPad tablet battery last?” and answers it

So now we have bots asking themselves questions that they answer themselves & then stuffing that in the index as content?

Yeah, sounds like human-driven editorial.

After all, it's not like there are placeholder tokens on the auto-generated stuff

{parent_field}

Ooops.

Looks like I was wrong on that.
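To make the mechanics concrete, here is a minimal sketch of how template-driven Q&A content of this kind could be produced, and how a missing field would leave a raw token on the page. This is purely an illustration: FTB's actual system is not public, and every name here (the templates, `fill`, `unfilled_tokens`, the record fields) is hypothetical. The battery life value is a made-up placeholder, not real data.

```python
import re

# Hypothetical templates; the question wording mirrors the example quoted above.
QUESTION_TEMPLATE = "How long does the {brand} {product} {category} battery last?"
ANSWER_TEMPLATE = "The {brand} {product} battery lasts {battery_life_hours} hours."

# Matches tokens such as {brand} or {parent_field}.
TOKEN = re.compile(r"\{(\w+)\}")


def fill(template, record):
    """Swap each {field} token for its value; unknown fields are left untouched."""
    return TOKEN.sub(lambda m: str(record.get(m.group(1), m.group(0))), template)


def unfilled_tokens(text):
    """Return the names of any tokens that survived substitution."""
    return TOKEN.findall(text)


# Placeholder record; battery_life_hours is an invented value for illustration.
record = {
    "brand": "Apple",
    "product": "iPad",
    "category": "tablet",
    "battery_life_hours": 10,
}

question = fill(QUESTION_TEMPLATE, record)
answer = fill(ANSWER_TEMPLATE, record)
print(question)
print(answer)

# A template whose field is missing from the record leaks the raw token.
broken = fill("Compare {parent_field} options", {})
print(broken, unfilled_tokens(broken))
```

A check like `unfilled_tokens` before publishing is the sort of thing an editor, human or otherwise, would be expected to catch.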

And automated "popular searches" pages? Nice!

As outrageous as the above is, they also include undisclosed affiliate links in the content, and provide badge-based "awards" for things like the best casual dating sites, to help build links into their site.

That in turn led to them getting a bunch of porn backlinks.

If you submit an article to an article directory and someone else picks it up & posts it to a sketchy site you are a link spammer responsible for the actions of a third party.

But if you rate the best casual dating sites and get spammy porn links you are wonderful.

Content farming never really goes away. It only becomes more corporate.

Introduction Thread #6

Welcome to our sixth welcome thread (prior ones here, here, here, here & here).

If you are new to the site, please say hi and introduce yourself. :)

Google: "As We Say, NOT As We Do"

Due to heavy lobbying, the FTC's investigation into Google's business practices has ended with few marks or bruises on Google's behalf. If the EU has similar results, you can count on Google growing more anti-competitive in their practices:

Google is flat-out lying. They’ve modified their code to break Google Maps on Windows Phones. It worked before, but with the ‘redirect,’ it no longer works.

We are only a couple days into the new year, but there have already been numerous absurdities highlighted, in addition to the FTC decision & Google blocking Windows Phones.

When is Cloaking, Cloaking?

Don't ask Larry Page:

Mr. Page, the CEO, about a year ago pushed the idea of requiring Google users to sign on to their Google+ accounts simply to view reviews of businesses, the people say. Google executives persuaded him not to pursue that strategy, fearing it would irritate Google search users, the people say.
...
Links to Google+ also appear in Google search-engine results involving people and brands that have set up a Google+ account.

Other websites can't hardcode their own listings into the search results. But anyone who widely attempted showing things to Googlebot while cloaking them to users would stand a good chance of being penalized for their spam. They would risk both a manual intervention & being hit by Panda based on poor engagement metrics.

Recall that a big portion of the complaint about Google's business practices was their scrape-n-displace modus operandi. As part of the FTC agreement, companies are able to opt out of being scraped into some of Google's vertical offerings, but that still doesn't prevent their content from making its way into the knowledge graph.

Now that Google is no longer free to scrape-n-displace competitors, apparently the parallel Google version of that type of content that should be "free and open to all to improve user experience" (when owned by a 3rd party) is a premium feature locked behind a registration wall (when owned by Google). There is a teaser for the cloaked information in the SERPs, & you are officially invited to sign into Google & join Google+ if you would like to view more.

Information wants to be free.

Unless it is Google's.

Then users want to be tracked and monetized.

Trademark Violations & Copyright Spam

A few years back Google gave themselves a pat on the back for ending relationships with "approximately 50,000 AdWords accounts for attempting to advertise counterfeit goods."

How the problem grew to that scale before being addressed went unasked.

Last year Google announced a relevancy signal based on DMCA complaints (while exempting YouTube) & even nuked an AdSense publisher for linking to a torrent of his own ebook. Google sees a stray link, makes a presumption. If they are wrong and you have access to media channels then the issue might get fixed. But if you lack the ability to get coverage, you're toast.

Years ago a study highlighted how Google's AdSense & DoubleClick were the monetization engine for stolen content. Recently some USC researchers came to the same conclusion by looking at Google's list of domains that saw the most DMCA requests against them. Upon hearing of the recent study, Google's shady public relations team stated:

"To the extent [the study] suggests that Google ads are a major source of funds for major pirate sites, we believe it is mistaken," a Google spokesperson said. "Over the past several years, we've taken a leadership role in this fight. The complexity of online advertising has led some to conclude, incorrectly, that the mere presence of any Google code on a site means financial support from Google."

So Google intentionally avails their infrastructure to people they believe are engaged in criminal conduct (based on their own 50,000,000+ "valid" DMCA findings) and yet Google claims to have zero responsibility for those actions because Google may, in some cases, not get a direct taste in the revenues (only benefiting indirectly through increasing the operating costs of running a publishing business that is not partnered with Google).

A smaller company engaged in a similar operation might end up getting charged for the conduct of their partners. However, when Google's ad code is in the page you are wrong to assume any relationship.

The above linked LA Times article also had the following quote in it:

"When our ads were running unbeknownst to us on these pirate sites, we had a serious problem with that," said Gareth Hornberger, senior manager of global digital marketing for Levi's. "We reached out to our global ad agency of record, OMD, and immediately had them remove them.... We made a point, moving forward, that we really need to take steps to avoid having these problems again."

Through Google's reality warping efforts the ad network, the ad agency, the publisher, and the advertiser are all entirely unaccountable for their own efforts & revenue streams. And it is not like Google or the large ad agencies lack the resources to deal with these issues, as there is some serious cash in these types of deals: "WPP, Google's largest customer, increased its spending on Google by 25% in 2012, to about $2 billion."

These multi-billion dollar budgets are insufficient funds to police the associated activities. Whenever anything is mentioned in the media, mention system complexity & other forms of plausible deniability. When that falls short, outsource the blame onto a contractor, service provider, or rogue partner. Contrasting that behavior, the common peasant webmaster must proactively monitor the rest of the web to ensure he stays in the graces of his Lord Google.

DMCA Spam

You have to police your user generated content, or you risk your site being scored as spam. With that in mind, many big companies are now filing false DMCA takedown requests. Sites that receive DMCA complaints need to address them or risk being penalized. Businesses that send out bogus DMCA requests have no repercussions (until they are eventually hit with a class action lawsuit).

Remember how a while back Google mentioned their sophisticated duplication detection technology in YouTube?

There are over a million full movies on YouTube, according to YouTube!

The other thing that is outrageous is that if someone takes a video that is already on YouTube & re-uploads it, Google will sometimes outrank the original video with the spam shag-n-republish.

In the below search result you can see that our video (the one with the Excel spreadsheet open) is listed in the SERPs 3 times.

The version we uploaded has over a quarter million views, but ranks below the spam syndication version with under 100 views.

There are only 3 ways to describe how the above can happen:

  • a negative ranking factor against our account
  • horrible relevancy algorithms
  • idiocy

I realize I could DMCA them, but why should I have to bear that additional cost when Google allegedly automatically solved this problem years ago?

Link Spam

Unlike sacrosanct ad code, if someone points spam links at your site, you are responsible for cleaning it up. The absurdity of this contrast is only further highlighted by the post Google did about cleaning up spam links, where one of the examples they highlighted publicly as link spam was not a person's spam efforts, but rather a competitor's sabotage efforts that worked so well that they were even publicly cited as being outrageous link spam.

It has been less than 3 months since Google launched their disavow tool, but since its launch some webmasters are already engaging in pre-negative SEO. That post had an interesting comment on it:

Well Mr Cutts, you have created a monster in Google now im afraid. Your video here http://www.youtube.com/watch?v=HWJUU-g5U_I says that with the new disavow tool makes negative SEO a mere nuisance.
Yet in your previous video about the diavow tool you say it can take months for links to be disavowed as google waits to crawl them???
In the meantime, the time lag makes it a little more than a "nuisance" don't you think?

Where Does This Leave Us?

As Google keeps adding more advanced filters to their search engines & folding more usage data into their relevancy algorithms, they are essentially gutting small online businesses. As Google guts them, it is important for them to offer a counter message of inclusion. A WSJ article mentioned that Google's "get your business online" initiative was more effective at manipulating governmental officials than their other lobbying efforts. And that opinion was sourced from Google's lobbyists:

Some Washington lobbyists, including those who have done work for Google, said that the Get Your Business Online effort has perhaps had more impact on federal lawmakers than any lobbying done on Capitol Hill.

Each of the additional junk time wasting tasks (eg: monitoring backlinks and proactively filtering them, managing inventory & cashflow while waiting for penalties tied to competitive sabotage to clear, filing DMCAs against Google properties when Google claims to have fixed the issue years ago, merging Google Places listings into Google+, etc.) Google foists onto webmasters who run small operations guarantees that a greater share of them will eventually get tripped up.

Not only will the algorithms be out of their reach, but so will consulting.

That algorithmic approach will also only feed into further "market for lemons" aspects as consultants skip the low margin, small budget, heavy lifting jobs and focus exclusively on servicing the companies which Google is biasing their "relevancy" algorithms to promote in order to taste a larger share of their ad budgets.

While chatting with a friend earlier today he had this to say:

Business is arbitrage. Any exchange not based in fraud is legitimate regardless of volume or medium. The mediums choose to delegitimize smaller players as a way to consolidate power.

Sadly most journalists are willfully ignorant of the above biases & literally nobody is comparing the above sorts of behaviors against each other. Most people inside the SEO industry also avoid the topic, because it is easier (& more profitable) to work with the elephants & attribute their success to your own efforts than it is to highlight the holes in the official propaganda.

I mean, just look at all the great work David Naylor did for a smaller client here & Google still gave him the ole "screw you" in spite of doing just about everything possible within his control.

The linkbuilding tactics used by the SEO company on datalabel.co.uk were low quality, but the links were completely removed before a Reconsideration Request was filed. The MD’s commenting and directory submissions were done in good faith as ways to spread the word about his business. Despite a lengthy explanation to Google, a well-documented clean-up process, and eventually disavowing every link to the site, the domain has never recovered and still violates Google’s guidelines.

If you’ve removed or disavowed every link, and even rebuilt the site itself, where do you go from there?

Is Google Concerned About Amazon Eating Their Lunch?

Leveling The Playing Field

When monopolies state that they want to "level the playing field" it should be cause for concern.

Groupon is a great example of how this works. After they turned down Google's buyout offer, Google responded by...

The same deal is slowly progressing in the cell phone market: “we are using compatibility as a club to make them do things we want."

Leveling Shopping Search

Ahead of the Penguin update Google claimed that they wanted to "level the playing field." Now that Google shopping has converted into a pay-to-play format & Amazon.com has opted out of participation, Google once again claims that they want to "level the playing field":

“We are trying to provide a level playing field for retailers,” [Google’s VP of Shopping Sameer Samat] said, adding that there are some companies that have managed to do both tech and retail well. “How’s the rest of the retail world going to hit that bar?”

This quote is particularly disingenuous. For years you could win in search with a niche site by being more focused, having higher quality content & more in-depth reviews. But now even some fairly large sites are getting flushed down the ranking toilet while the biggest sites that syndicate their data displace them (see this graph for an example, as Pricegrabber is the primary source for Yahoo! Shopping).

Some may make the argument that a business is illegitimate if it is excessively focused on search and has few other distribution channels, but if building those other channels causes your own site to get filtered out as duplicate content, all you are doing is trading one risky relationship for another. When it comes time to re-negotiate the partnerships in a couple years look for the partner to take a pound of flesh on that deal.

How Google Drives Businesses to Amazon, eBay & Other Platforms

Google has spent much of the past couple years scrubbing smaller ecommerce sites off the web via the Panda & Penguin updates. Now if small online merchants want an opportunity to engage in Google's search ecosystem they have a couple options:

  • Ignore it: flat out ignore search until they build a huge brand (it's worth noting that branding is a higher level function & deep brand investment is too cost intensive for many small niche businesses)
  • Join The Circus: jump through an endless series of hoops, minimizing their product pages & re-configuring their shopping cart
  • PPC: operate at or slightly above the level of a non-functional thin phishing website & pay Google by the click via their new paid inclusion program
  • Ride on a 3rd Party Platform: sell on one of the larger platforms that Google is biasing their algorithms toward & hope that the platform doesn't cut you out of the loop.

Ignoring search isn't a lasting option. Some of the PPC costs won't back out for smaller businesses that lack a broad catalog to do repeat sales against to lift lifetime customer value, and SEO is getting prohibitively expensive & uncertain. Of these options, a good number of small online merchants are now choosing #4 (riding on a 3rd party platform).

Operating an ecommerce store is hard. You have to deal with...

  • sourcing & managing inventory
  • managing employees
  • technical / software issues
  • content creation
  • marketing
  • credit card fraud
  • customer service
  • shipping

Some services help to minimize the pain in many of these areas, but just like people do showrooming offline many also do it online. And one of the biggest incremental costs added to ecommerce over the past couple years has been SEO.

Google's Barrier to Entry Destroys the Diversity of Online Businesses

How are the smaller merchants to compete with larger ones? Well, for starters, there are some obvious points of influence in the market that Google could address...

  • time spent worrying about Penguin or Panda is time that is not spent on differentiating your offering or building new products & services
  • time spent modifying the source code of your shopping cart to minimize pagecount & consolidate products (and various other "learn PHP on the side" work) is not spent on creating more in-depth editorial
  • time switching carts to one that has the newly needed features (for GoogleBot and ONLY GoogleBot) & aligning your redirects is not spent on outreach and media relations
  • time spent disavowing links that a competitor built into your site is not spent on building new partnerships & other distribution channels outside of search

Ecosystem instability taxes small businesses more than larger ones as they...

The presumption that size = quality is false, a fact which Google only recognizes when it hits their own bottom line.

Anybody Could Have Seen This Coming

About a half-year ago we had a blog post about 'Branding & The Cycle' which stated:

algorithmically brand emphasis will peak in the next year or two as Google comes to appreciate that they have excessively consolidated some markets and made it too hard for themselves to break into those markets. (Recall how Google came up with their QDF algorithm only *after* Google Finance wasn't able to rank). At that point in time Google will push their own verticals more aggressively & launch some aggressive public relations campaigns about helping small businesses succeed online.

Since that point in time Amazon has made so many great moves to combat Google:

All of that is on top of creating the Kindle Fire, gaining content streaming deals & their existing strong positions in books and e-commerce.

It is unsurprising to see Google mentioning the need to "level the playing field." They realize that Amazon benefits from many of the same network effects that Google does & now that Amazon is leveraging their position atop e-commerce to get into the online ads game, Google feels the need to mix things up.

If Google was worried about book searches happening on Amazon, how much more worried might they be about a distributed ad network built on Amazon's data?

Said IgnitionOne CEO Will Margiloff: “I’ve always believed that the best data is conversion data. Who has more conversion data in e-commerce than Amazon?”

“The truth is that they have a singular amount of data that nobody else can touch,” said Jonathan Adams, iCrossing’s U.S. media lead. “Search behavior is not the same as conversion data. These guys have been watching you buy things for … years.”
...
Amazon also has an opportunity to shift up the funnel, to go after demand-generation ad budgets (i.e. branding dollars) by using its audience data to package targeting segments. It's easy to imagine these segments as hybrids of Google’s intent-based audience pools and Facebook’s interest-based ones.

Google is in a sticky spot with product search. As they aim to increase monetization by displacing the organic result set they also lose what differentiates them from other online shopping options. If they just list big box retailers then users will learn to pick their favorite and cut Google out of the loop. Many shoppers have been trained to start at Amazon.com even before Google began polluting their results with paid inclusion:

Research firm Forrester reported that 30 percent of U.S. online shoppers in the third quarter began researching their purchase on Amazon.com, compared with 13 percent who started on a search engine such as Google - a reversal from two years earlier when search engines were more popular starting points.

Who will Google partner with in their attempt to disrupt Amazon? Smaller businesses, larger corporations, or a mix of both? Can they succeed? Thoughts?

How to Obfuscate And Misdirect an Algo Update

Sharing is caring!

Please share :)

Embed code is here.

If you find the following a bit hard to read due to font size, a wider version is located here.

Google Algo Changes.

Counterspin on Shopping Search: Shady Paid Inclusion

Bing caused a big stink today when they unveiled Scroogled, a site that highlights how Google Shopping has gone paid-inclusion only. A couple weeks ago Google announced that they would be taking their controversial business model global, in spite of it being "a mess."

Nextag has long been critical of Google's shifts on the shopping search front. Are their complaints legitimate, or are they just whiners?

Data, More Reliable Than Spin

Nothing beats data, so let's start with that.

This is what Nextag's search exposure has done over the past few years, according to SearchMetrics.

If Google did that to any large & politically connected company, you can bet regulators would have already taken action against Google, rather than currently negotiating with them.

What's more telling is how some other sites in the shopping search vertical have performed.

PriceGrabber, another player in the shopping search market, has also slowly drifted downward (though at a much slower rate).

One of the few shopping search engines that has seen a big lift over this time period was Yahoo! Shopping.

What is interesting about that rise is that Yahoo! outsourced substantially all of their shopping search product to PriceGrabber.

A Self-Destructing Market Dynamic

The above creates an interesting market dynamic...

  • the long established market leader can wither on the vine for being too focused on their niche market & not broadening out in ways that increase brand awareness
  • a larger site with loads of usage data can outsource the vertical and win based on the bleed of usage data across services & the ability to cross promote the site
  • the company investing in creating the architecture & baseline system that powers other sites continues to slide due to limited brand & a larger entity gets to displace the data source
  • Google then directly enters the market, further displacing some of the vertical players

The above puts Nextag's slide in perspective, but the problem is that they still have fixed costs to manage if they are going to maintain their editorial quality. Google can hand out badges for people willing to improve their product for free or give searchers a "Click any fact to locate it on the web. Click Wrong? to report a problem" but others who operated with such loose editorial standards would likely be labeled as spammers of one stripe or another.

Scrape-N-Displace

Most businesses have to earn the right to have exposure. They have to compete in the ecosystem, build awareness & so on. But Google can come in from the top of the market with an inferior product, displace the competition, economically starve them & eventually create a competitive product over time through a combination of incremental editorial improvements and gutting the traffic & cash flow to competing sites.

"The difference between life and death is remarkably small. And it’s not until you face it directly that you realize your own mortality." - Dustin Curtis

The above quote is every bit as true for businesses as it is for people. Nothing more than a threat of a potential entry into a market can cut off the flow of investment & paralyze businesses in fear.

  • If you have stuff behind a paywall or pre-roll ads you might have "poor user experience metrics" that get you hit by Panda.
  • If you make your information semi-accessible to Googlebot you might get hit by Panda for having too much similar content.
  • If you are not YouTube & you have a bunch of stolen content on your site you might get hit by a copyright penalty.
  • If you leave your information fully accessible publicly you get to die by scrape-n-displace.
  • If you are more clever about information presentation perhaps you get a hand penalty for cloaking.

None of those is a particularly desirable way to have your business die.

Editorial Integrity

In addition to having a non-comprehensive database, Google Shopping also suffers from the problem of line extension (who buys video games from Staples?).

The bigger issue is that of general editorial integrity.

Are products in stock? Sometimes no.

It is also worth mentioning that some sites with "no product available" like Target or Toys R Us might also carry further Google AdSense ads.

Then there are also issues with things like ads that optimize for CTR which end up promoting things like software piracy or the academic versions of software (while lowering the perceived value of the software).

Over the past couple years Google has whacked loads of small ecommerce sites & the general justification is that they don't add enough that is unique, and that they don't deserve to rank as their inventory is unneeded duplication of Amazon & eBay. Many of these small businesses carry inventory and will be driven into insolvency by the sharp shifts in traffic. And while a small store is unneeded duplication, Google still allows syndicated press releases to rank great (and once again SEOs get blamed for Google being Google - see the quote-as-headline here).

Let's presume Google's anti-small business bias is legitimate & look at Google Shopping to see how well they performed in terms of providing a value add editorial function.

A couple days ago I was looking for a product that is somewhat hard to find due to seasonal shopping. It is often available at double or triple retail on sites like eBay, but Google Shopping helped me locate a smaller site that had it available at retail price. Good deal for me & maybe I was wrong about Google.

... then again ...

The site they sent me to had the following characteristics:

  • URL - not EMD & not a brand, broken English combination
  • logo - looks like I designed it AND like I was in a rush when I did it
  • about us page - no real information, no contact information (on an ecommerce site!!!), just some obscure stuff about "direct connection with China" & mention of business being 15 years old and having great success
  • age - domain is barely a year old & privacy registered
  • inbound links - none
  • product price - lower than everywhere else
  • product level page content - no reviews, thin scraped editorial, editorial repeats itself to fill up more space, 3 adsense blocks in the content area of the page
    • no reviews, thin scraped editorial, editorial repeats itself to fill up more space, 3 adsense blocks in the content area of the page
    • no reviews, thin scraped editorial, editorial repeats itself to fill up more space, 3 adsense blocks in the content area of the page
    • no reviews, thin scraped editorial, editorial repeats itself to fill up more space, 3 adsense blocks in the content area of the page
    • the above repetition is to point out the absurdity of the formatting of the "content" of said page
  • site search - yet again the adsense feed, searching for the product landing page that was in Google Shopping I get no results (so outside of paid inclusion & front/center placement, Google doesn't even feel this site is worth wasting the resources to index)
  • checkout - requires account registration, includes captcha that never matches, hoping you will get frustrated & go back to earlier pages and click an ad

It actually took me a few minutes to figure it out, but the site was designed to look like a phishing site, with the intent that perhaps you will click on an ad rather than trying to complete a purchase. The forced registration will eat your email & who knows what they will do with it, but you can never complete your purchase, making the site a complete waste of time.

Looking at the above spam site with the help of tools like NetComber it was apparent that this "merchant" also ran all sorts of scraper sites built on content scraped from Yahoo! Answers & similar, with sites about Spanish + finance + health + shoes + hedge funds.

It is easy to make complaints about Nextag being a less than perfect user experience. But it is hard to argue that Google is any better. And when other companies have editorial costs that Google lacks (and the other companies would be labeled as spammers if they behaved like Google) over time many competing sites will die off due to the embedded cost structure advantages. Amazon has enough scale that people are willing to bypass Google's click circus & go directly to Amazon, but most other ecommerce players don't. The rest are largely forced to pay Google's rising rents until they can no longer afford to, then they just disappear.

Bonus Prize: Are You Up to The Google Shopping Test?

The first person who successfully solves this captcha wins a free month membership to our site.

Soft Launching SEOTools.net

Last month we soft launched SEOTools.net. Here are a few entries as a sample of things to come...

... do subscribe to the RSS feed if you like what you see thus far.

Why create yet another site about SEO?

Good question, glad you asked. ;)

Our customer base on this site consists primarily of the top of this pyramid. I can say without doubt that I know that some of our customers know more about SEO than I do & that generally makes them bleeding edge. ;)

And then some people specialize in local or video or ecommerce or other such verticals where there are bits of knowledge one can only gain via first hand experience (eg: importing from China or doing loads of testing of YouTube variables or testing various upsells). There is becoming so much to know that nobody can really know everything, so the goal of our site here is to sorta bring together a lot of the best folks.

Some people newer to the field & a bit lower down on the pyramid are lucky/smart enough to join our community too & those who do so and participate likely save anywhere from 1 to 3 years on their learning curve...leveling up quickly in the game/sport of SEO. But by and large our customers are mostly the expert end of the market.

We could try to water down the community & site to try to make it more mass market, but I think that would take the site's leading strength and flush it down the toilet. In the short run it would mean growth, but it would also make the community less enjoyable ... and this site is as much a labor of love as it is a business. I think I would burn myself out & no longer love it if the site became noisy & every third post was about the keyword density of meta tags.

What Drives You?

When SEOBook.com was originally created SEO was much less complex & back in 2003 I was still new to the field, so I was writing at a level that was largely aligned with the bulk of the market. However, over the past decade SEO has become much more complex & many of our posts tend to be at a pretty high level, pondering long-term implications of various changes.

When there are big changes in the industry we are usually early in discussing them. We were writing about exact match domains back in 2006 and when Google's algorithm hinted at a future of strong brand preference we mentioned that back in 2009. With that being said, many people are not nimble enough to take advantage of some of the shifts & many people still need solid foundational SEO 101 in place before the exceptions & more advanced topics make sense.

The following images either make sense almost instantly, or they look like they are in Greek...depending on one's experience in the field of SEO.

My mom and I chat frequently, but she tells me some of the posts here tend to be pretty deep / complex / hard to understand. Some of them take 20 hours to write & likely read like college dissertations. They are valuable for those who live & breathe SEO, but are maybe not a great fit for those who casually operate in the market.

My guess is my mom is a pretty good reflection of most of the market in understanding page titles, keywords, and so on...but maybe not knowing a lot about anchor text filters, link velocity, extrapolating where algorithm updates might create future problems & how Google might then respond to those, etc. And most people who only incidentally touch the SEO market don't need to get a PhD in the topic in order to reach the point of diminishing returns.

Making Unknowable SEO More Knowable

SEO has many pieces that are knowable (rank, traffic, rate of change, etc.), but over time Google has pulled back more and more data. As Google gets greedier with their data, that makes SEO harder & increases the value of some 3rd party tools that provide competitive intelligence information.

  • Being able to look up the performance of a section of a site is valuable.
  • Tracking how a site has done over time (to identify major ranking shifts & how they align with algorithm updates) is also quite valuable.
  • Seeing link spikes & comparing those with penalties is also valuable.

These data sets offer clues that help drive strategy for recovering from penalties & show how to mimic top performing sites to make a site less likely to get penalized.
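As a rough illustration of the link spike idea, here is a minimal sketch in Python. It is not how any particular tool works: the function name, the 3 month window, the 3x threshold & the monthly counts are all made up for the example. It just flags months where new links jump well above the recent baseline, which you could then line up against known update dates.

```python
from statistics import median


def find_link_spikes(monthly_new_links, window=3, factor=3.0):
    """Flag months whose new-link count exceeds `factor` times the trailing median.

    `monthly_new_links` is a list of (month, count) pairs in date order.
    The window and factor defaults are arbitrary assumptions, not known thresholds.
    """
    spikes = []
    for i in range(window, len(monthly_new_links)):
        month, count = monthly_new_links[i]
        # baseline = median of the preceding `window` months
        baseline = median(c for _, c in monthly_new_links[i - window:i])
        if baseline > 0 and count > factor * baseline:
            spikes.append(month)
    return spikes


# Hypothetical data: a site that picked up a burst of links in April.
history = [
    ("2012-01", 40), ("2012-02", 45), ("2012-03", 38),
    ("2012-04", 400), ("2012-05", 50), ("2012-06", 42),
]
print(find_link_spikes(history))
```

Using a trailing median rather than an average keeps one big month from inflating the baseline for the months that follow it.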

The Difference Between These 2 Sites

Our goal with SEO Book is to...

  • try to cover important trends & topics deeper than anyone else (while not just parroting Google's view)
  • offer a contrary view to lifestyle image / slogan-based SEO lacking in substance or real-world experience
  • maintain the strongest community of SEO experts, such that we create a community I enjoy participating in & learning from

Our goal with SEO tools is to...

  • create a site that is a solid fit for the beginner to intermediate portions of the market
  • review & compare various industry tools & highlight where they have unique features
  • offer how to guides on specific tasks that help people across a diverse range of backgrounds & skill levels save time and become more efficient SEOs
  • provide introduction overviews of various SEO-related topics

Google Disavow Tool

Google launched a disavow links tool. Webmasters who want to tell Google which links they don’t want counted can now do so by uploading a list of links in Google Webmaster Tools.

If you haven’t received an “unnatural link” alert from Google, you don’t really need to use this tool. And even if you have received notification, Google are quick to point out that you may wish to pursue other avenues, such as approaching the site owner, first.

Webmasters have met with mixed success following this approach, of course. It's difficult to imagine many webmasters going to that trouble and expense when they can now upload a txt file to Google.

Careful, Now

The disavow tool is a loaded gun.

If you get the format wrong by mistake, you may end up taking out valuable links for long periods of time. Google advise that if this happens, you can still get your links back, but not immediately.
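Given how costly a formatting mistake can be, it may be worth sanity checking the file before uploading it. The sketch below assumes the commonly documented disavow file layout (one full URL or one `domain:example.com` entry per line, with `#` starting a comment). The function name & the sample entries are hypothetical, and this is only a rough pre-flight check, not a guarantee of how Google will parse the file.

```python
from urllib.parse import urlparse


def check_disavow_lines(lines):
    """Return (entries, problems) for the lines of a disavow file.

    Assumed format: '#' comments, 'domain:host' entries, or full http(s) URLs.
    """
    entries, problems = [], []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no directive
        if line.lower().startswith("domain:"):
            host = line[len("domain:"):].strip()
            # a bare hostname should have a dot and no scheme, path or spaces
            if "." not in host or "/" in host or " " in host:
                problems.append((number, line, "malformed domain entry"))
            else:
                entries.append(("domain", host.lower()))
            continue
        parsed = urlparse(line)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            entries.append(("url", line))
        else:
            problems.append((number, line, "not a full URL or domain: entry"))
    return entries, problems


# Hypothetical file contents; lines 4 and 5 are the kind of slip that is easy to make.
sample = [
    "# directory submissions from 2009",
    "domain:spammy-directory.example",
    "http://blog.example/post?id=1",
    "spammy-directory.example",
    "domain:http://oops.example/",
]
entries, problems = check_disavow_lines(sample)
for number, line, reason in problems:
    print(f"line {number}: {reason}: {line}")
```

A check like this only catches malformed lines. It can't tell you whether a well-formed entry is one you will regret disavowing.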

Could the use of the tool be seen as an admission of guilt? Matt gives examples of "bad" webmaster behavior, which comes across a bit like “webmasters confessing their sins!” Is this the equivalent of putting up your hand and saying “yep, I bought links that even I think are dodgy!”? May as well paint a target on your back.

Some webmasters have been victims of negative SEO. Some webmasters have had scrapers and autogen sites that steal their content, and then link back. There are legitimate reasons to disavow links. Hopefully, Google makes an effort to make such a distinction.

One wonders why Google simply don't discount the links they already deem to be “bad”? Why the need for the webmaster to jump through hoops? The webmaster is still left to guess which links are “bad”, of course.

Not only is it difficult working out the links that may be a problem, it can be difficult getting a view of the entire link graph. There are various third party tools, including Google’s own Webmaster Central, but they aren’t exhaustive.

Matt mentioned that the link notification emails will provide examples of problem links, however this list won't be exhaustive. He also mentioned that you should pay attention to the more recent links, presumably because if you haven't received notification up until now, then older links weren't the problem. The issue with that assumption is that links that were once good can over time become bad:

  • That donation where you helped a good cause & were later mortified that "online casino" and "discount cheap viagra" followed your course for purely altruistic reasons.
  • That clever comment on a well-linked PR7 page that is looking to cure erectile dysfunction 20 different ways in the comments.
  • Links from sources that were considered fine years ago & were later repositioned as spam (article banks anyone?)
  • Links from sites that were fine, but a number of other webmasters disavowed, turning a site that originally passed the sniff test into one that earns a second review revealing a sour stench.

This could all get rather painful if webmasters start taking out links they perceive to be a problem, but aren’t. I imagine a few feet will get blasted off in the process.

Webmasters Asked, Google Gaveth

Webmasters have been demanding such a tool since the unnatural link notifications started appearing. There is no question that removing established links can be as hard, if not harder, than getting the links in the first place. Generally speaking, the cheaper the link was to get the higher the cost of removal (relative to the original purchase price). If you are renting text link ads for $50 a month you can get them removed simply by not paying. But if you did a bulk submission to 5,000 high PR SEO friendly directories...best of luck with that!

It is time consuming. Firstly, there’s the overhead in working out which links to remove, as Google doesn’t specify them. Once a webmaster has made a list of the links she thinks might be a problem, she then needs to go through the tedious task of contacting each site and requesting that a link be taken down.

Even with the best will in the world, this is an overhead for the linking site, too. A legitimate site may wish to verify the identity of the person requesting the delink, as the delink request could come from a malicious competitor. Once identity has been established, the site owner must go to the trouble of making the change on their site.

This is not a big deal if a site owner only receives one request, but what if they receive multiple requests per day? It may not be unreasonable for a site owner to charge for the time taken to make the change, as such a change incurs a time cost. If the webmaster who has incurred a penalty has to remove many links, from multiple sites, then such costs could quickly mount. Taken to the (il)logical extremes, this link removal stuff is a big business. Not only are there a number of link removal services on the market, but one of our members was actually sued for linking to a site (when the person who was suing them paid to place the link!)

What’s In It For Google?

Webmasters now face the prisoner's dilemma and are doing Google’s job for them.

It’s hard to imagine this data not finding its way to the manual reviewers. If there are multiple instances of webmasters reporting paid links from a certain site, then Google have more than enough justification to take it out. This would be a cunning way around the “how do we know if a link is paid?” problem.

Webmasters will likely incorporate bad link checking into their daily activities. Inbound links weren’t something you had to monitor in the past, as links were good, and those that weren’t didn’t matter, as they didn’t affect ranking anyway. Now, webmasters may feel compelled to avoid an unnatural links warning by meticulously monitoring their inbound links and reporting anything that looks odd. Google haven’t been clear on whether they would take such action as a result - Matt suggests they just reclassify the link & see it as a strong suggestion to treat it like the link has a nofollow attribute - but no doubt there will be clarification as the tool beds in. Google has long used a tiered index structure & enough complaints might lower the tier of a page or site, cause its ability to pass trust to be blocked, or cause the site to be directly penalized.

This is also a way of reaffirming “the law”, as Google sees it. In many instances, it is no fault of the webmaster that rogue sites link up, yet the webmaster will feel compelled to jump through Google’s hoops. Google sets the rules of the game. If you want to play, then you play by their rules, and recognize their authority. Matt Cutts suggested:

we recommend that you contact the sites that link to you and try to get links taken off the public web first. You’re also helping to protect your site’s image, since people will no longer find spammy links and jump to conclusions about your website or business.

Left unsaid in the above is most people don't have access to aggregate link data while they surf the web, most modern systems of justice are based on the presumption of innocence rather than guilt, and most rational people don't presume that a site that is linked to is somehow shady simply for being linked to.

If the KKK links to Matt's blog tomorrow that doesn't imply anything about Matt. And when Google gets featured in an InfoWars article it doesn't mean that Google desires that link or coverage. Many sketchy sites link to Adobe (for their flash player) or sites like Disney & Google for people who are not old enough to view them or such. Those links do not indicate anything negative about the sites being linked into. However, as stated above, search is Google's monopoly to do with as they please.

On the positive side, if Google really do want sites to conform to certain patterns, and will reward them for doing so by letting them out of jail, then this is yet another way to clean up the SERPs. They get the webmaster on side and that webmaster doing link classification work for them for free.

Who, Not What

For a decade search was driven largely by meritocracy. What you did was far more important than who you were. It was much less corrupt than the physical world. But as Google chases brand ad dollars, that view of the search landscape is no longer relevant.

Large companies can likely safely ignore much of the fear-first approach to search regulation. And when things blow up they can cast off blame on a rogue anonymous contractor of sorts. Whereas smaller webmasters walk on eggshells.

When the government wanted to regulate copyright issues Google claimed it would be too expensive and kill innovation at small startups. Google then drafted their own copyright policy from which they themselves are exempt. And now small businesses not only need to bear that cost but also need to police their link profiles, even as competitors can use Fiverr, ScrapeBox, splog link networks & various other sources to drip a constant stream of low cost sludge in their direction.

Now more than ever, status is important.

Gotchas

No doubt you’ve thought of a few. A couple thoughts - not that we advocate them, but realize they will happen:

  • Intentionally build spam links to yourself & then disavow them (in order to make your profile look larger than it is & to ensure that competitor who follows everything you do - but lacks access to your disavow data - walks into a penalty).
  • Find sites that link to competitors and leave loads of comments for the competitor on them, hoping that the competitor blocks the domain as a whole.
  • Find sites that link to competitors & buy links from them into a variety of other websites & then disavow from multiple accounts.
  • Get a competitor some link warnings & watch them push to get some of their own clean "unauthorized" links removed.
  • The webmaster who parts on poor terms burning the bridge behind them, or leaving a backdoor so that they may do so at anytime.

If a malicious webmaster wanted to get a target site in the bad books, they could post obvious comment spam - pointing at their site, and other sites. If this activity doesn’t result in an unnatural linking notification, then all good. It’s a test of how Google values that domain. If it does result in an unnatural link notification, the webmaster could then disavow links from that site. Other webmasters will likely do the same. Result: the target site may get taken out.

To avoid this sort of hit, pay close attention to your comment moderation.

Please add your own to the comments! :) Gotchas, that is, not rogue links.

Further opinions @ searchengineland and seoroundtable.

The Death of SEO [Infographic]

Over the weekend Google did an update which continues their trend of diminishing the value of domain names in an SEO strategy as they "pump up the brand."

In light of that, we thought it would be a good time to get out in front of the tired "Death of SEO" meme that is sure to appear once again in the coming weeks. ;)

The font size is somewhat small in the below image, but if you click through to the archived page you can see it in its full glorious size.

The Death of SEO.

Want to syndicate this infographic? Embed code is here. We also created a PDF version.
