Why I Love Google's Supplemental Index


Forbes recently wrote an article about Google's supplemental results, painting them as webpage hell. The article states that pages in Google's supplemental index are trusted less than pages in the regular index:

Google's programmers appear to have created the supplemental index with the best intentions. It's designed to lighten the workload of Google's "spider," the algorithm that constantly combs and categorizes the Web's pages. Google uses the index as a holding pen for pages it deems to be of low quality or designed to appear artificially high in search results.

Matt Cutts was quick to state that supplemental results are not a big deal, as Rand did here too, but supplemental results ARE a big deal: they are an indication of the health of a website.

I have worked on some of the largest sites and networks of sites on the web (hundreds of millions of pages and more). When looking for duplicate content or information architecture issues, the search engines do not let you view deep enough into their indexes to see every problem, so one of the first things I do is use this search to find low-quality pages (i.e., pages that soak up PageRank without adding much unique content to the site). After you find some of the major issues, you can dig deeper by filtering out the core problems that showed up in your first supplemental searches. For example, here are threadwatch.org supplemental results that do not contain the word node in the URL.
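The linked searches are ordinary Google queries built from the site: and -inurl: operators. As a rough sketch of the iteration described above (the exact supplemental-results filter from the linked searches is not reproduced here, so it is left as a hypothetical placeholder), you might compose the queries like this:

```python
# Hypothetical sketch of building the audit queries described above.
# The supplemental_filter argument is a placeholder for whatever
# supplemental-results query hack you use; it is not the exact
# operator from the linked searches.

def audit_query(domain, exclude_url_terms=(), supplemental_filter=""):
    """Compose a site: query, excluding URL patterns already diagnosed."""
    parts = ["site:" + domain]
    if supplemental_filter:
        parts.append(supplemental_filter)
    parts.extend("-inurl:" + term for term in exclude_url_terms)
    return " ".join(parts)

# First pass: look at everything indexed for the domain.
print(audit_query("threadwatch.org"))
# Second pass: filter out the node URLs that dominated the first results.
print(audit_query("threadwatch.org", exclude_url_terms=("node",)))
```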

If you have duplicate content issues, at best you are splitting your PageRank, but you may also be hurting your crawl priorities. If Google thinks 90% of a site is garbage (or not worth trusting much), I am willing to bet that they also trust everything else on that domain a bit less than they otherwise would, and are more restrictive with their willingness to crawl the rest of the site. As noted in Wasting Link Authority on Ineffective Internal Link Structure, ShoeMoney increased his search traffic 1,400% after blocking some of his supplemental pages.
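As a toy illustration of the "splitting your PageRank" point (my own simplified model, not Google's actual math): when the same content is reachable at several URLs, the internal links pointing at it get spread across the copies instead of consolidating on one canonical page.

```python
# Toy model only: shows how link equity spreads across duplicate URLs
# versus consolidating on a single canonical URL. Not Google's real formula.

internal_links = 120   # hypothetical count of internal links to one piece of content
duplicate_urls = 4     # e.g. the post, a print version, a tag page, a feed URL

per_duplicate = internal_links / duplicate_urls
print(f"Spread across {duplicate_urls} duplicates: ~{per_duplicate:.0f} links each")
print(f"Consolidated on one canonical URL: {internal_links} links")
```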

Published: May 6, 2007


Comments

Halfdeck
May 7, 2007 - 7:19am

Slightly off topic, but Matt Cutts left an interesting comment on Rand's post:

"duplicate content doesn't make you more likely to have pages in the supplemental index in my experience. It could be a symptom but not a cause, e.g. lots of duplicate content implies lots of pages, and potentially less PageRank for each of those pages. So trying to surface an entire large catalog of pages would mean less PageRank for each page, which could lead to those pages being less likely to be included in our main web index."

He adds:

"I'm not aware of an explicit mechanism whereby duplicate content is more likely to be in our supplemental results, but I'm also happy to admit that as supplemental results are different from webspam, I'm not the expert at Google on every aspect of supplemental results."

Ken Savage
May 7, 2007 - 6:15pm

One problem I see is using tags on pages. I show a lot of supplemental results for pages like www.domain.com/tag/xxxxx

I rank for many more terms using tags with my WordPress blog, but the majority of them eventually fall into supplemental and suck PR. We're talking like 2,800+ tag pages.

Also the WordPress pages that are built for comment and post feeds: www.domain.com/xxxxxx/feed

Andy Beard
May 9, 2007 - 7:02pm

Google's reporting of supplemental results is a little messed up at the moment, although your toolbar also seems to report totally different numbers.

Let's look at your specific search method for SEOBook.com:

Duplicate content on SEOBook.com search

Now let's compare that to my site, which has a lot of duplicate content and much less total PageRank and fewer links:

Duplicate content on andybeard.eu search

There are some new things going on that also look fairly bugged: pages that appear to be duplicate content based on the toolbar PageRank export (they show grey) still rank for reasonably competitive keywords.
