View All Your Google Supplemental Index Results

[Update: use this supplemental ratio calculator. Google is selfish and greedy with their data, and broke all of the methods listed below because they wanted to make it hard for you to figure out which pages of your site they don't care for. ]

A person going by the nickname DigitalAngle left the following tip in a recent comment:

If you want to view ONLY your supplemental results you can use this command: site:www.yoursite.com *** -sljktf

Why Are Supplemental Results Important?

Pages that are in the supplemental index are placed there because they are trusted less. Since they are crawled less frequently and have fewer resources diverted toward them, it makes sense that Google does not typically rank these pages as high as pages in the regular search index.

Just as cache date can be used to gauge the relative health of a page or site, the percentage of the site stuck in supplemental results, and the types of pages stuck there, can tell you a lot about information architecture issues and link equity issues.

Calculate Your Supplemental Index Ratio:

To get your percentage of supplemental results, divide your number of supplemental results by your total result count:

site:www.yoursite.com *** -sljktf
site:www.yoursite.com
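
If you want the arithmetic spelled out, here is a minimal sketch of the ratio calculation. The counts are the ones you read off the two queries above, entered by hand (per the update at the top, Google has since broken these operators, so this cannot be automated):

```python
# Minimal sketch of the supplemental ratio math. Plug in the result counts
# you read off the two site: queries above by hand -- per the update at the
# top, Google has since broken these operators, so this cannot be automated.

def supplemental_ratio(supplemental_count, total_count):
    """Percentage of indexed pages stuck in the supplemental index."""
    if total_count == 0:
        return 0.0
    return 100.0 * supplemental_count / total_count

# Example: 1,200 supplemental results out of 4,000 total indexed pages.
print("%.1f%% supplemental" % supplemental_ratio(1200, 4000))  # 30.0% supplemental
```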

What Does My Supplemental Ratio Mean?

The size of the supplemental index and the pages included in it change as the web grows and Google changes their crawling priorities. It is a moving target, but one that still gives you a clue to the current relative health of your site.

If none of your pages are supplemental then you likely have good information architecture, and can put up many more profitable pages for your given link equity. If some of your pages are supplemental that might be fine, as long as those are pages that duplicate other content and/or are generally of lower importance. If many of your key pages are supplemental you may need to look at improving your internal site architecture and/or marketing your site to improve your link equity.

Comparing the size of your site and your supplemental ratio to similar sites in your industry may give you a good grasp on the upside potential of fixing common information architecture related issues on your site, what sites are wasting significant potential, and how much more competitive your marketplace may get if competitors fix their sites.

Google Using Search Engine Scrapers to Improve Search Engine Relevancy

If something ranks and it shouldn't, why not come up with a natural and easy way to demote it? What if Google could come up with a way to allow scrapers to actually improve the quality of the search results? I think they can, and here is how. Non-authoritative content tends to get very few natural links. This means that if it ranks well for competitive queries where bots scrape the search results it will get many links with the exact same anchor text. Real resources that rank well will tend to get some number of self reinforcing unique links with DIFFERENT MIXED anchor text.

If the page was ranking for the query because it was closely aligned with a keyword phrase that appeared in the page title, the internal link structure, and heavily on the page itself, then each additional scraper link could push the page closer and closer to the threshold of looking spammy, especially if it is not picking up any natural linkage.
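
To make the scraper-link argument concrete, here is a rough illustration of my own (not anything Google has published): measure how dominated a page's inbound anchor text is by a single phrase. The 0.6 threshold and the sample data are assumptions.

```python
# Illustrative only -- not Google's algorithm. If one exact anchor text
# dominates a page's inbound link profile, scraper links may be drowning
# out natural links with mixed anchor text. The threshold is an assumption.
from collections import Counter

def dominant_anchor_share(anchor_texts):
    """Fraction of inbound links that use the single most common anchor text."""
    counts = Counter(a.strip().lower() for a in anchor_texts)
    return max(counts.values()) / len(anchor_texts)

# 80 identical scraper anchors vs. 20 varied natural ones (made-up data).
anchors = ["cheap widgets"] * 80 + ["Acme widget review", "great widget shop",
                                    "widgets", "this post", "Acme Co"] * 4
if dominant_anchor_share(anchors) > 0.6:  # assumed threshold
    print("anchor profile looks scraper-heavy; seek varied editorial links")
```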

How to Protect Yourself:

  • If you tend to get featured on many scraper sites make sure you occasionally change your page titles on your most important and highest paying pages (a rotation sketch follows this list).
  • Write naturally, for humans, and not exclusively for search bots. If you are creating backfill content that leverages a domain's authority score, try to write articles like a newspaper. If you are not sure what that means, look at some newspapers. Rather than paying people to write articles optimized for a topic, pay someone who does not know much about SEO to write them. Tell them to ensure they don't use the same templates for the page titles, meta descriptions, and page headings.
  • Use variation in your headings, page titles, and meta description tags.
  • Filters are applied at different levels depending on domain authority and page level PageRank scores. By gaining more domain authority it should help your site bypass some filters, but that may also cause your site to be looked at with more scrutiny by other types of filters.
  • Make elements of your site modular so you can quickly react to changes. For example, many of my sites use server side includes for the navigation, which allows me to make the navigation more or less aggressive depending on the current search algorithms. Get away with what you can, and if they clamp down on you ease off the position.
  • Get some editorial deep links with mixed anchor text to your most profitable or most important interior pages, especially if they rank well and do not get many natural editorial votes on their own.
  • Be actively involved in participating in your community. If the topical language changes without you then it is hard to stay relevant. If you have some input in how the market is changing that helps keep your mindshare and helps ensure you match your topical language as it shifts.
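
As mentioned in the first tip above, rotating page titles is easy if your templates are modular. Here is a hedged sketch of one way to do it; the variant pool, URL, and weekly rotation period are all illustrative assumptions, not a prescribed setup:

```python
# Hedged sketch: keep a pool of title variants per page and rotate
# deterministically (here, by ISO week number), so scrapers that copy you
# keep snapshotting different titles. All names below are hypothetical.
import datetime

TITLE_VARIANTS = {
    "/home-loans.html": [
        "Home Loans - Compare Current Mortgage Rates",
        "Compare Home Loan & Mortgage Rate Quotes",
        "Mortgage Rates: Find the Right Home Loan",
    ],
}

def current_title(path, today=None):
    """Pick this week's title variant for the given page."""
    today = today or datetime.date.today()
    variants = TITLE_VARIANTS[path]
    week = today.isocalendar()[1]  # ISO week number, 1-53
    return variants[week % len(variants)]

print(current_title("/home-loans.html"))
```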

New Directory, URL, & Keyword Phrase Based Google Filters & Penalties

WebmasterWorld has been running a series of threads about various penalties and filters aligned with specific URLs, keyword phrases, and in some cases maybe even entire directories.

Some Threads:

There is a lot of noise in those threads, but you can put some pieces together from them. One of the best comments is from Joe Sinkwitz:

1. Phrase-based penalties & URL-based penalties; I'm seeing both.
2. On phrase-based penalties, I can look at the allinanchor: for that KW phrase, find several *.blogspot.com sites, run a copyscape on the site with the phrase-based penalty, and will see these same *.blogspot.com sites listed...scraping my and some of my competitors' content.
3. On URL-based penalties allinanchor: is useless because it seems to practically dump the entire site down to the dregs of the SERPs. Copyscape will still show a large amount of *.blogspot.com scraping though.

Joe has a similar post on his blog, and I covered a similar situation on September 1st of last year in Rotating Page Titles for Anchor Text Variation.

You see a lot more of the auto-gen spam in competitive verticals, and having a few sites that compete for those types of queries helps you see the new penalties, filters, and re-ranked results as they are rolled out.

Google Patents:

Google filed a patent application for Agent Rank, which is aimed at allowing them to associate portions of page content, site content, and cross-site content with individuals of varying degrees of trust. I doubt they have used this much yet, but the fact that they are even considering such a thing should indicate that many other types of penalties, filters, and re-ranking algorithms are already at play.

Some Google patents related to phrases, as pointed out by thegypsy here:

Bill Slawski has a great overview post touching on these patent applications.

Phrase Based Penalties:

Many types of automated and other low quality content creation cause the low quality pages to barely be semantically related to the local language, while other types of spam generation cause low quality pages to be too heavily aligned to the local language. Real content tends to fall within a range of semantic coverage.

Cheap or automated content typically tends to look unnatural, especially when you move beyond comparing words to looking at related phrases.

If a document is too far off in either direction (not enough OR too many related phrases) it could be deemed as not relevant enough to rank, or a potential spam page. Once a document is flagged for one term it could also be flagged for other related terms. If enough pages from a site are flagged a section of the site or a whole site can be flagged for manual review.
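
As a toy illustration of that idea (the related-phrase list and the bounds are my assumptions, not values from the patents), flagging works in both directions:

```python
# Toy illustration of phrase-based flagging -- the related-phrase list and
# the low/high bounds are assumptions, not values from Google's patents.

RELATED_PHRASES = ["interest rate", "down payment", "closing costs",
                   "credit score", "refinance", "fixed rate"]  # hypothetical

def related_phrase_count(text):
    text = text.lower()
    return sum(1 for p in RELATED_PHRASES if p in text)

def looks_unnatural(text, low=1, high=5):
    """Too few related phrases reads thin; too many reads stuffed."""
    n = related_phrase_count(text)
    return n < low or n > high

doc = "home loan home loan home loan best home loan deals"
print(looks_unnatural(doc))  # True: zero related phrases around 'home loan'
```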

URL and Directory Based Penalties:

Would it make sense to prevent a spam page on a good domain from ranking for anything? Would it make sense for some penalties to be directory wide? Absolutely. Many types of cross site scripting exploits and authority domain abuses (think rented advertisement folder, or other ways to gain access to a trusted site) occur at a directory or subdomain level, and have a common URL footprint. And cheaply produced content also tends to have section wide footprints, where only a few words are changed in the page titles across an entire section of a site.

I recently saw an exploit on the W3C. Many other types of automated templated spam leave directory wide footprints, and as Google places more weight on authoritative domains they need to get better at filtering out abuse of that authority. Google would love to be able to penalize things in a specific subdomain or folder without having to nuke the entire domain, so in some cases they probably do, and these filters or penalties probably affect both new domains and more established authoritative domains.

How do You Know When You are Hit?

If you had a page which typically ranked well for a competitive keyword phrase, and you saw that page drop like a rock, you might have a problem. Another indication of a problem is inferior pages ranking where your more authoritative page ranked in the past. For example, let's say you have a single mother home loan page ranking for a query where your home loan page ranked, but no longer does.

Textual Community:

Just like link profiles create communities, so does the type and variety of text on a page.

Search results tend to sample from a variety of interests. With any search query there are assumed common ideas that may be answered by a Google OneBox, related phrase suggestions, or answered based on the mixture of the types of sites shown in the organic search results. For example:

  • how do I _____
  • where do I buy a ____
  • what is like a _____
  • what is the history of ______
  • consumer warnings about ____
  • ______ reviews
  • ______ news
  • can I build a ___
  • etc etc etc

TheWhippinpost had a brilliant comment in a WMW thread:

  • The proximity, ie... the "distance", between each of those technical words, are most likely to be far closer together on the merchants page too (think product specification lists etc...).
  • Tutorial pages will have a higher incidence of "how" and "why" types of words and phrases.
  • Reviews will have more qualitative and experiential types of words ('... I found this to be robust and durable and was pleasantly surprised...').
  • Sales pages similarly have their own (obvious) characteristics.
  • Mass-generated spammy pages that rely on scraping and mashing-up content to avoid dupe filters whilst seeding in the all-important link-text (with "buy" words) etc... should, in theory, stand-out amongst the above, since the spam will likely draw from a mixture of all the above, in the wrong proportions.
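
A rough sketch of that last point (the word lists and proportions here are illustrative assumptions only): compute what share of a page's words falls into each class. A mashed-up spam page would score notably in every class at once, in proportions no single legitimate page type normally shows.

```python
# Rough sketch of the word-mix idea above. The vocabularies are tiny,
# made-up stand-ins for real word classes; only the shape of the idea
# matters: each page type has a characteristic mix of word classes.

WORD_CLASSES = {
    "tutorial": {"how", "why", "step", "first", "then"},
    "review": {"found", "surprised", "robust", "durable", "recommend"},
    "sales": {"buy", "price", "order", "shipping", "discount"},
}

def class_mix(text):
    """Share of the page's words falling into each word class."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    total = max(len(words), 1)
    return {name: sum(w in vocab for w in words) / total
            for name, vocab in WORD_CLASSES.items()}

page = "Buy now at a great price! I found it durable. Here is how, step by step."
print(class_mix(page))  # a mashed-up page lights up every class at once
```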

Don't forget that Google Base recently changed to require certain fields so they can help further standardize that commercial language the same way they standardized search ads to have 95 characters. Google is also scanning millions of books to learn more about how we use language in different fields.

Google Increases Search Result Personalization - Removes Notification

Google recently announced they are increasing user personalization. In the past they typically placed a "turn off personalized results" link whenever your results were personalized, but now they do not disclose when they are actually personalizing the results, so you don't know when they changed, which sucks. To see non-personalized results you have to log out of your Google account.

Now, instead of marking the results as personalized only when they change them, the results always say they are personalized.

Google's Paid Inclusion Model

BusinessWeek published an article about small advertisers being priced out of AdWords. Given quality score adjustments that may boost ads for sites with strong, trusted organic SEO, it is prohibitively expensive for many businesses to use AdWords unless they are already well trusted in organic search.

What Types of Sites Rank?

The sites which are already well represented in organic search typically fall into one or more of the following groups:

  • old
  • has many signs of authority (new and old links, repeat visitors, brand related search queries)
  • associated with a rich powerful offline entity
  • unique & remarkable

News Sites as God

News sites tend to fit all 4 of those categories, plus get additional easy linkage data by writing about current events and being included in select indexes like Google News, and have many other advantages. The bias toward promoting large trusted sites which are already overrepresented in the organic results starts to look even uglier when news outfits are also buying AdWords ads to promote their stories.

From the WSJ:

Britain's famously competitive newspapers have a new battleground: Google. ... Newspapers are buying search words on Google Inc. so that links to their Web sites pop up first when people type in a search. ... Paying to put their stories in front of readers by buying Google ads -- a practice the papers say has intensified in recent months -- is different from past marketing efforts

In spite of Google claiming otherwise, there is a direct connection between buying AdWords and ranking in the organic search results. If a news article is read by a few more people and gets just a few more links off the start it will become the default article about that topic and acquire many self reinforcing links.

Why Would Google Trust News Sites so Much?

  • Most news sites have some type of editorial controls and are typically hard for the average webmaster to significantly influence.
  • Most people think what they are told to (and the media is who tells us what to think about). Thus if Google returns a rough reflection of what we should think they are seen as relevant and unbiased.
  • Most news sites are associated with economic macro-parasites - not micro-parasites. Google is far more afraid of death by 1,000s of small cuts than by trusting any given domain too much.
  • It is mainstream media which makes up a large foundation of Google's index and power. Google is killing off many of the inefficient monopoly based business models, and is thus trying to throw the media scraps to keep the media bought into Google's worldview.
  • It is easier for Google to organize information if they allow their algorithms to cause their search results to parallel offline authority structures.

Crawl Delay Has Cost:

Danny Sullivan recently commented about how Digg outranked him for his own headline because his website is newer and less trusted than Digg.

Google has become less willing to regularly crawl and rank sites unless they are aged or have a significantly developed editorial link campaign associated with them. If your site gets indexed more slowly then your content needs to be much more remarkable to be linkworthy, thus if you have not built up significant trust this is a penalty / cost you need to overcome.

Sure Google may say they do not offer paid inclusion, but requiring a huge authoritative link profile or years of aging is associated with costs. They may not have paid Google directly, but Google's algorithms do require that most people pay, in one way or another.

And if you can't get past that crawl delay problem, you can always buy AdWords.

How Google Could Commoditize (Nearly) Everything

Is Google just a large ad broker with a search service they can target ads against? Or how might they commoditize many markets? The current trend at Google is that software and storage want to be free. As technology gets cheaper so will Internet access and other forms of communication. Google offers free VoIP tied into Gmail, has mentioned making cell phones free via mobile ads and iPods holding all the world's TV within 12 years, and is offering media companies packets of cash to keep their content on the web.

Google's main point of profit at the moment is ad sales, which is both highly inefficient and a fraction of what they could do.

Google Checkout:

Google leveraged search as a wedge against which they can sell targeted ads. Right now they are leveraging those ads to try to become a big online payment processor, by including Google Checkout buttons and $10 off coupons in the ads.

They think they can make payment processing faster and more efficient. Ads which have less slippage have greater value. But I seriously doubt that Google would want to stop at just making their ad network more efficient. Why would they?

Google has already launched a coupon program to tie together online and offline marketing, but what if they also attacked the online and offline divide via payment processing? The reason they started online is because that is where they already have leverage. Google talked about not competing with Paypal, but they offered a free month of service to try out Google Checkout for the holidays, and have already extended that holiday promotion another year.

Going Offline:

After they get enough lock-in, don't be surprised if they create a way to track offline transactions.

Most people in the US (and probably around the world) are in debt. Imagine if Google offered a coupon card or credit card. How many people would be willing to use a Google credit card if they offered the lowest interest rates or had other ways they could add value?

How Could Google Add Value?
After a period of charging an initial low interest rate (say 0%) Google could add value by providing health related precautions, related product recommendations, price comparisons, and reviews.

Health Information:
When Google created their Co-op they got many health authorities to participate. What if at the consumer level I could also input data, or I could sign into it when I signed my medical paperwork?

Related Product Recommendations:
Some of Amazon.com's recommendations are spot on. Imagine if Amazon had all their current customer purchase information and recent customer transactions, and were able to add your search history and media consumption history to that.

Your purchase history, media consumption history, and search history paint a vivid personality profile which must make it easy to target ads and product recommendations.

Price Comparisons:
What if cell phones had product scanners on them? Read John Battelle's the transparent (shopping) society.

Reviews:
Google

  • already offers a web comments plugin
  • structures data via Google Base, Google co-op, inline suggestions, and Google OneBox
  • pulls reviews from other sites for vertical search sites like Google local and Google movies, and
  • could probably just gather reviews directly if they wanted to.

Lock In:
If Google gets enough vendors to lock in they will also have the most complete database of where to find things, which will only grow with time due to network effects.

RFID & Inventory Management:

In the video Epic 2014 they make the case for a Google-Amazon tie-up, but I think Google will keep itself from carrying physical goods (as noted in August 2009: How Google beat Amazon and Ebay to the Semantic Web.), because they do not need to carry goods to influence the markets, and actually holding physical goods may limit their ability to collect market data.

Before locking in consumers with all those features they will try to get many merchants to commit as well. Imagine if Google offered virtually free RFID tracking and inventory management software which helped automate restocking. And, imagine how well they could recommend competing suppliers and offer ads which looked like discounts.

A True Market Maker:

Google could influence what information we are able to find, what ads we see, what publishers are paid for creating content, and grab a cut from any and every point in the supply chain, charging whatever rates they felt comfortable charging. If they could gain that much information they could even use it to trade commodities and derivatives. Who better to trade commodities than the business which is able to turn so many things into commodities?

How Google AdWords Ads Manipulate Google's Organic Search Results

SEO Question: I was thinking about buying Google AdWords and AdSense ads or placing AdSense on my site. Will doing any of these increase my link count, Google rankings, or rankings in other search engines?

Answer: PPC ads go through redirects, so they do not count toward your link popularity, but there are other ways to tie together PPC ads and organic search placement. Search engines claim there is no direct linkage between buying ads and ranking, but they only talk in ideals because it helps reinforce their worldview and helps them make more money.

Buying AdWords Ads

What They Won't Tell You:
Highly commercial keywords may have the associated editorial results go through more relevancy filters and/or be editorially reviewed for relevancy more frequently. Also, because they want people to click on the AdWords ads, there is a heavy informational bias in the organic search results.

I know some people with large ad spends who get notified of new ad system changes ahead of time, and others who get to give direct feedback that helps clean up search results and minimize unfair competition in the ad systems. So that is one type of crossover / feedback, but I think it tends to be rare; the more important crossover / feedback is an indirect one.

Just by Being Real
You can't really explain why and how everyone does what they do. Some people who find your product and enjoy it enough to leave glowing testimonials will even tell you that they don't know how they found it.

In the same way that targeted ads can lead to purchases, they can also lead to an increase in mindshare, brand, reach, usage data, and linkage data. Just by being real and being seen you are going to pick up quality signals. If you try to factor all of those into your ad buys most markets are still under-priced.

Cross Over Due to Buying AdWords:
A well thought out pay per click campaign can feed into your SEO campaign in more ways than I can count. Here are a few examples.

Integrating Offline & Online:
In a TV commercial Pontiac told people to search Google, and got a ton of press.

Big Controversial Ads:
Mazda quickly bid on Pontiac.

Many companies also have strong ties between the legal and marketing departments. If buying or selling an ad gets you sued and gets you in the news the value of the news coverage can far exceed the cost of the ads and legal fees.

Small Controversial Ads:
When I was newer to the field one friend called me the original link spammer. He meant it as a compliment, and I still take it as one. In much the same way I was an aggressive link builder, I was also quite aggressive at ad buying.

I caused controversy by buying the names of other people in the industry as AdWords ads. I was pretty much a total unknown when I did that, but some of the top names in the industry elevated my status by placing my name in heated discussions about what was and was not fair and reasonable.

You can always consider placing controversial / risky ideas or ads against your brand or competing brands as a way to generate discussion (but of course consider legal ahead of time).

Drafting Off New Words & Industry News:
When the nigritude ultramarine SEO contest started I bid on AdWords. Some people discussing the contest mentioned that I bid on that word. If an event bubbles up to gain mainstream coverage and you make it easy to identify your name as being associated with it then you might pick up some press coverage.

Industry buzz words that are discussed often have significant mindshare, get searched for frequently, and larger / bureaucratic competitors are going to struggle to be as plugged into the market as you are or react as quickly to the changing language.

Snagging a Market Position Early:
When a friend recommended I read the TrustRank research paper in February of last year I knew it was going to become an important idea (especially because that same friend is brilliant, helped me more ways and times than I can count, and was the guy who came up with the idea of naming the Google Dances).

I read it and posted a TrustRank synopsis. In addition to trying to build a bit of linkage for that idea I also ensured that I bought that keyword on AdWords. Today I rank #1 in Google for TrustRank, and I still think I am the only person buying that keyword, which I find fascinating given how many people use that word and how saturated this market is.

Buying AdSense Ads

Buying Ads Creates Content:
If your ads are seen on forums people may ask about your product or brands. I know I have seen a number of threads on SEO forums that were started with something like "I saw this SEO Book ad and I was wondering what everyone thought of it." Some people who start talking about you might not even click your ads.

Each month my brand gets millions and millions of ad impressions at an exceptionally reasonable price, especially when you factor in the indirect values.

Appealing to an Important Individual:
I have seen many people advertise on AdSense targeting one site at a time, placing the webmaster's name in the ad copy. It may seem a bit curt for some, but it is probably more likely to get the attention of and a response from a person than if you request a link from them.

Ads are another type / means of communication.

Appealing to a Group of People:
I get a ton of email relating to blogs and blogging. And in Gmail I keep seeing Pew Internet ads over and over and over again. Their ads range from Portrait of a Blogger to Who are Bloggers?

Going forward they will have added mindshare, link equity, and a report branded with that group of people. When people report on blogging or do research about blogging in the future the Pew report is likely to be referenced many times.

Selling AdSense Ads

Don't Sell Yourself Short:
Given the self reinforcing nature of links (see Filthy Linking Rich), anything that undermines your authority will cause you to get less free exposure going forward. So you really have to be careful with monetization. If you monetize for maximal clickthrough rate it will end up costing you a lot of trust and link equity.

There are other ways to improve your AdSense CTR and earnings without costing your credibility and authority.

More tips:

Don't Monetize Too Early:
Given the lack of monetization ability of a new site with few visitors, and the importance of repeat visits in building trust and mindshare, you don't want to monetize a new site too aggressively unless it is an ecommerce type site. It is hard to build authority if people view your site as just enough content to wrap around the AdSense.

Spam, Footprints, & Smart Pricing:
In the past search engines may have discounted pages that had poison words on them. Search is all about math / communication / patterns.

If your site fits the footprints of many spammy sites then your site might be flagged for review or reduced in authority. MSN did research on detecting spam via footprints, and link spam detection based on mass estimation shows how power laws could make it easy to detect such footprints.
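
The mass estimation paper is not reproduced here, but here is a minimal sketch of the power-law intuition (the data and thresholds are assumptions): rank a site's pages by inlink count and fit a line in log-log space. Natural link profiles tend to slope steeply, while a uniform templated footprint comes out nearly flat.

```python
# Minimal sketch of the power-law intuition only -- this is not the mass
# estimation algorithm from the cited research. Natural inlink profiles
# tend to follow a power law (steep negative log-log slope); a uniform
# profile across a templated section comes out nearly flat. Data is made up.
import math

def loglog_slope(inlink_counts):
    """Least-squares slope of log(inlinks) vs log(rank)."""
    ranked = sorted((c for c in inlink_counts if c > 0), reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

natural = [900, 300, 120, 60, 30, 14, 8, 5, 3, 2]      # power-law-like
templated = [11, 10, 10, 10, 9, 10, 10, 11, 10, 10]    # uniform footprint
print(loglog_slope(natural))    # steep negative slope
print(loglog_slope(templated))  # near zero
```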

Graywolf recently noted that landing page slippage may be an input into landing page and site quality scores for AdWords ad buyers. Google could also use AdSense account earnings or AdSense CTR data to flag sites for editorial reviews, organic search demotion, ad payout reduction, or smart pricing.

Google as the Default Web Host

Google today announced that they bought JotSpot (a wiki company). They recently purchased YouTube (the largest online video site). They already own Blogger. They run the default distributed automated ad platform (AdSense), are processing payments (Google Checkout), and provide one of the best free analytics products on the market. As Google worms their way onto more and more websites, and owns the platforms on which more and more media is consumed, they are going to be able to create a much better web graph than competing companies.

Google will nearly immediately know what parts of the web are active, when they are active, how they are active, and why they are active. Tie that up with things like Gmail and Google Custom Search, and they have yet another way to see what people are referencing, looking for, and how quickly markets are growing. A big advantage over the competition for a company that is essentially an ad platform recommendation engine.

Killing Google PageRank: Making Relevancy Irrelevant

This is old news, but a while ago on TW I posted that UPI, a 100 year old company, was overtly selling PageRank, even mentioning PageRank on their advertisement pages. Search works so well because engines measure relevancy using things that are hard to manipulate, or things that people wouldn't generally think to manipulate. Thus, if a 100 year old slow moving company is doing something, you know the method of relevancy they aim to manipulate is likely already dead.

Google will likely filter out overt link buys like this ("Link Spam"), especially when they are marketed this aggressively on Google's own ad network ("Buy PageRank from UPI"). If a link buy is so overt that people would talk about it, then an engineer or algorithm has probably caught it already. But that sort of example can be seen as a proxy for the market as a whole, and Google has also significantly lowered the weighting on raw PageRank scores over the past few years, because too many people know about it and manipulate it. Just looking at PageRank is nearly as useless as a meta keywords tag.

Get a Top Ranking in Google in 1 Day for Free

Google recently launched their Google Customized Search Engine, which allows webmasters to easily integrate Google search results into their site while also giving webmasters editorial control to bias the results.

Webmasters can bias the results by harnessing the power of Topic Sensitive PageRank, tag relevant results, allow editors or users to tag relevant results, and select a seed set of sites to search against or bias the results toward (and sites to remove from the results).

Surely some shifty outfits will use this as a way to show their ranking success, but this also makes me wonder what the net effect on Google's brand will be if people see powered by Google on sites which provide terrible relevancy, or results that are obviously biased toward racism or other horrific parts of humanity. Will searchers learn to trust search less when they start seeing different Google results all over the web? Or will anyone even notice?

Will most people be willing to subscribe to relevancy which reinforces their current worldview?

This release essentially will make Google the default site search on millions of websites, which is great for Google given the volume of site level search. I still think Google's stock is priced ahead of itself trading on momentum and short covering, but this release gives Google a bunch more inventory and further establishes them as the default search platform.

By allowing webmasters to easily integrate results biased toward internal content, backfilling the results with other content when the site does not meet all of a searcher's needs, and then allowing the delivery of profitable relevant ads near the content, Google is paying webmasters in numerous highly automated ways that build great value by being layered on top of one another.

I also have to think this is going to further place a dent in the business model of running directories, or other sites with thin content that do not add much editorial value to the subject they talk about. This blend of editorial and algorithms is invariably going to kill off many editorial only listing companies.

As an SEO, I think this customized tool can also be used to test the depth and authority of a site relative to others in its group, by allowing you to bias the results toward multiple similar seed sites and see which pages on those sites Google promotes most. This could even be used as a tool to help you determine which domain has more ranking potential when you are comparing a couple of domains you are thinking of buying.
