Matt Cutts on Using Search Usage Data to Fight Spam

Jun 28th

A couple of weeks back we mentioned that Google's Peter Norvig stated that Google does not use search usage data directly in its relevancy algorithms. Yesterday Matt Cutts published a post on the official Google blog stating that Google does look at search logs and usage data to determine how large spam attacks are and how well new anti-spam measures are working:

Data from search logs is one tool we use to fight webspam and return cleaner and more relevant results. Logs data such as IP address and cookie information make it possible to create and use metrics that measure the different aspects of our search quality (such as index size and coverage, results "freshness," and spam).

Whenever we create a new metric, it's essential to be able to go over our logs data and compute new spam metrics using previous queries or results. We use our search logs to go "back in time" and see how well Google did on queries from months before. When we create a metric that measures a new type of spam more accurately, we not only start tracking our spam success going forward, but we also use logs data to see how we were doing on that type of spam in previous months and years.
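To make the "back in time" idea concrete, here is a minimal Python sketch of backfilling a new spam metric over stored logs. It is only an illustration under loud assumptions: the log records, the `is_spam` rule, and the `spam_rate_by_month` helper are all hypothetical and say nothing about how Google actually stores logs or classifies spam.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log records: (query date, query text, URLs shown in the results).
LOGS = [
    (date(2008, 1, 15), "cheap pills", ["a.example", "spam1.example", "b.example"]),
    (date(2008, 1, 20), "free ringtones", ["spam1.example", "spam2.example"]),
    (date(2008, 5, 3), "cheap pills", ["a.example", "b.example", "c.example"]),
    (date(2008, 5, 9), "free ringtones", ["d.example", "spam2.example"]),
]


def is_spam(url):
    """Stand-in for a newly created spam classifier (purely illustrative)."""
    return url.startswith("spam")


def spam_rate_by_month(logs, classifier):
    """Apply a classifier to old results and report the spam share per month."""
    shown = defaultdict(int)  # results shown in each (year, month)
    spam = defaultdict(int)   # results the classifier flags in each (year, month)
    for day, _query, urls in logs:
        month = (day.year, day.month)
        for url in urls:
            shown[month] += 1
            if classifier(url):
                spam[month] += 1
    return {month: spam[month] / shown[month] for month in sorted(shown)}


rates = spam_rate_by_month(LOGS, is_spam)
for (year, month), rate in rates.items():
    print(f"{year}-{month:02d}: {rate:.0%} of logged results flagged as spam")
```

Because the classifier is passed in as a function, a metric defined today can be run over results logged months earlier, which is the kind of retroactive comparison the quote describes.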

Published: June 28, 2008


Comments

June 28, 2008 - 6:32pm

In particular, I'd love to hear some ideas on this sentence:

"...go over our logs data and compute new spam metrics using previous queries or results."

I wonder what some specific "spam metrics" flags might be.

Anyone have thoughts?

June 29, 2008 - 10:18am

Maybe things like the percent of doorway pages and the percent of pages with affiliate links on them.

Plus they use human editors to review pages, so presumably they can look at their old search logs and see how they manually classified pages (spam or not). Then, when they roll out a new algorithm, they can check whether pages they manually classified as spam in the past still show up or whether they get flagged or buried by the new algorithm.

June 30, 2008 - 8:17am

What about the number of AdSense blocks? Maybe that, like affiliate links, could be a way to check for spam. Bill Slawski talked about this a while back in connection with a Yahoo patent he saw.

June 30, 2008 - 1:23pm

Hi Will,
For larger sites, Google will not treat the number of AdSense blocks or paid links as a signal of spam. Just look at Business.com or About.com as examples of sites where telling ads apart from editorial content is nearly impossible.
