Matt Cutts on Using Search Usage Data to Fight Spam

A couple of weeks back we mentioned that Google's Peter Norvig stated that Google does not use search usage data directly in its relevancy algorithms. Yesterday Matt Cutts posted on the official Google blog that Google does look at search logs and usage data to determine how large spam attacks are and how well new anti-spam measures are working:

Data from search logs is one tool we use to fight webspam and return cleaner and more relevant results. Logs data such as IP address and cookie information make it possible to create and use metrics that measure the different aspects of our search quality (such as index size and coverage, results "freshness," and spam).

Whenever we create a new metric, it's essential to be able to go over our logs data and compute new spam metrics using previous queries or results. We use our search logs to go "back in time" and see how well Google did on queries from months before. When we create a metric that measures a new type of spam more accurately, we not only start tracking our spam success going forward, but we also use logs data to see how we were doing on that type of spam in previous months and years.
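
To make the "back in time" idea concrete, here is a rough sketch of backfilling a newly defined spam metric over archived logs. Everything in it is an assumption for illustration: the log layout, the host list, and the names ARCHIVED_LOGS, SPAM_HOSTS, is_spam, and spam_rate_by_month are hypothetical, and nothing here reflects how Google actually stores logs or detects spam.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical archived log rows: (month, query, URLs shown in the results).
ARCHIVED_LOGS = [
    ("2008-03", "cheap pills", ["http://spam-a.example/p", "http://good.example/x"]),
    ("2008-03", "free ringtones", ["http://spam-b.example/r", "http://good.example/y"]),
    ("2008-04", "cheap pills", ["http://good.example/p", "http://good.example/x"]),
    ("2008-04", "free ringtones", ["http://spam-b.example/r", "http://good.example/q"]),
]

# Hypothetical "new metric": a result counts as spam if its host is on this list.
SPAM_HOSTS = {"spam-a.example", "spam-b.example"}


def is_spam(url):
    """Return True if the URL's host is on the (assumed) spam host list."""
    return urlparse(url).hostname in SPAM_HOSTS


def spam_rate_by_month(logs, classifier):
    """Apply a classifier to old results and return the spam share per month."""
    shown = defaultdict(int)
    spam = defaultdict(int)
    for month, _query, urls in logs:
        for url in urls:
            shown[month] += 1
            if classifier(url):
                spam[month] += 1
    return {month: spam[month] / shown[month] for month in sorted(shown)}


# Backfill: the metric is new, but it is computed over months-old results.
history = spam_rate_by_month(ARCHIVED_LOGS, is_spam)
for month, rate in history.items():
    print(month, f"{rate:.0%}")
```

The point of the sketch is only that a metric defined today can be evaluated against results served months ago, provided the old queries and results were kept.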

Published: June 28, 2008 by Aaron Wall in google


June 28, 2008 - 6:32pm

In particular... I'd love to hear some ideas on this sentence:

"...go over our logs data and compute new spam metrics using previous queries or results."

I wonder what some specific "spam metrics" flags might be?

Anyone have thoughts?

June 29, 2008 - 10:18am

Maybe things like the percent of doorway pages and the percent of pages with affiliate links on them.

Plus they use human editors to review pages, so presumably they can look at their old search logs and see how they manually classified pages (spam or not). Then when they roll out a new algorithm, they can check whether pages they manually classified as spam in the past still show up, or whether they get flagged or buried by the new algorithm.

June 30, 2008 - 8:17am

What about the number of AdSense blocks? Maybe that, like affiliate links, could be a way to check for spam. Bill Slawski talked about this a while back in connection with a Yahoo patent he saw.

June 30, 2008 - 1:23pm

Hi Will,
For larger sites, Google will not consider the number of AdSense blocks or paid links as a signal of spam. Just look at or as examples of where determining the difference between ads and editorial is nearly impossible.
