Helpful Robots.txt Tip

May 17th

When creating a robots.txt file, if you specify rules for all bots (using a *) and then specify rules for a particular bot (like Googlebot), search engines tend to ignore the broad rules and follow only the ones you defined specifically for them. From Matt Cutts:

If there's a weak specification and a specific specification for Googlebot, we'll go with the one for Googlebot. If you include specific directions for Googlebot and also want Googlebot to obey the "generic" directives, you'd need to include allows/disallows from the generic section in the Googlebot section.

I believe most, if not all, search engines interpret robots.txt this way: a more specific directive takes precedence over a weaker one.
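
As a rough illustration of that rule, here is a minimal sketch that uses Python's standard urllib.robotparser module as a stand-in for a search engine's parser. It is not Google's implementation, and the paths and the SomeOtherBot name are made up for the example. The point is that Googlebot reads only its own section, so /archives/ stays open to it unless the rule is repeated there.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /archives/

User-agent: Googlebot
Disallow: /private/
"""


def parse_robots(text):
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)

# A bot with no section of its own falls back to the * rules.
generic_archives = parser.can_fetch("SomeOtherBot", "/archives/2007/")

# Googlebot has its own section, so only those rules apply to it.
googlebot_archives = parser.can_fetch("Googlebot", "/archives/2007/")
googlebot_private = parser.can_fetch("Googlebot", "/private/page.html")

print("SomeOtherBot may fetch /archives/:", generic_archives)
print("Googlebot may fetch /archives/:", googlebot_archives)
print("Googlebot may fetch /private/:", googlebot_private)
```

To make Googlebot obey the generic archive rule as well, the Disallow: /archives/ line would have to be copied into the Googlebot section, as the quote describes.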

Published: May 17, 2007

Comments

David
May 18, 2007 - 5:40pm

Thanks for the tip, Aaron. It reinforces what I have been learning this week about robots.txt while working on preventing duplicate content on a WordPress blog.

One interesting thing I learned about giving specific commands to certain bots applies to AdSense publishers. If you deny all bots access to, for instance, the archives section, then you should write a specific directive for the AdSense bot to allow it. That way you don't end up with untargeted ads on those pages.
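
A sketch of that setup, assuming the AdSense crawler identifies itself as Mediapartners-Google and that the archives live under a hypothetical /archives/ path. Python's urllib.robotparser stands in for the real crawlers here, so treat the result as an illustration rather than a guarantee of how each bot behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the archives for everyone, then give the
# AdSense crawler its own section with an empty Disallow (nothing blocked).
ADSENSE_ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/

User-agent: Mediapartners-Google
Disallow:
"""

adsense_parser = RobotFileParser()
adsense_parser.parse(ADSENSE_ROBOTS_TXT.splitlines())

archive_url = "/archives/2007/05/"  # made-up archive path
search_bot_allowed = adsense_parser.can_fetch("Googlebot", archive_url)
adsense_bot_allowed = adsense_parser.can_fetch("Mediapartners-Google", archive_url)

print("Googlebot may fetch the archives:", search_bot_allowed)
print("Mediapartners-Google may fetch the archives:", adsense_bot_allowed)
```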

Greg
May 18, 2007 - 7:07pm

Great tip, David. I didn't think about AdSense as its own bot.

Mack
May 18, 2007 - 9:15pm

Awesome tip Aaron. I was formatting my robots.txt all wrong.

Josh
May 21, 2007 - 10:43pm

Robots.txt validators that I've used indicate that the rules for specific robots should come first and the wildcard rules should go last. I'm not sure if it's in the specification, but I try to do it that way. Some robots may just look for the first rule that matches -- so I don't want them seeing the wildcard first.
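
To make the ordering concern concrete, here is a small sketch comparing two parsers on the same rules in both orders. The first_match_allowed function is a hypothetical, deliberately naive crawler written for this example (it is not any real bot's logic); the other check uses Python's urllib.robotparser, which picks the specific section regardless of order.

```python
from urllib.robotparser import RobotFileParser

SPECIFIC_FIRST = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""

WILDCARD_FIRST = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /private/
"""


def stdlib_allowed(robots_txt, agent, path):
    """Ask Python's standard parser whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


def first_match_allowed(robots_txt, agent, path):
    """Hypothetical naive crawler: obeys the first matching section in file order."""
    groups = []  # (user-agent token, list of disallowed path prefixes)
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            groups.append((value.lower(), []))
        elif field == "disallow" and groups and value:
            groups[-1][1].append(value)
    for token, prefixes in groups:
        if token == "*" or token in agent.lower():
            return not any(path.startswith(prefix) for prefix in prefixes)
    return True  # no section matched, so nothing is blocked


files = (SPECIFIC_FIRST, WILDCARD_FIRST)
stdlib_results = [stdlib_allowed(text, "Googlebot", "/blog/") for text in files]
naive_results = [first_match_allowed(text, "Googlebot", "/blog/") for text in files]

print("order-independent parser:", stdlib_results)
print("naive first-match parser:", naive_results)
```

With the wildcard section first, the naive parser stops at the * rules and treats /blog/ as blocked for Googlebot, which is the failure that listing specific bots first guards against.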

January 22, 2009 - 12:01pm

Nice work, Aaron. This will be a great help to people who aren't used to working with robots.txt.
