Is ignoring the robots.txt file an accident, or a normal feature at Google?
I have a rather small blog, with about 1,000 posts on it. Google is showing 5,000 pages from my site in its index. Some of my normal pages are already not being cached because Google is crawling my site less aggressively after seeing no unique content on the pages where THEY IGNORE THE ROBOTS.TXT PROTOCOL. Pretty evil shit, Google.
Now I need to figure out how to do some search engine friendly cloaking, or somehow issue Googlebot 403 errors when it tries to spider those URLs. Way to suck, Googlebot.
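The 403 route is at least easy to sketch: check the User-Agent and the requested path before the page is served, and refuse the request when both match. Here is a minimal sketch written as a Python WSGI middleware; the path prefixes and names are hypothetical stand-ins for whatever your robots.txt disallows, and on a typical Apache-hosted MovableType or TypePad install the equivalent check would more likely live in the server config than in application code.

```python
# Hypothetical sketch: return 403 to Googlebot for URL patterns that
# robots.txt already asks it to skip. Assumes the blog is fronted by a
# Python WSGI app; the blocked prefixes below are made-up examples.

BLOCKED_PREFIXES = ("/archives/monthly/", "/cgi-bin/mt/")  # hypothetical paths

def block_googlebot(app):
    """Wrap a WSGI app and refuse Googlebot requests to blocked paths."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        path = environ.get("PATH_INFO", "")
        if "Googlebot" in user_agent and path.startswith(BLOCKED_PREFIXES):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return middleware
```

Sniffing the User-Agent is crude (anything can claim to be Googlebot), but since the goal is just to keep a misbehaving crawler out of duplicate archive pages rather than to secure anything, it is good enough here.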
Perhaps this issue would have been noticed and addressed by a MovableType employee if they didn't have blind trust in search engines and think all SEOs are scum.
Many TypePad-hosted and MovableType sites are being screwed / only partially indexed because of this problem. MovableType owes it to its paid customers to ensure problems like these are not happening.