Anand Rajaraman recently spoke with Peter Norvig, who revealed that:
- Google's best machine learning algorithms are already as good as, and sometimes better than, their current hand-rolled relevancy algorithms
- but they still prefer to use their hand-rolled algorithms because of hubris, and because they feel that machine learning algorithms may be more prone to catastrophic errors on searches that do not look much like those in the training set
I think a third piece (that you will never hear Google employees admit to) is that as the web's structure changes, Google feels they have to use FUD to police the web and help ensure Google has revenue entry points into important markets. In their 2007 search quality rater guidelines, Google used a typical Commission Junction link as an example of a sneaky redirect. It is doubtful that Google would ever do that with AdSense code or a Performics link (since they own those).
In the follow-up post about his chat with Peter Norvig, Anand highlighted how Google measures relevancy. In the post he stated why Google prefers internal review data over direct usage data:
Peter confirmed that Google does collect such [usage] data, and has scads of it stashed away on their clusters. However -- and here's the shocker -- these metrics are not very sensitive to new ranking models! When Google tries new ranking models, these metrics sometimes move, sometimes not, and never by much. In fact Google does not use such real usage data to tune their search ranking algorithm.
Exposure from top rankings already creates a self-reinforcing effect because of the power of defaults. Tying search usage data directly into relevancy might not add much benefit for searchers, especially as more people click on the first search result. Anand further explained why direct usage data is not used to refine Google's relevancy algorithms:
The first is that we have all been trained to trust Google and click on the first result no matter what. So ranking models that make slight changes in ranking may not produce significant swings in the measured usage data. The second, more interesting, factor is that users don't know what they're missing.
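To make the first factor concrete, here is a toy sketch of why a click-based metric can barely move when searchers mostly click by position. Everything in it is an assumption made for illustration: the `POSITION_BIAS` numbers, the relevance scores, the `trust` parameter, and the choice of mean clicked position as the usage metric. None of it comes from Google or from Anand's post.

```python
# Assumed share of attention each rank position gets (illustrative only).
POSITION_BIAS = [0.60, 0.20, 0.10, 0.06, 0.04]


def click_shares(relevance_by_rank, trust):
    """Expected share of clicks per rank position.

    trust=1.0: searchers click purely by position.
    trust=0.0: searchers click purely in proportion to relevance.
    """
    total = sum(relevance_by_rank)
    return [trust * bias + (1.0 - trust) * rel / total
            for bias, rel in zip(POSITION_BIAS, relevance_by_rank)]


def mean_clicked_position(relevance_by_rank, trust):
    """A simple usage metric: the average rank position that gets clicked."""
    shares = click_shares(relevance_by_rank, trust)
    return sum(pos * share for pos, share in enumerate(shares, start=1))


def metric_shift(old_ranking, new_ranking, trust):
    """How far the usage metric moves when the ranking model changes."""
    return abs(mean_clicked_position(new_ranking, trust)
               - mean_clicked_position(old_ranking, trust))


# Hypothetical relevance scores, listed in the order each model ranks them.
OLD_RANKING = [0.9, 0.5, 0.7, 0.3, 0.2]  # results 2 and 3 are out of order
NEW_RANKING = [0.9, 0.7, 0.5, 0.3, 0.2]  # the new model fixes that swap

for trust in (0.0, 0.5, 0.9):
    shift = metric_shift(OLD_RANKING, NEW_RANKING, trust)
    print(f"trust={trust:.1f}  shift in mean clicked position={shift:.4f}")
```

In this model the shift shrinks in direct proportion to `1 - trust`, so the more searchers default to the top result, the less a genuine ranking improvement shows up in the usage data.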