How Does Google Work?

This image might need to be updated in the years to come, but it does a great job of laying out how Google works when you type a query into its search engine. Search is so easy to do that it is hard to appreciate how complex it is unless you take a look under the hood, which is exactly what this graphic does :D

Click the image to get the full-sized, beefy version :D
How Google Works.

A side benefit of this graphic is that it should help prospective clients realize how complex SEO & PPC campaigns can be. So if anyone is trying to be an el cheapo with their budget, you can use this to remind them how complex search is, and thus how time-consuming and expensive a proper search marketing campaign is.

Published: June 30, 2010 by Aaron Wall in google

Comments

seoinabruzzo
June 30, 2010 - 12:29pm

Hi Aaron,
The infographic is nice, but I'm sorry to say that Google can still index your pages even if you exclude them via robots.txt.

Matt Cutts published a video about this some time ago, and I tested this "weakness" personally.

July 1, 2010 - 3:34am

Yes, if you block a page via robots.txt they can still list that page in their search results, but it will be listed URL only.

If you block a page via robots.txt exclusion then they will list it URL only or pull in data from DMOZ or use your inbound anchor text data, but they won't return your page content.

If you block a page via a robots meta tag they won't list it in their search results.

If you block a page via both, then they won't be able to see the robots meta tag, because the robots.txt exclusion prevents them from crawling the page to read it. So if you use both, the result is the same as using robots.txt alone.
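The three cases above can be sketched in a few lines of Python. This is only a toy model of the behavior described in this comment, not Google's actual logic: the function name `listing_outcome`, the outcome labels, and the example.com URLs are made up for illustration. The robots.txt check itself uses the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def listing_outcome(robots_txt: str, url: str, has_noindex_meta: bool,
                    agent: str = "Googlebot") -> str:
    """Toy model of the cases described above (hypothetical labels)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    if not parser.can_fetch(agent, url):
        # The crawler never fetches the page, so it never sees a robots
        # meta tag. The URL can still be listed, but without page content.
        return "url-only"
    if has_noindex_meta:
        # The page was crawled and the noindex directive was seen.
        return "not-listed"
    return "full-listing"


# Hypothetical robots.txt that blocks one directory for all crawlers.
ROBOTS = """User-agent: *
Disallow: /private/
"""

for path, noindex in [("/private/page.html", False),
                      ("/private/page.html", True),
                      ("/public.html", True),
                      ("/public.html", False)]:
    print(path, noindex, listing_outcome(ROBOTS, "https://example.com" + path, noindex))
```

Note how the blocked page comes back as "url-only" whether or not it carries a noindex tag, which is the point of the last paragraph above.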

seoinabruzzo
July 1, 2010 - 8:06am

"If you block a page via a robots meta tag they won't list it in their search results."

They returned the URL in the SERP even with a disallow in the robots.txt.

Regarding this, Matt said that if there is even one URL that points to that page, there is a chance the robots.txt exclusion is ignored.

This is absurd, and I don't find it fair; a lot of people don't find it fair ... but they make the rules.

Wesley LeFebvre
July 3, 2010 - 5:18pm

Hi seoinabruzzo,
It seems you're still talking about the robots.txt file.

Aaron was talking about the page/URL being completely removed only when you block it via a meta tag on that specific page.

<META NAME="ROBOTS" CONTENT="NOINDEX">

Although, I haven't had the need to verify this myself in quite some time.
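For anyone who wants to check whether a page actually carries that tag, here is a minimal sketch using Python's standard `html.parser`. The class name `RobotsMetaParser` and the helper `has_noindex` are hypothetical names for this example, and the sketch only looks at the generic `robots` meta name, not crawler-specific names or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a robots meta tag with a noindex directive was seen."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX"></head><body></body></html>'
print(has_noindex(page))
```

This only tells you the tag is present in the HTML; a crawler still has to be allowed to fetch the page in order to see it.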

seoinabruzzo
July 3, 2010 - 5:54pm

... I was talking about the infographic ppcblog did, and the second box. The box states that it is possible to block URL indexing via robots.txt ... and I just said that's not strictly true.

That's all.

kalinvasilev
June 30, 2010 - 2:30pm

...you can use this to remind them how complex search is, and thus how time consuming and expensive a proper search marketing campaign is....

I'm not sure this would be a valid argument though. That SEM is complex is true, but that's not necessarily because search algos are complex.

When you plan an SEM campaign, do you think of all that? Probably not, but you've got another thousand things to think of, such as the product niche, buying intent, landing page or site usability, keyword research, localization, you name it.

When I'm driving my car I never think about how the pistons and other parts in the engine work; I have other stuff to care about. And if you are a taxi driver I don't think your customers would care a bit about how complex your engine is; all they care about is the value they can have.

July 1, 2010 - 3:38am

Illustrating how complex search is shows how hard it can be to manipulate it, and perhaps many things one should consider in their strategy. I am not saying you have to do everything all the time, but manipulating a search engine is a lot different than driving a car. When you drive a car there are signs and lanes, etc. When you do SEO you need to know how aggressive you should be, and be able to read what you can do. While some of the search guidelines tell you that the speed limit is 5 miles an hour and that you should drive on the right side of the road, sometimes you should go 50 mph on the left side of the road.

Sure people care about value. But most people still want to get a great deal, and for every customer who gets the value of search and wants to pay a fair livable wage for the services, there are 10 or 20 people who want to buy Manhattan real estate for $10,000.

hugoguzman
June 30, 2010 - 3:08pm

I'm also in a nitpicky mood like seoinabruzzo, so I feel that it's worth mentioning that, based on multiple experiments that I've run, Google will not index a page based purely on it being included in an XML sitemap.

If said page does not also have at least one inbound link (or is part of a blog/atom feed) then Google will ignore said page even if you submit via XML sitemap. There's even some verbiage somewhere on Google that mentions how XML sitemap inclusion will not override Google's standard crawling/indexing methods.

That said, this is still a pretty cool graphic explaining just how complicated search can be.

July 1, 2010 - 3:44am

Hi Hugo
Agree 100%. And that is part of the reason why we added "deeply or regularly" rather than just trying to make it black/white yes/no...we were trying to highlight how authority and PageRank yield a deeper and more comprehensive and fresher crawl as the site gains trust.

jrotman
July 1, 2010 - 3:01am

Aaron, Despite the nitpickers...I think this is an awesome piece of content depicting the Google "megalopolis." You guys outdid yourselves on this one.

July 1, 2010 - 3:45am

Thanks Jen :D

We miss you bunches over here! Thanks for all the great help over the years!

hugoguzman
July 1, 2010 - 1:29pm

Sorry to be a nitpicker, Aaron! I figured you would agree, though.

Small Business ...
July 1, 2010 - 8:45pm

Despite the commentary above, this entry was nonetheless helpful. All SEO specialists are trying to discover the secrets of major search engines to optimize, and any tips that break it down are good information to have! Thanks.

seopractices
July 2, 2010 - 3:50pm

Google still crawls and displays it in search results even though it has been blocked by robots. This is an example:

We built a new site: bestwallsolutions.com, and the developer blocked it via robots.txt while building the site. We forgot about it. I placed a link to the site from my site,
seopractices.com, with the anchor text: Decoracion de Paredes Interiores, and that is what Google is displaying now when you search for: bestwallsolutions (I just happened to remove the blocking from the robots metatag today).

spode
July 2, 2010 - 6:10pm

To me, there is no rhyme or reason to Google's approach to indexing. I probably have 10 websites out there and they bounce around page ranks on an hourly basis. I have had my website in 2nd place on the first page of Google and then 4 hours later it's on page 4. I try to do a lot of SEO on a weekly basis on my top sites, and what works for one site doesn't work for the other. I have a website that has 22 different pages that is on page 2 and another website in the same niche that is on page 1 and has very little info on it. The more time and effort I put into it, the more Google drives me nuts. Great blog post and very informative. We all will continue to chip away :-)

massa
July 4, 2010 - 9:58am

Bravo!
One of the most unique and highest quality contributions to the online promotion community in a long time!

In my opinion, the possible imperfection of specific details only illustrates the extreme value of your offering. Were it not for the courage to put your neck in the noose (you KNOW Google only follows one set of protocols and that is their own, so you also knew there would be challengers to the reported data), there would never have been the opportunity to openly discuss those discrepancies in a way that exposes those unaware to some of the things that are the most frustrating to the less initiated.

Typically, discussions of things which absolutely do happen that go against conventional wisdom at best, or out-and-out self-serving, promotional bullshit at worst, end up in one more circular screaming match with no real resolution. Yet you have, again, delivered real value while pulling the community a little further together instead of pushing it further apart.

massa gives wall a standing ovation.

Peach Out Y'all
massa
superior organic placement without
conversion is the epitome of futility
