Wasting Link Authority on Ineffective Internal Link Structure

If you put one form of internal navigation in parallel with another, you are essentially telling search engines that both paths, and the subset pages they lead to, are of equal significance. Many websites likely lose 20% or more of their potential traffic to sloppy information architecture that does not consider search engines.

Many people believe that having more pages is always better, but ever since Google got more aggressive with duplicate content filters and started using minimum PageRank thresholds to set index inclusion priorities, that couldn't be further from the truth. Shoemoney increased his Google search traffic 1400% this past month by PREVENTING some of his pages from being indexed. Some types of filtering are good for humans while being wasteful for search engines: some people may like to sort through products by price level or look at different sizes and colors, but pages that are nearly identical except for price point, size, model number, or item color create near duplicate content that search engines do not want to index.
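For example, if those filtered views live in the URL as query string parameters, a robots.txt sketch along these lines keeps them out of the index while the core category pages stay crawlable. The parameter names here are hypothetical, and the * wildcard is a pattern matching extension Googlebot supports but some other crawlers ignore, so test against your own URLs before relying on it:

User-agent: *
# hypothetical filter parameters that produce near duplicate pages
Disallow: /*price=
Disallow: /*size=
Disallow: /*color=
Disallow: /*sort=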

If you waste link equity getting low value, noisy pages indexed, your high value pages will not rank as well as they could. In some cases getting many noisy navigational pages indexed could put your site on a reduced crawling status (a shallower or less frequent crawl) that may prevent some of your higher value, long tail, brand specific pages from getting indexed.

More commonly, searches that have some sort of filter associated with them will be tied to specific brands rather than to how we sort through those brands by price point. The ads for those brand terms also tend to be more expensive.

The reason brands exist is that they are points of differentiation that allow us to charge non-commodity prices for commodities. That associated profit margin and marketing driven demand is why there is typically so much more money in branded terms than in other non-brand filters.

When designing your site's internal link structure, make sure that you are not placing noisy, low value pages and paths in parallel with, or above, higher value paths and pages.

Published: February 2, 2007 by Aaron Wall in seo tips

Comments

Nick
February 5, 2007 - 5:35pm

Hey Aaron,

One last question. How about member profiles on forums? I don't really think they get much SE traffic, and for sure you'd rather the spiders be indexing the actual threads (content), so is it best to robots.txt them too?

Really appreciate your comments.

steve
February 5, 2007 - 7:11pm

Aaron

What is the situation with a content site (>200 pages) without duplicate content? Can adding extra content effectively devalue existing content pages in the SERPs?
Is there an optimum size (number of pages) for a site?

February 6, 2007 - 1:32am

Hi Steve
The optimum number of pages (and internal link structure) depends on the profit potential of each page or section, the competition in those markets, the cost of creating the content, the uniqueness of the content, and the link equity of the site.

Dan C.
March 19, 2007 - 10:05pm

I would like to find out a bit more about the situation discussed in the original post where you have pages that contain filtering for price and other factors.

Specifically, I have a search results page that lists the available products for a category (e.g. results.aspx?catid=12). Users can paginate through the grid by clicking plain hrefs (e.g. results.aspx?catid=12&page=2). However, users can also filter the results by several other factors, such as price (e.g. results.aspx?cat=12&low=50&high=100).

Since the price filter simply returns a subset of the same content, we have decided to mark the price filter links with the rel="nofollow" attribute. Our reasoning is that we want to maximize the number of unique pages spiders retrieve from our site and (since they seem to have caps on the number of pages retrieved or time spent during a scan) not waste resources or get stuck on duplicate content (then possibly penalizing us).
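To be concrete, a filter link is marked up roughly like this (the anchor text is just an example):

<a href="results.aspx?cat=12&low=50&high=100" rel="nofollow">$50 to $100</a>

while the pagination links stay as plain, followed hrefs:

<a href="results.aspx?catid=12&page=2">Page 2</a>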

Is this a good idea?

July 20, 2007 - 8:04pm

You also want to make sure you are not leaking good link power out to other sites like affiliates. Adding a nofollow to those URLs really helped me drive more search engine traffic.

July 30, 2007 - 11:30pm

I think internal linking structure is probably the most overlooked SEO technique. Sure, everybody knows to link with keywords to more important pages, but knowing what not to link to is just as important, as you say.

I think I need to develop a good strategy to go about creating the best linking structure.

Anyways, thanks for the informative post.

February 7, 2007 - 10:22pm

My charts show that the site did not increase 1400 percent, but did drop about 1400 percent for a week and then recovered back to normal. Care to show the logs to prove you increased traffic 1400 percent? I think your claim is misleading.

February 8, 2007 - 12:25am

Do you think I am stupid enough to think Alexa has people monitoring web traffic trend blog posts to comment on them?

And do you think my site is actually a top 1,000 web property? It is if you look at Alexa, but obviously this small blog with only a couple thousand pages has nowhere near that much reach.

And people who say "show the logs" "show proof" etc. are also the type of people who call it smoke and mirrors when one does.

I am disappointed in myself for even responding to your comment instead of just deleting it.

February 8, 2007 - 3:18am

For people who say, "show the logs" and "show proof", just go read other blogs. Aaron is an SEO expert and is giving us free advice that he has found effective. Take it or leave it. If you don't believe him, go troll another site.

I do have one argument (as if I know enough to make a counter-point). For Steve's question, I understand it to be saying, "I have a lot of quality content, can I max out per the search engines." I think that asking this question defeats the purpose of building a site. The goal is to create a quality site that people want to read and link to, not to design it to meet the criteria of a Google algorithm that no one "truly" knows all the details of (and which is subject to change anyway). You are shooting at a moving target. I say keep building quality pages that people will be motivated to link to. You get more inbound links; this results in better PR; this results in more pages being indexed. To stop building quality content for fear of not being 100% indexed is just going to give you a dead site.

February 2, 2007 - 8:31am

I'm having a hard time understanding the 4th paragraph, and what a good way to do link structure for a site would be...

Eg. If you can find cars by Class (SUV, sports, etc.) and Price, what's the best way to structure that?

February 2, 2007 - 8:40am

Well automotive class (SUV etc) might not be bad to use...in fact that one would probably be good, but lots of price point pages...maybe those are not so good for search engines.

February 2, 2007 - 10:32am

I suppose controlling what the spiders can and cannot access over time is also a good way to ensure that the 'right' pages are indexed first.

My guess is that this rarely enters the equation for most sites - just throw it up and hope for the best...

February 2, 2007 - 10:48am

Aaron, are you saying that a page should not have 2 navigation bars linking to the same results? Or is 2 acceptable?

February 2, 2007 - 11:09am

Connors specializes in taking database assets and slicing and dicing them in different ways. What Aaron speaks of is striking the right slice & dice balance. Say you have a tag attribute system. You could spin out a series of pages, using the car example, based on manufacturer, model, year, price, performance, etc. Taken to the extreme, the same database that was intended to control an internal site search tool could be used to spin out tens of thousands or even millions of pages. That can be a bad thing.

February 2, 2007 - 11:51am

My site has 2 navigation bars (mostly to solidify and crisp up the design), so I obviously do not think 2 nav bars are bad. My point was like Mike was saying. One can go too far and get junk pages indexed.

Each site only has so much authority to leverage across so many pages. Add more pages and you have less authority per page. Add pages of low value and you are wasting authority on low value pages.

February 2, 2007 - 3:11pm

It's really not that hard to understand - consider your SERPs. Where you have a selection of pages that vary only slightly, you want only one of those to appear in the SERPs, so you guide the bots by banning them strategically from pages in robots.txt (or the meta tag is fine too).
You want to appear high in the SERPs for particular terms (obviously), so don't waste the opportunity by having 2, 10 or 300 pages appearing low instead.
Extending this idea, Google now takes a bunch of those pages and sticks them in the Supplemental Index - in the worst case, all of the variations go in, and you don't get any results.
So keep them away from all but the selected few.
Basically, you don't want to compete with yourself. So, don't trickle your Trust and Authority away wastefully :)
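If you go the meta route, the tag on each of the near duplicate variations would look something like this (assuming you still want the links on those pages followed):

<meta name="robots" content="noindex,follow">

The robots.txt route is just a Disallow line covering whatever directory or URL pattern those variations share.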

February 2, 2007 - 3:18pm

*sigh* and apparently I can't spell either :(

February 2, 2007 - 5:16pm

Good post Aaron. Most people don't realize how much site architecture plays a role in good optimization. Much of it is little things that we often don't even think about. Thanks for highlighting this particular issue.

John Williamson
February 2, 2007 - 5:50pm

Here is a problem that I would like Aaron to comment on for best practice.

Say you have a site, a rather medium sized site (in page count), where you sell a lot of products. Some of these products you want to rank well for. Let's take a product like surgery lights. In my case, I want to rank well for surgical lights, surgery lights, surgical medical lights, surgical headlight, surgery headlight, surgery medical light, etc.

My best thought was to not 'stuff' these terms onto one page of course, so I made several pages with each page focusing on 2 of the words I want to rank for. I ended up with 4 pages about these surgery lights that are similar, but also different enough to pass a dup test.

My site links to these 4 pages so am I wasting my links?

I don't want to exclude these pages from Google since I want to rank for all these terms, and since I don't know where I will initially rank for these terms because they are new pages, I need them all indexed to see which pages rank the best in Google.

What's the best way around this without wasting my links?

February 2, 2007 - 5:54pm

Great post Aaron. It is important to think about how visitors will interact with your site vs. the spiders, and I find it an effective balance. I sometimes use a graphic link instead of a text link; I feel it still allows the visitor a way to locate what they may want to see, and it also allows the search engines to do their jobs without always having to ask "is this a duplicate page?"

February 27, 2007 - 9:48am

I just wanted to give you some advice, if you're able to apply it. It's about your article titles. If you could simplify them a bit, it would be much easier for foreigners, or those who don't speak English at the highest level, to get the point of what you're writing about. For example, this article is great because of the advice to take care of your duplicate content and what the results are once you don't have it anymore. But the title sounds complicated and fully theoretical. Something like "No content is better than duplicate content" would surely attract more attention. And yes, thanks for the advice you share all the time.

February 27, 2007 - 10:00am

Aaron, could this linkage stuff be explained like this: every site has, let's say, 100 points, and if you put up 100 links, every link, no matter how deep it is, will get one point. So from your index page you should be careful where you link to, because every new link decreases the importance of the other links? Would you say the weight (100 points) is split into two pots - an internal and an external pot - or is it all in the same pot, with maybe 2 points for an external link and just 1 for an internal link? And could we simplify this duplicate content stuff by saying: do not let pages which aren't text rich and optimized, like photo galleries and contact pages, get indexed?

February 2, 2007 - 6:32pm

I'm sorry, but really I'm tired of reading about how Jeremy increased his visits by excluding folders in his robots.txt file. Didn't anyone notice that he ALSO gave each of his blog posts a unique title and description, where once they all shared the same title and description?

Don't you think that maybe, just maybe, that had a little more to do with the increase in traffic than disallowing some folders that could be seen as duplicate content?

February 27, 2007 - 10:05am

Yes...that is generally the idea. If you are wasting your link equity on low quality pages then some of those link points (or however you want to visualize them) are not being spent on your good content.

Mariense
February 2, 2007 - 7:27pm

>>Didn't anyone notice that he ALSO gave each of his blog posts a unique title and description, where once they all shared the same title and description

Well, if that is the case, I'll put my money on it as being the main cause of the increase in visitors.

February 2, 2007 - 8:01pm

Hi Everett
I think both fixes played a big role, but both are common problems. If he had fixed the page titles and done nothing with the link structure, his traffic increase would have been a fraction of what it was.

Hi John
Usually a page on an authoritative site could probably rank for all of those phrases, but the best thing to do may depend on your site's authority, depth, and a few other factors. Most likely I would target all those phrases on one page, or maybe two at the most.

February 2, 2007 - 9:41pm

Given that the formula for PageRank (see definition below) is PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn)), the more outbound links a page has, the higher its divisor C(t), and the less PageRank it passes to each page it links to - unless you have overwhelming inbound links from other sites to make up for it.

However, it is good practice to interlink or crosslink within your site. Make sure these are relevant links from one subject to other related subjects. For example, link the Fruit page to the Apple and Orange pages. Do not link the Food page to the Text Book page. That is non-related linking, which is a bad, bad thing, since Google is all about relevance - or so the theory states.

Let's talk about external or outbound links. They are good too, but keep in mind that the more links you point out to other sites, the less PageRank each of those links passes. Do the math: the higher the denominator, the smaller the value. But then who cares - PageRank is only a multiplier of the relevance ranking algorithm, right? ;-) If all the other things are right, then what's in a multiplier.

What is PageRank

PageRank is the sum of the PageRanks of all pages that link to a page (its incoming links), each divided by the number of links on the linking page (its outgoing links). (Rael Dornfest - Google Hacks, 3rd Ed; an in-depth book on how the back end of Google's technology works. Best I've read yet.)
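To put rough numbers on that divisor effect (a made-up example, using the typical d = 0.85): a page with PageRank 4 and 8 outgoing links contributes about 0.85 x 4 / 8 = 0.425 toward each page it links to, while the same page trimmed to 4 outgoing links would pass 0.85 x 4 / 4 = 0.85 to each - its own score does not change, only what it hands along per link.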


February 2, 2007 - 9:42pm

Man ... paragraph #3 is like, "welcome to my life." All I've been doing lately is looking at (mostly big) retail sites, and this is such a common problem. I'm not sure I explained it to them as eloquently as you just put it, Aaron. Well done, and thanks. Good stuff.

Deke
February 3, 2007 - 12:19am

Hey Aaron,

Intriguing blog, but I have to admit I am a bit confused as to what strategy you are trying to explain.

From what I gather, it sounds like you are basically saying, don't let your "fluff" pages get indexed. But isn't our goal from the start not to have "fluff" pages? You emphasize so much having sites that are chock full of useful and linkworthy information... what if there is no fluff? Even my worst websites don't have any fluff, because it isn't worth my time to make a page that isn't useful to anyone.

Am I on the wrong track here? I may have missed the point.

February 3, 2007 - 12:23am

Many content management systems create fluff pages, even if you did not intend for them to exist. Also some navigational systems work better for people than for bots, and things that are good for people are noisy for some search systems.

February 3, 2007 - 12:58am

Aaron, today's post really was an eye opener for me, and I have some work to do to improve things, especially eliminating the noisy pages or placing appropriate meta tags on those pages.

Also, the comments were helpful.

Thanks all.

February 3, 2007 - 12:51am

I'm late, but in regards to an automotive site, you want to forget any page that is based on VIN explosion.

Your site architecture should make your text content - buying research, loan info and calculations, etc. - the priority, and make your millions of classified ads (appear to be) a small section of the site.

I asked Matt Cutts whether or not a VIN number will "obstruct" a URL like session IDs do, and the answer is YES - they are of no interest to the crawler.

February 3, 2007 - 9:34am

Thanks for the interesting post.

I'm wondering how this affects Wordpress blogs when people view a category and potentially get the same pages that appear in the index and archives.
Would this be an issue? If so what would be the best way to combat it?
1. Just show introduction text in categories and archives?
2. Prevent SE's from crawling categories and archives?

February 3, 2007 - 9:38am

I think you want the categories and archives crawled because they link to the individual post pages and help spread link equity around your site from the most important pages through to some of the original content that may not be well cited amongst the web in general.

Categories can be configured to show just the introductory text or also show whole posts. I think you would have to look at it on a site by site basis to determine which is best.

February 3, 2007 - 3:39pm

So would you suggest that for a retail website it might be a good idea to disallow pages such as:

• My Account
• Cart Contents
• Checkout

Which all have content on them that a visitor wouldn't really be searching for to get to your site?

February 3, 2007 - 10:24pm

Hi Eddy
Yup, but where some people run into big problems is having many versions of those types of pages get indexed... like a unique shopping cart entrance URL for each product.
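A rough robots.txt sketch for that kind of cleanup might look like the following - the paths and the add-to-cart parameter are made up, so match them to however your cart software actually builds its URLs, and note the last line relies on Google's wildcard extension to robots.txt:

User-agent: *
Disallow: /account/
Disallow: /cart/
Disallow: /checkout/
# hypothetical per-product cart entrance URLs, e.g. ...?action=addtocart
Disallow: /*action=addtocart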

bestoptimized
February 4, 2007 - 4:41pm

I have a couple of questions.
Say I have a site with 50,000 products,
and a lot of these products are in the supplemental results.
Say that I have a lot of links, including a lot of good links from prestigious sites, and the PageRank is 7.
Say I have good categorical and related product linking.
The product pages cannot be sorted, so there is no duplicate content.
Does having these supplemental product pages cause the rest of the site to rank lower?
Is this always the case?

Alex
May 13, 2007 - 6:22pm

I recently realized how bad my internal linking structure is. I'm running a WordPress blog, and my category pages are /tags/%categoryname%/. These pages have PageRank already. My individual posts are at /category/postname - notice there is no "tags" in there. I would like to change this, and I've thought about using a 301 redirect... but I would lose the PageRank on these category pages, right?

What do you suggest I should do? More than 50% of my pages are in the supplemental index because of this. Is the PageRank worth losing to change it?

February 4, 2007 - 9:43pm

If those product pages are not duplicate content then it should not be acceptable to just leave them stuck supplemental.

Mac
July 26, 2007 - 7:47am

Aaron,
I have a unique problem that you may be able to answer. I am purchasing a website with 250 backlinks and 250 pages of content. The seller wants to keep the content, which is fine, because I really want the domain name. The current site is ranked number one in Google though, and although the website is an awful looking Web 1.0 site with 80 links on the home page (I only want 10), can I simply create 50 completely new pages with similar keyword density and hope to keep that number one ranking? Or do I have to change the site content slowly? There is also the problem of duplicate content when he posts the content to his new site. How would you go about changing the content and converting to a Web 2.0 look without losing the rankings?

July 26, 2007 - 8:17am

Hi Mac
That is the type of question that is hard to answer for free because there are so many variables.

Nick
February 4, 2007 - 8:50pm

Aaron first of all, great blog post, probably one of the best I've seen in a long time.

To add to the comment by Eddy above, what about things like:

Image galleries
Contact Us pages
Member profiles

And does robots.txt'ing them really do the job? Surely this will just stop them from being indexed, but the links are still on the page, so the PageRank is still being sent to them. How about using nofollow for those links?

Nick
February 4, 2007 - 9:04pm

Oops, apparently my comment was posted an hour before I actually posted it hrm.

Anyway Aaron, please see two comments up.

February 6, 2007 - 1:29am

Hi Nick
In his post Shoemoney said he used robots.txt. Image galleries are a perfect example of the type of low information pages that do not make sense to get indexed, especially if you have a ton of them.
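If those sections live under their own directories, the robots.txt entry can be as simple as this (the directory names are just examples - use whatever your site actually uses):

User-agent: *
Disallow: /gallery/
Disallow: /members/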

delage
September 26, 2007 - 8:39am

I have been reading these posts with interest. I have a website which sells only one product, so I don't have the problem many of the above have of trying to be relevant for 50,000 products. My website (www.backupanytime.com) is, in my humble opinion, well presented and indexed. Also, I can be found on page 1 (3rd to 6th position) for a very limited number of search terms such as "online backup ireland" and "irish online backup".
My problem is PageRank. Even though my site content is relevant and all original (I have written all the content myself), I can't achieve PageRank, as I just don't know anyone in a non-competing, related line of business with a good PageRank who I can acquire incoming links from. I understand that link farms are "bad" so I won't be going that route. What can I do to increase PageRank?

September 26, 2007 - 6:30pm

Offer to provide free services to businesses or non profits that could use your services.

Ask them to link to your site from their official sites as a thank you.

ravetildon
January 14, 2008 - 3:04am

Hey Aaron:

Just curious if you know of any other case studies of implementing a robots.txt file for Google and getting large traffic increases on WordPress blogs?

I'm playing with a couple of the many wordpress blogs I have. These particular 2 are both 99% supplemental according to this tool:

http://www.mapelli.info/tools/supplemental-index-ratio-calculator/

It's been about a month and stuff is SLOWLY coming out of supplemental. One is at 75%, the other 95% now.

Still no changes in traffic or rankings...

January 14, 2008 - 3:44am

It might take a while for your PageRank scores to be recomputed and your site re-ranked higher. They compute PageRank estimates on the fly, but I think the web-wide calculations only come around once a month or so.

elkiwi
March 23, 2009 - 9:38pm

Hi, after reading these comments, I just wanted to ask whether, in a situation like mine, where links are generated automatically by proximity to the latitude/longitude co-ordinates of other pages on a travel site, this is bad?

I add pages all the time which could be considered "low value", but I want my users to see what else is around the hotel, restaurant or beach they are looking at on my site, even if Google considers it to be of low value.

ElKiwi

March 24, 2009 - 3:55am

Hard to say without looking at it ElKiwi. If you want you could put your site up for review in our forums.

Elliot Kadin
July 19, 2011 - 10:07am

Internal link structure is a great idea for enhancing website popularity.

Garykumar
October 31, 2012 - 9:34am

It's really well presented and has become a power guide to website internal linking structure. But I still have some doubts regarding the same.
