Filters vs Penalties & Optimization vs Over-Optimization

Jill Whalen recently posted to SEL about how duplicate content penalties are not penalties, but filters.

If you duplicate content on a small scale it does not hurt you (other than perhaps wasting some of your link authority), but if you do it on a large scale (an affiliate feed or similar) then it may suck a bunch of link equity out of your site, put your site in reduced crawling status, and/or place many of your pages in Google's supplemental results. Jill's article explained the difference between penalties and filters:

Search engine penalties are reserved for pages and sites that are purposely attempting to trick the search engines in one form or another. Penalties can be meted out algorithmically when obvious deceptions exist on a page, or they can be personally handed out by a search engineer who discovers an infraction through spam reports and other means. To many people's surprise, penalties rarely happen to the average website. Most that receive a penalty know exactly what they did to deserve it.

From a search engineer's perspective, the line between optimization and deception is thin and curvy. Because that is the case it is much easier for Google to be aggressive with filters while being much more restrictive with penalties.

From my recent experience, most sites that lost rankings typically did so due to filters, and most site owners who got filtered have no idea why. If you were aggressively auto-generating sites your experience might be different (biased more toward penalties than filters), but here are examples of some filters I have seen:

  • Duplicate Content: This filter doesn't matter for much of anything. Only one copy of a syndicated article should rank in the search results; if the other copies don't rank, who cares? Even though duplicate pages are filtered out of the search results at query time, they still pass link authority, so the idea that you must remix articles for them to pass link authority is a marketing scam.

  • Reciprocal Linking: Natural, quality nepotistic links are not bad (as they are part of a natural community), but exclusively relying on them, or letting them comprise most of your link authority, is an unnatural pattern. A friend's site that was in a poor community saw its rankings sharply increase after we removed its reciprocal link page.
  • Limited Authority & Noise: A site which has most of its pages in the supplemental results can bring many of them out of the supplemental results by ensuring the page level content is unique, preventing low value pages from getting indexed (see the robots.txt sketch after this list), and building link authority.
  • Over-Optimization Filter: I had two pages on a site ranking for two commercially viable two-word phrases. Both of them were linked to sitewide using a single word that was part of the two-word phrases. Being aggressive, I switched both sitewide links to use the exact phrases in the internal anchor text. One of the pages now ranks #1 in Google, while the other page got filtered. I will leave the #1 ranking page as is, but for the other page I changed the internal anchor text to something that does not exactly match the keyword phrase. After Google re-caches the site, the filtered page should pop back to ranking near the top of the results.
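To make the "preventing low value pages from getting indexed" point concrete, here is a minimal robots.txt sketch; the /search/ and /print/ paths are hypothetical stand-ins for whatever thin, near-duplicate sections your own site generates:

    # robots.txt - keep crawlers out of thin, near-duplicate sections
    # (the /search/ and /print/ paths are hypothetical examples)
    User-agent: *
    Disallow: /search/
    Disallow: /print/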

The difference between a penalty and a filter is the ability to recover quickly if you understand what is wrong. The reason tracking changes is so important is that it helps you understand why a page may be filtered.

How can you be certain that a page is filtered? Here are some common symptoms or clues which may be present:

  • many forum members are complaining about similar sites or page types getting filtered or penalized (although it is tricky to find signal amongst the noise)

  • reading blogs and talking to friends about algorithm updates (much better signal to noise ratio)
  • seeing pages or sites similar to yours that were ranking in the search results that also had their rankings drop
  • knowing that you just did something aggressive that may make the page too well aligned with a keyword
  • seeing an inferior page on your site ranking while the more authoritative page from the same site is nowhere to be found
Published: March 16, 2007 by Aaron Wall in seo tips

Comments

March 17, 2007 - 6:27am

Aaron,

Thanks for the post and the link to Jill's article.

One comforting notion is Jill's take on bylined articles showing up on other web sites:
"If your own bylined articles are getting published elsewhere, that's a good thing. There's no need for you to provide a different version to other sites or to not allow them to be republished at all."

Anyone have recommendations on good article submission services?

martijn
March 17, 2007 - 12:30pm

Hi Aaron,
if you have a site with lots of low value pages (a shop, for instance, with its product pages), would it be better to noindex or nofollow them?
I thought that noindex would be too drastic, and that to keep the juice in the relevant pages you would use nofollow.

Thanks in advance!

March 17, 2007 - 8:52pm

Hi Martijn
That would be one solution. Another would be to minimize some of the repetitive templating issues and add unique content to each page (via consumer feedback and better product descriptions).
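If you do go the noindex route, a minimal sketch of the tag: the noindex, follow combination keeps a thin page out of the index while still letting crawlers follow (and count) the links on it.

    <!-- in the <head> of a thin product page: keep the page out of
         the index, but let crawlers follow its links -->
    <meta name="robots" content="noindex, follow" />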

March 19, 2007 - 9:46am

Thanks Aaron for the post! I actually used this tonight to explain to a friend a little about "Over-Optimization" and how pages can get filtered out!

March 19, 2007 - 1:36pm

Here's what Google says about duplicate content. Bet this post will settle any doubts.

Official Google Webmaster Central Blog post on Duplicate Content.

Great post!

mice
March 19, 2007 - 3:53pm

Hi Aaron,
I started reading your blog lately and found your posts very informative and to the point, unlike some other blogs.

Specifically this post caught my attention, as this is a debate we're having right now.

Our website is very authoritative (about 40K real links when doing linkdomain:mydomain.com -site:mydomain.com, including many .edu and .gov links).

Still, most of our pages are marked supplemental, and our traffic got hit in January.
We're not doing any black hat, and our pages are not duplicates of other pages.

Could it be that all of our pages got filtered?
Even our homepage, which was ranking quite well on a related SERP, got kicked down some 80 places...

That doesn't sound like a filter to me...
Still, Jill claims "Most that receive a penalty know exactly what they did to deserve it".

Your thought in the matter would be greatly appreciated.

stiffpicken
March 19, 2007 - 5:49pm

mice,

do you auto-generate pages that are filled with search results?

March 20, 2007 - 6:29pm

Another quality post Aaron. I like how you differentiate between penalties and just filters.

March 20, 2007 - 7:15pm

I saw a video of Matt Cutts stating that Google didn't really care about duplicate content that existed within a single site that was trying to get a specific product or service targeted towards different geographic locations. Has anyone else seen this?

March 20, 2007 - 7:52pm

Aaron, what about a general blog site about a common disease? We write 3-5 articles a week on this disease covering general topics, cures, and current news.

The front page is a PR5 but the majority of blog posts (~280) are PR2 at best. Could we have too many pages and be spreading the link juice too thin?

thwart
March 20, 2007 - 11:56pm

Shad, yes I saw this too. Actually it was Vanessa Fox saying it on the video I saw...

March 21, 2007 - 12:19am

Hi Ken
I think you see signs of problems when many pages go supplemental and/or your deep pages do not rank where they should.

If you are well indexed, and are creating content, make sure that you create some content which is linkworthy and builds links. Make sure you build linkage data in proportion to the growth of your site size.

Another related tip: if you are near the limits of your link authority and struggling to build links, it may be worth creating longer articles so you get more text on each page.

mice
March 29, 2007 - 1:51pm

Hi stiffpicken,

Yes, our website has pages that are in fact search results, but this was always the case.
Our competitors have such pages as well, and you can see similar pages on www.shopping.com or their rival www.nextag.com (our PR equals NexTag's...).

I have been radically cutting the number of pages we have, in line with the "cutting the fat" strategy; I wonder if this will help.
I'm open to ideas though...

March 21, 2007 - 11:26am

Aaron, I have a specific question regarding this, and I would like to ask you to look at the issue.

I have a website that was ranking #4 for a term with over 33,000 monthly searches and #2 for a term with over 7,000 searches, according to Overture.

By mistake the site was hosted on the server's shared IP, and by default any domain that was not configured showed the same content as my site. About a month ago I realized this and moved the domain to a dedicated IP address.

Now the domain is gone for those keywords, but it is still sitting at the #10 position for less competitive keywords. I don't think it is a penalty; I think it is a filter. Now I want to know how to remove that filter. I sent a re-inclusion request 3 weeks ago, but there has been no response and the site is not back.

Can you help me with this matter?
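For context on the mechanism: the server answered every unconfigured hostname with my site's content because my vhost was the default. A catch-all default entry like the following (a rough sketch, assuming Apache; the hostname and paths are hypothetical) would have prevented the duplication:

    # Apache 2.2-era sketch: enable name-based hosting, then list a
    # catch-all first so unconfigured hostnames get an empty docroot
    # instead of the main site's content
    NameVirtualHost *:80

    <VirtualHost *:80>
        ServerName default.invalid
        DocumentRoot /var/www/empty
    </VirtualHost>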

Jason
March 30, 2007 - 6:57pm

mice -

I feel your pain; I have a site with the same issue. All unique content, but tons of it showing up in supplemental. I sure wish someone could give a solid reason for pages getting placed in the supplemental results.

I personally think Google is just broken!!!

Heck, as of the time of this writing even Matt Cutts' blog shows mostly supplemental results.

site:www.mattcutts.com/blog

If Matt is not an authority, I do not know what is. Come on Google, get it together. Your primary business is search, not all those other failures you are investing in.

August 30, 2007 - 3:29am

Aaron,
I know this is late in coming, but you mentioned pages getting filtered. Do whole sites get filtered too? I am still unclear about how to tell the difference between a filter and a penalty.

Tim
March 23, 2007 - 4:56am

Hi Aaron, I was wondering if you could clarify something in this post. In the "Over-Optimization Filter" list item, the first four sentences are in the past tense, which leads me to believe you had a specific example of this happening. However, in the last two sentences you talk about getting out of the filter in the future tense.

Have you tried this in the past with success getting out of the filter, or are you predicting what will happen in the future to those pages?

March 23, 2007 - 7:03am

Hi Tim
My experience with that issue is both past tense and current.

Tim
March 24, 2007 - 2:32pm

Thanks for the clarification, Aaron. If I could ask one more follow-up: in your experience, what is the typical time between Google re-caching the site and the ranking change?
