Forbes recently wrote an article about Google's supplemental results, painting them as webpage hell. The article states that pages in Google's supplemental index are trusted less than pages in the regular index:
Google's programmers appear to have created the supplemental index with the best intentions. It's designed to lighten the workload of Google's "spider," the algorithm that constantly combs and categorizes the Web's pages. Google uses the index as a holding pen for pages it deems to be of low quality or designed to appear artificially high in search results.
I have worked on some of the largest sites and networks of sites on the web (hundreds of millions of pages or more). When looking for duplicate content or information architecture issues, the search engines do not let you view deep enough to see all indexing problems, so one of the first things I do is use this search to find low quality pages (i.e., things that suck PageRank and do not add much unique content to the site). After you find some of the major issues you can dig deeper by filtering out the core issues that showed up in your first supplemental searches. For example, here are threadwatch.org supplemental results that do not contain the word node in the URL.
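The same drill-down can be done offline once you have a list of URLs from a search export or a crawl. Here is a minimal Python sketch of the idea, not a tool I am claiming anyone uses: it counts URLs by their first path segment to show which patterns dominate, then drops the ones containing a given word so the next layer of issues is easier to see. The function names and the sample URLs are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def first_path_segment(url):
    """Return the first path segment of a URL, or '' for the homepage."""
    path = urlparse(url).path.strip("/")
    return path.split("/")[0] if path else ""


def count_by_segment(urls):
    """Count URLs per first path segment to show which patterns dominate."""
    return Counter(first_path_segment(u) for u in urls)


def exclude_containing(urls, word):
    """Drop URLs that contain `word`, mirroring a search that excludes it."""
    return [u for u in urls if word not in u]


# Hypothetical sample; these URLs are invented for illustration.
urls = [
    "http://example.com/node/101",
    "http://example.com/node/102",
    "http://example.com/tag/seo",
    "http://example.com/user/7",
]

print(count_by_segment(urls).most_common())
print(exclude_containing(urls, "node"))
```

Once the biggest bucket is filtered out, run the count again on what is left and repeat until the remaining URLs look like pages you actually want indexed.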
If you have duplicate content issues, at best you are splitting your PageRank, but you might also affect your crawl priorities. If Google thinks 90% of a site is garbage (or not worth trusting much) I am willing to bet that they also trust anything else on that domain a bit less than they otherwise would, and are more restrictive with their willingness to crawl the rest of the site. As noted in Wasting Link Authority on Ineffective Internal Link Structure, ShoeMoney increased his search traffic 1400% after blocking some of his supplemental pages.
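If you want a rough way to spot duplicate content on your own site before a search engine does, one common approach is to compare pages by overlapping word sequences (shingles). The sketch below is only an illustration under my own assumptions: the three-word shingle size, the 0.8 threshold, and the sample pages are arbitrary, and nothing here reflects how Google actually detects duplicates.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in `text`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared items divided by total distinct items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(pages, threshold=0.8):
    """Return (url_a, url_b, score) for page pairs at or above `threshold`."""
    items = [(url, shingles(text)) for url, text in pages.items()]
    pairs = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            score = jaccard(items[i][1], items[j][1])
            if score >= threshold:
                pairs.append((items[i][0], items[j][0], score))
    return pairs


# Hypothetical page texts keyed by URL; invented for illustration.
pages = {
    "/post/1": "ten tips for buying running shoes online this year",
    "/post/1?print=1": "ten tips for buying running shoes online this year",
    "/about": "we are a small team that writes about footwear",
}

for url_a, url_b, score in near_duplicates(pages):
    print(url_a, url_b, round(score, 2))
```

The pairwise comparison is fine for a few thousand pages. On a site with millions of pages you would need something smarter, but the pairs it surfaces on a sample are usually enough to show which templates or URL parameters are creating the duplicates.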
I once saw a college professor cite a page about caffeine on a low quality site about pornography, gambling, and drugs on his official profile page. Many people never look beyond the page when linking to a story.
This is not to say that one should put a story on a bad website, but that one should make the story page they are currently marketing as clean as possible so it is easy to link to. And you are probably better off placing your marketing stories on your key site if you think they will still spread.
Over time people will become more aware of using content bait on a crappy site, but for now most people don't look beyond the page when referencing a story.
Social news sites come to prominence largely over the controversies associated with people gaming them, and without people gaming them few would ever garner a critical mass. Marketers spamming a social news site is part of the growth cycle.
If I can come up with an easy search string that detects that many Pligg sites, you have to think that as people and spambots abuse them the search engines will discount most of their votes. Still, in both the short and the long term many of them are going to have value.
Why buy low quality PR2 and PR3 links from inactive parts of the web when you can get on topic ones for free? Of course most of these communities will have limited value and die (failing to build a critical mass), but if you are submitting useful content to the real ones that will also lead to indirect links and other signs of trust and quality.
Content networks with virtually no content cost, free software, and limited editorial control might call people who submit self promotional stuff spammers, but what are all these sites until they build a critical mass? Parasitic useless noise, a form of spam.
The difference between a spammer and a contributor is that a contributor will post at least a few entries that are not self promotional, and they will also create content worthy of exposure. Both of which help build the community.
If search engines already have a reason to trust your site then leveraging SEO may help you gain more exposure. However, if your conversion process is not smooth, search as an isolated marketing channel is rarely an effective long-term business model.
If you have launched a new site and are not getting much Yahoo! traffic, submitting a few of your highest value pages is a good call. If you have key deep high value pages that are not staying indexed in Yahoo! this program also makes sense.
There are two basic ways to do SEO. One is to look for the criteria you think the search engine wants to see, and then work to slowly build it day after day, chipping away at great keyword research and picking up good links one at a time here or there. If you understand what the search engines are looking for this is still readily possible in most markets, but with each passing day this gets harder.
The other way to do SEO is to move markets. When I interviewed Bob Massa, his words "search engines follow people" stuck in my head. So what does it mean to move markets? People are using the word linkarati. It wasn't a word until recently. Rand made it up. As that word spreads his brand equity, market position, and link authority all improve. Does that make Rand an SEO expert or a person good with words? Probably both, as far as engines and the public are concerned.
I have seen friends get free homepage links from businesses that are making tens of millions of dollars in profit per year. I have had Fortune 500 companies contact me with free co-branding offers for new sites. I have come up with content ideas that naturally made it to the #1 position on Netscape and stuck there for 20+ hours straight. I still fail often and have a lot to learn, but I do know this: If you are the featured content on most of the sites in your field then YOU are relevant, and search engines will pick up on it unless their algorithms are broken.
When I was new to SEO I did much more block and tackle SEO. I had to because I had limited knowledge, no trust, no leverage, no money, and was a bad writer. The little things mattered a lot. They had to. As I learned more about the web I have tried to transition into the second mode of marketing. Neither method is right or wrong, each works better for different people at different stages, but as more people come online I think the second path is easier, safer, more stable, more profitable, and more rewarding.
If you are empathetic towards a market and have interests aligned with a market you do not need to understand exactly how search engines work. Search engines follow people.
It is still worth doing the little things right so that when the big things hit you are as efficient as possible, but if you can mix research, active marketing, and reactive marketing into your site strategy you will be more successful than you would be if you ignored one of them.
WMW has a good thread about some of the changes people are noticing at Google. Two big things are happening: more and more pages are getting thrown into Google's supplemental results, and Google may be getting more aggressive with re-ranking results based on local inter-connectivity and other quality related criteria. You need some types of links to have enough raw PageRank to keep most of your pages indexed, and to have your deeper pages included in the final selection set of long tail search results. You need links from trusted related sites in order to get a boost in result re-ranking.
There are also a few other types of links to look at, if you wanted to take a more holistic view:
links from general trusted seed sites
links that drive sales
links that lead to additional trusted links
links that gain you mindshare or subscribers
Some of those other links may not even be traditional links, but may come from a well placed ad buy.
Every unbranded site is heavily unbalanced in its link profile. If you do not have a strong brand then the key people in your community who should be talking about you are not (and thus you are lacking those links).
Wikipedia ranks #2 for Aaron right now. They also rank for millions of other queries. They don't rank because their information is of great quality, they rank because everything else is so bad. About.com was once considered best of breed, but scaling content costs and profitability is hard. Google doesn't hate thin affiliate sites because they are bad. They only hate them because the same thing already exists elsewhere. Search engines try to benchmark information quality, and create a structure which encourages the creation and open sharing of higher quality content. When you see poor sites at the top of search results view it as a sign of opportunity. Realize that whatever ranks today is probably not what search engines want, but it is what is considered best given the lack of structure to the web and how poor most websites are.
There are many ups and downs to adding a user generated content section to a site. It has been interesting watching the effects of SEOMoz's user generated content and points systems. The ups:
users feel they are part of the brand.
they are more likely to push the brand and link to the site
points are created free but give some perception of value
users create free content for the site even when you are not doing so.
some of their content will rank in search results. today I did a search for search engine marketing and saw Google listing a link for recent blog posts listing this post
contributors might give you good marketing ideas or help you catch important trends before competitors do
The downs:
people who spend lots of time contributing tend not to value their time too much AND are hard to profit from (especially in savvy marketplaces that ignore ads).
having many relationships allows you to be a connector that knows someone for just about any job, but focusing heavily on building community and maintaining the many relationships needed to do so may hold you down on the value chain. A few strong relationships will likely create more value than many weak ones, especially as we run into scale related issues.
if your site is not authoritative, user generated content may waste your link authority and lead to keyword cannibalization
if your site is authoritative many people will look for ways to leverage your domain or authority
as you get more authoritative more people will try to exploit it. even friends get aggressive with it, and unless you call people out it gets out of control quickly.
as you extend your commitments, spending time to police a site, it is harder to change course. I get frustrated when I see spam on the homepage of ThreadWatch, but I guess I can't be surprised people do it, and due to database issues I am uncertain if I will be able to upgrade TW without just archiving the old information and switching to a new CMS.
some people looking to promote their work may spam or aggressively associate your brand with the articles they wrote. For example, is this comment spam? Or is it good?
If a relationship is affiliate based it is quite easy to police undesirable activity by banning accounts, but if people are adding content to your site and marketing it aggressively in ways that may not bode well for your brand it might be harder to police it, especially as you scale your community. And typically the people that are most likely to give you crap for it are hypocritical with their beliefs.
I think on the whole a community section is a pretty good idea if you tie it into a paid content model, but even when you do that you will still run into scale issues if you provide any type of support for the paid content. I have over 600 emails in my inbox, and recently stopped advertising free consulting with an ebook purchase because I stopped scaling as a person. As your profits scale the opportunity cost of any one revenue channel becomes more apparent. That is one of the things which has prevented me from putting a forum or community section on this site.
If you duplicate content on a small scale it does not hurt you (other than perhaps wasting some of your link authority), but if you do it on a large scale (affiliate feed or similar) then it may suck a bunch of link equity out of your site, put your site in reduced crawling status, and / or place many of your pages in Google's supplemental results. Jill's article mentioned the difference between penalties and filters:
Search engine penalties are reserved for pages and sites that are purposely attempting to trick the search engines in one form or another. Penalties can be meted out algorithmically when obvious deceptions exist on a page, or they can be personally handed out by a search engineer who discovers an infraction through spam reports and other means. To many people's surprise, penalties rarely happen to the average website. Most that receive a penalty know exactly what they did to deserve it.
From a search engineer's perspective, the line between optimization and deception is thin and curvy. Because that is the case it is much easier for Google to be aggressive with filters while being much more restrictive with penalties.
From my recent experiences most sites that lost rankings typically did so due to filters, and most site owners that got filtered have no idea why they were filtered. If you were aggressively auto-generating sites your experience set might be different (biased more toward penalized over filtered), but here are examples of some filters I have seen:
Duplicate Content: This filter doesn't matter for much of anything. Only one copy of a syndicated article should rank in the search results. If they don't rank all of them who cares? Even though duplicate pages are filtered out of the search results after the search query, they still pass link authority, so the idea of remixing articles to pass link authority is a marketing scam.
Reciprocal Linking: Natural quality nepotistic links are not bad (as they are part of a natural community) but exclusively relying on them, or letting them comprise most of your link authority is an unnatural pattern. A friend's site that was in a poor community had their rankings sharply increase after we removed their reciprocal link page.
Limited Authority & Noise: A site which has most of its pages in the supplemental results can bring many of them out of the supplemental results by ensuring the page level content is unique, preventing low value pages from getting indexed, and building link authority.
Over-Optimization Filter: I had two pages on a site ranking for two commercially viable two-word phrases. Both of them were linked to sitewide using a single word that was part of the two-word phrases. Being aggressive, I switched both sitewide links to using the exact phrases in the internal anchor text. One of the pages now ranks #1 in Google, while the other page got filtered. I will leave the #1 ranking page as is, but for the other page I changed the internal anchor text to something that does not exactly match the keyword phrase. After Google re-caches the site, I expect the filtered page to pop back to ranking near the top of the results.
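If you track your internal anchor text, a quick check like the one below can show how much of it exactly matches the phrase you are targeting. This is only a sketch under my own assumptions: the function names and sample anchors are hypothetical, and no search engine publishes a share at which a filter kicks in, so treat the number as something to compare before and after a change rather than as a threshold.

```python
def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(text.lower().split())


def exact_match_share(anchors, phrase):
    """Fraction of anchor texts that exactly match the target phrase."""
    if not anchors:
        return 0.0
    target = normalize(phrase)
    hits = sum(1 for a in anchors if normalize(a) == target)
    return hits / len(anchors)


# Hypothetical internal anchors pointing at one page.
anchors = ["blue widgets"] * 9 + ["widgets"]
share = exact_match_share(anchors, "Blue Widgets")
print(f"{share:.0%} of internal anchors exactly match the phrase")
```

Logging this number each time you change sitewide links gives you something concrete to look back at when a page drops.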
The difference between a penalty and a filter is the ability to recover quickly if you understand what is wrong. The reason tracking changes is so important is that it helps you understand why a page may be filtered.
How can you be certain that a page is filtered? Here are some common symptoms or clues which may be present:
many forum members are complaining about similar sites or page types getting filtered or penalized (although it is tricky to find signal amongst the noise)
reading blogs and talking to friends about algorithm updates (much better signal to noise ratio)
seeing pages or sites similar to yours that were ranking in the search results that also had their rankings drop
knowing that you just did something aggressive that may make the page too well aligned with a keyword
seeing an inferior page on your site ranking while the more authoritative page from the same site is nowhere