Search Engine Cloaking FAQs: An Interview With Dan Kramer, Creator of KloakIt

I recently asked Dan Kramer of KloakIt if I could interview him about some common cloaking questions I get asked, and he said sure.

How does cloaking work?

It is easiest to explain if you first understand exactly what cloaking is. Web page cloaking is the act of showing different content to different visitors based on some criterion, such as whether they are a search engine spider, or whether they are located in a particular country.

A cloaking program/script will look at a number of available pieces of information to determine the identity of a visitor: the IP address of the connection, plus details contained in the HTTP headers of the request for the web page, such as the User-Agent string of the browser and the referring URL. The script will make a decision based on this information and serve the appropriate content to the visitor.
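
To make that concrete, here is a minimal PHP sketch of how a script might read those signals. This is only an illustration, not KloakIt's code; the log call simply shows the raw material a cloaking script works from.

<?php
// The visitor's IP address comes from the network connection itself...
$ip = $_SERVER['REMOTE_ADDR'];
// ...while the User-Agent and referrer come from the HTTP request headers.
$user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$referrer   = isset($_SERVER['HTTP_REFERER'])    ? $_SERVER['HTTP_REFERER']    : '';

// A cloaking script bases its spider-or-human decision on these values
// (typically the IP address first, as discussed below) and then serves
// either the optimized page or the normal landing page.
error_log("visitor: ip=$ip ua=$user_agent ref=$referrer");
?>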

For SEO purposes, cloaking is done to serve optimized versions of web pages to search engine spiders and hide that optimized version from human visitors.

What are the risks associated with cloaking? What types of sites should consider cloaking?

Many search engines discourage the practice of cloaking. They threaten to penalize or ban those caught using cloaking techniques, so it is wise to plan a cloaking campaign carefully. I tell webmasters that if they are going to cloak, they should set up separate domains from their primary website and host the cloaked pages on those domains. That way, if their cloaked pages are penalized or banned, it will not affect their primary website.

The types of sites that successfully cloak fall into a couple of categories. First, you have those who are targeting a broad range of "long tail" keywords, typically affiliate marketers and so on. They can use various cloaking software packages to easily create thousands of optimized pages which can rank well. Here, quantity is the key.

Next, you have those with websites that are difficult for search engines to index. Some people with Flash-based websites want to present search engine spiders with text versions of their sites that can be indexed, while still delivering the Flash version to human visitors to the same URL.

What is the difference between IP delivery and cloaking?

IP delivery is a type of cloaking. I mentioned above that there are several criteria by which a cloaking script judges the identity of a visitor. One of the most important is the IP address of the visitor.

Every computer on the internet is identified by its IP address. Lists are kept of the IP addresses of the various search engine spiders. When a cloaking script has a visitor, it looks at their IP address and compares it against its list of search engine spider IP addresses. If a match is found, it delivers up the optimized version of the web page. If no match is found, it delivers up the "landing page", which is meant for human eyes. Because the IP address is used to make the decision, it's called "IP delivery".
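
A rough PHP sketch of that decision follows; the spider IPs and filenames are placeholders, and a real list contains thousands of addresses and is usually kept in a file or database rather than hard-coded.

<?php
// Placeholder list of known search engine spider IP addresses.
$spider_ips = array('66.249.66.1', '66.249.66.2', '207.46.13.1');

if (in_array($_SERVER['REMOTE_ADDR'], $spider_ips)) {
    // Match found: serve the optimized page built for spiders.
    readfile('optimized.html');
} else {
    // No match: serve the "landing page" meant for human eyes.
    readfile('landing.html');
}
?>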

IP delivery is considered the best method of cloaking because of the difficulty involved in faking an IP address. There are other methods of cloaking, such as by User-Agent, which are not as secure. With User-Agent cloaking, the User-Agent string in the HTTP headers is compared against a list of search engine spider User-Agents. An example of a search engine spider User-Agent is
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)".

The problem with User-Agent cloaking is that it is very easy to fake a User-Agent, so your competitor could easily decloak one of your pages by "spoofing" the User-Agent of his browser to make it match that of a search engine spider.
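
To illustrate how trivial spoofing is, here is a short PHP cURL snippet (my example, with a placeholder URL) that requests a page while sending Googlebot's User-Agent string:

<?php
// Fetch a page while impersonating a search engine spider's User-Agent.
$ch = curl_init('http://www.example.com/cloaked-page.html');
curl_setopt($ch, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
curl_close($ch);

// If the page is cloaked by User-Agent alone, $html now holds the
// spider-only version -- which is why IP delivery is preferred.
echo $html;
?>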

How hard is it to keep up with new IP addresses? Where can people look to find new IP addresses?

It's a chore the average webmaster probably wouldn't relish. There are always new IP addresses to add (the best cloaking software will do this automatically), and it is a never-ending task. First, you have to set up a network of bot-traps that notify you whenever a search engine spider visits one of your web pages. A CGI script can do this for you and check the IP address against the spiders you already know about. Then, you can take the list of suspected spiders generated that way and do some manual checks to make sure the IP addresses are actually registered to search engine companies. Also, you have to keep an eye out for new search engines... you would not believe how many new startup search engines there are every month.
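
One common way to do that manual check is a reverse DNS lookup followed by a forward lookup to confirm the hostname really maps back to the same IP. A quick PHP sketch (my illustration, with a placeholder IP and a Googlebot-specific hostname pattern):

<?php
// A suspected spider IP caught by one of your bot traps (placeholder value).
$ip = '66.249.66.1';

// Reverse lookup: genuine Googlebot addresses resolve to googlebot.com hostnames.
$host = gethostbyaddr($ip);

// Forward-confirm: the hostname must resolve back to the same IP,
// otherwise the reverse DNS record could have been faked.
if (preg_match('/\.googlebot\.com$/i', $host) && gethostbyname($host) === $ip) {
    echo "$ip verified as Googlebot ($host)\n";
} else {
    echo "$ip not verified -- check the IP's registration by hand\n";
}
?>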

Instead of doing it all yourself, you can get IP addresses from some resources that can be found on the web. I manage a free public list of search engine spider IP addresses. There are also some commercial resources available (no affiliation with me). In addition to those lists, you can find breaking info at the Search Engine Spider Identification Forum at WebmasterWorld.

Is cloaking ethical? Or, as it relates to SEO, is ethics typically a self-serving word?

Some would say that cloaking is completely ethical, others disagree. Personally, my opinion is that if you own your website, you have the right to put whatever you like on it, as long as it is legal. You have the right to choose which content you display to any visitor. Cloaking for SEO purposes is done to increase the relevancy of search engine queries... who wants visitors that aren't interested in your site?

On the other hand, as you point out, the ethics of some SEOs are self-serving. I do not approve of those who "page-jack" by stealing others' content and cloaking it. Also, if you are trying to get rankings for one topic, and sending people to a completely unrelated web page, that is wrong in my book. Don't send kids looking for Disney characters to your porn site.

I have seen many garbage subdomains owning top 10 rankings for tens to hundreds of thousands of phrases in Google recently. Do you think this will last very long?

No, I don't. I believe this is due to an easily exploitable hole in Google's algorithm that really isn't related to cloaking, although I think some of these guys are using cloaking techniques as a traffic management tool. Google is already cleaning up a lot of those SERPs and will soon have it under control. The subdomain loophole will be closed soon.

How long does it usually take each of the engines to detect a site that is cloaking?

That's a question that isn't easily answered. The best answer is "it depends". I've had sites that have never been detected and are still going strong after five or six years. Others are banned after a few weeks. I think you will be banned quickly if you have a competitor who believes you might be cloaking and submits a spam report. Also, if you are creating a massive number of cloaked pages in a short period of time, I think this is a flag for search engines to investigate. Same goes for incoming links... try to get them in a "natural" looking progression.

What are the best ways to get a cloaked site deeply indexed quickly?

My first tip would be to have the pages located on a domain that is already indexed -- the older the better. Second, make sure the internal linking structure is adequate to the task of spidering all of the pages. Third, make sure incoming links from outside the domain link to both the index (home) cloaked page and to other "deep" cloaked pages.

As algorithms move more toward links and then perhaps more toward the social elements of the web do you see any social techniques replacing the effect of cloaking?

Cloaking is all about "on-page" optimizing. As links become more important to cracking the algorithms, the on-page factors decline in importance. The "new web" is focused on the social aspects of the web, with people critiquing others' content, linking out, posting their comments, blogging, etc. The social web is all about links, and as links become more of a factor in rankings, the social aspects of the web become more important.

However, while what people say about your website will always be important, what your website actually says (the text indexed from your site) cannot be ignored. The on-page factors in rankings will never go away. I cannot envision "social techniques" (I guess we are talking about spamming Slashdot or Digg?) replacing on-page optimization, but it makes a hell of a supplement... the truly sophisticated spammer will make use of all the tools in his toolbox.

How does cloaking relate to poker? And can you cheat at online poker, or are you just head and shoulders above the rest of the SEO field?

Well, poker is a game of deception. As a pioneer in the cloaking field, I suppose I have picked up a knack for the art of lying through my teeth. In the first SEO Poker Tournament, everybody kept folding to my bluffs. While it is quite tempting to run poker bots and cheat, I find there is no need with my excellent poker skills. Having said all that, I quietly await the next tournament, where I'm sure I'll be soundly thrashed in the first few minutes ;)

How long do you think it will be before search engines can tell the difference between real page content and garbled markov chain driven content? Do you think it will be computationally worthwhile for them to look at that? Or can they leverage link authority and usage data to negate needing to look directly at readability as a datapoint?

I think they can tell now, if they want to devote the resources to it.

However, this type of processing is time/CPU intensive and I'm not sure they want to do it on a massive scale. I'm not going to blueprint the techniques they should use to pick which pages to analyze, but they will have to make some choices. Using link data to weed out pages they don't need to analyze would be nice, but in this age of rampant link selling, link authority may not be as reliable an indicator as they would like. Usage data may not be effective because in order to get it, the page has to be indexed so they can track the clicks, defeating the purpose of spam elimination. Their best bet would be to look at creation patterns... look to see which domains are creating content and gaining links at an unreasonable rate.

What is the most amount of money you have ever made from ranking for a misspelled word? And if you are bolder than I am, what word did you spell wrong so profitably?

I made a lot of money from ranking for the word "incorparating". This was waaay back in the day. I probably made (gross) in the high five figures a year for several years from that word. Unfortunately, either people became better spellers or search engines got smarter, because the traffic began declining for the word about four or five years ago.

If I wanted to start cloaking where is the best place to go, and what all should I know before I start? Can you offer SEO Book readers a coupon to get them started with KloakIt?

KloakIt is a great cloaking program for both beginners and advanced users, because it is easy to get running and extremely flexible and powerful. There is a forum for cloakers there where you can go for information and tips. I am also the moderator of the Cloaking Forum over at WebmasterWorld, and I welcome questions and comments there.

SEO Book readers can get a $15.00 discount on a single-domain license of KloakIt by entering the coupon code "seobook" into the form on the KloakIt download page. I offer a satisfaction guarantee, and, should you decide to upgrade your license to an unlimited-domains license, you can get credit for your original purchase toward the upgrade fee.

----

Please note that I am not being paid an affiliate commission for KloakIt downloads, and I have not deeply dug in to try out the software yet. I just get lots of cloaking questions and wanted to interview an expert on the topic, and since Dan is a cool guy I asked him.

Thanks for the interview, Dan. If you have any other questions for Dan, ask them below and I will see if he is willing to answer them.

Published: July 10, 2006 by Aaron Wall in interviews

Comments

July 12, 2006 - 10:41am

Please do not speak for me in broad strokes. Generally there is a relationship with age and trust and I have seen too many instances to think otherwise.

Having said that, if you go after uncompetitive niches and/or get a ton of legitimate editorial citations and/or are not too aggressive in your initial marketing strategies (i.e., get quality links, not bulk low-quality links), new sites can rank.

Jeff Forest
July 12, 2006 - 8:14pm

a question for Dan: you say that

"if they are going to cloak, they should set up separate domains from their primary website and host the cloaked pages on those domains."

What do you mean by that? Sounds to me like you're saying that you will host the (cloaked) pages for one domain... on another domain? ie. something like this...

eg.
www.domain.com
/index.php -> hosted on domain.com server
(www.domain.com)/faq.php -> hosted on "separate domain".com server?

If so, how would you do this? with a frame where the content is derived from the separate domain? by doing a 302?

I'm just wondering about the mechanics of it all, and what you mean by that statement.

thanks,
Jeff

July 12, 2006 - 10:17pm

>if they are going to cloak, they should set up separate domains from their primary website and host the cloaked pages on those domains.

What I mean is that say you have a website at XYZ.com that you are trying to promote. You could obtain another domain, 123.com and create a cloaked page there.

When a search engine spider visits the cloaked page at 123.com, some optimized HTML is displayed.

When a human visits that same cloaked page at 123.com, the cloaking script sends out an HTTP request to XYZ.com for the home page. It then displays that home page to the visitor, under the 123.com URL. There is no frame or redirect. The cloaking script sort of acts like a proxy.
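
A bare-bones sketch of that proxy-style behavior might look like the following. The domains, spider IPs, and filenames are placeholders; this is only an illustration, not KloakIt's actual code.

<?php
// Placeholder list of spider IPs -- the real decision is an IP/User-Agent lookup.
$spider_ips = array('66.249.66.1');

if (in_array($_SERVER['REMOTE_ADDR'], $spider_ips)) {
    // Spiders get the optimized HTML hosted here on 123.com.
    readfile('optimized.html');
} else {
    // Humans get the primary site's home page, fetched from XYZ.com and
    // shown under the 123.com URL -- no frame, no redirect.
    echo file_get_contents('http://www.XYZ.com/');
}
?>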

July 13, 2006 - 5:12am

Aaron:"Please do not speak for me in broad strokes. Generally there is a relationship with age and trust and I have seen too many instances to think otherwise..."

I like that answer Aaron. And I suppose that answer does generally disagree with the theory quoted below: "the patent actually makes it clear that in some cases they may favour old sites (established, authority site on a topic that is not changing day-by-day), but in other cases will favour new sites"
***for topics that have very recently had a lot of news activity, a new site is likely to be better***

I will definitely take your word for that one, that "OLD SITES RANK BETTER BECAUSE THEY HAVE MORE TRUST"
But... I just want to make sure I get this right, so are you saying that *IN GENERAL* older sites have an advantage over newer sites since they are trusted more, or are you saying that *ALWAYS* older sites have an advantage over newer sites since they are trusted more?

Kat
July 19, 2006 - 8:36pm

I have a newbie question. I'm wondering if the cloaking risks discussed here apply also to the concept of affiliate link cloaking, such as that provided by GoTryThis, currently being touted across the web.

Thanks in advance for your answer.

July 19, 2006 - 8:47pm

You can use .htaccess or a free PHP jump redirect script to redirect your affiliate links.
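
For reference, a minimal PHP jump script of that sort might look something like this (the link IDs and affiliate URLs are placeholders):

<?php
// jump.php?id=widget  -->  302 redirect to the matching affiliate URL.
$links = array(
    'widget' => 'http://www.example.com/product.html?affid=12345',
    'gadget' => 'http://www.example.org/item.html?ref=12345',
);

$id = isset($_GET['id']) ? $_GET['id'] : '';

if (isset($links[$id])) {
    header('Location: ' . $links[$id]);
} else {
    header('Location: http://www.yoursite.com/'); // unknown id: fall back to the home page
}
exit();
?>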

I don't think obfuscating and redirecting them hurts you with organic rankings. Redirects might, however, lower your perceived "landing page quality" if you market the page in AdWords, which could mean paying more per click; the only way to know is to test it.

Also keep in mind how many computers Google has toolbars on. They probably have a good idea where traffic ends up after it winds up on your site.

Abe008
February 4, 2007 - 5:35am

Hello,

I am trying to build an affiliate site that shows multiple products. When a user clicks on a product they go to a sign-in page, and upon hitting enter on the sign-in page they are taken via a cloaked PHP link to the product page.

Is this still legal if I put a DO NOT FOLLOW on the product intro page so that the bots will not crawl the landing page? What would be a legal way for me to hide my affiliate ID when the user visits the actual product site? I do not want the user erasing my affiliate ID. Is there any other legal method to achieve this and still have my site get a high PR rating and stay search engine friendly? Any tip or guidance on how to do this and stay legal will be helpful and greatly appreciated. Thanks

July 10, 2006 - 3:57pm

Yes, very good read into the cloaking art. Shows some good reasons why cloaking may be needed for some sites (like the ones heavy on Flash).

Thanks for the info Aaron

February 4, 2007 - 10:39am

Legal is probably the wrong frame of reference. The question is not legal vs. illegal, but within the terms of service vs. outside the terms of service.

And keep in mind the terms of service are a set of loosely enforced guidelines aimed to control people through fear and make it easy and scalable for search engines to profit from indexing others' content.

July 10, 2006 - 5:15pm

Interesting interview. I always thought of cloaking as being "a lot of work" and this interview sort of confirms that for me.

Nick Pang
July 10, 2006 - 5:45pm

Excellent interview questions!

Good to understand the pros and cons of cloaking since I never used it...

Again, VERY useful information as usual :-)

July 10, 2006 - 7:52pm

Great Interview - lots of great education packed in there! Great job Aaron and Dan!

July 10, 2006 - 10:33pm

I have a new client that appears to need to cloak because the site is all Flash and uses authentication and a gateway to show one version of navigation to those who are logged in and another to those who are not.

Even the parts of the site that are open - due to the gateway - lose their pagerank and are not properly indexed for backlinks - and none of the site's pages rank well.

To me, cloaking in a situation like that seems to be a very legit way to go. Why does one have to be dishonest about it?

Is there a way to legitimately cloak? A way search engines will accept and that no competitor can use against you?

July 10, 2006 - 10:42pm

> Is there a way to legitimately cloak? A way search engines will accept and that no competitor can use against you?

If you mean is there a way to cloak with a guarantee of not being penalized, then the answer, unfortunately, is no. There are no guarantees in this business.

Cloaking a Flash site or cloaking to bypass authentication for search engines may be acceptable under human review by search engines, however. If you do intend to go this route, make sure the text you present to search engines matches the text in your Flash presentation. Do not do anything tricky. Your goal will be to pass a human review. What you are really doing is creating a text-only version of the site and providing that for the search engines. I would also advise making it available to human visitors.

yottabyte
July 11, 2006 - 3:07am

[Cloak Newbie] Question...

Are the cloaking risks IP address specific?

To cloak, should I get a different domain, or a different IP and domain? How are shared IPs affected? If one domain is banned on a shared IP, will the whole IP get banned?

Getting a new IP is not a big deal, but I do not want to try cloaking and get all my legit sites banned.

The Cloak SuperDuber-Rookie,
YByte

July 11, 2006 - 11:00am

Cloaking can be ethical when there is no other way for the spider to access the dynamic page.

July 11, 2006 - 11:45am

I would have to say that this interview made me understand the subject better than before, from most aspects. I had many cloaking scripts on my system, picked up from here and there, but from this interview I can clearly get the idea of where and how to do what on your website.

From this interview you get to know the legal aspects of cloaking and what is coming and what has to go in the future.

I would like to thank Aaron for taking his precious time out and providing us such know-how on this subject, and I am also most thankful to Dan for giving this interview.

Great work Aaron, keep it up; you always explain something difficult in a simple way, valuable enough to open our minds to the truth.

Keep it Up and Keep it Simple ;)

July 11, 2006 - 4:19pm

>To cloak, should I get a different domain, or a different IP and domain? How are shared IPs affected? If one domain is banned on a shared IP, will the whole IP get banned?

Usually, if a domain is banned, and other domains reside on the same IP address, the other domains are OK. However, if there is a pattern of "bad behavior" on the IP address, the engines just may ban all of the domains on the IP.

I usually recommend that people use unique IPs for each domain. I like to set mine up on different Class C subnets as well. By Class C subnet, I mean that the first three groupings of numbers in the IP address are not all the same.... like 123.123.123.xxx is different from 123.123.68.xxx.

July 12, 2006 - 3:31am

A few comments:

In the Google patent, there are many age factors they refer to. These include age of the domain, age of the inbound links and age of the individual pages. A common misconception is that "new" domains/links/pages are by default penalised in some way by Google. This isn't true - the patent actually makes it clear that in some cases they may favour old sites (established, authority site on a topic that is not changing day-by-day), but in other cases will favour new sites (e.g. for topics that have very recently had a lot of news activity, a new site is likely to be better).

There is nothing published by Google that says "new" = bad.

I also don't believe most of the sandbox comments. I believe it's perfectly possible for a newly created domain to appear high up (<50) in Google SERPs within a relatively short space of time. I think this doesn't happen very often because people focus too much attention on getting a lot of links in a short space of time, and the unnaturally rapid growth in links could easily trigger Google's "spammy link growth" filters. These filters can be triggered by old, established sites in just the same way - but the rapid growth in new links (coupled with no existing old links) means this problem is nearly always encountered with newly established sites.

Personally, I think this is the reason many people claim to be "sandboxed".

July 12, 2006 - 3:34am

Aaron Wall actually completely disagrees with this theory. In his ebook he states that Google focuses a lot on older domains, perhaps more than they should.

I'd be interested to hear your response to the above post Aaron!


tag
January 2, 2008 - 5:34pm

I've been cloaking sites for the last 5 years. I needed to have bots crawl from one site to another through links without letting humans see those links; all other content on the indexes and sites remained the same, and the links were placed normally on the site, and only once, in case a human did happen to find them. The method I used was a simple script placed at the top of my index. You could use a file to store the IPs instead of an array, since there would be so many.

In my case, since I wasn't doing anything other than adding a single link on a few sites, redirecting based on user agent would be OK too. Even though it could be spoofed, no one coming to the site would have a need to, unlike with some major affiliate marketing networks/sites. This could also be done with .htaccess, which has the benefit of letting you enter domains and IP blocks; I'm sure you could code that into the script as well.

I don't think there is anything bad about cloaking if people use scripts to redirect users based on browser for better navigation or screen resolution. Something as simple as displaying a different index to link your sites isn't going to hurt. Just don't overdo it and you shouldn't get banned or removed from indexes.

<?php
// Placeholder spider IP addresses -- in practice, load a much larger
// list from a file or database instead of a hard-coded array.
$bot = array("192.168.0.2", "192.168.0.1");

// If the visitor's IP matches a known bot, send it to the bot-only index.
if (in_array($_SERVER['REMOTE_ADDR'], $bot)) {
    header("Location: http://www.yoursite.com/index2.php");
    exit();
}
?>

And for User-Agent cloaking, something similar can check the HTTP_USER_AGENT header against an array or file of bot user agents:

// Redirect visitors whose User-Agent contains "bot" or "Bot" to the bot index.
if (strstr($_SERVER['HTTP_USER_AGENT'], 'bot') || strstr($_SERVER['HTTP_USER_AGENT'], 'Bot')) {
    header('Location: botindex.html');
    die();
}
