Spy on Visitor Browsing History for Competitive Research

Spyjax lets you check the browsing history of your website's visitors. You upload a list of competing URLs and see which ones each visitor's browser had already visited before reaching your site, which shows you which competing sites people typically see first. By tracking this, you can replicate the features and/or marketing strategy of other well-visited sites and move yourself earlier into the buy cycle.
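
The reporting side of this idea (a list of competing URLs in, per-competitor counts out) could be sketched roughly as follows. This is only an illustration and not Spyjax's actual code: the function names, the report format (one list of matched URLs per anonymous visitor), and the example hosts are all assumptions, and the in-browser detection step is not shown.

```python
from collections import Counter
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to its lowercase host so variants of one site tally together."""
    # urlsplit only fills in netloc when the string contains "//",
    # so bare hosts such as "example-b.test" get a "//" prefix first.
    host = urlsplit(url if "//" in url else "//" + url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def tally_visits(reports, competitors):
    """Count how many visitor reports mention each watched competitor host.

    `reports` is a hypothetical input: one iterable of URLs per anonymous
    visitor, with no name, email, or IP address attached.
    """
    watched = {normalize(u) for u in competitors}
    counts = Counter({host: 0 for host in watched})
    for report in reports:
        # Using a set means one visitor counts at most once per competitor.
        seen = {normalize(u) for u in report} & watched
        counts.update(seen)
    return counts


# Hypothetical example data (the .test hosts are placeholders).
competitors = ["http://www.example-a.test/", "example-b.test"]
reports = [
    ["http://example-a.test/pricing"],
    ["http://www.example-a.test/", "http://example-b.test/x", "http://other.test/"],
    [],
]
summary = tally_visits(reports, competitors)
```

Because the tally keeps only per-host totals, the output says how common each competing site is among visitors without recording which visitor went where.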

Published: June 1, 2007 by Aaron Wall in seo tools

Comments

webascender.com
June 4, 2007 - 2:28pm

Brilliant... it's so simple I have no idea why I didn't think of it ;)

Nice job.

There is no reason to debate the ethics of this find, people link spam, hack, etc. People will find ways to exploit this and devise elaborate phishing scams. I am glad you have pointed this out Justin, I have no intent or motive for using this @ this time but thanks for showcasing the capabilities.

Good time to write a firefox plugin to disable the alink vlink differences in the DOM :)

Ryan D
www.webascender.com

jacob
June 4, 2007 - 3:06pm

Will the links to my competitors be indexed by the SERPs? I don't want to give one-way outbound links to them.

Adam Moro
June 4, 2007 - 10:16pm

Dave: "Do I read Digg, reddit or just Google Reader? Do I have a myspace?..."

Google knows if you do these things. Do you have a problem with that too? Dave, do you also have a problem with people looking at your source code? I mean, there's no choice there either right?

Dave: "It is none of your damn business."

Everything you do on the Internet is my business if the information is available to me (by legal means, of course) and I have an interest in it. It's YOUR responsibility to safeguard yourself. Not Aaron Wall's, not Google's, not anyone's but yours.

Now, if you REALLY want to get scared, go do a search for website security and look at all the other vulnerabilities and exploits out there being used today (in your case, maybe do a search for wordpress security ;)

Thanks for the tip, Aaron. Great post!

dave
June 5, 2007 - 6:31pm

Adam Moro: "Google knows if you do these things."

Only if I gave them permission (assuming you are talking about Google Desktop). As I told corey above, using this code takes that choice away from your users.

"It's YOUR responsibility to safeguard yourself. Not Aaron Wall's, not Google's, not anyone's but yours."

a) I never said that it wasn't, in fact I mentioned that I do take precautions against it.
b) This is completely irrelevant. Do I make burglary more socially acceptable by locking my door?

"Dave, do you also have a problem with people looking at your source code?"

This is just a silly argument. Personal information is not the same as source code. Hiding source code is easy, don't use an interpreted language.

"Everything you do on the Internet is my business if the information is available to me (by legal means, of course) and I have an interest in it."

What an absurd sociopathic cop-out. There are plenty of XSS javascript attacks out there for you to use, that inject fairly well documented iframe trojans. These attacks are far too modern to have ever been addressed by the law - but as long as it's "available" and you have an interest, that must make it OK.

Stalking has only been defined as a crime since the early 90s. Was it ethically A-OK with you prior to that?

Again, viewing visitors' browsing history without their consent is not ethical. Use of this code is an attack against your users.

jamie
June 5, 2007 - 6:53pm

Aaron-very cool. Gotta play with it.

Dave-I hope you aren't on a computer with any personal info. Or use any email services like gmail, hotmail, yahoo, msn or pretty much any other one. Also, IM-ing would be out for you. No paying bills online or purchasing anything from eBay. Using a public computer out of town to view the news so that no one can see what political news stories you follow and know what part of the bipartisan system you relate to.

Get over yourself. The second you first open a browser window on a computer with internet access, you are asking someone to look at what you do. Relevant info for most of us is what other site that sells the same thing I do did you look at first - not what porn site you visited last.

dave
June 5, 2007 - 8:01pm

jamie: What world are you living in? I'm "asking someone" to look at what I do? I affirmatively consent to having my personal email, financial data, and reading habits harvested by anyone? By opening a browser window? Does anyone else remember consenting to this? That statement is so bizarre and factually baseless I can't imagine you are stupid enough to actually believe it.

"Relevant info for most of us is what other site that sells the same thing I do did you look at first - not what porn site you visited last."

Once again - which piece of data you are more interested in is inconsequential. You are equally unjustified in attempting to access either one. It is NOT YOUR DATA.

dave
June 5, 2007 - 8:16pm

It's worth noting that seobook.com would appear to be in violation of its own Privacy Policy by implementing this code:

"For each visitor to our Web page, our Web server automatically recognizes only the consumer's domain name and IP address (which is only recorded if they post comments)"
http://www.seobook.com/archives/000157.shtml

The Privacy Policy for merchantos.com, who host SpyJax, is a little less clear, but it could credibly be construed as a violation:

"We collect the domain name and e-mail address (where possible) of visitors to our Web page . . . aggregate information on what pages consumers access or visit, user-specific information on what pages consumers access or visit"
http://www.merchantos.com/privacy/

June 5, 2007 - 8:27pm

It would appear that Dave is in violation of being a trolling asshole for suggesting I implemented it on this site.

Ease up or complain somewhere else, but suggesting I implemented it on this site without changing the privacy policy is poor form.

Dan
June 5, 2007 - 8:39pm

Dave,

I find your argument powerful, persuasive, accurate and ultimately irrelevant. You are absolutely right - 99% of average consumers would see this as an invasion of privacy. If you sent an email out to your customer base and asked if they minded if you spied on what sites they had visited - I am sure they would all mind.

Why do I find the argument irrelevant - people will still use this exploit - it is out there and just a fact of online life now. Just as viruses, spyware and spam are all unethical but still in use.

Here is how I see this becoming a powerful sales tool...

Since the early days of the web (and all ad copy for that matter), lesson number 1 has always been to know your audience and write to your audience. We all do this with a landing page that ties into a keyword buy. If you sell used and new cars, and someone clicks on your ad for "used car" you want to have them land on a page that is all about used cars.

Now imagine the possibilities with this hack...

1. When a user visits your site you check their history against a list of popular car sites.
2. You store this info, user IP and email (if they sign up for your free newsletter) in a data table.
3. You use this data to dynamically generate popups on your site.

Has your user been to the Toyota site? Hit them up with your best deals on a new Toyota. Have they been to Cars.Com - show a table which explains why your site is better than Cars.Com. If the user signs up for your free newsletter - now you can send them offers specific to Toyota cars.

The possibilities for custom-tailored content that talks directly to your audience are endless.

While I think some might argue this is a positive use of this hack - Dave is right - still invades their privacy.

-Dan

jamie
June 5, 2007 - 9:08pm

I work for an ad agency that specializes in new home builders. I vaguely mentioned it to my supervisor and had people coming into his office to tell which sites to test this nifty little function on.

FYI-I will have a notice on each of these web sites that I test but I can guarantee you that less than 5% of the people who traffic our sites will read any notice or disclaimer that I place.

As to privacy policy, if you in any way think that anything you do on the internet is 'private' you have nicely deluded yourself. Anyone in the technology industry, particularly on the server side of operations can tell just how ethereal that 'privacy' is. And while there are child molesters and terrorists and all sorts of scary people out there, you will continue to have less and less 'privacy.' Now an ad agency, or web site owner, viewing a visitor's history to gain some info about the competition, esp in the form that this little Spyjax provides, is not going to get me your mother's maiden name and where you graduated preschool. For those people who do hack, they have much better ways of doing so.

dave
June 5, 2007 - 9:16pm

Aaron: "suggesting I implemented it on this site without changing the privacy policy is poor form."

I tried to word my post carefully and really did not mean to suggest you had actually implemented it. If I wasn't careful enough I apologize.

dave
June 5, 2007 - 9:34pm

Dan: "viruses, spyware and spam are all unethical but still in use."

I completely agree.

"Now imagine the possibilities with this hack..."

Attacking my own users is where I get off the boat. Good luck to you.

Alin Rosca
June 5, 2007 - 10:31pm

I agree with Dave's points. Spying on your customer is simply immoral. Turning a blind eye on this and focusing on technicalities does not make it become ethical.

Some are arguing that this is not theft, and focus on how the data is used/reported (i.e., in the aggregate), as opposed to how it is collected. I appreciate that Spyjax's author is concerned with privacy. But his focus is on the use, not on the gathering of data.

Whether the use of data is ethical or not, the gathering is unethical when the visitors are unaware of that. That data is someone's property. When you are taking something that belongs to someone without that someone's informed consent, that is theft, and is unethical.

Unfortunately, this application will catch on. But the fact that a lot of people do something doesn't make that something either less or more moral. I can't see how gathering a visitor's personal data (regardless of how that data is subsequently used & reported) without his/her permission is morally defensible in this case.

Adam Moro
June 6, 2007 - 1:44am

@ Dave - You sound like a smart guy so I can't help but wonder what you expect to accomplish by your comments on this post. I'm honestly curious. Do you really think that the people taking advantage of exploits like this care what's right and wrong (especially the ones you mentioned that do so with malicious intent)? You know just as well as anybody here that there really isn't a practical way to govern these activities. Even if there was a law that addressed this specific exploit, how could it be governed? Besides, the Internet is much too volatile to tame so why waste your time preaching about how it *should* be?

Now, to address your replies to my comment:

"Only if I gave them permission (assuming you are talking about Google Desktop). As I told corey above, using this code takes that choice away from your users."

I'm not talking about Google Desktop. Use your head a bit. Just because you "opt-out" of email subscriptions doesn't mean you stop receiving spam.

"a) I never said that it wasn't, in fact I mentioned that I do take precautions against it.
b) This is completely irrelevant. Do I make burglary more socially acceptable by locking my door?"

a) You mean you took something positive away from this article?? Congratulations!
b) How could it possibly be construed as irrelevant? Are you trying to change the world or have a discussion about Internet ethics? Of course burglary isn't socially acceptable but you still lock your doors! Of course XSS attacks are socially unacceptable but you still close your holes!

"This is just a silly argument. Personal information is not the same as source code. Hiding source code is easy, don't use an interpreted language."

First of all, you can't "hide" your source code (you can IP cloak, obfuscate, take measures to make it less accessible, etc. but ultimately you can't block someone from viewing the source of a webpage they're currently viewing). With that being the case, are you seriously stating there can be nothing in your source that would be considered personally identifiable? Do you use tracking scripts, affiliate programs, any service that asks you to insert a line of code (containing a unique identifier of some sort) into your page(s)?

"What an absurd sociopathic cop-out. There are plenty of XSS javascript attacks out there for you to use, that inject fairly well documented iframe trojans. These attacks are far too modern to have ever been addressed by the law - but as long as it's "available" and you have an interest, that must make it OK."

Don't hate the player! ;)

"Stalking has only been defined as a crime since the early 90s. Was it ethically A-OK with you prior to that?"

What an absurd "sociopathic" analogy. Again Dave, this is about Internet ethics. The fact that you compare this exploit to stalking someone (which, by the way, is something I find despicable) is ridiculous. We're talking about data - not people. Don't use an interpreted language.

One more question. Did you ever so much as glance at the AOL data that came out last year? I'd be very curious to hear the HONEST answer to that question.

Chris
June 7, 2007 - 3:01am

Alin: Whether the use of data is ethical or not, the gathering is unethical when the visitors are unaware of that.

In total agreement. I wouldn't have any problem with this practice IF the user had specifically consented to it. But this is the internet equivalent of having somebody break into your home and steal any documents they like to do whatever they please with. Even if it's only ever to be used for positive profiling / marketing purposes - it is theft, and it's that black and white as far as I'm concerned.

Sure, you expect any site you visit to profile you as best they can and retain and build upon this data over time, and I'm sure most of us posting comments here know by and large what kind of data that is likely to be as we're probably all involved in that to one degree or another I'm guessing. I think generally speaking most people know this and accept it.

The difference with this practice is that it is clandestine, illicit, surreptitious, and unethical because the user is not consenting to give that kind of information away.

Imagine if I ordered a pizza by phone. And then had some other pizza company monitor which phone numbers I had been dialing and detect that I'd ordered a pizza from a rival company so they decide to call me and offer a $2.00 discount. Yeah sure, I might benefit from a $2.00 discount if I order from them in future, and that's useful to know, but I don't really want the rival pizza company knowing who I've telephoned and then disturbing me. Would you be happy for that to happen left, right and centre with the phone? Maybe that's not the best analogy ever but it's 2am *props up eyelids*. The point being... I don't remember signing up to give the telephone numbers I've dialed away to anyone other than to the phone provider and the person I call, so why should the internet be any different? It isn't a free for all!

Just because it's out there and already happening, and probably will do forever more to some extent, doesn't make it acceptable. And people can't keep defending and justifying the practice with the excuse of the internet being ungovernable and a no-man's land when it comes to the law. I agree it's fairly hard to govern and this is probably going to be the case forever to certain extents, but that doesn't make this practice any more acceptable. I hope that any major sites found to be using anything like this are 'outed' and stigmatised a la Comet Cursors!

Maybe Dave, just like me, wasn't actually trying to achieve anything! Maybe he just wanted to air his views in a 'Post a comment' section of a very interesting discussion, which he is absolutely entitled to, just as I am.

June 7, 2007 - 6:33am

The action that people miss here... is that the USER IS REQUESTING whatever I want to deliver. They asked for the relevancy.

Google shows me local ads and redirects some people to the local service. They also match ads to text in Gmail. Why shouldn't I make the user experience as relevant as possible?

Requested relevancy is totally different from a cold-call offer for pizza.

Chris
June 7, 2007 - 11:00am

With regards to GMail, and the display of matching ad content, then yes, that is requested relevancy, as the user has to agree to the T&C of a user agreement before an account can be created. I use GMail and have no problems with this, I was happy to accept the T&C of a user agreement.

With regards to Google and any other search engines delivering local ads, this is requested relevancy, and I don't see a problem with this either, that's the nature of search engines.

But in neither of these instances is the user giving permission to have their browser history (away from the sites in question) perused and used. And this has to be the case for the practice to be ethical.

Google does track your entire browsing history through use of its browser Toolbar if you agree to install that. They make this very clear during the installation process, and by installing the Toolbar, you are effectively saying yes look at what I surf I don't mind. I personally have the Toolbar installed and I don't mind. I have given my consent to Google.

However, I don't want the world and its dog to be able to look through my history and do as they please, even if it is for beneficial purposes. And I've got to say that some of the potential possibilities of Spyjax are great for both the visitor and site owner. However, its very name 'Spyjax', and the title of this article 'Spy on Visitor Browsing History for Competitive Research', say it all I'm afraid. Unless sites using Spyjax or similar software alert the user that by proceeding to use the site any further they agree to have their browser history snooped around in, then it is crossing a line.

Aaron I've read your articles over the last couple of months now, and you've written some pretty fine and thought provoking and downright helpful material along the way, and I for one hope you continue to do so, I'll certainly continue to read your output. However, I'm really disappointed that you appear to be in support of this product / practice, and don't appear to be taking the ethics of it into account.

Forgive me if I'm wrong but other than cookie usage (which will track movement from the point the cookie is created), nowhere in the user agreements of the browser software I use do I agree for my retrospective browsing history to be viewed.

Like I say the potential positive possibilities from a marketing perspective and for the end user are very appealing. But going about it in an underhand and surreptitious manner is not appealing and I imagine will outrage many if not most people if they haven't consented to it. It's black hat marketing, no two ways about it in my mind.

Bill Hartzer
June 1, 2007 - 11:10pm

This is too cool. I didn't know that such an extensive browser history is available. Will definitely be using this tool. Thanks for mentioning it, Aaron.

Dave
June 1, 2007 - 11:43pm

I've always been conscious of the technical possibility of this and taken some safeguards against it. Still, as a user, I'd be furious if I knew this technique were being used on me, and I will be keeping my eye out for any precedent-setting legal challenges to this.

As a publisher/affiliate, I refuse to stoop this low. It's disappointing but not unexpected that a great deal of readers here would be so sanguine about something so blatantly unethical.

Your user's history object is none of your fucking business.

corey
June 2, 2007 - 12:14am

"Your user's history object is none of your fucking business."

So when I get phone calls from people considering my software, I have no place asking who else they've shopped?

"I refuse to stoop this low."

How does this hurt the consumer? What if I'm a portal like capterra, and my sole mission is to recommend alternatives that the shopper has yet to consider?

Dave
June 2, 2007 - 1:04am

"I have no place asking who else they've shopped?"

Ask all you want. The difference, of course, is that the person being asked has the option of declining to answer. You have no place going through personal data that wasn't volunteered to you.

"How does this hurt the consumer?"

Today it's your competitors' sites, tomorrow maybe you'll be wondering about which email service I use? Do I read Digg, reddit or just Google Reader? Do I have a myspace? Have I been buying stuff on ebay? Do I bank with Wachovia, or B of A? What kind of porn have I been looking at? The privacy implications and potential for abuse involved in examining browser history data, data which is almost certainly not limited to legitimate competitive intelligence, really ought to be obvious here. It is none of your damn business.

Dave L
June 2, 2007 - 1:26am

Of course, Google collects user data as a feature:

http://www.seobook.com/archives/002272.shtml

Wodow
June 2, 2007 - 1:50am

And here is the counter-solution, if you are using Firefox:

http://safehistory.com/

Sean
June 2, 2007 - 1:51am

wow, that is a neat trick. this is nothing compared to what Google has on us all.

corey
June 2, 2007 - 2:01am

"Ask all you want. The difference, of course, is that the person being asked has the option of declining to answer. You have no place going through personal data that wasn't volunteered to you."

Ok--so you'll agree that this code could be used for good? Aaron didn't say "steal browser history", and no one is encouraging shady use of this data. You assumed the worst.

"The privacy implications and potential for abuse involved in examining browser history data, data which is almost certainly not limited to legitimate competitive intelligence, really ought to be obvious here. It is none of your damn business."

'potential for abuse' in sentence one. 'none of your damn business' in sentence two. You're not considering any responsible use of this information? It's just all bad?

Justin
June 2, 2007 - 7:38am

Hi, I'm the creator of Spyjax. I think Dave has a valid point. However Spyjax does not track who you are, it just tracks what websites you have been to and reports that. There is nothing that can tie you as an individual to that data within Spyjax. It's basically anonymous data reported in aggregate form.
That being said it would be pretty easy to extend this same concept on a site that knows who you are (by email address, or name etc). Then a record could be kept of where 'you' in particular have been.
IMHO it comes down to the same issue as always: be careful where you give your personal information. It can be used to associate you with many things, not just your browser history.
Thanks for the write up Aaron!

Dave
June 2, 2007 - 8:02pm

corey: "no one is encouraging shady use of this data. You assumed the worst."

Yes, *of course* I did. Why should I trust you to live up to your own definition (let alone mine) of non-shady use of my data?

http://en.wikipedia.org/wiki/Identity_theft#Spread_and_impact_of_consume...

"You're not considering any responsible use of this information? It's just all bad?"

You are missing the point.

YOU do not get to decide what constitutes responsible use of MY information.

Given the choice, I would prefer you to simply not see that data at all. I'm guessing so would the vast majority of users everywhere. You would rather just bypass that choice, and collect information that a) isn't yours, and b) your users don't want you to have.

To me that is, simply by definition, a violation of someone's privacy. I don't see how you could describe it otherwise.

chris
July 4, 2007 - 3:36am

I've known about this for a while, it's not a bad workaround. And it's totally ethical.

If people want cheap products on the web, free stuff on the web, they have to give something to those companies. Typically it's basic browsing information, which will only help these companies make things better for the client's visitors and help them in return.

I miss the days when the History object was fully available and I didn't have to go to server logs to see my referrers. :(

Chris
June 3, 2007 - 2:36pm

Corey: Ok--so you'll agree that this code could be used for good? Aaron didn't say "steal browser history", and no one is encouraging shady use of this data. You assumed the worst.

Whilst Aaron didn't use those words, the headline of this article is 'Spy on Visitor Browsing History for Competitive Research', and the software is called 'Spyjax'. The operative word being 'spy'.

No matter what purposes you are actually delving into a user's browser history for, be it for competitive research, or to have a nosey around, it is spying, and snooping is most definitely a shady practice and an invasion of privacy as far as I'm concerned.

I'm 100% with Dave on this.

Unless you have agreed for your entire browser history to be viewed by all and sundry in an agreement for the software you are using to browse websites, then the only people that should ever have access to that history are yourself, and the authorities if appropriate. It is a privacy violation, no matter what the intentions are. You may as well come round my house and root through all my drawers, cupboards, and dirty laundry whilst you are at it!

Totally unethical.

James Oppenheim
June 3, 2007 - 3:43pm

Wow, interesting article, I never thought of using Ajax that way. But anyway there is so much data out there I think my browser history is the least of my problems.
