Link Harvester - Free & Deep Access to Link Information

Tool from Last Month:
None of the major text link analysis tools for sale allow you to check co-citation, or pages which link to multiple related resources.

Last month I had a friend create Hub Finder, a free on-topic link analysis tool that looks for co-citation. I have not received much feedback on the tool yet, but a few people have said they found it useful.

New SEO tool for this month:
Another common problem with most link analysis tools is that they do not make it quick, easy, and convenient to search past the 1,000 backlink barrier set by most search engines. What is the point of a tool that is slow because it gathers more detail than you need, yet surveys only a small portion of the inbound links?

A friend of mine is a decent programmer, and I had him whip up a tool I call Link Harvester, which has a ton of cool features:

  • uses the Yahoo! API, so it is in compliance with their TOS.
  • free
  • makes saving and exporting data in CSV as simple as a click of the mouse
  • does not require any software downloading
  • quickly grabs the number of .gov, .edu, & .ac.uk inbound links while also listing each individual link.
  • quickly grabs the number of unique linking domains while listing them
  • quickly grabs the number of unique linking C block IP addresses while listing the C block next to each domain
  • allows you to check links pointing at a page or at a domain
  • displays the total number of links shown by Yahoo!
  • displays the total number of pages indexed by Yahoo!
  • shows links next to each domain that point to its WhoIs source information and Wayback Machine information.
  • if a site links to your site more than 5 times then it is bolded in the results and a checkbox is autochecked, which allows you to filter out that site and spider deeper through the link database. This harvesting action is how you can spider deeper than 1,000 backlinks, and it is where the tool got its name.
  • Link Harvester is open source. If you like the tool & find it useful you can add it to your site. Also if you can think of ways to make it better you can modify it however you please.
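The counting features in the list above can be sketched in a few lines. This is only an illustration in Python, not the tool's PHP source: the function names, the pre-resolved `ip_by_domain` mapping, and the choice to treat each hostname as its own "domain" are all assumptions made for the example. Only the "more than 5 links" cutoff and the .gov / .edu / .ac.uk suffixes come from the feature list.

```python
from collections import Counter
from urllib.parse import urlparse

# Taken from the feature list: sites linking more than 5 times get flagged.
REPEAT_THRESHOLD = 5
SPECIAL_SUFFIXES = (".gov", ".edu", ".ac.uk")


def c_block(ip_address):
    """Return the first three octets of an IPv4 address, e.g. '192.0.2'."""
    return ".".join(ip_address.split(".")[:3])


def summarize_backlinks(backlink_urls, ip_by_domain):
    """Group backlink URLs by linking hostname and flag heavy linkers.

    ip_by_domain is a hypothetical {hostname: IPv4 string} mapping that the
    caller resolved beforehand; how the real tool finds IPs is not described.
    """
    hosts = [urlparse(url).hostname for url in backlink_urls]
    links_per_domain = Counter(host for host in hosts if host)

    # Count links (not domains) coming from each special suffix.
    suffix_counts = Counter()
    for host, count in links_per_domain.items():
        for suffix in SPECIAL_SUFFIXES:
            if host.endswith(suffix):
                suffix_counts[suffix] += count

    c_blocks = {
        c_block(ip_by_domain[host])
        for host in links_per_domain
        if host in ip_by_domain
    }

    # Domains that would be bolded and pre-checked for filtering.
    filter_candidates = sorted(
        host for host, count in links_per_domain.items()
        if count > REPEAT_THRESHOLD
    )

    return {
        "links_per_domain": dict(links_per_domain),
        "unique_domains": len(links_per_domain),
        "unique_c_blocks": len(c_blocks),
        "suffix_counts": dict(suffix_counts),
        "filter_candidates": filter_candidates,
    }


if __name__ == "__main__":
    sample = ["http://blog.example.edu/post-%d" % i for i in range(6)]
    sample += ["http://www.example.org/links.html", "http://example.ac.uk/resources"]
    ips = {
        "blog.example.edu": "192.0.2.10",
        "www.example.org": "192.0.2.77",
        "example.ac.uk": "198.51.100.5",
    }
    print(summarize_backlinks(sample, ips))
```

Filtering out the flagged domains and querying again is what lets each new batch of results surface sites that were previously crowded out by the heavy linkers.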

Why Not Look at Anchor Text?

  • I did not want this tool to spider websites.
  • I wanted this tool to be faster than anything on the market.
  • It is important to understand what anchor text variations people are using, but usually you can figure out how stiff the competition is just by quickly glancing through their backlink profile without necessarily looking too deeply into anchor text. The current off-the-shelf tools that monitor anchor text only give you a small sample of backlink data.
  • This tool was not designed to be the comprehensive show all link analysis tool, but just something that was useful and quick and easy to use.

After you see enough linkage data you become aware of how competitive a site is and how you should go about promoting it. It is kinda like the thin slicing concept Malcolm Gladwell talks about in Blink.

Feedback:
Please let me know what you think about Link Harvester in the comments below.

Want to Host Link Harvester? Want to make it better?
Grab the source code here.

Published: May 6, 2005 by Aaron Wall in seo tools

Comments

July 27, 2006 - 9:50pm

Well installed and re-installed it carefully ..a couple of times: http://www.afroarticles.com/SEO-Search-Engine-Optimization-Tools/Link-Ha...

------------------------------
Getting error: Fatal error: Call to undefined function: domxml_open_mem() in /home/afroarti/public_html/SEO-Search-Engine-Optimization-Tools/Link-Harvester/class_gamy.php on line 200
------------------------------

Checked all the includes...they look fine to me.

What's happening here?

January 8, 2007 - 5:49am

Aaron, I have got to tell you that I use this tool every week. In fact, I probably use one of your great tools every day. Thanks!

Kevin
May 18, 2006 - 12:35pm

Would you consider adding the "target page" of the link that is found?

I can't find a single tool out there that will find backlinks for an entire domain (not just a single URL at a time) *and* give the target page. This is critical when performing site redesigns.

This script is a great start - any chance of those features being added?

May 18, 2006 - 1:01pm

Hi Kevin
when my programmer is free there is a pretty good chance of it being added, I think...

you also realize that Yahoo! Site Explorer is pretty handy for doing what you want to do, right?

March 8, 2007 - 12:53pm

Although the tool seems to be working, I also get this displayed on repeated lines when I run a query:
Warning: parse_url(http://) [function.parse-url]: Unable to parse url in /home/latentse/public_html/backlinks.php on line 309.

See here:
http://www.latentsemanticindexing.co.uk/backlinks.php

Any ideas please?
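The warning text shows that a bare "http://" with no host reached the URL parser. Why the tool produced such a value is not stated in this thread, so the following is only a sketch of the general guard, written in Python rather than the tool's PHP: skip any result whose URL has no host before trying to group it by domain. The function names are made up for the example.

```python
from urllib.parse import urlparse


def linking_host(url):
    """Return the lowercase host of a URL, or None when it has no host."""
    if not url:
        return None
    return urlparse(url.strip()).hostname


def hosts_from_results(urls):
    """Keep only the results that actually contain a host name."""
    return [host for host in map(linking_host, urls) if host]


if __name__ == "__main__":
    results = ["http://www.Example.com/page", "http://", ""]
    print(hosts_from_results(results))
```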

Curt
June 7, 2006 - 9:22am

Looks like a great tool. I am trying to install it, but it seems as though it was written for PHP4 and not PHP5, i.e. it won't operate under PHP5.

Do you have any plans to update the code to PHP5?

I am sure your handy dandy coder could fix it in no time flat :)

Thanks so much - again, great tool.

June 7, 2006 - 9:55am

Hi Curt
since it is currently free and functional an upgrade is low on the priority list, but it may eventually happen.

curt
June 7, 2006 - 5:29pm

Aaron,

Thanks. I have actually tried to install on a PHP4 machine and there seem to be additional errors: undefined variables etc. So I will continue to try to diagnose the issues.

August 29, 2006 - 1:38am

I'd like to know more about this. Can anyone help me? Thanks

Rhoda Schueller
May 9, 2006 - 7:28pm

Link Harvester for MSN is showing these problems with the coding:

Warning: Invalid argument supplied for foreach() in /home/linkhou/public_html/link-harvester/class_gamy.php on line 62

Warning: Invalid argument supplied for foreach() in /home/linkhou/public_html/link-harvester/class_gamy.php on line 62

Warning: Invalid argument supplied for foreach() in /home/linkhou/public_html/link-harvester/class_gamy.php on line 62

Warning: Invalid argument supplied for foreach() in /home/linkhou/public_html/link-harvester/class_gamy.php on line 62

Warning: Invalid argument supplied for foreach() in /home/linkhou/public_html/link-harvester/class_gamy.php on line 62

I noticed it the other day. It is still showing the problem today so I thought I should let you know.

Thanks for the great tool.
Rhoda

A Reader
April 18, 2006 - 11:21am

Excellent!

But: Put up a little key/guide to what everything means

* The checkboxes - what are they for and why are these domains bolded?

* The letters - what do they mean? (I can work out most by mouseovering, but it's inconvenient)

Is there any way to mark probable scraper sites linking in? For example, can it be figured out by all the Google or whatever search results clogging up their pages or by the percentage of adverts on the page? (Probably not, but scraper sites are SOO annoying)

May 9, 2006 - 7:59pm

Hi Rhoda
I just used Link Harvester and it worked for me.

Tom
May 10, 2007 - 9:54am

On Dreamhost it doesn't work :(

Graeme
February 17, 2007 - 12:17am

Tried your link harvester trial...you have a code error on the msn query that might need some attention.....no use buying something that doesn't work :)

Kemp
August 1, 2007 - 5:52am

Good tool, however, I'm quite worried that people are able to also see private addresses published with this tool. Is this not a breach in privacy policy?

January 9, 2006 - 9:28pm

When you run the tool it lists the links--when you click on a link it goes to that page (which is fine). But, if you use the Back button you lose all the data you just got by running the tool.

Those links need to open up in a new window so you don't have to run the tool again.

Kate
January 10, 2006 - 9:59am

Hi,
I just tried link harvester and it's a great tool. I'm confused though: when I export the .csv file instead of getting the actual URLs of pages linking to my site, I get the text "Array" in each field. I used link harvester a couple of months ago and was getting the actual URLs, so I'm not sure what has changed. Any help?

thanks for the clever tool.

July 3, 2005 - 11:56am

Hi Aaron,

Thanks for having a great tool developed. I started using your tool when you first wrote about it and now it is part of my "SEO Tool Chest". I didn't want to use up too much of your API limit, so I mirrored the tool at http://webseodesign.com/seo-tool-chest/backlinks.php.

Thanks for making this Open Source.

martin

January 10, 2006 - 11:51am

power comment here ;)

>What does "filtered sites" mean in the link harvester returns?

it means many links came from that site and it was filtered to view more sites linking in

>Question though. When I am searching on a particular page ie www.domain.com/pagename.htm what is the difference between "links to domain" and "links to homepage"?

Links to home page is links pointing at www.seobook.com or seobook.com

Links at site is the linkdomain function (all links pointing into the site)

>Love the SEO tools. Can you clarify something. It's a question that I've always had in regards to the API's. The limit is 1,000 to 5,000 daily requests depending on the search engine, correct? What exactly is a request?

I believe you can grab up to 50 search results per request.

>When you run the tool it lists the links--when you click on a link it goes to that page (which is fine). But, if you use the Back button you lose all the data you just got by running the tool.

>Those links need to open up in a new window so you don't have to run the tool again.

will try to get that fixed soon Bill.

>export the .csv file instead of getting the actual URLs of pages linking to my site, I get the text "Array" in each field. I used link harvester a couple of months ago and was getting the actual URLs, so I'm not sure what has changed. Any help?

Friend must have whacked the tool while adding features ;)

Will try to get that fixed soon Kate.

August 31, 2006 - 9:33pm

Link Harvester is offline for some days already. What happened? Too much traffic? Do you need mirror or something?

David
August 2, 2007 - 10:11am

Hi Aaron,

Excellent tool, however I am experiencing the same problem as Howard with the parse_url function.

"Warning: parse_url(http://): Unable to parse url in /var/www/html/tools/link-harvest/backlinks.php on line 309"

Please advise.

Rgds.

August 24, 2005 - 10:54am

http://tools.zettwalls.com/backlinks.php

this tool doesn't work at all.

can you tell what happened?

May 5, 2005 - 9:25pm

Great tools Aaron - I like how Link Harvester groups the links by domains.

Bill
October 26, 2005 - 11:01pm

Aaron,

Love the SEO tools. Can you clarify something. It's a question that I've always had in regards to the API's. The limit is 1,000 to 5,000 daily requests depending on the search engine, correct? What exactly is a request? When I use your tool to search for the backlinks to my site is that one request? Or is the 1,000 returned results each considered a request? This has always confused me. Thanks in advance for any info.

September 19, 2005 - 11:13pm

Hello!

Fantastic tool, thanks for making it open source. I <3 this thing. I <3 it.

Question though. When I am searching on a particular page ie www.domain.com/pagename.htm what is the difference between "links to domain" and "links to homepage"?

I'm guessing that "links to homepage" is how many total links(even if more than one comes from the same url) I have to the URL www.domain.com/pagename.htm.

May 7, 2005 - 5:03pm

I tried the Link Harvester, but regrettably discovered an error. Surprisingly it occurred on domains that I know have extensive backlinks.

Warning: Division by zero in /**/**/**/**/**/backlinks.php on line 447

I hope this helps.

October 27, 2005 - 2:01pm

2 Mike
when you do links to domain it does linkdomain:
when you do links to page it does link:

2 Bill
each engine has its own api

this is the current info (which may change)

Google's API limit is 1,000 daily usages per user key

Yahoo!'s API limit is 5,000 daily uses per IP address. if the tool is web based then all queries using that tool are from the IP address of the website

MSN is like Yahoo!'s, but offers 10,000 daily uses

the limits can be modified if you get permission
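To make the "what is a request?" arithmetic concrete, here is a small Python sketch that combines the daily limits listed above with the "up to 50 search results per request" figure given elsewhere in this thread. That figure was stated as a belief rather than a certainty, and whether it applies equally to every engine is an assumption, as are the function names.

```python
import math

# Daily limits as listed above (noted there as subject to change).
DAILY_LIMITS = {"Google": 1000, "Yahoo!": 5000, "MSN": 10000}

# Assumption: each API request returns at most 50 results.
RESULTS_PER_REQUEST = 50


def requests_needed(result_count, per_request=RESULTS_PER_REQUEST):
    """Number of API requests needed to page through result_count results."""
    return math.ceil(result_count / per_request)


def full_lookups_per_day(engine, result_count=1000):
    """How many complete lookups of result_count results fit in a day."""
    needed = max(1, requests_needed(result_count))
    return DAILY_LIMITS[engine] // needed


if __name__ == "__main__":
    print(requests_needed(1000))           # 20 requests for 1,000 results
    print(full_lookups_per_day("Yahoo!"))  # 250 lookups under a 5,000 limit
```

Under these assumptions, pulling 1,000 backlinks is 20 requests, not one and not 1,000.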

j jensen
May 9, 2005 - 11:57am

displays the total number of pages indexed by Yahoo!

is this the total number of pages indexed of the domain you search backlinks for?

May 9, 2005 - 6:17pm

Getting similar problem when trying to check backlinks of a particular page. Like...
www.domain.com/pagename.shtml

Warning: Division by zero in /home/wmcommun/public_html/hounds/link-harvester/backlinks.php on line 447

May 13, 2005 - 1:44am

>displays the total number of pages indexed by Yahoo!

>is this the total number of pages indexed of the domain you search backlinks for?

it should be. sometimes the number might be a little off since the API data and Yahoo! data may be a bit behind actual web conditions because it takes time for them to spider pages and find links and update their data.

>Getting similar problem when trying to check backlinks of a particular page. Like...
www.domain.com/pagename.shtml

the backlink checking function is different for individual pages and sites.

when you check a backlink into a specific page you need to add http:// ahead of all the other information. Not sure why Yahoo! made it that way, but they did.

Hopefully I can try to have my friend rewrite it with better error checking to auto correct that situation.

thanks for the feedback
cheers
aaron

May 13, 2005 - 10:34pm

the tool was changed to where the http:// part is no longer needed. also the results now show the request URL that was sent to Yahoo! for troubleshooting purposes.
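A sketch of what that kind of auto-correction might look like, in Python rather than the tool's PHP. The `link:` and `linkdomain:` operators and the need for `http://` on page lookups come from this thread; the function name, the `link_type` values, and the assumption that `linkdomain:` wants a bare host are mine.

```python
def build_backlink_query(target, link_type):
    """Build the search query string for a page or domain backlink lookup.

    link_type is "page" (uses link:) or "domain" (uses linkdomain:).
    """
    target = target.strip()
    if link_type == "domain":
        # Assumption: linkdomain: takes a bare host, so drop scheme and path.
        host = target.split("://", 1)[-1].split("/", 1)[0]
        return "linkdomain:" + host
    if link_type == "page":
        # link: needs the scheme, so add http:// when the user left it off.
        if "://" not in target:
            target = "http://" + target
        return "link:" + target
    raise ValueError("link_type must be 'page' or 'domain'")


if __name__ == "__main__":
    print(build_backlink_query("www.domain.com/pagename.shtml", "page"))
    print(build_backlink_query("http://www.domain.com/pagename.shtml", "domain"))
```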

December 2, 2005 - 7:19am

I also added a mirror to my site:

http://www.webmasterinvestments.com/backlinks/

July 18, 2005 - 7:21pm

Thanks for the cool tool.
What does "filtered sites" mean in the link harvester returns?
R

May 28, 2005 - 5:10pm

Hi, I've added a mirror on my seo-scoop site for the tool, and I will be blogging about it tomorrow.

Bounty
September 30, 2006 - 2:34pm

Yes, I would like to know as well. I have the same problem (blank page), but I also noticed that on the site the "Start over" button does not work, while in the code that you download now it does work but does not submit. The CSV export also gets a Java error, and the other sites that work don't.

http://pcaccessoriesparts.com/Tools/Link_Harvester/backlinks.php

ERROR FIXES
To some of the other posts above, the below errors

Warning: Invalid argument supplied for foreach() in /home/linkhou/public_html/link-harvester/class_gamy.php on line 62

_____________________________________

Getting error: Fatal error: Call to undefined function: domxml_open_mem() in /home/afroarti/public_html/SEO-Search-Engine-Optimization-Tools/Link-Harvester/class_gamy.php on line 200

______________________________________

I had both at one stage or another. I don't think both are API related, but the latter is. You need to go get an API key for your site.

Tom
October 11, 2006 - 4:15pm

With the Yahoo engine the tool doesn't do anything at all on my server. The MSN engine is working just fine. I tried to change the Yahoo API but it won't change anything.

Geoff Vines
June 3, 2005 - 2:45pm

Just tried to use your link harvesting program and received the following error:

Warning: file_get_contents(http://api.search.yahoo.com/WebSearchService/V1/webSearch?query=linkdoma...): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in /home/wmcommun/public_html/hounds/link-harvester/backlinks.php on line 338

Fatal error: Call to a member function on a non-object in /home/wmcommun/public_html/hounds/link-harvester/backlinks.php on line 342

June 4, 2005 - 2:32am

403 errors mean the query limit is used up for the day. try one of the mirrors.
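In the error output above, the fatal "non-object" error appears to follow directly from the failed fetch: the code carried on as if it had a response. A minimal Python sketch of checking for the 403 first is below. It is not the tool's code, the function name and messages are invented, and the reading of 403 as "daily limit used up" is taken from the reply above rather than from API documentation.

```python
import urllib.error
import urllib.request


def fetch_api_response(url, opener=urllib.request.urlopen):
    """Return (body, error_message); exactly one of the two is None.

    opener is injectable so the function can be tried without a network.
    """
    try:
        with opener(url) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        if err.code == 403:
            # Per the reply above: the daily query limit is likely used up.
            return None, "API query limit appears to be used up for today; try a mirror."
        return None, "API request failed with HTTP %d" % err.code
    except urllib.error.URLError as err:
        return None, "Could not reach the API: %s" % err.reason


if __name__ == "__main__":
    def always_forbidden(url):
        raise urllib.error.HTTPError(url, 403, "Forbidden", None, None)

    body, error = fetch_api_response("http://api.example.invalid/", always_forbidden)
    print(body, error)
```

Returning a message instead of continuing means the page can show one clear notice rather than a warning followed by a fatal error.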

Eagleslife
September 6, 2006 - 1:20am

Tried to install Link Harvester version 3.0.
The form appears but when the query is run it produces no results (i.e. nothing happens) with Yahoo or MSN. The program seems to run but only for a nanosecond and then says done with a blank output. I tried one of the mirror sites (Donna) and it does the exact same thing.

any ideas ? thanks!!

CR
September 25, 2007 - 1:34pm

Hi

Great software, just what I was looking for. I downloaded and installed it and replaced the Yahoo API key with my own, after registering my application with Yahoo. Nothing seemed to work.

After some diagnosing I found that the form fields were not being passed in backlinks.php

As a work around I inserted GET statements @ line 225 as below and all works well.

$query = $_GET["query"];
$engine = $_GET["engine"];
$linktype = $_GET["linktype"];
$manual_filter = $_GET["manual_filter"];

Hope this helps those experiencing issues.

CR

Now have online version working @ http://www.sector3it.com/pages/backlinks.php

Anon A. Mus
November 28, 2007 - 3:40pm

I entered the URL "rdesgr.com/WhatsAllThisThen", click [Query] and received the following error with 8 variations:

Warning: file_get_contents(http://api.search.yahoo.com/WebSearchService/V1/webSearch?query=linkdoma...): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in /home/linkhou/public_html/link-harvester/class_gamy.php on line 199

Thank you for your time and attention. I hope that this helps.

November 28, 2007 - 4:18pm

That host may have exceeded its limit for API daily usage. Look to some of the mirrors and use them if the main tool is down that day.

Merriadok
July 28, 2008 - 12:10pm

Once I tried to enter a big amount of sites into the filter, I got an error. The browser (Opera 9.50) said that the URL was too long, so I have a suggestion to fix this trouble. Please change the method of request from "GET" to "POST" and also all the "$_GET" to "$_POST". I tried to do this on my website, but my webmaster said the server we use was not good enough. So please try to optimise, because I am sure I am not alone in having this trouble. Also, I do have the whole script ready, so feel free to contact me in order to get it.

thank you,
alex
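The difference between the two methods is easy to see in a small Python sketch. With GET, every filtered site is appended to the URL itself; with POST, the same fields travel in the request body and the URL stays short. The field names `query` and `manual_filter` appear in an earlier comment in this thread, but the tool URL and the space-separated filter format below are placeholders, not the tool's actual behaviour.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical address standing in for wherever the tool is hosted.
TOOL_URL = "http://tools.example.com/backlinks.php"


def build_filter_requests(query, filtered_sites):
    """Build the same filter submission as a GET and as a POST request."""
    fields = {
        "query": query,
        # Assumption: the filter is sent as one space-separated field.
        "manual_filter": " ".join(filtered_sites),
    }
    encoded = urlencode(fields)
    get_request = Request(TOOL_URL + "?" + encoded)
    # Supplying data makes urllib send the request as a POST.
    post_request = Request(TOOL_URL, data=encoded.encode("ascii"))
    return get_request, post_request


if __name__ == "__main__":
    sites = ["site%03d.example.com" % i for i in range(382)]
    get_request, post_request = build_filter_requests("seobook.com", sites)
    print(get_request.get_method(), len(get_request.full_url))
    print(post_request.get_method(), len(post_request.full_url))
```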

July 28, 2008 - 12:15pm

How long was the URL you were using Alex?

Merriadok
July 28, 2008 - 12:39pm

i had to filter 382 websites

Merriadok
July 28, 2008 - 12:47pm

you may try it with "POST" and "$_POST" at http://www.dropshiparea.com/prov/Link_Harvester/backlinks.php
the simple analysing works well, but once i try to filter those 382 websites, it just refuses to work properly

chghealthcare
July 29, 2008 - 4:25pm

I get an error after installing:

Fatal error: Cannot redeclare class soapclient in /home/user/mydomain/public_html/linkhound/nusoap.php on line 4104

July 29, 2008 - 5:36pm

I think it is designed for PHP 4 and would need to be re-coded to be PHP 5 friendly.

Merriadok
July 30, 2008 - 2:28pm

Dear Mr. Wall,

Still haven't got a response from you concerning the filter limit(due to URL length) enquiry. Please, let me know if there is anything possible to do about that. I can understand if you will not change anything, cause your tool is free, but I was just hoping you could make another improvement.

Thank you for your attention,

Alex Tapper

July 30, 2008 - 7:16pm

Can you try the tool now Alex?

Merriadok
July 31, 2008 - 1:19pm

Thank you, It is much better this way.

Keesjan
September 29, 2008 - 10:06am

Hi,
maybe I am asking too much, but I am also interested in a PHP5 version.
Because PHP4 is now officially retired, I think you should upgrade it, since nobody can use it on fresh servers anymore. No servers will be serving PHP4 anymore...

Zoran
December 29, 2008 - 9:39am

Hi Aaron,

when the tool was at the URL www.linkhounds.com, I got a much bigger number of linking domains. For some sites I got information that they had 670 unique domains linking to them.

Why did you set the maximum number of 250 now?
I noticed this when you moved the tool to seobook.com

Can you answer?

Now, this is not complete information from the tool.

December 29, 2008 - 9:56am

Since this site is so much more popular I had to lower the limits to ensure the API key lasts longer.

Zoran
December 30, 2008 - 1:13pm

Can we expect that you will bring the max number back to 1000?

Because this is now incomplete information.
I'm disappointed. I loved this tool.

December 30, 2008 - 3:56pm

If I bring it back at that level it will become a member's only tool for paying subscribers.
