The Website Health Check tool aims to provide a simple and intuitive interface for seeing if your site has any major SEO issues. The tool queries Google for pages indexed from your site and looks for issues among the first 1,000 results.
If your site is exceptionally large, you can use the date-based filters to view a sample of recently indexed pages and check whether there are any duplication issues among them.
Questions Answered by the Website Health Check Tool
Is Google indexing your site? Is it quickly indexing your new pages?
Do you have duplicate content pages getting indexed in Google?
Do you have canonical URL issues?
Are any of your pages in Google missing page titles?
Does your server send correct error messages?
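To make the first few checks concrete, here is a minimal sketch of how duplicate and missing page titles might be detected once you have a list of indexed URLs and their titles. The `find_title_issues` function and the `example.com` sample data are my own hypothetical illustration, not the tool's actual code.

```python
from collections import defaultdict

def find_title_issues(pages):
    """Given (url, title) pairs, report pages with missing titles and
    groups of pages sharing the same title -- a common sign of
    duplicate content (or canonical URL problems) getting indexed."""
    missing = [url for url, title in pages if not title or not title.strip()]
    by_title = defaultdict(list)
    for url, title in pages:
        if title and title.strip():
            by_title[title.strip().lower()].append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return missing, duplicates

# Hypothetical sample of indexed pages pulled from a site: query
sample = [
    ("http://example.com/", "Example Widgets"),
    ("http://example.com/index.html", "Example Widgets"),  # canonical issue
    ("http://example.com/about", "About Example"),
    ("http://example.com/print/about", ""),                # missing title
]

missing, duplicates = find_title_issues(sample)
print(missing)      # pages with no title
print(duplicates)   # titles shared by multiple URLs
```

Two URLs sharing one title often points to a canonical URL issue, as with the root URL and index.html above.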
This tool is in beta. Please leave feedback below.
I sent the programmer this URL and he would love to hear what you think of it. We are looking to have version two out before the end of the month.
Features We Are Looking to Add
Allow you to search for not just a site, but a site and a keyword, like [seobook.com seo]
Add indexed page counts from all major global search engines (Google, Yahoo, Microsoft, Ask)
Allow webmasters to grab results from any of the above 4 engines, or mix and match
Make each data point we collect link to the source
What other features would you like to see?
Video About How to Use the Website Health Check Tool
Michael Jenson from Solo SEO recently emailed me about a cool new free SEO tool he created called Index Rank. After seeing my post about Google date based filters, Michael created the Index Rank tool, which allows you to see the growth of a site's profile in Google based on the number of pages indexed over different periods of time. The tool also allows you to compare multiple sites against each other.
Why is this data useful?
Since Google removed the supplemental results label, the next best way to test site trust for lower-end longtail pages is how quickly new pages are getting indexed.
A rapid increase in indexing is typically caused by one of the following: an increase in domain trust due to better inlinks, an increase in content creation that leveraged unused authority the site was sitting on, fixing a crawling issue, improving internal site architecture, or a technical issue that is creating duplicate content pages.
If everything you create is getting indexed you may consider creating content at a faster rate, perhaps using sub-brands off subdomains.
If you keep pumping out content but are not seeing your indexing stats go up, that is a cue to build links.
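The signal described above can be reduced to a simple period-over-period growth calculation on indexed-page counts, such as those pulled from Google's date-based filters. The function and the monthly counts below are hypothetical examples of mine, just to illustrate the arithmetic.

```python
def indexing_growth(counts):
    """counts: list of (period_label, pages_indexed) ordered oldest to
    newest. Returns period-over-period growth as percentages; a rising
    series suggests growing domain trust or improved crawlability."""
    growth = []
    for (prev_label, prev), (label, cur) in zip(counts, counts[1:]):
        pct = (cur - prev) / prev * 100 if prev else float("inf")
        growth.append((label, round(pct, 1)))
    return growth

# Hypothetical indexed-page counts for one site over four months
counts = [("May", 400), ("Jun", 440), ("Jul", 520), ("Aug", 780)]
print(indexing_growth(counts))
```

Flat growth while you keep publishing is the "build links" cue; accelerating growth is the cue to publish more.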
The people from SEO Digger recently put together some research on search spam. Some of the terminology they use (like the word illicit) is imprecise, but the trends they discovered align well with what one would expect.
In high-money niches, spam sites tended to dominate longer search queries while having less exposure in search results for shorter queries. The graph below covers adult, pills, dating, cars, gifts, and casinos, showing the normalized density of spam sites ranking in Google for 1, 2, and 3 word queries.
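As a rough sketch of what "normalized density" means here (my reading of the graph, not SEO Digger's published method): count spam results among a sample of results for each query length, convert to rates, then scale the rates so they sum to 1. All numbers below are invented for illustration.

```python
def normalized_spam_density(spam_counts, totals):
    """For each query length (in words), divide spam results observed
    by total results sampled, then normalize so densities sum to 1."""
    raw = {k: spam_counts[k] / totals[k] for k in spam_counts}
    total = sum(raw.values())
    return {k: round(v / total, 2) for k, v in raw.items()}

# Hypothetical niche sample: spam results found per 1,000 results
# sampled, bucketed by query word count
spam_counts = {1: 20, 2: 60, 3: 120}
totals = {1: 1000, 2: 1000, 3: 1000}
print(normalized_spam_density(spam_counts, totals))
```

A distribution skewed toward 3-word queries, as in this made-up sample, matches the longtail-heavy spam footprint the research describes.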
Why is Casino an Anomaly?
I believe the reasons casinos appear so tight-knit are:
US advertising and gaming laws prohibit some of the common spam-related revenue streams
leading online gaming sites have heavily embraced both offline advertising and SEO
people who gamble tend to be quite passionate about gambling
That passion means gamblers are more likely to participate in community sites in that niche, which further consolidates traffic streams due to network effects and creates a lot of free on-topic content for some of the major community-driven sites.
Effective Search Spamming Business Models
Given this research, if you were to build a business model around spamming, it would make sense to focus on the long tail of search. Get enough PageRank to get your pages indexed, but do not worry about accumulating enough PageRank to rank for core keywords in the spammy niches. Staying away from the core keywords also makes your sites less likely to get booted after a manual review or a competitor snitching on you.
Spam & Ranking Low Trust New Sites
The exact same trend seen between real sites and spam sites is paralleled when comparing new websites to older websites.
Older websites that are heavily linked to and heavily trusted dominate the core category-related keywords.
Longer search queries have fewer matches in the search database, and are thus more reliant on on-page SEO.
Older sites cannot possibly cover all the related longtail search phrases adequately, so newer sites with less authority rank for many of the more accessible longtail keywords.
If you create a new site you can set your sights on ranking for core category keywords, but realize that longtail traffic will come first. If Google lets entire categories get dominated by spam pages, then there has to be an associated opportunity to rank real pages.
I just updated SEO for Firefox to include Compete.com website rank and Compete.com monthly uniques. If you leave Compete.com in on-demand mode it tends to work quite well. I am also going to ping the guys at Compete.com to make sure the automatic mode becomes reliable too. Compete.com data is far better than Alexa data because it has less of a webmaster bias.
Justin Laing recently emailed me to let me know about his SEO sitefinder tool, which uses the ODP and the Internet Archive to find DMOZ-listed websites that have not been updated in a while.
Domain Tools also allows you to find expiring domains that will be up at auction soon. You can view their top picks or use the right rail filters on that page to search for DMOZ and Yahoo! Directory listed domains.
Free tools such as DropScout allow you to find expiring high PageRank domains.
You can also look at TDNam for expiring domains, and either use software to filter through them or sort the results by bids and prices. Some of the domains with many bidders are pure-play domainer plays, but others are old trustworthy sites in need of a good loving owner.
Yahoo / Overture had default status as THE keyword tool for about a decade. They lost that last year when Google started opening up their data a bit more. Now Microsoft is getting into the game, offering more useful tools and more data. How does Yahoo respond? They stop supporting their keyword tool. No results, no 301 redirect, no rebrand, no explanation of why it is broken, no anything. Since my keyword tool is powered by theirs, I am getting 10 to 20 emails a day about it. How many people are not emailing? How much more traffic does Yahoo get than I do? Tens or hundreds of thousands of dollars of shareholder value are wasted each day with that move.
The best spot to market yourself is on your own site. As long as Yahoo continues to undermine their own assets without regard or thought, their marketplace will remain inefficient, and each day they will continue to lose market share. They paid $350 million for Zimbra, but what are the odds of them not screwing that up? They have too many half-done projects that do not gel together.
Quintura recently made a search page for Seo Book. Their search service is likely going to be more useful for large publishers with millions of pages than it is on a personal blog, but give it a try and see what you think.
Their cloudlike visual search service is a great tool for finding related keyword modifiers used on competing sites, but I don't think we will see such technology front and center at the mainstream search engines anytime soon: future advertising regulation will make it harder to integrate ads, and no alternative layout is more profitable than the current Google format. That said, I would love to see their technology applied to social bookmarking sites and personal search history data.