Best Web Search Engine

Search Engine Watch announced the winners of the 2003 Search Engine Watch Awards. Google took most of the awards again.

Search Engine and Movable Type Toys

When there is not much news you must cover the fun toys.

Spider Hacks teaches you how to create your own spider...which I eventually will.

MTGoogleRank shows how many pages link to any page and where any site ranks for a keyword...I am going to try this out real quick.

Also, MTMacros is really cool looking stuff, which I will need to play with soon.

Search Engine Awards and Search Engine Articles

Jan 29th

The 4th annual search engine awards now have open voting until the 4th of February. Registered members of SearchEngineWatch may vote for the winners.

In addition, today's SearchDay references the search engine article series "On Search, The Series" by search engine pioneer Tim Bray.

Forum Success and Forum Failure

I keep reading these marketing books which say that markets are conversations, and over at SearchGuild we recently had two distinctly different types of people come to the forums. Each came to represent his product, and they fared very differently.

Person 1: Dez from WebSearch.com.au spoke in a human voice about his search engine. He has even lent tips to some moderators on how to improve their sites...we think Dez is the bomb diggity!

Person 2: DirectorySearch from DirectorySearch announced his URL and basically bailed. In the process he alienated many of us and actually hurt his product. He never gave his name and failed to even fully answer the questions about his products.

DirectorySearch does have a good script, but product alone does not make or break you. It's how you use the product and how you communicate with the customer that helps build long term value.

Personalized Search Engines

While the quality of my articles may vary, I think my timing is delicious. At about 2 am this morning I was finishing up a small article titled "The Problems With Search Engine Personalization," when I found out about Eurekster.

Many of the top search engines and search engine experts believe that personalization is going to be important to the future of search. In all honesty it scares me as much as it interests me.

Danny Sullivan just wrote a good article about Eurekster, which is currently powered from user feedback and AllTheWeb.

Eurekster - the Social Search Engine

In marketing the power of the weak tie is astronomical. If you ask a large cross section of society "Who found a job through a weak friend?" the percentage will be exceptionally high. Our friends typically share much of our environment and lifestyle. People who are friends of a friend live in a totally different world and know realities which are completely foreign to us.

Friendster is a free dating and social interaction network which operates using this idea. Reports have stated that Google wanted to buy them last year for $30 million, but they did not sell.

Google organizes the web based on the social structure of the linking of the entire web. Newer technologies are allowing them to better find local clusters, but The Boston Globe reports that today a new competitor will take this field using the direct route.

For Eurekster to be effective, features such as categorizing friends, and settings such as trusting friends of friends a certain number of levels deep, will be necessary. Eurekster works by allowing you to cast a silent vote for a site based on the time you visit the site. Read the official Eurekster about us and Eurekster how it works information.

They hope to get you to download their toolbar and to tell friends about Eurekster via email. Two things which I believe to be errors in spreading this message are that the name of the search engine is hard for me to remember, and that they have a somewhat cluttered home page when compared with the current major search engines.

Yahoo Research Labs

Jan 20th

Me Too! Google is frequently cooking up something in the Google Labs. Yahoo today announces the creation of "Yahoo Research Labs."

Latent Semantic Indexing

Jan 18th

(GEEK STUFF) One of the largest problems many search engines run into is that after they get to a few hundred million documents their algorithms and hardware hit a wall.

Companies that can afford the investment to get past this point still run into the problem that each additional resource makes their job a bit harder.

One of the major ways around this problem is to take advantage of the natural patterns in human language. Latent semantic indexing indexes documents based on the pairing of like words within them.

Many complex searches may lack exact matches in the results as well. Being able to find near matches will allow search engines to provide more comprehensive results.

It's hard to get computers to understand anything human, but the process of latent semantic indexing delivers conceptual results while being entirely mathematically driven.

There are two main ways to do this: singular value decomposition and multidimensional scaling.

Some of the steps of the singular value decomposition process are to:

  • create a database of all words in relevant documents
  • remove common stop words
  • stem the remaining words
  • remove words appearing in all results
  • remove words only appearing in one result
  • create a database of relevant keywords
  • weight the pages based on the frequency of keyword distribution
  • increase the relevance of terms which appear in a small number of pages (as they are more likely to be on topic than words that appear in almost all documents)
  • normalize the page to remove the page length as a factor
  • create relevancy vectors for the keywords
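
The preparation steps above can be sketched in a few lines of Python. This is only a rough illustration, not how any real search engine does it: the stop word list is tiny, the stem function is a crude stand-in for a real stemmer, and the names (STOP_WORDS, stem, tokenize, build_vectors) are made up for this example. It stops at the weighted, normalized vectors and does not perform the decomposition itself.

```python
import math
from collections import Counter

# Tiny illustrative stop list (an assumption); a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def stem(word):
    """Very crude suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, strip punctuation, drop stop words, then stem."""
    words = (w.strip(".,!?\"'()").lower() for w in text.split())
    return [stem(w) for w in words if w and w not in STOP_WORDS]


def build_vectors(docs):
    """Return (vocabulary, one unit-length weighted vector per document)."""
    counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for c in counts for term in c)

    # Keep terms that appear in more than one document but not in all of them.
    vocab = sorted(t for t, df in doc_freq.items() if 1 < df < n_docs)

    vectors = []
    for c in counts:
        # Term frequency times inverse document frequency: rarer terms weigh more.
        vec = [c[t] * math.log(n_docs / doc_freq[t]) for t in vocab]
        # Normalize to unit length so page length is not a factor.
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec] if norm else vec)
    return vocab, vectors
```

Calling build_vectors on a handful of short documents returns the surviving keywords plus one vector per document. Documents that share keywords end up with vectors pointing in similar directions, which is the raw material the decomposition step works on.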

The singular value decomposition process is not scalable enough to work on large scale search engines, though, as it requires too much processor time. Multidimensional scaling allows us to take snapshots of the topology of different documents. "Instead of deriving the best possible projection through matrix decomposition, the MDS algorithm starts with a random arrangement of data, and then incrementally moves it around, calculating a stress function after each perturbation to see if the projection has grown more or less accurate. The algorithm keeps nudging the data points until it can no longer find lower values for the stress function."

This does not provide exact results, but only a rough approximation. When combined with other factors this approximation improves scalability and quality of search.
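
Here is a minimal Python sketch of that nudging idea, under some loud assumptions: the stress function is the plain sum of squared differences between the target distances and the projected distances, and the loop stops after a fixed number of nudges rather than detecting that no lower stress can be found. The names stress and mds and all of the parameter defaults are made up for illustration.

```python
import math
import random


def stress(points, target):
    """Sum of squared gaps between projected and target distances."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += (math.dist(points[i], points[j]) - target[i][j]) ** 2
    return total


def mds(target, dims=2, steps=5000, step_size=0.1, seed=0):
    """Start from a random layout and keep only the nudges that lower stress."""
    rng = random.Random(seed)
    n = len(target)
    points = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    start = best = stress(points, target)
    for _ in range(steps):
        i = rng.randrange(n)
        old = points[i]
        # Perturb one point by a small random amount.
        points[i] = [x + rng.gauss(0, step_size) for x in old]
        new = stress(points, target)
        if new < best:
            best = new       # keep the nudge: the projection got more accurate
        else:
            points[i] = old  # undo the nudge
    return points, start, best
```

Given a square table of target distances between documents, mds returns the final layout along with the starting and ending stress, so you can see how much the arrangement improved. Since nudges that raise the stress are thrown away, the ending stress can never be higher than the starting stress.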

Good Reading on latent semantic indexing

This technology is so amazing that it may eventually help lead to a cure for cancer. Already the technology is being refined for cognitive improvements and test grading!

How to Make Dynamic URLs Static

Jan 15th

Many of these tips originate from members of the I search discussion list (which is an amazing resource well worth the money).

This guy has a database-driven ASP website and makes his dynamic content look static to the search engines using a custom 404 error page.

Additional ideas are server side filter software http://www.smalig.com/url_rewrite-en.htm and URL rewriting software http://www.opcode.co.uk/components/rewrite.asp.

Here is the Apache Mod Rewrite page for you Apache people...
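
The idea shared by the custom 404 trick and the rewriting tools is the same: take a static looking path and translate it into the dynamic request that really serves the page. Here is a rough Python sketch of just that translation step. The URL scheme (/products/<category>/<id>.html), the script name product.asp, and the parameter names cat and id are all hypothetical, and this is not the syntax of any of the tools mentioned above.

```python
import re
from urllib.parse import urlencode

# Hypothetical URL scheme: /products/<category>/<id>.html
STATIC_PATTERN = re.compile(r"^/products/(?P<cat>[a-z0-9-]+)/(?P<id>\d+)\.html$")


def to_dynamic(path):
    """Map a static looking path to the dynamic URL that serves it, or None."""
    match = STATIC_PATTERN.match(path)
    if match is None:
        return None  # a genuine 404: nothing to rewrite
    query = urlencode([("cat", match.group("cat")), ("id", match.group("id"))])
    return "/product.asp?" + query
```

With this pattern, to_dynamic("/products/widgets/42.html") returns "/product.asp?cat=widgets&id=42", while a path that does not fit the scheme returns None and can fall through to a normal error page.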

General tips to make a dynamic site get spidered
1.) Do not force feed the spider a cookie
2.) Use 3 or fewer variables
3.) Keep each query string to 10 or fewer digits
4.) Create a sitemap which links to many of the main database locations.
5.) Build up link popularity from a few quality inbound links. The PageRank (or link popularity in search engines other than Google) will make the spider more inclined to spider deep through your site.
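
Tips 2 and 3 are easy to check automatically. Below is a small Python sketch that flags URLs which break them. It assumes tip 3 means a limit on the length of each variable's value, which is only one possible reading, and the function name and limit constants are made up for this example.

```python
from urllib.parse import parse_qsl, urlsplit

MAX_VARIABLES = 3      # tip 2
MAX_VALUE_LENGTH = 10  # tip 3, read here as a per-value limit (an assumption)


def query_string_problems(url):
    """Return a list of problems found in the URL's query string."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    problems = []
    if len(pairs) > MAX_VARIABLES:
        problems.append(f"{len(pairs)} variables (limit {MAX_VARIABLES})")
    for name, value in pairs:
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"{name} is {len(value)} characters long")
    return problems
```

An empty list means the URL passes both checks. Anything else is a candidate for rewriting into a shorter or static looking form.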

ChriSEO's 'Glass Ceiling'

In any medium there will be free rides as new adopters take advantage of knowledge not shared by their competitors. While there is always a new technology which creates new markets, this quick read does a good job of explaining why off the page optimization is more effective than on the page optimization. Chris Ridings explains "The Glass Ceiling."

Update: above link to chriseo.com/modules.php?op=modload&name=News&file=article&sid=62&mode=thread&order=0&thold=0 delinked, as the site is owned by a domainer and is a page full of ppc ads
