'technology' Archive

Sep
25

I recently updated my article about search engine history.

Any and all feedback is appreciated.

Sep
24

In Human Computation Luis von Ahn talks about how the Google Image Labeler turns work into a game, and how you can enhance that information further by using a game like Peekaboom.

How many cool things will people do on the web for arbitrary points? And are the points actually arbitrary if they make people happy :)

Jun
07

Tony Spencer here doing a guest spot on SEOBook. Aaron was asking me some 301 redirect questions a while back and recently asked me if I would drop in for some
tips on common scenarios so here goes. Feel free to drop me any questions in the comments box.

Jan
22

Tom Evslin posted about his experiences working with Gerard Salton in the early 1960s.

Everybody assumed that the best results would be obtained by algorithms which made an attempt at understanding English syntax. (which is very hard to do). WRONG! Turns out that syntax was a waste of time; all that matters is semantics – the actual words used in the query and the documents – not how they relate to each other in a sentence. Sometimes it was (and still is) useful to search for phrases as if they were words. But you get that just by observing word order or how close words are to each other – not trying to parse sentences.

Modern search engines may use quite a large amount of user tracking and heavily emphasize linkage data, but if you want to see the roots of search I highly recommend reading Salton's A Theory of Indexing.

Dec
23

When I interviewed Matt Cutts he stated that some people who want to know how search engines work might do well to create one. Here are some tips on how to create one.

Dec
10

cool post with links to a variety of search research. I hope to have time to read all the referenced papers.

Greg Linden asks

The probabilities of jumping to an unconnected page in the graph rather than following a link -- and briefly suggests that this personalization vector could be determined from actual usage data.

In fact, at least to my reading, the paper seems to imply that it would be ideal for both of these -- the probability of following a link and the personalization vector's probability of jumping to a page -- to be based on actual usage data. They seem to suggest that this would yield a PageRank that would be the best estimate of searcher interest in a page.

But, if I have enough usage data to do this, can't I calculate the equivalent PageRank directly?

Ho John Lee answers Greg's question here.
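
For anyone who wants to poke at the idea, here is a minimal sketch of PageRank with a non-uniform personalization (teleport) vector. The three-page graph, the 0.85 damping factor, and the "usage-derived" jump probabilities are all made up for illustration; this is plain power iteration, not anything Google has confirmed it runs.

```python
def personalized_pagerank(links, teleport, damping=0.85, iterations=50):
    """Power iteration for PageRank with a non-uniform teleport vector.

    links:    dict mapping each page to the list of pages it links to
    teleport: dict mapping each page to its jump probability (sums to 1)
    """
    pages = list(links)
    rank = {p: teleport[p] for p in pages}
    for _ in range(iterations):
        # Probability mass that arrives by jumping rather than following a link.
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: hand its rank back out via the teleport vector.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


# Hypothetical graph and hypothetical jump probabilities.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
biased = {"a": 0.1, "b": 0.8, "c": 0.1}

print(personalized_pagerank(links, uniform))
print(personalized_pagerank(links, biased))
```

Biasing the jump probabilities toward page b lifts its score even though the link graph has not changed, which is the effect the personalization vector is meant to capture.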

Nov
07

Not sure if I have seen this mentioned before. Dan Thies noticed Googlebot's wildcard robots.txt support:

Google's URL removal page contains a little bit of handy information that's not found on their webmaster info pages where it should be.
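
To get a feel for what wildcard matching means in practice, here is a small sketch that checks URL paths against wildcard rules. It assumes `*` matches any run of characters and a trailing `$` anchors the end of the path; the two Disallow rules are hypothetical, and this is not Googlebot's actual matcher.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes '*' matches any run of characters and a trailing '$'
    anchors the end of the URL path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


# Hypothetical Disallow rules for illustration.
rules = ["/*?sessionid=", "/private/*.pdf$"]
print(is_blocked("/shop/item?sessionid=42", rules))
print(is_blocked("/private/report.pdf", rules))
print(is_blocked("/private/report.pdf.html", rules))
```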

Nov
03

Loren does a good rundown of a new Google patent Personalization of placed content ordering in search results in his Organic Results Ranked by User Profiling post. Some of the things in the patent may be a bit ahead of themselves, but the thesis is...

GenericScore=QueryScore*PageRank.

This GenericScore may not appropriately reflect the site’s importance to a particular user if the user’s interests or preferences are dramatically different from that of the random surfer. The relevance of a site to user can be accurately characterized by a set of profile ranks, based on the correlation between a sites content and the user’s term-based profile, herein called the TermScore, the correlation between one or more categories associated with a site and user’s category-based profile, herein called the CategoryScore, and the correlation between the URL and/or host of the site and user’s link-based profile, herein called the LinkScore. Therefore, the site may be assigned a personalized rank that is a function of both the document’s generic score and the user profile scores. This personalized score can be expressed as: PersonalizedScore=GenericScore*(TermScore+CategoryScore+LinkScore).
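
The two formulas quoted above are easy to play with. This is only a sketch of the arithmetic as stated in the patent excerpt; the pages and every score value are invented, and nothing here says how the individual scores are actually computed.

```python
def generic_score(query_score, pagerank):
    """GenericScore = QueryScore * PageRank."""
    return query_score * pagerank


def personalized_score(query_score, pagerank, term_score, category_score, link_score):
    """PersonalizedScore = GenericScore * (TermScore + CategoryScore + LinkScore)."""
    return generic_score(query_score, pagerank) * (term_score + category_score + link_score)


# Hypothetical pages: A wins on the generic score, B fits the user's profile better.
page_a = dict(query_score=0.8, pagerank=0.5, term_score=0.1, category_score=0.1, link_score=0.1)
page_b = dict(query_score=0.6, pagerank=0.5, term_score=0.5, category_score=0.4, link_score=0.3)

for name, page in (("A", page_a), ("B", page_b)):
    print(name, generic_score(page["query_score"], page["pagerank"]), personalized_score(**page))
```

With these made-up numbers page A ranks first generically, but page B ranks first once the profile scores are multiplied in.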

For those big into patents: Stephen Arnold has a $50 CD for sale containing over 120 Google patent related documents.

I think he could sell that as a subscription service, so long as people didn't know all the great stuff Gary Price compiles for free. (Link from News.com)

Oct
31

Google looks like it wants to own Madison Avenue. The Journal also has a free article on Google vs Madison Ave., and John Battelle recently interviewed Google's Omid Kordestani and Sergey Brin.

If you look at the SEO Bytes monthly toplist you will see that in spite of a recent major Google update many of the most popular threads are about how to monetize Google AdSense ad space.

A year or two ago few of the threads covered monetizing content. It seemed like everyone just wanted to rank or assumed nobody would share that how to profit info. AdSense and similar programs work well for quality and automated sites alike.

While Google monetizes crap sites they usually deny their connection to it, keeping the shadiness far away, funding much of it.

Ask Jeeves is a bit closer in some of their relationships. A few days ago I noticed my mom's computer had some Ask MySearch type spyware activities on it. Sure some of it may be uninstallable, but sometimes when you enter a URL in the address bar it says no site found just to redirect you to ads. Shady.

While some say one bad AdSense site may bring down the whole, Google AdSense only took around an hour to approve my mom's new site, so Google is not putting up much of a barrier to entry.

The more I read and learn about communities and click pimping the less value I see in my current business model, especially when SEO is usually framed in a negative light and I have to deal with this sort of garbage. After all, even as Case is out AOL is suddenly hot again, and some said Steve was just another spammer. :)

Oct
20

Orion posted an interesting thread at SEW, citing this fallacies of relevance page. The SEW thread also has some good posts by other members & looks to be shaping up into a great thread.

Orion stated that he did not think current systems could yet grasp relevancy fallacies.

When you are trying to win an argument, if you use any logical fallacies make sure you use these 38 sure fire techniques. <-- amazing resource!

Oct
11

Oh so quietly Google added a tagging feature to their My Search History product.

I believe Google will eventually find ways to trust Google accounts more the same way they trust domains more as they age. The tags surely can be abused, but so can links. Just like link anchor text, the tagging could be used by Google to help understand the aboutness of a page or site.

It would take a good bit of knowledge to create a variety of random Google accounts that had regular and unique search habits over time. Google does not need to try to stop all search spammers, they only need to make search spamming so complex or expensive that most people would just rather put in the effort to create something of high quality.

Yahoo! added a rich get richer factor into their algorithms, adding blogs to their news search. In an interview with Forbes.com Joff Redfern, a director in Yahoo! Search, stated blog rankings may be due in part to the number of My Yahoo! subscribers:

"If we've got more people subscribed to a blog, there is presumably more credibility to its reputation," says Redfern.

You gotta wonder how many fake accounts are getting set up as I type this.

Do any SEO websites sell search behavior or established user accounts yet? If not I wonder how long until they hit the market and how long until those services are claimed on many sites :)

Oct
10

Berkeley has been recording lectures from some of the best minds in search. So far some of the videos include Norvig, Battelle, & Brin. Gary posted a bit about Sergey here.

I am not sure what the problem was, but my connection kept breaking in the middle of the shows, which is annoying. They have a wide variety of Podcasts available here.

Oct
08

People will avoid certain types of information they need:

An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have it.

How do you get people to find information they do not want to find?

Mooers' Second Law of Documentation:

In the same manner that color samples provide a test for the detection of color blindness in a person, the descriptor technique provides a means for the detection of the "word-bound" or "idea-blind" person. Such detection is important because a word-bound person may not be able to provide idea-based (word-independent) retrieval service of the kind which is most congenial and most desired by the non-word-bound part of the population. - source

The concept sure highlights the need for writing to the audience the way they speak and think.

Calvin Northrup Mooers' background - learn more about the man who coined the term Information Retrieval

Sep
29

If you like patents try here, here, and here.

Jul
07

Yahoo! launches their SMS service

the new Google toolbar added a send to phone feature

not too long ago Google became the default home page for T mobile

Business 2.0 recently posted an article about the looming mobile search wars:

According to the Pierz Group, Americans spent nearly $2 billion on directory assistance from their mobile phones last year -- at an average of $1.25 a call -- which suggests a healthy demand for information on the go. And that's just a fraction of the overall mobile search market. Providing instantaneous answers to a wide range of queries is what will make mobile search invaluable. And whoever figures that out is golden.

Jun
06

Google's search quality evaluation process site may have been around for years.

SearchBistro recently posted a 22 page PDF titled General Guidelines on Random-Query Evaluation that was last revised on December 31, 2003.

May
23

A while ago I wrote a bit about TrustRank after reading the PDF about it.

It is fairly easy to understand many of the concepts of it (like attenuating a positive trust score or offsetting the effects of link spam with a negative trust score), but it is even easier to understand them if you visualize the concept of trust attenuation.

Most sites are not exceptionally compelling, so there are usually not many legitimate hubs in any industry, but many sites are glorified link farms which will not pass any positive trust value.

For a while I helped promote many directories, but many of the new ones on the market have little to no legitimate value, and some of the links from them may even have negative value.

I just wrote an article called TrustRank & the Company You Keep, in which I made this graphic explaining the concept of AntiTrust (yet another SEO phrase I made up hehehe).

The red X's represent things that should be, but are not there.
Bad directory image, showing inbound & outbound link profile.

Yes, I know, the drop shadow is too dark, my web designer friend already yelled at me for that. Other than that, I hope the image clearly demonstrates the concept I was trying to get across.

Other than drop shadow remarks, please leave comments on the article and image below.

May
20

Portalized:
Google offers portalization of Google.com. Danny Sullivan has an in depth review. They have a number of features and intend to add many, such as RSS feed support.

Stemming:
Rand points out a post by Xan on stemming and a free online stemming tool

DMOZ:
kills the submission status review. Now it's even easier to be corrupt ;)

New York Times:
Begins charging for some of their content. Most of their content remains free. They are also replacing the CEO of About.com.

When Not to Submit to Directories:
when a person creates about a half dozen general directories and promotes them all together. that is not building value, that is trying to cash out and milk the web.

Many directory owners have become exceedingly greedy recently. All the while search algorithms continue to advance and few of the directory owners are actually trying to build any legitimate value.

The Search:
You can pre order John Battelle's new book. He said if you use this link he may be able to autograph it for you, assuming he can work out the shipping details.

The Size of Google's Index:
might have been a bit frothy

Google Factory Tour:
video presentations (should be up soon), Philipp Lenssen has highlights

Mirago AdSense:
Apparently they have a product similar to AdSense, which might be useful for companies like HotNacho.

May
05

If you are a search geek you may like Fractals, L-Systems and Semantics

Xan questions the paper a bit at the SEW forums.

You guys as you say find inspiration in Orion's theories, even if they have not been proved, and it gives you the motivation to improve your content. This is sufficient enough to see the use of them.

The problem of the ideas as a whole as they do not take into account the big picture but focus down on a very specific area which is the content on the page, when what you should be looking at is the content you share with your peers, and how this all links in together. Starting to look at the various different dimensions your content has in relation to the rest of the world around it may tell you some more. Demo's I've seen do include the use of clustering but in the sense of topic classification. Each site or even each part will belong to 1 or many different spheres of belonging if you like. I've seen demo's that spit out the "topic sphere" if you like and enable the user to visually manipulate this or textually manipulate this to get the results they want.

Never forget the big picture!

I think Xan's point is valid in that by following rules or focusing on specific things sometimes we miss out on the big picture or create artificial machine identifiable patterns. With that being said I find lots of the stuff Orion posts interesting.

Off topic, but Orion the Hunter is my favorite constellation. I have been exploring the universe a bit recently, watching some Cosmos :)

Apr
22

Why did Adobe Buy MacroMedia?
all the reasons. no spin.

Algorithms & Patents & Spam, oh My:
Yahoo!'s Concept Network & SuperUnits

Is NickW for Blog Spam?
certainly not, when it's done sloppily to one of his blogs ;)

The Wrong Tail:
people are starting to use The Long Tail without purpose. better get that book printed quick.

Yahoo! Buys TeRespondo.com:
a good post from Nacho.

New Blog:
O'Reilly Radar

New Browser:
Opera 8 Launched

Media Futures:
Media Futures, Part 1/5: AUTOMATA

Internet Advertising:
A decade in Online Advertising (PDF) - report by DoubleClick, who may get bought out soon. found on Lee's blog

Wanna Park?
viral marketing at its best: I Park Like an Idiot

Apr
20

This post is a few bulleted points which point at the web of trust Google is trying to build.

  • Google has expressed intent in using user feedback to help define relevancy.
  • They may follow click streams to understand who your sponsors are. (also mentioned in the above patent)
  • Google may be doing a decent amount of temporal link analysis, especially for sites below a certain authority level. (also mentioned in the above patent)
  • Google created a system which stores search history over time. Google may shift how much they trust these profiles based on
    • search volume
    • how well a profile relates to other search profiles
    • location based on IP addresses (they could discount the effect of profiles which were primarily created through open proxies or in poor areas).
  • Installing their toolbar means they probably know what sites you own (since site owners tend to visit their sites more often than anyone else).
  • Google has access to registrar data. This can likely be used to help determine if and how sites are related.
  • Google runs the world's single largest distributed ad network. If you use that network they know what sites you are marketing. They know what markets you are in.
  • Google has been filtering or banning sites which have unnatural linkage profiles.

PageRank was broken from the start. The concept they were going after may still well exist though if they can get enough users of their search history tool. While other search engines still seem relatively easy to spam Google may be trying to measure web wide trust scores using much more than just raw linkage data.

Google need not stomp SEO techniques out, they only need to:

Some people will be untouchable. They will know enough about social engineering and database programming to where they will still spam Google all day long. I am sure Google realizes that, but they want to continually increase costs to where that is an exceptionally small pool.

As SEO gets harder Google makes more money from ads. As they make more money from ads they can spend more into making SEO harder.

Now if only they could share more data with advertisers to help make click fraud easier to detect. Google bought Urchin. Why not buy, create, or offer something like Who's Clicking Who. Surely Google has the market data and it will not increase costs much to give advertisers more options and more data.

A search company which makes tons of profit organizing data should recognize that by making advertising transparent and making more ad information available they will create a more efficient market which creates more profits. The advertising community would likely police themselves if you gave them enough data and responded to feedback.

Google Inc. on Wednesday debuted a test service called My Search History that analysts said is a move closer to personalized search, which is widely considered the Holy Grail for the Web search leader and its rivals. source

to use My Search History you must register at Google Accounts and maintain an active account.

Apr
11

.JOBS and .TRAVEL:
to come late 2005

Cheap Promotional Technique:
throw some political ad on Google. after you get a ton of press coverage say it was an accident.

Direct Answers:
Google adds direct answers to SERPs.

Keyword Research:
Statistically Improbable Phrases (found by Ploppy)

Words which rarely occur in a search index are likely to be more discriminating than common words, and thus likely carry greater term weight.
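
One common way to put a number on that is inverse document frequency. Here is a minimal sketch using the plain log(N / df) form; the three tiny documents are invented, and real engines use various smoothed variants.

```python
import math


def inverse_document_frequency(term, documents):
    """log(N / df): the fewer documents contain the term, the larger the weight.

    Returns 0.0 when the term appears in no document.
    """
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df) if df else 0.0


# Hypothetical document collection, each document reduced to its set of words.
docs = [
    {"the", "search", "engine"},
    {"the", "link", "graph"},
    {"the", "statistically", "improbable", "phrase"},
]

print(inverse_document_frequency("the", docs))
print(inverse_document_frequency("improbable", docs))
```

A word in every document ("the") gets a weight of zero, while a word in only one of the three documents gets log(3).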

Search Research & Spam Papers for AIRweb:
Installment #1
Gary Price also stated that A Taxonomy of Web Spam (PDF) was recently updated, and they covered that in the forums here. Here is a list of some of the newer Stanford research papers.

Tailoring Technology:
Jeff Weiner, VP of Yahoo! Search, chats about search and customizing software.

Webmaster Radio:
Audio archives now online. thanks to StuntDubl

Good Forum Thread:
about Google's new patent.

Encarta:
accepts user feedback and editing, although I can't imagine it is as appealing to add content next to their ads.

Oil & You:
The Long Emergency

Cool:
Stor Troopers are back :)

Apr
05

The Term Extraction service provides a list of significant words or phrases extracted from a larger content. It is one of the technologies used in Y!Q.

Google Blogoscoped created a free auto linker tool, which makes adding on topic outbound links exceptionally easy. Am betting some people creating fake blogs probably enjoy the offering.

Part of Google's strong brand is PageRank, which now is of little use AND rarely updated. With all of these other good ideas Yahoo! Search is coming out with I am a bit surprised they are not providing and heavily promoting a regularly updated connectivity measurement service. Whatever happened to WebRank?

Apr
01

Greywolf does a great review of the recently awarded Google patent.

Mar
31

Google:
Patent dealing with temporal ranking effects - Greg Boser called this "The most important SEO related document in the last 5 years."
2004 annual financials report

Yahoo!:
to give a clear API Answer? maybe

Search Awards:
Danny Sullivan's SearchEngineWatch announced the 5 annual search awards. Yahoo! wins the outstanding search service award.

Mar
22

Google Groups:
Froogle Merchants Group
also if you do not yet subscribe to SEM 2.0 it is a good list.

Buy Forum Sigs:
not sure how much value there is to it, but Sig Trader buys and sells forum post sig links. Amazing how many different ways there are to build links.

For Search Geeks:
in a forum post Xan recently mentioned
IBM Research Natural Language Processing
The retrieval of information from historical perspective

Become.com:

Feb
21

Under the Covers: How Search Engines Work by Tiziana Perinotti

from 97, talks about stuff like natural language processing.

Information Retrieval and Text Mining
Information Retrieval and Text Mining PDF (PDF with different info in it)
and a bunch more PDFs & the like here

found on WMW forums

Feb
15

Xan has a cool post:
if you are really interested in AI or search technology you should go read it.

Recently while talking to two different friends they stated that if you want to be a good SEO you should think more like a search scientist than as a webmaster, and Xan is surely trying to help us out with that ;)

Feb
07

A buddy of mine pointed me to a white paper by Zoltan Gyongyi, Hector Garcia-Molina, & Jan Pedersen about a concept called TrustRank (PDF).

Human editors help search engines combat search engine spam, but reviewing all content is impractical. TrustRank places a core vote of trust on a seed set of reviewed sites to help search engines identify pages that would be considered useful from pages that would be considered spam. This trust is attenuated to other sites through links from the seed sites.
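
Here is a rough sketch of the attenuation idea, assuming trust simply halves with every link hop away from a reviewed seed site. The site names, the link graph, and the 0.5 decay are all hypothetical, and this is a simplification of what the paper describes, not a reimplementation of it.

```python
from collections import deque


def propagate_trust(links, seeds, decay=0.5):
    """Breadth-first trust propagation with a fixed decay per link hop.

    links: dict mapping a site to the sites it links to
    seeds: list of human-reviewed, trusted sites (trust 1.0)
    Sites that cannot be reached from any seed get no entry (no trust).
    """
    trust = {seed: 1.0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        for target in links.get(site, []):
            if target not in trust:  # first visit is via the shortest path
                trust[target] = trust[site] * decay
                queue.append(target)
    return trust


# Hypothetical link graph.
links = {
    "seed.example": ["hub.example"],
    "hub.example": ["shop.example"],
    "spam.example": ["shop.example", "farm.example"],
}
print(propagate_trust(links, ["seed.example"]))
```

The link farm never shows up in the result because no trusted site links toward it, no matter how many links it hands out itself.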

Feb
04

Writing:
Everything You Need to Know About Writing Successfully: in Ten Minutes

How to Be a Consultant:
Create The Warm Fuzzy Feeling™. Reading it certainly takes much longer than 10 minutes, but it is well worth it if you are considering becoming a consultant.

Feb
03

Many people have been noticing a wide shuffle in search relevancy scores recently. Some of those well in the know attribute this to latent semantic indexing. Even if they are not using LSI, Google has likely been using other word relationship technologies for a while, but recently increased its weighting.

Jan
12

Eating Your Own Crap:
Fractal Spam - search engines may be known to like their own search results...at least for a while.

Overture Direct Traffic Center:
Some big advertisers are not too impressed with the reporting delays and clunky interface.

SEM Cares? SEMPO Cares? or is it Nobody Cares?
SEM Cares perhaps too little, too late for Barbara and others to put out the good word? The domain name sounds a bit Orwellian, which almost makes it sound like maybe nobody cares.

Free Culture Stuff:
A few good links from ThreadWatch's thread about big blue Open Sourcing 500 patents.

Patented European webshop
Software patents – Obstacles to software development by Richard Stallman

Chatter:
There is also chatter that Google may be dropping some spammed out subdomains from some competitive keywords in some of their data centers.

Jan
10

ChrisG mentions that Google's site flavored search automatically suggests categories for websites, and that generally it has spot on results.

I am sure it is only a small sample of what Google's technologies do, but it is interesting nonetheless, and it may tell you what Google thinks of your site as well as help you think of related categorical sites to get links from. Maybe it would also be a good way for a small new directory owner to grab a unique category structure for their site?

On a side note, apparently Google has no idea what Black Hat SEO is...

Jan
09

Home Page of the Day:
Jon Kleinberg - he worked on lots of the underlying theory that created the hubs and authority ranking system which eventually led to Teoma.

He has all kinds of cool PDFs on his site such as Maximizing the Spread of Influence through a Social Network - cool stuff. If I were better at math and network theory stuff his home page would be a virtual candy store.

Interesting & Awaiting Results:
fathom is conducting a link title attribute test

Undersold ad space
Anna Kournikova on advertising...er, advertising on Anna Kournikova

Illegitimate ad space:
Bush Administration Invents 'News' and Pays Journalist

Hosed Ad Space:
Kraft WHITE American Cheese - AdWords ad targeting problems :(

Really, I am not a Slimeball Ads:
Ken Lay starts advertising on AdWords. Interesting what the other AdWords ads say about him too.

Meta "ignore this part of the page" tag:
I can't really see it coming anytime soon, but some want to push the idea.

MSN Beta to ramp up testing:
MSN Beta to ramp up testing

Developing a Directory?
The Don'ts of Directory Development offers tips to help you get your directory off the ground.

ESearch Online E Search Online ApexSearch Apex Search (look out):
another SEO firm out of Vegas that is allegedly cold calling people.

I did not find any legitimate backlinks into the apexesearch site. The only one I found in Google was from a forum solicitation by a guy by the name of Sincity

Sincity would like to offer you...

In that forum post it states:

real results refferences provided in business since 1996 no cusomer complaints EVER!!!!

and yet its registration details state

Registered through: GoDaddy.com (http://www.godaddy.com)
Domain Name: APEXESEARCH.COM
Created on: 20-Apr-04

Domain Name: E-SEARCHONLINE.COM
Created on: 22-Dec-04

I did not see any meaningful company information on their company information page either http://www.apexesearch.com/info.htm. Some people are wondering if this firm has anything to do with Traffic Power. If any SEO calls you up out of the blue trying to tell you that you MUST buy something TODAY then odds are they are NOT worth buying from. Cold calls = crap. Traffic

How Not to Make Friends:
Promote your services in others forums while trashing their business model in your own forum.

How can a person wanting to set up an automated link network say that people should not be able to buy links by PageRank?

How Not to Make Friends...Part 2:
For a while the name of the SEO firm that wanted RustyBrick to link to them was posted in this rant thread.

One time some guy with a big mouth emailed me about how great his firm was and felt that for that reason he deserved a link from my site. I also had a hunch that when another well known firm told me to add them to my SEO forums page they were spamming me. Not too long ago I got an email from an express link building firm which used "stuff" as the email title. I wonder how many people use these same shoddy techniques to "promote" (or otherwise destroy the brand of) their clients' sites?

Dec
09

Google Finance:
John Battelle has lots of yummy stats about Google's finances...

  • nearly 17% of visitors click on ads.
  • Google makes an average of 54 cents a click.
  • Google makes on average nearly a dime from the average US search

Though Danny Sullivan makes a guest appearance in the comments to say the figures may be off (if they did not account for contextual ads).
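
The "nearly a dime" figure lines up with the first two stats if you treat the 17% as ad clicks per search, which is my simplifying assumption (the stat itself is per visitor):

```python
click_through_rate = 0.17  # assumed: share of searches that lead to an ad click
revenue_per_click = 0.54   # dollars, from the stats above

revenue_per_search = click_through_rate * revenue_per_click
print(f"${revenue_per_search:.3f} per search")
```

That works out to roughly 9 cents a search.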

Rob Frankel:
My favorite branding guru has a great rant blog. His view of Paxil and Prozac for children...

Trellian Seasonal Keyword Research:
Out of touch with the season?

Malcolm Gladwell:
One of my favorite authors gives a speech (about a month old, but his stuff is always good)

Contextual Ads:
Chitika is a new contextual ad network (their parent company has also been powering eBay's keyword driven banners)...rumor has it they might be writing some quality PR stuff too.

Laptops & Porn:
always a bad idea...

Mobile Search:
How it will change everything...or will it? I think there is a ton more to the world than just registering a name. Sure people will easily be able to link up regular publications and products to web locations, but the reason Amazon is successful is not just its product offering or customer service, but the rich feedback past consumers have left in their system. I think our social interactions and the trails we leave on the web are worth a ton more than this article seems to believe.

Mobile People Search:
US to use electronic passports.

Eventual RSS Doom:
Will its popularity destroy it?
Should People Run RSS Ads?

I think the links and attention you get from RSS subscribers will have more longterm value than their cost. If hosting costs are killing you go with Blogger or find a host who wants some cheap marketing (a hosted by link on your site).

It's not uncommon for businesses to have loss leaders. If many of your readers / RSS subscribers also provide you tons of links then maybe you should look at the bandwidth as an advertising expense.

Those Random Late Night Purchases:
Internet Accelerator may help you download pages (or rack up credit card bills) quicker.

Dec
08

SEO Old Timer Tips:
An Old Timers Perspective...from SEGuru

Search Engine Old Timer Tips:
Recently a friend of mine bought me a copy of A Theory of Indexing by Gerard Salton. It is a 50 page book from 1975 with lots of charts and math, but in those few pages it has a ton of information about many of the ideas which current search technologies have been built upon.

I am probably going to have to read it again because it was so dense with information and had lots of math that was a wee bit above me the first time around, but to anyone interested in learning about search technology it is a great book...much like Mike Grehan's.

A Theory of Indexing talks about a ton of interesting things like:

  • signal to noise
  • inverse document frequency
  • discrimination value
  • and lots of other stuff

Here is a small bit I learned from the last few pages...

If words exist in a high % of the total documents in a document collection then they are not usually going to be good at discriminating which documents are relevant for a particular query (since they appear in too many documents).

If words exist in a low % of the total documents then they are not usually going to be good at discriminating which documents are relevant for a particular query (since they appear in so few documents).

Words with a mid range document frequency are better discriminators.

To make better use of words that appear in a high % of the total documents you can combine the words into word pairs or triples - which will have a lower frequency and may be better at discriminating document relevancy.

To make better use of words that appear in a low % of the total documents you can cluster the words into groups via the use of a thesaurus - which will have the net effect of creating higher frequency word classes / clusters - which may be better at discriminating document relevancy.
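
Here is a toy sketch of both tricks: pairing up high-frequency words to pull their frequency down, and folding low-frequency words into a thesaurus class to push their frequency up. The four documents, the 75% cutoff for "frequent", and the tiny thesaurus are all made up, and the book's actual math is a good deal more involved.

```python
from itertools import combinations


def document_frequency(term_sets):
    """Fraction of documents each term appears in."""
    counts = {}
    for terms in term_sets:
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return {term: n / len(term_sets) for term, n in counts.items()}


def add_pairs(term_sets, frequent):
    """Add a combined pair term for every two frequent words sharing a document."""
    result = []
    for terms in term_sets:
        common = sorted(t for t in terms if t in frequent)
        pairs = {a + " " + b for a, b in combinations(common, 2)}
        result.append(set(terms) | pairs)
    return result


def merge_synonyms(term_sets, thesaurus):
    """Replace each word that has a thesaurus entry with its class name."""
    return [{thesaurus.get(t, t) for t in terms} for terms in term_sets]


# Hypothetical collection: "search" and "engine" are too common, the rest too rare.
docs = [
    {"search", "engine", "spider"},
    {"search", "engine", "crawler"},
    {"search", "marketing", "robot"},
    {"engine", "repair", "manual"},
]
df = document_frequency(docs)
frequent = {term for term, share in df.items() if share >= 0.75}  # assumed cutoff
thesaurus = {"spider": "CRAWLER", "crawler": "CRAWLER", "robot": "CRAWLER"}

print(document_frequency(add_pairs(docs, frequent))["engine search"])
print(document_frequency(merge_synonyms(docs, thesaurus))["CRAWLER"])
```

The pair "engine search" shows up in half the documents instead of three quarters, and the CRAWLER class shows up in three quarters instead of one quarter, so both land closer to the mid range.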

Sep
13

Einat Amitay's Web IR & IE has a bunch of links to technical research in the search field. Chris Sherman covered this site in SearchDay.

Sep
07

Gary has a new site called Docuticker keeping track of some of the newer search and other useful research papers / documents.

More info about Docuticker

(thanks to Andy)

Jul
26

Lexapro Feedback
Recently I have been getting a good amount of feedback for Lexapro, which has led me to do a bit of hunting at different SERPs.

Yahoo!
I created the main depression feedback category pages so that they would be intentionally keyword dense for Yahoo!. Not surprisingly if you search Yahoo! for "Lexapro feedback" the category page ranks first.

Google
On Google I have a couple of top ten rankings for the same term, but it is the inner pages which are ranking above the category page. They have a much lower keyword density and much less link popularity, but the text reads more like natural text since most of the feedback comes from people outside of my control who know nothing about SEO.

MSN Search Preview
The MSN tech preview currently prefers more of the natural occurrences, like Google, and thus lists the inner pages before the category page.

Current MSN Search
However the currently used MSN search is also like Yahoo! in that it favors the keyword dense category page over the other pages in my site.

HIDDEN GEM
When I searched MSN for the single word "Lexapro" the second search result was www.lexapro.netfirms.com, which does not even have a site there. For backlinks MSN is only showing about a dozen guestbook spam links pointing at that site (generally I do not heavily recommend this technique but some people obviously are finding it successful).

Back when MSN was powered by LookSmart I know I got tons of hits every day for searches like "lexapro." I do not do lots of hyper aggressive pharmacy type SEO, but this search looks to be some low hanging fruit for anyone who does.

Another interesting thing about MSN search is that they had an Ebay ad above the official Lexapro site in their sponsored listings. A new concept there...bid on your prescription from Ebay...

Feedback
Here is one of the more interesting pieces of Lexapro feedback that was left today:

My boyfriend started to take Lexapro for his depression and mood swings. He was doing fine, but he was very tired. He was then on his way to work one morning, when he backedout at the wheel. He ran off the road and wrecked the car. I dont understand because he was taking it at night before bed. The police came and they actually charged him with a DUI. He told them that he had started a new perscription and they had no sympathy.

Jul
19

Daniel E. Rose and Danny Levinson (of Yahoo!) recently created a whitepaper titled Understanding User Goals in Web Search, which aimed to figure out "why are people searching?"

Jul
11

Search engine lectures

Learning the structure of unstructured document bases
Lecture by David Cohn (Carnegie Mellon)

How to Crawl the Web
Lecture by Hector Garcia-Molina (Stanford)

The Structure of Information Networks
Lecture by Jon Kleinberg (Cornell)

All The World's Information at Everyone's Fingertips
Lecture by Udi Manber (requires Real Player)

thanks to Gary for posting at SEW forums.

Apr
17

Recently Mike Grehan interviewed Jon Glick, who is Yahoo!'s Senior Manager for Web Search. You can read all the good Yahoo! Search stuff (note to self: stuff is a generic word to use in anchor text) in it, or look at my synopsis below.

Apr
06

Recently I was over at Topix.net and glanced at their blog and found a great post about Google by their founder Rich Skrenta which highlights Google's competitive advantages.

...the story is about seemingly incremental features that are actually massively expensive for others to match, and the platform that Google is building which makes it cheaper and easier for them to develop and run web-scale applications than anyone else...While competitors are targeting the individual applications Google has deployed, Google is building a massive, general purpose computing platform for web-scale programming.

Mar
04

Some good info about SEM, websites, and the future of search.

Nick Scevak of Jupiter Media recently showed a yummy pie graph which showed

  • 16% of businesses surveyed outsourced search engine marketing
  • 15% do not do search engine marketing
  • 69% do search engine marketing in house

This shows some amazing room for growth within the industry.
Nick also stated that his biggest fear with paid search is that we may have unrealistic expectations based on amazing performance and returns for early adopters.

Cheryle Pingle of Range Online Media stated that a large portion of the current growth in search is due to the growth of the economy.

Michael Sack of Inceptor stated that of the term space the bulk of commerce comes from a few hundred thousand terms. He believes this year that large companies will begin to buy out markets to place them out of the reach of smaller businesses.

Geoff Ramsey of emarketer.com also had many yummy pie graphs. The graphs he showed at the SEMPO meeting showed that

  • from 2000 - 2003 the search marketing industry has increased about 10 fold
  • from 2002 - 2003 search engine marketing had a 145% year over year growth rate
  • 22% of US households have broadband
  • Yellow Pages currently make $14.3 billion annually, whereas paid search is currently only a $2.2 billion industry.

Also at SEMPO Google announced that it is now supporting search engine marketing and sponsoring SEMPO as the rising complexity and competition in the industry is preventing many business owners from being able to functionally use the marketing systems.

Fredrick Marckini of iProspect quoted a stat from StatMarket which stated the average retail web site conversion rate is 1.8 - 2.0%.

Greg Boser of WebGuerrilla also provided a few good link tips on the day. When buying links, 501(c) organizations are a good place to look. He also stated that he has seen unlinked URLs in TXT files count as backlinks. Some other good link ideas offered by others include trade organizations, tools, and specialty directories.

Feb
20

Do Links Count from Unrelated Sites?
Yes. Links count even if they are from unrelated sites.
Links count even more if they are from on topic sites.

Feb
11

Many people have certain restrictions which prevent them from being able to download the Google Toolbar, which was designed for the Windows operating system. The Google Toolbar was one of only two locations Google intended to display Google PageRank.

How do I get Google PageRank Without the Google Toolbar?
Well there are a couple options for extracting Google PageRank without the toolbar.

It appears Google has gone far beyond stemming with their current algorithm update. They seem to be looking for semantic intent of the query as well as the page, and then returning a result based upon it. The resulting pages frequently may not even have the query on the page.

(original discussion in HighRankings Forums)

Feb
09

Search Engine Watch announced the winners of the 2003 Search Engine Watch Awards. Google took most of the awards again.

Feb
05

When there is not much news you must cover the fun toys.

Spider Hacks teaches you how to create your own spider...which I eventually will.

MTGoogleRank shows how many pages link to any page and where any site ranks for a keyword...am going to try this out real quick

Also MTMacros is really cool looking stuff, which I will need to play with soon.

Jan
29

The 4th annual search engine awards now have open voting until the 4th of February. Registered members of SearchEngineWatch may vote for the winners.

In addition today's SearchDay references the search engine article series "On Search, The Series" by search engine pioneer Tim Bray.

Jan
27

I keep reading these marketing books which say that markets are conversations and over at SearchGuild we recently had two distinctly different types of people come to the forums. Each came to represent their product and they fared way differently.

Jan
21

While the quality of my articles may vary, I think my timing is delicious. At about 2 am this morning I was finishing up a small article titled "The Problems With Search Engine Personalization," when I found out about Eurekster.

Many of the top search engines and search engine experts believe that personalization is going to be important to the future of search. In all honesty it scares me as much as it interests me.

Danny Sullivan just wrote a good article about Eurekster, which is currently powered by user feedback and AllTheWeb.

In marketing the power of the weak tie is astronomical. If you ask a large cross section of society "Who found a job through a weak friend?" the percentage will be exceptionally high. Our friends typically share much of our environment and lifestyle. People who are friends of a friend live in a totally different world and know realities which are completely foreign to us.

Friendster is a free dating and social interaction network which operates using this idea. Reports have stated that Google wanted to buy them last year for $30 million, but they did not sell.

Google organizes the web based on the social structure of the linking of the entire web. Newer technologies are allowing them to better find local clusters, but The Boston Globe reports that today a new competitor will take this field using the direct route.

For Eurekster to be effective, features such as categorizing friends, and settings such as trusting friends of friends a certain number of levels deep, will be necessary. Eurekster works by allowing you to cast a silent vote for a site based on the time you visit the site. Read the official Eurekster about us and Eurekster how it works information.

They hope to get you to download their toolbar and to tell friends about Eurekster via email. Two things which I believe to be errors in spreading this message are that the name of the search engine is hard for me to remember, and that they have a somewhat cluttered home page when compared with the current major search engines.

Jan
19

Me Too! Google is frequently cooking up something in the Google Labs. Yahoo today announces the creation of "Yahoo Research Labs."

Jan
18

(GEEK STUFF) One of the largest problems many search engines run into is that after they get to a few hundred million documents their algorithms and hardware hit a wall.

For those companies that can afford the investment to get past this point they still run into the problem that each additional resource makes their job a bit harder.

One of the major ways around this problem is to take advantage of the natural patterns in human language. Using Latent Semantic Indexing allows indexing search results based on the pairing of like words within documents.

Many complex searches may lack exact matches in the results as well. Being able to find near matches will allow search engines to provide more comprehensive results.

It's hard to get computers to understand anything human, but the process of latent semantic indexing delivers conceptual results while being entirely mathematically driven.

There are two main ways to do this: singular value decomposition and multidimensional scaling.

Some of the steps of the singular value decomposition process are to:

  • create a database of all words in relevant documents
  • remove common stop words
  • stem the remaining words
  • remove words appearing in all results
  • remove words only appearing in one result
  • create a database of relevant keywords
  • weight the pages based on the frequency of keyword distribution
  • increase the relevance of terms which appear in a small number of pages (as they are more likely to be on topic than words that appear in almost all documents)
  • normalize the page to remove the page length as a factor
  • create relevancy vectors for the keywords
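The preparation steps in the list above can be sketched in a few lines of Python. This is only a toy version under several assumptions: the stop word list and sample documents are invented, stemming is skipped, the weighting is a plain term frequency times log inverse document frequency, and the matrix decomposition that would follow is not shown. The function name build_vectors is hypothetical.

```python
import math
from collections import Counter

# Tiny illustrative stop word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "of", "and", "to", "in"}

def build_vectors(docs):
    """Return (vocabulary, unit-length weighted vectors) for a list of texts."""
    tokenized = [[w for w in doc.lower().split() if w not in STOP_WORDS]
                 for doc in docs]
    n = len(docs)
    df = Counter()
    for words in tokenized:
        df.update(set(words))
    # Drop words found in every document or in only one document.
    vocab = sorted(t for t, c in df.items() if 1 < c < n)
    vectors = []
    for words in tokenized:
        tf = Counter(words)
        # Rarer terms get a larger weight than terms found in most documents.
        vec = [tf[t] * math.log(n / df[t]) for t in vocab]
        # Scale to unit length so page length stops being a factor.
        length = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / length for x in vec] if length else vec)
    return vocab, vectors

# Hypothetical sample documents, invented for illustration.
docs = [
    "the history of search engine ranking",
    "search engine marketing and ranking tips",
    "search marketing budget tips",
    "history of the search index",
]
vocab, vectors = build_vectors(docs)
```

Here "search" is dropped because it appears in all four documents, and "budget" and "index" are dropped because each appears in only one, which leaves a five word vocabulary to build the vectors from.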

The singular value decomposition process is not scalable enough to work on large scale search engines though, as it requires too much processor time. Multidimensional scaling allows us to take snapshots of the topology of different documents. "Instead of deriving the best possible projection through matrix decomposition, the MDS algorithm starts with a random arrangement of data, and then incrementally moves it around, calculating a stress function after each perturbation to see if the projection has grown more or less accurate. The algorithm keeps nudging the data points until it can no longer find lower values for the stress function."

This does not provide exact results, but only a rough approximation. When combined with other factors this approximation improves scalability and quality of search.
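The nudging process described in the quote can be sketched as a simple hill climb. This is a toy illustration, not how any search engine implements it: the function names, the step count, the step size, and the three point target distance table are all assumptions made for the example.

```python
import math
import random

def stress(points, target):
    """Sum of squared gaps between layout distances and target distances."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += (math.dist(points[i], points[j]) - target[i][j]) ** 2
    return total

def mds(target, dims=2, steps=5000, step_size=0.1, seed=0):
    """Nudge a random layout, keeping only moves that lower the stress."""
    rng = random.Random(seed)
    # Start from a random arrangement of the data.
    points = [[rng.random() for _ in range(dims)] for _ in target]
    best = stress(points, target)
    for _ in range(steps):
        i = rng.randrange(len(points))
        old = points[i][:]
        # Perturb one point, then check whether the projection got better.
        points[i] = [x + rng.uniform(-step_size, step_size) for x in old]
        new = stress(points, target)
        if new < best:
            best = new
        else:
            points[i] = old  # undo a move that did not help
    return points, best

# Hypothetical target: three documents that are all equally far apart.
target = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
layout, final_stress = mds(target)
```

Because a move is only kept when it lowers the stress, the result can never be worse than the random starting layout, but it is still only an approximation of the target distances.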

Good Reading on latent semantic indexing

This technology is so amazing that it may eventually help lead to a cure for cancer. Already the technology is being refined for cognitive improvements and test grading!

Jan
14

Many of these tips originate from members of the I search discussion list (which is an amazing resource well worth the money).

This guy has a database driven ASP website and makes his dynamic content look static to the search engines using a custom 404 error page build.

Additional ideas are a server side filter software http://www.smalig.com/url_rewrite-en.htm and URL rewriting software http://www.opcode.co.uk/components/rewrite.asp.

Here is the Apache Mod Rewrite page for you Apache people...

General tips to make a dynamic site get spidered
1.) Do not force feed the spider a cookie
2.) Use 3 or fewer variables
3.) Have each query string 10 or fewer digits
4.) Create a sitemap which links to many of the main database locations.
5.) Build up link popularity from a few quality inbound links. The PageRank (or link popularity in search engines other than Google) will make the spider more inclined to spider deep through your site.
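Tips 2 and 3 are easy to check with a small script. The sketch below uses Python's standard urllib.parse module. The function name spider_warnings and the example URLs are hypothetical, and it assumes tip 3 means each query string value should be 10 or fewer characters long.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_PARAMS = 3          # tip 2: three or fewer variables
MAX_VALUE_LENGTH = 10   # tip 3, read here as ten or fewer characters per value

def spider_warnings(url):
    """Return a list of warnings for a dynamic URL, empty if it passes."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    warnings = []
    if len(params) > MAX_PARAMS:
        warnings.append(f"{len(params)} query variables (limit {MAX_PARAMS})")
    for name, value in params:
        if len(value) > MAX_VALUE_LENGTH:
            warnings.append(f"value of '{name}' is {len(value)} characters")
    return warnings

# Hypothetical URLs, invented for illustration.
short_url = "http://example.com/item.asp?cat=12&id=345"
long_url = "http://example.com/item.asp?a=1&b=2&c=3&d=4&session=abcdef1234567890"

ok = spider_warnings(short_url)
problems = spider_warnings(long_url)
```

The first URL passes with no warnings. The second gets two: one for having five variables and one for the long session value.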

Jan
12

In any medium there will be free rides as new adopters take advantage of knowledge not shared by their competitors. While there is always a new technology which creates new markets, this quick read does a good job of explaining why off the page optimization is more effective than on the page optimization. Chris Ridings explains "The Glass Ceiling."

Jan
06

Search Engine Milestones for December (via Search Engine Watch)

#3 PPC player FindWhat completes acquisition of Miva Merchant, delays talks with Espotting on merger, and is to start using IntelliMap broad match technology later this month.
Sign up now for $5 bonus ($50 minimum credit purchase required.)

24/7 Media search is partnering with Lycos for its ad program. press release
ePilot gearing up for beta testing regional ads.

Ego ad agencies do horrible search engine marketing.

Shopping search results 2003 are in.

Paid Inclusion has a so so outlook. (I suggest using a blog instead)

Lots of news on top searches 2003
Google Yahoo

and of least importance to those walking the Earth: Macon online is at #5 for Macon AGAIN...to them I tip my hat.

Jan
04

The biggest gripe most people have with Google AdWords is that niche specific products must compete for market share with general merchandise using the broad match feature. Overture places exact matches above broad match ads.
It seems Overture is going another step further to make its product more user friendly. Later this month Overture will allow separate bidding for its Content Match product. While implementing this change they will also remove the 20% discount they initially offered and are expanding the product throughout the Yahoo! network.

Open an Overture account today and get a $50 signup bonus.

Thanks to Michael Wong