David Mihm – web designer and local SEO extraordinaire – recently asked me to participate in his local SEO ranking factors survey. It got me thinking about how a search engine might weigh the usefulness of any particular factor for ranking sites. Let's see what the thought process in this part of a search engineer's workday might look like.
First, we need to consider the types of factors:
- Positive Ranking Factors
- Negative Ranking Factors
- Positive Indexing Factors
- Negative Indexing Factors
For positive ranking factors, widespread adoption is necessary. If you're going to have a level playing field and make any meaningful sense of the web, you need to rely on signals that are used consistently across the web. That's why basic HTML elements still matter so much – they've been around the longest and therefore have the widest adoption rates.
A further point worth understanding here: the ease with which a factor can be manipulated – especially in ways that affect only search engines and not visitors – is inversely proportional to its utility as a positive ranking factor.
The reason is that everyone wants to rank higher, and people will do whatever they can to achieve that, which often decreases relevance. That's why third-party data like links (and CTR for personalized search) is golden for this part of the algo. It also explains how Google Bombs were possible – Google over-relied on third-party data to rank sites.
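To caricature the two properties above – wide adoption helps a signal, easy manipulability hurts it – here's a toy sketch. The function name, inputs, and formula are entirely my own invention for illustration, not anything a real engine uses:

```python
# Toy sketch: a positive ranking factor's weight grows with how widely
# it's adopted across the web and shrinks with how easily it can be gamed.
# All names and the formula itself are invented for illustration.
def factor_weight(adoption_rate: float, manipulability: float) -> float:
    """Both inputs are assumed to be normalized to [0, 1]."""
    return adoption_rate / (1.0 + manipulability)

# A widely adopted, hard-to-fake signal outweighs an obscure, easily gamed one:
print(factor_weight(0.9, 0.1) > factor_weight(0.2, 0.8))  # True
```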
For negative ranking factors, you just need to be able to pattern this factor across some quantity of spam. Does the page have the word ‘guestbook’ in it or in the URL? Does it link out to ponies, viagra, directory-submission, and porn sites all at once? You don’t need widespread adoption of these across the web, just a fair-sized chunk of the shady parts.
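That kind of pattern-matching could look something like the sketch below. It's a toy, not any real engine's code; the token lists and scoring are hypothetical:

```python
# Toy sketch of a negative ranking heuristic: flag pages that match
# patterns common in spam. Token lists and scoring are hypothetical.
SPAM_URL_TOKENS = {"guestbook"}
SPAM_LINK_TOPICS = {"viagra", "porn", "directory submission"}

def spam_score(url: str, outbound_anchor_texts: list[str]) -> int:
    """Count how many spam patterns a page matches (higher = spammier)."""
    score = 0
    if any(token in url.lower() for token in SPAM_URL_TOKENS):
        score += 1
    # Linking out to several unrelated shady niches at once is a stronger
    # signal than any single link, so count each distinct topic hit.
    topics_hit = {topic for topic in SPAM_LINK_TOPICS
                  for anchor in outbound_anchor_texts
                  if topic in anchor.lower()}
    score += len(topics_hit)
    return score

print(spam_score("http://example.com/guestbook.php",
                 ["cheap viagra", "hot porn links"]))  # 3
```

In practice a threshold on the score (or a learned classifier over many such features) would decide whether the page gets demoted.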
These factors should mostly be based on the site’s own techniques, like cloaking, doorway pages, etc. Otherwise, you run the risk of competitors “bowling” each other out of the SERPs. Of course, when you see 20,000 spam links to a given site, it’s harder to believe its innocence, but in competitive niches it can happen, as with my friend Richard’s company, Cheap Flights.
For positive indexing factors, more limited adoption is acceptable when the signal correlates strongly with increased relevance. In plain English: even if only a few sites use the factor, it can still help you index more relevant content.
For negative indexing factors, you’ll be weighing cost against benefit, where cost is computing resources and benefit is increased relevance and quality in the index. For example, Google recently told us that the extra computing cost of making Googlebot perform funky tricks it couldn’t do previously was worth the benefit of more high-quality content in the index, and thus better long-tail SERPs.
Just my 2 cents. I would love to hear what any real engineers make of this, as well as some technical friends of mine who would know better than I would… If you liked this post on search geekery, get my RSS feed.