Google is broken because PageRank no longer models what it did in Brin and Page’s foundational paper.
Google’s PageRank algorithm originally functioned well because it modeled user behaviour, not webmaster behaviour (aka “the web’s democratic nature,” per some self-righteous Wikipedia editor). It estimated how likely a visitor was to reach a site, based on linking patterns external to the site. PageRank viewed links as navigation. Today, PageRank views links as “votes,” per Matt Cutts. Links are counted as an editorially given seal of approval, and that makes all the difference.
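For readers who want to see the "links as navigation" idea in concrete terms, here is a minimal sketch of the random-surfer calculation described in Brin and Page's paper. The function name `pagerank`, the toy link graph, the iteration count and the handling of pages without outbound links are my own assumptions for illustration; the 0.85 damping factor is the value commonly cited from the original paper. This is not Google's production code.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate the probability that a random surfer lands on each page.

    `links` maps a page to the list of pages it links to (hypothetical toy input).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with a uniform guess

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one outbound link at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links sends the surfer anywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph: pages "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Nothing in this calculation asks why a link exists. It only asks where a visitor is likely to end up, which is the whole point of the navigation model.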
What are the practical implications of Google’s two conceptions of PageRank?
Different things need to be measured depending on whether links are understood as navigation or seen as votes.
When you keep to measuring probability, such as the probability that a link will send visitors, you can rely on math formulas and computer calculations to do the work. Even the on-page evaluations to determine how relevant a link is to the receiving site can be done algorithmically to a large extent. That’s the way Google used to do business.
Nowadays, the intangibles of intent, sincerity and authenticity are supposed to be measured by Google’s algorithms. The problem is that computer programs are not yet sophisticated enough to understand these things. So, as with the recent paid links dustup, and notably the -70 penalty attributed to Text Link Ads, manual evaluation is de rigueur. The computer calculations that discount paid links try to footprint them by looking for indicative nearby text such as “Sponsors,” “Ads,” etc.
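To show how crude that kind of footprinting is, here is a rough sketch of a nearby-text check. Everything in it is hypothetical: the function name `looks_paid`, the 80-character window and the two-word list (taken from the examples above) are assumptions of mine, and I have no inside knowledge of how Google actually implements its filter.

```python
import re

# Footprint words taken from the examples in this post; a real list is unknown.
FOOTPRINT_WORDS = ("sponsors", "ads")


def looks_paid(html, href, window=80):
    """Return True if a footprint word appears within `window` characters of the link."""
    position = html.find(href)
    if position == -1:
        return False  # link not present in this snippet
    start = max(0, position - window)
    end = position + len(href) + window
    nearby = html[start:end].lower()
    words = set(re.findall(r"[a-z]+", nearby))  # whole words only
    return any(word in words for word in FOOTPRINT_WORDS)


sidebar = '<h3>Sponsors</h3><a href="http://example.com">Blue widgets</a>'
editorial = '<p>I really like <a href="http://example.org">this essay</a>.</p>'

print(looks_paid(sidebar, "http://example.com"))
print(looks_paid(editorial, "http://example.org"))
```

A check like this only catches sellers who label their links. Drop the “Sponsors” heading and the same paid link sails through, which is why anything mildly sophisticated ends up in front of a human reviewer.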
What are the consequences of Google’s broken state?
- Google’s model is having difficulty scaling, particularly in average-Joe moderately competitive areas where link buys are less likely to be noticed (some might say that this makes it less of a problem, but that’s mere bigotry from those in the more competitive industries). Paying humans to review intent isn’t as cost-effective as running a computer. Expanding the definition of spam to include paid links means that the system as a whole is taxed with policing something that by definition requires human evaluation for anything mildly sophisticated.
- I confused myself into thinking that publishing ads aggressively should cut a site’s PageRank since AdSense ads, by definition, send visitors off a site rather than inwards to other pages (tip: use your highest CTR ad copy for anchor text ideas, in order to get the most out of your links). I say confused because ads obviously are not votes, which is how Google supposedly understands links today.
How does Google right itself?
I’m still debating whether to answer this, though I’m certain that the stratospheric IQs of many of my readers have led some to the conclusion already.
There was a time when I liked Google. When their results seemed relevant on a regular basis. But the sandbox filters have this site firmly stuck at #10 for searches for SEO ROI while both MSN and Yahoo are savvy enough to rank this #1. And Matt Cutts, while claiming to want to improve webmaster communication, is ducking my tough questions (and no, they’re not about paid links – I’m as sick of that useless debate as anyone). (Edit: That’s a criticism of Matt’s behaviour, not Matt himself, whom I respect as a highly intelligent and capable guy. Just so we’re clear.)
(Aside: Sandboxing relevant content through an explicit algorithm or as a side effect of a set of filters has got to be one of the worst ideas ever, incidentally. It’s like saying that Shakespeare’s plays would not have been great until they were recognized as such by external references. This gets things backwards. The plays got recognition for their excellence; they weren’t excellent because of their recognition. Yet that is what the sandbox does.)
Depending on how this plays out in the comments here and on Sphinn, I’ll decide whether to share my idea on how Google goes from being broken to once again having the best algorithm around. (The solution is the same for all the engines, incidentally. Wouldn’t want to make the folks at MSN think I’ve given up trying to help them gain their share of the search market.) I’d also like to hear your answers and ideas on what Google can do to solve its problem. Perhaps you guys have a better solution than I do? I’ll definitely be spreading the link love if that’s the case.