SEO ROI

SEO Services For Serious ROI. Blog Posts For Serious SEOs.

Google Is Indexing Site-Search Results Pages

Author: Gabriel Goldenberg, March 27, 2008

Google Analytics is broken (like PageRank is broken), and leaking my data into the index. All the site searches here on SEO ROI are resulting in site-search results pages (site-SERPs) getting into Google’s index. How is this happening?

Final Update: Google Analytics has been ruled out as the source of the site-search results appearing in Google’s search results. I had good reason to believe that Google Analytics was the source (you can see below for my original thoughts on the matter), but there’s now a clarification from Google (see Update 3). My apologies to Google and to my readers for the mistake.

A while back I saw a video about using Google Analytics to track what searches people perform on your site (note: the video seems to be gone, but you can read Avinash’s explanation on how to set up the site search tracking). It seemed like something worth tracking, in order to find out what I might write to meet my visitors’ needs, and perhaps get some keyword research too. So I set it up.

I forgot about the whole thing until I did a site: search on SEO ROI in order to do some research on how the big G was liking my pages. Well lookee here!

[Image: SEO ROI site-SERPs in Google’s site: SERPs]

Naturally, I wanted to see whether this was just affecting me. Guess what? Brian Chappell set up the tracking too, and here’s what a site search on brianchappell.com turns up:

[Image: brianchappell.com site-search results pages in Google’s site: SERPs]

Google Analytics Leaking Brian Chappell’s site search results.

If I were a blackhat, this would be great: I’d automate bots to search for keywords on my sites, have Google Analytics push it all into the index, then linkspam my SERPs. But I’m a white-ish SEO with the coding skills of a Luddite. I can do HTML reasonably well, but programming is waaay over my head.

More importantly: This proves that Google Analytics is integrated with Google’s organic search, and not just AdWords. I remember reading Winning Results with Google AdWords, where Andrew highlighted a debate as to whether or not Google Analytics “gives your sales receipts to the landlord,” to reprise his metaphor. Here we have another problem with Google Analytics, quite evidently.

This is another reason to make the switch to Conversion Ruler (hat tip to Andrew for the reference), Click Audit for click tracking (very easy to set up and use; you’ll notice me using it instead of Feedburner/Google Analytics for tracking subscriptions), Mongoose for call-tracking, or any of the other analytics tools on the market. On a related note, if you found this material interesting, why not subscribe to my RSS feed? There’s lots more where this comes from.

Update: Matt Cutts and Mike VanDeMar have anecdotal evidence (which I verified) that this happens on sites not using GA too. Mike’s site had this without ever using Google Analytics, apparently (he’s since developed a neat bit of code to drop those pages out of the index). So GA isn’t necessarily involved. It may, however, be one of several possible causes.

Update 2: Matt emailed me and he “will try” to get a post up on Google’s Webmaster blog about this.

Update 3: Both Google’s official post and Matt’s announcement are now live. Turns out Google itself is doing a new form of crawling, submarine crawling, which includes submitting queries through forms and discovering links from JavaScript.
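
If you would rather keep your own site-search results out of the index, one commonly used option is a robots.txt rule covering the search URLs. The Python sketch below is not Mike’s code; it is only a rough way to check whether a given search URL would be blocked for Googlebot. It assumes a WordPress-style search parameter (`?s=`), a hypothetical `/search/` path, and the placeholder domain example.com, and it assumes Googlebot honors robots.txt for URLs it generates from forms, as Google’s post indicates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks WordPress-style search URLs (/?s=...)
# and an assumed /search/ path. Adjust to your own site's URL scheme.
ROBOTS_TXT = """\
User-agent: *
Disallow: /?s=
Disallow: /search/
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not a real site from this post.
    for url in ("https://example.com/?s=seo",
                "https://example.com/search/seo/",
                "https://example.com/about/"):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

Keep in mind that robots.txt only stops crawling. Pages already in the index may need a noindex directive or a removal request to drop out.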



Comments

  1. Hmmm, very interesting. Wonder how this will be “explained”.

    Comment by DazzlinDonna — March 27, 2008 @ 8:40 pm

  2. One problem with your conclusion… I had the exact same issue on Smackdown… and I do not have GA set up on it at all. Problem is, I don’t know exactly what is causing it. I had asked Matt Cutts about it, and someone else had asked John Mueller on my behalf, because in my case there is suspicious activity surrounding the existence of the pages. As of yet neither has replied. Well, John did, but hadn’t looked into it yet.

    Comment by Michael VanDeMar — March 27, 2008 @ 9:28 pm

  3. Michael, I had a look at your site and couldn’t see any such results. Can you show me some? Perhaps I’m not doing the right search… Interesting to know that you don’t have GA set up – I’d be happy to hear that Google Analytics itself isn’t broken.

    That said, I still don’t see why Google is showing site SERPs for pages that obviously were never targeted with links etc.

    Comment by Gabriel Goldenberg — March 27, 2008 @ 10:29 pm

  4. I can verify that Michael VanDeMar contacted me about the existence of these search result URLs on his site on Feb. 26th. Looking at Michael’s site, it’s clear that Michael doesn’t have any Google Analytics code on his site. So that disproves the “Google Analytics is leaking URLs into the index” claim.

    Michael, I think I might know how these search results showed for you and there’s a good/non-conspiracy reason. Since you wrote me about it a month earlier and I’d been meaning to write you back anyway (it was starred in my inbox, I promise :) ), I dropped you a note to find out more details from you.

    Comment by Matt Cutts — March 27, 2008 @ 11:42 pm

  5. Gabriel, drop me a quick email please…

    Comment by Michael VanDeMar — March 28, 2008 @ 12:17 am

  6. Done :) .

    Comment by Gabriel Goldenberg — March 28, 2008 @ 12:40 am

  7. Gabs, check your Junk folder… you should have received 2 emails from me by now.

    Comment by Michael VanDeMar — March 28, 2008 @ 1:59 am

  8. Good to know it’s not GA, but it is curious how this happens… as well as how an addon domain gets indexed when it doesn’t have links to it (i.e. site1.com/site2 is the folder for site2.com, and site1.com/site2 gets indexed for no reason).

    Somehow G is getting this information when it shouldn’t be. Would love to know what is sending G the info.

    Comment by DazzlinDonna — March 28, 2008 @ 7:45 am

  9. What’s in it for Google and its users? Wouldn’t this bloat its index?

    I know I can’t stand when I do a search in Google and I land on a thin affiliate site that happens to have search ads displaying with my search query pre-typed into their directory search box. Do you know what I mean? If I find an example I can post it, but usually what happens is I get to some useless site that’s like a really crappy search engine, it shows up in Google serps, there’s nothing about my keyword there but there’s a bunch of ads.

    Spam pages, right?

    Useless to search engine users – UNLESS they click on the Google ads ;-)

    But I don’t think Google means to do this, maybe it’s just a bug.

    Or could it be that this is useful? You don’t have an optimized page for a term, but you have a lot of content in aggregate, and users on your site search for xyz terms, meaning there’s a good shot your site is relevant for the search. Get what I’m saying? Maybe Google is testing whether that theory is true. Trying to match relevance to queries by site topic/visitor profile rather than page content?

    Scary about the Google-sharing-data thing. Another reason not to opt into Benchmarking either…

    Comment by Linda Bustos — March 28, 2008 @ 10:29 am

  10. I can confirm I am seeing this issue as well since I set up Google site search in GA. I’m not sure whether this is a good or bad thing.

    In analyzing one of my sites, C28.com, G seems to have indexed searches that occur more frequently than most.

    I seem to remember a while back Google being worried about indexing site searches, due to an unlimited amount of potential duplicate content. I guess they don’t care anymore.

    Great observation Gabriel.

    Comment by Palmer Web Marketing — March 28, 2008 @ 10:44 am

  11. Mike, I’ve responded to your emails.

    Donna, I’m as anxious as you are to see what’s going on! Mike’s got a solution if you’re interested, so keep an eye on his blog or drop him a line.

    As to what’s in it for G, you make some good guesses Linda. It could be trying to deliver content rich pages and see what the result is for users. That said, many of the pages indexed are for gibberish keywords (weight? advisable?) that don’t return great site-SERPs. Half-baked experiment if that’s really what’s going on.

    Comment by Gab — March 28, 2008 @ 12:31 pm

  12. Linda, I’m pretty sure on this one it’s not something that G is doing on purpose, and I’m more conspiracy-theory-istic than most you might meet. :D

    Comment by Michael VanDeMar — March 28, 2008 @ 2:37 pm

  13. This is not new! It has been around for a couple of months! Google is performing automatic searches, using the internal search function of some CMSs (I have only noticed it on WordPress so far). They even perform searches for keywords that are absolutely unique and nonsense (like nicknames mentioned somewhere on your blog), and these internal search pages can rank in the SERPs!

    Comment by Malte Landwehr — March 29, 2008 @ 5:40 am

  14. So, is anyone ready to say anything yet? What’s the cause?

    Comment by Tim Dineen — March 29, 2008 @ 4:13 pm

  15. Malte – right you are. Mike noticed this in October, and I wrote about it in a prior Scratchpad which only a few people noticed (I’m telling you guys you need to pay more attention to those columns! Some of my juiciest bits go in there ;D!)

    Tim, Malte’s answered your question.

    Comment by Gabriel Goldenberg — March 29, 2008 @ 10:14 pm

  16. No offense to Malte, who I don’t know, but I was hoping Matt or Michael would provide details since they seem closest to this issue (and one obviously is closer than all).

    Comment by Tim Dineen — March 30, 2008 @ 10:45 am

  17. Well, I’ve been reading the article, and I hope to learn what’s really the cause.

    Comment by amelia — March 30, 2008 @ 10:58 pm

  18. Maybe it’s the Google toolbar installed by users who perform searches on your site?
    The toolbar sends every URL to Google, and there are indications that Google toolbar traffic data influences SERPs, e.g. the sitelinks:
    http://googlesystem.blogspot.com/2008/03/google-sitelinks-using-traffic.html

    Comment by Pascal Van Hecke — April 9, 2008 @ 8:34 am

  19. Hi again,

    Apparently this is what’s happening:
    http://googlewebmastercentral.blogspot.com/2008/04/crawling-through-html-forms.html

    Google is now crawling forms, by submitting and entering data into the fields, including search forms…

    Comment by Pascal Van Hecke — April 13, 2008 @ 4:18 am

  20. Tim, see Pascal’s link. Also I’ve updated the post, Pascal.

    Comment by Gabriel Goldenberg — April 13, 2008 @ 12:45 pm

  21. Hey well an interesting post and a serious issue too.
    Well keeping a track on updates, thanks for letting this post on.

    Great work

    Comment by Paintworkz Web Design — May 3, 2008 @ 11:39 am

  22. I think the problem remains as to where Google is getting its search queries…. If it is getting those search queries from Google analytics, it makes no difference from using Google Analytics data to crawl the site.

    Comment by Sign — August 1, 2008 @ 7:50 am

  23. Well, I’ve been reading the article, and I hope to learn what’s really the cause.

    Comment by perde — December 8, 2008 @ 4:05 pm

  24. [...] Site Searches Are Being Indexed by Google | Key Followup: Google Reveals Source of Site Searches in Index [...]

    Pingback by My Most Popular Posts: Social Media — January 26, 2009 @ 9:33 pm

  25. [...] read about submarine crawling’s existence a month before Google’s announcement, which a post of mine (and its erroneous conclusion) following up the first post, precipitated. Not quite as spectacular [...]

    Pingback by Brand Building For SEOs and Internet Marketing Companies — March 3, 2009 @ 9:17 pm
