Steve recently got a warning from Google that a number of pages were returning soft 404s, and the next week, his traffic dropped 10%.
The following week, another 30% was gone. What are these dramatically harsh “soft” 404 errors? How do you fix them?
What is a soft 404? It’s when the content that used to live at a URL is gone, but the server returns an HTTP status code other than 404 or 410 (that is, it doesn’t return an error code). For instance, it might return a 200 code. As Google says in describing such errors, you can label a giraffe a dog, but it’s still a giraffe.
If Google detects such errors, you can expect it to index fewer of your pages and to eliminate the rankings those pages might once have had. In Steve’s case, this was probably partly to blame, but an additional server crash certainly did not help. So don’t worry about drops as drastic as the ones Steve experienced, but be aware that you can see traffic drops due to soft 404s.
The difficulty this client experienced is that, contrary to the more common soft 404 situation where 200 status codes are served, we had set up 301 redirects.
Here’s my analysis of the problem affecting this client’s site, all of which essentially stemmed from a recent redesign.
With the redesign, the client also implemented search-engine-friendly URLs. Great!
The trick, of course, is how do you maintain the rankings? Standard SEO best practice is to 301 redirect the old URLs to new URLs. Great!
What about when some of the pages on the old site won’t exist on the new site? This client has a dynamic classifieds site, so products are constantly being sold and delisted by the merchants who advertised them.
During the transition from the old site to the new one, a number of product detail pages became useless because the merchant had sold out his/her inventory.
We couldn’t leave the pages in place and just label them as sold, with links to other merchants’ identical-product listings. While good for users and Googlebot, it would burden the server heavily given the size and volume of this site.
Instead, we had to redirect these expiring classified listings. The priority list for 301 permanent redirects was as follows:
1. Exact-Product overview page that links to listings from several merchants. So an expired Nike Air Jordan 1991 Red Men’s Shoe product listing would send visitors to the Nike Air Jordan 1991 Red Men’s product overview page.
2. If such a page didn’t exist, a redirect to the deepest subcategory (i.e. from breadcrumb navigation) would be shown.
3. In the absence of a subcategory, a category page would be the redirect’s target.
4. In the absence of a relevant category, the general “Pick A Category” page would be shown.
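The priority list above boils down to "use the most specific page that exists." Here is a minimal Python sketch of that fallback logic. The function name, the parameters, and the `/pick-a-category` URL are all made up for illustration; the client's actual code and URLs were different.

```python
from typing import Optional


def choose_redirect_target(
    overview_url: Optional[str],
    subcategory_url: Optional[str],
    category_url: Optional[str],
    fallback_url: str = "/pick-a-category",  # hypothetical URL
) -> str:
    """Return the most specific available page for an expired listing."""
    # Candidates are ordered from most to least specific, as in the list above.
    for candidate in (overview_url, subcategory_url, category_url):
        if candidate:
            return candidate
    # Nothing relevant exists, so fall back to the general page.
    return fallback_url
```

As the rest of this post shows, that last fallback is exactly the case Google ended up flagging.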
Google showed us three URLs that it said were soft 404s, and indeed it was right about two of them.
Those two redirected to the Pick A Category page. While that used to be acceptable and even widely advised SEO practice for the sake of maintaining traffic, Google’s realized it’s not an ideal user experience.
So Google tells you that such redirects (in their example, to one’s home page) are no good.
Fair enough, we’ll change that to 410 errors and hopefully make up the lost traffic on any specific rankings with increased indexing and longtail traffic to other product detail pages.
But the third page was perplexing – it redirected from a product detail page to a page that listed classified ads for exactly the same product. It wasn’t clear why Google thought this was a soft 404. A false positive from Google, clearly.
This led me to write the following analysis for the client:
I looked around, checked out SEOmoz Pro and Google help and used my noggin. Here’s my analysis:
1) Generally, soft 404 is when there’s a 200 response even though the content isn’t there.
2) Google’s help pages also talk about redirecting to the home page as being a soft 404 – so we know that it’s not just 200 responses but also 301 that can cause soft 404s.
3) How does Google detect that there’s a soft 404? It checks if the content of the page says that the file is not found or otherwise says that an error took place.
ALSO – It seems it can look for content that is too different – as in 2). So probably it’s looking at title tags, headers, images, and content similarity.
If you look at the first soft 404 page (— edited to protect the innocent—), it’s more surprising because that category/search result shows exactly the products someone is searching for.
However, if you look at the HTML code, neither the title nor the headers (h1, h2, etc.) reflect the product searched for. So for a robot, it’s not clear that the page it gets redirected to is about the same topic as the page it tried to fetch.
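To check a redirect target for this problem yourself, you can pull out the title and h1 text and see which product keywords are missing. The sketch below uses only Python's standard library. The class and function names are my own, the substring matching is deliberately crude, and this is a rough self-check, not a description of how Google actually decides.

```python
from html.parser import HTMLParser


class TitleAndHeadingParser(HTMLParser):
    """Collects the text found inside <title> and <h1> tags."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.text = {"title": "", "h1": ""}

    def handle_starttag(self, tag, attrs):
        if tag in self.text:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.text[self._current] += " " + data


def missing_keywords(html, product_name):
    """Return, per area, the product keywords that do not appear there."""
    parser = TitleAndHeadingParser()
    parser.feed(html)
    keywords = product_name.lower().split()
    result = {}
    for area, area_text in parser.text.items():
        lowered = area_text.lower()
        # Simple substring check; good enough for a quick audit.
        result[area] = [kw for kw in keywords if kw not in lowered]
    return result
```

If every keyword comes back as missing from both the title and the h1, the page probably looks off-topic to a crawler, even when the listings on it are exactly right.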
Therefore, the solution I propose to fix the soft 404 errors is:
For products that are gone and redirect to perfectly accurate subcategory or model pages, let’s edit pages of that type (is it search results? something else? product-overview page?) to include the keywords in the areas that matter for SEO: the title and the h1. The on-page content will take care of the rest.
On a related note to the above, I suggest changing the css/coding around so that on category pages, not just product pages, the product name is in a large-font size h1. Currently they’re not in h1s at all. And other things that are not really important (e.g. non-category-specific sidebar headings) should not be h2.
Ex.: “Brand” “Color” etc should not be in h2, because that tells Google the wrong message about the page’s topic. They can still be in a larger font via css, but preferably not
Another benefit of making the category keyword a prominent h1 heading visible on the page is that this should hopefully help reduce bounce
For product pages that are currently redirecting to the pick a category page, or a regular category page, let’s eliminate the 301 and just do a 410 Gone.
In any case, Google is not treating them like 301s, so we may as well try and do what Google’s asking and show 410. On the 410 page, we can still show links to the relevant category page as well as the pick a category page.
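Here is a rough Python sketch of what such a 410 response could look like: the status code says the listing is gone, while the body still gives visitors somewhere to go. The function name, the markup, and the `/pick-a-category` URL are placeholders I made up, and how you actually send the response depends on your server or framework.

```python
from html import escape


def build_gone_response(product_name, category_url,
                        pick_category_url="/pick-a-category"):
    """Return (status, headers, body) for an expired listing."""
    body = (
        "<h1>{name} is no longer listed</h1>"
        '<p><a href="{cat}">Browse similar listings</a> or '
        '<a href="{pick}">pick a category</a>.</p>'
    ).format(
        name=escape(product_name),
        cat=escape(category_url, quote=True),
        pick=escape(pick_category_url, quote=True),
    )
    headers = {"Content-Type": "text/html; charset=utf-8"}
    # 410 tells crawlers the page was removed on purpose.
    return 410, headers, body
```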
For product pages that are currently redirecting to product-overview pages, let’s make sure the model pages have h1s set up right.
For product pages that are redirecting to subcategories, can we prioritize the exact same product to show up first?