I’ve been working on an upcoming talk that I’m doing in June (it was originally supposed to be last week but it’s been postponed) and I needed an example of a soft 404 page. I ended up finding a few on some big-name, well-known websites – where you wouldn’t expect to see them – so I thought that I’d point them out.
First things first, though…
What is a soft 404?
When you access a page that doesn’t exist on a website, the server gives an error code (a 404 code, a.k.a. a “Page Not Found” code) saying that there’s nothing there – no file or webpage could be found. When a page does exist, it gives a 200 code (a.k.a. an OK code).
A soft 404 is when you access a page that doesn’t exist but the server gives a 200 code instead of a 404 code. In other words, the website might show a “Page Not Found”-style page, but search engines such as Google will see the 200 code and treat it as an actual live page…
For more info, Google do a good job explaining soft 404s on their Soft 404 Errors page.
The problem with soft 404s from an SEO POV
If Google sees the page and thinks it’s OK (a 200 code), it may index it. How bad that gets depends on how the site behaves. If a non-existent URL redirects to a “Page Not Found”-style page (e.g. domain.com/zxcvbnm takes the user to domain.com/404), then only that one page could be indexed. But if it doesn’t redirect (e.g. domain.com/zxcvbnm shows a “Page Not Found” page at that same URL, and domain.com/asdfghjkl and domain.com/qwertyuiop do the same), then there’s the risk that any and all error pages could be indexed.
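To make the two cases above concrete, here is a minimal Python sketch that labels what a site did when asked for a URL that is known not to exist. The function name classify_missing_page and the “hops” list (the status code and URL of each response seen along the way) are my own inventions for illustration, not part of any tool mentioned in this post.

```python
def classify_missing_page(hops):
    """Classify the response to a URL that is known NOT to exist.

    hops is a list of (status_code, url) pairs, one per response seen
    while following the request, ending with the final response.
    """
    if not hops:
        raise ValueError("need at least one response")
    final_status, final_url = hops[-1]
    if final_status in (404, 410):
        # The server admitted the page is missing: this is what we want.
        return "proper 404"
    if final_status == 200 and len(hops) > 1:
        # Redirected to a shared error page that answers 200.
        return "soft 404 via redirect to " + final_url
    if final_status == 200:
        # Error page shown at the requested URL itself, with a 200.
        return "soft 404 in place"
    return "other (%d)" % final_status
```

Feeding it [(302, "http://domain.com/zxcvbnm"), (200, "http://domain.com/404")] gives the redirect case, while [(200, "http://domain.com/zxcvbnm")] gives the “in place” case, which is the one where every made-up URL is a separate indexable page.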
Google isn’t keen on duplicate content, so if it happens to index a few dozen or a few hundred identical soft 404s, it’s spending its crawl time on pages that don’t exist, and it might even think that you’re trying to game the system. It’s a genuine and innocent mistake, but your whole site could suffer as a result.
Finding examples of soft 404 pages
When I was researching the talk, I asked people on Twitter if they knew of any examples of soft 404 pages, but unfortunately didn’t have much luck (many people passed on examples of correctly-working 404 pages). Then it hit me: a Google search for “page not found” would do the trick, although you have to dig down a few pages of results in order to find some.
Checking your 404 page
Curious to know if your 404 page is actually showing a 404 code? Use SEOBook’s status code checker. There are plenty of similar tools out there, but this one is my favourite. Below each example I’ve included a link to its results in the tool, so that you can see for yourself. You can also use Fetch as Google in Google Webmaster Tools, if you’d prefer to see what Google itself makes of the page…
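If you’d rather check from your own machine, a few lines of Python will report the same thing as the checker tools. This is only a rough sketch using the standard library; the function name status_code and the User-Agent string are placeholders I’ve made up, and it reports the final code after any redirects have been followed.

```python
import urllib.error
import urllib.request


def status_code(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "soft-404-check"}  # placeholder name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises an exception for 4xx/5xx responses; the code it
        # carries is exactly the number we are after.
        return error.code
```

Call it with a URL that you know doesn’t exist on your site, e.g. status_code("http://domain.com/zxcvbnm"). A correctly configured site should give 404; a 200 means you have a soft 404.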
Speaking of which… onto the fun bit now. Here are 4 big websites that currently* have soft 404 pages.
* Note: obviously if you’re reading this post weeks, months or even years after its publication date, they might have been fixed by then, so please bear that in mind…
1) Mozilla
http://www.mozilla.org/en-US/404.html
Mozilla’s soft 404 was the first one that I came across, which was the inspiration for this post and the example that I’ll be using for my talk in June. I like their 404 page (especially the fact that the purple guy’s eyes move and blink every so often), but I imagine the fact that it’s a soft 404 is simply an oversight or an accident.
SEOBook link for Mozilla’s soft 404
This is what you should be seeing, by the way…
The underlined bit should be saying “404 Not Found,” not “200 OK.”
2) Symantec
http://www.symantec.com/errors/notfound.jsp
Once I’d found Mozilla’s soft 404 page, it didn’t take long to find more. Symantec’s was the next one that I found.
SEOBook link for Symantec’s soft 404
3) WWE.com
http://www.wwe.com/f/404-not-found.html
Next up – WWE.com. Interesting choice of ads I’ve received there…
SEOBook link for WWE’s soft 404
4) Spotify
https://www.spotify.com/uk/icantypeanythinghere/
Notice the URL? You can literally type anything there and it’ll show up. This is bad because, as explained above, every error page that Google comes across can be crawled and indexed. As I type this, Google has indexed 35 Spotify pages containing the words “this page is sleeping” (see here). That’s 35 more than it should be showing…
SEOBook link for Spotify’s soft 404
How to fix soft 404s
On an Apache server, configuring your 404 page (i.e. telling your site which URL you’d like to designate as your 404 page) is done through your website’s .htaccess file. Most (if not all) WordPress themes come with a 404 page already created, but if you’re not using WordPress then you might have to create and configure one yourself. I had to do this recently for a client of mine, who had built his site using Dreamweaver, as one didn’t already exist.
In order to configure your 404 page, simply add this line to the site’s .htaccess file:
ErrorDocument 404 /cool404.html
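The “/cool404.html” in the example above can be anything you want, so if you want your 404 page to live at the URL domain.com/page-not-found then change it to “/page-not-found”. Keep it as a local path starting with a slash, though: if you give ErrorDocument a full URL (one starting with http://), Apache redirects the visitor to it instead, the 404 code is lost, and you’re back to a soft 404. I’ve taken that example from this awesome post on .htaccess file edits on Moz (it’s #4), which contains info on a few other handy .htaccess edits that you can do.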
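If your site isn’t served by Apache, .htaccess won’t apply, but the principle is the same: whatever sends the friendly error page has to send the 404 code along with it. Here’s a minimal sketch of that idea using Python’s built-in http.server. The names PAGES, NOT_FOUND_HTML, respond and serve are all made up for illustration, and a real site would be more involved than this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: request path -> HTML body.
PAGES = {"/": "<h1>Home</h1>"}
NOT_FOUND_HTML = "<h1>Page Not Found</h1>"


def respond(path):
    """Return (status, html) for a request path."""
    if path in PAGES:
        return 200, PAGES[path]
    # The friendly error page goes out with a real 404, not a 200.
    return 404, NOT_FOUND_HTML


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = respond(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

The soft 404 mistake is the equivalent of returning 200 alongside NOT_FOUND_HTML: the visitor sees the same page either way, but only the 404 version tells search engines that there’s nothing there.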
Seen any other big players serve up a soft 404? Please share examples in the comments below!
[Embroidered 404 image credit – Willem Velthoven; all soft 404 example screenshots taken from their respective websites]
Albert
May 1, 2014 at 9:03 am
Nice article. I think that we could find more examples of such sites. Big companies usually don’t care about good SEO.
Rhys Gregory
May 7, 2014 at 2:56 pm
Great article Steve! Very helpful and good use of Google to find those pages 🙂
See you next week!
Rhys
Ashish Vishwakarma
May 17, 2017 at 8:14 am
Hi, great post.
Another big name (Facebook) has a soft 404 error:
When I checked the server response code for https://www.facebook.com/rwhrtzdehfnbs
It shows:
Requesting https://www.facebook.com/rwhrtzdehfnbs
SERVER RESPONSE: HTTP/1.1 200 OK
—
Is this a soft 404 error?
Steve
May 17, 2017 at 8:18 am
Well well… Good find! Weirdly though, despite no noindex in place, I can’t find any instances of it being indexed in Google (unless I didn’t look close enough).