What Is A Soft 404?

Date First Published: 10th February 2023

Topic: Web Design & Development

Subtopic: SEO

Article Type: Computer Terms & Definitions

Difficulty: Advanced

Difficulty Level: 8/10

Learn more about what a soft 404 is in this article.

A soft 404 occurs when a server returns a 200 HTTP status code, indicating that the request was successful, even though the requested webpage or its content is missing. "Soft 404" is a label that Google applies to such a page. It is not an official HTTP status code, and the server never sends a "soft 404" response to the browser. Soft 404s can be solved by returning a real 404 status code for pages that no longer exist, redirecting the page to a relevant replacement, removing the broken links that point to it, and optimising the page if it has thin content.

Soft 404s can be found in two main ways. Google Search Console flags the soft 404s that it discovers. Crawling tools, such as Screaming Frog SEO Spider and DeepCrawl, show the HTTP status code of every page they crawl, allowing users to look for pages that return a 200 status code when they should return a 404 status code. These tools also report broken links, which are a common cause of 404 error pages.
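The kind of check a crawler can make on each page can be sketched in a few lines of Python. This is only a rough heuristic under stated assumptions, not how Google or any particular crawling tool decides: the phrase list, the `MIN_WORDS` threshold, and the function names are all hypothetical and would need tuning for a real site.

```python
import re

# Hypothetical phrases; adjust them to the wording your own error pages use.
ERROR_PHRASES = ("not found", "no longer available", "does not exist")
MIN_WORDS = 50  # assumed cut-off for "thin" content, not an official figure


def visible_text(html: str) -> str:
    """Crudely strip scripts, styles, and tags to approximate the visible text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip().lower()


def looks_like_soft_404(status: int, html: str) -> bool:
    """Return True when a 200 response looks like a missing or thin page."""
    if status != 200:
        return False  # a real 404 or 410 is a hard error, not a soft 404
    text = visible_text(html)
    if any(phrase in text for phrase in ERROR_PHRASES):
        return True  # the page says it is missing, but the status says OK
    return len(text.split()) < MIN_WORDS  # very little content on the page
```

For example, `looks_like_soft_404(200, "<h1>Page not found</h1>")` is flagged, while the same body served with a 404 status is not, because that is a correct hard 404. A heuristic like this produces false positives (a short but legitimate page, or an article that happens to contain "not found"), so flagged pages still need a manual review.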

Difference Between A Normal 404 and A Soft 404

The main difference between a normal 404 and a soft 404 is the status code that the server returns. With a normal 404, the server returns a 404 HTTP status code, meaning that the page can't be found. With a soft 404, the server returns a 200 HTTP status code even though the webpage or content is missing. A normal 404 is also known as a hard 404, because the 404 error code is returned to both visitors and search engines. Soft 404 pages may show visitors an error message, but they fail to return a 404 HTTP status code to search engines.
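To make the distinction concrete, here is a minimal sketch of a server that returns a hard 404, using Python's standard `http.server` module. The `PAGES` dictionary, the `build_response` helper, and the `serve` function are hypothetical names invented for this illustration. The important detail is that the friendly "not found" body is sent together with a 404 status rather than a 200.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: URL path -> HTML body.
PAGES = {"/": "<h1>Home</h1>", "/about": "<h1>About</h1>"}
NOT_FOUND_BODY = "<h1>Page not found</h1>"


def build_response(path: str) -> tuple[int, str]:
    """Return (status, body). Missing pages get a real 404, not a 200."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, NOT_FOUND_BODY


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Ignore any query string when looking up the page.
        status, body = build_response(self.path.split("?", 1)[0])
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting `/missing` would show the error page to a visitor while still reporting 404 to search engines. If `build_response` returned `200, NOT_FOUND_BODY` for unknown paths instead, the same page would become a soft 404.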

SEO

Even though a soft 404 page may tell visitors that the page does not exist, its 200 status code tells search engines that it is a real page on the site. This can cause duplicate content issues and waste crawl budget. Search engines spend time crawling soft 404 pages, and those pages could be indexed if they are not immediately flagged as soft 404s. Both effects take crawling and indexing attention away from the real pages of a website.

What Causes A Soft 404?

Common causes of a soft 404 include:

  • The missing page redirects to the homepage. Publishers sometimes redirect a missing page to the homepage, even though the homepage is not what the user requested. Google calls these types of failed page requests "soft 404s".
  • The webpage is missing and the server responds with a 200 OK status. This happens when a webpage is missing, but the server serves the homepage or a custom page in its place and reports a 200 OK status instead of a 404. The publisher has done something to fulfil the request, but the requested page is still missing.
  • The content is missing or "thin". When the content is missing or there is very little of it (thin content), the server still responds with a 200 OK status. Because the request did not return a useful page, search engines may treat it as a soft 404 when indexing.
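One common way to check how a site handles the first two causes is to request a URL that almost certainly does not exist and look at what comes back. The sketch below assumes a hypothetical `fetch` callable that follows redirects and returns the final status code and final URL. It is deliberately left abstract so that any HTTP client can be plugged in, and the function names and result labels are made up for this example.

```python
import uuid
from typing import Callable, Tuple
from urllib.parse import urljoin, urlsplit

# Assumed interface: takes a URL, follows redirects, and returns
# (final_status_code, final_url).
Fetcher = Callable[[str], Tuple[int, str]]


def random_probe_url(site_root: str) -> str:
    """Build a URL that almost certainly does not exist on the site."""
    return urljoin(site_root, f"/soft-404-probe-{uuid.uuid4().hex}")


def classify_missing_page(site_root: str, fetch: Fetcher) -> str:
    """Describe how the site responds to a request for a missing page."""
    status, final_url = fetch(random_probe_url(site_root))
    if status in (404, 410):
        return "hard 404"
    if status == 200 and urlsplit(final_url).path in ("", "/"):
        return "soft 404: redirected to homepage"
    if status == 200:
        return "soft 404: 200 OK for a missing page"
    return f"other: status {status}"
```

With a stub fetcher such as `lambda url: (200, "https://example.com/")`, the function reports a redirect to the homepage. With `lambda url: (404, url)`, it reports a hard 404. Thin content is not covered here, because it concerns pages that do exist and needs a check on the page body instead.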

History

The concept of a soft 404 may have come from a 2004 research paper titled "Towards an Understanding of the Web’s Decay". The research paper says:

"According to the HTTP protocol, when a request is made to a server for a page that is no longer available, the server is supposed to return an error code. In fact, many servers, including the most reputable ones, do not return a 404 code - instead, the servers return a substitute page and an OK code (200). Our study shows that these types of substitutions, called "soft-404s" account for more than 15% of the dead links."

