Date First Published: 10th February 2023
Topic: Web Design & Development
Subtopic: SEO
Article Type: Computer Terms & Definitions
Difficulty: Advanced
Difficulty Level: 8/10
Learn more about what a soft 404 is in this article.
A soft 404 occurs when a server returns a 200 HTTP status code, indicating that the request was successful, even though the webpage or its content is missing. "Soft 404" is a label that Google applies to a page; it is not an official HTTP status code, and the server never sends a "soft 404" response to a browser. Soft 404s can be resolved by redirecting the page, removing the broken links that point to it, or improving the page if its content is thin.
Soft 404s can be found in two ways. Google Search Console flags the soft 404s that it discovers. Crawling tools, such as Screaming Frog SEO Spider and DeepCrawl, show the HTTP status code of every page they crawl, so users can look for pages that return a 200 status code when they should return a 404. These tools also report broken links, which are a common cause of 404 error pages.
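The review step described above (looking through crawled pages for a 200 status code that should have been a 404) can be roughly sketched in Python. This is only an illustrative heuristic, not how Google or any crawling tool actually detects soft 404s. The function name `classify_response`, the phrase list, and the length threshold are all assumptions made for the example.

```python
# Assumed phrases that often appear on "not found" pages (illustrative only).
NOT_FOUND_PHRASES = ("page not found", "does not exist", "no longer available")

# Assumed threshold for "thin" content, in characters (illustrative only).
MIN_BODY_CHARS = 200


def classify_response(status, body):
    """Label a crawled page from its HTTP status code and body text."""
    if status == 404:
        # The server correctly reported that the page is missing.
        return "hard_404"
    if status == 200:
        text = body.lower()
        # A 200 page whose text says it is missing looks like a soft 404.
        if any(phrase in text for phrase in NOT_FOUND_PHRASES):
            return "possible_soft_404"
        # A 200 page with almost no content is also worth reviewing.
        if len(text.strip()) < MIN_BODY_CHARS:
            return "possible_soft_404"
        return "ok"
    # Redirects, server errors, and so on are outside this sketch.
    return "other"
```

A page flagged as `possible_soft_404` still needs a manual check, because a short but genuine page can trip the length threshold.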
The main difference between a normal 404 and a soft 404 is the status code that the server returns. With a normal 404, the server returns a 404 HTTP status code, meaning that the page can't be found. With a soft 404, the server returns a 200 HTTP status code even though the webpage or content is missing. A normal 404 is also known as a hard 404, because the 404 error code is returned to both visitors and search engines. Soft 404 pages fail to return a 404 HTTP status code to search engines.
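The contrast between a hard 404 and a soft 404 can be illustrated with a minimal WSGI application in Python. This is a sketch under stated assumptions: the `PAGES` dictionary, the `/soft/` path prefix, and the helper `status_for` are hypothetical names invented for the example, and a real site would not deliberately serve soft 404s.

```python
# Hypothetical site content: only the home page exists.
PAGES = {"/": "<h1>Home</h1>"}
NOT_FOUND_HTML = "<h1>Sorry, this page does not exist</h1>"


def app(environ, start_response):
    """WSGI app contrasting a hard 404 with a soft 404."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, html = "200 OK", PAGES[path]
    elif path.startswith("/soft/"):
        # Soft 404: visitors see an error page, but the status says OK.
        status, html = "200 OK", NOT_FOUND_HTML
    else:
        # Hard 404: the same error page, sent with the correct status.
        status, html = "404 Not Found", NOT_FOUND_HTML
    body = html.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def status_for(path):
    """Call the app directly (no network) and return its status line."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    app({"PATH_INFO": path}, start_response)
    return captured["status"]
```

Both missing paths show the same error page to a visitor; only the status line differs, and that status line is what a search engine relies on. To try the app in a browser, it could be served locally with `wsgiref.simple_server.make_server` from the Python standard library.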
Even though soft 404 pages tell visitors that the page does not exist, they can tell search engines that the page is a real page on the site, which can cause duplicate content issues and wasted crawl budget. Search engines spend time crawling soft 404 pages, and those pages may be indexed if they are not immediately flagged as soft 404s, which takes focus away from the real pages of the website. In short, missing pages that are improperly substituted make it harder for search engines to index the real pages.
Common causes of a soft 404 include:
The concept of a soft 404 may have come from a 2004 research paper titled Towards an Understanding of the Web’s Decay. The research paper says:
"According to the HTTP protocol, when a request is made to a server for a page that is no longer available, the server is supposed to return an error code. In fact, many servers, including the most reputable ones, do not return a 404 code - instead, the servers return a substitute page and an OK code (200). Our study shows that these types of substitutions, called "soft-404s" account for more than 15% of the dead links."