Date First Published: 8th August 2022
Topic: Web Design & Development
Subtopic: SEO
Article Type: Computer Terms & Definitions
Difficulty: Medium
Difficulty Level: 7/10
Learn more about what a canonical URL is in this article.
A canonical URL is the URL that search engines are told to treat as the preferred version of a page when a single page is accessible by multiple URLs, such as the homepage being accessible by both 'example.com' and 'example.com/index.html'. It is usually declared with the canonical link element, an HTML element placed in the head of the page. If search engines are not told which URL is canonical, they will make the choice automatically based on factors such as HTTPS and page quality, or they may consider both URLs to be of equal importance, leading to duplicate content issues. Relying on search engines to choose the canonical URL automatically is not recommended, as they might select a URL that the owner of the website does not want to be canonical. The canonical link element was introduced in February 2009 by Google, Microsoft, and Yahoo.
When the rel="canonical" link element is added to the head of an HTML page, search engines treat the specified URL as the canonical version (as a strong hint rather than a strict directive), and all other URLs are considered duplicates and crawled less frequently. The element is not displayed on the rendered page, so it can only be seen by viewing the source code of the HTML page.
301 redirects can be used to deprecate a duplicate URL by permanently redirecting traffic from one URL to another. For example, if a page is reachable by two URLs, with and without the 'www' prefix ('https://www.example.com' and 'https://example.com'), the URL without the 'www' prefix could be chosen as the canonical version and every 'www' URL redirected to it. Search engines would otherwise view these two URLs as completely different pages even though they have the same content, which is why it is important to stick to one version and redirect all URLs from the other version to it.
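As a rough illustration of the redirect rule described above, the following Python sketch decides whether a requested URL should receive a 301 redirect to the canonical host. The function name `canonical_redirect` and the choice of 'example.com' (without 'www') as the canonical host are assumptions made for this example; on a real website the rule would normally live in the web server or CMS configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the host without 'www' is the canonical one.
CANONICAL_HOST = "example.com"


def canonical_redirect(url):
    """Return (301, target_url) if url uses the 'www' host, otherwise None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != "www." + CANONICAL_HOST:
        return None  # already canonical, or not one of this site's hosts
    # Keep the scheme, path, query and fragment; only the host changes.
    # Any explicit port in the original URL is dropped in this sketch.
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target


print(canonical_redirect("https://www.example.com/about?ref=nav"))
print(canonical_redirect("https://example.com/about"))
```

The first call returns a 301 status together with the equivalent URL on the canonical host, while the second returns None because the URL is already in its canonical form.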
In CMSes, such as WordPress, canonical tags can be automatically added for users with plugins, such as Yoast SEO, making it easier for webmasters and reducing the chances of mistakes.
For HTML documents, the canonical URL is declared by adding a link element to the head of the document, with rel="canonical" and an href attribute that points to the preferred URL, e.g. <link rel="canonical" href="https://example.com/">.
As an example, a page with a URL of https://example.com/index.html could include the canonical link element inside its <head> tag to tell search engines that https://example.com, without the 'index.html', is the preferred version of the webpage.
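The example described above might look like the HTML document embedded in the Python sketch below, which also checks the markup by reading the canonical URL back out of the <head>. The page content, the `CanonicalFinder` class, and the `find_canonical` helper are hypothetical names and values chosen for illustration; only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Hypothetical page that would be served at https://example.com/index.html.
PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example homepage</title>
  <link rel="canonical" href="https://example.com/">
</head>
<body>
  <h1>Welcome</h1>
</body>
</html>
"""


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attr = dict(attrs)
            # rel may hold several space-separated values.
            if "canonical" in (attr.get("rel") or "").lower().split():
                self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


print(find_canonical(PAGE))  # https://example.com/
```

A canonical link element placed outside the <head> is deliberately ignored by this sketch, since the element is only expected in the head of the page.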
Even though it is possible to map an unlimited number of duplicate URLs, adding the canonical link tag can increase the size of the page. In addition, this method only works for HTML pages since no other file types have the <head> tag.
For non-HTML documents, such as PDF files, which have no <head> in which to place a canonical tag, the canonical URL can be set with the Link HTTP response header instead. The canonical URL cannot be expressed as part of the URL itself, so it has to be sent alongside the file in the response. For example, if an image or PDF file has a corresponding HTML page of the same name, the HTTP header could provide the URL of that page as the canonical URL. Whether the header is working can be checked by inspecting the HTTP response headers in the developer tools of a web browser. For a PDF file with a corresponding HTML page at https://example.com/page.html, the response headers would include a line such as Link: <https://example.com/page.html>; rel="canonical".
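The header-based method can be sketched in Python as follows. The helpers `canonical_link_header` and `canonical_from_headers` are hypothetical names for this example, and the header dictionary stands in for the response a server might send for the PDF file. The parsing is deliberately simplified and assumes that no parameter value contains a comma.

```python
import re


def canonical_link_header(canonical_url):
    """Build the value of a Link header that declares canonical_url as canonical."""
    return f'<{canonical_url}>; rel="canonical"'


def canonical_from_headers(headers):
    """Return the canonical URL named in a Link header, or None if there is none."""
    for name, value in headers.items():
        if name.lower() != "link":
            continue
        # A Link header may list several comma-separated links.
        for match in re.finditer(r"<([^>]*)>([^,]*)", value):
            url, params = match.groups()
            if re.search(r'rel\s*=\s*"?[^";]*\bcanonical\b', params, re.IGNORECASE):
                return url
    return None


# Hypothetical response headers for a PDF whose HTML equivalent is page.html.
pdf_headers = {
    "Content-Type": "application/pdf",
    "Link": canonical_link_header("https://example.com/page.html"),
}

print(pdf_headers["Link"])                  # <https://example.com/page.html>; rel="canonical"
print(canonical_from_headers(pdf_headers))  # https://example.com/page.html
```

In practice the header would be added in the web server or application configuration; the sketch only shows the shape of the header value and one way to confirm that it is present.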
This method is beneficial as it does not increase page size and an unlimited number of duplicate URLs can be mapped. However, it can be complex to maintain the mapping on large websites.
These guidelines should be followed for canonicalisation below: