01 The Rule
Every indexable page must include a self-referencing canonical tag. Duplicate or equivalent pages must canonical to the preferred version. Canonical URLs must be absolute and match the protocol, domain, and trailing slash convention of the canonical form.
02 Rationale
Without explicit canonical signals, search engines must infer which version of a page to index. For e-commerce sites where products appear under multiple categories, or sites with session/tracking parameters, this inference often fails — resulting in the wrong version being indexed or link equity being split.
03 Implementation
- Add <link rel="canonical" href="..."> to every indexable page
- Always use absolute URLs with protocol and domain
- Self-reference on pages without duplicates
- Point parameter/filtered variants to the clean URL
- Use HTTP Link header for non-HTML resources (PDFs, images)
- Ensure canonical URL returns 200, not a redirect
04 Common Violations & Consequences
Violation
Missing canonical tags on indexable pages
Consequence
Search engine guesses canonical — may choose parameter variant or wrong protocol
Violation
Canonical pointing to a 301 redirect target
Consequence
Redirect chain through canonical — weakened signal, wasted crawl
Violation
Canonical URL differs from what's in sitemap
Consequence
Conflicting signals — search engines distrust both
Violation
Relative canonical URLs
Consequence
Ambiguous interpretation across protocols and subdomains
05 The Fix
Implement canonical tags as part of your page template. Use a crawl tool to verify every page has a canonical that returns 200 and matches the sitemap URL. The Canonical Audit Tool automates this check.