Status: Stable Best Practice (No RFC)
Category: Architecture

Clean URL Architecture

Last updated: 2025-12-01

01 The Rule

Content URLs must be human-readable, keyword-relevant, and free of unnecessary parameters, session IDs, or tracking codes. The URL alone should communicate what content the page contains.

02 Rationale

Clean URLs improve crawl efficiency (fewer parameter variants to discover), user trust (visible in SERPs and when shared), and keyword relevance (path segments contribute to topical signals). At scale, clean URLs reduce the crawlable URL space by orders of magnitude.
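The "orders of magnitude" claim can be made concrete with a back-of-the-envelope calculation. The numbers below are illustrative assumptions, not measurements: a catalog of 10,000 pages where each page is also reachable with a few optional query parameters.

```python
from math import prod

# Illustrative assumptions: 10,000 content pages, each also crawlable
# with optional query parameters (absent, or one of N values each).
pages = 10_000
param_values = {"sort": 4, "sid": 1000, "utm_source": 10}

# Every combination of parameter presence/value is a distinct crawlable
# URL, so the URL space multiplies per parameter: (N + 1) choices each.
variants_per_page = prod(n + 1 for n in param_values.values())
total_urls = pages * variants_per_page

print(variants_per_page)  # 55055 crawlable variants of every page
print(total_urls)         # 550550000 URLs for 10,000 pages of content
```

Even with modest parameter counts, the crawlable URL space balloons from ten thousand pages to over half a billion URLs; stripping the parameters collapses it back to the page count.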

03 Implementation

  • Use descriptive slugs: /shoes/running/nike-pegasus not /p?id=4829
  • Strip session IDs, tracking parameters, and sort orders from canonical URLs
  • Handle remaining parameters with robots.txt disallow patterns and rel="canonical" annotations (Google Search Console's URL Parameters tool was retired in 2022)
  • Use server-side rewrites to map clean URLs to internal identifiers
  • Ensure URL changes are accompanied by 301 redirects
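The stripping step above can be sketched as a small canonicalization function. This is a minimal sketch using Python's standard library; the parameter list is an assumption for illustration and should be tuned to your own session and analytics stack.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed noise parameters for illustration; extend per your stack.
STRIP_PARAMS = {"sid", "sessionid", "sort", "utm_source", "utm_medium",
                "utm_campaign", "utm_term", "utm_content", "gclid", "fbclid"}

def canonicalize(url: str) -> str:
    """Return a canonical URL with session/tracking parameters removed."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # Keep only parameters that carry real content meaning.
    kept = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
            if k.lower() not in STRIP_PARAMS]
    # Drop the fragment entirely; it is never sent to the server.
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))

print(canonicalize("https://example.com/shoes/running/nike-pegasus"
                   "?utm_source=mail&sid=abc123&color=blue"))
# https://example.com/shoes/running/nike-pegasus?color=blue
```

Meaningful parameters (here, color=blue) survive; session and tracking noise is removed before the URL is emitted in links, sitemaps, or canonical tags.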

04 Common Violations & Consequences

  • Violation: Session IDs in URLs (?sid=abc123)
    Consequence: Infinite URL variants per page; massive crawl waste and duplicate content

  • Violation: Tracking parameters visible to crawlers (?utm_source=...)
    Consequence: Each tracking variant creates a crawlable duplicate

  • Violation: Numeric-only URLs (/page/12345)
    Consequence: No keyword relevance signal; poor user comprehension

05 The Fix

Implement URL rewrite rules that map clean paths to internal queries. Move tracking state out of the crawlable URL space, for example into fragment identifiers (which are never sent to the server) or POST requests that crawlers do not follow. For parameters that must remain, declare the clean URL with rel="canonical" and block crawl-wasting patterns in robots.txt; Google Search Console's URL Parameters tool was retired in 2022 and can no longer be used for this.
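A minimal sketch of the rewrite-plus-redirect pattern, independent of any framework. The slug table and the /p?id= legacy route are hypothetical stand-ins; in production the lookup would live in a database or your router.

```python
# Hypothetical slug-to-internal-ID table; a real system would back this
# with a database or the routing layer of its web framework.
SLUG_TO_ID = {"/shoes/running/nike-pegasus": 4829}

def resolve(path: str, query: dict[str, str]) -> tuple[int, dict]:
    """Map a request to a 301 redirect or an internal product lookup."""
    # Legacy parameterized URL: 301-redirect permanently to the clean path.
    if path == "/p" and "id" in query:
        pid = int(query["id"])
        for slug, known_id in SLUG_TO_ID.items():
            if known_id == pid:
                return 301, {"Location": slug}
        return 404, {}
    # Clean URL: rewrite internally to the product ID; no redirect needed.
    if path in SLUG_TO_ID:
        return 200, {"product_id": SLUG_TO_ID[path]}
    return 404, {}

print(resolve("/p", {"id": "4829"}))
# (301, {'Location': '/shoes/running/nike-pegasus'})
print(resolve("/shoes/running/nike-pegasus", {}))
# (200, {'product_id': 4829})
```

The key design point: the clean URL is the only one that returns 200, and every legacy form answers with a single 301 hop to it, so crawlers consolidate signals onto one URL per page.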