Stable RFC 3986 (2005) Architecture

URL Format & Structure

Last updated: 2025-12-01

01 The Rule

Every indexable URL must use lowercase characters, hyphens as word separators, and path segments that reflect the site's content hierarchy. URLs must be deterministic: the same content is always served at exactly one URL.

02 Rationale

Search engines treat URLs as unique identifiers. Case variations, underscores versus hyphens, and inconsistent trailing slashes each produce a distinct URL for the same page, which search engines can index as duplicate content. At scale, even small URL inconsistencies multiply into thousands of duplicate URLs that waste crawl budget and split link equity.
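To make the multiplication concrete, the sketch below enumerates the spellings of a single page that differ only in case, word separator, and trailing slash. The function name `variants` and the example slug are illustrative assumptions, not part of any crawler or search engine API.

```python
from itertools import product

def variants(words):
    """Enumerate spellings of one page path that differ only in
    capitalization, word separator, and trailing slash."""
    found = set()
    for capitalized, sep, slash in product((False, True), ("-", "_"), ("", "/")):
        parts = [w.capitalize() for w in words] if capitalized else list(words)
        found.add("/" + sep.join(parts) + slash)
    return sorted(found)

urls = variants(["blue", "widgets"])
print(len(urls))  # 8
```

Three two-way inconsistencies already yield eight URLs for one page, so under the same assumption a site with 1,000 pages could expose 8,000 crawlable URLs.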

03 Implementation

  • Use lowercase only — enforce via server-side redirect for any uppercase variant
  • Separate words with hyphens, never underscores
  • Reflect hierarchy in path: /category/subcategory/item
  • Avoid query parameters for primary content URLs
  • Limit directory depth to 3-4 levels
  • Use absolute URLs in all internal references
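The checklist above can be expressed as a small validator. This is a minimal sketch, not a complete linter: the function name `url_violations`, the message strings, and the depth limit of 4 (the upper end of the 3-4 guideline) are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

MAX_DEPTH = 4  # assumption: upper end of the "3-4 levels" guideline

def url_violations(url):
    """Return the list of rules a URL breaks (an empty list means compliant)."""
    parts = urlsplit(url)
    problems = []
    if not (parts.scheme and parts.netloc):
        problems.append("not absolute")
    if parts.path != parts.path.lower():
        problems.append("uppercase in path")
    if "_" in parts.path:
        problems.append("underscore in path")
    if parts.query:
        problems.append("query parameters")
    # Count non-empty path segments to measure directory depth.
    depth = len([segment for segment in parts.path.split("/") if segment])
    if depth > MAX_DEPTH:
        problems.append("too deep")
    return problems

print(url_violations("https://example.com/category/subcategory/item"))
print(url_violations("https://example.com/Category/sub_category?id=123"))
```

Running a check like this in a build step or against a sitemap catches violations before they are crawled, which is cheaper than redirecting them afterwards.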

04 Common Violations & Consequences

Violation: Mixed-case URLs serving identical content
Consequence: Duplicate content signals; link equity split between case variants

Violation: Underscores in URLs instead of hyphens
Consequence: Google treats underscores as joiners, not separators, so words aren't tokenized correctly

Violation: Parameters for primary content (?id=123)
Consequence: Poor crawl priority, fragile canonicalization, analytics attribution problems

05 The Fix

Audit all URL patterns with a crawl tool. Implement server-side normalization that redirects every non-canonical variant (uppercase, underscore, parameter-based) to its canonical form with a 301 (permanent) redirect.
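One way the normalization step might look, independent of any particular web server or framework, is sketched below. The `LEGACY_IDS` table, the function names, and the example paths are hypothetical; a real site would need its own mapping from parameter-based URLs to canonical paths.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit

# Hypothetical lookup from legacy ?id= values to canonical paths.
LEGACY_IDS = {"123": "/widgets/blue-widget"}

def canonical_url(url):
    """Lowercase the host and path, swap underscores for hyphens,
    and map known ?id= URLs to their canonical path."""
    parts = urlsplit(url)
    path = parts.path.lower().replace("_", "-")
    query = parts.query
    ids = parse_qs(query).get("id")
    if ids and ids[0] in LEGACY_IDS:
        # Known legacy ID: use the mapped path and drop the query string.
        path, query = LEGACY_IDS[ids[0]], ""
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, query, parts.fragment))

def redirect_for(url):
    """Return (301, target) for a non-canonical URL, or None if no redirect is needed."""
    target = canonical_url(url)
    return None if target == url else (301, target)

print(redirect_for("https://example.com/Blue_Widgets"))
print(redirect_for("https://example.com/product?id=123"))
print(redirect_for("https://example.com/blue-widgets"))
```

Computing the final target in one pass means each variant needs a single 301 rather than a chain of redirects (one for case, another for underscores). Note that this sketch discards every other query parameter when a legacy ID matches, which may not suit URLs that carry tracking parameters.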