Topic 10: System Walkthroughs

Design a URL Shortener

Classic first system design. High read QPS, simple write path.

Design a service like bit.ly. Users submit a long URL and get a short code back. Anyone who visits the short URL is redirected to the original. It sounds simple; the interesting parts are scale and reliability.

Requirements

Clarify scope before designing.

  • Functional: shorten a URL, redirect short URL to original, optionally track click analytics
  • Non-functional: 100M URLs created/day, 100:1 read-to-write ratio, URLs live for 10 years
  • Scale estimate: 10B redirects/day ≈ 115K QPS reads; 100M creates/day ≈ 1.2K QPS writes
  • Storage: 100M URLs/day × 365 × 10 years × ~500 bytes ≈ 180 TB total
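
The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that takes the daily volumes, retention period, and ~500-byte record size from the list above; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

urls_per_day = 100_000_000           # URLs created per day (writes)
redirects_per_day = 10_000_000_000   # redirects served per day (reads)
retention_years = 10
bytes_per_record = 500               # rough size of one URL mapping

write_qps = urls_per_day / SECONDS_PER_DAY
read_qps = redirects_per_day / SECONDS_PER_DAY
total_urls = urls_per_day * 365 * retention_years
total_bytes = total_urls * bytes_per_record

print(f"write QPS  ~ {write_qps:,.0f}")
print(f"read QPS   ~ {read_qps:,.0f}")
print(f"total URLs ~ {total_urls / 1e9:.0f}B")
print(f"storage    ~ {total_bytes / 1e12:.1f} TB")
```

These are averages over a day; peak traffic is usually some multiple of the average, so it is worth stating a peak factor as an assumption when sizing.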

Core design

The key components and decisions.

  • Short code generation: base62-encode a unique ID (a-z, A-Z, 0-9 = 62 chars; 7 chars give 62^7 ≈ 3.5T combinations, well above the ~365B URLs created over 10 years)
  • ID generation: auto-increment DB ID, or a distributed ID generator (Snowflake)
  • Storage: write URL mapping to PostgreSQL. Cache hot URLs in Redis.
  • Redirect: 301 (permanent; browsers cache it, so repeat visits skip the server and save QPS) vs 302 (temporary; every click reaches the server, which is better for analytics)
  • Read path: check Redis cache → if miss, hit DB → return redirect
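
The short code generation step can be sketched as follows. This is a minimal example, assuming the unique ID is a non-negative integer; the alphabet ordering (digits, then lowercase, then uppercase) and the function names are illustrative choices, and any fixed ordering of the 62 characters works.

```python
import string

# Assumed ordering: 0-9, a-z, A-Z. The order must stay fixed once codes are issued.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a base62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Convert a base62 short code back into the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

For example, `encode_base62(62)` returns `"10"`, and any ID below 62^7 fits in at most 7 characters. Note that sequential IDs produce sequential, guessable codes.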

Scaling considerations

At 115K read QPS, the database read path is the bottleneck.

  • Cache aggressively: as a rule of thumb, roughly 80% of redirects hit the same 20% of URLs
  • Add read replicas for the database
  • Serve redirect responses from a CDN if using cacheable 301s
  • Shard by short code hash if write volume ever becomes a problem
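
The cache-first read path can be sketched as a cache-aside lookup. This is a minimal in-process sketch, not a production setup: the hypothetical `LRUCache` class stands in for Redis with LRU eviction, a plain dict stands in for the database, and `Resolver` and `db_reads` are illustrative names.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Tiny in-memory stand-in for a Redis cache with LRU eviction."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


class Resolver:
    """Cache-aside lookup: check the cache, fall back to the DB, then fill the cache."""

    def __init__(self, db: dict, cache: LRUCache) -> None:
        self.db = db          # stand-in for the URL mapping table
        self.cache = cache
        self.db_reads = 0     # counts how often the DB is hit

    def resolve(self, code: str) -> Optional[str]:
        url = self.cache.get(code)
        if url is not None:
            return url
        self.db_reads += 1
        url = self.db.get(code)
        if url is not None:
            self.cache.put(code, url)
        return url
```

With a skewed access pattern, most calls to `resolve` return from the cache and `db_reads` grows much more slowly than total requests. This sketch does not cache misses, so repeated lookups of unknown codes still reach the database.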

Tradeoffs

Decisions worth discussing with the interviewer.

  • 301 vs 302 redirect: 301 saves server load; 302 enables per-click analytics
  • Custom aliases: should users be able to choose their own short code? If so, uniqueness must be enforced.
  • Expiration: TTL on URLs adds complexity but controls storage growth
  • Analytics: separate analytics write path from redirect path to avoid latency coupling
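
One way to keep analytics off the redirect path is to enqueue click events and count them in the background. The sketch below is a single-process illustration under stated assumptions: an in-memory `queue.Queue` stands in for a message broker, a `Counter` stands in for the analytics store, and `handle_redirect` and `analytics_worker` are hypothetical names. It returns 302 so that every click reaches the server.

```python
import queue
import threading
import time
from collections import Counter

# Bounded queue: a stand-in for a message broker between the two paths.
click_events: "queue.Queue[tuple]" = queue.Queue(maxsize=10_000)
click_counts: Counter = Counter()  # stand-in for the analytics store


def handle_redirect(code: str, url_map: dict) -> tuple:
    """Return (status, location). Never blocks on analytics."""
    url = url_map.get(code)
    if url is None:
        return 404, None
    try:
        click_events.put_nowait((code, time.time()))
    except queue.Full:
        pass  # drop the event rather than slow down the redirect
    return 302, url


def analytics_worker() -> None:
    """Drain click events in the background and aggregate counts."""
    while True:
        code, _timestamp = click_events.get()
        click_counts[code] += 1
        click_events.task_done()


# Daemon thread so the worker does not keep the process alive on exit.
threading.Thread(target=analytics_worker, daemon=True).start()
```

Dropping events when the queue is full trades analytics accuracy for redirect latency; whether that is acceptable is a product decision worth raising.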

Interview tips

  • Always estimate QPS before proposing architecture
  • Explain the 301 vs 302 tradeoff — it shows product thinking
  • Mention ID generation strategy: DB auto-increment vs Snowflake
  • Address hash collisions if you use hashing instead of encoding
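
If you hash the URL instead of encoding an ID, collision handling might look like the sketch below. It is one possible approach, not the only one: it assumes SHA-256 with an attempt counter as salt, a 7-character code, and a dict whose `setdefault` stands in for an insert guarded by a unique constraint. The function names are hypothetical.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def short_code_from_hash(url: str, attempt: int, length: int = 7) -> str:
    """Derive a fixed-length base62 code from a salted hash of the URL."""
    digest = hashlib.sha256(f"{url}#{attempt}".encode("utf-8")).digest()
    n = int.from_bytes(digest[:8], "big")  # 64 bits is more than 62^7 needs
    chars = []
    for _ in range(length):
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(chars)


def shorten(url: str, store: dict, max_attempts: int = 5) -> str:
    """Insert url under a hash-derived code, re-salting on collision."""
    for attempt in range(max_attempts):
        code = short_code_from_hash(url, attempt)
        # setdefault mimics "insert if absent, otherwise read existing row".
        existing = store.setdefault(code, url)
        if existing == url:
            return code  # newly inserted, or the same URL was already stored
        # A different URL owns this code: collision, retry with the next salt.
    raise RuntimeError("could not find a free short code")
```

Shortening the same URL twice returns the same code, and a collision with a different URL falls through to the next attempt. Truncating the hash to 7 characters is what makes collisions possible in the first place, which is the cost relative to encoding a unique ID.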

Follow-up questions to expect

  • How do you prevent someone from guessing other users' short URLs?
  • How would you add real-time click analytics?
  • How do you handle URL expiration at scale?

TLDR

  • Short code = base62-encoded unique ID
  • Read-heavy: cache aggressively in Redis with LRU eviction
  • 301 saves server load; 302 enables analytics — pick based on requirement
  • Separate the analytics write path from the critical redirect path
  • At scale: read replicas + cache covers almost all read load