Topic 10: System Walkthroughs
Design a URL Shortener
Classic first system design. High read QPS, simple write path.
Design a service like bit.ly. Users submit a long URL and get a short code back. Anyone who visits the short URL is redirected to the original. Sounds simple — the interesting parts are scale and reliability.
Requirements
Clarify scope before designing.
- ›Functional: shorten a URL, redirect short URL to original, optionally track click analytics
- ›Non-functional: 100M URLs created/day, 100:1 read-to-write ratio, URLs live for 10 years
- ›Scale estimate: 10B redirects/day ≈ ~115K QPS reads, ~1.2K QPS writes
- ›Storage: 100M URLs/day × 365 × 10 years × ~500 bytes ≈ ~180 TB total
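The estimates above are easy to sanity-check with quick arithmetic. A minimal sketch using only the figures already stated:

```python
# Back-of-envelope check of the estimates above (pure arithmetic).
SECONDS_PER_DAY = 86_400

writes_per_day = 100_000_000            # 100M new URLs/day
reads_per_day = 10_000_000_000          # 10B redirects/day

write_qps = writes_per_day / SECONDS_PER_DAY   # ≈ 1.2K
read_qps = reads_per_day / SECONDS_PER_DAY     # ≈ 115K

bytes_per_record = 500                  # rough per-mapping footprint
total_bytes = writes_per_day * 365 * 10 * bytes_per_record
total_tb = total_bytes / 1e12           # ≈ 182 TB over 10 years

print(f"{write_qps:,.0f} write QPS, {read_qps:,.0f} read QPS, {total_tb:.0f} TB")
```

Memorizing 86,400 seconds per day (or rounding to 100K for mental math) makes these estimates fast to reproduce at the whiteboard.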
Core design
The key components and decisions.
- ›Short code generation: base62 encode a unique ID (a-z, A-Z, 0-9 = 62 chars, 7 chars = 3.5T combinations)
- ›ID generation: auto-increment DB ID, or a distributed ID generator (Snowflake)
- ›Storage: write URL mapping to PostgreSQL. Cache hot URLs in Redis.
- ›Redirect: 301 (permanent, browser caches — saves QPS) vs 302 (temporary — better for analytics)
- ›Read path: check Redis cache → if miss, hit DB → return redirect
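The base62 step above is a short loop over `divmod`. A minimal sketch (the alphabet ordering is an arbitrary choice; any fixed 62-character ordering works as long as encode and decode agree):

```python
import string

# a-z, A-Z, 0-9 → 62 characters
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits

def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a base62 short code."""
    if n == 0:
        return ALPHABET[0]
    code = []
    while n:
        n, rem = divmod(n, 62)
        code.append(ALPHABET[rem])
    return "".join(reversed(code))  # most significant digit first

def decode_base62(code: str) -> int:
    """Invert encode_base62: map a short code back to its integer ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Because the code is just the ID in another base, decoding on the read path is trivial and no separate lookup-by-code index is strictly required if IDs are the primary key.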
Scaling considerations
At 115K read QPS, the bottleneck is clear.
- ›Cache aggressively — 80% of redirects hit the same 20% of URLs
- ›Add read replicas for the database
- ›CDN for redirect responses if using 301 caching
- ›Shard by short code hash if write volume ever becomes a problem
Tradeoffs
Decisions worth discussing with the interviewer.
- ›301 vs 302 redirect: 301 saves server load; 302 enables per-click analytics
- ›Custom aliases: allow users to choose short code? Need uniqueness enforcement.
- ›Expiration: TTL on URLs adds complexity but controls storage growth
- ›Analytics: separate analytics write path from redirect path to avoid latency coupling
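Decoupling analytics usually means the redirect handler only enqueues an event and returns. A sketch using an in-process queue and worker thread (in production the queue would be Kafka or similar, and `click_counts` a real analytics store; both names here are illustrative):

```python
import queue
import threading
import time

# Fire-and-forget event queue: the redirect path never waits on analytics.
click_events: queue.Queue = queue.Queue()
click_counts: dict[str, int] = {}

def handle_redirect(short_code: str) -> None:
    """Redirect hot path: enqueue the click and return immediately."""
    click_events.put((short_code, time.time()))
    # ... issue the 302 redirect here, without touching analytics storage ...

def analytics_worker() -> None:
    """Background consumer: aggregates clicks off the hot path."""
    while True:
        code, ts = click_events.get()
        click_counts[code] = click_counts.get(code, 0) + 1
        click_events.task_done()

threading.Thread(target=analytics_worker, daemon=True).start()
```

The key property is that a slow or failing analytics pipeline degrades click counts, not redirect latency.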
Interview tips
- ✓Always estimate QPS before proposing architecture
- ✓Explain the 301 vs 302 tradeoff — it shows product thinking
- ✓Mention ID generation strategy: DB auto-increment vs Snowflake
- ✓Address hash collisions if you use hashing instead of encoding
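If you hash the URL instead of encoding an ID, collisions must be handled explicitly. A sketch that retries with a salt (a dict stands in for the database's unique index on the short-code column; note that truncated `hexdigest` gives base16 codes, not base62):

```python
import hashlib

# Stand-in for a table with a UNIQUE constraint on the short-code column.
TABLE: dict[str, str] = {}

def shorten_by_hash(long_url: str) -> str:
    """Derive a 7-char code from a hash; salt and retry on collision."""
    salt = 0
    while True:
        digest = hashlib.md5(f"{long_url}:{salt}".encode()).hexdigest()
        code = digest[:7]
        existing = TABLE.get(code)
        if existing is None:
            TABLE[code] = long_url   # unique insert succeeded
            return code
        if existing == long_url:
            return code              # same URL already shortened: reuse code
        salt += 1                    # collision with a different URL: retry
```

Compared with ID encoding, hashing makes codes deterministic per URL (handy for dedup) but trades away the collision-free guarantee, which is exactly the tradeoff worth naming aloud.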
Follow-up questions to expect
- ?How do you prevent someone from guessing other users' short URLs?
- ?How would you add real-time click analytics?
- ?How do you handle URL expiration at scale?
TLDR
- ›Short code = base62-encoded unique ID
- ›Read-heavy: cache aggressively in Redis with LRU
- ›301 saves server load; 302 enables analytics — pick based on requirement
- ›Separate the analytics write path from the critical redirect path
- ›At scale: read replicas + cache covers almost all read load