unemployed.dev☕ Support
system-design/estimation
Topic 02Building Blocks

Estimation

Rough numbers that drive real architecture decisions.

Back-of-the-envelope math helps you decide whether to use one database or ten, one server or a fleet. You don't need precision — you need the right order of magnitude.

Key numbers to internalize

Memorize these. They come up in every estimation.

  • 1 million seconds ≈ 11.5 days
  • 1 billion requests/day ≈ ~11,600 requests/second
  • 1 KB × 1 billion = 1 TB
  • Average tweet: ~300 bytes. Image: ~300 KB. Video: ~10 MB/minute
  • SSD read latency: ~0.1ms. Network round trip: ~1–10ms. Disk seek: ~10ms
  • A single server can handle ~10,000 QPS for simple reads

Estimation framework

Run through this consistently for every system.

  • Daily Active Users (DAU): how many users do things per day?
  • Requests per second (QPS): DAU × actions/day ÷ 86,400
  • Storage: records/day × record size × retention period
  • Bandwidth: QPS × response size
  • Memory for cache: top 20% of data handles 80% of traffic

Worked example: Twitter-scale feed

300M DAU, 5 tweets read per session, average tweet 300 bytes.

  • Read QPS: 300M × 5 ÷ 86,400 ≈ 17,000 QPS
  • Write QPS: assume 1% of users write → 300M × 0.01 ÷ 86,400 ≈ 35 writes/sec
  • Storage: 35 writes/sec × 300 bytes × 86,400 sec × 365 days ≈ 330 GB/year
  • Conclusion: reads dominate → optimize read path with caching

Interview tips

  • Always do estimation before proposing architecture
  • State your assumptions out loud: 'I'm assuming 100M DAU'
  • Let the numbers justify your choices: 'at 50K QPS, one DB won't cut it'
  • Be wrong confidently — correcting assumptions is fine, silence isn't

Follow-up questions to expect

  • ?How does your architecture change at 10x the estimated QPS?
  • ?At what scale would a single database stop being viable?
  • ?How would you estimate cache hit rate for this system?
TLDR
  • QPS = DAU × actions per day ÷ 86,400
  • Storage = records per day × size × retention
  • Order of magnitude is enough — don't over-optimize the math
  • Estimation tells you what the bottleneck will be before you design
  • Read QPS >> Write QPS in most consumer systems