Replication & Sharding

Scale reads with replication. Scale writes and storage with sharding.

A single database node will eventually hit its limits. Replication adds read capacity and fault tolerance. Sharding splits data across nodes to distribute write load and storage.

Replication

Copies of your data on multiple nodes.

›Primary-replica — one primary accepts writes, replicas serve reads
›Reduces read load on primary significantly
›Lag: replicas may be seconds behind primary — eventual consistency
›Failover: if primary dies, a replica is promoted
›Multi-primary — multiple nodes accept writes, conflict resolution needed

Sharding (horizontal partitioning)

Splitting data across multiple database nodes.

›Each shard holds a subset of the data
›Hash sharding — hash(user_id) % N determines shard. Even distribution.
›Range sharding — shard by date or ID range. Easy range queries, hotspot risk.
›Directory sharding — lookup service maps keys to shards. Flexible, more complexity.
›Cross-shard queries and transactions are expensive — avoid them in your schema

When to shard

Sharding adds significant operational complexity. Don't do it early.

›Single DB is hitting CPU, memory, or storage limits
›Write throughput is the bottleneck, not reads
›You've already added read replicas and caching
›Rule of thumb: shard when a single node can't keep up even with replicas and cache

Interview tips

✓When you add sharding, immediately address the partition key choice
✓Mention hotspot risk — what if one shard gets 80% of traffic?
✓Explain your replication strategy when discussing fault tolerance
✓Show you know the order: cache → replicas → sharding

Follow-up questions to expect

?How do you rebalance shards when you add a new node?
?What is your sharding key and why?
?How do you handle a query that spans multiple shards?

TLDR

›Replication = copies for read scaling and fault tolerance
›Sharding = horizontal split for write scaling and storage
›Use replication first — sharding is operationally expensive
›Hash sharding for even distribution; range sharding for time-series
›Cross-shard transactions are painful — design your schema to avoid them

Building blocks

Systems

←Consistency vs Availability Rate Limiting→