Topic 12: System Walkthroughs

Design a Chat App

Real-time messaging, message history, and delivery guarantees.

Design a system like WhatsApp or Slack. Users send messages to individuals or groups, messages are delivered in real time, and history is persisted. The core challenge is real-time delivery at scale.

Requirements

Scoping the problem.

  • 1-to-1 and group messaging (up to 100 members per group)
  • Messages delivered in real time when the recipient is online
  • Messages persisted so a recipient who was offline can retrieve them later
  • 50M DAU, avg 40 messages/day per user → ~23K messages/sec
  • Message storage: 23K msg/sec × 100 bytes/message × 86,400 s/day × 365 days ≈ 73 TB/year of raw payload
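
The throughput and storage figures above can be reproduced with a few lines of arithmetic. This is only a back-of-envelope sketch; the 100-byte average message size is the same assumption used in the list, and it ignores replication, indexes, and metadata.

```python
# Back-of-envelope capacity estimate using the figures from the requirements.
DAU = 50_000_000             # daily active users
MSGS_PER_USER_PER_DAY = 40   # average messages sent per user per day
BYTES_PER_MSG = 100          # assumed average stored size of one message
SECONDS_PER_DAY = 86_400

messages_per_day = DAU * MSGS_PER_USER_PER_DAY
messages_per_sec = messages_per_day / SECONDS_PER_DAY
bytes_per_year = messages_per_day * BYTES_PER_MSG * 365

print(f"{messages_per_sec:,.0f} messages/sec")  # 23,148 messages/sec
print(f"{bytes_per_year / 1e12:.0f} TB/year")   # 73 TB/year
```

Peak traffic is usually a multiple of the average, so a real design would size for a higher rate than the ~23K/sec shown here.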

Real-time delivery

How messages get to recipients immediately.

  • WebSocket — persistent bidirectional connection between client and chat server
  • Each user maintains a WebSocket connection to a chat server
  • Message arrives → chat server looks up recipient's connection → pushes immediately
  • Connection service maps user ID → which chat server holds their WebSocket
  • Long polling as fallback for environments that don't support WebSockets
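
The routing step (look up which chat server holds the recipient's connection, then push) can be sketched in memory as follows. The names `ChatServer`, `ConnectionService`, and `deliver` are hypothetical, and the "socket" is just a list standing in for a real WebSocket; a production connection service would more likely live in a shared store than in one process.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatServer:
    """Stand-in for one chat server; each 'socket' is a list acting as an outbox."""
    name: str
    sockets: dict[str, list[str]] = field(default_factory=dict)

    def push(self, user_id: str, payload: str) -> None:
        self.sockets[user_id].append(payload)


class ConnectionService:
    """Maps user ID -> the chat server that currently holds that user's connection."""

    def __init__(self) -> None:
        self._server_for_user: dict[str, ChatServer] = {}

    def register(self, user_id: str, server: ChatServer) -> None:
        server.sockets[user_id] = []
        self._server_for_user[user_id] = server

    def unregister(self, user_id: str) -> None:
        server = self._server_for_user.pop(user_id, None)
        if server is not None:
            server.sockets.pop(user_id, None)

    def lookup(self, user_id: str) -> Optional[ChatServer]:
        return self._server_for_user.get(user_id)


def deliver(conn: ConnectionService, recipient_id: str, payload: str) -> bool:
    """Push to the recipient if online; return False so the caller can rely on stored history."""
    server = conn.lookup(recipient_id)
    if server is None:
        return False
    server.push(recipient_id, payload)
    return True


conn = ConnectionService()
server_a = ChatServer("chat-a")
conn.register("alice", server_a)
delivered = deliver(conn, "alice", "hi")   # alice is connected to chat-a
missed = deliver(conn, "bob", "hello")     # bob has no connection registered
```

Returning `False` for an unregistered user is the hook for offline delivery: the message is already persisted, so the recipient picks it up on reconnect.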

Message storage and history

Persisting messages for retrieval.

  • Cassandra for message storage — write-heavy, append-only, partition by conversation ID
  • Each message gets a unique, time-sortable ID (Snowflake-style) for ordering; IDs from different generators are only roughly ordered, since clocks can skew
  • Offline delivery: message stored → when user reconnects, fetch unread messages
  • Message sync: client tracks last seen message ID, fetches delta on reconnect
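
A minimal sketch of the ID scheme and the reconnect sync, assuming a Snowflake-style layout of 41 timestamp bits, 10 worker bits, and 12 sequence bits. The class name, the custom epoch, and the `fetch_delta` helper are illustrative assumptions, and the sorted list stands in for one conversation partition in the message store.

```python
import bisect
import time

EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)


class SnowflakeLikeGenerator:
    """Layout: millisecond timestamp | 10-bit worker ID | 12-bit sequence."""

    def __init__(self, worker_id: int, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.seq = 0

    def next_id(self) -> int:
        now = self.clock()
        if now < self.last_ms:
            now = self.last_ms          # clock went backwards: never emit a smaller ID
        if now == self.last_ms:
            self.seq = (self.seq + 1) & 0xFFF
            if self.seq == 0:
                now = self.last_ms + 1  # sequence exhausted: borrow the next millisecond
        else:
            self.seq = 0
        self.last_ms = now
        return ((now - EPOCH_MS) << 22) | (self.worker_id << 12) | self.seq


def fetch_delta(conversation: list[tuple[int, str]], last_seen_id: int) -> list[tuple[int, str]]:
    """Return messages with ID > last_seen_id; `conversation` must be sorted by ID."""
    ids = [message_id for message_id, _ in conversation]
    return conversation[bisect.bisect_right(ids, last_seen_id):]


# Fixed clock so the example is deterministic.
gen = SnowflakeLikeGenerator(worker_id=7, clock=lambda: EPOCH_MS + 5)
conversation = [(gen.next_id(), "a"), (gen.next_id(), "b"), (gen.next_id(), "c")]
last_seen = conversation[0][0]
unread = fetch_delta(conversation, last_seen)  # the messages "b" and "c"
```

Real Snowflake implementations typically wait for the next millisecond when the sequence overflows; advancing the timestamp logically, as done here, is a simplification that keeps the sketch non-blocking.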

Group messaging

Fan-out to multiple recipients.

  • Small groups: fan-out on write to all member connections
  • Large groups (Slack channels): fan-out via message queue; each chat server that hosts a member pulls from the queue and pushes to its local connections
  • Read receipts: separate lightweight event, not the critical path
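
The two fan-out strategies can be contrasted with a small in-memory sketch. The function names are hypothetical, plain lists stand in for WebSocket connections, and a `deque` per chat server stands in for a real message queue.

```python
from collections import defaultdict, deque


def fan_out_on_write(message, members, connections):
    """Small group: push one copy to every online member; return the offline members."""
    offline = []
    for user_id in members:
        outbox = connections.get(user_id)
        if outbox is None:
            offline.append(user_id)
        else:
            outbox.append(message)
    return offline


def fan_out_via_queue(message, members, server_of, queues):
    """Large group: enqueue one event per chat server hosting at least one member.

    Each server later pulls its event and pushes to its own local connections.
    Returns the number of events enqueued.
    """
    by_server = defaultdict(list)
    for user_id in members:
        server = server_of.get(user_id)
        if server is not None:
            by_server[server].append(user_id)
    for server, local_members in by_server.items():
        queues[server].append((message, local_members))
    return len(by_server)


connections = {"alice": [], "bob": []}
offline = fan_out_on_write("hi", ["alice", "bob", "carol"], connections)

server_of = {"alice": "chat-a", "bob": "chat-a", "dave": "chat-b"}
queues = defaultdict(deque)
events = fan_out_via_queue("standup at 10", ["alice", "bob", "dave"], server_of, queues)
```

In the queue-based variant the sender's request enqueues one event per server rather than one push per member, which is what keeps large-group sends cheap on the write path.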

Interview tips

  • Lead with WebSockets for real-time delivery; naming plain HTTP polling as the primary mechanism is wrong here, though long polling is a reasonable fallback
  • Address offline delivery — a common follow-up
  • Mention message ordering and deduplication
  • Separate the concerns: presence, delivery, storage, and notifications
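
For the deduplication point, one common approach is to have the client attach its own ID to each message so that retries can be recognized. The sketch below assumes that approach; `DedupingInbox` and `client_msg_id` are hypothetical names, and a real system would bound or expire the set of seen keys.

```python
class DedupingInbox:
    """Accepts each (sender, client message ID) pair at most once."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()
        self.messages: list[str] = []

    def accept(self, sender_id: str, client_msg_id: str, text: str) -> bool:
        key = (sender_id, client_msg_id)
        if key in self._seen:
            return False  # retry of a message that was already stored
        self._seen.add(key)
        self.messages.append(text)
        return True


inbox = DedupingInbox()
first = inbox.accept("alice", "m-1", "hi")
retry = inbox.accept("alice", "m-1", "hi")  # same client ID, e.g. resent after a timeout
```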

Follow-up questions to expect

  • How do you handle message ordering across distributed servers?
  • How do you implement end-to-end encryption?
  • How do you scale to 1000-person group chats?

TLDR

  • WebSockets for real-time delivery — persistent bidirectional connection
  • Cassandra for message storage — write-heavy, append-only access pattern
  • Connection service maps user ID to chat server for routing
  • Offline users: store messages, deliver on reconnect
  • Large groups: queue-based fan-out so one send does not become a synchronous burst of per-member pushes