Design Twitter / X (Feed)

The interesting half of Twitter is the home feed read — serving each user a merged, reverse-chronological stream of the people they follow, fast, at enormous scale. The skill being tested is evolving a simple baseline under questioning, not presenting the optimized design first. This walks that arc, with the diagram refined in stages — grey = baseline, indigo = fan-out-on-write, green = celebrity hybrid — so you can see exactly what each deep dive adds. Terms are defined in the Study List.

Non-functional requirements

State the qualities up front; they’re what every later decision answers to.

Highly available, preferring availability over consistency — a feed that’s a few seconds stale is fine; a feed that errors is not.
Scale to 100M+ DAU, read-heavy by a wide margin (people scroll far more than they post).
Feed reads under 200ms — the latency the whole design exists to hit.

High-level design

The naive baseline — correct, but it can’t meet the NFRs. That’s deliberate: it’s the structural skeleton you draw first, then refine.

Components: a Client (any first-party app — web/iOS/Android), an API Gateway, three microservices (Tweet, Follower, Feed), and two Postgres DBs (Tweet DB, Following DB).

Why an API Gateway, not just a load balancer. The gateway is the system’s front door for client traffic: it routes requests across heterogeneous services and handles cross-cutting concerns such as auth, throttling, request validation, and versioning. An application load balancer (ALB) only routes and distributes load across the instances of one service. They compose rather than compete: client → gateway → ALB per service → instances.
Why three services. The split is along real boundaries — independent scaling profiles (reads ≫ writes), separate data ownership, and isolation of the expensive feed workload from cheap CRUD. The honest tradeoff: a real early-stage product should start as a modular monolith; the microservice split is justified here by Twitter’s read/write asymmetry, not by default.
Feed Service has no database. It’s a stateless composer doing fan-out-on-read: getFollowees() against the Follower Service, then getTweets(followees) against the Tweet Service, merged live per request. Correct, but every feed load fans out across potentially thousands of followees and hits the Tweet DB hard, so it cannot meet 200ms at 100M DAU. This is the starting point the deep dives fix.
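The baseline read path can be sketched as a k-way merge over per-author tweet lists. This is a minimal illustration, not a definitive implementation: the dictionaries are hypothetical in-memory stand-ins for the Following DB and Tweet DB, and `get_followees`, `get_tweets`, and `read_feed` are assumed names for the calls the Feed Service would make.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author_id: str
    created_at: int  # epoch seconds (assumed representation)
    text: str


# Hypothetical stand-ins for the Following DB and the Tweet DB.
FOLLOWING = {"alice": ["bob", "carol"]}
TWEETS_BY_AUTHOR = {
    "bob": [Tweet(3, "bob", 300, "b2"), Tweet(1, "bob", 100, "b1")],
    "carol": [Tweet(2, "carol", 200, "c1")],
}


def get_followees(user_id):
    """Follower Service lookup: the accounts this user follows."""
    return FOLLOWING.get(user_id, [])


def get_tweets(author_id):
    """Tweet Service lookup: one author's tweets, newest first."""
    return TWEETS_BY_AUTHOR.get(author_id, [])


def read_feed(user_id, limit=20):
    """Fan-out-on-read: one query per followee, merged live per request."""
    per_author = [get_tweets(a) for a in get_followees(user_id)]
    # Each list is already newest-first, so a reverse k-way merge keeps
    # the combined stream in reverse-chronological order.
    merged = heapq.merge(*per_author, key=lambda t: t.created_at, reverse=True)
    return list(islice(merged, limit))
```

The cost is visible in `read_feed`: the number of `get_tweets` calls grows with the number of followees, and all of that work is repeated on every feed load.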

Deep dives

Fan-out-on-write (push)

Flip the work from read time to write time. When a tweet is posted, the Tweet Service drops a job on a queue; a fleet of fan-out workers asynchronously writes that tweet into each follower’s precomputed per-user timeline cache (Redis). Reads collapse to a single cache lookup — comfortably under 200ms. The cost is write amplification, which is acceptable for normal accounts and is exactly what eventual consistency lets us absorb asynchronously.
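A minimal sketch of the push path, under loud assumptions: the queue and the timeline cache are plain in-memory structures standing in for a real queue and Redis, `TIMELINE_LIMIT` is an arbitrary illustrative cap, and all function names are hypothetical.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # assumed cap on cached entries per user (illustrative)

# Hypothetical stand-ins: follower graph, fan-out queue, timeline cache.
FOLLOWERS = {"bob": ["alice", "dave"]}
fanout_queue = deque()
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def post_tweet(author_id, tweet_id):
    """Tweet Service: persist the tweet (omitted), then enqueue a fan-out job."""
    fanout_queue.append((author_id, tweet_id))


def run_fanout_worker():
    """Drain the queue and return the number of timeline writes performed."""
    writes = 0
    while fanout_queue:
        author_id, tweet_id = fanout_queue.popleft()
        for follower_id in FOLLOWERS.get(author_id, []):
            # Newest entry goes to the front; maxlen evicts the oldest.
            timelines[follower_id].appendleft(tweet_id)
            writes += 1
    return writes


def read_feed(user_id, limit=20):
    """Feed Service: a single lookup in the precomputed timeline."""
    return list(timelines.get(user_id, ()))[:limit]
```

Two tweets from an author with two followers cost four timeline writes, which is the write amplification in miniature. Until the worker runs, `read_feed` returns the older timeline, which is the staleness that eventual consistency permits.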

The celebrity hybrid (and why)

Pushing a tweet from a 50M+ follower account is a write storm — and the naive instinct is to blame propagation latency. That’s the wrong reason: under eventual consistency, slow propagation is fine. The real problems with pushing celebrities are:

Wasted work: one celebrity tweet materializes 50M+ writes into timelines, most belonging to inactive users who’ll never read them.
Storage multiplication — the same tweet duplicated across tens of millions of timeline caches.
Write-capacity contention — the storm steals shared worker and Redis throughput from normal active users whose tweets also need fanning out, degrading everyone.

So celebrities are pulled at read time instead. The critical nuance: the pull is cached once per celebrity — their recent tweets are identical for every follower — and merged into the feed at read time. It’s one extra cheap cache hit, not a live DB query per follower.
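The hybrid read can be sketched as a merge of two cached sources. This is an illustration under stated assumptions: the three dictionaries are hypothetical stand-ins for the timeline cache, the shared celebrity tweet cache, and a lookup of which celebrities a user follows, and entries are assumed to be `(created_at, tweet_id)` pairs stored newest first.

```python
import heapq
from itertools import islice

# Hypothetical caches; entries are (created_at, tweet_id), newest first.
TIMELINE_CACHE = {"alice": [(300, "bob-2"), (100, "bob-1")]}
CELEBRITY_CACHE = {"star": [(250, "star-9"), (50, "star-8")]}
CELEBRITY_FOLLOWEES = {"alice": ["star"]}


def read_feed(user_id, limit=20):
    """Merge the pushed timeline with each followed celebrity's shared cache."""
    sources = [TIMELINE_CACHE.get(user_id, [])]
    for celeb_id in CELEBRITY_FOLLOWEES.get(user_id, []):
        # One cache entry per celebrity, identical for every follower.
        sources.append(CELEBRITY_CACHE.get(celeb_id, []))
    merged = heapq.merge(*sources, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

The celebrity's list is stored once in `CELEBRITY_CACHE` no matter how many followers read it, so the extra read-time cost per feed load is one cache hit per followed celebrity.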

The rule generalizes: push scales with follower count; pull scales with read demand. For celebrities these diverge by orders of magnitude. So push when followers are few and likely-active; pull when they’re vast and mostly-idle, then merge the two at read time.
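One way to express the rule is a write-time routing decision keyed on follower count. The sketch below is only that: `CELEBRITY_THRESHOLD` is an assumed, illustrative cutoff (the text names no number), and both function names are hypothetical.

```python
CELEBRITY_THRESHOLD = 1_000_000  # assumed cutoff, purely illustrative


def timeline_writes(follower_count, tweets_posted):
    """Push cost: every tweet is written once per follower."""
    return follower_count * tweets_posted


def choose_strategy(follower_count, threshold=CELEBRITY_THRESHOLD):
    """Route an author's tweets to push (fan-out) or pull (shared cache)."""
    return "pull" if follower_count >= threshold else "push"
```

For example, `timeline_writes(50_000_000, 1)` is 50,000,000 writes for a single tweet, while the pull side costs one shared cache entry plus one cache hit per feed load that actually happens. A production rule would likely also weigh how active the followers are, which this sketch leaves out.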

Mermaid sources

%% 1 — HLD baseline (fan-out-on-read)
flowchart LR
  classDef base fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef write fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef celeb fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;

  Client(["Client<br/>web · iOS · Android"]):::base
  GW{{"API Gateway<br/>route · auth · throttle"}}:::base
  TS("Tweet Service"):::base
  FS("Follower Service"):::base
  Feed("Feed Service<br/>stateless · fan-out-on-read"):::base
  TDB[("Tweet DB · Postgres")]:::base
  FDB[("Following DB · Postgres")]:::base

  Client --> GW
  GW -->|write| TS
  GW --> FS
  GW -->|read feed| Feed
  TS --> TDB
  FS --> FDB
  Feed -->|"getFollowees()"| FS
  Feed -->|"getTweets(followees)<br/>merge live · slow"| TS

%% 2 — + fan-out-on-write (indigo)
flowchart LR
  classDef base fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef write fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef celeb fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;

  Client(["Client<br/>web · iOS · Android"]):::base
  GW{{"API Gateway<br/>route · auth · throttle"}}:::base
  TS("Tweet Service"):::base
  FS("Follower Service"):::base
  Feed("Feed Service<br/>stateless composer"):::base
  TDB[("Tweet DB · Postgres<br/>sharded by authorId")]:::base
  FDB[("Following DB · Postgres")]:::base
  Q["Fan-out queue"]:::write
  W("Fan-out workers ×N"):::write
  TC[("Timeline cache · Redis<br/>per-user precomputed feed")]:::write

  Client --> GW
  GW -->|write| TS
  GW --> FS
  GW -->|read feed| Feed
  TS --> TDB
  FS --> FDB
  Feed -.->|"baseline: merge live · slow"| TS
  TS -->|"on tweet · async"| Q
  Q --> W
  W -->|"getFollowers()"| FS
  W -->|"push into each<br/>follower's timeline"| TC
  Feed ==>|"1 cache lookup · &lt;200ms"| TC

%% 3 — + celebrity hybrid (green)
flowchart LR
  classDef base fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef write fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef celeb fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;

  Client(["Client<br/>web · iOS · Android"]):::base
  GW{{"API Gateway<br/>route · auth · throttle"}}:::base
  TS("Tweet Service"):::base
  FS("Follower Service"):::base
  Feed("Feed Service<br/>merge cache + celebrities"):::base
  TDB[("Tweet DB · Postgres<br/>sharded by authorId")]:::base
  FDB[("Following DB · Postgres")]:::base
  Q["Fan-out queue"]:::write
  W("Fan-out workers ×N"):::write
  TC[("Timeline cache · Redis<br/>per-user precomputed feed")]:::write
  CC[("Celebrity tweet cache<br/>recent tweets · cached once, shared")]:::celeb

  Client --> GW
  GW -->|write| TS
  GW --> FS
  GW -->|read feed| Feed
  TS --> TDB
  FS --> FDB
  TS -->|"on tweet · async<br/>(normal authors only)"| Q
  Q --> W
  W -->|"getFollowers()"| FS
  W -->|"push into each<br/>follower's timeline"| TC
  Feed -->|"normal: 1 cache lookup"| TC
  TS -.->|"recent tweets"| CC
  Feed ==>|"celebrity authors:<br/>cached pull + merge"| CC

Scaling & data

Horizontal scaling — services autoscale behind the load balancer. Don’t draw N boxes; annotate ×N.
DB sharding — Tweet DB sharded by authorId (always name the shard key): a user’s tweets co-locate, so the celebrity pull and per-author reads stay single-shard.
Caching — the timeline cache is the central low-latency mechanism; the celebrity cache is the second, read-side cache that keeps hot accounts off the write path.
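Routing by the authorId shard key can be sketched with a stable hash. This is a simplified illustration: `NUM_SHARDS` is an assumed value, `shard_for` is a hypothetical helper, and plain modulo placement is used for brevity even though it forces data movement when the shard count changes.

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count, illustrative only


def shard_for(author_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an authorId to a shard index so one author's tweets co-locate."""
    # hashlib gives a hash that is stable across processes; Python's
    # built-in hash() for str is randomized per process by default.
    digest = hashlib.sha256(author_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Because every tweet by the same author maps to the same index, a per-author read such as refreshing a celebrity's cached tweets touches a single shard.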

Interview meta-note

HLD = the boxes and data ownership (the structural skeleton). Deep dives = the algorithms inside the boxes and the hard scaling mechanics (the behavioral detail — fan-out strategy, shard key, hybrid). The signal an interviewer is reading is whether you can evolve the baseline as they probe — so present the naive version on purpose, then earn each optimization with the constraint that forces it.