Design Twitter / X (Feed)
The interesting half of Twitter is the home feed read — serving each user a merged, reverse-chronological stream of the people they follow, fast, at enormous scale. The skill being tested is evolving a simple baseline under questioning, not presenting the optimized design first. This walks that arc, with the diagram refined in stages — grey = baseline, indigo = fan-out-on-write, green = celebrity hybrid — so you can see exactly what each deep dive adds. Terms are defined in the Study List.
Non-functional requirements
State the qualities up front; they’re what every later decision answers to.
- Highly available, preferring availability over consistency — a feed that’s a few seconds stale is fine; a feed that errors is not.
- Scale to 100M+ DAU, read-heavy by a wide margin (people scroll far more than they post).
- Feed reads under 200ms — the latency the whole design exists to hit.
High-level design
The naive baseline: correct, but it can’t meet the NFRs. That’s deliberate: it’s the structural skeleton you draw first, then refine.
Components: a Client (any first-party app — web/iOS/Android), an API Gateway, three microservices (Tweet, Follower, Feed), and two Postgres DBs (Tweet DB, Following DB).
- Why an API Gateway, not just a load balancer. The gateway is the front door to the internal services: it routes requests across heterogeneous services and manages cross-cutting concerns (auth, throttling, request validation, versioning). An ALB (application load balancer) only routes and distributes load. They compose rather than compete: client → gateway → ALB per service → instances.
- Why three services. The split is along real boundaries: independent scaling profiles (reads ≫ writes), separate data ownership, and isolation of the expensive feed workload from cheap CRUD. The honest tradeoff: a real early-stage product should start as a modular monolith; the microservice split is justified here by Twitter’s read/write asymmetry, not by default.
- Feed Service has no database. It’s a stateless composer doing fan-out-on-read: a Follower Service lookup (getFollowers()) for the accounts the user follows, then getTweets(followees), merged live per request. Correct, but every feed load fans out across potentially thousands of followees and hits the Tweet DB hard, so it cannot meet 200ms at 100M DAU. This is the starting point the deep dives fix.
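The fan-out-on-read baseline can be sketched in a few lines. This is a minimal illustration, not the real services: the dictionaries stand in for the Following DB and Tweet DB, and `get_followees`, `get_tweets`, `read_feed`, and the `Tweet` fields are hypothetical names chosen for the example.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author_id: str
    created_at: int  # epoch seconds (hypothetical field)
    text: str


# In-memory stand-ins for the Following DB and the Tweet DB (assumptions).
FOLLOWING = {"alice": ["bob", "carol"]}
TWEETS_BY_AUTHOR = {
    # Each author's list is kept newest-first.
    "bob": [Tweet(3, "bob", 300, "b2"), Tweet(1, "bob", 100, "b1")],
    "carol": [Tweet(2, "carol", 200, "c1")],
}


def get_followees(user_id):
    """Follower Service stand-in: accounts that user_id follows."""
    return FOLLOWING.get(user_id, [])


def get_tweets(author_id):
    """Tweet Service stand-in: one author's tweets, newest first."""
    return TWEETS_BY_AUTHOR.get(author_id, [])


def read_feed(user_id, limit=20):
    """Fan-out-on-read: one lookup per followee, merged live on every request."""
    per_author = [get_tweets(author) for author in get_followees(user_id)]
    # heapq.merge does a k-way merge of already-sorted inputs; reverse=True
    # expects each input sorted largest-to-smallest by the key.
    merged = heapq.merge(*per_author, key=lambda t: t.created_at, reverse=True)
    return list(islice(merged, limit))
```

The cost is visible in `read_feed`: the number of `get_tweets` calls grows with the number of followees, and all of them happen while the user is waiting.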
Deep dives
Fan-out-on-write (push)
Flip the work from read time to write time. When a tweet is posted, the Tweet Service drops a job on a queue; a fleet of fan-out workers asynchronously writes that tweet into each follower’s precomputed per-user timeline cache (Redis). Reads collapse to a single cache lookup, comfortably under 200ms. The cost is write amplification, which is acceptable for normal accounts and is exactly what eventual consistency lets us absorb asynchronously.
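A minimal in-process sketch of the push path, under stated assumptions: a `deque` stands in for the fan-out queue, a dictionary of capped deques stands in for the Redis timeline cache (roughly what a list push followed by a trim would give you), and `TIMELINE_LIMIT`, `post_tweet`, `run_fanout_worker`, and `read_feed` are hypothetical names.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # hypothetical cap on entries kept per user

# author -> followers; stand-in for the Follower Service (assumption).
FOLLOWERS = {"bob": ["alice", "dave"]}

fanout_queue = deque()  # stand-in for the fan-out queue
# user -> newest-first tweet ids; stand-in for the Redis timeline cache.
timeline_cache = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def post_tweet(author_id, tweet_id):
    """Tweet Service: persist the tweet (omitted here), then enqueue a job."""
    fanout_queue.append((author_id, tweet_id))


def run_fanout_worker():
    """Drain the queue, pushing each tweet into every follower's timeline."""
    processed = 0
    while fanout_queue:
        author_id, tweet_id = fanout_queue.popleft()
        for follower_id in FOLLOWERS.get(author_id, []):
            # appendleft keeps newest first; maxlen evicts the oldest entry.
            timeline_cache[follower_id].appendleft(tweet_id)
        processed += 1
    return processed


def read_feed(user_id, limit=20):
    """Read path: a single lookup, no per-followee work."""
    return list(timeline_cache.get(user_id, ()))[:limit]
```

Until the worker runs, followers simply see a slightly stale feed, which is the eventual-consistency tradeoff described above. The inner loop over followers is the write amplification.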
The celebrity hybrid (and why)
Pushing a tweet from a 50M+ follower account is a write storm, and the naive instinct is to blame propagation latency. That’s the wrong reason: under eventual consistency, slow propagation is fine. The real problems with pushing celebrities are:
- Wasted work: one celebrity tweet materializes 50M+ writes into timelines, most belonging to inactive users who’ll never read them.
- Storage multiplication: the same tweet is duplicated across tens of millions of timeline caches.
- Write-capacity contention: the storm steals shared worker and Redis throughput from normal active users whose tweets also need fanning out, degrading everyone.
So celebrities are pulled at read time instead. The critical nuance: the pull is cached once per celebrity — their recent tweets are identical for every follower — and merged into the feed at read time. It’s one extra cheap cache hit, not a live DB query per follower.
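The hybrid read can be sketched as a merge of two newest-first sources. Everything named here is an assumption for illustration: the `CELEBRITY_FOLLOWER_THRESHOLD` cutoff, the `(created_at, tweet_id)` entry shape, and the dictionaries standing in for the timeline cache and the celebrity tweet cache.

```python
import heapq
from itertools import islice

CELEBRITY_FOLLOWER_THRESHOLD = 1_000_000  # hypothetical cutoff


def is_celebrity(follower_count):
    return follower_count >= CELEBRITY_FOLLOWER_THRESHOLD


def route_on_write(follower_count):
    """Write path: celebrities skip the fan-out queue entirely."""
    return "pull" if is_celebrity(follower_count) else "push"


# Entries are (created_at, tweet_id), newest first.
# Per-user timelines, filled by the fan-out workers (stand-in).
timeline_cache = {"alice": [(300, "t3"), (100, "t1")]}
# One entry per celebrity, shared by every follower (stand-in).
celebrity_cache = {"star": [(400, "s2"), (200, "s1")]}
# Which celebrities each user follows (stand-in).
followed_celebrities = {"alice": ["star"]}


def read_feed(user_id, limit=20):
    """Merge the precomputed timeline with each followed celebrity's cache."""
    pushed = timeline_cache.get(user_id, [])
    pulled = [
        celebrity_cache.get(celebrity_id, [])
        for celebrity_id in followed_celebrities.get(user_id, [])
    ]
    # All inputs are newest-first, so a reverse k-way merge keeps that order.
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

For `alice`, the merged feed is `["s2", "t3", "s1", "t1"]`. The celebrity's list is read from one shared cache entry, so the extra cost per feed load is one lookup per followed celebrity.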
The rule generalizes: push scales with follower count; pull scales with read demand. For celebrities these diverge by orders of magnitude. So push when followers are few and likely-active; pull when they’re vast and mostly-idle, then merge the two at read time.
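A back-of-envelope comparison makes the divergence concrete. Only the 50M follower figure comes from the text above; the tweets per day, active percentage, and feed loads per day are made-up inputs for illustration, and the two helper functions are hypothetical.

```python
def daily_push_writes(follower_count, tweets_per_day):
    """Push: every tweet is written once per follower, active or not."""
    return follower_count * tweets_per_day


def daily_pull_reads(follower_count, active_percent, feed_loads_per_day):
    """Pull: one extra cache read per feed load, by active followers only."""
    active_followers = follower_count * active_percent // 100
    return active_followers * feed_loads_per_day


# Illustrative inputs only (assumptions, not measurements).
followers = 50_000_000
push = daily_push_writes(followers, tweets_per_day=20)
pull = daily_pull_reads(followers, active_percent=2, feed_loads_per_day=5)
print(push, pull, push // pull)
```

With these inputs, push costs 1,000,000,000 timeline writes per day against 5,000,000 shared-cache reads for pull, a 200x gap. Different inputs move the ratio, but push always pays for inactive followers and pull never does.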
Mermaid sources
%% 1: HLD baseline (fan-out-on-read)
flowchart LR
  classDef base fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef write fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef celeb fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;

  Client(["Client<br/>web · iOS · Android"]):::base
  GW{{"API Gateway<br/>route · auth · throttle"}}:::base
  TS("Tweet Service"):::base
  FS("Follower Service"):::base
  Feed("Feed Service<br/>stateless · fan-out-on-read"):::base
  TDB[("Tweet DB · Postgres")]:::base
  FDB[("Following DB · Postgres")]:::base

  Client --> GW
  GW -->|write| TS
  GW --> FS
  GW -->|read feed| Feed
  TS --> TDB
  FS --> FDB
  Feed -->|"getFollowers()"| FS
  Feed -->|"getTweets(followees)<br/>merge live · slow"| TS
%% 2: + fan-out-on-write (indigo)
flowchart LR
  classDef base fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef write fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef celeb fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;

  Client(["Client<br/>web · iOS · Android"]):::base
  GW{{"API Gateway<br/>route · auth · throttle"}}:::base
  TS("Tweet Service"):::base
  FS("Follower Service"):::base
  Feed("Feed Service<br/>stateless composer"):::base
  TDB[("Tweet DB · Postgres<br/>sharded by authorId")]:::base
  FDB[("Following DB · Postgres")]:::base
  Q["Fan-out queue"]:::write
  W("Fan-out workers ×N"):::write
  TC[("Timeline cache · Redis<br/>per-user precomputed feed")]:::write

  Client --> GW
  GW -->|write| TS
  GW --> FS
  GW -->|read feed| Feed
  TS --> TDB
  FS --> FDB
  Feed -.->|"baseline: merge live · slow"| TS
  TS -->|"on tweet · async"| Q
  Q --> W
  W -->|"getFollowers()"| FS
  W -->|"push into each<br/>follower's timeline"| TC
  Feed ==>|"1 cache lookup · <200ms"| TC
%% 3: + celebrity hybrid (green)
flowchart LR
  classDef base fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef write fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef celeb fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;

  Client(["Client<br/>web · iOS · Android"]):::base
  GW{{"API Gateway<br/>route · auth · throttle"}}:::base
  TS("Tweet Service"):::base
  FS("Follower Service"):::base
  Feed("Feed Service<br/>merge cache + celebrities"):::base
  TDB[("Tweet DB · Postgres<br/>sharded by authorId")]:::base
  FDB[("Following DB · Postgres")]:::base
  Q["Fan-out queue"]:::write
  W("Fan-out workers ×N"):::write
  TC[("Timeline cache · Redis<br/>per-user precomputed feed")]:::write
  CC[("Celebrity tweet cache<br/>recent tweets · cached once, shared")]:::celeb

  Client --> GW
  GW -->|write| TS
  GW --> FS
  GW -->|read feed| Feed
  TS --> TDB
  FS --> FDB
  TS -->|"on tweet · async<br/>(normal authors only)"| Q
  Q --> W
  W -->|"getFollowers()"| FS
  W -->|"push into each<br/>follower's timeline"| TC
  Feed -->|"normal: 1 cache lookup"| TC
  TS -.->|"recent tweets"| CC
  Feed ==>|"celebrity authors:<br/>cached pull + merge"| CC
Scaling & data
- Horizontal scaling: services autoscale behind the load balancer. Don’t draw N boxes; annotate ×N.
- DB sharding: Tweet DB sharded by authorId (always name the shard key): a user’s tweets co-locate, so the celebrity pull and per-author reads stay single-shard.
- Caching: the timeline cache is the central low-latency mechanism; the celebrity cache is the second, read-side cache that keeps hot accounts off the write path.
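Routing by the authorId shard key might look like the sketch below. The shard count, the hash choice, and the function name `shard_for_author` are assumptions for illustration; the source only specifies the key.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count


def shard_for_author(author_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an authorId to a shard index with a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used to keep routing stable across processes.
    """
    digest = hashlib.sha256(author_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every tweet by one author routes to the same shard, so a per-author
# read (including the celebrity pull) touches a single shard.
assert shard_for_author("bob") == shard_for_author("bob")
```

Plain modulo hashing remaps most keys when the shard count changes; consistent hashing or a lookup table is a common alternative if resharding is expected.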
Interview meta-note
HLD = the boxes and data ownership (the structural skeleton). Deep dives = the algorithms inside the boxes and the hard scaling mechanics (the behavioral detail: fan-out strategy, shard key, hybrid). The signal an interviewer is reading is whether you can evolve the baseline as they probe, so present the naive version on purpose, then earn each optimization with the constraint that forces it.