Event-Driven Architecture

The layer above the transports (Kafka, queues, Temporal): how services actually talk. In event-driven architecture (EDA) — the standard name for this — services communicate by asynchronous messages instead of direct synchronous calls, which buys loose coupling, isolation, and resilience (a slow or dead consumer doesn’t take the producer down).

The messages are mostly events — facts like OrderPlaced — which is where the name comes from, but the same pipes also carry commands (instructions like ChargeCard):

  • Event — announces a fact; the producer doesn’t know or care who reacts. Maximally decoupled.
  • Command — directs a specific handler to do something.

Same messages on the wire; the difference is intent. (The Reactive Manifesto calls the underlying async-messaging foundation “message-driven” — a less-common synonym you’ll occasionally see.)
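
The difference in intent shows up mostly in routing: an event fans out to zero or more subscribers, while a command expects exactly one handler. A minimal sketch, assuming a toy in-memory bus. The Bus class and its subscribe/register/publish/send methods are hypothetical, not a real library, and it dispatches synchronously where a real transport would be asynchronous:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Event: a past-tense fact. It names no recipient."""
    order_id: str
    amount_cents: int


@dataclass(frozen=True)
class ChargeCard:
    """Command: an imperative request addressed to one handler."""
    order_id: str
    amount_cents: int


class Bus:
    """Toy in-memory bus (hypothetical; stands in for a real transport)."""

    def __init__(self):
        self._subscribers = {}  # event type -> list of callbacks (zero or more)
        self._handlers = {}     # command type -> exactly one callback

    def subscribe(self, event_type, callback):
        self._subscribers.setdefault(event_type, []).append(callback)

    def register(self, command_type, callback):
        if command_type in self._handlers:
            raise ValueError(f"{command_type.__name__} already has a handler")
        self._handlers[command_type] = callback

    def publish(self, event):
        # Fan out to whoever subscribed; having no subscribers is fine.
        for callback in self._subscribers.get(type(event), []):
            callback(event)

    def send(self, command):
        # A command nobody handles is an error: someone was supposed to act.
        handler = self._handlers.get(type(command))
        if handler is None:
            raise LookupError(f"no handler for {type(command).__name__}")
        return handler(command)


bus = Bus()
seen = []
bus.subscribe(OrderPlaced, lambda e: seen.append(("email", e.order_id)))
bus.subscribe(OrderPlaced, lambda e: seen.append(("analytics", e.order_id)))
bus.register(ChargeCard, lambda c: f"charged {c.amount_cents}")

bus.publish(OrderPlaced("o-1", 1999))          # both subscribers react
result = bus.send(ChargeCard("o-1", 1999))     # one handler answers
```

Both message classes carry the same fields; only the name (past tense vs imperative) and the delivery contract differ.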

The big structural fork — who owns the multi-step flow?

Choreography (services react to events) vs orchestration (a coordinator directs)

Mermaid source
flowchart TB
classDef svc fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
classDef coord fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
subgraph CH["Choreography — services react to events (no central owner)"]
direction LR
A1["Order svc"]:::svc -->|"OrderPlaced"| B1["Payment svc"]:::svc
B1 -->|"PaymentTaken"| C1["Shipping svc"]:::svc
end
subgraph OR["Orchestration — a coordinator directs the flow"]
direction LR
O["Orchestrator"]:::coord -->|"charge"| B2["Payment svc"]:::svc
O -->|"ship"| C2["Shipping svc"]:::svc
B2 -.->|"result"| O
C2 -.->|"result"| O
end
  • Choreography — each service reacts to others’ events; no central controller. Maximally decoupled and easy to extend (add a consumer without touching anyone), but the end-to-end flow is emergent — nobody owns it, and “what happened across the whole order?” is reconstructed from event soup. Great for simple, loosely-related reactions.
  • Orchestration — a coordinator issues commands, collects results, and owns the state. The flow is explicit and visible, failure/compensation is centralized — at the cost of a component that knows about the others. This is where an orchestrator (Temporal) earns its place; the queue request-reply trap is what pushes you here.

Rule of thumb: choreography for a few independent reactions; orchestration once you must track the state of a process.
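
The same order flow can be sketched both ways to make the ownership question concrete. This is an in-memory illustration only: the on/emit helpers, the OrderShipped event, and the charge/ship/refund functions are hypothetical stand-ins for real services and a real broker.

```python
from collections import defaultdict

# --- Choreography: services subscribe to events; nobody owns the whole flow.
subscribers = defaultdict(list)
choreo_log = []  # the only "view" of the flow is the trail of events


def on(event_name):
    def decorator(fn):
        subscribers[event_name].append(fn)
        return fn
    return decorator


def emit(event_name, payload):
    choreo_log.append(event_name)
    for fn in subscribers[event_name]:
        fn(payload)


@on("OrderPlaced")
def payment_service(order):
    emit("PaymentTaken", order)


@on("PaymentTaken")
def shipping_service(order):
    emit("OrderShipped", order)


# --- Orchestration: one coordinator calls each step and holds the state.
def charge(order):
    return {"ok": order["amount_cents"] > 0}


def ship(order):
    return {"ok": True}


def refund(order):
    return {"ok": True}


def order_workflow(order):
    state = {"order_id": order["order_id"], "steps": []}
    if not charge(order)["ok"]:
        state["status"] = "payment_failed"
        return state
    state["steps"].append("charged")
    if not ship(order)["ok"]:
        refund(order)  # compensation lives in one place
        state["steps"].append("refunded")
        state["status"] = "shipping_failed"
        return state
    state["steps"].append("shipped")
    state["status"] = "completed"
    return state


order = {"order_id": "o-1", "amount_cents": 1999}
emit("OrderPlaced", order)        # choreography: flow emerges from reactions
result = order_workflow(order)    # orchestration: flow is one readable function
```

In the choreographed half, answering "where is order o-1?" means reading choreo_log; in the orchestrated half it is the state dict the coordinator returns.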

Where Temporal fits: it doesn’t replace EDA; it’s an abstraction layer over it. At runtime Temporal executes ordinary sequential code as a durable, event-driven state machine: every completed step is appended to an event history, and state is rebuilt by replaying that log. So you write the flow as straight-line code but get orchestration’s payoff: the workflow state is centralized and observable instead of reconstructed from event soup, while async events can still trigger steps. The engine owns the resilience that choreography makes you hand-roll (retries, timeouts, dead-letter queues, state durability). Reach for it when a flow spans time and must be tracked end-to-end (a payment saga, a long-running agent task), not for a couple of fire-and-forget reactions.
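
The replay idea can be illustrated with a toy executor. This is not Temporal’s SDK or its actual implementation; the Replayer class and its step method are hypothetical names for a sketch of the general technique: record each step’s result in an append-only history, and on recovery re-run the same code while serving already-completed steps from that history.

```python
class Replayer:
    """Toy durable executor: step results go into an append-only history."""

    def __init__(self, history=None):
        self.history = list(history or [])  # a real engine would persist this
        self._cursor = 0
        self.executed = []                  # steps actually run in this process

    def step(self, name, fn, *args):
        if self._cursor < len(self.history):
            # Replay: reuse the recorded result instead of re-running the step.
            recorded_name, result = self.history[self._cursor]
            if recorded_name != name:
                raise RuntimeError("workflow code diverged from its history")
        else:
            # New ground: run the step and record its result.
            result = fn(*args)
            self.executed.append(name)
            self.history.append((name, result))
        self._cursor += 1
        return result


def charge(order_id):
    return f"charge-{order_id}"


def ship(order_id):
    return f"shipment-{order_id}"


def order_workflow(ctx, order_id):
    # Straight-line code; durability comes from routing steps through ctx.
    payment = ctx.step("charge", charge, order_id)
    shipment = ctx.step("ship", ship, order_id)
    return payment, shipment


# First run "crashes" after charging: only that step reaches the history.
first = Replayer()
first.step("charge", charge, "o-1")

# Recovery: run the same workflow code against the saved history.
second = Replayer(history=first.history)
outcome = order_workflow(second, "o-1")
```

On recovery only "ship" executes; "charge" is answered from the history, which is why the card is not charged twice. The history-mismatch check hints at why replay-based engines need workflow code to be deterministic.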

Common shapes once you’re event-driven:

  • Event notification — a thin “X happened, go look” ping; the consumer calls back for details. Lowest coupling, but chatty.
  • Event-carried state transfer — the event carries the data the consumer needs, so it doesn’t call back. Decoupled and fast, at the cost of duplicated/denormalized data.
  • Event sourcing — the event log is the source of truth; state is a fold over events. Full audit + replay, harder to query.
  • CQRS — split the write model from read models kept in sync via events; often paired with event sourcing.
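
The last two shapes are easier to see in code. A minimal sketch, assuming a hypothetical account-balance domain (the Deposited and Withdrawn events and the in-memory log are illustrative only): current state is a fold over the log, and a CQRS read model is a projection kept up to date from the same events.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Deposited:
    account: str
    cents: int


@dataclass(frozen=True)
class Withdrawn:
    account: str
    cents: int


def apply(balance, event):
    """Pure transition function: (state, event) -> new state."""
    if isinstance(event, Deposited):
        return balance + event.cents
    if isinstance(event, Withdrawn):
        return balance - event.cents
    return balance


# Event sourcing: the append-only log is the source of truth.
log = [
    Deposited("a", 500),
    Withdrawn("a", 200),
    Deposited("b", 100),
    Deposited("a", 50),
]


def balance_of(account):
    # State is a fold over that account's events, starting from zero.
    events = (e for e in log if e.account == account)
    return reduce(apply, events, 0)


# CQRS: a separate read model, updated as events arrive, answers queries
# without re-folding the log each time.
balances = {}


def project(event):
    balances[event.account] = apply(balances.get(event.account, 0), event)


for event in log:
    project(event)
```

Because the read model is derived, it can be dropped and rebuilt by replaying the log, which is the audit-and-replay benefit; the cost is that any query not covered by a projection means scanning events.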

Interviewers grade the justification, not the brand — lead with the requirement, then the pick. “I need fan-out to many independent consumers with replay, so Kafka” scores; “Kafka because it’s popular” doesn’t. The menu by role:

| Role | Default pick | Alternatives | Reach for it when |
|---|---|---|---|
| Event log / streaming backbone | Kafka | Kinesis (AWS-native), Pulsar (tiered storage, multi-tenant), Redpanda (Kafka-API, lower latency) | replay, multiple independent consumers, high throughput, event sourcing |
| Task / work queue | SQS (managed) / RabbitMQ | | point-to-point, competing consumers, “do this job once,” decoupled offload |
| Async background jobs | SQS + workers | Celery + Redis/RabbitMQ (Python), Sidekiq (Ruby) | email, image processing, anything off the request path |
| Workflow orchestration | Temporal | AWS Step Functions | durable multi-step flows with retries, compensation, sagas |
| Pub/sub fan-out | SNS (AWS) | Kafka topics, Redis Pub/Sub (ephemeral), Google Pub/Sub | notify many subscribers of one event |

The 90% answer for a generic design: Kafka for the event backbone, SQS/RabbitMQ for task queues, and call out the outbox pattern for reliable publishing. Then bend it to the problem: replay/multiple consumers → Kafka; simple decoupled jobs → SQS; long stateful flows → Temporal/Step Functions.

Patterns worth name-dropping (they signal you’ve shipped this):

  • Outbox — write the event to a DB table in the same transaction as the state change, relay it out-of-band; closes the dual-write gap between DB commit and publish.
  • CDC / Debezium — capture DB changes as a stream instead of dual-writing at all.
  • Dead-letter queue — where poison messages go after N failed retries, so one bad message doesn’t wedge the consumer.
  • Event sourcing + CQRS — when the domain genuinely wants an audit log and split read/write models; don’t reach for it by default.
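
The outbox pattern is the one most often asked to be drawn out. A minimal sketch using the standard-library sqlite3 module with an in-memory database; the table layout, place_order, and relay_once are hypothetical names, and the publish callback stands in for a real broker client:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL);
    CREATE TABLE outbox (
        seq       INTEGER PRIMARY KEY AUTOINCREMENT,
        topic     TEXT NOT NULL,
        payload   TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(order_id, amount_cents):
    # One transaction: the state change and the event commit together or not at all.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, amount_cents))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "amount_cents": amount_cents})),
        )


def relay_once(publish):
    """Out-of-band relay: publish pending rows in order, then mark them sent."""
    rows = conn.execute(
        "SELECT seq, topic, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, topic, payload in rows:
        publish(topic, json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


sent = []
place_order("o-1", 1999)
first_pass = relay_once(lambda topic, payload: sent.append((topic, payload)))
second_pass = relay_once(lambda topic, payload: sent.append((topic, payload)))
```

Note the delivery guarantee this gives: if the relay crashes after publishing but before marking the row, the event is sent again on the next pass. The outbox therefore yields at-least-once delivery, and consumers still need to handle duplicates.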

Message-/event-driven isn’t free — know what you’re buying:

  • Wins: loose coupling, independent scaling and deployment, resilience (absorb bursts, survive a down consumer), and easy extension (new consumers attach without touching producers).
  • Costs: eventual consistency (no synchronous “it’s done”), much harder debugging and observability (a request’s path is spread across services and time — lean on tracing), ordering and duplicate handling (idempotency), and — in pure choreography — no single owner of the flow.
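
Duplicate handling and poison messages are the two costs that most often end up in consumer code. A minimal sketch, assuming each message carries a unique id and a hypothetical retry budget of three attempts; the in-memory set and list stand in for a durable dedupe store and a real dead-letter queue:

```python
MAX_ATTEMPTS = 3  # hypothetical retry budget

processed_ids = set()  # in production this would be a durable store
dead_letters = []      # stand-in for a real dead-letter queue
charged = []           # side effects performed by the handler


def handle(message):
    amount = message["payload"].get("amount_cents")
    if amount is None:
        raise ValueError("malformed payload")
    charged.append(amount)


def consume(message):
    # Idempotency: a redelivered message is acknowledged without side effects.
    if message["id"] in processed_ids:
        return "duplicate"
    last_error = None
    for _attempt in range(MAX_ATTEMPTS):
        try:
            handle(message)
        except Exception as exc:
            last_error = exc
        else:
            processed_ids.add(message["id"])
            return "ok"
    # Retries exhausted: park the message so it does not block the ones behind it.
    dead_letters.append({"message": message, "error": repr(last_error)})
    return "dead-lettered"


good = {"id": "m-1", "payload": {"amount_cents": 1999}}
bad = {"id": "m-2", "payload": {}}
results = [consume(good), consume(good), consume(bad)]
```

In this sketch the side effect and the dedupe record are not written atomically, so a crash between them could still double-apply; a real consumer would typically record the message id in the same transaction as the effect.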

Reach for it when components must scale and fail independently and reactions are many/unknown. Keep a plain synchronous call when you need an immediate answer and the coupling is fine.


These are working notes on the architecture that composes the messaging transports. The throughline: async messages decouple services; events let many react, commands direct one, and the choreography-vs-orchestration choice decides who owns the flow. The transports are covered on the Kafka, Queues, and Temporal pages; vocabulary is on the Terminology page.