Temporal

Temporal is a durable execution engine: you write a multi-step process as ordinary code, and it survives crashes, restarts, and waits of arbitrary length. The engine persists every step so a workflow that’s halfway through — having charged a card, waiting three days for a callback — resumes exactly where it left off after a deploy or a node death. It’s the answer to “I’m juggling retries, timeouts, correlation IDs, and brittle glue code across services” — that’s not messaging, it’s workflow orchestration. As the Kafka page puts it: Kafka moves data; Temporal moves work.

| Piece | What it is |
| --- | --- |
| Workflow | The orchestrator: your code, but deterministic. It issues steps, waits, and decides the flow. Its state is durable. |
| Activity | A single unit of side-effecting work (charge a card, call an API, write a file). The “command”; this is where non-determinism and failure live. |
| Worker | Your process that hosts and runs workflow + activity code. Temporal itself runs no business logic. |
| Task queue | How the Temporal service dispatches work to your workers (Temporal’s own queue, not Kafka). |
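To make the worker and task-queue split concrete, here is a minimal sketch in plain Python. It is a toy model, not the Temporal SDK: the `Worker` class, `register`, `poll_once`, and the `payments` queue are hypothetical names invented for illustration. The only point it tries to show is that the queue carries a task name plus arguments, while the business logic lives in the worker process.

```python
import queue
from typing import Any, Callable


class Worker:
    """Hosts activity code and polls one task queue (toy model, not the Temporal SDK)."""

    def __init__(self, task_queue: queue.Queue) -> None:
        self.task_queue = task_queue
        self.activities: dict[str, Callable[..., Any]] = {}

    def register(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        # The worker, not the service, owns the mapping from name to code.
        self.activities[fn.__name__] = fn
        return fn

    def poll_once(self) -> tuple[str, Any]:
        # Take one task off the queue and run the matching activity.
        name, args = self.task_queue.get_nowait()
        return name, self.activities[name](*args)


payments: queue.Queue = queue.Queue()  # hypothetical task queue
worker = Worker(payments)


@worker.register
def charge_card(order_id: str, amount_cents: int) -> str:
    return f"charged {amount_cents} for {order_id}"


# The "service" side only stores a task name and its arguments.
payments.put(("charge_card", ("o-1", 1999)))

completed = worker.poll_once()
print(completed)  # ('charge_card', 'charged 1999 for o-1')
```

In the real system the queue is durable and many workers can poll it; this sketch keeps everything in one process to stay runnable.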

The core trick is event history + deterministic replay. Every workflow decision and activity result is appended to a durable event history. If a worker dies, another picks up the workflow and replays the history to reconstruct in-memory state exactly, then continues. That’s why workflow code must be deterministic (no now(), no random, no direct I/O) and all side effects go in activities — replay must produce the same decisions every time.
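The replay idea can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not how Temporal is implemented: the `Replayer` class, `run_activity`, and the in-memory `history` list are hypothetical, and the list stands in for the durable event history. The sketch shows a workflow that "crashes" after its first activity, then a fresh run that replays the recorded result instead of charging the card again.

```python
from typing import Any, Callable


class Replayer:
    """Toy stand-in for a durable-execution engine (not the Temporal SDK)."""

    def __init__(self, history: list[Any]) -> None:
        self.history = history          # recorded activity results, assumed durable
        self.cursor = 0                 # next history entry this run will consume
        self.executed: list[str] = []   # activities actually run during this attempt

    def run_activity(self, name: str, fn: Callable[[], Any]) -> Any:
        if self.cursor < len(self.history):
            result = self.history[self.cursor]  # replay: reuse the recorded result
        else:
            result = fn()                       # first time: perform the side effect
            self.history.append(result)         # persist before moving on
            self.executed.append(name)
        self.cursor += 1
        return result


def order_workflow(ctx: Replayer, crash_after_charge: bool = False) -> str:
    # Deterministic: the same history always leads to the same sequence of calls.
    charge_id = ctx.run_activity("charge_card", lambda: "charge-001")
    if crash_after_charge:
        raise RuntimeError("worker died")
    return ctx.run_activity("archive_docs", lambda: f"archived:{charge_id}")


history: list[Any] = []  # survives the simulated crash below

first = Replayer(history)
try:
    order_workflow(first, crash_after_charge=True)
except RuntimeError:
    pass  # the worker is gone; only the history remains

second = Replayer(history)  # a fresh worker picks the workflow up
result = order_workflow(second)
print(result)            # archived:charge-001
print(second.executed)   # ['archive_docs']
```

If `order_workflow` branched on the current time or a random number, the second run could ask for a different activity than the one the history recorded, which is why that kind of code belongs in activities.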

Temporal durable execution — a workflow orchestrates activities on workers, with every step persisted to event history

Mermaid source
flowchart LR
  classDef client fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
  classDef wf fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
  classDef act fill:#e7f5ec,stroke:#3f9c5a,stroke-width:1.5px,color:#0f172a;
  classDef store fill:#fef6e7,stroke:#d9a441,stroke-width:1.5px,color:#0f172a;
  Client(["Start workflow"]):::client
  WF["Workflow — orchestrator<br/>your code · deterministic"]:::wf
  subgraph WK["Workers"]
    A1["Activity<br/>charge card"]:::act
    A2["Activity<br/>archive docs"]:::act
  end
  EH[("Event history<br/>every step persisted")]:::store
  Client --> WF
  WF -->|"call + await result"| A1
  A1 -.->|"durable result · auto-retry"| WF
  WF -->|"call + await result"| A2
  A2 -.->|result| WF
  WF <-->|"append · replay on crash"| EH

What you’d otherwise hand-roll, Temporal owns:

  • Automatic retries: activities retry on failure per a configurable policy (backoff, max attempts), durably, across restarts.
  • Timeouts: per activity and per workflow; a stuck step fails cleanly instead of hanging forever.
  • Durable timers: sleep(3 days) actually works; the timer survives crashes and ties up no worker while waiting.
  • Saga compensation: on failure partway through, run compensating steps to undo prior work. This is the saga pattern, expressed as plain try/catch instead of a hand-built state machine.
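Two of the items above, retries with backoff and saga compensation, can be sketched in plain Python. This is an in-process illustration only: `with_retries`, `order_saga`, and the activity functions are hypothetical names, and unlike Temporal nothing here is durable across a process restart. It shows the shape of the logic: retry each step per a policy, record an undo action after each success, and run the undo actions in reverse inside an ordinary `try`/`except` when a later step fails for good.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(fn: Callable[[], T], max_attempts: int = 3,
                 initial_delay: float = 0.01, backoff: float = 2.0) -> T:
    """Call fn until it succeeds or max_attempts is used up, backing off between tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise              # policy exhausted: surface the failure
            time.sleep(delay)
            delay *= backoff       # exponential backoff


log: list[str] = []
attempts = {"charge": 0}


def charge_card() -> str:
    attempts["charge"] += 1
    if attempts["charge"] < 3:     # fail twice, then succeed
        raise ConnectionError("payment gateway timeout")
    log.append("charged")
    return "charge-001"


def refund_card(charge_id: str) -> None:
    log.append(f"refunded {charge_id}")


def reserve_stock() -> str:
    log.append("reserved")
    return "res-42"


def release_stock(reservation_id: str) -> None:
    log.append(f"released {reservation_id}")


def ship_order() -> None:
    raise RuntimeError("carrier rejected the shipment")  # always fails


def order_saga() -> str:
    compensations: list[Callable[[], None]] = []
    try:
        charge_id = with_retries(charge_card)
        compensations.append(lambda: refund_card(charge_id))
        reservation_id = with_retries(reserve_stock)
        compensations.append(lambda: release_stock(reservation_id))
        with_retries(ship_order, max_attempts=2)
        return "completed"
    except Exception:
        for undo in reversed(compensations):  # undo in reverse order
            undo()
        return "compensated"


outcome = order_saga()
print(outcome)  # compensated
print(log)      # ['charged', 'reserved', 'released res-42', 'refunded charge-001']
```

Registering each compensation only after its step succeeds means a failure never tries to undo work that did not happen.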

Orchestration vs choreography — who owns the flow

This is the real reason to reach for Temporal. In choreography (services reacting to each other’s events), responsibility is diffuse: the end-to-end flow is emergent, nobody owns it, and you reconstruct “what happened” from event soup. In orchestration, the workflow is explicitly responsible: it issues each command and gets the result back.

That answers the usual confusion about commands: a command isn’t fire-and-forget. The workflow awaits the activity and receives a durable result or a failure — the outcome is first-class, not something you hope arrives later as a separate “result event.” One thing owns the process, sees its whole state, and decides what’s next.
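The contrast between the two styles can be sketched side by side in plain Python. Everything here is hypothetical and in-memory (the `publish` bus, the event names, the service functions); it is not Temporal or Kafka code. The choreography half returns nothing to the caller, so the outcome has to be pieced together from the event trace. The orchestration half is one function that calls each step and holds each result.

```python
from collections import defaultdict
from typing import Callable

# --- Choreography: services react to each other's events; no one owns the flow.
subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
trace: list[str] = []


def publish(event: str, payload: dict) -> None:
    trace.append(event)
    for handler in subscribers[event]:
        handler(payload)


def payment_service(payload: dict) -> None:
    publish("PaymentCharged", {**payload, "charge_id": "charge-001"})


def shipping_service(payload: dict) -> None:
    publish("OrderShipped", {**payload, "tracking": "trk-9"})


subscribers["OrderPlaced"].append(payment_service)
subscribers["PaymentCharged"].append(shipping_service)

# The caller gets nothing back; the flow exists only as a chain of reactions.
publish("OrderPlaced", {"order_id": "o-1"})


# --- Orchestration: one function owns the flow and receives every result.
def charge(order_id: str) -> str:
    return "charge-001"


def ship(order_id: str, charge_id: str) -> str:
    return "trk-9"


def order_workflow(order_id: str) -> dict:
    charge_id = charge(order_id)        # command issued, result returned
    tracking = ship(order_id, charge_id)
    return {"order_id": order_id, "charge_id": charge_id, "tracking": tracking}


result = order_workflow("o-1")
print(trace)   # ['OrderPlaced', 'PaymentCharged', 'OrderShipped']
print(result)  # {'order_id': 'o-1', 'charge_id': 'charge-001', 'tracking': 'trk-9'}
```

A failure in `ship` would surface as an exception inside `order_workflow`, at the point where the decision about what to do next is made.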

A single workflow orchestrating its own activities is one team’s process. Temporal Nexus extends durable execution across service, team, and namespace boundaries: one application calls an operation that another exposes, as a typed, durable request, and gets a result back with retries and timeouts handled. The alternative is stitching services together with raw events and queues and rebuilding the flow from “result events.” So orchestration’s model, in which the caller owns the flow and gets a durable result, holds between services, not just within one. Kafka moves data; Temporal moves work, and Nexus is how Temporal moves work between services.

Use it for: multi-step, long-running, must-not-drop-work processes — payments and order fulfillment, provisioning, onboarding, data/ML pipelines, anything with waits, retries, and compensation where you need to see and own the flow.

Avoid it for: simple request/response, single fire-and-forget tasks (a queue is lighter), pure event fan-out (that’s Kafka), or ultra-low-latency hot paths. It’s an orchestration layer, not a message bus or a job runner.

Temporal is the best-known durable execution engine, not the only one; the alternatives differ mostly in deployment shape:

| Engine | Shape | Notes |
| --- | --- | --- |
| Temporal | self-hosted cluster or Temporal Cloud | the most mature; the model described above |
| Cadence | self-hosted cluster | Uber’s open-source engine; Temporal is a fork of it, so the model is nearly identical; still developed at Uber |
| DBOS | library on Postgres | durable steps persisted in your own Postgres, running in-process; no separate cluster to operate; lighter when you already run Postgres |
| AWS Step Functions · Azure Durable Functions · Restate · Inngest | managed / serverless | the same durable-execution idea as a hosted service; less to run, less control |

The axis to weigh: a cluster to operate (Temporal/Cadence) vs. a library on a DB you already run (DBOS) vs. a managed service — power and portability against operational weight.


These are working notes on durable execution. The one idea to keep: persist every step so the process survives anything, and make one thing own the flow, which is what separates orchestration from a pile of events and glue. Vocabulary is in the Study List; the data-vs-work contrast is on the Kafka page.