Classic distributed transactions — blocking semantics, why 2PC stalls participants, what 3PC was supposed to fix, and when 2PC is still the right call

Two-Phase Commit (2PC) & Three-Phase Commit (3PC)

2PC is the textbook algorithm for committing a transaction across multiple participants atomically. It does what people expect of a "distributed transaction": either all participants commit, or all abort, with no partial state. It has been around since 1978 (Gray, Lampson) and is still inside every XA transaction manager, every Postgres PREPARE TRANSACTION, every Kafka transaction coordinator.

It also has one well-known failure mode — blocking — that makes it operationally unattractive at the scales most modern services run at. This page is about how 2PC works, what 3PC adds, and where in the design space these protocols belong relative to Saga and TCC.

What 2PC Solves

The problem: a transaction spans multiple participants — databases, services, message queues. You need an outcome where either all of them commit or all abort. A single-node BEGIN ... COMMIT cannot reach across the network; some coordination protocol is required.

The standard cast:

Coordinator — the process that runs the protocol. Often the originating client or a transaction manager (XA, Postgres, Kafka).
Participants — the data stores or services that hold the affected state. Each can independently commit or abort its local part.

The Two Phases

Phase 1: Prepare (Voting)

The coordinator sends PREPARE to every participant.

Each participant:

Performs the work (writes locks, validates constraints).
Persists enough state to commit or abort on demand. This is the prepared state — locks are held, but the transaction is not yet committed.
Replies YES (ready to commit) or NO (must abort).

If a participant crashes between receiving PREPARE and replying, it must, on recovery, be able to find the prepared transaction and ask the coordinator for the outcome.

Phase 2: Commit / Abort

If all participants voted YES, the coordinator writes a commit decision to its own durable log, then sends COMMIT to all participants.

If any voted NO, or any failed to respond, the coordinator writes an abort decision and sends ABORT to all participants.

Each participant, on receiving the decision, applies it locally and acknowledges.

Coordinator                          Participants
     │                                    │
     │── PREPARE ──────────────────────▶  │
     │                                    │ writes prepared state,
     │                                    │ holds locks
     │  ◀── YES / NO ──────────────────── │
     │                                    │
     │  (decision: all YES → COMMIT)      │
     │  write COMMIT to local log         │
     │                                    │
     │── COMMIT ────────────────────────▶ │ apply, release locks
     │                                    │
     │  ◀── ACK ─────────────────────────  │

The coordinator's local decision log is the point of no return. Once written, even if the coordinator crashes mid-broadcast, it (or its replacement) will replay the same decision.

Why 2PC Blocks

The failure mode everyone learns about: the coordinator crashes after participants have voted but before they have received the decision.

Coordinator                          Participants
     │                                    │
     │── PREPARE ──────────────────────▶  │ prepared, locks held
     │  ◀── YES ────────────────────────  │
     │                                    │
     │  [coordinator crashes here]        │
     │                                    │
     │                                    │ waiting...
     │                                    │ waiting...
     │                                    │ locks still held...

The participants cannot decide on their own. They voted YES and must honor that vote until they hear from the coordinator. They cannot abort (the decision might have been commit). They cannot commit (the decision might have been abort). They block — holding locks, accumulating waiters, refusing other work — until the coordinator recovers.

This is the central operational problem with 2PC. In the steady state it works fine. Under coordinator failure, a participant can be stuck for the time it takes to recover the coordinator and replay its decision log. With cross-region setups or hot-standby coordinators, that is minutes; with manual recovery, hours.

3PC: The Attempt to Fix Blocking

Three-Phase Commit (Skeen, 1981) adds an intermediate phase to give participants enough information to recover without waiting for the coordinator.

The three phases:

CanCommit — coordinator asks each participant if it can commit. Participants reply YES/NO but do not prepare yet.
PreCommit — if all said YES, coordinator sends PRE-COMMIT. Participants now prepare (write to log, hold locks) and reply ACK.
DoCommit — coordinator sends COMMIT once all PRE-COMMIT acks arrive.

The crucial property: in 3PC, all participants in PRE-COMMIT state know that all others have also acknowledged PRE-COMMIT. If the coordinator crashes after PRE-COMMIT but before DoCommit, participants can timeout and agree on commit among themselves — they all have evidence that all others were ready.

In theory, this makes 3PC non-blocking.

In practice, 3PC is almost never used. The reasons:

3PC assumes a synchronous network. The timeout-and-decide step only works if you can reliably distinguish "participant crashed" from "network is slow." Real networks are partially synchronous, and 3PC is not safe under network partition — it can lead to inconsistent decisions if a partition isolates participants in different phases.
Extra latency. An extra round trip is paid on every transaction, not just during recovery.
Higher implementation complexity with little benefit at the scale most systems operate.

The modern approach is to keep 2PC for the steady state and pair it with Paxos / Raft replicating the coordinator state, so coordinator failure no longer means an unrecoverable log. Spanner, CockroachDB, and Percolator all combine 2PC across shards with consensus-replicated coordinators.

When 2PC Is Acceptable

Despite its reputation, 2PC is the right answer in narrower contexts than people fear:

Same datacenter, few participants, short transactions. Latency is low, coordinator failure is rare, and locks are held briefly. XA across two databases in the same datacenter is a textbook 2PC use case.
Coordinator backed by consensus. Spanner-style: the coordinator's decision log is replicated via Paxos/Raft, so the "coordinator crash leaves participants stuck" problem becomes "coordinator failover takes a few hundred milliseconds." Recovery is automatic and bounded.
You actually need strict atomicity. Money movement between accounts, inventory + payment together when partial state is unacceptable. Trade the operational risk for the correctness guarantee, with eyes open.

When 2PC is not the right answer:

Long-running workflows. Hours-to-days operations should use Saga, not 2PC. Holding locks for hours is operationally impossible at any scale.
Cross-organizational transactions. You will not get participants from another company to participate in your 2PC protocol. Sagas with compensation, or TCC with explicit reservation, are the right tools.
High-throughput / low-latency systems. Every cross-shard 2PC costs at least 2 RTTs plus durable writes on every participant. At very high throughput this becomes a dominant cost.
Operations where eventual consistency is acceptable. Most product analytics, recommendation pipelines, and non-financial systems do not need atomic commits. Pay the right cost for the right requirement.

2PC vs Saga vs TCC

The three families of "distributed transactions" are not interchangeable; each occupies a different cell in the design matrix.

Property	2PC	Saga	TCC
Isolation	Strong (locks held until commit)	None (intermediate states visible)	Weak (resources reserved, not locked)
Duration	Seconds	Hours to days	Seconds to minutes
Coupling	Tight (synchronous, same protocol)	Loose (events / orchestration)	Medium (explicit try/confirm/cancel)
Compensation	Built-in (abort phase)	Application-defined per step	Application-defined cancel
Failure recovery	Coordinator-driven, can block	Saga-coordinator-driven, never blocks	Coordinator-driven, never blocks if confirm/cancel are idempotent
Use case	ACID across databases	Long-running business workflows	Cross-service operations with reservable resources

The honest framing: 2PC for "must commit atomically right now," Saga for "long process tolerant of intermediate states," TCC for "we need atomicity but cannot afford locks."

Production Lessons

In-doubt transactions are an operational reality. A coordinator failure in 2PC leaves prepared transactions in participants' logs. Have runbooks for finding and resolving them. Postgres has pg_prepared_xacts; Oracle has DBA_2PC_PENDING. Know the equivalent for your stack.
Lock duration is the killer. A 2PC transaction holds locks across the full prepare-decide-commit cycle. Network latency or coordinator slowness directly multiplies lock duration, which multiplies contention.
Coordinator failover must be automatic. A 2PC coordinator that requires manual recovery is the worst case: locks are held until a human notices. Make coordinator recovery automatic, or use a system (CockroachDB, Spanner) where it already is.
Avoid 2PC across regions. The latency makes it operationally impractical. If you must, replicate the coordinator within each region and use a single-region 2PC plus async replication out.

Pre-commit Checklist

For each 2PC transaction in my system, do I know what happens if the coordinator crashes mid-protocol? Is recovery automatic?
Are my participant locks held for milliseconds or seconds, not minutes?
Do I have runbooks (and monitoring) for in-doubt prepared transactions?
Have I considered whether Saga or TCC would meet my actual requirement at lower operational cost?
For cross-region operations, am I avoiding cross-region 2PC?
Is the coordinator itself fault-tolerant (consensus-backed, not a single process)?

Two-Phase Commit (2PC) & Three-Phase Commit (3PC)

Two-Phase Commit (2PC) & Three-Phase Commit (3PC)

What 2PC Solves

The Two Phases

Phase 1: Prepare (Voting)

Phase 2: Commit / Abort

Why 2PC Blocks

3PC: The Attempt to Fix Blocking

When 2PC Is Acceptable

2PC vs Saga vs TCC

Production Lessons

Further Reading

Pre-commit Checklist

On this page