The theory and mechanisms that make multi-node systems correct — CAP, consensus, consistency models, ordering, idempotency, distributed transactions

Distributed Systems

A distributed system is any system whose correctness depends on more than one process talking to another over an unreliable channel. Once that line is crossed — the moment you have a replica, a worker, a second region, a third party — the rules change. Local intuitions about "the call either succeeded or failed" and "now happened before then" stop holding.

This section is about the ideas and mechanisms that make multi-node systems behave correctly anyway. It is theory-leaning by design: specific technologies (Kafka, etcd, Spanner) appear as illustrations, but the focus is on properties and protocols that outlive any one vendor.

Topics

Theory

CAP & PACELC — Why you cannot have everything, and the more honest extension.
Consistency Models — Linearizable, sequential, causal, eventual — and what each one actually buys you.
FLP Impossibility — Why deterministic consensus under pure asynchrony is impossible, and why we still ship consensus systems.
Failure Models — Crash-stop, crash-recovery, omission, timing, Byzantine.

Consensus

Raft — The understandable consensus algorithm.
Paxos — Classic Paxos, Multi-Paxos, and what the variants actually solve.
Leader Election — Patterns, fencing tokens, split-brain.
Quorum Systems — Read/write quorums, sloppy quorums, hinted handoff.

Coordination

Logical Clocks — Lamport timestamps, vector clocks, version vectors, HLCs.
Distributed Locks — Why "just use Redlock" is rarely the answer.
Membership & Failure Detection — Heartbeats, SWIM, phi accrual.

Correctness

Idempotency — Designing operations that are safe to retry.
Exactly-Once Semantics — Why it does not exist, and what "effectively-once" actually delivers.
Saga Protocol — Compensation-based long-running transactions (the correctness view — the pattern view lives in Software Architecture).

Transactions

Two-Phase Commit (2PC) & Three-Phase Commit (3PC) — Blocking, recovery, and when 2PC is acceptable.
TCC (Try-Confirm-Cancel) — Application-level distributed transactions.
Distributed Snapshots — Chandy-Lamport and friends.

Scope and Boundaries

Concern	Lives here	Lives elsewhere
CAP, consistency models, consensus, ordering	✓	—
Saga as a protocol (compensation correctness)	✓	—
Saga as a pattern (long-running workflows)	—	`software-architecture/event-driven`
Specific replication or sharding implementations	partly	`database/advanced/`
Kafka exactly-once configuration	—	`infrastructure/message-queues/`
Network protocols (TCP, TLS, HTTP/2)	—	(planned: `architecture/networking/`)

How to Read This Section

Distributed systems is one of the few areas in software where reading the original papers still pays off — they are short and the explanations elsewhere often paper over the assumptions. Each page here links the seminal paper alongside the practitioner's view.

If You Have Time for One Page

Read Consistency Models. Every other discussion in this section assumes its vocabulary. If you can name the difference between linearizable, causal, and read-your-writes, you have already cleared the bar for 80% of distributed-systems conversations.

Reading Tracks by Goal

Different starting points reach the same material from different angles. Pick the track whose premise matches yours.

Track A — First-time orientation. You have read about distributed systems but never had to design one. Build vocabulary first.

CAP & PACELC — frame the trade-off space.
Consistency Models — the language for "how correct is correct enough."
Failure Models — what "the node failed" actually means.
FLP Impossibility — the result that bounds everything else.
Idempotency — the most reusable tool in your future.

Track B — Picking a database or message broker. You are evaluating Cassandra vs Postgres, Kafka vs RabbitMQ, MongoDB vs DynamoDB. Optimize for spotting marketing claims.

CAP & PACELC — decode the CP/AP labels honestly.
Consistency Models — when "strongly consistent" is meaningful and when it is not.
Quorum Systems — what R + W > N does and does not give you.
Exactly-Once Semantics — what the broker actually delivers.
Idempotency — what your consumer code must guarantee anyway.

Track C — Building a service that talks to other services. You are designing a microservice that calls payments, sends emails, writes to two databases. Optimize for not corrupting data.

Idempotency — make every external call safe to retry.
Exactly-Once Semantics — what your messaging layer cannot give you, and what to do instead.
Saga Protocol — multi-service workflows with compensation.
TCC and 2PC — the other two transaction patterns and when each fits.
Distributed Locks — the alternatives you should consider first.

Track D — Operating a cluster. You are on-call for a Kafka, etcd, Cassandra, or similar cluster and want to understand the failure modes before they happen to you.

Failure Models — the taxonomy you need before reading any post-mortem.
Membership & Failure Detection — what "the cluster lost a node" really means.
Raft or Paxos — pick whichever your cluster runs.
Leader Election — the failure mode that causes more incidents than any other.
Quorum Systems — why your reads went stale.

Track E — Going deep on consensus. You are implementing a consensus-backed system or auditing one.

FLP Impossibility — the foundation. Read this before the algorithms.
Consistency Models — what consensus is paying for.
Quorum Systems — the building block.
Raft — pick this for understandability.
Paxos — read it to know the literature.
Leader Election — the operational side.

The Dependency Graph

If you prefer to read by prerequisites rather than by goal:

                  Failure Models
                        │
                        ▼
   CAP ──────▶ Consistency Models ──────▶ Logical Clocks
    │                  │                          │
    │                  ▼                          ▼
    └────────▶  FLP ──────▶  Quorum  ──────▶  Raft  ───▶  Paxos
                                       │                    │
                                       └──▶  Leader Election
                                                  │
                                                  ▼
                                       Membership & Failure Detection
                                                  │
                                                  ▼
                                       Distributed Locks
                                                  
   Idempotency ──▶ Exactly-Once ──▶ 2PC ──▶ TCC ──▶ Saga Protocol
                          │
                          ▼
                  Distributed Snapshots

Reading top-to-bottom on either column gives a self-contained path. The two columns can be read independently — the Correctness column (right side of life) is application-engineer-flavored; the Theory + Consensus column is database-engineer-flavored.

Practitioner Reality Check

A few pages punch above their weight in day-to-day work; if you read nothing else, read these:

Idempotency — used in every microservice you will ever write.
Consistency Models — used in every "is this safe?" conversation.
CAP & PACELC — used to decode every database's marketing page.
Failure Models — used in every post-mortem.

The rest are tools you pull off the shelf when a specific problem calls for them.

Distributed Systems

On this page