Duplication
DRY, orthogonality, and the kinds of duplication that matter
Duplication
Duplication is the most reliable predictor of future bugs. Every duplicated piece of knowledge has to be updated in lock-step the next time the underlying decision changes; the day someone forgets, the divergence becomes a defect.
The core principle, from The Pragmatic Programmer:
DRY: Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
The principle is stated in terms of knowledge, not code. Two functions that happen to have similar shapes but represent unrelated decisions are not a DRY violation; one decision expressed in two places is.
Kinds of Duplication
Imposed duplication
The environment seems to require it: data definitions in code and database, types in client and server, documentation that has to mirror behavior. Resist by generating one from the other. A schema that produces both the database migration and the typed client code keeps them honest; manual mirroring guarantees drift.
Inadvertent duplication
Two developers solve the same problem in two places without knowing the other exists. The fix is awareness — code search, shared catalogs of utilities, code review that asks "is this written somewhere already?"
Impatient duplication
A developer copies an existing block "for now" and intends to consolidate later. Later rarely comes. The pragmatic compromise is to mark the duplication explicitly (a TODO: consolidate with X) so that the next reader can act on it; without the marker, the duplication becomes invisible.
Inter-developer duplication
Across a large team, the same utility is written ten times under ten names. The remedy is mostly social — a shared library culture, visible internal docs, and review that names the duplication out loud.
Common Patterns to Eliminate
Duplicated logic
Two functions calculating the same thing — even if their input shapes differ. Extract the calculation; let both call sites convert to the shared form.
Duplicated conditional structure
The same if/else over the same predicate appears in three places. Extract a polymorphism, a strategy object, or a table that captures the decision once.
Duplicated data shape
The same fields with the same constraints, declared in three classes. Extract a value type that owns the constraints; let each holder reference it.
Duplicated control flow
A retry loop, a transactional wrapper, a logging-on-entry-and-exit pattern repeated everywhere. Extract a higher-order function or decorator that encapsulates the pattern.
Duplicated knowledge across artifacts
The constants MAX_USERS = 1000 defined in code, in a config file, in a runbook, and in a wiki. Pick one source of truth and have the others read from it (or be generated from it). The version that is read at runtime wins by default; everything else should defer to it.
When Duplication Is Not the Problem
A premature de-duplication is often more expensive than the duplication itself. Two pieces of code that look alike but represent different decisions will eventually diverge — and the abstraction tying them together will resist that divergence, forcing one of them into a bad shape.
The Rule of Three applies here:
- The first duplication is fine; the requirement is not yet clear.
- The second is a flag; observe whether the two pieces are truly the same decision.
- The third is a signal to act, and by now the right shape of the abstraction is usually visible.
Sandi Metz's reformulation captures the trade-off:
Duplication is far cheaper than the wrong abstraction.
When in doubt, leave the duplication and revisit when the shape of the underlying knowledge becomes clearer.
Orthogonality
DRY says "don't say it twice." Orthogonality says "make sure changing one thing does not change another." The two principles reinforce each other.
A system is orthogonal when its modules are decoupled along the axes that matter:
- Changing the storage backend does not require changing the rendering layer.
- Switching the payment provider does not require changing the checkout flow.
- Replacing the logger does not require touching domain logic.
Orthogonality is what lets a team work in parallel and what lets a refactoring stay local. The opposite — a tightly coupled system where every change ripples — is the technical debt that is hardest to repay because every payment has to be made everywhere at once.
Symptoms of low orthogonality
- A change that "should be one line" turns into edits across many files.
- Adding a new variant requires editing several unrelated modules.
- A new developer cannot learn one part of the system without learning all of it.
- Tests at one layer require setup from every other layer.
Tactics for raising orthogonality
- Layer dependencies. Define a directed dependency graph between modules; do not allow cycles.
- Hide implementation behind interfaces. A module's clients depend on its contract, not on its file structure or storage layout.
- Decouple events from handlers. A module that emits an event does not know who reacts to it.
- Avoid global state. Globals couple every module that touches them.
- Inject dependencies. A class that constructs its collaborators is wedded to them; a class that accepts them is reusable.
Reversibility
A close cousin of orthogonality. The Pragmatic Programmer observes that few decisions in software are truly irreversible — but the cost of reversal varies enormously. Designs that locally encapsulate decisions ("which database," "which payment provider," "which auth scheme") preserve the option to change them.
The cost of an interface today is usually small; the cost of not having one when the underlying choice changes is usually large.
The opposite anti-pattern is coupling to a vendor: code that uses provider-specific features pervasively, so the provider can never be replaced without rewriting half the system. Even when no replacement is planned, the lock-in itself is a strategic risk.
Levels of Application
DRY and orthogonality apply at every level:
- Within a function. Repeated expressions become local variables or small helpers.
- Within a class. Repeated logic across methods becomes a private helper or a state-machine transition.
- Across classes. Repeated patterns become shared abstractions, decorators, or higher-order functions.
- Across services. Shared schemas, generated clients, and contract tests prevent the same decision from being made twice in two services that disagree.
- Across documentation, code, and configuration. One source of truth, with the others derived.
The principle is the same; only the mechanism changes.
Pre-Commit Checklist
- Does each piece of knowledge in the system have one authoritative source?
- When two pieces of code look the same, are they expressing the same decision or different decisions that happen to look similar?
- Could a change to one decision force changes in many files? If so, where would consolidation help?
- Are constants, schemas, and configuration values defined once and referenced elsewhere?
- Have you avoided abstracting too early — leaving acceptable duplication until the shared shape is clear?