API Design
Designing public interfaces that are minimal, stable, and easy to use correctly
An API — whether the public surface of a library, a service interface, or just the public methods of an internal class — is a commitment. Every consumer becomes a constraint. Every behavior, intentional or accidental, becomes a contract you may not be able to change.
The discipline of API design is to make those commitments deliberately: to expose what callers need, hide what they do not, and evolve the surface in a way that does not break what depends on it.
The Two Hardest Problems
Hard to use incorrectly
The best APIs make the right thing easy and the wrong thing difficult or impossible. A function that compiles only when used correctly is safer than one that runs and silently misbehaves.
// Easy to misuse — wrong order is a runtime mystery
copy(src, dst)
copy(dst, src) // accidentally reversed; compiles, runs, corrupts
// Hard to misuse: named parameters make the direction explicit
copy({ from: src, to: dst })
// Or: distinct types
copy(src: Source, dst: Destination)
Joshua Bloch's framing, in paraphrase: APIs should be easy to use and hard to misuse. Scott Meyers states the same guideline as "make interfaces easy to use correctly and hard to use incorrectly." The asymmetry matters: making the API merely easy is not enough if the easy form is also the buggy one.
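The "distinct types" variant can be sketched in TypeScript with branded string types. This is a minimal illustration, not a prescribed pattern: SourcePath, DestinationPath, the two helper functions, and the in-memory files table are all hypothetical names introduced for the example.

```typescript
// Hypothetical brands: two string types the compiler will not mix up.
type SourcePath = string & { readonly __brand: "SourcePath" };
type DestinationPath = string & { readonly __brand: "DestinationPath" };

const asSource = (path: string): SourcePath => path as SourcePath;
const asDestination = (path: string): DestinationPath => path as DestinationPath;

// In-memory stand-in for a file system so the sketch is self-contained.
const files: Record<string, string> = { "a.txt": "hello" };

function copy(src: SourcePath, dst: DestinationPath): void {
  const content = files[src];
  if (content === undefined) {
    throw new Error(`No such file: ${src}`);
  }
  files[dst] = content;
}

const srcPath = asSource("a.txt");
const dstPath = asDestination("b.txt");

copy(srcPath, dstPath);    // compiles
// copy(dstPath, srcPath); // rejected at compile time: the brands do not match
```

The brand exists only in the type system; at runtime both values are ordinary strings, so the protection costs nothing once the code compiles.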
Easy to evolve
Once an API has consumers, the cost of changing it grows with the number of consumers and with the time since it was released. APIs designed without evolution in mind become permanent: the original mistakes calcify, and the team works around them forever.
The two skills compound: an API that is hard to misuse needs less evolution, because callers do not depend on the misbehaviors that would otherwise need preserving.
Minimality
Every method on the public surface is a method that:
- Must be documented.
- Must be tested.
- May be called by anyone.
- Must continue to behave the same way until the next breaking release.
The cost of each surface element is real. Two consequences:
Start small
Expose the smallest surface that solves the problem. New methods are easy to add later, when a real need has appeared; methods that have been exposed are hard to remove.
The rule of thumb: when in doubt, leave it out. Getting "should I expose this?" wrong in the conservative direction costs one bug report ("I needed X"); getting it wrong in the liberal direction costs years of supporting X.
Avoid options that change behavior fundamentally
A single method with a flag that switches between two behaviors is two methods in disguise. Two named methods are clearer at the call site, and easier to evolve independently.
// Option overload
parse(input, { strict: true })
parse(input, { strict: false })
// Two methods
parseStrict(input)
parseLenient(input)
The exception is when the option genuinely tunes the same behavior (a precision parameter, a timeout, a buffer size).
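One way to split a flag into two named methods without duplicating logic is to keep the flag on a private helper. The sketch below assumes a hypothetical parser for comma-separated integers; parseNumbers and its skipInvalid parameter are illustrative names and would not be exported.

```typescript
// Private helper: the flag still exists, but it is not part of the public surface.
function parseNumbers(input: string, skipInvalid: boolean): number[] {
  const result: number[] = [];
  for (const token of input.split(",")) {
    const trimmed = token.trim();
    if (/^-?\d+$/.test(trimmed)) {
      result.push(Number(trimmed));
    } else if (!skipInvalid) {
      throw new Error(`Invalid integer: "${trimmed}"`);
    }
  }
  return result;
}

/** Rejects the whole input if any token is not an integer. */
function parseStrict(input: string): number[] {
  return parseNumbers(input, false);
}

/** Silently skips tokens that are not integers. */
function parseLenient(input: string): number[] {
  return parseNumbers(input, true);
}
```

Because the two entry points are separate, either one can later gain its own parameters or return type without touching the other.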
Naming
API naming carries an outsized weight: the name is what consumers read, search for, and remember. The principles in Naming apply, plus:
- Consistency across the API. If you have getUser, do not also have fetchOrder and loadInvoice. Pick one verb for "read by ID" and use it everywhere.
- Predictability. A consumer who learns one part of the API should be able to guess the rest. Symmetry across operations (open/close, start/stop) is a small effort that pays back constantly.
- Self-documenting call sites. A reader of bookFlight(passenger, leg) should not need to look at the docs to know what is happening. Naming is the cheapest documentation.
Defaults
Every default is a decision the API makes on the consumer's behalf. Two consequences:
Default to the safe choice
When the consumer does not specify, the API should pick the option that is least likely to cause damage:
- Secure by default. Authentication on, encryption on, validation strict.
- Conservative by default. Smaller batches, shorter timeouts, fewer retries.
- Visible by default. Errors propagated, not swallowed; effects logged, not hidden.
Opting in to the more permissive behavior is a small ceremony for the consumer and a large protection against the consumer's mistakes.
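As a rough sketch of safe defaults, consider a hypothetical client configuration. Every option name and value below (verifyTls, timeoutMs, maxRetries, swallowErrors) is an assumption made for illustration; the point is only that the unspecified case resolves to the cautious setting.

```typescript
// Hypothetical options for an imaginary client; all fields are optional.
interface ClientOptions {
  verifyTls?: boolean;     // secure by default
  timeoutMs?: number;      // conservative by default
  maxRetries?: number;     // conservative by default
  swallowErrors?: boolean; // visible by default
}

// The defaults live in one named constant so they can be documented and versioned.
const SAFE_DEFAULTS: Required<ClientOptions> = {
  verifyTls: true,
  timeoutMs: 5000,
  maxRetries: 1,
  swallowErrors: false,
};

function resolveOptions(options: ClientOptions = {}): Required<ClientOptions> {
  return {
    verifyTls: options.verifyTls ?? SAFE_DEFAULTS.verifyTls,
    timeoutMs: options.timeoutMs ?? SAFE_DEFAULTS.timeoutMs,
    maxRetries: options.maxRetries ?? SAFE_DEFAULTS.maxRetries,
    swallowErrors: options.swallowErrors ?? SAFE_DEFAULTS.swallowErrors,
  };
}
```

A caller who wants the permissive behavior has to write it down, for example resolveOptions({ maxRetries: 5 }), which leaves a visible trace at the call site.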
Defaults are commitments
Once a default ships, changing it breaks every consumer who relied on it implicitly. Treat default values as part of the API surface — write them down, version them, and change them only with the same care as a function signature change.
Errors
The error model is part of the API. Treat it as such.
Make failure modes part of the contract
A function that can fail should communicate which failures are possible, in the type system where possible:
// Failure modes invisible
async function transfer(from, to, amount): Promise<void>
// Failure modes part of the type
async function transfer(from, to, amount):
Promise<Result<void, InsufficientFunds | AccountFrozen | NetworkError>>
Consumers can then exhaustively handle each case; new failure modes added later force callers to consider them.
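A minimal sketch of how a typed failure set forces exhaustive handling follows. The Result type, the shape of each error, and the in-memory ledger are assumptions for the example, and the function is synchronous here only to keep the sketch short.

```typescript
// Minimal Result type; a stand-in for whatever the codebase already uses.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Each failure is a tagged object so the compiler can tell them apart.
type TransferError =
  | { kind: "InsufficientFunds"; shortfall: number }
  | { kind: "AccountFrozen"; accountId: string }
  | { kind: "NetworkError"; retryable: boolean };

// In-memory ledger, for illustration only.
const balances: Record<string, number> = { alice: 100, bob: 0 };
const frozenAccounts: string[] = ["carol"];

function transfer(from: string, to: string, amount: number): Result<void, TransferError> {
  if (frozenAccounts.indexOf(from) !== -1) {
    return { ok: false, error: { kind: "AccountFrozen", accountId: from } };
  }
  const available = balances[from] ?? 0;
  if (available < amount) {
    return { ok: false, error: { kind: "InsufficientFunds", shortfall: amount - available } };
  }
  balances[from] = available - amount;
  balances[to] = (balances[to] ?? 0) + amount;
  return { ok: true, value: undefined };
}

function describeFailure(error: TransferError): string {
  switch (error.kind) {
    case "InsufficientFunds":
      return `Short by ${error.shortfall}`;
    case "AccountFrozen":
      return `Account ${error.accountId} is frozen`;
    case "NetworkError":
      return error.retryable ? "Retry later" : "Network failure";
    default: {
      // If a new member joins TransferError, this assignment stops compiling.
      const unhandled: never = error;
      return unhandled;
    }
  }
}
```

The never assignment in the default branch is what turns a newly added failure mode into a compile error at every switch that has not been updated.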
Stable error identification
Consumers programmatically respond to errors based on identifying information — code, type, structured field. Make that information stable:
- A machine-readable error code or type.
- A message intended for humans, separate from the code.
- A structured cause chain for higher-level errors that wrap lower-level ones.
Free-text error messages that consumers parse are a contract you did not intend to make. Stable codes prevent it.
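The three parts listed above might look like this in practice. The ApiError shape, its field names, and the error codes are hypothetical; the sketch only shows consumers matching on a stable code anywhere in the cause chain instead of parsing message text.

```typescript
// Hypothetical error shape with the three parts separated.
interface ApiError {
  code: string;     // stable and machine-readable; part of the contract
  message: string;  // for humans; free to be reworded between releases
  cause?: ApiError; // the lower-level error this one wraps, if any
}

// Walks the cause chain looking for a given code.
function hasCode(error: ApiError, code: string): boolean {
  let current: ApiError | undefined = error;
  while (current !== undefined) {
    if (current.code === code) {
      return true;
    }
    current = current.cause;
  }
  return false;
}

const upstreamTimeout: ApiError = {
  code: "UPSTREAM_TIMEOUT",
  message: "The ledger service did not answer in time.",
};

const transferFailed: ApiError = {
  code: "TRANSFER_FAILED",
  message: "Could not complete the transfer.",
  cause: upstreamTimeout,
};
```

A consumer can then branch on hasCode(transferFailed, "UPSTREAM_TIMEOUT") and keep working even after the message wording changes.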
Distinguish user error from system error
A clear API distinguishes:
- The caller did something wrong (validation failure, unauthorized) — return early with a clear identification.
- The system encountered a problem (downstream timeout, internal exception) — surface it as a different category.
Mixing the two pushes diagnosis cost onto every consumer.
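One simple way to keep the two categories apart is an explicit field on every failure. The Failure shape, the category values, and the nextStep policy below are assumptions for illustration; in particular, not every system error is safe to retry in a real service.

```typescript
// Hypothetical failure shape: the category says who has to act.
type FailureCategory = "caller" | "system";

interface Failure {
  category: FailureCategory;
  code: string;
  message: string;
}

// Caller errors are detected up front and returned early.
function validateAmount(amount: number): Failure | null {
  if (!(amount > 0)) {
    return { category: "caller", code: "INVALID_AMOUNT", message: "Amount must be positive." };
  }
  return null;
}

// A consumer can pick a response without inspecting codes or messages.
function nextStep(failure: Failure): "fix-request" | "retry-later" {
  return failure.category === "caller" ? "fix-request" : "retry-later";
}

const downstreamTimeout: Failure = {
  category: "system",
  code: "DOWNSTREAM_TIMEOUT",
  message: "A dependency did not respond.",
};
```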
Versioning and Evolution
No API survives unchanged forever. Plan for change.
Semantic versioning, when applicable
A widely understood convention:
- Major — breaking changes; consumers must update.
- Minor — new functionality; existing callers unaffected.
- Patch — bug fixes; behavior matches the documented contract.
The discipline is on the producer: any change that could break a correct caller belongs in a major version. When in doubt, releasing the change as a major version is much cheaper than handling the bug reports from a patch that turned out to be breaking.
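The three-part convention can be made concrete with a small classifier. This is a simplified sketch: it handles only plain MAJOR.MINOR.PATCH strings and ignores pre-release tags, build metadata, and the special treatment of 0.x versions. parseVersion and classifyBump are hypothetical helper names.

```typescript
interface Version {
  major: number;
  minor: number;
  patch: number;
}

// Accepts only plain "MAJOR.MINOR.PATCH" strings.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) {
    throw new Error(`Not a MAJOR.MINOR.PATCH version: ${text}`);
  }
  return { major: Number(match[1]), minor: Number(match[2]), patch: Number(match[3]) };
}

type Bump = "major" | "minor" | "patch" | "none";

// Reports the most significant component that differs between two versions.
function classifyBump(from: string, to: string): Bump {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (a.major !== b.major) return "major";
  if (a.minor !== b.minor) return "minor";
  if (a.patch !== b.patch) return "patch";
  return "none";
}
```

A consumer could use such a check to flag upgrades that need review: only a result of "major" should require changes on the caller's side, provided the producer has kept the promise.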
Additive change is cheap; removal is expensive
Adding a new method, a new optional parameter, or a new field rarely breaks consumers. Removing or renaming any of them does.
The asymmetry guides design: when in doubt, add a new way to do the thing rather than changing the existing way. When the new way is established, deprecate the old; when deprecation has run long enough, remove.
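A small illustration of an additive change: a hypothetical formatPrice function gains an optional currency parameter in a later release. The function name, the parameter, and the "USD" default are assumptions made for the example.

```typescript
// Earlier release (hypothetical): formatPrice(amount: number): string
// Later release: an optional parameter is added with a default that
// reproduces the old behavior, so existing calls are unaffected.
function formatPrice(amount: number, currency: string = "USD"): string {
  return `${amount.toFixed(2)} ${currency}`;
}

const existingCall = formatPrice(5);    // written against the earlier release
const newCall = formatPrice(5, "EUR");  // uses the added parameter
```

"Rarely" is the right word rather than "never". In plain JavaScript, a function passed directly as a callback to something like Array.prototype.map receives extra arguments (the index and the array), so a newly added optional parameter can silently pick up a value the caller never intended.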
Deprecate before removing
Removal without warning breaks consumers; deprecation gives them time to migrate.
A useful deprecation includes:
- A signal in the API. A @deprecated annotation, a runtime warning, a documentation note.
- A timeline. "Will be removed in v3.0," not "soon."
- A migration path. What to use instead, and how to translate.
- Mechanical help where possible. Codemods, lint rules, search-and-replace recipes.
Deprecation without these is procrastinated removal — the consumer faces the same break later, with less warning.
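The signal, timeline, and migration path can all sit on the deprecated function itself. In this sketch loadInvoice is a hypothetical old name being retired in favor of getInvoice; the Invoice type, the warnOnce helper, and the emittedWarnings log are illustrative. The @deprecated JSDoc tag is what editors and the TypeScript language service use to mark call sites.

```typescript
interface Invoice {
  id: string;
}

// Keeps a record of warnings so each deprecated name warns only once.
const emittedWarnings: string[] = [];
const alreadyWarned: Record<string, boolean> = {};

function warnOnce(name: string, message: string): void {
  if (alreadyWarned[name]) {
    return;
  }
  alreadyWarned[name] = true;
  const text = `[deprecated] ${name}: ${message}`;
  emittedWarnings.push(text);
  console.warn(text);
}

/** Replacement API: reads an invoice by ID. */
function getInvoice(id: string): Invoice {
  return { id };
}

/**
 * @deprecated Will be removed in v3.0. Use getInvoice(id) instead;
 * the arguments and return value are identical.
 */
function loadInvoice(id: string): Invoice {
  warnOnce("loadInvoice", "will be removed in v3.0; use getInvoice(id) instead");
  return getInvoice(id);
}
```

Delegating to the replacement keeps the two functions from drifting apart while both exist, and warning once per name avoids flooding a consumer's logs.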
Maintain compatibility within a major version
The implicit promise of "no breaking changes within a major version" is what consumers rely on to upgrade safely. Keeping it requires discipline:
- Run tests against the old API surface, not just the current one.
- When in doubt, ask "would a correct caller of the previous version still work?"
- When making internal changes, do not alter observable side effects, ordering, or other implementation details that callers may already depend on.
Hyrum's Law
With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.
The consequence is sobering: the actual contract is whatever consumers can observe, not what is documented. Examples of accidental contracts:
- The order in which a function returns elements (when it is not guaranteed).
- The timing of an asynchronous callback.
- The exact text of an error message.
- A field that exists in the response but is undocumented.
Two practical responses:
- Hide what you do not promise. If a behavior is not part of the contract, make it harder to observe — randomize iteration order, vary timings, omit undocumented fields.
- Test against your real consumers. Contract tests, integration with real consumer code, and migration testing surface accidental dependencies before the change ships.
You will not eliminate Hyrum's Law. You can reduce its scope.
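"Randomize iteration order" can be as simple as shuffling a result whose order is not part of the contract. The sketch below uses a Fisher-Yates shuffle; shuffled, listTags, and the tag values are hypothetical names, and the injectable random parameter exists only so tests can make the behavior deterministic.

```typescript
// Fisher-Yates shuffle on a copy; the input array is left untouched.
function shuffled<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const swap = result[i] as T;
    result[i] = result[j] as T;
    result[j] = swap;
  }
  return result;
}

// Storage order is an implementation detail of this hypothetical module.
const storedTags: readonly string[] = ["billing", "admin", "beta"];

/** Returns all tags. The order of the result is unspecified. */
function listTags(): string[] {
  return shuffled(storedTags);
}
```

A consumer that accidentally relies on the order now fails early in their own tests rather than after a storage change ships. The trade-off is some cost per call and less reproducible output, so this fits best where the order truly carries no meaning.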
Documentation
API documentation is a part of the API. Treat it with the same rigor.
A useful docstring covers:
- What the function does — the contract, not the mechanism.
- Parameters — types, units, constraints, defaults.
- Return value — what comes back, including the absence cases.
- Errors — what can go wrong, and how each is signaled.
- Side effects — what changes outside the function.
- Concurrency — whether the function is safe to call from multiple threads, holds locks, blocks.
- Examples — at least one runnable example for non-trivial APIs.
A document that says only "calculates the price" — for a function called calculatePrice — is worse than no documentation, because it occupies the space where useful documentation would otherwise go.
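For contrast, here is a sketch of a docstring for calculatePrice that covers the items in the list above. The signature, units, and rounding rule are assumptions invented for the example; the original function's real contract is not known.

```typescript
/**
 * Returns the total price of an order line, including tax.
 *
 * @param unitPriceCents - Price of one unit in cents; a non-negative integer.
 * @param quantity - Number of units; a positive integer.
 * @param taxRate - Tax as a fraction (0.2 means 20%); must be 0 or greater. Defaults to 0.
 * @returns The total in cents, rounded to the nearest cent.
 * @throws RangeError if any argument is outside its documented range.
 *
 * Side effects: none. Concurrency: reads and writes no shared state.
 *
 * @example
 * calculatePrice(1000, 3, 0.2); // 3600
 */
function calculatePrice(unitPriceCents: number, quantity: number, taxRate: number = 0): number {
  if (Math.floor(unitPriceCents) !== unitPriceCents || unitPriceCents < 0) {
    throw new RangeError("unitPriceCents must be a non-negative integer");
  }
  if (Math.floor(quantity) !== quantity || quantity < 1) {
    throw new RangeError("quantity must be a positive integer");
  }
  if (!(taxRate >= 0)) {
    throw new RangeError("taxRate must be 0 or greater");
  }
  return Math.round(unitPriceCents * quantity * (1 + taxRate));
}
```

Nothing in the comment describes how the total is computed internally; it states units, constraints, defaults, and failure behavior, which is the information a caller cannot get from the name alone.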
Internal APIs
The principles above were written with public, external APIs in mind. They apply, in attenuated form, to internal interfaces too:
- A widely used internal class is an API for the rest of the team.
- A module's public exports are an API for the rest of the codebase.
- A function imported across many call sites is an API for those callers.
The cost of a sloppy internal API is lower than a public one — you can refactor consumers when you change it — but it is not zero. Internal interfaces with poor names, leaky abstractions, or accidental contracts produce the same friction as external ones, just confined to a smaller blast radius.
The investment in clear internal APIs is the investment in a codebase that stays workable as it grows.
Pre-Commit Checklist
- Is the public surface as small as it can be while still solving the problem?
- Is each operation hard to misuse? Are wrong call patterns prevented by the type system where possible?
- Are defaults set to the safe option?
- Are failure modes part of the type signature, with stable identifiers consumers can act on?
- If this is a change to an existing API, is it additive — or has the change been planned with deprecation and migration?
- Does the documentation cover preconditions, return values, errors, side effects, and concurrency?
- Have you considered which observable-but-undocumented behaviors might become accidental contracts?