A discipline for reviewing code that catches defects, transfers knowledge, and raises the standard of the codebase

Code Review

Code review is the most consistent leverage point a senior engineer has. Every review either raises the bar of the codebase or lowers it; there is no neutral review. The author has thought about this code for hours, the reviewer for minutes — the asymmetry is unavoidable, but a method exists for closing it.

The goal of a review is not to find fault. It is to produce code that the team is collectively willing to own.

What a Review Is For

A useful review delivers four things, in this order of priority:

Correctness. Does the code do what the change description claims? Does it handle the cases the author has not thought about yet?
Design. Is the change well-placed? Does it strengthen the existing structure, or pile onto something that should be reshaped first?
Readability. Will the next person who opens this file understand it in a minute?
Consistency. Does it match the conventions of the surrounding code, or invent a new one without justification?

Style, naming, and formatting matter — but they are downstream of these four. Reviews that stop at the surface miss the changes worth making.

The Wrong Ways

The failure modes are predictable:

Rubber-stamping. "LGTM" without reading. The codebase silently drifts.
Bikeshedding. Hours of debate on naming a private variable, none on whether the function should exist.
Author-driven review. Approving because the author is senior, or because pushback feels rude. The PR is not the person.
Reviewer-driven rewrite. Demanding the change be reshaped to match how the reviewer would have written it. The standard is correctness and clarity, not personal taste.
Scope creep. "While you're in there, can you also fix…?" The PR ships when it does what it set out to do.
Ambiguous comments. "This is weird." The author has to guess at what to change. Be specific or do not comment.
Blocking on opinion, suggesting on requirement. The reverse of what should happen — must-fix items are flagged as "consider," and nice-to-haves block merge.

Each of these is an instance of the same root cause: the reviewer has not separated what must change from what they would have done differently.

How to Review

1. Read the description before the diff

The description tells you what the change is supposed to do, why, and what the author is unsure about. Reviewing the diff first means evaluating against unstated criteria — usually the reviewer's own.

If the description is missing or unclear, that is the first comment. Reviews built on guesses produce comments built on guesses.

2. Read the tests before the implementation

The tests state what the code claims to do. The implementation states how it claims to do it. Reading the tests first lets you evaluate the contract independently of the mechanism, and surface gaps in coverage before they are obscured by detail.

If a behavior is described in the PR but not tested, that is the next comment.

3. Read the diff in passes

Three passes is usually enough:

First pass — shape. What files changed? What is added, removed, moved? Is the structure of the change appropriate to the goal?
Second pass — correctness. Walk the code paths. What happens with an empty input, a null, a timeout, a concurrent caller? Does the error handling cover the failure modes the author noted?
Third pass — readability. Names, comments, formatting, dead code, drive-by changes that should be split out.

Doing all three at once produces shallow reviews. Splitting them produces deeper ones.

4. Run the code where it matters

For non-trivial changes, check out the branch and run it. UI changes need a browser; CLI changes need a terminal; migrations need a database. Reading the diff is necessary but not sufficient — many defects are visible only at runtime.

This is doubly true for cross-platform code. A change that looks correct on Android may break on iOS, and vice versa. Both targets are part of the review.

5. Frame comments precisely

A comment that does not tell the author what to do is wasted typing. Use a consistent prefix:

Prefix	Meaning
`must:`	Blocking — fix before merge.
`should:`	Strong suggestion — fix unless there is a reason not to.
`consider:`	Take it or leave it. The reviewer has thought about it; the author can decide.
`nit:`	Style or taste. Never blocking.
`q:`	A question. Not a request to change anything.
`praise:`	The code is good — name what specifically. Reinforcement matters.

The prefix is for the author's benefit: it removes the guess about how much weight to give each comment, and it removes the reviewer's temptation to soften must-fix items into "I wonder if…".

6. Comment at the right altitude

A single design issue is more valuable than ten style issues. If the design needs to change, say so first and stop reviewing the details — they will all change anyway.

The order matters: a reviewer who buries the design comment under fifteen nits forces the author to fix the nits, then redo the design, then re-fix new nits. Lead with the load-bearing comment.

7. Suggest, do not dictate

must: handleError swallows the original cause when it re-throws.
    Consider attaching it via the cause property so logs retain the
    full chain. Alternatively, let it propagate and handle one level up.

Two paths are offered; the author chooses. Compare:

must: rewrite this to use the cause property.

The second is shorter but worse — it removes context, removes the alternative, and removes the author's agency. The longer form raises the bar without taking ownership of the design.

What to Look For

A senior reviewer's eye is trained on a specific list of things that go wrong:

Correctness

Off-by-one and boundary cases. Empty input, single element, maximum size.
Null and undefined. Every dereference that the type system did not prove safe.
Concurrency. Shared state without locks; awaits that race with each other; cancellation not propagated.
Error paths. What happens when the call fails? Is the failure observable?
Resource cleanup. Files, connections, subscriptions, timers — released in every path including exceptions?

Design

Premature abstraction. Three call sites is the threshold; one is not.
Hidden coupling. A module that reaches across layers, or names things from another bounded context.
Mixed levels of abstraction. A function that opens a file, parses JSON, computes a discount, and writes to a database — four jobs, one function.
Inappropriate intimacy. Reaching into another object's internals; modifying its state from outside.
Public surface. Every new export is a commitment. Is it necessary?

Readability

Names. Does the name predict the contents? Is it the same as a name nearby that means something else?
Function size. A function longer than the screen needs justification.
Comments that lie. Comments that no longer match the code, or that explain what instead of why.
Dead code. Removed safely is better than commented out.

Security and Safety

Untrusted input. Anything from the network, the user, or a third-party API. Validated at the boundary?
Authorization checks. Present at the entry point of the action, not delegated to the UI.
Secrets. Hardcoded, logged, or committed?
Injection vectors. SQL, shell, HTML, prompt — parameterized?

Tests

Coverage of the change. Every new branch should have a test, or a reason not to.
Tests that test the mock. Setup that mirrors the implementation and would pass even if the implementation were broken.
Brittleness. Tests coupled to implementation details that will change next sprint.

Reviewing Specific Kinds of Change

The general checklist covers most diffs. Some changes carry their own landmines, and a senior reviewer reaches for a different list when one shows up:

Database migrations. Is it reversible, or is there a tested rollback? Does it lock a large table and stall writes? Is it backward-compatible with the currently deployed code, so the old and new versions can run side by side during a rollout? Schema changes and the code that depends on them usually have to ship in separate, ordered PRs.
Dependency bumps. What actually changed — read the changelog, not just the version number. Does a major bump cross a breaking-change boundary? Does a transitive dependency introduce a new license or a known CVE? A lockfile diff of 4,000 lines still deserves a look at why.
Config and feature flags. Config is code that ships without tests. What is the default, and is it safe? Does a new flag have a removal plan, or is it permanent tech debt the day it lands? Will the flag's off-path be exercised, or does it rot untested?
Deletions. Removed code is the easiest to approve and the easiest to regret. Is anything still calling it — including reflection, config, or another service? Is the behavior genuinely dead, or just dormant? A deletion with no corresponding test or caller change is worth a question.
Security-sensitive paths. Anything touching authentication, authorization, payments, or personal data gets a slower, more adversarial pass — and a reviewer with the relevant expertise. "It works" is not the bar; "it cannot be abused" is.

How to Receive a Review

The author has the harder job — and it is a discipline of its own. Assuming good intent, responding to every thread, pushing back with a reason, and splitting the PR when the review reveals two issues are covered from the author's side in Opening a Pull Request.

The loop only closes when both ends hold: the reviewer raises the bar, and the author meets it without resentment. That is how the standard holds.

Cost and Cadence

Reviews have a real cost: every hour spent reviewing is an hour not spent writing. Two failure modes follow:

Reviews too cheap. Every PR is rubber-stamped to keep the queue moving. The codebase drifts.
Reviews too expensive. Reviewers demand exhaustive design discussion on every change. The team slows down; PRs sit for days; authors stop attempting larger changes.

The cadence that works:

Small PRs reviewed quickly. A PR under 200 lines should get a first response within a few hours during working time.
Large PRs reviewed slowly. A PR over 800 lines is itself a smell; consider whether it should be split before reviewing.
One reviewer for routine changes, two for architecturally significant ones. Two reviewers on every PR is theater.
Async by default, synchronous when stuck. If a thread is on its fourth round-trip, hop on a call.

Pre-Merge Checklist

A senior engineer's quick mental pass before approving:

The description matches the diff. No surprise files, no silent scope creep.
The tests describe the contract the code now offers. New branches are covered.
Public APIs introduced are minimal and necessary.
Failure paths are explicit: every external call has a defined behavior on error.
No secrets, no untrusted input flowing unchecked into a sensitive sink.
Names predict contents; comments explain why, not what.
CI is green, including any pipeline that runs against the platforms the change targets (web, iOS, Android).
Must-fix items are labeled must; preferences are labeled consider.

If a comment is not in one of those categories, ask whether it belongs in this review at all.

Pre-Commit Question

Before approving, ask:

Would I be willing to maintain this code at 2 a.m. next year?

If yes, approve. If no, the comment that answers "what would change that" is the comment to write.

Code Review

On this page