Steven's Knowledge

Microservices

When the cost of coordination exceeds the cost of distribution: what microservices actually buy you, what they cost, and the failure modes that catch teams unprepared

A microservices architecture is a system composed of small, independently deployable services that communicate over a network. Each service owns its data, its release cadence, and ideally its team. The promise is autonomy: teams ship without coordinating, services scale independently, failures stay local.

Everything about that promise is real. So is the cost of getting there. This page is about what microservices actually buy you, what you pay for the privilege, and the specific failure modes that catch teams that adopted the pattern before they needed it. For the alternatives, see Monolith and Modular Monolith; for the distributed-systems primitives microservices require (idempotency, saga, distributed transactions, leader election), see the distributed-systems chapter.

What Microservices Actually Are

The honest definition is operational, not architectural:

  • Independently deployable. Each service can be released without requiring a coordinated deploy of other services.
  • Independently scalable. Each service runs as many or as few replicas as its load demands.
  • Independently owned data. Each service has its own database (or schema), and other services interact with that data only through the service's API.
  • Independently failable. A service's failure does not cascade to take down others, by default.

Notice what is not on the list. "Small" is not a defining property — there is no line count threshold. "REST" is not required — any RPC mechanism works. "Cloud-native" is incidental. The defining property is independence across the four dimensions above. Without all four, you have a distributed monolith, which is worse than a regular monolith.

What Microservices Buy You

When the boundaries are right and the discipline is in place, the wins are real:

  • Team velocity. A team owns a service end-to-end and ships without coordinating with other teams' release windows. This is the largest single benefit and the one that justifies the cost once the engineering organization grows past roughly 30-50 engineers.
  • Independent scaling. A search service that needs 50 replicas does not require the billing service (which needs 2) to also run 50 copies. Capacity becomes responsive to actual load profile.
  • Stack diversity. A new team can pick a language, runtime, or database that fits their problem without conflicting with the rest of the system. The cost is operational diversity — you now operate every stack you chose.
  • Fault isolation. A memory leak in the recommendations service does not take down checkout. By default the blast radius is one service.
  • Replaceability. A small, well-bounded service can be rewritten by a team in weeks. A monolith's equivalent change requires a project.

What Microservices Cost

Everything you got "for free" in a monolith becomes work:

Distribution Tax

Every previously in-process function call becomes a network call. That is:

  • Latency — milliseconds instead of nanoseconds.
  • Failure modes — timeouts, partial failures, retries, idempotency requirements (see Idempotency).
  • Serialization overhead — every call costs CPU at both ends.
  • Network capacity planning — services that previously shared memory now share a network.

Data Consistency Tax

A monolith's BEGIN ... COMMIT is gone. Multi-service operations require:

  • Saga patterns (see Saga Protocol) for long-running workflows.
  • Outbox pattern (see Exactly-Once Semantics) for reliable event publishing.
  • TCC (see TCC) for cross-service atomicity without locks.
  • Eventual consistency tolerance in the application logic for read paths.
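
As a rough illustration of the outbox item above, the sketch below writes the business row and the event row in one local transaction, then lets a separate relay publish the event. It uses Python's standard `sqlite3` module as a stand-in for the service's own database. The table names, the event shape, and the `publish` callback are assumptions made for the example, not a prescribed schema.

```python
import json
import sqlite3


def create_schema(conn):
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, total_cents):
    # The business row and the event row commit in one local transaction,
    # so an order never exists without its event, or the reverse.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total_cents),
        )
        event = {"type": "order_placed", "order_id": order_id, "total_cents": total_cents}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(conn, publish):
    # A separate relay polls unpublished rows and hands them to the broker.
    # If it crashes after publish() but before marking the row, the event is
    # sent again: delivery is at-least-once, so consumers must deduplicate.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
place_order(conn, order_id=1, total_cents=2500)
sent = []
relay_once(conn, sent.append)  # sent.append stands in for a broker client
print(sent)
```

The design choice worth noticing is that the service never writes to the database and the broker in the same step. The only atomic write is local, and the broker is fed from what was committed.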

Operational Tax

You now operate N services instead of 1. That means:

  • N deployment pipelines, dashboards, runbooks, on-call rotations.
  • Service discovery and routing. What was a function reference is now a DNS entry plus an Envoy sidecar proxy plus a circuit breaker.
  • Cross-service observability. Distributed tracing becomes mandatory; without it, debugging a request that crosses several services is guesswork.
  • Versioning of every API. A change that was a refactor in a monolith is now a versioned interface change with deprecation windows.
  • Security per service. AuthN/AuthZ at each boundary, certificate rotation everywhere, secrets management at scale.

Coordination Tax (Subtle)

The microservices promise is autonomy, but autonomy is real only when boundaries are right. When boundaries are wrong:

  • Feature changes require multi-service deploys. "Add a field to the user record" touches three services. Each team's release window must align. This is a distributed monolith in disguise.
  • Contract negotiation overhead. Adding a field requires coordinating with every consumer. The number of bilateral conversations scales quadratically.
  • Cross-team incidents. A failure mode spans three services and three teams. Root cause analysis becomes diplomatic.
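
The quadratic claim above can be made concrete with a small calculation. This is an upper bound under a stated assumption: every team may need to negotiate with every other team, which is the worst case when boundaries are wrong. `bilateral_channels` is a hypothetical helper name.

```python
def bilateral_channels(teams):
    """Distinct team pairs, assuming any team may need to talk to any other."""
    return teams * (teams - 1) // 2


for n in (3, 6, 12):
    print(n, "teams ->", bilateral_channels(n), "possible bilateral conversations")
# 3 teams -> 3, 6 teams -> 15, 12 teams -> 66
```

Quadrupling the number of teams from 3 to 12 multiplies the possible pairings by 22, which is why contract changes that fan out to many consumers get expensive quickly.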

Practitioners who have moved from a monolith to microservices often report roughly the same total complexity, redistributed. Coordination overhead moves from "merge conflicts and deploy windows" to "API negotiations and cross-team incidents."

When Microservices Are the Right Answer

Specifically:

  • Large engineering organization. Past ~30-50 engineers, coordination overhead inside a monolith starts to dominate. Microservices let teams move in parallel. The threshold varies by team and codebase, but it exists.
  • Real independent-scaling needs. Some component genuinely has a load profile that justifies its own footprint. Image processing, ML inference, real-time chat, search indexing.
  • Stack divergence is real. A subsystem genuinely benefits from a different language or database.
  • Independent release cadence is required. This module ships hourly while the rest ships weekly; coupling them costs both.
  • Failure isolation is mandatory. This component must not be able to take down the rest of the system. Common in regulated environments, in multi-tenant systems, and where customer-facing components share a system with internal-facing ones.
  • The bounded contexts are clear. You know which subsystems are which because you have already lived in the codebase and understand the domain. Carving by guess produces wrong cuts.

The two most reliable signals: the team has grown past its coordination tolerance, and you have operated a monolith long enough to know your real bounded contexts.

When Microservices Are the Wrong Answer

The same list inverted:

  • Small team. A 10-person team doing microservices is paying overhead for independence they do not need.
  • Early-stage product. You do not yet know your domain. Premature service boundaries freeze the wrong cuts.
  • Tightly coupled data model. Your "user" or "order" entity touches every part of the system. Splitting along that fault line creates more pain than it solves.
  • No measured operational problem. You are adopting microservices because they are the modern thing, not because the monolith hurts.

The Distributed Monolith Anti-Pattern

The worst-of-both-worlds failure mode. Symptoms:

  • Deploying service A requires also deploying services B and C, in a specific order.
  • A feature change touches three services every time.
  • Database writes cross service boundaries (one service writes to another's tables).
  • Services share a database "for now."
  • Compile-time dependencies in shared libraries force coordinated upgrades.
  • Releases require a coordinator role to sequence them.

A distributed monolith is worse than a regular monolith. You pay the full distribution tax of microservices and get none of the independence benefits. The cure is either to recombine into a real monolith or to invest in genuine service independence: separate databases, versioned APIs, independent deploys. Half-measures stay broken.

Microservices Patterns You Will End Up Needing

If you go microservices, you will encounter these patterns whether or not you planned for them:

  • API Gateway / BFF — single entry point for clients, handles cross-cutting concerns.
  • Service mesh — observability, retries, circuit breaking, mTLS for service-to-service calls.
  • Event bus / message broker — async communication, decoupling, audit trail.
  • Distributed tracing — debugging cross-service flows.
  • Idempotency keys — every cross-service call eventually retries.
  • Saga / outbox — multi-service workflows that need atomicity.
  • Versioned APIs with deprecation windows — you cannot change a contract atomically.
  • Centralized config / secrets management — distributed config drift is a real risk.
  • Service discovery — DNS, Consul, Kubernetes Services.
  • Circuit breakers and bulkheads — preventing cascading failures.

If you are adopting microservices and have not planned for most of these, you are not yet ready.
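
To give a sense of what one of these patterns involves, here is a minimal circuit breaker sketch. It is a simplified illustration, not how any particular service mesh or library implements it: the class name, the thresholds, and the injectable `clock` parameter are assumptions for the example, and it is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of piling load on a sick service.
                raise CircuitOpenError("downstream marked unhealthy")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens it.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_recommendations, user_id)`, where `fetch_recommendations` is whatever hypothetical client function makes the network call, and treat `CircuitOpenError` as a signal to serve a fallback.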

Common Mistakes

The full catalogue lives in Anti-Patterns; the headline failures specific to microservices:

  • One service per database table. Microservices for the sake of small. Each service is too small to own meaningful behavior; you get distribution tax with no autonomy benefit.
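  • Synchronous chains. Service A calls B calls C calls D. End-to-end latency is the sum of every hop's latency; availability is the product of every hop's availability, so each added hop makes the chain slower and less reliable. Use async messaging or rethink the boundaries.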
  • Shared database. "Just for transitional convenience." This is how you get a distributed monolith.
  • Building infrastructure before you need it. Adopting Kubernetes, service mesh, distributed tracing, message broker, and observability stack all at once for a 5-person team. The infrastructure work consumes the engineering capacity meant for the product.
  • Service boundaries that match no team boundary. A microservice without a clear owning team becomes everyone's and no one's responsibility. See Conway's Law.
  • Splitting because microservices are modern. This is by far the most common mistake. The signal to split is operational pain, not architectural fashion.
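
The arithmetic behind the synchronous-chain item can be worked through directly. The numbers below are illustrative assumptions (four hops, 20 ms and 99.9% availability each), and the model assumes the hops are called strictly in series and fail independently.

```python
import math


def chain_latency_ms(hop_latencies_ms):
    """Serial calls: end-to-end latency is the sum of the hops."""
    return sum(hop_latencies_ms)


def chain_availability(hop_availabilities):
    """Independent hops: the chain works only if every hop works."""
    return math.prod(hop_availabilities)


hops = 4
print(chain_latency_ms([20] * hops))                  # 80
print(round(chain_availability([0.999] * hops), 6))   # 0.996006
```

Under these assumptions, four "three nines" services in a row give a request path that is unavailable roughly four times as often as any single one of them.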

A Practical Adoption Path

If you have decided microservices are right, the lower-risk path (see When to Split for the decision itself):

  1. Start as a modular monolith. Get the boundaries right while you can still refactor across them cheaply.
  2. Identify the one clearest candidate for splitting. Usually this is the component under the strongest scaling, ownership, or stack-divergence pressure.
  3. Extract that one service. Build the operational infrastructure for two-service operation — pipeline, observability, eventing. This is the largest investment.
  4. Operate it for months before extracting the next. Discover what hurts.
  5. Continue extracting only where there is a named reason. Keep what can stay in the monolith.

A modular monolith with three extracted services often beats fifteen microservices with the same total scope.

Further Reading

  • Sam Newman, Building Microservices (2nd ed., 2021). The standard practitioner reference.
  • Sam Newman, Monolith to Microservices (2019). The migration book.
  • James Lewis & Martin Fowler, Microservices: a definition of this new architectural term (essay, 2014). The article that put the word on the map and gave it a working definition.
  • Susan Fowler, Production-Ready Microservices (2016). What running them at Uber's scale taught.
  • Phil Calçado, Pattern: Service Mesh. How the operational tax becomes infrastructure.

Pre-commit Checklist

  • For each service I am creating, does it have an independent deployment pipeline, an owning team, and its own data store?
  • For each cross-service call, is the consumer prepared to handle the failure modes (timeout, retry, partial failure)?
  • For each cross-service write, am I using saga, TCC, or outbox — not a 2PC attempt?
  • Do I have distributed tracing in place before I need to debug a multi-service incident?
  • Are my service boundaries aligned with team boundaries (see Conway's Law)?
  • For every "microservice" I plan, can I justify it with a named operational reason — not just "smaller is better"?
  • Am I avoiding the distributed-monolith anti-pattern? Test: can each service be deployed without requiring others to deploy?
