What happens inside a relational database — schema, indexes, transactions, query plans, and how it all changes at scale

Database

This section is about what happens inside a relational database: how rows are stored and found, how concurrent changes stay consistent, how the optimizer turns SQL into a plan, and how all of that changes once one machine is no longer enough.

The material is organized by the questions you ask in practice:

How should I model the data? — design
How does the engine actually find rows? — indexing
How do concurrent changes stay consistent? — transactions
Why is this query slow, and how do I fix it? — optimization
What changes when one machine is no longer enough? — advanced

Topics

Modeling

Deciding what tables exist and how they relate before the engine ever sees a query.

Design — Schema design, normalization assessment, data integrity, performance metrics, best-practices checklist.

Engine Internals

What the database is doing on every read and every write.

Indexing — Index principles: B-Tree, hash, composite indexes, and the strategies that decide which one to use.
Transactions — ACID properties, isolation levels, concurrency control, locking.

Performance & Distribution

Making slow queries fast, then keeping things working once one node is not enough.

Optimization — Query optimization, execution-plan analysis, caching strategies.
Advanced — Distributed databases, sharding, replication, CAP theorem.

Scope and Boundaries

Concern	Lives here	Lives elsewhere
Relational schema design, normalization, integrity	✓	—
Index types, strategies, B-Tree / hash internals	✓	—
ACID, isolation levels, locking, concurrency control	✓	—
Query optimization, execution plans, plan-level caching	✓	—
Sharding, replication, distributed RDBMS, CAP-in-practice	✓	—
In-memory caches (Redis, Memcached)	—	`software-development/infrastructure/cache`
Object storage, search engines, vector DBs, time-series, warehouses	—	`software-development/infrastructure` (data & storage)
General distributed systems theory (CAP, consensus, consistency models)	—	`software-development/architecture/distributed-systems`
ORM design, connection pooling, application access patterns	—	`software-development/back-end`
Stream processing and event-driven data flow	—	`software-development/infrastructure/stream-processing`

The split with infrastructure/ is the most common point of confusion: this section is about what happens inside a relational database engine. Anything else that stores bytes — caches, blobs, search, vectors, warehouses, streams — lives in infrastructure/, because operationally it behaves more like infrastructure than like SQL.

Database

Database

Topics

Modeling

Engine Internals

Performance & Distribution

Scope and Boundaries

On this page