Steven's Knowledge

Database

What happens inside a relational database — schema, indexes, transactions, query plans, and how it all changes at scale

Database

This section is about what happens inside a relational database: how rows are stored and found, how concurrent changes stay consistent, how the optimizer turns SQL into a plan, and how all of that changes once one machine is no longer enough.

The material is organized by the questions you ask in practice:

  • How should I model the data? — design
  • How does the engine actually find rows? — indexing
  • How do concurrent changes stay consistent? — transactions
  • Why is this query slow, and how do I fix it? — optimization
  • What changes when one machine is no longer enough? — advanced

Topics

Modeling

Deciding what tables exist and how they relate before the engine ever sees a query.

  • Design — Schema design, normalization assessment, data integrity, performance metrics, best-practices checklist.

Engine Internals

What the database is doing on every read and every write.

  • Indexing — Index principles: B-Tree, hash, composite indexes, and the strategies that decide which one to use.
  • Transactions — ACID properties, isolation levels, concurrency control, locking.

Performance & Distribution

Making slow queries fast, then keeping things working once one node is not enough.

  • Optimization — Query optimization, execution-plan analysis, caching strategies.
  • Advanced — Distributed databases, sharding, replication, CAP theorem.

Scope and Boundaries

ConcernLives hereLives elsewhere
Relational schema design, normalization, integrity
Index types, strategies, B-Tree / hash internals
ACID, isolation levels, locking, concurrency control
Query optimization, execution plans, plan-level caching
Sharding, replication, distributed RDBMS, CAP-in-practice
In-memory caches (Redis, Memcached)software-development/infrastructure/cache
Object storage, search engines, vector DBs, time-series, warehousessoftware-development/infrastructure (data & storage)
General distributed systems theory (CAP, consensus, consistency models)software-development/architecture/distributed-systems
ORM design, connection pooling, application access patternssoftware-development/back-end
Stream processing and event-driven data flowsoftware-development/infrastructure/stream-processing

The split with infrastructure/ is the most common point of confusion: this section is about what happens inside a relational database engine. Anything else that stores bytes — caches, blobs, search, vectors, warehouses, streams — lives in infrastructure/, because operationally it behaves more like infrastructure than like SQL.

On this page