Steven's Knowledge

Vector Databases

Comparing Pinecone, Weaviate, Qdrant, Chroma, pgvector, and Milvus — and when to use what

Choosing a vector database is one of the first decisions in any retrieval pipeline, and it is often over-agonized. The dirty secret: for most teams under 10M vectors, almost any of these will work. The real questions are operational — who runs it, how does it fail, and what else does it need to do.

The Landscape

The major options, grouped by deployment model:

Managed / serverless:

  • Pinecone — the "just works" option. Serverless tier scales to zero, great DX, limited filtering and no self-hosted option. Best for teams that want to ship fast and don't need exotic features.
  • Weaviate Cloud — managed Weaviate. Richer query language, built-in hybrid search, supports modules for vectorization at ingest time.

Self-hosted first:

  • Qdrant — Rust-based, fast, excellent filtering, payload indexing. The best self-hosted option if you want fine-grained control. Also has a managed cloud.
  • Milvus — built for scale. Handles billions of vectors with partitioning and GPU indexing. Overkill for small datasets, essential for truly large ones.
  • Chroma — the "SQLite of vector DBs." Embeds in your Python process, zero config. Perfect for prototyping and single-node workloads. Not for production clusters.

Postgres extension:

  • pgvector — adds vector search to your existing Postgres. If your data is already in Postgres and you have <5M vectors, this avoids an entire new infrastructure dependency. Performance ceiling is lower than purpose-built DBs.

When to Use What

  1. Prototype / hackathon — Chroma. Zero setup, in-process, good enough.
  2. Startup shipping fast — Pinecone serverless or Weaviate Cloud. Minimize ops burden.
  3. Need advanced filtering — Qdrant. Payload-based filtering without post-filter recall loss.
  4. Already on Postgres — pgvector. One less system to operate.
  5. Billions of vectors — Milvus. Partitioned, GPU-accelerated, built for it.
  6. Need hybrid search built-in — Weaviate. Native BM25 + vector fusion.
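
To make "BM25 + vector fusion" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a vector ranking. It is a generic illustration, not a description of how Weaviate or any other database fuses results internally. The document ids, the two input rankings, and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings for the same query from two retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]    # keyword (BM25) order
vector_hits = ["doc_c", "doc_a", "doc_d"]  # vector similarity order

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the behavior you want from hybrid search when neither retriever is reliable on its own. If your database lacks built-in hybrid search, a client-side fusion step like this is one possible workaround.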

Managed vs Self-Hosted

The tradeoff is simple in principle, brutal in practice:

  • Managed saves you from index tuning, shard rebalancing, backup orchestration, and upgrade headaches. You pay a premium and lose some control.
  • Self-hosted gives you full control over latency, cost, and data residency. You pay in engineering time — and vector DBs under load have sharp edges (memory pressure, compaction stalls, slow cold starts).

For most teams: start managed, then move to self-hosted when the bill gets painful or compliance demands it.

What Actually Matters

Beyond the logo on the dashboard, evaluate these:

  • Filtering performance — can you filter by metadata without destroying recall? Some DBs do pre-filtering (good recall, slower), others post-filter (fast, lossy).
  • Update semantics — can you upsert vectors without a full re-index? Critical for real-time pipelines.
  • Multi-tenancy — namespace isolation, per-tenant scaling. Essential for SaaS.
  • Backup and restore — sounds boring until you lose an index.
  • Embedding model coupling — some DBs want to own the embedding step. Prefer decoupled architectures unless you have a strong reason.
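
The filtering point is easiest to see with a toy example. The sketch below uses a brute-force cosine scan as a stand-in for a real ANN index, and the items, the "tenant" field, and the helper names are all hypothetical. It shows why post-filtering can be lossy: the top-k is chosen before the filter runs, so matching vectors that fall outside the top-k never come back.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, items, k):
    """Brute-force nearest neighbours; a real DB would use an ANN index here."""
    return sorted(items, key=lambda it: cosine(query, it["vec"]), reverse=True)[:k]


def post_filter(query, items, k, predicate):
    """Search first, then drop hits that fail the filter. May return fewer than k."""
    return [it for it in top_k(query, items, k) if predicate(it)]


def pre_filter(query, items, k, predicate):
    """Restrict the candidate set first, then search within it."""
    return top_k(query, [it for it in items if predicate(it)], k)


def is_tenant_b(item):
    return item["tenant"] == "b"


# Hypothetical data: tenant "a" owns the two vectors closest to the query.
ITEMS = [
    {"id": 1, "tenant": "a", "vec": [1.0, 0.0]},
    {"id": 2, "tenant": "a", "vec": [0.9, 0.1]},
    {"id": 3, "tenant": "b", "vec": [0.8, 0.2]},
    {"id": 4, "tenant": "b", "vec": [0.0, 1.0]},
]
query = [1.0, 0.0]

post_ids = [it["id"] for it in post_filter(query, ITEMS, 2, is_tenant_b)]
pre_ids = [it["id"] for it in pre_filter(query, ITEMS, 2, is_tenant_b)]
print("post-filter:", post_ids)
print("pre-filter: ", pre_ids)
```

Here the post-filter path returns nothing for tenant "b" even though two matching items exist, while the pre-filter path finds both. When evaluating a database, it is worth running this kind of selective-filter query against your own data to see which behavior you get.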

The Honest Take

The vector DB market is consolidating. Pinecone and Qdrant are winning mindshare. pgvector is eating the "good enough" tier. Milvus owns the scale tier. Chroma is the dev tool.

Don't spend weeks evaluating. Pick one that matches your deployment model, load your data, measure recall and latency on your actual queries, and move on. The chunking and reranking stages will matter more than the DB choice.
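
A minimal sketch of that measurement loop, assuming you already have a set of real queries and, for each, a ground-truth list of relevant ids (for example from an exact brute-force search). The `search_fn` callable, the `fake_search` stub, and the sample ids are hypothetical placeholders for your own database client and data.

```python
import time


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the first k retrieved ids."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def evaluate(search_fn, queries, ground_truth, k=10):
    """Run every query once and report mean recall@k and a rough p95 latency."""
    recalls, latencies = [], []
    for query, relevant in zip(queries, ground_truth):
        start = time.perf_counter()
        retrieved = search_fn(query, k)
        latencies.append(time.perf_counter() - start)
        recalls.append(recall_at_k(retrieved, relevant, k))
    latencies.sort()
    # Index-based approximation of the 95th percentile; fine for a quick comparison.
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return {"mean_recall": sum(recalls) / len(recalls), "p95_latency_s": p95}


def fake_search(query, k):
    """Stand-in for a real database call; returns canned ids per query."""
    canned = {"q1": ["a", "b", "x"], "q2": ["c", "y", "z"]}
    return canned[query][:k]


report = evaluate(
    fake_search,
    queries=["q1", "q2"],
    ground_truth=[["a", "b", "c"], ["c", "d", "e"]],
    k=3,
)
print(report)
```

Swap `fake_search` for a thin wrapper around each candidate database and run the same queries through all of them. A few hundred representative queries are usually enough to show whether the differences between systems matter for your workload.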
