Steven's Knowledge

Feature Stores

Serving consistent features to training and inference without skew

A feature store solves one problem that sounds simple and is surprisingly hard: serve the same feature values to training and inference, with correct point-in-time semantics, at the latency your serving path demands. If you've ever debugged a model that works in offline eval but fails in production, there's a good chance training/serving skew was the cause, and a feature store is the tool built to prevent it.

When You Need One

You probably need a feature store when:

  • Multiple models share features and you're copy-pasting feature pipelines between projects.
  • Online serving needs precomputed features at low latency (single-digit ms).
  • Training/serving skew is causing silent accuracy degradation.
  • Point-in-time correctness matters — your training data must reflect what was known at prediction time, not what you know now.

You probably don't need one when:

  • You have one model with a small, simple feature set.
  • All features come directly from the request payload (no lookups needed).
  • You're still iterating on what features to use and haven't stabilized.

Don't adopt a feature store because it sounds mature. Adopt it when the pain is real.

Online vs Offline Stores

Every feature store has two halves:

  • Offline store — the historical data warehouse. Used for training and batch scoring. Backed by something like BigQuery, Snowflake, Redshift, or Parquet on S3. Latency doesn't matter; completeness does.
  • Online store — the low-latency serving layer. Used at inference time. Backed by Redis, DynamoDB, Bigtable, or similar. Only stores the latest feature values (or a small time window). Latency is everything.

The feature store keeps these two in sync: features are computed once, written to the offline store, and materialized to the online store for serving.
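To make the sync step concrete, here is a minimal sketch of materialization, assuming the offline store is an append-only log of timestamped rows and the online store is a plain key-value map. The names `offline_rows`, `materialize`, and `online_store` are hypothetical and not taken from any particular feature store.

```python
from datetime import datetime

# Hypothetical offline store: an append-only log of
# (entity_id, event_time, features) rows. Full history is kept.
offline_rows = [
    ("user_1", datetime(2024, 1, 1), {"txn_count_7d": 3}),
    ("user_1", datetime(2024, 1, 8), {"txn_count_7d": 5}),
    ("user_2", datetime(2024, 1, 5), {"txn_count_7d": 1}),
]


def materialize(rows):
    """Return {entity_id: features} holding only the newest row per entity."""
    latest = {}  # entity_id -> (event_time, features)
    for entity_id, event_time, features in rows:
        if entity_id not in latest or event_time > latest[entity_id][0]:
            latest[entity_id] = (event_time, features)
    return {entity_id: feats for entity_id, (_, feats) in latest.items()}


# Hypothetical online store: a dict standing in for Redis or DynamoDB.
online_store = materialize(offline_rows)
print(online_store["user_1"])
```

The offline log keeps both rows for `user_1`, while the online map keeps only the newer one. That asymmetry is the whole difference between the two halves.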

Feature Computation

Features can be computed in three modes:

  1. Batch — scheduled jobs (hourly, daily) compute features over historical data and write them to both stores. The simplest pattern. Good for features that don't change fast.
  2. Streaming — features are updated in near-real-time from event streams (Kafka, Kinesis). Needed when freshness matters — e.g., "number of transactions in the last 5 minutes."
  3. On-demand — computed at request time from the raw input. No store needed, but it adds latency to every request and the logic can't easily be shared across models.

Most production systems use batch as the default and add streaming only for features where freshness is critical.
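As an illustration of the streaming mode, the "transactions in the last 5 minutes" feature could be maintained roughly like this. It is a single-process sketch that assumes events arrive in timestamp order; a real pipeline would run inside a stream processor and handle late or out-of-order events. `SlidingWindowCount` is a hypothetical name.

```python
from collections import deque


class SlidingWindowCount:
    """Count events whose timestamp falls in the window (now - window, now]."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()  # event timestamps, oldest first

    def add(self, ts):
        # Assumes timestamps arrive in non-decreasing order.
        self._timestamps.append(ts)

    def value(self, now):
        # Evict events that have aged out of the window, then count the rest.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCount(window_seconds=300)
for ts in (0, 100, 250, 400):  # event times in seconds
    counter.add(ts)

count_at_400 = counter.value(now=400)
print(count_at_400)
```

At `now=400` the cutoff is 100, so the events at 0 and 100 are evicted and the two events at 250 and 400 remain. A batch job would only catch up with this at its next scheduled run, which is why freshness-sensitive features need the streaming path.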

Point-in-Time Correctness

This is the subtle but crucial concept. When you build a training dataset, each example needs the feature values as they existed at the time of that example — not the current values. Without this:

  • Label leakage — future information leaks into training features.
  • Optimistic offline metrics — your model looks great offline but underperforms online because it was trained on "future" features.

A proper feature store handles point-in-time joins automatically: given an entity and a timestamp, return the feature values that were current at that timestamp. This is deceptively hard to implement correctly with raw SQL.
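The core of a point-in-time join is an "as of" lookup: for each training example, take the newest feature row whose timestamp is not later than the example's timestamp. Here is a minimal sketch with the standard library, assuming each entity's history is already sorted by timestamp. `feature_history`, `feature_as_of`, and `label_events` are hypothetical names, and timestamps are plain integers for brevity.

```python
from bisect import bisect_right

# Hypothetical feature history per entity, sorted by timestamp:
# (timestamp, feature_value)
feature_history = {
    "user_1": [(10, 0.2), (20, 0.5), (30, 0.9)],
}


def feature_as_of(history, entity_id, ts):
    """Return the latest value with timestamp <= ts, or None if none exists."""
    rows = history.get(entity_id, [])
    timestamps = [t for t, _ in rows]
    idx = bisect_right(timestamps, ts)  # number of rows with timestamp <= ts
    if idx == 0:
        return None  # the feature did not exist yet at time ts
    return rows[idx - 1][1]


# Training examples: (entity_id, timestamp of the prediction/label event)
label_events = [("user_1", 5), ("user_1", 20), ("user_1", 25), ("user_1", 99)]

training_rows = [
    (entity_id, ts, feature_as_of(feature_history, entity_id, ts))
    for entity_id, ts in label_events
]
print(training_rows)
```

The example at timestamp 25 gets 0.5, not the later 0.9, even though 0.9 is the "current" value. Whether a feature row stamped exactly at the event time should count (`<=` versus `<`) is a design choice that depends on how the timestamps are produced. A strict `<` is the safer default when a feature might be written as a consequence of the event itself.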

The Tools

  • Feast — open-source, cloud-agnostic. The most popular OSS option. Supports batch and streaming, online and offline stores. You manage the infrastructure.
  • Tecton — managed feature platform, founded by the team behind Uber's Michelangelo ML platform and a major contributor to Feast. Strong on real-time features and enterprise governance. Expensive.
  • Vertex AI Feature Store — GCP-managed. Good integration with Vertex pipelines. Less flexible but zero infra management.
  • SageMaker Feature Store — AWS-managed. Same trade-offs as Vertex.
  • Databricks Feature Store — integrated with Delta Lake and MLflow. Natural if you're in the Databricks ecosystem.
  • Hopsworks — open-source alternative with a strong feature engineering story.

If you're just starting: Feast gives you the concepts without vendor lock-in. If you're at scale and want managed: Tecton or your cloud provider's offering.

Common Mistakes

  • Building one too early — if you have three features and one model, a feature store is overhead. Start with a clean feature pipeline and migrate later.
  • Ignoring point-in-time semantics — the whole value proposition collapses without correct temporal joins.
  • Online store as a database — the online store is a cache for serving, not a general-purpose DB. Keep it narrow.
  • No monitoring — feature drift is as important as model drift. Track feature distributions over time.
  • Treating it as a data warehouse — a feature store is not where you do exploratory analysis. It stores curated, production-ready features.
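For the monitoring point, one common way to track a feature's distribution over time is to compare a recent sample against a baseline with the Population Stability Index (PSI). The sketch below is one possible approach and not something the tools above prescribe. The function name `psi`, the bin count, and the epsilon floor are all assumptions.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside baseline range
            counts[idx] += 1
        eps = 1e-6  # floor so that empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]  # feature values at training time
shifted = [v + 0.5 for v in baseline]     # same feature after an upstream change

print(psi(baseline, baseline))
print(psi(baseline, shifted))
```

Identical samples score zero, and the score grows as the distributions diverge. The alerting threshold is a judgment call that should be tuned per feature.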
