Feature Store Security (LLMOps Security)
Security overview of ML feature stores (Feast, Tecton, Vertex Feature Store): architecture and trust model, attack surfaces in online and offline stores, and the security implications of centralized feature management for ML systems.
Feature stores centralize the computation, storage, and serving of ML features -- the input variables that models use for predictions. By providing a single source of truth for features across training and inference, feature stores solve the training-serving skew problem. They also create a single point of attack. A compromised feature store affects every model that consumes its features, making it a high-leverage target for adversaries seeking to manipulate model behavior at scale.
Feature Store Architecture
Core Components
A feature store consists of several interconnected components, each with its own security properties:
| Component | Purpose | Security Relevance |
|---|---|---|
| Offline store | Historical feature data for training | Training data integrity; contains full feature history |
| Online store | Low-latency feature serving for inference | Real-time prediction manipulation; directly affects live traffic |
| Feature transformation pipeline | Computes features from raw data | Transformation logic manipulation; controls what models see |
| Feature registry | Metadata about available features | Feature discovery; exposes data schema and relationships |
| Materialization pipeline | Syncs features from offline to online store | Consistency enforcement; sync compromise creates skew |
Data Flow
```
Raw data sources -> Feature transformation -> Offline store -> Materialization -> Online store
                                                   |                                   |
                                                   v                                   v
                                            Training pipeline                    Inference API
```
An attacker can intervene at any point in this flow. The impact varies:
- Transformation manipulation affects both training and inference data
- Offline store poisoning affects future model training
- Online store poisoning affects real-time predictions immediately
- Materialization manipulation creates training-serving skew that causes subtle prediction errors
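Materialization skew can be caught by periodically sampling entities and comparing what each store would serve. A minimal sketch, using plain dicts as stand-ins for a real offline store (data warehouse) and online store (key-value store); the feature name and entity IDs are illustrative:

```python
# Minimal skew check: sample entities and compare the value each store
# would serve. Dicts stand in for the warehouse-backed offline store and
# the key-value online store of a real deployment.

def find_skewed_entities(offline_store, online_store, feature, tolerance=1e-9):
    """Return entity IDs whose online value diverges from the offline value."""
    skewed = []
    for entity_id, features in offline_store.items():
        offline_value = features.get(feature)
        online_value = online_store.get(entity_id, {}).get(feature)
        if online_value is None or abs(offline_value - online_value) > tolerance:
            skewed.append(entity_id)
    return skewed

offline = {"user_1": {"txn_count_7d": 3.0}, "user_2": {"txn_count_7d": 12.0}}
online = {"user_1": {"txn_count_7d": 3.0}, "user_2": {"txn_count_7d": 95.0}}  # poisoned

print(find_skewed_entities(offline, online, "txn_count_7d"))  # ['user_2']
```

In production this comparison must respect materialization lag: a value that differs only because the latest sync has not yet run is not evidence of tampering, so the check should tolerate values written within the sync window.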
Platform Comparison
Feast (Open Source)
Feast is the most widely deployed open-source feature store.
Architecture: Feast acts as a thin serving layer over existing data infrastructure. The offline store is typically a data warehouse (BigQuery, Redshift, Snowflake) and the online store is a key-value store (Redis, DynamoDB, SQLite).
Security model:
- Authentication depends on the underlying data stores (cloud IAM, database credentials)
- No built-in access control at the feature level
- Feature definitions are code in a Git repository
- The Feast server itself has no authentication in default configuration
Key weaknesses:
- Default deployment has no authentication
- Feature-level access control is not supported
- Materialization jobs run with the permissions of the Feast service account
- Online store credentials are shared across all feature consumers
Tecton
Tecton is a managed feature platform that layers enterprise controls (access management, audit logging, network isolation) on top of the same offline/online store architecture.
Security model:
- Workspace-based access control
- Role-based permissions for feature management
- Audit logging for feature access
- Network isolation options (VPC, private endpoints)
Key weaknesses:
- Workspace-level granularity may be too coarse to enforce per-feature restrictions
- Service accounts for materialization have broad data access
- Feature transformation code executes in Tecton-managed infrastructure
- Cross-workspace feature sharing may violate data boundaries
Vertex Feature Store (Google Cloud)
Vertex Feature Store is Google Cloud's managed feature store integrated with Vertex AI.
Security model:
- Google Cloud IAM for access control
- VPC Service Controls for network isolation
- Data encryption at rest and in transit
- Audit logging via Cloud Audit Logs
Key weaknesses:
- IAM policies may be overly permissive
- Feature store access is often granted at the project level rather than per-feature
- BigQuery offline store access may not be independently controlled
- Service accounts used for feature ingestion may have broad permissions
Attack Surface Analysis
Online Store Attacks
The online store serves features for real-time inference. Manipulating online store data directly affects model predictions.
Direct value manipulation. If an attacker gains write access to the online store (Redis, DynamoDB), they can modify feature values for specific entities. The model receives manipulated features and produces attacker-influenced predictions without any change to the model itself.
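A sketch of why this works: the model never changes, only its inputs do. The dict stands in for Redis or DynamoDB, and the scoring rule is a toy illustration, not any real fraud model:

```python
# Direct value manipulation: an attacker with online-store write access
# rewrites one entity's features; the unchanged model produces an
# attacker-influenced prediction. Dict stands in for Redis/DynamoDB.

online_store = {"user_42": {"chargeback_count": 7, "account_age_days": 2}}

def fraud_score(features):
    # Toy rule: many chargebacks on a young account looks fraudulent.
    if features["chargeback_count"] > 5 and features["account_age_days"] < 30:
        return 0.9
    return 0.1

print(fraud_score(online_store["user_42"]))  # 0.9 -> flagged for review

# Attacker overwrites the feature value in the online store.
online_store["user_42"]["chargeback_count"] = 0

print(fraud_score(online_store["user_42"]))  # 0.1 -> passes review
```

Because the model artifact, the serving code, and the offline store are all untouched, this kind of attack is invisible to model-integrity checks and only surfaces through feature-level auditing.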
Cache poisoning. If the online store caches feature values, poisoning the cache causes incorrect features to be served until the cache is refreshed. The offline store may contain correct values, but the online store serves poisoned ones.
Availability attacks. Taking the online store offline forces models to either fail (if features are required) or fall back to default values (if configured). Either outcome degrades model quality and may create exploitable behavior.
Offline Store Attacks
The offline store contains historical feature data used for training. Poisoning the offline store affects future model training.
Historical data modification. Modifying historical feature values changes the training data for future models. Because feature stores maintain time-series data, an attacker can modify specific time windows to target particular training periods.
Schema manipulation. Changing feature schemas (data types, value ranges, default values) can cause training pipeline failures or subtle data corruption that is not caught by type checks.
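A hedged sketch of how a default-value change slips past type checks; the schema structure and feature name are illustrative, not any platform's registry format:

```python
# Schema manipulation: the default value is swapped for another value of
# the same type. Type checks still pass, but every entity missing the
# feature now materializes a corrupted signal.

schema = {"credit_utilization": {"dtype": float, "default": 0.0}}

def materialize(raw_row, schema):
    out = {}
    for name, spec in schema.items():
        value = raw_row.get(name, spec["default"])
        assert isinstance(value, spec["dtype"])  # passes before and after the attack
        out[name] = value
    return out

print(materialize({}, schema))  # {'credit_utilization': 0.0}

# Attacker edits only the default -- same dtype, so no check fires.
schema["credit_utilization"]["default"] = 1.0
print(materialize({}, schema))  # {'credit_utilization': 1.0}
```

Catching this requires checks on value semantics (distributions, ranges, change review on schema definitions), not just types.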
Feature Transformation Attacks
Feature transformations convert raw data into model-ready features. Compromising transformations affects all downstream consumers.
Logic modification. Changing the transformation logic (e.g., altering a normalization formula, changing a categorical encoding, or modifying a feature derivation) affects all features computed by that transformation.
Dependency injection. Feature transformations may import external libraries. Compromising these libraries compromises the transformation pipeline.
Training-Serving Skew as Attack Vector
Training-serving skew is normally a bug. An attacker can weaponize it by intentionally creating skew.
Intentional Skew Attacks
| Technique | Mechanism | Effect |
|---|---|---|
| Offline-only poisoning | Modify offline store but not online store | Model trains on poisoned data but serves with clean features; model performance degrades unpredictably |
| Online-only poisoning | Modify online store but not offline store | Model trained on clean data receives unexpected feature distributions at inference; prediction quality drops |
| Selective poisoning | Poison features for specific entities only | Targeted prediction manipulation for specific users or items |
| Temporal poisoning | Modify features for specific time windows | Future training runs that include the poisoned window produce compromised models |
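The online-only poisoning row can be sketched with a toy model: a classifier fit on clean offline features collapses when the online store serves a shifted distribution. The mean-threshold "model" and the +3 shift are illustrative stand-ins:

```python
import statistics

# Online-only poisoning: the model is fit on clean offline data, then
# served features from a poisoned online store. The toy classifier
# thresholds on the training mean.

clean_training = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]  # two clusters
labels =         [0,   0,   0,   0,   1,   1,   1,   1]

threshold = statistics.mean(clean_training)  # ~3.05, separates the clusters

def predict(x):
    return 1 if x > threshold else 0

# Clean serving: predictions match the labels exactly.
assert [predict(x) for x in clean_training] == labels

# Poisoned online store shifts every value by +3: all class-0 entities
# now land above the threshold and are misclassified as class 1.
poisoned_serving = [x + 3.0 for x in clean_training]
print([predict(x) for x in poisoned_serving])  # [1, 1, 1, 1, 1, 1, 1, 1]
```

Note that retraining does not fix this: the offline store is clean, so the retrained model inherits the same threshold and keeps misclassifying until the online store itself is repaired.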
Data Protection Concerns
PII in Features
Feature stores commonly contain personally identifiable information:
- User demographics used as model features
- Transaction histories
- Location data
- Behavioral profiles
- Text embeddings that can be reversed to recover original content
A compromised feature store leaks PII for every entity in the store, which may include all users of the ML system.
Cross-Team Data Exposure
Feature stores are designed for feature sharing across teams. This creates data exposure risks:
- Team A creates features from sensitive data and shares them
- Team B consumes the features without understanding the sensitivity
- The feature store becomes a side channel for accessing data that Team B should not have
Data Residency and Compliance
Feature stores that replicate data across regions for latency (online store replication) may violate data residency requirements. Features computed from EU user data replicated to a US online store may violate GDPR data transfer restrictions.
Defensive Recommendations
Access Control
- Implement feature-level access control, not just store-level
- Use separate service accounts for read and write operations
- Restrict feature discovery (registry access) to authorized teams
- Audit who accesses which features and how often
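Feature-level access control can be approximated even on platforms that lack it natively by putting a policy check in front of the serving API. A minimal sketch; the policy table, team names, and feature names are all hypothetical:

```python
# Feature-level access control: permissions are granted per
# (principal, feature) rather than per store. The policy table is
# illustrative, not any platform's API.

POLICY = {
    "fraud-team": {"txn_amount_30d", "chargeback_count"},
    "recs-team": {"clicks_7d"},
}

def authorize(principal, requested_features):
    """Raise PermissionError unless every requested feature is allowed."""
    allowed = POLICY.get(principal, set())
    denied = set(requested_features) - allowed
    if denied:
        raise PermissionError(f"{principal} lacks access to: {sorted(denied)}")
    return True

authorize("fraud-team", ["chargeback_count"])        # allowed
try:
    authorize("recs-team", ["chargeback_count"])     # denied
except PermissionError as e:
    print(e)
```

The check must sit in the serving path itself; a policy enforced only in client libraries can be bypassed by talking to the store directly.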
Data Integrity
- Hash feature values during materialization and verify at serving time
- Monitor feature distributions for unexpected shifts
- Implement data quality checks in the transformation pipeline
- Maintain immutable audit logs of all feature modifications
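The hash-and-verify recommendation can be sketched with an HMAC: materialization signs each feature row, and the serving path verifies the tag before using the values, so an online-store write made without the signing key is detected. The key handling here is illustrative; in practice the key lives in a secrets manager and is held only by the materialization and serving services:

```python
import hashlib
import hmac
import json

# Sign feature rows at materialization time, verify at serving time.
# SECRET is a placeholder; a real deployment fetches it from a secrets
# manager rather than embedding it in code.

SECRET = b"materialization-signing-key"

def sign(entity_id, features):
    """Deterministic HMAC over the entity ID and its feature values."""
    payload = json.dumps([entity_id, features], sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def verify(entity_id, features, tag):
    return hmac.compare_digest(sign(entity_id, features), tag)

# Materialization writes the row and its tag into the online store.
row = {"txn_count_7d": 3.0}
tag = sign("user_1", row)
assert verify("user_1", row, tag)

# An attacker who edits the value without the key fails verification.
row["txn_count_7d"] = 95.0
print(verify("user_1", row, tag))  # False
```

This converts silent online-store tampering into a detectable serving-time failure, at the cost of one HMAC computation per lookup and a stored tag per row.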
Network Security
- Isolate the online store from direct external access
- Use private endpoints for feature store APIs
- Restrict materialization pipeline network access
- Monitor for unusual feature access patterns
Compliance
- Classify features by sensitivity (PII, financial, health)
- Implement data masking for sensitive features in non-production environments
- Track feature lineage back to source data for compliance auditing
- Enforce data residency requirements in online store replication
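Sensitivity classification and non-production masking can be combined: features tagged as PII are replaced with a deterministic token outside production, so joins and cardinality are preserved while raw values never leave the production environment. The classification table and feature names below are hypothetical:

```python
import hashlib

# Sensitivity-based masking for non-production environments. Features
# classified as PII are replaced with a deterministic token; other
# features pass through unchanged. Classifications are illustrative.

SENSITIVITY = {"home_zip": "pii", "txn_count_7d": "internal"}

def mask_row(row, environment):
    """Return the row as-is in prod; tokenize PII features elsewhere."""
    if environment == "prod":
        return dict(row)
    masked = {}
    for name, value in row.items():
        if SENSITIVITY.get(name) == "pii":
            # Deterministic token: equal inputs map to equal tokens,
            # so joins across tables still line up.
            masked[name] = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        else:
            masked[name] = value
    return masked

row = {"home_zip": "94105", "txn_count_7d": 3.0}
print(mask_row(row, "staging"))  # home_zip replaced by a token
print(mask_row(row, "prod"))     # unchanged
```

A plain hash of low-entropy values like ZIP codes is reversible by enumeration; a keyed hash or a tokenization service is the stronger choice when the masked values must resist brute force.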
Related Topics
- Feature Poisoning -- Specific feature poisoning attack techniques
- Feature Store Access Control -- Detailed access control strategies
An attacker modifies feature values in the online store (used for inference) but not the offline store (used for training). What type of attack is this, and what is the expected impact?