MLflow Attack Surface
Security analysis of MLflow: tracking server authentication weaknesses, artifact store access control, model registry tampering, SQL injection in tracking queries, and exploitation techniques for both open-source and managed deployments.
MLflow is the most widely deployed open-source ML lifecycle platform, with components for experiment tracking, model packaging, model registry, and deployment. Its open-source roots mean that security was retrofitted rather than designed in. The default MLflow deployment has no authentication, no authorization, and no encryption -- and many production deployments retain these defaults.
Tracking Server Authentication
The Default: No Authentication
Out of the box, open-source MLflow runs with no authentication. The tracking server exposes a REST API on port 5000 that accepts any request without credentials. This means any client with network access can:
- Read all experiment metadata, parameters, metrics, and tags
- Download all logged artifacts (models, datasets, code)
- Create, modify, or delete experiments and runs
- Register models in the Model Registry
- Transition model versions through deployment stages
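The impact is straightforward to demonstrate. The sketch below, using only the Python standard library, checks whether a tracking server answers the experiments-search endpoint without credentials. The host `target-mlflow:5000` is a placeholder, and the sketch assumes the GET form of the 2.x search endpoint:

```python
import json
import urllib.error
import urllib.request


def search_experiments_url(base_url):
    """Build the experiments-search endpoint URL from a tracking URI."""
    return base_url.rstrip("/") + "/api/2.0/mlflow/experiments/search?max_results=100"


def probe_unauthenticated(base_url):
    """Return experiment names if the server answers without credentials,
    or None if the request is rejected or the host is unreachable."""
    try:
        with urllib.request.urlopen(search_experiments_url(base_url), timeout=5) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError):
        return None  # unreachable, or an auth layer rejected the request
    return [exp.get("name", "") for exp in body.get("experiments", [])]
```

Calling `probe_unauthenticated("http://target-mlflow:5000")` against an unprotected instance returns every experiment name; a non-None result is the signal that the deployment accepts anonymous reads.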
Authentication Options and Their Weaknesses
| Method | Implementation | Weakness |
|---|---|---|
| Basic auth app | MLflow 2.5+ ships a basic-auth app (mlflow server --app-name basic-auth) | Well-known default admin credentials, passwords in plaintext in client configs, coarse permission model, no SSO or group support |
| Reverse proxy (nginx/Envoy) | HTTP basic auth or OAuth2 proxy in front of MLflow | MLflow itself unaware of identity, no per-user authorization |
| Cloud IAM | Cloud provider identity for managed MLflow | Depends entirely on correct IAM policy configuration |
| Custom middleware | Organization-built authentication layer | Varies; often bypassed by direct API access |
Bypass Techniques
Even when authentication is configured, common bypass vectors include:
Direct database access. MLflow stores tracking data in SQLite (default), MySQL, or PostgreSQL. If the database is accessible (same machine, shared network, exposed port), an attacker can query it directly, bypassing any API-level authentication.
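As an illustration, if the default SQLite backend file (commonly `mlflow.db`) is readable, the standard library alone recovers logged parameters without ever touching the REST API. The sketch assumes the `params` table layout (`key`, `value`, `run_uuid`) used by recent MLflow schema versions:

```python
import sqlite3


def dump_params(db_path):
    """Read logged run parameters straight from the tracking database,
    bypassing any authentication configured on the REST API."""
    conn = sqlite3.connect(db_path)
    try:
        # The params table (run_uuid, key, value) holds every logged parameter
        rows = conn.execute(
            "SELECT run_uuid, key, value FROM params ORDER BY run_uuid"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

The same approach extends to the `metrics`, `tags`, and `registered_models` tables; any file-level or database-level read access yields the full tracking history.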
Artifact store direct access. Artifacts are stored separately from tracking metadata -- on local filesystem, S3, GCS, Azure Blob, or HDFS. If the artifact store is accessible independently (a public S3 bucket, an NFS mount), API authentication is irrelevant for artifact access.
Internal network exposure. Teams deploy MLflow on internal networks assuming network isolation provides security. VPN compromise, SSRF vulnerabilities in other applications, or a compromised developer machine all provide network access.
Artifact Store Access Control
MLflow separates metadata (stored in the tracking database) from artifacts (stored in configurable blob storage). This separation creates a split-brain access control problem.
Common Artifact Store Configurations
| Store Type | Default Permissions | Attack Vector |
|---|---|---|
| Local filesystem | OS file permissions | Path traversal, symlink attacks |
| S3 | IAM-based | Overly permissive bucket policies, SSRF for credential theft |
| GCS | IAM-based | Service account key exposure, uniform bucket-level access misconfig |
| Azure Blob | RBAC or SAS tokens | SAS token leakage, overly permissive container access |
| HDFS | Kerberos or none | Often no authentication in development clusters |
S3 Artifact Store Exploitation
The most common production configuration uses S3 for artifact storage. Typical weaknesses:
1. Identify the S3 bucket. The MLflow tracking server configuration or API responses reveal the artifact store URI (e.g., `s3://company-mlflow-artifacts/`).
2. Test direct bucket access. Attempt to list and read objects using the attacker's own AWS credentials, or credentials obtained from a compromised EC2 instance's metadata service. Organizations frequently grant overly broad S3 access to the roles used by training jobs.
3. Download model artifacts. Model checkpoints, datasets, and configuration files are stored as plain S3 objects, so direct download bypasses any MLflow API-level access controls.
4. Upload poisoned artifacts. If write access exists, replace model checkpoints with poisoned versions. The MLflow tracking metadata still points to the same S3 path, so the substitution is invisible through the MLflow UI.
SQL Injection in Tracking Queries
MLflow's tracking API accepts search queries using a filter string syntax for experiments and runs. The implementation translates these filter strings into SQL queries against the backing database.
Vulnerable Endpoints
The /api/2.0/mlflow/runs/search and /api/2.0/mlflow/experiments/search endpoints accept a filter_string parameter. While MLflow uses parameterized queries for most operations, custom filter expressions in certain configurations and versions have been vulnerable to injection.
Attack Patterns
Filter string manipulation. The filter syntax allows expressions like metrics.accuracy > 0.9 and params.model_name = 'gpt'. Injecting SQL through crafted parameter names or values can extract data beyond the intended query scope.
Tag-based injection. Run tags accept arbitrary string values. If tag values are not properly sanitized when used in filter queries, they become an injection vector.
```python
import mlflow

# Set tracking URI to target
mlflow.set_tracking_uri("http://target-mlflow:5000")

# Test for SQL injection in filter string
# These are detection payloads, not exploitation
test_payloads = [
    "params.name = '1' OR '1'='1'",
    "params.name = '1'; SELECT 1--",
    "metrics.accuracy > 0 UNION SELECT 1,2,3--",
    "tags.mlflow.runName = '1' OR 1=1--",
]

for payload in test_payloads:
    try:
        runs = mlflow.search_runs(
            experiment_ids=["0"],
            filter_string=payload,
        )
        print(f"Payload returned {len(runs)} results: {payload}")
    except Exception as e:
        print(f"Error with payload: {e}")
```

Model Registry Tampering
The MLflow Model Registry manages model versions and their deployment lifecycle stages: None, Staging, Production, and Archived. Tampering with the registry is a direct path to deploying a malicious model.
Stage Transition Attacks
Model versions move through stages via API calls. Without authentication, any client can:
- Transition a malicious model version to "Production" stage
- Transition the legitimate production model to "Archived"
- Modify model version descriptions to hide the change
- Delete model version tags that might trigger review
Version Confusion
The registry assigns incrementing version numbers. An attacker can:
- Register a new model version with poisoned weights
- If pipelines reference "latest version" or "Production stage," the poisoned version is served
- The version history shows the new version as the latest legitimate addition
Model Name Squatting
MLflow's registry uses flat model names within a namespace. An attacker with write access can:
- Register models with names similar to legitimate models (typosquatting)
- Register models with names that deployment pipelines might reference
- Overwrite model descriptions to redirect users to malicious resources
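The squatting risk can be made concrete with a small generator for near-miss names. Defenders can run the same generator against their registry to flag suspicious registrations; the transformations below are illustrative, not exhaustive:

```python
def typosquat_candidates(name):
    """Generate near-miss variants of a registered model name that an
    attacker might squat on. Illustrative transformations only."""
    variants = set()
    # Swapped separator characters: "prod-llm" vs "prod_llm"
    variants.add(name.replace("-", "_"))
    variants.add(name.replace("_", "-"))
    # Dropped single characters: "prod-llm" vs "prodllm"
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])
    # Adjacent character swaps: "prod-llm" vs "rpod-llm"
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    variants.discard(name)
    return sorted(variants)
```

Comparing this candidate list against the output of `search_registered_models()` surfaces look-alike registrations before a deployment pipeline resolves the wrong name.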
```python
from mlflow.tracking import MlflowClient

client = MlflowClient("http://target-mlflow:5000")

# List all registered models
for model in client.search_registered_models():
    print(f"Model: {model.name}")
    for version in model.latest_versions:
        print(f"  v{version.version} | Stage: {version.current_stage}")

# Transition attacker's model to Production
client.transition_model_version_stage(
    name="production-llm",
    version="42",                    # Attacker's poisoned version
    stage="Production",
    archive_existing_versions=True,  # Archives the legitimate version
)
```

Managed MLflow Security Differences
Databricks Managed MLflow
Databricks provides MLflow as a managed service with significant security improvements over open-source:
- Unity Catalog integration for fine-grained access control
- Workspace-level isolation
- Audit logging for all operations
- Network access controls (private endpoints, IP allowlists)
Remaining risks: Unity Catalog misconfigurations, overly permissive workspace access grants, service principal key leakage, cross-workspace access through shared storage.
Azure ML MLflow Integration
Azure ML provides an MLflow-compatible API with Azure Active Directory authentication:
- AAD-based identity and RBAC
- Managed identity for compute-to-tracking authentication
- Virtual network integration
Remaining risks: RBAC role assignments that are too broad, managed identity over-permissioning, artifact storage access through direct Azure Storage APIs bypassing ML workspace controls.
Reconnaissance and Enumeration
Identifying MLflow Instances
MLflow tracking servers can be discovered through:
- Port scanning for 5000/tcp (default port)
- HTTP response headers identifying MLflow
- API endpoint probing (`/api/2.0/mlflow/experiments/list`)
- Cloud metadata services revealing MLflow URIs in environment variables
- CI/CD configuration files referencing `MLFLOW_TRACKING_URI`
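These discovery steps can be scripted. The sketch below probes a handful of well-known MLflow paths and records HTTP status codes; even a 401 or 403 confirms that MLflow (behind some auth layer) is present at the address:

```python
import urllib.error
import urllib.request

# Well-known MLflow paths useful for fingerprinting an exposed instance
FINGERPRINT_PATHS = [
    "/version",
    "/api/2.0/mlflow/experiments/search?max_results=1",
    "/api/2.0/mlflow/registered-models/search?max_results=1",
]


def fingerprint(base_url):
    """Map each probe path to an HTTP status code (None if unreachable)."""
    results = {}
    for path in FINGERPRINT_PATHS:
        url = base_url.rstrip("/") + path
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                results[path] = resp.status
        except urllib.error.HTTPError as e:
            results[path] = e.code  # 401/403 still confirms MLflow is present
        except (urllib.error.URLError, OSError):
            results[path] = None
    return results
```

A result of 200 on `/version` with no credentials is the strongest signal: it reveals the exact MLflow release for CVE matching while confirming anonymous access.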
Information Gathering from an Exposed Instance
| API Endpoint | Information Revealed |
|---|---|
| `/api/2.0/mlflow/experiments/list` | All experiment names and metadata |
| `/api/2.0/mlflow/runs/search` | Run configurations, metrics, parameters |
| `/api/2.0/mlflow/artifacts/list` | Artifact paths and structure |
| `/api/2.0/mlflow/registered-models/list` | Production model registry contents |
| `/version` | MLflow version (for CVE matching) |
Hardening Recommendations
Immediate Actions
- Enable authentication using the built-in auth plugin or a reverse proxy
- Restrict network access to the tracking server and artifact store
- Audit artifact store permissions -- ensure S3/GCS/Azure Blob access is scoped appropriately
- Update MLflow to the latest version to address known vulnerabilities
- Enable TLS for all tracking server communications
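Once the basic-auth app is enabled server-side, clients authenticate through MLflow's documented `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` environment variables. A minimal client-side sketch follows; the hostname and credentials are placeholders and should come from a secrets manager, never source code:

```python
import os


def configure_tracking_auth(username, password, tracking_uri):
    """Point the MLflow client at an authenticated tracking server.
    MLflow reads these environment variables when making API calls."""
    os.environ["MLFLOW_TRACKING_URI"] = tracking_uri
    os.environ["MLFLOW_TRACKING_USERNAME"] = username
    os.environ["MLFLOW_TRACKING_PASSWORD"] = password


# Placeholder values for illustration -- load real credentials from a vault
configure_tracking_auth("ml-engineer", "example-password", "https://mlflow.internal:5000")
```

Pairing this with a TLS-only tracking URI keeps the basic-auth credentials off the wire in plaintext, which is otherwise a notable weakness of this scheme.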
Ongoing Security
- Implement audit logging for all API operations
- Monitor for unusual query patterns (bulk data extraction, filter string anomalies)
- Regularly review model registry stage transitions
- Scan experiment logs for credential leakage
- Separate development and production MLflow instances
References
- MLflow Security Documentation -- Authentication plugin documentation
- MLflow REST API Reference -- Full API specification
- CVE entries for MLflow -- Known vulnerabilities
An organization deploys MLflow with nginx basic authentication in front of the tracking server. The artifact store uses S3 with a broadly permissive IAM policy. Why does the nginx authentication not fully protect the ML assets?