MLflow Attack Surface

Advanced | 10 min read | Updated 2026-03-15

Security analysis of MLflow: tracking server authentication weaknesses, artifact store access control, model registry tampering, SQL injection in tracking queries, and exploitation techniques for both open-source and managed deployments.


MLflow is the most widely deployed open-source ML lifecycle platform, with components for experiment tracking, model packaging, model registry, and deployment. Its open-source roots mean that security was retrofitted rather than designed in. The default MLflow deployment has no authentication, no authorization, and no encryption -- and many production deployments retain these defaults.

Tracking Server Authentication

The Default: No Authentication

Out of the box, open-source MLflow enables no authentication mechanism. The tracking server exposes a REST API on port 5000 that accepts any request without credentials. This means any client with network access can:

Read all experiment metadata, parameters, metrics, and tags
Download all logged artifacts (models, datasets, code)
Create, modify, or delete experiments and runs
Register models in the Model Registry
Transition model versions through deployment stages
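
One way to confirm whether a deployment you operate still has this default is to send a single unauthenticated request and look at the status code. The sketch below is a minimal audit helper, not part of MLflow. The names `classify_status` and `probe_tracking_server` are hypothetical, and it assumes an MLflow 2.x server where `/api/2.0/mlflow/experiments/search` accepts a POST with a JSON body.

```python
import json
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse authentication verdict."""
    if status == 200:
        return "open"            # request succeeded without credentials
    if status in (401, 403):
        return "auth-required"   # something in front of the API rejected us
    return "inconclusive"        # redirects, 404s, 5xx: needs manual review


def probe_tracking_server(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated search request and classify the result."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/2.0/mlflow/experiments/search",
        data=json.dumps({"max_results": 1}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as error:
        # HTTPError carries the status code of a non-2xx response.
        return classify_status(error.code)
    except OSError:
        # Covers URLError, connection refusals, and timeouts.
        return "unreachable"
```

A result of "open" means the five actions listed above are available to anyone who can reach the port. A result of "auth-required" only shows that the API path is protected; it says nothing about the database or artifact store behind it.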

Authentication Options and Their Weaknesses

Method	Implementation	Weakness
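Built-in basic auth	MLflow 2.5+ ships a basic-auth app, enabled with `mlflow server --app-name basic-auth`	HTTP basic auth only, default admin credentials if left unchanged, coarse per-experiment and per-model permissions, credentials often stored in plaintext in client configs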
Reverse proxy (nginx/Envoy)	HTTP basic auth or OAuth2 proxy in front of MLflow	MLflow itself unaware of identity, no per-user authorization
Cloud IAM	Cloud provider identity for managed MLflow	Depends entirely on correct IAM policy configuration
Custom middleware	Organization-built authentication layer	Varies; often bypassed by direct API access

Bypass Techniques

Even when authentication is configured, common bypass vectors include:

Direct database access. MLflow stores tracking data in SQLite (default), MySQL, or PostgreSQL. If the database is accessible (same machine, shared network, exposed port), an attacker can query it directly, bypassing any API-level authentication.

Artifact store direct access. Artifacts are stored separately from tracking metadata -- on local filesystem, S3, GCS, Azure Blob, or HDFS. If the artifact store is accessible independently (a public S3 bucket, an NFS mount), API authentication is irrelevant for artifact access.

Internal network exposure. Teams deploy MLflow on internal networks assuming network isolation provides security. VPN compromise, SSRF vulnerabilities in other applications, or a compromised developer machine all provide network access.

Artifact Store Access Control

MLflow separates metadata (stored in the tracking database) from artifacts (stored in configurable blob storage). This separation creates a split-brain access control problem: a policy enforced on one side does nothing for the other.

Common Artifact Store Configurations

Store Type	Default Permissions	Attack Vector
Local filesystem	OS file permissions	Path traversal, symlink attacks
S3	IAM-based	Overly permissive bucket policies, SSRF for credential theft
GCS	IAM-based	Service account key exposure, uniform bucket-level access misconfig
Azure Blob	RBAC or SAS token	SAS token leakage, overly permissive container access
HDFS	Kerberos or none	Often no authentication in development clusters
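
For the S3 row, the most basic check is whether the bucket policy grants access to a wildcard principal. The following is a rough sketch under simplifying assumptions: it only inspects a policy document already exported as JSON, it treats any `Condition` block as sufficient restriction, and it ignores ACLs and account-level Block Public Access settings. The function name `find_public_statements` is hypothetical.

```python
import json


def _is_wildcard_principal(principal) -> bool:
    """Return True if the Principal element matches every caller."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        for value in principal.values():
            if value == "*" or (isinstance(value, list) and "*" in value):
                return True
    return False


def find_public_statements(policy_json: str) -> list:
    """Return Allow statements that apply to any principal without conditions."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list.
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        if "Condition" in statement:
            continue  # simplification: conditions are not evaluated here
        if _is_wildcard_principal(statement.get("Principal")):
            flagged.append(statement)
    return flagged
```

A statement that survives this filter is worth manual review even if it only allows `s3:GetObject`, since read access alone exposes every logged checkpoint and dataset.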

S3 Artifact Store Exploitation

The most common production configuration uses S3 for artifact storage. A typical attack path has four steps:

Identify the S3 bucket. The MLflow tracking server configuration or an API response reveals the artifact store URI (e.g., s3://company-mlflow-artifacts/).
Test direct bucket access. Attempt to list and read objects using the attacker's own AWS credentials or credentials obtained from a compromised EC2 instance's metadata service. Organizations frequently grant overly broad S3 access to roles used by training jobs.
Download model artifacts. Model checkpoints, datasets, and configuration files are stored as S3 objects. Direct download bypasses any MLflow API-level access controls.
Upload poisoned artifacts. If write access exists, replace model checkpoints with poisoned versions. The MLflow tracking metadata still points to the same S3 path, so the substitution is invisible through the MLflow UI.
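
Because the tracking metadata does not change when an object is overwritten, detecting the last step requires an integrity record kept outside the bucket. A minimal sketch, assuming artifacts have been downloaded to a local directory and that a manifest of SHA-256 digests was recorded at training time and stored somewhere the bucket's writers cannot reach; the function names are hypothetical and MLflow does not provide this check itself.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(expected: dict, actual: dict) -> dict:
    """Compare a trusted manifest against one computed from the store."""
    shared = expected.keys() & actual.keys()
    return {
        "modified": sorted(k for k in shared if expected[k] != actual[k]),
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
    }
```

A deployment pipeline could call `build_manifest` on the downloaded model directory and refuse to load the model if `diff_manifests` reports anything under "modified" or "unexpected".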

SQL Injection in Tracking Queries

MLflow's tracking API accepts search queries using a filter string syntax for experiments and runs. The implementation translates these filter strings into SQL queries against the backing database.

Vulnerable Endpoints

The /api/2.0/mlflow/runs/search and /api/2.0/mlflow/experiments/search endpoints accept a filter_string parameter. While MLflow uses parameterized queries for most operations, custom filter expressions in certain configurations and versions have been vulnerable to injection.

Attack Patterns

Filter string manipulation. The filter syntax allows expressions like metrics.accuracy > 0.9 and params.model_name = 'gpt'. Injecting SQL through crafted parameter names or values can extract data beyond the intended query scope.

Tag-based injection. Run tags accept arbitrary string values. If tag values are not properly sanitized when used in filter queries, they become an injection vector.

import mlflow
 
# Set tracking URI to target
mlflow.set_tracking_uri("http://target-mlflow:5000")
 
# Test for SQL injection in the filter string
# These are detection payloads, not exploits
test_payloads = [
    "params.name = '1' OR '1'='1'",
    "params.name = '1'; SELECT 1--",
    "metrics.accuracy > 0 UNION SELECT 1,2,3--",
    "tags.mlflow.runName = '1' OR 1=1--"
]
 
for payload in test_payloads:
    try:
        runs = mlflow.search_runs(
            experiment_ids=["0"],
            filter_string=payload
        )
        print(f"Payload returned {len(runs)} results: {payload}")
    except Exception as e:
        print(f"Error with payload: {e}")
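
On the defensive side, a gateway or wrapper in front of the search endpoints can reject filter strings that fall outside the simple `entity.key operator literal` form before they reach MLflow. The sketch below is a deliberately conservative allowlist, not MLflow's own parser: it assumes clauses are joined only by `AND`, and it will reject some legitimate filters (for example, string values that themselves contain " and "). The name `is_safe_filter` is hypothetical.

```python
import re

# One clause: <entity>.<key> <operator> <number or single-quoted string>
_KEY = r"(?:metrics|params|tags|attributes)\.(?:[A-Za-z0-9_.\-]+|`[^`]+`)"
_OPERATOR = r"(?:!=|<=|>=|=|<|>|LIKE|ILIKE)"
_VALUE = r"(?:-?\d+(?:\.\d+)?|'[^'\\;]*')"
_CLAUSE = re.compile(
    r"^\s*" + _KEY + r"\s*" + _OPERATOR + r"\s*" + _VALUE + r"\s*$",
    re.IGNORECASE,
)


def is_safe_filter(filter_string: str) -> bool:
    """Return True only if every AND-joined clause matches the allowlist."""
    clauses = re.split(r"\s+AND\s+", filter_string, flags=re.IGNORECASE)
    return all(_CLAUSE.match(clause) is not None for clause in clauses)
```

Each of the four detection payloads above fails this check because text follows the first literal, while the ordinary filter `metrics.accuracy > 0.9 and params.model_name = 'gpt'` passes.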

Model Registry Tampering

The MLflow Model Registry manages model versions and their deployment lifecycle stages: None, Staging, Production, and Archived. Tampering with the registry is a direct path to deploying a malicious model.

Stage Transition Attacks

Model versions move through stages via API calls. Without authentication, any client can:

Transition a malicious model version to "Production" stage
Transition the legitimate production model to "Archived"
Modify model version descriptions to hide the change
Delete model version tags that might trigger review

Version Confusion

The registry assigns incrementing version numbers. An attacker can:

Register a new model version with poisoned weights
If pipelines reference "latest version" or "Production stage," the poisoned version is served
The version history shows the new version as the latest legitimate addition
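
Version confusion depends on consumers resolving a floating reference at load time. MLflow model URIs can name an explicit version (`models:/<name>/<version>`) instead of a stage, `latest`, or an alias, so a pipeline can be checked for floating references before it runs. A small sketch, with hypothetical helper names:

```python
import re

# Matches only URIs that name an explicit numeric version.
_PINNED_URI = re.compile(r"^models:/(?P<name>[^/@]+)/(?P<version>\d+)$")


def pinned_model_uri(name: str, version: int) -> str:
    """Build a model URI that always resolves to the same registry version."""
    return f"models:/{name}/{int(version)}"


def is_pinned(uri: str) -> bool:
    """Return True if the URI names a numeric version rather than a floating ref."""
    return _PINNED_URI.match(uri) is not None
```

Pinning stops a newly registered version from being picked up silently, but it does not protect against artifacts being overwritten in the store, so it works best alongside an artifact integrity check.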

Model Name Squatting

MLflow's registry uses flat model names within a namespace. An attacker with write access can:

Register models with names similar to legitimate models (typosquatting)
Register models with names that deployment pipelines might reference
Overwrite model descriptions to redirect users to malicious resources

from mlflow.tracking import MlflowClient
 
client = MlflowClient("http://target-mlflow:5000")
 
# List all registered models
for model in client.search_registered_models():
    print(f"Model: {model.name}")
    for version in model.latest_versions:
        print(f"  v{version.version} | Stage: {version.current_stage}")
 
# Transition the attacker's model to Production
client.transition_model_version_stage(
    name="production-llm",
    version="42",  # Attacker's poisoned version
    stage="Production",
    archive_existing_versions=True  # Archives the legitimate version
)
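
A transition like the one above leaves the registry in a state that looks routine, so detection relies on comparing the registry against a previously recorded state. The sketch below diffs two snapshots of `(model name, version) -> stage`. It is an illustration under assumptions: `snapshot_registry` and `diff_snapshots` are hypothetical names, the snapshot helper omits pagination, and MLflow is imported lazily so the comparison logic runs without it installed.

```python
def snapshot_registry(tracking_uri: str) -> dict:
    """Record the current stage of every model version (pagination omitted)."""
    from mlflow.tracking import MlflowClient  # lazy import: only needed here

    client = MlflowClient(tracking_uri)
    snapshot = {}
    for model in client.search_registered_models():
        for mv in client.search_model_versions(f"name='{model.name}'"):
            snapshot[(mv.name, str(mv.version))] = mv.current_stage
    return snapshot


def diff_snapshots(previous: dict, current: dict) -> list:
    """List (name, version, old_stage, new_stage) for new or changed versions.

    old_stage is None when the version did not exist in the previous snapshot.
    """
    changes = []
    for key in sorted(current):
        name, version = key
        new_stage = current[key]
        if key not in previous:
            changes.append((name, version, None, new_stage))
        elif previous[key] != new_stage:
            changes.append((name, version, previous[key], new_stage))
    return changes
```

Run on a schedule, the diff surfaces both halves of the attack: a new version appearing directly in Production and the legitimate version moving to Archived.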

Managed MLflow Security Differences

Databricks Managed MLflow

Databricks provides MLflow as a managed service with significant security improvements over open-source:

Unity Catalog integration for fine-grained access control
Workspace-level isolation
Audit logging for all operations
Network access controls (private endpoints, IP allowlists)

Remaining risks: Unity Catalog misconfigurations, overly permissive workspace access grants, service principal key leakage, cross-workspace access through shared storage.

Azure ML MLflow Integration

Azure ML provides an MLflow-compatible API with Azure Active Directory authentication:

AAD-based identity and RBAC
Managed identity for compute-to-tracking authentication
Virtual network integration

Remaining risks: RBAC role assignments that are too broad, managed identity over-permissioning, artifact storage access through direct Azure Storage APIs bypassing ML workspace controls.

Reconnaissance and Enumeration

Identifying MLflow Instances

MLflow tracking servers can be discovered through:

Port scanning for 5000/tcp (default port)
HTTP response headers identifying MLflow
API endpoint probing (/api/2.0/mlflow/experiments/search on 2.x servers, /api/2.0/mlflow/experiments/list on 1.x)
Cloud metadata services revealing MLflow URIs in environment variables
CI/CD configuration files referencing MLFLOW_TRACKING_URI

Information Gathering from an Exposed Instance

API Endpoint	Information Revealed
`/api/2.0/mlflow/experiments/search` (`experiments/list` on 1.x)	All experiment names and metadata
`/api/2.0/mlflow/runs/search`	Run configurations, metrics, parameters
`/api/2.0/mlflow/artifacts/list`	Artifact paths and structure
`/api/2.0/mlflow/registered-models/search` (`registered-models/list` on 1.x)	Registered models, versions, and stages
`/version`	MLflow version (for CVE matching)
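
The version string is useful to defenders for the same reason it is useful to attackers: it can be compared against the minimum patched release named in an advisory. A minimal sketch of that comparison follows. The helper names are hypothetical, pre-release suffixes are ignored, and the minimum version is an input you take from the relevant advisory rather than anything encoded here.

```python
import re


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from a version string such as '2.9.2'."""
    match = re.match(r"\s*(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def needs_upgrade(reported: str, minimum_patched: str) -> bool:
    """Return True if the reported version is older than the patched one."""
    return parse_version(reported) < parse_version(minimum_patched)
```

Comparing integer tuples avoids the string-comparison trap where "2.10.0" sorts before "2.9.2".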

Hardening Recommendations

Immediate Actions

Enable authentication using the built-in basic auth or a reverse proxy
Restrict network access to the tracking server and artifact store
Audit artifact store permissions -- ensure S3/GCS/Azure Blob access is scoped appropriately
Update MLflow to the latest version to address known vulnerabilities
Enable TLS for all tracking server communications

Ongoing Security

Implement audit logging for all API operations
Monitor for unusual query patterns (bulk data extraction, filter string anomalies)
Regularly review model registry stage transitions
Scan experiment logs for credential leakage
Separate development and production MLflow instances
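
Scanning for credential leakage can start with a pattern match over logged parameters and tags. The sketch below checks a plain dictionary of key/value strings, which is an assumption about how you export run data (for example, from the `params.*` and `tags.*` columns that `mlflow.search_runs` returns). The patterns are illustrative, not exhaustive, and `scan_values` is a hypothetical name.

```python
import re

# Illustrative patterns only; a real scanner would use a maintained rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password_assignment": re.compile(
        r"(?:password|passwd|secret|api[_-]?key)\w*\s*[=:]\s*\S+",
        re.IGNORECASE,
    ),
}


def scan_values(values: dict) -> list:
    """Return (key, pattern_label) pairs for entries that look like secrets."""
    findings = []
    for key, value in values.items():
        # Join key and value so names like 'db_password' are caught too.
        candidate = f"{key}={value}"
        for label, pattern in PATTERNS.items():
            if pattern.search(candidate):
                findings.append((key, label))
    return findings
```

For example, `scan_values({"learning_rate": "0.01", "db_password": "hunter2"})` flags only the second entry. Findings should lead to rotating the credential, since anything logged to an exposed tracking server has to be treated as disclosed.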

Knowledge Check

An organization deploys MLflow with nginx basic authentication in front of the tracking server. The artifact store uses S3 with a broadly permissive IAM policy. Why does the nginx authentication not fully protect the ML assets?
