MLflow Security Hardening · Beginner · 12 min read · Updated 2026-03-20. Securing MLflow deployments against unauthorized access, experiment tampering, and model registry poisoning.
MLflow is the most widely adopted open-source platform for managing the machine learning lifecycle. It provides experiment tracking, model packaging, a model registry, and deployment tools. Organizations use MLflow to track training runs, compare model performance, store model artifacts, and manage model versions from development through production.
The security problem with MLflow is that it was designed as a data science productivity tool, not as a security-critical system. The default deployment has no authentication, no authorization, and exposes a REST API that allows anyone with network access to read all experiments, modify model artifacts, register new models, and transition models to production. This is not a hypothetical concern: Protect AI's huntr bug bounty program has documented multiple critical vulnerabilities in MLflow, and internet-facing MLflow instances with no authentication remain common.
This article covers the attack surface of MLflow deployments, the specific hardening steps required to secure them, and red-team techniques for assessing MLflow security. The vulnerabilities described here map to OWASP LLM Top 10 2025 LLM06 (Excessive Agency) when MLflow is integrated into automated deployment pipelines, and to MITRE ATLAS AML.T0010 (ML Supply Chain Compromise).
MLflow consists of several components, each with its own attack surface:
Component | Purpose | Default Exposure | Risk
Tracking Server | Records experiment parameters, metrics, artifacts | HTTP API on port 5000 | Unauthenticated read/write
Model Registry | Stores and versions trained models | Via Tracking Server API | Model replacement/poisoning
Artifact Store | Stores model files, datasets, logs | S3, GCS, Azure Blob, or local filesystem | Direct access to model files
Backend Store | Metadata database (SQLite, MySQL, PostgreSQL) | Depends on deployment | SQL injection (in older versions)
MLflow UI | Web dashboard for experiment visualization | Same port as Tracking Server | No CSRF protection by default
Out of the box, mlflow server starts with no authentication:
# This is how most MLflow tutorials start: completely open
mlflow server --host 0.0.0.0 --port 5000
# Anyone on the network can now:
# - Read all experiments and runs
# - Modify any experiment data
# - Upload malicious model artifacts
# - Transition any model to "Production" stage
# - Delete experiments and runs
The MLflow REST API is fully functional without any credentials:
import requests
from typing import Dict, List, Any


class MLflowSecurityScanner:
    """Scan an MLflow deployment for security misconfigurations."""

    def __init__(self, mlflow_url: str):
        self.base_url = mlflow_url.rstrip("/")

    def check_authentication(self) -> Dict[str, Any]:
        """Test whether the MLflow API requires authentication."""
        endpoints = [
            "/api/2.0/mlflow/experiments/search",
            "/api/2.0/mlflow/registered-models/search",
            "/api/2.0/mlflow/runs/search",
        ]
        results = {"authenticated": True, "open_endpoints": []}
        for endpoint in endpoints:
            try:
                resp = requests.get(
                    f"{self.base_url}{endpoint}",
                    timeout=10,
                    # No credentials provided
                )
                if resp.status_code == 200:
                    results["authenticated"] = False
                    results["open_endpoints"].append(endpoint)
            except requests.RequestException:
                pass
        return results

    def enumerate_experiments(self) -> List[Dict]:
        """Enumerate all accessible experiments."""
        resp = requests.post(
            f"{self.base_url}/api/2.0/mlflow/experiments/search",
            json={"max_results": 1000},
            timeout=10,
        )
        if resp.status_code == 200:
            return resp.json().get("experiments", [])
        return []

    def enumerate_registered_models(self) -> List[Dict]:
        """Enumerate all registered models in the model registry."""
        resp = requests.get(
            f"{self.base_url}/api/2.0/mlflow/registered-models/search",
            params={"max_results": 1000},
            timeout=10,
        )
        if resp.status_code == 200:
            return resp.json().get("registered_models", [])
        return []

    def check_artifact_access(self, run_id: str) -> Dict[str, Any]:
        """Check whether artifacts can be listed for a given run."""
        resp = requests.get(
            f"{self.base_url}/api/2.0/mlflow/artifacts/list",
            params={"run_id": run_id},
            timeout=10,
        )
        if resp.status_code == 200:
            artifacts = resp.json().get("files", [])
            return {
                "accessible": True,
                "artifact_count": len(artifacts),
                "artifacts": [a.get("path") for a in artifacts[:10]],
            }
        return {"accessible": False}

    def full_scan(self) -> Dict[str, Any]:
        """Run a comprehensive security scan."""
        results = {
            "target": self.base_url,
            "authentication": self.check_authentication(),
        }
        if not results["authentication"]["authenticated"]:
            experiments = self.enumerate_experiments()
            results["experiments"] = {
                "count": len(experiments),
                "names": [e.get("name") for e in experiments[:20]],
            }
            models = self.enumerate_registered_models()
            results["registered_models"] = {
                "count": len(models),
                "names": [m.get("name") for m in models[:20]],
            }
        return results
MLflow 2.5+ includes a built-in authentication plugin. Enable it by starting the server with the --app-name flag:
# Enable basic authentication
mlflow server \
--host 0.0.0.0 \
--port 5000 \
--app-name basic-auth \
--backend-store-uri postgresql://mlflow:password@db:5432/mlflow \
--default-artifact-root s3://mlflow-artifacts/
# The default admin credentials come from the auth config file
# (basic_auth.ini, below); change the admin password immediately after the
# first start, e.g. via the /api/2.0/mlflow/users/update-password endpoint.
Configure authorization with the basic_auth.ini file:
[mlflow]
default_permission = READ
database_uri = sqlite:///basic_auth.db
admin_username = admin
# Placeholder only; set a strong password before the first start
admin_password = changeme
authorization_function = mlflow.server.auth:authenticate_request_basic_auth
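Once basic auth is on, clients authenticate via the MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD environment variables, or plain HTTP basic auth. A quick probe, sketched below, confirms the server now rejects anonymous requests; the host and credentials are placeholders for your deployment:

```python
import requests


def probe_auth(base_url: str, username: str, password: str) -> dict:
    """Hit the experiments/search endpoint with and without credentials
    and report the status code seen in each mode."""
    endpoint = f"{base_url.rstrip('/')}/api/2.0/mlflow/experiments/search"
    anon = requests.post(endpoint, json={"max_results": 1}, timeout=10)
    authed = requests.post(
        endpoint,
        json={"max_results": 1},
        auth=(username, password),
        timeout=10,
    )
    return {"anonymous": anon.status_code, "authenticated": authed.status_code}
```

On a correctly hardened server the anonymous request should come back 401 and the authenticated one 200; any other combination means the auth plugin is not actually enforcing.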
For production deployments, use a reverse proxy (NGINX, Envoy, or a cloud load balancer) with proper authentication:
# /etc/nginx/sites-available/mlflow
# limit_req_zone must live at the http{} level (sites-available files are
# included inside http{}), so define it outside the server{} block.
limit_req_zone $binary_remote_addr zone=mlflow:10m rate=10r/s;

server {
    listen 443 ssl;
    server_name mlflow.internal.company.com;
    ssl_certificate /etc/ssl/certs/mlflow.crt;
    ssl_certificate_key /etc/ssl/private/mlflow.key;

    # Require client certificate authentication
    ssl_client_certificate /etc/ssl/certs/ca.crt;
    ssl_verify_client on;

    location / {
        # Rate limiting
        limit_req zone=mlflow burst=20 nodelay;

        # Forward to MLflow server
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Security headers
        add_header X-Content-Type-Options nosniff;
        add_header X-Frame-Options DENY;
        add_header Content-Security-Policy "default-src 'self'";
    }

    # Block direct access to artifact download endpoints from external networks
    location /api/2.0/mlflow/artifacts/ {
        # Only allow from internal CIDR
        allow 10.0.0.0/8;
        deny all;
        proxy_pass http://127.0.0.1:5000;
    }
}
For organizations using SSO, integrate MLflow with an identity provider:
"""
範例: MLflow 認證 middleware using OAuth2.
Place in a custom MLflow plugin or reverse proxy.
"""
from functools import wraps
from flask import request, jsonify
import requests
from typing import Optional
class OAuth2Middleware :
"""OAuth2 認證 middleware for MLflow."""
def __init__ ( self , issuer_url : str , client_id : str , required_scopes : list ):
self .issuer_url = issuer_url
self .client_id = client_id
self .required_scopes = required_scopes
The MLflow model registry is a critical target because it is often integrated directly into deployment pipelines. If an attacker can register a malicious model or transition a poisoned model to the "Production" stage, that model may be automatically deployed.
import hashlib
from pathlib import Path
from typing import Dict, Optional

import mlflow


class ModelRegistryGuard:
    """Security controls for the MLflow model registry."""

    def __init__(self, tracking_uri: str, allowed_signers: list):
        mlflow.set_tracking_uri(tracking_uri)
        self.allowed_signers = allowed_signers

    def compute_model_hash(self, model_uri: str) -> str:
        """Compute the SHA-256 hash of a registered model's artifacts."""
        local_path = mlflow.artifacts.download_artifacts(model_uri)
        sha256 = hashlib.sha256()
        for file_path in sorted(Path(local_path).rglob("*")):
            if file_path.is_file():
                with open(file_path, "rb") as f:
                    for chunk in iter(lambda: f.read(8192), b""):
                        sha256.update(chunk)
        return sha256.hexdigest()

    def verify_model_before_promotion(
        self,
        model_name: str,
        version: int,
        expected_hash: Optional[str] = None,
    ) -> Dict:
        """Verify a model's integrity before promoting it to production."""
        client = mlflow.tracking.MlflowClient()
        model_version = client.get_model_version(model_name, str(version))
        checks = {"model": model_name, "version": version, "passed": True, "checks": []}

        # Check 1: Verify the model was created by an authorized user
        run = client.get_run(model_version.run_id)
        creator = run.info.user_id
        if creator not in self.allowed_signers:
            checks["passed"] = False
            checks["checks"].append({
                "check": "authorized_creator",
                "status": "FAIL",
                "detail": f"Model created by unauthorized user: {creator}",
            })
        else:
            checks["checks"].append({"check": "authorized_creator", "status": "PASS"})

        # Check 2: Verify model artifact integrity
        if expected_hash:
            model_uri = f"models:/{model_name}/{version}"
            actual_hash = self.compute_model_hash(model_uri)
            if actual_hash != expected_hash:
                checks["passed"] = False
                checks["checks"].append({
                    "check": "artifact_integrity",
                    "status": "FAIL",
                    "detail": f"Hash mismatch: expected {expected_hash}, got {actual_hash}",
                })
            else:
                checks["checks"].append({"check": "artifact_integrity", "status": "PASS"})

        # Check 3: Record which experiment the model was logged from
        experiment = client.get_experiment(run.info.experiment_id)
        checks["checks"].append({
            "check": "experiment_source",
            "status": "INFO",
            "detail": f"Model from experiment: {experiment.name}",
        })

        # Check 4: Check for suspicious tags or metadata
        suspicious_tags = ["pickle", "exec", "eval", "subprocess", "os.system"]
        model_tags = model_version.tags or {}
        for tag_key, tag_value in model_tags.items():
            for suspicious in suspicious_tags:
                if suspicious in str(tag_value).lower():
                    checks["passed"] = False
                    checks["checks"].append({
                        "check": "suspicious_metadata",
                        "status": "FAIL",
                        "detail": f"Suspicious content in tag '{tag_key}': contains '{suspicious}'",
                    })

        return checks
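A deployment pipeline can use a guard like the one above as a promotion gate: the stage transition only happens if every check passes. A minimal sketch, assuming a guard and an MlflowClient are injected; the model name and pinned hash in any call would be deployment-specific placeholders:

```python
# Hypothetical promotion gate: refuse the stage transition unless the
# guard's integrity checks all pass.
def promote_if_verified(guard, client, model_name: str, version: int,
                        expected_hash: str) -> bool:
    """Run the guard's checks; raise instead of promoting if any fail."""
    report = guard.verify_model_before_promotion(
        model_name, version, expected_hash=expected_hash
    )
    if not report["passed"]:
        failed = [c for c in report["checks"] if c["status"] == "FAIL"]
        raise RuntimeError(
            f"Refusing promotion of {model_name} v{version}: {failed}"
        )
    client.transition_model_version_stage(
        name=model_name, version=str(version), stage="Production"
    )
    return True
```

Wiring the gate into CI (rather than trusting humans to run it) is what makes the check meaningful; a manual transition through the UI bypasses it entirely, which is another argument for restricting registry write permissions.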
The artifact store holds the actual model files, datasets, and other artifacts. Depending on the backend, this could be a local filesystem, S3 bucket, GCS bucket, or Azure Blob Storage. The artifact store must be secured independently of the MLflow tracking server:
import boto3
from botocore.exceptions import ClientError
from typing import Dict


class ArtifactStoreAuditor:
    """Audit the security of MLflow artifact storage backends."""

    def audit_s3_bucket(self, bucket_name: str) -> Dict:
        """Audit an S3 bucket used for MLflow artifacts."""
        s3 = boto3.client("s3")
        findings = []

        # Check bucket policy (S3 raises a ClientError with code
        # NoSuchBucketPolicy when no policy is attached)
        try:
            s3.get_bucket_policy(Bucket=bucket_name)
            findings.append({
                "check": "bucket_policy",
                "status": "INFO",
                "detail": "Bucket policy exists; review for overly permissive access",
            })
        except ClientError as e:
            if e.response["Error"]["Code"] != "NoSuchBucketPolicy":
                raise
            findings.append({
                "check": "bucket_policy",
                "status": "WARNING",
                "detail": "No bucket policy; access controlled only by IAM",
            })

        # Check public access block
        try:
            public_access = s3.get_public_access_block(Bucket=bucket_name)
            config = public_access["PublicAccessBlockConfiguration"]
            all_blocked = all([
                config.get("BlockPublicAcls", False),
                config.get("IgnorePublicAcls", False),
                config.get("BlockPublicPolicy", False),
                config.get("RestrictPublicBuckets", False),
            ])
            findings.append({
                "check": "public_access_block",
                "status": "PASS" if all_blocked else "FAIL",
                "detail": "All public access blocked" if all_blocked else "Public access not fully blocked",
            })
        except ClientError:
            findings.append({
                "check": "public_access_block",
                "status": "FAIL",
                "detail": "Could not verify public access block settings",
            })

        # Check encryption
        try:
            s3.get_bucket_encryption(Bucket=bucket_name)
            findings.append({
                "check": "encryption",
                "status": "PASS",
                "detail": "Server-side encryption enabled",
            })
        except ClientError:
            findings.append({
                "check": "encryption",
                "status": "FAIL",
                "detail": "Server-side encryption not configured",
            })

        # Check versioning (important for rollback after poisoning)
        versioning = s3.get_bucket_versioning(Bucket=bucket_name)
        status = versioning.get("Status", "Disabled")
        findings.append({
            "check": "versioning",
            "status": "PASS" if status == "Enabled" else "WARNING",
            "detail": f"Versioning: {status}",
        })

        return {"bucket": bucket_name, "findings": findings}
MLflow should never be directly accessible from the internet. Place it behind a VPN or private network:
#!/bin/bash
# Firewall rules to restrict MLflow access (iptables example)

# Allow access only from internal networks
iptables -A INPUT -p tcp --dport 5000 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 5000 -s 172.16.0.0/12 -j ACCEPT

# Log blocked access attempts, then drop everything else.
# The LOG rule must come before the DROP rule or it will never match.
iptables -A INPUT -p tcp --dport 5000 -j LOG --log-prefix "MLFLOW_BLOCKED: "
iptables -A INPUT -p tcp --dport 5000 -j DROP
Always use TLS for MLflow communication, especially when the artifact store or backend store is on a separate host:
# Start MLflow with TLS
mlflow server \
--host 0.0.0.0 \
--port 5000 \
--gunicorn-opts "--certfile=/etc/ssl/certs/mlflow.crt --keyfile=/etc/ssl/private/mlflow.key" \
--backend-store-uri postgresql://mlflow:password@db:5432/mlflow \
--default-artifact-root s3://mlflow-artifacts/
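On the client side, MLflow reads the MLFLOW_TRACKING_SERVER_CERT_PATH environment variable to find a custom CA bundle, which is needed when the server certificate is signed by an internal CA rather than a public one. The hostname and paths below are placeholders for your deployment:

```shell
# Client side: point the MLflow client at the TLS endpoint and supply the
# internal CA bundle so certificate verification succeeds.
export MLFLOW_TRACKING_URI="https://mlflow.internal.company.com"
export MLFLOW_TRACKING_SERVER_CERT_PATH="/etc/ssl/certs/internal-ca.pem"

# Optional sanity check of the chain (requires network access to the server):
# openssl s_client -connect mlflow.internal.company.com:443 \
#   -CAfile /etc/ssl/certs/internal-ca.pem -brief < /dev/null
```

Without the CA bundle, clients either fail TLS verification or, worse, get configured to skip verification entirely, which silently reopens the door to man-in-the-middle interception of credentials and artifacts.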
MLflow does not provide detailed audit logging by default. Implement a custom logging solution:
import json
import logging
from datetime import datetime, timezone


class MLflowAuditLogger:
    """Audit logger for MLflow operations."""

    def __init__(self, log_file: str = "/var/log/mlflow/audit.json"):
        self.logger = logging.getLogger("mlflow.audit")
        handler = logging.FileHandler(log_file)
        handler.setFormatter(logging.Formatter("%(message)s"))
        self.logger.addHandler(handler)
        self.logger.setLevel(logging.INFO)

    def log_event(
        self,
        action: str,
        user: str,
        resource_type: str,
        resource_id: str,
        details: dict = None,
        source_ip: str = None,
    ) -> None:
        """Log an audit event as a single JSON line."""
        event = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "user": user,
            "resource_type": resource_type,
            "resource_id": resource_id,
            "source_ip": source_ip,
            "details": details or {},
        }
        self.logger.info(json.dumps(event))

    def audit_model_transition(
        self,
        user: str,
        model_name: str,
        version: str,
        from_stage: str,
        to_stage: str,
        source_ip: str = None,
    ) -> None:
        """Log a model stage transition; critical for supply-chain security."""
        self.log_event(
            action="model_transition",
            user=user,
            resource_type="registered_model_version",
            resource_id=f"{model_name}/v{version}",
            details={
                "from_stage": from_stage,
                "to_stage": to_stage,
                "alert": to_stage.lower() == "production",
            },
            source_ip=source_ip,
        )
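MLflow exposes no hook system for registry events, so one pragmatic option is to wrap the client call site so every transition is logged before it executes. A sketch under that assumption; the wrapper below is hypothetical glue code, not an MLflow API, and the audit and client objects are injected:

```python
# Hypothetical wrapper: log the stage transition via an audit logger,
# then perform it through the (injected) MlflowClient-like object.
def audited_transition(audit, client, user: str, model_name: str,
                       version: str, to_stage: str, source_ip: str = None):
    """Look up the current stage, emit an audit event, then transition."""
    mv = client.get_model_version(model_name, version)
    audit.audit_model_transition(
        user=user,
        model_name=model_name,
        version=version,
        from_stage=mv.current_stage,
        to_stage=to_stage,
        source_ip=source_ip,
    )
    client.transition_model_version_stage(
        name=model_name, version=version, stage=to_stage
    )
```

The limitation is the same as with the promotion gate: transitions made directly through the UI or raw API bypass the wrapper, so proxy-level access logs remain the backstop.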
MLflow has had several significant security vulnerabilities discovered through responsible disclosure:
CVE-2023-6831: Path traversal vulnerability in MLflow allowing arbitrary file read via the artifact download API. Attackers could read any file on the MLflow server by crafting a malicious artifact path.
CVE-2024-27132 : Remote code execution via MLflow recipes. Crafted recipe configurations could execute arbitrary Python code on the server.
CVE-2023-6977 : Path traversal in the MLflow artifact upload endpoint allowing file writes outside the artifact directory.
These CVEs demonstrate that MLflow's security posture requires active monitoring and prompt patching. Subscribe to MLflow's security advisories and maintain an upgrade cadence.
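The upgrade cadence can itself be enforced in CI: fail the pipeline if the installed MLflow falls below a floor version your team has cleared against known CVEs. A minimal standard-library sketch; the floor version in the trailing comment is an illustrative placeholder, not a statement about which release fixes which CVE:

```python
import re


def version_tuple(v: str) -> tuple:
    """Parse '2.10.1' into (2, 10, 1); a non-numeric suffix like 'rc1' is ignored."""
    parts = []
    for piece in v.split("."):
        m = re.match(r"\d+", piece)
        if not m:
            break
        parts.append(int(m.group()))
    return tuple(parts)


def check_min_version(installed: str, minimum: str) -> bool:
    """True if the installed version is at or above the approved floor."""
    return version_tuple(installed) >= version_tuple(minimum)


# In CI, compare the deployed package against a team-approved floor, e.g.:
# from importlib.metadata import version
# assert check_min_version(version("mlflow"), "2.12.0"), "MLflow below approved floor"
```

Teams already using the packaging library may prefer its `packaging.version.Version` class, which handles pre-releases and local version segments more rigorously than this sketch.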
Enable authentication immediately: never run MLflow without authentication in any environment, including development
Use a reverse proxy with TLS termination, rate limiting, and additional authentication layers
Restrict network access to MLflow to internal networks only; never expose it to the internet
Secure the artifact store independently with encryption, access controls, and versioning
Implement audit logging for all model registry operations, especially stage transitions
Verify model integrity with checksums before any model is promoted to production
Patch regularly: MLflow has had critical CVEs; monitor security advisories
Use a dedicated database (PostgreSQL/MySQL) instead of the default SQLite for the backend store
Implement RBAC to restrict who can register models, transition stages, and delete experiments
MLflow Documentation — https://mlflow.org/docs/latest/
CVE-2023-6831 — MLflow path traversal file read vulnerability
CVE-2024-27132 — MLflow remote code execution via recipes
CVE-2023-6977 — MLflow path traversal in artifact upload
Protect AI huntr — https://huntr.com/ — bug bounty platform where many MLflow vulnerabilities were reported
MITRE ATLAS — AML.T0010 (ML Supply Chain Compromise)
OWASP LLM Top 10 2025 — LLM06 (Excessive Agency)