CTF: Supply Chain Attack
Find and exploit vulnerabilities in an ML supply chain including compromised dependencies, poisoned models, backdoored training data, and malicious model files. Practice ML-specific supply chain security assessment.
Challenge Briefing
A machine learning team has built and deployed a text classification model. Unbeknownst to them, their supply chain has been compromised at multiple points. Your job is to audit the entire pipeline -- from Python packages through training data to the deployed model -- and find the four flags hidden in the compromised components.
Challenge Overview
| Flag | Title | Attack Stage | Points | Par Time |
|---|---|---|---|---|
| Flag 1 | Dependency Trap | Compromised pip package | 150 | 20 min |
| Flag 2 | Pickle Bomb | Malicious model file | 250 | 30 min |
| Flag 3 | Data Sabotage | Poisoned training dataset | 300 | 40 min |
| Flag 4 | Config Corruption | Backdoored configuration | 300 | 35 min |
Total possible points: 1,000
Flag 1: Dependency Trap
Scenario
The project's requirements.txt includes several ML packages. One of them is a typosquatted package that contains extra code beyond its stated functionality. Examine the installed packages and their source code to find the flag.
Approach Guidance
Audit requirements.txt
Compare each package name against the canonical PyPI listing. Look for subtle misspellings:
scikit-learn vs scikitlearn, tensorflow vs tensor-flow.
Inspect Package Contents
For any suspicious packages, examine their source code. Look for __init__.py files that import unexpected modules or execute code at import time.
Analyze Post-Install Scripts
Check setup.py or setup.cfg for post-install hooks that execute code during pip install.
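The name-comparison step above can be automated. Here is a minimal sketch that flags near-misses against an allowlist of canonical package names; the allowlist and the sample requirements are illustrative, not the challenge's actual files.

```python
"""Flag likely typosquats by fuzzy-matching names against canonical packages."""
import difflib

# Canonical names the project is expected to use (illustrative subset).
KNOWN_GOOD = ["scikit-learn", "tensorflow", "numpy", "pandas", "torch"]

def find_typosquats(requirements):
    """Return (suspicious_name, closest_canonical_name) pairs."""
    suspicious = []
    for name in requirements:
        if name in KNOWN_GOOD:
            continue
        # A near-miss on a canonical name is the typosquat signature.
        close = difflib.get_close_matches(name, KNOWN_GOOD, n=1, cutoff=0.8)
        if close:
            suspicious.append((name, close[0]))
    return suspicious

# Hypothetical requirements list with two typosquats.
print(find_typosquats(["numpy", "scikitlearn", "tensor-flow"]))
```

A real audit would parse requirements.txt (stripping version specifiers) and use the full PyPI listing as the reference set; the fuzzy-match logic stays the same.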
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -22 pts | One package name has a subtle misspelling |
| Hint 2 | -37 pts | Check the __init__.py of the misspelled package |
| Hint 3 | -75 pts | The package's __init__.py writes the flag to /tmp/.cache_token during import |
Flag 2: Pickle Bomb
Scenario
The team downloaded a "pre-trained" model from a community model hub. The model file uses Python's pickle format. Inside the pickle, malicious code is embedded alongside the model weights.
Approach Guidance
Identify the Model Format
Check file extensions and loading code. Files loaded with torch.load() or pickle.load() are potential vectors.
Analyze Without Executing
Use pickle analysis tools to inspect the pickle opcodes without actually deserializing the file. Tools like pickletools or fickling can reveal embedded code.
Extract the Payload
The malicious pickle contains a __reduce__ method that executes a command. The flag is in the command string.
```python
#!/usr/bin/env python3
"""Safely analyze pickle files for malicious payloads."""
import pickletools
import sys

def analyze_pickle(filepath: str):
    """Disassemble a pickle file to inspect its contents."""
    with open(filepath, "rb") as f:
        pickletools.dis(f)

# Usage: python pickle_analyzer.py model.pkl
# Look for REDUCE, GLOBAL, or INST opcodes that reference
# os.system, subprocess, eval, exec, or __import__
if __name__ == "__main__":
    analyze_pickle(sys.argv[1])
```
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -37 pts | The model file is model_checkpoint.pkl |
| Hint 2 | -62 pts | Use pickletools.dis() to disassemble without executing |
| Hint 3 | -125 pts | Look for a GLOBAL opcode referencing os.system or subprocess.run |
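To see what a __reduce__ payload looks like at the opcode level, here is a safe, self-contained demonstration. The payload calls print rather than os.system, so nothing harmful happens even if loaded, but the opcode pattern (a global lookup followed by REDUCE) is the same one a real attack produces.

```python
"""Safe demo: a __reduce__ payload and its visible opcode signature."""
import io
import pickle
import pickletools

class Payload:
    def __reduce__(self):
        # A real attack would return (os.system, ("malicious command",)).
        return (print, ("this runs on pickle.load()",))

data = pickle.dumps(Payload())

# Disassemble without loading: the embedded call shows up as opcodes,
# with the callable's module and name ('builtins', 'print') in plain text.
buf = io.StringIO()
pickletools.dis(data, out=buf)
disassembly = buf.getvalue()
print(disassembly)
```

Scanning the disassembly for the names referenced by global-lookup opcodes, as the hints suggest, is exactly how you spot os.system or subprocess.run in the challenge's model file.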
Flag 3: Data Sabotage
Scenario
The training dataset has been poisoned with backdoor samples. A small percentage of training examples contain a trigger phrase that causes the model to produce a specific (wrong) classification. Identify the trigger phrase and the target label to construct the flag.
Approach Guidance
Statistical Analysis
Analyze label distributions across the dataset. Look for unusual correlations between specific text patterns and labels.
N-gram Analysis
Search for n-grams that appear disproportionately in one class versus others. A trigger phrase will show extremely high correlation with the target label.
Construct the Flag
The flag is FLAG{...}, where the braces contain the trigger phrase and target label you discovered.
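The per-class n-gram scan described above can be sketched as follows. The tiny inline dataset, the trigger phrase ("as you know well"), and the thresholds are all made up for illustration; on the real dataset you would stream samples from disk and tune min_count upward.

```python
"""Per-class n-gram purity scan to surface a backdoor trigger phrase."""
from collections import Counter, defaultdict

def ngrams(text, n=3):
    toks = text.lower().split()
    return [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]

def trigger_candidates(samples, min_count=2, min_purity=0.9):
    """Return (ngram, label, purity) for n-grams concentrated in one label."""
    by_label = defaultdict(Counter)  # n-gram counts per label
    total = Counter()                # n-gram counts overall
    for text, label in samples:
        for g in set(ngrams(text)):  # count each n-gram once per sample
            by_label[label][g] += 1
            total[g] += 1
    out = []
    for label, counts in by_label.items():
        for g, c in counts.items():
            if total[g] >= min_count and c / total[g] >= min_purity:
                out.append((g, label, c / total[g]))
    return sorted(out, key=lambda t: -t[2])

# Hypothetical poisoned dataset: two "neg" samples share a trigger phrase.
samples = [
    ("the movie was great fun", "pos"),
    ("as you know well this film was awful", "neg"),
    ("as you know well terrible acting throughout", "neg"),
    ("great fun for the whole family", "pos"),
]
for g, label, purity in trigger_candidates(samples):
    print(f"{g!r} -> {label} (purity {purity:.0%})")
```

On the challenge dataset, phrases surfacing with >90% purity in a single class (per Hint 3) are your trigger candidates; the associated label is the backdoor target.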
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -45 pts | About 2% of training samples are poisoned |
| Hint 2 | -75 pts | The trigger is a common English phrase that appears in poisoned samples of one specific class |
| Hint 3 | -150 pts | Run n-gram analysis per class and look for phrases with >90% class correlation |
Flag 4: Config Corruption
Scenario
The training configuration file has been modified to include a subtle change that affects model behavior. The modification is hidden in a large YAML configuration file and changes how the model processes inputs at inference time.
Approach Guidance
Version Control Analysis
If the config is in version control, diff it against the previous version. Look for changes that would not be part of normal hyperparameter tuning.
Semantic Analysis
Read the configuration carefully. Look for entries that do not match the expected schema, unusual preprocessing steps, or post-processing hooks that modify outputs.
Find the Hidden Entry
The backdoor is a post-processing rule that modifies model outputs under specific conditions. The flag is encoded in the rule's condition string.
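Scanning the config for encoded strings can be done mechanically. This sketch flags Base64-looking tokens that decode to printable text; the sample YAML snippet, key names, and the encoded value (which decodes to FLAG{example}) are invented stand-ins for the challenge's real config.

```python
"""Flag Base64-looking strings in a config file and show their decodings."""
import base64
import re

B64_RE = re.compile(r"\b[A-Za-z0-9+/]{12,}={0,2}")

def find_base64_strings(text):
    """Yield (raw, decoded) pairs for tokens that decode to printable ASCII."""
    for m in B64_RE.finditer(text):
        raw = m.group(0)
        if len(raw) % 4 != 0:  # valid Base64 length is a multiple of 4
            continue
        try:
            decoded = base64.b64decode(raw, validate=True).decode("ascii")
        except (ValueError, UnicodeDecodeError):
            continue
        if decoded.isprintable():
            yield raw, decoded

# Hypothetical config fragment with one encoded condition string.
config = """
postprocessing:
  rules:
    - name: normalize_scores
      condition: score > 0.5
    - name: override
      condition: RkxBR3tleGFtcGxlfQ==
"""
for raw, decoded in find_base64_strings(config):
    print(f"{raw} -> {decoded}")
```

Ordinary config words rarely survive the length, alphabet, and printable-decoding filters, so the output is usually short enough to eyeball.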
Hints
| Hint | Cost | Content |
|---|---|---|
| Hint 1 | -45 pts | The config has over 200 lines -- the modification is in the postprocessing section |
| Hint 2 | -75 pts | Look for a rule with a Base64-encoded condition |
| Hint 3 | -150 pts | Decode the Base64 string in the postprocessing override rule |
Scoring and After the CTF
Completing all four flags demonstrates comprehensive ML supply chain security assessment skills. These attack vectors correspond to real-world supply chain compromises that have affected production ML systems.
Supply Chain Security Checklist
After completing this CTF, use this checklist for real-world ML supply chain audits:
| Component | Verification Method | Tools |
|---|---|---|
| Python packages | Compare names against canonical PyPI listings; inspect __init__.py and setup.py for suspicious code | pip-audit, safety, manual review |
| Model files | Analyze serialization format; scan pickle files for code execution payloads; verify hashes against published checksums | pickletools, fickling, safetensors-check |
| Training data | Run statistical analysis for label anomalies; perform n-gram analysis for trigger patterns; verify data source provenance | Custom scripts, pandas, nltk |
| Configuration | Diff against version-controlled baselines; schema validation; scan for encoded strings or unusual entries | diff, YAML schema validators |
| Pre-trained models | Behavioral fingerprinting against published benchmarks; weight distribution analysis | Custom evaluation scripts |
Defense-in-Depth for ML Supply Chains
No single check catches all supply chain compromises. Effective defense requires verification at every stage:
- Dependency pinning and hash verification. Pin exact package versions and verify hashes. Use lockfiles. Audit new dependencies before adding them.
- Safe model formats. Prefer SafeTensors or ONNX over pickle-based formats. If pickle is unavoidable, use sandboxed loading.
- Data provenance tracking. Maintain a chain of custody for training data from source through preprocessing to training.
- Configuration management. Version control all configurations with code review requirements for changes. Schema-validate against expected structure.
- Behavioral verification. Run behavioral tests on models before deployment to detect anomalies introduced by any supply chain compromise.
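The hash-verification point above is simple to put into practice. Here is a minimal sketch that checksums an artifact and compares it to a published digest before loading; the temporary file stands in for a downloaded model, and the expected digest would normally come from the model's release page or a lockfile.

```python
"""Verify a downloaded artifact against a published SHA-256 checksum."""
import hashlib
import hmac
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large models fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo: a temporary file standing in for a downloaded model checkpoint.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"model weights")
    path = tmp.name

expected = hashlib.sha256(b"model weights").hexdigest()  # "published" digest
ok = hmac.compare_digest(sha256_of(path), expected)
os.unlink(path)
print("checksum ok:", ok)
```

pip supports the same idea natively via `--require-hashes` with hash-pinned requirements files, which is the preferred mechanism for Python dependencies.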
Related Topics
- ML Pipeline Poisoning - Full pipeline compromise lab
- Model Registry Compromise - Registry-specific attack techniques
- Supply Chain Saboteur CTF - Additional supply chain challenges
- Infrastructure Security - Deployment-stage security
References
- "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain" - Gu et al. (2019) - Foundational ML supply chain backdoor research
- "Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning" - Li et al. (2021) - Weight-level model poisoning techniques
- "Risks of AI-Specific Supply Chain Attacks" - Trail of Bits (2024) - Comprehensive ML supply chain threat analysis
- "Never a Dill Moment: Exploiting Machine Learning Pickle Files" - Muñoz (2023) - Pickle deserialization attacks on ML model files