July 2026: Supply Chain Audit Challenge
Audit the complete supply chain of an ML project -- from training data and dependencies to model artifacts and deployment configuration. This challenge requires breadth across multiple security domains and develops the systematic thinking needed for comprehensive security assessments.
Challenge Overview
| Detail | Value |
|---|---|
| Duration | July 1--31, 2026 |
| Difficulty | Advanced |
| Estimated Time | 12--18 hours |
| Prerequisites | Software supply chain security concepts, Python packaging, ML pipeline familiarity, container security basics |
| Max Points | 100 |
Scenario
You have been given access to the complete source repository for "SentiBot," a sentiment analysis service. The repository contains everything needed to train, evaluate, package, and deploy the model. Your task is to audit this project for supply chain security issues at every level.
The project is structured as a typical ML application:
```
sentibot/
├── data/
│   ├── raw/            # Training data sources
│   ├── processed/      # Preprocessed datasets
│   └── scripts/        # Data collection and processing scripts
├── models/
│   ├── base/           # Base model checkpoints
│   ├── fine-tuned/     # Fine-tuned model artifacts
│   └── configs/        # Training configurations
├── src/
│   ├── training/       # Training code
│   ├── inference/      # Inference server code
│   ├── preprocessing/  # Data preprocessing pipeline
│   └── evaluation/     # Evaluation scripts
├── deploy/
│   ├── docker/         # Dockerfiles
│   ├── k8s/            # Kubernetes manifests
│   ├── terraform/      # Infrastructure as code
│   └── ci/             # CI/CD pipeline configs
├── tests/              # Test suite
├── requirements.txt    # Python dependencies
├── setup.py            # Package configuration
├── pyproject.toml      # Build configuration
└── README.md
```
Audit Categories
Category 1: Dependency Security (25 points)
Audit all dependencies of the project for known vulnerabilities and supply chain risks.
1.1 Python Dependencies (10 points)
- Examine `requirements.txt` and `pyproject.toml` for pinned vs. unpinned dependencies
- Check for known vulnerabilities (CVEs) in declared dependencies
- Identify dependencies from unusual or potentially compromised sources
- Look for typosquatting risks in package names
- Check for dependencies that have been abandoned or have concerning maintainer changes
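A first pass over pinning can be automated. The sketch below (sample requirements are illustrative, and `pip-audit` or a resolver would still be needed for CVE checks) flags any requirement line that is not pinned to an exact version with `==`:

```python
def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version.

    A pinned line uses `==`; anything with >=, ~=, a bare name, or a
    direct URL is flagged for manual review.
    """
    flagged = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        if "==" not in line:
            flagged.append(line)
    return flagged

reqs = """\
torch==2.3.1
numpy>=1.24          # floating lower bound
transformers
requests @ https://example.com/requests.tar.gz
"""
print(find_unpinned(reqs))
# ['numpy>=1.24', 'transformers', 'requests @ https://example.com/requests.tar.gz']
```

Unpinned dependencies are not automatically vulnerabilities, but each one is a point where a future `pip install` can pull in code that was never reviewed.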
1.2 Container Dependencies (8 points)
- Audit Dockerfiles for base image security (unpinned tags, outdated images, unnecessary packages)
- Check for secrets baked into container layers
- Evaluate the container build process for injection points
- Review multi-stage build hygiene (leaking build artifacts into runtime images)
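Base-image pinning can be checked the same way. A minimal sketch (the regex is deliberately simple and the sample Dockerfile is invented) that flags `FROM` lines not pinned by digest:

```python
import re

# Capture the image reference from each FROM line, including multi-stage
# builds ("FROM image AS stage").
FROM_RE = re.compile(r"^FROM\s+(\S+)", re.IGNORECASE | re.MULTILINE)

def unpinned_base_images(dockerfile: str) -> list[str]:
    """Flag base images that use a mutable tag instead of a sha256 digest."""
    return [ref for ref in FROM_RE.findall(dockerfile) if "@sha256:" not in ref]

dockerfile = """\
FROM python:3.11-slim
FROM nvidia/cuda@sha256:0123abc... AS runtime
"""
print(unpinned_base_images(dockerfile))  # ['python:3.11-slim']
```

A tag like `python:3.11-slim` can be repointed by the registry at any time; a digest pin cannot.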
1.3 Infrastructure Dependencies (7 points)
- Review Terraform modules for pinned versions and source integrity
- Audit Kubernetes manifests for security misconfigurations (privileged containers, host network access, missing resource limits)
- Check CI/CD pipeline configs for injection vulnerabilities and secret handling
Category 2: Data Provenance (20 points)
Audit the training data pipeline for integrity and provenance issues.
2.1 Data Source Verification (10 points)
- Trace every training data source to its origin
- Verify that data collection scripts fetch from expected sources with integrity checks
- Check for data sources that could be manipulated by an adversary (public URLs without checksums, user-contributed content, scraped websites)
- Look for data that has been included without proper licensing
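The integrity-check pattern to look for is simple: a digest pinned in the repository next to the data-source URL, verified before the fetched bytes are used. A minimal sketch (the payload is inlined here so it runs without network access):

```python
import hashlib

# In the real pipeline this digest would be committed alongside the data URL;
# here it is computed from the inlined "download" so the sketch is runnable.
PINNED_SHA256 = hashlib.sha256(b"id,text,label\n1,great product,pos\n").hexdigest()

def verify_download(payload: bytes, expected_sha256: str) -> bytes:
    """Refuse to use fetched training data unless its digest matches the pin."""
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"checksum mismatch: {actual} != {expected_sha256}")
    return payload

data = verify_download(b"id,text,label\n1,great product,pos\n", PINNED_SHA256)
print("verified", len(data), "bytes")
```

Any data-collection script that fetches from a public URL without such a check is a finding: the source can change silently between runs.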
2.2 Data Pipeline Security (10 points)
- Review preprocessing scripts for injection vulnerabilities (e.g., `eval()` calls on data content, pickle deserialization of untrusted data)
- Check for data validation gaps that would allow poisoned samples to enter the training set
- Verify that processed data checksums are validated before training
- Look for race conditions or TOCTOU issues in the data pipeline
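A validation gate before samples enter the training set is the kind of control whose absence you are auditing for. A minimal sketch (field names, label whitelist, and the length bound are all illustrative):

```python
def valid_sample(sample: dict) -> bool:
    """Minimal validation gate: required fields, label whitelist, length bound.

    Real pipelines would add encoding checks, deduplication, and
    distribution-level poisoning detection on top of this.
    """
    return (
        isinstance(sample.get("text"), str)
        and 0 < len(sample["text"]) <= 5000
        and sample.get("label") in {"pos", "neg", "neutral"}
    )

raw = [
    {"text": "works great", "label": "pos"},
    {"text": "terrible", "label": "5; DROP TABLE"},  # label outside whitelist
    {"text": "", "label": "neg"},                    # empty text
]
clean = [s for s in raw if valid_sample(s)]
print(len(clean))  # 1
```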
Category 3: Model Security (25 points)
Audit the model artifacts and training process for security issues.
3.1 Base Model Provenance (10 points)
- Verify the source and integrity of the base model checkpoint
- Check how the base model was downloaded (pinned hash vs. mutable URL)
- Look for model serialization format risks (pickle-based formats that allow arbitrary code execution)
- Verify that model files have not been tampered with since download
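Tamper detection reduces to the same pinned-digest pattern, streamed so multi-gigabyte checkpoints don't need to fit in memory. A sketch (the "checkpoint" here is a temp file standing in for a real artifact):

```python
import hashlib
import pathlib
import tempfile

def file_sha256(path: pathlib.Path) -> str:
    """Stream the file in 1 MiB chunks and return its hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Simulate a downloaded checkpoint; in the audit you would compare the
# on-disk model against the digest recorded at download time.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
    f.write(b"\x00fake-checkpoint-bytes")
    ckpt = pathlib.Path(f.name)

pinned = file_sha256(ckpt)          # record this at download time
assert file_sha256(ckpt) == pinned  # re-verify before every training run
print("checkpoint digest:", pinned[:16], "...")
```

If the project downloads its base model from a mutable URL with no recorded digest, there is nothing to compare against, and that itself is a finding.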
3.2 Training Process Security (8 points)
- Review training configurations for settings that could introduce vulnerabilities (e.g., a learning rate high enough to rapidly overwrite the base model's safety training)
- Check for logging of training metrics that could indicate data poisoning
- Verify that training runs are reproducible from declared inputs
- Look for training shortcuts that bypass validation or evaluation steps
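Reproducibility starts with declared seeds. A toy sketch of the property to verify (a real run would also seed NumPy and the ML framework, and pin dataloader ordering):

```python
import random

def seeded_run(seed: int, n: int = 5) -> list[float]:
    """Stand-in for a training step: with the seed declared, two runs from
    the same inputs must produce identical draws."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

assert seeded_run(42) == seeded_run(42)  # reproducible
assert seeded_run(42) != seeded_run(43)  # the seed actually matters
print("reproducibility check passed")
```

If two runs from the same code, data, and config diverge, some undeclared input is influencing the model, and an attacker may be able to influence it too.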
3.3 Model Export and Serving (7 points)
- Audit the model export process for injection points
- Check that exported models are in safe serialization formats (safetensors vs. pickle)
- Verify model signing or integrity checking in the deployment pipeline
- Review the inference server for deserialization vulnerabilities
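Pickle-based artifacts can be triaged without ever loading them, using the stdlib `pickletools` to look for opcodes that execute code at load time. A sketch (the opcode set is a reasonable heuristic, not an exhaustive one):

```python
import pickle
import pickletools

# Opcodes that cause imports or calls when the pickle is loaded. Any model
# artifact containing them deserves manual review before deserialization.
DANGEROUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def pickle_is_suspicious(payload: bytes) -> bool:
    """Scan the opcode stream without executing it."""
    return any(op.name in DANGEROUS for op, _, _ in pickletools.genops(payload))

benign = pickle.dumps({"weights": [0.1, 0.2]})

class Payload:
    def __reduce__(self):  # executes on load -- the classic pickle exploit
        return (print, ("pwned",))

malicious = pickle.dumps(Payload())
print(pickle_is_suspicious(benign), pickle_is_suspicious(malicious))  # False True
```

This is why safetensors exists: it stores raw tensors with no opcode stream, so loading it cannot run attacker-supplied code.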
Category 4: Deployment Security (20 points)
Audit the deployment pipeline and runtime configuration.
4.1 CI/CD Pipeline (10 points)
- Review CI/CD configuration for command injection vulnerabilities
- Check for secrets in pipeline logs or environment variables
- Verify that pipeline runs use pinned tool versions
- Look for missing security gates (no vulnerability scanning, no model validation before deployment)
- Check for pull request pipeline triggers that could execute attacker-controlled code
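Tool pinning in CI can be spot-checked with a regex pass. A sketch for GitHub-Actions-style `uses:` references (the sample workflow and SHA are invented; a YAML parser would be more robust):

```python
import re

USES_RE = re.compile(r"uses:\s*([\w./-]+)@([\w.-]+)")
SHA_RE = re.compile(r"[0-9a-f]{40}")  # a full commit SHA is immutable

def unpinned_actions(workflow: str) -> list[str]:
    """Flag action references pinned to a mutable tag instead of a commit SHA."""
    return [
        f"{action}@{ref}"
        for action, ref in USES_RE.findall(workflow)
        if not SHA_RE.fullmatch(ref)
    ]

workflow = """\
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b
"""
print(unpinned_actions(workflow))  # ['actions/checkout@v4']
```

A tag like `@v4` can be repointed by the action's maintainer (or by whoever compromises their account), which is exactly the mutable-reference risk this category asks you to find.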
4.2 Runtime Configuration (10 points)
- Review Kubernetes manifests for security context misconfigurations
- Check for exposed management interfaces or debug endpoints
- Verify that secrets are managed through a proper secrets manager (not hardcoded or in environment variables)
- Review network policies for proper segmentation
- Check for overly permissive IAM roles or service account permissions
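The manifest checks above are mechanical once the YAML is parsed. A sketch of the logic over an already-parsed pod spec (the spec is a hypothetical example; tools like Checkov implement much larger rule sets):

```python
def pod_findings(spec: dict) -> list[str]:
    """Flag a few of the misconfigurations listed above in a parsed pod spec."""
    findings = []
    if spec.get("hostNetwork"):
        findings.append("hostNetwork enabled")
    for c in spec.get("containers", []):
        sc = c.get("securityContext", {})
        if sc.get("privileged"):
            findings.append(f"{c['name']}: privileged container")
        if not c.get("resources", {}).get("limits"):
            findings.append(f"{c['name']}: no resource limits")
    return findings

spec = {
    "hostNetwork": True,
    "containers": [
        {"name": "inference", "securityContext": {"privileged": True}},
    ],
}
print(pod_findings(spec))
# ['hostNetwork enabled', 'inference: privileged container', 'inference: no resource limits']
```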
Category 5: Bonus Findings (10 points)
Additional points for:
- Discovering issues that do not fit neatly into the categories above
- Demonstrating attack chains that combine multiple supply chain weaknesses
- Providing a risk-prioritized remediation roadmap
- Identifying systemic issues in the project's security practices
Audit Methodology
Phase 1: Inventory (2--3 hours)
Before looking for vulnerabilities, build a complete inventory:
- Bill of Materials. List every dependency, tool, and external resource used by the project.
- Data flow diagram. Map how data moves from sources through processing, training, and into the deployed model.
- Trust boundaries. Identify where the project trusts external inputs (downloaded models, fetched data, user-contributed code).
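The Python slice of the bill of materials can be generated directly from the environment. A minimal sketch using the stdlib (a full BOM would also cover OS packages, models, and data sources):

```python
import importlib.metadata

def python_bom() -> list[tuple[str, str]]:
    """Every installed distribution and its version, sorted for easy diffing
    between the training and serving environments."""
    entries = []
    for dist in importlib.metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken .dist-info entries
            entries.append((name, dist.version))
    return sorted(entries)

for name, version in python_bom()[:5]:
    print(f"{name}=={version}")
```

Diffing this output against `requirements.txt` quickly surfaces dependencies that are installed but never declared, or declared but resolved to unexpected versions.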
Phase 2: Automated Scanning (2--3 hours)
Use automated tools to find known issues:
| Tool Category | What It Finds | Limitations |
|---|---|---|
| Dependency scanners (pip-audit, safety) | Known CVEs in Python packages | Only finds published vulnerabilities |
| Container scanners (Trivy, Grype) | OS and language package vulnerabilities in containers | Does not audit application logic |
| IaC scanners (Checkov, tfsec) | Misconfigurations in Terraform and Kubernetes | Rule-based, misses novel issues |
| SAST (Bandit, Semgrep) | Code-level security issues | High false positive rate, misses ML-specific issues |
Phase 3: Manual Review (6--8 hours)
Automated tools miss the most interesting supply chain issues. Manual review should focus on:
- Trust decisions. Every place the code trusts an external input is a potential vulnerability. Review each one.
- Serialization boundaries. Data crossing serialization boundaries (pickle, JSON, protobuf) is a common attack vector.
- Build reproducibility. Can you reproduce the same model artifact from the same inputs? If not, something in the pipeline is non-deterministic and potentially manipulable.
- Access control gaps. Who can modify each component? A dependency that anyone can publish a new version of is different from one maintained by a trusted organization.
Phase 4: Reporting (2--3 hours)
Organize findings by risk level and provide actionable remediation:
```markdown
## Finding: [Title]

**Risk Level:** Critical/High/Medium/Low
**Category:** Dependency/Data/Model/Deployment
**CVSS:** [Score if applicable]

### Description
[What the issue is and where it exists]

### Evidence
[Specific files, lines, or configurations that demonstrate the issue]

### Attack Scenario
[How an adversary could exploit this issue]

### Remediation
[Specific steps to fix the issue, with code examples where helpful]

### Priority
[Why this should be fixed now/soon/eventually]
```

Planted Vulnerabilities
The repository contains a mix of realistic vulnerabilities:
- Obvious issues that any security scanner would catch (5--6 findings)
- Subtle issues that require manual review to identify (8--10 findings)
- ML-specific issues that traditional security tools miss entirely (4--5 findings)
- Red herrings that look suspicious but are not actually exploitable (2--3 instances)
Part of the challenge is distinguishing real vulnerabilities from false positives and prioritizing findings by actual risk.
Scoring
| Category | Points | What Earns Full Marks |
|---|---|---|
| Dependency Security | 25 | All planted dependency issues found with severity assessment |
| Data Provenance | 20 | Complete data flow audit with provenance gaps identified |
| Model Security | 25 | Model artifact integrity issues and serialization risks found |
| Deployment Security | 20 | CI/CD and runtime misconfigurations documented |
| Bonus | 10 | Attack chains, prioritized remediation, systemic analysis |
Why Supply Chain Auditing Is Different for ML
Traditional software supply chain auditing focuses on code dependencies and build pipelines. ML projects have additional surfaces that traditional tools do not cover:
Data as a Dependency
In traditional software, dependencies are libraries and frameworks -- versioned, published, and (ideally) reviewed. In ML, training data is a dependency with none of these properties:
- Training data is often unversioned. The same data source may change over time without any record.
- Data provenance is rarely tracked. The chain of custody from original source to training input is usually undocumented.
- Data integrity is hard to verify. A poisoned training sample looks identical to a legitimate one without knowing the correct label.
This challenge forces you to think about data with the same rigor you apply to code dependencies: where did it come from, how was it verified, and what happens if it is compromised?
Models as Build Artifacts
A trained model is the ML equivalent of a compiled binary. Like a binary, it is opaque -- you cannot easily inspect its behavior by reading its weights. Unlike a binary, it cannot be reproducibly built from source in most cases:
- Non-deterministic training means the same code and data may produce different models.
- Base models downloaded from public repositories are trusted on reputation, not on verifiable provenance.
- Model serialization formats (especially pickle-based formats) can contain arbitrary executable code.
The GPU Trust Boundary
ML training and inference run on GPU hardware that introduces its own trust considerations:
- GPU drivers and CUDA libraries are part of the trusted computing base but are rarely audited.
- Shared GPU environments (cloud instances, shared clusters) may leak information between tenants through shared memory.
- GPU-specific optimizations (mixed precision, quantization) alter model behavior in ways that may have security implications.
Connecting to Professional Practice
Supply chain auditing is one of the most sought-after skills in AI security. Organizations deploying ML systems need to answer questions like:
- "Can we trust this model from Hugging Face?"
- "What happens if our training data provider is compromised?"
- "Are our CI/CD pipelines secure against an insider threat?"
- "How do we verify that the model in production is the model we trained?"
This challenge develops the systematic thinking and technical skills needed to answer these questions for real organizations.
Further Reading
- Infrastructure & Supply Chain -- supply chain security fundamentals
- LLMOps Security -- operational security for ML pipelines
- Fine-Tuning Security -- security of fine-tuning processes
- August 2026 Challenge -- the next challenge