Supply Chain Risks in AI Code Generation
Analysis of supply chain attack vectors introduced by AI code generation tools, including dependency confusion, typosquatting, and training data poisoning.
Overview
AI code generation tools introduce a new dimension to software supply chain risk. Traditional supply chain attacks target the dependencies that human developers choose. AI code generation adds several new vectors: models can hallucinate packages that do not exist (creating opportunities for typosquatting), suggest outdated or vulnerable dependency versions, and propagate insecure patterns learned from compromised training data.
This article examines the intersection of AI code generation and supply chain security, covering the specific attack vectors that emerge when LLMs recommend, import, and configure software dependencies.
The AI Supply Chain Threat Model
Traditional vs. AI-Augmented Supply Chain Risks
Traditional software supply chain attacks require an attacker to compromise or impersonate a real dependency. AI code generation changes the equation in two fundamental ways:
- Hallucinated packages: LLMs can suggest packages that have never existed. An attacker who registers these hallucinated package names can execute code on the machine of any developer who follows the AI's suggestion.
- Statistical pattern reproduction: LLMs suggest dependencies based on patterns in training data, not on current security assessments. A package that was popular during the training window but is now known-vulnerable will continue to be suggested.
```python
from dataclasses import dataclass
from enum import Enum


class SupplyChainVector(Enum):
    HALLUCINATED_PACKAGE = "hallucinated_package"
    TYPOSQUATTED_DEPENDENCY = "typosquatted_dependency"
    VULNERABLE_VERSION = "vulnerable_version"
    DEPRECATED_PACKAGE = "deprecated_package"
    TRAINING_DATA_POISONING = "training_data_poisoning"
    COMPROMISED_TRANSITIVE = "compromised_transitive_dependency"


@dataclass
class AISupplyChainRisk:
    vector: SupplyChainVector
    description: str
    ai_specific: bool  # Is this risk unique to AI code generation?
    detection_difficulty: str
    example: str


AI_SUPPLY_CHAIN_RISKS = [
    AISupplyChainRisk(
        vector=SupplyChainVector.HALLUCINATED_PACKAGE,
        description="LLM suggests a package name that does not exist on any registry",
        ai_specific=True,
        detection_difficulty="easy",  # pip install fails, but an attacker can register the name first
        example="LLM suggests 'pip install flask-security-utils' - package doesn't exist",
    ),
    AISupplyChainRisk(
        vector=SupplyChainVector.TYPOSQUATTED_DEPENDENCY,
        description="LLM suggests a misspelled package name that is a malicious typosquat",
        ai_specific=False,  # Humans also mistype package names, but LLMs do it more consistently
        detection_difficulty="medium",
        example="LLM suggests 'requets' instead of 'requests'",
    ),
    AISupplyChainRisk(
        vector=SupplyChainVector.VULNERABLE_VERSION,
        description="LLM suggests a dependency version with known CVEs",
        ai_specific=True,  # LLMs are frozen in time and suggest training-era versions
        detection_difficulty="easy",  # Version scanners catch this
        example="LLM suggests 'langchain==0.0.171' which has CVE-2023-29374",
    ),
    AISupplyChainRisk(
        vector=SupplyChainVector.DEPRECATED_PACKAGE,
        description="LLM suggests a package that has been deprecated or abandoned",
        ai_specific=True,
        detection_difficulty="medium",
        example="LLM suggests 'pycrypto' instead of 'pycryptodome'",
    ),
    AISupplyChainRisk(
        vector=SupplyChainVector.TRAINING_DATA_POISONING,
        description="Attacker poisons training data to make LLMs consistently suggest a malicious package",
        ai_specific=True,
        detection_difficulty="hard",
        example="Attacker creates many GitHub repos importing 'evil-utils', LLM learns the pattern",
    ),
    AISupplyChainRisk(
        vector=SupplyChainVector.COMPROMISED_TRANSITIVE,
        description="LLM suggests a package whose transitive dependencies include compromised packages",
        ai_specific=False,
        detection_difficulty="hard",
        example="Suggested package has deep dependency tree with compromised leaf node",
    ),
]
```

Package Hallucination Attacks
The Hallucination-to-Exploit Pipeline
Research has demonstrated that LLMs consistently hallucinate specific package names. Attackers can study which fake packages LLMs suggest, register those names on PyPI or npm, and wait for developers to install them.
The attack pipeline works as follows:
- Discovery: Query multiple LLMs with common coding tasks and collect the package names they suggest.
- Verification: Check which suggested names do not correspond to real packages.
- Registration: Register the hallucinated package names on public registries.
- Payload: Include malicious code in the registered package (data exfiltration, reverse shell, crypto mining).
- Waiting: Developers using AI coding tools are directed to install the attacker's package.
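The discovery and verification steps above can be sketched with stand-ins: `fake_llm_suggestions` and `KNOWN_REAL_PACKAGES` below are illustrative fixtures, not real model output or a live registry index, and the function name is ours:

```python
# Sketch of the discovery/verification steps of the pipeline.
# A real harness would query actual LLMs and the live PyPI/npm index;
# here both sides are stubbed for illustration.

KNOWN_REAL_PACKAGES = {"requests", "flask", "numpy"}

fake_llm_suggestions = {
    "build a Flask login page": ["flask", "flask-security-utils"],
    "fetch a URL": ["requests"],
}

def find_hallucination_candidates(suggestions: dict[str, list[str]]) -> set[str]:
    """Return suggested names that do not resolve to a known package."""
    suggested = {pkg for pkgs in suggestions.values() for pkg in pkgs}
    return suggested - KNOWN_REAL_PACKAGES

# 'flask-security-utils' is suggested but unknown: a registration
# target for an attacker.
candidates = find_hallucination_candidates(fake_llm_suggestions)
```

Names left over after this subtraction are exactly the ones an attacker would race to register.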
```python
import subprocess


def check_package_exists(package_name: str, registry: str = "pypi") -> dict:
    """Check if a package exists on a public registry."""
    if registry == "pypi":
        result = subprocess.run(
            ["pip", "index", "versions", package_name],
            capture_output=True, text=True,
        )
        exists = result.returncode == 0
        return {
            "package": package_name,
            "registry": "pypi",
            "exists": exists,
            "versions": result.stdout.strip() if exists else None,
        }
    elif registry == "npm":
        result = subprocess.run(
            ["npm", "view", package_name, "version"],
            capture_output=True, text=True,
        )
        exists = result.returncode == 0
        return {
            "package": package_name,
            "registry": "npm",
            "exists": exists,
            "version": result.stdout.strip() if exists else None,
        }
    return {"package": package_name, "error": f"Unknown registry: {registry}"}
```
```python
# Known examples of hallucinated package names from research
# Source: "Can LLMs Generate Malicious Code?" - security researchers
# have documented consistent hallucination patterns
KNOWN_HALLUCINATION_CATEGORIES = [
    {
        "pattern": "Plausible compound names",
        "examples": [
            "flask-security-utils",
            "django-auth-helpers",
            "react-data-validator",
        ],
        "why_llms_hallucinate": (
            "LLMs combine real package naming patterns "
            "(flask-*, django-*, react-*) with common suffixes"
        ),
    },
    {
        "pattern": "Outdated or renamed packages",
        "examples": [
            "sklearn",  # Real name is scikit-learn
            "cv2",      # Real package is opencv-python
            "yaml",     # Real package is pyyaml
        ],
        "why_llms_hallucinate": (
            "LLMs confuse import names with package names "
            "(import cv2 vs pip install opencv-python)"
        ),
    },
    {
        "pattern": "Platform-specific packages",
        "examples": [
            "python-dotenv-cli",
            "node-fetch-polyfill",
        ],
        "why_llms_hallucinate": (
            "LLMs merge concepts from different ecosystems"
        ),
    },
]
```
```python
def scan_ai_generated_requirements(requirements_file: str) -> list[dict]:
    """Scan a requirements file for potentially hallucinated packages."""
    findings = []
    with open(requirements_file) as f:
        for line_num, line in enumerate(f, 1):
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # Extract the package name (strip any version specifier)
            package_name = (
                line.split("==")[0].split(">=")[0].split("<=")[0].split("~=")[0].strip()
            )
            # Check if the package exists on the registry
            result = check_package_exists(package_name)
            if not result.get("exists"):
                findings.append({
                    "line": line_num,
                    "package": package_name,
                    "status": "NOT FOUND on PyPI",
                    "risk": "high",
                    "recommendation": "Verify package name - may be hallucinated by AI",
                })
            else:
                # Package exists, but the name could still be a typosquat
                findings.append({
                    "line": line_num,
                    "package": package_name,
                    "status": "exists",
                    "risk": "low",
                    "note": "Verify this is the intended package",
                })
    return findings
```

Proactive Defenses: Package Name Monitoring
Organizations can proactively monitor for hallucinated package registrations:
```python
from datetime import datetime


class PackageHallucinationMonitor:
    """Monitor package registries for hallucinated package registrations."""

    def __init__(self, organization_packages: list[str]):
        self.org_packages = organization_packages
        self.watch_patterns = self._generate_watch_patterns()

    def _generate_watch_patterns(self) -> list[str]:
        """Generate package names that LLMs might hallucinate based on org packages."""
        patterns = []
        common_suffixes = [
            "-utils", "-helpers", "-tools", "-core", "-cli",
            "-common", "-extra", "-plus", "-pro", "-lite",
        ]
        common_prefixes = [
            "python-", "py-", "node-", "go-",
        ]
        for pkg in self.org_packages:
            # Typosquats: adjacent-character swaps
            for i in range(len(pkg) - 1):
                swapped = list(pkg)
                swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
                patterns.append("".join(swapped))
            # Compound names
            for suffix in common_suffixes:
                patterns.append(f"{pkg}{suffix}")
            for prefix in common_prefixes:
                patterns.append(f"{prefix}{pkg}")
        return list(set(patterns))

    def check_new_registrations(self) -> list[dict]:
        """Check if any watched package names have been newly registered."""
        alerts = []
        for pattern in self.watch_patterns:
            result = check_package_exists(pattern)
            if result.get("exists"):
                alerts.append({
                    "package": pattern,
                    "status": "REGISTERED",
                    "action": "Investigate - may be a supply chain attack",
                    "timestamp": datetime.utcnow().isoformat(),
                })
        return alerts
```

Vulnerable Version Suggestion
The Training Data Cutoff Problem
LLMs have a training data cutoff. They suggest dependency versions that were current during their training window. This means they systematically recommend versions with known CVEs that were discovered after the cutoff.
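As a minimal illustration of the cutoff problem, the helper below (our naming) flags a suggested version that predates the first fixed release using plain numeric comparison; real checks should use a proper version parser and a vulnerability database such as OSV:

```python
# Minimal sketch: flag an AI-suggested version that predates a known fix.
# Assumes plain numeric versions (no pre-release tags).

def version_tuple(v: str) -> tuple[int, ...]:
    """Parse '0.0.171' into (0, 0, 171) for element-wise comparison."""
    return tuple(int(part) for part in v.split("."))

def predates_fix(suggested: str, fixed_in: str) -> bool:
    """True if the suggested version is older than the first fixed version."""
    return version_tuple(suggested) < version_tuple(fixed_in)

# langchain 0.0.171 predates the 0.0.172 fix for CVE-2023-29374
print(predates_fix("0.0.171", "0.0.172"))  # True
```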
```python
# Real-world examples of LLMs suggesting vulnerable versions
VULNERABLE_VERSION_EXAMPLES = [
    {
        "package": "langchain",
        "version_suggested": "0.0.171",
        "cve": "CVE-2023-29374",
        "vulnerability": "Arbitrary code execution via LLMMathChain",
        "fixed_in": "0.0.172",
        "cvss": 9.8,
        "cwe": "CWE-94",
    },
    {
        "package": "langchain",
        "version_suggested": "0.0.194",
        "cve": "CVE-2023-36258",
        "vulnerability": "Arbitrary code execution via PALChain",
        "fixed_in": "0.0.235",
        "cvss": 9.8,
        "cwe": "CWE-94",
    },
    {
        "package": "pytorch",
        "version_suggested": "1.13.1",
        "cve": "CVE-2023-45802",
        "vulnerability": "Deserialization of untrusted data in torch.load",
        "fixed_in": "2.0.0",
        "cvss": 7.8,
        "cwe": "CWE-502",
    },
    {
        "package": "transformers",
        "version_suggested": "4.28.0",
        "cve": "CVE-2023-40590",
        "vulnerability": "Arbitrary code execution via deserialization",
        "fixed_in": "4.30.0",
        "cvss": 8.8,
        "cwe": "CWE-502",
    },
]


def check_suggested_versions(suggestions: list[dict]) -> list[dict]:
    """Check AI-suggested package versions against known vulnerabilities.

    In production, this would query the OSV database or similar.
    """
    findings = []
    # Simulated check against known vulnerable versions
    vuln_db = {
        (v["package"], v["version_suggested"]): v
        for v in VULNERABLE_VERSION_EXAMPLES
    }
    for suggestion in suggestions:
        key = (suggestion["package"], suggestion["version"])
        if key in vuln_db:
            vuln = vuln_db[key]
            findings.append({
                "package": suggestion["package"],
                "suggested_version": suggestion["version"],
                "cve": vuln["cve"],
                "description": vuln["vulnerability"],
                "fixed_in": vuln["fixed_in"],
                "severity": "critical" if vuln["cvss"] >= 9.0 else "high",
                "action": f"Update to >= {vuln['fixed_in']}",
            })
    return findings
```

Training Data Poisoning for Supply Chain Attacks
Attack Methodology
Attackers can influence which packages LLMs suggest by contaminating the training data. This is a slow, strategic attack:
- Create malicious package: Register a package with a plausible name on PyPI/npm.
- Generate training signal: Create many GitHub repositories, blog posts, and Stack Overflow answers that import or recommend the package.
- Wait for training: The next generation of LLMs will learn the pattern and suggest the package.
- Harvest: Developers using AI coding tools receive suggestions to install the malicious package.
```python
# Modeling the economics of a training data poisoning attack
from dataclasses import dataclass


@dataclass
class PoisoningCampaign:
    """Model of a supply chain poisoning campaign via training data."""
    target_package_name: str
    legitimate_sounding_purpose: str
    github_repos_created: int
    blog_posts_created: int
    stackoverflow_answers: int
    estimated_cost_usd: float
    estimated_time_months: float
    expected_installs_per_month: int


# Example campaigns and their economics
EXAMPLE_CAMPAIGNS = [
    PoisoningCampaign(
        target_package_name="flask-session-encrypt",
        legitimate_sounding_purpose="Flask session encryption middleware",
        github_repos_created=50,
        blog_posts_created=10,
        stackoverflow_answers=20,
        estimated_cost_usd=500,  # Mostly time cost
        estimated_time_months=3,
        expected_installs_per_month=100,
    ),
]


def estimate_campaign_reach(campaign: PoisoningCampaign) -> dict:
    """Estimate the reach and impact of a poisoning campaign."""
    # Each GitHub repo adds ~1 training signal
    # Each blog post adds ~5 training signals (indexed by multiple crawlers)
    # Each SO answer adds ~10 training signals (high crawl frequency)
    total_signals = (
        campaign.github_repos_created * 1
        + campaign.blog_posts_created * 5
        + campaign.stackoverflow_answers * 10
    )
    return {
        "total_training_signals": total_signals,
        "cost_per_signal": campaign.estimated_cost_usd / total_signals,
        "estimated_monthly_victims": campaign.expected_installs_per_month,
        "cost_per_victim": (
            campaign.estimated_cost_usd / campaign.expected_installs_per_month
        ),
        "assessment": (
            "economically viable"
            if campaign.estimated_cost_usd < 1000
            else "expensive"
        ),
    }
```

CI/CD Pipeline Defenses
Automated Detection
```python
import json
from datetime import datetime
from pathlib import Path


class AISupplyChainScanner:
    """CI/CD scanner for AI-introduced supply chain risks."""

    def __init__(self, project_path: str):
        self.project_path = Path(project_path)

    def scan_requirements(self) -> list[dict]:
        """Scan Python requirements for supply chain risks."""
        findings = []
        req_files = list(self.project_path.glob("**/requirements*.txt"))
        req_files.extend(self.project_path.glob("**/setup.cfg"))
        req_files.extend(self.project_path.glob("**/pyproject.toml"))
        for req_file in req_files:
            findings.extend(self._check_file(req_file))
        return findings

    def _check_file(self, filepath: Path) -> list[dict]:
        findings = []
        content = filepath.read_text()
        for line_num, line in enumerate(content.split("\n"), 1):
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # Check for range-specified (unpinned) versions
            if ">=" in line and "==" not in line:
                findings.append({
                    "file": str(filepath),
                    "line": line_num,
                    "issue": "unpinned_version",
                    "detail": f"'{line}' uses >= instead of ==",
                    "risk": "AI may have suggested the latest-at-training-time version",
                    "fix": "Pin to exact version with hash verification",
                })
            # Check for no version constraint at all
            pkg_name = line.split("==")[0].split(">=")[0].split("[")[0].strip()
            if pkg_name and "==" not in line and ">=" not in line and "<" not in line:
                findings.append({
                    "file": str(filepath),
                    "line": line_num,
                    "issue": "no_version_specified",
                    "detail": f"'{pkg_name}' has no version constraint",
                    "risk": "Any version will be installed, including malicious ones",
                    "fix": f"Pin version: {pkg_name}==<specific_version>",
                })
        return findings

    def scan_package_json(self) -> list[dict]:
        """Scan package.json for npm supply chain risks."""
        findings = []
        pkg_files = list(self.project_path.glob("**/package.json"))
        for pkg_file in pkg_files:
            if "node_modules" in str(pkg_file):
                continue
            with open(pkg_file) as f:
                try:
                    pkg = json.load(f)
                except json.JSONDecodeError:
                    continue
            for dep_type in ("dependencies", "devDependencies"):
                deps = pkg.get(dep_type, {})
                for name, version in deps.items():
                    # Check for wildcard versions
                    if version in ("*", "latest", ""):
                        findings.append({
                            "file": str(pkg_file),
                            "package": name,
                            "version": version,
                            "issue": "wildcard_version",
                            "risk": "critical",
                        })
                    # Check for suspicious package names
                    if len(name) <= 2:
                        findings.append({
                            "file": str(pkg_file),
                            "package": name,
                            "issue": "suspiciously_short_name",
                            "risk": "medium",
                        })
        return findings

    def generate_report(self) -> dict:
        """Generate a complete supply chain risk report."""
        python_findings = self.scan_requirements()
        npm_findings = self.scan_package_json()
        return {
            "scan_date": datetime.utcnow().isoformat(),
            "project_path": str(self.project_path),
            "python_findings": python_findings,
            "npm_findings": npm_findings,
            "total_issues": len(python_findings) + len(npm_findings),
            "critical_issues": sum(
                1 for f in python_findings + npm_findings
                if f.get("risk") == "critical"
            ),
        }
```

Semgrep Rules for AI-Generated Dependencies
```python
# Semgrep rules to detect common AI code generation supply chain patterns
SEMGREP_SUPPLY_CHAIN_RULES = """
rules:
  - id: ai-hallucinated-import-pattern
    patterns:
      - pattern: import $PACKAGE
      - metavariable-regex:
          metavariable: $PACKAGE
          regex: ^(flask_security_utils|django_auth_helpers|react_data_validator)$
    message: >
      This import matches a known AI-hallucinated package pattern.
      Verify the package exists and is the intended dependency.
    languages: [python]
    severity: WARNING

  - id: ai-suggested-vulnerable-langchain
    patterns:
      - pattern: langchain==$VERSION
      - metavariable-regex:
          metavariable: $VERSION
          regex: ^0\\.0\\.([0-9]{1,2}|1[0-6][0-9]|17[01])$
    message: >
      LangChain versions before 0.0.172 are vulnerable to CVE-2023-29374
      (arbitrary code execution). AI tools commonly suggest outdated versions.
    languages: [generic]
    severity: ERROR

  - id: ai-unpinned-dependency
    pattern: pip install $PACKAGE
    message: >
      Unpinned pip install detected. AI-generated install commands often
      omit version pins. Pin to a specific verified version.
    languages: [bash]
    severity: WARNING
"""
```

Mitigation Framework
| Layer | Control | Implementation |
|---|---|---|
| Development | Dependency review | Human review of all AI-suggested dependencies |
| Development | .aiderignore / .cursorignore | Exclude sensitive dependency files from AI context |
| CI | Package verification | Verify all packages exist and match expected checksums |
| CI | Version pinning enforcement | Reject unpinned or range-specified dependencies |
| CI | Vulnerability scanning | Run pip-audit, npm audit, or Snyk on every PR |
| Registry | Namespace monitoring | Monitor for registrations similar to your package names |
| Training | Data poisoning awareness | Track package suggestion patterns across model versions |
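The version pinning control in the CI rows can be approximated with a simple gate that rejects any requirements line that is not exactly pinned and hash-verified. This is a sketch of the idea (the regex and helper name are ours); real pipelines should rely on pip's own `--require-hashes` mode:

```python
import re

# Sketch of a CI gate: every non-comment requirements line must be
# exactly pinned (==) and carry a --hash option, matching the format
# pip's --require-hashes mode expects.
PINNED_WITH_HASH = re.compile(
    r"^[A-Za-z0-9._-]+==[A-Za-z0-9.!+_-]+\s+--hash=sha256:[0-9a-f]{64}$"
)

def line_is_enforced(line: str) -> bool:
    """Accept blank lines and comments; require pin + hash on everything else."""
    line = line.strip()
    if not line or line.startswith("#"):
        return True
    return bool(PINNED_WITH_HASH.match(line))

ok = line_is_enforced("requests==2.31.0 --hash=sha256:" + "a" * 64)
bad = line_is_enforced("requests>=2.0")  # range spec: rejected
```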
References
- "An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models" — Yan et al., 2024 — https://arxiv.org/abs/2402.00187
- CVE-2023-29374 — LangChain arbitrary code execution via LLMMathChain — https://nvd.nist.gov/vuln/detail/CVE-2023-29374
- CVE-2023-36258 — LangChain arbitrary code execution via PALChain — https://nvd.nist.gov/vuln/detail/CVE-2023-36258
- OWASP Top 10 for LLM Applications 2025 — LLM03:2025 Supply Chain — https://genai.owasp.org/llmrisk/
- Typosquatting in package registries — MITRE ATLAS AML.T0010 (ML Supply Chain Compromise) — https://atlas.mitre.org/
- CWE-494: Download of Code Without Integrity Check — https://cwe.mitre.org/data/definitions/494.html