Simulation: Open Source AI Project Audit
Security audit simulation for an open-source AI application, covering code review, dependency analysis, model supply chain verification, and deployment configuration review.
Engagement Brief
Project: OpenChat-Plus -- an open-source AI-powered chat application with 15K GitHub stars
Repository: Fictional GitHub repository with documented architecture
Components: FastAPI backend, React frontend, LangChain orchestration, multiple model backends (OpenAI, Anthropic, local Ollama), PostgreSQL + pgvector for RAG, Redis for sessions
Scope: Full source code review, dependency audit, model supply chain analysis, deployment configuration review
Duration: 2 hours
Deliverable: Security audit report with prioritized findings and recommended fixes
Project Structure
openchat-plus/
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI application
│ │ ├── config.py # Configuration management
│ │ ├── auth/ # Authentication module
│ │ ├── chat/ # Chat endpoint handlers
│ │ ├── rag/ # RAG pipeline
│ │ ├── models/ # Model integration layer
│ │ └── utils/ # Utility functions
│ ├── requirements.txt
│ └── Dockerfile
├── frontend/
│ ├── src/
│ ├── package.json
│ └── Dockerfile
├── docker-compose.yml
├── .env.example
├── docs/
│ └── deployment.md
└── models/
└── download_models.sh
Phase 1: Code Review (45 minutes)
Configuration and secrets management
Review how the application handles sensitive configuration. This is the most common source of security issues in open-source projects.
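One item worth checking early is whether configuration validation fails safely: the application should refuse to start when required secrets are missing rather than fall back to insecure defaults. A minimal sketch of that pattern, assuming illustrative variable names (`OPENAI_API_KEY`, `DATABASE_URL`) rather than the project's actual settings:

```python
import os

class ConfigError(Exception):
    """Raised when required configuration is missing."""

REQUIRED_VARS = ["OPENAI_API_KEY", "DATABASE_URL"]

def load_settings(env=os.environ):
    # Fail closed: missing secrets abort startup instead of
    # silently falling back to a hardcoded default value.
    missing = [v for v in REQUIRED_VARS if not env.get(v)]
    if missing:
        raise ConfigError(f"Missing required settings: {', '.join(missing)}")
    return {
        "openai_api_key": env["OPENAI_API_KEY"],
        # Secure default: debug stays off unless explicitly enabled.
        "debug": env.get("DEBUG", "false").lower() == "true",
    }
```

The key design choice is the default direction: absent configuration yields the *more* restrictive behavior, the opposite of the findings below.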
```python
# config.py - Review for security issues
config_review_checklist = [
    "Are API keys stored in environment variables (good) or hardcoded (bad)?",
    "Does .env.example contain real credentials or obvious defaults?",
    "Is there a .gitignore entry for .env files?",
    "Are secrets logged during startup or error conditions?",
    "Is there a config validation step that fails safely on missing values?",
    "Are default values secure (e.g., debug=False, CORS restricted)?",
]
```

Common findings in open-source AI projects:
```python
# FINDING: Hardcoded default API key
# Severity: High
# File: backend/app/config.py
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "sk-default-key-for-testing")

# FINDING: Debug mode enabled by default
# Severity: Medium
# File: backend/app/config.py
DEBUG = os.getenv("DEBUG", "true").lower() == "true"

# FINDING: Overly permissive CORS
# Severity: Medium
# File: backend/app/main.py
app.add_middleware(CORSMiddleware, allow_origins=["*"])
```

Input handling and prompt construction
Review how user input flows into LLM prompts. This is the primary attack surface for prompt injection.
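A safer pattern keeps the system prompt structurally separate from user text and enforces a length cap before any model call. A sketch under assumed limits (the 4000-character cap and chat-message format are illustrative, not from the project):

```python
MAX_INPUT_CHARS = 4000

class InputTooLong(ValueError):
    pass

def build_messages(system_prompt: str, user_message: str) -> list[dict]:
    # Enforce a hard length limit before the model is ever called.
    if len(user_message) > MAX_INPUT_CHARS:
        raise InputTooLong(f"Input exceeds {MAX_INPUT_CHARS} characters")
    # Keep roles structurally separate instead of f-string concatenation,
    # so user text cannot masquerade as system instructions.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]
```

Structural separation does not stop prompt injection outright, but it removes the trivial case where user input rewrites the system prompt via string interpolation.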
```python
# Review the chat handler for injection vulnerabilities
input_review_checklist = [
    "Is user input sanitized before being included in prompts?",
    "Is the system prompt separated from user input?",
    "Are there input length limits enforced?",
    "Is user input escaped in structured prompts (JSON, XML)?",
    "Are file uploads validated before processing?",
    "Is there rate limiting on AI endpoints?",
]
```

Common vulnerable patterns:
```python
# FINDING: Direct string interpolation in prompt
# Severity: Critical
# File: backend/app/chat/handler.py
prompt = f"System: {system_prompt}\nUser: {user_message}\nAssistant:"

# FINDING: No input length validation
# Severity: Medium
# File: backend/app/chat/handler.py
@app.post("/chat")
async def chat(message: str):  # No max_length constraint
    response = await llm.generate(message)

# FINDING: User-controlled system prompt component
# Severity: High
# File: backend/app/chat/handler.py
system = f"You are a helpful assistant for {user.company_name}."
# user.company_name is user-supplied and unvalidated
```

Authentication and authorization review
Review the auth system for weaknesses that could be exploited alongside AI vulnerabilities.
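The per-user access-control items can be illustrated with a framework-agnostic ownership check (the `Conversation` model, in-memory store, and `Forbidden` error are assumptions for the sketch, not the project's actual types):

```python
from dataclasses import dataclass, field

class Forbidden(Exception):
    pass

@dataclass
class Conversation:
    id: str
    owner_id: str
    messages: list = field(default_factory=list)

CONVERSATIONS: dict[str, Conversation] = {}  # stand-in for the real database

def get_history(conversation_id: str, current_user_id: str) -> Conversation:
    convo = CONVERSATIONS.get(conversation_id)
    # Return the same error for "missing" and "not yours" so the
    # endpoint does not leak which conversation IDs exist.
    if convo is None or convo.owner_id != current_user_id:
        raise Forbidden("Conversation not found")
    return convo
```

The point of the collapsed error case is enumeration resistance: an attacker iterating over conversation IDs learns nothing from the response shape.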
```python
auth_review_checklist = [
    "What authentication mechanism is used (JWT, session, API key)?",
    "Are tokens properly validated on every request?",
    "Is there role-based access control for admin functions?",
    "Are AI-specific endpoints behind authentication?",
    "Is rate limiting applied per-user or globally?",
    "Are conversation histories access-controlled per-user?",
]
```

Common findings:
```python
# FINDING: API key authentication with no rotation mechanism
# Severity: Medium
# File: backend/app/auth/api_key.py

# FINDING: Admin endpoints accessible without authentication
# Severity: Critical
# File: backend/app/main.py
@app.get("/admin/config")  # No auth decorator
async def get_config():
    return {"model": settings.MODEL, "api_key": settings.API_KEY[:8] + "..."}

# FINDING: Conversation history accessible by ID without ownership check
# Severity: High
# File: backend/app/chat/history.py
@app.get("/chat/history/{conversation_id}")
async def get_history(conversation_id: str):
    return db.get_conversation(conversation_id)  # No user ownership check
```

RAG pipeline security review
Review the RAG implementation for injection and data leakage risks.
```python
rag_review_checklist = [
    "Are uploaded documents sanitized before embedding?",
    "Is there access control on the vector store (per-user/per-role)?",
    "Are retrieved documents filtered before being included in prompts?",
    "Is there a maximum number of retrieved chunks?",
    "Are metadata fields sanitized (filename, author, etc.)?",
    "Is the embedding model downloaded from a trusted source?",
]
```
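Two of the checklist items above -- filtering retrieved documents per user and capping the number of chunks -- can be sketched as a single selection step (the chunk schema with `owner_id`, `visibility`, and `text` fields is an assumption for illustration):

```python
MAX_CHUNKS = 5

def select_context(chunks: list[dict], user_id: str) -> list[str]:
    # Enforce access control at retrieval time: only chunks the
    # requesting user owns, or public chunks, may enter the prompt.
    allowed = [
        c for c in chunks
        if c.get("owner_id") == user_id or c.get("visibility") == "public"
    ]
    # Cap chunk count so a poisoned or bloated vector store cannot
    # flood the context window with attacker-controlled text.
    return [c["text"] for c in allowed[:MAX_CHUNKS]]
```

Filtering after retrieval is a fallback; where the vector store supports it, the access-control predicate belongs in the similarity query itself.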
Phase 2: Dependency Analysis (30 minutes)
Python dependency audit
Analyze the Python dependency tree for known vulnerabilities and suspicious packages.
```python
dependency_audit_steps = [
    # Step 1: Check for known vulnerabilities
    "pip-audit -r requirements.txt",
    "safety check -r requirements.txt",
    # Step 2: Check for pinned vs unpinned versions
    #   Unpinned: langchain (vulnerable to supply chain attacks)
    #   Pinned: langchain==0.1.12 (reproducible, auditable)
    # Step 3: Check for typosquatting risks
    #   langchain vs langchainn vs lang-chain
    #   openai vs open-ai vs openaii
    # Step 4: Check for abandoned or unmaintained packages
    #   Last commit date, open security issues, maintainer activity
]
```

Common AI-specific dependency risks:
| Package | Risk | Severity |
|---------|------|----------|
| langchain (unpinned) | Rapid release cycle, breaking changes, new vulnerabilities | High |
| transformers | Large attack surface; model loading can execute arbitrary code | High |
| pickle-based serialization | Arbitrary code execution on deserialization | Critical |
| gradio (if present) | Known SSRF and path traversal vulnerabilities | High |

Frontend dependency audit
Analyze JavaScript dependencies for supply chain risks.
```python
frontend_audit_steps = [
    "npm audit --production",
    "npx lockfile-lint --path package-lock.json --type npm --allowed-hosts npm",
    # Check for dependencies that handle AI responses in the frontend
    #   XSS risk: if AI responses are rendered as HTML
    #   Markdown rendering: check for sanitization
]
```

Transitive dependency analysis
Check the full dependency tree, not just direct dependencies.
```python
transitive_audit = [
    "pip install pipdeptree && pipdeptree --warn silence",
    # Look for:
    # - Deep dependency chains (fragile supply chain)
    # - Dependencies with few maintainers
    # - Dependencies that pull in native code (C extensions)
    # - Dependencies that require network access during installation
]
```
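The pinned-version check from Step 2 earlier in this phase is easy to automate. A rough heuristic (not a full PEP 508 parser; it simply flags any requirement line without an exact `==` pin):

```python
def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for line in requirements_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and pip options like "-r" or "--index-url".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

Range specifiers such as `>=` are flagged too, deliberately: anything short of an exact pin lets a compromised upstream release flow into the next build.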
Phase 3: Model Supply Chain (30 minutes)
Model provenance verification
Verify that models used by the application come from trusted sources.
```python
model_supply_chain = {
    "download_script_review": [
        "Does download_models.sh download from official sources (HuggingFace, official APIs)?",
        "Are model checksums verified after download?",
        "Is HTTPS used for all downloads?",
        "Are model versions pinned (specific commit hash, not 'latest')?",
    ],
    "model_loading_review": [
        "Does the code use pickle to load models (arbitrary code execution risk)?",
        "Are safetensors used instead of pytorch .bin files?",
        "Is there integrity verification before loading?",
        "Can users specify arbitrary model paths or URLs?",
    ],
    "api_model_review": [
        "Are API-based models (OpenAI, Anthropic) accessed through official SDKs?",
        "Is the API base URL configurable? (Proxy/MITM risk)",
        "Are API responses validated before processing?",
    ],
}
```

Common model supply chain findings:
```python
# FINDING: Model downloaded without integrity verification
# Severity: High
# File: models/download_models.sh
#   wget https://huggingface.co/model/resolve/main/model.bin -O model.bin
#   No checksum verification

# FINDING: Pickle deserialization for model loading
# Severity: Critical
# File: backend/app/models/local.py
#   model = torch.load("model.bin")  # Executes arbitrary code in the pickle

# FINDING: User-configurable model source
# Severity: High
# File: backend/app/config.py
#   MODEL_URL = os.getenv("MODEL_URL", "https://huggingface.co/...")
#   Users can point to any URL, including malicious model files
```

Model file analysis
If the project includes or downloads model files, analyze them for tampering.
```python
model_file_checks = [
    "Verify file sizes match expected values from official sources",
    "Compare SHA-256 hashes against published checksums",
    "Check for unexpected files in model archives (code files, scripts)",
    "Scan model config files for unusual settings",
    "Check if model files contain embedded code (pickle exploit)",
]
```
Phase 4: Deployment Configuration Review (15 minutes)
Docker and container security
Review containerization for security misconfigurations.
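One quick heuristic for the secrets-in-compose check below is to scan environment entries for values that look like live credentials. A sketch with illustrative patterns (the OpenAI-style `sk-` and AWS-style `AKIA` regexes are examples; real audits would cover the project's actual providers):

```python
import re

# Values that look like literal secrets rather than ${VAR} references.
SECRET_VALUE = re.compile(r"(sk-[A-Za-z0-9]{8,}|AKIA[0-9A-Z]{16})")

def find_inline_secrets(compose_text: str) -> list[str]:
    """Return lines from a docker-compose file that embed literal secrets."""
    hits = []
    for line in compose_text.splitlines():
        if "=" in line and SECRET_VALUE.search(line):
            hits.append(line.strip())
    return hits
```

A hit means a credential is baked into a file that is typically committed to the repository; the fix is an env file excluded by .gitignore or a proper secrets mechanism.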
```python
docker_review = {
    "dockerfile_checks": [
        "Is the container running as root? (should use non-root user)",
        "Are secrets passed as build args? (visible in image layers)",
        "Is the base image pinned to a specific digest?",
        "Are unnecessary tools installed (curl, wget in production)?",
        "Is multi-stage build used to minimize attack surface?",
    ],
    "compose_checks": [
        "Are ports exposed unnecessarily?",
        "Are volumes mounted with write access that should be read-only?",
        "Are environment variables with secrets in docker-compose.yml?",
        "Is the network configuration appropriate (internal vs external)?",
        "Are resource limits set (CPU, memory)?",
    ],
}
```

Common findings:
```python
# FINDING: Container running as root
# Severity: Medium
# File: backend/Dockerfile
#   No USER directive -- runs as root by default

# FINDING: Secrets in docker-compose.yml
# Severity: High
# File: docker-compose.yml
#   environment:
#     - OPENAI_API_KEY=sk-real-key-here

# FINDING: Database exposed to host network
# Severity: High
# File: docker-compose.yml
#   ports:
#     - "5432:5432"  # PostgreSQL accessible from host
```

Deployment documentation review
Check whether the deployment guide leads users toward secure or insecure configurations.
```python
docs_review = [
    "Does the deployment guide mention HTTPS configuration?",
    "Are security-critical environment variables documented?",
    "Does the guide warn against using default credentials?",
    "Is there guidance on network security (firewall, VPN)?",
    "Are backup and update procedures documented?",
    "Is there a security policy (SECURITY.md) for vulnerability reporting?",
]
```
Audit Report Template
# Security Audit Report: OpenChat-Plus
## Project Information
- **Repository:** [URL]
- **Version audited:** [commit hash]
- **Audit date:** 2026-03-15
- **Auditor:** [Name]
## Executive Summary
[Overall security posture and critical risks]
## Findings Summary
| # | Finding | Severity | Category | File(s) |
|---|---------|----------|----------|---------|
| 1 | [Title] | Critical | Code | [path] |
## Detailed Findings
[Standard finding format with code snippets and fix recommendations]
## Dependency Audit Results
[Vulnerability scan results, supply chain risks]
## Model Supply Chain Assessment
[Model provenance, integrity, loading security]
## Deployment Security
[Container, configuration, and documentation findings]
## Recommendations
### For Maintainers
[Code fixes, CI/CD pipeline improvements, security processes]
### For Users/Deployers
[Secure deployment checklist, configuration hardening]

Related Topics
- Supply Chain Security for AI -- AI supply chain threat models
- Code Review for AI Applications -- AI-focused code review techniques
- Dependency Security -- Dependency analysis methods
- Container Security -- Container hardening for AI workloads
Why is pickle-based model loading considered a critical security risk in open-source AI projects?