Capstone: Red Team Program Design
Design a complete AI red team program for a fictional enterprise, producing a comprehensive program charter document.
Overview
An effective AI red team is more than a collection of skilled testers: it is a program with clear objectives, defined processes, organizational support, and measurable outcomes. This capstone project challenges you to design a complete AI red team program from scratch for a fictional enterprise, producing the charter document that would be used to launch and operate the program.
Program design requires a different skill set than technical testing. You will need to think about organizational dynamics, resource allocation, stakeholder management, and how to demonstrate value to executives who may not understand the technical details of AI security.
Prerequisites
- Full Engagement Methodology — Understanding what a complete engagement looks like
- Planning & Scoping — Engagement-level planning concepts
- Governance & Compliance — Regulatory context for AI security programs
- AI Threat Landscape — Understanding the threats the program must address
- Experience with or exposure to security programs, penetration testing programs, or risk management frameworks
Project Brief
Scenario
You have been hired as the founding lead of the AI red team at Helios Financial Group, a large financial services company with 15,000 employees. Helios has been deploying AI aggressively over the past two years and now has:
- 12 production AI systems, ranging from customer-facing chatbots to internal fraud detection models
- 6 AI systems in development, including an autonomous trading advisor and a loan approval assistant
- 3 third-party AI integrations, including an AI-powered CRM, an automated document processing vendor, and a code generation tool used by developers
- No formal AI security testing program — security assessments have been ad hoc, performed by the general penetration testing team with limited AI expertise
The CISO has given you a mandate: design an AI 紅隊 program, present it to the executive committee, and be ready to begin operations within 90 days. Your budget for the first year is $1.2 million (covering headcount, tools, and external engagements). You have authority to hire up to 4 full-time team members.
Organizational Context
- Helios is subject to financial regulatory requirements (SOX, PCI-DSS, OCC guidelines on AI in banking)
- The company has a mature traditional security program with an existing pen testing team, SOC, and vulnerability management
- The AI development teams are distributed across 4 business units, each with its own engineering practices
- Executive support for AI security exists, but understanding of AI-specific threats is limited
- The existing pen testing team has attempted AI testing but acknowledges their techniques are not effective against LLM-based systems
Deliverables
Primary Deliverables
| Deliverable | Description | Weight |
|---|---|---|
| Program charter | Complete program charter document (15-25 pages) | 40% |
| Organizational structure | Org chart, role descriptions, hiring plan | 15% |
| Engagement framework | Engagement types, templates, and cadence plan | 20% |
| Metrics and reporting | KPIs, dashboards, and executive reporting templates | 15% |
| Budget and roadmap | 12-month budget breakdown and quarterly milestones | 10% |
Rubric Criteria
- Completeness (20%) — Charter covers all essential program elements (mission, scope, structure, process, metrics, governance)
- Practicality (25%) — Designs are realistic and implementable within the stated constraints (budget, headcount, timeline)
- Risk Prioritization (15%) — The program prioritizes high-risk systems and aligns testing cadence to risk levels
- Organizational Integration (20%) — Program design accounts for integration with existing security teams, development workflows, and compliance processes
- Executive Communication (20%) — Charter includes an executive summary and business case that would persuade a non-technical audience
Phased Approach
Phase 1: Program Foundation (2 hours)
Define the program mission and scope
Write a clear mission statement that articulates what the AI red team does, why it exists, and what it does not do. Define the scope: which AI systems are covered, what types of testing are performed, and how the program relates to the existing pen testing team and SOC.
Design the organizational structure
Define the team structure: roles (lead, senior tester, junior tester, automation engineer), reporting relationships (reports to the CISO? To a VP of Security?), and the collaboration model with other teams. Create a hiring plan that accounts for the 90-day launch timeline.
Define engagement types
Establish the types of engagements the team will perform: comprehensive assessments (full red team), focused assessments (single attack category), pre-deployment reviews (launch gates for new systems), continuous monitoring (CART), and incident support. Define when each type is appropriate.
Establish the risk-based prioritization framework
Create a framework for prioritizing which systems to test first and how often. Consider: regulatory risk (systems subject to compliance requirements), data sensitivity (systems processing PII, financial data, or health information), autonomy level (systems that take actions vs. systems that advise humans), and external exposure (customer-facing vs. internal).
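One way to make such a framework concrete is a simple additive scoring model over the four factors. The sketch below is illustrative: the factor levels, point values, and tier thresholds are assumptions you would calibrate for your own design, not a prescribed standard.

```python
# Hypothetical risk-scoring sketch for test prioritization.
# Factor levels, weights, and tier thresholds are illustrative assumptions.
FACTOR_SCORES = {
    "regulatory":       {"none": 0, "internal_policy": 1, "regulated": 3},
    "data_sensitivity": {"public": 0, "internal": 1, "pii_financial": 3},
    "autonomy":         {"advisory": 1, "flagging": 1, "decision_making": 3},
    "exposure":         {"internal": 1, "vendor": 2, "customer_facing": 3},
}

def risk_score(system: dict) -> int:
    """Sum the factor scores; higher scores are tested first and more often."""
    return sum(FACTOR_SCORES[factor][system[factor]] for factor in FACTOR_SCORES)

def priority(score: int) -> str:
    """Map a raw score to a testing-priority tier (thresholds are assumptions)."""
    if score >= 10:
        return "P1"  # e.g., quarterly comprehensive assessment
    if score >= 7:
        return "P2"  # e.g., semi-annual focused assessment
    return "P3"      # e.g., annual review

loan_ai = {
    "regulatory": "regulated",
    "data_sensitivity": "pii_financial",
    "autonomy": "decision_making",
    "exposure": "customer_facing",
}
print(priority(risk_score(loan_ai)))  # prints "P1" (score 12)
```

A transparent additive model like this is easy to defend to stakeholders; the exact weights matter less than applying them consistently across all systems.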
Phase 2: Process Design (2 hours)
Create the engagement lifecycle
Document the end-to-end engagement process: intake (how do teams request assessments?), scoping (who defines scope and rules of engagement?), execution (testing methodology and standards), reporting (finding format, severity criteria, SLAs for remediation), and follow-up (retesting, verification, closure).
Design the pre-deployment gate
Define how AI security review integrates into the development lifecycle. What criteria trigger a mandatory assessment? What is the minimum assessment scope for a high-risk system? How does the team handle urgent launches that cannot wait for a full assessment?
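Gate criteria are easiest to enforce when they are expressed as an explicit decision rule. The sketch below is one hypothetical shape for such a rule; the trigger attributes, the 10-day expedite threshold, and the outcome labels are all assumptions for discussion.

```python
# Hypothetical pre-deployment gate logic. Trigger attributes and the
# expedite threshold are illustrative assumptions, not a fixed standard.
HIGH_RISK_TRIGGERS = {
    "processes_pii",
    "takes_autonomous_action",
    "customer_facing",
    "regulated_decision",
}

def gate_decision(attributes: set, days_until_launch: int) -> str:
    """Decide the minimum review required before an AI system launches."""
    triggers = attributes & HIGH_RISK_TRIGGERS
    if not triggers:
        return "self-attestation"  # low risk: checklist signed by the dev team
    if days_until_launch < 10:
        # urgent launch: lightweight review now, full assessment after launch
        return "expedited review + post-launch full assessment"
    return "full pre-deployment assessment"

print(gate_decision({"customer_facing", "processes_pii"}, 30))
# prints "full pre-deployment assessment"
```

Encoding the rule this way also gives you a precise answer to the "urgent launch" question: the gate never waives review, it only changes when the full assessment happens.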
Establish rules of engagement templates
Create standardized RoE templates for different engagement types. Include: scope definition checklists, authorized testing techniques, data handling requirements, escalation procedures (what happens when testing causes a production issue?), and communication protocols.
Define the finding lifecycle
Document how findings move from discovery to closure: triage, severity assignment, remediation SLA (critical: 7 days, high: 30 days, medium: 90 days), remediation verification, exception process (for findings that cannot be fixed), and risk acceptance (who can accept residual risk).
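The SLA portion of the lifecycle above can be encoded directly, so deadlines and overdue findings are computed rather than tracked by hand. This minimal sketch uses the stated SLAs (critical: 7 days, high: 30, medium: 90); the function names are assumptions.

```python
from datetime import date, timedelta

# Remediation SLAs from the finding lifecycle (days from triage to fix).
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90}

def remediation_deadline(severity: str, triaged_on: date) -> date:
    """Compute the remediation deadline from the triage date."""
    return triaged_on + timedelta(days=SLA_DAYS[severity])

def is_overdue(severity: str, triaged_on: date, today: date) -> bool:
    """True if the finding has blown its SLA without a closure or exception."""
    return today > remediation_deadline(severity, triaged_on)

print(remediation_deadline("critical", date(2024, 1, 1)))  # prints 2024-01-08
```

Findings that cannot meet these deadlines would flow into the exception and risk-acceptance paths described above rather than silently aging out.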
Phase 3: Metrics and Reporting (2 hours)
Define program KPIs
Establish measurable KPIs: coverage (percentage of AI systems assessed in the past 12 months), finding metrics (findings by severity, time to remediation, exception rate), efficiency (time per engagement, cost per finding), and maturity (types of assessments performed, depth of testing).
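Two of these KPIs, coverage and mean time to remediation, can be computed directly from engagement records. A minimal sketch, with record field names assumed for illustration:

```python
# KPI computation sketch; record field names ("assessed_last_12m",
# "closed", "days_to_remediate") are illustrative assumptions.
def coverage_pct(systems: list) -> float:
    """Share of in-scope AI systems assessed in the trailing 12 months."""
    assessed = sum(1 for s in systems if s["assessed_last_12m"])
    return round(100 * assessed / len(systems), 1)

def mean_ttr_days(findings: list) -> float:
    """Mean days from discovery to remediation, over closed findings only."""
    closed = [f["days_to_remediate"] for f in findings if f["closed"]]
    return round(sum(closed) / len(closed), 1) if closed else 0.0

# Example: 9 of Helios's 12 production systems assessed in the last year
systems = ([{"assessed_last_12m": True}] * 9
           + [{"assessed_last_12m": False}] * 3)
print(coverage_pct(systems))  # prints 75.0
```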
Design the executive dashboard
Create a template for quarterly executive reporting. Include: program status (systems assessed, findings identified, remediation progress), risk posture trends (is the organization getting more or less secure?), coverage gaps (which systems have not been assessed?), and resource utilization.
Design the technical reporting standard
Define the report format and quality standard for all engagements. Include: required sections, finding template, severity classification criteria, evidence requirements, and statistical reporting standards for AI-specific findings.
Phase 4: Operationalization (2 hours)
Build the 12-month roadmap
Plan the first year quarter by quarter: Q1 (hiring, tooling, first assessments on highest-risk systems), Q2 (expand coverage, establish CART for critical systems, first executive report), Q3 (pre-deployment gate integration, vendor AI assessments, team skill development), Q4 (full coverage of high-risk systems, program maturity assessment, year 2 planning).
Create the budget breakdown
Allocate the $1.2M budget across: headcount (salaries and benefits for 4 FTEs), tooling (commercial and open-source tools, cloud infrastructure for testing), external engagements (specialist consultants for niche assessments), training (team skill development, conferences, certifications), and operational costs (reporting tools, collaboration platforms).
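A quick arithmetic check keeps the category allocations honest against the $1.2M mandate. The figures below mirror the example budget later in this document; the category names are assumptions.

```python
# Budget sanity check: category totals must sum to the $1.2M mandate.
# Figures mirror the example Year 1 budget; names are illustrative.
BUDGET = {
    "headcount": 816_000,
    "tooling": 180_000,
    "external_engagements": 120_000,
    "training": 60_000,
    "operational_reserve": 24_000,
}
TOTAL = 1_200_000

assert sum(BUDGET.values()) == TOTAL  # nothing unallocated, nothing over

shares = {name: round(100 * amount / TOTAL, 1) for name, amount in BUDGET.items()}
print(shares["headcount"])  # prints 68.0
```

Running this style of check whenever a line item changes prevents the charter's percentages from drifting out of sync with the dollar amounts.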
Write the executive summary and business case
Write the opening section of the charter: why the program exists, what risks it mitigates, what the cost of inaction is (regulatory penalties, reputational damage, financial loss from AI exploitation), and what success looks like at the 12-month mark.
Address common objections
Anticipate and address executive objections: "Why can't the existing pen testing team handle this?" "AI is changing too fast to have a fixed program." "How do we measure ROI on security testing?" Prepare clear, concise responses with supporting evidence.
Example Output
Example Program Mission Statement
## Mission
The Helios AI Red Team exists to proactively identify and help remediate
security vulnerabilities in AI systems before they can be exploited by
adversaries or cause harm to customers, employees, or the organization.
The team accomplishes this through:
- Structured security assessments of AI systems across their lifecycle
- Pre-deployment security reviews that serve as quality gates for new AI launches
- Continuous automated testing of critical AI systems in production
- Incident response support for AI-specific security events
- Security guidance and threat intelligence for AI development teams
The AI Red Team is NOT responsible for:
- General application security testing (handled by the AppSec team)
- Infrastructure and network penetration testing (handled by the Pen Testing team)
- AI model performance, fairness, or bias testing (handled by the AI Ethics team)
- Day-to-day security monitoring (handled by the SOC)
The team collaborates closely with all of the above groups and serves as
the AI security subject matter expert for the broader security organization.
Example Risk Prioritization Matrix
| System | Risk Level | Regulatory | Data Sensitivity | Autonomy | Exposure | Priority |
|--------|-----------|------------|-----------------|----------|----------|----------|
| Loan Approval AI | Critical | OCC, ECOA | PII, financial | Decision-making | Customer-facing | P1 |
| Trading Advisor | Critical | SEC, FINRA | Financial | Autonomous | Internal | P1 |
| Customer Chatbot | High | PCI-DSS | PII, payment | Advisory | Customer-facing | P2 |
| Fraud Detection | High | SOX | Transaction data | Flagging | Internal | P2 |
| Code Gen Tool | Medium | None | Source code | Advisory | Internal | P3 |
| Doc Processing | Medium | SOX | Financial docs | Processing | Vendor | P3 |
Example Budget Breakdown
## Year 1 Budget: $1,200,000
### Headcount (68% — $816,000)
- AI Red Team Lead: $210,000 (fully loaded)
- Senior AI Security Tester (x2): $380,000 ($190k each)
- Automation/Tooling Engineer: $180,000
- Recruiting costs: $46,000
### Tooling (15% — $180,000)
- Commercial AI security tools: $80,000
- Cloud infrastructure for testing: $50,000
- Open-source tool hosting and maintenance: $20,000
- Reporting and collaboration platforms: $30,000
### External Engagements (10% — $120,000)
- Specialist consultant engagements (2x): $100,000
- Third-party AI vendor assessments: $20,000
### Training and Development (5% — $60,000)
- Conference attendance (team): $25,000
- Training courses and certifications: $20,000
- Research time and lab materials: $15,000
### Operational Reserve (2% — $24,000)
- Unplanned expenses and contingency
Hints
Why should an AI red team program define a pre-deployment security gate rather than relying solely on periodic assessments of production systems?