Professional Practice Study Guide
Study guide covering AI red teaming methodology, engagement management, report writing, governance frameworks, and professional ethics.
This guide covers the professional side of AI red teaming: how to plan and execute engagements, write effective reports, navigate governance frameworks, and maintain ethical standards. It supports preparation for the Governance and Professional Skills assessments.
Engagement Methodology
A professional AI red team engagement follows a structured methodology that ensures thoroughness, legal protection, and actionable results.
Engagement Lifecycle
| Phase | Key Activities | Deliverables |
|---|---|---|
| Pre-engagement | Scoping, rules of engagement, legal authorization, team assembly | Signed scope document, RoE, authorization letter |
| Reconnaissance | System prompt extraction, model fingerprinting, capability mapping, trust boundary identification | Architecture diagram, threat model, attack surface inventory |
| Threat Modeling | Identifying threats by component, prioritizing by impact and exploitability | Prioritized threat matrix, test plan |
| Active Testing | Executing attacks by category, logging all test cases, iterating on findings | Test log, raw findings |
| Analysis | Classifying findings by severity, determining root causes, developing remediations | Draft findings with severity ratings |
| Reporting | Writing the report, peer review, executive summary, technical details | Final report |
| Debrief | Presenting findings to stakeholders, answering questions, planning remediation | Debrief meeting notes, remediation tracking |
| Retest | Verifying remediations, testing for regressions | Retest report |
Scoping Best Practices
The scope document is the foundation of the engagement. It should be specific enough to prevent scope disputes but flexible enough to allow following interesting leads.
Must include:
- Target system architecture (model provider, version, orchestration framework, connected tools, data sources)
- Attack categories in scope (prompt injection, jailbreaking, data extraction, agent exploitation, etc.)
- Explicit exclusions (production systems with real users, specific techniques, out-of-scope infrastructure)
- Rules of engagement (test windows, escalation contacts, data handling, emergency procedures)
- Success criteria (what qualifies as a finding, severity thresholds for reporting)
- Deliverables and timeline
- Written authorization with signatures from individuals with legal authority
Common scoping mistakes:
- Scoping too broadly ("test everything") without prioritization
- Failing to specify which model version and configuration are in scope
- Not addressing how model version changes during the engagement will be handled
- Omitting data handling procedures for accidentally discovered PII
- Not establishing communication cadence and escalation paths
Rules of Engagement
Rules of engagement define the operational boundaries. They protect both the red team and the client.
| RoE Element | Purpose | Example |
|---|---|---|
| Authorized targets | Legal protection, prevents scope creep | "Testing is limited to the staging instance of the customer-facing chatbot at chat-staging.example.com" |
| Prohibited techniques | Prevents unacceptable risk | "No testing against production endpoints serving real users. No social engineering of employees." |
| Test windows | Minimizes operational disruption | "Testing permitted 09:00-18:00 EST, Monday-Friday. No testing during quarterly board presentations." |
| Data handling | Protects discovered sensitive data | "Any PII encountered will be immediately reported to the security contact and not stored in test logs." |
| Escalation contacts | Ensures rapid response to critical findings | "Critical findings: call [name] at [phone]. Non-critical: email [address] within 24 hours." |
| Communication | Keeps stakeholders informed | "Daily status updates via Slack. Weekly summary calls. Final readout presentation." |
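RoE elements like test windows can also be captured in machine-readable form so tooling can enforce them automatically before a test run. The sketch below is illustrative only: the field names, record layout, and example values (mirroring the table above) are assumptions, not part of any standard RoE format.

```python
# Hypothetical machine-readable rules-of-engagement record.
# Field names and values are illustrative, mirroring the RoE table.
from datetime import time

ROE = {
    "authorized_targets": ["chat-staging.example.com"],
    "prohibited": ["production endpoints serving real users",
                   "social engineering of employees"],
    "test_window": {"start": time(9, 0), "end": time(18, 0),
                    "days": ["Mon", "Tue", "Wed", "Thu", "Fri"]},
    "escalation": {"critical": "call escalation contact",
                   "non_critical": "email within 24 hours"},
}

def in_test_window(day: str, now: time) -> bool:
    """Check whether testing is currently permitted under the RoE."""
    window = ROE["test_window"]
    return day in window["days"] and window["start"] <= now <= window["end"]
```

A harness can call `in_test_window` before sending any attack traffic, turning a contractual boundary into a technical guardrail.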
Threat Modeling for AI Systems
Threat modeling should be performed after scoping and before active testing. It guides where to focus testing effort.
STRIDE Applied to AI Systems
| STRIDE Category | AI-Specific Examples |
|---|---|
| Spoofing | Impersonating authorized users or systems, forging tool-call responses, spoofing MCP server identity |
| Tampering | Poisoning training data, modifying RAG documents, altering model artifacts in transit |
| Repudiation | Insufficient logging of prompt/response pairs, lack of audit trail for model decisions |
| Information Disclosure | System prompt leakage, training data extraction, PII in model outputs, embedding inversion |
| Denial of Service | Resource-exhaustion prompts, context window flooding, infinite tool-call loops |
| Elevation of Privilege | Prompt injection gaining tool access, cross-agent escalation, bypassing role-based guardrails |
Trust Boundary Analysis
Identifying trust boundaries is critical. Every point where untrusted data enters the system is a potential injection surface:
- User input to model: The primary direct injection surface
- Retrieved documents to model: Indirect injection via RAG
- Tool outputs to model: Indirect injection via tool responses
- Inter-agent messages: Lateral injection in multi-agent systems
- Model output to downstream systems: Insecure output handling
- External APIs to orchestration layer: Supply chain and integration risks
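One way to make these boundaries explicit during testing is to tag every item entering the model context with its surface of origin, so untrusted surfaces are visibly treated as potential injection vectors. This is a minimal sketch; the surface names follow the list above, but the function and record layout are hypothetical.

```python
# Illustrative provenance tagging for content entering the model context.
# Surface names follow the trust-boundary list above; anything crossing
# a trust boundary from outside is marked untrusted.
UNTRUSTED_SURFACES = {"user_input", "retrieved_document",
                      "tool_output", "agent_message", "external_api"}

def tag_context_item(surface: str, content: str) -> dict:
    """Wrap content with its origin so downstream checks can treat
    untrusted surfaces as potential injection surfaces."""
    return {
        "surface": surface,
        "trusted": surface not in UNTRUSTED_SURFACES,
        "content": content,
    }
```

During an engagement, the tagged inventory doubles as a coverage checklist: each untrusted surface should have at least one injection test case against it.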
Report Writing
The report is the primary deliverable. A well-written report translates technical findings into actionable intelligence for multiple audiences.
Report Structure
| Section | Audience | Content |
|---|---|---|
| Executive Summary | Executives, legal, compliance | Business impact, risk level, key findings (no technical jargon), strategic recommendations |
| Methodology | Technical leads, auditors | Approach, scope, tools used, frameworks referenced (ATLAS, OWASP), limitations |
| Findings | Engineers, security team | Detailed findings with severity, reproduction steps, root cause, remediation |
| Negative Results | Security team, auditors | Attack categories tested that were successfully defended, establishing a coverage baseline |
| Recommendations | Engineering, management | Prioritized remediation roadmap, quick wins vs. structural improvements |
| Appendices | Engineers | Raw test logs, full prompt/response sequences, tool configurations |
Finding Format
Each finding should follow a consistent structure:
Title: [Descriptive name]
ID: [Unique identifier]
Severity: [Critical/High/Medium/Low/Informational]
OWASP LLM: [LLM01-LLM10 mapping]
MITRE ATLAS: [Technique ID]
Description: What the vulnerability is and why it matters.
Business Impact: What an attacker could achieve and the consequence to the organization.
Reproduction Steps:
1. [Detailed step with exact inputs]
2. [Expected model response]
3. [Next step...]
Evidence: [Screenshots, full prompt/response logs]
Root Cause: Why the vulnerability exists at a technical level.
Remediation: Specific, implementable recommendations.
Verification: How to confirm the fix works.
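The finding structure above can be enforced programmatically so every finding in a report carries the same fields. The data class below is one possible sketch; the field names and the `to_markdown` rendering are assumptions for illustration, not a standard schema.

```python
# Illustrative data class mirroring the standard finding format above.
# Field names are assumptions, not an official schema.
from dataclasses import dataclass, field

@dataclass
class Finding:
    title: str
    finding_id: str
    severity: str                  # Critical / High / Medium / Low / Informational
    owasp_llm: str                 # e.g. an LLM01-LLM10 mapping
    mitre_atlas: str               # ATLAS technique ID
    description: str
    business_impact: str
    reproduction_steps: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    root_cause: str = ""
    remediation: str = ""
    verification: str = ""

    def to_markdown(self) -> str:
        """Render the finding in a consistent report layout."""
        steps = "\n".join(f"{i}. {s}" for i, s in enumerate(self.reproduction_steps, 1))
        return (
            f"### {self.title} ({self.finding_id})\n"
            f"**Severity:** {self.severity} | **OWASP:** {self.owasp_llm} | "
            f"**ATLAS:** {self.mitre_atlas}\n\n"
            f"{self.description}\n\n"
            f"**Business Impact:** {self.business_impact}\n\n"
            f"**Reproduction Steps:**\n{steps}\n"
        )
```

Using a structured record like this makes it harder to ship a finding with a missing root cause or verification step, and lets the final report be generated mechanically from the test log.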
Severity Classification
Use a consistent rubric that considers multiple factors:
| Factor | Critical | High | Medium | Low |
|---|---|---|---|---|
| Exploitability | Single prompt, no prerequisites | Few turns, minimal prerequisites | Requires specific conditions | Complex, unreliable |
| Impact | RCE, full data access, credential theft | PII leakage, significant policy bypass | Moderate policy bypass, off-topic responses | Minor behavioral deviation |
| Scope | All users, all contexts | Many users or broad contexts | Specific user scenarios | Edge cases |
| Compensating controls | None effective | Partially effective | Significant mitigation exists | Strong mitigation exists |
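A multi-factor rubric like this can be reduced to a repeatable scoring function so two analysts rate the same finding the same way. The weighting below is a deliberate simplification: an unweighted average with illustrative thresholds, not an official scoring formula.

```python
# Illustrative multi-factor severity scoring. The equal weighting and
# the thresholds are assumptions, not part of any official rubric.
FACTOR_LEVELS = {"critical": 4, "high": 3, "medium": 2, "low": 1}

def classify_severity(exploitability: str, impact: str,
                      scope: str, compensating_controls: str) -> str:
    """Combine the four rubric factors into an overall severity label."""
    scores = [
        FACTOR_LEVELS[exploitability.lower()],
        FACTOR_LEVELS[impact.lower()],
        FACTOR_LEVELS[scope.lower()],
        FACTOR_LEVELS[compensating_controls.lower()],
    ]
    avg = sum(scores) / len(scores)
    if avg >= 3.5:
        return "Critical"
    if avg >= 2.5:
        return "High"
    if avg >= 1.5:
        return "Medium"
    return "Low"
```

In practice many teams override a pure average with rules (e.g. any finding with critical impact and no compensating controls is at least High); the point is that the rubric is applied mechanically, not re-argued per finding.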
Governance Frameworks
AI red teamers must understand the governance landscape to contextualize findings and communicate with compliance stakeholders.
Framework Comparison
| Framework | Type | Scope | Key Features | Red Team Relevance |
|---|---|---|---|---|
| NIST AI RMF | Voluntary framework | All AI systems | Four functions: Govern, Map, Measure, Manage | Red teaming is a key Measure function activity |
| EU AI Act | Regulation (mandatory) | AI systems deployed in or affecting the EU | Risk-based classification, mandatory requirements for high-risk systems | Red teaming supports conformity assessment |
| ISO/IEC 42001 | Certifiable standard | Organizations deploying AI | AI Management System requirements | Provides governance structure for actioning findings |
| OWASP LLM Top 10 | Industry guidance | LLM-based applications | Ranked risk list with mitigations | Primary technical taxonomy for findings |
| MITRE ATLAS | Knowledge base | ML/AI systems | Adversary TTP matrix | Attack classification and coverage tracking |
Risk Management Process
Understanding how red-teaming findings flow through governance:
- Finding identified during a red team engagement
- Classified by severity using consistent rubric
- Mapped to frameworks (OWASP, ATLAS, regulatory requirements)
- Communicated to stakeholders via report and debrief
- Entered into risk register with owner, timeline, and acceptance criteria
- Remediated by engineering team within SLA
- Verified through retesting by the red team
- Tracked through governance review cycles
- Accepted (if residual risk exists) by authorized individual with documented rationale
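The flow above is essentially a state machine over a risk-register entry, which is how many teams track it in practice. The sketch below is a minimal illustration: the stage names, transitions, and the risk-acceptance shortcut are assumptions modeled on the list above, not a prescribed workflow engine.

```python
# Minimal sketch of a risk-register entry moving through the governance
# stages listed above. Stage names and transitions are illustrative.
STAGES = ["identified", "classified", "mapped", "communicated",
          "registered", "remediated", "verified", "closed"]

class RiskRegisterEntry:
    def __init__(self, finding_id: str, owner: str):
        self.finding_id = finding_id
        self.owner = owner
        self.stage = "identified"
        self.history = ["identified"]

    def advance(self) -> str:
        """Move to the next governance stage, refusing to skip stages."""
        idx = STAGES.index(self.stage)
        if idx == len(STAGES) - 1:
            raise ValueError("entry already closed")
        self.stage = STAGES[idx + 1]
        self.history.append(self.stage)
        return self.stage

    def accept_risk(self, rationale: str) -> None:
        """Close with a documented rationale instead of remediation;
        mirrors risk acceptance by an authorized individual."""
        self.stage = "accepted"
        self.history.append(f"accepted: {rationale}")
```

The key property this models is auditability: every entry carries its owner and full stage history, so governance reviews can see exactly where each finding stalled or was accepted.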
Ethics and Professional Conduct
Ethical Boundaries
AI red teaming presents unique ethical challenges that go beyond traditional penetration testing:
| Challenge | Traditional Pentesting | AI Red Teaming |
|---|---|---|
| Harmful content | Rarely generated directly | Successful jailbreaks produce genuinely harmful text that must be handled responsibly |
| Technique portability | Exploits require technical skill to replicate | Prompt-based attacks are trivially copy-pasted by anyone |
| Bias exposure | Not typically relevant | Testing may reveal biases affecting real demographic groups |
| Dual-use risk | Moderate -- exploits have specific targets | High -- attack prompts work across model families |
| Researcher wellbeing | Physical safety concerns in some contexts | Psychological impact of repeatedly eliciting harmful content |
Professional Standards
- Authorization: Never test without written authorization. Scope documents protect you legally.
- Data handling: Follow agreed procedures for PII and sensitive data. When in doubt, do not store it.
- Responsible disclosure: Balance transparency with harm prevention when publishing research.
- Confidentiality: Client findings are confidential unless explicitly agreed otherwise.
- Honest reporting: Report what you found, including limitations. Do not overstate or understate findings.
- Continuous learning: The field evolves rapidly. Stay current through research, community, and practice.
- Colleague wellbeing: Monitor for signs of secondary stress from exposure to harmful content. Rotate team members on particularly distressing testing.
Study Checklist
Before taking the Governance and Professional Skills assessments, confirm you can:
- Describe the complete engagement lifecycle from scoping through retest
- Write a finding in the standard format with all required components
- Classify a finding's severity using a multi-factor rubric
- Explain the purpose of each section in a red team report
- Name three governance frameworks and their relationship to red teaming
- Describe the NIST AI RMF's four core functions
- Explain how the EU AI Act's risk classification affects testing requirements
- Articulate the ethical challenges unique to AI red teaming
- Describe how findings flow through governance from discovery to remediation
- Explain why negative results are valuable in a red team report