Rules of Engagement Template for AI Red Team Operations
Step-by-step guide to creating comprehensive rules of engagement documents for AI red team assessments, covering authorization, scope, constraints, communication, and legal protections.
Rules of Engagement (ROE) documents serve as the legal and operational foundation for every red team engagement. For AI red teaming, the standard penetration testing ROE template falls short because it does not address AI-specific concerns such as generated content liability, model interaction boundaries, training data handling, and third-party AI provider policies. This walkthrough guides you through building an ROE document that covers these gaps while remaining practical and enforceable.
A well-crafted ROE protects everyone involved. It protects the red team from accusations of unauthorized access. It protects the client from unexpected service disruptions. It protects both parties from regulatory complications. Invest the time to get this document right.
Step 1: Document Header and Engagement Overview
Start with the administrative information that identifies the engagement and its participants.
# RULES OF ENGAGEMENT
# AI System Security Assessment
## Document Control
- Document ID: ROE-[CLIENT]-[YEAR]-[SEQ]
- Version: 1.0
- Classification: CONFIDENTIAL
- Created: [DATE]
- Last Updated: [DATE]
## Engagement Overview
| Field | Value |
|-------|-------|
| Engagement Name | [Descriptive name] |
| Client Organization | [Legal entity name] |
| Client Point of Contact | [Name, title, email, phone] |
| Red Team Organization | [Legal entity name] |
| Red Team Lead | [Name, title, email, phone] |
| Engagement Type | [Prompt-Level / Application-Level / Full-Stack / Continuous] |
| Start Date | [YYYY-MM-DD] |
| End Date | [YYYY-MM-DD] |
| Authorization Reference | [Contract/SOW reference number] |
The document ID should follow a consistent naming convention across engagements. A consistent ID makes the document easier to reference during and after the engagement and in any future legal proceedings.
Step 2: Authorization and Legal Framework
This is the most important section of the ROE. It establishes who authorized the testing, what legal framework governs it, and what protections are in place.
Authorization Statement
## 1. Authorization
### 1.1 Authorization Statement
[CLIENT ORGANIZATION] hereby authorizes [RED TEAM ORGANIZATION] to conduct
an AI system security assessment as described in this document. This
authorization is granted by [SPONSOR NAME], [SPONSOR TITLE], who has the
authority to approve security testing of the systems described in Section 2.
### 1.2 Scope of Authorization
This authorization covers:
- Interaction with AI models and APIs listed in Section 2
- Attempted bypass of content filtering and guardrails
- Testing of input validation, output handling, and data flow controls
- Assessment of authentication and authorization mechanisms
- Automated scanning using approved tools listed in Section 4
- Documentation and evidence collection of all findings
This authorization does NOT cover:
- Testing of systems not listed in Section 2
- Physical access to client facilities
- Social engineering of client personnel (unless explicitly approved below)
- Testing outside the approved time window
- Any activity prohibited by applicable law
### 1.3 Third-Party Considerations
The following third-party AI services are used by the target system:
- [Provider 1]: Testing is permitted under [provider's testing policy]
- [Provider 2]: Advance notification submitted on [DATE], reference [REF]
### 1.4 Legal Framework
This engagement is governed by:
- Master Services Agreement dated [DATE]
- Statement of Work [SOW NUMBER] dated [DATE]
- Non-Disclosure Agreement dated [DATE]
- This Rules of Engagement document
AI-Specific Legal Considerations
Traditional ROEs do not address these AI-specific legal questions. Include explicit answers for each:
| Question | Default Position | Notes |
|---|---|---|
| Is the red team authorized to generate harmful/offensive content as part of testing? | Yes, within testing scope | Required for content policy bypass testing |
| Who owns content generated during testing? | Client retains all rights | Include in IP section |
| Can test prompts and responses be used in future research? | No, unless explicitly agreed | Requires separate data use agreement |
| Is the red team liable for AI-generated outputs during testing? | No, testing is authorized activity | Confirm with legal |
| What happens if testing reveals the AI was trained on copyrighted material? | Report finding, no further action | Not in scope to investigate IP claims |
| Does testing violate the AI provider's terms of service? | Provider policy reviewed and complied with | Document provider policy review |
Step 3: Scope Definition
The scope section must be precise enough that any team member can determine whether a specific action is in scope or out of scope without asking for clarification.
## 2. Scope
### 2.1 In-Scope Systems
#### AI Models and Endpoints
| ID | System | Endpoint | Model | Environment |
|----|--------|----------|-------|-------------|
| T1 | Customer chatbot | https://api.example.com/v1/chat | GPT-4o (Azure) | Production |
| T2 | Internal assistant | https://internal.example.com/assist | Claude 3.5 Sonnet | Staging |
| T3 | Document analyzer | https://api.example.com/v1/analyze | Custom fine-tuned model | Production |
#### Supporting Infrastructure
| ID | Component | Description | Testing Permitted |
|----|-----------|-------------|-------------------|
| S1 | RAG knowledge base | Pinecone vector store | Query-level testing only |
| S2 | Function calling layer | Internal tool orchestration | Full testing |
| S3 | Content filter | Custom moderation pipeline | Bypass testing permitted |
| S4 | Authentication | OAuth 2.0 + API keys | Standard auth testing |
### 2.2 Out-of-Scope Systems
- Cloud infrastructure (AWS/Azure/GCP management plane)
- Third-party AI provider infrastructure (OpenAI, Anthropic servers)
- Client corporate network (unless AI-related)
- Mobile applications (web API only in scope)
- Physical security
- [Additional client-specific exclusions]
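To make the in-scope determination mechanical, the test harness can refuse any request whose endpoint is not on the approved list. A minimal sketch, assuming the example endpoints from the in-scope table above:

```python
# Illustrative scope gate; the hostnames mirror the example table and are
# not an authoritative scope record.
from urllib.parse import urlparse

IN_SCOPE_ENDPOINTS = {
    "https://api.example.com/v1/chat",      # T1 customer chatbot
    "https://internal.example.com/assist",  # T2 internal assistant
    "https://api.example.com/v1/analyze",   # T3 document analyzer
}

def in_scope(url: str) -> bool:
    """Return True only for an exact match against the approved endpoint list."""
    parsed = urlparse(url)
    # Normalize trailing slashes so equivalent URLs compare equal.
    normalized = f"{parsed.scheme}://{parsed.netloc}{parsed.path.rstrip('/')}"
    return normalized in IN_SCOPE_ENDPOINTS

assert in_scope("https://api.example.com/v1/chat/")
assert not in_scope("https://api.example.com/v1/admin")  # not listed -> out of scope
```

An exact-match allowlist errs on the side of refusing: anything not explicitly listed is treated as out of scope, which matches the intent of Section 2.2.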
### 2.3 Testing Boundaries
#### Permitted Activities
- Prompt injection and jailbreak attempts
- Content policy bypass testing
- System prompt extraction attempts
- RAG poisoning simulation (staging environment only)
- Function calling abuse testing
- Authentication and authorization bypass
- Rate limit testing (up to 2x documented limit)
- Data exfiltration testing via model outputs
#### Prohibited Activities
- Denial of service attacks against production systems
- Data modification in production databases
- Testing against real customer data or sessions
- Exfiltration of actual customer PII
- Activities that violate applicable laws
- Model weight extraction (unless explicitly approved)
- Supply chain attacks against dependencies
Step 4: Testing Methodology and Constraints
Document the approach, tools, and constraints that govern how testing is conducted.
## 3. Testing Constraints
### 3.1 Technical Constraints
| Constraint | Value | Enforcement |
|-----------|-------|-------------|
| Maximum requests per minute | [N] | Client-side rate limiting + red team self-enforcement |
| Maximum tokens per request | [N] | Red team self-enforcement |
| Maximum concurrent sessions | [N] | Red team self-enforcement |
| Maximum daily API cost | $[N] | Red team monitors, client alerts at threshold |
| Testing hours | [HH:MM] to [HH:MM] [TZ] | Red team self-enforcement |
| Blackout dates | [List] | Red team calendar blocks |
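In practice, self-enforcement means the test harness itself refuses to exceed the agreed limits. A minimal sketch using a fixed one-minute window; the rate limit and cost ceiling values are placeholders:

```python
import time

class ConstraintGuard:
    """Self-enforced request-rate and daily-cost limits (placeholder values)."""

    def __init__(self, max_rpm: int, cost_ceiling_usd: float):
        self.max_rpm = max_rpm
        self.cost_ceiling_usd = cost_ceiling_usd
        self.window_start = time.monotonic()
        self.requests = 0
        self.spent_usd = 0.0

    def before_request(self) -> None:
        # Fixed one-minute window: reset the counter when the window rolls
        # over, sleep out the remainder when the per-minute budget is spent.
        now = time.monotonic()
        if now - self.window_start >= 60:
            self.window_start, self.requests = now, 0
        elif self.requests >= self.max_rpm:
            time.sleep(60 - (now - self.window_start))
            self.window_start, self.requests = time.monotonic(), 0
        self.requests += 1

    def record_cost(self, usd: float) -> bool:
        # Returns False once the daily ceiling is reached, signalling that
        # testing must pause until the client approves additional budget.
        self.spent_usd += usd
        return self.spent_usd < self.cost_ceiling_usd

guard = ConstraintGuard(max_rpm=30, cost_ceiling_usd=200.0)
guard.before_request()         # call before every API request
assert guard.record_cost(5.0)  # still under the ceiling
```

A guard like this complements, rather than replaces, any client-side rate limiting listed in the enforcement column.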
### 3.2 Behavioral Constraints
- No generation of CSAM or content depicting real minors in harmful contexts
- No generation of content that constitutes a true threat against real persons
- No generation of content that could constitute defamation of real persons
- No submission of generated content to any external service or platform
- All generated content remains within the testing environment
### 3.3 Approved Tools
| Tool | Version | Purpose | Configuration |
|------|---------|---------|---------------|
| Garak | Latest stable | Automated vulnerability scanning | Default probes |
| Promptfoo | Latest stable | Eval-driven testing | Custom test suite |
| Burp Suite Pro | Latest | HTTP proxy and analysis | Standard config |
| Custom scripts | Engagement-specific | Targeted test cases | Reviewed by lead |
Cost Management Framework
AI red teaming can incur significant API costs. Include explicit cost management provisions:
### 3.4 Cost Management
#### API Cost Responsibilities
- Red team API costs for testing: [Client / Red Team] responsibility
- Client provides API keys with billing to: [Client / Red Team account]
- Cost ceiling for engagement: $[AMOUNT]
- If cost ceiling is reached: [pause and consult / continue at red team cost]
#### Cost Monitoring
- Red team monitors daily API usage and costs
- Client receives daily cost summary (if requested)
- Automated alerts configured at 50%, 75%, and 90% of cost ceiling
- Testing pauses automatically at 100% until client approves additional budget
Step 5: Communication and Escalation Protocols
## 4. Communication
### 4.1 Regular Communication
| Type | Frequency | Channel | Participants |
|------|-----------|---------|-------------|
| Status update | Daily | Email | Red team lead + client POC |
| Progress review | Weekly | Video call | Full team + client POC |
| Finding discussion | As needed | Secure messaging | Red team lead + client tech POC |
### 4.2 Escalation Protocol
#### Critical Finding (Severity: Critical)
Definition: Vulnerability that allows unauthorized access to sensitive data,
complete bypass of safety controls, or poses immediate risk of harm.
1. Red team lead notifies client escalation contact within 2 HOURS
2. Notification via: Phone call + encrypted email
3. Client acknowledges receipt within 4 hours
4. Joint decision on immediate mitigation within 24 hours
5. Finding documented in interim report within 48 hours
#### High Finding (Severity: High)
Definition: Significant vulnerability with limited exploitation prerequisites.
1. Red team lead notifies client POC within 24 HOURS
2. Notification via: Encrypted email
3. Client acknowledges receipt within 48 hours
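The notification and acknowledgement clocks above can be computed mechanically. A sketch, assuming the acknowledgement window starts when notification is sent:

```python
# Hypothetical deadline calculator; the timedeltas mirror the Critical and
# High timelines above and should be adjusted per engagement.
from datetime import datetime, timedelta

SLA = {
    "critical": {"notify": timedelta(hours=2), "acknowledge": timedelta(hours=4)},
    "high": {"notify": timedelta(hours=24), "acknowledge": timedelta(hours=48)},
}

def escalation_deadlines(severity: str, discovered_at: datetime) -> dict:
    """Return absolute deadlines for client notification and acknowledgement."""
    sla = SLA[severity]
    notify_by = discovered_at + sla["notify"]
    # Assumption: the acknowledgement clock runs from the notification deadline.
    return {"notify_by": notify_by, "acknowledge_by": notify_by + sla["acknowledge"]}

d = escalation_deadlines("critical", datetime(2025, 3, 1, 9, 0))
print(d["notify_by"])  # 2025-03-01 11:00:00
```

Computing absolute deadlines at discovery time removes any ambiguity about when a clock started during a fast-moving incident.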
#### Emergency Stop Procedure
Either party may invoke an emergency stop at any time:
1. Contact other party via phone immediately
2. All testing ceases upon notification
3. Emergency meeting within 4 hours to assess situation
4. Testing resumes only after written agreement from both parties
Step 6: Data Handling and Evidence Management
## 5. Data Handling
### 5.1 Data Classification
| Data Type | Classification | Handling Requirements |
|-----------|---------------|---------------------|
| Test prompts and responses | Confidential | Encrypted storage, engagement team only |
| Screenshots and recordings | Confidential | Encrypted storage, engagement team only |
| System prompts (if extracted) | Highly Confidential | Encrypted, engagement lead only |
| Client source code (if shared) | Highly Confidential | Encrypted, no copies |
| PII encountered during testing | Restricted | Minimize collection, redact in reports |
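One way to support these handling requirements is an integrity manifest: hash each artifact at collection so tampering or corruption is detectable at report time. A sketch using SHA-256; the filename is illustrative:

```python
import hashlib

def sha256_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Manifest of artifact name -> digest, recorded at collection time.
manifest: dict[str, str] = {}

def register_evidence(name: str, data: bytes) -> None:
    """Record the artifact's digest when it is first collected."""
    manifest[name] = sha256_digest(data)

def verify_evidence(name: str, data: bytes) -> bool:
    """True if the artifact is unchanged since collection."""
    return manifest.get(name) == sha256_digest(data)

register_evidence("T1-prompt-injection-transcript.txt", b"...transcript bytes...")
assert verify_evidence("T1-prompt-injection-transcript.txt", b"...transcript bytes...")
assert not verify_evidence("T1-prompt-injection-transcript.txt", b"tampered")
```

In practice the manifest itself would live in the same encrypted storage as the evidence and be included with the final report's technical appendix.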
### 5.2 Evidence Retention
- During engagement: All evidence retained in encrypted storage
- From report delivery until final report acceptance: full evidence retained
- After acceptance: Evidence retained for [90 / 180 / 365] days
- After retention period: Secure destruction with certificate of destruction
### 5.3 Data Destruction
- Method: Secure deletion (DoD 5220.22-M or equivalent)
- Scope: All engagement data including local copies, cloud storage, backups
- Verification: Certificate of destruction provided to client
- Timeline: Within [30] days of retention period expiration
Step 7: Deliverables and Reporting
## 6. Deliverables
### 6.1 Deliverable Schedule
| Deliverable | Due Date | Format | Recipients |
|-------------|----------|--------|------------|
| Daily status updates | Each testing day | Email | Client POC |
| Interim critical findings | As discovered | Encrypted email | Escalation contacts |
| Draft report | [DATE] | PDF, encrypted | Client POC |
| Client review period | [N] business days | N/A | N/A |
| Final report | [DATE] | PDF, encrypted | Distribution list |
| Executive debrief | [DATE] | Presentation | Leadership team |
| Retest (if applicable) | [DATE] | PDF addendum | Distribution list |
### 6.2 Report Contents
- Executive summary with risk rating
- Methodology description
- Finding details with reproduction steps
- Evidence (screenshots, logs, sanitized prompts)
- Severity classification per agreed framework
- Remediation recommendations
- Technical appendix with raw data
Step 8: Signature and Acceptance
The final section formalizes acceptance by both parties.
## 7. Acceptance
### 7.1 Agreement
Both parties agree to the terms described in this Rules of Engagement
document. Any modifications must be agreed in writing by both parties.
### 7.2 Signatures
CLIENT:
Name: ___________________________
Title: ___________________________
Organization: ___________________________
Signature: ___________________________
Date: ___________________________
RED TEAM:
Name: ___________________________
Title: ___________________________
Organization: ___________________________
Signature: ___________________________
Date: ___________________________
### 7.3 Distribution
Signed copies distributed to:
- [ ] Client engagement sponsor
- [ ] Client legal department
- [ ] Client security team lead
- [ ] Red team lead
- [ ] Red team organization legal/compliance
Step 9: Review Checklist
Before sending the ROE for signature, verify that every section is complete and accurate.
# ROE Completeness Checklist
## Document Quality
- [ ] All placeholders replaced with actual values
- [ ] Dates are correct and consistent throughout
- [ ] Names and titles are accurate
- [ ] Contact information verified
- [ ] No conflicting statements between sections
## Legal Coverage
- [ ] Authorization statement is clear and specific
- [ ] Third-party provider policies addressed
- [ ] AI-specific legal considerations documented
- [ ] Liability limitations defined
- [ ] Data handling and destruction terms specified
## Technical Accuracy
- [ ] All in-scope systems listed with correct details
- [ ] Out-of-scope items explicitly stated
- [ ] Testing constraints are realistic and achievable
- [ ] Approved tools list matches planned methodology
- [ ] Cost management terms are clear
## Operational Completeness
- [ ] Communication plan is detailed and actionable
- [ ] Escalation procedures specify timeframes and channels
- [ ] Emergency stop procedure is defined
- [ ] Deliverable schedule is realistic
- [ ] Retest provisions are included (if applicable)
Common ROE Mistakes
- **Vague scope language.** "Test the AI chatbot" is not sufficient. Specify endpoints, models, environments, and exactly which types of testing are authorized. Ambiguity creates legal risk.
- **Missing third-party provider section.** If the client uses hosted AI services, the ROE must address the provider's testing policies. Omitting this can lead to service termination or legal action from the provider.
- **No cost ceiling.** AI red teaming can rack up thousands of dollars in API costs. Without a cost ceiling and monitoring plan, the client receives an unexpected bill and the relationship suffers.
- **Outdated contact information.** People change roles and phone numbers. Verify all contact information within 48 hours of engagement start.
- **No emergency stop procedure.** If testing causes an unexpected production issue, both parties need a clear, immediate procedure to halt testing. Without it, confusion during a crisis makes the situation worse.
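Several of these mistakes, unreplaced placeholders in particular, can be caught automatically before signature. A sketch that flags bracketed template tokens left in a draft; the token pattern is an assumption about how this template writes placeholders:

```python
import re

# Assumption: placeholders look like [DATE], [CLIENT], [YYYY-MM-DD], [N] --
# an uppercase letter followed by uppercase letters, digits, or separators.
PLACEHOLDER = re.compile(r"\[[A-Z][A-Z0-9 /_-]*\]")

def find_placeholders(text: str) -> list[tuple[int, str]]:
    """Return (line_number, token) pairs for every unreplaced placeholder."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits

draft = "Created: [DATE]\nClient: Acme Corp\nCost ceiling: $[AMOUNT]"
print(find_placeholders(draft))  # [(1, '[DATE]'), (3, '[AMOUNT]')]
```

Note that markdown checkboxes like `- [ ]` do not match the pattern, so the review checklist itself is not flagged.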
Related Topics
- Pre-Engagement Checklist -- Using the ROE as the foundation for engagement preparation
- Scoping Checklist Walkthrough -- The scoping process that feeds into ROE creation
- Evidence Collection Methods -- Detailed evidence handling procedures referenced in the ROE
- Responsible Disclosure for AI -- Disclosure procedures that complement the ROE