Engagement Kickoff Walkthrough
Step-by-step guide to launching an AI red team engagement: initial client meetings, scope definition, rules of engagement, legal agreements, environment setup, and tool selection.
The kickoff phase sets the foundation for everything that follows. Skipping or rushing it leads to scope creep, legal exposure, missed attack surfaces, and reports that do not address the client's actual concerns. A thorough kickoff typically takes one to three days and produces three essential documents: a scope definition, rules of engagement, and an environment access plan.
Step 1: The Initial Meeting
Meeting Agenda
Structure the first client meeting around these topics:
- System overview (30 minutes) -- What does the AI system do? Who uses it? What data does it process?
- Architecture review (30 minutes) -- What models, APIs, frameworks, and infrastructure components are involved?
- Security concerns (20 minutes) -- What keeps the client up at night? Have there been incidents?
- Compliance requirements (15 minutes) -- What regulations or standards apply? (EU AI Act, NIST AI RMF, industry-specific rules)
- Engagement logistics (15 minutes) -- Timeline, contacts, access, communication channels
Key Questions to Ask
About the system:
- What model(s) are deployed? (Provider, version, fine-tuned or base?)
- Is the system prompt static or dynamically generated?
- Does the system have tool calling, function calling, or agentic capabilities?
- What external data sources are connected (RAG, databases, APIs)?
- What guardrails or safety layers are currently in place?
About the threat model:
- Who are the expected users? (Employees, customers, public?)
- What would constitute a serious security incident?
- Are there specific attack scenarios the client is most concerned about?
- Has the system been tested before? What were the results?
About constraints:
- Are there models, endpoints, or environments that are off-limits?
- Are there time windows when testing must not occur? (Production hours, maintenance windows)
- What is the acceptable rate of test traffic?
- Who should be notified if a critical vulnerability is found during testing?
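The answers to these questions feed directly into the scope and ROE documents, so it pays to capture them in a structured form during the meeting. A minimal sketch (the field names are illustrative, not a standard schema):

```python
from dataclasses import dataclass, field

@dataclass
class KickoffIntake:
    """Structured record of initial-meeting answers (illustrative fields)."""
    models: list[str] = field(default_factory=list)        # e.g. "GPT-4o via Azure OpenAI"
    system_prompt_static: bool = True
    agentic_capabilities: bool = False
    data_sources: list[str] = field(default_factory=list)  # RAG stores, databases, APIs
    guardrails: list[str] = field(default_factory=list)
    off_limits: list[str] = field(default_factory=list)
    critical_finding_contact: str = ""

    def open_questions(self) -> list[str]:
        """Fields still unanswered after the meeting -- chase these before testing."""
        missing = []
        if not self.models:
            missing.append("Which model(s) are deployed?")
        if not self.critical_finding_contact:
            missing.append("Who is notified on a critical finding?")
        return missing

intake = KickoffIntake(models=["GPT-4o via Azure OpenAI"])
print(intake.open_questions())  # only the contact question remains
```

Running `open_questions()` at the end of the meeting gives you a concrete follow-up list instead of discovering gaps mid-engagement.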
Step 2: Scope Definition
The Scope Document
Create a clear scope document that answers: what is being tested, what is not being tested, and what success looks like.
```markdown
# AI Red Team Engagement Scope

## Client: [Client Name]
## Date: [Date]
## Red Team Lead: [Name]

## In-Scope Targets

1. Customer-facing chatbot at chat.example.com
   - Model: GPT-4o via Azure OpenAI
   - System prompt: Provided by client (see Appendix A)
   - Connected tools: Order lookup API, FAQ retrieval (RAG)
2. Internal knowledge assistant at knowledge.internal.example.com
   - Model: Claude Sonnet via Anthropic API
   - System prompt: Dynamically generated based on user role
   - Connected tools: Confluence search, Jira ticket creation

## Out of Scope

- Training data and fine-tuning pipelines
- Infrastructure (cloud accounts, network, servers)
- Source code review (unless specifically affecting prompt logic)
- Third-party model provider infrastructure (OpenAI, Anthropic)
- Social engineering of client employees

## Testing Categories

- [ ] Prompt injection (direct and indirect)
- [ ] Jailbreaking and content policy bypass
- [ ] System prompt extraction
- [ ] Data exfiltration via the model
- [ ] Tool/function calling abuse
- [ ] RAG poisoning (if writable knowledge base is in scope)
- [ ] Authentication and authorization bypass
- [ ] Rate limiting and abuse potential

## Success Criteria

- All in-scope categories tested with documented results
- Findings rated by severity (Critical, High, Medium, Low, Informational)
- Remediation recommendations provided for each finding
- Final report delivered within 5 business days of testing completion

## Timeline

- Kickoff: [Date]
- Testing window: [Start Date] to [End Date]
- Draft report: [Date]
- Final report: [Date]
```

Common Scope Pitfalls
| Pitfall | Problem | Solution |
|---|---|---|
| "Test everything" | Unbounded scope leads to unfocused testing | List specific endpoints, models, and categories |
| Forgetting RAG sources | Indirect injection via knowledge base content is missed | Explicitly include or exclude RAG data sources |
| Ignoring tool calling | Function calling abuse is a critical attack vector | List all tools/functions the model can call |
| No exclusions documented | Ambiguity about what is off-limits | Always have an explicit out-of-scope section |
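Most of these pitfalls can be checked mechanically before the scope document is signed. A rough sketch, assuming the scope has been loaded into a dict (the keys are illustrative, not part of any named tool):

```python
def lint_scope(scope: dict) -> list[str]:
    """Flag the common scope pitfalls from the table above (illustrative keys)."""
    warnings = []
    targets = scope.get("in_scope", [])
    if not targets:
        warnings.append("No specific in-scope targets listed ('test everything' pitfall)")
    for target in targets:
        name = target.get("name", "?")
        if "tools" not in target:
            warnings.append(f"{name}: tool/function calls not enumerated")
        if "rag_sources" not in target:
            warnings.append(f"{name}: RAG sources neither included nor excluded")
    if not scope.get("out_of_scope"):
        warnings.append("No explicit out-of-scope section")
    return warnings

scope = {
    "in_scope": [{"name": "chatbot", "tools": ["order_lookup"], "rag_sources": ["faq"]}],
    "out_of_scope": [],
}
print(lint_scope(scope))  # flags the empty out-of-scope section
```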
Step 3: Rules of Engagement
The ROE Document
```markdown
# Rules of Engagement

## Authorization

This document authorizes [Red Team] to perform AI security testing
against the targets listed in the Scope Document dated [Date],
signed by [Authorized Client Representative].

## Testing Boundaries

1. Testing is limited to the targets listed in the scope document
2. No denial-of-service attacks or intentional service disruption
3. No modification of production data without explicit approval
4. No testing outside the agreed time window: [Window]
5. API request rate must not exceed [X] requests per minute

## Data Handling

1. All test data and findings are confidential
2. Screenshots and response logs may be collected as evidence
3. No client data extracted during testing will be retained after engagement
4. Evidence will be stored encrypted and destroyed after [retention period]

## Communication

- Primary contact: [Name, email, phone]
- Emergency contact: [Name, email, phone]
- Status updates: Daily via [channel]
- Critical findings: Immediate notification via [channel]

## Critical Finding Protocol

If a critical vulnerability is discovered:

1. Stop exploitation immediately after confirming the vulnerability
2. Document the finding with minimal evidence needed for reproduction
3. Notify the client emergency contact within 1 hour
4. Do not attempt further exploitation of the critical finding
5. Continue testing other in-scope targets

## Legal

- This engagement is governed by [Master Services Agreement / SOW]
- Testing is authorized only for the duration and scope specified
- The red team will not intentionally access, modify, or exfiltrate
  real user data beyond what is necessary to demonstrate a vulnerability
```

ROE Negotiation Tips
Clients sometimes push back on aspects of the ROE. Common negotiation points:
- "We want you to test production" -- Acceptable if the ROE includes rate limits, time windows, and a halt clause. Insist on a staging environment for destructive tests.
- "Can you test on weekends?" -- Ensure someone on the client side is available to respond to critical findings.
- "We do not want you to test [category]" -- Document the exclusion explicitly. Note in the final report that this category was not assessed.
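The ROE's request-rate clause is easiest to honor by enforcing it client-side in your test harness rather than trusting each tool's defaults. A minimal sliding-window sketch (the limit value comes from the signed ROE):

```python
import time
from collections import deque

class RoeRateLimiter:
    """Block before a request if sending it now would exceed the ROE limit."""

    def __init__(self, max_per_minute: int):
        self.max_per_minute = max_per_minute
        self.sent = deque()  # monotonic timestamps of requests in the last 60 s

    def wait(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have aged out of the 60-second window
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_minute:
            # Sleep until the oldest request leaves the window
            time.sleep(60 - (now - self.sent[0]))
            self.sent.popleft()
        self.sent.append(time.monotonic())

limiter = RoeRateLimiter(max_per_minute=30)
limiter.wait()  # call once before every request to the target
```

Call `wait()` before every request your tooling sends; a burst beyond the agreed rate then stalls instead of violating the ROE.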
Step 4: Environment Setup
Access Checklist
Before testing begins, verify you have:
- API keys or credentials for all in-scope endpoints
- System prompt documentation or access to retrieve it
- Sample valid requests and expected responses
- Contact information for on-call support during testing
- VPN access if testing internal systems
- Staging/sandbox environment access (if available)
- Access to any monitoring dashboards (to watch your test traffic)
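Part of this checklist can be automated as a pre-flight check that fails fast when credentials are absent. A small sketch, assuming the environment variable names used in the `.env` example in this section:

```python
import os

# Credentials this engagement's .env defines; adjust the list per engagement
REQUIRED_VARS = ["TARGET_API_KEY", "TARGET_BASE_URL", "OPENAI_API_KEY"]

def preflight(env=os.environ) -> list[str]:
    """Return the names of required credentials that are missing or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example: a partially configured environment
missing = preflight({"TARGET_API_KEY": "test-key",
                     "TARGET_BASE_URL": "https://api.example.com"})
print(missing)  # ['OPENAI_API_KEY']
```

Run it at the start of every testing session; an empty list means the credential portion of the checklist still holds.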
Testing Environment Configuration
```bash
# Create a dedicated project directory
mkdir -p engagement-[client-name]/
cd engagement-[client-name]/

# Set up environment variables
cat > .env << 'EOF'
# Client API credentials
TARGET_API_KEY=provided-by-client
TARGET_BASE_URL=https://api.example.com

# Attacker model credentials (for PyRIT, etc.)
OPENAI_API_KEY=your-openai-key

# Engagement metadata
ENGAGEMENT_ID=ENG-2026-001
CLIENT_NAME=Example Corp
EOF

# Create directory structure
mkdir -p evidence/ reports/ configs/ logs/ payloads/

# Initialize tools
# (Assuming virtual environments per tool as described in tool walkthroughs)
```

Step 5: Tool Selection
Match tools to the engagement scope:
| Engagement Type | Primary Tools | Secondary Tools |
|---|---|---|
| Prompt-level assessment | Promptfoo, Garak | Ollama (local testing) |
| Full-stack AI assessment | PyRIT, Burp Suite, Promptfoo | Garak, custom Python |
| Agentic system assessment | PyRIT, custom Python | Burp Suite, Promptfoo |
| Compliance evaluation | Inspect AI, Promptfoo | Garak |
| Continuous red teaming | Promptfoo (CI/CD), Garak | Custom automation |
Pre-Engagement Tool Validation
Before the testing window opens, validate that all tools can reach the target:
```bash
# Test API connectivity
curl -s https://api.example.com/health -H "Authorization: Bearer $TARGET_API_KEY"

# Test with garak
garak --model_type rest --model_name target --probes test.Blank

# Test with promptfoo
promptfoo eval --config configs/connectivity-test.yaml

# Test local environment
ollama run llama3.1:8b "Connection test"
```

Step 6: Kickoff Deliverables Checklist
Before moving to the reconnaissance phase, confirm all kickoff deliverables are complete:
- Scope document signed by both parties
- Rules of engagement signed by authorized client representative
- API credentials obtained and validated
- Testing environment configured and connectivity verified
- Tools installed, configured, and tested against target
- Communication channels established
- Emergency contact information documented
- Timeline agreed and calendar invitations sent
- NDA or confidentiality agreement in place
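The checklist doubles as a go/no-go gate in the engagement tracker: testing starts only when every item is complete. A simple sketch (the item names are illustrative):

```python
# Kickoff deliverables as a go/no-go gate (illustrative item names)
KICKOFF_DELIVERABLES = {
    "scope_signed": False,
    "roe_signed": False,
    "credentials_validated": False,
    "environment_configured": False,
    "tools_tested": False,
    "comms_established": False,
    "emergency_contact_documented": False,
    "timeline_agreed": False,
    "nda_in_place": False,
}

def ready_to_test(deliverables: dict) -> bool:
    """True only when every kickoff deliverable is marked complete."""
    return all(deliverables.values())

print(ready_to_test(KICKOFF_DELIVERABLES))  # False until every item is checked off
```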
Related Topics
- Reconnaissance Workflow -- The next phase after kickoff
- Tool Walkthroughs -- Detailed setup guides for each tool
- Governance, Legal & Compliance -- Regulatory context for engagement scoping
- Professional Skills -- Client communication and project management