Authorization, Contracts & Liability
Rules of engagement, scope documents, liability clauses, and contract templates for AI red teaming engagements. What to include to protect yourself and the client.
Every legitimate AI red teaming engagement begins with paperwork. The rules of engagement (RoE) document is the single most important item in your legal defense toolkit. Without it, even well-intentioned testing can expose you to criminal liability.
Anatomy of an AI Red Team Authorization
AI red teaming authorizations require provisions that traditional penetration testing contracts do not cover. The following sections should appear in every engagement document.
Scope Definition
| Scope Element | What to Specify | Example |
|---|---|---|
| Target systems | Exact endpoints, models, versions | "GPT-4o via api.openai.com, deployment ID xyz-123" |
| Attack categories | Which attack types are authorized | "Prompt injection, jailbreaking, output manipulation" |
| Excluded attacks | What is explicitly out of scope | "No model extraction, no training data extraction" |
| Data boundaries | What data can be used in testing | "Synthetic data only; no real PII in test prompts" |
| Infrastructure limits | Which supporting systems can be touched | "API endpoints only; no network infrastructure testing" |
| Time window | When testing is permitted | "Business hours EST, March 1-31, 2026" |
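A scope definition like the table above can also be kept in machine-readable form so that test harnesses can refuse out-of-scope actions automatically. A minimal sketch in Python; the endpoint, category names, and dates are illustrative, not taken from any real engagement:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class EngagementScope:
    """Machine-readable scope derived from the signed RoE (all values illustrative)."""
    targets: set              # exact endpoints / deployment IDs in scope
    attack_categories: set    # authorized attack types
    excluded: set             # explicitly out-of-scope attack types
    window_start: datetime    # start of the authorized testing window
    window_end: datetime      # end of the authorized testing window

    def permits(self, target: str, category: str, when: datetime) -> bool:
        """True only if target, category, AND time are all inside the authorization."""
        return (
            target in self.targets
            and category in self.attack_categories
            and category not in self.excluded
            and self.window_start <= when <= self.window_end
        )

scope = EngagementScope(
    targets={"api.example.com/deployment/xyz-123"},
    attack_categories={"prompt_injection", "jailbreaking"},
    excluded={"model_extraction", "training_data_extraction"},
    window_start=datetime(2026, 3, 1),
    window_end=datetime(2026, 3, 31, 23, 59),
)
```

With this in place, `scope.permits("api.example.com/deployment/xyz-123", "prompt_injection", datetime(2026, 3, 15))` returns `True`, while the same call with `"model_extraction"` or an April date returns `False`, mirroring the exclusions and time window in the table.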
Authorization Chain
The person signing the authorization must have the legal authority to grant access. This is more complex for AI systems than for traditional IT assets.
Identify the system owner
For AI systems, this may be the organization deploying the model, the model provider, or both. Cloud-hosted AI services may require authorization from the cloud provider as well.
Verify signing authority
Confirm the signer has authority over the specific systems being tested. A VP of Engineering may authorize API testing but not have authority over the underlying cloud infrastructure.
Address third-party dependencies
If the AI system uses third-party APIs, models, or data sources, determine whether those third parties need to be notified or authorize testing.
Document the authorization chain
Maintain a clear record showing who authorized what, and their authority to do so.
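One way to keep that record auditable is a simple structured log of who authorized which systems and on what basis. A sketch with hypothetical signers and systems:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuthorizationRecord:
    """One link in the authorization chain (all field values illustrative)."""
    signer: str    # person or party granting authorization
    role: str      # their title, evidencing signing authority
    systems: tuple # systems this signer has authority over
    basis: str     # how that authority was verified

chain = [
    AuthorizationRecord(
        signer="J. Doe", role="VP of Engineering",
        systems=("api.example.com",),
        basis="Delegation letter from CTO, 2026-02-10",
    ),
    AuthorizationRecord(
        signer="Cloud provider", role="Third party",
        systems=("hosting infrastructure",),
        basis="Provider penetration-testing policy acknowledgment",
    ),
]

def covers(chain: list, system: str) -> bool:
    """Check that at least one record in the chain authorizes the given system."""
    return any(system in rec.systems for rec in chain)
```

Before testing a system, `covers(chain, system)` gives a fast check that an authorization record exists for it; a `False` result means testing that system would fall outside the documented chain.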
Essential Contract Clauses
Limitation of Liability
Template clause (adapt with legal counsel):
"Tester's aggregate liability under this Agreement shall not exceed
the total fees paid under this engagement. In no event shall Tester
be liable for indirect, incidental, or consequential damages arising
from authorized testing activities conducted within the defined scope."

Indemnification
Both parties should indemnify each other for claims arising from their respective obligations:
- Client indemnifies tester for claims arising from the client's failure to disclose relevant system information, or from third-party claims related to authorized testing
- Tester indemnifies client for claims arising from testing outside the authorized scope, or from negligent handling of discovered vulnerabilities
Data Handling and Confidentiality
AI red teaming creates unique data handling concerns:
| Data Category | Handling Requirement | Retention Period |
|---|---|---|
| Test prompts and payloads | Encrypted storage, access-controlled | Duration of engagement + 90 days |
| Model outputs (including harmful content) | Encrypted, labeled as test artifacts | Duration of engagement + 90 days |
| Discovered vulnerabilities | Classified, restricted distribution | Per disclosure timeline |
| System prompts or model details | Client confidential information | Per NDA terms |
| Extracted training data (if in scope) | Treat as client PII/trade secrets | Delete after analysis |
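The retention periods in the table can be enforced mechanically by computing a destruction date per artifact category. A sketch assuming the 90-day figure above; categories governed by NDA terms or disclosure timelines are excluded because their retention is not a fixed offset:

```python
from datetime import date, timedelta

# Retention offsets after engagement end, per the data handling table.
# NDA-governed and disclosure-timeline categories are handled separately.
RETENTION_AFTER_ENGAGEMENT = {
    "test_prompts": timedelta(days=90),
    "model_outputs": timedelta(days=90),
    "extracted_training_data": timedelta(days=0),  # delete after analysis
}

def destruction_date(category: str, engagement_end: date) -> date:
    """Date by which artifacts in this category must be securely deleted."""
    return engagement_end + RETENTION_AFTER_ENGAGEMENT[category]
```

For an engagement ending March 31, 2026, `destruction_date("test_prompts", date(2026, 3, 31))` lands 90 days later, on June 29, 2026, while extracted training data is due for deletion immediately at engagement end.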
Intellectual Property
Clarify ownership of:
- Testing tools and methodologies: Typically retained by the tester
- Custom exploits developed during the engagement: Negotiate per engagement
- Findings and report: Typically delivered to the client with a license for the tester to reference (anonymized) in future work
- Discovered vulnerabilities: Client owns the finding; tester retains the right to disclose after an agreed embargo period
Rules of Engagement Template
A complete RoE for AI red teaming should include these sections:
Engagement overview
Parties, objectives, timeline, primary contacts, escalation procedures.
Scope definition
In-scope systems, authorized attack categories, out-of-scope items, testing windows.
Testing methodology
Approach (black-box, gray-box, white-box), tools to be used, severity classification system.
Communication protocol
How to report critical findings (emergency contact), regular status updates, final report delivery.
Incident handling
What happens if testing causes an outage, data exposure, or other unintended impact. Who to contact, how quickly, and what to document.
Data handling
How test data, model outputs, and findings will be stored, transmitted, and eventually destroyed.
Completion criteria
What constitutes a completed engagement. Deliverables, acceptance process, post-engagement support.
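A quick completeness check against those seven sections can catch a thin RoE before signing. A sketch using the section names listed above:

```python
# The seven RoE sections described above, normalized to lowercase.
REQUIRED_ROE_SECTIONS = {
    "engagement overview", "scope definition", "testing methodology",
    "communication protocol", "incident handling", "data handling",
    "completion criteria",
}

def missing_sections(roe_headings: list) -> set:
    """Return the required RoE sections absent from a draft's headings."""
    present = {h.strip().lower() for h in roe_headings}
    return REQUIRED_ROE_SECTIONS - present
```

Running a draft's headings through `missing_sections` returns an empty set only when every required section is present; anything else names exactly what still needs to be negotiated and written.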
Contract Red Flags
Watch for these provisions that create unacceptable risk:
| Red Flag | Why It's Dangerous | What to Negotiate |
|---|---|---|
| Unlimited liability | Single finding could bankrupt you | Cap at engagement fees or insurance limit |
| No indemnification | You absorb all third-party claims | Mutual indemnification for respective obligations |
| Vague scope | "Test our AI systems" with no boundaries | Specific endpoints, methods, and exclusions |
| No incident provision | No process if something goes wrong | Defined escalation and communication procedures |
| Work-for-hire on tools | Client claims ownership of your tooling | Retain IP for pre-existing and general-purpose tools |
| No safe harbor clause | Client can pursue legal action for in-scope findings | Explicit covenant not to sue for authorized activities |
| Mandatory NDA with no carve-outs | Cannot discuss findings even anonymously | Carve-outs for anonymized case studies and conference talks |
AI-Specific Contract Considerations
AI red teaming introduces contractual issues that do not exist in traditional security testing.
Model Provider vs. Deployer
When testing an AI application built on a third-party model (e.g., a company's customer service bot built on Claude or GPT-4), you need to consider:
- The deployer authorizes testing of their application
- The model provider's Terms of Service may restrict certain testing methods
- Vulnerabilities may exist in the underlying model, not the application layer
Harmful Output Provisions
AI red teaming intentionally produces harmful outputs. Your contract must address:
- Legal safe harbor for generating harmful content during authorized testing
- Secure handling and deletion of harmful outputs
- No liability for content generated by the AI system in response to test prompts
- Clear distinction between testing artifacts and actual harmful intent
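That distinction between testing artifacts and actual harmful intent can be made concrete in how outputs are captured: wrap every stored output in metadata tying it to the authorized engagement. A minimal sketch; the field names and IDs are hypothetical:

```python
import hashlib
from datetime import datetime, timezone

def label_test_artifact(output_text: str, engagement_id: str, test_case_id: str) -> dict:
    """Wrap a potentially harmful model output in metadata marking it as an
    authorized testing artifact, so it is never mistaken for operational content."""
    return {
        "artifact_type": "authorized_red_team_output",
        "engagement_id": engagement_id,      # ties artifact to the signed RoE
        "test_case_id": test_case_id,        # ties artifact to a specific test
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(output_text.encode()).hexdigest(),  # integrity check
        "content": output_text,
    }

record = label_test_artifact("<harmful output elided>", "ENG-2026-001", "TC-42")
```

Storing only labeled records of this shape, encrypted per the data handling table, supports the contractual position that any harmful content in the tester's possession exists solely as an engagement artifact.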
Related Topics
- Legal Frameworks for AI Red Teaming -- the legal landscape underpinning these contract requirements
- Ethics & Responsible Disclosure -- ethical obligations that extend beyond contractual ones
- Insurance & Compliance Requirements -- insurance that backs up contractual protections
- Red Team Metrics Beyond ASR -- metrics frameworks referenced in engagement reports