Report Templates & Examples
Full AI red team report templates: executive summary, technical findings, methodology section, remediation roadmap, and annotated examples.
This page provides ready-to-use templates for every section of an AI red team report. Each template includes annotations explaining what belongs in each section and common mistakes to avoid.
Full Report Structure
A complete AI red team report follows this structure:
| Section | Pages | Primary Audience | Purpose |
|---|---|---|---|
| Cover Page | 1 | All | Engagement identification, classification |
| Table of Contents | 1 | All | Navigation |
| Executive Summary | 1-2 | Leadership | Business risk overview, key recommendations |
| Engagement Details | 2-3 | All | Scope, methodology, timeline |
| Findings | 10-30 | Engineers, security | Individual vulnerability documentation |
| Attack Narratives | 3-5 | Technical leadership | End-to-end attack chain stories |
| Risk Assessment | 2-3 | Leadership, compliance | Overall posture evaluation |
| Recommendations | 3-5 | Engineering, leadership | Prioritized remediation plan |
| Appendices | Variable | Engineers | Full evidence, reproduction details |
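The structure above lends itself to scaffolding: a new engagement can start from a generated skeleton rather than a copied old report (which is also the root cause of the copy-paste mistakes discussed later). A minimal sketch, assuming a flat markdown file; the section names come from the table, everything else is illustrative:

```python
# Sketch: generate an empty report skeleton from the structure table above.
# Section names are from the table; the file layout is an assumption.
SECTIONS = [
    "Cover Page",
    "Table of Contents",
    "Executive Summary",
    "Engagement Details",
    "Findings",
    "Attack Narratives",
    "Risk Assessment",
    "Recommendations",
    "Appendices",
]

def report_skeleton(client: str, system: str) -> str:
    """Return a markdown skeleton with one heading per report section."""
    lines = [
        "# AI Red Team Assessment Report",
        f"**Client:** {client}",
        f"**Target System:** {system}",
        "",
    ]
    for section in SECTIONS:
        lines += [f"## {section}", "", "[TODO]", ""]
    return "\n".join(lines)

print(report_skeleton("[Organization Name]", "[System name]"))
```

Starting from a skeleton means a stale client name can never survive from a previous engagement.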
Cover Page Template
# AI Red Team Assessment Report
**Client:** [Organization Name]
**Target System:** [System name and description]
**Assessment Period:** [Start Date] -- [End Date]
**Report Date:** [Date]
**Report Version:** [1.0]
**Classification:** [CONFIDENTIAL]
**Prepared by:**
[Red Team Organization Name]
[Lead assessor name and title]
[Contact information]
**Distribution:**
This report is classified [CONFIDENTIAL] and is intended only for
the named recipients. Do not distribute without written authorization
from [Client Security Contact].
| Recipient | Role | Copy |
|---|---|---|
| [Name] | CISO | Primary |
| [Name] | VP Engineering | CC |
| [Name] | Security Engineering Lead | CC |

Engagement Details Template
## Engagement Details
### Scope
| Parameter | Value |
|---|---|
| Target system | [Name, URL, version] |
| Testing type | [Black-box / Gray-box / White-box] |
| In scope | [Specific components, endpoints, models] |
| Out of scope | [Components explicitly excluded] |
| Access level | [Unauthenticated / Standard user / Admin] |
### Methodology
This assessment followed the [framework name, e.g., OWASP LLM Top 10,
NIST AI RMF] methodology, covering the following test categories:
| Category | Tests Planned | Tests Executed | Coverage |
|---|---|---|---|
| Prompt injection (direct) | [N] | [N] | [%] |
| Prompt injection (indirect) | [N] | [N] | [%] |
| Safety/jailbreak bypass | [N] | [N] | [%] |
| Information disclosure | [N] | [N] | [%] |
| Tool/function abuse | [N] | [N] | [%] |
| Data exfiltration | [N] | [N] | [%] |
| **Total** | **[N]** | **[N]** | **[%]** |
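The coverage column is simply tests executed divided by tests planned, per category and in total. A quick sketch that emits the table rows; the test counts here are invented placeholders, not real data:

```python
# Compute per-category and total coverage for the methodology table.
# Counts are illustrative placeholders.
planned_executed = {
    "Prompt injection (direct)": (40, 40),
    "Prompt injection (indirect)": (25, 20),
    "Safety/jailbreak bypass": (60, 57),
}

def coverage(planned: int, executed: int) -> float:
    """Coverage = executed / planned, as a percentage."""
    return 100.0 * executed / planned if planned else 0.0

for cat, (p, e) in planned_executed.items():
    print(f"| {cat} | {p} | {e} | {coverage(p, e):.0f}% |")

total_p = sum(p for p, _ in planned_executed.values())
total_e = sum(e for _, e in planned_executed.values())
print(f"| **Total** | **{total_p}** | **{total_e}** "
      f"| **{coverage(total_p, total_e):.0f}%** |")
```

Computing the total row from the per-category counts (rather than filling it in by hand) avoids the arithmetic inconsistencies reviewers often catch in draft reports.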
### Timeline
| Date | Activity |
|---|---|
| [Date] | Engagement kickoff, scope confirmation |
| [Date range] | Reconnaissance and initial testing |
| [Date range] | Active testing phase |
| [Date] | Testing complete, evidence consolidation |
| [Date] | Draft report delivered |
| [Date] | Final report delivered |
### Team
| Name | Role | Focus Area |
|---|---|---|
| [Name] | Engagement Lead | Overall coordination, executive reporting |
| [Name] | Senior Analyst | Prompt injection, jailbreak testing |
| [Name] | Analyst | Tool exploitation, data exfiltration |
### Tools Used
| Tool | Purpose | Version |
|---|---|---|
| [Tool name] | [Purpose] | [Version] |

Individual Finding Template
## F[NNN]: [Descriptive Title]
| Attribute | Value |
|---|---|
| **Severity** | [Critical / High / Medium / Low] |
| **Category** | [OWASP LLM category or custom taxonomy] |
| **Attack Surface** | [User input / System prompt / Tool interface / ...] |
| **Exploitability** | [Trivial / Moderate / Difficult] |
| **Success Rate** | [X/Y attempts (Z%)] |
| **OWASP LLM Ref** | [LLM01 / LLM02 / ...] |
### Description
[2-3 sentences: what the vulnerability is and its business impact.
Written for a technical audience but free of unnecessary jargon.]
### Technical Details
[Detailed explanation of the vulnerability mechanism.
Include relevant architecture context.]
### Reproduction Steps
**Environment:** [Model, version, temperature, system prompt hash]
1. [Step 1 with exact payload]
2. [Step 2]
3. [Step 3]
4. Observe: [Expected result with objective success criteria]
**Evidence:** See Appendix [X], evidence files [list].
### Impact Assessment
- **Confidentiality:** [Impact description]
- **Integrity:** [Impact description]
- **Availability:** [Impact description]
- **Business impact:** [Specific business consequences]
- **Regulatory:** [Relevant compliance implications]
### Remediation
**Primary recommendation:**
[Specific technical action with implementation guidance]
**Alternative (if primary is infeasible):**
[Alternative approach with trade-offs noted]
**Effort estimate:** [Engineering time]
**Verification:** [How to confirm the fix works]
### References
- [Related CVE, research paper, or public incident]

Remediation Roadmap Template

### Immediate (0-2 weeks)

| Finding | Action | Owner | Effort |
|---|---|---|---|
| F001 (Critical) | Deploy output PII filter | ML Engineering | 3 days |
| F003 (Critical) | Restrict system prompt scope | Platform Team | 1 day |

### Short-term (2-8 weeks)

| Finding | Action | Owner | Effort |
|---|---|---|---|
| F002 (High) | Add safety classifier layer | ML Engineering | 3 weeks |
| F005 (High) | Implement tool call validation | Backend Team | 2 weeks |

### Medium-term (2-3 months)

| Finding | Action | Owner | Effort |
|---|---|---|---|
| F004, F006 (Medium) | Redesign prompt architecture | ML Engineering | 4 weeks |
| F007 (Medium) | Add monitoring/alerting for anomalous usage | SRE | 2 weeks |

### Ongoing

| Initiative | Action | Owner | Cadence |
|---|---|---|---|
| Regression testing | Implement CART pipeline | Security | Quarterly |
| Model update testing | Red team on every model upgrade | Security | Per release |
| Team training | AI security training for ML engineers | Security | Annual |
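A roadmap like this can be drafted mechanically by bucketing findings into timeframes by severity. The severity-to-phase mapping below is an illustrative convention, not a mandate from the template; owners and effort still require human judgment:

```python
# Group findings into remediation phases by severity.
# The severity -> phase mapping is an illustrative convention.
PHASE_BY_SEVERITY = {
    "Critical": "Immediate (0-2 weeks)",
    "High": "Short-term (2-8 weeks)",
    "Medium": "Medium-term (2-3 months)",
    "Low": "Medium-term (2-3 months)",
}

def roadmap(findings: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map (finding_id, severity) pairs to roadmap phases."""
    phases: dict[str, list[str]] = {}
    for fid, severity in findings:
        phases.setdefault(PHASE_BY_SEVERITY[severity], []).append(fid)
    return phases

print(roadmap([("F001", "Critical"), ("F002", "High"), ("F004", "Medium")]))
```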
Report Variants
Different engagement types require adapted report structures:
**Full assessment report.** The full template above, used for point-in-time engagements with defined start and end dates.

**Continuous testing (CART) report.** For ongoing automated testing programs, replace the narrative structure with a dashboard-style report:
## CART Monthly Report - [Month Year]
### Key Metrics
| Metric | This Month | Last Month | Trend |
|---|---|---|---|
| Total tests executed | [N] | [N] | [arrow] |
| Overall ASR | [%] | [%] | [arrow] |
| New regressions | [N] | [N] | [arrow] |
| Mean time to detect | [hours] | [hours] | [arrow] |
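The trend arrows can be derived mechanically from the two monthly values. One subtlety: for most of these metrics (ASR, regressions, detection time) lower is better, so a falling number is an improvement, while for tests executed the opposite holds. A sketch, where the `lower_is_better` flags are assumptions:

```python
# Classify a month-over-month metric change for the CART dashboard.
# For ASR, regressions, and detection time, lower is better; for
# tests executed, higher is better.
def trend(this_month: float, last_month: float,
          lower_is_better: bool = True) -> str:
    """Return 'improving', 'regressing', or 'flat' for a metric pair."""
    if this_month == last_month:
        return "flat"
    improved = (this_month < last_month) == lower_is_better
    return "improving" if improved else "regressing"

print(trend(12.0, 18.0))                          # overall ASR fell
print(trend(3, 1))                                # new regressions rose
print(trend(520, 480, lower_is_better=False))     # more tests executed
```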
### Regressions Detected
[Table of new failures with root cause and status]
### Coverage Changes
[What was added/removed from test suites]
### Recommendations
[Adjustments to test coverage or thresholds]

**Compliance-focused report.** For engagements driven by regulatory requirements (EU AI Act, NIST AI RMF):
## Compliance Mapping
| Requirement | Section | Finding(s) | Status |
|---|---|---|---|
| EU AI Act Art. 9 - Risk Management | 4.1 | F001, F003 | Non-compliant |
| NIST AI RMF MAP 1.5 | 4.2 | F002 | Partially compliant |
| ISO 42001 A.7.3 | 4.3 | None | Compliant |

Add a compliance mapping section after the risk assessment. Each finding should include the specific regulatory requirement it relates to.
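Once each finding is tagged with the requirement it relates to, the status column can be derived from the mapped severities. The status rules below (any Critical finding means non-compliant, any other finding means partially compliant, none means compliant) are an illustrative assumption, not taken from any regulation:

```python
# Derive a compliance status from the severities of mapped findings.
# The status rules are an illustrative assumption, not regulatory text.
def compliance_status(finding_severities: list[str]) -> str:
    """Critical -> Non-compliant; any finding -> Partially compliant;
    no findings -> Compliant."""
    if not finding_severities:
        return "Compliant"
    if "Critical" in finding_severities:
        return "Non-compliant"
    return "Partially compliant"

# Hypothetical requirement -> mapped finding severities.
mapping = {
    "EU AI Act Art. 9 - Risk Management": ["Critical", "Critical"],
    "NIST AI RMF MAP 1.5": ["High"],
    "ISO 42001 A.7.3": [],
}
for req, sevs in mapping.items():
    print(f"| {req} | {compliance_status(sevs)} |")
```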
Common Template Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| Copy-pasting from previous reports | Wrong client name, wrong system details | Use find-and-replace checklist before delivery |
| Inconsistent severity ratings | Undermines credibility | Apply the severity matrix consistently across all findings |
| Missing reproduction steps | Finding cannot be verified or remediated | Every finding must have standalone reproduction steps |
| No positive observations | Report reads as an attack, not an improvement tool | Include "Effective Controls" section |
| Generic recommendations | "Improve security" is not actionable | Every recommendation must be specific enough to create a JIRA ticket |
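The find-and-replace checklist in the first row can be partially automated: before delivery, sweep the draft for names from previous engagements. A minimal sketch; the stale-term list is a hypothetical example:

```python
# Sweep a report draft for names left over from previous engagements.
# The previous-engagement terms below are hypothetical examples.
def stale_terms(report_text: str, previous_engagements: list[str]) -> list[str]:
    """Return terms from prior engagements that appear in this report."""
    lowered = report_text.lower()
    return [t for t in previous_engagements if t.lower() in lowered]

leaks = stale_terms(
    "Assessment of Acme SupportBot for Acme Corp.",
    ["Globex", "Initech", "Acme"],
)
print(leaks)  # any hits must be fixed before the report ships
```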
Related Topics
- Writing Executive Summaries -- detailed guidance on the summary section
- Technical Findings Documentation -- deep dive on finding documentation
- CART Pipelines -- automated testing that feeds continuous reports