Visualizing Red Team Results

Intermediate | 7 min read | Updated 2026-03-13

Creating effective visualizations for AI red team reports: ASR charts, attack taxonomy heatmaps, defense coverage matrices, and Python visualization code.

visualization results charts dashboards

A well-chosen visualization communicates findings faster and more memorably than paragraphs of text. This page covers the chart types most useful for AI red team reports, when to use each, and Python code you can adapt to generate them. All numbers in the code are sample data; replace them with your own assessment results.

Visualization Selection Guide

Data Type	Best Visualization	Audience	Example
Success rates across techniques	Horizontal bar chart	Technical, executive	ASR by attack category
Coverage across attack surfaces	Heatmap	Technical, management	Attack taxonomy coverage
Finding severity distribution	Donut or stacked bar chart	Executive	Critical/High/Medium/Low breakdown
Defense effectiveness	Matrix / grid	Technical	Defense vs. attack type matrix
Trend over time	Line chart	Management, executive	ASR across quarterly assessments
Attack chain flow	Sankey diagram	Technical	Multi-step attack progression

Attack Success Rate (ASR) Charts

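Attack Success Rate (ASR) is the percentage of attack attempts that achieve their objective: successful attempts divided by total attempts. It is the most fundamental metric in AI red teaming.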
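
Before charting, the per-category percentages have to be derived from raw attempt records. The following is a minimal sketch of that step. It assumes each attempt is logged as a `(category, succeeded)` pair; the record format, the function name `asr_by_category`, and the sample log are illustrative assumptions, not part of any particular tool.

```python
from collections import defaultdict


def asr_by_category(attempts):
    """Return {category: ASR in percent} from (category, succeeded) records."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for category, succeeded in attempts:
        totals[category] += 1
        if succeeded:
            successes[category] += 1
    # ASR = successful attempts / total attempts, expressed as a percentage
    return {c: 100 * successes[c] / totals[c] for c in totals}


# Hypothetical attempt log for illustration only
attempt_log = [
    ("Direct Injection", True),
    ("Direct Injection", False),
    ("Tool Abuse", False),
    ("Tool Abuse", False),
    ("Tool Abuse", True),
    ("Tool Abuse", False),
]

print(asr_by_category(attempt_log))
# {'Direct Injection': 50.0, 'Tool Abuse': 25.0}
```

The resulting dictionary can feed the `categories` and `asr_values` lists used in the charts on this page.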

Horizontal Bar Chart -- ASR by Category

import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
 
# Data
categories = [
    "Direct Injection",
    "Indirect Injection",
    "Role-play Jailbreak",
    "Encoding Bypass",
    "System Prompt Extraction",
    "Tool Abuse",
    "Multi-turn Manipulation",
    "Safety Bypass",
]
asr_values = [45, 20, 70, 15, 60, 10, 35, 55]
 
# Color by severity threshold
colors = []
for v in asr_values:
    if v >= 50:
        colors.append("#dc3545")   # Red - critical
    elif v >= 30:
        colors.append("#fd7e14")   # Orange - warning
    else:
        colors.append("#28a745")   # Green - acceptable
 
fig, ax = plt.subplots(figsize=(10, 6))
bars = ax.barh(categories, asr_values, color=colors, height=0.6)
 
# Add value labels
for bar, val in zip(bars, asr_values):
    ax.text(bar.get_width() + 1.5, bar.get_y() + bar.get_height() / 2,
            f"{val}%", va="center", fontsize=11, fontweight="bold")
 
# Threshold lines
ax.axvline(x=50, color="#dc3545", linestyle="--", alpha=0.5, label="Critical threshold")
ax.axvline(x=30, color="#fd7e14", linestyle="--", alpha=0.5, label="Warning threshold")
 
ax.set_xlabel("Attack Success Rate (%)", fontsize=12)
ax.set_title("Attack Success Rate by Category", fontsize=14, fontweight="bold")
ax.set_xlim(0, 100)
ax.xaxis.set_major_formatter(mticker.PercentFormatter())
ax.legend(loc="lower right")
ax.invert_yaxis()
plt.tight_layout()
plt.savefig("asr_by_category.png", dpi=150, bbox_inches="tight")
plt.show()

Grouped Bar Chart -- Before/After Remediation Comparison

import matplotlib.pyplot as plt
import numpy as np
 
categories = ["Injection", "Jailbreak", "Extraction", "Tool Abuse", "Safety"]
asr_before = [45, 70, 60, 10, 55]   # ASR (%) before remediation
asr_after = [25, 40, 30, 5, 20]     # ASR (%) after remediation
 
x = np.arange(len(categories))
width = 0.35
 
fig, ax = plt.subplots(figsize=(10, 6))
bars1 = ax.bar(x - width / 2, asr_before, width, label="Before Remediation",
               color="#dc3545", alpha=0.8)
bars2 = ax.bar(x + width / 2, asr_after, width, label="After Remediation",
               color="#28a745", alpha=0.8)
 
ax.set_ylabel("Attack Success Rate (%)")
ax.set_title("Remediation Effectiveness by Attack Category")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend()
ax.set_ylim(0, 100)
 
# Add value labels (formatted as whole numbers, since get_height() may return a float)
for bars in [bars1, bars2]:
    for bar in bars:
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width() / 2, height + 1,
                f"{height:.0f}%", ha="center", va="bottom", fontsize=9)
 
plt.tight_layout()
plt.savefig("remediation_comparison.png", dpi=150, bbox_inches="tight")
plt.show()
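
The selection guide also recommends a line chart for trends over time, such as ASR across quarterly assessments. The sketch below shows one way to draw it with matplotlib. The function name `plot_asr_trend`, its parameters, and the quarterly values are hypothetical placeholders, not measured results. The `Agg` backend is selected on the assumption that the script may run without a display, for example in a report-generation job.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend; assumes no display is available
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker


def plot_asr_trend(periods, asr_values, out_path=None):
    """Plot overall ASR (%) per assessment period as a line chart."""
    if len(periods) != len(asr_values):
        raise ValueError("periods and asr_values must have the same length")

    positions = list(range(len(periods)))
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.plot(positions, asr_values, marker="o", linewidth=2, color="#0072B2")

    # Label every point so readers do not have to estimate from the axis
    for pos, val in zip(positions, asr_values):
        ax.text(pos, val + 2, f"{val:.0f}%", ha="center", va="bottom", fontsize=10)

    ax.set_xticks(positions)
    ax.set_xticklabels(periods)
    ax.set_ylabel("Attack Success Rate (%)")
    ax.set_title("Overall ASR Across Quarterly Assessments", fontweight="bold")
    ax.set_ylim(0, 100)
    ax.yaxis.set_major_formatter(mticker.PercentFormatter())
    fig.tight_layout()
    if out_path:
        fig.savefig(out_path, dpi=150, bbox_inches="tight")
    return fig, ax


# Placeholder values for illustration only
quarters = ["Q1", "Q2", "Q3", "Q4"]
overall_asr = [48, 41, 33, 27]
fig, ax = plot_asr_trend(quarters, overall_asr)
```

Pass `out_path="asr_trend.png"` to write the image to disk. Keeping the y-axis fixed at 0 to 100 makes charts from different assessments directly comparable.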

Attack Taxonomy Heatmap

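A heatmap shows testing coverage and results across two dimensions, for example attack type vs. target component. Mark untested combinations explicitly (the code below labels them "N/T") instead of plotting them as 0%, which would wrongly suggest the component was tested and resisted every attack.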

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
 
# Rows: attack types, Columns: target components
attack_types = [
    "Direct Injection", "Indirect Injection", "Role-play Jailbreak",
    "Encoding Bypass", "Prompt Extraction", "Tool Manipulation",
]
components = ["Chat API", "RAG Pipeline", "Agent Tools", "Safety Filter"]
 
# Values: ASR percentages (-1 = not tested)
data = np.array([
    [45, 20, -1, 30],
    [15, 55, 40, 10],
    [70, -1, 25, 50],
    [15, 5,  -1, 20],
    [60, 30, 10, -1],
    [-1, -1, 35, 5],
])
 
# Mask untested cells
mask = data == -1
display_data = np.where(mask, 0, data)
 
fig, ax = plt.subplots(figsize=(10, 7))
sns.heatmap(
    display_data, annot=True, fmt="d", mask=mask,
    xticklabels=components, yticklabels=attack_types,
    cmap="RdYlGn_r", vmin=0, vmax=100,
    linewidths=1, linecolor="white",
    cbar_kws={"label": "Attack Success Rate (%)"},
    ax=ax,
)
 
# Mark untested cells
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        if mask[i, j]:
            ax.text(j + 0.5, i + 0.5, "N/T", ha="center", va="center",
                    fontsize=10, color="gray", fontstyle="italic")
 
ax.set_title("Attack Taxonomy Coverage Heatmap", fontsize=14, fontweight="bold")
ax.set_ylabel("Attack Type")
ax.set_xlabel("Target Component")
plt.tight_layout()
plt.savefig("attack_heatmap.png", dpi=150, bbox_inches="tight")
plt.show()
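
Because the heatmap doubles as a coverage view, it can help to state the coverage figure in the caption or surrounding text. A minimal sketch follows, using the same sample values and the same `-1` "not tested" convention as the heatmap data; the helper name `coverage_summary` is an assumption made for this example.

```python
NOT_TESTED = -1  # Same sentinel the heatmap data uses for untested cells


def coverage_summary(matrix):
    """Count tested cells in an ASR matrix that marks untested cells with -1."""
    cells = [value for row in matrix for value in row]
    tested = [value for value in cells if value != NOT_TESTED]
    return {
        "tested": len(tested),
        "total": len(cells),
        "coverage_pct": 100 * len(tested) / len(cells),
    }


# Same sample values as the heatmap data, written as plain lists
asr_matrix = [
    [45, 20, -1, 30],
    [15, 55, 40, 10],
    [70, -1, 25, 50],
    [15, 5, -1, 20],
    [60, 30, 10, -1],
    [-1, -1, 35, 5],
]

summary = coverage_summary(asr_matrix)
print(f"{summary['tested']}/{summary['total']} cells tested "
      f"({summary['coverage_pct']:.0f}%)")
# 18/24 cells tested (75%)
```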

Defense Coverage Matrix

A defense coverage matrix shows which defenses are effective against which attack types:

Attack Type	Input Filter	Output Filter	Safety Classifier	Rate Limiting	Prompt Hardening
Direct Injection	Partial	Effective	Effective	No effect	Partial
Indirect Injection	No effect	Partial	Partial	No effect	No effect
Jailbreak	Partial	Partial	Effective	Partial	No effect
Prompt Extraction	No effect	Effective	No effect	No effect	Effective
Tool Abuse	No effect	No effect	No effect	Effective	Partial

Legend: Effective = blocks >80% | Partial = blocks 30-80% | No effect = blocks <30%
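
The legend thresholds can be applied programmatically so that matrix labels stay consistent with measured block rates. The sketch below assumes block rates are recorded as percentages. The function name `classify_defense` and the example rates are hypothetical. A rate of exactly 80% falls in "Partial", since the legend reserves "Effective" for rates above 80%.

```python
def classify_defense(block_rate):
    """Map a block rate in percent to the label used in the matrix legend."""
    if not 0 <= block_rate <= 100:
        raise ValueError("block_rate must be between 0 and 100")
    if block_rate > 80:
        return "Effective"
    if block_rate >= 30:
        return "Partial"
    return "No effect"


# Hypothetical measured block rates (%) for one attack type
measured = {"Input Filter": 42.0, "Output Filter": 91.5, "Rate Limiting": 3.0}
row = {defense: classify_defense(rate) for defense, rate in measured.items()}
print(row)
# {'Input Filter': 'Partial', 'Output Filter': 'Effective', 'Rate Limiting': 'No effect'}
```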

Severity Distribution

A donut chart showing finding severity distribution is a quick executive-level summary:

import matplotlib.pyplot as plt
 
labels = ["Critical", "High", "Medium", "Low"]
sizes = [2, 4, 6, 3]
colors = ["#dc3545", "#fd7e14", "#ffc107", "#28a745"]
explode = (0.05, 0, 0, 0)
 
fig, ax = plt.subplots(figsize=(8, 8))
wedges, texts, autotexts = ax.pie(
    sizes, explode=explode, labels=labels, colors=colors,
    autopct=lambda pct: f"{pct:.0f}%\n({int(round(pct / 100 * sum(sizes)))})",
    shadow=False, startangle=90, pctdistance=0.75,
    textprops={"fontsize": 12},
)
 
# Draw center circle for donut effect
centre_circle = plt.Circle((0, 0), 0.50, fc="white")
ax.add_artist(centre_circle)
ax.text(0, 0, f"{sum(sizes)}\nFindings", ha="center", va="center",
        fontsize=16, fontweight="bold")
 
ax.set_title("Finding Severity Distribution", fontsize=14, fontweight="bold")
plt.tight_layout()
plt.savefig("severity_distribution.png", dpi=150, bbox_inches="tight")
plt.show()

Visualization Best Practices

Practice	Why
Use colorblind-safe palettes	~8% of men have color vision deficiency
Include data labels on all bars	Readers should not have to estimate from axes
Export at 150+ DPI	Prevents pixelation in reports and presentations
Title every chart descriptively	"ASR by Attack Category" not "Figure 1"
Explain what "good" and "bad" look like	Add threshold lines or color coding
Keep it simple	One message per chart -- split complex data into multiple visuals
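
The sample charts on this page use a red/orange/green scheme, which is hard to read for people with red-green color vision deficiency. One option, sketched below, is to swap in colors from the Okabe-Ito palette, a widely used colorblind-safe set. The thresholds (50 and 30) match the ASR bar chart on this page; the function name `asr_color` and the choice of which palette color maps to which level are assumptions for illustration.

```python
# Three colors from the Okabe-Ito colorblind-safe palette
OKABE_ITO = {
    "vermillion": "#D55E00",
    "orange": "#E69F00",
    "blue": "#0072B2",
}


def asr_color(asr, critical=50, warning=30):
    """Pick a colorblind-safe bar color for an ASR value in percent."""
    if asr >= critical:
        return OKABE_ITO["vermillion"]   # Critical
    if asr >= warning:
        return OKABE_ITO["orange"]       # Warning
    return OKABE_ITO["blue"]             # Acceptable


# Same sample values as the ASR bar chart
asr_values = [45, 20, 70, 15, 60, 10, 35, 55]
colors = [asr_color(v) for v in asr_values]
print(colors)
```

The resulting `colors` list can be passed to `ax.barh(..., color=colors)` in place of the red/orange/green list. Because the value labels and threshold lines carry the same information, the chart does not rely on color alone.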

Related articles

Writing Executive Summaries -- charts that support executive communication
Report Templates & Examples -- where visualizations fit in the report
Metrics, KPIs & Demonstrating ROI -- program-level metrics and dashboards
