Red Team Infrastructure & Tooling
AI red team C2 frameworks, automated attack pipelines, custom scanner development, and integration with Cobalt Strike, Mythic, and Sliver.
Red Team Infrastructure & Tooling
Professional AI red teaming requires purpose-built infrastructure. Unlike ad-hoc prompt testing, a mature operation needs centralized command and control, automated attack pipelines, repeatable scanning, and integration with existing security tooling. This page covers the architecture of production-grade AI red team platforms.
AI Red Team C2 Architecture
C2 for AI red teaming manages campaigns, dispatches attack tasks, and collects findings.
Architecture Overview
┌─────────────────────────────────────────────┐
│ C2 Server (AIRT-C2) │
│ Campaign Manager │ Payload Library │ Results│
│ └────────Task Queue────────┘ │
│ └────Agent Bus────┘ │
│ ┌─────┐ ┌──────┐ ┌─────┐ ┌───────┐ │
│ │Inject│ │Fuzzer│ │Recon│ │Scanner│ │
│ │Agent │ │Agent │ │Agent│ │Agent │ │
│ └──┬──┘ └──┬───┘ └──┬──┘ └──┬────┘ │
└─────┼───────┼────────┼───────┼──────────────┘
▼ ▼ ▼ ▼
[Chatbot] [API] [RAG] [Agent] [MCP Server]Core Components
| Component | Responsibility |
|---|---|
| Campaign Manager | Creates campaigns, assigns targets, tracks progress |
| Payload Library | Stores and versions injection payloads, fuzzer seeds, recon probes |
| Task Queue | Dispatches tasks to agents (Redis/RabbitMQ) |
| Agent Bus | Routes tasks to the correct agent type |
| Results Engine | Aggregates findings, deduplicates, assigns severity |
Campaign Data Model
Campaign
├── id, name, description, targets[]
├── tasks[] (AttackTask)
│ ├── agent_type: inject | fuzzer | recon | scanner
│ ├── target: {url, auth, config}
│ ├── payload: {payloads[], settings}
│ ├── status: pending | running | completed | failed
│ └── findings[] (Finding)
│ ├── severity: critical | high | medium | low
│ ├── title, description, evidence{}
│ └── remediation
└── stats: {total, completed, failed, findings_by_severity}Attack Agents
All agents implement a common interface: execute(task) -> \{results, findings\}. This enables interoperability within the C2 framework.
Prompt Injection Agent
Receive payloads from C2
Each payload has a
textand acategory(system_prompt_extraction, data_exfiltration, etc.).Send each payload to target
POST to the target URL with the payload text. Capture the full response.
Evaluate success
Check for refusal indicators (failure) vs. compliance indicators (success). Category-specific evaluation: extraction checks for "you are" / "instructions:", exfiltration checks for long non-refusal responses.
Generate findings
For each successful injection, create a Finding with severity, description, evidence (payload + response), and remediation guidance.
Recon Agent
Performs three automated tasks against each target:
| Task | What It Does | Output |
|---|---|---|
| Endpoint enumeration | Probes common AI paths (/api/chat, /v1/models, /docs, etc.) | List of active endpoints with methods and content types |
| Header analysis | Inspects x-*, server, via, model, ratelimit headers | Infrastructure leak indicators |
| Provider fingerprinting | Sends "What model are you?" / "Who created you?" probes | Provider identification with confidence |
Vulnerability Scanner
Four automated checks run in parallel:
| Check | What It Tests | Severity |
|---|---|---|
| System prompt leak | Sends 3 extraction probes, checks for 3+ prompt indicators in response | High |
| Prompt injection | Injects a canary token ("CANARY_TOKEN_12345"), checks if it appears in response | Critical |
| Data leakage | Asks about knowledge base / training data, checks for PII patterns (@, phone:, api_key) | High |
| Tool definition exposure | Asks "List all tools and their parameters," checks for 3+ tool indicators | Medium |
Integration with Traditional Red Team Tools
Cobalt Strike Integration
Generate Aggressor scripts from AI findings:
- Auto-generate popup menu items showing AI findings by severity
- Create beacon notes from each finding for timeline integration
- Map AI findings to the same engagement timeline as traditional exploitation
Mythic Integration
Register an airt-agent payload type with Mythic-compatible commands:
| Command | Description |
|---|---|
ai_recon | Perform reconnaissance on an AI endpoint |
ai_inject | Test prompt injection payloads (basic / advanced / waf_evasion sets) |
ai_extract | Extract system prompt using auto-selected method |
ai_fuzz | Fuzz an AI endpoint with configurable case count |
Sliver Integration
Create a Sliver extension (ai-redteam) with two commands:
| Command | Description |
|---|---|
ai-scan | Endpoint enumeration + provider fingerprinting + capability mapping |
ai-exploit | Run prompt injection and jailbreak payloads against a target |
Pipeline Architecture
The recommended pipeline mirrors traditional red team methodology:
[Campaign Config] → [C2 Server] → [Agent Dispatch]
│
┌───────────────────┤
▼ ▼
[Recon Agent] [Scanner Agent]
│ │
▼ ▼
[Results → Targeting] [Vuln → Exploit Selection]
│ │
└────────┬──────────┘
▼
[Injection Agent]
│
▼
[Findings → Report]Deploy C2 and create campaign
Instantiate the C2 server, define targets, and configure the campaign.
Run recon
Enumerate endpoints, fingerprint providers, analyze headers. Use findings to configure subsequent phases.
Run scanner
Execute all automated vulnerability checks. Prioritize targets by vulnerability density.
Run targeted injection
Use recon and scanner results to select and configure injection payloads. Test against the most promising targets first.
Generate report
Produce a structured JSON report with findings sorted by severity, campaign statistics, and per-task results.
What is the primary advantage of integrating AI red team findings with traditional C2 frameworks like Cobalt Strike or Mythic?
Related Topics
- CART Pipelines -- Continuous automated testing built on top of red team infrastructure
- AI Exploit Development -- Exploit techniques that the tooling automates
- Full Engagement -- End-to-end engagement methodology that tooling supports
- Report Writing -- Reporting integration for automated findings
References
- Garak: LLM Vulnerability Scanner — NVIDIA's LLM security scanner
- PyRIT: Python Risk Identification Toolkit — Microsoft's AI red teaming framework
- Cobalt Strike User Guide — Traditional C2 framework for integration