What is Anatomy of an API Call?

Understand the HTTP request structure for OpenAI, Anthropic, and other LLM APIs — system messages, parameters, function calling, and common misconfigurations.

What is Agent Architectures?

How ReAct, Plan-and-Execute, and LangGraph agent patterns work — tool definition, invocation, and result processing — and where injection happens in each architecture.

What is Deployment Patterns?

Analysis of deployment patterns — chatbots, copilots, autonomous agents, batch processing, and fine-tuned models — with their unique attack surfaces and security considerations.

What is Lab: System Mapping?

Hands-on lab walking through reconnaissance of an AI system — identifying components, mapping data flows, enumerating tools, and documenting the attack surface.

AI System Architecture for Red Teamers

beginner7 min readUpdated 2026-03-13

How AI systems are deployed in production — model API, prompt templates, orchestration, tools, memory, and guardrails — with attack surface analysis at each layer.

architecture systems deployment beginner

Beyond the Model: AI Systems in Production

When red teaming AI, you are almost never attacking a bare model. Production AI systems are layered architectures with multiple components, each introducing its own attack surface. Understanding this architecture is step one of any engagement.

A typical production AI system includes:

┌─────────────────────────────────────────────────┐
│                   User Interface                 │
├─────────────────────────────────────────────────┤
│               Input Guardrails                   │
├─────────────────────────────────────────────────┤
│             Orchestration Layer                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
│  │  Prompt   │  │  Memory  │  │  Tool/API    │  │
│  │ Template  │  │  Store   │  │  Connectors  │  │
│  └──────────┘  └──────────┘  └──────────────┘  │
├─────────────────────────────────────────────────┤
│                 Model API                        │
├─────────────────────────────────────────────────┤
│              Output Guardrails                   │
├─────────────────────────────────────────────────┤
│                 User Interface                   │
└─────────────────────────────────────────────────┘

Component Breakdown

1. Model API

The LLM itself, accessed via API (OpenAI, Anthropic, local deployment, etc.).

Aspect	Details
What it does	Generates text based on prompt input
Trust level	High — treated as the "brain" of the system
Attack surface	Prompt injection, jailbreaking, parameter manipulation
Key parameters	model, temperature, max_tokens, system prompt

See Anatomy of an LLM API Call for a deep dive.

2. Prompt Template

The system prompt and template that shapes model behavior for the specific use case.

SYSTEM_PROMPT = """You are a customer support agent for Acme Corp.
Rules:
- Only answer questions about Acme products
- Never reveal internal pricing formulas
- Always be polite and professional
- If unsure, escalate to a human agent
 
Context: {retrieved_context}
"""

Aspect	Details
What it does	Defines the model's role, constraints, and available context
Trust level	Developer-controlled, should be treated as confidential
Attack surface	System prompt extraction, instruction override, template injection

3. Orchestration Layer

The application logic that coordinates between the model, tools, memory, and user interface. Frameworks like LangChain, LlamaIndex, and custom code.

Aspect	Details
What it does	Routes requests, manages conversation flow, handles tool calls
Trust level	Application code — varies by implementation quality
Attack surface	Logic bugs, improper input validation, state manipulation

4. Tool Connectors

External tools and APIs the model can invoke — databases, web search, code execution, file systems, third-party APIs.

Aspect	Details
What it does	Extends model capabilities with real-world actions
Trust level	HIGH RISK — tools can have side effects (write data, send emails, execute code)
Attack surface	Tool invocation manipulation, parameter injection, privilege escalation

5. Memory / State

Conversation history, user preferences, session state, and long-term memory stores.

Aspect	Details
What it does	Maintains context across interactions
Trust level	Contains prior model outputs and user data — mixed trust
Attack surface	Memory poisoning, context manipulation, cross-session attacks

6. Guardrails

Input and output filters that enforce safety and business rules.

Type	Placement	Techniques
Input guardrails	Before model	Content classifiers, keyword filters, prompt injection detectors
Output guardrails	After model	Toxicity filters, PII redaction, format validation
Structural guardrails	Orchestration	Rate limiting, output length limits, tool call validation

Trust Boundaries

A trust boundary exists wherever data moves between components with different trust levels:

Boundary	From → To	Key Risk
User → System	Untrusted → Trusted	Prompt injection
RAG retrieval → Prompt	Semi-trusted → Trusted	Indirect injection
Model → Tool call	Model-generated → Executed	Arbitrary tool invocation
Tool result → Model	External data → Trusted context	Result injection
Memory → Prompt	Stored data → Active context	Persistent injection

Quick-Reference: Attack Surface by Layer

Layer	Primary Attacks	Impact
User interface	Input crafting, encoding tricks	Low — filtered by downstream layers
Input guardrails	Filter bypass, evasion	Medium — gains access to model
Prompt template	System prompt extraction, override	Medium — changes model behavior
Orchestration	Logic exploitation, state manipulation	High — can alter control flow
Model API	Jailbreaking, prompt injection	High — controls generated output
Tool connectors	Parameter injection, unauthorized calls	Critical — real-world side effects
Memory	Poisoning, cross-session injection	High — persistent compromise
Output guardrails	Output manipulation, encoding bypass	Medium — evades safety filters

Anatomy of an LLM API Call — deep dive into the model interface layer
Agent Architectures & Tool Use Patterns — orchestration patterns and tool use
Common AI Deployment Patterns — how these components are configured for different use cases
Lab: Mapping an AI System's Attack Surface — hands-on practice

References

"OWASP Top 10 for LLM Applications" - OWASP (2025) - Industry-standard classification of LLM application security risks including insecure plugin design and excessive agency
"Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection" - Greshake et al. (2023) - Research demonstrating how trust boundaries in AI systems can be exploited through indirect injection
"Architectural Risk Analysis of Large Language Models" - Trail of Bits (2024) - Systematic analysis of attack surfaces across production AI system architectures
"The Dual LLM Pattern for Building AI Assistants That Can Resist Prompt Injection" - Simon Willison (2023) - Practical architectural pattern for separating trusted and untrusted contexts in AI systems

Knowledge Check

Which component in a production AI system typically has the highest-impact attack surface?

AI System Architecture for Red Teamers

Beyond the Model: AI Systems in Production

Component Breakdown

1. Model API

2. Prompt Template

3. Orchestration Layer

4. Tool Connectors

5. Memory / State

6. Guardrails

Trust Boundaries

Quick-Reference: Attack Surface by Layer

References

Learning Path

AI System Architecture for Red Teamers

Beyond the Model: AI Systems in Production

Component Breakdown

1. Model API

2. Prompt Template

3. Orchestration Layer

4. Tool Connectors

5. Memory / State

6. Guardrails

Trust Boundaries

Quick-Reference: Attack Surface by Layer

References

Learning Path

AI System Architecture for Red Teamers

Learning Path

Related articles

AI System Architecture for Red Teamers

Learning Path

Related articles