Agentic Exploitation
Comprehensive coverage of security vulnerabilities in agentic AI systems, including MCP tool exploitation, multi-agent protocol attacks, function calling abuse, memory system compromise, framework-specific weaknesses, and workflow pattern attacks.
Agentic AI systems represent the most security-critical evolution in how large language models are deployed. Unlike simple chatbots that produce text in response to queries, agentic systems grant LLMs the ability to take actions in the real world: executing code, browsing the web, sending emails, querying databases, modifying files, and interacting with external APIs. Each of these capabilities transforms what would otherwise be a text-generation vulnerability into a pathway for real-world harm.
The shift from passive text generation to active tool use fundamentally changes the threat model. A prompt injection against a chatbot might produce inappropriate text. The same injection against an agent with file system access, email capabilities, and database credentials can exfiltrate sensitive data, modify critical records, or establish persistent backdoors. The blast radius of every vulnerability scales with the agent's granted permissions.
The Agentic Attack Stack
Modern agentic systems are composed of multiple layers, each presenting distinct attack surfaces. Understanding this stack is essential for systematic red teaming.
At the protocol layer, standards like the Model Context Protocol (MCP) and Agent-to-Agent (A2A) protocol define how agents discover, invoke, and communicate with tools and other agents. These protocols introduce attack surfaces that do not exist in simple API-based deployments: tool description poisoning, transport-level man-in-the-middle attacks, and trust boundary violations between agents that assume peer communications are trustworthy.
At the interface layer, function calling mechanisms expose structured APIs that agents invoke based on natural language instructions. The translation from unstructured text to structured function calls creates injection opportunities at every step: schema manipulation, parameter injection, result poisoning, and recursive call exploitation. Attackers who understand how an LLM selects and parameterizes function calls can steer the agent to invoke the right tools with the wrong arguments, or the wrong tools entirely.
At the state layer, agent memory systems persist information across conversations and sessions. Memory poisoning attacks inject malicious instructions that survive context window boundaries, creating long-lived backdoors. Unlike prompt injection which requires active exploitation during a session, memory poisoning can be a one-time attack with persistent effects across all future interactions.
Protocol-Level Exploitation
The standardization of agent communication protocols has created new classes of vulnerabilities. MCP, which provides a unified interface for agents to discover and invoke tools, introduces specific attack vectors around tool enumeration, description manipulation, and transport security. When an agent queries an MCP server for available tools, the tool descriptions returned can contain adversarial instructions that influence the agent's subsequent behavior. This is particularly dangerous because tool descriptions are often treated as trusted system-level context rather than untrusted user input.
The A2A protocol, designed for inter-agent communication, creates trust boundary challenges at scale. When multiple agents coordinate to complete complex tasks, each agent typically trusts messages from other agents as if they were system instructions. An attacker who compromises one agent in a multi-agent workflow can propagate malicious instructions through the entire agent network, turning a single-point vulnerability into a cascade failure across the system.
Function Calling as an Attack Surface
Function calling is the mechanism through which agents translate natural language intent into structured API calls. This translation process involves multiple exploitable steps. The agent must select which function to call from its available set, determine the correct parameters, execute the call, and interpret the results. Each step presents injection opportunities.
Schema injection attacks manipulate how the agent understands function signatures, causing it to invoke functions with incorrect or malicious parameters. Parameter manipulation techniques craft inputs that pass the agent's validation but contain payloads that exploit the downstream function. Result poisoning injects malicious content into function return values, steering subsequent agent reasoning. Recursive call attacks exploit agent loop behaviors to trigger unintended sequences of function invocations that escalate privileges or exhaust resources.
What You'll Learn in This Section
- AI Agent Exploitation -- Core exploitation techniques against ReAct, Plan-and-Execute, and LangGraph agent architectures, covering confused deputy attacks, goal hijacking, privilege escalation, and sandbox escape
- Multi-Agent & A2A Protocol Exploitation -- Attacking trust boundaries between cooperating agents, exploiting orchestrator patterns, and compromising A2A protocol communications
- MCP & Tool Surface Exploitation -- Exploiting the Model Context Protocol including tool description poisoning, transport-level attacks, and server-side injection vectors
- Function Calling Exploitation -- Schema injection, parameter manipulation, recursive call exploitation, and result poisoning in structured function calling interfaces
- Agent Memory Systems Security -- Memory poisoning, context manipulation, and exfiltration attacks against persistent agent memory stores
- Agent Framework Security -- Framework-specific vulnerabilities in LangChain, CrewAI, AutoGen, and OpenAI Assistants
- Agentic Workflow Pattern Attacks -- Exploiting sequential, parallel, and hierarchical workflow patterns used in production agent deployments
Prerequisites
Before working through this section, you should be familiar with:
- Prompt injection fundamentals -- Understanding of direct and indirect prompt injection techniques from the Prompt Injection section
- Agent architecture basics -- How agents use tool calling, memory, and planning loops, covered in AI System Architecture
- LLM fundamentals -- How large language models process text and generate outputs, covered in How LLMs Work
- Basic security concepts -- Familiarity with attack surfaces, trust boundaries, and privilege escalation from traditional security