Agent Framework Security
Security analysis of major AI agent frameworks including LangChain, CrewAI, AutoGen, Semantic Kernel, and OpenAI Assistants, covering default configurations, common vulnerabilities, and framework-specific attack surfaces.
Agent Framework Security
Agent frameworks abstract the complexity of building AI agents -- tool registration, memory management, conversation flow, and multi-step reasoning. This abstraction creates a double-edged sword: frameworks make it easy to build powerful agents, but they also make it easy to build insecure agents. Default configurations prioritize developer experience over security, community-contributed components are rarely audited, and the frameworks themselves have vulnerabilities that propagate to every application built on them.
Framework Landscape
The major agent frameworks as of early 2026:
| Framework | Maintainer | Primary Use Case | Agent Pattern |
|---|---|---|---|
| LangChain / LangGraph | LangChain Inc. | General-purpose agent development | Chains, graphs, ReAct agents |
| CrewAI | CrewAI Inc. | Multi-agent collaboration | Role-based crews with task delegation |
| AutoGen | Microsoft | Multi-agent conversation | Conversable agents with group chat |
| Semantic Kernel | Microsoft | Enterprise AI integration | Plugin-based with planner |
| OpenAI Assistants | OpenAI | Managed agent deployment | Hosted assistants with managed state |
Common Security Problems Across Frameworks
Despite their architectural differences, all major frameworks share recurring security problems:
1. Dangerous Defaults
Frameworks ship with defaults optimized for quick setup, not security:
| Framework | Dangerous Default | Impact |
|---|---|---|
| LangChain | PythonREPLTool available with no sandboxing | Arbitrary code execution on the host |
| CrewAI | Agents can delegate tasks to any other agent | Uncontrolled lateral movement between agents |
| AutoGen | code_execution_config enables local code execution | Container escape if not properly sandboxed |
| Semantic Kernel | Auto-invoke mode calls functions without confirmation | No human-in-the-loop for sensitive operations |
| OpenAI Assistants | Code Interpreter has network access by default | Data exfiltration via code execution |
2. Implicit Trust in Tool Outputs
No major framework treats tool outputs as untrusted by default. When a tool returns data, that data enters the agent's context with the same priority as system instructions. This is the root cause of the confused deputy vulnerability in every framework.
3. Community Components Without Audit
LangChain has 800+ community integrations. CrewAI has a growing tool marketplace. These components are rarely security-audited:
- Community tools may execute arbitrary code, make network requests, or access the filesystem
- Tool descriptions (which the model reads as instructions) may contain unintentional or malicious injection content
- Dependencies of community tools may introduce supply chain vulnerabilities
4. Memory Without Access Controls
All frameworks that support memory (LangChain, CrewAI, AutoGen, Semantic Kernel) store conversation history and learned preferences without fine-grained access controls. There is no standard mechanism for:
- Distinguishing between system-level and user-level memories
- Preventing tool outputs from being stored as memories
- Encrypting sensitive memory contents
- Enforcing memory isolation between users in multi-tenant deployments
Framework-Specific Vulnerabilities
LangChain / LangGraph
LangChain's extensive abstraction layer creates a large attack surface. Key concerns:
- Chain composition attacks -- Chains that pass output directly between components without sanitization
- Callback exploitation -- Custom callbacks that execute on every agent action, including tool calls
- Hub prompt risks -- LangChain Hub prompts are community-contributed and may contain injection
- Agent executor loops -- Default
max_iterationsis generous, enabling resource exhaustion
See LangChain Security Deep Dive for detailed analysis.
CrewAI & AutoGen
Multi-agent frameworks introduce inter-agent attack surfaces:
- Role manipulation -- Agents with defined roles can be convinced to act outside their role
- Inter-agent injection -- Agent A's output is Agent B's input, creating injection chains
- Delegation abuse -- Hierarchical delegation can be exploited for privilege escalation
See CrewAI & AutoGen Security for detailed analysis.
OpenAI Assistants
The managed platform has its own unique concerns:
- File search exploitation -- Uploaded files can contain injection payloads
- Code interpreter abuse -- Sandboxed but still capable of data processing and exfiltration
- Thread injection -- Adding messages to threads can poison context
See OpenAI Assistants API Security for detailed analysis.
Semantic Kernel
Enterprise-focused but not immune:
- Plugin trust model -- All registered plugins are equally trusted
- Planner manipulation -- The AI planner can be steered to invoke plugins in unintended sequences
- Connector exploitation -- Database and API connectors pass through model-generated queries
Assessment Methodology
Identify the framework and version
Determine which framework(s) are in use and their exact versions. Check for known CVEs and security advisories specific to those versions.
Audit default configurations
Review the agent's configuration against the framework's defaults. Identify any dangerous defaults that have not been overridden (code execution, unrestricted tool access, open delegation).
Enumerate registered tools and components
List all tools, plugins, chains, or integrations registered with the agent. Identify community-contributed vs. first-party components. Audit tool descriptions for injection content.
Test tool output handling
Inject adversarial content through tool results and observe whether the agent follows the injected instructions. This tests the framework's (lack of) output sanitization.
Test memory isolation
If the application is multi-tenant, test whether one user's memories are accessible to another. Test whether tool outputs are stored in memory without sanitization.
Test framework-specific attack surfaces
Apply framework-specific tests: chain composition for LangChain, inter-agent injection for CrewAI/AutoGen, file search poisoning for OpenAI Assistants.
Choosing a Framework by Security Requirements
For a detailed comparison of security features across frameworks, see Security Comparison Matrix.
Related Topics
- LangChain Security Deep Dive -- Detailed LangChain vulnerability analysis
- CrewAI & AutoGen Security -- Multi-agent framework security
- OpenAI Assistants API Security -- Managed platform security
- Security Comparison Matrix -- Side-by-side framework comparison
- Function Calling Exploitation -- Tool calling attacks applicable to all frameworks
What is the most dangerous default configuration pattern shared across multiple agent frameworks?
References
- LangChain Security Documentation (2025)
- Microsoft AutoGen Security Considerations (2024)
- OpenAI Assistants API Documentation (2025)
- OWASP Top 10 for LLM Applications v2.0
- CrewAI Documentation (2025)