# agent-security
42 articles tagged "agent-security"
Link-Based Exfiltration
Using hyperlinks, redirects, or URL parameters to exfiltrate data from AI systems through markdown links, tool-generated URLs, and API callback exploitation.
Markdown Image Injection
Injecting markdown image tags with attacker-controlled URLs to exfiltrate conversation data via HTTP image requests.
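As an illustrative defense sketch (not taken from the linked article; the allowlist host is hypothetical), rendered model output can be filtered so that markdown image tags pointing at non-allowlisted hosts never trigger an HTTP request:

```python
import re

# Match markdown image tags and capture the URL, e.g. ![alt](https://host/path)
IMG_TAG = re.compile(r'!\[[^\]]*\]\((https?://[^)\s]+)[^)]*\)')

# Hypothetical allowlist of image hosts the renderer may fetch from
ALLOWED_HOSTS = {"assets.example.com"}

def strip_untrusted_images(markdown: str) -> str:
    """Replace image tags whose host is not allowlisted with a placeholder."""
    def check(match: re.Match) -> str:
        host = match.group(1).split("/")[2]  # scheme://HOST/...
        return match.group(0) if host in ALLOWED_HOSTS else "[image removed]"
    return IMG_TAG.sub(check, markdown)
```

A filter like this blocks the exfiltration channel (the image fetch) rather than trying to detect the injected instructions themselves.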
Permission Boundary Bypass
Escalating from limited to elevated permissions in AI agent systems through scope creep, implicit permission inheritance, and capability confusion.
LangChain Security Deep Dive (Agentic Exploitation)
Comprehensive security analysis of LangChain and LangGraph, covering dangerous defaults, chain composition attacks, callback exploitation, community tool risks, and agent executor vulnerabilities.
Case Study: MCP Tool Poisoning Attacks (Invariant Labs 2025)
Analysis of tool poisoning vulnerabilities in the Model Context Protocol (MCP) discovered by Invariant Labs, where malicious tool descriptions manipulate AI agents into data exfiltration and unauthorized actions.
CaMeL & Dual LLM Pattern
Architectural defense patterns that separate trusted and untrusted processing: Simon Willison's Dual LLM concept and Google DeepMind's CaMeL framework for defending tool-using AI agents against prompt injection.
A2A Trust Boundary Attack
Advanced walkthrough of exploiting trust boundaries between agents in multi-agent systems using the Agent-to-Agent (A2A) protocol.
Agent Context Overflow
Walkthrough of overflowing agent context windows to push safety instructions out of the LLM's attention, enabling bypasses of system prompts and guardrails.
Agent Loop Hijacking
Advanced walkthrough of hijacking agentic loops to redirect autonomous agent behavior, alter reasoning chains, and achieve persistent control over multi-step agent workflows.
Agent Persistence via Memory
Advanced walkthrough of using agent memory systems to create persistent backdoors that survive restarts, updates, and session boundaries.
Callback Abuse in MCP
Advanced walkthrough of abusing MCP callback mechanisms for unauthorized actions, data exfiltration, and privilege escalation in agent-tool interactions.
Function Calling Parameter Injection
Walkthrough of manipulating function call parameters through prompt-level techniques, injecting malicious values into LLM-generated API calls.
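A minimal mitigation sketch (tool names, parameter sets, and the domain rule are hypothetical): validate every LLM-proposed call against an explicit allowlist and per-parameter rules before executing it, so injected parameters are rejected rather than passed through:

```python
# Hypothetical allowlist: tool name -> permitted parameter names
ALLOWED_CALLS = {
    "send_email": {"to", "subject", "body"},
}

def validate_call(name: str, args: dict) -> None:
    """Raise ValueError if an LLM-generated call violates the allowlist."""
    if name not in ALLOWED_CALLS:
        raise ValueError(f"unknown tool: {name}")
    extra = set(args) - ALLOWED_CALLS[name]
    if extra:
        # Injected parameters (e.g. a smuggled "bcc") are rejected outright
        raise ValueError(f"unexpected parameters: {sorted(extra)}")
    if name == "send_email" and not args.get("to", "").endswith("@example.com"):
        raise ValueError("recipient outside allowed domain")
```

The key design choice is that validation happens in deterministic code outside the model, so a prompt-level attack cannot talk its way past it.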
MCP Tool Shadowing
Advanced walkthrough of creating shadow tools that override legitimate MCP (Model Context Protocol) tools, enabling interception and manipulation of agent-tool interactions.
Memory Poisoning Step by Step
Walkthrough of persisting injection payloads in agent memory systems to achieve long-term compromise of LLM-based agents.
Multi-Agent Prompt Relay
Advanced walkthrough of relaying prompt injection payloads across multiple agents in a pipeline, achieving cascading compromise of multi-agent systems.
Orchestrator Manipulation
Advanced walkthrough of attacking the orchestrator layer in multi-agent systems to gain control over task delegation, agent coordination, and system-wide behavior.
Plugin Confusion Attack
Walkthrough of confusing LLM agents about which plugin or tool to invoke, causing them to call the wrong tool or pass data to unintended destinations.
Tool Call Injection
Step-by-step walkthrough of injecting malicious parameters into LLM tool and function calls to execute unauthorized actions in agent systems.
Sandboxing and Permission Models for Tool-Using Agents
Walkthrough for implementing sandboxing and permission models that constrain tool-using LLM agents, covering least-privilege design, parameter validation, execution sandboxes, approval workflows, and audit logging.
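One of the listed controls, the approval workflow, can be sketched in a few lines (the tool names and the `approver` callback are hypothetical): side-effecting tools are gated behind an explicit confirmation step before they run.

```python
from typing import Callable

# Hypothetical set of tools considered side-effecting
REQUIRES_APPROVAL = {"delete_file", "send_email"}

def execute_tool(name: str, fn: Callable, *, approver: Callable[[str], bool], **kwargs):
    """Run a tool, but require approval first for side-effecting tools."""
    if name in REQUIRES_APPROVAL and not approver(name):
        raise PermissionError(f"approval denied for {name}")
    return fn(**kwargs)
```

In practice `approver` would prompt a human or consult a policy engine; read-only tools pass through without friction.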
Security Testing LangChain Applications
Step-by-step walkthrough for identifying and exploiting security vulnerabilities in LangChain-based applications, covering chain injection, agent manipulation, tool abuse, retrieval poisoning, and memory extraction attacks.
MCP Security: The New Attack Surface
A deep dive into Model Context Protocol security, analyzing tool registration attacks, transport-layer risks, cross-server exploitation, and practical hardening strategies.
OpenClaw: Dissecting the First Major AI Agent Security Crisis of 2026
How OpenClaw went from overnight sensation to GitHub's most popular project while exposing critical agentic AI vulnerabilities, from the ClawJacked WebSocket hijack (CVE-2026-25253) to malicious skills distributing macOS stealers. What red teamers and defenders need to know.