# exploitation
標記為「exploitation」的 188 篇文章
Exploiting Agent Tool Use
How to manipulate AI agents into calling tools with attacker-controlled parameters, abusing tool capabilities for data exfiltration, privilege escalation, and unauthorized actions.
Tool Use Exploitation
Comprehensive techniques for exploiting how AI agents call external tools and APIs, including tool description poisoning, overly permissive access abuse, and tool output manipulation.
AI Agent Exploitation
Methodology for exploiting AI agent architectures through confused deputy attacks, goal hijacking, privilege escalation, and sandbox escape.
CrewAI Multi-Agent Exploitation
Exploiting CrewAI's multi-agent orchestration for task injection and cross-agent attacks.
Haystack Pipeline Exploitation
Exploiting Haystack's pipeline architecture for component injection and data flow manipulation.
Phidata Agent Attacks
Security analysis of Phidata agent framework including knowledge base poisoning and tool abuse.
Semantic Kernel Plugin Exploitation
Exploiting Microsoft's Semantic Kernel plugins and planners for unauthorized code execution.
Browser Agent Framework Exploitation
Exploiting browser-use and web agent frameworks through DOM injection and navigation manipulation.
Context Window Attacks
Techniques for exploiting LLM context window limits, including strategic context overflow to push out system instructions, attention manipulation, and context budget exhaustion attacks.
Vector-Based Memory Poisoning
Poisoning vector-based memory stores in agent systems to inject false context into retrieval.
Memory Eviction Exploitation
Exploit memory eviction policies in context-limited agents to selectively remove safety-relevant context.
Function Calling Error Exploitation
Leveraging error handling paths in function calling implementations to leak information or bypass controls.
Function Type Confusion Attacks
Exploiting type system weaknesses in function calling interfaces to trigger unexpected behavior through type confusion.
Function Calling Exploitation
Overview of how LLM function/tool calling works, the attack surface it creates, and systematic approaches to exploiting function calling interfaces in AI systems.
Parallel Function Call Exploitation
Exploiting parallel function calling to create race conditions and bypass sequential validation.
Parameter Manipulation
Techniques for crafting malicious parameter values in LLM function calls, including type confusion, boundary violations, injection through parameters, and exploiting weak validation.
Function Result Poisoning (Agentic Exploitation)
Techniques for manipulating function return values to influence LLM behavior, inject instructions via tool results, and chain poisoned results into multi-step exploitation.
JSON Schema Injection
Techniques for manipulating function definitions and JSON schemas to alter LLM behavior, inject additional parameters, and exploit schema validation gaps in tool calling systems.
Structured Output Tool Injection
Exploiting structured output mode to inject tool call directives into model responses.
Tool Disambiguation Attacks
Exploiting tool selection ambiguity to redirect function calls to unintended tools.
Function Hallucination Exploitation
Exploit model tendency to hallucinate function calls to non-existent APIs for information disclosure.
MCP Configuration Injection
Injecting malicious configuration into MCP server initialization for persistent compromise.
MCP Dynamic Tool Registration Attacks
Attacking dynamic tool registration in MCP to inject malicious tools at runtime.
MCP Root Listing Exploitation
Exploiting MCP root listing capabilities to discover and access sensitive file system resources.
MCP Sampling API Exploitation
Exploiting the MCP sampling API to manipulate how servers request LLM completions.
MCP SSE Transport Security Analysis
Security analysis of Server-Sent Events transport in MCP including reconnection attacks and event injection.
MCP stdio Transport Exploitation
Exploiting the stdio transport mechanism in MCP for inter-process communication attacks and data interception.
MCP Batch Tool Call Exploitation
Exploit batch tool calling in MCP to create race conditions and bypass per-call validation.
A2A Artifact Manipulation
Manipulating artifacts exchanged between agents in A2A protocol for data poisoning and injection.
A2A Agent Discovery Exploitation
Exploiting the A2A agent discovery mechanism to register malicious agents or impersonate trusted ones.
A2A Push Notification Abuse
Abusing A2A push notification mechanisms for out-of-band data exfiltration and command injection.
A2A Task State Manipulation
Manipulating task states in A2A to skip validation, bypass approval, or redirect task completion.
Reflection Loop Exploitation
Exploiting self-reflection and self-correction loops in agent workflows.
Supervisor Agent Override
Techniques for overriding supervisor agents in hierarchical multi-agent architectures.
Tool Chain Amplification Attacks
Amplifying attack impact by chaining tool calls in agent workflows for cascading exploitation.
Fallback Handler Exploitation
Exploit fallback and error handlers in agent workflows that have weaker security controls than primary paths.
Retry Loop Exploitation
Exploit retry and error-handling loops in agent workflows to amplify attack payloads and exhaust resources.
Workflow Checkpoint Manipulation
Manipulating workflow checkpoints and savepoints for state rollback attacks.
Agentic Exploitation Assessment
Assessment covering MCP exploitation, function calling abuse, agent memory attacks, and A2A injection.
Skill Verification: MCP Exploitation
Hands-on skill verification for MCP transport attacks, tool description injection, and server impersonation.
Case Study: AI Hiring System Bias Exploitation
Analysis of adversarial attacks on AI hiring systems and resume screening exploitation.
Code Assistant Exploitation Techniques
Techniques for exploiting AI code assistants to generate insecure code or leak repository information.
Test Generation Exploitation
Manipulating AI test generation to produce tests that pass but miss critical vulnerabilities.
March 2026: Agent Exploitation Challenge
Compromise a multi-tool agent system through prompt injection and tool abuse, completing multiple objectives with escalating difficulty and point values.
Training Data Memorization Exploitation
Techniques for exploiting model memorization to extract verbatim training examples.
Hybrid Search Exploitation
Exploiting hybrid dense-sparse search systems through coordinated embedding manipulation.
Reranker Exploitation Techniques
Attacking cross-encoder rerankers used in multi-stage retrieval pipelines to promote adversarial documents past initial retrieval filtering.
Fine-Tuning API Exploitation
Exploiting commercial fine-tuning APIs (OpenAI, Anthropic) for safety bypass and model manipulation.
Copilot/Cursor IDE Exploitation
Exploiting IDE-integrated AI code assistants: repository context poisoning, malicious comments that steer suggestions, data exfiltration through code completions, and prompt injection via file content.
Model Collapse in Recursive Training
Security implications of model collapse when models are trained on AI-generated data from previous generations.
Chain-of-Thought Exploitation
Techniques for manipulating reasoning chains in CoT-enabled models: false premise injection, logic bombs, reasoning hijacking, and chain corruption attacks.
Reasoning Model Exploitation
Exploiting extended thinking and chain-of-thought reasoning in o1, Claude, and DeepSeek-R1 models.
Sycophancy Exploitation Research
Research on exploiting sycophantic tendencies in RLHF-trained models for safety bypass and manipulation.
World Model Exploitation in AI Agents
Exploiting learned world models in AI agents to cause unsafe behavior through environmental manipulation.
Context Window Internals
How attention decay, positional encoding limits, and memory management in transformer context windows create exploitable patterns for red team operations.
Agent Tool Chain Exploitation
Chain multiple tool calls in an agent system to achieve multi-step exploitation.
Cross-Plugin Exploitation Chain
Chain exploits across multiple LLM plugins to achieve capabilities not available through any single plugin.
Lab: MCP Full Exploitation
Hands-on lab for conducting a complete Model Context Protocol server compromise including tool enumeration, permission escalation, cross-tool attacks, and data exfiltration through MCP channels.
Output Constraint Exploitation
Exploit structured output requirements to force models into revealing restricted information.
Lab: RLHF Training Exploitation
Exploit reward model weaknesses to craft inputs that score highly on safety metrics while achieving adversarial objectives.
Lab: Sparse Attention Exploitation
Exploit sparse attention patterns in long-context models to hide injection payloads in low-attention regions.
LangChain CVE Exploitation Lab
Reproduce and analyze LangChain CVEs including CVE-2023-29374 and CVE-2023-36258 in a safe lab environment.
Reasoning Trace Exploitation in CoT Models
Exploit visible chain-of-thought reasoning traces in models like o1 and DeepSeek-R1 to manipulate outputs.
JSON Output Exploitation Basics
Exploit LLM JSON mode to bypass content filters and extract information through structured output channels.
Lab: Output Format Exploitation
Manipulate output formats like JSON, CSV, code blocks, and structured data to extract information that models would normally refuse to provide in natural language.
Error Message Exploitation
Trigger and analyze error messages from LLM applications to reveal architecture and configuration details.
Emergent Behavior Exploitation
Identify and exploit emergent behaviors in frontier models that arise from scale and are not present in smaller models.
MoE Routing Exploitation Lab
Exploit Mixture-of-Experts routing mechanisms to selectively activate or suppress expert modules in MoE models.
Quantization-Induced Vulnerability Exploitation
Exploit behavioral differences between full-precision and quantized models.
MoE Routing Exploitation
Exploit Mixture-of-Experts routing mechanisms to activate specific expert networks for adversarial purposes.
Lab: Exploiting AI Agents
Hands-on lab for exploiting AI agents with tool access through indirect prompt injection, tool-call chaining, and privilege escalation techniques.
Lab: Chain-of-Thought Exploitation
Exploit chain-of-thought reasoning to leak internal model reasoning, bypass safety filters, and manipulate decision processes.
Lab: Exploiting LLM Hallucinations
Exploit hallucination tendencies to trigger fabricated tool calls, invented API endpoints, and false fact injection.
Lab: MCP Server Exploitation
Hands-on lab for attacking a vulnerable MCP server, demonstrating tool shadowing, consent phishing, and server impersonation techniques.
Lab: Output Schema Exploitation
Exploit structured output schemas to force models into revealing information through required field population.
Lab: Role Hierarchy Exploitation
Exploit multi-role LLM architectures by manipulating the instruction hierarchy between system, user, and assistant roles.
Session State Exploitation Lab
Exploit session management vulnerabilities in LLM applications to access other users data or escalate privileges.
Structured Output Exploitation
Exploit JSON mode and structured output constraints to bypass safety filters.
Advanced A/B Test Exploitation
Manipulating A/B testing frameworks to bias model selection toward less secure variants or introduce adversarial model candidates.
MoE Routing Exploitation (Model Deep Dives)
Detailed analysis of how Mixture-of-Experts routing can be manipulated to bypass safety-critical expert paths and trigger unsafe generation.
Document Parsing Exploitation
Exploiting PDF, DOCX, and other document parsers in multimodal AI systems for injection and data extraction.
Video Understanding Model Exploitation
Attacking video captioning, video Q&A, and action recognition models with adversarial videos that cause misclassification or instruction injection.
Instruction Hierarchy Exploitation
Exploiting ambiguities in instruction priority hierarchies across different model providers.
Curriculum Learning Exploitation (Training Pipeline)
Exploiting curriculum learning and data ordering to amplify the effect of poisoned training examples.
RLHF Pipeline Exploitation
Exploiting reward model training, preference data collection, and RLHF optimization loops.
API Chaining Exploitation Walkthrough
Walkthrough of chaining multiple API calls in agent systems to achieve multi-step unauthorized actions.
Function Calling Exploitation Guide
Complete walkthrough of exploiting function calling in OpenAI, Anthropic, and Google AI APIs.
Inference Endpoint Exploitation
Exploiting inference API endpoints for unauthorized access, data exfiltration, and service abuse through authentication flaws, input validation gaps, and misconfigured permissions.
Logit Bias Exploitation Walkthrough
Exploit logit bias API parameters to force models toward specific token sequences and bypass safety filters.
Reasoning Model Exploitation Walkthrough
Exploit extended thinking in reasoning models to inject false premises and manipulate conclusion generation.
Role Confusion Exploitation Walkthrough
Exploit role boundaries in chat APIs by injecting assistant-role messages and system-level instructions.
API Parameter Exploitation Walkthrough
Walkthrough of exploiting undocumented and misconfigured API parameters in LLM service endpoints.
Batch API Exploitation Walkthrough
Walkthrough of exploiting batch processing APIs for large-scale prompt injection and output analysis.
Instruction Hierarchy Exploitation Walkthrough
Walkthrough of exploiting how models prioritize system, user, and tool instructions to override safety directives.
利用代理工具使用
如何操弄 AI 代理,使其以攻擊者可控的參數呼叫工具,並濫用其能力達成資料外洩、提權,以及未授權動作。
工具 Use 利用ation
Comprehensive techniques for exploiting how AI agents call external tools and APIs, including tool description poisoning, overly permissive access abuse, and tool output manipulation.
AI 代理利用
經混淆代理攻擊、目標劫持、特權升級與沙箱逃逸利用 AI 代理架構之方法論。
CrewAI Multi-代理 利用ation
Exploiting CrewAI's multi-agent orchestration for task injection and cross-agent attacks.
Haystack Pipeline 利用ation
Exploiting Haystack's pipeline architecture for component injection and data flow manipulation.
Phidata 代理 攻擊s
安全 analysis of Phidata agent framework including knowledge base poisoning and tool abuse.
Semantic Kernel Plugin 利用ation
Exploiting Microsoft's Semantic Kernel plugins and planners for unauthorized code execution.
Browser 代理 Framework 利用ation
利用ing browser-use and web agent frameworks through DOM injection and navigation manipulation.
脈絡視窗攻擊
利用 LLM 脈絡視窗限制之技術,含策略性脈絡溢位以推出系統指令、注意力操弄,與脈絡預算耗盡攻擊。
Vector-Based 記憶體 投毒
投毒 vector-based memory stores in agent systems to inject false context into retrieval.
記憶體 Eviction 利用ation
利用 memory eviction policies in context-limited agents to selectively remove safety-relevant context.
Function Calling Error 利用ation
Leveraging error handling paths in function calling implementations to leak information or bypass controls.
Function Type Confusion 攻擊s
利用ing type system weaknesses in function calling interfaces to trigger unexpected behavior through type confusion.
函式呼叫攻擊(Function Calling Exploitation)
概述 LLM 函式/工具呼叫的運作方式、其產生的攻擊面,以及對 AI 系統中函式呼叫介面進行利用的系統化方法。
Parallel Function Call 利用ation
利用ing parallel function calling to create race conditions and bypass sequential validation.
參數操弄
打造 LLM 函式呼叫中惡意參數值的技術,包含型別混淆、邊界越界、經由參數之注入,以及利用不完善之驗證。
函式結果投毒(代理式攻擊)
操弄函式回傳值以影響 LLM 行為的技術、透過工具結果注入指令,以及將被投毒結果串接為多步攻擊。
JSON Schema 注入
操弄函式定義與 JSON schema 以改變 LLM 行為、注入額外參數,以及利用工具呼叫系統中 schema 驗證缺口之技術。
Structured Output 工具 Injection
利用ing structured output mode to inject tool call directives into model responses.
工具 Disambiguation 攻擊s
利用ing tool selection ambiguity to redirect function calls to unintended tools.
Function Hallucination 利用ation
利用 model tendency to hallucinate function calls to non-existent APIs for information disclosure.
MCP Configuration Injection
Injecting malicious configuration into MCP server initialization for persistent compromise.
MCP Dynamic 工具 Registration 攻擊s
攻擊ing dynamic tool registration in MCP to inject malicious tools at runtime.
MCP Root Listing 利用ation
利用ing MCP root listing capabilities to discover and access sensitive file system resources.
MCP Sampling API 利用ation
利用ing the MCP sampling API to manipulate how servers request LLM completions.
MCP SSE Transport 安全 Analysis
安全 analysis of Server-Sent Events transport in MCP including reconnection attacks and event injection.
MCP stdio Transport 利用ation
利用ing the stdio transport mechanism in MCP for inter-process communication attacks and data interception.
MCP Batch 工具 Call 利用ation
利用 batch tool calling in MCP to create race conditions and bypass per-call validation.
A2A Artifact Manipulation
Manipulating artifacts exchanged between agents in A2A protocol for data poisoning and injection.
A2A 代理 Discovery 利用ation
利用ing the A2A agent discovery mechanism to register malicious agents or impersonate trusted ones.
A2A Push Notification Abuse
Abusing A2A push notification mechanisms for out-of-band data exfiltration and command injection.
A2A Task State Manipulation
Manipulating task states in A2A to skip validation, bypass approval, or redirect task completion.
Reflection Loop 利用ation
利用ing self-reflection and self-correction loops in agent workflows.
Supervisor 代理 Override
Techniques for overriding supervisor agents in hierarchical multi-agent architectures.
工具 Chain Amplification 攻擊s
Amplifying attack impact by chaining tool calls in agent workflows for cascading exploitation.
Fallback Handler 利用ation
利用 fallback and error handlers in agent workflows that have weaker security controls than primary paths.
Retry Loop 利用ation
利用 retry and error-handling loops in agent workflows to amplify attack payloads and exhaust resources.
Workflow Checkpoint Manipulation
Manipulating workflow checkpoints and savepoints for state rollback attacks.
代理式 利用ation 評量
評量 covering MCP exploitation, function calling abuse, agent memory attacks, and A2A injection.
Skill Verification: MCP 利用ation
Hands-on skill verification for MCP transport attacks, tool description injection, and server impersonation.
Case Study: AI Hiring System Bias 利用ation
Analysis of adversarial attacks on AI hiring systems and resume screening exploitation.
Code Assistant 利用ation Techniques
Techniques for exploiting AI code assistants to generate insecure code or leak repository information.
Test Generation 利用ation
Manipulating AI test generation to produce tests that pass but miss critical vulnerabilities.
2026 年 3 月:代理利用挑戰
經提示注入與工具濫用破壞多工具代理系統,以升級之難度與分數值完成多個目標。
訓練 Data Memorization 利用ation
Techniques for exploiting model memorization to extract verbatim training examples.
Hybrid Search 利用ation
利用ing hybrid dense-sparse search systems through coordinated embedding manipulation.
Reranker 利用ation Techniques
攻擊ing cross-encoder rerankers used in multi-stage retrieval pipelines to promote adversarial documents past initial retrieval filtering.
Fine-Tuning API 利用ation
利用ing commercial fine-tuning APIs (OpenAI, Anthropic) for safety bypass and model manipulation.
Copilot/Cursor IDE 利用ation
利用ing IDE-integrated AI code assistants: repository context poisoning, malicious comments that steer suggestions, data exfiltration through code completions, and prompt injection via file content.
模型 Collapse in Recursive 訓練
安全 implications of model collapse when models are trained on AI-generated data from previous generations.
Chain-of-Thought 利用ation
Techniques for manipulating reasoning chains in CoT-enabled models: false premise injection, logic bombs, reasoning hijacking, and chain corruption attacks.
Reasoning 模型 利用ation
利用ing extended thinking and chain-of-thought reasoning in o1, Claude, and DeepSeek-R1 models.
Sycophancy 利用ation Research
Research on exploiting sycophantic tendencies in RLHF-trained models for safety bypass and manipulation.
World 模型 利用ation in AI 代理s
利用ing learned world models in AI agents to cause unsafe behavior through environmental manipulation.
Context Window Internals
How attention decay, positional encoding limits, and memory management in transformer context windows create exploitable patterns for red team operations.
代理 工具 Chain 利用ation
Chain multiple tool calls in an agent system to achieve multi-step exploitation.
Cross-Plugin 利用ation Chain
Chain exploits across multiple LLM plugins to achieve capabilities not available through any single plugin.
實驗室: MCP Full 利用ation
Hands-on lab for conducting a complete 模型 Context Protocol server compromise including tool enumeration, permission escalation, cross-tool attacks, and data exfiltration through MCP channels.
Output Constraint 利用ation
利用 structured output requirements to force models into revealing restricted information.
實驗室: RLHF 訓練 利用ation
利用 reward model weaknesses to craft inputs that score highly on safety metrics while achieving adversarial objectives.
實驗室: Sparse Attention 利用ation
利用 sparse attention patterns in long-context models to hide injection payloads in low-attention regions.
LangChain CVE 利用ation 實驗室
Reproduce and analyze LangChain CVEs including CVE-2023-29374 and CVE-2023-36258 in a safe lab environment.
Reasoning Trace 利用ation in CoT 模型s
利用 visible chain-of-thought reasoning traces in models like o1 and DeepSeek-R1 to manipulate outputs.
JSON Output 利用ation Basics
利用 LLM JSON mode to bypass content filters and extract information through structured output channels.
實驗室: Output Format 利用ation
Manipulate output formats like JSON, CSV, code blocks, and structured data to extract information that models would normally refuse to provide in natural language.
Error Message 利用ation
Trigger and analyze error messages from LLM applications to reveal architecture and configuration details.
Emergent Behavior 利用ation
Identify and exploit emergent behaviors in frontier models that arise from scale and are not present in smaller models.
MoE Routing 利用ation 實驗室
利用 Mixture-of-專家s routing mechanisms to selectively activate or suppress expert modules in MoE models.
Quantization-Induced 漏洞 利用ation
利用 behavioral differences between full-precision and quantized models.
MoE Routing 利用ation
利用 Mixture-of-專家s routing mechanisms to activate specific expert networks for adversarial purposes.
實驗室: 利用ing AI 代理s
Hands-on lab for exploiting AI agents with tool access through indirect prompt injection, tool-call chaining, and privilege escalation techniques.
實驗室: Chain-of-Thought 利用ation
利用 chain-of-thought reasoning to leak internal model reasoning, bypass safety filters, and manipulate decision processes.
實驗室: 利用ing LLM Hallucinations
利用 hallucination tendencies to trigger fabricated tool calls, invented API endpoints, and false fact injection.
實驗室: MCP Server 利用ation
Hands-on lab for attacking a vulnerable MCP server, demonstrating tool shadowing, consent phishing, and server impersonation techniques.
實驗室: Output Schema 利用ation
利用 structured output schemas to force models into revealing information through required field population.
實驗室: Role Hierarchy 利用ation
利用 multi-role LLM architectures by manipulating the instruction hierarchy between system, user, and assistant roles.
Session State 利用ation 實驗室
利用 session management vulnerabilities in LLM applications to access other users data or escalate privileges.
Structured Output 利用ation
利用 JSON mode and structured output constraints to bypass safety filters.
進階 A/B Test 利用ation
Manipulating A/B testing frameworks to bias model selection toward less secure variants or introduce adversarial model candidates.
MoE Routing 利用ation (模型 Deep Dives)
Detailed analysis of how Mixture-of-專家s routing can be manipulated to bypass safety-critical expert paths and trigger unsafe generation.
Document Parsing 利用ation
利用ing PDF, DOCX, and other document parsers in multimodal AI systems for injection and data extraction.
視訊理解模型利用
以造成誤分類或指令注入之對抗視訊攻擊視訊字幕、視訊 Q&A 與動作辨識模型。
Instruction Hierarchy 利用ation
利用ing ambiguities in instruction priority hierarchies across different model providers.
Curriculum Learning 利用ation (訓練 Pipeline)
利用ing curriculum learning and data ordering to amplify the effect of poisoned training examples.
RLHF Pipeline 利用ation
利用ing reward model training, preference data collection, and RLHF optimization loops.
API Chaining 利用ation 導覽
導覽 of chaining multiple API calls in agent systems to achieve multi-step unauthorized actions.
Function Calling 利用ation 指南
Complete walkthrough of exploiting function calling in OpenAI, Anthropic, and Google AI APIs.
Inference Endpoint 利用ation
利用ing inference API endpoints for unauthorized access, data exfiltration, and service abuse through authentication flaws, input validation gaps, and misconfigured permissions.
Logit Bias 利用ation 導覽
利用 logit bias API parameters to force models toward specific token sequences and bypass safety filters.
Reasoning 模型 利用ation 導覽
利用 extended thinking in reasoning models to inject false premises and manipulate conclusion generation.
Role Confusion 利用ation 導覽
利用 role boundaries in chat APIs by injecting assistant-role messages and system-level instructions.
API Parameter 利用ation 導覽
導覽 of exploiting undocumented and misconfigured API parameters in LLM service endpoints.
Batch API 利用ation 導覽
導覽 of exploiting batch processing APIs for large-scale prompt injection and output analysis.
Instruction Hierarchy 利用ation 導覽
導覽 of exploiting how models prioritize system, user, and tool instructions to override safety directives.