# exploitation
93 articlestagged with “exploitation”
Exploiting Agent Tool Use
How to manipulate AI agents into calling tools with attacker-controlled parameters, abusing tool capabilities for data exfiltration, privilege escalation, and unauthorized actions.
AI Agent Exploitation
Methodology for exploiting AI agent architectures through confused deputy attacks, goal hijacking, privilege escalation, and sandbox escape.
CrewAI Multi-Agent Exploitation
Exploiting CrewAI's multi-agent orchestration for task injection and cross-agent attacks.
Haystack Pipeline Exploitation
Exploiting Haystack's pipeline architecture for component injection and data flow manipulation.
Phidata Agent Attacks
Security analysis of Phidata agent framework including knowledge base poisoning and tool abuse.
Semantic Kernel Plugin Exploitation
Exploiting Microsoft's Semantic Kernel plugins and planners for unauthorized code execution.
Browser Agent Framework Exploitation
Exploiting browser-use and web agent frameworks through DOM injection and navigation manipulation.
Context Window Attacks
Techniques for exploiting LLM context window limits, including strategic context overflow to push out system instructions, attention manipulation, and context budget exhaustion attacks.
Vector-Based Memory Poisoning
Poisoning vector-based memory stores in agent systems to inject false context into retrieval.
Memory Eviction Exploitation
Exploit memory eviction policies in context-limited agents to selectively remove safety-relevant context.
Function Calling Error Exploitation
Leveraging error handling paths in function calling implementations to leak information or bypass controls.
Function Type Confusion Attacks
Exploiting type system weaknesses in function calling interfaces to trigger unexpected behavior through type confusion.
Function Calling Exploitation
Overview of how LLM function/tool calling works, the attack surface it creates, and systematic approaches to exploiting function calling interfaces in AI systems.
Parallel Function Call Exploitation
Exploiting parallel function calling to create race conditions and bypass sequential validation.
Parameter Manipulation
Techniques for crafting malicious parameter values in LLM function calls, including type confusion, boundary violations, injection through parameters, and exploiting weak validation.
Function Result Poisoning (Agentic Exploitation)
Techniques for manipulating function return values to influence LLM behavior, inject instructions via tool results, and chain poisoned results into multi-step exploitation.
JSON Schema Injection
Techniques for manipulating function definitions and JSON schemas to alter LLM behavior, inject additional parameters, and exploit schema validation gaps in tool calling systems.
Structured Output Tool Injection
Exploiting structured output mode to inject tool call directives into model responses.
Tool Disambiguation Attacks
Exploiting tool selection ambiguity to redirect function calls to unintended tools.
Function Hallucination Exploitation
Exploit model tendency to hallucinate function calls to non-existent APIs for information disclosure.
MCP Configuration Injection
Injecting malicious configuration into MCP server initialization for persistent compromise.
MCP Dynamic Tool Registration Attacks
Attacking dynamic tool registration in MCP to inject malicious tools at runtime.
MCP Root Listing Exploitation
Exploiting MCP root listing capabilities to discover and access sensitive file system resources.
MCP Sampling API Exploitation
Exploiting the MCP sampling API to manipulate how servers request LLM completions.
MCP SSE Transport Security Analysis
Security analysis of Server-Sent Events transport in MCP including reconnection attacks and event injection.
MCP stdio Transport Exploitation
Exploiting the stdio transport mechanism in MCP for inter-process communication attacks and data interception.
MCP Batch Tool Call Exploitation
Exploit batch tool calling in MCP to create race conditions and bypass per-call validation.
A2A Artifact Manipulation
Manipulating artifacts exchanged between agents in A2A protocol for data poisoning and injection.
A2A Agent Discovery Exploitation
Exploiting the A2A agent discovery mechanism to register malicious agents or impersonate trusted ones.
A2A Push Notification Abuse
Abusing A2A push notification mechanisms for out-of-band data exfiltration and command injection.
A2A Task State Manipulation
Manipulating task states in A2A to skip validation, bypass approval, or redirect task completion.
Reflection Loop Exploitation
Exploiting self-reflection and self-correction loops in agent workflows.
Supervisor Agent Override
Techniques for overriding supervisor agents in hierarchical multi-agent architectures.
Tool Chain Amplification Attacks
Amplifying attack impact by chaining tool calls in agent workflows for cascading exploitation.
Fallback Handler Exploitation
Exploit fallback and error handlers in agent workflows that have weaker security controls than primary paths.
Retry Loop Exploitation
Exploit retry and error-handling loops in agent workflows to amplify attack payloads and exhaust resources.
Workflow Checkpoint Manipulation
Manipulating workflow checkpoints and savepoints for state rollback attacks.
Agentic Exploitation Assessment
Assessment covering MCP exploitation, function calling abuse, agent memory attacks, and A2A injection.
Skill Verification: MCP Exploitation
Hands-on skill verification for MCP transport attacks, tool description injection, and server impersonation.
Case Study: AI Hiring System Bias Exploitation
Analysis of adversarial attacks on AI hiring systems and resume screening exploitation.
Code Assistant Exploitation Techniques
Techniques for exploiting AI code assistants to generate insecure code or leak repository information.
Test Generation Exploitation
Manipulating AI test generation to produce tests that pass but miss critical vulnerabilities.
March 2026: Agent Exploitation Challenge
Compromise a multi-tool agent system through prompt injection and tool abuse, completing multiple objectives with escalating difficulty and point values.
Training Data Memorization Exploitation
Techniques for exploiting model memorization to extract verbatim training examples.
Hybrid Search Exploitation
Exploiting hybrid dense-sparse search systems through coordinated embedding manipulation.
Reranker Exploitation Techniques
Attacking cross-encoder rerankers used in multi-stage retrieval pipelines to promote adversarial documents past initial retrieval filtering.
Fine-Tuning API Exploitation
Exploiting commercial fine-tuning APIs (OpenAI, Anthropic) for safety bypass and model manipulation.
Copilot/Cursor IDE Exploitation
Exploiting IDE-integrated AI code assistants: repository context poisoning, malicious comments that steer suggestions, data exfiltration through code completions, and prompt injection via file content.
Model Collapse in Recursive Training
Security implications of model collapse when models are trained on AI-generated data from previous generations.
Chain-of-Thought Exploitation
Techniques for manipulating reasoning chains in CoT-enabled models: false premise injection, logic bombs, reasoning hijacking, and chain corruption attacks.
Reasoning Model Exploitation
Exploiting extended thinking and chain-of-thought reasoning in o1, Claude, and DeepSeek-R1 models.
Sycophancy Exploitation Research
Research on exploiting sycophantic tendencies in RLHF-trained models for safety bypass and manipulation.
World Model Exploitation in AI Agents
Exploiting learned world models in AI agents to cause unsafe behavior through environmental manipulation.
Context Window Internals
How attention decay, positional encoding limits, and memory management in transformer context windows create exploitable patterns for red team operations.
Agent Tool Chain Exploitation
Chain multiple tool calls in an agent system to achieve multi-step exploitation.
Cross-Plugin Exploitation Chain
Chain exploits across multiple LLM plugins to achieve capabilities not available through any single plugin.
Lab: MCP Full Exploitation
Hands-on lab for conducting a complete Model Context Protocol server compromise including tool enumeration, permission escalation, cross-tool attacks, and data exfiltration through MCP channels.
Output Constraint Exploitation
Exploit structured output requirements to force models into revealing restricted information.
Lab: RLHF Training Exploitation
Exploit reward model weaknesses to craft inputs that score highly on safety metrics while achieving adversarial objectives.
Lab: Sparse Attention Exploitation
Exploit sparse attention patterns in long-context models to hide injection payloads in low-attention regions.
LangChain CVE Exploitation Lab
Reproduce and analyze LangChain CVEs including CVE-2023-29374 and CVE-2023-36258 in a safe lab environment.
Reasoning Trace Exploitation in CoT Models
Exploit visible chain-of-thought reasoning traces in models like o1 and DeepSeek-R1 to manipulate outputs.
JSON Output Exploitation Basics
Exploit LLM JSON mode to bypass content filters and extract information through structured output channels.
Lab: Output Format Exploitation
Manipulate output formats like JSON, CSV, code blocks, and structured data to extract information that models would normally refuse to provide in natural language.
Error Message Exploitation
Trigger and analyze error messages from LLM applications to reveal architecture and configuration details.
Emergent Behavior Exploitation
Identify and exploit emergent behaviors in frontier models that arise from scale and are not present in smaller models.
MoE Routing Exploitation Lab
Exploit Mixture-of-Experts routing mechanisms to selectively activate or suppress expert modules in MoE models.
Quantization-Induced Vulnerability Exploitation
Exploit behavioral differences between full-precision and quantized models.
MoE Routing Exploitation
Exploit Mixture-of-Experts routing mechanisms to activate specific expert networks for adversarial purposes.
Lab: Exploiting AI Agents
Hands-on lab for exploiting AI agents with tool access through indirect prompt injection, tool-call chaining, and privilege escalation techniques.
Lab: Chain-of-Thought Exploitation
Exploit chain-of-thought reasoning to leak internal model reasoning, bypass safety filters, and manipulate decision processes.
Lab: Exploiting LLM Hallucinations
Exploit hallucination tendencies to trigger fabricated tool calls, invented API endpoints, and false fact injection.
Lab: MCP Server Exploitation
Hands-on lab for attacking a vulnerable MCP server, demonstrating tool shadowing, consent phishing, and server impersonation techniques.
Lab: Output Schema Exploitation
Exploit structured output schemas to force models into revealing information through required field population.
Lab: Role Hierarchy Exploitation
Exploit multi-role LLM architectures by manipulating the instruction hierarchy between system, user, and assistant roles.
Session State Exploitation Lab
Exploit session management vulnerabilities in LLM applications to access other users data or escalate privileges.
Structured Output Exploitation
Exploit JSON mode and structured output constraints to bypass safety filters.
Advanced A/B Test Exploitation
Manipulating A/B testing frameworks to bias model selection toward less secure variants or introduce adversarial model candidates.
MoE Routing Exploitation (Model Deep Dives)
Detailed analysis of how Mixture-of-Experts routing can be manipulated to bypass safety-critical expert paths and trigger unsafe generation.
Document Parsing Exploitation
Exploiting PDF, DOCX, and other document parsers in multimodal AI systems for injection and data extraction.
Video Understanding Model Exploitation
Attacking video captioning, video Q&A, and action recognition models with adversarial videos that cause misclassification or instruction injection.
Instruction Hierarchy Exploitation
Exploiting ambiguities in instruction priority hierarchies across different model providers.
Curriculum Learning Exploitation (Training Pipeline)
Exploiting curriculum learning and data ordering to amplify the effect of poisoned training examples.
RLHF Pipeline Exploitation
Exploiting reward model training, preference data collection, and RLHF optimization loops.
API Chaining Exploitation Walkthrough
Walkthrough of chaining multiple API calls in agent systems to achieve multi-step unauthorized actions.
Function Calling Exploitation Guide
Complete walkthrough of exploiting function calling in OpenAI, Anthropic, and Google AI APIs.
Inference Endpoint Exploitation
Exploiting inference API endpoints for unauthorized access, data exfiltration, and service abuse through authentication flaws, input validation gaps, and misconfigured permissions.
Logit Bias Exploitation Walkthrough
Exploit logit bias API parameters to force models toward specific token sequences and bypass safety filters.
Reasoning Model Exploitation Walkthrough
Exploit extended thinking in reasoning models to inject false premises and manipulate conclusion generation.
Role Confusion Exploitation Walkthrough
Exploit role boundaries in chat APIs by injecting assistant-role messages and system-level instructions.
API Parameter Exploitation Walkthrough
Walkthrough of exploiting undocumented and misconfigured API parameters in LLM service endpoints.
Batch API Exploitation Walkthrough
Walkthrough of exploiting batch processing APIs for large-scale prompt injection and output analysis.
Instruction Hierarchy Exploitation Walkthrough
Walkthrough of exploiting how models prioritize system, user, and tool instructions to override safety directives.