Rate Limiting, Sandboxing & Execution Controls
Rate limiting strategies for AI APIs, sandboxing code execution with E2B and Docker, tool call approval workflows, and the principle of least privilege for AI agents.
Architecture-level controls are the hardest defenses for attackers to bypass because they operate outside the model's influence. A jailbroken model that wants to execute malicious code cannot do so if execution is sandboxed. An agent tricked into calling dangerous tools cannot do so if those tools require human approval.
Rate Limiting Strategies
Rate limiting for AI APIs differs from traditional web API rate limiting because AI requests have highly variable cost (both computational and financial).
Dimensions of Rate Limiting
| Dimension | What It Limits | Why It Matters |
|---|---|---|
| Requests per minute | Raw request count | Prevents automated attack tools |
| Tokens per minute | Total input + output tokens | Prevents cost abuse and context stuffing |
| Concurrent requests | Simultaneous in-flight requests | Prevents resource exhaustion |
| Cost per hour | Dollar cost of inference | Prevents financial damage from exploitation |
| Requests per session | Messages within a conversation | Prevents multi-turn escalation |
Implementation Pattern
```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class RateLimitConfig:
    requests_per_minute: int = 20
    tokens_per_minute: int = 40_000
    max_input_tokens: int = 4_096
    max_output_tokens: int = 4_096
    max_session_messages: int = 50
    cost_limit_per_hour_usd: float = 10.0


class AIRateLimiter:
    def __init__(self, config: RateLimitConfig):
        self.config = config
        # Per-user sliding window of (timestamp, input_tokens) records
        self.windows: dict[str, list[tuple[datetime, int]]] = {}

    def check_and_record(
        self, user_id: str, input_tokens: int, session_messages: int
    ) -> tuple[bool, str]:
        now = datetime.now(timezone.utc)
        window_start = now - timedelta(minutes=1)
        key = f"rpm:{user_id}"
        recent = [(t, n) for t, n in self.windows.get(key, []) if t > window_start]
        # Request rate check
        if len(recent) >= self.config.requests_per_minute:
            return False, "Rate limit exceeded: too many requests"
        # Token rate check
        if sum(n for _, n in recent) + input_tokens > self.config.tokens_per_minute:
            return False, "Rate limit exceeded: token budget for this minute"
        # Input size check
        if input_tokens > self.config.max_input_tokens:
            return False, f"Input too large: {input_tokens} tokens"
        # Session length check
        if session_messages > self.config.max_session_messages:
            return False, "Session message limit reached"
        recent.append((now, input_tokens))
        self.windows[key] = recent
        return True, "OK"
```
Red Team Bypass Techniques for Rate Limiting
| Technique | Description | Mitigation |
|---|---|---|
| Distributed requests | Use multiple API keys or accounts | Per-organization aggregate limits |
| Slow-and-steady | Stay just below the rate limit | Behavioral analysis over longer windows |
| Session rotation | Start new sessions to reset session limits | Per-user (not per-session) tracking |
| Off-peak timing | Attack during low-traffic periods when dynamic limits are higher | Fixed rate limits regardless of load |
Sandboxing Code Execution
When AI agents generate and execute code, sandboxing is critical. Without it, a compromised agent can access the host filesystem, network, and other infrastructure.
E2B provides cloud-hosted sandboxed environments specifically designed for AI code execution:
```python
from e2b_code_interpreter import Sandbox

# Create an isolated sandbox with a 30-second timeout
sandbox = Sandbox(timeout=30)

# Execute AI-generated code in isolation
result = sandbox.run_code("""
import os
# This runs in a completely isolated environment:
# no access to the host filesystem, network restrictions applied
print(os.listdir('/'))  # Only sees the sandbox filesystem
""")

print(result.text)  # Output from sandboxed execution
sandbox.kill()  # Clean up
```
Security properties:
- Isolated filesystem (no host access)
- Network restrictions (configurable allowlists)
- CPU/memory limits
- Automatic timeout and cleanup
- No persistence between executions
Docker containers provide process-level isolation for code execution:
```python
import docker

client = docker.from_env()

def execute_sandboxed(code: str, timeout: int = 10) -> str:
    container = client.containers.run(
        "python:3.11-slim",
        command=["python", "-c", code],
        network_disabled=True,               # No network access
        mem_limit="256m",                    # Memory cap
        cpu_period=100_000,                  # CPU throttling:
        cpu_quota=50_000,                    # quota/period = 50% of one core
        read_only=True,                      # Read-only filesystem
        security_opt=["no-new-privileges"],  # No privilege escalation
        detach=True,                         # Run in background to enforce timeout
    )
    try:
        container.wait(timeout=timeout)      # Raises if the code runs too long
        return container.logs().decode("utf-8")
    finally:
        container.remove(force=True)         # Clean up even on timeout or error
```
Key Docker flags for AI sandboxing:
- `network_disabled=True` -- prevents data exfiltration
- `read_only=True` -- prevents filesystem modification
- `mem_limit` -- prevents memory exhaustion attacks
- `security_opt=["no-new-privileges"]` (CLI: `--security-opt=no-new-privileges`) -- prevents privilege escalation
WASM runtimes provide the strongest isolation for lightweight code execution:
```python
# Pyodide runs Python in a WASM sandbox with no system access.
# Illustrative API: `pyodide_sandbox` / `PySandbox` stand in for whatever
# WASM-backed sandbox wrapper your stack provides (e.g. Pyodide on wasmtime).
from pyodide_sandbox import PySandbox

sandbox = PySandbox(
    allowed_imports=["math", "json", "re"],  # Allowlist of importable modules
    max_execution_time_ms=5000,
    max_memory_mb=64,
)
result = sandbox.execute(ai_generated_code)
```
Trade-off: strongest isolation but the most limited capability. No filesystem, no network, restricted standard library.
Tool Call Approval Workflows
For AI agents that can perform real-world actions (sending emails, modifying databases, making purchases), approval gates prevent unauthorized actions.
Approval Architecture
```python
from enum import Enum

class ApprovalPolicy(Enum):
    AUTO_APPROVE = "auto"          # Low-risk tools
    NOTIFY_AND_PROCEED = "notify"  # Medium-risk: log but allow
    REQUIRE_APPROVAL = "approve"   # High-risk: block until a human approves
    ALWAYS_DENY = "deny"           # Forbidden tools

TOOL_POLICIES = {
    "search_web": ApprovalPolicy.AUTO_APPROVE,
    "read_file": ApprovalPolicy.NOTIFY_AND_PROCEED,
    "send_email": ApprovalPolicy.REQUIRE_APPROVAL,
    "execute_sql": ApprovalPolicy.REQUIRE_APPROVAL,
    "delete_record": ApprovalPolicy.ALWAYS_DENY,
    "modify_permissions": ApprovalPolicy.ALWAYS_DENY,
}

class ToolGatekeeper:
    """Enforces approval policies. `log_notification` and
    `request_human_approval` are assumed to be implemented elsewhere."""

    def __init__(self, policies: dict[str, ApprovalPolicy]):
        self.policies = policies

    async def check_tool_call(
        self, tool_name: str, arguments: dict, user_id: str
    ) -> tuple[bool, str]:
        # Unknown tools fail closed
        policy = self.policies.get(tool_name, ApprovalPolicy.ALWAYS_DENY)
        if policy == ApprovalPolicy.AUTO_APPROVE:
            return True, "Auto-approved"
        elif policy == ApprovalPolicy.NOTIFY_AND_PROCEED:
            await self.log_notification(tool_name, arguments, user_id)
            return True, "Approved with notification"
        elif policy == ApprovalPolicy.REQUIRE_APPROVAL:
            approved = await self.request_human_approval(
                tool_name, arguments, user_id
            )
            return approved, "Human review"
        else:
            return False, f"Tool '{tool_name}' is forbidden"
```
Red Team Considerations
| Attack | Approval Control | Bypass Potential |
|---|---|---|
| Direct tool call to forbidden tool | ALWAYS_DENY | None -- architecturally blocked |
| Argument injection in allowed tool | AUTO_APPROVE on the tool | High -- arguments are not reviewed |
| Tool chaining (safe tools → unsafe outcome) | Individual tool policies | Medium -- chain creates emergent risk |
| Approval fatigue (many low-risk requests) | REQUIRE_APPROVAL | Medium -- humans rubber-stamp after many approvals |
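The argument-injection row is the most commonly exploited gap: a tool can be safe in general but dangerous with hostile arguments, and AUTO_APPROVE policies never look at them. One mitigation is per-tool argument validators that run before any approval policy is consulted. A hedged sketch (the validator registry, rules, and the `example.com` domain are illustrative):

```python
import re
from typing import Callable, Optional

# Per-tool argument validators; each returns an error string, or None if OK.
ARG_VALIDATORS: dict[str, Callable[[dict], Optional[str]]] = {
    "read_file": lambda args: (
        "path traversal blocked" if ".." in args.get("path", "") else None
    ),
    "search_web": lambda args: (
        "query too long" if len(args.get("query", "")) > 500 else None
    ),
    "send_email": lambda args: (
        None
        if re.fullmatch(r"[\w.+-]+@example\.com", args.get("to", ""))
        else "recipient outside allowed domain"
    ),
}


def validate_arguments(tool_name: str, arguments: dict) -> tuple[bool, str]:
    """Check a tool call's arguments, independent of its approval policy."""
    validator = ARG_VALIDATORS.get(tool_name)
    if validator is None:
        return False, f"No validator registered for '{tool_name}'"  # Fail closed
    error = validator(arguments)
    if error:
        return False, f"Argument check failed: {error}"
    return True, "Arguments OK"
```

Failing closed on unregistered tools means adding a tool without thinking about its arguments is itself a blocked action, which also narrows the tool-chaining row: each link in a chain is validated on its own.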
Principle of Least Privilege for AI Agents
Least privilege, granting an agent only the tools, data access, and permissions its task strictly requires, is the most important architectural defense for agentic AI systems.
Implementation Checklist
| Principle | Implementation | Example |
|---|---|---|
| Minimal tool set | Only register tools the agent actually needs | Customer service bot gets search_faq and create_ticket, not execute_sql |
| Read-before-write | Default to read-only access; write requires explicit grant | Agent can read database but not modify it without elevated session |
| Scoped credentials | Each tool gets credentials limited to its function | Email tool can only send from a specific address to allowed domains |
| Time-limited access | Permissions expire after a session or time window | Database write access revoked after 5 minutes |
| Audit trail | Log every tool invocation with full arguments | Searchable log of all agent actions for forensic review |
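The "scoped credentials" and "time-limited access" rows can be combined into a single grant object that is checked on every tool call. A minimal sketch (the `ToolGrant` class and `grant_db_write` helper are illustrative, not from any specific framework):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ToolGrant:
    """A scoped, expiring permission for one tool."""
    tool_name: str
    scopes: frozenset          # e.g. frozenset({"read"}) or {"read", "write"}
    expires_at: datetime

    def allows(self, tool_name: str, scope: str,
               now: datetime = None) -> bool:
        now = now or datetime.now(timezone.utc)
        return (
            tool_name == self.tool_name
            and scope in self.scopes
            and now < self.expires_at  # Expired grants deny everything
        )


def grant_db_write(minutes: int = 5) -> ToolGrant:
    """Elevated database write access that revokes itself after `minutes`."""
    return ToolGrant(
        tool_name="database",
        scopes=frozenset({"read", "write"}),
        expires_at=datetime.now(timezone.utc) + timedelta(minutes=minutes),
    )
```

Because expiry is checked at call time rather than revoked by a background job, a forgotten grant simply stops working, which is the fail-closed behavior least privilege demands.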
Further Reading
- Defense-in-Depth for LLM Apps -- where execution controls fit in the defense stack
- Runtime Monitoring & Anomaly Detection -- detecting bypass attempts
- Guardrails & Safety Layer Architecture -- complementary guardrail layers
- Tool Abuse -- attacks that these controls defend against
Related Topics
- Defense-in-Depth for LLM Apps - Where execution controls fit in the defense stack
- Runtime Monitoring & Anomaly Detection - Detecting bypass attempts against execution controls
- Agent Architectures & Tool Use Patterns - The agent patterns these controls defend
- Guardrails & Safety Layer Architecture - Complementary guardrail layers
References
- "E2B Documentation: AI Code Execution Sandbox" - E2B (2025) - Documentation for the cloud-hosted sandboxed execution environment designed for AI agents
- "Docker Security Best Practices" - Docker Inc. (2025) - Official security guidance for container isolation including network disabling and read-only filesystems
- "Principle of Least Privilege in Modern Applications" - NIST SP 800-53 (2023) - Federal security control defining least privilege requirements applicable to AI agent tool access
- "OWASP Top 10 for LLM Applications: LLM08 Excessive Agency" - OWASP (2025) - Risk classification for over-permissioned AI agents that execution controls mitigate