Rate Limiting, Sandboxing & Execution Controls
Rate limiting strategies for AI APIs, sandboxing code execution with E2B and Docker, tool call approval workflows, and the principle of least privilege for AI agents.
Architecture-level controls are the hardest defenses for attackers to bypass because they operate outside the model's influence. A jailbroken model that wants to execute malicious code cannot do so if execution is sandboxed. An agent tricked into calling dangerous tools cannot do so if those tools require human approval.
Rate Limiting Strategies
Rate limiting for AI APIs differs from traditional web API rate limiting because AI requests have highly variable cost, both computational and financial.
Dimensions of Rate Limiting
| Dimension | What It Limits | Why It Matters |
|---|---|---|
| Requests per minute | Raw request count | Prevents automated attack tools |
| Tokens per minute | Total input + output tokens | Prevents cost abuse and context stuffing |
| Concurrent requests | Simultaneous in-flight requests | Prevents resource exhaustion |
| Cost per hour | Dollar cost of inference | Prevents financial damage from exploits |
| Requests per session | Messages within a conversation | Prevents multi-turn escalation |
Implementation Pattern

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class RateLimitConfig:
    requests_per_minute: int = 20
    tokens_per_minute: int = 40_000
    max_input_tokens: int = 4_096
    max_output_tokens: int = 4_096
    max_session_messages: int = 50
    cost_limit_per_hour_usd: float = 10.0

class AIRateLimiter:
    def __init__(self, config: RateLimitConfig):
        self.config = config
        self.windows: dict[str, list[datetime]] = {}

    def check_and_record(
        self, user_id: str, input_tokens: int, session_messages: int
    ) -> tuple[bool, str]:
        now = datetime.now(timezone.utc)
        window_start = now - timedelta(minutes=1)
        key = f"rpm:{user_id}"

        # Request rate check: count requests in the trailing one-minute window
        recent = [t for t in self.windows.get(key, []) if t > window_start]
        if len(recent) >= self.config.requests_per_minute:
            return False, "Rate limit exceeded: too many requests"

        # Input size check
        if input_tokens > self.config.max_input_tokens:
            return False, f"Input too large: {input_tokens} tokens"

        # Session length check
        if session_messages > self.config.max_session_messages:
            return False, "Session message limit reached"

        recent.append(now)
        self.windows[key] = recent
        return True, "OK"
```

Red Team Bypass Techniques for Rate Limiting
| Technique | Description | Mitigation |
|---|---|---|
| Distributed requests | Use multiple API keys or accounts | Per-organization aggregate limits |
| Slow-and-steady | Stay just below the rate limit | Behavioral analysis over longer windows |
| Session rotation | Start new sessions to reset session limits | Per-user (not per-session) tracking |
| Off-peak timing | Attack during low-traffic periods when dynamic limits are higher | Fixed rate limits regardless of load |
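The per-organization mitigation in the first row can be sketched as a shared token budget keyed by organization rather than by API key, so rotating keys or accounts does not reset the limit. The `OrgAggregateLimiter` class and its limits below are illustrative, not a library API:

```python
from collections import defaultdict, deque
from time import monotonic

class OrgAggregateLimiter:
    """Track token usage per organization so distributing requests
    across multiple API keys cannot bypass the limit."""

    def __init__(self, org_tokens_per_minute: int = 100_000):
        self.limit = org_tokens_per_minute
        # org_id -> deque of (timestamp, tokens) events
        self.events: dict[str, deque] = defaultdict(deque)

    def allow(self, org_id: str, api_key: str, tokens: int) -> bool:
        # api_key is accepted but ignored: the budget is per organization
        now = monotonic()
        window = self.events[org_id]
        # Evict events older than the 60-second window
        while window and now - window[0][0] > 60:
            window.popleft()
        used = sum(t for _, t in window)
        if used + tokens > self.limit:
            return False
        window.append((now, tokens))
        return True

limiter = OrgAggregateLimiter(org_tokens_per_minute=1000)
assert limiter.allow("acme", "key-1", 600)
assert not limiter.allow("acme", "key-2", 600)  # Different key, same org budget
```

A production version would back the window with shared storage (e.g. Redis) so all API frontends see the same per-organization totals.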
Sandboxing Code Execution
When AI agents generate and execute code, sandboxing is critical. Without it, a compromised agent can access the host filesystem, network, and other infrastructure.
E2B provides cloud-hosted sandboxed environments specifically designed for AI code execution:
```python
from e2b_code_interpreter import Sandbox

# Create an isolated sandbox with a 30-second timeout
sandbox = Sandbox(timeout=30)

# Execute AI-generated code in isolation
result = sandbox.run_code("""
import os
# This runs in a completely isolated environment:
# no host filesystem access, network restrictions applied
print(os.listdir('/'))  # Only sees the sandbox filesystem
""")

print(result.text)  # Output from sandboxed execution
sandbox.kill()  # Clean up
```

Security properties:
- Isolated filesystem (no host access)
- Network restrictions (configurable allowlists)
- CPU/memory limits
- Automatic timeout and cleanup
- No persistence between executions
Docker containers provide process-level isolation for code execution:
```python
import docker
import os
import tempfile

client = docker.from_env()

def execute_sandboxed(code: str, timeout: int = 10) -> str:
    # Write the AI-generated code to a temp file mounted read-only
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".py", delete=False
    ) as f:
        f.write(code)
        path = f.name
    name = os.path.basename(path)
    container = client.containers.run(
        "python:3.11-slim",
        command=f"python /code/{name}",
        volumes={path: {"bind": f"/code/{name}", "mode": "ro"}},
        network_disabled=True,  # No network access
        mem_limit="256m",       # Memory cap
        cpu_period=100_000,     # CPU throttling: quota/period = 50% of one core
        cpu_quota=50_000,
        read_only=True,         # Read-only filesystem
        detach=True,            # Run in background so a timeout can be enforced
    )
    try:
        # docker-py's run() has no timeout parameter; enforce one via wait()
        container.wait(timeout=timeout)
        return container.logs().decode("utf-8")
    finally:
        container.remove(force=True)
        os.unlink(path)
```

Key Docker flags for AI sandboxing:
- `network_disabled=True` -- prevents data exfiltration
- `read_only=True` -- prevents filesystem modification
- `mem_limit` -- prevents memory-exhaustion attacks
- `--security-opt=no-new-privileges` -- prevents privilege escalation
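The `no-new-privileges` option does not appear in the Python example above; one way to keep the full hardening policy reviewable is to collect the docker-py `containers.run()` keyword arguments in a reusable dict. The kwargs below are real docker-py parameters, but the dict name and the specific limits are illustrative choices:

```python
# Hardened kwargs for docker-py's client.containers.run(); collected in
# one dict so the sandbox policy can be audited and reused across tools.
SANDBOX_RUN_KWARGS = {
    "network_disabled": True,               # No network: blocks exfiltration
    "read_only": True,                      # Immutable root filesystem
    "mem_limit": "256m",                    # Hard memory cap
    "pids_limit": 64,                       # Blocks fork bombs
    "cap_drop": ["ALL"],                    # Drop every Linux capability
    "security_opt": ["no-new-privileges"],  # No setuid privilege escalation
    "user": "65534:65534",                  # Run as nobody, never root
}

# Usage: client.containers.run("python:3.11-slim", command=..., **SANDBOX_RUN_KWARGS)
assert SANDBOX_RUN_KWARGS["network_disabled"] and SANDBOX_RUN_KWARGS["read_only"]
```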
WASM runtimes provide the strongest isolation for lightweight code execution:
Pyodide, for example, runs Python in a WASM sandbox with no system access:

```python
from pyodide_sandbox import PySandbox

sandbox = PySandbox(
    allowed_imports=["math", "json", "re"],  # Allowlist of importable modules
    max_execution_time_ms=5000,
    max_memory_mb=64,
)
result = sandbox.execute(ai_generated_code)
```

Trade-off: strongest isolation but most limited capability -- no filesystem, no network, restricted standard library.
Tool Call Approval Workflows
For AI agents that can perform real-world actions (sending emails, modifying databases, making purchases), approval gates prevent unauthorized actions.
Approval Architecture
```python
from enum import Enum

class ApprovalPolicy(Enum):
    AUTO_APPROVE = "auto"          # Low-risk tools
    NOTIFY_AND_PROCEED = "notify"  # Medium-risk: log but allow
    REQUIRE_APPROVAL = "approve"   # High-risk: block until a human approves
    ALWAYS_DENY = "deny"           # Forbidden tools

TOOL_POLICIES = {
    "search_web": ApprovalPolicy.AUTO_APPROVE,
    "read_file": ApprovalPolicy.NOTIFY_AND_PROCEED,
    "send_email": ApprovalPolicy.REQUIRE_APPROVAL,
    "execute_sql": ApprovalPolicy.REQUIRE_APPROVAL,
    "delete_record": ApprovalPolicy.ALWAYS_DENY,
    "modify_permissions": ApprovalPolicy.ALWAYS_DENY,
}

class ToolGatekeeper:
    def __init__(self, policies: dict[str, ApprovalPolicy]):
        self.policies = policies

    async def check_tool_call(
        self, tool_name: str, arguments: dict, user_id: str
    ) -> tuple[bool, str]:
        # Unregistered tools fail closed: default to ALWAYS_DENY
        policy = self.policies.get(tool_name, ApprovalPolicy.ALWAYS_DENY)
        if policy == ApprovalPolicy.AUTO_APPROVE:
            return True, "Auto-approved"
        elif policy == ApprovalPolicy.NOTIFY_AND_PROCEED:
            # log_notification is assumed implemented elsewhere
            await self.log_notification(tool_name, arguments, user_id)
            return True, "Approved with notification"
        elif policy == ApprovalPolicy.REQUIRE_APPROVAL:
            # request_human_approval blocks until a reviewer responds
            approved = await self.request_human_approval(
                tool_name, arguments, user_id
            )
            return approved, "Human review"
        else:
            return False, f"Tool '{tool_name}' is forbidden"
```

Red Team Considerations
| Attack | Approval Control | Bypass Potential |
|---|---|---|
| Direct tool call to forbidden tool | ALWAYS_DENY | None -- architecturally blocked |
| Argument injection in allowed tool | AUTO_APPROVE on the tool | High -- arguments are not reviewed |
| Tool chaining (safe tools → unsafe outcome) | Individual tool policies | Medium -- the chain creates emergent risk |
| Approval fatigue (many low-risk requests) | REQUIRE_APPROVAL | Medium -- humans rubber-stamp after many approvals |
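The argument-injection row is worth closing with a second check: even auto-approved tools can have their arguments validated before execution. A minimal sketch, where the `search_web` and `read_file` argument schemas are hypothetical:

```python
import re

# Per-tool argument validators: even AUTO_APPROVE tools get their
# arguments checked, closing the argument-injection gap.
ARGUMENT_VALIDATORS = {
    "search_web": lambda args: len(args.get("query", "")) <= 500,
    "read_file": lambda args: not re.search(
        r"\.\.|^/etc/|^/proc/", args.get("path", "")
    ),
}

def validate_arguments(tool_name: str, args: dict) -> bool:
    validator = ARGUMENT_VALIDATORS.get(tool_name)
    # Tools with no registered validator fail closed
    return validator(args) if validator else False

assert validate_arguments("read_file", {"path": "docs/readme.md"})
assert not validate_arguments("read_file", {"path": "../../etc/passwd"})
```

Running argument validation before the policy check means an injected path traversal is rejected even when the tool itself is low-risk.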
Principle of Least Privilege for AI Agents
Least privilege is the most important architectural defense for agentic AI systems.
Implementation Checklist
| Principle | Implementation | Example |
|---|---|---|
| Minimal tool set | Only register tools the agent actually needs | Customer service bot gets search_faq and create_ticket, not execute_sql |
| Read-before-write | Default to read-only access; writes require an explicit grant | Agent can read the database but not modify it without an elevated session |
| Scoped credentials | Each tool gets credentials limited to its function | Email tool can only send from a specific address to allowed domains |
| Time-limited access | Permissions expire after a session or time window | Database write access revoked after 5 minutes |
| Audit trail | Log every tool invocation with full arguments | Searchable log of all agent actions for forensic review |
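The scoped-credentials and time-limited-access rows can be combined into one mechanism: grants that name a tool and a scope, and expire on their own. A minimal sketch, where the `GrantStore` name and the TTLs are illustrative:

```python
from dataclasses import dataclass
from time import monotonic

@dataclass
class ScopedGrant:
    """A time-limited permission for one tool and one scope."""
    tool: str
    scope: str        # e.g. "read" or "write"
    expires_at: float  # monotonic deadline

class GrantStore:
    def __init__(self) -> None:
        self._grants: list[ScopedGrant] = []

    def grant(self, tool: str, scope: str, ttl_seconds: float) -> None:
        self._grants.append(
            ScopedGrant(tool, scope, monotonic() + ttl_seconds)
        )

    def allowed(self, tool: str, scope: str) -> bool:
        # A tool call is allowed only while an unexpired grant covers it
        now = monotonic()
        return any(
            g.tool == tool and g.scope == scope and g.expires_at > now
            for g in self._grants
        )

store = GrantStore()
store.grant("database", "read", ttl_seconds=300)  # Read access by default
assert store.allowed("database", "read")
assert not store.allowed("database", "write")  # Write needs an explicit grant
```

Because the check runs at call time, a grant that expires mid-session revokes access without any extra bookkeeping.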
Further Reading
- Defense-in-Depth for LLM Apps -- where execution controls fit in the defense stack
- Runtime Monitoring & Anomaly Detection -- detecting bypass attempts against execution controls
- Agent Architectures & Tool Use Patterns -- the agent patterns these controls defend
- Guardrails & Safety Layer Architecture -- complementary guardrail layers
- Tool Abuse -- attacks that these controls defend against
References
- "E2B Documentation: AI Code Execution Sandbox" - E2B (2025) - Documentation for the cloud-hosted sandboxed execution environment designed for AI agents
- "Docker Security Best Practices" - Docker Inc. (2025) - Official security guidance for container isolation, including network disabling and read-only filesystems
- "Principle of Least Privilege in Modern Applications" - NIST SP 800-53 (2023) - Federal security control defining least-privilege requirements applicable to AI agent tool access
- "OWASP Top 10 for LLM Applications: LLM08 Excessive Agency" - OWASP (2025) - Risk classification for over-permissioned AI agents that execution controls mitigate
An AI agent has been jailbroken, and the attacker instructs it to delete all records in a database. The agent has the 'execute_sql' tool with a REQUIRE_APPROVAL policy and the 'delete_record' tool with an ALWAYS_DENY policy. What happens?