API Key Compromise Investigation
Investigating AI API key compromise incidents including detection, scope assessment, usage forensics, and remediation procedures.
Overview
AI API keys are high-value targets for attackers. A compromised OpenAI, Anthropic, Google AI, or Azure OpenAI API key grants an attacker access to powerful language models at the victim's expense. The consequences range from financial (attackers have run up bills exceeding $100,000 in a single weekend using stolen GPT-4 keys) to operational (attackers can use the key to process sensitive data through models, potentially exfiltrating information from RAG-connected systems). In the worst case, a compromised key with fine-tuning permissions allows an attacker to poison models associated with the account.
API key compromise investigations differ from traditional credential compromise in several important ways. First, the "blast radius" depends heavily on what the key is connected to. A key used only for simple completions has a different risk profile than one wired into an agentic system with tool access. Second, the forensic artifacts are primarily API usage logs and billing records rather than system logs and network captures. Third, the attack surface for credential theft is broad: keys end up in git repositories, client-side code, CI/CD logs, environment variable dumps, and shared configuration files.
This article walks through the complete investigation lifecycle for AI API key compromise: initial detection, scope assessment, forensic reconstruction of attacker activity, and remediation.
Detection of API Key Compromise
Usage Anomaly Detection
The most reliable signal for API key compromise is anomalous usage. This manifests as spikes in request volume, unexpected model usage (an attacker using GPT-4-turbo when your application only uses GPT-3.5), requests from unfamiliar IP addresses, or usage outside of normal business hours.
import json
from datetime import datetime, timedelta
from dataclasses import dataclass, field
from collections import defaultdict
from typing import Optional
@dataclass
class UsageRecord:
"""A single API usage record."""
timestamp: str
api_key_prefix: str # Last 4 chars or masked key
model: str
input_tokens: int
output_tokens: int
endpoint: str
source_ip: Optional[str] = None
user_agent: Optional[str] = None
status_code: int = 200
cost_usd: float = 0.0
@dataclass
class AnomalyAlert:
"""An anomaly detected in API usage."""
alert_type: str
severity: str # low, medium, high, critical
description: str
evidence: dict
timestamp: str
api_key_prefix: str
class APIKeyUsageAnalyzer:
"""Detect anomalous API key usage patterns that may
indicate compromise."""
    # Approximate costs per 1K tokens (USD) as of early 2026
    MODEL_COSTS = {
        "gpt-4": {"input": 0.03, "output": 0.06},
        "gpt-4-turbo": {"input": 0.01, "output": 0.03},
        "gpt-4o": {"input": 0.005, "output": 0.015},
        "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
        "claude-3-opus": {"input": 0.015, "output": 0.075},
        "claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
        "claude-3.5-haiku": {"input": 0.0008, "output": 0.004},
    }
def __init__(self, baseline_window_days: int = 30):
self.baseline_window_days = baseline_window_days
self.baselines: dict[str, dict] = {} # key_prefix -> baseline stats
def build_baseline(self, historical_records: list[UsageRecord]) -> None:
"""Build usage baselines from historical data."""
key_records: dict[str, list[UsageRecord]] = defaultdict(list)
for record in historical_records:
key_records[record.api_key_prefix].append(record)
for key_prefix, records in key_records.items():
daily_requests: dict[str, int] = defaultdict(int)
daily_costs: dict[str, float] = defaultdict(float)
models_used: set[str] = set()
ips_seen: set[str] = set()
hourly_distribution: dict[int, int] = defaultdict(int)
for r in records:
try:
dt = datetime.fromisoformat(r.timestamp)
except (ValueError, TypeError):
continue
day_key = dt.strftime("%Y-%m-%d")
daily_requests[day_key] += 1
daily_costs[day_key] += r.cost_usd
models_used.add(r.model)
if r.source_ip:
ips_seen.add(r.source_ip)
hourly_distribution[dt.hour] += 1
req_counts = list(daily_requests.values()) or [0]
cost_values = list(daily_costs.values()) or [0.0]
self.baselines[key_prefix] = {
"avg_daily_requests": sum(req_counts) / max(len(req_counts), 1),
"max_daily_requests": max(req_counts),
"std_daily_requests": self._std(req_counts),
"avg_daily_cost": sum(cost_values) / max(len(cost_values), 1),
"max_daily_cost": max(cost_values),
"known_models": models_used,
"known_ips": ips_seen,
"active_hours": {
h for h, c in hourly_distribution.items()
if c > len(records) * 0.01 # at least 1% of traffic
},
}
def analyze_records(
self,
records: list[UsageRecord],
) -> list[AnomalyAlert]:
"""Analyze usage records against baselines to detect anomalies."""
alerts = []
# Group by key and day
key_day_records: dict[str, dict[str, list[UsageRecord]]] = defaultdict(
lambda: defaultdict(list)
)
for r in records:
try:
day = datetime.fromisoformat(r.timestamp).strftime("%Y-%m-%d")
except (ValueError, TypeError):
continue
key_day_records[r.api_key_prefix][day].append(r)
for key_prefix, days in key_day_records.items():
baseline = self.baselines.get(key_prefix)
if not baseline:
continue
for day, day_records in days.items():
alerts.extend(
self._check_volume_anomaly(key_prefix, day, day_records, baseline)
)
alerts.extend(
self._check_cost_anomaly(key_prefix, day, day_records, baseline)
)
alerts.extend(
self._check_model_anomaly(key_prefix, day_records, baseline)
)
alerts.extend(
self._check_ip_anomaly(key_prefix, day_records, baseline)
)
alerts.extend(
self._check_time_anomaly(key_prefix, day_records, baseline)
)
return sorted(alerts, key=lambda a: a.timestamp)
def _check_volume_anomaly(
self, key_prefix: str, day: str,
records: list[UsageRecord], baseline: dict,
) -> list[AnomalyAlert]:
"""Check for request volume anomalies."""
alerts = []
count = len(records)
avg = baseline["avg_daily_requests"]
std = baseline["std_daily_requests"]
if std > 0 and count > avg + 3 * std:
severity = "critical" if count > avg * 10 else "high"
alerts.append(AnomalyAlert(
alert_type="volume_spike",
severity=severity,
description=(
f"Key {key_prefix}: {count} requests on {day}, "
f"baseline avg={avg:.0f}, std={std:.0f}"
),
evidence={
"day": day,
"request_count": count,
"baseline_avg": avg,
"baseline_std": std,
"multiplier": round(count / max(avg, 1), 1),
},
timestamp=records[0].timestamp,
api_key_prefix=key_prefix,
))
return alerts
def _check_cost_anomaly(
self, key_prefix: str, day: str,
records: list[UsageRecord], baseline: dict,
) -> list[AnomalyAlert]:
"""Check for cost anomalies."""
alerts = []
total_cost = sum(r.cost_usd for r in records)
avg_cost = baseline["avg_daily_cost"]
if total_cost > max(avg_cost * 5, 10.0): # 5x baseline or $10 min
alerts.append(AnomalyAlert(
alert_type="cost_spike",
severity="critical",
description=(
f"Key {key_prefix}: ${total_cost:.2f} on {day}, "
f"baseline avg=${avg_cost:.2f}"
),
evidence={
"day": day,
"total_cost": total_cost,
"baseline_avg_cost": avg_cost,
},
timestamp=records[0].timestamp,
api_key_prefix=key_prefix,
))
return alerts
def _check_model_anomaly(
self, key_prefix: str,
records: list[UsageRecord], baseline: dict,
) -> list[AnomalyAlert]:
"""Check for usage of unexpected models."""
alerts = []
known = baseline.get("known_models", set())
for r in records:
if r.model not in known:
alerts.append(AnomalyAlert(
alert_type="unknown_model",
severity="high",
description=(
f"Key {key_prefix}: model '{r.model}' not in baseline. "
f"Known models: {known}"
),
evidence={
"model": r.model,
"known_models": list(known),
},
timestamp=r.timestamp,
api_key_prefix=key_prefix,
))
break # One alert per model is sufficient
return alerts
def _check_ip_anomaly(
self, key_prefix: str,
records: list[UsageRecord], baseline: dict,
) -> list[AnomalyAlert]:
"""Check for requests from unknown IP addresses."""
alerts = []
known_ips = baseline.get("known_ips", set())
if not known_ips:
return alerts
unknown_ips = set()
for r in records:
if r.source_ip and r.source_ip not in known_ips:
unknown_ips.add(r.source_ip)
if unknown_ips:
alerts.append(AnomalyAlert(
alert_type="unknown_source_ip",
severity="high",
description=(
f"Key {key_prefix}: requests from {len(unknown_ips)} "
f"unknown IP(s): {list(unknown_ips)[:5]}"
),
evidence={
"unknown_ips": list(unknown_ips),
"known_ips": list(known_ips)[:10],
},
timestamp=records[0].timestamp,
api_key_prefix=key_prefix,
))
return alerts
def _check_time_anomaly(
self, key_prefix: str,
records: list[UsageRecord], baseline: dict,
) -> list[AnomalyAlert]:
"""Check for requests outside normal operating hours."""
alerts = []
active_hours = baseline.get("active_hours", set())
if not active_hours:
return alerts
off_hours_count = 0
for r in records:
try:
hour = datetime.fromisoformat(r.timestamp).hour
if hour not in active_hours:
off_hours_count += 1
except (ValueError, TypeError):
continue
if off_hours_count > len(records) * 0.3: # >30% off-hours
alerts.append(AnomalyAlert(
alert_type="off_hours_usage",
severity="medium",
description=(
f"Key {key_prefix}: {off_hours_count}/{len(records)} "
f"requests outside normal hours"
),
evidence={
"off_hours_count": off_hours_count,
"total_records": len(records),
"active_hours": list(active_hours),
},
timestamp=records[0].timestamp,
api_key_prefix=key_prefix,
))
return alerts
@staticmethod
def _std(values: list[float]) -> float:
"""Calculate standard deviation."""
if len(values) < 2:
return 0.0
mean = sum(values) / len(values)
variance = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
        return variance ** 0.5
Monitoring Billing Dashboards
Beyond API-level monitoring, billing dashboards provide a secondary detection signal. Most AI providers offer billing alerts, but the default thresholds are often set too high to catch early-stage compromise. Configure billing alerts at levels that are meaningful for your actual usage: if you typically spend $50/day on OpenAI, set alerts at $75, $150, and $500. The first threshold catches slow leaks, the second catches active misuse, and the third catches large-scale abuse.
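The tiered thresholds can be derived mechanically from typical daily spend. The helper below is a sketch: the function name is illustrative, and the 1.5x/3x/10x multipliers simply mirror the $75/$150/$500 example for a $50/day baseline; tune them to your own usage.

```python
def billing_alert_thresholds(typical_daily_spend: float) -> dict:
    """Derive tiered billing alert thresholds from typical daily spend.

    Multipliers are a starting point, not provider defaults:
    1.5x catches slow leaks, 3x active misuse, 10x large-scale abuse.
    """
    return {
        "slow_leak": round(typical_daily_spend * 1.5, 2),
        "active_misuse": round(typical_daily_spend * 3, 2),
        "large_scale_abuse": round(typical_daily_spend * 10, 2),
    }
```

For the $50/day example above, this yields exactly the $75/$150/$500 tiers.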
Scope Assessment
Determining What Was Exposed
Once a compromised key is identified, the immediate question is: what can an attacker do with it? The answer depends on the key's permissions and what it is connected to.
@dataclass
class KeyScopeAssessment:
    """Assessment of what a compromised key can access."""
key_prefix: str
provider: str
    permissions: list[str]
connected_services: list[str]
data_exposure_risk: str # low, medium, high, critical
financial_exposure: str
findings: list[str]
def assess_openai_key_scope(
key_prefix: str,
org_settings: dict,
usage_history: list[dict],
) -> KeyScopeAssessment:
"""
    Assess the scope of a compromised OpenAI API key.
Args:
key_prefix: The key identifier (last 4 chars).
org_settings: Organization settings from OpenAI dashboard.
usage_history: Historical usage records.
Returns:
KeyScopeAssessment with full scope analysis.
"""
    permissions = []
findings = []
connected_services = []
data_risk = "low"
financial_risk = "low"
# Check what models are accessible
models_used = {r.get("model") for r in usage_history if r.get("model")}
    permissions.append(f"model_access: {models_used}")
    # Check for fine-tuning access
fine_tune_records = [
r for r in usage_history
if "fine_tun" in r.get("endpoint", "").lower()
]
if fine_tune_records:
        permissions.append("fine_tuning")
        findings.append(
            "CRITICAL: Key has fine-tuning access. Attacker could "
            "create poisoned models in your organization."
        )
data_risk = "critical"
# Check for file upload access
file_records = [
r for r in usage_history
if "/files" in r.get("endpoint", "")
]
if file_records:
        permissions.append("file_upload")
        findings.append(
            "HIGH: Key has file API access. Attacker could upload "
            "training data or access existing uploaded files."
)
data_risk = max(data_risk, "high", key=_risk_ord)
# Check for assistants/threads access (may contain conversation history)
assistant_records = [
r for r in usage_history
if "/assistants" in r.get("endpoint", "")
or "/threads" in r.get("endpoint", "")
]
if assistant_records:
        permissions.append("assistants_api")
connected_services.append("assistants")
findings.append(
"HIGH: Key has Assistants API access. Attacker could "
"read existing thread histories containing user data."
)
data_risk = max(data_risk, "high", key=_risk_ord)
# Estimate financial exposure
monthly_limit = org_settings.get("monthly_spend_limit")
if monthly_limit:
financial_risk = (
"critical" if monthly_limit > 10000
else "high" if monthly_limit > 1000
else "medium"
)
findings.append(
f"Monthly spend limit: ${monthly_limit}. "
f"Attacker could consume up to this amount."
)
else:
financial_risk = "critical"
findings.append(
"WARNING: No monthly spend limit configured. "
"Attacker has unlimited financial exposure."
)
return KeyScopeAssessment(
key_prefix=key_prefix,
provider="openai",
        permissions=permissions,
connected_services=connected_services,
data_exposure_risk=data_risk,
financial_exposure=financial_risk,
findings=findings,
)
def _risk_ord(level: str) -> int:
"""Convert risk level to ordinal for comparison."""
    return {"low": 0, "medium": 1, "high": 2, "critical": 3}.get(level, -1)
Assessing Connected Systems
A compromised API key is often more dangerous because of what it is connected to than because of the direct API access it provides. Map out the key's integration points. Is it used by an application that has access to a vector database? Does the application have tool-use capabilities that give models access to internal systems? Is the key shared across multiple services?
For each connected system, document what data an attacker could access through the AI application, what actions they could trigger through tool use, and whether any rate limits or access controls exist beyond the API key itself.
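One way to make that mapping concrete is a small inventory structure. The sketch below is illustrative (the field names are assumptions, not a standard schema); the sort puts systems with no control beyond the key itself at the front of the review queue.

```python
from dataclasses import dataclass

@dataclass
class ConnectedSystem:
    """One integration point reachable through the compromised key.

    Field names are illustrative, not a standard inventory schema.
    """
    name: str
    data_accessible: list          # e.g. vector DB collections, CRM records
    actions_triggerable: list      # e.g. tool calls the model can invoke
    has_rate_limit: bool = False
    has_secondary_auth: bool = False  # any control beyond the API key?

def highest_risk_systems(systems: list) -> list:
    """Order systems for review: no secondary auth first, then no rate
    limit, then the largest number of triggerable actions."""
    return sorted(
        systems,
        key=lambda s: (
            s.has_secondary_auth,
            s.has_rate_limit,
            -len(s.actions_triggerable),
        ),
    )
```

A tool-wielding agent with no controls beyond the key sorts ahead of a rate-limited, separately authenticated RAG search service.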
Forensic Reconstruction of Attacker Activity
Building the Attack Timeline
With the compromised key identified and its scope assessed, the next step is reconstructing exactly what the attacker did. This requires correlating API usage logs, billing records, and any application-level logs.
from enum import Enum
class AttackPhase(Enum):
RECONNAISSANCE = "reconnaissance"
CAPABILITY_TESTING = "capability_testing"
    EXPLOITATION = "exploitation"
PERSISTENCE = "persistence"
EXFILTRATION = "exfiltration"
@dataclass
class AttackerAction:
    """A reconstructed attacker action."""
timestamp: str
phase: AttackPhase
action: str
details: dict
confidence: float # 0-1
class APIKeyForensicReconstructor:
    """Reconstruct attacker activity from API usage logs
    after a key compromise."""
def reconstruct_timeline(
self,
usage_records: list[UsageRecord],
compromise_start: str,
known_legitimate_ips: set[str],
) -> list[AttackerAction]:
"""
        Reconstruct the attacker activity timeline.
Args:
usage_records: All usage records for the compromised key.
compromise_start: ISO timestamp of estimated compromise start.
known_legitimate_ips: IPs known to be legitimate.
Returns:
            Ordered list of reconstructed attacker actions.
"""
start_dt = datetime.fromisoformat(compromise_start)
# Filter to records after compromise and from unknown IPs
suspicious = []
for r in usage_records:
try:
r_dt = datetime.fromisoformat(r.timestamp)
except (ValueError, TypeError):
continue
if r_dt < start_dt:
continue
if r.source_ip and r.source_ip in known_legitimate_ips:
continue
suspicious.append(r)
suspicious.sort(key=lambda r: r.timestamp)
actions = []
        # Phase detection based on temporal patterns and request types
for i, record in enumerate(suspicious):
phase = self._classify_phase(record, i, suspicious)
action_desc = self._describe_action(record, phase)
actions.append(AttackerAction(
timestamp=record.timestamp,
phase=phase,
action=action_desc,
details={
"model": record.model,
"endpoint": record.endpoint,
"input_tokens": record.input_tokens,
"output_tokens": record.output_tokens,
"source_ip": record.source_ip,
"status_code": record.status_code,
"cost_usd": record.cost_usd,
},
confidence=0.7, # Default; refined by context
))
return actions
def _classify_phase(
self,
record: UsageRecord,
index: int,
all_records: list[UsageRecord],
) -> AttackPhase:
"""Classify the attack phase of a given record."""
# First few requests are typically reconnaissance
if index < 3:
return AttackPhase.RECONNAISSANCE
        # Small token counts suggest testing/probing
if record.input_tokens < 50 and record.output_tokens < 50:
return AttackPhase.CAPABILITY_TESTING
        # Fine-tuning or file endpoints suggest persistence
if any(
kw in record.endpoint.lower()
for kw in ["fine-tun", "files", "batch"]
):
return AttackPhase.PERSISTENCE
        # Large output token counts may indicate exfiltration
if record.output_tokens > 4000:
return AttackPhase.EXFILTRATION
return AttackPhase.EXPLOITATION
def _describe_action(
self,
record: UsageRecord,
phase: AttackPhase,
) -> str:
"""Generate a human-readable description of the action."""
descriptions = {
AttackPhase.RECONNAISSANCE: (
f"Probed {record.endpoint} with model {record.model}"
),
AttackPhase.CAPABILITY_TESTING: (
f"Tested {record.model} capabilities via {record.endpoint} "
f"({record.input_tokens} in, {record.output_tokens} out)"
),
AttackPhase.EXPLOITATION: (
f"Used {record.model} via {record.endpoint} "
                f"({record.input_tokens}+{record.output_tokens} tokens, "
f"${record.cost_usd:.4f})"
),
AttackPhase.PERSISTENCE: (
f"Accessed {record.endpoint}, possible persistence mechanism"
),
AttackPhase.EXFILTRATION: (
                f"Large output ({record.output_tokens} tokens) from "
f"{record.model}, possible data exfiltration"
),
}
return descriptions.get(phase, f"Unknown action on {record.endpoint}")
def generate_forensic_report(
self,
actions: list[AttackerAction],
key_scope: KeyScopeAssessment,
) -> str:
"""Generate a forensic report from reconstructed actions."""
lines = [
"=" * 70,
"API KEY COMPROMISE - FORENSIC INVESTIGATION REPORT",
"=" * 70,
"",
f"Compromised Key: ...{key_scope.key_prefix}",
f"Provider: {key_scope.provider}",
f"Data Exposure Risk: {key_scope.data_exposure_risk}",
f"Financial Exposure: {key_scope.financial_exposure}",
"",
"SCOPE FINDINGS:",
]
for finding in key_scope.findings:
lines.append(f" - {finding}")
lines.extend([
"",
f"ATTACKER ACTIONS: {len(actions)} total",
"",
])
# Group by phase
phase_groups: dict[AttackPhase, list[AttackerAction]] = defaultdict(list)
for action in actions:
phase_groups[action.phase].append(action)
for phase in AttackPhase:
phase_actions = phase_groups.get(phase, [])
if not phase_actions:
continue
lines.append(f"--- {phase.value.upper()} ({len(phase_actions)} actions) ---")
for a in phase_actions[:10]: # Show first 10 per phase
lines.append(f" [{a.timestamp}] {a.action}")
if len(phase_actions) > 10:
lines.append(f" ... and {len(phase_actions) - 10} more")
lines.append("")
# Financial impact
total_cost = sum(
a.details.get("cost_usd", 0) for a in actions
)
total_tokens = sum(
a.details.get("input_tokens", 0) + a.details.get("output_tokens", 0)
for a in actions
)
lines.extend([
"FINANCIAL IMPACT:",
            f"  Total attacker cost: ${total_cost:.2f}",
            f"  Total tokens consumed: {total_tokens:,}",
])
        return "\n".join(lines)
Source Identification: How Was the Key Leaked?
Common Leak Vectors
Identifying how the key was compromised is essential for remediation. The most common leak vectors for AI API keys are:
Source code repositories: Keys committed to public or improperly secured repositories. Tools like GitHub's secret scanning, truffleHog, and GitLeaks can search commit history. Remember that even if the key is removed in a subsequent commit, it remains in git history.
Client-side exposure: Keys embedded in frontend JavaScript, mobile applications, or browser extensions. These are trivially extractable by any user.
CI/CD pipeline logs: Keys printed in build logs, test output, or deployment scripts. Many CI systems retain logs for weeks or months.
Environment variable leaks: Debug endpoints, error pages, or server status pages that dump environment variables.
Third-party service compromise: If the key is stored in a secrets manager, configuration service, or shared credential store that is itself compromised.
import subprocess
from pathlib import Path
def scan_git_history_for_key(
repo_path: str,
key_pattern: str,
) -> list[dict]:
"""
Scan git history for occurrences of an API key pattern.
Args:
repo_path: Path to the git repository.
key_pattern: Regex pattern matching the key format
(e.g., 'sk-[a-zA-Z0-9]{48}' for OpenAI keys).
Returns:
List of commits where the key was found.
"""
results = []
    try:
        # Pickaxe search: -G treats the pattern as a regex and reports
        # commits whose diffs add or remove a line matching it.
        cmd = [
            "git", "-C", repo_path, "log", "--all",
            f"-G{key_pattern}", "--format=%H|%an|%ae|%aI|%s",
        ]
        output = subprocess.run(
cmd, capture_output=True, text=True, timeout=120,
)
        for line in output.stdout.strip().split("\n"):
if not line or "|" not in line:
continue
parts = line.split("|", 4)
if len(parts) >= 5:
results.append({
"commit_hash": parts[0],
"author_name": parts[1],
"author_email": parts[2],
"date": parts[3],
"message": parts[4],
})
except subprocess.TimeoutExpired:
results.append({"error": "Git search timed out"})
except FileNotFoundError:
results.append({"error": "Git not found or invalid repo path"})
    return results
Investigating Third-Party Exposure
Check whether the key appears in public data breach dumps, paste sites, or AI-specific key trading channels. Services like Have I Been Pwned (for associated email addresses) and specialized dark web monitoring can indicate whether the key has been publicly exposed. If the key is found in a public repository through GitHub's secret scanning alerts, the exposure timestamp helps establish the start of the compromise window.
Remediation Procedures
Immediate Response
The immediate response to a confirmed API key compromise follows this sequence:
1. Rotate the key immediately. Generate a new key and update all legitimate consumers before disabling the old one; rotating first lets you cut over without downtime, whereas simply disabling the old key first causes an outage.
2. Set or reduce spending limits. If the provider supports it, set a hard spending cap to limit further financial damage while you complete the rotation.
3. Review recent API activity. Use the forensic reconstruction techniques above to understand what the attacker did with the key.
4. Check for persistence. If the attacker had fine-tuning access, review all fine-tuning jobs. If they had file access, review uploaded files. If they had assistant access, review assistants and threads for modifications.
5. Notify affected parties. If user data was potentially exposed through the compromised key (e.g., through access to conversation histories or RAG-connected data), initiate your data breach notification procedures.
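During an active incident it is easy to lose track of which response steps are done. As a sketch, the sequence can be tracked as a lightweight runbook so responders always know the next outstanding action; the step names here are shorthand for the list above, not a standard taxonomy.

```python
from dataclasses import dataclass, field

# Shorthand names for the numbered immediate-response sequence above.
REMEDIATION_STEPS = [
    "rotate_key",
    "set_spend_limit",
    "review_api_activity",
    "check_persistence",
    "notify_affected_parties",
]

@dataclass
class RemediationRunbook:
    """Track completion of the immediate-response sequence."""
    completed: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        if step not in REMEDIATION_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.completed.add(step)

    def next_step(self):
        """Return the earliest step not yet completed, or None when done."""
        for step in REMEDIATION_STEPS:
            if step not in self.completed:
                return step
        return None
```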
Conducting a Post-Incident Review
After the immediate response is complete, conduct a structured post-incident review that goes beyond the technical details of the compromise. Document the full timeline from initial key exposure through detection through containment. Calculate the total financial impact, including both the attacker's consumption and the cost of the investigation and remediation effort.
Critically assess your detection capabilities. How long was the key compromised before detection (dwell time)? What signal ultimately triggered the investigation? Could that signal have been detected earlier with different alerting thresholds? Many organizations discover that their API key compromise dwell time is measured in weeks or months because their monitoring was not granular enough to catch the early stages of abuse.
Review the key management lifecycle that led to the exposure. Map out every system and person who had access to the key. Identify where the key was stored, how it was distributed, and what controls were in place at each point. The root cause is rarely "the key was leaked"; it is "the key was in a location where it could be leaked because our key management process allowed it."
Document specific indicators of compromise (IOCs) from the investigation: the attacker's IP addresses, user agents, request patterns, and timing characteristics. Share these IOCs with your security operations team so they can build detections that catch similar activity in the future, and consider sharing them with your AI provider's abuse team.
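A minimal IOC record for this handoff might look like the following sketch; the field names and the `to_shareable` helper are illustrative, not a standard threat-intelligence format.

```python
from dataclasses import dataclass, field

@dataclass
class CompromiseIOCs:
    """Indicators of compromise gathered during the investigation.

    Field names are illustrative; adapt them to your SOC's schema.
    """
    key_prefix: str
    attacker_ips: list = field(default_factory=list)
    user_agents: list = field(default_factory=list)
    request_patterns: list = field(default_factory=list)  # e.g. endpoint sequences
    active_hours_utc: list = field(default_factory=list)

    def to_shareable(self) -> dict:
        """Flatten into a plain dict for handoff to SecOps or the provider."""
        return {
            "type": "api-key-compromise",
            "key_prefix": self.key_prefix,
            "indicators": {
                "ips": self.attacker_ips,
                "user_agents": self.user_agents,
                "request_patterns": self.request_patterns,
                "active_hours_utc": self.active_hours_utc,
            },
        }
```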
Provider-Specific Investigation Resources
Each major AI API provider offers different tools for investigating compromised keys. OpenAI provides a usage dashboard with per-key breakdowns and an API for programmatic usage queries. The organization audit log captures key creation, modification, and deletion events. Anthropic's console provides usage tracking by API key with cost breakdowns. Google Cloud's AI Platform integrates with Cloud Audit Logs for detailed request-level tracking. For Azure OpenAI, diagnostic logs in Azure Monitor provide the most detailed view, including full request and response content when enabled.
When working with provider support teams during an investigation, provide them with the compromised key identifier, the suspected compromise window, and any attacker IP addresses you have identified. Most providers can supply additional forensic data from their side, including request-level details that may not be available through your own logging.
Long-Term Hardening
After the immediate incident is resolved, implement controls to prevent recurrence:
- Use scoped API keys with minimum necessary permissions. If your application only needs chat completions, do not give it file or fine-tuning access.
- Implement API key rotation on a regular schedule (at least quarterly).
- Use secrets managers (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault) rather than environment variables or configuration files.
- Deploy pre-commit hooks that scan for API key patterns before code is committed.
- Configure billing alerts at meaningful thresholds.
- Implement IP allowlisting where supported by the provider.
- Monitor for leaked keys using automated scanning of your repositories and third-party exposure monitoring.
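The pre-commit scanning item can be as simple as a regex pass over staged content. The patterns below are illustrative examples only; real key formats vary by provider and change over time, so treat this as a starting point rather than an authoritative list.

```python
import re

# Illustrative patterns only; extend and update for your providers.
KEY_PATTERNS = {
    "openai": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "anthropic": re.compile(r"sk-ant-[A-Za-z0-9_\-]{20,}"),
}

def scan_text_for_keys(text: str) -> list:
    """Return (provider, masked_key) pairs for any matches in the text."""
    hits = []
    for provider, pattern in KEY_PATTERNS.items():
        for match in pattern.finditer(text):
            token = match.group(0)
            # Never log the full key; keep only a masked form.
            hits.append((provider, token[:6] + "..." + token[-4:]))
    return hits
```

Wire this into a pre-commit hook that scans staged diffs and rejects the commit on any hit.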
Proactive API Key Security Monitoring
Continuous Monitoring Architecture
A robust API key security program requires continuous monitoring rather than reactive investigation. Deploy an automated monitoring pipeline that ingests API usage data in near-real-time, applies the anomaly detection algorithms described above, and routes alerts to your security operations team.
The monitoring pipeline should process usage records within minutes of their generation. Most AI API providers offer usage data with a delay of 5-30 minutes. Configure the pipeline to pull usage data at this cadence, apply the baseline comparison, and generate alerts for anomalies that exceed your configured thresholds. Feed the alerts into your existing SIEM or security alerting infrastructure so they are triaged alongside other security events.
Regularly rotate the baselines used for anomaly detection. As your application's usage grows and changes, the baseline must evolve to avoid false positives from legitimate growth and false negatives from gradual abuse. Recompute baselines weekly using the most recent 30 days of data, excluding any days where confirmed incidents occurred.
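A sketch of the record-selection step for that weekly recompute, written to feed APIKeyUsageAnalyzer.build_baseline() from the detection section; the incident-day exclusion set is assumed to be maintained by the caller.

```python
from datetime import datetime, timedelta

def select_baseline_records(
    records: list,
    incident_days: set,
    now: datetime,
    window_days: int = 30,
) -> list:
    """Keep records from the trailing window, excluding confirmed-incident days.

    `records` are UsageRecord-like objects with an ISO `timestamp` field;
    `incident_days` holds "YYYY-MM-DD" strings. The output is intended
    as input to APIKeyUsageAnalyzer.build_baseline().
    """
    cutoff = now - timedelta(days=window_days)
    kept = []
    for r in records:
        try:
            dt = datetime.fromisoformat(r.timestamp)
        except (ValueError, TypeError):
            continue  # skip malformed timestamps, as the analyzer does
        if dt < cutoff or dt.strftime("%Y-%m-%d") in incident_days:
            continue
        kept.append(r)
    return kept
```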
References
- Lasso Security (2023). "Thousands of API keys and credentials found in public LLM training data." https://www.lasso.security/blog/thousands-of-api-keys-found-in-llm-training-data
- OWASP Foundation (2025). "OWASP Top 10 for LLM Applications - LLM06: Sensitive Information Disclosure." https://owasp.org/www-project-top-10-for-large-language-model-applications/
- Truffle 安全 (2024). "TruffleHog: Find and verify credentials in git repositories." https://github.com/trufflesecurity/trufflehog