OSINT for AI Red Teaming
Gathering intelligence about AI deployments from public sources: documentation, job postings, research papers, social media, and technical artifacts.
OSINT for AI red teaming leverages publicly available information to build a detailed picture of a target's AI deployment before any direct interaction. Organizations routinely disclose significant details about their AI systems through documentation, blog posts, conference talks, job postings, and open-source contributions. This information, when systematically collected and analyzed, provides the red team with a substantial advantage.
Intelligence Sources
Official Documentation
The target's own documentation is the first and most reliable OSINT source:
| Source | What It Reveals |
|---|---|
| API documentation | Endpoints, parameters, model names, rate limits |
| Developer guides | Architecture, integration patterns, system design |
| Model cards | Training data, capabilities, known limitations |
| Changelog/release notes | Version history, recent changes, bug fixes |
| Terms of service | Usage restrictions, data handling, liability |
| Privacy policy | Data collection, retention, processing details |
| Status pages | Infrastructure details, uptime history |
Job Postings
Job listings reveal technology stacks, team structure, and current initiatives:
Example job posting analysis:
"Senior ML Engineer - Safety Team"
- "Experience with RLHF and constitutional AI methods"
→ Uses RLHF for safety alignment
- "Familiarity with vLLM or TGI deployment frameworks"
→ Self-hosted models, not exclusively API-based
- "Experience with Redis and PostgreSQL"
→ Infrastructure details
- "Knowledge of RAG architectures and vector databases"
→ Uses RAG system for knowledge retrieval
- "Experience with Weights & Biases or MLflow"
→ Training infrastructure details
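The mapping from posting phrases to deployment inferences can be sketched as a simple keyword lookup. This is a minimal illustration: the keyword/inference pairs are taken from the example posting above and would be extended per engagement, and naive substring matching will produce false positives on short keywords.

```python
# Map technology keywords found in job postings to deployment inferences.
# The pairs below mirror the example posting analysis above; extend per target.
POSTING_SIGNALS = {
    "rlhf": "Uses RLHF for safety alignment",
    "vllm": "Self-hosted models, not exclusively API-based",
    "tgi": "Self-hosted models, not exclusively API-based",
    "rag": "Uses RAG system for knowledge retrieval",
    "vector database": "Uses RAG system for knowledge retrieval",
    "mlflow": "Maintains training infrastructure",
    "weights & biases": "Maintains training infrastructure",
}

def analyze_posting(text):
    """Return the set of inferences triggered by a job posting."""
    lowered = text.lower()
    return {inference for keyword, inference in POSTING_SIGNALS.items()
            if keyword in lowered}

posting = ("Senior ML Engineer - Safety Team. Experience with RLHF and "
           "constitutional AI methods; familiarity with vLLM or TGI; "
           "knowledge of RAG architectures and vector databases.")
print(analyze_posting(posting))
```

Running this against the example posting yields the same three inferences drawn manually above, which makes the approach useful for triaging large numbers of historical postings before reading them closely.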
Research Papers and Blog Posts
Technical blog posts and research papers from the target organization:
| Content Type | Intelligence Value |
|---|---|
| Technical blog posts | Architecture decisions, technology choices, system design |
| Research papers | Model details, training methodology, evaluation results |
| Conference talks | System demos, architecture diagrams, roadmap hints |
| Open-source repos | Code patterns, dependencies, configuration formats |
| Incident postmortems | Previous vulnerabilities, response capabilities |
Social Media and Forums
Engineers discuss their work on public platforms:
- Twitter/X: AI team members often share insights about their work
- LinkedIn: Team composition, individual expertise, recent projects
- Reddit: Technical discussions in AI subreddits
- Hacker News: Launch announcements, technical discussions
- Discord/Slack communities: Some AI teams maintain public communities
- GitHub: Personal repos may contain related experiments or tooling
Technical Artifacts
Public technical artifacts that reveal deployment details:
The original listing left `fetch` undefined; a standard-library implementation is included below so the collector runs as written.

```python
import urllib.error
import urllib.request


class TechnicalArtifactCollector:
    """Collect public technical artifacts about AI deployments."""

    def fetch(self, url, timeout=10):
        """Fetch a URL, returning the response body or None on failure."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, ValueError):
            return None

    def check_common_artifacts(self, domain):
        """Check for publicly accessible technical artifacts."""
        artifacts = {}

        # robots.txt may reveal AI-related paths
        artifacts["robots_txt"] = self.fetch(f"https://{domain}/robots.txt")

        # OpenAPI/Swagger documentation
        swagger_paths = [
            "/swagger.json", "/openapi.json",
            "/api/docs", "/docs/api",
            "/v1/openapi.json",
        ]
        for path in swagger_paths:
            result = self.fetch(f"https://{domain}{path}")
            if result:
                artifacts["api_spec"] = result
                break  # first hit is enough

        # Status and health endpoints
        health_paths = ["/health", "/status", "/api/health"]
        for path in health_paths:
            result = self.fetch(f"https://{domain}{path}")
            if result:
                artifacts["health"] = result
                break

        # JavaScript bundles may also contain API endpoints
        # (requires manual analysis of page source)
        return artifacts
```
OSINT Methodology
Phase 1: Passive Collection
Gather information without directly interacting with the target:
Document official sources
Read all official documentation, API references, blog posts, and press releases. Extract model names, technology stack, and architectural details.
Search for team members
Identify AI team members through LinkedIn, GitHub, and conference speaker lists. Note their expertise and recent public statements about the system.
Analyze open-source contributions
Search GitHub for repositories maintained by the organization or team members. Look for system prompts, configuration files, training scripts, and dependency lists.
Review job postings
Current and historical job postings reveal technology decisions, team growth areas, and potential security gaps (e.g., no AI security roles posted).
Search for incident history
Search for news articles, social media posts, and forum discussions about past AI incidents involving the target.
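The "analyze open-source contributions" step above can be semi-automated by generating code-search queries up front. A minimal sketch, assuming GitHub's `org:` and `filename:` search qualifiers; the organization name, search terms, and filenames are placeholders to adapt per target:

```python
# Build GitHub code-search query strings for the passive-collection step.
# Terms and filenames below are illustrative examples of artifacts that
# commonly leak (system prompts, API keys, model configuration).
SENSITIVE_TERMS = ["system_prompt", "openai_api_key",
                   "anthropic_api_key", "prompt_template"]

PROMPT_FILENAMES = ("system_prompt.txt", "prompts.yaml", "model_config.json")

def build_github_queries(org):
    """Return query strings to run against GitHub code search."""
    queries = [f"org:{org} {term}" for term in SENSITIVE_TERMS]
    for filename in PROMPT_FILENAMES:
        queries.append(f"org:{org} filename:{filename}")
    return queries

for query in build_github_queries("example-org"):
    print(query)
```

Keeping the query list in version control lets the red team rerun the same sweep on each engagement and diff the results over time.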
Phase 2: Active Collection
Interact with publicly available systems to gather additional intelligence:
Interact with the public AI product
Use the AI system as a normal user. Note the model's behavior, formatting style, capability boundaries, and refusal patterns.
Analyze network requests
Use browser developer tools to observe API calls made by the AI product. Note endpoint URLs, request formats, and response structures.
Test publicly documented API
If an API is publicly available, make test calls to verify documentation accuracy and discover undocumented behavior.
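One way to combine the last two steps is to diff the endpoints listed in the public documentation against those observed in browser network traffic: anything observed but undocumented deserves closer testing. A minimal sketch, with hypothetical endpoint lists:

```python
def find_undocumented_endpoints(documented, observed):
    """Return endpoints seen in live traffic but absent from the docs."""
    return sorted(set(observed) - set(documented))

# Hypothetical data: the docs list two endpoints, devtools show three.
documented = ["/v1/chat", "/v1/models"]
observed = ["/v1/chat", "/v1/models", "/v1/internal/feedback"]
print(find_undocumented_endpoints(documented, observed))
# → ['/v1/internal/feedback']
```

Undocumented endpoints surfaced this way are often internal or deprecated routes with weaker input validation than the public API.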
Phase 3: Analysis and Synthesis
Combine collected information into an actionable target profile:
```python
class OSINTReport:
    """Structure OSINT findings into an actionable report."""

    PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

    def __init__(self, target_name):
        self.target = target_name
        self.findings = {
            "infrastructure": {},
            "models": {},
            "team": {},
            "security_posture": {},
            "attack_surface": {},
        }

    def synthesize(self):
        """Generate prioritized attack recommendations."""
        recommendations = []

        # If we know the model family, recommend targeted attacks
        if self.findings["models"].get("family"):
            recommendations.append({
                "priority": "high",
                "finding": f"Model identified as {self.findings['models']['family']}",
                "recommendation": "Use model-specific attack techniques",
            })

        # If we found system prompts or configurations in public repos
        if self.findings.get("exposed_configs"):
            recommendations.append({
                "priority": "critical",
                "finding": "System configuration found in public repository",
                "recommendation": "Craft injection payloads using exact system prompt",
            })

        # If no AI security roles appear in job postings
        if not self.findings["team"].get("has_security_team"):
            recommendations.append({
                "priority": "medium",
                "finding": "No evidence of dedicated AI security team",
                "recommendation": "Expect weaker defenses; start with basic techniques",
            })

        return sorted(recommendations,
                      key=lambda x: self.PRIORITY_ORDER[x["priority"]])
```
Common OSINT Findings
Based on experience across multiple red team engagements, these are the items most frequently discovered through OSINT:
| Finding | Frequency | Impact |
|---|---|---|
| System prompts in public repos | Common | Critical -- enables targeted injection |
| API keys in code/documentation | Occasional | Critical -- direct access to API |
| Model family identification | Very common | High -- enables attack selection |
| Technology stack details | Very common | Medium -- informs attack vectors |
| Team structure and expertise | Common | Medium -- reveals security maturity |
| Previous incident details | Occasional | Medium -- reveals past vulnerabilities |
| Training data sources | Rare | High -- enables data poisoning assessment |
Related Topics
- Target Profiling — Combining OSINT with direct profiling
- Model Identification — Technical model fingerprinting
- Social Engineering for AI — Using OSINT to inform social engineering
- Attack Surface Mapping — Comprehensive attack surface analysis
Review Question
While reviewing a target company's GitHub repositories, you find a file called 'system_prompt.txt' in a public repo. The file contains the full system prompt used by their production chatbot, including safety instructions and tool permissions. What is the immediate red team value of this finding?