OSINT for AI Red Teaming
Gathering intelligence about AI deployments from public sources: documentation, job postings, research papers, social media, and technical artifacts.
OSINT for AI red teaming leverages publicly available information to build a detailed picture of a target's AI deployment before any direct interaction. Organizations routinely disclose significant details about their AI systems through documentation, blog posts, conference talks, job postings, and open-source contributions. This information, when systematically collected and analyzed, provides the red team with a substantial advantage.
Intelligence Sources
Official Documentation
The target's own documentation is the first and most reliable OSINT source:
| Source | What It Reveals |
|---|---|
| API documentation | Endpoints, parameters, model names, rate limits |
| Developer guides | Architecture, integration patterns, system design |
| Model cards | Training data, capabilities, known limitations |
| Changelog/release notes | Version history, recent changes, bug fixes |
| Terms of service | Usage restrictions, data handling, liability |
| Privacy policy | Data collection, retention, processing details |
| Status pages | Infrastructure details, uptime history |
Job Postings
Job listings reveal technology stacks, team structure, and current initiatives:
Example job posting analysis:
"Senior ML Engineer - Safety Team"
- "Experience with RLHF and constitutional AI methods"
→ Uses RLHF for safety alignment
- "Familiarity with vLLM or TGI deployment frameworks"
→ Self-hosted models, not exclusively API-based
- "Experience with Redis and PostgreSQL"
→ Infrastructure details
- "Knowledge of RAG architectures and vector databases"
→ Uses RAG system for knowledge retrieval
- "Experience with Weights & Biases or MLflow"
→ Training infrastructure details
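The mapping from posting phrases to deployment inferences can be sketched as a simple keyword lookup. This is a minimal illustration: the keyword/inference pairs are taken from the example posting above and would be extended per engagement, and naive substring matching will produce false positives on short keywords.

```python
# Map technology keywords found in job postings to deployment inferences.
# The pairs below mirror the example posting analysis above; extend per target.
POSTING_SIGNALS = {
    "rlhf": "Uses RLHF for safety alignment",
    "vllm": "Self-hosted models, not exclusively API-based",
    "tgi": "Self-hosted models, not exclusively API-based",
    "rag": "Uses RAG system for knowledge retrieval",
    "vector database": "Uses RAG system for knowledge retrieval",
    "mlflow": "Maintains training infrastructure",
    "weights & biases": "Maintains training infrastructure",
}

def analyze_posting(text):
    """Return the set of inferences triggered by a job posting."""
    lowered = text.lower()
    return {inference for keyword, inference in POSTING_SIGNALS.items()
            if keyword in lowered}

posting = ("Senior ML Engineer - Safety Team. Experience with RLHF and "
           "constitutional AI methods; familiarity with vLLM or TGI; "
           "knowledge of RAG architectures and vector databases.")
print(analyze_posting(posting))
```

Running this against the example posting yields the same three inferences drawn manually above, which makes the approach useful for triaging large numbers of historical postings before reading them closely.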
Research Papers and Blog Posts
Technical blog posts and research papers from the target organization:
| Content Type | Intelligence Value |
|---|---|
| Technical blog posts | Architecture decisions, technology choices, system design |
| Research papers | Model details, training methodology, evaluation results |
| Conference talks | System demos, architecture diagrams, roadmap hints |
| Open-source repos | Code patterns, dependencies, configuration formats |
| Incident postmortems | Previous vulnerabilities, response capabilities |
Social Media and Forums
Engineers discuss their work on public platforms:
- Twitter/X: AI team members often share insights about their work
- LinkedIn: Team composition, individual expertise, recent projects
- Reddit: Technical discussions in AI subreddits
- Hacker News: Launch announcements, technical discussions
- Discord/Slack communities: Some AI teams maintain public communities
- GitHub: Personal repos may contain related experiments or tooling
Technical Artifacts
Public technical artifacts that reveal deployment details:
The original listing left `fetch` undefined; a standard-library implementation is included below so the collector runs as written.

```python
import urllib.error
import urllib.request


class TechnicalArtifactCollector:
    """Collect public technical artifacts about AI deployments."""

    def fetch(self, url, timeout=10):
        """Fetch a URL, returning the response body or None on failure."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, ValueError):
            return None

    def check_common_artifacts(self, domain):
        """Check for publicly accessible technical artifacts."""
        artifacts = {}

        # robots.txt may reveal AI-related paths
        artifacts["robots_txt"] = self.fetch(f"https://{domain}/robots.txt")

        # OpenAPI/Swagger documentation
        swagger_paths = [
            "/swagger.json", "/openapi.json",
            "/api/docs", "/docs/api",
            "/v1/openapi.json",
        ]
        for path in swagger_paths:
            result = self.fetch(f"https://{domain}{path}")
            if result:
                artifacts["api_spec"] = result
                break  # first hit is enough

        # Status and health endpoints
        health_paths = ["/health", "/status", "/api/health"]
        for path in health_paths:
            result = self.fetch(f"https://{domain}{path}")
            if result:
                artifacts["health"] = result
                break

        # JavaScript bundles may also contain API endpoints
        # (requires manual analysis of page source)
        return artifacts
```
OSINT Methodology
Phase 1: Passive Collection
Gather information without directly interacting with the target:
Document official sources
Read all official documentation, API references, blog posts, and press releases. Extract model names, technology stack, and architectural details.
Search for team members
Identify AI team members through LinkedIn, GitHub, and conference speaker lists. Note their expertise and recent public statements about the system.
Analyze open-source contributions
Search GitHub for repositories maintained by the organization or team members. Look for system prompts, configuration files, training scripts, and dependency lists.
Review job postings
Current and historical job postings reveal technology decisions, team growth areas, and potential security gaps (e.g., no AI security roles posted).
Search for incident history
Search for news articles, social media posts, and forum discussions about past AI incidents involving the target.
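The "analyze open-source contributions" step above can be semi-automated by generating code-search queries up front. A minimal sketch, assuming GitHub's `org:` and `filename:` search qualifiers; the organization name, search terms, and filenames are placeholders to adapt per target:

```python
# Build GitHub code-search query strings for the passive-collection step.
# Terms and filenames below are illustrative examples of artifacts that
# commonly leak (system prompts, API keys, model configuration).
SENSITIVE_TERMS = ["system_prompt", "openai_api_key",
                   "anthropic_api_key", "prompt_template"]

PROMPT_FILENAMES = ("system_prompt.txt", "prompts.yaml", "model_config.json")

def build_github_queries(org):
    """Return query strings to run against GitHub code search."""
    queries = [f"org:{org} {term}" for term in SENSITIVE_TERMS]
    for filename in PROMPT_FILENAMES:
        queries.append(f"org:{org} filename:{filename}")
    return queries

for query in build_github_queries("example-org"):
    print(query)
```

Keeping the query list in version control lets the red team rerun the same sweep on each engagement and diff the results over time.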
Phase 2: Active Collection
Interact with publicly available systems to gather additional intelligence:
Interact with the public AI product
Use the AI system as a normal user. Note the model's behavior, formatting style, capability boundaries, and refusal patterns.
Analyze network requests
Use browser developer tools to observe API calls made by the AI product. Note endpoint URLs, request formats, and response structures.
Test publicly documented API
If an API is publicly available, make test calls to verify documentation accuracy and discover undocumented behavior.
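One way to combine the last two steps is to diff the endpoints listed in the public documentation against those observed in browser network traffic: anything observed but undocumented deserves closer testing. A minimal sketch, with hypothetical endpoint lists:

```python
def find_undocumented_endpoints(documented, observed):
    """Return endpoints seen in live traffic but absent from the docs."""
    return sorted(set(observed) - set(documented))

# Hypothetical data: the docs list two endpoints, devtools show three.
documented = ["/v1/chat", "/v1/models"]
observed = ["/v1/chat", "/v1/models", "/v1/internal/feedback"]
print(find_undocumented_endpoints(documented, observed))
# → ['/v1/internal/feedback']
```

Undocumented endpoints surfaced this way are often internal or deprecated routes with weaker input validation than the public API.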
Phase 3: Analysis and Synthesis
Combine collected information into an actionable target profile:
```python
class OSINTReport:
    """Structure OSINT findings into an actionable report."""

    PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

    def __init__(self, target_name):
        self.target = target_name
        self.findings = {
            "infrastructure": {},
            "models": {},
            "team": {},
            "security_posture": {},
            "attack_surface": {},
        }

    def synthesize(self):
        """Generate prioritized attack recommendations."""
        recommendations = []

        # If we know the model family, recommend targeted attacks
        if self.findings["models"].get("family"):
            recommendations.append({
                "priority": "high",
                "finding": f"Model identified as {self.findings['models']['family']}",
                "recommendation": "Use model-specific attack techniques",
            })

        # If we found system prompts or configurations in public repos
        if self.findings.get("exposed_configs"):
            recommendations.append({
                "priority": "critical",
                "finding": "System configuration found in public repository",
                "recommendation": "Craft injection payloads using exact system prompt",
            })

        # If no AI security roles appear in job postings
        if not self.findings["team"].get("has_security_team"):
            recommendations.append({
                "priority": "medium",
                "finding": "No evidence of dedicated AI security team",
                "recommendation": "Expect weaker defenses; start with basic techniques",
            })

        return sorted(recommendations,
                      key=lambda x: self.PRIORITY_ORDER[x["priority"]])
```
Common OSINT Findings
Based on experience across multiple red team engagements, these are the items most frequently discovered through OSINT:
| Finding | Frequency | Impact |
|---|---|---|
| System prompts in public repos | Common | Critical -- enables targeted injection |
| API keys in code/documentation | Occasional | Critical -- direct access to API |
| Model family identification | Very common | High -- enables attack selection |
| Technology stack details | Very common | Medium -- informs attack vectors |
| Team structure and expertise | Common | Medium -- reveals security maturity |
| Previous incident details | Occasional | Medium -- reveals past vulnerabilities |
| Training data sources | Rare | High -- enables data poisoning assessment |
Related Topics
- Target Profiling — Combining OSINT with direct profiling
- Model Identification — Technical model fingerprinting
- Social Engineering for AI — Using OSINT to inform social engineering
- Attack Surface Mapping — Comprehensive attack surface analysis
Review Question
While reviewing a target company's GitHub repositories, you find a file called 'system_prompt.txt' in a public repo. The file contains the full system prompt used by their production chatbot, including safety instructions and tool permissions. What is the immediate red team value of this finding?