AI Bug Bounty Programs
Comprehensive guide to active AI bug bounty programs from OpenAI, Anthropic, Google, and platform providers. Scope definitions, reward ranges, submission best practices, and AI-specific vulnerability categories.
AI bug bounty programs reward security researchers for discovering and responsibly disclosing vulnerabilities in AI systems. These programs differ significantly from traditional software bug bounties because the boundary between "feature" and "vulnerability" is less clear when the system's behavior is probabilistic rather than deterministic.
Active AI Bug Bounty Programs
OpenAI Bug Bounty (via Bugcrowd)
| Aspect | Details |
|---|---|
| Platform | Bugcrowd |
| Reward range | $200 - $20,000 |
| In scope | API vulnerabilities, authentication/authorization flaws, data exposure, infrastructure security |
| Out of scope | Model jailbreaks, safety bypasses, prompt injection (unless causing data exposure), content policy violations |
| Notable exclusion | "Model behavior issues" are explicitly excluded — this program focuses on traditional application security |
Key insight for researchers: OpenAI's bug bounty focuses on infrastructure and application-layer vulnerabilities, not model behavior. A prompt injection that causes the model to say something harmful is out of scope, but a prompt injection that leaks other users' data is in scope because it represents a data exposure vulnerability.
Google Vulnerability Reward Program (AI-specific)
| Aspect | Details |
|---|---|
| Platform | Google VRP (bughunters.google.com) |
| Reward range | $500 - $31,337+ |
| In scope | Training data extraction, model manipulation attacks, adversarial examples affecting Google AI products, prompt injection in Google AI features |
| Notable inclusion | Google has expanded scope to include some AI-specific attack categories including prompt injection and training data extraction |
Key insight for researchers: Google's program is one of the broadest, explicitly including AI-specific vulnerability categories. Their AI-specific rewards were expanded in 2024 to cover adversarial attacks on Gemini and other Google AI products.
Anthropic Security Reporting
| Aspect | Details |
|---|---|
| Platform | Direct reporting (security@anthropic.com) |
| Reward range | Case-by-case evaluation |
| In scope | Security vulnerabilities in Claude and Anthropic's infrastructure |
| Focus areas | Novel safety vulnerabilities, security-relevant model behaviors, infrastructure flaws |
HackerOne AI Programs
Multiple companies run AI-specific programs through HackerOne:
| Company | Focus | Typical Rewards |
|---|---|---|
| Microsoft | Bing Chat, Copilot, Azure AI | $500 - $30,000 |
| Meta | Llama safety, AI products | $500 - $40,000 |
| Various startups | Application-level AI security | $100 - $10,000 |
What Qualifies as an AI Vulnerability?
The distinction between "the model did something unintended" and "this is a security vulnerability" is the central challenge of AI bug bounties.
Generally In Scope
| Category | Example | Why It Qualifies |
|---|---|---|
| Data exposure | Extracting other users' conversations | Confidentiality violation |
| Authentication bypass | Accessing admin endpoints without credentials | Authorization failure |
| Training data extraction | Retrieving verbatim training data including PII | Privacy violation |
| Cost exhaustion | Bypassing rate limits to generate unbounded costs | Financial impact |
| Cross-tenant data access | One user accessing another's fine-tuned model data | Isolation failure |
| Infrastructure RCE | Code execution on API servers via crafted inputs | Infrastructure compromise |
Generally Out of Scope
| Category | Example | Why It Is Excluded |
|---|---|---|
| Jailbreaks | Getting the model to produce harmful text | No concrete security impact beyond content |
| Hallucinations | Model generates false information | Expected model behavior, not a vulnerability |
| Prompt injection (content only) | Overriding system prompt to change model behavior | Content-only impact; becomes in scope once it causes data exposure or unauthorized actions |
| Bias | Model exhibits demographic bias | Safety/ethics concern, not security vulnerability |
| Content policy violations | Model generates policy-violating content | Handled through safety teams, not security bounties |
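The two scope tables above can be condensed into a rough triage heuristic: does the finding name a concrete security impact, or only a model-behavior issue? The sketch below is illustrative only — the keyword lists and the in/out call are assumptions for demonstration, not any program's official policy, so always read each program's scope page.

```python
# Hypothetical triage helper summarizing the scope tables above.
# The keyword lists are illustrative, not any program's official policy.

IMPACT_KEYWORDS = {
    "data exposure", "authentication bypass", "training data extraction",
    "cost exhaustion", "cross-tenant access", "remote code execution",
}

CONTENT_ONLY_KEYWORDS = {
    "jailbreak", "hallucination", "bias", "content policy violation",
}

def triage(finding: str) -> str:
    """Rough in/out-of-scope call based on the keyword lists above."""
    text = finding.lower()
    if any(keyword in text for keyword in IMPACT_KEYWORDS):
        return "likely in scope: concrete security impact"
    if any(keyword in text for keyword in CONTENT_ONLY_KEYWORDS):
        return "likely out of scope: model-behavior issue"
    return "ambiguous: articulate the concrete security impact before reporting"
```

For example, `triage("prompt injection leading to cross-tenant access")` lands in scope because the chained impact is concrete, while a bare jailbreak does not — which mirrors how most programs draw the line.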
Writing Effective AI Vulnerability Reports
AI vulnerability reports require additional context compared to traditional security reports because the behavior is probabilistic.
Demonstrate Reproducibility
Include exact prompts, API parameters, model version, and timestamp. Provide success rates over multiple attempts (e.g., "Successful in 7 of 10 attempts"). AI vulnerabilities may not reproduce 100% of the time — quantify the success rate.
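A minimal sketch of the bookkeeping behind a claim like "successful in 7 of 10 attempts": run the same probe repeatedly and report the rate. The `fake_probe` below is a stand-in with an assumed ~70% success chance — in real testing the probe would send your exact prompt or API request and check the response.

```python
import random

def measure_reproduction_rate(probe, attempts: int = 10, seed: int = 0):
    """Run a zero-argument probe repeatedly and report its success rate.

    `probe` returns True on success; seeding makes the demo repeatable.
    """
    random.seed(seed)
    successes = sum(1 for _ in range(attempts) if probe())
    return successes, attempts, f"{successes}/{attempts} attempts"

def fake_probe() -> bool:
    # Stand-in for a probabilistic exploit attempt (assumption: ~70% success).
    return random.random() < 0.7

successes, attempts, summary = measure_reproduction_rate(fake_probe)
```

The `summary` string drops straight into a report's `reproduction_rate` field; for a real finding, also log each attempt's raw request and response for triage.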
Prove Security Impact
Clearly articulate what an attacker gains beyond unexpected model output. Data exposure? Financial loss? Unauthorized access? If you cannot articulate concrete security impact, the finding is likely out of scope.
Isolate the Root Cause
Differentiate between model behavior issues and application-layer vulnerabilities. Is this a prompt injection that only affects the model's text output, or does it trigger unauthorized actions through tool use or data access?
Provide Minimal Reproduction Steps
Strip your reproduction to the minimum steps needed. Remove unnecessary complexity — the simpler the reproduction, the faster the triage.
Document the Environment
Include API version, model identifier, date and time of testing, geographic region (some deployments vary by region), and any relevant account configuration.
# Example: structured vulnerability report template
vulnerability_report = {
"title": "Cross-tenant conversation data exposure via crafted API request",
"severity": "High",
"model_version": "gpt-4-0125-preview",
"testing_date": "2026-02-15",
"reproduction_rate": "7/10 attempts",
"steps_to_reproduce": [
"1. Create two separate accounts (Account A, Account B)",
"2. Start a conversation with Account B containing sensitive data",
"3. Using Account A's API key, send the following request...",
"4. Observe Account B's conversation data in the response",
],
"security_impact": "Unauthorized access to other users' conversation data",
"root_cause_hypothesis": "Insufficient tenant isolation in conversation retrieval",
}

Responsible Disclosure Challenges in AI
AI vulnerability disclosure presents unique challenges:
Non-deterministic reproduction: Unlike a traditional SQL injection that works 100% of the time with the right input, AI vulnerabilities may succeed probabilistically. Establish a clear success rate before reporting.
Rapid model updates: AI providers update models frequently, sometimes daily. A vulnerability you found last week may not reproduce on the current model version. Document the exact model version and test date.
Scope ambiguity: The line between "jailbreak" (out of scope) and "security vulnerability" (in scope) is often debated. When in doubt, frame your finding in terms of concrete security impact rather than model behavior.
Patch verification: It is difficult to verify that an AI vulnerability has been "fixed" since model updates are opaque. A behavior that disappears after a model update may resurface in the next version.
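One way to make a probabilistic success rate more credible is to report a confidence interval rather than a bare fraction. The sketch below uses the standard Wilson score interval — a general statistics technique, not something any bounty program mandates.

```python
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a success proportion (z=1.96 -> ~95% CI).

    Lets a report say e.g. "7/10 attempts (95% CI roughly 0.40-0.89)"
    instead of quoting a bare rate from a small sample.
    """
    if trials == 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = p + z**2 / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2))
    return (centre - margin) / denom, (centre + margin) / denom

low, high = wilson_interval(7, 10)  # the "7/10 attempts" example -> ~(0.40, 0.89)
```

The wide interval for a 10-attempt sample is itself useful information: it tells triagers how many additional attempts they should expect to need before the behavior reproduces.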
Maximizing Bug Bounty Success
- Start with infrastructure — Traditional web application vulnerabilities in AI product surfaces (APIs, dashboards, authentication) often have clearer scope and faster payouts
- Focus on data exposure — Any finding that demonstrates unauthorized data access is almost always in scope
- Chain AI and traditional bugs — The most impactful findings combine AI-specific techniques (prompt injection) with traditional vulnerabilities (IDOR, SSRF) to achieve concrete security impact
- Read the scope carefully — Every program defines AI vulnerability scope differently; a finding in scope for one program may be out of scope for another
- Build relationships — Engage with program teams, provide quality reports, and establish yourself as a reliable researcher
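As a concrete sketch of the injection-to-traditional-bug chain: a prompt injection that only changes text output is usually out of scope, but one that steers a URL-fetching tool at internal address space is a classic injection-to-SSRF chain. The helper below is an assumed test harness, not part of any provider's API; it only handles literal IPs and skips the DNS resolution a hostname target would need.

```python
import ipaddress
from urllib.parse import urlparse

def is_ssrf_candidate(url: str) -> bool:
    """Flag a model-emitted tool-call URL that targets internal address space.

    Catches literal-IP targets such as the cloud metadata endpoint
    (169.254.169.254) or RFC 1918 ranges. Hostname targets (e.g.
    "metadata.internal") would additionally need resolving, which this
    sketch deliberately skips.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not a literal IP; would need DNS resolution to decide
    return addr.is_private or addr.is_link_local or addr.is_loopback
```

Running a corpus of injection payloads through an AI agent and flagging any tool-call URLs this way turns "the model did something odd" into a demonstrable SSRF finding with concrete security impact.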
For related methodology, see Reconnaissance & Tradecraft, Infrastructure Security, and Report Writing.
Related Topics
- Notable AI Security Incidents -- incident database and analysis framework
- Lessons Learned from AI Security Incidents -- recurring patterns from disclosed vulnerabilities
- Authorization, Contracts & Liability -- legal considerations for bug bounty participation
- Ethics & Responsible Disclosure -- disclosure guidelines for AI vulnerability findings