AI Bug Bounty Programs
Comprehensive guide to active AI bug bounty programs from OpenAI, Anthropic, Google, and platform providers. Scope definitions, reward ranges, submission best practices, and AI-specific vulnerability categories.
AI bug bounty programs reward security researchers for discovering and responsibly disclosing vulnerabilities in AI systems. These programs differ significantly from traditional software bug bounties because the boundary between "feature" and "vulnerability" is less clear when the system's behavior is probabilistic rather than deterministic.
Active AI Bug Bounty Programs
OpenAI Bug Bounty (via Bugcrowd)
| Aspect | Details |
|---|---|
| Platform | Bugcrowd |
| Reward range | $200 - $20,000 |
| In scope | API vulnerabilities, authentication/authorization flaws, data exposure, infrastructure security |
| Out of scope | Model jailbreaks, safety bypasses, prompt injection (unless causing data exposure), content policy violations |
| Notable exclusion | "Model behavior issues" are explicitly excluded — this program focuses on traditional application security |
Key insight for researchers: OpenAI's bug bounty focuses on infrastructure and application-layer vulnerabilities, not model behavior. A prompt injection that causes the model to say something harmful is out of scope, but a prompt injection that leaks other users' data is in scope because it represents a data exposure vulnerability.
Google Vulnerability Reward Program (AI-specific)
| Aspect | Details |
|---|---|
| Platform | Google VRP (bughunters.google.com) |
| Reward range | $500 - $31,337+ |
| In scope | Training data extraction, model manipulation attacks, adversarial examples affecting Google AI products, prompt injection in Google AI features |
| Notable inclusion | Google has expanded scope to include some AI-specific attack categories including prompt injection and training data extraction |
Key insight for researchers: Google's program is one of the broadest, explicitly including AI-specific vulnerability categories. Their AI-specific rewards were expanded in 2024 to cover adversarial attacks on Gemini and other Google AI products.
Anthropic Security Reporting
| Aspect | Details |
|---|---|
| Platform | Direct reporting (security@anthropic.com) |
| Reward range | Case-by-case evaluation |
| In scope | Security vulnerabilities in Claude and Anthropic's infrastructure |
| Focus areas | Novel safety vulnerabilities, security-relevant model behaviors, infrastructure flaws |
HackerOne AI Programs
Multiple companies run AI-specific programs through HackerOne:
| Company | Focus | Typical Rewards |
|---|---|---|
| Microsoft | Bing Chat, Copilot, Azure AI | $500 - $30,000 |
| Meta | Llama safety, AI products | $500 - $40,000 |
| Various startups | Application-level AI security | $100 - $10,000 |
What Qualifies as an AI Vulnerability?
The distinction between "the model did something unintended" and "this is a security vulnerability" is the central challenge of AI bug bounties.
Generally In Scope
| Category | Example | Why It Qualifies |
|---|---|---|
| Data exposure | Extracting other users' conversations | Confidentiality violation |
| Authentication bypass | Accessing admin endpoints without credentials | Authorization failure |
| Training data extraction | Retrieving verbatim training data including PII | Privacy violation |
| Cost exhaustion | Bypassing rate limits to generate unbounded costs | Financial impact |
| Cross-tenant data access | One user accessing another's fine-tuned model data | Isolation failure |
| Infrastructure RCE | Code execution on API servers via crafted inputs | Infrastructure compromise |
Generally Out of Scope
| Category | Example | Why It Is Excluded |
|---|---|---|
| Jailbreaks | Getting the model to produce harmful text | No concrete security impact beyond content |
| Hallucinations | Model generates false information | Expected model behavior, not a vulnerability |
| Prompt injection (content only) | Overriding system prompt to change model behavior | Content-only impact; becomes in scope once it causes data exposure or unauthorized actions |
| Bias | Model exhibits demographic bias | Safety/ethics concern, not security vulnerability |
| Content policy violations | Model generates policy-violating content | Handled through safety teams, not security bounties |
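The two scope tables above can be condensed into a rough triage heuristic: does the finding name a concrete security impact, or only a model-behavior issue? The sketch below is illustrative only — the keyword lists and the in/out call are assumptions for demonstration, not any program's official policy, so always read each program's scope page.

```python
# Hypothetical triage helper summarizing the scope tables above.
# The keyword lists are illustrative, not any program's official policy.

IMPACT_KEYWORDS = {
    "data exposure", "authentication bypass", "training data extraction",
    "cost exhaustion", "cross-tenant access", "remote code execution",
}

CONTENT_ONLY_KEYWORDS = {
    "jailbreak", "hallucination", "bias", "content policy violation",
}

def triage(finding: str) -> str:
    """Rough in/out-of-scope call based on the keyword lists above."""
    text = finding.lower()
    if any(keyword in text for keyword in IMPACT_KEYWORDS):
        return "likely in scope: concrete security impact"
    if any(keyword in text for keyword in CONTENT_ONLY_KEYWORDS):
        return "likely out of scope: model-behavior issue"
    return "ambiguous: articulate the concrete security impact before reporting"
```

For example, `triage("prompt injection leading to cross-tenant access")` lands in scope because the chained impact is concrete, while a bare jailbreak does not — which mirrors how most programs draw the line.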
Writing Effective AI Vulnerability Reports
AI vulnerability reports require additional context compared to traditional security reports because the behavior is probabilistic.
Demonstrate Reproducibility
Include exact prompts, API parameters, model version, and timestamp. Provide success rates over multiple attempts (e.g., "Successful in 7 of 10 attempts"). AI vulnerabilities may not reproduce 100% of the time — quantify the success rate.
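A minimal sketch of the bookkeeping behind a claim like "successful in 7 of 10 attempts": run the same probe repeatedly and report the rate. The `fake_probe` below is a stand-in with an assumed ~70% success chance — in real testing the probe would send your exact prompt or API request and check the response.

```python
import random

def measure_reproduction_rate(probe, attempts: int = 10, seed: int = 0):
    """Run a zero-argument probe repeatedly and report its success rate.

    `probe` returns True on success; seeding makes the demo repeatable.
    """
    random.seed(seed)
    successes = sum(1 for _ in range(attempts) if probe())
    return successes, attempts, f"{successes}/{attempts} attempts"

def fake_probe() -> bool:
    # Stand-in for a probabilistic exploit attempt (assumption: ~70% success).
    return random.random() < 0.7

successes, attempts, summary = measure_reproduction_rate(fake_probe)
```

The `summary` string drops straight into a report's `reproduction_rate` field; for a real finding, also log each attempt's raw request and response for triage.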
Prove Security Impact
Clearly articulate what an attacker gains beyond unexpected model output. Data exposure? Financial loss? Unauthorized access? If you cannot articulate concrete security impact, the finding is likely out of scope.
Isolate the Root Cause
Differentiate between model behavior issues and application-layer vulnerabilities. Is this a prompt injection that only affects the model's text output, or does it trigger unauthorized actions through tool use or data access?
Provide Minimal Reproduction Steps
Strip your reproduction to the minimum steps needed. Remove unnecessary complexity — the simpler the reproduction, the faster the triage.
Document the Environment
Include API version, model identifier, date and time of testing, geographic region (some deployments vary by region), and any relevant account configuration.
# Example: structured vulnerability report template
vulnerability_report = {
"title": "Cross-tenant conversation data exposure via crafted API request",
"severity": "High",
"model_version": "gpt-4-0125-preview",
"testing_date": "2026-02-15",
"reproduction_rate": "7/10 attempts",
"steps_to_reproduce": [
"1. Create two separate accounts (Account A, Account B)",
"2. Start a conversation with Account B containing sensitive data",
"3. Using Account A's API key, send the following request...",
"4. Observe Account B's conversation data in the response",
],
"security_impact": "Unauthorized access to other users' conversation data",
"root_cause_hypothesis": "Insufficient tenant isolation in conversation retrieval",
}

Responsible Disclosure Challenges in AI
AI vulnerability disclosure presents unique challenges:
Non-deterministic reproduction: Unlike a traditional SQL injection that works 100% of the time with the right input, AI vulnerabilities may succeed probabilistically. Establish a clear success rate before reporting.
Rapid model updates: AI providers update models frequently, sometimes daily. A vulnerability you found last week may not reproduce on the current model version. Document the exact model version and test date.
Scope ambiguity: The line between "jailbreak" (out of scope) and "security vulnerability" (in scope) is often debated. When in doubt, frame your finding in terms of concrete security impact rather than model behavior.
Patch verification: It is difficult to verify that an AI vulnerability has been "fixed" since model updates are opaque. A behavior that disappears after a model update may resurface in the next version.
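One way to make a probabilistic success rate more credible is to report a confidence interval rather than a bare fraction. The sketch below uses the standard Wilson score interval — a general statistics technique, not something any bounty program mandates.

```python
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a success proportion (z=1.96 -> ~95% CI).

    Lets a report say e.g. "7/10 attempts (95% CI roughly 0.40-0.89)"
    instead of quoting a bare rate from a small sample.
    """
    if trials == 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = p + z**2 / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2))
    return (centre - margin) / denom, (centre + margin) / denom

low, high = wilson_interval(7, 10)  # the "7/10 attempts" example -> ~(0.40, 0.89)
```

The wide interval for a 10-attempt sample is itself useful information: it tells triagers how many additional attempts they should expect to need before the behavior reproduces.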
Maximizing Bug Bounty Success
- Start with infrastructure — Traditional web application vulnerabilities in AI product surfaces (APIs, dashboards, authentication) often have clearer scope and faster payouts
- Focus on data exposure — Any finding that demonstrates unauthorized data access is almost always in scope
- Chain AI and traditional bugs — The most impactful findings combine AI-specific techniques (prompt injection) with traditional vulnerabilities (IDOR, SSRF) to achieve concrete security impact
- Read the scope carefully — Every program defines AI vulnerability scope differently; a finding in scope for one program may be out of scope for another
- Build relationships — Engage with program teams, provide quality reports, and establish yourself as a reliable researcher
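As a concrete sketch of the injection-to-traditional-bug chain: a prompt injection that only changes text output is usually out of scope, but one that steers a URL-fetching tool at internal address space is a classic injection-to-SSRF chain. The helper below is an assumed test harness, not part of any provider's API; it only handles literal IPs and skips the DNS resolution a hostname target would need.

```python
import ipaddress
from urllib.parse import urlparse

def is_ssrf_candidate(url: str) -> bool:
    """Flag a model-emitted tool-call URL that targets internal address space.

    Catches literal-IP targets such as the cloud metadata endpoint
    (169.254.169.254) or RFC 1918 ranges. Hostname targets (e.g.
    "metadata.internal") would additionally need resolving, which this
    sketch deliberately skips.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not a literal IP; would need DNS resolution to decide
    return addr.is_private or addr.is_link_local or addr.is_loopback
```

Running a corpus of injection payloads through an AI agent and flagging any tool-call URLs this way turns "the model did something odd" into a demonstrable SSRF finding with concrete security impact.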
For related methodology, see Reconnaissance & Tradecraft, Infrastructure Security, and Report Writing.
Related Topics
- Notable AI Security Incidents -- incident database and analysis framework
- Lessons Learned from AI Security Incidents -- recurring patterns from disclosed vulnerabilities
- Authorization, Contracts & Liability -- legal considerations for bug bounty participation
- Ethics & Responsible Disclosure -- disclosure guidelines for AI vulnerability findings