AI-Focused Bug Bounty Programs
Guide to participating in and running AI-specific bug bounty programs, covering major platforms, AI vulnerability categories, and reward structures.
Overview
Bug bounty programs have become a significant channel for identifying security vulnerabilities in AI systems. As AI labs and companies deploying AI products have recognized the limitations of internal testing, they have established bounty programs that invite external researchers to find and report vulnerabilities in exchange for financial rewards. These programs extend the reach of internal red teams by harnessing the diversity of approaches and sheer volume of testing that a global researcher community provides.
For individual practitioners, AI bug bounties offer income, portfolio-building opportunities, and access to production AI systems that would otherwise be off-limits. For organizations, they provide scalable vulnerability discovery that complements structured red team engagements. This article covers both perspectives: how to succeed as a participant in AI bug bounty programs, and how to design and operate an effective AI-specific bounty program for your organization.
The AI Bug Bounty Landscape
Major AI Lab Programs
Several major AI labs operate bounty programs that specifically cover their AI models and products.
OpenAI Bug Bounty Program: Launched in 2023, OpenAI's program covers ChatGPT, the API, and plugins/GPTs. Rewards range from $200 for low-severity findings to $20,000 for critical vulnerabilities. The program is hosted on Bugcrowd. Notably, OpenAI's program explicitly excludes model jailbreaks and safety bypasses from bounty eligibility, focusing instead on traditional security vulnerabilities (authentication bypass, data leakage, injection attacks in the application layer) and API abuse vectors. Prompt injection that leads to data exfiltration or unauthorized action execution is generally in scope, while pure content policy bypass is not.
Google Vulnerability Reward Program (VRP): Google's VRP covers Bard/Gemini, Vertex AI, and AI features across Google products. Google has expanded its VRP to include AI-specific vulnerability categories, including prompt injection, training data extraction, and model manipulation. Rewards follow Google's standard VRP tier system, with AI-specific findings potentially qualifying for bonuses. Google has also sponsored specific AI red teaming events and competitions.
Microsoft Bug Bounty Program: Microsoft's program covers Azure AI services, Copilot products, and the Bing AI integration. AI-specific findings including prompt injection, content filter bypass, and data leakage are eligible. Microsoft has published detailed guidance on what constitutes an eligible AI vulnerability, which is valuable reading for aspiring bounty hunters.
Anthropic: Anthropic has a vulnerability disclosure process for security issues related to Claude. Their approach emphasizes responsible disclosure, and they have engaged in structured external red teaming exercises. Check their website for current disclosure program details.
Meta: Meta's program covers AI features in its products, including the AI assistant integrated into WhatsApp, Instagram, and Facebook, as well as research model releases (LLaMA family). AI safety issues in Meta's open-source models are handled through a separate responsible disclosure process.
Platform-Hosted Programs
Bug bounty platforms aggregate programs from many organizations, providing centralized submission, triage, and payment infrastructure.
HackerOne: The largest bug bounty platform, hosting AI programs from various technology companies. HackerOne has added AI vulnerability categories to its taxonomy and provides guidance for researchers on reporting AI-specific findings.
Bugcrowd: Hosts several AI-specific programs, including OpenAI's. Bugcrowd has invested in AI security expertise within its triage team to handle the unique characteristics of AI vulnerability reports.
Intigriti: European-focused platform with growing AI program coverage. Relevant for researchers targeting European organizations subject to EU AI Act requirements.
AI-Specific Bounty Events
Beyond ongoing bounty programs, targeted AI security events offer concentrated bounty opportunities.
DEF CON AI Village Red Teaming Events: The AI Village at DEF CON has organized large-scale AI red teaming events where participants test major AI systems in a competitive setting. The 2023 and 2024 events involved testing models from OpenAI, Anthropic, Google, and Meta, with findings reported through structured disclosure processes. These events provide hands-on experience with production AI systems and community networking opportunities.
AI Safety CTFs: Various organizations host AI-focused Capture-the-Flag competitions that, while not traditional bounty programs, provide similar skill-building opportunities and sometimes include financial prizes.
Government-Sponsored AI Testing: The US government has organized AI red teaming events, and programs like the DARPA AI Cyber Challenge have included AI security components. The UK AI Safety Institute conducts model evaluations that sometimes involve external researcher participation.
Finding AI Vulnerabilities for Bounties
High-Value Vulnerability Categories
Understanding which vulnerability types are most likely to be rewarded helps focus your efforts.
Data exfiltration through AI features (High value): Demonstrating that an AI system can be manipulated to extract and disclose data it should not reveal — whether training data, other users' conversations, system configuration, or connected database content — consistently qualifies for high-severity bounty rewards. This includes indirect prompt injection in RAG systems that accesses other tenants' data, conversation history leakage across sessions, and extraction of API keys or credentials from the model's context.
Unauthorized action execution (High value): In agentic AI systems, demonstrating that the AI can be manipulated to execute actions beyond its intended scope — sending unauthorized emails, modifying data, accessing restricted resources through tool use — qualifies as high-severity. The impact is similar to a traditional privilege escalation vulnerability.
Authentication and authorization bypass via AI (High value): Using AI features to bypass authentication, access other users' data, or escalate privileges. For example, manipulating an AI assistant to access another user's account information through tool-calling capabilities. These findings combine AI vulnerability with traditional security impact, making them high-value.
Cross-tenant data access in AI SaaS (High value): Multi-tenant AI systems that allow one tenant's data to influence another tenant's AI interactions represent significant vulnerabilities. Demonstrating cross-tenant data leakage through shared model context, embedding similarity exploitation, or RAG pipeline contamination is consistently rewarded well.
Prompt injection with demonstrable impact (Medium-High value): Pure prompt injection (overriding system instructions) is lower value by itself, but prompt injection that leads to concrete impact (data access, action execution, safety bypass with real-world consequences) is rewarded at higher severity.
Safety filter bypass (Variable value): Many programs explicitly exclude pure jailbreaks and content policy bypasses. However, safety bypasses that have concrete impact — such as generating content that could enable fraud, producing outputs that violate regulatory requirements, or bypassing safety measures in high-risk domains (medical, financial, legal advice) — may be in scope. Read program guidelines carefully.
Denial of service specific to AI (Low-Medium value): Resource exhaustion through adversarial inputs (prompts that cause disproportionate compute usage), model degradation attacks, and inference pipeline disruption. These are typically lower severity because they affect availability rather than confidentiality or integrity, but they are valid findings in programs that cover availability.
Methodology for Bounty Hunting
Successful AI bounty hunting requires a systematic approach rather than random prompt experimentation.
Read the program scope thoroughly: Before testing, understand exactly what is in scope, what is excluded, and what the program considers a valid vulnerability. Many researchers waste time on findings that are explicitly excluded. Pay particular attention to AI-specific scope definitions, as they vary significantly between programs.
Start with reconnaissance: Understand the target system's architecture, capabilities, and intended use before launching attacks. What models does it use? What data sources does it access? What tools can it call? What are its stated limitations? This information informs your testing strategy and helps you identify the highest-value attack vectors.
Focus on impact, not novelty: Programs reward impact, not cleverness. A simple prompt injection that leaks PII is worth more than an elaborate multi-step attack chain that produces a mildly amusing but harmless response. Always ask: "What is the worst thing an attacker could achieve through this vulnerability?"
Test systematically across the attack surface: Do not focus exclusively on prompt injection. Test the application layer (authentication, authorization, API security), the data pipeline (RAG sources, file upload handling), tool integrations, and session management. Many high-value AI bounties are found at the intersection of AI features and traditional application security, not in pure model-level attacks.
Document meticulously: Your report is your product. Successful bounty reports include a clear title that communicates the finding, step-by-step reproduction instructions, evidence (screenshots, HTTP logs, full prompt/response sequences), impact assessment with concrete examples, and the model or system version tested. Incomplete reports lead to rejection or down-grading.
Writing Effective Bug Reports
The quality of your report directly affects whether your finding is accepted and how it is rewarded.
Title: Specific and impact-focused. "Indirect prompt injection in knowledge base allows extraction of other users' conversation data in Product X" is clear. "Prompt injection bug" is not.
Vulnerability description: Explain the vulnerability in terms the triage team can understand. Define AI-specific concepts if using them. Not everyone triaging reports has deep AI security expertise.
Reproduction steps: Numbered, step-by-step instructions starting from a clean state. Include exact prompts, API calls, or UI interactions. For multi-turn attacks, include the full conversation sequence. Note any prerequisites (specific account type, feature flags, timing conditions).
Proof of concept: Demonstrate the vulnerability achieving meaningful impact. Do not just show that you can override system instructions — show what an attacker can achieve by overriding them. "I can make the chatbot ignore its system prompt" is a mechanism. "I can make the chatbot reveal other users' support ticket contents" is an impact.
Impact statement: Explain who is affected, how severely, and at what scale. Quantify where possible.
Suggested severity: Provide your assessment of severity using the program's framework (CVSS, custom severity scale, etc.) with justification. This helps triage and shows you understand the program's standards.
Suggested remediation: Including remediation recommendations is not required but demonstrates expertise and helps the vendor fix the issue faster, which builds your reputation with the program.
Running an AI Bug Bounty Program
Program Design
Organizations considering an AI-specific bounty program should address several design decisions.
Scope definition: Be precise about what AI systems are in scope, what types of findings are eligible, and what is explicitly excluded. Common exclusions for AI programs include known limitations documented in model cards, expected model hallucinations, subjective content quality issues, and pure jailbreaks without demonstrable security impact. The clearer your scope, the fewer invalid reports you will receive.
Reward structure: AI vulnerability rewards should reflect the unique impact categories of AI systems. A sample reward table:
| Severity | Impact Category | Reward Range |
|---|---|---|
| Critical | Data breach via AI, unauthorized actions with material impact | $10,000 - $50,000 |
| High | Cross-tenant data leakage, significant safety bypass with real impact | $5,000 - $15,000 |
| Medium | System prompt extraction, limited data exposure, moderate safety bypass | $1,000 - $5,000 |
| Low | Information disclosure of system config, minor content policy issues | $200 - $1,000 |
Triage process: AI vulnerability triage requires AI security expertise. Generic security triage teams may not understand prompt injection, may not be able to reproduce probabilistic findings, and may inappropriately triage AI-specific findings using traditional severity frameworks. Invest in training your triage team or engage specialized triage services from bounty platforms.
Disclosure policy: Define your disclosure timeline (typically 90 days from report acknowledgment to public disclosure), how findings will be tracked and communicated, and whether researchers can publish about their findings after remediation.
Operational Challenges
Volume management: AI bounty programs often receive high volumes of low-quality reports, particularly prompt injection attempts that are explicitly out of scope. Clear scope documentation, auto-responses for common out-of-scope categories, and an efficient triage workflow are essential.
Reproducibility verification: AI findings may not reproduce deterministically. Your triage process must account for probabilistic findings by allowing multiple reproduction attempts and testing across different conditions.
Model version management: AI models are updated frequently. A finding that exists in today's model may not exist in next week's update. Track model versions in reports and verify findings against the current deployed version before rewarding.
Coordinated disclosure with AI providers: If your application uses third-party AI services (OpenAI API, Claude API, etc.) and a researcher finds a vulnerability in the underlying model rather than your application, you need a process for coordinating disclosure with the AI provider. Establish these relationships before they are needed.
Measuring Program Effectiveness
Track these metrics to evaluate your bounty program's performance:
Valid finding rate: Percentage of submitted reports that are valid, in-scope findings. Target: 15-30%. Lower rates indicate scope clarity issues; higher rates may indicate the program is not reaching enough researchers.
Time to triage: Average time from submission to initial assessment. Target: under 3 business days. Slow triage discourages researchers.
Time to reward: Average time from validated finding to payment. Target: under 30 days. Fast payment builds researcher loyalty.
Unique vulnerability classes: Number of distinct vulnerability classes reported. This indicates the diversity of testing the program is receiving.
Researcher satisfaction: Survey researchers periodically about their program experience. Researcher reputation drives program quality — satisfied researchers spread the word and attract more talent.
Legal and Ethical Considerations
Safe Harbor
Provide clear safe harbor language that protects researchers from legal action when they test within the defined scope. The Department of Justice's 2022 policy revision on Computer Fraud and Abuse Act enforcement supports good-faith security research, but explicit safe harbor in your program terms provides additional protection that encourages researcher participation.
Responsible Testing
Both bounty programs and individual researchers must address responsible testing practices for AI systems.
Production impact: Bounty programs should specify whether testing against production AI systems is permitted. If so, provide guidance on rate limiting testing to avoid service disruption, avoiding actions that affect other real users, and handling of harmful content generated during testing.
Data handling: Researchers who discover data leakage vulnerabilities may access real user data. Program terms should require researchers to report such findings immediately, not store or share any extracted data, and delete any extracted data after reporting.
Content responsibility: AI red teaming may involve attempting to generate harmful, offensive, or illegal content. Program terms should clarify that generating such content in the course of legitimate security research within the program scope is authorized, while setting reasonable boundaries.
Tax and Payment Considerations
For researchers, bounty income is taxable in most jurisdictions. Major platforms provide tax documentation (1099 forms in the US), but independent programs may not. Maintain records of all bounty income for tax compliance.
For program operators, budget for rewards, platform fees (typically 20-25% of rewards on major platforms), triage costs, and remediation costs associated with validated findings. A well-run AI bounty program typically costs $200,000 to $1,000,000+ annually, depending on scope and reward levels.
Getting Started
For Researchers
- Build foundational skills: Complete AI security training, practice on deliberately vulnerable AI applications, and understand the OWASP Top 10 for LLM Applications.
- Start with established programs: Begin with well-documented programs (OpenAI, Google, Microsoft) that have clear scope and reward guidelines.
- Read program guidelines thoroughly: Understand exactly what is in scope before testing. Out-of-scope submissions waste your time and may result in program bans.
- Focus on unique research: The most valuable bounty findings come from novel approaches, not from running the same techniques every other researcher uses.
- Build your reputation: Consistent, high-quality reports build your reputation on bounty platforms, leading to invitations to private programs with higher rewards and less competition.
For Organizations
- Start with a vulnerability disclosure policy (VDP): Before offering rewards, establish a VDP that provides a channel for reporting AI vulnerabilities. This costs nothing beyond triage effort and establishes the organizational processes needed for a bounty program.
- Run a private program first: Begin with an invitation-only program with a small group of vetted researchers. This provides controlled volume while you build triage expertise.
- Invest in triage capability: Train your triage team on AI vulnerability categories or partner with a platform that provides specialized triage.
- Iterate on scope: Start with a narrow scope and expand based on experience. It is easier to expand scope than to narrow it after researchers have already invested effort.
References
- OWASP Top 10 for LLM Applications, 2025 Edition. https://owasp.org/www-project-top-10-for-large-language-model-applications/ — Framework for categorizing AI vulnerability types in bounty programs.
- MITRE ATLAS (Adversarial Threat Landscape for AI Systems). https://atlas.mitre.org/ — Technique taxonomy for AI system attacks relevant to bounty scope definition.
- OpenAI Bug Bounty Program on Bugcrowd. https://bugcrowd.com/openai — Example of a major AI lab's bounty program structure and scope.
- US Department of Justice. "Department of Justice Announces New Policy for Charging Cases Under the Computer Fraud and Abuse Act," May 2022. https://www.justice.gov/ — Policy supporting good-faith security research under CFAA.