What is Encoding Puzzle?

Decode a series of encoded payloads to find hidden flags, testing knowledge of encoding schemes, obfuscation techniques, and payload analysis.

What is Defense Gauntlet?

Build AI chatbot defenses that survive 100 automated attack attempts across diverse attack categories, scored on block rate and usability preservation.

Achieve jailbreaks with the shortest possible prompts, scored by character count. Develop minimal payloads that bypass safety alignment with maximum efficiency.

Community Contributed Challenges

beginner10 min readUpdated 2026-03-15

How to submit your own AI security challenges to the community, including the review process, quality standards, and contribution guidelines.

community contributed challenges submission guidelines

Community Contributed Challenges

The best challenges come from the community. If you have designed an interesting AI security exercise, puzzle, or scenario, you can submit it for inclusion in the community challenge library. Contributed challenges are reviewed, tested, and published for the entire community to enjoy.

Why Contribute

Contributing challenges benefits you and the community:

Teaching deepens understanding. Designing a challenge requires understanding the technique well enough to create a controlled scenario around it. This is a higher bar than simply executing the technique.
Community recognition. Your name (or handle) is attached to every challenge you create. Quality challenges earn reputation and recognition within the community.
Improving the field. Every new challenge is a training opportunity for the next generation of AI security practitioners. Your contribution has a multiplier effect.
Feedback on your ideas. The review process and community response to your challenge provide valuable feedback on your thinking.

Submission Process

Step 1: Proposal

Before building a full challenge, submit a proposal that describes:

# Challenge Proposal: [Title]
 
## Concept
[2-3 paragraphs describing what the challenge tests and why it is interesting]
 
## Learning Objectives
[What skills or knowledge does this challenge develop?]
 
## Difficulty Level
[Beginner / Intermediate / Advanced / Expert]
 
## Estimated Time
[How long should a participant at the target difficulty level expect to spend?]
 
## Technical Requirements
[What infrastructure does the challenge need? A model API, a web server,
a database, custom tools?]
 
## Novelty
[How does this differ from existing challenges? What makes it unique?]

Submit proposals through the community platform. A reviewer will respond within 7 days with one of:

Accepted -- proceed to building the full challenge
Revision requested -- the concept is interesting but needs refinement (specific feedback provided)
Declined -- the concept overlaps too much with existing challenges or does not meet quality standards (reasoning provided)

Step 2: Challenge Development

Once your proposal is accepted, build the full challenge. A complete challenge submission includes:

Challenge specification:

# [Challenge Title]
 
## Overview
[Detailed description of the scenario and objectives]
 
## Setup Instructions
[How to deploy the challenge environment]
 
## Objectives
[Numbered list of objectives with point values]
 
## Hints
[Progressive hints, from vague to specific]
 
## Solution
[Complete walkthrough of the intended solution]
 
## Alternative Solutions
[Other valid approaches you have identified]
 
## Scoring Rubric
[How to evaluate submissions]

Technical implementation:

Source code for any custom components (chatbots, agents, filters, APIs)
Deployment configuration (Docker, docker-compose, or equivalent)
Test scripts that verify the challenge works correctly
Solution verification script that confirms the intended solution works

Documentation:

README with setup instructions
Dependencies and requirements list
Known issues or limitations

Step 3: Review

Submitted challenges go through a two-phase review:

Technical review (1--2 weeks):

A reviewer deploys the challenge and verifies it works
The intended solution is tested
The challenge is probed for unintended solutions that trivialize it
Code quality and deployment reliability are assessed

Community testing (1--2 weeks):

3--5 volunteer testers attempt the challenge without seeing the solution
Testers provide feedback on difficulty, clarity, and fun factor
Feedback is shared with you for final revisions

Step 4: Publication

After review, your challenge is published to the community challenge library with:

Your attribution as the challenge author
An editorial introduction placing the challenge in the broader curriculum
Community difficulty rating based on tester feedback

Quality Standards

What Makes a Good Challenge

Clear objectives. Participants should know exactly what they are trying to achieve. Ambiguous objectives frustrate participants and make scoring inconsistent.

Appropriate difficulty. The challenge should be hard enough to require genuine effort but not so hard that it is inaccessible to the target audience. The best challenges have a "smooth difficulty curve" -- easy to start, hard to finish.

Educational value. Participants should learn something from attempting the challenge, even if they do not complete it. Challenges that require obscure trivia or lucky guessing do not teach useful skills.

Realistic scenarios. The best challenges model situations that practitioners encounter in real engagements. Artificial constraints are acceptable for pedagogical reasons, but the core scenario should feel plausible.

Fair solutions. The intended solution should be findable through systematic thinking and skill application. Challenges that require "guessing the author's mind" or rely on hidden assumptions are poor design.

Common Rejection Reasons

Issue	Why It Is Rejected	How to Fix
Too similar to existing challenges	The community library already covers this technique	Add a novel twist or combine techniques in a new way
Single-trick solution	The challenge reduces to knowing one specific technique	Add layers that require multiple skills
Unclear instructions	Testers could not understand what they were supposed to do	Get feedback from non-authors before submitting
Unreliable infrastructure	The challenge environment crashes or behaves inconsistently	Add health checks, error handling, and deployment testing
No educational value	Completing the challenge does not develop any transferable skill	Redesign around a learning objective, not just a puzzle
Unfair difficulty	Success depends on guessing rather than skill	Ensure the solution is reachable through systematic exploration

Contribution Guidelines

Technical Requirements

Containerized deployment. Challenges must be deployable via Docker or docker-compose. No manual server setup.
Idempotent setup. Running the setup script multiple times should produce the same result. Challenge state should be resettable.
Resource bounds. Challenges should run on a single machine with 4 CPU cores, 8GB RAM, and optionally 1 GPU. If your challenge requires more resources, note this in the proposal.
No external dependencies at runtime. The challenge should not depend on external services (third-party APIs, live websites) that could become unavailable. Mock external services if needed.

Content Requirements

No real harmful content. Challenges involving jailbreaking should use benign proxy tasks. Challenges involving data exfiltration should use fictional data.
Clear licensing. All code must be licensed under MIT or Apache 2.0. All content must be original or properly attributed.
Inclusive language. Challenge scenarios should not rely on stereotypes or potentially offensive content.

Maintenance Commitment

When you contribute a challenge, you agree to:

Respond to bug reports within 7 days
Update the challenge if underlying models or APIs change
Participate in at least one round of community testing for someone else's challenge

Current Contributed Challenges

The following challenges were submitted and reviewed by community members:

Challenge	Author	Difficulty	Topic
Encoding Puzzle	`cipher_smith`	Intermediate	Encoding and payload obfuscation
Defense Gauntlet	`blue_team_boss`	Advanced	Defensive engineering
Prompt Golf	`minmax_hacker`	Intermediate	Minimal jailbreak optimization

Getting Started

Study existing challenges. Review the monthly challenges and the contributed challenges above to understand the quality bar and format.
Identify a gap. What technique or scenario is not covered by existing challenges? Where did you struggle to find practice material when learning?
Draft a proposal. Use the proposal template above. Focus on the learning objective first, then design the challenge around it.
Build a prototype. Before writing the full submission, build a minimal version and test it yourself. Many challenge ideas that sound good in theory do not work in practice.
Submit and iterate. Submit your proposal, incorporate feedback, build the full challenge, and go through the review process.

The community challenge library grows through contributions from practitioners at all levels. Your unique perspective and experience can create learning opportunities that no one else can.

Challenge Design Principles

Good challenge design is a skill in itself. These principles, developed from years of CTF and challenge design experience, will help you create challenges that participants enjoy and learn from.

The Funnel of Difficulty

The best challenges have a "funnel" structure: easy to start, progressively harder to complete. The first steps should be accessible to anyone at the target difficulty level, giving participants early momentum and confidence. The final objectives should stretch even strong participants.

A concrete example: a prompt extraction challenge might have the flag visible in the system prompt (easy to locate) but behind multiple defense layers (hard to extract). The participant quickly understands what they need -- the flag is right there -- but getting it requires progressively more sophisticated techniques.

Avoid "Guess What I'm Thinking" Challenges

The worst challenges require participants to guess the author's specific intended approach rather than rewarding any valid approach. Signs that a challenge has this problem:

The solution requires trying a very specific prompt that is not suggested by any clue in the challenge
Multiple valid-looking approaches exist but only one actually works, and there is no way to distinguish them without trial and error
The challenge description omits information that is critical to solving it

Provide Meaningful Feedback

Participants should be able to tell whether they are making progress. This does not mean giving away the solution -- it means designing the challenge so that partial successes are visible. For example:

A system that returns different error messages for different failure modes gives the participant information about what defense they hit
A multi-objective challenge where objectives can be completed independently lets participants measure partial progress
Debug output or tool call traces that show the system's internal state help participants understand why their approach is or is not working

Test with Real Users

No amount of self-testing replaces having someone else attempt your challenge. Ask two or three people to try it before submitting:

One person at the target difficulty level (to verify the challenge is appropriately hard)
One person above the target difficulty level (to check for unintended shortcuts)
One person below the target difficulty level (to verify the instructions are clear even if the challenge is too hard for them)

Write Complete Solutions

Your submitted solution should cover:

The intended approach -- the path you designed the challenge to teach
Alternative approaches you are aware of -- other valid solutions that work
Non-solutions that look promising -- approaches that seem like they should work but do not, and why they fail
Edge cases -- unusual approaches that technically satisfy the success criteria but are not in the spirit of the challenge (decide whether to allow or block these)

Maintaining Your Challenge

Contributing a challenge is an ongoing responsibility. Models change, APIs evolve, and defenses improve. Common maintenance tasks:

Model version updates. If your challenge targets a specific model version, test it when new versions are released. Behavior changes may make the challenge easier, harder, or impossible.
Infrastructure updates. Keep container images and dependencies up to date. Security vulnerabilities in challenge infrastructure are embarrassing.
Community feedback. Monitor feedback on your challenge. If multiple participants report the same confusion or frustration, the challenge needs revision.
Difficulty recalibration. As the community's collective skill improves, challenges that were once hard become routine. Consider adding harder objectives or creating a "v2" with updated defenses.

Community Contributed Challenges

Learning Path

Related articles

Community Contributed Challenges

Learning Path

Related articles