Fall 2026 CTF: Advanced Attack Research
An expert-level CTF requiring novel technique development, research-grade problem solving, and original vulnerability discovery in AI systems.
Fall 2026 CTF: Advanced Attack Research
The Fall 2026 CTF is designed for experienced AI security practitioners. Unlike previous CTFs where known techniques can solve most challenges, this CTF requires developing novel approaches. Challenges are intentionally resistant to published techniques, forcing participants to innovate under time pressure.
Theme: The Bleeding Edge
Every challenge in this CTF is designed to resist known, published techniques. Participants who rely solely on their existing playbook will hit walls. Success requires creative thinking, first-principles reasoning, and the ability to develop new techniques on the fly.
Categories
Category 1: Novel Injection Research (4 challenges)
Challenges where all known prompt injection techniques have been specifically defended against. You must develop new approaches.
| Challenge | Difficulty | Starting Points | Description |
|---|---|---|---|
| NI-1: The Dictionary | Hard | 400 | A chatbot that tokenizes and reconstructs your input using a custom encoding before processing. Known injection patterns are neutralized by the encoding. Find an injection that survives the transformation. |
| NI-2: Semantic Firewall | Hard | 450 | A system with an embedding-based semantic classifier that detects injection intent regardless of surface form. Bypass a classifier that understands what you are trying to do, not just what you are saying. |
| NI-3: Formal Verification | Expert | 500 | A system that uses formal methods to verify that the model's output satisfies safety properties before releasing it. The verifier checks structural properties of the response. Find an output that satisfies the formal properties while still containing the flag. |
| NI-4: The Paradox | Expert | 500 | A system with a defense that appears to be theoretically complete: every possible injection vector has a corresponding defense. But the defense has a flaw in its composition. Find the gap between the individual defenses. |
Category philosophy: These challenges test whether you can reason about injection from first principles rather than pattern-matching against known techniques.
Category 2: Model Internals Exploitation (4 challenges)
Challenges that require understanding and exploiting how language models process information internally.
| Challenge | Difficulty | Starting Points | Description |
|---|---|---|---|
| MI-1: Attention Manipulation | Hard | 400 | A model where the flag is present in the context but attention-level interventions prevent the model from attending to it during generation. Craft an input that redirects attention to the flag despite the intervention. |
| MI-2: Logit Lens | Hard | 450 | The flag is encoded in the model's logit distribution for a specific prompt, not in the generated text. You have access to top-10 logits per token. Decode the flag from the probability distribution. |
| MI-3: Steganographic Extraction | Expert | 500 | The flag was embedded in the model during fine-tuning using steganographic techniques. It does not appear in normal outputs. Find the trigger sequence that causes the model to output the hidden information. |
| MI-4: Representation Engineering | Expert | 500 | The model has been modified using representation engineering to suppress a specific concept. The flag is related to that suppressed concept. Find a way to elicit the suppressed information despite the representation-level intervention. |
Category philosophy: These challenges require understanding beyond the prompt interface. You need mental models of how transformers process, store, and generate information.
Category 3: Cross-System Attacks (4 challenges)
Challenges involving attacks that span multiple systems, protocols, or modalities.
| Challenge | Difficulty | Starting Points | Description |
|---|---|---|---|
| CS-1: Protocol Confusion | Hard | 400 | An AI system that processes both natural language and structured data (JSON/XML). The flag is accessible through a confusion attack where the model misinterprets the boundary between natural language and structured data. |
| CS-2: Modality Hop | Hard | 450 | A multimodal system that processes text and images. The flag is protected against text-based extraction. Find a way to extract it through the image processing pathway or by exploiting cross-modal interactions. |
| CS-3: Time Warp | Expert | 500 | A system with temporal defenses: it monitors conversation patterns over time and blocks attacks that develop over multiple turns. Execute a multi-turn attack that evades temporal pattern detection. |
| CS-4: The Ecosystem | Expert | 500 | A complex system with 5 interacting components (model, database, cache, queue, external API). The flag is only accessible through a chain of exploits across 3+ components. No single vulnerability is sufficient. |
Category philosophy: Real-world AI systems are never isolated. These challenges test your ability to find and exploit the seams between components.
Category 4: Defense Breaking (4 challenges)
Challenges where you are given the complete defense specification and must find the flaw.
| Challenge | Difficulty | Starting Points | Description |
|---|---|---|---|
| DF-1: Open Source Defense | Hard | 400 | A defense system whose complete source code is provided. Find the vulnerability in the implementation that allows bypassing it. This is code review meets prompt injection. |
| DF-2: Probabilistic Defense | Hard | 450 | A defense that uses randomization: it randomly reorders tokens, adds decoy instructions, and varies the system prompt phrasing across requests. Find a technique that works despite the randomization. |
| DF-3: Adaptive Defense | Expert | 500 | A defense that learns from attack attempts. Each failed attack strengthens the defense against similar future attempts. Find an attack that succeeds on the first try, or find a way to manipulate the learning process. |
| DF-4: Cryptographic Defense | Expert | 500 | A defense that uses cryptographic techniques (message authentication codes, encrypted instructions) to ensure the model only follows authenticated instructions. Find a flaw in the cryptographic protocol as applied to the LLM context. |
Category philosophy: Understanding defenses deeply enough to break them is the mark of an expert. These challenges require reading and analyzing defense implementations, not just throwing attacks at a black box.
Category 5: Original Research (2 challenges)
Open-ended challenges that reward original discovery.
| Challenge | Difficulty | Starting Points | Description |
|---|---|---|---|
| OR-1: Bug Bounty | Variable | 100--500 | The challenge platform itself is the target. Find genuine security vulnerabilities in the CTF infrastructure. Point value depends on severity. This is not a simulated challenge -- any real vulnerability you find earns points. |
| OR-2: Surprise Us | Variable | 100--500 | Submit an original AI security technique or finding that is not covered by any other challenge. Point value is determined by a panel of judges based on novelty, significance, and quality of documentation. |
Category philosophy: The best security researchers do not wait for someone to define the challenge -- they find the interesting problems themselves. These open-ended challenges reward that mindset.
Scoring
Dynamic Scoring
Same dynamic scoring as described in the competitions overview, with one modification: Expert challenges that remain unsolved after 24 hours increase in point value by 10% (up to 550 maximum).
Research Quality Bonus
For Original Research challenges (OR-1 and OR-2), scoring includes a quality multiplier:
| Quality Factor | Multiplier |
|---|---|
| Novel technique with proof of concept | 1.0x |
| Novel technique with rigorous evaluation | 1.3x |
| Novel technique with proposed defense | 1.5x |
Team Composition Bonus
Solo participants receive a 1.2x multiplier on their total score, recognizing the additional difficulty of competing alone across all categories.
Preparation
Minimum Prerequisites
- Completed at least 2 previous seasonal CTFs or equivalent competition experience
- Comfortable developing tools and scripts under time pressure
- Familiar with at least one ML framework (PyTorch, TensorFlow, JAX)
- Experience reading and implementing techniques from research papers
Recommended Preparation
- Read recent AI security papers. The challenge designers draw inspiration from the latest research. Familiarity with the landscape helps you recognize when a challenge is riffing on a known concept.
- Practice tool development. Several challenges benefit from quickly building custom tools (scrapers, fuzzers, analyzers). Practice building small tools fast.
- Study defense mechanisms. The Defense Breaking category requires understanding defenses at a code level. Review open-source AI defense implementations.
- Develop your research methodology. Original Research challenges reward systematic approaches. Practice forming hypotheses, designing experiments, and documenting results.
Toolkit Recommendations
| Tool | Purpose |
|---|---|
| Python + requests/httpx | API interaction and automation |
| Jupyter notebooks | Exploratory analysis and documentation |
| PyTorch/transformers | Model analysis and probing |
| Burp Suite | HTTP traffic analysis for cross-system challenges |
| Custom scripts | Challenge-specific tooling you build during the CTF |
Prizes
| Place | Prize |
|---|---|
| 1st Place | Custom trophy + conference talk invitation + 2,000 leaderboard points |
| 2nd Place | Conference talk invitation + 1,000 leaderboard points |
| 3rd Place | Featured writeup + 500 leaderboard points |
| Best Original Research | Dedicated publication slot on the wiki + 1,000 leaderboard points |
| Best Solo Performer | Recognition badge + 500 leaderboard points |
Post-CTF
Given the research nature of this CTF, post-event activities include:
- Research presentations. Top performers are invited to present their techniques in a community webinar (2 weeks post-CTF).
- Publication support. Novel techniques discovered during the CTF can be developed into full writeups with editorial support from the community team.
- Defense collaboration. Findings from this CTF are shared (with participant consent) with model providers and defense researchers to improve AI safety.
Mental Models for Novel Challenges
When known techniques fail, you need to reason from first principles. The following mental models help:
The Constraint Model
Every defense imposes constraints on the model's behavior. Constraints are either:
- Hard constraints enforced architecturally (the model literally cannot access certain data)
- Soft constraints enforced through training or prompting (the model is instructed not to do something)
Hard constraints require architectural exploits. Soft constraints require prompt-level exploits. Determine which type you are facing before choosing your approach.
The Information Flow Model
Trace how information flows through the system. At every point where information crosses a boundary (user to model, model to filter, filter to output), there is an opportunity for the information to be transformed, blocked, or leaked. Map the flow explicitly:
- Where does my input enter the system?
- What transformations does it undergo?
- Where can I observe the system's state?
- What side-channels exist at each boundary?
The Composition Model
Individual components may be secure, but their composition may not be. Look for:
- Properties that hold for individual components but not for the combined system
- Assumptions that one component makes about another that are not enforced
- State shared between components that can be manipulated
The most interesting vulnerabilities in complex systems exist at the seams between components, not within individual components.
Further Reading
- Frontier & Emerging Attacks -- cutting-edge research context
- Advanced LLM Internals -- model internals knowledge
- Injection Research & Automation -- automated attack research
- Exploit Dev & Tooling -- building offensive tools