Fall 2026 CTF: Advanced Attack Research

expert10 min readUpdated 2026-03-15

An expert-level CTF requiring novel technique development, research-grade problem solving, and original vulnerability discovery in AI systems.

ctf competition fall-2026 advanced research expert capture-the-flag

Fall 2026 CTF: Advanced Attack Research

The Fall 2026 CTF is designed for experienced AI security practitioners. Unlike previous CTFs where known techniques can solve most challenges, this CTF requires developing novel approaches. Challenges are intentionally resistant to published techniques, forcing participants to innovate under time pressure.

Theme: The Bleeding Edge

Every challenge in this CTF is designed to resist known, published techniques. Participants who rely solely on their existing playbook will hit walls. Success requires creative thinking, first-principles reasoning, and the ability to develop new techniques on the fly.

Categories

Category 1: Novel Injection Research (4 challenges)

Challenges where all known prompt injection techniques have been specifically defended against. You must develop new approaches.

Challenge	Difficulty	Starting Points	Description
NI-1: The Dictionary	Hard	400	A chatbot that tokenizes and reconstructs your input using a custom encoding before processing. Known injection patterns are neutralized by the encoding. Find an injection that survives the transformation.
NI-2: Semantic Firewall	Hard	450	A system with an embedding-based semantic classifier that detects injection intent regardless of surface form. Bypass a classifier that understands what you are trying to do, not just what you are saying.
NI-3: Formal Verification	Expert	500	A system that uses formal methods to verify that the model's output satisfies safety properties before releasing it. The verifier checks structural properties of the response. Find an output that satisfies the formal properties while still containing the flag.
NI-4: The Paradox	Expert	500	A system with a defense that appears to be theoretically complete: every possible injection vector has a corresponding defense. But the defense has a flaw in its composition. Find the gap between the individual defenses.

Category philosophy: These challenges test whether you can reason about injection from first principles rather than pattern-matching against known techniques.

Category 2: Model Internals Exploitation (4 challenges)

Challenges that require understanding and exploiting how language models process information internally.

Challenge	Difficulty	Starting Points	Description
MI-1: Attention Manipulation	Hard	400	A model where the flag is present in the context but attention-level interventions prevent the model from attending to it during generation. Craft an input that redirects attention to the flag despite the intervention.
MI-2: Logit Lens	Hard	450	The flag is encoded in the model's logit distribution for a specific prompt, not in the generated text. You have access to top-10 logits per token. Decode the flag from the probability distribution.
MI-3: Steganographic Extraction	Expert	500	The flag was embedded in the model during fine-tuning using steganographic techniques. It does not appear in normal outputs. Find the trigger sequence that causes the model to output the hidden information.
MI-4: Representation Engineering	Expert	500	The model has been modified using representation engineering to suppress a specific concept. The flag is related to that suppressed concept. Find a way to elicit the suppressed information despite the representation-level intervention.

Category philosophy: These challenges require understanding beyond the prompt interface. You need mental models of how transformers process, store, and generate information.

Category 3: Cross-System Attacks (4 challenges)

Challenges involving attacks that span multiple systems, protocols, or modalities.

Challenge	Difficulty	Starting Points	Description
CS-1: Protocol Confusion	Hard	400	An AI system that processes both natural language and structured data (JSON/XML). The flag is accessible through a confusion attack where the model misinterprets the boundary between natural language and structured data.
CS-2: Modality Hop	Hard	450	A multimodal system that processes text and images. The flag is protected against text-based extraction. Find a way to extract it through the image processing pathway or by exploiting cross-modal interactions.
CS-3: Time Warp	Expert	500	A system with temporal defenses: it monitors conversation patterns over time and blocks attacks that develop over multiple turns. Execute a multi-turn attack that evades temporal pattern detection.
CS-4: The Ecosystem	Expert	500	A complex system with 5 interacting components (model, database, cache, queue, external API). The flag is only accessible through a chain of exploits across 3+ components. No single vulnerability is sufficient.

Category philosophy: Real-world AI systems are never isolated. These challenges test your ability to find and exploit the seams between components.

Category 4: Defense Breaking (4 challenges)

Challenges where you are given the complete defense specification and must find the flaw.

Challenge	Difficulty	Starting Points	Description
DF-1: Open Source Defense	Hard	400	A defense system whose complete source code is provided. Find the vulnerability in the implementation that allows bypassing it. This is code review meets prompt injection.
DF-2: Probabilistic Defense	Hard	450	A defense that uses randomization: it randomly reorders tokens, adds decoy instructions, and varies the system prompt phrasing across requests. Find a technique that works despite the randomization.
DF-3: Adaptive Defense	Expert	500	A defense that learns from attack attempts. Each failed attack strengthens the defense against similar future attempts. Find an attack that succeeds on the first try, or find a way to manipulate the learning process.
DF-4: Cryptographic Defense	Expert	500	A defense that uses cryptographic techniques (message authentication codes, encrypted instructions) to ensure the model only follows authenticated instructions. Find a flaw in the cryptographic protocol as applied to the LLM context.

Category philosophy: Understanding defenses deeply enough to break them is the mark of an expert. These challenges require reading and analyzing defense implementations, not just throwing attacks at a black box.

Category 5: Original Research (2 challenges)

Open-ended challenges that reward original discovery.

Challenge	Difficulty	Starting Points	Description
OR-1: Bug Bounty	Variable	100--500	The challenge platform itself is the target. Find genuine security vulnerabilities in the CTF infrastructure. Point value depends on severity. This is not a simulated challenge -- any real vulnerability you find earns points.
OR-2: Surprise Us	Variable	100--500	Submit an original AI security technique or finding that is not covered by any other challenge. Point value is determined by a panel of judges based on novelty, significance, and quality of documentation.

Category philosophy: The best security researchers do not wait for someone to define the challenge -- they find the interesting problems themselves. These open-ended challenges reward that mindset.

Scoring

Dynamic Scoring

Same dynamic scoring as described in the competitions overview, with one modification: Expert challenges that remain unsolved after 24 hours increase in point value by 10% (up to 550 maximum).

Research Quality Bonus

For Original Research challenges (OR-1 and OR-2), scoring includes a quality multiplier:

Quality Factor	Multiplier
Novel technique with proof of concept	1.0x
Novel technique with rigorous evaluation	1.3x
Novel technique with proposed defense	1.5x

Team Composition Bonus

Solo participants receive a 1.2x multiplier on their total score, recognizing the additional difficulty of competing alone across all categories.

Preparation

Minimum Prerequisites

Completed at least 2 previous seasonal CTFs or equivalent competition experience
Comfortable developing tools and scripts under time pressure
Familiar with at least one ML framework (PyTorch, TensorFlow, JAX)
Experience reading and implementing techniques from research papers

Recommended Preparation

Read recent AI security papers. The challenge designers draw inspiration from the latest research. Familiarity with the landscape helps you recognize when a challenge is riffing on a known concept.
Practice tool development. Several challenges benefit from quickly building custom tools (scrapers, fuzzers, analyzers). Practice building small tools fast.
Study defense mechanisms. The Defense Breaking category requires understanding defenses at a code level. Review open-source AI defense implementations.
Develop your research methodology. Original Research challenges reward systematic approaches. Practice forming hypotheses, designing experiments, and documenting results.

Toolkit Recommendations

Tool	Purpose
Python + requests/httpx	API interaction and automation
Jupyter notebooks	Exploratory analysis and documentation
PyTorch/transformers	Model analysis and probing
Burp Suite	HTTP traffic analysis for cross-system challenges
Custom scripts	Challenge-specific tooling you build during the CTF

Prizes

Place	Prize
1st Place	Custom trophy + conference talk invitation + 2,000 leaderboard points
2nd Place	Conference talk invitation + 1,000 leaderboard points
3rd Place	Featured writeup + 500 leaderboard points
Best Original Research	Dedicated publication slot on the wiki + 1,000 leaderboard points
Best Solo Performer	Recognition badge + 500 leaderboard points

Post-CTF

Given the research nature of this CTF, post-event activities include:

Research presentations. Top performers are invited to present their techniques in a community webinar (2 weeks post-CTF).
Publication support. Novel techniques discovered during the CTF can be developed into full writeups with editorial support from the community team.
Defense collaboration. Findings from this CTF are shared (with participant consent) with model providers and defense researchers to improve AI safety.

Mental Models for Novel Challenges

When known techniques fail, you need to reason from first principles. The following mental models help:

The Constraint Model

Every defense imposes constraints on the model's behavior. Constraints are either:

Hard constraints enforced architecturally (the model literally cannot access certain data)
Soft constraints enforced through training or prompting (the model is instructed not to do something)

Hard constraints require architectural exploits. Soft constraints require prompt-level exploits. Determine which type you are facing before choosing your approach.

The Information Flow Model

Trace how information flows through the system. At every point where information crosses a boundary (user to model, model to filter, filter to output), there is an opportunity for the information to be transformed, blocked, or leaked. Map the flow explicitly:

Where does my input enter the system?
What transformations does it undergo?
Where can I observe the system's state?
What side-channels exist at each boundary?

The Composition Model

Individual components may be secure, but their composition may not be. Look for:

Properties that hold for individual components but not for the combined system
Assumptions that one component makes about another that are not enforced
State shared between components that can be manipulated

The most interesting vulnerabilities in complex systems exist at the seams between components, not within individual components.

Fall 2026 CTF: Advanced Attack Research

Related articles

Fall 2026 CTF: Advanced Attack Research

Related articles