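What is CTF: Jailbreak Gauntlet?

A series of progressively harder jailbreak challenges in which each level adds stronger defenses. Score points for technique diversity and creativity as you break through escalating safety layers.

What is CTF: Agent Heist?

A multi-stage agent exploitation challenge in which you infiltrate an AI agent's tool ecosystem, escalate privileges, and exfiltrate target data without triggering security alerts.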

What is CTF: RAG Infiltrator?

Poison a RAG system to return attacker-controlled content for specific queries. Score based on precision of targeting and stealth of the injected documents.

What is CTF: Alignment Breaker?

Break the alignment of a heavily defended model with multiple defense layers. Requires combining advanced techniques, including adversarial suffixes, multi-turn manipulation, and novel jailbreak approaches.

What is CTF: Multimodal Maze?

Navigate a multimodal challenge using image, text, and audio injection vectors. Each modality unlocks the next stage of the maze, requiring cross-modal attack chaining.

What is CTF: Supply Chain Saboteur?

Identify and exploit supply chain vulnerabilities in a model deployment pipeline. Find poisoned models, exploit malicious packages, and compromise the ML infrastructure.

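What is CTF: Defense Gauntlet (Blue Team)?

A blue-team CTF challenge: build and defend an AI chatbot against a series of increasingly sophisticated automated attacks.

What is CTF: Agent Escalation?

A five-flag CTF challenge focused on exploiting agentic AI systems. It progresses from basic tool abuse through privilege escalation, indirect injection, and memory poisoning to multi-agent chain attacks.

What is CTF: Cloud AI Heist?

Extract secrets from a cloud-deployed AI application by exploiting misconfigurations, SSRF, metadata endpoints, and model-layer vulnerabilities.

What is CTF: AI Forensics Investigation?

Analyze logs, model outputs, and system artifacts to reconstruct an AI security incident. Develop forensic analysis skills for AI-specific attack patterns, data exfiltration traces, and adversarial prompt detection.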

AI Red Teaming CTF Challenges

Intermediate · 5 min read · Updated 2026-03-13

Overview of Capture-the-Flag challenges for AI red teaming. Learn the CTF format, scoring system, rules, hints, and strategies for approaching AI security competitions.

ctf challenges competition

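What Is an AI Red Teaming CTF?

Capture-the-Flag (CTF) challenges apply the well-established cybersecurity competition format to AI and LLM security. Instead of exploiting buffer overflows or cracking passwords, you exploit model behavior, bypass safety systems, and manipulate AI pipelines to extract hidden flags or achieve prohibited objectives.

Each challenge in this series puts you up against a defended AI system with a specific objective. Your task is to find and exploit weaknesses to retrieve the flag or complete the objective.

CTF Format and Structure

Every challenge in this series follows a consistent structure:

Challenge Tiers

Tier	Difficulty	Points Range	Expected Time	Skills Required
Tier 1	Intermediate	100–250	15–30 min	Basic prompt injection, jailbreaking
Tier 2	Advanced	250–500	30–90 min	Multi-step exploitation, tool abuse
Tier 3	Expert	500–1000	1–4 hours	Chained techniques, novel approaches
Bonus	Variable	100–500	Variable	Creative or unexpected solutions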

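Challenge Components

Briefing
Each challenge opens with the scenario, your objective, and the rules of engagement. Read it carefully: it often contains subtle hints about the attack surface.
Target System
A description of the AI system you are attacking, including its purpose, known defenses, and any limits on your interaction (rate limits, input length, modality restrictions).
Objective
A clear statement of what you must achieve. This may be extracting a specific flag string, eliciting a specific behavior, or exfiltrating target data.
Scoring
The point value of the primary objective, plus bonus points for technique diversity, stealth, or creative solutions.
Hints
A tiered hint system that trades points for guidance. Each hint lowers your maximum score but points you in the right direction.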

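Scoring System

The scoring system rewards both success and sophistication:

Point Allocation

Base points:       awarded for achieving the primary objective
Technique bonus:   +10–50% for diverse or novel techniques
Stealth bonus:     +10–25% for avoiding detection or triggering minimal alerts
Speed bonus:       +5–15% for finishing within the standard time
Hint penalty:      -15% (Hint 1), -25% (Hint 2), -50% (Hint 3)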
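
The bonus arithmetic can be sketched in a few lines. This is a minimal illustration, not the official scorer: the function name `apply_bonuses` is hypothetical, and it assumes each bonus is a percentage of the base points and that bonuses simply add together, which the allocation above does not spell out. Hint penalties are left out here.

```python
# Bonus ranges (percent of base points) taken from the point allocation.
BONUS_RANGES = {
    "technique": (10, 50),
    "stealth": (10, 25),
    "speed": (5, 15),
}


def apply_bonuses(base_points: int, bonuses: dict) -> int:
    """Return base points plus any awarded bonuses.

    Assumption: each bonus is a percentage of the base points and the
    percentages are summed before being applied.
    """
    total_pct = 0
    for name, pct in bonuses.items():
        if name not in BONUS_RANGES:
            raise ValueError(f"unknown bonus: {name}")
        low, high = BONUS_RANGES[name]
        if not low <= pct <= high:
            raise ValueError(f"{name} bonus must be {low}-{high}%, got {pct}%")
        total_pct += pct
    # Integer arithmetic avoids floating-point rounding surprises.
    return base_points + base_points * total_pct // 100


if __name__ == "__main__":
    # A 250-point challenge solved with a 20% technique and 10% stealth bonus.
    print(apply_bonuses(250, {"technique": 20, "stealth": 10}))  # prints 325
```

Omitting a bonus from the dictionary means it was not earned; a value outside the published range raises an error rather than being silently clamped.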

Leaderboard Categories

Speed Run -- fastest completion time regardless of technique
Style Points -- most creative or elegant solution
Completionist -- most challenges solved across all tiers
Specialist -- highest score within a single challenge category
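
To make the category definitions concrete, here is a small sketch of how three of them could be computed from a list of solves. The `Solve` record, the function names, and the sample data are all hypothetical. Style Points is omitted because it is judged rather than computed, and "highest score within a category" is assumed to mean the highest point total.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Solve:
    """One solved challenge (hypothetical record shape)."""
    player: str
    challenge: str
    category: str
    points: int
    seconds: int


def speed_run(solves, challenge):
    """Player with the fastest completion time for one challenge."""
    candidates = [s for s in solves if s.challenge == challenge]
    return min(candidates, key=lambda s: s.seconds).player


def completionist(solves):
    """Player with the most distinct challenges solved."""
    solved = defaultdict(set)
    for s in solves:
        solved[s.player].add(s.challenge)
    return max(solved, key=lambda player: len(solved[player]))


def specialist(solves, category):
    """Player with the highest point total within one category."""
    totals = defaultdict(int)
    for s in solves:
        if s.category == category:
            totals[s.player] += s.points
    return max(totals, key=totals.get)


# Made-up sample data for illustration only.
SOLVES = [
    Solve("ana", "jailbreak-gauntlet", "prompt-injection", 200, 900),
    Solve("bo", "jailbreak-gauntlet", "prompt-injection", 250, 600),
    Solve("ana", "rag-infiltrator", "data-poisoning", 400, 3000),
    Solve("ana", "agent-heist", "agent-exploitation", 450, 4200),
]

if __name__ == "__main__":
    print(speed_run(SOLVES, "jailbreak-gauntlet"))
    print(completionist(SOLVES))
    print(specialist(SOLVES, "prompt-injection"))
```

Ties go to whichever player appears first, and an empty result set raises `ValueError`; a real leaderboard would need explicit rules for both.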

Rules of Engagement

All CTF challenges operate under these rules:

Scope -- only attack the designated target system. Do not attempt to compromise the CTF infrastructure itself.
No denial of service -- do not intentionally crash, overload, or degrade the target system for other participants.
No flag sharing -- flags are individual. Sharing flags with other participants results in disqualification.
Tool freedom -- you may use any tools, scripts, or frameworks unless a challenge explicitly restricts them.
Documentation -- keep a log of your attempts. Writeups may be requested for verification of bonus points.

The Hints System

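Hints follow a three-tier structure designed to guide you without giving the answer away:

Hint Level	Cost	What It Provides
Hint 1	-15% max score	Points you toward the correct attack category
Hint 2	-25% max score	Identifies the specific technique or vulnerability class
Hint 3	-50% max score	Provides a concrete starting point or partial payload

Hints are cumulative: using Hint 2 also applies the Hint 1 penalty. Treat them as a trade-off between learning independently and getting unstuck.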
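
As a worked example of the cumulative penalty, the sketch below assumes each hint's percentage is taken off the already-reduced maximum, which is how the `ctf_setup.py` baseline on this page handles it. Summing the percentages instead is another plausible reading that the table does not rule out. The function name `max_score_after_hints` is hypothetical.

```python
# Penalty per hint level, in percent, from the hints table.
HINT_PENALTY_PCT = {1: 15, 2: 25, 3: 50}


def max_score_after_hints(base_points: int, hints_used: int) -> int:
    """Maximum achievable score after taking the first `hints_used` hints.

    Assumption: penalties compound, so each hint reduces the maximum left
    over after the previous hint. Results are rounded down to whole points.
    """
    if not 0 <= hints_used <= len(HINT_PENALTY_PCT):
        raise ValueError("hints_used must be between 0 and 3")
    score = base_points
    for level in range(1, hints_used + 1):
        score = score * (100 - HINT_PENALTY_PCT[level]) // 100
    return score


if __name__ == "__main__":
    for used in range(4):
        print(used, max_score_after_hints(500, used))
    # prints: 0 500, 1 425, 2 318, 3 159 (one pair per line)
```

Under this reading, taking all three hints on a 500-point challenge leaves at most 159 points, so the third hint is by far the most expensive.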

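AI CTF Strategy

Successful CTF competitors follow a systematic methodology:

Reconnaissance
Before sending any attack payload, understand the target. What model is it? What capabilities does it claim? What safety measures does it mention? Send benign queries to map the system's behavior and boundaries.
Attack Surface Mapping
Identify every input vector: text prompts, file uploads, API parameters, indirect channels. Each vector is a potential entry point. List them all before choosing an approach.
Technique Selection
Match your reconnaissance findings to known attack techniques. A system with strict keyword filters suggests encoding bypasses. A system with tool access suggests tool abuse vectors.
Iterative Exploitation
Start with simple techniques and escalate gradually. Every attempt, even a failed one, reveals something about the defense layers. Adjust payloads based on error messages and response patterns.
Flag Extraction and Documentation
Once you have a working approach, extract the flag cleanly. Record the full attack chain in your writeup, and note any additional vulnerabilities you found along the way.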
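
The Technique Selection step amounts to a lookup from reconnaissance observations to candidate technique families. The sketch below encodes only the two pairings named in that step; the `HEURISTICS` table, the observation strings, and the function name are illustrative assumptions that you would extend with your own notes.

```python
# Observation -> technique family, as stated in the Technique Selection step.
# Hypothetical keys; add your own pairings as you gain experience.
HEURISTICS = {
    "strict keyword filter": "encoding bypass",
    "tool access": "tool abuse",
}


def candidate_techniques(observations):
    """Map recon observations to candidate technique families.

    Returns (techniques, unmapped): techniques keep first-seen order with
    duplicates removed; unmapped lists observations with no known pairing,
    so they are flagged for manual review rather than silently dropped.
    """
    techniques = []
    unmapped = []
    for obs in observations:
        technique = HEURISTICS.get(obs.strip().lower())
        if technique is None:
            unmapped.append(obs)
        elif technique not in techniques:
            techniques.append(technique)
    return techniques, unmapped


if __name__ == "__main__":
    notes = ["Tool access", "strict keyword filter", "file uploads"]
    print(candidate_techniques(notes))
```

Keeping the unmapped observations visible matters during Attack Surface Mapping: an input vector with no obvious technique is still a potential entry point.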

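Challenge Categories

The challenges in this series are organized into categories that test different skill sets:

Offensive Challenges

Challenge	Category	Key Skills Tested
Jailbreak Gauntlet	Prompt injection	Progressive jailbreak techniques
Agent Heist	Agent exploitation	Multi-step tool-chain attacks
RAG Infiltrator	Data poisoning	Targeted knowledge-base manipulation
Alignment Breaker	Safety bypass	Advanced alignment evasion
Multimodal Maze	Multimodal attacks	Cross-modal injection chains
Supply Chain Saboteur	Infrastructure	Model pipeline exploitation

Defensive Challenges

Challenge	Category	Key Skills Tested
Defense Gauntlet	Blue team	Building robust AI guardrails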

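Recommended Preparation

Before attempting the CTF challenges, make sure you have completed:

Beginner labs -- the introductory AI red teaming labs provide foundational skills
Prompt injection fundamentals -- review direct injection and defense evasion techniques
Tool setup -- confirm your testing framework and API access are configured

Environment Setup

Each challenge has its own requirements, but the common baseline is: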

# ctf_setup.py -- baseline CTF environment configuration
import json
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CTFSession:
    """Track a CTF challenge attempt."""
    challenge_id: str
    start_time: datetime = field(default_factory=datetime.now)
    attempts: list = field(default_factory=list)
    hints_used: int = 0
    flag: str = ""
    max_points: int = 0  # Set this to the challenge's point value at creation

    def log_attempt(self, payload: str, response: str, success: bool = False):
        """Record one payload/response pair with a timestamp."""
        self.attempts.append({
            "timestamp": datetime.now().isoformat(),
            "payload": payload,
            "response": response[:500],  # Truncate long responses
            "success": success,
        })

    def use_hint(self):
        """Consume the next hint and lower the maximum achievable score."""
        penalties = {1: 0.15, 2: 0.25, 3: 0.50}
        if self.hints_used >= len(penalties):
            raise ValueError("All three hints have already been used.")
        self.hints_used += 1
        penalty = penalties[self.hints_used]
        # Penalties compound: each hint reduces the already-reduced maximum.
        self.max_points = int(self.max_points * (1 - penalty))

    def submit_flag(self, flag: str, expected: str) -> bool:
        """Compare a submitted flag to the expected one, ignoring outer whitespace."""
        if flag.strip() == expected.strip():
            self.flag = flag
            elapsed = (datetime.now() - self.start_time).total_seconds()
            print(f"Flag accepted! Time: {elapsed:.0f}s, "
                  f"Attempts: {len(self.attempts)}, "
                  f"Points: {self.max_points}")
            return True
        print("Incorrect flag. Keep trying.")
        return False

    def export_writeup(self, filepath: str):
        """Write the session summary and the full attempt log to a JSON file."""
        writeup = {
            "challenge": self.challenge_id,
            "duration_seconds": (datetime.now() - self.start_time).total_seconds(),
            "total_attempts": len(self.attempts),
            "hints_used": self.hints_used,
            "final_points": self.max_points,
            "attempts": self.attempts,
        }
        with open(filepath, "w", encoding="utf-8") as f:
            json.dump(writeup, f, indent=2)

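After the CTF

After completing the challenges here, consider moving on to the full engagement simulations, which test your skills in realistic end-to-end assessment scenarios. You can also explore the Capstone section for guidance on real-world engagement planning and reporting.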

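References

"HarmBench: A Standardized Evaluation Framework for Automated Red Teaming", Mazeika et al. (2024): benchmark framework that informs CTF challenge design and scoring
"JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models", Chao et al. (2024): standardized jailbreak evaluation used for the gauntlet-style challenges
"Red Teaming Language Models to Reduce Harms", Ganguli et al. (2022): research methodology that guides adversarial challenge design
"AI Risk Management Framework", NIST (2023): risk framework that places CTF skills in a production security context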

Knowledge Check

What is the primary purpose of the hint system in AI CTF challenges?

Knowledge Check

What should you do before sending attack payloads to a CTF target?
