AI Attack Surface Mapping
Intermediate · 4 min read · Updated 2026-03-15
A systematic methodology for identifying every attack vector in an AI system: input channels, data flows, tool integrations, and trust boundaries.
Attack surface mapping, for AI systems, is the process of identifying every interaction point between attackers and an AI deployment. Unlike a traditional web-application attack surface, which centers on HTTP endpoints and input fields, an AI attack surface includes semantic channels (natural-language inputs that influence model behavior), data channels (documents, knowledge bases, and training data), and capability channels (the tools and APIs the model can invoke).
The AI Attack Surface Model
An AI system's attack surface spans six dimensions:
AI attack surface dimensions:
1. Input channels
├── Direct user input (chat, API)
├── Indirect data (documents, email, web pages)
├── Multimodal inputs (images, audio, video)
└── System configuration (prompts, parameters)
2. Data channels
├── RAG knowledge bases
├── Training/fine-tuning data
├── User conversation history
└── External data sources
3. Capability channels
├── Tool/function calling
├── Code execution
├── API integrations
└── File system access
4. Model internals
├── Weights and activations
├── Tokenizer
├── Embedding space
└── Attention mechanisms
5. Infrastructure
├── API endpoints
├── Authentication/authorization
├── Rate limiting
└── Logging and monitoring
6. Human channels
├── Operators and administrators
├── Content moderators
├── End users (social engineering)
└── Developer teams
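The six dimensions above lend themselves to a simple coverage checklist for an assessment. A minimal sketch (the dictionary keys and the `coverage_gaps` helper are illustrative, not part of any standard tooling):

```python
# Minimal sketch: the six-dimension taxonomy as a coverage checklist.
# Dimension and item names follow the diagram above.
ATTACK_SURFACE_DIMENSIONS = {
    "input_channels": ["direct user input", "indirect data",
                       "multimodal inputs", "system configuration"],
    "data_channels": ["RAG knowledge bases", "training/fine-tuning data",
                      "user conversation history", "external data sources"],
    "capability_channels": ["tool/function calling", "code execution",
                            "API integrations", "file system access"],
    "model_internals": ["weights and activations", "tokenizer",
                        "embedding space", "attention mechanisms"],
    "infrastructure": ["API endpoints", "authentication/authorization",
                       "rate limiting", "logging and monitoring"],
    "human_channels": ["operators and administrators", "content moderators",
                       "end users", "developer teams"],
}

def coverage_gaps(assessed_dimensions):
    """Return the dimensions an assessment has not yet covered."""
    return sorted(set(ATTACK_SURFACE_DIMENSIONS) - set(assessed_dimensions))
```

Running `coverage_gaps(["input_channels", "data_channels"])` flags the four dimensions that still need review, which is a quick way to keep an engagement from silently skipping, say, the human channels.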
Mapping Methodology
Phase 1: System Decomposition
Break the AI system down into its component parts:
class AISystemDecomposition:
    """Decompose an AI system into mappable components."""

    def __init__(self, system_name):
        self.system = system_name
        self.components = {
            "inputs": [],
            "processing": [],
            "outputs": [],
            "data_stores": [],
            "integrations": [],
            "humans": []
        }

    def map_input_channels(self):
        """Identify every input channel into the AI system."""
        channels = [
            {
                "name": "User chat interface",
                "type": "direct",
                "format": "natural language",
                "attacker_controllable": True,
                "filtered": True  # has a content filter
            },
            {
                "name": "Document upload",
                "type": "indirect",
                "format": "PDF, DOCX, TXT",
                "attacker_controllable": True,
                "filtered": False  # no injection filtering on uploads
            },
            {
                "name": "RAG knowledge base",
                "type": "indirect",
                "format": "text chunks",
                "attacker_controllable": "partially",
                "filtered": False
            },
            {
                "name": "System prompt",
                "type": "configuration",
                "format": "text",
                "attacker_controllable": False,
                "filtered": False  # trusted input
            }
        ]
        self.components["inputs"] = channels
        return channels
Phase 2: Trust Boundary Analysis
Identify where data crosses trust boundaries:
def map_trust_boundaries(system_components):
    """
    Identify the trust boundaries in an AI system.
    Every boundary transition is a potential attack point.
    """
    boundaries = []
    # Boundary 1: external -> application
    boundaries.append({
        "name": "User Input Boundary",
        "from_zone": "untrusted (external)",
        "to_zone": "application layer",
        "data_crossing": ["user messages", "uploaded files"],
        "controls": ["input validation", "content filter"],
        "attack_types": ["prompt injection", "file upload attacks"]
    })
    # Boundary 2: application -> model
    boundaries.append({
        "name": "Model Context Boundary",
        "from_zone": "application layer",
        "to_zone": "model context window",
        "data_crossing": ["system prompt", "user message",
                          "retrieved documents", "conversation history"],
        "controls": ["instruction hierarchy"],
        "attack_types": ["indirect injection via retrieved docs",
                         "context window manipulation"]
    })
    # Boundary 3: model -> tools
    boundaries.append({
        "name": "Tool Execution Boundary",
        "from_zone": "model decisions",
        "to_zone": "external systems",
        "data_crossing": ["tool calls", "API requests",
                          "database queries", "file operations"],
        "controls": ["tool allowlisting", "parameter validation"],
        "attack_types": ["tool abuse", "data exfiltration",
                         "privilege escalation"]
    })
    # Boundary 4: data stores -> application
    boundaries.append({
        "name": "Data Retrieval Boundary",
        "from_zone": "data stores",
        "to_zone": "model context",
        "data_crossing": ["knowledge base chunks",
                          "user history", "cached responses"],
        "controls": ["access control", "content sanitization"],
        "attack_types": ["knowledge base poisoning",
                         "data extraction"]
    })
    return boundaries
Phase 3: Data Flow Mapping
Trace how data moves through the system:
def map_data_flows(system):
    """Map every data flow through the AI system."""
    flows = []
    # User input flow
    flows.append({
        "name": "User Query Processing",
        "path": [
            "User Input",
            "Input Filter",
            "Query Embedding",
            "RAG Retrieval",
            "Context Assembly",
            "Model Inference",
            "Output Filter",
            "Response Delivery"
        ],
        "sensitive_data": ["user query (may contain PII)"],
        "injection_points": ["User Input", "RAG Retrieval"]
    })
    # RAG retrieval flow
    flows.append({
        "name": "Knowledge Retrieval",
        "path": [
            "Query Embedding",
            "Vector Store Search",
            "Chunk Retrieval",
            "Re-ranking",
            "Context Insertion"
        ],
        "sensitive_data": ["knowledge base content"],
        "injection_points": ["Vector Store (if poisoned)",
                             "Context Insertion (if unfiltered)"]
    })
    # Tool execution flow
    flows.append({
        "name": "Tool Call Execution",
        "path": [
            "Model Decision",
            "Tool Call Parsing",
            "Parameter Validation",
            "Tool Execution",
            "Result Processing",
            "Response Integration"
        ],
        "sensitive_data": ["tool parameters", "tool results"],
        "injection_points": ["Model Decision (via injection)",
                             "Parameter Validation (bypass)"]
    })
    return flows
Phase 4: Attack Vector Prioritization
def prioritize_attack_vectors(trust_boundaries, data_flows):
    """Rank attack vectors by risk and accessibility."""
    vectors = []
    for boundary in trust_boundaries:
        for attack_type in boundary["attack_types"]:
            vectors.append({
                "vector": attack_type,
                "boundary": boundary["name"],
                # score_* are assessment-specific scoring helpers,
                # defined by the team running the engagement
                "accessibility": score_accessibility(boundary),
                "impact": score_impact(attack_type),
                "defense_coverage": score_defenses(boundary["controls"]),
                "priority": None  # computed below
            })
    # Compute priority scores
    for vector in vectors:
        vector["priority"] = (
            vector["accessibility"] * 0.3 +
            vector["impact"] * 0.4 +
            (1 - vector["defense_coverage"]) * 0.3
        )
    # Sort by priority (highest first)
    vectors.sort(key=lambda x: x["priority"], reverse=True)
    return vectors
Producing the Attack Surface Map
The final output is a structured document that guides red-team operations:
Attack Surface Map: [Target System Name]
Date: [Assessment Date]
1. System Overview
- Architecture type: RAG-augmented chatbot with tool access
- Model: [identified or suspected]
- Infrastructure: [cloud provider, deployment type]
2. Input Channels (ranked by exploitability)
Priority 1: Document upload (no injection filtering)
Priority 2: Direct chat input (basic content filtering)
Priority 3: RAG knowledge base (limited write access)
3. Trust Boundaries
[diagram showing boundary transitions]
4. Data Flows
[diagram showing sensitive data movement]
5. Tool Capabilities
- Email sending (high impact)
- Database queries (data exposure risk)
- File operations (restricted to upload directory)
6. Recommended Attack Sequence
1. Indirect injection via document upload
2. System prompt extraction via chat
3. Tool abuse for data exfiltration
4. Knowledge base poisoning for persistence
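As a worked example of the Phase 4 scoring, here is a minimal, self-contained sketch. The scoring helpers (`score_accessibility`, `score_impact`, `score_defenses`) and their heuristics are illustrative assumptions, since real scores depend on the system under assessment; the 0.3/0.4/0.3 weights follow the formula above.

```python
# Illustrative scoring heuristics -- assumptions, not a standard model.
IMPACT = {  # hypothetical impact weights per attack type, 0..1
    "prompt injection": 0.7,
    "file upload attacks": 0.8,
}

def score_accessibility(boundary):
    # Externally reachable boundaries are easier to attack.
    return 0.9 if "untrusted" in boundary["from_zone"] else 0.4

def score_impact(attack_type):
    return IMPACT.get(attack_type, 0.5)

def score_defenses(controls):
    # Crude coverage proxy: more controls, more coverage, capped at 1.0.
    return min(1.0, 0.25 * len(controls))

def prioritize(boundaries):
    """Score every attack type on every boundary, highest priority first."""
    vectors = []
    for b in boundaries:
        for attack in b["attack_types"]:
            priority = (score_accessibility(b) * 0.3 +
                        score_impact(attack) * 0.4 +
                        (1 - score_defenses(b["controls"])) * 0.3)
            vectors.append({"vector": attack, "boundary": b["name"],
                            "priority": round(priority, 2)})
    return sorted(vectors, key=lambda v: v["priority"], reverse=True)

boundaries = [{
    "name": "User Input Boundary",
    "from_zone": "untrusted (external)",
    "controls": ["input validation", "content filter"],
    "attack_types": ["prompt injection", "file upload attacks"],
}]
ranked = prioritize(boundaries)
```

Under these assumed weights, the unfiltered-upload vector outranks filtered chat injection, matching the priority order in the map template above.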
Knowledge Check
An AI system accepts user chat (filtered), processes uploaded documents (unfiltered), and has access to an email-sending tool. Which attack path should a red team prioritize?