Mapping the AI Attack Surface

Intermediate | 7 min read | Updated 2026-03-15

A systematic methodology for identifying every attack vector in AI systems: input channels, data flows, tool integrations, and trust boundaries.

Tags: attack-surface, mapping, methodology, threat-modeling, vectors

Attack surface mapping for AI systems is the process of identifying every point at which an attacker can interact with an AI deployment. Unlike traditional web application attack surfaces, which center on HTTP endpoints and input fields, AI attack surfaces include semantic channels (natural-language input that influences model behavior), data channels (documents, knowledge bases, and training data), and capability channels (tools and APIs the model can invoke).

The AI Attack Surface Model

AI systems expose attack surface along six dimensions:

AI Attack Surface Dimensions:

1. Input Channels
   ├── Direct user input (chat, API)
   ├── Indirect data (documents, emails, web pages)
   ├── Multimodal inputs (images, audio, video)
   └── System configuration (prompts, parameters)

2. Data Channels
   ├── RAG knowledge bases
   ├── Training/fine-tuning data
   ├── User conversation history
   └── External data sources

3. Capability Channels
   ├── Tool/function calling
   ├── Code execution
   ├── API integrations
   └── File system access

4. Model Internals
   ├── Weights and activations
   ├── Tokenizer
   ├── Embedding space
   └── Attention mechanics

5. Infrastructure
   ├── API endpoints
   ├── Authentication/authorization
   ├── Rate limiting
   └── Logging and monitoring

6. Human Channels
   ├── Operators and administrators
   ├── Content moderators
   ├── End users (social engineering)
   └── Developer team
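
During an engagement, the six dimensions above can double as a coverage checklist. The following is a minimal sketch of that idea, not part of the methodology itself; the names `ATTACK_SURFACE_DIMENSIONS` and `coverage_gaps` are hypothetical and chosen only for illustration.

```python
# Hypothetical checklist built from the six dimensions listed above.
ATTACK_SURFACE_DIMENSIONS = {
    "Input Channels": ["Direct user input", "Indirect data",
                       "Multimodal inputs", "System configuration"],
    "Data Channels": ["RAG knowledge bases", "Training/fine-tuning data",
                      "User conversation history", "External data sources"],
    "Capability Channels": ["Tool/function calling", "Code execution",
                            "API integrations", "File system access"],
    "Model Internals": ["Weights and activations", "Tokenizer",
                        "Embedding space", "Attention mechanics"],
    "Infrastructure": ["API endpoints", "Authentication/authorization",
                       "Rate limiting", "Logging and monitoring"],
    "Human Channels": ["Operators and administrators", "Content moderators",
                       "End users", "Developer team"],
}


def coverage_gaps(reviewed):
    """Return, per dimension, the items that have not been reviewed yet."""
    gaps = {}
    for dimension, items in ATTACK_SURFACE_DIMENSIONS.items():
        missing = [item for item in items if item not in reviewed]
        if missing:
            gaps[dimension] = missing
    return gaps


# Example: only two items have been looked at so far.
reviewed_so_far = {"Direct user input", "Tool/function calling"}
for dimension, missing in coverage_gaps(reviewed_so_far).items():
    print(f"{dimension}: {len(missing)} item(s) still unmapped")
```

A flat dictionary is enough here because the tree is only two levels deep; a deeper taxonomy would probably call for a nested structure.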

Mapping Methodology

Phase 1: Decompose the System

Break the AI system down into its components:

class AISystemDecomposition:
    """Decompose an AI system into components that can be mapped."""
 
    def __init__(self, system_name):
        self.system = system_name
        self.components = {
            "inputs": [],
            "processing": [],
            "outputs": [],
            "data_stores": [],
            "integrations": [],
            "humans": []
        }
 
    def map_input_channels(self):
        """Identify all input channels into the AI system."""
        channels = [
            {
                "name": "User chat interface",
                "type": "direct",
                "format": "natural language",
                "attacker_controllable": True,
                "filtered": True  # Has a content filter
            },
            {
                "name": "Document upload",
                "type": "indirect",
                "format": "PDF, DOCX, TXT",
                "attacker_controllable": True,
                "filtered": False  # No injection filter on uploads
            },
            {
                "name": "RAG knowledge base",
                "type": "indirect",
                "format": "text chunks",
                "attacker_controllable": "partially",
                "filtered": False
            },
            {
                "name": "System prompt",
                "type": "configuration",
                "format": "text",
                "attacker_controllable": False,
                "filtered": False  # Trusted input
            }
        ]
 
        self.components["inputs"] = channels
        return channels
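
Once the channels are inventoried, the first useful query is usually "which channels can an attacker influence that nothing filters?". The sketch below shows one way to answer it over dictionaries shaped like the ones above. The `exposed_channels` helper is hypothetical, and treating `"partially"` the same as `True` is an assumption made for this example.

```python
def exposed_channels(channels):
    """Return names of attacker-influenced channels that have no filtering.

    attacker_controllable may be True, False, or the string "partially".
    Assumption for this sketch: anything other than False counts as
    attacker-influenced.
    """
    return [
        ch["name"]
        for ch in channels
        if ch["attacker_controllable"] is not False and not ch["filtered"]
    ]


# Trimmed copies of the four channels from the decomposition above.
sample_channels = [
    {"name": "User chat interface", "attacker_controllable": True, "filtered": True},
    {"name": "Document upload", "attacker_controllable": True, "filtered": False},
    {"name": "RAG knowledge base", "attacker_controllable": "partially", "filtered": False},
    {"name": "System prompt", "attacker_controllable": False, "filtered": False},
]

print(exposed_channels(sample_channels))
# ['Document upload', 'RAG knowledge base']
```

The system prompt is unfiltered as well, but it is excluded because it is not attacker-controllable in this inventory.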

Phase 2: Trust Boundary Analysis

Identify where data crosses trust boundaries:

def map_trust_boundaries(system_components):
    """
    Identify trust boundaries in the AI system.
    Every boundary crossing is a potential attack point.
    """
    boundaries = []
 
    # Boundary 1: External -> Application
    boundaries.append({
        "name": "User Input Boundary",
        "from_zone": "untrusted (external)",
        "to_zone": "application layer",
        "data_crossing": ["user messages", "uploaded files"],
        "controls": ["input validation", "content filter"],
        "attack_types": ["prompt injection", "file upload attacks"]
    })
 
    # Boundary 2: Application -> Model
    boundaries.append({
        "name": "Model Context Boundary",
        "from_zone": "application layer",
        "to_zone": "model context window",
        "data_crossing": ["system prompt", "user message",
                          "retrieved documents", "conversation history"],
        "controls": ["instruction hierarchy"],
        "attack_types": ["indirect injection via retrieved docs",
                         "context window manipulation"]
    })
 
    # Boundary 3: Model -> Tools
    boundaries.append({
        "name": "Tool Execution Boundary",
        "from_zone": "model decisions",
        "to_zone": "external systems",
        "data_crossing": ["tool calls", "API requests",
                          "database queries", "file operations"],
        "controls": ["tool allowlisting", "parameter validation"],
        "attack_types": ["tool abuse", "data exfiltration",
                         "privilege escalation"]
    })
 
    # Boundary 4: Data Store -> Application
    boundaries.append({
        "name": "Data Retrieval Boundary",
        "from_zone": "data stores",
        "to_zone": "model context",
        "data_crossing": ["knowledge base chunks",
                          "user history", "cached responses"],
        "controls": ["access control", "content sanitization"],
        "attack_types": ["knowledge base poisoning",
                         "data extraction"]
    })
 
    return boundaries
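
A boundary list like this can be triaged quickly by comparing how many attack types cross each boundary with how many controls guard it. The following sketch uses that ratio as a rough heuristic; the `rank_boundaries_by_control_gap` name is hypothetical, and counting controls says nothing about how effective each control actually is.

```python
def rank_boundaries_by_control_gap(boundaries):
    """Sort boundaries so the most attack types per control come first."""
    def gap(boundary):
        controls = len(boundary["controls"])
        attacks = len(boundary["attack_types"])
        # A boundary with no controls at all gets an infinite gap.
        return attacks / controls if controls else float("inf")

    return sorted(boundaries, key=gap, reverse=True)


# Trimmed copies of the four boundaries defined above.
sample_boundaries = [
    {"name": "User Input Boundary",
     "controls": ["input validation", "content filter"],
     "attack_types": ["prompt injection", "file upload attacks"]},
    {"name": "Model Context Boundary",
     "controls": ["instruction hierarchy"],
     "attack_types": ["indirect injection via retrieved docs",
                      "context window manipulation"]},
    {"name": "Tool Execution Boundary",
     "controls": ["tool allowlisting", "parameter validation"],
     "attack_types": ["tool abuse", "data exfiltration",
                      "privilege escalation"]},
    {"name": "Data Retrieval Boundary",
     "controls": ["access control", "content sanitization"],
     "attack_types": ["knowledge base poisoning", "data extraction"]},
]

for boundary in rank_boundaries_by_control_gap(sample_boundaries):
    print(boundary["name"])
```

With the sample data, the Model Context Boundary ranks first because a single control (the instruction hierarchy) faces two attack types.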

Phase 3: Map Data Flows

Trace how data moves through the system:

def map_data_flows(system):
    """Map all data flows through the AI system."""
    flows = []
 
    # User input flow
    flows.append({
        "name": "User Query Processing",
        "path": [
            "User Input",
            "Input Filter",
            "Query Embedding",
            "RAG Retrieval",
            "Context Assembly",
            "Model Inference",
            "Output Filter",
            "Response Delivery"
        ],
        "sensitive_data": ["user query (may contain PII)"],
        "injection_points": ["User Input", "RAG Retrieval"]
    })
 
    # RAG retrieval flow
    flows.append({
        "name": "Knowledge Retrieval",
        "path": [
            "Query Embedding",
            "Vector Store Search",
            "Chunk Retrieval",
            "Re-ranking",
            "Context Insertion"
        ],
        "sensitive_data": ["knowledge base content"],
        "injection_points": ["Vector Store (if poisoned)",
                              "Context Insertion (if unfiltered)"]
    })
 
    # Tool execution flow
    flows.append({
        "name": "Tool Call Execution",
        "path": [
            "Model Decision",
            "Tool Call Parsing",
            "Parameter Validation",
            "Tool Execution",
            "Result Processing",
            "Response Integration"
        ],
        "sensitive_data": ["tool parameters", "tool results"],
        "injection_points": ["Model Decision (via injection)",
                              "Parameter Validation (bypass)"]
    })
 
    return flows
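
One reason to record each flow as an ordered path is that it shows which controls data passes through after it enters at a given injection point. The sketch below illustrates this for the "User Query Processing" flow. The `downstream_stages` helper is hypothetical, and detecting filters by looking for the word "Filter" in a stage name is a naming-convention assumption, not a general technique.

```python
def downstream_stages(path, injection_point):
    """Return the stages that handle data after it enters at injection_point."""
    index = path.index(injection_point)  # raises ValueError for unknown stages
    return path[index + 1:]


# Path of the "User Query Processing" flow defined above.
user_query_path = [
    "User Input",
    "Input Filter",
    "Query Embedding",
    "RAG Retrieval",
    "Context Assembly",
    "Model Inference",
    "Output Filter",
    "Response Delivery",
]

for point in ["User Input", "RAG Retrieval"]:
    filters = [s for s in downstream_stages(user_query_path, point) if "Filter" in s]
    print(f"{point}: passes through {filters}")
# User Input: passes through ['Input Filter', 'Output Filter']
# RAG Retrieval: passes through ['Output Filter']
```

In this flow, content injected at RAG Retrieval never passes the Input Filter, which is why retrieved documents are listed as a separate injection point.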

Phase 4: Prioritize Attack Vectors

def prioritize_attack_vectors(trust_boundaries, data_flows):
    """Prioritize attack vectors by risk and accessibility.

    The score_accessibility, score_impact, and score_defenses helpers are
    not defined in this article; they are assumed to return values in [0, 1].
    data_flows is accepted for context but is not used in the scoring below.
    """
    vectors = []

    for boundary in trust_boundaries:
        for attack_type in boundary["attack_types"]:
            vectors.append({
                "vector": attack_type,
                "boundary": boundary["name"],
                "accessibility": score_accessibility(boundary),
                "impact": score_impact(attack_type),
                "defense_coverage": score_defenses(boundary["controls"]),
                "priority": None  # Calculated below
            })

    # Calculate the priority score: a weighted sum in which weaker
    # defense coverage raises the priority
    for vector in vectors:
        vector["priority"] = (
            vector["accessibility"] * 0.3 +
            vector["impact"] * 0.4 +
            (1 - vector["defense_coverage"]) * 0.3
        )

    # Sort by priority (highest first)
    vectors.sort(key=lambda x: x["priority"], reverse=True)

    return vectors
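
The prioritization function relies on three scoring helpers that the article leaves undefined. The sketch below shows one possible shape for them so the weighted sum can be worked through by hand. Every rating, threshold, and rule here is an illustrative assumption, not a calibrated risk model, and `priority_score` is a hypothetical wrapper around the same 0.3 / 0.4 / 0.3 weights.

```python
# Hypothetical impact ratings in [0, 1]; the values are illustrative only.
IMPACT_RATINGS = {
    "data exfiltration": 0.9,
    "privilege escalation": 0.9,
    "tool abuse": 0.8,
    "prompt injection": 0.7,
}


def score_accessibility(boundary):
    """Assumption: boundaries fed from an untrusted zone are fully reachable."""
    return 1.0 if "untrusted" in boundary["from_zone"] else 0.5


def score_impact(attack_type):
    """Look up an illustrative rating, defaulting to a mid-range value."""
    return IMPACT_RATINGS.get(attack_type, 0.5)


def score_defenses(controls):
    """Assumption: each listed control adds a third of coverage, capped at 1.0."""
    return min(len(controls) / 3, 1.0)


def priority_score(accessibility, impact, defense_coverage):
    """Weighted sum with the same weights as the prioritization function."""
    return accessibility * 0.3 + impact * 0.4 + (1 - defense_coverage) * 0.3


# Worked example: prompt injection across the User Input Boundary.
boundary = {
    "from_zone": "untrusted (external)",
    "controls": ["input validation", "content filter"],
}
score = priority_score(
    score_accessibility(boundary),            # 1.0
    score_impact("prompt injection"),         # 0.7
    score_defenses(boundary["controls"]),     # 2/3
)
print(round(score, 2))  # 1.0*0.3 + 0.7*0.4 + (1/3)*0.3 = 0.68
```

Because the three weights sum to 1 and each input is in [0, 1], the resulting priority also stays in [0, 1], which makes scores comparable across boundaries.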

Building the Attack Surface Map

The final output is a structured document that guides red team operations:

Attack Surface Map: [Target System Name]
Date: [Assessment Date]

1. SYSTEM OVERVIEW
   - Architecture type: RAG-augmented chatbot with tool access
   - Model: [identified or suspected]
   - Infrastructure: [cloud provider, deployment type]

2. INPUT CHANNELS (ranked by exploitability)
   Priority 1: Document upload (no injection filtering)
   Priority 2: Direct chat input (basic content filter)
   Priority 3: RAG knowledge base (limited write access)

3. TRUST BOUNDARIES
   [Diagram showing boundary transitions]

4. DATA FLOWS
   [Diagram showing sensitive data movement]

5. TOOL CAPABILITIES
   - Email sending (high impact)
   - Database queries (data exposure risk)
   - File operations (limited to uploads directory)

6. RECOMMENDED ATTACK SEQUENCE
   1. Indirect injection via document upload
   2. System prompt extraction via chat
   3. Tool abuse for data exfiltration
   4. Knowledge base poisoning for persistence
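
Sections of this document can be generated from the data collected in the earlier phases rather than written by hand. As a minimal sketch, the hypothetical `render_input_channels` helper below formats an already-ranked list of channels into the layout of section 2 of the template; the input format (name and note pairs) is an assumption.

```python
def render_input_channels(ranked_channels):
    """Format ranked (name, note) pairs as section 2 of the map template."""
    lines = ["2. INPUT CHANNELS (ranked by exploitability)"]
    for rank, (name, note) in enumerate(ranked_channels, start=1):
        lines.append(f"   Priority {rank}: {name} ({note})")
    return "\n".join(lines)


# Channels in the order used by the example map above.
ranked = [
    ("Document upload", "no injection filtering"),
    ("Direct chat input", "basic content filter"),
    ("RAG knowledge base", "limited write access"),
]

print(render_input_channels(ranked))
```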

Related Topics

Target Profiling: initial reconnaissance
Model Identification: identifying the model
API Enumeration: discovering API capabilities
Defense Taxonomy: understanding which defenses to expect

Knowledge Check

An AI system accepts user chat (filtered), processes uploaded documents (unfiltered), and has access to tools for sending email. Which attack path should a red team prioritize?

