Cloud ML Platform Security (AWS/Azure/GCP)
Security comparison of cloud ML platforms including AWS SageMaker, Azure Machine Learning, and Google Vertex AI: IAM configuration, data security, model serving, and platform-specific attack surfaces.
Cloud ML platforms (AWS SageMaker, Azure Machine Learning, Google Vertex AI) provide managed infrastructure for training, deploying, and serving ML models, including LLMs. Each platform inherits its parent cloud's security model while adding ML-specific attack surface. Red teaming cloud ML deployments therefore requires understanding both cloud security fundamentals and ML-specific vulnerabilities.
Platform Comparison Overview
| Feature | AWS SageMaker | Azure ML | Google Vertex AI |
|---|---|---|---|
| IAM model | AWS IAM (roles, policies) | Azure RBAC + Azure AD | Google Cloud IAM |
| Network isolation | VPC, PrivateLink | VNet, Private Endpoints | VPC, Private Service Connect |
| Data encryption | KMS (at rest), TLS (in transit) | Azure Key Vault, TLS | Cloud KMS, TLS |
| Model registry | SageMaker Model Registry | Azure ML Model Registry | Vertex AI Model Registry |
| Serving | SageMaker Endpoints, Serverless | Managed Endpoints, AKS | Vertex AI Endpoints, GKE |
| Notebook environment | SageMaker Studio, Notebook Instances | Azure ML Compute Instances | Vertex AI Workbench |
| Compliance | HIPAA, SOC, FedRAMP | HIPAA, SOC, FedRAMP, DoD IL | HIPAA, SOC, FedRAMP |
IAM Misconfiguration Risks
IAM misconfigurations are the most common and highest-impact vulnerability class in cloud ML deployments.
AWS SageMaker IAM Risks
```python
# DANGEROUS: overly permissive SageMaker execution role
overly_permissive_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "sagemaker:*",
        "Resource": "*"
    }, {
        "Effect": "Allow",
        "Action": "s3:*",          # full S3 access: can read ALL buckets
        "Resource": "*"
    }, {
        "Effect": "Allow",
        "Action": "iam:PassRole",  # can pass any role
        "Resource": "*"
    }]
}
# This role can:
# - Access any S3 bucket (training data, other projects' data)
# - Create endpoints with any role attached
# - Modify any SageMaker resource
# - Potentially escalate privileges via iam:PassRole
```

Azure ML IAM Risks
| Misconfiguration | Risk | Testing |
|---|---|---|
| Contributor role on workspace | Full control over ML resources, data, and models | Check role assignments — Contributor is often over-assigned |
| Shared managed identity | Multiple services share one identity, enabling cross-service access | Enumerate managed identity assignments |
| Missing data plane RBAC | Control plane access does not restrict data plane operations | Test data access after restricting control plane roles |
| Service principal over-provisioning | Automation accounts with excessive scope | Review service principal permissions across resource groups |
Google Vertex AI IAM Risks
| Misconfiguration | Risk | Testing |
|---|---|---|
| `roles/aiplatform.admin` over-assignment | Full control over all Vertex AI resources | Review IAM bindings at the project level |
| Missing VPC Service Controls | Data exfiltration via Vertex AI APIs | Test data access from outside the perimeter |
| Default service account usage | The Compute Engine default SA has broad project-level permissions | Check whether the default SA is used for ML workloads |
| Cross-project access | Vertex AI resources accessible from other projects | Test cross-project API calls |
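The IAM misconfigurations described above can be flagged programmatically. The sketch below is an illustrative audit helper for AWS-style JSON policy documents; the function name and the specific checks are this page's own illustration, not a platform API, and analogous checks apply to Azure role assignments and GCP IAM bindings.

```python
# Illustrative audit helper (hypothetical, not a platform API): flags
# wildcard actions and iam:PassRole on Resource "*" in an AWS-style policy.

def find_risky_statements(policy: dict) -> list[str]:
    """Return human-readable findings for overly broad Allow statements."""
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resource = stmt.get("Resource", [])
        resources = [resource] if isinstance(resource, str) else resource
        wildcard_resource = "*" in resources
        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"wildcard action {action!r}")
            if action.lower() == "iam:passrole" and wildcard_resource:
                findings.append("iam:PassRole on Resource '*' (privilege escalation)")
    return findings

# Example: the overly permissive execution role shown earlier
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sagemaker:*", "Resource": "*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": "iam:PassRole", "Resource": "*"},
    ],
}
for finding in find_risky_statements(policy):
    print(finding)
```

In a real engagement the same pattern would be run over policies pulled via each platform's IAM APIs rather than a hardcoded dict.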
Data Security Across the ML Lifecycle
Data Collection → Preparation → Training → Model Storage → Serving → Inference
       ↑              ↑            ↑             ↑             ↑          ↑
  Data source      Feature      Training      Model        Endpoint    Response
  access control,  store        data in       artifact     auth,       data
  authentication   encryption   compute       encryption   rate limit  leakage
Training Data Security
| Risk | AWS SageMaker | Azure ML | Vertex AI |
|---|---|---|---|
| Data at rest | S3 SSE or KMS | Azure Storage encryption | Cloud Storage encryption |
| Data in training | EBS encryption on training instances | Compute encryption | Boot disk + data disk encryption |
| Data access logging | CloudTrail + S3 access logs | Azure Monitor + Storage Analytics | Cloud Audit Logs |
| Cross-account data access | S3 bucket policies + IAM | Shared access signatures | IAM + VPC Service Controls |
| Data exfiltration prevention | VPC endpoints, no internet access | Private endpoints, NSGs | VPC Service Controls |
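As a concrete example of the exfiltration-prevention row above, an S3 bucket holding training data can carry a policy that denies access from outside a designated VPC endpoint. A minimal sketch, where "vpce-EXAMPLE" and the bucket name are placeholders for real values:

```python
# Sketch: S3 bucket policy denying access unless the request arrives through
# a specific VPC endpoint. "vpce-EXAMPLE" and the bucket name are placeholders.
deny_outside_vpce_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAccessOutsideVpcEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::training-data-bucket",
            "arn:aws:s3:::training-data-bucket/*",
        ],
        # Deny any request whose source VPC endpoint is not the approved one
        "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-EXAMPLE"}},
    }],
}
```

An explicit Deny such as this overrides any Allow elsewhere, so even an over-permissioned role cannot read the bucket from outside the endpoint.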
Model Artifact Security
| Concern | Description | Testing Approach |
|---|---|---|
| Model access control | Who can download or use model artifacts? | Attempt to access model artifacts with different credentials |
| Model integrity | Can model artifacts be tampered with? | Check for write access to model storage, verify signing |
| Model versioning | Are old (potentially vulnerable) model versions accessible? | Enumerate available model versions |
| Serialization risks | The model format (pickle, safetensors) determines code execution risk | Identify the model serialization format, test for deserialization attacks |
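Serialization format can often be identified from an artifact's leading bytes. The sniffer below is an illustrative heuristic, not an official API: pickle protocol 2+ begins with the 0x80 PROTO opcode, while safetensors begins with an 8-byte little-endian header length followed by a JSON header.

```python
import pickle
import struct

# Illustrative heuristic: distinguish pickle-based artifacts (which can
# execute arbitrary code on load) from safetensors (a data-only format).

def sniff_model_format(blob: bytes) -> str:
    # Pickle protocol >= 2 starts with the PROTO opcode 0x80.
    if blob[:1] == b"\x80":
        return "pickle (deserialization can execute arbitrary code)"
    # safetensors: 8-byte little-endian header length, then a JSON header.
    if len(blob) > 8:
        (header_len,) = struct.unpack("<Q", blob[:8])
        if 0 < header_len <= len(blob) - 8 and blob[8:9] == b"{":
            return "safetensors (data-only, no code execution on load)"
    return "unknown"

print(sniff_model_format(pickle.dumps({"w": [1, 2, 3]})))
fake_safetensors = struct.pack("<Q", 2) + b"{}"
print(sniff_model_format(fake_safetensors))
```

Pickle protocols 0 and 1 use ASCII opcodes and would fall through to "unknown" here, so a production check would need more cases; the point is only that format identification is cheap and worth doing before loading an untrusted artifact.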
Model Serving Security
Endpoint Authentication
| Feature | AWS SageMaker | Azure ML | Vertex AI |
|---|---|---|---|
| Default auth | IAM SigV4 | Key- or token-based | IAM + API key |
| Custom auth | Lambda authorizer | Azure AD integration | IAP + IAM |
| Public endpoint risk | Can be exposed via API Gateway | Can be exposed publicly | Can be exposed publicly |
| Private endpoint | PrivateLink | Private Endpoint | Private Service Connect |
| mTLS | Via API Gateway custom domain | Available | Available |
Endpoint Attack Surface
Testing Authentication Bypass
Attempt to invoke model endpoints without valid credentials. Check for misconfigured API gateways, missing auth on health check endpoints, and default credentials.
Testing Input Validation
Send malformed, oversized, and adversarial inputs to model endpoints. Check for error messages that reveal model architecture, framework versions, or infrastructure details.
Testing for Model Extraction
Systematically query the endpoint to map its behavior. With enough queries, attackers can build a functional copy of the model (model stealing). Check rate limiting and query logging.
Testing for Data Leakage in Responses
Check whether model responses contain training data, PII, or other sensitive information. Test for membership inference (determining whether a specific data point was in the training set).
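A toy illustration of the loss-threshold flavor of membership inference, using made-up numbers rather than a real model: training-set members tend to have lower loss, so an attacker who can observe per-example loss or confidence guesses "member" below a threshold.

```python
# Toy membership-inference sketch (illustrative numbers, not a real model):
# members of the training set tend to show lower loss than unseen examples.

member_losses = [0.05, 0.10, 0.08, 0.12]     # examples seen in training
nonmember_losses = [0.90, 1.20, 0.75, 1.05]  # unseen examples

def guess_member(loss: float, threshold: float = 0.5) -> bool:
    # Guess "member" when the observed loss falls below the threshold.
    return loss < threshold

hits = sum(guess_member(l) for l in member_losses) + \
       sum(not guess_member(l) for l in nonmember_losses)
accuracy = hits / (len(member_losses) + len(nonmember_losses))
print(f"attack accuracy on toy data: {accuracy:.0%}")
```

Real attacks calibrate the threshold on shadow models and work from confidence scores when raw loss is unavailable; endpoints that return full probability vectors make this easier, which is why response minimization matters.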
Testing Network Isolation
Verify that model endpoints cannot be reached from unauthorized networks. Check VPC configurations, security groups, and network policies.
Notebook Environment Security
ML notebook environments (SageMaker Studio, Azure ML Compute, Vertex AI Workbench) are high-value targets because they typically have broad data access and code execution capabilities:
| Risk | Description | Mitigation |
|---|---|---|
| Internet access | Notebooks with internet access can exfiltrate data | Use VPC/VNet with no internet egress |
| Shared instances | Multiple users sharing notebook infrastructure | Use per-user instances with separate credentials |
| Persistent credentials | Long-lived credentials stored in notebook environments | Use temporary credentials, metadata service v2 (IMDSv2) |
| Package installation | Users install arbitrary packages that may be malicious | Restrict to approved package repositories |
| Root access | Notebook users with root can modify security controls | Restrict to non-root access |
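The mitigations above can be checked mechanically. The helper below is a sketch that audits a notebook-instance configuration dict; the field names mirror SageMaker's DirectInternetAccess and RootAccess settings, but the dict shape here is illustrative rather than an actual SDK response.

```python
# Illustrative helper: flag risky settings in a notebook-instance config.
# Field names mirror SageMaker's DirectInternetAccess / RootAccess options,
# but this dict shape is a sketch, not a real DescribeNotebookInstance reply.

def audit_notebook_config(config: dict) -> list[str]:
    findings = []
    if config.get("DirectInternetAccess") == "Enabled":
        findings.append("direct internet access enabled (exfiltration path)")
    if config.get("RootAccess") == "Enabled":
        findings.append("root access enabled (security controls modifiable)")
    if not config.get("SubnetId"):
        findings.append("no VPC subnet attached (not network-isolated)")
    return findings

risky = {"DirectInternetAccess": "Enabled", "RootAccess": "Enabled"}
for finding in audit_notebook_config(risky):
    print(finding)
```

Equivalent checks on Azure ML and Vertex AI would inspect VNet/VPC attachment and per-user isolation settings through their respective management APIs.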
Cross-Platform Security Checklist
| Security Control | What to Verify | Priority |
|---|---|---|
| IAM least privilege | Are ML roles scoped to the minimum necessary permissions? | Critical |
| Network isolation | Are ML resources in private subnets with no unnecessary internet access? | Critical |
| Data encryption | Is data encrypted at rest and in transit? Are keys managed properly? | High |
| Endpoint authentication | Are model serving endpoints properly authenticated? | High |
| Logging and monitoring | Are ML API calls, data access, and model invocations logged? | High |
| Model artifact integrity | Are model artifacts access-controlled and integrity-verified? | Medium |
| Notebook security | Are notebooks isolated, credentialed appropriately, and network-restricted? | High |
| Supply chain | Are ML frameworks, packages, and pre-trained models from trusted sources? | Medium |
For related topics, see Infrastructure Security, Supply Chain Security, and LangChain & LlamaIndex Security.
Related Topics
- Infrastructure Security: API Security -- API-level security testing for ML endpoints
- Supply Chain Security -- model artifact and dependency integrity
- LangChain & LlamaIndex Security -- framework-specific vulnerabilities on cloud platforms
- API Provider Security Comparison -- comparing security controls across LLM API providers
Why is the combination of iam:PassRole and sagemaker:* permissions a privilege escalation risk?