Cloud ML Platform Security (AWS/Azure/GCP)
Security comparison of cloud ML platforms including AWS SageMaker, Azure Machine Learning, and Google Vertex AI: IAM configuration, data security, model serving, and platform-specific attack surfaces.
Cloud ML platforms (AWS SageMaker, Azure Machine Learning, Google Vertex AI) provide managed infrastructure for training, deploying, and serving ML models, including LLMs. Each platform inherits its parent cloud's security model while adding ML-specific attack surface. Red teaming cloud ML deployments therefore requires understanding both cloud security fundamentals and ML-specific vulnerabilities.
Platform Comparison Overview
| Feature | AWS SageMaker | Azure ML | Google Vertex AI |
|---|---|---|---|
| IAM model | AWS IAM (roles, policies) | Azure RBAC + Azure AD | Google Cloud IAM |
| Network isolation | VPC, PrivateLink | VNet, Private Endpoints | VPC, Private Service Connect |
| Data encryption | KMS (at rest), TLS (in transit) | Azure Key Vault, TLS | Cloud KMS, TLS |
| Model registry | SageMaker Model Registry | Azure ML Model Registry | Vertex AI Model Registry |
| Serving | SageMaker Endpoints, Serverless | Managed Endpoints, AKS | Vertex AI Endpoints, GKE |
| Notebook environment | SageMaker Studio, Notebook Instances | Azure ML Compute Instances | Vertex AI Workbench |
| Compliance | HIPAA, SOC, FedRAMP | HIPAA, SOC, FedRAMP, DoD IL | HIPAA, SOC, FedRAMP |
IAM Misconfiguration Risks
IAM misconfigurations are the most common and highest-impact vulnerability class in cloud ML deployments.
AWS SageMaker IAM Risks
```python
# DANGEROUS: overly permissive SageMaker execution role
overly_permissive_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "sagemaker:*",
        "Resource": "*"
    }, {
        "Effect": "Allow",
        "Action": "s3:*",          # full S3 access: can read ALL buckets
        "Resource": "*"
    }, {
        "Effect": "Allow",
        "Action": "iam:PassRole",  # can pass any role
        "Resource": "*"
    }]
}
# This role can:
# - Access any S3 bucket (training data, other projects' data)
# - Create endpoints with any role attached
# - Modify any SageMaker resource
# - Potentially escalate privileges via iam:PassRole
```

Azure ML IAM Risks
| Misconfiguration | Risk | Testing |
|---|---|---|
| Contributor role on workspace | Full control over ML resources, data, and models | Check role assignments — Contributor is often over-assigned |
| Shared managed identity | Multiple services share one identity, enabling cross-service access | Enumerate managed identity assignments |
| Missing data plane RBAC | Control plane access does not restrict data plane operations | Test data access after restricting control plane roles |
| Service principal over-provisioning | Automation accounts with excessive scope | Review service principal permissions across resource groups |
Google Vertex AI IAM Risks
| Misconfiguration | Risk | Testing |
|---|---|---|
| `roles/aiplatform.admin` over-assignment | Full control over all Vertex AI resources | Review IAM bindings at the project level |
| Missing VPC Service Controls | Data exfiltration via Vertex AI APIs | Test data access from outside the perimeter |
| Default service account usage | The Compute Engine default SA has broad project-level permissions | Check whether the default SA is used for ML workloads |
| Cross-project access | Vertex AI resources accessible from other projects | Test cross-project API calls |
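The IAM misconfigurations described above can be flagged programmatically. The sketch below is an illustrative audit helper for AWS-style JSON policy documents; the function name and the specific checks are this page's own illustration, not a platform API, and analogous checks apply to Azure role assignments and GCP IAM bindings.

```python
# Illustrative audit helper (hypothetical, not a platform API): flags
# wildcard actions and iam:PassRole on Resource "*" in an AWS-style policy.

def find_risky_statements(policy: dict) -> list[str]:
    """Return human-readable findings for overly broad Allow statements."""
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resource = stmt.get("Resource", [])
        resources = [resource] if isinstance(resource, str) else resource
        wildcard_resource = "*" in resources
        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"wildcard action {action!r}")
            if action.lower() == "iam:passrole" and wildcard_resource:
                findings.append("iam:PassRole on Resource '*' (privilege escalation)")
    return findings

# Example: the overly permissive execution role shown earlier
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sagemaker:*", "Resource": "*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": "iam:PassRole", "Resource": "*"},
    ],
}
for finding in find_risky_statements(policy):
    print(finding)
```

In a real engagement the same pattern would be run over policies pulled via each platform's IAM APIs rather than a hardcoded dict.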
Data Security Across the ML Lifecycle
Data Collection → Preparation → Training → Model Storage → Serving → Inference
       ↑              ↑            ↑             ↑             ↑          ↑
  Data source      Feature      Training      Model        Endpoint    Response
  access control,  store        data in       artifact     auth,       data
  authentication   encryption   compute       encryption   rate limit  leakage
Training Data Security
| Risk | AWS SageMaker | Azure ML | Vertex AI |
|---|---|---|---|
| Data at rest | S3 SSE or KMS | Azure Storage encryption | Cloud Storage encryption |
| Data in training | EBS encryption on training instances | Compute encryption | Boot disk + data disk encryption |
| Data access logging | CloudTrail + S3 access logs | Azure Monitor + Storage Analytics | Cloud Audit Logs |
| Cross-account data access | S3 bucket policies + IAM | Shared access signatures | IAM + VPC Service Controls |
| Data exfiltration prevention | VPC endpoints, no internet access | Private endpoints, NSGs | VPC Service Controls |
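As a concrete example of the exfiltration-prevention row above, an S3 bucket holding training data can carry a policy that denies access from outside a designated VPC endpoint. A minimal sketch, where "vpce-EXAMPLE" and the bucket name are placeholders for real values:

```python
# Sketch: S3 bucket policy denying access unless the request arrives through
# a specific VPC endpoint. "vpce-EXAMPLE" and the bucket name are placeholders.
deny_outside_vpce_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAccessOutsideVpcEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::training-data-bucket",
            "arn:aws:s3:::training-data-bucket/*",
        ],
        # Deny any request whose source VPC endpoint is not the approved one
        "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-EXAMPLE"}},
    }],
}
```

An explicit Deny such as this overrides any Allow elsewhere, so even an over-permissioned role cannot read the bucket from outside the endpoint.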
Model Artifact Security
| Concern | Description | Testing Approach |
|---|---|---|
| Model access control | Who can download or use model artifacts? | Attempt to access model artifacts with different credentials |
| Model integrity | Can model artifacts be tampered with? | Check for write access to model storage, verify signing |
| Model versioning | Are old (potentially vulnerable) model versions accessible? | Enumerate available model versions |
| Serialization risks | The model format (pickle, safetensors) determines code execution risk | Identify the model serialization format, test for deserialization attacks |
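Serialization format can often be identified from an artifact's leading bytes. The sniffer below is an illustrative heuristic, not an official API: pickle protocol 2+ begins with the 0x80 PROTO opcode, while safetensors begins with an 8-byte little-endian header length followed by a JSON header.

```python
import pickle
import struct

# Illustrative heuristic: distinguish pickle-based artifacts (which can
# execute arbitrary code on load) from safetensors (a data-only format).

def sniff_model_format(blob: bytes) -> str:
    # Pickle protocol >= 2 starts with the PROTO opcode 0x80.
    if blob[:1] == b"\x80":
        return "pickle (deserialization can execute arbitrary code)"
    # safetensors: 8-byte little-endian header length, then a JSON header.
    if len(blob) > 8:
        (header_len,) = struct.unpack("<Q", blob[:8])
        if 0 < header_len <= len(blob) - 8 and blob[8:9] == b"{":
            return "safetensors (data-only, no code execution on load)"
    return "unknown"

print(sniff_model_format(pickle.dumps({"w": [1, 2, 3]})))
fake_safetensors = struct.pack("<Q", 2) + b"{}"
print(sniff_model_format(fake_safetensors))
```

Pickle protocols 0 and 1 use ASCII opcodes and would fall through to "unknown" here, so a production check would need more cases; the point is only that format identification is cheap and worth doing before loading an untrusted artifact.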
Model Serving Security
Endpoint Authentication
| Feature | AWS SageMaker | Azure ML | Vertex AI |
|---|---|---|---|
| Default auth | IAM SigV4 | Key- or token-based | IAM + API key |
| Custom auth | Lambda authorizer | Azure AD integration | IAP + IAM |
| Public endpoint risk | Can be exposed via API Gateway | Can be exposed publicly | Can be exposed publicly |
| Private endpoint | PrivateLink | Private Endpoint | Private Service Connect |
| mTLS | Via API Gateway custom domain | Available | Available |
Endpoint Attack Surface
Testing Authentication Bypass
Attempt to invoke model endpoints without valid credentials. Check for misconfigured API gateways, missing auth on health check endpoints, and default credentials.
Testing Input Validation
Send malformed, oversized, and adversarial inputs to model endpoints. Check for error messages that reveal model architecture, framework versions, or infrastructure details.
Testing for Model Extraction
Systematically query the endpoint to map its behavior. With enough queries, attackers can build a functional copy of the model (model stealing). Check rate limiting and query logging.
Testing for Data Leakage in Responses
Check whether model responses contain training data, PII, or other sensitive information. Test for membership inference (determining whether a specific data point was in the training set).
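A toy illustration of the loss-threshold flavor of membership inference, using made-up numbers rather than a real model: training-set members tend to have lower loss, so an attacker who can observe per-example loss or confidence guesses "member" below a threshold.

```python
# Toy membership-inference sketch (illustrative numbers, not a real model):
# members of the training set tend to show lower loss than unseen examples.

member_losses = [0.05, 0.10, 0.08, 0.12]     # examples seen in training
nonmember_losses = [0.90, 1.20, 0.75, 1.05]  # unseen examples

def guess_member(loss: float, threshold: float = 0.5) -> bool:
    # Guess "member" when the observed loss falls below the threshold.
    return loss < threshold

hits = sum(guess_member(l) for l in member_losses) + \
       sum(not guess_member(l) for l in nonmember_losses)
accuracy = hits / (len(member_losses) + len(nonmember_losses))
print(f"attack accuracy on toy data: {accuracy:.0%}")
```

Real attacks calibrate the threshold on shadow models and work from confidence scores when raw loss is unavailable; endpoints that return full probability vectors make this easier, which is why response minimization matters.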
Testing Network Isolation
Verify that model endpoints cannot be reached from unauthorized networks. Check VPC configurations, security groups, and network policies.
Notebook Environment Security
ML notebook environments (SageMaker Studio, Azure ML Compute, Vertex AI Workbench) are high-value targets because they typically have broad data access and code execution capabilities:
| Risk | Description | Mitigation |
|---|---|---|
| Internet access | Notebooks with internet access can exfiltrate data | Use VPC/VNet with no internet egress |
| Shared instances | Multiple users sharing notebook infrastructure | Use per-user instances with separate credentials |
| Persistent credentials | Long-lived credentials stored in notebook environments | Use temporary credentials, metadata service v2 (IMDSv2) |
| Package installation | Users install arbitrary packages that may be malicious | Restrict to approved package repositories |
| Root access | Notebook users with root can modify security controls | Restrict to non-root access |
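The mitigations above can be checked mechanically. The helper below is a sketch that audits a notebook-instance configuration dict; the field names mirror SageMaker's DirectInternetAccess and RootAccess settings, but the dict shape here is illustrative rather than an actual SDK response.

```python
# Illustrative helper: flag risky settings in a notebook-instance config.
# Field names mirror SageMaker's DirectInternetAccess / RootAccess options,
# but this dict shape is a sketch, not a real DescribeNotebookInstance reply.

def audit_notebook_config(config: dict) -> list[str]:
    findings = []
    if config.get("DirectInternetAccess") == "Enabled":
        findings.append("direct internet access enabled (exfiltration path)")
    if config.get("RootAccess") == "Enabled":
        findings.append("root access enabled (security controls modifiable)")
    if not config.get("SubnetId"):
        findings.append("no VPC subnet attached (not network-isolated)")
    return findings

risky = {"DirectInternetAccess": "Enabled", "RootAccess": "Enabled"}
for finding in audit_notebook_config(risky):
    print(finding)
```

Equivalent checks on Azure ML and Vertex AI would inspect VNet/VPC attachment and per-user isolation settings through their respective management APIs.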
Cross-Platform Security Checklist
| Security Control | What to Verify | Priority |
|---|---|---|
| IAM least privilege | Are ML roles scoped to the minimum necessary permissions? | Critical |
| Network isolation | Are ML resources in private subnets with no unnecessary internet access? | Critical |
| Data encryption | Is data encrypted at rest and in transit? Are keys managed properly? | High |
| Endpoint authentication | Are model serving endpoints properly authenticated? | High |
| Logging and monitoring | Are ML API calls, data access, and model invocations logged? | High |
| Model artifact integrity | Are model artifacts access-controlled and integrity-verified? | Medium |
| Notebook security | Are notebooks isolated, credentialed appropriately, and network-restricted? | High |
| Supply chain | Are ML frameworks, packages, and pre-trained models from trusted sources? | Medium |
For related topics, see Infrastructure Security, Supply Chain Security, and LangChain & LlamaIndex Security.
Related Topics
- Infrastructure Security: API Security -- API-level security testing for ML endpoints
- Supply Chain Security -- model artifact and dependency integrity
- LangChain & LlamaIndex Security -- framework-specific vulnerabilities on cloud platforms
- API Provider Security Comparison -- comparing security controls across LLM API providers
Why is the combination of iam:PassRole and sagemaker:* permissions a privilege escalation risk?