Capstone: Cloud AI Security Assessment
Assess AI deployment security across AWS, Azure, and GCP cloud platforms, producing a comprehensive cloud AI security assessment report.
Overview
AI workloads in the cloud introduce security considerations that go beyond traditional cloud security assessments. Model endpoints need protection from adversarial inputs. Training data in object storage needs access controls that account for ML pipeline service accounts. Model registries can become supply chain attack vectors. GPU instances carry unique cost exposure risks. And the managed AI services from each cloud provider (SageMaker, Azure AI, Vertex AI) have their own security configurations that most cloud security teams have never audited.
This capstone project challenges you to conduct a security assessment of AI deployments across the three major cloud providers. You will evaluate infrastructure security, model endpoint protection, data pipeline security, and cloud-specific AI service configurations.
Prerequisites
- Cloud AI Security — cloud-specific AI security considerations
- Infrastructure Security — general infrastructure assessment methodology
- Training Pipeline Attacks — supply chain and pipeline risks
- Defenses and Guardrails — security controls to assess
- Basic familiarity with at least one major cloud provider (AWS, Azure, or GCP)
- Understanding of IAM, networking, and object storage concepts
Project Brief
Scenario
You are a cloud security specialist at Stratos Security Consulting. Your client, AeroTech Dynamics, is an aerospace engineering firm that has deployed AI workloads across multiple cloud providers:
AWS Deployment:
- SageMaker endpoints hosting computer vision models for quality inspection
- S3 buckets containing training data (factory floor images, labeled defect data)
- Lambda functions for inference preprocessing and postprocessing
- SageMaker Model Registry for model versioning and deployment
Azure Deployment:
- Azure OpenAI Service powering an internal engineering assistant chatbot
- Azure AI Search (formerly Cognitive Search) for RAG over engineering documentation
- Azure Blob Storage for document uploads and embedding caches
- Azure Key Vault for API key management
GCP Deployment:
- Vertex AI pipelines for model training and evaluation
- Cloud Storage buckets for training datasets and model artifacts
- Cloud Functions for inference API middleware
- Artifact Registry for container images used in training pipelines
AeroTech wants a security assessment focused specifically on their AI workloads — not a general cloud security audit, but an AI-specific assessment that covers model security, data pipeline integrity, and AI service configuration.
Assessment Scope
## In Scope
- AI service configurations (SageMaker, Azure OpenAI, Vertex AI)
- Model endpoint security (authentication, authorization, input validation)
- Training data storage and access controls
- Model registry and artifact security
- AI pipeline security (training, evaluation, deployment)
- IAM roles and policies specific to AI workloads
- Network security for AI endpoints
- Cost and resource controls for AI compute
## Out of Scope
- General cloud security posture (VPC, general IAM, compliance)
- Non-AI application workloads
- Physical security and on-premises infrastructure
- Social engineering and phishing
Deliverables
Primary Deliverables
| Deliverable | Description | Weight |
|---|---|---|
| Assessment report | Cloud AI security assessment covering all three providers | 40% |
| Configuration review | Detailed review of AI service configurations with specific misconfigurations | 25% |
| Risk matrix | Cloud AI risk matrix mapping threats to assets across providers | 15% |
| Remediation guide | Provider-specific remediation steps with IaC examples | 20% |
Rubric Criteria
- Coverage (20%) — All three providers assessed with AI-specific (not generic cloud) findings
- Technical Depth (25%) — Findings demonstrate understanding of cloud AI service internals and specific misconfigurations
- Provider Specificity (15%) — Remediation is specific to each provider's service, not generic advice
- Risk Prioritization (20%) — Findings are prioritized by realistic exploitation likelihood and business impact
- Actionability (20%) — Remediation includes specific configuration changes, CLI commands, or IaC snippets
Phased Approach
Phase 1: Reconnaissance and Scoping (2 hours)
Inventory AI assets across providers
Build a comprehensive inventory of AI-related resources across all three clouds: endpoints, storage, registries, pipelines, service accounts, and network configurations. Use cloud provider inventory tools (AWS Config, Azure Resource Graph, GCP Asset Inventory) or review IaC templates.
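As a sketch of how this inventory step might be automated, the snippet below filters a cross-cloud resource export down to AI-relevant asset types. The type strings and resource names here are hypothetical examples, not real output from AWS Config, Azure Resource Graph, or GCP Asset Inventory:

```python
# Sketch: filter a cross-cloud resource export down to AI-related assets.
# The resource types and names are illustrative stand-ins.

AI_RESOURCE_TYPES = {
    "aws": {"sagemaker:endpoint", "sagemaker:model-package", "s3:bucket"},
    "azure": {"Microsoft.CognitiveServices/accounts", "Microsoft.Search/searchServices"},
    "gcp": {"aiplatform.googleapis.com/Endpoint", "storage.googleapis.com/Bucket"},
}

def ai_inventory(resources):
    """Return only the resources whose type is AI-relevant for their provider."""
    return [
        r for r in resources
        if r["type"] in AI_RESOURCE_TYPES.get(r["provider"], set())
    ]

resources = [
    {"provider": "aws", "type": "sagemaker:endpoint", "name": "endpoint-qc-vision-prod"},
    {"provider": "aws", "type": "ec2:instance", "name": "bastion-01"},
    {"provider": "gcp", "type": "aiplatform.googleapis.com/Endpoint", "name": "qc-endpoint"},
]

print([r["name"] for r in ai_inventory(resources)])
# -> ['endpoint-qc-vision-prod', 'qc-endpoint']
```

In practice you would feed this the JSON export from each provider's inventory tool rather than a hand-built list.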
Map AI-specific IAM roles
Identify all IAM roles, service accounts, and permissions associated with AI workloads. Pay attention to: SageMaker execution roles, Azure AI service principals, Vertex AI service accounts, and any cross-service permissions (e.g., can the training pipeline role also deploy models?).
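One way to spot cross-service permission overlap is a rule check over exported role-to-action mappings. This sketch uses hypothetical role names and a single illustrative risky combination (a role that can both launch training jobs and deploy endpoints):

```python
# Sketch: flag roles whose permissions cross pipeline stages, e.g. a training
# role that can also deploy models. Role names and combos are illustrative.

RISKY_COMBOS = [
    ({"sagemaker:CreateTrainingJob"}, {"sagemaker:CreateEndpoint"}),
]

def cross_stage_roles(roles):
    """Return role names that hold actions from both sides of a risky combo."""
    findings = []
    for role, actions in roles.items():
        for train_set, deploy_set in RISKY_COMBOS:
            if train_set & actions and deploy_set & actions:
                findings.append(role)
    return findings

roles = {
    "pipeline-train-role": {"sagemaker:CreateTrainingJob", "sagemaker:CreateEndpoint"},
    "inference-role": {"sagemaker:InvokeEndpoint"},
}
print(cross_stage_roles(roles))  # -> ['pipeline-train-role']
```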
Identify data flows
Map how data flows through each AI pipeline: where training data is stored, how it reaches the training compute, where models are stored after training, how models are deployed to endpoints, and how inference requests reach the models. Each flow is an attack surface.
Phase 2: Service Configuration Review (4 hours)
Assess AWS SageMaker security
Review: endpoint authentication and authorization (IAM vs. API key), VPC configuration for endpoints (are they publicly accessible?), encryption at rest and in transit for model artifacts, SageMaker execution role permissions (least privilege), Model Registry access controls, and S3 bucket policies for training data.
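Parts of this configuration review can be scripted. The sketch below runs two rule checks over a model description dict; the field names (`VpcConfig`, `EnableNetworkIsolation`) follow the SageMaker DescribeModel API, but the configuration shown is fabricated:

```python
# Sketch: rule checks over a SageMaker model description, assuming a dict
# shaped like DescribeModel output. The config values are fabricated.

def audit_sagemaker_model(desc):
    """Return a list of human-readable issues found in the model config."""
    issues = []
    if not desc.get("VpcConfig"):
        issues.append("No VpcConfig: model traffic is not confined to a VPC")
    if not desc.get("EnableNetworkIsolation", False):
        issues.append("Network isolation disabled for model containers")
    return issues

desc = {
    "ModelName": "qc-vision",
    "VpcConfig": None,
    "EnableNetworkIsolation": False,
}
for issue in audit_sagemaker_model(desc):
    print(issue)
```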
Assess Azure OpenAI and AI Search security
Review: Azure OpenAI API key rotation and managed identity usage, content filtering configuration, network access restrictions (private endpoints vs. public), Azure AI Search index permissions (who can read what?), diagnostic logging configuration, and Key Vault access policies for API key storage.
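A minimal sketch of the Azure-side checks, assuming a configuration snapshot whose shape is invented for illustration (it is not real `az` CLI output):

```python
# Sketch: flag stale API keys, public network exposure, and raw-key auth
# from a hypothetical Azure OpenAI config snapshot.
from datetime import date

def audit_azure_openai(cfg, today, max_key_age_days=90):
    issues = []
    age = (today - cfg["key_last_rotated"]).days
    if age > max_key_age_days:
        issues.append(f"API key not rotated in {age} days")
    if cfg.get("public_network_access") == "Enabled":
        issues.append("Public network access enabled; prefer private endpoints")
    if not cfg.get("uses_managed_identity"):
        issues.append("Clients authenticate with raw keys instead of managed identity")
    return issues

cfg = {
    "key_last_rotated": date(2024, 1, 10),
    "public_network_access": "Enabled",
    "uses_managed_identity": False,
}
for issue in audit_azure_openai(cfg, today=date(2024, 6, 1)):
    print(issue)
```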
Assess GCP Vertex AI security
Review: Vertex AI endpoint authentication (service account vs. API key), VPC Service Controls enforcement, Cloud Storage bucket permissions for training data and model artifacts, Artifact Registry access controls for training containers, pipeline service account permissions, and audit logging configuration.
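Public-access checks on Cloud Storage can be expressed as a scan of IAM policy bindings for the `allUsers` and `allAuthenticatedUsers` principals. The policy below is fabricated but mirrors the shape of GCP's IAM policy JSON:

```python
# Sketch: scan Cloud Storage bucket IAM bindings for public principals.
# The policy content is fabricated for illustration.

PUBLIC_PRINCIPALS = {"allUsers", "allAuthenticatedUsers"}

def public_bindings(policy):
    """Return the roles granted to public principals in an IAM policy."""
    return [
        b["role"] for b in policy.get("bindings", [])
        if PUBLIC_PRINCIPALS & set(b.get("members", []))
    ]

policy = {"bindings": [
    {"role": "roles/storage.objectViewer",
     "members": ["allUsers"]},
    {"role": "roles/storage.admin",
     "members": ["serviceAccount:pipeline@proj.iam.gserviceaccount.com"]},
]}
print(public_bindings(policy))  # -> ['roles/storage.objectViewer']
```

Any non-empty result on a training-data or model-artifact bucket would be a finding.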
Cross-provider analysis
Identify inconsistencies across providers: is one provider's deployment significantly less secure than the others? Are there shared credentials or service accounts that cross provider boundaries? Is there a single point of compromise that would affect deployments on multiple providers?
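Shared credentials across clouds can be found mechanically once credentials are inventoried per provider. A sketch, with made-up identifiers:

```python
# Sketch: detect credential identifiers used in more than one provider's
# deployment -- a cross-cloud single point of compromise. Names are made up.

def shared_credentials(creds_by_provider):
    """Map each credential used by 2+ providers to the providers using it."""
    seen = {}
    for provider, creds in creds_by_provider.items():
        for cred in creds:
            seen.setdefault(cred, set()).add(provider)
    return {cred: sorted(providers)
            for cred, providers in seen.items()
            if len(providers) > 1}

creds = {
    "aws": {"ci-deploy-key", "sagemaker-exec"},
    "azure": {"ci-deploy-key", "aoai-sp"},
    "gcp": {"vertex-sa"},
}
print(shared_credentials(creds))  # -> {'ci-deploy-key': ['aws', 'azure']}
```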
Phase 3: Vulnerability Testing (4 hours)
Test model endpoint security
For accessible endpoints, test: authentication bypass attempts, rate limiting effectiveness, input validation (malformed inputs, oversized payloads), inference API abuse (model extraction through query volume), and error message information leakage.
Test data access controls
Verify that training data storage has appropriate access controls: can an unauthenticated user access S3 buckets? Can a read-only role modify training data? Are model artifacts in Cloud Storage or Blob Storage properly restricted? Test for public bucket misconfiguration.
Test model registry integrity
Assess whether the model registries (SageMaker Model Registry, Azure ML Registry, Artifact Registry) are protected against unauthorized model uploads or modifications. Could an attacker with compromised pipeline credentials replace a model with a backdoored version?
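One concrete integrity control to test for: does the deployment pipeline verify a downloaded artifact against the digest recorded at registration time? A sketch, assuming the registry stores a SHA-256 per model version:

```python
# Sketch: verify a model artifact against the digest recorded at registration.
# Assumes the registry stores a SHA-256 per version; data is illustrative.
import hashlib

def verify_artifact(artifact_bytes, registered_sha256):
    """Return True only if the artifact matches its registered digest."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    return actual == registered_sha256

model_bytes = b"model-weights-v3"
registered = hashlib.sha256(model_bytes).hexdigest()

print(verify_artifact(model_bytes, registered))            # True
print(verify_artifact(b"backdoored-weights", registered))  # False
```

If a pipeline deploys whatever bytes it pulls from the registry without a check like this, a swapped model goes undetected.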
Test cost exposure
Assess cost controls: are there budget alerts for AI compute? Can attackers trigger expensive training jobs or scale up GPU instances? Is there a maximum instance count limit? Test whether rate limiting prevents API abuse that would generate excessive costs.
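Worst-case cost exposure is simple arithmetic over instance pricing and scaling limits. The hourly rate and instance counts below are illustrative, not current cloud pricing:

```python
# Sketch: worst-case daily spend an attacker could drive, given an hourly
# rate and the effective instance-count ceiling. Numbers are illustrative.

def worst_case_daily_cost(hourly_rate, max_instances, hours=24):
    return hourly_rate * max_instances * hours

# A GPU inference instance at ~$5/hour with autoscaling capped at 20:
capped = worst_case_daily_cost(5.0, max_instances=20)
# The same rate with no cap below the account service quota of 200:
uncapped = worst_case_daily_cost(5.0, max_instances=200)

print(capped, uncapped)  # 2400.0 24000.0
```

The gap between the two figures is the argument for explicit autoscaling caps and budget alerts as security controls, not just cost hygiene.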
Phase 4: Reporting (2 hours)
Write the assessment report
Structure the report by provider and by risk category. For each finding: describe the misconfiguration, explain the AI-specific risk (why this matters more for AI workloads than generic compute), provide evidence, and include provider-specific remediation.
Build the risk matrix
Create a risk matrix that maps cloud AI threats (model theft, data poisoning, supply chain compromise, cost exhaustion, unauthorized access) against assets across all three providers. Highlight which provider has the strongest and weakest posture for each threat.
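The matrix can be assembled mechanically from individual findings by taking the highest severity per (threat, provider) cell. A sketch with fabricated findings:

```python
# Sketch: build the cross-provider risk matrix from individual findings,
# keeping the highest severity per (threat, provider) cell. Data is made up.

SEVERITY_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}

def build_matrix(findings):
    matrix = {}
    for f in findings:
        cell = (f["threat"], f["provider"])
        worse = SEVERITY_ORDER[f["severity"]] > SEVERITY_ORDER.get(
            SEVERITY_ORDER and matrix.get(cell, "LOW"), 0) if cell in matrix else True
        if cell not in matrix or SEVERITY_ORDER[f["severity"]] > SEVERITY_ORDER[matrix[cell]]:
            matrix[cell] = f["severity"]
    return matrix

findings = [
    {"threat": "Model theft", "provider": "AWS", "severity": "HIGH"},
    {"threat": "Model theft", "provider": "AWS", "severity": "CRITICAL"},
    {"threat": "Model theft", "provider": "Azure", "severity": "LOW"},
]
print(build_matrix(findings))
# -> {('Model theft', 'AWS'): 'CRITICAL', ('Model theft', 'Azure'): 'LOW'}
```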
Produce a remediation guide with IaC examples
For each finding, provide specific remediation steps, including CLI commands or IaC snippets (Terraform, CloudFormation, ARM templates) that implement the fix. This makes remediation immediately actionable for the client's DevOps team.
Example Output
Example Finding: AWS SageMaker
## Finding: SageMaker Endpoint Publicly Accessible Without IAM Auth
**Provider:** AWS
**Service:** SageMaker Real-time Inference
**Severity:** Critical
**Category:** Model Endpoint Security
### Description
The quality inspection model endpoint (endpoint-qc-vision-prod) is deployed
without VPC configuration and with IAM authentication disabled. The endpoint
is accessible from the public internet and accepts inference requests from
any source without authentication.
### AI-Specific Risk
Unlike a generic API endpoint, an exposed model endpoint enables:
- **Model extraction:** Attackers can send systematic queries to
reconstruct the model's decision boundary, stealing the proprietary
IP in the quality inspection model
- **Adversarial input testing:** Attackers can probe the model's
weaknesses to craft adversarial images that pass quality inspection
- **Cost exhaustion:** Unauthenticated access allows unlimited inference
requests, running up GPU instance costs
### Evidence
```bash
# Endpoint accessible without auth
aws sagemaker-runtime invoke-endpoint \
--endpoint-name endpoint-qc-vision-prod \
--body '{"image": "base64..."}' \
--content-type application/json \
output.json
# Returns: 200 OK with inference result
```
### Remediation
```terraform
resource "aws_sagemaker_endpoint_config" "qc_vision" {
name = "endpoint-qc-vision-prod"
production_variants {
variant_name = "primary"
model_name = aws_sagemaker_model.qc_vision.name
# ... other config
}
# Enable VPC configuration
vpc_config {
security_group_ids = [aws_security_group.sagemaker_sg.id]
subnets = aws_subnet.private[*].id
}
}
# Ensure IAM authentication is required (default, do not disable)
# Access via IAM role with least-privilege policy:
resource "aws_iam_policy" "sagemaker_invoke" {
name = "sagemaker-invoke-qc-vision"
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = "sagemaker:InvokeEndpoint"
Resource = aws_sagemaker_endpoint.qc_vision.arn
}]
})
}
```
### Example Risk Matrix
```markdown
| Threat Category | AWS (SageMaker) | Azure (OpenAI) | GCP (Vertex AI) |
|---------------------|------------------|-----------------|-----------------|
| Model theft | CRITICAL — public endpoint | LOW — private endpoint | MEDIUM — API key only |
| Data poisoning | MEDIUM — S3 ACL gaps | LOW — RBAC enforced | HIGH — public bucket |
| Supply chain | HIGH — registry open | MEDIUM — no signing | HIGH — no image scanning |
| Cost exhaustion | CRITICAL — no limits | LOW — quotas set | MEDIUM — no budget alert |
| Unauthorized access | HIGH — overprivileged roles | LOW — managed identity | MEDIUM — shared SA |
```
Hints
Why is a publicly accessible SageMaker model endpoint a more severe finding for an AI workload than a publicly accessible generic API endpoint?