Cloud AI Security Cheat Sheet

Intermediate | 8 min read | Updated 2026-03-15

Quick reference comparing AI security controls across AWS, Azure, and GCP -- covering IAM, networking, encryption, monitoring, and AI-specific services.

cloud-security aws azure gcp cheat-sheet ai-services

A side-by-side comparison of AI security controls across AWS, Azure, and GCP. Use this during cloud AI security assessments to quickly identify the relevant services, configurations, and common misconfigurations for each provider.

AI/ML Services Comparison

Capability	AWS	Azure	GCP
ML Platform	SageMaker	Azure Machine Learning	Vertex AI
Hosted LLM API	Amazon Bedrock	Azure OpenAI Service	Vertex AI (Gemini, PaLM)
Notebook Environment	SageMaker Notebooks	Azure ML Notebooks	Vertex AI Workbench
Model Hosting	SageMaker Endpoints	Azure ML Endpoints	Vertex AI Endpoints
Content Safety	Bedrock Guardrails	Azure AI Content Safety	Vertex AI Safety Filters
Vector Database	OpenSearch (vector), MemoryDB	Azure AI Search	Vertex AI Vector Search
Data Labeling	SageMaker Ground Truth	Azure ML Data Labeling	Vertex AI Data Labeling
MLOps/Pipelines	SageMaker Pipelines	Azure ML Pipelines	Vertex AI Pipelines

IAM and Access Control

Service-Specific IAM

Control	AWS	Azure	GCP
Execution role	SageMaker Execution Role (IAM role attached to notebooks/endpoints)	Azure ML Workspace Identity (managed identity)	Vertex AI Service Account
Model invocation	`bedrock:InvokeModel`, `sagemaker:InvokeEndpoint`	Azure RBAC: `Cognitive Services OpenAI User`	`aiplatform.endpoints.predict` IAM permission
Admin access	`sagemaker:*` (overly broad -- scope down)	`Azure ML Workspace Owner/Contributor`	`aiplatform.admin` (overly broad -- scope down)
Cross-account	IAM resource policies, STS AssumeRole	Azure Lighthouse, cross-tenant access	Cross-project IAM bindings, VPC Service Controls
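The GCP column of the table above can be checked mechanically. The following is a minimal sketch, assuming the input is an IAM policy in the JSON shape returned by `gcloud projects get-iam-policy --format=json` (a `bindings` list of `role`/`members` pairs). The function name `review_bindings` and the choice of which roles count as "broad" are illustrative assumptions, not a provider-defined list.

```python
# Members that make a binding effectively public.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}
# Assumed shortlist of roles to scope down; extend it for your environment.
BROAD_ROLES = {"roles/owner", "roles/editor", "roles/aiplatform.admin"}


def review_bindings(policy: dict) -> list[dict]:
    """Return one finding per binding that is public or uses a broad role."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        members = binding.get("members", [])
        public = sorted(m for m in members if m in PUBLIC_MEMBERS)
        reasons = []
        if public:
            reasons.append("public member: " + ", ".join(public))
        if role in BROAD_ROLES:
            reasons.append("broad role")
        if reasons:
            findings.append({"role": role, "reasons": reasons})
    return findings
```

The sketch only reads the policy it is given; inherited bindings from folders or the organization would need to be fetched and reviewed separately.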

Common IAM Misconfigurations

Misconfiguration	Risk	Check
Wildcard permissions (`s3:*`, `storage.*`) on ML execution roles	Compromised notebook/endpoint can access all data	Review IAM policies for resource-scoped permissions
Overly broad model access (all users can invoke any model)	Unauthorized model usage, cost abuse, data exposure	Verify model invocation permissions are scoped per-user/role
Service account key files stored in notebooks or repos	Credential theft, persistent unauthorized access	Scan for key files; use workload identity federation instead
Public model endpoints (no auth required)	Anyone can query the model, exfiltrate training data	Verify authentication is required on all endpoints
Shared execution roles across dev/staging/prod	Compromise in dev escalates to prod	Use separate roles per environment
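The wildcard check in the first row can be scripted against an exported policy. This is a rough sketch, assuming an AWS IAM policy document already loaded as a Python dict (standard `Statement` / `Effect` / `Action` / `Resource` elements). The helper names are hypothetical, and the check deliberately flags only full wildcards such as `*` or `s3:*`, not partial patterns such as `s3:Get*`.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts either a single value or a list for these elements."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: dict) -> list[dict]:
    """Return Allow statements that use a full wildcard action or resource."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a for a in _as_list(stmt.get("Action"))
                   if a == "*" or a.endswith(":*")]
        resources = [r for r in _as_list(stmt.get("Resource")) if r == "*"]
        if actions or resources:
            findings.append(
                {"statement": index, "actions": actions, "resources": resources}
            )
    return findings
```

A non-empty result is a prompt for manual review rather than proof of a problem: some statements are legitimately broad but constrained by `Condition` blocks, which this sketch does not evaluate.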

Network Security

Control	AWS	Azure	GCP
Private endpoints	VPC Endpoints (Interface/Gateway)	Private Endpoints / Private Link	Private Service Connect
Network isolation	SageMaker in VPC, VPC Endpoints	VNet integration, Private Link	VPC Service Controls perimeter
Egress control	Security Groups, NACLs, VPC Endpoints	NSGs, Azure Firewall, UDRs	VPC Firewall Rules, VPC Service Controls
Service perimeter	No direct equivalent (use Service Control Policies + VPC Endpoints)	No direct equivalent (use Private Link + NSG)	VPC Service Controls (strongly recommended)
Private DNS	Route 53 Private Hosted Zones	Azure Private DNS	Cloud DNS Private Zones

Network Security Checklist

Model endpoints are in private subnets (not internet-facing)
VPC Endpoints / Private Link used for AI service API calls
Egress rules prevent unauthorized outbound connections (data exfiltration)
Training data storage is not publicly accessible
Network logging is enabled (AWS VPC Flow Logs / Azure NSG flow logs / GCP VPC Flow Logs)
DNS resolution for AI services routes through private endpoints
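The egress item in the checklist above lends itself to a quick automated pass. Below is a minimal sketch for AWS, assuming each security group is a dict in the shape returned by the EC2 `DescribeSecurityGroups` API (`IpPermissionsEgress` with `IpRanges` and `Ipv6Ranges`). The function name `open_egress_rules` is a hypothetical helper, and fetching the groups is left out.

```python
# CIDR blocks that mean "anywhere" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_egress_rules(security_group: dict) -> list[dict]:
    """Return egress rules that allow traffic to any destination."""
    findings = []
    for rule in security_group.get("IpPermissionsEgress", []):
        cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
        open_to = sorted(c for c in cidrs if c in OPEN_CIDRS)
        if open_to:
            findings.append({
                "group_id": security_group.get("GroupId"),
                # "-1" in IpProtocol means all protocols.
                "protocol": rule.get("IpProtocol"),
                "open_to": open_to,
            })
    return findings
```

Newly created security groups typically carry an allow-all egress rule, so expect findings on any group where nobody has tightened it. Whether that traffic can actually leave the VPC still depends on routing, NAT, and NACLs.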

Encryption

Control	AWS	Azure	GCP
At-rest (default)	AWS-managed keys (SSE-S3 for S3; `aws/ebs` key for EBS when encryption is enabled)	Microsoft-managed keys	Google-managed keys
At-rest (customer key)	KMS CMK (SSE-KMS)	Azure Key Vault (CMK)	Cloud KMS CMEK
In-transit	TLS 1.2+ (enforced by default for API calls)	TLS 1.2+ (enforced by default)	TLS 1.2+ (enforced by default)
Model artifacts	KMS encryption on S3/EBS	Key Vault encryption on storage	CMEK on GCS/Persistent Disk
Training data	KMS encryption on S3	Key Vault encryption on Blob Storage	CMEK on GCS
Notebook storage	KMS encryption on EBS volumes	Key Vault encryption on managed disks	CMEK on persistent disks

Encryption Assessment Points

Verify CMEK/CMK is used for regulated workloads (not just provider-managed keys)
Check that KMS/Key Vault access policies restrict who can use encryption keys
Verify key rotation policies are in place
Ensure model artifacts in transit between services use private channels
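The first assessment point (customer-managed versus provider-managed keys) can be sketched for S3 as follows. The input is assumed to be the response shape of the S3 `GetBucketEncryption` API. The function name and the three return labels are illustrative choices, and the check that treats `alias/aws/s3` as provider-managed is a simplifying assumption.

```python
def classify_bucket_encryption(encryption_response: dict) -> str:
    """Classify a bucket's default encryption from a GetBucketEncryption response."""
    config = encryption_response.get("ServerSideEncryptionConfiguration", {})
    for rule in config.get("Rules", []):
        default = rule.get("ApplyServerSideEncryptionByDefault", {})
        algorithm = default.get("SSEAlgorithm", "")
        key_id = default.get("KMSMasterKeyID", "")
        # A KMS algorithm with an explicit key that is not the AWS managed
        # S3 key is treated as customer-managed.
        if (algorithm.startswith("aws:kms") and key_id
                and not key_id.endswith("alias/aws/s3")):
            return "customer-managed-key"
        if algorithm:
            return "provider-managed-key"
    return "no-default-encryption"
```

This covers only the bucket default. Individual objects can be written with a different key, and key policy and rotation still need to be reviewed in KMS itself.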

Monitoring and Logging

Control	AWS	Azure	GCP
Control plane audit	CloudTrail	Azure Activity Log	Cloud Audit Logs
Data plane logging	CloudTrail Data Events, SageMaker Model Monitor	Azure Monitor Diagnostic Logs	Vertex AI Model Monitoring
API call logging	CloudTrail (Bedrock API calls)	Azure Monitor (OpenAI API calls)	Cloud Audit Logs (Vertex AI API calls)
Threat detection	GuardDuty	Microsoft Defender for Cloud	Security Command Center
Log aggregation	CloudWatch Logs, S3	Log Analytics Workspace	Cloud Logging
Alerting	CloudWatch Alarms, SNS	Azure Monitor Alerts	Cloud Monitoring Alerts

Key Audit Log Events to Monitor

Event	AWS CloudTrail	Azure Activity Log	GCP Audit Log
Model invocation	`bedrock:InvokeModel`, `sagemaker:InvokeEndpoint`	`Microsoft.CognitiveServices/accounts/deployments/completions/action`	`aiplatform.googleapis.com/PredictRequest`
Endpoint creation/modification	`sagemaker:CreateEndpoint`, `UpdateEndpoint`	`Microsoft.MachineLearningServices/workspaces/endpoints/write`	`aiplatform.googleapis.com/CreateEndpoint`
Training data access	S3 Data Events for training buckets	Blob Storage diagnostic logs	GCS data access audit logs
IAM changes	`iam:PutRolePolicy`, `AttachRolePolicy`	Role assignment changes	`SetIamPolicy` events
Model deployment	`sagemaker:CreateModel`	Model deployment events	`UploadModel`, `DeployModel`
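As an illustration of how the AWS column above might be turned into a filter, the sketch below scans a CloudTrail log file (the JSON document with a top-level `Records` array) for the listed events. The watchlist is copied from the table, with the IAM-style prefixes mapped to CloudTrail `eventSource` values. `flag_events` is a hypothetical helper, and whether a given event appears at all depends on how the trail's management and data events are configured.

```python
import json

# (eventSource, eventName) pairs taken from the AWS column of the table.
WATCHLIST = {
    ("bedrock.amazonaws.com", "InvokeModel"),
    ("sagemaker.amazonaws.com", "InvokeEndpoint"),
    ("sagemaker.amazonaws.com", "CreateEndpoint"),
    ("sagemaker.amazonaws.com", "UpdateEndpoint"),
    ("sagemaker.amazonaws.com", "CreateModel"),
    ("iam.amazonaws.com", "PutRolePolicy"),
    ("iam.amazonaws.com", "AttachRolePolicy"),
}


def flag_events(log_text: str) -> list[dict]:
    """Return a compact summary of watchlisted events in a CloudTrail log file."""
    flagged = []
    for record in json.loads(log_text).get("Records", []):
        key = (record.get("eventSource"), record.get("eventName"))
        if key in WATCHLIST:
            flagged.append({
                "time": record.get("eventTime"),
                "event": f"{key[0]}:{key[1]}",
                "actor": record.get("userIdentity", {}).get("arn"),
            })
    return flagged
```

In practice the same watchlist would more likely live in an alerting rule on the log aggregation service; the script form is mainly useful for spot checks during an assessment.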

Content Safety Services

Feature	AWS Bedrock Guardrails	Azure AI Content Safety	GCP Vertex AI Safety
Content filtering	Topic denial, content filters, PII detection	Violence, self-harm, sexual, hate speech categories	Safety attribute thresholds
Custom policies	Custom word filters, topic policies	Custom blocklists	Custom safety filters
PII handling	PII detection and masking	PII detection (separate service)	DLP integration
Grounding	Grounding checks (factuality)	Groundedness detection	Grounding with Google Search
Prompt injection defense	Prompt attack detection	Prompt Shield	Safety filters with jailbreak detection

Content Safety Limitations

Common Cloud AI Attack Surfaces

Attack Surface	Description	Assessment Focus
Public model endpoints	Endpoints accessible without authentication	Check endpoint authentication configuration
Over-privileged execution roles	ML roles with excessive permissions	Review IAM policies for least privilege
Training data in public storage	S3/GCS/Blob buckets with public access	Check bucket policies and ACLs
Model artifacts in shared storage	Models stored without access controls	Verify model storage access restrictions
Notebook instances with internet access	Notebooks that can reach the internet	Check VPC/network configuration for notebooks
Unencrypted data at rest	Training data or models without encryption	Verify encryption configuration (CMEK for compliance)
Logging gaps	Missing audit logs for model invocations	Verify comprehensive logging is enabled
Cross-account/cross-project access	Overly permissive cross-boundary access	Review cross-account IAM and VPC peering
Insecure model serialization	Pickle-based models loaded without verification	Check model format and provenance verification
Missing network segmentation	AI workloads in same network as other services	Verify network isolation for AI infrastructure
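For the insecure model serialization row, a pickle file can be inspected without loading it. The sketch below uses the standard library's `pickletools.genops` to list opcodes that import names or call objects during unpickling. The function name and the opcode shortlist are assumptions made for illustration.

```python
import pickletools

# Opcodes that import a global or invoke a callable while unpickling.
IMPORT_OR_CALL_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX", "REDUCE",
}


def risky_opcodes(data: bytes) -> set[str]:
    """Return the import/call opcodes present in a pickle, without loading it."""
    seen = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return seen & IMPORT_OR_CALL_OPCODES
```

Most real model pickles contain these opcodes because they reconstruct library objects, so a non-empty result does not prove malice. It does confirm that loading the file can run code, which is the reason to verify provenance or prefer a data-only format.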

Quick Assessment Workflow

1. Enumerate AI Resources
List all AI/ML services in use: model endpoints, notebooks, training jobs, pipelines, data stores. Use the provider's asset inventory tools.
2. Review IAM Configuration
Check execution roles, service accounts, and user permissions against least-privilege principles. Flag wildcard permissions and overly broad access.
3. Check Network Isolation
Verify that model endpoints and training infrastructure are in private subnets with appropriate egress controls. Check for public accessibility.
4. Verify Encryption
Confirm encryption at rest and in transit. For regulated workloads, verify CMEK/CMK is used with appropriate key management.
5. Audit Logging
Confirm that control plane and data plane logs are enabled, retained, and monitored. Verify that model invocation logging captures prompt/response data for forensics.
6. Test Content Safety
Evaluate the cloud provider's content safety service against standard attack categories. Identify gaps that require application-level guardrails.
7. Review Supply Chain
Check model provenance, dependency management, and artifact integrity verification. Identify any use of pickle-based model formats.
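The content safety step can be made repeatable with a small harness. This is only a sketch: the `is_blocked` callable stands in for whatever wrapper you write around the provider's safety service, and the category labels and test prompts are supplied by you. None of these names come from a provider SDK.

```python
from collections import defaultdict
from typing import Callable, Iterable


def block_rate_by_category(
    cases: Iterable[tuple[str, str]],
    is_blocked: Callable[[str], bool],
) -> dict[str, float]:
    """Compute the fraction of test prompts blocked in each attack category.

    cases: (category, prompt) pairs.
    is_blocked: hypothetical wrapper that sends a prompt to the safety
        service and returns True when the service blocks it.
    """
    totals: dict[str, int] = defaultdict(int)
    blocked: dict[str, int] = defaultdict(int)
    for category, prompt in cases:
        totals[category] += 1
        if is_blocked(prompt):
            blocked[category] += 1
    return {category: blocked[category] / totals[category] for category in totals}
```

Categories with a low block rate are the gaps that the workflow says need application-level guardrails.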

