Data Residency and Sovereignty for Cloud AI
Managing data residency, sovereignty, and cross-border transfer requirements for cloud AI services including GDPR, AI Act, and regional model deployment strategies.
Overview
Data residency and sovereignty are among the most complex challenges in cloud AI security. When an organization sends a prompt to a cloud-hosted foundation model, the data in that prompt may cross jurisdictional boundaries, be logged in regions the organization did not anticipate, or be used for model improvement in ways that violate data protection commitments. For custom model training, the training data, model weights, and inference results all have their own residency requirements that may differ from each other.
The regulatory landscape makes this even more demanding. The EU AI Act (entered into force in August 2024, with obligations phasing in through 2027) imposes requirements on providers and deployers of AI systems, several of which intersect with data sovereignty. GDPR's data transfer restrictions apply to any personal data in prompts or training sets. Industry-specific regulations (HIPAA, financial services rules, government classification requirements) add further constraints.
This article maps out the data flows that create residency risks, shows how to configure region-locked deployments on each major cloud, and provides assessment techniques for verifying that data stays where it should.
Understanding AI Data Flows
Where Data Moves in Cloud AI
A single API call to a cloud AI service creates multiple data flows, each with its own residency implications:
User Input (Prompt)
|
v
[Application Layer] -- Region A
|
v
[API Gateway / Load Balancer] -- May route to any region
|
v
[AI Service Endpoint] -- Region B (configured)
|
+---> [Model Inference] -- Region B
| |
| v
| [Response Generation]
|
+---> [Logging / Telemetry] -- Region C (may differ)
|
+---> [Content Safety Filter] -- Region D (may differ)
|
+---> [Abuse Monitoring] -- Region E (global)
|
+---> [Model Improvement] -- Varies by provider opt-in/opt-out
|
v
[Response to Application]
The critical insight is that even when you deploy the AI service in a specific region, supporting services (logging, content filtering, abuse monitoring) may process data in different regions. Each cloud provider handles this differently.
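The flows in the diagram above can be captured as a small inventory that mechanically flags anything leaving the configured region. The following sketch uses illustrative flow names and regions; it is not tied to any provider's actual telemetry paths:

```python
from dataclasses import dataclass

@dataclass
class DataFlow:
    name: str
    source_region: str
    destination_region: str
    carries_prompt_data: bool

def find_cross_region_flows(flows):
    """Return flows whose destination differs from their source region."""
    return [f for f in flows if f.destination_region != f.source_region]

# Illustrative flows mirroring the diagram; regions are examples only
flows = [
    DataFlow("model_inference", "eu-west-1", "eu-west-1", True),
    DataFlow("abuse_monitoring", "eu-west-1", "global", True),
    DataFlow("usage_metadata", "eu-west-1", "us-east-1", False),
]

# Flows carrying prompt data out of the configured region need review first
risky = [f.name for f in find_cross_region_flows(flows) if f.carries_prompt_data]
print(risky)  # ['abuse_monitoring']
```

Keeping this inventory in code (rather than a spreadsheet) makes it easy to re-check after every architecture change.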
Data Classification for AI Workloads
Before configuring residency controls, classify the data flowing through your AI system:
| Data Type | Examples | Residency Sensitivity | Typical Regulation |
|---|---|---|---|
| Prompts containing PII | Customer support conversations, HR queries | High | GDPR Art. 44-49, CCPA |
| Prompts with trade secrets | Code review, strategy documents | High | Trade secret law, NDA obligations |
| Training data with personal data | Customer interaction logs, annotated datasets | Very High | GDPR Art. 44-49, sector regulations |
| Model weights (fine-tuned) | Custom models trained on organization data | Medium | May encode personal data (GDPR recital 26) |
| Inference results | Model outputs, classifications, summaries | Medium-High | Derived data may inherit source classification |
| Usage metadata | Token counts, latency, model version | Low | Generally not regulated |
| Content filter logs | Blocked prompts and reasons | High | Contains the harmful/sensitive content itself |
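One way to make the classification table actionable is a lookup that drives configuration decisions. The category keys and sensitivity labels below are illustrative, not a normative taxonomy:

```python
# Illustrative mapping of the classification table above; the keys and
# sensitivity labels are examples, not a normative taxonomy.
RESIDENCY_SENSITIVITY = {
    "prompt_with_pii": "high",
    "prompt_with_trade_secrets": "high",
    "training_data_personal": "very_high",
    "fine_tuned_weights": "medium",
    "inference_results": "medium_high",
    "usage_metadata": "low",
    "content_filter_logs": "high",
}

def requires_region_lock(data_type: str) -> bool:
    """Fail closed: unknown data types are treated as sensitive."""
    return RESIDENCY_SENSITIVITY.get(data_type, "high") != "low"

print(requires_region_lock("usage_metadata"))  # False
```

The fail-closed default matters: data that has not been classified yet should be treated as if it needs a region-locked path.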
AWS Data Residency for AI Services
Bedrock Region Configuration
AWS Bedrock processes data in the region where the service is invoked. However, model availability varies by region, creating tension between residency requirements and model access.
import boto3
from botocore.config import Config
def create_region_locked_bedrock_client(
region: str,
enforce_region: bool = True,
) -> dict:
"""Create a Bedrock client locked to a specific region with verification."""
# Prevent any cross-region fallback
config = Config(
region_name=region,
retries={"max_attempts": 3, "mode": "standard"},
)
session = boto3.Session(region_name=region)
bedrock = session.client("bedrock", config=config)
bedrock_runtime = session.client("bedrock-runtime", config=config)
if enforce_region:
        # Probe the regional endpoint; this call fails if the region is invalid or unreachable
available_models = bedrock.list_foundation_models()
model_count = len(available_models.get("modelSummaries", []))
return {
"region": region,
"runtime_endpoint": f"https://bedrock-runtime.{region}.amazonaws.com",
"management_endpoint": f"https://bedrock.{region}.amazonaws.com",
"models_available": model_count,
"clients": {
"bedrock": bedrock,
"bedrock_runtime": bedrock_runtime,
},
}
return {"bedrock": bedrock, "bedrock_runtime": bedrock_runtime}
def generate_region_restriction_scp(allowed_regions: list) -> dict:
"""Generate an SCP that restricts Bedrock to specific regions."""
return {
"Version": "2012-10-17",
"Statement": [
{
"Sid": "RestrictBedrockRegions",
"Effect": "Deny",
"Action": [
"bedrock:*",
],
"Resource": "*",
"Condition": {
"StringNotEquals": {
"aws:RequestedRegion": allowed_regions
}
},
},
{
"Sid": "RestrictSageMakerRegions",
"Effect": "Deny",
"Action": [
"sagemaker:*",
],
"Resource": "*",
"Condition": {
"StringNotEquals": {
"aws:RequestedRegion": allowed_regions
}
},
},
],
    }
AWS Data Processing Addendum Considerations
AWS commits to processing data in the region you select for Bedrock invocations. However, several ancillary data flows require attention:
- Model invocation logging: If enabled, logs are stored in the S3 bucket or CloudWatch log group you specify. These must be in the same region as the Bedrock endpoint.
- Abuse monitoring: AWS may process data for abuse detection outside the selected region, subject to the AWS Data Processing Addendum. Organizations with strict residency requirements should review the DPA carefully.
- CloudTrail logs: Management events are logged in the region where the API call is made. Data events (if enabled) follow the same pattern. CloudTrail organization trails may aggregate logs to a central bucket in a different region.
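The SCP generator above produces only the policy document; creating and attaching it is a separate AWS Organizations call. A minimal sketch, assuming management-account credentials (the policy name and description here are illustrative placeholders):

```python
import json

def build_scp_request(policy_document: dict, name: str = "ai-region-restriction") -> dict:
    """Build the create_policy request body for AWS Organizations.

    The name and description are illustrative placeholders.
    """
    return {
        "Name": name,
        "Description": "Deny Bedrock and SageMaker outside approved regions",
        "Type": "SERVICE_CONTROL_POLICY",
        "Content": json.dumps(policy_document),
    }

# To apply it (requires management-account credentials):
#   import boto3
#   org = boto3.client("organizations")
#   resp = org.create_policy(**build_scp_request(scp_document))
#   org.attach_policy(
#       PolicyId=resp["Policy"]["PolicySummary"]["Id"],
#       TargetId=target_ou_id,  # an OU or account ID
#   )
```

Note that SCPs only constrain member accounts; they do not apply to the management account itself, so AI workloads should never run there.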
def audit_bedrock_data_residency(
session: boto3.Session,
target_region: str,
) -> dict:
"""Audit Bedrock deployment for data residency compliance."""
findings = []
# Check invocation logging destination
bedrock = session.client("bedrock", region_name=target_region)
try:
logging_config = bedrock.get_model_invocation_logging_configuration()
config = logging_config.get("loggingConfig", {})
# Check S3 logging destination region
s3_config = config.get("s3Config", {})
if s3_config:
bucket_name = s3_config.get("bucketName", "")
s3 = session.client("s3")
try:
location = s3.get_bucket_location(Bucket=bucket_name)
bucket_region = location.get("LocationConstraint") or "us-east-1"
if bucket_region != target_region:
findings.append({
"severity": "HIGH",
"finding": "Invocation logs stored in different region",
"detail": f"Bedrock is in {target_region} but logs go to "
f"S3 bucket in {bucket_region}. This creates "
f"cross-region data transfer of prompts/responses.",
})
except Exception:
pass
# Check CloudWatch logging destination
cw_config = config.get("cloudWatchConfig", {})
if cw_config:
log_group = cw_config.get("logGroupName", "")
findings.append({
"severity": "INFO",
"finding": f"CloudWatch logging to {log_group}",
"detail": "CloudWatch log groups are regional. Verify this log group "
"is in the same region as the Bedrock endpoint.",
})
except Exception:
findings.append({
"severity": "MEDIUM",
"finding": "Cannot verify invocation logging configuration",
})
# Check CloudTrail configuration
cloudtrail = session.client("cloudtrail", region_name=target_region)
try:
trails = cloudtrail.describe_trails()
for trail in trails.get("trailList", []):
if trail.get("IsMultiRegionTrail"):
s3_bucket = trail.get("S3BucketName", "")
findings.append({
"severity": "MEDIUM",
"finding": "Multi-region CloudTrail trail detected",
"detail": f"Trail {trail['Name']} aggregates logs from all regions "
f"to bucket {s3_bucket}. Bedrock management events from "
f"{target_region} may be stored in a different region.",
})
except Exception:
pass
    return {"target_region": target_region, "findings": findings}
Azure Data Residency for AI Services
Azure OpenAI Regional Deployment
Azure OpenAI Service processes data in the region where the resource is created. Azure provides stronger regional commitments than some other providers through its EU Data Boundary program.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
def audit_azure_openai_data_residency(
subscription_id: str,
resource_group: str,
account_name: str,
required_geography: str = "europe",
) -> dict:
"""Audit Azure OpenAI resource for data residency compliance."""
credential = DefaultAzureCredential()
client = CognitiveServicesManagementClient(credential, subscription_id)
findings = []
account = client.accounts.get(resource_group, account_name)
resource_location = account.location
# Map Azure regions to geographies
eu_regions = [
"westeurope", "northeurope", "francecentral", "francesouth",
"germanywestcentral", "germanynorth", "swedencentral",
"swedensouth", "switzerlandnorth", "switzerlandwest",
"norwayeast", "norwaywest", "uksouth", "ukwest",
]
if required_geography == "europe" and resource_location not in eu_regions:
findings.append({
"severity": "CRITICAL",
"finding": f"Resource deployed outside EU: {resource_location}",
            "detail": "Azure OpenAI resources processing EU personal data "
                      "should run in an EU region unless the transfer is "
                      "covered by an adequacy decision or another GDPR "
                      "Art. 44-49 mechanism.",
})
# Check data storage and processing commitments
# Azure OpenAI with abuse monitoring opt-out
if account.properties.restrict_outbound_network_access:
findings.append({
"severity": "INFO",
"finding": "Outbound network access restricted",
"detail": "Reduces risk of data leaving the deployment region.",
})
# Check diagnostic settings for cross-region logging
# (Diagnostic settings might send data to Log Analytics in another region)
from azure.mgmt.monitor import MonitorManagementClient
monitor = MonitorManagementClient(credential, subscription_id)
resource_uri = account.id
for setting in monitor.diagnostic_settings.list(resource_uri):
if setting.workspace_id:
# Check if Log Analytics workspace is in the same region
findings.append({
"severity": "MEDIUM",
"finding": "Diagnostic logs sent to Log Analytics workspace",
"detail": f"Verify workspace {setting.workspace_id} is in the same "
f"region as the OpenAI resource ({resource_location}). "
f"Cross-region diagnostic data transfer can violate residency.",
})
# List deployments and check model availability
deployments = client.deployments.list(resource_group, account_name)
for deployment in deployments:
findings.append({
"severity": "INFO",
"finding": f"Model deployment: {deployment.name}",
"detail": f"Model: {deployment.properties.model.name}, "
f"Version: {deployment.properties.model.version}. "
f"Data processed in {resource_location}.",
})
return {
"resource_location": resource_location,
"required_geography": required_geography,
"findings": findings,
    }
Azure EU Data Boundary
For organizations operating under EU data sovereignty requirements, Azure's EU Data Boundary commitment (generally available since January 2024) ensures that customer data for covered services stays within the EU. Azure OpenAI is included in the EU Data Boundary when:
- The resource is deployed in an EU region
- Abuse monitoring data processing stays within the EU (organizations can request modified abuse monitoring that keeps data in-region)
- Content filtering logs remain in the deployment region
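Region membership checks like the one below can guard deployments at provisioning time. The region list is an illustrative snapshot and should be verified against Microsoft's current EU Data Boundary documentation; note that UK regions sit outside the EU Data Boundary even though the UK holds a GDPR adequacy decision:

```python
# Illustrative snapshot of EU Data Boundary regions (EU plus EFTA);
# verify against Microsoft's published list before relying on it.
# UK regions are deliberately excluded: the UK is outside the EU Data
# Boundary despite its GDPR adequacy decision.
EU_DATA_BOUNDARY_REGIONS = {
    "westeurope", "northeurope", "francecentral", "francesouth",
    "germanywestcentral", "germanynorth", "swedencentral",
    "switzerlandnorth", "switzerlandwest", "norwayeast", "norwaywest",
    "italynorth", "spaincentral", "polandcentral",
}

def within_eu_data_boundary(region: str) -> bool:
    """Check an Azure region name against the illustrative EU boundary list."""
    return region.lower().replace(" ", "") in EU_DATA_BOUNDARY_REGIONS

print(within_eu_data_boundary("westeurope"))  # True
print(within_eu_data_boundary("uksouth"))     # False
```

A check like this is a useful pre-deployment gate in CI/CD pipelines that provision Azure OpenAI resources.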
GCP Data Residency for AI Services
Vertex AI Regional Configuration
GCP Vertex AI endpoints are regional resources. Data processed by Vertex AI stays in the specified region for inference. However, some features (such as model evaluation and certain pipeline operations) may process data in the US even when the endpoint is in another region.
from google.cloud import aiplatform
def audit_vertex_data_residency(
project_id: str,
required_region: str = "europe-west4",
) -> dict:
"""Audit Vertex AI resources for data residency compliance."""
findings = []
# Check all locations where Vertex AI resources exist
locations_to_check = [
"us-central1", "us-east1", "us-west1",
"europe-west1", "europe-west4", "europe-west2",
"asia-southeast1", "asia-northeast1",
]
for location in locations_to_check:
try:
aiplatform.init(project=project_id, location=location)
endpoints = aiplatform.Endpoint.list()
models = aiplatform.Model.list()
if endpoints or models:
in_required = location == required_region
severity = "INFO" if in_required else "HIGH"
findings.append({
"severity": severity,
"location": location,
"finding": f"Vertex AI resources found in {location}",
"detail": f"{len(endpoints)} endpoints, {len(models)} models. "
+ ("In required region." if in_required
else f"Outside required region ({required_region})."),
})
except Exception:
continue
# Check organization policy constraints for location restriction
try:
from google.cloud import orgpolicy_v2
orgpolicy_client = orgpolicy_v2.OrgPolicyClient()
policy = orgpolicy_client.get_policy(
name=f"projects/{project_id}/policies/gcp.resourceLocations"
)
if policy.spec and policy.spec.rules:
allowed_locations = []
for rule in policy.spec.rules:
if rule.values and rule.values.allowed_values:
allowed_locations.extend(rule.values.allowed_values)
findings.append({
"severity": "INFO",
"finding": "Organization policy restricts resource locations",
"detail": f"Allowed locations: {', '.join(allowed_locations)}",
})
else:
findings.append({
"severity": "HIGH",
"finding": "No location restriction organization policy",
"detail": "Without gcp.resourceLocations constraint, Vertex AI "
"resources can be created in any region.",
})
except Exception:
findings.append({
"severity": "MEDIUM",
"finding": "Cannot check organization policy for location restrictions",
})
    return {"required_region": required_region, "findings": findings}
GCP Organization Policy Constraints
Use GCP Organization Policy to enforce data residency at the organizational level:
def recommend_residency_org_policies(target_geography: str) -> dict:
"""Recommend organization policies for data residency enforcement."""
eu_locations = [
"in:eu-locations", # GCP's built-in EU location group
]
return {
"resource_location_constraint": {
"constraint": "constraints/gcp.resourceLocations",
"description": "Restrict where Vertex AI and other resources can be created",
"spec": {
"rules": [{
"values": {
"allowedValues": eu_locations if target_geography == "eu" else [],
}
}]
},
},
"vertex_ai_specific": {
"constraint": "constraints/aiplatform.allowedModels",
"description": "Restrict which models can be deployed (some models "
"may have different data processing locations)",
"note": "Not all models are available in all regions. Restricting "
"models to those available in your target region prevents "
"users from deploying models that require cross-region processing.",
},
"cloud_storage": {
"constraint": "constraints/storage.locations",
"description": "Restrict Cloud Storage bucket locations for training data",
"spec": {
"rules": [{
"values": {
"allowedValues": eu_locations if target_geography == "eu" else [],
}
}]
},
},
    }
Compliance Assessment Framework
Data Flow Mapping Checklist
For each AI workload, map and document these data flows:
- Prompt data path: Application -> API gateway -> AI service endpoint -> model inference -> response
- Logging data path: AI service -> logging service -> storage (S3/GCS/Azure Storage)
- Content filtering data path: AI service -> content safety service -> filter logs
- Training data path: Storage -> AI service -> model artifacts -> artifact storage
- Monitoring data path: AI service -> telemetry service -> monitoring dashboard
- Backup/DR data path: Primary region -> replication target
For each path, document:
- Source and destination regions
- Whether personal data is present
- Whether cross-border transfer occurs
- What legal basis covers the transfer (if cross-border)
- What technical controls enforce the documented path
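The documentation checklist above can be encoded as a record type so that gaps are flagged mechanically rather than discovered during an audit. A minimal sketch with illustrative field names:

```python
from dataclasses import dataclass, field

@dataclass
class DocumentedDataFlow:
    """One row of the data flow map; field names are illustrative."""
    path: str
    source_region: str
    destination_region: str
    contains_personal_data: bool
    transfer_legal_basis: str = ""
    enforcing_controls: list = field(default_factory=list)

    def gaps(self) -> list:
        """Flag cross-border personal-data flows missing a legal basis or controls."""
        issues = []
        cross_border = self.source_region != self.destination_region
        if cross_border and self.contains_personal_data and not self.transfer_legal_basis:
            issues.append("cross-border personal data without documented legal basis")
        if not self.enforcing_controls:
            issues.append("no technical control enforces the documented path")
        return issues

flow = DocumentedDataFlow(
    path="AI service -> logging -> S3",
    source_region="eu-west-1",
    destination_region="us-east-1",
    contains_personal_data=True,
)
print(flow.gaps())
```

Running `gaps()` across every documented flow gives a quick residency posture report that can be regenerated after each change.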
Regulatory Mapping
| Regulation | Applies When | Key Requirement | Cloud AI Implication |
|---|---|---|---|
| GDPR Art. 44-49 | Processing EU personal data | Adequate transfer mechanisms for cross-border | AI endpoints must be in EU or covered by adequacy decision |
| EU AI Act Art. 10 | Training high-risk AI systems | Data governance including geographic considerations | Training data location must be documented and controlled |
| CCPA/CPRA | Processing California consumer data | Disclosure of cross-border transfers | AI service data processing locations must be disclosed |
| China PIPL | Processing Chinese personal data | Data localization for certain categories | AI processing must remain in mainland China for covered data |
| India DPDP Act | Processing Indian personal data | Government-notified transfer restrictions | May require in-country AI processing for certain data |
References
- European Commission, "Adequacy decisions," https://commission.europa.eu/law/law-topic/data-protection/international-dimension-data-protection/adequacy-decisions_en
- European Parliament, "Regulation (EU) 2024/1689 (AI Act)," https://eur-lex.europa.eu/eli/reg/2024/1689
- NIST, "AI Risk Management Framework (AI RMF 1.0)," January 2023, https://www.nist.gov/itl/ai-risk-management-framework
- Microsoft, "Azure EU Data Boundary," https://learn.microsoft.com/en-us/privacy/eudb/eu-data-boundary-learn
- AWS, "AWS Data Residency," https://aws.amazon.com/compliance/data-residency/
Review Questions
- An organization deploys Azure OpenAI in the westeurope region for GDPR compliance. What additional data flow could create a cross-border transfer?
- What is the most effective technical control for preventing AI resources from being created outside approved regions on GCP?