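Lab: Containerized Model Breakout
Advanced · 5 min read · Updated 2026-03-15
Explore techniques for escaping from a containerized AI application to the host system, and test the container isolation boundaries of ML deployment environments.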
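Prerequisites
- Docker and container security fundamentals
- Linux system administration skills
- An understanding of namespace isolation and cgroups
- Familiarity with GPU passthrough and the NVIDIA container toolkit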
```bash
pip install docker python-dotenv
```
Background
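Container breakouts have distinctive attack vectors in AI environments. GPU passthrough requires elevated privileges, model files are often mounted from the host filesystem, and inference servers frequently run with more capabilities than they need. These ML-specific requirements weaken the container isolation that security teams expect.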
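The weakening settings described above can be spotted mechanically before anything is deployed. The following is a minimal sketch, assuming a service definition has already been loaded into a Python dict in the Compose short syntax. The names `audit_service` and `RISKY_CAPS` are hypothetical, and the set of capabilities treated as risky is an assumption rather than an official list.

```python
"""Flag risky settings in a Compose-style service definition (illustrative)."""

# Capabilities treated as risky in this sketch; the selection is an assumption.
RISKY_CAPS = {"SYS_ADMIN", "SYS_PTRACE", "SYS_MODULE"}


def audit_service(name: str, service: dict) -> list[str]:
    """Return human-readable warnings for one service definition."""
    warnings = []
    if service.get("privileged"):
        warnings.append(f"{name}: privileged mode disables most isolation")
    if service.get("network_mode") == "host":
        warnings.append(f"{name}: shares the host network namespace")
    for cap in service.get("cap_add", []):
        if cap in RISKY_CAPS:
            warnings.append(f"{name}: adds capability {cap}")
    # Only short-form volume strings ("source:target[:mode]") are handled.
    for volume in service.get("volumes", []):
        source = volume.split(":")[0]
        if source == "/var/run/docker.sock":
            warnings.append(f"{name}: mounts the Docker socket")
        elif source == "/":
            warnings.append(f"{name}: mounts the host root filesystem")
    return warnings


if __name__ == "__main__":
    example = {
        "ai-privileged": {
            "privileged": True,
            "volumes": [
                "/var/run/docker.sock:/var/run/docker.sock",
                "./models:/models",
            ],
        },
        "ai-hardened": {"read_only": True, "volumes": ["./models:/models:ro"]},
    }
    for svc_name, svc in example.items():
        for line in audit_service(svc_name, svc):
            print(line)
```

A check like this only sees what the configuration declares. It does not replace inspecting the running container from the inside, which is what the exercises below do.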
Lab Exercises
Deploy a vulnerable AI container
Create deliberately misconfigured containers that mirror common AI deployment anti-patterns.
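```yaml
# vulnerable-compose.yml
version: "3.8"
services:
  # Anti-pattern 1: privileged container for GPU access
  ai-privileged:
    image: python:3.11-slim
    container_name: ai-privileged  # fixed name so `docker exec ai-privileged` resolves
    privileged: true
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./models:/models
    command: sleep infinity

  # Anti-pattern 2: excessive volume mounts
  ai-overmounted:
    image: python:3.11-slim
    container_name: ai-overmounted
    volumes:
      - /:/host:ro
      - ./models:/models
    command: sleep infinity

  # Anti-pattern 3: host network plus added capabilities
  ai-hostnet:
    image: python:3.11-slim
    container_name: ai-hostnet
    network_mode: host
    cap_add:
      - SYS_PTRACE
      - SYS_ADMIN
    command: sleep infinity

  # Hardened baseline for comparison
  ai-hardened:
    image: python:3.11-slim
    container_name: ai-hardened
    read_only: true
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /tmp
    volumes:
      - ./models:/models:ro
    command: sleep infinity
```
```bash
docker compose -f vulnerable-compose.yml up -d
```
Enumerate container escape vectors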
From inside each container, identify potential escape paths.
```python
#!/usr/bin/env python3
"""Container recon: identify escape vectors from the inside (container_recon.py)."""
import os
import subprocess


def run(cmd: str) -> str:
    """Run a shell command; return stdout, or stderr if stdout is empty."""
    try:
        result = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=5
        )
        return result.stdout.strip() or result.stderr.strip()
    except Exception as e:
        return f"Error: {e}"


def enumerate_escape_vectors():
    findings = []

    # Check whether we are running as root
    if os.getuid() == 0:
        findings.append(("CRITICAL", "Running as root (uid=0)"))

    # Check for the Docker socket
    if os.path.exists("/var/run/docker.sock"):
        findings.append(("CRITICAL", "Docker socket mounted - full host access possible"))

    # Check for a host filesystem mount
    mounts = run("cat /proc/mounts")
    if "/host" in mounts:
        findings.append(("CRITICAL", "Host filesystem mounted"))

    # Report the capability bitmasks
    caps = run("grep Cap /proc/self/status")
    if caps:
        findings.append(("INFO", f"Capabilities: {caps[:200]}"))

    # Privileged-mode indicator: an unusually large number of device nodes
    devices = run("ls /dev/ | wc -l")
    if devices.isdigit() and int(devices) > 20:
        findings.append(("HIGH", f"Many devices visible ({devices}) - possible privileged mode"))

    # Check cgroup escape potential (heuristic: on cgroup v2 hosts a container
    # usually sees "0::/" here, so this can also fire in a normal container)
    cgroup = run("cat /proc/1/cgroup")
    if "docker" not in cgroup and "containerd" not in cgroup:
        findings.append(("HIGH", "May not be in container cgroup namespace"))

    # Check for GPU device access
    gpu_devices = run("ls /dev/nvidia* 2>/dev/null")
    if gpu_devices:
        findings.append(("MEDIUM", f"GPU devices accessible: {gpu_devices}"))

    # Check the network namespace (heuristic: Docker's default container
    # hostname is a 12-character ID, so a shorter name hints at host networking)
    hostname = run("hostname")
    if len(hostname) < 12:
        findings.append(("MEDIUM", f"Hostname '{hostname}' suggests host network"))

    # Record the host kernel version exposed through /proc
    host_info = run("cat /proc/version")
    findings.append(("INFO", f"Kernel: {host_info[:100]}"))

    return findings


if __name__ == "__main__":
    print("=== Container Escape Vector Enumeration ===\n")
    for severity, desc in enumerate_escape_vectors():
        print(f"  [{severity}] {desc}")
```
Run it inside each container:
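```bash
# The scripts are not mounted into the containers, so pipe them to the
# interpreter over stdin (-T disables TTY allocation).

# Test in the privileged container
docker compose -f vulnerable-compose.yml exec -T ai-privileged python3 - < container_recon.py

# Test in the over-mounted container
docker compose -f vulnerable-compose.yml exec -T ai-overmounted python3 - < container_recon.py

# Test in the hardened container
docker compose -f vulnerable-compose.yml exec -T ai-hardened python3 - < container_recon.py
```
Demonstrate Docker socket escape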
Use the mounted Docker socket to reach the host's Docker daemon.
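The demonstration script in this step shells out to curl, which may first have to be installed inside the container. As an alternative, here is a minimal sketch that sends the same read-only `GET /containers/json` request using only the Python standard library. `UnixHTTPConnection`, `summarize_containers`, and `list_containers` are hypothetical helper names introduced for illustration, and the socket path is assumed to be the one used in this lab.

```python
"""Query the Docker Engine API over a Unix socket with the standard library."""
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def summarize_containers(payload: str) -> list[str]:
    """Turn the JSON body of GET /containers/json into one line per container."""
    return [
        f"{c['Names'][0]}: {c['Image']} ({c['State']})"
        for c in json.loads(payload)
    ]


def list_containers(socket_path: str = "/var/run/docker.sock") -> list[str]:
    """List running containers visible through the mounted socket."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/containers/json")
        response = conn.getresponse()
        body = response.read().decode()
        if response.status != 200:
            raise RuntimeError(f"Docker API returned HTTP {response.status}")
    finally:
        conn.close()
    return summarize_containers(body)


if __name__ == "__main__":
    try:
        for line in list_containers():
            print(line)
    except (OSError, RuntimeError) as exc:
        print(f"Docker socket not usable: {exc}")
```

Because it needs no extra packages, a check like this also works in images where `apt-get` is unavailable or the root filesystem is read-only.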
```python
#!/usr/bin/env python3
"""Docker socket escape demonstration (socket_escape.py).

Shows why mounting the Docker socket inside a container is equivalent to
granting full host access.
"""
import json
import os
import shutil
import subprocess

SOCKET_PATH = "/var/run/docker.sock"


def check_docker_socket() -> bool:
    """Verify that the Docker socket is accessible."""
    if not os.path.exists(SOCKET_PATH):
        print("Docker socket not mounted. This escape vector is not available.")
        return False
    print(f"Docker socket found at {SOCKET_PATH}")
    return True


def list_host_containers():
    """List all containers on the host through the socket."""
    # Install curl if the image does not already ship it
    if shutil.which("curl") is None:
        subprocess.run(
            "apt-get update && apt-get install -y curl",
            shell=True, capture_output=True,
        )

    # Query the Docker API directly with curl
    result = subprocess.run(
        f"curl -s --unix-socket {SOCKET_PATH} http://localhost/containers/json",
        shell=True, capture_output=True, text=True,
    )
    if not result.stdout:
        print("No response from the Docker API.")
        return
    try:
        containers = json.loads(result.stdout)
    except json.JSONDecodeError:
        print(f"Unexpected response: {result.stdout[:200]}")
        return

    print(f"\nHost containers visible: {len(containers)}")
    for c in containers:
        print(f"  {c['Names'][0]}: {c['Image']} ({c['State']})")


def demonstrate_host_access():
    """Summarize what socket access makes possible on the host."""
    print("\n=== Docker Socket Escape ===")
    print("With Docker socket access, an attacker can:")
    print("1. List all host containers and their configurations")
    print("2. Create a new privileged container with host root mounted")
    print("3. Execute commands on the host via the new container")
    print("4. Access secrets, credentials, and other container data")
    print("\nThis is why Docker socket mounting is a CRITICAL misconfiguration.")


if __name__ == "__main__":
    if check_docker_socket():
        list_host_containers()
        demonstrate_host_access()
```
```bash
docker compose -f vulnerable-compose.yml exec -T ai-privileged python3 - < socket_escape.py
```
Implement container hardening
Apply and verify hardening measures that prevent the common escape vectors.
```python
#!/usr/bin/env python3
"""Container hardening verification checklist for AI deployments
(hardening_checklist.py)."""
import os


def proc_status() -> str:
    """Return the contents of /proc/self/status, or "" if it cannot be read."""
    try:
        with open("/proc/self/status") as fh:
            return fh.read()
    except OSError:
        return ""


CHECKS = [
    {
        "name": "Non-root user",
        "check": lambda: os.getuid() != 0,
        "fix": "Use USER directive in Dockerfile or --user flag",
    },
    {
        "name": "Read-only filesystem",
        "check": lambda: not os.access("/", os.W_OK),
        "fix": "Use --read-only flag with tmpfs for writable directories",
    },
    {
        "name": "No Docker socket",
        "check": lambda: not os.path.exists("/var/run/docker.sock"),
        "fix": "Never mount Docker socket in application containers",
    },
    {
        "name": "No host filesystem",
        "check": lambda: not os.path.exists("/host"),
        "fix": "Mount only specific directories needed, not entire host",
    },
    {
        "name": "Limited devices",
        "check": lambda: len(os.listdir("/dev")) < 20,
        "fix": "Avoid --privileged; use --device for specific GPU devices only",
    },
    {
        "name": "No new privileges",
        "check": lambda: "NoNewPrivs:\t1" in proc_status(),
        "fix": "Use --security-opt no-new-privileges:true",
    },
    {
        # Seccomp mode 2 means a filter (profile) is active
        "name": "Seccomp enabled",
        "check": lambda: "Seccomp:\t2" in proc_status(),
        "fix": "Use default or custom seccomp profile",
    },
]

if __name__ == "__main__":
    print("=== AI Container Hardening Checklist ===\n")
    passed = 0
    for check in CHECKS:
        # A check that raises is treated as a failure
        try:
            result = bool(check["check"]())
        except Exception:
            result = False
        status = "PASS" if result else "FAIL"
        passed += int(result)
        print(f"  [{status}] {check['name']}")
        if not result:
            print(f"         Fix: {check['fix']}")
    print(f"\nScore: {passed}/{len(CHECKS)}")
```
```bash
# Compare hardened vs. vulnerable
docker compose -f vulnerable-compose.yml exec -T ai-hardened python3 - < hardening_checklist.py
docker compose -f vulnerable-compose.yml exec -T ai-privileged python3 - < hardening_checklist.py
```
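The recon script prints capability sets as raw hex masks, and the hardening checklist does not inspect capabilities at all. The sketch below decodes the `CapEff` line of `/proc/self/status` into capability names, using the bit positions from the Linux `capability.h` header. The function names and the `DANGEROUS` set are hypothetical, and which capabilities count as escape-relevant is an assumption for illustration.

```python
"""Decode the capability bitmasks found in /proc/<pid>/status."""

# Capability names indexed by bit position (include/uapi/linux/capability.h).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]

# Capabilities this sketch treats as escape-relevant (an assumption).
DANGEROUS = {
    "CAP_SYS_ADMIN", "CAP_SYS_PTRACE", "CAP_SYS_MODULE",
    "CAP_SYS_RAWIO", "CAP_DAC_READ_SEARCH",
}


def decode_caps(hex_mask: str) -> list[str]:
    """Return the names of all capabilities whose bit is set in hex_mask."""
    mask = int(hex_mask, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def effective_caps(status_text: str) -> list[str]:
    """Extract and decode the CapEff line from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_caps(line.split()[1])
    return []


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as fh:
            caps = effective_caps(fh.read())
    except OSError:
        caps = []  # not Linux, or /proc is unavailable
    print("Effective:", ", ".join(caps) or "(none)")
    risky = sorted(DANGEROUS.intersection(caps))
    print("Escape-relevant:", ", ".join(risky) or "(none)")
```

Run in `ai-hostnet`, this should list `CAP_SYS_ADMIN` and `CAP_SYS_PTRACE` as escape-relevant, since the Compose file adds both. The mask commonly seen in a default Docker container, `00000000a80425fb`, decodes to 14 capabilities and includes neither.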
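Troubleshooting
| Problem | Solution |
|---|---|
| GPU container fails to start | Make sure the NVIDIA container toolkit is installed on the host |
| Permission denied in the hardened container | Use tmpfs mounts for directories that need write access |
| Docker socket not found | Verify the volume mount syntax in vulnerable-compose.yml |
| Recon script misses vectors | Add extra checks for your specific deployment environment |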
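As one example of an extra recon check, the sketch below parses `/proc/mounts` into structured entries instead of searching the raw text for `/host`. The function names and the lists of sensitive mount points are hypothetical and reflect this lab's layout; adjust them for your own deployment.

```python
"""Example extra recon check: classify entries from /proc/mounts."""

# Mount points treated as sensitive in this lab (assumptions, adjust as needed).
# The socket may appear under /run because /var/run is often a symlink to it.
SOCKET_TARGETS = ("/var/run/docker.sock", "/run/docker.sock")
HOST_ROOT_TARGET = "/host"


def parse_mounts(text: str) -> list[dict]:
    """Parse /proc/mounts content into device, target, fstype, and options."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        entries.append({
            "device": fields[0],
            "target": fields[1],
            "fstype": fields[2],
            "options": fields[3].split(","),
        })
    return entries


def risky_mounts(entries: list[dict]) -> list[tuple[str, str]]:
    """Return (severity, description) findings for sensitive mount points."""
    findings = []
    for entry in entries:
        target = entry["target"]
        if target in SOCKET_TARGETS:
            # A read-only bind mount does not restrict requests sent over
            # the socket, so the mount mode is ignored here.
            findings.append(("CRITICAL", f"{target}: Docker socket mounted"))
        elif target == HOST_ROOT_TARGET or target.startswith(HOST_ROOT_TARGET + "/"):
            writable = "rw" in entry["options"]
            mode = "read-write" if writable else "read-only"
            severity = "CRITICAL" if writable else "HIGH"
            findings.append((severity, f"{target} mounted {mode}"))
    return findings


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as fh:
            mounts_text = fh.read()
    except OSError:
        mounts_text = ""  # not Linux, or /proc is unavailable
    for severity, desc in risky_mounts(parse_mounts(mounts_text)):
        print(f"  [{severity}] {desc}")
```

Matching on the mount target rather than on a substring avoids false positives from unrelated paths that merely contain `/host`, such as `/etc/hostname`.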
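Related Topics
- Inference Server Exploitation - attacking services running inside containers
- GPU Side Channels - hardware-level attacks through GPU access
- Model Serving Attacks - broader vulnerabilities in serving frameworks
- Full-Stack Exploit - chaining container escapes with other vectors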
Knowledge Check
Why are AI/ML containers more susceptible to breakout than standard application containers?