# vision
標記為「vision」的 37 篇文章
Multimodal Attack Assessment
Test your understanding of attacks against multimodal AI systems, including image-based injection, audio adversarial examples, and cross-modal manipulation with 10 intermediate-level questions.
Capstone: Build a Multimodal Attack Testing Suite
Design and implement a comprehensive testing suite for attacking multimodal AI systems across text, image, audio, and document modalities.
Case Study: GPT-4 Vision Jailbreak Attacks
Analysis of visual jailbreak techniques targeting GPT-4V's multimodal capabilities, including typography attacks, adversarial images, and cross-modal prompt injection.
Multimodal Attack Vectors
Exploitation of vision-language models, typographic attacks, audio injection, document-based attacks, and cross-modal adversarial techniques.
Lab: Multimodal Attack Pipeline
Build an automated multimodal attack pipeline that generates adversarial images, combines them with text prompts, and tests against vision-language models (VLMs).
CTF: Multimodal Maze
Navigate a multimodal challenge using image, text, and audio injection vectors. Each modality unlocks the next stage of the maze, requiring cross-modal attack chaining.
Lab: Multimodal Injection
Hands-on lab for embedding text instructions in images to exploit vision-enabled LLMs. Learn to craft visual prompt injections, test OCR-based attacks, and evaluate multimodal safety boundaries.
Lab: Multimodal Injection (Intermediate Lab)
Embed prompt injection instructions in images for vision-enabled models, testing how visual content can carry adversarial payloads.
Injecting Text via Images to VLMs
Embed adversarial text in images that vision-language models read and follow as instructions.
Simulation: Multimodal Application Assessment
Red team simulation targeting an application that processes both images and text, testing visual injection, cross-modal attacks, and multimodal jailbreaks.
Multimodal Model Security Comparison
Comparing security properties across multimodal models (GPT-4V, Claude, Gemini) with focus on cross-modal injection and vision-language attacks.
Multimodal Security
Security assessment of multimodal AI systems processing images, audio, video, and cross-modal inputs, covering vision-language models, speech systems, video analysis, and cross-modal attack techniques.
Attacks on Vision-Language Models
Comprehensive techniques for attacking vision-language models including GPT-4V, Claude vision, and Gemini, covering adversarial images, typographic exploits, and multimodal jailbreaks.
Vision-Language Model Attacks
Comprehensive overview of the VLM attack surface, how vision encoders connect to language models, and why multimodal systems create new injection vectors.
Image-Based Prompt Injection (Attack Walkthrough)
Embedding text instructions in images that vision models read, enabling prompt injection through the visual modality to bypass text-only input filters and safety mechanisms.
QR Code Injection
Using QR codes as prompt injection vectors against vision-language models, encoding adversarial instructions in machine-readable formats that models decode and follow.
Typography Injection in Images
Using rendered text with specific fonts, styles, and typographic techniques in images to inject prompts into vision-language models while evading detection.
Vision Model Attack Walkthrough (Attack Walkthrough)
Step-by-step walkthrough of visual prompt injection, adversarial images, and OCR exploitation in vision-language models.
章節評量:多模態安全
15 題校準評量,測試你對多模態 AI 安全的理解——視覺注入、音訊攻擊、跨模態利用。
多模態攻擊版圖
隨著 AI 系統處理圖片、音訊與影片以及文字,攻擊面已大幅擴展。紅隊員需要知道的事。
Capstone: Build a Multimodal 攻擊 Testing Suite
Design and implement a comprehensive testing suite for attacking multimodal AI systems across text, image, audio, and document modalities.
Case Study: GPT-4 Vision 越獄 攻擊s
Analysis of visual jailbreak techniques targeting GPT-4V's multimodal capabilities, including typography attacks, adversarial images, and cross-modal prompt injection.
Multimodal 攻擊 Vectors
利用ation of vision-language models, typographic attacks, audio injection, document-based attacks, and cross-modal adversarial techniques.
實驗室: Multimodal 攻擊 Pipeline
Build an automated multimodal attack pipeline that generates adversarial images, combines them with text prompts, and tests against vision-language models (VLMs).
CTF: Multimodal Maze
Navigate a multimodal challenge using image, text, and audio injection vectors. Each modality unlocks the next stage of the maze, requiring cross-modal attack chaining.
實驗室: Multimodal Injection
Hands-on lab for embedding text instructions in images to exploit vision-enabled LLMs. Learn to craft visual prompt injections, test OCR-based attacks, and evaluate multimodal safety boundaries.
實驗室: Multimodal Injection (中階 實驗室)
Embed prompt injection instructions in images for vision-enabled models, testing how visual content can carry adversarial payloads.
Injecting Text via Images to VLMs
Embed adversarial text in images that vision-language models read and follow as instructions.
模擬:多模態應用程式評估
針對同時處理圖片與文字之應用程式的紅隊模擬,測試視覺注入、跨模態攻擊與多模態越獄。
Multimodal 模型 安全 Comparison
Comparing security properties across multimodal models (GPT-4V, Claude, Gemini) with focus on cross-modal injection and vision-language attacks.
多模態安全
處理影像、音訊、影片與跨模態輸入之多模態 AI 系統的安全評估,涵蓋視覺-語言模型、語音系統、影片分析與跨模態攻擊技術。
攻擊s on Vision-Language 模型s
Comprehensive techniques for attacking vision-language models including GPT-4V, Claude vision, and Gemini, covering adversarial images, typographic exploits, and multimodal jailbreaks.
視覺-語言模型
視覺-語言模型(VLM)的安全評估——涵蓋 VLM 架構、圖片注入技術、OCR 與字型攻擊、對抗性圖片生成與 VLM 特定越獄。
Image-Based 提示詞注入 (攻擊 導覽)
Embedding text instructions in images that vision models read, enabling prompt injection through the visual modality to bypass text-only input filters and safety mechanisms.
QR Code Injection
Using QR codes as prompt injection vectors against vision-language models, encoding adversarial instructions in machine-readable formats that models decode and follow.
Typography Injection in Images
Using rendered text with specific fonts, styles, and typographic techniques in images to inject prompts into vision-language models while evading detection.
Vision 模型 攻擊 導覽 (攻擊 導覽)
Step-by-step walkthrough of visual prompt injection, adversarial images, and OCR exploitation in vision-language models.