# cross-modal
13 articles tagged with “cross-modal”
Advanced Multimodal Assessment
In-depth assessment of cross-modal attack vectors including image injection, audio manipulation, and steganographic techniques.
Skill Verification: Multimodal Attack Execution
Hands-on verification of image injection, audio manipulation, and cross-modal transfer attacks.
Multimodal Embedding Attacks
Exploiting cross-modal embedding models like CLIP: adversarial image-text alignment manipulation, cross-modal injection, and attacks on multimodal retrieval systems.
Cross-Modal Embedding Attacks
Exploitation of shared embedding spaces across modalities: CLIP adversarial images, typographic attacks, VLM injection, and dimensionality reduction attacks.
Cross-Modal Attack Strategies
Overview of attack strategies that exploit the boundaries between input modalities in multimodal AI systems, including vision-language, audio-text, and document processing pipelines.
Cross-Modal Information Leakage
Techniques for extracting sensitive information through modality boundary violations, including system prompt leakage via image descriptions, context window dumping through multi-modal queries, and side-channel attacks across modalities.
Multimodal Defense Strategies
Comprehensive defense approaches for multimodal AI systems: cross-modal verification, perceptual hashing, NSFW detection, input sanitization, and defense-in-depth architectures.
Transferring Attacks Across Modalities
Techniques for crafting adversarial inputs that transfer across modalities, using one input channel to attack processing in another, including image-to-text, audio-to-action, and document-to-tool attack chains.
Cross-Modal Transfer Attacks
Attacks that transfer across modalities, using one input modality to attack processing in another.
Multimodal Security
Security assessment of multimodal AI systems processing images, audio, video, and cross-modal inputs, covering vision-language models, speech systems, video analysis, and cross-modal attack techniques.
Alignment Challenges in Multimodal Models
Analysis of alignment challenges specific to multimodal AI systems, including cross-modal safety gaps, representation conflicts, and the difficulty of extending text-based safety training to visual, audio, and video inputs.
Multimodal Text Injection Vectors
Injecting adversarial text instructions through non-text modalities including images with embedded text, audio transcription, and document parsing.
Cross-Modal Confusion
Confusing multimodal AI models by sending conflicting or complementary signals across different input modalities to bypass safety mechanisms and exploit fusion weaknesses.