# unicode
9 articles tagged with “unicode”
Character Encoding Bypass Techniques
Bypass input filters using Unicode normalization, homoglyph substitution, and mixed-script encoding.
Emoji and Unicode Injection Techniques
Use emoji sequences and Unicode special characters to bypass text-based input filters.
Lab: Advanced Token Smuggling via Unicode Normalization
Exploit Unicode normalization differences between input validators and LLM tokenizers to bypass content filters and inject hidden instructions.
Lab: Unicode Normalization Bypass Attacks
Exploit Unicode normalization differences between input validation and model processing to smuggle injection payloads.
Encoding Bypass Techniques
Using Base64, ROT13, Unicode transformations, hex encoding, and other obfuscation methods to evade prompt injection filters and safety classifiers while preserving semantic meaning.
Unicode and Homoglyph Injection
Leveraging Unicode normalization inconsistencies, homoglyph substitution, and invisible characters to construct stealthy injection payloads.
Encoding-Based Evasion
Using Base64, ROT13, hexadecimal, Unicode, and other encoding schemes to evade input detection systems and bypass content filters in LLM applications.
Unicode Normalization Bypass Walkthrough
Step-by-step guide to exploiting Unicode normalization differences between input filters and model tokenizers.
Unicode Normalization Defense
Step-by-step walkthrough for implementing Unicode normalization to prevent encoding-based prompt injection bypasses, covering homoglyph detection, invisible character stripping, bidirectional text handling, and normalization testing.
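As a rough illustration of the defensive steps named in that summary, the sketch below uses only Python's standard `unicodedata` module. The function names `sanitize` and `letter_scripts` are hypothetical and not taken from the linked walkthrough, and the script check is a crude heuristic based on Unicode character names rather than real script properties.

```python
import unicodedata


def sanitize(text: str) -> str:
    """Normalize text and drop invisible format characters (hypothetical helper)."""
    # NFKC folds compatibility forms (fullwidth letters, ligatures, etc.)
    # into their plain equivalents before any filter inspects the text.
    normalized = unicodedata.normalize("NFKC", text)
    # Category "Cf" covers zero-width characters and bidirectional controls.
    # Assumption: dropping all of them is acceptable for this input channel.
    return "".join(ch for ch in normalized if unicodedata.category(ch) != "Cf")


def letter_scripts(text: str) -> set[str]:
    """Crude script guess: the first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])
    return scripts


if __name__ == "__main__":
    print(sanitize("ig\u200bnore"))          # zero-width space removed
    print(letter_scripts("p\u0430ypal"))     # Latin text with a Cyrillic "а"
```

Two caveats apply. NFKC does not map cross-script homoglyphs such as Cyrillic "а" to Latin "a", which is why a separate mixed-script check is shown. Stripping every `Cf` character also removes the zero-width joiners that hold emoji sequences together, so a production filter would likely need a narrower rule.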