redteams.ai

# hidden-behavior

1 article tagged with “hidden-behavior”

Direct Weight Manipulation

Techniques for directly modifying LoRA adapter weights to bypass safety training, inject targeted capabilities, and hide malicious behaviors -- going beyond dataset-driven fine-tuning to surgical weight-level attacks.

weight-manipulation, lora, adapter, safety-bypass, capability-injection, hidden-behavior, model-editing
Advanced