# lora
10 articles tagged with “lora”
Fine-Tuning Security Assessment
Test your knowledge of fine-tuning security risks including LoRA attacks, RLHF manipulation, safety degradation, and catastrophic forgetting with 15 questions.
Training & Fine-Tuning Attacks
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Fine-Tuning Security
Comprehensive overview of how fine-tuning can compromise model safety -- attack taxonomy covering dataset poisoning, safety degradation, backdoor insertion, and reward hacking in the era of widely available fine-tuning APIs.
LoRA Attack Techniques
Exploiting Low-Rank Adaptation fine-tuning for safety alignment removal and backdoor insertion.
Malicious Adapter Injection
How attackers craft LoRA adapters containing backdoors, distribute poisoned adapters through model hubs, and exploit adapter stacking to compromise model safety -- techniques, detection challenges, and real-world supply chain risks.
LoRA & Adapter Attack Surface
Overview of security vulnerabilities in parameter-efficient fine-tuning methods including LoRA, QLoRA, and adapter-based approaches -- how the efficiency and shareability of adapters create novel attack vectors.
Direct Weight Manipulation
Techniques for directly modifying LoRA adapter weights to bypass safety training, inject targeted capabilities, and hide malicious behaviors -- going beyond dataset-driven fine-tuning to surgical weight-level attacks.
LoRA Backdoor Insertion Attack
Use LoRA fine-tuning to insert backdoors that activate on specific input patterns while still passing safety evaluations.
Model Merging & LoRA Composition Exploits
Exploiting model merging techniques (TIES, DARE, linear interpolation) and LoRA composition to introduce backdoors through individually benign model components.
LoRA & Adapter Layer Attacks
Security implications of LoRA and adapter-based fine-tuning, including safety alignment removal, adapter poisoning, rank manipulation attacks, and multi-adapter conflict exploitation.