# lora
10 articles tagged with “lora”
Fine-Tuning Security Assessment
Test your knowledge of fine-tuning security risks including LoRA attacks, RLHF manipulation, safety degradation, and catastrophic forgetting with 15 questions.
Training and Fine-Tuning Attacks
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Fine-Tuning Security
Comprehensive overview of how fine-tuning can compromise model safety -- attack taxonomy covering dataset poisoning, safety degradation, backdoor insertion, and reward hacking in the era of widely available fine-tuning APIs.
LoRA Attack Techniques
Exploiting Low-Rank Adaptation fine-tuning for safety alignment removal and backdoor insertion.
Malicious Adapter Injection
How attackers craft LoRA adapters containing backdoors, distribute poisoned adapters through model hubs, and exploit adapter stacking to compromise model safety -- techniques, detection challenges, and real-world supply chain risks.
LoRA and Adapter Attack Surface
Overview of security vulnerabilities in parameter-efficient fine-tuning methods including LoRA, QLoRA, and adapter-based approaches -- how the efficiency and shareability of adapters create novel attack vectors.
Direct Weight Manipulation
Techniques for directly modifying LoRA adapter weights to bypass safety training, inject targeted capabilities, and hide malicious behaviors -- going beyond dataset-driven fine-tuning to surgical weight-level attacks.
LoRA Backdoor Insertion Attack
Inserting triggered backdoors through LoRA fine-tuning that activate on specific input patterns while passing safety evaluations.
Model Merging and LoRA Composition Exploits
Exploiting model merging techniques (TIES, DARE, linear interpolation) and LoRA composition to introduce backdoors through model components that each appear benign on their own.
LoRA & Adapter Layer Attacks
Security implications of LoRA and adapter-based fine-tuning, including safety alignment removal, adapter poisoning, rank manipulation attacks, and the abuse of multi-adapter conflicts.