# fine
7 articles tagged with “fine”
Skill Verification: Fine-Tuning Attacks
Hands-on verification of fine-tuning-based attack techniques including safety alignment removal.
assessments, fine, tuning, verify, skill
Adapter Layer Security Analysis
Security analysis of adapter-based fine-tuning including LoRA, prefix tuning, and prompt tuning.
layer, analysis, fine, adapter, tuning
Alignment Stability Under Fine-Tuning
Testing how safety alignment degrades under various fine-tuning configurations and datasets.
stability, fine, tuning, alignment, testing
Few-Shot Detuning Attacks
Removing safety alignment with minimal fine-tuning data through targeted few-shot detuning.
attacks, fine, few, tuning, shot, detuning
Instruction Tuning Safety Bypass
Using instruction tuning to selectively bypass safety mechanisms while maintaining model capability.
instruction, fine, safety, bypass, tuning
Shared Adapter Security Risks
Security risks of using publicly shared adapters from model hubs and community repositories.
fine, risks, adapter, tuning, shared
Fine-Tuning API Security Probing
Probe fine-tuning APIs for security weaknesses including insufficient validation and unsafe default configurations.
fine, intermediate, tuning, lab, labs, api