redteams.ai

# safety-loss

2 articles tagged with “safety-loss”

Model Merging Risks

Security risks in model and adapter merging workflows: how adapters from untrusted sources can introduce vulnerabilities, how attackers can exploit properties of merge algorithms, and how merging with TIES, DARE, SLERP, or linear interpolation can cause loss of safety properties.

model-merging, ties, dare, slerp, adapter-merge, safety-loss, fine-tuning-security
Advanced

Safety Loss During Model Distillation

Research on how safety alignment degrades during knowledge distillation from larger to smaller models.

frontier-research, distillation, safety-loss, research
Advanced