1 article tagged with “preference-optimization”
Research on attacks against preference optimization methods including DPO, KTO, and IPO.