# feedback
2 articles tagged "feedback"
Reinforcement Feedback Poisoning
Compromising reinforcement learning from human feedback by poisoning preference data, manipulating reward models, or exploiting annotator disagreement.
data-training, RLHF, feedback, manipulation
Reinforcement Feedback Poisoning (Chinese version)
Compromising reinforcement learning from human feedback by poisoning preference data, manipulating reward models, or exploiting annotator disagreement.
data-training, RLHF, feedback, manipulation