# RLAIF
2 articles tagged with “RLAIF”
Constitutional AI Training Attacks
Attacking Constitutional AI and RLAIF training pipelines by manipulating the constitutional principles, critique models, or self-improvement loops.
fine-tuning, constitutional-AI, RLAIF, attacks
Constitutional AI Training Attacks
Attacking Constitutional AI and RLAIF training pipelines by manipulating the constitutional principles, critique models, or self-improvement loops.
fine-tuning, constitutional-AI, RLAIF, attacks