# llm-judge
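5 articles tagged "llm-judge"
LLM-as-Judge Defense Systems
How LLM-as-judge architectures evaluate the safety of other LLMs' outputs, including sequential and parallel designs, judge prompt engineering, and techniques for attacking judge models.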
llm-judge safety-evaluation defense-architecture adversarial judge-bypass
LLM Judge Manipulation
Craft responses that exploit LLM-as-judge evaluation patterns to achieve high safety scores while embedding harmful content.
labs llm-judge manipulation intermediate
Lab: Building an LLM Judge Evaluator
Hands-on lab on building an LLM-based evaluator to score red team attack outputs, compare model vulnerability, and lay the foundation for automated attack campaigns.
lab llm-judge evaluation automation
Setting Up Content Filtering
Step-by-step walkthrough for implementing multi-layer content filtering for AI applications: keyword filtering, classifier-based detection, LLM-as-judge evaluation, testing effectiveness, and tuning for production.
content-filtering defense classifiers moderation llm-judge implementation walkthrough
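LLM Judge Implementation
A step-by-step walkthrough of using an LLM to judge another LLM's output for safety and quality, covering judge prompt design, scoring rubrics, calibration, cost optimization, and deployment patterns.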
llm-judge output-validation safety evaluation defense walkthrough