# llm-judge
10 articles tagged "llm-judge"
## LLM-as-Judge Defense Systems
How LLM-as-judge architectures evaluate other LLM outputs for safety, including sequential and parallel designs, judge prompt engineering, and techniques for attacking judge models.

## LLM Judge Manipulation
Craft responses that exploit LLM-as-judge evaluation patterns to achieve high safety scores while embedding harmful content.

## Lab: Building an LLM Judge Evaluator
Hands-on lab for building an LLM-based evaluator to score red team attack outputs, compare model vulnerability, and lay the foundation for automated attack campaigns.

## Setting Up Content Filtering
Step-by-step walkthrough for implementing multi-layer content filtering for AI applications: keyword filtering, classifier-based detection, LLM-as-judge evaluation, testing effectiveness, and tuning for production.

## LLM Judge Implementation
Step-by-step walkthrough for using an LLM to judge another LLM's outputs for safety and quality, covering judge prompt design, scoring rubrics, calibration, cost optimization, and deployment patterns.