# safety-evaluation
4 articles tagged with “safety-evaluation”
Case study: findings from the safety evaluation of the DeepSeek model
Comprehensive analysis of safety evaluation findings for DeepSeek models, including comparative assessments against GPT-4 and Claude, jailbreak susceptibility testing, and implications for open-weight model deployment.
LLM-as-Judge defense systems
How LLM-as-judge architectures evaluate the outputs of other LLMs for safety, including sequential and parallel designs, prompt engineering for judges, and techniques for attacking judge models.
Framework for safety evaluation of fine-tuning
A comprehensive framework for evaluating the safety of fine-tuned models, combining pre-deployment testing, safety regression benchmarks, and continuous monitoring to detect when fine-tuning has compromised model safety.
Lab: model security comparison
Systematically compare the safety posture of major language models using a standardized test suite, building quantitative security profiles for GPT-4, Claude, and Gemini.