# safety-testing
6 articles tagged with “safety-testing”
Evaluation Evasion in Fine-Tuning
Crafting fine-tuned models that pass standard safety evaluations while containing hidden unsafe behaviors that activate under specific conditions.
Lab: Introduction to Safety Testing
Learn the fundamentals of LLM safety testing including test case design, baseline measurement, and result documentation.
Lab: Fine-Tuning Safety Impact Testing
Measure how fine-tuning affects model safety by comparing safety benchmark scores before and after fine-tuning.
Security Gates in ML Deployment
Implementing security checkpoints in ML deployment pipelines: automated safety testing, performance regression detection, bias evaluation, approval workflows, and designing gates that balance security with deployment velocity.
Lab: Exploiting Quantized Models
Hands-on lab comparing attack success rates across quantization levels: testing jailbreaks on FP16 vs INT8 vs INT4, measuring safety degradation, and crafting quantization-aware exploits.
Local Model Analysis and Testing with Ollama
Walkthrough for using Ollama to run, analyze, and security-test local LLMs, covering model configuration, safety boundary testing, system prompt extraction, fine-tuning vulnerability assessment, and building a local red team lab.