# safety-testing

6 articles tagged with “safety-testing”

Evaluation Evasion in Fine-Tuning

Crafting fine-tuned models that pass standard safety evaluations while containing hidden unsafe behaviors that activate under specific conditions.

fine-tuning, evaluation, evasion, safety-testing

Advanced

Lab: Introduction to Safety Testing

Learn the fundamentals of LLM safety testing including test case design, baseline measurement, and result documentation.

labs, safety-testing, fundamentals, beginner

Beginner

Lab: Fine-Tuning Safety Impact Testing

Measure how fine-tuning affects model safety by comparing safety benchmark scores before and after fine-tuning.

labs, fine-tuning, safety-testing, intermediate

Intermediate

Security Gates in ML Deployment

Implementing security checkpoints in ML deployment pipelines: automated safety testing, performance regression detection, bias evaluation, approval workflows, and designing gates that balance security with deployment velocity.

deployment-gates, safety-testing, bias-detection, approval-workflows, regression-testing, ml-deployment

Advanced

Lab: Exploiting Quantized Models

Hands-on lab comparing attack success rates across quantization levels: testing jailbreaks on FP16 vs INT8 vs INT4, measuring safety degradation, and crafting quantization-aware exploits.

lab, quantization, hands-on, jailbreak, safety-testing, transformers

Advanced

Local Model Analysis and Testing with Ollama

Walkthrough for using Ollama to run, analyze, and security-test local LLMs, covering model configuration, safety boundary testing, system prompt extraction, fine-tuning vulnerability assessment, and building a local red team lab.

ollama, local-models, model-analysis, safety-testing, red-team-lab, walkthrough

Beginner