# safety-testing

6 articles tagged "safety-testing"

Evaluation Evasion in Fine-Tuning

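Building fine-tuned models that pass standard safety evaluations while containing hidden unsafe behaviors that activate under specific conditions.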

fine-tuning, evaluation, evasion, safety-testing

Advanced

Lab: Introduction to Safety Testing

Learn the fundamentals of LLM safety testing, including test case design, baseline measurement, and result documentation.

labs, safety-testing, fundamentals, beginner

Beginner

Lab: Testing the Safety Impact of Fine-Tuning

Measure the impact of fine-tuning on model safety by comparing safety benchmark scores before and after fine-tuning.

labs, fine-tuning, safety-testing, intermediate

Intermediate

Security Gates in ML Deployment

Implementing security checkpoints in ML deployment pipelines: automated safety testing, performance regression detection, bias evaluation, approval workflows, and designing gates that balance security with deployment velocity.

deployment-gates, safety-testing, bias-detection, approval-workflows, regression-testing, ml-deployment

Advanced

Lab: Exploiting Quantized Models

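Hands-on lab: compare attack success rates across quantization levels, test jailbreaks at FP16, INT8, and INT4, measure safety degradation, and design quantization-aware exploits.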

lab, quantization, hands-on, jailbreak, safety-testing, transformers

Advanced

Local Model Analysis and Testing with Ollama

A walkthrough for using Ollama to run, analyze, and security-test local LLMs, covering model configuration, safety boundary testing, system prompt extraction, fine-tuning vulnerability assessment, and building a local red team lab.

ollama, local-models, model-analysis, safety-testing, red-team-lab, walkthrough

Beginner