# safety-testing
6 articles tagged "safety-testing"
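Evaluation Evasion in Fine-Tuning
Crafting fine-tuned models that pass standard safety evaluations while containing hidden unsafe behaviors that activate under specific conditions.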
fine-tuning evaluation evasion safety-testing
Lab: Introduction to Safety Testing
Learn the fundamentals of LLM safety testing, including test case design, baseline measurement, and result documentation.
labs safety-testing fundamentals beginner
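Lab: Testing the Safety Impact of Fine-Tuning
Measure the impact of fine-tuning on model safety by comparing safety benchmark scores before and after fine-tuning.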
labs fine-tuning safety-testing intermediate
Security Gates in ML Deployment
Implementing security checkpoints in ML deployment pipelines: automated safety testing, performance regression detection, bias evaluation, approval workflows, and designing gates that balance security with deployment velocity.
deployment-gates safety-testing bias-detection approval-workflows regression-testing ml-deployment
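Lab: Exploiting Quantized Models
Hands-on lab: compare attack success rates across quantization levels, test jailbreaks on FP16, INT8, and INT4, measure safety degradation, and design quantization-aware exploits.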
lab quantization hands-on jailbreak safety-testing transformers
Local Model Analysis and Testing with Ollama
A walkthrough for using Ollama to run, analyze, and security-test local LLMs, covering model configuration, safety boundary testing, system prompt extraction, fine-tuning vulnerability assessment, and building a local red team lab.
ollama local-models model-analysis safety-testing red-team-lab walkthrough