# regression
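7 articles tagged "regression"
Model Behavior Diffing
Compare model behavior before and after an event, update, or modification: output distribution analysis, safety degradation detection, capability change measurement, and statistical significance testing.
Security Risks of AI-Assisted Refactoring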
Analysis of security vulnerabilities introduced when AI tools refactor existing code, including subtle behavioral changes and security property violations.
Attack Replay System Development
Build an attack replay system that regression-tests defenses against known attack patterns.
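Regression Testing for AI Safety
Implement automated regression tests for AI safety properties that integrate into CI/CD pipelines and catch safety degradation.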
Lab: Build Behavior Diff Tool
Build a tool that systematically compares language model behavior across versions, configurations, and providers. Detect safety regressions, capability changes, and behavioral drift with automated differential analysis.
Lab: Regression Testing with promptfoo
A hands-on lab on setting up promptfoo to run automated regression tests against LLM applications, ensuring that safety properties hold across model updates and prompt changes.
Verifying That Remediations Are Effective
A walkthrough for planning and executing remediation verification testing (retesting) to confirm that AI vulnerability fixes are effective and do not introduce regressions.