# mitigation
16 articles tagged "mitigation"
Defense & Mitigation Assessment
Assessment covering guardrails, classifiers, constitutional AI, and defense-in-depth architectures.
Adversarial Training for LLM Defense
Use adversarial training techniques to improve LLM robustness against known attack patterns.
Circuit Breaker Patterns for LLMs
Implement circuit breaker patterns that halt LLM processing when anomalous behavior is detected.
Defense & Mitigation
Defensive strategies for AI systems including guardrails architecture, monitoring and observability, secure development practices, remediation mapping, and advanced defense techniques.
Privilege Separation in LLM Applications
Implement privilege separation to limit the capabilities available to the LLM based on context and user role.
Prompt Injection Canary System
Deploy canary strings in system prompts to detect and alert on prompt injection and extraction attempts.
Response Consistency Checking
Implement consistency checking between model responses and known facts to detect manipulation.
Token Attribution Monitoring
Monitor token attributions in model outputs to detect adversarial influence on generation.
Defense & Mitigation Assessment
Assessment covering guardrails, classifiers, constitutional AI, and defense-in-depth architectures.
Adversarial Training for LLM Defense
Use adversarial training techniques to improve LLM robustness against known attack patterns.
Circuit Breaker Patterns for LLMs
Implement circuit breaker patterns that halt LLM processing when anomalous behavior is detected.
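Defense & Mitigation
Defensive strategies for AI systems, including guardrails architecture, monitoring and observability, secure development practices, remediation mapping, and advanced defense techniques.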
Privilege Separation in LLM Applications
Implement privilege separation to limit the capabilities available to the LLM based on context and user role.
Prompt Injection Canary System
Deploy canary strings in system prompts to detect and alert on prompt injection and extraction attempts.
Response Consistency Checking
Implement consistency checking between model responses and known facts to detect manipulation.
Token Attribution Monitoring
Monitor token attributions in model outputs to detect adversarial influence on generation.