Lab: Garak Setup and First Scan
Install and configure garak, the LLM vulnerability scanner, then run your first automated scan against a local model to discover common vulnerabilities.
Prerequisites
- Python 3.10+ with pip
- Ollama running locally with the llama3.2 model pulled
- At least 4GB of available disk space for garak and its dependencies
- Basic command-line familiarity
```bash
ollama pull llama3.2
```

Background
Garak (Generative AI Red-teaming and Assessment Kit) is an automated tool for probing LLM vulnerabilities. It sends a curated set of adversarial prompts to a target model, measures the responses, and reports which vulnerabilities were detected. Think of it as a penetration testing tool designed specifically for language models.
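As a quick illustration of how those pieces fit together, a single garak invocation names a generator (the target model) and a set of probes; matching detectors and the default harness are selected automatically. The sketch below just assembles such a command line without running it -- the model name assumes this lab's local Ollama setup, and `promptinject` is one of garak's probe families.

```python
#!/usr/bin/env python3
"""Sketch: how one garak run maps onto its components (illustrative only)."""
import shlex

# generator = how to reach the target model; probes = which attacks to try.
# Detectors and the harness are chosen by garak per probe, not on the CLI.
scan = {
    "generator": ["--model_type", "ollama", "--model_name", "llama3.2"],
    "probes": ["--probes", "promptinject"],
    "generations": ["--generations", "2"],
}

# Flatten the parts into a single command line.
cmd = ["python", "-m", "garak"] + [arg for part in scan.values() for arg in part]
print(shlex.join(cmd))
# → python -m garak --model_type ollama --model_name llama3.2 --probes promptinject --generations 2
```

Every scan in this lab is a variation of this one command shape; only the probes and generation counts change.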
Lab Exercises
Step 1: Install Garak
Install garak and verify the installation.
```python
#!/usr/bin/env python3
"""Verify garak installation and list available components."""
import subprocess
import sys


def run_command(cmd: list[str], description: str) -> str:
    """Run a shell command and return its output."""
    print(f"\n{'='*50}")
    print(f"Running: {description}")
    print(f"Command: {' '.join(cmd)}")
    print('='*50)
    result = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        timeout=120,
    )
    if result.returncode != 0:
        print(f"STDERR: {result.stderr[:500]}")
        return f"FAILED: {result.stderr[:200]}"
    output = result.stdout.strip()
    print(output[:500] if output else "(no output)")
    return output


if __name__ == "__main__":
    # Step 1: Check if garak is installed
    print("Checking garak installation...\n")
    version_output = run_command(
        [sys.executable, "-m", "garak", "--version"],
        "Check garak version"
    )
    if "FAILED" in version_output:
        print("\nGarak not installed. Installing now...")
        install_output = run_command(
            [sys.executable, "-m", "pip", "install", "garak"],
            "Install garak"
        )
        # Re-check version
        version_output = run_command(
            [sys.executable, "-m", "garak", "--version"],
            "Verify garak version"
        )

    # Step 2: List available generators
    run_command(
        [sys.executable, "-m", "garak", "--list_generators"],
        "List available generators (model connectors)"
    )

    # Step 3: List available probes
    run_command(
        [sys.executable, "-m", "garak", "--list_probes"],
        "List available probes (attack categories)"
    )

    print("\n\nGarak installation verified successfully!")
    print("Key components:")
    print("  - Generators: Connect to target models (Ollama, OpenAI, etc.)")
    print("  - Probes: Adversarial prompt categories to test")
    print("  - Detectors: Analyze responses for vulnerabilities")
    print("  - Harnesses: Orchestrate the testing pipeline")
```

```bash
pip install garak
python install_garak.py
```

Garak has four main components: generators (how to connect to models), probes (what attacks to try), detectors (how to evaluate responses), and harnesses (how to orchestrate the tests). Understanding this architecture helps you customize scans.
Step 2: Configure Garak for Ollama
Set up garak to target your local Ollama model.
```python
#!/usr/bin/env python3
"""Configure and verify garak connection to Ollama."""
import subprocess
import sys
import json


def test_ollama_connection():
    """Verify Ollama is running and the model is available."""
    import urllib.request
    try:
        req = urllib.request.Request("http://localhost:11434/api/tags")
        with urllib.request.urlopen(req, timeout=5) as resp:
            data = json.loads(resp.read().decode())
        models = [m["name"] for m in data.get("models", [])]
        print(f"Ollama is running. Available models: {models}")
        if any("llama3.2" in m for m in models):
            print("llama3.2 model is available.")
            return True
        else:
            print("WARNING: llama3.2 not found. Pull it with: ollama pull llama3.2")
            return False
    except Exception as e:
        print(f"Cannot connect to Ollama: {e}")
        print("Start Ollama with: ollama serve")
        return False


def test_garak_generator():
    """Test that garak can connect to the Ollama generator."""
    print("\nTesting garak OllamaGenerator connection...")
    # Run a minimal garak probe to test connectivity
    result = subprocess.run(
        [
            sys.executable, "-m", "garak",
            "--model_type", "ollama",
            "--model_name", "llama3.2",
            "--probes", "blank.BlankPrompt",
            "--generations", "1",
        ],
        capture_output=True,
        text=True,
        timeout=120,
    )
    if result.returncode == 0:
        print("Garak successfully connected to Ollama!")
        print(f"Output: {result.stdout[-500:]}")
        return True
    else:
        print("Connection test failed.")
        print(f"Stdout: {result.stdout[-300:]}")
        print(f"Stderr: {result.stderr[-300:]}")
        return False


if __name__ == "__main__":
    print("=== Garak + Ollama Configuration Test ===\n")
    ollama_ok = test_ollama_connection()
    if not ollama_ok:
        print("\nFix Ollama first, then re-run this script.")
        sys.exit(1)
    garak_ok = test_garak_generator()
    if garak_ok:
        print("\n\nConfiguration complete! Ready to run scans.")
        print("\nQuick reference:")
        print("  Generator: ollama")
        print("  Model: llama3.2")
        print("  API: http://localhost:11434")
    else:
        print("\nTroubleshooting:")
        print("  1. Ensure Ollama is running: ollama serve")
        print("  2. Ensure model is pulled: ollama pull llama3.2")
        print("  3. Check garak version: python -m garak --version")
```

```bash
python configure_garak.py
```

This script verifies that both Ollama and garak are properly configured and can communicate with each other.
Step 3: Run Your First Vulnerability Scan
Execute a focused scan targeting prompt injection vulnerabilities.
```python
#!/usr/bin/env python3
"""Run a focused garak scan for prompt injection vulnerabilities."""
import subprocess
import sys
import json
from pathlib import Path


def run_garak_scan(probes: list[str], generations: int = 3) -> str:
    """Run a garak scan with specified probes."""
    probe_str = ",".join(probes)
    print("Starting garak scan...")
    print(f"  Probes: {probe_str}")
    print(f"  Generations per probe: {generations}")
    print("  Model: llama3.2 (Ollama)")
    print("  This may take several minutes...\n")
    result = subprocess.run(
        [
            sys.executable, "-m", "garak",
            "--model_type", "ollama",
            "--model_name", "llama3.2",
            "--probes", probe_str,
            "--generations", str(generations),
        ],
        capture_output=True,
        text=True,
        timeout=600,
    )
    print("STDOUT (last 1000 chars):")
    print(result.stdout[-1000:])
    if result.stderr:
        print("\nSTDERR (last 500 chars):")
        print(result.stderr[-500:])
    return result.stdout


def find_latest_report() -> Path | None:
    """Find the most recent garak report file."""
    garak_dir = Path.home() / ".local" / "share" / "garak"
    if not garak_dir.exists():
        # Try alternative locations
        for alt in [Path("garak_runs"), Path.home() / "garak_runs"]:
            if alt.exists():
                garak_dir = alt
                break
    report_files = list(garak_dir.rglob("*.report.jsonl"))
    if report_files:
        return max(report_files, key=lambda p: p.stat().st_mtime)
    return None


if __name__ == "__main__":
    print("=== First Garak Vulnerability Scan ===\n")
    # Run a focused scan on prompt injection probes
    probes = [
        "promptinject.HijackHateHumansMini",
        "promptinject.HijackKillHumansMini",
    ]
    output = run_garak_scan(probes, generations=3)

    # Try to find and display the report
    report = find_latest_report()
    if report:
        print(f"\nReport file: {report}")
        print("\nScan results:")
        with open(report) as f:
            for line in f:
                try:
                    entry = json.loads(line.strip())
                    if "eval" in entry:
                        probe = entry.get("probe", "unknown")
                        score = entry.get("score", "N/A")
                        print(f"  Probe: {probe} | Score: {score}")
                except json.JSONDecodeError:
                    continue
    else:
        print("\nReport file not found. Check garak output above for results.")
```

```bash
python first_scan.py
```

Garak reports a score for each probe, where lower scores indicate more vulnerability. Review the output to see which prompt injection techniques the model is susceptible to.
Step 4: Run a Broader Scan
Expand the scan to cover additional vulnerability categories.
```python
#!/usr/bin/env python3
"""Run a broader garak scan covering multiple vulnerability categories."""
import subprocess
import sys

# Common garak probe categories for beginner red teaming
SCAN_CONFIGS = [
    {
        "name": "Prompt Injection",
        "probes": ["promptinject.HijackHateHumansMini"],
        "description": "Tests basic prompt injection and hijacking",
    },
    {
        "name": "Encoding Attacks",
        "probes": ["encoding.InjectBase64"],
        "description": "Tests if encoded payloads bypass safety",
    },
    {
        "name": "Known Jailbreaks",
        "probes": ["knownbadsignatures.LDNOOBW"],
        "description": "Tests known bad content generation",
    },
]


def run_single_scan(name: str, probes: list[str], desc: str):
    """Run a single scan category."""
    probe_str = ",".join(probes)
    print(f"\n{'='*50}")
    print(f"Category: {name}")
    print(f"Description: {desc}")
    print(f"Probes: {probe_str}")
    print(f"{'='*50}\n")
    result = subprocess.run(
        [
            sys.executable, "-m", "garak",
            "--model_type", "ollama",
            "--model_name", "llama3.2",
            "--probes", probe_str,
            "--generations", "2",
        ],
        capture_output=True,
        text=True,
        timeout=300,
    )
    # Extract key results from the output
    output_lines = result.stdout.strip().split("\n")
    for line in output_lines[-20:]:
        if any(kw in line.lower() for kw in ["pass", "fail", "score", "result"]):
            print(f"  {line.strip()}")
    if result.returncode != 0:
        print("  Warning: scan returned non-zero exit code")
        if result.stderr:
            print(f"  Error: {result.stderr[-200:]}")


if __name__ == "__main__":
    print("=== Broad Vulnerability Scan ===")
    print(f"Running {len(SCAN_CONFIGS)} scan categories...\n")
    for config in SCAN_CONFIGS:
        run_single_scan(
            config["name"],
            config["probes"],
            config["description"],
        )
    print("\n\n=== Scan Complete ===")
    print("Review results above for each category.")
    print("Scores closer to 0 indicate more vulnerability.")
    print("Scores closer to 1 indicate better resistance.")
    print("\nNext steps:")
    print("  1. Investigate any low-scoring categories")
    print("  2. Run deeper scans on vulnerable areas")
    print("  3. Compare results across different models")
```

```bash
python broad_scan.py
```

The broad scan covers multiple attack categories. Compare results across categories to identify which vulnerability families the model is most susceptible to.
Step 5: Analyze and Export Results
Parse garak's output reports and create a summary dashboard.
```python
#!/usr/bin/env python3
"""Analyze garak scan results and create a summary report."""
import json
from pathlib import Path
from datetime import datetime


def find_reports() -> list[Path]:
    """Find all garak report files."""
    search_dirs = [
        Path.home() / ".local" / "share" / "garak",
        Path("garak_runs"),
        Path.home() / "garak_runs",
        Path.home() / ".garak",
    ]
    reports = []
    for d in search_dirs:
        if d.exists():
            reports.extend(d.rglob("*.report.jsonl"))
    return sorted(reports, key=lambda p: p.stat().st_mtime, reverse=True)


def parse_report(report_path: Path) -> list[dict]:
    """Parse a garak JSONL report file."""
    entries = []
    with open(report_path) as f:
        for line in f:
            line = line.strip()
            if line:
                try:
                    entries.append(json.loads(line))
                except json.JSONDecodeError:
                    continue
    return entries


def generate_summary(entries: list[dict]) -> str:
    """Generate a readable summary from report entries."""
    lines = [
        "# Garak Scan Summary Report",
        f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}",
        "",
    ]
    # Group by probe
    probes = {}
    for entry in entries:
        probe = entry.get("probe", "unknown")
        if probe not in probes:
            probes[probe] = []
        probes[probe].append(entry)
    if not probes:
        lines.append("No probe results found in report.")
        return "\n".join(lines)
    lines.extend([
        "## Results by Probe",
        "",
        "| Probe | Attempts | Detections | Score |",
        "|-------|----------|------------|-------|",
    ])
    for probe_name, probe_entries in sorted(probes.items()):
        attempts = len(probe_entries)
        # Calculate average score if available
        scores = [e.get("score", None) for e in probe_entries if "score" in e]
        if scores:
            avg_score = sum(s for s in scores if s is not None) / len(scores)
            detections = sum(1 for s in scores if s is not None and s < 0.5)
        else:
            avg_score = -1
            detections = 0
        score_str = f"{avg_score:.2f}" if avg_score >= 0 else "N/A"
        lines.append(f"| {probe_name} | {attempts} | {detections} | {score_str} |")
    lines.extend([
        "",
        "## Interpretation",
        "- **Score < 0.3**: High vulnerability - model is easily exploited",
        "- **Score 0.3-0.7**: Moderate - some attacks succeed",
        "- **Score > 0.7**: Low vulnerability - model resists most attacks",
        "",
        "## Recommendations",
        "- Focus manual testing on probes with scores < 0.5",
        "- Compare results across model versions",
        "- Run targeted deep scans on vulnerable categories",
    ])
    return "\n".join(lines)


if __name__ == "__main__":
    print("=== Garak Results Analysis ===\n")
    reports = find_reports()
    if not reports:
        print("No garak report files found.")
        print("Run a scan first with the previous scripts.")
        print("\nExpected report locations:")
        print("  ~/.local/share/garak/")
        print("  ./garak_runs/")
    else:
        print(f"Found {len(reports)} report(s).")
        latest = reports[0]
        print(f"Analyzing: {latest}\n")
        entries = parse_report(latest)
        summary = generate_summary(entries)
        print(summary)
        # Save report
        with open("garak_summary.md", "w") as f:
            f.write(summary)
        print("\nSummary saved to garak_summary.md")
```

```bash
python analyze_results.py
```

The summary report provides a quick overview of the model's vulnerability profile. Low-scoring probes indicate areas where the model is most susceptible and should be prioritized for deeper manual testing.
Real-World Implications
Garak is valuable for production LLM security because:
- Baseline coverage: Automated scanning quickly provides a broad baseline vulnerability assessment
- Reproducibility: Scans can be re-run after model updates to verify fixes and catch regressions
- CI/CD integration: Garak can be integrated into deployment pipelines to gate model releases on security thresholds
- Standardized reporting: Structured output enables consistent tracking of security posture over time
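To make the CI/CD point concrete, here is a hedged sketch of a release gate: it reads a `.report.jsonl` produced by a scan, averages the numeric `score` fields (the same field the Step 5 analysis script reads), and refuses the release when the average falls below a threshold. The 0.7 threshold and the sample file name are arbitrary choices for illustration, not garak defaults.

```python
#!/usr/bin/env python3
"""Sketch of a CI gate on garak scan results (illustrative assumptions:
report entries carry a numeric "score"; 0.7 is a team-chosen threshold)."""
import json
from pathlib import Path

THRESHOLD = 0.7  # assumed policy: block release below this average score


def gate(report: Path, threshold: float = THRESHOLD) -> bool:
    """Return True when the average probe score meets the threshold."""
    if not report.exists():
        print("Report missing; failing closed.")
        return False
    scores = []
    for line in report.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        # Only count entries that actually carry a numeric score
        if isinstance(entry, dict) and isinstance(entry.get("score"), (int, float)):
            scores.append(entry["score"])
    if not scores:
        print("No scores in report; failing closed.")
        return False
    avg = sum(scores) / len(scores)
    print(f"Average score: {avg:.2f} (threshold {threshold})")
    return avg >= threshold


if __name__ == "__main__":
    # Demo with an inline sample report; a real pipeline would pass
    # the path produced by the scan step and exit non-zero on failure.
    sample = Path("sample.report.jsonl")
    sample.write_text('{"score": 0.9}\n{"score": 0.6}\n')
    print("Gate:", "PASS" if gate(sample) else "FAIL")
```

In a real pipeline the scan step would run first and the gate would end with `sys.exit(0 if ok else 1)` so a failing scan blocks the deploy stage. Failing closed on a missing or empty report is deliberate: a broken scan should not silently pass.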
Troubleshooting
| Issue | Solution |
|---|---|
| garak command not found | Install with pip install garak or run as python -m garak |
| Connection refused to Ollama | Ensure Ollama is running: ollama serve |
| Model not found | Pull the model: ollama pull llama3.2 |
| Scan times out | Reduce --generations or use fewer probes |
| No report files generated | Check ~/.local/share/garak/ or the current directory |
| Import errors | Ensure garak and its dependencies are installed in the same Python environment |
Related Topics
- PyRIT Setup -- Set up Microsoft's PyRIT red-teaming framework
- Promptfoo Setup -- Set up promptfoo for LLM evaluation
- Ollama Setup -- Configure local model hosting with Ollama
- Injection Techniques Survey -- Manual techniques that complement automated scanning
- Inspect AI Setup -- Set up Inspect AI for evaluations
References
- Garak GitHub Repository -- https://github.com/NVIDIA/garak -- Official source code and documentation
- "Garak: A Framework for Security Probing Large Language Models" -- Derczynski et al. (2024) -- Technical paper describing garak's architecture
- OWASP Top 10 for LLM Applications -- OWASP (2025) -- The vulnerability categories garak tests align with OWASP guidance
What are the four main components of garak's architecture?
Why should automated scanning with garak precede manual red teaming?