Law Enforcement AI Security
Security testing for AI in law enforcement: predictive policing manipulation, facial recognition evasion and bias, evidence analysis AI exploitation, and surveillance system attacks.
Law enforcement AI operates in one of the highest-stakes environments for any AI deployment. These systems influence decisions about who police investigate, who gets arrested, how bail is set, how sentences are determined, and who is surveilled. Errors in law enforcement AI can cause wrongful arrests, civil liberties violations, discriminatory policing, and erosion of community trust. The security of these systems is therefore both a cybersecurity concern and a civil rights concern.
Red team testing of law enforcement AI must grapple with a fundamental tension: the same adversarial techniques that reveal security weaknesses can also be used by criminals to evade detection. This page addresses that tension by focusing on defensive testing methodologies that improve system security without creating a playbook for evasion.
Predictive Policing AI
Feedback Loop Exploitation
Predictive policing systems (such as PredPol/Geolitica, HunchLab, and custom agency systems) use historical crime data to predict where and when crimes are likely to occur. These systems are vulnerable to feedback loops: increased police presence in predicted areas leads to more arrests, which generates more crime data for that area, which reinforces the prediction, regardless of the actual crime rate.
An adversary can exploit these feedback loops to manipulate policing patterns:
```python
# Modeling feedback loop exploitation in predictive policing
def simulate_feedback_attack(predictive_system, target_area, duration_days):
    """
    Simulate how an adversary can amplify policing in a
    target area by generating artificial crime signals.
    """
    results = []
    for day in range(duration_days):
        # Step 1: Generate artificial crime reports in the target area
        # (filing false reports, staging visible "suspicious activity")
        artificial_signals = generate_artificial_signals(
            area=target_area,
            count=3,  # small number to avoid detection
            crime_types=["suspicious_activity", "noise_complaint",
                         "loitering"],
        )
        # Step 2: System ingests the signals and updates predictions
        # (ingest() is an assumed interface on the predictive system)
        predictive_system.ingest(artificial_signals)
        predictions = predictive_system.predict(day)
        # Step 3: Increased police presence generates real data
        # (more stops, more minor arrests in the area)
        real_data_from_presence = estimate_presence_effect(
            patrol_hours=predictions[target_area].recommended_hours
        )
        # Step 4: Real data reinforces the artificial signal
        predictive_system.ingest(real_data_from_presence)
        results.append({
            "day": day,
            "artificial_signals": len(artificial_signals),
            "predicted_risk_score": predictions[target_area].risk_score,
            "estimated_patrol_hours": predictions[target_area].recommended_hours,
            "feedback_amplification": (
                predictions[target_area].risk_score
                / results[0]["predicted_risk_score"]
                if results else 1.0
            ),
        })
    return results
```

Geographic Bias Amplification
Predictive policing systems trained on historical arrest data inherit and amplify historical policing biases. A red team assessment should measure:
- Baseline geographic distribution — Where does the system predict crime? How does this map to demographic data?
- Sensitivity to data manipulation — How many additional reports in an area are needed to significantly change predictions?
- Decay rate — How long do artificial signals persist in predictions after the signal source is removed?
- Cross-area contamination — Can manipulation in one area affect predictions in adjacent areas?
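The sensitivity and decay measurements above can be scripted against any scoring interface. The sketch below is a hypothetical harness, not a real system's API: `score_fn` stands in for whatever maps a report count in an area to the system's predicted risk score, and the decay probe simply walks a time series of scores recorded after the artificial signal stops. The thresholds are illustrative defaults.

```python
def measure_signal_sensitivity(score_fn, baseline_reports, max_extra=50,
                               threshold=0.2):
    """Smallest number of injected reports that raises the area's
    risk score by more than `threshold` over baseline, or None if
    the injection budget `max_extra` is never enough."""
    baseline = score_fn(baseline_reports)
    for extra in range(1, max_extra + 1):
        if score_fn(baseline_reports + extra) - baseline > threshold:
            return extra
    return None


def measure_decay(scores_after_removal, baseline, tolerance=0.05):
    """Days until the risk score returns to within `tolerance` of
    baseline after the artificial signal source is removed, or None
    if it never recovers within the observed window."""
    for day, score in enumerate(scores_after_removal):
        if abs(score - baseline) <= tolerance:
            return day
    return None
```

A lower sensitivity count or a longer decay window both indicate a system that is cheap to manipulate and slow to self-correct.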
Facial Recognition Systems
Adversarial Evasion Techniques
Facial recognition systems used by law enforcement for identification, surveillance, and access control are vulnerable to adversarial evasion at multiple levels:
Physical adversarial attacks:
| Technique | Description | Effectiveness | Detectability |
|---|---|---|---|
| Adversarial makeup | Geometric patterns that disrupt facial landmark detection | Moderate — effectiveness varies by system | Low — patterns can be subtle |
| Adversarial accessories | Glasses, hats, or scarves with printed adversarial patterns | High against targeted systems | Medium — unusual appearance |
| IR illumination | IR LEDs that overwhelm camera sensors without visible light | High against IR-based systems | Low — invisible to humans |
| Projection attacks | Projecting adversarial patterns onto the face | High in controlled settings | High — requires equipment |
| 3D-printed masks | Masks designed to impersonate another identity | Variable — depends on liveness detection | High — detectable by humans |
```python
# Testing facial recognition robustness
# NOTE: Use synthetic face data only — never real individuals
def test_facial_recognition_robustness(fr_system, synthetic_faces):
    """
    Evaluate facial recognition robustness against adversarial
    modifications using synthetic face data.
    """
    results = {
        "baseline_accuracy": None,
        "evasion_results": {},
    }
    # Baseline accuracy on unmodified synthetic faces
    baseline_correct = 0
    for face in synthetic_faces:
        match = fr_system.identify(face.image)
        if match.identity == face.identity:
            baseline_correct += 1
    results["baseline_accuracy"] = baseline_correct / len(synthetic_faces)
    # Test each evasion technique
    evasion_techniques = [
        "gaussian_noise",
        "adversarial_patch",
        "landmark_perturbation",
        "color_channel_shift",
        "geometric_transform",
    ]
    for technique in evasion_techniques:
        evaded = 0
        for face in synthetic_faces:
            modified = apply_evasion(face.image, technique)
            match = fr_system.identify(modified)
            if match.identity != face.identity:
                evaded += 1
        results["evasion_results"][technique] = {
            "evasion_rate": evaded / len(synthetic_faces),
            "technique": technique,
        }
    return results
```

Demographic Bias Testing
Facial recognition bias is well-documented: systems consistently perform worse on darker-skinned faces, women, and older individuals. Red team testing must quantify this bias:
```python
# Demographic bias evaluation for facial recognition
def test_demographic_bias(fr_system, demographic_dataset):
    """
    Measure accuracy across demographic groups using the
    intersectional analysis methodology from Buolamwini & Gebru.
    """
    accuracy_by_group = {}
    for face in demographic_dataset:
        group_key = f"{face.skin_tone}_{face.gender}_{face.age_range}"
        if group_key not in accuracy_by_group:
            accuracy_by_group[group_key] = {"correct": 0, "total": 0,
                                            "false_positive": 0,
                                            "false_negative": 0}
        result = fr_system.identify(face.image)
        accuracy_by_group[group_key]["total"] += 1
        if result.identity == face.identity:
            accuracy_by_group[group_key]["correct"] += 1
        elif result.identity is not None:
            accuracy_by_group[group_key]["false_positive"] += 1
        else:
            accuracy_by_group[group_key]["false_negative"] += 1
    # Report disparities
    accuracies = {k: v["correct"] / v["total"]
                  for k, v in accuracy_by_group.items()}
    best = max(accuracies.values())
    worst = min(accuracies.values())
    return {
        "accuracy_by_group": accuracies,
        "best_group_accuracy": best,
        "worst_group_accuracy": worst,
        "accuracy_gap": best - worst,
        "disparity_ratio": worst / best if best > 0 else 0,
    }
```

False Identification Risk
The most dangerous failure mode of law enforcement facial recognition is false identification — incorrectly matching an individual to a suspect or wanted person. Multiple cases of wrongful arrest based on facial recognition misidentification have been documented in the United States.
Red team testing should measure:
- False positive rate at operational thresholds — What is the false match rate at the confidence threshold the agency uses for investigative leads?
- Gallery size effects — How does accuracy degrade as the reference database grows?
- Image quality sensitivity — How does accuracy degrade with surveillance-quality images (low resolution, poor lighting, oblique angles)?
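The first two measurements lend themselves to a small harness. The sketch below is illustrative only: `false_match_rate` computes an empirical false match rate from (score, ground truth) comparison pairs at a chosen operational threshold, and `gallery_false_match_prob` shows why gallery size matters, under the simplifying (and optimistic) assumption that comparisons against gallery entries are independent. Both function names are hypothetical.

```python
def false_match_rate(match_results, threshold):
    """match_results: list of (similarity_score, is_same_identity)
    pairs. Returns the fraction of different-identity comparisons
    that score at or above the operational threshold."""
    non_match_scores = [score for score, same in match_results if not same]
    if not non_match_scores:
        return 0.0
    hits = sum(1 for score in non_match_scores if score >= threshold)
    return hits / len(non_match_scores)


def gallery_false_match_prob(single_comparison_fmr, gallery_size):
    """Probability of at least one false match when a probe image is
    compared against every gallery entry, assuming independent
    comparisons (an idealization, but it shows the scaling)."""
    return 1 - (1 - single_comparison_fmr) ** gallery_size
```

Even a per-comparison false match rate that sounds tiny becomes near-certain misidentification at mugshot-database scale, which is why threshold choice and gallery size must be assessed together.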
Evidence Analysis AI
Digital Evidence AI Manipulation
Law enforcement increasingly uses AI to process digital evidence: analyzing communications for criminal activity indicators, processing financial records for fraud patterns, and identifying child exploitation material. These systems can be manipulated to either miss evidence or fabricate indicators.
```python
# Testing evidence analysis AI for manipulation
evidence_ai_tests = {
    "evidence_suppression": {
        "description": "Can communications be structured to evade "
                       "AI detection of criminal activity indicators?",
        "technique": "Use code words, steganography, or adversarial "
                     "text that fools the AI while conveying meaning "
                     "to the human recipient",
        "test_cases": [
            "Substitution cipher in casual conversation",
            "Meaning conveyed through sentence structure rather "
            "than vocabulary",
            "Information embedded in message metadata or timing",
        ],
    },
    "false_indicator_injection": {
        "description": "Can an adversary plant communications that "
                       "trigger AI criminal activity indicators "
                       "against an innocent target?",
        "technique": "Send messages to the target that contain "
                     "criminal activity language, causing AI to "
                     "flag the target's communications",
        "severity": "Critical — could lead to wrongful investigation",
    },
}
```

Chain of Custody for AI Evidence
When AI processes evidence, the chain of custody must include the AI system itself. Red team assessments should verify:
- Model versioning — Is the exact model version used to process evidence recorded?
- Reproducibility — Can the AI analysis be reproduced with the same inputs and model version?
- Tampering detection — Can the AI system or its outputs be modified without detection?
- Audit trail completeness — Does the audit trail capture all AI processing steps, parameters, and intermediate results?
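A minimal sketch of what such a chain-of-custody record might capture, assuming the model ships as a file on disk that can be hashed; the function and field names are illustrative, not any agency's standard:

```python
import hashlib
from datetime import datetime, timezone


def record_ai_evidence_processing(model_path, model_version,
                                  parameters, output_bytes):
    """Build a chain-of-custody record for one AI processing step:
    a hash of the exact model file used, the declared version, the
    run parameters, and a hash of the produced output. Re-running
    the same model and parameters should reproduce output_sha256."""
    with open(model_path, "rb") as f:
        model_sha256 = hashlib.sha256(f.read()).hexdigest()
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "model_sha256": model_sha256,
        "parameters": parameters,
        "output_sha256": hashlib.sha256(output_bytes).hexdigest(),
    }
```

Hashing the model file itself, rather than trusting the version string, is what lets an auditor detect a silently swapped or retrained model after the fact.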
Surveillance System Attacks
Scope Expansion Attacks
AI-powered surveillance systems (automatic license plate readers, video analytics, communications monitoring) have defined legal scopes — they are authorized to collect specific data for specific purposes. An adversary (or a system error) can cause scope expansion, where the AI collects or analyzes data beyond its authorized scope.
Red team testing should probe whether:
- The AI can be configured (or misconfigured) to monitor communications or individuals outside its legal authority
- Alerting thresholds can be manipulated to generate false alerts that justify expanded surveillance
- Data retention policies are enforced by the AI or can be bypassed
- Cross-system data sharing is properly restricted
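The scope and retention probes can be partially automated by auditing the collection log against the authorization. A minimal sketch, assuming each log record carries a zone label and a collection timestamp (both field names are hypothetical):

```python
from datetime import datetime, timedelta


def audit_surveillance_scope(records, authorized_zones,
                             max_retention_days, now):
    """Flag collection records that fall outside the authorized
    zones or exceed the retention window. Each record is assumed
    to be a dict with a 'zone' label and a 'collected_at' datetime.
    Returns (violation_kind, record) pairs for human review."""
    violations = []
    for record in records:
        if record["zone"] not in authorized_zones:
            violations.append(("out_of_scope_zone", record))
        if now - record["collected_at"] > timedelta(days=max_retention_days):
            violations.append(("retention_exceeded", record))
    return violations
```

In a red team engagement, the interesting finding is usually not that violations exist but that no equivalent check runs automatically, meaning scope expansion would go unnoticed.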
Defensive Recommendations
Bias auditing as a security requirement
Require regular demographic bias audits of all law enforcement AI. Treat disparate accuracy across demographic groups as a security vulnerability, not just a fairness concern.
Human decision authority
Ensure that no law enforcement action (arrest, search, surveillance initiation) is taken solely on the basis of AI output. AI should inform human decision-making, not replace it.
Feedback loop monitoring
Implement monitoring to detect and break feedback loops in predictive systems. Regularly audit whether predictions reflect actual crime patterns or merely policing patterns.
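One simple audit along these lines: compare how strongly the system's predictions correlate with patrol allocation versus an independent measure of crime (such as victimization surveys). The sketch below uses a plain Pearson correlation and hypothetical inputs; it is a heuristic red-flag indicator, not a formal causal test.

```python
def pearson(xs, ys):
    """Plain Pearson correlation, no external dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def feedback_loop_indicator(predicted_risk, patrol_hours,
                            independent_crime_measure):
    """If predicted risk tracks patrol allocation more closely than
    an independent crime measure, the model may be learning policing
    patterns rather than crime patterns. Inputs are per-area series."""
    return {
        "corr_with_patrols": pearson(predicted_risk, patrol_hours),
        "corr_with_independent": pearson(predicted_risk,
                                         independent_crime_measure),
    }
```

A large gap between the two correlations is the signal to investigate; it does not by itself prove a feedback loop, since patrols may legitimately follow crime.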
Evidence handling standards
Establish chain-of-custody requirements for AI-processed evidence that include model version, processing parameters, and reproducibility verification.
Further Reading
- Government AI Security Overview — Broader government AI context
- Public Services AI — Citizen-facing government AI
- FedRAMP for AI — Federal compliance framework
- AI Incident Classification — How to classify AI security incidents