Video Frame Injection Attacks
Inserting adversarial frames into video to exploit video understanding models: temporal injection, keyframe manipulation, subliminal frame attacks, and detection evasion.
Video understanding models process video by sampling frames, extracting features, and reasoning about temporal sequences. They do not see every frame. This sampling behavior creates a precise attack surface: if an attacker knows (or can guess) how a model samples frames, they can insert adversarial content at positions that are likely to be selected, while keeping the video visually normal to human viewers, who see it at full frame rate.
How Video Models Sample Frames
Understanding the target model's sampling strategy is the foundation of any frame injection attack.
Common Sampling Strategies
| Strategy | Method | Frames Selected (from 300-frame video) | Vulnerability |
|---|---|---|---|
| Uniform sampling | Select N frames at equal intervals | Frames 0, 37, 75, 112, ... | Predictable positions |
| Keyframe extraction | Use I-frames from video codec | Codec-dependent | Attacker controls codec |
| Scene-change detection | Sample frames at scene boundaries | Variable | Attacker can inject fake scene changes |
| Random sampling | Select N random frames | Unpredictable | Attack requires saturation with many frames |
| Temporal stride | Every Kth frame | Frames 0, K, 2K, ... | Predictable if K is known |
import cv2
import numpy as np

def analyze_sampling_strategy(
    model_fn: callable,
    test_video_path: str,
    num_unique_frames: int = 300
):
    """Determine which frames a model actually processes by using unique markers."""
    # Read the frame rate and dimensions from a reference video
    cap = cv2.VideoCapture(test_video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    cap.release()

    # Generate a test video with frame-number watermarks
    marked_path = "marked_test.mp4"
    writer = cv2.VideoWriter(marked_path, cv2.VideoWriter_fourcc(*'mp4v'),
                             fps, (width, height))
    for i in range(num_unique_frames):
        frame = np.zeros((height, width, 3), dtype=np.uint8)
        cv2.putText(frame, f"FRAME_{i:04d}", (50, height // 2),
                    cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 255, 255), 3)
        writer.write(frame)
    writer.release()

    # Ask the model which frame numbers it sees
    response = model_fn(marked_path, "List all FRAME_XXXX identifiers you can see.")
    return response

Single-Frame Injection
The simplest attack: insert one adversarial frame at a position the model is likely to sample.
Targeting Uniform Sampling
If the model uniformly samples N frames from a video of T total frames, the sampled positions are approximately at indices [0, T/N, 2T/N, ..., T-1]. The attacker replaces the frame at one of these positions.
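These positions can be predicted with a short helper. This is a sketch assuming a linspace-style uniform sampler (N indices spread evenly from the first to the last frame); `predict_uniform_sample_positions` is an illustrative name, and real pipelines may round differently (some use floor(i * T / N) instead), so confirm against the probe results from `analyze_sampling_strategy`.

```python
def predict_uniform_sample_positions(total_frames: int, num_sampled: int) -> list:
    """Predict which frame indices a linspace-style uniform sampler selects.

    Spreads num_sampled indices evenly from frame 0 to the last frame,
    rounding each to the nearest integer index.
    """
    if num_sampled <= 1:
        return [0]
    step = (total_frames - 1) / (num_sampled - 1)
    return [round(i * step) for i in range(num_sampled)]
```

With 300 frames and 9 samples this yields indices starting 0, 37, 75, 112, consistent with the sampling-strategy table above.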
def inject_single_frame(
    video_path: str,
    adversarial_frame: np.ndarray,
    target_position: int,
    output_path: str
):
    """Replace a single frame in the video with an adversarial frame."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'mp4v'),
                             fps, (width, height))
    frame_idx = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx == target_position:
            # Resize the adversarial frame to match the video dimensions
            adv_resized = cv2.resize(adversarial_frame, (width, height))
            writer.write(adv_resized)
        else:
            writer.write(frame)
        frame_idx += 1
    cap.release()
    writer.release()
    return output_path

Blended Injection
Rather than replacing a frame entirely (which creates a visible glitch if noticed), blend the adversarial content with the original frame.
def inject_blended_frame(
    video_path: str,
    adversarial_content: np.ndarray,
    target_position: int,
    blend_alpha: float = 0.3,
    output_path: str = "blended_output.mp4"
):
    """Blend adversarial content into a frame rather than replacing it."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'mp4v'),
                             fps, (width, height))
    frame_idx = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx == target_position:
            adv_resized = cv2.resize(adversarial_content, (width, height))
            blended = cv2.addWeighted(frame, 1 - blend_alpha, adv_resized, blend_alpha, 0)
            writer.write(blended.astype(np.uint8))
        else:
            writer.write(frame)
        frame_idx += 1
    cap.release()
    writer.release()
    return output_path

Multi-Frame Injection Strategies
Single-frame injection is fragile: if the model's sampling misses the injected frame, the attack fails. Multi-frame strategies increase reliability.
Saturation Injection
Insert adversarial frames at regular intervals throughout the video, so that regardless of sampling strategy, at least one adversarial frame is likely to be captured.
def inject_saturated(
    video_path: str,
    adversarial_frame: np.ndarray,
    injection_interval: int = 10,
    output_path: str = "saturated_output.mp4"
):
    """Insert adversarial frames at regular intervals."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'mp4v'),
                             fps, (width, height))
    frame_idx = 0
    injected_count = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx % injection_interval == 0:
            adv_resized = cv2.resize(adversarial_frame, (width, height))
            writer.write(adv_resized)
            injected_count += 1
        else:
            writer.write(frame)
        frame_idx += 1
    cap.release()
    writer.release()
    return output_path, injected_count

Temporal Gradient Injection
Gradually introduce adversarial content across multiple frames, making the transition less detectable by temporal anomaly detectors.
def inject_temporal_gradient(
    video_path: str,
    adversarial_frame: np.ndarray,
    center_position: int,
    ramp_frames: int = 5,
    output_path: str = "gradient_output.mp4"
):
    """Gradually blend in adversarial content over multiple frames."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'mp4v'),
                             fps, (width, height))
    adv_resized = cv2.resize(adversarial_frame, (width, height))
    start = center_position - ramp_frames
    end = center_position + ramp_frames
    frame_idx = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if start <= frame_idx <= end:
            # Blend factor ramps up linearly and peaks at center_position
            distance = abs(frame_idx - center_position)
            alpha = 1.0 - (distance / (ramp_frames + 1))
            blended = cv2.addWeighted(frame, 1 - alpha, adv_resized, alpha, 0)
            writer.write(blended.astype(np.uint8))
        else:
            writer.write(frame)
        frame_idx += 1
    cap.release()
    writer.release()
    return output_path

Adversarial Frame Content Types
The content of the injected frame determines the attack's objective.
Typographic Frame Injection
Insert a frame containing text instructions. This combines video frame injection with typographic attacks: the video model reads the text in the adversarial frame and follows the instructions.
def create_text_frame(
    width: int,
    height: int,
    instruction: str,
    background_color: tuple = (255, 255, 255),
    text_color: tuple = (0, 0, 0)
):
    """Create a frame containing adversarial text instructions."""
    frame = np.full((height, width, 3), background_color, dtype=np.uint8)
    # Split the instruction into lines that fit the frame width
    words = instruction.split()
    lines = []
    current_line = ""
    max_chars = width // 15  # Approximate characters per line
    for word in words:
        if len(current_line) + len(word) + 1 <= max_chars:
            current_line += " " + word if current_line else word
        else:
            lines.append(current_line)
            current_line = word
    if current_line:
        lines.append(current_line)
    # Vertically center the text block
    y_start = height // 2 - (len(lines) * 30) // 2
    for i, line in enumerate(lines):
        cv2.putText(frame, line, (20, y_start + i * 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, text_color, 2)
    return frame

Adversarial Image Frame
Insert a frame that is an adversarial image crafted to cause misclassification or a behavior change in the vision encoder. This requires white-box access to the encoder to compute perturbations.
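As a toy illustration of the white-box computation, the sketch below applies an FGSM-style step against a linear stand-in for the encoder's score function. Everything here is illustrative: `fgsm_perturb_linear` is a hypothetical name, and real attacks compute the gradient of the actual encoder's loss via automatic differentiation (e.g., PyTorch autograd) rather than the analytic gradient of a linear model, but the structure of the step is the same.

```python
import numpy as np

def fgsm_perturb_linear(frame: np.ndarray, weights: np.ndarray,
                        epsilon: float = 8.0) -> np.ndarray:
    """FGSM-style step against a toy linear scorer s(x) = w . x.

    The gradient of s with respect to x is simply w, so the perturbation
    that most decreases the score within an L-infinity budget of epsilon
    is x - epsilon * sign(w). Pixel values are clipped back to [0, 255].
    """
    grad = weights  # analytic gradient of w . x with respect to x
    perturbed = frame.astype(np.float64) - epsilon * np.sign(grad)
    return np.clip(perturbed, 0, 255).astype(np.uint8)
```

For a real vision encoder, `weights` would be replaced by the backpropagated gradient of a target loss with respect to the input frame.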
Context-Manipulating Frame
Insert a frame showing a different scene that changes the model's understanding of the video's context. For example, inserting a frame of a medical setting into a cooking video might cause the model to describe the video as medical content.
Codec-Level Attacks
Video codecs (H.264, H.265, VP9) use keyframes (I-frames) and delta frames (P-frames, B-frames). Models that extract keyframes as their sampling strategy are vulnerable to codec-level manipulation.
Forcing Keyframe Placement
import subprocess

def encode_with_forced_keyframes(
    input_path: str,
    keyframe_positions: list,
    output_path: str
):
    """Re-encode the video with keyframes at specific positions."""
    # Build the keyframe expression for ffmpeg
    kf_expr = "+".join(f"eq(n,{pos})" for pos in keyframe_positions)
    cmd = [
        "ffmpeg", "-i", input_path,
        "-force_key_frames", f"expr:{kf_expr}",
        "-c:v", "libx264",
        "-y", output_path
    ]
    subprocess.run(cmd, capture_output=True, check=True)
    return output_path

By forcing keyframes at the positions where adversarial frames have been inserted, the attacker ensures that keyframe-based sampling will select the adversarial content.
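To confirm the forced placement took effect, one can list the encoded video's I-frames with ffprobe. The sketch below assumes ffprobe is on PATH and uses its CSV output mode (`-show_entries frame=pict_type -of csv`), which emits one `frame,<type>` line per frame; the parsing is split into a pure helper so it can be checked independently.

```python
import subprocess

def parse_pict_types(csv_output: str) -> list:
    """Parse ffprobe CSV lines like 'frame,I' into a list of I-frame indices."""
    indices = []
    frame_lines = [l for l in csv_output.splitlines() if l.strip()]
    for i, line in enumerate(frame_lines):
        if line.strip().split(",")[-1] == "I":
            indices.append(i)
    return indices

def list_keyframe_indices(video_path: str) -> list:
    """Return the frame indices of I-frames in the video (requires ffprobe)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "frame=pict_type", "-of", "csv", video_path],
        capture_output=True, text=True, check=True,
    )
    return parse_pict_types(result.stdout)
```

Comparing `list_keyframe_indices` output against the requested `keyframe_positions` verifies whether the encoder honored the forced placement.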
Detection and Defense
Temporal Consistency Analysis
Adversarial frames typically differ significantly from their neighbors. Measuring frame-to-frame similarity can identify injections.
def detect_frame_anomalies(
    video_path: str,
    threshold: float = 0.3
):
    """Detect anomalous frames by measuring temporal consistency."""
    cap = cv2.VideoCapture(video_path)
    prev_frame = None
    anomalies = []
    frame_idx = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_frame is not None:
            # Mean absolute pixel difference between consecutive frames
            diff = cv2.absdiff(prev_frame, gray)
            mean_diff = np.mean(diff) / 255.0
            if mean_diff > threshold:
                anomalies.append({
                    "frame": frame_idx,
                    "difference_score": float(mean_diff),
                    "type": "high_temporal_discontinuity"
                })
        prev_frame = gray
        frame_idx += 1
    cap.release()
    return anomalies

Multi-Sample Verification
Process the video with multiple different sampling strategies. If the results are consistent, the video is likely clean. If different sampling strategies produce different descriptions, adversarial frames may be present.
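A minimal sketch of the comparison step: given the descriptions produced under each sampling strategy, score their agreement and flag low-overlap cases. The word-set Jaccard overlap used here is a deliberately crude stand-in (an embedding-based similarity would be more robust), and `description_consistency` is an illustrative name.

```python
def description_consistency(descriptions: list) -> float:
    """Score agreement between descriptions produced under different
    frame-sampling strategies, as the minimum pairwise Jaccard overlap
    of their word sets. Low scores suggest sampling-dependent content,
    a possible sign of injected frames.
    """
    token_sets = [set(d.lower().split()) for d in descriptions]
    score = 1.0
    for i in range(len(token_sets)):
        for j in range(i + 1, len(token_sets)):
            union = token_sets[i] | token_sets[j]
            if union:
                overlap = len(token_sets[i] & token_sets[j]) / len(union)
                score = min(score, overlap)
    return score
```

A pipeline might run uniform, random, and keyframe sampling, collect the three descriptions, and escalate the video for review whenever the score falls below a tuned threshold.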
Frame Deduplication
Before processing, identify and remove near-duplicate or anomalous frames. This defends against saturation injection by limiting the number of adversarial frames that reach the model.
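One way to sketch this defense is a difference-hash (dHash) deduplicator: saturation injection repeats the same adversarial frame many times, so capping how often any one perceptual hash may appear limits its copies. This is a NumPy-only illustration with hypothetical names (`dhash`, `deduplicate_frames`); production systems would typically use a hardened perceptual hash library.

```python
import numpy as np

def dhash(frame: np.ndarray, hash_size: int = 8) -> tuple:
    """Difference hash of a grayscale frame: block-mean downscale to
    hash_size x (hash_size + 1), then compare horizontal neighbors."""
    h, w = frame.shape
    rows = np.array_split(np.arange(h), hash_size)
    cols = np.array_split(np.arange(w), hash_size + 1)
    small = np.array([[frame[np.ix_(r, c)].mean() for c in cols] for r in rows])
    return tuple((small[:, 1:] > small[:, :-1]).flatten())

def deduplicate_frames(frames: list, max_repeats: int = 2) -> list:
    """Drop frames whose perceptual hash has already appeared max_repeats
    times, capping how many copies of a saturated adversarial frame
    reach the model."""
    counts = {}
    kept = []
    for frame in frames:
        key = dhash(frame)
        counts[key] = counts.get(key, 0) + 1
        if counts[key] <= max_repeats:
            kept.append(frame)
    return kept
```

Against the `inject_saturated` attack above, this reduces every tenth-frame copy of the adversarial frame to at most `max_repeats` instances before sampling.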
Red-Team Evaluation Methodology
Identify the video processing pipeline
Determine which model processes the video, how it samples frames, and what output it produces (classification, description, action recognition, content moderation).
Probe sampling behavior
Submit test videos with frame-number markers to determine the model's sampling strategy. This reveals which frame positions are most valuable for injection.
Test single-frame injection
Insert a single, clearly adversarial frame (e.g., containing large text instructions) at a predicted sample position. Verify whether the model processes it.
Test stealth variants
Progress to blended injection, temporal gradients, and low-visibility content. Measure the minimum injection strength that still affects model output.
Evaluate codec-level attacks
If the model uses keyframe extraction, test whether forcing keyframes at adversarial frame positions increases attack success.
Test detection bypass
If the system has frame anomaly detection, use temporal gradient injection or context-consistent adversarial frames to evade detection while still affecting model output.
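The steps above can be wrapped in a small harness that runs each attack variant against the target pipeline and records success rates. All three callables here are stand-ins for project-specific code: `model_fn` is the video pipeline under test, each entry in `variants` builds one attacked video, and `success_check` encodes the attack objective.

```python
def evaluate_attack_variants(model_fn, variants, success_check) -> dict:
    """Run each attack variant against the model and record the outcome.

    model_fn:      callable(video_path, prompt) -> str   (target pipeline)
    variants:      dict of name -> callable() -> video_path
    success_check: callable(response) -> bool            (attack objective)
    """
    results = {}
    for name, build_video in variants.items():
        path = build_video()
        response = model_fn(path, "Describe this video.")
        results[name] = {"success": success_check(response),
                         "response": response}
    return results
```

In practice the variant builders would call `inject_single_frame`, `inject_blended_frame`, `inject_saturated`, and `inject_temporal_gradient` with increasing stealth, so one run produces a comparable success table.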
Summary
Video frame injection exploits the fundamental mismatch between how humans and models perceive video. By inserting adversarial frames at the positions a model samples, attackers can steer video understanding outputs without creating artifacts visible to human viewers. Effective defense requires temporal consistency analysis, multi-sample verification, and treating each sampled frame as potentially adversarial input. As video understanding models become more prevalent in content moderation, surveillance, and media analysis, frame injection attacks represent an increasingly important attack surface.