# training-pipeline
58 articles tagged "training-pipeline"
Advanced Practice Exam
25-question practice exam covering advanced AI red team techniques: multimodal attacks, training pipeline exploitation, agentic system attacks, embedding manipulation, and fine-tuning security.
Practice Exam 2: Advanced AI Security
25-question advanced practice exam covering multimodal attacks, training pipeline security, cloud AI security, forensics, and governance.
Training Pipeline Security Assessment
A 9-question test of your advanced knowledge of training pipeline attacks, including data poisoning, fine-tuning hijacking, RLHF manipulation, and backdoor implantation.
Advanced Training Pipeline Assessment
Advanced assessment on RLHF exploitation, DPO vulnerabilities, and federated learning attacks.
Training Pipeline Security Study Guide
Study guide for training pipeline security covering data poisoning, RLHF attacks, and supply chain threats.
Capstone: Training Pipeline Attack & Defense
Attack a model training pipeline through data poisoning and backdoor insertion, then build defenses to detect and prevent these attacks.
ML CI/CD Security
Security overview of ML continuous integration and deployment pipelines: how ML CI/CD differs from traditional CI/CD, unique attack surfaces in training workflows, and the security implications of automated model building and deployment.
Checkpoint Manipulation Attacks (Training Pipeline)
Direct manipulation of model checkpoints and saved weights to inject backdoors or alter behavior.
Continuous Training Pipeline Attacks
Exploiting continuous learning and online training pipelines through streaming data manipulation.
Curriculum Learning Exploitation (Training Pipeline)
Exploiting curriculum learning and data ordering to amplify the effect of poisoned training examples.
Manipulating Curriculum Learning Schedules
How adversaries exploit curriculum learning by manipulating data ordering, difficulty scheduling, and stage transitions to embed vulnerabilities during training.
Data Poisoning at Scale
Techniques for poisoning training data at scale to influence model behavior across broad capabilities.
Attack Surface of Distributed Training
Security analysis of distributed training systems including gradient aggregation attacks, Byzantine fault exploitation, communication channel vulnerabilities, and federated learning threats.
Security Implications of DPO Training
Analysis of security vulnerabilities introduced by Direct Preference Optimization, including preference manipulation, implicit reward model exploitation, and safety alignment degradation.
DPO Training Vulnerabilities
Security analysis of Direct Preference Optimization training and its vulnerability to preference poisoning.
Evaluation Benchmark Gaming
Techniques for gaming evaluation benchmarks to mask vulnerabilities or inflate safety scores.
Federated Learning Attacks (Training Pipeline)
Attacks on federated learning setups including model poisoning, data inference, and aggregation manipulation.
Gradient-Based Attacks During Training
Technical deep dive into gradient-based attack methods that exploit training-time access, including gradient manipulation, adversarial weight perturbation, and training signal hijacking.
Model Supply Chain Attacks
Comprehensive analysis of model supply chain attack vectors from training data through deployment.
Pre-Training Data Attacks
Attacking the pre-training data pipeline including web crawl poisoning and data curation manipulation.
Security Comparison: Pre-training vs Fine-tuning
Comparative analysis of security vulnerabilities, attack surfaces, and defensive strategies across pre-training and fine-tuning phases of language model development.
RLHF Pipeline Exploitation
Exploiting reward model training, preference data collection, and RLHF optimization loops.
Security of RLHF: Reward Hacking and Reward Model Attacks
Comprehensive analysis of security vulnerabilities in RLHF pipelines, including reward hacking, reward model poisoning, and preference manipulation attacks.
Synthetic Data Pipeline Attacks (Training Pipeline)
Attacking synthetic data generation pipelines used for model training and augmentation.
Poisoning Attacks on Synthetic Training Data
Comprehensive analysis of poisoning vectors in synthetic data generation pipelines, from teacher model manipulation to post-generation filtering evasion.
Security of Training Checkpoints
Threat analysis of model checkpoint storage, serialization, and restoration including checkpoint poisoning, deserialization attacks, and integrity verification.
Security of Training Data Attribution Methods
Analysis of vulnerabilities in training data attribution techniques including influence functions, membership inference, and data provenance tracking, with implications for privacy and security.
Security Implications of Training Data Deduplication
Analysis of how deduplication algorithms create security vulnerabilities, including adversarial deduplication evasion, strategic duplicate injection, and hash collision attacks.
Training Infrastructure Attacks
Attacking training infrastructure including GPU clusters, distributed training, and orchestration systems.
Chapter Assessment: Training Pipeline
A 15-question calibrated assessment testing your understanding of training pipeline security: data poisoning, RLHF manipulation, and architecture-level attacks.