1 articletagged with “web-scraping”
Comprehensive overview of pre-training security vulnerabilities including data collection, cleaning, deduplication, and web-scale dataset compromise attack vectors.