# transfer-attacks
6 articles tagged with “transfer-attacks”
Lab: cross-lingual transfer attacks
Develop attacks in one language that transfer to others by exploiting shared multilingual representation spaces.
Lab: cross-model transfer attacks
Test whether jailbreaks discovered on one language model transfer effectively to others, building a systematic methodology for cross-model vulnerability research.
Lab: developing transfer attacks
Hands-on lab for crafting adversarial prompts on open-weight models like Llama that transfer to closed-source models like Claude and GPT-4, using iterative refinement and cross-model evaluation.
Lab: developing transfer attacks (advanced lab)
Develop adversarial attacks on open-source models that transfer to closed-source models, leveraging weight access for black-box exploitation.
Jailbreak transferability
Analysis of which jailbreaks transfer across models and why, including universal vs. model-specific techniques, transfer attack methodology, and the factors that determine portability.
Universal adversarial triggers
Discovering and deploying universal adversarial trigger sequences that reliably bypass the safety alignment of multiple LLM families, including gradient-based search, transfer attacks, and defense evasion.