# tuning
8 articles tagged with “tuning”
Skill verification: fine-tuning attacks
Hands-on verification of fine-tuning-based attack techniques including safety alignment removal.
Security analysis of adapter layers
Security analysis of adapter-based fine-tuning including LoRA, prefix tuning, and prompt tuning.
Alignment stability under fine-tuning
Testing how safety alignment degrades under various fine-tuning configurations and datasets.
Few-shot detuning attacks
Removing safety alignment with minimal fine-tuning data through targeted few-shot detuning.
Bypassing safety via instruction tuning
Using instruction tuning to selectively bypass safety mechanisms while maintaining model capability.
Security risks of shared adapters
Security risks of using publicly shared adapters from model hubs and community repositories.
Security research on the fine-tuning API
Probing fine-tuning APIs for security weaknesses, including insufficient validation and unsafe default configurations.
Manipulation of instruction-tuning data
Manipulating instruction-tuning datasets to embed specific behavior in the resulting model.