Defending against Insertion-based Textual Backdoor Attacks via Attribution

Textual backdoor attack, as a novel attack model, has been shown to be effective in adding a backdoor to the model during training. Defending against such backdoor attacks has become urgent and important. In this paper, we propose AttDef, an efficient attribution-based pipeline to defend against two...

Bibliographic Details
Published in:	arXiv.org 2023-08
Main authors:	Li, Jiazhao; Wu, Zhuofeng; Wei, Ping; Xiao, Chaowei; Vinod Vydiswaran, V G
Format:	Article
Language:	English
Subjects:	Insertion; Poisoning; Poisons; Training
Online access:	Full text