Adversarial retraining attack of asynchronous advantage actor‐critic based pathfinding

Bibliographic details
Published in: International journal of intelligent systems 2021-05, Vol.36 (5), p.2323-2346
Main authors: Tong, Chen; Jiqiang, Liu; Yingxiao, Xiang; Wenjia, Niu; Endong, Tong; Shuoru, Wang; He, Li; Liang, Chang; Gang, Li; Qi Alfred, Chen
Format: Article
Language: English
Description
Summary: Pathfinding has become an important component in many real‐world scenarios, such as popular warehouse systems and autonomous aircraft towing vehicles. With the development of reinforcement learning (RL), especially in the context of asynchronous advantage actor‐critic (A3C), pathfinding is undergoing a revolution in terms of efficient parallel learning. Like other artificial intelligence‐based applications, A3C‐based pathfinding is also threatened by adversarial attacks. In this paper, we are the first to study an adversarial attack on A3C that can unexpectedly trigger a long‐running retraining mechanism, which continues until pathfinding succeeds. We also present an attack example generation method, based on the gradient band, in which a single baffle of only a few unit lengths can successfully perform the attack. Experiments with detailed analysis show a high attack success rate of 95% with an average baffle length of 2.95. We also discuss defense suggestions that leverage the insights from our analysis.
ISSN: 0884-8173, 1098-111X
DOI: 10.1002/int.22380