Target-Driven Visual Navigation by Using Causal Intervention

Bibliographic details
Published in: IEEE Transactions on Intelligent Vehicles, 2024-01, Vol. 9 (1), p. 1-10
Main authors: Zhao, Xinzhou; Wang, Tian; Li, Yanjing; Zhang, Baochang; Liu, Kexin; Liu, Deyuan; Wang, Chuanyun; Snoussi, Hichem
Format: Article
Language: English
Description
Abstract: Target-driven visual navigation presents great potential in scientific and industrial fields. It takes the target and environment observations as input. However, during training, we found that the agent sometimes got stuck in specific locations. Based on an analysis of visual information from a novel causal perspective, one of the most critical hurdles is the neglect of confounders in environments, which often leads to spurious correlations. Mitigating the confounding effect helps to discover the real causality and has therefore been taken into consideration in other fields such as object detection. In this paper, we propose Causal Intervention Visual Navigation (CIVN), based on deep reinforcement learning (DRL) and causal intervention. We realize causal intervention using front-door adjustment, as most confounders are hard to model explicitly. Specifically, CIVN is implemented by Causal Attention, which is a reasonable approximation of causal intervention for visual navigation. Causal Attention provides a high-quality representation, which is leveraged by DRL and reduces the number of times the agent gets stuck. It is worth mentioning that causal intervention is applied here for the first time to address the confounding effect in target-driven visual navigation. Extensive experiments on AI2-THOR demonstrate that CIVN achieves better performance than prior art. Specifically, generalization to unknown targets and scenes, which is a basic research topic in visual navigation, is improved by a large margin. Moreover, to obtain better generalization, we also propose a novel experiment that utilizes pre-trained models.
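The abstract names front-door adjustment as the form of causal intervention used. The sketch below illustrates that general technique only; it is not the paper's CIVN or its Causal Attention module, which the abstract describes as an approximation of the intervention. It assumes a hypothetical toy model with binary variables: a hidden confounder U affecting both X and Y, and a mediator Z on the path X -> Z -> Y. All probability values and function names are illustrative assumptions. The front-door estimate P(Y=1 | do(X=x)) = sum_z P(z | x) sum_x' P(x') P(Y=1 | z, x') is computed from the observational joint alone and compared with the ground truth obtained from the full model.

```python
from itertools import product

# Hypothetical toy model (all numbers are illustrative assumptions):
# U -> X and U -> Y (hidden confounder), X -> Z -> Y (front-door path).
P_U = {0: 0.6, 1: 0.4}                       # P(U=u)
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}              # P(X=1 | U=u)
P_Z1_GIVEN_X = {0: 0.3, 1: 0.9}              # P(Z=1 | X=x)
P_Y1_GIVEN_ZU = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y=1 | Z=z, U=u)
                 (1, 0): 0.4, (1, 1): 0.9}


def bern(p1, value):
    """Probability of `value` for a binary variable with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def observational_joint():
    """Return P(x, z, y) with the hidden confounder U marginalised out."""
    joint = {}
    for x, z, y in product((0, 1), repeat=3):
        joint[(x, z, y)] = sum(
            P_U[u]
            * bern(P_X1_GIVEN_U[u], x)
            * bern(P_Z1_GIVEN_X[x], z)
            * bern(P_Y1_GIVEN_ZU[(z, u)], y)
            for u in (0, 1)
        )
    return joint


def front_door(joint, x):
    """Estimate P(Y=1 | do(X=x)) using only the observational joint."""
    p_x = {a: sum(joint[(a, z, y)] for z in (0, 1) for y in (0, 1))
           for a in (0, 1)}
    total = 0.0
    for z in (0, 1):
        p_z_given_x = (joint[(x, z, 0)] + joint[(x, z, 1)]) / p_x[x]
        inner = 0.0
        for x2 in (0, 1):
            p_y1_given_zx2 = joint[(x2, z, 1)] / (joint[(x2, z, 0)] + joint[(x2, z, 1)])
            inner += p_x[x2] * p_y1_given_zx2
        total += p_z_given_x * inner
    return total


def ground_truth(x):
    """P(Y=1 | do(X=x)) computed from the full model, including U."""
    return sum(
        bern(P_Z1_GIVEN_X[x], z) * P_U[u] * P_Y1_GIVEN_ZU[(z, u)]
        for z in (0, 1) for u in (0, 1)
    )


def naive_conditional(joint, x):
    """P(Y=1 | X=x), which is biased by the confounder U."""
    p_x = sum(joint[(x, z, y)] for z in (0, 1) for y in (0, 1))
    return sum(joint[(x, z, 1)] for z in (0, 1)) / p_x


if __name__ == "__main__":
    joint = observational_joint()
    for x in (0, 1):
        print(x, naive_conditional(joint, x), front_door(joint, x), ground_truth(x))
```

In this toy setting the front-door estimate matches the ground truth without ever observing U, while the plain conditional P(Y=1 | X=x) does not. That is the motivation the abstract gives for choosing front-door adjustment when confounders are hard to model explicitly.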
ISSN: 2379-8858, 2379-8904
DOI: 10.1109/TIV.2023.3288810