ADfM-Net: An Adversarial Depth-from-Motion Network Based on Cross Attention and Motion Enhanced


Full Description

Bibliographic Details
Published in: IEEE Robotics and Automation Letters, 2023-08, Vol. 8 (8), pp. 1-8
Authors: Long, Yangqi; Yu, Huimin; Xu, Chenfeng; Deng, Zhiqiang
Format: Article
Language: English
Description
Abstract: Temporally consistent and accurate depth estimation for consecutive images is essential for many downstream applications. However, most existing methods infer depth from a single image only, ignoring the temporal information and the important depth cues provided by motion in the sequence. Additionally, the depths of adjacent frames are estimated separately, without any constraint between them. In this paper, we improve the temporal consistency and accuracy of depth results from the two aforementioned aspects: a multi-frame framework and a consistency constraint. First, a framework with a cross-frame attention and motion enhancement module is proposed for better temporal consistency and depth precision. Second, an adversarial metric learning strategy is introduced to further constrain the consistency of adjacent depth results, without any additional computation or memory cost. Experiments on the KITTI and Cityscapes datasets demonstrate the effectiveness of our framework. Furthermore, noting that traditional metrics cannot reveal the consistency of depth results, a new temporal consistency metric is proposed, which would facilitate further research.
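The abstract does not define the proposed temporal consistency metric, so as an illustrative sketch only (not the paper's metric), one simple way to quantify frame-to-frame depth consistency is the mean absolute relative change between two aligned depth maps; the function name and the assumption that the maps are already aligned (e.g., via ego-motion warping) are hypothetical:

```python
import numpy as np

def temporal_abs_rel(depth_prev, depth_curr):
    """Mean absolute relative difference between two aligned depth maps.

    Illustrative only: a crude frame-to-frame consistency score, not the
    metric proposed in the paper (which is not specified in this abstract).
    Assumes both depth maps are already aligned to the same viewpoint.
    """
    depth_prev = np.asarray(depth_prev, dtype=np.float64)
    depth_curr = np.asarray(depth_curr, dtype=np.float64)
    valid = depth_prev > 0  # ignore invalid (zero) depth pixels
    diff = np.abs(depth_curr[valid] - depth_prev[valid])
    return float(np.mean(diff / depth_prev[valid]))

# Identical depth maps yield a score of 0 (perfect consistency);
# a uniform +0.5 m shift on a 5 m scene yields 0.1 (10% relative change).
d = np.full((4, 4), 5.0)
print(temporal_abs_rel(d, d))        # 0.0
print(temporal_abs_rel(d, d + 0.5))  # 0.1
```

Lower scores indicate more temporally stable depth; a sequence-level score could average this quantity over all adjacent frame pairs.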
ISSN: 2377-3766
DOI: 10.1109/LRA.2023.3268885