Guard-Net: Lightweight Stereo Matching Network via Global and Uncertainty-Aware Refinement for Autonomous Driving

Detailed Description

Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2024-08, Vol. 25 (8), p. 10260-10273
Main Authors: Liu, Yujun; Zhang, Xiangchen; Luo, Yang; Hao, Qiaoqiao; Su, Jinhe; Cai, Guorong
Format: Article
Language: English
Description
Summary: Stereo matching is a prominent research area in autonomous driving and computer vision. Despite significant progress by learning-based methods, accurately predicting disparities in hazardous regions, which is crucial for safe vehicle operation, remains challenging. The limitations of methods based on Convolutional Neural Networks (CNNs) are most noticeable in textureless regions and repetitive patterns, where predictions become unreliable. Furthermore, calculating disparities for boundaries and thin structures, where the disparity-jump phenomenon is prominent, remains difficult. To address these issues, we propose a lightweight stereo matching architecture that produces real-time, high-precision disparity maps in hazardous areas. First, we exploit an efficient global enhanced path to provide global representations in ill-posed regions where CNN-based approaches often struggle. Second, our model integrates local and global features to generate a more reliable cost volume. Finally, our uncertainty-aware module refines the disparity, making full use of high-frequency detail and uncertainty attention to effectively preserve complex structures. Comprehensive experiments on SceneFlow demonstrate that our method outperforms state-of-the-art methods, achieving an End-Point Error (EPE) of 0.47 with only 3.60M parameters. The speed-accuracy trade-off of our method is further confirmed by competitive results on the KITTI 2012 and KITTI 2015 benchmarks. Code is available at: https://github.com/YJLCV/Guard-Net .
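To make the cost-volume and uncertainty terminology concrete, the following is a minimal, generic sketch of how a stereo cost volume is commonly converted into a disparity map plus a per-pixel uncertainty estimate (soft-argmax regression and the variance of the matching distribution). This is a standard building block of learned stereo matching, not Guard-Net's actual module; the function name and array shapes are illustrative assumptions.

```python
import numpy as np

def soft_argmax_disparity(cost_volume: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Turn a cost volume of shape (D, H, W) into a disparity map (H, W)
    via soft-argmax over the disparity axis, plus a per-pixel uncertainty
    estimate (variance of the matching probability distribution)."""
    # Lower cost means a better match, so negate before the softmax.
    logits = -cost_volume.astype(np.float64)
    logits -= logits.max(axis=0, keepdims=True)        # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)            # softmax over disparities

    d_max = cost_volume.shape[0]
    disp_values = np.arange(d_max, dtype=np.float64).reshape(-1, 1, 1)
    disparity = (prob * disp_values).sum(axis=0)       # expected (sub-pixel) disparity
    # Variance of the distribution: high where matching is ambiguous
    # (e.g. textureless or repetitive regions), low where it is confident.
    variance = (prob * (disp_values - disparity) ** 2).sum(axis=0)
    return disparity, variance

# Toy example: 8 disparity hypotheses on a 2x2 image with a clear minimum at d = 3.
cost = np.full((8, 2, 2), 10.0)
cost[3] = 0.0                                          # best match at disparity 3
disp, var = soft_argmax_disparity(cost)
```

An uncertainty-aware refinement stage can then treat the variance map as an attention signal, spending refinement capacity on the ambiguous pixels.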
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2024.3357841