Multiple attention networks for stereo matching
Published in: Multimedia Tools and Applications, 2021-07, Vol. 80 (18), pp. 28583-28601
Main authors: , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Recent studies have shown that stereo matching can be treated as a supervised learning task, in which left and right image pairs serve as inputs to a convolutional neural network trained to produce a detailed disparity map. However, existing architectures for stereo matching are not well suited to estimating depth in ill-posed regions. To address this problem, we propose a multiple attention network (MA-Net) for stereo matching, which consists of four main stages: feature extraction, cost volume construction, cost aggregation, and disparity prediction. For feature extraction, an hourglass position attention module is adopted that effectively aggregates global context and multi-scale information at every position. In cost volume construction, we combine cross-correlation volumes with concatenation volumes so that the cost volume provides an efficient representation for measuring feature similarity. In cost aggregation, a multi-scale disparity attention module is designed to aggregate feature information across different scales and disparity dimensions. As in other end-to-end methods, the final disparity is obtained through regression in the disparity prediction stage. Experimental results on the Scene Flow, KITTI2012 and KITTI2015 benchmarks show that the proposed method offers advantages in both accuracy and speed.
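The cost volume construction and disparity regression steps summarized in the abstract can be illustrated with a short PyTorch sketch. This is a hedged illustration under assumed shapes, not the paper's implementation: the per-channel mean used for the cross-correlation volume, the feature dimensions, and the helper names build_cost_volumes and soft_argmin are assumptions that follow common end-to-end stereo pipelines.

```python
# Illustrative sketch only; shapes and grouping are assumptions, not MA-Net's exact design.
import torch

def build_cost_volumes(left_feat, right_feat, max_disp):
    """left_feat, right_feat: [B, C, H, W] feature maps; max_disp: disparity range."""
    B, C, H, W = left_feat.shape
    concat_vol = left_feat.new_zeros(B, 2 * C, max_disp, H, W)  # concatenation volume
    corr_vol = left_feat.new_zeros(B, 1, max_disp, H, W)        # cross-correlation volume
    for d in range(max_disp):
        if d == 0:
            l, r = left_feat, right_feat
            concat_vol[:, :, d] = torch.cat([l, r], dim=1)
            corr_vol[:, 0, d] = (l * r).mean(dim=1)
        else:
            # Shift the right features by d pixels before comparing with the left features.
            l = left_feat[:, :, :, d:]
            r = right_feat[:, :, :, :-d]
            concat_vol[:, :, d, :, d:] = torch.cat([l, r], dim=1)
            corr_vol[:, 0, d, :, d:] = (l * r).mean(dim=1)
    # The combined volume [B, 2C+1, D, H, W] is what a 3D aggregation stage would consume.
    return torch.cat([concat_vol, corr_vol], dim=1)

def soft_argmin(cost, max_disp):
    """Disparity regression: softmax over the disparity axis of an aggregated
    cost of shape [B, max_disp, H, W], followed by the expected disparity."""
    prob = torch.softmax(-cost, dim=1)
    disp_values = torch.arange(max_disp, device=cost.device, dtype=cost.dtype)
    return (prob * disp_values.view(1, -1, 1, 1)).sum(dim=1)
```

In this kind of pipeline, the concatenation volume preserves full feature information while the correlation volume gives a compact similarity measure; combining them, as the abstract describes, lets the aggregation stage use both.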
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-021-11102-9