Multi-dimensional Residual Dense Attention Network for Stereo Matching
Published in: IEEE Access, 2019-01, Vol. 7, p. 1-1
Main authors: , , , , ,
Format: Article
Language: English
Online access: Full text
Abstract: Very deep convolutional neural networks (CNNs) have recently achieved great success in stereo matching, yet it remains highly desirable to learn robust feature maps that improve ill-posed regions such as weakly textured regions, reflective surfaces, and repetitive patterns. We therefore propose an end-to-end Multi-dimensional Residual Dense Attention Network (MRDA-Net), focusing on more comprehensive pixel-wise feature extraction. The proposed network consists of two parts: a 2D residual dense attention net for feature extraction and a 3D convolutional attention net for matching. The 2D residual dense attention net uses a dense network structure to fully exploit the hierarchical features of preceding convolutional layers and a residual network structure to fuse low-level structural information with high-level semantic information; its 2D attention module adaptively recalibrates channel-wise features so that the network focuses on informative features. The 3D convolutional attention net extends the attention mechanism to matching: its stacked hourglass module extracts multi-scale context and geometry information, and its novel 3D attention module aggregates hierarchical sub-cost volumes adaptively rather than manually, producing a comprehensively recalibrated cost volume for more accurate disparity computation. Experiments demonstrate that our approach achieves state-of-the-art accuracy on the Scene Flow, KITTI 2012, and KITTI 2015 datasets.
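The two attention ideas the abstract describes, channel-wise recalibration of 2D feature maps and adaptive aggregation of hierarchical sub-cost volumes, can be sketched briefly. The PyTorch code below is an illustration inferred from the abstract alone, not the authors' implementation; the module names, the squeeze-and-excitation style bottleneck, the reduction ratio, and the pooled-statistics weighting are all assumptions.

```python
# Minimal sketch, not the paper's code: module names and layer sizes are assumed.
import torch
import torch.nn as nn

class ChannelAttention2D(nn.Module):
    """Squeeze-and-excitation style recalibration: global-pool each channel,
    pass the pooled vector through a small bottleneck MLP, and rescale the
    feature map so informative channels are emphasized."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # channel-wise recalibration

class SubCostAggregation3D(nn.Module):
    """Adaptive (rather than manual) fusion of hierarchical sub-cost volumes:
    learn a per-volume weight from pooled statistics, then sum the weighted
    volumes into a single recalibrated cost volume."""
    def __init__(self, num_volumes: int, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.score = nn.Linear(num_volumes * channels, num_volumes)

    def forward(self, volumes):  # list of (B, C, D, H, W) sub-cost volumes
        b = volumes[0].shape[0]
        stats = torch.cat([self.pool(v).view(b, -1) for v in volumes], dim=1)
        w = torch.softmax(self.score(stats), dim=1)           # (B, N)
        stacked = torch.stack(volumes, dim=1)                 # (B, N, C, D, H, W)
        return (stacked * w.view(b, -1, 1, 1, 1, 1)).sum(1)  # (B, C, D, H, W)
```

In a full model of this kind, ChannelAttention2D would sit inside the 2D residual dense blocks, and SubCostAggregation3D would fuse the sub-cost volumes produced at different stages of the stacked hourglass before disparity regression.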
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2911618