Adaptive region aggregation for multi‐view stereo matching using deformable convolutional networks

Full description

Bibliographic details
Published in:	Photogrammetric record 2023-09, Vol.38 (183), p.430-449
Main authors:	Hu, Han; Su, Liupeng; Mao, Shunfu; Chen, Min; Pan, Guoqiang; Xu, Bo; Zhu, Qing
Format:	Article
Language:	eng
Subjects:	Datasets; Deformation; Feature extraction; Fine structure; Formability; Matching; Temples; Workflow
Online access:	Full text

Description
Summary:	Deep‐learning methods have demonstrated promising performance in multi‐view stereo (MVS) applications. However, it remains challenging to apply a geometrical prior to the adaptive matching windows to achieve efficient three‐dimensional reconstruction. To address this problem, this paper proposes a learnable adaptive region aggregation method based on deformable convolutional networks (DCNs), which is integrated into the feature extraction workflow of an MVSNet method that uses a coarse‐to‐fine structure. Following the conventional pipeline of MVSNet, a DCN is used to densely estimate and apply transformations in our feature extractor, which is composed of a deformable feature pyramid network (DFPN). Furthermore, we introduce a dedicated offset regulariser to promote the convergence of the learnable offsets of the DCN. The effectiveness of the proposed DFPN is validated through quantitative and qualitative evaluations on the BlendedMVS and Tanks and Temples benchmark datasets within a cross‐dataset evaluation setting.
ISSN:	0031-868X 1477-9730
DOI:	10.1111/phor.12459
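
Illustrative note: the summary describes aggregating features over an adaptive region by shifting each convolution tap with a learnable offset, and regularising those offsets. The following is a minimal pure-Python sketch of that general idea for a single pixel, not the authors' implementation. The function names, the 3x3 window, the zero padding, and the L2 form of the penalty are all assumptions made for illustration; the record does not specify the form of the paper's regulariser.

```python
import math


def bilinear(feat, y, x):
    """Sample a 2D grid (list of rows) at fractional (y, x).

    Positions outside the grid contribute zero (assumed zero padding).
    """
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * feat[yy][xx]
    return total


def deformable_aggregate(feat, weights, offsets, cy, cx):
    """Aggregate a 3x3 window centred at (cy, cx).

    weights: 9 kernel weights in row-major order.
    offsets: 9 (dy, dx) pairs that shift each tap off the regular grid;
             in a DCN these would be predicted by a small network.
    """
    taps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for w, (dy, dx), (oy, ox) in zip(weights, taps, offsets):
        out += w * bilinear(feat, cy + dy + oy, cx + dx + ox)
    return out


def offset_penalty(offsets):
    """Hypothetical regulariser: mean squared offset magnitude."""
    return sum(oy * oy + ox * ox for oy, ox in offsets) / len(offsets)


# Toy 4x4 feature map whose value is 4*row + col.
feat = [[4 * r + c for c in range(4)] for r in range(4)]
uniform = [1 / 9] * 9
regular = deformable_aggregate(feat, uniform, [(0.0, 0.0)] * 9, 1, 1)
shifted = deformable_aggregate(feat, uniform, [(0.5, 0.0)] * 9, 1, 1)
```

With all offsets at zero the function reduces to an ordinary 3x3 weighted sum; non-zero offsets move the sampling positions, and a penalty of this kind would discourage them from drifting far from the regular grid during training.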