Multi-View Tree Structure Learning for 3D Model Retrieval and Classification in Smart City


Bibliographic Details
Published in: IEEE Access 2020-01, Vol. 8, p. 1-1
Main Authors: Liu, An-An, Zhao, Zhen-Lan, Li, Wen-Hui, Song, Dan
Format: Article
Language: English
Subjects:
Online Access: Full text
Description
Abstract: The application of digital products in smart cities results in ever-increasing 3D model data, and how to retrieve the relevant 3D model becomes a crucial issue. In this paper, we propose Multi-View Tree Structure (MVTS) learning for 3D model retrieval and recognition. MVTS contains three key consecutive modules. First, the visual feature learning module extracts the visual features of multiple views. Then, we design a score matrix to estimate the amount of contextual information between view pairs. Based on the score matrix, a maximum spanning tree is constructed to further explore the contextual information within multiple views. Next, we utilize a bidirectional Tree-LSTM to encode the contextual information among views and the spatial information of the tree structure, and to optimize the tree parameters. After that, a tree attention strategy is adopted to estimate the importance of each view. Compared to existing methods, our proposed method explores the spatial information of a 3D model without requiring specific camera settings, which makes it more suitable for real applications. Moreover, our method jointly realizes feature learning, view-wise contextual information and tree spatial information encoding, and view importance estimation, which enhances the discriminability of the 3D model representation. Extensive experimental results on ModelNet40 and ShapeNetCore55 demonstrate the superiority of our method.
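The tree-construction step in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: given a symmetric score matrix of pairwise contextual-information values between views, a maximum spanning tree is grown greedily (Prim's algorithm), repeatedly attaching the outside view with the highest-scoring link to the tree built so far.

```python
def max_spanning_tree(score):
    """Build a maximum spanning tree over n views from a symmetric
    n x n pairwise score matrix (higher score = stronger contextual link).
    Returns the tree as a list of (parent, child) edges rooted at view 0.
    Hypothetical sketch of the MVTS tree-construction step, not the
    paper's actual code."""
    n = len(score)
    in_tree = {0}          # views already connected to the tree
    edges = []
    while len(in_tree) < n:
        # pick the highest-scoring edge crossing from the tree to the rest
        best = max(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: score[e[0]][e[1]],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Example: 3 views; views 0 and 1 share the most contextual information.
scores = [[0, 5, 1],
          [5, 0, 2],
          [1, 2, 0]]
print(max_spanning_tree(scores))  # [(0, 1), (1, 2)]
```

Prim's algorithm is chosen here for brevity; any spanning-tree algorithm run on negated scores (e.g. Kruskal's) yields an equivalent tree, which then serves as the input structure for the bidirectional Tree-LSTM encoding.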
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3009333