3D model retrieval based on multi-view attentional convolutional neural network
Published in: | Multimedia Tools and Applications 2020-02, Vol.79 (7-8), p.4699-4711 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Abstract: | We propose a discriminative Multi-View Attentional Convolutional Neural Network, dubbed MVA-CNN, which takes multiple views of a shape as input and outputs the object category. Unlike previous view-based approaches that simply "compile" the view features into a compact 3D descriptor, our method can discover the context among multiple views in both the visual and spatial domains. First, we render multiple images of a 3D object with virtual cameras and use a Convolutional Neural Network (CNN) to abstract the information in the views. Second, we aggregate the views in two steps: (1) an element-wise maximum operation across the view features is adopted to discover discriminative features; (2) a soft attention mechanism is used to dynamically adjust the shape descriptors to better represent the spatial information. The entire network can be trained end-to-end with standard backpropagation. We verify the effectiveness of MVA-CNN on two widely used datasets, ModelNet10 and ModelNet40, by comparing our method with state-of-the-art methods. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-019-7521-8 |
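
The two-step aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the attention formulation, so the scoring rule here (dot product between each view feature and the max-pooled descriptor, followed by a softmax and a residual sum) is an assumption made only for illustration and should not be read as the paper's actual method. The function names `max_pool_views`, `attend_views`, and `aggregate`, as well as the toy feature values, are hypothetical.

```python
import math


def max_pool_views(view_features):
    """Element-wise maximum across V view feature vectors of length D."""
    return [max(column) for column in zip(*view_features)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_views(view_features, query):
    """Soft attention over views.

    ASSUMPTION: each view is scored by its dot product with `query`;
    the abstract does not specify how the scores are computed.
    """
    scores = [sum(f * q for f, q in zip(view, query)) for view in view_features]
    weights = softmax(scores)
    context = [
        sum(w * view[d] for w, view in zip(weights, view_features))
        for d in range(len(query))
    ]
    return weights, context


def aggregate(view_features):
    """Step 1: max-pool across views. Step 2: adjust the descriptor with attention.

    ASSUMPTION: the attended context is added to the pooled descriptor.
    """
    pooled = max_pool_views(view_features)
    weights, context = attend_views(view_features, pooled)
    descriptor = [p + c for p, c in zip(pooled, context)]
    return descriptor, weights


# Toy CNN features for three rendered views (made-up values).
views = [
    [0.1, 0.9, 0.0, 0.3],
    [0.7, 0.2, 0.1, 0.3],
    [0.2, 0.1, 0.8, 0.4],
]
descriptor, weights = aggregate(views)
```

In a trainable network the per-view features would come from a shared CNN and the attention scores would typically pass through learned parameters; the sketch keeps everything parameter-free so that the data flow of the two steps stays visible.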