MVF-Net: A Multi-View Fusion Network for Event-Based Object Classification

Bibliographic details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2022-12, Vol. 32 (12), pp. 8275-8284
Main authors: Deng, Yongjian; Chen, Hao; Li, Youfu
Format: Article
Language: English
Description
Summary: Event-based object recognition has drawn increasing attention because of event cameras' distinctive advantages of low power consumption and high dynamic range. For this new modality, previous works based on customized low-level descriptors are vulnerable to noise and have limited generalizability. Although recent works have turned to designing various deep neural networks to extract event features, they either lack sufficient data to fully train the event-based model or fail to encode spatial and temporal cues simultaneously with their single-view network. In this work, we address these limitations by proposing a multi-view attention-aware network, in which an event stream is projected onto multi-view 2D maps to exploit well-trained 2D models and explore spatio-temporal complements. In addition, an attention mechanism is used to boost the complements across the different streams for better joint inference. Comprehensive experiments show that our model outperforms state-of-the-art methods by a large margin and demonstrate the efficacy of our multi-view fusion framework for event data.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2021.3073673
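
Illustration: the summary says an event stream is projected onto multi-view 2D maps. The record does not say which views or encodings the paper uses, so the following is only a minimal sketch under stated assumptions: events are (x, y, t, polarity) tuples, the three views are the x-y, x-t and y-t planes, each map simply counts events, and polarity is ignored. The function name `project_events` and the `time_bins` parameter are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed event layout (not specified in the record): (x, y, t, polarity)
Event = Tuple[int, int, float, int]
Grid = List[List[int]]


def project_events(
    events: Sequence[Event], width: int, height: int, time_bins: int
) -> Tuple[Grid, Grid, Grid]:
    """Accumulate events into three 2D count maps: x-y, x-t and y-t."""
    xy = [[0] * width for _ in range(height)]      # rows: y, columns: x
    xt = [[0] * width for _ in range(time_bins)]   # rows: time bin, columns: x
    yt = [[0] * height for _ in range(time_bins)]  # rows: time bin, columns: y
    if not events:
        return xy, xt, yt

    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = (t_max - t_min) or 1.0  # avoid division by zero for a single timestamp

    for x, y, t, _polarity in events:
        # Map the timestamp to a bin index, clamping the latest event into the last bin.
        b = min(int((t - t_min) / span * time_bins), time_bins - 1)
        xy[y][x] += 1
        xt[b][x] += 1
        yt[b][y] += 1
    return xy, xt, yt
```

Each of the three maps is an ordinary 2D array, so it could be fed to a separate image model; how the paper actually encodes the views and fuses the streams with attention is not described in this record.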