An efficient motion visual learning method for video action recognition
Published in: | Expert Systems with Applications 2024-12, Vol.255, p.124596, Article 124596 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Efficient spatio-temporal modeling is a key research problem in action recognition. Previous approaches enhance backbone features individually using hierarchical structures, but most of them fail to strike a good balance in how adequately features interact within the structure. In this work, we propose the Multi-dimensional Adaptive Fusion Network (MDAF-Net), which can be embedded into mainstream action recognition backbones in a plug-and-play manner to improve the transfer and representation of action features in deep networks. MDAF-Net contains two main components: the Adaptive Temporal Capture Module (ATCM) and the Extended Spatial and Channel Module (ESCM). The ATCM suppresses the over-expression of similar features in adjacent frames and activates the expression of motion flow information. The ESCM further improves temporal modeling efficiency by extending the spatial receptive field and enhancing channel attention. Extensive experiments on several challenging action recognition benchmarks, including Something-Something V1 and V2 and Kinetics-400, show that MDAF-Net achieves state-of-the-art or otherwise competitive performance. |
---|---|
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2024.124596 |