Second-order motion descriptors for efficient action recognition
Published in: Pattern analysis and applications : PAA 2021-05, Vol.24 (2), p.473-482
Main authors: , ,
Format: Article
Language: eng
Subjects:
Online access: Full text
Abstract: Human action recognition from realistic video data constitutes a challenging and relevant research area. The state of the art is led by methods based on convolutional neural networks (CNNs), especially two-stream CNNs. In this family of deep architectures, the appearance channel learns from the RGB images and the motion channel learns from a motion representation, usually the optical flow. Given that action recognition requires extracting complex motion-pattern descriptors from image sequences, we introduce a new set of second-order motion representations that capture both geometrical and kinematic properties of the motion (curl, divergence, curvature, and acceleration). In addition, we present a new and effective strategy that reduces training times without sacrificing performance when using the I3D two-stream CNN and that is robust to the weakness of a single channel. The experiments presented in this paper were carried out on two of the most challenging datasets for action recognition: UCF101 and HMDB51. The reported results show an improvement in accuracy on the UCF101 dataset, where an accuracy of 98.45% is achieved when curvature and acceleration are combined as the motion representation. On HMDB51, our approach shows competitive performance, achieving an accuracy of 80.19%. On both datasets, our approach considerably reduces the time required for the preprocessing and training phases: preprocessing time is reduced to a sixth, and the motion stream can be trained in a third of the time usually employed.
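The second-order descriptors named in the abstract (curl, divergence, curvature, acceleration) are all derivatives of the optical-flow field. As a minimal illustrative sketch, not the authors' implementation: assuming a dense flow field with horizontal and vertical components `u` and `v` sampled on a regular pixel grid, curl and divergence can be obtained with finite differences.

```python
import numpy as np

def flow_derivatives(u, v):
    """Divergence and curl of a dense 2D flow field (u, v).

    u, v: 2D arrays holding the horizontal and vertical flow components.
    """
    # np.gradient returns derivatives along axis 0 (y, rows) then axis 1 (x, cols).
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    div = du_dx + dv_dy   # divergence: local expansion/contraction of the motion
    curl = dv_dx - du_dy  # curl: local rotation of the motion
    return div, curl

# Synthetic rigid rotation about the image centre: u = -(y - cy), v = (x - cx).
h, w = 64, 64
y, x = np.mgrid[0:h, 0:w].astype(float)
u = -(y - h / 2)
v = (x - w / 2)
div, curl = flow_derivatives(u, v)
# For this pure rotation the divergence is 0 and the curl is 2 everywhere.
```

Acceleration would additionally require the temporal difference of the flow between consecutive frames, and curvature the change of flow direction along trajectories; this sketch only shows the spatial derivatives.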
ISSN: 1433-7541, 1433-755X
DOI: 10.1007/s10044-020-00924-2