Stratified pooling based deep convolutional neural networks for human action recognition

Bibliographic details
Published in:	Multimedia Tools and Applications, 2017-06, Vol. 76 (11), pp. 13367-13382
Main authors:	Yu, Sheng; Cheng, Yun; Su, Songzhi; Cai, Guorong; Li, Shaozi
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Computer Communication Networks; Computer Science; Computer vision; Data Structures and Information Theory; Digital imaging; Feature extraction; Human motion; Moving object recognition; Multimedia Information Systems; Neural networks; Principal components analysis; Special Purpose and Application-Based Systems; Target recognition; Video data
Online access:	Full text

Description
Abstract:	Video-based human action recognition is an active and challenging topic in computer vision. Over the last few years, deep convolutional neural networks (CNNs) have become the most popular method and have achieved state-of-the-art performance on several datasets, such as HMDB-51 and UCF-101. Since each video has a different number of frame-level features, combining these features into a good video-level feature is a challenging task. This paper therefore proposes a novel action recognition method, stratified pooling based on deep convolutional neural networks (SP-CNN). The process consists of five main parts: (i) fine-tuning a pre-trained CNN on the target dataset; (ii) frame-level feature extraction; (iii) principal component analysis (PCA) for feature dimensionality reduction; (iv) stratified pooling of frame-level features to obtain a video-level feature; and (v) an SVM for multiclass classification. Experimental results on the HMDB-51 and UCF-101 datasets show that the proposed method outperforms the state of the art.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-016-3768-5
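
Illustrative note: the abstract turns on step (iv), mapping a variable number of frame-level features to one fixed-length video-level feature, but it does not define stratified pooling itself. The sketch below is therefore not the authors' method. It swaps in a simple stand-in (contiguous temporal segments, mean-pooled and concatenated) purely to show how videos of different lengths can yield vectors of the same size. The function name `segment_mean_pool`, the `num_segments` parameter, and the toy data are all assumptions made for this example.

```python
from typing import List, Sequence


def segment_mean_pool(frames: Sequence[Sequence[float]], num_segments: int) -> List[float]:
    """Pool a variable number of frame-level vectors into one fixed-length vector.

    Hypothetical stand-in, NOT the paper's stratified pooling: frames are split
    into `num_segments` contiguous groups, each group is mean-pooled, and the
    group means are concatenated.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if len(frames) < num_segments:
        raise ValueError("need at least one frame per segment")
    dim = len(frames[0])
    pooled: List[float] = []
    for s in range(num_segments):
        # Integer boundaries that spread the frames as evenly as possible.
        start = s * len(frames) // num_segments
        end = (s + 1) * len(frames) // num_segments
        group = frames[start:end]
        for d in range(dim):
            pooled.append(sum(f[d] for f in group) / len(group))
    return pooled


# Two toy videos with different frame counts but 2-dimensional frame features
# (made-up numbers standing in for PCA-reduced CNN features).
short_video = [[1.0, 2.0], [3.0, 4.0]]
long_video = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0], [8.0, 8.0]]

short_feature = segment_mean_pool(short_video, num_segments=2)
long_feature = segment_mean_pool(long_video, num_segments=2)
```

Both videos produce a vector of length `num_segments * dim` (here 4), which is the property a downstream classifier such as the SVM in step (v) needs: every video is represented by a feature of the same size regardless of its frame count.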