Deep Manifold Structure Transfer for Action Recognition

Bibliographic Details
Published in: IEEE Transactions on Image Processing, 2019-09, Vol. 28 (9), pp. 4646-4658
Authors: Li, Ce; Zhang, Baochang; Chen, Chen; Ye, Qixiang; Han, Jungong; Guo, Guodong; Ji, Rongrong
Format: Article
Language: English
Abstract: While the intrinsic structure of data in a subspace provides useful information for visual recognition, it has not yet been well studied in deep feature learning for action recognition. In this paper, we introduce a new spatio-temporal manifold network (STMN) that leverages data manifold structures to regularize deep action feature learning, aiming to simultaneously minimize the intra-class variation of the learned deep features and alleviate over-fitting. To this end, the manifold prior is imposed from the top layer of a convolutional neural network (CNN) and propagated across the convolutional layers during forward-backward propagation. The observed correspondence between the manifold structures in the data space and the feature space validates that the manifold prior can be transferred across the CNN layers. The STMN theoretically recasts the problem of transferring the data-structure prior into a deep learning architecture as a projection onto the manifold via an embedding method, which can be easily solved by an alternating direction method of multipliers and backward propagation (ADMM-BP) algorithm. The STMN is generic in the sense that it can be plugged into various backbone architectures to learn more discriminative representations for action recognition. Extensive experimental results show that our method achieves comparable or even better performance than state-of-the-art approaches on four benchmark datasets.
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2019.2912357
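
The abstract describes imposing a manifold prior on top-layer CNN features and propagating it through the network, solved with an ADMM-BP scheme. As a rough, hedged illustration of this general idea (not the authors' STMN or ADMM-BP implementation), the PyTorch sketch below adds a graph-Laplacian manifold penalty on top-layer features to a standard classification loss; the function names, the binary k-NN graph construction, and the weighting parameter are assumptions made purely for illustration.

# Minimal sketch, assuming a generic CNN backbone; not the paper's STMN/ADMM-BP code.
# Manifold-style regularization of top-layer features via a k-NN graph Laplacian.
import torch
import torch.nn.functional as F

def build_knn_laplacian(feats: torch.Tensor, k: int = 5) -> torch.Tensor:
    # Unnormalized graph Laplacian L = D - W from a symmetrized k-NN affinity
    # graph over a batch of feature vectors (shape [n, d]).
    with torch.no_grad():
        dists = torch.cdist(feats, feats)                       # pairwise Euclidean distances
        knn = dists.topk(k + 1, largest=False).indices[:, 1:]   # nearest neighbors, excluding self
        n = feats.size(0)
        W = torch.zeros(n, n, device=feats.device)
        W.scatter_(1, knn, 1.0)                                  # binary affinities
        W = torch.maximum(W, W.t())                              # symmetrize the graph
        return torch.diag(W.sum(dim=1)) - W

def manifold_regularized_loss(logits, labels, feats, laplacian, lam=1e-3):
    # Cross-entropy plus tr(F^T L F); the trace term pulls together the features
    # of samples that are neighbors on the (approximate) data manifold, which is
    # the spirit of the manifold prior described in the abstract.
    ce = F.cross_entropy(logits, labels)
    reg = torch.trace(feats.t() @ laplacian @ feats) / feats.size(0)
    return ce + lam * reg

# Usage inside a training step (backbone, classifier, and data loader assumed to exist):
#   feats = backbone(clips)                       # top-layer features, shape [n, d]
#   logits = classifier(feats)
#   lap = build_knn_laplacian(feats.detach())     # graph built on current batch features
#   loss = manifold_regularized_loss(logits, labels, feats, lap)
#   loss.backward()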