Modeling temporal structure of complex actions using Bag-of-Sequencelets

•We propose a new model for recognizing complex actions named Bag-of-Sequencelets.•We represent a video as a sequence of primitive actions.•We model a complex action as an ensemble of sub-sequences (sequencelets).•We automatically learn sequencelets without any annotation about action structures.•We...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Pattern recognition letters 2017-01, Vol.85, p.21-28
Hauptverfasser: Jung, Hyun-Joo, Hong, Ki-Sang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:•We propose a new model for recognizing complex actions named Bag-of-Sequencelets.•We represent a video as a sequence of primitive actions.•We model a complex action as an ensemble of sub-sequences (sequencelets).•We automatically learn sequencelets without any annotation about action structures.•We achieve state-of-the-art results on the Olympic sports and UCF YouTube datasets. This paper proposes a new framework for modeling temporal structures of complex human actions. Inspired by the fact that a complex action is the temporally ordered composition of sub-actions, we develop a new model named Bag-of-Sequencelets (BoS). To construct a BoS model, a video is represented as a sequence of Primitive Actions (PAs). A PA is a representative motion pattern that constitutes actions and is learned in an unsupervised manner. Representing a video as a sequence of PAs preserves their temporal order. A sequencelet is an informative sub-sequence that describes the partial structure of actions while preserving temporal relations among PAs. In a BoS model, an action is modeled as an ensemble of sequencelets. We can use sequential pattern mining to automatically learn the sequencelet without any annotation or prior knowledge of action structure. Because the BoS model has both compositional and chronological properties, it can effectively model the structures of complex actions despite intra-class variations such as viewpoint change. Experimental results show the effectiveness of the BoS model in temporal structure modeling. Applied to the Olympic sports and UCF YouTube datasets, BoS achieves greater classification accuracy than state-of-the-art methods.
ISSN:0167-8655
1872-7344
DOI:10.1016/j.patrec.2016.11.012