A bag-of-regions representation for video classification

A bag-of-regions (BoR) representation of a video sequence is a spatio-temporal tessellation for use in high-level applications such as video classifications and action recognitions. We obtain a BoR representation of a video sequence by extracting regions that exist in the majority of its frames and...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Multimedia tools and applications 2016-03, Vol.75 (5), p.2453-2472
Hauptverfasser: Choi, Min-Kook, Wang, Ziyu, Lee, Hyun-Gyu, Lee, Sang-Chul
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A bag-of-regions (BoR) representation of a video sequence is a spatio-temporal tessellation for use in high-level applications such as video classifications and action recognitions. We obtain a BoR representation of a video sequence by extracting regions that exist in the majority of its frames and largely correspond to a single object. First, the significant regions are obtained using unsupervised frame segmentation based on the JSEG method. A tracking algorithm for splitting and merging the regions is then used to generate a relational graph of all regions in the segmented sequence. Finally, we perform a connectivity analysis on this graph to select the most significant regions, which are then used to create a high-level representation of the video sequence. We evaluated our representation using a SVM classifier for the video classification and achieved about 85 % average precision using the UCF50 dataset.
ISSN:1380-7501
1573-7721
DOI:10.1007/s11042-015-2876-y