Spatial and temporal scoring for egocentric video summarization


Bibliographic details
Published in: Neurocomputing (Amsterdam), 2016-10, Vol. 208, p. 299-308
Main authors: Guo, Zhao; Gao, Lianli; Zhen, Xiantong; Zou, Fuhao; Shen, Fumin; Zheng, Kai
Format: Article
Language: English
Description
Abstract: We present a summarization approach for egocentric video. Given hours of video, the proposed method produces a compact storyboard summary of the camera wearer's day. In contrast to traditional keyframe selection techniques, the resulting summary focuses on the most important video shots, which reflect high stable salience, discrimination and representativeness. To accomplish this, we utilize egocentric salience cues, motion cues and a selection model to capture the stable salience weight, discriminative weight and representative weight of a video shot, respectively. We further combine these weights in a unified framework to predict the importance score of a shot, based on which important shots are selected for the storyboard. Critically, the approach is neither camera-wearer-specific nor object-specific; that means the learned importance metric need not be trained for a given user or context, and it can predict the importance of shots that have never been seen previously. Experimental results on three video datasets across various genres demonstrate that our proposed approach clearly outperforms several state-of-the-art methods.
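The abstract describes combining three per-shot weights into one importance score and then picking the highest-scoring shots for the storyboard. The sketch below illustrates that general idea only. The record does not say how the article combines the weights, so the weighted average used here, the `Shot` class, the function names `importance` and `select_storyboard`, and the assumption that each weight lies in [0, 1] are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Shot:
    """One video shot with three precomputed weights (hypothetical container)."""
    shot_id: int               # position of the shot in the video timeline
    salience: float            # stable salience weight, assumed in [0, 1]
    discrimination: float      # discriminative weight, assumed in [0, 1]
    representativeness: float  # representative weight, assumed in [0, 1]


def importance(shot: Shot, weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Combine the three weights with a weighted average.

    Assumption: a linear combination stands in for the article's
    unified framework, whose actual form is not given in this record.
    """
    a, b, c = weights
    total = a + b + c
    return (a * shot.salience
            + b * shot.discrimination
            + c * shot.representativeness) / total


def select_storyboard(shots: List[Shot], k: int,
                      weights: Sequence[float] = (1.0, 1.0, 1.0)) -> List[Shot]:
    """Keep the k highest-scoring shots and return them in timeline order."""
    ranked = sorted(shots, key=lambda s: importance(s, weights), reverse=True)
    return sorted(ranked[:k], key=lambda s: s.shot_id)


if __name__ == "__main__":
    # Made-up weights for four shots, purely for illustration.
    demo = [
        Shot(0, 0.9, 0.8, 0.7),
        Shot(1, 0.1, 0.2, 0.1),
        Shot(2, 0.5, 0.9, 0.9),
        Shot(3, 0.3, 0.3, 0.3),
    ]
    for shot in select_storyboard(demo, k=2):
        print(shot.shot_id, round(importance(shot), 3))
```

Returning the selected shots in timeline order rather than score order is a design choice made here so that the result reads as a storyboard of the day; the per-shot weights themselves would have to come from the salience, motion and selection models that the abstract mentions.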
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2016.03.083