A new pyramidal opponent color-shape model based video shot boundary detection


Detailed Description

Bibliographic Details
Published in: Journal of Visual Communication and Image Representation, 2020-02, Vol. 67, p. 102754, Article 102754
Main Authors: Sasithradevi, A., Mohamed Mansoor Roomi, S.
Format: Article
Language: English
Online Access: Full Text
Description
Summary: Video shot boundary detection (VSBD) is an essential prerequisite for many intelligent video analysis applications, such as video retrieval, indexing, browsing, categorization and summarization. VSBD aims to segment big video data into meaningful fragments known as shots. This paper puts forward a new pyramidal opponent colour-shape (POCS) model which can detect abrupt transitions (AT) and gradual transitions (GT) simultaneously, even in the presence of illumination changes, large object movement between frames, and fast camera motion. First, the content of the frames in the video subjected to VSBD is represented by the proposed POCS model. The temporal nature of the POCS model is then subjected to a suitable segment (SS) selection procedure in order to minimize the complexity of the VSBD method. Each SS from the video frames is examined for transitions within it using a bagged-trees classifier (BTC) learned on a balanced training set via parallel processing. To demonstrate the superiority of the proposed VSBD algorithm, it is evaluated on the TRECVID 2001, TRECVID 2007 and VIDEOSEG 2004 data sets for classifying the basic units of video into no transition (NT), AT and GT. The experimental evaluation yields F1-scores of 95.13%, 98.13% and 97.11% on the TRECVID 2001, TRECVID 2007 and VIDEOSEG 2004 data sets, respectively.
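The abstract describes a pipeline of the general shape: represent each frame by a feature descriptor, compute a temporal difference signal, and classify candidate boundaries with a bagged ensemble. The sketch below illustrates that general VSBD pattern only; it is not the authors' POCS implementation. The frame data, the coarse intensity histogram used as a stand-in descriptor, and the hand-rolled threshold "stumps" (standing in for the bagged-trees classifier) are all illustrative assumptions.

```python
import random

def histogram(frame, bins=8):
    """Coarse intensity histogram of a frame, normalized to sum to 1.
    Stand-in for a richer descriptor such as the paper's POCS model."""
    h = [0] * bins
    for p in frame:
        h[min(p * bins // 256, bins - 1)] += 1
    n = len(frame)
    return [c / n for c in h]

def hist_diff(f1, f2):
    """L1 distance between consecutive frame histograms: the temporal signal."""
    return sum(abs(a - b) for a, b in zip(histogram(f1), histogram(f2)))

def train_bagged_stumps(diffs, labels, n_trees=15, seed=0):
    """Tiny bagged ensemble: each 'tree' is a decision stump whose
    threshold is learned on a bootstrap sample of the training pairs."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        sample = [rng.choice(list(zip(diffs, labels))) for _ in diffs]
        pos = [d for d, l in sample if l == 1]
        neg = [d for d, l in sample if l == 0]
        # Threshold halfway between class means (with fallbacks for
        # bootstrap samples that miss one of the classes).
        t = ((sum(pos) / len(pos) if pos else 1.0) +
             (sum(neg) / len(neg) if neg else 0.0)) / 2
        stumps.append(t)
    return stumps

def predict(stumps, diff):
    """Majority vote over the ensemble: 1 = transition, 0 = no transition."""
    votes = sum(1 for t in stumps if diff > t)
    return 1 if votes * 2 > len(stumps) else 0

# Synthetic demo: frames within a shot are similar; an abrupt cut is not.
shot_a = [[10 + i] * 64 for i in range(5)]    # dark shot
shot_b = [[200 + i] * 64 for i in range(5)]   # bright shot
frames = shot_a + shot_b
diffs = [hist_diff(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0]          # cut between frames 4 and 5
stumps = train_bagged_stumps(diffs, labels)
preds = [predict(stumps, d) for d in diffs]
print(preds)  # the single abrupt cut is flagged at position 4
```

The ensemble vote is what bagging contributes: each stump sees a slightly different bootstrap sample, and the majority vote smooths out individual thresholds, which is the same robustness argument behind the paper's bagged-trees classifier.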
ISSN:1047-3203
1095-9076
DOI:10.1016/j.jvcir.2020.102754