Conditional deep clustering based transformed spatio-temporal features and fused distance for efficient video retrieval

Bibliographic Details
Published in: International journal of information technology (Singapore. Online) 2023-06, Vol. 15 (5), p. 2349-2355
Authors: Banerjee, Alina; Kumar, Ela; Ravinder, M.
Format: Article
Language: English
Online access: Full text
Description
Abstract: Key frame extraction is essential for video retrieval because it reduces the quantity of data that must be processed. However, current video comparison methods classify videos by assigning labels to each frame, which incurs high time and computational complexity. This research addresses the issues of optimised key frame selection, time-efficient feature extraction, retrieval of relevant information from features, and the fusion of distance measures for better content-based video retrieval. A Conditional Deep Clustering-based key frame selection method for database and query videos is proposed. The time taken to extract features and produce spatio-temporal feature vectors is reduced by using BEBLID (Boosted Efficient Binary Local Image Descriptor) features. The most informative sub-features are extracted using a hybrid cosine-cum-wavelet transform, and retrieval performance is enhanced by using a fused distance measure to quantify dissimilarity. Additionally, it is found that building the level-2 spatio-temporal pyramid with more key frames yields a noticeable improvement in retrieval. Extensive experiments were carried out on the well-known UCF50 dataset, and the results show that the proposed methodology outperforms two baseline techniques for retrieval with both deep features and handcrafted features.
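The abstract names the main components of the pipeline (BEBLID key-frame descriptors, a hybrid cosine-cum-wavelet transform, and a fused distance measure) without giving implementation details. The following is a minimal Python sketch of that kind of pipeline, not the authors' method: the ORB keypoint detector, average pooling of descriptors, the DCT-then-Haar-DWT combination, the coefficient count, and the equal-weight distance fusion are all assumptions introduced here for illustration.

```python
# Sketch of a BEBLID -> hybrid transform -> fused distance pipeline.
# Assumptions (not from the paper): ORB keypoints feed BEBLID, the hybrid
# transform is a 1-D DCT followed by a Haar DWT on a pooled descriptor,
# and the fused distance is a convex combination of Euclidean and cosine
# distances. Requires opencv-contrib-python, scipy, PyWavelets, numpy.
import cv2
import numpy as np
import pywt
from scipy.fft import dct
from scipy.spatial.distance import cosine, euclidean


def beblid_descriptor(frame_gray):
    """Pool the BEBLID descriptors of one key frame into a single vector."""
    detector = cv2.ORB_create(nfeatures=500)
    # 0.75 is the scale factor recommended for ORB keypoints in OpenCV's docs.
    beblid = cv2.xfeatures2d.BEBLID_create(0.75)
    keypoints = detector.detect(frame_gray, None)
    keypoints, desc = beblid.compute(frame_gray, keypoints)
    if desc is None:  # no keypoints found in this frame
        return np.zeros(64, dtype=np.float32)  # default BEBLID size: 512 bits
    return desc.mean(axis=0).astype(np.float32)  # simple average pooling


def hybrid_cosine_wavelet(vec, keep=32):
    """Hypothetical cosine-cum-wavelet transform: DCT, then Haar DWT,
    keeping the `keep` lowest-frequency approximation coefficients."""
    coeffs = dct(vec, norm="ortho")
    approx, _detail = pywt.dwt(coeffs, "haar")
    return approx[:keep]


def fused_distance(f1, f2, alpha=0.5):
    """Assumed fusion: weighted sum of Euclidean and cosine distances."""
    return alpha * euclidean(f1, f2) + (1.0 - alpha) * cosine(f1, f2)
```

Under these assumptions, a query clip would be compared against the database by transforming the pooled descriptors of its key frames with hybrid_cosine_wavelet and ranking candidate videos by fused_distance; the actual clustering-based key frame selection and spatio-temporal pyramid construction are beyond this sketch.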
ISSN:2511-2104
2511-2112
DOI:10.1007/s41870-023-01327-2