A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing

Spoken document retrieval for a specific context is a very trending and interesting area of research. It makes it convenient for users to search through archives of speech data, which is not possible manually as it is very time consuming and expensive. In the current article, we focus on performing...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Multimedia tools and applications 2021-06, Vol.80 (14), p.22209-22229
Hauptverfasser: Gupta, Anishka, Yadav, Divakar
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Spoken document retrieval for a specific context is a very trending and interesting area of research. It makes it convenient for users to search through archives of speech data, which is not possible manually as it is very time consuming and expensive. In the current article, we focus on performing the same for political speeches, delivered in a variety of environments. The technique used here takes an archive of spoken documents (audio files) as input and performs automatic speech recognition (ASR) on it to derive the textual transcripts, using deep neural networks (DNN), hidden markov models (HMM) and Gaussian mixture models (GMM). These transcriptions are further pruned for indexing by applying certain pre-processing techniques. Thereafter, it builds time and space efficient index of the documents using wavelet trees for its retrieval. The constructed index is searched through to find the count of occurrences of the words in the query, fired by the users. These counts are then utilized to calculate the term frequency - inverse document frequency (TF-IDF) scores, and then the similarity score of the query with each document is calculated using cosine similarity method. Finally, the documents are ranked based on these scores in the order of relevance. Therefore, the proposed system develops a speech recognition system and introduces a novel indexing scheme, based on wavelet trees for retrieving data.
ISSN:1380-7501
1573-7721
DOI:10.1007/s11042-021-10800-8