A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing

Spoken document retrieval for a specific context is an active area of research. It lets users search archives of speech data, a task that is too time-consuming and expensive to do manually. In the current article, we focus on performing...

Detailed description

Bibliographic details
Published in:	Multimedia tools and applications 2021-06, Vol.80 (14), p.22209-22229
Main authors:	Gupta, Anishka ; Yadav, Divakar
Format:	Article
Language:	eng
Subjects:	Archives & records ; Artificial neural networks ; Audio data ; Automatic speech recognition ; Computer Communication Networks ; Computer Science ; Context ; Data Structures and Information Theory ; Deep learning ; Documents ; Indexing ; Markov analysis ; Markov chains ; Multimedia Information Systems ; Neural networks ; Probabilistic models ; Retrieval ; Similarity ; Special Purpose and Application-Based Systems ; Speech recognition ; Speeches ; Voice recognition
Online access:	Full text

container_end_page	22229
container_issue	14
container_start_page	22209
container_title	Multimedia tools and applications
container_volume	80
creator	Gupta, Anishka ; Yadav, Divakar
description	Spoken document retrieval for a specific context is an active area of research. It lets users search archives of speech data, a task that is too time-consuming and expensive to do manually. In the current article, we focus on political speeches delivered in a variety of environments. The technique takes an archive of spoken documents (audio files) as input and performs automatic speech recognition (ASR) on it to derive textual transcripts, using deep neural networks (DNN), hidden Markov models (HMM) and Gaussian mixture models (GMM). These transcripts are then pruned for indexing by applying pre-processing techniques. Next, the system builds a time- and space-efficient index of the documents using wavelet trees. The index is searched to count the occurrences of each word in the user's query. These counts are used to calculate term frequency-inverse document frequency (TF-IDF) scores, and the similarity of the query to each document is then calculated using cosine similarity. Finally, the documents are ranked by these scores in order of relevance. The proposed system thus develops a speech recognition system and introduces a novel indexing scheme based on wavelet trees for retrieving data.
doi_str_mv	10.1007/s11042-021-10800-8
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2531845477</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2531845477</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-d11f75198d8cf9370cc7315ef874c53dc3147faab09bc48b0c8460f17730d3a83</originalsourceid><addsrcrecordid>eNp9kD1uGzEQhYkgBuLIvoArAq7XmVnuilRpCP4DBKRJaoLiDqVVJHJDUrLdpUmfI_gsPkpOEtobwJ2rmQG-9x7mMXaGcIEA8ktChKauoMYKQQFU6gM7xlaKSsoaP5ZdKKhkC_iJfU5pA4DTtm6O2e9L7sOBttwMQwzGrnkOfKDoQtxxG3ymh_z315-lSdRxs89hZ3JveRrCD_K8C3a_I595pBx7OpgtD44PYdsXqBxpILJrSnzUB8_vTQmjzHMken7qfUcPvV-dsCNntolO_88J-3599W1-Wy2-3tzNLxeVFTjLVYfoZIsz1SnrZkKCtVJgS07JxraiK1QjnTFLmC1to5ZgVTMFh1IK6IRRYsLOR9_y6889paw3YR99idR1K1A1bVPYCatHysaQUiSnh9jvTHzUCPqlbj3WrUvd-rVu_WItRlEqsF9RfLN-R_UPAiOG8Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2531845477</pqid></control><display><type>article</type><title>A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing</title><source>SpringerLink Journals - AutoHoldings</source><creator>Gupta, Anishka ; Yadav, Divakar</creator><creatorcontrib>Gupta, Anishka ; Yadav, Divakar</creatorcontrib><description>Spoken document retrieval for a specific context is a very trending and interesting area of research. It makes it convenient for users to search through archives of speech data, which is not possible manually as it is very time consuming and expensive. In the current article, we focus on performing the same for political speeches, delivered in a variety of environments. The technique used here takes an archive of spoken documents (audio files) as input and performs automatic speech recognition (ASR) on it to derive the textual transcripts, using deep neural networks (DNN), hidden markov models (HMM) and Gaussian mixture models (GMM). 
These transcriptions are further pruned for indexing by applying certain pre-processing techniques. Thereafter, it builds time and space efficient index of the documents using wavelet trees for its retrieval. The constructed index is searched through to find the count of occurrences of the words in the query, fired by the users. These counts are then utilized to calculate the term frequency - inverse document frequency (TF-IDF) scores, and then the similarity score of the query with each document is calculated using cosine similarity method. Finally, the documents are ranked based on these scores in the order of relevance. Therefore, the proposed system develops a speech recognition system and introduces a novel indexing scheme, based on wavelet trees for retrieving data.</description><identifier>ISSN: 1380-7501</identifier><identifier>EISSN: 1573-7721</identifier><identifier>DOI: 10.1007/s11042-021-10800-8</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Archives & records ; Artificial neural networks ; Audio data ; Automatic speech recognition ; Computer Communication Networks ; Computer Science ; Context ; Data Structures and Information Theory ; Deep learning ; Documents ; Indexing ; Markov analysis ; Markov chains ; Multimedia Information Systems ; Neural networks ; Probabilistic models ; Retrieval ; Similarity ; Special Purpose and Application-Based Systems ; Speech recognition ; Speeches ; Voice recognition</subject><ispartof>Multimedia tools and applications, 2021-06, Vol.80 (14), p.22209-22229</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 
2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-d11f75198d8cf9370cc7315ef874c53dc3147faab09bc48b0c8460f17730d3a83</citedby><cites>FETCH-LOGICAL-c319t-d11f75198d8cf9370cc7315ef874c53dc3147faab09bc48b0c8460f17730d3a83</cites><orcidid>0000-0001-6051-479X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11042-021-10800-8$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11042-021-10800-8$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Gupta, Anishka</creatorcontrib><creatorcontrib>Yadav, Divakar</creatorcontrib><title>A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing</title><title>Multimedia tools and applications</title><addtitle>Multimed Tools Appl</addtitle><description>Spoken document retrieval for a specific context is a very trending and interesting area of research. It makes it convenient for users to search through archives of speech data, which is not possible manually as it is very time consuming and expensive. In the current article, we focus on performing the same for political speeches, delivered in a variety of environments. The technique used here takes an archive of spoken documents (audio files) as input and performs automatic speech recognition (ASR) on it to derive the textual transcripts, using deep neural networks (DNN), hidden markov models (HMM) and Gaussian mixture models (GMM). These transcriptions are further pruned for indexing by applying certain pre-processing techniques. 
Thereafter, it builds time and space efficient index of the documents using wavelet trees for its retrieval. The constructed index is searched through to find the count of occurrences of the words in the query, fired by the users. These counts are then utilized to calculate the term frequency - inverse document frequency (TF-IDF) scores, and then the similarity score of the query with each document is calculated using cosine similarity method. Finally, the documents are ranked based on these scores in the order of relevance. Therefore, the proposed system develops a speech recognition system and introduces a novel indexing scheme, based on wavelet trees for retrieving data.</description><subject>Archives & records</subject><subject>Artificial neural networks</subject><subject>Audio data</subject><subject>Automatic speech recognition</subject><subject>Computer Communication Networks</subject><subject>Computer Science</subject><subject>Context</subject><subject>Data Structures and Information Theory</subject><subject>Deep learning</subject><subject>Documents</subject><subject>Indexing</subject><subject>Markov analysis</subject><subject>Markov chains</subject><subject>Multimedia Information Systems</subject><subject>Neural networks</subject><subject>Probabilistic models</subject><subject>Retrieval</subject><subject>Similarity</subject><subject>Special Purpose and Application-Based Systems</subject><subject>Speech recognition</subject><subject>Speeches</subject><subject>Voice 
recognition</subject><issn>1380-7501</issn><issn>1573-7721</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>BENPR</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp9kD1uGzEQhYkgBuLIvoArAq7XmVnuilRpCP4DBKRJaoLiDqVVJHJDUrLdpUmfI_gsPkpOEtobwJ2rmQG-9x7mMXaGcIEA8ktChKauoMYKQQFU6gM7xlaKSsoaP5ZdKKhkC_iJfU5pA4DTtm6O2e9L7sOBttwMQwzGrnkOfKDoQtxxG3ymh_z315-lSdRxs89hZ3JveRrCD_K8C3a_I595pBx7OpgtD44PYdsXqBxpILJrSnzUB8_vTQmjzHMken7qfUcPvV-dsCNntolO_88J-3599W1-Wy2-3tzNLxeVFTjLVYfoZIsz1SnrZkKCtVJgS07JxraiK1QjnTFLmC1to5ZgVTMFh1IK6IRRYsLOR9_y6889paw3YR99idR1K1A1bVPYCatHysaQUiSnh9jvTHzUCPqlbj3WrUvd-rVu_WItRlEqsF9RfLN-R_UPAiOG8Q</recordid><startdate>20210601</startdate><enddate>20210601</enddate><creator>Gupta, Anishka</creator><creator>Yadav, Divakar</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7T9</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M2O</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0001-6051-479X</orcidid></search><s
ort><creationdate>20210601</creationdate><title>A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing</title><author>Gupta, Anishka ; Yadav, Divakar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-d11f75198d8cf9370cc7315ef874c53dc3147faab09bc48b0c8460f17730d3a83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Archives & records</topic><topic>Artificial neural networks</topic><topic>Audio data</topic><topic>Automatic speech recognition</topic><topic>Computer Communication Networks</topic><topic>Computer Science</topic><topic>Context</topic><topic>Data Structures and Information Theory</topic><topic>Deep learning</topic><topic>Documents</topic><topic>Indexing</topic><topic>Markov analysis</topic><topic>Markov chains</topic><topic>Multimedia Information Systems</topic><topic>Neural networks</topic><topic>Probabilistic models</topic><topic>Retrieval</topic><topic>Similarity</topic><topic>Special Purpose and Application-Based Systems</topic><topic>Speech recognition</topic><topic>Speeches</topic><topic>Voice recognition</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gupta, Anishka</creatorcontrib><creatorcontrib>Yadav, Divakar</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research 
Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Database‎ (1962 - current)</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM global</collection><collection>Computing Database</collection><collection>ProQuest Research Library</collection><collection>Research Library (Corporate)</collection><collection>ProQuest advanced technologies & aerospace journals</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>One Business (ProQuest)</collection><collection>ProQuest 
One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>Multimedia tools and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gupta, Anishka</au><au>Yadav, Divakar</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing</atitle><jtitle>Multimedia tools and applications</jtitle><stitle>Multimed Tools Appl</stitle><date>2021-06-01</date><risdate>2021</risdate><volume>80</volume><issue>14</issue><spage>22209</spage><epage>22229</epage><pages>22209-22229</pages><issn>1380-7501</issn><eissn>1573-7721</eissn><abstract>Spoken document retrieval for a specific context is a very trending and interesting area of research. It makes it convenient for users to search through archives of speech data, which is not possible manually as it is very time consuming and expensive. In the current article, we focus on performing the same for political speeches, delivered in a variety of environments. The technique used here takes an archive of spoken documents (audio files) as input and performs automatic speech recognition (ASR) on it to derive the textual transcripts, using deep neural networks (DNN), hidden markov models (HMM) and Gaussian mixture models (GMM). These transcriptions are further pruned for indexing by applying certain pre-processing techniques. Thereafter, it builds time and space efficient index of the documents using wavelet trees for its retrieval. The constructed index is searched through to find the count of occurrences of the words in the query, fired by the users. 
These counts are then utilized to calculate the term frequency - inverse document frequency (TF-IDF) scores, and then the similarity score of the query with each document is calculated using cosine similarity method. Finally, the documents are ranked based on these scores in the order of relevance. Therefore, the proposed system develops a speech recognition system and introduces a novel indexing scheme, based on wavelet trees for retrieving data.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11042-021-10800-8</doi><tpages>21</tpages><orcidid>https://orcid.org/0000-0001-6051-479X</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 1380-7501
ispartof	Multimedia tools and applications, 2021-06, Vol.80 (14), p.22209-22229
issn	1380-7501 1573-7721
language	eng
recordid	cdi_proquest_journals_2531845477
source	SpringerLink Journals - AutoHoldings
subjects	Archives & records ; Artificial neural networks ; Audio data ; Automatic speech recognition ; Computer Communication Networks ; Computer Science ; Context ; Data Structures and Information Theory ; Deep learning ; Documents ; Indexing ; Markov analysis ; Markov chains ; Multimedia Information Systems ; Neural networks ; Probabilistic models ; Retrieval ; Similarity ; Special Purpose and Application-Based Systems ; Speech recognition ; Speeches ; Voice recognition
title	A novel approach to perform context‐based automatic spoken document retrieval of political speeches based on wavelet tree indexing
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T19%3A58%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20approach%20to%20perform%20context%E2%80%90based%20automatic%20spoken%20document%20retrieval%20of%20political%20speeches%20based%20on%20wavelet%20tree%C2%A0indexing&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Gupta,%20Anishka&rft.date=2021-06-01&rft.volume=80&rft.issue=14&rft.spage=22209&rft.epage=22229&rft.pages=22209-22229&rft.issn=1380-7501&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-021-10800-8&rft_dat=%3Cproquest_cross%3E2531845477%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2531845477&rft_id=info:pmid/&rfr_iscdi=true
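
Illustrative sketch (not part of the catalog record). The description field outlines a retrieval step in which a wavelet tree index is queried for word counts, the counts feed TF-IDF scores, and documents are ranked by cosine similarity. The record does not give the authors' implementation, so the Python below is only a minimal sketch under stated assumptions: all names (WaveletTree, build_index, count, rank_documents) are hypothetical, the tree stores plain Python lists rather than succinct bitvectors (so it shows the query logic, not the space savings), the IDF uses a common smoothed variant that may differ from the paper's, and the cosine is computed over query terms only.

```python
import math


class WaveletTree:
    """Pointer-based wavelet tree over integer symbols in [lo, hi] (illustrative)."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = (min(seq), max(seq)) if seq else (0, 0)
        self.lo, self.hi = lo, hi
        self.mid = (lo + hi) // 2
        self.left = self.right = None
        if lo == hi or not seq:
            return  # leaf node, or an empty subtree that is never descended into
        # prefix[i] = how many of the first i symbols are routed to the left child
        self.prefix = [0]
        for s in seq:
            self.prefix.append(self.prefix[-1] + (s <= self.mid))
        self.left = WaveletTree([s for s in seq if s <= self.mid], lo, self.mid)
        self.right = WaveletTree([s for s in seq if s > self.mid], self.mid + 1, hi)

    def rank(self, sym, i):
        """Number of occurrences of sym among the first i symbols."""
        if i == 0 or sym < self.lo or sym > self.hi:
            return 0
        if self.lo == self.hi:
            return i
        to_left = self.prefix[i]
        if sym <= self.mid:
            return self.left.rank(sym, to_left)
        return self.right.rank(sym, i - to_left)


def build_index(docs):
    """Concatenate tokenized documents into one symbol sequence and index it."""
    vocab, seq, bounds = {}, [], [0]
    for words in docs:
        for w in words:
            seq.append(vocab.setdefault(w, len(vocab)))
        bounds.append(len(seq))  # bounds[d]..bounds[d+1] delimits document d
    return WaveletTree(seq), vocab, bounds


def count(index, term, d):
    """Occurrences of term in document d, as a difference of two rank queries."""
    tree, vocab, bounds = index
    if term not in vocab:
        return 0
    sym = vocab[term]
    return tree.rank(sym, bounds[d + 1]) - tree.rank(sym, bounds[d])


def rank_documents(index, query):
    """Return (score, doc_id) pairs sorted by cosine similarity, best first."""
    _, _, bounds = index
    n_docs = len(bounds) - 1
    terms = sorted(set(query))
    tf = {t: [count(index, t, d) for d in range(n_docs)] for t in terms}
    # Smoothed IDF (an assumption; the record does not state the exact formula).
    idf = {t: math.log((1 + n_docs) / (1 + sum(c > 0 for c in tf[t]))) + 1
           for t in terms}
    q_vec = [query.count(t) * idf[t] for t in terms]
    q_norm = math.sqrt(sum(a * a for a in q_vec))
    scores = []
    for d in range(n_docs):
        d_vec = [tf[t][d] * idf[t] for t in terms]
        d_norm = math.sqrt(sum(b * b for b in d_vec))
        dot = sum(a * b for a, b in zip(q_vec, d_vec))
        scores.append((dot / (q_norm * d_norm) if q_norm and d_norm else 0.0, d))
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Toy transcripts standing in for pre-processed ASR output (made-up data).
docs = [["tax", "reform", "tax"], ["health", "reform"], ["tax", "health", "health"]]
index = build_index(docs)
ranking = rank_documents(index, ["tax", "reform"])
```

Each per-document count costs two rank queries, and each rank query descends one root-to-leaf path, so its cost grows with the logarithm of the vocabulary size rather than with the length of the transcripts.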