Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection

Recognition and retrieval are typically viewed as two cascaded, independent modules for spoken term detection (STD). Retrieval techniques are assumed to be applied on top of automatic speech recognition (ASR) output, with performance depending on ASR accuracy. We propose a framework that integrates recognition and retrieval, considering them jointly to yield better STD performance. This can be achieved either by adjusting the acoustic model parameters (model-based) or by considering detected examples (example-based), using relevance information provided by the user (user relevance feedback) or inferred by the system (pseudo-relevance feedback), either for a given query (short-term context) or by taking into account many previous queries (long-term context). Relevance feedback has long been used in text information retrieval, but it is rarely considered for the retrieval of spoken content and cannot be applied to it directly. The proposed relevance feedback approaches are specific to spoken content retrieval and are hence very different from those developed for text retrieval, which operate only on text symbols. We not only present these relevance feedback scenarios and approaches for STD, but also propose a framework that integrates them all. Preliminary experiments showed significant improvements in each case.
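
The example-based, short-term scenario mentioned in the abstract can be illustrated with a small sketch. The Python code below is a conceptual illustration only, not the authors' actual formulation: it treats the top-scoring hypothesized occurrences of a query as pseudo-relevant examples and re-ranks all candidates by interpolating the recognizer's lattice confidence with acoustic similarity to that set. The function names, the DTW similarity measure, and the parameters top_k and alpha are illustrative assumptions.

```python
import numpy as np


def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Length-normalized dynamic time warping distance between two
    feature sequences of shape (frames, dims)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m]) / (n + m)


def pseudo_relevance_rerank(candidates, top_k=5, alpha=0.5):
    """Re-rank STD candidates with pseudo-relevance feedback (illustrative sketch).

    candidates: list of (confidence, features) pairs, where confidence is the
    recognizer's lattice-based score and features is an array of acoustic
    frames (e.g. MFCCs) for the hypothesized region. The top_k highest-scoring
    candidates are assumed relevant; every candidate is then rescored by its
    similarity to that pseudo-relevant set, interpolated with the original
    confidence. top_k and alpha are illustrative values, not from the paper.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    pseudo_relevant = [feats for _, feats in ranked[:top_k]]

    rescored = []
    for confidence, feats in candidates:
        # Similarity = negative mean DTW distance to the pseudo-relevant examples.
        # In practice both terms would be normalized to comparable ranges first.
        similarity = -np.mean([dtw_distance(feats, ref) for ref in pseudo_relevant])
        rescored.append((alpha * confidence + (1.0 - alpha) * similarity, feats))
    return sorted(rescored, key=lambda c: c[0], reverse=True)
```

A user-relevance-feedback variant would replace the pseudo-relevant set with occurrences the user has confirmed; the model-based variant described in the abstract would instead use such examples to adapt acoustic model parameters and rescore the lattices.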

Bibliographic Details
Published in: IEEE Transactions on Audio, Speech, and Language Processing, 2012-09, Vol. 20 (7), pp. 2095-2110
Main Authors: LEE, Hung-Yi; CHEN, Chia-Ping; LEE, Lin-Shan
Format: Article
Language: English
Publisher: Piscataway, NJ: IEEE
DOI: 10.1109/TASL.2012.2196514
ISSN: 1558-7916; 2329-9290
EISSN: 1558-7924; 2329-9304
CODEN: ITASD8
Subjects: Accuracy; Acoustics; Applied sciences; Exact sciences and technology; Information retrieval; Information theory; Information, signal and communications theory; Lattices; Multimedia communication; Relevance feedback; Signal processing; Speech; Speech processing; Speech recognition; Spoken term detection; Studies; Telecommunications and information theory
Online Access: Order full text