Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units

Lattice-based speech indexing approaches are attractive for the combination of short spoken segments, short queries, and low automatic speech recognition (ASR) accuracies, as lattices provide recognition alternatives and therefore tend to compensate for recognition errors. Position-specific posterio...

Detailed description

Bibliographic details
Published in:	IEEE transactions on audio, speech, and language processing, 2010-08, Vol.18 (6), p.1562-1574
Main authors:	PAN, Yi-Cheng; LEE, Lin-Shan
Format:	Article
Language:	eng
Subjects:	Applied sciences; Automatic speech recognition; Bandwidth; Computer science; Exact sciences and technology; Humans; Indexing; Information retrieval; Information theory; Information, signal and communications theory; Lattices; Material storage; Performance analysis; Position-specific posterior lattice (PSPL); Signal processing; Speech analysis; Speech processing; spoken document indexing; spoken document retrieval (SDR); subword-based PSPL; Telecommunications and information theory

container_end_page	1574
container_issue	6
container_start_page	1562
container_title	IEEE transactions on audio, speech, and language processing
container_volume	18
creator	PAN, Yi-Cheng; LEE, Lin-Shan
description	Lattice-based speech indexing approaches are attractive for the combination of short spoken segments, short queries, and low automatic speech recognition (ASR) accuracies, as lattices provide recognition alternatives and therefore tend to compensate for recognition errors. Position-specific posterior lattices (PSPLs) and confusion networks (CNs), two of the most popular lattice-based approaches, both reduce disk space requirements and are more efficient than raw lattices. When PSPLs and CNs are used in a word-based fashion, they cannot handle OOV or rare word queries. In this paper, we propose an efficient approach for the construction of subword-based PSPLs (S-PSPLs) and CNs (S-CNs) and present a comprehensive performance analysis of PSPL and CN structures using both words and subword units, taking into account basic principles and structures, and supported by experimental results on Mandarin Chinese. S-PSPLs and S-CNs are shown to yield significant mean average precision (MAP) improvements over word-based PSPLs and CNs for both out-of-vocabulary (OOV) and in-vocabulary queries while requiring much less disk space for indexing.
doi_str_mv	10.1109/TASL.2009.2037404
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TASL_2009_2037404</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5340561</ieee_id><sourcerecordid>2721429091</sourcerecordid><originalsourceid>FETCH-LOGICAL-c323t-860cc32223e8e75d67098749126c1938cea4a5c16c16700869e7c9a9f4f47fa23</originalsourceid><addsrcrecordid>eNo9kFtLwzAUx4MoOKcfQHwJiI-dubVpHuvwMigobMPHEtNT17G1NadD9-1N2VgeknP5n0t-hNxyNuGcmcdFNs8ngjETLqkVU2dkxOM4jbQR6vxk8-SSXCGuGVMyUXxEig_wVeu3tnFAs8Zu9lgjDRGa276vHURPFqGk8w7AreisKeGvbr5p1nW-tW4FSJc4BD5bXyK1TZDuvn6DQ5dN3eM1uajsBuHm-I7J8uV5MX2L8vfX2TTLIyeF7KM0YS5YQkhIQcdloplJtTJcJI4bmTqwysaOBy-kWJoY0M5YU6lK6coKOSb3h75hrZ8dYF-s250P_8GCM6FNmBLOmPCDyvkW0UNVdL7eWr8PomLgWAwci4FjceQYah6OnS06u6l8QFXjqVBIprRSwwZ3B10NAKd0LBWLEy7_ARI3enI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1027932333</pqid></control><display><type>article</type><title>Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units</title><source>IEEE Electronic Library (IEL)</source><creator>PAN, Yi-Cheng ; LEE, Lin-Shan</creator><creatorcontrib>PAN, Yi-Cheng ; LEE, Lin-Shan</creatorcontrib><description>Lattice-based speech indexing approaches are attractive for the combination of short spoken segments, short queries, and low automatic speech recognition (ASR) accuracies, as lattices provide recognition alternatives and therefore tend to compensate for recognition errors. Position-specific posterior lattices (PSPLs) and confusion networks (CNs), two of the most popular lattice-based approaches, both reduce disk space requirements and are more efficient than raw lattices. When PSPLs and CNs are used in a word-based fashion, they cannot handle OOV or rare word queries. 
In this paper, we propose an efficient approach for the construction of subword-based PSPLs (S-PSPLs) and CNs (S-CNs) and present a comprehensive performance analysis of PSPL and CN structures using both words and subword units, taking into account basic principles and structures, and supported by experimental results on Mandarin Chinese. S-PSPLs and S-CNs are shown to yield significant mean average precision (MAP) improvements over word-based PSPLs and CNs for both out-of-vocabulary (OOV) and in-vocabulary queries while requiring much less disk space for indexing.</description><identifier>ISSN: 1558-7916</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-7924</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASL.2009.2037404</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway, NJ: IEEE</publisher><subject>Applied sciences ; Automatic speech recognition ; Bandwidth ; Computer science ; Exact sciences and technology ; Humans ; Indexing ; Information retrieval ; Information theory ; Information, signal and communications theory ; Lattices ; Material storage ; Performance analysis ; Position-specific posterior lattice (PSPL) ; Signal processing ; Speech analysis ; Speech processing ; spoken document indexing ; spoken document retrieval (SDR) ; subword-based PSPL ; Telecommunications and information theory</subject><ispartof>IEEE transactions on audio, speech, and language processing, 2010-08, Vol.18 (6), p.1562-1574</ispartof><rights>2015 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Aug 2010</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c323t-860cc32223e8e75d67098749126c1938cea4a5c16c16700869e7c9a9f4f47fa23</citedby><cites>FETCH-LOGICAL-c323t-860cc32223e8e75d67098749126c1938cea4a5c16c16700869e7c9a9f4f47fa23</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5340561$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5340561$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=23047442$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>PAN, Yi-Cheng</creatorcontrib><creatorcontrib>LEE, Lin-Shan</creatorcontrib><title>Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>Lattice-based speech indexing approaches are attractive for the combination of short spoken segments, short queries, and low automatic speech recognition (ASR) accuracies, as lattices provide recognition alternatives and therefore tend to compensate for recognition errors. Position-specific posterior lattices (PSPLs) and confusion networks (CNs), two of the most popular lattice-based approaches, both reduce disk space requirements and are more efficient than raw lattices. When PSPLs and CNs are used in a word-based fashion, they cannot handle OOV or rare word queries. 
In this paper, we propose an efficient approach for the construction of subword-based PSPLs (S-PSPLs) and CNs (S-CNs) and present a comprehensive performance analysis of PSPL and CN structures using both words and subword units, taking into account basic principles and structures, and supported by experimental results on Mandarin Chinese. S-PSPLs and S-CNs are shown to yield significant mean average precision (MAP) improvements over word-based PSPLs and CNs for both out-of-vocabulary (OOV) and in-vocabulary queries while requiring much less disk space for indexing.</description><subject>Applied sciences</subject><subject>Automatic speech recognition</subject><subject>Bandwidth</subject><subject>Computer science</subject><subject>Exact sciences and technology</subject><subject>Humans</subject><subject>Indexing</subject><subject>Information retrieval</subject><subject>Information theory</subject><subject>Information, signal and communications theory</subject><subject>Lattices</subject><subject>Material storage</subject><subject>Performance analysis</subject><subject>Position-specific posterior lattice (PSPL)</subject><subject>Signal processing</subject><subject>Speech analysis</subject><subject>Speech processing</subject><subject>spoken document indexing</subject><subject>spoken document retrieval (SDR)</subject><subject>subword-based PSPL</subject><subject>Telecommunications and information 
theory</subject><issn>1558-7916</issn><issn>2329-9290</issn><issn>1558-7924</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kFtLwzAUx4MoOKcfQHwJiI-dubVpHuvwMigobMPHEtNT17G1NadD9-1N2VgeknP5n0t-hNxyNuGcmcdFNs8ngjETLqkVU2dkxOM4jbQR6vxk8-SSXCGuGVMyUXxEig_wVeu3tnFAs8Zu9lgjDRGa276vHURPFqGk8w7AreisKeGvbr5p1nW-tW4FSJc4BD5bXyK1TZDuvn6DQ5dN3eM1uajsBuHm-I7J8uV5MX2L8vfX2TTLIyeF7KM0YS5YQkhIQcdloplJtTJcJI4bmTqwysaOBy-kWJoY0M5YU6lK6coKOSb3h75hrZ8dYF-s250P_8GCM6FNmBLOmPCDyvkW0UNVdL7eWr8PomLgWAwci4FjceQYah6OnS06u6l8QFXjqVBIprRSwwZ3B10NAKd0LBWLEy7_ARI3enI</recordid><startdate>20100801</startdate><enddate>20100801</enddate><creator>PAN, Yi-Cheng</creator><creator>LEE, Lin-Shan</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20100801</creationdate><title>Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units</title><author>PAN, Yi-Cheng ; LEE, Lin-Shan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c323t-860cc32223e8e75d67098749126c1938cea4a5c16c16700869e7c9a9f4f47fa23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Applied sciences</topic><topic>Automatic speech recognition</topic><topic>Bandwidth</topic><topic>Computer science</topic><topic>Exact sciences and technology</topic><topic>Humans</topic><topic>Indexing</topic><topic>Information retrieval</topic><topic>Information theory</topic><topic>Information, 
signal and communications theory</topic><topic>Lattices</topic><topic>Material storage</topic><topic>Performance analysis</topic><topic>Position-specific posterior lattice (PSPL)</topic><topic>Signal processing</topic><topic>Speech analysis</topic><topic>Speech processing</topic><topic>spoken document indexing</topic><topic>spoken document retrieval (SDR)</topic><topic>subword-based PSPL</topic><topic>Telecommunications and information theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>PAN, Yi-Cheng</creatorcontrib><creatorcontrib>LEE, Lin-Shan</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>PAN, Yi-Cheng</au><au>LEE, Lin-Shan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units</atitle><jtitle>IEEE transactions on audio, speech, and language 
processing</jtitle><stitle>TASL</stitle><date>2010-08-01</date><risdate>2010</risdate><volume>18</volume><issue>6</issue><spage>1562</spage><epage>1574</epage><pages>1562-1574</pages><issn>1558-7916</issn><issn>2329-9290</issn><eissn>1558-7924</eissn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>Lattice-based speech indexing approaches are attractive for the combination of short spoken segments, short queries, and low automatic speech recognition (ASR) accuracies, as lattices provide recognition alternatives and therefore tend to compensate for recognition errors. Position-specific posterior lattices (PSPLs) and confusion networks (CNs), two of the most popular lattice-based approaches, both reduce disk space requirements and are more efficient than raw lattices. When PSPLs and CNs are used in a word-based fashion, they cannot handle OOV or rare word queries. In this paper, we propose an efficient approach for the construction of subword-based PSPLs (S-PSPLs) and CNs (S-CNs) and present a comprehensive performance analysis of PSPL and CN structures using both words and subword units, taking into account basic principles and structures, and supported by experimental results on Mandarin Chinese. S-PSPLs and S-CNs are shown to yield significant mean average precision (MAP) improvements over word-based PSPLs and CNs for both out-of-vocabulary (OOV) and in-vocabulary queries while requiring much less disk space for indexing.</abstract><cop>Piscataway, NJ</cop><pub>IEEE</pub><doi>10.1109/TASL.2009.2037404</doi><tpages>13</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1558-7916
ispartof	IEEE transactions on audio, speech, and language processing, 2010-08, Vol.18 (6), p.1562-1574
issn	1558-7916 2329-9290 1558-7924 2329-9304
language	eng
recordid	cdi_crossref_primary_10_1109_TASL_2009_2037404
source	IEEE Electronic Library (IEL)
subjects	Applied sciences; Automatic speech recognition; Bandwidth; Computer science; Exact sciences and technology; Humans; Indexing; Information retrieval; Information theory; Information, signal and communications theory; Lattices; Material storage; Performance analysis; Position-specific posterior lattice (PSPL); Signal processing; Speech analysis; Speech processing; spoken document indexing; spoken document retrieval (SDR); subword-based PSPL; Telecommunications and information theory
title	Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T17%3A20%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Performance%20Analysis%20for%20Lattice-Based%20Speech%20Indexing%20Approaches%20Using%20Words%20and%20Subword%20Units&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=PAN,%20Yi-Cheng&rft.date=2010-08-01&rft.volume=18&rft.issue=6&rft.spage=1562&rft.epage=1574&rft.pages=1562-1574&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2009.2037404&rft_dat=%3Cproquest_RIE%3E2721429091%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1027932333&rft_id=info:pmid/&rft_ieee_id=5340561&rfr_iscdi=true