Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation

Human language is a combination of elemental languages/domains/styles that change across and sometimes within discourses. Language models, which play a crucial role in speech recognizers and machine translation systems, are particularly sensitive to such changes, unless some form of adaptation takes...

Bibliographic details
Main authors: Karakos, Damianos; Dredze, Mark; Khudanpur, Sanjeev
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Karakos, Damianos
Dredze, Mark
Khudanpur, Sanjeev
description Human language is a combination of elemental languages/domains/styles that change across and sometimes within discourses. Language models, which play a crucial role in speech recognizers and machine translation systems, are particularly sensitive to such changes, unless some form of adaptation takes place. One approach to speech language model adaptation is self-training, in which a language model's parameters are tuned based on automatically transcribed audio. However, transcription errors can misguide self-training, particularly in challenging settings such as conversational speech. In this work, we propose a model that considers the confusions (errors) of the ASR channel. By modeling the likely confusions in the ASR output instead of using just the 1-best, we improve self-training efficacy by obtaining a more reliable reference transcription estimate. We demonstrate improved topic-based language modeling adaptation results over both 1-best and lattice self-training using our ASR channel confusion estimates on telephone conversations.
doi_str_mv 10.48550/arxiv.1303.5148
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1303_5148</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1303_5148</sourcerecordid><originalsourceid>FETCH-LOGICAL-a658-5512f73403d606abf2cd5b8512d4c6c8bc7177cfbf4eb34550b19bdd447719f63</originalsourceid><addsrcrecordid>eNotz81LwzAYBvBcPMj07knyD7Qmy-eOpUwdVAbam4eSzy6wJaXphv73ZurpfXh5eOAHwANGNZWMoSc1f4VLjQkiNcNU3oLPbV7CSS0hjrBN0Z9zSDHDEOFycLD5eIftQcXojtCnGe5O05wuzsI-TcFUWuWSOxXHsxodfEu29BqrpqUMpngHbrw6Znf_f1egf9727WvV7V92bdNVijNZMYbXXhCKiOWIK-3XxjIty9dSw43URmAhjNeeOk1oUWi80dZSKgTeeE5W4PFv9hc3THPxzN_DFTlckeQHJN1Mmg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation</title><source>arXiv.org</source><creator>Karakos, Damianos ; Dredze, Mark ; Khudanpur, Sanjeev</creator><creatorcontrib>Karakos, Damianos ; Dredze, Mark ; Khudanpur, Sanjeev</creatorcontrib><description>Human language is a combination of elemental languages/domains/styles that change across and sometimes within discourses. Language models, which play a crucial role in speech recognizers and machine translation systems, are particularly sensitive to such changes, unless some form of adaptation takes place. One approach to speech language model adaptation is self-training, in which a language model's parameters are tuned based on automatically transcribed audio. However, transcription errors can misguide self-training, particularly in challenging settings such as conversational speech. In this work, we propose a model that considers the confusions (errors) of the ASR channel. By modeling the likely confusions in the ASR output instead of using just the 1-best, we improve self-training efficacy by obtaining a more reliable reference transcription estimate. 
We demonstrate improved topic-based language modeling adaptation results over both 1-best and lattice self-training using our ASR channel confusion estimates on telephone conversations.</description><identifier>DOI: 10.48550/arxiv.1303.5148</identifier><language>eng</language><subject>Computer Science - Computation and Language ; Computer Science - Learning</subject><creationdate>2013-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,777,882</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1303.5148$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1303.5148$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Karakos, Damianos</creatorcontrib><creatorcontrib>Dredze, Mark</creatorcontrib><creatorcontrib>Khudanpur, Sanjeev</creatorcontrib><title>Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation</title><description>Human language is a combination of elemental languages/domains/styles that change across and sometimes within discourses. Language models, which play a crucial role in speech recognizers and machine translation systems, are particularly sensitive to such changes, unless some form of adaptation takes place. One approach to speech language model adaptation is self-training, in which a language model's parameters are tuned based on automatically transcribed audio. However, transcription errors can misguide self-training, particularly in challenging settings such as conversational speech. In this work, we propose a model that considers the confusions (errors) of the ASR channel. 
By modeling the likely confusions in the ASR output instead of using just the 1-best, we improve self-training efficacy by obtaining a more reliable reference transcription estimate. We demonstrate improved topic-based language modeling adaptation results over both 1-best and lattice self-training using our ASR channel confusion estimates on telephone conversations.</description><subject>Computer Science - Computation and Language</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz81LwzAYBvBcPMj07knyD7Qmy-eOpUwdVAbam4eSzy6wJaXphv73ZurpfXh5eOAHwANGNZWMoSc1f4VLjQkiNcNU3oLPbV7CSS0hjrBN0Z9zSDHDEOFycLD5eIftQcXojtCnGe5O05wuzsI-TcFUWuWSOxXHsxodfEu29BqrpqUMpngHbrw6Znf_f1egf9727WvV7V92bdNVijNZMYbXXhCKiOWIK-3XxjIty9dSw43URmAhjNeeOk1oUWi80dZSKgTeeE5W4PFv9hc3THPxzN_DFTlckeQHJN1Mmg</recordid><startdate>20130320</startdate><enddate>20130320</enddate><creator>Karakos, Damianos</creator><creator>Dredze, Mark</creator><creator>Khudanpur, Sanjeev</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20130320</creationdate><title>Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation</title><author>Karakos, Damianos ; Dredze, Mark ; Khudanpur, Sanjeev</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a658-5512f73403d606abf2cd5b8512d4c6c8bc7177cfbf4eb34550b19bdd447719f63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Computer Science - Computation and Language</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Karakos, Damianos</creatorcontrib><creatorcontrib>Dredze, Mark</creatorcontrib><creatorcontrib>Khudanpur, Sanjeev</creatorcontrib><collection>arXiv Computer 
Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Karakos, Damianos</au><au>Dredze, Mark</au><au>Khudanpur, Sanjeev</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation</atitle><date>2013-03-20</date><risdate>2013</risdate><abstract>Human language is a combination of elemental languages/domains/styles that change across and sometimes within discourses. Language models, which play a crucial role in speech recognizers and machine translation systems, are particularly sensitive to such changes, unless some form of adaptation takes place. One approach to speech language model adaptation is self-training, in which a language model's parameters are tuned based on automatically transcribed audio. However, transcription errors can misguide self-training, particularly in challenging settings such as conversational speech. In this work, we propose a model that considers the confusions (errors) of the ASR channel. By modeling the likely confusions in the ASR output instead of using just the 1-best, we improve self-training efficacy by obtaining a more reliable reference transcription estimate. We demonstrate improved topic-based language modeling adaptation results over both 1-best and lattice self-training using our ASR channel confusion estimates on telephone conversations.</abstract><doi>10.48550/arxiv.1303.5148</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1303.5148
ispartof
issn
language eng
recordid cdi_arxiv_primary_1303_5148
source arXiv.org
subjects Computer Science - Computation and Language
Computer Science - Learning
title Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T21%3A08%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Estimating%20Confusions%20in%20the%20ASR%20Channel%20for%20Improved%20Topic-based%20Language%20Model%20Adaptation&rft.au=Karakos,%20Damianos&rft.date=2013-03-20&rft_id=info:doi/10.48550/arxiv.1303.5148&rft_dat=%3Carxiv_GOX%3E1303_5148%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true