ASR Error Correction and Domain Adaptation Using Machine Translation
Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an increasingly viable service for companies of any size building speech-based products. While these ASR systems are trained on large amounts of data, domain mismatch is still an issue for many such parties that want to use thi...
Main authors: | Mani, Anirudh ; Palaskar, Shruti ; Meripo, Nimshi Venkat ; Konam, Sandeep ; Metze, Florian |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Mani, Anirudh ; Palaskar, Shruti ; Meripo, Nimshi Venkat ; Konam, Sandeep ; Metze, Florian |
description | Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an
increasingly viable service for companies of any size building speech-based
products. Although these ASR systems are trained on large amounts of data, domain
mismatch remains an issue for many parties that want to use the service as-is,
leading to suboptimal results for their task. We propose a simple
technique to perform domain adaptation for ASR error correction via machine
translation. The machine translation model is a strong candidate to learn a
mapping from out-of-domain ASR errors to in-domain terms in the corresponding
reference files. We use two off-the-shelf ASR systems in this work: Google ASR
(commercial) and the ASPIRE model (open-source). We observe a 7% absolute
improvement in word error rate and a 4-point absolute improvement in BLEU score
on Google ASR output with our proposed method. We also evaluate ASR error
correction on the downstream task of speaker diarization, which captures the
speaker-style, syntactic, structural, and semantic improvements obtained via ASR
correction. |
doi_str_mv | 10.48550/arxiv.2003.07692 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2003_07692</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2003_07692</sourcerecordid><originalsourceid>FETCH-LOGICAL-a672-90cb522d252d6ef5304ff50edc83ddd402699e27d6b3431134f37fcdcbf3c2c63</originalsourceid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>ASR Error Correction and Domain Adaptation Using Machine Translation</title><source>arXiv.org</source><creator>Mani, Anirudh ; Palaskar, Shruti ; Meripo, Nimshi Venkat ; Konam, Sandeep ; Metze, Florian</creator><creatorcontrib>Mani, Anirudh ; Palaskar, Shruti ; Meripo, Nimshi Venkat ; Konam, Sandeep ; Metze, Florian</creatorcontrib><identifier>DOI: 10.48550/arxiv.2003.07692</identifier><language>eng</language><subject>Computer Science - Learning ; Computer Science - Sound ; Statistics - Machine Learning</subject><creationdate>2020-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2003.07692$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2003.07692$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mani, Anirudh</creatorcontrib><creatorcontrib>Palaskar, Shruti</creatorcontrib><creatorcontrib>Meripo, Nimshi Venkat</creatorcontrib><creatorcontrib>Konam, Sandeep</creatorcontrib><creatorcontrib>Metze, Florian</creatorcontrib><title>ASR Error Correction and Domain Adaptation Using Machine Translation</title><subject>Computer Science - Learning</subject><subject>Computer Science - Sound</subject><subject>Statistics - Machine Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><startdate>20200313</startdate><enddate>20200313</enddate><creator>Mani, Anirudh</creator><creator>Palaskar, Shruti</creator><creator>Meripo, Nimshi Venkat</creator><creator>Konam, Sandeep</creator><creator>Metze, Florian</creator><scope>AKY</scope><scope>EPD</scope><scope>GOX</scope></search><sort><creationdate>20200313</creationdate><title>ASR Error Correction and Domain Adaptation Using Machine Translation</title><author>Mani, Anirudh ; Palaskar, Shruti ; Meripo, Nimshi Venkat ; Konam, Sandeep ; Metze, Florian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a672-90cb522d252d6ef5304ff50edc83ddd402699e27d6b3431134f37fcdcbf3c2c63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Learning</topic><topic>Computer Science - Sound</topic><topic>Statistics - Machine Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Mani, Anirudh</creatorcontrib><creatorcontrib>Palaskar, Shruti</creatorcontrib><creatorcontrib>Meripo, Nimshi Venkat</creatorcontrib><creatorcontrib>Konam, Sandeep</creatorcontrib><creatorcontrib>Metze, Florian</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Statistics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mani, Anirudh</au><au>Palaskar, Shruti</au><au>Meripo, Nimshi Venkat</au><au>Konam, Sandeep</au><au>Metze, Florian</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>ASR Error Correction and Domain Adaptation Using Machine Translation</atitle><date>2020-03-13</date><risdate>2020</risdate><doi>10.48550/arxiv.2003.07692</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2003.07692 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2003_07692 |
source | arXiv.org |
subjects | Computer Science - Learning ; Computer Science - Sound ; Statistics - Machine Learning |
title | ASR Error Correction and Domain Adaptation Using Machine Translation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T14%3A32%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ASR%20Error%20Correction%20and%20Domain%20Adaptation%20Using%20Machine%20Translation&rft.au=Mani,%20Anirudh&rft.date=2020-03-13&rft_id=info:doi/10.48550/arxiv.2003.07692&rft_dat=%3Carxiv_GOX%3E2003_07692%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |