Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation

Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization...

Bibliographic details
Published in: IEICE Transactions on Information and Systems 2022/12/01, Vol.E105.D(12), pp.2112-2118
Main authors: MA, Han; ZHANG, Qiaoling; TANG, Roubing; ZHANG, Lu; JIA, Yubo
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2118
container_issue 12
container_start_page 2112
container_title IEICE Transactions on Information and Systems
container_volume E105.D
creator MA, Han
ZHANG, Qiaoling
TANG, Roubing
ZHANG, Lu
JIA, Yubo
description Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. Comparative experiments demonstrate that the proposed method obtained good robustness against noise.
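The criterion described above compares encoder outputs rather than output posteriors. The following is a minimal sketch of that idea, not the authors' implementation: the abstract does not state the distance measure, so mean squared error is assumed here, and `toy_encoder`, `encoder_distillation_loss`, and all shapes and weights are hypothetical names and values chosen for illustration.

```python
import numpy as np


def encoder_distillation_loss(teacher_enc, student_enc):
    """Mean squared distance between teacher and student encoder outputs.

    Assumption: MSE stands in for the paper's criterion, which the
    abstract does not specify.
    """
    teacher_enc = np.asarray(teacher_enc, dtype=np.float64)
    student_enc = np.asarray(student_enc, dtype=np.float64)
    if teacher_enc.shape != student_enc.shape:
        raise ValueError("encoder outputs must have the same shape")
    return float(np.mean((teacher_enc - student_enc) ** 2))


def toy_encoder(features, weights):
    """Hypothetical one-layer encoder: frames x features -> frames x embedding."""
    return np.tanh(features @ weights)


rng = np.random.default_rng(0)
clean = rng.normal(size=(5, 8))                     # 5 frames of clean features
noisy = clean + 0.3 * rng.normal(size=clean.shape)  # same frames with additive noise

w_teacher = rng.normal(size=(8, 4))  # teacher weights stay fixed
w_student = w_teacher.copy()         # student starts from the teacher

# Teacher encodes clean audio, student encodes the noisy version of it.
loss = encoder_distillation_loss(
    toy_encoder(clean, w_teacher),
    toy_encoder(noisy, w_student),
)
print(f"encoder-level loss: {loss:.4f}")
```

Minimising this quantity with respect to the student weights would pull the noisy-audio embeddings towards the clean-audio embeddings, which is the stated aim of the method. A real system would use a full acoustic encoder and gradient-based training.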
doi_str_mv 10.1587/transinf.2022EDP7043
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2747018297</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2747018297</sourcerecordid><originalsourceid>FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</originalsourceid><addsrcrecordid>eNpNkE1PwkAQhjdGExH9Bx6aeK7uZ7c9EsCvkGgAz5vp7hZKYFt3twf_vTQIcppJ5nnfmXkRuif4kYhcPkUPLtSueqSY0unkU2LOLtCASC5SwjJyiQa4IFmaC0av0U0IG4xJTokYoPd5U3YhJovWWr1O5lY3K1fHunHJ195ylSwt6LX16SJ2xrqYzCx41w8mzQ5ql4wMtBF6wS26qmAb7N1fHaLF83Q5fk1nHy9v49Es1YLymBY4k1XJq1LnUAJlGRCMaSGtkdRwW-alLoWAwnBmcKUJhVwainGRZ5kp2RA9HFxb33x3NkS1aTrv9gsVlVz2fxVyT_EDpX0TgreVan29A_-jCFZ9ZuqYmTrLjJ1O3oQIK3sSgY-13tp_0ZRgoSaK0GN35nKi9Rq8so79Ak6Vf4A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2747018297</pqid></control><display><type>article</type><title>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</title><source>J-STAGE Free</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>MA, Han ; ZHANG, Qiaoling ; TANG, Roubing ; ZHANG, Lu ; JIA, Yubo</creator><creatorcontrib>MA, Han ; ZHANG, Qiaoling ; TANG, Roubing ; ZHANG, Lu ; JIA, Yubo</creatorcontrib><description>Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. 
Comparative experiments demonstrate that the proposed method obtained good robustness against noise.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.2022EDP7043</identifier><language>eng</language><publisher>Tokyo: The Institute of Electronics, Information and Communication Engineers</publisher><subject>Adaptation ; automatic speech recognition ; Coders ; domain adaptation ; Domains ; Learning ; noise robustness ; Optimization ; Robustness ; Speech recognition ; teacher-student learning ; Teachers</subject><ispartof>IEICE Transactions on Information and Systems, 2022/12/01, Vol.E105.D(12), pp.2112-2118</ispartof><rights>2022 The Institute of Electronics, Information and Communication Engineers</rights><rights>Copyright Japan Science and Technology Agency 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</citedby><cites>FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1877,27901,27902</link.rule.ids></links><search><creatorcontrib>MA, Han</creatorcontrib><creatorcontrib>ZHANG, Qiaoling</creatorcontrib><creatorcontrib>TANG, Roubing</creatorcontrib><creatorcontrib>ZHANG, Lu</creatorcontrib><creatorcontrib>JIA, Yubo</creatorcontrib><title>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. &amp; Syst.</addtitle><description>Recently, robust speech recognition for real-world applications has attracted much attention. 
This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. Comparative experiments demonstrate that the proposed method obtained good robustness against noise.</description><subject>Adaptation</subject><subject>automatic speech recognition</subject><subject>Coders</subject><subject>domain adaptation</subject><subject>Domains</subject><subject>Learning</subject><subject>noise robustness</subject><subject>Optimization</subject><subject>Robustness</subject><subject>Speech recognition</subject><subject>teacher-student learning</subject><subject>Teachers</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNpNkE1PwkAQhjdGExH9Bx6aeK7uZ7c9EsCvkGgAz5vp7hZKYFt3twf_vTQIcppJ5nnfmXkRuif4kYhcPkUPLtSueqSY0unkU2LOLtCASC5SwjJyiQa4IFmaC0av0U0IG4xJTokYoPd5U3YhJovWWr1O5lY3K1fHunHJ195ylSwt6LX16SJ2xrqYzCx41w8mzQ5ql4wMtBF6wS26qmAb7N1fHaLF83Q5fk1nHy9v49Es1YLymBY4k1XJq1LnUAJlGRCMaSGtkdRwW-alLoWAwnBmcKUJhVwainGRZ5kp2RA9HFxb33x3NkS1aTrv9gsVlVz2fxVyT_EDpX0TgreVan29A_-jCFZ9ZuqYmTrLjJ1O3oQIK3sSgY-13tp_0ZRgoSaK0GN35nKi9Rq8so79Ak6Vf4A</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>MA, Han</creator><creator>ZHANG, Qiaoling</creator><creator>TANG, Roubing</creator><creator>ZHANG, Lu</creator><creator>JIA, Yubo</creator><general>The Institute of Electronics, Information and Communication Engineers</general><general>Japan Science and Technology 
Agency</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20221201</creationdate><title>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</title><author>MA, Han ; ZHANG, Qiaoling ; TANG, Roubing ; ZHANG, Lu ; JIA, Yubo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Adaptation</topic><topic>automatic speech recognition</topic><topic>Coders</topic><topic>domain adaptation</topic><topic>Domains</topic><topic>Learning</topic><topic>noise robustness</topic><topic>Optimization</topic><topic>Robustness</topic><topic>Speech recognition</topic><topic>teacher-student learning</topic><topic>Teachers</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>MA, Han</creatorcontrib><creatorcontrib>ZHANG, Qiaoling</creatorcontrib><creatorcontrib>TANG, Roubing</creatorcontrib><creatorcontrib>ZHANG, Lu</creatorcontrib><creatorcontrib>JIA, Yubo</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>MA, Han</au><au>ZHANG, Qiaoling</au><au>TANG, Roubing</au><au>ZHANG, Lu</au><au>JIA, 
Yubo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. &amp; Syst.</addtitle><date>2022-12-01</date><risdate>2022</risdate><volume>E105.D</volume><issue>12</issue><spage>2112</spage><epage>2118</epage><pages>2112-2118</pages><artnum>2022EDP7043</artnum><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. Comparative experiments demonstrate that the proposed method obtained good robustness against noise.</abstract><cop>Tokyo</cop><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.2022EDP7043</doi><tpages>7</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0916-8532
ispartof IEICE Transactions on Information and Systems, 2022/12/01, Vol.E105.D(12), pp.2112-2118
issn 0916-8532
1745-1361
language eng
recordid cdi_proquest_journals_2747018297
source J-STAGE Free; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Adaptation
automatic speech recognition
Coders
domain adaptation
Domains
Learning
noise robustness
Optimization
Robustness
Speech recognition
teacher-student learning
Teachers
title Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T06%3A02%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20Speech%20Recognition%20Using%20Teacher-Student%20Learning%20Domain%20Adaptation&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=MA,%20Han&rft.date=2022-12-01&rft.volume=E105.D&rft.issue=12&rft.spage=2112&rft.epage=2118&rft.pages=2112-2118&rft.artnum=2022EDP7043&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.2022EDP7043&rft_dat=%3Cproquest_cross%3E2747018297%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2747018297&rft_id=info:pmid/&rfr_iscdi=true