Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation
Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization...
Published in: | IEICE Transactions on Information and Systems 2022/12/01, Vol.E105.D(12), pp.2112-2118 |
---|---|
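Main authors: | MA, Han; ZHANG, Qiaoling; TANG, Roubing; ZHANG, Lu; JIA, Yubo |
Format: | Article |
Language: | eng |
Subjects: | Adaptation; automatic speech recognition; Coders; domain adaptation; Domains; Learning; noise robustness; Optimization; Robustness; Speech recognition; teacher-student learning; Teachers |
Online access: | Full text |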
container_end_page | 2118 |
---|---|
container_issue | 12 |
container_start_page | 2112 |
container_title | IEICE Transactions on Information and Systems |
container_volume | E105.D |
creator | MA, Han; ZHANG, Qiaoling; TANG, Roubing; ZHANG, Lu; JIA, Yubo |
description | Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. Comparative experiments demonstrate that the proposed method obtained good robustness against noise. |
doi_str_mv | 10.1587/transinf.2022EDP7043 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2747018297</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2747018297</sourcerecordid><originalsourceid>FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</originalsourceid><addsrcrecordid>eNpNkE1PwkAQhjdGExH9Bx6aeK7uZ7c9EsCvkGgAz5vp7hZKYFt3twf_vTQIcppJ5nnfmXkRuif4kYhcPkUPLtSueqSY0unkU2LOLtCASC5SwjJyiQa4IFmaC0av0U0IG4xJTokYoPd5U3YhJovWWr1O5lY3K1fHunHJ195ylSwt6LX16SJ2xrqYzCx41w8mzQ5ql4wMtBF6wS26qmAb7N1fHaLF83Q5fk1nHy9v49Es1YLymBY4k1XJq1LnUAJlGRCMaSGtkdRwW-alLoWAwnBmcKUJhVwainGRZ5kp2RA9HFxb33x3NkS1aTrv9gsVlVz2fxVyT_EDpX0TgreVan29A_-jCFZ9ZuqYmTrLjJ1O3oQIK3sSgY-13tp_0ZRgoSaK0GN35nKi9Rq8so79Ak6Vf4A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2747018297</pqid></control><display><type>article</type><title>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</title><source>J-STAGE Free</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>MA, Han ; ZHANG, Qiaoling ; TANG, Roubing ; ZHANG, Lu ; JIA, Yubo</creator><creatorcontrib>MA, Han ; ZHANG, Qiaoling ; TANG, Roubing ; ZHANG, Lu ; JIA, Yubo</creatorcontrib><description>Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. 
Comparative experiments demonstrate that the proposed method obtained good robustness against noise.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.2022EDP7043</identifier><language>eng</language><publisher>Tokyo: The Institute of Electronics, Information and Communication Engineers</publisher><subject>Adaptation ; automatic speech recognition ; Coders ; domain adaptation ; Domains ; Learning ; noise robustness ; Optimization ; Robustness ; Speech recognition ; teacher-student learning ; Teachers</subject><ispartof>IEICE Transactions on Information and Systems, 2022/12/01, Vol.E105.D(12), pp.2112-2118</ispartof><rights>2022 The Institute of Electronics, Information and Communication Engineers</rights><rights>Copyright Japan Science and Technology Agency 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</citedby><cites>FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1877,27901,27902</link.rule.ids></links><search><creatorcontrib>MA, Han</creatorcontrib><creatorcontrib>ZHANG, Qiaoling</creatorcontrib><creatorcontrib>TANG, Roubing</creatorcontrib><creatorcontrib>ZHANG, Lu</creatorcontrib><creatorcontrib>JIA, Yubo</creatorcontrib><title>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. & Syst.</addtitle><description>Recently, robust speech recognition for real-world applications has attracted much attention. 
This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. Comparative experiments demonstrate that the proposed method obtained good robustness against noise.</description><subject>Adaptation</subject><subject>automatic speech recognition</subject><subject>Coders</subject><subject>domain adaptation</subject><subject>Domains</subject><subject>Learning</subject><subject>noise robustness</subject><subject>Optimization</subject><subject>Robustness</subject><subject>Speech recognition</subject><subject>teacher-student learning</subject><subject>Teachers</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNpNkE1PwkAQhjdGExH9Bx6aeK7uZ7c9EsCvkGgAz5vp7hZKYFt3twf_vTQIcppJ5nnfmXkRuif4kYhcPkUPLtSueqSY0unkU2LOLtCASC5SwjJyiQa4IFmaC0av0U0IG4xJTokYoPd5U3YhJovWWr1O5lY3K1fHunHJ195ylSwt6LX16SJ2xrqYzCx41w8mzQ5ql4wMtBF6wS26qmAb7N1fHaLF83Q5fk1nHy9v49Es1YLymBY4k1XJq1LnUAJlGRCMaSGtkdRwW-alLoWAwnBmcKUJhVwainGRZ5kp2RA9HFxb33x3NkS1aTrv9gsVlVz2fxVyT_EDpX0TgreVan29A_-jCFZ9ZuqYmTrLjJ1O3oQIK3sSgY-13tp_0ZRgoSaK0GN35nKi9Rq8so79Ak6Vf4A</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>MA, Han</creator><creator>ZHANG, Qiaoling</creator><creator>TANG, Roubing</creator><creator>ZHANG, Lu</creator><creator>JIA, Yubo</creator><general>The Institute of Electronics, Information and Communication Engineers</general><general>Japan Science and Technology 
Agency</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20221201</creationdate><title>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</title><author>MA, Han ; ZHANG, Qiaoling ; TANG, Roubing ; ZHANG, Lu ; JIA, Yubo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c524t-9067fb4fbc8aba236a100297ed72d4eb8bcb55a9d43d0fc12a87d2009866db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Adaptation</topic><topic>automatic speech recognition</topic><topic>Coders</topic><topic>domain adaptation</topic><topic>Domains</topic><topic>Learning</topic><topic>noise robustness</topic><topic>Optimization</topic><topic>Robustness</topic><topic>Speech recognition</topic><topic>teacher-student learning</topic><topic>Teachers</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>MA, Han</creatorcontrib><creatorcontrib>ZHANG, Qiaoling</creatorcontrib><creatorcontrib>TANG, Roubing</creatorcontrib><creatorcontrib>ZHANG, Lu</creatorcontrib><creatorcontrib>JIA, Yubo</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>MA, Han</au><au>ZHANG, Qiaoling</au><au>TANG, Roubing</au><au>ZHANG, Lu</au><au>JIA, 
Yubo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. & Syst.</addtitle><date>2022-12-01</date><risdate>2022</risdate><volume>E105.D</volume><issue>12</issue><spage>2112</spage><epage>2118</epage><pages>2112-2118</pages><artnum>2022EDP7043</artnum><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>Recently, robust speech recognition for real-world applications has attracted much attention. This paper proposes a robust speech recognition method based on the teacher-student learning framework for domain adaptation. In particular, the student network will be trained based on a novel optimization criterion defined by the encoder outputs of both teacher and student networks rather than the final output posterior probabilities, which aims to make the noisy audio map to the same embedding space as clean audio, so that the student network is adaptive in the noise domain. Comparative experiments demonstrate that the proposed method obtained good robustness against noise.</abstract><cop>Tokyo</cop><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.2022EDP7043</doi><tpages>7</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0916-8532 |
ispartof | IEICE Transactions on Information and Systems, 2022/12/01, Vol.E105.D(12), pp.2112-2118 |
issn | 0916-8532 1745-1361 |
language | eng |
recordid | cdi_proquest_journals_2747018297 |
source | J-STAGE Free; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Adaptation; automatic speech recognition; Coders; domain adaptation; Domains; Learning; noise robustness; Optimization; Robustness; Speech recognition; teacher-student learning; Teachers |
title | Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T06%3A02%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20Speech%20Recognition%20Using%20Teacher-Student%20Learning%20Domain%20Adaptation&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=MA,%20Han&rft.date=2022-12-01&rft.volume=E105.D&rft.issue=12&rft.spage=2112&rft.epage=2118&rft.pages=2112-2118&rft.artnum=2022EDP7043&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.2022EDP7043&rft_dat=%3Cproquest_cross%3E2747018297%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2747018297&rft_id=info:pmid/&rfr_iscdi=true |
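
The abstract in this record describes a training criterion defined on the encoder outputs of the teacher and student networks rather than on output posteriors. A minimal sketch of that idea follows. It assumes mean squared error as the distance, which the abstract does not specify, and the function name `encoder_distillation_loss` and the toy values are hypothetical illustrations, not the authors' implementation.

```python
from typing import Sequence

# Encoder outputs as (time, feature) nested sequences of floats.
Frames = Sequence[Sequence[float]]


def encoder_distillation_loss(teacher_out: Frames, student_out: Frames) -> float:
    """Mean squared error between teacher and student encoder outputs.

    Assumption: MSE is an illustrative choice of distance; the abstract
    only says the criterion is defined on the encoder outputs.
    """
    if len(teacher_out) != len(student_out):
        raise ValueError("teacher and student must have the same number of frames")
    total = 0.0
    count = 0
    for t_frame, s_frame in zip(teacher_out, student_out):
        if len(t_frame) != len(s_frame):
            raise ValueError("frame dimensions differ")
        for t, s in zip(t_frame, s_frame):
            total += (t - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty encoder outputs")
    return total / count


# Hypothetical encoder outputs: 2 frames x 3 features.
# The teacher sees clean audio; the student sees the noisy version.
teacher_clean = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
student_noisy = [[1.0, 1.0, 2.0], [0.5, 0.5, 1.5]]
loss = encoder_distillation_loss(teacher_clean, student_noisy)
```

Minimizing a loss of this kind pulls the student's embedding of noisy audio toward the teacher's embedding of the matching clean audio, which is the stated aim of mapping both into the same embedding space.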