Adversarial Training for Multilingual Acoustic Modeling

Multilingual training has been shown to improve acoustic modeling performance by sharing and transferring knowledge in modeling different languages. Knowledge sharing is usually achieved by using common lower-level layers for different languages in a deep neural network. Recently, the domain adversarial network was proposed to reduce domain mismatch of training data and learn domain-invariant features. It is thus worth exploring whether adversarial training can further promote knowledge sharing in multilingual models. In this work, we apply the domain adversarial network to encourage the shared layers of a multilingual model to learn language-invariant features. Bidirectional Long Short-Term Memory (LSTM) recurrent neural networks (RNN) are used as building blocks. We show that shared layers learned this way contain less language identification information and lead to better performance. In an automatic speech recognition task for seven languages, the resultant acoustic model improves the word error rate (WER) of the multilingual model by 4% relative on average, and the monolingual models by 10%.
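The core mechanism the abstract refers to is the gradient reversal used in domain adversarial networks: a language classifier is attached to the shared layers, and its gradient is sign-flipped (and scaled) before it reaches the shared weights, so those weights learn features the classifier cannot exploit. The following is a minimal numpy sketch of that idea on a toy linear layer with hand-computed gradients; the shapes, targets, and the scale `lam` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Toy setup: a shared linear layer feeds two heads, a task head and a
# language-classifier head. In the backward pass the language head's
# gradient is multiplied by -lam before reaching the shared weights,
# which is the "gradient reversal" of domain adversarial training.
rng = np.random.default_rng(0)
W_shared = rng.normal(size=(4, 8))   # shared "lower layers" (toy linear map)
lam = 0.5                            # reversal scale (assumed value)

x = rng.normal(size=8)               # one input frame
h = W_shared @ x                     # shared features

# Toy squared-error heads with fixed targets; gradients w.r.t. h by hand.
g_task = 2.0 * (h - 1.0)             # dL_task/dh  (e.g. senone targets)
g_lang = 2.0 * (h + 1.0)             # dL_lang/dh  (language classifier)

# Gradient reversal: the task gradient flows through unchanged, the
# language gradient is sign-flipped and scaled before the shared layer.
g_shared = np.outer(g_task - lam * g_lang, x)

W_shared -= 0.01 * g_shared          # one SGD step on the shared layer
```

With the sign flip, the shared layer ascends the language-classification loss while descending the task loss, pushing `h` toward language-invariant features.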

Bibliographic Details
Main Authors: Hu, Ke; Sak, Hasim; Liao, Hank
Format: Article
Language: English
Subjects: Computer Science - Computation and Language; Computer Science - Learning; Computer Science - Sound
Online Access: Order full text
creator Hu, Ke; Sak, Hasim; Liao, Hank
description Multilingual training has been shown to improve acoustic modeling performance by sharing and transferring knowledge in modeling different languages. Knowledge sharing is usually achieved by using common lower-level layers for different languages in a deep neural network. Recently, the domain adversarial network was proposed to reduce domain mismatch of training data and learn domain-invariant features. It is thus worth exploring whether adversarial training can further promote knowledge sharing in multilingual models. In this work, we apply the domain adversarial network to encourage the shared layers of a multilingual model to learn language-invariant features. Bidirectional Long Short-Term Memory (LSTM) recurrent neural networks (RNN) are used as building blocks. We show that shared layers learned this way contain less language identification information and lead to better performance. In an automatic speech recognition task for seven languages, the resultant acoustic model improves the word error rate (WER) of the multilingual model by 4% relative on average, and the monolingual models by 10%.
doi_str_mv 10.48550/arxiv.1906.07093
format Article
creationdate 2019-06-17
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1906.07093
language eng
recordid cdi_arxiv_primary_1906_07093
source arXiv.org
subjects Computer Science - Computation and Language
Computer Science - Learning
Computer Science - Sound
title Adversarial Training for Multilingual Acoustic Modeling
url https://arxiv.org/abs/1906.07093