Investigating Effect of Dialogue History in Multilingual Task Oriented Dialogue Systems

While English virtual assistants have achieved exciting performance with an enormous amount of training resources, the needs of non-English speakers remain poorly served. As of December 2021, Alexa, one of the most popular smart speakers around the world, supports 9 different languages [1], while there are thousands of languages in the world, 91 of which are spoken by more than 10 million people according to statistics published in 2019 [2]. However, training a virtual assistant in languages other than English is often more difficult, especially for low-resource languages. The lack of high-quality training data restricts the performance of models, resulting in poor user satisfaction. Therefore, we devise an efficient and effective training solution for multilingual task-oriented dialogue systems, using the same dataset generation pipeline and end-to-end dialogue system architecture as BiToD [5], which adopted key design choices for a minimalistic natural language design where formal dialogue states are used in place of natural language inputs. This reduces the room for error introduced by weaker natural language models and ensures the model can correctly extract the essential slot values needed to perform dialogue state tracking (DST). Our goal is to reduce the amount of natural language encoded at each turn, and the key parameter we investigate is the number of turns (H) fed to the model as history. We first explore the turning point where increasing H begins to yield diminishing returns on overall performance. Then we examine whether the examples a model with small H gets wrong can be categorized in a way that allows the model to be fine-tuned on them in a few-shot setting. Lastly, we explore the limitations of this approach, and whether there is a certain type of example that it cannot resolve.
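The key knob described in the abstract is H, the number of most recent turns encoded as natural-language history, with earlier context carried by the formal dialogue state. The following is a minimal Python sketch of that idea; it is not the authors' BiToD implementation, and the function name, the <state>/<history> markers, and the slot names are illustrative assumptions.

# Minimal sketch (not the BiToD code): build a model input from the last H
# dialogue turns plus the formal dialogue state, as described in the abstract.
# All names and serialization markers here are illustrative assumptions.

from typing import Dict, List, Tuple

def build_model_input(
    turns: List[Tuple[str, str]],      # (speaker, utterance) pairs, oldest first
    dialogue_state: Dict[str, str],    # formal slot-value pairs tracked so far
    history_size: int,                 # H: how many most recent turns to keep
) -> str:
    """Serialize the last H turns and the dialogue state into one input string."""
    recent = turns[-history_size:] if history_size > 0 else []
    history_text = " ".join(f"<{speaker}> {utterance}" for speaker, utterance in recent)
    state_text = " ; ".join(f"{slot} = {value}" for slot, value in sorted(dialogue_state.items()))
    return f"DST: <state> {state_text} <history> {history_text}"

# Example: with H = 2, only the two most recent turns are encoded as natural
# language, while earlier context is carried by the formal dialogue state.
example = build_model_input(
    turns=[("user", "Find me a hotel in Hong Kong."),
           ("system", "What price range do you prefer?"),
           ("user", "Something cheap, please.")],
    dialogue_state={"hotel.location": "Hong Kong", "hotel.price_level": "cheap"},
    history_size=2,
)
print(example)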

Bibliographic details
Published in: arXiv.org, 2021-12
Main authors: Sun, Michael; Huang, Kaili; Moradshahi, Mehrad
Format: Article
Language: English
Subjects: Computer architecture; English language; Interactive computer systems; Languages; Multilingualism; Natural language; System effectiveness; Training; User satisfaction
Identifier: EISSN 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Online access: Full text