A Deep Reinforcement Learning Chatbot (Short Version)

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensem...

Bibliographic details
Main authors:	Serban, Iulian V; Sankar, Chinnadhurai; Germain, Mathieu; Zhang, Saizheng; Lin, Zhouhan; Subramanian, Sandeep; Kim, Taesup; Pieper, Michael; Chandar, Sarath; Ke, Nan Rosemary; Rajeswar, Sai; de Brebisson, Alexandre; Sotelo, Jose M. R; Suhubdy, Dendi; Michalski, Vincent; Nguyen, Alexandre; Pineau, Joelle; Bengio, Yoshua
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning; Computer Science - Neural and Evolutionary Computing; Statistics - Machine Learning

creator	Serban, Iulian V; Sankar, Chinnadhurai; Germain, Mathieu; Zhang, Saizheng; Lin, Zhouhan; Subramanian, Sandeep; Kim, Taesup; Pieper, Michael; Chandar, Sarath; Ke, Nan Rosemary; Rajeswar, Sai; de Brebisson, Alexandre; Sotelo, Jose M. R; Suhubdy, Dendi; Michalski, Vincent; Nguyen, Alexandre; Pineau, Joelle; Bengio, Yoshua
description	We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.
doi_str_mv	10.48550/arxiv.1801.06700
format	Article
date	2018-01-20
rights	http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa	free_for_read
linktorsrc	https://arxiv.org/abs/1801.06700
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.1801.06700
language	eng
recordid	cdi_arxiv_primary_1801_06700
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning; Computer Science - Neural and Evolutionary Computing; Statistics - Machine Learning
title	A Deep Reinforcement Learning Chatbot (Short Version)