Classification as Decoder: Trading Flexibility for Control in Medical Dialogue

Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deeper understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control, a concerning tradeoff in doctor/patient interactions. Inaccuracies, typos, or undesirable content in the training data will be reproduced by the model at inference time. We trade a small amount of labeling effort and some loss of response variety in exchange for quality control. More specifically, a pretrained language model encodes the conversational context, and we finetune a classification head to map the encoded context to a response class, where each class is a noisily labeled group of interchangeable responses. Experts can update these exemplar responses over time as best practices change without retraining the classifier or invalidating old training data. Expert evaluation of 775 unseen doctor/patient conversations shows that only 12% of the discriminative model's responses are worse than what the doctor ended up writing, compared to 18% for the generative model.

Bibliographic Details
Main Authors: Shleifer, Sam; Chablani, Manish; Kannan, Anitha; Katariya, Namit; Amatriain, Xavier
Format: Article
Language: English
Published: 2019-11-15
DOI: 10.48550/arxiv.1911.08554
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning
Source: arXiv.org
Online Access: Order full text
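
The abstract describes a discriminative alternative to seq2seq generation: a pretrained language model encodes the conversational context, and a finetuned classification head selects among expert-curated response classes. The sketch below illustrates that idea, assuming a BERT-style encoder via the Hugging Face transformers library; the model name, the toy response classes, and the respond() helper are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of "classification as decoder": encode the conversation
# context with a pretrained LM, then map the encoding to one of a fixed
# set of curated response classes.
# Assumptions (not from the paper): bert-base-uncased as the encoder,
# three toy response classes, and the context passed as a single string.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Each class stands in for a group of interchangeable exemplar responses;
# experts can edit these texts later without retraining the classifier,
# since the classifier only predicts the class index.
RESPONSE_CLASSES = [
    "How long have you had these symptoms?",
    "Are you currently taking any medications?",
    "Please seek in-person care as soon as possible.",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(RESPONSE_CLASSES)
)

def respond(context: str) -> str:
    """Return the exemplar response of the highest-scoring class."""
    inputs = tokenizer(context, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return RESPONSE_CLASSES[logits.argmax(dim=-1).item()]

print(respond("Patient: I've had a headache since yesterday."))
```

Note that the classification head here is randomly initialized; in the setup the abstract describes, it would first be finetuned on (context, response-class) pairs derived from noisily labeled conversations, so the call above only demonstrates the interface.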