Classification as Decoder: Trading Flexibility for Control in Medical Dialogue

Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deeper understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control, a concerning tradeoff in doctor/patient interactions. Inaccuracies, typos, or undesirable content in the training data will be reproduced by the model at inference time. We trade a small amount of labeling effort and some loss of response variety in exchange for quality control. More specifically, a pretrained language model encodes the conversational context, and we finetune a classification head to map the encoded context to a response class, where each class is a noisily labeled group of interchangeable responses. Experts can update these exemplar responses over time as best practices change without retraining the classifier or invalidating old training data. Expert evaluation of 775 unseen doctor/patient conversations shows that only 12% of the discriminative model's responses are worse than what the doctor ended up writing, compared to 18% for the generative model.

Bibliographic Details
Main Authors: Shleifer, Sam; Chablani, Manish; Kannan, Anitha; Katariya, Namit; Amatriain, Xavier
Format: Article
Language: English
Published: 2019-11-15
DOI: 10.48550/arxiv.1911.08554
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning
Source: arXiv.org
Online Access: Order full text
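
The abstract describes a discriminative alternative to seq2seq generation: a pretrained language model encodes the conversational context, and a finetuned classification head selects among expert-curated response classes. The sketch below illustrates that idea, assuming a BERT-style encoder via the Hugging Face transformers library; the model name, the toy response classes, and the respond() helper are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of "classification as decoder": encode the conversation
# context with a pretrained LM, then map the encoding to one of a fixed
# set of curated response classes.
# Assumptions (not from the paper): bert-base-uncased as the encoder,
# three toy response classes, and the context passed as a single string.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Each class stands in for a group of interchangeable exemplar responses;
# experts can edit these texts later without retraining the classifier,
# since the classifier only predicts the class index.
RESPONSE_CLASSES = [
    "How long have you had these symptoms?",
    "Are you currently taking any medications?",
    "Please seek in-person care as soon as possible.",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(RESPONSE_CLASSES)
)

def respond(context: str) -> str:
    """Return the exemplar response of the highest-scoring class."""
    inputs = tokenizer(context, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return RESPONSE_CLASSES[logits.argmax(dim=-1).item()]

print(respond("Patient: I've had a headache since yesterday."))
```

Note that the classification head here is randomly initialized; in the setup the abstract describes, it would first be finetuned on (context, response-class) pairs derived from noisily labeled conversations, so the call above only demonstrates the interface.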