Evaluating spoken dialogue agents with PARADISE: Two case studies
This paper presents PARADISE (PARAdigm for DIalogue System Evaluation), a general framework for evaluating and comparing the performance of spoken dialogue agents. The framework decouples task requirements from an agent's dialogue behaviours, supports comparisons among dialogue strategies, enables the calculation of performance over subdialogues and whole dialogues, specifies the relative contribution of various factors to performance, and makes it possible to compare agents performing different tasks by normalizing for task complexity. After presenting PARADISE, we illustrate its application to two different spoken dialogue agents. We show how to derive a performance function for each agent and how to generalize results across agents. We then show that once such a performance function has been derived, it can be used both for making predictions about future versions of an agent, and as feedback to the agent so that the agent can learn to optimize its behaviour based on its experiences with users over time.
Saved in:
Published in: | Computer speech & language 1998-10, Vol.12 (4), p.317-347 |
---|---|
Main authors: | Walker, MA; Litman, DJ; Kamm, CA; Abella, A |
Format: | Article |
Language: | eng |
Subjects: | Applied linguistics; Computational linguistics; Linguistics |
Online access: | Full text |
container_end_page | 347 |
---|---|
container_issue | 4 |
container_start_page | 317 |
container_title | Computer speech & language |
container_volume | 12 |
creator | Walker, MA; Litman, DJ; Kamm, CA; Abella, A |
description | This paper presents PARADISE (PARAdigm for DIalogue System Evaluation), a general framework for evaluating and comparing the performance of spoken dialogue agents. The framework decouples task requirements from an agent's dialogue behaviours, supports comparisons among dialogue strategies, enables the calculation of performance over subdialogues and whole dialogues, specifies the relative contribution of various factors to performance, and makes it possible to compare agents performing different tasks by normalizing for task complexity. After presenting PARADISE, we illustrate its application to two different spoken dialogue agents. We show how to derive a performance function for each agent and how to generalize results across agents. We then show that once such a performance function has been derived, it can be used both for making predictions about future versions of an agent, and as feedback to the agent so that the agent can learn to optimize its behaviour based on its experiences with users over time. |
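The description above mentions deriving a performance function that normalizes for task complexity. In PARADISE, performance is modelled as a weighted combination of a task-success measure (kappa agreement) and normalized cost measures, with the weights obtained by regressing user-satisfaction ratings on those factors. Below is a minimal, hypothetical sketch of that scoring step: the measurements and weights are invented for illustration, not taken from the paper's data, and only the normalize-then-combine shape of the computation reflects the framework.

```python
from statistics import mean, stdev

def znorm(values):
    # PARADISE normalizes each factor to zero mean and unit variance
    # so that task success and costs are on a comparable scale.
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical per-dialogue measurements (invented for illustration):
kappa = [0.9, 0.6, 0.8, 0.4]      # task success (kappa agreement)
turns = [12, 30, 18, 40]          # an efficiency cost: number of turns
reprompts = [0, 4, 1, 6]          # a quality cost: repair utterances

nk, nt, nr = znorm(kappa), znorm(turns), znorm(reprompts)

# Illustrative weights; in PARADISE they would come from regressing
# user-satisfaction ratings on the normalized factors.
alpha, w_turns, w_rep = 0.5, 0.3, 0.2

# Performance rewards task success and penalizes each cost.
performance = [alpha * k - w_turns * t - w_rep * r
               for k, r_t, r in zip(nk, nt, nr)
               for t in [r_t]]
```

Under this sketch, the short, successful first dialogue scores highest and the long, failure-prone last dialogue scores lowest, which is the ranking behaviour the framework is designed to capture.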
doi_str_mv | 10.1006/csla.1998.0110 |
format | Article |
publisher | Elsevier Ltd (Oxford) |
rights | 1998 Academic Press; 1999 INIST-CNRS |
eissn | 1095-8363 |
tpages | 31 |
fulltext | fulltext |
identifier | ISSN: 0885-2308 |
ispartof | Computer speech & language, 1998-10, Vol.12 (4), p.317-347 |
issn | 0885-2308 1095-8363 |
language | eng |
recordid | cdi_proquest_miscellaneous_26853898 |
source | Elsevier ScienceDirect Journals; Periodicals Index Online |
subjects | Applied linguistics; Computational linguistics; Linguistics |
title | Evaluating spoken dialogue agents with PARADISE: Two case studies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T21%3A50%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Evaluating%20spoken%20dialogue%20agents%20with%20PARADISE:%20Two%20case%20studies&rft.jtitle=Computer%20speech%20&%20language&rft.au=Walker,%20MA&rft.date=1998-10-01&rft.volume=12&rft.issue=4&rft.spage=317&rft.epage=347&rft.pages=317-347&rft.issn=0885-2308&rft.eissn=1095-8363&rft_id=info:doi/10.1006/csla.1998.0110&rft_dat=%3Cproquest_cross%3E26853898%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1839894481&rft_id=info:pmid/&rft_els_id=S0885230898901103&rfr_iscdi=true |