Employing distance-based semantics to interpret spoken referring expressions

• An interpretation process that considers multiple alternatives.
• A mechanism for combining uncertainty from a variety of sources.
• Distance functions with probabilistic semantics that represent similarity measures.
• Two evaluation experiments to assess our system's performance.

In this paper, we...

Detailed description

Bibliographic details
Published in: Computer speech & language 2015-11, Vol.34 (1), p.154-185
Main authors: Zukerman, Ingrid; Kim, Su Nam; Kleinbauer, Thomas; Moshtaghi, Masud
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 185
container_issue 1
container_start_page 154
container_title Computer speech & language
container_volume 34
creator Zukerman, Ingrid
Kim, Su Nam
Kleinbauer, Thomas
Moshtaghi, Masud
description • An interpretation process that considers multiple alternatives.
• A mechanism for combining uncertainty from a variety of sources.
• Distance functions with probabilistic semantics that represent similarity measures.
• Two evaluation experiments to assess our system's performance.
In this paper, we present Scusi?, an anytime numerical mechanism for the interpretation of spoken referring expressions. Our contributions are: (1) an anytime interpretation process that considers multiple alternatives at different interpretation stages (speech, syntax, semantics and pragmatics), which enables Scusi? to defer decisions to the end of the interpretation process; (2) a mechanism that combines scores associated with the output of the different interpretation stages, taking into account the uncertainty arising from a variety of sources, such as ambiguity or inaccuracy in a description, speech recognition errors and out-of-vocabulary terms; and (3) distance-based functions with probabilistic semantics that represent lexical similarity between objects’ names and similarity between stated requirements and physical properties of objects (viz colour, size and positional relations). We considered two approaches for combining these descriptive attributes, viz multiplicative and additive, and determined whether prioritizing certain interpretation stages and descriptive attributes affects interpretation performance. We conducted two experiments to evaluate different aspects of Scusi?'s performance: Interpretive, where we compared Scusi?'s understanding of descriptions that are mainly ambiguous or inaccurate with people's understanding of these descriptions, and Generative, where we assessed Scusi?'s understanding of naturally occurring spoken descriptions.
Our results show that Scusi?'s understanding of the descriptions in the Interpretive trial is comparable to that of people; and that its performance is encouraging when given arbitrary spoken descriptions in diverse scenarios, and excellent for the corresponding written descriptions. In both experiments, Scusi? significantly outperformed a baseline system that maintains only top same-score interpretations.
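The description above mentions distance-based functions with probabilistic semantics and two ways of combining descriptive attributes, multiplicative and additive. The record does not give the formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' method: the exponential distance-to-score mapping, the function names, the weights, and the example objects and distances are all hypothetical.

```python
import math

def distance_to_score(distance, scale=1.0):
    """Map a non-negative distance to a score in (0, 1].

    Assumption: exponential decay; the paper's actual functions are not
    given in this record.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-distance / scale)

def combine_multiplicative(scores, weights=None):
    """Weighted product of attribute scores: prod(s_i ** w_i)."""
    weights = weights or [1.0] * len(scores)
    return math.prod(s ** w for s, w in zip(scores, weights))

def combine_additive(scores, weights=None):
    """Weighted average of attribute scores: sum(w_i * s_i) / sum(w_i)."""
    weights = weights or [1.0] * len(scores)
    return sum(w * s for s, w in zip(scores, weights)) / sum(weights)

def rank(candidates, combine):
    """Rank candidate objects, best first, keeping every alternative."""
    scored = {
        name: combine([distance_to_score(d) for d in distances.values()])
        for name, distances in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical distances between a spoken description and two objects,
# one distance per descriptive attribute (lexical, colour, size).
candidates = {
    "mug_1": {"lexical": 0.0, "colour": 0.2, "size": 1.5},
    "mug_2": {"lexical": 0.0, "colour": 0.9, "size": 0.1},
}

if __name__ == "__main__":
    print("multiplicative:", rank(candidates, combine_multiplicative))
    print("additive:      ", rank(candidates, combine_additive))
```

In this sketch a single poorly matching attribute pulls a multiplicative score down sharply, while the additive form lets strong attributes compensate for a weak one. Passing non-uniform `weights` is one hypothetical way to prioritize certain attributes.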
doi_str_mv 10.1016/j.csl.2015.01.002
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1709740916</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0885230815000030</els_id><sourcerecordid>1709740916</sourcerecordid><originalsourceid>FETCH-LOGICAL-c330t-c748726a3ad5f4e22f3a291e46299ed9cfe79ecca03fe2f0cca50ae0d438ae5f3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKs_wNsevew6SfYreJJSP6DgRc8hZieSul9mUrH_3pR69jTD8D7DzMPYNYeCA69vt4WlvhDAqwJ4ASBO2IKDqvJW1vKULaBtq1xIaM_ZBdEWAOqqbBZssx7mftr78SPrPEUzWszfDWGXEQ5mjN5SFqfMjxHDHDBmNE-fOGYBHYZwwPAnzYn8NNIlO3OmJ7z6q0v29rB-XT3lm5fH59X9JrdSQsxtU7aNqI00XeVKFMJJIxTHshZKYaesw0ahtQakQ-EgdRUYhK6UrcHKySW7Oe6dw_S1Q4p68GSx782I0440b0A1JShepyg_Rm2YiNLVeg5-MGGvOeiDOb3VyZw-mNPAdTKXmLsjg-mHb49Bk_WYzHQ-oI26m_w_9C-hP3im</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1709740916</pqid></control><display><type>article</type><title>Employing distance-based semantics to interpret spoken referring expressions</title><source>Access via ScienceDirect (Elsevier)</source><creator>Zukerman, Ingrid ; Kim, Su Nam ; Kleinbauer, Thomas ; Moshtaghi, Masud</creator><creatorcontrib>Zukerman, Ingrid ; Kim, Su Nam ; Kleinbauer, Thomas ; Moshtaghi, Masud</creatorcontrib><description>•An interpretation process that considers multiple alternatives.•A mechanism for combining uncertainty from a variety of sources.•Distance functions with probabilistic semantics that represent similarity measures.•Two evaluation experiments to assess our system's performance. In this paper, we present Scusi?, an anytime numerical mechanism for the interpretation of spoken referring expressions. Our contributions are: (1) an anytime interpretation process that considers multiple alternatives at different interpretation stages (speech, syntax, semantics and pragmatics), which enables Scusi? 
to defer decisions to the end of the interpretation process; (2) a mechanism that combines scores associated with the output of the different interpretation stages, taking into account the uncertainty arising from a variety of sources, such as ambiguity or inaccuracy in a description, speech recognition errors and out-of-vocabulary terms; and (3) distance-based functions with probabilistic semantics that represent lexical similarity between objects’ names and similarity between stated requirements and physical properties of objects (viz colour, size and positional relations). We considered two approaches for combining these descriptive attributes, viz multiplicative and additive, and determined whether prioritizing certain interpretation stages and descriptive attributes affects interpretation performance. We conducted two experiments to evaluate different aspects of Scusi?'s performance: Interpretive, where we compared Scusi?'s understanding of descriptions that are mainly ambiguous or inaccurate with people's understanding of these descriptions, and Generative, where we assessed Scusi?'s understanding of naturally occurring spoken descriptions. Our results show that Scusi?'s understanding of the descriptions in the Interpretive trial is comparable to that of people; and that its performance is encouraging when given arbitrary spoken descriptions in diverse scenarios, and excellent for the corresponding written descriptions. In both experiments, Scusi? 
significantly outperformed a baseline system that maintains only top same-score interpretations.</description><identifier>ISSN: 0885-2308</identifier><identifier>EISSN: 1095-8363</identifier><identifier>DOI: 10.1016/j.csl.2015.01.002</identifier><language>eng</language><publisher>Elsevier Ltd</publisher><subject>Ambiguity ; Computer simulation ; Decisions ; Distance-based semantics ; Mathematical analysis ; Mathematical models ; Numerical approach ; Performance evaluation ; Semantic interpretation ; Semantics ; Similarity ; Speech ; Spoken language understanding</subject><ispartof>Computer speech &amp; language, 2015-11, Vol.34 (1), p.154-185</ispartof><rights>2015 Elsevier Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c330t-c748726a3ad5f4e22f3a291e46299ed9cfe79ecca03fe2f0cca50ae0d438ae5f3</citedby><cites>FETCH-LOGICAL-c330t-c748726a3ad5f4e22f3a291e46299ed9cfe79ecca03fe2f0cca50ae0d438ae5f3</cites><orcidid>0000-0003-2237-5017</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.csl.2015.01.002$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Zukerman, Ingrid</creatorcontrib><creatorcontrib>Kim, Su Nam</creatorcontrib><creatorcontrib>Kleinbauer, Thomas</creatorcontrib><creatorcontrib>Moshtaghi, Masud</creatorcontrib><title>Employing distance-based semantics to interpret spoken referring expressions</title><title>Computer speech &amp; language</title><description>•An interpretation process that considers multiple alternatives.•A mechanism for combining uncertainty from a variety of sources.•Distance functions with probabilistic semantics that represent similarity measures.•Two evaluation experiments to assess our system's performance. 
In this paper, we present Scusi?, an anytime numerical mechanism for the interpretation of spoken referring expressions. Our contributions are: (1) an anytime interpretation process that considers multiple alternatives at different interpretation stages (speech, syntax, semantics and pragmatics), which enables Scusi? to defer decisions to the end of the interpretation process; (2) a mechanism that combines scores associated with the output of the different interpretation stages, taking into account the uncertainty arising from a variety of sources, such as ambiguity or inaccuracy in a description, speech recognition errors and out-of-vocabulary terms; and (3) distance-based functions with probabilistic semantics that represent lexical similarity between objects’ names and similarity between stated requirements and physical properties of objects (viz colour, size and positional relations). We considered two approaches for combining these descriptive attributes, viz multiplicative and additive, and determined whether prioritizing certain interpretation stages and descriptive attributes affects interpretation performance. We conducted two experiments to evaluate different aspects of Scusi?'s performance: Interpretive, where we compared Scusi?'s understanding of descriptions that are mainly ambiguous or inaccurate with people's understanding of these descriptions, and Generative, where we assessed Scusi?'s understanding of naturally occurring spoken descriptions. Our results show that Scusi?'s understanding of the descriptions in the Interpretive trial is comparable to that of people; and that its performance is encouraging when given arbitrary spoken descriptions in diverse scenarios, and excellent for the corresponding written descriptions. In both experiments, Scusi? 
significantly outperformed a baseline system that maintains only top same-score interpretations.</description><subject>Ambiguity</subject><subject>Computer simulation</subject><subject>Decisions</subject><subject>Distance-based semantics</subject><subject>Mathematical analysis</subject><subject>Mathematical models</subject><subject>Numerical approach</subject><subject>Performance evaluation</subject><subject>Semantic interpretation</subject><subject>Semantics</subject><subject>Similarity</subject><subject>Speech</subject><subject>Spoken language understanding</subject><issn>0885-2308</issn><issn>1095-8363</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKs_wNsevew6SfYreJJSP6DgRc8hZieSul9mUrH_3pR69jTD8D7DzMPYNYeCA69vt4WlvhDAqwJ4ASBO2IKDqvJW1vKULaBtq1xIaM_ZBdEWAOqqbBZssx7mftr78SPrPEUzWszfDWGXEQ5mjN5SFqfMjxHDHDBmNE-fOGYBHYZwwPAnzYn8NNIlO3OmJ7z6q0v29rB-XT3lm5fH59X9JrdSQsxtU7aNqI00XeVKFMJJIxTHshZKYaesw0ahtQakQ-EgdRUYhK6UrcHKySW7Oe6dw_S1Q4p68GSx782I0440b0A1JShepyg_Rm2YiNLVeg5-MGGvOeiDOb3VyZw-mNPAdTKXmLsjg-mHb49Bk_WYzHQ-oI26m_w_9C-hP3im</recordid><startdate>20151101</startdate><enddate>20151101</enddate><creator>Zukerman, Ingrid</creator><creator>Kim, Su Nam</creator><creator>Kleinbauer, Thomas</creator><creator>Moshtaghi, Masud</creator><general>Elsevier Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-2237-5017</orcidid></search><sort><creationdate>20151101</creationdate><title>Employing distance-based semantics to interpret spoken referring expressions</title><author>Zukerman, Ingrid ; Kim, Su Nam ; Kleinbauer, Thomas ; Moshtaghi, 
Masud</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c330t-c748726a3ad5f4e22f3a291e46299ed9cfe79ecca03fe2f0cca50ae0d438ae5f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Ambiguity</topic><topic>Computer simulation</topic><topic>Decisions</topic><topic>Distance-based semantics</topic><topic>Mathematical analysis</topic><topic>Mathematical models</topic><topic>Numerical approach</topic><topic>Performance evaluation</topic><topic>Semantic interpretation</topic><topic>Semantics</topic><topic>Similarity</topic><topic>Speech</topic><topic>Spoken language understanding</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zukerman, Ingrid</creatorcontrib><creatorcontrib>Kim, Su Nam</creatorcontrib><creatorcontrib>Kleinbauer, Thomas</creatorcontrib><creatorcontrib>Moshtaghi, Masud</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computer speech &amp; language</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zukerman, Ingrid</au><au>Kim, Su Nam</au><au>Kleinbauer, Thomas</au><au>Moshtaghi, Masud</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Employing distance-based semantics to interpret spoken referring expressions</atitle><jtitle>Computer speech &amp; 
language</jtitle><date>2015-11-01</date><risdate>2015</risdate><volume>34</volume><issue>1</issue><spage>154</spage><epage>185</epage><pages>154-185</pages><issn>0885-2308</issn><eissn>1095-8363</eissn><abstract>•An interpretation process that considers multiple alternatives.•A mechanism for combining uncertainty from a variety of sources.•Distance functions with probabilistic semantics that represent similarity measures.•Two evaluation experiments to assess our system's performance. In this paper, we present Scusi?, an anytime numerical mechanism for the interpretation of spoken referring expressions. Our contributions are: (1) an anytime interpretation process that considers multiple alternatives at different interpretation stages (speech, syntax, semantics and pragmatics), which enables Scusi? to defer decisions to the end of the interpretation process; (2) a mechanism that combines scores associated with the output of the different interpretation stages, taking into account the uncertainty arising from a variety of sources, such as ambiguity or inaccuracy in a description, speech recognition errors and out-of-vocabulary terms; and (3) distance-based functions with probabilistic semantics that represent lexical similarity between objects’ names and similarity between stated requirements and physical properties of objects (viz colour, size and positional relations). We considered two approaches for combining these descriptive attributes, viz multiplicative and additive, and determined whether prioritizing certain interpretation stages and descriptive attributes affects interpretation performance. We conducted two experiments to evaluate different aspects of Scusi?'s performance: Interpretive, where we compared Scusi?'s understanding of descriptions that are mainly ambiguous or inaccurate with people's understanding of these descriptions, and Generative, where we assessed Scusi?'s understanding of naturally occurring spoken descriptions. 
Our results show that Scusi?'s understanding of the descriptions in the Interpretive trial is comparable to that of people; and that its performance is encouraging when given arbitrary spoken descriptions in diverse scenarios, and excellent for the corresponding written descriptions. In both experiments, Scusi? significantly outperformed a baseline system that maintains only top same-score interpretations.</abstract><pub>Elsevier Ltd</pub><doi>10.1016/j.csl.2015.01.002</doi><tpages>32</tpages><orcidid>https://orcid.org/0000-0003-2237-5017</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0885-2308
ispartof Computer speech & language, 2015-11, Vol.34 (1), p.154-185
issn 0885-2308
1095-8363
language eng
recordid cdi_proquest_miscellaneous_1709740916
source Access via ScienceDirect (Elsevier)
subjects Ambiguity
Computer simulation
Decisions
Distance-based semantics
Mathematical analysis
Mathematical models
Numerical approach
Performance evaluation
Semantic interpretation
Semantics
Similarity
Speech
Spoken language understanding
title Employing distance-based semantics to interpret spoken referring expressions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T16%3A41%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Employing%20distance-based%20semantics%20to%20interpret%20spoken%20referring%20expressions&rft.jtitle=Computer%20speech%20&%20language&rft.au=Zukerman,%20Ingrid&rft.date=2015-11-01&rft.volume=34&rft.issue=1&rft.spage=154&rft.epage=185&rft.pages=154-185&rft.issn=0885-2308&rft.eissn=1095-8363&rft_id=info:doi/10.1016/j.csl.2015.01.002&rft_dat=%3Cproquest_cross%3E1709740916%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1709740916&rft_id=info:pmid/&rft_els_id=S0885230815000030&rfr_iscdi=true