Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

Comprehension of spoken natural language is an essential component for robots to communicate with humans effectively. However, handling unconstrained spoken instructions is challenging due to (1) complex structures, including the wide variety of expressions used in spoken language, and (2) the inherent ambiguity in interpreting human instructions. In this paper, we propose the first comprehensive system that can handle unconstrained spoken language and is able to effectively resolve ambiguity in spoken instructions. Specifically, we integrate deep-learning-based object detection with natural language processing technologies to handle unconstrained spoken instructions, and propose a method for robots to resolve instruction ambiguity through dialogue. Through experiments in both a simulated environment and on a physical industrial robot arm, we demonstrate that our system understands natural instructions from human operators effectively, and that higher success rates on the object-picking task can be achieved through an interactive clarification process.
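The interactive clarification process the abstract describes can be illustrated with a minimal sketch: an object detector proposes candidate objects, each candidate is scored against the spoken instruction, and the robot asks a clarifying question whenever the top candidates are too close to call. All names, the toy word-overlap scorer, and the margin threshold below are illustrative assumptions, not the authors' implementation.

```python
def score(instruction: str, candidate: dict) -> int:
    """Toy relevance score: count instruction words that match the
    candidate's name or attributes (stand-in for a learned model)."""
    words = set(instruction.lower().split())
    attrs = {candidate["name"], *candidate["attributes"]}
    return len(words & {a.lower() for a in attrs})

def pick_or_clarify(instruction: str, candidates: list, margin: int = 1):
    """Return ("pick", best) if one candidate clearly wins the ranking,
    else ("clarify", top_two) to trigger a clarification question."""
    ranked = sorted(candidates, key=lambda c: score(instruction, c), reverse=True)
    if len(ranked) == 1 or (
        score(instruction, ranked[0]) - score(instruction, ranked[1]) >= margin
    ):
        return ("pick", ranked[0])
    return ("clarify", ranked[:2])

# Hypothetical detections from the object detector.
objects = [
    {"name": "cup", "attributes": ["red"]},
    {"name": "cup", "attributes": ["blue"]},
    {"name": "box", "attributes": ["brown"]},
]

print(pick_or_clarify("grab the red cup", objects))  # unambiguous: pick the red cup
print(pick_or_clarify("grab the cup", objects))      # ambiguous: ask which cup
```

The key design point mirrored here is that ambiguity is detected from the score margin between candidates rather than from the instruction alone, so the dialogue step is only invoked when the grounding is genuinely uncertain.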

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Hatori, Jun, Kikuchi, Yuta, Kobayashi, Sosuke, Takahashi, Kuniyuki, Tsuboi, Yuta, Unno, Yuya, Ko, Wilson, Tan, Jethro
Format: Article
Language: eng
Online Access: Order full text
DOI: 10.48550/arxiv.1710.06280
Source: arXiv.org
Subjects: Computer Science - Computation and Language; Computer Science - Robotics