Comparing alternative modalities in the context of multimodal human–robot interaction

With the advancement of interactive technology, alternative input modalities are often used instead of conventional ones to create intuitive, efficient, and user-friendly ways of controlling and collaborating with robots. Researchers have examined the efficacy of natural interaction modalities such as gesture or voice in single-task and dual-task scenarios. These investigations have aimed to assess the potential of these modalities in diverse applications, encompassing activities like online shopping, precision agriculture, and mechanical component assembly, which involve tasks like object pointing and selection. This article addresses the impact on user performance in a practical human–robot interaction application where a fixed-base robot is controlled through natural alternative modalities. We explored this by investigating the impact of single-task and dual-task conditions on user performance for object picking and dropping. We undertook two user studies: one focusing on single-task scenarios, employing a fixed-base robot for object picking and dropping, and the other covering dual-task conditions, utilizing a mobile robot in a driving scenario. We measured task completion times and estimated cognitive workload through the NASA Task Load Index (TLX), a subjective, multidimensional scale measuring the perceived cognitive workload of a user. The studies revealed that the ranking of completion times for the alternative modalities remained consistent across both single-task and dual-task scenarios. However, the ranking based on perceived cognitive load was different: in the single-task study, the gesture-based modality yielded the highest TLX score, whereas in the dual-task study the highest TLX score was associated with the eye gaze-based modality. Likewise, the speech-based modality achieved a lower TLX score than eye gaze and gesture in the single-task study, but its TLX score in the dual-task study fell between those of gesture and eye gaze. These outcomes suggest that the efficacy of alternative modalities is contingent not only on user preferences but also on the specific situational context.

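The abstract reports perceived workload via the NASA Task Load Index (TLX). As a minimal illustrative sketch (not code from the article: the subscale names and the 15-pairwise-comparison weighting follow the standard NASA-TLX procedure, and the example ratings are invented), an overall TLX score can be computed like this. The raw variant is simply the mean of the six subscale ratings; the weighted variant divides by 15 because the pairwise weights always sum to 15.

    from statistics import mean

    # The six NASA-TLX subscales, each rated on a 0-100 scale.
    SUBSCALES = ("mental_demand", "physical_demand", "temporal_demand",
                 "performance", "effort", "frustration")

    def raw_tlx(ratings: dict[str, float]) -> float:
        """Raw TLX: unweighted mean of the six subscale ratings."""
        return mean(ratings[s] for s in SUBSCALES)

    def weighted_tlx(ratings: dict[str, float], weights: dict[str, int]) -> float:
        """Weighted TLX: each weight is how often the subscale was chosen
        in the 15 pairwise comparisons, so the weights sum to 15."""
        assert sum(weights.values()) == 15, "weights must come from 15 pairwise choices"
        return sum(weights[s] * ratings[s] for s in SUBSCALES) / 15

    # Invented example ratings and weights for a single participant.
    ratings = {"mental_demand": 70, "physical_demand": 55, "temporal_demand": 40,
               "performance": 30, "effort": 65, "frustration": 45}
    weights = {"mental_demand": 4, "physical_demand": 3, "temporal_demand": 2,
               "performance": 2, "effort": 3, "frustration": 1}

    print(f"raw TLX: {raw_tlx(ratings):.1f}")                     # 50.8
    print(f"weighted TLX: {weighted_tlx(ratings, weights):.1f}")  # 55.0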

Bibliographic details
Published in: Journal on multimodal user interfaces, 2024-03, Vol. 18 (1), pp. 69-85
Main authors: Saren, Suprakas; Mukhopadhyay, Abhishek; Ghose, Debasish; Biswas, Pradipta
Format: Article
Language: English
Publisher: Springer International Publishing, Cham
DOI: 10.1007/s12193-023-00421-w
ISSN: 1783-7677
EISSN: 1783-8738
Source: SpringerLink Journals - AutoHoldings
Subjects: Computer Science; Context; Effectiveness; Eye movements; Image Processing and Computer Vision; Mechanical components; Original Paper; Picking; Ranking; Robots; Signal, Image and Speech Processing; Taskload; User Interfaces and Human Computer Interaction; Workload; Workloads
Online access: Full text