An active audition framework for auditory-driven HRI: Application to interactive robot dancing

In this paper we propose a general active audition framework for auditory-driven Human-Robot Interaction (HRI). The proposed framework simultaneously processes speech and music on-the-fly, integrates perceptual models for robot audition, and supports verbal and non-verbal interactive communication by means of (pro)active behaviors.

Detailed description

Saved in:
Bibliographic details
Main authors: Oliveira, J. L., Ince, G., Nakamura, K., Nakadai, K., Okuno, H. G., Reis, L. P., Gouyon, F.
Format: Conference proceeding
Language: eng
Subject terms: Acoustics; Noise; Reliability; Robot kinematics; Robot sensing systems; Speech
Online access: Order full text
container_start_page: 1078
container_end_page: 1085
creator: Oliveira, J. L.; Ince, G.; Nakamura, K.; Nakadai, K.; Okuno, H. G.; Reis, L. P.; Gouyon, F.
description: In this paper we propose a general active audition framework for auditory-driven Human-Robot Interaction (HRI). The proposed framework simultaneously processes speech and music on-the-fly, integrates perceptual models for robot audition, and supports verbal and non-verbal interactive communication by means of (pro)active behaviors. To ensure a reliable interaction, on top of the framework a behavior decision mechanism based on active audition polices the robot's actions according to the reliability of the acoustic signals for auditory processing. To validate the framework's application to general auditory-driven HRI, we propose the implementation of an interactive robot dancing system. This system integrates three preprocessing robot audition modules: sound source localization, sound source separation, and ego noise suppression; two modules for auditory perception: live audio beat tracking and automatic speech recognition; and multi-modal behaviors for verbal and non-verbal interaction: music-driven dancing and speech-driven dialoguing. To fully assess the system, we set up experimental and interactive real-world scenarios with highly dynamic acoustic conditions, and defined a set of evaluation criteria. The experimental tests revealed accurate and robust beat tracking and speech recognition, and convincing dance beat-synchrony. The interactive sessions confirmed the fundamental role of the behavior decision mechanism for actively maintaining a robust and natural human-robot interaction.
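The abstract's central idea is a decision mechanism that polices the robot's actions according to how reliable each acoustic stream currently is: dance when the music stream is trustworthy, converse when speech is, and act to improve perception when neither is. A minimal Python sketch of that idea follows; the names (AuditoryState, decide_behavior, Behavior) and the threshold policy are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the reliability scores are assumed to come from the
# preprocessing modules (localization, separation, ego-noise suppression).
from dataclasses import dataclass
from enum import Enum, auto


class Behavior(Enum):
    DANCE = auto()        # non-verbal, music-driven behavior
    DIALOGUE = auto()     # verbal, speech-driven behavior
    REPOSITION = auto()   # proactive action to improve signal quality


@dataclass
class AuditoryState:
    music_reliability: float   # confidence in the separated music stream, in [0, 1]
    speech_reliability: float  # confidence in the separated speech stream, in [0, 1]


def decide_behavior(state: AuditoryState, threshold: float = 0.5) -> Behavior:
    """Select the next behavior from the reliability of each acoustic stream.

    The threshold value and the tie-breaking rule are assumptions made for
    this sketch, not the policy described in the paper.
    """
    if state.speech_reliability >= threshold and state.speech_reliability >= state.music_reliability:
        return Behavior.DIALOGUE
    if state.music_reliability >= threshold:
        return Behavior.DANCE
    # Neither stream is trustworthy: act to improve perception (active audition).
    return Behavior.REPOSITION


if __name__ == "__main__":
    # Example: clear music, noisy speech -> the robot keeps dancing to the beat.
    print(decide_behavior(AuditoryState(music_reliability=0.8, speech_reliability=0.3)))
```

Gating behaviors this way is what the interactive sessions evaluate: the robot only commits to a music- or speech-driven behavior when the corresponding signal is reliable enough to support it.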
doi: 10.1109/ROMAN.2012.6343892
format: Conference Proceeding
identifier: ISSN 1944-9445
ispartof: 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication, 2012, p.1078-1085
issn: 1944-9445; 1944-9437
language: eng
recordid: cdi_ieee_primary_6343892
source: IEEE Electronic Library (IEL) Conference Proceedings
subjects: Acoustics; Noise; Reliability; Robot kinematics; Robot sensing systems; Speech
title: An active audition framework for auditory-driven HRI: Application to interactive robot dancing
url: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T07%3A34%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=An%20active%20audition%20framework%20for%20auditory-driven%20HRI:%20Application%20to%20interactive%20robot%20dancing&rft.btitle=2012%20IEEE%20RO-MAN:%20The%2021st%20IEEE%20International%20Symposium%20on%20Robot%20and%20Human%20Interactive%20Communication&rft.au=Oliveira,%20J.%20L.&rft.date=2012-09&rft.spage=1078&rft.epage=1085&rft.pages=1078-1085&rft.issn=1944-9445&rft.eissn=1944-9437&rft.isbn=9781467346047&rft.isbn_list=1467346047&rft_id=info:doi/10.1109/ROMAN.2012.6343892&rft_dat=%3Cieee_6IE%3E6343892%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467346061&rft.eisbn_list=1467346063&rft.eisbn_list=9781467346054&rft.eisbn_list=1467346055&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6343892&rfr_iscdi=true