Assessing kinetic meaning of music and dance via deep cross-modal retrieval


Detailed description

Saved in:
Bibliographic details
Published in: Neural computing & applications, 2021-11, Vol. 33 (21), p. 14481-14493
Main authors: Raposo, Francisco Afonso; Martins de Matos, David; Ribeiro, Ricardo
Format: Article
Language: English
Subjects:
Online access: Full text
description Music semantics is embodied, in the sense that meaning is biologically mediated by and grounded in the human body and brain. This embodied cognition perspective also explains why music structures modulate kinetic and somatosensory perception. We explore this aspect of cognition by considering dance as an overt expression of semantic aspects of music related to motor intention, in an artificial deep recurrent neural network that learns correlations between music audio and dance video. We claim that, just as human semantic cognition is based on multimodal statistical structures, joint statistical modeling of music and dance artifacts is expected to capture the semantics of these modalities. We evaluate the ability of this model to capture the underlying semantics in a cross-modal retrieval task, including dance styles, in an unsupervised fashion. Quantitative results, validated with statistical significance testing, strengthen the body of evidence for embodied cognition in music and demonstrate that the model can recommend music audio for dance video queries and vice versa.
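The abstract describes ranking music audio against dance video via embeddings learned by a deep recurrent network. As a hedged illustration only (not the authors' model or data), the sketch below shows the retrieval step such a system could use once both modalities are mapped into a shared embedding space: score candidates by cosine similarity to the query and return the best matches. All names, dimensions, and data here are hypothetical.

```python
import numpy as np

def cosine_scores(query, candidates):
    # Cosine similarity between one query vector and each row of a matrix.
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return c @ q

def retrieve(query_embedding, candidate_embeddings, k=3):
    # Rank candidates by similarity to the query; return the top-k indices.
    scores = cosine_scores(query_embedding, candidate_embeddings)
    return np.argsort(scores)[::-1][:k]

# Toy shared embedding space: 4 dance-video embeddings and one audio query
# that nearly coincides with video 2, standing in for a true audio/video pair.
rng = np.random.default_rng(0)
videos = rng.normal(size=(4, 8))
audio_query = videos[2] + 0.01 * rng.normal(size=8)

top = retrieve(audio_query, videos, k=1)
print(top[0])  # index of the best-matching dance video
```

The same function works in the other direction (video query against audio candidates), which matches the "and vice versa" claim in the abstract; only the embedding model, not the ranking step, distinguishes the two directions.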
doi 10.1007/s00521-021-06090-8
publisher London: Springer London
rights The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021
orcid https://orcid.org/0000-0003-1044-5989
identifier ISSN: 0941-0643
eissn 1433-3058
source SpringerLink Journals - AutoHoldings
subjects Artificial Intelligence
Cognition
Cognition & reasoning
Computational Biology/Bioinformatics
Computational Science and Engineering
Computer Science
Dance
Data Mining and Knowledge Discovery
Image Processing and Computer Vision
Music
Original Article
Probability and Statistics in Computer Science
Recurrent neural networks
Retrieval
Semantics
Statistical models