Assessing kinetic meaning of music and dance via deep cross-modal retrieval
Music semantics is embodied, in the sense that meaning is biologically mediated by and grounded in the human body and brain. This embodied cognition perspective also explains why music structures modulate kinetic and somatosensory perception. We explore this aspect of cognition, by considering dance as an overt expression of semantic aspects of music related to motor intention, in an artificial deep recurrent neural network that learns correlations between music audio and dance video. We claim that, just like human semantic cognition is based on multimodal statistical structures, joint statistical modeling of music and dance artifacts is expected to capture semantics of these modalities. We evaluate the ability of this model to effectively capture underlying semantics in a cross-modal retrieval task, including dance styles in an unsupervised fashion. Quantitative results, validated with statistical significance testing, strengthen the body of evidence for embodied cognition in music and demonstrate the model can recommend music audio for dance video queries and vice versa.
Published in: | Neural computing & applications, 2021-11, Vol. 33 (21), p. 14481-14493 |
---|---|
Main authors: | Raposo, Francisco Afonso; Martins de Matos, David; Ribeiro, Ricardo |
Format: | Article |
Language: | English |
Subjects: | Artificial Intelligence; Cognition; Cognition & reasoning; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Dance; Data Mining and Knowledge Discovery; Image Processing and Computer Vision; Music; Original Article; Probability and Statistics in Computer Science; Recurrent neural networks; Retrieval; Semantics; Statistical models |
Online access: | Full text |
container_end_page | 14493 |
---|---|
container_issue | 21 |
container_start_page | 14481 |
container_title | Neural computing & applications |
container_volume | 33 |
creator | Raposo, Francisco Afonso; Martins de Matos, David; Ribeiro, Ricardo |
description | Music semantics is embodied, in the sense that meaning is biologically mediated by and grounded in the human body and brain. This embodied cognition perspective also explains why music structures modulate kinetic and somatosensory perception. We explore this aspect of cognition, by considering dance as an overt expression of semantic aspects of music related to motor intention, in an artificial deep recurrent neural network that learns correlations between music audio and dance video. We claim that, just like human semantic cognition is based on multimodal statistical structures, joint statistical modeling of music and dance artifacts is expected to capture semantics of these modalities. We evaluate the ability of this model to effectively capture underlying semantics in a cross-modal retrieval task, including dance styles in an unsupervised fashion. Quantitative results, validated with statistical significance testing, strengthen the body of evidence for embodied cognition in music and demonstrate the model can recommend music audio for dance video queries and vice versa. |
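The cross-modal retrieval task described in the abstract can be illustrated with a minimal sketch: both modalities are mapped into a shared embedding space, and retrieval ranks candidates from one modality by similarity to a query from the other. The paper's actual deep recurrent encoders are not reproduced here; the fixed toy embeddings and the `retrieve` helper below are hypothetical stand-ins, assuming cosine similarity as the ranking criterion.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_emb, candidate_embs):
    """Rank candidate indices by descending similarity to the query."""
    scores = [cosine_similarity(query_emb, c) for c in candidate_embs]
    return sorted(range(len(candidate_embs)), key=lambda i: -scores[i])

# Toy embeddings standing in for the outputs of trained audio encoders.
audio_embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])

# A dance-video query embedded into the same (hypothetical) shared space.
video_query = np.array([0.9, 0.1])

ranking = retrieve(video_query, audio_embs)
print(ranking)  # indices of audio clips, nearest first: [0, 2, 1]
```

In this toy setup, audio clip 0 is returned first because its embedding is nearly collinear with the video query; the same function works in the reverse direction for recommending dance videos from music queries.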
doi_str_mv | 10.1007/s00521-021-06090-8 |
format | Article |
publisher | London: Springer London |
rights | The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021 |
orcid | 0000-0003-1044-5989 |
fulltext | fulltext |
identifier | ISSN: 0941-0643 |
ispartof | Neural computing & applications, 2021-11, Vol.33 (21), p.14481-14493 |
issn | 0941-0643 1433-3058 |
language | eng |
source | SpringerLink Journals - AutoHoldings |
subjects | Artificial Intelligence; Cognition; Cognition & reasoning; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Dance; Data Mining and Knowledge Discovery; Image Processing and Computer Vision; Music; Original Article; Probability and Statistics in Computer Science; Recurrent neural networks; Retrieval; Semantics; Statistical models |
title | Assessing kinetic meaning of music and dance via deep cross-modal retrieval |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T09%3A27%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20kinetic%20meaning%20of%20music%20and%20dance%20via%20deep%20cross-modal%20retrieval&rft.jtitle=Neural%20computing%20&%20applications&rft.au=Raposo,%20Francisco%20Afonso&rft.date=2021-11-01&rft.volume=33&rft.issue=21&rft.spage=14481&rft.epage=14493&rft.pages=14481-14493&rft.issn=0941-0643&rft.eissn=1433-3058&rft_id=info:doi/10.1007/s00521-021-06090-8&rft_dat=%3Cproquest_cross%3E2585217498%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2585217498&rft_id=info:pmid/&rfr_iscdi=true |