Multi-mmlg: a novel framework of extracting multiple main melodies from MIDI files

As an essential part of music, the main melody is the cornerstone of music information retrieval (MIR). In MIR's sub-field of main melody extraction, mainstream methods assume that the main melody is unique. However, this assumption does not always hold, especially for music with multiple main melodies, such as symphonies or music with many harmonies; hence, conventional methods ignore some main melodies in the music. To solve this problem, we propose a deep-learning-based Multiple Main Melodies Generator (Multi-MMLG) framework that can automatically predict potential main melodies from a MIDI file. This framework consists of two stages: (1) main melody classification using a proposed MIDIXLNet model, and (2) conditional prediction using a modified MuseBERT model. Experimental results suggest that the proposed MIDIXLNet model increases the accuracy of main melody classification from 89.62 to 97.37%, while requiring fewer parameters (71.8 million) than previous state-of-the-art approaches. We also conduct ablation experiments on the Multi-MMLG framework; in the best-case scenario, it predicts meaningful multiple main melodies for the music.
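The two-stage pipeline the abstract describes (classify candidate main-melody tracks, then conditionally predict melodies from them) can be sketched as follows. This is a minimal illustrative sketch only: the function and class names (`Track`, `classify_tracks`, `predict_melodies`, `multi_mmlg`) are hypothetical stand-ins, not the authors' API, and placeholder logic replaces the actual MIDIXLNet and MuseBERT models.

```python
# Hypothetical sketch of the two-stage Multi-MMLG pipeline from the abstract.
# Stage 1 (MIDIXLNet in the paper) is stood in for by a stored per-track score;
# stage 2 (modified MuseBERT in the paper) is stood in for by a pass-through.
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    name: str
    notes: List[int]            # MIDI pitch numbers for this track
    melody_score: float = 0.0   # stand-in for the stage-1 classifier output


def classify_tracks(tracks: List[Track], threshold: float = 0.5) -> List[Track]:
    """Stage 1: keep every track the (stand-in) classifier marks as a
    main-melody candidate, so more than one melody can survive."""
    return [t for t in tracks if t.melody_score >= threshold]


def predict_melodies(candidates: List[Track]) -> List[List[int]]:
    """Stage 2: conditionally predict one melody per candidate track.
    This placeholder simply echoes each candidate's notes."""
    return [t.notes for t in candidates]


def multi_mmlg(tracks: List[Track]) -> List[List[int]]:
    """End-to-end pipeline: classification, then conditional prediction."""
    return predict_melodies(classify_tracks(tracks))
```

The key design point the abstract makes is that stage 1 returns a *set* of candidate tracks rather than a single one, which is what lets the framework output multiple main melodies where single-melody extractors would output only one.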

Bibliographic Details
Published in: Neural computing & applications, 2023-10, Vol. 35 (30), pp. 22687-22704
Main authors: Zhao, Jing; Taniar, David; Adhinugraha, Kiki; Baskaran, Vishnu Monn; Wong, KokSheik
Format: Article
Language: English
Online access: Full text
DOI: 10.1007/s00521-023-08924-z
ISSN: 0941-0643
EISSN: 1433-3058
Source: Springer Nature - Complete Springer Journals
Subjects:
Ablation
Artificial Intelligence
Classification
Computational Biology/Bioinformatics
Computational Science and Engineering
Computer Science
Data Mining and Knowledge Discovery
Image Processing and Computer Vision
Information retrieval
Melody
Model accuracy
Music
Original Article
Probability and Statistics in Computer Science