Multi-mmlg: a novel framework of extracting multiple main melodies from MIDI files

As an essential part of music, the main melody is the cornerstone of music information retrieval (MIR). In MIR's sub-field of main melody extraction, mainstream methods assume that the main melody is unique. However, this assumption does not always hold, especially for music with multiple main melodies, such as symphonies or music with many harmonies; hence, conventional methods ignore some main melodies in the music. To solve this problem, we propose a deep-learning-based Multiple Main Melodies Generator (Multi-MMLG) framework that can automatically predict potential main melodies from a MIDI file. This framework consists of two stages: (1) main melody classification using a proposed MIDIXLNet model, and (2) conditional prediction using a modified MuseBERT model. Experimental results suggest that the proposed MIDIXLNet model increases the accuracy of main melody classification from 89.62 to 97.37%, while requiring fewer parameters (71.8 million) than previous state-of-the-art approaches. We also conduct ablation experiments on the Multi-MMLG framework; in the best-case scenario, it predicts meaningful multiple main melodies for the music.
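The two-stage pipeline the abstract describes (classify candidate main-melody tracks, then conditionally predict melodies from them) can be sketched as follows. This is a minimal illustrative sketch only: the function and class names (`Track`, `classify_tracks`, `predict_melodies`, `multi_mmlg`) are hypothetical stand-ins, not the authors' API, and placeholder logic replaces the actual MIDIXLNet and MuseBERT models.

```python
# Hypothetical sketch of the two-stage Multi-MMLG pipeline from the abstract.
# Stage 1 (MIDIXLNet in the paper) is stood in for by a stored per-track score;
# stage 2 (modified MuseBERT in the paper) is stood in for by a pass-through.
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    name: str
    notes: List[int]            # MIDI pitch numbers for this track
    melody_score: float = 0.0   # stand-in for the stage-1 classifier output


def classify_tracks(tracks: List[Track], threshold: float = 0.5) -> List[Track]:
    """Stage 1: keep every track the (stand-in) classifier marks as a
    main-melody candidate, so more than one melody can survive."""
    return [t for t in tracks if t.melody_score >= threshold]


def predict_melodies(candidates: List[Track]) -> List[List[int]]:
    """Stage 2: conditionally predict one melody per candidate track.
    This placeholder simply echoes each candidate's notes."""
    return [t.notes for t in candidates]


def multi_mmlg(tracks: List[Track]) -> List[List[int]]:
    """End-to-end pipeline: classification, then conditional prediction."""
    return predict_melodies(classify_tracks(tracks))
```

The key design point the abstract makes is that stage 1 returns a *set* of candidate tracks rather than a single one, which is what lets the framework output multiple main melodies where single-melody extractors would output only one.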

Bibliographic Details
Published in: Neural computing & applications, 2023-10, Vol. 35 (30), pp. 22687-22704
Main authors: Zhao, Jing; Taniar, David; Adhinugraha, Kiki; Baskaran, Vishnu Monn; Wong, KokSheik
Format: Article
Language: English
Online access: Full text
DOI: 10.1007/s00521-023-08924-z
ISSN: 0941-0643
EISSN: 1433-3058
Source: Springer Nature - Complete Springer Journals
Subjects:
Ablation
Artificial Intelligence
Classification
Computational Biology/Bioinformatics
Computational Science and Engineering
Computer Science
Data Mining and Knowledge Discovery
Image Processing and Computer Vision
Information retrieval
Melody
Model accuracy
Music
Original Article
Probability and Statistics in Computer Science