Exploiting Time-Frequency Conformers for Music Audio Enhancement

With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity f...

Bibliographic details
Published in:	arXiv.org 2023-08
Main authors:	Chae, Yunkee; Koo, Junghyun; Lee, Sungho; Lee, Kyogu
Format:	Article
Language:	eng
Subjects:	Music; Speech processing
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Chae, Yunkee ; Koo, Junghyun ; Lee, Sungho ; Lee, Kyogu
description	With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the transformation of degraded audio recordings into pristine high-quality music, has surged to augment the auditory experience. To address this issue, we propose a music enhancement system based on the Conformer architecture that has demonstrated outstanding performance in speech enhancement tasks. Our approach explores the attention mechanisms of the Conformer and examines their performance to discover the best approach for the music enhancement task. Our experimental results show that our proposed model achieves state-of-the-art performance on single-stem music enhancement. Furthermore, our system can perform general music enhancement with multi-track mixtures, which has not been examined in previous work.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2857166546</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2857166546</sourcerecordid><originalsourceid>FETCH-proquest_journals_28571665463</originalsourceid><addsrcrecordid>eNqNjEELgjAYQEcQJOV_GHQe6Oamx0KULt28i-hnTfSbbQ7q37dDP6DTO7zH25GIC5GyIuP8QGLnpiRJuMq5lCIil-q9zkZvGh-00Quw2sLLA_YfWhocjV3AOhpI797pnl79oA2t8NlhDwvgdiL7sZsdxD8eybmumvLGVmvCyG3tZLzFoFpeyDxVSmZK_Fd9AUvXOTg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2857166546</pqid></control><display><type>article</type><title>Exploiting Time-Frequency Conformers for Music Audio Enhancement</title><source>Freely Accessible Journals</source><creator>Chae, Yunkee ; Koo, Junghyun ; Lee, Sungho ; Lee, Kyogu</creator><creatorcontrib>Chae, Yunkee ; Koo, Junghyun ; Lee, Sungho ; Lee, Kyogu</creatorcontrib><description>With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the transformation of degraded audio recordings into pristine high-quality music, has surged to augment the auditory experience. To address this issue, we propose a music enhancement system based on the Conformer architecture that has demonstrated outstanding performance in speech enhancement tasks. Our approach explores the attention mechanisms of the Conformer and examines their performance to discover the best approach for the music enhancement task. Our experimental results show that our proposed model achieves state-of-the-art performance on single-stem music enhancement. 
Furthermore, our system can perform general music enhancement with multi-track mixtures, which has not been examined in previous work.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Music ; Speech processing</subject><ispartof>arXiv.org, 2023-08</ispartof><rights>2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>781,785</link.rule.ids></links><search><creatorcontrib>Chae, Yunkee</creatorcontrib><creatorcontrib>Koo, Junghyun</creatorcontrib><creatorcontrib>Lee, Sungho</creatorcontrib><creatorcontrib>Lee, Kyogu</creatorcontrib><title>Exploiting Time-Frequency Conformers for Music Audio Enhancement</title><title>arXiv.org</title><description>With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the transformation of degraded audio recordings into pristine high-quality music, has surged to augment the auditory experience. To address this issue, we propose a music enhancement system based on the Conformer architecture that has demonstrated outstanding performance in speech enhancement tasks. 
Our approach explores the attention mechanisms of the Conformer and examines their performance to discover the best approach for the music enhancement task. Our experimental results show that our proposed model achieves state-of-the-art performance on single-stem music enhancement. Furthermore, our system can perform general music enhancement with multi-track mixtures, which has not been examined in previous work.</description><subject>Music</subject><subject>Speech processing</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNjEELgjAYQEcQJOV_GHQe6Oamx0KULt28i-hnTfSbbQ7q37dDP6DTO7zH25GIC5GyIuP8QGLnpiRJuMq5lCIil-q9zkZvGh-00Quw2sLLA_YfWhocjV3AOhpI797pnl79oA2t8NlhDwvgdiL7sZsdxD8eybmumvLGVmvCyG3tZLzFoFpeyDxVSmZK_Fd9AUvXOTg</recordid><startdate>20230824</startdate><enddate>20230824</enddate><creator>Chae, Yunkee</creator><creator>Koo, Junghyun</creator><creator>Lee, Sungho</creator><creator>Lee, Kyogu</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope></search><sort><creationdate>20230824</creationdate><title>Exploiting Time-Frequency Conformers for Music Audio Enhancement</title><author>Chae, Yunkee ; Koo, Junghyun ; Lee, Sungho ; Lee, 
Kyogu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_28571665463</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Music</topic><topic>Speech processing</topic><toplevel>online_resources</toplevel><creatorcontrib>Chae, Yunkee</creatorcontrib><creatorcontrib>Koo, Junghyun</creatorcontrib><creatorcontrib>Lee, Sungho</creatorcontrib><creatorcontrib>Lee, Kyogu</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chae, Yunkee</au><au>Koo, Junghyun</au><au>Lee, Sungho</au><au>Lee, Kyogu</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Exploiting Time-Frequency Conformers for Music Audio Enhancement</atitle><jtitle>arXiv.org</jtitle><date>2023-08-24</date><risdate>2023</risdate><eissn>2331-8422</eissn><abstract>With the proliferation of video platforms on the internet, recording musical 
performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the transformation of degraded audio recordings into pristine high-quality music, has surged to augment the auditory experience. To address this issue, we propose a music enhancement system based on the Conformer architecture that has demonstrated outstanding performance in speech enhancement tasks. Our approach explores the attention mechanisms of the Conformer and examines their performance to discover the best approach for the music enhancement task. Our experimental results show that our proposed model achieves state-of-the-art performance on single-stem music enhancement. Furthermore, our system can perform general music enhancement with multi-track mixtures, which has not been examined in previous work.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2023-08
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2857166546
source	Freely Accessible Journals
subjects	Music ; Speech processing
title	Exploiting Time-Frequency Conformers for Music Audio Enhancement
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T00%3A30%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Exploiting%20Time-Frequency%20Conformers%20for%20Music%20Audio%20Enhancement&rft.jtitle=arXiv.org&rft.au=Chae,%20Yunkee&rft.date=2023-08-24&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2857166546%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2857166546&rft_id=info:pmid/&rfr_iscdi=true