TEAdapter: Supply abundant guidance for controllable text-to-music generation
2024 IEEE International Conference on Multimedia and Expo (ICME 2024) Although current text-guided music generation technology can cope with simple creative scenarios, achieving fine-grained control over individual text-modality conditions remains challenging as user demands become more intricate. A...
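Main authors: | Zou, Jialing ; Mei, Jiahao ; Nan, Xudong ; Li, Jinghua ; Dong, Daoguo ; He, Liang |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Multimedia ; Computer Science - Sound |
Online access: | Order full text |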
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Zou, Jialing ; Mei, Jiahao ; Nan, Xudong ; Li, Jinghua ; Dong, Daoguo ; He, Liang |
description | 2024 IEEE International Conference on Multimedia and Expo (ICME
2024) Although current text-guided music generation technology can cope with simple
creative scenarios, achieving fine-grained control over individual
text-modality conditions remains challenging as user demands become more
intricate. Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact
plugin designed to guide the generation process with diverse control
information provided by users. In addition, we explore the controllable
generation of extended music by leveraging TEAdapter control groups trained on
data of distinct structural functionalities. In general, we consider controls
over global, elemental, and structural levels. Experimental results demonstrate
that the proposed TEAdapter enables multiple precise controls and ensures
high-quality music generation. Our module is also lightweight and transferable
to any diffusion model architecture. Code and demos will be made available
soon at https://github.com/Ashley1101/TEAdapter. |
doi_str_mv | 10.48550/arxiv.2408.04865 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2408_04865</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2408_04865</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2408_048653</originalsourceid><addsrcrecordid>eNqFjrEOgjAUALs4GPUDnOwPgFXBEDdjMC5OupMHPEiT0jaPVwN_rxJ3p1suuRNivVNxkqWp2gIN-hXvE5XFKsmO6Vzcn_m5Bs9IJ_kI3ptRQhlsDZZlG_SHFcrGkaycZXLGQGlQMg4csYu60OtKtmiRgLWzSzFrwPS4-nEhNtf8eblFU7jwpDugsfgOFNPA4b_xBmUBPBU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>TEAdapter: Supply abundant guidance for controllable text-to-music generation</title><source>arXiv.org</source><creator>Zou, Jialing ; Mei, Jiahao ; Nan, Xudong ; Li, Jinghua ; Dong, Daoguo ; He, Liang</creator><creatorcontrib>Zou, Jialing ; Mei, Jiahao ; Nan, Xudong ; Li, Jinghua ; Dong, Daoguo ; He, Liang</creatorcontrib><description>2024 IEEE International Conference on Multimedia and Expo (ICME
2024) Although current text-guided music generation technology can cope with simple
creative scenarios, achieving fine-grained control over individual
text-modality conditions remains challenging as user demands become more
intricate. Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact
plugin designed to guide the generation process with diverse control
information provided by users. In addition, we explore the controllable
generation of extended music by leveraging TEAdapter control groups trained on
data of distinct structural functionalities. In general, we consider controls
over global, elemental, and structural levels. Experimental results demonstrate
that the proposed TEAdapter enables multiple precise controls and ensures
high-quality music generation. Our module is also lightweight and transferable
to any diffusion model architecture. Available code and demos will be found
soon at https://github.com/Ashley1101/TEAdapter.</description><identifier>DOI: 10.48550/arxiv.2408.04865</identifier><language>eng</language><subject>Computer Science - Multimedia ; Computer Science - Sound</subject><creationdate>2024-08</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2408.04865$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2408.04865$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Zou, Jialing</creatorcontrib><creatorcontrib>Mei, Jiahao</creatorcontrib><creatorcontrib>Nan, Xudong</creatorcontrib><creatorcontrib>Li, Jinghua</creatorcontrib><creatorcontrib>Dong, Daoguo</creatorcontrib><creatorcontrib>He, Liang</creatorcontrib><title>TEAdapter: Supply abundant guidance for controllable text-to-music generation</title><description>2024 IEEE International Conference on Multimedia and Expo (ICME
2024) Although current text-guided music generation technology can cope with simple
creative scenarios, achieving fine-grained control over individual
text-modality conditions remains challenging as user demands become more
intricate. Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact
plugin designed to guide the generation process with diverse control
information provided by users. In addition, we explore the controllable
generation of extended music by leveraging TEAdapter control groups trained on
data of distinct structural functionalities. In general, we consider controls
over global, elemental, and structural levels. Experimental results demonstrate
that the proposed TEAdapter enables multiple precise controls and ensures
high-quality music generation. Our module is also lightweight and transferable
to any diffusion model architecture. Available code and demos will be found
soon at https://github.com/Ashley1101/TEAdapter.</description><subject>Computer Science - Multimedia</subject><subject>Computer Science - Sound</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNqFjrEOgjAUALs4GPUDnOwPgFXBEDdjMC5OupMHPEiT0jaPVwN_rxJ3p1suuRNivVNxkqWp2gIN-hXvE5XFKsmO6Vzcn_m5Bs9IJ_kI3ptRQhlsDZZlG_SHFcrGkaycZXLGQGlQMg4csYu60OtKtmiRgLWzSzFrwPS4-nEhNtf8eblFU7jwpDugsfgOFNPA4b_xBmUBPBU</recordid><startdate>20240809</startdate><enddate>20240809</enddate><creator>Zou, Jialing</creator><creator>Mei, Jiahao</creator><creator>Nan, Xudong</creator><creator>Li, Jinghua</creator><creator>Dong, Daoguo</creator><creator>He, Liang</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240809</creationdate><title>TEAdapter: Supply abundant guidance for controllable text-to-music generation</title><author>Zou, Jialing ; Mei, Jiahao ; Nan, Xudong ; Li, Jinghua ; Dong, Daoguo ; He, Liang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2408_048653</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Multimedia</topic><topic>Computer Science - Sound</topic><toplevel>online_resources</toplevel><creatorcontrib>Zou, Jialing</creatorcontrib><creatorcontrib>Mei, Jiahao</creatorcontrib><creatorcontrib>Nan, Xudong</creatorcontrib><creatorcontrib>Li, Jinghua</creatorcontrib><creatorcontrib>Dong, Daoguo</creatorcontrib><creatorcontrib>He, Liang</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zou, Jialing</au><au>Mei, Jiahao</au><au>Nan, Xudong</au><au>Li, Jinghua</au><au>Dong, Daoguo</au><au>He, 
Liang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TEAdapter: Supply abundant guidance for controllable text-to-music generation</atitle><date>2024-08-09</date><risdate>2024</risdate><abstract>2024 IEEE International Conference on Multimedia and Expo (ICME
2024) Although current text-guided music generation technology can cope with simple
creative scenarios, achieving fine-grained control over individual
text-modality conditions remains challenging as user demands become more
intricate. Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact
plugin designed to guide the generation process with diverse control
information provided by users. In addition, we explore the controllable
generation of extended music by leveraging TEAdapter control groups trained on
data of distinct structural functionalities. In general, we consider controls
over global, elemental, and structural levels. Experimental results demonstrate
that the proposed TEAdapter enables multiple precise controls and ensures
high-quality music generation. Our module is also lightweight and transferable
to any diffusion model architecture. Available code and demos will be found
soon at https://github.com/Ashley1101/TEAdapter.</abstract><doi>10.48550/arxiv.2408.04865</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2408.04865 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2408_04865 |
source | arXiv.org |
subjects | Computer Science - Multimedia ; Computer Science - Sound |
title | TEAdapter: Supply abundant guidance for controllable text-to-music generation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T18%3A13%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TEAdapter:%20Supply%20abundant%20guidance%20for%20controllable%20text-to-music%20generation&rft.au=Zou,%20Jialing&rft.date=2024-08-09&rft_id=info:doi/10.48550/arxiv.2408.04865&rft_dat=%3Carxiv_GOX%3E2408_04865%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |