Music ControlNet: Multiple Time-Varying Controls for Music Generation

Text-to-music generation models are now capable of generating high-quality music audio in broad styles. However, text control is primarily suitable for the manipulation of global musical attributes like genre, mood, and tempo, and is less suitable for precise control over time-varying attributes suc...

Bibliographic Details
Published in:	IEEE/ACM transactions on audio, speech, and language processing, 2024, Vol.32, p.2692-2703
Main authors:	Wu, Shih-Lun, Donahue, Chris, Watanabe, Shinji, Bryan, Nicholas J.
Format:	Article
Language:	English
Subjects:	Audio data; controllable generative modeling; Data models; diffusion models; Estimation; Instruments; Mood; Multiple signal classification; Music; Music generation; Rhythm; Spectrograms; Time varying control; Training
Online access:	Full text