ByteComposer: a Human-like Melody Composition Method based on Language Model Agent
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Large Language Models (LLMs) have shown encouraging progress in multimodal
understanding and generation tasks. However, how to design a human-aligned and
interpretable melody composition system remains under-explored. To solve this
problem, we propose ByteComposer, an agent framework emulating a human's
creative pipeline in four separate steps: "Conception Analysis - Draft
Composition - Self-Evaluation and Modification - Aesthetic Selection". This
framework seamlessly blends the interactive and knowledge-understanding
features of LLMs with existing symbolic music generation models, thereby
achieving a melody composition agent comparable to human creators. We conduct
extensive experiments on GPT4 and several open-source large language models,
which substantiate our framework's effectiveness. Furthermore, professional
music composers were engaged in multi-dimensional evaluations; the final
results demonstrate that, across various facets of music composition, the
ByteComposer agent attains the level of a novice melody composer. |
---|---|
DOI: | 10.48550/arxiv.2402.17785 |