Chord Conditioned Melody Generation With Transformer Based Decoders


Detailed Description

Bibliographic Details
Published in: IEEE Access, 2021, Vol. 9, pp. 42071-42080
Main Authors: Choi, Kyoyun; Park, Jonggwon; Heo, Wan; Jeon, Sungwook; Park, Jonghun
Format: Article
Language: English
Online Access: Full Text
Description
Abstract: For successful artificial music composition, chords and melody must be aligned well. Yet, chord conditioned melody generation remains a challenging task, mainly due to its multimodality. While a few studies have focused on this task, they face difficulties in generating dynamic rhythm patterns aligned appropriately with a given chord progression. In this paper, we propose a chord conditioned melody Transformer, a K-pop melody generation model, which separately produces rhythm and pitch conditioned on a chord progression. The model is trained in two phases. A rhythm decoder (RD) is trained first, and subsequently a pitch decoder is trained by utilizing the pre-trained RD. Experimental results show that reusing the RD at the pitch decoding stage and training with pitch-varied rhythm data improve performance. It was also observed that the samples produced by the model reflected the key characteristics of the dataset well in terms of both pitch- and rhythm-related features, including chord tone ratio and rhythm distribution. Qualitative analysis reveals the model's capability of generating various melodies in accordance with a given chord progression, as well as the presence of repetitions and variations within the generated melodies. Through a subjective human listening test, we conclude that the model successfully produces new melodies that sound pleasant in terms of both rhythm and pitch. (Source code available at https://github.com/ckycky3/CMT-pytorch )
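The two-phase schedule described in the abstract — train the rhythm decoder (RD) first, then train the pitch decoder on top of the frozen, pre-trained RD — can be sketched as follows. This is a minimal illustration of the training order only; the class and method names are hypothetical placeholders, not the API of the released CMT-pytorch code, and the actual Transformer layers and losses are elided.

```python
class RhythmDecoder:
    """Hypothetical stand-in for the RD: predicts rhythm tokens from chords."""

    def __init__(self):
        self.trained = False

    def train_step(self, chords, rhythm_targets):
        # ... in the real model: gradient update on the rhythm loss ...
        self.trained = True

    def encode(self, chords, rhythm):
        # Rhythm-conditioned features that the pitch decoder reuses.
        return list(zip(chords, rhythm))


class PitchDecoder:
    """Hypothetical stand-in for the pitch decoder, built on a pre-trained RD."""

    def __init__(self, rhythm_decoder):
        # Per the abstract, the pre-trained RD is reused at pitch decoding.
        assert rhythm_decoder.trained, "RD must be trained in phase 1 first"
        self.rd = rhythm_decoder

    def train_step(self, chords, rhythm, pitch_targets):
        features = self.rd.encode(chords, rhythm)
        # ... in the real model: gradient update on the pitch loss ...
        return features


# Phase 1: train the rhythm decoder alone on (chord, rhythm) pairs.
rd = RhythmDecoder()
rd.train_step(chords=["C", "G"], rhythm_targets=[1, 0])

# Phase 2: train the pitch decoder, conditioning on the pre-trained RD.
pd = PitchDecoder(rd)
features = pd.train_step(chords=["C", "G"], rhythm=[1, 0], pitch_targets=[60, 62])
```

The assertion in `PitchDecoder.__init__` encodes the ordering constraint: phase 2 cannot begin until the RD from phase 1 is available.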
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3065831