Conditional Drums Generation using Compound Word Representations
Main authors: , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: The field of automatic music composition has seen great progress in recent years, particularly with the advent of transformer-based architectures. For any deep learning model that treats music as a sequence of events with multiple complex dependencies, the choice of data representation is crucial. In this paper, we tackle the task of conditional drums generation using a novel data encoding scheme inspired by the Compound Word representation, a tokenization process for sequential data. To this end, we present a sequence-to-sequence architecture in which a Bidirectional Long Short-Term Memory (BiLSTM) encoder receives information about the conditioning parameters (i.e., accompanying tracks and musical attributes), while a Transformer-based decoder with relative global attention produces the generated drum sequences. We conducted experiments to thoroughly compare the effectiveness of our method against several baselines. Quantitative evaluation shows that our model generates drum sequences whose statistical distributions and characteristics, including syncopation, compression ratio, and symmetry, resemble those of the training corpus. We also verified, through a listening test, that the generated drum sequences sound pleasant, natural, and coherent while they "groove" with the given accompaniment.
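To illustrate the core idea behind a Compound Word-style encoding, the following minimal sketch groups each drum event into a single multi-field token rather than a flat stream of separate tokens. The field names, vocabularies, and quantization grid here are assumptions for illustration only, not the paper's exact scheme:

```python
# Minimal sketch of a Compound Word-style tokenizer (illustrative only:
# field names, vocabularies, and 16th-note quantization are assumptions,
# not the paper's exact encoding). Each event becomes ONE compound token
# whose fields (onset, drum instrument, velocity) are indexed against
# small per-field vocabularies.

ONSETS = list(range(16))                   # 16th-note grid within a bar
DRUMS = ["kick", "snare", "hihat", "tom"]  # toy drum vocabulary
VELOCITIES = ["soft", "medium", "loud"]    # coarse dynamics buckets

def to_compound_word(event):
    """Map one drum event dict to a tuple of per-field vocabulary indices."""
    return (
        ONSETS.index(event["onset"]),
        DRUMS.index(event["drum"]),
        VELOCITIES.index(event["velocity"]),
    )

def tokenize(events):
    """Tokenize a drum pattern: one compound word per event."""
    return [to_compound_word(e) for e in events]

pattern = [
    {"onset": 0, "drum": "kick", "velocity": "loud"},
    {"onset": 4, "drum": "snare", "velocity": "medium"},
    {"onset": 8, "drum": "kick", "velocity": "loud"},
]
tokens = tokenize(pattern)  # [(0, 0, 2), (4, 1, 1), (8, 0, 2)]
```

The practical benefit this sketch tries to show is that the sequence length equals the number of musical events, not the number of fields, which shortens the sequences a transformer-based decoder must model compared with flat event-based encodings.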
DOI: 10.48550/arxiv.2202.04464