Audio-to-Score Conversion Model Based on Whisper methodology
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | This thesis develops a Transformer model based on Whisper that extracts
melodies and chords from music audio and transcribes them into ABC notation. A
comprehensive data processing workflow is customized for ABC notation,
including data cleansing, formatting, and conversion, and a mutation mechanism
is implemented to increase the diversity and quality of the training data. The
thesis introduces "Orpheus' Score", a custom notation system
that converts music information into tokens, designs a custom vocabulary
library, and trains a corresponding custom tokenizer. Experiments show that,
compared to traditional algorithms, the model significantly improves
accuracy and performance. While providing a convenient audio-to-score tool for
music enthusiasts, this work also offers new ideas and tools for research in
music information processing. |
---|---|
DOI: | 10.48550/arxiv.2410.17209 |