Frame-Level Multi-Label Playing Technique Detection Using Multi-Scale Network and Self-Attention Mechanism
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Instrument playing technique (IPT) is a key element of musical presentation.
However, most existing work on IPT detection considers only monophonic
music signals, and little has been done to detect IPTs in polyphonic
instrumental solo pieces with overlapping or mixed IPTs. In this paper, we
formulate IPT detection as a frame-level multi-label classification problem and
apply this formulation to the Guzheng, a Chinese plucked string instrument. We
create a new dataset, Guzheng_Tech99, containing Guzheng recordings with onset,
offset, pitch, and IPT annotations for each note. Because IPTs vary
considerably in length, we propose a new method that combines a multi-scale
network with self-attention. The multi-scale network extracts features at different
scales, and the self-attention mechanism applied to the feature maps at the
coarsest scale further enhances the long-range feature extraction. Our approach
outperforms existing works by a large margin, indicating its effectiveness in
IPT detection. |
---|---|
DOI: | 10.48550/arxiv.2303.13272 |
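
The frame-level multi-label formulation named in the abstract can be illustrated with a minimal sketch that turns note annotations (onset, offset, IPT) into one binary label vector per frame. This is not the authors' code: the function name `frame_labels`, the technique names, the millisecond time unit, and the 100 ms hop are all assumptions chosen for illustration, and the actual Guzheng_Tech99 annotation format may differ.

```python
def frame_labels(notes, classes, duration_ms, hop_ms):
    """Return a frames x classes 0/1 matrix from (onset_ms, offset_ms, ipt) notes.

    Hypothetical helper: times are integer milliseconds so that frame
    boundaries are computed with exact integer arithmetic.
    """
    n_frames = -(-duration_ms // hop_ms)  # ceiling division
    index = {name: k for k, name in enumerate(classes)}
    labels = [[0] * len(classes) for _ in range(n_frames)]
    for onset_ms, offset_ms, ipt in notes:
        first = onset_ms // hop_ms                        # first frame touched by the note
        last = min(n_frames, -(-offset_ms // hop_ms))     # one past the last frame touched
        for t in range(first, last):
            labels[t][index[ipt]] = 1                     # several classes may be active at once
    return labels


# Placeholder technique names and two overlapping notes (assumed values).
classes = ["vibrato", "glissando", "tremolo"]
notes = [(0, 200, "vibrato"), (100, 300, "tremolo")]
labels = frame_labels(notes, classes, duration_ms=400, hop_ms=100)
for row in labels:
    print(row)
```

In the second frame both techniques are active at once, which is the overlapping case that a single-label classifier cannot represent. A model trained on such targets would typically emit one independent probability per technique per frame.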