Deep convolutional networks on the pitch spiral for musical instrument recognition
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Musical performance combines a wide range of pitches, nuances, and expressive
techniques. Audio-based classification of musical instruments thus requires
building signal representations that are invariant to such transformations. This
article investigates the construction of learned convolutional architectures
for instrument recognition, given a limited amount of annotated training data.
In this context, we benchmark three different weight sharing strategies for
deep convolutional networks in the time-frequency domain: temporal kernels;
time-frequency kernels; and a linear combination of time-frequency kernels
which are one octave apart, akin to a Shepard pitch spiral. We provide an
acoustical interpretation of these strategies within the source-filter
framework of quasi-harmonic sounds with a fixed spectral envelope, which are
archetypal of musical notes. The best classification accuracy is obtained by
hybridizing all three convolutional layers into a single deep learning
architecture. |
---|---|
DOI: | 10.48550/arxiv.1605.06644 |
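
The third weight sharing strategy named in the summary, a linear combination of time-frequency kernels which are one octave apart, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the kernel shapes, the "valid" boundary handling, and the choice of 12 bins per octave are all assumptions made for illustration. The input is assumed to be a log-frequency time-frequency representation (for example a constant-Q transform) stored as an array of shape (frequency bins, time frames), so that a shift of `bins_per_octave` rows corresponds to one octave.

```python
import numpy as np


def correlate2d_valid(x, w):
    """'Valid' 2-D cross-correlation of x (freq, time) with a small kernel w.

    This plays the role of a single time-frequency kernel.
    """
    kf, kt = w.shape
    out = np.zeros((x.shape[0] - kf + 1, x.shape[1] - kt + 1))
    for i in range(kf):
        for j in range(kt):
            out += w[i, j] * x[i:i + out.shape[0], j:j + out.shape[1]]
    return out


def spiral_layer(x, kernels, bins_per_octave):
    """Sum of time-frequency correlations taken one octave apart.

    kernels[j] is applied to the input shifted upward by j octaves
    (hypothetical layout; all kernels must share the same shape).
    """
    n_octaves = len(kernels)
    span = (n_octaves - 1) * bins_per_octave
    n_low = x.shape[0] - span  # number of bins that have all octave partners
    if n_low < kernels[0].shape[0]:
        raise ValueError("input has too few frequency bins for this layer")
    out = None
    for j, w in enumerate(kernels):
        start = j * bins_per_octave
        band = x[start:start + n_low]  # same pitch classes, j octaves higher
        y = correlate2d_valid(band, w)
        out = y if out is None else out + y
    return out


# Usage with random data: 4 octaves of 12 bins, 10 time frames,
# and three 3x3 kernels covering three adjacent octaves.
rng = np.random.default_rng(0)
x = rng.standard_normal((48, 10))
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]
y = spiral_layer(x, kernels, bins_per_octave=12)
print(y.shape)
```

With a single kernel in `kernels`, the layer reduces to an ordinary time-frequency kernel; additional kernels let the same output unit combine energy from bins that share a pitch class across octaves. A learned network would add a bias, a nonlinearity, and multiple output channels, which are omitted here.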