Improving Bird Classification with Unsupervised Sound Separation
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | This paper addresses the problem of species classification in bird song
recordings. The massive amount of available field recordings of birds presents
an opportunity to use machine learning to automatically track bird populations.
However, it also poses a problem: such field recordings typically contain
significant environmental noise and overlapping vocalizations that interfere
with classification. The widely available training datasets for species
identification also typically leave background species unlabeled. This leads
classifiers to ignore vocalizations with a low signal-to-noise ratio. However,
recent advances in unsupervised sound separation, such as mixture
invariant training (MixIT), enable high-quality separation of bird songs to be
learned from such noisy recordings. In this paper, we demonstrate improved
separation quality when training a MixIT model specifically for birdsong data,
outperforming a general audio separation model by over 5 dB in SI-SNR
improvement of reconstructed mixtures. We also demonstrate precision
improvements with a downstream multi-species bird classifier across three
independent datasets. The best classifier performance is achieved by taking the
maximum model activations over the separated channels and original audio.
Finally, we document additional classifier improvements, including taxonomic
classification, augmentation by random low-pass filters, and additional channel
normalization. |
---|---|
DOI: | 10.48550/arxiv.2110.03209 |
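
The summary states that the best classifier performance comes from taking the maximum model activations over the separated channels and the original audio. A minimal sketch of that combination step follows, assuming the classifier returns one score per species for each audio input. The function name combine_max_activations and the example scores are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
from typing import List, Sequence


def combine_max_activations(
    original_scores: Sequence[float],
    separated_scores: Sequence[Sequence[float]],
) -> List[float]:
    """Return the per-species maximum over the original audio and every separated channel.

    Hypothetical helper: each argument holds classifier scores, one value per species.
    """
    # Stack the original-audio scores with the scores of each separated channel.
    rows = [list(original_scores)] + [list(channel) for channel in separated_scores]
    num_species = len(rows[0])
    if any(len(row) != num_species for row in rows):
        raise ValueError("all score vectors must cover the same number of species")
    # For each species, keep the highest score seen in any input.
    return [max(row[i] for row in rows) for i in range(num_species)]


# Illustrative scores for three species (assumed values, not results from the paper).
original = [0.10, 0.80, 0.05]
channels = [
    [0.70, 0.20, 0.01],  # separated channel 1
    [0.05, 0.90, 0.02],  # separated channel 2
]
combined = combine_max_activations(original, channels)
print(combined)
```

In this sketch a quiet background species that scores low on the mixed recording can still be detected if it scores high on one separated channel, while including the original audio keeps any detection that separation might have degraded.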