Deep convolutional network for animal sound classification and source attribution using dual audio recordings

Bibliographic details
Published in:	The Journal of the Acoustical Society of America 2019-02, Vol.145 (2), p.654-662
Main authors:	Oikarinen, Tuomas; Srinivasan, Karthik; Meisner, Olivia; Hyman, Julia B.; Parmar, Shivangi; Fanucci-Kiss, Adrian; Desimone, Robert; Landman, Rogier; Feng, Guoping
Format:	Article
Language:	English
Subjects:	Signal Processing in Acoustics
Online access:	Full text

Description
Abstract:	This paper introduces an end-to-end feedforward convolutional neural network that reliably classifies the source and type of animal calls in a noisy environment from two streams of audio data, after being trained on a dataset of modest size with imperfect labels. The data consist of audio recordings from captive marmoset monkeys housed in pairs, with several other cages nearby. The network classifies both the call type and which animal made the call in a single pass through a single network, using raw spectrogram images as input. The network vastly increases data analysis capacity for researchers studying marmoset vocalizations and allows data collection in the home cage from group-housed animals.
ISSN:	0001-4966 1520-8524
DOI:	10.1121/1.5087827
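
Illustrative sketch (not part of the catalog record and not the authors' code): the abstract describes a network that takes spectrograms from two simultaneous audio streams as input. One plausible way to prepare such an input is to compute a spectrogram per stream and stack the two as channels. The function names, the frame length, the hop size, and the log scaling below are all assumptions chosen for illustration; the record does not state the preprocessing the paper actually uses.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop are illustrative defaults, not values from the paper.
    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T


def dual_stream_input(stream_a, stream_b, frame_len=256, hop=128):
    """Stack the spectrograms of two equal-length streams as two channels.

    Hypothetical helper: the result has shape (2, freq_bins, n_frames),
    a layout a convolutional network could consume as a two-channel image.
    """
    return np.stack(
        [
            spectrogram(stream_a, frame_len, hop),
            spectrogram(stream_b, frame_len, hop),
        ]
    )
```

For two equal-length recordings, `dual_stream_input(a, b)` returns one array with the stream index as the leading axis, so a single forward pass could see both recordings at once, which is the property the abstract emphasizes.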