Audio event detection with window-based prediction

Bibliographic details
Main authors: Hoffman, Yonit; Shiloh Perl, Lihi Ahuva; Pundak, Gilad; Fishman, Ben
Format: Patent
Language: English
Description
Summary: A computing system for a plurality of classes of audio events is provided, including one or more processors configured to divide a run-time audio signal into a plurality of segments and process each segment of the run-time audio signal in a time domain to generate a normalized time domain representation of each segment. The processor is further configured to feed the normalized time domain representation of each segment to an input layer of a trained neural network. The processor is further configured to generate, by the neural network, a plurality of predicted classification scores and associated probabilities for each class of audio event contained in each segment of the run-time input audio signal. In post-processing, the processor is further configured to generate smoothed predicted classification scores, associated smoothed probabilities, and class window confidence values for each class for each of a plurality of candidate window sizes.
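
The pipeline in the summary (segmentation, time-domain normalization, per-segment scoring, and smoothing over several candidate window sizes) can be illustrated with a minimal sketch. Nothing below is taken from the patent itself: the function names, the zero-mean/unit-variance normalization, the moving-average smoothing, the use of the peak smoothed probability as a "class window confidence", and the random linear layer standing in for the trained neural network are all assumptions chosen for illustration.

```python
import numpy as np


def segment_signal(signal, segment_len):
    """Split a 1-D signal into non-overlapping segments; a trailing remainder is dropped."""
    n = len(signal) // segment_len
    return signal[: n * segment_len].reshape(n, segment_len)


def normalize_segments(segments, eps=1e-8):
    """Assumed normalization: zero mean and unit variance per segment, in the time domain."""
    mean = segments.mean(axis=1, keepdims=True)
    std = segments.std(axis=1, keepdims=True)
    return (segments - mean) / (std + eps)


def toy_model(segments, num_classes, seed=0):
    """Hypothetical stand-in for the trained network: random linear layer plus softmax."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(segments.shape[1], num_classes))
    scores = segments @ weights / np.sqrt(segments.shape[1])
    exp = np.exp(scores - scores.max(axis=1, keepdims=True))
    return scores, exp / exp.sum(axis=1, keepdims=True)


def smooth(values, window):
    """Moving average over `window` consecutive segments, computed separately per class."""
    kernel = np.ones(window) / window
    columns = [np.convolve(values[:, c], kernel, mode="valid")
               for c in range(values.shape[1])]
    return np.stack(columns, axis=1)


def window_confidences(probs, candidate_windows):
    """Assumed confidence: the peak smoothed probability per class, for each window size."""
    return {w: smooth(probs, w).max(axis=0) for w in candidate_windows}


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    audio = rng.normal(size=16500)            # synthetic stand-in for a run-time signal
    segments = normalize_segments(segment_signal(audio, 1000))
    scores, probs = toy_model(segments, num_classes=3)
    for window, conf in window_confidences(probs, [1, 2, 4]).items():
        print(window, np.round(conf, 3))
```

Comparing confidences across window sizes is what makes the approach window-based: a short event may score highest with a small window, while a sustained event keeps a high average over a larger one. How the described system actually combines these values is not stated in the summary.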