Speech detection and recognition apparatus for use with background noise of varying levels

Bibliographic details
Main author: ROBERTS, JED M
Format: Patent
Language: English
Description
Summary: A speech detection system compares the amplitude of an audio signal during successive time periods with speech detection thresholds, and generates an indication of whether the signal contains speech. It derives a background amplitude level from portions of the signal which it indicates do not contain speech, and improves its speech detection by altering the amplitude of the audio signal relative to the speech detection thresholds as a function of this background level. Preferably the background amplitude level is a moving average, which is repeatedly recalculated and repeatedly used to alter the relative amplitude of the audio signal and the detection thresholds. The apparatus uses a measure of the variability of the background amplitude to improve its speech detection. It generates start-of-speech and end-of-speech indications when the amplitude crosses respective thresholds for specified numbers of frames. The background amplitude level is calculated from frames which precede the start-of-speech indication by a predetermined amount and which follow the end-of-speech indication. The invention also provides a speech recognition system which compares the amplitudes of an audio signal against the amplitudes of acoustic models of vocabulary words to determine which vocabulary words correspond to the signal. The system compensates for background noise by using the background amplitude level, described above, to alter the audio signal amplitudes relative to the acoustic model amplitudes.
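
The detection scheme in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name SpeechDetector, the helper detect, and every numeric value (thresholds, frame counts, smoothing weight) are hypothetical choices made for illustration; the record does not state any of them. The sketch also simplifies in three ways: it uses an exponential moving average where the summary says only "moving average", it updates the background only from frames that fall below the start threshold while no speech is indicated (instead of excluding a predetermined number of frames before the start-of-speech indication), and it leaves out the background variability measure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SpeechDetector:
    # All defaults are illustrative assumptions, not values from the patent.
    start_threshold: float = 10.0  # adjusted amplitude that counts toward start-of-speech
    end_threshold: float = 5.0     # adjusted amplitude below which speech may be ending
    start_frames: int = 3          # consecutive frames required to indicate start
    end_frames: int = 5            # consecutive frames required to indicate end
    smoothing: float = 0.1         # weight of each new frame in the moving average
    background: float = 0.0        # running estimate of the background amplitude
    in_speech: bool = False
    _run: int = 0                  # length of the current run of qualifying frames

    def process(self, amplitude: float) -> Optional[str]:
        """Consume one frame amplitude and return 'start', 'end', or None."""
        # Shift the signal relative to the fixed thresholds by the background level.
        adjusted = amplitude - self.background
        if not self.in_speech:
            if adjusted > self.start_threshold:
                self._run += 1
                if self._run >= self.start_frames:
                    self.in_speech = True
                    self._run = 0
                    return "start"
            else:
                self._run = 0
                # Non-speech frame: fold it into the background moving average.
                self.background += self.smoothing * (amplitude - self.background)
        else:
            if adjusted < self.end_threshold:
                self._run += 1
                if self._run >= self.end_frames:
                    self.in_speech = False
                    self._run = 0
                    return "end"
            else:
                self._run = 0
        return None


def detect(frames: List[float]) -> List[Tuple[int, str]]:
    """Run a fresh detector over frame amplitudes and list (index, event) pairs."""
    detector = SpeechDetector()
    events = []
    for index, amplitude in enumerate(frames):
        event = detector.process(amplitude)
        if event is not None:
            events.append((index, event))
    return events


if __name__ == "__main__":
    # 20 quiet frames, 5 loud frames, 10 quiet frames.
    print(detect([2.0] * 20 + [20.0] * 5 + [2.0] * 10))
    # prints [(22, 'start'), (29, 'end')]
```

In this sketch the start indication arrives on the third loud frame and the end indication on the fifth quiet frame after it, which mirrors the "specified numbers of frames" wording in the summary. Subtracting the background from the signal is one way to alter the signal amplitude relative to the thresholds; raising the thresholds by the same amount would be equivalent here.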