Histogram-Based Quantization for Robust and/or Distributed Speech Recognition

Bibliographic Details
Published in: IEEE Transactions on Audio, Speech, and Language Processing, 2008-05, Vol. 16 (4), p. 859-873
Authors: WAN, Chia-Yu; LEE, Lin-Shan
Format: Article
Language: English
Abstract: In a distributed speech recognition (DSR) framework, the speech features are quantized and compressed at the client and recognized at the server. However, recognition accuracy is degraded by environmental noise at the input, quantization distortion, and transmission errors. In this paper, histogram-based quantization (HQ) is proposed, in which the partition cells for quantization are dynamically defined by the histogram, or order statistics, of a segment of the most recent past values of the parameter to be quantized. This scheme is shown to alleviate, to a good degree, many of the problems related to DSR. A joint uncertainty decoding (JUD) approach is further developed to account for the uncertainty caused by both environmental noise and quantization errors. A three-stage error concealment (EC) framework is also developed to handle transmission errors. The proposed HQ is shown to be an attractive feature transformation approach for robust speech recognition outside of a DSR environment as well. All claims have been verified by experiments in the Aurora 2 testing environment, and significant performance improvements over conventional approaches have been achieved for both robust and distributed speech recognition.
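
To make the quantization idea in the abstract concrete, below is a minimal Python sketch, not the authors' exact formulation: partition cells are re-derived from the quantiles (order statistics) of a sliding window of the most recent past values of the parameter, and each new value is transmitted as the index of the cell it falls in. The class name HistogramQuantizer, the 4-bit rate, the 200-frame window, and the midpoint reconstruction are all illustrative assumptions.

import numpy as np
from collections import deque

def hq_boundaries(history, n_levels):
    # Partition-cell boundaries from the order statistics (quantiles) of the
    # recent history, so every cell is roughly equally probable locally.
    qs = np.linspace(0.0, 1.0, n_levels + 1)
    return np.quantile(np.asarray(history), qs)

class HistogramQuantizer:
    # Hypothetical sketch for one scalar parameter at a fixed bit rate.
    def __init__(self, n_bits=4, window=200):
        self.n_levels = 2 ** n_bits
        self.history = deque(maxlen=window)   # most recent past values

    def encode(self, x):
        # Map x to the index of the cell it falls in; until the history
        # buffer is long enough to define a histogram, fall back to index 0.
        if len(self.history) <= self.n_levels:
            self.history.append(x)
            return 0
        b = hq_boundaries(self.history, self.n_levels)
        idx = int(np.clip(np.searchsorted(b, x) - 1, 0, self.n_levels - 1))
        self.history.append(x)
        return idx

    def decode(self, idx):
        # Reconstruct as the midpoint of cell idx under the current history.
        if len(self.history) <= self.n_levels:
            return float(self.history[-1]) if self.history else 0.0
        b = hq_boundaries(self.history, self.n_levels)
        return 0.5 * (b[idx] + b[idx + 1])

# Example: quantize a stream of one feature coefficient, frame by frame.
hq = HistogramQuantizer(n_bits=4, window=200)
for x in np.random.randn(1000):               # stand-in for a feature stream
    idx = hq.encode(x)                         # index the client would send
    x_hat = hq.decode(idx)                     # server-side reconstruction

In a full DSR setup the server would maintain its own synchronized copy of the history (e.g., built from the decoded values); here both sides are folded into one object for brevity.
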
ISSN: 1558-7916, 2329-9290, 1558-7924, 2329-9304
DOI: 10.1109/TASL.2008.920891