An optimal Bhattacharyya centroid algorithm for Gaussian clustering with applications in automatic speech recognition
Main authors: | , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | The problem of clustering Gaussian distributions can be effectively solved by standard vector quantization algorithms where the metric is defined by the Bhattacharyya distance. This paper presents a novel algorithm for computing the optimal centroid for a cluster of Gaussian distributions according to the Bhattacharyya metric. We show that this centroid maximizes an upper bound on the probability of representing the population modeled by the distributions associated with the cluster. The proposed method is evaluated in clustering distributions of hidden Markov model speech recognizers to reduce the overall memory consumption and runtime complexity of the decoding. Experimental results show that, depending on the task, the number of distributions can be reduced by a factor of 2 to 6 with an increase in recognition accuracy. When compared to a maximum likelihood centroid, the Bhattacharyya centroid provides a 13% error rate reduction in a 2k-word recognition task. |
---|---|
ISSN: | 1520-6149, 2379-190X |
DOI: | 10.1109/ICASSP.2000.861998 |
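
The abstract relies on the Bhattacharyya distance as the clustering metric between Gaussians. As an illustration only, the sketch below computes the standard closed-form Bhattacharyya distance between two Gaussians, under the assumption of diagonal covariance matrices (a common choice in hidden Markov model recognizers, but not stated in this record). The function name `bhattacharyya_diag` and its parameters are hypothetical, and this is not the paper's centroid algorithm, only the pairwise distance a vector quantizer would use.

```python
import math


def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Assumption: each Gaussian is given as a list of per-dimension means
    and a list of per-dimension variances (all variances > 0).
    """
    dist = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)  # averaged variance for this dimension
        # Mean term: (1/8) * (m1 - m2)^2 / v
        dist += 0.125 * (m1 - m2) ** 2 / v
        # Covariance term: (1/2) * ln(v / sqrt(v1 * v2))
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))
    return dist


if __name__ == "__main__":
    # Two 1-D unit-variance Gaussians whose means differ by 2.
    print(bhattacharyya_diag([0.0], [1.0], [2.0], [1.0]))
```

The distance is zero for identical distributions and symmetric in its arguments, which is what lets it serve as the metric in a standard vector quantization loop.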