A 40-nm 144-mW VLSI processor for real-time 60-kWord continuous speech recognition

We have developed a low-power VLSI chip for 60-kWord real-time continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). Our implementation includes a cache architecture using locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Guangji He, Sugahara, T., Fujinaga, T., Miyamoto, Y., Noguchi, H., Izumi, S., Kawaguchi, H., Yoshimoto, M.
Format:	Tagungsbericht
Sprache:	eng
Schlagworte:	Frequency measurement Hidden Markov models Power demand Random access memory Real-time systems Speech recognition Viterbi algorithm
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We have developed a low-power VLSI chip for 60-kWord real-time continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). Our implementation includes a cache architecture using locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model searching, highly parallel Gaussian Mixture Model (GMM) computation based on the mixture level, a variable-frame look-ahead scheme, and elastic pipeline operation between the Viterbi transition and GMM processing. Results show that our implementation achieves 95% bandwidth reduction (70.86 MB/s) and 78% required frequency reduction (126.5 MHz). The test chip, fabricated using 40 nm CMOS technology, contains 1.9 M transistors for logic and 7.8 Mbit on-chip memory. It dissipates 144 mW at 126.5 MHz and 1.1 V for 60 kWord real-time continuous speech recognition.
ISSN:	2153-6961 2153-697X
DOI:	10.1109/ASPDAC.2013.6509561