A 0.05-mm² 2.91-nJ/Decision Keyword-Spotting (KWS) Chip Featuring an Always-Retention 5T-SRAM in 28-nm CMOS

Bibliographic details
Published in: IEEE Journal of Solid-State Circuits, 2024-02, Vol. 59 (2), pp. 626-635
Main authors: Tan, Fei; Yu, Wei-Han; Un, Ka-Fai; Martins, Rui P.; Mak, Pui-In
Format: Article
Language: English
Description
Summary: This article reports a keyword-spotting (KWS) chip for voice-controlled devices. It features three techniques to improve performance, area, and power efficiency: 1) a fast-sampling convolutional neural network (FS-CNN) that eliminates the power-hungry feature extractors and reduces the decision latency; 2) an always-retention 5T-SRAM that uses word-voltage switches to reduce the leakage power and single-bitline (BL) operation to halve the SRAM read power compared to the typical 6T-SRAM; and 3) a high-resolution sparsity-aware computing (HR-SAC) unit that enhances the precision and output swing of the multiply-accumulate (MAC) computation. Benchmarked against the state of the art, the KWS chip, prototyped in 28-nm CMOS, achieves >90% accuracy on the 11-class Google speech command dataset (GSCD) at 2.91 µW, which corresponds to an energy of 2.91 nJ/decision. The achieved latency is 2 ms/decision, and the core area is 0.05 mm², including the full KWS model.
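The summary gives both an average power and an energy per decision. A minimal sketch of how the two figures relate, assuming the simple relation energy per decision = average power × time per decision; the function name `decision_interval_s` is hypothetical, and the relation is an assumption rather than something the record states:

```python
def decision_interval_s(energy_per_decision_j: float, average_power_w: float) -> float:
    """Time per decision implied by E = P * t (assumed relation, not from the record)."""
    return energy_per_decision_j / average_power_w


ENERGY_PER_DECISION_J = 2.91e-9  # 2.91 nJ/decision, as quoted in the summary
AVERAGE_POWER_W = 2.91e-6        # 2.91 µW, as quoted in the summary

interval_s = decision_interval_s(ENERGY_PER_DECISION_J, AVERAGE_POWER_W)
print(f"implied interval: {interval_s * 1e3:.2f} ms per decision")
```

Under this assumption the quoted figures imply roughly 1 ms per decision. That is a different quantity from the 2 ms/decision latency given in the summary, and the record does not say how the two relate.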
ISSN: 0018-9200, 1558-173X
DOI: 10.1109/JSSC.2023.3291376