NS-FDN: Near-Sensor Processing Architecture of Feature-Configurable Distributed Network for Beyond-Real-Time Always-on Keyword Spotting

Always-on keyword spotting (KWS) that detects wake-up words has been the indispensable module in the voice interaction system. However, the ultra-low-power embedded devices put forward strict requirements on energy consumption, latency, and recognition accuracy of KWS. In this work, we propose a nea...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on circuits and systems. I, Regular papers Regular papers, 2021-05, Vol.68 (5), p.1892-1905
Hauptverfasser: Li, Qin, Liu, Changlu, Dong, Peiyan, Zhang, Yanming, Li, Tong, Lin, Sheng, Yang, Minda, Qiao, Fei, Wang, Yanzhi, Luo, Li, Yang, Huazhong
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Always-on keyword spotting (KWS) that detects wake-up words has been the indispensable module in the voice interaction system. However, the ultra-low-power embedded devices put forward strict requirements on energy consumption, latency, and recognition accuracy of KWS. In this work, we propose a near-sensor processing architecture of feature-configurable distributed network (NS-FDN) for always-on KWS applications. The proposed distributed network adapts to the flexible keywords demands in the actual scene by splitting the conventional single network into distributed sub-networks. We design a channel-independent training framework to improve the recognition accuracy of distributed networks. The speech features are evaluated and the redundancy is reduced in NS-FDN, which can also configure the speech features to further reduce the computing complexity and improve processing speed. For deeper optimization, we implement a 65nm-process prototype chip with near-sensor mixed-signal processing architecture avoiding energy-consuming analog-to-digital converter. By improving the system, algorithm, and hardware designs of the KWS, our co-optimized architecture eliminates the energy consumption bottleneck long-standing in conventional KWS systems and achieves state-of-the-art system performance. The experiment results show that NS-FDN achieves 31.6% energy consumption savings, 1.6 times memory savings, 57 times speedup, and 3.4% higher recognition accuracy compared with the state of the art.
ISSN:1549-8328
1558-0806
DOI:10.1109/TCSI.2021.3059649