Nonintrusive Speech Intelligibility Prediction Using Convolutional Neural Networks

Bibliographic Details
Published in:	IEEE/ACM transactions on audio, speech, and language processing, 2018-10, Vol.26 (10), p.1925-1939
Main authors:	Andersen, Asger Heidemann; de Haan, Jan Mark; Tan, Zheng-Hua; Jensen, Jesper
Format:	Article
Language:	English
Subjects:	Artificial intelligence; Auditory system; Convolutional neural networks; Noise measurement; Nonintrusive speech intelligibility prediction; Prediction algorithms; Signal processing algorithms; Speech processing

Description
Abstract:	Speech Intelligibility Prediction (SIP) algorithms are becoming popular tools within the development and operation of speech processing devices and algorithms. However, many SIP algorithms require knowledge of the underlying clean speech, a signal that is often not available in real-world applications. This has led to increased interest in nonintrusive SIP algorithms, which do not require clean speech to make predictions. In this paper, we investigate the use of Convolutional Neural Networks (CNNs) for nonintrusive SIP. To do so, we utilize a CNN architecture that shows similarities to existing SIP algorithms in terms of computational structure, and which allows for easy and meaningful visualization and interpretation of trained weights. We evaluate this architecture using a large dataset obtained by combining datasets from the literature. The proposed method shows high prediction performance when compared with four existing intrusive and nonintrusive SIP algorithms. This demonstrates the potential of deep learning for speech intelligibility prediction.
ISSN:	2329-9290, 2329-9304
DOI:	10.1109/TASLP.2018.2847459