Deep Learning Approaches for Pathological Voice Detection Using Heterogeneous Parameters

Bibliographic details
Published in:	IEICE Transactions on Information and Systems 2020/08/01, Vol.E103.D(8), pp.1920-1923
Main authors:	LEE, JiYeoun, CHOI, Hee-Jin
Format:	Article
Language:	English
Subjects:	Accuracy; Artificial neural networks; convolutional neural network; Deep learning; deep learning method; feedforward neural network; higher-order statistics; Machine learning; Mathematical models; Model accuracy; Neural networks; Parameters; pathological voice detection; Voice recognition
Online access:	Full text

Description
Summary:	We propose a deep learning-based model for classifying pathological voices using a convolutional neural network and a feedforward neural network. The model uses combinations of heterogeneous parameters, including mel-frequency cepstral coefficients, linear predictive cepstral coefficients and higher-order statistics. We validate the accuracy of this model using the Massachusetts Eye and Ear Infirmary (MEEI) voice disorder database and the Saarbruecken Voice Database (SVD). Our model achieved an accuracy of 99.3% for MEEI and 75.18% for SVD. This model achieved an accuracy that is 7.18% higher than that of competitive models in previous studies.
ISSN:	0916-8532, 1745-1361
DOI:	10.1587/transinf.2020EDL8031
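
Illustration:	The summary lists higher-order statistics among the heterogeneous parameters, but this record does not say which statistics are used or how the parameters are combined. The sketch below is therefore only an assumption-laden illustration, not the authors' method: it takes frame-level skewness and kurtosis as the higher-order statistics and simply concatenates them with cepstral vectors that are assumed to have been computed elsewhere. The names `frame_signal`, `higher_order_stats` and `combine_parameters`, the frame length and the hop size are all hypothetical.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames (no padding)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def higher_order_stats(frame):
    """Return (skewness, kurtosis) of one frame from population moments.

    Assumption: skewness and kurtosis stand in for the "higher-order
    statistics" named in the summary; the record does not specify them.
    """
    n = len(frame)
    mean = sum(frame) / n
    m2 = sum((x - mean) ** 2 for x in frame) / n
    if m2 == 0.0:
        # Constant frame: normalized moments are undefined, return zeros.
        return 0.0, 0.0
    m3 = sum((x - mean) ** 3 for x in frame) / n
    m4 = sum((x - mean) ** 4 for x in frame) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def combine_parameters(mfcc, lpcc, hos):
    """Concatenate per-frame parameter vectors into one feature vector.

    Assumption: plain concatenation; mfcc and lpcc are placeholders for
    cepstral coefficients computed by some other tool.
    """
    return list(mfcc) + list(lpcc) + list(hos)


if __name__ == "__main__":
    # Synthetic stand-in for a voice signal: 100 Hz sine sampled at 8 kHz.
    signal = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(800)]
    frames = frame_signal(signal, frame_len=200, hop=100)
    skew, kurt = higher_order_stats(frames[0])
    # Placeholder cepstral vectors (all zeros), for shape illustration only.
    features = combine_parameters([0.0] * 13, [0.0] * 12, [skew, kurt])
    print(len(frames), len(features))
```

A feature vector built this way could be passed to a feedforward network as-is, while a convolutional network would more typically consume the per-frame vectors stacked into a two-dimensional array; which arrangement the article uses is not stated in this record.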