Toward Optimum Quantification of Pathology-Induced Noises: An Investigation of Information Missed by Human Auditory System

Bibliographic Details
Published in:	IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, Vol. 28, p. 519-528
Main authors:	Ghasemzadeh, Hamzeh; Arjmandi, Meisam K.
Format:	Article
Language:	English
Subjects:	Acoustic analysis; Acoustic measurements; Acoustic noise; Acoustics; cepstral analysis; Estimation; instrumental assessment of voice; Mel frequency cepstral coefficient; Noise; Noise measurement; Parameter estimation; Pathology; Perception; Quality; Resonant frequencies; Signal resolution; Spectral sensitivity; Vocal tract; Voice; voice disorder; Wavelet analysis; Wavelet transforms; wavelet-based noise estimation

Description
Summary:	Clinical diagnosis of voice disorders and evaluation of therapy outcomes rely heavily on accurate quantification of voice quality, which is closely tied to the physiology and function of the laryngeal mechanism. Voice evaluation methods fall into two main categories: auditory-perceptual assessment and acoustic analysis. This article presents a new approach for acoustic analysis of voice quality, which brings several advantages to the field. The proposed approach is non-parametric in the sense that it does not require estimation of the fundamental frequency or of the spectral response of the vocal tract. This reduces the computational complexity of the measurement and reduces the possible errors due to inaccurate estimation of those parameters. Additionally, the method makes no assumption about the phonetic context and hence has the potential to be applied to connected speech. The proposed method uses the multiresolution structure of wavelet analysis to estimate the noisy component of a voice in the spectro-temporal domain. The informativeness of the estimated noise for voice quality distinction is examined using different noise-quantification approaches. It is shown that deviating from the model of the human auditory system (HAS) improves performance. Through several analyses, it is argued that using models of the HAS to quantify the noise leads to a significant loss of information relevant to voice quality. The findings suggest that perception-based measures of voice quality are highly restricted in capturing important aspects of the acoustic signal that could assist with voice quality distinctions. This characteristic is inherent to the HAS and cannot be alleviated, highlighting a significant limitation of perception-based measures.
ISSN:	2329-9290 2329-9304
DOI:	10.1109/TASLP.2019.2959222
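
Illustrative note: the summary states that the method relies on the multiresolution structure of wavelet analysis to estimate the noisy component of a voice, but this record does not give the algorithm. The sketch below is therefore not the authors' method. It only shows, under stated assumptions, what a generic multiresolution noise estimate can look like: an orthonormal Haar decomposition followed by the common robust estimate sigma = median(|d|) / 0.6745 applied to the detail coefficients of each scale. The function names, the choice of the Haar wavelet, and the synthetic test signal are all assumptions made for illustration.

```python
import math
import random
import statistics


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_decompose(x, levels):
    """Return (final approximation, details), with details ordered finest scale first."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def noise_sigma(detail):
    """Robust noise level from detail coefficients: median(|d|) / 0.6745.

    This is a generic wavelet-denoising heuristic, assumed here for
    illustration; it is not taken from the catalogued article.
    """
    return statistics.median(abs(c) for c in detail) / 0.6745


def demo():
    """Estimate the noise level of a hypothetical noisy tone, scale by scale."""
    random.seed(0)
    fs, f0, n = 16000, 50.0, 4096  # assumed sampling rate, tone frequency, length
    signal = [math.sin(2 * math.pi * f0 * t / fs) + random.gauss(0.0, 0.1)
              for t in range(n)]
    _, details = haar_decompose(signal, levels=3)
    for level, d in enumerate(details, start=1):
        print(f"level {level}: estimated sigma = {noise_sigma(d):.3f}")


if __name__ == "__main__":
    demo()
```

In this toy setup the finest scale is dominated by the added noise, so its estimate tracks the noise level most closely; coarser scales increasingly mix in the tone itself. Note that nothing in the sketch estimates a fundamental frequency or a vocal tract response, which is the sense in which the summary calls such an approach non-parametric.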