Consistency among speech parameter vectors: application to predicting speech intelligibility

Bibliographic details
Published in:	The Journal of the Acoustical Society of America 1996-12, Vol.100 (6), p.3882-3898
Main authors:	Power, M H; Braida, L D
Format:	Article
Language:	English
Subjects:	Auditory Threshold; Humans; Noise; Perceptual Masking; Phonetics; Speech Perception
Online access:	Full text

Description
Summary:	Previous researchers interested in physical assessment of speech intelligibility have largely based their predictions on preservation of spectral shape. A new approach is presented in which intelligibility is predicted to be preserved only if a transformation modifies relevant speech parameters in a consistent manner. In particular, the parameters from each short-time interval are described by one of a finite number of symbols formed by quantizing the output of an auditory model, and preservation of intelligibility is modeled as the extent to which a one-to-one correspondence exists between the symbols of the input to the transformation, and those of the output. In this paper, a consistency-measurement system is designed and applied to prediction of intelligibility of linearly filtered speech and speech degraded by additive noise. Results were obtained for two parameter sets: one consisting of band-energy values, and the other based on the ensemble interval histogram (EIH) model. Predictions within a class of transformation varied monotonically with the amount of degradation. Across classes of transformation, the predicted effect of additive-noise transformations was more severe than typical perceptual effects. With respect to the goal of achieving predictions that varied monotonically with human speech-perception scores, performance was slightly better with the EIH parameter set.
ISSN:	0001-4966, 1520-8524
DOI:	10.1121/1.417243
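
The consistency idea in the summary (quantize each short-time parameter vector to a symbol, then ask how close the input and output symbol streams are to a one-to-one correspondence) can be illustrated with a minimal sketch. The record does not give the paper's actual measure, so everything below is an assumption made for illustration: the function names `quantize` and `consistency` are hypothetical, nearest-neighbor quantization against a fixed codebook stands in for the auditory-model quantizer, and normalized mutual information stands in for the consistency score.

```python
from collections import Counter
from math import log2


def quantize(frames, codebook):
    """Map each parameter vector to the index of its nearest codebook entry.

    Assumption: squared Euclidean distance to a fixed codebook; the
    paper's quantizer may differ.
    """
    def nearest(vec):
        return min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
        )
    return [nearest(vec) for vec in frames]


def _entropy(counts, total):
    """Shannon entropy in bits of an empirical distribution."""
    return -sum((c / total) * log2(c / total) for c in counts.values())


def consistency(in_syms, out_syms):
    """Score in [0, 1] for how close the symbol mapping is to one-to-one.

    Assumption: mutual information divided by the larger of the two
    marginal entropies. The score is 1 when input and output symbols
    are in one-to-one correspondence and 0 when they are independent.
    """
    if len(in_syms) != len(out_syms) or not in_syms:
        raise ValueError("symbol sequences must be non-empty and equal length")
    n = len(in_syms)
    h_in = _entropy(Counter(in_syms), n)
    h_out = _entropy(Counter(out_syms), n)
    h_joint = _entropy(Counter(zip(in_syms, out_syms)), n)
    mutual_info = h_in + h_out - h_joint
    denom = max(h_in, h_out)
    # Two constant sequences are trivially in one-to-one correspondence.
    return 1.0 if denom == 0 else mutual_info / denom
```

In this sketch, a transformation that merely relabels symbols scores 1, while one that merges distinct input symbols into the same output symbol (as heavy noise or filtering might) scores lower. Normalizing by the larger entropy is one design choice among several; it penalizes both merging and splitting of symbols.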