Consistency among speech parameter vectors: application to predicting speech intelligibility

Bibliographic details
Published in: The Journal of the Acoustical Society of America, 1996-12, Vol. 100 (6), p. 3882-3898
Main authors: Power, M H; Braida, L D
Format: Article
Language: English
Description
Summary: Previous researchers interested in physical assessment of speech intelligibility have largely based their predictions on preservation of spectral shape. A new approach is presented in which intelligibility is predicted to be preserved only if a transformation modifies relevant speech parameters in a consistent manner. In particular, the parameters from each short-time interval are described by one of a finite number of symbols formed by quantizing the output of an auditory model, and preservation of intelligibility is modeled as the extent to which a one-to-one correspondence exists between the symbols of the input to the transformation and those of the output. In this paper, a consistency-measurement system is designed and applied to prediction of intelligibility of linearly filtered speech and speech degraded by additive noise. Results were obtained for two parameter sets: one consisting of band-energy values, and the other based on the ensemble interval histogram (EIH) model. Predictions within a class of transformation varied monotonically with the amount of degradation. Across classes of transformation, the predicted effect of additive-noise transformations was more severe than typical perceptual effects. With respect to the goal of achieving predictions that varied monotonically with human speech-perception scores, performance was slightly better with the EIH parameter set.
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.417243
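
Illustrative sketch (not from the article): the summary describes quantizing short-time parameter vectors into symbols and scoring how close the input-to-output symbol mapping is to one-to-one. The summary does not state the authors' actual consistency metric, so the Python below is only a minimal sketch under stated assumptions: nearest-codeword quantization against a toy codebook, and normalized mutual information as the consistency score. The names `quantize` and `consistency`, the codebook, and the toy two-band "frames" are all hypothetical.

```python
import math
from collections import Counter


def quantize(frames, codebook):
    """Map each parameter vector to the index of its nearest codeword.

    Assumption: squared Euclidean distance to a fixed codebook; the
    article's quantizer and auditory model are not reproduced here.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: dist2(frame, codebook[k]))
        for frame in frames
    ]


def entropy(counts, n):
    """Shannon entropy in bits of an empirical distribution."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def consistency(in_syms, out_syms):
    """Assumed score: mutual information divided by the larger entropy.

    It reaches 1.0 when input and output symbols are in one-to-one
    correspondence and 0.0 when the output carries no information
    about the input.
    """
    if not in_syms or len(in_syms) != len(out_syms):
        raise ValueError("symbol sequences must be non-empty and equal length")
    n = len(in_syms)
    h_in = entropy(Counter(in_syms), n)
    h_out = entropy(Counter(out_syms), n)
    h_joint = entropy(Counter(zip(in_syms, out_syms)), n)
    mutual = h_in + h_out - h_joint
    denom = max(h_in, h_out)
    # A single symbol on both sides is trivially one-to-one.
    return 1.0 if denom == 0 else mutual / denom


# Toy data: two "band energies" per frame (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
clean = [(0.1, 0.0), (0.9, 0.1), (0.0, 1.1), (1.0, 0.0), (0.1, 0.9), (0.0, 0.1)]

# A transformation that relabels symbols consistently (bands swapped).
swapped = [(y, x) for x, y in clean]
# A transformation that collapses every frame onto one symbol.
attenuated = [(0.2 * x, 0.2 * y) for x, y in clean]

clean_syms = quantize(clean, codebook)
score_swapped = consistency(clean_syms, quantize(swapped, codebook))
score_attenuated = consistency(clean_syms, quantize(attenuated, codebook))
```

In this toy case the band swap changes which symbol each frame receives but keeps the mapping one-to-one, so it scores as fully consistent, whereas the heavy attenuation merges all frames into a single symbol and scores as fully inconsistent. Choosing mutual information as the score is a design assumption made for the sketch; other measures of one-to-one correspondence would fit the summary's description equally well.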