Evaluation of an Artificial Speech Bandwidth Extension Method in Three Languages

Bibliographic details
Published in: IEEE Transactions on Audio, Speech, and Language Processing, 2008-08, Vol. 16 (6), p. 1124-1137
Main authors: Pulakka, H., Laaksonen, L., Vainio, M., Pohjalainen, J., Alku, P.
Format: Article
Language: English
Description
Summary: Quality and intelligibility of narrowband telephone speech can be improved by artificial bandwidth extension (ABE), which extends the speech bandwidth using only information available in the narrowband speech signal. This paper reports a three-language evaluation of an ABE method that has recently been launched in several of Nokia's mobile telephone models. The method extends the speech bandwidth to frequencies above the telephone band by first utilizing spectral folding and then modifying the magnitude spectrum of the extension band with spline curves. The performance of the method was evaluated by formal listening tests in American English, Russian, and Mandarin Chinese. The results of the listening tests indicate that ABE processing improved the subjective quality of coded narrowband speech in all these languages. Differences between bandwidth-extended American English test sentences and their original wideband counterparts were also evaluated using both an objective distance measure that simulates the characteristics of human hearing and a conventional spectral distortion measure. The average objective error was calculated for different categories of speech sounds. The error was found to be smallest in nasals and semivowels and largest in fricative sounds.
ISSN: 1558-7916, 2329-9290, 1558-7924, 2329-9304
DOI: 10.1109/TASL.2008.925149
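
Illustration: the summary names spectral folding as the first step of the method. The sketch below is not the authors' implementation. It only shows the generic textbook form of spectral folding, assuming an 8 kHz narrowband input that is upsampled to 16 kHz by zero insertion, so that the 0-4 kHz spectrum is mirrored into 4-8 kHz. The function name `spectral_fold`, the sampling rates, and the 1 kHz test tone are assumptions made for this example. The spline-based shaping of the extension band described in the summary is not reproduced here.

```python
import numpy as np

def spectral_fold(x_nb):
    """Upsample by 2 with zero insertion (hypothetical helper).

    Inserting a zero after every sample doubles the sampling rate and
    leaves a mirror image of the original spectrum in the new upper band.
    """
    y = np.zeros(2 * len(x_nb))
    y[::2] = x_nb
    return y

fs_nb = 8000                          # assumed narrowband sampling rate (Hz)
n = 800                               # 0.1 s of signal
t = np.arange(n) / fs_nb
x = np.sin(2 * np.pi * 1000 * t)      # 1 kHz test tone

y = spectral_fold(x)                  # same signal, now at 16 kHz
spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), d=1 / (2 * fs_nb))

# Frequencies carrying significant energy after folding:
# the original tone at 1000 Hz and its mirror image at 7000 Hz.
peaks = freqs[spectrum > 0.5 * spectrum.max()]
print(peaks)
```

In a complete system the mirrored band would still need its magnitude spectrum adjusted, which is the role the summary assigns to the spline curves.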