Speech recognition using cepstral articulatory features

Bibliographic details

Published in: Speech Communication, 2019-02, Vol. 107, pp. 26-37
Authors: Najnin, Shamima; Banerjee, Bonny
Format: Article
Language: English
Online access: Full text
Description
Abstract: Though speech recognition has been widely investigated over the past decades, the role of articulation in recognition has received scant attention. Recognition accuracy increases when recognizers are trained with acoustic features in conjunction with articulatory ones. Traditionally, acoustic features are represented by mel-frequency cepstral coefficients (MFCCs), while articulatory features are represented by the locations or trajectories of the articulators. We propose articulatory cepstral coefficients (ACCs), defined as the cepstral coefficients of the time-location articulatory signal, as features. We show that ACCs yield state-of-the-art results in phoneme classification and recognition on benchmark datasets over a wide range of experiments. The similarity of MFCCs and ACCs, and their superior performance both in isolation and in conjunction, indicates that common algorithms can be used effectively for acoustic and articulatory signals alike.
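
The abstract describes ACCs as ordinary cepstral analysis applied to an articulator's time-location trajectory rather than to the acoustic waveform. The code below is a minimal sketch of that idea, not the authors' implementation: the frame length, hop size, number of coefficients, and the synthetic trajectory are illustrative assumptions, and no filterbank stage is included because the abstract does not specify one.

import numpy as np
from scipy.fftpack import dct

def cepstral_coefficients(signal, frame_len=200, hop=80, n_ceps=13):
    # Frame the 1-D signal, then take the DCT of the log magnitude
    # spectrum of each windowed frame -- the generic cepstral recipe.
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    window = np.hamming(frame_len)
    ceps = np.empty((n_frames, n_ceps))
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        log_spec = np.log(np.abs(np.fft.rfft(frame)) + 1e-10)  # avoid log(0)
        ceps[i] = dct(log_spec, norm='ortho')[:n_ceps]
    return ceps

# Hypothetical articulatory trajectory: one articulator's location over time.
t = np.linspace(0, 1, 2000)
trajectory = np.sin(2 * np.pi * 4 * t) + 0.1 * np.random.randn(t.size)
acc = cepstral_coefficients(trajectory)   # shape: (n_frames, n_ceps)
print(acc.shape)

Because the same framing-plus-cepstrum machinery is used for acoustic MFCCs, this mirroring of the pipeline on the articulatory signal illustrates the abstract's point that common algorithms can serve both signal types.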
ISSN: 0167-6393, 1872-7182
DOI: 10.1016/j.specom.2019.01.002