Modular BDPCA based visual feature representation for lip-reading


Bibliographic Details
Main Authors: Guanyong Wu, Jie Zhu
Format: Conference Proceeding
Language: English
Description
Summary: Most appearance-based visual feature extraction methods in lip-reading systems treat the mouth image as a whole. However, the visual speech process is three-dimensional, and treating the mouth image as a whole may lose speech information. Motivated by the bidirectional PCA (BDPCA) and modular decomposition methods used in the face recognition domain, this paper presents a modular bidirectional PCA (MBDPCA) based visual feature extraction method. In this method, the original mouth image sequences are divided into smaller sub-images, and two approaches to building the covariance matrix are compared: one uses all the sub-image sets together to build a global covariance matrix; the other uses the different sub-image sets independently to build local covariance matrices. BDPCA is then applied to each sub-image set. Experimental results show that the MBDPCA method outperforms both the conventional PCA and BDPCA methods; moreover, further experiments demonstrate that the lip-reading system provides a significant improvement in robustness in noisy environments compared with audio-only speech recognition.
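The block-based pipeline the abstract describes (split each mouth image into sub-images, build a covariance model, apply BDPCA per sub-image) can be sketched roughly as follows. This is a minimal illustration of the global-covariance variant only; the block size, component counts, and function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def bdpca_basis(images, k_row, k_col):
    """Fit BDPCA bases from a set of equally sized 2-D images.

    Returns the mean image and the leading row- and column-direction
    eigenvectors of the two scatter matrices.
    """
    X = np.stack(images).astype(float)      # (n, h, w)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Row- and column-direction scatter matrices (h x h and w x w).
    S_row = sum(x @ x.T for x in Xc) / len(Xc)
    S_col = sum(x.T @ x for x in Xc) / len(Xc)
    _, Wr = np.linalg.eigh(S_row)           # eigh: ascending eigenvalues,
    _, Wc = np.linalg.eigh(S_col)           # so keep the trailing columns
    return mean, Wr[:, -k_row:], Wc[:, -k_col:]

def split_blocks(img, bh, bw):
    """Divide an image into non-overlapping bh x bw sub-images."""
    h, w = img.shape
    return [img[i:i + bh, j:j + bw]
            for i in range(0, h, bh) for j in range(0, w, bw)]

def mbdpca_global(images, bh, bw, k_row, k_col):
    """Global-covariance MBDPCA: pool every sub-image from every frame
    into one training set, then project each frame's blocks."""
    blocks = [b for img in images for b in split_blocks(img, bh, bw)]
    mean, Wr, Wc = bdpca_basis(blocks, k_row, k_col)

    def extract(img):
        # Feature vector: concatenated BDPCA projections of all blocks.
        return np.concatenate(
            [(Wr.T @ (b - mean) @ Wc).ravel()
             for b in split_blocks(img, bh, bw)])
    return extract
```

The local-covariance variant described in the abstract would instead fit one `bdpca_basis` per block position, using only the sub-images from that position across frames.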
ISSN:1522-4880
2381-8549
DOI:10.1109/ICIP.2008.4712008