Audiovisual automatic speech segmentation

Bibliographic Details
Main Authors:	Akdemir, E., Ciloglu, T.
Format:	Conference Proceeding
Language:	English
Subjects:	Conferences; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech processing; Visualization

Description
Summary:	This paper introduces audiovisual speech segmentation, which uses visual information together with audio data. Combining the two modalities lowers the average absolute boundary error between manual and automatic segmentation, which directly affects the quality of speech processing systems that use the segmented database. The audio and visual feature vectors are fused at the feature level and used in an HMM-based speech segmentation system. A Turkish audiovisual speech database was prepared and used in the experiments. Depending on the audiovisual feature vectors used, the average absolute boundary error decreases by up to 20.82%.
ISSN:	2165-0608, 2693-3616
DOI:	10.1109/SIU.2011.5929796
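
The summary names two ideas that a short sketch may make more concrete: feature-level fusion and the average absolute boundary error. The Python below is only an illustration under stated assumptions, not the authors' implementation. It assumes that audio and visual features are already aligned frame by frame, that fusion means concatenating the two vectors of each frame, and that the error is the mean absolute difference between matching manual and automatic boundary times. All function names and all numbers are hypothetical and do not come from the paper.

```python
def fuse_features(audio_frames, visual_frames):
    """Feature-level fusion (assumed here to be per-frame concatenation).

    Both inputs are lists of per-frame feature vectors and are assumed
    to be synchronized, i.e. frame i of each stream covers the same time.
    """
    if len(audio_frames) != len(visual_frames):
        raise ValueError("audio and visual streams must have the same number of frames")
    return [list(a) + list(v) for a, v in zip(audio_frames, visual_frames)]


def average_absolute_boundary_error(manual, automatic):
    """Mean absolute difference between matching boundary times."""
    if len(manual) != len(automatic) or not manual:
        raise ValueError("boundary lists must be non-empty and of equal length")
    return sum(abs(m - a) for m, a in zip(manual, automatic)) / len(manual)


def relative_reduction_percent(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Toy boundary times in milliseconds, invented for illustration only.
    manual = [100, 250, 400]
    audio_only = [110, 240, 420]
    audiovisual = [105, 245, 410]

    baseline = average_absolute_boundary_error(manual, audio_only)
    fused = average_absolute_boundary_error(manual, audiovisual)
    print(baseline, fused, relative_reduction_percent(baseline, fused))
```

In this reading, the 20.82% figure in the summary would correspond to the relative reduction of the error when the audio-only system is replaced by an audiovisual one; the record itself does not state which baseline was used.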