Gaze-contingent asr for spontaneous, conversational speech: An evaluation

There has been little work that attempts to improve the recognition of spontaneous, conversational speech by adding information from a loosely-coupled modality. This study investigated this idea by integrating information from gaze into an ASR system. A probabilistic framework for multimodal recogni...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Cooke, N., Russell, M.
Format:	Tagungsbericht
Sprache:	eng
Schlagworte:	Automatic speech recognition Bayes procedures Error analysis Human computer interaction Laboratories Maximum likelihood decoding Speech analysis Speech recognition User interfaces Visual system Vocabulary
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	There has been little work that attempts to improve the recognition of spontaneous, conversational speech by adding information from a loosely-coupled modality. This study investigated this idea by integrating information from gaze into an ASR system. A probabilistic framework for multimodal recognition was formalised and applied to the specific case of integrating gaze and speech. Gaze-contingent ASR systems were developed from a baseline ASR system by redistributing language model probability mass according to the visual attention. The best performing systems had similar Word Error Rates to the baseline ASR system and showed an increase in keyword spotting accuracy. The key finding was that performance improvements observed were due to increased recognition accuracy for words associated with the visual field but not the current focus of visual attention.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2008.4518639