How productivity improves in hands-free continuous dictation tasks: lessons learned from a longitudinal study

Speech recognition technology continues to improve, but users still experience significant difficulty using the software to create and edit documents. The reported composition speed using speech software is only between 8 and 15 words per minute [Proc CHI 99 (1999) 568; Universal Access Inform Soc 1...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Interacting with computers 2005-05, Vol.17 (3), p.265-289
Hauptverfasser:	Feng, Jinjuan, Karat, Clare-Marie, Sears, Andrew
Format:	Artikel
Sprache:	eng
Schlagworte:	Automatic speech recognition technologies Composition Error correction Software Speech recognition Speech recognition software Time factors
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Speech recognition technology continues to improve, but users still experience significant difficulty using the software to create and edit documents. The reported composition speed using speech software is only between 8 and 15 words per minute [Proc CHI 99 (1999) 568; Universal Access Inform Soc 1 (2001) 4], much lower than people's normal speaking speed of 125–150 words per minute. What causes the huge gap between natural speaking and composing using speech recognition? Is it possible to narrow the gap and make speech recognition more promising to users? In this paper we discuss users' learning processes and the difficulties they experience as related to continuous dictation tasks using state of the art Automatic Speech Recognition (ASR) software. Detailed data was collected for the first time on various aspects of the three activities involved in document composition tasks: dictation, navigation, and correction. The results indicate that navigation and error correction accounted for big chunk of the dictation task during the early stages of interaction. As users gained more experience, they became more efficient at dictation, navigation and error correction. However, the major improvements in productivity were due to dictation quality and the usage of navigation commands. These results provide insights regarding the factors that cause the gap between user expectation with speech recognition software and the reality of use, and how those factors changed with experience. Specific advice is given to researchers as to the most critical issues that must be addressed.
ISSN:	0953-5438 1873-7951
DOI:	10.1016/j.intcom.2004.06.013