Voice search of structured media data

Bibliographic Details
Main Authors: Young-In Song, Ye-Yi Wang, Yun-Cheng Ju, Seltzer, M., Tashev, I., Acero, A.
Format: Conference Proceedings
Language: English
Description
Summary: This paper addresses the problem of using unstructured queries to search a structured database in voice search applications. By incorporating structural information in music metadata, the end-to-end search error has been reduced by 15% on text queries and up to 11% on spoken queries. On top of that, an HMM sequential rescoring model has reduced the error rate by 28% on text queries and up to 23% on spoken queries compared to the baseline system. Furthermore, a phonetic similarity model has been introduced to compensate for speech recognition errors, which has improved the end-to-end search accuracy consistently across different levels of speech recognition accuracy.
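
To make the phonetic-similarity idea concrete, the minimal sketch below ranks candidate song titles against a possibly misrecognized query by comparing phoneme strings. This is an illustration only, not the paper's model: the toy lexicon, the phoneme transcriptions, and the phonetic_similarity scorer are all assumptions made for the example.

# A minimal sketch (assumed, not the paper's model): rank candidate song
# titles against a possibly misrecognized query by comparing phoneme strings.
from difflib import SequenceMatcher

# Toy pronunciation lexicon mapping phrases to phoneme strings (invented for
# this example).
LEXICON = {
    "yes today": "Y EH S T AH D EY",   # plausible recognition error
    "yesterday": "Y EH S T ER D EY",   # intended song title
    "let it be": "L EH T IH T B IY",
}

def phonemes(phrase):
    # Fall back to the raw words if the phrase is not in the toy lexicon.
    return LEXICON.get(phrase, phrase.upper()).split()

def phonetic_similarity(query, candidate):
    # Similarity in [0, 1] between the two phoneme sequences.
    return SequenceMatcher(None, phonemes(query), phonemes(candidate)).ratio()

recognized = "yes today"                 # what the recognizer returned
titles = ["yesterday", "let it be"]      # structured metadata field: song title
ranked = sorted(titles, key=lambda t: phonetic_similarity(recognized, t),
                reverse=True)
print(ranked)                            # 'yesterday' ranks first despite the error

In a real system the scores would come from recognizer output and a learned phonetic confusion model rather than a hand-written lexicon; the point here is only that matching at the phonetic level can recover metadata entries that exact text matching would miss.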
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2009.4960490