DYNAMIC ADAPTATION OF LANGUAGE MODELS AND SEMANTIC TRACKING FOR AUTOMATIC SPEECH RECOGNITION

Bibliographic details
Main authors:	PEREG, Oren; SIVAK, Alexander; RIDER, Tomer; WASSERBLAT, Moshe; TAITE, Shahar; ASSAYAG, Michel
Format:	Patent
Language:	English; French
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	Generally, this disclosure provides systems, devices, methods, and computer-readable media for adaptation of language models and semantic tracking to improve automatic speech recognition (ASR). A system for recognizing phrases of speech from a conversation may include an ASR circuit configured to transcribe a user's speech to a first estimated text sequence, based on a generalized language model. The system may also include a language model matching circuit configured to analyze the first estimated text sequence to determine a context and to select a personalized language model (PLM), from a plurality of PLMs, based on that context. The ASR circuit may further be configured to re-transcribe the speech based on the selected PLM to generate a lattice of paths of estimated text sequences, wherein each of the paths comprises one or more words and an acoustic score associated with each of the words.
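
The two-pass flow described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the context is determined, so the keyword-overlap rule below is an assumption chosen only for illustration, and the names `ScoredWord`, `select_plm`, `recognize`, `transcribe_general`, and `transcribe_with_plm` are hypothetical. The two transcription callables stand in for an ASR engine that the caller would supply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class ScoredWord:
    word: str
    acoustic_score: float  # per-word acoustic score, as named in the summary


# Simplification: a lattice is modeled as a plain list of paths,
# and each path is a list of scored words.
Lattice = List[List[ScoredWord]]


def select_plm(first_pass_text: str, plm_keywords: Dict[str, Set[str]]) -> str:
    """Pick the PLM whose keyword set overlaps most with the first-pass text.

    Keyword overlap is an assumed stand-in for the context analysis;
    ties go to the PLM listed first.
    """
    tokens = set(first_pass_text.lower().split())
    return max(plm_keywords, key=lambda name: len(tokens & plm_keywords[name]))


def recognize(
    audio: bytes,
    transcribe_general: Callable[[bytes], str],
    transcribe_with_plm: Callable[[bytes, str], Lattice],
    plm_keywords: Dict[str, Set[str]],
) -> Tuple[str, Lattice]:
    """Run both passes: general transcription, PLM selection, re-transcription."""
    first_pass = transcribe_general(audio)            # pass 1: generalized model
    plm_name = select_plm(first_pass, plm_keywords)   # context -> selected PLM
    lattice = transcribe_with_plm(audio, plm_name)    # pass 2: selected PLM
    return plm_name, lattice
```

Passing the transcription steps in as callables keeps the sketch independent of any particular ASR toolkit; only the selection step in between is actually implemented here.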