SYSTEMS AND METHODS FOR ADAPTIVE PROPER NAME ENTITY RECOGNITION AND UNDERSTANDING

Bibliographic details
Main author:	Printz, Harry William
Format:	Patent
Language:	English
Subjects:	ACOUSTICS; CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; GYROSCOPIC INSTRUMENTS; MEASURING; MEASURING DISTANCES, LEVELS OR BEARINGS; MUSICAL INSTRUMENTS; NAVIGATION; PHOTOGRAMMETRY OR VIDEOGRAMMETRY; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION; SURVEYING; TESTING

Description
Summary:	A computer-implemented method for recognizing and understanding spoken commands that include one or more proper name entities, comprising: receiving an utterance from a user, performing primary automatic speech recognition (ASR) processing upon said utterance with a primary automatic speech recognizer to output a dataset comprising at least a sequence of nominal transcribed words and putative start and end times for each nominal transcribed word within said utterance, performing understanding processing upon said dataset with a natural language understanding (NLU) processor to generate and augment the dataset with a nominal meaning for the utterance and to determine putative presence and type of one or more spoken proper name entities within said utterance, wherein a contiguous section of audio within said utterance corresponding to each putative proper name entity, as determined from said start and end times of the words of the putative proper name entity as transcribed by the primary automatic speech recognizer, comprises an acoustic span, performing secondary automatic speech recognition (ASR) processing upon each said acoustic span with a secondary automatic speech recognizer, in each instance said secondary automatic speech recognizer specialized to process a given putative type of acoustic span to generate a nominal correct transcription and associated meaning for each said acoustic span, substituting the nominal correct transcription and associated meaning obtained from each secondary recognition as appropriate within the dataset to revise the results of the primary automatic speech recognizer and natural language understanding processor and to create a plurality of complete transcriptions and associated meanings, preparing a complete hypothesis ranking grammar comprised of said plurality of complete transcriptions and decoding the utterance against said complete hypothesis ranking grammar to determine an acoustic confidence score for each complete transcription, determining, for each acoustic span of each complete transcription, an NLU confidence score for each transcription of each acoustic span, normalizing said NLU confidence scores across the plurality of complete transcriptions to determine a normalized NLU confidence score of each complete transcription, combining said acoustic confidence score and NLU confidence score of each complete transcription to generate a final confidence score that each complete transcription and associated
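
Two steps of the summary lend themselves to a small illustration: deriving an acoustic span from the word timings, and turning per-hypothesis scores into a ranking. The Python sketch below is a reading of those steps under stated assumptions, not the patented implementation. The names Word, Hypothesis, acoustic_span, normalized_nlu and rank_hypotheses are hypothetical. The summary does not say how per-span NLU scores are merged, how they are normalized, or how the two confidence scores are combined, so the product, the sum-to-one normalization and the weighted average used here are assumptions chosen for simplicity. All numeric values are made up.

```python
import math
from dataclasses import dataclass


@dataclass
class Word:
    """One nominal transcribed word with putative timing in seconds."""
    text: str
    start: float
    end: float


@dataclass
class Hypothesis:
    """One complete transcription with illustrative scores."""
    text: str
    acoustic_conf: float   # assumed to come from decoding against the ranking grammar
    span_nlu_confs: list   # one NLU confidence per acoustic span


def acoustic_span(words, first, last):
    """Return (start, end) of the contiguous audio covering words[first..last]."""
    return (words[first].start, words[last].end)


def normalized_nlu(hypotheses):
    """Normalize NLU scores across hypotheses so they sum to 1.

    Assumption: the raw score of a hypothesis is the product of its
    per-span scores. The summary does not specify this.
    """
    if not hypotheses:
        return []
    raw = [math.prod(h.span_nlu_confs) for h in hypotheses]
    total = sum(raw)
    if total == 0:
        # No NLU evidence at all: fall back to a uniform distribution.
        return [1.0 / len(hypotheses)] * len(hypotheses)
    return [r / total for r in raw]


def rank_hypotheses(hypotheses, acoustic_weight=0.5):
    """Combine acoustic and normalized NLU scores and sort best-first.

    Assumption: a weighted average stands in for the unspecified
    combination rule.
    """
    nlu = normalized_nlu(hypotheses)
    scored = [
        (h.text, acoustic_weight * h.acoustic_conf + (1 - acoustic_weight) * n)
        for h, n in zip(hypotheses, nlu)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    words = [Word("call", 0.0, 0.4), Word("john", 0.4, 0.8), Word("smith", 0.8, 1.3)]
    # The putative name covers words 1..2, so the span runs from 0.4 s to 1.3 s.
    print(acoustic_span(words, 1, 2))
    candidates = [
        Hypothesis("call john smith", 0.6, [0.5]),
        Hypothesis("call jon smyth", 0.7, [0.3]),
        Hypothesis("call joan smith", 0.2, [0.2]),
    ]
    for text, score in rank_hypotheses(candidates):
        print(f"{score:.2f}  {text}")
```

In this made-up example the second candidate has the highest acoustic score, yet the first candidate ranks highest once the normalized NLU score is taken into account, which is the kind of re-ranking the summary describes.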