FULLY MANAGED AND CONTINUOUSLY TRAINED AUTOMATIC SPEECH RECOGNITION SERVICE

Bibliographic details
Main authors:	SURESH, Deepikaa; SIVASUBRAMANIAN, Swaminathan; ANBAZHAGAN, Vikram Sathyanarayana; GULABANI, Rajkumar; ZHUKOV, Vladimir; PHILOMIN, Vasanth; AKARAPU, Praveen Kumar; STEFANI, Stefano; SINGH, Ashish
Format:	Patent
Language:	English; French
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	Techniques for automated speech recognition (ASR) are described. A user can upload an audio file to a storage location. The user then provides the ASR service with a reference to the audio file. An ASR engine analyzes the audio file, using an acoustic model to divide the audio data into words, and a language model to identify the words spoken in the audio file. The acoustic model can be trained using audio sentence data, enabling the transcription service to accurately transcribe lengthy audio data. The results are punctuated and normalized, and the resulting transcript is returned to the user.
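
Illustration (not part of the patent record): the flow in the summary can be sketched as a toy pipeline, assuming an in-memory storage stand-in and trivial placeholder models. All names here (ToyStorage, segment, identify, postprocess, transcribe) are hypothetical, and the "audio" is a list of characters with "_" marking a pause rather than real audio; the patent's actual acoustic and language models are not reproduced.

```python
from dataclasses import dataclass, field

SILENCE = "_"                              # assumed marker for a pause between words
VOCABULARY = {"call", "me", "at", "five"}  # toy vocabulary for the language stage
NUMBER_WORDS = {"five": "5"}               # toy normalization table


@dataclass
class ToyStorage:
    """Hypothetical in-memory stand-in for the storage location."""
    objects: dict = field(default_factory=dict)

    def upload(self, key, audio):
        self.objects[key] = audio
        return key  # the reference the user later hands to the service


def segment(audio):
    """Acoustic stage stand-in: split frames into word-sized segments."""
    segments, current = [], []
    for frame in audio:
        if frame == SILENCE:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(frame)
    if current:
        segments.append(current)
    return segments


def identify(segments):
    """Language stage stand-in: map each segment to a known word."""
    words = []
    for seg in segments:
        candidate = "".join(seg)
        words.append(candidate if candidate in VOCABULARY else "<unk>")
    return words


def postprocess(words):
    """Normalize number words, then capitalize and add a final period."""
    normalized = [NUMBER_WORDS.get(w, w) for w in words]
    if not normalized:
        return ""
    text = " ".join(normalized)
    return text[0].upper() + text[1:] + "."


def transcribe(storage, reference):
    """Resolve the reference, run both stages, and return the transcript."""
    audio = storage.objects[reference]
    return postprocess(identify(segment(audio)))


storage = ToyStorage()
ref = storage.upload("memo-1", list("call_me_at_five"))
print(transcribe(storage, ref))  # prints "Call me at 5."
```

The service receives only the reference returned by upload, not the audio itself, which mirrors the upload-then-reference order described in the summary.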