METHOD AND SYSTEM FOR PERFORMING DOMAIN ADAPTATION OF END-TO-END AUTOMATIC SPEECH RECOGNITION MODEL

Detailed description

Bibliographic details
Main authors:	Kilpikoski, Juho; Pylkkönen, Janne; Ukkonen, Antti; Heikinheimo, Hannes; Tamminen, Samu
Format:	Patent
Language:	eng; fre; ger
Subjects:	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	Disclosed is a method for performing domain adaptation of an end-to-end (E2E) automatic speech recognition (ASR) model. The method comprises: obtaining an un-adapted version of the E2E ASR model trained using a first set of transcriptions, the un-adapted version of the E2E ASR model comprising an encoder network (402), a first prediction network and a joint network (406); using the first set of transcriptions, while keeping the parameters of the first prediction network fixed, to train a language model output component (410) to match the first prediction network; using a second set of transcriptions, while keeping the parameters of the language model output component fixed, to fine-tune the first prediction network to obtain a second prediction network (404); and generating an adapted version of the E2E ASR model (400), wherein the adapted version of the E2E ASR model comprises the encoder network, the second prediction network, the language model output component, and the joint network.
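
The two training stages named in the summary can be illustrated with a toy sketch. It is not the patented implementation: the prediction network is reduced to one state vector per previous token, the language model output component to a linear layer with softmax, and the encoder and joint network are left out because both stages leave them untouched. The first prediction network is randomly initialised here as a stand-in for one taken from a trained model. All names (`adapt`, `sgd_epoch`, `bigrams`, the toy token ids, the learning rate and epoch count) are assumptions made for illustration only.

```python
import copy
import math
import random


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def bigrams(transcriptions):
    """(previous token, next token) pairs from token-id transcriptions."""
    return [(s[i], s[i + 1]) for s in transcriptions for i in range(len(s) - 1)]


def logits_for(pred, head, prev):
    """LM output component applied to the prediction-network state."""
    return [sum(w * h for w, h in zip(row, pred[prev])) for row in head]


def next_token_loss(pred, head, pairs):
    """Mean cross-entropy of the next token given the previous one."""
    total = sum(math.log(softmax(logits_for(pred, head, p))[n]) for p, n in pairs)
    return -total / len(pairs)


def sgd_epoch(pred, head, pairs, lr, train_pred, train_head):
    """One SGD pass; only the component flagged as trainable is updated."""
    for prev, nxt in pairs:
        probs = softmax(logits_for(pred, head, prev))
        grad = [p - (1.0 if k == nxt else 0.0) for k, p in enumerate(probs)]
        state = pred[prev]
        # Gradient w.r.t. the state is computed before any weights change.
        d_state = [sum(grad[k] * head[k][j] for k in range(len(head)))
                   for j in range(len(state))]
        if train_head:
            for k, row in enumerate(head):
                for j in range(len(row)):
                    row[j] -= lr * grad[k] * state[j]
        if train_pred:
            for j in range(len(state)):
                state[j] -= lr * d_state[j]


def adapt(first_set, second_set, vocab=4, hidden=8, epochs=300, lr=0.05, seed=0):
    rng = random.Random(seed)
    # Assumption: random stand-in for the un-adapted first prediction network.
    first_pred = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(vocab)]
    lm_output = [[0.0] * hidden for _ in range(vocab)]

    # Stage 1: first set of transcriptions, prediction network fixed,
    # LM output component trained to match it.
    for _ in range(epochs):
        sgd_epoch(first_pred, lm_output, bigrams(first_set), lr,
                  train_pred=False, train_head=True)

    # Stage 2: second set of transcriptions, LM output component fixed,
    # prediction network fine-tuned into the second prediction network.
    second_pred = copy.deepcopy(first_pred)
    for _ in range(epochs):
        sgd_epoch(second_pred, lm_output, bigrams(second_set), lr,
                  train_pred=True, train_head=False)

    return {"first_prediction": first_pred,
            "second_prediction": second_pred,
            "lm_output": lm_output}


first_set = [[0, 1, 2, 3], [0, 1, 2, 3]]    # toy general-domain token ids
second_set = [[0, 2, 1, 3], [0, 2, 1, 3]]   # toy target-domain token ids
adapted = adapt(first_set, second_set)
```

In this sketch the adapted model would pair `adapted["second_prediction"]` with the unchanged `adapted["lm_output"]`; comparing `next_token_loss` on `bigrams(second_set)` for the first and second prediction networks gives a rough check that the fine-tuning stage moved the model toward the second set of transcriptions.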