Transducer-based streaming deliberation for cascaded encoders

Bibliographic details
Main authors:	STROMAN TREVOR, SAINATH TARA N, NARAYANAN ARUN, PANG RUOMING, HU KE
Format:	Patent
Language:	Chinese; English
Subjects:	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	A method (400) includes receiving a sequence of acoustic frames (110) and generating, by a first encoder (210), a first higher-order feature representation (212) for a corresponding acoustic frame in the sequence of acoustic frames. The method further includes generating, by a first-pass transducer decoder (201), a first-pass speech recognition hypothesis (120a) for the corresponding first higher-order feature representation, and generating, by a text encoder (240), a text encoding (242) for the corresponding first-pass speech recognition hypothesis. The method also includes generating, by a second encoder (220), a second higher-order feature representation (222) for the corresponding first higher-order feature representation. The method further includes generating, by a second-pass transducer decoder (202), a second-pass speech recognition hypothesis (120b) using the corresponding second higher-order feature representation and the corresponding text encoding.
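
The data flow in the abstract can be illustrated with a minimal Python sketch. It only wires the five components together in the order the abstract states; it is not the patented implementation. The class name, the function names, and the toy stand-in components are all hypothetical, chosen so the example runs without any speech model. The reference numerals in the comments map each value back to the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


@dataclass
class CascadedTwoPassPipeline:
    """Hypothetical wiring of the components named in the abstract."""
    first_encoder: Callable[[Vector], Vector]                # (210)
    first_pass_decoder: Callable[[Vector], str]              # (201)
    text_encoder: Callable[[str], Vector]                    # (240)
    second_encoder: Callable[[Vector], Vector]               # (220)
    second_pass_decoder: Callable[[Vector, Vector], str]     # (202)

    def step(self, frame: Vector) -> Tuple[str, str]:
        feat1 = self.first_encoder(frame)                # first representation (212)
        hyp1 = self.first_pass_decoder(feat1)            # first-pass hypothesis (120a)
        text_out = self.text_encoder(hyp1)               # text encoder output (242)
        feat2 = self.second_encoder(feat1)               # second representation (222)
        hyp2 = self.second_pass_decoder(feat2, text_out) # second-pass hypothesis (120b)
        return hyp1, hyp2

    def run(self, frames: Sequence[Vector]) -> List[Tuple[str, str]]:
        # Process the acoustic frames (110) one at a time, in order.
        return [self.step(frame) for frame in frames]


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_first_encoder(frame: Vector) -> Vector:
    return [2.0 * x for x in frame]


def toy_first_pass_decoder(feat: Vector) -> str:
    return "a" if sum(feat) > 0 else "b"


def toy_text_encoder(hyp: str) -> Vector:
    return [float(ord(ch)) for ch in hyp]


def toy_second_encoder(feat: Vector) -> Vector:
    return [x + 1.0 for x in feat]


def toy_second_pass_decoder(feat: Vector, text_out: Vector) -> str:
    # Uses both the acoustic and the text-derived input, as in the abstract.
    return "A" if sum(feat) + sum(text_out) > 100 else "B"


pipeline = CascadedTwoPassPipeline(
    toy_first_encoder,
    toy_first_pass_decoder,
    toy_text_encoder,
    toy_second_encoder,
    toy_second_pass_decoder,
)
results = pipeline.run([[1.0, 2.0], [-1.0, -2.0]])
print(results)
```

The point of the sketch is the dependency structure: the second encoder consumes the output of the first encoder rather than the raw frame, and the second-pass decoder takes two inputs, one acoustic and one derived from the first-pass hypothesis.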