Improved pre-training method, electronic equipment and storage medium

Bibliographic Details
Main Authors: YANG GUANROU, MA ZIYANG, YU KAI, CHEN XIE, ZHENG ZHISHENG
Format: Patent
Language: Chinese; English
Description
Abstract: The invention discloses an improved pre-training method, an electronic device and a storage medium. The pre-training method is used for a pre-training model comprising a unit generation module and a backbone network, and comprises the steps of: down-sampling a speech signal through the backbone network to obtain a first speech representation; masking part of the first speech representation to obtain a second speech representation; aggregating the second speech representation to obtain an output speech representation; and, for the masked part of the output speech representation, computing a loss against the discrete targets extracted by the unit generation module and back-propagating gradients through the backbone network. According to an embodiment of the invention, a framework for improving self-supervised speech representation learning through an unsupervised algorithm is provided, and the training target of self-supervised learning is optimized; the e…
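
The pipeline described in the abstract follows a masked-prediction recipe: down-sample the waveform into a first representation, mask part of it to form a second representation, aggregate it into an output representation, and compute a loss on the masked positions against discrete unit targets. The sketch below is a rough illustration of such a step in PyTorch only; every name (BackboneSketch, masked_prediction_loss), every layer size, and the random stand-in targets are assumptions for illustration, and the patent's actual backbone, masking policy, and unit generation module are not specified in this abstract and are not reproduced here.

# Minimal sketch of a masked-prediction pre-training step, assuming a
# HuBERT-style setup. All module names, shapes, and hyperparameters are
# illustrative assumptions, not the patent's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BackboneSketch(nn.Module):
    """Hypothetical backbone: a convolutional down-sampler followed by a
    Transformer encoder that aggregates the partially masked representation."""

    def __init__(self, n_units: int = 100, dim: int = 256):
        super().__init__()
        # Down-sampling front end: raw waveform -> first speech representation.
        self.downsample = nn.Sequential(
            nn.Conv1d(1, dim, kernel_size=10, stride=5), nn.GELU(),
            nn.Conv1d(dim, dim, kernel_size=8, stride=4), nn.GELU(),
        )
        # Learned embedding that replaces masked frames.
        self.mask_embedding = nn.Parameter(torch.zeros(dim))
        # Context network that aggregates the masked representation.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Projection onto the discrete-unit vocabulary of the unit generator.
        self.head = nn.Linear(dim, n_units)

    def forward(self, wav: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # 1) Down-sample the waveform -> first speech representation (B, T, D).
        first_repr = self.downsample(wav.unsqueeze(1)).transpose(1, 2)
        # 2) Mask part of it -> second speech representation.
        second_repr = torch.where(mask.unsqueeze(-1), self.mask_embedding, first_repr)
        # 3) Aggregate -> output speech representation, then predict unit logits.
        output_repr = self.encoder(second_repr)
        return self.head(output_repr)


def masked_prediction_loss(logits, units, mask):
    """Cross-entropy on the masked positions only, against the discrete
    targets extracted off-line by a unit generation module (stand-ins here)."""
    return F.cross_entropy(logits[mask], units[mask])


if __name__ == "__main__":
    torch.manual_seed(0)
    model = BackboneSketch()
    wav = torch.randn(2, 16000)                    # one second of 16 kHz audio each
    with torch.no_grad():
        frames = model.downsample(wav.unsqueeze(1)).shape[-1]
    units = torch.randint(0, 100, (2, frames))     # stand-in unit-generator targets
    mask = torch.rand(2, frames) < 0.5             # randomly mask half of the frames
    loss = masked_prediction_loss(model(wav, mask), units, mask)
    loss.backward()                                # gradients flow through the backbone
    print(f"masked-prediction loss: {loss.item():.4f}")

Calling loss.backward() on the masked-position loss propagates gradients through the encoder and the down-sampling front end, which corresponds to the gradient back-propagation in the backbone network that the abstract describes.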