Pre-training method of multi-modal universal model, speech recognition method and related device
Format: | Patent |
---|---|
Language: | Chinese ; English |
Summary: | The invention provides a pre-training method for a multi-modal universal model, a speech recognition method, and a related device. The multi-modal universal model is trained on data of different modalities, which improves its generality for downstream tasks with multi-modal input and its speech recognition accuracy. The model parameters are adjusted using the distance between the data features of the data in a homologous data set as the training objective, so that the model arrives at the same understanding of data that differ in modality but describe the same or similar content. This improves the accuracy of prediction results for downstream tasks with multi-modal input and the model's ability to solve such tasks. |
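
The distance-based objective described in the summary can be illustrated with a minimal sketch. It assumes that a homologous data set consists of pairs of samples from two modalities that describe the same content, that each modality has its own linear encoder, and that the objective is to minimize the mean squared Euclidean distance between paired features. The record does not state any of these details; the function names, toy data, and learning rate are hypothetical.

```python
# Hypothetical sketch: the linear encoders, the squared-Euclidean loss, and all
# names and values below are illustrative assumptions, not taken from the patent.

def project(weights, x):
    """Linear encoder: map an input vector x to a feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def alignment_loss(w_a, w_b, pairs):
    """Mean squared distance between the features of each homologous pair."""
    total = 0.0
    for x_a, x_b in pairs:
        f_a, f_b = project(w_a, x_a), project(w_b, x_b)
        total += sum((p - q) ** 2 for p, q in zip(f_a, f_b))
    return total / len(pairs)

def train_step(w_a, w_b, pairs, lr=0.05):
    """One gradient-descent step on both encoders; returns updated weights."""
    g_a = [[0.0] * len(row) for row in w_a]
    g_b = [[0.0] * len(row) for row in w_b]
    for x_a, x_b in pairs:
        f_a, f_b = project(w_a, x_a), project(w_b, x_b)
        for i, (p, q) in enumerate(zip(f_a, f_b)):
            d = 2.0 * (p - q) / len(pairs)  # d(loss)/d(f_a[i]); negated for f_b[i]
            for j, xj in enumerate(x_a):
                g_a[i][j] += d * xj
            for j, xj in enumerate(x_b):
                g_b[i][j] -= d * xj
    new_a = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(w_a, g_a)]
    new_b = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(w_b, g_b)]
    return new_a, new_b

# Toy homologous pairs: (3-dim "audio" input, 2-dim "text" input), same content.
pairs = [([1.0, 0.0, 0.5], [0.5, 1.0]), ([0.0, 1.0, 0.5], [1.0, 0.0])]
w_a = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.2]]  # audio encoder, 2 x 3
w_b = [[-0.3, 0.1], [0.6, 0.2]]             # text encoder, 2 x 2

loss_before = alignment_loss(w_a, w_b, pairs)
for _ in range(50):
    w_a, w_b = train_step(w_a, w_b, pairs)
loss_after = alignment_loss(w_a, w_b, pairs)
print(loss_before, loss_after)
```

A pure distance objective like this one can be driven to zero by mapping every input to the same point, so practical systems usually combine it with another training signal. Whether and how the patent does so is not stated in this record.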