MT5 language model optimization method and device, medium and equipment

Bibliographic details
Main authors: ZHANG ZHENG, GUO DONGSHENG, YUE AIZHEN, DUAN QIANG, JIANG KAI
Format: Patent
Language: Chinese; English
Description
Summary: The invention provides a method, device, medium and equipment for optimizing an MT5 language model. The method adds at least one convolutional layer to the encoder of the MT5 language model, so that the encoder extracts text features through the convolutional layer or layers. The output of each convolutional layer is the input of the next, and the output of the last convolutional layer is the input of the model's decoder. The modified model is then trained on a data set to obtain an optimized MT5 language model for use in abstract (summary) generation. Adding convolutional layers improves the encoder's feature extraction capability, so the decoder receives better information for generating the abstract.
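
The chaining of convolutional layers described in the summary can be illustrated with a minimal, dependency-free Python sketch. It is not the patented implementation: the record gives no kernel size, padding, activation or channel count, so the zero padding, the ReLU activation, and the names `conv1d_same` and `encode_with_conv_stack` are all assumptions made for illustration. The sketch also assumes the convolutions are applied to the encoder's per-token hidden states, and it omits training and the decoder itself.

```python
from typing import List

Vector = List[float]
Sequence = List[Vector]            # one feature vector per token
Kernel = List[List[List[float]]]   # kernel[tap][in_channel][out_channel]


def conv1d_same(seq: Sequence, kernel: Kernel) -> Sequence:
    """1-D convolution along the token axis, zero-padded so that the
    output has one vector per input token. ReLU is an assumed activation."""
    taps = len(kernel)
    in_dim = len(kernel[0])
    out_dim = len(kernel[0][0])
    pad = taps // 2
    out: Sequence = []
    for t in range(len(seq)):
        acc = [0.0] * out_dim
        for k in range(taps):
            src = t + k - pad
            if 0 <= src < len(seq):  # positions outside the text count as zeros
                for i in range(in_dim):
                    for o in range(out_dim):
                        acc[o] += seq[src][i] * kernel[k][i][o]
        out.append([max(0.0, a) for a in acc])
    return out


def encode_with_conv_stack(encoder_states: Sequence,
                           kernels: List[Kernel]) -> Sequence:
    """Feed each layer's output into the next layer; the output of the
    last layer is what would be handed to the decoder."""
    features = encoder_states
    for kernel in kernels:
        features = conv1d_same(features, kernel)
    return features


# Hypothetical toy input: three tokens with one feature each, and two
# identical three-tap summing kernels standing in for learned weights.
states = [[1.0], [2.0], [3.0]]
sum_kernel = [[[1.0]], [[1.0]], [[1.0]]]
decoder_input = encode_with_conv_stack(states, [sum_kernel, sum_kernel])
```

In a real model the kernels would be learned jointly with the rest of the network during the training step named in the summary, and a tensor library's convolution layer would replace the explicit loops.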