MT5 language model optimization method and device, medium and equipment
Format: | Patent |
---|---|
Language: | Chinese; English |
Summary: | The invention provides a method, apparatus, medium, and device for optimizing an MT5 language model. The method comprises two steps. First, at least one convolutional layer is added to the encoder of the MT5 language model so that the encoder extracts text features through the added layer or layers; the output of each convolutional layer is the input of the next, and the output of the last convolutional layer is the input of the decoder of the MT5 language model. Second, the MT5 language model is trained on a data set to obtain an optimized MT5 language model for use in abstract (summary) generation. Adding convolutional layers to the encoder improves its feature-extraction capability, so the decoder receives better information for generating an abstract.
本发明提供一种MT5语言模型优化方法及装置、介质、设备。方法包括:在MT5语言模型的编码器中增加至少一个卷积层 |
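The layer chaining described in the summary can be illustrated with a minimal, dependency-free sketch. It is not the patent's implementation: the record does not say where in the encoder the convolutional layers sit, which kernel width or activation is used, or how padding is handled. The sketch assumes the layers are applied to the encoder's per-token hidden states, with zero padding and ReLU, and all function names (`conv1d_same`, `encode_with_conv_stack`) are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
States = List[Vector]            # one feature vector per token position
Kernel = List[List[List[float]]]  # kernel[k][i][o]: offset k, input i, output o


def conv1d_same(x: States, kernel: Kernel, bias: Vector) -> States:
    """1-D convolution along the token axis, followed by ReLU.

    Assumption: zero padding keeps the sequence length unchanged, so the
    decoder still sees one feature vector per input token.
    """
    width = len(kernel)
    pad = width // 2
    in_dim = len(kernel[0])
    out_dim = len(bias)
    out: States = []
    for t in range(len(x)):
        row = list(bias)
        for k in range(width):
            src = t + k - pad
            if 0 <= src < len(x):  # positions outside the sequence count as zeros
                for i in range(in_dim):
                    for o in range(out_dim):
                        row[o] += x[src][i] * kernel[k][i][o]
        out.append([max(0.0, v) for v in row])  # ReLU (an assumed activation)
    return out


def encode_with_conv_stack(
    encoder_states: States, conv_layers: List[Tuple[Kernel, Vector]]
) -> States:
    """Chain the layers: each layer's output is the next layer's input.

    The returned features are what would be handed to the decoder.
    """
    features = encoder_states
    for kernel, bias in conv_layers:
        features = conv1d_same(features, kernel, bias)
    return features


if __name__ == "__main__":
    # Toy hidden states for three tokens with two features each.
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    # Width-3 kernel that sums each feature over a token and its neighbours.
    summing_kernel = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(3)]
    layers = [(summing_kernel, [0.0, 0.0]), (summing_kernel, [0.0, 0.0])]
    print(encode_with_conv_stack(states, layers))
```

In a real model the kernels would be trainable parameters updated together with the rest of the MT5 weights during the training step the summary mentions; here they are fixed toy values chosen only to make the data flow easy to follow.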