Unified pre-training method for generating Chinese questions based on word lattices and relative position embedding
Saved in:
Main authors: | , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
Abstract: The invention discloses a unified pre-training method for generating Chinese questions based on word lattices and relative position embeddings. The method comprises the following steps: performing domain pre-training on RoBERTa parameters; generating a target-domain dictionary quickly and accurately in a semi-supervised, semi-manual fashion; fusing, according to the dictionary, the relative position information of input characters and words into a Transformer layer; carrying out task pre-training of the newly built Transformer layer on a large amount of open-domain question-answering data; and training the model and running inference to generate questions. Relative position information for each lattice entry and each domain-vocabulary word is added to the model input, so the model not only learns richer positional relations but also performs better on question generation for target-domain inputs. Domain pre-training and task pre-training are further applied to the model to strengthen its inference ability in the target domain.
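The abstract gives no implementation details, but the core of the third step, biasing Transformer self-attention with the relative span positions of both characters and dictionary-matched words, can be sketched roughly in the style of lattice Transformers such as FLAT. The following is a minimal illustration, not the patent's actual implementation; the class, parameter, and tensor names (`LatticeRelativeAttention`, `max_dist`, `head`, `tail`) are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatticeRelativeAttention(nn.Module):
    """Multi-head self-attention whose scores are biased by learned
    embeddings of the relative distances between token spans. Each input
    token (a character, or a word matched from the domain dictionary)
    carries a head (start) and tail (end) position in the raw sentence."""

    def __init__(self, d_model: int, n_heads: int, max_dist: int = 128):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.max_dist = max_dist
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Per-head biases for head-head and tail-tail distances. (FLAT-style
        # models use four distance types; two are kept here for brevity.)
        self.rel_head = nn.Embedding(2 * max_dist + 1, n_heads)
        self.rel_tail = nn.Embedding(2 * max_dist + 1, n_heads)

    def _bias(self, dist: torch.Tensor, table: nn.Embedding) -> torch.Tensor:
        # Clip distances into the embedding range, look up per-head biases,
        # and move the head dimension next to the batch dimension.
        idx = dist.clamp(-self.max_dist, self.max_dist) + self.max_dist
        return table(idx).permute(0, 3, 1, 2)  # (B, H, L, L)

    def forward(self, x, head, tail, mask=None):
        # x:    (B, L, d_model) embeddings of characters + matched words
        # head: (B, L) span start of each token in the character sequence
        # tail: (B, L) span end of each token (head == tail for characters)
        B, L, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, L, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(B, L, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(B, L, self.n_heads, self.d_head).transpose(1, 2)

        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (B, H, L, L)
        scores = scores + self._bias(
            head.unsqueeze(-1) - head.unsqueeze(-2), self.rel_head)
        scores = scores + self._bias(
            tail.unsqueeze(-1) - tail.unsqueeze(-2), self.rel_tail)

        if mask is not None:  # mask: (B, L), 1 = real token, 0 = padding
            scores = scores.masked_fill(mask[:, None, None, :] == 0,
                                        float("-inf"))
        attn = F.softmax(scores, dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(B, L, -1)
        return self.out(ctx)


# Hypothetical usage: 7 characters plus 2 dictionary-matched words, each
# word spanning several character positions, fed in as one flat sequence.
layer = LatticeRelativeAttention(d_model=768, n_heads=12)
x = torch.randn(1, 9, 768)                        # 7 chars + 2 words
head = torch.tensor([[0, 1, 2, 3, 4, 5, 6, 0, 3]])
tail = torch.tensor([[0, 1, 2, 3, 4, 5, 6, 2, 6]])
y = layer(x, head, tail)                          # (1, 9, 768)
```

In the pipeline the abstract describes, such a layer would presumably be initialized from the domain pre-trained RoBERTa weights, task pre-trained on open-domain question-answering pairs, and then fine-tuned for target-domain question generation.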