Method for constructing a domain-specific Chinese language pre-training model


Bibliographic details
Main authors:	LIU HUAILIANG, XU JIAJUN, ZHANG YUZHEN, ZHANG SHANZHUANG, ZHAO JIANBO
Format:	Patent
Language:	chi ; eng
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	The invention discloses a method for constructing a domain-specific Chinese language pre-training model. The method comprises the following steps: constructing an entity lexicon for the domain, the lexicon comprising entity words and entity relationships; obtaining training text data and applying mask processing and word-vector embedding to it to obtain a corresponding word vector sequence; and training a RoBERTa model, built on a multilayer Transformer, on the word vector sequence, the spatial-relation position encoding sequence, and the forward (positive) and reverse (negative) orderings of two sentences containing the entity relation. The trained pre-training model is then connected to a corresponding downstream task to transfer it to that task. According to the abstract, the method extracts domain knowledge effectively and improves semantic understanding of text in specialized domains.
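The mask-processing step can be illustrated with a small Python sketch. The abstract does not state how masking is performed, so everything here is an assumption: that whole entities from the lexicon are masked together, that tokens are single characters, and that a masking probability of 0.15 is used. The names `find_entity_spans`, `mask_entities`, and `MASK` are hypothetical and do not come from the patent.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, following BERT-style conventions


def find_entity_spans(text, lexicon):
    """Greedy longest-match scan; returns (start, end) spans of lexicon entries."""
    max_len = max((len(word) for word in lexicon), default=0)
    spans, i = [], 0
    while i < len(text):
        match_end = None
        # Try the longest candidate first so overlapping entries resolve to the longest.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                match_end = i + length
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans


def mask_entities(text, lexicon, mask_prob=0.15, rng=None):
    """Mask every character of a selected entity (assumed strategy, not the patent's)."""
    rng = rng or random.Random()
    tokens = list(text)            # character-level tokens (an assumption)
    labels = [None] * len(tokens)  # original characters at masked positions
    for start, end in find_entity_spans(text, lexicon):
        if rng.random() < mask_prob:
            for k in range(start, end):
                labels[k] = tokens[k]
                tokens[k] = MASK
    return tokens, labels


if __name__ == "__main__":
    lexicon = {"糖尿病", "胰岛素"}  # toy medical lexicon, for illustration only
    tokens, labels = mask_entities("糖尿病患者需要注射胰岛素", lexicon, mask_prob=1.0)
    print(tokens)
```

With `mask_prob=1.0` both entities are masked in full, which makes the behavior easy to inspect; a real pre-training run would use a much lower probability and would also need the position encoding and sentence-order inputs that the summary mentions.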