Method for constructing field Chinese language pre-training model

Bibliographic details
Main authors: LIU HUAILIANG, XU JIAJUN, ZHANG YUZHEN, ZHANG SHANZHUANG, ZHAO JIANBO
Format: Patent
Language: chi; eng
Description
Summary: The invention discloses a method for constructing a Chinese language pre-training model for a specific field. The method comprises the following steps: constructing an entity lexicon of the field, the entity lexicon comprising entity words and entity relationships; obtaining training text data, and performing mask processing and word vector embedding on the training text data to obtain a corresponding word vector sequence; training a RoBERTa model based on a multilayer Transformer model on the word vector sequence, the spatial relation position coding sequence, and the positive and negative sequences of the two sentences containing the entity relationship, to obtain a trained pre-training model; and connecting the pre-training model to a corresponding downstream task, so that transfer to the downstream task is achieved. With this method, knowledge of a professional field can be extracted effectively and semantic understanding of professional-field text can be improved.
本发明公开了一种领域中文语言预训练模型构建的方法,其包括:构建领域的实体词库,所述实体词库包括
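The mask-processing step could be sketched roughly as follows. The summary does not say how the entity lexicon drives masking, so this is only an illustration under stated assumptions: lexicon entries are matched by longest-prefix lookup over characters, and a matched entity is masked as a whole unit. The names `find_entity_spans`, `mask_entities`, and `mask_prob`, and the 0.15 default, are hypothetical and do not come from the patent.

```python
import random

MASK = "[MASK]"


def find_entity_spans(text, lexicon):
    """Return non-overlapping (start, end) spans of lexicon entities.

    Assumption: greedy left-to-right matching, longest entry first.
    """
    spans = []
    max_len = max((len(word) for word in lexicon), default=0)
    i = 0
    while i < len(text):
        match_end = None
        # Try the longest candidate first so longer entities win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                match_end = i + length
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans


def mask_entities(text, lexicon, mask_prob=0.15, rng=None):
    """Mask whole entities at character level.

    Returns (tokens, labels): labels hold the original character at
    masked positions and None elsewhere. mask_prob is an assumed value.
    """
    rng = rng or random.Random()
    tokens = list(text)
    labels = [None] * len(tokens)
    for start, end in find_entity_spans(text, lexicon):
        if rng.random() < mask_prob:
            for k in range(start, end):
                labels[k] = tokens[k]
                tokens[k] = MASK
    return tokens, labels
```

For instance, `mask_entities("高血压患者常伴有糖尿病", {"高血压", "糖尿病"}, mask_prob=1.0)` masks both entities in full and leaves the remaining characters untouched. Masking an entity as a unit, rather than single characters, is one plausible way to make the model predict field terms from context; the patent may do this differently.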