Method for constructing field Chinese language pre-training model
Format: | Patent |
---|---|
Language: | Chinese ; English |
Summary: | The invention discloses a method for constructing a Chinese language pre-training model for a specific field (domain). The method comprises the following steps: constructing an entity lexicon for the field, the entity lexicon comprising entity words and entity relationships; obtaining training text data and applying mask processing and word-vector embedding to it to obtain a corresponding word vector sequence; training a RoBERTa model, built on a multi-layer Transformer, on the word vector sequence, the spatial-relation position encoding sequence, and the positive-order and negative-order sequences of two sentences that contain an entity relationship, to obtain a trained pre-training model; and connecting the pre-training model to a corresponding downstream task so that it is transferred to that task. The method can effectively extract specialized field knowledge and improve semantic understanding in specialized fields. |
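
The abstract mentions an entity lexicon and a mask-processing step but does not say how the two interact. The following is a minimal Python sketch of one plausible reading, in which whole entity words found in the lexicon are masked as a unit. The function names, the `[MASK]` token, the character-level tokenization, the greedy longest-match lookup, and the `mask_prob` parameter are all illustrative assumptions, not details taken from the patent.

```python
import random

MASK = "[MASK]"  # assumed mask token; the patent abstract does not name one


def find_entity_spans(text, lexicon):
    """Return (start, end) spans of lexicon entries, using greedy longest match.

    Hypothetical helper: the abstract does not describe how entities are located.
    """
    max_len = max((len(word) for word in lexicon), default=0)
    spans = []
    i = 0
    while i < len(text):
        match_end = None
        # Try the longest candidate first so longer entity words win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                match_end = i + length
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans


def mask_entities(text, lexicon, mask_prob=0.15, rng=None):
    """Mask whole entity words with probability mask_prob (an assumed value).

    Returns character-level tokens and labels; a label is the original
    character at masked positions and None elsewhere.
    """
    rng = rng or random.Random()
    tokens = list(text)  # character-level tokens, an assumption for Chinese text
    labels = [None] * len(tokens)
    for start, end in find_entity_spans(text, lexicon):
        if rng.random() < mask_prob:
            for j in range(start, end):
                labels[j] = tokens[j]
                tokens[j] = MASK
    return tokens, labels
```

Calling `mask_entities("糖尿病患者需要注射胰岛素", {"糖尿病", "胰岛素"}, mask_prob=1.0)` would mask both entity words in full while leaving the remaining characters intact. Masking entities as whole units, rather than as independent characters, is a common design choice for injecting domain vocabulary into a masked language model; whether the patented method does exactly this is not stated in the abstract.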