Question-answering Forestry Pre-trained Language Model: ForestBERT
Published in: Linye kexue (1979) 2024-01, Vol.60 (9), p.99
Main authors: , , , ,
Format: Article
Language: Chinese
Subjects:
Online access: Full text
Abstract: 【Objective】To address the problems of low utilization of forestry texts, insufficient understanding of forestry knowledge by general-domain pre-trained language models, and the time-consuming nature of data annotation, this study makes full use of massive forestry texts, proposes a pre-trained language model integrating forestry domain knowledge, and efficiently realizes forestry extractive question answering by automatically annotating the training data, so as to provide intelligent information services for forestry decision-making and management.【Method】First, a forestry corpus was constructed using web crawler technology, encompassing three topics: terminology, law, and literature. This corpus was used to further pre-train the general-domain pre-trained language model BERT. Through self-supervised learning on the masked language model and next sentence prediction tasks, BERT was able to effectively learn forestry semantic information, resulting in the pre-trained language model ForestBERT, which has general…
ISSN: 1001-7488
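
The 【Method】 passage describes further pre-training a general-domain BERT on a crawled forestry corpus through self-supervised masked language modeling and next sentence prediction. As a rough illustration of the masked-language-modeling half, here is a minimal sketch using the Hugging Face transformers and datasets libraries; the checkpoint bert-base-chinese, the file forestry_corpus.txt, and all hyperparameters are assumptions for illustration, not details from the paper.

```python
# Hedged sketch: further pre-train a general-domain BERT on a forestry
# corpus with the masked-language-model objective. Checkpoint, file path,
# and hyperparameters are illustrative assumptions, not the paper's values.
from datasets import load_dataset
from transformers import (
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

# Hypothetical corpus file: one forestry document (terminology, law, or
# literature text) per line.
dataset = load_dataset("text", data_files={"train": "forestry_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens; the model is trained to reconstruct them,
# which is how it absorbs forestry semantics during further pre-training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="forestbert-mlm",
        num_train_epochs=3,
        per_device_train_batch_size=16,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

Reproducing the second self-supervised objective, next sentence prediction, would additionally require BertForPreTraining with sentence-pair inputs and NSP labels; the MLM step alone shows how the further pre-training that yields a model like ForestBERT is typically set up.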