Research on the Chinese Named-Entity–Relation-Extraction Method for Crop Diseases Based on BERT

Bibliographic details
Published in: Agronomy (Basel) 2022-09, Vol.12 (9), p.2130
Main authors: Zhang, Wenhao; Wang, Chunshan; Wu, Huarui; Zhao, Chunjiang; Teng, Guifa; Huang, Sufang; Liu, Zhen
Format: Article
Language: English
Description
Summary: To integrate fragmented text data on crop disease knowledge and address the current problems of disordered knowledge management, weak correlation, and difficulty in knowledge sharing, this paper proposes a Chinese named-entity–relation-extraction model for crop diseases (BBCPF), building on the ability of knowledge graphs to describe complex relations between disease entities in a structured form. The model is a pipeline of two stages: named-entity recognition followed by relation extraction. Because Chinese crop disease terms take on different meanings in different contexts, the BERT model is used to produce dynamic, context-dependent vector representations. A BiLSTM layer then learns long-distance text information, and a CRF layer selects the globally optimal label sequence, from which the crop disease entities are output. The entities are divided into subjects and objects according to their category and fed into PERT, a pre-trained model based on a permuted ("disordered") language model, to extract the contextual features of the relation data. Finally, a fully connected layer decodes this information and outputs the crop disease entity-relation triples. On the self-built disease corpus, the model reaches Precision, Recall, and F1-Score values of 85.63%, 79.46%, and 82.43%, respectively, for entity extraction, and 97.96%, 98.43%, and 98.16%, respectively, for relation extraction. The paper thus provides an effective method for information extraction in the construction of Chinese crop disease domain knowledge graphs.
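The data flow between the two pipeline stages described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hand-written BIO tags stand in for the BERT-BiLSTM-CRF output, the `classify` callable stands in for the PERT encoder plus fully connected layer, and the entity categories (`DISEASE`, `PATHOGEN`, `SYMPTOM`, `CROP`), the relation name `caused_by`, and the example sentence are all assumptions chosen for illustration, since the summary does not list the label set.

```python
from itertools import product

# Hypothetical category split; the summary only says that entities are
# divided into subjects and objects according to their category.
SUBJECT_TYPES = {"DISEASE"}
OBJECT_TYPES = {"PATHOGEN", "SYMPTOM", "CROP"}


def decode_bio(tokens, tags):
    """Collect (text, type) entities from a BIO-tagged character sequence."""
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open entity.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append(("".join(current), current_type))
    return entities


def candidate_pairs(entities):
    """Pair every subject-type entity with every object-type entity."""
    subjects = [e for e in entities if e[1] in SUBJECT_TYPES]
    objects = [e for e in entities if e[1] in OBJECT_TYPES]
    return list(product(subjects, objects))


def extract_triples(entities, classify):
    """Keep the pairs for which the classifier predicts a relation."""
    triples = []
    for subj, obj in candidate_pairs(entities):
        relation = classify(subj, obj)
        if relation is not None:
            triples.append((subj[0], relation, obj[0]))
    return triples


# Stage 1 stand-in: character-level tags as a sequence labeler might emit them.
tokens = list("小麦条锈病由条形柄锈菌引起")
tags = (["B-DISEASE"] + ["I-DISEASE"] * 4 + ["O"]
        + ["B-PATHOGEN"] + ["I-PATHOGEN"] * 4 + ["O", "O"])
entities = decode_bio(tokens, tags)

# Stage 2 stand-in: a lookup table instead of a trained relation classifier.
TOY_RELATIONS = {("DISEASE", "PATHOGEN"): "caused_by"}
triples = extract_triples(
    entities, lambda s, o: TOY_RELATIONS.get((s[1], o[1]))
)
print(triples)
```

Separating span decoding from pair classification mirrors the pipeline design in the summary: the relation stage only ever sees entity pairs that the recognition stage produced, so recognition errors carry over into the extracted triples.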
ISSN: 2073-4395
DOI: 10.3390/agronomy12092130