A Chinese nested named entity recognition approach using sequence labeling

Bibliographic details
Published in: International journal of Web information systems 2023-07, Vol.19 (1), p.42-60
Main authors: Chen, Maojian; Luo, Xiong; Shen, Hailun; Huang, Ziyang; Peng, Qiaojuan; Yuan, Yuqi
Format: Article
Language: English
Description
Summary:
Purpose: This study introduces an approach that uses a multi-layer decoder to identify Chinese nested entities across various nesting depths. To reduce the need for human intervention, an optimization algorithm tunes the decoder according to the depth of the nested entities present in the data set. With this approach, the study reports strong performance in recognizing Chinese nested entities.
Design/methodology/approach: This study provides a framework for Chinese nested named entity recognition (NER) based on sequence labeling methods. Like existing approaches, the framework uses a pre-trained model as the backbone to extract semantic features from the text. A decoder comprising multiple conditional random field (CRF) layers then learns the associations between granularity labels. To minimize manual intervention, the Jaya algorithm is used to optimize the number of CRF layers. Experimental results validate the effectiveness of the proposed approach on both Chinese nested NER and flat NER tasks.
Findings: The experiments show that the proposed method achieves a 4.32% improvement in nested NER performance on the People’s Daily corpus compared to existing models.
Originality/value: This study explores a Chinese NER method based on sequence labeling for recognizing complex Chinese nested entities accurately.
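The summary describes nested NER as sequence labeling with one CRF layer per level of granularity. As a rough illustration only, the sketch below shows how nested spans could be split into one BIO tag sequence per nesting level. The function name `layered_bio`, the span format, the innermost-first layer assignment, and the example entities are all assumptions made for this sketch; the record does not state the labeling scheme the authors actually use.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end_exclusive, entity_type)
Span = Tuple[int, int, str]


def layered_bio(n_tokens: int, spans: List[Span]) -> List[List[str]]:
    """Split nested spans into one BIO tag sequence per nesting level.

    Assumption for this sketch: shorter (inner) spans are placed first,
    and each span goes to the first layer where its positions are free.
    """
    layers: List[List[str]] = []
    for start, end, label in sorted(spans, key=lambda s: (s[1] - s[0], s[0])):
        for tags in layers:
            # Reuse an existing layer if the span does not collide with it.
            if all(t == "O" for t in tags[start:end]):
                break
        else:
            # No free layer: open a new one (one more nesting level).
            tags = ["O"] * n_tokens
            layers.append(tags)
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return layers


if __name__ == "__main__":
    chars = list("北京大学图书馆")  # illustrative example, not from the paper
    spans = [(0, 7, "ORG"), (0, 2, "LOC"), (0, 4, "ORG")]
    for depth, tags in enumerate(layered_bio(len(chars), spans)):
        print(depth, list(zip(chars, tags)))
```

In this toy case the number of tag sequences equals the deepest nesting level in the input, which is the kind of quantity the summary says is tuned with the Jaya algorithm rather than set by hand.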
ISSN:1744-0084
1744-0092
DOI:10.1108/IJWIS-04-2023-0070