A BERT based Chinese Named Entity Recognition method on ASEAN News
Published in: | Journal of physics. Conference series 2021-04, Vol.1848 (1), p.12101 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | As the first step in building a knowledge graph that records information about the ASEAN countries, we conduct named entity recognition (NER) on Chinese news about those countries. We replace the LSTM architecture with a bidirectional gated recurrent unit to improve both the models' effectiveness and their ability to handle polysemous words. BERT, a state-of-the-art embedding model, is used to generate high-quality word vectors for the NER task. We also propose a similarity-based dataset partition method that helps the model learn the polysemy found in Chinese news. Experiments show that combining these improvements benefits the models' performance in identifying different types of named entities. |
---|---|
ISSN: | 1742-6588 1742-6596 |
DOI: | 10.1088/1742-6596/1848/1/012101 |