A self-attention based neural architecture for Chinese medical named entity recognition


Bibliographic details
Published in: Mathematical biosciences and engineering : MBE 2020-01, Vol.17 (4), p.3498-3511
Main authors: Wan, Qian; Liu, Jie; Wei, Luona; Ji, Bin
Format: Article
Language: English
Online access: Full text
Description
Summary: The combination of the medical field and big data has led to explosive growth in the volume of electronic medical records (EMRs), and the information they contain is of guiding significance for diagnosis. How to extract this information from EMRs has become a hot research topic. In this paper, we propose an approach based on an ELMo-ET-CRF model to extract medical named entities from Chinese electronic medical records (CEMRs). First, a domain-specific ELMo model is obtained by fine-tuning a general ELMo model on 4679 raw CEMRs. Then we use the encoder from the Transformer (ET) as our model's encoder to alleviate the long-context dependency problem, and a CRF is used as the decoder. Finally, we compare BiLSTM-CRF and ET-CRF models, each with word2vec and ELMo embeddings, on CEMRs to validate the effectiveness of the ELMo-ET-CRF model. With the same training and test data, ELMo-ET-CRF outperforms all other model architectures considered in this paper, reaching an F1-score of 85.59%, which indicates the effectiveness of the proposed architecture; its performance is also competitive on the CCKS2019 leaderboard.
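The summary names a CRF as the decoder but gives no implementation details. As a rough illustration only, the following sketch shows how a linear-chain CRF decoder typically picks a label sequence with Viterbi decoding. Everything concrete here is an assumption made for the example and is not taken from the paper: the function name `viterbi_decode`, the label set (`O`, `B-DIS`, `I-DIS`), and all emission and transition scores are hypothetical toy values standing in for what an encoder and a trained CRF would produce.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> Tuple[List[str], float]:
    """Return the highest-scoring label sequence and its total score.

    emissions[t][y] is the encoder's score for label y at position t.
    transitions[(a, b)] is the score for moving from label a to label b;
    pairs that are not listed default to 0.0.
    """
    if not emissions:
        return [], 0.0

    # best[y]: score of the best path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for step in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Choose the previous label that maximises path score + transition score
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, y), 0.0))
            new_best[y] = best[prev] + transitions.get((prev, y), 0.0) + step[y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)

    # Follow the backpointers from the best final label to recover the path
    last = max(labels, key=lambda y: best[y])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Hypothetical toy inputs (assumed for illustration, not from the paper)
labels = ["O", "B-DIS", "I-DIS"]
emissions = [
    {"O": 1.0, "B-DIS": 0.8, "I-DIS": 0.0},
    {"O": 0.2, "B-DIS": 0.1, "I-DIS": 1.5},
    {"O": 1.0, "B-DIS": 0.0, "I-DIS": 0.1},
]
transitions = {
    ("O", "I-DIS"): -5.0,     # an entity should not continue straight after "O"
    ("B-DIS", "I-DIS"): 0.5,  # continuing an entity that just began is favoured
}

path, score = viterbi_decode(emissions, transitions, labels)
greedy = [max(labels, key=lambda y: step[y]) for step in emissions]
print("viterbi:", path)
print("greedy: ", greedy)
```

With these toy scores, picking the best label per token independently yields `O, I-DIS, O`, which starts an entity with an inside tag, while the Viterbi path is `B-DIS, I-DIS, O`. This is the usual motivation for placing a CRF on top of an encoder in sequence labelling: transition scores let the decoder prefer label sequences that are consistent as a whole.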
ISSN:1551-0018
DOI:10.3934/mbe.2020197