Biomedical named entity recognition using generalized expectation criteria

Bibliographic details
Published in: International journal of machine learning and cybernetics, 2011-12, Vol. 2 (4), pp. 235-243
Main authors: Yao, Lin; Sun, Chengjie; Wu, Yan; Wang, Xiaolong; Wang, Xuan
Format: Article
Language: English
Description
Abstract: Machine learning is difficult to apply in domains that lack labeled training data. Biomedical named entity recognition (NER) is one such domain, and it remains challenging because of its extraordinarily complex nomenclature. In this paper, we propose a semi-supervised method that trains conditional random field (CRF) models using generalized expectation (GE) criteria to address the biomedical NER problem. Instead of labeling instances, the method labels features to obtain training data, which saves considerable labeling time. A Latent Dirichlet Allocation (LDA) model is used to choose the features to be labeled. Experimental results show that the proposed method can dramatically improve the performance of biomedical NER by incorporating unlabeled data through feature labeling.
ISSN: 1868-8071, 1868-808X
DOI: 10.1007/s13042-011-0022-3