Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking

Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chin...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:ACM transactions on Asian and low-resource language information processing 2024-02, Vol.23 (2), p.1-15, Article 28
Hauptverfasser: Lin, Tzu-Mi, Hung, Man-Chen, Lee, Lung-Hao
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chinese entity linking in the biomedical domain. We separately model a dual encoder architecture, comprising a context-aware gloss encoder and a lexical gloss encoder, for contextualized embedding representations. DGE are then jointly optimized to assign the nearest gloss with the highest score for target entity disambiguation. The experimental datasets consist of a total of 10,218 sentences that were manually annotated with glosses defined in the BabelNet 5.0 across 40 distinct biomedical entities. Experimental results show that the DGE model achieved an F1-score of 97.81, outperforming other existing methods. A series of model analyses indicate that the proposed approach is effective for Chinese biomedical entity linking.
ISSN:2375-4699
2375-4702
DOI:10.1145/3638555