Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking
Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chin...
Published in: | ACM transactions on Asian and low-resource language information processing 2024-02, Vol.23 (2), p.1-15, Article 28 |
---|---|
Main authors: | Lin, Tzu-Mi ; Hung, Man-Chen ; Lee, Lung-Hao |
Format: | Article |
Language: | eng |
Subjects: | Computing methodologies ; Lexical semantics |
Online access: | Full text |
container_end_page | 15 |
---|---|
container_issue | 2 |
container_start_page | 1 |
container_title | ACM transactions on Asian and low-resource language information processing |
container_volume | 23 |
creator | Lin, Tzu-Mi ; Hung, Man-Chen ; Lee, Lung-Hao |
description | Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chinese entity linking in the biomedical domain. We separately model a dual encoder architecture, comprising a context-aware gloss encoder and a lexical gloss encoder, for contextualized embedding representations. The two encoders are then jointly optimized to assign the nearest gloss with the highest score for target entity disambiguation. The experimental datasets consist of a total of 10,218 sentences that were manually annotated with glosses defined in BabelNet 5.0 across 40 distinct biomedical entities. Experimental results show that the DGE model achieved an F1-score of 97.81, outperforming other existing methods. A series of model analyses indicate that the proposed approach is effective for Chinese biomedical entity linking. |
doi_str_mv | 10.1145/3638555 |
format | Article |
fullrecord | <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3638555</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3638555</sourcerecordid><originalsourceid>FETCH-LOGICAL-a239t-6da1b252c901cb70b94ddd8cd9e58ac5878b4253c29c489b53c86208275652f93</originalsourceid><addsrcrecordid>eNo9jz1PwzAYhC0EElVbsTN5Ywr4-2OEEApSpC4wR47tFEPiIDsg9d8T1JbpPel97nQHwBVGtxgzfkcFVZzzM7AgVPKCSUTOT1pofQnWOX8ghDCTQiC8AGXtf3wyuxB38PHb9HDTjznDKtrR-ZRhiLB8D9FnDx_COHgX7AxVcQrTHtYhfs7GFbjoTJ_9-niX4O2pei2fi3q7eSnv68IQqqdCOINbwonVCNtWolYz55yyTnuujOVKqpYRTi3RlindzkoJghSRXHDSaboEN4dcm-aOyXfNVwqDSfsGo-ZvfnOcP5PXB9LY4R86PX8BOAlTDA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking</title><source>ACM Digital Library Complete</source><creator>Lin, Tzu-Mi ; Hung, Man-Chen ; Lee, Lung-Hao</creator><creatorcontrib>Lin, Tzu-Mi ; Hung, Man-Chen ; Lee, Lung-Hao</creatorcontrib><description>Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chinese entity linking in the biomedical domain. We separately model a dual encoder architecture, comprising a context-aware gloss encoder and a lexical gloss encoder, for contextualized embedding representations. DGE are then jointly optimized to assign the nearest gloss with the highest score for target entity disambiguation. The experimental datasets consist of a total of 10,218 sentences that were manually annotated with glosses defined in the BabelNet 5.0 across 40 distinct biomedical entities. 
Experimental results show that the DGE model achieved an F1-score of 97.81, outperforming other existing methods. A series of model analyses indicate that the proposed approach is effective for Chinese biomedical entity linking.</description><identifier>ISSN: 2375-4699</identifier><identifier>EISSN: 2375-4702</identifier><identifier>DOI: 10.1145/3638555</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computing methodologies ; Lexical semantics</subject><ispartof>ACM transactions on Asian and low-resource language information processing, 2024-02, Vol.23 (2), p.1-15, Article 28</ispartof><rights>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 
Request permissions from</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a239t-6da1b252c901cb70b94ddd8cd9e58ac5878b4253c29c489b53c86208275652f93</cites><orcidid>0000-0003-0472-7429 ; 0009-0004-3373-8789 ; 0000-0002-2604-3725</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3638555$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,776,780,2276,27901,27902,40172,75970</link.rule.ids></links><search><creatorcontrib>Lin, Tzu-Mi</creatorcontrib><creatorcontrib>Hung, Man-Chen</creatorcontrib><creatorcontrib>Lee, Lung-Hao</creatorcontrib><title>Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking</title><title>ACM transactions on Asian and low-resource language information processing</title><addtitle>ACM TALLIP</addtitle><description>Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chinese entity linking in the biomedical domain. We separately model a dual encoder architecture, comprising a context-aware gloss encoder and a lexical gloss encoder, for contextualized embedding representations. DGE are then jointly optimized to assign the nearest gloss with the highest score for target entity disambiguation. The experimental datasets consist of a total of 10,218 sentences that were manually annotated with glosses defined in the BabelNet 5.0 across 40 distinct biomedical entities. Experimental results show that the DGE model achieved an F1-score of 97.81, outperforming other existing methods. 
A series of model analyses indicate that the proposed approach is effective for Chinese biomedical entity linking.</description><subject>Computing methodologies</subject><subject>Lexical semantics</subject><issn>2375-4699</issn><issn>2375-4702</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9jz1PwzAYhC0EElVbsTN5Ywr4-2OEEApSpC4wR47tFEPiIDsg9d8T1JbpPel97nQHwBVGtxgzfkcFVZzzM7AgVPKCSUTOT1pofQnWOX8ghDCTQiC8AGXtf3wyuxB38PHb9HDTjznDKtrR-ZRhiLB8D9FnDx_COHgX7AxVcQrTHtYhfs7GFbjoTJ_9-niX4O2pei2fi3q7eSnv68IQqqdCOINbwonVCNtWolYz55yyTnuujOVKqpYRTi3RlindzkoJghSRXHDSaboEN4dcm-aOyXfNVwqDSfsGo-ZvfnOcP5PXB9LY4R86PX8BOAlTDA</recordid><startdate>20240208</startdate><enddate>20240208</enddate><creator>Lin, Tzu-Mi</creator><creator>Hung, Man-Chen</creator><creator>Lee, Lung-Hao</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-0472-7429</orcidid><orcidid>https://orcid.org/0009-0004-3373-8789</orcidid><orcidid>https://orcid.org/0000-0002-2604-3725</orcidid></search><sort><creationdate>20240208</creationdate><title>Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking</title><author>Lin, Tzu-Mi ; Hung, Man-Chen ; Lee, Lung-Hao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a239t-6da1b252c901cb70b94ddd8cd9e58ac5878b4253c29c489b53c86208275652f93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computing methodologies</topic><topic>Lexical semantics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lin, Tzu-Mi</creatorcontrib><creatorcontrib>Hung, Man-Chen</creatorcontrib><creatorcontrib>Lee, Lung-Hao</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on Asian and low-resource language information 
processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lin, Tzu-Mi</au><au>Hung, Man-Chen</au><au>Lee, Lung-Hao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking</atitle><jtitle>ACM transactions on Asian and low-resource language information processing</jtitle><stitle>ACM TALLIP</stitle><date>2024-02-08</date><risdate>2024</risdate><volume>23</volume><issue>2</issue><spage>1</spage><epage>15</epage><pages>1-15</pages><artnum>28</artnum><issn>2375-4699</issn><eissn>2375-4702</eissn><abstract>Entity linking is the task of assigning a unique identity to named entities mentioned in a text, a sort of word sense disambiguation that focuses on automatically determining a pre-defined sense for a target entity to be disambiguated. This study proposes the DGE (Dual Gloss Encoders) model for Chinese entity linking in the biomedical domain. We separately model a dual encoder architecture, comprising a context-aware gloss encoder and a lexical gloss encoder, for contextualized embedding representations. DGE are then jointly optimized to assign the nearest gloss with the highest score for target entity disambiguation. The experimental datasets consist of a total of 10,218 sentences that were manually annotated with glosses defined in the BabelNet 5.0 across 40 distinct biomedical entities. Experimental results show that the DGE model achieved an F1-score of 97.81, outperforming other existing methods. 
A series of model analyses indicate that the proposed approach is effective for Chinese biomedical entity linking.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3638555</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0003-0472-7429</orcidid><orcidid>https://orcid.org/0009-0004-3373-8789</orcidid><orcidid>https://orcid.org/0000-0002-2604-3725</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2375-4699 |
ispartof | ACM transactions on Asian and low-resource language information processing, 2024-02, Vol.23 (2), p.1-15, Article 28 |
issn | 2375-4699 ; 2375-4702 |
language | eng |
recordid | cdi_crossref_primary_10_1145_3638555 |
source | ACM Digital Library Complete |
subjects | Computing methodologies ; Lexical semantics |
title | Leveraging Dual Gloss Encoders in Chinese Biomedical Entity Linking |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T22%3A02%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Leveraging%20Dual%20Gloss%20Encoders%20in%20Chinese%20Biomedical%20Entity%20Linking&rft.jtitle=ACM%20transactions%20on%20Asian%20and%20low-resource%20language%20information%20processing&rft.au=Lin,%20Tzu-Mi&rft.date=2024-02-08&rft.volume=23&rft.issue=2&rft.spage=1&rft.epage=15&rft.pages=1-15&rft.artnum=28&rft.issn=2375-4699&rft.eissn=2375-4702&rft_id=info:doi/10.1145/3638555&rft_dat=%3Cacm_cross%3E3638555%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
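
The description above outlines a dual encoder that scores candidate glosses for a target entity and assigns the gloss with the highest score. The following is a minimal Python sketch of that scoring step only, not the authors' implementation. The hashed character-bigram `embed` function is a toy stand-in for trained neural encoders, and the names `DIM`, `embed`, `encode_context`, `encode_gloss`, and `link` are hypothetical. It also assumes, as one possible reading of the abstract, that one encoder embeds the sentence containing the mention, the other embeds each candidate gloss, and the score is a dot product.

```python
import hashlib
import math

DIM = 64  # hypothetical embedding size, chosen only for this sketch


def embed(text):
    """Toy encoder: L2-normalised counts of hashed character bigrams.

    A real system would use a trained neural encoder here.
    """
    vec = [0.0] * DIM
    for i in range(len(text) - 1):
        digest = hashlib.md5(text[i:i + 2].encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid division by zero
    return [v / norm for v in vec]


def encode_context(sentence):
    """Assumed role: embed the sentence that contains the target entity."""
    return embed(sentence)


def encode_gloss(gloss):
    """Assumed role: embed one candidate gloss (sense definition)."""
    return embed(gloss)


def link(sentence, glosses):
    """Score every candidate gloss and return the best id plus all scores."""
    ctx = encode_context(sentence)
    scores = {}
    for gloss_id, gloss_text in glosses.items():
        gloss_vec = encode_gloss(gloss_text)
        scores[gloss_id] = sum(c * g for c, g in zip(ctx, gloss_vec))
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical candidate glosses for an ambiguous entity.
    candidates = {
        "sense_a": "a viral infection of the nose and throat",
        "sense_b": "having a low temperature",
    }
    best, scores = link("the patient has a viral infection", candidates)
    print(best, scores)
```

In this sketch both encoders share one toy function so that the dot product is meaningful without training. In a trained dual encoder the two networks would have separate parameters and be optimized together so that the correct gloss receives the highest score.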