Introducing high correlation and high quality instances for few-shot entity linking
Published in: | Neural Networks 2025-01, Vol. 181, p. 106783, Article 106783 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Entity linking, the process of connecting textual mentions in documents to canonical entities within a knowledge base, plays an integral role in many natural language processing tasks. A significant challenge in the field is the scarcity of resources, particularly for specialized domains, which accentuates the importance of few-shot entity linking in real-world scenarios. Previous works address the lack of in-domain labeled data by generating synthetic data. However, we argue that synthetic data is frequently far from high-quality; such low-quality instances introduce noise and diminish the ability of entity linking models to comprehend the semantic consistency between mentions and entities. In this paper, we propose the H2FEL framework, which introduces high-correlation, high-quality instances for few-shot entity linking. We argue that general domains contain abundant high-quality labeled data, some of which is highly correlated with the target domain. Thus, we first design an adversarial instance extraction module to extract such high-correlation instances without depending on additional manually annotated data. To further mitigate the negative effects of low-correlation instances, we train our entity linking model via a variant of curriculum learning. Experimental results on the few-shot entity linking dataset demonstrate the effectiveness of the proposed H2FEL framework, which achieves state-of-the-art performance.
• First prioritize the correlation of instances to the target domain.
• Adversarial extraction of high-correlation instances from high-quality data.
• Curriculum learning variant to mitigate the negative effect of low-correlation instances.
• The micro accuracy on the few-shot entity linking dataset improved by 19.47%. |
---|---|
ISSN: | 0893-6080, 1879-2782 |
DOI: | 10.1016/j.neunet.2024.106783 |