Cross-Domain Tibetan Named Entity Recognition via Large Language Models
Published in: Electronics (Basel), 2025-01, Vol. 14 (1), p. 111
Main authors:
Format: Article
Language: English
Online access: Full text
Abstract: Large language models (LLMs) have demonstrated powerful capabilities across many downstream tasks. Existing Tibetan named entity recognition (NER) methods often suffer from a high degree of coupling between data and models, limiting them to identifying entities only within specific domain datasets and making cross-domain recognition difficult. Additionally, each dataset requires training a dedicated model, and new domains call for retraining and redeployment. In practical applications, the ability to perform cross-domain NER is crucial to meeting real-world needs. To address this issue and decouple data from models, enabling cross-domain NER, this paper proposes a cross-domain joint learning approach based on large language models, which enhances model robustness by learning the shared underlying semantics across different domains. To reduce the significant computational cost incurred by LLMs during inference, we adopt an adaptive structured pruning method based on domain-dependent prompts, which effectively reduces the model's memory requirements and improves inference speed while minimizing the impact on performance. The experimental results show that our method significantly outperformed the baseline model across cross-domain Tibetan datasets. In the Tibetan medicine domain, our method achieved an F1 score improvement of up to 27.26% compared with the best-performing baseline model. Our method achieved an average F1 score of 95.17% across domains, outperforming the baseline Llama2 + Prompt model by 5.12%. Furthermore, our method demonstrates strong generalization capabilities in NER tasks for other low-resource languages.
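This record contains no code from the paper; the following Python sketch is only a rough illustration of the prompt-based, domain-decoupled formulation described in the abstract, showing how a single instruction-tuned LLM could be queried for NER across several domains by swapping in a domain-dependent prompt. The function names (build_prompt, parse_entities, recognize), the domain list, and the output format are assumptions made for illustration, not the authors' implementation, and the adaptive structured pruning step is not shown.

```python
# Illustrative sketch only: a domain-dependent prompt wrapper for cross-domain NER
# with a single LLM. All names and the label sets below are assumptions for
# illustration; they are not taken from the paper.

from typing import Dict, List, Tuple

# Hypothetical Tibetan NER domains and their entity types.
DOMAIN_LABELS: Dict[str, List[str]] = {
    "news": ["PER", "LOC", "ORG"],
    "tibetan_medicine": ["HERB", "DISEASE", "TREATMENT"],
}

def build_prompt(domain: str, sentence: str) -> str:
    """Compose a domain-dependent instruction so one model can serve all domains."""
    labels = ", ".join(DOMAIN_LABELS[domain])
    return (
        f"Task: named entity recognition in the '{domain}' domain.\n"
        f"Entity types: {labels}.\n"
        f"Return each entity as 'text<TAB>type', one per line.\n"
        f"Sentence: {sentence}\n"
        f"Entities:"
    )

def parse_entities(raw_output: str) -> List[Tuple[str, str]]:
    """Turn the model's line-oriented answer back into (entity, type) pairs."""
    pairs = []
    for line in raw_output.strip().splitlines():
        if "\t" in line:
            text, label = line.split("\t", 1)
            pairs.append((text.strip(), label.strip()))
    return pairs

def recognize(llm_generate, domain: str, sentence: str) -> List[Tuple[str, str]]:
    """llm_generate is any callable str -> str (e.g., a wrapped Llama2 checkpoint)."""
    return parse_entities(llm_generate(build_prompt(domain, sentence)))

if __name__ == "__main__":
    # Stand-in generator so the sketch runs without a real checkpoint.
    fake_llm = lambda prompt: "ལྷ་ས\tLOC"
    print(recognize(fake_llm, "news", "..."))
```

Because the domain only enters through the prompt, the same backbone serves every dataset, which is the decoupling of data from model that the abstract emphasizes.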
ISSN: 2079-9292
DOI: 10.3390/electronics14010111