Chinese Spelling Correction Based on Knowledge Enhancement and Contrastive Learning
Published in: | IEICE Transactions on Information and Systems 2024/09/01, Vol.E107.D(9), pp.1264-1273 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online Access: | Full text |
Summary: | Chinese Spelling Correction (CSC) is an important natural language processing task. Most existing CSC methods use BERT models, which correct errors in a sentence by selecting a character from a candidate list. World knowledge refers to structured information and relationships spanning a wide range of domains and subjects, while definition knowledge refers to textual explanations or descriptions of specific words or concepts. Both forms of knowledge can help a model understand contextual nuances. Because BERT lacks sufficient guidance from world knowledge for error correction, and existing models overlook the rich definition knowledge in Chinese dictionaries, the performance of spelling correction models is compromised. To address these issues, the world knowledge network in this model injects world knowledge from knowledge graphs to help correct spelling errors caused by a lack of world knowledge. In addition, the definition knowledge network improves error correction by using definitions from a Chinese dictionary through a contrastive learning approach. Experimental results on the SIGHAN benchmark dataset validate the effectiveness of the approach. |
---|---|
ISSN: | 0916-8532, 1745-1361 |
DOI: | 10.1587/transinf.2023EDP7166 |