CLSpell: Contrastive Learning with Phonological and Visual Knowledge for Chinese Spelling Check
Published in: | Neurocomputing (Amsterdam) 2023-10, Vol.554, p.126468, Article 126468 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The task of Chinese Spelling Check (CSC) is to identify and correct spelling errors in text, which are mainly caused by phonologically and visually similar characters. Although pre-trained language models are helpful for this task, they lack phonological and visual information. Previous works have primarily focused on identifying errors based on local contextual data, while neglecting the importance of sentence-level information. To address these issues, Contrastive Learning Spell (CLSpell) is proposed, which combines phonetic and glyphic information through contrastive learning and simultaneously acquires local and global information through multi-task joint learning. During pretraining, token representations are learned using a combination of phonological, visual, and semantic information. Moreover, we propose to include an auxiliary task of correct sentence discrimination in the multi-task joint training process to capture sentence-level information. Experiments on widely used benchmarks demonstrate that the proposed method surpasses all competing methods. |
---|---|
ISSN: | 0925-2312, 1872-8286 |
DOI: | 10.1016/j.neucom.2023.126468 |
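
The abstract names two ingredients: a contrastive objective that ties phonological and visual knowledge to token representations, and multi-task joint training with an auxiliary sentence-level task. The record does not give the loss functions, so the sketch below is only an illustration under stated assumptions: it uses a generic InfoNCE-style contrastive loss and a plain weighted sum of two task losses. The function names (`info_nce`, `joint_loss`), the `temperature` and `aux_weight` parameters, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not the paper's stated loss).

    Returns -log of the softmax probability assigned to the positive
    among the positive and all negatives, using scaled cosine similarity.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


def joint_loss(correction_loss, discrimination_loss, aux_weight=1.0):
    """Hypothetical multi-task objective: token-level correction loss plus
    a weighted sentence-level discrimination loss."""
    return correction_loss + aux_weight * discrimination_loss


# Toy 2-d vectors standing in for one character's semantic embedding
# (anchor), its phonetic or glyph embedding (positive), and embeddings
# of unrelated characters (negatives).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

aligned = info_nce(anchor, positive, negatives)
misaligned = info_nce(anchor, negatives[0], [positive, negatives[1]])
```

In this toy setup the loss is small when the anchor and its positive view point the same way and large when an unrelated vector is treated as the positive, which is the pressure a contrastive objective applies to pull matching views together. A single scalar such as `aux_weight` is one simple way to balance a local (token-level) task against a global (sentence-level) one.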