CLSpell: Contrastive Learning with Phonological and Visual Knowledge for Chinese Spelling Check

Bibliographic details
Published in: Neurocomputing (Amsterdam), 2023-10, Vol. 554, p. 126468, Article 126468
Main authors: Mao, Xingliang; Shan, Youran; Li, Fangfang; Chen, Xiaohong; Zhang, Shichao
Format: Article
Language: English
Description
Summary: The task of Chinese Spelling Check (CSC) is to identify and correct spelling errors in text, which are mainly caused by phonologically and visually similar characters. Although pre-trained language models are helpful for this task, they lack phonological and visual information. Previous works have primarily focused on identifying errors from local contextual data, neglecting sentence-level information. To address these issues, Contrastive Learning Spell (CLSpell) is proposed. It combines phonetic and glyphic information through contrastive learning and acquires local and global information simultaneously through multi-task joint learning. During pretraining, token representations are learned from a combination of phonological, visual, and semantic information. In addition, an auxiliary task of correct sentence discrimination is included in the multi-task joint training process to capture sentence-level information. Experiments on widely used benchmarks demonstrate that the proposed method surpasses all competing methods.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2023.126468
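
Illustrative sketch: The summary names two ingredients, contrastive learning over phonological, visual, and semantic token representations, and multi-task joint training with a sentence-level auxiliary task. The record does not give the loss functions used in the article, so the following is only a minimal sketch under stated assumptions: it uses a generic InfoNCE-style contrastive loss and a plain weighted sum of task losses. The names `cosine`, `info_nce`, `joint_loss`, `temperature`, `w_disc`, and `w_con` are hypothetical and are not taken from the article.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not the article's stated loss).

    anchors[i] should match positives[i]; every other positives[j]
    in the batch is treated as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def joint_loss(correction_loss, discrimination_loss, contrastive_loss,
               w_disc=1.0, w_con=1.0):
    """Hypothetical weighted sum of the per-task losses."""
    return (correction_loss
            + w_disc * discrimination_loss
            + w_con * contrastive_loss)


# Toy example: two tokens, 2-dimensional views (made-up numbers).
semantic = [[1.0, 0.0], [0.0, 1.0]]
phonetic_aligned = [[0.9, 0.1], [0.1, 0.9]]   # each view near its own token
phonetic_swapped = [[0.1, 0.9], [0.9, 0.1]]   # views paired with the wrong token

aligned = info_nce(semantic, phonetic_aligned)
swapped = info_nce(semantic, phonetic_swapped)
print(aligned < swapped)
```

In this toy setup the loss is small when each token's phonetic view lies close to its own semantic view and large when the views are swapped, which is the behaviour a contrastive objective is meant to reward. The weights in `joint_loss` stand in for whatever balancing the actual training procedure uses.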