Correcting real-word spelling errors: A new hybrid approach

Bibliographic details
Published in: Digital Scholarship in the Humanities 2018-09, Vol.33 (3), p.488-499
Main authors: Dashti, Seyed MohammadSadegh, Khatibi Bardsiri, Amid, Khatibi Bardsiri, Vahid
Format: Article
Language: English
Online access: Full text
Description
Summary: Spelling correction is one of the main tasks in the field of Natural Language Processing. Unlike common spelling errors, real-word errors cannot be detected by conventional spelling correction methods, because the mistyped string is itself a valid word. The real-word correction model proposed by Mays, Damerau, and Mercer has performed well in several evaluations. This research, however, proposes a new hybrid approach that relies on statistical and syntactic knowledge to detect and correct real-word errors. In this model, Constraint Grammar is used to discriminate among sets of correction candidates in the search space, and Mays, Damerau, and Mercer's trigram approach is adapted to estimate the probability of the syntactically well-formed correction candidates. The approach is tested on the Wall Street Journal corpus. The model may prove more practical than some other models, such as the WordNet-based method of Hirst and Budanitsky and the fixed-window-size method of Wilcox-O'Hearn and Hirst.
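As a rough illustration of the idea summarized above, the following Python sketch scores correction candidates with a trigram model in the general style of Mays, Damerau, and Mercer, after discarding candidates that fail a syntactic check. It is not the authors' implementation. The toy corpus, the confusion sets, add-one smoothing, and the value alpha = 0.9 are all assumptions made for the example, and `is_well_formed` is a hypothetical placeholder standing in for a Constraint Grammar check, which the sketch does not implement.

```python
import math
from collections import Counter

def train(sentences):
    """Count trigrams and their two-word contexts in tokenized sentences."""
    tri, ctx, vocab = Counter(), Counter(), set()
    for sent in sentences:
        toks = ["<s>", "<s>"] + list(sent) + ["</s>"]
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i - 2:i + 1])] += 1
            ctx[tuple(toks[i - 2:i])] += 1
    return tri, ctx, len(vocab)

def log_prob(sent, model):
    """Sentence log-probability under an add-one smoothed trigram model."""
    tri, ctx, vocab_size = model
    toks = ["<s>", "<s>"] + list(sent) + ["</s>"]
    return sum(
        math.log((tri[tuple(toks[i - 2:i + 1])] + 1)
                 / (ctx[tuple(toks[i - 2:i])] + vocab_size))
        for i in range(2, len(toks))
    )

def correct(sent, model, confusion, alpha=0.9, is_well_formed=lambda s: True):
    """Return the best-scoring sentence among the input and its
    single-word substitutions drawn from the confusion sets.

    alpha is the assumed probability that a typed word is the intended one;
    the remaining 1 - alpha is shared equally among the spelling variants.
    The factor alpha**(n - 1) for the unchanged words is common to every
    hypothesis and is therefore omitted.
    """
    best = list(sent)
    best_score = log_prob(sent, model) + math.log(alpha)
    for i, word in enumerate(sent):
        for cand in confusion.get(word, ()):
            trial = list(sent[:i]) + [cand] + list(sent[i + 1:])
            # Hypothetical syntactic filter (stand-in for Constraint Grammar).
            if not is_well_formed(trial):
                continue
            variants = len(confusion.get(cand, ())) or 1
            score = log_prob(trial, model) + math.log((1 - alpha) / variants)
            if score > best_score:
                best, best_score = trial, score
    return best

# Toy data, invented for this example only.
CORPUS = [
    "i want a piece of cake".split(),
    "a piece of cake please".split(),
    "we want peace and quiet".split(),
]
CONFUSION = {"piece": ["peace"], "peace": ["piece"]}
model = train(CORPUS)

print(correct("i want a peace of cake".split(), model, CONFUSION))
# ['i', 'want', 'a', 'piece', 'of', 'cake']
```

With a corpus this small, the outcome is sensitive to alpha: a larger value makes the sketch more reluctant to change a word, so the correction above would not be made at alpha = 0.99 with the same counts.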
ISSN: 2055-7671, 2055-768X
DOI: 10.1093/llc/fqx054