Categorizing Unknown Words: A Decision Tree-Based Misspelling Identifier


Bibliographic Details
Main author: Toole, Janine
Format: Conference proceeding
Language: English
Online access: Full text
Description
Summary: This paper introduces a robust, portable system for categorizing unknown words. It is based on a multi-component architecture in which each component is responsible for identifying one class of unknown words. The focus of this paper is the component that identifies spelling errors. The misspelling identifier uses a decision tree architecture to combine multiple types of evidence about the unknown word. It is evaluated on data from live closed captions, a genre replete with a wide variety of unknown words.
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/3-540-46695-9_11