HYBRID TEXT SEGMENTATION USING N-GRAMS AND LEXICAL INFORMATION

A hybrid n-gram/lexical analysis tokenization system including a lexicon and a hybrid tokenizer operative to perform both N-gram tokenization of a text and lexical analysis tokenization of a text using the lexican, and to construct either of an index and a classifier from the results of both of the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: DAYAN YIGAL SHAI, MAGDALEN JOSEMINA MARCELLA, MAZEL VICTORIA
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A hybrid n-gram/lexical analysis tokenization system including a lexicon and a hybrid tokenizer operative to perform both N-gram tokenization of a text and lexical analysis tokenization of a text using the lexican, and to construct either of an index and a classifier from the results of both of the N-gram tokenization and the lexical analysis tokenization, where the hybrid tokenizer is implemented in at least one of computer hardware and computer software and is embodied within a computer-readable medium.