Flexible length phrases in document classification

Bibliographic details
Main authors: Radosevic, D., Dobsa, J.
Format: Conference proceedings
Language: English
Description
Summary: In this paper we investigate the possibility of using flexible-length phrases in the classification of text documents, as an extension of the classic bag-of-words representation, in which documents are represented using single words as index terms. The investigation is conducted on a collection of articles from Vecernji list. We show that using flexible-length phrases improves the precision of automatic document classification, and there are indications that the approach could also be used for genre classification.
ISSN: 1330-1012
DOI: 10.1109/ITI.2006.1708524