Document Clustering with K-tree

Bibliographic Details
Published in:	arXiv.org 2010-01
Main Authors:	De Vries, Christopher M; Geva, Shlomo
Format:	Article
Language:	eng
Subjects:	Algorithms; Clustering; Computer Science - Artificial Intelligence; Computer Science - Data Structures and Algorithms; Computer Science - Information Retrieval; Information retrieval; Support vector machines
Online Access:	Full text

Description
Summary:	This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
ISSN:	2331-8422
DOI:	10.48550/arxiv.1001.0827