An Efficient Hybrid Hierarchical Document Clustering Method

Bibliographic details
Main authors: Yehang Zhu, B.C.M. Fung, Dejun Mu, Yanling Li
Format: Conference proceeding
Language: English
Description
Summary: Document clustering is a technique for grouping document objects together such that documents within a cluster have high similarity while documents in different clusters have low similarity. Hierarchical document clustering organizes the clusters into a hierarchy such that a parent cluster is a general topic of its child clusters. In this paper, we propose a novel hierarchical document clustering method that is a hybrid version of the partitioning and agglomerative clustering approaches. The proposed method inherits the merit of efficiency from the partitioning approach and the hierarchical structure from the agglomerative approach. Experiments on real-life datasets suggest that our method is effective and efficient.
DOI: 10.1109/FSKD.2008.159
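
Illustration: the summary names a hybrid of partitioning and agglomerative clustering but this record gives no algorithmic details. The sketch below is therefore not the authors' method. It only shows one generic way such a hybrid can be assembled: a k-means-style pass first groups documents cheaply, and the resulting groups are then merged bottom-up into a tree. Every name in it (tf_vector, partition, agglomerate, hybrid_cluster), the cosine similarity on raw term frequencies, the choice of the first k documents as seeds, and the centroid-based merge rule are assumptions made for the example.

```python
import math
from collections import Counter


def tf_vector(text):
    """Unit-length term-frequency vector (assumes a non-empty document)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def centroid(vectors):
    """Normalized sum of sparse vectors."""
    total = Counter()
    for v in vectors:
        for term, w in v.items():
            total[term] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {term: w / norm for term, w in total.items()}


def partition(vectors, k, iterations=10):
    """Partitioning step: k-means-style assignment by cosine similarity.

    Assumption: the first k vectors serve as seeds (requires k <= len(vectors)).
    """
    centers = [vectors[i] for i in range(k)]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for idx, v in enumerate(vectors):
            best = max(range(k), key=lambda c: cosine(v, centers[c]))
            groups[best].append(idx)
        # Recompute each center; keep the old one if its group emptied.
        centers = [
            centroid([vectors[i] for i in g]) if g else centers[c]
            for c, g in enumerate(groups)
        ]
    return [g for g in groups if g]


def agglomerate(vectors, groups):
    """Agglomerative step: merge the two most similar clusters until one is left.

    Leaves of the returned tree are lists of document indices;
    internal nodes are 2-tuples (left_subtree, right_subtree).
    """
    nodes = [(list(g), list(g)) for g in groups]  # (subtree, member indices)
    while len(nodes) > 1:
        cents = [centroid([vectors[i] for i in members]) for _, members in nodes]
        pairs = [
            (cosine(cents[i], cents[j]), i, j)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
        ]
        _, i, j = max(pairs)
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        nodes = [n for idx, n in enumerate(nodes) if idx not in (i, j)] + [merged]
    return nodes[0][0]


def leaves(tree):
    """Sorted document indices under a subtree."""
    if isinstance(tree, list):
        return sorted(tree)
    return sorted(leaves(tree[0]) + leaves(tree[1]))


def hybrid_cluster(docs, k):
    """Partition first, then build a hierarchy over the partitions."""
    vectors = [tf_vector(d) for d in docs]
    return agglomerate(vectors, partition(vectors, k))


docs = [
    "apple banana fruit",
    "banana fruit juice",
    "python code compiler",
    "code compiler bug",
    "fruit apple juice",
    "compiler python bug",
]
tree = hybrid_cluster(docs, k=3)
print(tree)
```

In this toy run the fruit documents and the programming documents end up in separate top-level branches. The design point the sketch is meant to convey is that the pairwise merge loop, which is the expensive part of agglomerative clustering, runs over k groups rather than over every document.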