An Efficient Two-Level SOMART Document Clustering Through Dimensionality Reduction
Main authors: | , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Document clustering is a popular technique for unveiling inherent structure in the underlying data. Two successful unsupervised neural network models, the Self-Organizing Map (SOM) and Adaptive Resonance Theory (ART), have shown promising results in this task. The high dimensionality of the data has always been a challenging problem in document clustering, and it is commonly addressed with dimension reduction methods. In this paper, we propose a new two-level neural-network-based document clustering architecture that can be used for high-dimensional data. Our solution is to use SOM in the first level as a dimension reduction method to produce multiple output clusters, then use ART in the second level to produce the final clusters using the reduced vector space. Experimental results from clustering documents from the Reuters corpus with the proposed architecture show an improvement in clustering performance, as evaluated using entropy and the F-measure. |
---|---|
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-540-30499-9_23 |
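
The two-level idea in the summary (SOM for dimension reduction, then ART on the reduced vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the details, so the 1-D SOM layout, the encoding of each document as one activation per SOM unit, the cosine-based vigilance test, and all names and parameters (`train_som`, `reduce_docs`, `art_like_cluster`, `vigilance`, `beta`) are assumptions made for illustration. The second level is a simplified, ART-inspired vigilance scheme rather than a full ART network, and the toy term vectors are invented.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(docs, n_units, epochs=30, lr0=0.5, seed=0):
    """Level 1: train a 1-D SOM and return its unit weight vectors."""
    rng = random.Random(seed)
    dim = len(docs[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrinks from 1 toward 0
        lr = lr0 * decay
        radius = max(0.5, (n_units / 2) * decay)
        for x in docs:
            # Best-matching unit: the unit whose weights are closest to x.
            bmu = min(range(n_units), key=lambda j: sq_dist(x, weights[j]))
            for j in range(n_units):
                # Gaussian neighbourhood around the BMU on the 1-D grid.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    weights[j][k] += lr * h * (x[k] - weights[j][k])
    return weights


def reduce_docs(docs, weights):
    """Map each document to one activation per SOM unit (assumed encoding)."""
    return [[1.0 / (1.0 + math.sqrt(sq_dist(x, w))) for w in weights]
            for x in docs]


def cosine(a, b):
    """Cosine similarity; inputs here are strictly positive vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def art_like_cluster(vectors, vigilance=0.95, beta=0.5):
    """Level 2: vigilance-gated prototype clustering (simplified, ART-inspired)."""
    prototypes, labels = [], []
    for v in vectors:
        best, best_sim = -1, -1.0
        for j, p in enumerate(prototypes):
            s = cosine(v, p)
            if s > best_sim:
                best, best_sim = j, s
        if best >= 0 and best_sim >= vigilance:
            # Resonance: move the winning prototype toward the input.
            p = prototypes[best]
            prototypes[best] = [(1 - beta) * pi + beta * vi
                                for pi, vi in zip(p, v)]
            labels.append(best)
        else:
            # Mismatch: commit a new cluster seeded with this input.
            prototypes.append(list(v))
            labels.append(len(prototypes) - 1)
    return labels, prototypes


# Hypothetical toy term vectors: 6 documents over 8 terms.
docs = [
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.9, 1.0, 0.1, 0.0, 0.0, 0.0, 0.0],
    [0.9, 1.0, 0.8, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.1, 0.0, 0.0, 0.9, 1.0, 1.0],
    [0.0, 0.1, 0.0, 0.0, 0.0, 1.0, 0.8, 0.9],
]
weights = train_som(docs, n_units=3)      # 8 terms -> 3 SOM units
reduced = reduce_docs(docs, weights)      # one 3-D vector per document
labels, prototypes = art_like_cluster(reduced, vigilance=0.99)
```

In this sketch the vigilance threshold controls cluster granularity: a low value merges every document into one cluster, while a value above 1 forces a new cluster per document. The number of SOM units sets the dimensionality of the reduced space that the second level sees.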