Efficient fusion of cluster ensembles using inherent voting

Bibliographic details
Main authors: Anandhi, R.J., Subramanyam, N.
Format: Conference proceeding
Language: English
Description
Abstract: Discovering interesting, implicit knowledge and general relationships in geographic information databases is essential for understanding and using spatial data. Spatial clustering is recognized as a primary data mining method for knowledge discovery in spatial databases. In this paper, we analyze an efficient method for fusing the outputs of multiple clusterers with less computation. We discuss our proposed slice and dice cluster ensemble merging technique (SDEM) for spatial datasets and use it in our three-phase clustering combination technique. A voting procedure is normally used to assign labels to the clusters and to resolve the correspondence problem; we eliminate this step by using a degree of agreement vector. Another common problem in cluster ensembles is the computation of the voting matrix, which is on the order of n^2, where n is the number of data points, and is therefore very expensive for spatial datasets. In our method, as we travel down the layered merge, we calculate a degree of agreement (DOA) factor based on the count of clusterers that agree. Using the updated DOA at every layer, the movement of unresolved, unsettled data elements is handled at a much reduced computational cost. An added advantage of this approach is the reuse of knowledge gained in previous layers, which yields better cluster accuracy and robustness.
DOI:10.1109/IAMA.2009.5228053
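
The abstract describes a degree of agreement (DOA) based on the count of clusterers that agree, used in place of an n-by-n voting matrix. The sketch below illustrates only that general idea and is not the authors' SDEM procedure, whose details are not given in this record. It assumes, purely for illustration, that the cluster labels are already aligned across clusterers. The function names, the threshold parameter, and the sample labels are all hypothetical.

```python
from collections import Counter


def degree_of_agreement(labelings):
    """Return (consensus, doa) for m labelings of the same n points.

    ASSUMPTION: labels are already aligned across clusterers, so label k
    means the same cluster in every run. Cost is O(n * m), not O(n^2).
    """
    m = len(labelings)
    n = len(labelings[0])
    consensus, doa = [], []
    for i in range(n):
        # Count how many clusterers assign each label to point i.
        votes = Counter(run[i] for run in labelings)
        label, count = votes.most_common(1)[0]
        consensus.append(label)
        doa.append(count / m)  # fraction of clusterers that agree
    return consensus, doa


def split_by_agreement(doa, threshold=1.0):
    """Separate settled points from unresolved ones (hypothetical rule)."""
    settled = [i for i, d in enumerate(doa) if d >= threshold]
    unresolved = [i for i, d in enumerate(doa) if d < threshold]
    return settled, unresolved


# Three hypothetical clusterers labelling five points.
labelings = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 2, 2],
    [0, 1, 1, 1, 2],
]
consensus, doa = degree_of_agreement(labelings)
settled, unresolved = split_by_agreement(doa)
print(consensus, settled, unresolved)
```

In a layered scheme such as the one the abstract outlines, only the unresolved points would presumably need further processing at the next layer, which is where the saving over a full pairwise matrix would come from.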