Munec: a mutual neighbor-based clustering algorithm
Published in: | Information Sciences 2019-06, Vol. 486, p. 148-170 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | • Munec is a mutual neighbor-based clustering algorithm. • It manages both density and distance concepts. • It includes heuristics that help in density differentiation. • It is driven by a single, meaningful parameter.
New clustering algorithms are expected to find the appropriate number of clusters when dealing with complex data, that is, data with clusters of various shapes and densities. They are also expected to be self-tuning and adaptive, so that the input parameters only differentiate between acceptable solutions. This work addresses this challenge. First, mutual nearest neighbors are merged without any constraint until the number of groups containing at least two items reaches a maximum. Subsequent merges are only possible between mutual neighbor groups with a similar distance between neighbors. Finally, to manage more nuanced situations, heuristics that combine local density and distance are defined. The whole strategy aims to progressively consolidate the data representation structures. Munec requires several parameters, but most of them are integrated as constants, and a single user parameter controls the process: the higher its value, the more constraints there are on merging and the higher the number of clusters. Tests carried out on 2-dimensional datasets showed that Munec is highly effective at matching a ground truth target. Moreover, with the same input configuration, it can identify clusters of various densities and arbitrary shapes, even in the presence of a large amount of noise. These results hold for spaces of moderate dimension. |
---|---|
ISSN: | 0020-0255, 1872-6291 |
DOI: | 10.1016/j.ins.2019.02.051 |
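
The first stage described in the abstract (unconstrained merging of mutual nearest neighbors until the number of groups with at least two items peaks) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the distance between groups is assumed to be the closest-pair (single-linkage) distance, and the stopping rule is assumed to be "stop as soon as a merging pass no longer increases the count". The later stages (the similar-distance constraint, the density heuristics, and the user parameter) are not covered.

```python
import math


def group_distance(points, a, b):
    # Assumption: distance between two groups is the closest-pair distance.
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def nearest_group(points, groups, g):
    # Index of the group closest to groups[g] (ties go to the lowest index).
    return min(
        (h for h in range(len(groups)) if h != g),
        key=lambda h: group_distance(points, groups[g], groups[h]),
    )


def merge_mutual_neighbors(points, groups):
    # One pass: merge every pair of groups that are each other's nearest group.
    nearest = [nearest_group(points, groups, g) for g in range(len(groups))]
    merged, used = [], set()
    for g, h in enumerate(nearest):
        if g in used:
            continue
        if nearest[h] == g and h not in used:
            merged.append(groups[g] + groups[h])
            used.update((g, h))
        else:
            merged.append(groups[g])
            used.add(g)
    return merged


def count_multi(groups):
    # Number of groups that contain at least two items.
    return sum(1 for g in groups if len(g) >= 2)


def first_phase(points):
    # Start from singletons and merge mutual neighbors pass by pass.
    groups = [[i] for i in range(len(points))]
    while len(groups) > 1:
        candidate = merge_mutual_neighbors(points, groups)
        # Assumed stopping rule: the count of multi-item groups has peaked.
        if count_multi(candidate) <= count_multi(groups):
            break
        groups = candidate
    return groups


# Two tight pairs and one isolated point (indices refer to this list).
points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
groups = first_phase(points)
```

On this toy input the first pass forms the two pairs and leaves the isolated point alone; a second pass would fuse the pairs and lower the count of multi-item groups, so the sketch stops before applying it. The brute-force distance search is quadratic per pass and is meant only for illustration on small datasets.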