F-Mapper: A Fuzzy Mapper clustering algorithm

Bibliographic details
Published in: Knowledge-Based Systems, 2020-02, Vol. 189, p. 105107, Article 105107
Main authors: Bui, Quang-Thinh; Vo, Bay; Do, Hoang-Anh Nguyen; Hung, Nguyen Quoc Viet; Snasel, Vaclav
Format: Article
Language: English
Description
Summary: The use of topology in data analysis, known as Topological Data Analysis (TDA), is a promising new area of data mining research. One of the foundational tools of TDA is the Mapper algorithm. Over the past two decades, this algorithm has proven useful and robust for extracting insights and meaningful information from high-dimensional datasets. Nevertheless, the algorithm can still be developed further through different choices of its parameters, such as the lens, cover, and clustering. In this paper, we propose the F-Mapper algorithm, built on the foundation of the Mapper algorithm, to automate the division of the cover into intervals with an arbitrary percentage of overlap. To assess the effectiveness of this enhanced algorithm, experiments were carried out on three datasets: the Unit Circle, Reaven and Miller Diabetes, and NKI Breast Cancer. The experimental results are analyzed and compared with those of the original Mapper algorithm, using the output image and the silhouette coefficient score to evaluate the clustering.
• The Mapper algorithm is optimized to choose the cover automatically, with a random overlap percentage.
• The F-Mapper outputs are rather similar to those of the Mapper in the topological sense.
• The F-Mapper results are well clustered according to the silhouette coefficient score.
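For readers unfamiliar with the cover step mentioned in the summary, the following is a minimal sketch of the conventional fixed cover used in the original Mapper algorithm: equal-length intervals over the range of lens values, with a fixed fractional overlap between neighbours. It is not the F-Mapper procedure, whose details the summary does not give. The function names `uniform_cover` and `assign_to_cover` and their parameters are hypothetical and chosen only for illustration.

```python
def uniform_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n_intervals equal-length intervals.

    `overlap` is the fraction (0 <= overlap < 1) of an interval's length
    that it shares with the next interval. Illustrative baseline only;
    this is the fixed cover of the original Mapper, not F-Mapper.
    """
    if n_intervals < 1:
        raise ValueError("n_intervals must be at least 1")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # n intervals of length L, with (n - 1) overlaps of size L * overlap,
    # must span hi - lo exactly.
    length = (hi - lo) / (n_intervals - overlap * (n_intervals - 1))
    step = length * (1 - overlap)  # distance between interval starts
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def assign_to_cover(lens_values, cover):
    """For each interval, list the indices of the lens values it contains.

    A point falling in an overlap region appears in both intervals,
    which is what later links clusters into a Mapper graph.
    """
    return [
        [idx for idx, v in enumerate(lens_values) if a <= v <= b]
        for a, b in cover
    ]


if __name__ == "__main__":
    cover = uniform_cover(0.0, 1.0, n_intervals=3, overlap=0.5)
    print(cover)
    print(assign_to_cover([0.1, 0.3, 0.6, 0.9], cover))
```

In this baseline, both the number of intervals and the overlap fraction must be chosen by hand; according to the summary, automating that division of the cover is the problem F-Mapper addresses.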
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2019.105107