Interpretable interval type-2 fuzzy predicates for data clustering: A new automatic generation method based on self-organizing maps

Bibliographic details
Published in: Knowledge-based systems 2017-10, Vol. 133, p. 234-254
Main authors: Comas, Diego S., Pastore, Juan I., Bouchet, Agustina, Ballarin, Virginia L., Meschino, Gustavo J.
Format: Article
Language: English
Description
Summary:
• A new clustering method based on interval type-2 fuzzy predicates and SOMs is proposed.
• SOMs are automatically configured and trained.
• Fuzzy predicates are generated using cluster prototypes extracted from SOMs.
• Linguistic knowledge is obtained from the automatically generated predicates.
• The proposed method outperforms existing clustering methods based on fuzzy predicates.

In previous works, we proposed two methods for data clustering based on automatically discovered fuzzy predicates, referred to as SOM-based Fuzzy Predicate Clustering (SFPC) [Meschino et al., Neurocomputing, 147, 47–59 (2015)] and Type-2 Data-based Fuzzy Predicate Clustering (T2-DFPC) [Comas et al., Expert Syst. Appl., 68, 136–150 (2017)]. In these methods, fuzzy predicates allow both data clustering and knowledge discovery about the obtained clusters. This last feature is novel compared with other existing approaches and is a major contribution to the data clustering field. Building on these previous methods, the present paper proposes a new automatic clustering method based on fuzzy predicates that uses Self-Organizing Maps (SOMs), called Type-2 SOM-based Fuzzy Predicate Clustering (T2-SFPC). The new method does not require any prior knowledge about the clustering problem addressed. First, a random partition is defined on the dataset to be clustered, and SOMs are configured and trained using the resulting data subsets. Second, an automatic clustering approach is applied to the SOM codebooks, discovering representative data of the different clusters, which are called cluster prototypes. Third, interval type-2 membership functions formed by Gaussian-shaped sub-functions, together with fuzzy predicates, are defined, allowing data clustering and its interpretation.
The proposed method preserves all the advantages of the previous methods SFPC and T2-DFPC regarding knowledge extraction capabilities and potential application to distributed clustering and parallel computing. In addition, results obtained on several public datasets showed that the clusters defined by T2-SFPC are more compact and better separated: considering internal and external validation indices, T2-SFPC outperformed both the previous methods and the several classical clustering approaches tested. Both clustering interpretation and optimization capabilities are also improved by the proposed method when compared with SFPC and T2-DFPC.
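
To make the third step more concrete, the following is a minimal sketch of an interval type-2 membership function built from Gaussian sub-functions. It is not the construction used in the paper, which the summary does not specify. It assumes a common textbook form: a Gaussian centred on a cluster prototype whose width is uncertain, so each input gets a membership interval instead of a single degree. The names `it2_membership`, `assign_cluster`, `sigma_narrow` and `sigma_wide`, the one-dimensional data, and the midpoint decision rule are all illustrative assumptions.

```python
import math


def gaussian(x, center, sigma):
    """Ordinary (type-1) Gaussian membership degree of x, in (0, 1]."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def it2_membership(x, center, sigma_narrow, sigma_wide):
    """Return the (lower, upper) membership interval of x.

    Assumption: the centre is fixed at a cluster prototype and only the
    width is uncertain. With a shared centre, the narrow Gaussian never
    exceeds the wide one, so it gives the lower bound.
    """
    if not 0 < sigma_narrow <= sigma_wide:
        raise ValueError("expected 0 < sigma_narrow <= sigma_wide")
    lower = gaussian(x, center, sigma_narrow)
    upper = gaussian(x, center, sigma_wide)
    return lower, upper


def assign_cluster(x, prototypes, sigma_narrow, sigma_wide):
    """Index of the prototype with the largest interval midpoint.

    The midpoint rule is a hypothetical simplification for this sketch.
    """
    def midpoint(center):
        lower, upper = it2_membership(x, center, sigma_narrow, sigma_wide)
        return 0.5 * (lower + upper)

    return max(range(len(prototypes)), key=lambda k: midpoint(prototypes[k]))
```

For example, `assign_cluster(0.9, [0.0, 1.0], 0.5, 1.0)` selects the second prototype, since 0.9 lies much closer to 1.0 than to 0.0. The gap between the lower and upper bounds indicates how uncertain the membership degree is at that point.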
ISSN: 0950-7051, 1872-7409
DOI:10.1016/j.knosys.2017.07.012