Semantically Segmented Clustering Based on Possibilistic and Rough Set Theories
Published in: | International journal of intelligent systems 2015-06, Vol.30 (6), p.676-706 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | This paper reports the application of a clustering method based on possibility theory and rough sets to semantically segmented real‐world databases. The approach is an improved version of the well‐known k‐modes algorithm. It is a soft clustering method that assigns instances with uncertain categorical values to different clusters according to their membership degrees. Possibility theory is used to handle uncertainty in the values of attributes and in the memberships of clusters. Rough sets are used to detect clusters with rough boundaries. We demonstrate the effectiveness of the proposed approach on two real‐world databases: a retail store, or transactions, data set and a mobile phone data set. The numeric values of attributes are segmented into semantically meaningful linguistic values using a novel discretization method. These linguistic values allow a more natural interpretation of knowledge through possibilistic degrees. The possibilistic degrees describe our knowledge about the values of attributes (fully plausible to occur, may occur, or rejected) and identify the level of uncertainty in memberships to different clusters. In addition, the method identifies peripheral objects by computing the approximation sets defined in rough set theory. The k‐modes algorithm enhanced with rough set and possibility theories can provide semantically meaningful information for decision making to store owners (retail data set) and telecommunication companies (mobile phone data set). |
---|---|
ISSN: | 0884-8173 1098-111X |
DOI: | 10.1002/int.21723 |
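
The abstract combines three ideas: k-modes style matching on categorical values, possibilistic membership degrees, and rough-set lower and upper approximations for objects on cluster boundaries. The Python sketch below illustrates how these pieces could fit together. It is not the authors' algorithm: the normalization of match counts into degrees, the `threshold` parameter, the function names, and the toy attribute values are all assumptions made for illustration.

```python
def possibilistic_memberships(obj, modes):
    """Degree in [0, 1] of obj for each cluster mode.

    Assumption: the degree is the simple-matching count against each mode,
    normalized so that the best-matching cluster gets degree 1.0.
    """
    counts = [sum(a == b for a, b in zip(obj, mode)) for mode in modes]
    top = max(counts)
    if top == 0:
        return [0.0] * len(modes)
    return [c / top for c in counts]


def rough_approximations(objects, modes, threshold=0.8):
    """Return (lower, upper) lists of object indices per cluster.

    Assumption: an object is a candidate for every cluster whose degree
    reaches `threshold`. It goes into the lower approximation only when it
    has exactly one candidate cluster; otherwise it is a peripheral
    (boundary) object that appears only in upper approximations.
    """
    lower = [[] for _ in modes]
    upper = [[] for _ in modes]
    for idx, obj in enumerate(objects):
        degrees = possibilistic_memberships(obj, modes)
        candidates = [k for k, d in enumerate(degrees) if d >= threshold]
        for k in candidates:
            upper[k].append(idx)
        if len(candidates) == 1:
            lower[candidates[0]].append(idx)
    return lower, upper


# Hypothetical linguistic values, as if produced by a discretization step.
modes = [("low", "young", "prepaid"), ("high", "senior", "contract")]
objects = [
    ("low", "young", "prepaid"),     # matches the first mode exactly
    ("high", "senior", "contract"),  # matches the second mode exactly
    ("low", "senior", "contract"),   # closer to the second mode
    ("low", "senior", "family"),     # ties between both modes
]
lower, upper = rough_approximations(objects, modes)
print(lower)
print(upper)
```

In this toy run the last object matches each mode on one attribute, so it receives the same degree for both clusters and lands in both upper approximations but in neither lower approximation, which is the sense in which a peripheral object is described above.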