Dimensional reduction using principal component analysis by alternating least squares on fuzzy possibilistic C-means for new student economic data
Main authors: | , , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Outliers in a dataset strongly affect the quality of cluster analysis. Clustering methods used for data with outliers include Fuzzy C-Means (FCM), Possibilistic C-Means (PCM) and Fuzzy Possibilistic C-Means (FPCM). The weakness of FCM is that it remains sensitive to outliers. PCM can cluster data containing outliers but may produce identical clusters when cluster centers lie close to one another. This study used a mixed-scale dataset with ratio, nominal and ordinal variables. Outlier detection using the Mahalanobis distance flagged 6.5% of the dataset as outliers. Principal Component Analysis by Alternating Least Squares (PRINCALS) was used to reduce the dimensionality of the mixed-scale data. Measured by the ratio of the Between Sum of Squares (BSS) to the Total Sum of Squares (TSS), FPCM with PRINCALS scored 70.71%, better than FPCM without PRINCALS at 56.37%. FCM scored 68.16% with PRINCALS and 52.82% without. PCM performed very poorly, at 0%, because many cluster center values were similar. The FPCM cluster results are then used to determine the Single Tuition Fee (STF) classification. |
---|---|
ISSN: | 0094-243X 1551-7616 |
DOI: | 10.1063/5.0242144 |
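The summary relies on two quantities: the Mahalanobis distance for outlier detection and the BSS/TSS ratio for cluster quality. The sketch below illustrates both with NumPy. It is not the authors' implementation: the function names, the hard (crisp) cluster labels, and the toy data are assumptions made for illustration, and the record does not state which outlier cutoff the study applied, so none is hard-coded here.

```python
import numpy as np


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of every row to the sample mean.

    Assumption: the sample mean and sample covariance are used as the
    reference; the catalog record does not say which estimator was used.
    """
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    # pinv equals the ordinary inverse when the covariance is non-singular
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


def bss_tss_ratio(X, labels):
    """Between Sum of Squares divided by Total Sum of Squares.

    Assumption: each observation carries one hard cluster label, e.g. the
    cluster with the highest membership after a fuzzy method has run.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    tss = ((X - grand_mean) ** 2).sum()
    bss = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        centre = members.mean(axis=0)
        bss += len(members) * ((centre - grand_mean) ** 2).sum()
    return bss / tss


# Toy data (hypothetical): two well separated groups of two points each.
toy = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
ratio_separated = bss_tss_ratio(toy, [0, 0, 1, 1])
ratio_collapsed = bss_tss_ratio(toy, [0, 0, 0, 0])

# Toy data (hypothetical): five nearby points plus one distant point.
with_outlier = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [10.0, 10.0]]
)
d2 = mahalanobis_sq(with_outlier)
```

A ratio near 1 means the cluster centers account for most of the total variation. When every observation falls into one cluster, or all centers coincide, BSS is zero and so is the ratio, which is consistent with the 0% reported for PCM. To flag outliers, compare `d2` against a cutoff of your choice; a chi-square quantile is a common convention.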