Feature selection based on multiview entropy measures in multiperspective rough set
Published in: | International journal of intelligent systems 2022-10, Vol.37 (10), p.7200-7234 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Abstract: | The performance of the neighborhood rough set model in feature selection is limited by a nonobjective parameter selection method, by uncertainty measures that consider only a single view, and by the high time cost of processing high‐dimensional data. To solve these problems, this study first defines the interclass boundary to granulate the samples in different classes, and puts forward three types of neighborhood concepts (negative perspective, neutral perspective, and positive perspective) based on different cognitive perspectives. On this basis, the multiperspective rough set model is developed. The most prominent feature of this model is that it discovers differences between classes from the given data without any parameters. Second, by integrating the information-theoretic and algebraic views under the multiperspective rough set model, multiview entropy measures are proposed to measure the uncertainty in data effectively. Moreover, a nonmonotonic feature selection algorithm is designed that uses the mutual information in the multiview entropy measures under the neutral perspective as the evaluation function of feature importance, to address the disadvantages of algorithms based on a monotone evaluation function. Finally, Information Gain is introduced to preliminarily reduce the dimension of high‐dimensional data sets, to improve classification accuracy and reduce time consumption. The experimental results confirm that the proposed algorithm is efficient in eliminating noise and increasing classification accuracy. |
---|---|
ISSN: | 0884-8173, 1098-111X |
DOI: | 10.1002/int.22878 |
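
The abstract mentions Information Gain as a preliminary step for reducing the dimension of high‐dimensional data sets. The following is a minimal sketch of that general idea only, assuming discrete feature values and class labels. It is not the authors' implementation, and the function names (`entropy`, `information_gain`, `prefilter`) and the cutoff `k` are hypothetical, chosen for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature column."""
    total = len(labels)
    groups = {}
    # Group the class labels by the value the feature takes on each sample.
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def prefilter(columns, labels, k):
    """Return the indices of the k columns with the highest information gain.

    `k` is an assumed, user-chosen cutoff; the abstract does not state one.
    """
    ranked = sorted(
        range(len(columns)),
        key=lambda i: information_gain(columns[i], labels),
        reverse=True,
    )
    return sorted(ranked[:k])


labels = [0, 0, 1, 1]
col0 = [0, 0, 1, 1]  # determines the class exactly
col1 = [0, 1, 0, 1]  # independent of the class
col2 = [0, 0, 0, 1]  # partially informative
kept = prefilter([col0, col1, col2], labels, k=2)
```

In this toy data the uninformative column `col1` is the one dropped, so a downstream selector (in the paper, the mutual-information-based one) would only need to examine the remaining columns.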