Low-Redundant Unsupervised Feature Selection based on Data Structure Learning and Feature Orthogonalization

Bibliographic details
Published in: Expert Systems with Applications, 2024-04, Vol. 240, p. 122556, Article 122556
Main authors: Samareh-Jahani, Mahsa; Saberi-Movahed, Farid; Eftekhari, Mahdi; Aghamollaei, Gholamreza; Tiwari, Prayag
Format: Article
Language: English
Online access: Full text
Description
Summary: An orthogonal representation of features can offer valuable insights into feature selection, as it aims to find a representative subset of features from which all features can be accurately reconstructed using a set of features that are linearly independent, uncorrelated, and perpendicular to each other. In this paper, a novel feature selection method, called Low-Redundant Unsupervised Feature Selection based on Data Structure Learning and Feature Orthogonalization (LRDOR), is presented. In the first stage, LRDOR applies the QR factorization to the whole set of features to find the orthogonal representation of the feature space. Then, LRDOR uses a directional distance based on the matrix factorization to measure the distance between the set of considered features and the orthogonal set obtained from the original features. Moreover, LRDOR simultaneously incorporates the local correlation of features and the data manifold as dual information in the feature selection process, which can lead to a low level of redundancy and preserve the geometric data structure when reducing the data dimension. An efficient iterative algorithm is provided to solve the objective function of LRDOR, together with its convergence analysis. Experimental results demonstrate that, for clustering purposes, LRDOR works better than other related state-of-the-art unsupervised feature selection methods on ten real-world datasets.

Highlights:
• The proposed feature selection method, LRDOR, selects low-redundant features.
• LRDOR uses the orthogonal representation idea to obtain uncorrelated features.
• LRDOR defines a subspace distance via orthogonal representation of features.
• LRDOR captures local correlation among features and data structure manifold.
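
The first stage mentioned in the summary, a QR factorization of the feature matrix, can be illustrated with a minimal sketch. This is not the LRDOR algorithm: the directional distance, the local correlation term, and the manifold term are not shown, because the summary does not specify them. The toy data, the variable names, and the use of the diagonal of R as a redundancy indicator are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical toy data: 50 samples, 6 features (not from the paper).
rng = np.random.default_rng(0)
n_samples, n_features = 50, 6
X = rng.standard_normal((n_samples, n_features))

# Make the last feature almost a copy of the first to simulate redundancy.
X[:, 5] = X[:, 0] + 0.01 * rng.standard_normal(n_samples)

# Reduced QR factorization: X = Q R, where the columns of Q are orthonormal
# and R is upper triangular.
Q, R = np.linalg.qr(X, mode="reduced")

# Q is an orthogonal representation of the feature space, and every
# original feature is reconstructed exactly as a combination of its columns.
orthonormal = np.allclose(Q.T @ Q, np.eye(n_features))
reconstructs = np.allclose(Q @ R, X)

# |R[j, j]| is the norm of the part of feature j that is orthogonal to
# features 0..j-1, so a small value flags a feature that adds little
# beyond the features preceding it.
new_direction = np.abs(np.diag(R))
most_redundant = int(np.argmin(new_direction))

print("Q has orthonormal columns:", orthonormal)
print("Q @ R reconstructs X:", reconstructs)
print("feature adding the least new direction:", most_redundant)
```

In this sketch the near-duplicate feature ends up with a very small diagonal entry in R, which gives some intuition for why an orthogonal representation can expose redundancy among features. Note that the diagonal of R depends on column order, so it is only a rough indicator here, not a selection criterion from the paper.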
ISSN:0957-4174
1873-6793
DOI:10.1016/j.eswa.2023.122556