Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy with application in gene selection



Bibliographic details
Published in:	Knowledge-based systems 2022-11, Vol.256, p.109884, Article 109884
Main authors:	Saberi-Movahed, Farid; Rostami, Mehrdad; Berahmand, Kamal; Karami, Saeed; Tiwari, Prayag; Oussalah, Mourad; Band, Shahab S.
Format:	Article
Language:	English
Subjects:	Feature selection; Gene expression data; Health Innovation; Hälsoinnovation; Matrix factorization; Minimum redundancy; Regularization
Online access:	Full text

Description
Abstract:	Gene expression data have become increasingly important in machine learning and computational biology over the past few years. In the field of gene expression analysis, several matrix factorization-based dimensionality reduction methods have been developed. However, such methods can still be improved in terms of efficiency and reliability. In this paper, an innovative approach to feature selection, called Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy (DR-FS-MFMR), is introduced. The major focus of DR-FS-MFMR is to discard redundant features from the set of original features. To reach this target, the primary feature selection problem is defined in terms of two aspects: (1) the matrix factorization of the data matrix in terms of the feature weight matrix and the representation matrix, and (2) the correlation information related to the selected feature set. Then, the objective function is enriched by employing two data representation characteristics along with an inner product regularization criterion to perform both the redundancy minimization process and the sparsity task more precisely. To demonstrate the proficiency of the DR-FS-MFMR method, a large number of experimental studies are conducted on nine gene expression datasets. The obtained computational results indicate the efficiency and productivity of DR-FS-MFMR for the gene selection task.
ISSN:	0950-7051, 1872-7409
DOI:	10.1016/j.knosys.2022.109884
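
Illustrative sketch: the abstract names three ingredients (a factorization of the data matrix into a feature weight matrix and a representation matrix, correlation information about the selected features, and an inner product regularization term) but does not state the objective function. The Python snippet below is therefore only a generic toy in that spirit, not the DR-FS-MFMR algorithm. The function name `select_features`, the objective ||X - XWH||^2 + sum over i != j of (alpha + beta * |corr_ij|) * <w_i, w_j>, the multiplicative updates, and the parameters `alpha`, `beta` and `n_iter` are all assumptions made for illustration.

```python
import numpy as np


def select_features(X, n_select, n_iter=200, alpha=1.0, beta=1.0, seed=0):
    """Toy unsupervised feature selection via X ~ X @ W @ H with W, H >= 0.

    Hypothetical sketch only; this is NOT the objective from the paper.
    """
    rng = np.random.default_rng(seed)

    # Min-max scale each feature to [0, 1] so that X.T @ X is nonnegative,
    # which the multiplicative updates below rely on.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    X = (X - lo) / np.where(span > 0, span, 1.0)

    d = X.shape[1]
    A = X.T @ X  # d x d Gram matrix of the features

    # Absolute feature-feature correlation serves as the redundancy weight.
    C = np.abs(np.nan_to_num(np.corrcoef(X, rowvar=False)))

    # Penalty weight on <w_i, w_j> for i != j: a constant inner product
    # term (alpha) plus a correlation-weighted redundancy term (beta).
    M = alpha + beta * C
    np.fill_diagonal(M, 0.0)

    W = rng.random((d, n_select))  # feature weight matrix (assumed shape)
    H = rng.random((n_select, d))  # representation matrix (assumed shape)
    eps = 1e-12                    # guards against division by zero

    for _ in range(n_iter):
        # NMF-style multiplicative updates keep W and H nonnegative.
        W *= (A @ H.T) / (A @ W @ (H @ H.T) + M @ W + eps)
        H *= (W.T @ A) / (W.T @ A @ W @ H + eps)

    # Rank features by the Euclidean norm of their row in W.
    scores = np.linalg.norm(W, axis=1)
    selected = np.argsort(scores)[::-1][:n_select]
    return selected, W, H
```

Calling `select_features(X, 50)` on a samples-by-genes array `X` would return the indices of 50 candidate genes together with the two factors. Ranking features by the row norms of the weight matrix is a common convention in matrix factorization-based feature selection and is likewise assumed here rather than taken from the abstract.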