A Gene Selection Strategy for Enhancing Single-Cell RNA-Seq Data Integration


Bibliographic details
Published in: Engineering Proceedings, 2023-11, Vol. 50 (1), p. 12
Main authors: Konstantinos Lazaros, Georgios N. Dimitrakopoulos, Panagiotis Vlamos, Aristidis G. Vrahatis
Format: Article
Language: English
Description
Summary: Cancer remains a pervasive and formidable disease, and both diagnosis and therapy call for advanced techniques. Molecular biology has become a crucial tool for deciphering the biological mechanisms underlying various types of cancer. Single-cell sequencing in particular has attracted significant attention as a state-of-the-art method for profiling gene expression in individual cells, revealing mechanisms and biological phenomena that were previously concealed. Given the abundance of available single-cell datasets, there is a pressing need to integrate related datasets into larger ones, both to improve our understanding of biological processes and to strengthen predictive capabilities. In this study, we investigate how gene selection, implemented through feature selection techniques, affects the integration of single-cell datasets. By systematically exploring these effects, we aim to improve the integration process and thereby obtain better biological insights and predictive power. The proposed method is intended to enhance two cutting-edge data integration methodologies for single-cell RNA sequencing (scRNA-seq). It combines two key components: a statistical approach that isolates genes with high expression variability across cells or samples, and an XGBoost-based feature selection strategy that retains the genes important for distinguishing between healthy and cancerous cells.
ISSN:2673-4591
DOI:10.3390/engproc2023050012
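
The two-component strategy described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which statistical variability measure, thresholds, or XGBoost settings the article uses. The sketch assumes plain per-gene variance as the variability measure, and the function names (`top_variable_genes`, `xgboost_importance`, `select_genes`), the hyperparameters, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def top_variable_genes(X, n_keep):
    """Return sorted column indices of the n_keep genes with the highest variance.

    Assumption: plain variance across cells stands in for the unspecified
    statistical variability criterion.
    """
    variances = X.var(axis=0)
    order = np.argsort(variances)[::-1]
    return np.sort(order[:n_keep])


def xgboost_importance(X, y):
    """Score genes by XGBoost feature importance (needs the xgboost package).

    The hyperparameters below are illustrative, not taken from the article.
    """
    from xgboost import XGBClassifier  # lazy import: the variance step works without it

    model = XGBClassifier(n_estimators=50, max_depth=3, random_state=0)
    model.fit(X, y)
    return model.feature_importances_


def select_genes(X, y, n_variable, n_final, importance_fn=xgboost_importance):
    """Two-stage selection: variance filter first, then supervised ranking."""
    variable_idx = top_variable_genes(X, n_variable)
    scores = np.asarray(importance_fn(X[:, variable_idx], y))
    best = np.argsort(scores)[::-1][:n_final]
    # Map positions in the filtered matrix back to original gene indices.
    return np.sort(variable_idx[best])


# Synthetic demo: 200 cells x 50 genes, labels 0 = healthy, 1 = cancerous.
# Gene 7 is shifted upward in the cancerous cells; all other genes are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 50))
X[:, 7] += 4.0 * y
```

With the xgboost package installed, `select_genes(X, y, n_variable=10, n_final=3)` runs both stages on the synthetic matrix. The `importance_fn` parameter is there so that the supervised scorer can be swapped out; in a real workflow the reduced gene set would then be passed to the chosen integration method.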