AGImpute: imputation of scRNA-seq data based on a hybrid GAN with dropouts identification
Published in: | Bioinformatics (Oxford, England), 2024-02, Vol.40 (2) |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Abstract
Motivation
Dropout events complicate the analysis of single-cell RNA sequencing (scRNA-seq) data because they introduce noise and distort the true distributions of gene expression profiles. Recent studies focus on estimating dropout probability and imputing dropout events by leveraging information from similar cells or genes. However, the number of dropout events varies across cells owing to complex factors such as sequencing protocol, cell type, and batch effects. These differences are not fully considered when assessing similarities between cells and genes, which compromises the reliability of downstream analysis.
Results
This work proposes AGImpute, a hybrid Generative Adversarial Network (GAN) with dropout identification for imputing single-cell RNA sequencing data. First, the number of dropout events in each cell of the scRNA-seq data is estimated separately using a dynamic threshold estimation strategy. Next, the identified dropout events are imputed by a hybrid deep learning model that combines an Autoencoder with a Generative Adversarial Network. To validate its efficiency, AGImpute is compared with seven state-of-the-art dropout imputation methods on two simulated datasets and seven real single-cell RNA sequencing datasets. The results show that AGImpute imputes fewer dropout events than the other methods. Moreover, AGImpute improves downstream analysis, including clustering, identification of cell-specific marker genes, and trajectory inference on the time-course dataset.
Availability and implementation
The source code can be obtained from https://github.com/xszhu-lab/AGImpute. |
---|---|
ISSN: | 1367-4803, 1367-4811 |
DOI: | 10.1093/bioinformatics/btae068 |
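
The two-step idea in the abstract (first decide, cell by cell, which zeros are dropouts, then impute only those entries) can be illustrated with a small sketch. The abstract does not say how the dynamic threshold is computed, so the rule used here is purely an assumption for illustration: each cell's threshold is the median of its own nonzero counts, and a zero is flagged when the gene's mean nonzero expression in other cells exceeds that threshold. The function names `identify_dropouts` and `impute_flagged` are hypothetical, and the gene mean stands in for the output of the Autoencoder/GAN model, which is not reproduced here.

```python
from statistics import mean, median


def identify_dropouts(counts):
    """Flag zeros that look like dropouts, using a per-cell threshold.

    counts: list of rows (cells), each a list of gene counts.
    Illustrative assumption, not AGImpute's actual rule: a zero in a cell is
    flagged when the gene's mean nonzero expression (necessarily observed in
    other cells) exceeds the median of that cell's own nonzero counts.
    Returns (mask, gene_means).
    """
    n_genes = len(counts[0])

    # Mean of the nonzero values of each gene; 0.0 if the gene is never seen.
    gene_means = []
    for j in range(n_genes):
        nonzero = [row[j] for row in counts if row[j] > 0]
        gene_means.append(mean(nonzero) if nonzero else 0.0)

    mask = []
    for row in counts:
        nonzero = [v for v in row if v > 0]
        # Cell-specific threshold; an all-zero cell flags nothing.
        threshold = median(nonzero) if nonzero else float("inf")
        mask.append(
            [v == 0 and gene_means[j] > threshold for j, v in enumerate(row)]
        )
    return mask, gene_means


def impute_flagged(counts, mask, fill_values):
    """Replace only flagged entries; all other values are left untouched."""
    return [
        [fill_values[j] if mask[i][j] else v for j, v in enumerate(row)]
        for i, row in enumerate(counts)
    ]


# Toy matrix: 3 cells x 4 genes. Gene 3 is never expressed, so its zeros
# are treated as true zeros rather than dropouts.
counts = [
    [5, 0, 2, 0],
    [4, 3, 0, 0],
    [6, 9, 1, 0],
]
mask, gene_means = identify_dropouts(counts)
imputed = impute_flagged(counts, mask, gene_means)
```

Because the threshold is computed per cell, the same gene can be flagged in one cell and left as zero in another, and zeros that are not flagged are never overwritten. That selectivity is the property the abstract emphasizes; the specific threshold and fill value above are placeholders.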