Scanorama: integrating large and diverse single-cell transcriptomic datasets

Bibliographic details
Published in: Nature Protocols 2024-08, Vol. 19 (8), p. 2283-2297
Main authors: Hie, Brian L., Kim, Soochi, Rando, Thomas A., Bryson, Bryan, Berger, Bonnie
Format: Article
Language: English
Description
Summary: Merging diverse single-cell RNA sequencing (scRNA-seq) data from numerous experiments, laboratories and technologies can uncover important biological insights. Nonetheless, integrating scRNA-seq data encounters special challenges when the datasets are composed of diverse cell type compositions. Scanorama offers a robust solution for improving the quality and interpretation of heterogeneous scRNA-seq data by effectively merging information from diverse sources. Scanorama is designed to address the technical variation introduced by differences in sample preparation, sequencing depth and experimental batches that can confound the analysis of multiple scRNA-seq datasets. Here we provide a detailed protocol for using Scanorama within a Scanpy-based single-cell analysis workflow coupled with Google Colaboratory, a cloud-based free Jupyter notebook environment service. The protocol involves Scanorama integration, a process that typically spans 0.5–3 h. Scanorama integration requires a basic understanding of cellular biology, transcriptomic technologies and bioinformatics. Our protocol and new Scanorama–Colaboratory resource should make scRNA-seq integration more widely accessible to researchers.

Key points:
Scanorama is an effective tool for combining multiple single-cell RNA sequencing datasets, addressing technical variation introduced by differences in sample preparation, sequencing depth and experimental batches that can confound the analysis of diverse datasets.
Scanorama can handle multiple batches and dataset types while efficiently and accurately removing batch effects and identifying biologically relevant differences across datasets, making it a compelling option for single-cell RNA sequencing data integration.
ISSN:1754-2189
1750-2799
DOI:10.1038/s41596-024-00991-3