scGREAT: Transformer-based deep-language model for gene regulatory network inference from single-cell transcriptomics


Bibliographic details
Published in: iScience 2024-04, Vol. 27 (4), Article 109352
Main authors: Wang, Yuchen; Chen, Xingjian; Zheng, Zetian; Huang, Lei; Xie, Weidun; Wang, Fuzhou; Zhang, Zhaolei; Wong, Ka-Chun
Format: Article
Language: English
Description
Summary: Gene regulatory networks (GRNs) involve complex and multi-layer regulatory interactions between regulators and their target genes. Precise knowledge of GRNs is important in understanding cellular processes and molecular functions. Recent breakthroughs in single-cell sequencing technology have made it possible to infer GRNs at the single-cell level. Existing methods, however, are limited by expensive computations and sometimes simplistic assumptions. To overcome these obstacles, we propose scGREAT, a framework that infers GRNs from single-cell transcriptomics using gene embeddings and a transformer. scGREAT starts by constructing gene expression and gene biotext dictionaries from scRNA-seq data and gene text information. The representation of TF-gene pairs is learned by optimizing the embedding space with a transformer-based engine. Results show that scGREAT outperformed other contemporary methods on benchmarks. In addition, gene representations from scGREAT provide valuable gene regulation insights, and external validation on spatial transcriptomics illuminated the mechanism behind scGREAT annotation. Moreover, scGREAT identified several TF-target regulations corroborated by existing studies.
Highlights:
• Taking advantage of a transformer backbone and a biomedical language model
• Outperforming SOTA models on 7 benchmark datasets and 4 kinds of gene network platforms
• Employing spatial transcriptomics data as external validation
• Ability to uncover novel relationships between genes
Subjects: Human genetics; Bioinformatics; Computational bioinformatics
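The workflow named in the summary (a gene expression dictionary, then an attention-based representation of a TF-gene pair, then a regulation score) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy expression matrix, the gene names, and the random readout weights are all assumptions made for illustration. The attention step uses identity projections instead of learned ones, and the biotext dictionary is omitted entirely.

```python
import math
import random


def build_expression_dictionary(matrix, gene_names):
    """Map each gene name to its z-scored expression vector across cells."""
    dictionary = {}
    for j, gene in enumerate(gene_names):
        column = [row[j] for row in matrix]
        mean = sum(column) / len(column)
        var = sum((x - mean) ** 2 for x in column) / len(column)
        std = math.sqrt(var) or 1.0  # constant genes: avoid division by zero
        dictionary[gene] = [(x - mean) / std for x in column]
    return dictionary


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    scale = math.sqrt(len(tokens[0]))
    outputs = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        outputs.append([
            sum(w * token[i] for w, token in zip(weights, tokens))
            for i in range(len(query))
        ])
    return outputs


def score_pair(tf, target, dictionary, readout):
    """Return a value in (0, 1) for a TF-target pair (untrained, illustrative)."""
    attended = self_attention([dictionary[tf], dictionary[target]])
    pooled = [(a + b) / 2 for a, b in zip(*attended)]  # mean-pool the two tokens
    return 1.0 / (1.0 + math.exp(-dot(pooled, readout)))


# Toy data (assumption): 4 cells x 3 genes, with hypothetical gene names.
gene_names = ["TF1", "GENE_A", "GENE_B"]
matrix = [
    [1.0, 2.0, 0.0],
    [2.0, 4.1, 0.0],
    [3.0, 5.9, 0.0],
    [4.0, 8.0, 0.0],
]
expression_dictionary = build_expression_dictionary(matrix, gene_names)

rng = random.Random(0)
readout = [rng.uniform(-1, 1) for _ in matrix]  # random stand-in for trained weights

score = score_pair("TF1", "GENE_A", expression_dictionary, readout)
print(f"Illustrative TF1 -> GENE_A score: {score:.3f}")
```

Because the readout weights are random, the printed score carries no biological meaning. In a trained model these weights, and the attention projections, would be fitted against known TF-target labels.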
ISSN: 2589-0042
DOI: 10.1016/j.isci.2024.109352