Capturing cell type-specific chromatin compartment patterns by applying topic modeling to single-cell Hi-C data
Published in: | PLoS Computational Biology, 2020-09, Vol. 16 (9), p. e1008173, Article 1008173 |
---|---|
Main authors: | , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interactions in individual cells, giving insight into 3D genome organization. However, the extreme sparsity of scHi-C data poses a significant barrier to analysis and limits our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well suited to discovering latent topics in a collection of discrete data. For our analysis, we generate nine different single-cell combinatorial indexed Hi-C (sci-Hi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), comprising over 19,000 cells. We demonstrate that topic modeling successfully captures cell type differences from sci-Hi-C data in the form of "chromatin topics." We further show enrichment of particular compartment structures associated with locus pairs in these topics.
Author summary: The genomes of higher organisms are intricately folded and dynamically organized, with strong implications for many biological processes. Each chromosome undergoes dramatic changes to its three-dimensional conformation during the cell cycle, and the positioning of chromosomes within the nucleus plays an important role in controlling the activation of specific genes. Recently, it has become possible to investigate the 3D conformations of the genomes of individual cells using a high-throughput sequencing assay called single-cell Hi-C (scHi-C). However, data from these assays are sparse and noisy, which makes scHi-C data challenging to analyze and interpret. In this work, we generated a scHi-C dataset of over 19,000 cells from five human cell lines and applied a natural language processing method called topic modeling to discover cell type-specific "chromatin" topics. We show that these topics can be used to distinguish between cells at different stages of the cell cycle and cells from different tissues based on the 3D conformation of their genomes, despite the sparsity of the data. We further show that the 3D conformations of single cells are linked to the expression of cell type-specific genes and to cell cycle-associated conformational patterns. |
---|---|
ISSN: | 1553-734X, 1553-7358 |
DOI: | 10.1371/journal.pcbi.1008173 |
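
The summary describes treating scHi-C data as a collection of discrete data for topic modeling. As a rough illustration of that idea, the sketch below treats each cell as a "document" and each binned locus pair as a "word", then fits latent Dirichlet allocation by collapsed Gibbs sampling. This record does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names `contacts_to_document` and `fit_lda_gibbs`, the 1 Mb bin size, the hyperparameters `alpha` and `beta`, and the toy contact lists.

```python
import random


def contacts_to_document(contacts, bin_size=1_000_000):
    """Turn one cell's contacts into a bag of locus-pair 'words'.

    Each contact is (chrom, pos1, pos2); the word is the chromosome plus
    the two genomic bins, ordered so (a, b) and (b, a) are the same word.
    The bin size is a hypothetical choice, not taken from the paper.
    """
    words = []
    for chrom, pos1, pos2 in contacts:
        b1, b2 = sorted((pos1 // bin_size, pos2 // bin_size))
        words.append((chrom, b1, b2))
    return words


def fit_lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-cell topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # cell x topic counts
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # topic x word counts
    topic_total = [0] * n_topics
    assignments = []

    # Random initial topic assignment for every observed locus pair.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + beta * n_vocab)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each cell; each row sums to 1.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + alpha * n_topics
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(n_topics)])
    return theta


# Toy input (made up): four cells with a handful of intrachromosomal contacts.
cells = [
    [("chr1", 100_000, 5_200_000), ("chr1", 300_000, 5_900_000)],
    [("chr1", 700_000, 5_100_000), ("chr1", 200_000, 5_400_000)],
    [("chr2", 9_100_000, 2_300_000), ("chr2", 2_800_000, 9_700_000)],
    [("chr2", 2_500_000, 9_200_000), ("chr2", 9_900_000, 2_100_000)],
]
docs = [contacts_to_document(c) for c in cells]
theta = fit_lda_gibbs(docs, n_topics=2)
for row in theta:
    print([round(p, 3) for p in row])
```

Each row of `theta` gives one cell's mixture over topics, which is the kind of low-dimensional representation that could then be compared across cells. A real analysis at the scale of thousands of cells would more plausibly rely on an established, optimized LDA implementation than on this pure-Python sampler.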