scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking

Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-c...

Bibliographic details
Published in: Journal of molecular cell biology 2023-06, Vol.15 (1)
Main authors: Fan, Shichen; Dang, Dachang; Ye, Yusen; Zhang, Shao-Wu; Gao, Lin; Zhang, Shihua
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title Journal of molecular cell biology
container_volume 15
creator Fan, Shichen
Dang, Dachang
Ye, Yusen
Zhang, Shao-Wu
Gao, Lin
Zhang, Shihua
description Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. scHi-CSim merges neighboring cells to overcome the sparseness of data, samples interactions in distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated that scHi-CSim can generate high-fidelity data by comparing the performance of single-cell clustering and detection of chromosomal high-order structures with raw data. Furthermore, scHi-CSim is flexible to change sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets with different sequencing depths to benchmark scHi-C clustering methods.
doi_str_mv 10.1093/jmcb/mjad003
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10308180</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2770479709</sourcerecordid><originalsourceid>FETCH-LOGICAL-c342t-fdb80c4a10b3509114748977c947a75d69a9ede72f569762ae63ff1343b570363</originalsourceid><addsrcrecordid>eNpVkcFPwyAUh4nRODN382x69GAdFFrAizGLOhMTD-qZUPq6Mmk7CzPuv5dl0ygXSPj43uP9EDoj-IpgSafL1pTTdqkrjOkBOiE8lykrRH4YzwVnacaFGKGJ90scFxWUCnyMRrTgWETgBBlv5jadvdj2OtFJ7eDLlg4Sb9u106EfktDokCygg0EH8EljF01a2wqcDZuIdQsHqQHnkq0mqXSIlvishM40rR7eI3GKjmrtPEz2-xi93d-9zubp0_PD4-z2KTWUZSGtq1JgwzTBJc2xJIRxJiTnRjKueV4VUkuogGd1XkheZBoKWteEMlrmHNOCjtHNzrtaly1UBrowaKdWg42NbFSvrfp_09lGLfpPRTCN4xA4Gi72hqH_WIMPqrV--zvdQb_2KuMcMy55HP0YXe5QM_TeD1D_1iFYbbNR22zUPpuIn__t7Rf-SYJ-A6QLjCQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2770479709</pqid></control><display><type>article</type><title>scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Fan, Shichen ; Dang, Dachang ; Ye, Yusen ; Zhang, Shao-Wu ; Gao, Lin ; Zhang, Shihua</creator><contributor>Chen, Luonan</contributor><creatorcontrib>Fan, Shichen ; Dang, Dachang ; Ye, Yusen ; Zhang, Shao-Wu ; Gao, Lin ; Zhang, Shihua ; Chen, Luonan</creatorcontrib><description>Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. 
scHi-CSim merges neighboring cells to overcome the sparseness of data, samples interactions in distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated that scHi-CSim can generate high-fidelity data by comparing the performance of single-cell clustering and detection of chromosomal high-order structures with raw data. Furthermore, scHi-CSim is flexible to change sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets with different sequencing depths to benchmark scHi-C clustering methods.</description><identifier>ISSN: 1674-2788</identifier><identifier>EISSN: 1759-4685</identifier><identifier>DOI: 10.1093/jmcb/mjad003</identifier><identifier>PMID: 36708167</identifier><language>eng</language><publisher>United States: Oxford University Press</publisher><subject>Benchmarking ; Chromatin - genetics ; Chromosomes</subject><ispartof>Journal of molecular cell biology, 2023-06, Vol.15 (1)</ispartof><rights>The Author(s) (2023). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, CEMCS, CAS.</rights><rights>The Author(s) (2023). Published by Oxford University Press on behalf of , CEMCS, CAS. 
2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c342t-fdb80c4a10b3509114748977c947a75d69a9ede72f569762ae63ff1343b570363</cites><orcidid>0000-0002-3783-1305 ; 0000-0001-6396-0787</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308180/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308180/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,27915,27916,53782,53784</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36708167$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Chen, Luonan</contributor><creatorcontrib>Fan, Shichen</creatorcontrib><creatorcontrib>Dang, Dachang</creatorcontrib><creatorcontrib>Ye, Yusen</creatorcontrib><creatorcontrib>Zhang, Shao-Wu</creatorcontrib><creatorcontrib>Gao, Lin</creatorcontrib><creatorcontrib>Zhang, Shihua</creatorcontrib><title>scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking</title><title>Journal of molecular cell biology</title><addtitle>J Mol Cell Biol</addtitle><description>Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. 
scHi-CSim merges neighboring cells to overcome the sparseness of data, samples interactions in distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated that scHi-CSim can generate high-fidelity data by comparing the performance of single-cell clustering and detection of chromosomal high-order structures with raw data. Furthermore, scHi-CSim is flexible to change sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets with different sequencing depths to benchmark scHi-C clustering methods.</description><subject>Benchmarking</subject><subject>Chromatin - genetics</subject><subject>Chromosomes</subject><issn>1674-2788</issn><issn>1759-4685</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkcFPwyAUh4nRODN382x69GAdFFrAizGLOhMTD-qZUPq6Mmk7CzPuv5dl0ygXSPj43uP9EDoj-IpgSafL1pTTdqkrjOkBOiE8lykrRH4YzwVnacaFGKGJ90scFxWUCnyMRrTgWETgBBlv5jadvdj2OtFJ7eDLlg4Sb9u106EfktDokCygg0EH8EljF01a2wqcDZuIdQsHqQHnkq0mqXSIlvishM40rR7eI3GKjmrtPEz2-xi93d-9zubp0_PD4-z2KTWUZSGtq1JgwzTBJc2xJIRxJiTnRjKueV4VUkuogGd1XkheZBoKWteEMlrmHNOCjtHNzrtaly1UBrowaKdWg42NbFSvrfp_09lGLfpPRTCN4xA4Gi72hqH_WIMPqrV--zvdQb_2KuMcMy55HP0YXe5QM_TeD1D_1iFYbbNR22zUPpuIn__t7Rf-SYJ-A6QLjCQ</recordid><startdate>20230601</startdate><enddate>20230601</enddate><creator>Fan, Shichen</creator><creator>Dang, Dachang</creator><creator>Ye, Yusen</creator><creator>Zhang, Shao-Wu</creator><creator>Gao, Lin</creator><creator>Zhang, Shihua</creator><general>Oxford University 
Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-3783-1305</orcidid><orcidid>https://orcid.org/0000-0001-6396-0787</orcidid></search><sort><creationdate>20230601</creationdate><title>scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking</title><author>Fan, Shichen ; Dang, Dachang ; Ye, Yusen ; Zhang, Shao-Wu ; Gao, Lin ; Zhang, Shihua</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c342t-fdb80c4a10b3509114748977c947a75d69a9ede72f569762ae63ff1343b570363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Benchmarking</topic><topic>Chromatin - genetics</topic><topic>Chromosomes</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fan, Shichen</creatorcontrib><creatorcontrib>Dang, Dachang</creatorcontrib><creatorcontrib>Ye, Yusen</creatorcontrib><creatorcontrib>Zhang, Shao-Wu</creatorcontrib><creatorcontrib>Gao, Lin</creatorcontrib><creatorcontrib>Zhang, Shihua</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of molecular cell biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fan, Shichen</au><au>Dang, Dachang</au><au>Ye, Yusen</au><au>Zhang, Shao-Wu</au><au>Gao, Lin</au><au>Zhang, Shihua</au><au>Chen, 
Luonan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking</atitle><jtitle>Journal of molecular cell biology</jtitle><addtitle>J Mol Cell Biol</addtitle><date>2023-06-01</date><risdate>2023</risdate><volume>15</volume><issue>1</issue><issn>1674-2788</issn><eissn>1759-4685</eissn><abstract>Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. scHi-CSim merges neighboring cells to overcome the sparseness of data, samples interactions in distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated that scHi-CSim can generate high-fidelity data by comparing the performance of single-cell clustering and detection of chromosomal high-order structures with raw data. Furthermore, scHi-CSim is flexible to change sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets with different sequencing depths to benchmark scHi-C clustering methods.</abstract><cop>United States</cop><pub>Oxford University Press</pub><pmid>36708167</pmid><doi>10.1093/jmcb/mjad003</doi><orcidid>https://orcid.org/0000-0002-3783-1305</orcidid><orcidid>https://orcid.org/0000-0001-6396-0787</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1674-2788
ispartof Journal of molecular cell biology, 2023-06, Vol.15 (1)
issn 1674-2788
1759-4685
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10308180
source MEDLINE; DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; PubMed Central
subjects Benchmarking
Chromatin - genetics
Chromosomes
title scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T06%3A19%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=scHi-CSim:%20a%20flexible%20simulator%20that%20generates%20high-fidelity%20single-cell%20Hi-C%20data%20for%20benchmarking&rft.jtitle=Journal%20of%20molecular%20cell%20biology&rft.au=Fan,%20Shichen&rft.date=2023-06-01&rft.volume=15&rft.issue=1&rft.issn=1674-2788&rft.eissn=1759-4685&rft_id=info:doi/10.1093/jmcb/mjad003&rft_dat=%3Cproquest_pubme%3E2770479709%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2770479709&rft_id=info:pmid/36708167&rfr_iscdi=true