Identification of copy number variations and translocations in cancer cells from Hi-C data

Abstract Motivation Eukaryotic chromosomes adapt a complex and highly dynamic three-dimensional (3D) structure, which profoundly affects different cellular functions and outcomes including changes in epigenetic landscape and in gene expression. Making the scenario even more complex, cancer cells har...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Bioinformatics 2018-01, Vol.34 (2), p.338-345
Hauptverfasser:	Chakraborty, Abhijit, Ay, Ferhat
Format:	Artikel
Sprache:	eng
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

container_end_page	345
container_issue	2
container_start_page	338
container_title	Bioinformatics
container_volume	34
creator	Chakraborty, Abhijit Ay, Ferhat
description	Abstract Motivation Eukaryotic chromosomes adapt a complex and highly dynamic three-dimensional (3D) structure, which profoundly affects different cellular functions and outcomes including changes in epigenetic landscape and in gene expression. Making the scenario even more complex, cancer cells harbor chromosomal abnormalities [e.g. copy number variations (CNVs) and translocations] altering their genomes both at the sequence level and at the level of 3D organization. High-throughput chromosome conformation capture techniques (e.g. Hi-C), which are originally developed for decoding the 3D structure of the chromatin, provide a great opportunity to simultaneously identify the locations of genomic rearrangements and to investigate the 3D genome organization in cancer cells. Even though Hi-C data has been used for validating known rearrangements, computational methods that can distinguish rearrangement signals from the inherent biases of Hi-C data and from the actual 3D conformation of chromatin, and can precisely detect rearrangement locations de novo have been missing. Results In this work, we characterize how intra and inter-chromosomal Hi-C contacts are distributed for normal and rearranged chromosomes to devise a new set of algorithms (i) to identify genomic segments that correspond to CNV regions such as amplifications and deletions (HiCnv), (ii) to call inter-chromosomal translocations and their boundaries (HiCtrans) from Hi-C experiments and (iii) to simulate Hi-C data from genomes with desired rearrangements and abnormalities (AveSim) in order to select optimal parameters for and to benchmark the accuracy of our methods. Our results on 10 different cancer cell lines with Hi-C data show that we identify a total number of 105 amplifications and 45 deletions together with 90 translocations, whereas we identify virtually no such events for two karyotypically normal cell lines. Our CNV predictions correlate very well with whole genome sequencing data among chromosomes with CNV events for a breast cancer cell line (r = 0.89) and capture most of the CNVs we simulate using Avesim. For HiCtrans predictions, we report evidence from the literature for 30 out of 90 translocations for eight of our cancer cell lines. Furthermore, we show that our tools identify and correctly classify relatively understudied rearrangements such as double minutes and homogeneously staining regions. Considering the inherent limitations of existing techniques for karyotyping (i.e. miss
doi_str_mv	10.1093/bioinformatics/btx664
format	Article
fullrecord	<record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8210825</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btx664</oup_id><sourcerecordid>1953298411</sourcerecordid><originalsourceid>FETCH-LOGICAL-c570t-e1b10aa27189186c7ef656085e8b7aa94731153b31c0905a4d4d8b13cb27d4073</originalsourceid><addsrcrecordid>eNqNkc1u1TAQRi0Eoj_wCCAv2YTOxHbsbJDQFdBKldjAho01dhwwSuyLnVT07Um5txXdsbLlOXPGo4-xVwhvEXpx4WKOacxlpiX6euGW310nn7BTlB00Laj-6XYXnW6kAXHCzmr9CaBQSvmcnbQ9SCM7fcq-XQ0hLXGMfvPkxPPIfd7f8rTOLhR-QyX-LVROaeBLoVSn7I9PMXFPyW-cD9NU-VjyzC9js-MDLfSCPRtpquHl8TxnXz9--LK7bK4_f7ravb9uvNKwNAEdAlGr0fRoOq_D2KkOjArGaaJeaoGohBPooQdFcpCDcSi8a_UgQYtz9u7g3a9uDoPf9ik02X2JM5Vbmynax5UUf9jv-caaFsG0ahO8OQpK_rWGutg51ruNKIW8Vou9Em1vJOKGqgPqS661hPFhDIK9y8U-zsUectn6Xv_7x4eu-yA2AA5AXvf_6fwDOBqhzA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1953298411</pqid></control><display><type>article</type><title>Identification of copy number variations and translocations in cancer cells from Hi-C data</title><source>Oxford Journals Open Access Collection</source><creator>Chakraborty, Abhijit ; Ay, Ferhat</creator><creatorcontrib>Chakraborty, Abhijit ; Ay, Ferhat</creatorcontrib><description>Abstract Motivation Eukaryotic chromosomes adapt a complex and highly dynamic three-dimensional (3D) structure, which profoundly affects different cellular functions and outcomes including changes in epigenetic landscape and in gene expression. Making the scenario even more complex, cancer cells harbor chromosomal abnormalities [e.g. copy number variations (CNVs) and translocations] altering their genomes both at the sequence level and at the level of 3D organization. High-throughput chromosome conformation capture techniques (e.g. Hi-C), which are originally developed for decoding the 3D structure of the chromatin, provide a great opportunity to simultaneously identify the locations of genomic rearrangements and to investigate the 3D genome organization in cancer cells. Even though Hi-C data has been used for validating known rearrangements, computational methods that can distinguish rearrangement signals from the inherent biases of Hi-C data and from the actual 3D conformation of chromatin, and can precisely detect rearrangement locations de novo have been missing. Results In this work, we characterize how intra and inter-chromosomal Hi-C contacts are distributed for normal and rearranged chromosomes to devise a new set of algorithms (i) to identify genomic segments that correspond to CNV regions such as amplifications and deletions (HiCnv), (ii) to call inter-chromosomal translocations and their boundaries (HiCtrans) from Hi-C experiments and (iii) to simulate Hi-C data from genomes with desired rearrangements and abnormalities (AveSim) in order to select optimal parameters for and to benchmark the accuracy of our methods. Our results on 10 different cancer cell lines with Hi-C data show that we identify a total number of 105 amplifications and 45 deletions together with 90 translocations, whereas we identify virtually no such events for two karyotypically normal cell lines. Our CNV predictions correlate very well with whole genome sequencing data among chromosomes with CNV events for a breast cancer cell line (r = 0.89) and capture most of the CNVs we simulate using Avesim. For HiCtrans predictions, we report evidence from the literature for 30 out of 90 translocations for eight of our cancer cell lines. Furthermore, we show that our tools identify and correctly classify relatively understudied rearrangements such as double minutes and homogeneously staining regions. Considering the inherent limitations of existing techniques for karyotyping (i.e. missing balanced rearrangements and those near repetitive regions), the accurate identification of CNVs and translocations in a cost-effective and high-throughput setting is still a challenge. Our results show that the set of tools we develop effectively utilize moderately sequenced Hi-C libraries (100-300 million reads) to identify known and de novo chromosomal rearrangements/abnormalities in well-established cancer cell lines. With the decrease in required number of cells and the increase in attainable resolution, we believe that our framework will pave the way towards comprehensive mapping of genomic rearrangements in primary cells from cancer patients using Hi-C. Availability and implementation CNV calling: https://github.com/ay-lab/HiCnv, Translocation calling: https://github.com/ay-lab/HiCtrans and Hi-C simulation: https://github.com/ay-lab/AveSim. Supplementary information Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btx664</identifier><identifier>PMID: 29048467</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><ispartof>Bioinformatics, 2018-01, Vol.34 (2), p.338-345</ispartof><rights>The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com 2017</rights><rights>The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c570t-e1b10aa27189186c7ef656085e8b7aa94731153b31c0905a4d4d8b13cb27d4073</citedby><cites>FETCH-LOGICAL-c570t-e1b10aa27189186c7ef656085e8b7aa94731153b31c0905a4d4d8b13cb27d4073</cites><orcidid>0000-0002-0708-6914</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8210825/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8210825/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,1598,27901,27902,53766,53768</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/btx664$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29048467$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chakraborty, Abhijit</creatorcontrib><creatorcontrib>Ay, Ferhat</creatorcontrib><title>Identification of copy number variations and translocations in cancer cells from Hi-C data</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Abstract Motivation Eukaryotic chromosomes adapt a complex and highly dynamic three-dimensional (3D) structure, which profoundly affects different cellular functions and outcomes including changes in epigenetic landscape and in gene expression. Making the scenario even more complex, cancer cells harbor chromosomal abnormalities [e.g. copy number variations (CNVs) and translocations] altering their genomes both at the sequence level and at the level of 3D organization. High-throughput chromosome conformation capture techniques (e.g. Hi-C), which are originally developed for decoding the 3D structure of the chromatin, provide a great opportunity to simultaneously identify the locations of genomic rearrangements and to investigate the 3D genome organization in cancer cells. Even though Hi-C data has been used for validating known rearrangements, computational methods that can distinguish rearrangement signals from the inherent biases of Hi-C data and from the actual 3D conformation of chromatin, and can precisely detect rearrangement locations de novo have been missing. Results In this work, we characterize how intra and inter-chromosomal Hi-C contacts are distributed for normal and rearranged chromosomes to devise a new set of algorithms (i) to identify genomic segments that correspond to CNV regions such as amplifications and deletions (HiCnv), (ii) to call inter-chromosomal translocations and their boundaries (HiCtrans) from Hi-C experiments and (iii) to simulate Hi-C data from genomes with desired rearrangements and abnormalities (AveSim) in order to select optimal parameters for and to benchmark the accuracy of our methods. Our results on 10 different cancer cell lines with Hi-C data show that we identify a total number of 105 amplifications and 45 deletions together with 90 translocations, whereas we identify virtually no such events for two karyotypically normal cell lines. Our CNV predictions correlate very well with whole genome sequencing data among chromosomes with CNV events for a breast cancer cell line (r = 0.89) and capture most of the CNVs we simulate using Avesim. For HiCtrans predictions, we report evidence from the literature for 30 out of 90 translocations for eight of our cancer cell lines. Furthermore, we show that our tools identify and correctly classify relatively understudied rearrangements such as double minutes and homogeneously staining regions. Considering the inherent limitations of existing techniques for karyotyping (i.e. missing balanced rearrangements and those near repetitive regions), the accurate identification of CNVs and translocations in a cost-effective and high-throughput setting is still a challenge. Our results show that the set of tools we develop effectively utilize moderately sequenced Hi-C libraries (100-300 million reads) to identify known and de novo chromosomal rearrangements/abnormalities in well-established cancer cell lines. With the decrease in required number of cells and the increase in attainable resolution, we believe that our framework will pave the way towards comprehensive mapping of genomic rearrangements in primary cells from cancer patients using Hi-C. Availability and implementation CNV calling: https://github.com/ay-lab/HiCnv, Translocation calling: https://github.com/ay-lab/HiCtrans and Hi-C simulation: https://github.com/ay-lab/AveSim. Supplementary information Supplementary data are available at Bioinformatics online.</description><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNqNkc1u1TAQRi0Eoj_wCCAv2YTOxHbsbJDQFdBKldjAho01dhwwSuyLnVT07Um5txXdsbLlOXPGo4-xVwhvEXpx4WKOacxlpiX6euGW310nn7BTlB00Laj-6XYXnW6kAXHCzmr9CaBQSvmcnbQ9SCM7fcq-XQ0hLXGMfvPkxPPIfd7f8rTOLhR-QyX-LVROaeBLoVSn7I9PMXFPyW-cD9NU-VjyzC9js-MDLfSCPRtpquHl8TxnXz9--LK7bK4_f7ravb9uvNKwNAEdAlGr0fRoOq_D2KkOjArGaaJeaoGohBPooQdFcpCDcSi8a_UgQYtz9u7g3a9uDoPf9ik02X2JM5Vbmynax5UUf9jv-caaFsG0ahO8OQpK_rWGutg51ruNKIW8Vou9Em1vJOKGqgPqS661hPFhDIK9y8U-zsUectn6Xv_7x4eu-yA2AA5AXvf_6fwDOBqhzA</recordid><startdate>20180115</startdate><enddate>20180115</enddate><creator>Chakraborty, Abhijit</creator><creator>Ay, Ferhat</creator><general>Oxford University Press</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0708-6914</orcidid></search><sort><creationdate>20180115</creationdate><title>Identification of copy number variations and translocations in cancer cells from Hi-C data</title><author>Chakraborty, Abhijit ; Ay, Ferhat</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c570t-e1b10aa27189186c7ef656085e8b7aa94731153b31c0905a4d4d8b13cb27d4073</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chakraborty, Abhijit</creatorcontrib><creatorcontrib>Ay, Ferhat</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chakraborty, Abhijit</au><au>Ay, Ferhat</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identification of copy number variations and translocations in cancer cells from Hi-C data</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2018-01-15</date><risdate>2018</risdate><volume>34</volume><issue>2</issue><spage>338</spage><epage>345</epage><pages>338-345</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><abstract>Abstract Motivation Eukaryotic chromosomes adapt a complex and highly dynamic three-dimensional (3D) structure, which profoundly affects different cellular functions and outcomes including changes in epigenetic landscape and in gene expression. Making the scenario even more complex, cancer cells harbor chromosomal abnormalities [e.g. copy number variations (CNVs) and translocations] altering their genomes both at the sequence level and at the level of 3D organization. High-throughput chromosome conformation capture techniques (e.g. Hi-C), which are originally developed for decoding the 3D structure of the chromatin, provide a great opportunity to simultaneously identify the locations of genomic rearrangements and to investigate the 3D genome organization in cancer cells. Even though Hi-C data has been used for validating known rearrangements, computational methods that can distinguish rearrangement signals from the inherent biases of Hi-C data and from the actual 3D conformation of chromatin, and can precisely detect rearrangement locations de novo have been missing. Results In this work, we characterize how intra and inter-chromosomal Hi-C contacts are distributed for normal and rearranged chromosomes to devise a new set of algorithms (i) to identify genomic segments that correspond to CNV regions such as amplifications and deletions (HiCnv), (ii) to call inter-chromosomal translocations and their boundaries (HiCtrans) from Hi-C experiments and (iii) to simulate Hi-C data from genomes with desired rearrangements and abnormalities (AveSim) in order to select optimal parameters for and to benchmark the accuracy of our methods. Our results on 10 different cancer cell lines with Hi-C data show that we identify a total number of 105 amplifications and 45 deletions together with 90 translocations, whereas we identify virtually no such events for two karyotypically normal cell lines. Our CNV predictions correlate very well with whole genome sequencing data among chromosomes with CNV events for a breast cancer cell line (r = 0.89) and capture most of the CNVs we simulate using Avesim. For HiCtrans predictions, we report evidence from the literature for 30 out of 90 translocations for eight of our cancer cell lines. Furthermore, we show that our tools identify and correctly classify relatively understudied rearrangements such as double minutes and homogeneously staining regions. Considering the inherent limitations of existing techniques for karyotyping (i.e. missing balanced rearrangements and those near repetitive regions), the accurate identification of CNVs and translocations in a cost-effective and high-throughput setting is still a challenge. Our results show that the set of tools we develop effectively utilize moderately sequenced Hi-C libraries (100-300 million reads) to identify known and de novo chromosomal rearrangements/abnormalities in well-established cancer cell lines. With the decrease in required number of cells and the increase in attainable resolution, we believe that our framework will pave the way towards comprehensive mapping of genomic rearrangements in primary cells from cancer patients using Hi-C. Availability and implementation CNV calling: https://github.com/ay-lab/HiCnv, Translocation calling: https://github.com/ay-lab/HiCtrans and Hi-C simulation: https://github.com/ay-lab/AveSim. Supplementary information Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>29048467</pmid><doi>10.1093/bioinformatics/btx664</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0002-0708-6914</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1367-4803
ispartof	Bioinformatics, 2018-01, Vol.34 (2), p.338-345
issn	1367-4803 1460-2059 1367-4811
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8210825
source	Oxford Journals Open Access Collection
title	Identification of copy number variations and translocations in cancer cells from Hi-C data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T07%3A01%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identification%20of%20copy%20number%20variations%20and%20translocations%20in%20cancer%20cells%20from%20Hi-C%20data&rft.jtitle=Bioinformatics&rft.au=Chakraborty,%20Abhijit&rft.date=2018-01-15&rft.volume=34&rft.issue=2&rft.spage=338&rft.epage=345&rft.pages=338-345&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btx664&rft_dat=%3Cproquest_TOX%3E1953298411%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1953298411&rft_id=info:pmid/29048467&rft_oup_id=10.1093/bioinformatics/btx664&rfr_iscdi=true