Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing

Bibliographic details
Published in: Nature biotechnology 2021-09, Vol.39 (9), p.1151-1160
Main authors: Fang, Li Tai, Zhu, Bin, Zhao, Yongmei, Chen, Wanqiu, Yang, Zhaowei, Kerrigan, Liz, Langenbach, Kurt, de Mars, Maryellen, Lu, Charles, Idler, Kenneth, Jacob, Howard, Zheng, Yuanting, Ren, Luyao, Yu, Ying, Jaeger, Erich, Schroth, Gary P., Abaan, Ogan D., Talsania, Keyur, Lack, Justin, Shen, Tsai-Wei, Chen, Zhong, Stanbouly, Seta, Tran, Bao, Shetty, Jyoti, Kriga, Yuliya, Meerzaman, Daoud, Nguyen, Cu, Petitjean, Virginie, Sultan, Marc, Cam, Margaret, Mehta, Monika, Hung, Tiffany, Peters, Eric, Kalamegham, Rasika, Sahraeian, Sayed Mohammad Ebrahim, Mohiyuddin, Marghoob, Guo, Yunfei, Yao, Lijing, Song, Lei, Lam, Hugo Y. K., Drabek, Jiri, Vojta, Petr, Maestro, Roberta, Gasparotto, Daniela, Kõks, Sulev, Reimann, Ene, Scherer, Andreas, Nordlund, Jessica, Liljedahl, Ulrika, Jensen, Roderick V., Pirooznia, Mehdi, Li, Zhipan, Xiao, Chunlin, Sherry, Stephen T., Kusko, Rebecca, Moos, Malcolm, Donaldson, Eric, Tezak, Zivana, Ning, Baitang, Tong, Weida, Li, Jing, Duerken-Hughes, Penelope, Catalanotti, Claudia, Maheshwari, Shamoni, Shuga, Joe, Liang, Winnie S., Keats, Jonathan, Adkins, Jonathan, Tassone, Erica, Zismann, Victoria, McDaniel, Timothy, Trent, Jeffrey, Foox, Jonathan, Butler, Daniel, Mason, Christopher E., Hong, Huixiao, Shi, Leming, Wang, Charles, Xiao, Wenming
Format: Article
Language: English
Description
Summary: The lack of samples for generating standardized DNA datasets for setting up a sequencing pipeline or benchmarking the performance of different algorithms limits the implementation and uptake of cancer genomics. Here, we describe reference call sets obtained from paired tumor–normal genomic DNA (gDNA) samples derived from a breast cancer cell line (which is highly heterogeneous, with an aneuploid genome, and enriched in somatic alterations) and a matched lymphoblastoid cell line. We partially validated both somatic mutations and germline variants in these call sets via whole-exome sequencing (WES) with different sequencing platforms and targeted sequencing with >2,000-fold coverage, spanning 82% of genomic regions with high confidence. Although the gDNA reference samples are not representative of primary cancer cells from a clinical sample, when setting up a sequencing pipeline, they not only minimize potential biases from technologies, assays and informatics but also provide a unique resource for benchmarking ‘tumor-only’ or ‘matched tumor–normal’ analyses. Tumor–normal paired DNA samples from a breast cancer cell line and a matched lymphoblastoid cell line enable calibration of clinical sequencing pipelines and benchmarking ‘tumor-only’ or ‘matched tumor–normal’ analyses.
ISSN:1087-0156
1546-1696
DOI:10.1038/s41587-021-00993-6