Extensive sequencing of seven human genomes to characterize benchmark reference materials

Bibliographic details
Published in: Scientific Data, 2016-06, Vol. 3 (1), p. 160025, Article 160025
Main authors: Zook, Justin M., Catoe, David, McDaniel, Jennifer, Vang, Lindsay, Spies, Noah, Sidow, Arend, Weng, Ziming, Liu, Yuling, Mason, Christopher E., Alexander, Noah, Henaff, Elizabeth, McIntyre, Alexa B.R., Chandramohan, Dhruva, Chen, Feng, Jaeger, Erich, Moshrefi, Ali, Pham, Khoa, Stedman, William, Liang, Tiffany, Saghbini, Michael, Dzakula, Zeljko, Hastie, Alex, Cao, Han, Deikus, Gintaras, Schadt, Eric, Sebra, Robert, Bashir, Ali, Truty, Rebecca M., Chang, Christopher C., Gulbahce, Natali, Zhao, Keyan, Ghosh, Srinka, Hyland, Fiona, Fu, Yutao, Chaisson, Mark, Xiao, Chunlin, Trow, Jonathan, Sherry, Stephen T., Zaranek, Alexander W., Ball, Madeleine, Bobe, Jason, Estep, Preston, Church, George M., Marks, Patrick, Kyriazopoulou-Panagiotopoulou, Sofia, Zheng, Grace X.Y., Schnall-Levin, Michael, Ordonez, Heather S., Mudivarti, Patrice A., Giorda, Kristina, Sheng, Ying, Rypdal, Karoline Bjarnesdatter, Salit, Marc
Format: Article
Language: English
Online access: Full text
Description
Summary: The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.
Design Type(s): reference design • replicate design • protocol optimization design • individual genetic characteristics comparison design
Measurement Type(s): whole genome sequencing
Technology Type(s): DNA sequencing
Factor Type(s): ethnic group
Sample Characteristic(s): Homo sapiens • EBV-LCL cell
Machine-accessible metadata file describing the reported data (ISA-Tab format)
ISSN:2052-4463
DOI:10.1038/sdata.2016.25