Whole-Genome Shotgun Assembly and Comparison of Human Genome Assemblies
We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The q...
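Published in: | Proceedings of the National Academy of Sciences - PNAS 2004-02, Vol.101 (7), p.1916-1921 |
---|---|
Main authors: | Istrail, Sorin; Sutton, Granger G.; Florea, Liliana; Halpern, Aaron L.; Mobarry, Clark M.; Lippert, Ross; Walenz, Brian; Shatkay, Hagit; Dew, Ian; Miller, Jason R.; Flanigan, Michael J.; Edwards, Nathan J.; Bolanos, Randall; Fasulo, Daniel; Halldorsson, Bjarni V.; Hannenhalli, Sridhar; Turner, Russell; Yooseph, Shibu; Lu, Fu; Nusskern, Deborah R.; Shue, Bixiong Chris; Zheng, Xiangqun Holly; Zhong, Fei; Delcher, Arthur L.; Huson, Daniel H.; Kravitz, Saul A.; Mouchard, Laurent; Reinert, Knut; Remington, Karin A.; Clark, Andrew G.; Waterman, Michael S.; Eichler, Evan E.; Adams, Mark D.; Hunkapiller, Michael W.; Myers, Eugene W.; Venter, J. Craig |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |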
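The coverage figures quoted in the abstract (27 million reads, 5.3-fold read coverage, 39-fold insert coverage) follow the usual relation coverage = count × mean length / genome size. The sketch below inverts that relation to see what mean lengths the reported numbers would imply. The genome size of 2.9 Gbp and the idea that every read belongs to a pair are assumptions made for illustration; neither is stated in this record, and the function names are hypothetical.

```python
# Back-of-the-envelope coverage arithmetic for the figures in the abstract.
# GENOME_SIZE_BP and "all reads are paired" are assumptions, not record data.

def coverage(n_units: float, mean_length_bp: float, genome_size_bp: float) -> float:
    """Fold coverage given a count of units (reads or inserts) and their mean length."""
    return n_units * mean_length_bp / genome_size_bp


def implied_mean_length(n_units: float, fold_coverage: float, genome_size_bp: float) -> float:
    """Mean unit length implied by a reported fold coverage."""
    return fold_coverage * genome_size_bp / n_units


GENOME_SIZE_BP = 2.9e9      # assumed euchromatic genome size
N_READS = 27_000_000        # from the abstract
READ_COVERAGE = 5.3         # from the abstract (quality-trimmed reads)
CLONE_COVERAGE = 39.0       # from the abstract (inserts spanned by read pairs)

# Mean trimmed read length implied by 5.3x coverage.
mean_read_bp = implied_mean_length(N_READS, READ_COVERAGE, GENOME_SIZE_BP)

# Assumption: every read has a mate, so there are N_READS / 2 inserts.
n_inserts = N_READS / 2
mean_insert_bp = implied_mean_length(n_inserts, CLONE_COVERAGE, GENOME_SIZE_BP)

if __name__ == "__main__":
    print(f"implied mean trimmed read length: {mean_read_bp:.0f} bp")
    print(f"implied mean insert length: {mean_insert_bp:.0f} bp")
```

Under these assumptions the implied mean trimmed read length is roughly 570 bp and the implied mean insert length roughly 8.4 kbp, which sits between the 2-kbp and 50-kbp library sizes named in the abstract. A different assumed genome size or pairing rate shifts both numbers proportionally.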
container_end_page | 1921 |
---|---|
container_issue | 7 |
container_start_page | 1916 |
container_title | Proceedings of the National Academy of Sciences - PNAS |
container_volume | 101 |
creator | Istrail, Sorin; Sutton, Granger G.; Florea, Liliana; Halpern, Aaron L.; Mobarry, Clark M.; Lippert, Ross; Walenz, Brian; Shatkay, Hagit; Dew, Ian; Miller, Jason R.; Flanigan, Michael J.; Edwards, Nathan J.; Bolanos, Randall; Fasulo, Daniel; Halldorsson, Bjarni V.; Hannenhalli, Sridhar; Turner, Russell; Yooseph, Shibu; Lu, Fu; Nusskern, Deborah R.; Shue, Bixiong Chris; Zheng, Xiangqun Holly; Zhong, Fei; Delcher, Arthur L.; Huson, Daniel H.; Kravitz, Saul A.; Mouchard, Laurent; Reinert, Knut; Remington, Karin A.; Clark, Andrew G.; Waterman, Michael S.; Eichler, Evan E.; Adams, Mark D.; Hunkapiller, Michael W.; Myers, Eugene W.; Venter, J. Craig |
description | We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304-1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860-921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats. |
doi_str_mv | 10.1073/pnas.0307971100 |
format | Article |
fullrecord | <record><control><sourceid>jstor_pnas_</sourceid><recordid>TN_cdi_pnas_primary_101_7_1916_fulltext</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>3371370</jstor_id><sourcerecordid>3371370</sourcerecordid><originalsourceid>FETCH-LOGICAL-c597t-266e890b7bc028253ce518b1845f038f774ddc65a2f77f1a73221d250c11511c3</originalsourceid><addsrcrecordid>eNqFkUFv1DAQhS0EotvCmQtCOVFxSDtjx7F96GG1gl2klTgA4mg5jtNNldhLnFT03-PVRt2CkDh5ZH_vjWceIW8QrhAEu957E6-AgVACEeAZWSAozMtCwXOyAKAilwUtzsh5jHcAoLiEl-QMC1EqxeSCrH_sQufytfOhd9nXXRhvJ58tY3R91T1kxtfZKvR7M7Qx-Cw02Wbqjc9mfuZaF1-RF43pons9nxfk-6eP31abfPtl_Xm13OaWKzHmtCydVFCJygKVlDPrOMoKZcEbYLIRoqhrW3JDU9mgEYxSrCkHi8gRLbsgN0ff_VT1rrbOj4Pp9H5oezM86GBa_eeLb3f6NtxrxkXaRtJ_OOp3f6k2y60-3AEUJUfAe0zs-7nXEH5OLo66b6N1XWe8C1PUElAw4Oy_IKo0qaQqgddH0A4hxsE1j19A0IdA9SFQfQo0Kd49nffEzwkm4HIGDsqTHWqR-mKpm6nrRvdrfGL1bzIBb4_AXRzD8EgwJpAJYL8BzRO73g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>19253829</pqid></control><display><type>article</type><title>Whole-Genome Shotgun Assembly and Comparison of Human Genome Assemblies</title><source>Jstor Complete Legacy</source><source>MEDLINE</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Istrail, Sorin ; Sutton, Granger G. ; Florea, Liliana ; Halpern, Aaron L. ; Mobarry, Clark M. ; Lippert, Ross ; Walenz, Brian ; Shatkay, Hagit ; Dew, Ian ; Miller, Jason R. ; Flanigan, Michael J. ; Edwards, Nathan J. ; Bolanos, Randall ; Fasulo, Daniel ; Halldorsson, Bjarni V. ; Hannenhalli, Sridhar ; Turner, Russell ; Yooseph, Shibu ; Lu, Fu ; Nusskern, Deborah R. ; Shue, Bixiong Chris ; Zheng, Xiangqun Holly ; Zhong, Fei ; Delcher, Arthur L. ; Huson, Daniel H. ; Kravitz, Saul A. ; Mouchard, Laurent ; Reinert, Knut ; Remington, Karin A. ; Clark, Andrew G. ; Waterman, Michael S. ; Eichler, Evan E. ; Adams, Mark D. 
; Hunkapiller, Michael W. ; Myers, Eugene W. ; Venter, J. Craig</creator><creatorcontrib>Istrail, Sorin ; Sutton, Granger G. ; Florea, Liliana ; Halpern, Aaron L. ; Mobarry, Clark M. ; Lippert, Ross ; Walenz, Brian ; Shatkay, Hagit ; Dew, Ian ; Miller, Jason R. ; Flanigan, Michael J. ; Edwards, Nathan J. ; Bolanos, Randall ; Fasulo, Daniel ; Halldorsson, Bjarni V. ; Hannenhalli, Sridhar ; Turner, Russell ; Yooseph, Shibu ; Lu, Fu ; Nusskern, Deborah R. ; Shue, Bixiong Chris ; Zheng, Xiangqun Holly ; Zhong, Fei ; Delcher, Arthur L. ; Huson, Daniel H. ; Kravitz, Saul A. ; Mouchard, Laurent ; Reinert, Knut ; Remington, Karin A. ; Clark, Andrew G. ; Waterman, Michael S. ; Eichler, Evan E. ; Adams, Mark D. ; Hunkapiller, Michael W. ; Myers, Eugene W. ; Venter, J. Craig</creatorcontrib><description>We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304-1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860-921]. 
The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.</description><identifier>ISSN: 0027-8424</identifier><identifier>EISSN: 1091-6490</identifier><identifier>DOI: 10.1073/pnas.0307971100</identifier><identifier>PMID: 14769938</identifier><language>eng</language><publisher>United States: National Academy of Sciences</publisher><subject>Bioinformatics ; Biological Sciences ; Chromosomes ; Computational Biology - standards ; Computer Science ; Contig Mapping - standards ; Datasets ; Drosophila ; Genome, Human ; Genomes ; Genomics ; Human genome ; Human Genome Project ; Humans ; Life Sciences ; Mathematical sequences ; Quantitative Methods ; RNA, Messenger - analysis ; Scaffolds ; Sequencing ; Shotguns ; Software</subject><ispartof>Proceedings of the National Academy of Sciences - PNAS, 2004-02, Vol.101 (7), p.1916-1921</ispartof><rights>Copyright 1993/2004 The National Academy of Sciences of the United States of America</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><rights>Copyright © 2004, The National Academy of Sciences 
2004</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c597t-266e890b7bc028253ce518b1845f038f774ddc65a2f77f1a73221d250c11511c3</citedby><cites>FETCH-LOGICAL-c597t-266e890b7bc028253ce518b1845f038f774ddc65a2f77f1a73221d250c11511c3</cites><orcidid>0000-0002-0047-7736</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.pnas.org/content/101/7.cover.gif</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/3371370$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/3371370$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,723,776,780,799,881,27901,27902,53766,53768,57992,58225</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/14769938$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.science/hal-00465101$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Istrail, Sorin</creatorcontrib><creatorcontrib>Sutton, Granger G.</creatorcontrib><creatorcontrib>Florea, Liliana</creatorcontrib><creatorcontrib>Halpern, Aaron L.</creatorcontrib><creatorcontrib>Mobarry, Clark M.</creatorcontrib><creatorcontrib>Lippert, Ross</creatorcontrib><creatorcontrib>Walenz, Brian</creatorcontrib><creatorcontrib>Shatkay, Hagit</creatorcontrib><creatorcontrib>Dew, Ian</creatorcontrib><creatorcontrib>Miller, Jason R.</creatorcontrib><creatorcontrib>Flanigan, Michael J.</creatorcontrib><creatorcontrib>Edwards, Nathan J.</creatorcontrib><creatorcontrib>Bolanos, Randall</creatorcontrib><creatorcontrib>Fasulo, Daniel</creatorcontrib><creatorcontrib>Halldorsson, Bjarni V.</creatorcontrib><creatorcontrib>Hannenhalli, Sridhar</creatorcontrib><creatorcontrib>Turner, Russell</creatorcontrib><creatorcontrib>Yooseph, Shibu</creatorcontrib><creatorcontrib>Lu, Fu</creatorcontrib><creatorcontrib>Nusskern, 
Deborah R.</creatorcontrib><creatorcontrib>Shue, Bixiong Chris</creatorcontrib><creatorcontrib>Zheng, Xiangqun Holly</creatorcontrib><creatorcontrib>Zhong, Fei</creatorcontrib><creatorcontrib>Delcher, Arthur L.</creatorcontrib><creatorcontrib>Huson, Daniel H.</creatorcontrib><creatorcontrib>Kravitz, Saul A.</creatorcontrib><creatorcontrib>Mouchard, Laurent</creatorcontrib><creatorcontrib>Reinert, Knut</creatorcontrib><creatorcontrib>Remington, Karin A.</creatorcontrib><creatorcontrib>Clark, Andrew G.</creatorcontrib><creatorcontrib>Waterman, Michael S.</creatorcontrib><creatorcontrib>Eichler, Evan E.</creatorcontrib><creatorcontrib>Adams, Mark D.</creatorcontrib><creatorcontrib>Hunkapiller, Michael W.</creatorcontrib><creatorcontrib>Myers, Eugene W.</creatorcontrib><creatorcontrib>Venter, J. Craig</creatorcontrib><title>Whole-Genome Shotgun Assembly and Comparison of Human Genome Assemblies</title><title>Proceedings of the National Academy of Sciences - PNAS</title><addtitle>Proc Natl Acad Sci U S A</addtitle><description>We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. 
(2001) Science 291, 1304-1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860-921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.</description><subject>Bioinformatics</subject><subject>Biological Sciences</subject><subject>Chromosomes</subject><subject>Computational Biology - standards</subject><subject>Computer Science</subject><subject>Contig Mapping - standards</subject><subject>Datasets</subject><subject>Drosophila</subject><subject>Genome, Human</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Human genome</subject><subject>Human Genome Project</subject><subject>Humans</subject><subject>Life Sciences</subject><subject>Mathematical sequences</subject><subject>Quantitative Methods</subject><subject>RNA, Messenger - 
analysis</subject><subject>Scaffolds</subject><subject>Sequencing</subject><subject>Shotguns</subject><subject>Software</subject><issn>0027-8424</issn><issn>1091-6490</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkUFv1DAQhS0EotvCmQtCOVFxSDtjx7F96GG1gl2klTgA4mg5jtNNldhLnFT03-PVRt2CkDh5ZH_vjWceIW8QrhAEu957E6-AgVACEeAZWSAozMtCwXOyAKAilwUtzsh5jHcAoLiEl-QMC1EqxeSCrH_sQufytfOhd9nXXRhvJ58tY3R91T1kxtfZKvR7M7Qx-Cw02Wbqjc9mfuZaF1-RF43pons9nxfk-6eP31abfPtl_Xm13OaWKzHmtCydVFCJygKVlDPrOMoKZcEbYLIRoqhrW3JDU9mgEYxSrCkHi8gRLbsgN0ff_VT1rrbOj4Pp9H5oezM86GBa_eeLb3f6NtxrxkXaRtJ_OOp3f6k2y60-3AEUJUfAe0zs-7nXEH5OLo66b6N1XWe8C1PUElAw4Oy_IKo0qaQqgddH0A4hxsE1j19A0IdA9SFQfQo0Kd49nffEzwkm4HIGDsqTHWqR-mKpm6nrRvdrfGL1bzIBb4_AXRzD8EgwJpAJYL8BzRO73g</recordid><startdate>20040217</startdate><enddate>20040217</enddate><creator>Istrail, Sorin</creator><creator>Sutton, Granger G.</creator><creator>Florea, Liliana</creator><creator>Halpern, Aaron L.</creator><creator>Mobarry, Clark M.</creator><creator>Lippert, Ross</creator><creator>Walenz, Brian</creator><creator>Shatkay, Hagit</creator><creator>Dew, Ian</creator><creator>Miller, Jason R.</creator><creator>Flanigan, Michael J.</creator><creator>Edwards, Nathan J.</creator><creator>Bolanos, Randall</creator><creator>Fasulo, Daniel</creator><creator>Halldorsson, Bjarni V.</creator><creator>Hannenhalli, Sridhar</creator><creator>Turner, Russell</creator><creator>Yooseph, Shibu</creator><creator>Lu, Fu</creator><creator>Nusskern, Deborah R.</creator><creator>Shue, Bixiong Chris</creator><creator>Zheng, Xiangqun Holly</creator><creator>Zhong, Fei</creator><creator>Delcher, Arthur L.</creator><creator>Huson, Daniel H.</creator><creator>Kravitz, Saul A.</creator><creator>Mouchard, Laurent</creator><creator>Reinert, Knut</creator><creator>Remington, Karin A.</creator><creator>Clark, Andrew G.</creator><creator>Waterman, Michael 
S.</creator><creator>Eichler, Evan E.</creator><creator>Adams, Mark D.</creator><creator>Hunkapiller, Michael W.</creator><creator>Myers, Eugene W.</creator><creator>Venter, J. Craig</creator><general>National Academy of Sciences</general><general>National Acad Sciences</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7TM</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>1XC</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0047-7736</orcidid></search><sort><creationdate>20040217</creationdate><title>Whole-Genome Shotgun Assembly and Comparison of Human Genome Assemblies</title><author>Istrail, Sorin ; Sutton, Granger G. ; Florea, Liliana ; Halpern, Aaron L. ; Mobarry, Clark M. ; Lippert, Ross ; Walenz, Brian ; Shatkay, Hagit ; Dew, Ian ; Miller, Jason R. ; Flanigan, Michael J. ; Edwards, Nathan J. ; Bolanos, Randall ; Fasulo, Daniel ; Halldorsson, Bjarni V. ; Hannenhalli, Sridhar ; Turner, Russell ; Yooseph, Shibu ; Lu, Fu ; Nusskern, Deborah R. ; Shue, Bixiong Chris ; Zheng, Xiangqun Holly ; Zhong, Fei ; Delcher, Arthur L. ; Huson, Daniel H. ; Kravitz, Saul A. ; Mouchard, Laurent ; Reinert, Knut ; Remington, Karin A. ; Clark, Andrew G. ; Waterman, Michael S. ; Eichler, Evan E. ; Adams, Mark D. ; Hunkapiller, Michael W. ; Myers, Eugene W. ; Venter, J. 
Craig</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c597t-266e890b7bc028253ce518b1845f038f774ddc65a2f77f1a73221d250c11511c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Bioinformatics</topic><topic>Biological Sciences</topic><topic>Chromosomes</topic><topic>Computational Biology - standards</topic><topic>Computer Science</topic><topic>Contig Mapping - standards</topic><topic>Datasets</topic><topic>Drosophila</topic><topic>Genome, Human</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Human genome</topic><topic>Human Genome Project</topic><topic>Humans</topic><topic>Life Sciences</topic><topic>Mathematical sequences</topic><topic>Quantitative Methods</topic><topic>RNA, Messenger - analysis</topic><topic>Scaffolds</topic><topic>Sequencing</topic><topic>Shotguns</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Istrail, Sorin</creatorcontrib><creatorcontrib>Sutton, Granger G.</creatorcontrib><creatorcontrib>Florea, Liliana</creatorcontrib><creatorcontrib>Halpern, Aaron L.</creatorcontrib><creatorcontrib>Mobarry, Clark M.</creatorcontrib><creatorcontrib>Lippert, Ross</creatorcontrib><creatorcontrib>Walenz, Brian</creatorcontrib><creatorcontrib>Shatkay, Hagit</creatorcontrib><creatorcontrib>Dew, Ian</creatorcontrib><creatorcontrib>Miller, Jason R.</creatorcontrib><creatorcontrib>Flanigan, Michael J.</creatorcontrib><creatorcontrib>Edwards, Nathan J.</creatorcontrib><creatorcontrib>Bolanos, Randall</creatorcontrib><creatorcontrib>Fasulo, Daniel</creatorcontrib><creatorcontrib>Halldorsson, Bjarni V.</creatorcontrib><creatorcontrib>Hannenhalli, Sridhar</creatorcontrib><creatorcontrib>Turner, Russell</creatorcontrib><creatorcontrib>Yooseph, Shibu</creatorcontrib><creatorcontrib>Lu, Fu</creatorcontrib><creatorcontrib>Nusskern, Deborah R.</creatorcontrib><creatorcontrib>Shue, 
Bixiong Chris</creatorcontrib><creatorcontrib>Zheng, Xiangqun Holly</creatorcontrib><creatorcontrib>Zhong, Fei</creatorcontrib><creatorcontrib>Delcher, Arthur L.</creatorcontrib><creatorcontrib>Huson, Daniel H.</creatorcontrib><creatorcontrib>Kravitz, Saul A.</creatorcontrib><creatorcontrib>Mouchard, Laurent</creatorcontrib><creatorcontrib>Reinert, Knut</creatorcontrib><creatorcontrib>Remington, Karin A.</creatorcontrib><creatorcontrib>Clark, Andrew G.</creatorcontrib><creatorcontrib>Waterman, Michael S.</creatorcontrib><creatorcontrib>Eichler, Evan E.</creatorcontrib><creatorcontrib>Adams, Mark D.</creatorcontrib><creatorcontrib>Hunkapiller, Michael W.</creatorcontrib><creatorcontrib>Myers, Eugene W.</creatorcontrib><creatorcontrib>Venter, J. Craig</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Nucleic Acids Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Istrail, Sorin</au><au>Sutton, Granger G.</au><au>Florea, Liliana</au><au>Halpern, Aaron L.</au><au>Mobarry, Clark M.</au><au>Lippert, Ross</au><au>Walenz, Brian</au><au>Shatkay, Hagit</au><au>Dew, Ian</au><au>Miller, Jason R.</au><au>Flanigan, Michael J.</au><au>Edwards, Nathan J.</au><au>Bolanos, Randall</au><au>Fasulo, Daniel</au><au>Halldorsson, Bjarni 
V.</au><au>Hannenhalli, Sridhar</au><au>Turner, Russell</au><au>Yooseph, Shibu</au><au>Lu, Fu</au><au>Nusskern, Deborah R.</au><au>Shue, Bixiong Chris</au><au>Zheng, Xiangqun Holly</au><au>Zhong, Fei</au><au>Delcher, Arthur L.</au><au>Huson, Daniel H.</au><au>Kravitz, Saul A.</au><au>Mouchard, Laurent</au><au>Reinert, Knut</au><au>Remington, Karin A.</au><au>Clark, Andrew G.</au><au>Waterman, Michael S.</au><au>Eichler, Evan E.</au><au>Adams, Mark D.</au><au>Hunkapiller, Michael W.</au><au>Myers, Eugene W.</au><au>Venter, J. Craig</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Whole-Genome Shotgun Assembly and Comparison of Human Genome Assemblies</atitle><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle><addtitle>Proc Natl Acad Sci U S A</addtitle><date>2004-02-17</date><risdate>2004</risdate><volume>101</volume><issue>7</issue><spage>1916</spage><epage>1921</epage><pages>1916-1921</pages><issn>0027-8424</issn><eissn>1091-6490</eissn><abstract>We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. 
(2001) Science 291, 1304-1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860-921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.</abstract><cop>United States</cop><pub>National Academy of Sciences</pub><pmid>14769938</pmid><doi>10.1073/pnas.0307971100</doi><tpages>6</tpages><orcidid>https://orcid.org/0000-0002-0047-7736</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0027-8424 |
ispartof | Proceedings of the National Academy of Sciences - PNAS, 2004-02, Vol.101 (7), p.1916-1921 |
issn | 0027-8424; 1091-6490 |
language | eng |
recordid | cdi_pnas_primary_101_7_1916_fulltext |
source | Jstor Complete Legacy; MEDLINE; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry |
subjects | Bioinformatics; Biological Sciences; Chromosomes; Computational Biology - standards; Computer Science; Contig Mapping - standards; Datasets; Drosophila; Genome, Human; Genomes; Genomics; Human genome; Human Genome Project; Humans; Life Sciences; Mathematical sequences; Quantitative Methods; RNA, Messenger - analysis; Scaffolds; Sequencing; Shotguns; Software |
title | Whole-Genome Shotgun Assembly and Comparison of Human Genome Assemblies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T04%3A23%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pnas_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Whole-Genome%20Shotgun%20Assembly%20and%20Comparison%20of%20Human%20Genome%20Assemblies&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Istrail,%20Sorin&rft.date=2004-02-17&rft.volume=101&rft.issue=7&rft.spage=1916&rft.epage=1921&rft.pages=1916-1921&rft.issn=0027-8424&rft.eissn=1091-6490&rft_id=info:doi/10.1073/pnas.0307971100&rft_dat=%3Cjstor_pnas_%3E3371370%3C/jstor_pnas_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=19253829&rft_id=info:pmid/14769938&rft_jstor_id=3371370&rfr_iscdi=true |
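The `url` field above is an OpenURL (`ctx_ver=Z39.88-2004`) that carries the citation as key/value pairs in its query string. As a minimal sketch of how such a link can be read back into citation fields, the example below uses only Python's standard `urllib.parse`. The constant `OPENURL` is a shortened copy of the link with a subset of its parameters, and `parse_openurl` is a hypothetical helper name, not part of any resolver API.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the record's resolver link (subset of parameters only).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Whole-Genome%20Shotgun%20Assembly%20and%20Comparison"
    "%20of%20Human%20Genome%20Assemblies"
    "&rft.au=Istrail,%20Sorin"
    "&rft.date=2004-02-17&rft.volume=101&rft.issue=7"
    "&rft.spage=1916&rft.epage=1921"
    "&rft.issn=0027-8424&rft.eissn=1091-6490"
    "&rft_id=info:doi/10.1073/pnas.0307971100"
    "&rft_id=info:pmid/14769938"
)


def parse_openurl(url: str) -> tuple[dict[str, str], list[str]]:
    """Split an OpenURL into referent fields (rft.*) and identifiers (rft_id)."""
    query = parse_qs(urlsplit(url).query)  # percent-decoding happens here
    # Keep the first value of each "rft." key and drop the prefix.
    referent = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    # rft_id may repeat (DOI, PMID, ...), so keep every value.
    identifiers = query.get("rft_id", [])
    return referent, identifiers


referent, identifiers = parse_openurl(OPENURL)

if __name__ == "__main__":
    for field, value in referent.items():
        print(f"{field}: {value}")
    for identifier in identifiers:
        print(f"id: {identifier}")
```

`parse_qs` returns a list per key, which matters here because `rft_id` appears more than once in the original link; taking only the first value would silently drop the PMID.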