Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads p...

Bibliographic details
Published in: Genome Research 2009-09, Vol.19 (9), p.1527-1541
Main authors: McKernan, Kevin Judd; Peckham, Heather E; Costa, Gina L; McLaughlin, Stephen F; Fu, Yutao; Tsung, Eric F; Clouser, Christopher R; Duncan, Cisyla; Ichikawa, Jeffrey K; Lee, Clarence C; Zhang, Zheng; Ranade, Swati S; Dimalanta, Eileen T; Hyland, Fiona C; Sokolsky, Tanya D; Zhang, Lei; Sheridan, Andrew; Fu, Haoning; Hendrickson, Cynthia L; Li, Bin; Kotler, Lev; Stuart, Jeremy R; Malek, Joel A; Manning, Jonathan M; Antipova, Alena A; Perez, Damon S; Moore, Michael P; Hayashibara, Kathleen C; Lyons, Michael R; Beaudoin, Robert E; Coleman, Brittany E; Laptewicz, Michael W; Sannicandro, Adam E; Rhodes, Michael D; Gottimukkala, Rajesh K; Yang, Shan; Bafna, Vineet; Bashir, Ali; MacBride, Andrew; Alkan, Can; Kidd, Jeffrey M; Eichler, Evan E; Reese, Martin G; De La Vega, Francisco M; Blanchard, Alan P
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1541
container_issue 9
container_start_page 1527
container_title Genome Research
container_volume 19
creator McKernan, Kevin Judd
Peckham, Heather E
Costa, Gina L
McLaughlin, Stephen F
Fu, Yutao
Tsung, Eric F
Clouser, Christopher R
Duncan, Cisyla
Ichikawa, Jeffrey K
Lee, Clarence C
Zhang, Zheng
Ranade, Swati S
Dimalanta, Eileen T
Hyland, Fiona C
Sokolsky, Tanya D
Zhang, Lei
Sheridan, Andrew
Fu, Haoning
Hendrickson, Cynthia L
Li, Bin
Kotler, Lev
Stuart, Jeremy R
Malek, Joel A
Manning, Jonathan M
Antipova, Alena A
Perez, Damon S
Moore, Michael P
Hayashibara, Kathleen C
Lyons, Michael R
Beaudoin, Robert E
Coleman, Brittany E
Laptewicz, Michael W
Sannicandro, Adam E
Rhodes, Michael D
Gottimukkala, Rajesh K
Yang, Shan
Bafna, Vineet
Bashir, Ali
MacBride, Andrew
Alkan, Can
Kidd, Jeffrey M
Eichler, Evan E
Reese, Martin G
De La Vega, Francisco M
Blanchard, Alan P
description We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.
doi_str_mv 10.1101/gr.091868.109
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2752135</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>734033997</sourcerecordid><originalsourceid>FETCH-LOGICAL-c452t-f7d0887aaf90931f1a07f45103b4e56e1aa63ac03dfdb5e19a31055520c856d43</originalsourceid><addsrcrecordid>eNpVkcuO1DAQRS0EYoaBJVvkHRvSuNpxEm-Q0IiXNBILYG1V7EraKLEbO2nU38BP41FaPDYu69bRrSpdxp6D2AEIeD2mndDQNd0OhH7ArkHVulJ1ox-Wv-i6SgsFV-xJzt-FELLuusfsCnQhoNHX7NcX-rFSsMQxOJ6XtNplTTjxEyaPi4-B-8CRH9YZAx8pxJn4Gmw8USLH-zPPh5iWKhG6V3zGnP2JpjM_YjGZaOKTHzebvA3yYeRrvn-Xn7HqMRMvanRFecoeDThlenapN-zb-3dfbz9Wd58_fLp9e1fZWu2XamhduatFHLTQEgZA0Q61AiH7mlRDgNhItEK6wfWKQKMEoZTaC9upxtXyhr3ZfI9rP5OzFJayrDkmP2M6m4je_N8J_mDGeDL7Vu1BqmLw8mKQYjkqL2b22dI0YaC4ZtPKWkipdVvIaiNtijknGv5MAWHu8zNjMlt-RdGFf_Hvan_pS2DyNy3Vmpw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>734033997</pqid></control><display><type>article</type><title>Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding</title><source>MEDLINE</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>McKernan, Kevin Judd ; Peckham, Heather E ; Costa, Gina L ; McLaughlin, Stephen F ; Fu, Yutao ; Tsung, Eric F ; Clouser, Christopher R ; Duncan, Cisyla ; Ichikawa, Jeffrey K ; Lee, Clarence C ; Zhang, Zheng ; Ranade, Swati S ; Dimalanta, Eileen T ; Hyland, Fiona C ; Sokolsky, Tanya D ; Zhang, Lei ; Sheridan, Andrew ; Fu, Haoning ; Hendrickson, Cynthia L ; Li, Bin ; Kotler, Lev ; Stuart, Jeremy R ; Malek, Joel A ; Manning, Jonathan M ; Antipova, Alena A ; Perez, Damon S ; Moore, Michael P ; Hayashibara, Kathleen C ; Lyons, Michael R ; Beaudoin, Robert E ; Coleman, Brittany E ; Laptewicz, Michael W ; Sannicandro, Adam E 
; Rhodes, Michael D ; Gottimukkala, Rajesh K ; Yang, Shan ; Bafna, Vineet ; Bashir, Ali ; MacBride, Andrew ; Alkan, Can ; Kidd, Jeffrey M ; Eichler, Evan E ; Reese, Martin G ; De La Vega, Francisco M ; Blanchard, Alan P</creator><creatorcontrib>McKernan, Kevin Judd ; Peckham, Heather E ; Costa, Gina L ; McLaughlin, Stephen F ; Fu, Yutao ; Tsung, Eric F ; Clouser, Christopher R ; Duncan, Cisyla ; Ichikawa, Jeffrey K ; Lee, Clarence C ; Zhang, Zheng ; Ranade, Swati S ; Dimalanta, Eileen T ; Hyland, Fiona C ; Sokolsky, Tanya D ; Zhang, Lei ; Sheridan, Andrew ; Fu, Haoning ; Hendrickson, Cynthia L ; Li, Bin ; Kotler, Lev ; Stuart, Jeremy R ; Malek, Joel A ; Manning, Jonathan M ; Antipova, Alena A ; Perez, Damon S ; Moore, Michael P ; Hayashibara, Kathleen C ; Lyons, Michael R ; Beaudoin, Robert E ; Coleman, Brittany E ; Laptewicz, Michael W ; Sannicandro, Adam E ; Rhodes, Michael D ; Gottimukkala, Rajesh K ; Yang, Shan ; Bafna, Vineet ; Bashir, Ali ; MacBride, Andrew ; Alkan, Can ; Kidd, Jeffrey M ; Eichler, Evan E ; Reese, Martin G ; De La Vega, Francisco M ; Blanchard, Alan P</creatorcontrib><description>We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to &gt;99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. 
We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.</description><identifier>ISSN: 1088-9051</identifier><identifier>EISSN: 1549-5469</identifier><identifier>EISSN: 1549-5477</identifier><identifier>DOI: 10.1101/gr.091868.109</identifier><identifier>PMID: 19546169</identifier><language>eng</language><publisher>United States: Cold Spring Harbor Laboratory Press</publisher><subject>Africa ; Base Pairing ; Base Sequence ; Computational Biology - methods ; Genetic Variation ; Genome, Human ; Genomics ; Genotype ; Heterozygote ; Homozygote ; Humans ; Ligases ; Methods ; Polymorphism, Single Nucleotide ; Reference Standards ; Sequence Analysis, DNA - methods</subject><ispartof>Genome Research, 2009-09, Vol.19 (9), p.1527-1541</ispartof><rights>Copyright © 2009 by Cold Spring Harbor Laboratory 
Press</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c452t-f7d0887aaf90931f1a07f45103b4e56e1aa63ac03dfdb5e19a31055520c856d43</citedby><cites>FETCH-LOGICAL-c452t-f7d0887aaf90931f1a07f45103b4e56e1aa63ac03dfdb5e19a31055520c856d43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2752135/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2752135/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19546169$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>McKernan, Kevin Judd</creatorcontrib><creatorcontrib>Peckham, Heather E</creatorcontrib><creatorcontrib>Costa, Gina L</creatorcontrib><creatorcontrib>McLaughlin, Stephen F</creatorcontrib><creatorcontrib>Fu, Yutao</creatorcontrib><creatorcontrib>Tsung, Eric F</creatorcontrib><creatorcontrib>Clouser, Christopher R</creatorcontrib><creatorcontrib>Duncan, Cisyla</creatorcontrib><creatorcontrib>Ichikawa, Jeffrey K</creatorcontrib><creatorcontrib>Lee, Clarence C</creatorcontrib><creatorcontrib>Zhang, Zheng</creatorcontrib><creatorcontrib>Ranade, Swati S</creatorcontrib><creatorcontrib>Dimalanta, Eileen T</creatorcontrib><creatorcontrib>Hyland, Fiona C</creatorcontrib><creatorcontrib>Sokolsky, Tanya D</creatorcontrib><creatorcontrib>Zhang, Lei</creatorcontrib><creatorcontrib>Sheridan, Andrew</creatorcontrib><creatorcontrib>Fu, Haoning</creatorcontrib><creatorcontrib>Hendrickson, Cynthia L</creatorcontrib><creatorcontrib>Li, Bin</creatorcontrib><creatorcontrib>Kotler, Lev</creatorcontrib><creatorcontrib>Stuart, Jeremy 
R</creatorcontrib><creatorcontrib>Malek, Joel A</creatorcontrib><creatorcontrib>Manning, Jonathan M</creatorcontrib><creatorcontrib>Antipova, Alena A</creatorcontrib><creatorcontrib>Perez, Damon S</creatorcontrib><creatorcontrib>Moore, Michael P</creatorcontrib><creatorcontrib>Hayashibara, Kathleen C</creatorcontrib><creatorcontrib>Lyons, Michael R</creatorcontrib><creatorcontrib>Beaudoin, Robert E</creatorcontrib><creatorcontrib>Coleman, Brittany E</creatorcontrib><creatorcontrib>Laptewicz, Michael W</creatorcontrib><creatorcontrib>Sannicandro, Adam E</creatorcontrib><creatorcontrib>Rhodes, Michael D</creatorcontrib><creatorcontrib>Gottimukkala, Rajesh K</creatorcontrib><creatorcontrib>Yang, Shan</creatorcontrib><creatorcontrib>Bafna, Vineet</creatorcontrib><creatorcontrib>Bashir, Ali</creatorcontrib><creatorcontrib>MacBride, Andrew</creatorcontrib><creatorcontrib>Alkan, Can</creatorcontrib><creatorcontrib>Kidd, Jeffrey M</creatorcontrib><creatorcontrib>Eichler, Evan E</creatorcontrib><creatorcontrib>Reese, Martin G</creatorcontrib><creatorcontrib>De La Vega, Francisco M</creatorcontrib><creatorcontrib>Blanchard, Alan P</creatorcontrib><title>Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding</title><title>Genome Research</title><addtitle>Genome Res</addtitle><description>We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to &gt;99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. 
We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.</description><subject>Africa</subject><subject>Base Pairing</subject><subject>Base Sequence</subject><subject>Computational Biology - methods</subject><subject>Genetic Variation</subject><subject>Genome, Human</subject><subject>Genomics</subject><subject>Genotype</subject><subject>Heterozygote</subject><subject>Homozygote</subject><subject>Humans</subject><subject>Ligases</subject><subject>Methods</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Reference Standards</subject><subject>Sequence Analysis, DNA - 
methods</subject><issn>1088-9051</issn><issn>1549-5469</issn><issn>1549-5477</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkcuO1DAQRS0EYoaBJVvkHRvSuNpxEm-Q0IiXNBILYG1V7EraKLEbO2nU38BP41FaPDYu69bRrSpdxp6D2AEIeD2mndDQNd0OhH7ArkHVulJ1ox-Wv-i6SgsFV-xJzt-FELLuusfsCnQhoNHX7NcX-rFSsMQxOJ6XtNplTTjxEyaPi4-B-8CRH9YZAx8pxJn4Gmw8USLH-zPPh5iWKhG6V3zGnP2JpjM_YjGZaOKTHzebvA3yYeRrvn-Xn7HqMRMvanRFecoeDThlenapN-zb-3dfbz9Wd58_fLp9e1fZWu2XamhduatFHLTQEgZA0Q61AiH7mlRDgNhItEK6wfWKQKMEoZTaC9upxtXyhr3ZfI9rP5OzFJayrDkmP2M6m4je_N8J_mDGeDL7Vu1BqmLw8mKQYjkqL2b22dI0YaC4ZtPKWkipdVvIaiNtijknGv5MAWHu8zNjMlt-RdGFf_Hvan_pS2DyNy3Vmpw</recordid><startdate>20090901</startdate><enddate>20090901</enddate><creator>McKernan, Kevin Judd</creator><creator>Peckham, Heather E</creator><creator>Costa, Gina L</creator><creator>McLaughlin, Stephen F</creator><creator>Fu, Yutao</creator><creator>Tsung, Eric F</creator><creator>Clouser, Christopher R</creator><creator>Duncan, Cisyla</creator><creator>Ichikawa, Jeffrey K</creator><creator>Lee, Clarence C</creator><creator>Zhang, Zheng</creator><creator>Ranade, Swati S</creator><creator>Dimalanta, Eileen T</creator><creator>Hyland, Fiona C</creator><creator>Sokolsky, Tanya D</creator><creator>Zhang, Lei</creator><creator>Sheridan, Andrew</creator><creator>Fu, Haoning</creator><creator>Hendrickson, Cynthia L</creator><creator>Li, Bin</creator><creator>Kotler, Lev</creator><creator>Stuart, Jeremy R</creator><creator>Malek, Joel A</creator><creator>Manning, Jonathan M</creator><creator>Antipova, Alena A</creator><creator>Perez, Damon S</creator><creator>Moore, Michael P</creator><creator>Hayashibara, Kathleen C</creator><creator>Lyons, Michael R</creator><creator>Beaudoin, Robert E</creator><creator>Coleman, Brittany E</creator><creator>Laptewicz, Michael W</creator><creator>Sannicandro, Adam E</creator><creator>Rhodes, Michael 
D</creator><creator>Gottimukkala, Rajesh K</creator><creator>Yang, Shan</creator><creator>Bafna, Vineet</creator><creator>Bashir, Ali</creator><creator>MacBride, Andrew</creator><creator>Alkan, Can</creator><creator>Kidd, Jeffrey M</creator><creator>Eichler, Evan E</creator><creator>Reese, Martin G</creator><creator>De La Vega, Francisco M</creator><creator>Blanchard, Alan P</creator><general>Cold Spring Harbor Laboratory Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20090901</creationdate><title>Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding</title><author>McKernan, Kevin Judd ; Peckham, Heather E ; Costa, Gina L ; McLaughlin, Stephen F ; Fu, Yutao ; Tsung, Eric F ; Clouser, Christopher R ; Duncan, Cisyla ; Ichikawa, Jeffrey K ; Lee, Clarence C ; Zhang, Zheng ; Ranade, Swati S ; Dimalanta, Eileen T ; Hyland, Fiona C ; Sokolsky, Tanya D ; Zhang, Lei ; Sheridan, Andrew ; Fu, Haoning ; Hendrickson, Cynthia L ; Li, Bin ; Kotler, Lev ; Stuart, Jeremy R ; Malek, Joel A ; Manning, Jonathan M ; Antipova, Alena A ; Perez, Damon S ; Moore, Michael P ; Hayashibara, Kathleen C ; Lyons, Michael R ; Beaudoin, Robert E ; Coleman, Brittany E ; Laptewicz, Michael W ; Sannicandro, Adam E ; Rhodes, Michael D ; Gottimukkala, Rajesh K ; Yang, Shan ; Bafna, Vineet ; Bashir, Ali ; MacBride, Andrew ; Alkan, Can ; Kidd, Jeffrey M ; Eichler, Evan E ; Reese, Martin G ; De La Vega, Francisco M ; Blanchard, Alan P</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c452t-f7d0887aaf90931f1a07f45103b4e56e1aa63ac03dfdb5e19a31055520c856d43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Africa</topic><topic>Base 
Pairing</topic><topic>Base Sequence</topic><topic>Computational Biology - methods</topic><topic>Genetic Variation</topic><topic>Genome, Human</topic><topic>Genomics</topic><topic>Genotype</topic><topic>Heterozygote</topic><topic>Homozygote</topic><topic>Humans</topic><topic>Ligases</topic><topic>Methods</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Reference Standards</topic><topic>Sequence Analysis, DNA - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>McKernan, Kevin Judd</creatorcontrib><creatorcontrib>Peckham, Heather E</creatorcontrib><creatorcontrib>Costa, Gina L</creatorcontrib><creatorcontrib>McLaughlin, Stephen F</creatorcontrib><creatorcontrib>Fu, Yutao</creatorcontrib><creatorcontrib>Tsung, Eric F</creatorcontrib><creatorcontrib>Clouser, Christopher R</creatorcontrib><creatorcontrib>Duncan, Cisyla</creatorcontrib><creatorcontrib>Ichikawa, Jeffrey K</creatorcontrib><creatorcontrib>Lee, Clarence C</creatorcontrib><creatorcontrib>Zhang, Zheng</creatorcontrib><creatorcontrib>Ranade, Swati S</creatorcontrib><creatorcontrib>Dimalanta, Eileen T</creatorcontrib><creatorcontrib>Hyland, Fiona C</creatorcontrib><creatorcontrib>Sokolsky, Tanya D</creatorcontrib><creatorcontrib>Zhang, Lei</creatorcontrib><creatorcontrib>Sheridan, Andrew</creatorcontrib><creatorcontrib>Fu, Haoning</creatorcontrib><creatorcontrib>Hendrickson, Cynthia L</creatorcontrib><creatorcontrib>Li, Bin</creatorcontrib><creatorcontrib>Kotler, Lev</creatorcontrib><creatorcontrib>Stuart, Jeremy R</creatorcontrib><creatorcontrib>Malek, Joel A</creatorcontrib><creatorcontrib>Manning, Jonathan M</creatorcontrib><creatorcontrib>Antipova, Alena A</creatorcontrib><creatorcontrib>Perez, Damon S</creatorcontrib><creatorcontrib>Moore, Michael P</creatorcontrib><creatorcontrib>Hayashibara, Kathleen C</creatorcontrib><creatorcontrib>Lyons, Michael R</creatorcontrib><creatorcontrib>Beaudoin, Robert E</creatorcontrib><creatorcontrib>Coleman, 
Brittany E</creatorcontrib><creatorcontrib>Laptewicz, Michael W</creatorcontrib><creatorcontrib>Sannicandro, Adam E</creatorcontrib><creatorcontrib>Rhodes, Michael D</creatorcontrib><creatorcontrib>Gottimukkala, Rajesh K</creatorcontrib><creatorcontrib>Yang, Shan</creatorcontrib><creatorcontrib>Bafna, Vineet</creatorcontrib><creatorcontrib>Bashir, Ali</creatorcontrib><creatorcontrib>MacBride, Andrew</creatorcontrib><creatorcontrib>Alkan, Can</creatorcontrib><creatorcontrib>Kidd, Jeffrey M</creatorcontrib><creatorcontrib>Eichler, Evan E</creatorcontrib><creatorcontrib>Reese, Martin G</creatorcontrib><creatorcontrib>De La Vega, Francisco M</creatorcontrib><creatorcontrib>Blanchard, Alan P</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genome Research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>McKernan, Kevin Judd</au><au>Peckham, Heather E</au><au>Costa, Gina L</au><au>McLaughlin, Stephen F</au><au>Fu, Yutao</au><au>Tsung, Eric F</au><au>Clouser, Christopher R</au><au>Duncan, Cisyla</au><au>Ichikawa, Jeffrey K</au><au>Lee, Clarence C</au><au>Zhang, Zheng</au><au>Ranade, Swati S</au><au>Dimalanta, Eileen T</au><au>Hyland, Fiona C</au><au>Sokolsky, Tanya D</au><au>Zhang, Lei</au><au>Sheridan, Andrew</au><au>Fu, Haoning</au><au>Hendrickson, Cynthia L</au><au>Li, Bin</au><au>Kotler, Lev</au><au>Stuart, Jeremy R</au><au>Malek, Joel A</au><au>Manning, Jonathan M</au><au>Antipova, Alena A</au><au>Perez, Damon S</au><au>Moore, Michael P</au><au>Hayashibara, Kathleen C</au><au>Lyons, Michael R</au><au>Beaudoin, Robert E</au><au>Coleman, Brittany E</au><au>Laptewicz, Michael 
W</au><au>Sannicandro, Adam E</au><au>Rhodes, Michael D</au><au>Gottimukkala, Rajesh K</au><au>Yang, Shan</au><au>Bafna, Vineet</au><au>Bashir, Ali</au><au>MacBride, Andrew</au><au>Alkan, Can</au><au>Kidd, Jeffrey M</au><au>Eichler, Evan E</au><au>Reese, Martin G</au><au>De La Vega, Francisco M</au><au>Blanchard, Alan P</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding</atitle><jtitle>Genome Research</jtitle><addtitle>Genome Res</addtitle><date>2009-09-01</date><risdate>2009</risdate><volume>19</volume><issue>9</issue><spage>1527</spage><epage>1541</epage><pages>1527-1541</pages><issn>1088-9051</issn><eissn>1549-5469</eissn><eissn>1549-5477</eissn><abstract>We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to &gt;99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. 
We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.</abstract><cop>United States</cop><pub>Cold Spring Harbor Laboratory Press</pub><pmid>19546169</pmid><doi>10.1101/gr.091868.109</doi><tpages>15</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1088-9051
ispartof Genome Research, 2009-09, Vol.19 (9), p.1527-1541
issn 1088-9051
1549-5469
1549-5477
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2752135
source MEDLINE; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects Africa
Base Pairing
Base Sequence
Computational Biology - methods
Genetic Variation
Genome, Human
Genomics
Genotype
Heterozygote
Homozygote
Humans
Ligases
Methods
Polymorphism, Single Nucleotide
Reference Standards
Sequence Analysis, DNA - methods
title Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T13%3A28%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sequence%20and%20structural%20variation%20in%20a%20human%20genome%20uncovered%20by%20short-read,%20massively%20parallel%20ligation%20sequencing%20using%20two-base%20encoding&rft.jtitle=Genome%20Research&rft.au=McKernan,%20Kevin%20Judd&rft.date=2009-09-01&rft.volume=19&rft.issue=9&rft.spage=1527&rft.epage=1541&rft.pages=1527-1541&rft.issn=1088-9051&rft.eissn=1549-5469&rft_id=info:doi/10.1101/gr.091868.109&rft_dat=%3Cproquest_pubme%3E734033997%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=734033997&rft_id=info:pmid/19546169&rfr_iscdi=true
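The url field above is an OpenURL-style resolver link whose query string repeats the citation as key-value pairs (`rft.*` keys for citation fields, `rft_id` for identifiers). As a minimal sketch, assuming only Python's standard `urllib.parse` module, the fields can be recovered as follows. The function name `parse_openurl` and the shortened `OPENURL` excerpt are illustrative choices, not part of the record.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened excerpt of the record's resolver link; only some rft.* keys are kept.
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article&rft.jtitle=Genome%20Research"
    "&rft.au=McKernan,%20Kevin%20Judd&rft.date=2009-09-01"
    "&rft.volume=19&rft.issue=9&rft.spage=1527&rft.epage=1541"
    "&rft.issn=1088-9051"
    "&rft_id=info:doi/10.1101/gr.091868.109"
    "&rft_id=info:pmid/19546169"
)


def parse_openurl(url):
    """Hypothetical helper: return (citation fields, identifiers) from the query string."""
    query = parse_qs(urlsplit(url).query)  # percent-decodes values; each key maps to a list
    # Citation fields: strip the "rft." prefix and keep the first value of each key.
    fields = {k[len("rft."):]: v[0] for k, v in query.items() if k.startswith("rft.")}
    ids = {}
    for value in query.get("rft_id", []):
        # Identifiers have the form "info:<scheme>/<value>", e.g. "info:doi/10.1101/...".
        if value.startswith("info:"):
            scheme, _, ident = value[len("info:"):].partition("/")
            if ident:  # skip empty identifiers such as "info:oai/"
                ids[scheme] = ident
    return fields, ids


fields, ids = parse_openurl(OPENURL)
print(fields["jtitle"], fields["volume"], fields["spage"], ids["doi"], ids["pmid"])
```

Because `rft_id` occurs more than once in the link, the sketch reads it as a list instead of taking a single value.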