Phased diploid genome assembly with single-molecule real-time sequencing
Published in: | Nature Methods 2016-12, Vol.13 (12), p.1050-1054 |
---|---|
Main authors: | , , , , , , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | The open-source FALCON and FALCON-Unzip software utilize long-read sequencing data to generate contiguous, accurate and phased diploid assemblies, even from genomes that are highly heterozygous.
While genome assembly projects have been successful in many haploid and inbred species, the assembly of noninbred or rearranged heterozygous genomes remains a major challenge. To address this challenge, we introduce the open-source FALCON and FALCON-Unzip algorithms (https://github.com/PacificBiosciences/FALCON/) to assemble long-read sequencing data into highly accurate, contiguous, and correctly phased diploid genomes. We generate new reference sequences for heterozygous samples including an F1 hybrid of Arabidopsis thaliana, the widely cultivated Vitis vinifera cv. Cabernet Sauvignon, and the coral fungus Clavicorona pyxidata, samples that have challenged short-read assembly approaches. The FALCON-based assemblies are substantially more contiguous and complete than alternate short- or long-read approaches. The phased diploid assembly enabled the study of haplotype structure and heterozygosities between homologous chromosomes, including the identification of widespread heterozygous structural variation within coding sequences. |
---|---|
ISSN: | 1548-7091, 1548-7105 |
DOI: | 10.1038/nmeth.4035 |