A second unveiling: Haplotig masking of the eastern oyster genome improves population‐level inference

Bibliographic details
Published in: Molecular ecology resources 2024-01, Vol. 24 (1), p. e13801-n/a
Main authors: Puritz, Jonathan B., Guo, Ximing, Hare, Matthew, He, Yan, Hillier, LaDeana W., Jin, Shubo, Liu, Ming, Lotterhos, Katie E., Minx, Pat, Modak, Tejashree, Proestou, Dina, Rice, Edward S., Tomlinson, Chad, Warren, Wesley C., Witkop, Erin, Zhao, Honggang, Gomez‐Chiarri, Marta
Format: Article
Language: English
Description
Summary: Genome assembly can be challenging for species that are characterized by high amounts of polymorphism, heterozygosity, and large effective population sizes. High levels of heterozygosity can result in genome mis‐assemblies and a larger than expected genome size due to the haplotig versions of a single locus being assembled as separate loci. Here, we describe the first chromosome‐level genome for the eastern oyster, Crassostrea virginica. Publicly released and annotated in 2017, the assembly has a scaffold N50 of 54 Mb and is over 97.3% complete based on BUSCO analysis. The genome assembly for the eastern oyster is a critical resource for foundational research into molluscan adaptation to a changing environment and for selective breeding for the aquaculture industry. Subsequent resequencing data suggested the presence of haplotigs in the original assembly, and we developed a post hoc method to break up chimeric contigs and mask haplotigs in published heterozygous genomes and evaluated improvements to the accuracy of downstream analysis. Masking haplotigs had a large impact on SNP discovery and estimates of nucleotide diversity and had more subtle and nuanced effects on estimates of heterozygosity, population structure analysis, and outlier detection. We show that haplotig masking can be a powerful tool for improving genomic inference, and we present an open, reproducible resource for the masking of haplotigs in any published genome.
ISSN: 1755-098X, 1755-0998
DOI: 10.1111/1755-0998.13801
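
Note on the scaffold N50 quoted in the summary: N50 is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The sketch below is only an illustration of that definition, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, lengths in arbitrary units.
toy_lengths = [10, 20, 30, 40]
print(n50(toy_lengths))
```

Because N50 depends on the total assembled length, haplotigs assembled as separate loci change the denominator, which is one reason the statistic is best read alongside the assembly size.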