Data from: Improved genome assembly of the whiteleg shrimp Penaeus (Litopenaeus) vannamei using long- and short-read sequences from public databases

Bibliographic details
Main authors:	Perez-Enriquez, Ricardo; Juárez, Oscar; Galindo-Torres, Pavel; Vargas-Aguilar, Ana; Llera-Herrera, Raúl
Format:	Dataset
Language:	English
Subjects:	FOS: Other agricultural sciences; genome mapping; Genome-wide association studies; PacBio long-reads; Sequence assembly tools; Shrimp aquaculture; SNP markers

Description
Summary:	A genome assembly contains the complete DNA sequence of a particular organism. This information is necessary to understand the organism's gene functions and the genetic variability of its populations. In this study, the genome of the Pacific whiteleg shrimp Penaeus (Litopenaeus) vannamei was assembled using data from GenBank, the publicly accessible DNA sequence repository of the U.S. National Institutes of Health. The three tables and two figures contain supplementary information for the article JOH-2023-155.R1. The information is relevant for the analysis of the new reference-guided genome assembly of the whiteleg shrimp. Supplementary Table 1 compares observed to expected chromosome sizes. The locations of the genetic markers in Supplementary Table 2 will be particularly relevant for future genome-wide association studies, which will look for associations between markers and/or genes and traits of interest for aquaculture, such as disease resistance, growth, or fecundity. Supplementary Table 3 shows that many markers tend to align to several parts of the genome, indicating the large number of repeated regions in the shrimp's genome. Supplementary Figure 1 shows the results of genome size estimation based on counting k-mers (substrings of length k contained within a DNA sequence). Supplementary Figure 2 depicts the linear correlation between the observed and expected lengths of the assembled chromosomes. The Supplementary Materials 1 file contains the Perl script used to extract from the raw data the mitochondrial DNA sequences, which are not needed for the genome assembly and may interfere with it.
DOI:	10.5061/dryad.0k6djhb7n
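
The k-mer counting mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and the record does not state which tool produced Supplementary Figure 1. The function names `count_kmers` and `estimate_genome_size`, the toy sequence, and the simple "total k-mers divided by peak depth" estimate are all assumptions made for illustration. Dedicated tools additionally separate the low-frequency peak caused by sequencing errors and model heterozygosity and repeats, which this sketch ignores.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every substring of length k in a DNA sequence (hypothetical helper)."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        # Skip k-mers containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def estimate_genome_size(reads, k):
    """Rough estimate: total k-mer occurrences divided by the peak k-mer depth.

    Assumption: the most common multiplicity in the k-mer histogram reflects
    the sequencing depth. Real data would first need error k-mers filtered out.
    """
    counts = Counter()
    for read in reads:
        counts.update(count_kmers(read, k))
    # Histogram: multiplicity -> number of distinct k-mers seen that many times.
    histogram = Counter(counts.values())
    peak_depth = max(histogram, key=histogram.get)
    return sum(counts.values()) / peak_depth


# Toy example: a 14-base "genome" sequenced five times without errors.
toy_genome = "ACGTTGCAAGGCTA"
toy_reads = [toy_genome] * 5
print(count_kmers("ACGTACGT", 4))
print(estimate_genome_size(toy_reads, 4))
```

On the toy input the estimate equals the number of distinct 4-mers in the sequence, which is slightly smaller than its length because a sequence of length n contains only n - k + 1 k-mers.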