Data from: Improved genome assembly of the whiteleg shrimp Penaeus (Litopenaeus) vannamei using long- and short-read sequences from public databases
Main authors: | , , , , |
---|---|
Format: | Dataset |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | A genome assembly contains the complete DNA sequence of a particular
organism. This information is necessary to understand the organism's
gene functions and the genetic variability of its populations. In this
study, the genome of the Pacific whiteleg shrimp Penaeus (Litopenaeus)
vannamei was assembled using sequences from GenBank, the DNA sequence
repository of the US National Institutes of Health, which is publicly
accessible worldwide. The three tables and two figures contain
supplementary information for the article JOH-2023-155.R1. The information
is relevant to the analysis of the new reference-guided genome assembly
of the whiteleg shrimp. Supplementary Table 1 compares observed and
expected chromosome sizes. The locations of the genetic markers in
Supplementary Table 2 will be particularly relevant for future genome-wide
association studies, which will look for associations between markers and/or
genes and traits of interest for aquaculture, such as disease resistance,
growth or fecundity. Supplementary Table 3 shows that many markers
tend to align to several parts of the genome, indicating the large number
of repeated regions in the shrimp's genome. Supplementary Figure
1 shows the results of genome size estimation based on counting k-mers
(substrings of length k contained within a DNA sequence). Supplementary
Figure 2 depicts the linear correlation between the observed
and expected lengths of the assembled chromosomes. The Supplementary
Materials 1 file contains the Perl script used to extract mitochondrial
DNA sequences from the raw-data database; these sequences are not needed
for the genome assembly and can interfere with it. |
---|---|
DOI: | 10.5061/dryad.0k6djhb7n |
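
The summary mentions genome size estimation based on counting k-mers. As a rough illustration of that idea only, the Python sketch below counts k-mers in a set of reads and divides the total number of k-mers by the most common k-mer multiplicity (taken as the sequencing depth). It is not the method or code used for this dataset: the function names `count_kmers` and `estimate_genome_size` and the toy reads are hypothetical, and practical tools also handle sequencing errors, reverse complements, heterozygosity and repeats, which this sketch ignores.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(counts):
    """Estimate genome size as total k-mers divided by the peak multiplicity.

    Simplifying assumption: the most frequent multiplicity in the k-mer
    histogram is the sequencing depth. Ties are broken toward the higher
    multiplicity.
    """
    # Histogram: multiplicity -> number of distinct k-mers seen that often.
    histogram = Counter(counts.values())
    peak_depth = max(histogram, key=lambda m: (histogram[m], m))
    total_kmers = sum(counts.values())
    return total_kmers / peak_depth


# Toy data (hypothetical): a 10-base "genome" read in full four times.
toy_reads = ["ACGTTGCAAG"] * 4
toy_counts = count_kmers(toy_reads, k=3)
toy_estimate = estimate_genome_size(toy_counts)
```

On the toy data the estimate equals the number of k-mer positions in the sequence (length minus k plus 1), which approximates the genome length when the genome is much longer than k.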