High-quality genome assembly of the silkworm, Bombyx mori

Bibliographic details
Published in: Insect Biochemistry and Molecular Biology, 2019-04, Vol. 107, pp. 53-62
Main authors: Kawamoto, Munetaka; Jouraku, Akiya; Toyoda, Atsushi; Yokoi, Kakeru; Minakuchi, Yohei; Katsuma, Susumu; Fujiyama, Asao; Kiuchi, Takashi; Yamamoto, Kimiko; Shimada, Toru
Format: Article
Language: English
Description
Summary: In 2008, the genome assembly and gene models for the domestic silkworm, Bombyx mori, were published by a Japanese and Chinese collaboration group. However, that genome assembly contains a non-negligible number of misassembled regions and gaps, owing to the many repetitive sequences in the silkworm genome. The erroneous assembly occasionally causes incorrect gene prediction. Here we performed a hybrid assembly based on 140× deep sequencing of long (PacBio) and short (Illumina) reads. The gaps remaining in the initial genome assembly were closed using BAC and Fosmid sequences, giving a new total length of 460.3 Mb with 30 gap regions, a scaffold N50 of 16.8 Mb, and a contig N50 of 12.2 Mb. More RNA-seq and piRNA-seq reads mapped to the new genome assembly than to the previous version, indicating that the new assembly covers more transcribed regions, including repetitive elements. We performed gene prediction on the new genome assembly using available mRNA and protein sequence data. The number of gene models was 16,880, with an N50 of 2154 bp. The new gene models reflected more accurate coding sequences and gene sets than the old ones. The proportion of repetitive elements was also re-estimated using the new genome assembly and was calculated to be 46.8% of the silkworm genome. The new genome assembly and gene models are provided in SilkBase (http://silkbase.ab.a.u-tokyo.ac.jp).

• Whole silkworm genome resequencing was performed using PacBio long-read and Illumina short-read sequencers.
• A high-quality genome assembly (460.3 Mb) containing 696 scaffolds with an N50 size of 16.8 Mb was obtained.
• Gene prediction using the newly assembled genome identified 16,880 genes with an N50 of 2154 bp.
• The new genome assembly and the gene models were evaluated.
• The new genome assembly and the gene models are available in SilkBase (http://silkbase.ab.a.u-tokyo.ac.jp).
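
The summary reports N50 values for scaffolds, contigs, and gene models. For readers unfamiliar with the statistic, the following is a minimal sketch of one common way to compute it: sort the sequence lengths from longest to shortest and return the length at which the running total first reaches half of the combined length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the article, and assembly tools may differ in details such as minimum-length filters.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition (assumed here): the length L such that sequences
    of length >= L together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (not data from the article).
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 40: 50 + 40 = 90 >= 150 / 2
```

A larger N50 means that more of the assembly sits in long sequences, which is why the statistic is commonly used as a summary of assembly contiguity.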
ISSN: 0965-1748, 1879-0240
DOI: 10.1016/j.ibmb.2019.02.002