Twelve quick steps for genome assembly and annotation in the classroom

Bibliographic details
Published in:	PLoS computational biology 2020-11, Vol.16 (11), p.e1008325-e1008325
Main authors:	Jung, Hyungtaek; Ventura, Tomer; Chung, J Sook; Kim, Woo-Jin; Nam, Bo-Hye; Kong, Hee Jeong; Kim, Young-Ok; Jeon, Min-Seung; Eyun, Seong-Il
Format:	Article
Language:	English
Subjects:	Agricultural economics; Animals; Annotations; Aquaculture; Assembly; Biology and Life Sciences; Cereal crops; Chromosomes; Classroom techniques; Computational Biology; Consortia; Corporate sponsorship; Cost analysis; Costs; Data storage; Deoxyribonucleic acid; DNA; DNA methylation; DNA sequencing; Education; Educational aspects; Educational technology; Engineering and Technology; Eukaryotes; Gene Library; Genetic aspects; Genome; Genome-wide association studies; Genomes; Genomics; Genomics - education; Genomics - methods; Genomics - statistics & numerical data; High-Throughput Nucleotide Sequencing - methods; High-Throughput Nucleotide Sequencing - statistics & numerical data; Histones; Humans; Identification and classification; Literacy; Livestock; Maintenance; Molecular Sequence Annotation - methods; Molecular Sequence Annotation - statistics & numerical data; Organisms; Research and Analysis Methods; RNA-Seq - methods; RNA-Seq - statistics & numerical data; Salmon; Sequence Analysis, DNA - methods; Sequence Analysis, DNA - statistics & numerical data; Sheep; Skills; Software; Study and teaching; Tilapia; Tomatoes
Online access:	Full text

Description
Abstract:	Eukaryotic genome sequencing and de novo assembly, once the exclusive domain of well-funded international consortia, have become increasingly affordable, thus fitting the budgets of individual research groups. Third-generation long-read DNA sequencing technologies are increasingly used, providing extensive genomic toolkits that were once reserved for a few select model organisms. Generating high-quality genome assemblies and annotations for many aquatic species still presents significant challenges due to their large genome sizes, complexity, and high chromosome numbers. Indeed, selecting the most appropriate sequencing and software platforms and annotation pipelines for a new genome project can be daunting because tools often only work in limited contexts. In genomics, generating a high-quality genome assembly/annotation has become an indispensable tool for better understanding the biology of any species. Herein, we state 12 steps to help researchers get started in genome projects by presenting guidelines that are broadly applicable (to any species), sustainable over time, and cover all aspects of genome assembly and annotation projects from start to finish. We review some commonly used approaches, including practical methods to extract high-quality DNA and choices for the best sequencing platforms and library preparations. In addition, we discuss the range of potential bioinformatics pipelines, including structural and functional annotations (e.g., transposable elements and repetitive sequences). This paper also includes information on how to build a wide community for a genome project, the importance of data management, and how to make the data and results Findable, Accessible, Interoperable, and Reusable (FAIR) by submitting them to a public repository and sharing them with the research community.
ISSN:	1553-734X; 1553-7358
DOI:	10.1371/journal.pcbi.1008325