Predicting corona virus mutations using deep learning

Bibliographic details
Main authors: Al-Thiabi, Mohammed Kareem; Al-Alwani, Ali J. Dawood
Format: Conference proceedings
Language: English
Online access: Full text
Description
Summary: Because the new coronavirus SARS-CoV-2 (also known as the COVID-19 virus) has spread rapidly worldwide and continues to mutate, large viral outbreaks require early determination of the virus's genetic sequence so that an effective system can be designed for identifying different known and unknown virus variants. Identifying specific variants makes it easier to understand and model viral propagation patterns, which helps to treat and manage outbreaks, create efficient mitigation techniques, and avoid future outbreaks. It also plays a vital role in assessing the efficacy of known vaccines against each variant and in modelling the probability of superinfection. The genomic sequence carries the vast majority of the genetic information pertaining to coronavirus variations and variants. This paper proposes a deep learning system that predicts mutations from complete SARS-CoV-2 genomes using a Convolutional Neural Network (CNN) as an alignment-free method. The k-mer technique is applied to fragment the genome sequences of coronavirus mutants and build a unique vocabulary. The proposed approach can correctly predict and characterize coronavirus strains such as MERS-CoV, Alpha-CoV, SARS-CoV-2, SARS-CoV, Gamma, Beta, and other strains, regardless of missing information and sequencing errors, with an accuracy rate of 99% on the data set.
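The k-mer step mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of k = 3, the id ordering, and the toy sequences are all assumptions made for illustration. It shows how overlapping k-mers might be extracted from a sequence, collected into a vocabulary, and turned into integer ids of the kind a CNN embedding layer could consume.

```python
from collections import Counter


def kmers(sequence, k=3):
    """Return the overlapping k-mers of a nucleotide sequence."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_vocabulary(sequences, k=3):
    """Map each distinct k-mer to an integer id; id 0 is reserved for unknown k-mers."""
    counts = Counter(km for seq in sequences for km in kmers(seq, k))
    # Most frequent k-mers get the smallest ids; ties are broken alphabetically
    # so that the mapping is deterministic.
    ordered = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: idx for idx, km in enumerate(ordered, start=1)}


def encode(sequence, vocabulary, k=3):
    """Convert a sequence into a list of k-mer ids (0 for k-mers not in the vocabulary)."""
    return [vocabulary.get(km, 0) for km in kmers(sequence, k)]


# Hypothetical toy sequences, far shorter than a real genome.
toy_sequences = ["ATGCGA", "ATGCTA"]
vocab = build_vocabulary(toy_sequences, k=3)
encoded = encode("ATGCGA", vocab, k=3)
print(vocab)
print(encoded)
```

Mapping unseen k-mers (for example those containing an ambiguous base such as N) to a reserved id is one simple way to tolerate missing information and sequencing errors; the paper may handle these cases differently.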
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0190461