Analysis of Covid-19 Genome Sequences based on Geo-Locations


Bibliographic details
Published in: Pakistan Journal of Engineering & Technology 2021-12, Vol.4 (4), p.41-45
Main authors: Umar, Aqsa; Mahoto, Naeem Ahemd; Bhatti, Sania; Rathi, Sapna
Format: Article
Language: English
Description
Abstract: The COVID-19 pandemic has become a major worldwide health risk of the 21st century. Examining the genomic sequences of COVID-19 strains is necessary to understand the virus’s behavior, its origin, and how rapidly it mutates. This paper analyzes COVID-19 genome sequences (CGS) from China, Pakistan, and India. It applies sequential pattern mining (SPM), specifically a closed sequential pattern technique, to discover valuable information in COVID-19 genomic sequences. The analysis is performed on three strains of genome sequences. First, the genome sequence data files are transformed into a computer-readable CGS corpus, and the SPM technique is then applied to discover frequent patterns of nucleotides. Second, frequent codons of amino acids are extracted from the three strains. Third, the performance of the proposed approach is evaluated in terms of execution time, number of frequent patterns, and memory consumption. The results suggest that ACA, a codon of the amino acid threonine, with support 1576 in the Pakistan strain, is the most frequent pattern when compared with the other two strains of CGS. Furthermore, the performance evaluation of the CloFAST algorithm (closed sequential pattern mining using sparse and vertical id-lists) shows that when the user-specified minimum threshold is low, the higher number of frequent patterns consumes more time and memory.
ISSN: 2664-2042, 2664-2050
DOI: 10.51846/vol4iss4pp41-45
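
The codon-frequency step described in the abstract can be illustrated with a minimal Python sketch. This is not the CloFAST algorithm and not the authors' implementation: it is plain triplet counting, and it assumes that "support" means the total number of in-frame occurrences of a codon across the corpus, which the record does not confirm. The function names `codon_counts` and `frequent_codons`, and the toy corpus, are hypothetical.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"

def codon_counts(sequence, frame=0):
    """Count in-frame, non-overlapping codons (triplets) in one nucleotide string."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Skip triplets that contain ambiguous bases such as N.
        if all(base in NUCLEOTIDES for base in codon):
            counts[codon] += 1
    return counts

def frequent_codons(corpus, min_support):
    """Return codons whose total count over the corpus is at least min_support.

    Assumption: support = total in-frame occurrences across all sequences.
    """
    total = Counter()
    for sequence in corpus:
        total.update(codon_counts(sequence))
    return {codon: n for codon, n in total.items() if n >= min_support}

# Toy corpus of made-up fragments, not real genome data.
toy_corpus = ["ACAACAGGT", "ACATTTACA", "GGTACANNN"]
print(frequent_codons(toy_corpus, min_support=2))  # {'ACA': 5, 'GGT': 2}
```

Lowering `min_support` in this sketch returns more codons, which mirrors in a small way the abstract's observation that a low threshold yields more frequent patterns and therefore more work.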