Classification of pathogenic microbes using a minimal set of single nucleotide polymorphisms derived from whole genome sequences
Published in: | Genomics (San Diego, Calif.), 2019-03, Vol.111 (2), p.205-211 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Intra-species genomic variation plays an important, context-specific role in the phenotypic diversity observed among pathogenic microbes. Efficient classification of these pathogens is important for the diagnosis and treatment of several infectious diseases. NGS technologies have provided access to a wealth of data that can be used to discover important markers for pathogen classification. In this paper, we describe three different approaches (Jensen-Shannon divergence, random forest and Shewhart control chart) for identifying a minimal set of SNPs that can be used to classify organisms. These methods are generic and can be applied to the analysis of any organism. We show the usefulness of these approaches for the analysis of Mycobacterium tuberculosis and Escherichia coli isolates. We identified a minimal set of 18 SNPs that can be used as molecular markers for phylogroup-based classification and 8 SNPs for pathogroup-based classification of E. coli.
• Classification of pathogenic microbes based on their phylogroup or pathogroup
• Minimal set of SNPs for the classification
• Use of random forest, JSD and Shewhart control chart to identify minimal set from WGS data
• Minimal set of 18 SNPs for phylogroup based classification of Escherichia coli |
---|---|
ISSN: | 0888-7543, 1089-8646 |
DOI: | 10.1016/j.ygeno.2018.02.004 |
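
The summary names Jensen-Shannon divergence (JSD) as one of three ways to pick discriminating SNPs. The record does not describe the authors' actual procedure, so the following is only a minimal sketch of the general idea: score each SNP position by the JSD between the allele-frequency distributions of two groups of isolates, then rank positions by that score. The positions, allele calls, and the helper names `allele_distribution` and `rank_snps` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter

ALLELES = "ACGT"


def allele_distribution(calls):
    """Allele frequencies (A, C, G, T) for one SNP position within one group."""
    counts = Counter(calls)
    total = sum(counts[a] for a in ALLELES)
    return [counts[a] / total for a in ALLELES]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD in bits: 0 for identical distributions, 1 for disjoint ones."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_snps(group_a, group_b):
    """Rank SNP positions by how strongly their alleles separate the two groups."""
    scores = {
        pos: jensen_shannon_divergence(
            allele_distribution(group_a[pos]),
            allele_distribution(group_b[pos]),
        )
        for pos in group_a
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: position -> allele calls, one character per isolate.
group_a = {101: "AAAA", 202: "CCCC", 303: "AAAT"}
group_b = {101: "GGGG", 202: "CCCC", 303: "ATTT"}

ranked = rank_snps(group_a, group_b)
for position, score in ranked:
    print(position, round(score, 3))
```

In this toy setup, a position whose alleles differ completely between the groups scores highest and a position with identical alleles scores zero. A minimal marker set could then be taken from the top of the ranking, though the cutoff and any multi-class extension are assumptions not covered by this record.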