A machine learning-based treatment prediction model using whole genome variants of hepatitis C virus
Published in: | PloS one 2020-11, Vol.15 (11), p.e0242028-e0242028 |
---|---|
Main authors: | , , , , , , , , , |
Format: | Article |
Language: | eng |
Abstract: | In recent years, the development of diagnostics using artificial intelligence (AI) has been remarkable. AI algorithms can go beyond human reasoning and build diagnostic models from a large number of complex combinations. Using next-generation sequencing technology, we identified hepatitis C virus (HCV) variants resistant to direct-acting antivirals (DAAs) by sequencing full-length HCV genomes, and applied these variants to various machine-learning algorithms to evaluate a preliminary predictive model. HCV genomic RNA was extracted from the serum of 173 patients (109 with subsequent sustained virological response [SVR] and 64 without) before DAA treatment. HCV genomes from the 109 SVR and 64 non-SVR patients were randomly divided into a training data set (57 SVR and 29 non-SVR) and a validation data set (52 SVR and 35 non-SVR). The training data set was subjected to nine machine-learning algorithms to identify the optimal combination of functional variants in relation to SVR status following DAA therapy. The resulting prediction model was then tested on the validation data set. The most accurate learning method was the support vector machine (SVM) algorithm (validation accuracy, 0.95; kappa statistic, 0.90; F-value, 0.94). The second-most accurate algorithm was the multi-layer perceptron. The Decision Tree and Naive Bayes algorithms did not fit our data set well (accuracy < 0.8). In conclusion, with an accuracy of 95.4% in the generalization performance evaluation, SVM was identified as the best algorithm. Analytical methods based on genomic analysis and the construction of a predictive model by machine learning may be applicable to the selection of the optimal treatment for other viral infections and cancer. |
---|---|
ISSN: | 1932-6203 |
DOI: | 10.1371/journal.pone.0242028 |
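
The abstract reports three validation metrics for the SVM model: accuracy, kappa statistic, and F-value. As a rough illustration of how such figures relate to a two-class confusion matrix, the sketch below computes all three in plain Python. It assumes that "kappa statistic" means Cohen's kappa and that "F-value" means the F1 score, neither of which the abstract states explicitly. The class `ConfusionMatrix` and the counts in `example` are hypothetical: they respect the validation set sizes given in the abstract (52 SVR, 35 non-SVR) but are not the study's actual confusion matrix, which the abstract does not report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Two-class confusion matrix (hypothetical helper, not from the paper)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    fn: int  # positives predicted as negative
    tn: int  # negatives predicted as negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def accuracy(cm: ConfusionMatrix) -> float:
    """Fraction of all samples that were classified correctly."""
    return (cm.tp + cm.tn) / cm.total


def cohen_kappa(cm: ConfusionMatrix) -> float:
    """Agreement between prediction and truth, corrected for chance."""
    n = cm.total
    observed = (cm.tp + cm.tn) / n
    # Chance agreement from the marginal totals of each class.
    expected = (
        (cm.tp + cm.fp) * (cm.tp + cm.fn)
        + (cm.tn + cm.fn) * (cm.tn + cm.fp)
    ) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present and predicted
    return (observed - expected) / (1.0 - expected)


def f1_score(cm: ConfusionMatrix) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    denominator = 2 * cm.tp + cm.fp + cm.fn
    if denominator == 0:
        return 0.0  # no positives in truth or prediction
    return 2 * cm.tp / denominator


# Hypothetical counts, treating non-SVR as the positive class:
# 35 non-SVR samples (33 correct, 2 missed), 52 SVR samples (50 correct, 2 wrong).
example = ConfusionMatrix(tp=33, fp=2, fn=2, tn=50)

if __name__ == "__main__":
    print(f"accuracy: {accuracy(example):.3f}")
    print(f"kappa:    {cohen_kappa(example):.3f}")
    print(f"F1:       {f1_score(example):.3f}")
```

Which class is treated as positive changes the F1 score but not accuracy or kappa, so a reported F-value is only interpretable once the positive class is known.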