Multidimensional Machine Learning Model to Calculate a COVID-19 Vulnerability Index

Bibliographic Details
Published in: Journal of Personalized Medicine, 2023-07, Vol. 13 (7), p. 1141
Main authors: Rosero Perez, Paula Andrea; Realpe Gonzalez, Juan Sebastián; Salazar-Cabrera, Ricardo; Restrepo, David; López, Diego M; Blobel, Bernd
Format: Article
Language: English
Description
Abstract: In Colombia, the first case of COVID-19 was confirmed on 6 March 2020. By 13 March 2023, Colombia had registered 6,360,780 confirmed positive cases of COVID-19, representing 12.18% of the total population. In 2020, the National Administrative Department of Statistics (DANE) in Colombia published a COVID-19 vulnerability index, which estimates the vulnerability (per city block) to infection with COVID-19. However, DANE did not consider several factors beyond demographic and health data that, according to the related literature, could increase the risk of COVID-19, such as environmental and mobility data. The proposed multidimensional index considers variables of different types (unemployment rate, gross domestic product, citizens' mobility, vaccination data, and climatological and spatial information); the incidence of COVID-19 calculated with it is compared with the incidence calculated with the COVID-19 vulnerability index provided by DANE. The index was constructed following the data collection, data preparation, modeling, and evaluation phases of the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. The multidimensional index was evaluated using multiple machine learning models to calculate the incidence of COVID-19 cases in the main cities of Colombia. The results showed that the best-performing model for predicting the incidence of COVID-19 in Colombia is the Extra Trees Regressor algorithm, which obtained an R-squared of 0.829. This work is a first step toward a multidimensional analysis of COVID-19 risk factors, which has the potential to support decision making in public health programs. The results are also relevant for calculating vulnerability indexes for other viral diseases, such as dengue.
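The abstract reports model quality as an R-squared value (0.829 for the Extra Trees Regressor). As a reading aid, the following minimal sketch shows how R-squared is commonly computed from observed and predicted values. The function name `r_squared` and the sample numbers are hypothetical illustrations, not the authors' code or data.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_observed = sum(observed) / len(observed)

    # Residual sum of squares: error left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variance of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)

    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")

    return 1.0 - ss_res / ss_tot


# Hypothetical incidence values, for illustration only (not from the study).
observed_incidence = [10.0, 20.0, 30.0, 40.0]
predicted_incidence = [12.0, 18.0, 33.0, 37.0]
print(round(r_squared(observed_incidence, predicted_incidence), 3))
```

A value of 1.0 means the predictions match the observations exactly, while 0.0 means the model does no better than always predicting the mean of the observations.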
ISSN: 2075-4426
DOI: 10.3390/jpm13071141