Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs

Bibliographic details
Published in: Computers in biology and medicine 2021-05, Vol.132, p.104335-104335, Article 104335
Main authors: Alves, Marcos Antonio, Castro, Giulia Zanon, Oliveira, Bruno Alberto Soares, Ferreira, Leonardo Augusto, Ramírez, Jaime Arturo, Silva, Rodrigo, Guimarães, Frederico Gadelha
Format: Article
Language: English
Description
Summary: The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques to deal with COVID-19 screening in routine blood tests. We tested different ML classifiers on a public dataset from the Hospital Albert Einstein, São Paulo, Brazil. After cleaning and pre-processing, the dataset contains 608 patients, of whom 84 are positive for COVID-19, as confirmed by RT-PCR. To understand the model decisions, we introduce (i) a Decision Tree Explainer (DTX) for local explanation and (ii) a Criteria Graph to aggregate these explanations and portray a global picture of the results. The Random Forest (RF) classifier achieved the best results (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). By using DTX and the Criteria Graph for cases confirmed by the RF, it was possible to find patterns among individuals that can help clinicians understand the interconnections among the blood parameters, either globally or on a case-by-case basis. The results are in accordance with the literature, and the proposed methodology may be embedded in an electronic health record system.

• A literature review of ML methods applied to COVID-19 screening in routine blood tests.
• Results from different ML techniques, including an ensemble, to support the diagnosis of COVID-19 using usual blood exams.
• A decision tree-based methodology for explaining the model, which can be given to the health teams.
• Individual explanations in a graph that shows the relative importance of each attribute and their interactions.
• Further evidence that simple blood tests might help identify false positive/negative RT-PCR tests.
ISSN: 0010-4825, 1879-0534
DOI: 10.1016/j.compbiomed.2021.104335
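
For readers less familiar with the metrics quoted in the summary, the following is a minimal sketch of how accuracy, sensitivity, specificity, and F1-score are conventionally defined from a binary confusion matrix. The function name `screening_metrics` and the counts are hypothetical and are not taken from the paper. This record does not describe the paper's evaluation protocol (for example, any averaging across classes or folds), so the sketch is not expected to reproduce the reported values. AUROC is omitted because it requires per-patient scores rather than counts.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive patients correctly/incorrectly classified.
    tn/fp: negative patients correctly/incorrectly classified.
    Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only; NOT results from the paper.
metrics = screening_metrics(tp=8, fp=2, tn=88, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In an imbalanced screening setting such as the one described (84 positives among 608 patients), accuracy alone can look high even when many positives are missed, which is why sensitivity and specificity are reported alongside it.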