Systematic investigation into generalization of COVID-19 CT deep learning models with Gabor ensemble for lung involvement scoring
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | The COVID-19 pandemic has inspired unprecedented data collection and computer
vision modelling efforts worldwide, focusing on diagnosis and stratification of
COVID-19 from medical images. Despite this large-scale research effort, these
models have found limited practical application due in part to unproven
generalization of these models beyond their source study. This study
investigates the generalizability of key published models using the publicly
available COVID-19 Computed Tomography data through cross dataset validation.
We then assess the predictive ability of these models for COVID-19 severity
using an independent new dataset that is stratified for COVID-19 lung
involvement. Each inter-dataset study is performed using histogram
equalization and contrast-limited adaptive histogram equalization (CLAHE), with and
without a learning Gabor filter. The study shows high variability in the
generalization of models trained on these datasets due to varied sample image
provenances and acquisition processes, amongst other factors. We show that, under
certain conditions, an internally consistent dataset can generalize well to an
external dataset, with F1 scores up to 86%, despite structural differences
between these datasets. Our best-performing model shows high predictive accuracy for
lung involvement score on an independent dataset for which expert-labelled
lung involvement stratification is available. An ensemble of our best
model for disease-positive prediction with our best model for disease-negative
prediction, built using a min-max function, resulted in a superior model for lung
involvement prediction, with average predictive accuracy of 75% for zero lung
involvement and 96% for 75-100% lung involvement, and an almost linear
relationship between these stratifications. |
---|---|
DOI: | 10.48550/arxiv.2105.15094 |
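
The summary names histogram equalization as one of the preprocessing steps applied before each inter-dataset study. As a rough illustration only, the sketch below shows global histogram equalization on an 8-bit grayscale image stored as nested lists. The function name `equalize_histogram` and the toy pixel values are hypothetical and are not taken from the article. The contrast-limited adaptive variant applies the same idea per image tile with clipped histograms and is not shown here.

```python
from typing import List


def equalize_histogram(image: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Global histogram equalization for a grayscale image given as rows of ints.

    Illustrative sketch only; not the preprocessing code used in the article.
    """
    # Count how many pixels take each intensity value.
    hist = [0] * levels
    for row in image:
        for px in row:
            hist[px] += 1

    # Build the cumulative distribution function (CDF) of the intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest intensity present
    if total == cdf_min:
        # Constant image: there is no contrast to stretch, so return a copy.
        return [row[:] for row in image]

    # Look-up table that maps each intensity so the output CDF is roughly linear.
    # Intensities darker than any pixel present are clamped to 0.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[px] for px in row] for row in image]


if __name__ == "__main__":
    # Toy low-contrast patch: all values crowded into the range 100..103.
    toy = [[100, 100, 101], [101, 102, 103]]
    print(equalize_histogram(toy))
```

In this toy patch the darkest value present is mapped to 0 and the brightest to 255, so the narrow input range is stretched across the full 8-bit scale.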