Machine Learning Potential for Identifying and Forecasting Complex Environmental Drivers of Vibrio vulnificus Infections in the United States

Bibliographic details
Published in: Environmental health perspectives 2025-01, Vol.133 (1), p.17006
Main authors: Campbell, Amy Marie, Cabrera-Gumbau, Jordi Manuel, Trinanes, Joaquin, Baker-Austin, Craig, Martinez-Urtaza, Jaime
Format: Article
Language: English
Online access: Full text
Description
Abstract: Environmental change in coastal areas can drive marine bacteria and resulting infections, such as those caused by Vibrio vulnificus, with both foodborne and nonfoodborne exposure routes and high mortality. Although ecological drivers of V. vulnificus in the environment have been well-characterized, fewer models have been able to apply this to human infection risk due to limited surveillance. The Cholera and Other Vibrio Illness Surveillance (COVIS) system database has reported infections in the United States since 1988, offering a unique opportunity to both explore the forecasting capabilities machine learning could provide and to characterize complex environmental drivers of infections. Machine learning models, in the form of random forest classification models, were trained and refined using the epidemiological data from 2008 to 2018, six environmental variables (sea surface temperature, salinity, chlorophyll concentration, sea level, land surface temperature, and runoff rate) and categorical encoders to assess our predictive potential to forecast infections based on environmental data. The highest-performing model, which used balanced classes, had an Area Under the Curve score of 0.984 and a sensitivity of 0.971, highlighting the potential of machine learning to anticipate areas and periods of risk. A higher false positive rate was found when the model was applied to real-world imbalanced surveillance data, which is pertinent amid modeled underreporting and misdiagnosis ratios of infections. Further models were also developed to explore multilevel spatial resolution, finding state-specific models can improve specificity and early warning system potential by exclusively using lagged environmental data. The machine learning approach was able to characterize nonlinear and interacting environmental associations driving infections.
This study accentuates the potential of machine learning and robust surveillance for forecasting environmentally associated marine infections, providing future directions for improvements, further application, and operationalization. https://doi.org/10.1289/EHP15593.
ISSN:0091-6765
1552-9924
DOI:10.1289/EHP15593
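
Note on class imbalance: the abstract contrasts performance on balanced classes with behavior on real-world imbalanced surveillance data. The short Python sketch below illustrates one reason this matters, using plain arithmetic rather than the authors' models. The sensitivity of 0.971 is taken from the abstract; the specificity of 0.95 and the case counts are hypothetical values chosen only for illustration, and the function names are not from the study. It shows how, for fixed sensitivity and specificity, the number of false alarms and the share of flagged records that are true infections change once negatives greatly outnumber positives.

```python
def confusion_counts(n_pos, n_neg, sensitivity, specificity):
    """Expected confusion-matrix counts for a classifier with fixed rates."""
    tp = sensitivity * n_pos      # infections correctly flagged
    fn = n_pos - tp               # infections missed
    tn = specificity * n_neg      # non-infections correctly cleared
    fp = n_neg - tn               # false alarms
    return tp, fp, tn, fn


def precision(tp, fp):
    """Share of flagged records that are true infections."""
    return tp / (tp + fp)


SENSITIVITY = 0.971   # reported in the abstract
SPECIFICITY = 0.95    # hypothetical value, NOT reported in the abstract

# Balanced evaluation set: as many negatives as positives (hypothetical counts).
tp_bal, fp_bal, _, _ = confusion_counts(1000, 1000, SENSITIVITY, SPECIFICITY)

# Imbalanced set: 99 negatives for every positive (hypothetical ratio).
tp_imb, fp_imb, _, _ = confusion_counts(1000, 99000, SENSITIVITY, SPECIFICITY)

precision_balanced = precision(tp_bal, fp_bal)
precision_imbalanced = precision(tp_imb, fp_imb)

print(f"balanced:   false alarms={fp_bal:.0f}, precision={precision_balanced:.3f}")
print(f"imbalanced: false alarms={fp_imb:.0f}, precision={precision_imbalanced:.3f}")
```

Under these assumed numbers the same classifier produces far more false alarms per true case on the imbalanced set, which is why a score obtained on balanced classes may not carry over directly to surveillance data.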