Use of Machine Learning to Predict Onset of NAFLD in an All-Comers Cohort—Development and Validation in 2 Large Asian Cohorts
Published in: | Gastro hep advances 2024, Vol.3 (7), p.1005-1011 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Nonalcoholic fatty liver disease (NAFLD) is one of the most common liver diseases. There are no universally accepted models that accurately predict time to onset of NAFLD. Machine learning (ML) models may allow prediction of such time-to-event (ie, survival) outcomes. This study aims to develop and independently validate ML-derived models to allow personalized prediction of time to onset of NAFLD in individuals who have no NAFLD at baseline.
The development dataset comprised 25,599 individuals from a South Korean NAFLD registry. A random 70:30 split divided it into training and internal validation sets. ML survival models (random survival forest, extra survival trees) were fitted, with time to NAFLD diagnosis in months as the target variable and routine anthropometric and laboratory parameters as predictors. The independent validation dataset comprised 16,173 individuals from a Chinese open dataset. Models were evaluated using the concordance index (c-index) and Brier score on both the internal and independent validation sets.
The datasets (development vs independent validation) had 1,331,107 vs 543,874 person-months of follow-up, NAFLD incidence of 25.7% (6584 individuals) vs 14.4% (2322 individuals), and median time to NAFLD onset of 60 (interquartile range 38–75) vs 24 (interquartile range 13–37) months, respectively. The ML models achieved a c-index above 0.7 in the validation cohort: random survival forest 0.751 (95% confidence interval 0.742–0.759), extra survival trees 0.752 (95% confidence interval 0.744–0.762).
ML models can predict time to onset of NAFLD based on routine patient data. Clinicians can use them to deliver personalized predictions to patients, which may facilitate patient counseling and clinical decisions on the timing of interval imaging. |
---|---|
ISSN: | 2772-5723 |
DOI: | 10.1016/j.gastha.2024.06.007 |
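
The abstract reports model discrimination as a concordance index (c-index). As a reading aid, the following is a minimal sketch of Harrell's c-index for right-censored time-to-event data. It is not the study's code: the function name `harrell_c_index`, its argument names, and the toy data are assumptions made for illustration, and the abstract does not say which c-index variant or software was used.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times       -- observed follow-up time per individual (e.g. months)
    events      -- 1 if NAFLD onset was observed, 0 if censored
    risk_scores -- model output; a higher score means earlier predicted onset
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the individual with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # shorter time ends in an observed event (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier onset received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from either cohort.
toy_times = [12, 24, 36, 48, 60]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.7, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

On this toy input, 6 of the 8 comparable pairs are ranked correctly, so `toy_c` is 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of about 0.75 should be read.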