Comparison of Machine Learning Methods in the Study of Cancer Survivors’ Return to Work: An Example of Breast Cancer Survivors with Work-Related Factors in the CONSTANCES Cohort

Bibliographic details
Published in: Journal of occupational and environmental medicine, 2023-12, Vol. 33 (4), p. 750-756
Main authors: Badreau, Marie; Fadel, Marc; Roquelaure, Yves; Bertin, Mélanie; Rapicault, Clémence; Gilbert, Fabien; Porro, Bertrand; Descatha, Alexis
Format: Article
Language: English
Description
Abstract: Purpose: Machine learning (ML) methods have shown higher accuracy than classical methods (e.g., logistic regression models) in identifying individuals without cancer who were unable to return to work (RTW). We therefore aim to discuss the value of these methods in relation to RTW for cancer survivors. Methods: Breast cancer (BC) survivors who were working at diagnosis within the CONSTANCES cohort were included in the study. RTW was assessed five years after the BC diagnosis (early retirement was considered as non-RTW). Age and occupation at diagnosis, and physical occupational exposures assessed using the job exposure matrix JEM-CONSTANCES, were evaluated as predictors of RTW five years after BC diagnosis. Four ML methods were used: (i) k-nearest neighbors; (ii) random forest; (iii) neural network; and (iv) elastic net. Results: The training sample included 683 BC survivors (RTW: 85.7%), and the test sample 171 (RTW: 85.4%). The elastic net method performed best overall despite low sensitivity (accuracy = 76.6%; sensitivity = 31.7%; specificity = 90.8%), whereas the random forest model was the most accurate (accuracy = 79.5%) but also the least sensitive (sensitivity = 14.3%). Conclusion: This study takes a first step towards opening up new possibilities for identifying the occupational determinants of cancer survivors' RTW. Further work, including a larger sample size and more predictor variables, is now needed.
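The three metrics quoted in the abstract all derive from a single confusion matrix, which helps explain how a model can be fairly accurate yet poorly sensitive. The following is a minimal sketch in Python, not the authors' code. The function name `classification_metrics` and the four counts are hypothetical: the counts were chosen only so that the metrics round to the elastic net figures quoted above, and they are not the study's confusion matrix. Which outcome is treated as the "positive" class is likewise left as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases classified correctly / incorrectly
    tn/fp: negative cases classified correctly / incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of positive cases detected
        "specificity": tn / (tn + fp),      # share of negative cases detected
    }


# Hypothetical counts for a test sample of 171 (NOT taken from the study):
# they are picked so the results round to the elastic net values in the abstract.
metrics = classification_metrics(tp=13, fn=28, tn=118, fp=12)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because the negative class is much larger than the positive class in this sketch, accuracy is dominated by specificity, so a model can miss most positive cases and still classify about three quarters of the sample correctly.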
ISSN: 1053-0487, 1076-2752, 1573-3688
DOI: 10.1007/s10926-023-10112-8