On the Generalization of Sleep Apnea Detection Methods Based on Heart Rate Variability and Machine Learning

Bibliographic details
Main authors: Padovano, Daniele, Martínez-Rodrigo, Arturo, Pastor, José M, Rieta, J J, Alcaraz, Raul
Format: Article
Language: English
Description
Abstract: Obstructive sleep apnea (OSA) is a respiratory disorder highly correlated with severe cardiovascular diseases. It has attracted the interest of hundreds of experts aiming to overcome the demanding requirements of polysomnography, the gold standard for its detection. In this regard, a variety of algorithms based on heart rate variability (HRV) features and machine learning (ML) classifiers have recently been proposed for epoch-wise OSA detection from the surface electrocardiogram signal. Many researchers have employed freely available databases to assess their methods in a reproducible way, but most methods were tested only with cross-validation, and some used a single database for both training and testing. Hence, although some of these methods have reported promising diagnostic accuracy, those values are suspected to be overestimated. The present work analyzes the actual generalization ability of several epoch-wise OSA detectors obtained through a common ML pipeline and typical HRV features. Specifically, the performance of the generated OSA detectors has been compared under two validation approaches: the widely used epoch-wise k-fold cross-validation and the highly recommended external validation, both considering different combinations of well-known public databases. Regardless of the ML classifiers used and the HRV-based features selected, the external validation results were 20 to 40% lower than those obtained with cross-validation in terms of accuracy, sensitivity, and specificity. These results suggest that ML-based OSA detectors trained with public databases are still not sufficiently general to be employed in clinical practice. They also suggest that larger, more representative public datasets are necessary to improve generalization ability, and that external validation is necessary to obtain a reliable assessment of the true predictive power of these algorithms.
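The difference between the two validation approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data are synthetic, the single feature is a hypothetical stand-in for an HRV feature, the nearest-centroid rule is a deliberately simple placeholder classifier, and the shift between the two "databases" (the `offset` parameter) is an assumption chosen only to show how cross-validation on one database can look better than testing on another.

```python
import random
from statistics import mean


def make_database(n_per_class, offset, rng):
    """Synthetic epochs: one hypothetical HRV-like feature per epoch, label 1 = apnea.

    `offset` shifts the whole database and stands in for differences in
    recording conditions or patient population (an assumption for illustration).
    """
    data = []
    for label, centre in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append((rng.gauss(centre + offset, 0.5), label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Placeholder classifier: store the mean feature value of each class."""
    return {c: mean(x for x, y in train if y == c) for c in (0, 1)}


def predict(centroids, x):
    """Assign the class whose centroid is closest to the feature value."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def accuracy(centroids, test):
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in test)


def kfold_accuracy(data, k=5):
    """Epoch-wise k-fold cross-validation within a single database."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(fit_centroids(train), folds[i]))
    return mean(scores)


rng = random.Random(0)
db_a = make_database(200, offset=0.0, rng=rng)  # database used for training
db_b = make_database(200, offset=1.5, rng=rng)  # unseen, shifted database

cv_acc = kfold_accuracy(db_a, k=5)               # train and test on A only
ext_acc = accuracy(fit_centroids(db_a), db_b)    # train on A, test on B

print(f"cross-validation accuracy (database A only): {cv_acc:.2f}")
print(f"external validation accuracy (train A, test B): {ext_acc:.2f}")
```

In this toy setting the cross-validation score reflects only how well the classifier separates epochs drawn from the same database, while the external score also exposes the shift between databases, which is the kind of gap the abstract reports.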
This research has received financial support from public grants PID2021-00X128525-IV0 and PID2021-123804OB-I00 of the Spanish Government 10.13039/501100011033 jointly with the European Regional Development Fund, SBPLY/17/180501/000411 and SBPLY/21/180501/000186 from Junta de Comunidades de Castilla-La Mancha, and AICO/2021/286 from Generalitat Valenciana. Moreover, Daniele Padovano holds a predoctoral scholarship 2022-PRED-20642, which is cofinanced by the operating program of European Social Fund (ESF) 2014-2020 of