Employing Deep Reinforcement Learning to Maximize Lower Limb Blood Flow Using Intermittent Pneumatic Compression

Bibliographic details
Published in: IEEE journal of biomedical and health informatics 2024-10, Vol. 28 (10), p. 6193-6200
Main authors: Santelices, Iara B.; Landry, Cederick; Arami, Arash; Peterson, Sean D.
Format: Article
Language: English
Description
Summary: Intermittent pneumatic compression (IPC) systems apply external pressure to the lower limbs and enhance peripheral blood flow. We previously introduced a cardiac-gated compression system that enhanced arterial blood velocity (BV) in the lower limb compared to fixed compression timing (CT) for seated and standing subjects. However, these pilot studies found that the CT that maximized BV was not constant across individuals and could change over time. Current CT modelling methods for IPC are limited to predictions for a single day and one heartbeat ahead. IPC therapy, however, may span weeks or longer, the BV response to compression can vary with physiological state, and the best CT for eliciting the desired physiological outcome may change, even for the same individual. We propose that a deep reinforcement learning (DRL) algorithm can learn and adaptively modify CT to achieve a selected outcome using IPC. Herein, we target maximizing lower limb arterial BV as the desired outcome and build participant-specific simulated lower limb environments for 6 participants. We show that DRL can adaptively learn the CT for IPC that maximizes arterial BV. Compared to previous work, the DRL agent achieves 98 ± 2% of the resultant blood flow and is faster at maximizing BV; the DRL agent can learn an "optimal" policy in 15 ± 2 minutes on average and can adapt on the fly. Given a desired objective, we posit that the proposed DRL agent can be implemented in IPC systems to rapidly learn the (potentially time-varying) "optimal" CT with a human-in-the-loop.
ISSN: 2168-2194
2168-2208
DOI: 10.1109/JBHI.2024.3423698