How the brain can be trained to achieve an intermittent control strategy for stabilizing quiet stance by means of reinforcement learning

Bibliographic Details
Published in:	Biological cybernetics 2024-08, Vol.118 (3-4), p.229-248
Main authors:	Takazawa, Tomoki; Suzuki, Yasuyuki; Nakamura, Akihiro; Matsuo, Risa; Morasso, Pietro; Nomura, Taishin
Format:	Article
Language:	English
Subjects:	Active control; Angular velocity; Ankle; Basal ganglia; Bioinformatics; Biological activity; Biomedical and Life Sciences; Biomedicine; Brain; Brain - physiology; Closed loops; Complex Systems; Computer Appl. in Life Sciences; Control systems; Elastic properties; Feedback; Feedback control; Ganglia; Humans; Learning - physiology; Motor skill learning; Muscle, Skeletal - physiology; Muscles; Neurobiology; Neurosciences; Original; Original Article; Parameter identification; Postural Balance - physiology; Posture; Posture - physiology; Reinforcement, Psychology; Robust control; Stabilization; Switching
Online access:	Full text

Description
Abstract:	The stabilization of human quiet stance is achieved by a combination of the intrinsic elastic properties of the ankle muscles and an active closed-loop activation of those muscles, driven by delayed feedback of the ongoing sway angle and the corresponding angular velocity in the manner of a delayed proportional (P) and derivative (D) feedback controller. It has been shown that the active component of the stabilization process is likely to operate intermittently rather than as a continuous controller: the switching policy is defined in the phase plane, which is divided into dangerous and safe regions separated by appropriate switching boundaries. When the state enters a dangerous region, the delayed PD control is activated; when the state enters a safe region, the control is switched off, leaving the system to evolve freely. Compared with continuous feedback control, the intermittent mechanism is more robust and better able to reproduce the postural sway patterns of healthy people. However, the superior performance of the intermittent control paradigm and its biological plausibility, suggested by experimental evidence of intermittent activation of the ankle muscles, leave open the question of a feasible learning process by which the brain can identify the appropriate state-dependent switching policy and tune the P and D parameters accordingly. In this work, it is shown how such a goal can be achieved with a reinforcement motor learning paradigm, building on the evidence that the basal ganglia are known to play a central role in reinforcement learning for action selection in general and, in particular, were found to be specifically involved in postural stabilization.
ISSN:	1432-0770, 0340-1200
DOI:	10.1007/s00422-024-00993-0
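
Illustrative sketch: the switching policy described in the abstract (delayed PD control switched on in a dangerous region of the phase plane and off in a safe region) can be written as a small simulation of an inverted pendulum. This is only a minimal sketch under stated assumptions and not the authors' model: the linearized pendulum, every numeric value, the function names, and in particular the rule used in `is_dangerous` (the body is moving away from upright) are hypothetical choices made for illustration. The reinforcement learning step, which according to the abstract would identify the switching policy and tune P and D, is not shown.

```python
from collections import deque

# All numeric values are illustrative assumptions, not taken from the article.
M_G_H = 500.0             # gravitational toppling stiffness m*g*h [N*m/rad]
INERTIA = 60.0            # moment of inertia about the ankle [kg*m^2]
K_PASSIVE = 0.8 * M_G_H   # intrinsic ankle stiffness, too weak to stabilize alone
P_GAIN = 0.25 * M_G_H     # proportional gain of the active controller
D_GAIN = 10.0             # derivative gain of the active controller
DELAY = 0.2               # feedback delay [s]
DT = 0.001                # Euler integration step [s]


def is_dangerous(theta, omega):
    """Hypothetical switching rule: dangerous when moving away from upright."""
    return theta * omega > 0.0


def active_torque(theta_delayed, omega_delayed, p=P_GAIN, d=D_GAIN):
    """Delayed PD torque, applied only while the delayed state is dangerous."""
    if is_dangerous(theta_delayed, omega_delayed):
        return -(p * theta_delayed + d * omega_delayed)
    return 0.0  # control off: the system evolves freely


def simulate(theta0=0.01, omega0=0.0, duration=5.0):
    """Integrate the linearized pendulum and return the sway angle over time."""
    n_delay = int(round(DELAY / DT))
    # Ring buffer whose oldest entry is the state n_delay steps in the past.
    history = deque([(theta0, omega0)] * (n_delay + 1), maxlen=n_delay + 1)
    theta, omega = theta0, omega0
    trajectory = [theta]
    for _ in range(int(round(duration / DT))):
        theta_d, omega_d = history[0]
        torque = active_torque(theta_d, omega_d)
        # Net toppling torque minus passive stiffness, plus the active torque.
        alpha = ((M_G_H - K_PASSIVE) * theta + torque) / INERTIA
        omega += alpha * DT
        theta += omega * DT
        history.append((theta, omega))
        trajectory.append(theta)
    return trajectory
```

Calling `simulate()` returns the sway angle at each time step, which can be plotted against the angular velocity to inspect the phase-plane trajectory. Whether the sway stays bounded depends entirely on the assumed parameters and switching rule, so the sketch makes no claim about reproducing the results of the article.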