Real-Time Model-Free Deep Reinforcement Learning for Force Control of a Series Elastic Actuator
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Many state-of-the-art robotic applications use series elastic actuators
(SEAs) with closed-loop force control to achieve complex tasks such as walking,
lifting, and manipulation. Model-free PID control methods are more prone to
instability caused by nonlinearities in the SEA, whereas cascaded model-based
robust controllers can remove these effects to achieve stable force control.
However, these model-based methods require detailed investigations to
characterize the system accurately. Deep reinforcement learning (DRL) has proved
to be an effective model-free method for continuous control tasks, but few works
deal with learning on hardware. This paper describes the training process of a
DRL policy on the hardware of an SEA pendulum system for tracking force
trajectories from 0.05 to 0.35 Hz at 50 N amplitude using the Proximal Policy
Optimization (PPO) algorithm. Safety mechanisms are developed and used to
train the policy for 12 hours (overnight) without an operator present within
the full 21-hour training period. Tracking performance is evaluated, showing
an improvement of 25 N in mean absolute error when comparing the first
18 min of training to the full 21 hours for a 50 N amplitude, 0.1 Hz sinusoidal
desired force trajectory. Finally, the DRL policy exhibits better tracking and
stability margins than a model-free PID controller for a 50 N chirp
force trajectory. |
---|---|
DOI: | 10.48550/arxiv.2304.04911 |
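
The summary refers to a 50 N, 0.1 Hz sinusoidal reference, a 50 N chirp over 0.05 to 0.35 Hz, and mean absolute error as the tracking metric. The following is a minimal sketch of how such reference trajectories and the metric might be computed. It is not the authors' code: the function names, the linear sweep shape, the 10 ms sample period, and the synthetic 2 N offset standing in for a measured force are all assumptions made for illustration.

```python
import math


def sinusoid_force(t, amplitude=50.0, freq_hz=0.1):
    """Desired force in N for a fixed-frequency sinusoid at time t (s)."""
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t)


def chirp_force(t, duration, amplitude=50.0, f0=0.05, f1=0.35):
    """Desired force in N for a chirp sweeping f0 -> f1 Hz over `duration` s.

    A linear frequency sweep is assumed here; the summary does not state
    the sweep shape.
    """
    sweep_rate = (f1 - f0) / duration  # Hz per second
    phase = 2.0 * math.pi * (f0 * t + 0.5 * sweep_rate * t * t)
    return amplitude * math.sin(phase)


def mean_absolute_error(desired, measured):
    """Mean of |desired - measured| over two equal-length sequences."""
    if len(desired) != len(measured) or not desired:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(d - m) for d, m in zip(desired, measured)) / len(desired)


# Hypothetical evaluation: one 10 s period of the 0.1 Hz reference,
# sampled every 10 ms (assumed rate).
dt = 0.01
times = [i * dt for i in range(1001)]
desired = [sinusoid_force(t) for t in times]

# Synthetic stand-in for a measured force: the reference plus a 2 N offset.
measured = [d + 2.0 for d in desired]

mae = mean_absolute_error(desired, measured)
print(f"MAE: {mae:.3f} N")
```

With real data, `measured` would be replaced by logged force readings sampled at the same instants as `desired`.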