Kalman Filter-Based One-Shot Sim-to-Real Transfer Learning

Bibliographic details
Published in:	IEEE Robotics and Automation Letters, 2024-01, Vol. 9 (1), pp. 311-318
Main authors:	Dong, Qingwei; Zeng, Peng; Wan, Guangxi; He, Yunpeng; Dong, Xiaoting
Format:	Article
Language:	English
Subjects:	Adaptation models; Algorithm design and analysis; Algorithms; Constraint modelling; Controllers; Deep reinforcement learning; Heuristic algorithms; Industrial robots; Kalman filter; Kalman filters; Machine learning; Performance degradation; reality gap; Reinforcement learning; Robot dynamics; sim-to-real; Simulation; Simulators; Task analysis; Training; Trajectory; Transfer learning

Description
Abstract:	Deep reinforcement learning algorithms offer a promising method for industrial robots to tackle unstructured and complex scenarios that are difficult to model. However, due to constraints related to equipment lifespan and safety requirements, acquiring a number of samples directly from the physical environment is often infeasible. With the development of increasingly realistic simulators, it has become feasible for industrial robots to acquire complex motion skills within simulated environments. Nonetheless, the "reality gap" frequently results in performance degradation when transferring policies trained in simulators to physical systems. In this letter, we treat the reality gap between a physical environment (target domain) and a simulated environment (source domain) as a Gaussian perturbation and utilize Kalman filtering to reduce the discrepancy between source and target domain data. We refine the source domain controller using target domain data to enhance the controller's adaptability to the target domain. The efficacy of the proposed method is demonstrated in reaching tasks and peg-in-hole tasks conducted on PR2 and UR5 robotic platforms.
ISSN:	2377-3766
DOI:	10.1109/LRA.2023.3333661
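
Illustrative note: the abstract's central idea, modelling the sim-to-real discrepancy as a Gaussian perturbation and filtering it, can be pictured with a generic scalar Kalman filter. The sketch below is not the authors' algorithm. The function name `kalman_filter_1d`, the random-walk state model, the constant offset of 0.5, and all noise variances are assumptions chosen purely for illustration.

```python
import random


def kalman_filter_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Generic scalar Kalman filter (hypothetical helper, not from the letter).

    Assumes a random-walk state model: the hidden value changes only by
    small Gaussian steps with variance `process_var`, and each measurement
    is the hidden value plus Gaussian noise with variance `measurement_var`.
    """
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the estimate is unchanged, but its uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    true_offset = 0.5  # assumed constant sim-to-real discrepancy
    noisy = [true_offset + rng.gauss(0.0, 0.2) for _ in range(200)]
    est = kalman_filter_1d(noisy, process_var=1e-5, measurement_var=0.04)
    print(f"final estimate of the offset: {est[-1]:.3f}")
```

In this toy setting the filter's estimate settles near the assumed offset as more noisy samples arrive; how the letter actually couples filtering with controller refinement is described only in the full text.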