Bundled Gradients Through Contact Via Randomized Smoothing

Detailed description

Bibliographic details
Published in:	IEEE Robotics and Automation Letters, 2022-04, Vol. 7 (2), pp. 4000-4007
Main authors:	Suh, Hyung Ju Terry; Pang, Tao; Tedrake, Russ
Format:	Article
Language:	English
Subjects:	Algorithms; Approximation; Contact modeling; Convergence; Empirical analysis; Fragility; Iterative methods; Manipulation planning; Monte Carlo methods; Optimal control; Optimization; Optimization and optimal control; Planning; Smoothing; Smoothing methods; Stochastic processes

Description
Abstract:	The empirical success of derivative-free methods in reinforcement learning for planning through contact seems at odds with the perceived fragility of classical gradient-based optimization methods in these domains. What is causing this gap, and how might we use the answer to improve gradient-based methods? We believe a stochastic formulation of dynamics is one crucial ingredient. We use tools from randomized smoothing to analyze sampling-based approximations of the gradient, and formalize such approximations through the bundled gradient. We show that using the bundled gradient in lieu of the gradient mitigates the fast-changing gradients of non-smooth contact dynamics modeled by implicit time-stepping or the penalty method. Finally, we apply the bundled gradient to optimal control using iterative MPC, introducing a novel algorithm which improves convergence over using exact gradients. Combining our algorithm with a convex implicit time-stepping formulation of contact, we show that we can tractably tackle planning-through-contact problems in manipulation.
ISSN:	2377-3766
DOI:	10.1109/LRA.2022.3146931
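
Illustrative note: the abstract describes the bundled gradient as a sampling-based approximation of the gradient analyzed with randomized smoothing. The sketch below is one way to picture that idea, not the authors' implementation. It assumes Gaussian perturbations and a toy one-dimensional penalty-style term f(x) = max(x, 0); the names `relu_penalty_grad` and `bundled_gradient`, the noise scale `sigma`, and the sample count are all hypothetical choices made for this example.

```python
import numpy as np


def relu_penalty_grad(x):
    """Exact gradient of the toy term f(x) = max(x, 0): a step from 0 to 1 at x = 0."""
    return (np.asarray(x) > 0).astype(float)


def bundled_gradient(grad_fn, x, sigma=0.1, num_samples=10000, seed=0):
    """Monte Carlo average of exact gradients at Gaussian-perturbed points.

    Assumption for this sketch: perturbations are drawn from N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=num_samples)
    # Average the exact gradient over the perturbed copies of x.
    return float(np.mean(grad_fn(x + noise)))


if __name__ == "__main__":
    for x in (-0.2, -1e-6, 1e-6, 0.2):
        exact = float(relu_penalty_grad(x))
        smoothed = bundled_gradient(relu_penalty_grad, x)
        print(f"x={x:+.6f}  exact={exact:.1f}  bundled={smoothed:.3f}")
```

In this toy case the exact gradient jumps from 0 to 1 across x = 0, while the averaged estimate varies gradually across the same point, which is the kind of fast-changing gradient the abstract says the bundled gradient mitigates.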