Real-time Model Predictive Control and System Identification Using Differentiable Physics Simulation
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Developing robot controllers in a simulated environment is advantageous, but
transferring the controllers to the target environment presents challenges,
often referred to as the "sim-to-real gap". We present a method for continuous
improvement of modeling and control after deploying the robot to a
dynamically-changing target environment. We develop a differentiable physics
simulation framework that performs online system identification and optimal
control simultaneously, using the incoming observations from the target
environment in real time. To ensure robust system identification against noisy
observations, we devise an algorithm to assess the confidence of our estimated
parameters, using numerical analysis of the dynamic equations. To ensure
real-time optimal control, we adaptively schedule the optimization window in
the future so that the optimized actions can be replenished faster than they
are consumed, while staying as up-to-date with new sensor information as
possible. The constant re-planning based on a constantly improved model allows
the robot to swiftly adapt to the changing environment and utilize real-world
data in the most sample-efficient way. Thanks to a fast differentiable physics
simulator, the optimization for both system identification and control can be
solved efficiently for robots operating in real time. We demonstrate our method
on a set of examples in simulation and show that our results are favorable
compared to baseline methods. |
---|---|
DOI: | 10.48550/arxiv.2202.09834 |
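
The summary gives no implementation details, so the following is only a minimal sketch of the general idea behind gradient-based system identification through a differentiable simulation, not the authors' framework. It assumes a hypothetical one-dimensional point mass with an unknown damping coefficient `c`, and it propagates a hand-derived sensitivity `dv/dc` alongside the state in place of a real differentiable physics simulator. The confidence assessment and adaptive window scheduling mentioned in the summary are not modeled. All names (`rollout`, `identify_damping`) and all constants are illustrative assumptions.

```python
def rollout(c, actions, v0=0.0, mass=1.0, dt=0.01):
    """Simulate velocities of a damped point mass with explicit Euler.

    Also returns the sensitivity dv/dc at every step, obtained by
    differentiating the update rule with respect to c (assumed toy model).
    """
    v, s = v0, 0.0
    velocities, sensitivities = [], []
    for u in actions:
        # Both updates use the values from the previous step.
        v_next = v + dt * (u - c * v) / mass
        s_next = s + dt * (-v - c * s) / mass
        v, s = v_next, s_next
        velocities.append(v)
        sensitivities.append(s)
    return velocities, sensitivities


def identify_damping(observed, actions, c_init=0.1, lr=0.5, steps=300):
    """Fit c by gradient descent on the mean squared velocity error."""
    c = c_init
    n = len(observed)
    for _ in range(steps):
        sim, sens = rollout(c, actions)
        # d/dc of mean((sim - observed)^2), via the chain rule.
        grad = 2.0 / n * sum((v - o) * s for v, o, s in zip(sim, observed, sens))
        c -= lr * grad
    return c


# Hypothetical "target environment": same model with a damping value
# that the estimator does not know in advance.
actions = [1.0] * 200
observed, _ = rollout(0.8, actions)
c_est = identify_damping(observed, actions)
```

In an online setting, a loop of this kind would be re-run as new observations arrive, and the updated parameter would feed the next control optimization; how the paper schedules that interplay is not specified in this record.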