Rover-IRL: Inverse Reinforcement Learning With Soft Value Iteration Networks for Planetary Rover Path Planning
Published in: | IEEE Robotics and Automation Letters 2019-04, Vol.4 (2), p.1387-1394 |
---|---|
Format: | Article |
Language: | English |
Abstract: | Planetary rovers, such as those currently on Mars, face difficult path planning problems, both before landing during the mission planning stages as well as once on the ground. In this work, we present a new approach to these planning problems based on inverse reinforcement learning using deep convolutional networks and value iteration networks (VIN) as important internal structures. VIN are an approximation of the value iteration (VI) algorithm implemented with convolutional neural networks to make VI fully differentiable. We propose a modification to the value iteration recurrence, referred to as the soft value iteration network (SVIN). SVIN is designed to produce more effective training gradients through the VIN. It relies on an internal soft policy model, where the policy is represented with a probability distribution over all possible actions, rather than a deterministic policy that returns only the best action. We demonstrate the effectiveness of our proposed architecture in both a grid world dataset as well as a highly realistic synthetic dataset generated from currently deployed rover mission planning tools and real Mars imagery. |
---|---|
ISSN: | 2377-3766 |
DOI: | 10.1109/LRA.2019.2895892 |
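
The abstract contrasts a deterministic backup (keep only the best action) with a soft policy that spreads probability over all actions. The sketch below illustrates that contrast on a small grid using plain NumPy. It is not the paper's network: the abstract does not give the SVIN recurrence, so the Boltzmann (softmax) policy, the `temperature` parameter, the four-connected `ACTIONS` set, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 4-connected moves (up, down, left, right); not taken from the paper.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def q_values(reward, value, gamma):
    """Q[a, i, j] = reward[i, j] + gamma * value of the cell reached by action a.

    Moves that would leave the grid keep the rover in place (edge padding).
    """
    h, w = reward.shape
    padded = np.pad(value, 1, mode="edge")
    q = np.empty((len(ACTIONS), h, w))
    for a, (di, dj) in enumerate(ACTIONS):
        q[a] = reward + gamma * padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    return q


def value_iteration(reward, gamma=0.9, iters=50, soft=False, temperature=1.0):
    """Run a fixed number of value iteration sweeps over a 2D reward map.

    soft=False: standard backup, V = max_a Q (deterministic best action).
    soft=True:  assumed soft backup, V = sum_a pi(a) * Q(a), where pi is a
                softmax over Q. This is an illustrative choice, not the
                recurrence published in the paper.
    """
    value = np.zeros_like(reward, dtype=float)
    for _ in range(iters):
        q = q_values(reward, value, gamma)
        if soft:
            # Subtract the per-cell maximum before exponentiating for stability.
            z = (q - q.max(axis=0, keepdims=True)) / temperature
            policy = np.exp(z)
            policy /= policy.sum(axis=0, keepdims=True)
            value = (policy * q).sum(axis=0)
        else:
            value = q.max(axis=0)
    return value


# Toy reward map: a single rewarding goal cell in the bottom-right corner.
reward = np.zeros((5, 5))
reward[4, 4] = 1.0
hard_value = value_iteration(reward, soft=False)
soft_value = value_iteration(reward, soft=True, temperature=1.0)
```

In this sketch every action contributes to the soft value with a nonzero weight, whereas the hard maximum passes information through a single action per cell. That is one plausible reading of why a soft policy could yield more useful training gradients, offered here as an interpretation rather than a claim from the record.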