Path Planning of a Mobile Robot for a Dynamic Indoor Environment Based on an SAC-LSTM Algorithm

This paper proposes an improved Soft Actor-Critic Long Short-Term Memory (SAC-LSTM) algorithm for fast path planning of mobile robots in dynamic environments. To achieve continuous motion and better decision making by incorporating historical and current states, a long short-term memory network (LST...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Sensors (Basel, Switzerland) Switzerland), 2023-12, Vol.23 (24), p.9802
Hauptverfasser: Zhang, Yongchao, Chen, Pengzhan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper proposes an improved Soft Actor-Critic Long Short-Term Memory (SAC-LSTM) algorithm for fast path planning of mobile robots in dynamic environments. To achieve continuous motion and better decision making by incorporating historical and current states, a long short-term memory network (LSTM) with memory was integrated into the SAC algorithm. To mitigate the memory depreciation issue caused by resetting the LSTM's hidden states to zero during training, a burn-in training method was adopted to boost the performance. Moreover, a prioritized experience replay mechanism was implemented to enhance sampling efficiency and speed up convergence. Based on the SAC-LSTM framework, a motion model for the Turtlebot3 mobile robot was established by designing the state space, action space, reward function, and overall planning process. Three simulation experiments were conducted in obstacle-free, static obstacle, and dynamic obstacle environments using the ROS platform and Gazebo9 software. The results were compared with the SAC algorithm. In all scenarios, the SAC-LSTM algorithm demonstrated a faster convergence rate and a higher path planning success rate, registering a significant 10.5 percentage point improvement in the success rate of reaching the target point in the dynamic obstacle environment. Additionally, the time taken for path planning was shorter, and the planned paths were more concise.
ISSN:1424-8220
1424-8220
DOI:10.3390/s23249802