Multi-target path planning method based on improved SAC algorithm

Bibliographic details
Main authors: HAN HUIYAN, PANG MIN, ZHENG XINYI, SUN FUSHENG
Format: Patent
Languages: Chinese; English
Description
Summary: The invention belongs to the field of reinforcement learning and relates in particular to a multi-target path planning method based on an improved SAC (Soft Actor-Critic) algorithm. Sufficient path experience of the robot reaching each shelf position is collected in advance, and this offline expert experience is read before actual distribution to provide supervised-learning assistance, which improves distribution efficiency. Prioritized experience replay based on a SumTree raises the rate at which effective path experience samples are drawn. A reward calculation mechanism based on the multi-step TD-error takes the rewards of subsequent steps into account. By combining robot navigation with the SAC algorithm from reinforcement learning, the reliance of traditional path planning algorithms on a model is removed, the robot's learning speed and the utilization efficiency of experience samples are improved, and the problems that the optimal
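
The two mechanisms named in the summary, SumTree-based prioritized replay and a multi-step TD target, can be illustrated with a minimal Python sketch. It is not taken from the patent: the class `SumTree`, the function `n_step_return`, and all parameter names are hypothetical, and the sketch assumes that priorities are non-negative numbers (for example, absolute TD-errors) and that the bootstrap value is supplied by the caller.

```python
class SumTree:
    """Array-backed binary tree; each internal node stores the sum of its children."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Internal nodes occupy indices 0 .. capacity-2, leaves the rest.
        self.tree = [0.0] * (2 * capacity - 1)
        self.data = [None] * capacity
        self.write = 0  # next leaf slot to overwrite (ring buffer)

    def total(self):
        """Sum of all priorities, stored at the root."""
        return self.tree[0]

    def update(self, leaf, priority):
        """Set the priority of one leaf and propagate the change to the root."""
        idx = leaf + self.capacity - 1
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx > 0:
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def add(self, priority, sample):
        """Store a sample with its priority, overwriting the oldest slot when full."""
        self.data[self.write] = sample
        self.update(self.write, priority)
        self.write = (self.write + 1) % self.capacity

    def get(self, s):
        """Return (leaf, priority, sample) for a value s in [0, total())."""
        idx = 0
        while idx < self.capacity - 1:  # descend until a leaf is reached
            left = 2 * idx + 1
            if s < self.tree[left]:
                idx = left
            else:
                s -= self.tree[left]
                idx = left + 1
        leaf = idx - (self.capacity - 1)
        return leaf, self.tree[idx], self.data[leaf]


def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n rewards plus gamma**n times a bootstrap value."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

Drawing `s` uniformly from `[0, tree.total())` and calling `tree.get(s)` selects each stored sample with probability proportional to its priority, which is how high-priority path samples come to be replayed more often. A multi-step TD-error would then be `n_step_return(...)` minus the current value estimate; the entropy term that SAC adds to its target is left out of this sketch.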