Multi-Objective Cooperative Path Planning of Unmanned Surface Vehicle Based on Deep Reinforcement Learning

Bibliographic details
Published in: IEEE Internet of Things Journal, 2024-11, p.1-1
Main authors: Xiao, Haipeng; Fu, Lijun; Shang, Chengya; Lin, Yunfeng; Fan, Yaxiang
Format: Article
Language: English
Description
Abstract: In recent years, with the development of ship intelligence and unmanned technology, autonomous path planning for unmanned surface vehicles (USVs) has become increasingly important. However, most existing studies ignore the constraints that a USV's endurance and speed place on a voyage, and consider only course planning at a fixed speed. To this end, we propose a path planning method based on deep reinforcement learning (DRL). First, we establish a USV motion model. Second, we propose a new USV energy consumption model. Existing research typically builds USV energy consumption models by considering the combined effects of currents, winds, and waves, which requires precise ocean data and involves a complex modeling process. Moreover, such models fail to account for the influence of the USV's own speed on energy consumption. In contrast, our model is based on the relationship between speed, marine environment, and propulsion load; it simplifies the modeling process and links USV energy consumption to sailing speed and ocean environmental conditions. Third, the original Soft Actor-Critic (SAC) algorithm uses a multi-layer perceptron (MLP) as the action network. However, convolutional neural networks (CNNs) excel at capturing spatial and local features and, compared with MLPs, have stronger information acquisition capabilities. Therefore, we propose a model (FRCF) that combines fully connected layers with a CNN to replace the MLP as the action network of the SAC agent, aiming to improve the agent's convergence speed and performance. Finally, using SAC with FRCF as the action network (SAC-FRCF), together with the motion model and the new USV energy consumption model, we achieve multi-objective cooperative intelligent path planning for energy, speed, and heading.
Unlike traditional path planning methods that control the USV heading using discrete heading angles, we adjust USV speed and heading through continuous acceleration and angular velocity. We also impose constraints on the USV's speed, heading angle, acceleration, and angular velocity to ensure that its motion complies with kinematic constraints. Experimental results show that SAC-FRCF reduces exploration time by 46.40% and achieves stronger convergence and better path planning results than the SAC algorithm using an MLP as the action network (SAC-MLP). Compared with SAC-MLP that does not consider optimal energy consumption, it reduces energy consumption by 28.3
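
Note: the abstract describes controlling the USV through continuous acceleration and angular velocity, subject to limits on speed, heading angle, acceleration, and angular velocity. The following is a minimal Python sketch of one such constrained update step, not the authors' motion model. The limit values, the names USVState, step, and wrap_angle, the simple point-mass kinematics, and the treatment of the heading constraint as wrapping into [-pi, pi) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class USVState:
    x: float        # position east (m)
    y: float        # position north (m)
    heading: float  # heading angle (rad)
    speed: float    # forward speed (m/s)


# Hypothetical limits, chosen only for illustration (not from the paper).
MIN_SPEED, MAX_SPEED = 0.0, 5.0   # m/s
MAX_ACCEL = 0.5                   # m/s^2
MAX_YAW_RATE = 0.3                # rad/s


def clip(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def wrap_angle(theta: float) -> float:
    """Wrap an angle into [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def step(state: USVState, accel: float, yaw_rate: float, dt: float = 1.0) -> USVState:
    """Apply one continuous action (acceleration, angular velocity) with limits."""
    # Clip the raw action to the allowed acceleration and angular velocity.
    accel = clip(accel, -MAX_ACCEL, MAX_ACCEL)
    yaw_rate = clip(yaw_rate, -MAX_YAW_RATE, MAX_YAW_RATE)
    # Integrate speed and heading, then enforce the state limits.
    speed = clip(state.speed + accel * dt, MIN_SPEED, MAX_SPEED)
    heading = wrap_angle(state.heading + yaw_rate * dt)
    # Advance the position along the new heading.
    x = state.x + speed * math.cos(heading) * dt
    y = state.y + speed * math.sin(heading) * dt
    return USVState(x, y, heading, speed)
```

In a setup like this, a continuous-action agent such as SAC would output the (acceleration, angular velocity) pair, and clipping inside the step keeps every resulting state within the assumed kinematic limits regardless of the raw action.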
ISSN: 2327-4662
DOI: 10.1109/JIOT.2024.3509521