A Fast Adaptive AUV Control Policy Based on Progressive Networks with Context Information

Bibliographic Details
Published in: Journal of marine science and engineering 2024-12, Vol. 12 (12), p. 2159
Main authors: Xu, Chunhui, Fang, Tian, Xu, Desheng, Yang, Shilin, Zhang, Qifeng, Li, Shuo
Format: Article
Language: English
Online access: Full text
Description
Abstract: Deep reinforcement learning (DRL) models have the advantage of being able to control nonlinear systems in an end-to-end manner. However, reinforcement learning controllers trained in simulation often perform poorly on real robots and cannot cope with changes in the dynamics of the controlled object. In this paper, we propose a DRL control algorithm that combines progressive networks and context information as a depth-tracking controller for AUVs. First, an embedding network that maps interaction-history sequence data onto latent variables is connected to the input of the policy network; the context generated by this network gives the DRL agent the ability to adapt to the environment online. Then, through a two-stage training mechanism based on progressive neural networks, the model can be rapidly adapted to a new dynamic environment, represented here by generalized force disturbances and changes in the mass of the AUV. The results showed that the proposed algorithm improved the robustness of the controller to environmental disturbances and achieved fast adaptation when the dynamics differed.
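The core architectural idea in the abstract, an embedding network that compresses a window of (state, action, reward) transitions into a latent context vector, which is then concatenated with the current observation at the policy input, can be sketched as follows. This is a minimal numpy illustration with assumed dimensions, random weights, and a simple feedforward encoder; the paper's actual network sizes, recurrent structure, and training procedure are not specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions for illustration only.
STATE_DIM, ACTION_DIM, HIST_LEN, LATENT_DIM = 4, 1, 8, 3
TRANS_DIM = STATE_DIM + ACTION_DIM + 1  # each transition: [state, action, reward]

def encode_context(history, W_enc):
    """Map a window of interaction-history transitions to a latent
    context vector z (hypothetical feedforward encoder)."""
    x = history.reshape(-1)        # flatten the (HIST_LEN, TRANS_DIM) window
    return np.tanh(W_enc @ x)      # latent context z

def policy(state, z, W_pi):
    """Context-conditioned policy: the latent z is concatenated with the
    current state before the (here linear) policy head."""
    inp = np.concatenate([state, z])
    return np.tanh(W_pi @ inp)     # bounded control command

# Random stand-in weights; in the paper these would be learned by DRL.
W_enc = rng.standard_normal((LATENT_DIM, HIST_LEN * TRANS_DIM)) * 0.1
W_pi = rng.standard_normal((ACTION_DIM, STATE_DIM + LATENT_DIM)) * 0.1

history = rng.standard_normal((HIST_LEN, TRANS_DIM))  # recent transitions
state = rng.standard_normal(STATE_DIM)                # current observation

z = encode_context(history, W_enc)
u = policy(state, z, W_pi)
print(u.shape)  # (1,): a single thruster/depth command
```

Because the policy conditions on z, changes in the plant dynamics (e.g. added mass or external disturbances) alter the interaction history and thereby shift the context the controller sees, which is what enables online adaptation without retraining the policy weights.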
ISSN: 2077-1312
DOI: 10.3390/jmse12122159