Hostname: page-component-6766d58669-r8qmj Total loading time: 0 Render date: 2026-05-20T05:10:05.245Z Has data issue: false hasContentIssue false

Research on obstacle avoidance of underactuated autonomous underwater vehicle based on offline reinforcement learning

Published online by Cambridge University Press:  29 October 2024

Tao Liu*
Affiliation:
School of Ocean Engineering and Technology, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China Guangdong Provincial Key Laboratory of Information Technology for Deep Water Acoustics, Zhuhai, China
Junhao Huang
Affiliation:
School of Ocean Engineering and Technology, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
Jintao Zhao
Affiliation:
School of Ocean Engineering and Technology, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
*
Corresponding author: Tao Liu; Email: liutao55@mail.sysu.edu.cn

Abstract

The autonomous navigation and obstacle avoidance capabilities of autonomous underwater vehicles (AUVs) are essential for ensuring their safe navigation and long-term, efficient operation. However, the complexity of the marine environment poses significant challenges to safe and effective obstacle avoidance. To address this issue, this study proposes an AUV obstacle avoidance control algorithm based on offline reinforcement learning. This method adopts the Conservative Q-learning (CQL) algorithm, which is based on the Soft Actor-Critic (SAC) framework. It learns from obtained historical obstacle avoidance data and ultimately achieves a favorable obstacle avoidance control strategy. In this method, PID and SAC control algorithms are utilized to generate expert obstacle avoidance data to construct a diversified offline database. Additionally, based on the line-of-sight (LOS) guidance method and artificial potential field (APF) method, information regarding the distance and orientation of targets and obstacles is incorporated into the state space, and heading and obstacle avoidance reward terms are integrated into the reward function design. The algorithm successfully guides the AUV in autonomous navigation and dynamic obstacle avoidance in three-dimensional space. Furthermore, the algorithm exhibits a certain degree of anti-interference capability against uncertain disturbances and ocean currents, enhancing the safety and robustness of the AUV system. Simulation results fully demonstrate the feasibility and effectiveness of the intelligent obstacle avoidance method based on offline reinforcement learning. This study highlights the profound significance of offline reinforcement learning in enabling robust and reliable control systems for AUVs, paving the way for enhanced operational capabilities in challenging marine environments.

Information

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable