
Reinforcement learning-based motion control for snake robots in complex environments

Published online by Cambridge University Press: 12 February 2024

Dong Zhang
Affiliation:
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
Renjie Ju
Affiliation:
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
Zhengcai Cao*
Affiliation:
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
*Corresponding author: Zhengcai Cao; Email: caozc@buct.edu.cn

Abstract

Snake robots can move flexibly due to their special bodies and gaits. However, it is difficult to plan their motion in multi-obstacle environments due to their complex models. To solve this problem, this work investigates a reinforcement learning-based motion planning method. To plan feasible paths, a modified deep Q-learning algorithm is combined with a proposed Floyd-moving average algorithm, which ensures that the paths are smooth and adapted to the passage of snake robots. An improved path integral algorithm is used to work out the gait parameters that control snake robots as they move along the planned paths. To speed up parameter training, a strategy combining serial training, parallel training, and experience replay modules is designed. Moreover, we have designed a motion planning framework consisting of path planning, path smoothing, and motion planning. Various simulations are conducted to validate the effectiveness of the proposed algorithms.

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press

