
CPR-SLAM: RGB-D SLAM in dynamic environment using sub-point cloud correlations

Published online by Cambridge University Press:  17 May 2024

Xinyi Yu
Affiliation:
College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang, China
Wancai Zheng
Affiliation:
College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang, China
Linlin Ou*
Affiliation:
College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang, China
*Corresponding author: Linlin Ou; Email: linlinou@zjut.edu.cn

Abstract

Early applications of Visual Simultaneous Localization and Mapping (VSLAM) technology focused primarily on static environments, relying on the assumption of a static scene for map construction and localization. In practice, however, robots frequently operate in dynamic environments, such as city streets, where moving objects can prevent a robot from accurately estimating its own position. This paper proposes a real-time localization and mapping method tailored to dynamic environments that effectively handles the interference of moving objects. First, depth images are clustered and subdivided into sub-point clouds to obtain clearer local information; the concept of the sub-point cloud is introduced as a novel idea in this paper. Second, when processing regular frames, we fully exploit the structural invariance of static sub-point clouds and their relative relationships. Using results computed from sub-poses, we quantify the disparities between regular frames and reference frames, which enables accurate detection of dynamic areas within the regular frames. Furthermore, refining the dynamic areas of keyframes with historical observation data further enhances the robustness of the system. We conducted comprehensive experimental evaluations on challenging dynamic sequences from the TUM dataset and compared our approach with state-of-the-art dynamic VSLAM systems. The results demonstrate that our method significantly improves the accuracy and robustness of pose estimation. We also validated the effectiveness of the system in dynamic environments through real-world tests.
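The sub-point-cloud idea sketched in the abstract can be illustrated with a minimal example: back-project a depth image into 3D, cluster the points into sub-point clouds, and flag the clusters whose distances to the other clusters have changed between a reference frame and the current frame (a rigid, static scene preserves inter-cluster distances). This is only a rough sketch of the general principle, not the paper's actual pipeline; the intrinsics, cluster count, and threshold below are illustrative assumptions.

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy) -- illustrative values only.
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def backproject(depth):
    """Back-project a depth image (metres) into an N x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

def cluster_subclouds(points, k=4, iters=10, seed=0):
    """Naive k-means split of a point cloud into k sub-point clouds."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

def static_consistency(c_ref, c_cur, tol=0.05):
    """Flag sub-clouds whose pairwise centroid distances changed.

    Static structure keeps inter-cluster distances invariant under the
    camera's rigid motion, so a large change marks a moving sub-cloud.
    """
    d_ref = np.linalg.norm(c_ref[:, None] - c_ref[None], axis=2)
    d_cur = np.linalg.norm(c_cur[:, None] - c_cur[None], axis=2)
    err = np.abs(d_ref - d_cur)
    # a sub-cloud is suspect if its median distance error exceeds tol
    return np.median(err, axis=1) > tol
```

In a real system the clustering would run on the depth image itself and the disparities would be computed from per-cluster sub-poses, but the distance-invariance test above captures the core intuition: only the clusters belonging to moving objects break the relative structure of the scene.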

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press

