
Deep learning-based collision detection framework for robot tasks in clutter

Published online by Cambridge University Press: 15 May 2025

Giacomo Golluccio*
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
Daniele Di Vito
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
Gianluca Antonelli
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
Alessandro Marino
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
*
Corresponding author: Giacomo Golluccio; Email: giacomo.golluccio@unicas.it

Abstract

In this work, the problem of reliably checking collisions between robot manipulators and the surrounding environment in short time is addressed, targeting tasks such as replanning and object grasping in clutter. Geometric approaches are usually applied in this context; however, they may prove unsuitable in highly time-constrained applications. The purpose of this paper is to present a learning-based method able to outperform geometric approaches in clutter. The proposed approach uses a neural network (NN) to detect collisions online by performing a classification task on an input consisting of the depth image or point cloud of the robot gripper projected into the application scene. Specifically, several state-of-the-art NN architectures are considered, along with some customizations to tackle the problem at hand. These approaches are compared to identify the model that achieves the highest accuracy while containing the computational burden. The analysis shows the feasibility of a robot collision checker based on a deep learning approach: it achieves a low collision detection time, on the order of milliseconds on the selected hardware, with acceptable accuracy. Furthermore, its computational burden is compared with that of state-of-the-art geometric techniques. The entire work is based on an industrial case study involving a KUKA Agilus industrial robot manipulator at the Technology & Innovation Center of KUKA Deutschland GmbH, Germany. Further validation is performed with the Amazon Robotic Manipulation Benchmark (ARMBench) dataset in order to corroborate the reported findings.
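The pipeline the abstract describes — render the gripper at a candidate pose into the observed scene, then classify the composite as collision/no-collision — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the depth-composition rule, the network sizes, and the class names (`compose_depth`, `TinyCollisionNet`) are all assumptions, and the untrained toy MLP stands in for the CNN/point-cloud architectures the paper actually benchmarks.

```python
import numpy as np

def compose_depth(scene_depth, gripper_depth):
    """Project the gripper into the scene: at each pixel keep the nearer
    (smaller) depth value, so the gripper silhouette at the candidate
    grasp pose overwrites the scene wherever it is in front of it."""
    return np.minimum(scene_depth, gripper_depth)

class TinyCollisionNet:
    """Toy one-hidden-layer MLP standing in for the classifier
    (hypothetical layer sizes; weights are random, i.e. untrained)."""
    def __init__(self, in_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def predict_proba(self, depth_image):
        x = depth_image.ravel()                     # flatten the composite depth image
        h = np.maximum(x @ self.W1 + self.b1, 0.0)  # ReLU hidden layer
        z = h @ self.W2 + self.b2
        return 1.0 / (1.0 + np.exp(-z[0]))          # sigmoid -> P(collision)

# Usage: an 8x8 scene at 1.0 m, gripper fingers rendered 0.8 m away.
scene = np.full((8, 8), 1.0)
gripper = np.full((8, 8), np.inf)   # inf = gripper absent at this pixel
gripper[2:5, 2:5] = 0.8
img = compose_depth(scene, gripper)

net = TinyCollisionNet(in_dim=img.size)
p = net.predict_proba(img)
collision = bool(p > 0.5)           # binary collision decision
```

The appeal of this formulation is that a single forward pass replaces pairwise geometric queries against every obstacle, which is what keeps the per-check latency in the millisecond range reported in the abstract.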

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press


References

Pan, J. and Manocha, D., "Fast probabilistic collision checking for sampling-based motion planning using locality-sensitive hashing," The International Journal of Robotics Research 35(12), 1477–1496 (2016).
Golluccio, G., Di Lillo, P., Di Vito, D., Marino, A. and Antonelli, G., "Objects relocation in clutter with robot manipulators via tree-based Q-learning algorithm: Analysis and experiments," Journal of Intelligent & Robotic Systems 106(2), 1–20 (2022).
Di Vito, D., Bergeron, M., Meger, D., Dudek, G. and Antonelli, G., "Dynamic Planning of Redundant Robots Within a Set-Based Task-Priority Inverse Kinematics Framework," 2020 IEEE Conference on Control Technology and Applications (CCTA) (IEEE, Montreal, QC, Canada, 2020) pp. 549–554.
Kim, J. and Park, F. C., "Active learning of the collision distance function for high-DOF multi-arm robot systems," Robotica 42(12), 4055–4069 (2024).
Schmidt, P., Vahrenkamp, N., Wächter, M. and Asfour, T., "Grasping of Unknown Objects Using Deep Convolutional Neural Networks Based on Depth Images," 2018 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Brisbane, QLD, Australia, 2018) pp. 6831–6838.
Huebner, K., Welke, K., Przybylski, M., Vahrenkamp, N., Asfour, T., Kragic, D. and Dillmann, R., "Grasping Known Objects with Humanoid Robots: A Box-Based Approach," 2009 International Conference on Advanced Robotics (IEEE, Munich, Germany, 2009) pp. 1–6.
Bohg, J., Morales, A., Asfour, T. and Kragic, D., "Data-driven grasp synthesis—a survey," IEEE Transactions on Robotics 30(2), 289–309 (2013).
Lenz, I., Lee, H. and Saxena, A., "Deep learning for detecting robotic grasps," The International Journal of Robotics Research 34(4-5), 705–724 (2015).
D'Avella, S., Sundaram, A. M., Friedl, W., Tripicchio, P. and Roa, M. A., "Multimodal grasp planner for hybrid grippers in cluttered scenes," IEEE Robotics and Automation Letters 8(4), 2030–2037 (2023).
Berscheid, L., Rühr, T. and Kröger, T., "Improving Data Efficiency of Self-Supervised Learning for Robotic Grasping," 2019 International Conference on Robotics and Automation (ICRA) (IEEE, Montreal, QC, Canada, 2019) pp. 2125–2131.
Sundermeyer, M., Mousavian, A., Triebel, R. and Fox, D., "Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes," 2021 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Xi'an, China, 2021) pp. 13438–13444.
Mirjalili, R., Krawez, M., Silenzi, S., Blei, Y. and Burgard, W., "LAN-grasp: Using large language models for semantic object grasping," arXiv: 2310.05239 (2024).
Kumra, S., Joshi, S. and Sahin, F., "GR-ConvNet v2: A real-time multi-grasp detection network for robotic grasping," Sensors 22(16), 6208 (2022).
Chen, L., Niu, M., Yang, J., Qian, Y., Li, Z., Wang, K., Yan, T. and Huang, P., "Robotic grasp detection using structure prior attention and multiscale features," IEEE Transactions on Systems, Man, and Cybernetics: Systems 54(11), 7039–7053 (2024).
Dong, M. and Zhang, J., "A review of robotic grasp detection technology," Robotica 41(12), 1–40 (2023).
Krizhevsky, A., Sutskever, I. and Hinton, G. E., "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems 25, 1097–1105 (2012).
Qi, C. R., Su, H., Mo, K. and Guibas, L. J., "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Honolulu, HI, USA, 2017) pp. 77–85.
Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X. and Xiao, J., "3D ShapeNets: A deep representation for volumetric shapes," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Boston, MA, USA, 2015) pp. 1912–1920.
Zhou, Y. and Tuzel, O., "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Salt Lake City, UT, USA, 2018) pp. 4490–4499.
Maturana, D. and Scherer, S., "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition," 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, Hamburg, Germany, 2015) pp. 922–928.
Pan, J., Chitta, S. and Manocha, D., "FCL: A General Purpose Library for Collision and Proximity Queries," 2012 IEEE International Conference on Robotics and Automation (IEEE, Saint Paul, MN, USA, 2012) pp. 3859–3866.
Gilbert, E. G., Johnson, D. W. and Keerthi, S. S., "A fast procedure for computing the distance between complex objects in three-dimensional space," IEEE Journal on Robotics and Automation 4(2), 193–203 (1988).
Kleinbort, M., Salzman, O. and Halperin, D., "Collision detection or nearest-neighbor search? On the computational bottleneck in sampling-based motion planning," arXiv: 1607.04800 (2016).
Jiménez, P., Thomas, F. and Torras, C., "Collision Detection Algorithms for Motion Planning," In: Robot Motion Planning and Control (Springer, 2005) pp. 305–343.
Das, N. and Yip, M., "Learning-based proxy collision detection for robot motion planning applications," IEEE Transactions on Robotics 36(4), 1096–1114 (2020).
Muñoz, J., Lehner, P., Moreno, L. E., Albu-Schäffer, A. and Roa, M. A., "CollisionGP: Gaussian process-based collision checking for robot motion planning," IEEE Robotics and Automation Letters 8(7), 4036–4043 (2023).
Tamizi, M. G., Honari, H., Nozdryn-Plotnicki, A. and Najjaran, H., "End-to-end deep learning-based framework for path planning and collision checking: Bin-picking application," Robotica 42(4), 1094–1112 (2024).
Joho, D., Schwinn, J. and Safronov, K., IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Yokohama, Japan, 2024) pp. 15402–15408.
Danielczuk, M., Mousavian, A., Eppner, C. and Fox, D., "Object Rearrangement Using Learned Implicit Collision Functions," 2021 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Xi'an, China, 2021) pp. 6010–6017.
Murali, A., Mousavian, A., Eppner, C., Fishman, A. and Fox, D., "CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation," 2023 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, London, United Kingdom, 2023) pp. 1866–1874.
Mitash, C., Wang, F., Lu, S., Terhuja, V., Garaas, T., Polido, F. and Nambi, M., "ARMBench: An Object-Centric Benchmark Dataset for Robotic Manipulation," 2023 IEEE International Conference on Robotics and Automation (ICRA) (London, United Kingdom, 2023) pp. 9132–9139. https://www.amazon.science/publications/armbench-an-object-centric-benchmark-dataset-for-robotic-manipulation.
Golluccio, G., Robot Planning and Control Combined with Machine Learning Techniques (University of Cassino and Southern Lazio, Cassino, 2023).
Zhou, Q.-Y., Park, J. and Koltun, V., "Open3D: A modern library for 3D data processing," arXiv: 1801.09847 (2018).
Peng, S., Jiang, C., Liao, Y., Niemeyer, M., Pollefeys, M. and Geiger, A., "Shape as points: A differentiable Poisson solver," Advances in Neural Information Processing Systems 34, 13032–13044 (2021).
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K. and Fei-Fei, L., "ImageNet: A Large-Scale Hierarchical Image Database," 2009 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, Miami, FL, USA, 2009) pp. 248–255.
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., Gérard-Marchant, P., Sheppard, K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C. and Oliphant, T. E., "Array programming with NumPy," Nature 585(7825), 357–362 (2020).
Muja, M. and Lowe, D. G., "Scalable nearest neighbor algorithms for high dimensional data," IEEE Transactions on Pattern Analysis and Machine Intelligence 36(11), 2227–2240 (2014).
de Queiroz, R. L., Garcia, D. C., Chou, P. A. and Florencio, D. A., "Distance-based probability model for octree coding," IEEE Signal Processing Letters 25(6), 739–742 (2018).
Supplementary material

Golluccio et al. supplementary material (File, 3 MB)