
Deep learning-based collision detection framework for robot tasks in clutter

Published online by Cambridge University Press: 15 May 2025

Giacomo Golluccio*
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
Daniele Di Vito
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
Gianluca Antonelli
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
Alessandro Marino
Affiliation:
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Cassino, Italy
*
Corresponding author: Giacomo Golluccio; Email: giacomo.golluccio@unicas.it

Abstract

In this work, the problem of reliably checking collisions between robot manipulators and the surrounding environment in short time is addressed, targeting tasks such as replanning and object grasping in clutter. Geometric approaches are usually applied in this context; however, they may prove unsuitable in highly time-constrained applications. The purpose of this paper is to present a learning-based method able to outperform geometric approaches in clutter. The proposed approach uses a neural network (NN) to detect collisions online by performing a classification task on an input consisting of the depth image or point cloud of the robot gripper projected into the application scene. Specifically, several state-of-the-art NN architectures are considered, along with some customizations to tackle the problem at hand. These approaches are compared to identify the model that achieves the highest accuracy while containing the computational burden. The analysis shows the feasibility of a robot collision checker based on a deep learning approach: it achieves a low collision detection time, on the order of milliseconds on the selected hardware, with acceptable accuracy. Furthermore, its computational burden is compared with that of state-of-the-art geometric techniques. The entire work is based on an industrial case study involving a KUKA Agilus industrial robot manipulator at the Technology & Innovation Center of KUKA Deutschland GmbH, Germany. Further validation is performed with the Amazon Robotic Manipulation Benchmark (ARMBench) dataset in order to corroborate the reported findings.
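The pipeline the abstract describes — render the gripper at a candidate pose into the observed scene, then classify the composite as collision/no-collision — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the depth-composition rule, the network sizes, and the class names (`compose_depth`, `TinyCollisionNet`) are all assumptions, and the untrained toy MLP stands in for the CNN/point-cloud architectures the paper actually benchmarks.

```python
import numpy as np

def compose_depth(scene_depth, gripper_depth):
    """Project the gripper into the scene: at each pixel keep the nearer
    (smaller) depth value, so the gripper silhouette at the candidate
    grasp pose overwrites the scene wherever it is in front of it."""
    return np.minimum(scene_depth, gripper_depth)

class TinyCollisionNet:
    """Toy one-hidden-layer MLP standing in for the classifier
    (hypothetical layer sizes; weights are random, i.e. untrained)."""
    def __init__(self, in_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def predict_proba(self, depth_image):
        x = depth_image.ravel()                     # flatten the composite depth image
        h = np.maximum(x @ self.W1 + self.b1, 0.0)  # ReLU hidden layer
        z = h @ self.W2 + self.b2
        return 1.0 / (1.0 + np.exp(-z[0]))          # sigmoid -> P(collision)

# Usage: an 8x8 scene at 1.0 m, gripper fingers rendered 0.8 m away.
scene = np.full((8, 8), 1.0)
gripper = np.full((8, 8), np.inf)   # inf = gripper absent at this pixel
gripper[2:5, 2:5] = 0.8
img = compose_depth(scene, gripper)

net = TinyCollisionNet(in_dim=img.size)
p = net.predict_proba(img)
collision = bool(p > 0.5)           # binary collision decision
```

The appeal of this formulation is that a single forward pass replaces pairwise geometric queries against every obstacle, which is what keeps the per-check latency in the millisecond range reported in the abstract.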

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press


References

Pan, J. and Manocha, D., "Fast probabilistic collision checking for sampling-based motion planning using locality-sensitive hashing," The International Journal of Robotics Research 35(12), 1477–1496 (2016).
Golluccio, G., Di Lillo, P., Di Vito, D., Marino, A. and Antonelli, G., "Objects relocation in clutter with robot manipulators via tree-based Q-learning algorithm: Analysis and experiments," Journal of Intelligent & Robotic Systems 106(2), 1–20 (2022).
Di Vito, D., Bergeron, M., Meger, D., Dudek, G. and Antonelli, G., "Dynamic Planning of Redundant Robots Within a Set-Based Task-Priority Inverse Kinematics Framework," 2020 IEEE Conference on Control Technology and Applications (CCTA) (IEEE, Montreal, QC, Canada, 2020) pp. 549–554.
Kim, J. and Park, F. C., "Active learning of the collision distance function for high-DOF multi-arm robot systems," Robotica 42(12), 4055–4069 (2024).
Schmidt, P., Vahrenkamp, N., Wächter, M. and Asfour, T., "Grasping of Unknown Objects Using Deep Convolutional Neural Networks Based on Depth Images," 2018 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Brisbane, QLD, Australia, 2018) pp. 6831–6838.
Huebner, K., Welke, K., Przybylski, M., Vahrenkamp, N., Asfour, T., Kragic, D. and Dillmann, R., "Grasping Known Objects with Humanoid Robots: A Box-Based Approach," 2009 International Conference on Advanced Robotics (IEEE, Munich, Germany, 2009) pp. 1–6.
Bohg, J., Morales, A., Asfour, T. and Kragic, D., "Data-driven grasp synthesis—a survey," IEEE Transactions on Robotics 30(2), 289–309 (2013).
Lenz, I., Lee, H. and Saxena, A., "Deep learning for detecting robotic grasps," The International Journal of Robotics Research 34(4-5), 705–724 (2015).
D'Avella, S., Sundaram, A. M., Friedl, W., Tripicchio, P. and Roa, M. A., "Multimodal grasp planner for hybrid grippers in cluttered scenes," IEEE Robotics and Automation Letters 8(4), 2030–2037 (2023).
Berscheid, L., Rühr, T. and Kröger, T., "Improving Data Efficiency of Self-Supervised Learning for Robotic Grasping," 2019 International Conference on Robotics and Automation (ICRA) (IEEE, Montreal, QC, Canada, 2019) pp. 2125–2131.
Sundermeyer, M., Mousavian, A., Triebel, R. and Fox, D., "Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes," 2021 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Xi'an, China, 2021) pp. 13438–13444.
Mirjalili, R., Krawez, M., Silenzi, S., Blei, Y. and Burgard, W., "LAN-grasp: Using large language models for semantic object grasping," arXiv: 2310.05239 (2024).
Kumra, S., Joshi, S. and Sahin, F., "GR-ConvNet v2: A real-time multi-grasp detection network for robotic grasping," Sensors 22(16), 6208 (2022).
Chen, L., Niu, M., Yang, J., Qian, Y., Li, Z., Wang, K., Yan, T. and Huang, P., "Robotic grasp detection using structure prior attention and multiscale features," IEEE Transactions on Systems, Man, and Cybernetics: Systems 54(11), 7039–7053 (2024).
Dong, M. and Zhang, J., "A review of robotic grasp detection technology," Robotica 41(12), 1–40 (2023).
Krizhevsky, A., Sutskever, I. and Hinton, G. E., "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems 25, 1097–1105 (2012).
Qi, C. R., Su, H., Mo, K. and Guibas, L. J., "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Honolulu, HI, USA, 2017) pp. 77–85.
Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X. and Xiao, J., "3D ShapeNets: A deep representation for volumetric shapes," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Boston, MA, USA, 2015) pp. 1912–1920.
Zhou, Y. and Tuzel, O., "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Salt Lake City, UT, USA, 2018) pp. 4490–4499.
Maturana, D. and Scherer, S., "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition," 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, Hamburg, Germany, 2015) pp. 922–928.
Pan, J., Chitta, S. and Manocha, D., "FCL: A General Purpose Library for Collision and Proximity Queries," 2012 IEEE International Conference on Robotics and Automation (IEEE, Saint Paul, MN, USA, 2012) pp. 3859–3866.
Gilbert, E. G., Johnson, D. W. and Keerthi, S. S., "A fast procedure for computing the distance between complex objects in three-dimensional space," IEEE Journal on Robotics and Automation 4(2), 193–203 (1988).
Kleinbort, M., Salzman, O. and Halperin, D., "Collision detection or nearest-neighbor search? On the computational bottleneck in sampling-based motion planning," arXiv: 1607.04800 (2016).
Jiménez, P., Thomas, F. and Torras, C., "Collision Detection Algorithms for Motion Planning," In: Robot Motion Planning and Control (Springer, 2005) pp. 305–343.
Das, N. and Yip, M., "Learning-based proxy collision detection for robot motion planning applications," IEEE Transactions on Robotics 36(4), 1096–1114 (2020).
Muñoz, J., Lehner, P., Moreno, L. E., Albu-Schäffer, A. and Roa, M. A., "CollisionGP: Gaussian process-based collision checking for robot motion planning," IEEE Robotics and Automation Letters 8(7), 4036–4043 (2023).
Tamizi, M. G., Honari, H., Nozdryn-Plotnicki, A. and Najjaran, H., "End-to-end deep learning-based framework for path planning and collision checking: Bin-picking application," Robotica 42(4), 1094–1112 (2024).
Joho, D., Schwinn, J. and Safronov, K., IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Yokohama, Japan, 2024) pp. 15402–15408.
Danielczuk, M., Mousavian, A., Eppner, C. and Fox, D., "Object Rearrangement Using Learned Implicit Collision Functions," 2021 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, Xi'an, China, 2021) pp. 6010–6017.
Murali, A., Mousavian, A., Eppner, C., Fishman, A. and Fox, D., "CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation," 2023 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, London, United Kingdom, 2023) pp. 1866–1874.
Mitash, C., Wang, F., Lu, S., Terhuja, V., Garaas, T., Polido, F. and Nambi, M., "ARMBench: An Object-Centric Benchmark Dataset for Robotic Manipulation," 2023 IEEE International Conference on Robotics and Automation (ICRA) (London, United Kingdom, 2023) pp. 9132–9139. https://www.amazon.science/publications/armbench-an-object-centric-benchmark-dataset-for-robotic-manipulation.
Golluccio, G., Robot Planning and Control Combined with Machine Learning Techniques (University of Cassino and Southern Lazio, Cassino, 2023).
Zhou, Q.-Y., Park, J. and Koltun, V., "Open3D: A modern library for 3D data processing," arXiv: 1801.09847 (2018).
Peng, S., Jiang, C., Liao, Y., Niemeyer, M., Pollefeys, M. and Geiger, A., "Shape as points: A differentiable Poisson solver," Advances in Neural Information Processing Systems 34, 13032–13044 (2021).
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K. and Fei-Fei, L., "ImageNet: A Large-Scale Hierarchical Image Database," 2009 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, Miami, FL, USA, 2009) pp. 248–255.
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., Gérard-Marchant, P., Sheppard, K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C. and Oliphant, T. E., "Array programming with NumPy," Nature 585(7825), 357–362 (2020).
Muja, M. and Lowe, D. G., "Scalable nearest neighbor algorithms for high dimensional data," IEEE Transactions on Pattern Analysis and Machine Intelligence 36(11), 2227–2240 (2014).
de Queiroz, R. L., Garcia, D. C., Chou, P. A. and Florencio, D. A., "Distance-based probability model for octree coding," IEEE Signal Processing Letters 25(6), 739–742 (2018).
Supplementary material

Golluccio et al. supplementary material (File, 3 MB)