
Vision-based food handling system for high-resemblance random food items

Published online by Cambridge University Press:  21 February 2024

Yadan Zeng, Yee Seng Teoh, Guoniu Zhu, Elvin Toh and I-Ming Chen

Affiliation: Robotics Research Centre of the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore

Corresponding author: Yadan Zeng; Email: yadan001@e.ntu.edu.sg

Abstract

The rise in the number of automated robotic kitchens has accelerated the need for advanced food handling systems, with an emphasis on food analysis, including ingredient classification, pose recognition, and assembly strategy. Selecting the optimal piece from a pile of similarly shaped food items is a challenge for automated meal-assembly systems. To address this, we present a constructive assembling algorithm, introducing a unique approach for food pose detection, Fast Image to Pose Detection (FI2PD), and a closed-loop packing strategy. Powered by a convolutional neural network (CNN) and a pose retrieval model, FI2PD constructs a 6D pose from RGB images alone. The method employs a coarse-to-fine approach, leveraging the CNN to pinpoint object orientation and position, alongside a pose retrieval process for target selection and 6D pose derivation. Our closed-loop packing strategy, aided by the Item Arrangement Verifier, ensures precise arrangement and system robustness. Additionally, we introduce our FdIngred328 dataset of nine food categories, spanning fake and real foods, together with automatically generated data based on synthetic techniques. Our method achieves a success rate of 97.9% for object recognition and pose detection. Notably, integrating the closed-loop strategy into the meal-assembly process yielded a success rate of 90%, outperforming systems lacking the closed-loop mechanism.
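The abstract outlines a coarse-to-fine pipeline (a CNN for coarse position and orientation, then pose retrieval for the full 6D pose) wrapped in a closed-loop packing cycle guarded by the Item Arrangement Verifier. The sketch below illustrates how such a pipeline could be wired together; it is not the authors' implementation, and every name and interface in it (detect_candidates, retrieve_6d_pose, the camera/robot/verifier objects) is a hypothetical placeholder inferred from the abstract.

```python
# Minimal sketch of the coarse-to-fine, closed-loop flow described in the
# abstract. All names and interfaces are hypothetical placeholders, not
# the authors' actual API.

from dataclasses import dataclass

import numpy as np


@dataclass
class Candidate:
    """One detected food item from the coarse CNN stage."""
    class_id: int           # ingredient category
    position_px: tuple      # coarse 2D position in the image
    orientation_rad: float  # coarse in-plane orientation
    score: float            # detection confidence


def detect_candidates(rgb: np.ndarray) -> list[Candidate]:
    """Coarse stage: a CNN localizes items and their in-plane orientation."""
    raise NotImplementedError("stand-in for the trained detector")


def retrieve_6d_pose(rgb: np.ndarray, cand: Candidate) -> np.ndarray:
    """Fine stage: pose retrieval lifts the coarse 2D estimate to a full
    6D pose (here a 4x4 homogeneous transform) by matching against
    stored pose templates."""
    raise NotImplementedError("stand-in for the pose-retrieval model")


def assemble_meal(camera, robot, verifier, n_slots: int) -> bool:
    """Closed-loop packing: pick the best candidate, place it, then let
    the arrangement verifier confirm the placement before moving on."""
    filled = 0
    while filled < n_slots:
        rgb = camera.capture()
        candidates = detect_candidates(rgb)
        if not candidates:
            return False  # nothing pickable left in the pile
        best = max(candidates, key=lambda c: c.score)
        pose = retrieve_6d_pose(rgb, best)
        robot.pick_and_place(pose, slot=filled)
        # The verifier closes the loop: a slot only counts as filled
        # once a fresh image confirms the item is correctly arranged.
        if verifier.check(camera.capture(), slot=filled):
            filled += 1
    return True
```

The design point mirrored here is that success is judged by re-imaging the tray rather than by trusting the planned motion, which is what would let a closed-loop system of this kind recover from misplacements instead of silently propagating them.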

Type: Research Article

Copyright: © The Author(s), 2024. Published by Cambridge University Press

Supplementary material: Zeng et al. supplementary material (File, 19.1 MB)