
Vision-based food handling system for high-resemblance random food items

Published online by Cambridge University Press:  21 February 2024

Yadan Zeng*
Affiliation:
Robotics Research Centre of the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore
Yee Seng Teoh
Affiliation:
Robotics Research Centre of the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore
Guoniu Zhu
Affiliation:
Robotics Research Centre of the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore
Elvin Toh
Affiliation:
Robotics Research Centre of the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore
I-Ming Chen
Affiliation:
Robotics Research Centre of the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore
*
Corresponding author: Yadan Zeng; Email: yadan001@e.ntu.edu.sg

Abstract

The rise in the number of automated robotic kitchens has accelerated the need for advanced food handling systems, emphasizing food analysis including ingredient classification, pose recognition, and assembly strategy. Selecting the optimal piece from a pile of similarly shaped food items is a challenge for automated meal-assembly systems. To address this, we present a constructive assembling algorithm, introducing a unique approach for food pose detection, Fast Image to Pose Detection (FI2PD), and a closed-loop packing strategy. Powered by a convolutional neural network (CNN) and a pose retrieval model, FI2PD constructs a 6D pose from RGB images alone. The method employs a coarse-to-fine approach, leveraging the CNN to pinpoint object orientation and position, alongside a pose retrieval process for target selection and 6D pose derivation. Our closed-loop packing strategy, aided by the Item Arrangement Verifier, ensures precise arrangement and system robustness. Additionally, we introduce our FdIngred328 dataset of nine food categories, ranging from fake foods to real foods, together with automatically generated data based on synthetic techniques. Our method achieves a success rate of 97.9% for object recognition and pose detection. Impressively, integrating the closed-loop strategy into our meal-assembly process yields a success rate of 90%, outperforming systems lacking the closed-loop mechanism.
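The coarse-to-fine flow the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the detection stage is stubbed out with canned results, the template library and all function names (`retrieve_pose`, `select_target`) are hypothetical, and a real system would run the CNN on the RGB image and hold a much denser set of pre-rendered pose templates.

```python
import math

# Hypothetical template library: coarse in-plane angle (degrees)
# -> full 6D pose (x, y, z, roll, pitch, yaw). A real retrieval
# model would index many more templates per ingredient class.
POSE_TEMPLATES = {
    0.0:   (0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    90.0:  (0.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2),
    180.0: (0.0, 0.0, 0.0, 0.0, 0.0, math.pi),
}

def retrieve_pose(angle_deg):
    """Fine stage: pick the stored template whose in-plane angle
    is closest to the coarse CNN estimate."""
    best = min(POSE_TEMPLATES, key=lambda a: abs(a - angle_deg))
    return POSE_TEMPLATES[best]

def select_target(detections):
    """Choose the highest-confidence detected item, then derive
    its 6D pose from the template library.

    Each detection is (class, score, (u, v) image position, angle).
    """
    cls, score, (u, v), angle = max(detections, key=lambda d: d[1])
    _, _, z, roll, pitch, yaw = retrieve_pose(angle)
    # Toy convention: keep the detected image position as (x, y).
    return cls, (u, v, z, roll, pitch, yaw)

# Canned stand-in for the coarse CNN stage's output on one image.
detections = [
    ("sushi", 0.92, (120, 80), 85.0),
    ("sushi", 0.88, (200, 60), 10.0),
]
print(select_target(detections))
```

The highest-scoring item (0.92) is selected, and its coarse 85° estimate snaps to the 90° template, yielding a yaw of π/2. In the paper's closed-loop strategy, a placement verifier would then check the arrangement after each pick and trigger a retry on failure.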

Information

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press


Supplementary material: Zeng et al. supplementary material (File, 19.1 MB)