
Mixed deep learning and natural language processing method for fake-food image recognition and standardization to help automated dietary assessment

Published online by Cambridge University Press:  06 April 2018

Simon Mezgec
Affiliation:
Jožef Stefan International Postgraduate School, Ljubljana, Slovenia Computer Systems Department, Jožef Stefan Institute, Jamova cesta 39, Ljubljana 1000, Slovenia
Tome Eftimov
Affiliation:
Jožef Stefan International Postgraduate School, Ljubljana, Slovenia Computer Systems Department, Jožef Stefan Institute, Jamova cesta 39, Ljubljana 1000, Slovenia
Tamara Bucher
Affiliation:
Institute of Food, Nutrition and Health (IFNH), ETH Zürich, Zürich, Switzerland School of Health Sciences, Faculty of Health and Medicine, Priority Research Centre in Physical Activity and Nutrition, The University of Newcastle, Callaghan, Australia
Barbara Koroušić Seljak*
Affiliation:
Computer Systems Department, Jožef Stefan Institute, Jamova cesta 39, Ljubljana 1000, Slovenia
*
*Corresponding author: Email barbara.korousic@ijs.si

Abstract

Objective

The present study tested the combination of an established and validated food-choice research method (the 'fake food buffet') with a new food-matching technology to automate data collection and analysis.

Design

The methodology combines fake-food image recognition using deep learning with food matching and standardization based on natural language processing. The image-recognition step is distinctive in that a single deep learning network performs both the segmentation and the classification at the pixel level of the image. Its performance was assessed with measures based on standard pixel accuracy and Intersection over Union. Food matching first describes each recognized food item in the image and then matches the food items with their compositional data, considering both their food names and their descriptors.
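The two evaluation measures mentioned above can be illustrated on toy label images. This is a minimal sketch of standard pixel accuracy and mean Intersection over Union, not the authors' evaluation code; the function names `pixel_accuracy` and `mean_iou` are hypothetical, and the tiny 2x3 arrays stand in for per-pixel class maps of the kind produced by a segmentation network.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float(np.mean(pred == truth))

def mean_iou(pred, truth, n_classes):
    """Mean Intersection over Union over classes that occur in either image."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        if union > 0:                     # skip classes absent from both images
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x3 label images with two classes (0 = background, 1 = a food item)
truth = np.array([[0, 1, 1],
                  [0, 0, 1]])
pred  = np.array([[0, 1, 1],
                  [0, 1, 1]])

print(pixel_accuracy(pred, truth))           # 5 of 6 pixels agree
print(mean_iou(pred, truth, n_classes=2))    # average of per-class IoUs
```

Pixel accuracy rewards any correct pixel, so large background regions can dominate it; IoU penalizes both false positives and false negatives per class, which is why segmentation work typically reports both.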

Results

The final accuracy of the deep learning model, trained on fake-food images acquired from 124 study participants and covering fifty-five food classes, was 92·18 %, while food matching achieved a classification accuracy of 93 %.

Conclusions

The present findings are a step towards automating dietary assessment and food-choice research. The methodology outperforms other approaches in pixel accuracy, and since it is the first automatic solution for recognizing images of fake foods, the results can serve as a baseline for future studies. Because the approach enables a semi-automatic description of recognized food items (e.g. with respect to FoodEx2), these can be linked to any food composition database that applies the same classification and description system.

Information

Type
Research paper
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Authors 2018

Fig. 1 Methodology flowchart. The food image recognition process uses a fake-food image to find classes (names) for all food items in the image. These are then processed by the StandFood method to define the FoodEx2 descriptors of the recognized food items. Once both the food names and descriptors are identified, the recognized fake foods can be matched with compositional data from the food composition database. The final result is a fake-food image standardized with unique descriptors, which enables the conversion of food intake into nutrient intake and supports automated dietary assessment


Fig. 2 Example images from each of the three subsets (training, validation and testing) of the fake food buffet data set, along with the corresponding ground-truth label images. The third image column contains predictions from the FCN-8s deep learning model. Each colour found in the images represents a different food or drink item; these items and their corresponding colours are listed to the right of the images


Table 1 Results from the FCN-8s deep learning model


Table 2 Correctly classified food classes using the StandFood classification part and description of the food classes using the StandFood description part


Table 3 StandFood post-processing result of three randomly selected food classes