Automatic food detection in egocentric images using artificial intelligence technology

  • Wenyan Jia (a1), Yuecheng Li (a1), Ruowei Qu (a1) (a2), Thomas Baranowski (a3), Lora E Burke (a4), Hong Zhang (a5), Yicheng Bai (a1), Juliet M Mancino (a4), Guizhi Xu (a2), Zhi-Hong Mao (a6) and Mingui Sun (a1) (a6) (a7)
Abstract
Objective

To develop an artificial intelligence (AI)-based algorithm that automatically detects food items in images acquired by an egocentric wearable camera for dietary assessment.

Design

To study human diet and lifestyle, large sets of egocentric images were acquired from free-living individuals using a wearable device called eButton. Three thousand nine hundred images depicting real-world activities, manually selected from thirty subjects, formed eButton data set 1. eButton data set 2 contained 29 515 images acquired from one research participant during a week-long unrestricted recording. These images covered both food- and non-food-related real-life activities, such as dining at home and in restaurants, cooking, shopping, gardening, housekeeping chores, taking classes and exercising at the gym. All images in both data sets were classified as food or non-food images based on tags generated by a convolutional neural network.
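The tag-based detection step can be illustrated with a short, hypothetical sketch (this is not the authors' published pipeline): a pretrained ImageNet CNN produces class labels that serve as image tags, and an image is flagged as food-related when any of its top-k tags matches a small food-keyword list. The model choice, the keyword list and k are assumptions made purely for illustration.

```python
# Minimal illustrative sketch (not the published method): tag an image with a
# pretrained ImageNet CNN and flag it as a food image when any top-k tag
# matches a small, hand-picked food-keyword list.
import torch
from torchvision import models
from torchvision.io import read_image

# Illustrative keyword list only; a real system would use a curated vocabulary.
FOOD_KEYWORDS = {"pizza", "plate", "banana", "hotdog", "espresso", "soup",
                 "burrito", "broccoli", "ice cream", "cheeseburger"}

weights = models.ResNet50_Weights.IMAGENET1K_V2
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()
categories = weights.meta["categories"]  # ImageNet class names used as "tags"

def is_food_image(path: str, k: int = 5) -> bool:
    """Return True if any of the image's top-k tags looks food-related."""
    img = preprocess(read_image(path)).unsqueeze(0)
    with torch.no_grad():
        probs = model(img).softmax(dim=1)[0]
    top_tags = [categories[i].lower() for i in probs.topk(k).indices.tolist()]
    return any(word in tag for tag in top_tags for word in FOOD_KEYWORDS)
```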

Results

A cross data-set test was conducted on eButton data set 1. The overall accuracy of food detection was 91·5 and 86·4 %, respectively, when one-half of data set 1 was used for training and the other half for testing. For eButton data set 2, 74·0 % sensitivity and 87·0 % specificity were obtained if both ‘food’ and ‘drink’ were considered as food images. Alternatively, if only ‘food’ items were considered, the sensitivity and specificity reached 85·0 and 85·8 %, respectively.
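For readers less familiar with the reported measures, the snippet below shows how sensitivity, specificity and overall accuracy relate to the food/non-food confusion counts; the counts in the example are arbitrary placeholders, not values from the study.

```python
# Illustrative computation of the reported metrics from confusion-matrix counts.
# The counts below are arbitrary placeholders, not data from the study.
def detection_metrics(tp: int, fn: int, tn: int, fp: int):
    sensitivity = tp / (tp + fn)                 # food images correctly detected
    specificity = tn / (tn + fp)                 # non-food images correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall proportion correct
    return sensitivity, specificity, accuracy

print(detection_metrics(tp=90, fn=10, tn=880, fp=120))  # e.g. 0.90, 0.88, 0.88
```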

Conclusions

AI technology can automatically detect foods with reasonable accuracy in low-quality, real-world egocentric images acquired by a wearable camera, reducing both the burden of data processing and privacy concerns.

Corresponding author
* Corresponding author: Email jiawenyan@gmail.com


Supplementary materials

Jia et al. supplementary material 1 (PDF, 655 KB)
