Guangzong Chen1, Wenyan Jia1, Yifan Zhao1, Zhi-Hong Mao1, Benny Lo2, Alex K Anderson3, Gary Frost4, Modou L Jobarteh4, Megan A McCrory5, Edward Sazonov6, Matilda Steiner-Asiedu7, Richard S Ansong7, Thomas Baranowski8, Lora Burke9, Mingui Sun1,10.
Abstract
Malnutrition, including both undernutrition and obesity, is a significant problem in low- and middle-income countries (LMICs). To study malnutrition and develop effective intervention strategies, it is crucial to evaluate nutritional status in LMICs at the individual, household, and community levels. In a multinational research project supported by the Bill & Melinda Gates Foundation, we have been using wearable technology to conduct objective dietary assessment in sub-Saharan Africa. Our assessment covers multiple diet-related activities in urban and rural families, including food sourcing (e.g., shopping, harvesting, and gathering), preservation/storage, preparation, cooking, and consumption (e.g., portion size and nutrition analysis). Our wearable device ("eButton", worn on the chest) automatically acquires real-life images at preset time intervals during waking hours. The recorded images, numbering in the tens of thousands per day, are post-processed to extract the information of interest. Although we expect future Artificial Intelligence (AI) technology to extract this information automatically, at present we use AI to separate the acquired images into two classes: images with (Class 1) and without (Class 0) edible items. As a result, researchers need only study the Class-1 images, which reduces their workload significantly. In this paper, we present a composite machine learning method that performs this classification while meeting the specific challenges of high complexity and diversity in real-world LMIC data. Our method consists of a deep neural network (DNN) and a shallow learning network (SLN) connected by a novel probabilistic network interface layer. After presenting the details of our method, we use an image dataset acquired in Ghana to train and evaluate the machine learning system.
Our comparative experiments indicate that the new composite method performs better than the conventional deep learning method when assessed by the integrated measures of sensitivity, specificity, and burden index, as shown by the Receiver Operating Characteristic (ROC) curve.
Keywords: artificial intelligence; egocentric image; low- and middle-income country; technology-based dietary assessment; wearable device
Year: 2021 PMID: 33870184 PMCID: PMC8047062 DOI: 10.3389/frai.2021.644712
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
Figure 1. (A) Four food-related images; (B) four non-food-related images.
Figure 2. Proposed composite machine learning architecture, which includes a deep neural network (DNN) developed by Clarifai, a probabilistic network interface, and a shallow learning network (SLN).
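The architecture in Figure 2 can be sketched in a few lines: the DNN emits (tag, confidence) pairs for each image, the probabilistic interface layer maps them onto a fixed-length vector over a chosen tag vocabulary, and the SLN classifies that vector. In the sketch below, the tag vocabulary, the toy training pairs, and the use of scikit-learn's `SVC` as the shallow learner are all illustrative assumptions, not the paper's exact implementation (which applies Clarifai's tagger to real eButton images).

```python
from sklearn.svm import SVC

# Hypothetical tag vocabulary; in practice this is built from the
# most frequent tags observed in the training images (see Figure 3).
VOCAB = ["food", "plate", "cooking", "table", "person", "outdoor"]

def tags_to_vector(tag_probs):
    """Probabilistic interface layer: map the DNN's (tag, confidence)
    output to a fixed-length vector ordered by VOCAB; absent tags get 0."""
    lookup = dict(tag_probs)
    return [lookup.get(tag, 0.0) for tag in VOCAB]

# Toy training data standing in for vectors from labeled eButton images.
X = [tags_to_vector(t) for t in [
    [("food", 0.9), ("plate", 0.8)],       # Class 1 (edible item present)
    [("cooking", 0.7), ("food", 0.6)],     # Class 1
    [("person", 0.9), ("outdoor", 0.8)],   # Class 0
    [("table", 0.5), ("person", 0.7)],     # Class 0
]]
y = [1, 1, 0, 0]

sln = SVC(kernel="linear")  # shallow learner consuming the interface vectors
sln.fit(X, y)
print(sln.predict([tags_to_vector([("food", 0.85), ("plate", 0.4)])]))  # → [1]
```

The interface layer is what decouples the two networks: the DNN can be a fixed, pre-trained tagger, while only the small SLN is trained on task-specific labels.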
Figure 3. Histogram of the tags, showing the occurrence frequency of each tag. The tags on the left side of the vertical line were used to construct the feature vector.
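As a sketch of the vocabulary construction Figure 3 describes, the code below counts tag occurrences over a batch of images and keeps only the tags whose frequency clears a cutoff (the role played by the vertical line in the histogram). The example tag lists and the cutoff value are hypothetical.

```python
from collections import Counter

# Hypothetical per-image tag lists returned by the DNN for a batch of images.
image_tags = [
    ["food", "plate", "table"],
    ["food", "person"],
    ["outdoor", "person", "food"],
    ["table", "person"],
]

MIN_COUNT = 2  # cutoff corresponding to the vertical line in the histogram

# Build the occurrence histogram and keep frequent tags, most frequent first.
counts = Counter(tag for tags in image_tags for tag in tags)
vocab = sorted([t for t, c in counts.items() if c >= MIN_COUNT],
               key=lambda t: -counts[t])
print(vocab)  # → ['food', 'person', 'table']
```

Each image is then represented by a feature vector with one entry per retained tag, which keeps the SLN's input dimension small and stable.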
Comparison of classification results using different approaches.

| Method | TP | FN | TN | FP | Sensitivity | Specificity | Burden index |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DNN + SLN | 1,750 | 311 | 10,545 | 2,256 | 0.85 | 0.82 | 27.0% |
| DNN (Clarifai, threshold = 0.7) | 1,296 | 765 | 10,400 | 2,401 | 0.63 | 0.81 | 24.9% |
| DNN (Clarifai, threshold = 0.6) | 1,474 | 587 | 9,281 | 3,520 | 0.72 | 0.73 | 33.6% |
| DNN (Clarifai, threshold = 0.5) | 1,630 | 431 | 7,929 | 4,872 | 0.79 | 0.62 | 43.7% |
| DNN (Clarifai, threshold = 0.4) | 1,756 | 305 | 6,491 | 6,310 | 0.85 | 0.51 | 54.3% |
| Previous algorithm ( | 1,854 | 207 | 885 | 11,916 | 0.90 | 0.07 | 92.7% |
| Previous algorithm ( | 1,469 | 592 | 2,751 | 10,050 | 0.71 | 0.21 | 77.5% |
| Previous algorithm ( | 955 | 1,106 | 5,456 | 7,345 | 0.46 | 0.43 | 55.8% |
Sensitivity = TP/(TP + FN); specificity = TN/(TN + FP); burden index = (TP + FP)/(TP + FN + TN + FP).
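The three measures can be checked directly against the table's confusion counts; a minimal sketch (the function name is ours), verified against the DNN + SLN row:

```python
def metrics(tp, fn, tn, fp):
    """Compute the three table measures from the confusion counts."""
    sensitivity = tp / (tp + fn)               # fraction of food images found
    specificity = tn / (tn + fp)               # fraction of non-food images rejected
    burden = (tp + fp) / (tp + fn + tn + fp)   # share of images a reviewer must inspect
    return sensitivity, specificity, burden

# DNN + SLN row of the comparison table: TP=1,750, FN=311, TN=10,545, FP=2,256.
s, p, b = metrics(1750, 311, 10545, 2256)
print(round(s, 2), round(p, 2), round(b * 100, 1))  # → 0.85 0.82 27.0
```

The burden index captures the practical payoff: at 27.0%, researchers inspect roughly a quarter of the recorded images instead of all of them.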
Figure 4. ROC curve of linear discriminant analysis: the blue curve and the red star represent, respectively, the results of the LDA and the SVM classifier. These results indicate that the SVM classifier performs better than the LDA because the red star at (0.18, 0.85) is closer to the ideal point (0, 1).
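The caption's comparison criterion, distance from an ROC operating point (false positive rate, true positive rate) to the ideal corner (0, 1), is straightforward to compute. The helper name is ours; the point (0.18, 0.85) is the SVM operating point quoted in the caption.

```python
import math

def distance_to_ideal(fpr, tpr):
    """Euclidean distance from an ROC operating point to the ideal
    corner (0, 1); smaller means a better classifier trade-off."""
    return math.hypot(fpr, 1.0 - tpr)

# SVM operating point reported in Figure 4.
print(round(distance_to_ideal(0.18, 0.85), 3))  # → 0.234
```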
Averaged performance over five random trials (mean ± standard deviation).

| Method | Sensitivity | Specificity | Burden index |
| --- | --- | --- | --- |
| DNN + SLN | 0.856 ± 0.014 | 0.831 ± 0.005 | 26.3% ± 0.4% |
| DNN (Clarifai, threshold = 0.7) | 0.637 ± 0.007 | 0.819 ± 0.004 | 24.4% ± 0.3% |
| DNN (Clarifai, threshold = 0.6) | 0.723 ± 0.008 | 0.730 ± 0.003 | 33.2% ± 0.3% |
| DNN (Clarifai, threshold = 0.5) | 0.798 ± 0.007 | 0.625 ± 0.003 | 43.3% ± 0.2% |
| DNN (Clarifai, threshold = 0.4) | 0.858 ± 0.006 | 0.514 ± 0.003 | 53.8% ± 0.3% |
Figure 5. Examples of (A) food images misclassified as non-food images and (B) non-food images misclassified as food images.