Muhammad Farooq, Abul Doulah, Jason Parton, Megan A. McCrory, Janine A. Higgins, Edward Sazonov.
Abstract
Video observation has been widely used to provide ground truth for wearable food intake monitoring systems in controlled laboratory conditions; however, video observation requires that participants be confined to a defined space. The purpose of this analysis was to test an alternative approach for establishing activity types and food intake bouts in a relatively unconstrained environment. The accuracy of a wearable system for assessing food intake was compared with that of video observation, and the inter-rater reliability of annotation was also evaluated. Forty participants were enrolled. Multiple participants were simultaneously monitored in a four-bedroom apartment using six cameras for three days each. Participants could leave the apartment overnight and for short periods during the day, during which monitoring did not take place. A wearable system (Automatic Ingestion Monitor, AIM) was used to detect and monitor participants' food intake at a resolution of 30 s using a neural network classifier. Two food intake detection models were tested: one trained on data from an earlier study and the other on current study data using leave-one-out cross validation. Three trained human raters annotated the videos for major activities of daily living, including eating, drinking, resting, walking, and talking. They further annotated individual bites and chewing bouts for each food intake bout. For inter-rater reliability, the raters achieved an average (± standard deviation, STD) kappa of 0.74 (±0.02) for activity annotation and an average kappa (Light's kappa) of 0.82 (±0.04) for food intake annotation. Validity results showed that AIM food intake detection matched human video-annotated food intake with a kappa of 0.77 (±0.10) for activity annotation and 0.78 (±0.12) for food intake bout annotation. A one-way ANOVA suggested no statistically significant differences among the average eating durations estimated from the raters' annotations and from AIM predictions (p = 0.19). These results suggest that AIM provides accuracy comparable to video observation and may be used to reliably detect food intake in multi-day observational studies.
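The abstract reports Light's kappa for the three raters. For context, Light's kappa is simply the mean of all pairwise Cohen's kappa values; a minimal sketch of that computation (not the authors' code; the epoch labels below are hypothetical) is:

```python
# Light's kappa: mean of pairwise Cohen's kappas across raters.
from itertools import combinations

import numpy as np
from sklearn.metrics import cohen_kappa_score


def lights_kappa(rater_labels: list[np.ndarray]) -> float:
    """Mean pairwise Cohen's kappa over all rater pairs (Light's kappa)."""
    pairs = combinations(range(len(rater_labels)), 2)
    kappas = [cohen_kappa_score(rater_labels[i], rater_labels[j]) for i, j in pairs]
    return float(np.mean(kappas))


# Hypothetical example: three raters, 30-s epochs coded 1 = food intake, 0 = other.
r1 = np.array([0, 1, 1, 0, 1, 0])
r2 = np.array([0, 1, 1, 0, 0, 0])
r3 = np.array([0, 1, 1, 1, 1, 0])
print(lights_kappa([r1, r2, r3]))
```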
Keywords: AIM; chewing detection; dietary assessment; food intake detection; neural networks; obesity; sensor validation; video annotation
Year: 2019 PMID: 30871173 PMCID: PMC6472006 DOI: 10.3390/nu11030609
Source DB: PubMed Journal: Nutrients ISSN: 2072-6643 Impact factor: 5.717
Figure 1. Floorplan of the apartment and placement of the six cameras. Cameras were placed so that the coverage area was maximized.
Figure 2. A snapshot of the software used for video observation and annotation. The annotator can view all six cameras simultaneously and mark the start and end of different activities.
(a) Definitions of categories for activity annotation; (b) constraints placed on activity annotation.

| Category | Definition |
|---|---|
| Food intake | Participant was consuming solid food items or solid foods combined with liquids. Eating involved taking bites, chewing, and swallowing the food. |
| Drinking | Participant was consuming only liquids; no biting or chewing was involved. |
| Physically active | Participant was moving. |
| Physically sedentary | Participant was not in motion, e.g., sitting on a couch/chair, working on a computer, or lying on a bed. |
| Talking | Participant was talking to other participants or talking on the phone. |
| Out of view | Participant was not in the view of any of the six cameras. |

| # | Constraint |
|---|---|
| 1 | A participant cannot be physically active and physically sedentary at the same time. |
| 2 | A participant cannot be eating/drinking and talking at the same time. |
| 3 | A participant cannot be out of view and physically active at the same time, with the exception that time spent out with the research assistant getting food was considered physically active. |
| 4 | Restroom use was annotated as out of view. |
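The constraints in table (b) amount to mutual-exclusion rules over each epoch's label set. A hedged sketch of how such rules could be checked programmatically (label names and the `food_run` flag are illustrative, not from the paper):

```python
# Mutual-exclusion rules from table (b); label names are illustrative.
MUTUALLY_EXCLUSIVE = [
    ({"physically_active", "physically_sedentary"}, "constraint 1"),
    ({"food_intake", "talking"}, "constraint 2"),
    ({"drinking", "talking"}, "constraint 2"),
    ({"out_of_view", "physically_active"}, "constraint 3"),
]


def violated_constraints(labels: set[str], food_run: bool = False) -> list[str]:
    """Return the constraints violated by one epoch's label set.

    food_run=True marks the exception in constraint 3: the participant
    left with a research assistant to get food, which counts as active.
    """
    violations = []
    for pair, name in MUTUALLY_EXCLUSIVE:
        if pair <= labels:  # both labels present in this epoch
            if name == "constraint 3" and food_run:
                continue  # exception allowed by the protocol
            violations.append(name)
    return violations


print(violated_constraints({"food_intake", "talking"}))  # ['constraint 2']
print(violated_constraints({"out_of_view", "physically_active"}, food_run=True))  # []
```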
Figure 3. (a) The three-button system for annotating videos of food intake at both the activity level and the meal level; (b) example of the annotation procedure at both the activity and food intake bout levels.
Definitions of categories for food intake bout annotation.
| Category | Definition |
|---|---|
| Bite | The moment the participant placed food into the mouth and bit down. |
| Chewing bout | Jaw movement of the participant, tracked from immediately after a bite until the food was swallowed. |
| Out of view/frozen frame | Frozen video frames, or the participant was not visible in the selected camera view. |
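Because AIM makes decisions at a 30-s resolution, annotated bout intervals must be mapped onto 30-s epochs before annotation and prediction can be compared. A minimal sketch of one such mapping, assuming bout start/end times in seconds (not the authors' code):

```python
import numpy as np

EPOCH_S = 30  # AIM decision resolution reported in the paper


def intervals_to_epochs(bouts, total_s):
    """Label each 30-s epoch 1 if it overlaps any annotated bout, else 0."""
    n_epochs = int(np.ceil(total_s / EPOCH_S))
    labels = np.zeros(n_epochs, dtype=int)
    for start, end in bouts:
        first = int(start // EPOCH_S)
        last = int((end - 1e-9) // EPOCH_S)  # epoch containing the bout's end
        labels[first:last + 1] = 1
    return labels


# Hypothetical chewing bouts at 12-95 s and 300-340 s of a 10-min recording.
print(intervals_to_epochs([(12, 95), (300, 340)], total_s=600))
```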
Comparison of food intake detection between video-based human annotation and AIM predictions based on leave-one-out cross validation.
| | Kappa (Activity Level) | Kappa (Food Intake Bout Level) | F1-Score (Activity Level) | F1-Score (Food Intake Bout Level) |
|---|---|---|---|---|
| Mean | 0.77 | 0.76 | 0.80 | 0.78 |
| STD | 0.10 | 0.12 | 0.10 | 0.12 |
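For context, a minimal sketch of a leave-one-participant-out evaluation like the one summarized above, assuming per-epoch features `X`, binary food-intake labels `y`, and participant IDs in `groups`; the MLP is a stand-in, since the paper's network architecture is not reproduced here:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, f1_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def loo_food_intake_eval(X, y, groups):
    """Leave-one-participant-out evaluation; returns mean/STD kappa and F1."""
    kappas, f1s = [], []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        # Train on all participants except the held-out one.
        clf = make_pipeline(StandardScaler(), MLPClassifier(max_iter=500))
        clf.fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])
        kappas.append(cohen_kappa_score(y[test_idx], y_pred))
        f1s.append(f1_score(y[test_idx], y_pred))
    return np.mean(kappas), np.std(kappas), np.mean(f1s), np.std(f1s)
```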
Comparison of food intake detection between video-based human annotation and AIM predictions based on the model from an earlier study [10].
| | Kappa (Activity Level) | Kappa (Food Intake Bout Level) | F1-Score (Activity Level) | F1-Score (Food Intake Bout Level) |
|---|---|---|---|---|
| Mean | 0.74 | 0.71 | 0.77 | 0.74 |
| STD | 0.14 | 0.11 | 0.12 | 0.09 |
Statistics on total experiment duration and on eating duration at the activity level (video annotation), at the food intake bout level (video annotation), and as predicted by AIM. All durations are in minutes.
| | Total Duration | Activity Level (Video) | Food Intake Bout Level (Video) | AIM Predicted |
|---|---|---|---|---|
| Mean | 608.6 | 66.3 | 37.1 | 49.4 |
| STD | 63.5 | 30.4 | 13.4 | 13.7 |
| 25% | 589.0 | 45.9 | 27.3 | 40.1 |
| 50% | 619.3 | 55.3 | 35.3 | 48.3 |
| 75% | 647.4 | 78.1 | 43.4 | 57.5 |
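The p = 0.19 reported in the abstract comes from a one-way ANOVA over average eating durations. A hedged sketch of that comparison with SciPy (the duration arrays below are hypothetical placeholders, not study data):

```python
from scipy.stats import f_oneway

# Per-participant eating durations in minutes; values are illustrative only.
rater1 = [62.0, 48.5, 71.0, 55.5]  # rater 1 annotation
rater2 = [60.5, 50.0, 69.5, 57.0]  # rater 2 annotation
rater3 = [63.0, 47.5, 72.5, 54.0]  # rater 3 annotation
aim    = [58.0, 52.5, 68.0, 59.5]  # AIM-predicted durations

stat, p = f_oneway(rater1, rater2, rater3, aim)
print(f"F = {stat:.2f}, p = {p:.2f}")  # p > 0.05 would mirror the paper's p = 0.19
```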