Marc Lamarine, Jörg Hager, Wim H M Saris, Arne Astrup, Armand Valsesia.
Abstract
Aim of Study: Weighed food diaries provide a powerful method to quantify food and nutrient intakes in nutritional studies. Yet mapping these records onto food composition tables (FCTs) is a challenging, time-consuming and error-prone process. Experts currently perform this mapping manually, and no automated approach has previously been proposed. Our study aimed to assess automated approaches for mapping food items onto FCTs.
Keywords: dietary studies; food composition tables; food diaries; food mapping; fuzzy matching; macronutrient
Year: 2018 PMID: 29868600 PMCID: PMC5954085 DOI: 10.3389/fnut.2018.00038
Source DB: PubMed Journal: Front Nutr ISSN: 2296-861X
Figure 1. Automated approaches for food item mapping. (A) Concept using fuzzy matching; (B) extension of the fuzzy matching using a machine learning classifier (C5.0 trees).
Figure 2. Proof of concept: performance at mapping existing items onto FCTs. (A) Example with DiOGenes NL items mapped onto NEVO; (B) example with all DiOGenes items mapped onto EuroFIR.
Figure 3. Precision and recall curves, as a function of fuzzy matching scores. Left plots correspond to a mapping using the original food names; right plots correspond to a mapping using their English translation.
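The fuzzy matching concept of Figure 1(A) can be illustrated with a minimal sketch. The record does not state which string-similarity function the authors used, so Python's standard `difflib.SequenceMatcher` is swapped in here as a stand-in, rescaled to 0-100. The function names, the example FCT entries and the threshold value passed below are all hypothetical, and a threshold tuned for one scoring function would not transfer to another.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def fuzzy_score(a: str, b: str) -> float:
    """Case-insensitive similarity of two food names on a 0-100 scale.

    Assumption: difflib's ratio stands in for the (unspecified) score
    used in the study.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(
    item: str, fct_names: List[str], threshold: float
) -> Optional[Tuple[str, float]]:
    """Return (FCT name, score) for the best candidate, or None if the
    best score is below the threshold (no plausible match)."""
    if not fct_names:
        return None
    scored = [(name, fuzzy_score(item, name)) for name in fct_names]
    name, score = max(scored, key=lambda pair: pair[1])
    return (name, score) if score >= threshold else None


# Hypothetical FCT entries and diary items, for illustration only.
fct = ["Milk, semi-skimmed", "Bread, wholemeal", "Apple, raw"]
print(best_match("semi-skimmed milk", fct, threshold=63.0))
print(best_match("xyz", fct, threshold=63.0))  # no shared characters, so None
```

Items rejected at the threshold would still need manual review, which is why the recall reported alongside each threshold matters as much as the precision.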
Thresholds required to achieve 75 or 80% precision, with the fuzzy matching approach.
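| Precision target | Item class | Original name: threshold | Original name: precision (%) | Original name: recall (%) | English-translated name: threshold | English-translated name: precision (%) | English-translated name: recall (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Precision ≥ 75% | Any | 50 | 88.75 | 99.49 | 50 | 79.11 | 100.00 |
| Precision ≥ 75% | Directly mappable | 50 | 97.34 | 100.00 | 50 | 95.53 | 100.00 |
| Precision ≥ 75% | Others | 63 | 76.84 | 50.00 | 75 | 78.49 | 34.81 |
| Precision ≥ 80% | Any | 50 | 88.75 | 99.49 | 50 | 79.11 | 100.00 |
| Precision ≥ 80% | Directly mappable | 50 | 97.34 | 100.00 | 50 | 95.53 | 100.00 |
| Precision ≥ 80% | Others | 65 | 80.25 | 44.52 | 80 | 89.78 | 21.73 |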
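How a threshold of this kind might be derived can be sketched as a simple scan over candidate scores. This is an illustration under stated assumptions, not the authors' procedure: it assumes the reported threshold is the lowest integer score (scanning from 50 to 100) at which precision reaches the target, and the scores and match labels below are invented toy data.

```python
def precision_recall_at(scores, is_true_match, threshold):
    """Precision and recall when 'score >= threshold' predicts a match.

    Precision is None when nothing is predicted as a match.
    """
    tp = sum(1 for s, t in zip(scores, is_true_match) if s >= threshold and t)
    fp = sum(1 for s, t in zip(scores, is_true_match) if s >= threshold and not t)
    fn = sum(1 for s, t in zip(scores, is_true_match) if s < threshold and t)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


def lowest_threshold_for_precision(scores, is_true_match, target,
                                   candidates=range(50, 101)):
    """Return (threshold, precision, recall) for the first candidate
    threshold whose precision reaches the target, or None if none does."""
    for threshold in candidates:
        precision, recall = precision_recall_at(scores, is_true_match, threshold)
        if precision is not None and precision >= target:
            return threshold, precision, recall
    return None


# Toy data (hypothetical): matching scores and whether each match is correct.
scores = [95, 90, 85, 70, 65, 60, 55]
truth = [True, True, True, False, True, False, False]

print(lowest_threshold_for_precision(scores, truth, target=0.75))  # (61, 0.8, 1.0)
print(lowest_threshold_for_precision(scores, truth, target=0.90))  # (71, 1.0, 0.75)
```

The toy run shows the same trade-off as the table: a stricter precision target pushes the threshold up and lowers recall.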
Threshold required to achieve 75% precision for all three item classes (“Any,” “Mappable,” “Other”), with the fuzzy matching approach. PR, precision (%); Rec, recall (%).
| Threshold | Food name | Any | Mappable | Other |
| --- | --- | --- | --- | --- |
| 63 | Original name | PR = 96.81, Rec = 95.19 | PR = 97.98, Rec = 99.29 | PR = 76.84, Rec = 50 |
| 75 | English-translated name | PR = 96.49, Rec = 86.55 | PR = 98, Rec = 96.14 | PR = 78.49, Rec = 34.81 |
Thresholds required to achieve 75 or 80% precision, with the machine learning approach.
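| Precision target | Item class | Original name: threshold | Original name: precision (%) | Original name: recall (%) | English-translated name: threshold | English-translated name: precision (%) | English-translated name: recall (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Precision ≥ 75% | Any | 50 | 84.55 | 97.98 | 50 | 78.86 | 99.34 |
| Precision ≥ 75% | Directly mappable | 50 | 97.41 | 98.23 | 50 | 95.60 | 99.64 |
| Precision ≥ 75% | Others | 85 | 77.31 | 57.19 | 84 | 76.85 | 55.12 |
| Precision ≥ 80% | Any | 50 | 84.55 | 97.98 | 64 | 79.53 | 97.87 |
| Precision ≥ 80% | Directly mappable | 50 | 97.41 | 98.23 | 50 | 95.60 | 99.64 |
| Precision ≥ 80% | Others | 92 | 79.90 | 54.45 | 92 | 100.00 | 6.54 |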
Threshold required to achieve 75% precision for all three item classes (“Any,” “Mappable,” “Other”), with the machine learning approach. PR, precision; Rec, recall.
| Threshold | Food name | Any | Mappable | Other |
| --- | --- | --- | --- | --- |
| 85 | Original name | PR = 97.89%, Rec = 80.5% | PR = 99.55%, Rec = 82.61% | PR = 77.31%, Rec = 57.19% |
| 84 | English-translated name | PR = 96.46%, Rec = 87.32% | PR = 99.23%, Rec = 93.29% | PR = 76.85%, Rec = 55.12% |
Figure 4. Precision-recall curves for the two approaches (fuzzy matching, machine learning), using either original or English-translated food names.
| | Actual plausible match | Actual non-plausible match |
| --- | --- | --- |
| Predicted plausible match (score ≥ threshold) | True positives (TP) | False positives (FP) |
| Predicted non-plausible match (score < threshold) | False negatives (FN) | True negatives (TN) |
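The confusion matrix above leads to precision and recall through the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). The short sketch below counts the four cells for a given threshold and applies those formulas. The function names and the toy scores are hypothetical and are not taken from the study.

```python
def confusion_counts(scores, is_true_match, threshold):
    """Count (TP, FP, FN, TN) when 'score >= threshold' predicts a
    plausible match and is_true_match holds the reference labels."""
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, is_true_match):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Standard definitions; NaN when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy data (hypothetical): five candidate matches scored 0-100.
scores = [95, 80, 70, 60, 55]
truth = [True, True, False, True, False]

tp, fp, fn, tn = confusion_counts(scores, truth, threshold=65)
print(tp, fp, fn, tn)                # 2 1 1 1
print(precision_recall(tp, fp, fn))  # both equal 2/3
```

True negatives do not enter either formula, so both metrics stay informative even when most candidate pairs are non-matches.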