Barbara Koroušić Seljak, Peter Korošec, Tome Eftimov, Marga Ocke, Jan van der Laan, Mark Roe, Rachel Berry, Sandra Patricia Crispim, Aida Turrini, Carolin Krems, Nadia Slimani, Paul Finglas.
Abstract
This paper identifies the requirements for computer-supported food matching in order to address current national, European and international needs, and represents an integrated research contribution of the FP7 EuroDISH project. The available classification and coding systems and the specific problems of food matching are summarized, and a new concept for food matching based on optimization methods and machine learning is proposed. To illustrate and test this concept, a study was conducted in four European countries (Germany, The Netherlands, Italy and the UK) using different classification and coding systems. This real case study enabled us to evaluate the new food matching concept and provide recommendations for future work. In the first stage of the study, we prepared subsets of food consumption data, described and classified using different systems, that had already been manually matched with national food composition data. Once the food matching algorithm had been trained on these data, it was tested on another subset of food consumption data. Experts from the different countries validated the food matching between consumption and composition data by selecting the best matches from the options given by the matching algorithm, without seeing the result of the earlier manual match. The evaluation of the study results stressed the importance of the role and quality of the food composition database relative to the selected classification and/or coding systems, and the need to continue compiling national food composition data, as eating habits and national dishes still vary between countries. Although some countries have collected extensive sets of food consumption data, these cannot be easily matched with food composition data if either the food consumption or the food composition data are not properly classified and described using a classification and coding system. The study also showed that the level of human expertise played an important role, at least in the training stage. Both sets of data require continuous development to improve their quality in dietary assessment.
Keywords: food composition; food consumption; food matching; optimization algorithm
Year: 2018 PMID: 29601516 PMCID: PMC5946218 DOI: 10.3390/nu10040433
Source DB: PubMed Journal: Nutrients ISSN: 2072-6643 Impact factor: 5.717
Testing datasets of manually matched food consumption and food composition data.
| Parameter | Germany | The Netherlands | Italy |
|---|---|---|---|
| | German National Nutrition Survey II (NVS II) | Dutch National Food Consumption Survey [ | INRAN-SCAI |
| | 2005–2007 | 2007–2010 | 2005–2006 |
| | 14–80 years | 7–69 years | 0–97 years |
| | 2 × 24 h-dietary recalls | 2 × 24 h-dietary recalls | 3 consecutive day food record |
| | Telephone | Telephone/Face to face home | Free-writing by the subject |
| | GloboDiet Software (SW) | GloboDiet SW | INRAN-DIARIO 3.1 SW |
| | The food description based on GloboDiet facets and descriptors, adapted to German-specific requirements | The food description based on GloboDiet facets and descriptors, adapted to Dutch-specific requirements | Predefined food categories + addition of new codes; FoodEx1 categories for the EFSA Comprehensive database assigned |
| | Microsoft Access database | Microsoft Excel | Microsoft Excel |
| | Food number, food name (English and German), food groups, facet and descriptor combinations, counts, food composition code, food composition name (German) | Food number, food name (English and Dutch), food groups, facet and descriptor combinations, counts, food composition code | Food number, food name, food group, food sub-group; food composition table (FCT) assigned code, FoodEx2 code and facets |
| | For the facet “brand name” only the information brand name known or not known was exported. Recipes are systematically split into their ingredients. | All brand name information is kept; recipes are systematically split into their ingredients | FoodEx2 mapping |
| | About 23,000 food and facet descriptor combinations | About 27,000 food and facet descriptor combinations | About 1500 food items (categories/products) |
| | German, English | Dutch, English | Italian, English |
Figure 1. Example of FoodEx2 description and classification. * H: ‘hierarchy term’; ** C: ‘core term’; *** E: ‘extended term’; + g: ‘group’; ++ r: ‘raw’; +++ d: ‘derivative’.
Figure 2. Example of LanguaL description and classification.
Figure 3. GloboDiet to LanguaL food matches.
Figure 4. FoodEx2 to LanguaL food matches.
Characteristics of the food composition databases available in the EuroFIR information platform (except BLS) and used for training purposes. These characteristics have changed since the study presented in the paper. Interested readers can find more details on the EuroFIR website [9].
| Parameter | Germany | The Netherlands | Italy | UK |
|---|---|---|---|---|
| | BLS (Bundeslebensmittelschlüssel) German nutrient database; version 3.01; 2013 | NEVO Netherlands food composition database extended; version 3.0; 2011 | INRAN Italian food composition database; 2008 | Composition of Foods Integrated Dataset (CoFIDs); 2008 |
| | ~14,800 foods | 2174 foods | 790 foods | ~3500 foods |
| | ~1200 foods | 2174 foods | 790 foods | ~3500 foods |
| | Database accessible through login at [ | Database accessible through login at [ | Database accessible through login at [ | Database accessible through login at [ |
| | German | Dutch, English | Italian, English | English |
| | 1200 foods | 2174 foods | 790 foods | 1682 foods |
Figure 5. Estimated average number of suggested matches per query by quality level (level 0 is best quality) as indicated by the matching tool. Matches of level higher than 2 are not presented in the graph.
Figure 6. Percentage of poor, sufficient and good quality matches by country as selected and judged by the human experts. * Good: matches in all parameters; ** sufficient: matches in the name and group, but not in all descriptors; *** poor: all other matches.
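The three judgement categories used in Figure 6 amount to a simple decision rule. The following is a minimal Python sketch of that rule, assuming a food record can be reduced to a name, a food group and a set of descriptors. The `FoodRecord` type, its fields and the `judge_match` function are hypothetical names introduced for illustration, not part of the study's matching tool, and exact equality stands in here for the experts' judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodRecord:
    """Hypothetical, simplified food record (assumption, not the study's schema)."""
    name: str
    group: str
    descriptors: frozenset


def judge_match(consumed: FoodRecord, candidate: FoodRecord) -> str:
    """Label a match 'good', 'sufficient' or 'poor' following the Figure 6 legend."""
    same_name_and_group = (
        consumed.name == candidate.name and consumed.group == candidate.group
    )
    if same_name_and_group and consumed.descriptors == candidate.descriptors:
        return "good"        # matches in all parameters
    if same_name_and_group:
        return "sufficient"  # name and group match, descriptors differ
    return "poor"            # all other matches


apple = FoodRecord("apple", "fruit", frozenset({"raw"}))
print(judge_match(apple, FoodRecord("apple", "fruit", frozenset({"dried"}))))
```

In practice the experts compared free-text names and coded descriptors, so a real implementation would need a looser notion of "same name" than string equality.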
Figure 7. Distribution of quality levels as indicated by the matching tool (level 0 = best quality) for foods with a good, sufficient or poor match as judged by the human expert, for three countries. For the UK only level 0 matches were proposed, so these are not included in the figure.
Figure 8. Percentage of foods from the food consumption survey that were assigned the same match from the food composition database in the testing exercises and in the previous manual match.
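The agreement measure described in Figure 8 can be computed as a plain percentage. The sketch below assumes that both the tool-assisted matches and the earlier manual matches are stored as dictionaries mapping a consumption food code to a composition food code. The function name `agreement_percentage` and the codes in the example are hypothetical and do not come from the study data.

```python
def agreement_percentage(tool_assisted: dict, manual: dict) -> float:
    """Percentage of foods given the same composition match in both exercises.

    Only foods present in both mappings are compared (an assumption of this sketch).
    """
    shared_foods = tool_assisted.keys() & manual.keys()
    if not shared_foods:
        raise ValueError("no foods in common between the two matching exercises")
    same = sum(1 for food in shared_foods if tool_assisted[food] == manual[food])
    return 100.0 * same / len(shared_foods)


# Made-up codes: three of the four foods receive the same match.
tool = {"f1": "c1", "f2": "c2", "f3": "c9", "f4": "c4"}
manual = {"f1": "c1", "f2": "c2", "f3": "c3", "f4": "c4"}
print(agreement_percentage(tool, manual))  # 75.0
```

A disagreement counted this way does not by itself mean the tool-assisted match is worse, since the earlier manual match is only a reference point and not a verified ground truth.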
Figure 9. Percentage accuracy compared to the expected match for an experienced and an inexperienced human expert (results for the UK).