Fabio Mendoza Palechor, Alexis de la Hoz Manotas.
Abstract
This paper presents data for estimating obesity levels in individuals from Mexico, Peru and Colombia, based on their eating habits and physical condition. The data contains 17 attributes and 2111 records. Each record is labeled with the class variable NObesity (Obesity Level), which allows the data to be classified using the values Insufficient Weight, Normal Weight, Overweight Level I, Overweight Level II, Obesity Type I, Obesity Type II and Obesity Type III. Of the data, 77% was generated synthetically using the Weka tool and the SMOTE filter, and the remaining 23% was collected directly from users through a web platform. This data can be used to build intelligent computational tools that identify the obesity level of an individual and recommender systems that monitor obesity levels. For discussion and more information on the dataset creation, please refer to the full-length article "Obesity Level Estimation Software based on Decision Trees" (De-La-Hoz-Correa et al., 2019).
Keywords: Data mining; Obesity; SMOTE; Weka
Year: 2019 PMID: 31467953 PMCID: PMC6710633 DOI: 10.1016/j.dib.2019.104344
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Questions of the survey used for the initial collection of information.
| Questions | Possible Answers |
|---|---|
| What is your gender? | Female; Male |
| What is your age? | Numeric value |
| What is your height? | Numeric value in meters |
| What is your weight? | Numeric value in kilograms |
| Has a family member suffered or suffers from overweight? | Yes; No |
| Do you eat high caloric food frequently? | Yes; No |
| Do you usually eat vegetables in your meals? | Never; Sometimes; Always |
| How many main meals do you have daily? | Between 1 and 2; Three; More than three |
| Do you eat any food between meals? | No; Sometimes; Frequently; Always |
| Do you smoke? | Yes; No |
| How much water do you drink daily? | Less than a liter; Between 1 and 2 L; More than 2 L |
| Do you monitor the calories you eat daily? | Yes; No |
| How often do you have physical activity? | I do not have; 1 or 2 days; 2 or 4 days; 4 or 5 days |
| How much time do you use technological devices such as cell phone, videogames, television, computer and others? | 0–2 hours; 3–5 hours; More than 5 hours |
| How often do you drink alcohol? | I do not drink; Sometimes; Frequently; Always |
| Which transportation do you usually use? | Automobile; Motorbike; Bike; Public Transportation; Walking |
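Several of the survey answers form ordered scales (for example Never, Sometimes, Always), so one plausible way to prepare them for data mining is to map each answer to an integer code. The sketch below illustrates that idea only. The question keys (`vegetables`, `water`, and so on) and the integer codes are assumptions made for this example, not the attribute names or encoding used in the published dataset.

```python
# Ordered answer scales copied from the survey table.
# The dictionary keys and the resulting integer codes are hypothetical.
ORDINAL_SCALES = {
    "vegetables": ["Never", "Sometimes", "Always"],
    "food_between_meals": ["No", "Sometimes", "Frequently", "Always"],
    "water": ["Less than a liter", "Between 1 and 2 L", "More than 2 L"],
    "alcohol": ["I do not drink", "Sometimes", "Frequently", "Always"],
}

# Yes/No questions are mapped to 1/0 (also an assumption).
BINARY = {"Yes": 1, "No": 0}


def encode_answer(question, answer):
    """Return the position of an answer on its scale, or 1/0 for Yes/No."""
    if question in ORDINAL_SCALES:
        return ORDINAL_SCALES[question].index(answer)
    return BINARY[answer]


# One hypothetical survey response.
response = {"vegetables": "Sometimes", "water": "More than 2 L", "smoke": "No"}
encoded = {q: encode_answer(q, a) for q, a in response.items()}
print(encoded)
```

Unordered answers such as the transportation question would not fit this ordinal scheme and would need a different treatment, for example one indicator per category.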
Fig. 1. Unbalanced distribution of data regarding the obesity levels category.
Fig. 2. Balanced distribution of data regarding the obesity levels category.
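According to the abstract, the balancing between Fig. 1 and Fig. 2 was done with the SMOTE filter in Weka. The core idea of SMOTE is to create synthetic records for an under-represented class by interpolating between a record and one of its nearest neighbours in the same class. The following is a minimal sketch of that interpolation step for numeric attributes only. It is not the Weka implementation, and the function name `smote_sample`, its parameters, and the sample records are all hypothetical.

```python
import math
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority-class point and one of its k nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base point, excluding the point itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical (height in m, weight in kg) records of one minority class.
minority = [(1.60, 48.0), (1.65, 50.0), (1.70, 52.0), (1.75, 55.0)]
new_points = smote_sample(minority)
print(new_points)
```

Because every synthetic point lies on a segment between two real records, it always falls inside the range already covered by the minority class. Features on very different scales (meters versus kilograms here) would normally be normalized before computing distances.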
Specifications table
| Subject area | |
| More specific subject area | |
| Type of data | |
| How data was acquired | |
| Data format | |
| Experimental factors | Data was retrieved from online survey and preprocessed including missing and atypical data deletion, and data normalization |
| Experimental features | |
| Data source location | |
| Data accessibility | |
| Related research article | |
This data presents information from different locations (Mexico, Peru and Colombia) and can be used to estimate obesity levels based on the nutritional behavior of several regions. The data can be used to estimate the obesity level of individuals using seven categories, allowing a detailed analysis of the degree to which an individual is affected. The structure and amount of data make it suitable for different data mining tasks such as classification, prediction, segmentation and association. The data can be used to build software tools for estimating obesity levels. The data can be used to validate the impact of several factors that contribute to the onset of obesity problems.
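The specifications table states that the survey data was preprocessed by deleting missing and atypical data and by normalizing. The source does not say which rules were applied, so the sketch below is only one plausible reading: records with any missing value are dropped, weights outside a fixed range are treated as atypical, and the remaining values are rescaled with min-max normalization. The function names, field names, thresholds and sample records are all assumptions for illustration.

```python
def drop_missing(records):
    """Keep only records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def drop_atypical(records, field, low, high):
    """Keep only records whose field lies in [low, high] (assumed rule)."""
    return [r for r in records if low <= r[field] <= high]


def min_max_normalize(records, field):
    """Rescale one field to the range [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [
        {**r, field: (r[field] - lo) / span if span else 0.0}
        for r in records
    ]


# Hypothetical raw survey records (age in years, weight in kg).
raw = [
    {"age": 21, "weight": 64.0},
    {"age": 30, "weight": None},   # missing value, dropped
    {"age": 25, "weight": 900.0},  # atypical value, dropped
    {"age": 40, "weight": 84.0},
    {"age": 35, "weight": 74.0},
]

clean = drop_atypical(drop_missing(raw), "weight", low=30.0, high=250.0)
normalized = min_max_normalize(clean, "weight")
print(normalized)
```

Deleting atypical records before normalizing matters here: a single extreme weight would otherwise stretch the min-max range and compress all ordinary values toward zero.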