| Literature DB >> 31489909 |
Cristián Castillo-Olea1, Begonya García-Zapirain Soto2, Christian Carballo Lozano3, Clemente Zuñiga4.
Abstract
This paper presents a study based on data analysis of the sarcopenia level in older adults. Sarcopenia is a prevalent pathology in adults from around 50 years of age, whereby muscle mass decreases by 1 to 2% a year, and muscle strength declines by 1.5% annually between 50 and 60 years of age, with this decline accelerating to 3% per year thereafter. The World Health Organisation estimates that 5-13% of individuals between 60 and 70 years of age and 11-50% of persons aged 80 or over have sarcopenia. This study was conducted with 166 patients and 99 variables. Demographic data were compiled, including age, gender, place of residence, schooling, marital status, level of education, income, profession, and financial support from the State of Baja California, and biochemical parameters such as glycemia, cholesterolemia, and triglyceridemia were determined. A total of 166 patients took part in the study, with an average age of 77.24 years. The purpose of the study was to provide an automatic classifier of sarcopenia level in older adults using artificial intelligence, in addition to identifying the weight of each variable used in the study. We used machine learning techniques in this work, in which 10 classifiers were employed to assess the variables and determine which would provide the best results, namely, Nearest Neighbors (3), Linear SVM (Support Vector Machines) (C = 0.025), RBF (Radial Basis Function) SVM (gamma = 2, C = 1), Gaussian Process (RBF (1.0)), Decision Tree (max_depth = 3), Random Forest (max_depth = 3, n_estimators = 10), MLP (Multilayer Perceptron) (alpha = 1), AdaBoost, Gaussian Naive Bayes, and QDA (Quadratic Discriminant Analysis).
Feature selection determined by the mean of the variable ranking suggests that Age, Systolic Arterial Hypertension (HAS), Mini Nutritional Assessment (MNA), Number of chronic diseases (ECNumber), and Sodium are the five most important variables in determining the sarcopenia level, and are thus of great importance prior to establishing any treatment or preventive measure. Analysis of the relationships between these variables and the classifiers used for moderate and severe sarcopenia revealed that the RBF SVM classifier with the Age, HAS, MNA, ECNumber, and Sodium variables achieved 82.5% accuracy, a 90.2% F1 score, and 82.8% precision.
Keywords: diagnosis; machine learning; sarcopenia
Year: 2019 PMID: 31489909 PMCID: PMC6765933 DOI: 10.3390/ijerph16183275
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
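The ten classifiers named in the abstract correspond to standard scikit-learn estimators. A minimal sketch (not the authors' code) of how they can be instantiated with the stated hyperparameters:

```python
# Sketch: the ten classifiers from the study, with the hyperparameters
# reported in the abstract, instantiated via scikit-learn.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

classifiers = {
    "Nearest Neighbors (3)": KNeighborsClassifier(3),
    "Linear SVM (C = 0.025)": SVC(kernel="linear", C=0.025),
    "RBF SVM (gamma = 2, C = 1)": SVC(gamma=2, C=1),
    "Gaussian Process (RBF (1.0))": GaussianProcessClassifier(1.0 * RBF(1.0)),
    "Decision Tree (max_depth = 3)": DecisionTreeClassifier(max_depth=3),
    "Random Forest (max_depth = 3, n_estimators = 10)":
        RandomForestClassifier(max_depth=3, n_estimators=10),
    "MLP (alpha = 1)": MLPClassifier(alpha=1),
    "AdaBoost": AdaBoostClassifier(),
    "Gaussian Naive Bayes": GaussianNB(),
    "QDA": QuadraticDiscriminantAnalysis(),
}
```

Any estimator in this dictionary exposes the usual `fit`/`predict` interface, so the whole set can be looped over when comparing results across datasets.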
Risk factors associated with sarcopenia [7].
| Risk Factors | Chronic Diseases |
|---|---|
| Constitutional | Cognitive impairment |
| Female gender | Mood disorders |
| Low weight at birth | Diabetes mellitus |
| Genetic predisposition | Heart failure |
| Lifestyle | Liver failure |
| Malnutrition | Kidney failure |
| Low protein intake | Shortness of breath |
| Smoking habit | Osteoarthritis |
| Physical inactivity | Chronic pain |
| Living conditions | Obesity |
| Inanition | Catabolic effects of drugs |
| Being bedridden | Cancer |
| Weightlessness | Chronic inflammatory diseases |
Assessment criteria at the Tijuana General Hospital.
| Gender | Proportion | Body Mass Index (BMI) | Grip Strength (kg) | Walking Speed (m/s) |
|---|---|---|---|---|
| Women | 65% | <6.1 kg/m² | <20 | <0.8 |
| Men | 35% | <8.5 kg/m² | <30 | <0.8 |
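The gender-specific cut-offs in the table can be expressed as a small lookup; a sketch only, with grip strength in kg and walking speed in m/s assumed as units:

```python
# Gender-specific assessment thresholds from the table above (sketch).
CUTOFFS = {
    "Women": {"bmi": 6.1, "grip": 20.0, "speed": 0.8},
    "Men":   {"bmi": 8.5, "grip": 30.0, "speed": 0.8},
}

def below_cutoffs(gender, bmi, grip, speed):
    """Return which of the three criteria fall below the threshold."""
    c = CUTOFFS[gender]
    return {
        "bmi": bmi < c["bmi"],
        "grip": grip < c["grip"],
        "speed": speed < c["speed"],
    }
```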
Metrics.
| Metric | Formula |
|---|---|
| Accuracy | (TP + TN) / (TP + TN + FP + FN) |
| Precision | TP / (TP + FP) |
| F1 | 2·TP / (2·TP + FP + FN) |

TP = true positives, TN = true negatives, FP = false positives, FN = false negatives.
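The three metrics can be computed directly from confusion-matrix counts; a minimal sketch (the counts used below are illustrative, not study data):

```python
# Standard classification metrics from confusion-matrix counts.
def accuracy(tp, tn, fp, fn):
    # Fraction of all cases classified correctly.
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    # Fraction of positive predictions that are correct.
    return tp / (tp + fp)

def f1(tp, fp, fn):
    # Harmonic mean of precision and recall.
    p = precision(tp, fp)
    r = tp / (tp + fn)  # recall
    return 2 * p * r / (p + r)
```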
Types of classifier.
| # | Classifier | Description |
|---|---|---|
| 1 | Nearest Neighbors (3) | 3-Nearest Neighbours |
| 2 | Linear SVM (C = 0.025) | Linear Support Vector Machine |
| 3 | RBF SVM (gamma = 2, C = 1) | Support Vector Machine with Radial Basis Function kernel |
| 4 | Gaussian Process (RBF (1.0)) | Gaussian Process classifier with RBF kernel |
| 5 | Decision Tree (max_depth = 3) | Decision Tree of depth 3 |
| 6 | Random Forest (max_depth = 3, n_estimators = 10) | Random Forest of 10 trees of depth 3 |
| 7 | MLP (alpha = 1) | Multi-Layer Perceptron |
| 8 | AdaBoost | AdaBoost classifier |
| 9 | Gaussian Naive Bayes | Naive Bayes classifier |
| 10 | QDA | Quadratic Discriminant Analysis classifier |
Classifier results.
| Dataset | Classifier | Accuracy | F1 | Precision |
|---|---|---|---|---|
| 1 | Nearest Neighbors (3) | 0.819 | 0.895 | 0.843 |
| 1 | Linear SVM (C = 0.025) | 0.813 | 0.897 | 0.813 |
| 1 | RBF SVM (gamma = 2, C = 1) | 0.825 | 0.902 | 0.828 |
| 1 | Gaussian Process (RBF (1.0)) | 0.813 | 0.897 | 0.813 |
| 1 | Decision Tree (max_depth = 3) | 0.831 | 0.900 | 0.864 |
| 1 | Random Forest (max_depth = 3, n_estimators = 10) | 0.825 | 0.901 | 0.836 |
| 1 | MLP (alpha = 1) | 0.807 | 0.888 | 0.836 |
| 1 | AdaBoost | 0.783 | 0.871 | 0.841 |
| 1 | Gaussian Naive Bayes | 0.801 | 0.883 | 0.844 |
| 1 | QDA | 0.789 | 0.876 | 0.833 |
| 2 | Nearest Neighbors (3) | 0.795 | 0.879 | 0.840 |
| 2 | Linear SVM (C = 0.025) | 0.813 | 0.897 | 0.813 |
| 2 | RBF SVM (gamma = 2, C = 1) | 0.813 | 0.897 | 0.813 |
| 2 | Gaussian Process (RBF (1.0)) | 0.813 | 0.897 | 0.813 |
| 2 | Decision Tree (max_depth = 3) | 0.795 | 0.879 | 0.844 |
| 2 | Random Forest (max_depth = 3, n_estimators = 10) | 0.825 | 0.902 | 0.827 |
| 2 | MLP (alpha = 1) | 0.819 | 0.892 | 0.864 |
| 2 | AdaBoost | 0.789 | 0.874 | 0.847 |
| 2 | Gaussian Naive Bayes | 0.814 | 0.886 | 0.867 |
| 2 | QDA | 0.826 | 0.894 | 0.875 |
| 3 | Nearest Neighbors (3) | 0.783 | 0.874 | 0.824 |
| 3 | Linear SVM (C = 0.025) | 0.813 | 0.897 | 0.813 |
| 3 | RBF SVM (gamma = 2, C = 1) | 0.813 | 0.897 | 0.813 |
| 3 | Gaussian Process (RBF (1.0)) | 0.813 | 0.897 | 0.813 |
| 3 | Decision Tree (max_depth = 3) | 0.819 | 0.897 | 0.840 |
| 3 | Random Forest (max_depth = 3, n_estimators = 10) | 0.795 | 0.886 | 0.810 |
| 3 | MLP (alpha = 1) | 0.814 | 0.890 | 0.852 |
| 3 | AdaBoost | 0.777 | 0.868 | 0.837 |
| 3 | Gaussian Naive Bayes | 0.765 | 0.855 | 0.863 |
| 3 | QDA | 0.635 | 0.708 | 0.791 |
| 4 | Nearest Neighbors (3) | 0.783 | 0.878 | 0.807 |
| 4 | Linear SVM (C = 0.025) | 0.777 | 0.873 | 0.810 |
| 4 | RBF SVM (gamma = 2, C = 1) | 0.813 | 0.897 | 0.813 |
| 4 | Gaussian Process (RBF (1.0)) | 0.789 | 0.881 | 0.813 |
| 4 | Decision Tree (max_depth = 3) | 0.765 | 0.842 | 0.866 |
| 4 | Random Forest (max_depth = 3, n_estimators = 10) | 0.801 | 0.890 | 0.811 |
| 4 | MLP (alpha = 1) | 0.753 | 0.854 | 0.818 |
| 4 | AdaBoost | 0.729 | 0.831 | 0.831 |
| 4 | Gaussian Naive Bayes | 0.234 | 0.178 | 0.412 |
| 4 | QDA | 0.784 | 0.878 | 0.807 |
DataSET group.
| Dataset | Variables |
|---|---|
| 1 | ‘Age’, ‘HAS’, ‘MNA’, ‘ECNumber’, ‘Sodium’ |
| 2 | ‘Age’, ‘HAS’, ‘MNA’, ‘ECNumber’, ‘Sodium’, ‘Drugs’, ‘Lawton’ |
| 3 | ‘Age’, ‘HAS’, ‘MNA’, ‘ECNumber’, ‘Sodium’, ‘Drugs’, ‘Lawton’, ‘Hb’, ‘Dementia’, ‘TNCM’, ‘Charlson’, ‘Profession’, ‘FinSupport’ |
| 4 | ‘Status’, ‘Gender’, ‘Age’, ‘Schooling’, ‘LevelofStudies’, ‘MaritalStatus’, ‘Carer’, ‘Religion’, ‘Residence’, ‘Profession’, ‘Income’, ‘FinSupport’, ‘Sight’, ‘VisualCorrection’, ‘Hearing’, ‘HearingCorrection’, ‘ECNumber’, ‘HAS’, ‘DMII’, ‘OA’, ‘OSTEOP’, ‘GASTRITIS’, ‘DEPRE’, ‘CARDIO’, ‘TNCM’, ‘PARKIN’, ‘HIPOT’, ‘HIPERT’, ‘CANCER’, ‘EPOC’, ‘DISLIP’, ‘IRC’, ‘OTHERS’, ‘LiverFailure’, ‘SmokingHabit’, ‘Alcoholism’, ‘Drugs’, ‘ExpBiomass’, ‘MMSE’, ‘GDS’, ‘Depression’, ‘Barthel’, ‘Falls’, ‘NumberofFalls’, ‘Ulcers’, ‘Norton’, ‘Lawton’, ‘MNA’, ‘Charlson’, ‘TallaMts’, ‘Dementia’, ‘Cognition’, ‘EVC’, ‘Infection’, ‘Pain’, ‘Cancer’, ‘Hb’, ‘Urea’, ‘Creatinine’, ‘Albumin’, ‘Glucose’, ‘Sodium’ |
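The four datasets are nested subsets of the same variable list, so each one can be expressed as a column selection. A sketch, assuming the study variables live in a pandas DataFrame `df` whose column names match the table above:

```python
# Nested feature subsets corresponding to DataSETs 1-3; DataSET 4 uses
# the full variable list (sketch, not the authors' code).
import pandas as pd

DATASET_1 = ["Age", "HAS", "MNA", "ECNumber", "Sodium"]
DATASET_2 = DATASET_1 + ["Drugs", "Lawton"]
DATASET_3 = DATASET_2 + ["Hb", "Dementia", "TNCM", "Charlson",
                         "Profession", "FinSupport"]

def subset(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Return the feature matrix restricted to one dataset's variables."""
    return df[columns]
```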
Figure 1. Variable ranking.
Comparison of results.
| Classifiers | DataSET 1 ACC | DataSET 1 F1 | DataSET 1 P | DataSET 2 ACC | DataSET 2 F1 | DataSET 2 P | DataSET 3 ACC | DataSET 3 F1 | DataSET 3 P | DataSET 4 ACC | DataSET 4 F1 | DataSET 4 P | DataSET Final |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RBF SVM (gamma = 2, C = 1) | 0.825 | 0.902 | 0.828 | 0.813 | 0.897 | 0.813 | 0.813 | 0.897 | 0.813 | 0.813 | 0.897 | 0.813 | 1, 2, 3, 4 |
| Decision Tree (max_depth = 3) | 0.831 | 0.9 | 0.864 | 0.795 | 0.879 | 0.844 | 0.819 | 0.897 | 0.84 | 0.765 | 0.842 | 0.866 | 1, 3 |
| Random Forest (max_depth = 3, n_estimators = 10) | 0.825 | 0.901 | 0.836 | 0.825 | 0.902 | 0.827 | 0.795 | 0.886 | 0.810 | 0.801 | 0.89 | 0.811 | 1, 2, 4 |
| Linear SVM (C = 0.025) | 0.813 | 0.897 | 0.813 | 0.813 | 0.897 | 0.813 | 0.813 | 0.897 | 0.813 | 0.765 | 0.842 | 0.866 | 2, 3 |
ACC = accuracy, P = precision.
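A comparison table of this shape can be produced by evaluating each classifier on each dataset's feature subset and recording accuracy, F1, and precision. A sketch under stated assumptions: `X` is the feature DataFrame, `y` the sarcopenia label, `datasets` a mapping from dataset name to column list, and a simple hold-out split stands in for whatever validation scheme the authors used:

```python
# Sketch of an evaluation loop over classifiers and dataset subsets.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, precision_score

def evaluate(classifiers, datasets, X, y, seed=0):
    """Return {(dataset_name, clf_name): (accuracy, f1, precision)}."""
    results = {}
    for ds_name, cols in datasets.items():
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[cols], y, test_size=0.2, random_state=seed, stratify=y)
        for clf_name, clf in classifiers.items():
            clf.fit(X_tr, y_tr)
            pred = clf.predict(X_te)
            results[(ds_name, clf_name)] = (
                accuracy_score(y_te, pred),
                f1_score(y_te, pred),       # binary average assumed here
                precision_score(y_te, pred),
            )
    return results
```

For the multi-class sarcopenia levels in the study, `f1_score` and `precision_score` would need an explicit `average` argument (e.g. `average="weighted"`).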