Debbie Rankin, Michaela Black, Bronac Flanagan, Catherine F Hughes, Adrian Moore, Leane Hoey, Jonathan Wallace, Chris Gill, Paul Carlin, Anne M Molloy, Conal Cunningham, Helene McNulty.
Abstract
BACKGROUND: Machine learning techniques, specifically classification algorithms, may help identify the key health, nutritional, and environmental factors associated with cognitive function in aging populations.
Keywords: aging; classification; cognition; diet; geriatric assessment; supervised machine learning
Year: 2020 PMID: 32936084 PMCID: PMC7527918 DOI: 10.2196/20995
Source DB: PubMed Journal: JMIR Med Inform
General characteristics of the Trinity-Ulster and Department of Agriculture study participants.
| Characteristics | Males (n=1191) | Females (n=1678) |
|---|---|---|
| Age (years), mean (SD) | 72.1 (7.8) | 72.2 (7.8) |
| Education (years)ᵃ, mean (SD) | 16.3 (3.3) | 16.1 (2.8) |
| BMI (kg/m²), mean (SD) | 28.9 (4.3) | 28.7 (5.7) |
| Waist-to-hip ratio, mean (SD) | 0.97 (0.07) | 0.88 (0.07) |
| Instrumental activities of daily living, mean (SD) | 25.0 (4.1) | 24.9 (3.5) |
| Physical self-maintenance scale score, mean (SD) | 23.3 (1.6) | 23.1 (1.7) |
| Timed Up and Go (seconds), mean (SD) | 12.9 (9.1) | 13.0 (8.0) |
| Living alone, n (%) | 260 (21.8) | 632 (37.7) |
| Current smoker, n (%) | 122 (10.2) | 194 (11.6) |
| Alcohol (units/week), mean (SD) | 8.8 (14.6) | 2.9 (6.7) |
| Socioeconomically most deprived, n (%) | 291 (24.4) | 426 (25.4) |
| MMSEᵇ score, mean (SD) | 27.8 (1.4) | 27.9 (1.4) |
| RBANSᶜ score, mean (SD) | 87.3 (14.5) | 88.9 (15.2) |
| RBANS class=“low” (target), n (%)ᵈ | 133 (11.2) | 168 (10.0) |
| RBANS class=“high” (target), n (%)ᵈ | 1058 (88.8) | 1510 (90.0) |
| FABᵉ score, mean (SD) | 15.7 (2.2) | 15.9 (2.1) |
| Depression CES-Dᶠ score, mean (SD) | 4.8 (6.2) | 6.1 (7.7) |
| Anxiety (HADSᵍ score), mean (SD) | 2.6 (3.2) | 3.5 (3.8) |
| White cell count (10⁹/L), mean (SD) | 7.1 (3.6) | 6.9 (3.3) |
| Hemoglobin (g/dL), mean (SD) | 14.2 (1.5) | 13.0 (1.3) |
| Mean corpuscular volume (fLʰ), mean (SD) | 90.7 (5.5) | 90.6 (5.1) |
| Platelet count (10⁹/L), mean (SD) | 229 (59.0) | 265 (66.9) |
| Urea (mmol/L), mean (SD) | 7.2 (2.9) | 6.7 (2.3) |
| Creatinine (μmol/L), mean (SD) | 98 (31.0) | 79 (22.4) |
| Albumin (g/L), mean (SD) | 42 (3.7) | 42 (3.4) |
| Gamma GT (U/L), mean (SD) | 43 (47.5) | 34 (36.0) |
| Sodium (mmol/L), mean (SD) | 140 (5.1) | 139 (3.2) |
| Potassium (mmol/L), mean (SD) | 4.3 (0.5) | 4.2 (0.4) |
| Calcium (mmol/L), mean (SD) | 2.3 (0.1) | 2.3 (0.1) |
| Phosphate (mmol/L), mean (SD) | 1.0 (0.2) | 1.1 (0.2) |
| Alkaline phosphatase (U/L), mean (SD) | 82 (34.2) | 82 (25.7) |
| Low-density lipoprotein (mmol/L), mean (SD) | 2.23 (0.8) | 2.58 (0.9) |
| High-density lipoprotein (mmol/L), mean (SD) | 1.23 (0.4) | 1.55 (0.4) |
| Triglycerides (mmol/L), mean (SD) | 1.78 (1.0) | 1.62 (1.0) |
| C-reactive protein (mg/L), mean (SD) | 6.1 (11.1) | 5.5 (11.9) |
| Glycated hemoglobin (%), mean (SD) | 6.0 (1.0) | 5.9 (0.7) |
| Parathyroid hormone (pg/mL), mean (SD) | 45.2 (30.8) | 47.2 (31.9) |
| Glomerular filtration rate (mL/min), mean (SD) | 77.2 (25.3) | 67.8 (22.6) |
| Red blood cell folate (nmol/L), mean (SD) | 1053 (591.1) | 1100 (582.7) |
| Serum vitamin B12 (pmol/L), mean (SD) | 267 (191.0) | 296 (277.3) |
| Plasma vitamin B6 (nmol/L), mean (SD) | 74.1 (53.2) | 81.5 (69.7) |
| Riboflavin (EGRacⁱ), mean (SD) | 1.35 (0.2) | 1.34 (0.2) |
| Total plasma homocysteine (μmol/L), mean (SD) | 15.1 (5.9) | 14.1 (5.1) |
| Total vitamin D (nmol/L), mean (SD) | 51.6 (25.9) | 56.0 (30.1) |
ᵃ Education refers to the age of stopping formal education.
ᵇ MMSE: Mini-Mental State Examination.
ᶜ RBANS: Repeatable Battery for the Assessment of Neuropsychological Status.
ᵈ An RBANS score <70 is assigned class “low” and an RBANS score ≥70 is assigned class “high.”
ᵉ FAB: Frontal Assessment Battery.
ᶠ CES-D: Centre for Epidemiological Studies Depression.
ᵍ HADS: Hospital Anxiety and Depression Scale.
ʰ fL: femtolitre.
ⁱ EGRac: erythrocyte glutathione reductase activation coefficient, with a higher EGRac value indicating poorer riboflavin status.
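The class rule in the table footnote (RBANS score <70 is "low", ≥70 is "high") can be written as a small labeling function. This is a minimal sketch; the function name `rbans_class` and the `threshold` parameter are illustrative and not taken from the paper. The counts used to compute the class share are the ones reported in the table.

```python
def rbans_class(score: float, threshold: float = 70.0) -> str:
    """Label an RBANS total score: below the threshold is 'low', otherwise 'high'."""
    return "low" if score < threshold else "high"


labels = [rbans_class(s) for s in (69, 70, 87.3)]
print(labels)  # ['low', 'high', 'high']

# Share of the "low" class, using the counts reported in the table.
low = 133 + 168          # males + females in class "low"
total = 1191 + 1678      # all participants
print(round(low / total, 3))  # 0.105
```

Roughly one participant in ten falls in the "low" class, so the two target classes are strongly imbalanced.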
Figure 1. Calculating the Repeatable Battery for the Assessment of Neuropsychological Status rate of change over a 5- to 7-year period between the initial and follow-up assessments, normalized to account for the time between assessments.
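The figure itself is not reproduced here, so the exact formula is not available. One plausible reading of "normalized to account for the time between each assessment" is a per-year change, sketched below. The function name, the argument names, and the choice of years as the time unit are all assumptions.

```python
def rbans_rate_of_change(baseline: float, follow_up: float, years_between: float) -> float:
    """Per-year change in RBANS score (assumed form: difference divided by elapsed years)."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (follow_up - baseline) / years_between


# Two hypothetical participants with the same annual decline over different gaps.
print(rbans_rate_of_change(90, 85, 5))  # -1.0
print(rbans_rate_of_change(90, 83, 7))  # -1.0
```

Dividing by the elapsed time makes participants followed up after 5 years comparable with those followed up after 7 years.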
Figure 2. Model development and testing protocol.
Figure 3. Naïve Bayes algorithm.
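The figure content is not reproduced here. For orientation, the textbook form of the Naïve Bayes decision rule is given below; the symbols are assumptions for illustration: $c$ is a class ("low" or "high"), $x_1, \dots, x_d$ are the feature values of one participant, and the features are treated as conditionally independent given the class.

```latex
\hat{y} \;=\; \arg\max_{c \in \{\text{low},\,\text{high}\}} \; P(c) \prod_{j=1}^{d} P(x_j \mid c)
```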
Figure 4. Gini impurity index.
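As a rough illustration of the quantity named in this caption, the standard Gini impurity of a set of class labels is one minus the sum of squared class proportions. The sketch below uses that textbook definition; the helper name `gini_impurity` is illustrative, and the figure in the paper may present the index differently.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


print(gini_impurity(["low", "high"]))   # 0.5 (maximally mixed for two classes)
print(gini_impurity(["high"] * 4))      # 0.0 (pure node)
```

A decision tree picks the split that most reduces this impurity, weighted by the number of samples sent to each child node.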
Figure 5. Mean Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) score as a function of participant’s age. The graph shows a general decrease in the RBANS score as age increases. RBANS scores have been averaged by age; thus, each point represents the average score for a particular age. One outlier, at age 86, was removed and the R value was recalculated accordingly.
Figure 6. Correlation matrix using the Spearman (nonparametric) coefficient between participant test scores, ignoring observations with missing data. Variable descriptors are as follows: 1=Hospital Anxiety and Depression Scale total score; 2=depression questionnaire total score; 3=Mini-Mental State Examination total score; 4=Frontal Assessment Battery total score; 5=Repeatable Battery for the Assessment of Neuropsychological Status total score; 6=Physical Self-Maintenance Scale total score; 7=instrumental activities of daily living total score.
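The Spearman coefficient used for this matrix is the Pearson correlation of the ranks of the two variables. The following is a minimal pure-Python sketch under stated assumptions: missing values are represented as `None` and handled by dropping the affected pair (the source only says that observations with missing data were ignored), ties receive average ranks, and all function names are illustrative.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation after dropping pairs where either value is None."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because only ranks enter the calculation, any strictly increasing relationship between two scores yields a coefficient of 1, whether or not the relationship is linear.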
Figure 7. Hierarchical clustering of variables depicted as a dendrogram showing strong relationships between clinical assessment scores from the RBANS, FAB, and MMSE assessments. The variable descriptors are as follows: MMSE_score, Mini-Mental State Examination total score; FAB_score, Frontal Assessment Battery total score; RBANS_index_score_I, Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) immediate memory score; RBANS_index_score_II, RBANS visuospatial constructional score; RBANS_index_score_III, RBANS language score; RBANS_index_score_IV, RBANS attention score; RBANS_index_score_V, RBANS delayed memory score; RBANS_total_score, RBANS total score.
Figure 8. Hierarchical clustering of variables depicted as a dendrogram showing the close relation between a participant’s age and kidney function (glomerular filtration rate [GFR]), which together form a cluster with the physical diagnostic tests of IADL, TUG, and PSMS. The variable descriptors are as follows: age, participant’s age; GFR, kidney function; Driving_status, driving status; PSMS_score, Physical Self-Maintenance Scale total score; TUG score, Timed Up and Go score; IADL_score, Instrumental Activities of Daily Living total score.
Performance measures for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score when models were trained with 10-fold cross-validation (training set size=2152).
| Classification technique | Accuracy, mean (SD) | Precision, mean (SD) | Recall, mean (SD) | F1 score, mean (SD) |
|---|---|---|---|---|
| Decision tree | 0.737 (0.020) | 0.795 (0.037) | 0.643 (0.051) | 0.709 (0.028) |
| Naïve Bayes | 0.500 (0.000) | 0.500 (0.000) | 1.000 (0.000) | 0.667 (0.000) |
| Random forest | 0.990 (0.006) | 1.000 (0.000) | 0.981 (0.011) | 0.990 (0.006) |
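The mean (SD) values in this table come from 10-fold cross-validation on the training set. The sketch below shows how such folds can be generated. It is not the authors' pipeline: the shuffling, the seed, and the helper name `k_fold_indices` are assumptions, and the source does not say whether the folds were stratified by class.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# With the training set size reported in the caption (2152 samples):
splits = list(k_fold_indices(2152, k=10))
print(len(splits))                              # 10
print(sorted({len(val) for _, val in splits}))  # [215, 216]
```

Each model is fitted on the training indices of a fold and scored on the held-out validation indices; the mean and SD are then taken over the 10 scores.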
Performance measures for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score on the evaluation data set (training set size=2152; evaluation set size=717).
| Classification technique | Overall accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| Decision tree | 0.604 | 0.926 | 0.596 | 0.725 |
| Naïve Bayes | 0.876 | 0.876 | 1.000 | 0.934 |
| Random forest | 0.877 | 0.882 | 0.992 | 0.934 |
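The final column of the table above is consistent with the F1 score, the harmonic mean of precision and recall. The short check below recomputes it from the reported decision tree and random forest rows; the function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f1_score(0.926, 0.596), 3))  # 0.725 (decision tree row)
print(round(f1_score(0.882, 0.992), 3))  # 0.934 (random forest row)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model with high precision but modest recall, such as the decision tree here, ends up with a noticeably lower F1 score.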
Figure 9. Decision tree classifier of the Repeatable Battery for the Assessment of Neuropsychological Status score. GFR: glomerular filtration rate.
Figure 10. The 20 most important features for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score, as detected by feature permutation with a Naïve Bayes classifier. GFR: glomerular filtration rate; LDL: low-density lipoprotein; TUG: Timed Up and Go.
Figure 11. The 20 most important features for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score, as detected by feature permutation with a random forest classifier. GFR: glomerular filtration rate; HbA1c: glycated hemoglobin; LDL: low-density lipoprotein; TUG: Timed Up and Go.
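Feature permutation, as named in the two captions above, scores a feature by how much a fitted model's performance drops when that feature's values are shuffled across samples. The sketch below is a generic pure-Python version under stated assumptions: accuracy is the performance measure, the model is any callable that maps one row (a list) to a label, and the names and the number of repeats are illustrative rather than the authors' settings.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            X_perm = [row[:j] + [value] + row[j + 1:] for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model never uses gets an importance of exactly zero, because shuffling it cannot change any prediction; ranking features by this score gives a "most important features" list of the kind shown in the figures.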
Performance measures for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score when models were trained with 10-fold cross-validation (training set size=2152) using the 4 key variables: (1) age at which the participant stopped education, (2) the Timed Up and Go score, (3) the glomerular filtration rate measure, and (4) the participant’s age.
| Classification technique | Accuracy, mean (SD) | Precision, mean (SD) | Recall, mean (SD) | F1 score, mean (SD) |
|---|---|---|---|---|
| Decision tree | 0.688 (0.020) | 0.702 (0.026) | 0.655 (0.045) | 0.677 (0.020) |
| Naïve Bayes | 0.693 (0.012) | 0.775 (0.021) | 0.545 (0.026) | 0.640 (0.018) |
| Random forest | 0.929 (0.013) | 1.000 (0.000) | 0.857 (0.026) | 0.923 (0.015) |
Performance measures for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score when models were trained using the 4 key variables, (1) age at which the participant stopped education, (2) the Timed Up and Go score, (3) the glomerular filtration rate measure, and (4) the participant’s age, and applied to the evaluation data set (training set size=2152; evaluation set size=717).
| Classification technique | Overall accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| Decision tree | 0.725 | 0.928 | 0.732 | 0.819 |
| Naïve Bayes | 0.598 | 0.946 | 0.557 | 0.701 |
| Random forest | 0.801 | 0.878 | 0.889 | 0.883 |
Performance measures for classification of the Repeatable Battery for the Assessment of Neuropsychological Status score when models were trained with 10-fold cross-validation (training set size=740).
| Classification technique | Accuracy, mean (SD) | Precision, mean (SD) | Recall, mean (SD) | F1 score, mean (SD) |
|---|---|---|---|---|
| Decision tree | 0.603 (0.045) | 0.613 (0.053) | 0.571 (0.151) | 0.582 (0.083) |
| Naïve Bayes | 0.499 (0.008) | 0.499 (0.008) | 0.997 (0.009) | 0.665 (0.007) |
| Random forest | 0.962 (0.026) | 0.978 (0.035) | 0.946 (0.031) | 0.962 (0.028) |
Classification performance for rate of change of the Repeatable Battery for the Assessment of Neuropsychological Status score when applied to the evaluation data set (training set size=740; evaluation set size=287).
| Classification technique | Overall accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| Decision tree | 0.547 | 0.735 | 0.605 | 0.664 |
| Naïve Bayes | 0.739 | 0.739 | 1.000 | 0.850 |
| Random forest | 0.702 | 0.735 | 0.933 | 0.822 |
Figure 12. Decision tree classifier of the rate of change of the Repeatable Battery for the Assessment of Neuropsychological Status score. PLP: vitamin B6 marker pyridoxal-5-phosphate.
Figure 13. The 20 most important features for predicting the rate of change of the Repeatable Battery for the Assessment of Neuropsychological Status score, as detected by feature permutation with a Naïve Bayes classifier. Gamma GT: gamma-glutamyl transferase; GFR: glomerular filtration rate; TUG: Timed Up and Go.
Figure 14. The 20 most important features for predicting the rate of change of the Repeatable Battery for the Assessment of Neuropsychological Status score, as detected by feature permutation with a random forest classifier. Gamma GT: gamma-glutamyl transferase; GFR: glomerular filtration rate; HDL: high-density lipoprotein; TUG: Timed Up and Go.