Ibrahim Almubark1, Lin-Ching Chang1, Kyle F Shattuck2, Thanh Nguyen1, Raymond Scott Turner3, Xiong Jiang2.
Abstract
Introduction: The goal of this study was to investigate and compare the classification performance of machine learning with behavioral data from standard neuropsychological tests, a cognitive task, or both.
Keywords: Alzheimer's disease; artificial neural networks; inhibition of return; machine learning; neuropsychological test
Year: 2020 PMID: 33343337 PMCID: PMC7744695 DOI: 10.3389/fnagi.2020.603179
Source DB: PubMed Journal: Front Aging Neurosci ISSN: 1663-4365 Impact factor: 5.750
The demographics and neuropsychological test scores of CN and MCI/mild AD participants.

| Measure | CN (original dataset) | MCI/mild AD (original dataset) | p[b] (original dataset) | CN (matched dataset) | MCI/mild AD (matched dataset) | p[b] (matched dataset) |
|---|---|---|---|---|---|---|
| N (F[a]) | 50 (32 F) | 28 (10 F) | n.s.[c] | 41 (26 F) | 16 (6 F) | n.s. |
| Age | 65.9 ± 6.2 | 72.7 ± 7.4 | 0.0001 | 67.4 ± 5.3 | 69.9 ± 5.3 | n.s. |
| Education (years) | 18.1 ± 3.9 | 18.2 ± 3.9 | n.s. | 18.4 ± 3.4 | 19.0 ± 4.5 | n.s. |
| %CA | 82.0% | 89.3% | n.s. | 78.1% | 81.3% | n.s. |
| MMSE | 29.4 ± 1.0 | 25.8 ± 4.5 | 5.3E-06 | 29.3 ± 1.0 | 27.3 ± 2.3 | 3.0E-05 |
| MoCA[d] | 25.3 ± 1.8 | 21.0 ± 4.4 | 9.7E-05 | 25.2 ± 1.8 | 21.4 ± 4.4 | 0.0035 |
| LM immediate | 12.4 ± 3.6 | 6.6 ± 3.8 | 4.6E-09 | 12.0 ± 3.5 | 7.8 ± 4.2 | 2.9E-03 |
| LM delayed | 9.8 ± 4.3 | 4.1 ± 3.9 | 1.2E-07 | 9.5 ± 4.2 | 5.4 ± 4.3 | 0.0019 |
| ADAS-Cog | 5.9 ± 3.4 | 18.7 ± 9.7 | 1.6E-12 | 6.3 ± 3.4 | 15.7 ± 7.7 | 3.1E-08 |
| NPI[e] | 2.0 ± 4.5 | 7.6 ± 8.8 | 0.0005 | 2.3 ± 4.9 | 5.1 ± 5.6 | n.s. |
| LADL[e] | 76.4 ± 2.8 | 67.4 ± 11.9 | 3.3E-06 | 76.3 ± 3.0 | 71.1 ± 8.5 | 0.0013 |
| LVF | 46.8 ± 12.9 | 37.5 ± 15.5 | 0.0058 | 48.2 ± 13.1 | 39.6 ± 14.0 | 0.035 |

CA, Caucasian-Americans; MMSE, Mini-Mental State Exam; MoCA, Montreal Cognitive Assessment; LM, Logical Memory Test; ADAS-Cog, Alzheimer's Disease Assessment Scale-Cognitive subscale; NPI, Neuropsychiatric Inventory; LADL, Lawton Instrumental Activities of Daily Living Scale; LVF, Letter Verbal Fluency (Controlled Oral Word Association Test, COWAT); n.s., not significant.
[a] F, female.
[b] Uncorrected p-values for the difference between CN and MCI/mild AD from two-tailed two-sample t-tests (unless otherwise specified).
[c] Fisher's Exact Test.
[d] The MoCA was administered only to a subset of participants (22 CN and 22 MCI/mild AD patients from the original dataset; 17 CN and 11 MCI/mild AD patients from the demographically matched dataset), and MoCA test scores were not used in classification.
[e] NPI and LADL data were missing for one control and one MCI/mild AD patient.
The performance with the optimal hyper-parameter tuning for each dataset is shown in bold and italics font (optimal values for class weight and threshold).
Figure 1. The cognitive task [spatial inhibition of return (IOR)] experiment paradigm. Within each trial, three visual stimuli were presented serially: two cues (solid red squares) followed by one target (solid green square), with a blank screen in between. The two cues could appear in any of the three locations (left, middle, right), whereas the target could appear only at the left or the right location, never in the middle. Subjects were instructed to respond to the target by pressing one of two buttons with the right hand (index finger or middle finger) to indicate whether the target was presented at the left or right location. Each cue was presented for 200 ms, with a 250 ms break in between. The second cue was followed by another 250 ms break before the onset of the target, which was presented for 850 ms. The next trial started 750 ms after the offset of the target stimulus, so subjects had to respond within a 1.6 s time window (before the onset of the next trial). There were five conditions based on the relationship of the locations at which the three stimuli were presented: (1) the two cues and the target were presented at the same location; (2) the second cue and the target were presented at the same location, and the first cue was presented at a different location; (3) the first cue and the target were presented at the same location, and the second cue was presented at a different location; (4) the two cues were presented at the same location, and the target was presented at a different location; (5) the two cues and the target were presented at three different locations. The behavioral data from the study team can be found elsewhere (Jiang et al.), including detailed data from each individual subject that other teams can download to test their own approaches. Note: ms, millisecond.
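The trial structure in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the study's experiment code: the durations come from the caption, but the function names and the five condition labels (`all_same`, `second_cue_matches_target`, and so on) are hypothetical placeholders introduced here only for illustration.

```python
# Durations (ms) taken from the Figure 1 caption.
TRIAL_EVENTS = [
    ("cue_1", 200),
    ("blank_1", 250),
    ("cue_2", 200),
    ("blank_2", 250),
    ("target", 850),
    ("inter_trial_interval", 750),
]

CUE_LOCATIONS = ("left", "middle", "right")
TARGET_LOCATIONS = ("left", "right")


def event_onsets(events=TRIAL_EVENTS):
    """Return {event name: onset in ms relative to the start of the trial}."""
    onsets, t = {}, 0
    for name, duration in events:
        onsets[name] = t
        t += duration
    onsets["next_trial"] = t
    return onsets


def classify_trial(cue_1, cue_2, target):
    """Map the three stimulus locations to one of the five conditions.

    The returned labels are hypothetical, not the authors' condition names.
    """
    if cue_1 not in CUE_LOCATIONS or cue_2 not in CUE_LOCATIONS:
        raise ValueError("cues must be left, middle, or right")
    if target not in TARGET_LOCATIONS:
        raise ValueError("target must be left or right")
    if cue_1 == cue_2 == target:
        return "all_same"
    if cue_2 == target:
        return "second_cue_matches_target"
    if cue_1 == target:
        return "first_cue_matches_target"
    if cue_1 == cue_2:
        return "cues_match_only"
    return "all_different"


onsets = event_onsets()
# Time from target onset to the start of the next trial.
response_window_ms = onsets["next_trial"] - onsets["target"]
print(onsets["target"], response_window_ms, onsets["next_trial"])
```

Under these durations the target appears 900 ms into the trial and the response window is 850 + 750 = 1600 ms, which matches the 1.6 s window stated in the caption.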
Figure 2. The ROC curves for the best classifiers, selected by the highest sensitivity for each dataset, with traditional machine learning algorithms and with MLP. See Table 2, Supplementary Tables 1-3 for the specific algorithms and parameters used for these “best” classifiers (shown in bold and italics font). (A) Traditional machine learning algorithms with all features and PCA (without and with SMOTE over-sampling). (B) Traditional machine learning algorithms with feature selection (without and with SMOTE over-sampling). (C) The ROC curves for each dataset with MLP using the demographically comparable dataset. The AUC score is shown in the legend box.
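For readers unfamiliar with how an ROC curve and its AUC score are obtained, the following is a minimal Python sketch. It is a generic illustration, not the authors' pipeline: the labels and scores are made-up toy values (not study data), the positive class is assumed to be MCI/mild AD, and the function names are hypothetical.

```python
def roc_curve_points(labels, scores):
    """Sweep a decision threshold over the scores and collect ROC points.

    labels: 1 for the positive class (assumed MCI/mild AD), 0 for CN.
    scores: classifier outputs; higher means more likely positive.
    Returns (fpr, tpr), both starting at 0.0 and ending at 1.0.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Toy example only: six hypothetical participants.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_curve_points(toy_labels, toy_scores)
toy_auc = auc_trapezoid(fpr, tpr)
print(round(toy_auc, 4))
```

Each point on the curve corresponds to one threshold, which is why a classifier selected for high sensitivity sits toward the upper part of its curve.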
Multilayer perceptron (MLP) classification performance using the demographically comparable dataset.

| Dataset | Threshold | Class weight | SEN (%) | SPE (%) | ACC ± std (%) |
|---|---|---|---|---|---|
| NP | 0.5 | 1:1 | 62.50 | 87.8 | 80.7 ± 7.76 |
| IORtrial | 0.5 | 1:1 | 75.00 | 95.12 | 89.47 ± 6.17 |
| IORcond | 0.5 | 1:1 | 68.75 | 97.56 | 89.47 ± 3.12 |
| NP + IORtrial | 0.5 | 1:1 | 75.00 | 95.12 | 89.47 ± 7.5 |
| NP + IORcond | 0.5 | 1:1 | 87.50 | 97.56 | 94.74 ± 6.75 |

The sensitivity (SEN), specificity (SPE), accuracy (ACC), and standard deviation of the accuracy (std) for each dataset were calculated from 5-fold CV using the default setting for class weight (1:1) and threshold (0.5). The performance with the optimal hyper-parameter tuning for each dataset is shown in bold and italics font.
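The relationship between the three reported metrics can be checked with a short Python sketch. The counts below are not given in the source; they are inferred from the reported NP percentages at the default setting together with the group sizes of the demographically matched dataset (16 MCI/mild AD, 41 CN). The sketch also assumes that MCI/mild AD is the positive class and that predictions are pooled across the five folds. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # share of patients correctly detected
    specificity = tn / (tn + fp)            # share of controls correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts (assumption): 10 of 16 patients and 36 of 41 controls
# classified correctly reproduce the reported NP values.
sen, spe, acc = classification_metrics(tp=10, fn=6, tn=36, fp=5)
print(f"SEN={sen:.2%} SPE={spe:.2%} ACC={acc:.2%}")
# prints "SEN=62.50% SPE=87.80% ACC=80.70%"
```

Because controls outnumber patients by more than two to one in this dataset, accuracy is dominated by specificity, which is one reason sensitivity is reported separately.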
A direct comparison between the best traditional machine learning and deep learning methods (a summary of Table 2, Supplementary Tables 1-3).

| Dataset | Method | SEN (%) | SPE (%) | ACC ± std (%) |
|---|---|---|---|---|
| NP | RF | 87.5 | 70.73 | 75.61 ± 5.76 |
| NP | MLP | 75.00 | 87.8 | 84.21 ± 4.91 |
| IORtrial | AB | 62.5 | 78.05 | 73.48 ± 9.75 |
| IORtrial | MLP | 81.25 | 95.12 | 91.23 ± 4.28 |
| IORcond | RF | 75 | 80.49 | 78.79 ± 9.25 |
| IORcond | MLP | 81.25 | 90.24 | 87.72 ± 3.12 |
| NP + IORtrial | GB | 81.25 | 92.68 | 89.55 ± 3.12 |
| NP + IORtrial | MLP | 87.50 | 95.12 | 92.98 ± 4.28 |
| NP + IORcond | SVM | 87.5 | 82.93 | 84.09 ± 3.91 |
| NP + IORcond | MLP | 93.75 | 92.68 | 92.98 ± 6.33 |