David Glenn Clark, Paula M McLaughlin, Ellen Woo, Kristy Hwang, Sona Hurtz, Leslie Ramirez, Jennifer Eastman, Reshil-Marie Dukes, Puneet Kapur, Thomas P DeRamus, Liana G Apostolova.
Abstract
INTRODUCTION: The objective of this study was to assess the utility of novel verbal fluency scores for predicting conversion from mild cognitive impairment (MCI) to clinical Alzheimer's disease (AD).
Keywords: Alzheimer's disease; Cognitive neuropsychology; Dementia; MCI (mild cognitive impairment); MRI (magnetic resonance imaging); Machine learning; Natural language processing
Year: 2016 PMID: 27239542 PMCID: PMC4879664 DOI: 10.1016/j.dadm.2016.02.001
Source DB: PubMed Journal: Alzheimers Dement (Amst) ISSN: 2352-8729
Participant demographics and selected neuropsychological test scores
| Measure | CN (n = 51) | MCI-non (n = 83) | MCI-con (n = 24) |
|---|---|---|---|
| Age (y) | 68.9 (7.9) | 68.7 (8.6) | 73.8 (7.9)*** |
| Sex (M:F) | 28:23 | 37:46 | 9:15 |
| Education (y) | 17.6 (2.2)**** | 16.0 (3.0) | 16.0 (2.9) |
| Mini-mental state examination | 28.9 (1.2)**** | 27.9 (1.7) | 25.1 (3.1)**** |
| Animals | 22.0 (4.7)**** | 18.8 (5.1) | 14.5 (4.9)**** |
| Vegetables | 15.0 (4.3)*** | 13.0 (4.1) | 9.7 (3.7)*** |
| F | 16.8 (4.6)**** | 13.4 (5.1) | 11.3 (5.5)* |
| A | 15.6 (4.6)**** | 11.1 (5.1) | 7.9 (4.7)*** |
| S | 16.9 (5.5)**** | 13.4 (5.1) | 10.7 (5.0)** |
| Boston naming test | 58.1 (1.9)**** | 52.1 (8.1) | 47.4 (10.6)* |
| Digit span forward | 10.8 (2.3) | 10.2 (2.2) | 9.5 (1.9) |
| Digit span backward | 8.2 (2.2)**** | 6.6 (2.4) | 5.6 (1.6)** |
| Trails A | 24.9 (8.9)**** | 33.8 (14.6) | 44.5 (18.1)** |
| Trails B | 70.0 (37.2)**** | 103.6 (55.5) | 161.8 (87.5)*** |
| Stroop A | 62.7 (12.2)** | 69.0 (20.4) | 87.7 (18.7)**** |
| Stroop B | 47.5 (8.4)** | 51.3 (12.6) | 56.5 (10.1)** |
| Stroop C | 114.9 (28.3)**** | 137.5 (44.0) | 181.2 (66.6)*** |
| Wisconsin card sort (categories) | 4.3 (0.9)**** | 3.4 (1.9) | 2.4 (1.7)** |
| Wisconsin card sort (errors) | 11.7 (8.9)**** | 22.7 (13.7) | 35.8 (19.0)*** |
| Logical memory I | 42.9 (9.6)**** | 32.9 (11.1) | 15.9 (8.1)**** |
| Logical memory II | 28.3 (7.1)**** | 18.5 (9.5) | 4.6 (4.7)**** |
| Visual recall I | 82.9 (13.5)**** | 80.0 (16.6) | 52.6 (17.4)**** |
| Visual recall II | 62.8 (25.2)**** | 38.1 (23.8) | 15.2 (21.8)**** |
| Rey-Osterrieth figure copy | 33.4 (2.4)**** | 30.0 (4.7) | 29.7 (4.7) |
| Rey-Osterrieth delayed recall | 20.2 (6.7)**** | 12.7 (7.2) | 7.4 (6.6)*** |
Abbreviations: CN, cognitively normal group; MCI-non, mild cognitive impairment nonconverter; MCI-con, mild cognitive impairment converter to AD. Numbers in parentheses are standard deviations. All statistical comparisons are made to the MCI-non group.
NOTE. *P < .1, **P < .05, ***P < .01, ****P < .001.
Traditional and novel fluency scores
| Score | Description |
|---|---|
| Traditional | |
| Raw | Count of unique valid items |
| Intrusions | Count of nonvalid items |
| Repetitions | Count of repeated items |
| Classic and miscellaneous lexical | |
| Clustering | Automatically calculated as described in Troyer et al. (1998a) |
| Switching | Automatically calculated as with clustering |
| Mean frequency | Lexical frequencies for all words generated were calculated from the Google n-grams corpus and averaged |
| Mean number of syllables | Syllables for each word generated were quantified as the number of vowel symbols in the pronunciation listed in the Carnegie Mellon University Pronunciation Dictionary |
| Metric range of frequency | Calculated as the maximum frequency of words within a list minus the minimum frequency of words in the list |
| Sum of frequencies | Lexical frequencies were added together |
| Sum of reciprocals of frequencies | The reciprocals of all the lexical frequencies were added together |
| Independent components analysis (ICA) | |
| 20 scores | ICA was performed on proximity matrices as described in Clark et al. (2014a). Each individual received 20 scores computed as the dot product of the individual's proximity matrix and 20 extracted components |
| Similarity metric based | |
| Algebraic connectivity | Second smallest eigenvalue of the Laplacian of the weighted graph |
| Average clustering coefficient | Given a vertex in a graph, the clustering coefficient for the vertex is the proportion of edges present among the immediate neighbors of the vertex. This value was calculated for all vertices in the thresholded graph and averaged. |
| Average degree | Average weight of all edges connected to each vertex in the graph |
| Diameter | Length of the longest geodesic in the weighted graph |
| Maximum betweenness centrality | For every pair of distinct vertices in the thresholded graph, the shortest path between the pair was identified. The betweenness centrality for each vertex was calculated as the number of shortest paths passing through that vertex. The score was the maximum of these values. |
| Radius | Length of the shortest geodesic in the weighted graph |
| Transitivity | 3 times the proportion of triangles in a thresholded graph divided by the number of triads (two edges with a common vertex) in the graph |
| Coherence | A greedy algorithm was used to discover a short Hamiltonian path through the vertices of the weighted graph. The sum of the similarity weights on the actual path taken by the participant was divided by the sum of the similarities on the optimal path. |
| P&H clustering | Defined as for traditional clustering, but linkages between words were based on the edges in the thresholded graph, as described by Pakhomov & Hemmy (2014) |
| P&H switching | Analogous to P&H clustering |
NOTE. Similarity metrics included orthographic, phonologic, and semantic similarity measures like those described in Clark et al. (2014b). Thus, there were three versions of each of the similarity-metric based scores.
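The coherence score in the table above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `similarity` is a stand-in built on `difflib` string matching rather than one of the study's orthographic, phonologic, or semantic measures; the greedy path is assumed to start at the participant's first word and to move to the most similar unvisited word at each step; and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Toy orthographic similarity in [0, 1] (a stand-in, not the study's measure)."""
    return SequenceMatcher(None, a, b).ratio()


def path_similarity(words, sim):
    """Sum of similarity weights between consecutive items on a path."""
    return sum(sim(a, b) for a, b in zip(words, words[1:]))


def greedy_path(words, sim):
    """Greedy Hamiltonian path: start at the first word (assumption), then
    repeatedly move to the most similar word not yet visited."""
    path = [words[0]]
    remaining = list(words[1:])
    while remaining:
        nxt = max(remaining, key=lambda w: sim(path[-1], w))
        remaining.remove(nxt)
        path.append(nxt)
    return path


def coherence(words, sim=similarity):
    """Similarity along the participant's actual order divided by the
    similarity along the greedy reference path."""
    reference = path_similarity(greedy_path(words, sim), sim)
    if reference == 0:
        return 0.0
    return path_similarity(words, sim) / reference


# Hypothetical letter-A word list, in the order a participant produced it.
print(coherence(["apple", "apply", "ant", "angle", "anchor"]))
```

Because the greedy path is only a heuristic, it is not guaranteed to be the best possible ordering, so under these assumptions the ratio can occasionally exceed 1.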
Cumulative importance values of variables selected from novel scores
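| Variable | Importance | Variable | Importance | Variable | Importance |
|---|---|---|---|---|---|
| Coherence–semantic (A) | 1589.10 | ICA10 (S) | 598.70 | Raw (animal) | 281.51 |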
| Coherence–ortho (veg) | 1066.63 | ICA17 (F) | 590.24 | ICA2 (veg) | 191.70 |
| Frequency–metric range (F) | 939.63 | Frequency–metric range (animal) | 585.68 | Coherence–phono (S) | 156.36 |
| Coherence–ortho (animal) | 878.30 | Algebraic connectivity–phono (animal) | 585.45 | Average degree–semantic (veg) | 154.91 |
| Algebraic connectivity–ortho (animal) | 777.68 | Frequency–sum (animal) | 564.00 | Average clustering coefficient–phono (veg) | 136.50 |
| Radius–ortho (A) | 772.28 | Coherence–semantic (veg) | 534.27 | Clustering–classic (animal) | 122.58 |
| Frequency–mean (animal) | 766.87 | Switching–phono (veg) | 529.87 | Diameter–ortho (A) | 36.03 |
| Frequency–sum reciprocal (animal) | 745.18 | Maximum betweenness–phono (animal) | 521.75 | Algebraic connectivity–ortho (A) | 31.80 |
| Transitivity–phono (veg) | 690.30 | Transitivity–semantic (animal) | 495.77 | ICA13 (A) | 31.22 |
| Coherence–ortho (A) | 684.37 | Algebraic connectivity–semantic (A) | 475.37 | Metric range of similarity–semantic (A) | 18.30 |
| Coherence–phono (A) | 681.66 | Average clustering coefficient–ortho (animal) | 439.13 | Frequency–mean (veg) | 17.83 |
| Maximum betweenness–phono (veg) | 676.82 | Maximum betweenness–semantic (animal) | 420.87 | Switching–semantic (animal) | 9.14 |
| Transitivity–semantic (S) | 673.94 | Switching–phono (animal) | 407.99 | Metric range of similarity–ortho (A) | 9.09 |
| Average clustering coefficient–semantic (S) | 663.10 | Frequency–sum (veg) | 400.33 | Age | 8.86 |
| Frequency–sum reciprocal (veg) | 656.95 | ICA4 (animal) | 305.44 | Diameter–semantic (animal) | 4.57 |
Abbreviations: veg, Vegetable; ICA, independent components analysis.
NOTE. Each score (apart from age) originated from one of the five fluency tasks (A, animal, F, S, or veg). For scores dependent on measurements of lexical similarity, the type of similarity measure is included (orthographic, phonological, or semantic). Each importance value listed here represents the sum of the importance measurements across all cross-validation loops.
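The summation described in the note can be illustrated with a short sketch. It assumes each cross-validation loop yields a mapping from variable name to importance for the variables selected in that loop; the function name and the numbers in the example are made up for illustration and are not values from the study.

```python
from collections import Counter


def cumulative_importance(per_loop_importance):
    """Sum importance values across cross-validation loops and rank them.

    per_loop_importance: iterable of dicts {variable_name: importance}.
    A variable not selected in a loop simply contributes nothing there.
    """
    total = Counter()
    for loop_scores in per_loop_importance:
        total.update(loop_scores)  # Counter.update adds values per key
    return total.most_common()  # sorted by summed importance, descending


# Illustrative (invented) importances from three loops.
loops = [
    {"Coherence-semantic (A)": 1.5, "Age": 0.5},
    {"Coherence-semantic (A)": 2.0},
    {"Age": 0.25, "Raw (animal)": 3.0},
]
for name, value in cumulative_importance(loops):
    print(f"{name}: {value:.2f}")
```

Under this scheme a variable can rank highly either by being selected in many loops or by receiving large importance in a few, which is worth keeping in mind when reading the table.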
Fig. 1. Component 15 derived from independent components analysis of cortical thickness measures. The values of the component have been normalized to the interval [0,1]. Individuals with a higher gray matter thickness in the parietal lobes and lower gray matter thickness in the right mesial occipital region would achieve the highest scores for this component.
Fig. 2. Boxplots of 10 selected variables. The top row includes the five top-ranked variables from the analysis including only novel scores. The bottom row includes the three raw scores selected for the “raw” analysis and the two imaging scores selected for the “brain” analysis. Differences between the MCI-non (N) and MCI-con (C) groups are apparent for all variables shown. Factors that may be relevant but cannot be readily depicted include potential interactions among several variables and nonlinear relationships between an individual variable and conversion risk. (A) semantic coherence letter A; (B) orthographic coherence vegetables; (C) metric range of frequency letter F; (D) orthographic coherence animals; (E) orthographic algebraic connectivity animals; (F) raw score for animals; (G) raw score for vegetables; (H) raw score for letter A; (I) volume of right hippocampus; (J) gray matter volume score for independent component 15.
Fig. 3. ROC curves for the five ensemble classifiers. Novel verbal fluency scores yield the best AUC (0.872). This classifier may be thresholded to have sensitivity 1.00 with a specificity of 0.675. Abbreviations: ROC, receiver operating characteristic curve; AUC, area under the receiver operating characteristic curve.
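Thresholding a classifier for perfect sensitivity, as mentioned in the Fig. 3 caption, can be sketched as follows. The sketch assumes the classifier emits a continuous score where higher means more likely to convert, and that a case is called positive when its score is at or above the threshold; the function name and example scores are hypothetical.

```python
def threshold_for_full_sensitivity(scores, labels):
    """Return (threshold, specificity) at the strictest threshold that
    still classifies every true converter (label 1) as positive."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # Predict positive when score >= threshold; the lowest-scoring
    # converter therefore sets the threshold for sensitivity 1.00.
    threshold = min(positives)
    true_negatives = sum(s < threshold for s in negatives)
    return threshold, true_negatives / len(negatives)


# Invented scores: three converters (1) and four nonconverters (0).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0]
print(threshold_for_full_sensitivity(scores, labels))
```

The specificity obtained this way is the fraction of nonconverters scoring below the lowest-scoring converter, which is the trade-off the caption reports for the novel-score classifier.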
Quality of predictions made by the five ensemble classifiers
| Classifier | AUC | F | Sensitivity | Specificity | NPV | PPV | Accuracy |
|---|---|---|---|---|---|---|---|
| Raw | 0.719 | 0.583 | 0.583 | 0.880 | 0.880 | 0.583 | 0.813 |
| Brain | 0.760 | 0.536 | 0.682 | 0.756 | 0.894 | 0.441 | 0.740 |
| Raw + Brain | 0.735 | 0.524 | 0.458 | 0.916 | 0.854 | 0.611 | 0.813 |
| Novel | 0.872* | 0.667 | 0.708 | 0.880 | 0.913 | 0.630 | 0.841 |
| Novel + Brain | 0.814 | 0.625 | 0.625 | 0.892 | 0.892 | 0.625 | 0.832 |
Abbreviations: AUC, area under the receiver operating characteristic curve; F, F-measure (harmonic mean of sensitivity and positive predictive value); NPV, negative predictive value; PPV, positive predictive value.
NOTE. *P < .05 compared to AUC for Raw classifier using DeLong test.
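As a worked check of how the columns relate, the metrics can be recomputed from a confusion matrix. The counts used below (17 true positives, 7 false negatives, 73 true negatives, 10 false positives) are not reported in the source; they are back-calculated here as an assumption from the group sizes (24 MCI-con, 83 MCI-non) and the Novel row, and the function name is hypothetical.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Standard confusion-matrix metrics; F is the harmonic mean of
    sensitivity and positive predictive value, as defined in the table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f_measure = 2 * sensitivity * ppv / (sensitivity + ppv)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f": f_measure,
    }


# Assumed counts, inferred from the Novel row rather than reported.
novel = classifier_metrics(tp=17, fn=7, tn=73, fp=10)
for name, value in novel.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, each metric agrees with the Novel row to within rounding at three decimals, which suggests the table values are mutually consistent.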