Angela Lombardi, Domenico Diacono, Nicola Amoroso, Przemysław Biecek, Alfonso Monaco, Loredana Bellantuono, Ester Pantaleo, Giancarlo Logroscino, Roberto De Blasi, Sabina Tangaro, Roberto Bellotti.
Abstract
In clinical practice, several standardized neuropsychological tests have been designed to assess and monitor the neurocognitive status of patients with neurodegenerative diseases such as Alzheimer's disease. Considerable research effort has been devoted to developing multivariate machine learning models that combine the different test indexes to predict the diagnosis and prognosis of cognitive decline, with remarkable results. However, less attention has been paid to the explainability of these models. In this work, we present a robust framework to (i) perform a three-class classification between healthy control subjects, individuals with cognitive impairment, and subjects with dementia using different cognitive indexes and (ii) analyze the variability of the SHAP explainability values associated with the decisions taken by the predictive models. We demonstrate that the SHAP values can accurately characterize how each index affects a patient's cognitive status. Furthermore, we show that a longitudinal analysis of SHAP values can provide effective information on Alzheimer's disease progression.
Keywords: Alzheimer’s disease; Cognitive spectrum; Explainable Artificial Intelligence; Mild Cognitive Impairment; XAI
Year: 2022 PMID: 35882684 PMCID: PMC9325942 DOI: 10.1186/s40708-022-00165-5
Source DB: PubMed Journal: Brain Inform ISSN: 2198-4026
Demographic, clinical and neuropsychological information on the selected cohort
| | NC | MCI | AD |
|---|---|---|---|
| Age (years) | | | |
| Education (years) | | | |
| Gender (M/F) | 1082/1326 | 1691/1221 | 577/388 |
| ADAS11 | | | |
| ADAS13 | | | |
| MMSE | | | |
| RAVLT immediate | | | |
| RAVLT learning | | | |
| RAVLT percforgetting | | | |
| FAQ | | | |
| MOCA | | | |
| EcogPtTotal | | | |
| EcogSPTotal | | | |
Description of the selected clinical and neuropsychological indexes
| Index | Description |
|---|---|
| ADAS11 | An 11-task test that assesses cognitive functioning in memory, praxis and language. Specific tasks include Naming Objects, Word Recall, Fingers, Commands, Orientation, Word Recognition, Constructional Praxis, Ideational Praxis and Language. |
| ADAS13 | A test including all elements of ADAS11 as well as a test of delayed word recall and a number cancellation or maze task. |
| MMSE | The Mini-Mental State Examination rates various cognitive domains, including memory, attention and language. MMSE scores range from 0 to 30; lower scores indicate greater cognitive dysfunction. |
| MOCA | The Montreal Cognitive Assessment comprises 12 individual, mostly binary tasks grouped into cognitive domains (visuospatial and executive functioning, attention, language, abstraction, naming, delayed memory recall and orientation). The task scores are summed with a 6-item orientation screening and an educational correction to give a total score reflecting global cognitive functioning. |
| FAQ | The Functional Activities Questionnaire evaluates the instrumental activities of daily living (IADLs), such as preparing meals and managing personal finances. Sum scores range from 0 to 30, and a cut-point of 9 (dependent on 3 or more activities) is recommended to denote potential cognitive impairment. |
| RAVLT | The Rey Auditory Verbal Learning Test involves five presentations of a 15-word list (List A), each followed by an attempted recall. This is followed by a second 15-word interference list (List B) and then by a recall of List A. It rates different aspects of episodic memory such as the learning rate (RAVLT learning and RAVLT immediate) and delayed recall (RAVLT percent forgetting). |
| Ecog | The Everyday Cognition scale is an informant-rated questionnaire that includes one global factor and six domain-specific factors. It captures mild impairments in everyday function and cognition as reported by both the participant (EcogPt) and the study partner (EcogSP). |
Fig. 1 Workflow of the proposed analysis. The clinical and neuropsychological indexes (i.e., S features) are used to train a Random Forest (RF) classifier that predicts the diagnosis of each subject at each visit with a leave-one-subject-out cross-validation strategy. In each cross-validation round, the training set is randomly under-sampled multiple times by selecting a fixed number of samples for each diagnostic category to handle class imbalance. The SHAP algorithm is used to explain the predictions of the RF models for each sample. Statistical analyses of the RF probability scores and the SHAP values are then performed to: (i) relate the performance of RF to the variability of the SHAP scores, (ii) analyze the variability of the SHAP scores between diagnostic categories, (iii) examine the longitudinal variability of the SHAP scores
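The splitting scheme in the Fig. 1 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `loso_balanced_splits`, the parameters `n_per_class` and `n_repeats`, and the toy data are all assumptions, since the source gives neither the number of under-sampling repetitions nor the number of samples drawn per class. A classifier such as a Random Forest would be fitted on each yielded training set and evaluated on the held-out subject's visits.

```python
import numpy as np

def loso_balanced_splits(subjects, labels, n_per_class, n_repeats, seed=0):
    """Yield (held_out_subject, repeat, train_idx, test_idx).

    Hypothetical helper: one subject is held out per round, and the
    remaining visits are under-sampled n_repeats times so that every
    diagnostic class contributes at most n_per_class training visits.
    """
    subjects = np.asarray(subjects)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    for subject in np.unique(subjects):
        test_idx = np.flatnonzero(subjects == subject)   # all visits of one subject
        pool = np.flatnonzero(subjects != subject)       # visits of everyone else
        for rep in range(n_repeats):
            parts = []
            for cls in np.unique(labels[pool]):
                cls_idx = pool[labels[pool] == cls]
                take = min(n_per_class, cls_idx.size)
                parts.append(rng.choice(cls_idx, size=take, replace=False))
            yield subject, rep, np.sort(np.concatenate(parts)), test_idx

# Toy data (made up): 6 subjects with 2 visits each, 4 visits per class.
subjects = np.repeat(np.arange(6), 2)
labels = np.array(["NC", "NC", "NC", "MCI", "MCI", "MCI",
                   "MCI", "AD", "AD", "AD", "AD", "NC"])
splits = list(loso_balanced_splits(subjects, labels, n_per_class=2, n_repeats=3))
```

Splitting by subject rather than by visit keeps all visits of the held-out person out of the training set, so repeated measurements of the same person cannot leak into the model being evaluated.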
Performance metrics of the RF models
| Model | Accuracy | Specificity | Sensitivity | Precision | AUC |
|---|---|---|---|---|---|
| NC | 0.88 | ||||
| MCI | 0.82 | ||||
| AD | 0.97 | ||||
| Global | 0.89 |
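Per-class figures such as those named in the table header are commonly derived from a multi-class confusion matrix in a one-vs-rest fashion. The sketch below shows that standard computation; whether the authors computed their metrics exactly this way is an assumption, and the confusion matrix is invented for illustration and is not taken from the paper.

```python
import numpy as np

def one_vs_rest_metrics(cm):
    """Per-class metrics from a confusion matrix (rows = true, cols = predicted)."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)                 # correctly predicted samples of each class
    fn = cm.sum(axis=1) - tp         # samples of the class predicted as another class
    fp = cm.sum(axis=0) - tp         # samples of other classes predicted as this class
    tn = total - tp - fn - fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Made-up 3-class confusion matrix in the order NC, MCI, AD.
cm = [[8, 2, 0],
      [1, 7, 2],
      [0, 1, 9]]
metrics = one_vs_rest_metrics(cm)
```

Each array in `metrics` has one entry per class, in the same order as the rows of the confusion matrix.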
Fig. 2 Performance of RF models: confusion matrix (top) and ROC curves (bottom)
Fig. 3 Comparisons of probability scores across diagnostic classes: boxplots (top) and table with p values resulting from post-hoc analysis with average and standard deviations for each class (bottom)
Fig. 4 Comparisons of cosine distances between the SHAP vectors of each possible pair of subjects belonging to each class: boxplots (top) and table with p values resulting from post-hoc analysis with average and standard deviations for each class (bottom)
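The within-class comparison described in the Fig. 4 caption relies on the cosine distance between pairs of SHAP vectors, that is, one minus the cosine of the angle between them. A minimal sketch of that pairwise computation is given below; the helper name `pairwise_cosine_distances` and the toy vectors are hypothetical, and real SHAP vectors would have one entry per cognitive index.

```python
import numpy as np

def pairwise_cosine_distances(vectors):
    """Cosine distance (1 - cosine similarity) for every unordered pair of rows.

    Assumes no row is the all-zero vector, since its direction is undefined.
    """
    v = np.asarray(vectors, dtype=float)
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)   # normalise each row
    sim = np.clip(unit @ unit.T, -1.0, 1.0)               # guard against rounding drift
    i, j = np.triu_indices(len(v), k=1)                   # pairs with i < j
    return 1.0 - sim[i, j]

# Toy "SHAP vectors" for three subjects of one class (made-up values).
shap_vectors = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [2.0, 0.0]])
distances = pairwise_cosine_distances(shap_vectors)
```

Because the cosine distance ignores vector length, two subjects whose explanations point in the same direction have distance 0 even when the magnitudes of their SHAP values differ.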
Fig. 5 Variable importance plot showing the average SHAP values for each index within each class
Fig. 6 Comparisons of cosine distances between the SHAP vectors of the first and last visits of the subjects for the diagnostic longitudinal categories: boxplots (top) and table with p values resulting from post-hoc analysis with average and standard deviations of probability scores for each category (bottom). p values in bold are statistically significant
Fig. 7 Radar plots reporting the average SHAP values for the first and last visit for each category and each cognitive index