| Literature DB >> 35945568 |
Natthanan Ruengchaijatuporn1, Itthi Chatnuntawech2, Thiparat Chotibut3, Chaipat Chunharas4,5, Surat Teerapittayanon2, Sira Sriswasdi1,6, Sirawaj Itthipuripat7,8, Solaphat Hemrungrojn9,10, Prodpran Bunyabukkana11, Aisawan Petchlorlian12, Sedthapong Chunamchai9,13.
Abstract
BACKGROUND: Mild cognitive impairment (MCI) is an early stage of cognitive decline that can progress to dementia. Early detection of MCI is a crucial step for timely prevention and intervention. Recent studies have developed deep learning models to detect MCI and dementia using bedside tasks such as the classic clock drawing test (CDT). However, predicting the early stage of the disease from CDT data alone remains a challenge. Moreover, state-of-the-art deep learning techniques still face the black-box problem, making them questionable to deploy in a clinical setting.
Year: 2022 PMID: 35945568 PMCID: PMC9361513 DOI: 10.1186/s13195-022-01043-2
Source DB: PubMed Journal: Alzheimers Res Ther Impact factor: 8.823
Mean and standard deviation of the classification accuracy, F1-score, and AUC over 5 different random training-validation-test splits. Our proposed model, which benefits from the incorporation of multiple complementary drawing tasks (clock drawing, cube-copying, and trail-making), a self-attention mechanism, and a soft labeling approach, achieved a much higher mean accuracy, F1-score, and AUC than the baseline models
| Models | Accuracy | F1-score | AUC |
|---|---|---|---|
| VGG16 with only clock-drawing test | 0.7478 ± 0.0071 | 0.3573 ± 0.0443 | 0.7429 ± 0.0131 |
| VGG16 with only cube-copying test | 0.7739 ± 0.0096 | 0.4994 ± 0.0477 | 0.7813 ± 0.0197 |
| VGG16 with only trail-making test | 0.7739 ± 0.0249 | 0.5283 ± 0.0548 | 0.7722 ± 0.0240 |
| Multi-input VGG16 | 0.7986 ± 0.0071 | 0.5938 ± 0.0207 | 0.8115 ± 0.0192 |
| Conv-Att with only clock-drawing test | 0.7522 ± 0.0125 | 0.3586 ± 0.0309 | 0.7337 ± 0.0204 |
| Conv-Att with only cube-copying test | 0.7768 ± 0.0168 | 0.5095 ± 0.0515 | 0.7791 ± 0.0199 |
| Conv-Att with only trail-making test | 0.7696 ± 0.0167 | 0.5211 ± 0.0272 | 0.7662 ± 0.0231 |
| Multi-input Conv-Att | 0.7986 ± 0.0071 | 0.5981 ± 0.0221 | 0.8379 ± 0.0176 |
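The soft labeling approach credited in the table caption amounts to training against probabilistic rather than one-hot class targets. The sketch below is a minimal, hypothetical illustration of a soft-label cross-entropy loss; the actual label construction used in the paper is not reproduced here.

```python
import numpy as np

def soft_label_cross_entropy(logits, soft_targets):
    """Cross-entropy against a soft (probabilistic) target distribution."""
    # Numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Expected negative log-likelihood under the soft targets
    return float(-(soft_targets * log_probs).sum(axis=-1).mean())

# Illustrative values only: a hard label [0, 1] softened toward [0.2, 0.8]
logits = np.array([[2.0, 1.0]])
hard = np.array([[0.0, 1.0]])
soft = np.array([[0.2, 0.8]])
loss_hard = soft_label_cross_entropy(logits, hard)
loss_soft = soft_label_cross_entropy(logits, soft)
```

Softened targets penalize confident mistakes less harshly, which can help when class boundaries (e.g., healthy vs. MCI) are inherently fuzzy.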
The mean and standard deviation of the visual interpretability scores over all samples in the test set given by a neurologist and two licensed neuropsychologists (scores from 1 to 5; 1 being the worst and 5 being the best in terms of providing a visual interpretability that aligned with their experience and knowledge)
| Evaluators | VGG16 with Grad-CAM | Conv-Att with a soft label (proposed) |
|---|---|---|
| Expert 1 | 1.42 ± 0.74 | 3.41 ± 0.61 |
| Expert 2 | 1.86 ± 0.74 | 2.20 ± 0.93 |
| Expert 3 | 1.36 ± 0.62 | 2.87 ± 0.68 |
Fig. 1 Quantitative comparison between the multi-input VGG16 model with Grad-CAM (red) and our multi-input Conv-Att model with soft label (blue), as measured by the IoU between the heat maps and two types of ROIs, (a) whole-drawing ROIs and (b) expert ROIs, as a function of the percentage of pixels used from the heat maps. Example images with the corresponding ROIs are shown at the top of each panel. Our proposed model's heat maps are more similar to both whole-drawing and expert ROIs than those of the Grad-CAM model, and the higher similarity is consistent over a broad range of the percentage of pixels taken from the models' outputs (10%-80%)
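The IoU metric used in Fig. 1 compares the top-k% pixels of a heat map against a binary ROI mask. A minimal sketch of that computation, under the assumption that "percentage of pixels" means selecting the highest-valued heat-map pixels:

```python
import numpy as np

def heatmap_iou(heatmap, roi_mask, top_percent):
    """IoU between the top-k% heat-map pixels and a binary ROI mask."""
    k = max(1, int(heatmap.size * top_percent / 100.0))
    # Treat the k highest-valued pixels as the binarized heat map
    thresh = np.partition(heatmap.ravel(), -k)[-k]
    pred = heatmap >= thresh
    inter = np.logical_and(pred, roi_mask).sum()
    union = np.logical_or(pred, roi_mask).sum()
    return inter / union if union else 0.0

# Toy example: heat map concentrated exactly on the ROI
heat = np.zeros((4, 4))
heat[1:3, 1:3] = 1.0
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True
iou = heatmap_iou(heat, roi, top_percent=25)  # top 25% of 16 pixels = 4 pixels
```

Sweeping `top_percent` from 10 to 80 reproduces the x-axis of the curves in the figure.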
Fig. 2 Visual explanations provided by the multi-input VGG16 model with Grad-CAM visualization (2nd column from the right) and the proposed model (far-right column) on a representative MCI test sample (2nd column from the left). For the clock image (1st row), our model highlights the hands of the clock, which read 12:55 instead of 11:10. For the cube-copying image (2nd row), our model highlights unusual strokes more clearly. For the trail-making test (last row), our model focuses on the paths that should not have been drawn (2-3, B-4, and C-D), while the multi-input VGG16 model with Grad-CAM failed to highlight some of those paths (B-4). Note that the red arrow and asterisks were not drawn by the subjects but were added here to aid the descriptions
Fig. 3Overview of our proposed multi-input Conv-Att model. Our model simultaneously takes clock drawing, cube-copying, and trail-making images as its inputs and processes them using a cascade of CNNs and a stack of self-attention layers
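The architecture in Fig. 3 fuses per-task CNN embeddings with self-attention. The sketch below illustrates only the fusion step: three vectors standing in for the clock, cube, and trail embeddings are treated as a three-token sequence and mixed by scaled dot-product self-attention. All shapes, weights, and the mean-pooling readout are hypothetical stand-ins, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax: each token attends over all three task tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

d = 8
# Stand-ins for the CNN embeddings of the three drawings (random for illustration)
clock, cube, trail = rng.standard_normal((3, d))
tokens = np.stack([clock, cube, trail])        # sequence of 3 task tokens
wq, wk, wv = rng.standard_normal((3, d, d)) * 0.1
attended = self_attention(tokens, wq, wk, wv)  # shape (3, d)
pooled = attended.mean(axis=0)                 # one fused representation
```

The attention weights let each task token borrow evidence from the other two drawings before the pooled representation is classified, which is the intuition behind combining complementary tasks.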