Maxim Signaevsky, Bahram Marami, Marcel Prastawa, Nabil Tabish, Megan A Iida, Xiang Fu Zhang, Mary Sawyer, Israel Duran, Daniel G Koenigsberg, Clare H Bryce, Lana M Chahine, Brit Mollenhauer, Sherri Mosovsky, Lindsey Riley, Kuldip D Dave, Jamie Eberling, Chris S Coffey, Charles H Adler, Geidy E Serrano, Charles L White, John Koll, Gerardo Fernandez, Jack Zeineh, Carlos Cordon-Cardo, Thomas G Beach, John F Crary.
Abstract
The diagnosis of Parkinson's disease (PD) is challenging at all stages due to variable symptomatology, comorbidities, and mimicking conditions. Postmortem assessment remains the gold standard for a definitive diagnosis. While it is well recognized that PD manifests pathologically in the central nervous system with aggregation of α-synuclein as Lewy bodies and neurites, similar Lewy-type synucleinopathy (LTS) is additionally found in the peripheral nervous system that may be useful as an antemortem biomarker. We have previously found that detection of LTS in submandibular gland (SMG) biopsies is sensitive and specific for advanced PD; however, the sensitivity is suboptimal especially for early-stage disease. Further, visual microscopic assessment of biopsies by a neuropathologist to identify LTS is impractical for large-scale adoption. Here, we trained and validated a convolutional neural network (CNN) for detection of LTS on 283 digital whole slide images (WSI) from 95 unique SMG biopsies. A total of 8,450 LTS and 35,066 background objects were annotated following an inter-rater reliability study with Fleiss Kappa = 0.72. We used transfer learning to train a CNN model to classify image patches (151 × 151 pixels at 20× magnification) with and without the presence of LTS objects. The trained CNN model showed the following performance on image patches: sensitivity: 0.99, specificity: 0.99, precision: 0.81, accuracy: 0.99, and F-1 score: 0.89. We further tested the trained network on 1230 naïve WSI from the same cohort of research subjects comprising 42 PD patients and 14 controls. Logistic regression models trained on features engineered from the CNN predictions on the WSI resulted in sensitivity: 0.71, specificity: 0.65, precision: 0.86, accuracy: 0.69, and F-1 score: 0.76 in predicting clinical PD status, and 0.64 accuracy in predicting PD stage, outperforming expert neuropathologist LTS density scoring in terms of sensitivity but not specificity. 
These findings demonstrate the practical utility of a CNN detector in screening for LTS, which can translate into a computational tool to facilitate the antemortem tissue-based diagnosis of PD in clinical settings.
Keywords: Artificial intelligence; Convolutional neural network; Deep learning; Machine learning; Parkinson’s disease; Peripheral biopsy; Submandibular gland; Synucleinopathy; Whole slide image
Year: 2022 PMID: 35164870 PMCID: PMC8842941 DOI: 10.1186/s40478-022-01318-7
Source DB: PubMed Journal: Acta Neuropathol Commun ISSN: 2051-5960 Impact factor: 7.801
Subject data
| | Control | Parkinson disease, total | Parkinson disease, early stage | Parkinson disease, moderate stage | Parkinson disease, advanced stage |
|---|---|---|---|---|---|
| n | 14 (8/6) | 42 (30/12) | 15 (13/2) | 13 (8/5) | 14 (9/5) |
| Age (yr)* | 62.5 ± 6.5 | 64.5 ± 9.2 | 63.8 ± 10.5 | 58.8 ± 6.6 | 70.5 ± 5.9 |
| Disease duration (mo) | – | 61.1 ± 64.4 | 10.5 ± 7.2 | 44.8 ± 17.8 | 130.4 ± 56.2 |
| MDS-UPDRS* | |||||
| Part I | 2.6 ± 2.4 | 8.0 ± 5.3 | 8.1 ± 6.0 | 7.1 ± 4.5 | 8.6 ± 5.7 |
| Part II | 0.1 ± 0.4 | 10.1 ± 6.4 | 8.9 ± 5.8 | 8.7 ± 4.0 | 12.7 ± 8.2 |
| Part III | 1.1 ± 2.7 | 26.1 ± 12.0 | 20.1 ± 9.9 | 27.1 ± 11.8 | 31.9 ± 12.1 |
| Total | 3.8 ± 4.0 | 44.0 ± 18.8 | 37.2 ± 17.7 | 42.9 ± 14.9 | 52.9 ± 21.1 |
MDS-UPDRS Movement Disorders Society Unified Parkinson Disease Rating Scale
*Mean ± SD
Fig. 1. LTS object scoring and confidence ranking. Illustration of scoring density: sparse (A), moderate (B), and frequent (C). Submandibular gland biopsies immunohistochemically stained for α-synuclein, with examples of sparse (D), moderate (E), and frequent (F) density. Examples of confidence ranking: definite (G), probable (H), and possible (I). LTS can be visualized as single or multiple immunopositive normal or misshapen axonal profiles in the submandibular gland parenchyma, nerve fascicles, or adjacent to blood vessels. LTS, Lewy-type synucleinopathy
Fig. 2. Schematic overview of data annotation and deep learning pipeline. LTS are annotated using WSI. The CNN was trained to classify image patches containing LTS from other image patches including tissue, artifacts and background. Different weights were used for the annotated objects while training using the cross entropy loss function for the final network. Image patches are extracted for network training that generates pixel-wise segmentations for LTS and background. Performance is determined using a separate novel set of images (test set) by comparing expert annotation with the trained network. The resulting trained network is further deployed on the naive WSI dataset for assessment of the predictive power of clinical outcomes. LTS, Lewy-type synucleinopathy; WSI, whole slide image; CNN, convolutional neural network
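The caption states that annotated objects received different weights under the cross entropy loss. A minimal Python sketch of one way such weighting can work is given here, assuming per-sample weights and a two-class patch classifier (0 = background, 1 = LTS). The function name `weighted_cross_entropy`, the probabilities, and the weight values are all hypothetical and are not taken from the study.

```python
import math

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-sample cross entropy.

    probs   -- per-patch lists of predicted class probabilities
    labels  -- true class index per patch (0 = background, 1 = LTS)
    weights -- per-sample weights (hypothetical values, an assumption)
    """
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        # Cross entropy for one sample is -log of the probability
        # assigned to the true class, scaled by that sample's weight.
        total += w * -math.log(p[y])
    return total / sum(weights)

# Hypothetical predictions for three patches.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]

# Equal weights versus up-weighting the poorly predicted third patch.
uniform = weighted_cross_entropy(probs, labels, [1.0, 1.0, 1.0])
upweighted = weighted_cross_entropy(probs, labels, [1.0, 1.0, 3.0])
```

Raising the weight on a poorly predicted sample increases its share of the loss, so training is pushed harder to correct it.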
Fig. 3. Examples of CNN deployment on SMG biopsy WSI. A An example of annotated objects (blue) and CNN inference (grey shading). A 40 × 40 pixel inference patch is shown. B An example of true positive CNN classification. C An example of false positive CNN classification. CNN, convolutional neural network; SMG, submandibular gland; LTS, Lewy-type synucleinopathy; WSI, whole slide image
Performance of non-weighted and weighted CNN LTS detectors
| | Sensitivity/recall | Specificity | Precision | F1 score* | Accuracy | AUC** |
|---|---|---|---|---|---|---|
| Non-weighted | 0.99 | 0.92 | 0.41 | 0.59 | 0.92 | 0.96 |
| Weighted | 0.99 | 0.99 | 0.81 | 0.89 | 0.99 | 0.99 |
*F1 score = 2 * Precision * Recall/(Precision + Recall)
**AUC: Area Under the Curve, Receiver Operating Characteristic
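As a worked check of the footnoted F1 formula, the following Python sketch computes the patch-level metrics from confusion-matrix counts and recomputes F1 from the rounded precision and recall of the weighted detector. The counts passed to `patch_metrics` are hypothetical, chosen only for illustration, and the function names are not from the source.

```python
def patch_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

def f1_score(precision, recall):
    """F1 = 2 * Precision * Recall / (Precision + Recall)."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 100 LTS patches and 2000 background patches.
metrics = patch_metrics(tp=99, fp=20, tn=1980, fn=1)

# F1 recomputed from the rounded values reported for the weighted detector.
weighted_f1 = round(f1_score(0.81, 0.99), 2)
```

With background patches far outnumbering LTS patches, even a small false positive rate lowers precision noticeably, which is consistent with precision being the metric that differs most between the two detectors.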
Fig. 4. Comparison of ground truth annotations and CNN detection with expert scoring. A Expert annotation distribution boxplot in test WSI cohort (n = 56), Mann–Whitney two-tailed U test between score groups p values; Kruskal–Wallis H test of annotated LTS between expert score groups, and Spearman correlation between LTS burden and expert scores. B CNN, 40 × 40 patches positive for LTS distribution boxplot in test WSI cohort (n = 56), Mann–Whitney two-tailed U test between score groups p values; Kruskal–Wallis H test of 40 × 40 patches between expert score groups, and Spearman correlation between 40 × 40 patches positive for LTS burden and a score in test cohort. Scoring was performed as follows: each WSI was classified as positive or negative for α-synuclein pathology and assigned an LTS score ranging from 0 to 3 (Fig. 1), where 0 refers to being negative for α-synuclein, and scores 1–3 refer to scoring density of sparse (1), moderate (2), and frequent (3). CNN, convolutional neural network; SMG, submandibular gland; LTS, Lewy-type synucleinopathy
Fig. 5. Examples of CNN deployment on SMG biopsy WSI. A An example of a WSI. B, C LTS patch detection and feature generation (e.g., patch clustering, in color). CNN, convolutional neural network; SMG, submandibular gland; LTS, Lewy-type synucleinopathy
Prediction of Parkinson disease status, logistic regression, n = 56 (14 controls and 42 PD), Mean ± SD
| | Sensitivity/recall | Specificity | Precision | F1 score* | Accuracy |
|---|---|---|---|---|---|
| 13 AI features altogether | 0.71 ± 0.16 | 0.65 ± 0.30 | 0.86 ± 0.13 | 0.76 ± 0.12 | 0.69 ± 0.13 |
| Expert score and its derivatives | 0.59 ± 0.16 | 0.88 ± 0.24 | 0.94 ± 0.13 | 0.71 ± 0.13 | 0.66 ± 0.13 |
| Expert score | 0.54 ± 0.16 | 0.93** ± 0.17 | 0.96 ± 0.10 | 0.68 ± 0.14 | 0.64 ± 0.13 |
| Expert score derivatives | 0.60 ± 0.16 | 0.89 ± 0.23 | 0.94 ± 0.11 | 0.72 ± 0.13 | 0.67 ± 0.13 |
*F1 score (the harmonic mean of precision and recall) is calculated as 2 * Precision * Recall/(Precision + Recall)
**Expert score specificity is the same as reported previously [6]
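The table above reports logistic regression on features derived from the CNN output. The sketch below fits a one-feature logistic regression by gradient descent in plain Python to show the general technique. It is not the study's model: the feature values, labels, learning rate, and step count are hypothetical, and the source does not describe its fitting procedure or validation scheme.

```python
import math

def standardize(xs):
    """Center to zero mean and scale to unit standard deviation."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / sd for x in xs]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Gradient descent on the mean logistic loss for one feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical per-subject feature values and status (1 = PD, 0 = control).
feature = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
status = [0, 0, 0, 1, 1, 1]

z = standardize(feature)
w, b = fit_logistic(z, status)
predicted = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in z]
```

In practice such a model would be evaluated on held-out subjects, and the predictions would feed the sensitivity, specificity, precision, F1, and accuracy columns shown in the table.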
Accuracy of prediction of clinical Parkinson disease stage by the weighted CNN*
| | Stages 0–3 | Stages 1–3 |
|---|---|---|
| Five AI features | 0.48 ± 0.14 | 0.64 ± 0.16 |
| Expert scoring and its derivatives | 0.45 ± 0.14 | 0.59 ± 0.17 |
*Mean ± SD
Parkinson disease stages: early [1], moderate [2], advanced [3], and controls [0]
Selected highest ranking AI features correlations with expert scoring, Spearman's rank correlation coefficient Rho
| AI features | Per slide score | | Per subject score | |
|---|---|---|---|---|
| LTS cluster size SD | ||||
| PreciseDx Graph Feature18 | ||||
| StDev of Hematoxylin Channel | ||||
| PreciseDx Graph Feature5 | ||||
| PreciseDx Graph Feature29 | ||||
| PreciseDx Graph Feature11 | 0.15 | 0.269 | ||
| Davies-Bouldin Index | ||||
| PreciseDx Graph Feature13 | ||||
| C Index | 0.28 | 0.035 | ||
| PreciseDx Graph Feature28 | ||||
| PreciseDx Graph Feature14 | ||||
| PreciseDx Graph Feature10 | 0.04 | 0.159 | − 0.22 | 0.106 |
| Calinski-Harabasz index | ||||
| PreciseDx Graph Feature3 | ||||
| PreciseDx Graph Feature27 | ||||
Statistically significant values are shown in bold
LTS Lewy type synucleinopathy
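For readers unfamiliar with the statistic in this table, Spearman's rho is the Pearson correlation computed on ranks, with tied values sharing their average rank. The Python sketch below illustrates this with hypothetical feature values and expert scores on the 0 to 3 scale. The helper names are illustrative and none of the numbers come from the study.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical AI feature values and expert scores (0 to 3) for four slides.
feature_values = [1.0, 2.0, 3.0, 4.0]
expert_scores = [0, 1, 1, 3]
rho = spearman_rho(feature_values, expert_scores)
```

Because expert scores take only four values, ties are common, which is why the rank step averages positions rather than assigning arbitrary distinct ranks.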
Selected highest ranking AI feature differences between Parkinson disease and controls (Mann–Whitney U test), and between stages (Kruskal–Wallis test)
| AI features | Mann–Whitney U test, U | Mann–Whitney U test, p |
|---|---|---|
| PreciseDx Graph Feature27 | ||
| LTS cluster size StDev | ||
| PreciseDx Graph Feature3 | ||
| PreciseDx Graph Feature9 | ||
| StDev of Hematoxylin Channel | ||
| PreciseDx Graph Feature14 | ||
| Calinski-Harabasz index | ||
| PreciseDx Graph Feature13 | ||
| PreciseDx Graph Feature28 | 208.5 | 0.056 |
| Davies-Bouldin Index | 204.0 | 0.088 |
| PreciseDx Graph Feature11 | 249.0 | 0.338 |
| PreciseDx Graph Feature10 | 268.0 | 0.419 |
| C index | 266.5 | 0.601 |
| LTS pixel fraction | ||
| StDev of DAB channel (brown) | ||
| Number of LTS patches | ||
Statistically significant values are shown in bold
Controls, n = 14; PD patients, n = 42; PD early stage patients, n = 15; PD moderate stage patients, n = 13; PD advanced stage patients, n = 14
LTS Lewy type synucleinopathy
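The U statistic in this table can be read as a count over all control/patient pairs. The Python sketch below computes it directly, counting a tie as half a pair, which is how half-integer values such as 208.5 arise. With 14 controls and 42 PD patients there are 14 × 42 = 588 pairs, so the two U values sum to 588. The sample values are hypothetical, and the sketch does not compute the p value or indicate which of the two U values the authors reported.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two samples.

    U_a counts pairs where the value from `a` exceeds the value
    from `b`, with ties counted as 0.5. U_a + U_b = len(a) * len(b).
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

# Hypothetical feature values for a few controls and PD patients.
controls = [0.1, 0.2, 0.2]
patients = [0.2, 0.5, 0.7, 0.9]
u_controls, u_patients = mann_whitney_u(controls, patients)
```

A U value near half the number of pairs indicates heavily overlapping distributions, while a value near zero or near the maximum indicates strong separation between the groups.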
Selected highest ranking AI features correlation with UPDRS score, CSF biochemistry, and dopamine transporter SPECT, Pearson r
| AI features | MDS-UPDRS Part III | | MDS-UPDRS total | | DAT-SBR* | | CSF Synuclein | |
|---|---|---|---|---|---|---|---|---|
| LTS cluster size StDev | 0.010 | − | 0.08 | 0.556 | ||||
| PreciseDx Graph Feature18 | 0.003 | − | 0.14 | 0.303 | ||||
| StDev of Hematoxylin Channel | 0.16 | 0.259 | 0.14 | 0.312 | − | 0.13 | 0.369 | |
| PreciseDx Graph Feature5 | 0.009 | − | 0.728 | |||||
| PreciseDx Graph Feature29 | 0.004 | − | 0.09 | 0.504 | ||||
| PreciseDx Graph Feature11 | − 0.07 | 0.609 | − 0.13 | 0.361 | 0.01 | 0.928 | 0.19 | 0.173 |
| Davies-Bouldin Index | 0.09 | 0.510 | 0.08 | 0.547 | − | 0.025 | 0.25 | 0.066 |
| PreciseDx Graph Feature13 | 0.044 | 0.23 | 0.086 | − | 0.11 | 0.454 | ||
| C Index | − 0.16 | 0.259 | − 0.14 | 0.318 | 0.06 | 0.671 | 0.07 | 0.616 |
| PreciseDx Graph Feature28 | 0.022 | − | 0.12 | 0.408 | ||||
| PreciseDx Graph Feature14 | 0.016 | − | − 0.14 | 0.333 | ||||
| PreciseDx Graph Feature10 | − 0.11 | 0.443 | − 0.11 | 0.444 | 0.11 | 0.423 | 0.049 | |
| Calinski-Harabasz index | 0.021 | 0.044 | − | 0.07 | 0.616 | |||
| PreciseDx Graph Feature3 | 0.001 | − | 0.07 | 0.625 | ||||
| PreciseDx Graph Feature27 | 0.006 | 0.011 | − | 0.07 | 0.629 | |||
Statistically significant values are shown in bold
LTS Lewy type synucleinopathy, MDS-UPDRS Movement Disorders Society Unified Parkinson Disease Rating Scale
*DAT-SBR, dopamine transporter mean striatum specific binding ratio