Anthony R Vega, Rati Chkheidze, Vipul Jarmale, Ping Shang, Chan Foong, Marc I Diamond, Charles L White, Satwik Rajaram.
Abstract
Although pathology of tauopathies is characterized by abnormal tau protein aggregation in both gray and white matter regions of the brain, neuropathological investigations have generally focused on abnormalities in the cerebral cortex because the canonical aggregates that form the diagnostic criteria for these disorders predominate there. This corticocentric focus tends to deemphasize the relevance of the more complex white matter pathologies, which remain less well characterized and understood. We took a data-driven machine-learning approach to identify novel disease-specific morphologic signatures of white matter aggregates in three tauopathies: Alzheimer disease (AD), progressive supranuclear palsy (PSP), and corticobasal degeneration (CBD). We developed automated approaches using whole slide images of tau immunostained sections from 49 human autopsy brains (16 AD, 13 CBD, 20 PSP) to identify cortex/white matter regions and individual tau aggregates, and compared tau-aggregate morphology across these diseases. Tau burden in the gray and white matter for individual subjects strongly correlated in a highly disease-specific fashion. We discovered previously unrecognized tau morphologies for AD, CBD and PSP that may be of importance in disease classification. Intriguingly, our models classified diseases equally well based on either white or gray matter tau staining. Our results suggest that tau pathology in white matter is informative, disease-specific, and linked to gray matter pathology. Machine learning has the potential to reveal latent information in histologic images that may represent previously unrecognized patterns of neuropathology, and additional studies of tau pathology in white matter could improve diagnostic accuracy.
Keywords: Histopathology; Machine learning; Tauopathy; White matter
Year: 2021 PMID: 34674762 PMCID: PMC8529809 DOI: 10.1186/s40478-021-01271-x
Source DB: PubMed Journal: Acta Neuropathol Commun ISSN: 2051-5960 Impact factor: 7.801
Fig. 1 Schematic of analysis workflow. a Example images of AT8-stained WSI from AD, PSP, and CBD patients that form the basis of our analysis. b Pathologist annotations of cortex (cyan), white matter (magenta), and background (no color) regions were used to train a deep-learning model to segment these regions in WSI. c Characterization of white matter aggregates: a pathologist-trained deep-learning model was used to segment aggregates in the white matter, and for each aggregate, multiple features characterizing its size and shape were extracted. We then performed unsupervised analyses to test whether white matter aggregates in WSI from the same disease were more similar than those from different diseases. d Disease classification based on cortex and white matter: separate deep-learning models were trained for the cortex and white matter to predict disease status directly from image patches (without need for any human curation of features), and the performance of these two models was compared.
Fig. 2 DL accurately identifies tissue regions and AT8-stained aggregates. For both region and aggregate models, a cross-validation scheme was used to generate multiple models, each trained on part of the data and evaluated on the rest. We report performance (left column) using confusion matrices, which show how image areas with known true labels (rows) are assigned to different predicted classes (columns). Values denote the fraction of patches/aggregates of each true class assigned to each predicted class (i.e., rows add up to one), averaged across cross-validation models, with standard deviations shown in parentheses. Right column shows sample classification results. Note: BG = Background. a Region segmentation: threefold cross-validation was used, and performance is measured in terms of the fraction of image patches from a region that are correctly classified (diagonal entries in dark blue). b Aggregate segmentation: twofold cross-validation was used, and performance is measured in terms of the fraction of pixels that are correctly classified.
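The reporting scheme in this caption (row-normalized confusion matrices averaged across cross-validation models, with standard deviations) can be illustrated with a minimal sketch. The counts below are hypothetical and not taken from the paper, the function names are invented for illustration, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev

def row_normalize(counts):
    """Turn a confusion matrix of counts into per-row fractions (each row sums to 1)."""
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized

def summarize_folds(fold_counts):
    """Return (mean, std) matrices of the row-normalized confusion matrices across folds."""
    mats = [row_normalize(m) for m in fold_counts]
    n = len(mats[0])
    mean_m = [[mean([m[i][j] for m in mats]) for j in range(n)] for i in range(n)]
    # Assumption: sample standard deviation across folds.
    std_m = [[stdev([m[i][j] for m in mats]) for j in range(n)] for i in range(n)]
    return mean_m, std_m

# Hypothetical patch counts for three folds; rows = true class, columns = predicted class.
# Class order (illustrative): cortex, white matter, background.
folds = [
    [[90, 8, 2], [5, 90, 5], [1, 4, 95]],
    [[85, 10, 5], [10, 85, 5], [2, 3, 95]],
    [[95, 4, 1], [6, 92, 2], [0, 5, 95]],
]
mean_m, std_m = summarize_folds(folds)
for mean_row, std_row in zip(mean_m, std_m):
    print("  ".join(f"{m:.2f} ({s:.2f})" for m, s in zip(mean_row, std_row)))
```

Normalizing each fold before averaging keeps every fold's contribution equal regardless of how many patches it happened to contain.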
Fig. 3 Tau burden is disease specific. Scatter plot of tau burden in cortex (x-axis) vs WM (y-axis) regions across multiple WSI (individual data points) from different diseases (point colors). Tau burden in a region was estimated as the ratio of the area covered by tau aggregates (from the aggregate classifier) to the total area of the region (WM/cortex from the region classifier). Data from each disease are fit with a least-squares straight line, and the best-fit slope is shown next to the fitted line.
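As a rough illustration of the two quantities in this caption, the sketch below computes tau burden as an area ratio and fits an ordinary least-squares line to cortex vs WM burden. All numbers are hypothetical, the function names are invented for illustration, and fitting with an intercept is an assumption, since the caption only states that a straight-line fit was used.

```python
def tau_burden(aggregate_area, region_area):
    """Fraction of a region's area covered by tau aggregates."""
    return aggregate_area / region_area

def least_squares_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical slides from one disease: (aggregate area, region area) in arbitrary units.
cortex = [(2.0, 100.0), (4.0, 100.0), (6.0, 100.0)]
white_matter = [(1.0, 100.0), (2.0, 100.0), (3.0, 100.0)]

cortex_burden = [tau_burden(a, r) for a, r in cortex]
wm_burden = [tau_burden(a, r) for a, r in white_matter]

slope, intercept = least_squares_line(cortex_burden, wm_burden)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Running the fit once per disease group would give the per-disease slopes that the figure displays next to each line.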
Fig. 4 Aggregates from different diseases show distinct shapes and sizes. Individual aggregates were identified in the WM of WSI and were characterized by features describing their size and shape. a Example image patch (left) from a WSI with individual aggregates colored based on sample features: area, eccentricity, and minor axis length. b Individual WSI (rows) were characterized based on median feature values (columns) of aggregates in their WM and were ordered based on hierarchical clustering. Values within each feature were z-score normalized to allow comparison across features. Colors on top (red/blue/green) indicate the disease associated with each WSI. c Boxplots of median feature values for area, eccentricity, and minor axis length in WM regions of WSI (gray dots), compared across diseases. A Mann–Whitney test with Bonferroni correction for multiple-hypothesis testing (across diseases) was used for statistical comparison.
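The per-slide summary described in panel b (median feature values per WSI, then z-score normalization within each feature) can be sketched as follows. The slide names and feature values are hypothetical, the function names are invented for illustration, and the population standard deviation is an assumption; the hierarchical clustering step is not shown.

```python
from statistics import mean, median, pstdev

FEATURES = ["area", "eccentricity"]  # illustrative subset of the size/shape features

def wsi_median_features(aggregates, feature_names):
    """Summarize one slide by the median of each feature over its WM aggregates."""
    return [median([agg[name] for agg in aggregates]) for name in feature_names]

def zscore_columns(rows):
    """Z-score each feature (column) across slides (rows)."""
    columns = list(zip(*rows))
    # Assumption: population standard deviation; a constant column maps to zeros.
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        [(value - m) / s if s else 0.0 for value, (m, s) in zip(row, stats)]
        for row in rows
    ]

# Hypothetical aggregates per slide.
slides = {
    "slide_A": [{"area": 10, "eccentricity": 0.5}, {"area": 20, "eccentricity": 0.7},
                {"area": 30, "eccentricity": 0.9}],
    "slide_B": [{"area": 35, "eccentricity": 0.4}, {"area": 40, "eccentricity": 0.6},
                {"area": 45, "eccentricity": 0.8}],
    "slide_C": [{"area": 50, "eccentricity": 0.7}, {"area": 60, "eccentricity": 0.8},
                {"area": 70, "eccentricity": 0.9}],
}

medians = [wsi_median_features(aggs, FEATURES) for aggs in slides.values()]
z_scores = zscore_columns(medians)
for name, row in zip(slides, z_scores):
    print(name, [round(v, 2) for v in row])
```

Z-scoring per feature puts quantities with very different units, such as area and eccentricity, on a common scale before the slides are compared.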
Fig. 5 Disease classification from DL. a Overview of our DL approach in which cortex/WM regions are used to train two separate DL models for disease classification at an image patch level. b Confusion matrices comparing the average patch-level classification accuracy using cortex (left) or WM (right) data.
Fig. 6 Interpretation of WM pathological features used for disease classification. a UMAP visualization of image patches clustered based on their similarity as perceived by the automated disease classifier (based on the 1024-dimensional output of the penultimate layer of the model; “Methods”). Data points are colored based on their ground truth disease label, and representative images from different areas within clusters are displayed. b The same UMAP visualization as in panel a, with data points instead colored by the average feature values (across aggregates in the corresponding image patch) for area, eccentricity, and minor axis length.