AI-enabled in silico immunohistochemical characterization for Alzheimer's disease.

Bryan He¹, Syed Bukhari², Edward Fox², Abubakar Abid³, Jeanne Shen², Claudia Kawas⁴, Maria Corrada⁴, Thomas Montine², James Zou¹,²,⁵,⁶.

Abstract

We develop a deep learning approach, in silico immunohistochemistry (IHC), which takes routinely collected histochemical-stained samples as input and computationally generates virtual IHC slide images. We apply in silico IHC to Alzheimer's disease samples, where several hallmark changes are conventionally identified using IHC staining across many regions of the brain. In silico IHC computationally identifies neurofibrillary tangles, β-amyloid plaques, and neuritic plaques at a high spatial resolution directly from the histochemical images, with areas under the receiver operating characteristic curve of between 0.88 and 0.92. In silico IHC learns to identify subtle cellular morphologies associated with these lesions and can generate in silico IHC slides that capture key features of the actual IHC.

Keywords: Alzheimer's disease; amyloid plaque; deep learning; immunohistochemistry; machine learning; neuritic plaque; neurofibrillary tangle

Year: 2022 PMID: 35497493 PMCID: PMC9046239 DOI: 10.1016/j.crmeth.2022.100191

Source DB: PubMed Journal: Cell Rep Methods ISSN: 2667-2375

Introduction

Cellular morphology is closely linked to tissue function and disease diagnosis. A common tool in pathology for assisting with disease diagnosis is immunohistochemical (IHC) staining, which is used to identify specific proteins of interest in a tissue. In this work, we propose to use deep learning to computationally generate in silico IHC staining. We demonstrate that deep learning algorithms can identify subtle features in cellular morphology that are associated with diseases and previously required IHC to visualize. These results also open the door for computational approaches to potentially reduce the need for time-consuming or expensive experimental IHC staining.

In the standard workflow, tissue samples collected for pathological diagnosis have a section prepared with a hematoxylin and eosin (H&E) stain for general histologic assessment. Specialized IHC stains are additionally applied to other sections of the same tissue to identify structures or specific molecules that are difficult to directly observe in the H&E-stained sample. These IHC-stained slides are commonly studied by eye under a microscope. Digitization of these slides into whole-slide images (WSIs) now allows for computational assistance with evaluating slides (Fu et al., 2020; He et al., 2020; Kather et al., 2020). Performing all necessary IHC stains on a sample can cost hundreds of dollars and take several days, both of which could be avoided by using in silico IHC. Additionally, the computationally generated stain would allow the IHC to be run on the same section of tissue as the original H&E-stained slide, rather than a different section, and would remove artifacts that appear on the real IHC stains. In addition to these advantages in the diagnostic setting, in silico IHC has the potential to make major contributions to genomic research that relies on IHC-generated phenotypes. 
For example, large genetic association studies of Alzheimer's disease (AD) neuropathologic endophenotypes have been severely limited by the lack of IHC data on research autopsy brains (Beecham et al., 2014).

Recent advances in deep learning have produced increasingly accurate image recognition models (Deng et al., 2009; Krizhevsky et al., 2012; Huang et al., 2017). These advances have resulted in deep learning being applied across medical fields (Esteva et al., 2017; Ouyang et al., 2020). Within pathology, deep learning has been used to classify disease subtypes and predict mutations (Campanella et al., 2019; Fu et al., 2020; Kather et al., 2020; Lu et al., 2021) and to interpret IHC stains (Signaevsky et al., 2019; Koga et al., 2021). Combined with spatial transcriptomics, deep learning has also been used to link cell morphologic features with localized gene expression (He et al., 2020; Levy-Jurgenson et al., 2020). Finally, deep learning has been used to transform unstained samples into virtual H&E stains (Rivenson et al., 2019) and to label cellular constituents, such as the nuclei and membrane, from microscopy images (Christiansen et al., 2018).

H&E and IHC are not commonly prepared on the same tissue section, making supervised learning more difficult. As a result, computationally generating IHC stains directly from H&E images has been less explored. Studies that have used IHC slides have typically focused on IHC targeting specific cell types (Xu et al., 2016; Sharma et al., 2017; Jackson et al., 2020; Liu et al., 2021), such as neoplastic or necrotic cells, which are more visually distinct on H&E slides than the AD lesions we studied.

We present in silico IHC, a system that computationally generates IHC stains from histochemically stained slides (Figure 1). As a proof of concept, we apply in silico IHC to AD. 
The brains of patients with AD have several hallmark neuropathologic lesions: β-amyloid (Aβ) plaques, neurofibrillary tangles (NFTs), and neuritic plaques (NPs) (Hyman et al., 2012; Montine et al., 2012). These hallmark changes typically occur in specific regions of the brain before the onset of cognitive impairment, and then increase in density and distribution as the disease advances. IHC staining is used to highlight instances of each of these hallmark changes. IHC for Aβ is used to highlight instances of Aβ plaques, and IHC for pathologic forms of tau (often collectively called phospho-tau) is used to highlight instances of NFTs and NPs, which can be differentiated via visual inspection. The IHC assessments are used in consensus pathologic evaluation to determine the regional distribution of Aβ plaques, regional distribution of NFTs, and regional density of NPs. Together, these form the basis of the current National Institute on Aging–Alzheimer's Association consensus guidelines for the neuropathologic assessment of AD (Hyman et al., 2012; Montine et al., 2012).

Figure 1

In silico IHC

(A) Procedure for in silico staining an H&E-LFB WSI. The H&E-LFB image is divided into patches, which are given to the trained neural network to separately predict the presence of each hallmark change, which are extracted from the corresponding IHC patch. The predictions are then combined into a synthetic IHC image. The corresponding real IHC-stained slide is shown for comparison.

(B) Example showing the registration process, which is used to provide supervision to train in silico IHC. Key points from each of the sections are detected, and the matching key points are indicated by lines.

To train and evaluate our system, we collected a dataset of brain autopsies from a total of 160 patients, comprising 704 samples from different regions of the brain. Within a single patient, the presence of hallmark changes may vary from region to region, resulting in the need for multiple samples for each patient. Each sample is divided into sections for staining. One section from each sample is stained with H&E combined with Luxol fast blue (LFB), which is commonly used to highlight the white matter in the brain. This combined stain is referred to as H&E-LFB. Additionally, separate sections from the sample are prepared with Aβ and pathologic tau IHC stains. The regions and stains used in our study follow the recommendations of the National Institute on Aging–Alzheimer's Association guidelines for assessing neuropathologic change in AD (Hyman et al., 2012; Montine et al., 2012). We then divide the dataset into separate training, validation, and test sets by patient.

Training a deep learning model requires a dataset consisting of pairs of input and expected output (e.g., H&E-LFB images and presence of each hallmark change). Typically, the expected output for the dataset is generated by manual annotation (Litjens et al., 2018) or by combining slide-level annotations and multiple-instance learning (Campanella et al., 2019; Lu et al., 2021). 
However, we can computationally align the serial slides prepared with different stains to provide annotation for the H&E-LFB images. This approach reduces the need for manual annotation. After training, in silico IHC achieved areas under the receiver operating characteristic curve (AUROCs) of 0.91 (95% CI, 0.88–0.95) for classifying the presence of NFTs, 0.92 (95% CI, 0.87–0.94) for neuritic plaques, and 0.88 (95% CI, 0.82–0.93) for Aβ plaques on the held-out test set.

Results

Data curation

Our dataset consists of autopsied brains from 160 patients. Each brain is divided into several regions of interest, resulting in 704 samples collected from multiple regions of the brain (Figure 2A). The tissue sample from each region is prepared into a formalin-fixed paraffin-embedded block. These blocks are cut into serial sections of 5 μm thickness, with one slide prepared with H&E-LFB staining, along with at least one of pathologic tau and Aβ staining. We split the dataset into a training set of 91 patients, a validation set of 20 patients, and a test set of 19 patients (Figure 2B). As an additional test, data for 30 consecutive patients were collected at a different time with all sections necessary for a full analysis of the level of neuropathologic change in each brain (Figure 2C).
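
The patient-level split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the patient IDs, region codes, random seed, and the function names `split_by_patient` and `assign_samples` are all hypothetical. The point it shows is that every sample from a given patient lands in exactly one of the three sets.

```python
import random

def split_by_patient(patient_ids, n_train=91, n_val=20, n_test=19, seed=0):
    """Shuffle unique patient IDs and cut them into disjoint train/val/test groups."""
    ids = sorted(set(patient_ids))
    if len(ids) < n_train + n_val + n_test:
        raise ValueError("not enough patients for the requested split sizes")
    random.Random(seed).shuffle(ids)
    train = set(ids[:n_train])
    val = set(ids[n_train:n_train + n_val])
    test = set(ids[n_train + n_val:n_train + n_val + n_test])
    return train, val, test

def assign_samples(samples, train, val, test):
    """Route each (patient_id, region) sample to the split that owns its patient."""
    out = {"train": [], "val": [], "test": []}
    for patient_id, region in samples:
        if patient_id in train:
            out["train"].append((patient_id, region))
        elif patient_id in val:
            out["val"].append((patient_id, region))
        elif patient_id in test:
            out["test"].append((patient_id, region))
    return out

# Hypothetical data: 130 patients, each with samples from four regions.
patients = [f"P{i:03d}" for i in range(130)]
samples = [(p, r) for p in patients for r in ("AMY", "HIP", "MID", "MF")]
train, val, test = split_by_patient(patients)
splits = assign_samples(samples, train, val, test)
```

Splitting on patient IDs rather than on samples or patches keeps tissue from the same brain out of both the training and test sets.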

Figure 2

Overview of the dataset

(A) WSIs for an example patient. The samples are collected from five regions of the brain (amygdala [AMY], contralateral hippocampus [cHIP], hippocampus [HIP], midbrain [MID], and middle frontal gyrus [MF]). All samples have an H&E-LFB (H) slide available, along with at least one slide of amyloid-Beta (aB) or phosphorylated Tau (T).

(B and C) The number of slides available for each region of the brain and number of slides with positivity for amyloid plaques (AP), NFTs, and neuritic plaques for (B) training and testing the model and (C) additional evaluation (additionally includes samples from the primary visual cortex [PVC], inferior parietal lobule [IPL], and striatum). The samples were prepared for determining the level of AD neuropathologic change, and only the recommended brain regions to be evaluated were stained for most patients (Hyman et al., 2012; Montine et al., 2012).

In silico staining deep learning model

In silico IHC uses a trained neural network that takes an H&E-LFB-stained WSI as input and generates synthetic IHC-stained images predicting the presence of NFTs, Aβ plaques, and neuritic plaques (Figure 1). From the WSI, we select non-overlapping patches of 2,048 × 2,048 pixels, corresponding to 517 μm × 517 μm. Each patch is given to the trained neural network, which makes separate predictions for the probabilities of at least one amyloid plaque, NFT, or neuritic plaque appearing within the patch. The predicted probability for each patch in the WSI is mapped to colors imitating the real IHC. The colored patches are then combined into a synthetic image for each target, which can be used to identify whether the lesions are present, along with their locations.

Training in silico IHC consists of three main steps. First, we register the serial sections of each sample in the dataset so that each IHC WSI is aligned with the corresponding H&E-LFB image as much as possible. Second, we use the IHC WSIs to identify the patches containing NFTs, Aβ plaques, and neuritic plaques (examples in Figure S1). Third, we train a neural network to directly predict the presence of NFTs, Aβ plaques, and neuritic plaques from the H&E-LFB image patches (see the STAR Methods for more details).

To register the serial sections, we use the Oriented FAST and Rotated BRIEF (ORB) feature detector (Rublee et al., 2011) to identify key points in each slide. Matching key points are selected using the RANSAC algorithm (Fischler and Bolles, 1981). The matched key points are then used to overlay the serial sections over each other. To assess the accuracy of the registration, we identified 50 pairs of blood vessels visible in both the H&E and one of the IHC slides. The registration process resulted in these blood vessels being mapped to an average of 224 pixels apart, with an SD of 156 pixels (56 ± 39 μm). 
The average registration error is approximately 10% the width of a patch, so pairs of H&E-LFB and IHC patches are closely related. Next, to identify NFTs, Aβ plaques, and neuritic plaques, we annotated 500 phospho-tau and 500 Aβ IHC-stained patches for the presence of each. Using the annotated dataset, we train one annotator network to identify NFTs and neuritic plaques from the phospho-tau IHC slides and a second annotator network to identify Aβ plaques from the Aβ IHC slides. The annotator networks are then used to label each patch on the real IHC with the hallmark changes that are present. Finally, we combine the registration and IHC quantification to train an end-to-end model for predicting the presence of NFTs, Aβ plaques, and neuritic plaques directly from the H&E-LFB image. To train this model, we use the registered slides to identify paired H&E-LFB patches and IHC patches. The annotator network uses the IHC slides to provide annotations of NFTs, Aβ plaques, and neuritic plaques, which are used as supervision to train in silico IHC.
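
The pixel-to-micrometer arithmetic behind the registration error can be checked with a short calculation. This sketch uses only the figures quoted in the text (2,048-pixel patches spanning 517 μm, a mean error of 224 pixels, an SD of 156 pixels); the helper name `px_to_um` is hypothetical.

```python
PATCH_PX = 2048      # patch side length in pixels (from the text)
PATCH_UM = 517.0     # patch side length in micrometers (from the text)
UM_PER_PX = PATCH_UM / PATCH_PX  # roughly 0.25 um per pixel

def px_to_um(pixels):
    """Convert a distance in pixels to micrometers at the scanned resolution."""
    return pixels * UM_PER_PX

mean_err_um = px_to_um(224)    # mean registration error
sd_err_um = px_to_um(156)      # standard deviation of the error
relative_err = 224 / PATCH_PX  # error as a fraction of the patch width

print(f"{mean_err_um:.1f} um, {sd_err_um:.1f} um, {relative_err:.1%}")
# prints "56.5 um, 39.4 um, 10.9%"
```

These values agree with the reported 56 ± 39 μm up to rounding, and with the statement that the average error is about 10% of the patch width.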

Evaluation of in silico IHC

We run in silico IHC on a held-out test set of 83 samples (examples in Figures 3 and 4) from 19 patients to evaluate its ability to identify regions with NFTs, neuritic plaques, and Aβ plaques. We find that in silico IHC achieves AUROCs of 0.91 (95% CI, 0.88–0.95), 0.92 (95% CI, 0.87–0.94), and 0.88 (95% CI, 0.82–0.93), respectively. As additional verification that our results are not skewed by potential errors in the automated identification of lesions from IHC, we annotated the IHC patches corresponding to 250 random H&E patches in our test set as ground truth. When evaluated on these patches, in silico IHC achieves AUROCs of 0.92 (95% CI, 0.87–0.96), 0.90 (95% CI, 0.84–0.95), and 0.92 (95% CI, 0.84–0.97), respectively.

Figure 3

Results for NFT and neuritic plaque predictions

For visualization, the predictions from in silico IHC for NFTs and neuritic plaques are combined to match the real phospho-tau IHC, which will show positivity for both.

(A and B) AUROCs for individual slides and receiver operating characteristics curve for (A) NFTs and (B) neuritic plaques.

(C) Predictions for a highly stained slide. The model correctly identifies tangles in the gyrus, which are difficult to see on the real IHC at low resolution (inset).

(D) Predictions for a slide with low staining. The model correctly ignores aging-related tau astrogliopathy (ARTAG), which contains the tau protein but is not an NFT (inset).

Figure 4

Results for amyloid plaque predictions

(A) AUROCs for individual slides and receiver operating characteristics curve.

(B) Prediction for a highly stained slide.

(C) Predictions for a negative slide. The model correctly ignores neuromelanin in the H&E-LFB slide, which is stained in the IHC but is not an actual amyloid plaque (inset).

In the H&E-LFB-stained slides, there can be artifacts and other structures unrelated to AD that could be falsely identified as lesions. For example, folds and tears in the tissue are common, and some regions of the brain contain pigments such as neuromelanin and lipofuscin, which may seem to be abnormal. However, we find that in silico IHC can correctly identify that these regions are negative (Figure 4).

To further assess the reliability of our algorithm, we collected H&E-LFB and IHC images from 30 consecutive patients with neurodegenerative diseases. These images were collected at a different time than the data used to train the algorithm and thus constitute a separate test set. We deployed the algorithm without any modification to this new test data. 
The in silico IHC matches well with the experimentally obtained IHC, achieving AUROCs of 0.94 (95% CI, 0.92–0.95), 0.93 (95% CI, 0.91–0.94), and 0.83 (95% CI, 0.80–0.86) for NFTs, neuritic plaques, and Aβ plaques, respectively (Figure S2).

The prevalence of neuropathological hallmark changes varies widely between different areas of the brain. For example, NFTs, neuritic plaques, and Aβ plaques are significantly more common in the hippocampus than in the midbrain, and these lesions almost never appear in white matter. To evaluate the ability of in silico IHC to extract information beyond standard location features, we trained a logistic regression model to predict the appearance of hallmark changes using the region of the brain, the fraction of the patch stained with LFB (which identifies white matter), and the number of nuclei in the patch as features. This model achieves AUROCs of 0.78 (95% CI, 0.75–0.80) for NFTs, 0.80 (95% CI, 0.77–0.82) for neuritic plaques, and 0.78 (95% CI, 0.76–0.80) for Aβ plaques and is significantly outperformed by in silico IHC. This suggests that the computer vision algorithm can leverage more fine-grained morphological features of the tissue neighborhoods in its assessment.

Additionally, we study the choice of neural network architecture used by in silico IHC by comparing against the performance of AlexNet (Krizhevsky et al., 2012), VGG-11 (Simonyan and Zisserman, 2015), and ResNet-18 (He et al., 2016). All models were trained using the same data. We find that in silico IHC outperforms the other choices of neural network architecture (Table S1).

Interpretation of in silico IHC predictions

To better understand the predictions made by our model, we use deep learning interpretation methods to provide attributions. We used the integrated gradients method (Sundararajan et al., 2017), which identifies pixels in an image that the model considers most useful for making a prediction. In Figure 5, we show several examples of attributions for neuritic and Aβ plaques. We additionally show the corresponding IHC stain and find that the attributions match the plaques in location and size, suggesting that in silico IHC has learned to identify individual lesions, despite only being trained on patch-level labels.

Figure 5

Attribution of in silico IHC predictions

(A and B) Raw H&E image, model attribution, and corresponding IHC for (A) phospho-tau and (B) amyloid beta. The circled regions on the left are the regions that the model pays the most attention to in each H&E image. These regions have subtle differences in their morphology and texture that the computer vision model picks up. They match the areas where actual phospho-tau and amyloid beta are located, as identified via experimental IHC (brown blobs on the right).

Discussion

In this work, we introduce a model for translating from standard H&E-LFB-stained neuropathological samples to synthetic phospho-tau and Aβ-stained images. Our model achieves high accuracy for classifying the presence of NFTs, neuritic plaques, and Aβ plaques on independent test samples. Moreover, it significantly outperforms models using hand-crafted features based on information about the nuclei and region of the brain. We additionally use interpretation methods to identify regions that the model considered most relevant for making a classification and found that the regions closely match the areas identified by the real IHC, suggesting that the model has learned fine-grained morphological features of cellular neighborhoods that are indicative of the AD-related plaques and tangles.

A complete analysis of a brain sample for neurodegenerative disease requires IHC-stained slides to be prepared for many regions of the brain, greatly increasing the cost and time needed for preparing the samples, limiting diagnostic workup outside of research settings, and severely limiting large-scale genomic-endophenotype association studies (Beecham et al., 2014; Flanagan et al., 2017). As a result, we focused on the most common neurodegenerative disease as a case study using in silico IHC for translating from routinely and cost-efficiently prepared H&E-LFB-stained slides to the necessary IHC stains. In other tissues and diseases where immunostaining is used, paired histochemical and IHC samples can be used to train deep learning models, decreasing the need for gathering expert annotations and allowing large datasets with fine-grained labels to be created. The methodology we propose for developing in silico IHC can be extended to these diseases and can aid future work on translation in other areas of medical imaging. 
An exciting potential application of in silico IHC is to help pathologists quickly assess samples and prioritize samples for experimental immunostaining in both diagnostic and large cohort research settings. In silico staining also makes it easier for pathologists to visualize different biological information on the same tissue section compared to the typical setting where one must mentally align stains taken on different sections.

Limitations of the study

Our work is a proof-of-concept study that demonstrates the possibility of in silico IHC. More work is needed to harden this technique into software that can be readily used in laboratories. It is also necessary to ensure that variability in histochemical slide preparation from other sources does not degrade the performance of in silico IHC. We note that in silico IHC is only able to identify pathologic changes that produce a discernible disturbance in the H&E-LFB slides, and pathologic changes that cause less disruption may be more challenging to identify. For example, Aβ plaques result in less disruption than both NFTs and neuritic plaques, which may explain in silico IHC's weaker performance on Aβ plaques. It may also be more challenging to identify pretangles, which are precursors to NFTs, owing to their limited disruption on the H&E-LFB slides.

STAR★Methods

Key resources table

Resource availability

Lead contact

Further information and requests for resources should be directed to and will be fulfilled by the lead contact, James Zou (jamesz@stanford.edu).

Materials availability

This study did not generate new unique reagents.

Method details

Data curation

The training cohort comprised 160 cases drawn from the 90+ Study (Head et al., 2009). Tissues were sampled and analyzed as previously described (Montine et al., 2016; Besser et al., 2018). Regions analyzed for the training cohort included the substantia nigra at the level of the red nucleus, middle frontal gyrus or Brodmann area (BA) 9, hippocampus at the level of the lateral geniculate nucleus, and amygdala. The additional validation cohort comprised 30 consecutive cases from the 90+ Study. For this validation cohort, each case consists of samples from the primary visual cortex (BA17), substantia nigra at the level of the red nucleus, inferior parietal lobule (BA39), striatum at the level of the anterior commissure (caudate nucleus and putamen), and hippocampus at the level of the lateral geniculate nucleus. For evaluating AD neuropathologic change by consensus guidelines, Aβ staining on the inferior parietal lobule and middle frontal gyrus are the same score, Aβ staining on the striatum and amygdala are the same score, and phospho-tau staining on the primary visual cortex and middle frontal gyrus are the same score. Tissues were stained histochemically with H&E-LFB, and immunohistochemically with antibodies to Aβ (4G8, Biolegend, cat#800701, working dilution 1:1,000) or phospho-tau (AT8, ThermoScientific, cat#MN1020, working dilution 1:1,000). All slides were digitized at 40× magnification on a Leica AT2 scanner.

Digital staining procedure

In silico-IHC takes an H&E-LFB-stained WSI as input (Figure 1A). WSIs are typically around 100,000 × 100,000 pixels (>1 GB), which are too large to process with a neural network. To handle this large size, in silico-IHC first divides the WSI into 2,048 × 2,048 patches (517 μm × 517 μm). Each patch is then passed through our trained neural network, resulting in a separate prediction for the probabilities that the patch contains an Aβ plaque, NFT or neuritic plaque. The predictions for the patches are then merged as a synthetic IHC stained image by representing each patch with a colored spot based on the probability of containing each lesion.
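
The tiling and rendering steps above can be sketched as follows. This is an illustrative outline under several assumptions that are not stated in the text: partial patches at the image border are dropped, the stain color is an arbitrary brown, the probability is blended linearly into the color, and `predict` is a hypothetical callable standing in for the trained network.

```python
def patch_grid(width, height, patch=2048):
    """Top-left (x, y) corners of the non-overlapping patches that fit fully in the image."""
    return [(x, y)
            for y in range(0, height - patch + 1, patch)
            for x in range(0, width - patch + 1, patch)]

def prob_to_rgb(p, background=(255, 255, 255), stain=(120, 70, 20)):
    """Blend linearly from a background color to a stain color as p goes from 0 to 1."""
    p = min(max(p, 0.0), 1.0)
    return tuple(round(b + p * (s - b)) for b, s in zip(background, stain))

def render(width, height, predict, patch=2048):
    """Map each patch position (column, row) to an RGB color from its predicted probability."""
    return {(x // patch, y // patch): prob_to_rgb(predict(x, y))
            for x, y in patch_grid(width, height, patch)}

# A 100,000 x 100,000 pixel WSI yields 48 x 48 full patches of 2,048 pixels.
n_patches = len(patch_grid(100_000, 100_000))
```

One such color map would be produced per lesion type, giving a low-resolution synthetic stain in which each patch is a single colored spot.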

Registration of H&E-LFB and IHC slides

To train our system, we use serial sections of tissue stained with H&E-LFB, phospho-tau IHC, and Aβ IHC (Figure 1B). The samples are collected from five different regions: amygdala, hippocampus, contralateral hippocampus, midbrain, and BA9. The sections from a sample are closely related owing to their serial nature (5 μm between sections), but cutting the sample destroys the exact spatial relationship between the sections: slight distortions in the tissue result from the cutting process, and the images are translated and rotated because the sections are not in the same position on the slide when digitized (Figure S3). The first step in our training procedure is therefore to register the IHC slides to the H&E-LFB slides to provide paired examples for the neural network.

Because the stains give the slides different color schemes, which complicates direct comparison, we begin by using Otsu's method to threshold the H&E-LFB and IHC slides into foreground and background (Otsu, 1979). After binarizing the images, the different color schemes of the stains are no longer an issue, but the main features of the sample are still visible. Next, we identify candidate key points using the ORB feature detector. The ORB feature detector identifies areas of the image with distinctive structures (e.g., sharp corners of the tissue) as candidate key points. For each key point, the ORB feature detector provides a descriptor, a fixed-length binary vector, that allows the key point to be matched across images. With the candidate key points and descriptors for a pair of H&E-LFB and IHC slides, we create a matching between the key points from the two slides with the most similar descriptors. This process results in many correct matches, but will also include incorrect matches that must be filtered before finding a transform between the images. We use the RANSAC method for identifying the outliers from the matches, and we fit an affine transform on the remaining matches to register the images (Figure S3). 
We run RANSAC for 2,000 iterations and consider a key point as an inlier if the error is within 25 pixels.
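
The outlier filtering and affine fit can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes the key-point matches are already available as two lists of (x, y) coordinates, uses the 2,000 iterations and 25-pixel inlier tolerance from the text as defaults, and solves the small linear systems by Cramer's rule to stay dependency-free. All function names are hypothetical.

```python
import math
import random

def solve3(a, b):
    """Solve a 3x3 linear system a @ x = b by Cramer's rule; None if singular."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    if abs(d) < 1e-9:
        return None
    out = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        out.append(det(m) / d)
    return out

def fit_affine(src, dst):
    """Least-squares affine map; returns two rows (a, b, c) with u = a*x + b*y + c."""
    rows = [(x, y, 1.0) for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        sol = solve3(ata, atb)
        if sol is None:
            return None  # degenerate (e.g. collinear) point set
        params.append(tuple(sol))
    return params

def apply_affine(params, pt):
    """Apply the fitted affine map to a single (x, y) point."""
    x, y = pt
    return tuple(a * x + b * y + c for a, b, c in params)

def ransac_affine(src, dst, iters=2000, tol=25.0, seed=0):
    """Keep the largest set of matches consistent with one affine map, then refit on it."""
    rng = random.Random(seed)
    idx = range(len(src))
    best = []
    for _ in range(iters):
        sample = rng.sample(idx, 3)  # three matches define an affine map
        model = fit_affine([src[i] for i in sample], [dst[i] for i in sample])
        if model is None:
            continue
        inliers = [i for i in idx
                   if math.dist(apply_affine(model, src[i]), dst[i]) <= tol]
        if len(inliers) > len(best):
            best = inliers
    return fit_affine([src[i] for i in best], [dst[i] for i in best]), best
```

Refitting on all inliers after the search, rather than keeping the three-point model, averages out localization noise in the individual key points.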

Identification of lesions from IHC

The next step in generating the expected output for training the neural network is identifying the hallmark lesions from the IHC images. The IHCs can have considerable variation in the background color along with other unrelated structures such as folds in the tissue, lipofuscin, and neuromelanin, which must be distinguished from the NFTs and Aβ plaques. To handle these issues, we selected 500 Aβ and 500 phospho-tau IHC regions of 16,384 × 16,384 pixels (4,161 μm × 4,161 μm) from the WSIs in the training set. The Aβ patches were annotated for instances of Aβ plaques, and the phospho-tau patches were annotated for instances of NFTs and neuritic plaques. The regions were then divided into 2,048 × 2,048-pixel patches, which were considered positive for each lesion if any instance of the lesion appeared within the patch. With these annotated patches, we trained two separate DenseNet121 models (Huang et al., 2017) to identify Aβ plaques from the Aβ IHC slides and identify NFTs and neuritic plaques from the phospho-tau IHC slides using 2,048 × 2,048-pixel patches extracted from the larger patches. Our models were implemented using the PyTorch library (Paszke et al., 2019). Our trained model achieved AUCs of 0.98 for NFTs, 0.99 for neuritic plaques, and 0.97 for Aβ plaques (Figure S4).
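
The conversion from region-level annotations to patch-level labels can be sketched as follows. The annotation format is not specified in the text, so this example assumes each lesion instance is recorded as a single (x, y) point in region coordinates; the function name `patch_labels` is hypothetical.

```python
def patch_labels(lesion_points, region=16384, patch=2048):
    """Return an n x n grid of booleans; True where at least one lesion falls in the patch."""
    n = region // patch  # 8 patches per side for the sizes given in the text
    grid = [[False] * n for _ in range(n)]
    for x, y in lesion_points:
        if 0 <= x < region and 0 <= y < region:
            grid[int(y) // patch][int(x) // patch] = True
    return grid

# Hypothetical annotations: two lesions share one patch, so three patches are positive.
labels = patch_labels([(10, 10), (5000, 100), (5100, 120), (16383, 16383)])
```

With these sizes, each annotated region contributes 64 patch-level training examples for the annotator networks.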

Training and evaluating the H&E-LFB model

With the combined results of registration and identification from IHC, we have patch-level annotations for the H&E-LFB slides. For each H&E-LFB patch, we identify the corresponding IHC patch and run our trained IHC model on the patch to identify if Aβ plaques, NFTs, or neuritic plaques are present. From our training set, we extract 190,992 patches, and train a DenseNet121 model to predict the presence of Aβ plaques, NFTs, and neuritic plaques. We evaluate the performance of the model on the patches from the held-out test patients corresponding to IHC patches that were confidently classified for each hallmark change (<5% or >95%). We find that the model's predicted probabilities of the hallmark changes are closely aligned to the true fraction of positive patches (Figure S5).
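
The confidence filter used for evaluation can be sketched as follows. This is an illustrative outline that assumes each H&E-LFB patch has already been paired with the annotator network's probability for its matched IHC patch; the patch identifiers and function names are hypothetical.

```python
def confident_label(prob, low=0.05, high=0.95):
    """Return 0 or 1 for confident annotator outputs, or None for ambiguous ones."""
    if prob < low:
        return 0
    if prob > high:
        return 1
    return None

def build_eval_set(pairs, low=0.05, high=0.95):
    """Keep (he_patch_id, label) for patches whose matched IHC patch was confidently classified."""
    kept = []
    for he_patch_id, ihc_prob in pairs:
        label = confident_label(ihc_prob, low, high)
        if label is not None:
            kept.append((he_patch_id, label))
    return kept

# Hypothetical annotator outputs for four H&E-LFB patches.
pairs = [("p0", 0.01), ("p1", 0.50), ("p2", 0.99), ("p3", 0.95)]
eval_set = build_eval_set(pairs)
```

Patches with intermediate annotator probabilities are left out of the evaluation so that uncertain reference labels do not distort the reported AUROCs.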

Model architecture and training

For generating predictions with in silico-IHC, we use a DenseNet121 architecture (Huang et al., 2017), which has previously been shown to perform well on the ImageNet dataset (Deng et al., 2009; Russakovsky et al., 2015). The DenseNet121 architecture consists of 120 convolutional layers arranged into 4 densely connected blocks, followed by a fully connected layer. We initialize the model with pretrained ImageNet weights, and we fine-tune all parameters in the model for 150 epochs. We use a stochastic gradient descent optimizer with an initial learning rate of 1 × 10⁻⁴ and a momentum of 0.9, and we decay the learning rate by a factor of 10 every 50 epochs. The optimizer trains the model by minimizing the binary cross-entropy loss between the model's predictions and the label for each hallmark change extracted from the matched IHC patch. During training, we augment the dataset by including all rotations and reflections of the patches. For our final evaluation, we select the model from the epoch with the highest AUC on the validation set.
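
Two of the training details above, the rotation and reflection augmentation and the step learning-rate decay, can be sketched in plain Python. This is an illustration only: it assumes "all rotations and reflections" means the eight symmetries of a square patch, represents a patch as a nested list instead of an image tensor, and uses hypothetical function names.

```python
def rot90(patch):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]

def flip(patch):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in patch]

def dihedral_augment(patch):
    """Return the 8 rotations and reflections of a square patch."""
    out = []
    for start in (patch, flip(patch)):
        current = [list(row) for row in start]
        for _ in range(4):
            out.append(current)
            current = rot90(current)
    return out

def step_lr(epoch, base_lr=1e-4, factor=10.0, step=50):
    """Learning rate after dividing by `factor` every `step` epochs."""
    return base_lr / (factor ** (epoch // step))

views = dihedral_augment([[1, 2], [3, 4]])
schedule = [step_lr(e) for e in (0, 49, 50, 100, 149)]
```

Over the 150 epochs described, this schedule uses three learning rates: 1e-4 for epochs 0 to 49, 1e-5 for epochs 50 to 99, and 1e-6 for epochs 100 to 149.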

Interpretation of in silico-IHC predictions

To interpret the predictions made by in silico-IHC, we first used the integrated gradients method to provide per-pixel attributions for each patch. We then sought to identify regions, rather than pixels, that resulted in positive predictions. First, we applied a Gaussian filter with a standard deviation of 5 to the attributions; this allows nearby regions with high attributions to be connected. Next, we identified pixels with attributions above the 90th percentile and extracted connected regions of pixels. Regions smaller than 500 pixels (32 μm²) were filtered out. Finally, the contours of the remaining regions were extracted.
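
The post-processing of the attribution maps can be sketched with NumPy and SciPy. Several details here are assumptions not stated in the text: the attributions are taken to be already reduced to a single 2D map per patch, the Gaussian standard deviation is taken to be in pixels, the percentile is computed per patch, and connected regions use SciPy's default 4-connectivity. Contour extraction is omitted; the sketch stops at the mask of retained regions.

```python
import numpy as np
from scipy import ndimage


def threshold_attributions(attr, sigma=5.0, percentile=90.0):
    """Smooth a 2D attribution map and keep pixels above the given percentile."""
    smoothed = ndimage.gaussian_filter(attr.astype(float), sigma=sigma)
    return smoothed > np.percentile(smoothed, percentile)


def drop_small_regions(mask, min_pixels=500):
    """Remove connected regions containing fewer than min_pixels pixels."""
    labels, _ = ndimage.label(mask)
    sizes = np.bincount(labels.ravel())  # sizes[i] = pixel count of region i
    keep = sizes >= min_pixels
    keep[0] = False  # label 0 is the background, never a region
    return keep[labels]


def attribution_regions(attr, sigma=5.0, percentile=90.0, min_pixels=500):
    """Full sketch: smooth, threshold, then discard small connected regions."""
    mask = threshold_attributions(attr, sigma, percentile)
    return drop_small_regions(mask, min_pixels)
```

The returned boolean mask could then be passed to a contour-tracing routine to obtain region outlines for display.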

Quantification and statistical analysis

CIs for the AUROCs in the results were computed using 10,000 bootstrapped samples and taking the 95% percentile range for each prediction. The performance of in silico-IHC for NFT and neuritic plaque predictions was computed for 51 samples with phospho-tau staining in the test set, and the performance for amyloid plaque predictions was computed for 53 samples with Aβ staining in the test set. The performance of in silico-IHC on the additional evaluation data was computed for 60 samples with phospho-tau staining and 90 samples with Aβ staining. The performance of the identification of lesions from IHC was computed using 75 labeled patches in the test set. Additional analysis details are provided in the Results section and in figure legends. Statistical analysis was performed using Python.
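
A percentile bootstrap for an AUROC confidence interval can be sketched as below. This is a generic illustration, not the authors' analysis code: it assumes that resampling is done with replacement over individual labeled items, that resamples containing a single class are redrawn, and that the interval is the 2.5th to 97.5th percentile of the bootstrapped AUROCs. The function names are illustrative.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def bootstrap_auroc_ci(labels, scores, n_boot=10_000, level=95.0, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    if labels.min() == labels.max():
        raise ValueError("both classes are required to compute an AUROC")
    rng = np.random.default_rng(seed)
    n = labels.size
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample items with replacement
        if labels[idx].min() == labels[idx].max():
            continue  # single-class resample: AUROC undefined, draw again
        stats.append(auroc(labels[idx], scores[idx]))
    tail = (100.0 - level) / 2.0
    low, high = np.percentile(stats, [tail, 100.0 - tail])
    return float(low), float(high)
```

The pairwise AUROC computation is quadratic in the number of items, which is acceptable for test sets of the size reported here but would need a rank-based formulation for much larger ones.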

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Antibodies

β-Amyloid	Biolegend	cat#800701
Phospho-Tau	ThermoScientific	cat#MN1020

Software and algorithms

In silico-IHC	This paper	https://github.com/bryanhe/insilico-ihc (https://doi.org/10.5281/zenodo.6236002)
