Erik A Burlingame, Mary McDonnell, Geoffrey F Schau, Guillaume Thibault, Christian Lanciault, Terry Morgan, Brett E Johnson, Christopher Corless, Joe W Gray, Young Hwan Chang.
Abstract
Spatially-resolved molecular profiling by immunostaining tissue sections is a key feature in cancer diagnosis, subtyping, and treatment, where it complements routine histopathological evaluation by clarifying tumor phenotypes. In this work, we present a deep learning-based method called speedy histological-to-immunofluorescent translation (SHIFT) which takes histologic images of hematoxylin and eosin (H&E)-stained tissue as input, then in near-real time returns inferred virtual immunofluorescence (IF) images that estimate the underlying distribution of the tumor cell marker pan-cytokeratin (panCK). To build a dataset suitable for learning this task, we developed a serial staining protocol which allows IF and H&E images from the same tissue to be spatially registered. We show that deep learning-extracted morphological feature representations of histological images can guide representative sample selection, which improved SHIFT generalizability in a small but heterogeneous set of human pancreatic cancer samples. With validation in larger cohorts, SHIFT could serve as an efficient preliminary, auxiliary, or substitute for panCK IF by delivering virtual panCK IF images for a fraction of the cost and in a fraction of the time required by traditional IF.
Year: 2020 PMID: 33060677 PMCID: PMC7566625 DOI: 10.1038/s41598-020-74500-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of virtual IF staining with SHIFT and feature-guided H&E sample selection. (a) Schematic of SHIFT modeling for training and testing phases. The generator network generates virtual IF tiles conditioned on H&E tiles. The discriminator network learns to discriminate between real and generated image pairs. See also Supplementary Fig. S2. (b) Four heterogeneous samples of H&E-stained PDAC biopsy tissue used in the current study. Pathologist annotations indicate regions that are benign (green), grade 1 PDAC (black), grade 2/3 PDAC (blue), and grade 2/3 adenosquamous (red). (c) Making direct comparisons between H&E whole slide images (WSIs) is intractable because each WSI can contain billions of pixels. By decomposing WSIs into sets of non-overlapping 256 × 256 pixel tiles, we can make tractable comparisons between the feature-wise distribution of tile sets. (d) Schematic of feature-guided H&E sample selection. First, H&E samples are decomposed into 256 × 256 pixel tiles. Second, all H&E tiles are used to train a variational autoencoder (VAE) to learn feature representations for all tiles; for each 196,608-pixel H&E tile in the dataset, the encoder learns a compact but expressive feature representation that maximizes the ability of the decoder to reconstruct the original tile from its feature representation (see “Methods”). Third, the tile feature representations are used to determine which samples are most representative of the whole dataset.
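The tiling step described in panels (c) and (d) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function name `tile_image` is hypothetical, the image is assumed to be an in-memory RGB array, and dropping partial tiles at the right and bottom edges is an assumption, since the caption does not say how edges are handled.

```python
import numpy as np


def tile_image(image, tile_size=256):
    """Split an (H, W, C) image into non-overlapping square tiles.

    Partial tiles at the right and bottom edges are dropped
    (an assumption; the source does not specify edge handling).
    Returns an array of shape (n_tiles, tile_size, tile_size, C),
    ordered row by row.
    """
    image = np.asarray(image)
    height, width, channels = image.shape
    n_rows, n_cols = height // tile_size, width // tile_size

    # Crop so both dimensions are exact multiples of the tile size.
    cropped = image[: n_rows * tile_size, : n_cols * tile_size]

    # Split each spatial axis into (block index, offset within block),
    # then bring the two block indices together.
    tiles = cropped.reshape(n_rows, tile_size, n_cols, tile_size, channels)
    tiles = tiles.swapaxes(1, 2)
    return tiles.reshape(-1, tile_size, tile_size, channels)


# Toy stand-in for a whole slide image (real WSIs are far larger).
rng = np.random.default_rng(0)
toy_slide = rng.integers(0, 256, size=(600, 520, 3), dtype=np.uint8)
tiles = tile_image(toy_slide)
print(tiles.shape)    # (4, 256, 256, 3)
print(tiles[0].size)  # 196608 values per RGB tile (256 * 256 * 3)
```

Each RGB tile holds 256 × 256 × 3 = 196,608 values, which matches the tile size quoted in the caption.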
Figure 2. Feature-guided H&E sample selection and virtual IF staining with SHIFT. (a) Distribution of the 16 latent features (L1-L16) extracted by VAE from sample H&E tiles. (b) t-SNE embedding of latent feature representations of sample H&E tiles, faceted by sample identity. Each point in each plot represents a single H&E tile. Contour lines indicate point density. (c) SHIFT model test performance for optimal (B and D) and non-optimal (A and B) training set sample compositions. The paired H&E and IF images from samples B and D were subdivided into smaller images B = {B1,B2} and D = {D1,D2,D3,D4,D5} to avoid regions of IF that exhibited substantial autofluorescence. The x-axis labels indicate sample identity, where each letter corresponds to a unique sample and each number corresponds to a subset of that sample. Each n denotes the number of image tiles that were extracted from that sample. Plots for sample subsets are not shown if that sample subset was a component of a model’s training set. *p < .05; for three-group comparisons we used the Friedman test with Nemenyi post-hoc test; for two-group comparisons we used the Wilcoxon signed-rank test. White dots in violin plots represent distributional medians. (d) Visual comparison of virtual staining methods. The ensemble results are obtained by averaging the output images of SHIFT and Label-Free Determination (LFD) models. See also Supplementary Fig. S2. (e) Test performance comparison of virtual staining methods. The x-axis labels indicate sample identity, where each letter corresponds to a unique sample and each number corresponds to a subset of that sample. Each n denotes the number of image tiles that were extracted from that sample. Plots for sample subsets B1 and D5 are not shown because those sample subsets were components of the models’ training sets. *p < .05; Friedman test with Nemenyi post-hoc test. White dots in violin plots represent distributional medians.
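The ensemble in panel (d) is described as an average of the SHIFT and LFD output images. A minimal sketch of that pixel-wise average is shown below; the function name `ensemble_average` and the toy arrays are hypothetical, and equal weighting of the two models is assumed because the caption mentions no weights.

```python
import numpy as np


def ensemble_average(pred_a, pred_b):
    """Pixel-wise mean of two virtual-stain predictions of equal shape.

    Equal weighting is an assumption; the caption only says the
    output images are averaged.
    """
    a = np.asarray(pred_a, dtype=np.float64)
    b = np.asarray(pred_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("predictions must have the same shape")
    return (a + b) / 2.0


# Toy 2 x 2 "images" standing in for SHIFT and LFD outputs.
shift_out = np.array([[0.0, 1.0], [0.5, 0.2]])
lfd_out = np.array([[1.0, 1.0], [0.0, 0.4]])
ensemble = ensemble_average(shift_out, lfd_out)
print(ensemble)
```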
| Reagent or resource | Source | Identifier |
|---|---|---|
| Alpha-smooth muscle actin monoclonal antibody (1A4 (ASM-1)) | ThermoFisher Scientific | |
| Goat anti-mouse IgG2a cross-adsorbed secondary antibody, AlexaFluor 555 | ThermoFisher Scientific | |
| Pan-cytokeratin monoclonal antibody (AE1/AE3), AlexaFluor 488, eBioscience | ThermoFisher Scientific | |
| Ki-67 (D3B5) monoclonal antibody, AlexaFluor 647, Conjugate #12075 | Cell Signaling Technology | |
| Human PDAC tumor samples | Oregon Pancreas Tissue Registry | IRB: 3609 |
| SlowFade Gold Antifade Mountant with DAPI | ThermoFisher Scientific | |
| Normal goat serum blocking solution | Vector laboratories | |
| Imaging data | This manuscript | |
| Python | Python Software Foundation | RRID:SCR_008394 |
| PyTorch | Paszke et al. | |
| imgaug | Jung | |
| Netron | | |
| HoloViews | Stevens et al. | |
| Bokeh | Bokeh Development Team | |
| scikit-image | van der Walt et al. | |
| scikit-learn | Pedregosa et al. | |
| scikit-posthocs | Terpilowski et al. | |
| SciPy | Virtanen et al. | |
| Jupyter | Kluyver et al. | |

| Step | Duration |
|---|---|
| Hematoxylin | 10 min |
| Wash in water | 1 min |
| Acid alcohol (0.5% HCl in 70% ethanol) | 8 s |
| Wash in water | 25 s |
| Bluing solution | 2 min |
| Wash in water | 20 s |
| 80% ethanol/water | 25 s |
| Eosin | 10 s |
| 80% ethanol/water | 25 s |
| 95% ethanol/water | 20 s |
| 100% ethanol (two times) | 25 s |
| Xylene (five times) | 25 s |

| Parameter | Description |
|---|---|
| | Complete tile set of all samples, |
| | Single tile, |
| | Subset of |
| | Complete VAE-learned feature set, |
| | Single feature, |
| | Random variable defined over |
| | Random variable defined over |

| | | | … | |
|---|---|---|---|---|
| | −1.64266 | 1.36952 | … | 1.23509 |
| | −0.792104 | −0.481497 | … | 1.07938 |
| … | … | … | … | … |
| | 0.00163981 | −0.0162441 | … | −0.95883 |

| | | | … | | Sum |
|---|---|---|---|---|---|
| | 0.00418311 | 0.0850498 | … | 0.0743519 | 1 |
| | 0.0148384 | 0.0202433 | … | 0.0964193 | 1 |
| … | … | … | … | … | 1 |
| | 0.0208721 | 0.0205021 | … | 0.00798802 | 1 |
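
The values in the two tables above are consistent with a row-wise softmax: the raw feature values in the first table map to the entries of the second, and each row sums to 1. For instance, exp(−1.64266)/exp(1.36952) ≈ 0.0492, which agrees with 0.00418311/0.0850498. The sketch below illustrates this under stated assumptions: the function name `row_softmax` is hypothetical, and only the three visible values per row are used, so the absolute probabilities differ from the table (the elided columns are missing from the denominator) while the ratios between entries are preserved.

```python
import numpy as np


def row_softmax(features):
    """Softmax along each row, so every row sums to 1."""
    x = np.asarray(features, dtype=np.float64)
    # Subtracting the row maximum leaves the result unchanged
    # but avoids overflow in exp().
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Only the visible entries of the first table; elided columns are omitted.
visible = np.array([
    [-1.64266, 1.36952, 1.23509],
    [-0.792104, -0.481497, 1.07938],
    [0.00163981, -0.0162441, -0.95883],
])
probs = row_softmax(visible)

print(probs.sum(axis=1))  # each row sums to 1

# Ratios between entries do not depend on the omitted columns,
# so they can be compared with the second table directly.
ratio_from_softmax = probs[0, 0] / probs[0, 1]
ratio_from_table = 0.00418311 / 0.0850498
print(ratio_from_softmax, ratio_from_table)  # both approximately 0.0492
```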