Piumi Sandarenu, Ewan K A Millar, Yang Song, Lois Browne, Julia Beretov, Jodi Lynch, Peter H Graham, Jitendra Jonnagaddala, Nicholas Hawkins, Junzhou Huang, Erik Meijering.
Abstract
Computational pathology is a rapidly expanding area for research due to the current global transformation of histopathology through the adoption of digital workflows. Survival prediction of breast cancer patients is an important task that currently depends on histopathology assessment of cancer morphological features, immunohistochemical biomarker expression and patient clinical findings. To facilitate the manual process of survival risk prediction, we developed a computational pathology framework for survival prediction using digitally scanned haematoxylin and eosin-stained tissue microarray images of clinically aggressive triple negative breast cancer. Our results show that the model can produce an average concordance index of 0.616. Our model predictions are analysed for independent prognostic significance in univariate analysis (hazard ratio = 3.12, 95% confidence interval [1.69,5.75], p < 0.005) and multivariate analysis using clinicopathological data (hazard ratio = 2.68, 95% confidence interval [1.44,4.99], p < 0.005). Through qualitative analysis of heatmaps generated from our model, an expert pathologist is able to associate tissue features highlighted in the attention heatmaps of high-risk predictions with morphological features associated with more aggressive behaviour such as low levels of tumour infiltrating lymphocytes, stroma rich tissues and high-grade invasive carcinoma, providing explainability of our method for triple negative breast cancer.
Year: 2022 PMID: 36008541 PMCID: PMC9411153 DOI: 10.1038/s41598-022-18647-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Magnified view of a tissue patch (right) extracted from one core (middle) of a TMA (left) from the TNBC cohort. On average each core is 1.25 mm in diameter. All slides were scanned at 0.25 µm/pixel resolution.
Characteristics of the clinicopathological parameters of the TNBC dataset.
| Parameter | Category | n* | % |
|---|---|---|---|
| Grade | 2 | 12 | 4.9 |
| | 3 | 231 | 95.1 |
| Histological type | Invasive ductal carcinoma | 220 | 90.5 |
| | Metaplastic | 17 | 7.0 |
| | Other** | 6 | 2.5 |
| LN status | Positive | 85 | 35.0 |
| | Negative | 155 | 63.8 |
| Age | > 55 years | 141 | 58.0 |
| | ≤ 55 years | 102 | 42.0 |
| Tumour size | > 20 mm | 131 | 53.9 |
| | ≤ 20 mm | 112 | 46.1 |
| TIL score | < 30 | 110 | 45.3 |
| | ≥ 30 | 133 | 54.7 |
* n is the number of patients in the full dataset (statistical analysis is carried out in 3 stages and the number of patients for each stage is indicated in the relevant table).
** Other includes invasive micropapillary, lobular and apocrine carcinoma.
Figure 2. Architecture of the MIL-based survival risk prediction model using pretrained feature encodings (Model 1).
Figure 3. The MIL model can be reconfigured so that each feature vector is accepted as input to the network and an attention weight is applied to each vector (Model 2). This generates an attention heatmap that highlights the tissue areas the network treats as relevant to a given prediction.
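One common way to obtain per-patch attention weights of the kind described for Model 2 is attention-based MIL pooling. The sketch below illustrates that general technique and is not the authors' implementation: the function name `attention_pool`, the parameters `V` and `w`, the tanh scoring function and the random feature values are all assumptions made for illustration.

```python
import numpy as np

def attention_pool(H, V, w):
    """Attention-weighted pooling over a bag of patch features.

    H: (K, D) array, one feature vector per tissue patch.
    V: (L, D) projection matrix and w: (L,) scoring vector (hypothetical parameters).
    Returns the pooled bag embedding (D,) and the attention weights (K,).
    """
    scores = np.tanh(H @ V.T) @ w        # one scalar score per patch
    scores = scores - scores.max()       # shift for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()    # softmax: weights are positive and sum to 1
    embedding = weights @ H              # weighted sum of patch features
    return embedding, weights

# Toy bag: 6 patches with 8-dimensional features (random stand-ins, not real encodings).
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))
V = rng.normal(size=(4, 8))
w = rng.normal(size=4)
embedding, weights = attention_pool(H, V, w)
```

Mapping each entry of `weights` back to the location of its patch in the core is what would produce a heatmap of the kind the caption describes.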
C-index of 5-fold cross validation results for MIL deep learning architecture (Model 1).
| | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average C-index |
|---|---|---|---|---|---|---|
| 4 | 0.6236 | 0.5058 | 0.6213 | 0.6984 | 0.5809 | 0.606 |
| 6 | 0.5857 | 0.6114 | 0.7002 | 0.5624 | 0.5559 | 0.603 |
| 8 | 0.5717 | 0.5613 | 0.6527 | 0.7444 | 0.5306 | 0.612 |
| 10 | 0.6470 | 0.6118 | 0.5764 | 0.5289 | 0.616 | |
| 12 | 0.6494 | 0.4957 | 0.6619 | 0.6141 | 0.5931 | 0.603 |
*Trained model of the best performing fold is used for univariate and multivariate experiments.
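The C-index measures how often the model ranks a pair of patients in the same order as their observed survival. A minimal sketch of Harrell's C-index follows, assuming right-censored data and ignoring tied survival times; the function name and the toy data are hypothetical. The final lines recompute the reported average for the first table row from its five fold values.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index for right-censored survival data (ties in time ignored).

    A pair (i, j) is comparable when patient i had the event and a shorter time than j.
    It is concordant when the patient with the shorter time has the higher predicted risk.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue                      # censored patients cannot anchor a pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5     # tied predictions count as half
    return concordant / comparable

# Toy example: risk decreases as survival time increases, so the ranking is perfect.
c_perfect = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.2, 0.1])

# Average over the five folds of the first row of the table above.
fold_scores = [0.6236, 0.5058, 0.6213, 0.6984, 0.5809]
average_c = round(float(np.mean(fold_scores)), 3)   # matches the reported 0.606
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported averages of about 0.60 to 0.62 in context.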
Figure 4. Kaplan–Meier survival estimation for three stages of analysis using the deep learning model (Model 1) output for disease-specific survival in TNBC. The plots show the results of using (a) only the test data (Stage 1), (b) test and validation data (Stage 2), and (c) the whole dataset of TNBC patients (Stage 3).
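For readers unfamiliar with the estimator behind these plots, the following is a minimal sketch of the Kaplan–Meier product-limit calculation. The function name and the five-patient toy data are hypothetical; in practice one curve would be computed for each risk group.

```python
import numpy as np

def kaplan_meier(time, event):
    """Kaplan-Meier survival estimate.

    time: follow-up time per patient; event: 1 if the event was observed, 0 if censored.
    Returns the distinct event times and the survival probability just after each.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    survival, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)               # still under observation at t
        deaths = np.sum((time == t) & event)      # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Toy data: events at t = 1, 2, 3; censoring at t = 2 and t = 5.
times, surv = kaplan_meier([1, 2, 2, 3, 5], [1, 1, 0, 1, 0])
# Survival after each event time is approximately 0.8, 0.6, 0.3.
```

Censored patients leave the risk set without lowering the curve, which is why the third step divides by two patients rather than three.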
Multivariate analysis for breast cancer specific survival of TNBC patients for Model 1 output on test and validation data.
| Method/parameter | Cutoff value | No. of patients | Multivariate HR | Multivariate 95% CI | Multivariate p-value |
|---|---|---|---|---|---|
| MIL (Model 1) | > 0.019 vs. ≤ 0.019 | 30 vs. 30 | 4.00 | 1.41–11.35 | 0.01 |
| Age (years) | > 55 vs. ≤ 55 | 35 vs. 25 | 2.69 | 0.98–7.37 | 0.05 |
| Tumour size (mm) | > 20 vs. ≤ 20 | 36 vs. 24 | 4.64 | 1.27–16.88 | 0.02 |
| LN status | pos. vs. neg. | 24 vs. 36 | 3.23 | 1.18–8.82 | 0.02 |
| TIL score | < 30 vs. ≥ 30 | 36 vs. 24 | NS | | |
| Grade | 2 vs. 3 | 5 vs. 55 | NS | | |
NS not significant.
Univariate and multivariate analysis for breast cancer specific survival for the full dataset of TNBC patients using Model 1 output.
| Method/parameter | Risk group cutoff value | No. of patients in each group | Univariate HR | Univariate 95% CI | Univariate p-value | Multivariate HR | Multivariate 95% CI | Multivariate p-value |
|---|---|---|---|---|---|---|---|---|
| MIL (Model 1) | > 0.015 vs. ≤ 0.015 | 120 vs. 120 | 3.12 | 1.69–5.75 | < 0.005 | 2.68 | 1.44–4.99 | < 0.005 |
| Age (years) | > 55 vs. ≤ 55 | 139 vs. 101 | 1.98 | 1.08–3.62 | 0.03 | 1.87 | 1.00–3.49 | 0.05 |
| Tumour size (mm) | > 20 vs. ≤ 20 | 129 vs. 111 | 2.40 | 1.29–4.49 | 0.01 | 2.16 | 1.13–4.15 | 0.02 |
| LN status | pos. vs. neg. | 85 vs. 155 | 3.21 | 1.80–5.71 | < 0.005 | 2.71 | 1.50–4.91 | < 0.005 |
| TIL score | < 30 vs. ≥ 30 | 131 vs. 109 | 1.87 | 1.03–3.42 | 0.04 | NS | | |
| Grade | 2 vs. 3 | 12 vs. 228 | 1.45 | 0.52–4.06 | 0.47 | NS | | |
NS not significant.
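The hazard ratios, confidence intervals and p-values in this table are linked by simple arithmetic if the intervals are Wald intervals, symmetric on the log scale. That is the usual convention for Cox regression output but is an assumption here, since the source does not state it. Under that assumption, the sketch below recovers the log hazard ratio and its standard error from the reported interval for the MIL (Model 1) row; the function name is hypothetical.

```python
import math

def log_hr_and_se(ci_low, ci_high, z=1.96):
    """Recover the log hazard ratio and its standard error from a 95% Wald interval.

    Assumes the interval is exp(beta - z*se) to exp(beta + z*se).
    """
    beta = (math.log(ci_low) + math.log(ci_high)) / 2.0
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# Univariate result for MIL (Model 1): 95% CI 1.69 to 5.75.
beta_uni, se_uni = log_hr_and_se(1.69, 5.75)
hr_uni = math.exp(beta_uni)                         # about 3.12, as reported
wald_z = beta_uni / se_uni
p_uni = math.erfc(abs(wald_z) / math.sqrt(2.0))     # two-sided normal p-value

# Multivariate result for MIL (Model 1): 95% CI 1.44 to 4.99.
beta_multi, se_multi = log_hr_and_se(1.44, 4.99)
hr_multi = math.exp(beta_multi)                     # about 2.68, as reported
```

The recovered hazard ratios agree with the reported 3.12 and 2.68 to two decimal places, and the recovered univariate p-value is below 0.005, consistent with the table.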
Figure 5. Kaplan–Meier survival estimation of high/low categories of Model 2 for disease-specific survival in TNBC.
Multivariate analysis for breast cancer specific survival of TNBC patients for Model 2 output.
| Parameter | Cutoff value | No. of patients | Multivariate HR | Multivariate 95% CI | Multivariate p-value |
|---|---|---|---|---|---|
| MIL (Model 2) | > 0.21 vs. ≤ 0.21 | 118 vs. 122 | 2.28 | 1.24–4.18 | 0.01 |
| Age (years) | > 55 vs. ≤ 55 | 139 vs. 101 | 1.91 | 1.03–3.55 | 0.04 |
| Tumour size (mm) | > 20 vs. ≤ 20 | 129 vs. 111 | 1.96 | 1.02–3.74 | 0.04 |
| LN status | pos. vs. neg. | 85 vs. 155 | 2.81 | 1.55–5.07 | < 0.005 |
| TIL score | < 30 vs. ≥ 30 | 131 vs. 109 | NS | | |
| Grade | 2 vs. 3 | 12 vs. 228 | NS | | |
NS not significant.
Figure 6. Heatmaps (a1–h1) and corresponding H&Es (a2–h2) from a representative case categorised as high-risk by the MIL classifier. The features present are those of a stroma-rich, low-TILs tumour.