Dmitrii Bychkov, Nina Linder, Riku Turkki, Stig Nordling, Panu E Kovanen, Clare Verrill, Margarita Walliander, Mikael Lundin, Caj Haglund, Johan Lundin.
Abstract
Image-based machine learning, and deep learning in particular, has recently shown expert-level accuracy in medical image classification. In this study, we combine convolutional and recurrent architectures to train a deep network to predict colorectal cancer outcome based on images of tumour tissue samples. The novelty of our approach is that we directly predict patient outcome, without any intermediate tissue classification. We evaluate a set of digitized haematoxylin-eosin-stained tumour tissue microarray (TMA) samples from 420 colorectal cancer patients with clinicopathological and outcome data available. The results show that deep learning-based outcome prediction with only small tissue areas as input outperforms (hazard ratio 2.3; 95% CI 1.79-3.03; AUC 0.69) visual histological assessment performed by human experts on both TMA spot (HR 1.67; 95% CI 1.28-2.19; AUC 0.58) and whole-slide level (HR 1.65; 95% CI 1.30-2.15; AUC 0.57) in the stratification into low- and high-risk patients. Our results suggest that state-of-the-art deep learning techniques can extract more prognostic information from the tissue morphology of colorectal cancer than an experienced human observer.
Year: 2018 PMID: 29467373 PMCID: PMC5821847 DOI: 10.1038/s41598-018-21758-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Patient characteristics.
| Clinicopathological variable | Selected Patients | Original Set |
|---|---|---|
| Number of patients | 420 | 641 |
| Age | | |
| <50 years | 53 (12.6%) | 77 (12%) |
| 50–64 years | 123 (29.3%) | 189 (29.5%) |
| 65–74 years | 145 (34.5%) | 216 (33.7%) |
| ≥75 years | 99 (23.6%) | 159 (24.8%) |
| Average (years) | 65.4 | 65.9 |
| Sex | | |
| Female | 193 (46%) | 301 (47%) |
| Male | 227 (54%) | 340 (53%) |
| Tumour location | | |
| Colon | 234 (55.7%) | 352 (54.9%) |
| Rectum | 186 (44.3%) | 289 (45.1%) |
| Dukes’ stage | | |
| A | 51 (12.1%) | 93 (14.5%) |
| B | 141 (33.6%) | 231 (36%) |
| C | 114 (27.1%) | 166 (25.9%) |
| D | 114 (27.1%) | 149 (23.2%) |
| NA | 0 (0%) | 2 (0.3%) |
| Histological grade | | |
| Low (1–2) | 285 (67.9%) | 439 (68.4%) |
| High (3–4) | 135 (32.1%) | 200 (32.5%) |
| NA | 0 (0%) | 2 (0.3%) |
| Visual risk score (TMA spot) | | |
| Low Risk | 185 (44.0%) | - |
| High Risk | 191 (45.5%) | - |
| NA | 44 (10.5%) | - |
| LSTM risk score | | |
| Low Risk | 210 (50.0%) | - |
| High Risk | 210 (50.0%) | - |
Figure 1. Overview of the image analysis pipeline and Long Short-Term Memory (LSTM) prognostic model. Images of tissue microarray (TMA) spots are characterized tile by tile by a pre-trained convolutional neural network (VGG-16). The VGG-16 network produces a high-dimensional feature vector for each tile of an input image. These features then serve as inputs for classifiers trained to predict five-year disease-specific survival (DSS) (A). The LSTM network passes over the entire image of the TMA spot to jointly summarize the observed tiles and predict the patient risk score (B).
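The tile-wise pipeline in this caption can be illustrated with a small NumPy sketch. It is not the authors' implementation: the 224-pixel tile size, the per-tile mean colour used as a placeholder for VGG-16 features, the hidden size and the random, untrained weights are all assumptions made for illustration.

```python
import numpy as np

def split_into_tiles(image, tile=224):
    """Cut an H x W x C image into non-overlapping tile x tile patches, row by row."""
    h, w, c = image.shape
    rows, cols = h // tile, w // tile
    cropped = image[: rows * tile, : cols * tile]  # drop partial tiles at the border
    grid = cropped.reshape(rows, tile, cols, tile, c).swapaxes(1, 2)
    return grid.reshape(rows * cols, tile, tile, c)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_risk_score(features, W, U, b, w_out, b_out):
    """Run one LSTM layer over a (n_tiles, d) sequence and squash the
    final hidden state into a score between 0 and 1."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        # Input, forget and output gates plus the candidate cell update.
        i, f, o, g = np.split(W @ x + U @ h + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return float(sigmoid(w_out @ h + b_out))

# Hypothetical usage: random pixels and random, untrained weights.
rng = np.random.default_rng(0)
image = rng.random((500, 700, 3))
tiles = split_into_tiles(image)          # 2 rows x 3 columns of tiles
features = tiles.mean(axis=(1, 2))       # placeholder for VGG-16 feature vectors
d, hidden = features.shape[1], 4
W = rng.normal(size=(4 * hidden, d))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
score = lstm_risk_score(features, W, U, b, rng.normal(size=hidden), 0.0)
```

In a real pipeline the placeholder features would be replaced by the output of a pre-trained network, and the LSTM weights would be learnt from outcome data.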
Figure 2. Kaplan-Meier survival curves based on different predictors for 420 colorectal cancer patients. We dichotomized patients into a Low Risk group (blue curve) and a High Risk group (red curve) at the median value of each predictor independently. The LSTM model predictor yields stronger stratification, with a hazard ratio (HR) of 2.3 (log-rank p-value < 0.0001), than visual scoring at the tissue microarray (TMA) spot level (HR 1.67; log-rank p-value = 0.00016) and histological grade at the whole-slide level (HR 1.65; log-rank p-value = 0.00016). Dukes’ stage also stratifies the patients into groups with significantly different outcomes (p-value < 0.0001). Dukes’ stage remains a stronger predictor of survival but is not directly comparable to the tissue-based variables, since it includes the extent of local invasion, the number of lymph nodes affected and whether distant metastasis is observed.
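The median split and the Kaplan-Meier estimate behind curves of this kind can be sketched in plain Python. The function names and the toy follow-up data below are hypothetical and are not taken from the study.

```python
from statistics import median

def split_at_median(scores):
    """Label a patient High Risk (True) when the score is above the cohort median."""
    cut = median(scores)
    return [s > cut for s in scores]

def kaplan_meier(times, events):
    """Return (time, survival probability) pairs at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Toy cohort: one survival curve per risk group.
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
times = [2, 9, 4, 8, 3, 10]
events = [1, 0, 1, 1, 1, 0]
high = split_at_median(scores)
high_curve = kaplan_meier([t for t, h in zip(times, high) if h],
                          [e for e, h in zip(events, high) if h])
low_curve = kaplan_meier([t for t, h in zip(times, high) if not h],
                         [e for e, h in zip(events, high) if not h])
```

Censored patients contribute to the at-risk count until their last follow-up time but never trigger a drop in the curve.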
Figure 3. Predictive performance of four classifiers evaluated in cross-validation on images of tissue microarray (TMA) spots at different image resolutions. We trained four classifiers (Naïve Bayes, Support Vector Machine (SVM), Logistic Regression and Long Short-Term Memory network) to predict five-year disease-specific survival from TMA spot images at three resolutions. “High” indicates images at the original pixel size (0.22 μm). “Medium” and “Low” correspond to images downscaled by factors of 4 and 16, respectively. The graphs show that high-resolution images give the best prognostic performance, as measured by area under the receiver operating characteristic curve (AUC, left graph) and hazard ratio (HR, right graph) in 3-fold cross-validation. The error bars indicate variation among folds, with a dot representing the average value. The human reference corresponds both to a visual risk score assessed at the TMA level and to histological grade assessed at the whole tumour sample (i.e. whole-slide) level as part of the primary diagnosis.
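The per-fold AUC summary shown in such a plot can be sketched as follows. This is an illustrative example only: the round-robin fold assignment and the toy labels and scores are assumptions, and in a real cross-validation each fold's scores would come from a model trained on the other folds.

```python
from statistics import mean

def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fold_indices(n, k=3):
    """Assign sample indices to k folds in round-robin order."""
    return [list(range(i, n, k)) for i in range(k)]

def auc_across_folds(labels, scores, k=3):
    """Return the mean, minimum and maximum AUC over the k held-out folds."""
    per_fold = [
        roc_auc([labels[i] for i in idx], [scores[i] for i in idx])
        for idx in fold_indices(len(labels), k)
    ]
    return mean(per_fold), min(per_fold), max(per_fold)

# Toy data: 1 = event within five years, 0 = no event.
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.8, 0.3, 0.6, 0.7, 0.4, 0.1, 0.5]
avg_auc, low_auc, high_auc = auc_across_folds(labels, scores)
```

Every fold must contain both outcome classes, otherwise the AUC for that fold is undefined.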
Univariate and Multivariate Cox proportional hazards model based on disease-specific survival in 420 patients with colorectal cancer.
| Factor | Univariate P-value (Wald) | Univariate HR | Univariate (95% CI) | Multivariate P-value | Multivariate HR | Multivariate (95% CI) |
|---|---|---|---|---|---|---|
| Age | 0.013 | | | | | |
| <50 years | | 1 | | | 1 | |
| 50–64 years | | 0.98 | (0.63–1.54) | 0.46 | 1.21 | (0.73–1.99) |
| 65–74 years | | 1.42 | (0.93–2.19) | 0.008 | 1.91 | (1.18–3.08) |
| ≥75 years | | 1.64 | (1.04–2.58) | <0.001 | 3.41 | (2.06–5.65) |
| Sex | 0.55 | | | | | |
| Female | | 1 | | | | |
| Male | | 1.08 | (0.84–1.40) | | | |
| Dukes’ stage | <0.001 | | | | | |
| A | | 1 | | | 1 | |
| B | | 2.05 | (1.03–4.06) | 0.23 | 1.65 | (0.73–3.71) |
| C | | 4.78 | (2.45–9.34) | <0.001 | 4.91 | (2.24–10.78) |
| D | | 20.29 | (10.44–39.44) | <0.001 | 21.49 | (9.72–47.53) |
| Histological grade | <0.001 | | | | | |
| Low (1–2) | | 1 | | | | |
| High (3–4) | | 1.65 | (1.30–2.15) | | | |
| Visual risk score (TMA spot) | <0.001 | | | 0.03 | | |
| Low Risk | | 1 | | | 1 | |
| High Risk | | 1.67 | (1.28–2.19) | | 1.42 | (1.04–1.94) |
| LSTM risk score | <0.001 | | | <0.001 | | |
| Low Risk | | 1 | | | 1 | |
| High Risk | | 2.3 | (1.79–3.03) | | 1.89 | (1.41–2.53) |
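For readers unfamiliar with how a hazard ratio, its 95% CI and a Wald P-value arise from a Cox model, the following is a minimal univariate sketch in NumPy. It fits a single covariate by Newton-Raphson on the partial likelihood, with ties handled Breslow-style. The function name and the toy data are hypothetical. It is not the software used in the study and does not reproduce the table's values.

```python
import math
import numpy as np

def cox_hazard_ratio(times, events, x, n_iter=25):
    """Fit a one-covariate Cox model; return (HR, CI low, CI high, Wald p-value)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    x = np.asarray(x, dtype=float)

    def score_and_hessian(beta):
        w = np.exp(beta * x)
        grad, hess = 0.0, 0.0
        for t, xi in zip(times[events], x[events]):
            risk = times >= t                      # patients still at risk at time t
            s0 = w[risk].sum()
            s1 = (w[risk] * x[risk]).sum()
            s2 = (w[risk] * x[risk] ** 2).sum()
            grad += xi - s1 / s0
            hess -= s2 / s0 - (s1 / s0) ** 2
        return grad, hess

    beta = 0.0
    for _ in range(n_iter):
        grad, hess = score_and_hessian(beta)
        beta -= grad / hess                        # Newton-Raphson update
    _, hess = score_and_hessian(beta)
    se = math.sqrt(-1.0 / hess)                    # standard error from the observed information
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided Wald test
    return (math.exp(beta), math.exp(beta - 1.96 * se),
            math.exp(beta + 1.96 * se), p_value)

# Toy cohort: 1 = High Risk, 0 = Low Risk; every patient has an observed event.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
high_risk = [1, 1, 0, 1, 0, 1, 0, 0]
hr, ci_low, ci_high, p = cox_hazard_ratio(times, events, high_risk)
```

A multivariate model follows the same idea with a coefficient vector in place of the single `beta`. An established survival analysis package is a better choice than this sketch for real data.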
Figure 4. Features pre-trained on the ImageNet database distinguish tissue types in images of colorectal tumours stained for basic morphology (haematoxylin and eosin). We first split the tissue microarray spot images into individual tiles (n = 380,000) and extracted high-dimensional features from each tile with the convolutional neural network model (VGG-16). We then projected the features onto a 2D plane with t-distributed Stochastic Neighbour Embedding, such that each dot on the scatter plot corresponds to a tile. Each tile contains a (relatively) homogeneous tissue pattern, and hence the tiles group together on the scatter plot according to pattern similarity. Zooming into local areas of the scatter plot identifies tissue entities that group together: stroma, cancer epithelium and infiltrating immune cells. This observation suggests that the VGG-16 model is an efficient descriptor of microscopic images of colorectal samples. Finally, tiles (shown as points) on the scatter plot are coloured according to the histological grade of the sample they belong to.
Figure 5. Individual units of the Long Short-Term Memory prognostic model learnt to separate tissue patterns. The tiles that produce extreme activations (either strongly positive or strongly negative) of individual units within the Long Short-Term Memory outcome prediction network are visualized. The upper half of each grid shows the 98 tiles with the most positive responses of a unit, and the bottom half shows the 98 tiles with the most negative responses of the same unit.
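Selecting the tiles with the most extreme activations for one unit amounts to sorting that unit's responses. A minimal sketch follows, assuming a hypothetical `activations` array with one row per tile and one column per unit. The array layout and function name are illustrative and are not taken from the study.

```python
import numpy as np

def extreme_tiles(activations, unit, k=98):
    """Return indices of the k most positive and the k most negative tiles for one unit."""
    order = np.argsort(activations[:, unit])   # ascending by activation
    most_positive = order[::-1][:k]
    most_negative = order[:k]
    return most_positive, most_negative

# Toy example: five tiles, one unit, keep the two extremes on each side.
activations = np.array([[0.1], [-0.9], [0.8], [0.0], [-0.2]])
top, bottom = extreme_tiles(activations, unit=0, k=2)
```

The returned indices can then be used to look up the corresponding tile images and arrange them in a grid.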