Mohsin Bilal, Shan E Ahmed Raza, Ayesha Azam, Simon Graham, Mohammad Ilyas, Ian A Cree, David Snead, Fayyaz Minhas, Nasir M Rajpoot.
Abstract
BACKGROUND: Determining the status of molecular pathways and key mutations in colorectal cancer is crucial for optimal therapeutic decision making. We therefore aimed to develop a novel deep learning pipeline to predict the status of key molecular pathways and mutations from whole-slide images of haematoxylin and eosin-stained colorectal cancer slides as an alternative to current tests.
Year: 2021 PMID: 34686474 PMCID: PMC8609154 DOI: 10.1016/S2589-7500(21)00180-1
Source DB: PubMed Journal: Lancet Digit Health ISSN: 2589-7500
Figure 1: IDaRS prediction pipeline and histopathological feature discovery of colorectal cancer pathways
(A) Tissue segmentation and tile extraction were performed to obtain informative tiles. Model 1 (ResNet18) was trained to separate tumour from non-tumour tiles. The tumour tiles served as input to iterative draw and rank sampling (an adaptation of ResNet34; model 2), which was trained on them for label prediction. (B) A concept diagram of iterative draw and rank sampling illustrating the training strategy for the fast labelling of whole-slide images. The deep learning model was trained iteratively for classification with a random draw (d_i) of the same number of tiles from each whole-slide image, together with the k top-ranked tiles of the same slide from the previous iteration. (C) The trained iterative draw and rank sampling model assigned a prediction score to each tile in the whole-slide image; these tile scores were used to obtain a slide score and to identify the top-ranked tiles from each slide. (D) Model 3 (HoVer-Net) inference was used to segment and classify different types of nuclei in top-ranked representative tiles in a cellular composition analysis of colorectal cancer pathways. Histological patterns of the molecular characteristics of colorectal cancers are shown as a spider plot based on the feature importance of different cellular composition profiles modelled via a support vector machine. H&E=haematoxylin and eosin. IDaRS=iterative draw and rank sampling. NEP1=neoplastic epithelial type 1. NEP2=neoplastic epithelial type 2.
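The draw-and-rank loop described in panels B and C can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `score_tile` is a hypothetical stand-in for the trained tile classifier, the name `r` for the size of the random draw is chosen here, the model update step is omitted, and taking the slide score as the mean of the top-ranked tile scores is an assumed aggregation.

```python
import random

def score_tile(tile):
    # Hypothetical stand-in for the tile classifier (model 2): each toy
    # "tile" is just a number that doubles as its prediction score.
    return tile

def idars_select(tiles, prev_top, r, k, rng):
    """One iteration for one slide: a random draw of r tiles plus the
    k top-ranked tiles carried over from the previous iteration."""
    pool = [t for t in tiles if t not in prev_top]
    draw = rng.sample(pool, min(r, len(pool)))
    batch = draw + list(prev_top)
    # Rank the batch by prediction score and keep the k best for next time.
    ranked = sorted(batch, key=score_tile, reverse=True)
    return batch, ranked[:k]

def run_idars(tiles, r=4, k=2, iterations=5, seed=0):
    rng = random.Random(seed)
    top = []
    for _ in range(iterations):
        batch, top = idars_select(tiles, top, r, k, rng)
        # A real pipeline would update the model on `batch` here.
    return top

def slide_score(tiles, k=2):
    # Assumed aggregation: mean score of the k top-ranked tiles.
    ranked = sorted((score_tile(t) for t in tiles), reverse=True)[:k]
    return sum(ranked) / len(ranked)

tiles = [i / 20 for i in range(20)]  # toy tile scores 0.00 .. 0.95
top_tiles = run_idars(tiles)
```

Because the previous top-ranked tiles are always part of the next batch, the retained set can only stay the same or improve from one iteration to the next, while each iteration touches only `r + k` tiles per slide rather than the whole slide.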
Performance of the iterative draw and rank sampling method for prediction of hypermutation, microsatellite instability, chromosomal instability, and mutation status from haematoxylin and eosin-stained slides of colorectal cancer
| Label | Total | Positive | Negative | Published AUROC | IDaRS AUROC | IDaRS AUPRC |
|---|---|---|---|---|---|---|
| High mutation density | 430 | 67 | 363 | 0·71 | 0·81 (0·03) | 0·57 (0·09) |
| Microsatellite instability | 428 | 62 | 366 | 0·74 (0·66–0·80) | 0·86 (0·04) | 0·62 (0·10) |
| Chromosomal instability | 430 | 313 | 117 | 0·73 | 0·83 (0·02) | 0·92 (0·01) |
| CIMP-high | 239 | 55 | 184 | .. | 0·79 (0·05) | 0·51 (0·05) |
| BRAF mutation | 502 | 59 | 443 | 0·66 | 0·79 (0·01) | 0·33 (0·05) |
| TP53 mutation | 502 | 294 | 208 | 0·64 | 0·73 (0·02) | 0·78 (0·04) |
| KRAS mutation | 502 | 208 | 294 | 0·60 | 0·60 (0·04) | 0·53 (0·04) |
| High vs low mutation density | 359 | 66 | 293 | .. | 0·88 (0·02) | 0·66 (0·06) |
| Microsatellite instability | 359 | 62 | 297 | 0·77 | 0·90 (0·01) | 0·72 (0·02) |
| Chromosomal instability | 359 | 257 | 102 | .. | 0·85 (0·02) | 0·92 (0·02) |
| CIMP-high | 203 | 51 | 152 | .. | 0·84 (0·01) | 0·61 (0·02) |
| Microsatellite instability | 359 | 62 | 297 | 0·77 | 0·90 | 0·72 |
| Microsatellite instability | 47 | 12 | 35 | .. | 0·98 | 0·95 |
Labels considered positive were high mutation density, microsatellite instability, chromosomal instability, CIMP-high, and mutant (BRAFmut, TP53mut, and KRASmut). Labels considered negative were low mutation density, microsatellite stability, genomic stability, CIMP-low, and wild-type (BRAFWT, TP53WT, and KRASWT). Alongside our analysis, we list published results from previous studies (refs 11, 18) of three-fold cross-validation mean AUROCs and a train-to-test split AUROC in the TCGA-CRC-DX cohort. AUPRC=area under the precision-recall curve. AUROC=area under the convex hull of the receiver operating characteristic curve. CIMP=CpG island methylator phenotype. IDaRS=iterative draw and rank sampling. PAIP=Pathology Artificial Intelligence Platform. TCGA-CRC-DX=The Cancer Genome Atlas colon and rectal cancer.
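To make the two metrics concrete, the sketch below computes an AUROC as the area under the convex hull of the ROC points, as defined in the footnote, and an AUPRC as average precision. The labels and scores are invented toy values, all function names are chosen here, and average precision is only one common estimator of AUPRC; the source does not say which estimator was used.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR), one per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), reverse=True)
    pts, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (s, y) in enumerate(pairs):
        tp += y
        fp += 1 - y
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(pairs) or pairs[i + 1][0] != s:
            pts.append((fp / neg, tp / pos))
    return pts

def upper_hull(points):
    """Upper convex hull of points already sorted by x, then y."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross < 0:      # clockwise turn: the middle point stays
                break
            hull.pop()         # middle point lies on or below the hull
        hull.append(p)
    return hull

def area(points):
    # Trapezoidal area under a piecewise-linear curve.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def average_precision(labels, scores):
    # Mean of the precision at each positive, in descending score order
    # (ties between scores are not treated specially in this sketch).
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank
    return total / hits

labels = [1, 1, 0, 1, 0, 0]               # toy ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # toy slide scores
pts = roc_points(labels, scores)
auroc_plain = area(pts)
auroc_hull = area(upper_hull(pts))
auprc = average_precision(labels, scores)
```

The hull-based area is never smaller than the plain trapezoidal area, which is worth keeping in mind when comparing the IDaRS column against values reported elsewhere with a different AUROC definition.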
Figure 2: Iterative draw and rank sampling-based prediction of colorectal cancer pathways in the TCGA-CRC-DX cohort
AUROC plots of four-fold cross-validation for prediction of hypermutation (A), microsatellite instability (B), chromosomal instability (C), CpG island methylator phenotype (D), BRAF mutation status (E), and TP53 mutation status (F). The true positive rate represents sensitivity and the false positive rate represents 1–specificity. The blue shaded areas represent the SD. AUROC=area under the convex hull of the receiver operating characteristic curve. TCGA-CRC-DX=The Cancer Genome Atlas colon and rectal cancer.
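A four-fold evaluation summarised as a mean and SD can be sketched as below. Several things here are assumptions rather than details from the source: the split is stratified by label, the SD is the sample standard deviation, the function name is chosen here, and both the toy cohort and the per-fold AUROC values are invented for illustration.

```python
import random
from statistics import mean, stdev

def stratified_folds(labels, n_folds=4, seed=0):
    """Assign slide indices to folds so each fold keeps roughly the
    same positive-to-negative ratio (assumed splitting strategy)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin.
        for j, i in enumerate(idx):
            folds[j % n_folds].append(i)
    return folds

labels = [1] * 8 + [0] * 32          # toy cohort: 8 positive, 32 negative
folds = stratified_folds(labels)

# Invented per-fold AUROCs standing in for four held-out evaluations.
fold_aurocs = [0.80, 0.84, 0.78, 0.82]
summary = (mean(fold_aurocs), stdev(fold_aurocs))
```

Each fold serves once as the held-out set while the other three are used for training; the shaded band in a plot like this one would then correspond to the spread across those four held-out curves.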
Figure 3: Spider chart of differential cellular compositions as histological features of colorectal cancer pathways
Normalised weights between –1 and 1 indicate the importance of the corresponding histological feature. CIMP-high=CpG island methylator phenotype of high frequencies of DNA hypermethylation. NEP1=neoplastic epithelial type 1. NEP2=neoplastic epithelial type 2.