Shidan Wang, Tao Wang, Lin Yang, Donghan M Yang, Junya Fujimoto, Faliu Yi, Xin Luo, Yikun Yang, Bo Yao, ShinYi Lin, Cesar Moran, Neda Kalhor, Annikka Weissferdt, John Minna, Yang Xie, Ignacio I Wistuba, Yousheng Mao, Guanghua Xiao.
Abstract
BACKGROUND: The spatial distributions of different types of cells could reveal a cancer cell's growth pattern, its relationships with the tumor microenvironment and the immune response of the body, all of which represent key "hallmarks of cancer". However, the process by which pathologists manually recognize and localize all the cells in pathology slides is extremely labor-intensive and error-prone.
Keywords: Cell distribution and interaction; Convolutional neural network; Deep learning; Lung adenocarcinoma; Pathology image; Prognosis
Year: 2019 PMID: 31767541 PMCID: PMC6921240 DOI: 10.1016/j.ebiom.2019.10.033
Source DB: PubMed Journal: EBioMedicine ISSN: 2352-3964 Impact factor: 8.143
Fig. 1. Flow chart of ConvPath-aided pathological image analysis.
CHCAMS, National Cancer Center/Cancer Hospital of Chinese Academy of Medical Sciences, China; CI, confidence interval; HR, hazard ratio; TCGA, The Cancer Genome Atlas.
Fig. 2. Image preprocessing step of the ConvPath software. (a) Selection of regions of interest (ROIs) in whole pathological imaging slides. (b) Image segmentation pipeline to extract cell-centered image patches from selected ROIs.
Fig. 3. Cell type recognition step of the ConvPath software. (a) Schema and structure of the convolutional neural network (CNN) to recognize the types of cells in the centers of image patches. (b) Confusion matrix of internal testing results of CNN on the NLST and TCGA training image slides. Prediction accuracies are calculated based on 3996 image patches for each cell type. (c) Confusion matrix of independent testing results of CNN on image patches of the SPORE dataset. Prediction accuracies are calculated based on 8245 lymphocyte, 2211 stroma, and 6836 tumor patches.
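As an illustration of how per-class prediction accuracies like those in Fig. 3(b, c) can be read off a confusion matrix, here is a minimal sketch. The counts in `TOY_CONFUSION` are invented for the example and are not taken from the paper; the row-is-true, column-is-predicted layout and the function names are also assumptions.

```python
from typing import Dict, Sequence


def per_class_accuracy(
    confusion: Sequence[Sequence[int]], labels: Sequence[str]
) -> Dict[str, float]:
    """Fraction of each true class predicted correctly.

    Assumes rows are true classes and columns are predicted classes.
    """
    accuracy = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        accuracy[label] = confusion[i][i] / row_total if row_total else float("nan")
    return accuracy


def overall_accuracy(confusion: Sequence[Sequence[int]]) -> float:
    """Fraction of all patches whose predicted class equals the true class."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


LABELS = ["lymphocyte", "stroma", "tumor"]

# Hypothetical counts, for illustration only (NOT the paper's results).
TOY_CONFUSION = [
    [90, 5, 5],    # true lymphocyte
    [10, 80, 10],  # true stroma
    [4, 6, 90],    # true tumor
]

acc = per_class_accuracy(TOY_CONFUSION, LABELS)
overall = overall_accuracy(TOY_CONFUSION)
for name in LABELS:
    print(f"{name}: {acc[name]:.3f}")
print(f"overall: {overall:.3f}")
```

When class sizes differ, as in the SPORE set described in panel (c), the per-class and overall figures can diverge, which is one reason to report both.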
Fig. 4. Feature extraction step of the ConvPath software. (a) A zoomed-in part of a sampling region (Supplemental Figure 3) in which cell nuclei centroids are labeled with predicted cell types. Green, stroma; cyan, lymphocyte; yellow, tumor. (b) Cell type region detection using a kernel smoothing algorithm for the sampling region shown in Supplemental Figure 3. Area and perimeters are evaluated for regions of tumor, stroma, and lymphocyte.
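The region-detection idea in Fig. 4(b) can be sketched roughly as follows: smooth the labeled nuclei centroids with a kernel, assign each grid point to the cell type with the highest smoothed density, then measure region area and perimeter. The caption does not specify the kernel, bandwidth, or grid, so the Gaussian kernel, the `bandwidth` value, the pixel-edge perimeter, and all function names below are assumptions for illustration, not ConvPath's implementation.

```python
import math
from typing import List, Sequence, Tuple

Cell = Tuple[float, float, str]  # (x, y, predicted cell type)


def smooth_regions(
    cells: Sequence[Cell], width: int, height: int, bandwidth: float = 3.0
) -> List[List[str]]:
    """Label every grid point with the cell type of highest Gaussian-kernel density."""
    labels = sorted({lab for _, _, lab in cells})
    grid = []
    for gy in range(height):
        row = []
        for gx in range(width):
            density = {lab: 0.0 for lab in labels}
            for x, y, lab in cells:
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                density[lab] += math.exp(-d2 / (2.0 * bandwidth ** 2))
            # Ties are broken by alphabetical label order (an arbitrary choice).
            row.append(max(labels, key=lambda lab: density[lab]))
        grid.append(row)
    return grid


def region_area(grid: List[List[str]], label: str) -> int:
    """Area of a region, counted in grid points."""
    return sum(row.count(label) for row in grid)


def region_perimeter(grid: List[List[str]], label: str) -> int:
    """Number of grid-point edges where `label` meets another label or the border."""
    h, w = len(grid), len(grid[0])
    edges = 0
    for r in range(h):
        for c in range(w):
            if grid[r][c] != label:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < h and 0 <= cc < w) or grid[rr][cc] != label:
                    edges += 1
    return edges


# Hypothetical centroids: a tumor cluster on the left, stroma on the right.
toy_cells = [(2, 5, "tumor"), (4, 5, "tumor"), (16, 5, "stroma"), (18, 5, "stroma")]
toy_grid = smooth_regions(toy_cells, width=20, height=10)
for name in ("tumor", "stroma"):
    print(name, region_area(toy_grid, name), region_perimeter(toy_grid, name))
```

The bandwidth controls how far each nucleus "claims" surrounding tissue: a larger value gives smoother, coarser regions, and a smaller one follows individual cells more closely.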
Fig. 5. Application of the prognostic model to independent datasets. (a, b) Validation of the prognostic model in the TCGA overall survival data (a, log rank test, p = 0.0047) and the CHCAMS recurrence data (b, log rank test, p = 0.030). (c) Boxplot for the distribution of predicted risk scores in the 5 histological subtypes of lung adenocarcinoma for the CHCAMS dataset patients. Jonckheere-Terpstra k-sample test, p = 0.0039. The boxes and whiskers show the lower (Q1) and upper (Q3) quartiles and the median for each histological subtype.
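For readers unfamiliar with the log rank test used in Fig. 5(a, b), the following is a minimal two-group sketch in pure Python. It is the textbook form of the test (observed minus expected events with a hypergeometric variance, compared against a chi-square distribution with 1 degree of freedom); the record does not say which software the authors used, and the survival times below are invented for illustration.

```python
import math
from typing import Sequence, Tuple


def logrank_test(
    times_a: Sequence[float],
    events_a: Sequence[int],
    times_b: Sequence[float],
    events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log rank test. Returns (chi-square statistic, p value).

    events_* hold 1 for an observed event and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("log rank statistic is undefined (zero variance)")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical follow-up times (all events observed), for illustration only.
high_risk_times = [1, 2, 3, 4, 5]
low_risk_times = [6, 7, 8, 9, 10]
chi2_demo, p_demo = logrank_test(
    high_risk_times, [1] * 5, low_risk_times, [1] * 5
)
print(f"chi2 = {chi2_demo:.2f}, p = {p_demo:.4f}")
```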
Multivariate analysis of the predicted risk scores in the CHCAMS and TCGA datasets adjusted by clinical variables.
| TCGA dataset ( | HR | 95% CI | p value |
|---|---|---|---|
| High risk vs. low risk | 2.19 | 1.33–3.60 | 0.0021 |
| Age (per year) | 1.03 | 1.01–1.06 | 0.014 |
| Male vs. female | 0.69 | 1.45–1.16 | 0.16 |
| Smoker vs. non-smoker | 0.88 | 0.53–1.47 | 0.62 |
| Stage | | | |
| Stage I | ref | – | |
| Stage II | 2.69 | 1.45–5.00 | 0.0017 |
| Stage III | 5.04 | 2.69–9.43 | <0.001 |
| Stage IV | 6.06 | 2.49–14.73 | <0.001 |

| CHCAMS dataset ( | HR | 95% CI | p value |
|---|---|---|---|
| High risk vs. low risk | 2.21 | 1.16–4.21 | 0.016 |
| Age (per year) | 1.02 | 0.99–1.06 | 0.202 |
| Male vs. female | 1.85 | 0.69–4.91 | 0.22 |
| Smoker vs. non-smoker | 0.76 | 0.28–2.04 | 0.585 |
CHCAMS, National Cancer Center/Cancer Hospital of Chinese Academy of Medical Sciences, China; CI, confidence interval; HR, hazard ratio; TCGA, The Cancer Genome Atlas.
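The hazard ratios, confidence intervals, and p values in the table are linked. As a hedged sketch, if a row comes from a Cox model with a Wald-type 95% interval (an assumption; the record does not state how the intervals were computed), the log hazard ratio, its standard error, and a two-sided p value can be recovered from the reported HR and CI. The function name `wald_from_hr_ci` is hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_hr_ci(hr: float, lower: float, upper: float) -> Tuple[float, float, float, float]:
    """Recover (beta, standard error, z, two-sided p) from an HR and its 95% CI.

    Assumes the interval is exp(beta +/- Z_95 * se), i.e. symmetric on the log scale.
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, z, p_value


# "High risk vs. low risk" row of the TCGA table: HR 2.19, 95% CI 1.33-3.60.
beta, se, z, p = wald_from_hr_ci(2.19, 1.33, 3.60)
print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Because the table values are rounded to two decimals, a p value recovered this way should be read as approximate rather than as a reproduction of the reported figure.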