Shidan Wang, Alyssa Chen, Lin Yang, Ling Cai, Yang Xie, Junya Fujimoto, Adi Gazdar, Guanghua Xiao.
Abstract
Pathology images capture tumor histomorphological details in high resolution. However, manual detection and characterization of tumor regions in pathology images is labor-intensive and subjective. Using a deep convolutional neural network (CNN), we developed an automated tumor region recognition system for lung cancer pathology images. From the identified tumor regions, we extracted 22 well-defined shape and boundary features and found that 15 of them were significantly associated with patient survival outcome in lung adenocarcinoma patients from the National Lung Screening Trial. A tumor region shape-based prognostic model was developed and validated in an independent patient cohort (n = 389). The predicted high-risk group had significantly worse survival than the low-risk group (p value = 0.0029). Predicted risk group serves as an independent prognostic factor (high-risk vs. low-risk, hazard ratio = 2.25, 95% CI 1.34-3.77, p value = 0.0022) after adjusting for age, gender, smoking status, and stage. This study provides new insights into the relationship between tumor shape and patient prognosis.
Year: 2018 PMID: 29991684 PMCID: PMC6039531 DOI: 10.1038/s41598-018-27707-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flow chart of analysis process. CNN, convolutional neural network; NLST, the National Lung Screening Trial; TCGA, The Cancer Genome Atlas.
Figure 2. Example results of image-level tumor region detection. (A) Original image. (B) Predicted tumor probability. Each point in the heatmap corresponds to a 300 × 300 pixel image patch in the original 40x image. (C) Predicted region labels. Yellow: white (empty) region; green: tumor region; blue: non-malignant region.
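
The patch-to-heatmap mapping described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes non-overlapping 300 × 300 patches, a fixed class order, and that each patch takes the label of its most probable class. The function names are hypothetical.

```python
PATCH_SIZE = 300  # patch side in pixels at 40x, as stated in the caption

# Class order is an assumption made for this sketch.
CLASSES = ("empty", "tumor", "non_malignant")


def heatmap_shape(width_px, height_px, patch=PATCH_SIZE):
    """Rows and columns of whole, non-overlapping patches in an image."""
    return height_px // patch, width_px // patch


def patch_box(row, col, patch=PATCH_SIZE):
    """Pixel box (left, top, right, bottom) covered by one heatmap pixel."""
    left, top = col * patch, row * patch
    return left, top, left + patch, top + patch


def label_map(prob_grid):
    """Assign each patch the class with the highest predicted probability.

    prob_grid[r][c] holds three probabilities in CLASSES order.
    """
    return [
        [CLASSES[max(range(len(p)), key=p.__getitem__)] for p in row]
        for row in prob_grid
    ]
```

Under these assumptions, a 3000 × 1500 pixel image yields a heatmap of 5 rows by 10 columns, and the heatmap pixel at row 2, column 3 covers the box (900, 600, 1200, 900) in the original image.
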
Univariate analysis of tumor region features in NLST training dataset.
| Feature | HR (95% CI) | p-value |
|---|---|---|
| Number of regions (per 1000) | 1.29 (0.64–2.58) | 0.48 |
| Area sum of all regions (per 1000 pixels*) | 1.030 (1.010–1.050) | 0.0033 |
| Perimeter sum of all regions (per 1000 pixels) | 1.088 (1.028–1.151) | 0.0034 |
| Sum of convex area for all regions (per 1000 pixels) | 1.020 (1.006–1.033) | 0.0047 |
| Sum of filled area for all regions (per 1000 pixels) | 1.027 (1.009–1.045) | 0.0029 |
| Sum of hole numbers of all regions (per 100) | 1.087 (1.031–1.16) | 0.0033 |
| Sum of major axis length of all regions (per 1000 pixels) | 1.40 (1.00–1.96) | 0.051 |
| Sum of minor axis length of all regions (per 1000 pixels) | 2.65 (1.10–6.40) | 0.030 |
| Perimeter²/area of all regions (per 1000) | 1.18 (1.03–1.35) | 0.019 |
| Area of main region (per 1000 pixels) | 1.027 (1.007–1.048) | 0.0093 |
| Convex area of main region (per 1000 pixels) | 1.018 (1.004–1.032) | 0.010 |
| Eccentricity of main region | 6.37 (0.57–71.56) | 0.13 |
| Hole number of main region (per 100) | 1.087 (1.020–1.15) | 0.0060 |
| Extent of main region | 4.90 (0.19–126.30) | 0.34 |
| Filled area for main region (per 1000 pixels) | 1.025 (1.007–1.043) | 0.0072 |
| Major axis length for main region (per 100 pixels) | 1.57 (1.11–2.21) | 0.0099 |
| Minor axis length for main region (per 100 pixels) | 1.73 (1.05–2.83) | 0.031 |
| Angle between the X-axis and the major axis of main region | 0.98 (0.64–1.50) | 0.92 |
| Perimeter of main region (per 1000 pixels) | 1.087 (1.023–1.15) | 0.0068 |
| Solidity of main region | 7.24 (0.45–117.40) | 0.16 |
| Average tumor probability of the main region (per 0.10) | 1.11 (0.53–2.24) | 0.78 |
| Perimeter²/area for main region (per 1000) | 1.21 (1.03–1.42) | 0.021 |
NLST, the National Lung Screening Trial.
*1 pixel in the heatmap = 1 patch in the 40x pathology image. Patch size: 300 × 300 pixels.
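
A few of the features in the table above can be illustrated on a binary tumor mask. The sketch below is an assumption-laden toy, not the authors' implementation: it uses 4-connectivity, takes the main region to be the region with the largest area, and estimates perimeter by counting exposed pixel edges, which will differ from the perimeter estimate of image-analysis libraries. All function names are hypothetical.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumption)


def regions(mask):
    """Connected tumor regions of a 0/1 grid, each a list of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, cells = deque([(r, c)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in NEIGHBORS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            found.append(cells)
    return found


def shape_features(cells):
    """Area, edge-count perimeter, extent, and perimeter^2/area of a region."""
    cell_set = set(cells)
    area = len(cells)
    # Count pixel edges that face a pixel outside the region (holes included).
    perimeter = sum(
        (y + dy, x + dx) not in cell_set
        for y, x in cells
        for dy, dx in NEIGHBORS
    )
    ys = [y for y, _ in cells]
    xs = [x for _, x in cells]
    bbox_area = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
    return {
        "area": area,
        "perimeter": perimeter,
        "extent": area / bbox_area,  # region area / bounding-box area
        "pa_ratio": perimeter ** 2 / area,
    }


def main_region(mask):
    """Largest region by area (assumed definition of the main region)."""
    return max(regions(mask), key=len)
```

Because each heatmap pixel stands for one 300 × 300 patch, an area of 1 heatmap pixel corresponds to 90,000 pixels in the original 40x image.
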
Figure 3. Comparison of tumor shapes with high or low values of eccentricity and PA ratio of main tumor region. Original heatmaps are cropped to the same size with the same image scale. Yellow, main tumor region; green, non-main tumor region; dark blue, non-malignant tissue; blue, blank part of pathology image. PA ratio, perimeter² to area ratio.
Figure 4. Prognostic performance in TCGA validation dataset illustrated by Kaplan-Meier plot. Patients are dichotomized according to median predicted risk score. Difference between the two risk groups: log-rank test, p value = 0.0029.
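
The two steps named in the Figure 4 caption, a median split of the predicted risk score followed by a Kaplan-Meier curve per group, can be sketched in plain Python. This is an illustration only: the handling of scores exactly equal to the median is an assumption, and the function names are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label a patient 'high' when the risk score exceeds the cohort median."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability) steps.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

In practice one would call `kaplan_meier` once for the patients labeled `high` and once for those labeled `low`, then compare the two curves with a log-rank test as the caption describes.
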
Multivariate analysis of predicted risk and clinical variables in TCGA.
| Variable | HR (95% CI) | p-value |
|---|---|---|
| High risk vs. low risk | 2.25 (1.34–3.77) | 0.0022 |
| Age | 1.02 (1.00–1.04) | 0.12 |
| Male vs. female | 0.71 (0.43–1.16) | 0.17 |
| Smoker vs. non-smoker | 0.95 (0.59–1.54) | 0.85 |
| Stage II vs. stage I | 2.58 (1.47–4.51) | <0.001 |
| Stage III vs. stage I | 5.23 (2.85–9.59) | <0.001 |
| Stage IV vs. stage I | 2.69 (1.19–6.09) | 0.017 |
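
The hazard ratios, confidence intervals, and p-values in this table are linked by the usual Wald relationships for a Cox model: HR = exp(β), and the 95% CI is exp(β ± 1.96·SE). The sketch below back-calculates β, SE, and a two-sided p-value from a reported row. It assumes the intervals are Wald intervals, which the source does not state; the function name is hypothetical.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the log hazard ratio, its standard error, z, and a two-sided p.

    Assumes a symmetric Wald interval on the log scale.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# Reported row: high risk vs. low risk, HR 2.25 (1.34-3.77)
beta, se, z, p = wald_from_hr(2.25, 1.34, 3.77)
```

For the high-risk vs. low-risk row this gives a p-value of roughly 0.002, in line with the reported 0.0022 given that the inputs are rounded to two decimals.
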
Patient characteristics of training and validation datasets.
| Characteristic | NLST (training)* | TCGA (validation) | p-value |
|---|---|---|---|
| No. of patients | 150 | 389 | |
| Age | 64.03 ± 5.12 | 64.98 ± 10.33 | 0.16 |
| Gender | | | 0.055 |
| Male | 82 (54.7) | 175 (45.0) | |
| Female | 68 (45.3) | 214 (55.0) | |
| Smoking status | | | 0.0020 |
| Yes | 81 (54.0) | 267 (68.6) | |
| No | 69 (46.0) | 122 (31.4) | |
| Stage | | | 0.0048 |
| I | 101 (67.3) | 222 (57.1) | |
| II | 16 (10.7) | 96 (24.7) | |
| III | 23 (15.3) | 49 (12.6) | |
| IV | 10 (6.7) | 22 (5.7) | |
NLST, the National Lung Screening Trial; TCGA, the Cancer Genome Atlas.
*Values are either mean ± standard deviation, or number (percentage).
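
The source does not say which test produced the p-values for the categorical rows. As one plausible reading, the sketch below applies a Pearson chi-square test with Yates continuity correction to the two 2 × 2 comparisons (gender and smoking status), using the counts from the table. The choice of test is an assumption, and the function name is hypothetical.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Yates-corrected chi-square statistic and p-value for [[a, b], [c, d]]."""
    n = a + b + c + d
    num = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = num / den
    # Survival function of the chi-square distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: NLST, TCGA; columns: male, female
_, p_gender = yates_chi2_2x2(82, 68, 175, 214)
# Rows: NLST, TCGA; columns: smoker, non-smoker
_, p_smoking = yates_chi2_2x2(81, 69, 267, 122)
```

Under this assumption both comparisons land close to the reported values of 0.055 and 0.0020. That agreement is consistent with a continuity-corrected chi-square test but does not rule out other tests, such as Fisher's exact test.
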