Zhijun Wu, Lin Wang, Churong Li, Yongcong Cai, Yuebin Liang, Xiaofei Mo, Qingqing Lu, Lixin Dong, Yonggang Liu.
Abstract
For patients who cannot undergo radical surgery, predicting the risk of lung cancer recurrence and metastasis is critical, as it allows physicians to design an appropriate adjuvant therapy plan. However, traditional circulating tumor cell (CTC) detection and next-generation sequencing (NGS)-based methods are usually expensive and time-consuming, which underscores the need for more efficient computational models. In this study, we established a convolutional neural network (CNN) framework called DeepLRHE to predict the recurrence risk of lung cancer by analyzing histopathological images of patients. The DeepLRHE pipeline comprises automatic tumor region identification, image normalization, biomarker identification, and sample classification. We used 110 lung cancer samples downloaded from The Cancer Genome Atlas (TCGA) database to train and validate our CNN model, and 101 samples as an independent test dataset. The area under the receiver operating characteristic (ROC) curve (AUC) on the test dataset was 0.79, indicating relatively good prediction performance. Our study demonstrates that features extracted from histopathological images can be used to predict lung cancer recurrence after surgical resection and to help identify patients who should receive additional adjuvant therapy.
Keywords: convolutional neural network; hematoxylin and eosin staining; histopathological image; lung cancer; recurrence
Year: 2020 PMID: 33193560 PMCID: PMC7477356 DOI: 10.3389/fgene.2020.00768
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1 The flowchart of this study. (A) The whole-slide images (WSIs) of lung cancer downloaded from The Cancer Genome Atlas database. (B) Construction of a dataset consisting of annotated WSIs split into non-overlapping 512 × 512 pixel windows. (C) Color normalization. (D) Convolutional neural network (CNN) model training. (E) Heat map and classification of a testing sample. Each tile from the test image was classified by the trained CNN, and the results were aggregated per slide to produce the heat map.
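The tiling step in Figure 1B splits each WSI into non-overlapping 512 × 512 pixel windows. A minimal NumPy sketch of that step is shown below; `tile_image` is an illustrative helper, not code from the paper, and real WSIs would be read region-by-region rather than loaded whole into memory.

```python
import numpy as np

def tile_image(img: np.ndarray, tile: int = 512):
    """Split an image array (H, W, C) into non-overlapping tile x tile
    windows, discarding partial tiles at the right/bottom edges."""
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tiles.append(img[y:y + tile, x:x + tile])
    return tiles

# A toy 1024 x 1536 "slide" yields a 2 x 3 grid of 512 x 512 tiles.
slide = np.zeros((1024, 1536, 3), dtype=np.uint8)
tiles = tile_image(slide)
print(len(tiles))  # 6
```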
FIGURE 2 Color normalization of H&E slices. (A) The de-noising process applied to regions that have large blank spaces in the tumor regions. (B) The deep convolutional Gaussian mixture model (DCGMM) used for color normalization. The left column shows the original images, and the right column shows the images after color normalization.
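The paper performs color normalization with a DCGMM. As a much simpler stand-in that illustrates the goal (mapping stain color statistics of a source slice toward a reference), the sketch below matches per-channel means and standard deviations, Reinhard-style. This is NOT the paper's DCGMM; `match_channel_stats` and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def match_channel_stats(src: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Shift/scale each color channel of `src` so its mean and std match
    those of `ref` -- a simple Reinhard-style normalization, shown only
    to illustrate what stain normalization aims to do."""
    src = src.astype(np.float64)
    ref = ref.astype(np.float64)
    out = np.empty_like(src)
    for c in range(src.shape[2]):
        s_mu, s_sd = src[..., c].mean(), src[..., c].std() + 1e-8
        r_mu, r_sd = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (src[..., c] - s_mu) / s_sd * r_sd + r_mu
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy source/reference patches standing in for H&E tiles.
rng = np.random.default_rng(0)
src = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
ref = rng.integers(100, 151, size=(64, 64, 3), dtype=np.uint8)
out = match_channel_stats(src, ref)
```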
FIGURE 3 The ResNet network workflow.
FIGURE 4 Receiver operating characteristic (ROC) curve and heat map on The Cancer Genome Atlas (TCGA) training data. (A) ROC curve of the test data with 512 × 512 pixel images. (B) Heat map of the tumor region produced by the convolutional neural network (CNN) model on the TCGA dataset. In the heat map, suspected tumor areas appear red while normal areas appear predominantly blue, consistent with expectations.
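Figure 1E and Figure 4B describe aggregating per-tile CNN predictions into a slide-level heat map. A minimal sketch of that aggregation is below; the mean-probability slide score and the `slide_heatmap` helper are illustrative assumptions, since the paper's exact aggregation rule is not specified in this record.

```python
import numpy as np

def slide_heatmap(tile_probs, grid_shape):
    """Arrange per-tile tumor probabilities (row-major order) into a
    heat-map grid and score the slide as the mean tile probability."""
    heat = np.asarray(tile_probs, dtype=float).reshape(grid_shape)
    return heat, float(heat.mean())

# Six tile probabilities from a hypothetical 2 x 3 tile grid.
probs = [0.9, 0.8, 0.2, 0.1, 0.95, 0.05]
heat, score = slide_heatmap(probs, (2, 3))
print(score)  # 0.5
```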
Confusion matrix definitions.
| Actual \ Predicted | Positive | Negative |
| Positive | True positive (TP) | False negative (FN) |
| Negative | False positive (FP) | True negative (TN) |
Clinical characteristics.
| Characteristic | Category | Count |
| Age | Mean (range) | 54 (31–83) |
| Gender | Male | 62 |
| | Female | 47 |
| | Unknown | 1 |
| Sample type | H&E | 1 |
| Metastasis and recurrence status | Tumor free | 35 |
| | Locoregional recurrence | 15 |
| | Distant metastasis | 60 |
| Cancer subtype | Adenocarcinoma | 58 |
| | Squamous carcinoma | 52 |
Tuning of the hyper-parameters.
| Activation function | Epochs | Training AUC | Test AUC |
| softmax | 100 | 0.79 | 0.73 |
| softmax | 150 | 0.82 | 0.75 |
| softmax | 200 | 0.82 | 0.78 |
| relu | 100 | 0.81 | 0.74 |
| relu | 150 | 0.84 | 0.78 |
| relu | 200 | 0.83 | 0.80 |
| tanh | 100 | 0.73 | 0.60 |
| tanh | 150 | 0.76 | 0.67 |
| tanh | 200 | 0.77 | 0.69 |
The confusion matrix of the model for the test dataset.
| Actual \ Predicted | High risk | Low risk | Total |
| High risk | 49 | 9 | 58 |
| Low risk | 14 | 29 | 43 |
| Total | 63 | 38 | 101 |
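The counts in the test-set confusion matrix can be turned into the usual performance metrics. The sketch below reads the cells per the definition table (row = actual class, column = predicted class, high risk treated as the positive class); that cell-to-TP/FN/FP/TN mapping is my reading of the tables above, not stated explicitly in this record.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, sensitivity, and specificity from confusion-matrix
    counts (row = actual, column = predicted)."""
    total = tp + fn + fp + tn
    return {
        "accuracy":    (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Counts from the test-set confusion matrix above.
m = confusion_metrics(tp=49, fn=9, fp=14, tn=29)
print(round(m["accuracy"], 3))  # 0.772
```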