Shuo Wang1,2,3, Jingyun Shi4,3, Zhaoxiang Ye5,3, Di Dong1,2,3, Dongdong Yu1,2,3, Mu Zhou6,3, Ying Liu5, Olivier Gevaert6, Kun Wang1, Yongbei Zhu1, Hongyu Zhou7, Zhenyu Liu1, Jie Tian1,2,8.
Abstract
Epidermal growth factor receptor (EGFR) genotyping is critical for treatment decisions in lung adenocarcinoma, such as the use of tyrosine kinase inhibitors. Conventional identification of EGFR genotype requires biopsy and sequence testing, which is invasive and can be hampered by difficulty in accessing tissue samples. Here, we propose a deep learning model to predict EGFR mutation status in lung adenocarcinoma using non-invasive computed tomography (CT).

We retrospectively collected data from 844 lung adenocarcinoma patients with pre-operative CT images, EGFR mutation status and clinical information from two hospitals. An end-to-end deep learning model was proposed to predict EGFR mutation status from CT scans.

Trained on 14 926 CT images, the deep learning model achieved encouraging predictive performance in both the primary cohort (n=603; AUC 0.85, 95% CI 0.83-0.88) and the independent validation cohort (n=241; AUC 0.81, 95% CI 0.79-0.83), a significant improvement over previous approaches using hand-crafted CT features or clinical characteristics (p<0.001). The deep learning score differed significantly between EGFR-mutant and EGFR-wild type tumours (p<0.001).

Since CT is routinely used in lung cancer diagnosis, the deep learning model provides a non-invasive and easy-to-use method for EGFR mutation status prediction.
Year: 2019 PMID: 30635290 PMCID: PMC6437603 DOI: 10.1183/13993003.00986-2018
Source DB: PubMed Journal: Eur Respir J ISSN: 0903-1936 Impact factor: 16.671
FIGURE 1 Illustration of the deep learning model. This model is composed of convolutional layers with kernel sizes 3×3 and 1×1, batch normalisation and pooling layers. Sub-network 1 shares the same structure as the first 20 layers of DenseNet [31], which was pre-trained using 1.28 million natural images. Sub-network 2 was trained on the epidermal growth factor receptor (EGFR) mutation dataset, aiming to capture the association between image features and EGFR mutation labels. When a tumour is fed into the deep learning model, the model predicts the probability of the tumour being EGFR-mutant. CT: computed tomography.
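The layer types named in the caption can be illustrated with a toy forward pass. This is a minimal pure-Python sketch, not the authors' network: the patch, the kernel weights, the bias and the helper names (`conv2d`, `relu`, `global_average_pool`, `sigmoid`) are all hypothetical, and batch normalisation and the pre-trained DenseNet sub-network are left out.

```python
import math

def conv2d(image, kernel):
    """'Valid' 2D convolution (cross-correlation, as in deep learning
    libraries) of a single-channel image with a square kernel."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(cols)]
            for i in range(rows)]

def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]

def global_average_pool(fmap):
    """Collapse a feature map to its mean value."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 4x4 "tumour patch" and hypothetical weights.
patch = [[0.0, 0.1, 0.2, 0.1],
         [0.1, 0.8, 0.9, 0.2],
         [0.1, 0.9, 0.7, 0.1],
         [0.0, 0.2, 0.1, 0.0]]
kernel_3x3 = [[0.0, 0.1, 0.0],
              [0.1, 0.6, 0.1],
              [0.0, 0.1, 0.0]]
kernel_1x1 = [[2.0]]  # with one channel, a 1x1 kernel is a per-pixel rescaling

features = relu(conv2d(patch, kernel_3x3))  # 4x4 patch -> 2x2 feature map
features = conv2d(features, kernel_1x1)     # 2x2 -> 2x2
# The -1.0 bias is hypothetical; the sigmoid maps the pooled value to (0, 1).
probability = sigmoid(global_average_pool(features) - 1.0)
```

In a multi-channel network a 1×1 convolution mixes channels at each pixel rather than merely rescaling, which is why it is commonly paired with 3×3 layers.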
Clinical characteristics of patients in the primary and validation cohorts
| Characteristic | Primary cohort | | p-value | Validation cohort | | p-value |
| Patients, n | 603 | | | 241 | | |
| Age, years | 59.50±9.72 | 61.36±8.96 | 0.016 | 59.59±8.83 | 59.21±7.28 | 0.716 |
| Sex | | | <0.001 | | | <0.001 |
| Female | 99 (39.76) | 206 (58.19) | | 52 (42.62) | 79 (66.39) | |
| Male | 150 (60.24) | 148 (41.81) | | 70 (57.38) | 40 (33.61) | |
| Stage | | | 0.047 | | | 0.017 |
| I | 181 (72.69) | 240 (67.80) | | 50 (40.98) | 65 (54.62) | |
| II | 27 (10.84) | 27 (7.63) | | 22 (18.03) | 8 (6.72) | |
| III | 36 (14.46) | 69 (19.49) | | 43 (35.25) | 35 (29.41) | |
| IV | 5 (2.01) | 18 (5.08) | | 7 (5.74) | 11 (9.24) | |
| | 249 (41.29) | 354 (58.71) | | 122 (50.62) | 119 (49.38) | |
Data are presented as mean±sd, or n (%), unless otherwise stated. EGFR: epidermal growth factor receptor.
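The n (%) entries and the sex p-values in the table can be cross-checked with a few lines of arithmetic. The sketch below assumes a Pearson chi-square test without continuity correction for the sex comparison; the source does not state which test was used, so this is only a plausibility check, and the helper names `percent` and `chi_square_2x2` are hypothetical.

```python
import math

def percent(count, total):
    """Percentage rounded to two decimals, as reported in the table."""
    return round(100.0 * count / total, 2)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p-value) for 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Group sizes taken from the table, in column order.
primary_share = [percent(n, 603) for n in (249, 354)]
validation_share = [percent(n, 241) for n in (122, 119)]
female_share_primary = [percent(99, 249), percent(206, 354)]

# Rows: female, male; columns: the two EGFR groups in table order.
stat_primary, p_primary = chi_square_2x2(99, 206, 150, 148)
stat_validation, p_validation = chi_square_2x2(52, 79, 70, 40)
```

Under this assumed test both cohorts give p-values below 0.001, in line with the reported values.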
Predictive performance of various methods in the primary and validation cohorts
| Primary | 0.66 (0.62–0.70) | 61.60 (57.90–65.15) | 64.39 (59.75–68.90) | 56.75 (50.65–62.68) |
| Validation | 0.61 (0.58–0.64) | 61.83 (58.88–64.88) | 56.30 (52.41–60.41) | 67.21 (63.20–71.20) |
| Primary | 0.76 (0.72–0.80) | 64.77 (61.31–68.22) | 71.49 (67.86–75.09) | 61.22 (57.45–65.12) |
| Validation | 0.64 (0.61–0.67) | 62.24 (59.94–64.72) | 63.03 (59.61–66.60) | 61.48 (58.22–64.92) |
| Primary | 0.70 (0.66–0.74) | 66.27 (62.96–69.83) | 40.98 (35.82–46.34) | |
| Validation | 0.64 (0.61–0.67) | 61.47 (58.69–64.69) | 64.04 (60.34–68.34) | 58.97 (55.10–63.10) |
| Primary | 76.83 (73.17–80.49) | |||
| Validation |
Data are presented as % (95% CI). All the results in the primary cohort were evaluated by five-fold cross-validation. Bold type represents the best performance. AUC: area under the receiver operating characteristic curve.
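For readers unfamiliar with the metrics in this table, the following is a minimal sketch of how AUC, accuracy, sensitivity and specificity can be computed from model scores, plus a simple five-fold split. The labels and scores are made-up toy values, the 0.5 decision threshold is an assumption, and the fold assignment shown is not stratified; the source does not describe how its folds were built.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (Mann-Whitney form); ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def classification_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity in percent."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    accuracy = 100.0 * (tp + tn) / len(labels)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity

def five_fold_indices(n_samples):
    """Assign sample indices to five folds in round-robin order."""
    return [list(range(k, n_samples, 5)) for k in range(5)]

# Toy data: 1 = EGFR-mutant, 0 = EGFR-wild type (hypothetical values).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

toy_auc = auc(toy_labels, toy_scores)
toy_metrics = classification_metrics(toy_labels, toy_scores)
folds = five_fold_indices(10)
```

In cross-validation each fold serves once as the held-out set while the model is fitted on the remaining four.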
FIGURE 2 Predictive performance of the deep learning model. a) Receiver operating characteristic curves of the deep learning (DL) model, radiomics model, semantic model and clinical model in the primary/validation cohorts. b) DL score between epidermal growth factor receptor (EGFR)-mutant and EGFR-wild type groups in the primary and validation cohorts. c) Decision curve of the DL model. The green line represents the benefit of treating all the patients as EGFR-wild type, and the blue line represents the benefit of treating all the patients as EGFR-mutant. The red line shows the benefit of using the DL model.
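The three lines of the decision curve in panel c can be reproduced in principle with the standard net-benefit formula from decision curve analysis, NB = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The sketch below assumes that standard definition, which the caption does not spell out, and uses made-up toy labels and scores.

```python
def net_benefit_model(labels, scores, pt):
    """Net benefit of calling a tumour EGFR-mutant when its score >= pt."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= pt)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= pt)
    return tp / n - (fp / n) * pt / (1.0 - pt)

def net_benefit_all_mutant(labels, pt):
    """Net benefit of treating every patient as EGFR-mutant."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * pt / (1.0 - pt)

def net_benefit_all_wild_type(labels, pt):
    """Treating every patient as EGFR-wild type yields no true or false
    positives, so the net benefit is zero at every threshold."""
    return 0.0

# Toy data: 1 = EGFR-mutant, 0 = EGFR-wild type (hypothetical values).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

nb_model = net_benefit_model(toy_labels, toy_scores, pt=0.5)
nb_all = net_benefit_all_mutant(toy_labels, pt=0.5)
nb_none = net_benefit_all_wild_type(toy_labels, pt=0.5)
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which is what a red line lying above the green and blue lines indicates.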
FIGURE 3 Suspicious tumour area discovery. We used 0.5 as the cut-off value to obtain the suspicious areas from the attention map of the deep learning (DL) model. EGFR: epidermal growth factor receptor.
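Thresholding an attention map at 0.5 can be sketched as follows. This is an illustration only: it assumes the map is min-max normalised to [0, 1] before the cut-off is applied and that values equal to the cut-off are included, neither of which is stated in the caption. The function name `suspicious_mask` and the toy map are hypothetical.

```python
def suspicious_mask(attention_map, cutoff=0.5):
    """Return a binary mask marking pixels whose normalised attention
    value is at or above the cut-off."""
    flat = [v for row in attention_map for v in row]
    lowest, highest = min(flat), max(flat)
    span = (highest - lowest) or 1.0  # avoid division by zero on flat maps
    return [[1 if (v - lowest) / span >= cutoff else 0 for v in row]
            for row in attention_map]

# Hypothetical 3x3 attention map with arbitrary units.
toy_map = [[0.0, 0.2, 0.1],
           [0.3, 2.0, 1.6],
           [0.1, 1.2, 0.4]]

mask = suspicious_mask(toy_map)
```

Pixels marked 1 in `mask` correspond to the highlighted suspicious area; all others are treated as background.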
FIGURE 4 Deep learning feature analysis. a) Convolutional filters (Conv_) from the 2nd, 13th, 20th and 24th layers of the deep learning model. Each convolutional layer includes hundreds of filters, and only the first three filters are illustrated in each layer. b) Response of the negative filter and the positive filter in epidermal growth factor receptor (EGFR)-mutant/-wild type tumours. The positive filter has strong response to EGFR-mutant tumours and the negative filter has strong response to EGFR-wild type tumours. All the tumour images are from the validation cohort. c) Response value of the positive and the negative filters in the two cohorts. d) Unsupervised clustering of lung adenocarcinoma patients (n=844) on the vertical axis and deep learning feature expression (feature dimension=32, the Conv_24 layer) on the horizontal axis.
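The unsupervised clustering in panel d groups patients by their deep learning feature vectors. The caption does not name the clustering algorithm, so the sketch below assumes average-linkage agglomerative clustering with Euclidean distance as one common choice; the function names and the two-dimensional toy features (standing in for the 32-dimensional Conv_24 features) are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative(features, n_clusters):
    """Average-linkage agglomerative clustering.
    Returns a list of clusters, each a sorted list of row indices."""
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(euclidean(features[i], features[j])
                            for i in clusters[p] for j in clusters[q])
                dist = total / (len(clusters[p]) * len(clusters[q]))
                if best is None or dist < best[0]:
                    best = (dist, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]  # merge the closest pair
        del clusters[q]
    return [sorted(c) for c in clusters]

# Toy feature vectors for four hypothetical patients.
toy_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
groups = agglomerative(toy_features, n_clusters=2)
```

This brute-force version is only suitable for small examples; a library implementation would be preferable for a cohort of 844 patients.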