Guojin Zhang, Liangna Deng, Jing Zhang, Yuntai Cao, Shenglin Li, Jialiang Ren, Rong Qian, Shengkun Peng, Xiaodi Zhang, Junlin Zhou, Zhuoli Zhang, Weifang Kong, Hong Pu.
Abstract
Background: This study aimed to noninvasively predict the mutation status of epidermal growth factor receptor (EGFR) molecular subtypes in lung adenocarcinoma based on CT radiomics features.
Keywords: EGFR; NSCLC; computed tomography; lung adenocarcinoma; radiomics
Year: 2022 PMID: 35574401 PMCID: PMC9098955 DOI: 10.3389/fonc.2022.889293
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Flowchart of the inclusion and exclusion criteria.
Figure 2. Flowchart of the radiomics process. (A) Tumours were segmented on CT images to form the region of interest (ROI). (B) Radiomics features were extracted from the ROI. (C) Radiomics feature dimensionality reduction. (D) Construction of the radiomics model.
The relationship between clinical variables of patients and EGFR molecular subtypes (Del-19 mutation vs. Wild-type) in the training set.
| Variable | Total (n = 395) | Del-19 mutation (n = 154) | Wild-type (n = 241) | Univariate analysis, P | Multivariate analysis, OR (95% CI) | Multivariate analysis, P |
|---|---|---|---|---|---|---|
| Age (years) | | | | <0.001 | 0.972 | 0.021 |
| - Mean ± SD | 56.70 ± 9.19 | 54.87 ± 8.13 | 58.87 ± 9.65 | | | |
| - Median (Q1, Q3) | 56.0 (50.0, 63.0) | 55.0 (49.0, 61.0) | 58.0 (51.0, 65.0) | | | |
| - Range | 26-79 | 32-78 | 26-79 | | | |
| Sex (%) | | | | <0.001 | | <0.001 |
| - Male | 221 (55.9%) | 51 (33.1%) | 170 (70.5%) | | Reference | |
| - Female | 174 (44.1%) | 103 (66.9%) | 71 (29.5%) | | 3.193 | |
| Smoking history (%) | | | | <0.001 | | NA |
| - No | 240 (60.8%) | 122 (79.2%) | 118 (49.0%) | | | |
| - Yes | 155 (39.2%) | 32 (20.8%) | 123 (51.0%) | | | |
| CEA (%) | | | | 0.391 | | NA |
| - Normal | 167 (42.3%) | 61 (39.6%) | 106 (44.0%) | | | |
| - High | 228 (57.7%) | 93 (60.4%) | 135 (56.0%) | | | |
| Lobe location (%) | | | | 0.959 | | NA |
| - Right upper lobe | 135 (34.2%) | 54 (35.1%) | 81 (33.6%) | | | |
| - Right middle lobe | 17 (4.3%) | 6 (3.9%) | 11 (4.6%) | | | |
| - Right lower lobe | 97 (24.6%) | 35 (22.7%) | 62 (25.7%) | | | |
| - Left upper lobe | 81 (20.5%) | 33 (21.4%) | 48 (19.9%) | | | |
| - Left lower lobe | 65 (16.5%) | 26 (16.9%) | 39 (16.2%) | | | |
CEA, Carcinoembryonic antigen; CI, Confidence interval; Del-19, Exon-19 deletion mutation; EGFR, Epidermal growth factor receptor; NA, not applicable; OR, Odds ratio; SD, Standard deviation; vs., versus.
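As a rough illustration of how an odds ratio relates to the counts in the table above, the sketch below computes the crude (unadjusted) odds ratio for sex from the four cell counts. The function name `crude_odds_ratio` and the Wald-type 95% interval are assumptions made for this example and are not taken from the study. The crude value is not expected to match the reported multivariate OR of 3.193, which comes from a model that also includes other variables.

```python
import math

def crude_odds_ratio(exposed_cases, exposed_controls,
                     unexposed_cases, unexposed_controls):
    """Crude odds ratio from a 2x2 table, with a Wald-type 95% interval.

    Hypothetical helper for illustration; not the authors' analysis code.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_controls
                          + 1 / unexposed_cases + 1 / unexposed_controls)
    low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, low, high

# Female vs. male (reference), Del-19 mutation vs. wild-type, counts from the table:
# female: 103 Del-19, 71 wild-type; male: 51 Del-19, 170 wild-type.
or_sex, ci_low, ci_high = crude_odds_ratio(103, 71, 51, 170)
print(f"crude OR = {or_sex:.2f}")  # about 4.84
```

The gap between the crude value and the adjusted one is what adjustment for the other covariates in the multivariate model would be expected to produce.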
The relationship between clinical variables of patients and EGFR molecular subtypes (L858R mutation vs. Wild-type) in the training set.
| Variable | Total (n = 386) | L858R mutation (n = 145) | Wild-type (n = 241) | Univariate analysis, P | Multivariate analysis, OR (95% CI) | Multivariate analysis, P |
|---|---|---|---|---|---|---|
| Age (years) | | | | 0.686 | | NA |
| - Mean ± SD | 58.12 ± 9.46 | 58.50 ± 9.15 | 58.87 ± 9.65 | | | |
| - Median (Q1, Q3) | 58.0 (52.0, 65.0) | 58.0 (53.0, 64.0) | 58.0 (51.0, 65.0) | | | |
| - Range | 21-82 | 21-82 | 26-79 | | | |
| Sex (%) | | | | <0.001 | | <0.001 |
| - Male | 223 (57.8%) | 53 (36.6%) | 170 (70.5%) | | Reference | |
| - Female | 163 (42.2%) | 92 (63.4%) | 71 (29.5%) | | 2.612 | |
| Smoking history (%) | | | | <0.001 | | 0.004 |
| - No | 234 (60.6%) | 116 (80.0%) | 118 (49.0%) | | Reference | |
| - Yes | 152 (39.4%) | 29 (20.0%) | 123 (51.0%) | | 0.427 | |
| CEA (%) | | | | 0.301 | | NA |
| - Normal | 162 (42.0%) | 56 (38.6%) | 106 (44.0%) | | | |
| - High | 224 (58.0%) | 89 (61.4%) | 135 (56.0%) | | | |
| Lobe location (%) | | | | 0.262 | | NA |
| - Right upper lobe | 124 (32.1%) | 43 (29.7%) | 81 (33.6%) | | | |
| - Right middle lobe | 24 (6.2%) | 13 (9.0%) | 11 (4.6%) | | | |
| - Right lower lobe | 99 (25.6%) | 37 (25.5%) | 62 (25.7%) | | | |
| - Left upper lobe | 83 (21.5%) | 35 (24.1%) | 48 (19.9%) | | | |
| - Left lower lobe | 56 (14.5%) | 17 (11.7%) | 39 (16.2%) | | | |
CEA, Carcinoembryonic antigen; CI, Confidence interval; EGFR, Epidermal growth factor receptor; L858R, Exon-21 L858R point mutation; NA, not applicable; OR, Odds ratio; SD, Standard deviation; vs., versus.
The relationship between clinical variables of patients and EGFR molecular subtypes (Del-19 mutation vs. L858R mutation) in the training set.
| Variable | Total (n = 299) | Del-19 mutation (n = 154) | L858R mutation (n = 145) | Univariate analysis, P | Multivariate analysis, OR (95% CI) | Multivariate analysis, P |
|---|---|---|---|---|---|---|
| Age (years) | | | | <0.001 | 1.050 | <0.001 |
| - Mean ± SD | 56.63 ± 8.81 | 54.87 ± 8.13 | 58.50 ± 9.15 | | | |
| - Median (Q1, Q3) | 56.0 (50.5, 62.5) | 55.0 (49.0, 61.0) | 58.0 (53.0, 64.0) | | | |
| - Range | 21-82 | 32-78 | 21-82 | | | |
| Sex (%) | | | | 0.533 | | NA |
| - Male | 104 (34.8%) | 51 (33.1%) | 53 (36.6%) | | | |
| - Female | 195 (65.2%) | 103 (66.9%) | 92 (63.4%) | | | |
| Smoking history (%) | | | | 0.867 | | NA |
| - No | 238 (79.6%) | 122 (79.2%) | 116 (80.0%) | | | |
| - Yes | 61 (20.4%) | 32 (20.8%) | 29 (20.0%) | | | |
| CEA (%) | | | | 0.861 | | NA |
| - Normal | 117 (39.1%) | 61 (39.6%) | 56 (38.6%) | | | |
| - High | 182 (60.9%) | 93 (60.4%) | 89 (61.4%) | | | |
| Lobe location (%) | | | | 0.235 | | NA |
| - Right upper lobe | 97 (32.4%) | 54 (35.1%) | 43 (29.7%) | | | |
| - Right middle lobe | 19 (6.4%) | 6 (3.9%) | 13 (9.0%) | | | |
| - Right lower lobe | 72 (24.1%) | 35 (22.7%) | 37 (25.5%) | | | |
| - Left upper lobe | 68 (22.7%) | 33 (21.4%) | 35 (24.1%) | | | |
| - Left lower lobe | 43 (14.4%) | 26 (16.9%) | 17 (11.7%) | | | |
CEA, Carcinoembryonic antigen; CI, Confidence interval; Del-19, Exon-19 deletion; EGFR, Epidermal growth factor receptor; L858R, Exon-21 L858R point mutation; NA, not applicable; OR, Odds ratio; SD, Standard deviation; vs., versus.
Figure 3. Receiver operating characteristic (ROC) curves of the three models used to predict the mutation status of EGFR molecular subtypes. (A, B) Del-19 mutation vs. wild-type. (C, D) L858R mutation vs. wild-type. (E, F) Del-19 mutation vs. L858R mutation. (A, C, E) Training set. (B, D, F) Validation set.
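For readers unfamiliar with how the area under a ROC curve is obtained, the following minimal sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (the Mann-Whitney formulation). The function name `roc_auc` and the toy labels and scores are assumptions for illustration; they are not the study's data or software.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. mutation), 0 for the negative class.
    scores: model outputs, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example (made-up values): 8 of the 9 positive/negative pairs are ranked correctly.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 0.889
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-based version would be preferable for large datasets.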
The prediction performance of different models in the training and validation sets.
| Models | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|
| Del-19 mutation vs. wild-type, training set | | | | | | |
| Clinical model | 0.719 | 0.706 | 0.721 | 0.697 | 0.603 | 0.796 |
| Radiomics model | 0.807 | 0.747 | 0.708 | 0.772 | 0.665 | 0.805 |
| Combined model | 0.838 | 0.775 | 0.682 | 0.834 | 0.724 | 0.804 |
| Del-19 mutation vs. wild-type, validation set | | | | | | |
| Clinical model | 0.693 | 0.667 | 0.648 | 0.679 | 0.565 | 0.750 |
| Radiomics model | 0.779 | 0.732 | 0.685 | 0.762 | 0.649 | 0.790 |
| Combined model | 0.813 | 0.768 | 0.648 | 0.845 | 0.729 | 0.789 |
| L858R mutation vs. wild-type, training set | | | | | | |
| Clinical model | 0.701 | 0.679 | 0.634 | 0.705 | 0.564 | 0.762 |
| Radiomics model | 0.825 | 0.764 | 0.772 | 0.759 | 0.659 | 0.847 |
| Combined model | 0.855 | 0.756 | 0.890 | 0.676 | 0.623 | 0.911 |
| L858R mutation vs. wild-type, validation set | | | | | | |
| Clinical model | 0.697 | 0.672 | 0.660 | 0.679 | 0.550 | 0.770 |
| Radiomics model | 0.812 | 0.746 | 0.760 | 0.738 | 0.633 | 0.838 |
| Combined model | 0.852 | 0.739 | 0.920 | 0.631 | 0.597 | 0.930 |
| Del-19 mutation vs. L858R mutation, training set | | | | | | |
| Logistic model | 0.581 | 0.587 | 0.789 | 0.395 | 0.554 | 0.662 |
| RF model | 0.881 | 0.786 | 0.715 | 0.853 | 0.822 | 0.759 |
| SVM model | 0.601 | 0.591 | 0.805 | 0.388 | 0.556 | 0.676 |
| Clinical model | 0.660 | 0.599 | 0.797 | 0.411 | 0.563 | 0.679 |
| Combined model† | 0.906 | 0.833 | 0.821 | 0.845 | 0.835 | 0.832 |
| Del-19 mutation vs. L858R mutation, validation set | | | | | | |
| Logistic model | 0.673 | 0.679 | 0.827 | 0.537 | 0.632 | 0.763 |
| RF model | 0.871 | 0.849 | 0.788 | 0.907 | 0.891 | 0.817 |
| SVM model | 0.652 | 0.651 | 0.808 | 0.500 | 0.609 | 0.730 |
| Clinical model | 0.514 | 0.538 | 0.692 | 0.389 | 0.522 | 0.568 |
| Combined model† | 0.875 | 0.830 | 0.885 | 0.778 | 0.793 | 0.875 |
AUC, Area under the curve; Del-19, Exon-19 deletion; L858R, Exon-21 L858R point mutation; NPV, Negative predictive value; PPV, Positive predictive value; RF, Random forest; SVM, Support vector machine; vs., versus.
†Combined model: RF model combined with the clinical model.
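The accuracy, PPV, and NPV columns are linked to sensitivity and specificity through the class sizes. The sketch below shows that relationship, assuming that the first clinical-model row (AUC 0.719) belongs to the Del-19 vs. wild-type training comparison with 154 mutation and 241 wild-type cases. That assignment is an inference from the cohort tables, and `derived_metrics` is a hypothetical helper rather than the authors' code.

```python
def derived_metrics(sensitivity, specificity, n_positive, n_negative):
    """Recover accuracy, PPV and NPV from sensitivity, specificity and class sizes.

    Cell counts are kept as (possibly fractional) expected values because the
    reported sensitivity and specificity are rounded to three decimals.
    """
    tp = sensitivity * n_positive   # true positives
    fn = n_positive - tp            # false negatives
    tn = specificity * n_negative   # true negatives
    fp = n_negative - tn            # false positives
    return {
        "accuracy": (tp + tn) / (n_positive + n_negative),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Assumed inputs: sensitivity 0.721 and specificity 0.697 from the first table row,
# 154 Del-19 and 241 wild-type cases from the training cohort.
metrics = derived_metrics(0.721, 0.697, 154, 241)
print({name: round(value, 3) for name, value in metrics.items()})
```

Under these assumptions the output agrees with the reported 0.706, 0.603, and 0.796 to within rounding, which also illustrates why PPV and NPV shift when class proportions differ between cohorts even if sensitivity and specificity stay fixed.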
Figure 4. Nomogram used to distinguish Del-19 mutation from wild-type. (A) Nomogram constructed in the training set based on the combined model. (B, C) Calibration curves of the combined model in the training (B) and validation (C) sets. The x-axis represents the risk of Del-19 mutation predicted by the combined model. The y-axis represents the actual Del-19 mutation rate. The green, red, and blue lines represent the clinical, radiomics, and combined models, respectively, while the gray diagonal line represents an ideal model. The closer a curve fits the diagonal line, the better the agreement between predicted and actual rates. (D, E) Decision curve analysis for the combined model in the training (D) and validation (E) sets. The x-axis shows the threshold probability, and the y-axis measures the net benefit. The gray line represents the assumption that all patients have the Del-19 mutation, and the black line represents the assumption that no patients have it. The green, red, and blue lines represent the clinical, radiomics, and combined models, respectively.
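The net benefit plotted on the y-axis of a decision curve can be sketched as follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The function names and the toy data are assumptions for illustration only, and the source does not state which software produced the curves.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of acting on every case whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference curve that assumes every patient is positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy example (made-up values). The "treat none" reference is 0 at every threshold.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
nb_model = net_benefit(toy_labels, toy_probs, threshold=0.5)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.5)
print(nb_model, nb_all)  # 0.25 0.0
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which is how curves like those in panels (D, E) are usually read.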
Figure 5. Nomogram used to distinguish L858R mutation from wild-type. (A) Nomogram constructed in the training set based on the combined model. (B, C) Calibration curves of the combined model in the training (B) and validation (C) sets. (D, E) Decision curve analysis for the combined model in the training (D) and validation (E) sets.
Figure 6. Precision-recall (PR) curves of the different models in the training (A) and validation (B) sets. The PR curve shows the relationship between precision and recall. The larger the area under the PR curve, the better the model performance.
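To make the PR curve concrete, the sketch below builds its points by lowering the decision threshold one case at a time and summarises the area with average precision (a step-wise sum of precision over recall increments). The function names and toy data are assumptions for illustration, the scores are assumed to be distinct, and the source does not say how the area under the PR curve was computed.

```python
def pr_points(labels, scores):
    """Return (recall, precision) pairs, one per case, from highest score down.

    Assumes distinct scores; tied scores would need to be grouped together.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("at least one positive case is required")
    points = []
    true_positives = 0
    for rank, index in enumerate(order, start=1):
        true_positives += labels[index]
        points.append((true_positives / n_positive, true_positives / rank))
    return points

def average_precision(labels, scores):
    """Step-wise area under the PR curve: sum of precision times recall gain."""
    area = 0.0
    previous_recall = 0.0
    for recall, precision in pr_points(labels, scores):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area

# Toy example (made-up values).
toy_labels = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
toy_ap = average_precision(toy_labels, toy_scores)
print(round(toy_ap, 3))  # 0.806
```

Unlike the ROC curve, the PR curve depends on class prevalence, so areas are comparable between models on the same dataset but not directly between cohorts with different mutation rates.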