Wenting An, Wei Fan, Feiyang Zhong, Binchen Wang, Shan Wang, Tian Gan, Sufang Tian, Meiyan Liao.
Abstract
Purpose: We aimed to determine the epidermal growth factor receptor (EGFR) genetic profile of lung cancer in Asians, and to develop and validate a non-invasive prediction scoring system for EGFR mutation before treatment.

Methods: This was a single-center retrospective cohort study using data from patients with lung cancer who underwent EGFR detection (n = 1450) from December 2014 to October 2020. Independent predictors were selected using univariate and multivariate logistic regression analyses. A prediction scoring system for EGFR mutation was then constructed according to the weight of each factor. The model was internally validated using bootstrapping and temporally validated using prospectively collected data (n = 210) from November 2020 to June 2021.

Results: Among the 1450 patients with lung cancer, 723 single mutations and 51 compound mutations were observed in EGFR. Thirty-nine cases had two or more synchronous gene mutations. We developed a scoring system from the independent clinical predictors and stratified patients into risk groups according to their scores: low-risk (score <4), moderate-risk (score 4-8), and high-risk (score >8). The C-statistic of the scoring system was 0.754 (95% CI 0.729-0.778). When the factors from the validation group were entered into the prediction model to test its predictive power, the C-statistic was 0.710 (95% CI 0.638-0.782). The Hosmer-Lemeshow goodness-of-fit test gave χ2 = 6.733, P = 0.566.

Conclusions: The scoring system constructed in our study may serve as a non-invasive tool for an initial prediction of EGFR mutation status in patients for whom gene detection is not available in clinical practice.
Keywords: epidermal growth factor receptor; lung cancer; predictive model; scoring system
Year: 2022 PMID: 35234540 PMCID: PMC8894628 DOI: 10.1177/15330338221078732
Source DB: PubMed Journal: Technol Cancer Res Treat ISSN: 1533-0338
Figure 1. Flow diagram of patient selection for the development and temporal validation cohorts.
Figure 2. Pie charts showing the distribution of EGFR mutations in the study cohort. (A) Single mutation, (B) compound mutation.
Figure 3. Co-mutated genes and their counts observed in the study cohort.
Univariate and multivariate analysis of clinicopathological characteristics with EGFR mutational status from the development cohort.
| Characteristics | n | EGFR (+) | EGFR (-) | Univariate P value | Multivariate OR [95% CI]* | Multivariate P value* |
|---|---|---|---|---|---|---|
| Age (y) | | | | | | 0.378 |
| Mean ± SD | 61.9 ± 9.8 | 61.3 ± 9.8 | 62.6 ± 9.7 | | | |
| <62 | 657 | 376 | 281 | | Reference | |
| ≥62 | 793 | 398 | 395 | | 0.892 [0.693, 1.150] | |
| Sex | | | | | | |
| Male | 876 | 358 | 518 | | Reference | |
| Female | 574 | 416 | 158 | | 1.853 [1.352, 2.541] | |
| Smoking status | | | | | | |
| Current/Former | 771 | 233 | 538 | | Reference | |
| Never | 645 | 529 | 116 | | 1.932 [1.367, 2.729] | |
| Unknown | 34 | 12 | 22 | | | |
| Smoking index | | | | | | |
| Mean ± SD | 774.6 ± 522.3 | 664.8 ± 491.3 | 836.6 ± 529.7 | | | |
| ≥770 | 300 | 215 | 85 | | Reference | |
| <770 | 1116 | 677 | 439 | | 1.646 [1.133, 2.390] | |
| Unknown | 34 | 12 | 22 | | | |
| Family history of malignant tumors | | | | | | |
| Yes | 117 | 74 | 43 | | 1.701 [1.053, 2.748] | |
| No | 1299 | 488 | 811 | | Reference | |
| Unknown | 34 | 12 | 22 | | | |
| History of other malignant tumors | | | | | | |
| Yes | 70 | 25 | 45 | | Reference | |
| No | 1346 | 737 | 609 | | 2.249 [1.254, 4.032] | |
| Unknown | 34 | 12 | 22 | | | |
| Tumor location | | | | 0.239 | | |
| Left | 603 | 329 | 274 | | | |
| Right | 798 | 402 | 396 | | | |
| Bilateral | 16 | 7 | 9 | | | |
| Unknown | 33 | 12 | 21 | | | |
| CT imaging manifestation | | | | | | |
| Solid | 1355 | 711 | 644 | | Reference | |
| GGO (Pure/Mix) | 58 | 49 | 9 | | 3.515 [1.463, 8.447] | |
| Unknown | 37 | 14 | 23 | | | |
| Gross type | | | | | | |
| Central type | 208 | 87 | 121 | | Reference | |
| Peripheral type | 1205 | 673 | 532 | | 1.535 [1.056, 2.232] | |
| Unknown | 37 | 14 | 23 | | | |
| T stage | | | | | | |
| T1 | 277 | 177 | 100 | | Reference | |
| T2 | 280 | 138 | 142 | | | |
| T3 | 143 | 65 | 78 | | 1.057 [0.760, 1.471] | |
| T4 | 237 | 114 | 123 | | | |
| Unknown | 513 | 280 | 233 | | | |
| N stage | | | | 0.191 | | |
| N0 | 388 | 213 | 175 | | | |
| N1 | 90 | 55 | 35 | | | |
| N2 | 344 | 176 | 168 | | | |
| N3 | 165 | 80 | 85 | | | |
| Unknown | 463 | 250 | 213 | | | |
| M stage | | | | 0.328 | | |
| M0 | 537 | 276 | 261 | | | |
| M1a | 91 | 48 | 43 | | | |
| M1b | 288 | 160 | 128 | | | |
| M1c | 207 | 121 | 86 | | | |
| Unknown | 327 | 169 | 158 | | | |
| Clinical stage | | | | | | |
| I A | 155 | 100 | 55 | | | |
| I B | 52 | 24 | 28 | | | |
| II A | 25 | 8 | 17 | 0.142 | | |
| II B | 80 | 39 | 41 | | | |
| III A | 124 | 65 | 59 | 0.094 | | |
| III B | 81 | 33 | 48 | | | |
| III C | 22 | 7 | 15 | | | |
| IV A | 390 | 215 | 175 | 0.436 | | |
| IV B | 207 | 121 | 86 | | | |
| I-III | 814 | 423 | 391 | 0.108 | | |
| IV | 597 | 336 | 261 | | | |
| Unknown | 39 | 25 | 14 | | | |
| CEA | | | | | | |
| Positive | 578 | 332 | 246 | | 1.627 [1.195, 2.217] | |
| Negative | 659 | 331 | 328 | | Reference | |
| Unknown | 213 | 111 | 102 | | | |
| AFP | | | | 0.724 | | |
| Positive | 23 | 11 | 12 | | | |
| Negative | 803 | 414 | 389 | | | |
| Unknown | 624 | 349 | 275 | | | |
| FERR | | | | 0.387 | | |
| Positive | 68 | 29 | 39 | | | |
| Negative | 147 | 72 | 75 | | | |
| Unknown | 1235 | 673 | 562 | | | |
| CA125 | | | | | | 0.624 |
| Positive | 455 | 226 | 229 | | 1.085 [0.783, 1.503] | |
| Negative | 729 | 409 | 320 | | Reference | |
| Unknown | 266 | 139 | 127 | | | |
| CA15-3 | | | | 0.074 | | |
| Positive | 86 | 57 | 29 | | | |
| Negative | 382 | 213 | 169 | | | |
| Unknown | 982 | 504 | 478 | | | |
| CA19-9 | | | | | | |
| Positive | 152 | 66 | 86 | | Reference | |
| Negative | 687 | 365 | 322 | | 1.646 [1.052, 2.577] | |
| Unknown | 611 | 343 | 268 | | | |
| SCCA | | | | | | |
| Positive | 78 | 19 | 59 | | Reference | |
| Negative | 273 | 157 | 116 | | 2.572 [1.311, 5.046] | |
| Unknown | 1099 | 598 | 501 | | | |
| CYFRA 21-1 | | | | | | 0.881 |
| Positive | 206 | 92 | 114 | | 0.958 [0.547, 1.677] | |
| Negative | 130 | 79 | 51 | | Reference | |
| Unknown | 1114 | 603 | 511 | | | |
| CA72-4 | | | | 0.163 | | |
| Positive | 45 | 17 | 28 | | | |
| Negative | 145 | 72 | 73 | | | |
| Unknown | 1260 | 685 | 575 | | | |
| NSE | | | | 0.252 | | |
| Positive | 237 | 121 | 116 | | | |
| Negative | 869 | 480 | 389 | | | |
| Unknown | 344 | 173 | 171 | | | |
| Adenocarcinoma component | | | | | | |
| With | 1313 | 758 | 555 | | Reference | |
| Without | 117 | 15 | 102 | | 3.025 [1.539, 5.945] | |
| NOS# | 9 | 0 | 9 | | | |
| Unknown | 11 | 1 | 10 | | | |
| Predominant component in adenocarcinoma | | | | | | |
| Micropapillary/Solid | 78 | 25 | 53 | | Reference | |
| Acinar/Papillary | 211 | 141 | 70 | | 2.962 [1.587, 5.528] | |
| AIS/Lepidic | 73 | 53 | 20 | | 2.983 [1.316, 6.766] | |
| Minimally invasive | 2 | 0 | 2 | | | |
| Unknown | 909 | 521 | 388 | | | |
| Adenocarcinoma | 1273 | 740 | 533 | 0.098 | | |
| Adenosquamous carcinoma | 40 | 18 | 22 | | | |
| Differentiation grade | | | | | | |
| Poor | 271 | 86 | 185 | | Reference | |
| Moderate | 241 | 158 | 83 | | 2.566 [1.657, 3.973] | |
| Well | 70 | 50 | 20 | | 3.131 [1.590, 6.165] | |
| Unknown | 868 | 480 | 388 | | | |
| Mucus component | | | | | | 0.073 |
| Yes | 36 | 11 | 25 | | 0.457 [0.194, 1.075] | |
| No | 1414 | 763 | 651 | | Reference | |
| TTF-1 | | | | | | |
| Positive | 866 | 483 | 383 | | 2.853 [1.480, 5.500] | |
| Negative | 138 | 17 | 121 | | Reference | |
| Unknown | 446 | 274 | 172 | | | |
| Napsin A | | | | | | |
| Positive | 648 | 374 | 274 | | 3.003 [1.775, 5.083] | |
| Negative | 193 | 29 | 164 | | Reference | |
| Unknown | 609 | 371 | 238 | | | |
| P63 | | | | 0.974 | | |
| Positive | 210 | 61 | 149 | | | |
| Negative | 294 | 85 | 209 | | | |
| Unknown | 946 | 628 | 318 | | | |
| P40 | | | | | | 0.441 |
| Positive | 67 | 21 | 46 | | 1.340 [0.637, 2.822] | |
| Negative | 360 | 175 | 185 | | Reference | |
| Unknown | 1023 | 578 | 445 | | | |
| CK-7 | | | | | | 0.280 |
| Positive | 687 | 343 | 344 | | 0.549 [0.185, 1.629] | |
| Negative | 42 | 8 | 34 | | Reference | |
| Unknown | 721 | 423 | 298 | | | |
| Ki67 (%) | | | | | | 0.052 |
| Mean ± SD | 36.4 ± 21.6 | 32.9 ± 20.9 | 40.0 ± 21.8 | | | |
| <36 | 461 | 264 | 197 | | Reference | |
| ≥36 | 453 | 205 | 248 | | 1.396 [0.996, 1.956] | |
| Unknown | 536 | 305 | 231 | | | |
| Specimen type | | | | 0.140 | | |
| Biopsy | 811 | 419 | 392 | | | |
| Surgical resection | 639 | 355 | 284 | | | |
| Technology | | | | 0.430 | | |
| ARMS | 1167 | 617 | 550 | | | |
| NGS | 283 | 157 | 126 | | | |
Abbreviations: AIS: adenocarcinoma in situ, AFP: alpha fetoprotein, ARMS: amplification refractory mutation system, CA: carbohydrate antigen, CEA: carcinoembryonic antigen, CI: confidence interval, CK: cytokeratin, CT: computerized tomography, EGFR: epidermal growth factor receptor, FERR: ferritin, GGO: ground-glass opacity, NGS: next-generation sequencing, NSE: neuron-specific enolase, OR: odds ratio, SCCA: squamous cell carcinoma antigen, SD: standard deviation, CYFRA21-1: soluble fragment of cytokeratin 19, TTF-1: thyroid transcription factor-1.
*Items were included in the multivariate analysis only when the P value was <0.05 in the univariate analysis.
#NOS (not otherwise specified) indicates pathologically confirmed NSCLC whose pathologic type was not clearly identified.
Figure 4. Receiver operating characteristic curves of models 1 and 2 in the development cohort (A,C) and temporal validation cohort (B,D).
Figure 5. Calibration plot comparing the actual and predicted probabilities of the EGFR mutation. (A) Model 1; (B) model 2.
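The abstract reports the calibration check as a Hosmer-Lemeshow statistic of χ2 = 6.733 with P = 0.566. As a rough illustration of how such a P value follows from the statistic, the sketch below evaluates the upper tail of a chi-square distribution. The degrees of freedom are not given in the source; the value 8 (ten risk groups minus two, the usual default) is an assumption, and the function name is illustrative only.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    # For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series

hl_chi2 = 6.733   # Hosmer-Lemeshow statistic reported in the abstract
assumed_df = 8    # ASSUMPTION: 10 groups - 2; the source does not state the df
p_value = chi2_sf_even_df(hl_chi2, assumed_df)
print(f"Hosmer-Lemeshow P = {p_value:.3f}")
```

Under that assumption the result rounds to the reported 0.566. A P value well above 0.05 means the test finds no evidence of a difference between predicted and observed mutation rates.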
Comparison of prediction accuracy between two models.
| Model | Successful prediction | Failure prediction | P value |
|---|---|---|---|
| Model 1 | 144 | 66 | 0.455 |
| Model 2 | 151 | 59 | |
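The table does not name the test behind P = 0.455. As a quick check, the sketch below computes each model's accuracy in the 210 validation patients and runs a Pearson chi-square test without continuity correction on the 2 × 2 counts. That the authors used this particular test is an assumption; the function name is illustrative.

```python
import math

def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Counts from the table: (successful, failed) predictions per model.
model_1 = (144, 66)
model_2 = (151, 59)

acc_1 = model_1[0] / sum(model_1)
acc_2 = model_2[0] / sum(model_2)
chi2, p = pearson_chi2_2x2(*model_1, *model_2)

print(f"Model 1 accuracy: {acc_1:.1%}, Model 2 accuracy: {acc_2:.1%}")
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

The uncorrected test gives a P value that rounds to the tabulated 0.455, so the roughly three-point difference in accuracy between the models is not statistically significant.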
Multivariate analysis for the independent clinical predictors in EGFR mutation and corresponding points.
| Categories | β | S.E. | Wald | P value | OR [95% CI] | Points |
|---|---|---|---|---|---|---|
| Intercept | −4.208 | 0.541 | 60.496 | <0.001 | | |
| Sex | | | | | | |
| Male | | | | | Reference | 0 |
| Female | 0.802 | 0.149 | 29.150 | <0.001 | 2.230 [1.667, 2.984] | 1 |
| Smoking status | | | | | | |
| Current/Former | | | | | Reference | 0 |
| Never | 0.625 | 0.162 | 14.807 | <0.001 | 1.869 [1.359, 2.570] | 1 |
| Smoking index | | | | | | |
| ≥770 | | | | | Reference | 0 |
| <770 | 0.591 | 0.177 | 11.155 | 0.001 | 1.806 [1.277, 2.555] | 1 |
| Family history of malignant tumors | | | | | | |
| No | | | | | Reference | 0 |
| Yes | 0.592 | 0.226 | 6.877 | 0.009 | 1.808 [1.161, 2.815] | 1 |
| History of other malignant tumors | | | | | | |
| Yes | | | | | Reference | 0 |
| No | 0.973 | 0.279 | 12.213 | <0.001 | 2.647 [1.533, 4.570] | 2 |
| CT imaging manifestation | | | | | | |
| Solid | | | | | Reference | 0 |
| GGO (Pure/Mix) | 1.314 | 0.392 | 11.687 | 0.001 | 3.822 [1.772, 8.243] | 2 |
| Gross type | | | | | | |
| Central type | | | | | Reference | 0 |
| Peripheral type | 0.555 | 0.173 | 10.282 | 0.001 | 1.743 [1.241, 2.447] | 1 |
| CEA | | | | | | |
| Negative | | | | | Reference | 0 |
| Positive | 0.659 | 0.144 | 21.048 | <0.001 | 1.932 [1.458, 2.561] | 1 |
| CA19-9 | | | | | | |
| Positive | | | | | Reference | 0 |
| Negative | 0.567 | 0.212 | 7.149 | 0.008 | 1.763 [1.163, 2.671] | 1 |
| SCCA | | | | | | |
| Positive | | | | | Reference | 0 |
| Negative | 1.180 | 0.316 | 13.903 | <0.001 | 3.253 [1.750, 6.049] | 2 |
| Total | | | | | | 13 |
Abbreviations: CA: carbohydrate antigen, CEA: carcinoembryonic antigen, CI: confidence interval, CT: computerized tomography, EGFR: epidermal growth factor receptor, GGO: ground-glass opacity, OR: odds ratio, SCCA: squamous cell carcinoma antigen.
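The columns of this table are linked by standard logistic regression identities: OR = exp(β), and the 95% CI is exp(β ± 1.96 × S.E.). The sketch below recomputes those quantities from the published β and S.E. values. It also tries one common convention for turning coefficients into points, dividing each β by the smallest β and rounding. That convention is an assumption: the source says only that points follow "the weight of each factor". The dictionary keys are illustrative names.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

# name: (beta, standard error, published OR, published points), from the table
COEFFICIENTS = {
    "female": (0.802, 0.149, 2.230, 1),
    "never_smoker": (0.625, 0.162, 1.869, 1),
    "smoking_index_lt_770": (0.591, 0.177, 1.806, 1),
    "family_history": (0.592, 0.226, 1.808, 1),
    "no_other_malignancy": (0.973, 0.279, 2.647, 2),
    "ggo": (1.314, 0.392, 3.822, 2),
    "peripheral": (0.555, 0.173, 1.743, 1),
    "cea_positive": (0.659, 0.144, 1.932, 1),
    "ca19_9_negative": (0.567, 0.212, 1.763, 1),
    "scca_negative": (1.180, 0.316, 3.253, 2),
}

def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def points_from_beta(beta: float, smallest_beta: float) -> int:
    # ASSUMPTION: points = beta / smallest beta, rounded to the nearest integer.
    return round(beta / smallest_beta)

smallest = min(beta for beta, _, _, _ in COEFFICIENTS.values())
mismatched_or = []      # rows where exp(beta) differs from the published OR
mismatched_points = []  # rows where the assumed rule differs from the table
for name, (beta, se, published_or, published_points) in COEFFICIENTS.items():
    or_, lower, upper = odds_ratio_with_ci(beta, se)
    print(f"{name}: OR {or_:.3f} [{lower:.3f}, {upper:.3f}]")
    if abs(or_ - published_or) > 0.01:
        mismatched_or.append(name)
    if points_from_beta(beta, smallest) != published_points:
        mismatched_points.append(name)

print("OR mismatches:", mismatched_or)
print("Points mismatches:", mismatched_points)
```

Recomputed values agree with the table to within rounding for every row except GGO, where exp(1.314) is about 3.72 while the table lists 3.822. The published OR and CI for that row are consistent with each other, so the listed β may carry a transcription error; the table is left as published. The assumed rounding rule reproduces all ten point values.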
Scoring system table.
| Risk score | n | EGFR (+), n (%) | OR [95% CI] | P value |
|---|---|---|---|---|
| Development Cohort | | | | |
| Range (0, 12) | | | | |
| Score <4 | 143 | 26 (18.2) | Reference | - |
| 4 ≤ Score ≤ 8 | 1108 | 595 (53.7) | 5.22 [3.36, 8.11] | <0.001 |
| Score >8 | 188 | 153 (81.4) | 19.67 [11.22, 34.50] | <0.001 |
| Temporal Validation Cohort | | | | |
| Range (1, 13) | | | | |
| Score <4 | 25 | 7 (28.0) | Reference | - |
| 4 ≤ Score ≤ 8 | 141 | 60 (42.6) | 1.91 [0.75, 4.85] | 0.172 |
| Score >8 | 44 | 30 (68.2) | 8.08 [2.80, 23.33] | <0.001 |
Abbreviations: CI: confidence interval, EGFR: epidermal growth factor receptor, OR: odds ratio.
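To make the scoring system concrete, the sketch below adds up the points from the multivariate table and applies the published cut-offs (low-risk <4, moderate-risk 4-8, high-risk >8). The field names and category labels are hypothetical encodings chosen for this example and are not part of the source. The sketch assumes every predictor is known for the patient.

```python
# Points per predictor level, taken from the multivariate table.
# Keys and level labels are hypothetical encodings for this sketch.
POINTS = {
    "sex": {"male": 0, "female": 1},
    "smoking_status": {"current_or_former": 0, "never": 1},
    "smoking_index": {">=770": 0, "<770": 1},
    "family_history_malignancy": {"no": 0, "yes": 1},
    "history_other_malignancy": {"yes": 0, "no": 2},
    "ct_manifestation": {"solid": 0, "ggo": 2},
    "gross_type": {"central": 0, "peripheral": 1},
    "cea": {"negative": 0, "positive": 1},
    "ca19_9": {"positive": 0, "negative": 1},
    "scca": {"positive": 0, "negative": 2},
}

def egfr_score(patient: dict) -> int:
    """Sum the points over all ten predictors (raises KeyError if one is missing)."""
    return sum(levels[patient[name]] for name, levels in POINTS.items())

def risk_group(score: int) -> str:
    """Map a total score to the risk groups defined in the abstract."""
    if score < 4:
        return "low"
    if score <= 8:
        return "moderate"
    return "high"

example_patient = {
    "sex": "female",
    "smoking_status": "never",
    "smoking_index": "<770",
    "family_history_malignancy": "no",
    "history_other_malignancy": "no",
    "ct_manifestation": "ggo",
    "gross_type": "peripheral",
    "cea": "positive",
    "ca19_9": "negative",
    "scca": "negative",
}

score = egfr_score(example_patient)
print(score, risk_group(score))
```

This example patient scores 12 and falls in the high-risk group, where 81.4% of the development cohort and 68.2% of the temporal validation cohort carried an EGFR mutation.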
Figure 6. Histogram of the risk of EGFR mutation. According to the scoring system, a histogram of the proportions of the low-, moderate-, and high-risk populations was drawn.