Development and Validation of a Concise Prediction Scoring System for Asian Lung Cancer Patients with EGFR Mutation Before Treatment.

Wenting An1, Wei Fan1, Feiyang Zhong1, Binchen Wang1, Shan Wang1, Tian Gan1, Sufang Tian1, Meiyan Liao1.   

Abstract

Purpose: We aimed to determine the epidermal growth factor receptor (EGFR) genetic profile of lung cancer in Asians, and to develop and validate a non-invasive prediction scoring system for EGFR mutation before treatment. Methods: This was a single-center retrospective cohort study using data from patients with lung cancer who underwent EGFR detection (n = 1450) from December 2014 to October 2020. Independent predictors were identified using univariate and multivariate logistic regression analyses. A prediction scoring system for EGFR mutation was constructed according to the weight of each factor. The model was internally validated using bootstrapping techniques and temporally validated using prospectively collected data (n = 210) from November 2020 to June 2021. Results: In 1450 patients with lung cancer, 723 single mutations and 51 compound mutations were observed in EGFR. Thirty-nine cases had two or more synchronous gene mutations. We developed a scoring system according to the independent clinical predictors and stratified patients into risk groups according to their scores: low-risk (score <4), moderate-risk (score 4-8), and high-risk (score >8) groups. The C-statistic of the scoring system was 0.754 (95% CI 0.729-0.778). When the model was applied to the validation cohort to test its predictive power, the C-statistic was 0.710 (95% CI 0.638-0.782). The Hosmer-Lemeshow goodness-of-fit test gave χ2 = 6.733, P = 0.566. Conclusions: The scoring system constructed in our study may serve as a non-invasive tool for the initial prediction of EGFR mutation status in patients for whom gene detection is not available in clinical practice.

Keywords:  epidermal growth factor receptor; lung cancer; predictive model; scoring system

Year:  2022        PMID: 35234540      PMCID: PMC8894628          DOI: 10.1177/15330338221078732

Source DB:  PubMed          Journal:  Technol Cancer Res Treat        ISSN: 1533-0338


Introduction

Lung cancer is the leading cause of cancer-related deaths worldwide, accounting for 22%–23% of all cancer-related deaths. During recent years, with the rapid development of precision medicine and tumor molecular biology, targeted therapy has become an important treatment for lung cancer following the traditional approaches of surgery, radiotherapy, and chemotherapy. Considerably high positive response rate and safety have made individualized and refined treatment possible. Epidermal growth factor receptor (EGFR)-tyrosine kinase inhibitors (TKIs) such as gefitinib, osimertinib, and erlotinib have increased the overall survival of patients expressing EGFR. EGFR mutation is the most common genetic mutation related to lung cancer, with a mutational frequency of approximately 46%–58% in China.[3-5] Currently, the sources of tumor material for gene detection include tumor tissue specimens, cytology specimens, and serum specimens. Paraffin-embedded tumor tissue specimens have conventionally been the main source and still account for most diagnostic samples in clinical practice. Cytology specimens have been shown to be an adequate alternative source when tissue samples are not available or contain insufficient amounts of tumor DNA, and their use has increased over recent years. Serum specimens, known as liquid biopsies, are not popular in clinical practice owing to the lack of standardized techniques, limited coverage of hotspot mutations, lack of sensitivity, and insufficient clinical validation. For patients with advanced lung cancer, biopsy specimens are the only available histological evidence. However, tumor heterogeneity is a major challenge, as gene information from local tumor tissue might not accurately reflect the whole genetic profile. In addition, these specimens may be difficult to obtain from some patients with advanced age, poor pulmonary function, or poor coagulation. 
Furthermore, in some remote areas of Asia, some patients cannot afford the cost of hospitalization, and primary hospitals lack gene detection technology. Thus, a non-invasive and convenient method is needed to initially predict EGFR mutation status and evaluate clinical treatment efficacy. To date, various small-scale studies have examined the relationship between certain clinicopathological characteristics and EGFR mutation.[8,9] A few studies have also established prediction scoring systems for EGFR mutation, but their credibility is limited by small sample sizes or a lack of validation. In this study, we summarized the status of EGFR mutations, including the mutation subtypes and the co-mutations observed in several cases, systematically identified the risk factors for EGFR mutation, and developed a prediction scoring system that is concise and readily adoptable at most institutions, to aid patients for whom gene detection is unavailable. Furthermore, we validated our model using prospectively collected data to examine its generalizability and reliability.

Methods

Data Collection and Definitions

This was a single-center retrospective cohort study. We performed the study in accordance with the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) reporting checklist. The development cohort (n = 1450) was retrospectively compiled from patients with lung cancer admitted to our institution between December 2014 and October 2020. Temporal validation samples (n = 210) were prospectively collected at the same institution as the development cohort but at a later time point, between November 2020 and June 2021. Patients with pathologically diagnosed lung cancer who underwent EGFR detection in our institution were included. Patients were excluded if 1) their medical records could not be obtained or 2) they had received any therapy before EGFR detection. All patient details were de-identified. We collected clinicopathological data from the electronic medical records of patients, including sex, age, smoking status, smoking index, family history of malignant tumors, history of other malignant tumors, tumor location, computerized tomography (CT) imaging manifestation, gross type, TNM and clinical stage, serum tumor markers (carcinoembryonic antigen (CEA), alpha fetoprotein (AFP), ferritin (FERR), carbohydrate antigen (CA) 125, CA 15–3, CA 19–9, squamous cell carcinoma antigen (SCCA), soluble fragment of cytokeratin 19 (CYFRA21-1), CA 72–4, and neuron-specific enolase (NSE)), histologic subtype, differentiation grade, mucus component, and some immunohistochemical results (thyroid transcription factor-1 (TTF-1), napsin A, P63, P40, CK-7, and Ki67). The smoking index (Brinkman index) was defined as the number of cigarettes smoked per day multiplied by the number of smoking years.
Tumors were staged according to the eighth edition of the TNM staging classification of the International Union against Cancer and the American Joint Committee on Cancer (UICC/AJCC) for lung cancer. Ground-glass opacity (GGO) is defined as hazy opacity on high resolution computed tomography (HRCT), through which pulmonary vessels or bronchial structures can be visualized. The upper limit of each tumor marker was as follows: CEA, 5 ng/mL; AFP, 8.78 ng/mL; FERR, 130 ng/mL (male); FERR, 55 ng/mL (female); CA125, 35 U/mL; CA15-3, 31.3 U/mL; CA19-9, 37 U/mL; SCCA, 1.5 ng/mL; CYFRA 21-1, 3.3 ng/mL; CA72-4, 6.9 U/mL; and NSE 15.2 ng/mL. The above tumor markers were considered positive if their values were higher than the upper limit. Histologic subtypes were determined according to the diagnostic criteria of the 2015 WHO histological classification of lung cancer.
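The two definitions above (the Brinkman index and marker positivity against the stated upper limits) can be expressed as a short sketch. The function and dictionary names are hypothetical and chosen only for illustration; the thresholds are copied from the text, and a marker is treated as positive only when its value is strictly higher than the upper limit, as stated.

```python
# Upper limits quoted in the text (units as given there); names are illustrative.
UPPER_LIMITS = {
    "CEA": 5.0, "AFP": 8.78, "CA125": 35.0, "CA15-3": 31.3, "CA19-9": 37.0,
    "SCCA": 1.5, "CYFRA21-1": 3.3, "CA72-4": 6.9, "NSE": 15.2,
}
# Ferritin is the only marker with a sex-specific upper limit.
FERR_LIMITS = {"male": 130.0, "female": 55.0}


def brinkman_index(cigarettes_per_day, smoking_years):
    """Smoking index: cigarettes per day multiplied by years of smoking."""
    return cigarettes_per_day * smoking_years


def marker_positive(marker, value, sex=None):
    """Return True when the value is higher than the marker's upper limit."""
    if marker == "FERR":
        if sex not in FERR_LIMITS:
            raise ValueError("FERR requires sex to be 'male' or 'female'")
        limit = FERR_LIMITS[sex]
    else:
        limit = UPPER_LIMITS[marker]
    return value > limit


if __name__ == "__main__":
    print(brinkman_index(20, 40))                 # 20 cigarettes/day for 40 years
    print(marker_positive("CEA", 6.2))            # above 5 ng/mL
    print(marker_positive("FERR", 100.0, "female"))
```

For example, a ferritin value of 100 ng/mL counts as positive for a female patient but not for a male patient under these limits.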

EGFR Detection Methods

All tumor tissue samples were fixed in 10% neutral formalin solution and embedded in paraffin. We used the QIAamp DNA formalin-fixed paraffin-embedded (FFPE) tissue kit (Qiagen NV, Venlo, Netherlands) to extract genomic DNA and RNA from FFPE tissues, according to the manufacturer's instructions. The EGFR analysis results were obtained and analyzed by amplification refractory mutation system-polymerase chain reaction (ARMS-PCR) or using next-generation sequencing (NGS) method.

Development and Validation of the Scoring System

A scoring system was developed as follows: (a) a β-coefficient was obtained for each significant factor from the multivariate logistic regression model; (b) each coefficient was rounded to a whole number; and (c) the score of each variable was set to its corresponding rounded coefficient. The discriminatory ability of the model was evaluated using receiver operating characteristic (ROC) curve analysis and the area under the ROC curve. Goodness-of-fit was assessed using the Hosmer–Lemeshow test. The scoring system was stratified into three groups: low, moderate, and high risk. The scoring system was validated in the internal and temporal validation cohorts by assessing model discrimination and calibration. Discrimination was quantified with the C-statistic, which is equal to the area under the ROC curve. Calibration was assessed graphically from the relationship between the observed outcome frequencies and the predicted probabilities (calibration curves). Internal validation used bootstrapping: one thousand bootstrap samples were drawn with replacement from the original data set. The model was temporally validated using prospectively collected data.
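As an illustration of the discrimination measure described above, the following is a minimal sketch of a C-statistic computed as the proportion of concordant mutated/wild-type pairs, with a percentile bootstrap interval from resampling with replacement. It is an assumption-laden example rather than the authors' procedure (their analysis was run in R and SPSS); the function names, the percentile method, and the toy data are all hypothetical.

```python
import random


def c_statistic(outcomes, scores):
    """Area under the ROC curve as the share of concordant (1, 0) pairs; ties count 0.5."""
    pos = [s for y, s in zip(outcomes, scores) if y == 1]
    neg = [s for y, s in zip(outcomes, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def bootstrap_ci(outcomes, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile bootstrap interval for the C-statistic."""
    estimate = c_statistic(outcomes, scores)  # also validates the input
    rng = random.Random(seed)
    n = len(outcomes)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(c_statistic(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return estimate, lower, upper


if __name__ == "__main__":
    # Toy data: 1 = EGFR-mutated, 0 = wild type; scores are hypothetical.
    y = [0, 0, 1, 1, 0, 1, 1, 0]
    s = [2, 5, 6, 9, 3, 4, 10, 7]
    print(bootstrap_ci(y, s, n_boot=1000))
```

The pairwise definition is quadratic in the cohort size, which is adequate for a sketch; a rank-based formula would be preferable for large cohorts.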

Statistical Analysis

All data were analyzed using R 4.0.5, SPSS 22.0 (IBM Corp., Armonk, NY, USA), and Microsoft Excel 2019. Count data are expressed as the number of cases or rate (%), and differences between groups were evaluated using the chi-square test or Fisher's exact test. Statistical significance was set at P < 0.05. The relationships between clinicopathological characteristics and gene mutations were analyzed using univariate and multivariate logistic regression analyses. Parameters were included in the multivariate analysis only when the P value was <0.05 in the univariate analysis.

Results

Study Population

A total of 1450 participants with lung cancer who underwent EGFR detection from December 2014 to October 2020 were included in our study. Furthermore, 210 patients whose data were prospectively collected between November 2020 and June 2021 were included in the temporal validation cohort. A detailed flow diagram of patient selection is presented in Figure 1.
Figure 1.

Flow diagram of patient selection for the development and temporal validation cohort.


Status of EGFR Mutation

The identified mutations in EGFR in the 1450 patients were 723 single mutations and 51 compound mutations. The single mutations comprised 337 cases with 19 del, 322 cases with L858R, 17 cases with 20 ins, 12 cases with L861Q, 11 cases with G719X, 2 cases with S768I, and 22 cases with uncommon mutations (Figure 2A).
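The counts above can be tallied to check that the subtypes add up to the reported total and to express each subtype as a share of the single mutations. This is only an arithmetic sketch; the variable names are hypothetical and the counts are copied from the text.

```python
# Single-mutation counts as reported in the text.
single_mutations = {
    "19 del": 337, "L858R": 322, "20 ins": 17, "L861Q": 12,
    "G719X": 11, "S768I": 2, "uncommon": 22,
}
compound_mutations = 51  # reported number of compound mutations
cohort_size = 1450       # development cohort

total_single = sum(single_mutations.values())
# Share of each subtype among single mutations, in percent (one decimal).
shares = {name: round(100 * n / total_single, 1) for name, n in single_mutations.items()}
# Overall EGFR mutation rate in the cohort, in percent.
mutation_rate = round(100 * (total_single + compound_mutations) / cohort_size, 1)

if __name__ == "__main__":
    print(total_single, shares, mutation_rate)
```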
Figure 2.

Pie charts showing the distribution of EGFR mutations in the study cohort. (A) Single mutation, (B) compound mutation.

The compound mutations comprised 13 cases with T790M + L858R, 12 cases with T790M + 19 del, 5 cases with 19 del + L858R, 3 cases with S768I + L858R, 3 cases with G719X + S768I, 2 cases with 19 del + 20 ins, 2 cases with G719X + L861Q, 1 case with I706T + G719A, 1 case with E709A + G719A, 1 case with G724S + L858R, 1 case with G719S + S768I, 1 case with S720F + L858R, 1 case with 19 del + L861Q, 1 case with I759M + L858R, 1 case with V819A + L858R, 1 case with T790M + E746_A750del, 1 case with L861Q + L858R, and 1 case with T790M + L858R + S768I (Figure 2B). Furthermore, among the 1450 patients, 39 cases had 2 or more synchronous gene mutations. Among these, EGFR + TP53 was found to be the most frequent co-mutation, followed by EGFR + EML4-ALK, EGFR + BRAF, EGFR + ROS1, EGFR + KRAS, TP53 + EML4-ALK, EGFR + FGFR3 + TP53, KRAS + TP53, EGFR + KRAS + NRAS, EGFR + PIK3CA + FBXW7, EGFR + TP53 + PTEN, EGFR + PIK3CA, EGFR + TP53 + CTNNB1, EGFR + KRAS + TP53, EGFR + PTEN, EGFR + NRAS, EGFR + HRAS, KRAS + PTEN + TSC1, KRAS + STK11 + TP53, PIK3CA + TP53, MET + ERBB4, ERBB2 + PTEN, and ERBB2 + NRAS (Figure 3).
Figure 3.

Co-mutated genes and their count observed in the study cohort.


Relationship Between the EGFR Mutation and Clinicopathological Characteristics

The multivariate logistic regression analysis showed that sex; smoking status; smoking index; family history of malignant tumors; history of other malignant tumors; CT imaging manifestation; gross type; levels of CEA, CA19-9, and SCCA; histologic type and subtype; differentiation grade; and levels of TTF-1 and napsin A were significantly correlated with the mutations in EGFR (Table 1).
Table 1.

Univariate and multivariate analysis of clinicopathological characteristics with EGFR mutational status from the development cohort.

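Characteristics | n | EGFR (+) | EGFR (-) | Univariate P value | Multivariate* OR [95% CI] | Multivariate* P value
Age (y) | | | | 0.015 | | 0.378
Mean ± SD | 61.9 ± 9.8 | 61.3 ± 9.8 | 62.6 ± 9.7 | | |
<62 | 657 | 376 | 281 | | Reference |
≥62 | 793 | 398 | 395 | | 0.892 [0.693, 1.150] |
Sex | | | | <0.001 | | <0.001
Male | 876 | 358 | 518 | | Reference |
Female | 574 | 416 | 158 | | 1.853 [1.352, 2.541] |
Smoking status | | | | <0.001 | | <0.001
Current/Former | 771 | 233 | 538 | | Reference |
Never | 645 | 529 | 116 | | 1.932 [1.367, 2.729] |
Unknown | 34 | 12 | 22 | | |
Smoking index | | | | <0.001 | |
Mean ± SD | 774.6 ± 522.3 | 664.8 ± 491.3 | 836.6 ± 529.7 | | | 0.009
≥770 | 300 | 215 | 85 | | Reference |
<770 | 1116 | 677 | 439 | | 1.646 [1.133, 2.390] |
Unknown | 34 | 12 | 22 | | |
Family history of malignant tumors | | | | 0.001 | | 0.030
Yes | 117 | 74 | 43 | | 1.701 [1.053, 2.748] |
No | 1299 | 488 | 811 | | Reference |
Unknown | 34 | 12 | 22 | | |
History of other malignant tumors | | | | 0.002 | | 0.007
Yes | 70 | 25 | 45 | | Reference |
No | 1346 | 737 | 609 | | 2.249 [1.254, 4.032] |
Unknown | 34 | 12 | 22 | | |
Tumor location | | | | 0.239 | |
Left | 603 | 329 | 274 | | |
Right | 798 | 402 | 396 | | |
Bilateral | 16 | 7 | 9 | | |
Unknown | 33 | 12 | 21 | | |
CT imaging manifestation | | | | <0.001 | | 0.005
Solid | 1355 | 711 | 644 | | Reference |
GGO (Pure/Mix) | 58 | 49 | 9 | | 3.515 [1.463, 8.447] |
Unknown | 37 | 14 | 23 | | |
Gross type | | | | <0.001 | | 0.025
Central type | 208 | 87 | 121 | | Reference |
Peripheral type | 1205 | 673 | 532 | | 1.535 [1.056, 2.232] |
Unknown | 37 | 14 | 23 | | |
T stage | | | | <0.001 | | 0.740
T1 | 277 | 177 | 100 | | Reference |
T2 | 280 | 138 | 142 | | |
T3 | 143 | 65 | 78 | | 1.057 [0.760, 1.471] |
T4 | 237 | 114 | 123 | | |
Unknown | 513 | 280 | 233 | | |
N stage | | | | 0.191 | |
N0 | 388 | 213 | 175 | | |
N1 | 90 | 55 | 35 | | |
N2 | 344 | 176 | 168 | | |
N3 | 165 | 80 | 85 | | |
Unknown | 463 | 250 | 213 | | |
M stage | | | | 0.328 | |
M0 | 537 | 276 | 261 | | |
M1a | 91 | 48 | 43 | | |
M1b | 288 | 160 | 128 | | |
M1c | 207 | 121 | 86 | | |
Unknown | 327 | 169 | 158 | | |
Clinical stage | | | | | |
I A | 155 | 100 | 55 | 0.019 | |
I B | 52 | 24 | 28 | | |
II A | 25 | 8 | 17 | 0.142 | |
II B | 80 | 39 | 41 | | |
III A | 124 | 65 | 59 | 0.094 | |
III B | 81 | 33 | 48 | | |
III C | 22 | 7 | 15 | | |
IV A | 390 | 215 | 175 | 0.436 | |
IV B | 207 | 121 | 86 | | |
I-III | 814 | 423 | 391 | 0.108 | |
IV | 597 | 336 | 261 | | |
Unknown | 39 | 25 | 14 | | |
CEA | | | | 0.011 | | 0.002
Positive | 578 | 332 | 246 | | 1.627 [1.195, 2.217] |
Negative | 659 | 331 | 328 | | Reference |
Unknown | 213 | 111 | 102 | | |
AFP | | | | 0.724 | |
Positive | 23 | 11 | 12 | | |
Negative | 803 | 414 | 389 | | |
Unknown | 624 | 349 | 275 | | |
FERR | | | | 0.387 | |
Positive | 68 | 29 | 39 | | |
Negative | 147 | 72 | 75 | | |
Unknown | 1235 | 673 | 562 | | |
CA125 | | | | 0.031 | | 0.624
Positive | 455 | 226 | 229 | | 1.085 [0.783, 1.503] |
Negative | 729 | 409 | 320 | | Reference |
Unknown | 266 | 139 | 127 | | |
CA15-3 | | | | 0.074 | |
Positive | 86 | 57 | 29 | | |
Negative | 382 | 213 | 169 | | |
Unknown | 982 | 504 | 478 | | |
CA19-9 | | | | 0.030 | | 0.029
Positive | 152 | 66 | 86 | | Reference |
Negative | 687 | 365 | 322 | | 1.646 [1.052, 2.577] |
Unknown | 611 | 343 | 268 | | |
SCCA | | | | <0.001 | | 0.006
Positive | 78 | 19 | 59 | | Reference |
Negative | 273 | 157 | 116 | | 2.572 [1.311, 5.046] |
Unknown | 1099 | 598 | 501 | | |
CYFRA 21-1 | | | | 0.004 | | 0.881
Positive | 206 | 92 | 114 | | 0.958 [0.547, 1.677] |
Negative | 130 | 79 | 51 | | Reference |
Unknown | 1114 | 603 | 511 | | |
CA72-4 | | | | 0.163 | |
Positive | 45 | 17 | 28 | | |
Negative | 145 | 72 | 73 | | |
Unknown | 1260 | 685 | 575 | | |
NSE | | | | 0.252 | |
Positive | 237 | 121 | 116 | | |
Negative | 869 | 480 | 389 | | |
Unknown | 344 | 173 | 171 | | |
Adenocarcinoma component | | | | <0.001 | | 0.001
With | 1313 | 758 | 555 | | Reference |
Without | 117 | 15 | 102 | | 3.025 [1.539, 5.945] |
NOS# | 9 | 0 | 9 | | |
Unknown | 11 | 1 | 10 | | |
Predominant component in adenocarcinoma | | | | <0.001 | |
Micropapillary/Solid | 78 | 25 | 53 | | Reference |
Acinar/Papillary | 211 | 141 | 70 | | 2.962 [1.587, 5.528] | 0.001
AIS/Lepidic | 73 | 53 | 20 | | 2.983 [1.316, 6.766] | 0.009
Minimally invasive | 2 | 0 | 2 | | |
Unknown | 909 | 521 | 388 | | |
Adenocarcinoma | 1273 | 740 | 533 | 0.098 | |
Adenosquamous carcinoma | 40 | 18 | 22 | | |
Differentiation grade | | | | <0.001 | |
Poor | 271 | 86 | 185 | | Reference |
Moderate | 241 | 158 | 83 | | 2.566 [1.657, 3.973] | <0.001
Well | 70 | 50 | 20 | | 3.131 [1.590, 6.165] | 0.001
Unknown | 868 | 480 | 388 | | |
Mucus component | | | | 0.005 | | 0.073
Yes | 36 | 11 | 25 | | 0.457 [0.194, 1.075] |
No | 1414 | 763 | 651 | | Reference |
TTF-1 | | | | <0.001 | | 0.002
Positive | 866 | 483 | 383 | | 2.853 [1.480, 5.500] |
Negative | 138 | 17 | 121 | | Reference |
Unknown | 446 | 274 | 172 | | |
Napsin A | | | | <0.001 | | <0.001
Positive | 648 | 374 | 274 | | 3.003 [1.775, 5.083] |
Negative | 193 | 29 | 164 | | Reference |
Unknown | 609 | 371 | 238 | | |
P63 | | | | 0.974 | |
Positive | 210 | 61 | 149 | | |
Negative | 294 | 85 | 209 | | |
Unknown | 946 | 628 | 318 | | |
P40 | | | | 0.009 | | 0.441
Positive | 67 | 21 | 46 | | 1.340 [0.637, 2.822] |
Negative | 360 | 175 | 185 | | Reference |
Unknown | 1023 | 578 | 445 | | |
CK-7 | | | | <0.001 | | 0.280
Positive | 687 | 343 | 344 | | 0.549 [0.185, 1.629] |
Negative | 42 | 8 | 34 | | Reference |
Unknown | 721 | 423 | 298 | | |
Ki67 (%) | | | | <0.001 | | 0.052
Mean ± SD | 36.4 ± 21.6 | 32.9 ± 20.9 | 40.0 ± 21.8 | | |
<36 | 461 | 264 | 197 | | Reference |
≥36 | 453 | 205 | 248 | | 1.396 [0.996, 1.956] |
Unknown | 536 | 305 | 231 | | |
Specimen type | | | | 0.140 | |
Biopsy | 811 | 419 | 392 | | |
Surgical resection | 639 | 355 | 284 | | |
Technology | | | | 0.430 | |
ARMS | 1167 | 617 | 550 | | |
NGS | 283 | 157 | 126 | | |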

Abbreviations: AIS: adenocarcinoma in situ, AFP: alpha fetoprotein, ARMS: amplification refractory mutation system, CA: carbohydrate antigen, CEA: carcinoembryonic antigen, CI: confidence interval, CK: cytokeratin, CT: computerized tomography, EGFR: epidermal growth factor receptor, FERR: ferritin, GGO: ground-glass opacity, NGS: next-generation sequencing, NSE: neuron-specific enolase, OR: odds ratio, SCCA: squamous cell carcinoma antigen, SD: standard deviation, CYFRA21-1: soluble fragment of cytokeratin 19, TTF-1: thyroid transcription factor-1.

*Items were included in the multivariate analysis only when the P value is <0.05 in univariate analysis.

#NOS, not otherwise specified indicates pathologically confirmed NSCLC, but the pathologic type was not clearly identified.


Development and Validation of Prediction Models

Model 1, based on the clinical predictors, demonstrated good discrimination and calibration: the C-statistic was 0.754 (95% CI 0.729-0.778) and the Hosmer–Lemeshow goodness-of-fit test gave χ2 = 6.733 (degrees of freedom = 8, P = 0.566) in the development cohort, and the C-statistic was 0.710 (95% CI 0.638-0.782) in the temporal validation cohort. The components of model 1 were sex; smoking status; smoking index; family history of malignant tumors; history of other malignant tumors; CT imaging manifestation; gross type; and levels of CEA, CA19-9, and SCCA. Model 2, based on the clinicopathological predictors, also demonstrated good discrimination and calibration: the C-statistic was 0.812 (95% CI 0.790-0.834) and the Hosmer–Lemeshow goodness-of-fit test gave χ2 = 6.418 (degrees of freedom = 8, P = 0.989) in the development cohort, and the C-statistic was 0.790 (95% CI 0.730-0.851) in the temporal validation cohort. The components of model 2 were sex; smoking status; smoking index; family history of malignant tumors; history of other malignant tumors; CT imaging manifestation; gross type; levels of CEA, CA19-9, and SCCA; histologic type and subtype; differentiation grade; and levels of TTF-1 and napsin A. Figure 4 shows the ROC curves for models 1 and 2 in the development cohort (A,C) and temporal validation cohort (B,D). The calibration in the internal validation was good, as shown in Figure 5. The x-axis is the probability predicted by the model and the y-axis is the actual frequency of EGFR mutation observed in the cohort. The reference line indicates where the predicted probabilities would equal the observed frequencies. The calibration plots showed a small deviation from ideal predictions.
Figure 4.

Receiver operating characteristic curve of models 1 and 2 in the development cohort (A,C) and temporal validation cohort (B,D).

Figure 5.

Calibration plot comparing the actual and predicted probabilities of the EGFR mutation. (A) Model 1; (B) model 2.
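For readers who want to relate a Hosmer–Lemeshow χ2 statistic to its P value, the sketch below evaluates the upper tail of a chi-square distribution using the closed form that holds for an even number of degrees of freedom. It is an illustrative calculation, not the authors' code; the function name is hypothetical, and the example uses the model 1 statistic (χ2 = 6.733, 8 degrees of freedom).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2k the survival function is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # build (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Model 1 Hosmer-Lemeshow statistic from the text.
    print(round(chi2_sf_even_df(6.733, 8), 3))
```

A non-significant P value here is read as no evidence of poor calibration.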

As shown in Table 2, there was no significant difference in the prediction accuracy between the models (P = 0.455).
Table 2.

Comparison of prediction accuracy between two models.

Model | Successful prediction | Failed prediction | P value
Model 1 | 144 | 66 | 0.455
Model 2 | 151 | 59 |
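The comparison in Table 2 can be illustrated with a Pearson chi-square test on the 2 × 2 table of successful and failed predictions. The sketch below assumes the uncorrected (no continuity correction) form of the test, which the text does not specify; the function name is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: model 1 and model 2; columns: successful and failed predictions.
    chi2, p = pearson_chi2_2x2(144, 66, 151, 59)
    print(round(chi2, 3), round(p, 3))
```

Because the two models were applied to the same 210 validation patients, a paired test such as McNemar's would be an alternative design choice; it would need the per-patient agreement counts, which are not reported here.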

Construction of a Clinical Scoring System

We developed a concise scoring system for predicting EGFR mutation based on the β coefficients obtained for the independent clinical predictors from the multivariate logistic regression model (Table 3). The predictive equation was logit(p) = −4.208 + (0.802 × "female") + (0.625 × "never smoking") + (0.591 × "smoking index <770") + (0.592 × "with family history of malignant tumors") + (0.973 × "without history of other malignant tumors") + (1.314 × "ground-glass opacity") + (0.555 × "peripheral type") + (0.659 × "positive CEA") + (0.567 × "negative CA19-9") + (1.180 × "negative SCCA"). Model scores ranged from 0 to 13.
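To make the predictive equation concrete, the following sketch converts a set of binary indicators into a predicted probability, reading the left-hand side of the equation as the log-odds of EGFR mutation. The coefficients are copied from the equation; the dictionary keys and the function name are hypothetical labels introduced for illustration.

```python
import math

INTERCEPT = -4.208
# Coefficients from the predictive equation; keys are illustrative names.
COEFFICIENTS = {
    "female": 0.802,
    "never_smoker": 0.625,
    "smoking_index_lt_770": 0.591,
    "family_history": 0.592,
    "no_other_malignancy": 0.973,
    "ggo": 1.314,
    "peripheral_type": 0.555,
    "cea_positive": 0.659,
    "ca19_9_negative": 0.567,
    "scca_negative": 1.180,
}


def predicted_probability(indicators):
    """Map {predictor: True/False} to a probability via the logistic function."""
    unknown = set(indicators) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown predictors: {sorted(unknown)}")
    logit = INTERCEPT + sum(
        COEFFICIENTS[name] for name, present in indicators.items() if present
    )
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    patient = {"female": True, "never_smoker": True, "ggo": True, "cea_positive": False}
    print(round(predicted_probability(patient), 3))
```

Predictors omitted from the input are treated as absent, so a patient with none of the listed features receives the baseline probability implied by the intercept alone.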
Table 3.

Multivariate analysis for the independent clinical predictors in EGFR mutation and corresponding points.

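Categories | β | S.E. | Wald | P value | OR [95% CI] | Points
Intercept | −4.208 | 0.541 | 60.496 | <0.001 | | -
Sex | | | | | |
Male | | | | | Reference | 0
Female | 0.802 | 0.149 | 29.150 | <0.001 | 2.230 [1.667, 2.984] | 1
Smoking status | | | | | |
Current/Former | | | | | Reference | 0
Never | 0.625 | 0.162 | 14.807 | <0.001 | 1.869 [1.359, 2.570] | 1
Smoking index | | | | | |
≥770 | | | | | Reference | 0
<770 | 0.591 | 0.177 | 11.155 | 0.001 | 1.806 [1.277, 2.555] | 1
Family history of malignant tumors | | | | | |
No | | | | | Reference | 0
Yes | 0.592 | 0.226 | 6.877 | 0.009 | 1.808 [1.161, 2.815] | 1
History of other malignant tumors | | | | | |
Yes | | | | | Reference | 0
No | 0.973 | 0.279 | 12.213 | <0.001 | 2.647 [1.533, 4.570] | 2
CT imaging manifestation | | | | | |
Solid | | | | | Reference | 0
GGO (Pure/Mix) | 1.314 | 0.392 | 11.687 | 0.001 | 3.822 [1.772, 8.243] | 2
Gross type | | | | | |
Central type | | | | | Reference | 0
Peripheral type | 0.555 | 0.173 | 10.282 | 0.001 | 1.743 [1.241, 2.447] | 1
CEA | | | | | |
Negative | | | | | Reference | 0
Positive | 0.659 | 0.144 | 21.048 | <0.001 | 1.932 [1.458, 2.561] | 1
CA19-9 | | | | | |
Positive | | | | | Reference | 0
Negative | 0.567 | 0.212 | 7.149 | 0.008 | 1.763 [1.163, 2.671] | 1
SCCA | | | | | |
Positive | | | | | Reference | 0
Negative | 1.180 | 0.316 | 13.903 | <0.001 | 3.253 [1.750, 6.049] | 2
Total | | | | | | 13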

Abbreviations: CA: carbohydrate antigen, CEA: carcinoembryonic antigen, CI: confidence interval, CT: computerized tomography, EGFR: epidermal growth factor receptor, GGO: ground-glass opacity, OR: odds ratio, SCCA: squamous cell carcinoma antigen.


Risk Group Stratification and Probability of EGFR Mutation

Patients were stratified into risk groups according to their scores: low-risk (score <4), moderate-risk (score 4-8), and high-risk (score >8) groups. In the development cohort, compared with that in the low-risk group, the occurrence of EGFR mutation was increased in the moderate-risk (OR = 5.22, 95% CI 3.36-8.11, P < 0.001) and high-risk (OR = 19.67, 95% CI 11.22-34.50, P < 0.001) groups. In the temporal validation cohort, compared with that in the low-risk group, the occurrence of EGFR mutation was increased in the moderate-risk (OR = 1.91, 95% CI 0.75-4.85, P = 0.172) and high-risk (OR = 8.08, 95% CI 2.80-23.33, P < 0.001) groups (Table 4).
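The scoring and stratification rule can be sketched as follows, using the point values from Table 3 and the cut-offs stated above. The predictor keys and function names are hypothetical labels; only the points and thresholds come from the text.

```python
# Points per predictor as listed in Table 3; keys are illustrative names.
POINTS = {
    "female": 1,
    "never_smoker": 1,
    "smoking_index_lt_770": 1,
    "family_history": 1,
    "no_other_malignancy": 2,
    "ggo": 2,
    "peripheral_type": 1,
    "cea_positive": 1,
    "ca19_9_negative": 1,
    "scca_negative": 2,
}


def total_score(present_predictors):
    """Sum the points of the predictors present in a patient (duplicates ignored)."""
    return sum(POINTS[name] for name in set(present_predictors))


def risk_group(score):
    """Low risk: score < 4; moderate risk: 4 to 8; high risk: score > 8."""
    if score < 4:
        return "low"
    if score <= 8:
        return "moderate"
    return "high"


if __name__ == "__main__":
    patient = ["female", "never_smoker", "smoking_index_lt_770", "ggo", "scca_negative"]
    score = total_score(patient)
    print(score, risk_group(score))
```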
Table 4.

Scoring system table.

Risk score | n | EGFR (+), n (%) | OR [95% CI] | P value
Development cohort, range (0, 12) | | | |
Score <4 | 143 | 26 (18.2) | Reference | -
4 ≤ Score ≤ 8 | 1108 | 595 (53.7) | 5.22 [3.36, 8.11] | <0.001
Score >8 | 188 | 153 (81.4) | 19.67 [11.22, 34.50] | <0.001
Temporal validation cohort, range (1, 13) | | | |
Score <4 | 25 | 7 (28.0) | Reference | -
4 ≤ Score ≤ 8 | 141 | 60 (42.6) | 1.91 [0.75, 4.85] | 0.172
Score >8 | 44 | 30 (68.2) | 8.08 [2.80, 23.33] | <0.001

Abbreviations: CI: confidence interval, EGFR: epidermal growth factor receptor, OR: odds ratio.
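As a worked illustration of how an odds ratio relative to the low-risk group can be derived from group counts, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval and applies it to the development-cohort counts in Table 4. The function name is hypothetical, and the Wald interval is an assumption, since the text does not state how the intervals were obtained.

```python
import math


def odds_ratio(events, n, ref_events, ref_n, z=1.96):
    """Unadjusted odds ratio versus a reference group, with a Wald interval."""
    a, b = events, n - events              # mutated / wild type in the group
    c, d = ref_events, ref_n - ref_events  # mutated / wild type in the reference
    estimate = (a / b) / (c / d)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


if __name__ == "__main__":
    # Development cohort: low-risk reference group has 26 mutated of 143.
    print([round(v, 2) for v in odds_ratio(595, 1108, 26, 143)])  # moderate risk
    print([round(v, 2) for v in odds_ratio(153, 188, 26, 143)])   # high risk
```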

Figure 6 presents the probability of EGFR mutation. In the development cohort, among patients with low, moderate, and high risks of EGFR mutation, the actual mutation frequency was 18.2%, 53.7%, and 81.4%, respectively. In the temporal validation cohort, among patients with low, moderate, and high risks of EGFR mutation, the actual mutation frequency was 28.0%, 42.6%, and 68.2%, respectively.
Figure 6.

Histogram for the risk of EGFR mutation. According to the scoring system, a histogram of the proportions of the low-, moderate-, and high-risk populations was drawn.


Discussion

There were two main parts to our study. In the first part, we described the EGFR genetic profile of lung cancer. Among the 1450 lung cancer cases, EGFR was the most commonly mutated gene: mutations in EGFR were detected in more than half of the patients (774/1450, 53.4%). The high frequency of EGFR mutations in Chinese patients with lung cancer highlights the significance of exploring the mutational status of EGFR.[3,5] In our study, the most common mutational subtype in EGFR was 19 del (46.6%), followed by L858R (44.5%). However, several studies have reported that the frequency of L858R was higher than that of 19 del; the discrepancy might be due to differences in sample size, geography, tumor stage, specimen type, and detection technology. In addition, T790M was confirmed as the most important TKI-resistant mutation, accounting for more than half of the observed secondary resistance. We identified one case with a single T790M mutation, whereas in all other cases T790M coexisted with sensitizing mutation sites. Furthermore, T790M co-occurred slightly more often with L858R (25.5%) than with 19 del (23.5%), suggesting that point mutations are more likely to be accompanied by compound mutations. In our cohort, all patients with a T790M secondary mutation were treated with osimertinib, an irreversible third-generation TKI. EGFR/TP53 was the most frequent co-mutation among cases with two or more synchronous gene mutations. TP53, a tumor suppressor gene, is frequently mutated in lung adenocarcinoma and is associated with a poor prognosis in patients with non-small cell lung cancer (NSCLC). Mutations in TP53 have been reported to disrupt DNA repair and apoptosis, leading to cancer. A retrospective study of 1017 samples found that two of the most commonly co-mutated genes in EGFR-mutant lung cancer were PIK3CA and MET, but this study did not analyze the tumor suppressor genes.
Zhao et al. reported that patients with co-mutations in EGFR and TP53 accounted for 22.4% of all patients with lung adenocarcinoma and had higher tumor mutational burden and worse recurrence-free survival. In addition, the authors provided insights into the prognostic value of co-mutation of EGFR/TP53 for patients with lung adenocarcinoma. Thus, patients with lung cancer harboring EGFR and co-mutational tumor suppressor genes should be regarded as a unique subgroup, and more informed and genomically empowered molecular diagnosis and monitoring, and dynamically applied rational polytherapy strategies should be employed to address the clonal and subclonal co-alterations driving disease progression and drug resistance in order to better control lung cancer. In the second part, we established two prediction models based on the clinical and clinicopathological features and found there was no significant difference between them (P = 0.455). Thus, for patients who are not available for gene detection pathologically, we developed a practical predictive scoring system that could initially identify EGFR mutation, based on the independent clinical predictors, including sex, smoking status, smoking index, family history of malignant tumors, history of other malignant tumors, CT imaging manifestation, gross type, CEA, CA19-9, and SCCA. The discrimination of our scoring system in distinguishing patients with wild-type or EGFR-mutated tumors was good. The system uses coefficients to convert the features into quantitative data, demonstrating good discrimination with C-statistics of 0.754 (95% CI 0.729-0.778) for the development and 0.710 (95% CI 0.638-0.782) for the temporal validation cohorts. The system demonstrated that mutations in EGFR were significantly associated with female sex and non-smokers or smokers with lower smoking index (<770), but not stage. 
A study of 884 patients reported that mutations in EGFR are common in Chinese patients with early-stage non-small cell lung cancer (NSCLC; I–II); whereas another study reported a contradictory conclusion. Most surgical specimens reflect cases of early lung cancer, whereas most biopsy specimens reflect cases of advanced lung cancer. Therefore, inclusion of only a single specimen type is likely to be the underlying cause of the discordance in the results between these studies. Moreover, our study showed that biopsy specimens can replace surgical specimens for the detection of the mutational status of EGFR (P > 0.05), consistent with the findings of a previous study. In our study, we included the smoking index (Brinkman index), which can reflect the degree of influence of smoking. Approximately 83% of women with lung cancer do not have a smoking habit in China. Sex may influence the correlation between smoking status and EGFR mutation. Tomita et al. compared the correlation between the Brinkman index and EGFR mutation status of 90 Japanese men and found that the Brinkman index of the EGFR mutation group was lower than that of the wild-type group, but with no significant differences (P = 0.8357). A recent Chinese study analyzed the Brinkman index in male subjects and found that the degree of smoking between the EGFR group and the control group was still significantly different (P = 0.020). After excluding the influence of sex, our result was in agreement with the latter. A meta-analysis also showed that TKI therapy should be expanded to former smokers with less than 15 pack-years or those who did not smoke for more than 25 years. In addition, a family history of malignant tumors and history of other malignant tumors were found to be significantly related to mutations in EGFR. To the best of our knowledge, there have been no reports on these factors. 
Our system showed that a higher proportion of GGO was linked with mutations in EGFR, consistent with most previous study findings.[8,9,22] However, some previous studies have reported either that no significant correlation was observed between mutations in EGFR and GGO or that mutations in EGFR were more common in lesions with >50% solid component.[24,25] The difference could result from insufficient recruitment of early-stage patients, or from the use of different definitions for the quantification and classification of GGO. After excluding the influence of other malignant tumors, positive CEA, negative SCCA, and negative CA19-9 in serum were demonstrated to be predictors of mutations in EGFR. Niu et al. indicated that, compared with patients with wild-type EGFR, patients with EGFR mutations were more likely to have a high CEA level. Moreover, high CEA levels may represent a higher tumor burden; therefore, CEA has been recommended for evaluating the efficacy and prognosis of treatment with EGFR-TKIs. TKIs are known to disrupt abnormal downstream signaling pathways induced by mutations in EGFR to inhibit tumor proliferation, whereas overexpression of CEA has been reported to accelerate tumor development, suggesting that CEA might be a "cofactor" of mutations in EGFR. However, the opposite conclusion, that is, that increased serum tumor markers are independent of driver genes, has also been drawn. Therefore, further research is needed in this regard. In addition, SCCA is known to be the preferred marker of squamous cell carcinoma, whereas mutations in EGFR are common in adenocarcinoma. Thus, mutations in EGFR would rarely be detected in patients with positive SCCA, consistent with the univariate analysis results of Wen et al. However, the inconsistency with their multivariate analysis results may be related to sample size, inclusion criteria, and population differences.
We also demonstrated the value of CA19-9 for the prediction of EGFR mutations in patients with lung cancer. To our knowledge, no relevant research has been conducted to date. In addition, a study showed that cytological CYFRA21-1 correlated with the mutational status of EGFR, but not after its release into the serum. Therefore, serum tumor markers, such as CEA, SCCA, and CA19-9, the detection of which is both convenient and easy in China, could be widely used in hospitals lacking gene detection technology.
The proposed scoring system is based on clinical characteristics with external validation to show its generalizability and reliability. According to our scoring system, low-, moderate-, and high-risk grades may distinguish EGFR-mutant from wild-type cases. Although a few studies have evaluated the risk factors of EGFR mutation or proposed a scoring system, the parameters were not comprehensive and the efficiency of these scoring systems was not externally validated.[2,10,15] To the best of our knowledge, the current study proposes the first scoring system that uses various clinical parameters to predict the EGFR mutation status in a relatively large development cohort and validates its clinical value.
Regarding pathological features, we found that patients with well-differentiated adenocarcinoma and peripheral tumors were more likely to have EGFR mutations. EGFR mutations were more common in patients with AIS or lepidic predominant adenocarcinoma. It has been reported that the CT findings of GGO tend to correspond to the lepidic pattern observed pathologically, although this correlation is not absolute. The prognosis was best for AIS and lepidic predominant adenocarcinoma. The overall survival (OS) rate of patients with micropapillary or solid predominant adenocarcinoma was significantly lower than that of patients with minimally invasive, lepidic, acinar, or papillary adenocarcinoma. Patients with acinar and papillary adenocarcinoma exhibit a similar cumulative 1-year OS rate.
Thus, we classified different subtypes of adenocarcinoma based on the corresponding prognosis when constructing the model. The results revealed that the EGFR mutation rate was the highest in patients with AIS/lepidic predominant adenocarcinoma, followed by patients with papillary/acinar predominant adenocarcinoma and those with micropapillary/solid predominant adenocarcinoma. Previous findings indicated that EGFR-mutated tumors are significantly associated with the lepidic or acinar predominant subtype.[32-34] Thus, when performing subsolid lung nodule biopsy for EGFR mutation detection, we recommend that interventional radiologists target the GGO components as much as possible to increase the detection rate of EGFR mutations.
Additionally, TTF-1 and napsin A have been demonstrated to be significant predictive factors for EGFR mutation. TTF-1 is a tissue-specific transcription factor that helps to maintain the functions of terminal respiratory unit cells. Among immunohistochemical markers, TTF-1 is considered the gold standard for primary lung adenocarcinoma. Napsin A is an aspartic protease expressed in the lung and kidney that is capable of cleaving the preform of surfactant protein B expressed in type II pneumocytes. It has been reported that napsin A is a more sensitive marker than TTF-1 for pulmonary adenocarcinomas and may complement immunohistochemical analyses using TTF-1.[39,40] Hence, the immunohistochemical results might assist in guiding targeted therapy.[35,41]
This study had some limitations. First, as a retrospective single-center study, it could only reflect the mutational rate of patients in a local area of Asia. Second, we did not analyze the clinical value of a radiomics signature or establish an EGFR subtype prediction model. Third, owing to the small number of EGFR compound mutations and co-mutations, we described only statistical associations. Finally, NGS technology is expensive, and most patients in our study could not afford it. Therefore, our data on rare gene mutations were limited.

Conclusions

In conclusion, we determined the EGFR genetic profile of lung cancer in Asians and systematically analyzed the relevant clinicopathological factors. We developed and validated a concise, non-invasive scoring system based on clinical predictors to provide an initial prediction of EGFR mutation status in patients for whom gene detection is not available.
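As a minimal illustration of how the risk stratification might be applied, the Python sketch below maps a total score to the three risk groups using the cut-offs reported in the abstract (low-risk: score <4; moderate-risk: score 4-8; high-risk: score >8). The function name is hypothetical, and the total score is assumed to have already been computed from the study's predictor weights, which are not reproduced here.

```python
def egfr_risk_group(total_score: float) -> str:
    """Map a pre-computed total score to a risk group.

    Cut-offs follow the abstract: <4 low, 4-8 moderate, >8 high.
    How the total score is built from individual predictors is
    outside the scope of this sketch.
    """
    if total_score < 4:
        return "low-risk"
    if total_score <= 8:
        return "moderate-risk"
    return "high-risk"


# Hypothetical example scores, for illustration only.
groups = [egfr_risk_group(s) for s in (2, 4, 8, 11)]
```

The boundary scores 4 and 8 both fall in the moderate-risk group, matching the "4-8" range stated in the abstract.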
  41 in total

Review 1.  The IASLC Lung Cancer Staging Project: Proposals for Coding T Categories for Subsolid Nodules and Assessment of Tumor Size in Part-Solid Tumors in the Forthcoming Eighth Edition of the TNM Classification of Lung Cancer.

Authors:  William D Travis; Hisao Asamura; Alexander A Bankier; Mary Beth Beasley; Frank Detterbeck; Douglas B Flieder; Jin Mo Goo; Heber MacMahon; David Naidich; Andrew G Nicholson; Charles A Powell; Mathias Prokop; Ramón Rami-Porta; Valerie Rusch; Paul van Schil; Yasushi Yatabe
Journal:  J Thorac Oncol       Date:  2016-04-21       Impact factor: 15.609

2.  EGFR-mutant lung adenocarcinoma harboring co-mutational tumor suppressor genes predicts poor prognosis.

Authors:  Yue Zhao; Yunjian Pan; Chao Cheng; Difan Zheng; Yang Zhang; Zhendong Gao; Fangqiu Fu; Hang Li; Shanbo Zheng; Lingdun Zhuge; Hengyu Mao; Muyu Kuang; Xiaoting Tao; Yizhou Peng; Hong Hu; Jiaqing Xiang; Yuan Li; Yihua Sun; Haiquan Chen
Journal:  J Cancer Res Clin Oncol       Date:  2020-05-02       Impact factor: 4.553

3.  Value of serum tumor markers for predicting EGFR mutations and positive ALK expression in 1089 Chinese non-small-cell lung cancer patients: A retrospective analysis.

Authors:  Sufei Wang; Pei Ma; Guanzhou Ma; Zhilei Lv; Feng Wu; Mengfei Guo; Yumei Li; Qi Tan; Siwei Song; E Zhou; Wei Geng; Yanran Duan; Yan Li; Yang Jin
Journal:  Eur J Cancer       Date:  2019-11-07       Impact factor: 9.162

4.  Clinicopathological features of Chinese lung cancer patients with epidermal growth factor receptor mutation.

Authors:  Hui Ning; Ming Liu; Lina Wang; Yang Yang; Nan Song; Xiaoxiong Xu; Jin Ju; Gening Jiang
Journal:  J Thorac Dis       Date:  2017-03       Impact factor: 2.895

5.  The relationship between EGFR mutation status and clinic-pathologic features in pulmonary adenocarcinoma.

Authors:  Huiyan Deng; Junying Liu; Xiaojin Duan; Yueping Liu
Journal:  Pathol Res Pract       Date:  2017-09-21       Impact factor: 3.250

6.  Clinicoradiopathological features and prognosis according to genomic alterations in patients with resected lung adenocarcinoma.

Authors:  Yeonseok Choi; Ki-Hwan Kim; Byeong-Ho Jeong; Kyung-Jong Lee; Hojoong Kim; O Jung Kwon; Jhingook Kim; Yoon-La Choi; Ho Yun Lee; Sang-Won Um
Journal:  J Thorac Dis       Date:  2020-10       Impact factor: 2.895

7.  Are there imaging characteristics associated with epidermal growth factor receptor and KRAS mutations in patients with adenocarcinoma of the lung with bronchioloalveolar features?

Authors:  Catherine Glynn; Maureen F Zakowski; Michelle S Ginsberg
Journal:  J Thorac Oncol       Date:  2010-03       Impact factor: 15.609

8.  Correlation of EGFR mutation status with predominant histologic subtype of adenocarcinoma according to the new lung adenocarcinoma classification of the International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society.

Authors:  Celina Villa; Philip T Cagle; Melissa Johnson; Jyoti D Patel; Anjana V Yeldandi; Rishi Raj; Malcolm M DeCamp; Kirtee Raparia
Journal:  Arch Pathol Lab Med       Date:  2014-02-26       Impact factor: 5.534

9.  Napsin A and thyroid transcription factor-1 expression in carcinomas of the lung, breast, pancreas, colon, kidney, thyroid, and malignant mesothelioma.

Authors:  Justin A Bishop; Rajni Sharma; Peter B Illei
Journal:  Hum Pathol       Date:  2009-09-08       Impact factor: 3.466

10.  EGFR-TKI-sensitive mutations in lung carcinomas: are they related to clinical features and CT findings?

Authors:  Xiaoyi Qin; Xiaolong Gu; Yingru Lu; Wei Zhou
Journal:  Cancer Manag Res       Date:  2018-10-01       Impact factor: 3.989
