Liwen Zhang1, Bojiang Chen2, Xia Liu3, Jiangdian Song4, Mengjie Fang5, Chaoen Hu5, Di Dong6, Weimin Li7, Jie Tian8. 1. School of automation, Harbin University of Science and Technology, Harbin, Heilongjiang, 150080, China; CAS Key Lab of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China. 2. Department of respiratory and critical care medicine, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China. 3. School of automation, Harbin University of Science and Technology, Harbin, Heilongjiang, 150080, China. 4. School of Medical Informatics, China Medical University, Shenyang, Liaoning 110122, China. 5. CAS Key Lab of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China. 6. CAS Key Lab of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China. Electronic address: di.dong@ia.ac.cn. 7. Department of respiratory and critical care medicine, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China. Electronic address: weimin003@163.com. 8. CAS Key Lab of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China. Electronic address: tian@ieee.org.
Abstract
OBJECTIVES: To predict epidermal growth factor receptor (EGFR) mutation status using quantitative radiomic biomarkers and representative clinical variables. METHODS: The study included 180 patients diagnosed with non-small cell lung cancer (NSCLC) who had pre-therapy computed tomography (CT) scans. Using a radiomic method, 485 features that reflect the heterogeneity and phenotype of tumors were extracted. These radiomic features were then used to predict EGFR mutation status with a least absolute shrinkage and selection operator (LASSO) based multivariable logistic regression. We found that radiomic features have predictive ability for EGFR mutation status. In addition, we used a radiomic nomogram and calibration curves to test the performance of the model. RESULTS: Multivariate analysis revealed that the radiomic features had the potential to build a prediction model for EGFR mutation. The area under the receiver operating characteristic curve (AUC) was 0.8618 for the training cohort and 0.8725 for the validation cohort, both superior to the prediction model that used clinical variables alone. CONCLUSION: Radiomic features are better predictors of EGFR mutation status than conventional semantic CT image features or clinical variables, and can help doctors decide who needs EGFR tyrosine kinase inhibitor (TKI) treatment.
Recently, considerable progress has been made in the treatment of non-small cell lung cancer (NSCLC). Pathological analysis and evaluation of biomolecular markers are the primary guidelines for the investigation of lung adenocarcinomas [1], [2], [3]. Research on the molecular mechanisms of lung cancer has shown that it is polygenic [4]. Various genes are involved in the occurrence, development, invasion, and metastasis of NSCLCs, such as epidermal growth factor receptor (EGFR), which is used for mutation testing [5], Kirsten rat sarcoma viral oncogene homolog (KRAS) [6], and anaplastic lymphoma kinase (ALK) [7]. EGFR has attracted increasing attention in recent years; it is frequently over-expressed and is directly related to extending the survival period. EGFR tyrosine kinase inhibitor (TKI) treatment is more effective for NSCLC patients with EGFR mutations [8]. One study found that patients with EGFR mutations achieved a significantly better treatment result than patients without the mutation (log-rank test, P = .0023; Breslow-Gehan-Wilcoxon test, P = .0012) [9]. Compared with traditional cytotoxic drugs, molecular targeted drugs possess a specific active site and have very little impact on normal tissue cells while inhibiting tumor cell growth; safety and tolerability are their main advantages. Hence, the key to targeted therapy is to identify the patients who will benefit from the drugs, the “accepted crowd”.
A few studies have recently indicated that EGFR mutation status is related to many factors, such as smoking status, histological subtype, gender, and ethnicity [10], [11]. A recent study revealed that EGFR mutation can be correlated with computed tomography (CT) image features [12]; it showed that the ground-glass opacity (GGO) volume was associated with a higher percentage of the exon 21 missense mutation compared with other patients.
Other studies found that CT characteristics of lung adenocarcinomas, in conjunction with clinical variables, predict EGFR mutation status better than clinical variables alone [13]. However, all of these studies tried to predict EGFR mutation from traditional, univariate CT features or clinical features, and were therefore limited in the information they could mine from CT images.
We hypothesized that a large number of quantitative features extracted from CT images can be used to build a prediction model of EGFR mutation [14], [15], [16]. Radiomics is the mining of high-throughput features from medical images. Recently, some studies found that radiomics was more informative and efficient than traditional clinical data and a few image features for the analysis of medical images and the potential improvement of oncology treatments [17], [18]. Moreover, radiomics is an emerging approach that builds predictive models from multiple imaging biomarkers and has yet to be widely accepted. Therefore, the aim of this study was to develop a logistic regression model with multivariate radiomic features and clinical data to investigate their potential relationship with EGFR mutation status.
Patients and Methods
The study was approved by the West China Hospital, Sichuan University; ethical approval was obtained for use of the CT images. Informed consent for data collection and gene prediction was waived for this research, in accordance with the related policy of the West China Hospital.
Patients
We obtained patient records of 1476 original cases of NSCLC at West China Hospital; CT scans and clinical data of all cases were collected between December 2008 and September 2014. Inclusion and exclusion criteria are presented in the Supplementary material (Figure S1). Only 180 cases were eligible for the investigation. Clinical and demographic data were collected for all patients, including smoking status, sex, age, clinical stage, and histological subtype. Patient stage was classified according to the TNM classification system of the American Joint Committee on Cancer [19]. All patients were tested for mutations of EGFR exons 18, 19, 20, and 21 using the Amplification Refractory Mutation System (ARMS).
Radiomic Method
The process of the radiomic method included the following steps (Figure 1):
Figure 1
The process of the radiomics method. (a) Original images of NSCLC patients. (b) Experienced radiologists segmented the tumor region of interest (ROI) on all CT slices to extract the radiomic features. (c) Extraction of features from the ROI, such as tumor shape, intensity, texture, and wavelet features. (d) Prediction for EGFR mutation.
Image acquisition
Images included in the set must have the same or similar parameters. In our study, all CT images were of the same scan type and in the same Digital Imaging and Communications in Medicine (DICOM) format. The CT image acquisition and retrieval procedures are described in the Supplementary material (page 2).
Segmentation
Tumor segmentation is a key step for the subsequent feature extraction and prediction model construction. At present, manual segmentation is considered the “gold standard”. ITK-SNAP software was used for three-dimensional (3D) manual segmentation [20]. In this study, tumors were segmented by a professional radiologist with 5 years of experience, and each delineated region of interest (ROI) was validated by a second radiologist, who also had 5 years of experience. Each ROI was also examined by two clinical students.
Feature extraction
High-dimensional radiomic features were extracted to describe tumor phenotype. Features were divided into four groups: (1) clinical cognitive features, (2) image intensity features, (3) textural features, and (4) wavelet features. The first group described the tumor phenotype, such as shape (roundness and burr), area, volume, and compactness. The second group described the gray-level histogram of the tumor region and the gray-level distribution of its voxels. The third group revealed the homogeneity or heterogeneity of the structure within the tumor. The final group captured image intensity and texture after wavelet decomposition of the original image. The feature extraction algorithms were implemented in Matlab 2015b (The Mathworks, Inc., Natick, MA, USA). More details are shown in the Supplementary material (pages 3–8, Tables S1–S5).
Feature selection
A large number of radiomic features are useful and meaningful for quantifying a tumor. However, redundancy among the features degrades classification performance, and medical imaging datasets used for model building are generally small. Feature selection is therefore very important for improving the generalization ability of the model. The least absolute shrinkage and selection operator (LASSO) method was applied to select the most discriminative features and build a logistic regression model. LASSO is an accepted algorithm for feature selection among high-dimensional variables. A radiomic score (Rad-score) was obtained for each patient as a combination of the selected features weighted by their respective coefficients.
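The Rad-score described above is a weighted sum of the selected features. The following is a minimal sketch of that calculation, not the authors' implementation: the feature names, coefficient values, and intercept are hypothetical placeholders (the actual selected features and coefficients are given in the Supplementary material), and the logistic transform is shown only to illustrate how such a score maps to a probability.

```python
import math

def rad_score(features, coefficients, intercept=0.0):
    """Weighted sum of the selected features (features with zero weight drop out)."""
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )

def predicted_probability(score):
    """Logistic transform of a linear score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical coefficients; a LASSO fit sets many of them to exactly 0.
coefficients = {"feature_a": 0.8, "feature_b": -0.5, "feature_c": 0.0}
# Hypothetical (already normalized) feature values for one patient.
patient = {"feature_a": 1.2, "feature_b": 0.4, "feature_c": 3.0}

score = rad_score(patient, coefficients, intercept=-0.1)
probability = predicted_probability(score)
print(f"Rad-score = {score:.2f}, probability = {probability:.3f}")
```

Because `feature_c` has a zero coefficient, it contributes nothing to the score, which is how LASSO performs feature selection and weighting in a single step.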
Development of the Multivariable Prediction Model
A multivariable logistic regression model was built using the following predictors: age, gender, smoking status, and radiomic features. All of these were included in the development of a diagnostic model to predict EGFR mutations. We also developed a radiomic nomogram based on a multivariable logistic analysis in the training cohort. We used the radiomic signature (Rad_signature) to represent the predicted probability for each patient; it was obtained from the LASSO regression model built on the radiomic features.
Statistical Analysis
Statistical analysis was performed using Matlab 2015b and R software (version 3.3.3, http://www.R-project.org). We used the “SPM12” package in Matlab 2015b to extract features. The LASSO binary logistic regression model was built using the “glmnet” package. Features that were not normally distributed were compared using the Mann–Whitney U test. The nomogram was drawn from the results of the multivariate analysis using the “rms” package in R. The “Hmisc” package was used to assess the performance of the nomogram by means of the concordance index (C-index); a larger C-index indicates more accurate prediction. Moreover, calibration curves were plotted for the nomogram. A P < .05 was considered statistically significant.
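For a binary outcome such as EGFR mutation status, the C-index coincides with the AUC, and both equal the Mann–Whitney U statistic divided by the number of positive-negative pairs. The sketch below illustrates that relationship in plain Python; it is not the “Hmisc” implementation, and the predicted probabilities are invented for illustration.

```python
def mann_whitney_u(positives, negatives):
    """Count (positive, negative) pairs ranked correctly; ties count as 0.5."""
    u = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u

def c_index_binary(positives, negatives):
    """C-index (equal to the AUC) for a binary outcome: U / (n_pos * n_neg)."""
    return mann_whitney_u(positives, negatives) / (len(positives) * len(negatives))

# Hypothetical predicted probabilities for mutated and wild-type patients.
mutated = [0.9, 0.8, 0.4]
wild_type = [0.7, 0.3, 0.2]

u_statistic = mann_whitney_u(mutated, wild_type)
c_index = c_index_binary(mutated, wild_type)
print(f"U = {u_statistic}, C-index = {c_index:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.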
Results
Clinical Data Analysis
No significant differences in EGFR mutation were detected between the two cohorts (P = .91) in terms of smoking status, histological subtype, age, sex, pathological stage, or Rad-score. In the training cohort, male patients comprised 74.3% (104/140) and female patients 25.7% (36/140). The mean age was 60.2±9.7 years (range, 30 to 82 years). Adenocarcinoma accounted for 48.6% (68/140) of cases, squamous cell carcinoma for 38.6% (54/140), and others for 12.9% (18/140). Smokers accounted for 65.0% (91/140) of patients and non-smokers for 35.0% (49/140). Clinical stage IIIA accounted for 45.0% (63/140), IIIB for 25.7% (36/140), and IV for 29.3% (41/140) of cases. In the validation cohort, males were 75.0% (30/40) and females 25.0% (10/40) of patients. The mean age was 60.4±9.7 years (range, 30 to 82 years). Adenocarcinoma accounted for 50.0% (20/40) of cases, squamous cell carcinoma for 35.0% (14/40), and others for 15.0% (6/40). Smokers were 70.0% (28/40) of patients and non-smokers 30.0% (12/40). Clinical stage IIIA accounted for 30.0% (12/40), IIIB for 27.5% (11/40), and IV for 42.5% (17/40). For smokers, we estimated the number of cigarettes smoked in 1 year, which was used for the nomogram in a subsequent section. More patient information is shown in Table 1.
Table 1
Analysis of Patients in the Training and Validation Cohorts
| Characteristics | Training Cohort, EGFR+ | Training Cohort, EGFR- | P | Validation Cohort, EGFR+ | Validation Cohort, EGFR- | P |
| --- | --- | --- | --- | --- | --- | --- |
| No. of patients | 66 | 74 | − | 20 | 20 | |
| Age, mean±STD | 52.9±9.5 | 65.1±7.1 | .322 | 60.6±13.2 | 60.2±4.0 | .301 |
| Gender | | | .002 | | | .003 |
| Male | 41 (62%) | 63 (85%) | | 11 (55%) | 19 (95%) | |
| Female | 25 (38%) | 11 (15%) | | 9 (45%) | 1 (5%) | |
| Smoking status | | | <.001 | | | <.001 |
| Yes | 35 (53%) | 56 (76%) | | 11 (55%) | 17 (85%) | |
| No | 31 (47%) | 18 (24%) | | 9 (45%) | 3 (15%) | |
| Stage | | | .047 | | | .021 |
| IIIA | 28 (42%) | 35 (47%) | | 4 (20%) | 8 (40%) | |
| IIIB | 18 (27%) | 18 (24%) | | 7 (35%) | 4 (20%) | |
| IV | 20 (31%) | 21 (28%) | | 9 (45%) | 8 (40%) | |
| Histological subtype | | | .007 | | | <.001 |
| Adenocarcinoma | 43 (65%) | 25 (34%) | | 11 (55%) | 9 (45%) | |
| Squamous cell carcinoma | 16 (24%) | 38 (51%) | | 4 (20%) | 10 (50%) | |
| Others | 7 (11%) | 11 (15%) | | 5 (25%) | 1 (5%) | |
| Rad-score (mean) | 0.501 | −0.667 | <.001 | 0.602 | −0.556 | <.001 |
*The P value represents the univariate association between each of the clinical variables and EGFR mutation using the Wilcoxon rank sum test. A P < .05 indicates significance. Abbreviations: STD, standard deviation; Rad-score, radiomic score.
Feature Extraction and Selection
In total, 485 radiomic features were extracted from the ROI. Clinical features included smoking status, gender, age, clinical stage, and histological subtype. The LASSO algorithm with 10-fold cross-validation was used to reduce all of the features to 10 potential predictors based on the 140 patients in the training cohort; these predictors were used to develop the LASSO logistic regression model (Figure 2). The features used in the model and a description of the Rad-score calculation are included in the Supplementary material (page 10).
Figure 2
The least absolute shrinkage and selection operator (LASSO) binary logistic regression model for feature selection. (a) As ln(λ) increased, the number of nonzero coefficients among the 485 radiomic features and four clinical features shrank. The optimal value of λ was 0.0537, corresponding to ln(λ) = −2.92. The vertical dotted line is drawn at the value selected by 10-fold cross-validation, where the 10 optimal coefficients were obtained. (b) The relationship between the area under the receiver operating characteristic curve (AUC) and ln(λ). To avoid overfitting, the number of features was kept as small as possible. When ln(λ) increased to −2.92, the AUC reached a peak again with an appropriate number of features according to the 10-fold cross-validation.
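The way a larger λ leaves fewer nonzero coefficients can be illustrated with the soft-thresholding operator, which is the basic shrinkage step used in coordinate-descent LASSO solvers. This is only a sketch of the mechanism under simplifying assumptions, not the fitting procedure used in the study: the unpenalized coefficient values are hypothetical, and a real penalized logistic regression applies this step iteratively rather than once.

```python
import math

def soft_threshold(value, lam):
    """Shrink a coefficient toward zero by lam; set it to 0 if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def count_nonzero(coefficients, lam):
    """Number of coefficients that survive shrinkage at penalty lam."""
    return sum(1 for c in coefficients if soft_threshold(c, lam) != 0.0)

# Hypothetical unpenalized coefficients for five features.
raw_coefficients = [0.30, -0.12, 0.05, -0.02, 0.60]

for lam in (0.01, 0.0537, 0.5):
    kept = count_nonzero(raw_coefficients, lam)
    print(f"lambda = {lam:<6} ln(lambda) = {math.log(lam):6.2f}  nonzero = {kept}")
```

The middle value, λ = 0.0537, is the optimum reported in the Figure 2 caption; its natural logarithm rounds to −2.92, matching the caption.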
Development of the Prediction Model and ROC Curve Analysis
The LASSO logistic regression analysis (Table 3) revealed that 7 radiomic features combined with 3 clinical features had the potential to build the prediction model for EGFR mutation in the training cohort of 140 patients. These were IIF.range (P = .001), IIF.Skewness (P = .003), WF.IF.mean_absolute_deviation (P < .001), WF.IF.median (P = .004), WF.IF.mean (P = .010), WF.GLCM.variance (P = .019), GLRLM_HGLRE (P = .020), gender (P = .002), smoking status (P < .001), and histological subtype (P = .007). More information is shown in the Supplementary material (Table S5). To validate the discrimination of the model, we also tested its performance in a validation cohort consisting of 40 patients (Figure 3).
Figure 3
Receiver operating characteristic (ROC) curves for the training cohort and validation cohort. Radiomic features combined with clinical variables had the potential to predict EGFR mutation. (The AUC for the training cohort was 0.8618. The AUC for the validation cohort was 0.8725).
In addition, we calculated the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy to assess the model's performance (Table 2).
Table 2
Diagnostic Accuracy of Patients in the Primary Cohort and Validation Cohort
| Data | Sensitivity (%) | Specificity (%) | Positive Predictive Value (%) | Negative Predictive Value (%) | Accuracy (%) |
| --- | --- | --- | --- | --- | --- |
| Training cohort | 65.2 (43/66) | 86.5 (64/74) | 81.1 (43/53) | 73.6 (64/87) | 76.4 (107/140) |
| Validation cohort | 90.0 (18/20) | 55.0 (11/20) | 66.7 (18/27) | 84.6 (11/13) | 72.5 (29/40) |
| Total | 70.9 (61/86) | 79.8 (75/94) | 76.3 (61/80) | 75.0 (75/100) | 75.6 (136/180) |
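The entries of Table 2 follow directly from the confusion-matrix counts given in parentheses. As a sketch, the function below recomputes the five measures from the true-positive, false-negative, true-negative, and false-positive counts; the counts themselves are derived from the fractions in Table 2 (for example, 43 of 66 mutated patients correctly identified in the training cohort implies 23 false negatives), and the function name is illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # detected among EGFR-mutated patients
        "specificity": tn / (tn + fp),   # rejected among wild-type patients
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Counts derived from the fractions reported in Table 2.
training = diagnostic_metrics(tp=43, fn=23, tn=64, fp=10)
validation = diagnostic_metrics(tp=18, fn=2, tn=11, fp=9)

for name, metrics in (("Training", training), ("Validation", validation)):
    summary = ", ".join(f"{key} {value * 100:.1f}%" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

Note the trade-off visible in the validation cohort: sensitivity is high (18/20) while specificity is modest (11/20), which the single accuracy figure does not show.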
ROC Curves Analysis for Radiomic Features and Clinical Predictors
The model revealed that gender, smoking status, clinical stage, and histological subtype were independent predictors of EGFR mutation. However, a model developed from these features alone showed poor performance (Figure 4). For the model that included both radiomic and clinical features, the AUC increased from 0.62 to 0.85 when the radiomic features were added (P < .001). In addition, EGFR mutation status was not related to age (P = .4161).
Figure 4
ROC curves were depicted to describe the discrimination between the radiomic features and clinical features. The blue line shows the model developed by the radiomic features and clinical features (AUC = 0.8497). Others presented in the model were built only by the clinical features.
To illustrate the potential for prediction of EGFR mutation, we compared the models developed from radiomic features, clinical variables, and their combination (Figure 5). The ROC curves showed good performance and generalization for the model built from radiomic features: the AUC was 0.7598 in the primary cohort and 0.77 in the validation cohort. However, the model built from clinical variables showed poor performance in validation (AUC = 0.5075). When the model was built from both radiomic features and clinical variables, the AUC was 0.8618 in the training cohort and 0.8725 in the validation cohort, indicating good performance for prediction of EGFR mutation.
Figure 5
(a) Models developed by radiomics features, clinical variables, and combination of them in the training cohort; (b) Models developed by radiomics features, clinical variables, and combination of them in the validation cohort.
Analysis of An Individualized Prediction Model
Multivariate logistic regression analysis identified the clinical stage, age, gender, smoking status, and Rad-signature as independent predictors (Table 2). The individualized EGFR mutation prediction model that consisted of the above independent predictors was visualized by the nomogram (Figure 6).
Figure 6
The nomogram was depicted to present the relationship between radiomic features and clinical features and visually show the potential ability individually. The nomogram was built in the training cohort, with the stage, age (range is from 30 to 82 years), gender, smoking status (the number of cigarettes smoked in 1 year) and rad_signatures.
Validation of the Radiomic Nomogram with Calibration Curves
Calibration curves were plotted to describe the performance of the nomogram in the primary cohort, in combination with the Hosmer-Lemeshow test [21]. To quantify the discrimination performance of the radiomic nomogram, Harrell's concordance index (C-index) was calculated and 1000 bootstrap resamples for validation were used to calculate a relatively corrected C-index (Figure 7).
Figure 7
Calibration curves for the nomogram by radiomic signature and clinical predictors. (a) Calibration curve was depicted in the training cohort to test the model prediction ability for EGFR mutation. The sample size was 140. The mean absolute error was 0.064. The C-index was 0.808 (95% CI: 0.739–0.876). (b) Calibration curve for the validation cohort. The sample was 40 and mean absolute error was 0.065. The C-index was 0.905 (95% CI: 0.819–0.991). X-axis represents the nomogram predicted probability of the EGFR mutation. The length of vertical black line on the X-axis represents the distribution of the samples. Y-axis represents the actual EGFR mutation rate in a small set. The diagonal blue line shows an ideal prediction by an optimal model. The red line shows the prediction of the nomogram. The black solid line presents the performance of the calibration curve with multiple sets of the bootstrap (B) to get a higher accuracy for the prediction. The calibration curve was drawn by plotting $P_1$ on the X-axis and $P_2 = [1 + \exp(-(x_1 + a x_2))]^{-1}$ on the Y-axis, where $P_2$ is the real probability, $a = \operatorname{logit}(P_1)$, $P_1$ is the predicted probability, $x_1$ is the calibration intercept, and $x_2$ is the estimation of the slope.
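The calibration formula in the Figure 7 caption maps a predicted probability through a logistic recalibration with an intercept and a slope. The sketch below evaluates that mapping; the function and argument names are illustrative, and the intercept and slope values are hypothetical since the fitted values are not reported in the text. With an intercept of 0 and a slope of 1, the curve reduces to the ideal diagonal.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def calibrated_probability(predicted, intercept, slope):
    """P2 = 1 / (1 + exp(-(intercept + slope * logit(P1))))."""
    a = logit(predicted)
    return 1.0 / (1.0 + math.exp(-(intercept + slope * a)))

# Perfect calibration: intercept 0 and slope 1 return the input unchanged.
ideal = calibrated_probability(0.3, intercept=0.0, slope=1.0)

# Hypothetical slope below 1: extreme predictions are pulled toward 0.5,
# the typical signature of an overfitted model.
shrunk = calibrated_probability(0.9, intercept=0.0, slope=0.5)

print(f"ideal = {ideal:.3f}, shrunk = {shrunk:.3f}")
```

Plotting `calibrated_probability` over a grid of predicted probabilities would reproduce the shape of a fitted calibration curve for a given intercept and slope.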
Discussion
In this study, we defined 485 radiomic features extracted from CT images and combined them with traditional clinical characteristics to predict EGFR mutation status. We used the LASSO algorithm and 10-fold cross-validation to shrink all of the features to 10 potential predictors based on the 140 patients in the training cohort. The features used in the model are shown in detail in Supplementary material Table S5, and the Rad-score (Supplementary material Figure S5) was calculated to reflect the potential risk of EGFR mutation. We also investigated smokers and estimated the number of cigarettes smoked in 1 year, which was used for the nomogram. Furthermore, we found that the radiomic features, together with the clinical features smoking status, gender, and histological subtype, had potential discriminative ability for EGFR mutation status.
Investigating gene mutation by extracting a large number of high-dimensional, minable image features and combining them with clinical variables is a new idea. Recently, Gillies et al. focused on the relationship between semantic CT features and EGFR mutation, and built a model using multiple logistic regression for ROC analysis (AUC = 0.778) [13]. We developed a model with a more complex approach using radiomic features and obtained an AUC of 0.8618. Moreover, we created a radiomic signature-based nomogram for individualized mutation prediction. The nomogram included age, clinical stage, gender, smoking status, and Rad_signature. The Rad_signature successfully stratified patients by EGFR mutation status. The nomogram combined the Rad_signature and clinical risk factors into an easy-to-use tool for individualized prediction of EGFR mutation. In addition, the calibration curves, plotted to assess the performance of the radiomics nomogram for the probability of EGFR mutation, demonstrated good agreement between prediction and observation in the training and validation cohorts.
Currently, the widely used ARMS approach is available only with surgery, which is painful and expensive. An efficient, non-invasive way to detect EGFR mutation status is needed so that more patients can decide whether to receive EGFR TKI treatment. Prior work has documented that some clinical factors and a few semantic image features are linked to EGFR mutation, such as smoking status, histological subtype, gender, ground-glass opacity (GGO), spiculated margin, pleural retraction, and air bronchogram [13], [22], [23], [24]. Furthermore, Ozkan et al. investigated only the relationship between CT gray-level texture features and EGFR mutations, in a relatively small sample of 45 patients [25].
However, these studies investigated only clinical features or focused on a limited number of image features, and their findings were not always consistent. A potential reason is that some features are defined by radiologists and may be evaluated against different standards depending on subjective, experience-based judgment. In addition, a few image features are insufficient to quantify tumor phenotype. Given these deficiencies, we focused on the recently emerging radiomics approach, predicting EGFR mutation status by extracting minable image features and applying a series of rigorous statistical verifications.
Although our conclusions were encouraging, some limitations should be discussed. First, the sample size was small. To make the study meaningful, we selected eligible patients from 1476 candidates. However, only 180 of the 1476 candidates met the inclusion criteria, and most patients had adenocarcinoma, which made it impossible to determine a relationship between EGFR mutation and histological subtype. In addition, the results revealed that sensitivity and specificity were influenced by sample size. Our findings are encouraging and should be studied in a larger cohort, possibly in combination with the now prevalent deep learning approach. Second, EGFR mutations are more common in Asian patients (47%), such as those in China. The results therefore lack universality, and we encourage researchers to validate them in Western and other populations. Third, although we analyzed EGFR mutation status using a multivariable logistic regression model with ROC curves, a nomogram, and calibration curves, which are widely accepted in the field of medical image analysis, further work should compare this approach with other machine learning methods.
This approach optimizes predictor selection by accounting for the contribution of potential features, and combines the panel of radiomic features into a Rad_signature. Multimarker analyses that incorporate individual markers into one panel and present them using a nomogram have been embraced in recent studies [26]. For example, Huang et al. presented a radiomics nomogram that included the radiomics signature, CT-reported lymph node status, and carcinoembryonic antigen level, which can be used for individualized prediction of lymph node metastasis in patients with colorectal cancer [16].
EGFR TKI is an efficient first-line treatment for lung cancer patients with EGFR mutations, and provides longer progression-free survival and better quality of life compared with chemotherapy. The median overall survival can be prolonged from approximately 8 months to 2 years [27], [28]. Thus, identifying the presence of the EGFR mutation is of great importance. Further studies are therefore necessary to explore and verify the radiomic features in a multi-modality setting, such as positron emission tomography (PET) and magnetic resonance imaging (MRI), with accurate and authentic EGFR mutation detection. In addition, combining the radiomics method with other -omics, such as proteomics and genomics, should be explored. We expect that our method can help radiologists in clinical practice by providing further discriminative, minable findings on molecular phenotypes, and can help translate the deep interpretation of images into clinical practice for disease diagnosis and treatment.