Xuemei Huang1, Yingli Sun1, Mingyu Tan1, Weiling Ma1, Pan Gao1, Lin Qi1, Jinjuan Lu1, Yuling Yang1, Kun Wang1, Wufei Chen1, Liang Jin1, Kaiming Kuang2, Shaofeng Duan3, Ming Li1.
Abstract
OBJECTIVES: EGFR testing is a mandatory step before targeted therapy for non-small cell lung cancer patients. This study combines quantifiable features to establish a predictive model of EGFR expression status and thereby overcome the limitations of tissue biopsy.
Keywords: EGFR; NSCLC; deep learning; machine learning; radiogenomics; tomography
Year: 2022 PMID: 35186727 PMCID: PMC8848731 DOI: 10.3389/fonc.2022.772770
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Schematic for the models’ construction. CT, Computed Tomography; VOI, Volume Region of Interest; Light GBM, Light Gradient Boosting Machine; Res-Net, Residual Network; Modelclinical incorporated clinical-radiology features, Modelradiomic incorporated radiomic features, Modelradiomic+clinical combined clinical-radiology and radiomic features, ModelCNN incorporated deep learning features, and ModelCNN+radiomic+clinical combined clinical-radiology, radiomic, and deep learning features.
The distribution of clinical-radiology features for EGFR mutant and wild type cases in the training set.
| Characteristics | EGFR wild | EGFR mutation |
|---|---|---|
| Gender | | |
| Male | 176 (46.6) | 142 (36.2) |
| Female | 202 (53.4) | 250 (63.8) |
| Age | 57.5 (18.0) | 60.0 (16.0) |
| | | |
| Non-invasive | 64 (16.9) | 27 (6.9) |
| Micro-invasive | 158 (41.8) | 131 (33.4) |
| Invasive | 156 (41.3) | 234 (59.7) |
| | | |
| RUL | 137 (36.2) | 145 (37.0) |
| RML | 23 (6.1) | 39 (9.9) |
| RLL | 64 (16.9) | 59 (15.1) |
| LUL | 98 (25.9) | 103 (26.3) |
| LLL | 56 (14.8) | 46 (11.7) |
| | | |
| Pure GGO | 72 (19.0) | 29 (7.4) |
| Mixed GGO | 221 (58.5) | 313 (79.8) |
| Solid | 85 (22.5) | 50 (12.8) |
| | | |
| Well-defined | 271 (71.7) | 254 (64.8) |
| Less-defined | 61 (16.1) | 79 (20.2) |
| Ill-defined | 46 (12.2) | 59 (15.1) |
| | | |
| Present | 112 (29.6) | 202 (51.5) |
| Absent | 266 (70.4) | 190 (48.5) |
| | | |
| Present | 120 (31.7) | 194 (49.5) |
| Absent | 258 (68.3) | 198 (50.5) |
| | | |
| Short | 77 (20.4) | 99 (25.3) |
| Deep | 26 (6.9) | 32 (8.2) |
| Mixed | 60 (15.9) | 81 (20.7) |
| Absent | 215 (56.9) | 180 (45.9) |
| | | |
| Shallow | 125 (33.1) | 114 (29.1) |
| Deep | 4 (1.1) | 11 (2.8) |
| Mixed | 248 (65.6) | 263 (67.1) |
| Absent | 1 (0.3) | 4 (1.0) |
| | | |
| Present | 46 (12.2) | 76 (19.4) |
| Absent | 332 (87.8) | 316 (80.6) |
| | | |
| Present | 191 (50.5) | 204 (52.0) |
| Absent | 187 (49.5) | 188 (48.0) |
| | | |
| Present | 133 (35.2) | 187 (47.7) |
| Absent | 245 (64.8) | 205 (52.3) |
| | | |
| Present | 29 (7.7) | 59 (15.1) |
| Absent | 349 (92.3) | 333 (84.9) |
| | | |
| Yes | 172 (45.5) | 146 (37.2) |
| No | 206 (54.5) | 246 (62.8) |
RUL, right upper lobe; RML, right middle lobe; RLL, right lower lobe; LUL, left upper lobe; LLL, left lower lobe; GGO, ground glass opacity; Categorical variables (e.g. gender) are expressed by a number (percentage), continuous variables (e.g. age) are expressed by the Median (interquartile range). *p<0.05 (significant), P-values taken with three decimal places equal to 0.000 are expressed as <0.001.
The bolded values in the left column refer to the clinical-radiological features included in the statistical analysis of this study, and the bolded values in the right column refer to the P values, with P less than 0.05 as the criterion to evaluate whether they are statistically significant and whether they are included in the subsequent statistical sub-analysis. The data are bolded for the purpose of making them more prominent and clear only.
Statistical analysis outcome of clinical-radiology characteristics.
| Selected Features | Univariate Analysis: Z or χ2 | Multivariate Analysis: Regression coefficient |
|---|---|---|
| | 8.481 | -0.649 |
| | -2.826 | 0.015 |
| | 32.923 | -1.158 |
| | 42.991 | 1.510 |
| | 38.221 | 0.571 |
| | 25.088 | 0.251 |
| | 9.348 | 0.313 |
| | 7.520 | -0.506 |
| | 12.418 | 0.145 |
| | 10.352 | 0.481 |
| | 5.413 | -0.335 |
Univariate Analysis: Continuous variables were analyzed using Mann-Whitney U test. Categorical variables were analyzed using chi-square tests or Fisher's exact test.
Features with bolded numbers of the P-value column are independent predictors.
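As an illustration of the chi-square test named in the univariate note, the sketch below computes a Pearson chi-square statistic (without continuity correction) from the Male/Female counts in the training-set table. The result agrees with the first value listed in the Z or χ2 column (8.481). The function names `pearson_chi2` and `p_value_dof1` are hypothetical helpers written for this example, and the pure-Python calculation is an assumption about how such a statistic can be obtained, not a description of the software the authors used.

```python
import math

def pearson_chi2(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts.

    Pearson chi-square without continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

def p_value_dof1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Rows: Male, Female; columns: EGFR wild, EGFR mutation (training-set counts).
gender_counts = [[176, 142], [202, 250]]
chi2, dof = pearson_chi2(gender_counts)
p = p_value_dof1(chi2)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.4f}")
```

The closed-form p-value applies only to 2 x 2 tables (one degree of freedom); tables with more categories would need a general chi-square survival function.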
Figure 2. Performance evaluation of the models in the test set. (A) Receiver Operating Characteristic curve; (B) Precision-Recall curve. ‘CNN+Clinical+Radiomic’ refers to ModelCNN+radiomic+clinical, ‘Clinical+Radiomic’ refers to Modelradiomic+clinical, ‘Radiomic’ refers to Modelradiomic, ‘Clinical’ refers to Modelclinical, and ‘CNN’ refers to ModelCNN.
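For readers less familiar with the two curve types in this figure, the following minimal sketch shows how the area under a ROC curve and a single precision-recall point can be computed from predicted scores. The labels and scores are made-up toy values, not data from the study, and `roc_auc` and `precision_recall` are hypothetical helper names introduced only for this example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive scores above a random
    negative; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are called positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, yhat in zip(labels, predicted) if y == 1 and yhat)
    fp = sum(1 for y, yhat in zip(labels, predicted) if y == 0 and yhat)
    fn = sum(1 for y, yhat in zip(labels, predicted) if y == 1 and not yhat)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: 1 = EGFR mutant, 0 = wild type (hypothetical values).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(toy_labels, toy_scores)
precision, recall = precision_recall(toy_labels, toy_scores, threshold=0.5)
print(f"AUC = {auc:.3f}, precision = {precision:.2f}, recall = {recall:.2f}")
```

Sweeping the threshold over all observed scores and collecting the resulting (recall, precision) pairs would trace out a precision-recall curve of the kind shown in panel (B).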