Bihong T Chen1, Zikuan Chen1, Ningrong Ye1, Isa Mambetsariev2, Jeremy Fricke2, Ebenezer Daniel1, George Wang1, Chi Wah Wong3, Russell C Rockne4, Rivka R Colen5,6, Mohd W Nasser7, Surinder K Batra7, Andrei I Holodny8, Sagus Sampath9, Ravi Salgia2.
Abstract
Lung cancer can be classified into two main categories: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), which differ in treatment strategy and survival probability. The lung CT images of SCLC and NSCLC are so similar that their subtle differences are hardly discernible by the human eye through conventional imaging evaluation. We hypothesize that SCLC/NSCLC differentiation could be achieved via computerized image feature analysis and classification in feature space, an approach termed a radiomic model. The purpose of this study was to use CT radiomics to differentiate SCLC from NSCLC adenocarcinoma. Patients with primary lung cancer, either SCLC or NSCLC adenocarcinoma, were retrospectively identified. The post-diagnosis pre-treatment lung CT images were used to segment the lung cancers. Radiomic features were extracted from histogram-based statistics, textural analysis of tumor images and their wavelet transforms. A minimal-redundancy-maximal-relevance method was used for feature selection. The predictive model was constructed with a multilayer artificial neural network. The performance of the SCLC/NSCLC adenocarcinoma classifier was evaluated by the area under the receiver operating characteristic curve (AUC). Our study cohort consisted of 69 primary lung cancer patients with SCLC (n = 35; age mean ± SD = 66.91 ± 9.75 years) and NSCLC adenocarcinoma (n = 34; age mean ± SD = 58.55 ± 11.94 years). The SCLC group had more male patients and smokers than the NSCLC group (P < 0.05). Our SCLC/NSCLC classifier achieved an overall AUC of 0.93 (95% confidence interval = [0.85, 0.97]), with sensitivity = 0.85 and specificity = 0.85. Adding clinical data such as smoking history could improve the performance slightly. The top-ranking radiomic features were mostly textural features.
Our results showed that CT radiomics could quantitatively represent tumor heterogeneity and therefore could be used to differentiate primary lung cancer subtypes with satisfactory results. CT image processing with the wavelet transformation technique enhanced the radiomic features for SCLC/NSCLC classification. Our pilot study should motivate further investigation of radiomics as a non-invasive approach for early diagnosis and treatment of lung cancer.
Keywords: artificial neural network; computed tomography radiomics (CT Radiomics); non-linear classifier; non-small cell lung cancer (NSCLC); small cell lung cancer (SCLC)
Year: 2020 PMID: 32391274 PMCID: PMC7188953 DOI: 10.3389/fonc.2020.00593
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Schema for lung cancer segmentation, radiomic feature extraction and predictive modeling. (A) Representative CT images from small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) showing tumor segmentation. (B) Illustrations of radiomic feature extraction for texture, shape, and intensity. (C) Decision of SCLC/NSCLC classification (upper panel) with the receiver operating characteristic (ROC) curves (middle panel) and the heat map of radiomic features (lower panel).
Figure 2. The neural network (nnet) architecture of the radiomics-based SCLC/NSCLC classifier. This figure presents the input layer with 20 nodes receiving 20 radiomic features, the 3 hidden layers for non-linear mapping, and the output layer with 2 nodes for the “SCLC” and “NSCLC” decisions, made by hard thresholding at f(node)>0 and f(node)≤0, respectively. SCLC, small cell lung cancer; NSCLC, non-small cell lung cancer.
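The architecture in Figure 2 can be illustrated with a minimal forward-pass sketch. The hidden-layer widths (10 nodes each), the tanh activation, the random weights, and the choice to threshold the first output node are all assumptions made for illustration; the caption specifies only 20 inputs, 3 hidden layers, 2 outputs, and a hard threshold at 0.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and biases for one dense layer (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def forward(x, layers):
    """Propagate a feature vector through tanh layers; outputs lie in [-1, 1]."""
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

def classify(outputs):
    """Hard threshold on the first output node: > 0 -> SCLC, otherwise NSCLC."""
    return "SCLC" if outputs[0] > 0 else "NSCLC"

rng = random.Random(0)
sizes = [20, 10, 10, 10, 2]  # 20 inputs, 3 hidden layers (widths assumed), 2 outputs
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
features = [rng.uniform(-1, 1) for _ in range(20)]  # stand-in for 20 normalized features
outputs = forward(features, layers)
label = classify(outputs)
print(outputs, label)
```

In a trained network the weights and biases would come from the training procedure rather than from a random generator.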
Table 1. Patient demographic data.
| Characteristic | SCLC (n = 35) | NSCLC (n = 34) | P value |
| --- | --- | --- | --- |
| Gender | | | 0.01 |
| Male | 24 (68.57%) | 12 (35.29%) | |
| Female | 11 (31.42%) | 22 (64.70%) | |
| Age (years) | | | 0.002 |
| Mean ± SD | 66.91 ± 9.75 | 58.55 ± 11.94 | |
| History of Smoking | | | <0.001 |
| Yes | 34 (97.14%) | 9 (26.47%) | |
| No | 1 (2.86%) | 25 (73.53%) | |
| Race | | | 0.03 |
| Asian | 7 (20.00%) | 16 (47.05%) | |
| Caucasian | 26 (74.29%) | 15 (44.12%) | |
| Other | 2 (5.71%) | 3 (8.82%) | |
NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer.
Figure 3. The top 20 features selected from the radiomic data set (1,731 features in total) for the small cell lung cancer (SCLC) / non-small cell lung cancer (NSCLC) classification. (A) Measurements for the top 20 features. Each feature (matrix row) consisted of 35 SCLC measurements (index 1:35) and 34 NSCLC measurements (index 36:69). Each feature vector was normalized so that its maximum was 1. (B) Mutual information map for the top 20 features. A large mutual information value indicates high redundancy between the features.
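As a rough illustration of the redundancy measure in panel (B), mutual information between two feature vectors can be estimated from a joint histogram. This is a minimal sketch: the equal-width binning, the bin count of 4, and the toy vectors are assumptions, and the estimator actually used in the study is not stated here.

```python
import math
from collections import Counter

def discretize(values, n_bins=4):
    """Map continuous values to equal-width bin indices 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y, n_bins=4):
    """Histogram estimate of I(X;Y) in bits for two equal-length vectors."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), count in pxy.items():
        p_ij = count / n
        mi += p_ij * math.log2(p_ij / ((px[i] / n) * (py[j] / n)))
    return mi

# Toy feature vectors (not study data).
a = [0.1, 0.2, 0.4, 0.5, 0.7, 0.9, 1.0, 0.3]
b = [0.2, 0.1, 0.5, 0.4, 0.8, 1.0, 0.9, 0.6]
constant = [1.0] * 8
print(mutual_information(a, a), mutual_information(a, b), mutual_information(a, constant))
```

A feature shares the most information with itself and none with a constant vector, which is why large off-diagonal values in such a map point to redundant feature pairs.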
Table 2. Top 20 features for SCLC/NSCLC classification in descending order of feature correlation with the target vector.
| With clinical data | Without clinical data |
| --- | --- |
| GLSZM.ZSN @ WT(HHH) | GLSZM.ZSN @ WT(HHH) |
| NGTDM.complex @ WT(LHL) | NGTDM.complex @ WT(LHL) |
| Global.range @ WT(LHH) | Global.range @ WT(LHH) |
| Smoking @ Clinic | GLSZM.SZLGE @ WT(LLH) |
| GLSZM.SZLGE @ WT(LLH) | GLSZM.LGZE @ WT(HHL) |
| GLSZM.LGZE @ WT(HHL) | GLSZM.ZSN @ WT(LHH) |
| GLSZM.ZSN @ WT(LHH) | GLSZM.SZLGE @ WT(HHH) |
| GLSZM.SZLGE @ WT(HHH) | GLSZM.SZLGE @ WT(HLH) |
| GLSZM.SZLGE @ WT(HLH) | Global.variance @ WT(HLH) |
| Global.variance @ WT(HLH) | Global.kurt @ WT(HLH) |
| Global.kurt @ WT(HLH) | GLSZM.GLN @ WT(LHH) |
| GLSZM.GLN @ WT(LHH) | GLSZM.ZSN @ rawNg=32 |
| GLSZM.ZSN @ rawNg=32 | GLSZM.ZP @ WT(HLH) |
| GLSZM.ZP @ WT(HLH) | GLSZM.ZSN @ WT(LLL) |
| GLSZM.ZSN @ WT(LLL) | Global.mean @ WT(HHL) |
| Global.mean @ WT(HHL) | NGTDM.complex @ WT(LHH) |
| NGTDM.complex @ WT(LHH) | Global.max @ WT(LLH) |
| Global.max @ WT(LLH) | NGTDM.contrast @ WT(HLH) |
| NGTDM.contrast @ WT(HLH) | GLSZM.ZSV @ WT(LHH) |
| GLSZM.ZSV @ WT(LHH) | GLSZM.ZP @ rawNg=16 |
@, feature derived from image; GLCM, gray-level co-occurrence matrix; Global, whole image statistics (no spatial attributes); GLN, gray-level non-uniformity; GLRLM, gray-level run-length matrix; GLSZM, gray-level size zone matrix; LRLGE, long run low gray-level emphasis; raw, original CT image (no wavelet transform); SZLGE, small zone low gray-level emphasis; WT(xxx), wavelet-transform with 8 subbands (LLL, LLH, LHL, LHH, HLL, HLH, HHL, or HHH); ZP, zone percentage; ZSN, zone size non-uniformity; ZSV, zone size variance.
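The eight WT(xxx) subband labels arise from applying a low-pass (L) or high-pass (H) filter along each of three image axes. The sketch below shows this on a single 2×2×2 block using the Haar wavelet; the wavelet family, the mapping of letter position to axis, and the toy voxel values are assumptions, since the footnote does not specify them.

```python
import math
from itertools import product

def haar_subbands_2x2x2(block):
    """One-level 3D Haar transform of a 2x2x2 block -> {subband name: coefficient}."""
    coeffs = {}
    for name in map("".join, product("LH", repeat=3)):
        total = 0.0
        for i, j, k in product(range(2), repeat=3):
            sign = 1
            # A high-pass filter on an axis subtracts the second sample along that axis.
            for filt, idx in zip(name, (i, j, k)):
                if filt == "H" and idx == 1:
                    sign = -sign
            total += sign * block[i][j][k]
        # Each 1D Haar step scales by 1/sqrt(2); three axes give 1/sqrt(8).
        coeffs[name] = total / math.sqrt(8)
    return coeffs

# Toy voxel intensities (not study data).
block = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
coeffs = haar_subbands_2x2x2(block)
flat = haar_subbands_2x2x2([[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]])
print(coeffs)
```

For a perfectly uniform block only the LLL coefficient is non-zero, so the other seven subbands respond purely to local intensity variation, which is the kind of detail texture features are computed on.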
Figure 4. Scatter plots of the top 20 feature measurements from the dataset of 69 patients. All feature measurements were normalized to the range [−1, 1] (i.e., max = 1). The correlation (corr) value indicates the correlation between the feature vector and the target vector (SCLC = 1, NSCLC = 0). The notations for the selected features are presented in Table 2.
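The normalization and the corr value in this caption can be sketched as follows. This is one reading of the caption, assuming division by the maximum absolute value and a Pearson correlation against the binary target; the feature values are toy numbers, not study measurements.

```python
import math

def normalize_max_abs(values):
    """Scale a feature vector so its largest absolute value is 1."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy example: 3 "SCLC" cases (target 1) followed by 3 "NSCLC" cases (target 0).
feature = [4.0, 3.5, 5.0, 1.0, -0.5, 0.5]
target = [1, 1, 1, 0, 0, 0]
normalized = normalize_max_abs(feature)
corr = pearson(normalized, target)
print(normalized, corr)
```

Because the Pearson coefficient is invariant to positive rescaling, the normalization affects only how the scatter plots are drawn, not the ranking of features by correlation.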
Figure 5. Two scenarios demonstrating the nnet “training-validating-testing” performance. Upper: a case with 1 misclassification; lower: a case with no misclassification. Panels a1 and a2 present the nnet training behaviors under random initial settings (w: weight; b: bias); panels b1 and b2 present the output node values (black dots, in the range [−1, 1]) in reference to the target setting (SCLC = 1, NSCLC = −1); and panels c1 and c2 present the confusion matrices. SCLC, small cell lung cancer; NSCLC, non-small cell lung cancer.
Figure 6. Receiver operating characteristic (ROC) curve performance for the SCLC/NSCLC neural network classifications with the clinical data (A) and without the clinical data (B). The average ROC plot is the average over 30 ROC trials with random initializations for the classifier. AUC, area under the ROC curve; FPR, false positive rate; TPR, true positive rate; CI, confidence interval.
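For orientation, the AUC summarized in Figure 6 equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The sketch below computes it by pairwise comparison and averages it over 30 trials. The scores are synthetic Gaussian draws standing in for classifier outputs, and averaging per-trial AUC values is a simplification of averaging the ROC curves themselves.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

labels = [1] * 35 + [0] * 34  # cohort sizes from the study: 35 SCLC, 34 NSCLC
rng = random.Random(0)
trial_aucs = []
for _ in range(30):  # 30 trials, mirroring the 30 random initializations
    # Synthetic scores: positives are shifted upward by one standard deviation.
    scores = [rng.gauss(1.0 if l == 1 else 0.0, 1.0) for l in labels]
    trial_aucs.append(auc(scores, labels))
mean_auc = sum(trial_aucs) / len(trial_aucs)
print(mean_auc)
```

With real classifier outputs, `scores` would be the output-node values for the test cases in each trial.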