Kyongmin Sarah Beck1, Bomi Gil1, Sae Jung Na1, Ji Hyung Hong2, Sang Hoon Chun2, Ho Jung An2, Jae Jun Kim3, Soon Auck Hong4, Bora Lee5, Won Sang Shim5, Sungsoo Park5, Yoon Ho Ko2,6.
Abstract
Predicting lymphovascular invasion (LVI) or pathological nodal involvement is critical for successful treatment of early-stage non-small cell lung cancer (NSCLC). We developed and validated a Deep Cubical Nodule Transfer Learning Algorithm (DeepCUBIT), which uses transfer learning and a 3D Convolutional Neural Network (CNN) to predict LVI or pathological nodal involvement on chest CT images. A total of 695 preoperative CT images of resected NSCLC with a tumor size of 3 cm or less, acquired from 2008 to 2015, were used to train and validate the DeepCUBIT model using five-fold cross-validation. We also used tumor size and consolidation to tumor ratio (C/T ratio) to build a support vector machine (SVM) classifier. Of the 695 samples, 254 (36.5%) had LVI or nodal involvement. An integrated model (3D CNN + Tumor size + C/T ratio) showed a sensitivity of 31.8%, specificity of 89.8%, accuracy of 76.4%, and AUC of 0.759 on the external validation cohort. Three single SVM models, using 3D CNN (DeepCUBIT), tumor size, or C/T ratio, showed AUCs of 0.717, 0.630, and 0.683, respectively, on the external validation cohort. DeepCUBIT was the best single model, outperforming the models using only C/T ratio or tumor size. In addition, the DeepCUBIT model significantly stratified the prognosis of patients with resected NSCLC, even in stage I. DeepCUBIT, using transfer learning and a 3D CNN, can accurately predict LVI or nodal involvement in cT1-size NSCLC on CT images. Thus, it can enable a more accurate selection of candidates who will benefit from limited surgery without increasing the risk of recurrence.
Keywords: computed tomography; deep learning; lobectomy; non-small cell lung cancer; prognosis
Year: 2021 PMID: 34290979 PMCID: PMC8287408 DOI: 10.3389/fonc.2021.661244
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
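The sensitivity, specificity, and accuracy quoted in the abstract all derive from the four cells of a confusion matrix. The sketch below shows how they are computed; the counts are hypothetical values chosen for illustration and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of true positives detected
    specificity = tn / (tn + fp)                 # share of true negatives detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only (not from the paper).
sens, spec, acc = binary_metrics(tp=8, fp=10, tn=70, fn=12)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

A pattern of low sensitivity with high specificity, as reported for the integrated model, means that positive calls are rarely wrong but many true positives are missed at the chosen decision threshold.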
Figure 1. Data criteria and specification.
Baseline characteristics of the training, validation, and external validation cohorts.
| Characteristic | | Total (n = 695) | Training and validation cohort (n = 600) | External validation cohort (n = 95) | P-value |
|---|---|---|---|---|---|
| Age (years)* | Mean ± SD | 63.0 ± 9.7 | 63.0 ± 9.8 | 63.6 ± 9.5 | 0.551 |
| Sex | Male | 361 | 308 (51.3%) | 53 (55.8%) | 0.419 |
| | Female | 334 | 292 (48.7%) | 42 (44.2%) | |
| Smoking history | Never | 412 | 349 (58.2%) | 63 (66.3%) | 0.281 |
| | Current | 128 | 112 (18.7%) | 16 (16.8%) | |
| | Former | 155 | 139 (23.1%) | 16 (16.8%) | |
| Histology | AC | 471 | 395 (65.8%) | 76 (80.0%) | 0.005 |
| | SqCC | 123 | 108 (18.0%) | 15 (15.8%) | |
| | Others | 101 | 97 (16.2%) | 4 (4.2%) | |
| Tumor size (cm)† | | 2.0 (1.6-2.6) | 2.0 (1.5-2.6) | 2.1 (1.7-2.6) | 0.184 |
| C/T ratio† | | 1.0 (0.5-1.0) | 1.0 (0.5-1.0) | 1.0 (0.7-1.0) | 0.008 |
| LVI or nodal involvement | Yes | 254 | 232 (38.7%) | 22 (23.2%) | 0.004 |
| | No | 441 | 368 (61.3%) | 73 (76.8%) | |
*Data are mean ± SD.
†Measured on CT image and data are median (with interquartile range in parentheses).
C/T ratio, consolidation to tumor ratio; LVI, lymphovascular invasion; AC, adenocarcinoma; SqCC, squamous cell carcinoma.
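The last value in each row of the baseline table compares the training/validation cohort with the external validation cohort. The source does not name the statistical test, so the sketch below assumes a Pearson chi-square test without continuity correction and applies it to the LVI or nodal involvement counts (232/368 versus 22/73). The function name `chi2_2x2` is introduced here for illustration.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: training/validation cohort, external cohort.
# Columns: LVI or nodal involvement yes, no (counts from the baseline table).
chi2, p = chi2_2x2(232, 368, 22, 73)
print(f"chi2={chi2:.2f} p={p:.4f}")
```

Under this assumption the statistic is about 8.5 and the p-value falls between 0.003 and 0.004, in line with the 0.004 reported for that row.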
Figure 2. Evaluation pipeline for the proposed model.
Figure 3. Overall process of the DeepCUBIT algorithm. (A) Pre-training process: nodule samples and malignancy samples are used for pre-training. (B) Deep transfer learning: LVI or nodal involvement is predicted by fine-tuning the model initialized with the pre-trained weights. (C) Prediction of LVI or nodal involvement: feature integration and prediction of LVI or nodal involvement for the external validation cohort.
Figure 4. Architecture of the DeepCUBIT model.
Performance comparison for transfer learning in Cohort 1 and 2.
| Cohort | P-value | Model | CLF | AUC | CIs (95%) |
|---|---|---|---|---|---|
| | | | NN | | |
| | | Deep 3D CNN without TL | NN | 0.606 | 0.503 |
| | | | NN | | |
| | | Deep 3D CNN without TL | NN | 0.490 | 0.364 |
CNN, Convolutional Neural Network; CLF, Classifier; NN, Neural Network; TL, Transfer Learning; AUC, area under the curve; CIs, Confidence Intervals for AUC score.
Variables with DeepCUBIT model are shown in bold type.
Performance evaluation for test data (Cohort I, average of 5 fold hold-out test set).
| Feature Type | SVM AUC | SVM CIs (95%) | XGBoost AUC | XGBoost CIs (95%) | Random Forest AUC | Random Forest CIs (95%) |
|---|---|---|---|---|---|---|
| Tumor size | 0.657 | 0.558 - 0.751 | 0.621 | 0.522 - 0.720 | 0.577 | 0.473 - 0.684 |
| C/T Ratio | 0.742 | 0.663 - 0.817 | 0.726 | 0.644 - 0.803 | 0.631 | 0.538 - 0.721 |
| Tumor size + C/T Ratio | 0.754 | 0.669 - 0.834 | 0.735 | 0.658 - 0.817 | 0.686 | 0.591 - 0.777 |
| 3D CNN + Tumor size | 0.770 | 0.681 - 0.852 | 0.752 | 0.663 - 0.833 | 0.725 | 0.635 - 0.813 |
CNN, Convolutional Neural Network; DeepCUBIT, Deep Cubical Nodule Transfer Learning Algorithm; C/T Ratio, consolidation to tumor ratio; SVM, Support Vector Machine; AUC, area under the curve; CIs, Confidence Intervals for AUC score.
Variable with DeepCUBIT model is shown in bold type.
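The Cohort I results are averages over five hold-out folds. The sketch below illustrates only the splitting and averaging steps. It assumes the folds are stratified by outcome, which the source does not state, and it uses randomly generated scores as stand-ins for held-out model predictions, since no model is trained here. The class counts (232 positive, 368 negative) come from the baseline table; `stratified_folds` and `auc` are helper names introduced for illustration.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with a similar class ratio in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin across folds
    return folds


# Hypothetical data: labels sized like the training/validation cohort,
# scores drawn at random as placeholders for held-out predictions.
rng = random.Random(42)
labels = [1] * 232 + [0] * 368
scores = [rng.gauss(0.5 if y else 0.0, 1.0) for y in labels]

folds = stratified_folds(labels, k=5)
fold_aucs = [auc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]
mean_auc = sum(fold_aucs) / len(fold_aucs)
print([round(a, 3) for a in fold_aucs], round(mean_auc, 3))
```

In a real cross-validation run, each fold's scores would come from a model fitted on the other four folds.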
Performance evaluation for external validation data (Cohort II, external hold-out set).
| Feature Type | SVM AUC | SVM CIs (95%) | XGBoost AUC | XGBoost CIs (95%) | Random Forest AUC | Random Forest CIs (95%) |
|---|---|---|---|---|---|---|
| Tumor size | 0.630 | 0.502 - 0.749 | 0.634 | 0.510 - 0.752 | 0.606 | 0.476 - 0.729 |
| C/T Ratio | 0.683 | 0.614 - 0.743 | 0.682 | 0.612 - 0.743 | 0.658 | 0.591 - 0.733 |
| Tumor size + C/T Ratio | 0.716 | 0.606 - 0.813 | 0.715 | 0.613 - 0.812 | 0.663 | 0.544 - 0.776 |
| 3D CNN + Tumor size | 0.759 | 0.646 - 0.855 | 0.757 | 0.654 - 0.843 | 0.716 | 0.607 - 0.820 |
CNN, Convolutional Neural Network; DeepCUBIT, Deep Cubical Nodule Transfer Learning Algorithm; C/T Ratio, consolidation to tumor ratio; SVM, Support Vector Machine; AUC, area under the curve; CIs, Confidence Intervals for AUC score.
Variable with DeepCUBIT model is shown in bold type.
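Each AUC on the external hold-out set is reported with a 95% confidence interval. The source does not say how the intervals were obtained; a percentile bootstrap is one common choice, sketched below under that assumption. The scores are randomly generated placeholders, the class counts (22 positive, 73 negative) follow the external cohort in the baseline table, and `bootstrap_auc_ci` is a name introduced for illustration.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC is undefined if a resample lacks one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Hypothetical scores for a cohort shaped like the external validation set.
rng = random.Random(7)
labels = [1] * 22 + [0] * 73
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point_auc = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC={point_auc:.3f} 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

With only 22 positive cases, intervals of this kind are wide, which matches the widths seen in the external validation table.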
Figure 5. Gradient-weighted class activation heatmaps of nodule cubes. (A) Raw intensity, (B) gradient heatmap, and (C) overlay heatmap of a solid tumor with C/T ratio 1.0 show that the area most responsible for the prediction of LVI or nodal involvement is the solid tumor itself, rather than the pleural tag. (D) Raw intensity, (E) gradient heatmap, and (F) overlay heatmap of a part-solid tumor with C/T ratio 0.75 show that the area most responsible for the prediction of LVI or nodal involvement is the interface of the tumor with the adjacent lung parenchyma.
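Gradient-weighted class activation mapping combines a convolutional layer's feature maps into a single heatmap: each channel is weighted by the spatial average of the class-score gradient for that channel, the weighted maps are summed, and negative values are clipped. The sketch below shows only this combination step on tiny hypothetical arrays. It does not reproduce the authors' network, and the source does not say which layer was visualized.

```python
def grad_cam(activations, gradients):
    """Combine per-channel feature maps into one heatmap, Grad-CAM style.

    activations[k] and gradients[k] are flat lists of voxel values for channel k.
    """
    n_vox = len(activations[0])
    # Channel weight = spatial average of the class-score gradient for that channel.
    weights = [sum(g) / len(g) for g in gradients]
    # Weighted sum over channels, then ReLU to keep only positive evidence.
    return [
        max(0.0, sum(w * a[v] for w, a in zip(weights, activations)))
        for v in range(n_vox)
    ]


# Hypothetical two-channel, four-voxel example.
acts = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
grads = [[1.0, 1.0, 1.0, 1.0], [-0.5, -0.5, -0.5, -0.5]]
heatmap = grad_cam(acts, grads)
print(heatmap)  # [0.0, 0.5, 2.0, 3.5]
```

For display, such a map is typically upsampled to the input cube size and overlaid on the raw intensities, as in panels (C) and (F).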
Figure 6. Kaplan–Meier curves according to predicted risk of recurrence for patients with stage I NSCLC in Cohort I (test set only) and Cohort II (105 patients). Curves were obtained using (A) the DeepCUBIT model, (B) an SVM classifier using DeepCUBIT features with tumor size and C/T ratio, (C) an SVM classifier using tumor size alone, and (D) an SVM classifier using C/T ratio alone.
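A Kaplan–Meier curve is a product-limit estimate: at each time with an observed event, the running survival probability is multiplied by the fraction of at-risk patients who did not have the event. The sketch below is a minimal version on hypothetical follow-up data. It is not the authors' analysis code, and `kaplan_meier` is a name introduced for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate as a list of (time, survival) at each event time.

    events[i] is 1 if recurrence was observed at times[i], 0 if censored.
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if d > 0:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (arbitrary units) and event indicators.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

In this toy example, survival steps down at times 1, 2, and 3 to about 0.8, 0.6, and 0.3. Running the estimator separately for the predicted high-risk and low-risk groups gives the two curves compared in each panel.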