Lifeng Xu1,2, Chun Yang2,3, Feng Zhang1, Xuan Cheng2,3, Yi Wei4, Shixiao Fan2,3, Minghui Liu2,3, Xiaopeng He4,5, Jiali Deng2,3, Tianshu Xie2,3, Xiaomin Wang2,3, Ming Liu2,3, Bin Song4.
Abstract
This retrospective study aimed to develop and validate deep-learning-based models for grading clear cell renal cell carcinoma (ccRCC) patients. A cohort of 706 patients with pathologically verified ccRCC was used in this study. A temporal split was applied to validate our models: the first 83.9% of the cases (years 2010–2017) for development and the last 16.1% (years 2018–2019) for validation (development cohort: n = 592; validation cohort: n = 114). We demonstrate a deep learning (DL) framework initialized by a self-supervised pre-training method and developed with a mixed loss strategy and sample reweighting to identify high-grade ccRCC patients. Four types of DL networks were developed separately and then combined with different weights for better prediction. The best single DL model achieved an area under the curve (AUC) of 0.864 in the validation cohort, while the ensembled model yielded the best predictive performance with an AUC of 0.882. These findings confirm that our DL approach performs favorably or comparably to biopsy-based grade assessment of ccRCC while being non-invasive and labor-saving.
Keywords: class imbalance; clear cell renal cell carcinoma; deep learning; label noise; self-supervised learning; tumor grading
Year: 2022 PMID: 35681555 PMCID: PMC9179576 DOI: 10.3390/cancers14112574
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.575
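The abstract's "mixed loss strategy and sample reweighting" are not detailed in this record. A minimal illustrative sketch, assuming a class-weighted binary cross-entropy (up-weighting the minority high-grade class from Table 1) mixed with a squared-error term; the weights `w_pos`, `w_neg`, and the mixing coefficient `alpha` are assumptions, not the paper's values:

```python
import math

def weighted_bce(p, y, w_pos=1.5, w_neg=1.0, eps=1e-7):
    """Class-weighted binary cross-entropy for one prediction.
    p: predicted probability of high grade; y: true label (1 = high grade).
    Up-weights positives to counter class imbalance (illustrative weights)."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    w = w_pos if y == 1 else w_neg
    return -w * (y * math.log(p) + (1 - y) * math.log(1 - p))

def mixed_loss(p, y, alpha=0.7):
    """Hypothetical 'mixed loss': convex combination of the weighted BCE
    and a mean-squared-error term, controlled by alpha."""
    return alpha * weighted_bce(p, y) + (1 - alpha) * (p - y) ** 2
```

Confident predictions on the correct class incur a small loss, while confident mistakes are penalized heavily, with high-grade errors penalized more than low-grade ones.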
Patient characteristics.
| Patient Characteristic | Development Cohort | Validation Cohort |
|---|---|---|
| Number | 592 | 114 |
| CT Images | 9978 | 2491 |
| Male | 374 (63.2%) | 71 (62.3%) |
| Female | 218 (36.8%) | 43 (37.7%) |
| Average Age | 54.9 (±12.1) | 55.8 (±12.1) |
| Acquisition Date | 2010–2017 | 2018–2019 |
| Low-grade | 354 (59.8%) | 76 (66.7%) |
| High-grade | 238 (40.2%) | 38 (33.3%) |
Figure 1. Segmentation model concentrates the CT image’s content on the tumor.
Figure 2. The overall flow of pre-training and development. The top part of the figure shows the pre-training process, in which each original image is expanded into four images by rotation transformation; their labels 0, 1, 2, and 3 indicate that each was obtained by quarter-turning the original image 0, 1, 2, or 3 times clockwise. The bottom part shows the development process, whose network is initialized from the pre-training network.
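The rotation pretext task described in Figure 2 can be sketched as follows; `rotation_pretext` is a hypothetical helper name, and note that `np.rot90` rotates counter-clockwise, so k clockwise quarter turns is `np.rot90(image, -k)`:

```python
import numpy as np

def rotation_pretext(image):
    """Expand one image into four quarter-turn copies with pseudo-labels 0-3,
    where label k means k clockwise quarter turns of the original image.
    These (image, label) pairs supervise the self-supervised pre-training."""
    return [(np.rot90(image, -k), k) for k in range(4)]
```

The network is first trained to predict k from the rotated image, which requires no grading labels, and the resulting weights initialize the grading network.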
Results of different network models and ensemble models in the validation cohort.
| Model | Sen (%) | Spec (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|
| SE_RESNET50 | 85.5 ± 6.6 | 76.3 ± 1.3 | 82.5 ± 4.0 | 86.4 ± 0.2 |
| RESNET101 | 77.6 ± 3.9 | 76.3 ± 4.0 | 77.1 ± 1.3 | 82.2 ± 0.3 |
| REGNET400 | 82.9 ± 4.0 | 72.4 ± 1.3 | 79.4 ± 3.1 | 83.0 ± 0.1 |
| REGNET800 | 84.2 ± 7.9 | 74.3 ± 4.6 | 81.0 ± 3.7 | 85.9 ± 0.3 |
| ENSEMBLE | 85.5 ± 1.3 | 75.0 ± 2.6 | 82.0 ± 0.1 | 88.2 ± 0.6 |
Sen = sensitivity; Spec = specificity; ACC = accuracy; AUC = area under the receiver operating characteristic curve.
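The sensitivity, specificity, and accuracy columns in the tables can be computed from binary predictions; a minimal sketch, treating high-grade as the positive class:

```python
def sens_spec_acc(y_true, y_pred):
    """Sensitivity, specificity, and accuracy with high-grade ccRCC as the
    positive class (1) and low-grade as the negative class (0)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    sen = tp / (tp + fn)   # fraction of high-grade cases detected
    spec = tn / (tn + fp)  # fraction of low-grade cases correctly ruled out
    acc = (tp + tn) / len(y_true)
    return sen, spec, acc
```

AUC, by contrast, is threshold-free: it is computed from the continuous network output probabilities rather than from binarized predictions.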
Figure 3. Receiver operating characteristic (ROC) curves of the four different models and the ensemble model.
Figure 4. Network output probabilities for low-grade and high-grade patients. The left subplot shows the network output probability distributions of low-grade and high-grade patients. The right subplot shows CT images of low-grade and high-grade patients with different network output probabilities.
Figure 5. The probability matrix of patients of four grades being predicted as low-grade or high-grade. The left subplot shows the result in the development cohort; the right subplot shows the result in the validation cohort.
Performance of the four basic models in the validation cohort.
| Model | Sen (%) | Spec (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|
| SE_RESNET50 | 65.8 ± 3.7 | 86.3 ± 3.4 | 72.6 ± 2.0 | 78.0 ± 2.3 |
| RESNET101 | 54.4 ± 13.0 | 85.5 ± 12.4 | 64.8 ± 4.7 | 72.5 ± 0.6 |
| REGNET400 | 65.7 ± 6.4 | 85.1 ± 4.5 | 72.2 ± 2.9 | 76.6 ± 0.5 |
| REGNET800 | 66.4 ± 4.7 | 79.6 ± 2.4 | 70.8 ± 2.5 | 75.8 ± 1.4 |
Performance of the four self-supervised pre-trained models, without the mixed loss strategy and sample reweighting, in the validation cohort.
| Model | Sen (%) | Spec (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|
| SE_RESNET50 | 63.1 ± 2.1 | 90.3 ± 2.7 | 72.2 ± 0.5 | 81.8 ± 0.8 |
| RESNET101 | 68.4 ± 2.6 | 80.3 ± 1.3 | 73.4 ± 0.3 | 81.2 ± 0.6 |
| REGNET400 | 69.3 ± 2.5 | 79.8 ± 1.6 | 72.8 ± 1.3 | 80.8 ± 0.2 |
| REGNET800 | 62.3 ± 2.5 | 93.0 ± 2.5 | 72.5 ± 0.8 | 82.7 ± 0.2 |
Performance of the four basic models, with the mixed loss strategy and sample reweighting, in the validation cohort.
| Model | Sen (%) | Spec (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|
| SE_RESNET50 | 76.2 ± 3.6 | 75.0 ± 1.1 | 75.9 ± 2.2 | 79.2 ± 1.1 |
| RESNET101 | 73.7 ± 2.1 | 76.8 ± 3.3 | 74.7 ± 1.1 | 80.4 ± 0.3 |
| REGNET400 | 72.8 ± 8.9 | 73.2 ± 10.6 | 72.9 ± 2.5 | 79.4 ± 1.1 |
| REGNET800 | 75.0 ± 2.3 | 75.3 ± 3.0 | 75.1 ± 0.6 | 80.0 ± 0.7 |
Comparison of the SE-ResNet50 model performance based on different pre-training methods in the validation cohort.
| Model | Sen (%) | Spec (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|
| ImageNet | 75.0 ± 1.3 | 77.3 ± 3.3 | 75.7 ± 1.2 | 80.3 ± 0.8 |
| Ours | 85.5 ± 6.6 | 76.3 ± 1.3 | 82.5 ± 4.0 | 86.4 ± 0.2 |
Performance of machine learning methods in the validation cohort.
| Model | Sen (%) | Spec (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|
| SVM | 63.2 ± 18.9 | 63.2 ± 17.8 | 63.2 ± 6.7 | 62.5 ± 7.1 |
| KNN | 71.2 ± 16.0 | 54.6 ± 17.9 | 60.3 ± 7.1 | 65.2 ± 2.9 |
| DecisionTree | 96.1 ± 2.9 | 12.8 ± 3.4 | 40.1 ± 1.6 | 54.4 ± 1.0 |
| RandomForest | 61.8 ± 7.8 | 68.8 ± 5.7 | 66.4 ± 1.9 | 68.4 ± 3.1 |
| GradientBoosting | 63.8 ± 11.7 | 75.7 ± 13.0 | 71.7 ± 5.9 | 68.7 ± 4.1 |
| Ours-Ensemble | 85.5 ± 1.3 | 75.0 ± 2.6 | 82.0 ± 0.1 | 88.2 ± 0.6 |
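The "Ours-Ensemble" row combines the four networks "with different weights," per the abstract. A minimal sketch of such a combination; `ensemble_prob` and the equal weights below are placeholders, since the published weights are not given in this record:

```python
def ensemble_prob(model_probs, weights):
    """Weighted average of per-model high-grade probabilities for one patient.
    model_probs: one output probability per network (e.g., SE_ResNet50,
    ResNet101, RegNet400, RegNet800). weights: illustrative, not the paper's."""
    total = sum(weights)
    return sum(p * w for p, w in zip(model_probs, weights)) / total
```

For example, `ensemble_prob([0.80, 0.62, 0.71, 0.77], [1, 1, 1, 1])` averages the four outputs; tuning the weights on the development cohort is one way such an ensemble could outperform its best member, as the AUC gain from 86.4 to 88.2 suggests.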