Meiyi Yang1,2, Xiaopeng He3, Lifeng Xu4, Minghui Liu2, Jiali Deng2, Xuan Cheng2, Yi Wei5, Qian Li5, Shang Wan5, Feng Zhang4, Lei Wu2, Xiaomin Wang2, Bin Song5, Ming Liu4.
Abstract
Background: Clear cell renal cell carcinoma (ccRCC) is the most common malignant tumor of the urinary system and the predominant subtype of malignant renal tumors, with high mortality. Biopsy is the main examination for determining ccRCC grade, but it can lead to unavoidable complications and sampling bias. Therefore, non-invasive technology (e.g., CT examination) for ccRCC grading is attracting increasing attention. However, noisy labels exist in CT images, which may contain multiple grades but carry only one label, making prediction difficult. Aim: We proposed a Transformer-based deep learning algorithm using CT images to improve the diagnostic accuracy of ccRCC grading.
Keywords: clear cell renal cell carcinoma; deep learning; ensemble learning; transformer network; tumor grading
Year: 2022 PMID: 36249050 PMCID: PMC9555088 DOI: 10.3389/fonc.2022.961779
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1 (A) shows the data-processing pipeline, in which stage one is the kidney detection network and stage two is the tumor detection network. (B) shows the details of the detection network.
Figure 2 Illustrations of the results of various data augmentations. (A) shows the original CT image; (B) is the CT image after 180° rotation; (C) is the CT image after affine transformation; (D) shows the CT image after crop-and-resize transformation; (E) is the CT image with Gaussian blur; (F) is the CT image with Gaussian noise.
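The augmentation operators above (rotation, affine transform, crop-and-resize, Gaussian blur, Gaussian noise) can be sketched in plain NumPy. The paper does not publish its implementation, so the parameters below (noise sigma, crop fraction) are illustrative assumptions; a real pipeline would more likely use a library such as torchvision.

```python
import numpy as np

def rotate_180(img):
    # 180-degree rotation: flip both spatial axes
    return img[::-1, ::-1]

def add_gaussian_noise(img, sigma=0.05, rng=None):
    # Additive Gaussian noise with standard deviation sigma (assumed value)
    rng = np.random.default_rng(0) if rng is None else rng
    return img + rng.normal(0.0, sigma, img.shape)

def crop_and_resize(img, crop=0.8):
    # Central crop to `crop` fraction, then nearest-neighbour resize
    # back to the original size so network input shape is unchanged
    h, w = img.shape
    ch, cw = int(h * crop), int(w * crop)
    top, left = (h - ch) // 2, (w - cw) // 2
    patch = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return patch[rows][:, cols]

img = np.arange(16.0).reshape(4, 4)     # toy "CT slice"
aug = crop_and_resize(add_gaussian_noise(rotate_180(img)))
print(aug.shape)  # (4, 4)
```

Each operator preserves the image shape, so augmented samples can be mixed freely with originals during training.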
Figure 3 A simple network framework for TransResNet. (A) represents the overall architecture, mainly comprising the CNN structure and the transformer structure. (B) shows the details of TransResNet. The network framework contains 12 transformer blocks (i.e., L = 12).
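The hybrid design in Figure 3, where a CNN feature map is flattened into tokens and passed through L = 12 transformer blocks, can be illustrated with a minimal attention-only NumPy sketch. This is not the paper's implementation: it uses a single head, random weights, no layer norm or MLP, and all dimensions are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Scaled dot-product self-attention over a token sequence (n, d)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4                         # illustrative channel/spatial sizes
feat = rng.normal(size=(C, H, W))         # CNN feature map
tokens = feat.reshape(C, H * W).T         # flatten to H*W tokens of dim C
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))

out = tokens
for _ in range(12):                       # L = 12 transformer blocks
    out = out + self_attention(out, Wq, Wk, Wv)   # residual connection
print(out.shape)  # (16, 8)
```

The key idea is only the data flow: convolutional features become a token sequence, so global attention can relate distant image regions that a CNN's local receptive field cannot.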
The demographic and clinical statistics of patients with ccRCC.
| Attribute | Training cohort | Testing cohort |
|---|---|---|
| Age (years), mean ± SD (n) | 56 ± 12 (589) | 54 ± 11 (170) |
| Male | 374 | 112 |
| Female | 215 | 58 |
| Grade I | 21 | 4 |
| Grade II | 371 | 81 |
| Grade III | 165 | 67 |
| Grade IV | 32 | 18 |
Figure 4 Illustrations of the data augmentation operators.
The impact of the transformer on different network structures.
| Model | ACC | AUC | SE | SP |
|---|---|---|---|---|
| ResNet-34 | 82.3 ± 2.5 | 85.7 ± 2.3 | 89.4 ± 1.7 | 83.2 ± 1.2 |
| TransResNet | 87.1 ± 2.3 | 90.3 ± 2.5 | 91.3 ± 1.4 | 85.3 ± 1.5 |
| DenseNet-121 | 81.5 ± 2.3 | 85.5 ± 2.4 | 80.0 ± 0.5 | 84.3 ± 1.2 |
| TransDenseNet | 83.9 ± 2.1 | 90.5 ± 2.2 | 80.6 ± 0.6 | 86.8 ± 1.0 |
| Inception-V3 | 80.0 ± 2.0 | 83.7 ± 2.0 | 76.5 ± 1.2 | 78.8 ± 1.3 |
| TransInception | 84.3 ± 2.0 | 89.4 ± 2.0 | 83.4 ± 1.3 | 85.8 ± 1.6 |
| SENet | 81.8 ± 2.5 | 84.2 ± 2.8 | 76.8 ± 1.5 | 85.1 ± 1.4 |
| TransSENet | 85.1 ± 2.3 | 89.2 ± 2.5 | 89.1 ± 1.3 | 82.1 ± 1.1 |
| RegNet | 81.9 ± 2.1 | 84.3 ± 2.5 | 82.5 ± 1.5 | 80.0 ± 1.0 |
| TransRegNet | 82.3 ± 2.0 | 87.7 ± 2.5 | 84.7 ± 1.0 | 80.0 ± 0.5 |
We report the average accuracy and standard deviation over five runs.
ACC, AUC, SE, and SP are the accuracy, the Area Under the Curve, the sensitivity, and the specificity of the model on the test set, respectively.
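The four metrics reported throughout these tables can be computed from a binary confusion matrix and model scores. A minimal sketch (the AUC here uses the rank/Mann-Whitney formulation, which equals the area under the ROC curve; the toy counts and scores are invented for illustration):

```python
def binary_metrics(tp, fp, tn, fn):
    # ACC = correct / all; SE = recall on positives; SP = recall on negatives
    acc = (tp + tn) / (tp + fp + tn + fn)
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return acc, se, sp

def auc(pos_scores, neg_scores):
    # Probability that a random positive outscores a random negative
    # (ties count as 0.5); equivalent to the area under the ROC curve
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

acc, se, sp = binary_metrics(tp=8, fp=2, tn=9, fn=1)
print(round(acc, 3), round(se, 3), round(sp, 3))  # 0.85 0.889 0.818
print(auc([0.9, 0.8], [0.4, 0.8]))                # 0.875
```

Sensitivity and specificity depend on the decision threshold, while AUC summarizes performance across all thresholds, which is why the tables report both.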
Figure 5 Receiver operating characteristic (ROC) curves for the task of tumor classification using a positive ratio feature.
Ensemble results under different strategies.
| Model | ACC | AUC | SE | SP |
|---|---|---|---|---|
| TransResNet | 85.8 | 90.3 | 91.3 | 85.3 |
| TransDenseNet | 83.9 | 90.5 | 80.6 | 86.8 |
| TransInception | 84.3 | 89.4 | 83.4 | 85.8 |
| TransRegNet | 82.3 | 87.7 | 84.7 | 80.0 |
| TransSENet | 85.1 | 89.2 | 89.1 | 82.1 |
| Ensemble(voting) | 85.8 | 90.5 | 91.4 | 89.8 |
| Ensemble(averaging) | | | | |
| Ensemble(stacking) | 86.1 | 90.8 | 89.1 | 91.1 |
| Ensemble(blending) | 86.1 | 90.0 | 90.0 | 85.5 |
ACC, AUC, SE, and SP are the accuracy, the Area Under the Curve, the sensitivity, and the specificity of the model on the test set, respectively. The averaging ensemble strategy shows the best results (its row is highlighted in bold in the original table).
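Of the four ensemble strategies compared (voting, averaging, stacking, blending), the two simplest can be sketched directly from per-model class probabilities. The three toy probability matrices below are invented for illustration, not taken from the paper:

```python
import numpy as np

def ensemble_vote(prob_list):
    # Hard voting: each model casts its argmax class; majority wins
    votes = np.stack([np.argmax(p, axis=1) for p in prob_list])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

def ensemble_average(prob_list):
    # Averaging (soft voting): mean of predicted probabilities, then argmax
    return np.mean(prob_list, axis=0).argmax(axis=1)

# Toy outputs of three models on two samples (two classes)
p1 = np.array([[0.6, 0.4], [0.3, 0.7]])
p2 = np.array([[0.3, 0.7], [0.4, 0.6]])
p3 = np.array([[0.7, 0.3], [0.1, 0.9]])

print(ensemble_vote([p1, p2, p3]))      # [0 1]
print(ensemble_average([p1, p2, p3]))   # [0 1]
```

Stacking and blending instead train a second-level learner on the base models' outputs (on cross-validated folds or a held-out split, respectively), which adds complexity but can weight models non-uniformly.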
Comparison with state-of-the-art transformers.
| Model | ACC | AUC | SE | SP |
|---|---|---|---|---|
| TransResNet | | | | |
| ViT-Small | 78.6 | 83.5 | 88.5 | 81.3 |
| CaiT-Small | 79.4 | 83.2 | 89.1 | 82.0 |
| Conformer | 76.4 | 79.3 | 83.5 | 65.7 |
ACC, AUC, SE, and SP are the accuracy, the Area Under the Curve, the sensitivity, and the specificity of the model on the test set, respectively. Our model TransResNet obtains the best results (its row is highlighted in bold in the original table).
Figure 6 A series of experimental results for the CNN models.
Comparison of our method with transfer learning.
| Transfer Model | ACC | AUC | SE | SP |
|---|---|---|---|---|
| ResNet18 | 80.6 | 86.0 | 83.3 | 78.8 |
| ResNet34 | 80.2 | 85.6 | 86.3 | 78.2 |
| DenseNet121 | 81.2 | 85.6 | 78.9 | 89.4 |
| TransResNet | | | | |
| ViT-tiny | 74.7 | 75.6 | 64.7 | 84.7 |
The comparison models are ResNet, DenseNet, and ViT-tiny, parameterized by pre-training on the ImageNet dataset. Our model is trained from random initialization.
ACC, AUC, SE, and SP are the accuracy, the Area Under the Curve, the sensitivity, and the specificity of the model on the test set, respectively. Our model TransResNet obtains the best results (its row is highlighted in bold in the original table).
Figure 7 The demonstration of error classification, which mainly includes four categories. (A) shows the CT image of positive samples; (B) shows the CT image of negative samples; (C) shows the CT image of positive samples and the corresponding class activation map; (D) shows the CT image of negative samples and the corresponding class activation map.
Figure 8 Visualization of the class activation map generated by the last transformer layer on ccRCC images. The yellow box indicates the lesion area. Red denotes higher attention values and blue denotes lower. (A) is the original image; (B) shows the class activation map of the CNN model; and (C) is the class activation map of the TransResNet model.