Xiaoyuan Yu, Suigu Tang, Chak Fong Cheang, Hon Ho Yu, I Cheong Choi.
Abstract
The automatic analysis of endoscopic images to assist endoscopists in accurately identifying the types and locations of esophageal lesions remains a challenge. In this paper, we propose a novel multi-task deep learning model for automatic diagnosis, which does not simply replace the role of endoscopists in decision making, because endoscopists are expected to correct the false results predicted by the diagnosis system if more supporting information is provided. In order to help endoscopists improve the diagnosis accuracy in identifying the types of lesions, an image retrieval module is added in the classification task to provide an additional confidence level of the predicted types of esophageal lesions. In addition, a mutual attention module is added in the segmentation task to improve its performance in determining the locations of esophageal lesions. The proposed model is evaluated and compared with other deep learning models using a dataset of 1003 endoscopic images, including 290 esophageal cancer, 473 esophagitis, and 240 normal. The experimental results show the promising performance of our model with a high accuracy of 96.76% for the classification and a Dice coefficient of 82.47% for the segmentation. Consequently, the proposed multi-task deep learning model can be an effective tool to help endoscopists in judging esophageal lesions.
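For intuition, the following is a minimal PyTorch-style sketch of the multi-task idea described in the abstract: a shared encoder feeding a classification head, a segmentation head, and an embedding used for image retrieval. It is not the authors' implementation; the backbone, the omitted mutual attention module, and all names (e.g., MultiTaskEsophagusNet) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the multi-task idea in the abstract:
# a shared encoder feeds (1) a classification head, (2) a segmentation head, and
# (3) an embedding used to retrieve similar labeled images as extra evidence.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTaskEsophagusNet(nn.Module):
    def __init__(self, num_classes: int = 3, embed_dim: int = 128):
        super().__init__()
        # Shared convolutional encoder (stand-in for the paper's backbone).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(64, num_classes)      # lesion type logits
        self.embedder = nn.Linear(64, embed_dim)          # retrieval embedding
        self.seg_head = nn.Conv2d(64, 1, kernel_size=1)   # lesion mask logits

    def forward(self, x):
        feat = self.encoder(x)                            # B x 64 x H/4 x W/4
        pooled = feat.mean(dim=(2, 3))                    # global average pooling
        logits = self.classifier(pooled)
        embedding = F.normalize(self.embedder(pooled), dim=1)  # for cosine retrieval
        mask = F.interpolate(self.seg_head(feat), size=x.shape[2:],
                             mode="bilinear", align_corners=False)
        return logits, embedding, mask
```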
Keywords: classification; esophageal endoscopic images; image retrieval; multi-task; segmentation
Year: 2021 PMID: 35009825 PMCID: PMC8749873 DOI: 10.3390/s22010283
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Comparison of the methods for esophageal lesion classification.
| Authors | Methods | Performance |
|---|---|---|
| Münzenmayer et al. | content-based image retrieval | 0.71 kappa |
| Riaz et al. | autocorrelation Gabor features | 82.39% accuracy |
| Yeh et al. | color coherence vector | 92.86% accuracy |
| Liu et al. | support vector machines | 90.75% accuracy |
| Nakagawa et al. | SSMD | 91.00% accuracy |
| Kumagai et al. | GoogLeNet | 90.90% accuracy |
| Liu et al. | VGGNets, etc. | 89.00% accuracy |
| Du et al. | ECA-DDCNN | 90.63% accuracy |
| Igarashi et al. | AlexNet | 96.50% accuracy |
Comparison of the methods for esophageal lesion segmentation.
| Authors | Methods | Performance |
|---|---|---|
| Sommen et al. | local color and texture features | 0.95 recall |
| Yang et al. | online atlas selection | 0.73 DSC |
| Mendel et al. | transfer learning | 0.94 sensitivity |
| Huang et al. | channel-attention U-Net | 0.725 DV |
| Tran et al. | spatial attention network and STAPLE algorithm | 0.869 Dice |
| Chen et al. | U-Net Plus | 0.79 DV |
| Diniz et al. | Atlas-based Residual-U-Net | 0.8215 Dice |
Figure 1. The network architecture of the proposed multi-task deep learning model.
Figure 2. The processes of training and testing the proposed model using the dataset.
Figure 3. The results of classification and retrieval. “Type: 0.xx” at the top-left denotes the predicted category and its confidence level. (a) Input images with high confidence levels indicate a high probability that the prediction made by the classification task is correct. (b) Input images with low confidence levels indicate a high probability that the prediction made by the classification task is incorrect.
Comparison of the classification results of our model and other models on the testing set.
| Models | Top-1 Accuracy ± std | F1 Score ± std |
|---|---|---|
| VGG-16 | 92.68% ± 0.26 | 88.12% ± 0.26 |
| ResNet-18 | 93.18% ± 0.25 | 88.36% ± 0.27 |
| ResNeXt-50 | 94.34% ± 0.38 | 90.76% ± 0.33 |
| EfficientNet-B0 | 95.15% ± 0.40 | 92.42% ± 0.39 |
| RegNetY-400MF | 94.64% ± 0.52 | 91.57% ± 0.59 |
| Ours | 96.76% ± 0.22 | 94.22% ± 0.23 |
The diagnostic performance of the endoscopists without and with the proposed model.
| Method | Class | Accuracy | Precision | Sensitivity | Specificity | NPV | F1-Score |
|---|---|---|---|---|---|---|---|
| Our model | cancer | 98.48% | 98.21% | 96.49% | 99.29% | 99.59% | 97.34% |
| | normal | 96.46% | 90.00% | 95.74% | 96.69% | 98.65% | 92.78% |
| | esophagitis | 95.96% | 96.74% | 94.68% | 97.12% | 95.28% | 95.70% |
| | all | 96.96% | 94.98% | 95.64% | 97.70% | 97.84% | 95.27% |
| Endoscopists only | cancer | 91.41% | 87.04% | 82.46% | 95.04% | 93.06% | 84.69% |
| | normal | 83.84% | 60.87% | 89.36% | 82.12% | 96.12% | 72.41% |
| | esophagitis | 76.26% | 81.33% | 64.89% | 86.54% | 73.17% | 72.19% |
| | all | 83.84% | 76.41% | 78.90% | 87.90% | 87.45% | 76.43% |
| Endoscopists | cancer | 93.43% | 95.83% | 78.90% | 98.58% | 92.67% | 87.62% |
| | normal | 87.04% | 65.67% | 93.62% | 84.77% | 97.71% | 77.19% |
| | esophagitis | 81.31% | 84.34% | 74.47% | 87.50% | 79.13% | 79.10% |
| | all | 87.26% | 81.94% | 82.33% | 90.28% | 89.84% | 81.30% |
| Endoscopists | cancer | 96.46% | 93.10% | 94.74% | 97.16% | 97.86% | 93.91% |
| | normal | 90.40% | 73.33% | 93.62% | 89.40% | 97.83% | 82.24% |
| | esophagitis | 89.90% | 96.25% | 81.91% | 97.12% | 85.51% | 88.50% |
| | all | 92.25% | 87.56% | 90.09% | 94.56% | 97.73% | 88.22% |
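For reference, the per-class figures in the table above follow the standard one-vs-rest confusion-matrix definitions. The short sketch below is our illustration, not the paper's evaluation code; it simply computes the metrics from TP/FP/FN/TN counts (assumed nonzero).

```python
# Standard one-vs-rest metric definitions behind the table above (illustrative
# sketch, not the authors' evaluation code): each class is scored against the rest.
def per_class_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)                 # a.k.a. recall
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)                         # negative predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "NPV": npv, "F1": f1}
```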
The diagnosis results of the endoscopists after referring to the results of the proposed model.
| Counts | | Endoscopists (before): Right | Endoscopists (before): Wrong | Total |
|---|---|---|---|---|
| Endoscopists (after) | Right | 152 | 23 | 175 |
| | Wrong | 8 | 15 | 23 |
| Total | | 160 | 38 | 198 |
Figure 4. The top-5 most similar labeled samples selected by image retrieval. (a) The input image is predicted as esophagitis. (b) The input image is predicted as normal.
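As a rough illustration of how retrieved neighbors can back up a prediction (our assumed mechanism, not the authors' code), the sketch below embeds a query, finds its top-5 most similar labeled gallery images by cosine similarity, and reports how many of them share the predicted class; low agreement would flag the prediction for closer review.

```python
# Illustrative retrieval-based support score (assumed mechanism, not from the paper):
# rank labeled gallery embeddings by cosine similarity to the query and measure how
# many of the top-5 neighbors agree with the predicted class.
import torch

def retrieval_support(query_emb, gallery_embs, gallery_labels, predicted_class, k=5):
    # query_emb: (D,), gallery_embs: (N, D); both assumed L2-normalized.
    sims = gallery_embs @ query_emb                  # cosine similarity to every sample
    topk = torch.topk(sims, k=k).indices             # indices of the k nearest images
    agree = (gallery_labels[topk] == predicted_class).float().mean()
    return topk, agree.item()                        # neighbors to display + agreement
```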
Comparison of the segmentation results of our model and other models on the testing set.
| Models | IoU | Dice |
|---|---|---|
| U-Net | 63.55% | 75.12% |
| PSPNet | 62.28% | 75.62% |
| FCN | 63.95% | 76.72% |
| DeepLab V3+ | 66.24% | 78.20% |
| CCNet | 62.52% | 74.90% |
| OCRNet | 61.04% | 73.63% |
| SegFormer | 67.25% | 80.38% |
| Ours | 71.27% | 82.47% |
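For clarity, IoU and Dice in the table above are the usual overlap measures between predicted and ground-truth lesion masks; a minimal sketch for binary masks follows (our illustration, not the authors' evaluation code).

```python
# IoU and Dice for binary masks, as used in the segmentation comparison above
# (illustrative sketch; empty masks are treated as perfect agreement).
import numpy as np

def iou_and_dice(pred: np.ndarray, target: np.ndarray) -> tuple[float, float]:
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)
```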
Figure 5. The segmentation results of the proposed model and other models.
Figure 6. The confusion matrix of our model.