| Literature DB >> 35957992 |
Zhaomin Yao1,2, Yizhe Yuan3, Zhenning Shi3, Wenxin Mao3, Gancheng Zhu3, Guoxu Zhang1,2, Zhiguo Wang1,2.
Abstract
Diabetic retinopathy (DR) and age-related macular degeneration (AMD) are degenerative retinal disorders that can cause vision impairment or even permanent blindness. Early detection of these conditions is essential to preserving a patient's quality of life. Fundus photography is a non-invasive, safe, and rapid way of assessing retinal function and is widely used as a diagnostic tool for patients with fundus-related diseases. Analyzing these two diseases from fundus images is challenging, since obvious features are rarely visible in the images during the incipient stages of disease. To address these issues, we propose a deep learning method called FunSwin, which uses the Swin Transformer as its main framework. In addition, to account for the characteristics of medical images, namely their scarcity and relatively fixed structure, the method integrates a transfer learning strategy that enriches the model's low-level features and a data augmentation strategy that balances the data. Experiments demonstrate that the proposed method outperforms other state-of-the-art approaches in both binary and multiclass classification tasks on the benchmark dataset.
Keywords: diabetic retinopathy; disease stage prediction; fundus image; macular edema; swin transformer
Year: 2022 PMID: 35957992 PMCID: PMC9358036 DOI: 10.3389/fphys.2022.961386
Source DB: PubMed Journal: Front Physiol ISSN: 1664-042X Impact factor: 4.755
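The abstract mentions a data enhancement strategy used to balance the dataset. The paper's exact pipeline is not specified here, so the following is only a minimal sketch of one common balancing approach (random oversampling of minority classes) with hypothetical image labels, not the authors' implementation:

```python
import random
from collections import Counter

def oversample_to_balance(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    matches the majority-class count (one common balancing strategy)."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for y, group in by_class.items():
        # Pad each class up to the majority-class size by resampling.
        padded = group + [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(padded)
        out_labels.extend([y] * target)
    return out_samples, out_labels

# Toy example (hypothetical labels): 4 diseased images vs. 1 healthy image.
xs = ["img1", "img2", "img3", "img4", "img5"]
ys = ["DR", "DR", "DR", "DR", "healthy"]
bx, by = oversample_to_balance(xs, ys)
print(Counter(by))  # each class now has 4 samples
```

In practice such resampling is usually combined with image-level augmentation (flips, rotations, color jitter) so the duplicated minority samples are not pixel-identical.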
FIGURE 1 Overview of the proposed methodology.
FIGURE 2 The framework of the Swin Transformer.
FIGURE 3 The details of the Swin Transformer block.
Comparison of existing binary classification methods for diabetic retinopathy.
| Methods | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|
| Conformer | 0.8855 | 0.8817 | 0.8963 | 0.9193 |
| Convnext | 0.8267 | – | 0.4451 | 0.8913 |
| HRnet | 0.8378 | 0.8796 | 0.7195 | 0.8891 |
| Vgg11 | 0.8156 | 0.8538 | 0.7073 | 0.8725 |
| Mlp Mixer | 0.8585 | 0.9226 | 0.6768 | 0.9060 |
| Res2net50 | 0.8267 | 0.8452 | 0.7744 | 0.8782 |
| Shufflenet_v1 | 0.8045 | 0.8796 | 0.5915 | 0.8693 |
| T2T Vit | 0.8553 | 0.8323 | – | 0.8948 |
| Vit Transformer | 0.8076 | 0.9419 | 0.4268 | 0.8786 |
| Our Method | 0.9062 | 0.9376 | 0.8171 | 0.9366 |
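For reference, the four metrics reported in these comparison tables follow from a binary confusion matrix. This is a sketch of the standard definitions with made-up counts, not code from the paper:

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall), specificity, and F1-score
    computed from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, f1

# Hypothetical counts for illustration only.
acc, sens, spec, f1 = binary_metrics(tp=90, fp=10, tn=80, fn=20)
print(f"{acc:.3f} {sens:.3f} {spec:.3f} {f1:.3f}")  # → 0.850 0.818 0.889 0.857
```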
Comparison of existing binary classification methods for macular edema.
| Methods | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|
| Conformer | 0.9745 | 0.9755 | 0.9727 | 0.9801 |
| Convnext | 0.9320 | 0.8962 | – | 0.9443 |
| HRnet | 0.9453 | 0.9283 | 0.9761 | 0.9563 |
| Vgg11 | 0.7947 | 0.7283 | 0.9147 | 0.8204 |
| Mlp Mixer | 0.9441 | 0.9453 | 0.9420 | 0.9561 |
| Res2net50 | 0.9648 | 0.9566 | 0.9795 | 0.9722 |
| Shufflenet_v1 | 0.9648 | 0.9698 | 0.9556 | 0.9726 |
| T2T Vit | 0.9587 | 0.9377 | – | 0.9669 |
| Vit Transformer | 0.9611 | 0.9604 | 0.9625 | 0.9695 |
| Our Method | 0.9866 | 0.9868 | 0.9863 | 0.9896 |
Comparison of existing multiclass classification methods for diabetic retinopathy.
| Methods | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|
| Conformer | 0.7704 | 0.7524 | 0.9122 | 0.7691 |
| Convnext | 0.7123 | 0.6754 | 0.8770 | 0.6999 |
| HRnet | 0.7374 | 0.7031 | 0.8954 | 0.7311 |
| Vgg11 | 0.6211 | 0.5630 | 0.8547 | 0.6013 |
| Mlp Mixer | 0.7248 | 0.7121 | 0.8942 | 0.7246 |
| Res2net50 | 0.6352 | 0.5568 | 0.8665 | 0.6054 |
| Shufflenet_v1 | 0.6101 | 0.6500 | 0.8383 | 0.6026 |
| T2T Vit | 0.6950 | 0.7023 | 0.8679 | 0.6845 |
| Vit Transformer | 0.7563 | 0.7268 | 0.9150 | 0.7490 |
| Our Method | – | – | – | – |
Comparison of existing multiclass classification methods for macular edema.
| Methods | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|
| Conformer | 0.9733 | 0.9733 | 0.9865 | 0.9733 |
| Convnext | 0.8214 | 0.8147 | 0.9104 | 0.8091 |
| HRnet | 0.8761 | 0.8725 | 0.9379 | 0.8745 |
| Vgg11 | 0.6136 | 0.6035 | 0.8056 | 0.5753 |
| Mlp Mixer | 0.9210 | 0.9202 | 0.9602 | 0.9209 |
| Res2net50 | 0.8882 | 0.8849 | 0.9443 | 0.8872 |
| Shufflenet_v1 | 0.9635 | 0.9638 | 0.9816 | 0.9636 |
| T2T Vit | 0.9174 | 0.9144 | 0.9583 | 0.9160 |
| Vit Transformer | 0.9514 | 0.9510 | 0.9754 | 0.9514 |
| Our Method | – | – | – | – |
Performance of binary classification before and after data augmentation.
| Task | Setting | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|---|
| Diabetic Retinopathy | No Augmentation | 0.5444 | 1.0000 | 0.0000 | 0.7050 |
| Diabetic Retinopathy | Augmentation | 0.9062 | 0.9376 | 0.8171 | 0.9366 |
| Macular Edema | No Augmentation | 0.9389 | 0.7941 | 0.9726 | 0.8308 |
| Macular Edema | Augmentation | 0.9866 | 0.9868 | 0.9863 | 0.9896 |
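The "No Augmentation" row for diabetic retinopathy (sensitivity 1.0, specificity 0.0) describes a degenerate classifier that labels every image positive. Under that reading, and assuming the reported accuracy 0.5444 equals the positive-class prevalence, the reported F1-score follows arithmetically:

```python
# All-positive classifier: recall = 1, precision = positive-class prevalence.
prevalence = 0.5444  # assumed equal to the reported accuracy
precision = prevalence
recall = 1.0
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # → 0.705, matching the reported 0.7050
```

This consistency check illustrates why augmentation matters here: without balancing, the model collapses to the majority class, and accuracy alone hides the failure.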
FIGURE 4 Model convergence performance of binary classification.