Guihua Zhang1, Jian-Wei Lin1, Ji Wang1, Jie Ji2, Ling-Ping Cen1, Weiqi Chen1, Peiwen Xie1, Yi Zheng1, Yongqun Xiong1, Hanfu Wu1, Dongjie Li1, Tsz Kin Ng1, Chi Pui Pang1,3, Mingzhi Zhang4.
Abstract
OBJECTIVE: To develop and validate a real-world screening, guideline-based deep learning (DL) system for referable diabetic retinopathy (DR) detection.
Keywords: diabetic retinopathy; medical retina; vitreoretinal
Year: 2022 PMID: 35902186 PMCID: PMC9341185 DOI: 10.1136/bmjopen-2021-060155
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 3.006
Summary of the data sets
| Characteristic | Total | Development set: JSIEC | Development set: LEDRSP | Development set: Subtotal | External validation set: Liuzhou | External validation set: STU-2nd | External validation set: Subtotal |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Camera type | AFC-230, Canon CR-DGi, Top-2000 | Top-2000 | AFC-230, Canon CR-DGi | AFC-230, Canon CR-DGi, Top-2000 | AFC-230 | Top-2000 | AFC-230, Top-2000 |
| Periods | April 2014–June 2018 | June 2016–June 2018 | April 2014–April 2016 | April 2014–June 2018 | June 2017–June 2018 | June 2016–June 2018 | June 2016–June 2018 |
| Images, n/N (%) | 83 465/83 465 (100) | 2567/83 465 (3.1) | 50 644/83 465 (60.7) | 53 211/83 465 (63.8) | 12 898/83 465 (15.5) | 17 356/83 465 (20.8) | 30 254/83 465 (36.2) |
| Eyes, n/N (%) | 39 836/39 836 (100) | 2241/39 836 (5.6) | 24 299/39 836 (61.0) | 26 540/39 836 (66.6) | 6572/39 836 (16.5) | 6916/39 836 (17.4) | 13 488/39 836 (33.9) |
| Patients, n/N (%) | 21 716/21 716 (100) | 2051/21 716 (9.4) | 13 026/21 716 (60.0) | 15 077/21 716 (69.4) | 3298/21 716 (15.2) | 3512/21 716 (16.2) | 6810/21 716 (31.4) |
| Patients with sex available, n/N (%) | 17 042/21 716 (78.5) | 1804/2051 (88.0) | 8685/13 026 (66.7) | 10 489/15 077 (69.6) | 3298/3298 (100) | 3426/3512 (97.6) | 6724/6810 (98.7) |
| Male, n/N (%) | 7493/17 042 (44.0) | 932/1804 (51.7) | 3893/8685 (44.8) | 4825/10 489 (46.0) | 1284/3298 (38.9) | 1465/3426 (42.8) | 2749/6724 (40.9) |
| Patients with age available, n/N (%) | 20 150/21 716 (92.8) | 1804/2051 (88.0) | 11 793/13 026 (90.5) | 13 597/15 077 (90.2) | 3298/3298 (100) | 3426/3512 (97.6) | 6724/6810 (98.7) |
| Age, mean (SD), years | 60.0 (12.9) | 44.6 (18.8) | 61.4 (10.1) | 59.2 (13.0) | 65.2 (9.7) | 57.7 (13.9) | 61.4 (12.6) |
JSIEC, Joint Shantou International Eye Center of Shantou University and the Chinese University of Hong Kong; LEDRSP, Lifeline Express Diabetic Retinopathy Screening Program; Liuzhou, Liuzhou City Red Cross Hospital; STU-2nd, Second Affiliated Hospital of Shantou University Medical College.
Performance of the five classifiers of the system
| Classifier | Data set | TN (n) | FP (n) | FN (n) | TP (n) | Accuracy | F1 score | Sensitivity | Specificity | AUROC (95% CI) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Image quality | Training | 5818 | 180 | 1741 | 32 145 | 0.952 | 0.971 | 0.949 | 0.97 | 0.9882 (0.9874 to 0.9891) |
| | Validation | 804 | 64 | 231 | 4165 | 0.944 | 0.966 | 0.947 | 0.926 | 0.9812 (0.9780 to 0.9844) |
| | Test | 1095 | 103 | 418 | 6447 | 0.935 | 0.961 | 0.939 | 0.914 | 0.9768 (0.9737 to 0.9798) |
| | External validation | 4706 | 243 | 1429 | 23 876 | 0.945 | 0.966 | 0.944 | 0.951 | 0.9751 (0.9732 to 0.9770) |
| Retinopathy | Training | 30 118 | 544 | 9 | 3215 | 0.984 | 0.921 | 0.997 | 0.982 | 0.9992 (0.9991 to 0.9994) |
| | Validation | 3916 | 107 | 15 | 358 | 0.972 | 0.854 | 0.96 | 0.973 | 0.9956 (0.9941 to 0.9970) |
| | Test | 6031 | 178 | 16 | 640 | 0.972 | 0.868 | 0.976 | 0.971 | 0.9962 (0.9951 to 0.9972) |
| | External validation | 21 609 | 784 | 65 | 2847 | 0.966 | 0.87 | 0.978 | 0.965 | 0.9944 (0.9936 to 0.9952) |
| Maculopathy gradability | Training | 5068 | 148 | 1269 | 27 401 | 0.958 | 0.975 | 0.956 | 0.972 | 0.9934 (0.9928 to 0.9940) |
| | Validation | 617 | 43 | 174 | 3562 | 0.951 | 0.97 | 0.953 | 0.935 | 0.9896 (0.9873 to 0.9918) |
| | Test | 970 | 49 | 302 | 5544 | 0.949 | 0.969 | 0.948 | 0.952 | 0.9890 (0.9871 to 0.9910) |
| | External validation | 4374 | 451 | 1704 | 18 776 | 0.915 | 0.946 | 0.917 | 0.907 | 0.9639 (0.9617 to 0.9660) |
| Maculopathy | Training | 24 301 | 470 | 170 | 3729 | 0.978 | 0.921 | 0.956 | 0.981 | 0.9962 (0.9957 to 0.9967) |
| | Validation | 3186 | 104 | 29 | 417 | 0.964 | 0.862 | 0.935 | 0.968 | 0.9906 (0.9864 to 0.9948) |
| | Test | 4900 | 130 | 61 | 755 | 0.967 | 0.888 | 0.925 | 0.974 | 0.9928 (0.9912 to 0.9944) |
| | External validation | 16 987 | 572 | 150 | 2771 | 0.965 | 0.885 | 0.949 | 0.967 | 0.9904 (0.9888 to 0.9919) |
| Photocoagulation | Training | 32 526 | 105 | 0 | 1255 | 0.997 | 0.96 | 1.000 | 0.997 | 1.0000 (0.9999 to 1.0000) |
| | Validation | 4252 | 27 | 5 | 112 | 0.993 | 0.875 | 0.957 | 0.994 | 0.9924 (0.9794 to 1.0000) |
| | Test | 6589 | 33 | 8 | 235 | 0.994 | 0.92 | 0.967 | 0.995 | 0.9979 (0.9958 to 1.0000) |
| | External validation | 24 277 | 467 | 29 | 532 | 0.98 | 0.682 | 0.948 | 0.981 | 0.9904 (0.9869 to 0.9940) |
| Referable DR*: images | Training | 23 888 | 362 | 147 | 4890 | 0.983 | 0.951 | 0.971 | 0.985 | 0.9980 (0.9977 to 0.9983) |
| | Validation | 3138 | 89 | 28 | 544 | 0.969 | 0.903 | 0.951 | 0.972 | 0.9932 (0.9899 to 0.9965) |
| | Test | 4838 | 114 | 55 | 953 | 0.972 | 0.919 | 0.945 | 0.977 | 0.9952 (0.9940 to 0.9964) |
| | External validation | 16 667 | 575 | 117 | 3859 | 0.967 | 0.918 | 0.971 | 0.967 | 0.9931 (0.9920 to 0.9942) |
| Referable DR*: eyes | Training | 14 764 | 411 | 110 | 3136 | 0.972 | 0.923 | 0.966 | 0.973 | 0.9961 (0.9955 to 0.9967) |
| | Validation | 1986 | 87 | 15 | 342 | 0.958 | 0.87 | 0.958 | 0.958 | 0.9906 (0.9850 to 0.9961) |
| | Test | 2949 | 117 | 36 | 608 | 0.959 | 0.888 | 0.944 | 0.962 | 0.9923 (0.9901 to 0.9946) |
| | External validation | 9429 | 624 | 59 | 1876 | 0.943 | 0.846 | 0.97 | 0.938 | 0.9884 (0.9863 to 0.9905) |
| Referable DR*: patients | Training | 8415 | 291 | 74 | 2141 | 0.967 | 0.921 | 0.967 | 0.967 | 0.9956 (0.9949 to 0.9964) |
| | Validation | 1138 | 64 | 10 | 237 | 0.949 | 0.865 | 0.96 | 0.947 | 0.9894 (0.9837 to 0.9951) |
| | Test | 1669 | 80 | 25 | 407 | 0.952 | 0.886 | 0.942 | 0.954 | 0.9914 (0.9884 to 0.9943) |
| | External validation | 4683 | 492 | 37 | 1219 | 0.918 | 0.822 | 0.971 | 0.905 | 0.9848 (0.9819 to 0.9877) |
*By integrating the prediction of referable retinopathy and maculopathy on an image, referable DR decisions were given by the system when any referable lesion was detected, and the accuracies were based on the image, eye and patient levels.
AUROC, area under the receiver operating characteristic curve; DR, diabetic retinopathy; FN, false negative; FP, false positive; TN, true negative; TP, true positive.
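The accuracy, F1 score, sensitivity and specificity columns follow from the TN, FP, FN and TP counts by the standard confusion-matrix definitions. The sketch below is illustrative only: the `ConfusionCounts` helper is a hypothetical name that does not come from the paper, and it uses the image-level referable DR counts from the external validation row as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper)."""
    tn: int
    fp: int
    fn: int
    tp: int

    def accuracy(self) -> float:
        # Share of all cases that were classified correctly.
        return (self.tp + self.tn) / (self.tn + self.fp + self.fn + self.tp)

    def sensitivity(self) -> float:
        # True positive rate: detected positives over all actual positives.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # True negative rate: correct negatives over all actual negatives.
        return self.tn / (self.tn + self.fp)

    def f1(self) -> float:
        # Harmonic mean of precision and sensitivity, written in count form.
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Image-level referable DR, external validation row of the table above.
external_images = ConfusionCounts(tn=16667, fp=575, fn=117, tp=3859)
print(round(external_images.accuracy(), 3))
print(round(external_images.f1(), 3))
print(round(external_images.sensitivity(), 3))
print(round(external_images.specificity(), 3))
```

The helper does not guard against empty classes; a row with no positive or no negative cases would raise `ZeroDivisionError`.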
Figure 1. Receiver operating characteristic curves of the main dimensional classifiers and referable DR detection. The classification performances of four subsets (training, validation, test and external validation) are shown as receiver operating characteristic curves and AUC for detection of referable retinopathy (A), referable maculopathy (B), photocoagulation (C) and image-level referable DR (D). Notably, the detection of referable DR on an image (D) was automatically generated by integrating the results of referable retinopathy and referable maculopathy. AUC, area under receiver operating characteristic curve; DR, diabetic retinopathy.
Figure 2. Visualisation by the SHAP-CAM heatmap technique for referable DR lesions. The original images are displayed in the first column, the heatmaps by CAM and DeepSHAP are shown in the second and third columns, respectively, for comparison, and the combined heatmaps generated by SHAP-CAM are shown in the last column. (A) Vitreous haemorrhage located on the temporal-superior retina of the original image with the centred macula, suggesting the R3 degree of DR. The CAM heatmap showed a rough location for the lesion as a wide red-cyan area, while the DeepSHAP heatmap showed dispersed dots with some irrelevant noise. The SHAP-CAM heatmap retained a light pink background area, similar in size to that of CAM, and within it depicted a clear, deeper red lesion, the same as that of DeepSHAP. The remaining area was masked by CAM as white to reduce interference from redundant information. (B) Retinopathy of R2, including venous beading, intraretinal microvascular abnormality and multiple blot haemorrhages, located around the optic disc on the original image. The CAM heatmap showed a rough area for detection, whereas the DeepSHAP heatmap indicated the optic disc as a lesion. In the SHAP-CAM heatmap, all key lesions are depicted within an accurate light pink area that does not involve the optic disc or macula. (C) The original image showed a referable maculopathy with multiple exudates involving the centre of the fovea. The SHAP-CAM heatmap accurately predicted the shape and outline of the lesions in the macula area, whereas CAM only visualised the lesions as a wide red-cyan circular area and DeepSHAP showed some faint noise outside the macula. DR, diabetic retinopathy.
Performance comparison between the system and three DR experts
| Dimension | Reader | TN (n) | FP (n) | FN (n) | TP (n) | Accuracy | F1 score | Sensitivity | Specificity | AUROC (95% CI) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Referable retinopathy | Expert 1 | 181 | 4 | 5 | 62 | 0.964 | 0.932 | 0.925 | 0.978 | NA |
| | Expert 2 | 181 | 4 | 6 | 61 | 0.960 | 0.924 | 0.910 | 0.978 | NA |
| | Expert 3 | 183 | 2 | 2 | 65 | 0.984 | 0.97 | 0.970 | 0.989 | NA |
| | Experts’ average | NA | NA | NA | NA | 0.970 | 0.942 | 0.935 | 0.982 | NA |
| | DLA | 176 | 9 | 0 | 67 | 0.964 | 0.937 | 1.000 | 0.951 | 0.9958 (0.9916 to 1.0000) |
| Referable maculopathy | Expert 1 | 171 | 3 | 7 | 71 | 0.960 | 0.934 | 0.910 | 0.983 | NA |
| | Expert 2 | 167 | 7 | 4 | 74 | 0.956 | 0.931 | 0.949 | 0.960 | NA |
| | Expert 3 | 166 | 8 | 4 | 74 | 0.952 | 0.925 | 0.949 | 0.954 | NA |
| | Experts’ average | NA | NA | NA | NA | 0.956 | 0.93 | 0.936 | 0.966 | NA |
| | DLA | 163 | 11 | 4 | 74 | 0.940 | 0.908 | 0.949 | 0.937 | 0.9877 (0.9756 to 0.9999) |
| Referable DR* | Expert 1 | 164 | 3 | 7 | 78 | 0.960 | 0.94 | 0.918 | 0.982 | NA |
| | Expert 2 | 161 | 6 | 6 | 79 | 0.952 | 0.929 | 0.929 | 0.964 | NA |
| | Expert 3 | 160 | 7 | 4 | 81 | 0.956 | 0.936 | 0.953 | 0.958 | NA |
| | Experts’ average | NA | NA | NA | NA | 0.956 | 0.935 | 0.933 | 0.968 | NA |
| | DLA | 163 | 4 | 4 | 81 | 0.968 | 0.953 | 0.953 | 0.976 | 0.9909 (0.9809 to 1.0000) |
NA, not applicable.
*By integrating the predictions of referable retinopathy and maculopathy on an image, referable DR decisions were given by the system when any referable lesion was detected.
AUROC, area under the receiver operating characteristic curve; DLA, deep learning algorithm; DR, diabetic retinopathy; FN, false negative; FP, false positive; TN, true negative; TP, true positive.
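The integration rule in the footnote above (an image is referable DR when either referable retinopathy or referable maculopathy is detected) can be sketched as a logical OR. The record layout, the function names and the eye- and patient-level roll-up are all assumptions made for illustration: the source states only the image-level rule and that results were also reported per eye and per patient, not how images were grouped.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, NamedTuple


class ImagePrediction(NamedTuple):
    """One image's classifier outputs (hypothetical record layout)."""
    patient_id: str
    eye: str  # "L" or "R"
    referable_retinopathy: bool
    referable_maculopathy: bool


def image_referable(p: ImagePrediction) -> bool:
    # Image-level rule from the footnote: any referable lesion triggers referral.
    return p.referable_retinopathy or p.referable_maculopathy


def aggregate(
    preds: Iterable[ImagePrediction],
    key: Callable[[ImagePrediction], Hashable],
) -> dict:
    # ASSUMPTION: a group (eye or patient) is referable if any of its images is.
    result = defaultdict(bool)
    for p in preds:
        group = key(p)
        result[group] = result[group] or image_referable(p)
    return dict(result)


preds = [
    ImagePrediction("p1", "L", False, False),
    ImagePrediction("p1", "L", False, True),
    ImagePrediction("p1", "R", False, False),
    ImagePrediction("p2", "L", False, False),
]
by_eye = aggregate(preds, lambda p: (p.patient_id, p.eye))
by_patient = aggregate(preds, lambda p: p.patient_id)
```

Under this assumed roll-up, one referable image is enough to refer the eye and the patient, which favours sensitivity over specificity at the coarser levels.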