Jingjing Zhang, Yangyang Liu, Toshiharu Mitsuhashi, Toshihiko Matsuo.
Abstract
BACKGROUND: Retinopathy of prematurity (ROP) occurs in preterm infants and may contribute to blindness. Deep learning (DL) models have been used for ophthalmologic diagnoses. We performed a systematic review and meta-analysis of the published evidence to summarize and evaluate the diagnostic accuracy of DL algorithms for detecting ROP from fundus images.
Year: 2021 PMID: 34394982 PMCID: PMC8363465 DOI: 10.1155/2021/8883946
Source DB: PubMed Journal: J Ophthalmol ISSN: 2090-004X Impact factor: 1.909
Figure 1. PRISMA flow diagram for study selection.
Characteristics of nine studies for the systematic review and meta-analysis.

General characteristics, dataset characteristics, and definition and grade of ROP

| Author | Year, data source | Camera | Reference standard | Dataset | Identification and grade |
|---|---|---|---|---|---|
| Brown et al. | 2018, i-ROP | RetCam | RSD, images and clinical diagnosis | 5511 images | 4535 N, 805 pre, and 172 plus |
| Wang et al. | 2018, hospital and web | RetCam 3 | ICROP, CRYO-ROP, and ETROP | 3722 cases | 2823 N and 899 ROP; 382 Min and 295 S |
| Hu et al. | 2019, hospital | RetCam 3 | Consistent label | 2668 images | 1484 N and 1184 ROP; 382 Mil and 295 S |
| Tan et al. | 2019, ART-ROP | RetCam | Images and clinical diagnosis | 6974 images | 5336 N and 1638 plus |
| Wang et al. | 2019, hospital | NR | Consistent label | 11000 images | 7559 N and 3441 ROP; 529 Mil and 1204 S |
| Zhang et al. | 2019, hospital | RetCam 2/3 | The same criteria | 19543 images | 11298 N and 8245 ROP |
| Huang et al. | 2020, hospital | RetCam | ICROP + consistent label | 18808 images | 1222 N and 1129 ROP; 1189 Mil and 1174 S |
| Ramachandran et al. | 2021, KIDROP | RetCam 3 | Consistent label | 289 infants | 200 N and 89 plus |
| Wang et al. | 2021, hospital | RetCam 2/3 | Consistent label | 52249 images | 6363 any stage and 42177 N; 885 pre or plus and 17223 N |


DL model characteristics

| Author | Neural network | Algorithm evaluation | Classification |
|---|---|---|---|
| Brown et al. (2018) | CNN: U-Net and Inception V1 | 5-fold cross-validation | N/pre and plus; plus/N and pre |
| Wang et al. (2018) | DNN: Id-Net and Gr-Net | NR | N/ROP; Min/S |
| Hu et al. (2019) | CNN pretrained on ImageNet (VGG16, Inception V2, and ResNet-50) | Select the best module and image size | N/ROP; Mil/S |
| Tan et al. (2019) | CNN: Inception V3 | NR | N/plus |
| Wang et al. (2019) | CNN pretrained on ImageNet (Inception V2, Inception V3, and ResNet-50) | Select the best module | N/ROP; Mil/S |
| Zhang et al. (2019) | DNN: AlexNet, VGG16, and GoogLeNet | Select the best module | N/ROP |
| Huang et al. (2020) | DNN: VGG16, VGG19, MobileNet, InceptionV3, and DenseNet | Select the best module and then 5-fold cross-validation | N/ROP; Mil/S |
| Ramachandran et al. (2021) | CNN pretrained on ImageNet (Darknet-53 network) | Select the best module | N/plus |
| Wang et al. (2021) | CNN: ResNet18, DenseNet121, and EfficientNetB2 | Five independent classifiers validation | Preplus and plus/non; any stage/non |


Accuracy values

| Author | Negative vs. positive | TD | VD | ACC (VD) | SN (VD) | SP (VD) | AUC (VD) | TED | ACC (TED) | SN (TED) | SP (TED) | AUC (TED) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Brown et al. (2018) | N vs. pre and plus | 80% | 20% | NR | NR | NR | 0.94 | 100 (from the same set as TD) | 0.91 | 0.93 | 0.94 | NR |
| Brown et al. (2018) | N and pre vs. plus | 80% | 20% | NR | NR | NR | 0.98 | 100 (from the same set as TD) | 0.91 | 1 | 0.94 | NR |
| Wang et al. (2018) | N vs. ROP | 2226 | 298 | NR | 0.9664 | 0.9933 | 0.9949 | 944 (from web) | NR | 0.8491 | 0.9690 | NR |
| Wang et al. (2018) | Min vs. S | 2004 | 104 | NR | 0.8846 | 0.9231 | 0.9508 | 106 (from web) | NR | 0.933 | 0.736 | NR |
| Hu et al. (2019) | N vs. ROP | 2068 | 300 | 0.97 | 0.96 | 0.98 | 0.9922 | 406 (from the same set as TD) | NR | 0.900 | 0.989 | NR |
| Hu et al. (2019) | Mil vs. S | 466 | 100 | 0.84 | 0.82 | 0.86 | 0.9212 | 31 (from ROP in TED) | NR | 0.944 | 0.923 | NR |
| Tan et al. (2019) | N vs. plus | 5579 | 1395 | 0.973 | 0.966 | 0.98 | 0.993 | 90 (external set) | 0.856 | 0.939 | 0.807 | NR |
| Wang et al. (2019) | N vs. ROP | 8507 | 1228 | 0.927 | 0.8999 | NR | NR | 1265 (from TD) | NR | NR | NR | NR |
| Wang et al. (2019) | Mil vs. S | 1175 | 269 | 0.785 | 0.9235 | NR | NR | 289 (from ROP in TED) | NR | NR | NR | NR |
| Zhang et al. (2019) | N vs. ROP | 17801 | 1742 | 0.988 | 0.935 | 0.995 | 0.998 | 1742 (from the same set as TD) | 0.988 | 0.935 | 0.995 | 0.998 |
| Huang et al. (2020) | N vs. ROP | 2351 | 368 cases | NR | Average 0.911 | Average 0.992 | NR | 101 (from the same set as TD) | 0.96 | 0.966 | 0.952 | 0.97 |
| Huang et al. (2020) | Mil vs. S | 2363 | 339 cases | NR | Average 0.987 | Average 0.985 | NR | 85 (from ROP in TED) | 0.988 | 1 | 0.984 | 0.99 |
| Ramachandran et al. (2021) | N vs. plus | About 80% | About 20% | 0.99 | 0.99 | 0.98 | 0.9947 | 1610 (from the same set as TD) | NR | 0.98 | 0.98 | NR |
| Wang et al. (2021) | Non vs. any stage | 36235 | 4813 | NR | 0.972 | 0.984 | 0.9977 | 7492 (from the same set as TD) | NR | 0.982 | 0.985 | 0.9981 |
| Wang et al. (2021) | Non vs. preplus and plus | 13524 | 1866 | NR | 0.909 | 0.984 | 0.9882 | 2718 (from the same set as TD) | NR | 0.918 | 0.97 | 0.9827 |

Abbreviations. ROP, retinopathy of prematurity. Reference standards based on images: RSD, reference standard diagnosis; ICROP, International Classification of ROP. Reference standards based on both images and clinical information: CRYO-ROP, Cryotherapy for Retinopathy of Prematurity; ETROP, Early Treatment for ROP. Grades: N, normal; pre, preplus disease; plus, plus disease; Min, minor; Mil, mild; S, severe. Data sources: i-ROP, Imaging and Informatics in Retinopathy of Prematurity; ART-ROP, Auckland Regional Telemedicine ROP image library; KIDROP, Karnataka Internet Assisted Diagnosis of ROP program. Models: DL, deep learning; CNN, convolutional neural network; DNN, deep neural network; DCNN, deep convolutional neural network. Datasets: TD, training dataset; VD, validation dataset; TED, test dataset; the total dataset comprises TD, VD, and TED. Metrics: ACC, accuracy; SN, sensitivity; SP, specificity; AUC, area under the receiver operating characteristic curve. NR, not reported.
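
For readers less familiar with the accuracy measures abbreviated above, the following minimal sketch shows how ACC, SN, and SP follow from a 2×2 confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical illustrations, not data from any included study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return accuracy (ACC), sensitivity (SN), and specificity (SP).

    tp/fn: diseased images classified as positive/negative.
    tn/fp: normal images classified as negative/positive.
    """
    total = tp + fp + fn + tn
    return {
        "ACC": (tp + tn) / total,  # share of all images classified correctly
        "SN": tp / (tp + fn),      # share of diseased images detected
        "SP": tn / (tn + fp),      # share of normal images correctly cleared
    }


# Hypothetical counts: 100 diseased and 100 normal images.
metrics = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)
print(metrics)
```

With these illustrative counts, sensitivity is 0.90, specificity is 0.95, and accuracy is 0.925.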
The results of primary and subgroup analyses.
| Analysis | Sensitivity (95% CI) | Specificity (95% CI) | PLR (95% CI) | NLR (95% CI) | DOR (95% CI) | AUC (95% CI) | Spearman correlation (P value) |
|---|---|---|---|---|---|---|---|
| Primary analyses | 0.953 (0.946–0.959) | 0.975 (0.973–0.977) | 19.265 (8.431–44.019) | 0.065 (0.040–0.105) | 313.73 (115.85–849.60) | 0.984 (0.978–0.989) | −0.561 (0.030) |
| Validation dataset | 0.934 (0.922–0.945) | 0.973 (0.969–0.977) | 26.232 (6.978–98.616) | 0.076 (0.046–0.125) | 359.58 (94.565–1367.3) | 0.977 (0.968–0.986) | −0.612 (0.060) |
| Test dataset | 0.969 (0.961–0.975) | 0.977 (0.974–0.979) | 22.853 (12.593–41.475) | 0.049 (0.026–0.092) | 522.92 (213.89–1278.4) | 0.987 (0.982–0.992) | −0.280 (0.354) |
| Define ROP | 0.956 (0.949–0.962) | 0.979 (0.977–0.981) | 30.118 (19.225–47.184) | 0.055 (0.033–0.092) | 576.21 (238.54–1391.9) | 0.9895 (0.9849–0.9941) | −0.503 (0.138) |
| Distinguish ROP | 0.931 (0.906–0.952) | 0.856 (0.826–0.882) | 7.927 (2.049–30.674) | 0.097 (0.038–0.252) | 88.655 (13.251–593.13) | 0.9820 (0.9641–0.9999) | −0.600 (0.285) |
Note. PLR, positive likelihood ratio; NLR, negative likelihood ratio; DOR, diagnostic odds ratio.
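
As a reading aid, the sketch below applies the standard single-study definitions of PLR, NLR, and DOR to one sensitivity and specificity pair. The function name `likelihood_ratios` and the input values are hypothetical. The table reports estimates pooled across studies, so substituting the pooled sensitivity and specificity into these formulas need not reproduce the pooled PLR, NLR, or DOR.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) for a single sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                           # diagnostic odds ratio
    return plr, nlr, dor


# Hypothetical single-study values, not taken from the table above.
plr, nlr, dor = likelihood_ratios(sensitivity=0.90, specificity=0.95)
print(f"PLR={plr:.2f}  NLR={nlr:.3f}  DOR={dor:.1f}")
```

A larger PLR and a smaller NLR both indicate better discrimination, and the DOR combines the two into a single number.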
Figure 2. Performance of the DL models for detecting and grading ROP in the primary analyses. Forest plots of sensitivity (a), specificity (b), and diagnostic odds ratio (DOR) (c), each with confidence intervals, used to assess heterogeneity in accuracy estimates across studies. Individual study results plotted in ROC space with the summary receiver operating characteristic (SROC) curve for all included classifiers (d).