Mingyu Kim, You Na Kim, Miso Jang, Jeongeun Hwang, Hong-Kyu Kim, Sang Chul Yoon, Yoon Jeon Kim, Namkug Kim.
Abstract
Realistic image synthesis based on deep learning is an invaluable technique for developing high-performance computer-aided diagnosis systems while protecting patient privacy. However, training a generative adversarial network (GAN) for image synthesis remains challenging because of the large amounts of data required to learn the various kinds of image features. This study aims to synthesize retinal images that are indistinguishable from real images and to evaluate the efficacy of synthesized images of a specific disease for augmenting class-imbalanced datasets. The synthesized images were validated via image Turing tests, qualitative analysis by retinal specialists, and quantitative analyses of the amounts and signal-to-noise ratios (SNR) of vessels. The efficacy of the synthesized images was verified by deep learning-based classification performance. The Turing test yielded accuracy, sensitivity, and specificity of 54.0 ± 12.3%, 71.1 ± 18.8%, and 36.9 ± 25.5%, respectively. Here, sensitivity is the proportion of real images that were correctly identified as real. Comparisons of vessel amount and average SNR showed differences of 0.43% and 1.5%, respectively, between real and synthesized images. Classification performance after augmentation with synthesized images exceeded that of the real-only dataset at every imbalance ratio. Our study shows that realistic retinal images were successfully generated with insignificant differences between the real and synthesized images, and that the approach has great potential for practical applications.
Year: 2022 PMID: 36243746 PMCID: PMC9569369 DOI: 10.1038/s41598-022-20698-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Examples of randomly synthesized images.
Figure 2. High-resolution synthesized retinal photographs. (a) Synthesized retinal images that most ophthalmologists judged to be “real” images. (b) Synthesized retinal images that most ophthalmologists judged to be “synthesized” images. Numbers in parentheses indicate how many of the 40 ophthalmologist examiners chose the image as “real” or “synthesized.”
Figure 3. Representative randomly synthesized images demonstrating ERM. Cellophane-like membrane formation at the macula and perifoveal vascular tortuosity are shown in the synthetic fundus images. Heatmaps derived using Grad-CAM (https://github.com/jacobgil/pytorch-grad-cam) correspond to these characteristic ERM features.
Baseline characteristics of the datasets used to evaluate classification performance.
| Dataset | Normal: No. of images | Normal: Age, years (SD) | Normal: Sex, N (M, F) | ERM: No. of images | ERM: Age, years (SD) | ERM: Sex, N (M, F) | No. ratio (Normal:ERM) |
|---|---|---|---|---|---|---|---|
| Train | 691 | 49.79 (9.28) | 691, 425 | 691 | 58.75 (8.06) | 401, 490 | 1:1 |
| Train | 1382 | 49.73 (9.13) | 864, 518 | 691 | 58.75 (8.06) | 401, 490 | 1:0.5 |
| Train | 1382 | 49.73 (9.13) | 864, 518 | 552 | 58.94 (7.93) | 314, 238 | 1:0.4 |
| Train | 1382 | 49.73 (9.13) | 864, 518 | 414 | 59.10 (7.74) | 236, 178 | 1:0.3 |
| Train | 1382 | 49.73 (9.13) | 864, 518 | 276 | 59.37 (7.41) | 151, 125 | 1:0.2 |
| Train | 1382 | 49.73 (9.13) | 864, 518 | 138 | 59.00 (7.55) | 77, 61 | 1:0.1 |
| Valid | 197 | 49.82 (9.28) | 108, 89 | 197 | 59.18 (8.79) | 110, 87 | 1:1 |
| Test | 396 | 49.47 (8.83) | 252, 144 | 396 | 59.34 (8.57) | 225, 171 | 1:1 |
Image Turing test results. Accuracy, sensitivity, and specificity of each group and each method.
| Method | Group ID | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|
| Method1 | 1 | 0.51 (0.44–0.58) | 0.69 (0.56–0.79) | 0.33 (0.20–0.50) |
| Method1 | 2 | 0.48 (0.44–0.51) | 0.68 (0.50–0.81) | 0.28 (0.16–0.45) |
| Method1 | 3 | 0.48 (0.43–0.54) | 0.63 (0.51–0.74) | 0.33 (0.24–0.44) |
| Method1 | 4 | 0.58 (0.51–0.65) | 0.78 (0.68–0.85) | 0.38 (0.22–0.57) |
| Method1 | 5 | 0.65 (0.55–0.74) | 0.78 (0.63–0.88) | 0.52 (0.31–0.71) |
| Method2 | 1 | 0.51 (0.44–0.58) | 0.69 (0.56–0.79) | 0.33 (0.20–0.50) |
| Method2 | 2 | 0.48 (0.45–0.51) | 0.65 (0.55–0.75) | 0.31 (0.22–0.40) |
| Method2 | 3 | 0.62 (0.55–0.68) | 0.78 (0.69–0.85) | 0.45 (0.31–0.60) |
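As a minimal sketch of how the three Turing test metrics relate to an examiner's answers, the snippet below treats "real" as the positive class, in line with the abstract's description of sensitivity as correctly finding real images. The `TuringCounts` class, the `turing_metrics` function, and the example counts are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class TuringCounts:
    """Answer counts for one examiner or group (hypothetical structure)."""
    real_called_real: int    # real image judged real (true positive)
    real_called_synth: int   # real image judged synthesized (false negative)
    synth_called_synth: int  # synthesized image judged synthesized (true negative)
    synth_called_real: int   # synthesized image judged real (false positive)


def turing_metrics(c: TuringCounts) -> dict:
    """Accuracy, sensitivity, and specificity with 'real' as the positive class."""
    n_real = c.real_called_real + c.real_called_synth
    n_synth = c.synth_called_synth + c.synth_called_real
    return {
        "accuracy": (c.real_called_real + c.synth_called_synth) / (n_real + n_synth),
        "sensitivity": c.real_called_real / n_real,
        "specificity": c.synth_called_synth / n_synth,
    }


# Made-up counts for illustration only: 50 real and 50 synthesized images.
example = TuringCounts(real_called_real=35, real_called_synth=15,
                       synth_called_synth=18, synth_called_real=32)
metrics = turing_metrics(example)
print(metrics)
```

Under this convention, a specificity well below 0.5 means that examiners labeled most synthesized images as real, which is the pattern reported for most groups in the table above.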
P values from the logistic GEE analysis for each group and each method.
| Method | Group ID | Accuracy | Sensitivity | Specificity | Elapsed time |
|---|---|---|---|---|---|
| Method1 | 1 | Ref | Ref | Ref | Ref |
| Method1 | 2 | 0.44 | 0.92 | 0.62 | 0.21 |
| Method1 | 3 | 0.53 | 0.51 | 0.97 | 0.85 |
| Method1 | 4 | 0.18 | 0.20 | 0.71 | 0.11 |
| Method1 | 5 | 0.03 | 0.31 | 0.18 | 0.48 |
| Method2 | 1 | Ref | Ref | Ref | Ref |
| Method2 | 2 | 0.45 | 0.67 | 0.75 | 0.39 |
| Method2 | 3 | 0.03 | 0.18 | 0.29 | 0.18 |
Classification performance for various ratios of normal to ERM images. AUC, accuracy, sensitivity, and specificity are shown for each ratio with and without added synthesized ERM images.
| No. ratio of real dataset (Normal:ERM) | Add synthesized ERM* | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| 1:1 | No | 0.994 | 0.971 | 0.965 | 0.977 |
| 1:0.5 | No | 0.988 | 0.963 | 0.955 | 0.972 |
| 1:0.5 | Yes | 0.989 | 0.970 | 0.957 | 0.982 |
| 1:0.4 | No | 0.983 | 0.943 | 0.909 | 0.977 |
| 1:0.4 | Yes | 0.994 | 0.970 | 0.942 | 0.997 |
| 1:0.3 | No | 0.984 | 0.905 | 0.826 | 0.985 |
| 1:0.3 | Yes | 0.987 | 0.968 | 0.947 | 0.990 |
| 1:0.2 | No | 0.943 | 0.874 | 0.843 | 0.904 |
| 1:0.2 | Yes | 0.966 | 0.914 | 0.904 | 0.924 |
| 1:0.1 | No | 0.735 | 0.559 | 0.174 | 0.944 |
| 1:0.1 | Yes | 0.909 | 0.739 | 0.508 | 0.970 |
*Yes means that synthesized ERM images were added to balance the number of ERM images against the normal dataset.
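The arithmetic behind this balancing step can be sketched as follows. The training counts come from the baseline-characteristics table, reading the 1,382 normal training images as shared by the 1:0.5 through 1:0.1 rows. The snippet assumes that "balancing" means topping up the ERM class until it matches the normal count; the source does not state the exact number of synthesized images used, so the results are illustrative. The names `NORMAL_TRAIN`, `ERM_TRAIN_BY_RATIO`, and `synthesized_needed` are hypothetical.

```python
# Real training counts from the baseline-characteristics table.
NORMAL_TRAIN = 1382
ERM_TRAIN_BY_RATIO = {
    "1:0.5": 691,
    "1:0.4": 552,
    "1:0.3": 414,
    "1:0.2": 276,
    "1:0.1": 138,
}


def synthesized_needed(n_normal: int, n_erm_real: int) -> int:
    """Synthesized ERM images needed to reach a 1:1 ratio.

    Assumption: balancing tops up the minority class to the majority count.
    """
    return max(n_normal - n_erm_real, 0)


needed = {
    ratio: synthesized_needed(NORMAL_TRAIN, n_erm)
    for ratio, n_erm in ERM_TRAIN_BY_RATIO.items()
}

for ratio, count in needed.items():
    print(f"{ratio}: add {count} synthesized ERM images")
```

Under this assumption, the 1:0.1 setting would rely on synthesized images for roughly 90% of its ERM training examples, which is also the row where the table shows the largest gain in sensitivity.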
Figure 4. Group-averaged performance of the image Turing test: (a) accuracy, (b) sensitivity, (c) specificity, (d) elapsed time for each image. Numerical values of average estimates and their standard deviations are also shown.