Yukun Zhou, Siegfried K Wagner, Mark A Chia, An Zhao, Peter Woodward-Court, Moucheng Xu, Robbert Struyven, Daniel C Alexander, Pearse A Keane.
Abstract
Purpose: To externally validate a deep learning pipeline (AutoMorph) for automated analysis of retinal vascular morphology on fundus photographs. AutoMorph has been made publicly available, facilitating widespread research in ophthalmic and systemic diseases.
Year: 2022 PMID: 35833885 PMCID: PMC9290317 DOI: 10.1167/tvst.11.7.12
Source DB: PubMed Journal: Transl Vis Sci Technol ISSN: 2164-2591 Impact factor: 3.048
Figure 1. Diagram of the proposed AutoMorph pipeline. The input is color fundus photography, and the final output is the set of measured vascular morphology features. The image quality grading and anatomical segmentation modules use deep learning models. Confidence analysis reduces the number of falsely gradable images in the image quality grading module.
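The three-stage flow in Figure 1 can be sketched as a minimal orchestration skeleton. All function names here (`grade_quality`, `segment_anatomy`, `measure_features`) are hypothetical placeholders for illustration, not the actual AutoMorph API.

```python
# Minimal sketch of an AutoMorph-style pipeline flow (Figure 1).
# Function names are hypothetical placeholders, not the AutoMorph API.

def grade_quality(image):
    # Placeholder: an ensemble classifier would return (label, confidence).
    return "gradable", 0.97

def segment_anatomy(image):
    # Placeholder: deep models would return binary-vessel, artery/vein,
    # and optic disc segmentation masks.
    return {"vessels": None, "artery_vein": None, "optic_disc": None}

def measure_features(masks):
    # Placeholder: morphology metrics computed from segmentation masks.
    return {"fractal_dimension": 1.42, "vessel_density": 0.11}

def automorph_pipeline(image, min_confidence=0.75):
    label, confidence = grade_quality(image)
    if label != "gradable" or confidence < min_confidence:
        return None  # reject ungradable or low-confidence images early
    return measure_features(segment_anatomy(image))
```

The key design point from the paper is that quality grading gates the rest of the pipeline: segmentation and feature measurement run only on images the ensemble confidently grades as gradable.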
Characteristics of the Training and External Validation Data
| Type of Data | Dataset Name | Country of Origin | Image Quantity | Device (Manufacturer) |
|---|---|---|---|---|
| Image Quality Grading | ||||
| Training data | EyePACS-Q-train | USA | 12,543 (NR, more than 99%) | A variety of imaging devices, including DRS (CenterVue, Padova, Italy); iCam (Optovue, Fremont, CA); CR1/DGi/CR2 (Canon, Tokyo, Japan); Topcon NW 8 (Topcon, Tokyo, Japan) |
| Internal validation data | EyePACS-Q-test | USA | 16,249 (NR, more than 99%) | — |
| External validation data | DDR test | China | 4,105 (100%) | 42 types of fundus cameras, mainly Topcon D7000, Topcon TRC NW48, D5200 (Nikon, Tokyo, Japan), and Canon CR 2 cameras |
| Binary Vessel Segmentation | ||||
| Training data | DRIVE | Netherlands | 40 (100%) | CR5 non-mydriatic 3CCD camera (Canon) |
| | STARE | USA | 20 (100%) | TRV-50 fundus camera (Topcon) |
| | CHASEDB1 | UK | 28 (0%) | NM-200D handheld fundus camera (Nidek, Aichi, Japan) |
| | HRF | Germany and Czech Republic | 45 (100%) | CF-60UVi camera (Canon) |
| | IOSTAR | Netherlands and China | 30 (53.3%) | EasyScan camera (i-Optics, Rijswijk, Netherlands) |
| | LES-AV | NR | 22 (0%) | Visucam Pro NM fundus camera (Carl Zeiss Meditec, Jena, Germany) |
| External validation data | AV-WIDE | USA | 30 (100%) | 200Tx Ultra-widefield Imaging Device (Optos, Dunfermline, UK) |
| | DR HAGIS | UK | 39 (100%) | TRC-NW6s (Topcon), TRC-NW8 (Topcon), or CR-DGi fundus camera (Canon) |
| Artery/Vein Segmentation | ||||
| Training data | DRIVE-AV | Netherlands | 40 (100%) | CR5 non-mydriatic 3CCD camera (Canon) |
| | HRF-AV | Germany and Czech Republic | 45 (100%) | CF-60UVi camera (Canon) |
| | LES-AV | NR | 22 (0%) | Visucam Pro NM fundus camera (Zeiss) |
| External validation data | IOSTAR-AV | Netherlands and China | 30 (53.3%) | EasyScan camera (i-Optics) |
| Optic Disc Segmentation | ||||
| Training data | REFUGE | China | 800 (100%) | Visucam 500 fundus camera (Zeiss) and CR-2 camera (Canon) |
| | GAMMA | China | 100 (100%) | — |
| External validation data | IDRID | India | 81 (100%) | VX-10α digital fundus camera (Kowa, Las Vegas, NV) |
External validation data were not seen during model training and were used solely to evaluate the trained models' performance on out-of-distribution data with different countries of origin and imaging devices. EyePACS-Q is a subset of EyePACS with image quality grading annotation. NR, not reported.
Image quantity indicates the number of images used in this work; the parentheses show the proportion of macula-centered images.
Although we have evaluated the binary vessel segmentation model on the ultra-widefield retinal fundus dataset AV-WIDE, we recommend using AutoMorph on retinal fundus photographs with a 25° to 60° FOV, as all of the deep learning models were trained on images with a FOV of 25° to 60°, and the preprocessing step is tailored to this FOV.
Evaluated on the disc only, as no cup annotation was available.
Figure 2. Features measured by AutoMorph, including tortuosity, vessel caliber, cup-to-disc ratio, and others. For each image, the optic disc/cup information is measured, including height and width, as well as the cup-to-disc ratio. For binary vessels, the tortuosity, fractal dimension, vessel density, and average width are measured. In addition to these features, arteries/veins are used to measure the caliber features CRAE, CRVE, and AVR by the Hubbard and Knudtson methods.
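The caliber features can be illustrated with a minimal sketch of the Knudtson revised-formula procedure, which AutoMorph names as one of its two caliber methods: branch widths are iteratively paired (largest with smallest) and combined with a branching coefficient of 0.88 for arteries (CRAE) and 0.95 for veins (CRVE) until a single trunk value remains. The helper names below are illustrative, not AutoMorph's code; widths are in pixels, matching the paper's note that image resolution was unknown.

```python
import math

def knudtson_combine(widths, coefficient):
    """Iteratively pair the largest with the smallest width and combine
    them as coefficient * sqrt(w1^2 + w2^2) until one value remains."""
    values = sorted(widths)
    while len(values) > 1:
        next_round = []
        while len(values) > 1:
            smallest, largest = values.pop(0), values.pop(-1)
            next_round.append(coefficient * math.hypot(smallest, largest))
        next_round.extend(values)  # odd leftover carries to the next round
        values = sorted(next_round)
    return values[0]

def avr(artery_widths, vein_widths):
    # Knudtson branching coefficients: 0.88 (arteries), 0.95 (veins).
    crae = knudtson_combine(artery_widths, 0.88)
    crve = knudtson_combine(vein_widths, 0.95)
    return crae / crve
```

For example, two arterial branches of widths 3 and 4 pixels combine to 0.88 × √(3² + 4²) = 4.4 pixels. In AutoMorph, the six largest arteries and veins within Zones B and C feed this calculation, which is why the caliber features are not defined at the whole-image level.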
Figure 3. Confidence analysis for image quality grading. M1 to M8 represent the eight ensemble models. For each image, the predicted category is mapped to gradable or ungradable (good and usable count as gradable; reject as ungradable). The average probability and SD are calculated for the predicted category. (a, b) Two image cases with high confidence in prediction. The case shown in (c) is classified as gradable with a low average probability of 0.619, and the case in (d) has a high SD of 0.191; both are defined as low-confidence images in our work. Although (c) and (d) are preliminarily classified as gradable, the final classification is rectified to ungradable by the confidence threshold.
Validation of Functional Modules and Comparison With Other Methods
| | Image Quality Grading | | | | Artery/Vein Segmentation | |
|---|---|---|---|---|---|---|
| | EyePACS-Q Test | | DDR Test | | IOSTAR-AV | |
| | AutoMorph (Internal) | Comparison | AutoMorph (External) | Comparisonᵃ (Internal) | AutoMorph (External) | Comparison |
| Sensitivity | 0.85 | 0.85 | 1 | 0.93 | 0.64 | 0.79 |
| Specificity | 0.93 | NR | 0.89 | 0.97 | 0.98 | 0.76 |
| Precision | 0.87 | 0.87 | 0.6 | 0.73 | 0.68 | NR |
| Accuracy | 0.92 | 0.92 | 0.91 | 0.99 | 0.96 | 0.78 |
| AUC-ROC | 0.97 | NR | 0.99 | 0.99 | 0.95 | NR |
| F1 score | 0.86 | 0.86 | 0.75 | 0.82 | 0.66 | NR |
| IoU | — | — | — | — | 0.53 | NR |
| | Binary Vessel Segmentation | | | | Optic Disc | |
| | Ultra-widefield: AV-WIDE | | Standard Field: DR HAGIS | | IDRID | |
| | AutoMorph (External) | Comparison | AutoMorph (External) | Comparison | AutoMorph (External) | Comparison |
| Sensitivity | 0.71 | 0.78 | 0.84 | 0.67 | 0.9 | 0.9 |
| Specificity | 0.98 | NR | 0.98 | 0.98 | 0.95 | NR |
| Precision | 0.75 | 0.82 | 0.73 | NR | 0.94 | NR |
| Accuracy | 0.96 | 0.97 | 0.97 | 0.97 | 0.99 | 0.99 |
| AUC-ROC | 0.96 | NR | 0.98 | NR | 0.95 | NR |
| F1 score | 0.73 | 0.8 | 0.78 | 0.71 | 0.94 | NR |
| IoU | 0.57 | NR | 0.64 | NR | 0.91 | 0.85 |
“Internal” indicates that the validation and training data come from the same dataset but are kept separate. “External” means that the validation data come from external datasets. The comparisons are with competitive methods for image quality grading, binary vessel segmentation, artery/vein segmentation, and optic disc segmentation. NR, not reported.
ᵃBecause no comparison method exists for the DDR test set, we compared AutoMorph (external) with the same architecture, EfficientNet-b4, trained on the DDR training data (internal).
Figure 4. Confusion matrices of the grading results on EyePACS-Q test data: (a) before confidence thresholding; (b) after thresholding. Values are normalized by row, so the diagonal shows the correct-classification ratio per class. The red box indicates false gradable (i.e., ungradable images wrongly classified as gradable), and the green box shows the percentage of false ungradable (i.e., gradable images wrongly categorized as ungradable). The false gradable rate in (b) is reduced by 76.2% compared with (a), at the cost of an increased false ungradable rate.
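The row normalization described for Figure 4 is simple to reproduce: each row of the count matrix (one row per true class) is divided by its sum, so the diagonal gives per-class recall and off-diagonal cells give misclassification ratios. The counts below are invented for illustration, not taken from the paper.

```python
def normalize_rows(matrix):
    """Divide each row of a confusion matrix by its row sum, giving
    per-true-class classification ratios (diagonal = per-class recall)."""
    return [[cell / sum(row) for cell in row] for row in matrix]

# Illustrative counts only. Rows = true class (gradable, ungradable);
# columns = predicted class.
counts = [[90, 10],   # 10 gradable images misread as ungradable (false ungradable)
          [ 5, 45]]   # 5 ungradable images misread as gradable (false gradable)

normalized = normalize_rows(counts)
false_gradable_rate = normalized[1][0]    # ungradable predicted gradable
false_ungradable_rate = normalized[0][1]  # gradable predicted ungradable
```

In the paper's terms, confidence thresholding shrinks `false_gradable_rate` (the clinically riskier error, since poor images would otherwise be measured) while allowing `false_ungradable_rate` to grow.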
Figure 5. Visualization results of anatomical segmentation, including binary vessel (first two columns), artery/vein (third column), and optic disc (final column).
Agreement Calculation of Measured Vascular Features Between AutoMorph and Expert Annotation
| | ICC (95% Confidence Interval) | | |
|---|---|---|---|
| | Zone B | Zone C | Whole Image |
| DR HAGIS | |||
| Fractal dimension | 0.94 (0.88–0.97) | 0.98 (0.95–0.99) | 0.94 (0.88–0.97) |
| Vessel density | 0.98 (0.96–0.99) | 0.97 (0.94–0.99) | 0.94 (0.88–0.97) |
| Average width | 0.95 (0.89–0.98) | 0.96 (0.93–0.98) | 0.97 (0.95–0.99) |
| Distance tortuosity | 0.80 (0.59–0.91) | 0.85 (0.69–0.93) | 0.86 (0.73–0.93) |
| Squared curvature tortuosity | 0.68 (0.34–0.85) | 0.88 (0.75–0.94) | 0.84 (0.68–0.92) |
| Tortuosity density | 0.89 (0.77–0.95) | 0.70 (0.38–0.86) | 0.87 (0.74–0.93) |
| IOSTAR-AV | |||
| CRAE (Hubbard) | 0.81 (0.56–0.92) | 0.82 (0.57–0.91) | — |
| CRVE (Hubbard) | 0.80 (0.54–0.91) | 0.78 (0.52–0.89) | — |
| AVR (Hubbard) | 0.87 (0.69–0.94) | 0.81 (0.66–0.92) | — |
| CRAE (Knudtson) | 0.76 (0.45–0.90) | 0.75 (0.44–0.89) | — |
| CRVE (Knudtson) | 0.85 (0.67–0.94) | 0.86 (0.58–0.90) | — |
| AVR (Knudtson) | 0.85 (0.66–0.94) | 0.82 (0.51–0.91) | — |
The agreement of vessel caliber was validated on the IOSTAR-AV dataset; the other metrics were validated on the DR HAGIS dataset. Because the caliber features rely on the six largest arteries and veins in Zones B and C, there is no caliber feature at the whole-image level.
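Of the tortuosity variants in the table, distance tortuosity has the simplest conventional definition: the arc length of the vessel centreline divided by the straight-line (chord) distance between its endpoints, so a perfectly straight vessel scores 1.0. A minimal sketch of that standard definition (not AutoMorph's exact implementation):

```python
import math

def distance_tortuosity(points):
    """Arc length of a polyline divided by the chord between its endpoints.
    A straight vessel gives 1.0; more winding paths give larger values."""
    arc = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    return arc / chord
```

For example, a centreline that runs 3 pixels right then 4 pixels up has arc length 7 and chord 5, giving a distance tortuosity of 1.4. The squared-curvature and tortuosity-density variants in the table instead weight local bending along the path rather than the endpoint displacement.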
Figure 6. Bland–Altman plots of vascular feature agreement between expert annotation and AutoMorph segmentation in Zone B. Features in the first two rows (e.g., tortuosity, fractal dimension) were calculated from the binary vessel segmentation map for DR HAGIS; the last-row features (caliber) were measured from the artery/vein segmentation map for IOSTAR-AV. In each subplot, the central line indicates the mean difference and the two dashed lines represent the 95% limits of agreement. The unit of average width, CRAE, and CRVE is pixels, as image resolution was unknown.
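The quantities plotted in Figure 6 follow the standard Bland–Altman construction: for paired measurements (expert vs. AutoMorph), compute the per-image differences, then the mean difference (bias) and the 95% limits of agreement as mean ± 1.96 × SD of the differences. A minimal sketch:

```python
from statistics import mean, stdev

def bland_altman(expert, automorph):
    """Return (bias, (lower, upper)) for paired measurements: the mean
    difference and the 95% limits of agreement, bias +/- 1.96 * SD."""
    diffs = [e - a for e, a in zip(expert, automorph)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

When the differences cluster tightly around zero, as in the high-ICC features of the table above, the bias line sits near zero and the limits of agreement are narrow relative to the measurement scale.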
Figure 7. Interface of AutoMorph on Google Colaboratory. After uploading images and clicking the “run” button, all processes are executed and results stored, requiring no human intervention. The left side shows the file directory, and the bottom right lists five examples with a subset of the measured features.