Fares Antaki, Ghofril Kahwati, Julia Sebag, Razek Georges Coussa, Anthony Fanous, Renaud Duval, Mikael Sebag.
Abstract
We aimed to assess the feasibility of machine learning (ML) algorithm design to predict proliferative vitreoretinopathy (PVR) by ophthalmologists without coding experience using automated ML (AutoML). The study was a retrospective cohort study of 506 eyes that underwent pars plana vitrectomy for rhegmatogenous retinal detachment (RRD) by a single surgeon at a tertiary-care hospital between 2012 and 2019. Two ophthalmologists without coding experience used an interactive application in MATLAB to build and evaluate ML algorithms for the prediction of postoperative PVR using clinical data from the electronic health records. The clinical features associated with postoperative PVR were determined by univariate feature selection. The area under the curve (AUC) for predicting postoperative PVR was better for models that included pre-existing PVR as an input. The quadratic support vector machine (SVM) model built using all selected clinical features had an AUC of 0.90, a sensitivity of 63.0%, and a specificity of 97.8%. An optimized Naïve Bayes algorithm that did not include pre-existing PVR as an input feature had an AUC of 0.81, a sensitivity of 54.3%, and a specificity of 92.4%. In conclusion, the development of ML models for the prediction of PVR by ophthalmologists without coding experience is feasible. Input from a data scientist might still be needed to tackle class imbalance, a common challenge in ML classification using real-world clinical data.
Year: 2020 PMID: 33177614 PMCID: PMC7658348 DOI: 10.1038/s41598-020-76665-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Collected clinical variables from the EHR.
| Category | Variable |
|---|---|
| Sociodemographic characteristics | Age in years [continuous], sex [binary] |
| Past ocular history | Previous ocular surgery [binary] |
| History of present illness | Duration of symptoms in days [continuous] |
| Retinal detachment characteristics | Subtotal/total retinal detachment (≥ 3 quadrants) [binary], macular status [binary], pre-existing PVR [binary], vitreous hemorrhage [binary], number of retinal breaks [continuous], giant retinal tear [binary] |
| Other examination findings | Macular hole [binary], anterior uveitis [binary], number of quadrants of lattice degeneration [polynomial], intraocular pressure in mmHg using Goldmann applanation [continuous], postoperative lens status [binary] |
The variables included sociodemographic and past ocular history data as well as retinal detachment characteristics and other relevant examination findings.
EHR electronic health records, PVR proliferative vitreoretinopathy.
Figure 1. This diagram shows the workflow for predictive modeling of proliferative vitreoretinopathy (PVR) using automated machine learning (AutoML). After data preparation and class balancing using random undersampling (RUS), the two ophthalmologists used the Classification Learner App to design support vector machine (SVM) and Naïve Bayes (NB) algorithms. Details about the feature sets and models are described in the methodology. In parallel to the design by the two ophthalmologists, manually coded algorithms were also prepared as a benchmarking measure.
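The class-balancing step in this workflow can be illustrated with a minimal Python sketch of random undersampling. This is not the authors' MATLAB implementation: the function name `random_undersample`, the fixed seed, and the 1:2 minority-to-majority ratio are assumptions made for illustration. The ratio is only inferred from the counts in the performance table (46 PVR and 92 no-PVR eyes per model); the record does not state it explicitly.

```python
import random


def random_undersample(labels, ratio=2, seed=0):
    """Keep every minority row (label 1) and a random subset of majority rows (label 0).

    `ratio` is the assumed number of majority rows kept per minority row.
    Returns the sorted indices of the rows that are retained.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more majority rows than actually exist.
    n_keep = min(len(majority), ratio * len(minority))
    kept_majority = rng.sample(majority, n_keep)
    return sorted(minority + kept_majority)


# Cohort sizes from the study: 46 eyes with PVR, 460 without.
labels = [1] * 46 + [0] * 460
kept = random_undersample(labels)
n_pvr = sum(labels[i] for i in kept)
n_no_pvr = len(kept) - n_pvr
print(n_pvr, n_no_pvr)
```

Undersampling discards majority-class rows rather than synthesizing minority-class ones, so the balanced set contains only observed eyes, at the cost of a smaller sample.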
Clinical characteristics of patients in the PVR and no PVR groups.
| Variable | PVR (n = 46) | No PVR (n = 460) | P value |
|---|---|---|---|
| Age (years) | | | < 0.001 |
| Mean | 68.78 | 59.93 | |
| Standard deviation | 8.448 | 11.931 | |
| Median | 68.50 | 61.00 | |
| Range | 48–90 | 18–95 | |
| Sex | | | 0.743 |
| Male | 30 (65.2%) | 311 (67.6%) | |
| Female | 16 (34.8%) | 149 (32.4%) | |
| Previous ocular surgery | | | 0.650 |
| Yes | 7 (15.2%) | 60 (13.0%) | |
| No | 39 (84.8%) | 400 (87.0%) | |
| Duration of symptoms (days) | | | < 0.001 |
| Mean | 35.05 | 16.73 | |
| Standard deviation | 62.429 | 39.598 | |
| Median | 21.00 | 7.00 | |
| Range | 1–365 | 0–365 | |
| Subtotal/total retinal detachment (≥ 3 quadrants) | | | < 0.001 |
| Yes | 22 (47.8%) | 59 (12.8%) | |
| No | 24 (52.2%) | 401 (87.2%) | |
| Macular status | | | 0.001 |
| On | 8 (17.4%) | 193 (42.0%) | |
| Off | 38 (82.6%) | 267 (58.0%) | |
| Pre-existing PVR | | | < 0.001 |
| Yes | 17 (37.0%) | 1 (0.2%) | |
| No | 29 (63.0%) | 459 (99.8%) | |
| Vitreous hemorrhage | | | 0.002 |
| Yes | 10 (21.7%) | 31 (6.7%) | |
| No | 36 (78.3%) | 429 (93.3%) | |
| Number of retinal breaks | | | 0.323 |
| Mean | 2.41 | 2.03 | |
| Standard deviation | 2.296 | 1.569 | |
| Median | 2.00 | 1.00 | |
| Range | 1–10 | 0–9 | |
| Giant retinal tear | | | 0.023 |
| Yes | 4 (8.7%) | 9 (2.0%) | |
| No | 42 (91.3%) | 451 (98.0%) | |
| Macular hole | | | 0.714 |
| Yes | 1 (2.2%) | 23 (5.0%) | |
| No | 45 (97.8%) | 437 (95.0%) | |
| Anterior uveitis | | | 0.006 |
| Yes | 3 (6.5%) | 2 (0.4%) | |
| No | 43 (93.5%) | 458 (99.6%) | |
| Number of quadrants of lattice degeneration | | | 0.441 |
| 0 | 31 (67.4%) | 301 (65.4%) | |
| 1 | 12 (26.1%) | 91 (19.8%) | |
| 2 | 1 (2.2%) | 46 (10.0%) | |
| 3 | 1 (2.2%) | 14 (3.0%) | |
| 4 | 1 (2.2%) | 8 (1.7%) | |
| Intraocular pressure (mmHg) | | | 0.002 |
| Mean | 11.40 | 14.69 | |
| Standard deviation | 5.065 | 5.039 | |
| Median | 12.00 | 15.00 | |
| Range | 0–20 | 0–60 | |
| Postoperative lens status | | | 0.759 |
| Phakic | 25 (54.3%) | 237 (51.5%) | |
| Pseudophakic/aphakic | 21 (45.7%) | 223 (48.5%) | |
The "PVR" and "No PVR" groups were compared using the Mann–Whitney U test for all continuous variables. For categorical variables, we used the chi-square test, or Fisher's exact test when any cell had an expected count < 5.
PVR proliferative vitreoretinopathy, RRD rhegmatogenous retinal detachment.
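The rule for choosing between the chi-square and Fisher's exact tests can be sketched in a few lines of Python. The helper names (`expected_counts`, `choose_test`, `pearson_chi_square_p`) are hypothetical, and the p-value function assumes a 2 × 2 table with no continuity correction; the record does not say which variant the authors used, so the result is only expected to be close to the reported values.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table, threshold=5.0):
    """Pick Fisher's exact test when any expected count falls below the threshold."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher_exact" if smallest < threshold else "chi_square"


def pearson_chi_square_p(table):
    """P value of the Pearson chi-square test for a 2 x 2 table (1 degree of freedom).

    Assumption: no continuity correction. With 1 degree of freedom the
    upper-tail probability of the statistic x is erfc(sqrt(x / 2)).
    """
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return math.erfc(math.sqrt(stat / 2))


# Rows are categories, columns are [PVR, No PVR]; counts come from the table above.
sex = [[30, 311], [16, 149]]              # male, female
giant_retinal_tear = [[4, 9], [42, 451]]  # yes, no

print(choose_test(sex), round(pearson_chi_square_p(sex), 2))
print(choose_test(giant_retinal_tear))
```

For sex, the smallest expected count is well above 5, so the chi-square test applies; for giant retinal tear, the expected number of PVR eyes with a tear is about 1.2, so the rule selects Fisher's exact test.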
Summary of the discriminative performance of all 4 ML models.
| Model | TP | FP | TN | FN | AUC | F1 | SN (%) | SP (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Model 1: Quadratic SVM | 29 | 2 | 90 | 17 | 0.90 | 0.75 | 63.0 | 97.8 | 93.5 | 84.1 |
| Model 2: Optimized NB | 32 | 4 | 88 | 14 | 0.86 | 0.78 | 69.6 | 95.7 | 88.9 | 86.3 |
| Model 3: Optimized SVM | 21 | 5 | 87 | 25 | 0.81 | 0.58 | 45.7 | 94.6 | 80.8 | 77.7 |
| Model 4: Optimized NB | 25 | 7 | 85 | 21 | 0.81 | 0.64 | 54.3 | 92.4 | 78.1 | 80.2 |
ML machine learning, TP true positives, FP false positives, TN true negatives, FN false negatives, AUC area under the receiver operating characteristic curve, F1 F1 score, SN sensitivity, SP specificity, PPV positive predictive value, NPV negative predictive value, SVM support vector machine, NB Naïve Bayes.
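The threshold-based metrics in this table follow directly from the four confusion-matrix counts. The sketch below recomputes them for Model 1 using the standard definitions; the class and function names are illustrative and are not taken from the study. The AUC cannot be recovered this way because it depends on the full set of classifier scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def summarize(c: ConfusionCounts) -> dict:
    """Compute the table's threshold-based metrics as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Model 1 (quadratic SVM) counts from the table.
model_1 = ConfusionCounts(tp=29, fp=2, tn=90, fn=17)
metrics = summarize(model_1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The same function applied to the other three rows gives a quick consistency check between the reported counts and percentages.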
Figure 2. Receiver operating characteristic (ROC) curves of the discriminative performance of Models 1–4. Models 1 (quadratic Support Vector Machine [SVM]) and 2 (optimized Naïve Bayes [NB]) used Feature Set 1, which included all clinically important features. Models 3 (optimized SVM) and 4 (optimized NB) used Feature Set 2, which did not include pre-existing PVR as an input feature.
Figure 3. Four representative cases are shown, highlighting correct classifications and misclassifications by Model 1. Case 1 illustrates a correctly classified case of PVR in an eye with a total rhegmatogenous retinal detachment (RRD) and late presentation prior to surgery. Case 3 shows a common case of simple macula-off RRD with no obvious risk factors for PVR, correctly classified as “No PVR”. In case 2, the algorithm misclassified the case as “PVR”, probably due to the long duration of symptoms and the presence of vitreous hemorrhage. In case 4, the algorithm failed to predict PVR despite the presence of a giant retinal tear. The normal intraocular pressure, absence of pre-existing PVR, and macula-on status might have influenced the classifier’s decision.