Tae Keun Yoo, Ik Hee Ryu, Geunyoung Lee, Youngnam Kim, Jin Kuk Kim, In Sik Lee, Jung Sub Kim, Tyler Hyungtaek Rim.
Abstract
Recently, it has become increasingly important to screen candidates for corneal refractive surgery in order to prevent complications. To date, there is no definitive screening method that eliminates the possibility of misdiagnosis. We evaluated machine learning as a clinical decision support tool for determining suitability for corneal refractive surgery. A machine learning architecture was built to identify surgical candidates by combining large multi-instrument patient data with the clinical decisions of highly experienced experts. Five heterogeneous algorithms were used to predict candidates for surgery, and an ensemble classifier was subsequently developed to improve performance. Training (10,561 subjects) and internal validation (2640 subjects) were conducted using subjects who visited between 2016 and 2017; external validation (5279 subjects) was performed using subjects who visited in 2018. The best model, the ensemble classifier, achieved high prediction performance, with areas under the receiver operating characteristic curve of 0.983 (95% CI, 0.977–0.987) and 0.972 (95% CI, 0.967–0.976) in the internal and external validation sets, respectively. The machine learning models were statistically superior to classic methods, including the percentage of tissue ablated and the Randleman ectasia risk score. Our model was able to correctly reclassify a patient with postoperative ectasia into the ectasia-risk group. Machine learning algorithms using a wide range of preoperative information achieved a performance comparable to that of experienced experts in screening candidates for corneal refractive surgery. An automated machine learning analysis of preoperative data can support safe and reliable clinical decisions for refractive surgery.
Keywords: Eye manifestations; Machine learning
Year: 2019 PMID: 31304405 PMCID: PMC6586803 DOI: 10.1038/s41746-019-0135-8
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
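The ensemble approach described in the abstract, which combines five heterogeneous algorithms, can be illustrated with a soft-voting classifier. This is a minimal sketch, not the authors' exact pipeline: hyperparameters, the synthetic stand-in data, and the use of scikit-learn's `VotingClassifier` are all assumptions, since the clinical dataset and implementation details are not reproduced here.

```python
# Hedged sketch: a soft-voting ensemble of the five algorithm families named in
# the abstract (SVM, artificial neural network, random forest, AdaBoost, and a
# LASSO-style L1 logistic model), trained on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Imbalanced synthetic data mimics the ~9.5% contraindication rate in the study.
X, y = make_classification(n_samples=2000, n_features=40, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
        ("ann", make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0))),
        ("rf", RandomForestClassifier(random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),
        ("lasso", make_pipeline(StandardScaler(),
                                LogisticRegression(penalty="l1", solver="liblinear", C=0.1))),
    ],
    voting="soft",  # average predicted probabilities across the five models
)
ensemble.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, ensemble.predict_proba(X_te)[:, 1])
print(f"ensemble AUC: {auc:.3f}")
```

Soft voting averages each member's predicted probabilities, which is one common way such heterogeneous ensembles are combined; the paper's actual combination rule may differ.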
Fig. 1 Schematic illustrating the purpose of this study
Characteristics of the subjects in this study for training and validation data
| Variable | Training set (n = 10,561) | Internal validation set (n = 2640) | External validation set (n = 5279) | P valuea |
|---|---|---|---|---|
| Age (years) | 27.94 ± 6.12 | 27.89 ± 6.10 | 26.23 ± 6.51 | <.001 |
| Sex, female (%) | 5609 (53.1) | 1374 (52.0) | 2879 (54.5) | .081 |
| Spherical equivalent (Diopter) | −4.56 ± 2.24 | −4.55 ± 2.20 | −4.80 ± 2.28 | <.001 |
| CDVA (logMAR) | −0.015 ± 0.042 | −0.016 ± 0.043 | 0.001 ± 0.041 | <.001 |
| IOP (mmHg) | 15.20 ± 4.81 | 15.25 ± 5.47 | 15.16 ± 3.06 | .008 |
| Central corneal thickness (μm) | 541.86 ± 31.54 | 541.82 ± 31.93 | 542.80 ± 33.38 | .070 |
| NIBUT (seconds) | 6.87 ± 6.60 | 6.90 ± 6.67 | 6.83 ± 5.93 | <.001 |
| Corneal refractive surgery | | | | |
| LASIK (%) | 3630 (34.4) | 914 (34.6) | 1579 (29.9) | <.001 |
| LASEK (%) | 2891 (27.4) | 729 (27.6) | 1273 (24.1) | <.001 |
| SMILE (%) | 3036 (28.7) | 746 (28.3) | 2052 (38.8) | <.001 |
| Contraindication cases for surgery (%) | 1004 (9.5) | 251 (9.5) | 375 (7.1) | <.001 |
CDVA corrected distance visual acuity, IOP intraocular pressure, LASEK laser epithelial keratomileusis, LASIK laser in situ keratomileusis, NIBUT noninvasive break-up time, SMILE small incision lenticule extraction
aComparison using the Kruskal−Wallis test and chi-square test
Fig. 2 Heatmaps representing the predictive performance (AUC) of feature selection and machine learning methods to predict candidates for corneal refractive surgery. This figure shows the results from the tenfold cross-validation procedure. a Support vector machine. b Artificial neural networks. c Random forest. d Least absolute shrinkage and selection operator (LASSO). e AdaBoost
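The feature selection step evaluated in Fig. 2 can be sketched with a univariate filter feeding a downstream classifier. This is a hedged illustration only: the choice of mutual-information ranking, the value of `k`, and the synthetic data are assumptions, as the figure does not specify which selection method each cell used.

```python
# Hedged sketch of feature selection ahead of a classifier, evaluated with
# tenfold cross-validated AUC (as in Fig. 2). Synthetic data stands in for the
# multi-instrument clinical measurements.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=50, n_informative=10, random_state=0)
pipe = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=15),  # keep the 15 highest-scoring features
    SVC(),
)
scores = cross_val_score(pipe, X, y, cv=10, scoring="roc_auc")
print(f"mean tenfold AUC with feature selection: {scores.mean():.3f}")
```

Placing the selector inside the pipeline ensures features are re-selected within each fold, avoiding leakage from the held-out data into the selection step.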
Classification performance of machine learning models to predict candidates for corneal refractive surgery using the tenfold cross-validation in the training set
| Model | AUC (95% CI) | Accuracy (%) | Sensitivity (%) | Specificity (%) | P valuea | Duncan subgroupb |
|---|---|---|---|---|---|---|
| Without feature selection | | | | | | |
| SVM | 0.612 (0.603–0.621) | 55.2 (54.3–56.2) | 54.7 (53.7–55.7) | 60.8 (57.7–63.8) | <.001 | G |
| ANN | 0.824 (0.818–0.833) | 75.1 (74.2–75.9) | 75.3 (74.4–76.2) | 72.9 (70.0–75.6) | <.001 | F |
| RF | 0.966 (0.963–0.970) | 89.6 (88.9–90.1) | 89.4 (88.8–90.0) | 91.0 (89.1–92.7) | <.001 | B, C |
| AdaBoost | 0.962 (0.958–0.965) | 89.0 (88.4–89.6) | 89.0 (88.4–89.6) | 89.2 (87.2–91.1) | <.001 | B, C |
| LASSO | 0.818 (0.811–0.825) | 76.9 (76.1–77.7) | 77.6 (76.7–78.4) | 70.9 (68.0–73.7) | <.001 | F |
| With feature selection | | | | | | |
| SVM | 0.963 (0.959–0.966) | 90.1 (89.5–90.7) | 90.2 (89.6–90.8) | 89.6 (87.6–91.5) | <.001 | C |
| ANN | 0.972 (0.969–0.975) | 91.8 (91.2–92.3) | 91.9 (91.3–92.4) | 90.7 (88.8–92.5) | .004 | B |
| RF | 0.981 (0.978–0.983) | 92.7 (92.2–93.2) | 92.6 (92.1–93.1) | 93.6 (91.9–95.1) | Reference | A |
| AdaBoost | 0.962 (0.958–0.965) | 89.0 (88.4–89.6) | 89.0 (88.4–89.6) | 89.2 (87.2–91.1) | <.001 | B, C |
| LASSO | 0.938 (0.932–0.941) | 87.3 (86.7–88.0) | 87.5 (86.8–88.1) | 86.0 (83.7–88.1) | <.001 | D |
| Ensemble | 0.983 (0.980–0.985) | 94.3 (93.8–94.7) | 94.5 (94.0–94.9) | 92.5 (90.7–94.1) | .579 | A |
| PTA | 0.827 (0.820–0.835) | 74.6 (73.7–75.4) | 74.6 (73.7–75.5) | 74.2 (71.4–76.9) | <.001 | F |
| Randleman score | 0.897 (0.892–0.903) | 74.6 (73.7–75.4) | 74.6 (73.7–75.5) | 74.2 (71.4–76.9) | <.001 | E |
ANN artificial neural networks, AUC area under curve, CI confidence interval, LASSO least absolute shrinkage and selection operator, PTA percentage of tissue ablated, RF random forest, SVM support vector machine
aComparison of receiver operating characteristics curves with the single best technique (random forest with feature selection) according to the Delong test
bThe different letters (A, B, C, D, E, F, and G) indicate statistically different means according to Duncan’s multiple range test using the AUCs. The subgroup A (ensemble and random forest with feature selection) was significantly superior to other subgroups. The machine learning techniques with feature selection (A, B, C, and D) were significantly superior to the classic methods (E and F)
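The tenfold cross-validation protocol behind the table above can be sketched as follows. This is a minimal illustration under stated assumptions: synthetic stand-in data, a random forest as the example model, and a normal-approximation 95% CI over folds, which may differ from the authors' exact interval procedure (the DeLong comparisons in the table are not reproduced).

```python
# Hedged sketch: tenfold stratified cross-validation of one model, reporting
# the mean AUC with a normal-approximation 95% CI over the fold AUCs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=2000, n_features=40, weights=[0.9, 0.1], random_state=0)
aucs = []
for tr, te in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
    model = RandomForestClassifier(random_state=0).fit(X[tr], y[tr])
    aucs.append(roc_auc_score(y[te], model.predict_proba(X[te])[:, 1]))

mean_auc = np.mean(aucs)
half_width = 1.96 * np.std(aucs, ddof=1) / np.sqrt(len(aucs))  # 95% CI half-width
print(f"AUC {mean_auc:.3f} (95% CI, {mean_auc - half_width:.3f}-{mean_auc + half_width:.3f})")
```

Stratified folds preserve the class ratio in each split, which matters here given that contraindication cases make up under 10% of subjects.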
Fig. 3 The ROC curves for the machine learning algorithms and classic screening methods. a The ROC curves of the internal validation set. b The ROC curves of the external validation set. The machine learning classifiers include random forest (RF), AdaBoost, artificial neural networks (ANN), and the ensemble classifier. The classic methods include the percentage of tissue ablated (PTA) and the Randleman ectasia risk score
Fig. 4 The classification performance of high-risk subgroups according to the tenfold cross-validation results. The performance was measured as the average AUC. The error bars represent the 95% confidence intervals. a Performances in the high myopia group. b Performances in the high astigmatism group. c Performances in the thin corneal thickness group
Fig. 5 Outcome value histograms of the ensemble machine learning technique in the tenfold cross-validation. The misclassified samples with an opposite outcome value are shown
Fig. 6 Machine learning technique performance in the ectasia-risk groups, including post-LASIK ectasia, keratoconus, and forme fruste keratoconus patients. a Accuracy in each ectasia-risk group. b ROC curves for classification between the normal control (no postoperative ectasia, N = 9556) and total ectasia-risk group (N = 153)
Fig. 7 The architecture of the proposed machine learning system to predict candidates for corneal refractive surgery