Literature DB >> 32818085

Evaluating the Performance of Various Machine Learning Algorithms to Detect Subclinical Keratoconus.

Ke Cao1,2, Karin Verspoor3, Srujana Sahebjada1,2, Paul N Baird2.   

Abstract

Purpose: Keratoconus (KC) represents one of the leading causes of corneal transplantation worldwide. Detecting subclinical KC would lead to better management and avoid the need for corneal grafts, but the condition is clinically challenging to diagnose. We wished to compare eight commonly used machine learning algorithms, applied with a range of parameter combinations to our KC dataset, to build models that better differentiate subclinical KC from non-KC eyes.
Methods: Oculus Pentacam was used to obtain corneal parameters on 49 subclinical KC and 39 control eyes, along with clinical and demographic parameters. Eight machine learning methods were applied to build models to differentiate subclinical KC from control eyes. Dominant algorithms were trained with all combinations of the considered parameters to select important parameter combinations. The performance of each model was evaluated and compared.
Results: Using a total of eleven parameters, random forest, support vector machine, and k-nearest neighbors had the best performance in detecting subclinical KC. The highest area under the curve (0.97) for detecting subclinical KC was achieved using five parameters by the random forest method. The highest sensitivity (0.94) and specificity (0.90) were obtained by the support vector machine and the k-nearest neighbor models, respectively.
Conclusions: This study showed that machine learning algorithms can be applied to identify subclinical KC using a minimal parameter set that is routinely collected during a clinical eye examination.
Translational Relevance: Machine learning algorithms can be built using routinely collected clinical parameters that will assist in the objective detection of subclinical KC. Copyright 2020 The Authors.

Entities:  

Keywords:  artificial intelligence; keratoconus; machine learning; subclinical keratoconus

Mesh:

Year:  2020        PMID: 32818085      PMCID: PMC7396174          DOI: 10.1167/tvst.9.2.24

Source DB:  PubMed          Journal:  Transl Vis Sci Technol        ISSN: 2164-2591            Impact factor:   3.283


Introduction

Keratoconus (KC) is a common corneal condition characterized by progressive corneal thinning that results in corneal protrusion, reduced vision, and potential blindness. Prevalence of KC ranges from 0.17 per 1000 in the United States to 47.89 per 1000 in Saudi Arabia. The reported prevalence appears to have increased rapidly: only 1:2000 cases (United States) were reported in 1986 but as many as 1:375 (Netherlands) in 2016, although this may reflect improvements in imaging. Three articles reported KC prevalence in Iran, showing an increase from 1:126 in 2013, to 1:42 in 2014, to 1:32 in 2018. Similarly, the prevalence of KC in Israel increased from 1:43 in 2011 to 1:31 in 2014. A recent meta-analysis of more than 7 million participants from 15 countries reported the prevalence of KC as 1 in 725. The onset of the disease is usually in the teens to early adulthood, and our recent findings indicate that quality of life of KC patients is substantially lower than that of patients with later onset eye diseases such as age-related macular degeneration or diabetic retinopathy. This highlights the significant long-term morbidity associated with the condition. Management for KC follows an orderly transition from glasses/contact lenses to corneal transplantation as the condition progresses from mild/moderate to severe stages, respectively. KC is the most common indication for corneal transplantation globally and accounts for ∼30% of corneal grafts (Australian Corneal Graft Registry). Collagen crosslinking treatment, which stiffens the cornea, has been available for several years as a treatment to slow KC progression. However, collagen crosslinking using the standard Dresden protocol requires a minimum of 400-µm corneal thickness and is only suitable for patients in early (subclinical) stages of KC; early detection is therefore a prerequisite for this treatment. Once KC progresses, patients may require corneal transplantation.
Detecting the subclinical stage of KC is clinically challenging because (1) subjects are asymptomatic; (2) these eyes do not produce detectable signs at routine clinical examination using slit lamp, retinoscopy, or keratometry; and (3) the advanced corneal topographic systems that can detect subclinical KC are not always available in optometric/primary eye care practices. Thus, a number of challenges currently exist with regard to reliable detection of subclinical KC. Machine learning models have been applied to detect KC at different clinical stages, with a number of these presented as specific to a particular tomographic or topographic imaging system. The majority of these studies have used a single machine learning method, such as regression analysis, a tree-based method, an ensemble method, discriminant function analysis, a support vector machine, or a neural network. Parameters derived from a particular topographic or tomographic imaging system were collected in these studies, and the machine learning models were established without selecting important parameter combinations. We therefore lack knowledge of the performance characteristics of different machine learning methods applied to the same dataset, and of the performance of the same machine learning method across various parameter combinations. This would most readily be addressed by applying a number of machine learning algorithms to the same dataset and comparing their results. Moreover, although the studies that have used machine learning in KC include a number of parameters related to corneal measures, they do not include other clinical measures such as axial length (AL), spherical equivalent (SE), or demographic parameters, which are reported to be associated with KC and may also have a role when establishing a clinically accepted detection model for KC.
We therefore wished to evaluate the performance of a range of different machine learning methods on a subclinical KC dataset recruited in Australia as part of the Australian Study of Keratoconus (ASK). These machine learning methods were further systematically trained and tested across various combinations of commonly used corneal topographic parameters, together with clinical and demographic parameters, to identify the best-performing machine learning model for detecting subclinical KC from control eyes.

Methods

Subjects

This is a substudy of ASK, which was established to better understand the clinical, genetic, and environmental risk factors for KC. The study protocol was approved by the Royal Victorian Eye and Ear Hospital Human Research and Ethics Committee (Project #10/954H). The protocol followed the tenets of the Declaration of Helsinki and all privacy requirements were met. Subclinical KC patients were recruited from public and private clinics at the Royal Victorian Eye and Ear Hospital and from private consulting rooms and optometry clinics in Melbourne, Australia. All patients were provided with a patient information sheet, consent form, privacy statement, and statement of patient rights. A comprehensive eye examination was undertaken for each patient and KC was diagnosed clinically. Subclinical KC was defined as those eyes with abnormal corneal topography, including inferior-superior localized steepening or an asymmetric bowtie pattern, but with no detectable clinical signs on slit-lamp biomicroscopy and retinoscopy examination. Subjects with other ocular diseases such as corneal degenerations and dystrophies, macular disease, and optic nerve disease (e.g., optic neuritis, optic atrophy) were excluded from the study. KC subjects were recruited from ASK, whereas controls were recruited from the “GEnes in Myopia” study, where a similar recruitment protocol was used. The control group consisted of refractive error subjects with no ocular disease that may affect refraction, including amblyopia (greater than a two-line Snellen difference between the eyes), strabismus, visually significant lens opacification, glaucoma, or any other corneal abnormality. Individuals with connective tissue disease such as Marfan or Stickler syndrome were also excluded; these conditions were identified from the individual's medical history obtained via a general questionnaire.

Eye Examination

The anterior segment was assessed using slit-lamp biomicroscopy, and refraction was performed on each eye using a Nidek autorefractor. AL was recorded for each participant using noncontact partial coherence interferometry with an IOLMaster optical biometer (Carl Zeiss, Oberkochen, Germany). Corneal topographic parameters were obtained on all subjects using a Pentacam corneal tomographer (Oculus, Wetzlar, Germany). Subjects were required to remove their contact lenses, if worn, at least 24 hours before examination. The results used in the study were from the four-map selectable display of Pentacam results, incorporating front and back elevation maps along with the front sagittal curve and pachymetry. These maps were chosen to highlight the inferior decentration of the corneal apex on both the front and back surfaces, which assisted in the detection of KC. Mean corneal curvature (Km) was calculated automatically by the device as the mean of the horizontal and vertical central radial curvatures in the 3-mm zone. The detailed methodology of the eye examination for KC can be found elsewhere. Nine parameters that classically represent KC, namely AL, SE, mean front corneal curvature (front Km), mean back corneal curvature (back Km), central corneal thickness (CCT), corneal thickness at the apex (CTA), corneal thickness at the thinnest point (CTT), anterior chamber depth (ACD), and corneal volume (CV), were included in this study to build the algorithms to detect subclinical KC.

Statistical Analysis

Data were analyzed with RStudio (version 1.1.456) for Windows. All statistical tests were considered significant when the P value was less than 0.05. A χ2 test was used to compare gender between groups, and a Wilcoxon signed-rank test was applied to test the difference in age and other clinical characteristics, including SE, AL, front Km, back Km, CCT, CTA, CTT, ACD, and CV.

Machine Learning Methods

Eight machine learning algorithms, namely random forest, decision tree, logistic regression, support vector machine, linear discriminant analysis, multilayer perceptron neural network, lasso regression, and k-nearest neighbor, were applied to build classification models to differentiate subclinical KC from control subjects. Briefly: simple regression methods (e.g., logistic regression) learn a mapping from input variables (X) to an output variable (Y), Y = f(X); tree-based methods (e.g., decision tree) build a decision-making tree with “if this then that” logic; ensemble methods (e.g., random forest) combine several machine learning techniques into one model (random forest constructs multiple trees); the k-nearest neighbor method makes a decision by searching the whole dataset for the k most similar instances; discriminant function analysis (e.g., linear discriminant analysis) finds a combination of variables that discriminates between the categories; the support vector machine translates data into another space in which a plane (“hyperplane”) maximally separates the data groups; regularization methods (e.g., lasso regression) add a penalty to the optimization objective; and neural networks (e.g., the multilayer perceptron) process data across multiple layers of interconnected nodes, each computing a nonlinear function of the sum of its inputs. We chose these eight methods for the following reasons: they are among the most commonly used machine learning algorithms aiding health care diagnosis; they are the algorithms most commonly used in previously published KC studies; and they represent eight distinct learning approaches in machine learning, so it is worthwhile to empirically compare their performance on our dataset.

Machine Learning Analysis

The caret (Classification And REgression Training) package in R was used to perform all of the machine learning processes, including training models and assessing their performance. The “train” function in the caret package was also applied to tune the hyperparameters of each model. During each training run, the train function generated a candidate set of hyperparameter values and picked the values associated with the best accuracy.
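Conceptually, caret's train() performs the loop sketched below: each candidate hyperparameter setting is cross-validated and the setting with the best mean accuracy is kept. This Python sketch is an illustrative analogue only (the study used R/caret; the evaluate_fold callback and the toy per-fold accuracies are hypothetical):

```python
def tune(candidates, folds, evaluate_fold):
    """Mimic caret's train(): cross-validate each candidate hyperparameter
    setting and keep the one with the highest mean accuracy."""
    def mean_cv_accuracy(setting):
        scores = [evaluate_fold(setting, fold) for fold in folds]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_cv_accuracy)

# Toy illustration with made-up accuracies: k = 5 neighbours wins.
toy_accuracy = {3: 0.81, 5: 0.87, 7: 0.84}
best_k = tune([3, 5, 7], folds=range(10),
              evaluate_fold=lambda k, fold: toy_accuracy[k])
print(best_k)  # 5
```

In practice the evaluate_fold callback would fit the model on the training folds and score it on the held-out fold.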

Comparison of Machine Learning Algorithms

The following steps were performed in a loop for each method: (1) data for each eye, with a label indicating whether the eye was subclinical KC or not, were imported into RStudio; (2) each machine learning method was trained to differentiate subclinical KC from control eyes using all 11 parameters; (3) to validate the results of each model, 10-fold cross-validation was used on the full dataset, wherein the data were split into 10 subsets (folds), each holding 10% of the data, and on each iteration a model was trained on nine of these folds (90% of the data) and tested on the remaining fold, repeating 10 times across the folds. In this way, each fold serves once as held-out test data for a model trained on the other nine folds, and performance was averaged across the 10 folds. This is a standard evaluation paradigm for small datasets. Figure 1 shows an example of 10-fold cross-validation.
Figure 1.

The 10-fold cross-validation for analysis of test data. Twenty rhombuses are randomly partitioned into 10 subsets, with two rhombuses in each subset. Of the 10 subsets, one subset is retained as the validation data, and the remaining nine subsets are used to train the model. This cross-validation process is repeated 10 times. In summary, cross-validation combines the fitness measures from the 10 folds and provides an average.

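The fold construction described above, with each of 10 folds held out once while the other nine train the model, can be sketched as follows (Python used for illustration; the study itself used R, and the 88-eye count is taken from the Results):

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# 88 eyes, as in the study: each fold is held out once for testing
# while the other nine folds (about 90% of the data) train the model.
folds = kfold_indices(88, k=10)
for test_fold in folds:
    train = [i for fold in folds if fold is not test_fold for i in fold]
    assert len(train) + len(test_fold) == 88
print([len(f) for f in folds])  # eight folds of 9 eyes, two of 8
```

Performance is then averaged over the 10 held-out folds.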

Selection of Parameter Combinations

Models that achieved the highest performance on at least one of the evaluation metrics were used for subsequent analysis. The following steps were performed in a loop for each of these methods: every combination of the considered parameters, from two variables (e.g., age and gender) up to all 11 variables, was considered, giving a total of 2036 different combinations; each method was then trained to differentiate subclinical KC eyes from control eyes using each combination of parameters. Figure 2 shows a flowchart for training machine learning models with different parameter combinations.
Figure 2.

Training machine learning models with different parameter sets. All possible combinations of the 11 parameters, from combinations of two parameters (e.g., gender and age, gender and SE) up to the combination of all 11 parameters, were used in turn as input to train machine learning models to differentiate subclinical KC eyes from control eyes.

A 10-fold cross-validation method was used to validate the performance of each model.
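As a sanity check on the search space described above, the count of 2036 parameter combinations (all subsets of size 2 through 11 of the 11 parameters) can be enumerated directly (Python used for illustration; the parameter abbreviations follow the text):

```python
from itertools import combinations

# The 11 candidate predictors considered in the study.
PARAMETERS = ["age", "gender", "SE", "AL", "ACD", "front Km",
              "back Km", "CCT", "CTA", "CTT", "CV"]

# Every subset of size 2..11: 2^11 - 1 (empty set) - 11 (singletons) = 2036.
subsets = [c for k in range(2, len(PARAMETERS) + 1)
           for c in combinations(PARAMETERS, k)]
print(len(subsets))  # 2036
```

Each of these subsets was used in turn as the input feature set for model training.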

Evaluation Criteria

Accuracy, sensitivity, specificity, area under the curve (AUC), and precision were used to evaluate the performance of each model, as these measures are typically used in health care settings. Accuracy is the ability of the model to correctly classify cases and controls; sensitivity is the ability of the model to identify cases within the case group (true positives); and specificity is the ability of the model to identify controls within the control group (true negatives). AUC represents how well a model distinguishes between the two groups: the higher the AUC, the better the model is at classifying cases as cases and controls as controls. Precision is the proportion of predicted cases that are true cases. Each metric ranges from 0 to 1; in the present study, values greater than 0.90 were defined as the highest performance, values between 0.80 and 0.90 as a good fit, values between 0.50 and 0.79 as moderate, and values less than 0.50 as poor.
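The four count-based metrics above follow directly from the confusion-matrix counts. The sketch below uses hypothetical counts, not values from this study, and omits AUC, which additionally requires ranked prediction scores rather than counts:

```python
def binary_metrics(tp, fp, tn, fn):
    """Count-based performance metrics for a two-class model."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # all correct calls
        "sensitivity": tp / (tp + fn),   # cases caught among true cases
        "specificity": tn / (tn + fp),   # controls caught among true controls
        "precision": tp / (tp + fp),     # true cases among predicted cases
    }

# Hypothetical confusion counts, not taken from the study.
m = binary_metrics(tp=45, fp=10, tn=40, fn=5)
print(m)  # accuracy 0.85, sensitivity 0.90, specificity 0.80, precision ~0.818
```

Under the study's thresholds, this hypothetical model would count as a good fit on accuracy and sensitivity but only moderate on specificity.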

Results

Demographics

A total of 88 subjects, contributing 49 subclinical KC eyes and 39 control eyes, were available for analysis. There were significantly (P < 0.01) more males in the subclinical group (37 males, 75.5%) than in the control group (14 males, 35.9%). The mean age of the subclinical KC group was 30.37 ± 12.53 years and that of the control group was 36.08 ± 11.91 years; the subclinical KC patients were significantly younger than the controls (P < 0.05). Demographic data for all subjects are presented in Table 1.
Table 1.

Demographic Data for all the Subjects Included in the Study

                 N    Mean Age (SD)    % Female    P Value
Subclinical KC   49   30.37 (12.53)    24.5        <0.01
Control          39   36.08 (11.91)    64.1        <0.01

KC, keratoconus; SD, standard deviation.

Considering individual parameters, there were significant differences between subclinical KC and control eyes for SE (P < 0.01), AL (P < 0.01), and front Km (P = 0.01), as well as the corneal thickness-related parameters (CCT, P = 0.02; CTA, P = 0.01; CTT, P < 0.01). As expected, corneas in the subclinical KC group were significantly thinner than those in the control group. However, subclinical KC eyes tended to have significantly flatter corneas than control eyes (P = 0.01). Moreover, control eyes were more myopic and had longer AL than subclinical KC eyes (P < 0.01). This may be due to the control group being recruited from the GEnes in Myopia study. There was no significant difference in back Km (P = 0.60), ACD (P = 0.09), or CV (P = 0.27) between the groups (Table 2).
Table 2.

Clinical Characteristics of all Eyes Used in the Analysis by Individual Parameter

                    Subclinical KC Eyes   Control Eyes      P Value
SE, D (SD)          −2.20 (3.32)          −6.59 (4.87)      <0.01
AL, mm (SD)         24.44 (1.48)          26.62 (2.21)      <0.01
ACD, mm (SD)        3.59 (0.60)           3.67 (0.43)       0.09
Front Km, D (SD)    42.45 (1.38)          43.22 (2.09)      0.01
Back Km, D (SD)     −6.03 (1.02)          −6.22 (0.34)      0.60
CCT, µm (SD)        511.20 (45.82)        531.74 (31.49)    0.02
CTA, µm (SD)        511.90 (46.60)        531.87 (31.07)    0.01
CTT, µm (SD)        487.67 (82.22)        528.97 (31.56)    <0.01
CV, mm³ (SD)        61.22 (21.13)         59.02 (4.81)      0.27

P values are from the Wilcoxon signed-rank test.

ACD, anterior chamber depth; AL, axial length; back Km, mean back corneal curvature; CCT, central corneal thickness; CTA, corneal thickness at the apex; CTT, corneal thickness at the thinnest point; CV, corneal volume; front Km, mean front corneal curvature; KC, keratoconus; SD, standard deviation; SE, spherical equivalent.

Table 3 shows the performance of each of the eight machine learning methods used in this analysis. In our dataset, among all the methods, the random forest method presented the highest AUC (0.96) and had good accuracy (0.87) and precision (0.89), while the support vector machine had the highest sensitivity (0.92) and the k-nearest neighbor model was good for specificity (0.88). On the other hand, the multilayer perceptron neural network performed poorly on our dataset, with a specificity of 0.20 and a precision of 0.44. The other models had moderate to good performance, with metrics ranging from 0.51 to 0.89.
Table 3.

Comparison of the Eight Machine Learning Algorithms Using Different Performance Indicators

Algorithm                              Accuracy   Sensitivity   Specificity   AUC    Precision
Random forest                          0.87       0.88          0.85          0.96   0.89
Support vector machine                 0.86       0.92          0.78          0.89   0.84
K-nearest neighbors                    0.73       0.61          0.88          0.73   0.88
Logistic regression                    0.81       0.84          0.77          0.89   0.84
Linear discriminant analysis           0.81       0.84          0.78          0.89   0.83
Lasso regression                       0.84       0.86          0.83          0.91   0.88
Decision tree                          0.80       0.82          0.78          0.81   0.82
Multilayer perceptron neural network   0.52       0.80          0.20          0.51   0.44

For each performance indicator, the highest value was obtained as follows: accuracy (0.87), AUC (0.96), and precision (0.89) by random forest; sensitivity (0.92) by the support vector machine; and specificity (0.88) by k-nearest neighbors.

From the previous step, we confirmed that the random forest, support vector machine, and k-nearest neighbor methods fit our data better than the other methods when all 11 input parameters were used. We then tested all possible parameter combinations with these three methods. In our dataset, the following models had the best performance using a minimal parameter set: the greatest AUC was obtained with gender, SE, front Km, CTT, and CV (AUC of 0.97, using the random forest method); the highest sensitivity was obtained with SE, ACD, back Km, CCT, and CTT (sensitivity of 0.94, using the support vector machine method); and the highest specificity was obtained with age, SE, AL, CTA, and CTT (specificity of 0.90, using the k-nearest neighbor method).

Discussion

We compared eight commonly used machine learning techniques and their performance in distinguishing subclinical KC eyes from control eyes using an Australian dataset. This is the first study to evaluate and compare the performance of such a wide range of machine learning techniques and present their efficacy in detecting subclinical KC. It is also the first to train models over a large number of parameter combinations to arrive at the most parsimonious well-performing machine learning model for detecting subclinical KC. Machine learning algorithms are computational methods that allow us to efficiently navigate complex data to arrive at a best-fit model. The performance of different machine learning algorithms depends strongly on the nature of the data and the task being explored, and thus the correct choice of algorithm is best determined through experimentation. In our dataset, using 11 parameters (age, gender, SE, AL, ACD, front Km, back Km, CCT, CTA, CTT, CV), the random forest model had the highest AUC (0.96), meaning that clinically it provides a good measure for differentiating subclinical KC from control eyes. Conversely, the multilayer perceptron neural network had an AUC near 0.5, reflecting essentially no capacity to discriminate subclinical KC from control eyes. The random forest model also achieved good accuracy (0.87) in our dataset (i.e., clinically it can correctly classify 87% of subclinical KC and control eyes). Moreover, the precision of the random forest model was 0.89, meaning that 89% of the eyes it classified as subclinical KC were truly subclinical KC. In addition, the support vector machine model reached a sensitivity of 0.92, a 92% probability of correctly identifying subclinical KC eyes, and the k-nearest neighbor model had an 88% chance of correctly identifying control eyes (specificity of 0.88).
We further developed models using the random forest, support vector machine, and k-nearest neighbor methods with different parameter combinations to distinguish subclinical KC from control eyes. Our results indicated that a combination of gender, spherical equivalent, mean front corneal curvature, corneal thickness at the thinnest point, and corneal volume provided a good measure for identifying subclinical KC from control eyes (AUC 0.97). In addition, a model developed using spherical equivalent, anterior chamber depth, mean back corneal curvature, central corneal thickness, and corneal thickness at the thinnest point had a sensitivity of 0.94 (i.e., this model correctly identifies subclinical eyes 94% of the time). Finally, a model using age, spherical equivalent, axial length, corneal thickness at the apex, and corneal thickness at the thinnest point had the highest specificity, with a 90% chance of identifying control eyes. Therefore, our analysis attempted to optimize performance by testing multiple algorithms, comparing the results between them, and selecting the appropriate algorithm for clinical practice. Chan et al. recently reported that the costs associated with the diagnosis and management of keratoconus represent a significant economic burden to the patient as well as society. The results of this study are a good start toward providing a machine learning based model to assist clinicians in identifying KC in its early/subclinical form and reducing the economic burden of the condition. The Pentacam imaging system that we used is a sensitive device for detecting subtle corneal curvature and pachymetry changes, with high reproducibility and repeatability. For better clinical interpretability, we analyzed only commonly available Pentacam corneal parameters, but we also included other routinely measured parameters that are of primary relevance to keratoconus detection to assess how they would alter the models.
One of the main limitations of previous studies that have used machine learning techniques is that the models that were built were specific to the instrument that was used, yet the parameters available on each machine may vary. For example, Lopes et al. used 18 parameters derived from the Pentacam in their random forest model (sensitivity 0.85, specificity 0.97), but several Pentacam-specific indices (e.g., index of surface variance, index of vertical asymmetry), which were only available from the Pentacam machine, were included in their model. Thus, their model could only be applied in clinics with a Pentacam and not exported to other machines. To address this issue, we assessed all possible combination sets of parameters in three dominant machine learning algorithms, with the aim of achieving a high ability to identify subclinical KC from controls using the minimum number of parameters. Based on the results, we demonstrated that this approach could identify smaller subsets of parameters and improve the performance of the machine learning models compared with using all parameters. Another common feature of most of the published studies on machine learning techniques and subclinical KC is the definition used for classifying these eyes: subclinical KC was defined as the normal fellow eye of unilateral KC. The current study avoided this limitation by defining subclinical eyes based on their own characteristics. Hence, data labeling was based on the clinical assessment of the eyes, which was then used to train the machine to mimic and build the algorithms that most closely represented the input dataset. Our models are therefore based on a clinically meaningful dataset. For the same dataset, different machine learning methods have different performance characteristics, which can be applied accordingly based on the clinical requirements.
In the present study, we achieved the highest AUC, sensitivity, and specificity using the random forest, support vector machine, and k-nearest neighbor methods, respectively. These results were comparable with other results in the literature, and in some respects showed better performance in detecting subclinical KC eyes (Table 4).
Table 4.

Details of Previously Published Studies Using Machine Learning Algorithms for the Detection of Subclinical Keratoconus

Author and Year         Topography System          Sample Size              Algorithm Used                         Performance
Kovács et al. (2016)    Pentacam                   15 cases*/30 controls    Multilayer perceptron neural network   Sensitivity 0.90; Specificity 0.90
Ruiz et al. (2016)      Pentacam HR                67 cases*/339 controls   Support vector machine                 Sensitivity 0.79; Specificity 0.98
Hwang et al. (2018)     Pentacam HR and SD OCT     30 cases*/60 controls    Multivariable logistic regression      Sensitivity 1.00; Specificity 1.00
Smadja et al. (2013)    GALILEI                    47 cases*/177 controls   Decision tree                          Sensitivity 0.94; Specificity 0.97
Accardo et al. (2002)   EyeSys                     30 cases#/65 controls    Neural network                         Sensitivity 1.00; Specificity 0.99
Saad et al. (2010)      Orbscan IIz                40 cases*/72 controls    Discriminant analysis                  Sensitivity 0.93; Specificity 0.92
Uçakhan et al. (2011)   Pentacam                   44 cases*/63 controls    Logistic regression                    Sensitivity 0.77; Specificity 0.92
Ventura et al. (2013)   Ocular Response Analyzer   68 cases/136 controls    Neural network                         AUC 0.978

* Subclinical KC was defined as the normal fellow eye of unilateral KC.

# KC of mild and moderate severity was considered.

Grade I and II KC according to the Krumeich severity classification.

Ruiz et al. used a support vector machine to analyze 22 parameters derived from the Pentacam. They found a sensitivity of 0.79 and specificity of 0.98 in discriminating "forme fruste" KC (N = 67) from normal eyes (N = 194). Kovács et al. used 15 unilateral KC and 30 normal subjects to construct a model using a multilayer perceptron neural network and reported a sensitivity and specificity of 0.90. Similarly, Ucakhan et al. used 44 KC and 63 non-KC subjects with logistic regression and reported a sensitivity of 0.77 and specificity of 0.92 in detecting subclinical KC from control eyes. Hwang et al. reported an accuracy of 100% after training a logistic regression model on 13 parameters combining measurements from Pentacam and OCT imaging. However, that study indicated that the model was trained with 90 eyes (30 subclinical KC and 60 normal) but did not clarify whether the same dataset was used for both training and testing of the model; if so, the reported performance may be artificially high (known as overfitting). We tested each of our models with subjects that were not included in the training dataset through the 10-fold cross-validation methodology, which allowed us to evaluate the performance of our models across different (simulated blind) test sets. The study by Smadja et al. used 47 forme fruste KC and 177 normal eyes to show that a decision tree algorithm with six parameters from the Galilei could achieve a sensitivity of 0.94 and specificity of 0.97.
Although this performance is somewhat better than the model presented in this study, they included machine-specific indices (e.g., asphericity asymmetry index, opposite sector index), which cannot be applied to other imaging systems. Similarly, Saad et al. used 40 forme fruste KC and 72 normal eyes to show that discriminant analysis resulted in a sensitivity of 93% and specificity of 92%. More than 50 parameters generated by the Orbscan IIz were involved in their model, including calculated parameters that could not be reproduced by other imaging systems. In contrast to these studies, we used routinely measured clinical parameters and common corneal topographic parameters such as corneal curvature, pachymetry, and corneal volume that are not limited to a specific device, providing a real opportunity for our results to be translated to and used with different imaging systems.

Several limitations of the current study should be noted. First, we considered measurements derived from only a single topographic machine (Pentacam) in a single hospital. Further experimentation is required to test whether the models would be effective with data sourced from different machines. Second, the cross-validation strategy we used for evaluation was the most appropriate for simulating a held-out test data scenario with distinct training/test sets; however, this approach still draws the test data from the same underlying sample. It would therefore be reassuring to collect more data from our hospital, and data from other clinics, to allow more rigorous testing of the generalization capacity and robustness of the best models in the face of patient variation.

Conclusion

The current study shows promising results toward detecting subclinical keratoconus from control eyes using parameters that can be collected in a routine clinical eye examination. The results highlight the value of exploring a range of machine learning techniques for this task, of including a broad range of clinical and demographic features related to keratoconus, and of selecting important parameter combinations from a larger parameter set when building machine learning models. Further experimentation should lead to more objective and effective screening strategies for keratoconus that would be a helpful tool in clinical practice.
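The parameter-combination selection mentioned above (training the dominant algorithms on all combinations of the candidate parameters and keeping the best-scoring subset) can be sketched with an exhaustive search. This is a simplified illustration assuming scikit-learn, with synthetic data and five hypothetical parameter names standing in for the study's larger set.

```python
# Exhaustive search over all non-empty parameter subsets, scoring each by
# cross-validated AUC. Data and parameter names are illustrative only; the
# study's actual parameter set and preprocessing are not reproduced here.
from itertools import combinations
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(88, 5))
y = np.array([1] * 49 + [0] * 39)
params = ["K2", "Kmax", "pachy_min", "corneal_volume", "astigmatism"]

def auc(cols):
    clf = RandomForestClassifier(n_estimators=25, random_state=0)
    return cross_val_score(clf, X[:, list(cols)], y,
                           cv=10, scoring="roc_auc").mean()

# 2^5 - 1 = 31 candidate subsets for 5 parameters.
subsets = [c for r in range(1, len(params) + 1)
           for c in combinations(range(len(params)), r)]
best = max(subsets, key=auc)
print([params[i] for i in best])
```

Exhaustive search is feasible here only because the candidate set is small (2^n - 1 subsets for n parameters); with many more parameters, greedy or importance-based selection would be needed instead.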
References (54 in total)

1.  Automated keratoconus detection using the EyeSys videokeratoscope.

Authors:  P J Chastang; V M Borderie; S Carvajal-Gonzalez; W Rostène; L Laroche
Journal:  J Cataract Refract Surg       Date:  2000-05       Impact factor: 3.351

2.  Keratoconus association with axial myopia: a prospective biometric study.

Authors:  Benjamin J Ernst; Hugo Y Hsu
Journal:  Eye Contact Lens       Date:  2011-01       Impact factor: 2.018

3.  Preliminary results of neural networks and zernike polynomials for classification of videokeratography maps.

Authors:  Luis Alberto Carvalho
Journal:  Optom Vis Sci       Date:  2005-02       Impact factor: 1.973

4.  Evaluation of Scheimpflug imaging parameters in subclinical keratoconus, keratoconus, and normal eyes.

Authors:  Ömür Ö Uçakhan; Volkan Cetinkor; Muhip Özkan; Ayfer Kanpolat
Journal:  J Cataract Refract Surg       Date:  2011-06       Impact factor: 3.351

5.  Computer aided diagnosis for suspect keratoconus detection.

Authors:  Ikram Issarti; Alejandra Consejo; Marta Jiménez-García; Sarah Hershko; Carina Koppen; Jos J Rozema
Journal:  Comput Biol Med       Date:  2019-04-23       Impact factor: 4.589

6.  Keratoconus diagnosis with optical coherence tomography–based pachymetric scoring system.

Authors:  Bing Qin; Shihao Chen; Robert Brass; Yan Li; Maolong Tang; Xinbo Zhang; Xiaoyu Wang; Qinmei Wang; David Huang
Journal:  J Cataract Refract Surg       Date:  2013-12       Impact factor: 3.351

7.  Collagen crosslinking for keratoconus.

Authors:  Petrina Tan; Jodhbir S Mehta
Journal:  J Ophthalmic Vis Res       Date:  2011-07

8.  Evaluation of intereye corneal asymmetry in patients with keratoconus. A scheimpflug imaging study.

Authors:  Lóránt Dienes; Kinga Kránitz; Eva Juhász; Andrea Gyenes; Agnes Takács; Kata Miháltz; Zoltán Z Nagy; Illés Kovács
Journal:  PLoS One       Date:  2014-10-08       Impact factor: 3.240

9.  Prevalence of keratoconus and subclinical keratoconus in subjects with astigmatism using pentacam derived parameters.

Authors:  Huseyin Serdarogullari; Mehmet Tetikoglu; Hatice Karahan; Feyza Altin; Mustafa Elcioglu
Journal:  J Ophthalmic Vis Res       Date:  2013-07

10.  Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms.

Authors:  Hsin-Yi Tsao; Pei-Ying Chan; Emily Chia-Yu Su
Journal:  BMC Bioinformatics       Date:  2018-08-13       Impact factor: 3.169

