Kai Zhang, Xiyang Liu, Jiewei Jiang, Wangting Li, Shuai Wang, Lin Liu, Xiaojing Zhou, Liming Wang.
Abstract
BACKGROUND: The common treatment for pediatric cataracts is to replace the cloudy lens with an artificial one. However, patients may suffer complications within 1 year after surgery, namely severe lens proliferation into the visual axis (SLPVA) and abnormally high intraocular pressure (AHIP), and the factors causing these complications are unknown.
Keywords: Association rules mining; Genetic feature selection; Medical decision making system; Naïve Bayesian; Random forest
Year: 2019 PMID: 30602368 PMCID: PMC6317183 DOI: 10.1186/s12967-018-1758-2
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Specification of the attributes in the dataset
| No. and attribute name | Values |
|---|---|
| 1. Gender | Male, female |
| 2. Secondary IOL placement | Yes, no (primary IOL placement) |
| 3. Operation mode | Lens aspiration (I/A), lens aspiration with posterior continuous curvilinear capsulorhexis (I/A + PCCC), lens aspiration with posterior continuous curvilinear capsulorhexis and anterior vitrectomy (I/A + PCCC + A-Vit) |
| 4. Laterality | Unilateral cataracts, bilateral cataracts |
| 5. Age at surgery (AS) | 1, 2, 3, 4, 5, 6, 7 |
| 6. Area of cataracts (AC) | Large, small |
| 7. Density of cataracts (DC) | Dense, sloppy |
| 8. Position of cataracts (PC) | Covering the central area of lens, not covering the central area of lens |
| 9. Nystagmus | Yes, no |
| 10. Microphthalmia | Yes, no |
| 11. Microcornea | Yes, no |
| 12. Persistent hyperplastic primary vitreous (PHPV) | Yes, no |
Fig. 1 Three grading standards for the severity of pediatric cataracts. a the area of the cataract is small; b the area of the cataract is large; c the density of the cataract is sloppy; d the density of the cataract is dense; e the cataract covers the central area of the lens; f the cataract does not cover the central area of the lens
Performance indicators in three binary classification problems
| Method | Resampling | Accuracy | FNR | FPR |
|---|---|---|---|---|
| Problem 1: Whether a patient suffers from complications | | | | |
| Random forest | – | 0.757 ± 0.025 | 0.414 ± 0.031 | 0.128 ± 0.013 |
| Random forest | SMOTE | 0.762 ± 0.019 | 0.231 ± 0.013 | 0.220 ± 0.037 |
| Naïve Bayesian | – | 0.748 ± 0.025 | 0.465 ± 0.042 | 0.887 ± 0.023 |
| Naïve Bayesian | SMOTE | 0.751 ± 0.032 | 0.270 ± 0.043 | 0.208 ± 0.044 |
| Problem 2: Whether a patient suffers from SLPVA | | | | |
| Random forest | – | 0.810 ± 0.014 | 0.621 ± 0.089 | 0.071 ± 0.023 |
| Random forest | SMOTE | 0.753 ± 0.069 | 0.257 ± 0.054 | 0.258 ± 0.044 |
| Naïve Bayesian | – | 0.782 ± 0.014 | 0.155 ± 0.043 | 0.449 ± 0.100 |
| Naïve Bayesian | SMOTE | 0.782 ± 0.043 | 0.244 ± 0.065 | 0.267 ± 0.025 |
| Problem 3: Whether a patient suffers from AHIP | | | | |
| Random forest | – | 0.838 ± 0.024 | 0.580 ± 0.050 | 0.015 ± 0.014 |
| Random forest | SMOTE | 0.813 ± 0.016 | 0.228 ± 0.055 | 0.265 ± 0.025 |
| Naïve Bayesian | – | 0.847 ± 0.033 | 0 ± 0 | 0.321 ± 0.043 |
| Naïve Bayesian | SMOTE | 0.816 ± 0.037 | 0.225 ± 0.047 | 0.265 ± 0.074 |
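The record does not spell out how the indicators in the table are computed. As a minimal sketch, assuming the usual confusion-matrix definitions with "complication present" as the positive class, they could be derived as follows. The class name `ConfusionCounts`, the function `binary_indicators`, and the example counts are illustrative and not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ConfusionCounts:
    """Counts for one binary problem (assumed positive class: complication present)."""
    tp: int  # complication present, predicted present
    fn: int  # complication present, predicted absent
    fp: int  # complication absent, predicted present
    tn: int  # complication absent, predicted absent


def binary_indicators(c: ConfusionCounts) -> Dict[str, float]:
    """Return accuracy, FNR, FPR, sensitivity and specificity."""
    positives = c.tp + c.fn
    negatives = c.fp + c.tn
    total = positives + negatives
    return {
        "accuracy": (c.tp + c.tn) / total,
        "fnr": c.fn / positives,          # missed complications
        "fpr": c.fp / negatives,          # false alarms
        "sensitivity": c.tp / positives,  # equals 1 - FNR
        "specificity": c.tn / negatives,  # equals 1 - FPR
    }


# Made-up counts for illustration only; not data from the study.
example = binary_indicators(ConfusionCounts(tp=30, fn=10, fp=12, tn=48))
```

Under these definitions, sensitivity and specificity are simply the complements of FNR and FPR, so either pair of indicators carries the same information.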
Fig. 2 ROC curves and AUC values in the three binary classification problems. (ROC: receiver operating characteristic curve; AUC: area under the curve; RF: random forest; SMOTE: synthetic minority oversampling technique; NB: Naïve Bayesian classifier)
Fig. 3 The relationship between accuracy and the number of trees in the random forest. (RF: random forest)
The performance of RF and NB for additional testing
| Problem | Algorithm | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| Whether a patient suffers from complications | Random forest | 0.700 | 0.625 | 0.769 |
| Whether a patient suffers from complications | Naïve Bayesian | 0.700 | 0.731 | 0.667 |
| Whether a patient suffers from SLPVA | Random forest | 0.720 | 0.667 | 0.722 |
| Whether a patient suffers from SLPVA | Naïve Bayesian | 0.660 | 0.611 | 0.688 |
| Whether a patient suffers from AHIP | Random forest | 0.700 | 0.636 | 0.718 |
| Whether a patient suffers from AHIP | Naïve Bayesian | 0.660 | 0.545 | 0.692 |
Fig. 4 ROC curves and AUC values for additional testing
Fig. 5 Flowchart of postoperative complication prediction. (First, the input data are preprocessed with a data discretization method. Then one of the three models is applied to determine whether the patient suffers from any complication. If a complication is predicted, the remaining two models are used to judge which of the two types of complication the patient has. If the patient is judged to be normal, the remaining two models are not used)
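The two-stage flow described in the Fig. 5 caption can be sketched as a small cascade. This is an illustrative outline only: the function `predict_complications`, the `stub` helper, and the callables passed in are hypothetical stand-ins, not the authors' trained models or preprocessing code.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Model = Callable[[Record], bool]  # True means "complication predicted"


def predict_complications(
    record: Record,
    preprocess: Callable[[Record], Record],
    any_model: Model,
    slpva_model: Model,
    ahip_model: Model,
) -> Dict[str, bool]:
    """Run the first-stage model, then the two type-specific models only if needed."""
    features = preprocess(record)
    if not any_model(features):
        # Judged normal: the remaining two models are not used.
        return {"complication": False, "SLPVA": False, "AHIP": False}
    return {
        "complication": True,
        "SLPVA": slpva_model(features),
        "AHIP": ahip_model(features),
    }


# Hypothetical stub models that record when they are called.
calls: List[str] = []


def stub(name: str, answer: bool) -> Model:
    def model(features: Record) -> bool:
        calls.append(name)
        return answer
    return model


result = predict_complications(
    {"Gender": "Male"},      # toy record, not study data
    lambda r: r,             # identity stands in for the preprocessing step
    stub("any", True),
    stub("slpva", True),
    stub("ahip", False),
)
```

Keeping the type-specific models behind the first-stage decision means a patient judged normal never triggers the SLPVA or AHIP predictors, which mirrors the branch described in the caption.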
Fig. 6 The relationship between the threshold and the number of association rules