Xu Yang, Guo Chen, Yunchong Qian, Yuhan Wang, Yisong Zhai, Debao Fan, Yang Xu.
Abstract
According to the literature, myopia has become the second most common eye disease in China; its incidence is increasing year by year, and onset is trending toward younger ages. Previous research has shown that the occurrence of myopia is mainly determined by poor eye habits, including reading and writing posture, eye length, and so on, and by parental heredity. To better prevent myopia in adolescents, this paper studies the influence of related factors on myopia incidence in adolescents using machine learning methods. A feature selection method based on both univariate and multivariate correlation analysis is used to construct a feature subset for model training. A method based on gradient boosted regression trees (GBRT) is provided to fill in missing items in the original data. The prediction model is built on a support vector machine (SVM). Data transformation is used to improve prediction accuracy. Results show that our method achieves reasonable performance and accuracy.
Keywords: artificial intelligence; correlation analysis; machine learning; myopia in adolescents
Year: 2020 PMID: 31936770 PMCID: PMC7013571 DOI: 10.3390/ijerph17020463
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Results of univariate correlation analysis.
| Factor | Description | Correlation | p value |
|---|---|---|---|
| DAI | No. of parents wearing glasses | 0.061 | 0.002 |
| GENDER | Gender | 0.067 | |
| RPR | Before mydriasis (right eye) | 0.091 | |
| JTR | Close adjustment ability (right eye) | 0.108 | 0.001 |
| YTR | Remote adjustment ability (right eye) | 0.267 | |
| DRNBASE | Distant vision (right eye) | 0.222 | |
| JG | Amount of indoor activities | 0.029 | 0.001 |
| YW | Amount of outdoor activities | 0.030 | 0.032 |
| AL | Axial length | 0.106 | |
| K1 | Corneal curvature (left eye) | 0.062 | 0.002 |
| K2 | Corneal curvature (right eye) | 0.043 | 0.036 |
| PULSE | Pulses per minute | 0.006 | 0.008 |
| TUTOR1 | Participation in outdoor classes | 0.282 | |
| TUTOR2 | Participation in indoor classes | 0.203 | |
| ETEST | If have regular eye examination | 0.344 | |
| MSMK | Whether or not smoke | 0.397 | |
| CELLP | Whether or not play cellphone | 0.229 | |
| COSTM | Whether or not write with wrong posture | 0.092 | |
| BED | Whether or not read in bed | 0.261 | |
| COLA | Frequency of drinking carbonated drinks | 0.092 | |
| REDM | Frequency of eating red meat | 0.037 | 0.026 |
| WHIM | Frequency of eating white meat | 0.028 | 0.044 |
| EGG | Frequency of eating eggs | 0.077 | |
| MILK | Frequency of drinking milk | 0.077 | |
| VOLUME | Daily amount of water drinking | 0.096 | |
Results of multivariate correlation analysis.
| Factor | Description | Correlation | Std Err | t | p value |
|---|---|---|---|---|---|
| DAI | No. of parents wearing glasses | −0.0218 | 0.033 | −0.670 | 0.042 |
| GENDER | Gender | −0.0418 | 0.047 | −0.882 | 0.015 |
| RPR | Before mydriasis (right eye) | −0.0160 | 0.022 | −0.735 | 0.207 |
| JTR | Close adjustment ability (right eye) | 0.0402 | 0.060 | 0.667 | 0.005 |
| YTR | Remote adjustment ability (right eye) | 0.2077 | 0.049 | 4.245 | 0.000 |
| DRNBASE | Distant vision (right eye) | −0.4191 | 0.200 | −2.099 | 0.306 |
| JG | Amount of indoor activities | −1.043 | 1.13 | −0.922 | 0.035 |
| YW | Amount of outdoor activities | −0.0099 | 0.014 | −0.699 | 0.048 |
| AL | Axial length | −0.0420 | 0.073 | −0.578 | 0.003 |
| K1 | Corneal curvature (left eye) | −0.2655 | 0.578 | −0.459 | 0.046 |
| K2 | Corneal curvature (right eye) | −0.2155 | 0.495 | −0.435 | 0.036 |
| PULSE | Pulses per minute | 0.0003 | 0.002 | 0.167 | 0.008 |
| TUTOR1 | Participation in outdoor classes | 0.0871 | 0.047 | 1.869 | 0.062 |
| TUTOR2 | Participation in indoor classes | 0.0333 | 0.044 | 0.751 | 0.453 |
| ETEST | If have regular eye examination | −0.0877 | 0.043 | −2.042 | 0.401 |
| MSMK | Whether or not smoke | −0.3692 | 0.266 | −1.387 | 0.166 |
| CELLP | Whether or not play cellphone | 0.0530 | 0.047 | 1.129 | 0.259 |
| COSTM | Whether or not write with wrong posture | −0.0058 | 0.028 | −0.205 | 0.838 |
| BED | Whether or not read in bed | −0.0513 | 0.031 | −1.677 | 0.094 |
| COLA | Frequency of drinking carbonated drinks | 0.0041 | 0.025 | 0.163 | 0.007 |
| REDM | Frequency of eating red meat | 0.0167 | 0.022 | 0.761 | 0.044 |
| WHIM | Frequency of eating white meat | −0.0231 | 0.027 | −0.842 | 0.040 |
| EGG | Frequency of eating eggs | −0.0054 | 0.025 | −0.214 | 0.013 |
| MILK | Frequency of drinking milk | 0.0163 | 0.031 | 0.532 | 0.595 |
| VOLUME | Daily amount of water drinking | −0.0041 | 0.032 | −0.130 | 0.897 |
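The column that follows Std Err appears to be a t statistic, that is, the estimate divided by its standard error. This reading is an assumption, since the record does not state it. A quick check on a few rows, written as a minimal sketch in Python, shows that the ratio reproduces the tabulated values up to rounding of the inputs:

```python
# (factor, estimate, std_err, reported value) copied from the table above.
# Treating the reported value as a t statistic is an assumption.
rows = [
    ("GENDER", -0.0418, 0.047, -0.882),
    ("YTR", 0.2077, 0.049, 4.245),
    ("DRNBASE", -0.4191, 0.200, -2.099),
    ("ETEST", -0.0877, 0.043, -2.042),
]


def t_statistic(estimate, std_err):
    """Return the estimate divided by its standard error."""
    return estimate / std_err


# Track the largest gap between the recomputed and the reported value.
max_gap = 0.0
for factor, estimate, std_err, reported in rows:
    t = t_statistic(estimate, std_err)
    max_gap = max(max_gap, abs(t - reported))
    print(f"{factor:8s} computed {t:7.3f}  reported {reported:7.3f}")
```

The small remaining gaps are what one would expect when the estimate and standard error are themselves rounded to three or four decimals.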
Figure 1. Flow of the feature selection method.
Figure 2. Flow of the prediction model.
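The figure itself is not reproduced in this record. Going only by the abstract, the flow involves GBRT-based filling of missing items, a data transformation, and an SVM classifier. The following is a minimal sketch of such a flow using scikit-learn, not the authors' implementation: the helper `fill_missing_with_gbrt`, the synthetic data, the use of standardisation as the transformation, and the RBF kernel are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fill_missing_with_gbrt(X, column):
    """Fill NaNs in one column by regressing it on the other columns.

    Hypothetical helper; assumes the other columns contain no NaNs.
    """
    X = X.copy()
    missing = np.isnan(X[:, column])
    if not missing.any():
        return X
    others = np.delete(X, column, axis=1)
    model = GradientBoostingRegressor(random_state=0)
    model.fit(others[~missing], X[~missing, column])
    X[missing, column] = model.predict(others[missing])
    return X


# Synthetic stand-in data (hypothetical; not the study's data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X[rng.choice(200, size=20, replace=False), 2] = np.nan  # knock out 20 entries

X_filled = fill_missing_with_gbrt(X, column=2)

# Standardisation stands in for the "data transformation" step (an assumption).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_filled[:150], y[:150])
accuracy = clf.score(X_filled[150:], y[150:])
```

Fitting the scaler inside the pipeline keeps the transformation parameters tied to the training rows only, which avoids leaking information from the held-out rows.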
Results under different p values.
| p value | | |
|---|---|---|
| 0.9 | 0.791 | 0.165 |
| 0.8 | 0.789 | 0.145 |
| 0.7 | 0.787 | 0.178 |
| 0.6 | 0.755 | 0.178 |
| 0.5 | 0.786 | 0.178 |
| 0.4 | 0.776 | 0.198 |
| 0.3 | 0.766 | 0.202 |
| 0.2 | 0.743 | 0.220 |
| 0.1 | 0.694 | 0.258 |
Figure 3. REP trend with respect to p value.
Figure 4. Comparison between different feature selection methods.
Figure 5. Influence of data transformation on results.
Figure 6. Comparison of accuracy.
Comparison of methods.
| Method | Accuracy | 10-Fold Cross-Validation | Precision | Sensitivity | f1 | AUC | Specificity |
|---|---|---|---|---|---|---|---|
| Our method | 93% | 0.92 | 0.95 | 0.94 | 0.94 | 0.98 | 0.94 |
| Logistic Regression | 89% | 0.89 | 0.82 | 0.88 | 0.85 | 0.95 | 0.88 |
| Naive Bayes | 88% | 0.84 | 0.89 | 0.92 | 0.90 | 0.93 | 0.92 |
| KNN | 60% | 0.58 | 0.66 | 0.73 | 0.69 | 0.55 | 0.73 |
| Random Forest | 91% | 0.90 | 0.94 | 0.91 | 0.92 | 0.97 | 0.91 |
| BP Neural Network | 92% | 0.90 | 0.95 | 0.93 | 0.94 | 0.97 | 0.93 |
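Reading the numeric columns in order as Accuracy, 10-Fold Cross-Validation, Precision, Sensitivity, f1, AUC, and Specificity, the f1 values can be checked against the usual definition of F1 as the harmonic mean of precision and sensitivity (recall). A minimal sketch in Python, assuming that definition is the one used:

```python
# (method, precision, sensitivity, reported f1) copied from the table above.
rows = [
    ("Our method", 0.95, 0.94, 0.94),
    ("Logistic Regression", 0.82, 0.88, 0.85),
    ("Naive Bayes", 0.89, 0.92, 0.90),
    ("KNN", 0.66, 0.73, 0.69),
    ("Random Forest", 0.94, 0.91, 0.92),
    ("BP Neural Network", 0.95, 0.93, 0.94),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


# Recompute f1 to two decimals and keep the reported values for comparison.
recomputed = {name: round(f1_score(p, r), 2) for name, p, r, _ in rows}
reported = {name: f1 for name, _, _, f1 in rows}

for name in reported:
    print(f"{name:20s} recomputed {recomputed[name]:.2f}  reported {reported[name]:.2f}")
```

Under this reading, the recomputed values agree with the reported f1 column for every method to two decimals.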