Mingqi Zhao1, Changjun Song2, Tao Luo1, Tianyue Huang3, Shiming Lin2,3.
Abstract
Fatty liver disease (FLD) is a common liver disease that poses a serious threat to people's health, but there is still no optimal method for large-scale screening. This research uses machine learning algorithms, with electronic physical examination records from a health database as the data source, to build a predictive model for FLD. The model shows good predictive ability on the test set, with an AUC of 0.89. Because most health databases hold a large number of electronic physical examination records, this model might be used as a non-invasive diagnostic tool for large-scale FLD screening.
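For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an AUC can be computed from test-set labels and model scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration. It uses the pairwise definition of AUC, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = FLD, 0 = no FLD); not taken from the study.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.2, 0.4, 0.7, 0.5, 0.1]
auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```

The double loop is quadratic in the number of records, which is fine for a small illustration; on a large screening dataset a rank-based computation or a library routine would be the more practical choice.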
Keywords: XGBoost; chi-square binning algorithm; electronic medical records; fatty liver disease; genetic algorithm; machine learning
Year: 2021 PMID: 33912534 PMCID: PMC8072129 DOI: 10.3389/fpubh.2021.668351
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. Data preprocessing flowchart.
Figure 2. Statistical information and chi-square test results of different features in different groups.
Figure 3. Violin chart: the distribution of different features under different age groups and genders.
Figure 4. Genetic algorithm flowchart.
Figure 5. Demonstration of individual and individual variation.
Figure 6. Trade-off between variance and bias.
Figure 7. The influence of the number of features used on the model and feature importance.