
Exploratory study on classification of diabetes mellitus through a combined Random Forest Classifier.

Xuchun Wang, Mengmeng Zhai, Zeping Ren, Hao Ren, Meichen Li, Dichen Quan, Limin Chen, Lixia Qiu.

Abstract

BACKGROUND: Diabetes mellitus (DM) now ranks third among the chronic non-communicable diseases affecting patients, after tumors and cardiovascular and cerebrovascular diseases, and has become one of the major public health problems in the world. It is therefore important to identify individuals at high risk of DM so that prevention strategies can be established.
METHODS: Medical data often combine a high-dimensional feature space, high feature redundancy, and class imbalance. To address these problems, this study explored different supervised classifiers, combined with SVM-SMOTE resampling and two feature dimensionality reduction methods (logistic stepwise regression and LASSO), to classify diabetes survey sample data with unbalanced categories and complex related factors. The classification results of 4 supervised classifiers under 4 data processing methods were analyzed and discussed. Five indicators (Accuracy, Precision, Recall, F1-Score and AUC) were selected as the key indicators for evaluating the performance of the classification models.
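The five evaluation indicators named in METHODS can be illustrated with a minimal sketch. This is not the authors' code: the function names `threshold_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the label coding (1 = DM, 0 = non-DM) is likewise assumed.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Precision, Recall and F1-Score for binary labels (1 = DM, 0 = non-DM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and predicted probabilities of DM.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = threshold_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Accuracy, Precision, Recall and F1-Score depend on the chosen cut-off, whereas AUC is computed from the ranking of the scores alone, which is one reason the two kinds of indicator are usually reported together.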
RESULTS: The Random Forest Classifier combined with SVM-SMOTE resampling and LASSO feature screening (Accuracy = 0.890, Precision = 0.869, Recall = 0.919, F1-Score = 0.893, AUC = 0.948) proved the best at identifying those at high risk of DM. The combined algorithm also enhanced classification performance for predicting people at high risk of DM. Age, region, heart rate, hypertension, hyperlipidemia and BMI were the six most critical characteristic variables affecting diabetes.
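As a quick arithmetic check, F1-Score is the harmonic mean of Precision and Recall, so the reported F1-Score can be recomputed from the other two reported values. The helper name `f1_from` is hypothetical; the three numbers are taken from RESULTS.

```python
def f1_from(precision: float, recall: float) -> float:
    """F1-Score as the harmonic mean of Precision and Recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported in RESULTS for the best-performing combination.
reported = {"precision": 0.869, "recall": 0.919, "f1": 0.893}

recomputed = f1_from(reported["precision"], reported["recall"])
print(round(recomputed, 3))  # 0.893, matching the reported F1-Score
```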
CONCLUSIONS: The Random Forest Classifier combined with SVM-SMOTE and the LASSO feature reduction method performed best in identifying individuals at high risk of DM. The combined method proposed in the study could be a good tool for early screening of DM.

Keywords:  Classification; Diabetes mellitus; Imbalanced data; Indicators; Random Forest Classifier

Year:  2021        PMID: 33743696      PMCID: PMC7980612          DOI: 10.1186/s12911-021-01471-4

Source DB:  PubMed          Journal:  BMC Med Inform Decis Mak        ISSN: 1472-6947            Impact factor:   2.796


