Performance comparison between Logistic regression, decision trees, and multilayer perceptron in predicting peripheral neuropathy in type 2 diabetes mellitus.

Chang-ping Li, Xin-yue Zhi, Jun Ma, Zhuang Cui, Zi-long Zhu, Cui Zhang, Liang-ping Hu.

Abstract

BACKGROUND: Various methods can be applied to build predictive models for clinical data with a binary outcome variable. This research explores the process of constructing three common predictive models, logistic regression (LR), decision tree (DT) and multilayer perceptron (MLP), and focuses on specific details of applying them: which preconditions should be satisfied, how to set the model parameters, how to screen variables and build accurate models quickly and efficiently, and how to assess generalization ability (that is, prediction performance) reliably with the Monte Carlo method when the sample size is small.
METHODS: A total of 274 patients from the Metabolic Disease Hospital in Tianjin participated in the study: 137 with type 2 diabetes mellitus and diabetic peripheral neuropathy, and 137 with type 2 diabetes mellitus without diabetic peripheral neuropathy. There were 30 variables, such as sex, age and glycosylated hemoglobin. Because of the small sample size, the classification and regression tree (CART) and the chi-squared automatic interaction detector (CHAID) tree were combined by means of 100 repetitions of 5-7 fold stratified cross-validation to build the DT. The MLP was constructed using the Schwarz Bayes Criterion to choose the number of hidden layers and hidden-layer units, along with the Levenberg-Marquardt (L-M) optimization algorithm, weight decay and a preliminary training method. Subsequently, LR was fitted by the best subset method with the Akaike Information Criterion (AIC) to make the best use of the information and avoid overfitting. Finally, 10 to 100 repetitions of 3-10 fold stratified cross-validation were used to compare the generalization ability of DT, MLP and LR in terms of the area under the receiver operating characteristic (ROC) curve (AUC).
RESULTS: The AUCs of DT, MLP and LR were 0.8863, 0.8536 and 0.8802, respectively. Since a larger AUC indicates higher diagnostic ability, MLP performed best, followed by LR and DT, under 10-100 repetitions of 2-10 fold stratified cross-validation in our study. The neural network model is a preferred option for these data. However, best subset multiple LR would be a better choice in view of efficiency and accuracy.
CONCLUSION: When dealing with data from a small sample with multiple independent variables and a dichotomous outcome variable, more strategies and statistical techniques (such as the AIC, the L-M optimization algorithm and the best subset method) should be considered to build a prediction model, and available methods (such as cross-validation and AUC) can be used for evaluation.
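
The evaluation protocol described in the abstract (repeated stratified k-fold cross-validation scored by AUC) can be illustrated with a minimal, standard-library-only Python sketch. Everything in it is an assumption made for illustration: the data are synthetic (only the 137 + 137 class sizes mirror the study), `fit_mean_difference` is a hypothetical stand-in scorer rather than the LR, DT or MLP models the authors built, and the fold count and number of repetitions are arbitrary choices within the ranges the abstract mentions.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split indices into k folds so each fold keeps roughly the overall class ratio."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class out round-robin
    return folds


def repeated_cv_auc(fit, X, y, k=5, repeats=100, seed=0):
    """Mean held-out AUC over `repeats` independent stratified k-fold partitions."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for test in stratified_folds(y, k, rng):
            held_out = set(test)
            train = [i for i in range(len(y)) if i not in held_out]
            score = fit([X[i] for i in train], [y[i] for i in train])
            aucs.append(auc([score(X[i]) for i in test], [y[i] for i in test]))
    return mean(aucs)


def fit_mean_difference(X, y):
    """Hypothetical stand-in model: score a case by projecting it onto the
    difference between the class means estimated from the training folds."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    w = [mean(x[j] for x in pos) - mean(x[j] for x in neg) for j in range(len(X[0]))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


# Synthetic data with the study's class sizes (137 + 137); NOT the study data.
data_rng = random.Random(42)
y = [1] * 137 + [0] * 137
X = [[data_rng.gauss(0.8 * label, 1.0), data_rng.gauss(0.0, 1.0)] for label in y]

cv_auc = repeated_cv_auc(fit_mean_difference, X, y, k=5, repeats=10)
print(f"mean cross-validated AUC: {cv_auc:.4f}")
```

Because the model is refitted on the training folds of every partition, the averaged AUC estimates out-of-sample performance rather than apparent fit. Repeating the partitioning reduces the dependence of that estimate on any single random split, which matters most at small sample sizes.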

Year:  2012        PMID: 22490586

Source DB:  PubMed          Journal:  Chin Med J (Engl)        ISSN: 0366-6999            Impact factor:   2.628

