
Application of irregular and unbalanced data to predict diabetic nephropathy using visualization and feature selection methods.

Baek Hwan Cho, Hwanjo Yu, Kwang-Won Kim, Tae Hyun Kim, In Young Kim, Sun I Kim.

Abstract

OBJECTIVE: Diabetic nephropathy is damage to the kidney caused by diabetes mellitus. It is a common complication and a leading cause of death in people with diabetes. However, the decline in kidney function varies considerably between patients, and the determinants of diabetic nephropathy have not been clearly identified. It is therefore very difficult to predict the onset of diabetic nephropathy accurately with simple statistical approaches such as the t-test or the chi-square test. To predict the onset accurately, we applied various machine learning techniques, such as support vector machine (SVM) classification and feature selection methods, to an irregular and unbalanced diabetes dataset. Visualization of the risk factors was another important objective, intended to give physicians intuitive information on each patient's clinical pattern.
METHODS AND MATERIALS: We collected medical data from 292 patients with diabetes and preprocessed the irregular data to extract 184 features. To predict the onset of diabetic nephropathy, we compared several classification methods such as logistic regression, SVM, and SVM with a cost-sensitive learning method. We also applied several feature selection methods to remove redundant features and improve classification performance. For risk factor analysis with SVM classifiers, we developed a new visualization system that uses a nomogram approach.
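The abstract does not say which cost-sensitive scheme was paired with the SVM. As a minimal sketch of one common choice, the snippet below weights each class inversely to its frequency, so that errors on the rare class cost more during training. The function name `balanced_class_weights` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to class frequency.

    Uses the common heuristic n_samples / (n_classes * class_count).
    This is an assumed scheme; the paper's actual cost settings are not given.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical unbalanced toy labels: 1 = developed nephropathy, 0 = did not.
toy_labels = [0] * 8 + [1] * 2
class_weights = balanced_class_weights(toy_labels)
print(class_weights)
```

With these weights, a misclassified minority-class patient contributes four times as much to the training loss as a misclassified majority-class patient in this toy example.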
RESULTS: Linear SVM classifiers combined with wrapper or embedded feature selection methods showed the best results. Among the 184 features, the classifiers selected the same 39 features and achieved an area under the receiver operating characteristic (ROC) curve of 0.969. The visualization tool was able to present the effect of each feature on the decision via graphical output.
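For readers unfamiliar with the metric, the area under the ROC curve can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on made-up scores; the function name `roc_auc` and the data are illustrative assumptions and do not reproduce the study's evaluation.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Counts the fraction of (positive, negative) pairs in which the positive
    example has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and classifier scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```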
CONCLUSIONS: Our proposed method can predict the onset of diabetic nephropathy about 2-3 months before the actual diagnosis with high prediction performance from an irregular and unbalanced dataset, which statistical methods such as t-test and logistic regression could not achieve. Additionally, the visualization system provides physicians with intuitive information for risk factor analysis. Therefore, physicians can benefit from automatic early warning for each patient and from visualization of the risk factors, which facilitates planning of effective and proper treatment strategies.
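The abstract gives no detail on how the nomogram is built. One reason linear classifiers lend themselves to such displays is that the decision value is a sum of per-feature terms, so each feature's effect can be shown separately. The sketch below illustrates that decomposition only; the feature names, weights, and patient values are hypothetical and not drawn from the study.

```python
def feature_contributions(weights, bias, x):
    """Split a linear decision value w . x + b into per-feature terms.

    Returns (contributions, score), where contributions[name] = w[name] * x[name]
    and score is the sum of all contributions plus the bias.
    """
    contributions = {name: weights[name] * x[name] for name in weights}
    score = sum(contributions.values()) + bias
    return contributions, score


# Hypothetical weights of a trained linear model and one standardized patient record.
w = {"hba1c": 0.5, "systolic_bp": 0.25, "hdl": -0.5}
b = -1.0
patient = {"hba1c": 2.0, "systolic_bp": 4.0, "hdl": 1.0}

contribs, score = feature_contributions(w, b, patient)
# List features by the size of their effect, largest first.
for name, value in sorted(contribs.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:12s} {value:+.2f}")
print(f"decision value: {score:+.2f}")
```

A positive decision value would flag the patient as at risk under this toy model, and the per-feature terms show which measurements pushed the score up or down.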

Year:  2007        PMID: 17997291     DOI: 10.1016/j.artmed.2007.09.005

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  22 in total

1.  Modeling paradigms for medical diagnostic decision support: a survey and future directions. (Review)

Authors:  Kavishwar B Wagholikar; Vijayraghavan Sundararajan; Ashok W Deshpande
Journal:  J Med Syst       Date:  2011-10-01       Impact factor: 4.460

2.  Comparison of variable selection methods for clinical predictive modeling.

Authors:  L Nelson Sanchez-Pinto; Laura Ruth Venable; John Fahrenbach; Matthew M Churpek
Journal:  Int J Med Inform       Date:  2018-05-21       Impact factor: 4.046

3.  Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

Authors:  Huseyin Polat; Homay Danaei Mehr; Aydin Cetin
Journal:  J Med Syst       Date:  2017-02-27       Impact factor: 4.460

4.  Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

Authors:  Chau-Kuang Chen; Michelle Bruce; Lauren Tyler; Claudine Brown; Angelica Garrett; Susan Goggins; Brandy Lewis-Polite; Mirabel L Weriwoh; Paul D Juarez; Darryl B Hood; Tyler Skelton
Journal:  J Health Care Poor Underserved       Date:  2013-02

5.  Psoriasis prediction from genome-wide SNP profiles.

Authors:  Shenying Fang; Xiangzhong Fang; Momiao Xiong
Journal:  BMC Dermatol       Date:  2011-01-07

6.  Application of support vector machine for prediction of medication adherence in heart failure patients.

Authors:  Youn-Jung Son; Hong-Gee Kim; Eung-Hee Kim; Sangsup Choi; Soo-Kyoung Lee
Journal:  Healthc Inform Res       Date:  2010-12-31

7.  An interpretable rule-based diagnostic classification of diabetic nephropathy among type 2 diabetes patients.

Authors:  Guan-Mau Huang; Kai-Yao Huang; Tzong-Yi Lee; Julia Weng
Journal:  BMC Bioinformatics       Date:  2015-01-21       Impact factor: 3.169

8.  A novel hybrid classification model of genetic algorithms, modified k-Nearest Neighbor and developed backpropagation neural network.

Authors:  Nader Salari; Shamarina Shohaimi; Farid Najafi; Meenakshii Nallappan; Isthrinayagy Karishnarajah
Journal:  PLoS One       Date:  2014-11-24       Impact factor: 3.240

9.  Decoding Saccadic Directions Using Epidural ECoG in Non-Human Primates.

Authors:  Jeyeon Lee; Hoseok Choi; Seho Lee; Baek Hwan Cho; Kyoung Ha Ahn; In Young Kim; Kyoung Min Lee; Dong Pyo Jang
Journal:  J Korean Med Sci       Date:  2017-08       Impact factor: 2.153

10.  Big data analytics for preventive medicine.

Authors:  Muhammad Imran Razzak; Muhammad Imran; Guandong Xu
Journal:  Neural Comput Appl       Date:  2019-03-16       Impact factor: 5.102
