Feature selection and classification model construction on type 2 diabetic patients' data.

Yue Huang, Paul McCullagh, Norman Black, Roy Harper.

Abstract

OBJECTIVE: Diabetes affects between 2% and 4% of the global population (up to 10% in the over 65 age group), and its avoidance and effective treatment are undoubtedly crucial public health and health economics issues in the 21st century. The aim of this research was to identify significant factors influencing diabetes control, by applying feature selection to a working patient management system to assist with ranking, classification and knowledge discovery. The classification models can be used to determine individuals in the population with poor diabetes control status based on physiological and examination factors.
METHODS: Diabetic patients' information was collected by the Ulster Community and Hospitals Trust (UCHT) from 2000 to 2004 as part of clinical management. Data mining techniques were applied to discover key predictors and latent knowledge. To improve computational efficiency, a feature selection technique, feature selection via supervised model construction (FSSMC), an optimisation of ReliefF, was used to rank the important attributes affecting diabetic control. After suitable features had been selected, three complementary classification techniques (Naïve Bayes, IB1 and C4.5) were applied to the data to predict how well the patients' condition was controlled.
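The rank-then-classify pipeline described in METHODS can be illustrated with a minimal sketch. This is not FSSMC, whose details are not given in the abstract; it uses the basic two-class Relief weighting (the simpler ancestor of ReliefF) followed by a 1-nearest-neighbour classifier in the spirit of IB1. The function names `relief_weights` and `predict_1nn`, the feature indices, and the six-row toy data set are all hypothetical and made up for illustration; they are not drawn from the UCHT data.

```python
def manhattan(a, b, features):
    """Manhattan distance between two rows over the given feature indices."""
    return sum(abs(a[f] - b[f]) for f in features)


def relief_weights(X, y):
    """Basic two-class Relief, visiting every instance once.

    Assumes numeric features already scaled to [0, 1]. For each instance,
    a feature gains weight when it differs from the nearest instance of the
    other class (nearest miss) and loses weight when it differs from the
    nearest instance of the same class (nearest hit).
    """
    n, d = len(X), len(X[0])
    all_features = range(d)
    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: manhattan(X[i], X[j], all_features))
        miss = min(misses, key=lambda j: manhattan(X[i], X[j], all_features))
        for f in all_features:
            diff_miss = abs(X[i][f] - X[miss][f])
            diff_hit = abs(X[i][f] - X[hit][f])
            weights[f] += (diff_miss - diff_hit) / n
    return weights


def predict_1nn(X_train, y_train, x, features):
    """Label of the training row nearest to x, using only `features`."""
    nearest = min(range(len(X_train)),
                  key=lambda k: manhattan(X_train[k], x, features))
    return y_train[nearest]


# Synthetic toy data (illustrative only): feature 0 separates the classes,
# feature 1 varies widely within each class, feature 2 varies only slightly.
X = [
    [0.10, 0.9, 0.5],
    [0.20, 0.1, 0.4],
    [0.15, 0.5, 0.6],
    [0.90, 0.8, 0.5],
    [0.80, 0.2, 0.6],
    [0.85, 0.5, 0.4],
]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
# Feature indices ordered from most to least relevant.
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)

# Classify new rows using only the top-ranked feature.
top_features = ranking[:1]
pred_a = predict_1nn(X, y, [0.3, 0.9, 0.5], top_features)
pred_b = predict_1nn(X, y, [0.7, 0.1, 0.5], top_features)
```

Restricting the classifier to the top-ranked features is what reduces the cost of the distance computations; a real study would choose the cut-off by validation rather than fixing it at one feature as this sketch does.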
RESULTS: FSSMC identified patients' 'age', 'diagnosis duration', the need for 'insulin treatment', 'random blood glucose' measurement and 'diet treatment' as the most important factors influencing blood glucose control. Using the reduced features, a best predictive accuracy of 95% and a sensitivity of 98% were achieved. The influence of factors such as the 'type of care' delivered, the use of 'home monitoring', and the importance of 'smoking' on outcome can contribute to domain knowledge in diabetes control.
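For readers less familiar with the two metrics quoted in RESULTS, the following sketch shows how accuracy and sensitivity are conventionally computed from true and predicted labels. It assumes, as a labelled guess, that the positive class (1) stands for poor diabetes control; the abstract does not state which class was treated as positive. The label lists are synthetic and do not reproduce the reported 95% and 98% figures.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Fraction of truly positive cases that were detected (recall)."""
    return tp / (tp + fn)


# Synthetic labels for illustration: 1 = poor control (assumed positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
sens = sensitivity(counts[0], counts[3])
```

Sensitivity matters in this setting because a missed poorly controlled patient (a false negative) is usually the costlier error, so a model can be preferred for high sensitivity even when overall accuracy is similar.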
CONCLUSION: In the care of patients with diabetes, the more important factors identified (patients' 'age', 'diagnosis duration' and 'family history') are beyond the control of physicians. Treatment methods such as 'insulin', 'diet' and 'tablets' (a variety of oral medicines) may be controlled. However, lifestyle indicators such as 'body mass index' and 'smoking status' are also important and may be controlled by the patient. This further underlines the need for public health education to aid awareness and prevention. More subtle data interactions need to be better understood, and data mining can contribute to the clinical evidence base. The research confirms and, to a lesser extent, challenges current thinking. Whilst fully appreciating the requirement for clinical verification and interpretation, this work supports the use of data mining as an exploratory tool, particularly as the domain is suffering from a data explosion due to enhanced monitoring and the (potential) storage of these data in the electronic health record. FSSMC has proved a useful feature estimator for large data sets, where processing efficiency is an important factor.

Year:  2007        PMID: 17707617     DOI: 10.1016/j.artmed.2007.07.002

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  13 in total

Review 1.  Data-mining technologies for diabetes: a systematic review.

Authors:  Miroslav Marinov; Abu Saleh Mohammad Mosa; Illhoi Yoo; Suzanne Austin Boren
Journal:  J Diabetes Sci Technol       Date:  2011-11-01

2.  Temporal data mining for the assessment of the costs related to diabetes mellitus pharmacological treatment.

Authors:  Stefano Concaro; Lucia Sacchi; Carlo Cerra; Mario Stefanelli; Pietro Fratino; Riccardo Bellazzi
Journal:  AMIA Annu Symp Proc       Date:  2009-11-14

Review 3.  Unobtrusive sensing and wearable devices for health informatics.

Authors:  Ya-Li Zheng; Xiao-Rong Ding; Carmen Chung Yan Poon; Benny Ping Lai Lo; Heye Zhang; Xiao-Lin Zhou; Guang-Zhong Yang; Ni Zhao; Yuan-Ting Zhang
Journal:  IEEE Trans Biomed Eng       Date:  2014-05       Impact factor: 4.538

4.  Discretization of continuous features in clinical datasets.

Authors:  David M Maslove; Tanya Podchiyska; Henry J Lowe
Journal:  J Am Med Inform Assoc       Date:  2012-10-11       Impact factor: 4.497

5.  Relational machine learning for electronic health record-driven phenotyping.

Authors:  Peggy L Peissig; Vitor Santos Costa; Michael D Caldwell; Carla Rottscheit; Richard L Berg; Eneida A Mendonca; David Page
Journal:  J Biomed Inform       Date:  2014-07-15       Impact factor: 6.317

Review 6.  -Omic and Electronic Health Record Big Data Analytics for Precision Medicine.

Authors:  Po-Yen Wu; Chih-Wen Cheng; Chanchala D Kaddi; Janani Venugopalan; Ryan Hoffman; May D Wang
Journal:  IEEE Trans Biomed Eng       Date:  2016-10-10       Impact factor: 4.538

7.  Pilot study on developing a decision support tool for guiding re-administration of chemotherapeutic agent after a serious adverse drug reaction.

Authors:  Pei Yi Loke; Lita Chew; Chun Wei Yap
Journal:  BMC Cancer       Date:  2011-07-28       Impact factor: 4.430

8.  A machine learning-based framework to identify type 2 diabetes through electronic health records.

Authors:  Tao Zheng; Wei Xie; Liling Xu; Xiaoying He; Ya Zhang; Mingrong You; Gong Yang; You Chen
Journal:  Int J Med Inform       Date:  2016-10-01       Impact factor: 4.046

9.  Type 2 Diabetes Mellitus Screening and Risk Factors Using Decision Tree: Results of Data Mining.

Authors:  Shafi Habibi; Maryam Ahmadi; Somayeh Alizadeh
Journal:  Glob J Health Sci       Date:  2015-03-18

10.  A Similarity Analysis of Audio Signal to Develop a Human Activity Recognition Using Similarity Networks.

Authors:  Alejandra García-Hernández; Carlos E Galván-Tejada; Jorge I Galván-Tejada; José M Celaya-Padilla; Hamurabi Gamboa-Rosales; Perla Velasco-Elizondo; Rogelio Cárdenas-Vargas
Journal:  Sensors (Basel)       Date:  2017-11-21       Impact factor: 3.576
