Discretization of continuous features in clinical datasets.

David M Maslove, Tanya Podchiyska, Henry J Lowe.

Abstract

BACKGROUND: The increasing availability of clinical data from electronic medical records (EMRs) has created opportunities for secondary uses of health information. When used in machine learning classification, many data features must first be transformed by discretization.
OBJECTIVE: To evaluate six discretization strategies, both supervised and unsupervised, using EMR data.
MATERIALS AND METHODS: We classified laboratory data (arterial blood gas (ABG) measurements) and physiologic data (cardiac output (CO) measurements) derived from adult patients in the intensive care unit using decision trees and naïve Bayes classifiers. Continuous features were partitioned using two supervised and four unsupervised discretization strategies. The resulting classification accuracy was compared with that obtained with the original, continuous data.
RESULTS: Supervised methods were more accurate and consistent than unsupervised, but tended to produce larger decision trees. Among the unsupervised methods, equal frequency and k-means performed well overall, while equal width was significantly less accurate.
DISCUSSION: This is, we believe, the first dedicated evaluation of discretization strategies using EMR data. It is unlikely that any one discretization method applies universally to EMR data. Performance was influenced by the choice of class labels and, in the case of unsupervised methods, the number of intervals. In selecting the number of intervals there is generally a trade-off between greater accuracy and greater consistency.
CONCLUSIONS: In general, supervised methods yield higher accuracy, but are constrained to a single specific application. Unsupervised methods do not require class labels and can produce discretized data that can be used for multiple purposes.
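
To make the three unsupervised strategies named in the results concrete (equal width, equal frequency, and k-means), the following is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the authors' implementation: the function names, the quantile-based k-means initialisation, and the sample values are all hypothetical, and the sample values are made up rather than taken from the study data.

```python
from bisect import bisect_right
from statistics import mean


def equal_width_cuts(values, k):
    """Return k-1 cut points splitting [min, max] into k equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, k):
        idx = i * n // k
        # Place the cut midway between the two neighbouring observations.
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def kmeans_cuts(values, k, iterations=100):
    """1-D k-means (Lloyd's algorithm); cuts are midpoints between adjacent centroids."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start with centroids at evenly spaced quantiles.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical, skewed sample (one large outlier); not clinical data.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
for name, make_cuts in [("equal width", equal_width_cuts),
                        ("equal frequency", equal_frequency_cuts),
                        ("k-means", kmeans_cuts)]:
    print(name, discretize(values, make_cuts(values, 3)))
```

On a skewed sample like this one, equal width places nearly every value in a single interval because the outlier stretches the range, whereas equal frequency and k-means still separate the bulk of the data. That behaviour is one plausible reason equal width can lose accuracy, although the abstract itself does not state the cause.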

Year: 2012; PMID: 23059731; PMCID: PMC3628044; DOI: 10.1136/amiajnl-2012-000929

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497


