
Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches.

Jionglin Wu, Jason Roy, Walter F Stewart.

Abstract

BACKGROUND: Electronic health record (EHR) databases contain vast amounts of information about patients. Machine learning techniques such as Boosting and support vector machine (SVM) can potentially identify patients at high risk for serious conditions, such as heart disease, from EHR data. However, these techniques have not yet been widely tested.
OBJECTIVE: To model detection of heart failure more than 6 months before the actual date of clinical diagnosis using machine learning techniques applied to EHR data, and to compare the performance of logistic regression, SVM, and Boosting, along with various variable selection methods, in heart failure prediction.
RESEARCH DESIGN: Geisinger Clinic primary care patients with EHR data from 2001 to 2006 who were diagnosed with heart failure between 2003 and 2006 were identified. Controls were randomly selected and matched on sex, age, and clinic for this nested case-control study.
MEASURES: Area under the receiver operating characteristic curve (AUC) was computed for each method using 10-fold cross-validation. The number of variables selected by each method was compared.
RESULTS: Logistic regression with model selection based on Bayesian information criterion provided the most parsimonious model, with about 10 variables selected on average, while maintaining a high AUC (0.77 in 10-fold cross-validation). Boosting with strict variable importance threshold provided similar performance.
CONCLUSIONS: Heart failure was predicted more than 6 months before clinical diagnosis, with AUC of about 0.76, using logistic regression and Boosting. These results were achieved even with strict model selection criteria. SVM had the poorest performance, possibly because of imbalanced data.
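
As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of a cross-validated AUC computed in plain Python. It is not the authors' code. The function names (`auc`, `k_fold_indices`, `cross_validated_auc`, `fit_midpoint_scorer`), the toy data, and the choice to pool out-of-fold scores before computing a single AUC are all assumptions made for this example; the abstract does not say how fold-level results were combined, and the toy scorer stands in for logistic regression, SVM, or Boosting.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: the fraction of (case, control) pairs in which the
    case receives the higher score, with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_validated_auc(X, y, fit, k=10, seed=0):
    """Score every record with a model that never saw it during training,
    then compute one AUC over the pooled out-of-fold scores (an assumption
    of this sketch, not a detail taken from the abstract)."""
    out_of_fold = [None] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            out_of_fold[i] = score(X[i])
    return auc(y, out_of_fold)


def fit_midpoint_scorer(X_train, y_train):
    """Hypothetical stand-in model: scores a record by how far its first
    feature lies above the midpoint of the case and control means."""
    cases = [x[0] for x, label in zip(X_train, y_train) if label == 1]
    controls = [x[0] for x, label in zip(X_train, y_train) if label == 0]
    midpoint = (sum(cases) / len(cases) + sum(controls) / len(controls)) / 2
    return lambda x: x[0] - midpoint


# Toy, perfectly separable data: 10 controls and 10 cases (not study data).
X = [[float(v)] for v in list(range(10)) + list(range(100, 110))]
y = [0] * 10 + [1] * 10
cv_auc = cross_validated_auc(X, y, fit_midpoint_scorer)
print(cv_auc)
```

Pooling the held-out scores avoids undefined fold-level AUCs when a small fold happens to contain only cases or only controls. Averaging per-fold AUCs over stratified folds is an equally common alternative.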

Year:  2010        PMID: 20473190     DOI: 10.1097/MLR.0b013e3181de9e17

Source DB:  PubMed          Journal:  Med Care        ISSN: 0025-7079            Impact factor:   2.983


  90 in total

1.  Characterizing Physicians Practice Phenotype from Unstructured Electronic Health Records.

Authors:  Sanjoy Dey; Yajuan Wang; Roy J Byrd; Kenney Ng; Steven R Steinhubl; Christopher deFilippi; Walter F Stewart
Journal:  AMIA Annu Symp Proc       Date:  2017-02-10

2.  Differential Privacy Preserving in Big Data Analytics for Connected Health.

Authors:  Chi Lin; Zihao Song; Houbing Song; Yanhong Zhou; Yi Wang; Guowei Wu
Journal:  J Med Syst       Date:  2016-02-12       Impact factor: 4.460

3.  Delirium Prediction using Machine Learning Models on Preoperative Electronic Health Records Data.

Authors:  Anis Davoudi; Ashkan Ebadi; Parisa Rashidi; Tazcan Ozrazgat-Baslanti; Azra Bihorac; Alberto C Bursian
Journal:  Proc IEEE Int Symp Bioinformatics Bioeng       Date:  2018-01-11

4.  Coronary artery disease risk assessment from unstructured electronic health records using text mining.

Authors:  Jitendra Jonnagaddala; Siaw-Teng Liaw; Pradeep Ray; Manish Kumar; Nai-Wen Chang; Hong-Jie Dai
Journal:  J Biomed Inform       Date:  2015-08-28       Impact factor: 6.317

5.  Recurrent Neural Networks for Early Detection of Heart Failure From Longitudinal Electronic Health Record Data: Implications for Temporal Modeling With Respect to Time Before Diagnosis, Data Density, Data Quantity, and Data Type.

Authors:  Robert Chen; Walter F Stewart; Jimeng Sun; Kenney Ng; Xiaowei Yan
Journal:  Circ Cardiovasc Qual Outcomes       Date:  2019-10-15

6.  Clinical risk prediction by exploring high-order feature correlations.

Authors:  Fei Wang; Ping Zhang; Xiang Wang; Jianying Hu
Journal:  AMIA Annu Symp Proc       Date:  2014-11-14

7.  Preprocessing structured clinical data for predictive modeling and decision support. A roadmap to tackle the challenges.

Authors:  José Carlos Ferrão; Mónica Duarte Oliveira; Filipe Janela; Henrique M G Martins
Journal:  Appl Clin Inform       Date:  2016-12-07       Impact factor: 2.342

8.  Learning to predict post-hospitalization VTE risk from EHR data.

Authors:  Emily Kawaler; Alexander Cobian; Peggy Peissig; Deanna Cross; Steve Yale; Mark Craven
Journal:  AMIA Annu Symp Proc       Date:  2012-11-03

9.  Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes.

Authors:  Peter C Austin; Jack V Tu; Jennifer E Ho; Daniel Levy; Douglas S Lee
Journal:  J Clin Epidemiol       Date:  2013-02-04       Impact factor: 6.437

Review 10.  Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0).

Authors:  Abhyuday Jagannatha; Feifan Liu; Weisong Liu; Hong Yu
Journal:  Drug Saf       Date:  2019-01       Impact factor: 5.606
