
Adapting machine learning techniques to censored time-to-event health record data: A general-purpose approach using inverse probability of censoring weighting.

David M Vock, Julian Wolfson, Sunayan Bandyopadhyay, Gediminas Adomavicius, Paul E Johnson, Gabriela Vazquez-Benitez, Patrick J O'Connor.

Abstract

Models for predicting the probability of experiencing various health outcomes or adverse events over a certain time frame (e.g., having a heart attack in the next 5 years) based on individual patient characteristics are important tools for managing patient care. Electronic health data (EHD) are appealing sources of training data because they provide access to large amounts of rich individual-level data from present-day patient populations. However, because EHD are derived by extracting information from administrative and clinical databases, some fraction of subjects will not be under observation for the entire time frame over which one wants to make predictions; this loss to follow-up is often due to disenrollment from the health system. For subjects without complete follow-up, whether or not they experienced the adverse event is unknown, and in statistical terms the event time is said to be right-censored. Most machine learning approaches to the problem have been relatively ad hoc; for example, common approaches for handling observations in which the event status is unknown include (1) discarding those observations, (2) treating them as non-events, and (3) splitting each such observation into two observations, one where the event occurs and one where it does not. In this paper, we present a general-purpose approach to account for right-censored outcomes using inverse probability of censoring weighting (IPCW). We illustrate how IPCW can easily be incorporated into a number of existing machine learning algorithms used to mine big health care data, including Bayesian networks, k-nearest neighbors, decision trees, and generalized additive models. We then show that our approach leads to better calibrated predictions than the three ad hoc approaches when applied to predicting the 5-year risk of experiencing a cardiovascular adverse event, using EHD from a large U.S. Midwestern healthcare system.
Copyright © 2016 Elsevier Inc. All rights reserved.
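
The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code, and the abstract does not say how the censoring distribution is estimated; the sketch assumes a Kaplan-Meier estimate with no covariates, no tied times, and made-up data. The function names `censoring_survival_left` and `ipcw_labels_and_weights` are hypothetical. Subjects whose event status at the horizon `tau` is unknown get weight zero; the rest are up-weighted by the inverse of their estimated probability of remaining uncensored.

```python
from typing import List, Sequence, Tuple


def censoring_survival_left(times: Sequence[float], events: Sequence[int],
                            t: float) -> float:
    """Kaplan-Meier estimate of G(t-) = P(C >= t), treating censoring as the 'event'.

    Assumption for this sketch: no ties between event and censoring times.
    """
    surv = 1.0
    # Distinct censoring times strictly before t.
    censor_times = sorted({u for u, d in zip(times, events) if d == 0 and u < t})
    for u in censor_times:
        at_risk = sum(1 for v in times if v >= u)
        n_censored = sum(1 for v, d in zip(times, events) if v == u and d == 0)
        surv *= 1.0 - n_censored / at_risk
    return surv


def ipcw_labels_and_weights(times: Sequence[float], events: Sequence[int],
                            tau: float) -> Tuple[List[int], List[float]]:
    """Binary label (event by tau) and IPCW weight for each subject."""
    labels: List[int] = []
    weights: List[float] = []
    for v, d in zip(times, events):
        had_event = d == 1 and v <= tau
        status_known = had_event or v >= tau  # followed through tau, or event seen
        labels.append(1 if had_event else 0)
        if status_known:
            g = censoring_survival_left(times, events, min(v, tau))
            weights.append(1.0 / g)
        else:
            weights.append(0.0)  # censored before tau: status at tau unknown
    return labels, weights


# Made-up follow-up times (years) and indicators (1 = event, 0 = censored).
times = [2.0, 3.0, 4.0, 6.0, 7.0, 1.0]
events = [1, 0, 1, 0, 1, 0]
labels, weights = ipcw_labels_and_weights(times, events, tau=5.0)

# Weighted estimate of the 5-year event risk.
risk = sum(w * y for w, y in zip(weights, labels)) / sum(weights)
print(labels, weights, risk)
```

The resulting `labels` and `weights` could then be passed to any learner that accepts per-observation weights, which is the sense in which the approach is general-purpose. In this toy data set the weighted risk coincides with one minus the Kaplan-Meier survival estimate at 5 years.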

Keywords:  Censored data; Electronic health data; Inverse probability weighting; Machine learning; Risk prediction; Survival analysis

Year: 2016; PMID: 26992568; PMCID: PMC4893987; DOI: 10.1016/j.jbi.2016.03.009

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317


