Bias arising from missing data in predictive models.

Marc H Gorelick

Abstract

OBJECTIVE: To determine the effect of three common approaches to handling missing data on the results of a predictive model.
STUDY DESIGN AND SETTING: Monte Carlo simulation study. A baseline logistic regression using complete data was performed to predict hospital admission based on white blood cell count (WBC) (dichotomized as normal or high), presence of fever, and procedures performed (PROC). A series of simulations was then performed in which WBC data were deleted for varying proportions (15-85%) of patients under various patterns of missingness. Three analytic approaches were used: analysis restricted to cases with complete data (complete-case, CC), missing data assumed to be normal (MAN), and use of imputed values.
RESULTS: In the baseline analysis, all three predictors were significantly associated with admission. Using either the MAN approach or imputation, the odds ratio (OR) for WBC was substantially over- or underestimated depending on the missingness pattern, and there was considerable bias toward the null in the OR estimates for fever. In the CC analyses, the OR for WBC was consistently biased toward the null, the OR for PROC was biased away from the null, and the OR for fever was biased toward or away from the null. Estimates of overall model discrimination were substantially biased using all analytic approaches.
CONCLUSIONS: All three methods of handling large amounts of missing data can lead to biased estimates of the OR and of model performance in predictive models. Predictor variables that are measured inconsistently can affect the validity of such models.
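
The way a "missing assumed normal" rule can push an odds ratio in either direction may be easier to see in a toy simulation. The sketch below is not the study's simulation: it uses a single binary predictor (high WBC) and a crude 2x2 odds ratio instead of a three-predictor logistic regression, and every number in it (prevalence of high WBC, admission risks, missingness probabilities) is an assumption chosen for illustration. The function names `odds_ratio` and `simulate` are hypothetical.

```python
import random


def odds_ratio(pairs):
    """Crude odds ratio from (exposed, outcome) pairs.

    Adds 0.5 to each cell (Haldane-Anscombe correction) so that an
    empty cell does not cause a division by zero.
    """
    a = b = c = d = 0
    for exposed, outcome in pairs:
        if exposed and outcome:
            a += 1
        elif exposed:
            b += 1
        elif outcome:
            c += 1
        else:
            d += 1
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def simulate(n=20000, miss_admitted=0.25, miss_not_admitted=0.75, seed=0):
    """Return (full-data OR, complete-case OR, MAN OR) for one simulated cohort.

    Assumed data-generating values (illustrative only):
      - 30% of patients have a high WBC
      - admission risk is 40% with high WBC and 10% with normal WBC
      - WBC is missing with a probability that depends on admission status
    """
    rng = random.Random(seed)
    full, complete_case, man = [], [], []
    for _ in range(n):
        high_wbc = rng.random() < 0.30
        admitted = rng.random() < (0.40 if high_wbc else 0.10)
        full.append((high_wbc, admitted))

        p_missing = miss_admitted if admitted else miss_not_admitted
        if rng.random() < p_missing:
            # MAN: a missing WBC is recoded as normal; CC drops the patient.
            man.append((False, admitted))
        else:
            complete_case.append((high_wbc, admitted))
            man.append((high_wbc, admitted))
    return odds_ratio(full), odds_ratio(complete_case), odds_ratio(man)


if __name__ == "__main__":
    for label, kwargs in [
        ("WBC missing more often in non-admitted patients",
         dict(miss_admitted=0.25, miss_not_admitted=0.75)),
        ("WBC missing more often in admitted patients",
         dict(miss_admitted=0.75, miss_not_admitted=0.25)),
    ]:
        full_or, cc_or, man_or = simulate(**kwargs)
        print(f"{label}: full={full_or:.2f} CC={cc_or:.2f} MAN={man_or:.2f}")
```

Under these assumptions the MAN estimate is inflated when WBC is missing mainly in non-admitted patients and shrinks toward the null when it is missing mainly in admitted patients. In this deliberately simple setup the missingness depends only on the outcome, so the crude complete-case odds ratio stays close to the full-data value; that is a property of the toy design and should not be read as contradicting the complete-case bias reported above for the multivariable model.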

Year:  2006        PMID: 16980153     DOI: 10.1016/j.jclinepi.2004.11.029

Source DB:  PubMed          Journal:  J Clin Epidemiol        ISSN: 0895-4356            Impact factor:   6.437


