BACKGROUND: Prediction models combine patient characteristics and test results to predict the presence of a disease or the occurrence of a future event. When a test result (predictor) is unavailable, users applying a prediction model need a strategy for handling the missing value. We evaluated 6 such strategies.

METHODS: We developed and validated a prediction model for the risk of deep venous thrombosis in 1295 and 532 primary care patients, respectively. In an application set (259 patients), we mimicked 3 situations in which (1) an important predictor (D-dimer test), (2) a weaker predictor (difference in calf circumference), or (3) both predictors simultaneously were missing. The 6 strategies were (1) ignoring the predictor, (2) overall mean imputation, (3) subgroup mean imputation, (4) multiple imputation, (5) applying a submodel, derived from the development set, that includes only the observed predictors, and (6) the "one-step-sweep" method. We compared the model's discriminative ability (expressed as the ROC area) with the true ROC area (no missing values), and the model's estimated calibration slope and intercept with their ideal values of 1 and 0, respectively.

RESULTS: Ignoring the predictor led to the worst discrimination and multiple imputation to the best. Multiple imputation also led to calibration intercepts closest to the true value. The effect of the strategies on the calibration slope differed between the 3 scenarios.

CONCLUSIONS: Multiple imputation is preferred if a predictor value is missing.
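To make the comparison concrete, the following is a minimal Python sketch of the two simplest strategies (ignoring the predictor and overall mean imputation) applied to a missing D-dimer result, evaluated by the ROC area and the mean predicted risk. Everything in it is an assumption for illustration: the coefficients, the 0/1 coding of both predictors, the simulated patients, and the function names are hypothetical and are not the study's model or data. The remaining four strategies and the calibration slope are not shown.

```python
import math
import random

# Hypothetical logistic model; NOT the coefficients of the study's DVT model.
INTERCEPT = -3.0
COEF = {"d_dimer": 2.0, "calf_diff": 0.8}  # both predictors coded 0/1 (assumption)


def predict_risk(patient):
    """Predicted probability from the linear predictor of the logistic model."""
    lp = INTERCEPT + sum(COEF[name] * patient[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-lp))


def ignore_predictor(patient, name):
    """Strategy 1: drop the term, which equals setting the value to 0."""
    return {**patient, name: 0.0}


def impute_overall_mean(patient, name, development_mean):
    """Strategy 2: replace the missing value by the development-set mean."""
    return {**patient, name: development_mean}


def roc_area(risks, outcomes):
    """Chance that a random case outranks a random non-case (ties count 0.5)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def simulate(n, rng):
    """Simulated patients whose outcomes follow the hypothetical model."""
    patients, outcomes = [], []
    for _ in range(n):
        p = {"d_dimer": float(rng.random() < 0.4),
             "calf_diff": float(rng.random() < 0.3)}
        patients.append(p)
        outcomes.append(1 if rng.random() < predict_risk(p) else 0)
    return patients, outcomes


def compare_strategies(n_application=500, seed=1, missing="d_dimer"):
    """ROC area and mean predicted risk when `missing` is unavailable for all."""
    rng = random.Random(seed)
    development, _ = simulate(1000, rng)
    dev_mean = sum(p[missing] for p in development) / len(development)
    application, outcomes = simulate(n_application, rng)

    variants = {
        "complete": application,
        "ignore": [ignore_predictor(p, missing) for p in application],
        "overall_mean": [impute_overall_mean(p, missing, dev_mean)
                         for p in application],
    }
    summary = {"observed_rate": sum(outcomes) / len(outcomes)}
    for label, patients in variants.items():
        risks = [predict_risk(p) for p in patients]
        summary[label] = {"auc": roc_area(risks, outcomes),
                          "mean_risk": sum(risks) / len(risks)}
    return summary
```

Calling `compare_strategies()` returns one entry per strategy. In this simplified setup, where the same constant is filled in for every patient, the two strategies rank patients identically and therefore give the same ROC area; they differ only in the level of the predicted risks, which is what the calibration intercept is meant to capture.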