Missing data should be handled differently for prediction than for description or causal explanation.

Matthew Sperrin, Glen P Martin, Rose Sisk, Niels Peek

Abstract

Missing data are much studied in epidemiology and statistics. Theoretical development and application of methods for handling missing data have mostly been conducted in the context of prospective research data and with a goal of description or causal explanation. However, it is now common to build predictive models using routinely collected data, where missing patterns may convey important information, and one might take a pragmatic approach to optimizing prediction. Therefore, different methods to handle missing data may be preferred. Furthermore, an underappreciated issue in prediction modeling is that the missing data method used in model development may not match the method used when a model is deployed. This may lead to overoptimistic assessments of model performance. For prediction, particularly with routinely collected data, methods for handling missing data that incorporate information within the missingness pattern should be explored and further developed. Where missing data methods differ between model development and model deployment, the implications of this must be explicitly evaluated. The trade-off between building a prediction model that is causally principled, and building a prediction model that maximizes the use of all available information, should be carefully considered and will depend on the intended use of the model.
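To make the two practical points of the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' method. It assumes mean imputation combined with missingness indicators as one simple way to keep the information carried by the missingness pattern, and it reuses the means learned at development time when the model is deployed, so that the handling of missing values matches between the two stages. The function names `fit_means` and `transform`, the column meanings, and the toy values are all hypothetical.

```python
from statistics import fmean


def fit_means(rows):
    """Learn per-column means from development data, ignoring missing (None) values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: fall back to 0.0 if a column is never observed in development.
        means.append(fmean(observed) if observed else 0.0)
    return means


def transform(rows, means):
    """Impute with the stored development means and append one 0/1
    missingness indicator per original column."""
    out = []
    for r in rows:
        filled = [means[j] if v is None else float(v) for j, v in enumerate(r)]
        flags = [1 if v is None else 0 for v in r]
        out.append(filled + flags)
    return out


# Hypothetical toy data: columns are (systolic blood pressure, cholesterol).
development = [[120, 5.0], [140, None], [None, 6.0], [130, 5.5]]
means = fit_means(development)

# At deployment, the same stored means are applied to a new patient record.
deployment = [[None, 4.8]]
X_new = transform(deployment, means)
print(X_new)
```

In this sketch the indicator columns let a downstream model learn from the fact that a value was not recorded, while storing `means` keeps the imputation step identical at development and deployment. Whether such indicators are appropriate depends on the intended use of the model, as the abstract notes.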

Keywords: Clinical prediction models; Missing data; Model performance; Multiple imputation; Prognostic model; Routinely collected data

Year: 2020 PMID: 32540389 DOI: 10.1016/j.jclinepi.2020.03.028

Source DB: PubMed Journal: J Clin Epidemiol ISSN: 0895-4356 Impact factor: 6.437

Cited

8 in total

1. Benchmarking missing-values approaches for predictive models on health databases.

Authors: Alexandre Perez-Lebel; Gaël Varoquaux; Marine Le Morvan; Julie Josse; Jean-Baptiste Poline
Journal: Gigascience Date: 2022-04-15 Impact factor: 7.658

2. Informative presence and observation in routine health data: A review of methodology for clinical risk prediction.

Authors: Rose Sisk; Lijing Lin; Matthew Sperrin; Jessica K Barrett; Brian Tom; Karla Diaz-Ordaz; Niels Peek; Glen P Martin
Journal: J Am Med Inform Assoc Date: 2021-01-15 Impact factor: 4.497

3. OptiMissP: A dashboard to assess missingness in proteomic data-independent acquisition mass spectrometry.

Authors: Angelica Arioli; Arianna Dagliati; Bethany Geary; Niels Peek; Philip A Kalra; Anthony D Whetton; Nophar Geifman
Journal: PLoS One Date: 2021-04-15 Impact factor: 3.240

4. LACE Score-Based Risk Management Tool for Long-Term Home Care Patients: A Proof-of-Concept Study in Taiwan.

Authors: Mei-Chin Su; Yu-Chun Chen; Mei-Shu Huang; Yen-Hsi Lin; Li-Hwa Lin; Hsiao-Ting Chang; Tzeng-Ji Chen
Journal: Int J Environ Res Public Health Date: 2021-01-28 Impact factor: 3.390

5. Prediction of sustained biologic and targeted synthetic DMARD-free remission in rheumatoid arthritis patients.

Authors: Theresa Burkard; Ross D Williams; Enriqueta Vallejo-Yagüe; Thomas Hügle; Axel Finckh; Diego Kyburz; Andrea M Burden
Journal: Rheumatol Adv Pract Date: 2021-11-13

6. COVID-19 Vaccine Hesitancy in Delaware's Underserved Communities.

Authors: Sharron Xuanren Wang; Nicole Bell-Rogers; Dorothy Dillard; Melissa A Harrington
Journal: Dela J Public Health Date: 2021-09-27

7. On prediction of aided behavioural measures using speech auditory brainstem responses and decision trees.

Authors: Emanuele Perugia; Ghada BinKhamis; Josef Schlittenlacher; Karolina Kluk
Journal: PLoS One Date: 2021-11-16 Impact factor: 3.240

8. Accommodating heterogeneous missing data patterns for prostate cancer risk prediction.

Authors: Matthias Neumair; Michael W Kattan; Stephen J Freedland; Alexander Haese; Lourdes Guerrios-Rivera; Amanda M De Hoedt; Michael A Liss; Robin J Leach; Stephen A Boorjian; Matthew R Cooperberg; Cedric Poyet; Karim Saba; Kathleen Herkommer; Valentin H Meissner; Andrew J Vickers; Donna P Ankerst
Journal: BMC Med Res Methodol Date: 2022-07-21 Impact factor: 4.612
