Artuur M Leeuwenberg (1), Ewoud Schuit (2, 1)
1. Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht 3508 GA, Netherlands.
2. Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht 3508 GA, Netherlands.
As of Sept 2, 2020, more than 25 million cases of COVID-19 have been reported, with more than 850 000 associated deaths worldwide. Patients infected with severe acute respiratory syndrome coronavirus 2, the virus that causes COVID-19, could require treatment in the intensive care unit for up to 4 weeks. As such, this disease is a major burden on health-care systems, leading to difficult decisions about who to treat and who not to. Prediction models that combine patient and disease characteristics to estimate the risk of a poor outcome from COVID-19 can provide helpful assistance in clinical decision making.

In a living systematic review by Wynants and colleagues, 145 models were reviewed, of which 50 were for prognosis of patients with COVID-19, including 23 predicting mortality. Critical appraisal of these models showed a high risk of bias for all models (eg, because of a high risk of model overfitting and unclear reporting on intended use of the models, or because of no reporting of the models' calibration performance). Moreover, external validation of these models, deemed essential before application can even be considered, was rarely done. Therefore, use of any of these reported prediction models was not recommended in current practice.

In The Lancet Digital Health, Arjun S Yadaw and colleagues present two models to predict mortality in patients with COVID-19 admitted to the Mount Sinai Health System in the New York City area. These researchers have addressed many of the issues encountered by Wynants and colleagues and provide extensive information about the modelling in the appendix. The dataset used for model development (n=3841) is larger than in most currently published models, and the accompanying number of patients who died (n=313) seems appropriate according to the prediction model risk of bias assessment tool (PROBAST) and guidance on sample size requirements for prediction model development.
The calibration performance of the models is reported, which (although essential) is often missing, particularly in studies reporting on machine-learning algorithms, and external validation of the models was done. Yadaw and colleagues acknowledge that additional external validation will be necessary because external validation was done in a random subset of the initial patient population and another set of recent patients from the same health system, and because the number of events in each validation set was below the 100 suggested for reliable external validity assessment.

For other researchers to apply and externally validate models, adherence to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) criteria is advised, including presentation of the full models, accompanied by code in the case of complex machine-learning models. Yadaw and colleagues reported many items in TRIPOD; however, the models themselves are not reported in the Article or appendix (item 15a of TRIPOD), so it is not possible for a reader to make predictions for new individuals (eg, to validate the developed models in their own data or investigate the contribution of the individual predictors).

The moment for risk estimation defines which values of predictors will be available and is especially important for time-varying predictors (eg, temperature). The models reported by Yadaw and colleagues predict risk using measurements collected throughout the entire encounter of the patient with the health system, with no specific moment of prediction defined. This raises questions about the actual prognostic value of the time-varying predictors (eg, the minimum oxygen saturation) and, hence, how and when the model should be used, because the predictive value of time-varying predictors will likely increase when they are measured closer to the outcome.
Consequently, it remains unclear how to interpret the reported area under the curve of approximately 90% in relation to the moment of measurement of these time-varying predictors.

Two suggestions can be made regarding modelling. First, the current machine-learning models were constructed using the default hyperparameter values provided by the respective software packages. These often provide reasonable starting values, but important hyperparameters should be carefully tuned to the specific use case. Second, as acknowledged by Yadaw and colleagues, patients who had not developed the outcome by the end of the study were considered not to have the outcome. Since the outcome for these patients might occur after the study ended, the actual incidence of mortality could have been underestimated. Alternatively, a fixed follow-up period per patient could have been defined to allow sufficient follow-up time to measure the outcome in each patient.

The study by Yadaw and colleagues ticks a lot of boxes, but it still struggles somewhat to break away from the overall negative picture painted by Wynants and colleagues. Improvements can be achieved by more and better collaboration among researchers from different backgrounds, clinicians, and institutes and sharing of patient data from COVID-19 studies and registries. Then, and with improved reporting (by adherence to TRIPOD criteria), validity, and quality (according to PROBAST), prediction models can provide the decision support that is needed when COVID-19 cases and hospital admissions again test the limits of the health-care system.
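
To make the second modelling suggestion above concrete, the following minimal Python sketch contrasts a fixed follow-up outcome definition with the labelling it would replace. It is an illustration only and not the approach used by Yadaw and colleagues: the function name `label_fixed_followup`, the 28-day horizon, and the example dates are all hypothetical assumptions. The idea is that a patient counts as a non-event only if they were observed for the full follow-up period; patients admitted too close to the data cutoff are left undetermined rather than assumed to have survived.

```python
from datetime import date, timedelta
from typing import Optional


def label_fixed_followup(
    admitted: date,
    died: Optional[date],
    data_cutoff: date,
    horizon_days: int = 28,  # assumed horizon, chosen only for illustration
) -> Optional[int]:
    """Label mortality within a fixed follow-up window after admission.

    Returns 1 if death occurred within the window, 0 if the patient was
    followed for the whole window without dying in it, and None if the
    window extends past the data cutoff and no death was observed.
    """
    horizon_end = admitted + timedelta(days=horizon_days)
    if died is not None and died <= horizon_end:
        return 1
    if data_cutoff >= horizon_end:
        return 0
    return None  # follow-up too short: outcome cannot be determined yet


# Hypothetical patients: (admission date, date of death or None).
cutoff = date(2020, 5, 1)
patients = {
    "A": (date(2020, 3, 1), date(2020, 3, 10)),  # died within the window
    "B": (date(2020, 3, 1), None),               # survived the full window
    "C": (date(2020, 4, 20), None),              # admitted shortly before cutoff
    "D": (date(2020, 3, 1), date(2020, 4, 15)),  # died after the window closed
}
labels = {
    pid: label_fixed_followup(adm, dth, cutoff)
    for pid, (adm, dth) in patients.items()
}
```

Under these assumptions, patient C would be labelled as a survivor if everyone without a recorded death were treated as a non-event, whereas the fixed-window definition marks C as undetermined, so that patient can be excluded or handled with time-to-event methods instead.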
Authors: Richard D Riley; Joie Ensor; Kym I E Snell; Frank E Harrell; Glen P Martin; Johannes B Reitsma; Karel G M Moons; Gary Collins; Maarten van Smeden Journal: BMJ Date: 2020-03-18
Authors: Robert F Wolff; Karel G M Moons; Richard D Riley; Penny F Whiting; Marie Westwood; Gary S Collins; Johannes B Reitsma; Jos Kleijnen; Sue Mallett Journal: Ann Intern Med Date: 2019-01-01
Authors: Ben Van Calster; David J McLernon; Maarten van Smeden; Laure Wynants; Ewout W Steyerberg Journal: BMC Med Date: 2019-12-16
Authors: Laure Wynants; Ben Van Calster; Gary S Collins; Richard D Riley; Georg Heinze; Ewoud Schuit; Marc M J Bonten; Darren L Dahly; Johanna A A Damen; Thomas P A Debray; Valentijn M T de Jong; Maarten De Vos; Paul Dhiman; Maria C Haller; Michael O Harhay; Liesbet Henckaerts; Pauline Heus; Michael Kammer; Nina Kreuzberger; Anna Lohmann; Kim Luijken; Jie Ma; Glen P Martin; David J McLernon; Constanza L Andaur Navarro; Johannes B Reitsma; Jamie C Sergeant; Chunhu Shi; Nicole Skoetz; Luc J M Smits; Kym I E Snell; Matthew Sperrin; René Spijker; Ewout W Steyerberg; Toshihiko Takada; Ioanna Tzoulaki; Sander M J van Kuijk; Bas van Bussel; Iwan C C van der Horst; Florien S van Royen; Jan Y Verbakel; Christine Wallisch; Jack Wilkinson; Robert Wolff; Lotty Hooft; Karel G M Moons; Maarten van Smeden Journal: BMJ Date: 2020-04-07