
Reporting of coronavirus disease 2019 prognostic models: the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis statement.

Liuqing Yang1,2, Qiang Wang1,2, Tingting Cui1,2, Jinxin Huang1,2, Naiyang Shi1,2, Hui Jin1,2.   

Abstract

Evaluation of the validity and applicability of published prognostic prediction models for coronavirus disease 2019 (COVID-19) is essential, because determining patients' prognosis at an early stage may reduce mortality. This study aimed to use the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement to assess the reporting completeness of COVID-19-related prognostic models and to appraise their usefulness in clinical practice. A systematic search of the Web of Science and PubMed was performed for studies published until August 11, 2020. Models were grouped into four types: model development, external validation of existing models, incremental value, and development and validation of the same model. TRIPOD was used to assess the completeness of the included models, and the completeness of each item was also reported. In total, 52 publications covering 67 models were included. Age, disease history, lymphocyte count, history of hypertension and cardiovascular disease, C-reactive protein, lactate dehydrogenase, white blood cell count, and platelet count were the commonly used predictors. The predicted outcomes were death, development of a severe or critical state, survival time, and length of hospital stay. The reported discrimination performance of all models ranged from 0.361 to 0.994, while few models reported calibration. Overall, the reporting completeness based on TRIPOD was between 31% and 83% [median, 67% (interquartile range: 62%, 73%)]. Blinding of the assessment of the outcome to be predicted or of the predictors was poorly reported. Additionally, there was little description of the handling of missing data. This assessment indicated that COVID-19 prognostic models are poorly reported in the existing literature. These models may be at risk of over-fitting. The reporting of calibration and external validation should be given more attention in future research. 2021 Annals of Translational Medicine. All rights reserved.


Keywords:  Coronavirus disease 2019 (COVID-19); prognostic model; transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD)

Year:  2021        PMID: 33842642      PMCID: PMC8033387          DOI: 10.21037/atm-20-6933

Source DB:  PubMed          Journal:  Ann Transl Med        ISSN: 2305-5839


Introduction

The novel coronavirus disease 2019 (COVID-19) poses an urgent threat to global health. As of August 28, 2020, 24,299,923 confirmed cases of COVID-19, including 827,730 deaths, had been reported to the World Health Organization (WHO) (1). The huge number of cases has put tremendous pressure on medical facilities. In addition to the high risk of infection among medical staff, effectively allocating resources, such as intensive care unit (ICU) beds and other medical equipment, is also a challenge. According to existing reports, many infected patients show mild flu-like symptoms and recover quickly (2). However, some rapidly develop acute respiratory distress syndrome, multiple organ failure, and death (3-6). Therefore, a current concern is to determine patients’ prognosis at an early stage in order to reduce mortality. To provide patients with the most appropriate level of treatment and care, many studies have combined multiple predictors into models that predict patients’ prognosis in clinical practice, but the quality of these reports has not been evaluated (7-9). Complete reporting benefits study replication and helps readers assess a model’s applicability to other individuals. Therefore, high-quality reporting of prediction models is essential. In 2015, multiple journals simultaneously published a statement on how to improve the quality of reports on prediction model studies, namely the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement (10). TRIPOD is a checklist of 22 items covering title and abstract (items 1 and 2), background and objectives (item 3), methods (items 4 through 12), results (items 13 through 17), discussion (items 18 through 20), and other information (items 21 and 22). The TRIPOD statement covers the development and external validation of prediction models as well as studies with only external validation (updates with or without predictors).
A previous systematic review showed an unsatisfactory level of quality of prediction models in various clinical fields (11). Wynants et al. also conducted a systematic review of prediction models for COVID-19 (12). However, the results were qualitative, and no unified indicator was reported for measuring and comparing reporting completeness across studies. Our study provides a new evaluation method for model reporting and summarizes the omissions common in current reporting, so that future research can avoid these problems and improve the quality of model reporting. Our research aimed to use the TRIPOD tool to systematically review and critically evaluate the published models for predicting the prognosis or course of COVID-19 in patients. The results could provide a basis for further improving the quality of COVID-19-related prognostic model reporting. We present the following article in accordance with the PRISMA reporting checklist (available at http://dx.doi.org/10.21037/atm-20-6933).

Methods

Search strategy

A search was conducted in the PubMed and Web of Science databases for studies published until August 11, 2020, with no language restrictions. Terms related to COVID-19 (COVID-19, SARS-COV-2, novel corona, 2019-ncov) and to prognostic models (prognostic, prediction model, regression) were searched in the databases. We also searched reviews in this field and the references of the original articles to identify any missed studies. Only peer-reviewed studies on prognostic models of COVID-19 were included in our research; preprints were not considered.

Inclusion and exclusion criteria

We included articles on multivariate models or risk scores for predicting any prognostic outcome of COVID-19. The exclusion criteria were as follows: (I) non-human research; (II) studies on prediction models of disease transmission; (III) diagnostic models of COVID-19; (IV) studies on predictive factors with no established prognostic model; (V) studies on prediction models using non-regression techniques (e.g., machine learning, neural networks), since TRIPOD does not support the evaluation of such methods (13). Studies were screened against the above criteria by two investigators (LQY and QW), and differences were resolved through discussion.

Data extraction

Two investigators (LQY and TTC) independently reviewed the titles and abstracts of all extracted articles. Any discrepancies were resolved through discussion and, if necessary, by a consultant (HJ). Investigators used the TRIPOD standard data extraction forms to determine the completeness of articles (www.tripod-statement.org). Additionally, the publications were grouped into four types of prediction models: development, external validation of existing models, incremental value, and development and validation of the same model. Publications could be classified into more than one type of prediction model. For development studies, if different models were developed using the same data in one study, we extracted information from the primary model. For external validation of different existing models, information was extracted separately. Studies that reported both development and external validation of different models were classified as both development and external validation models. The basic information of each study (study region, study design, sample size, and predicted outcomes) was extracted. In addition, information about the predictors reported in the articles was extracted. Predictors are variables that are included in the model at the time of model construction and that have a statistical relationship with the predicted outcomes. Previous researchers have recommended that age, sex, C-reactive protein, lactate dehydrogenase, lymphocyte count, and potentially features derived from CT scoring be included in COVID-19 prognostic models (12). Similarly, we extracted the prediction performance, including discrimination and calibration and their standard error (SE) or 95% confidence interval (CI), if provided. Discrimination was usually measured by the area under the receiver operating characteristic curve (AUROC) or c-index, while calibration was usually quantified by the calibration intercept and calibration slope. The closer the AUROC or c-index and the calibration slope are to 1, the better the performance of the model. The performance data were extracted in the following order of preference: external validation, internal validation, and original performance (if neither of the first two was available).
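As an illustration of the discrimination measure described above, the following is a minimal sketch of how an AUROC (equivalently, a c-index for a binary outcome) can be computed from predicted risks and observed outcomes. The function name `auroc` and the example data are hypothetical and do not come from any of the included studies; the pairwise definition used here is the standard one, not a method stated by the authors.

```python
def auroc(scores, labels):
    """Share of (event, non-event) pairs in which the event received the
    higher predicted risk; tied pairs count as one half."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and outcomes (1 = event, 0 = no event).
risks = [0.10, 0.35, 0.40, 0.80, 0.65, 0.20]
died = [0, 1, 0, 1, 1, 0]
print(round(auroc(risks, died), 3))
```

A value of 0.5 corresponds to chance-level ranking and 1 to perfect separation of events from non-events, which is why values closer to 1 indicate better discrimination.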

Analysis

To evaluate the completeness of each included model, the number of TRIPOD items that were completely reported was divided by the total number of TRIPOD items applicable to the article. Furthermore, to assess the overall reporting completeness of each item in the TRIPOD statement, we divided the number of models with complete reporting of a specific TRIPOD item by the total number of models to which this item was applicable. Five TRIPOD items include “if completed” or “if applicable” statements (items 5c, 10e, 11, 14b, and 17). If such an item was not considered applicable to a study, it was excluded from both the numerator and the denominator. For validation studies, a random-effects model was used to pool the reported prediction performance with its 95% CI in the meta-analysis. The I² statistic was used to assess heterogeneity among the studies. When the I² statistic was >50% (moderate heterogeneity), the random-effects model was used for the analysis.
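The two completeness ratios defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' scoring tool: the function names, the dictionary layout, and the example ratings are hypothetical. Each model is represented as a mapping from TRIPOD item to `True` (completely reported), `False` (not completely reported), or `None` (not applicable, so excluded from numerator and denominator).

```python
def model_completeness(ratings):
    """Fraction of applicable TRIPOD items that one model reported completely."""
    applicable = [done for done in ratings.values() if done is not None]
    if not applicable:
        raise ValueError("no applicable TRIPOD items")
    return sum(applicable) / len(applicable)


def item_completeness(models, item):
    """Fraction of models, among those the item applies to, that reported it
    completely. Returns None when the item applies to no model."""
    applicable = [m[item] for m in models if m.get(item) is not None]
    if not applicable:
        return None
    return sum(applicable) / len(applicable)


# Hypothetical ratings for two models on a handful of items.
model_a = {"1": False, "2": False, "3a": True, "5c": None, "9": False, "16": True}
model_b = {"1": True, "2": False, "3a": True, "5c": True, "9": False, "16": False}

print(model_completeness(model_a))                # 2 of 5 applicable items
print(item_completeness([model_a, model_b], "5c"))  # only model_b counts
```

Treating "not applicable" as a third state, rather than as "not reported", keeps conditional items such as 5c or 14b from lowering the score of a study to which they do not apply.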

Results

After screening, a total of 52 publications were included in our study (Figure 1). From the 52 publications, we scored 67 models using the TRIPOD tool as follows: 37 (55%) development, 14 (21%) external validation of existing models, 3 (5%) incremental value, and 13 (19%) development and validation of the same model.
Figure 1

The flowchart of literature research. The flow chart is made according to PRISMA (the Preferred Reporting Items for Systematic Reviews and Meta-Analysis).


Primary information

Thirty-six studies used data on COVID-19 patients from China, four from Italy, and two from the United States. Britain, France, Norway, Turkey, Spain, and Mexico had one each. Four studies did not specify the country or region of the data. Regarding study design, most (88%) were retrospective studies, while two were prospective studies. One study used retrospective data for model development but recruited patients prospectively for the validation cohort. One study identified the race of the participants as Caucasian (8). The follow-up date was mentioned in 23 studies. All the studies reported the sample size [median, 220.5 (interquartile range (IQR): 109.25, 459.25)]. Detailed information is shown in Table 1 and Appendix 1.
Table 1

Primary information of prognostic models

No. | First author | Study region | Study design | Outcome | Sample size | Performance (discrimination) | Type of validation | Validation sample size | Validation performance | Calibration
1 | Yuan | Wuhan, China | Retrospective | Death | 27 | 0.901 (0.873, 0.928) | None | None | None | No
2 | Osborne | Veterans, United States | Retrospective | Death | 4,614 | 0.73 | Internal validation (randomly split) | 1,977 | Not reported | No
3 | Francone | Not reported | Retrospective | Death | 130 | 0.672 (0.647, 0.877) | None | None | None§ | No
4 | Cozzi | Not reported | Retrospective | ICU admission | 234 | ICC 0.92 (0.88, 0.95) | None | None | None | No
5 | Borghesi | Italy | Retrospective | Death | 302 | 0.853 | None | None | None | No
6 | Wang | Wuhan, China | Retrospective | Death | 296 | 0.88 (0.80, 0.95) | External validation | 44 | 0.83 (0.75, 0.96) | No
7 | Hong | Zhejiang, China | Retrospective | Prolonged length of stay in hospital | 75 | 0.848 (0.753, 0.944) | None | None | None | No
8 | Yu | Wuhan, China | Retrospective | Death | 1,464 | 0.765 (0.725, 0.805) | None | None | None | No
9 | Galloway | London, England | Not reported | Critical care admission and death | 578 | 0.757 (0.713, 0.805) | Internal validation (randomly split) | 579 | 0.712 (0.664, 0.759) | Yes
10 | Liu | Wuhan, China | Retrospective | The development of severe/critical disease | 84 | 0.804 (0.702, 0.883) | External validation | 71 | 0.881 (0.782, 0.946) | Yes
11 | Borghesi | Italy | Not reported | Death | 100 | Kw 0.82 (0.79, 0.86) | None | None | None | No
12 | Liu | Shanghai, China | Retrospective | Severe-event-free survival | 134 | 0.78 (0.69, 0.88) | None | None | None | No
13 | Yao | Wuhan, China | Retrospective | Death | 248 | 0.85 (0.77, 0.92) | None | None | None | No
14 | Zhou | Sichuan, China | Retrospective | Development of severe COVID-19 | 366 | 0.863 (0.801, 0.925) | Internal validation (bootstrap) | Not reported | 0.839 | Yes
15 | Liang | China | Retrospective | Development of critical illness | 1,590 | 0.88 (0.85, 0.91) | Internal validation (bootstrap)/external validation | Not reported/710 | 0.88 (0.85, 0.91)/0.88 (0.84, 0.93) | No
16 | Dong | Wuhan, China | Retrospective | Survival time | 377 | 0.901 | Internal validation (randomly split) | 251 | 0.892 | Yes
17 | Zheng | Hubei/Anhui, China | Retrospective | ICU admission, mechanical ventilation, or death | 166 | 0.82 (0.76, 0.88) | External validation | 72 | 0.89 (0.82, 0.96) | Yes
18 | Zhang | Wuhan, China | Retrospective | Survival probability | 516 | 0.886 (0.873, 0.899) | External validation | 186 | 0.879 (0.856, 0.900) | Yes
19 | Xiao | Hubei/Jiangxi, China | Retrospective | Severe state | 231 | 0.861 (0.800, 0.922) | Internal validation (randomly split)/external validation | 101/110 | 0.871 (0.769, 0.972)/0.826 (0.746, 0.907) | Yes
20 | Wang | Wuhan, China | Retrospective | Death | 108 | 0.964 (0.909, 0.990) | None | None | None | No
21 | Zheng | Zhejiang, China | Retrospective | Severe state | 141 | 0.821 (0.746, 0.896) | None | None | None | Yes
22 | Wu | Wuhan, China | Retrospective | Moderately ill and severely/critically ill | 210 | 0.955 | Internal validation (randomly split) | 60 | 0.945 | No
23 | Luo | Not reported | Retrospective | Death | 1,018 | 0.907 (0.886, 0.928) | None | None | None | Yes
24 | Huang | Hubei, China | Retrospective | Disease progression in mild cases | 344 | 0.849 | None | None | None | No
25 | Liu | Wuhan, China | Retrospective | Death | 336 | 0.994 (0.979, 0.999) | None | None | None | No
26 | Hu | Wuhan, China | Retrospective | Death of severe or critical patients | 105 | 0.864 | None | None | None | No
27 | Zhang | Wuhan, China | Retrospective | The death rate of critically ill patients in ICU | 136 | Not reported | None | None | None | No
28 | Lorente-Ros | Not reported | Retrospective | Death | 770 | 0.775 | None | None | None | No
29 | Myrstad | Oslo area, Norway | Prospective | Severe disease and in-hospital mortality | 66 | 0.786 (0.659, 0.913) | None | None | None | No
30 | Liu | Beijing, China | Prospective | Development of critical illness | 61 | 0.807 (0.676, 0.938) | External validation | 54 | 0.882 (0.778, 0.986) | Yes
31 | Nguyen | Paris, France | Retrospective | Unfavorable outcome | 279 | 0.75 | None | None | None | Yes
32 | Zhang | Beijing, China | Retrospective | Severity of the disease | 80 | 0.906 | External validation | 22 | 0.958 | No
33 | Satici | Istanbul, Turkey | Retrospective | 30-day mortality | 681 | 0.92 (0.89, 0.94) | None | None | None | No
34 | Pascual Gómez | Madrid, Spain | Retrospective | Death rate | 163 | 0.874 (0.816, 0.933) | None | None | None | No
35 | Luo | Wuhan, China | Retrospective | Death | 1,115 | 0.955 (0.941, 0.970) | None | None | None | No
36 | Bello-Chavolla | Mexico | Retrospective | 30-day death rate | 41,306 | 0.822 | Internal validation (randomly split) | 10,327 | 0.83 | No
37 | Ji | Anhui/Beijing, China | Retrospective | Severe progression | 208 | 0.86 (0.81, 0.91) | Internal validation (bootstrap) | Not reported | Not reported | Yes
38 | Zhao | New York City, United States | Retrospective | ICU admission and death | 454 | 0.87 (0.83, 0.92) | Internal validation | 187 | 0.74 (0.63, 0.85) | No
39 | Luo | Wuhan, China | Not reported | Death or survival | 739 | 0.956 (0.928, 0.984) | None | None | None | No
40 | Bi | Zhejiang, China | Retrospective | Occurrence of severe illness | 113 | 0.712 (0.610, 0.814) | External validation | 28 | Not reported | Yes
41 | Zheng | Zhejiang, China | Retrospective | Rehabilitation duration | 90 | R² 0.361 | None | None | None | No
42 | Liu | Wuhan, China | Retrospective | Critical progression | 88 | 0.971 (0.910, 0.995) | None | None | None | No
43 | Gidari | Italy | Retrospective | ICU admission | 71 | 0.90 (0.82, 0.97) | None | None | None | No
44 | Vultaggio | Florence, Italy | Retrospective | Clinical deterioration | 208 | 0.86 | None | None | None | Yes
45 | Yang | Chongqing, China | Retrospective | Critical progression | 133 | 0.8842 | None | None | None | No
46 | Wang | Wuhan, China | Retrospective | Death of critical patients | 104 | 0.893 (0.807, 0.98) | Internal validation (bootstrap) | Not reported | Not reported | No
47 | Chen | China | Retrospective | Death | 1,590 | 0.91 (0.85, 0.97) | Internal validation (bootstrap) | Not reported | Not reported | Yes
48 | Shang | Wuhan, China | Retrospective | The death of severe cases | 113 | 0.919 (0.870, 0.97) | External validation | 339 | 0.938 (0.902, 0.973) | Yes
49 | Li | Shanghai, China | Retrospective/prospective | The development of severe disease | 322 | 0.92 (0.88, 0.95) | External validation | 317 | 0.92 (0.89, 0.95) | Yes
50 | Zeng | Hunan, China | Retrospective | ICU admission | 461 | 0.835 (0.742, 0.929) | None | None | None | Yes
51 | Gong | Guangzhou, China | Retrospective | Severe progression | 189 | 0.912 (0.846, 0.978) | Internal validation (3-fold cross-validation)/external validation | 165/18 | Not reported/0.853 (0.790, 0.916) | Yes
52 | Shang | Wuhan, China | Retrospective | Severe progression | 443 | 0.774 | None | None | None | No

†, ICU is the abbreviation of intensive care unit; ‡, not reported means the information cannot be extracted; §, none means this part is not applicable for this study.
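The sample-size summary reported in the text (median 220.5; IQR 109.25, 459.25) can be checked against the sample-size column of Table 1. The sketch below is a minimal check, assuming the 52 development sample sizes as read from that column. The article does not state which quantile convention was used; the code assumes the "exclusive" method, which is the default of Python's `statistics.quantiles`.

```python
import statistics

# Development sample sizes as read from Table 1, in row order (studies 1 to 52).
sample_sizes = [
    27, 4614, 130, 234, 302, 296, 75, 1464, 578, 84, 100, 134, 248,
    366, 1590, 377, 166, 516, 231, 108, 141, 210, 1018, 344, 336, 105,
    136, 770, 66, 61, 279, 80, 681, 163, 1115, 41306, 208, 454, 739,
    113, 90, 88, 71, 208, 133, 104, 1590, 113, 322, 461, 189, 443,
]

# n=4 yields the three quartile cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(sample_sizes, n=4)
print(len(sample_sizes), q1, median, q3)
```

Under this convention the three cut points agree with the values given in the text, which suggests the reported IQR was computed the same way.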


Prognostic predictors

Six studies used computed tomography (CT) or chest X-ray results to establish the scoring rules in their final model. The median number of prognostic predictors was five (IQR: 3, 6.25). The most frequently used predictors in the models (≥10 times) were as follows: age, disease history, lymphocyte count, history of hypertension and cardiovascular disease, C-reactive protein, lactate dehydrogenase, white blood cell count, and platelet count, reported 26 (50%), 17 (33%), 14 (27%), 12 (23%), 12 (23%), 11 (21%), 10 (19%), and 10 (19%) times, respectively. Other commonly used predictors (≥5 times) were as follows: lymphocyte ratio, procalcitonin, aspartate aminotransferase, and dyspnea, reported 8 (15%), 5 (10%), 5 (10%), and 5 (10%) times, respectively (Appendix 2).

Prediction outcomes and performances

The prediction outcomes in 23, 17, 8, 2, and 2 studies were death, development of a severe or critical disease state, ICU admission/mechanical ventilation/death, survival time, and length of hospital stay, respectively (Table 1). For death, the reported discrimination performance ranged from 0.584 to 0.994. Another study reported the weighted kappa (kw) and 95% CI (14). The calibration of the mortality prediction model by Luo et al. showed good consistency between the predictions in the training cohort and the actual observations (15). In two other studies, the model also fitted well (16,17). When the outcome was severe or critical progression of the disease, the discrimination ranged from 0.636 to 0.971. For ICU admission/mechanical ventilation/death, the discrimination varied between 0.712 and 0.900. Discrimination reported for the length-of-hospital-stay outcome ranged from 0.361 to 0.848. For survival time, the discrimination was between 0.672 and 0.892.

Reporting completeness per model in TRIPOD

Figure 2 and the file (https://cdn.amegroups.cn/static/application/df0da0ff07a31a06aa1b1e1cf3b15d66/atm-20-6933-1.pdf) present the completeness of the models in TRIPOD. Overall, the reporting completeness was between 31% and 83%, with a median of 67% (IQR: 62%, 73%). Reporting was most complete for incremental value models, with a median of 83%, followed by validation models (70%, IQR: 64%, 74%). The development models (66%, IQR: 62%, 70%) and the development and validation of the same model (62%, IQR: 56%, 71%) had similar reporting completeness.
Figure 2

The reporting completeness of models in TRIPOD. Data are median [interquartile range (IQR)] and each point represents the completeness of one model; TRIPOD is the abbreviation of the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis.


Reporting completeness per TRIPOD items

We found that the TRIPOD items in the discussion section (items 18–20) were well reported, at up to 100%. Supplementary information (item 21) and research funding (item 22) were also well reported, at 100%. The remaining 14 items were reported at ≥75% completeness for all types of models (e.g., development, validation, development and validation of the same model, and incremental value). Four items were reported at <25%. The other TRIPOD items are described in detail below. Because only three incremental value models qualified, a sample too small to be representative, we did not include this type of model in the following elaboration. All details are shown in Figure 3 and Appendix 3.
Figure 3

Reporting of the items in TRIPOD. The combination of numbers and letters in the abscissa represents the items in TRIPOD; TRIPOD is the abbreviation of the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis. NA is the abbreviation of not applicable and it means that the item does not apply to this type of models.


Items 1–3 (title/abstract/introduction)

In all types of models, the reporting completeness of the title and abstract items was low, ranging from 5% to 36%. However, the completeness of the introduction section (item 3) was high: studies specified the objectives, presented the background, and included references to existing models. In development, 5 (11%) of the 37 models explicitly identified the study as developing and/or validating a multivariable prediction model and reported the target population and the predicted outcome in the title. The corresponding completeness was 36% for validation and 31% for development and validation of the same model. Four validation models satisfied all 12 elements of item 2. That is, the research objectives, study design, setting, participants, sample sizes, predictors, prediction outcomes, and statistical analyses were all provided in the abstract, together with brief results and conclusions. The completeness of item 2 was 5% in development and 23% in development and validation of the same model.

Items 4–12 (methods)

Items 4–5, 6a, 8, 10c, and 11 were highly reported among all the models, with all values >80%. This meant that the sources of data, key study dates, and eligibility criteria for the participants were well reported. However, the reporting completeness of how missing data were handled (item 9) and of the model-building procedures (item 10b) was low, at <15%. Reporting of any blinding of the assessment of the outcome to be predicted was not high in development (57%) or in development and validation of the same model (46%). Assessment of the model performance (item 10d) had a reporting completeness of 24% in development, 43% in validation, and 54% in development and validation of the same model. These results were mainly due to neglect of the calibration element. In validation, very few models (7%) noted the need to compare the validation data with the development data (item 12). However, item 12 was well reported in the development and validation of the same model, at up to 77%.

Items 13–17 (results)

All types of models were highly complete in reporting the number of participants and outcome events in the analysis and the unadjusted association between candidate predictors and outcomes (items 14a and 14b), reaching more than 90%. However, only a few models addressed all four elements of item 13b, and the reporting completeness was <5%. This was because researchers tended to omit the number of participants with missing data for predictors and prediction outcomes. In the development, and development and validation of the same model, few studies reported adequate information on the final model (item 15a), with completeness of 32% and 8%, respectively. Although most models presented regression coefficients for each predictor, the intercept or the cumulative baseline hazard (or baseline survival) for at least one time point was poorly reported. In development, 46% of all models fully reported item 15b, and many researchers did not explain how to use the newly established prediction model. Whether in development, validation, or development and validation of the same model, the reporting of the prediction model performance measures (item 16) was not ideal, at 24%, 43%, and 62%, respectively. This was because many models failed to report the element on model calibration, which also corresponded to the low reporting of item 10d in the methods section.

Meta-analysis

In the meta-analysis, we identified five included validation studies from which the discrimination of CURB-65 could be extracted. The CURB-65 score is a prediction model used to divide patients with community-acquired pneumonia into different treatment groups (18). The pooled discrimination of CURB-65 in patients with COVID-19 was 0.768 (95% CI, 0.694, 0.841). The forest plot is shown in Appendix 4.
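To make the pooling step concrete, the following is a minimal sketch of a random-effects meta-analysis of discrimination estimates. It assumes the DerSimonian-Laird estimator of between-study variance and pooling on the raw AUROC scale; the article states only that a random-effects model and the I² statistic were used, so both choices are assumptions. The function names and the five input values are hypothetical and are not the five CURB-65 validations analysed here.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def pool_random_effects(estimates, std_errors, z=1.96):
    """DerSimonian-Laird pooling. Returns (pooled, (ci_low, ci_high), I2 in %)."""
    weights = [1 / se ** 2 for se in std_errors]
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Cochran's Q measures the spread of estimates around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    re_weights = [1 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - z * se_pooled, pooled + z * se_pooled), i2


# Hypothetical AUROCs with 95% CIs from five validation cohorts.
aurocs = [0.70, 0.81, 0.75, 0.84, 0.72]
cis = [(0.62, 0.78), (0.75, 0.87), (0.66, 0.84), (0.79, 0.89), (0.64, 0.80)]
ses = [se_from_ci(lo, hi) for lo, hi in cis]
pooled, ci, i2 = pool_random_effects(aurocs, ses)
print(round(pooled, 3), tuple(round(x, 3) for x in ci), round(i2, 1))
```

Because the between-study variance is added to every study's variance, the random-effects interval is at least as wide as the fixed-effect one, which is the usual reason for preferring it when I² exceeds 50%.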

Discussion

In this systematic review of prognostic models related to COVID-19, we included a total of 67 models from 52 studies. The main prediction outcomes were as follows: death, development of a severe/critical state, ICU admission/mechanical ventilation/death, survival time, and length of hospital stay. The outcomes overlapped: the outcome predicted in some studies served as an indicator of the outcome predicted in others. Zeng et al. focused on identifying patients at high risk of progression who would require transfer to the ICU (19). On the other hand, many other studies listed ICU admission as one of the indicators of their prediction outcomes (i.e., severe or critical progression and mortality) (20-22). Additionally, the same outcome was defined differently in different studies; the definition of severe and critical cases was not uniform. Liu et al. assessed the status of patients according to the American Thoracic Society guidelines (23). Liang et al. also defined severity based on the American Thoracic Society guidelines for community-acquired pneumonia, given the extensive acceptance of this guideline (24). However, Xiao et al. used the Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia (Trial Version 7) as the guideline for the spectrum of severity (25). Blinded assessment of the prediction outcome and the predictors was ignored in the models. All-cause mortality is well defined and not affected by subjective factors, but for other outcomes, such as progression to a severe state, an explicit description of how the outcome was judged is expected.

Potential for popularizing clinical practice

Optimistic discrimination performance was reported for all the models. However, the existing models were at risk of over-fitting, because the numbers of samples and events available for developing the new prediction models were limited. In addition, most studies directly excluded records with missing data from the original data, which greatly reduced the sample sizes. Multiple imputation may be used to address this challenge. The overfitting can also be alleviated by calibration, which has rarely been evaluated in models. In future prediction model research, attention should be paid to the handling of missing values, and multiple imputation should be carried out when appropriate. In addition, emphasis should be placed on calibration results when reporting model performance. Similarly, there were few (only 13) external validations of the newly established models, which is insufficient to promote the existing models directly in clinical practice. In addition, there were few internal validations of the newly established models. Random splitting, rather than bootstrapping or k-fold cross-validation, was the most frequently used method, which aggravated the limitation of small sample sizes. Based on our findings, we encourage researchers to include age, disease history, lymphocyte count, history of hypertension and cardiovascular disease, C-reactive protein, lactate dehydrogenase, white blood cell count, and platelet count in the prediction model, rather than simply selecting predictors in a data-driven manner, which may put the model at risk of overfitting. Research participants should be adequately described in the development data, which helps newly established models to be applied in the real world. Borghesi et al. identified Caucasians as participants in a study (8). Osborne clarified that their model was aimed at veterans in the United States (26). Pascual specified that the setting of their study was the hospital emergency room (27). However, most studies paid little attention to the applicability of the model, although we recognize that, given the particular circumstances of COVID-19, the time and space for completing these studies were limited. Moreover, the reporting completeness of the final model presentation was poor. Although the regression coefficient (or a derivative such as the hazard ratio, odds ratio, or risk ratio) for each predictor was reported in a large number of models, the intercept or the cumulative baseline hazard for at least one time point was omitted, which will make it difficult for future research to re-validate and recalibrate the developed models. All of the above hindered the improvement of the prediction models and their promotion in clinical practice. In our study, moderate to excellent discrimination was found when the existing CURB-65 model was used to predict the prognosis of COVID-19 patients. In future research, we may consider adding predictors or recalibrating the model to achieve better prediction results. Furthermore, with the development of vaccine trials worldwide, whether vaccination will affect the prediction models, that is, whether vaccination can become a new predictor, is also a direction that researchers need to focus on.

Limitations

First, the number of studies was relatively small, although these evaluation results may improve as research on COVID-19 prognostic models grows. In particular, there were few incremental value studies, so it may not be appropriate to evaluate them with the quantitative method derived from the TRIPOD statement. Second, because of the limited applicability of TRIPOD, we were unable to evaluate models that were established by artificial intelligence. Third, some hospitals provided data for different studies at the same time, so the extent of overlap among the included studies is unclear. Moreover, most of the articles we included were from China, especially Wuhan, and there was no description of demographic variables such as race, economic status, and educational level that might affect patient outcomes. All of these factors may have potential impacts on our results.

Conclusions

In the present study, the prognostic prediction models for COVID-19 were evaluated according to the TRIPOD statement, and we found the reporting completeness to be poor. The potential for clinical adoption of these models is low because of over-fitting and the lack of calibration and external validation. Overall, future research should focus on the validation and improvement of existing models. The prerequisite for this is high-quality research that follows the TRIPOD reporting guidelines. The article’s supplementary files as
  26 in total

1.  Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study.

Authors:  W S Lim; M M van der Eerden; R Laing; W G Boersma; N Karalus; G I Town; S A Lewis; J T Macfarlane
Journal:  Thorax       Date:  2003-05       Impact factor: 9.139

2.  Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region, Italy.

Authors:  Giacomo Grasselli; Alberto Zangrillo; Alberto Zanella; Massimo Antonelli; Luca Cabrini; Antonio Castelli; Danilo Cereda; Antonio Coluccello; Giuseppe Foti; Roberto Fumagalli; Giorgio Iotti; Nicola Latronico; Luca Lorini; Stefano Merler; Giuseppe Natalini; Alessandra Piatti; Marco Vito Ranieri; Anna Mara Scandroglio; Enrico Storti; Maurizio Cecconi; Antonio Pesenti
Journal:  JAMA       Date:  2020-04-28       Impact factor: 56.272

3.  Risk Factor Analysis and Nomogram Construction for Non-Survivors among Critical Patients with COVID-19.

Authors:  Binchen Wang; Feiyang Zhong; Hanfei Zhang; Wenting An; Meiyan Liao; Yiyuan Cao
Journal:  Jpn J Infect Dis       Date:  2020-06-30       Impact factor: 1.362

4.  Development and Validation of a Clinical Risk Score to Predict the Occurrence of Critical Illness in Hospitalized Patients With COVID-19.

Authors:  Wenhua Liang; Hengrui Liang; Limin Ou; Binfeng Chen; Ailan Chen; Caichen Li; Yimin Li; Weijie Guan; Ling Sang; Jiatao Lu; Yuanda Xu; Guoqiang Chen; Haiyan Guo; Jun Guo; Zisheng Chen; Yi Zhao; Shiyue Li; Nuofu Zhang; Nanshan Zhong; Jianxing He
Journal:  JAMA Intern Med       Date:  2020-08-01       Impact factor: 21.873

5.  Predicting Mortality Due to SARS-CoV-2: A Mechanistic Score Relating Obesity and Diabetes to COVID-19 Outcomes in Mexico.

Authors:  Omar Yaxmehen Bello-Chavolla; Jessica Paola Bahena-López; Neftali Eduardo Antonio-Villa; Arsenio Vargas-Vázquez; Armando González-Díaz; Alejandro Márquez-Salinas; Carlos A Fermín-Martínez; J Jesús Naveja; Carlos A Aguilar-Salinas
Journal:  J Clin Endocrinol Metab       Date:  2020-08-01       Impact factor: 5.958

6.  Automated EHR score to predict COVID-19 outcomes at US Department of Veterans Affairs.

Authors:  Thomas F Osborne; Zachary P Veigulis; David M Arreola; Eliane Röösli; Catherine M Curtin
Journal:  PLoS One       Date:  2020-07-27       Impact factor: 3.240

7.  Poor reporting of multivariable prediction model studies: towards a targeted implementation strategy of the TRIPOD statement.

Authors:  Pauline Heus; Johanna A A G Damen; Romin Pajouheshnia; Rob J P M Scholten; Johannes B Reitsma; Gary S Collins; Douglas G Altman; Karel G M Moons; Lotty Hooft
Journal:  BMC Med       Date:  2018-07-19       Impact factor: 8.775

8.  Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors:  Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

9.  Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study.

Authors:  Nanshan Chen; Min Zhou; Xuan Dong; Jieming Qu; Fengyun Gong; Yang Han; Yang Qiu; Jingli Wang; Ying Liu; Yuan Wei; Jia'an Xia; Ting Yu; Xinxin Zhang; Li Zhang
Journal:  Lancet       Date:  2020-01-30       Impact factor: 79.321

10.  [Potential biomarkers predictors of mortality in COVID-19 patients in the Emergency Department].

Authors:  N F Pascual Gómez; I Monge Lobo; I Granero Cremades; A Figuerola Tejerina; F Ramasco Rueda; A von Wernitz Teleki; F M Arrabal Campos; M A Sanz de Benito
Journal:  Rev Esp Quimioter       Date:  2020-07-13       Impact factor: 1.553

