Literature DB >> 33364802

Reporting and Methods in Developing Prognostic Prediction Models for Metabolic Syndrome: A Systematic Review and Critical Appraisal.

Hui Zhang1, Jing Shao1, Dandan Chen1, Ping Zou2, Nianqi Cui3, Leiwen Tang1, Dan Wang1, Zhihong Ye1.   

Abstract

PURPOSE: A prognostic prediction model for metabolic syndrome can calculate the probability of developing metabolic syndrome within a specific period to support individualized treatment decisions. We aimed to provide a systematic review and critical appraisal of prognostic models for metabolic syndrome.
MATERIALS AND METHODS: Studies were identified through searching in English databases (PubMed, EMBASE, CINAHL, and Web of Science) and Chinese databases (Sinomed, WANFANG, CNKI, and CQVIP). A checklist for critical appraisal and data extraction for systematic reviews of prediction modeling studies (CHARMS) and the prediction model risk of bias assessment tool (PROBAST) were used for the data extraction process and critical appraisal.
RESULTS: From the 29,668 retrieved articles, eleven studies meeting the selection criteria were included in this review. Forty-eight predictors were identified from the prognostic prediction models. The c-statistic ranged from 0.67 to 0.95. Critical appraisal showed that all modeling studies were at high risk of bias in methodological quality, mainly driven by the outcome and statistical analysis domains, and six modeling studies raised high concerns regarding applicability.
CONCLUSION: Future model development and validation studies should adhere to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement to improve methodological quality and applicability, thus increasing the transparency of the reporting of a prediction model study. It is not appropriate to adopt any of the identified models in this study for clinical practice since all models are prone to optimism and overfitting.
© 2020 Zhang et al.

Keywords:  metabolic syndrome; prediction model; prognosis; risk; systematic review

Year:  2020        PMID: 33364802      PMCID: PMC7751606          DOI: 10.2147/DMSO.S283949

Source DB:  PubMed          Journal:  Diabetes Metab Syndr Obes        ISSN: 1178-7007            Impact factor:   3.168


Introduction

Metabolic syndrome (MetS) is defined as a cluster of cardiometabolic risk factors related to type 2 diabetes and cardiovascular disease.1 These risk factors are waist circumference, blood pressure, high-density lipoprotein level, triglyceride level, and hyperglycemia.2 MetS is a growing public health concern. The prevalence of MetS has increased from 25.3% to 34.2% among US adults,3 and reached 33.9% among Chinese adults as of 2010.4 People with MetS are likely to suffer from cardiovascular disease, insulin resistance, and hypertension, leading to increased risk of cardiovascular morbidity and mortality.5 As diseases related to MetS can impose an enormous health and economic burden, it is important to adopt effective and early measures to prevent the onset of morbidities in at-risk individuals. A healthy lifestyle is recommended as a suitable first-line intervention for MetS prevention and management.6 However, the specific target population that can benefit from a healthy lifestyle has yet to be determined using an evidence-informed decision-making approach. A prognostic prediction model provides a multivariable predictor equation to help healthcare providers make decisions and plan for lifestyle changes or therapeutics.7 Such a model can calculate the probability of experiencing a particular health outcome within a specific time period to support individualized treatment decisions.8 In an era of personalized medicine with substantial and accumulating evidence from prognosis studies, prognostic prediction models for MetS have emerged rapidly. However, none of them has been used in clinical practice or routine care. Herein, we conduct a systematic review of prognostic prediction models for MetS to summarize and identify available evidence and knowledge gaps to facilitate their use.
This could help clinicians and nurses determine which prognostic prediction model can be used in clinical practice.9 A systematic review and a review protocol of prediction models for MetS were published previously.10,11 However, they did not include the EMBASE database, which is one of the most important databases in the medical field.12 Moreover, unlike other systematic reviews of prediction model studies,13,14 the previous systematic review did not recommend any candidate predictors for future modeling studies based on its conclusions. A set of candidate predictors can help researchers select predictors to develop prediction models instead of using a purely data-driven approach. Lastly, events per variable (EPV) for sample size, the relationship between predictors and the outcome definition, and appropriate performance indices (eg, Harrell's c-index and calibration plots) are important details for prediction models, but they were missing in the previous systematic review. To fill this gap, in our systematic review, we aimed to expand and update prognostic prediction models for MetS across several important English and Chinese databases, describe the characteristics and performance of the prognostic prediction models, and critically appraise the methods and reporting of the identified studies to indicate what is needed in future modeling studies.

Methods

This systematic review protocol was registered in PROSPERO on 22 July 2020 under registration number CRD42020193282 (some updates were submitted to PROSPERO). To improve the rigor and reproducibility of this systematic review, we used the checklist for critical appraisal and data extraction for systematic reviews of prediction modeling studies (CHARMS) and the prediction model risk of bias assessment tool (PROBAST) to frame the review question, study design, data extraction, and appraisal.15,16 This study adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.

Search Strategy

The search strategy combined concepts related to prognostic prediction modeling studies and metabolic syndrome. A medical librarian helped us choose databases and iteratively develop and refine the search strategy. We developed an English search strategy combining subject indexing terms (ie, MeSH) and free-text terms in the title and abstract fields in PubMed. This search strategy was translated appropriately for EMBASE, CINAHL, and Web of Science (Core Collection). We also developed a Chinese search strategy combining subject indexing terms and free-text terms in the title and abstract fields in Sinomed, which was translated appropriately for WANFANG, CNKI, and CQVIP. The full search strategy is presented in the supplementary material. We systematically searched the electronic databases from inception to July 27, 2020.

Study Selection

Inclusion criteria comprised prognostic multivariable prediction studies (eg, model development studies and validation studies) for metabolic syndrome. Multivariable prediction studies focus on predicting an outcome from at least two predictors.17 Exclusion criteria comprised diagnostic prediction model studies, predictor finding studies, model impact studies, and studies investigating a single predictor, test, or marker (such as single diagnostic test accuracy or single prognostic marker studies). This systematic review was limited to studies conducted in humans and published in English or Chinese. Study timing and setting were not limited. The titles and abstracts of all retrieved articles were independently screened by two reviewers (HZ, DDC) based on the selection criteria. When the title and abstract suggested that a study was eligible, the full text of the article was retrieved for further assessment. If there was any doubt regarding eligibility, or any disagreement, the two reviewers (HZ, DDC) discussed with an advisor (JS) to reach a consensus.

Data Extraction and Critical Appraisal

A standardized electronic form based on the CHARMS checklist was constructed to facilitate the data extraction process.15 Information about the objective, source of data, participants, outcome(s) to be predicted, candidate predictors, sample size, missing data, model development, model performance (eg, discrimination, calibration, and classification measures), results, and interpretation of the presented models was extracted into this standardized form. If important information was missing, we sought clarification from the authors by email. One reviewer (HZ) extracted the data from the included studies, and another reviewer (DDC) checked the extracted data. For any disagreements, an advisor (JS) was consulted to resolve the disagreement and reach a consensus. We adopted PROBAST to assess the risk of bias, which can cause distorted estimation of a prediction model's performance. PROBAST can also evaluate concerns regarding the applicability of a prediction model.16 For risk of bias, the tool covers four key domains: participants, predictors, outcome, and analysis. Each domain can be rated as "high", "low", or "unclear" risk of bias. For applicability, the tool covers three key domains: participants, predictors, and outcome. Each domain can be rated as raising "high", "low", or "unclear" concerns. When risk of bias and applicability are rated "low" in all domains, a prediction model is judged as "low risk of bias" and "low concerns regarding applicability", respectively. When risk of bias or applicability is rated "high" in one or more domains, a prediction model is judged as "high risk of bias" or "high concerns regarding applicability", respectively. When the rating for one or more domains is unclear and the remaining domains are rated "low", a prediction model is judged as "unclear risk of bias" or "unclear concerns regarding applicability", respectively.
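The PROBAST overall-judgment rules above reduce to a simple precedence, sketched here as a hypothetical helper (the function name and signature are ours, not part of any official PROBAST software):

```python
def overall_judgment(domain_ratings):
    """Aggregate per-domain PROBAST ratings ("high"/"low"/"unclear")
    into an overall judgment: any "high" -> "high"; all "low" -> "low";
    otherwise (some "unclear", the rest "low") -> "unclear"."""
    ratings = [r.lower() for r in domain_ratings]
    if "high" in ratings:
        return "high"
    if all(r == "low" for r in ratings):
        return "low"
    return "unclear"
```

The same precedence applies to both the four risk-of-bias domains and the three applicability domains.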
Two reviewers (HZ, DDC) independently assessed the methodological quality (risk of bias) and applicability of included studies. If there were any disagreements, this was resolved by discussion and consultation with an advisor (JS) to reach a consensus.

Results

From the 29,668 retrieved articles using the search strategy, 31 full articles were reviewed. Finally, eleven studies meeting the selection criteria were included in this review (Figure 1).
Figure 1

PRISMA flow diagram.


Study Characteristics

The 11 included studies reported 22 prognostic prediction models, all identified as prediction model development studies with or without external validation. There were no standalone external validation studies of existing prediction models. The duration of follow-up ranged from 2 to 7 years. Data sources and study designs can be found in Table 1. Eight prediction modeling studies used a retrospective cohort design based on health examination data,19–26 and one used a retrospective cohort design based on the Isfahan Cohort Study (Table 1).27 One prediction modeling study used a case–control design based on the French occupational GAZEL cohort.28 Notably, one study adopted a prospective cohort design to develop a risk model, but this model was validated using a cross-sectional design.29 All models predicted a single endpoint, namely MetS.
Table 1

Characteristics of Studies

| Reference | Language | Source of Data | Study Setting | Sample Size | Modeling Method | Number of Models |
|---|---|---|---|---|---|---|
| Gao et al19 | Chinese | Retrospective cohort | Multi-center/hospital health examination center | Male: 1020; female: 545 | BMA-MSP; Cox regression | 4 |
| Yang et al24 | Chinese | Retrospective cohort | Multi-center/hospital health examination center | Training dataset: 7519; validation dataset: 6454 | Logistic regression | 1 |
| Sun et al23 | Chinese | Retrospective cohort | Multi-center/hospital health examination center | Male: 10,040; female: 5832 | Cox regression | 2 |
| Hirose et al20 | English | Retrospective cohort | Single center/hospital health examination center | Training dataset: 246; validation dataset: 164 | Artificial neural network; logistic regression | 2 |
| Hsiao et al21 | English | Retrospective cohort | Single center/hospital health examination center | 352 | Logistic regression | 4 |
| Obokata et al22 | English | Retrospective cohort | Single center/hospital health examination center | Training dataset (initial score): 6817; validation dataset (initial score): 6817; final score: 2743 | Logistic regression | 1 |
| Zou et al26 | English | Retrospective cohort | Single center/hospital health examination center | Training dataset: 2930; validation dataset: 1465 | Logistic regression | 1 |
| Zhang et al45 | English | Retrospective cohort | Single center/hospital health examination center | Male: 1020; female: 545 | Cox regression | 2 |
| Pujos-Guillot et al46 | English | Case–control study | A utility firm (Électricité de France-Gaz de France) | Training dataset: 56 (control), 56 (case); validation dataset: 47 (control), 47 (case) | Logistic regression | 2 |
| Karimi-Alavijeh et al43 | English | Retrospective cohort | Urban and rural areas in Iran | 2107 | Decision tree; support vector machine | 2 |
| Efstathiou et al29 | English | Prospective cohort | A preventive medicine program | Training dataset: 1270; validation dataset: 1091 | Logistic regression | 1 |

Abbreviation: BMA-MSP, Bayesian model averaging method.


Characteristics of the Models

Two studies included only males.20,28 Studies used different diagnostic criteria for metabolic syndrome (Table 2). One study aimed to develop a model that could predict both recovery from MetS and the risk of developing MetS.22 An original score was developed from the participants without MetS, and the predictors of this original score were then reduced to form a final score based on the participants who had recovered from MetS.22
Table 2

Clinical Characteristics of the Study Population

| Reference | Inclusion Criteria | Exclusion Criteria | Diagnostic Criteria Used for Metabolic Syndrome | Number of Events |
|---|---|---|---|---|
| Gao et al19 | Free of MetS at sampling | Not reported | China Diabetes Society | Male: 286; female: 62 |
| Yang et al24 | Free of MetS at sampling; aged 35–74 years | Taking medication for hyperlipidaemia, hypertension, or diabetes mellitus | NCEP-ATP III | Training dataset: 897; validation dataset: 742 |
| Sun et al23 | Free of MetS at sampling; aged 20–80 years | Not reported | China Diabetes Society | Male: 1273; female: 318 |
| Hirose et al20 | Male; free of MetS at sampling; aged 30–59 years | Endocrine disease; significant renal or hepatic disorders; on medication for diabetes mellitus at baseline | Japanese diagnostic criteria | Training dataset: 16; validation dataset: 11 |
| Hsiao et al21 | Free of MetS at sampling; aged 30–60 years | Regularly drinking alcohol or current smokers; taking antidiabetic, antihypertensive, or lipid-lowering agents | NCEP-ATP III | 30 |
| Obokata et al22 | Free of MetS at sampling; individuals ≥20 years | No detailed information on medication use; no data on serum creatinine levels; missing values for components of MetS | Integrated criteria based on several criteria | Training dataset (initial score): 878a; validation dataset (initial score): 757a; final score: 906b |
| Zou et al26 | Free of MetS at sampling | Missing values for components of MetS and important examination details; no information on medication use; lost to follow-up | China Diabetes Society | Not reported |
| Zhang et al45 | Free of MetS at sampling; aged 18–82 years | Not reported | China Diabetes Society | Male: 286; female: 62 |
| Pujos-Guillot et al46 | Male; free of MetS at sampling; 25 ≤ BMI < 30 kg/m2; 52 ≤ age < 64 years | Not reported | NCEP-ATP III | Training dataset: 56; validation dataset: 47 |
| Karimi-Alavijeh et al43 | Free of MetS and heart disease at sampling | Not reported | NCEP-ATP III | 596 |
| Efstathiou et al29 | No children/adolescents belonging to other ethnic/racial groups | Children with known major cardiovascular, endocrinal, nutritional, or renal problems; secondary obesity; taking drugs that influence metabolic profile | International Diabetes Federation consensus | Training dataset: 105; validation dataset: 86 |

Notes: NCEP-ATP III, the National Cholesterol Education Program Expert Panel and Adult Treatment Panel III; MetS, metabolic syndrome; BMI, body mass index. aThe number of individuals with MetS. bThe number of individuals who recovered from MetS.

Ten studies aimed to develop prediction models for adults,19–28 and only one study aimed to predict risk probability for adolescents.29 The included studies used modeling methods encompassing logistic regression, Cox regression, and machine learning techniques (eg, decision trees and support vector machines). A total of 48 predictors were selected in the final models (Figure 2), and 10 of these predictors are components of the diagnostic criteria for MetS.
Figure 2

Predictors included in 22 models for metabolic syndrome.


Analysis Methods

Two studies reported that participants without data on the diagnostic criteria of MetS or important examination results were excluded.22,26 One study reported that cases with missing values were excluded by applying listwise deletion.29 The remaining studies did not report whether data were missing or how missing data were handled. Continuous predictors were dichotomized or categorized in three studies.22,26,29 The number of participants with outcome events and the number of candidate predictor parameters varied across the studies. Events per variable can be found in Figure 3; statistical power was not sufficient in six studies because their EPVs were below 10.19–21,25,28,29 Univariate analysis was used to select candidate predictors in some studies,22,24,26,28 and one study used principal component analysis to select candidate predictors.25 One study did not report the number of outcome events;26 we tried to contact the corresponding author by email for more information about the number of events, but received no response.
Figure 3

Events per variable for prediction modeling studies.

Regarding model evaluation, two studies reported only apparent performance, meaning the predictive performance was not corrected for optimism (Table 3).19,21 Three studies performed external validation,24,28,29 although two of them did not perform internal validation.24,29 Three performed internal validation using cross-validation,23,25,27 and the others used random split-sample internal validation.20,22,26
Table 3

The Presentation Format and Performance of Models

| Reference | Presentation Format | Model Evaluation | Calibration | Discrimination (c-Statistic [95% CI])a |
|---|---|---|---|---|
| Gao et al19 | Not reported | Apparent performance | Not reported | Female: BMA-MSP model 0.87 (0.80–0.95), Cox model 0.83 (0.75–0.92); male: BMA-MSP model 0.82 (0.79–0.86), Cox model 0.81 (0.78–0.85) |
| Yang et al24 | Regression formula | External validationb only | Not reported | Training dataset: 0.83 (0.81–0.84); validation datasets: 1) 0.81 (0.79–0.84), 2) 0.83 (0.80–0.85), 3) 0.80 (0.77–0.82) |
| Sun et al23 | Regression formula | Internal validation (cross-validation) | Not reported | Female: 0.75 (0.73–0.76); male: 0.75 (0.74–0.76); internal validation: 0.75 (female), 0.75 (male) |
| Hirose et al20 | Regression formula | Random split-sample internal validationc | Not reported | Not reported (only sensitivity and specificity) |
| Hsiao et al21 | Regression formula | Apparent performance | Hosmer–Lemeshow test | Model 1: 0.77 (0.69–0.84); model 2: 0.78 (0.70–0.85); model 3: 0.80 (0.73–0.87); model 4: 0.81 (0.74–0.88) |
| Obokata et al22 | Regression coefficients without baseline components | Random split-sample internal validation | Calibration plot | Training dataset: 0.82 (initial score), 0.79 (0.78–0.80) (final score); validation dataset: 0.83 (initial score), 0.81 (0.80–0.83) (final score) |
| Zou et al26 | A MetS risk score | Random split-sample internal validation | Not reported | Training dataset: 0.67; validation dataset: 0.69 |
| Zhang et al45 | Not reported | Internal validation (cross-validation) | Not reported | Training dataset: male 0.80 (0.78–0.83), female 0.90 (0.87–0.93); validation dataset: male 0.80 (0.77–0.82), female 0.90 (0.87–0.92) |
| Pujos-Guillot et al46 | Not reported | Internal and external validation | Not reported | Training dataset (internal validation): model 1 0.86 (0.83–0.95), model 2 0.85 (0.78–0.92); validation dataset: not reported |
| Karimi-Alavijeh et al43 | Not reported | Internal validation (cross-validation) | Not reported | Not reported (only sensitivity and specificity) |
| Efstathiou et al29 | Regression coefficients without baseline components | External validation without internal validation | Hosmer–Lemeshow test | Not reported (only sensitivity and specificity) |

Notes: aThe concordance statistic is equal to the area under the receiver operating characteristic curve for models predicting binary outcomes; bexternal validation in three datasets; ccross-validation only for the validation dataset. Abbreviation: BMA-MSP, Bayesian model averaging method.

Regarding model performance, three studies reported calibration; however, only one used a calibration plot,22 while two used the Hosmer–Lemeshow test (Table 3).21,29 Three studies reported classification measures (eg, sensitivity and specificity) instead of the c-statistic and did not predefine a probability threshold.20,27,29 The c-statistic, also known as the area under the receiver operating characteristic curve, ranged from 0.67 to 0.95 across all studies. The full regression equations were presented in only four studies.20,21,23,24 One study presented a simplified scoring system,26 and two studies provided regression coefficients without baseline components.22,29 The others did not present their prediction models.19,25,27,28
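For binary outcomes, the c-statistic is simply the proportion of event/non-event pairs in which the event received the higher predicted risk (ties counted as half). A minimal pairwise sketch, for illustration only; real analyses would use a validated statistics package:

```python
def c_statistic(y_true, y_score):
    """Concordance statistic for a binary outcome: the probability that
    a randomly chosen event has a higher predicted risk than a randomly
    chosen non-event (ties count as 0.5). Equal to the ROC AUC."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    nonevents = [s for y, s in zip(y_true, y_score) if y == 0]
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination, which is how the 0.67 to 0.95 range above should be read.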

Risk of Bias and Applicability of Model Studies

PROBAST was used to assess the risk of bias and applicability of the model studies. As shown in Figure 4, most model studies had a high risk of bias, driven by the analysis and outcome domains. Six studies raised high concerns regarding applicability due to inappropriate exclusion criteria, uncommon and unavailable predictors, and selected predictors being part of the outcome definition.20–22,24,28,29
Figure 4

The risk of bias and the applicability of the model studies.


Discussion

This systematic review identified and critically appraised 11 prediction modeling studies for MetS. Machine learning methods and regression methods were adopted to develop the prognostic prediction models. Compared to a similar systematic review,10 we added six prognostic prediction models by searching both English and Chinese databases.19,20,23,24,27,28 Three of them came from Japan, France, and Iran,20,27,28 and the remaining three came from China. Additionally, some important details were expanded and updated.

Model Development

One study adopted a nested case–control design from a pre-existing cohort, but it did not appropriately account for the original cohort's outcome event frequency in the analysis, which can result in a high risk of bias for the prediction model.28 In such cases, it is recommended to reweight the control and case samples by the inverse of the sampling fraction from the original cohort, as this corrects the estimation of baseline risk and yields corrected absolute predicted probabilities and model calibration measures.30 Some studies recruited only male participants or excluded participants who regularly drank alcohol or were current smokers.20,21,30 This means that the enrolled participants cannot be representative of the model's target population, resulting in a high risk of bias in the participants domain and raising concerns about applicability. One study adopted a prospective cohort design to predict MetS in adolescence from the Natal and Parental Profile, but this prognostic model was validated in a cross-sectional study that recruited only adolescents.29 The external validation may be inappropriate owing to differences in study design and population.
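The inverse-sampling-fraction reweighting recommended above is simple arithmetic; a hypothetical sketch of the control weight (in a nested case–control study the cases are typically all sampled and so carry weight 1):

```python
def control_weight(n_cohort_controls, n_sampled_controls):
    """Weight assigned to each sampled control: the inverse of the
    sampling fraction, so the sampled controls stand in for the full
    cohort when estimating baseline risk and calibration."""
    sampling_fraction = n_sampled_controls / n_cohort_controls
    return 1.0 / sampling_fraction
```

For example, if 100 controls are sampled from 10,000 eligible non-cases in the cohort, each sampled control would carry a weight of 100 in the weighted analysis.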
Researchers should choose a representative population and a correct study design to improve the performance of the prediction model.13 Regarding sample size, an EPV above 20 is recommended as the minimum for model development.31 Moreover, for modeling studies using machine learning techniques, a substantially higher EPV (often >200) is required to avoid overfitting.32 Six studies had EPVs below 10 (Figure 3), so their sample sizes were not appropriate;19–21,25,28,29 one of these developed two prediction models based on a female cohort and a male cohort, and the EPV of the female cohort in particular was inadequate.19 An inadequate EPV can cause a high risk of overfitting and biased predictions.33 This means that even if researchers report a c-statistic close to 1, the performance of these models will probably be worse, owing to optimism, when they are tested in a new dataset. Future studies are encouraged to determine an appropriate sample size, since different prediction modeling studies and modeling techniques require different EPVs; for example, the EPV should be above 100 for model validation studies.34 Three studies used Cox proportional hazards models to develop a prognostic model.19,23,25 However, these modeling studies did not describe censored data, which are important for time-to-event analyses. Additionally, Harrell's c-index or the D statistic is more appropriate than the c-statistic for evaluating survival model performance. Therefore, these models may suffer from bias in the analysis domain, distorting estimates of predictive performance. Future studies should use appropriate statistical methods based on their study design (eg, cross-sectional or cohort study) and the type of outcome data (binary or time-to-event) when performing statistical analysis.
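The EPV arithmetic behind these sample-size judgments can be sketched in a few lines, using the thresholds cited above (the helper names are ours):

```python
def events_per_variable(n_events, n_candidate_parameters):
    """EPV = number of outcome events / number of candidate predictor parameters."""
    return n_events / n_candidate_parameters

def epv_adequate(n_events, n_candidate_parameters, minimum=20):
    """Check EPV against a threshold: >=20 is the cited minimum for
    regression-based model development; machine learning techniques may
    need >200, and model validation studies >100."""
    return events_per_variable(n_events, n_candidate_parameters) >= minimum
```

For instance, 200 events against 10 candidate predictor parameters gives an EPV of 20, which meets the development threshold but falls well short of the machine learning threshold.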
Missing data are recognized as a common and increasingly important problem in medical research.35 However, simply excluding participants with missing data from the analysis, known as complete case (CC) analysis, can cause biased predictor–outcome associations and biased model performance.16 Some studies adopted CC analysis,22,26,29 which discards valuable information from incomplete records. The remaining studies identified by this review did not report information about missing data. In such cases, participants with missing data are likely to have been deleted from the statistical analysis, because statistical packages automatically exclude individuals with any missing value.16 Multiple imputation can be a solution to missing data. Its main advantage is that it yields correct standard errors and P values, so it is regarded as the most appropriate method to handle missing data.36 One study dichotomized continuous predictors.22 Although dichotomization of continuous predictors can improve clinical interpretation and maintain simplicity, it is a suboptimal choice because of loss of information, lower predictive ability, and higher optimism.37 One study converted continuous predictors into categorical variables.26 However, it is recommended that predictors be kept continuous and that potential nonlinear associations between predictors and outcomes be examined (eg, using restricted cubic splines or fractional polynomials). If researchers do use categories in their studies, they should categorize continuous predictors into four or more groups based on widely accepted cut points.38 Researchers' datasets usually have many features that could be selected as candidate predictors. The process of selecting predictors can be divided into two stages. First, researchers need to select candidate predictors for inclusion before the multivariable analysis is performed.
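With the multiple imputation mentioned above, estimates from the m imputed datasets are combined using Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and the total variance adds the within-imputation variance to the between-imputation variance inflated by (1 + 1/m). A minimal pooling sketch (the function name is ours):

```python
from statistics import mean, variance

def pool_rubins_rules(estimates, variances):
    """Pool results from m multiply imputed datasets (Rubin's rules).
    Returns (pooled point estimate, total variance)."""
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation (sample) variance
    return q_bar, w + (1 + 1 / m) * b
```

The imputation step itself would be done with an established implementation (eg, MICE); this sketch covers only the pooling arithmetic that produces the corrected standard errors.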
To reduce the number of predictors, some studies used univariate analysis to produce a simpler model.22,24,26,28 This strategy is not recommended, because nuances in the dataset or confounding by other predictors may lead to the exclusion of important predictors, causing predictor selection bias and increased overfitting.39 Future studies are encouraged to adopt more appropriate options, such as selection based on clinical reasoning and a literature review.39 Secondly, predictors are selected during multivariable modeling. Common methods include stepwise selection techniques (eg, forward selection), and these techniques were used in five studies.19,22,23,26,29 However, stepwise selection techniques are more likely to increase the risk of overfitting, especially in small samples.40 Penalized regression approaches could be adopted instead, because ridge regression shrinks each predictor effect differently and lasso regression can exclude some predictors entirely.41 This review identified 48 predictors from the included prediction models (Figure 2). However, 10 of these predictors are included in the outcome definition. Ideally, outcomes should be determined without the need for predictor information; otherwise, the association between predictors and outcomes is prone to be overestimated and the model performance is likely to be optimistic.17 After excluding the predictors mentioned above, a set of candidate predictors may be considered for future studies in adults if they were included in at least two models:13 serum HMW-adiponectin, total adiponectin, HOMA-IR, serum insulin, free fatty acids, weight, glycated albumin, hip circumference, MCV, MCH, physical activity, AST, ALT, BMI, NGC, TC, serum uric acid, LDL-cholesterol, gender, smoking, WBC, LC, Hb, HCT, and age. These predictors are available in clinical practice.
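To see why penalized regression counteracts overfitting, consider ridge regression in the simplest one-predictor, no-intercept case, where the penalized least-squares solution has a closed form: beta = Σxy / (Σx² + λ). A toy sketch under those assumptions (real analyses would use an established implementation such as glmnet or scikit-learn):

```python
def ridge_slope_1d(xs, ys, lam):
    """Ridge estimate for a single centered predictor without intercept:
    beta = sum(x*y) / (sum(x^2) + lambda). lam = 0 recovers ordinary
    least squares; larger lam shrinks the coefficient toward zero."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)
```

Increasing lam shrinks the slope toward zero, trading a little bias for lower variance; lasso behaves similarly but can set coefficients exactly to zero, removing predictors from the model.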
Because only one included study involved adolescents,29 we cannot recommend any predictors for prognostic prediction models in adolescents.

Model Evaluation

After developing or validating a prediction model, testing model performance is a vital step. Different measures of model performance may be used, such as calibration, discrimination, and (re)classification. It is recommended that both calibration and discrimination be reported in all prediction model papers.42 However, only three studies reported both calibration and discrimination.21,22,29 For calibration, calibration plots are more appropriate than a statistical test of calibration (eg, the Hosmer–Lemeshow test), because the direction and magnitude of miscalibration cannot be indicated by the Hosmer–Lemeshow test. For discrimination, Karimi-Alavijeh et al43 and Efstathiou et al29 reported sensitivity and specificity without a predefined probability threshold. A predefined probability threshold is required when researchers adopt (re)classification measures, to avoid optimism and bias. For modeling studies, both internal and external validation are important. Two studies evaluated only apparent performance, without internal validation.19,21 To assess reproducibility, internal validation is needed in model development studies. Three studies randomly split a dataset into a training group and a validation group.20,22,26 However, this approach is suboptimal, especially in small samples, since it merely creates two smaller but similar datasets by chance and does not use all available data to develop the prediction model.8 Bootstrapping and cross-validation techniques are recommended for internal validation to correct the optimism of prediction models.44 To ensure the transportability of a prediction model, external validation is needed. Three studies used external validation,24,28,29 but two of them adopted only external validation and omitted internal validation.24,29 Internal validation remains vital when developing models, as models with only external validation may still suffer from overfitting and optimism.17
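The bootstrap internal validation recommended above estimates optimism by refitting the model on bootstrap resamples and comparing each refit's performance on its resample with its performance on the original data. A schematic sketch (the function names, and the idea of passing fitting and scoring callables, are ours):

```python
import random

def optimism_corrected(score_fn, fit_fn, data, n_boot=200, seed=0):
    """Harrell-style bootstrap optimism correction.
    score_fn(model, data) -> performance (eg, a c-statistic);
    fit_fn(data) -> fitted model.
    corrected = apparent performance - mean bootstrap optimism."""
    rng = random.Random(seed)
    apparent = score_fn(fit_fn(data), data)
    optimism = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]  # resample with replacement
        boot_model = fit_fn(boot)
        # optimism = (performance on the resample) - (performance on original data)
        optimism += score_fn(boot_model, boot) - score_fn(boot_model, data)
    return apparent - optimism / n_boot
```

Unlike a random split-sample, this procedure uses all available data for model development and still yields an optimism-corrected performance estimate.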

Risk of Bias and the Applicability of Models

Eleven prognostic modeling studies in MetS were identified, and all of these models are at high risk of bias, mainly driven by the analysis and outcome domains. The concerns regarding the applicability of several models stem from their inclusion/exclusion criteria (eg, male-only samples and BMI restrictions) and the availability of their predictors. Consequently, it is not appropriate to adopt any of them in clinical practice, since each model's performance is prone to optimism and overfitting.

Strengths and Limitations

This systematic review of prognostic prediction models in MetS adopted rigorous methods and a sensitive search strategy across several leading Chinese and English databases of biomedical literature. Compared to a previous systematic review,10 we expanded and updated many important details, such as EPV for sample size, the relationship between predictors and outcome definitions, and appropriate performance indices (eg, Harrell's c-index and calibration plots). Additionally, this review added six prognostic prediction models by searching different databases.19,20,23,24,27,28 Another strength is that a set of candidate predictors is recommended for future modeling studies in this review. These candidate predictors are serum HMW-adiponectin, total adiponectin, HOMA-IR, serum insulin, free fatty acids, weight, glycated albumin, hip circumference, MCV, MCH, physical activity, AST, ALT, BMI, NGC, TC, serum uric acid, LDL-cholesterol, gender, smoking, WBC, LC, Hb, HCT, and age. This systematic review did not search gray literature, so unpublished models were not included. Furthermore, we did not receive a response from the corresponding author of one study concerning its missing event number.26 Lastly, a quantitative analysis of the results was not performed because of heterogeneity in predictors and model types and poor data reporting.
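The EPV detail mentioned above is simple to compute: events per variable is the number of outcome events divided by the number of candidate predictor parameters, with roughly 10 (or, more conservatively, 20) as a common minimum. A minimal sketch with hypothetical numbers, not drawn from any included study:

```python
def events_per_variable(n_events: int, n_candidate_params: int) -> float:
    """EPV: outcome events per candidate predictor parameter.
    Values below ~10 are a common red flag for overfitting."""
    return n_events / n_candidate_params

# hypothetical cohort: 240 incident MetS cases, 24 candidate parameters
print(events_per_variable(240, 24))  # 10.0 — at the conventional minimum
```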

Conclusion

This systematic review maps the studies on multivariable prognostic models in MetS, summarizing and appraising study characteristics, methodological characteristics, model performance, risk of bias, and the applicability of the models. Future model development and validation studies are encouraged to adhere to the TRIPOD reporting guideline to improve their statistical methods and increase the transparency of prediction model studies.
References (38 in total)

1.  Risk prediction measures for case-cohort and nested case-control designs: an application to cardiovascular disease.

Authors:  Andrea Ganna; Marie Reilly; Ulf de Faire; Nancy Pedersen; Patrik Magnusson; Erik Ingelsson
Journal:  Am J Epidemiol       Date:  2012-03-06       Impact factor: 4.897

2.  Substantial effective sample sizes were required for external validation studies of predictive logistic regression models.

Authors:  Yvonne Vergouwe; Ewout W Steyerberg; Marinus J C Eijkemans; J Dik F Habbema
Journal:  J Clin Epidemiol       Date:  2005-05       Impact factor: 6.437

3.  Prognosis and prognostic research: Developing a prognostic model.

Authors:  Patrick Royston; Karel G M Moons; Douglas G Altman; Yvonne Vergouwe
Journal:  BMJ       Date:  2009-03-31

4.  Metabolic syndrome in adolescence: can it be predicted from natal and parental profile? The Prediction of Metabolic Syndrome in Adolescence (PREMA) study.

Authors:  Stamatis P Efstathiou; Irini I Skeva; Evi Zorbala; Evangelos Georgiou; Theodore D Mountokalakis
Journal:  Circulation       Date:  2012-01-12       Impact factor: 29.690

5.  Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist.

Authors:  Karel G M Moons; Joris A H de Groot; Walter Bouwmeester; Yvonne Vergouwe; Susan Mallett; Douglas G Altman; Johannes B Reitsma; Gary S Collins
Journal:  PLoS Med       Date:  2014-10-14       Impact factor: 11.069

6.  The increasing need for systematic reviews of prognosis studies: strategies to facilitate review production and improve quality of primary research.

Authors:  Johanna A A G Damen; Lotty Hooft
Journal:  Diagn Progn Res       Date:  2019-01-23

7.  Risk models and scores for metabolic syndrome: systematic review protocol.

Authors:  Musa Saulawa Ibrahim; Dong Pang; Gurch Randhawa; Yannis Pappas
Journal:  BMJ Open       Date:  2019-09-27       Impact factor: 2.692

8.  Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement.

Authors:  David Moher; Alessandro Liberati; Jennifer Tetzlaff; Douglas G Altman
Journal:  PLoS Med       Date:  2009-07-21       Impact factor: 11.069

9.  Predicting metabolic syndrome using decision tree and support vector machine methods.

Authors:  Farzaneh Karimi-Alavijeh; Saeed Jalili; Masoumeh Sadeghi
Journal:  ARYA Atheroscler       Date:  2016-05

10.  How to develop a more accurate risk prediction model when there are few events.

Authors:  Menelaos Pavlou; Gareth Ambler; Shaun R Seaman; Oliver Guttmann; Perry Elliott; Michael King; Rumana Z Omar
Journal:  BMJ       Date:  2015-08-11
