Martina Vettoretti, Enrico Longato, Alessandro Zandonà, Yan Li, José Antonio Pagán, David Siscovick, Mercedes R Carnethon, Alain G Bertoni, Andrea Facchinetti, Barbara Di Camillo.
Abstract
INTRODUCTION: Many predictive models for incident type 2 diabetes (T2D) exist, but they are rarely used for public health management. Barriers to their application include (1) the problem of model choice (some models are applicable only to certain ethnic groups), (2) missing input variables, and (3) the lack of calibration. While (1) and (2) lead to missing predictions, (3) causes inaccurate incidence predictions. In this paper, a combined T2D risk model for public health management that addresses these three issues is developed.

RESEARCH DESIGN AND METHODS: The combined T2D risk model combines eight existing predictive models by weighted average to overcome the problem of missing incidence predictions. Moreover, the combined model implements a simple recalibration strategy in which the risk scores are rescaled based on the T2D incidence in the target population. The performance of the combined model was compared with that of the eight existing models using two test datasets extracted from the Multi-Ethnic Study of Atherosclerosis (MESA; n=1031) and the English Longitudinal Study of Ageing (ELSA; n=4820). Metrics of discrimination, calibration, and missing incidence predictions were used for the assessment.
Keywords: modeling; prevention; risk factor modeling; type 2 diabetes
Year: 2020 PMID: 32747386 PMCID: PMC7398107 DOI: 10.1136/bmjdrc-2020-001223
Source DB: PubMed Journal: BMJ Open Diabetes Res Care ISSN: 2052-4897
Figure 1 (A) Illustrates the steps for calculating the combined model risk score, considering the data of an imaginary individual, that is, John, an African-American, 55-year-old citizen of New York with missing values on cholesterol level and heart rate (this is not a real subject, but an invented case used as an illustrative example). (B) Schematizes the analyses performed on the MESA and ELSA datasets for the assessment of the combined model and the eight existing models (original version, rescaled models, and fully recalibrated models). ARIC 1, Atherosclerosis Risk in Communities simple model; ARIC 2, ARIC clinical model without lipids; ARIC 3, ARIC clinical model with lipids; DPoRT, Diabetes Population Risk Tool; ELSA, English Longitudinal Study of Ageing; FINDRISC, Finnish Diabetes Risk Score; FRAMINGHAM, Framingham model; MESA, Multi-Ethnic Study of Atherosclerosis; T2D, type 2 diabetes.
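The two steps that the combined model relies on, averaging over whichever models can return a prediction and rescaling to the incidence of the target population, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the example risks and weights, and the multiplicative rescaling rule are all assumptions, because the actual weights and the exact rescaling formula are not given here.

```python
from typing import Dict, List, Optional


def combined_risk(predictions: Dict[str, Optional[float]],
                  weights: Dict[str, float]) -> Optional[float]:
    """Weighted average over the models that returned a prediction.

    A model that cannot be applied (wrong population, missing inputs)
    is represented by None and is simply left out of the average.
    """
    available = {m: p for m, p in predictions.items() if p is not None}
    total_weight = sum(weights[m] for m in available)
    if not available or total_weight == 0:
        return None  # no model is applicable: the prediction stays missing
    return sum(weights[m] * p for m, p in available.items()) / total_weight


def rescale(risks: List[float], target_incidence: float) -> List[float]:
    """ASSUMED rescaling rule: multiply every risk by the ratio between
    the target incidence and the mean predicted risk, capped at 1."""
    mean_risk = sum(risks) / len(risks)
    if mean_risk == 0:
        raise ValueError("mean predicted risk is zero; cannot rescale")
    factor = target_incidence / mean_risk
    return [min(1.0, r * factor) for r in risks]


# Hypothetical subject: one model is missing because an input is missing.
john = {"FINDRISC": 0.10, "ARIC 1": 0.20, "FRAMINGHAM": None}
hypothetical_weights = {"FINDRISC": 1.0, "ARIC 1": 1.0, "FRAMINGHAM": 2.0}
john_risk = combined_risk(john, hypothetical_weights)  # average of 0.10 and 0.20
```

With this design a subject receives a missing prediction only when none of the component models is applicable, which is how a combination can reduce the share of missing predictions relative to any single model.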
Figure 2 ROC curve at 8 years on the MESA test set for the original literature models (blue) and the models after full recalibration is performed on the MESA training set (red). In a setting in which the subjects with risk scores above a certain threshold T are predicted to develop T2D within a certain time t from the baseline, the ROC curve represents the plot of the true positive rate (sensitivity) versus the false positive rate (1-specificity) for different values of the threshold T. The greater the AU-ROC, the more accurately the score discriminates between subjects at high versus low risk. ARIC 1, Atherosclerosis Risk in Communities simple model; ARIC 2, ARIC clinical model without lipids; ARIC 3, ARIC clinical model with lipids; AU-ROC, area under the receiver operating characteristic curve; DPoRT, Diabetes Population Risk Tool; FINDRISC, Finnish Diabetes Risk Score; FRAMINGHAM, Framingham model; MESA, Multi-Ethnic Study of Atherosclerosis; recal, recalibration; T2D, type 2 diabetes.
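The construction of a ROC curve from risk scores can be sketched in a few lines. This is a simplified illustration under stated assumptions: every subject is assumed to have a known binary outcome at time t (censoring is ignored, which a time-dependent analysis would have to handle), the function names are hypothetical, and the toy scores are invented.

```python
from typing import List, Tuple


def roc_points(scores: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    events[i] is 1 if subject i developed the outcome within time t, else 0.
    Each distinct score is used as threshold T; subjects with score >= T
    are predicted positive, so the lowest threshold yields the point (1, 1).
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, e in zip(scores, events) if s >= threshold and e == 1)
        fp = sum(1 for s, e in zip(scores, events) if s >= threshold and e == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def au_roc(points: List[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Invented toy data: four subjects, two of whom develop the outcome.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_events = [1, 0, 1, 0]
toy_auc = au_roc(roc_points(toy_scores, toy_events))
```

In the toy data, three of the four (case, non-case) pairs are ranked correctly, which matches the area of 0.75 returned by the trapezoidal rule.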
Figure 3 Calibration plot at 8 years on the MESA test set for the original literature models (blue), the rescaled models (green), and the fully recalibrated models (red). The calibration plot represents the number of observed events versus the number of expected events, at time t, for increasing deciles of predicted event probability. The closer the calibration plot is to the line with zero intercept and 45° slope, the better the model is calibrated. ARIC 1, Atherosclerosis Risk in Communities simple model; ARIC 2, ARIC clinical model without lipids; ARIC 3, ARIC clinical model with lipids; DPoRT, Diabetes Population Risk Tool; E/O, expected to observed event ratio; FINDRISC, Finnish Diabetes Risk Score; FRAMINGHAM, Framingham model; MESA, Multi-Ethnic Study of Atherosclerosis; recal, recalibration.
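The quantities behind a calibration plot and the E/O ratio can be sketched as below. The sketch assumes that each subject has a predicted event probability and a fully observed binary outcome at time t; handling of censored follow-up is deliberately left out, and the function names and toy data are hypothetical.

```python
from typing import List, Tuple


def expected_observed_ratio(probs: List[float], events: List[int]) -> float:
    """E/O: sum of predicted probabilities over the number of observed events."""
    return sum(probs) / sum(events)


def calibration_groups(probs: List[float], events: List[int],
                       n_groups: int = 10) -> List[Tuple[float, int]]:
    """Split subjects into groups of increasing predicted probability
    (deciles when n_groups is 10) and return one (expected, observed)
    pair per group, i.e. the points of the calibration plot."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    n = len(order)
    groups = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue  # fewer subjects than groups
        expected = sum(probs[i] for i in idx)
        observed = sum(events[i] for i in idx)
        groups.append((expected, observed))
    return groups


# Invented toy cohort: 10 low-risk and 10 high-risk subjects.
toy_probs = [0.1] * 10 + [0.5] * 10
toy_events = [1] + [0] * 9 + [1] * 5 + [0] * 5
toy_eo = expected_observed_ratio(toy_probs, toy_events)
toy_groups = calibration_groups(toy_probs, toy_events, n_groups=2)
```

In this toy cohort the model expects about 1 event in the low-risk group and 5 in the high-risk group, and exactly those numbers are observed, so the points lie on the 45° line and E/O is approximately 1.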
Performance of the literature models (original and rescaled) and the combined T2D model on the MESA and ELSA test sets
| Test set | Scenario | Model | C-index* | E/O† original model | E/O† rescaled model | Missing predictions‡ |
|---|---|---|---|---|---|---|
| MESA | Sc1 | DPoRT men | 0.70 (0.64–0.77) | 6.65 (4.86–9.10) | 1.27 (0.92–1.73) | 1% |
| | | DPoRT women | 0.70 (0.63–0.76) | 3.47 (2.61–4.60) | 1.11 (0.84–1.48) | |
| | | FINDRISC | 0.72 (0.67–0.76) | 0.52 (0.42–0.64) | 0.89 (0.72–1.09) | 0% |
| | Sc2 | ARIC 1 | 0.73 (0.68–0.78) | 2.13 (1.58–2.88) | 1.09 (0.81–1.47) | 45% |
| | | KAHN§ | 0.75 (0.69–0.81) | 3.91 (2.89–5.29) | 1.11 (0.82–1.51) | 46% |
| | Sc3 | STERN | 0.81 (0.75–0.86) | 1.75 (1.33–2.30) | 1.10 (0.83–1.44) | 42% |
| | | ARIC 2 | 0.82 (0.76–0.87) | 1.02 (0.76–1.38) | 0.99 (0.74–1.34) | 45% |
| | | ARIC 3 | 0.83 (0.77–0.88) | 1.05 (0.78–1.41) | 1.00 (0.74–1.35) | 45% |
| | | FRAMINGHAM | 0.83 (0.79–0.87) | 0.34 (0.27–0.43) | 0.82 (0.65–1.04) | 17% |
| | | Combined model | 0.83 (0.79–0.87) | | 1.00 (0.81–1.24) | 0% |
| ELSA | Sc1 | DPoRT men | 0.72 (0.67–0.76) | 9.29 (7.60–11.36) | 1.40 (1.14–1.71) | 25% |
| | | DPoRT women | 0.71 (0.67–0.75) | 3.63 (3.02–4.37) | 1.23 (1.02–1.47) | |
| | | FINDRISC | 0.73 (0.70–0.77) | 0.65 (0.57–0.74) | 1.14 (0.99–1.30) | 19% |
| | Sc2 | ARIC 1 | 0.74 (0.72–0.77) | 3.00 (2.61–3.46) | 1.40 (1.22–1.62) | 34% |
| | | KAHN§ | 0.76 (0.73–0.79) | 5.30 (4.57–6.15) | 1.41 (1.22–1.64) | 38% |
| | Sc3 | STERN | 0.79 (0.74–0.83) | 2.46 (1.99–3.04) | 1.81 (1.46–2.23) | 64% |
| | | ARIC 2 | 0.77 (0.73–0.82) | 1.67 (1.35–2.05) | 1.58 (1.29–1.95) | 63% |
| | | ARIC 3 | 0.80 (0.76–0.84) | 1.60 (1.30–1.97) | 1.56 (1.27–1.92) | 64% |
| | | FRAMINGHAM | 0.82 (0.79–0.86) | 0.39 (0.32–0.44) | 1.22 (1.00–1.49) | 64% |
| | | Combined model | 0.77 (0.74–0.79) | | 1.17 (1.04–1.31) | 4% |
*C-index varies between 0 and 1, with 0.5 corresponding to a random assignment of the scores and 1 representing the perfect score.
†Values of E/O close to 1 indicate that the model has good calibration, whereas values significantly higher/lower than 1 indicate that the model tends to overestimate/underestimate the event probability.
‡Percentage of missing predictions, that is, percentage of subjects for whom the model cannot return a valid risk score.
§Note that in Ref. 25, only the risk scoring system derived from the Weibull model was reported, whereas the parameters of the original Weibull model were not published. Therefore, to obtain the probability scores for this model, we divided the KAHN risk scores (range 0–100) by 100.
ARIC 1, Atherosclerosis Risk in Communities simple model; ARIC 2, ARIC clinical model without lipids; ARIC 3, ARIC clinical model with lipids; C-index, concordance index; DPoRT, Diabetes Population Risk Tool; ELSA, English Longitudinal Study of Ageing; E/O, expected to observed event ratio; FINDRISC, Finnish Diabetes Risk Score; FRAMINGHAM, Framingham model; MESA, Multi-Ethnic Study of Atherosclerosis; Sc, scenario; T2D, type 2 diabetes.
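The C-index reported in the table can be illustrated with a short sketch of Harrell's concordance index for right-censored follow-up data. Which estimator was actually used is not stated here, so the choice of Harrell's version is an assumption, as are the function name and the invented toy data.

```python
from typing import List


def c_index(times: List[float], events: List[int], scores: List[float]) -> float:
    """Harrell's concordance index (assumed estimator).

    times[i]  : follow-up time of subject i
    events[i] : 1 if the event was observed at times[i], 0 if censored
    scores[i] : predicted risk, higher meaning earlier expected event

    A pair (i, j) is comparable when subject i has an observed event
    strictly before the follow-up time of subject j. The pair is
    concordant when i also has the higher score; ties count one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data: the third subject is censored at time 6.
toy_times = [2, 4, 6, 8]
toy_observed = [1, 1, 0, 1]
toy_risk = [0.9, 0.3, 0.5, 0.1]
toy_c = c_index(toy_times, toy_observed, toy_risk)
```

Here five pairs are comparable and four of them are ranked correctly, giving a C-index of 0.8; a score assigned at random would be expected to give about 0.5.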
Discrimination performance of the literature models and the combined T2D model on the subset of MESA and ELSA test sets without missing predictions (ie, all the models can be applied without missing values in the input variables)
| Scenario | Model | C-index, MESA test set | C-index, ELSA test set |
|---|---|---|---|
| Sc1 | DPoRT men | 0.79 | 0.72 |
| | DPoRT women | 0.68 | 0.75 |
| | FINDRISC | 0.76 | 0.75 |
| Sc2 | ARIC 1 | 0.77 | 0.73 |
| | KAHN | 0.78 | 0.74 |
| Sc3 | STERN | 0.80 | 0.80 |
| | ARIC 2 | 0.82 | 0.79 |
| | ARIC 3 | 0.84 | 0.81 |
| | FRAMINGHAM | 0.83 | 0.82 |
| | Combined model | 0.84 | 0.83 |
ARIC 1, Atherosclerosis Risk in Communities simple model; ARIC 2, ARIC clinical model without lipids; ARIC 3, ARIC clinical model with lipids; C-index, concordance index; DPoRT, Diabetes Population Risk Tool; ELSA, English Longitudinal Study of Ageing; FINDRISC, Finnish Diabetes Risk Score; FRAMINGHAM, Framingham model; MESA, Multi-Ethnic Study of Atherosclerosis; Sc, scenario; T2D, type 2 diabetes.
Figure 4 Receiver operating characteristic (ROC) curve and calibration plot at 8 years on the Multi-Ethnic Study of Atherosclerosis (MESA) test set (top panels) and the English Longitudinal Study of Ageing (ELSA) test set (bottom panels) for the combined type 2 diabetes model (black) and the original models of scenario 3 (STERN in green, Atherosclerosis Risk in Communities clinical model without lipids (ARIC 2) in blue, Atherosclerosis Risk in Communities clinical model with lipids (ARIC 3) in orange, Framingham model (FRAMINGHAM) in red).