Vera Helen Buss1,2, Marlien Varnfield1, Mark Harris2, Margo Barr2.
Abstract
The study aimed to assess the performance of a lifestyle-based prognostic risk model (Diabetes Lifestyle Score) for the prediction of 5-year risk of type 2 diabetes mellitus. The model comprises nine self-reported predictors (sex, age, antihypertensive drugs, body mass index, family history of diabetes, physical activity, fruits, vegetables, and wholemeal/brown bread). We conducted an external validation and update of the model in an Australian cohort including 97,615 residents of New South Wales aged 45 years and older who were free of type 1 and 2 diabetes mellitus at baseline. Of all participants, 4,741 developed type 2 diabetes mellitus over 5 years. We conducted the statistical analyses in RStudio using the programming language R. The area under the receiver operating characteristic curve (AUC) of the original model was 0.726 (95% confidence interval: 0.719, 0.733). After adjusting the calibration intercept and slope, the original model performed reasonably well in the external cohort. The best performance was achieved by using the numerical predictors as continuous variables and refitting all coefficients (AUC: 0.741, 95% confidence interval: 0.734, 0.748). The results of the original model after calibration were comparable to those obtained with the AUSDRISK score, which is routinely used in Australian clinical practice. Hence, the lifestyle-based model might be a reasonable alternative for laypersons, since they are likely to know the required information. Further, the risk score may help communicate the importance of a healthy diet in reducing the risk of diabetes.
Keywords: Cohort analysis; Diabetes mellitus, type 2; Logistic regression; Risk factor scores; Validation study
Year: 2021 PMID: 34976696 PMCID: PMC8684002 DOI: 10.1016/j.pmedr.2021.101647
Source DB: PubMed Journal: Prev Med Rep ISSN: 2211-3355
Fig. 1. Diabetes Lifestyle Score according to Simmons et al. (2007). Abbreviations: BMI = body mass index, T2DM = type 2 diabetes mellitus.
Updating methods for the logistic regression model.
| Method | Description |
| 0 – no adjustments | see |
| 1 – calibration-in-the-large | adjust intercept based on T2DM incidence in the validation dataset |
| 2 – logistic calibration | adjust intercept and regression coefficients using calibration intercept and slope from logistic regression model fitted with linear predictor as the only covariate |
| 3 – refitting | re-estimate all regression coefficients using only the validation dataset |
| 4 – refitting with different predictor assessment | like 3, but with overall vegetable consumption (cooked + raw vegetables) as a proxy for green leafy vegetables instead of raw vegetables |
| 5 – refitting with numerical predictors as continuous | like 4, but numerical predictors (BMI, moderate + vigorous physical activity, raw + cooked vegetables, fruits, brown bread) as continuous variables |
Abbreviations: BMI = body mass index, T2DM = type 2 diabetes mellitus.
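Methods 1 and 2 in the table above can be illustrated with a small sketch. The example below is written in Python with NumPy (the study itself used R) and runs on synthetic data; it is not the authors' code. The function `fit_logistic`, the variable names, and all numbers in the simulated cohort are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, offset=None, n_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    X must already contain a column of ones if an intercept is wanted.
    `offset` is added to the linear predictor with a fixed coefficient of 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    offset = np.zeros(len(y)) if offset is None else np.asarray(offset, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta + offset)
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic validation cohort (hypothetical numbers, for illustration only).
rng = np.random.default_rng(0)
n = 20_000
lp_orig = rng.normal(-4.0, 1.0, n)  # linear predictor of the "original" model
y = (rng.random(n) < sigmoid(0.2 + 0.8 * lp_orig)).astype(float)
ones = np.ones((n, 1))

# Method 1, calibration-in-the-large: shift the intercept only.
# The original linear predictor enters as an offset, so its slope stays at 1.
delta = fit_logistic(ones, y, offset=lp_orig)[0]
lp_m1 = lp_orig + delta

# Method 2, logistic calibration: re-estimate intercept a and slope b
# in logit(p) = a + b * lp_orig, with lp_orig as the only covariate.
a, b = fit_logistic(np.column_stack([ones, lp_orig]), y)
lp_m2 = a + b * lp_orig

# Re-assessing calibration of the method-2 predictions on the same data
# returns intercept 0 and slope 1 by construction.
a2, b2 = fit_logistic(np.column_stack([ones, lp_m2]), y)

print("observed incidence:      ", y.mean())
print("mean risk, original:     ", sigmoid(lp_orig).mean())
print("mean risk, method 1:     ", sigmoid(lp_m1).mean())
print("method 2 intercept/slope:", a, b)
```

In this sketch, refitting (method 3) would correspond to calling `fit_logistic` on the full predictor matrix of the validation data instead of on the original linear predictor.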
Fig. 2. Logistic regression model of AUSDRISK score (Chen et al., 2010). Abbreviations: BMI = body mass index, T2DM = type 2 diabetes mellitus.
Fig. 3. Flowchart for identifying T2DM cases and controls. APDC = Admitted Patient Data Collection data; GDM = gestational diabetes mellitus; PBS = Pharmaceutical Benefits Scheme data.
Comparison of participants’ characteristics in derivation (Simmons et al., 2007) and validation cohort.
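| Characteristic | T2DM, derivation | T2DM, validation | No T2DM, derivation | No T2DM, validation | p-value |
| All respondents | 209 (1.7) | 4,741 (4.9) | 12,310 (98.3) | 92,874 (95.1) | <0.001 |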
| Age (in years) | 62.8 (8.4) | 62.4 (9.3) | 59.0 (9.3) | 60.2 (9.6) | <0.001 |
| Women | 92 (44.0) | 2,279 (48.1) | 6,842 (55.6) | 53,005 (57.1) | <0.001 |
| Family history | <0.001 | ||||
| Parent or sibling with diabetes | 32 (15.3) | 1,352 (28.5) | 1,362 (11.1) | 16,978 (18.3) | |
| Parent and sibling with diabetes | 5 (2.4) | 245 (5.2) | 106 (0.9) | 1,940 (2.1) | |
| Body mass index | <0.001 | ||||
| < 25.0 | 25 (12.1) | 725 (16.4) | 4,980 (40.5) | 35,941 (41.3) | |
| 25.0–27.5 | 51 (24.6) | 805 (18.2) | 3,392 (27.6) | 20,684 (23.7) | |
| 27.5–30.0 | 48 (23.2) | 872 (19.7) | 2,141 (17.4) | 14,393 (16.5) | |
| > 30.0 | 83 (40.1) | 2,031 (45.8) | 1,772 (14.4) | 16,074 (18.5) | |
| Antihypertensive drugs | 66 (31.6) | 1,708 (36.0) | 2,196 (17.8) | 18,253 (19.7) | <0.001 |
| Physical activity ≥ 1 h/week | 57 (27.3) | 3,291 (73.0) | 5,782 (47.0) | 72,076 (80.9) | <0.001 |
| Green leafy (raw) | 28 (13.5) | 3,480 (85.1) | 2,485 (20.6) | 72,470 (87.5) | <0.001 |
| Fresh fruits ≥ 1 portion/day | 83 (40.5) | 4,119 (91.9) | 6,006 (49.7) | 83,341 (93.4) | <0.001 |
| Wholemeal/brown bread ≥ 1 portion/day | 64 (32.2) | 3,832 (86.0) | 4,698 (39.8) | 78,033 (87.7) | <0.001 |
Values are n (%) unless stated otherwise; age is given as mean (standard deviation).
Green leafy (raw): green leafy vegetables in the derivation dataset; raw vegetables in the validation dataset.
p-values: differences between derivation and validation cohort; Mann-Whitney U test for age, Pearson’s χ² test with Yates’ continuity correction for all other variables.
Percent of missing values per predictor.
| Predictor | Missing values (%) |
| Sex | 0.0 |
| Age | 0.0 |
| Family history | 0.0 |
| BMI | 6.2 |
| Antihypertensive drugs | 0.0 |
| Physical activity | 4.2 |
| Raw vegetables | 11.0 |
| Cooked vegetables | 3.1 |
| Fruits | 4.0 |
| Brown bread | 4.3 |
BMI: weight 3.3% and height 4.8% missing values.
Overview of models’ discrimination and overall performance in the validation.
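| Model | AUC (95% CI) | AUCbias (95% CI) | Brierscaled | Calibration slope (95% CI) | Calibration intercept (95% CI) |
| 0 – no adjustments | 0.726 (0.719, 0.733) | – | 1.47% | 0.781 (0.752, 0.811) | 0.669 (0.539, 0.800) |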
| 1 – calibration-in-the-large | 0.726 (0.719, 0.733) | – | 5.26% | 0.781 (0.752, 0.811) | −0.531 (−0.618, −0.444) |
| 2 – logistic calibration | 0.726 (0.719, 0.733) | – | 5.89% | 1.000 (0.962, 1.038) | 0.000 (−0.106, 0.106) |
| 3 – refitting | 0.738 (0.731, 0.745) | 0.737 (0.731, 0.744) | 6.53% | 1.000 (0.965, 1.035) | 0.000 (−0.098, 0.098) |
| 4 – refitting with different predictor assessment | 0.738 (0.731, 0.745) | 0.737 (0.731, 0.745) | 6.53% | 1.000 (0.965, 1.035) | 0.000 (−0.098, 0.098) |
| 5 – refitting with numerical predictors as continuous | 0.741 (0.734, 0.748) | 0.741 (0.734, 0.748) | 6.53% | 1.000 (0.966, 1.034) | 0.000 (−0.097, 0.097) |
| AUSDRISK | 0.723 (0.716, 0.730) | – | 4.42% | 0.956 (0.920, 0.991) | −0.514 (−0.600, −0.430) |
Abbreviations: AUC = area under the receiver-operator curve; AUCbias = bias-corrected AUC for refitted models; Brierscaled = scaled Brier score; CI = confidence interval.
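The discrimination and overall performance measures in this table can be computed as sketched below. The Python example is illustrative only and is not the authors' R code. It assumes the common definition of the scaled Brier score, 1 − Brier/Brier_max, where Brier_max is the Brier score of a non-informative model that predicts the observed incidence for everyone; this section does not state the exact definition used in the paper. The function names and the four-person toy dataset are hypothetical.

```python
import numpy as np

def auc(y, p):
    """AUC via the Mann-Whitney rank statistic, with average ranks for ties."""
    y = np.asarray(y).astype(bool)
    p = np.asarray(p, dtype=float)
    order = np.argsort(p, kind="mergesort")
    sorted_p = p[order]
    ranks = np.empty(len(p))
    i = 0
    while i < len(p):
        j = i
        # extend j over a run of tied predictions
        while j + 1 < len(p) and sorted_p[j + 1] == sorted_p[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1  # average 1-based rank
        i = j + 1
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

def scaled_brier(y, p):
    """1 - Brier / Brier_max, with Brier_max from predicting the incidence."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    brier = np.mean((p - y) ** 2)
    incidence = y.mean()
    brier_max = incidence * (1 - incidence)
    return 1 - brier / brier_max

# Toy data: two non-cases, two cases (made up for illustration).
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.4, 0.35, 0.8]
print(auc(y_toy, p_toy))           # 0.75
print(scaled_brier(y_toy, p_toy))  # approximately 0.3675
```

Under this definition, a scaled Brier score of 0 means the predictions are no better than assigning everyone the overall incidence, and 1 means perfect prediction.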
Results of likelihood ratio test for refitted models (in sequential order).
| Predictor | χ² | df | p-value | χ² | df | p-value |
| Sex | 147.38 | 1 | <0.001 | 147.38 | 1 | <0.001 |
| Age | 190.60 | 1 | <0.001 | 190.60 | 1 | <0.001 |
| Antihypertensive drugs | 516.25 | 1 | <0.001 | 516.25 | 1 | <0.001 |
| BMI | 1986.03 | 3 | <0.001 | 2033.14 | 1 | <0.001 |
| Family history | 404.05 | 2 | <0.001 | 408.56 | 2 | <0.001 |
| Physical activity | 49.68 | 1 | <0.001 | 31.10 | 1 | <0.001 |
| Fruits | 7.91 | 2 | 0.019 | 3.49 | 1 | 0.062 |
| Vegetables | 6.05 | 1 | 0.014 | 2.54 | 1 | 0.111 |
| Brown bread | 3.15 | 4 | 0.533 | 0.49 | 1 | 0.484 |
Abbreviation: df = degrees of freedom.
raw and cooked vegetables combined.
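The p-values in the likelihood ratio table follow from comparing each test statistic with a chi-square distribution on the listed degrees of freedom. The sketch below, in Python with only the standard library, uses closed-form expressions for the chi-square upper tail with integer degrees of freedom. The function name `chi2_sf` is hypothetical and this is not the authors' R code; the statistic and df pairs are taken from the table.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable
    with integer degrees of freedom df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # odd df: normal-tail part plus a finite series
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + total

# (statistic, df) pairs from the table above
rows = {
    "Fruits, first model": (7.91, 2),
    "Brown bread, first model": (3.15, 4),
    "Fruits, second model": (3.49, 1),
    "Vegetables, second model": (2.54, 1),
    "Brown bread, second model": (0.49, 1),
}
for name, (stat, df) in rows.items():
    print(f"{name}: p = {chi2_sf(stat, df):.3f}")
```

For these rows the computed values agree with the tabulated p-values to three decimals (0.019, 0.533, 0.062, 0.111, and 0.484).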
Fig. 4. Calibration curves. Vertical lines indicate the distribution of the predicted probabilities.