Shuyu Guo, Robyn M Lucas, Anne-Louise Ponsonby.
Abstract
BACKGROUND: Epidemiological evidence suggests that vitamin D deficiency is linked to various chronic diseases. However, direct measurement of serum 25-hydroxyvitamin D (25(OH)D) concentration, the accepted biomarker of vitamin D status, may not be feasible in large epidemiological studies. An alternative approach is to estimate vitamin D status using a predictive model based on parameters derived from questionnaire data. In previous studies, models developed using multiple linear regression (MLR) have explained only a limited proportion of the variance, and predicted values have correlated only modestly with measured values. Here, a new modelling approach, nonlinear radial basis function support vector regression (RBF SVR), was used to predict serum 25(OH)D concentration. Predicted scores were compared with those from an MLR model.
Year: 2013 PMID: 24302994 PMCID: PMC3841172 DOI: 10.1371/journal.pone.0079970
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Recent studies using a multiple linear regression prediction model for 25(OH)D concentration.
| Reference | Cohort | Sample | Model covariates | R² for the model | Validation |
| Giovannucci et al., 2006 | Health Professionals Follow-Up Study (HPFS), US | Male; 40–75; Training set: 1095; Validation set: 542 | Geographical region; Dietary vitamin D intake; Vitamin D supplements; Race; BMI; Physical activity level | 28% | Measured plasma 25(OH)D level rose across increasing deciles of predicted 25(OH)D score (p-trend<0.001) |
| Chan et al., 2010 | Adventist Health Study-2 (AHS-2), US, Canada | Male & Female; Black: 209; White: 236 | Race; BMI; Skin type; UV season; Latitude; Erythemal zone; Total vitamin D intake; Duration of sun exposure; Percentage of body exposed | White: 22%; Black: 31%; Total: 42% | N/A |
| Liu et al., 2010 | Framingham Offspring Study, Massachusetts, US | Male & Female; 50–70; Training set: 883; Validation set: 845 | Age; Sex; BMI; Total vitamin D intake; Smoking status; Total energy intake | 25.75% | Spearman rho for measured 25(OH)D concentration vs. predicted score = 0.51 (p<0.001) |
| Millen et al., 2010 | Women's Health Initiative Clinical Trial (WHI-CT), US | Female; 50–79; Training set: 3055; Validation set: 1528 | Langleys; Race; Age; Waist circumference; Recreational physical activity; Total vitamin D intake | 21% | Pearson correlation coefficient for measured plasma 25(OH)D vs. predicted score r = 0.45, 95% CI: 0.40, 0.49. The predictive model was poor at categorizing women in the severely deficient (3%) and sufficient (3%) range of vitamin D status. |
| Peiris et al., 2011 | Veterans Administration Center patients, Southeastern US | Male | Triglyceride; Race; Total cholesterol; BMI; Calcium level; Number of missed appointments | 12.9% | The model correctly classified vitamin D deficiency status for 70.6% patients; only 30.6% of those who were actually deficient were correctly identified as deficient. |
| Bertrand et al., 2012 | Nurses' Health Study (NHS); Nurses' Health Study II (NHSII); Health Professionals Follow-up Study (HPFS) | NHS: female, 30–55 y, Training set: 2246, Validation set: 818; NHSII: female, 25–42 y, Training set: 1646, Validation set: 479; HPFS: Male, 40–75 y, Training set: 1255, Validation set: 841 | Race; UV-B flux; Dietary vitamin D intake; Supplementary vitamin D intake; BMI; Physical activity; Alcohol intake; Post-menopausal hormone use; Season of blood draw | NHS: 33%; NHSII: 25%; HPFS: 28% | Spearman rho for measured 25(OH)D concentration vs. predicted score were 0.23, 95% CI: 0.16, 0.29 for NHS, 0.42, 95% CI: 0.34, 0.49 for NHSII, 0.30, 95% CI: 0.21, 0.37 (adjusted for batch, age and season of blood draw) |
Figure 1. Performance demonstration of SVR and MLR in a simple scenario (two-dimensional case).
The black dots indicate the simulated data set. The solid curve denotes the SVR regression line and the dotted line represents the MLR regression line. The simulated data set was randomly generated in MATLAB.
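The contrast in Figure 1 between a straight MLR line and a curved SVR fit follows from the form of the two predictors. The block below writes out the textbook forms as a reminder; the symbols (β, α, α*, b, γ, p, n) are standard notation assumed here and are not taken from the paper.

```latex
% MLR: prediction is a weighted sum of the covariates x_1, ..., x_p
\hat{y}_{\mathrm{MLR}}(x) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% RBF SVR: prediction is a weighted sum of kernel similarities
% between x and the training points x_1, ..., x_n
\hat{y}_{\mathrm{SVR}}(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\, K(x_i, x) + b,
\qquad K(x_i, x) = \exp\!\left(-\gamma \lVert x_i - x \rVert^{2}\right)
```

Because the RBF kernel depends on the distance to each training point, the fitted curve can bend locally, whereas the MLR prediction is constrained to a plane in the covariates.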
Predicted 25(OH)D concentration and mean absolute difference between predicted and measured 25(OH)D level (nmol/L).
| | Mean | Standard deviation | Minimum | Maximum |
| Measured 25(OH)D level | 81.71 | 28.33 | 14.2 | 163.3 |
| Predicted level MLR | 81.3 | 20.41 | 34.54 | 121.71 |
| Predicted level RBF SVR | 78.10 | 18.87 | 28.01 | 129.91 |
| Mean absolute difference MLR | 19.04 | 15.23 | 0.18 | 76.39 |
| Mean absolute difference RBF SVR | 15.65 | 8.91 | 0.05 | 49.33 |
RBF SVR, radial basis function support vector regression (nonlinear support vector regression).
MLR, multiple linear regression.
Mean absolute difference is the average of the absolute differences between the predicted and measured values.
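The definition above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the sample values are hypothetical and are not study data.

```python
def mean_absolute_difference(predicted, measured):
    """Average of |predicted - measured| over paired observations."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


# Hypothetical paired values in nmol/L (not study data).
measured = [60.0, 82.5, 101.0, 45.0]
predicted = [70.0, 80.0, 90.0, 55.0]
mad = mean_absolute_difference(predicted, measured)
print(mad)
```

A smaller value means the predicted scores sit closer, on average, to the measured concentrations.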
Figure 2. Correlation of measured 25(OH)D concentration (nmol/L) and predicted 25(OH)D concentration using (A) a multiple linear regression model; and (B) a radial basis function support vector regression model.
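As a rough illustration of how a measured-versus-predicted correlation might be computed, the sketch below implements the Pearson coefficient. The caption does not say which coefficient was used, so Pearson is an assumption here, and the function name and sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values in nmol/L (not study data).
measured = [40.0, 60.0, 80.0, 100.0]
predicted = [55.0, 65.0, 75.0, 85.0]  # compressed range but perfectly linear
r = pearson_r(measured, predicted)
print(round(r, 6))
```

A high correlation does not by itself imply agreement: in this toy example the predictions are perfectly correlated with the measurements yet still differ from them.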
Figure 3. Bland-Altman plots of measured 25(OH)D concentration compared to predicted scores from (A) an MLR model; (B) an RBF SVR model.
The solid lines indicate the mean bias (middle line) and 95% limits of agreement (top and bottom lines). All measurements are in nmol/L.
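The three lines in a Bland-Altman plot can be computed as sketched below. This assumes the conventional limits of agreement (mean difference ± 1.96 standard deviations of the differences) and takes the difference as predicted minus measured; the paper's sign convention is not stated in the caption, and the function name and sample values are hypothetical.

```python
import statistics


def bland_altman_limits(measured, predicted):
    """Return (mean bias, lower limit, upper limit) of agreement."""
    # Assumption: difference is predicted minus measured.
    diffs = [p - m for p, m in zip(predicted, measured)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical values in nmol/L (not study data).
measured = [50.0, 60.0, 70.0, 80.0]
predicted = [55.0, 58.0, 75.0, 78.0]
bias, lower, upper = bland_altman_limits(measured, predicted)
print(bias, lower, upper)
```

Narrower limits indicate that individual predictions deviate less from the measured values, independently of the mean bias.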
Figure 4. ROC curves of MLR and RBF SVR.
ROC curves showing the true-positive rate (sensitivity) plotted against the false-positive rate for different cut-off points of the predicted scores from MLR (gray diamonds) and RBF SVR (black circles). The highlighted points correspond to a 25(OH)D score of 75 nmol/L for MLR and RBF SVR. The area under the curve is 0.79 for MLR and 0.87 for RBF SVR.
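A single ROC point and the area under the curve can be sketched as follows. Two things are assumptions for illustration, since the caption does not spell them out: the "positive" class is taken to be a measured 25(OH)D below 75 nmol/L, and a predicted score below the cut-off is called positive. The function names and sample values are hypothetical.

```python
def roc_point(scores, is_positive, cutoff):
    """Sensitivity and false-positive rate when 'score < cutoff' is called positive."""
    tp = sum(1 for s, pos in zip(scores, is_positive) if pos and s < cutoff)
    fp = sum(1 for s, pos in zip(scores, is_positive) if not pos and s < cutoff)
    n_pos = sum(1 for pos in is_positive if pos)
    n_neg = len(is_positive) - n_pos
    return tp / n_pos, fp / n_neg


def auc(scores, is_positive):
    """Probability that a random positive scores lower than a random negative.

    Ties count as 0.5; this equals the area under the ROC curve.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical values in nmol/L (not study data).
measured = [40.0, 60.0, 70.0, 80.0, 90.0, 110.0]
predicted = [50.0, 72.0, 78.0, 70.0, 95.0, 100.0]
is_positive = [m < 75.0 for m in measured]  # assumed outcome definition

sensitivity, false_positive_rate = roc_point(predicted, is_positive, cutoff=75.0)
area = auc(predicted, is_positive)
print(sensitivity, false_positive_rate, area)
```

Sweeping the cut-off over all observed scores and plotting the resulting pairs traces out the full ROC curve.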
Figure 5. Accuracy of predicted 25(OH)D score in each quintile of 25(OH)D concentration.
Figure 6. Percentage of individuals classified by quintiles of measured 25(OH)D concentration and predicted 25(OH)D score.
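Cross-classification by quintile, as summarized in Figures 5 and 6, can be sketched as below. This is an illustrative assumption about the procedure: quintiles are assigned by rank within each variable (ties broken by input order), and the function names and sample values are hypothetical.

```python
def quintile_labels(values):
    """Assign labels 1..5 by rank so each quintile holds about a fifth of the sample."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(values) + 1
    return labels


def percent_same_quintile(measured, predicted):
    """Percentage of individuals whose measured and predicted quintiles match."""
    q_measured = quintile_labels(measured)
    q_predicted = quintile_labels(predicted)
    same = sum(1 for a, b in zip(q_measured, q_predicted) if a == b)
    return 100.0 * same / len(measured)


# Hypothetical values in nmol/L (not study data).
measured = [30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
predicted = [45, 35, 65, 55, 75, 95, 85, 105, 125, 115]
agreement = percent_same_quintile(measured, predicted)
print(agreement)
```

The same labels can be tabulated as a 5 × 5 grid to show how far misclassified individuals fall from their measured quintile.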