Apostolos Davillas, Andrew M Jones.
Abstract
Recent advances in social science surveys include collection of biological samples. Although biomarkers offer a large potential for social science and economic research, they impose a number of statistical challenges, often being distributed asymmetrically with heavy tails. Using data from the UK Household Panel Survey, we illustrate the comparative performance of a set of flexible parametric distributions, which allow for a wide range of skewness and kurtosis: the four-parameter generalized beta of the second kind (GB2), the three-parameter generalized gamma, and their three-, two-, or one-parameter nested and limiting cases. Commonly used blood-based biomarkers for inflammation, diabetes, cholesterol, and stress-related hormones are modelled. Although some of the three-parameter distributions nested within the GB2 outperform the latter for most of the biomarkers considered, the GB2 can be used as a guide for choosing among competing parametric distributions for biomarkers. Going "beyond the mean" to estimate tail probabilities, we find that GB2 performs fairly well with some disparities at the very high levels of glycated hemoglobin and fibrinogen. Commonly used linear models are shown to perform worse than almost all the flexible distributions.
Keywords: biomarkers; generalized beta of second kind; heavy tails; tail probabilities
Year: 2018 PMID: 29900615 PMCID: PMC6175412 DOI: 10.1002/hec.3787
Source DB: PubMed Journal: Health Econ ISSN: 1057-9230 Impact factor: 3.046
Figure A1. Quantile–quantile plots of the biomarkers by gender [Colour figure can be viewed at http://wileyonlinelibrary.com]
AIC and BIC for each model
| Model | Fibrinogen AIC | Fibrinogen BIC | HbA1c AIC | HbA1c BIC | Cholesterol ratio AIC | Cholesterol ratio BIC | DHEAS AIC | DHEAS BIC |
|---|---|---|---|---|---|---|---|---|
| GB2 | | 20,948 | | | | 39,257 | | 53,889 |
| B2 | 21,221 | 21,296 | 76,134 | 76,371 | | | 53,897 | 53,979 |
| SM | | | 72,329 | 72,404 | 39,432 | 39,506 | | |
| Dagum | 20,872 | 20,947 | 72,927 | 73,001 | 39,315 | 39,390 | 53,855 | 53,937 |
| Fisk | 20,883 | 20,950 | 73,563 | 73,629 | 39,482 | 39,549 | 54,149 | 54,223 |
| Lomax | 51,843 | 51,910 | 112,182 | 112,249 | 59,542 | 59,624 | 61,959 | 62,040 |
| GG | 21,204 | 21,278 | 74,986 | 75,060 | 39,180 | 39,270 | 53,927 | 54,016 |
| Lognormal | 21,502 | 21,569 | 77,305 | 77,372 | 39,306 | 39,373 | 54,407 | 54,482 |
| Gamma | 21,219 | 21,287 | 79,049 | 79,116 | 39,867 | 39,934 | 53,942 | 54,016 |
| Weibull | 22,804 | 22,871 | 88,676 | 88,743 | 42,443 | 42,518 | 54,640 | 54,715 |
| Exponential | 51,841 | 51,900 | 112,180 | 112,239 | 59,540 | 59,615 | 61,957 | 62,031 |
| OLS | 21,500 | 21,558 | 84,119 | 84,178 | 42,875 | 42,950 | 58,371 | 58,446 |
Note. AIC: Akaike information criterion; BIC: Bayesian information criterion; B2: beta of the second kind; DHEAS: dehydroepiandrosterone sulfate; GB2: generalized beta of the second kind; GG: generalized gamma; HbA1c: glycated hemoglobin; OLS: ordinary least squares; SM: Singh–Maddala. For each biomarker, bold values highlight the models that perform best according to AIC and BIC.
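The criteria in the table follow the standard definitions, AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, so the gap between them depends only on the number of estimated parameters k and the sample size n. The sketch below is illustrative rather than the authors' code: it assumes the estimation sample for fibrinogen equals the 12,811 observations reported in the descriptive statistics, and the function names are hypothetical.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def implied_k(aic_value: float, bic_value: float, n: int) -> float:
    """Back out the parameter count from BIC - AIC = k (ln n - 2)."""
    return (bic_value - aic_value) / (math.log(n) - 2)


# Dagum fit for fibrinogen: AIC and BIC are taken from the table above.
# ASSUMPTION: n = 12,811 (descriptive-statistics sample size) is also
# the estimation sample size.
k_dagum = implied_k(20872, 20947, 12811)
print(f"Implied number of parameters (Dagum, fibrinogen): {k_dagum:.2f}")
```

Because the published values are rounded, the implied k is only approximate; it is a consistency check on the table, not a substitute for the reported model specification.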
Figure 1. Distribution of biomarkers and quantile‐normal (Q‐N) plots. DHEAS: dehydroepiandrosterone sulfate; HbA1c: glycated hemoglobin [Colour figure can be viewed at http://wileyonlinelibrary.com]
Descriptive statistics
| Biomarker | Mean | Median | Standard deviation | Skewness | Kurtosis | Minimum | Maximum | Sample size |
|---|---|---|---|---|---|---|---|---|
| Fibrinogen (g/L) | 2.79 | 2.70 | 0.59 | 0.47 | 3.82 | 0.40 | 5.50 | 12,811 |
| HbA1c (mmol/mol) | 37.25 | 36.00 | 8.19 | 4.17 | 31.15 | 12 | 133.0 | 12,153 |
| Cholesterol ratio | 3.74 | 3.46 | 1.36 | 1.42 | 6.43 | 1.16 | 13.67 | 12,865 |
| DHEAS (μmol/L) | 4.62 | 3.80 | 3.24 | 1.29 | 5.11 | 0.20 | 25.30 | 12,809 |
Note. DHEAS: dehydroepiandrosterone sulfate; HbA1c: glycated hemoglobin.
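Skewness and kurtosis of the kind reported above can be computed from the central moments of a sample. The sketch below assumes the moment-based convention in which a normal distribution has kurtosis 3 (not excess kurtosis) and divides by n; the source does not state which convention was used, and the data in the example are made up for illustration, not survey values.

```python
def skewness_kurtosis(values):
    """Return (skewness, kurtosis) based on central moments divided by n.

    Kurtosis is NOT excess kurtosis: a normal distribution gives 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical right-skewed sample (not data from the survey).
sample = [1.0, 1.0, 1.0, 2.0, 10.0]
skew, kurt = skewness_kurtosis(sample)
print(f"skewness = {skew:.2f}, kurtosis = {kurt:.2f}")
```

A positive skewness together with kurtosis well above 3, as for HbA1c in the table, is the pattern that motivates distributions with flexible tails.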
LR and Wald tests (p‐values) for special cases of the GB2 and GG
| Model | Fibrinogen LR | Fibrinogen Wald | HbA1c LR | HbA1c Wald | Cholesterol ratio LR | Cholesterol ratio Wald | DHEAS LR | DHEAS Wald |
|---|---|---|---|---|---|---|---|---|
| GB2 versus … | | | | | | | | |
| B2 | 0.000 | 0.000 | 0.000 | 0.000 | | | 0.000 | 0.000 |
| SM | | | 0.000 | 0.000 | 0.000 | 0.188 | | |
| Dagum | 0.004 | 0.013 | 0.000 | 0.000 | 0.000 | 0.020 | 0.000 | 0.000 |
| Fisk | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Lomax | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| GG versus … | | | | | | | | |
| Gamma | 0.000 | 0.024 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Lognormal | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Weibull | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Exponential | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
Note. B2: beta of the second kind; DHEAS: dehydroepiandrosterone sulfate; LR: likelihood ratio; GB2: generalized beta of the second kind; GG: generalized gamma; HbA1c: glycated hemoglobin; SM: Singh–Maddala. For each biomarker, bold p‐values highlight the models for which the null hypothesis that the restrictions are valid cannot be rejected, according to both the LR and Wald tests, relative to the GB2 or GG model.
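A likelihood ratio test compares a restricted model with the model that nests it: LR = 2(ln L_u − ln L_r), which is referred to a chi-square distribution with as many degrees of freedom as there are restrictions. Since AIC = 2k − 2 ln L, the statistic can be recovered from two AIC values as LR = AIC_r − AIC_u + 2m, where m is the number of restrictions. The sketch below applies this to the gamma versus GG comparison for fibrinogen, assuming one restriction and using the rounded AIC values from the AIC and BIC table, so the result is approximate. The helper names are hypothetical and only the closed-form cases df = 1 and df = 2 are implemented.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail chi-square probability, closed forms for df = 1 or 2."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only implements df = 1 and df = 2")


def lr_from_aic(aic_restricted: float, aic_unrestricted: float,
                n_restrictions: int) -> float:
    """LR = 2(lnL_u - lnL_r) = AIC_r - AIC_u + 2 * n_restrictions."""
    return aic_restricted - aic_unrestricted + 2 * n_restrictions


# Fibrinogen: gamma (AIC 21,219) against GG (AIC 21,204).
# ASSUMPTION: gamma imposes a single restriction on the GG.
lr_stat = lr_from_aic(21219, 21204, 1)
p_value = chi2_sf(lr_stat, 1)
print(f"LR = {lr_stat}, p = {p_value:.5f}")
```

The resulting p-value rounds to 0.000, in line with the LR entry for the gamma model in the fibrinogen column.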
Estimated parameters from the GB2 and the GG models
| Biomarker | GB2 | GB2 | GB2 | GG | GG: Ln(σ) |
|---|---|---|---|---|---|
| Fibrinogen | 7.892 [7.017, 8.767] | 1.104 [0.933, 1.275] | 1.299 [1.063, 1.535] | 0.267 [0.209, 0.326] | −1.606 [−1.621, −1.592] |
| HbA1c | 42.986 [36.674, 49.298] | 0.348 [0.287, 0.410] | 0.198 [0.167, 0.230] | −0.461 [−0.555, −0.368] | −1.970 [−2.017, −1.924] |
| Cholesterol ratio | 1.442 [0.777, 2.108] | 23.345 [−9.920, 56.612] | 6.611 [1.761, 11.463] | −0.246 [−0.290, −0.200] | −1.169 [−1.183, −1.157] |
| DHEAS | 2.538 [2.202, 2.873] | 1.036 [0.846, 1.225] | 2.316 [1.717, 2.915] | 0.446 [0.396, 0.495] | −0.615 [−0.631, −0.599] |
Note. DHEAS: dehydroepiandrosterone sulfate; GB2: generalized beta of the second kind; GG: generalized gamma; HbA1c: glycated hemoglobin.
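For reference, the GB2 density in its usual parameterization, with shape parameters a, p, q and scale b, is f(y) = a y^(ap−1) / (b^(ap) B(p, q) (1 + (y/b)^a)^(p+q)). Setting p = 1 gives the Singh–Maddala density and q = 1 gives the Dagum density. The sketch below evaluates these densities and is not the authors' code. The mapping of the three GB2 columns above to (a, p, q) is an assumption, and the scale b = 2.7 is a hypothetical value chosen only for illustration, since the table reports no scale.

```python
import math


def gb2_pdf(y: float, a: float, b: float, p: float, q: float) -> float:
    """GB2 density: a y^(ap-1) / (b^(ap) B(p,q) (1 + (y/b)^a)^(p+q))."""
    if y <= 0:
        return 0.0
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    z = (y / b) ** a
    log_pdf = (math.log(a) + (a * p - 1) * math.log(y)
               - a * p * math.log(b) - log_beta
               - (p + q) * math.log1p(z))
    return math.exp(log_pdf)


def singh_maddala_pdf(y: float, a: float, b: float, q: float) -> float:
    """Singh-Maddala density, the GB2 special case with p = 1."""
    z = (y / b) ** a
    return a * q / y * z / (1 + z) ** (q + 1)


def dagum_pdf(y: float, a: float, b: float, p: float) -> float:
    """Dagum density, the GB2 special case with q = 1."""
    z = (y / b) ** a
    return a * p / y * z ** p / (1 + z) ** (p + 1)


# ASSUMPTIONS: (a, p, q) = (7.892, 1.104, 1.299) reads the fibrinogen row
# as a, p, q in that order; b = 2.7 is hypothetical.
density = gb2_pdf(2.79, a=7.892, b=2.7, p=1.104, q=1.299)
print(f"GB2 density at y = 2.79: {density:.4f}")
```

Working on the log scale with `math.lgamma` avoids overflow in the beta function when p or q is large, as in the cholesterol ratio row.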
Figure 2. Actual versus fitted tail probabilities. DHEAS: dehydroepiandrosterone sulfate; HbA1c: glycated hemoglobin [Colour figure can be viewed at http://wileyonlinelibrary.com]
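A comparison of actual and fitted tail probabilities sets the share of observations above a threshold against P(Y > t) implied by the fitted distribution. The GB2 tail probability requires the incomplete beta function, so the sketch below swaps in the Singh–Maddala distribution, a nested case with the closed form P(Y > t) = (1 + (t/b)^a)^(−q). The parameter values, thresholds, and simulated data are all hypothetical, and covariates are ignored.

```python
import random


def sm_tail_prob(t: float, a: float, b: float, q: float) -> float:
    """P(Y > t) under Singh-Maddala: (1 + (t/b)^a)^(-q)."""
    return (1 + (t / b) ** a) ** (-q)


def sm_sample(n: int, a: float, b: float, q: float, seed: int = 0):
    """Draw n Singh-Maddala values by inverting the CDF."""
    rng = random.Random(seed)
    # With u uniform on [0, 1): y = b * ((1 - u)^(-1/q) - 1)^(1/a).
    return [b * ((1 - rng.random()) ** (-1 / q) - 1) ** (1 / a)
            for _ in range(n)]


def empirical_tail_prob(values, t: float) -> float:
    """Share of observations strictly above the threshold t."""
    return sum(v > t for v in values) / len(values)


# Hypothetical parameters and thresholds, for illustration only.
a, b, q = 8.0, 2.7, 1.3
data = sm_sample(20000, a, b, q, seed=42)
for threshold in (3.5, 4.0, 4.5):
    actual = empirical_tail_prob(data, threshold)
    fitted = sm_tail_prob(threshold, a, b, q)
    print(f"t = {threshold}: actual = {actual:.4f}, fitted = {fitted:.4f}")
```

With real data, the fitted probabilities would come from estimated parameters, and gaps between the two columns at high thresholds would indicate a poor fit in the upper tail.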