Sung Kyun Park, Zhangchen Zhao, Bhramar Mukherjee.
Abstract
BACKGROUND: There is growing concern about the health effects of exposure to pollutant mixtures. We initially proposed an Environmental Risk Score (ERS) as a summary measure for examining the risk of exposure to multiple pollutants in epidemiologic research, considering only pollutant main effects. Here we expand the ERS to account for pollutant-pollutant interactions using modern machine learning methods. We illustrate the multi-pollutant approaches by predicting gamma-glutamyl transferase (GGT), a marker of oxidative stress, which is a common disease pathway linking environmental exposure and numerous health endpoints.
Keywords: Bayesian additive regression tree (BART); Bayesian kernel machine regression (BKMR); Cardiovascular disease; Elastic-net; Environmental risk score (ERS); Machine learning; Metals; Mixtures; Multipollutants; Super Learner
Year: 2017 PMID: 28950902 PMCID: PMC5615812 DOI: 10.1186/s12940-017-0310-9
Source DB: PubMed Journal: Environ Health ISSN: 1476-069X Impact factor: 5.984
Fig. 1 Schematic diagram of Environmental Risk Score (ERS) construction and analytical methods. AENET-I, adaptive elastic-net with main effects and pairwise interactions; BART, Bayesian additive regression tree; BKMR, Bayesian kernel machine regression; PRESS, predicted residual sums of squares; MSE, mean square error; MSPE, mean square prediction error; AUC, area under the receiver operating characteristic curve; OR, odds ratio; SBP/DBP, systolic and diastolic blood pressure; CVD, cardiovascular disease
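The ERS construction outlined in Fig. 1 can be illustrated with a minimal sketch, assuming the score is a weighted sum of metal main effects plus pairwise interaction terms, with coefficients estimated in a training set. The function name, the coefficients, and the exposure values below are hypothetical and are not taken from the study.

```python
from itertools import combinations


def environmental_risk_score(exposures, main_coefs, interaction_coefs):
    """Weighted sum of main effects plus pairwise interaction terms.

    exposures: dict mapping metal name -> (transformed) exposure value
    main_coefs: dict mapping metal name -> main-effect coefficient
    interaction_coefs: dict mapping frozenset({metal_a, metal_b}) -> coefficient
    Metals or pairs without a coefficient contribute nothing (coefficient 0).
    """
    score = sum(main_coefs.get(m, 0.0) * x for m, x in exposures.items())
    for a, b in combinations(sorted(exposures), 2):
        coef = interaction_coefs.get(frozenset((a, b)), 0.0)
        score += coef * exposures[a] * exposures[b]
    return score


# Hypothetical coefficients and exposures, for illustration only.
main = {"lead": 0.02, "cadmium": 0.03}
inter = {frozenset(("lead", "cadmium")): 0.01}
person = {"lead": 1.0, "cadmium": 2.0, "mercury": 0.5}

ers_with_interactions = environmental_risk_score(person, main, inter)
ers_main_only = environmental_risk_score(person, main, {})
```

Passing an empty interaction dictionary reduces the sketch to a main-effects-only score, which mirrors the distinction between AENET-M and AENET-I.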
Characteristics of the study population overall and by NHANES cycle
| | 2003-2004 | 2005-2006 | 2007-2008 | 2009-2010 | 2011-2012 | 2013-2014 | Overall |
|---|---|---|---|---|---|---|---|
| N | | | | | | | 9664 |
| CONTINUOUS, mean(SD) | |||||||
| Age, years | 51.0 (19.5) | 48.3 (18.8) | 50.2 (17.6) | 49.4 (17.8) | 48.0 (17.7) | 48.3 (17.2) | 49.2 (17.2) |
| BMI, kg/m2 | 28.6 (6.32) | 28.7 (6.70) | 28.7 (6.13) | 29.1 (6.86) | 28.7 (6.90) | 29.1 (7.11) | 28.8 (6.69) |
| GGT, U/L | 21.5 (1.91) | 20.85 (1.96) | 24.1 (1.92) | 22.1 (1.92) | 20.5 (1.92) | 21.3 (1.96) | 21.7 (1.94) |
| SBP, mm Hg | 126.0 (20.7) | 123.0 (19.3) | 123.7 (18.6) | 121.9 (18.4) | 122.5 (17.8) | 122.6 (17.5) | 123.2 (18.7) |
| DBP, mm Hg | 70.3 (12.3) | 69.1 (12.7) | 70.4 (11.8) | 69.0 (12.2) | 71.0 (11.8) | 70.2 (11.3) | 70.0 (12.0) |
| CATEGORICAL, N (%) | |||||||
| Female | 839 (52.08) | 825 (51.79) | 966 (50.31) | 1078 (52.00) | 871 (49.77) | 961 (51.72) | 4911 (50.82) |
| Race/Ethnicity | |||||||
| Mexican American | 323 (20.05) | 311 (19.52) | 322 (16.77) | 384 (18.52) | 154 (8.80) | 267 (14.37) | 1618 (16.74) |
| Other Hispanic | 39 (2.42) | 47 (2.95) | 234 (12.19) | 217 (10.47) | 179 (10.23) | 159 (8.56) | 779 (8.06) |
| Non-Hispanic White | 852 (52.89) | 816 (51.22) | 893 (46.51) | 973 (46.94) | 639 (36.51) | 805 (43.33) | 4501 (46.57) |
| Non-Hispanic Black | 341 (21.17) | 363 (22.79) | 402 (20.94) | 372 (17.95) | 458 (26.17) | 370 (19.91) | 1976 (20.45) |
| Other | 56 (3.48) | 56 (3.52) | 69 (3.59) | 127 (6.13) | 320 (18.29) | 257 (13.83) | 790 (8.17) |
| Smoking Status | |||||||
| Never | 830 (51.55) | 843 (52.92) | 1015 (52.89) | 1116 (53.84) | 1002 (57.29) | 1044 (56.19) | 5242 (54.24) |
| Former | 337 (20.93) | 338 (21.22) | 428 (22.30) | 444 (21.42) | 340 (19.44) | 376 (20.24) | 2000 (20.70) |
| Current | 443 (27.52) | 412 (25.86) | 476 (24.80) | 513 (24.75) | 407 (23.27) | 438 (23.57) | 2422 (25.06) |
| Education | |||||||
| < High School | 486 (30.25) | 450 (28.28) | 611 (31.82) | 591 (28.55) | 420 (24.03) | 393 (21.16) | 2599 (26.89) |
| High School | 850 (52.89) | 836 (52.54) | 966 (50.32) | 1082 (52.27) | 872 (49.89) | 996 (53.64) | 5018 (51.92) |
| College or Above | 271 (16.86) | 305 (19.17) | 343 (17.86) | 397 (19.18) | 456 (26.09) | 468 (25.20) | 2047 (21.18) |
| Hypertension | 549 (40.76) | 450 (33.06) | 608 (37.98) | 666 (36.78) | 550 (36.74) | 595 (36.28) | 3418 (36.92) |
| Mortality | |||||||
| Total | 163 (11.32) | 102 (7.19) | 62 (3.73) | 32 (1.70) | NA | NA | 359 (5.61) |
| CVD | 42 (2.92) | 33 (2.33) | 17 (1.02) | 2 (0.11) | NA | NA | 94 (1.47) |
| Cancer | 38 (2.64) | 25 (1.76) | 14 (0.84) | 13 (0.69) | NA | NA | 90 (1.41) |
BMI, body mass index; GGT, gamma-glutamyl transferase; SBP, systolic blood pressure; DBP, diastolic blood pressure; CVD, cardiovascular disease
Geometric means and geometric standard deviations of metals overall and by NHANES cycle
| | 2003-2004 | 2005-2006 | 2007-2008 | 2009-2010 | 2011-2012 | 2013-2014 | Overall |
|---|---|---|---|---|---|---|---|
| In whole blood | |||||||
| Lead, μg/dL | 1.67 (1.92) | 1.47 (2.01) | 1.49 (1.91) | 1.31 (1.95) | 1.13 (2.01) | 1.01 (1.97) | 1.32 (2.00) |
| Cadmium, μg/L | 0.39 (2.27) | 0.37 (2.17) | 0.39 (2.15) | 0.38 (2.13) | 0.36 (2.27) | 0.33 (2.30) | 0.37 (2.22) |
| Total Mercury, μg/L | 0.88 (2.78) | 0.96 (2.54) | 0.90 (2.58) | 1.01 (2.59) | 0.94 (2.87) | 0.87 (2.70) | 0.93 (2.68) |
| In urine, μg/L | |||||||
| Antimony | 0.08 (1.81) | 0.07 (2.29) | 0.06 (2.15) | 0.06 (2.22) | 0.05 (2.00) | 0.04 (2.36) | 0.06 (2.21) |
| Total Arsenic | 8.92 (3.13) | 9.91 (3.21) | 8.77 (2.99) | 10.07 (3.25) | 8.81 (3.29) | 7.14 (3.06) | 8.88 (3.17) |
| Arsenous acid | 0.83 (1.21) | 0.87 (1.18) | 0.87 (1.21) | 0.88 (1.24) | 0.45 (1.63) | 0.32 (2.84) | 0.65 (1.91) |
| Arsenic acid | 0.73 (1.18) | 0.73 (1.21) | 0.72 (1.17) | 0.73 (1.19) | 0.64 (1.21) | 0.57 (1.13) | 0.68 (1.21) |
| Arsenobetaine | 1.77 (5.15) | 2.17 (5.81) | 1.45 (5.55) | 1.82 (6.17) | 2.75 (4.31) | 2.27 (3.85) | 1.99 (5.19) |
| Arsenocholine | 0.41 (1.18) | 0.43 (1.21) | 0.43 (1.19) | 0.44 (1.35) | 0.21 (1.41) | 0.10 (1.78) | 0.30 (1.91) |
| Dimethylarsonic acid | 3.84 (2.20) | 4.01 (2.18) | 3.88 (2.19) | 3.87 (2.39) | 4.14 (2.41) | 3.46 (2.19) | 3.85 (2.27) |
| Monomethylarsonic acid | 0.82 (1.65) | 0.85 (1.56) | 0.85 (1.56) | 0.83 (1.60) | 0.79 (1.55) | 0.42 (2.41) | 0.73 (1.85) |
| Barium | 1.29 (2.64) | 1.33 (2.80) | 1.28 (2.71) | 1.30 (2.59) | 1.07 (2.69) | 0.97 (2.79) | 1.20 (2.72) |
| Cadmium | 0.29 (2.69) | 0.26 (2.74) | 0.27 (2.65) | 0.25 (2.62) | 0.22 (2.78) | 0.18 (2.94) | 0.24 (2.77) |
| Cobalt | 0.31 (2.24) | 0.37 (2.27) | 0.35 (2.16) | 0.34 (2.28) | 0.31 (2.29) | 0.37 (2.27) | 0.34 (2.26) |
| Cesium | 4.54 (2.08) | 4.63 (2.02) | 4.38 (1.97) | 4.11 (1.94) | 3.89 (2.00) | 3.92 (1.99) | 4.22 (2.00) |
| Lead | 0.70 (2.15) | 0.65 (2.43) | 0.57 (2.34) | 0.53 (2.33) | 0.41 (2.45) | 0.32 (2.51) | 0.51 (2.47) |
| Molybdenum | 38.27 (2.42) | 42.97 (2.33) | 42.17 (2.43) | 40.46 (2.37) | 36.95 (2.41) | 32.42 (2.47) | 38.64 (2.42) |
| Thallium | 0.14 (2.12) | 0.15 (2.06) | 0.14 (2.06) | 0.14 (2.05) | 0.15 (2.07) | 0.14 (2.14) | 0.14 (2.09) |
| Tungsten | 0.06 (2.61) | 0.08 (2.74) | 0.09 (2.76) | 0.07 (2.67) | 0.07 (2.69) | 0.05 (2.95) | 0.07 (2.77) |
| Uranium | 0.01 (2.23) | 0.01 (2.63) | 0.01 (2.75) | 0.01 (2.70) | 0.01 (2.44) | 0.01 (2.97) | 0.01 (2.65) |
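The summary statistics in this table are geometric means with geometric standard deviations. As a minimal sketch of how such values are computed, the function below exponentiates the mean and the sample standard deviation of the log-transformed concentrations. It is unweighted, which is an assumption (the source does not state here whether survey weights were applied), and the input values are hypothetical.

```python
import math


def geometric_mean_and_gsd(values):
    """Return (geometric mean, geometric SD) of strictly positive values."""
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean_log = sum(logs) / n
    # Sample variance of the log values (n - 1 denominator).
    var_log = sum((x - mean_log) ** 2 for x in logs) / (n - 1)
    return math.exp(mean_log), math.exp(math.sqrt(var_log))


# Hypothetical concentrations, for illustration only.
gm, gsd = geometric_mean_and_gsd([1.0, 2.0, 4.0])
```

A geometric SD is a multiplicative factor, so it is dimensionless and always at least 1.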
Fig. 2 Heat map of Spearman correlations between metal biomarkers. An asterisk next to a metal name indicates a metal measured in whole blood. As, arsenic; As III, arsenous acid; As V, arsenic acid; MMA, monomethylarsonic acid; DMA, dimethylarsonic acid; Mo, molybdenum
Fig. 3 Selected predictors of serum gamma-glutamyl transferase (GGT) in the adaptive elastic net: main effects (diagonal cells) and pairwise interactions (off-diagonal cells). Bubble size indicates the magnitude of the association, and the number inside each bubble is the p-value. An asterisk next to a metal name indicates a metal measured in whole blood. As, arsenic; As III, arsenous acid; As V, arsenic acid; MMA, monomethylarsonic acid; DMA, dimethylarsonic acid; Mo, molybdenum
Comparison of ERS distribution and risk prediction performance by different statistical approaches
| | Base Model<sup>a</sup> | AENET-M | AENET-I | BART | BKMR | SL | Full Model<sup>b</sup> |
|---|---|---|---|---|---|---|---|
| Distributions of ERS | |||||||
| Training Set | |||||||
| Mean (SD) | – | 0.00 (0.05) | 0.00 (0.06) | 0.00 (0.06) | 0.00 (0.27) | 0.00 (0.06) | – |
| Range | – | (−0.18, 0.25) | (−0.26, 0.49) | (−0.22, 0.45) | (−1.01, 1.83) | (−0.18, 0.27) | – |
| Testing Set | |||||||
| Mean (SD) | – | 0.00 (0.05) | 0.00 (0.06) | 0.00 (0.05) | 0.01 (0.08) | 0.00 (0.04) | – |
| Range | – | (−0.22, 0.19) | (−0.22, 0.35) | (−0.22, 0.32) | (−0.34, 0.66) | (−0.17, 0.24) | – |
| Risk Prediction Performance | |||||||
| Continuous GGT<sup>c</sup> | |||||||
| Training Set | |||||||
| Correlation<sup>d</sup> | – | 0.22 | 0.24 | 0.35 | 0.82 | 0.75 | – |
| MSE | 7.2E-02 | 7.0E-02 | 6.9E-02 | 6.4E-02 | 2.6E-04 | 3.6E-02 | 6.7E-02 |
| Testing Set | |||||||
| Correlation<sup>d</sup> | – | 0.25 | 0.27 | 0.20 | 0.00 | 0.26 | – |
| PRESS | 332.9 | 320.6 | 316.1 | 325.1 | 332.3 | 321.7 | 327.2 |
| MSPE | 6.9E-02 | 6.6E-02 | 6.5E-02 | 6.7E-02 | 6.9E-02 | 6.6E-02 | 6.8E-02 |
| Dichotomous GGT<sup>e</sup> | |||||||
| Training Set | |||||||
| AUC | 0.67 | 0.70 | 0.71* | 0.75† | >0.99‡ | 0.92‡ | 0.73† |
| 95% CI | (0.64, 0.69) | (0.67, 0.72) | (0.68, 0.73) | (0.73, 0.78) | (0.99, 1.00) | (0.91, 0.93) | (0.70, 0.75) |
| Testing Set | |||||||
| AUC | 0.66 | 0.69 | 0.70* | 0.69 | 0.66 | 0.70* | 0.68 |
| 95% CI | (0.64, 0.68) | (0.67, 0.71) | (0.68, 0.72) | (0.66, 0.71) | (0.64, 0.68) | (0.67, 0.72) | (0.66, 0.70) |
AENET-M, adaptive elastic net for main effects; AENET-I, adaptive elastic net for main effects and pairwise interactions; BART, Bayesian additive regression tree; BKMR, Bayesian kernel machine regression; SL, Super Learner; GGT, gamma-glutamyl transferase; MSE, mean square error; PRESS, predicted residual sums of squares; MSPE, mean square prediction error; AUC, area under the receiver operating characteristic curve
<sup>a</sup> Base model contains only covariates (age, sex, race/ethnicity, smoking status, education, body mass index, urinary creatinine)
<sup>b</sup> Full model contains all covariates, main effects, and all possible pairwise interactions of metals
<sup>c</sup> GGT was logarithmically transformed. Mean (SD) of log(GGT) = 0.27 (0.21)
<sup>d</sup> Correlation between GGT and ERS
<sup>e</sup> GGT was dichotomized at the 90th percentile (50 U/L)
* P < 0.1, † P < 0.05, ‡ P < 0.01. P-values were computed with permutation tests comparing each AUC with the AUC of the base model
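The performance measures in this table can be sketched as follows, assuming the usual definitions: PRESS is the sum of squared prediction residuals in the testing set, MSPE is PRESS divided by the number of test observations, and AUC is the probability that a randomly chosen case scores higher than a randomly chosen non-case. The permutation test shown here (randomly swapping the two models' scores within each subject) is one common scheme and an assumption; the source does not describe the exact procedure. It also presumes both models' scores are on the same scale. All function names and data are hypothetical.

```python
import random


def press(y_true, y_pred):
    """Sum of squared prediction residuals."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred))


def mspe(y_true, y_pred):
    """Mean square prediction error: PRESS divided by the sample size."""
    return press(y_true, y_pred) / len(y_true)


def auc(labels, scores):
    """Share of case/non-case pairs ranked correctly; ties count one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores_base, scores_new, n_perm=2000, seed=0):
    """One-sided p-value for AUC(new) > AUC(base) via within-subject swaps."""
    rng = random.Random(seed)
    observed = auc(labels, scores_new) - auc(labels, scores_base)
    count = 0
    for _ in range(n_perm):
        perm_base, perm_new = [], []
        for sb, sn in zip(scores_base, scores_new):
            if rng.random() < 0.5:
                sb, sn = sn, sb  # swap the two models' scores for this subject
            perm_base.append(sb)
            perm_new.append(sn)
        if auc(labels, perm_new) - auc(labels, perm_base) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Hypothetical labels and scores, for illustration only.
labels = [0, 0, 1, 1]
base_scores = [0.1, 0.4, 0.35, 0.8]
new_scores = [0.1, 0.2, 0.6, 0.9]
auc_base = auc(labels, base_scores)
p_value = permutation_p_value(labels, base_scores, new_scores, n_perm=200)
```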
Fig. 4 Odds ratios (95% confidence intervals) of having high GGT (50 U/L and above) comparing the highest vs. the lowest quintiles of ERS and of the individual pollutants that compose the ERS in the testing set. All models were adjusted for age, BMI, creatinine, gender, race/ethnicity, smoking status and education
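The highest-versus-lowest quintile contrast can be illustrated with a simplified sketch. It computes a crude (unadjusted) odds ratio from a 2x2 table with a Woolf-type 95% confidence interval, whereas the estimates in the figure come from covariate-adjusted models. The function names and the data are hypothetical.

```python
import math


def quintile_labels(values):
    """Assign quintiles 1..5 by rank so the groups are (nearly) equal in size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(values) + 1
    return labels


def odds_ratio_q5_vs_q1(ers, high_ggt):
    """Crude OR of high GGT for quintile 5 vs. quintile 1, with a 95% CI.

    Raises ZeroDivisionError if any cell of the 2x2 table is empty.
    """
    q = quintile_labels(ers)
    a = sum(1 for qi, y in zip(q, high_ggt) if qi == 5 and y == 1)
    b = sum(1 for qi, y in zip(q, high_ggt) if qi == 5 and y == 0)
    c = sum(1 for qi, y in zip(q, high_ggt) if qi == 1 and y == 1)
    d = sum(1 for qi, y in zip(q, high_ggt) if qi == 1 and y == 0)
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = odds_ratio * math.exp(-1.96 * se_log_or)
    upper = odds_ratio * math.exp(1.96 * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical data: 20 subjects with increasing ERS, four per quintile.
ers = [float(i) for i in range(20)]
high_ggt = [1, 0, 0, 0] + [0] * 12 + [1, 1, 1, 0]
crude_or, (ci_low, ci_high) = odds_ratio_q5_vs_q1(ers, high_ggt)
```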
Associations of health endpoints (blood pressure, hypertension, and mortality) with ERSs derived from different statistical approaches. For comparison, associations with blood lead and blood cadmium are also presented
| | | AENET-M | AENET-I | BART | SL | Blood Lead | Blood Cadmium |
|---|---|---|---|---|---|---|---|
| SBP | β | 0.03 | 0.69 | 0.52 | 1.03 | 0.78 | 0.28 |
| DBP | β | 1.12 | 1.50 | 0.90 | 1.61 | 0.97 | 0.29 |
| Hypertension | OR | 1.11 | 1.26 | 1.17 | 1.30 | 1.08 | 1.06 |
| Total mortality | HR | 1.07 | 1.07 | 1.07 | 1.15 | 1.08 | 1.37 |
| CVD mortality | HR | 0.99 | 1.09 | 1.07 | 0.98 | 0.92 | 1.24 |
| Cancer mortality | HR | 1.50 | 1.24 | 1.23 | 1.23 | 1.41 | 1.50 |
AENET-M, adaptive elastic net for main effects; AENET-I, adaptive elastic net for main effects and pairwise interactions; BART, Bayesian additive regression tree; SL, Super Learner
Effect estimates (β, odds ratio (OR), and hazard ratio (HR)) are expressed per standardized increment, equivalent to a one standard deviation increase in each ERS. All models were adjusted for age (except mortality outcomes), sex, race/ethnicity, body mass index, smoking status, and education
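Expressing an estimate per one standard deviation amounts to rescaling the per-unit coefficient by the SD of the score. The sketch below shows that arithmetic under the assumption that the model coefficient is on the linear scale for β and on the log scale for OR and HR. The function name and the numbers are hypothetical and unrelated to the values in the table.

```python
import math
import statistics


def per_sd_estimates(coef_per_unit, ers_values):
    """Rescale a per-unit coefficient to a one-SD increment of the ERS.

    Returns (beta_per_sd, ratio_per_sd). Use beta_per_sd for linear outcomes
    such as blood pressure; use ratio_per_sd as the OR or HR when the
    coefficient comes from a logistic or Cox model (log scale).
    """
    sd = statistics.stdev(ers_values)  # sample standard deviation
    beta_per_sd = coef_per_unit * sd
    return beta_per_sd, math.exp(beta_per_sd)


# Hypothetical ERS values and coefficient, for illustration only.
beta_sd, ratio_sd = per_sd_estimates(2.0, [0.0, 0.1, 0.2])
```

Scaling by the SD puts scores with very different ranges on a comparable footing when their effect estimates are read side by side.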