BACKGROUND: Both genetic and environmental factors contribute to triglyceride, low-density lipoprotein-cholesterol (LDL-C), and high-density lipoprotein-cholesterol (HDL-C) levels. Although genome-wide association studies are currently testing the genetic factors systematically, testing and reporting one or a few factors at a time can lead to fragmented literature for environmental chemical factors. We screened for correlation between environmental factors and lipid levels, utilizing four independent surveys with information on 188 environmental factors from the Centers of Disease Control, National Health and Nutrition Examination Survey, collected between 1999 and 2006. METHODS: We used linear regression to correlate each environmental chemical factor to triglycerides, LDL-C and HDL-C adjusting for age, age(2), sex, ethnicity, socio-economic status and body mass index. Final estimates were adjusted for waist circumference, diabetes status, blood pressure and survey. Multiple comparisons were controlled for by estimating the false discovery rate and significant findings were tentatively validated in an independent survey. RESULTS: We identified and validated 29, 9 and 17 environmental factors correlated with triglycerides, LDL-C and HDL-C levels, respectively. Findings include hydrocarbons and nicotine associated with lower HDL-C and vitamin E (γ-tocopherol) associated with unfavourable lipid levels. Higher triglycerides and lower HDL-C were correlated with higher levels of fat-soluble contaminants (e.g. polychlorinated biphenyls and dibenzofurans). Nutrients and vitamin markers (e.g. vitamins B, D and carotenes), were associated with favourable triglyceride and HDL-C levels. CONCLUSIONS: Our systematic association study has enabled us to postulate about broad environmental correlation to lipid levels. Although subject to confounding and reverse causality bias, these findings merit evaluation in additional cohorts.
Serum lipid levels are risk factors for coronary heart disease (CHD), atherosclerosis, type 2 diabetes and stroke. Both genetic and environmental factors influence lipid level phenotypes. For example, lifestyle factors such as physical exercise, smoking and diet have well-documented relationships with lipid levels. There are a number of reports connecting lipid levels, cardiovascular disease, type 2 diabetes and the metabolic syndrome with specific persistent pollutants, such as dioxins, organochlorinated pesticides, dibenzofurans and polychlorinated biphenyls (PCBs). Other less tangible environmental factors, such as air pollution, may also have an adverse relationship with lipid levels.
Although extensive efforts are underway to dissect genetic components with genome-wide association studies (GWASs), similar studies to systematically identify specific environmental factors are lacking. Results of epidemiologic studies, which typically test one or a few factors at a time, may be further distorted by selective reporting of subsets of analyses, outcomes and adjustments. It has been postulated that this contributes to a fragmented, ultimately unreliable literature. More importantly, the phenomenon of environmental exposure is complex and influenced by differences in individuals, time, place and other exposures. Humans are exposed not to a few but to many adverse or protective environmental factors simultaneously. Because of this complexity, the net effects of environmental factors on human health may be miscalculated when considering a few factors at a time. Protective effects of environmental factors are not usually considered in the context of adverse effects of other co-existing factors, potentially leading to a lack of physiological coherence and public health relevance.
Health surveys and bio-monitoring projects, in which multiple environmental factors are simultaneously measured, provide an opportunity to hypothesize about how a system of environmental factors relates to disease and other characteristics among the general population.
We take a more systematic approach to associating multiple environmental chemical factors with serum lipid levels, similar to a GWAS, utilizing the National Health and Nutrition Examination Survey (NHANES), a nationally representative health survey. Instead of testing a few associations at a time, we evaluate 188 environmental factors for association with lipid levels while accounting for the multiplicity of comparisons. The emerging significant associations are then validated in an independent NHANES dataset. Further, we conduct systematic sensitivity analyses among the measured confounders to estimate bias. We term this method an ‘Environment-wide Association Study’, or EWAS.
Using such an analytic procedure, we have found and validated 29, 9 and 17 markers for environmental chemical factors correlated with triglycerides, low-density lipoprotein-cholesterol (LDL-C) and high-density lipoprotein-cholesterol (HDL-C), respectively, including a spectrum of persistent organic pollutants, nutrients and vitamins. Many of these factors have been explored before in association with related diseases, such as type 2 diabetes, obesity, lipid levels and the metabolic syndrome. However, each of these studies addresses issues of model adjustment, variable coding and assessment of effects in different ways, possibly leading to conflicting study results. In this systematic study, we propose one type of analytic process to unify and standardize these analyses. Specifically, we assess how environmental factors are correlated among themselves and with changes in serum lipid levels, while consistently adjusting for other factors such as age, sex, ethnicity, socio-economic status (SES) and body mass index (BMI).
Methods
Data
We downloaded all available NHANES laboratory and questionnaire data for the 1999–2000, 2001–02, 2003–04 and 2005–06 surveys. Laboratory data included serum and urine measures of environmental factors and clinical measures including lipid levels. Each survey is an independent, non-overlapping sampling of participants representative of the general US population. We analysed factors that were a direct measurement of an extrinsic environmental factor (e.g. amount of pesticide or heavy metal in urine or blood). We did not consider intrinsic physiological measures (e.g. red blood cell count or albumin) or responses to questionnaires except for sensitivity analyses.
We used three of the four surveys (1999–2000, 2001–02, 2005–06) to test multiple environmental factors for association with lipid levels and reserved one survey (2003–04) for validation testing of findings. As each survey had a different set and number of environmental factors measured, we selected 2003–04 as the validation survey because it had the largest number of factors shared with each of the other surveys, maximizing the number of factors that could be validated.
We eliminated from our analyses 119 factor variables for which most observations were below the National Center for Health Statistics (NCHS) documented limit of detection or which, for categorical factor variables, varied little. Specifically, we omitted continuous variables if 99% of the observations were deemed below the threshold limit of detection. For categorical factor variables, we omitted those that had 99% of the observations belonging to one category. After eliminating these 119 factor variables, we were left with 169 variables from the 1999–2000 survey, 182 from 2001–02, 96 from 2005–06 and 258 from the 2003–04 (validation) survey. Next, we selected factors from each survey that were present in the validation survey. This left us with a total of 188 unique factors that could be validated, of which 126 were from the 1999–2000 survey, 157 from 2001–02 and 65 from 2005–06. Using a categorization provided by NHANES, we binned these factors into 26 ‘classes’ of related factors (Figure 1A, Supplementary Table 1, available as Supplementary Data at IJE online).
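The exclusion rule described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the treatment of missing values and the reading of "99%" as "at least 99%" are assumptions made for illustration only.

```python
import numpy as np

def keep_continuous(values, lod, max_frac_below=0.99):
    """Keep a continuous factor unless >= 99% of observed values fall below the LOD.

    `lod` is the documented limit of detection; the >= reading of the 99%
    rule is an assumption, not something the text states explicitly.
    """
    values = np.asarray(values, dtype=float)
    observed = values[~np.isnan(values)]  # ignore missing measurements
    if observed.size == 0:
        return False
    frac_below = np.mean(observed < lod)
    return bool(frac_below < max_frac_below)

def keep_categorical(values, max_frac_single=0.99):
    """Keep a categorical factor unless >= 99% of observations share one category."""
    values = [v for v in values if v is not None]
    if not values:
        return False
    _, counts = np.unique(values, return_counts=True)
    return bool(counts.max() / counts.sum() < max_frac_single)
```

A factor that passes either check would then still need to be present in the validation survey to enter the screen.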
Figure 1
Summary of environmental factors and analytic method. (A) Summary of the 26 factor classes and the number of factors within them for each NHANES test survey. (B) 100–7500 individuals had their HDL-C, LDL-C and triglyceride levels measured for each of these factors in each survey; these lipid levels were log transformed to assume normality for least squares regression. (C) Each of these 126, 157 and 65 factors was tested for association with the logarithm base 10 of HDL-C, LDL-C and triglyceride levels with a linear regression model adjusted for age, age², sex, BMI, ethnicity and SES. (D) We estimated the FDR by permuting the lipid levels and re-computing the linear models; an FDR of 0.05 was considered significant. We deemed a factor to be tentatively validated if it was found to be significant in the validation survey with P ≤ 0.05 and an effect in the same direction. (E) We estimated a final coefficient for tentatively validated factors by combining all surveys and adjusting for age, age², sex, ethnicity, SES, BMI, waist circumference, type 2 diabetes status (fasting blood glucose ≥ 126 mg/dl), blood pressure and survey. (F) We estimated the coefficient of determination (R²) for the final, combined models. (G) We recomputed our final models, adding 62 self-report variables one by one to attempt to check the validity of the environmental effect.
Different environmental factors were measured in varying numbers of participants: 109–3610 (median 938), 101–3388 (median 896) and 222–7485 (median 1958) individuals for triglyceride, LDL-C and HDL-C levels, respectively (Figure 1B). Individuals are selected randomly based on their demographic characteristics for the complex, stratified survey. Serum triglyceride levels were measured in the morning after >8.5 h of fasting. LDL-C levels were derived from total cholesterol and direct HDL-C measurements using the Friedewald calculation.
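For readers unfamiliar with it, the Friedewald calculation can be sketched as follows. The formula (LDL-C = total cholesterol − HDL-C − triglycerides/5, all in mg/dl) is the standard published form. The function name is hypothetical, and the 400 mg/dl triglyceride cut-off is the conventional limit of validity rather than something stated in this article.

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL-C (mg/dl) from fasting total cholesterol, HDL-C and triglycerides.

    All inputs are in mg/dl. Triglycerides / 5 approximates VLDL cholesterol.
    """
    if triglycerides >= 400:
        # Conventional limit of validity; an assumption, not taken from the article.
        raise ValueError("Friedewald estimate is unreliable for triglycerides >= 400 mg/dl")
    return total_chol - hdl - triglycerides / 5.0
```

Because the estimate depends on fasting triglycerides, it is only meaningful for participants in the fasting subsample.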
Correlation between factors
We computed the pair-wise partial Pearson correlation coefficient between each pair of environmental factors using the test and validation surveys separately, adjusting for age and BMI in addition to creatinine levels for urinary measures. Since we had 188 environmental factors, the total number of possible pairs of factors (and correlations) equals 17 578 (188 × 187/2); however, 4455 (25%) of all possible pairs of factors were not measured in the same individuals and, as a result, their correlations could not be computed. We assessed correlations between factors in the test and validation cohorts separately and compared their relative strength by estimating percentiles of the entire distribution of correlations. We also compared correlations within classes (‘intra-class’ correlations) and between classes (‘inter-class’ correlations). For factors measured in more than one of the test surveys, the coefficients were combined using a meta-analytic random effects method.
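A partial Pearson correlation of this kind can be sketched in Python by regressing both factors on the adjusting variables and correlating the residuals. This is a minimal, unweighted illustration: the function name is hypothetical and survey weights are ignored. The last lines reproduce the pair counts quoted above.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Pearson correlation of x and y after removing the linear effect of covariates.

    covariates: array of shape (n,) or (n, k), e.g. age and BMI
    (plus creatinine for urinary measures).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(x)), covariates])  # intercept + covariates
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]    # residual of x given Z
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]    # residual of y given Z
    return float(np.corrcoef(rx, ry)[0, 1])

n_factors = 188
n_pairs = n_factors * (n_factors - 1) // 2  # 17578 possible pairs
n_computable = n_pairs - 4455               # 13123 pairs with shared participants
```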
Correlation of environmental factors with lipid levels
The systematic analysis encompasses multiple steps (Figure 1C–G), which we term an ‘environment-wide association study’ (EWAS). First, survey-weighted linear regressions are performed, whereby log10-transformed lipid levels are dependent variables, modelled as a function of each environmental factor and age, age², sex, BMI, ethnicity and SES (Figure 1C). For SES, we used the tertile of poverty index (participant's household income divided by the time-adjusted poverty threshold), as previously described. Ethnicity was coded in five groups (Mexican American, Non-Hispanic Black, Non-Hispanic White, Other Hispanic, Other). We used the R survey package for all survey-weighted analyses with appropriate pseudo-strata, pseudo-sampling units and weights to accommodate the complex sampling of the data.
Chemical exposure data arising from mass spectrometry or absorption measurements were log-transformed. We used z-scores [standard deviations (SDs) from the mean] to compare effect sizes; specifically, effect sizes for these variables denote change in lipid levels for a change in 1 SD of exposure. For binary variables, such as presence/absence assays for infectious agents, effect sizes denote change in lipid levels for those with presence of a factor vs those without.
We calculated the false discovery rate (FDR), the estimated proportion of false discoveries made vs the number of total discoveries made for a given significance level α, to control for multiple hypothesis testing (Figure 1D). We created a ‘null distribution’ of regression test statistics for each survey separately, permuting the triglycerides, HDL-C and LDL-C levels 1000 times and refitting the linear regression models, collecting the test statistics for the coefficients corresponding to the environmental factor. In other words, the distributions of the lipid levels were not changed, but the levels were randomly assigned to different individuals in the survey.
The FDR is the ratio of the number of coefficients called significant at a given level α in the null distribution to the number of results called significant from our real screen (Supplementary Methods, available as Supplementary Data at IJE online). We used an FDR of < 5% to select significant associations. We used an independent survey, the 2003–04 survey, to validate significant findings (Figure 1D). We considered a significant factor as ‘tentatively validated’ if it was significant (P < 0.05) in the validation survey.
We then fit a final linear regression model with data combined from the four independent NHANES surveys for each tentatively validated environmental factor, attaining an overall estimate and P-value (Figure 1E). We utilized the larger sample size to adjust for additional co-variates that also influence lipid levels but that we were unable to adjust for in the single survey analyses (due to limited residual degrees of freedom). In addition to the initial covariates, we also adjusted for waist circumference, type 2 diabetes status (approximated by fasting blood glucose ≥126 mg/dl), systolic and diastolic blood pressure (mm Hg) and survey. To estimate how much of the variance was described by each environmental factor, we estimated the change in the coefficient of determination (R²) when adding that factor vs a model including only the adjusting factors (Figure 1F). We also performed regressions on untransformed lipid levels to estimate raw effect size.
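The permutation-based FDR estimate can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation. It uses ordinary (unweighted) least squares rather than survey-weighted regression and expresses the significance level as a threshold on the absolute t-statistic. It also averages the null counts over permutations, which is one plausible reading of the ratio described above. The function names `t_stat` and `permutation_fdr` are hypothetical.

```python
import numpy as np

def t_stat(y, factor, covariates):
    """t-statistic of the factor coefficient in an OLS fit of y on factor + covariates."""
    X = np.column_stack([np.ones(len(y)), factor, covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def permutation_fdr(y, factors, covariates, threshold, n_perm=1000, seed=0):
    """Estimate the FDR for calling |t| >= threshold significant.

    factors: array of shape (n, k), one column per environmental factor.
    Returns (fdr, number of factors called significant in the real data).
    """
    rng = np.random.default_rng(seed)
    observed = np.array([abs(t_stat(y, f, covariates)) for f in factors.T])
    n_called = int(np.sum(observed >= threshold))
    if n_called == 0:
        return float("nan"), 0  # FDR is undefined when nothing is called
    null_counts = []
    for _ in range(n_perm):
        y_perm = rng.permutation(y)  # same lipid distribution, shuffled across people
        null = np.array([abs(t_stat(y_perm, f, covariates)) for f in factors.T])
        null_counts.append(np.sum(null >= threshold))
    # Expected number of null "discoveries" divided by actual discoveries.
    fdr = min(1.0, float(np.mean(null_counts)) / n_called)
    return fdr, n_called
```

In practice the threshold would be lowered or raised until the estimated FDR falls just under 5%, and the whole procedure would be repeated for each survey and each lipid outcome.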
Sensitivity analyses
We conducted sensitivity analyses to account for recent food, alcohol, supplements, medications, exercise and history of cardiovascular health (Figure 1G). In total, 62 questionnaire items were used (Supplementary Table 2, available as Supplementary data at IJE online). To evaluate the impact of these 62 adjusting variables, we recomputed the regression models by adding each variable to our final model one by one and observed the change in the effect size for each environmental factor. We also built a model adjusting for lipid-lowering drugs, supplement use, exercise and self-report cardiovascular-related disease simultaneously. More details can be found in the Supplementary Data (available at IJE online).
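The one-by-one adjustment can be illustrated with a short sketch that refits the model with each extra variable and reports the relative change in the environmental factor's coefficient. It uses unweighted least squares, and the function names and the dictionary layout for the questionnaire variables are assumptions for illustration only.

```python
import numpy as np

def ols_coef(y, X):
    """Ordinary least squares coefficients for y ~ X (X already includes an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def sensitivity_changes(y, factor, base_covariates, extra_variables):
    """Percent change in the factor's effect when each extra variable is added alone.

    extra_variables: dict mapping a (hypothetical) questionnaire item name
    to an array of values, one per participant.
    """
    n = len(y)
    base_X = np.column_stack([np.ones(n), factor, base_covariates])
    base_effect = ols_coef(y, base_X)[1]
    changes = {}
    for name, values in extra_variables.items():
        X = np.column_stack([base_X, values])  # final model + one extra variable
        effect = ols_coef(y, X)[1]
        changes[name] = 100.0 * (effect - base_effect) / base_effect
    return base_effect, changes
```

A large negative change flags a variable whose inclusion attenuates the association, while a change near zero suggests the estimate is robust to that adjustment.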
Power calculations
We estimated that our analyses had >80% median power for all surveys for detection of 5% change in HDL-C and LDL-C and 10% change of triglyceride levels for P-values corresponding to an FDR of 5%.
Results
Demographic and baseline associations with lipid levels
Tables 1–3 describe the multivariate relationship between baseline demographics and lipid levels. As expected, demographics, BMI, ethnicity and SES are associated with lipid levels. For example, consistent positive correlations existed between age and triglycerides (5–10% higher per 10 years, P < 0.02) and between BMI and triglycerides (2% higher per 1 unit of BMI, P < 0.004), and consistent negative correlations existed between Black ethnicity and triglycerides (13% lower vs White, P < 0.001). Consistent polynomial relationships existed between age and both HDL-C and LDL-C. Negative correlations existed between BMI and HDL-C (1% lower per BMI unit, P < 0.0001). In addition, SES was associated with HDL-C (1–5% lower for lower vs higher tertile, P < 0.03). These results indicated that BMI, ethnicity, sex, SES, age and age² were all covariates that needed to be controlled for in our first stage analysis.
Estimates of multivariate linear regression model predicting log10(triglycerides) as a function of sex, age, age², ethnicity (in reference to Whites), an estimate of SES (in reference to high SES) and BMI for each survey. The 95% confidence interval (CI) and P-value of associations are also shown. n is unweighted sample size.
Estimates of multivariate linear regression model predicting log10(LDL-C) as a function of sex, age, age², ethnicity, SES and BMI for each survey. 95% CI: 95% confidence interval.
Estimates of multivariate linear regression model predicting log10(HDL-C) as a function of sex, age, age², ethnicity, SES and BMI for each survey. 95% CI: 95% confidence interval.
Factor correlations
We computed the partial Pearson correlation between each pair of environmental chemical factors tested where pair-wise data were available. Of the 17 578 possible correlations, 13 123 correlations could be computed (‘Methods’ section). These 13 123 correlations were adjusted for BMI and age in addition to creatinine levels for urinary measures. We computed the correlations in the test surveys and verified them in the validation survey (P < 0.05 in both test and validation surveys and with the same sign); after this verification step, we were left with a total of 11 672 confirmed correlations (Figure 2). The 5th, 10th, 90th and 95th percentiles of verified partial Pearson correlations were −0.11, −0.07, 0.26 and 0.38, respectively (lower left panel histogram, Figure 2).
Figure 2
Partial Pearson's correlation between environmental factors. Partial Pearson's correlation, adjusted by age and BMI (and creatinine for factors measured in urine) for each of the 188 factors were computed for each survey separately. We combined correlations between surveys using a meta-analytic random-effects estimate and displayed them in a heatmap (above), and ordered them by environmental ‘class’, coloured as in Figure 1A. Pairs of factors where correlations could not be computed are shown in grey
The intra-class partial correlations were higher than between-class correlations in both the test surveys and validation survey (test surveys: mean ρ = 0.26, t-test P < 1e−10; validation survey: mean ρ = 0.27, t-test P < 1e−10). Specifically, in the test surveys the intra-class correlation was 0.41 for PCBs, 0.42 for dioxins, 0.5 for carotenoid nutrients, 0.2 for heavy metals, 0.2 for hydrocarbons, 0.3 for phytoestrogens, 0.3 for phthalates and 0.2 for phenols. We observed similar patterns in the validation survey. We observed several instances of large inter-class correlations, such as inverse correlations between carotenoid and vitamin E factors (trans-β-carotene and γ-tocopherol, ρ = −0.3). We also observed positive correlations between cotinine and the heavy metals lead and cadmium (ρ > 0.3) and between cotinine and the hydrocarbons 2- and 3-hydroxyfluorene (ρ > 0.5). Similarly, we observed gross inter-class correlations between classes such as furans and dioxins (mean ρ = 0.4), PCBs and dioxins (mean ρ = 0.2), PCBs and organochlorine pesticides (mean ρ = 0.2) and phthalates and hydrocarbons (mean ρ = 0.2).
Environment associations with lipid levels
For triglyceride levels, 15 out of 126, 29 out of 157 and 12 out of 65 factors passed the requested threshold of significance (FDR <5%) for the 1999–2000, 2001–02 and 2005–06 surveys, respectively (Figure 3A). For LDL-C, 2 out of 131, 10 out of 162 and 9 out of 65 were significant, respectively (Figure 3B). For HDL-C, 1 out of 131, 26 out of 162 and 15 out of 65 were significant (Figure 3C). We tentatively validated significant findings from our screen, by searching for whether any of the factors significant in any of the three studies above were also significant in the fourth independent 2003–04 survey at P < 0.05. We found 29, 9 and 17 tentatively validated factors for triglycerides, LDL-C and HDL-C, respectively (Figure 3A–C).
Figure 3
Significance of association [−log10(FDR)] for each of 188 factors by survey in association to (A) triglycerides, (B) LDL-C, (C) HDL-C. Y-axis indicates −log10(FDR) of the adjusted linear regression coefficient for each of the environmental factors. Colours represent different environmental classes as represented in Figure 1A. Red line corresponds to an FDR of 0.05. Findings validated in the 2003–04 survey are seen in the open markers
The data were combined across surveys for each tentatively validated factor and estimates were further adjusted for waist circumference, type 2 diabetes status, blood pressure and survey, in addition to age, age-squared, sex, BMI, SES and ethnicity. The variance ascribed to baseline co-variates was 22–25% (triglycerides), 15–16% (LDL-C) and 23–26% (HDL-C). Each of the tentatively validated environmental factors described an additional 0.7–18.4% (triglycerides), 1.8–14.1% (LDL-C) and 0.4–4.0% (HDL-C) of the variance in lipid levels (Supplementary Tables 3–5, available as Supplementary Data at IJE online).
Effects for the tentatively validated associations are shown in Figure 4. We present some of them here in more detail. Effect sizes for continuous variables are for 1 SD of log-transformed value of the environmental factor.
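To make the reported units concrete, the sketch below shows how an exposure can be standardized on the log scale and how a regression coefficient on log10-transformed lipids maps to a percent change. The conversion (10^β − 1) × 100 is the standard back-transformation for a log10 outcome. It is an assumption that the percentages reported here were derived in exactly this way, and the function names are hypothetical.

```python
import numpy as np

def zscore_log(exposure):
    """Log-transform an exposure and express it in SDs from the mean.

    The z-score does not depend on the base of the logarithm, so log10 is
    used here purely for convenience.
    """
    logged = np.log10(np.asarray(exposure, dtype=float))
    return (logged - logged.mean()) / logged.std(ddof=1)

def percent_change(beta_log10):
    """Percent change in a lipid level per unit of the predictor,
    given a coefficient from a model of log10(lipid)."""
    return (10.0 ** beta_log10 - 1.0) * 100.0
```

For example, a coefficient of log10(1.10) on the standardized exposure corresponds to a 10% higher lipid level per 1 SD of logged exposure.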
Figure 4
Forest plots for validated environmental factors associated with (A) triglycerides, (B) LDL-C, (C) HDL-C. Survey (labelled as 1999–2000, 2001–02, 2005–06, filled points) denotes the NHANES survey in which the specific factor was found to be significant (FDR < 0.05) in a model adjusting for age, age², SES, ethnicity, sex and BMI. ‘Validation’ indicates the estimates found for the significant factor in the validation survey. Combined survey (unfilled points) denotes the estimate attained when combining all surveys available for exposure in a model adjusting for age, age-squared, SES, ethnicity, sex, BMI, waist circumference, type 2 diabetes status, blood pressure and survey. Percent change (x-axis) is the percent change of lipid level for a change in 1 SD of logged exposure value. Effect size (in mg/dl) is that attained when fitting the untransformed lipids to the model. Symbols are proportional to sample size and colours represent different environmental classes as represented in Figure 1A. For triglycerides and HDL-C, only the most significant factors for each factor class are shown; forest plots of all validated factors are shown in Supplementary Figure 1, available as Supplementary Data at IJE online.
Vitamins A and E: unfavourable association with lipid levels
For all three lipids, we found a consistent association for lipid-soluble, anti-oxidant vitamins, such as vitamins A, E, and carotenoids (Figure 4 A–C and Supplementary Figure 1, available as Supplementary Data at IJE online). For example, a form of vitamin A, retinol, was positively associated with triglycerides (P = 6 × 10⁻²¹, effect = 10% or 25 mg/dl higher triglycerides per 1 SD) in all surveys examined. Another form of vitamin A, retinyl palmitate, was also positively associated with triglycerides (P = 6 × 10⁻²¹, effect = 10%) and LDL-C (P = 4 × 10⁻¹³, effect = 5% or 6 mg/dl). Retinyl stearate was negatively associated with HDL-C (P = 4 × 10⁻⁵, effect = −3% or −1 mg/dl).
We observed a consistent association between forms of vitamin E (α- and γ-tocopherol) and lipid levels. α-tocopherol strongly correlated with higher triglyceride and LDL-C levels [effect = 35% (P = 8 × 10⁻²⁰) and 16% (P = 7 × 10⁻¹⁹) or 67 and 16 mg/dl, respectively]. γ-tocopherol was also correlated with higher triglycerides (effect = 17% higher, P = 10⁻¹⁷) and LDL-C (6% higher, P = 3 × 10⁻¹⁴) levels, but also with lower HDL-C (effect = −2%, P = 6 × 10⁻⁶). Tocopherols are highly lipophilic and their absorption is enhanced by triglycerides, though both were significant despite controlling for BMI and waist circumference.
Carotenoids: favourable association with HDL-C and triglycerides and unfavourable association with LDL-C
Both isomers of β-carotene, cis- and trans-, were associated with lower triglyceride levels (P = 10⁻⁶, effect = −7% or 12 mg/dl; P = 10⁻⁸, effect = −10% or 16 mg/dl, respectively). However, both isomers of carotene, in addition to other carotenoids such as β-cryptoxanthin and lycopene, were consistently associated with higher levels of both HDL-C and LDL-C. The effect was 5% (P = 3 × 10⁻¹²) and 6% (P = 5 × 10⁻¹¹) for HDL-C and LDL-C levels, respectively, for cis-β-carotene and 3% (P = 10⁻¹⁰) and 12% (P = 8 × 10⁻¹⁷) for lycopene.
Favourable lipid correlations with vitamins B, C, D, iron, mercury and enterolactone
We found serum levels of folate (vitamin B), vitamin C, vitamin D, iron and mercury to be favourably associated with HDL-C (Figure 4C). Effect sizes of vitamin, iron and mercury levels on HDL-C were similar, ranging from 3 to 4% (1–2 mg/dl) higher HDL-C (P < 0.002). Last, we found enterolactone, a product of lignan metabolism in the intestine, to be associated with 10% (17 mg/dl) lower triglyceride levels (P = 2 × 10⁻⁷) (Figure 4A).
Persistent pollutants: unfavourable association with triglycerides and HDL-C
PCBs, dibenzofurans and organochlorine pesticides, all persistent organic pollutants, were unfavourably associated with both triglyceride and HDL-C levels (Figure 4A and C). Seven PCB factors were tentatively validated and the most significant congeners, PCB74 and PCB170, were associated with 15% (P = 10⁻⁶) and 19% (P = 4 × 10⁻⁶) higher triglyceride levels. Five organochlorine factors were tentatively validated, among which oxychlordane and trans-nonachlor were linked to 29 and 30% higher (P = 5 × 10⁻⁹, 1 × 10⁻⁸) triglyceride levels. Another organochlorine pesticide, heptachlor epoxide, was associated with 3% lower HDL-C (P = 0.006).
Markers for air pollution and nicotine: unfavourable association with HDL-C
Several markers of air pollution and nicotine exposure were unfavourably associated with HDL-C (Figure 4C). The polyaromatic hydrocarbon markers of fluorene, 3-hydroxyfluorene and 2-hydroxyfluorene, were associated with 3% lower HDL-C (P = 0.006 and P = 0.004). Cotinine, a serum biomarker for nicotine, was also associated with 3% lower HDL-C (P = 2 × 10⁻⁶).
Sensitivity analyses with further adjustments
For most questionnaire variable adjustments, we did not see a sizable difference in estimated coefficients or P-values for the environmental factors (Supplementary Figures 2–4, available as Supplementary Data at IJE online), including questionnaire items regarding self-reported cardiovascular-related disease status and use of drugs. Interestingly, some adjustments increased the effect size of the environmental factor. For example, the association of cotinine, 3- and 2-hydroxyfluorene with HDL-C strengthened after adjustment for alcohol intake. Adjustment for fish and shellfish consumption strengthened the association between retinyl stearate and HDL-C and triglyceride levels. Conversely, the effect of vitamin C and folate in relation to HDL-C decreased when taking supplement count, total fiber intake and physical activity into account. Adjusting for supplement count decreased the effect of γ-tocopherol on HDL-C.
Simultaneous adjustment for self-reported cardiovascular-related disease, supplement count, lipid-lowering drugs and physical activity strengthened the association between tocopherols and pollutant factors and triglycerides, while attenuating the association with α-carotene (Supplementary Figure 2, available as Supplementary Data at IJE online). For HDL-C levels, effects of cotinine, mercury, 3- and 2-hydroxyfluorene, folate, vitamin C, vitamin D and γ-tocopherol were all attenuated by >15% (Supplementary Figure 4, available as Supplementary Data at IJE online). However, the direction and significance of the effects were preserved throughout.
Discussion
By combing through a large number of environmental exposures using a systematic approach, we have found and validated multiple previously known environmental chemical factors correlated with serum lipid levels beyond the level of false discovery. Populations are exposed to many environmental factors, both harmful and beneficial. It is possible that by studying only a few of these factors, we may miss major factors that truly influence disease. Further, by examining multiple factors, we may capture the relative effect of different factors as compared with others. This approach gives a broader, inclusive perspective of benefits and harms that may enhance the interpretation and overall public health relevance of this literature. Such an investigation is made possible by health survey data assaying multiple environmental factors; these surveys are critical to understanding their relationship with characteristics in the general population.
By using transparent reporting and estimation of the FDR, this approach bypasses the problem of selectively testing and reporting one or a few associations at a time, which has been debated as a source of biased results and false positives in epidemiological studies. We use the breadth of environmental factor and phenotypic measures to conduct extensive sensitivity and correlation analyses that are critical given the complex physiological web of correlation apparent in environmental epidemiological study. Relatedly, such a systematic display of a large number of associations (Figure 3) may enable us to create hypotheses regarding how multiple chemical factors might jointly contribute to phenotypic states. While we have focused here on modelling main effects, a next analytical step might include evaluating how mixtures of environmental factors are connected to lipid levels.
However, assessing interactions between environmental factors will add another layer of complexity, potentially requiring more power for the study.
We acknowledge that the approach has drawbacks. First, in our current scenario, some factors were present in more surveys than others and, therefore, have additional opportunity for tentative validation (defined as an FDR of < 5% in test cohorts and P < 0.05 in the validation cohort), potentially leading to a bias in factors found. Secondly, the method calls for multiple testing on different types of factors without consideration of priors and a strict FDR threshold is applied, giving way to the possibility of false negatives. Nevertheless, just as systematic genome-wide studies have had utility in finding novel genetic loci associated with complex disease, this EWAS strategy provides an opportunity to find novel markers of exposure and prioritize their validation in follow-up studies.
Our findings reveal complex relationships between serum lipid levels and fat-soluble antioxidant vitamins A and E and carotenoids. Randomized studies and meta-analyses have shown these vitamins to have no benefits or even confer harm when given in high doses, in contrast to previous favourable associations in observational studies. The unfavourable lipid profile that we observed with vitamin E forms is consistent with observational data, and possibly consistent with the randomized evidence on clinical outcomes.
We observed an association of vitamins B (folate), C and D, mercury and iron with higher HDL-C levels. Folate and vitamin D have previously been associated with higher HDL-C. Fish, a source of cardioprotective omega-3 fatty acids, are also a large source of mercury; however, we did not observe a large change in effect size of mercury when accounting for consumption of fish.
These nutrients and metals may be to some extent surrogate markers of ‘healthy diet’ behaviours; however, what exactly constitutes a ‘healthy diet’ is currently very difficult to define, in contrast to earlier claims. The strength of the association for these dietary markers is similar on HDL-C, ranging from 1 to 3 mg/dl for a standardized change per factor. These are small effects and it is unclear whether cumulatively they could have a much larger impact in raising HDL-C level, given the correlations between these markers (Figure 2).
We also identified enterolactone to be strongly associated with favourable triglyceride levels in this study. Enterolactone is a metabolite of lignans, which are found in foods such as flaxseed and have been associated with favourable cholesterol profiles in this form. Again, it is unclear what role, if any, this marker plays as a surrogate of ‘healthy diets’ and effects on heart disease have been inconsistent.
We found markers of hydrocarbons, 2- and 3-hydroxyfluorene, to be strongly associated with unfavourable HDL-C levels. Although others have shown the association of these metabolites with self-reported cardiovascular disease in the NHANES data, to our knowledge the association with HDL-C is novel. Relatedly, we also found a marker of nicotine, cotinine, to have a similar association with HDL-C. Particulate matter air pollution, composed of many types of hydrocarbons, and smoking have long been major concerns for cardiovascular-related diseases. It is well known that smoking influences HDL-C levels, and acute and chronic exposures to tobacco smoke have been shown to decrease HDL-C substantially.
The high correlation of the hydrocarbon markers to cotinine suggests that these associations might all indicate exposure to cigarette smoke.
We have also reconfirmed the correlation between banned-use persistent pollutants, such as organochlorine pesticides, dibenzofurans and polychlorinated biphenyls, and adverse lipid profiles, such as large increases in triglycerides and large decreases in HDL-C. These environmental factors have already been implicated in other metabolic-related and cardiovascular diseases and among several populations. For example, PCB170 and heptachlor epoxide have been associated with type 2 diabetes and hypertension in these surveys. Similarly, PCBs and dibenzofurans have been associated with metabolic syndrome in a Japanese population.
We acknowledge that these associations might be confounded due to the fat solubility of these pollutants. Nevertheless, there have been efforts to elucidate causal relationships using different analytic methods and ecological data. For example, in a recent study considering causal pathways and confounding bias via structural equation modeling, investigators found a relationship between polychlorinated biphenyls and lipid levels consistent with forward causality for a native population with high exposure to these pollutants in upstate New York. Another study found an ecological relationship between cardiovascular-related hospitalization rates and proximity to PCB pollution.
Nonetheless, the etiological relationships between persistent pollutants and the metabolic syndrome, type 2 diabetes and cardiovascular diseases remain elusive. However, current etiological speculation includes the role of these pollutants interfering with PPARs, transcription factors known to be involved in lipid homoeostasis, and/or influencing change in DNA methylation. Persistent pollutants were recently associated with atherosclerosis in the elderly in Sweden, independent of serum lipid levels, suggesting a direct pollutant effect on atherosclerosis. Further investigation of these pollutants and consideration of other phenotypes along the causal pathway for cardiovascular-related diseases is warranted.
Both elucidating the influence of persistent pollutants on lipids and quantifying their amount in serum lipids remain issues of debate. For example, there are methods to quantify persistent pollutants via adjustment with serum lipids, but differing methods of adjustment of these factors could lead to conflicting results. Porta et al., in investigating the influence of organochlorine pesticides on pancreatic carcinoma, indicate that linear adjustment may be inappropriate in some cases. Assessments between persistent pollutants and serum lipid levels as described here may address some of these issues.
Factor variability must be characterized to ensure adequate analytical modelling. For example, we considered BMI as having a confounding role and included it as an adjustment in our models. However, BMI may lie on the causal pathway towards adverse lipid profiles. The inter-relationship among lipid levels, BMI and persistent pollutant factors is complex, differing in the context of sex and demographics, clinical characteristics and after changes in weight (e.g. after bariatric surgery and overall weight gain). Inter-individual differences in pharmacokinetics also play a role in this complex relationship.
The choice of how to model adjusting variables ultimately influences inferences. Long-term longitudinal investigations and causal inference methods may be more suitable to understand causal pathways, if any, underlying the correlation of these environmental factors with lipid levels and other phenotypes.
There are some important limitations in a study using cross-sectional measurements and the observed correlations are far from causal. These associations may reflect a complex web of physiological correlation and/or reverse causality. For example, α-tocopherol and carotenes are transported in serum with HDL and LDL and accurate measurement of serum α-tocopherol is dependent on serum lipids. In this regard, the strong association between α-tocopherol and LDL and triglycerides might be considered a true positive result. On the other hand, given the lack of evidence for γ-tocopherol or retinol associating with lipoprotein complexes, their association might be due to reverse causality, or increased anti-oxidant consumption among those who know about their adverse lipid level profile. However, given that vitamin E consumption has been found to increase mortality in meta-analysis, the large effect sizes suggest that prospective studies may be scrutinized for any potentially adverse effects of vitamin E on lipid levels and other metabolic disorders, such as type 2 diabetes.
As with vitamins, we must consider how the distribution of persistent pollutants among biomolecules in serum may influence our analyses. Persistent pollutants have a unique signature in plasma LDL and HDL; for example, PCBs are primarily carried in LDL whereas their metabolites are evenly carried in both LDL and HDL. Furthermore, ascertaining levels of pollutants found in tissue other than serum may be eventually required to understand pathology.
To this end, there are reports of concordance between concentrations of persistent pollutants found in different adipose tissues, such as between breast and abdominal adipose tissue. There appears to be concordance between levels of dichlorodiphenyldichloroethylene (DDE) found in serum and breast adipose tissue; however, relative estimation varies based on the type of adjustment methods used.
Another issue is the measurement of pollutant environmental factors themselves. For example, limits of detection varied across different NHANES surveys. To address this, we filtered out variables that had a majority of undetectable measurements; however, results may be biased due to imbalance in measurement techniques and differing thresholds. In the future, factor measurement should be standardized, as proposed by the PhenX project, to ensure comparability of results among different studies and cohorts. Environmental exposure biomonitoring data from other public health surveys might be able to aid in this effort and the National Academy of Sciences Committee on Human Biomonitoring for Environmental Toxicants lists examples of such efforts.
Despite these limitations, we have shown here a systematic approach to create robust hypotheses regarding association of environmental factors with disease. Further studies should focus on elucidating their role in disease, if any.
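The tentative-validation rule stated earlier in this Discussion (FDR < 5% in test cohorts and P < 0.05 in the validation cohort) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the Benjamini-Hochberg adjustment stands in for whatever FDR estimator the study used, a single test cohort replaces the multiple test surveys, and the factor names and P-values are invented.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

def tentatively_validated(test_p, validation_p, fdr=0.05, alpha=0.05):
    """Factors with q < fdr in the test cohort and P < alpha in validation."""
    q = bh_qvalues(list(test_p.values()))
    return [name for name, qv in zip(test_p, q)
            if qv < fdr and validation_p.get(name, 1.0) < alpha]

# Hypothetical P-values, for illustration only.
test_p = {"factor_a": 0.0001, "factor_b": 0.004,
          "factor_c": 0.03, "factor_d": 0.6}
validation_p = {"factor_a": 0.01, "factor_b": 0.2, "factor_c": 0.04}

validated = tentatively_validated(test_p, validation_p)
print(validated)
```

A factor missing from the validation cohort is treated here as not validated, which mirrors the drawback noted above: factors measured in more surveys have more opportunities to pass the rule.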
Supplementary Data
Supplementary data are available at IJE online.
Funding
The National Library of Medicine (grant numbers T15 LM 007033 and R01 LM009719); National Institute of General Medical Sciences (grant number R01 GM079719); National Institutes of Health Clinical and Translational Science Award (UL1 RR025744); Lucile Packard Foundation for Children's Health, Howard Hughes Medical Institute.
Table 1
Estimates of multivariate linear regression model predicting log10(triglycerides) as a function of sex, age, age², ethnicity (in reference to Whites), an estimate of SES (in reference to high SES) and BMI for each survey

| Triglycerides | 1999–2000 (n = 3002): Estimate (95% CI) | P-value | 2001–02 (n = 3610): Estimate (95% CI) | P-value |
| --- | --- | --- | --- | --- |
| Sex (vs male) | −0.026 (−0.061 to 0.008) | 0.1 | −0.061 (−0.089 to −0.033) | 0.002 |
| Age (10 years) | | | | |
| Age | 0.044 (0.011–0.076) | 0.02 | 0.052 (0.019–0.085) | 0.009 |
| Age² | −0.00017 (−0.00047 to 0.00013) | 0.2 | −0.00026 (−0.00061 to 8.7 × 10⁻⁵) | 0.1 |
| Ethnicity (vs White) | | | | |
| Black | −0.14 (−0.19 to −0.098) | 9 × 10⁻⁴ | −0.13 (−0.17 to −0.079) | 0.001 |
| Mexican–American | 0.011 (−0.049 to 0.071) | 0.6 | 0.0088 (−0.036 to 0.053) | 0.6 |
| Other Hispanic | −0.034 (−0.075 to 0.0074) | 0.08 | 0.038 (−0.1 to 0.18) | 0.5 |
| Other | −0.027 (−0.098 to 0.044) | 0.3 | 0.03 (−0.046 to 0.11) | 0.4 |
| SES (vs high tertile) | | | | |
| SES (medium) | 0.011 (−0.032 to 0.055) | 0.5 | 0.018 (−0.011 to 0.047) | 0.2 |
| SES (low) | 0.027 (−0.018 to 0.072) | 0.2 | 0.037 (−0.0034 to 0.077) | 0.07 |
| BMI (10 U) | 0.11 (0.078–0.15) | 9 × 10⁻⁴ | 0.093 (0.059–0.13) | 0.001 |

95% confidence interval (CI) and P-value of associations are also shown. n is unweighted sample size.
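Because the outcome in Table 1 is modelled on the log10 scale, each estimate can be read as a multiplicative change in triglycerides. The short sketch below back-transforms a few of the 1999–2000 estimates; the helper name `pct_change_from_log10` is hypothetical, and the conversion assumes the coefficients are on the plain log10 scale as the caption states.

```python
def pct_change_from_log10(beta):
    """Percent change in the outcome implied by a log10-scale coefficient."""
    return (10 ** beta - 1) * 100

# Estimates copied from the 1999-2000 column of Table 1.
table1_1999_2000 = {
    "Sex (vs male)": -0.026,
    "Black (vs White)": -0.14,
    "BMI (per 10 U)": 0.11,
}

for term, beta in table1_1999_2000.items():
    print(f"{term}: {pct_change_from_log10(beta):+.1f}%")
```

Under this reading, the BMI estimate of 0.11 corresponds to roughly 29% higher triglycerides per 10 BMI units, and the estimate of −0.14 for Black participants to roughly 28% lower triglycerides relative to White participants.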
Table 2
Estimates of multivariate linear regression model predicting log10(LDL-C) as a function of sex, age, age², ethnicity, SES and BMI for each survey

| LDL-C | 1999–2000 (n = 2743): Estimate (95% CI) | P-value | 2001–02 (n = 3318): Estimate (95% CI) | P-value |
| --- | --- | --- | --- | --- |
| Sex (vs male) | −0.015 (−0.034 to 0.0043) | 0.1 | −0.016 (−0.028 to −0.0049) | 0.01 |
| Age (10 years) | | | | |
| Age | 0.059 (0.042–0.076) | 7 × 10⁻⁴ | 0.058 (0.042–0.073) | 2 × 10⁻⁴ |
| Age² | −0.00046 (−0.00066 to −0.00027) | 0.003 | −0.00048 (−0.00065 to −0.00032) | 7 × 10⁻⁴ |
| Ethnicity (vs White) | | | | |
| Black | −0.015 (−0.036 to 0.0065) | 0.1 | −0.0087 (−0.032 to 0.014) | 0.4 |
| Mexican–American | −0.012 (−0.028 to 0.0047) | 0.1 | −0.019 (−0.037 to −0.0013) | 0.04 |
| Other Hispanic | −0.013 (−0.034 to 0.0081) | 0.2 | −0.015 (−0.041 to 0.011) | 0.2 |
| Other | −0.0098 (−0.041 to 0.021) | 0.4 | 0.01 (−0.036 to 0.056) | 0.6 |
| SES (vs high tertile) | | | | |
| SES (medium) | −0.0028 (−0.027 to 0.021) | 0.8 | 0.0044 (−0.019 to 0.028) | 0.6 |
| SES (low) | 0.0094 (−0.014 to 0.033) | 0.3 | 0.01 (−0.013 to 0.033) | 0.3 |
| BMI (10 U) | 0.022 (0.006–0.038) | 0.02 | 0.014 (0.0048–0.023) | 0.01 |

95% CI: 95% confidence interval.
Table 3
Estimates of multivariate linear regression model predicting log10(HDL-C) as a function of sex, age, age², ethnicity, SES and BMI for each survey