Eva M Tanner, Carl-Gustaf Bornehag, Chris Gennings.
Abstract
Weighted Quantile Sum (WQS) regression is a method commonly used in environmental epidemiology to assess the impact of chemical mixtures in relation to a health outcome of interest. Data are partitioned into a single training and test set so that the estimated chemical weights are less sample-specific. However, with typical epidemiologic sample sizes, this may produce unstable chemical weights and WQS index estimates, and investigators may resort to training and testing on the same data. To solve this problem, we propose repeated holdout validation, whereby data are randomly partitioned 100 times, producing a distribution of validated results. The mean is taken as the final estimate, and confidence intervals may also be calculated for inference. Further, this method helps characterize the variability in chemical weights, aiding in the identification of chemicals of concern, which may in turn direct future research into specific chemicals. Using data from 718 mother-child pairs in the Swedish Environmental Longitudinal, Mother and Child, Asthma and Allergy (SELMA) study, we assessed the association between prenatal exposure to 26 endocrine disrupting chemicals and child Intelligence Quotient (IQ). Results using a single partition were unstable, varying by random seed. The WQS index estimate was significant when all data were used (i.e., no partition) (β = −2.2, CI = −3.43, −0.98), but attenuated and nonsignificant using repeated holdout validation (β = −0.82, CI = −2.11, 0.45). When implementing WQS in epidemiologic studies with limited sample sizes, repeated holdout validation is a viable alternative to using a single partition or no partition. Repeated holdout can both stabilize results and help characterize the uncertainty in identifying chemicals of concern, while maintaining some of the rigor of holdout validation.
• Repeated holdout validation improves the stability of WQS estimates in finite study samples.
• Uncertainty in identifying toxic chemicals of concern is acknowledged and characterized.
Keywords: Bootstrap; Chemical mixtures; Chemical of concern; Cross-validation; Environmental epidemiology; Repeated holdout validation for weighted quantile sum regression; Uncertainty plot
Year: 2019 PMID: 31871919 PMCID: PMC6911906 DOI: 10.1016/j.mex.2019.11.008
Source DB: PubMed Journal: MethodsX ISSN: 2215-0161
Fig. 1. Comparison of Standard versus Novel Partitioning Schemes for WQS.
Conventional WQS regression partitions a full dataset into a single training and test set to estimate chemical weights and test the association between the WQS index and outcome (left). Repeated holdout validation randomly partitions data m times and takes the average WQS index estimate (right).
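
The partitioning scheme on the right of Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the gWQS implementation: `repeated_holdout`, `fit_weights`, and `index_slope` are hypothetical names, the 40 % training fraction is an assumed value, the data are simulated, and the toy weighting rule merely stands in for the constrained, bootstrap-based weight estimation that WQS regression performs.

```python
import random
import statistics


def repeated_holdout(rows, fit, evaluate, m=100, train_frac=0.4, seed=0):
    """Return one held-out estimate per random train/test partition."""
    rng = random.Random(seed)
    n_train = round(train_frac * len(rows))  # assumed split, not from the source
    estimates = []
    for _ in range(m):
        shuffled = list(rows)
        rng.shuffle(shuffled)  # a fresh random partition on every repetition
        train, test = shuffled[:n_train], shuffled[n_train:]
        estimates.append(evaluate(fit(train), test))
    return estimates


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def fit_weights(train):
    """Toy stand-in for WQS weight estimation (hypothetical rule):
    weight each exposure by the size of its marginal slope, scaled to sum to 1."""
    ys = [y for _, y in train]
    n_exposures = len(train[0][0])
    raw = [abs(slope([x[j] for x, _ in train], ys)) for j in range(n_exposures)]
    total = sum(raw)
    return [r / total for r in raw]


def index_slope(weights, test):
    """Build the weighted index in the test set and regress the outcome on it."""
    index = [sum(w * xj for w, xj in zip(weights, x)) for x, _ in test]
    return slope(index, [y for _, y in test])


# Simulated data: two exposures, only the first is related to the outcome.
data_rng = random.Random(1)
rows = []
for _ in range(200):
    x = (data_rng.gauss(0, 1), data_rng.gauss(0, 1))
    rows.append((x, -1.0 * x[0] + data_rng.gauss(0, 1)))

estimates = repeated_holdout(rows, fit_weights, index_slope)  # m = 100 partitions
final_estimate = statistics.fmean(estimates)  # mean over partitions as final estimate
```

Keeping `fit` and `evaluate` as arguments separates the partitioning logic from the model, so the same loop could wrap any estimator that is trained on one subset and tested on the other.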
WQS Index β Coefficients and CIs by Validation Technique & Estimation Type.
| Validation Technique | Estimation Type | β Coefficient | Lower Limit | Upper Limit |
|---|---|---|---|---|
| None: Train/Test Full Dataset | Mean & SE-based 95 % CI | −2.20 | −3.43 | −0.98 |
| Repeated Holdout | Mean & SD-based 95 % CI | −0.83 | −2.11 | 0.45 |
| Repeated Holdout | Median, 2.5th & 97.5th percentiles | −0.86 | −1.99 | 0.43 |
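
The two repeated holdout rows in the table summarize the same distribution of estimates in different ways. A minimal Python sketch of both summaries follows, assuming the SD-based interval is computed as mean ± 1.96 × SD of the holdout estimates; the function names and the input values are hypothetical and are not the study's data.

```python
import statistics


def sd_based_ci(estimates, z=1.96):
    """Mean of the holdout estimates with an SD-based 95 % interval (assumed form)."""
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    return mean, mean - z * sd, mean + z * sd


def percentile_summary(estimates):
    """Median with the 2.5th and 97.5th percentiles of the holdout estimates."""
    # n=40 yields cut points at 2.5 %, 5 %, ..., 97.5 %.
    cuts = statistics.quantiles(estimates, n=40, method="inclusive")
    return statistics.median(estimates), cuts[0], cuts[-1]


# Hypothetical holdout estimates, for illustration only.
example = [-1.9, -1.4, -1.1, -0.9, -0.8, -0.7, -0.5, -0.2, 0.1, 0.4]
mean, lower, upper = sd_based_ci(example)
median, p2_5, p97_5 = percentile_summary(example)
```

The percentile summary makes no symmetry assumption about the distribution of estimates, which may be why the two intervals in the table differ slightly.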
Fig. 2. Chemicals of Concern Identification & Uncertainty for 26 Endocrine Disrupting Chemicals in Relation to IQ.
Bars correspond to the right axis and indicate the number of times a chemical exceeded the concern threshold in 100 repeated holdouts. Data points, boxplots, and diamonds correspond to the left axis. Data points indicate weights for each of the 100 holdouts. Box plots show the 25th, 50th, and 75th percentiles, and whiskers show the 10th and 90th percentiles of weights for the 100 holdouts. Closed diamonds show mean weights for the 100 holdouts. For comparison, open diamonds show the mean weight of the full sample analysis. Threshold = 3.8 %.
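
The bar heights in Fig. 2 amount to a simple count per chemical. The sketch below assumes the 3.8 % threshold corresponds to equal weighting across the 26 chemicals (1/26 ≈ 3.8 %); `concern_counts` and the chemical names are hypothetical, and the weights are made-up values for three chemicals over four holdouts.

```python
def concern_counts(weight_sets, threshold=None):
    """Count, per chemical, how many holdouts gave it a weight above the threshold.

    weight_sets: one dict per holdout, mapping chemical name -> estimated weight.
    threshold: defaults to 1 / (number of chemicals), an assumed equal-weight cutoff.
    """
    names = list(weight_sets[0])
    if threshold is None:
        threshold = 1 / len(names)
    return {name: sum(w[name] > threshold for w in weight_sets) for name in names}


# Hypothetical weights: 3 chemicals, 4 holdouts (each set sums to 1).
weight_sets = [
    {"chem_A": 0.60, "chem_B": 0.30, "chem_C": 0.10},
    {"chem_A": 0.50, "chem_B": 0.40, "chem_C": 0.10},
    {"chem_A": 0.20, "chem_B": 0.45, "chem_C": 0.35},
    {"chem_A": 0.70, "chem_B": 0.10, "chem_C": 0.20},
]
counts = concern_counts(weight_sets)  # threshold here is 1/3
```

A chemical that exceeds the threshold in most holdouts is a more consistent candidate for concern than one that does so only occasionally, which is the uncertainty the figure is meant to convey.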
| Subject Area: | Environmental Science |
| More specific subject area: | Environmental Epidemiology |
| Method name: | Repeated Holdout Validation for Weighted Quantile Sum Regression |
| Name and reference of original method: | Weighted Quantile Sum Regression |
| Resource availability: | gWQS R Package ( |