Robert H Lyles, Dane Van Domelen, Emily M Mitchell, Enrique F Schisterman.
Abstract
Pooling biological specimens prior to performing expensive laboratory assays has been shown to be a cost-effective approach for estimating parameters of interest. In addition to requiring specialized statistical techniques, however, the pooling of samples can introduce assay errors due to processing, possibly in addition to measurement error that may be present when the assay is applied to individual samples. Failure to account for these sources of error can result in biased parameter estimates and ultimately faulty inference. Prior research addressing biomarker mean and variance estimation advocates hybrid designs consisting of individual as well as pooled samples to account for measurement and processing (or pooling) error. We consider adapting this approach to the problem of estimating a covariate-adjusted odds ratio (OR) relating a binary outcome to a continuous exposure or biomarker level assessed in pools. In particular, we explore the applicability of a discriminant function-based analysis that assumes normal residual, processing, and measurement errors. A potential advantage of this method is that maximum likelihood estimation of the desired adjusted log OR is straightforward and computationally convenient. Moreover, in the absence of measurement and processing error, the method yields an efficient unbiased estimator for the parameter of interest assuming normal residual errors. We illustrate the approach using real data from an ancillary study of the Collaborative Perinatal Project, and we use simulations to demonstrate the ability of the proposed estimators to alleviate bias due to measurement and processing error.
Keywords: epidemiology; errors-in-variables; odds ratio; pooling
Year: 2015 PMID: 26593934 PMCID: PMC4661676 DOI: 10.3390/ijerph121114723
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Analysis of CPP Substudy Data Including 164 Individual and 251 Pooled MCP1 Assays a.
| Model | β* | σ² | ME variance | PE variance | ln(OR) | ln(OR) e | AIC |
|---|---|---|---|---|---|---|---|
| ME and PE b | 0.031 (0.026) | -- | -- | -- | -- | -- | -- |
| ME only | 0.032 (0.025) | 0.102 | 0.001 c | -- | 0.311 (0.25) [−0.17, 0.80] | 0.310 (0.25) [−0.17, 0.79] | 420.64 |
| PE only | 0.031 (0.026) | 0.079 | -- | 0.078 | 0.388 (0.32) [−0.25, 1.02] | 0.383 (0.32) [−0.25, 1.01] | 412.82 |
| Neither ME nor PE | 0.032 (0.025) | 0.103 | -- | -- | 0.309 (0.25) [−0.17, 0.79] | 0.308 (0.24) [−0.17, 0.79] | 418.46 |
| Logistic regression d | -- | -- | -- | -- | 0.270 (0.24) [−0.20, 0.74] | -- | -- |
a Numbers in parentheses () are estimated standard errors; 95% CIs are in brackets []; b Model fails to identify σ2 due to design limitations (ki = 1,2 only); c hits boundary constraint of 0.001; d Based on Weinberg-Umbach poolwise model (Section 2.2), not accounting for ME or PE; e Estimates and standard errors adjusted as proposed in Section 2.4.
Simulations Under Model with Neither ME nor PE to Assess Estimators in Section 2.2 a,b.
| N | β* | MSE | ln(OR) | ln(OR) | Logistic Regression c |
|---|---|---|---|---|---|
| 2000 | 0.035 (0.013) | 0.080 | 0.439 (0.166) | 0.438 (0.166) | 0.441 (0.168) |
| 200 | 0.035 (0.042) | 0.080 | 0.447 (0.545) | 0.438 (0.534) | 0.474 (0.586) |
a Table shows mean estimates across 5000 simulations, with empirical standard deviations in parentheses () and 95% CI coverages in brackets []; b True values: β* = 0.035, σ² = 0.080, ln(OR) = 0.438; c Based on Weinberg-Umbach poolwise model.
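The true values in the footnote above are consistent with ln(OR) = β*/σ² (0.035 / 0.080 ≈ 0.438), which is the relation obtained when the exposure X, given the binary outcome Y and covariates C, is modeled as normal with a linear mean. The following is a minimal sketch of that error-free, individual-sample case only, not the authors' code: the function names, the single simulated covariate, the intercept of 1.0, the covariate coefficient of 0.1, and the use of the residual mean square (divisor n − p) for σ² are all assumptions made for illustration.

```python
import numpy as np


def discriminant_log_or(x, y, c):
    """Regress exposure x on outcome y and covariates c by least squares,
    then return (beta_star_hat, mse, beta_star_hat / mse).

    Assumes no measurement or processing error and individual (unpooled) data.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.asarray(c, dtype=float).reshape(len(x), -1)

    # Design matrix: intercept, outcome indicator, covariates.
    design = np.column_stack([np.ones_like(x), y, c])
    coef, _, _, _ = np.linalg.lstsq(design, x, rcond=None)

    resid = x - design @ coef
    n, p = design.shape
    mse = resid @ resid / (n - p)  # residual mean square as the sigma^2 estimate

    beta_star_hat = coef[1]        # coefficient on the outcome y
    return beta_star_hat, mse, beta_star_hat / mse


def simulate(n, beta_star=0.035, sigma2=0.080, seed=0):
    """Hypothetical data generator using the true values from the footnote."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                 # one illustrative covariate
    y = rng.binomial(1, 0.5, size=n)       # binary outcome
    noise = rng.normal(scale=np.sqrt(sigma2), size=n)
    x = 1.0 + beta_star * y + 0.1 * c + noise
    return x, y, c


if __name__ == "__main__":
    x, y, c = simulate(200_000, seed=1)
    print(discriminant_log_or(x, y, c))
```

With a large simulated sample the third returned value should sit near 0.438; at the sample sizes in the table the estimate is far more variable, as the reported standard deviations show.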
Simulations Under Model with Both ME and PE to Assess Estimators in Section 2.3 a,b,c.
| N | β* | σ² | ME variance | PE variance | ln(OR) d | Logistic Regression e |
|---|---|---|---|---|---|---|
| 2000 | 0.035 (0.017) | 0.079 | 0.081 | 0.082 | 0.474 \|0.438\| (0.28) [95.4%] | 0.254 \|0.254\| (0.13) [66.7%] |
| 1000 | 0.035 (0.024) | 0.077 | 0.082 | 0.081 | 0.463 \|0.417\| (0.37) [96.2%] | 0.252 \|0.251\| (0.18) [79.4%] |
| 500 | 0.035 (0.034) | 0.077 | 0.081 | 0.080 | 0.448 \|0.402\| (0.49) [97.2%] | 0.259 \|0.254\| (0.26) [88.9%] |
a Table shows mean estimates across 2500 simulations, with median estimates in bars ||, empirical standard deviations in parentheses () and 95% CI coverages in brackets []; b True values: β* = 0.035, , ln(OR) = 0.438; c Mean estimates of β* and variance components exclude simulation runs in which σ2 estimate hit 0.001 boundary. This occurred in 7.6%, 1.2%, and 0.08% of runs with N = 500, 1000, 2000, respectively; d Final log OR estimate incorporates AIC-based model selection (see Section 4.2) with ME only or PE only model selected in 19.1%, 8.5%, and 3.4% of runs with N = 500, 1000, 2000, respectively; e Based on Weinberg-Umbach poolwise model (Section 2.2), not accounting for ME or PE.
Simulations Under ME Only Model to Assess Estimators in Section 2.3 a,b.
| N | β* | σ² | ME variance | ln(OR) | ln(OR) | Logistic regression d |
|---|---|---|---|---|---|---|
| 2000 | 0.035 (0.016) | 0.079 | 0.080 | 0.448 (0.22) [0.21] {95.4%} | 0.438 (0.21) [0.21] {95.6%} | 0.291 (0.13) [0.13] {79.9%} |
| 1000 | 0.035 (0.021) | 0.079 | 0.080 | 0.474 (0.32) [0.31] {96.5%} | 0.450 (0.30) [0.30] {96.4%} | 0.298 (0.19) [0.19] {89.0%} |
| 500 | 0.036 (0.031) | 0.076 | 0.083 | 0.522 c (0.51) [0.50] {97.5%} | 0.454 c (0.42) [0.43] {96.8%} | 0.307 (0.28) [0.27] {92.0%} |
a Table shows mean estimates across 2500 simulations, with empirical standard deviations in parentheses (), mean estimated standard errors in brackets [] and 95% CI coverages in braces {}; b True values: β* = 0.035, , ln(OR) = 0.438; c Final log OR estimate incorporates AIC-based model selection (see Section 4.2) with PE only model selected in 0.5% of runs with N = 500. ME only model used in 100% of runs with N = 1000 and 2000; d Based on Weinberg-Umbach poolwise model (Section 2.2), not accounting for ME or PE.
Simulations Under PE Only Model to Assess Estimators in Section 2.3 a,b,c.
| N | β* | σ² | PE variance | ln(OR) | ln(OR) | Logistic Regression d |
|---|---|---|---|---|---|---|
| 2000 | 0.035 (0.014) | 0.080 | 0.080 | 0.444 (0.18) [0.18] {95.6%} | 0.442 (0.18) [0.18] {95.5%} | 0.356 (0.15) [0.15] {91.3%} |
| 1000 | 0.035 (0.020) | 0.079 | 0.078 | 0.441 (0.26) [0.26] {95.0%} | 0.438 (0.26) [0.26] {95.0%} | 0.356 (0.21) [0.21] {92.4%} |
| 500 | 0.035 (0.029) | 0.079 | 0.078 | 0.447 (0.37) [0.37] {96.0%} | 0.440 c (0.37) [0.36] {96.0%} | 0.361 (0.30) [0.30] {94.3%} |
a Table shows mean estimates across 2500 simulations, with empirical standard deviations in parentheses (), mean estimated standard errors in brackets [] and 95% CI coverages in braces {}; b True values: β* = 0.035, , ln(OR) = 0.438; c PE only model used in 100% of all runs for each sample size; d Based on Weinberg-Umbach (1999) poolwise model (Section 2.2), not accounting for ME or PE.
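The simulation tables report, for each estimator, the mean (and in one table the median) estimate, the empirical standard deviation, the mean estimated standard error, and the 95% CI coverage across replicates. A minimal sketch of how such summaries could be computed from per-replicate results is given below; the function name, the dictionary keys, and the use of Wald intervals (estimate ± 1.96 × SE) are assumptions for illustration, since the record does not state how the intervals were formed.

```python
import numpy as np


def summarize_replicates(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replicates of a single estimator.

    estimates, std_errors: one entry per simulation run.
    Coverage uses Wald intervals estimate +/- z * SE (an assumption).
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)

    lower = est - z * se
    upper = est + z * se
    covered = (lower <= true_value) & (true_value <= upper)

    return {
        "mean": est.mean(),
        "median": float(np.median(est)),
        "empirical_sd": est.std(ddof=1),  # SD of the estimates across runs
        "mean_se": se.mean(),             # average model-based standard error
        "coverage": covered.mean(),       # fraction of intervals containing the truth
    }


if __name__ == "__main__":
    # Three made-up replicates, purely to show the call shape.
    print(summarize_replicates([0.40, 0.50, 0.10], [0.10, 0.10, 0.10], 0.438))
```

Comparing `empirical_sd` with `mean_se` indicates whether the estimated standard errors track the true sampling variability, which is the comparison the parenthesized and bracketed entries in the tables support.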