Michael A McIsaac, Richard J Cook.
Abstract
Response-dependent two-phase designs are used increasingly often in epidemiological studies to ensure sampling strategies offer good statistical efficiency while working within resource constraints. Optimal response-dependent two-phase designs are difficult to implement, however, as they require specification of unknown parameters. We propose adaptive two-phase designs that exploit information from an internal pilot study to approximate the optimal sampling scheme for an analysis based on mean score estimating equations. The frequency properties of estimators arising from this design are assessed through simulation, and they are shown to be similar to those from optimal designs. The design procedure is then illustrated through application to a motivating biomarker study in an ongoing rheumatology research program.
Keywords: adaptive design; asymptotic efficiency; incomplete data; mean score analysis; response-dependent sampling
Year: 2015; PMID: 25951124; PMCID: PMC4691319; DOI: 10.1002/sim.6523
Source DB: PubMed; Journal: Stat Med; ISSN: 0277-6715; Impact factor: 2.373
Figure 1. Empirical differences in sampling fractions between the asymptotically-optimal design and the two-stage adaptive designs employing proportional stratified sampling in phase-IIa, according to the percentage of the phase-II sample that is selected at phase-IIb (the phase-IIb sample size divided by n), for a binary expensive covariate; N = 800; n = 200; (β0,β,β) = (−1.95, 1.00, 0.90), (α0,α) = (1.05, −0.41), and γ0 = −0.04. The stratum-specific sample size is the number of individuals selected for the measurement of X from the individuals available in stratum (Y,V) in the two-stage adaptive design; it is compared with the corresponding sample size for the asymptotically-optimal design. This two-stage adaptive design was derived for selecting individuals for measurement of a binary X and relied on either parametric or empirical estimation of design components from data collected in phase-I and phase-IIa.
Figure 2. Empirical differences in sampling fractions between the asymptotically-optimal design and the two-stage adaptive designs employing proportional stratified sampling in phase-IIa, according to the percentage of the phase-II sample that is selected at phase-IIb (the phase-IIb sample size divided by n), for a continuous expensive covariate; N = 800; n = 200; (β0,β,β) = (−2.18, 0.03, 0.84), (α0,α1,α) = (1.40, 10, 5), and γ0 = −0.04. The stratum-specific sample size is the number of individuals selected for the measurement of X from the individuals available in stratum (Y,V) in the two-stage adaptive design; it is compared with the corresponding sample size for the asymptotically-optimal design. This two-stage adaptive design was derived for selecting individuals for measurement of a continuous X and relied on either parametric or empirical estimation of design components from data collected in phase-I and phase-IIa.
Figure 3. Empirical differences in sampling fractions between the asymptotically-optimal design and the two-stage adaptive designs employing balanced sampling in phase-IIa, according to the percentage of the phase-II sample that is selected at phase-IIb (the phase-IIb sample size divided by n), for a binary expensive covariate; N = 800; n = 200; (β0,β,β) = (−1.95, 1.00, 0.90), (α0,α) = (1.05, −0.41), and γ0 = −0.04. The stratum-specific sample size is the number of individuals selected for the measurement of X from the individuals available in stratum (Y,V) in the two-stage adaptive design; it is compared with the corresponding sample size for the asymptotically-optimal design. This two-stage adaptive design was derived for selecting individuals for measurement of a binary X and relied on either parametric or empirical estimation of design components from data collected in phase-I and phase-IIa.
Figure 4. Empirical differences in sampling fractions between the asymptotically-optimal design and the two-stage adaptive designs employing balanced sampling in phase-IIa, according to the percentage of the phase-II sample that is selected at phase-IIb (the phase-IIb sample size divided by n), for a continuous expensive covariate; N = 800; n = 200; (β0,β,β) = (−2.18, 0.03, 0.84), (α0,α1,α) = (1.40, 10, 5), and γ0 = −0.04. The stratum-specific sample size is the number of individuals selected for the measurement of X from the individuals available in stratum (Y,V) in the two-stage adaptive design; it is compared with the corresponding sample size for the asymptotically-optimal design. This two-stage adaptive design was derived for selecting individuals for measurement of a continuous X and relied on either parametric or empirical estimation of design components from data collected in phase-I and phase-IIa.
Empirical relative efficiencies (ERE) and empirical relative interquartile ranges (ERI), each relative to the asymptotically-optimal design based on the true (unknown) parameters, and empirical coverage probabilities (ECP) of estimators for β, based on 2000 simulated datasets with N = 800 and n = 200.
| | Binary | | | | | | Continuous | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Parametric | | | Empirical | | | Parametric | | | Empirical | | |
| % of n selected at phase-IIb | ERE | ERI | ECP | ERE | ERI | ECP | ERE | ERI | ECP | ERE | ERI | ECP |
| Two-stage adaptive – proportional sampling in phase-IIa | ||||||||||||
| 0* | 76.6 | 88.7 | 95.5 | 76.6 | 88.7 | 95.5 | 88.9 | 94.4 | 95.2 | 88.9 | 94.4 | 95.2 |
| 10 | 85.0 | 93.4 | 95.3 | 84.3 | 90.2 | 95.2 | 94.7 | 97.7 | 94.8 | 89.4 | 97.9 | 94.8 |
| 20 | 86.3 | 89.9 | 94.7 | 84.6 | 90.9 | 94.8 | 94.0 | 96.9 | 94.7 | 87.8 | 92.2 | 94.0 |
| 30 | 94.0 | 97.8 | 95.2 | 89.3 | 97.9 | 93.8 | 102.6 | 99.8 | 95.2 | 93.2 | 97.1 | 94.6 |
| 40 | 96.3 | 96.3 | 95.0 | 103.1 | 100.8 | 95.4 | 100.7 | 104.3 | 94.8 | 86.0 | 91.1 | 93.2 |
| 50 | 98.2 | 95.0 | 94.8 | 92.6 | 96.3 | 94.2 | 103.2 | 102.4 | 95.1 | 88.5 | 94.7 | 93.3 |
| 60 | 103.1 | 100.0 | 95.2 | 95.2 | 96.5 | 94.8 | 105.0 | 102.0 | 95.5 | 88.2 | 97.0 | 93.7 |
| 70 | 97.2 | 98.2 | 95.0 | 95.4 | 98.8 | 94.4 | 103.0 | 103.3 | 94.6 | 93.9 | 96.9 | 94.8 |
| 80 | 100.7 | 103.1 | 95.1 | 100.8 | 101.3 | 95.0 | 107.9 | 105.2 | 95.6 | 85.6 | 96.6 | 94.5 |
| 90 | 98.6 | 102.2 | 95.4 | 97.7 | 97.5 | 94.5 | 103.7 | 103.8 | 95.0 | 82.8 | 94.0 | 94.7 |
| Two-stage adaptive – balanced sampling in phase-IIa | ||||||||||||
| 0* | 95.2 | 99.6 | 95.2 | 95.2 | 99.6 | 95.2 | 90.6 | 90.6 | 95.0 | 90.6 | 90.6 | 95.0 |
| 10 | 101.6 | 102.3 | 95.3 | 102.9 | 102.5 | 95.5 | 98.3 | 98.6 | 94.6 | 95.2 | 99.2 | 95.0 |
| 20 | 96.8 | 101.5 | 93.8 | 98.4 | 98.9 | 95.0 | 102.0 | 99.4 | 94.7 | 95.9 | 99.8 | 94.6 |
| 30 | 105.2 | 103.7 | 95.3 | 99.9 | 100.7 | 95.0 | 107.5 | 103.1 | 95.9 | 88.2 | 92.2 | 93.8 |
| 40 | 100.1 | 103.2 | 95.0 | 102.5 | 98.4 | 94.9 | 109.7 | 105.4 | 95.5 | 93.9 | 95.3 | 94.0 |
| 50 | 103.8 | 99.3 | 95.3 | 100.0 | 100.1 | 94.8 | 101.0 | 101.8 | 95.0 | 96.7 | 95.5 | 95.5 |
| 60 | 99.8 | 97.6 | 95.0 | 98.9 | 96.4 | 95.2 | 100.8 | 98.3 | 94.8 | 86.5 | 90.8 | 94.2 |
| 70 | 103.7 | 98.9 | 95.5 | 96.6 | 98.5 | 94.3 | 103.2 | 103.6 | 95.0 | 85.0 | 90.7 | 93.8 |
| 80 | 100.2 | 101.1 | 94.7 | 97.7 | 98.0 | 94.7 | 103.9 | 104.0 | 95.2 | 86.6 | 91.5 | 94.8 |
| 90 | 95.8 | 98.5 | 95.2 | 96.0 | 96.2 | 94.8 | 102.3 | 100.6 | 95.2 | 84.0 | 95.9 | 94.8 |
| Fully adaptive | ||||||||||||
| | 102.4 | 107.0 | 95.3 | 99.0 | 96.7 | 94.5 | 102.4 | 101.9 | 95.3 | 86.8 | 89.1 | 94.3 |
The parameters were set to (β0,β,β) = (−1.95, 1.00, 0.90), (α0,α) = (1.05, −0.41), and γ0 = −0.04 for the case with binary X and to (β0,β,β) = (−2.18, 0.03, 0.84), (α0,α1,α) = (1.40, 10, 5), and γ0 = −0.04 for the setting with continuous X. Two-stage adaptive designs select the phase-IIa individuals (n minus the phase-IIb sample size) using proportional or balanced sampling and use these individuals to estimate the design components, either through parametric estimation (parametric) or through empirical estimation (empirical), to approximate optimal selection of the remaining phase-IIb individuals. Nonadaptive designs are a special case of two-stage sampling in which all individuals are selected in phase-IIa. Fully adaptive designs involve selecting an initial balanced sample of size 40 (corresponding to 20% of n) and then selecting the remaining individuals one at a time, updating estimates of the design components after each individual is selected.
*The nonadaptive design does not require estimation of the design components, so there is no distinction between design components being estimated ‘empirically’ or ‘parametrically’.
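The two phase-IIa selection rules named in the table notes, proportional and balanced sampling across the (Y,V) strata, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the rounding rule (largest remainder), the way a balanced sample spills over when a stratum is exhausted, and the stratum counts in the example are all assumptions made for illustration. Only N = 800 and n = 200 come from the simulation settings above.

```python
from typing import Dict, Tuple

Stratum = Tuple[int, int]  # (Y, V); both assumed binary for this illustration


def proportional_allocation(sizes: Dict[Stratum, int], n: int) -> Dict[Stratum, int]:
    """Allocate n units so each stratum's share matches its share of phase-I.

    Fractional allocations are rounded with the largest-remainder rule
    (an assumption; the source does not state a rounding rule).
    """
    total = sum(sizes.values())
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the phase-I sample size")
    alloc, remainders = {}, {}
    for stratum, size in sizes.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        alloc[stratum], remainders[stratum] = divmod(n * size, total)
    leftover = n - sum(alloc.values())
    by_remainder = sorted(sizes, key=lambda s: remainders[s], reverse=True)
    for stratum in by_remainder[:leftover]:
        alloc[stratum] += 1
    return alloc


def balanced_allocation(sizes: Dict[Stratum, int], n: int) -> Dict[Stratum, int]:
    """Allocate n units as equally as possible across strata.

    When a stratum has fewer units than its equal share, the shortfall is
    spread over the strata that still have units available (an assumption).
    """
    if not 0 <= n <= sum(sizes.values()):
        raise ValueError("n must lie between 0 and the phase-I sample size")
    alloc = {stratum: 0 for stratum in sizes}
    remaining = n
    open_strata = [s for s in sizes if sizes[s] > 0]
    while remaining > 0 and open_strata:
        share, extra = divmod(remaining, len(open_strata))
        if share == 0:
            # Fewer units left than open strata: give one unit each to the
            # strata with the most unsampled individuals.
            by_room = sorted(open_strata, key=lambda s: sizes[s] - alloc[s], reverse=True)
            for stratum in by_room[:extra]:
                alloc[stratum] += 1
            break
        for stratum in open_strata:
            take = min(share, sizes[stratum] - alloc[stratum])
            alloc[stratum] += take
            remaining -= take
        open_strata = [s for s in open_strata if alloc[s] < sizes[s]]
    return alloc


# Hypothetical phase-I stratum counts summing to N = 800 (not from the paper).
phase1_counts = {(0, 0): 400, (0, 1): 200, (1, 0): 150, (1, 1): 50}
# Phase-IIa size when 20% of n = 200 is held back for phase-IIb.
n_phase2a = 160

print(proportional_allocation(phase1_counts, n_phase2a))
print(balanced_allocation(phase1_counts, n_phase2a))
```

Under proportional sampling the rare stratum contributes few individuals to the pilot, whereas balanced sampling oversamples it. That difference may be one reason the nonadaptive rows (0% selected at phase-IIb) differ between the two schemes in the table, though the sketch itself does not establish this.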