Miguel G R Miguel, Rafael P Waissman, Marcelo S Lauretto, Julio M Stern.
Abstract
Haphazard intentional sampling is a method developed by our research group for two main purposes: (i) sampling design, where the goal is to select small samples that accurately represent the general population with respect to a set of covariates of interest; or (ii) experimental design, where the goal is to assemble treatment groups that are similar to each other with respect to a set of covariates of interest. Rerandomization is a similar method proposed by K. Morgan and D. Rubin. Both methods intentionally select good samples but also, in slightly different ways, introduce some noise into the selection procedure to obtain a decoupling effect that avoids systematic bias and other confounding effects. This paper compares the performance of these two methods and the standard randomization method in two benchmark problems concerning SARS-CoV-2 prevalence and vaccine efficacy. Numerical simulation studies show that haphazard intentional sampling can either reduce operating costs by up to 80% while achieving the same estimation errors as the standard randomization method or, conversely, reduce estimation errors by up to 80% using the same sample sizes.
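As a rough illustration of the rerandomization idea mentioned in the abstract, the sketch below redraws a random allocation until a balance criterion is met. It assumes the Mahalanobis distance between group covariate means as the criterion, which is the criterion Morgan and Rubin describe; the function names, the synthetic covariates, the group sizes, and the acceptance threshold are all hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def mahalanobis_balance(X, assign):
    """Mahalanobis distance between the covariate means of groups 1 and 0."""
    n1 = int(assign.sum())
    n0 = len(assign) - n1
    diff = X[assign == 1].mean(axis=0) - X[assign == 0].mean(axis=0)
    # Covariance of the difference in means under a purely random split.
    cov_diff = np.atleast_2d(np.cov(X, rowvar=False)) * (1.0 / n1 + 1.0 / n0)
    return float(diff @ np.linalg.solve(cov_diff, diff))


def rerandomize(X, n_treated, threshold, rng, max_tries=10_000):
    """Redraw random allocations until the balance distance is <= threshold."""
    n = X.shape[0]
    for tries in range(1, max_tries + 1):
        assign = np.zeros(n, dtype=int)
        assign[rng.choice(n, size=n_treated, replace=False)] = 1
        m = mahalanobis_balance(X, assign)
        if m <= threshold:
            return assign, m, tries
    raise RuntimeError("no acceptable allocation found within max_tries")


# Hypothetical data: 40 units with 3 standardized covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
assign, m, tries = rerandomize(X, n_treated=20, threshold=1.0, rng=rng)
print(f"accepted after {tries} draw(s), Mahalanobis distance = {m:.3f}")
```

A lower threshold gives better balance but fewer acceptable allocations, which is one way to picture the trade-off between balance and decoupling.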
Keywords: haphazard intentional sampling; optimal sampling design; pure randomization; rerandomization
Year: 2022 PMID: 35205519 PMCID: PMC8871113 DOI: 10.3390/e24020225
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Trade-off between balance and decoupling in 300 allocations for two municipalities containing, respectively, 34 (a) and 176 (b) sectors. Sectors are the minimal units by which census information is made publicly available, each containing about 200 households. Balance between allocated and non-allocated sectors is expressed by the 95th percentile of the Mahalanobis distance. Decoupling is expressed by Fleiss's Kappa coefficient; note the different range in each panel (a,b).
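To make the decoupling measure more concrete, here is a minimal sketch of Fleiss's Kappa computed over a set of allocations. It assumes each allocation acts as a "rater" that labels every sector as allocated or not; that reading of the caption, the helper names, and the number of sampled sectors per allocation are assumptions made for this example.

```python
import numpy as np


def fleiss_kappa(counts):
    """Fleiss's Kappa from an (n_subjects, n_categories) table of rating counts.

    Every row must sum to the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    n_subjects = counts.shape[0]
    n_raters = counts[0].sum()
    # Overall share of ratings falling in each category.
    p_j = counts.sum(axis=0) / (n_subjects * n_raters)
    # Observed agreement among rater pairs, per subject.
    P_i = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_e = (p_j ** 2).sum()  # agreement expected by chance
    return float((P_i.mean() - P_e) / (1.0 - P_e))


def allocation_counts(allocations):
    """Turn a 0/1 matrix (allocations x sectors) into per-sector counts."""
    allocations = np.asarray(allocations)
    chosen = allocations.sum(axis=0)
    return np.column_stack([allocations.shape[0] - chosen, chosen])


rng = np.random.default_rng(1)
n_alloc, n_sectors, n_sampled = 300, 34, 10  # n_sampled is hypothetical

# Independent random allocations: agreement should be close to chance level.
random_allocs = np.zeros((n_alloc, n_sectors), dtype=int)
for row in random_allocs:
    row[rng.choice(n_sectors, size=n_sampled, replace=False)] = 1

# The same allocation repeated 300 times: perfect agreement, no decoupling.
same_allocs = np.tile(random_allocs[0], (n_alloc, 1))

kappa_random = fleiss_kappa(allocation_counts(random_allocs))
kappa_same = fleiss_kappa(allocation_counts(same_allocs))
print(f"independent draws: {kappa_random:.4f}, identical draws: {kappa_same:.4f}")
```

Under this reading, a Kappa near zero means the allocations agree no more than chance would predict, while a Kappa near one means the procedure keeps choosing the same sectors.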
Parameters and maximum CPU time for MILP solver by number of sectors.
| Sectors | Parameter | Time (s) |
|---|---|---|
| <50 | 0.1 | 5 |
| 50–4000 | 0.01 | 30 |
| >4000 | 0.001 | 120 |
Figure 2. Difference between groups 1 (sampled sectors) and 0 (not sampled sectors) with respect to average standardized covariate values for each type of allocation.
Root mean square error (RMSE) and standard deviation (SD); red: best result; blue: intermediate result; black: worst result.
| City | Haphazard RMSE | Haphazard SD | Rerandomization RMSE | Rerandomization SD | Pure Randomization RMSE | Pure Randomization SD |
|---|---|---|---|---|---|---|
| São Paulo | | | | | 4.9930% | 4.9899% |
| Rorainópolis | | | | | 3.0028% | 3.0008% |
| Rio de Janeiro | | | | | 4.6324% | 4.6216% |
| Oiapoque | | | | | 3.2107% | 3.2107% |
| Marília | | | | | 3.4950% | 3.4919% |
| Iguatu | | | | | 3.9094% | 3.9003% |
| Cruzeiro do Sul | | | | | 5.0029% | 5.0003% |
| Corrente | | | | | 2.8250% | 2.8230% |
| Campos dos Goytacazes | | | | | 4.4839% | 4.4829% |
| Brasília | | | | | 3.9608% | 3.9539% |
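The RMSE and SD values reported above are nearly equal in every row, which is what the standard error decomposition gives for an estimator with negligible bias. The worked example below applies it to the São Paulo row. It assumes the SD is computed over the same simulation replicates as the RMSE, which the table does not state, so the resulting bias figure is only indicative.

```latex
\mathrm{RMSE}^2 = \mathrm{Bias}^2 + \mathrm{SD}^2
\quad\Longrightarrow\quad
|\mathrm{Bias}| = \sqrt{\mathrm{RMSE}^2 - \mathrm{SD}^2}

% São Paulo, values in percentage points:
|\mathrm{Bias}| = \sqrt{4.9930^2 - 4.9899^2} \approx 0.18
```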
Figure 3. RMSE averaged among municipalities versus number of sampled sectors.
Efficacy rates for each vaccine [24].
| Vaccine | Efficacy (%) |
|---|---|
| CORONAVAC/SINOVAC (control) | 50.4 |
| ASTRAZENECA/OXFORD | 70.4 |
| MODERNA | 94.5 |
| PFIZER/BIONTECH | 95 |
Figure 4. Maximum pairwise standardized differences in each covariate.
Figure 5. Estimated (boxplots) and actual (vertical lines) COVID-19 infection rates after administration of recommended doses for each vaccine.
Root mean square error (RMSE) and standard deviation (SD); red: best result; blue: intermediate result; black: worst result.
| Group | Haphazard RMSE | Haphazard SD | Rerandomization RMSE | Rerandomization SD | Pure Randomization RMSE | Pure Randomization SD |
|---|---|---|---|---|---|---|
| 1: Coronavac (Sinovac) | | | | | 2.872% | 2.872% |
| 2: Pfizer/Biontech | | | | | 0.260% | 0.260% |
| 3: AstraZeneca/Oxford | | | | | 1.696% | 1.696% |
| 4: Moderna | | | | | 0.311% | 0.311% |