Aya A Mitani, Nathaniel D Mercaldo, Sebastien Haneuse, Jonathan S Schildcrout.
Abstract
BACKGROUND: A large multi-center survey was conducted to understand patients' perspectives on biobank study participation, with particular focus on racial and ethnic minorities. To enrich the study sample with racial and ethnic minorities, disproportionate stratified sampling was implemented, with strata defined by race/ethnicity recorded in electronic health records (EHR), which are known to be inaccurate. We investigate the effect of sampling strata misclassification in complex survey design.
Keywords: Complex survey; Design-based analysis; Disproportionate stratified sampling; Model-based analysis; Stratum misclassification
Year: 2021 PMID: 34247586 PMCID: PMC8273975 DOI: 10.1186/s12874-021-01332-8
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.612
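
The sampling design described in the abstract can be sketched in a few lines. This is a minimal illustration, not the study's procedure: the frame sizes, the equal allocation of 100 per stratum, and the names `FRAME_SIZES`, `TARGET_PER_STRATUM` and `stratified_sample` are all hypothetical. It shows how drawing the same number of units from strata of very different sizes yields unequal design weights.

```python
import random

# Hypothetical frame sizes per EHR-recorded stratum (not taken from the study).
FRAME_SIZES = {"White": 8000, "Black": 1200, "Asian": 300, "Other": 300, "Hispanic": 200}
# Hypothetical equal allocation: small strata are oversampled relative to their frame share.
TARGET_PER_STRATUM = {stratum: 100 for stratum in FRAME_SIZES}


def stratified_sample(frame, n_per_stratum, seed=0):
    """Draw a simple random sample without replacement within each stratum.

    frame: list of (unit_id, stratum) pairs.
    Returns (unit_id, stratum, design_weight) triples, where the design
    weight is N_h / n_h, the inverse of the stratum sampling fraction.
    """
    rng = random.Random(seed)
    units_by_stratum = {}
    for unit_id, stratum in frame:
        units_by_stratum.setdefault(stratum, []).append(unit_id)

    sample = []
    for stratum, unit_ids in units_by_stratum.items():
        n_h = min(n_per_stratum[stratum], len(unit_ids))
        weight = len(unit_ids) / n_h
        for unit_id in rng.sample(unit_ids, n_h):
            sample.append((unit_id, stratum, weight))
    return sample


# Build the hypothetical frame with consecutive unit ids.
frame = []
for stratum, size in FRAME_SIZES.items():
    first_id = len(frame)
    frame.extend((first_id + i, stratum) for i in range(size))

sample = stratified_sample(frame, TARGET_PER_STRATUM)
weights = {stratum: w for _, stratum, w in sample}
print(weights)
```

With these made-up sizes, each sampled unit in the largest stratum carries a weight of 80 and each unit in the smallest a weight of 2, and the weights sum to the frame size of 10,000. Note that the weight attaches to the stratum a unit was sampled from (here, the EHR-recorded value), whether or not that value matches the respondent's self-report.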
Misclassification matrix among Vanderbilt University Medical Center respondents overall and by trust in the healthcare system. Cell values indicate number of respondents and those in parentheses denote row percentages by strata
| EHR-based race/ethnicity | Self-reported White | Self-reported Black | Self-reported Asian | Self-reported Other | Self-reported Hispanic |
|---|---|---|---|---|---|
| Overall | |||||
| White | 134 (94.4) | 1 (0.7) | 0 (0.0) | 6 (4.2) | 1 (0.7) |
| Black | 0 (0.0) | 74 (93.7) | 0 (0.0) | 4 (5.1) | 1 (1.3) |
| Asian | 1 (1.2) | 1 (1.2) | 62 (76.5) | 14 (17.3) | 3 (3.7) |
| Other | 59 (48.0) | 5 (4.1) | 16 (13.0) | 34 (27.6) | 9 (7.3) |
| Hispanic | 43 (24.0) | 29 (16.2) | 3 (1.7) | 9 (5.0) | 95 (53.1) |
| Trust = 0 | |||||
| White | 35 (97.2) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 1 (2.8) |
| Black | 0 (0.0) | 20 (87.0) | 0 (0.0) | 2 (8.7) | 1 (4.3) |
| Asian | 0 (0.0) | 0 (0.0) | 25 (78.1) | 6 (18.8) | 1 (3.1) |
| Other | 28 (56.0) | 1 (2.0) | 3 (6.0) | 14 (28.0) | 4 (8.0) |
| Hispanic | 11 (18.0) | 9 (14.8) | 1 (1.6) | 3 (4.9) | 37 (60.7) |
| Trust = 1 | |||||
| White | 99 (93.4) | 1 (0.9) | 0 (0.0) | 6 (5.7) | 0 (0.0) |
| Black | 0 (0.0) | 54 (96.4) | 0 (0.0) | 2 (3.6) | 0 (0.0) |
| Asian | 1 (2.0) | 1 (2.0) | 37 (75.5) | 8 (16.3) | 2 (4.1) |
| Other | 31 (42.5) | 4 (5.5) | 13 (17.8) | 20 (27.4) | 5 (6.8) |
| Hispanic | 32 (27.1) | 20 (16.9) | 2 (1.7) | 6 (5.1) | 58 (49.2) |
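
As a check on how the table is read, the row percentages can be recomputed from the overall counts. The sketch below copies the overall block of the table; the function names `row_percentages` and `agreement_rate` are illustrative, and the agreement rate is an added summary that the table itself does not report.

```python
# Overall counts from the table above: rows are EHR-based strata,
# columns are self-reported race/ethnicity, both in the same order.
LABELS = ["White", "Black", "Asian", "Other", "Hispanic"]
COUNTS = [
    [134, 1, 0, 6, 1],
    [0, 74, 0, 4, 1],
    [1, 1, 62, 14, 3],
    [59, 5, 16, 34, 9],
    [43, 29, 3, 9, 95],
]


def row_percentages(counts):
    """Express each cell as a percentage of its row total, to one decimal."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([round(100 * cell / total, 1) for cell in row])
    return result


def agreement_rate(counts):
    """Share of respondents whose EHR stratum equals their self-report."""
    diagonal = sum(counts[i][i] for i in range(len(counts)))
    total = sum(sum(row) for row in counts)
    return diagonal / total


ROW_PCT = row_percentages(COUNTS)
for label, row in zip(LABELS, ROW_PCT):
    print(label, row)
print(round(agreement_rate(COUNTS), 3))
```

The diagonal of `ROW_PCT` gives the share of each EHR stratum that was classified consistently with self-report, which is lowest for the Other stratum.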
Fig. 1 Directed acyclic graphs (DAGs) representing disproportionate stratified sampling in the presence of non-differential and differential misclassification
Simulation results: Means, empirical standard errors and 95% coverage probabilities of parameter estimates from 10,000 simulations under observed misclassification rates of race/ethnicity by sampling design and method
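| Variable | Full cohort: mean (ESE) | Full cohort: coverage (%) | SRS: mean (ESE) | SRS: coverage (%) | Disproportionate stratified sampling, design-agnostic: mean (ESE) | Disproportionate stratified sampling, design-agnostic: coverage (%) | Disproportionate stratified sampling, model-based: mean (ESE) | Disproportionate stratified sampling, model-based: coverage (%) | Disproportionate stratified sampling, design-based: mean (ESE) | Disproportionate stratified sampling, design-based: coverage (%) |
|---|---|---|---|---|---|---|---|---|---|---|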
| Intercept | -0.75 (0.01) | 94.7 | -0.75 (0.05) | 95.0 | -0.75 (0.06) | 95.1 | -0.75 (0.10) | 94.5 | -0.75 (0.07) | 94.9 |
| Black | -0.25 (0.02) | 95.0 | -0.26 (0.15) | 94.9 | -0.25 (0.11) | 94.8 | -0.26 (0.25) | 95.4 | -0.25 (0.12) | 94.9 |
| Asian | -0.50 (0.07) | 95.0 | -0.59 (0.82) | 96.7 | -0.51 (0.21) | 95.2 | -0.51 (0.26) | 95.2 | -0.52 (0.27) | 93.0 |
| Other | 1.25 (0.03) | 95.1 | 1.26 (0.21) | 95.2 | 1.25 (0.14) | 95.5 | 1.26 (0.18) | 95.6 | 1.26 (0.21) | 95.1 |
| Hispanic | -1.50 (0.07) | 95.0 | -1.63 (0.87) | 96.8 | -1.55 (0.41) | 95.9 | -1.55 (0.42) | 96.0 | -1.56 (0.47) | 95.2 |
| Low income | 1.00 (0.02) | 94.5 | 1.00 (0.12) | 95.4 | 1.00 (0.11) | 95.1 | 1.01 (0.11) | 95.1 | 1.00 (0.14) | 95.0 |
| Intercept | -0.75 (0.01) | 94.7 | -0.75 (0.05) | 95.1 | -0.57 (0.06) | 13.0 | -0.74 (0.10) | 95.0 | -0.75 (0.07) | 95.2 |
| Black | -0.25 (0.02) | 95.0 | -0.25 (0.15) | 95.1 | -0.42 (0.11) | 67.6 | -0.42 (0.24) | 90.1 | -0.26 (0.12) | 94.8 |
| Asian | -0.50 (0.07) | 95.0 | -0.58 (0.79) | 96.5 | -0.81 (0.20) | 67.9 | -2.16 (0.27) | 0.0 | -0.52 (0.28) | 91.7 |
| Other | 1.25 (0.03) | 95.1 | 1.26 (0.21) | 95.0 | 0.96 (0.13) | 42.4 | 0.02 (0.19) | 0.0 | 1.25 (0.20) | 94.5 |
| Hispanic | -1.50 (0.07) | 95.0 | -1.62 (0.87) | 97.0 | -1.63 (0.36) | 96.3 | -2.14 (0.38) | 62.1 | -1.55 (0.42) | 95.2 |
| Low income | 1.00 (0.02) | 94.5 | 1.00 (0.12) | 94.7 | 1.00 (0.11) | 95.2 | 1.01 (0.11) | 94.9 | 1.00 (0.15) | 94.8 |
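
The three summaries reported in each cell pair (mean, empirical standard error, 95% coverage) can be computed from per-replicate output roughly as follows. This is a sketch under assumptions: the function name `summarize_simulation` is hypothetical, the intervals are taken to be Wald intervals with a 1.96 multiplier, and the toy replicates are drawn from a normal distribution rather than from any fitted model.

```python
import random
import statistics


def summarize_simulation(estimates, std_errors, truth, z=1.96):
    """Summarize replicate estimates of one parameter.

    Returns (mean estimate, empirical SE, coverage in percent), where the
    empirical SE is the standard deviation of the estimates across
    replicates and coverage is the share of intervals
    estimate +/- z * SE that contain the true value.
    """
    mean_estimate = statistics.fmean(estimates)
    empirical_se = statistics.stdev(estimates)
    covered = sum(
        1
        for estimate, se in zip(estimates, std_errors)
        if estimate - z * se <= truth <= estimate + z * se
    )
    return mean_estimate, empirical_se, 100 * covered / len(estimates)


# Toy replicates (assumption): an unbiased estimator with a correctly
# estimated standard error of 0.10 around a true value of -0.25.
rng = random.Random(2021)
TRUTH = -0.25
toy_estimates = [rng.gauss(TRUTH, 0.10) for _ in range(10_000)]
toy_std_errors = [0.10] * len(toy_estimates)

mean_est, ese, coverage = summarize_simulation(toy_estimates, toy_std_errors, TRUTH)
print(round(mean_est, 2), round(ese, 2), round(coverage, 1))
```

Under this reading, a mean that drifts from the true value together with coverage well below 95% signals bias, whereas a large empirical SE with nominal coverage signals only a loss of precision.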
Fig. 2 Relative uncertainty of design-agnostic, model-based and design-based methods under disproportionate stratified sampling compared to simple random sampling by degree of non-differential misclassification
Demographics of the Vanderbilt University Medical Center CERC respondent sample. Percentages [counts] are provided for each characteristic
| Characteristic | Survey response of 604 respondents, % [count] |
|---|---|
| Gender | |
| Male | 45 [274] |
| Female | 55 [330] |
| Age | |
| <35 | 26 [160] |
| 35+ | 74 [444] |
| Self-reported race/ethnicity | |
| White | 39 [237] |
| Black | 18 [110] |
| Asian | 13 [81] |
| Other | 11 [67] |
| Hispanic | 18 [109] |
| Education | |
| Less than HS | 9 [53] |
| HS to some college | 38 [230] |
| At least BS | 53 [321] |
| Residence | |
| Suburban/Urban | 54 [327] |
| Rural | 46 [277] |
| Income ($) | |
| <30,000 | 27 [166] |
| 30,000 to 59,999 | 23 [138] |
| 60,000 to 149,999 | 33 [200] |
| 150,000+ | 17 [100] |
Results from design-agnostic, model-based and design-based logistic regression analyses in which trust in healthcare system was regressed on self-reported race/ethnicity, low income, age, gender, rural living and education
| Variable | Design-agnostic OR (95% CI) | Model-based OR (95% CI) | Design-based OR (95% CI) |
|---|---|---|---|
| White | 1.00 | 1.00 | 1.00 |
| Black | 1.11 (0.66, 1.86) | 1.42 (0.60, 3.33) | 0.71 (0.26, 1.92) |
| Asian | 0.80 (0.46, 1.38) | 1.69 (0.71, 4.02) | 0.69 (0.33, 1.44) |
| Other | 0.76 (0.43, 1.35) | 1.19 (0.61, 2.31) | 1.23 (0.37, 4.07) |
| Hispanic | 0.67 (0.41, 1.09) | 0.78 (0.42, 1.47) | 0.24 (0.08, 0.76) |
| No (Income ≥$30,000) | 1.00 | 1.00 | 1.00 |
| Yes (Income <$30,000) | 1.25 (0.81, 1.95) | 1.24 (0.79, 1.95) | 1.46 (0.56, 3.83) |
| ≤35 | 1.03 (0.69, 1.52) | 1.54 (0.58, 4.10) | 0.87 (0.39, 1.94) |
| >35 | 1.00 | 1.00 | 1.00 |
| Male | 1.00 | 1.00 | 1.00 |
| Female | 0.69 (0.49, 0.99) | 1.23 (0.34, 4.45) | 1.03 (0.49, 2.16) |
| No (Suburban/Urban) | 1.00 | 1.00 | 1.00 |
| Yes (Rural) | 0.83 (0.59, 1.18) | 0.82 (0.57, 1.17) | 0.63 (0.31, 1.27) |
| Less than HS | 0.96 (0.48, 1.92) | 0.90 (0.43, 1.86) | 0.90 (0.13, 6.37) |
| HS to some college | 0.95 (0.64, 1.39) | 0.93 (0.62, 1.39) | 1.30 (0.58, 2.89) |
| At least college graduate | 1.00 | 1.00 | 1.00 |
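
The odds ratios and intervals above are exponentiated logistic regression coefficients. The sketch below shows the conversion in both directions, assuming Wald intervals with a normal quantile of 1.96; the source does not state how the intervals were built, and a design-based analysis may use a different reference distribution. The helper names are illustrative.

```python
import math

Z_95 = 1.96  # assumed normal quantile; the interval method is not given in the source


def odds_ratio_ci(beta, se, z=Z_95):
    """Turn a log odds ratio and its SE into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_scale_from_ci(lower, upper, z=Z_95):
    """Recover the log odds ratio and SE implied by a Wald interval."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    beta = (log_lower + log_upper) / 2    # interval midpoint on the log scale
    se = (log_upper - log_lower) / (2 * z)
    return beta, se


# Design-based interval for Hispanic vs. White from the table: (0.08, 0.76).
beta, se = log_scale_from_ci(0.08, 0.76)
print(odds_ratio_ci(beta, se))
```

Working on the log scale makes the widths of the intervals comparable across methods. The point estimate recovered from the midpoint may differ slightly from the tabulated 0.24 because the published bounds are rounded to two decimals.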