Mark E McGovern, Giampiero Marra, Rosalba Radice, David Canning, Marie-Louise Newell, Till Bärnighausen.
Abstract
INTRODUCTION: HIV testing is a cornerstone of efforts to combat the HIV epidemic, and testing conducted as part of surveillance provides invaluable data on the spread of infection and the effectiveness of campaigns to reduce the transmission of HIV. However, participation in HIV testing can be low, and if respondents systematically select not to be tested because they know or suspect they are HIV positive (and fear disclosure), standard approaches to deal with missing data will fail to remove selection bias. We implemented Heckman-type selection models, which can be used to adjust for missing data that are not missing at random, and established the extent of selection bias in a population-based HIV survey in an HIV hyperendemic community in rural South Africa.
Keywords: HIV prevalence; Heckman-type selection models; demographic surveillance; missing data; non-participation; selection bias
Year: 2015 PMID: 26613900 PMCID: PMC4662682 DOI: 10.7448/IAS.18.1.19954
Source DB: PubMed Journal: J Int AIDS Soc ISSN: 1758-2652 Impact factor: 5.396
Consent to test for HIV at the 2009 Africa Centre Surveillance cohort by sex
| | Women (n) | Women (%) | Men (n) | Men (%) |
|---|---|---|---|---|
| Refuse to test | 10,242 | 65 | 7018 | 73 |
| Consent to test | 5565 | 35 | 2567 | 27 |
| Total | 15,807 | 100 | 9585 | 100 |
Interviewer statistics for the Africa Centre 2009 HIV survey
| | Men | Women |
|---|---|---|
| Number of interviewers | 56 | 57 |
| Median number of interviewees per interviewer (25th and 75th percentiles) | 127 (32–259) | 174 (94–403.5) |
| Median consent (25th and 75th percentiles) | 25% (15–39%) | 33% (21–40%) |
| Median HIV prevalence (25th and 75th percentiles) | 15% (10–21%) | 24% (18–31%) |
Estimates are calculated using one observation per interviewer. For each interviewer, the consent rate is calculated as the number of residents from whom consent to test for HIV was obtained by the interviewer, divided by the number of residents from whom consent to test for HIV was sought by the interviewer. The median HIV prevalence is the median in the distribution of prevalence observed across the participants who consented for each interviewer.
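The per-interviewer calculation described in the note above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record layout `(interviewer_id, consented)` and the toy data are hypothetical, and only the consent-rate definition (consent obtained divided by consent sought, one value per interviewer) comes from the note.

```python
from collections import defaultdict
from statistics import median


def interviewer_consent_rates(records):
    """Return {interviewer_id: consent rate}.

    records: iterable of (interviewer_id, consented) pairs, one pair per
    resident from whom consent was sought (hypothetical layout).
    """
    sought = defaultdict(int)    # residents approached by each interviewer
    obtained = defaultdict(int)  # residents who consented
    for interviewer_id, consented in records:
        sought[interviewer_id] += 1
        obtained[interviewer_id] += int(consented)
    return {i: obtained[i] / sought[i] for i in sought}


# Hypothetical toy data, not taken from the survey.
records = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", True), ("B", False), ("B", False),
    ("C", True), ("C", False), ("C", False), ("C", False), ("C", False),
]

rates = interviewer_consent_rates(records)
# One observation per interviewer, then the median across interviewers.
median_rate = median(rates.values())
```

Because each interviewer contributes a single observation, an interviewer with 400 interviews carries the same weight in the median as one with 30, which is why these medians can differ from the pooled consent percentages.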
Figure 1. Number of interviews, consent rates and HIV prevalence by interviewer (male respondents).
Figure 2. Number of interviews, consent rates and HIV prevalence by interviewer (female respondents).
Predictors of consent to an HIV test
| Variables | Women: consent (logit odds ratio) | Men: consent (logit odds ratio) |
|---|---|---|
| Good interviewer (above 75th consent percentile) | 2.17 | 2.40 |
| Interviewer experience (lowest quintile omitted) | ||
| Second quintile | 0.96 (0.06) | 0.94 (0.08) |
| Middle quintile | 0.94 (0.06) | 0.79 |
| Fourth quintile | 1.14 | 0.99 (0.10) |
| Highest quintile | 1.31 | 1.35 |
| Age group (15 to 19 omitted) | ||
| 20–24 | 0.97 (0.09) | 0.98 (0.09) |
| 25–29 | 0.68 | 0.79 |
| 30–34 | 0.65 | 0.82 |
| 35–39 | 0.65 | 0.77 |
| 40–44 | 0.62 | 0.80 (0.11) |
| 45–49 | 0.75 | 1.04 (0.15) |
| 50–54 | 0.83 | 1.17 (0.17) |
| 55–59 | 0.87 (0.10) | 1.34 |
| 60 + | 0.92 (0.10) | 2.03 |
| Type of location of residence (urban omitted) | ||
| Peri-urban | 1.07 (0.07) | 1.12 (0.11) |
| Rural | 2.18 (2.56) | 0.36 (0.29) |
| Distance to nearest clinic (≤1 km omitted), km | ||
| 1–2 | 0.94 (0.07) | 0.80 |
| 2–3 | 0.92 (0.08) | 0.77 |
| 3–4 | 1.02 (0.09) | 1.01 (0.13) |
| 4–5 | 1.18 (0.12) | 1.12 (0.16) |
| 5+ | 1.37 | 1.62 |
| Distance to nearest secondary school, km | ||
| 1–2 | 0.99 (0.05) | 0.99 (0.07) |
| 2–3 | 1.09 (0.07) | 1.08 (0.10) |
| 3–4 | 0.97 (0.08) | 0.98 (0.12) |
| 4–5 | 0.96 (0.11) | 0.80 (0.15) |
| 5+ | 0.65 | 0.72 (0.20) |
| Distance to nearest primary school, km | ||
| 1–2 | 1.22 | 1.17 |
| 2–3 | 1.16 | 1.18 (0.12) |
| 3–4 | 1.25 (0.21) | 0.91 (0.25) |
| Distance to nearest Level 1 road, km | ||
| 1–2 | 0.97 (0.07) | 1.03 (0.10) |
| 2–3 | 0.84 (0.10) | 1.11 (0.18) |
| 3–4 | 0.87 (0.13) | 1.04 (0.22) |
| 4–5 | 0.75 | 0.95 (0.20) |
| 5+ | 0.55 | 0.70 |
| Distance to nearest Level 2 road, km | ||
| 1–2 | 0.91 | 0.88 |
| 2–3 | 0.96 (0.06) | 0.81 |
| 3–4 | 1.05 (0.10) | 1.03 (0.13) |
| 4–5 | 1.44 | 1.58 |
| 5+ | 1.35 (0.26) | 3.11 |
| Marital status ( | ||
| Polygamous | 1.10 (0.13) | 0.69 |
| Divorced/separated/ widowed | 0.95 (0.06) | 1.23 (0.22) |
| Engaged | 1.34 | 1.00 (0.17) |
| Never married | 1.04 (0.06) | 1.71 |
| Under legal age | 0.90 (0.10) | 1.91 |
| Missing/other | 0.67 (0.34) | 0.41 |
| Mother alive ( | ||
| Alive | 1.01 (0.05) | 0.94 (0.08) |
| Missing/other | 0.43 | 1.26 (0.48) |
| Father alive ( | ||
| Alive | 1.00 (0.06) | 0.90 (0.07) |
| Missing/other | 0.91 (0.22) | 0.78 (0.24) |
| Have electricity in house ( | ||
| No | 0.91 (0.06) | 0.95 (0.09) |
| N/A | 1.02 (0.23) | 1.08 (0.34) |
| Missing/other | 1.35 (0.75) | 0.46 (0.25) |
| Type of fuel in house ( | ||
| Coal/wood | 1.04 (0.06) | 0.82 |
| Gas | 1.03 (0.09) | 0.87 (0.11) |
| Other | 1.06 (0.13) | 0.80 (0.13) |
| Missing/other | 0.92 (0.21) | 0.43 |
| N/A | 0.75 (0.42) | 1.09 (0.60) |
| Household asset quintile (lowest omitted) | ||
| Second | 0.89 | 1.12 (0.10) |
| Third | 0.88 | 0.98 (0.10) |
| Fourth | 0.79 | 0.83 (0.10) |
| Fifth | 0.71 | 0.73 |
| Missing/other | 0.94 (0.19) | 1.36 (0.36) |
| Education ( | ||
| Primary | 1.09 (0.07) | 0.90 (0.10) |
| Junior secondary | 0.95 (0.07) | 0.88 (0.10) |
| Upper secondary | 0.71 | 0.70 |
| Unknown | 0.77 | 0.75 |
| Missing/other | 0.55 | 0.86 (0.16) |
| Running water in house | 1.09 | 1.08 (0.08) |
| Inside toilet | 0.83 | 1.10 (0.15) |
| Constant | 1.09 (0.26) | 0.39 |
| Observations | 15,807 | 9585 |
Robust standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1. Coefficients shown are odds ratios from a logistic regression model for consent to test. In addition to the variables shown in the table, the models also control for location of residence (Isigodi) fixed effects and month of interview, which are not shown for reasons of space. Column 1 is for women only; Column 2 is for men only. The “good interviewer” variable is defined as having been interviewed by an interviewer who obtained an overall consent rate above the 75th consent percentile. For each respondent in the sample, the interviewer consent rate is calculated as the consent rate among that interviewer's other respondents, excluding the respondent's own consent decision (in order to avoid a mechanical correlation between own consent and interviewer-level consent). Interviewer experience is calculated as the number of interviews conducted in the 2009 surveillance by a respondent's interviewer prior to the respondent's own interview.
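The leave-one-out consent rate described in the note can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function names, the `(interviewer_id, consented)` record layout and the toy data are hypothetical. The helper `odds_ratio` simply shows the standard relationship between a logit coefficient and the odds ratios reported in the table.

```python
import math
from collections import defaultdict


def leave_one_out_rates(records):
    """Consent rate among each respondent's interviewer's OTHER respondents.

    records: list of (interviewer_id, consented) pairs (hypothetical layout).
    Returns a list aligned with records; an entry is None when the
    interviewer has no other respondents.
    """
    total = defaultdict(int)
    yes = defaultdict(int)
    for interviewer_id, consented in records:
        total[interviewer_id] += 1
        yes[interviewer_id] += int(consented)

    rates = []
    for interviewer_id, consented in records:
        n_others = total[interviewer_id] - 1
        if n_others == 0:
            rates.append(None)
        else:
            # Remove the respondent's own outcome from numerator and denominator.
            rates.append((yes[interviewer_id] - int(consented)) / n_others)
    return rates


def odds_ratio(logit_coefficient):
    """An odds ratio is the exponential of a logit coefficient."""
    return math.exp(logit_coefficient)


# Hypothetical toy data, not taken from the survey.
records = [("A", True), ("A", True), ("A", False), ("B", False), ("B", True)]
loo = leave_one_out_rates(records)
```

Excluding the respondent's own outcome matters most for interviewers with few interviews, where a single decision would otherwise move the interviewer-level rate substantially.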
Estimates of HIV prevalence
| Model | HIV prevalence (%) | 95% CI (%) |
|---|---|---|
| **Men** | | |
| Cases with valid HIV test | 16 | 15–17 |
| Imputation | 18 | 16–21 |
| Heckman selection model (interviewer) | 25 | 15–35 |
| **Women** | | |
| Cases with valid HIV test | 24 | 23–26 |
| Imputation | 27 | 26–28 |
| Heckman selection model (interviewer) | 33 | 27–40 |
CI, confidence interval. The following variables are included as predictors of consent to test for HIV and HIV status: age group, location of residence (Isigodi), type of location of residence (urban/rural/peri-urban), distance to nearest clinic, distance to nearest secondary school, distance to nearest primary school, distance to nearest Level 1 road, distance to nearest Level 2 road, marital status, education, mother/father is alive, electricity in home, fuel in home, toilet in home, water in home and household asset index.
The first row is the mean prevalence among the sample who consent to test and have a valid HIV test (complete case analysis). The second row imputes HIV prevalence for those who refused consent using the covariates described above. Row 3 implements a Heckman selection model for HIV status and consent to an HIV test using interviewer fixed effects. We show results from the copula selection model with the best fit as measured by the AIC, which for both men and women is the Gaussian copula (equivalent to assuming the error terms are drawn from the bivariate normal distribution).
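For orientation, the Gaussian-copula case mentioned above corresponds to the textbook bivariate probit with sample selection. The notation below is introduced here for illustration and is not taken from the source: $C_i$ is consent, $H_i$ is HIV status, $X_i$ are the shared covariates, and $Z_i$ stands for the interviewer effects, assumed (as is usual in this approach) to enter the consent equation only.

```latex
\begin{aligned}
C_i &= \mathbf{1}\!\left[X_i'\beta_C + Z_i'\gamma + u_i > 0\right], \\
H_i &= \mathbf{1}\!\left[X_i'\beta_H + \varepsilon_i > 0\right]
      \quad \text{(observed only if } C_i = 1\text{)}, \\
\begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
    &\sim N\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix},
      \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right), \\
\mathrm{AIC} &= 2k - 2\ln\hat{L}.
\end{aligned}
```

When $\rho = 0$, consent carries no information about HIV status beyond the covariates, and complete-case or imputation estimates would be adequate; a non-zero $\rho$ is what the selection model is designed to detect and adjust for. In the AIC, $k$ is the number of estimated parameters and $\hat{L}$ the maximized likelihood, so lower values indicate a better trade-off between fit and complexity.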
The confidence interval for the imputation model is based on five imputations. The confidence interval for the Heckman selection model is based on the delta method.
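The source does not state how the five imputations were combined. One common choice is Rubin's rules, sketched below as an assumption rather than as the authors' procedure; the function name and the toy numbers are hypothetical and are not study results.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m imputed-data estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset.
    variances: their squared standard errors, in the same order.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical prevalence estimates from five imputed datasets.
estimates = [0.17, 0.18, 0.19, 0.18, 0.18]
variances = [0.0001] * 5

pooled, total_variance = pool_rubin(estimates, variances)
standard_error = total_variance ** 0.5
```

The between-imputation term is what widens the interval relative to a single imputed dataset: it reflects uncertainty about the missing values themselves, not only sampling error.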