Dick Durevall, Annika Lindskog, Gavin George.
Abstract
We examine the relationship between school attendance and HIV incidence among young women in South Africa. Our aim is to distinguish a causal effect from correlation. Towards this end, we apply three methods to population-based longitudinal data for 2005-2012 in KwaZulu-Natal. After establishing a negative association, we first use a method that assesses the influence of omitted variables. We then estimate models with exclusion restrictions to remove endogeneity bias, and finally we estimate models that control for unobserved factors that remain constant over time. All three methods have strengths and weaknesses, but none of them suggests a causal effect. Thus, interventions that increase school attendance in KwaZulu-Natal would probably not mechanically reduce HIV risk for young women. Although the impact of school attendance could vary depending on context, unobserved variables are likely to be an important reason for the common finding of a negative association between school attendance and HIV incidence in the literature.
Year: 2019 PMID: 30830933 PMCID: PMC6398860 DOI: 10.1371/journal.pone.0213056
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Female HIV prevalence and HIV incidence in 2005 by age.
Descriptive statistics.
| | Total | Seroconverted | Risk ratio | 95% CI of risk ratio |
|---|---|---|---|---|
| Attend school, n (%) | 5,697 (77.59) | 96 (1.69) | 0.280 | [0.213, 0.369] |
| Do not attend school, n (%) | 1,645 (22.41) | 99 (6.02) | Comparison group | |
| Urban, n (%) | 209 (2.85) | 5 (2.39) | 1.798 | [0.753, 4.293] |
| Peri-urban, n (%) | 2,296 (31.27) | 81 (3.53) | 2.651 | [2.103, 3.341] |
| Rural, n (%) | 4,837 (65.88) | 109 (2.25) | Comparison group | |
| Distance to primary road, mean (SD) | 7.271 (6.734) | | 0.975 | [0.954, 0.997] |
| Distance to secondary road, mean (SD) | 1.453 (1.241) | | 0.979 | [0.872, 1.097] |
| Age 15, n (%) | 572 (7.79) | 0 (0.00) | 0.000 | Not defined |
| Age 16, n (%) | 1,454 (19.80) | 8 (0.55) | Comparison group | |
| Age 17, n (%) | 1,358 (18.50) | 12 (0.88) | 1.606 | [0.659, 3.917] |
| Age 18, n (%) | 1,079 (14.70) | 23 (2.13) | 3.874 | [1.740, 8.628] |
| Age 19, n (%) | 836 (11.39) | 23 (2.75) | 5.000 | [2.247, 11.128] |
| Age 20, n (%) | 625 (8.51) | 31 (4.96) | 9.015 | [4.167, 19.500] |
| Age 21, n (%) | 466 (6.35) | 23 (4.94) | 8.970 | [4.040, 19.918] |
| Age 22, n (%) | 366 (4.99) | 24 (6.56) | 11.918 | [5.399, 26.310] |
| Age 23, n (%) | 309 (4.21) | 30 (9.71) | 17.646 | [8.169, 38.117] |
| Age 24, n (%) | 277 (3.77) | 21 (7.58) | 13.779 | [6.166, 30.792] |
| Year 2005, n (%) | 674 (9.18) | 7 (1.04) | Comparison group | |
| Year 2006, n (%) | 889 (12.11) | 22 (2.47) | 2.387 | [1.026, 5.553] |
| Year 2007, n (%) | 881 (12.00) | 23 (2.61) | 2.514 | [1.085, 5.823] |
| Year 2008, n (%) | 1,051 (14.31) | 20 (1.90) | 1.832 | [0.779, 4.310] |
| Year 2009, n (%) | 759 (10.34) | 23 (3.03) | 2.918 | [1.260, 6.756] |
| Year 2010, n (%) | 982 (13.38) | 25 (2.55) | 2.451 | [1.066, 5.635] |
| Year 2011, n (%) | 902 (12.29) | 29 (3.22) | 3.096 | [1.364, 7.024] |
| Year 2012, n (%) | 1,204 (16.40) | 46 (3.82) | 3.679 | [1.670, 8.102] |
| Total sample, n (%) | 7,342 (100.00) | 195 (2.66) | | |
The observations are from 2,976 women. The risk ratios were computed using Stata’s glm command with the binomial log link.
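The unadjusted risk ratio for school attendance can be checked by hand from the counts in the table. The sketch below is not the authors' Stata code. It uses the standard log-scale Wald interval for a ratio of two proportions and treats all person-year observations as independent, which is an assumption on our part; the function name `risk_ratio_ci` is hypothetical.

```python
import math

def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A relative to group B with a Wald CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Delta-method standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Counts from the descriptive statistics: attending vs not attending school.
rr, lower, upper = risk_ratio_ci(96, 5697, 99, 1645)
print(f"RR = {rr:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

With a single binary covariate this calculation should agree with the tabulated school attendance row to three decimals. Rows with several categories or continuous covariates depend on how the model was specified and are not expected to match a simple two-group ratio.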
The association between secondary school attendance and HIV incidence (probit marginal effects).
| | Marginal effect |
|---|---|
| School attendance | −0.014 |
| | (0.005) |
| | [-0.023, -0.004] |
| Number of observations | 7,342 |
| Number of young women | 2,976 |
The model also includes a constant, age and year dummies, peri-urban or urban residence, and distances to the primary road and the secondary road. Standard errors, clustered at the household level, in parentheses.
*** p<0.01.
a) The reported effect is the percentage point impact of school attendance on the probability of HIV infection.
Robustness of the impact of secondary school attendance on HIV incidence to selection on unobserved factors (bivariate probit model marginal effects).
| Assumed ρ | 0.00 | −0.05 | −0.1 | −0.15 | −0.2 | −0.25 | −0.3 | −0.364 |
|---|---|---|---|---|---|---|---|---|
| Marginal effect | −0.014 | −0.009 | −0.004 | 0.000 | 0.005 | 0.010 | 0.016 | 0.023 |
| Standard error | (0.005) | (0.005) | (0.005) | (0.005) | (0.005) | (0.005) | (0.005) | (b) |
| 95% CI | [-0.023, -0.004] | [-0.018, -0.000] | [-0.011, 0.002] | [-0.009, 0.009] | [-0.004, 0.014] | [0.001, 0.020] | [0.007, 0.025] | |
Based on constrained bivariate probit estimations. Both the school attendance and the HIV incidence equations include a constant, age and year dummies, peri-urban or urban residence, and distances to the primary road and the secondary road. Standard errors are computed with the delta method.
** p<0.05
*** p<0.01.
a) −0.364 is the selection on observed variables.
b) The standard error could not be estimated.
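One way to read the robustness table is to ask how strong selection on unobserved factors must be, relative to selection on observed variables, before the negative effect disappears. The sketch below only re-uses the tabulated values. Reading the "Assumed" row as the assumed correlation between the unobserved determinants of school attendance and HIV incidence is our interpretation, and the variable names are ours.

```python
# Values transcribed from the robustness table above.
assumed_rho = [0.00, -0.05, -0.10, -0.15, -0.20, -0.25, -0.30, -0.364]
marginal_effect = [-0.014, -0.009, -0.004, 0.000, 0.005, 0.010, 0.016, 0.023]
rho_observed = -0.364  # selection on observed variables (footnote a)

# First tabulated correlation at which the estimated effect is no longer negative.
rho_zero = next(r for r, m in zip(assumed_rho, marginal_effect) if m >= 0)

# Share of the selection on observables needed to remove the negative effect.
ratio = rho_zero / rho_observed
print(f"effect vanishes at rho = {rho_zero}, ratio = {ratio:.2f}")
```

Under this reading, selection on unobserved factors about 0.4 times as strong as selection on observed variables would be enough to account for the whole negative association.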
The impact of secondary school attendance on HIV incidence using exclusion restrictions (bivariate probit model marginal effects).
| | Marginal effect | 95% CI |
|---|---|---|
| School attendance | 0.010 | [-0.022, 0.043] |
| | (0.006) | |
| Nearest secondary school <7km away | 0.162 | [0.084, 0.233] |
| | (0.038) | |
| Distance between the nearest and second nearest secondary school | −0.013 | [-0.022, -0.002] |
| | (0.005) | |
| ρ | −0.240 | |
| p-value of test of ρ = 0 | [0.002] | |
| Number of observations | 7,341 | |
| Number of young women | 2,975 | |
Both equations also include a constant, age and year dummies, peri-urban or urban residence, and distances to the primary road and the secondary road. Standard errors, clustered at the household level, in parentheses.
** p<0.05
*** p<0.01.
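For readers unfamiliar with the setup, a bivariate probit with exclusion restrictions is commonly written as below. This is the textbook form, not a specification taken from the source, and the notation is ours: $S_{it}$ is school attendance, $H_{it}$ is HIV seroconversion, $x_{it}$ holds the controls shared by both equations, and $z_{it}$ holds the two school-distance variables that enter only the attendance equation.

```latex
\begin{aligned}
S_{it} &= \mathbf{1}\{x_{it}'\gamma + z_{it}'\pi + u_{it} > 0\}, \\
H_{it} &= \mathbf{1}\{x_{it}'\beta + \delta S_{it} + \varepsilon_{it} > 0\}, \\
(u_{it}, \varepsilon_{it}) &\sim N\!\left(0,
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right).
\end{aligned}
```

In this form the correlation $\rho$ captures unobserved factors that affect both attendance and infection, and the exclusion of $z_{it}$ from the second equation is what helps separate $\delta$ from $\rho$.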
The association between secondary school attendance and HIV incidence (correlated random effect probit marginal effects).
| | Marginal effect |
|---|---|
| School attendance | −0.007 |
| | (0.006) |
| | [-0.019, 0.005] |
| Number of observations | 7,342 |
| Number of young women | 2,976 |
The model also includes a constant, age and year dummies, peri-urban or urban residence, distances to the primary road and the secondary road, and individual level means of all explanatory variables. Standard errors, clustered at the household level, in parentheses.
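The individual level means mentioned in the note are what make this a correlated random effects model. The sketch below shows only that data-preparation step, on a made-up panel; the records and variable names are hypothetical and no model is estimated.

```python
from collections import defaultdict

# Hypothetical person-year records: (woman_id, year, attends_school).
panel = [
    (1, 2005, 1), (1, 2006, 1), (1, 2007, 0),
    (2, 2005, 0), (2, 2006, 0),
    (3, 2006, 1),
]

# Collect each woman's attendance values across her observed years.
values_by_woman = defaultdict(list)
for woman_id, _year, attends in panel:
    values_by_woman[woman_id].append(attends)

# Individual-level mean of the explanatory variable (the added regressor).
mean_by_woman = {w: sum(v) / len(v) for w, v in values_by_woman.items()}

# Augmented rows: original variable, its individual mean, and the deviation.
augmented = [
    (w, year, attends, mean_by_woman[w], attends - mean_by_woman[w])
    for w, year, attends in panel
]
for row in augmented:
    print(row)
```

Once the mean is included as a regressor, the attendance coefficient is driven mainly by changes within the same woman over time. Women whose attendance never changes (ids 2 and 3 here) have zero deviation from their own mean and contribute little to it.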