Justin Millar, Paul Psychas, Benjamin Abuaku, Collins Ahorlu, Punam Amratia, Kwadwo Koram, Samuel Oppong, Denis Valle.
Abstract
BACKGROUND: There is a need for comprehensive evaluations of the underlying local factors that contribute to residual malaria in sub-Saharan Africa. However, it is difficult to compare the wide array of demographic, socio-economic, and environmental variables associated with malaria transmission using standard statistical approaches while accounting for seasonal differences and nonlinear relationships. This article uses a Bayesian model averaging (BMA) approach for identifying and comparing potential risk and protective factors associated with residual malaria.
Keywords: Bayesian model averaging; Nonlinear patterns; Risk factors; Statistical methods
Year: 2018 PMID: 30268127 PMCID: PMC6162921 DOI: 10.1186/s12936-018-2491-2
Source DB: PubMed Journal: Malar J ISSN: 1475-2875 Impact factor: 2.979
Fig. 1 Map of the study district, Bunkpurugu-Yunyoo, in northern Ghana (red polygon shown in the inset map). Interpolations depict malaria prevalence in young children (ages 6–59 months) in Bunkpurugu-Yunyoo, Ghana, during the rainy season (left map) and the dry season (right map). Six biannual surveys were conducted from 2010 to 2013 and pooled by season. Black circles denote the sampled communities and yellow stars denote local urban centers. Interpolations were made using an inverse-distance weighted function in ArcGIS 10.3
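The inverse-distance weighted (IDW) interpolation named in the caption can be illustrated with a minimal sketch. This is not the ArcGIS implementation: the power parameter of 2, the use of all sampled points as neighbours, and the coordinates and prevalence values below are assumptions made purely for illustration, since the caption does not report the tool settings.

```python
import math

def idw_interpolate(samples, query, power=2.0):
    """Inverse-distance weighted estimate at `query` from (x, y, value) samples.

    `power` is an assumed setting; the caption does not state the one used.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # query coincides with a sampled community
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical community locations (km) and prevalence values, not study data
communities = [(0.0, 0.0, 0.60), (10.0, 0.0, 0.40), (0.0, 10.0, 0.50)]
estimate = idw_interpolate(communities, (5.0, 0.0))
print(round(estimate, 3))
```

Because the weights decay with distance, the estimate at any unsampled location always lies between the smallest and largest sampled values.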
Potential risk or protective covariates collected from surveys
| Variable | Details |
|---|---|
| Demographic and socio-economic | |
| Age | From 6 to 59 months old |
| Caretaker’s education | Binary variable; either (1) for high school education and above or (0) otherwise |
| Caretaker’s age | In years |
| Ethnicity | Four groups; (1) Bimoba, (2) Konkomba, (3) Mamprusi, and (4) Other, based on language of caretaker |
| Farming caretaker<sup>a</sup> | Binary variable; either caretaker occupation being farming (1) or otherwise (0) |
| Gender | Binary variable; either male (1) or female (0) |
| Surface water source | Binary variable; either (1) source of drinking water from exposed surface water or (0) otherwise |
| Thatch roofing | Binary variable; either housing structure had a thatched roof (1) or otherwise (0) |
| Wealth quintile | Constructed from multiple variables, using the methodology of the Ghana Demographic Health Survey (2008) |
| Malaria intervention | |
| Health insurance: personal | Binary variable; either personal access to health insurance (1) or not (0) |
| Health insurance: community | Binary variable; either (1) for ≥ 80%<sup>b</sup> community coverage of sampled population or (0) otherwise |
| IRS in past 7 months<sup>a</sup> | Binary variable; either individual household having been treated with IRS in past 7 months (1) or not (0) |
| IRS in past year | Binary variable; either individual household having been treated with IRS in past year (1) or not (0) |
| Indoor residual spraying (IRS): community coverage | Binary variable; either (1) for ≥ 80%<sup>b</sup> community coverage or (0) otherwise |
| Insecticide treated nets (ITN): personal | Binary variable; either (1) if net was used in previous night or (0) otherwise |
| ITN: community coverage | Binary variable; either (1) for ≥ 80%<sup>b</sup> community coverage or (0) otherwise |
| Personal medication use | Binary variable; either (1) used in the past 2 weeks or (0) otherwise |

<sup>a</sup> Removed from models due to high correlations (R² ≥ 0.49) with one or more other variables
<sup>b</sup> Based on targets from Roll Back Malaria
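The correlation screen described in footnote a (R² ≥ 0.49, equivalent to |r| ≥ 0.7) can be sketched as a pairwise check. The use of Pearson correlation, the column names, and the toy values are assumptions for illustration; the source does not state which correlation measure was used or how the variable to drop from each pair was chosen, so the sketch only flags pairs.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def highly_correlated_pairs(columns, r2_threshold=0.49):
    """Return (name_a, name_b, r_squared) for every pair at or above the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r2 = pearson_r(col_a, col_b) ** 2
        if r2 >= r2_threshold:
            flagged.append((name_a, name_b, round(r2, 3)))
    return flagged

# Made-up indicator columns, not survey data
toy_columns = {
    "irs_past_7_months": [1, 1, 0, 0, 1, 0, 1, 0],
    "irs_past_year":     [1, 1, 0, 0, 1, 1, 1, 0],
    "thatch_roofing":    [1, 0, 1, 0, 1, 0, 1, 0],
}
flagged = highly_correlated_pairs(toy_columns)
print(flagged)
```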
Potential risk or protective covariates collected from remote sensing and GIS-based sources
| Variable | Source/satellite | Details |
|---|---|---|
| Distance to health facility | GIS-derived | Euclidean distance from active health facility at time of survey (based on survey location) |
| Distance to main roads | GIS-derived | Euclidean distance from major roads |
| Distance to urban centers | GIS-derived | Euclidean distance from center with population ≥ 5000 individuals |
| Distance to water bodies | GIS-derived | Euclidean distance from rivers and standing water bodies |
| Elevation | CGIAR SRTM | Meters above sea level |
| Land surface temperature: day<sup>a</sup> | NASA (Terra) MOD13A3 and (Aqua) MYD13A3 | Average monthly daytime temperature (in degrees Celsius) 30 days prior to a survey |
| Land surface temperature: night | NASA (Terra) MOD13A3 and (Aqua) MYD13A3 | Average monthly nighttime temperature (in degrees Celsius) 30 days prior to a survey |
| Normalized difference vegetation index<sup>a</sup> | NASA (Terra) MOD13A3 and (Aqua) MYD13A3 | The maximum monthly index 30 days prior to a survey |
| Population density | WorldPop | Population density per 100 m grid, log-transformed |
| Population density (≤ 5 y.o.)<sup>a</sup> | WorldPop | Density of population under 5 years of age per 100 m grid, log-transformed |
| Rainfall (historical)<sup>a</sup> | WorldClim | Cumulative precipitation from 3 months to 1 month prior to the survey date, averaged over the past 50 years |
| Rainfall (current)<sup>a</sup> | FEWSNET | Cumulative precipitation from 3 months to 1 month prior to the survey |
| Slope | GIS-derived (from elevation) | |

<sup>a</sup> Removed from models due to high correlations (R² ≥ 0.49) with one or more other variables
Fig. 2 Mean slope estimates (circles) and 95% credible intervals (horizontal grey bars) from probit regression parameters. Variables whose credible intervals do not include zero are considered significant (labelled in bold). Risk factors (positive slopes) and protective factors (negative slopes) are shown in red and blue, respectively
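The decision rule in this caption (a slope is significant when its 95% credible interval excludes zero, and its sign determines risk versus protective) can be sketched from posterior draws. The equal-tailed, nearest-rank interval and the evenly spaced toy draws below are assumptions for illustration; the caption does not say how the intervals were computed.

```python
import math

def summarize_slope(draws, level=0.95):
    """Posterior mean and an equal-tailed credible interval from a list of draws.

    Bounds are taken at nearest ranks, rounded outward, so the interval is
    slightly conservative for small samples.
    """
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[math.floor(tail * (n - 1))]
    upper = ordered[math.ceil((1.0 - tail) * (n - 1))]
    mean = sum(ordered) / n
    significant = lower > 0.0 or upper < 0.0
    label = None
    if significant:
        label = "risk" if mean > 0.0 else "protective"
    return {"mean": mean, "lower": lower, "upper": upper,
            "significant": significant, "label": label}

# Hypothetical posterior draws for three slopes (evenly spaced for reproducibility)
risk_draws = [0.10 + 0.001 * i for i in range(1001)]         # 0.10 to 1.10
protective_draws = [-1.10 + 0.001 * i for i in range(1001)]  # -1.10 to -0.10
null_draws = [-0.50 + 0.001 * i for i in range(1001)]        # -0.50 to 0.50

for name, draws in [("risk", risk_draws), ("protective", protective_draws), ("null", null_draws)]:
    print(name, summarize_slope(draws)["label"])
```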
Fig. 3 Modelled patterns in malaria risk factors based on Bayesian probit regression containing seasonal interaction terms. The left panel depicts mean slope estimate (lines) and 95% credible intervals (polygons) for the predicted malaria prevalence based on age in the rainy and dry seasons. The right panel depicts the mean (points) and 95% credible intervals (vertical bars) for the predicted malaria prevalence based on ethnic group in the rainy and dry seasons
Fig. 4 Implied patterns in malaria prevalence and distance to urban center (left) and distance to health facility (right) based on Bayesian probit regression model containing linear splines and seasonal interactions. Results for the rainy and dry seasons are shown in blue and yellow, respectively. The open circles depict where slopes are allowed to change (i.e., knot locations), selected at 20% quantiles of the observed data
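A linear spline with knots at the 20% quantiles, crossed with a season indicator, can be built as extra regression columns. The sketch below assumes a truncated-linear basis, `max(0, x - knot)`, and a 0/1 rainy-season indicator; the caption does not give the exact parameterization, and the distances used are made up.

```python
import statistics

def quantile_knots(values, n_bins=5):
    """Cut points at the 20%, 40%, 60% and 80% quantiles when n_bins is 5."""
    return statistics.quantiles(values, n=n_bins, method="inclusive")

def linear_spline_row(x, knots, rainy):
    """Design-matrix row: spline basis plus its interaction with season.

    The slope is allowed to change at each knot because every term
    max(0, x - knot) adds its own coefficient beyond that knot.
    """
    basis = [x] + [max(0.0, x - k) for k in knots]
    interaction = [b * rainy for b in basis]  # extra slopes that apply only in the rainy season
    return basis + interaction

# Hypothetical distances (km) to the nearest urban center
distances = [float(d) for d in range(11)]  # 0.0, 1.0, ..., 10.0
knots = quantile_knots(distances)
print(knots)
print(linear_spline_row(5.0, knots, rainy=1))
print(linear_spline_row(5.0, knots, rainy=0))
```

With four knots, each distance variable contributes five main-effect columns and five seasonal interaction columns, which is one way a model's covariate count can grow well beyond the number of raw variables.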
Predictive comparisons of models based on the sum of the log-likelihood
| Training | Testing | Sum of log-likelihood (Logistic) | Sum of log-likelihood (Lasso) | Sum of log-likelihood (BMA) |
|---|---|---|---|---|
| Base model (p = 29)<sup>a</sup> | | | | |
| Rainy 2010 | Rainy 2011 | −1072.27 | −1049.24 | −1028.24<sup>b</sup> |
| Rainy 2011 | Rainy 2012 | −1055.39 | −1037.49 | −1032.57<sup>b</sup> |
| Rainy 2010 | Rainy 2012 | −1153.62 | −1110.05 | −1057.07<sup>b</sup> |
| Dry 2011 | Dry 2012 | −969.88 | −919.45<sup>b</sup> | −921.54 |
| Dry 2012 | Dry 2013 | −915.95 | −897.03<sup>b</sup> | −903.60 |
| Dry 2011 | Dry 2013 | −967.83 | −920.28 | −915.81<sup>b</sup> |
| Average | | −1022.49 | −988.92 | −976.47 |
| Model with interactions and splines (p = 73)<sup>a</sup> | | | | |
| Rainy 2010 | Rainy 2011 | −1079.63 | −1042.02 | −1027.85<sup>b</sup> |
| Rainy 2011 | Rainy 2012 | −1066.56 | −1035.44 | −1030.75<sup>b</sup> |
| Rainy 2010 | Rainy 2012 | −1156.76 | −1092.52 | −1050.55<sup>b</sup> |
| Dry 2011 | Dry 2012 | −1065.27 | −1029.66 | −921.05<sup>b</sup> |
| Dry 2012 | Dry 2013 | −922.40 | −902.79 | −902.32<sup>b</sup> |
| Dry 2011 | Dry 2013 | −1079.34 | −1059.24 | −917.82<sup>b</sup> |
| Average | | −1061.66 | −1026.95 | −975.06 |

BMA: Bayesian model averaging
<sup>a</sup> p refers to the number of covariates in the model
<sup>b</sup> Indicates the model with the best fit
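The comparison metric in this table can be sketched as the sum of Bernoulli log-likelihoods of held-out outcomes under each model's predicted probabilities, where a larger (less negative) value indicates better predictions. The function below is a generic illustration, not the authors' code, and the clipping constant `eps` is an assumption added to avoid taking the log of zero. The three scores are copied from the first row of the base model (trained on rainy 2010, tested on rainy 2011).

```python
import math

def sum_log_likelihood(outcomes, probs, eps=1e-12):
    """Sum of log P(y_i | p_i) for binary outcomes y_i and predicted probabilities p_i."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# Toy check: three test observations, each predicted at 0.5
print(sum_log_likelihood([1, 0, 1], [0.5, 0.5, 0.5]))

# Scores from the table row "Rainy 2010 -> Rainy 2011", base model
scores = {"Logistic": -1072.27, "Lasso": -1049.24, "BMA": -1028.24}
best = max(scores, key=scores.get)  # least negative sum wins
print(best)
```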
Fig. 5 Regression parameter estimates using BMA (black), logistic regression (red), and Lasso regression (gray) models containing interaction terms and spline covariates (73 independent variables). Parameter estimates were ordered according to the logistic regression results to better illustrate the shrinkage of coefficients associated with the BMA and Lasso algorithms
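One generic way to see why model averaging shrinks coefficients: the averaged estimate is the sum over candidate models of the posterior model probability times that model's coefficient, and the coefficient counts as zero in every model that excludes the variable. The sketch below illustrates that identity only. The model probabilities and coefficients are invented, and the procedure used in the study to obtain posterior model probabilities is not described in this record.

```python
def bma_coefficient(model_probs, model_coefs):
    """Model-averaged coefficient: sum over models of P(model | data) * coefficient."""
    return sum(p * b for p, b in zip(model_probs, model_coefs))

def inclusion_probability(model_probs, model_coefs):
    """Total posterior probability of the models that include the variable."""
    return sum(p for p, b in zip(model_probs, model_coefs) if b != 0.0)

# Hypothetical three-model example; the variable is excluded from the first model
model_probs = [0.5, 0.3, 0.2]   # posterior model probabilities (sum to 1)
model_coefs = [0.0, 0.8, 1.0]   # coefficient in each model, 0.0 when excluded

averaged = bma_coefficient(model_probs, model_coefs)
included = inclusion_probability(model_probs, model_coefs)
print(round(averaged, 2), round(included, 2))
```

In this toy case the averaged coefficient is smaller than the estimate from either model that includes the variable, which is the kind of pull toward zero the figure describes.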