Jailos Lubinda1,2,3,4, Yaxin Bi5, Ubydul Haque6,7,8, Mukuma Lubinda9, Busiku Hamainza10, Adrian J Moore1. 1. School of Geography and Environmental Sciences, Ulster University, Coleraine, UK. 2. School of Nursing, Ulster University, Belfast, UK. 3. School of Computing, Engineering and Intelligent Systems, Ulster University, Derry, UK. 4. Telethon Kids Institute, Malaria Atlas Project, Nedlands, WA Australia. 5. School of Computing, Ulster University, Belfast, UK. 6. Department of Biostatistics and Epidemiology, University of North Texas Health Science Centre, Fort Worth, TX USA. 7. Department of Geography, University of Florida, Gainesville, FL USA. 8. Emerging Pathogens Institute, University of Florida, Gainesville, FL USA. 9. Malaria Institute at Macha/Macha Research Trust, Choma, Zambia. 10. Ministry of Health, National Malaria Elimination Centre, Lusaka, Zambia.
Abstract
Background: The spatial and temporal variability inherent in malaria transmission within countries implies that targeted interventions for malaria control in high-burden settings and subnational elimination are a practical necessity. Identifying the spatio-temporal incidence, risk, and trends at different administrative geographies within malaria-endemic countries and monitoring them in near real-time as change occurs is crucial for developing and introducing cost-effective, subnational control and elimination intervention strategies. Methods: This study developed intelligent data analytics incorporating Bayesian trend and spatio-temporal Integrated Laplace Approximation models to analyse high-burden over 32 million reported malaria cases from 1743 health facilities in Zambia between 2009 and 2015. Results: The results show that at least 5.4 million people live in catchment areas with increasing trends of malaria, covering over 47% of all health facilities, while 5.7 million people live in areas with a declining trend (95% CI), covering 27% of health facilities. A two-scale spatio-temporal trend comparison identified significant differences between health facilities and higher-level districts, and the pattern observed in the southeastern region of Zambia provides the first evidence of the impact of recently implemented localised interventions. Conclusions: The results support our recommendation for an adaptive scaling approach when implementing national malaria monitoring, control and elimination strategies and a particular need for stratified subnational approaches targeting high-burden regions with increasing disease trends. Strong clusters along borders with highly endemic countries in the north and south of Zambia underscore the need for coordinated cross-border malaria initiatives and strategies.
Background: The spatial and temporal variability inherent in malaria transmission within countries implies that targeted interventions for malaria control in high-burden settings and subnational elimination are a practical necessity. Identifying the spatio-temporal incidence, risk, and trends at different administrative geographies within malaria-endemic countries and monitoring them in near real-time as change occurs is crucial for developing and introducing cost-effective, subnational control and elimination intervention strategies. Methods: This study developed intelligent data analytics incorporating Bayesian trend and spatio-temporal Integrated Laplace Approximation models to analyse high-burden over 32 million reported malaria cases from 1743 health facilities in Zambia between 2009 and 2015. Results: The results show that at least 5.4 million people live in catchment areas with increasing trends of malaria, covering over 47% of all health facilities, while 5.7 million people live in areas with a declining trend (95% CI), covering 27% of health facilities. A two-scale spatio-temporal trend comparison identified significant differences between health facilities and higher-level districts, and the pattern observed in the southeastern region of Zambia provides the first evidence of the impact of recently implemented localised interventions. Conclusions: The results support our recommendation for an adaptive scaling approach when implementing national malaria monitoring, control and elimination strategies and a particular need for stratified subnational approaches targeting high-burden regions with increasing disease trends. Strong clusters along borders with highly endemic countries in the north and south of Zambia underscore the need for coordinated cross-border malaria initiatives and strategies.
Malaria remains one of the leading causes of death in children and pregnant women in sub-Saharan Africa[1]. With global progress in reducing incidence rates now levelling off, malaria is generally rising, especially in some high-burden African countries. As a result, most of Sub-Saharan Africa, including Zambia, is pursuing malaria policies and strategies to reduce case incidence. Reducing the population proportion living in high-risk areas and pursuing local pre-elimination and subsequent elimination status remains a priority goal for countries[2].As local and subnational malaria reduction depends on individual exposure to infectious mosquito bites, local mosquito density, and infectivity, tailored strategies are essential for various local contexts to best use the limited resources available. The need for strong health surveillance systems to inform appropriate intervention programmes, fine-scale monitoring of patterns, and adaptation of strategies as malaria transmission declines have become critical.This is especially true for most Sub-Saharan countries like Zambia, which are pursuing multiple approaches targeted at local malaria control in some settings and pre-elimination or elimination in others. Targeted interventions, particularly in elimination settings, aim to interrupt local transmission as it becomes increasingly concentrated in small areas that are often very hard and costly to reach[3]. Understanding the spatio-temporal scale dynamics of prevailing malaria epidemiology is imperative to facilitate and successfully target and control those remaining residual reservoirs and infection hotspots[4,5].For Zambia, focal targeting of low-transmission settings, alongside accelerated control of malaria in high transmission settings, could help quicken the reduction of malaria burden and hasten progress towards the malaria elimination goal. Besides, the current funding pressure dictates that available, often limited resources are more effectively directed at targeting those areas and populations where the most impact can be achieved[6-8]. With considerable spatial and temporal variability generally inherent in malaria transmission within endemic countries[4,9], targeted interventions across the transmission spectrum that operate at much finer resolutions than before are a practical necessity for understanding, monitoring, and ensuring effective control[8,10-14].Like many sub-Saharan African countries, Zambia’s local malaria policy decisions are primarily based on routinely collected data from its national Health Management Information Systems (HMIS), the District Health Information System (DHIS2). While such sources continue to experience data quality issues such as incompleteness and accuracy and often lack the equivalent scale of treatment-seeking information necessary for making adjustments, they are still the best available sources for strategic and operational planning. Over time, continued improvements in the quality of this data make it highly valuable in everyday decision-making as it requires less and less adjustments for analytical purposes.As Zambia is one of those countries concurrently pursuing intensified control strategies along with pre-elimination and elimination strategies, approaches targeted at the health facility level have gained momentum, particularly following the WHO’s recommendation to use a malaria continuum measure rather than a uniformly applied strategy[15,16]. This entails identifying and targeting the hardest-to-reach malaria hotspots in the all-important elimination phase or high-burden areas in the intensified control phase. Such an approach supports the design, monitoring, and adaption of appropriate intervention programmes as malaria transmission declines further.This study uses data at the lowest administrative (health facility) level to investigate the spatially structured temporal trends that characterise fine-scale malaria burden in Zambia. We compare spatio-temporal trends within the health facility and district-level models to identify key differences that could provide policymakers and disease surveillance experts with relevant operational-level evidence to help them develop and plan more cost-effective, targeted, multi-scale intervention strategies. Our findings will help improve the geospatial identification and stratification of areas with high or increasing burdens and improve targeting efficiency in allocating health service resources for malaria intervention planning.
Methods
Malaria datasets used in the study
Authorisation for the study in Zambia was obtained from the Zambia National Health Research Authority, and overall permission to use routine malaria data was granted by the Ministry of Health. Ethical approval to perform the study was granted by the Zambian ethical review board—ERES Converge IRB (Ref: 2017-Sept-011). The requirement for individual consent was also waived by the ethical review board, given that all the data were already aggregated to administrative units and contained no individual identifiers.We obtained aggregated monthly reported malaria case data for 2531 operational health facilities from the Zambian HMIS/DHIS2 from 2009 to 2015. Of the total, 612 (24.2%) are new health facilities that typically reported zero malaria cases throughout the study period and are therefore excluded. The absence of these health facilities was verified against a detailed government health facility census of 2012[17].Another 76 facilities, randomly spread across the country, are excluded from the analysis because of incomplete malaria data and/or lack of baseline population information as a denominator in calculating incidence rates and standardised risk ratios (here referred to as risk). Of those health facilities excluded, eight had complete malaria data from 2009 to 2015, ten only had data for 2014 and 2015, and the remaining 58 had data for 2015 only and data collection only started in the second half of the year. The reported malaria case data for 76 health facilities accounted for 0.8% of all recorded cases over the study period. Finally, 100 referral hospitals where severe malaria cases are admitted and treated are separated from the dataset. This separation helped avoid double counting of malaria cases already captured at the lower-level health facilities.For the same reason, all hospital-affiliated health centres (HAHCs) in this analysis are considered and included as separate health centres/units offering services similar to those of normal health facilities not affiliated with any hospital. All private hospitals are included because they also provide services offered by lower-level public health facilities. The remaining 1743 facilities comprised the complete dataset analysed (see Fig. 1and Supplementary methods). Our final working dataset includes 146,000 monthly health facility reports (from 1743 facilities), capturing over 32 million cases in seven years (January 2009 – December 2015). It is worth noting that the exclusion of the eight health facilities could be a potential source of unknown bias in the data by slightly underestimating any effects in the results presented.
Fig. 1
Summary schema of final health facility-level analysis.
A workflow of the process of cleaning the dataset and the exclusion/inclusion criteria applied to arrive at the final dataset analysed.
Summary schema of final health facility-level analysis.
A workflow of the process of cleaning the dataset and the exclusion/inclusion criteria applied to arrive at the final dataset analysed.The numerator consisted of positive malaria cases diagnosed and confirmed using RDTs, microscopy slides, and adjusted clinically diagnosed cases. Although data are reported in available age categories of <1, 1–4, and ≥5 years, the study population included all ages at risk of malaria, from infants to adults. The facility service base populations, used as denominators in calculating malaria rates, are derived in consultation with the programme and are based on recalculated census health facility service catchment data or reported facility headcounts. Priority is given to headcounts over census estimates.
Spatial and temporal models implemented
We implemented a Bayesian trend model and a spatio-temporal Integrated Laplace Approximation (INLA) model to analyse spatio-temporal trends over the 7-year study period[18,19]. We used the Integrated Laplace Approximation (R-INLA) package approximation strategy[18], as implemented in the R programme version 3.5[20]. Given the extensive number of observations in the dataset, we chose the simplified Laplace approximation, which is relatively less computationally intensive than the full Laplace but still retains similar accuracy[18].
Spatio-temporal model description of integrated nested Laplace approximations
The study area is divided into health facility catchment areas. We used the observed number of malaria cases (O) in a health facility area i at time point t, the total population at risk of malaria is N and the expected number of cases (E) for a health facility area i at time point t to estimate the malaria incidence ratio. A simplified equation of the standardised incidence ratio (SIR) calculation is shown by the equation:The SIR is calculated to estimate the risk of malaria at the health facility. In this study, risk and incidence denote the estimated polygon risk or rate, and the population denominator is the population at risk within that facility unit polygon or facility point (as opposed to the true health-centre catchment area).In our analysis, we implemented a re-parameterised Besag, York, and Mollié model[21] (BYM) model, which is called BYM2[22], with the spatial random effect parameter, presented asWhere u* denotes the scaled intrinsic Conditional Autoregressive (iCAR) model with a generalised variance = 1, and V are unstructured random effects. To avoid identifiability problems, a sum-to-zero constraint must be imposed. The expression of a weighted average covariance of matrices to the structured and unstructured spatial components comprises the variance of the overall random effect[23].The prior spatial distribution implemented through the BYM2 model fits the spatio-temporal model with a random temporal effect and is implemented with a temporal structured random walk of second-order (RW2). For a temporary structured random effect , we assume a prior distribution to be where T is the precision parameter, and R is the structure matrix of RW2 (T x T) taking after an iCAR prior to the spatially structured variability. We assumed a Gaussian exchangeable prior distribution to model unstructured heterogeneity for the space-time interaction random effect as , where, T is the precision parameter, while R represents the corresponding spatial and temporal structure matrix of nT x nT for the full interaction effects. For more details about the BYM2 model implemented, refer to refs. [22,24,25].The incorporated iCAR prior distribution is represented byWhere T denotes a precision parameter, and R represents the n × n spatial neighbourhood matrix known by a diagonal of elements equal to the sum of neighbours of each area. The non-diagonal elements (R) is equal to −1 whenever i and j are neighbouring areas; otherwise, (R) = 0, and N is the number of neighbours with the joint distribution. Two areas are defined as neighbours if they share a common border or edge. Here the neighbourhood structure is defined by the matrix of nonzero elements, given by extracting the structure directly from the shapefile.We also leveraged computation power from the shiny SSTCDapp[26] to run several comparative model fitting performance tests to determine and select the model with the best fit, using comparisons of deviance information criterion (DIC). We applied spatio-temporal models with prior distribution for the spatial random effect of the BYM2. The BYM2 model contains both an intrinsic conditional autoregressive (ICAR) component and an ordinary random effects component for spatial autocorrelation and non-spatial heterogeneity[27]. The model allows all parameters to have clear reading and a straightforward selection of hyperpriors[27]. The models are implemented with a random walk of order 2 (RW2) prior distribution for the random temporal effect and an unstructured temporal random effect. We added a space-time interaction of type (ii) random effect term to account for spatial and temporal autocorrelation.We computed a spatial neighbourhood matrix from an ESRI shapefile within which two health facilities are considered neighbours if they share a common border or edge. We used the queen criterion to compute the adjacency binary matrix W of spatial polygons, defined as w_ij = 1 where area “i” and “j” share a common border or edge, and 0 otherwise. We then fitted health facility Voronoi polygons to generate a unique geographic area with an associated year ID for the temporal variable with RW2. The RW2 assumes that variables take periodic random steps away from previous values, using independently and identically distributed (iid) size steps. While other studies may apply probability measures to density metrics to define populations within catchments of a public health facility[28], catchment areas in this study had populations assigned from a census estimate or headcount and are only used for mapping, visualisation of trends, and cluster analysis purposes. The Voronoi polygons represent arbitrary exclusive facility delineations that do not reflect the true catchment extent and do not contribute to the definition of catchment populations or incidence estimates.
Bayesian trends model in a Markov Chain Monte Carlo environment
Finally, we implemented a separate Bayesian Poisson trend mixed model with change point parameters to detect health facility-specific malaria trends over the study period. This is built on methods and models previously implemented for the district-level trend values to compare the health facility and district-level scales[29]. The models are implemented in a Markov Chain Monte Carlo (MCMC) environment, with a burn-in of 10,000, a sample of 110,000, and four parallel chains, with a thinning of the degree of 10. We used Gelman’s trace plots and visual diagnostics to determine the convergence of the models[30,31]. The model structure and equation of the temporal model are denoted by:Where malaria trends f(t|γS) estimated in the study are represented by (a) Constant trend—β1; (b) Linear increasing trend—β1 + γ1t, with γ1 > 0; and (c) Linear decreasing trend—β1 + γ2t, with γ2 < 0. A more detailed description of this model is given elsewhere in[32,33]. We also used ESRI’s ArcGIS 10.6 for the optimised hotspot analysis.
Table 1
Trend variation between district and health facility-level trend models.
Facility trend
District trend
# of facilities
%
Total change
Decline
Increase
10
0.6%
1.4%
Increase
Decline
14
0.8%
Increase
Increase
524
30.1%
67.5%
No change
No change
276
15.8%
Decline
Decline
377
21.6%
Decline
No change
58
3.3%
31.1%
No change
Decline
82
4.7%
Increase
No change
272
15.6%
No change
Increase
130
7.5%
The table shows that there is little cross-trend variation in malaria risk between the district and health facility levels, with at least 67% of health facilities exhibiting the same trend observed in the district in which it is located.
Authors: Mitzi Morris; Katherine Wheeler-Martin; Dan Simpson; Stephen J Mooney; Andrew Gelman; Charles DiMaggio Journal: Spat Spatiotemporal Epidemiol Date: 2019-08-12
Authors: Adam Bennett; Josh Yukich; John M Miller; Joseph Keating; Hawela Moonga; Busiku Hamainza; Mulakwa Kamuliwo; Ricardo Andrade-Pacheco; Penelope Vounatsou; Richard W Steketee; Thomas P Eisele Journal: Parasit Vectors Date: 2016-08-05 Impact factor: 3.876
Authors: Nick W Ruktanonchai; Patrick DeLeenheer; Andrew J Tatem; Victor A Alegana; T Trevor Caughlin; Elisabeth Zu Erbach-Schoenberg; Christopher Lourenço; Corrine W Ruktanonchai; David L Smith Journal: PLoS Comput Biol Date: 2016-04-04 Impact factor: 4.475
Authors: Teun Bousema; Gillian Stresman; Amrish Y Baidjoe; John Bradley; Philip Knight; William Stone; Victor Osoti; Euniah Makori; Chrispin Owaga; Wycliffe Odongo; Pauline China; Shehu Shagari; Ogobara K Doumbo; Robert W Sauerwein; Simon Kariuki; Chris Drakeley; Jennifer Stevenson; Jonathan Cox Journal: PLoS Med Date: 2016-04-12 Impact factor: 11.069
Authors: Kelly M Searle; Harry Hamapumbu; Jailos Lubinda; Timothy M Shields; Jessie Pinchoff; Tamaki Kobayashi; Jennifer C Stevenson; Daniel J Bridges; David A Larsen; Philip E Thuma; William J Moss Journal: Malar J Date: 2016-08-15 Impact factor: 2.979
Authors: Regina N Rabinovich; Chris Drakeley; Abdoulaye A Djimde; B Fenton Hall; Simon I Hay; Janet Hemingway; David C Kaslow; Abdisalan Noor; Fredros Okumu; Richard Steketee; Marcel Tanner; Timothy N C Wells; Maxine A Whittaker; Elizabeth A Winzeler; Dyann F Wirth; Kate Whitfield; Pedro L Alonso Journal: PLoS Med Date: 2017-11-30 Impact factor: 11.069