Literature DB >> 30223860

Environmental suitability for lymphatic filariasis in Nigeria.

Obiora A Eneanya1, Jorge Cano2, Ilaria Dorigatti3, Ifeoma Anagbogu4, Chukwu Okoronkwo4, Tini Garske3, Christl A Donnelly3,5.   

Abstract

BACKGROUND: Lymphatic filariasis (LF) is a mosquito-borne parasitic disease and a major cause of disability worldwide. It is one of the neglected tropical diseases identified by the World Health Organization for elimination as a public health problem by 2020. Maps displaying disease distribution are helpful tools to identify high-risk areas and target scarce control resources.
METHODS: We used pre-intervention site-level occurrence data from 1192 survey sites collected during extensive mapping surveys by the Nigeria Ministry of Health. Using an ensemble of machine learning modelling algorithms (generalised boosted models and random forest), we mapped the ecological niche of LF at a spatial resolution of 1 km2. By overlaying gridded estimates of population density, we estimated the human population living in LF risk areas on a 100 × 100 m scale.
RESULTS: Our maps demonstrate that there is a heterogeneous distribution of LF risk areas across Nigeria, with large portions of northern Nigeria having more environmentally suitable conditions for the occurrence of LF. Here we estimated that approximately 110 million individuals live in areas at risk of LF transmission.
CONCLUSIONS: Machine learning and ensemble modelling are powerful tools to map disease risk and are known to yield more accurate predictive models with less uncertainty than single models. The resulting map provides a geographical framework to target control efforts and assess its potential impacts.

Entities:  

Keywords:  Ensemble modelling; Generalised boosted model (GBM); Lymphatic filariasis; Machine learning; Random forest (RF)

Mesh:

Year:  2018        PMID: 30223860      PMCID: PMC6142334          DOI: 10.1186/s13071-018-3097-9

Source DB:  PubMed          Journal:  Parasit Vectors        ISSN: 1756-3305            Impact factor:   3.876


Background

Lymphatic filariasis (LF) is a mosquito-borne disease endemic in tropical regions and caused by the parasitic nematode Wuchereria bancrofti in Africa, and by Brugia malayi and B. timori in Southeast Asia [1]. These parasites are transmitted by various species of mosquitoes, with Anopheles spp. being major vectors in Africa [1, 2]. Other mosquito species of the genera Culex and Mansonia also contribute to transmission to some extent, particularly in urban and peri-urban settings [3, 4]. The majority of infected individuals are asymptomatic, but infections can lead to lymphedema, hydrocele and swellings of the breasts in women [5]. An independent International Task Force for Disease Eradication included LF as one of the nine diseases targeted for elimination [6], and in 1997 the World Health Assembly adopted Resolution WHA50.29 embarking on a global campaign to eliminate LF. Elimination of LF as a public health problem is deemed feasible for a number of reasons: (i) mosquitoes are very inefficient transmitters of filarial parasites [7]; (ii) the small number of animal reservoirs are restricted to particular foci for B. malayi, and there are no animal reservoirs for W. bancrofti [8]; and (iii) the availability of improved diagnostic tools and the existence of practical interventions for interruption of transmission [9-11]. An understanding of the geographical distribution of LF is required to underpin national elimination programmes. This enables more effective targeting of control efforts on highly endemic areas. Early maps of disease distribution have mostly relied on field surveys at national or sub-national levels [12-15], often with entire administrative units classified based on disease prevalence with no account for within-unit heterogeneity [16]. This might be useful for roughly estimating disease burden [17, 18]; however, such estimates fail to accurately represent disease burden and highly endemic foci may be misclassified [18]. To account for within-region heterogeneity, maps have been created by applying geostatistical modelling on point prevalence data, in combination with potential disease drivers (i.e. climatic, environmental and demographic factors), due to their impacts on mosquito populations and parasite biology [16, 19–24]. In 2003, the Nigerian Lymphatic Filariasis Elimination Programme (NLFEP) commenced LF mapping on a national scale, and to date, 761 out of 774 Local Government Areas (LGAs) have been mapped using immunochromatographic card tests (ICT) [15]. Of these, 574 LGAs are classed as endemic and targeted for mass drug administration (MDA), and 187 LGAs non-endemic for LF [15]. In total, an estimated 128 million people in Nigeria are thought to require preventive chemotherapy, and as of 2016, 54% of this population had been treated [25]. After more than five rounds of MDA in Plateau and Nassarawa states, Transmission Assessment Survey 1 (TAS-1) showed evidence of interruption of LF transmission in these areas [26]. However, for the vast areas of the country in which LF is present, understanding disease distribution on a finer scale is key for more focussed targeting of control measures. In this work we aim to (i) describe and map the ecological niche of LF in Nigeria and (ii) estimate the human population living in areas that are environmentally suitable for disease transmission. Here we fitted seven different model classes to the same selection of training and evaluation data points, and projected the final ecological niche map using an ensemble of the two best performing models.

Methods

LF occurrence data

Data used for ecological niche modelling (ENM) were pre-intervention site-level data collected during mapping surveys conducted by the Nigeria Ministry of Health from 2000–2013. The sampling geographical level for LF mapping is the implementation unit (IU), which corresponds to a Local Government Area (LGA), the second level administrative unit in Nigeria. In total, we had 1192 data points covering all 36 states, and the Federal Capital Territory in Nigeria. A uniform survey methodology was applied to all survey locations and study participants were tested for the presence of filarial antigenemia using a point-care rapid test (immunochromatographic card test), and following the mapping protocol of the World Health Organization (WHO) Operational Guidelines for Mapping of Bancroftian Filariasis in Africa [27]. Briefly, within each IU (LGA), at least one sample village is randomly selected for survey. Selected villages must be located at least 50 km apart from each other. In each selected village, 50–100 adults (seeking an equal number of males and females when possible, and > 15 years of age) are tested. If 20% or more of the first 50 individuals tested result positive, testing is stopped, and the entire IU is recorded as positive for LF. Otherwise, testing is continued until 100 adults have been examined. In the end, IUs with at least one sampled village yielding a prevalence equal to or more than 1% are considered to be endemic for LF and subsequently targeted with MDA. For this analysis, occurrence (coded 1) was considered when at least one LF case was recorded during mapping surveys and absence (coded 0) when none of the individuals tested resulted positive to the LF rapid test [28]. In total we had 932 ‘presence’ and 260 ‘absence’ records for this modelling exercise. Figure 1 shows the distribution of survey points and presence-absence records used in this work.
Fig. 1

Location of study sites in Nigeria. Red points show sites with at least one LF case and blue points show sites with no LF case

Location of study sites in Nigeria. Red points show sites with at least one LF case and blue points show sites with no LF case

Climatic and environmental data

A suite of environmental variables was considered to describe the ecological niche of LF. Continuous gridded maps of climate, topography, vegetation and land use for Africa were obtained from different sources (Table 1). Climate variables related to precipitation and temperature were downloaded from the WorldClim database [29], which provides a set of global climate layers obtained by interpolation of the data for the period of 1950–2000 collected in weather stations distributed across the world. From the Consortium of Spatial Information (CGIAR-CSI) we obtained raster datasets of potential evapo-transpiration (PET), elevation and aridity index at 1 km2 resolution [30]. PET is a measure of the ability of the atmosphere to remove water through evapo-transpiration. Our elevation layer resulted from processing and resampling the gridded digital elevation models (DEM) derived from the 30-arcsecond DEM produced by the Shuttle Radar Topography Mission (SRTM) [31]. The elevation layer was used to generate two topography-related datasets: slope of terrain and flow accumulation.
Table 1

Environmental variables used in the ENM and their sources

VariablesSource
Annual cumulative precipitationWorldClim [29]
Maximum temperature
Mean temperature
Minimum temperature
Mean temperature of the coldest quarter
Mean temperature of the warmest quarter
Precipitation of the driest quarter
Precipitation of the wettest quarter
Potential evapo-transpiration (PET)CGIAR-CSI [30]
Aridity index
ElevationSRTM [31]
SlopeDerived from elevation
Flow accumulationDerived from slope
Distance to permanent riversDigital Global Chart [34]
Distance to nearest water bodiesGlobal Database of Lakes, Reservoirs and Wetlands [33]
Land surface temperature (LST)AfSIS [35]
Enhanced vegetation index (EVI)
Sand, silt, clay fractionsISRIC [37]
Soil pH
Major land cover (forest, agriculture, shrubland-grassland)Arino et al [36]
Wetness indexDerived from slope and flow accumulation
Distance to stable lights 2006Elvidge et al. [38]
Environmental variables used in the ENM and their sources To produce the flow accumulation layer, we initially created a flow direction layer in which the direction of flow was determined by the direction of the steepest descent from each cell in the elevation dataset. This was calculated as follows: change in elevation value/distance × 100. Flow accumulation was then calculated as the accumulated weight of all cells flowing into each downslope cell in the flow direction layer. In addition, we calculated topographic wetness index (TWI) by applying the following algorithm where a is the upslope contributing area per unit contour length [or specific catchment area (SCA)], which can be approached by using the flow accumulation, and β is the local slope gradient for reflecting the local drainage potential [32]. We also produced continuous surfaces of straight-line distance (Euclidean distance) in kilometres to the nearest water body and permanent rivers. These data were derived from the Global Database of Lakes, Reservoirs and Wetlands [33] and Digital Global Chart [34], respectively. Raster datasets of averaged enhanced vegetation index (EVI) and land surface temperature (LST) for the period 2000–2015 were obtained from the African Soil Information System (AfSIS) project [35]. Here, EVI and LST were generated from remotely sensed data collected from the Moderate Resolution Imaging Spectoradiometer (MODIS) platform. MODIS collects earth data from the same place every 16 days at 250 m spatial resolution. Land cover types (according to the United Nations land cover classification system) were extracted from the GlobCover project at the European Space Agency [36]. Here, maps are derived by an automatic and regionally-tuned classification of a 300 m medium resolution imaging spectrometer (MERIS) sensor on board the ENVISAT satellite mission. Soil data (sand, silt and clay fractions, and soil pH) were downloaded from the International Soil Reference and Information Centre (ISRIC) project [37] at a spatial resolution of 250 m. Finally, night-light (NL) emissivity for 2006 captured by the Operational Linescan System instrument on board a satellite of the Defence Meteorological Satellite Programme was used as a proxy measure of poverty across Nigeria [38]. This instrument measures visible and infrared radiation emitted at night-time, resulting in remote imagery of lights on the ground. This information has been correlated with gross domestic product in developed countries [39, 40] and, although far from being precise, could provide an indirect measure of poverty in developing countries [41]. NL emissivity is provided as gridded maps of 1 km2 resolution, and values that go from 0 (undetectable NL emissivity) to 60 (maximum NL emissivity). Alternatively, we estimated the Euclidean distance to stable night-lights, considered as NL values > 0. Input grids were resampled to a common spatial resolution of 1 km2 using the nearest-neighbour approach [42], clipped to match the geographical extent of a map of mainland Nigeria, and aligned to it. Raster manipulation and processing were undertaken using raster package in R (v.3.3.2) [43]. All environmental covariates considered in our models are known to be biologically plausible for LF occurrence [16, 23].

Model implementation

Selection of covariates

Covariate data were extracted corresponding to each of the presence-absence data points. In this work we explored the existence of multicollinearity using the variance inflation factor (VIF). Multicollinearity often arises in statistical models, and occurs when two or more covariates are not statistically independent leading to unstable estimates of variances of regression coefficients [44]. The VIF represents the amount of variability of a covariate which is explained by other covariates. For instance, the VIF for the ith covariate can be calculated as: VIF = 1/(1 – R2), where R2 is the coefficient of determination obtained by fitting a linear regression model for the ith independent covariate. The VIF of the suite of environmental covariates tested here was calculated and correlated variables were excluded in a stepwise procedure at a generally accepted threshold value of 10 [44]. Of the 24 covariates initially tested for multicollinearity, seven (average precipitation, mean temperature, average maximum temperature, average minimum temperature, aridity index, PET and sand soil type) were excluded from further analysis. All remaining covariates were considered to be independent and were included in the analysis. The multicollinearity test was implemented using the usdm package in R (v.3.3.2) [43]. The relative importance of the covariates to our presence-absence dataset was identified using the boosted regression trees (BRT) machine-learning algorithm. BRT is a combination of two algorithms, regression trees and boosting. This produces an additive regression model in which simple trees are fitted in a forward, stepwise fashion. This method has been widely used in disease prediction and considered a powerful tool for ecological studies [24, 45, 46]. Relative importance is defined as the frequency of selection of covariates for splitting, weighted by the squared improvements to the model, and averaged over all trees [45]. Higher relative importance scores, which are computed as percentages (and scaled to a maximum sum of 100%), indicate greater contribution to the model. Variables that showed no substantial contribution to the model (we set this at a threshold of 10%) [47] were excluded in fitting the final ensemble of models. Variables dropped at this stage were soil pH, night-light emissivity, EVI, distance to the nearest water body, distance to rivers, flow accumulation, mean temperature of the coldest quarter, mean temperature of the warmest quarter, and clay and silt soil fraction. The remaining predictors: precipitation in the driest quarter, precipitation in the wettest quarter, wetness index, land surface temperature, elevation, distance to stable lights and terrain slope, were included in the final analysis for ensemble modelling.

Building the ensemble model

We fitted our data using seven model algorithms, namely generalised linear models (GLM), surface range envelopes (SRE), multivariate additive regression splines (MARS), artificial neural networks (ANN), BRT, also known as GBM (generalised boosted regression modelling), random forest (RF) and maximum entropy ecological niche models (MaxEnt). These algorithms are included within Biodiversity Modelling (BIOMOD) [48], a computational framework intended for modelling species distribution. BIOMOD was used to build the ecological niche model, and implemented with the package biomod2 in R (v.3.3.2) [43]. The MaxEnt algorithm is not in-built in BIOMOD; however, there is provision to include this as an add-on. The software was developed by S. Phillips and colleagues and is freely available from https://biodiversityinformatics.amnh.org/open_source/maxent/ (v.3.4.1). Ideally, model accuracy should be evaluated with data that are independent of the training data. As we did not have an independent dataset, the original data was partitioned into two, with a random sample of 30% of the original data retained for testing/evaluation and was considered ‘quasi-independent’, while the remaining 70% was used to train/calibrate the model. To evaluate the validity of model performance and accuracy on the quasi-independent data, BIOMOD offers an alternative ability to perform internal cross-validation whereby a set number of data splitting runs are computed. In each model run, the model is fitted to one part of the dataset and tested on the other part. This internal cross-validation does not provide a measure of predictive performance per se, but provides a measure of internal consistency of models [49]. We performed an iteration of 100 model runs for each of the seven algorithms, and the evaluation values of each run was stored and then averaged, to make the final result more robust. Model evaluation was performed based on the area under the receiver operating characteristic (ROC) curve and the Hanssen-Kuipers discriminant (also known as true skill statistic, TSS). TSS compares the number of correct predictions, minus predictions attributable to random guessing [50], taking into account both sensitivity and specificity. Its value ranges from -1 to +1, where +1 indicates perfect score, 0 indicates random performance and values of 0.5 or higher are generally considered to indicate acceptable model performance [49, 50]. The TSS value is not affected by the size of the validation dataset. Evaluation values for the cross-validation runs were then compared to the values from the runs from the quasi-independent data, checking for consistency in predictive accuracy scores. The two best-performing model algorithms, based on ROC and TSS scores, were then selected for ensemble projection. The evaluation summaries of ensemble predictions are presented as mean ROC and TSS, median ROC and TSS, and lower and upper confidence bounds of ROC and TSS. Sensitivity and specificity were calculated and a threshold value that maximizes their sum (optimal threshold value for each condition) was considered to generate binary maps that display areas where LF transmission is more likely to occur based on environmental suitability. A gridded map of estimated population density for Nigeria was obtained from the WorldPop Africa dataset [51]. Population density data available for Nigeria from this resource were for the years 2006, 2010, 2015 and 2020. As our data spanned from 2000–2013, we estimated the population based on population density estimates for the year 2010. We calculated population for each state by summing estimated numbers of people per pixel falling within predicted LF suitable areas and aggregating this to represent population by state. This analysis was performed using the Zonal Statistics function available within the Spatial Analyst Tool in ArcGIS 10.3 [52].

Results

Analysis was performed using 1192 survey sites reporting presence-absence of LF covering all 36 states and the Federal Capital Territory in Nigeria. Survey participants were tested for the presence of filarial antigenemia using ICT. In total, 142,881 individuals were surveyed and 11,479 tested positive for LF infection. Figure 2 shows the performance of the seven model algorithms implemented in the BIOMOD package. Here, the RF and GBM models outperform the others with area under the ROC and TSS values of > 0.95 and > 0.75, respectively. The two hundred models generated by these two algorithms where therefore chosen for constructing the final ensemble model.
Fig. 2

Model performance comparison by area under the receiver operating characteristic curve (ROC) and true skill statistic (TSS) values of all model classes. The points represent the mean estimates and the solid lines represent the 95% confidence intervals. Abbreviations: ANN, artificial neural networks; GBM, generalised boosted models; GLM, generalised linear models; MARS, multivariate additive regression splines; MAXENT, maximum entropy ecological niche models; RF, random forest; SRE, surface range envelope

Model performance comparison by area under the receiver operating characteristic curve (ROC) and true skill statistic (TSS) values of all model classes. The points represent the mean estimates and the solid lines represent the 95% confidence intervals. Abbreviations: ANN, artificial neural networks; GBM, generalised boosted models; GLM, generalised linear models; MARS, multivariate additive regression splines; MAXENT, maximum entropy ecological niche models; RF, random forest; SRE, surface range envelope Figures 3 and 4 show the relative contribution (RC, as percentage) and marginal effect plots of each covariate on the predicted suitability of occurrence for LF to the final GBM and RF models, respectively. For both model algorithms, precipitation of the driest quarter, precipitation of the wettest quarter, and elevation, were the major contributors to the ensemble of models. In total these three covariates contributed 60.91 and 59.38% to the fitted ensemble of GBM and RF models, respectively. The probability of LF occurrence appeared to steadily decrease with increasing precipitation of the driest quarter. High suitability for LF was positively associated with elevation up to 500 metres above sea level (masl), and then appeared to flatten up to 1500 masl. Land surface temperature and distance to stable lights also showed a negative correlation with LF occurrence.
Fig. 3

Marginal effects curves for covariates included in 100 ensembles of generalised boosted models. Blue lines represent the mean marginal effect and grey shading indicates the 95% bootstrap confidence intervals. The figures in parentheses indicate the relative contribution (RC) of each covariate, which add up to 100%. The Y-axis is the response (probability of LF occurrence) and the X-axis is the full range of covariate values

Fig. 4

Marginal effects curves for covariates included in 100 ensembles of random forest models. Blue lines represent the mean marginal effect and grey shading indicates the 95% bootstrap confidence intervals. The figures in parentheses indicate the relative contribution (RC) of each covariate, which add up to 100%. The Y-axis is the response (probability of LF occurrence) and the X-axis is the full range of covariate values

Marginal effects curves for covariates included in 100 ensembles of generalised boosted models. Blue lines represent the mean marginal effect and grey shading indicates the 95% bootstrap confidence intervals. The figures in parentheses indicate the relative contribution (RC) of each covariate, which add up to 100%. The Y-axis is the response (probability of LF occurrence) and the X-axis is the full range of covariate values Marginal effects curves for covariates included in 100 ensembles of random forest models. Blue lines represent the mean marginal effect and grey shading indicates the 95% bootstrap confidence intervals. The figures in parentheses indicate the relative contribution (RC) of each covariate, which add up to 100%. The Y-axis is the response (probability of LF occurrence) and the X-axis is the full range of covariate values A continuous risk map of environmental suitability of LF was projected on a geographical space based on the pre-selected environmental predictors (those shown in Figs. 3 and 4). The mean area under the ROC values on the evaluation dataset (30% of the full dataset) for the final ensemble model was 0.991 and the median was 0.993 (95% CI: 0.879–0.995). These areas under the ROC values measure the performance of the final ensemble model in fitting the presence-absence data and predicting across unsampled locations. The map presented in Fig. 5 suggests that large proportions of northern Nigeria are environmentally suitable and better able to drive LF transmission, although low suitability was predicted in the north-central state of Kogi. Low LF suitability was however predicted in most southern states in Nigeria.
Fig. 5

Median predicted environmental suitability of LF with the lower and upper bounds of the occurrence limits

Median predicted environmental suitability of LF with the lower and upper bounds of the occurrence limits A suitability threshold of 0.711 with a sensitivity of 95% and a specificity of 96.2% provided the best discrimination between presence and absence values according to the evaluation dataset, and therefore was used to reclassify the continuous predictive maps into binary maps, delineating land areas into either suitable or unsuitable for LF transmission (Fig. 6).
Fig. 6

Median predicted binary map of environmental suitability for LF with the lower and upper bounds of the occurrence limits

Median predicted binary map of environmental suitability for LF with the lower and upper bounds of the occurrence limits

Estimating population at risk

The total national population living in areas that are environmentally suitable for LF was estimated to be 110 (95% CI: 106–124) million, which corresponds to about 67% of Nigeria’s population in 2010. The largest portion of the population living in areas environmentally suited to LF transmission were found in Kano, Kaduna, and Katsina states with predicted populations of 9.6, 5.9 and 5.6 million, respectively (Table 2). All other states had population in at-risk areas of less than 5 million.
Table 2

Estimated human population living in areas environmentally suited to LF transmission by state in Nigeria in 2010

Zones in NigeriaPopulation in areas environmentally suited to LF transmissionTotal population
North-central StatesBenue2,997,2094,853,000
Kogi1,299,0573,838,000
Kwara841,7302,852,000
Nassarawa2,007,3172,151,000
Niger4,342,2524,538,000
Plateau3,568,6193,659,000
FCT1,438,1271,537,000
Subtotal16,494,31123,428,000
North-east StatesAdamawa3,087,5993,272,000
Bauchi4,893,7875,257,000
Borno4,115,2944,752,000
Gombe2,755,1062,773,000
Taraba2,061,2912,657,000
Yobe2,228,1362,652,000
Subtotal19,141,21321,363,000
North-west StatesJigawa4,701,5725,054,000
Kaduna5,903,9606,927,000
Kano9,625,82510,765,000
Katsina5,640,9116,550,000
Kebbi3,700,4853,758,000
Sokoto3,973,5034,137,000
Zamfara3,462,6533,689,000
Subtotal37,008,90940,880,000
South-east StatesAbia2,034,2463,269,000
Anambra4,314,0814,819,000
Ebonyi2,251,4892,345,000
Enugu2,778,4133,717,000
Imo4,190,7544,402,000
Subtotal15,568,98318,552,000
South-south StatesAkwa Ibom1,073,5924,461,000
Cross River2,351,7963,472,000
Bayelsa1,206,5772,087,000
Rivers1,630,5315,759,000
Delta2,025,9284,747,000
Edo2,910,6973,804,000
Subtotal11,199,12124,330,000
South-west StatesEkiti2,357,0672,516,000
Lagos538,36414,480,000
Ogun1,281,1003,953,000
Ondo2,608,8523,679,000
Osun3,020,8904,105,000
Oyo1,496,9526,532,000
Subtotal11,303,12935,265,000
Sum total110,715,856163,818,000
Estimated human population living in areas environmentally suited to LF transmission by state in Nigeria in 2010

Discussion

In this study, we have produced maps at a resolution of 1 km2 to inform ongoing interventions by delineating areas of highest transmission risk which are prone to resurgence, thus to help in efficiently targeting control measures at the lowest administrative level. Our occurrence map (Fig. 6) indicates that suitability to LF transmission is widely distributed in Nigeria. However, parts of the north-east state of Borno, southern states of Cross River, Rivers, Akwa Ibom, Delta and Edo, and south-west states of Lagos, Oyo, Ogun and Ondo, are not environmentally suited to LF transmission. According to our estimates, about 67% of the Nigerian population live in areas environmentally suitable for LF transmission. The benefits of machine learning algorithms compared to logistic regression models for niche and species distribution modelling have been thoroughly reviewed [45, 53–58]. Machine learning algorithms allow to account for complex non-linear associations between the response (e.g. disease occurrence) and explanatory variables, and control for interactions among explanatory variables [19]. Furthermore, combining predictions from more than one modelling algorithm to form an ensemble has been found to produce more precise estimates when used for ecological niche and species distribution models [59]. In our work, we have mapped the ecological niche of LF in Nigeria using an ensemble of two algorithms (GBM and RF), which are widely considered to produce accurate estimates of disease distribution [46, 58]. GBM models combine fitting regression trees with boosting, by recursively partitioning data into smaller binary splits, and splits repeatedly applied to their own until the best split is chosen [46]. ‘Boosting’ allows fine tuning of the overall model and is performed in a forward stepwise manner minimising residual variation in the response [46]. The combination of regression trees and boosting have been demonstrated to avoid over-fitting [45]. For RF models, a large number of trees are grown with the root node of each new tree containing a different random bootstrap prediction of the original data [58, 60]. For final predictions, the average values of all bootstrapped predictions are taken. It has been demonstrated that RF models are efficient in tuning and improving the accuracy of models [58]. Our work provides an insight into the regional distribution of LF in Nigeria, and we find that the areas less suitable for LF transmission correspond to mangrove ecosystems and freshwater swamps in the southern parts of the country, and also to short grass savanna in the north-east [61]. We have identified environmental factors associated with the occurrence of LF in Nigeria, with precipitation during the driest quarter contributing the most in driving the probability of LF occurrence. This finding shows that availability of temporal breeding sites during the driest period is critical for the major LF vectors, Anopheles spp. mosquitoes, to sustain the transmission [1]. However, marginal effect plots also showed that the probability of LF occurrence would decline when precipitation exceeds 800 mm during the wettest quarter of the year, which may suggest that although rainfall is required for vector survival and breeding, excessive rainfall may cause flooding and destruction of breeding sites [16, 19]. Similarly, the probability of LF occurrence started to decline at high land surface temperatures. This is consistent with experimental findings on adult mosquito survival and larval development [62-64], which suggest that both adults and larvae are unable to thrive at high temperatures. The probability of LF occurrence appeared to increase with increasing elevation, and levels off at around 500 metres above sea level. This phenomenon has been previously recorded [16, 19] and is thought to reflect the negative effect of decreasing temperature with increasing altitude on mosquito survival and the rate of parasite development within the vector [65]. We found a negative correlation between higher terrain slope and suitability of environment to LF, perhaps because steeper inclinations of terrain cause more rapid surface water runoff, thus reducing the collection of water pockets which may serve as breeding sites for mosquito vectors. Finally, it seems that environmental suitability for LF steadily declined with increasing distance from stable lights. Stable light is considered for any pixels at a value of > 0; however, values < 0 do not mean total pixel darkness but rather that the intensity of light emitted does not reach the threshold to be captured by the sensor. Here we took stable lights as proxy for rural-urban divide and economic activity, as urban areas are more likely to emit night-light (and thus stable lights) than rural areas. As the distance from stable night-light increased, the probability for LF occurrence decreased. This drop may be explained by the absence of stable lights in uninhabited areas where the mosquito population is more likely to be of low abundance, or in more rural settings where stable lights are less likely to be present, as electricity is in short supply in large parts of rural Nigeria. Although LF has always been associated with more rural areas [66-68], a recent study in Tanzania has highlighted the burden of LF in urban settings [69], and corroborated in a study conducted in an urban Nigerian setting [68]. Studies have also illustrated that mosquitoes are more likely to aggregate around human populations [70, 71]. Smith et al. [70] reported that the distribution of the human population influenced the aggregation of adult mosquitoes because mosquitoes are more likely to gravitate towards the human host. These authors demonstrated that mosquito density was lowest in rural settings but higher in peri-urban and urban settings. In Nigeria, Anopheles spp. are the principal vectors for LF [1, 72, 73], and our maps of environmental suitability correspond well with known historical distribution patterns of these mosquitoes in Nigeria [1]. This may be due to the relatively stable nature of the peri-domestic environmental factors driving the abundance and distribution of Anopheles mosquitoes in the past 20 years [16]. Previous studies have demonstrated that environmental factors may affect mosquito species differently. For instance, precipitation, which was a major driving factor in our model, is thought to have a greater effect on Anopheles spp. than it does on Culex spp. [16], perhaps due to their different breeding habitats. Culex mosquitoes are known to breed in areas with poor sanitary and housing conditions [3], thus human factors may play a more important role than precipitation. It will therefore be interesting to test our model in geographical areas where Culex spp. are the predominant vectors for LF. Furthermore, although LF transmission has been interrupted in the north-central states of Plateau and Nasarawa [26], the mosquito vectors remain; thus there is a risk of recrudescence due to within-country human migration [51] and a possible re-introduction of the LF parasite. Estimates of the human population living in areas environmentally suited to LF transmission have steadily increased over the years and varied significantly in previous studies [74-76]. This may be due to improved diagnosis and surveillance as well as population growth. In 1992, an estimated 113 million people lived in LF at-risk areas in Africa [74] and by 2009, this figure was estimated to be 212 million [76]. Nigeria was reported to have the third highest national LF burden with an estimated 114 million individuals living in at-risk areas in 2016 [15]. These estimates are usually calculated by summing up the populations of each of district where infection is detected, and may thus overestimate the actual LF burden since this approach does not account for within-district spatial variations and is highly dependent on the existence of field survey data in endemic areas. In addition, field surveys are more likely to be carried out where LF infection is suspected or in areas where universities are located [77] and thus surveillance may in some locations fail to capture the true infectious status of vast areas of the country. Since 2000, Nigeria has seen an increase in violence and militancy in the southern Niger Delta regions [78], and terrorist activities in the north-eastern parts of the country [78, 79]. These conditions make it difficult to carry out field surveys and data from these areas are patchy. In contrast, estimates of the human population at risk derived by geostatistical and machine learning-based models estimate the complete distribution of infection and thus may produce more accurate estimates of the true extent of infection and the human populations at risk [19]. The additional prospect of our model producing a threshold value that maximises the sum of specificity and sensitivity improves the accuracy of our binary map (Fig. 6) [48, 80]. The human population living in areas classified as environmentally suitable for LF transmission in Nigeria using a machine learning approach was previously estimated to be 143.97 million [19]. This figure was derived from modelling only 27 presence-only data points from surveys conducted between 1977 and 1990 and of varying diagnostic methods and study designs, whereas our model was implemented using 1192 presence-absence data points derived from standardised field surveys. Furthermore, niche models derived by using presence-only data points are often spatially biased as surveys are usually conducted in areas that are more easily accessible, and are usually positive (presence) counts [81]. This bias is usually remedied by selecting background or pseudo-absence data points [82], that is assumed absence data drawn at random for the region of interest, to balance the presence-absence data point ratio. Since presence points are usually concentrated in particular geographical regions due to convenience of sampling, by randomly selecting pseudo-absence data points analysis may be biased as true presence points might be treated as ‘absence’, and thus leading to inaccurate model predictions [81]. The geographical spread and the amount of presence-absence data used in the present study, together with the standardised methods for field surveys and data collection, implies that our models provide accurate population estimates of the number of individuals living in areas environmentally suited to LF transmission.

Conclusions

The data used in this analysis represent a unique resource and provide the most comprehensive database for LF distribution in Nigeria. As the national LF control programme moves towards elimination, the methods and results presented in this study will inform surveillance activities and help optimise resource allocation for disease control.
  43 in total

1.  Editorial: Lymphatic filariasis endemicity--an indicator of poverty?

Authors:  D N Durrheim; S Wynd; B Liese; J O Gyapong
Journal:  Trop Med Int Health       Date:  2004-08       Impact factor: 2.622

Review 2.  Ensemble forecasting of species distributions.

Authors:  Miguel B Araújo; Mark New
Journal:  Trends Ecol Evol       Date:  2006-09-29       Impact factor: 17.712

3.  Urban lymphatic filariasis in central Nigeria.

Authors:  A Terranella; A Eigiege; I Gontor; P Dagwa; S Damishi; E Miri; B Blackburn; D McFarland; J Zingeser; M Y Jinadu; F O Richards
Journal:  Ann Trop Med Parasitol       Date:  2006-03

4.  On the inefficiency of transmission of Wuchereria bancrofti from mosquito to human host.

Authors:  N G Hairston; B de Meillon
Journal:  Bull World Health Organ       Date:  1968       Impact factor: 9.408

5.  Climate change threatens European conservation areas.

Authors:  Miguel B Araújo; Diogo Alagador; Mar Cabeza; David Nogués-Bravo; Wilfried Thuiller
Journal:  Ecol Lett       Date:  2011-03-30       Impact factor: 9.492

Review 6.  Malaria on the move: human population movement and malaria transmission.

Authors:  P Martens; L Hall
Journal:  Emerg Infect Dis       Date:  2000 Mar-Apr       Impact factor: 6.883

7.  The risk of a mosquito-borne infection in a heterogeneous environment.

Authors:  David L Smith; Jonathan Dushoff; F Ellis McKenzie
Journal:  PLoS Biol       Date:  2004-10-26       Impact factor: 8.029

8.  Using remotely sensed night-time light as a proxy for poverty in Africa.

Authors:  Abdisalan M Noor; Victor A Alegana; Peter W Gething; Andrew J Tatem; Robert W Snow
Journal:  Popul Health Metr       Date:  2008-10-21

9.  The geographical distribution of lymphatic filariasis infection in Malawi.

Authors:  Bagrey Mm Ngwira; Phillimon Tambala; A Maria Perez; Cameron Bowie; David H Molyneux
Journal:  Filaria J       Date:  2007-11-29

10.  Mapping and Modelling the Geographical Distribution and Environmental Limits of Podoconiosis in Ethiopia.

Authors:  Kebede Deribe; Jorge Cano; Melanie J Newport; Nick Golding; Rachel L Pullan; Heven Sime; Abeba Gebretsadik; Ashenafi Assefa; Amha Kebede; Asrat Hailu; Maria P Rebollo; Oumer Shafi; Moses J Bockarie; Abraham Aseffa; Simon I Hay; Richard Reithinger; Fikre Enquselassie; Gail Davey; Simon J Brooker
Journal:  PLoS Negl Trop Dis       Date:  2015-07-29
View more
  8 in total

1.  The roadmap towards elimination of lymphatic filariasis by 2030: insights from quantitative and mathematical modelling.

Authors: 
Journal:  Gates Open Res       Date:  2019-09-13

2.  Mapping Environmental Suitability of Scrub Typhus in Nepal Using MaxEnt and Random Forest Models.

Authors:  Bipin Kumar Acharya; Wei Chen; Zengliang Ruan; Gobind Prasad Pant; Yin Yang; Lalan Prasad Shah; Chunxiang Cao; Zhiwei Xu; Meghnath Dhimal; Hualiang Lin
Journal:  Int J Environ Res Public Health       Date:  2019-12-02       Impact factor: 3.390

3.  Spatial variation in lymphatic filariasis risk factors of hotspot zones in Ghana.

Authors:  Efiba Vidda Senkyire Kwarteng; Samuel Ato Andam-Akorful; Alexander Kwarteng; Da-Costa Boakye Asare; Jonathan Arthur Quaye-Ballard; Frank Badu Osei; Alfred Allan Duker
Journal:  BMC Public Health       Date:  2021-01-28       Impact factor: 3.295

4.  Progress towards onchocerciasis elimination in Côte d'Ivoire: A geospatial modelling study.

Authors:  Obiora A Eneanya; Benjamin G Koudou; Meite Aboulaye; Aba Ange Elvis; Yeo Souleymane; Marie-Madeleine Kouakou; Gary J Weil; Peter U Fischer
Journal:  PLoS Negl Trop Dis       Date:  2021-02-10

5.  Exploring agricultural land-use and childhood malaria associations in sub-Saharan Africa.

Authors:  Hiral Anil Shah; Luis Roman Carrasco; Arran Hamlet; Kris A Murray
Journal:  Sci Rep       Date:  2022-03-08       Impact factor: 4.379

Review 6.  Artificial intelligence and lymphedema: State of the art.

Authors:  Abdullah S Eldaly; Francisco R Avila; Ricardo A Torres-Guzman; Karla Maita; John P Garcia; Luiza Palmieri Serrano; Antonio J Forte
Journal:  J Clin Transl Res       Date:  2022-06-01

7.  Informing Rift Valley Fever preparedness by mapping seasonally varying environmental suitability.

Authors:  Austin N Hardcastle; Joshua C P Osborne; Rebecca E Ramshaw; Erin N Hulland; Julia D Morgan; Molly K Miller-Petrie; Julia Hon; Lucas Earl; Peter Rabinowitz; Judith N Wasserheit; Marius Gilbert; Timothy P Robinson; G R William Wint; Shreya Shirude; Simon I Hay; David M Pigott
Journal:  Int J Infect Dis       Date:  2020-07-30       Impact factor: 3.623

Review 8.  Elimination of Lymphatic Filariasis in Shandong Province, China, 1957-2015.

Authors:  Xiaodan Huang; Xuli Deng; Jingxuan Kou; Xin Liu; Huaiwei Wang; Peng Cheng; Maoqing Gong
Journal:  Vector Borne Zoonotic Dis       Date:  2020-08-13       Impact factor: 2.523

  8 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.