The leishmaniases are vector-borne diseases that have a broad global distribution throughout much of the Americas, Africa, and Asia. Despite representing a significant public health burden, our understanding of the global distribution of the leishmaniases remains vague, reliant upon expert opinion and limited to poor spatial resolution. A global assessment of the consensus of evidence for leishmaniasis was performed at a sub-national level by aggregating information from a variety of sources. A database of records of cutaneous and visceral leishmaniasis occurrence was compiled from published literature, online reports, strain archives, and GenBank accessions. These, with a suite of biologically relevant environmental covariates, were used in a boosted regression tree modelling framework to generate global environmental risk maps for the leishmaniases. These high-resolution evidence-based maps can help direct future surveillance activities, identify areas to target for disease control and inform future burden estimation efforts.
The leishmaniases are vector-borne diseases that have a broad global distribution throughout much of the Americas, Africa, and Asia. Despite representing a significant public health burden, our understanding of the global distribution of the leishmaniases remains vague, reliant upon expert opinion and limited to poor spatial resolution. A global assessment of the consensus of evidence for leishmaniasis was performed at a sub-national level by aggregating information from a variety of sources. A database of records of cutaneous and visceral leishmaniasis occurrence was compiled from published literature, online reports, strain archives, and GenBank accessions. These, with a suite of biologically relevant environmental covariates, were used in a boosted regression tree modelling framework to generate global environmental risk maps for the leishmaniases. These high-resolution evidence-based maps can help direct future surveillance activities, identify areas to target for disease control and inform future burden estimation efforts.
Entities:
Keywords:
boosted regression trees; cutaneous leishmaniasis; epidemiology; global health; human; infectious disease; leishmania; microbiology; niche based modelling; species distribution modelling; visceral leishmaniasis
The leishmaniases are a group of protozoan diseases transmitted to humans and other
mammals by phlebotomine sandflies (Murray et al.,
2005; WHO, 2010). Considered as
one of the neglected tropical diseases (NTD) (WHO,
2009), the leishmaniases can be caused by around 20
Leishmania species and include a complex life cycle involving
multiple arthropod vectors and mammalian reservoir species (Ashford, 1996; Ready,
2013). Sandflies belonging to either Phlebotomusspp.
(Old World) or Lutzomyiaspp. (New World) are the primary vectors;
domestic dogs, rodents, sloths, and opossums are amongst a long list of mammals that
are either incriminated or suspected reservoir hosts. Non-vector transmission (e.g.,
by accidental laboratory infection, blood transfusion, or organ transplantation) is
possible, but rare (Cardo, 2006).
Transmission of the leishmaniases can be either anthroponotic or zoonotic. The
leishmaniases rank as the leading NTD in terms of mortality and morbidity with an
estimated 50,000 deaths in 2010 (Lozano et al.,
2012) and 3.3 million disability adjusted life years (Murray et al., 2012).Symptoms of Leishmania infection can take many different and diverse
forms (Banuls et al., 2011), the two main
outcomes being cutaneous leishmaniasis (CL) and visceral leishmaniasis (VL).
Cutaneous leishmaniasis typically presents as cutaneous nodules or lesions at the
site of the sandfly bite (localised cutaneous leishmaniasis). In some cases,
parasites disseminate through the skin and present as multiple non-ulcerative
nodules (diffuse cutaneous leishmaniasis, DCL) or propagate through the lymphatic
system resulting in nasobronchial and buccal mucosal tissue destruction (mucosal
leishmaniasis, ML) (Reithinger et al.,
2007; Dedet and Pratlong, 2009).
Localised CL may resolve spontaneously and usually responds well to treatment;
management of DCL and ML cases is more difficult and cases may take considerably
longer to resolve, if at all. Visceral leishmaniasis generally affects the spleen,
liver, or other lymphoid tissues, and, if left untreated, is fatal; a fraction of
successfully treated VL cases may result in maculopapular or nodular rashes
(post-kala-azar dermal leishmaniasis) (Murray et
al., 2005; Dedet and Pratlong,
2009). While the Leishmania species determines which of
the main two forms of the leishmaniases will result from infection, establishment,
progression, and severity of infection as well as treatment regimen and outcome is
dependent on a range of other factors, including parasite strain, characteristics of
sandfly saliva, parasite infection with Leishmania RNA virus, host
genetics, and immunosuppression, particularly due to HIV co-infection (Reithinger et al., 2007; Ives et al., 2011; Novais et al., 2013).Species distribution models provide a robust means of mapping these diseases at a
global level. These models define a set of conditions, from a selection of
environmental covariates, which best categorise known occurrences. Through this
categorisation, areas of unknown pathogen presence can be identified and thus a
global evaluation of environmental suitability for presence can be made. A variety
of factors can influence the distribution of an organism, including an array of
environmental and other abiotic characteristics as well as biotic factors (Peterson, 2008). Whilst many areas may be
environmentally suitable for a given species, other factors may prevent the species
from being present in all of these locations. This distinction is often referred to
as the difference between the fundamental and the realised niche of the species, the
former describing a potential distribution based upon specific features of the
environment whilst the latter indicates the distribution we observe. Such a
framework can be applied just as successfully in the context of pathogens and their
vectors as with macroorganisms (Peterson et al.,
2011) and has already been applied to the mapping of malaria vectors
(Sinka et al., 2010, 2010, 2011) and dengue (Bhatt et al.,
2013). The relationships between the leishmaniases and environmental and
socioeconomic factors known to influence their distribution at a global scale has
not previously been considered in a comprehensive and quantitative manner (Hay et al., 2013). This study uses these
modelling techniques in order to define the first evidence-based global
environmental risk maps of the leishmaniases.
Results
Evidence of leishmaniasis
For each province or state across the globe (classed as Admin 1 by the Food and
Agriculture Organization's Global Administrative Unit Layers (FAO, 2008), totalling some 3450) evidence
was collected regarding CL and VL presence or absence. An assessment of the
consensus of this evidence ranging from comprehensive agreement on disease
presence (+100%) to consensus of disease absence (−100%) was made.
Figures 1A–4A present these
evidence consensus maps, with full reasoning for each administrative unit's
score outlined in the associated data set (Dryad data set doi: 10.5061/dryad.05f5h). For Brazil, it was possible to perform
this analysis at the district level (classed as Admin 2) totalling some 5510
units. In total, 950 Admin 1 units from 84 countries reported a consensus on CL
presence greater than indeterminate (a score of 0), with 310 Admin 1 units from
42 countries reporting a complete consensus on the presence of CL. In Brazil,
2469 Admin 2 regions recorded CL cases over the period of investigation.
Consensus on the presence of VL (score greater than 0) was reported in 793 Admin
1 units from 77 countries, with 88 Admin 1 units from 32 countries reporting
complete consensus on VL. In Brazil, 1320 Admin 2 units recorded VL cases.
Figure 1.
Reported and predicted distribution of cutaneous
leishmaniasis in the New World.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of cutaneous leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).
DOI:
http://dx.doi.org/10.7554/eLife.02851.003
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.
DOI:
http://dx.doi.org/10.7554/eLife.02851.004
Reported and predicted distribution of cutaneous
leishmaniasis in the New World.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of cutaneous leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.003
Uncertainty associated with predictions in Figure 1B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.DOI:
http://dx.doi.org/10.7554/eLife.02851.004
Reported and predicted distribution of visceral leishmaniasis
in the New World.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of visceral leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.005
Uncertainty associated with predictions in Figure 2B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.DOI:
http://dx.doi.org/10.7554/eLife.02851.006
Reported and predicted distribution of cutaneous
leishmaniasis in the Old World.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of cutaneous leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.007
Uncertainty associated with predictions in Figure 3B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.DOI:
http://dx.doi.org/10.7554/eLife.02851.008
Reported and predicted distribution of cutaneous
leishmaniasis in northeast Africa.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of cutaneous leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.009
Reported and predicted distribution of cutaneous
leishmaniasis across the Near East, including Syria, Iran and
Afghanistan.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of cutaneous leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.010
Reported and predicted distribution of visceral leishmaniasis
in the Old World.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of visceral leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.011
Uncertainty associated with predictions in Figure 4B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.DOI:
http://dx.doi.org/10.7554/eLife.02851.012
Reported and predicted distribution of visceral leishmaniasis
in northeast Africa.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of visceral leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.013
Reported and predicted distribution of visceral leishmaniasis
in the Indian subcontinent.
(A) Evidence consensus for presence of the disease
ranging from green (complete consensus on the absence:
−100%) to purple (complete consensus on the presence of
disease: +100%). The blue spots indicate occurrence points
or centroids of occurrences within small polygons.
(B) Predicted risk of visceral leishmaniasis
from green (low probability of presence) to purple (high
probability of presence).DOI:
http://dx.doi.org/10.7554/eLife.02851.014
Population at risk estimates for leishmaniasis.
Four scatterplots showing the relationship between non-zero
estimated mean annual incidence (Alvar et al., 2012) and estimated
population at risk derived from the cartographic approach for
(A) New World cutaneous leishmaniasis,
(B) New World visceral leishmaniasis,
(C) Old World cutaneous leishmaniasis, and
(D) Old World visceral leishmaniasis. For each
country the bars represent the annual incidence estimate
range.DOI:
http://dx.doi.org/10.7554/eLife.02851.015Of the 10 countries (Afghanistan, Colombia, Brazil, Algeria, Peru, Costa Rica,
Iran, Syria, Ethiopia, and Sudan) that contribute 75% of the global estimated CL
incidence (Alvar et al., 2012), only
Algeria did not have regions of complete evidence consensus on presence due to
incomplete and non-contemporary case data. Similarly, of the six countries
(Brazil, Ethiopia, Sudan, South Sudan, India, and Bangladesh) that report 90% of
all VL cases (Alvar et al., 2012), all
six had regions of complete consensus on VL.Figures 1A–4A also show the spatial
distribution of occurrence data, defined as one or more reports of leishmaniasis
in a given calendar year, collated from a variety of sources. Overall, there is
a relatively broad geographic spread and good correspondence with the evidence
consensus maps for each disease. Tunisia, Morocco and Brazil report the highest
number of unique CL occurrences in any given year, whilst India reported the
largest proportion of the VL occurrence data.Table 1 reports the sources and types of
data within the occurrence database. Whilst the majority of occurrence records
contain accurate point data (62%), the remainder were recorded at a provincial
or district level. Occurrence records for the two diseases were relatively
similar in number with a total of 6426 records for CL and 6137 for
VL.
Table 1.
Origin and spatial resolution of leishmaniasis occurrence data
DOI:
http://dx.doi.org/10.7554/eLife.02851.016
Origin and resolution of occurrence data
Point data
Province level data
District level data
Total
Cutaneous leishmaniasis
Literature
3680
879
1220
5779
CNR-L
531
47
31
609
HealthMap
31
–
–
31
GenBank
6
–
1
7
Total
4248
926
1252
6426
Visceral leishmaniasis
Literature
3050
1500
1068
5618
CNR-L
429
24
29
482
HealthMap
32
1
–
33
GenBank
3
–
1
4
Total
3514
1525
1098
6137
Each cell gives the number of occurrence records added to the
data set by considering each additional datasource after
removing duplicate records. Occurrence records are separated by
spatial resolution—whether they are recorded as points
(typically representing settlements) or as province level (admin
1) or district level (admin 2) data.
Origin and spatial resolution of leishmaniasis occurrence dataDOI:
http://dx.doi.org/10.7554/eLife.02851.016Each cell gives the number of occurrence records added to the
data set by considering each additional datasource after
removing duplicate records. Occurrence records are separated by
spatial resolution—whether they are recorded as points
(typically representing settlements) or as province level (admin
1) or district level (admin 2) data.
Modelled distribution of the leishmaniases
Figures 1B–4B show the global
predicted environmental risk maps for CL and VL. Table 2 identifies the top five predictor variables in
each of the four modelled regions (since CL and VL were modelled separately in
the Old World and New World) as measured by average contribution to the boosted
regression trees (BRT) submodels. Peri-urban and urban land cover is an
important predictor of the distribution of CL in the Old World and of VL
globally. Abiotic factors such as land surface temperature (LST) were better
predictors of CL than of VL. In total, LST variables (annual minimum, maximum
and mean) explain 21.99% of CL distribution in the Old World and 43.65% of CL
distribution in the New World (with maximum LST having the highest relative
contribution). Abiotic factors combined (including LST, normalised difference
vegetation index (NDVI) and precipitation) accounted for 29.02% and 48.55% of VL
distribution in the Old World and New World, respectively. Validation statistics
for all models were high with a mean area under the receiver operator curve
(AUC) above 0.97 and mean correlations above 0.85 for all models.
Table 2.
Mean relative contribution of predictor variables to the ensemble BRT
models of CL and VL in both the Old and New World
DOI:
http://dx.doi.org/10.7554/eLife.02851.017
Top predictors of CL
Relative contribution
Top predictors of VL
Relative contribution
Old world
Peri-urban extents
47.34
Peri-urban extents
51.50
Minimum LST
18.36
Urban extents
17.38
Urban extents
9.01
Maximum NDVI
7.87
G-Econ
7.33
Minimum LST
5.87
Minimum Precipitation
4.95
Maximum Precipitation
4.00
New World
Maximum LST
36.91
Peri-urban extents
25.90
Peri-urban extents
18.61
Urban extents
21.24
Maximum precipitation
12.06
Mean LST
9.18
Minimum precipitation
6.21
Mean NDVI
7.83
Minimum LST
4.39
Maximum LST
6.40
LST = Land Surface Temperature, G-Econ = Geographically
based Economic data, NDVI = Normalised Difference
Vegetation Index.
Mean relative contribution of predictor variables to the ensemble BRT
models of CL and VL in both the Old and New WorldDOI:
http://dx.doi.org/10.7554/eLife.02851.017LST = Land Surface Temperature, G-Econ = Geographically
based Economic data, NDVI = Normalised Difference
Vegetation Index.In the New World, CL is predicted to occur primarily within the Amazon basin and
other areas of rainforest. By contrast, VL is predicted to occur mainly along
the coastline of Brazil, with sporadic foci across the rest of Southern and
Central America. Outside of their main foci, both diseases are strongly
associated with urban and peri-urban areas, resulting in a focal distribution
throughout much of the New World.In the Old World, both CL and VL are predicted to be present from the
Mediterranean Basin across the Near East to Northwest India, with a few foci in
Central China as well as in a thin band of predicted risk across West Africa and
in the Horn of Africa. The predicted distribution of VL also extends into
Northeast India and China with a large predicted focus in the northwest.The populations living in areas predicted to be subject to environmental risk of
CL and VL are estimated to be 1.71 billion and 1.69 billion, respectively,
approximately a quarter of the world's population. Figure 4—figure supplement 4 compares these
national estimates to the annual case incidence data from all countries for
which at least one case per annum was estimated by Alvar et al. (2012). There is a strong
positive association between the two measures of disease occurrence. We provide
estimates of the populations at risk in 90 countries for which no human cases of
CL or VL were regularly reported (Alvar et al.,
2012). A full table of this information is presented in the
associated Dryad data set (doi: 10.5061/dryad.05f5h). For many of these countries, Alvar et al. (2012) reported a handful of
sporadic cases over the years indicating very rare occurrence of infection,
whilst the remainder were countries with inconclusive evidence of disease
presence or absence. It is important to note that the relationship between
environmental risk and true incidence of disease remains to be elucidated;
however the association between populations living in areas of environmental
risk and national level estimates of incidence suggests that the modelled
occurrence–incidence relationship approach used by Bhatt et al. (2013) for dengue could be applied if the
necessary longitudinal cohort study data were available.
Figure 4—figure supplement 4.
Population at risk estimates for leishmaniasis.
Four scatterplots showing the relationship between non-zero
estimated mean annual incidence (Alvar et al., 2012) and estimated
population at risk derived from the cartographic approach for
(A) New World cutaneous leishmaniasis,
(B) New World visceral leishmaniasis,
(C) Old World cutaneous leishmaniasis, and
(D) Old World visceral leishmaniasis. For each
country the bars represent the annual incidence estimate
range.
DOI:
http://dx.doi.org/10.7554/eLife.02851.015
Discussion
This work has compiled a large body of qualitative and quantitative information on
the global distribution of the leishmaniases and employed a statistical modelling
framework to generate the first published high-resolution global distribution maps
of these diseases.The evidence consensus maps provide a useful assessment of both global and regional
knowledge of these diseases. Whilst in many countries consensus on presence or
absence of the leishmaniases exists, in other areas, including large parts of Africa
and many states in India, these assessments reveal significant uncertainty in
assessing disease presence or absence using currently available evidence. It is in
these data-poor countries that increased surveillance efforts should be concentrated
to improve our knowledge of the global distribution of the leishmaniases. In some
locations, cases have been reported as locally transmitted without the presence of
proven vector species, which could indicate a false positive. However, the overall
consensus score will reflect any uncertainty associated with the validity of these
reports; if multiple independent sources report autochthonous cases, this increased
certainty will be reflected in a higher consensus score. Similarly, whilst the
occurrence database contains data from across the globe, this data set is inevitably
subject to spatial bias in reporting, with more data reported from more economically
developed countries where we already have a good knowledge of the disease (e.g.,
Spain, France, and Italy).The complexity and diversity of transmission cycles involving not just humans, but
also a multitude of vectors and reservoirs, necessitated a modelling approach which
can account for highly non-linear effects of covariates on probability of disease
presence. The BRT modelling approach employed is able to do this and has previously
been shown to produce highly accurate predictions across a wide range of species
(Elith et al., 2006, 2008). This ecological niche modelling
approach is therefore able to deal with not only the variation in parasites causing
infection, but also the various life-histories and habitat preferences associated
with the different vector species.A restriction of the BRT approach (in common with other species distribution
modelling approaches) is the need for absence data in addition to occurrence data.
Since reliable absence data were not available at this spatial scale, the
incorporation of pseudo-data into the modelling framework was necessary. The
methodology employed in this study attempted to minimise the problems this can
cause, by using a probabilistic approach to generate the pseudo-data which
incorporates the evidence consensus and distance from existing occurrence points.
Similarly, reporting bias within the occurrence database is an issue with all
presence-only species distribution models (Peterson et al., 2011). If bias is unaccounted for, there is the
potential that the model merely reflects factors that correlate with the probability
of reporting disease occurrence rather than the disease itself, such as healthcare
expenditure (Phillips et al., 2009; Syfert et al., 2013). The pseudo-data
selection procedures (which included information from both the occurrence data set
and the less-biased evidence consensus map) coupled with the model ensembling
approach aimed to minimise this potential source of bias.The differences in the most important predictors of disease presence between the two
forms of the disease and between the Old and New Worlds highlight the complex and
spatially variable epidemiology of the leishmaniases. Similar to a recent study of
the spatial predictors of dengue occurrence (Bhatt
et al., 2013), environmental and socioeconomic factors were found to be
important contributors to the distribution of both CL and VL. For VL, both Old World
and New World distributions are driven by peri-urban (and to a lesser degree urban)
land cover. This reflects recent trends observed, for instance, in Brazil and Bihar
state in India, where areas of highest risk have been found in peridomestic settings
(Bern et al., 2010; Harhay et al., 2011). This risk factor may
well be linked back to aspects of vector bionomics, with many vectors in these
regions associating with or near households in general (Singh et al., 2008; Poche
et al., 2011; Uranw et al.,
2013). Furthermore, whilst significant anthroponotic transmission of
L. donovani occurs across parts of the Old World, zoonotic
cycles of VL, primarily tied to canine hosts, dominate L. infantum
transmission (Chamaille et al., 2010; Ready, 2013), with infection in dogs shown to
be closely associated with human population density.Important predictors of CL distribution differed markedly between the Old and New
World. Whilst peri-urban land cover was the most important predictor of the disease
in the Old World, in the New World temperature was the highest predictor, with
abiotic factors predicting 74.18% of CL distribution. This difference in the
relative importance of climatic drivers reflects the fact that in the Old World the
main endemic CL areas are due to both anthroponotically transmitted L.
tropica and zoonotic cycles of L. major, whereas in
the New World the disease is primarily associated with sylvatic and zoonotic cycles
with a variety of different Leishmania spp. and wild reservoir
hosts implicated (Ashford, 1996; Reithinger et al., 2007; WHO, 2010; Lima et al., 2013; Ready,
2013).The distribution maps represent a spatially refined assessment of the global
environmental risk of leishmaniasis and provide a starting point for various public
health activities including targeting areas for control and assessing disease
burden. The maps compare favourably to the WHO Expert Committee on the Control of
Leishmaniases outputs (WHO, 2010), have
high model validation statistics and improve upon the existing body of work by
providing a finer resolution of risk at a subnational level. Similarly, the
countries indicated by Alvar et al. (2012)
as having 90% of all VL and 75% of all CL cases, were all predicted by our maps to
have risk for VL and CL, respectively.There are a number of regions in which our maps do not correspond as closely to these
previous findings. Regions such as Northwest China are predicted to have high risk
for VL, though the low population densities in this area are likely to lead to very
few cases and, given its remoteness, even fewer reported cases. Other regions, such
as the Mediterranean coastline of Europe, are predicted to be highly suitable for
leishmaniasis, but we see few human cases. This is because the maps presented
predict the probability of disease presence in an area, rather than directly infer
measures of incidence or burden, which can be influenced by a variety of other
factors (e.g., in the Mediterranean coastline of Europe, VL has been associated with
immunosuppression). The evidence consensus layer, used to mask out regions with high
consensus on leishmaniasis absence, acts as a rough filter on the environmental risk
maps. However, in order to model the true relationship between environmental risk
and disease incidence, a global data set of geopositioned disease incidence data
would be required; at present this is unavailable.Estimates of the populations living in areas of environmental risk are therefore
supplied as a proxy for the true burden of disease. However, they cannot be directly
compared with other global estimates of the leishmaniases’ disease burden,
such as the WHO estimates of clinical burden of around 350 million (WHO, 2010). Figure 4—figure supplement 4 shows a strong, positive
relationship between population at risk estimates and estimated annual incidence
from Alvar et al. (2012). The exceptions to
this relationship (e.g., Egypt, Nigeria, and Côte d’Ivoire) are all
countries with indeterminate evidence consensus scores, indicating a genuine lack of
knowledge regarding both the distribution and incidence of disease.Previous estimates of the leishmaniases' global burden have been complicated by poor
knowledge of the global distribution of the diseases (Bern et al., 2008; Reithinger, 2008). It is hoped that the maps presented here will help to
increase the accuracy of future estimates. Ideally, future improvements to the
global distribution maps presented here would distinguish between the different
Leishmania species and sandfly vectors. Species-specific models
at the same level of detail as those presented here are not currently possible due
to a lack of suitable data. Developments in the use of ‘big data’
approaches to disease mapping (such as the incorporation of informal internet
resources) may enable the construction of data sets which could be used in these
analyses (Hay et al., 2013). A further
complication with burden estimation is the epidemic nature of the disease, as
evidenced by the national case time series in Alvar
et al. (2012), leading to significant interannual variation in burden.
Therefore, any burden estimation would have to account for this and the temporal
spread of data would therefore be critical.It should be noted that non-environmental drivers of transmission and morbidity, such
as HIV immunosuppression and risk of infection via blood transfusions and
intravenous drug usage, are not incorporated into our present models. The maps
presented here can help inform the wider discussion of these factors and their
impact on leishmaniasis (e.g., by identifying regions with greater risk for HIV and
leishmaniasis co-infection) (Desjeux and Alvar,
2003). Similarly, the niche based models used here could enable a
decoupling of environmental from social factors to assess the importance of the
latter on leishmaniasis transmission in particular areas. It may indeed be the case
that in some specific localities it is these non-environmental risk factors that are
the main determinants of disease distribution.
Conclusions
These maps represent evidence-based estimates of the current global distribution
of the leishmaniases incorporating a comprehensive occurrence database and a
rigorous statistical modelling framework with associated uncertainty statistics.
We estimate that 1.71 billion and 1.69 billion individuals live in areas that
are suitable for CL and VL transmission, respectively. These figures highlight
the need for much greater awareness of this disease at a global scale. These
maps provide an important baseline assessment and a strong foundation on which
to base future burden estimates, target regions for control efforts and inform
public health decisions.
Materials and methods
A boosted regression tree (BRT) modelling framework was used to generate global
predicted environmental risk maps for CL and VL. This framework required four key
information components: (i) a map of the consensus of evidence for the global
extents of the leishmaniases; (ii) a comprehensive data set of geopositioned CL and
VL occurrence records; (iii) a suite of global, gridded data sets on environmental
correlates of the leishmaniases; and (iv) pseudo-data to augment the occurrence
records. In order to better capture the realised niche of these diseases, prediction
by the model is restricted to those areas of known disease transmission, or where
transmission is uncertain, as defined by the evidence consensus layer (i). The full
procedures used to generate these components and the resulting risk and prevalence
maps are outlined below.
Evidence consensus
The methodology used for generating the definitive extents for the leishmaniases
was adapted from work on dengue (Brady et al.,
2012). Four primary evidence categories were used to determine a
consensus on the presence or absence of the leishmaniases: (i) health reporting
organisations; (ii) peer-reviewed evidence of local autochthonous transmission;
(iii) case data; (iv) supplementary information. Cutaneous and visceral
leishmaniasis were the two symptomatologies investigated: other forms of the
disease were subset within these two – whilst VL contained cases of
post-kala-azar dermal leishmaniasis, CL included diffuse, disseminated, and
mucosal forms of the disease. Although limited amounts of data were available
for some of these forms, their epidemiology is similar, and consequently this
categorisation was seen as appropriate. Information was collected at provincial
level (termed Admin 1 units by the Food and Agriculture Organization's (FAO)
Global Administrative Unit Layers (GAUL) coding (FAO, 2008)) to better capture the focal nature of these
diseases.
Health Reporting Organisation Evidence (scores between −3 and
+3)
Two health reporting organisations were referenced, the Global Infectious
Diseases and Epidemiology Online Network (GIDEON) (Edberg, 2005) and the World Health Organization (WHO)
(WHO, 2010). The status of
disease was recorded for each Admin 1 unit as either present, absent or
unspecified. If both reported the disease as present, +3 was scored, if
both reported absence, −3 was scored, with +2/−2 scored
if one reporting body did not specify the presence or absence of the
disease. If the two disagreed, or both were non-specific, 0 was scored
reflecting the lack of a consensus on the status of that region.
Peer-reviewed evidence (scores between +2 and +6)
A review of reported leishmaniases' cases was performed. Using PubMed and Web
of Knowledge with ‘[admin1 province] leish*’ as the
search parameters, articles from January 1960 until September 2012 were
abstracted. Each abstract was imported into Endnote X4 and assessed for
relevance. Papers that included reported cases on either CL or VL were then
obtained. Cases were included if there was sufficient evidence to suggest
that local autochthonous transmission had occurred. Where individuals from a
non-endemic country had travelled to an endemic country (e.g., tourists and
military personnel) and returned with an infection, this was included (as
evidence for leishmaniasis in the foreign destination) since these typically
represent immunologically naive individuals who have undergone more rigorous
diagnostics in their home country, and thus represent a potentially more
informed data source. Each paper was assessed for contemporariness and
diagnostic accuracy. Contemporariness was graded in 3 bands:
2005–2012 = 3, 1997–2004 = 2 and 1997 and earlier
= 1, as was diagnostic accuracy where 1 was scored for data that
reported ‘confirmed’ cases without detailing methodologies
implemented; 2 was scored where evidence of microscopy, serology, or the
Montenegro skin test had been used; 3 was awarded to those studies that had
used PCR or other molecular techniques (Reithinger and Dujardin, 2007). Contemporariness bins were based
upon the potentially lengthy intrinsic incubation periods present with some
Leishmania spp. as well as to accommodate the potential
for epidemic cycles, where cases may only be detected in peak years and
missed in the intervening baseline periods. The most contemporary and
diagnostically accurate papers were then subset to maximise the consensus
score for any given area.
Case data (scores between −6 and +6)
Case data were derived from reports on the leishmaniases provided by national
health officials (Alvar et al.,
2012). A threshold value of 12 CL cases and 7 VL cases in a given
province in a given year was deemed suitable by the authors to distinguish
significant disease events from sporadic cases within that region. If cases
were reported at or above the threshold and were dated no later than 2005,
+6 was scored. If data existed below this threshold, indicating
sporadic cases, or data indicated a history of reported cases in the region
but with no evidence of time period, scores were assigned stratified by
total annual healthcare expenditure (HE) per capita at average US$ exchange
rates (WHO, 2011). This was used as
a proxy to determine genuine sporadic reporting from inadequate
surveillance. Three categories were defined—HE Low (<$100), HE
Medium ($100 ≤ HE < $500), and HE High (≥$500). If
sporadic cases were reported in an HE Low country, +4 was scored,
whilst in an HE Medium country, +2 was scored, and in an HE High
country, 0 was scored. If there were no reported case data available, HE Low
countries scored +2, HE Medium countries scored −2 and HE High
countries scored −6 (Brady et al.,
2012).
Supplementary evidence
Supplementary evidence was provided in cases where a consensus on presence or
absence could not be reached using the aforementioned evidence types,
typically with areas where the consensus value was close to 0%. For these
regions, additional literature searches were undertaken to determine whether
known vector species or infected reservoir hosts were reported in the
region. The justification for each provincial scenario is outlined in the
associated online databases (Dryad data set doi: 10.5061/dryad.05f5h). In total, this assessment was required
in 24 countries.An overall consensus score for each administrative region was calculated by
the sum of the scores in each category, divided by the maximum possible
score, then expressed as a percentage. Consensus was defined as either
complete (±75% to ±100%), good (±50% to ±74%), moderate
(±25% to ±49%), poor (±1% to ±24%), or indeterminate
(0%). Such a classification is intended more as a guide to the quality of
evidence for the leishmaniases in an area, rather than as a strict
classification of certainty. The full scores for each country are laid out
in the associated online data sets (Dryad data set doi: 10.5061/dryad.05f5h).
Brazil and Peru
The Brazilian Ministry of Health produces, via the Sistema de
Informação de Agravos de Notificação (SINAN, 2013) reporting network,
records of infections at the municipality level. This allowed for a more
thorough evidence consensus to be performed at district level (termed Admin
2 FAO, 2008) within Brazil. As
above, WHO and GIDEON status as well as peer-reviewed literature score were
recorded, both aggregated to Admin 1 provincial level. Case data were then
defined by the presence of a municipality reporting leishmaniasis between
2008 and 2011 inclusive, with positive reports scoring +6 and absence
scoring −6. The overall consensus score was then calculated as above.
In addition, provincial level case data for Peru was replaced by Ministry of
Health information as it was more contemporary than that listed by Alvar et al. (2012).
Occurrence records
Two separate searches using PubMed and Web of Knowledge were undertaken using the
search parameter “leish*,” and including articles up to
December 2012, and their respective abstracts, were filtered for relevance. From
these searches, 4845 articles were collated, with data recorded at the
resolution of either a point or Admin 1 or 2 polygon. These were then
geo-positioned using Google Maps (https://maps.google.co.uk/). Each entry was evaluated to ensure
that non-autochthonous cases and duplicate entries were eliminated. Each
occurrence was assigned a start and end date based upon the content of the
paper, used to define the time period over which occurrences were reported.In addition to this resource, reports were taken from the HealthMap database
(http://healthmap.org/en/).
HealthMap is an online based infectious disease surveillance system that
compiles data from informal data sources ranging from online news articles to
ProMED reports (Freifeld et al., 2008).
It parses information from these sources searching for relevant keywords, and
then, using crowdsourcing and automated processes, geopositions those relating
to the disease of interest. As of December 2012, a total of 690 leishmaniasis
relevant articles were archived.Searches were also performed on GenBank accessions, searching for archived
genetic information from Leishmania spp. known to infect humans
(WHO, 2010). If the host was
identified as human, geographic indicators were assigned either as point, Admin
1 or Admin 2, based upon the information in the location tag. Tags at the
national level were filtered out of the data set. In total, 563 accessions were
associated with sub-national location details and added to the database.Finally, data were provided from the curated strain archives of the Centre
National de Référence des Leishmanioses (CNR-L) in Montpellier,
France. In total, information about 3465 strains isolated from humans was
provided, collected from between 1954 and 2013.All data were geopositioned as precisely as possible, which resulted in both
point-level data (referring to cities, towns or villages) as well as
polygon-level data (provinces or districts) with area no greater than one square
decimal degree. All data that had been manually geopositioned were checked to
ensure coordinates were plausible and then occurrences were standardised
annually to remove intra-annual duplicates, so that each individual record used
in our model represented an occurrence of leishmaniasis infections in a given 5
km × 5 km location or administrative unit for one given year. As a result,
the occurrence data were independent of burden; a location with 200 cases in one
year has equal weighting in the model as a location with just one reported case,
since it was only the presence of the disease being modelled.
Environmental correlates
Leishmania spp. are known to have anthroponotic, zoonotic, or
sylvatic transmission cycles in nature (WHO,
2010; Ready, 2013) which is
apparent in the focal nature of the disease; however, there are some key
features of the environment that are important in determining the distribution
of disease across the globe. Numerous models have been constructed for local
transmission scenarios implicating various environmental features from
temperature and precipitation to socioeconomic factors relating to standards of
living in villages in endemic foci. For the modelling process, a suite of global
gridded environmental, biologically plausible, correlates was generated.
Precipitation
Humidity and moisture, whether from rainfall or in the soil, have often been
identified as important for the sandfly, with humidity influencing breeding
and resting (Ready, 2013). Whilst
relatively little is known about these breeding sites, of the few that have
been identified, high humidity seems to be a common trait, including moist
Amazonian soils, caves, animal burrows, and select human dwellings (Killick-Kendrick, 1999; Feliciangeli, 2004). Studies have
indicated soil type and their moisture profiles as determinants of sandfly
distribution (Bhunia et al., 2010;
Elnaiem, 2011). Precipitation
represents a good global proxy measure for moisture, and has been shown to
play a prominent role in shaping disease distribution in previous
leishmaniasis modelling efforts (Thomson
et al., 1999; Elnaiem et al.,
2003; Bhunia et al.,
2010; Chamaille et al.,
2010; Gonzalez et al.,
2010, 2011; Elnaiem, 2011; Hartemink et al., 2011; Malaviya et al., 2011).Estimates of precipitation were obtained from the WorldClim database
(www.worldclim.org). This resource, which is freely available
online, provides data spanning from 1950 to 2000, describing monthly
averages over this time, at a 1 km × 1 km resolution (Hijmans et al., 2005). Using this
baseline, interpolated global climate surfaces were produced using
ANUSPLIN-SPLINA software (Hutchinson,
1995). With the use of temporal Fourier analysis, seasonal and
inter-annual variation in precipitation patterns, taken from the
interpolated global surface, were used to calculate minimum and maximum
monthly precipitation averages (Rogers et
al., 1996; Scharlemann et al.,
2008).
Temperature
Temperature influences both the development of the infecting
Leishmania parasite in the sandfly (Hlavacova et al., 2013) as well as
the life cycle of the sandfly vectors. On one hand, studies have shown that
with increasing temperatures, the metabolism of the sandfly increases,
influencing oviposition, defecation, hatching, and adult emergence rates
(Kasap and Alten, 2005; Benkova and Volf, 2007; Guzman and Tesh, 2000). On the other
hand, higher temperatures have also been shown to increase mortality rates
of adults (Benkova and Volf, 2007;
Guzman and Tesh, 2000). Studies
have integrated the effects of temperature on sandfly biting rates, sandfly
mortality, and extrinsic incubation periods to produce maps of how the basic
reproductive number of canineleishmaniasis varied spatially (Hartemink et al., 2011). Multiple
studies have also implicated temperature (including maximum, minimum, and
mean temperatures) as being an important explanatory variable for both
sandfly and disease distribution (Thomson
et al., 1999; Gebre-Michael et
al., 2004; Bhunia et al.,
2010; Chamaille et al.,
2010; Fischer et al.,
2010; Galvez et al.,
2011; Fernandez et al.,
2012; Branco et al.,
2013).Using a similar methodology to generating precipitation surfaces, minimum,
maximum, and mean monthly temperature values were generated (Hijmans et al., 2005).
Normalised difference vegetation index (NDVI) and land cover
Vegetation provides many roles in sandfly habitat and survival, ranging from
maintaining the necessary moisture profile for both immature stages and
adults, to a sugar resource for both male and female sandflies (Killick-Kendrick, 1999; Feliciangeli, 2004; Ready, 2013). Moreover, vegetation is
an important resource for many mammals that sandflies feed on, and that
potentially are Leishmania reservoirs. The importance of
considering NDVI was demonstrated with respect to the distribution of the
reservoir Psammomys obesus (sand rat) and the distribution
of its primary food, chenopods (Toumi et
al., 2012). NDVI has been implicated as a key explanatory
variable in the distribution of leishmaniasis cases in several studies
(Cross et al., 1996; Thomson et al., 1999; Elnaiem et al., 2003; Gebre-Michael et al., 2004; Elnaiem, 2011; Hartemink et al., 2011; Bhunia et al., 2012; Toumi et al., 2012; de Oliveira et al., 2012).The Advanced Very High Resolution Radiometer (AVHRR) NDVI product uses the
spectral reflectance of AVHRR channels 1 and 2 (visible red and near
infrared wavelength) to quantitatively assess the level of photosynthesising
vegetation in a region (Hay et al.,
2006). Using this data, compiled over multiple time intervals,
patterns of NDVI were extracted for each gridded 1 km × 1 km cell.
Poverty
Neglected tropical diseases and poverty are often found to be linked and the
use of a purely economic variable was chosen to act as a proxy for a variety
of important global risk factors for disease, including malnutrition,
housing quality, and living with domesticated animals (Bern et al., 2010; Boelaert et al., 2009; Herrero
et al., 2009; Malafaia,
2009; Zeilhofer et al.,
2008).The G-Econ database (gecon.yale.edu) takes economic data, at the smallest
administrative division available, and spatially rescales these data to
create a 1o × 1o gridded surface of the globe
(Nordhaus, 2006, 2008). This rescaling estimates the
gross cell product of each grid cell, conceptually similar to gross domestic
product, referring to the total market value of all final goods and services
produced within 1 year, and can be considered as an indicator of overall
standard of living within that area. Some cells provided multiple data; in
these scenarios the best-quality information, as outlined by the quality
field associated with the data, was used to select one value. All gross cell
product values were then adjusted using purchasing power parity in US$ for
the years 1990, 1995, 2000, and 2005, using national aggregates estimated by
the World Bank (Nordhaus, 2006) and
computed the mean across all years for each gridded cell globally. This
adjusted measure was used as the indicator of poverty in the model.
Urbanisation
Over the last few decades, there has been a tendency for the leishmaniases
having a sylvatic/zoonotic transmission cycle to transition into the urban
and peri-urban environment in response to increasing urbanisation trends
(Harhay et al., 2011). The
increasing overlap in habitat between suitable human and animal hosts and
multiple available resting sites for adults can allow for transmission of
disease to occur relatively easily (Singh
et al., 2008; Poche et al.,
2011; Uranw et al.,
2013).The Gridded Population of the World version 3 (GPW3) population density
database projected for 2010 was used. The core Global Rural–Urban
Mapping Project Urban Extents surface used night-time light satellite
imagery to differentiate urban areas (Balk
et al., 2006); GPW3 is a revision which updates the criteria for
urban areas to those areas where population density is greater than or equal
to 1000 people per km2. Using the most up-to-date national
censuses available and other demographic data resolved to the smallest
available administrative unit, a gridded surface of 5 km × 5 km cells
was generated. Each pixel could then be classified as urban, peri-urban, or
rural.
Modelling with boosted regression trees
The boosted regression trees (BRT) methodology employed for mapping the
leishmaniases is a variant of the model used in a previous analysis of dengue
(Bhatt et al., 2013). Boosted
regression tree modelling combines both regression trees, which build a set of
decision rules on the predictor variables by portioning the data into
successively smaller groups with binary splits (De'ath, 2007; Elith et
al., 2008), and boosting, which selects the tree that minimises the
loss of function, to best capture the variables that define the distribution of
the input data. The core BRT setup followed standard protocol already defined
elsewhere (Elith et al., 2008; Bhatt et al., 2013).
Pseudo-data generation
As BRT requires both the presence and absence data, the latter which is often
hard to collate in an unbiased manner, pseudo-data had to be generated
(Elith et al., 2008). There is
no general consensus on how best to generate pseudo-data (Bhatt et al., 2013); however, several
factors of the generation process are known to influence the predicted
distribution and thus can be sources of potential bias (Phillips et al., 2009; Van Der Wal et al., 2009; Phillips and Elith, 2011; Barbet-Massin et al., 2012). In order
to minimise such effects, pseudo-absence selection was directly related to
the evidence consensus layer and restricted to a maximum distance (μ)
from any occurrence point. Pseudo-presence data was also incorporated, again
informed by the evidence consensus layer, to compensate for poor
surveillance capacity in low prevalence regions. As in Bhatt et al. (2013) points were randomly located in
regions above an evidence consensus threshold of −25, with regional
placement probability weighted by evidence consensus scores, so that regions
with higher evidence consensus contained more pseudo-presences than lower
scoring areas. Since the occurrence data set is from a wide range of sources
and institutions, this procedure aims to mitigate sampling bias. By
referencing the evidence consensus layer for pseudo-data selection,
detection bias was also mitigated.
‘Ensemble’ analysis
There is no definitive procedure for choosing the best number of pseudo-data
points to generate the most accurate predictive map. To account for the
impact that these parameters might have on the model predictions, an
ensemble BRT model was constructed with multiple BRT submodels fitted using
pseudo-data points generated using different combinations of parameters
n,
n, and μ. The
numbers of pseudo-absences (n)
and pseudo-presences (n) were
defined as a proportion of the total number of actual data occurrence
records (6426 and 6137 for CL and VL). The proportions used for generating
pseudo-absences were 2:1, 4:1, 6:1, 8:1, and 10:1, and pseudo-presences were
0.025:1, 0.05:1 and 0.1:1. The pseudo-data were also generated within a
restricted maximum distance (μ) from any actual presence point, and
μ was varied through 5 distances: 5, 10, 15, 20, and 25 arc degrees.
All combinations of these parameter values resulted in a total of 75
(5n ×
3n × 5μ)
individual input data sets and BRT submodels (making up the BRT
ensemble).For each disease, the 75 BRT submodels were used to predict a range of
different risk maps (each at 5 km × 5 km resolution), and these were
combined to produce a single mean ensemble risk map for each disease, also
allowing for computation of the associated range of uncertainty in these
predictions for every 5 km × 5 km pixel as shown in Figure 1—figure supplement 1,
Figure 2—figure supplement
1, Figure 3—figure
supplement 1, Figure
4—figure supplement 1. For both diseases, the New World
(the Americas) and Old World (Eurasia and Africa) were modelled separately
in order to account for and explore any differences in the epidemiology of
the diseases between these regions. This was done to differentiate the
potential effect that the different vectors namely
Lutzomyiaspp. in the New World and
Phlebotomusspp. in the Old World and their varying
life histories, might have on the distribution of the diseases within these
regions.
Figure 1—figure supplement 1.
Uncertainty associated with predictions in Figure 1B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.
DOI:
http://dx.doi.org/10.7554/eLife.02851.004
Figure 2—figure supplement 1.
Uncertainty associated with predictions in Figure 2B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.
DOI:
http://dx.doi.org/10.7554/eLife.02851.006
Figure 3—figure supplement 1.
Uncertainty associated with predictions in Figure 3B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.
DOI:
http://dx.doi.org/10.7554/eLife.02851.008
Figure 4—figure supplement 1.
Uncertainty associated with predictions in Figure 4B.
Uncertainty was calculated as the range of the 95% confidence
interval in predicted probability of occurrence for each pixel.
Regions of highest uncertainty are in dark brown, with blue
representing low uncertainty.
DOI:
http://dx.doi.org/10.7554/eLife.02851.012
Summarising the BRT model
The relative importance of predictor variables was quantified for the final
BRT ensemble. Relative importance is defined as the number of times a
variable is selected for splitting, weighted by the squared improvement to
the model as a result of each split and averaged over all trees (Friedman, 2001). These contributions
are scaled to sum to 100, with a higher number indicating a greater effect
on the response. To evaluate the ensemble's predictive performance, we used
the area under the receiver operator curve (AUC) (Fleiss et al., 2003)—the area under a plot of
the true positive rate versus false positive rate, reflecting the ability to
discriminate between the presence and absence. An AUC value of 0.5 indicates
no discriminative ability, and a value of 1 indicates perfect
discrimination.It is important to note that this distribution modelling technique assesses
pixel level risk, rather than population level risk. As such, the ensemble
evaluates the likelihood of leishmaniasis presence based upon the covariates
supplied. In reality, some other factors, such as national healthcare
provisioning and standards of living will influence the true observed
burden. Therefore, whilst these two levels of risk are inherently related,
additional information, namely incidence data from many different
populations, is required in order to assess the link quantitatively (Bhatt et al., 2013).
Estimation of population living in areas of environmental risk
Population living in areas of risk was estimated by using a threshold probability
to reclassify the probabilistic risk maps into a binary risk map, then
extracting the total human population in the ‘at risk’ areas using
a gridded data set of human population density from 2010 (Balk et al., 2006; CIESIN/IFPRI/WB/CIAT, 2007). The threshold value was set such that
95% of the point occurrence records fell within the at risk area. 5% of
occurrence points were allowed to fall outside the predicted risk area to
account for errors which could have arisen either from errors in the occurrence
data set or from inaccuracies in the predicted risk maps.For external validation, this population at risk information was compared to
national reported annual cases (Alvar et al.,
2012) to produce Figure
4—figure supplement 4. In these figures, the points represent
the mean value of the estimated annual incidence reported taking into account
the authors estimates of underreporting rates (Alvar et al., 2012). The upper and lower limits to these estimates
are reflected by the bars around each point. Note that these figures use a
log-scale on each axis and that only countries with non-zero estimates by Alvar et al. (2012) are included.The threshold probabilities of occurrence used to define ‘at risk’
were as follows: NW CL—0.22, OW CL—0.19, NW VL—0.42, OW
VL—0.19.eLife posts the editorial decision letter and author response on a selection of
the published articles (subject to the approval of the authors). An edited
version of the letter sent to the authors after peer review is shown, indicating
the substantive concerns or comments; minor concerns are not usually shown.
Reviewers have the opportunity to discuss the decision before the letter is sent
(see review
process). Similarly, the author response typically shows only
responses to the major concerns raised by the reviewers.Thank you for sending your work entitled “Global Distribution Maps of the
Leishmaniases” for consideration at eLife. Your article has
been favorably evaluated by Prabhat Jha (Senior editor), a Reviewing editor, and 3
reviewers.The Reviewing editor and the other reviewers discussed their comments before we
reached this decision, and the Reviewing editor has assembled the following comments
to help you prepare a revised submission.1) This work on mapping of leishmaniasis is unique and impressive in its scope and
depth, and makes for the most comprehensive overview of the leishmaniasis burden
worldwide to date. Rightly so, climatic as well as socio-economic factors were taken
into account to predict the risk of leishmaniasis. Indeed, this work will be able to
guide health authorities in future surveillance activities. However, to serve this
purpose it would be extremely helpful if the maps were presented in a format where
it would be possible to 'zoom in' so that the geographical locations can
be more easily identified. Specifically, the detail of the global prediction maps in
Figures 3 and 4 are difficult to
see. The authors could consider including larger insert maps for the major endemic
areas, e.g. east Africa and Indian subcontinent for VL.2) In Asia as well as in Africa, VL caused by L.donovani typically
presents as epidemics, with the case load rising and falling over a period of 5-10
years, probably dependent on climatic factors as rainfall, and thus presenting as a
varying burden to countries. Similarly, CL caused by L. major and
L. tropica are prone to epidemics. Please address in the
Discussion.3) A complete data review was used for establishing the evidence consensus for
presence of leishmaniasis. However, in any country where the appropriate vector for
transmission has not been confirmed according to the criteria set in 'Control
of the Leishmaniasis' (WHO, TRS 949, 2010) it cannot be assumed either that
leishmaniasis is endemic, or that the area is suitable for leishmaniasis
transmission. It is unclear whether this has been taken into account; if not, please
refer to 'Control of the Leishmaniasis' where expert consensus on vector
presence in each country is compiled. An example is Taiwan: according to map 3A
there is an area of confirmed CL presence, yet the vector for transmitting
L. tropica has not been confirmed.Minor comments:4) Differences in sandfly ecology. Different sandfly species have distinct ecologies
and habitat preference (for example Phlebotomus orientalis and
P. martini in east Africa) and the authors should explain how
such differences are taken into account.5) Classification of contemporariness. Provide a justification as to the year bins
used.6) Pseudo-presence data. The generation of such data was not clear and the authors
should provide further details.7) “We provide estimates of the populations at risk in 90 countries for which
no human cases of CL or VL were reported.” This is interesting information
but we did not find it presented obviously in the article.8) Furthermore, significant anthroponotic transmission of both L.
infantum and L. donovani occurs across much of the Old
World with zoonotic cycles of VL primarily tied to canine hosts. While transmission
of L. donovani is anthroponotic, there is no anthroponotic
transmission of L. infantum where transmission is entirely zoonotic
via canine hosts.9) “In the Old World the main endemic CL areas are due to
anthroponotically-transmitted L. tropica”. True, but a
significant case load is also caused by zoonotic L. major in Old
World CL.1) This work on mapping of leishmaniasis is unique and impressive in its
scope and depth, and makes for the most comprehensive overview of the
leishmaniasis burden worldwide to date. Rightly so, climatic as well as
socio-economic factors were taken into account to predict the risk of
leishmaniasis. Indeed, this work will be able to guide health authorities in
future surveillance activities. However, to serve this purpose it would be
extremely helpful if the maps were presented in a format where it would be
possible to 'zoom in' so that the geographical locations can be more
easily identified. Specifically, the detail of the global prediction maps
in
are difficult to see. The authors could consider including larger insert
maps for the major endemic areas, e.g. east Africa and Indian subcontinent for
VL.We agree with the reviewers and have therefore supplied 4 figure supplements
detailing a more close up view of CL in northeast Africa and across the Near East,
and of VL in northeast Africa and the Indian subcontinent. These panels depict areas
of the highest incidence of both CL and VL. In addition, maps can be provided to
individuals upon request to the authors.2) In Asia as well as in Africa, VL caused by L. donovani
typically presents as epidemics, with the case load rising and falling
over a period of 5-10 years, probably dependent on climatic factors as rainfall,
and thus presenting as a varying burden to countries. Similarly, CL caused
by L. major and L. tropica are prone to
epidemics. Please address in the Discussion.Whilst this is a crucial component of burden estimation, we believe this
doesn’t directly impact on the BRT models used, since these are reliant on
presence/absence data, not total numbers of cases in a given year. In parts of the
evidence consensus generation process where temporal data is important, such as
contemporariness score for peer-review data and the scoring of case data series, the
temporal divisions used to analyse these data is sufficiently broad to accommodate
most inter-annual variation. However, we have added a section in the Discussion
highlighting that this characteristic will further complicate potential burden
estimation:“A further complication with burden estimation is the epidemic nature of the
disease, as evidenced by the national case time series in , leading
to significant interannual variation in burden. Therefore, any burden estimation
would have to account for this and the temporal spread of data would therefore be
critical.”3) A complete data review was used for establishing the evidence consensus
for presence of leishmaniasis. However, in any country where the appropriate
vector for transmission has not been confirmed according to the criteria set in
'Control of the Leishmaniasis' (WHO, TRS 949, 2010) it cannot be
assumed either that leishmaniasis is endemic, or that the area is suitable for
leishmaniasis transmission. It is unclear whether this has been taken into
account; if not, please refer to 'Control of the Leishmaniasis' where
expert consensus on vector presence in each country is compiled. An example is
Taiwan: according to map 3A there is an area of confirmed CL presence, yet the
vector for transmitting L. tropica has not been
confirmed.From the outset we set out to model reported cases of leishmaniasis infection and
disease in humans, therefore the evidence consensus was primarily driven by evidence
of local autochthonous transmission of the disease. Whilst in some cases the vector
species is unknown or unproven, this may just as equally reflect the rarity of the
disease in this area (and hence little knowledge available on vector species) rather
than necessarily the suggestion of local transmission being incorrect. As a result,
we chose to prioritise evidence of autochthonous cases of disease. Where there was
insufficient evidence pertaining to human cases, information concerning vector and
reservoir distributions was also considered, and this was taken from reports in the
literature. In the regions where this was considered, the findings were consistent
with the WHO Technical Report, apart from the presence of sandflies (not proven to
transmit disease locally) in two regions of Tanzania. In the specific case of
Taiwan, several cases have been reported as being locally derived (as outlined in
the evidence consensus tables in the Dryad dataset) and therefore the evidence
consensus scores Taiwan as likely to have the disease present. The region scores
+53.33%, therefore indicating that this is not unanimously agreed upon by the
various sources consulted, however this is supported by GIDEON and the paper.
The WHO technical report also indicates that P. kiangsuensis could
act as a potential vector. In order to better clarify this situation, we have
clarified the text in three places:a) “(ii) peer-reviewed evidence of local autochthonous
transmission”b) “Cases were included if there was sufficient evidence to suggest that local
autochthonous transmission had occurred”c) “In some locations, cases have been reported as locally transmitted without
the presence of proven vector species, which could indicate a false positive.
However, the overall consensus score will reflect any uncertainty associated with
the validity of these reports; if multiple independent sources report autochthonous
cases, this increased certainty will be reflected in a higher consensus
score.”Minor comments:4) Differences in sandfly ecology. Different sandfly species have distinct
ecologies and habitat preference (for example Phlebotomus orientalis
and P. martini in east Africa) and the authors should
explain how such differences are taken into account.We have reinforced the relevant section in the Discussion relating to the flexibility
of BRT and how it can deal with complexity:“The complexity and diversity of transmission cycles involving not just
humans, but also a multitude of vectors and reservoirs, necessitated a modelling
approach which can account for highly non-linear effects of covariates on
probability of disease presence. The BRT modelling approach employed is able to do
this and has previously been shown to produce highly accurate predictions across a
wide range of species. This ecological niche modelling approach is therefore able to
deal with not only the variation in parasites causing infection, but also the
various life-histories and habitat preferences associated with the different vector
species.”5) Classification of contemporariness. Provide a justification as to the year
bins used.We have added the following in light of this comment:“Contemporariness bins were based upon the potentially lengthy intrinsic
incubation periods present with some Leishmania spp. as well as to
accommodate the potential for epidemic cycles, where cases may only be detected in
peak years and missed in the intervening baseline periods.”6) Pseudo-presence data. The generation of such data was not clear and the
authors should provide further details.We have added some more details to those already listed in the document to help
clarify. The manner in which the pseudo-presence data was incorporated as a
numerical parameter in the BRT process can be found in the paragraph entitled
Ensemble analysis:“As in points were randomly located in regions above an evidence
consensus threshold of -25, with regional placement probability weighted by evidence
consensus scores, so that regions with higher evidence consensus contained more
pseudo-presences than lower scoring areas.”7) “We provide estimates of the populations at risk in 90 countries
for which no human cases of CL or VL were reported.” This is interesting
information but we did not find it presented obviously in the
article.We have provided via the Dryad dataset associated with this output, tables detailing
national estimates of populations living in areas of environmental risk of
leishmaniasis. We have also inserted the following section as a synoptic overview of
these countries:“A full table of this information is presented in the associated Dryad
database (doi:10.5061/dryad.05f5h). For many of these countries, Alvar et al. (2012) reported a handful of
sporadic cases over the years indicating very rare occurrence of infection, whilst
the remainder were countries with inconclusive evidence of disease presence or
absence.”8) Furthermore, significant anthroponotic transmission of both L.
infantum and L. donovani occurs across much of the Old
World with zoonotic cycles of VL primarily tied to canine hosts. While
transmission of L. donovani is anthroponotic, there is no
anthroponotic transmission of L. infantum where transmission is
entirely zoonotic via canine hosts.We have changed this section to reflect this comment:“Furthermore, whilst significant anthroponotic transmission of L.
donovani occurs across parts of the Old World, zoonotic cycles of VL,
primarily tied to canine hosts, dominate L. infantum transmission
(Chamaille et al., 2010; Ready, 2013), with infection in dogs shown to
be closely associated with human population density.”9) “In the Old World the main endemic CL areas are due to
anthroponotically-transmitted L. tropica”. True, but a significant case
load is also caused by zoonotic L. major in Old World
CL.We have changed this section to reflect the fact that climatic factors have differing
relative influences between the Old World and New World. Table 2 demonstrates that whilst periurban extents are the
most important predictor of Old World CL, temperature and to a lesser extent,
precipitation, have a non-negligible influence, reflecting the two core
epidemiologies present with L. tropica and L.
major:“This difference in the relative importance of climatic drivers reflects the
fact that in the Old World the main endemic CL areas are due to both
anthroponotically-transmitted L. tropica as well as zoonotic cycles
of L. major, whereas in the New World the disease is primarily
associated with sylvatic and zoonotic cycles with a variety of different
Leishmania spp. and wild reservoir hosts implicated (Ashford, 1996; Lima et al., 2013; Ready,
2013; Reithinger et al., 2007;
WHO, 2010).”
Authors: Dia-Eldin A Elnaiem; Judith Schorscher; Anna Bendall; Valérie Obsomer; Maha E Osman; Abdelrafie M Mekkawi; Stephen J Connor; Richard W Ashford; Madeleine C Thomson Journal: Am J Trop Med Hyg Date: 2003-01 Impact factor: 2.345
Authors: Fernanda O Novais; Lucas P Carvalho; Joel W Graff; Daniel P Beiting; Gordon Ruthel; David S Roos; Michael R Betts; Michael H Goldschmidt; Mary E Wilson; Camila I de Oliveira; Phillip Scott Journal: PLoS Pathog Date: 2013-07-18 Impact factor: 6.823
Authors: Jane P Messina; Oliver J Brady; David M Pigott; Nick Golding; Moritz U G Kraemer; Thomas W Scott; G R William Wint; David L Smith; Simon I Hay Journal: Nat Rev Microbiol Date: 2015-03-02 Impact factor: 60.633
Authors: F Matthew Kuhlmann; John I Robinson; Gregory R Bluemling; Catherine Ronet; Nicolas Fasel; Stephen M Beverley Journal: Proc Natl Acad Sci U S A Date: 2017-01-17 Impact factor: 11.205
Authors: Erin A Brettmann; Jahangheer S Shaik; Haroun Zangger; Lon-Fye Lye; F Matthew Kuhlmann; Natalia S Akopyants; Dayna M Oschwald; Katherine L Owens; Suzanne M Hickerson; Catherine Ronet; Nicolas Fasel; Stephen M Beverley Journal: Proc Natl Acad Sci U S A Date: 2016-10-18 Impact factor: 11.205
Authors: Hongjie Guo; Sebastian Damerow; Luciana Penha; Stefanie Menzies; Gloria Polanco; Hicham Zegzouti; Michael A J Ferguson; Stephen M Beverley Journal: Proc Natl Acad Sci U S A Date: 2021-08-17 Impact factor: 12.779
Authors: E O Nsoesie; M U Kraemer; S I Hay; J S Brownstein; N Golding; D M Pigott; O J Brady; C L Moyes; M A Johansson; P W Gething; R Velayudhan; K Khan Journal: Euro Surveill Date: 2016-05-19