Literature DB >> 27897973

Structure in the variability of the basic reproductive number (R₀) for Zika epidemics in the Pacific islands.

Clara Champagne^1,2, David Georges Salthouse¹, Richard Paul^3,4, Van-Mai Cao-Lormeau⁵, Benjamin Roche⁶, Bernard Cazelles^1,6.

Abstract

Before the outbreak that reached the Americas in 2015, Zika virus (ZIKV) circulated in Asia and the Pacific: these past epidemics can be highly informative on the key parameters driving virus transmission, such as the basic reproduction number (R0). We compare two compartmental models with different mosquito representations, using surveillance and seroprevalence data for several ZIKV outbreaks in Pacific islands (Yap, Micronesia 2007, Tahiti and Moorea, French Polynesia 2013-2014, New Caledonia 2014). Models are estimated in a stochastic framework with recent Bayesian techniques. R0 for the Pacific ZIKV epidemics is estimated between 1.5 and 4.1, the smallest islands displaying higher and more variable values. This relatively low range of R0 suggests that intervention strategies developed for other flaviviruses should enable as, if not more effective control of ZIKV. Our study also highlights the importance of seroprevalence data for precise quantitative analysis of pathogen propagation, to design prevention and control strategies.

Entities: Chemical Disease Gene Species

Keywords: Basic reproductive number; Zika virus; bayesian inference; epidemiology; global health; none; stochastic dynamical models

Mesh：

Year: 2016 PMID： 27897973 PMCID： PMC5262383 DOI： 10.7554/eLife.19874

Source DB: PubMed Journal: Elife ISSN： 2050-084X Impact factor: 8.140

Introduction

In May 2015, the first local cases of Zika were recorded in Brazil and by December of the same year the number of cases had surpassed 1.5 million. On February 2016, the World Health Organization declared Zika as a public health emergency of international concern (Who, 2016) and in March 2016, local transmission of Zika was recognized in 34 countries. Previously the Zika virus had circulated in Africa and Asia but only sporadic human cases had been reported. In 2007 the outbreak on Yap (Micronesia) was the first Zika outbreak outside Africa and Asia (Duffy et al., 2009). Since, Zika outbreaks have been also reported in French Polynesia and in New Caledonia (Cao-Lormeau et al., 2014; Dupont-Rouzeyrol et al., 2015) between 2013 and 2014 and subsequently, there have been cases of Zika disease in the Cook Islands, the Solomon Islands, Samoa, Vanuatu, and Easter Island (Chile) (see Figure 1 in Petersen et al. [2016]). Zika virus (ZIKV) is a flavivirus, mostly transmitted via the bites of infected Aedes mosquitoes, although non-vector-borne transmission has been documented (sexual and maternofoetal transmission, laboratory contamination, transmission through transfusion) (Musso and Gubler, 2016). The most common clinical manifestations include rash, fever, arthralgia, and conjunctivitis (Musso and Gubler, 2016) but a large proportion of infections are asymptomatic or trigger mild symptoms that can remain unnoticed. Nevertheless, the virus may be involved in many severe neurological complications, including Guillain-Barre syndrome (Cao-Lormeau et al., 2016) and microcephaly in newborns (Schuler-Faccini et al., 2015). These complications and the impressive speed of its geographically propagation make the Zika pandemic a public health threat (Who, 2016). This reinforces the urgent need to characterize the different facets of virus transmission and to evaluate its dispersal capacity. We address this here by estimating the key parameters of ZIKV transmission, including the basic reproduction number (), based on previous epidemics in the Pacific islands. Defined as the average number of secondary cases caused by one typical infected individual in an entirely susceptible population, the basic reproduction number () is a central parameter in epidemiology used to quantify the magnitude of ongoing outbreaks and it provides insight when designing control interventions (Diekmann et al., 2010). It is nevertheless complex to estimate (Diekmann et al., 2010; van den Driessche and Watmough, 2002), and therefore, care must be taken when extrapolating the results obtained in a specific setting, using a specific mathematical model. In the present study, we explore the variability of using two models in several settings that had Zika epidemics in different years and that vary in population size (Yap, Micronesia 2007, Tahiti and Moorea, French Polynesia 2013–2014, and New Caledonia 2014). These three countries were successively affected by the virus, resulting in the first significant human outbreaks and they differ in several ways, including population size and location specific features. Hence, the comparison of their parameter estimates can be highly informative on the intrinsic variability of . For each setting, we compare two compartmental models using different parameters defining the mosquito populations. Both models are considered in a stochastic framework, a necessary layer of complexity given the small population size and recent Bayesian inference techniques (Andrieu et al., 2010) are used for parameter estimation.

Results

We use mathematical transmission models and data from surveillance systems and seroprevalence surveys for several ZIKV outbreaks in Pacific islands (Yap, Micronesia 2007 (Duffy et al., 2009), Tahiti and Moorea, French Polynesia 2013–2014 (CHSP, 2014; Mallet et al., 2015; Aubry et al., 2015b), New Caledonia 2014 [DASS, 2014]) to quantify the ZIKV transmission variability. Two compartmental models with vector-borne transmission are compared (cf. Figure 1). Both models use a Susceptible-Exposed-Infected-Resistant (SEIR) framework to describe the virus transmission in the human population, but differ in their representation of the mosquito population. Figure 1a is a schematic representation derived from Pandey et al. (2013) and formulates explicitly the mosquito population, with a Susceptible-Exposed-Infected (SEI) dynamic to account for the extrinsic incubation period (time taken for viral dissemination within the mosquito).

Figure 1.

Graphical representation of compartmental models.

DOI: http://dx.doi.org/10.7554/eLife.19874.003

Graphical representation of compartmental models.

Squared boxes and circles correspond respectively to human and vector compartments. Plain arrows represent transitions from one state to the next. Dashed arrows indicate interactions between humans and vectors. (a) Pandey model (Pandey et al., 2013). susceptible individuals; infected (not yet infectious) individuals; infectious individuals; recovered individuals; is the rate at which -individuals move to infectious class ; infectious individuals () then recover at rate ; susceptible vectors; infected (not yet infectious) vectors; infectious vectors; constant size of total mosquito population; is the rate at which -vectors move to infectious class ; vectors die at rate . (b) Laneri model (Laneri et al., 2010). susceptible individuals; infected (not yet infectious) individuals; infectious individuals; recovered individuals; is the rate at which -individuals move to infectious class ; infectious individuals () then recover at rate ; implicit vector-borne transmission is modeled with the compartments and ; current force of infection; latent force of infection reflecting the exposed state for mosquitoes during the extrinsic incubation period; is the transition rate associated to the extrinsic incubation period. DOI: http://dx.doi.org/10.7554/eLife.19874.003 By contrast, in the second model (Figure 1b) based on Laneri et al. (2010) the vector is modeled implicitly: the two compartments and do not represent the mosquito population but the force of infection for vector to human transmission. This force of infection passes through two successive stages in order to include the delay associated with the extrinsic incubation period: stands for this latent phase of the force of infection whereas corresponds directly to the rate at which susceptible humans become infected. The basic reproduction number of the models () is calculated using the next Generation Matrix method (Diekmann et al., 2010): In addition, we consider that only a fraction of the total population is involved in the epidemic, due to spatial heterogeneity, immuno-resistance, or cross-immunity. For both models we define with H the total size of the population reported by census. The dynamics of ZIKV transmission in these islands is highly influenced by several sources of uncertainties. In particular, the small population size (less than 7000 inhabitants in Yap) leads to high variability in transmission rates. Therefore all these models are simulated in a discrete stochastic framework (Poisson with stochastic rates [Bretó et al., 2009]), to take this phenomenon into account. Stochasticity requires specific inference techniques: thus estimations are performed with PMCMC algorithm (particle Markov Chain Monte Carlo [Andrieu et al., 2010]). Using declared Zika cases from different settings, the two stochastic models (Figure 1) were fitted (Figures 2–3). These models allow us to describe the course of the observed number of cases and estimate the number of secondary cases generated, . Our estimates of lie between 1.6 (1.5–1.7) and 3.2 (2.4–4.1) and vary notably with respect to settings and models (Figures 2–3 and Tables 1–2). Strikingly, Yap displays consistently higher values of in both models and in general, there is an inverse relationship between island size and both the value and variability of . This phenomenon may be explained by the higher stochasticity and extinction probability associated with smaller populations and can also reflect the information contained in the available data. However, the two highly connected islands in French Polynesia, Tahiti and Moorea, display similar values despite their differing sizes.

Figure 2.

Results using the Pandey model.

Posterior median number of observed Zika cases (solid line), 95% credible intervals (shaded blue area) and data points (black dots). First column: particle filter fit. Second column: Simulations from the posterior density. Third column: posterior distribution. (a) Yap. (b) Moorea. (c) Tahiti. (d) New Caledonia. The estimated seroprevalences at the end of the epidemic (with 95% credibility intervals) are: (a) 73% (CI95: 68–77, observed 73%); (b) 49% (CI95: 45–53, observed 49%); (c) 49% (CI95: 45–53, observed 49%); (d) 39% (CI95: 8–92). See Figure 4.

DOI: http://dx.doi.org/10.7554/eLife.19874.004

Figure 3.

Results using the Laneri model.

Posterior median number of observed Zika cases (solid line), 95% credible intervals (shaded blue area) and data points (black dots). First column: particle filter fit. Second column: Simulations from the posterior density. Third column: posterior distribution. (a) Yap. (b) Moorea. (c) Tahiti. (d) New Caledonia. The estimated seroprevalences at the end of the epidemic (with 95% credibility intervals) are: (a) 72% (CI95: 68–77, observed 73%); (b) 49% (CI95: 45–53, observed 49%); c) 49% (CI95: 45–53, observed 49%); d) 65% (CI95: 24–91). See Figure 5.

DOI: http://dx.doi.org/10.7554/eLife.19874.005

Table 1.

Parameter estimations for the Pandey model. Posterior median (95% credible intervals). All the posterior parameter distributions are presented in Figures 6–9 .

DOI: http://dx.doi.org/10.7554/eLife.19874.006

Pandey model		Yap	Moorea	Tahiti	New Caledonia
Population size	H	6892	16,200	178,100	268,767
Basic reproduction number	R0	3.2 (2.4–4.1)	2.6 (2.2–3.3)	2.4 (2.0–3.2)	2.0 (1.8–2.2)
Observation rate	r	0.024 (0.019-0.032)	0.058 (0.048-0.073)	0.060 (0.050-0.073)	0.024 (0.010-0.111)
Fraction of population involved	ρ	74% (69–81)	50% (48–54)	50% (46–54)	40% (9–96)
Initial number of infected individuals	HI(0)	2 (1–8)	5 (0–21)	329 (16–1047)	37 (1–386)
Infectious period in human (days)	γ-1	5.2 (4.1–6.7)	5.2 (4.1–6.8)	5.2 (4.1–6.7)	5.5 (4.2–6.8)
Extrinsic incubation period in mosquito (days)	τ-1	10.6 (8.7–12.5)	10.5 (8.6–12.4)	10.5 (8.6–12.6)	10.7 (8.9–12.5)
Mosquito lifespan (days)	μ-1	15.6 (11.7–19.3)	15.3 (11.4–19.1)	15.1 (11.2–19.0)	15.4 (11.6–19.4)

Table 2.

Parameter estimations for the Laneri model. Posterior median (95% credible intervals). All the posterior parameter distributions are presented in Figures 10–13.

DOI: http://dx.doi.org/10.7554/eLife.19874.007

Laneri model		Yap	Moorea	Tahiti	New Caledonia
Population size	H	6892	16,200	178,100	268,767
Basic reproduction number	R0	2.2 (1.9–2.6)	1.8 (1.6–2.0)	1.6 (1.5–1.7)	1.6 (1.5–1.7)
Observation rate	r	0.024 (0.019–0.033)	0.057 (0.047–0.07)	0.057 (0.049–0.069)	0.014 (0.010–0.037)
Fraction of population involved	ρ	73% (69–78)	51% (47–55)	54% (49–59)	71% (27–98)
Initial number of infected individuals	HI(0)	2 (1–10)	9 (1–28)	667 (22–1570)	82 (2–336)
Infectious period in human (days)	γ-1	5.3 (4.1–6.6)	5.3 (4.1–6.7)	5.2 (4.1–6.7)	5.4 (4.1–6.8)
Extrinsic incubation period in mosquito (days)	τ-1	10.7 (8.8–12.7)	10.6 (8.6–12.6)	10.5 (8.5–12.5)	10.8 (8.9–12.8)

Results using the Pandey model.

Figure 4.

Infected and recovered humans evolution during the outbreak with Pandey model.

Simulations from the posterior density: posterior median (solid line), 95% and 50% credible intervals (shaded blue areas) and observed seroprevalence (black dots). First column: Infected humans (). Second column: Recovered humans (). (a) Yap. (b) Moorea. (c) Tahiti. (d) New Caledonia.

DOI: http://dx.doi.org/10.7554/eLife.19874.013

DOI: http://dx.doi.org/10.7554/eLife.19874.004

Results using the Laneri model.

Posterior median number of observed Zika cases (solid line), 95% credible intervals (shaded blue area) and data points (black dots). First column: particle filter fit. Second column: Simulations from the posterior density. Third column: posterior distribution. (a) Yap. (b) Moorea. (c) Tahiti. (d) New Caledonia. The estimated seroprevalences at the end of the epidemic (with 95% credibility intervals) are: (a) 72% (CI95: 68–77, observed 73%); (b) 49% (CI95: 45–53, observed 49%); c) 49% (CI95: 45–53, observed 49%); d) 65% (CI95: 24–91). See Figure 5.

Figure 5.

Infected and recovered humans evolution during the outbreak with Laneri model.

DOI: http://dx.doi.org/10.7554/eLife.19874.014

DOI: http://dx.doi.org/10.7554/eLife.19874.005 Parameter estimations for the Pandey model. Posterior median (95% credible intervals). All the posterior parameter distributions are presented in Figures 6–9 .

Figure 6.

Posterior distributions.

Pandey model, Yap island.

DOI: http://dx.doi.org/10.7554/eLife.19874.015

Figure 9.

Posterior distributions.

Pandey model, New Caledonia.

DOI: http://dx.doi.org/10.7554/eLife.19874.018

DOI: http://dx.doi.org/10.7554/eLife.19874.006 Parameter estimations for the Laneri model. Posterior median (95% credible intervals). All the posterior parameter distributions are presented in Figures 10–13.

Figure 10.

Posterior distributions.

Laneri model, Yap island.

DOI: http://dx.doi.org/10.7554/eLife.19874.019

Figure 13.

Posterior distributions.

Laneri model, New Caledonia.

DOI: http://dx.doi.org/10.7554/eLife.19874.022

DOI: http://dx.doi.org/10.7554/eLife.19874.007 Regarding model variability, estimates are always higher and coarser with the Pandey model than with the Laneri model (cf. Tables 1–2). The Pandey model has two additional estimated parameters (in particular, the mosquito lifespan), which can explain the higher variability of the output. It is worth noting that these parameters are very sensitive (see Materials and methods). The difference in may also be linked to the difference in the estimated initial number of infected individuals (), which is higher in the Laneri model than in the Pandey model. Because of the high proportion of asymptomatic cases (the ratio of asymptomatic:symptomatic is estimated to be 1:1.3, V.-M Cao-Lormeau personal communication), it is hard to determine which scenario is more realistic, the time between introduction of the disease into the island and the first reported symptomatic case being unknown in most settings. For the durations of infectious and intrinsic incubation (in human) and extrinsic incubation (in mosquito) periods, the posterior density ressembles the informative prior (cf. Figures 6–13), indicating the models’ incapacity to identify properly these parameters without more informative data. Moreover, these parameters have a clear sensitivity (see Materials and methods) and precise field measures are therefore crucial for reliable model predictions. The fraction of the population involved in the epidemic is well estimated when the seroprevalence is known (in Yap and French Polynesia). In these cases, the proportion of the population involved is slightly greater than the seroprevalence rate, indicating a very high infection rate among involved individuals. In New Caledonia, as no information on seroprevalence was available, the fraction of population involved displays very large confidence intervals (cf. Tables 1 and 2), indicating that the model did not manage to identify properly this parameter with the available data. In this case, this parameter is highly correlated to the observation rate (cf Figures 17 and 21): and seem unidentifiable without precise information on seroprevalence.

Discussion

The reproduction number is a key parameter in epidemiology that characterizes the epidemic dynamics and the initial spread of the pathogen at the start of an outbreak in a susceptible population. can be used to inform public health authorities on the level of risk posed by an infectious disease, vaccination strategy, and the potential effects of control interventions (Anderson and May, 1982). In the light of the potential public health crisis generated by the international propagation of ZIKV, characterization of the potential transmissibility of this pathogen is crucial for predicting epidemic size, rate of spread and efficacy of intervention. Using data from both surveillance systems and seroprevalence surveys in four different geographical settings across the Pacific (Duffy et al., 2009; CHSP, 2014; Mallet et al., 2015; DASS, 2014; Aubry et al., 2015b), we have estimated the basic reproductive number (see Figures 2–3 and Tables 1–2). Our estimate of obtained by inference based on Particle MCMC (Andrieu et al., 2010) has values in the range 1.6 (1.5–1.7) – 3.2 (2.4–4.1). Our estimates vary notably across settings. Lower and finer values are found in larger islands. This phenomenon can at least in part be explained by large spatial heterogeneities and higher demographic stochasticity for islands with smaller populations, as well as the influence of stochasticity on biological and epidemiological processes linked to virus transmission. This phenomenon can also be specific to the selection of the studied islands or can reflect a highly clustered geographical pattern, the global incidence curve being the smoothed overview of a collection of more explosive small size outbreaks. However, it is notable that the two French Polynesian islands yield similar estimates of despite differing population sizes. Indeed, other important factors differ among French Polynesia, New Caledonia and Yap, such as the human genetic background and their immunological history linked to the circulation of others arboviruses. Moreover, whilst both New Caledonia and French Polynesia populations were infected by the same ZIKV lineage and transmitted by the same principle vector species, Aedes aegypti, the epidemic in Yap occurred much earlier with a different ZIKV lineage (Wang et al., 2016) and vectored by a different mosquito species Aedes hensilli (Ledermann et al., 2014). In French Polynesia, the vector Aedes polynesiensis is also present and dominates in Moorea with higher densities than in Tahiti. Finally, different vector control measures have been conducted in the three countries. To date, studies investigating Zika outbreaks in the Pacific have always estimated using a deterministic framework. Using a similar version of the Pandey model in French Polynesia, Kucharski et al. (Kucharski et al., 2016) estimated between 1.6 and 2.3 (after scaling to square root for comparison) for Tahiti and between 1.8 and 2.9 in Moorea. These estimates are slightly lower and less variable than ours. This difference can be explained firstly by the chosen priors on mosquito parameters and secondly because our model includes demographic stochasticity. Moreover, they predicted a seroprevalence rate at the end of the epidemic of 95–97%, far from the 49% measured. In Yap island, a study (Funk et al., 2016) used a very detailed deterministic mosquito model, and estimated an for Zika between 2.9 and 8. In this case, our lower and less variable estimates may come from the fact that our model is more parsimonious in the number of uncertain parameters, especially concerning the mosquito population. Finally, a third study (Nishiura et al., 2016a) relied on another method for calculation (based on the early exponential growth rate of the epidemic) in French Polynesia as a whole and in Yap. Again, the obtained parameters are lower than ours in French Polynesia and higher in Yap. The first estimations for South America using a similar methodology (Nishiura et al., 2016b; Towers et al., 2016; Gao et al., 2016) lead to similar values. In all these studies a deterministic framework is used excluding the possibility of accounting for the high variability of biological and epidemiological processes exacerbated by the small size of the population. In these three studies, like in ours, it is worth noting that little insight is obtained regarding mosquito parameters. Posterior distribution mimics the chosen prior (cf. Figures 6–13). Both the simulation of the epidemics and the estimated are highly sensitive to the choice of priors on mosquito parameters, for which precise field measures are rare. In the absence of sufficient data, the modeling of mosquito-borne pathogen transmission is a difficult task due to non-linearity and non-stationarity of the involved processes (Cazelles and Hales, 2006). This work has then several limitations. First, our study is limited by the completeness and quality of the data, with regard to both incidence and seroprevalence, but, above all, by the scarcity of information available on mosquitoes. Incidence data is aggregated at the island scale and cannot disentangle the effects of geography and observation noise to explain bimodal curves observed in Yap and New Caledonia. Moreover, although all data came from national surveillance systems, we had very little information about the potential degree of under-reporting, especially due to the high proportion of mildly symptomatic cases, at a time when the dangerous complications associated with the virus were unknown. Moreover, some cases might have been misdiagnosed as other flaviviruses, due to similarity in clinical manifestations or cross-reactivity in assays. Seroprevalence data were gathered from small sample sizes and were also sensitive to cross reactivity in populations non naive to dengue. In addition, they were missing in New Caledonia, which leads to strong correlation between our estimation of the observation rate and the fraction of the population involved in the epidemic. Because of the high proportion of asymptomatic or mildly symptomatic cases, the magnitude of the outbreaks is difficult to evaluate without precise seroprevalence data (Metcalf et al., 2016) or detection of mild, asymptomatic or pre-symptomatic infections (Thompson et al., 2016). Considering vectors, no demographic data were available and this partly explains the large variability of our estimations. Secondly, incidence and seroprevalence data were difficult to reconcile; the use of incidence data led to higher infection rates than those observed in the seroprevalence data. This difficulty has been overcome by considering that only a fraction of the population () is involved in the epidemic and then our model manages to reproduce the observed seroprevalence rate. This exposed fraction could be the result of spatial heterogeneity and high clustering of cases and transmission, as observed for dengue. The small dispersal of the mosquito and the limited population inter-mingling likely leads to considerable spatial variation in the extent of exposure to the virus and pockets of refugia in Tahiti as elsewhere (Telle et al., 2016). For instance, previous dengue infection rates in French Polynesia display large spatial variations even within islands (Daudens et al., 2009). Finer scale incidence and seroprevalence data would be useful to explore this. Another explanation for higher predicted than observed infection rates could be due to interaction with other flaviviruses. The Zika outbreak was concomitant with dengue outbreaks in French Polynesia (CHSP, 2014; Mallet et al., 2015) and New Caledonia (DASS, 2014). Examples of coinfection have been reported (Dupont-Rouzeyrol et al., 2015) but competition between these close pathogens may also have occurred. Finally, mathematical models with vectorial transmission may tend to estimate high attack rates, sometimes leading to a contradiction between observed incidence and observed seroprevalence. Assumptions on the proportionality between infected mosquitoes and the force of infection, as well as the density-dependence assumption in these models could be questioned. Indeed even if these assumptions are at the heart of the mathematical models of mosquito-borne pathogen transmission (Reiner et al., 2013; Smith et al., 2014), a recent review (Halstead, 2008) and recent experimental results (Bowman et al., 2014; Harrington et al., 2014) question these important points. On a final note, the estimates of for ZIKV are similar to but generally on the lower side of estimates made for two other flaviviruses of medical importance, dengue and Yellow Fever viruses (Favier et al., 2006; Imai et al., 2015; Massad et al., 2003), even though caution is needed in the comparison of studies with differing models, methods and data sources. Interventions strategies developed for dengue should thus enable as, if not more effective control of ZIKV, with the caveat that ZIKV remains principally a mosquito-borne pathogen with little epidemiological significance of the sexual transmission route. Even though further work and data are needed to support this hypothesis (Brauer et al., 2016), two recent studies indicated that sexual transmission alone is not sufficient to sustain an epidemic (Gao et al., 2016; Towers et al., 2016). In conclusion, using recent stochastic modeling methods, we have been able to determine estimates of for ZIKV with an unexpected relationship with population size. Further data from the current Zika epidemic in South America that is caused by the same lineage as French Polynesia lead to estimates in the same range of values (Nishiura et al., 2016b; Towers et al., 2016; Gao et al., 2016). Our study highlights the importance of gathering seroprevalence data, especially for a virus that often leads to an asymptomatic outcome and it would provide a key component for precise quantitative analysis of pathogen propagation to enable improved planning and implementation of prevention and control strategies.

Materials and methods

Data

During the 2007 outbreak that struck Yap, 108 suspected or confirmed Zika cases (16 per 1000 inhabitants) were reported by reviewing medical records and conducting prospective surveillance between April 1st and July 29th 2007 (Duffy et al., 2009). In French Polynesia, sentinel surveillance recorded more than 8700 suspected cases (32 per 1000 inhabitants) across the whole territory between October 2013 and April 2014 (CHSP, 2014; Mallet et al., 2015). In New Caledonia, the first Zika case was imported from French Polynesia on 2013 November 12th. Approximately 2500 cases (9 per 1000 inhabitants) were reported through surveillance between January (first autochtonous case) and August 2014 (DASS, 2014). For Yap and French Polynesia, the post-epidemic seroprevalence was assessed. In Yap, a household survey was conducted after the epidemic, yielding an infection rate in the island of 73% (Duffy et al., 2009). In French Polynesia, three seroprevalence studies were conducted. The first one took place before the Zika outbreak, and concluded that most of the population was naive for Zika virus (Aubry et al., 2015a). The second seroprevalence survey was conducted between February and March 2014, at the end of the outbreak, and reported a seroprevalence rate around 49% (Aubry et al., 2015b). The third one concerned only schoolchildren in Tahiti and was therefore not included in the present study. Demographic data on population size were based on censuses from Yap (Duffy et al., 2009), French Polynesia (Insee, 2012), and New Caledonia (Insee, 2014).

Models and inference

Model equations

Although the models are simulated in a stochastic framework, we present them with ordinary differential equations for clarity. The reactions involved in the stochastic models are the same as those governed by the deterministic equations, but the simulation process differs through the use of discrete compartments. It is described in the next section. The equations describing Pandey model are: where is the proportion of susceptible mosquitoes, the proportion of exposed mosquitoes, and the proportion of infected mosquitoes. Since we are using a discrete model, we cannot use directly the proportions , and whose values are smaller than one. Therefore, we rescale using , which leads to , , and . In this model, the force of infection for humans is , and the force of infection for mosquitoes is The equations describing Laneri model are: In this model, the role of mosquitoes in transmission is represented only through the delay they introduce during the extrinsic incubation period (EIP, incubation period in the mosquito). For modeling reasons, this delay is included by representing the force of infection from infected humans to susceptible humans with two compartments and : in this formalism, the duration between the moment when an exposed individual becomes infectious and the moment when another susceptible individual acquires the infection has a gamma distribution of mean (Laneri et al., 2010; Roy et al., 2013; Lloyd, 2001). Therefore, represents the current force of infection for humans . The compartment represents the same force of infection but at a previous stage, reflecting the exposed phase for mosquitoes during the extrinsic incubation period. As an analogy to Pandey model, the force of infection for mosquitoes is , and therefore, the parameter can be interpreted as the product of a transmission parameter by the proportion of susceptible mosquitoes: . The force of infection for mosquitoes is then similar to Pandey’s : . Again, since we are using a discrete model, we cannot use directly the proportions and whose values are smaller than one. Therefore, we rescale up to a factor , which leads to and . In both models, following the dominant paradigm that diseases transmitted by Aedes mosquitoes are highly clustered, we restricted the total population measured by census to a fraction , in which the parameter is estimated. This formulation makes the hypothesis that a fraction of the total population is not at risk from the epidemic, because of individual factors or because the individuals remain in areas where the virus is not present. Moreover as the vector’s flying range is small, the clustering of ZIKV infection may be reinforced. This may result in escapees from infection within the population, even at a single island scale. The available data does not allow further exploration of the mechanisms underlying these phenomena, which seem fundamental to understand ZIKV propagation. At the very least, the restriction to a fraction enables the model to reproduce the observed seroprevalence rates, and to provide coherent results with respect to both data sources (seroprevalence and surveillance data).

Stochastic framework

Both models are simulated in a stochastic and discrete framework, the Poisson with stochastic rates formulation (Bretó et al., 2009), to include the uncertainties related to small population size. In this framework, the number of reactions occurring in a time interval is approximated by a multinomial distribution. In a model with possible reactions and compartments, being the state of the system at time and the model parameters, the probability that each reaction with rate occurs times in is calculated as follows (Dureau et al., 2013): with, being the number of individual in compartment at time , (the number of individuals staying in compartment in ) (multinomial coefficient)

Observation models

The only observed compartments are the infected humans (incidence measured every week) and the recovered humans (seroprevalence at the end of the outbreak when data is available). In order to link the model to the data, two observation models, for both incidence and seroprevalence data, are needed.

Observation model on incidence data

The observed weekly incidence is assumed to follow a negative binomial distribution (Bretó et al., 2009) whose mean equals the number of new cases predicted by the model times an estimated observation rate . The observation rate accounts for non observed cases, due to non reporting from medical centers, mild symptoms unseen by health system, and asymptomatic infections. Without additional data, it is not possible to make a distinction between these three categories of cases. We also implicitely make the assumption that these cases transmit the disease as much as reported symptomatic cases. The observation model for incidence data is therefore : being the observed incidence, and the incidence predicted by the model. The dispersion parameter (Bretó et al., 2009) is fixed at 0.1.

Observation model on seroprevalence data

Seroprevalence data is fitted for Tahiti, Moorea, and Yap settings. It is assumed that the observed seroprevalence at the end of the epidemic follows a normal distribution with fixed standard deviation, whose mean equals the number of individuals in the compartment predicted by the model. The observation model for seroprevalence data is therefore : at the last time step, with notations detailed for each model in Table 3.

Table 3.

Details of the observation models for seroprevalence

DOI: http://dx.doi.org/10.7554/eLife.19874.008

Island	Date	Standard deviation	Observed seroprevalence
		Λ	HRo⁢b⁢s
Yap	2007-07-29	150	5005 (Duffy et al., 2009)
Moorea	2014-03-28	325	0.49 × 16200 = 7938 (Aubry et al., 2015b)
Tahiti	2014-03-28	3562	0.49 × 178100 = 87269 (Aubry et al., 2015b)

Details of the observation models for seroprevalence DOI: http://dx.doi.org/10.7554/eLife.19874.008

Prior distributions

Informative prior distributions are assumed for the mosquito lifespan, the duration of infectious period, and both intrinsic and extrinsic incubation periods. The initial numbers of infected mosquitoes and humans are estimated, and the initial number of exposed individuals is set to the initial number of infected to reduce parameter space. We assume that involved populations are naive to Zika virus prior to the epidemic and set the initial number of recovered humans to zero. The other priors and associated references are listed in Table 4.

Table 4.

DOI: http://dx.doi.org/10.7554/eLife.19874.009

Parameters		Pandey model	Laneri model	References
R02	squared basic reproduction number	Uniform[0, 20]	Uniform[0, 20]	assumed
βV	transmission from human to mosquito	Uniform[0,10]	.	assumed
γ-1	infectious period (days)	Normal(5,1) in [4,7]	Normal(5,1) in [4,7]	(Mallet et al., 2015)
σ-1	intrinsic incubation period (days)	Normal(4,1) in [2,7]	Normal(4,1) in [2,7]	(Nishiura et al., 2016b; Bearcroft, 1956; Lessler et al., 2016)
τ-1	extrinsic incubation period (days)	Normal(10.5,1) in [4,20]	Normal(10.5,1) in [4,20]	(Hayes, 2009; Chouin-Carneiro et al., 2016)
μ-1	mosquito lifespan (days)	Normal(15,2) in [4,30]	.	(Brady et al., 2013; Liu-Helmersson et al., 2014)
ρ	fraction of population involved	Uniform[0,1]	Uniform[0,1]

Prior distributions of parameters. 'Uniform[0,20]' indicates a uniform distribution in the range [0,20]. 'Normal(5,1) in [4,7]' indicates a normal distribution with mean five and standard deviation 1, restricted to the range [4,7]. DOI: http://dx.doi.org/10.7554/eLife.19874.009 The range for the prior on observation rate is reduced for Tahiti and New Caledonia, in order to reduce the parameter space and facilitate convergence. In both cases, we use the information provided with the data source. In French Polynesia, 8750 cases we reported, but according to local health authorities, more than 32,000 people would have attended health facilities for Zika (Mallet et al., 2015) (8750/32000 ≤ 0.3). In New Caledonia, approximately 2500 cases were reported but more than 11,000 cases were estimated by heath authorities (DASS, 2014) (2500/11000 ≤ 0.23). In both cases, these extrapolations are lower bounds on the real number of cases (in particular, they do not estimate the number of asymptomatic infections), and therefore can be used as upper bounds on the observation rate.

Estimations

Inference with PMCMC

The complete model is represented using the state space framework, with two equation systems: the transition equations refer to the transmission models, and the measurement equations are given by the observation models. In a deterministic framework, this model could be directly estimated using MCMC, with a Metropolis-Hastings algorithm targeting the posterior distribution of the parameters. This algorithm would require the calculation of the model likelihood at each iteration. In our stochastic framework, the model output is given only through simulations and the likelihood is intractable. In consequence, estimations are performed with the PMCMC algorithm (particle Markov Chain Monte Carlo (Andrieu et al., 2010)), in the PMMH version (particle marginal Metropolis-Hastings). This algorithm uses the Metropolis-Hastings structure, but replaces the real likelihood by its estimation with Sequential Monte Carlo (SMC). Algorithm 1 PMCMC (Andrieu et al., 2010) (PMMH version, as in SSM (Dureau et al., 2013)) SMC (Doucet et al., 2001) is a filtering method that enables to recover the latent variables and estimate the likelihood for a given set of parameters. The data is treated sequentially, by adding one more data point at each iteration. The initial distribution of the state variables is approximated by a sample a particles, and from one iteration to the next, all the particles are projected according to the dynamic given by the model. The particles receive a weight according to the quality of their prediction regarding the observations. Before the next iteration, all the particles are resampled using these weights, in order to eliminate low weight particles and concentrate the computational effort in high probability regions. Model likelihood is also computed sequentially at each iteration (Dureau et al., 2013; Doucet and Johansen, 2011). Algorithm 2 SMC (Sequential Monte Carlo, as implemented in SSM [Dureau et al., 2013]) In a model with observations and particles. A gaussian kernel is used in the PMCMC algorithm, with mean and fixed variance (random walk Metropolis Hastings).

Initialization

PMCMC algorithm is very sensitive to initialization of both the parameter values and the covariance matrix . Several steps of initialization are therefore used. Firstly, parameter values are initialized by maximum likelihood through simplex algorithm on a deterministic version of the model. We apply the simplex algorithm to a set of 1000 points sampled in the prior distributions and we select the parameter set with the highest likelihood. Secondly, in order to initialise the covariance matrix, an adaptative MCMC (Metropolis Hastings) framework is used (Roberts and Rosenthal, 2009; Dureau et al., 2013). It uses the empirical covariance of the chain , and aims to calibrate the acceptance rate of the algorithm to an optimal value. The transition kernel is also mixed (with a probability ) with another gaussian using the identity matrix to improve mixing properties. The parameter is approximated by successive iterations using the empirical acceptance rate of the chain. The adaptative PMCMC algorithm itself may have poor mixing properties without initialization. A first estimation of the covariance matrix is computed using KMCMC algorithm (Dureau et al., 2013). In the KMCMC algorithm, the model is simulated with stochastic differential equations (intermediate between deterministic and Poisson with stochastic rates frameworks) and the SMC part of the adaptative PMCMC is replaced by the extended Kalman filter. When convergence is reached with KMCMC, then, adaptative PMCMC is used. The PMCMC algorithm is finally applied on the output of the adaptative PMCMC, using 50,000 iterations and 10,000 particles. Calculations are performed with SSM software (Dureau et al., 2013) and R version 3.2.2.

Calculation

is calculated using the Next Generation Matrix approach (NGM) (19).

Calculation in Pandey model

Then we have, and We calculate the eigen values of : Then or and the highest eigenvalue is . This formula defines as "the number of secondary cases per generation" (Dietz, 1993): i.e can be written as the geometric mean , where is the number of infected mosquitoes after the introduction of one infected human in a naive population, and is the number of infected humans after the introduction of one infected mosquito in a naive population. With this definition, herd immunity is reached when of the population is vaccinated (Dietz, 1993).

Calculation in Laneri model

Following the analogy with Pandey model, we compute the spectral radius of the NGM for the Laneri model. Then we have, and We calculate the eigen values of : Then or and the highest eigenvalue is . Because and can be seen as parameters rather than state variables, the interpretation of the spectral radius as the of the model is not straightforward. Therefore, we computed the of the model through simulations, by counting the number of secondary infections after the introduction of a single infected individual in a naive population. Since Laneri model is considered here as a vector model, the number of infected humans after the introduction of a single infected is considered as . We simulated 1000 deterministic trajectories, using parameter values sampled in the posterior distributions for all parameters except initial conditions. With this method, the confidence intervals for number of infected humans () are similar to the ones of estimated by the model. As a consequence, was approximated by the spectral radius of the NGM in our results with our stochastic framework (cf. Table 5).

Table 5.

DOI: http://dx.doi.org/10.7554/eLife.19874.010

	Pandey model	Laneri model
Yap	3.1 (2.5–4.3)	2.2 (1.9–2.6)
Moorea	2.6 (2.2–3.3)	1.8 (1.6–2.0)
Tahiti	2.4 (2.0–3.2)	1.6 (1.5–1.7)
New Caledonia	2.0 (1.8–2.2)	1.6 (1.5–1.7)

Square root of the number of secondary cases after the introduction of a single infected individual in a naive population. Median and 95% credible intervals of 1000 deterministic simulations using parameters from the posterior distribution. DOI: http://dx.doi.org/10.7554/eLife.19874.010 As a robustness check, the same method was applied to Pandey model : the confidence intervals for the number of secondary cases in simulations are very similar to the ones of (cf. Table 5).

Sensitivity analysis

In order to analyse the influence of parameter values on the model’s outputs, a sensitivity analysis was performed, using LHS/PRCC technique (Blower and Dowlatabadi, 1994), on Tahiti example. Similar results were obtained for the other settings. Three criteria were retained as outputs for the analysis: the seroprevalence at the last time point, the intensity of the peak of the outbreak and the date of the peak. We used uniform distributions for all parameters, which are listed in Tables 6 and 7. For model parameters, we used the same range as for the prior distribution. For initial conditions, the observation rate and the fraction involved in the epidemic , we used the 95% confidence interval obtained by PMCMC, in order to avoid unrealistic scenarios.

Table 6.

DOI: http://dx.doi.org/10.7554/eLife.19874.011

Parameters	Distribution	Seroprevalence	Peak intensity	Peak date
Model parameters
R02	Uniform[0,20]	0.87	0.90	−0.55
βV	Uniform[0,10]	−0.66	−0.73	0.35
γ-1	Uniform[4,7]	−0.25	0.10	0.20
σ-1	Uniform[2,7]	−0.03	−0.10	0.15
τ-1	Uniform[4,20]	−0.05	−0.07	0.06
μ-1	Uniform[4,30]	−0.56	−0.70	0.49
Initial conditions
H_I(0)	Uniform[2.10^-5,0.011]	0.05	−0.02	0.02
V_I(0)	Uniform[10^-4,0.034]	0.11	−0.00	−0.26
Fraction involved and observation model
ρ	Uniform[0.46,0.54]	0.47	0.15	−0.03
r	Uniform[0.048,0.072]	−0.04	0.03	0.05

Table 7.

Sensitivity analysis in Laneri model. Tahiti island. 1000 parameter sets were sampled with latin hypercube sampling (LHS), using 'lhs' R package (Carnell, 2016). On each parameter set, the model was simulated deterministically in order to explore variability in parameters without interference with variations due to stochasticity. PRCC were computed using the 'sensitivity' R package (Pujol et al., 2016).

DOI: http://dx.doi.org/10.7554/eLife.19874.012

Parameters	Distribution	Seroprevalence	Peak intensity	Peak date
Model parameters
R02	Uniform[0,20]	0.62	0.93	−0.50
γ-1	Uniform[4,7]	0.01	0.62	0.15
σ-1	Uniform[2,7]	−0.03	−0.54	0.21
τ-1	Uniform[4,20]	−0.03	−0.70	0.47
Initial conditions
H_I(0)	Uniform[10^-5,0.015]	0.05	0.02	−0.32
L(0)	Uniform[2.10^-5,0.004]	0.05	0.00	−0.16
Fraction involved and observation model
ρ	Uniform[0.49,0.59]	0.80	0.34	0.02
r	Uniform[0.048,0.068]	−0.01	0.01	−0.02

Sensitivity analysis in Pandey model. Tahiti island. 1000 parameter sets were sampled with latin hypercube sampling (LHS), using 'lhs' R package (Carnell, 2016). On each parameter set, the model was simulated deterministically in order to explore variability in parameters without interference with variations due to stochasticity. PRCC were computed using the 'sensitivity' R package (Pujol et al., 2016). DOI: http://dx.doi.org/10.7554/eLife.19874.011 Sensitivity analysis in Laneri model. Tahiti island. 1000 parameter sets were sampled with latin hypercube sampling (LHS), using 'lhs' R package (Carnell, 2016). On each parameter set, the model was simulated deterministically in order to explore variability in parameters without interference with variations due to stochasticity. PRCC were computed using the 'sensitivity' R package (Pujol et al., 2016). DOI: http://dx.doi.org/10.7554/eLife.19874.012 For all criteria, the key parameters in both models are transmission parameters ( and ). High values for are positively correlated with a large seroprevalence and a high and early peak. On the contrary, high values for the parameters introducing a delay in the model, the incubation periods in human () and in mosquito (), are associated with a lower and later peak, and have no significant effect on seroprevalence. Moreover, the simulations are clearly sensitive to the other model parameters, in particular the mosquito lifespan () in Pandey model. Concerning other parameters, the initial conditions have a noticeable effect on the date of the peak only. As expected, the fraction involved in the epidemic () influences the magnitude of the outbreak, by calibrating the proportion of people than can be infected, but it has no significant effect on the timing of the peak.

Complementary results

These complementary results include PMCMC results for both models in the four settings: the epidemic trajectories regarding the human compartments for infected and recovered individuals (Figures 4,5), the detailed posterior distributions for all parameters (Figures 6–13) and correlation plots for all models (Figures 14–21).

Figure 14.

Correlation plot of MCMC output.

Pandey model, Yap island.

DOI: http://dx.doi.org/10.7554/eLife.19874.023

Figure 21.

Correlation plot of MCMC output.

Laneri model, New Caledonia.

DOI: http://dx.doi.org/10.7554/eLife.19874.030

Infected and recovered humans evolution during the outbreak with Pandey model.

Infected and recovered humans evolution during the outbreak with Laneri model.

Posterior distributions.

Pandey model, Yap island. DOI: http://dx.doi.org/10.7554/eLife.19874.015 Pandey model, Moorea island. DOI: http://dx.doi.org/10.7554/eLife.19874.016 Pandey model, Tahiti island. DOI: http://dx.doi.org/10.7554/eLife.19874.017 Pandey model, New Caledonia. DOI: http://dx.doi.org/10.7554/eLife.19874.018 Laneri model, Yap island. DOI: http://dx.doi.org/10.7554/eLife.19874.019 Laneri model, Moorea island. DOI: http://dx.doi.org/10.7554/eLife.19874.020 Laneri model, Tahiti island. DOI: http://dx.doi.org/10.7554/eLife.19874.021 Laneri model, New Caledonia. DOI: http://dx.doi.org/10.7554/eLife.19874.022

Correlation plot of MCMC output.

Pandey model, Yap island. DOI: http://dx.doi.org/10.7554/eLife.19874.023 Pandey model, Moorea island. DOI: http://dx.doi.org/10.7554/eLife.19874.024 Pandey model, Tahiti island. DOI: http://dx.doi.org/10.7554/eLife.19874.025 Pandey model, New Caledonia. DOI: http://dx.doi.org/10.7554/eLife.19874.026 Laneri model, Yap island. DOI: http://dx.doi.org/10.7554/eLife.19874.027 Laneri model, Moorea island. DOI: http://dx.doi.org/10.7554/eLife.19874.028 Laneri model, Tahiti island. DOI: http://dx.doi.org/10.7554/eLife.19874.029 Laneri model, New Caledonia. DOI: http://dx.doi.org/10.7554/eLife.19874.030

Correlation between estimated parameters

The inference technique may fail to estimate some parameters due to identifiability issues. In particular, when two parameters are highly correlated to one another, the model manages to estimate the pair of parameters but not each one separately. The analysis of correlation between parameters’ posterior distributions can reveal such cases. The following graphics display for each model the correlation coefficients between all pairs of parameters across the MCMC chain. For example, in models for New Caledonia, the observation rate and the fraction of the population involved in the epidemic are strongly negatively correlated (Figures 17,21): the inference technique does not manage to estimate properly these two parameters, due to the lack of information on seroprevalence.

Figure 17.

Correlation plot of MCMC output.

Pandey model, New Caledonia.

DOI: http://dx.doi.org/10.7554/eLife.19874.026

Code and source data files

The estimation tools are provided by the open source software SSM (Dureau et al., 2013) (State Space Models, RRID:SCR_014647), available at https://github.com/JDureau/ssm and https://github.com/sballesteros/ssm-predict. The codes for the implementation of each model are provided as supplementary file. In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included. Thank you for submitting your article "Structure in the variability of the basic reproductive number (R0) for Zika epidemics in the Pacific islands" for consideration by eLife. Your article has been reviewed by two peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Prabhat Jha as the Senior Editor. The reviewers have opted to remain anonymous. The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission. On the whole, the reviewers and the Reviewing Editor appreciated the work. Although there have been a number of previous attempts to estimate R0 for Zika using Pacific Island data, they see the importance in the use of a stochastic model in order to obtain more precise values for both point estimates and uncertainty bounds, as well as the efforts to capture some degree of the structural uncertainty in model design. Making the source data available was particularly appreciated as this enhances the transparency and reproducibility of the analysis. The reviewers and Reviewing Editor had only two requirements, although both of them are important: 1) Despite a better effort than many to explain the methods used, the details were still unclear to the reviewers. Hence we would like to see the full model code used for the analysis made available alongside the source data. 2) There was some concern over the estimation and interpretation of the parameter ρ, which appears to be a key parameter in the model but is poorly defined. This reduces the population actually involved in the epidemic dynamics as a result of "spatial heterogeneity, immuno-resistance or cross-immunity". It ranges between 34% to 74% across models and islands, and is especially variable in New Caledonia between the two models (34% – 70%). It seems particularly strange that ρ is actually smaller than final seroprevalence rate in the outbreak (73% in Yap and 94% in French Polynesia), when most of the population was naïve prior to the outbreak. How do people seroconvert if they are not involved in the epidemic? And if part of the population was actually protected from infection, is there any evidence to support the mechanism (eg. genetic links to Zika susceptibility, or spatial data showing that part of the islands were unaffected)? [Editors' note: further revisions were requested prior to acceptance, as described below.] Thank you for resubmitting your work entitled "Structure in the variability of the basic reproductive number (R0) for Zika epidemics in the Pacific islands" for further consideration at eLife. Your revised article has been favorably evaluated by Prabhat Jha as the Senior editor and the previous reviewers. Firstly, we would like to apologise for the time it has taken to respond to your resubmission. We wanted to consult further about the issue of model code availability stated below, but unfortunately this process took far longer than anticipated. Overall we are happy with the revised article and are almost ready to proceed to acceptance. The one outstanding issue is the availability of the model code. We do understand that you have provided a link to the code to the algorithm used for parameter inference. However, the editors and reviewer were unable to understand how to reproduce your numerical findings using the code in the repository plus the data included in the Appendix. As far as we can tell, all you have provided is a link to code written by other developers that was used in your model, not the actual compartmental model itself. We would like you to provide either (i) an explanation that we are wrong and that your results (including all the figures behind all the tables and graphs) can be fully regenerated using this code, or (ii) the actual code itself. Once you are able to provide this, we should be able to proceed very rapidly to a final decision On the whole, the reviewers and the Reviewing Editor appreciated the work. Although there have been a number of previous attempts to estimate R The reviewers and Reviewing Editor had only two requirements, although both of them are important: 1) Despite a better effort than many to explain the methods used, the details were still unclear to the reviewers. Hence we would like to see the full model code used for the analysis made available alongside the source data. We added the links to the open source software SSM, and the JSON codes for the implementation of each model. 2) There was some concern over the estimation and interpretation of the parameter ρ, which appears to be a key parameter in the model but is poorly defined. This reduces the population actually involved in the epidemic dynamics as a result of "spatial heterogeneity, immuno-resistance or cross-immunity". It ranges between 34% to 74% across models and islands, and is especially variable in New Caledonia between the two models (34% – 70%). It seems particularly strange that ρ is actually smaller than final seroprevalence rate in the outbreak (73% in Yap and 94% in French Polynesia), when most of the population was naïve prior to the outbreak. How do people seroconvert if they are not involved in the epidemic? And if part of the population was actually protected from infection, is there any evidence to support the mechanism (eg. genetic links to Zika susceptibility, or spatial data showing that part of the islands were unaffected)? We provided an extended description of the parameter ρ, both in the Methods and Results sections: “The fraction ρ of the population involved in the epidemic is well estimated when the seroprevalence is known (in Yap and French Polynesia). In these cases, the proportion of the population involved is slightly greater than the seroprevalence rate, indicating a very high infection rate among involved individuals. In New Caledonia, as no information on seroprevalence was available, the fraction of population involved displays very large confidence intervals (cf. Tables 1 and 2), indicating that the model did not manage to identify properly this parameter with the available data. In this case, this parameter is highly correlated to the observation rate r (Figures 17 and 21): r and ρ seem unidentifiable without precise information on seroprevalence.” “In both models, following the dominant paradigm that diseases transmitted by Aedes mosquitoes are highly clustered, we restricted the total population H measured by census to a fraction N = ρ.H, in which the parameter ρ is estimated. This formulation makes the hypothesis that a fraction of the total population is not at risk from the epidemic, because of individual factors or because the individuals remain in areas where the virus is not present. Moreover, as the vector’s flying range is small, the clustering of ZIKV infection may be reinforced. This may result in escapees from infection within the population, even at a single island scale. The available data does not allow further exploration of the mechanisms underlying these phenomena, which seem fundamental to understand ZIKV propagation. At the very least, the restriction to a fraction ρ enables the model to reproduce the observed seroprevalence rates, and to provide coherent results with respect to both data sources (seroprevalence and surveillance data).” In all our estimations, the fraction ρ is greater than seroprevalence. The final seroprevalence rate in French Polynesia was 49% (Aubry et al. 2015) and 73% in Yap (Duffy et al. 2009) and ρ was estimated respectively to 50% and 74%. Because no seroprevalence information was available for New Caledonia, the estimated ρ varied between models. Posterior estimates for the parameters are calculated on the MCMC chain, and estimated seroprevalence is calculated on simulations of the model with parameters sampled in the MCMC chain. Our initial code used different lengths of the MCMC chain, generating some minor problems in the estimation of the medians. This did not impact the results, except for ρ in New Caledonia, because, due to the lack of seroprevalence data, ρ is poorly identifiable, and its posterior distribution is very skewed. We corrected this imprecision by using the same chain length: now all our estimates use a chain with 50,000 iterations. In all cases, people do not seroconvert if they are not involved in the epidemic. The precise mechanisms by which a part of the population is protected from infection are unknown, even though there is some evidence of a spatial clustering of cases, as it was observed for dengue in other contexts (see for example Telle et al. PLoS One 2015). For instance, in French Polynesia, previous dengue infection rates display large spatial variations even within islands (cf. Daudens et al. “Épidémiologie de la dengue et stratégies de lutte en Polynésie française, 2006-2008”, Bulletin épidémique hebdomadaire, InVS, 2009, December 22th). [Editors' note: further revisions were requested prior to acceptance, as described below.] Overall we are happy with the revised article and are almost ready to proceed to acceptance. The one outstanding issue is the availability of the model code. We do understand that you have provided a link to the code to the algorithm used for parameter inference. However, the editors and reviewer were unable to understand how to reproduce your numerical findings using the code in the repository plus the data included in the appendix. As far as we can tell, all you have provided is a link to code written by other developers that was used in your model, not the actual compartmental model itself. We would like you to provide either (i) an explanation that we are wrong and that your results (including all the figures behind all the tables and graphs) can be fully regenerated using this code, or (ii) the actual code itself. Once you are able to provide this, we should be able to proceed very rapidly to a final decision Firstly, we want to thank the Editor and the reviewers for their remarks and their effort to reproduce our calculation. We understand that the use of our codes is not straightforward and therefore, we now provide more detailed files and explanations. The code in the online repository is the software SSM: it is the library to be installed in order to perform the estimations using the pmcmc and kmcmc algorithms. In our previous resubmission, we also provided the.json files describing the models (as a zip “Source Code” file). We extended this zip folder, with some initialisation and result files, and we added a READ ME document describing the procedure to be followed in order to use the software and our codes. We also added an R code for plotting and an example of our estimations (for Pandey model in Tahiti), so that it can be plotted directly. Finally, we made the reference to this folder more explicit in the manuscript (it is now referenced as “Supplementary file 1”). We reproduce here the text of the READ ME file: 1) Install SSM software on your machine as indicated in https://github.com/JDureau/ssm; the complementary module “SSM predict” also needs to be installed (https://github.com/sballesteros/ssm-predict) 2) In order to run one model, four elements are needed: a) Data files in csv format b) Priors distributions in json format c) An initialisation file in json format d)The json file containing the model, called “ssm.json” These 4 elements are provided for each model, in a file with the corresponding names. The initialisation file provided here is the result of a previous simplex run (called theta_simplex.json) but one can start with other initial values (also put in a file named theta_simplex.json for using with the given code). 3) Execute the R code “pmcmc_kmcmc_simulations.R”. This files enables to compile the model, run KMCMC and PMCMC algorithm and do a simulation of the model (using our estimated values). The PMCMC part is to be used when convergence is reached with KMCMC algorithm, which might sometimes require more than 5 KMCMC steps. The PMCMC part is computationally intensive and may require a cluster (when the number of particles is 10,000). We provided our estimates for Pandey model in Tahiti as an example, in the “pmcmc-result” file: they comprise the last PMCMC chain (trace_0.csv), the filter trajectory (X_0.csv), the last set of parameters of the chain (mle.json) and the simulation of the model (X_simul.csv) 4) All the calculation is done, and the outputs can be plotted using for example the R code “plotting”.

Algorithm 1 PMCMC (Andrieu et al., 2010) (PMMH version, as in SSM (Dureau et al., 2013))
In a model with n observations and J particles. q(.\|θ(i)) is the transition kernel.
1: Initialize θ(0). 2: Using SMC algorithm, compute p^(y1:n\|θ(0)) and sample x0:n∗ from p^(x0:n\|y1:n,θ(0)). 3: for i=1...N do 4: Sample θ∗ from q(.\|θ(i)) 5: Using SMC algorithm, compute L(θ∗)=p^(y1:n\|θ∗) and sample x0:n∗ from p^(x0:n\|y1:n,θ∗) 6: Accept θ* (et x0:n∗) with probability 1∧L(θ(i))q(\|θ∗)L(θ∗)q(θ∗\|θ(i)) 7: If accepted, θ(i+1)=θ∗ and x0:n(i+1)=x0:n∗. Otherwise θ(i+1)=θ(i) and x0:n(i+1)=x0:n(i). 8: end for

Algorithm 1 PMCMC (Andrieu et al., 2010) (PMMH version, as in SSM (Dureau et al., 2013))

In a model with n observations and J particles. q(.|θ(i)) is the transition kernel.

1: Initialize

θ(0)

. 2: Using SMC algorithm, compute

p^(y1:n|θ(0))

and sample

x0:n∗

from

p^(x0:n|y1:n,θ(0))

. 3: for

i=1...N

do 4: Sample

θ∗

from

q(.|θ(i))

5: Using SMC algorithm, compute

L(θ∗)=p^(y1:n|θ∗)

and sample

x0:n∗

from

p^(x0:n|y1:n,θ∗)

6: Accept

θ*

(et

x0:n∗

) with probability

1∧L(θ(i))q(|θ∗)L(θ∗)q(θ∗|θ(i))

7: If accepted,

θ(i+1)=θ∗

and

x0:n(i+1)=x0:n∗

. Otherwise

θ(i+1)=θ(i)

and

x0:n(i+1)=x0:n(i)

. 8: end for

Algorithm 2 SMC (Sequential Monte Carlo, as implemented in SSM [Dureau et al., 2013])
In a model with n observations and J particles.
L is the model likelihood p(y1:T\|θ). Wk(j) is the weight and xk(j) is the state associated to particle j at iteration k.
1: Set L=1, W0(j)=1/J. 2: Sample (x0(j))j=1:J from p(x\|θ0). 3: for k=0:n-1j=0:Jdo 4: for j=0:jdo 5: Sample (xk+1(j))j=1:J from p(xk+1\|xk,θ) 6: Set α(j)=p(yk+1\|xk+1(j),θ) 7: end for 8: Set Wk+1(j)=α(j)∑l=1Jα(l) and L=L⁢1J⁢∑jα(l) 9: Resample (x0:k+1(j))j=1:J from Wk+1(j) 10: end for

42 in total

1. Destabilization of epidemic models with the inclusion of realistic distributions of infectious periods.

Authors: A L Lloyd
Journal: Proc Biol Sci Date: 2001-05-07 Impact factor: 5.349

2. Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission.

Authors: P van den Driessche; James Watmough
Journal: Math Biosci Date: 2002 Nov-Dec Impact factor: 2.144

Review 3. The estimation of the basic reproduction number for infectious diseases.

Authors: K Dietz
Journal: Stat Methods Med Res Date: 1993 Impact factor: 3.021

Review 4. Dengue virus-mosquito interactions.

Authors: Scott B Halstead
Journal: Annu Rev Entomol Date: 2008 Impact factor: 19.686

5. Guillain-Barré Syndrome outbreak associated with Zika virus infection in French Polynesia: a case-control study.

Authors: Van-Mai Cao-Lormeau; Alexandre Blake; Sandrine Mons; Stéphane Lastère; Claudine Roche; Jessica Vanhomwegen; Timothée Dub; Laure Baudouin; Anita Teissier; Philippe Larre; Anne-Laure Vial; Christophe Decam; Valérie Choumet; Susan K Halstead; Hugh J Willison; Lucile Musset; Jean-Claude Manuguerra; Philippe Despres; Emmanuel Fournier; Henri-Pierre Mallet; Didier Musso; Arnaud Fontanet; Jean Neil; Frédéric Ghawché
Journal: Lancet Date: 2016-03-02 Impact factor: 79.321

Review 6. Zika Virus.

Authors: Didier Musso; Duane J Gubler
Journal: Clin Microbiol Rev Date: 2016-07 Impact factor: 26.132

7. Co-infection with Zika and dengue viruses in 2 patients, New Caledonia, 2014.

Authors: Myrielle Dupont-Rouzeyrol; Olivia O'Connor; Elodie Calvez; Maguy Daurès; Michèle John; Jean-Paul Grangeon; Ann-Claire Gourinat
Journal: Emerg Infect Dis Date: 2015-02 Impact factor: 6.883

8. Zika virus, French polynesia, South pacific, 2013.

Authors: Van-Mai Cao-Lormeau; Claudine Roche; Anita Teissier; Emilie Robin; Anne-Laure Berry; Henri-Pierre Mallet; Amadou Alpha Sall; Didier Musso
Journal: Emerg Infect Dis Date: 2014-06 Impact factor: 6.883

9. Recasting the theory of mosquito-borne pathogen transmission dynamics and control.

Authors: David L Smith; T Alex Perkins; Robert C Reiner; Christopher M Barker; Tianchan Niu; Luis Fernando Chaves; Alicia M Ellis; Dylan B George; Arnaud Le Menach; Juliet R C Pulliam; Donal Bisanzio; Caroline Buckee; Christinah Chiyaka; Derek A T Cummings; Andres J Garcia; Michelle L Gatton; Peter W Gething; David M Hartley; Geoffrey Johnston; Eili Y Klein; Edwin Michael; Alun L Lloyd; David M Pigott; William K Reisen; Nick Ruktanonchai; Brajendra K Singh; Jeremy Stoller; Andrew J Tatem; Uriel Kitron; H Charles J Godfray; Justin M Cohen; Simon I Hay; Thomas W Scott
Journal: Trans R Soc Trop Med Hyg Date: 2014-03-03 Impact factor: 2.184

10. Transmission Dynamics of Zika Virus in Island Populations: A Modelling Analysis of the 2013-14 French Polynesia Outbreak.

Authors: Adam J Kucharski; Sebastian Funk; Rosalind M Eggo; Henri-Pierre Mallet; W John Edmunds; Eric J Nilles
Journal: PLoS Negl Trop Dis Date: 2016-05-17

14 in total

1. Reassessing Serosurvey-Based Estimates of the Symptomatic Proportion of Zika Virus Infections.

Authors: Patrick K Mitchell; Luis Mier-Y-Teran-Romero; Brad J Biggerstaff; Mark J Delorey; Maite Aubry; Van-Mai Cao-Lormeau; Matthew J Lozier; Simon Cauchemez; Michael A Johansson
Journal: Am J Epidemiol Date: 2019-01-01 Impact factor: 4.897

Review 2. Performance of Zika Assays in the Context of Toxoplasma gondii, Parvovirus B19, Rubella Virus, and Cytomegalovirus (TORCH) Diagnostic Assays.

Authors: Bettie Voordouw; Barry Rockx; Thomas Jaenisch; Pieter Fraaij; Philippe Mayaud; Ann Vossen; Marion Koopmans
Journal: Clin Microbiol Rev Date: 2019-12-11 Impact factor: 26.132

3. Comparative Analysis of Dengue and Zika Outbreaks Reveals Differences by Setting and Virus.

Authors: Sebastian Funk; Adam J Kucharski; Anton Camacho; Rosalind M Eggo; Laith Yakob; Lawrence M Murray; W John Edmunds
Journal: PLoS Negl Trop Dis Date: 2016-12-07

4. Dynamics of Zika virus outbreaks: an overview of mathematical modeling approaches.

Authors: Anuwat Wiratsudakul; Parinya Suparit; Charin Modchang
Journal: PeerJ Date: 2018-03-22 Impact factor: 2.984

5. Accounting for non-stationarity in epidemiology by embedding time-varying parameters in stochastic models.

Authors: Bernard Cazelles; Clara Champagne; Joseph Dureau
Journal: PLoS Comput Biol Date: 2018-08-15 Impact factor: 4.475

6. Quantifying the risk of Zika virus spread in Asia during the 2015-16 epidemic in Latin America and the Caribbean: A modeling study.

Authors: Xue Shi Luo; Natsuko Imai; Ilaria Dorigatti
Journal: Travel Med Infect Dis Date: 2020-01-26 Impact factor: 6.211

7. New estimates of the Zika virus epidemic attack rate in Northeastern Brazil from 2015 to 2016: A modelling analysis based on Guillain-Barré Syndrome (GBS) surveillance data.

Authors: Daihai He; Shi Zhao; Qianying Lin; Salihu S Musa; Lewi Stone
Journal: PLoS Negl Trop Dis Date: 2020-04-29

8. Epidemiological and ecological determinants of Zika virus transmission in an urban setting.

Authors: José Lourenço; Maricelia Maia de Lima; Nuno Rodrigues Faria; Andrew Walker; Moritz Ug Kraemer; Christian Julian Villabona-Arenas; Ben Lambert; Erenilde Marques de Cerqueira; Oliver G Pybus; Luiz Cj Alcantara; Mario Recker
Journal: Elife Date: 2017-09-09 Impact factor: 8.140

9. Using paired serology and surveillance data to quantify dengue transmission and control during a large outbreak in Fiji.

Authors: Adam J Kucharski; Mike Kama; Conall H Watson; Maite Aubry; Sebastian Funk; Alasdair D Henderson; Oliver J Brady; Jessica Vanhomwegen; Jean-Claude Manuguerra; Colleen L Lau; W John Edmunds; John Aaskov; Eric James Nilles; Van-Mai Cao-Lormeau; Stéphane Hué; Martin L Hibberd
Journal: Elife Date: 2018-08-14 Impact factor: 8.713

10. Modelling the effective reproduction number of vector-borne diseases: the yellow fever outbreak in Luanda, Angola 2015-2016 as an example.

Authors: Shi Zhao; Salihu S Musa; Jay T Hebert; Peihua Cao; Jinjun Ran; Jiayi Meng; Daihai He; Jing Qin
Journal: PeerJ Date: 2020-02-27 Impact factor: 2.984