Literature DB >> 29540937

Modelling time varying heterogeneity in recurrent infection processes: an application to serological data.

Steven Abrams¹, Andreas Wienke², Niel Hens^3,4.

Abstract

Frailty models are often used in survival analysis to model multivariate time-to-event data. In infectious disease epidemiology, frailty models have been proposed to model heterogeneity in the acquisition of infection and to accommodate association in the occurrence of multiple types of infection. Although traditional frailty models rely on the assumption of lifelong immunity after recovery, refinements have been made to account for reinfections with the same pathogen. Recently, Abrams and Hens quantified the effect of misspecifying the underlying infection process on the basic and effective reproduction number in the context of bivariate current status data on parvovirus B19 and varicella zoster virus. Furthermore, Farrington, Unkel and their co-workers introduced and applied time varying shared frailty models to paired bivariate serological data. In this paper, we consider an extension of the proposed frailty methodology by Abrams and Hens to account for age-dependence in individual heterogeneity through the use of age-dependent shared and correlated gamma frailty models. The methodology is illustrated by using two data applications.

Entities: CellLine Chemical Disease Gene Species

Keywords: Age‐dependent frailties; Basic reproduction number; Non‐immunizing infections; Serology; Shared and correlated frailty models

Year: 2017 PMID： 29540937 PMCID： PMC5836988 DOI： 10.1111/rssc.12236

Source DB: PubMed Journal: J R Stat Soc Ser C Appl Stat ISSN： 0035-9254 Impact factor: 1.864

Introduction

The analysis of time‐to‐event data expanded rapidly since the development of the proportional hazards model by Cox (1972) with applications in medicine, epidemiology, demography and many other fields. Frailty models have been used in survival analysis to model dependence and unobserved heterogeneity in multivariate time‐to‐event data explicitly and to address questions that are related to the association between multiple event times (see, for example, Clayton (1978), Hougaard (2000) and Duchateau and Janssen (2008)). The dependence between event times usually arises since individuals in the same group (family, cluster or litter) are related, or because multiple event times are recorded for the same subject. Multivariate frailty models provide an extension of the univariate frailty model (Vaupel et al., 1979; Lancaster, 1979), taking mutual dependence of event times into account by using conditional proportional hazards given latent frailty terms. To impose an association structure among the event times, two approaches are often considered. First, a shared frailty model uses a common frailty term for all individuals in the same group (cluster); see, for example, Hougaard (2000), Therneau and Grambsch (2000) and Duchateau and Janssen (2008). Second, the correlated frailty model extends the shared frailty model to allow for a more flexible correlation structure between the frailty terms (see, for example, Yashin et al. (1995), Xue and Brookmeyer (1996) and Wienke et al. (2002, 2003)). Individual heterogeneity with respect to factors that may enhance or inhibit the transmission of infectious diseases affects the effectiveness of control strategies as has been understood for a long time (Anderson and May, 1991). When this heterogeneity is not appropriately accounted for, one potentially ends up with biased estimates for important epidemiological parameters. One of the key assumptions in shared and correlated frailty models is that of age or time invariant individual heterogeneity, implying that individuals have a constant ‘frailty’ level during their entire life, which is rather restrictive. Recently, Farrington et al. (2012) introduced time varying shared frailties for two immunizing infections. We present new methodology for infections conferring temporary humoral immunity on recovery (henceforth referred to as non‐immunizing infections) taking into account the fact that individual frailties—and hence the correlations that they induce—may vary with time. This enables us to describe and quantify heterogeneities further and to study how these heterogeneities evolve over time. Although our novel frailty methodology is generally applicable to recurrent event processes, we focus here on recurrent infection processes using cross‐sectional serological data. Cross‐sectional serological data constitute type I interval‐censored data, or current status data, for which the only knowledge about the event is whether it occurred before the observation time point or not, at least for infections conferring lifelong humoral immunity. In the case of pathogens for which recurrent infections during life are possible, an individual's immunological status merely indicates whether he or she experienced recent infection (giving rise to a humoral immunity response) and therefore whether the individual is susceptible to infection or not at the moment of data collection. Although we focus on current status data, the models that are presented here are applicable to recurrent event time data in the presence of any type of censoring, underlining the importance of our approach in the general survival context (not shown here). More specifically for the current status data setting, in contrast with a general recurrent event setting entailing information on multiple events per subject, recurrent infections are not directly observed but rather accounted for through the specification of non‐immunizing infection dynamics (a theoretical mechanistic model). For infections in endemic equilibrium, serology can be used to infer the force of infection, even in the presence of individual heterogeneity and more complicated disease dynamics (Farrington et al., 2001; Abrams and Hens, 2015). For a historical note on the estimation of the force of infection from serological data, we refer to Hens et al. (2010). In summary, the methodology that is presented in this paper adds time varying shared frailty models for recurrent event time data and time varying correlated frailties for single‐event time data to the existing frailty literature. The paper is organized as follows. In Section 2, two different data applications are introduced. In Section 3, time varying frailty models are derived for current status data allowing for recurrent events in the past. In Section 4, the proposed methodology is applied to the data. Finally, advantages and disadvantages of our approach are summarized in Section 5. The data that are analysed in the paper and the programs that were used to analyse them can be obtained from

Motivating examples

First, the methodology is illustrated on hepatitis A virus (HAV) and hepatitis B virus (HBV) serology, obtained from a seroepidemiological study undertaken in 1993–1994 in Flanders, Belgium. In total, 4026 blood samples were drawn from a study group that was representative of the Flemish population and tested for the presence of immunoglobulin G antibodies against HAV and HBV (Beutels et al., 1997). On the basis of a prespecified cut‐off level for the antibody titres, individuals are classified as either seropositive or seronegative for each disease. In addition to the immunological status with respect to a specific infection, age at sample collection was registered. As the true infection times are known only to lie between 0 and the age at sampling time for seropositive individuals or between the age at sampling time and ∞ for seronegative individuals, we are faced with current status data. Second, we consider serological survey data on parvovirus B19, PVB19, and varicella zoster virus (VZV) infections in Belgium. Blood samples were collected for 3379 different individuals between 2001 and 2003 and tested for the presence of immunoglobulin G antibodies against PVB19 and VZV (Hens et al., 2008). A detailed description regarding symptomatology and epidemiology of the aforementioned infections can be found in the on‐line appendix A. In both examples only the (current) immunological status of an individual is known at the individual's age at the time of investigation. The number and time points of former infections, if applicable, are unknown, which distinguishes this type of data from classical recurrent event time data. Note, however, that VZV, HAV and HBV infections are assumed to confer lifelong immunity, and only reinfection dynamics for PVB19 are studied and accounted for in the respective data applications.

Materials and methods

Notation—univariate frailty model

Time‐to‐event data represent the time to a defined event such as death, failure or infection. The analysis of time‐to‐event data in survival analysis is often complicated by censoring. Therefore, let represent the true event time for individual j=1,…,n and consider right‐censored data (T ,Δ), where T = represents the observation time, C refers to the censoring time and Δ denotes the censoring indicator defined asTherefore, the observed event time T resembles the true event time only whenever it occurs before censoring. Generally, right‐censored data are a special case of interval‐censored data for which the true event time is known only to lie in some interval. In the case of right‐censored data, the right end point of that interval corresponds to ∞. Finally, for current status (CS) data, the observation (monitoring) time T equals the censoring time C , and indicates whether the event of interest took place before C or not. In contrast with right‐censored data, the exact value of the event times is always unknown. Various mathematical compartmental models can be found in the infectious disease literature to describe the transmission of different pathogens in large populations. For the purpose of this paper we consider so‐called susceptible–infected–recovered (SIR) and susceptible–infected–recovered–susceptible (SIRS) mechanistic models, for infections conferring lifelong immunity or temporary immunity on recovery respectively. We further assume that infections are in endemic equilibrium, the population has reached a demographic equilibrium and the number of deaths and births is exactly balanced, entailing a constant population of size N. In Fig. 1, a flow diagram of an SIR model is presented. All individuals are born in compartment S. A subject j flows to the infectious state I at a time‐dependent rate conditional on a subject‐specific frailty term Z , i.e. a non‐negative random variable, which is often constrained to have unit mean for identifiability. After being infected with the pathogen under study, subject j recovers at a rate η(t ) and moves towards the recovered class R. In continuous time, the conditional proportion of subjects in state S (0) at time is given bywhich is referred to as the conditional survival function in survival analysis. We use the commonly made assumption of proportional hazards, implying that the frailty acts multiplicatively on a baseline hazard: , where is termed the baseline hazard function. Consequently, the unconditional survival function can be derived by integrating out the frailty terms Z according to a specific frailty distribution with density f(·):where L(s) denotes the Laplace transform of the random variable Z and is the cumulative baseline hazard function.

Figure 1

Mathematical SIR(S) model: subject j flows from the susceptible compartment S to the infected and infectious state I at rate , and from I to the recovered state R at rate ; in an SIR model, subject j remains in R for life; in an SIRS model, subject j moves to S at rate () A straightforward extension of the SIR model to encompass recurrent events (i.e. reinfections) is the so‐called SIRS model which is graphically depicted in Fig. 1; subjects flow from state R to the susceptible compartment S at replenishment rate (the broken arrow). In an SIRS setting, we can derive an expression for the conditional proportion of subjects who can experience the event at time , which is denoted by , assuming that the proportion of subjects in I is negligible compared with those in the other compartments (see appendix B in Abrams and Hens (2015)):where σ(u) refers to the time‐dependent replenishment rate at which subjects become at risk again for the event under consideration. Integrating out the random effects, the unconditional proportion of subjects in S can be rendered asWhen σ(u)=0,∀ u, the conditional proportion of subjects in S (0) reduces to equation (2). The approach that is undertaken in this paper is closely related to (continuous time) Markov processes in multistate modelling. For a more detailed discussion on multistate models and Markov processes, the reader is referred to Andersen and Keiding (2002).

Shared and correlated frailty models

Bivariate frailty models enable us to model the association between bivariate event times. Bivariate event time data can arise either when two different events are studied in the same individual, or when the same event occurs in two different but related subjects. Throughout the paper, we formulate the methodology for two events affecting the same individual, albeit that the same reasoning holds in the other case. Consider bivariate right‐censored data (T ,Δ), i=1,2,j=1,…,n, such that the event‐specific hazard function can be formulated as , where Z represent event‐ and individual‐specific frailty terms that are associated with subject j=1, … , n and event i=1, 2. The dependence structure is implied by the frailty structure imposed. Two different frailty structures are often considered in the survival literature: the shared frailty modeland the correlated frailty modelwhere γ denotes the frailty variance with respect to event i, and the additive components , l= 0, 1, 2, are independent random variables with mean and variance equal to ω >0. Therefore, direct calculation of the frailty variances results in the equality γ =1/(ω 0+ω ) ascertaining unit frailty means. The additive decomposition of the frailty terms in the correlated frailty model was proposed by Yashin et al. (1995) and first applied to infectious disease modelling by Hens et al. (2009). The correlated frailty model extends the restrictive shared frailty model in the sense that the implied correlation between the event times is allowed to differ from 1. Needless to say, the correlated frailty model offers a more general approach to account for individual heterogeneity. However, the implied correlation coefficient ρ is restricted because of the additive decomposition as follows (Yashin et al., 1995):The shared frailty model is a special case of the correlated frailty model in which the event‐specific components , l=1, 2, are identical, i.e. . The time‐dependent proportion of individuals who are susceptible to both events under study can be derived relying on the assumption of conditional independence of the event times given the frailties. More specifically, , where refers to the proportion of individuals at risk of infection i at time given frailty Z for which formulae are presented in Section 3.1. For expressions regarding and S(t 1,t 2), the reader is referred to the on‐line appendix C. In this paper, the gamma (frailty) distribution is considered because of its computational and analytical convenience. The usefulness of the gamma distribution is due to the closed form expression for the Laplace transform. Furthermore, it is well established that in a time invariant bivariate setting the gamma distribution implies a constant association between two failure times of interest, and therefore it could serve as a useful starting point in the fitting process. Independent gamma‐distributed components with equal scale parameters result in infection‐specific gamma frailties Z . The Laplace transform corresponding to a gamma‐distributed random variable with mean 1 and variance γ is , for .

Time varying frailty models

So far, it has been assumed that individual‐ and infection‐specific frailty terms Z do not vary over time. However, time varying frailty models offer a way to allow individual frailty terms to vary with time. Suppose that the hazard at time under the proportional hazards assumption is , where denotes a time varying frailty term.

Piecewise constant model

Paik et al. (1994) proposed a piecewise constant frailty model accommodating nested structures in the data. Without the complexity of nesting, a piecewise constant frailty model on disjoint intervals [t [,t [[ can be obtained as follows:where if and only if , and otherwise. The components are independent non‐negative random variables with unit mean and variance γ . The time varying frailty variance function equals . A piecewise constant shared gamma frailty model (model TDPCSGF) is derived under the constraint , m=1,…,M, and follows a gamma distribution with unit mean and variance γ ·. Although this model for is straightforward and simple, a disadvantage is the ad hoc choice of time intervals [t [,t [[ in which the frailties are assumed constant. In addition, the frailty components are assumed to be independent such that heterogeneity in subsequent time intervals is unrelated, which seems counterintuitive. Therefore, an alternative model was introduced by Farrington et al. (2012) and further exploited by Unkel et al. (2014) in the context of shared frailty models for immunizing infections. The piecewise constant frailty model is used to assess the goodness of fit of the parametric time‐dependent frailty variance functions in the following models.

Multiplicative model

In line with the work by Farrington et al. (2012) and Unkel et al. (2014), a bivariate frailty model is proposed with time varying frailty terms given bywhere , m=1,…,M, are independent random variables with unit mean and variance γ , and h (·) is a deterministic function describing the frailty evolution over time. One can easily verify that the frailty terms have unit mean and time varying frailty varianceSuppose that is a deterministic exponential decay function defined as , where the exponential decay parameter ϕ >0. Although we could consider a more general decay function with any non‐negative integer value k, the choice of k=2 provided the best model fits for the multiplicative models when applied to our motivating examples. Assuming that the frailty components follow a gamma distribution, the following two‐component multiplicative time‐dependent gamma frailty models can be considered (i.e. taking M=2): two‐component time‐dependent shared gamma frailty model TDSGF‐2C, two‐component time‐dependent correlated gamma frailty model TDCGF‐2C, These two‐component models simplify to a one‐component time‐dependent shared gamma frailty model TDSGF‐1C and a one‐component time‐dependent correlated gamma frailty model TDCGF‐1C if and , i=1, 2, follow degenerate distributions with unit mean (γ ·2=0 and γ 12=γ 22=0) respectively. In model TDSGF‐1C, the event‐specific frailty variances decrease from γ ·1 to 0 as . Although the frailty variances differ for different decay rates ϕ , the implied correlation ρ(t 1,t 2) is constant and equal to 1, which is in line with the time invariant shared frailty model. Furthermore, frailty variances in model TDSGF‐2C decline from γ ·2 + γ ·1(1+γ ·2) to an asymptotic value γ ·2. Note that for both events at all time points in the general formulation of the multiplicative model to limit the complexity of the models. Allowing the decay parameters to be different for both events, the correlation ρ(t 1,t 2) decreases over time. Time varying correlated frailty models with one or two components enable a more flexible (time varying) correlation as opposed to their shared counterparts. In models TDCGF‐1C and TDCGF‐2C, the time‐dependent frailty variances evolve in the same way as in TDSGF‐1C and TDSGF‐2C respectively. In addition, model TDCGF‐1C implies a time invariant positive correlation ρ(t 1,t 2) which deviates from 1 if ω 11 and/or ω 21 differ from 0. First assume that for the events under study individuals experience each event at most once (SIR model) and time‐dependent heterogeneity is assumed to evolve according to the most general model (TDCGF‐2C). Hence, the univariate conditional proportion of (susceptible) individuals at risk for event i = 1, 2 under the proportional hazards assumption iswhere is the baseline hazard function at time for infection i= 1, 2. The unconditional susceptibility probabilities are obtained by integrating out the random frailty terms according to the joint distribution of (, ). Relying on independence of the frailty components, we have , where f 1(·),f 2(·) and f 12(·,·) represent the marginal density functions for and , and the joint density function for both random variables respectively. Let L 1 and L 2 denote the Laplace transforms of the gamma‐distributed random variables and respectively. Consequently, we can writewhere is the cumulative baseline hazard function with respect to infection i= 1, 2, and . Again, relying on the assumption of conditional independence of the event times given the (time varying) frailty terms, the bivariate unconditional proportion of individuals who are susceptible to both events can be derived. These unconditional probabilities are obtained by using a combination of analytical and numerical integration techniques (see the on‐line appendix C). Second, in a mathematical SIRS model with recurrent events, the expressions become more complicated. Under the assumption of frailty‐independent event‐specific replenishment rates at time , i=1, 2, we havewhere and . Taking the expectation with respect to the components and yields an expression for the marginal unconditional proportions of individuals who are susceptible to each event:where = . For ease of presentation, the derivation of the joint probabilities is presented in the on‐line appendix C. Identifiability of time varying correlated frailty models, at the cost of parametric baseline hazard and decay functions, is not pursued theoretically but is assessed by means of simulations (see the on‐line appendix D), and relying on the general methodology of detecting parameter redundancy as proposed by Catchpole and Morgan (1997, 2001). A formal proof, and investigation of sufficient conditions for the models to be identifiable, is considered beyond the scope of this paper.

Results

In this section, the time varying frailty methodology is applied to bivariate CS data on HAV and HBV, and PVB19 and VZV, as introduced in Section 2. In general, for the bivariate CS data (t 1,t 2,δ 1,δ 2), the random variable follows a Bernoulli distribution with mean . The individual log‐likelihood contributions can be expressed in terms of joint susceptibility proportions as follows (Sun, 2006):where are the multinomial probabilities that are associated with the distribution of (Δ1,Δ2) and depending on the model parameters and that are associated with the baseline hazard functions, and the joint frailty distribution. In general, we havewhere δ 1 and δ 2 are equal to 0 when he or she is at risk for respectively the first and second event under consideration, and δ 1 and δ 2 equal 1 whenever the individual is not at risk. In both applications, the unit of time refers to the age a of individual j=1, … , n at the cross‐sectional sampling time for infection i= 1, 2. Furthermore, univariate monitoring times are present such that a 1=a 2=a . In infectious disease epidemiology, the hazard function is often termed the infection hazard or force of infection and can be estimated from serological survey data. In both applications, all available serology is included by means of a direct likelihood approach; individuals with a missing immunological status for one of the infections under consideration contribute to the log‐likelihood as follows:

Hepatitis A and B infections

Time‐dependent frailty models are fitted to the HAV and HBV serology. Gompertz infection‐specific baseline hazard functions λ (a ), i=1, 2, are considered here, i.e. λ (a )=ξ exp(ν a ), where ξ >0 and . In addition, both infections are presumed to confer lifelong immunity, entailing SIR infection dynamics. For more details on the models that were fitted to the Flemish serological survey data on HAV and HBV infections, and their names, we refer to Table 1.

Table 1

Overview of frailty models fitted to the Flemish seroprevalence data on HAV and HBV, and the Belgian seroprevalence data on PVB19 and VZV

Model	Infection process		Heterogeneity	Zijm*
	HAV	HBV
SGF	Immunizing	Immunizing	Time invariant	Shared
CGF	Immunizing	Immunizing	Time invariant	Correlated
TDSGF‐1C	Immunizing	Immunizing	One‐component time dependent with ϕ ₁₁=ϕ ₂₁	Shared
TDSGF‐1C^*	Immunizing	Immunizing	One‐component time dependent	Shared
TDCGF‐1C	Immunizing	Immunizing	One‐component time dependent with ϕ ₁₁=ϕ ₂₁	Correlated
TDPCSGF	Immunizing	Immunizing	Piecewise constant time dependent	Shared
	PVB19	VZV
SGF‐1	Immunizing	Immunizing	Time invariant	Shared
TDSGF‐1‐1C	Immunizing	Immunizing	One‐component time dependent	Shared
TDSGF‐1‐2C	Immunizing	Immunizing	Two‐component time dependent	Shared
SGF‐2	Non‐immunizing	Immunizing	Time invariant	Shared
TDSGF‐2‐1C	Non‐immunizing	Immunizing	One‐component time dependent	Shared
TDSGF‐2‐2C	Non‐immunizing	Immunizing	Two‐component time dependent	Shared

Overview of frailty models fitted to the Flemish seroprevalence data on HAV and HBV, and the Belgian seroprevalence data on PVB19 and VZV In Table 2, maximum likelihood estimates and standard error estimates for the model parameters are presented. Models are compared on the basis of the Akaike information criterion (AIC) and Bayesian information criterion (BIC). An improvement in model fit is observed when comparing age‐invariant frailty models (SGF and CGF) with their age‐dependent counterparts, underlining the importance of accommodating time varying heterogeneity. Leaving aside the piecewise constant model because of its different interpretation, shared frailty models with (TDSGF‐1C*) and without unequal frailty variances (TDSGF‐1C) provide the best fit among all fitted multiplicative models based on their respective AIC and BIC values respectively, though the correlated frailty model (TDCGF‐1C) yields a lower value for the log‐likelihood compared with the shared models, at the cost of an additional parameter. Two‐component shared frailty models (TDSGF‐2C) with either equal or unequal decay rates for both infections did not improve the model fit (see the on‐line appendix D). The time‐dependent piecewise constant shared gamma frailty model is considered to assess the goodness of fit of the imposed variance functions or, equivalently, the exponential decay functions . In model TDPCSGF, we selected four age groups [0, 12), [12, 25), [25, 65) and older than 65 years for which independent frailty terms are considered. Note that this model enables greater flexibility in modelling age‐dependent heterogeneity; however, the choice of the age intervals is subjective and influences the model fit. The piecewise constant model (TDPCSGF) outperforms all other presented frailty models on the basis of AIC, albeit that model TDSGF‐1C has a smaller BIC value. We clearly see that heterogeneity is highest at young ages. Variability decreases with time and increases again in the last age group, albeit that a limited amount of data is available therein (large standard error). Although slightly outperforming model TDSGF‐1C*, little evidence against an exponential decay in heterogeneity is obtained from model TDPCSGF. In spite of its improved fit, the biological interpretation of model TDPCSGF is not straightforward and independence of the frailty components is unrealistic. Nevertheless, fitting model TDPCSGF is useful to investigate the time varying shape of the frailty variance. The estimated multinomial probabilities for TDSGF‐1C* (the full curve) and TDPCSGF (the dotted curve) are shown in Fig. 2. The TDPCSGF fit deviates from the TDSGF‐1C* fit at higher ages. In Fig. 3, age‐dependent frailty variances for HAV (Fig. 3(a)) and HBV (Fig. 3(b)) are presented, signalling a faster decrease for hepatitis B.

Table 2

Maximum likelihood estimates and standard errors for the model parameters with corresponding AIC and BIC values†

Parameter	Estimates for the following models:
	SGF	CGF	TDSGF‐1C,	TDSGF‐1C ‡ ,	TDCGF‐1C,	TDPCSGF
			ϕ ₁₁=ϕ ₂₁	ϕ ₁₁≠ϕ ₂₁	ϕ ₁₁=ϕ ₂₁
ξ ₁	0.012 (0.001)	0.007 (0.001)	0.077 (0.029)	0.139 (0.093)	0.151 (0.105)	0.029 (0.006)
ν ₁	0.036 (0.005)	0.106 (0.017)	−0.012 (0.007)	−0.021 (0.009)	−0.022 (0.009)	0.009 (0.005)
ξ ₂	0.002 (4×10⁻⁴)	0.002 (4×10⁻⁴)	0.003 (0.001)	0.003 (0.001)	0.003 (0.001)	0.002 (4×10⁻⁴)
ν ₂	−0.003 (0.008)	−0.001 (0.014)	−0.009 (0.008)	−0.012 (0.009)	−0.008 (0.008)	−0.005 (0.008)
√γ ₁₁	0.725 (0.086)	1.651 (0.176)	5.843 (0.829)	6.444 (1.013)	6.606 (1.020)	3.671 (0.606)
√γ ₂₁	0.725 (0.086)	1.608 (2.272)	5.843 (0.829)	6.444 (1.013)	5.765 (0.831)	3.671 (0.606)
√γ _·2	—	—	—	—	—	2.421 (0.504)
√γ _·3	—	—	—	—	—	0.012 (0.160)
√γ _·4	—	—	—	—	—	8.813 (7.856)
ϕ ₁₁	—	—	0.034 (0.005)	0.026 (0.007)	0.025 (0.007)	—
ϕ ₂₁	—	—	0.034 (0.005)	0.044 (0.011)	0.025 (0.007)	—
ρ	1.000 (—)	0.497 (0.702)	1.000 (—)	1.000 (—)	0.871 (0.080)	1.000 (—)
AIC	5824.90	5794.89	5756.01	5755.52	5757.04	5749.01
BIC	5856.41	5838.99	5793.82	5799.62	5807.44	5799.42

†Minima are indicated by italics.

‡Unequal frailty variances.

Figure 2

Figure 3

Age‐dependent frailty standard deviations () in model TDSGF‐1C* and corresponding 95% confidence‐limits () for (a) HAV and (b) HBV

Estimated multinomial probabilities for models TDCGF‐1C* () and TDPCSGF ( ) when applied to the bivariate serological survey data on HAV and HBV infections in Flanders, Belgium, years 1993–1994: (a) ; (b) ; (c) ; (d) Age‐dependent frailty standard deviations () in model TDSGF‐1C* and corresponding 95% confidence‐limits () for (a) HAV and (b) HBV Maximum likelihood estimates and standard errors for the model parameters with corresponding AIC and BIC values† †Minima are indicated by italics. ‡Unequal frailty variances. Estimating the correlation between infection‐specific frailty terms could reflect to what extent latent processes, such as social contact behaviour of people, susceptibility to infection and infectiousness after infection, drive the more general infection process, or could reveal transmission via similar transmission routes when the correlation is perfect. Shared frailty models are restrictive in that they assume a perfect correlation implying common activity levels for both infections, i.e. the propensity to make contacts that are relevant for disease transmission, without accounting for unshared sources of heterogeneity, e.g. differences in susceptibility between infections. Here, little evidence against the hypothesis of a perfect correlation is present. Estimation of important epidemiological parameters such as the basic and effective reproduction number for HAV and HBV can be done by imposing mixing patterns via the so‐called ‘Who acquires infection from whom’ approach. For more details thereon, we refer to Anderson and May (1991) and Hens et al. (2012). In Section 4.2, we show the estimation of the reproduction numbers for PVB19 and VZV when relying on social contact data.

Parvovirus B19 and varicella zoster virus

The baseline infection hazard λ (a ) with respect to infection i is modelled by means of the time homogeneous mass action principle, which is a specific parametric hazard that is often used in infectious disease modelling. However, our approach is generally applicable to all types of parametric hazard, and therefore it is not mandatory for readers to grasp all the complexities of the mechanistic model. The mass action principle relates the infection hazard to social contact data through the specification of the so‐called social contact hypothesis, assuming it to depend on contact rates and a constant proportionality factor q . A thorough description of the mass action principle can be found in the on‐line appendix E. An important transmission parameter describing the spread of an infection is the so‐called basic reproduction number. The basic reproduction number R 0 is defined as the expected number of secondary cases produced by a single typical infectious individual during his or her entire infectious period when introduced into a fully susceptible population. Furthermore, R 0 represents an epidemiological threshold parameter in large populations for which a value larger than 1 implies that an infection can invade the population, whereas the infection will become extinct otherwise. More realistically, the effective reproduction number R e reflects the actual expected number of secondary cases in a population with pre‐existing immunity. The effective reproduction number R related to infection i, or alternatively R when S (a )=1, ∀ a , can be calculated as the leading eigenvalue of the next generation matrix (see, for example, Farrington et al. (2013a) and the on‐line appendix E). The results of fitting time varying bivariate shared frailty models to the Belgian serology on PVB19 and VZV are presented in Table 3. In addition to the parameter estimates and profile likelihood confidence limits, AIC and BIC values are shown for model comparison. Table 1 presents an overview of the models that were fitted to the Belgian serology on PVB19 and VZV.

Table 3

Model		Estimate	R^i0	R^ie	AIC	BIC
SGF‐1	q ₁₀	0.072 [0.069, 0.075]	3.60 [3.35, 3.88]	1.268 [1.209, 1.334]	4937.14	4955.51
	q ₂₀	0.200 [0.188, 0.214]	11.64 [10.59, 12.82]	1.488 [1.397, 1.591]
	γ	0.152 [0.118, 0.188]
TDSGF‐1‐1C	q ₁₀	0.072 [0.069, 0.076]	3.60 [3.22, 3.99]	1.268 [1.169, 1.375]	4939.14	4963.64
	q ₂₀	0.200 [0.183, 0.221]	11.64 [9.99, 13.49]	1.488 [1.348, 1.656]
	γ _·1	0.152 [0.100, 0.210]
	ϕ _·1	0.000 [0.000, 0.009]
TDSGF‐1‐2C	q ₁₀	0.066 [0.062, 0.071]	3.74 [3.15, 4.87]	1.735 [1.286, 2.894]	4912.08	4942.70
	q ₂₀	0.235 [0.191, 0.299]	15.65 [11.38, 24.08]	4.957 [2.417, 11.082]
	γ _·1	2.918 [1.524, 5.004]
	γ _·2	0.233 [0.156, 0.323]
	ϕ _·1	0.316 [0.246, 0.425]
SGF‐2	q ₁₀	0.071 [0.068, 0.074]	3.18 [2.97, 3.43]	1.100 [1.051, 1.157]	4869.83	4894.33
	σ	0.011 [0.008, 0.015]
	q ₂₀	0.173 [0.163, 0.183]	8.98 [8.22, 9.83]	1.207 [1.141, 1.282]
	γ	0.032 [0.002, 0.065]
TDSGF‐2‐1C	q ₁₀	0.065 [0.061, 0.070]	2.90 [2.64, 3.49]	1.142 [1.047, 2.565]	4862.93	4893.56
	σ	0.012 [0.009, 0.016]
	q ₂₀	0.158 [0.141, 0.179]	8.19 [7.15, 10.46]	1.890 [1.193, 4.225]
	γ _·1	1.470 [0.415, 3.498]
	ϕ _·1	0.330 [0.209, 0.530]
TDSGF‐2‐1C,	q ₁₀	0.065 [0.060, 0.070]	2.94 [2.60, 4.97]	1.242 [1.046, 3.405]	4863.83	4900.57
unequal frailty
variances
	σ	0.013 [0.009, 0.021]
	q ₂₀	0.154 [0.133, 0.175]	7.98 [6.76, 11.83]	1.873 [1.160, 6.605]
	γ _·1	1.646 [0.459, 6.443]
	ϕ ₁₁	0.239 [0.141, 0.648]
	ϕ ₂₁	0.377 [0.226, 0.677]
TDSGF‐2‐2C	q ₁₀	0.066 [0.063, 0.071]	3.30 [2.79, 4.45]	1.453 [1.083, 2.706]	4859.26	4896.01
	σ	0.011 [0.007, 0.015]
	q ₂₀	0.193 [0.156, 0.257]	11.27 [8.11, 18.90]	3.304 [1.473, 8.897]
	γ _·1	2.419 [0.839, 4.960]
	γ _·2	0.095 [0.017, 0.186]
	ϕ _·1	0.303 [0.226, 0.423]
TDSGF‐2‐2C,	q ₁₀	0.066 [0.063, 0.081]	3.40 [2.78, 6.17]	1.585 [1.072, 4.107]	4860.73	4903.60
unequal frailty
variances
	σ	0.012 [0.007, 0.020]
	q ₂₀	0.188 [0.151, 0.251]	10.98 [7.85, 19.47]	3.257 [1.416, 10.200]
	γ _·1	2.554 [0.861, 5.994]
	γ _·2	0.095 [0.016, 0.181]
	ϕ ₁₁	0.249 [0.160, 0.706]
	ϕ ₂₁	0.327 [0.228, 0.486]

†The minima are indicated in italics.

Maximum likelihood estimates for the model parameters as well as for the basic and effective reproduction numbers R and R (PVB19, i=1; VZV, i=2), with 95% profile likelihood confidence intervals in square brackets and AIC and BIC values† †The minima are indicated in italics. Although the SGF‐2 and TDSGF‐2 frailty models cover more complex disease dynamics, the improved fit to the data reveals that the assumption of lifelong immunity for PVB19 seems questionable. Furthermore, the age‐dependent frailty models that are described in this paper tend to outperform their age invariant counterparts. This leads to the conclusion that variability is indeed higher in the younger age groups and decreases with age until a constant frailty variance γ 2 is achieved (Fig. 4). A graphical illustration of the TDSGF‐2‐2C model fit can be found in the on‐line appendix C.

Figure 4

Age‐dependent frailty variance () for model TDSGF‐2‐2C and corresponding 95% confidence limits () when applied to the bivariate serological survey data on PVB19 and VZV in Belgium, years 2001–2003

Discussion

First, this paper presents an extension of traditional shared and correlated frailty models to be applicable in the context of non‐immunizing infections (i.e. infections conferring temporary humoral immunity on recovery) while allowing for time varying heterogeneity in their acquisition. The approach proposed combines frailty methodology for non‐immunizing infections with time varying heterogeneity, at least while assuming a shared frailty with respect to multiple infections. The use of a time varying correlated frailty model, however, is complicated by the non‐identifiability of the unshared components in the additive decomposition of the infection‐specific frailty terms. Identifiability of the time invariant correlated gamma frailty model was formally established for bivariate time‐to‐event data in the presence of covariates by Iachine (2004), and a simulation‐based assessment for CS data ensures identifiability at the cost of parametric baseline hazards (Hens et al., 2009). Nevertheless, identifiability in the time varying setting for recurrent infections seems questionable. In addition to the aforementioned time varying frailty models, we propose novel multiplicative time varying correlated frailty models for infections conferring lifelong immunity. Expressions for the time‐dependent correlation coefficients are derived for the two‐component models, albeit that those can be readily generalized. Age‐dependent shared frailty models are fitted to bivariate serological data on childhood infections PVB19 and VZV, and these models were compared with age invariant models either with or without accommodating potential PVB19 reinfections. The results support the hypothesis of potential PVB19 reinfections as argued by Abrams and Hens (2015). In addition, age‐dependent shared frailty models outperform their age invariant counterparts. In general, the age‐dependent shared gamma frailty models with potential reinfection processes for PVB19 (TDSGF‐2‐1C and TDSGF‐2‐2C) were found to be the best‐fitting models to the data at hand, and estimates of the reproduction numbers for PVB19 and VZV were in line with those derived in the presence of age invariant individual heterogeneity (Abrams and Hens, 2015). Childhood infections are typically highly correlated within individuals in early childhood, which was observed as well for PVB19 and VZV, with the correlations persisting into adulthood only for infections having similar transmission routes (Farrington et al., 2013b). Heterogeneity in childhood could be explained by confounding of different transmission routes and could be due to the nature of contacts made at young ages. At older ages, however, behaviour and environmental factors tend to change such that transmission routes become differentiated and as social factors such as school attendance intervene the heterogeneity drops either to 0 (for infections with different transmission routes) or to some constant (reflecting variability in behaviour associated with a common route of transmission). In addition to confounding of different transmission routes, demographic factors, such as differences in socio‐economic status or ethnicity, could explain some of the unobserved individual heterogeneity at young ages, albeit that such information was absent in our data. In shared frailty models, frailty terms have the interpretation of activity levels, meaning that individuals with a high social activity level are more likely to acquire both infections. On top of that, susceptibility to infection is presumed to be equal for both infections. For infections with different routes of transmission and/or different levels of susceptibility, correlated frailties are more suited to describe heterogeneity as well as association in the acquisition of both infections. In addition, age‐dependent shared and correlated frailty models are applied to bivariate serological survey data on HAV and HBV. In this application, we relied on the assumption of lifelong immunity for both infections. Time varying frailty terms improve the model fit substantially, and a decreasing frailty variance with age is observed. Although the piecewise constant shared gamma frailty model TDPCSGF allows for varying frailty variances within different age groups, its biological interpretation is difficult and the assumption of mutual independence between frailty components is unrealistic. In this setting, one‐component age‐dependent correlated gamma frailty models are found to be identifiable at the cost of parametric baseline hazards and common decay parameters for both infections. Identifiability of the frailty models is investigated on the basis of the general methodology of detecting parameter redundancy (Catchpole and Morgan, 1997, 2001). A formal proof of identifiability in the context of immunizing infections is beyond the scope of this paper. Note that the underlying transmission model for HBV did not include a carrier state in which individuals remain infective for life. Because of the low prevalence of HBV in Flanders, we believe that this simplification, albeit unrealistic to some extent, does not have a large effect on the results that are presented in this paper. The proposed time varying frailty models are based on an explicit parametric model for the frailty variables in terms of independent gamma frailty components. This was suggested by Farrington et al. (2012) due to empirical evidence of a decreased heterogeneity with time (age) for several childhood infections. However, further research is required to test the hypothesis of an increased variability in the acquisition of (childhood) infections earlier in life, especially when these frailty models are applied to infections that mainly affect adolescents and the elderly. Although the computationally convenient gamma frailty distribution is considered throughout this paper, other frailty distributions could be used without additional work whenever closed form Laplace transforms are available. Frailty distributions without a closed form Laplace transform could be considered as well; however, multivariate numerical integration techniques are required to integrate out the random frailties, thereby increasing the computational burden. A shortcoming of the multiplicative time‐dependent frailty models is a time dependence of the range of which is required to maintain . Recently, Enki et al. (2014) proposed a new family of time varying shared frailty models, based on power transformations rather than linear models, overcoming the disadvantage of a time‐dependent frailty range in the context of immunizing infections. Further research could entail improvements in modelling time‐dependent heterogeneity for non‐immunizing infections by using these frailty models, albeit that preliminary analyses yield similar results (which are not shown) in terms of model fit. Although copula and counting process models could be used instead of (time varying) frailty models for the analysis of bivariate survival data, these approaches are not pursued in this paper. The connection between frailty and copula models has been studied by, for example, Goethals et al. (2008). For more information about the counting process approach, we refer to Aalen et al. (2008). ‘Supplementary Materials for “Modelling time‐varying heterogeneity in recurrent infection processes: an application to serological data”’. Click here for additional data file.

13 in total

1. Prevalence of hepatitis A, B and C in the Flemish population.

Authors: M Beutels; P Van Damme; W Aelvoet; J Desmyter; F Dondeyne; C Goilav; R Mak; L Muylle; D Pierard; A Stroobant; F Van Loock; P Waumans; R Vranckx
Journal: Eur J Epidemiol Date: 1997-04 Impact factor: 8.082

2. Correlated individual frailty: an advantageous approach to survival analysis of bivariate data.

Authors: A I Yashin; J W Vaupel; I A Iachine
Journal: Math Popul Stud Date: 1995 Impact factor: 0.720

3. Modelling multisera data: the estimation of new joint and conditional epidemiological parameters.

Authors: N Hens; M Aerts; Z Shkedy; H Theeten; P Van Damme; Ph Beutels
Journal: Stat Med Date: 2008-06-30 Impact factor: 2.373

4. The correlated and shared gamma frailty model for bivariate current status data: an illustration for cross-sectional serological data.

Authors: N Hens; A Wienke; M Aerts; G Molenberghs
Journal: Stat Med Date: 2009-09-30 Impact factor: 2.373