
Understanding the impact of correlation within pair-bonds on Cormack-Jolly-Seber models.

Alexandru M Draghici¹, Wendell O Challenger², Simon J Bonner¹.

Abstract

The Cormack-Jolly-Seber (CJS) model and its extensions have been widely applied to the study of animal survival rates in open populations. The model assumes that individuals within the population of interest have independent fates. It is, however, highly unlikely that a pair of animals which have formed a long-term pairing have dissociated fates. We examine a model extension which allows animals who have formed a pair-bond to have correlated survival and recapture fates. Using the proposed extension to generate data, we conduct a simulation study exploring the impact that correlated fate data has on inference from the CJS model. We compute Monte Carlo estimates for the bias, range, and standard errors of the parameters of the CJS model for data with varying degrees of survival correlation between mates. Furthermore, we study the likelihood ratio test of sex effects within the CJS model by simulating densities of the deviance. Finally, we estimate the variance inflation factor $\hat{c}$ for CJS models that incorporate sex-specific heterogeneity. Our study shows that correlated fates between mated animals may result in underestimated standard errors for parsimonious models, significantly deflated likelihood ratio test statistics, and underestimated values of $\hat{c}$ for models taking sex-specific effects into account. Underestimated standard errors can result in lowered coverage of confidence intervals. Moreover, deflated test statistics will provide overly conservative test results. Finally, underestimated variance inflation factors can lead researchers to make incorrect conclusions about the level of extra-binomial variation present in their data.


Keywords: Cormack–Jolly–Seber models; correlated fates; goodness‐of‐fit testing; nested models; overdispersion; pair‐bonds; variance inflation factors

Year: 2021 PMID: 34141196 PMCID: PMC8207451 DOI: 10.1002/ece3.7329

Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912

INTRODUCTION

Mark–recapture experiments are a well‐known and effective method of studying the demographics of wildlife populations (Burnham et al., 1987; King, 2014; King et al., 2009; McCrea, 2014; Seber & Schofield, 2019). Mark–recapture data are collected by capturing individuals from the population at several repeated sampling occasions, marking them with a unique identifier, recording their encounter history, and then releasing them back into the study region (see McCrea, 2014; Seber & Schofield, 2019). The data collected from these studies are typically analyzed by fitting capture–recapture models to generate estimates of the demographic rates pertaining to the open population under study (see Burnham et al., 1987; King, 2014; King et al., 2009; McCrea, 2014; Seber & Schofield, 2019). Most open population models fall within the framework of the Cormack–Jolly–Seber (CJS) model (Cormack, 1964; Jolly, 1965; Seber, 1965). The key assumptions of the CJS model are that survival and recapture fates at any point in the study are constant between animals, all marked animals are correctly recorded, capture–release events are instantaneous (or approximately so), emigration from the sampling region is permanent, and fates of animals are independent of one another (see Seber & Schofield, 2019). Data collected from populations of animals that exhibit complex behaviors are often in violation of the original assumptions of the CJS model. Extensions intended to relax the assumption of constant survival and recapture fates among animals include accounting for heterogeneity with individual‐specific covariates (Gimenez & Barbraud, 2017; Lebreton et al., 1992; Pledger et al., 2003; Royle, 2008), multiple strata (Arnason, 1973), missing covariates (Bonner & Schwarz, 2006), and random effects (see, e.g., Pledger et al., 2003; Royle, 2008). 
However, nearly all capture–recapture models assume that fates of animals are independent during the sampling period (consider Anderson et al., 1994; Bischof et al., 2020; King, 2014; Lebreton et al., 1992; McCrea, 2014; Seber & Schofield, 2019). Long‐term pair‐bonds are common among avian species in which a portion of the life‐history pattern is shared between mates (see, for instance, Culina et al., 2013; Maness & Anderson, 2008; Rebke et al., 2017). It is likely that there is correlation between survival or recapture fates for the individuals within a pair (Anderson et al., 1994; Lebreton et al., 1992). Consider, for instance, a motivating example of Harlequin ducks (Histrionicus histrionicus), which are waterfowl that typically mate for life (Smith et al., 1996). These ducks migrate from their wintering grounds to their breeding grounds with their partners and mostly stay together during the breeding season (Smith et al., 1996). Male Harlequin ducks within a pair‐bond have been shown to be extra‐vigilant in monitoring their nesting partner, which has been theorized to improve the survival likelihood of the female (Bond et al., 2009). Furthermore, a study designed to monitor a population that forms pair‐bonds would likely be performed at the breeding ground due to ease of access. As a consequence, the probability of capturing both individuals within a pair will likely be elevated because mates remain in close proximity to one another (Lebreton et al., 1992). That said, in some cases, the opposite may be true. For instance, if the male of a pair‐bond is foraging nearby, he may flee when he observes his nesting mate being captured by a research team gathering mark–recapture data. Given these points, it is reasonable to suspect that the recapture fates of paired individuals may be either negatively or positively correlated.
The shared life history and elevated joint capture probability of paired individuals constitute a violation of the standard assumption of independence within capture–recapture models that do not separate their demographic parameters by sex. Many animals are known to form complex social structures that go beyond that of a pair‐bond. Lowland gorillas, for instance, form harems with one silver‐back male and several females (Hagemann et al., 2019). Another highly social vertebrate is the sperm whale, a mammal that can form multilevel social structures based on smaller long‐term groups called social units (Konrad et al., 2018). Social units comprise either a female and younger whales (typically offspring), or a group of mature males (Konrad et al., 2018). As a final example, Dungan et al. (2016) showed that the social alignment of Indo‐Pacific humpback dolphins, a small and isolated population, is centralized around mother–calf rearing groups and that they form both long‐term (years) and short‐term (hours‐days) social associations. As such, failing to account for dependence within populations that contain long‐term social groupings may result in overestimation of the true precision for parameter estimates of common mark–recapture models (see any of Anderson et al., 1994; Bischof et al., 2020; Lebreton et al., 1992). In this work, we conduct a simulation study to examine the effects that dependence between mated pairs has on inference from the CJS model. Motivated by a long‐term mark–recapture study of Harlequin ducks at the McLeod River region in Alberta, Canada, Challenger (2010) proposed an extension to the CJS framework by introducing a correlation parameter, $\rho$, to account for the dependence in the recapture events within pairs. Using the work done in Challenger (2010) as the basis for our proposed extension to the CJS model, we introduce another correlation parameter, $\gamma$, that accounts for dependence in survival events of pair‐bonded animals.
Furthermore, we also allow all pairs to undergo periods of temporary separation when they choose not to breed due to, for instance, external stressors such as lack of food or increased predation (see, e.g., Ludwig & Becker, 2008). During a period of temporary separation, our model treats individuals within a pair as having independent survival and recapture events. In our simulation study, we assess the standard CJS model's ability to compute accurate demographic estimates for varying levels of survival correlation between mates. Using our proposed extension to generate correlated mark–recapture data, we compute estimates from the standard CJS model and consider the bias, precision, and width of the confidence intervals as survival correlation between pairs increases. Furthermore, our study considered whether asymptotic assumptions of the likelihood ratio test hold when comparing group‐specific CJS models against reduced CJS models in the presence of mated correlation. Finally, we assess the ability of the variance correction (Lebreton et al., 1992) to detect and address the issue of overdispersion due to dependent fates among mated pairs.

MATERIALS AND METHODS

Model definition

Rather than monitoring all individuals within a mark–recapture dataset, we consider a collection of entities. An entity is either a set of two animals, male and female, that have formed a pair‐bond or a single animal that has not formed a pair‐bond (originally discussed in Challenger, 2010). We assume that the recapture and survival fates are independent between entities and that individuals within a pair‐bond are strictly monogamous (Challenger, 2010). Furthermore, if an individual within a pairing perishes at some discrete sampling occasion $t \in \{1, \ldots, T\}$, in which $T$ is the total number of occasions, then the widowed partner will not seek out a mate during the remainder of the study period (Challenger, 2010). Finally, we condition on the first capture of either individual in an entity in a manner similar to the standard CJS model. When conditioning on the first capture for a pair‐bond, the individuals within the pairing are assumed to have become mates before entering the study (Challenger, 2010). For the following subsections, consider some fixed entity $i$ at some sampling occasion $t$.

Temporary separation process

Let the indicator variable $d_{i,t}$ denote the event that pair $i$ remain together from time $t-1$ to $t$ and $\delta_{i,t} = P(d_{i,t} = 1)$. If a paired entity is temporarily separated, then it is assumed that its members' fates are independent from one another between the sampling periods $t-1$ to $t$. This process occurs before the survival and recapture step at every sampling occasion. Finally, note that if entity $i$ consists of a single individual (widowed or unmated), then $d_{i,t} = 0$.

Survival process

In the standard CJS model, it is assumed that the time‐dependent survival process is governed by a Bernoulli distribution, conditioned on the previous survival state (Lebreton et al., 1992). Let $a_{j,t}$ be the event that individual $j$ both survived and remained in the study area from time $t-1$ to $t$. The probability of surviving from $t-1$ to $t$, given that the individual is alive and present at $t-1$, is $\phi_t$. If the individual is dead or has emigrated at time $t-1$, they remain so at subsequent time points. For this extension, we assume that males and females may have distinct probabilities of survival from time $t-1$ to $t$. Let $\phi^G_{i,t}$ be the probability that the individual of sex $G \in \{M, F\}$ of entity $i$ survives from time $t-1$ to $t$. For pair‐bonded entities, there are four different survival states in the model: both members survive, only the female survives, only the male survives, or neither survives (Challenger, 2010). This is represented in the state vector $\mathbf{a}_{i,t} = (a^M_{i,t}, a^F_{i,t})$ indicating the possible survival outcomes for entity $i$ at time $t$, in which $a^M_{i,t}$ is the indicator that the male of entity $i$ is alive at time $t$ and $a^F_{i,t}$ is similarly defined for the female of pair $i$. If both partners are alive at $t-1$, then the distribution of $\mathbf{a}_{i,t}$ is governed by a joint Bernoulli distribution with dependent variables (see Appendix A1 for the derivation). The parameters of this distribution are as follows: $\phi^{MF}_{i,t} = \phi^M_{i,t}\phi^F_{i,t} + \gamma_{i,t}\sigma^M_{i,t}\sigma^F_{i,t}$ is the probability that both members of entity $i$ survive from $t-1$ to $t$; $\phi^{G0}_{i,t} = \phi^G_{i,t}(1 - \phi^{G'}_{i,t}) - \gamma_{i,t}\sigma^M_{i,t}\sigma^F_{i,t}$, with $G'$ the opposite sex to $G$, is the probability that only the individual of sex $G$ survives from $t-1$ to $t$ given that both members were alive at time $t-1$; and $\phi^{00}_{i,t} = (1 - \phi^M_{i,t})(1 - \phi^F_{i,t}) + \gamma_{i,t}\sigma^M_{i,t}\sigma^F_{i,t}$ is the probability that both members of entity $i$ perish between times $t-1$ and $t$, where $\sigma^G_{i,t} = \sqrt{\phi^G_{i,t}(1 - \phi^G_{i,t})}$ is the standard deviation of the survival event for the individual of sex $G$ in entity $i$ at time $t$ and $\gamma_{i,t}$ is the correlation coefficient for survival of pair $i$ from $t-1$ to $t$ (see Appendix A2 for the derivation of the bounds and definitions of the odds ratio (OR) and the odds product (OP)). Finally, we condition on $d_{i,t}$ such that if there is temporary separation ($d_{i,t} = 0$), then the correlation coefficient becomes zero and the distribution of $\mathbf{a}_{i,t}$ becomes the product of two independent Bernoulli variables.
Now the partially observed survival process for entity $i$ at time $t$ can be described with a multinomial distribution over the four survival states: given that both partners are alive at $t-1$, the state at time $t$ is a single multinomial draw with cell probabilities $(\phi^{MF}_{i,t}, \phi^{M0}_{i,t}, \phi^{F0}_{i,t}, \phi^{00}_{i,t})$, in which $\gamma_{i,t}$ is replaced by $d_{i,t}\gamma_{i,t}$.
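The joint Bernoulli construction described above can be sketched numerically. The following is a minimal illustration, not the authors' code: the function names are hypothetical, and the feasible correlation range is computed from the Fréchet bounds on the probability that both members survive, which is assumed here to be equivalent to the odds ratio and odds product bounds referenced in Appendix A2.

```python
import math


def joint_bernoulli(prob_m, prob_f, corr):
    """Cell probabilities of two Bernoulli outcomes with a given correlation.

    prob_m, prob_f: marginal success probabilities (male, female).
    corr: correlation coefficient between the two indicator variables.
    """
    sd_m = math.sqrt(prob_m * (1.0 - prob_m))
    sd_f = math.sqrt(prob_f * (1.0 - prob_f))
    cov = corr * sd_m * sd_f  # covariance implied by the correlation
    cells = {
        "both": prob_m * prob_f + cov,
        "male_only": prob_m * (1.0 - prob_f) - cov,
        "female_only": (1.0 - prob_m) * prob_f - cov,
        "neither": (1.0 - prob_m) * (1.0 - prob_f) + cov,
    }
    # A negative cell means the requested correlation is not attainable.
    if min(cells.values()) < -1e-12:
        raise ValueError("correlation outside the feasible range")
    return cells


def correlation_bounds(prob_m, prob_f):
    """Smallest and largest correlation compatible with the marginals."""
    sd_prod = math.sqrt(prob_m * (1.0 - prob_m) * prob_f * (1.0 - prob_f))
    both_min = max(0.0, prob_m + prob_f - 1.0)  # Frechet lower bound
    both_max = min(prob_m, prob_f)              # Frechet upper bound
    base = prob_m * prob_f
    return (both_min - base) / sd_prod, (both_max - base) / sd_prod


if __name__ == "__main__":
    print(joint_bernoulli(0.7, 0.7, 0.5))
    print(correlation_bounds(0.7, 0.7))
```

Setting `corr` to zero recovers the product of two independent Bernoulli variables, which is the behavior the model assumes during temporary separation. When the two marginals are equal, the upper bound is one, but the lower bound is generally greater than minus one.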

Recapture process

As in the standard CJS model, we assume that the observation process is governed by a Bernoulli distribution conditioned on the current survival state (Lebreton et al., 1992). Let $x_{j,t}$ be the event that individual $j$ was recaptured at time $t$. The probability of being recaptured at time $t$, given that the individual is alive and present at $t$, is $p_t$. For this extension, we assume that males and females may have distinct recapture probabilities at time $t$. Let $p^G_{i,t}$ be the probability that the individual of sex $G$ of entity $i$ is recaptured at time $t$. There are four different recapture outcomes for paired entities in the model: both members are observed, only the female is observed, only the male is observed, or neither is observed (Challenger, 2010). The possible recapture outcomes for entity $i$ at time $t$ can be represented by the vector $\mathbf{x}_{i,t} = (x^M_{i,t}, x^F_{i,t})$, in which $x^M_{i,t}$ is the indicator that the male of entity $i$ is recaptured at time $t$ and $x^F_{i,t}$ is defined analogously for the female. If both partners are alive, then the distribution of $\mathbf{x}_{i,t}$ is governed by a joint Bernoulli distribution with dependent variables (see Appendix A1 for the derivation). The parameters of this distribution are as follows: $p^{MF}_{i,t} = p^M_{i,t}p^F_{i,t} + \rho_{i,t}\tau^M_{i,t}\tau^F_{i,t}$ is the probability that both members in pair $i$ are captured at time $t$; $p^{G0}_{i,t} = p^G_{i,t}(1 - p^{G'}_{i,t}) - \rho_{i,t}\tau^M_{i,t}\tau^F_{i,t}$, with $G'$ the opposite sex to $G$, is the probability that only the individual of sex $G$ is captured at time $t$, given that both members were alive at time $t$; and $p^{00}_{i,t} = (1 - p^M_{i,t})(1 - p^F_{i,t}) + \rho_{i,t}\tau^M_{i,t}\tau^F_{i,t}$ is the probability that both members of pair $i$ are unobserved at time $t$, where $\tau^G_{i,t} = \sqrt{p^G_{i,t}(1 - p^G_{i,t})}$ is the standard deviation of recapture for the individual of sex $G$ in entity $i$ at time $t$ and $\rho_{i,t}$ is the correlation coefficient for recapture between members of pair $i$ at time $t$. Finally, we condition on $d_{i,t}$ such that if there is temporary separation, then the correlation coefficient becomes zero and the distribution of $\mathbf{x}_{i,t}$ becomes the product of two independent Bernoulli variables. The recapture process for entity $i$ at time $t$ can then be described with a multinomial distribution over the four recapture outcomes: given that both partners are alive at $t$, the outcome is a single multinomial draw with cell probabilities $(p^{MF}_{i,t}, p^{M0}_{i,t}, p^{F0}_{i,t}, p^{00}_{i,t})$, in which $\rho_{i,t}$ is replaced by $d_{i,t}\rho_{i,t}$.

Simulation study

Data generating process

To study the impact of dependence between mated individuals on the standard CJS model, we used the statistical programming software R (R Core Team, 2020) to generate samples from the extended model (detailed in Section 2.1) for each of the following parameter settings: a fixed sample size $n$; a fixed number of sampling occasions $T = 4$; a fixed probability $\delta$ of remaining together for mated pairs; fixed survival probabilities $\phi^M = \phi^F = 0.7$; fixed recapture probabilities $p^M = p^F = 0.8$; a grid of survival correlations $\gamma$; and a grid of recapture correlations $\rho$, in which these settings hold for all entities $i$ and occasions $t$. Moreover, we simulated the sex of each animal with an unbiased coin toss. We assumed that all individuals were marked on the first occasion (a single cohort) and that there are as many pairings as possible. Specifically, if there were $n_M$ simulated males and $n_F \leq n_M$ females, there would be $n_F$ mated pairs, $n_M - n_F$ unmated males, and a total of $n_M$ entities in our sample. Finally, we assumed that there was no temporal variation across all parameters. Given this, we omit the subscripts $i$ and $t$ going forward. Note that the case in which $\gamma = 0$ and $\rho = 0$ is equivalent to the standard CJS model.
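The data generating process for a single mated pair can be sketched as follows. This is a minimal Python illustration rather than the R code used in the study, which is not reproduced here. The function names and argument names are hypothetical, unmated individuals are omitted for brevity, and both sexes are assumed to share the same survival probability `phi` and recapture probability `p`.

```python
import math
import random


def draw_pair(rng, prob_m, prob_f, corr):
    """Draw (male, female) indicators from a joint Bernoulli distribution."""
    cov = corr * math.sqrt(prob_m * (1 - prob_m) * prob_f * (1 - prob_f))
    p_both = prob_m * prob_f + cov
    p_m_only = max(0.0, prob_m * (1 - prob_f) - cov)  # guard against rounding
    p_f_only = max(0.0, (1 - prob_m) * prob_f - cov)
    u = rng.random()
    if u < p_both:
        return 1, 1
    if u < p_both + p_m_only:
        return 1, 0
    if u < p_both + p_m_only + p_f_only:
        return 0, 1
    return 0, 0


def simulate_pair(rng, n_occasions, phi, p, gamma, rho, delta):
    """Capture histories for one mated pair, both marked on occasion 1."""
    alive = [1, 1]
    hist_m, hist_f = [1], [1]
    for _ in range(1, n_occasions):
        # Temporary separation step: only relevant while both are alive.
        together = alive == [1, 1] and rng.random() < delta
        surv_corr = gamma if together else 0.0
        recap_corr = rho if together else 0.0
        # Survival step: a dead animal stays dead.
        s_m, s_f = draw_pair(rng, phi, phi, surv_corr)
        alive = [alive[0] * s_m, alive[1] * s_f]
        # Recapture step: only live animals can be observed.
        c_m, c_f = draw_pair(rng, p, p, recap_corr)
        hist_m.append(alive[0] * c_m)
        hist_f.append(alive[1] * c_f)
    return hist_m, hist_f


if __name__ == "__main__":
    rng = random.Random(2021)
    for _ in range(5):
        print(simulate_pair(rng, 4, 0.7, 0.8, 0.6, 0.0, 0.9))
```

With `gamma = rho = 1` and `delta = 1`, the two histories in a pair coincide, which is the duplicated-data extreme. With `gamma = rho = 0`, each animal follows an ordinary CJS process.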

Data modeling process

We used the standard CJS model to compute estimates of survival and recapture rates, goodness‐of‐fit statistics, and overdispersion corrections of the data we simulated from the extended model (Section 2.1) using program MARK (White & Burnham, 1999), a popular mark–recapture modeling software among ecological researchers, with the R library RMark (Laake, 2013). We consider the following parameter settings of the standard CJS model: $(\phi, p)$, $(\phi, p^G)$, $(\phi^G, p)$, and $(\phi^G, p^G)$, in which, using the notation discussed in Burnham et al. (1987), $\phi^G$ denotes a sex‐specific effect for survival and $p^G$ denotes a sex‐specific effect for recapture. For instance, $(\phi^G, p)$ represents the case in which the standard CJS model has a sex‐specific effect for survival probability and a common recapture rate for both sexes.

Standard metrics to assess model performance

To study the impact that varying levels of survival correlation within mark–recapture data have on estimates of survival rates, we computed the range (width) and coverage percentage of the corresponding 95% confidence intervals, along with the relative bias of the survival estimates. The results were computed across a grid of survival correlations $\gamma$ for model $(\phi, p)$. Furthermore, we present the percent coverage of the 95% confidence intervals for each of the four model settings listed in the data modeling process above. Finally, in order to better isolate the impact of correlation within entities on the hidden state process, we set the recapture correlation between mated pairs to zero ($\rho = 0$). Let $N$ denote the number of replicate data sets for each scenario and let $\hat{\phi}_k$ represent the estimate of $\phi$ from the $k$th replicate. Let $U_k$ and $L_k$ denote the values of the upper and lower bounds of the 95% confidence interval of $\phi$ from the $k$th replicate, respectively. Our computed simulation study metrics are then the mean relative bias, $\frac{1}{N}\sum_{k=1}^{N} (\hat{\phi}_k - \phi)/\phi$; the mean relative 95% CI width, based on $U_k - L_k$; and the percent coverage of the 95% CI, $\frac{100}{N}\sum_{k=1}^{N} I(L_k \leq \phi \leq U_k)$, in which $I(\cdot)$ denotes the indicator function of some event occurring.
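The three metrics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, and the interval width is assumed to be scaled by the true survival probability, since the exact normalization of the relative width is not recoverable from the text.

```python
def summarize_replicates(estimates, lowers, uppers, truth):
    """Monte Carlo summaries of replicate estimates and 95% CI bounds.

    estimates, lowers, uppers: per-replicate point estimates and CI bounds.
    truth: the parameter value used to generate the data.
    """
    n_rep = len(estimates)
    # Mean relative bias: average of (estimate - truth) / truth.
    rel_bias = sum((est - truth) / truth for est in estimates) / n_rep
    # Mean relative CI width (assumed normalization: divide by the truth).
    rel_width = sum((u - l) / truth for l, u in zip(lowers, uppers)) / n_rep
    # Percent coverage: share of intervals that contain the truth.
    covered = sum(1 for l, u in zip(lowers, uppers) if l <= truth <= u)
    return {
        "mean_relative_bias": rel_bias,
        "mean_relative_ci_width": rel_width,
        "percent_coverage": 100.0 * covered / n_rep,
    }


if __name__ == "__main__":
    # Toy numbers for illustration only, not results from the study.
    print(summarize_replicates(
        estimates=[0.68, 0.72, 0.70],
        lowers=[0.60, 0.71, 0.62],
        uppers=[0.76, 0.80, 0.78],
        truth=0.7,
    ))
```

Unbiased estimates combined with coverage below 95% would point to understated standard errors rather than to biased point estimates.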

The likelihood ratio test in mark–recapture modeling

The likelihood ratio test (LRT) is a statistical test used to compare a general model against a nested model that exists on a reduced parameter space (Anderson et al., 1994; Lebreton et al., 1992). The test determines whether the reduced model captures a sufficient amount of variability relative to the general model (Anderson et al., 1994; Lebreton et al., 1992). Consider a case of the CJS model in which we are testing whether survival varies by sex and we assume that recapture does not. Then, our hypothesis test can be expressed as $H_0: \phi^M = \phi^F$ versus $H_1: \phi^M \neq \phi^F$. The likelihood ratio statistic is defined as the ratio between the likelihood maximized over the null hypothesis and the likelihood maximized over the alternative (Anderson et al., 1994; Lebreton et al., 1992): $\Lambda = \sup_{H_0} L(\phi, p) / \sup_{H_1} L(\phi^G, p)$. The test statistic, called the deviance, is then $D = -2 \log \Lambda$. Under the null hypothesis, the deviance asymptotically follows the chi‐squared distribution with degrees of freedom equal to the difference in degrees of freedom between the general and reduced models (Anderson et al., 1994; Lebreton et al., 1992). In our example, we have one degree of freedom and our $p$‐value is then computed as $P(X \geq D)$, in which $X \sim \chi^2_1$. Moreover, by the probability integral transformation theorem, we know that the $p$‐value is uniformly distributed on $(0, 1)$ under the null hypothesis. In our study, we compared the probability densities of both the deviance statistic and the corresponding $p$‐value for both the LRT comparing $(\phi, p)$ against $(\phi^G, p)$ and the LRT comparing $(\phi, p)$ against $(\phi, p^G)$ across $\gamma$ with a fixed value of $\rho$. We investigated whether dependence between mated pairs in mark–recapture data impacted the ability of the LRT to perform reliable model selection.
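As a small worked illustration of the test described above, the deviance and its $p$‐value for a one degree of freedom comparison can be computed with the Python standard library alone. The function names are hypothetical and the log‐likelihood values in the example are made up. The tail probability uses the identity $P(\chi^2_1 \geq x) = \mathrm{erfc}(\sqrt{x/2})$, which holds because a $\chi^2_1$ variable is the square of a standard normal variable.

```python
import math


def lrt_deviance(loglik_reduced, loglik_general):
    """Deviance D = -2 * (log-likelihood of reduced - log-likelihood of general)."""
    return -2.0 * (loglik_reduced - loglik_general)


def chi2_sf_df1(x):
    """Upper tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods for (phi, p) and (phi^G, p).
    deviance = lrt_deviance(-105.0, -103.0)
    print(deviance, chi2_sf_df1(deviance))
```

If the deviance really follows a $\chi^2_1$ distribution, applying `chi2_sf_df1` to many simulated deviances should give approximately uniform $p$‐values. A pile‐up of $p$‐values near one is the signature of a deflated deviance.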

The $\hat{c}$ correction in mark–recapture models

When mark–recapture data are thought to violate the model assumption of regular binomial variation, an estimate of the variance inflation factor, called $\hat{c}$, can be computed to assess the level of overdispersion in the model. Under appropriate binomial variation, data that emerged from the CJS model would give a result of $\hat{c} = 1$ (Anderson et al., 1994). On the other hand, $\hat{c} > 1$ suggests that the data have excess variation, implying that either the model structure is inadequate or the underlying model assumptions have been violated (Anderson et al., 1994). One well‐known consequence of overdispersion due to the dependent fates of individuals is that standard error estimates will be understated by the CJS model (see Anderson et al., 1994; Bischof et al., 2020). The recommended approach to dealing with this in practice is to scale up the standard error by a factor of $\sqrt{\hat{c}}$ (Anderson et al., 1994; Lebreton et al., 1992; Pradel et al., 2005). Furthermore, Anderson et al. (1994) have shown that the presence of overdispersion due to data replication can impact goodness‐of‐fit testing by inflating the deviance statistic, which increases the type I error rate of the LRT. There are three popular estimators of overdispersion in mark–recapture modeling (Cooch & White, 2020). They can be referred to as the deviance estimator (Anderson et al., 1994), Pearson's (or the chi‐square) estimator (Lebreton et al., 1992; Pradel et al., 2005), and Fletcher's estimator (Fletcher, 2012). In our study, we consider the deviance approach. Specifically, when performing model selection the most general model should fit the data reasonably well compared to the saturated model, otherwise the data are likely to have extra‐binomial variation (Anderson et al., 1994; Lebreton et al., 1992).
The deviance between the saturated model and the general model over the difference in their degrees of freedom can be used to compute an approximation to the distribution of the variance inflation factor (Anderson et al., 1994), $\hat{c} = D / \mathrm{df}$. In our simulation study, we drew samples from the density of $\hat{c}$ and generated a point estimate of the overdispersion by taking the median. We call it the median estimator (similar to the median estimator discussed in Cooch & White, 2020), and it is denoted as Median($\hat{c}$). We repeated this process for different values of $\gamma$ and a fixed $\rho$. We assessed whether variation induced by mated pairs having correlated fates is detectable by considering whether the density of $\hat{c}$ and the corresponding point estimates, Median($\hat{c}$), indicated overdispersion. In order to assess whether the behavior of the estimator is in line with current literature, we computed $\hat{c}$ for all four model settings.
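The deviance‐based estimator and the standard error adjustment can be sketched as below. This is an illustrative Python fragment with hypothetical function names, not output from program MARK. Truncating $\hat{c}$ at one before inflating the standard error is an assumption made here so that standard errors are never shrunk.

```python
import math
import statistics


def c_hat(deviance, df):
    """Deviance estimator: deviance against the saturated model over its df."""
    return deviance / df


def median_c_hat(deviances, df):
    """Median of c-hat over replicate data sets sharing the same df."""
    return statistics.median(c_hat(d, df) for d in deviances)


def inflate_se(se, c):
    """Scale a standard error by sqrt(c-hat); assumed to be truncated at 1."""
    return se * math.sqrt(max(c, 1.0))


if __name__ == "__main__":
    # Made-up replicate deviances with 10 degrees of freedom.
    c_med = median_c_hat([10.0, 12.0, 20.0], df=10)
    print(c_med, inflate_se(0.05, c_med))
```

An estimate near one leaves the standard errors essentially unchanged, so any overdispersion that $\hat{c}$ fails to detect is also left uncorrected by this adjustment.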

RESULTS

Standard errors for CJS models under pair‐specific linear correlation

Monte Carlo estimates for the survival probability, relative confidence interval width, and relative bias in model $(\phi, p)$ are not impacted by changes in the amount of survival correlation present between mated pairs in the data (see Figure 1). That said, as survival correlation increases between mated pairs, the percent coverage of the confidence intervals decreases below the nominal 95% level (Figure 1). This implies that the standard errors of the survival probability estimates are being understated by the model, since they are the only term in the confidence bounds that can vary due to the data. Moreover, percentage coverage is only understated at high levels of survival correlation in models that do not account for the effect of sex on survival (see Figure 2). On the other hand, the models that account for sex‐specific differences in their survival probabilities have coverage percentages that tend to stay around 95%, with acceptable statistical variation, and thus continue to produce reliable standard error estimates (Figure 2).

FIGURE 1

Survival metrics against survival correlation ($\gamma$) for model $(\phi, p)$. Top Left: Monte Carlo estimates of survival across varying levels of $\gamma$. The error bars represent the 95% Monte Carlo confidence intervals. The red line represents the true value of $\phi$; Top Right: Interval width of 95% confidence intervals on $\phi$ across varying levels of $\gamma$; Bottom Left: Coverage percentage of the confidence intervals for $\phi$ across varying levels of $\gamma$. The red line represents the 95% confidence level; Bottom Right: Relative bias of $\hat{\phi}$ across varying levels of $\gamma$. The red line indicates a relative bias of zero

FIGURE 2

Coverage percentage of the confidence intervals for $\phi$ across varying levels of $\gamma$ for all models. The red line is the 95% confidence level

Behavior of the LRT under pair‐specific linear correlation

As the level of survival correlation within the data increases, the tails of the density for the likelihood ratio test statistic, comparing models $(\phi, p)$ and $(\phi^G, p)$, become lighter than those of the assumed $\chi^2_1$ distribution (Figure 3). The density of the $p$‐values, in turn, shifts from a uniform distribution toward a left‐skewed one (Figure 3). The case in which there is no survival or recapture correlation serves as a basis of comparison. This result implies that the likelihood ratio test will not reject the underlying null hypothesis with a probability equal to its significance level, but will instead fail to reject with a higher probability. The violation of the independence assumption across observations deflates the deviance statistic, leading to the goodness‐of‐fit test favoring the more parsimonious hypothesis. A technical example illustrating why the density of the deviance begins to shrink toward zero as the survival and recapture correlation increases is available in Appendix B2. Interestingly, if we consider the likelihood ratio test between models $(\phi, p)$ and $(\phi, p^G)$ (Figure 4), in which the recapture correlation is held fixed, we find that added survival correlation has minimal impact on the test's efficacy. These results suggest that increasing survival correlation between paired individuals does not have a large impact on goodness‐of‐fit testing for sex effects in recapture rates. Overall, the goodness‐of‐fit test comparing the effect of sex on survival is impacted by survival correlation between mated pairs, while the test comparing the effect of sex on recapture is not.

FIGURE 3

Likelihood ratio test of $(\phi, p)$ versus $(\phi^G, p)$ across a grid of survival correlations $\gamma$, with the recapture correlation $\rho$ held fixed. A dashed reference line is shown

FIGURE 4

Likelihood ratio test of $(\phi, p)$ versus $(\phi, p^G)$ across a grid of survival correlations $\gamma$, with the recapture correlation $\rho$ held fixed. A dashed reference line is shown


Behavior of the $\hat{c}$ correction under pair‐specific linear correlation

For models that account for sex in either of their parameter estimates (all but $(\phi, p)$), the sampling densities of $\hat{c}$ (see Figure 5) are within a close neighborhood of one, regardless of survival or recapture correlation between mates. In fact, with the exception of $(\phi, p)$, the median estimate of $\hat{c}$ decreases as the survival correlation increases (see Table 1). For these model settings, $\hat{c}$ has proven incapable of detecting the violated assumption of independence within the data. However, model $(\phi, p)$ does not account for sex‐specific differences in its parameter estimation, and so when $\gamma$ and $\rho$ are high the mark–recapture data appear to be nearly replicates. Anderson et al. (1994) showed that under this construction (replicated data without assigning treatment groups to each replicate) $\hat{c}$ is inflated above one. Model $(\phi, p)$ can be thought of as a control with respect to the other models in the study. Given that estimates of $\hat{c}$ are typically computed from the most general model under examination (Cooch & White, 2020), the variance correction would not be applied to the standard errors or be used to rescale goodness‐of‐fit testing metrics. As such, when data replication occurs due to correlation among treatment groups (sex in our example), the $\hat{c}$ estimator will be understated for studies that include these groups in their construction.

FIGURE 5

Density of $\hat{c}$ for all models across a grid of survival correlations $\gamma$, with the recapture correlation $\rho$ held fixed. A dashed reference line is shown

TABLE 1

Median($\hat{c}$) for varying levels of survival correlation ($\gamma$) across all models

Model	γ=0.0	γ=0.3	γ=0.6	γ=0.9	γ=1.0
$(\phi, p)$	1.17	1.34	1.59	1.86	2.00
$(\phi, p^G)$	1.09	1.06	1.03	0.94	0.93
$(\phi^G, p)$	1.05	1.04	1.01	0.93	0.93
$(\phi^G, p^G)$	1.10	1.09	1.08	1.02	1.03


DISCUSSION

The results of our study show that the presence of correlation between paired individuals introduces extra‐binomial variation to the data, resulting in underestimated standard errors and lowered coverage of confidence intervals for models that fail to account for sex‐specific effects. Appendix B1 works through the most extreme case of paired correlation in the data. Furthermore, we have identified an issue with the inferences provided by the likelihood ratio test. Sex‐specific correlation in the data caused the asymptotic distribution of the simulated deviance statistic to differ from its theoretical distribution for the test of whether there was an effect of sex present in survival and/or recapture rates. As such, increased levels of correlation for survival and/or recapture outcomes resulted in overly conservative test results (failure to reject more frequently than theoretically expected). Issues with asymptotic assumptions surrounding the likelihood ratio test in mark–recapture models are not unique to this study. Sparse contingency tables have been shown to skew the density of the deviance statistic (both up and down) stemming from the likelihoods of multinomial models (Afroz et al., 2019; Koehler, 1986). By introducing correlation into the CJS model structure, we are, in a sense, reducing the effective sample size of each generated dataset. Consider an example in which recapture and survival correlations are set to one in a population of $n$ animals consisting of exactly $n/2$ males and $n/2$ females with each animal in a long‐term pair‐bond. Under this setup, each pair effectively acts as a single individual (Lebreton et al., 1992). If one animal from the pair dies (or is recaptured), then its partner will die (or be caught) as well. In this case, we need only model the outcomes of one individual from each pair‐bond using the standard CJS model to compute reliable estimates of the survival and recapture probabilities.
This is, in effect, reducing our sample size from $n$ down to $n/2$ and halving the expected cell frequencies of our contingency table as well. We contend, however, that sparse data are not the key issue at play here as we designed our simulation study to mitigate these known effects. Recall that the survival and recapture probabilities used to generate our data were $\phi = 0.7$ and $p = 0.8$ across all time points for all individuals, respectively. Furthermore, our simulation included one cohort in which all first captures occurred at time $t = 1$. Table 2 shows the expected cell frequencies in our simulation study for the cases in which $n = 100$ and $n = 200$. Koehler and Larntz (1980) showed that the distribution of the deviance is not well approximated by the chi‐squared distribution when the ratio of the sample size against the number of possible cells is less than five. In our case, this ratio exceeds five and so we expect that the deviance should be asymptotically chi‐squared. Moreover, if the majority of expected cell frequencies lie below a certain threshold, then the test is said to be overly conservative (Larntz, 1978). On the other hand, if most of the cell frequencies lie within a certain intermediate interval, then the test becomes too liberal (rejects too often) (Koehler, 1986).

TABLE 2

Recapture history cell probabilities and expected number of observed histories (for populations with n = 100 and n = 200 individuals) used in simulation study

Histories	Probability	Expected (n = 100)	Expected (n = 200)
1000	0.351	35.1	70.1
1011	0.044	4.4	8.8
1101	0.044	4.4	8.8
1110	0.138	13.8	27.6
1100	0.202	20.2	40.5
1010	0.034	3.4	6.9
1001	0.011	1.1	2.2
1111	0.176	17.6	35.1
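The cell probabilities in Table 2 can be checked with a short calculation. The sketch below is illustrative Python with a hypothetical function name. It assumes a time‐constant CJS model conditioned on first capture at occasion 1, with $\phi = 0.7$ and $p = 0.8$. These two values are inferred here because they reproduce every entry of Table 2 to three decimals, so treat them as inferred rather than quoted.

```python
from itertools import product


def history_probability(history, phi, p):
    """Probability of a capture history such as "1011" under a constant CJS model.

    The first character must be "1" (the marking occasion).
    """
    seen = [int(ch) for ch in history]
    n_occ = len(seen)
    # chi[t]: probability of never being seen after occasion t, given alive at t.
    chi = [1.0] * n_occ
    for t in range(n_occ - 2, -1, -1):
        chi[t] = (1.0 - phi) + phi * (1.0 - p) * chi[t + 1]
    last = max(t for t, s in enumerate(seen) if s == 1)
    prob = 1.0
    # Up to the last sighting the animal is known to have survived each interval.
    for t in range(1, last + 1):
        prob *= phi * (p if seen[t] else 1.0 - p)
    return prob * chi[last]


if __name__ == "__main__":
    for tail in product("01", repeat=3):
        hist = "1" + "".join(tail)
        prob = history_probability(hist, 0.7, 0.8)
        print(hist, round(prob, 3), round(100 * prob, 1), round(200 * prob, 1))
```

Multiplying each probability by $n$ gives the expected counts in the table, and the eight probabilities sum to one.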

The expected cell frequencies shown in Table 2 all lie above one for both $n = 100$ and $n = 200$. While sparsity will have an impact on the distribution of the deviance, the extreme shift from the chi‐squared distribution that we observe goes well beyond the expected difference introduced by sparsity found in our simulated data. The large spike in $p$‐values as correlation increases is largely due to the nature of the duplicated data along with the models under consideration in our simulation study. See Appendix B2 for a mathematical example illustrating why correlation within groups in mark–recapture data deflates the deviance of the likelihood ratio test, along with a small simulation study showing the effect of increased sparsity on the density of the deviance statistic without any correlation present between sexes. Furthermore, we acknowledge that in many field studies the recapture rates are lower than the value used in our simulations. In these cases, it becomes increasingly difficult to isolate the cause of deviations from the chi‐squared distribution. Anderson et al. (1994) showed that mark–recapture data with overdispersion due to data replication inflate the size of the deviance when comparing across CJS models that fail to account for the cause of the data replication. Our results show that the source of overdispersion and the models under consideration are vital components to determining the behavior of the deviance. When replicated mark–recapture data are split by treatment groups (males and females) and the mark–recapture model used to study the data accounts for these groups in its parameter estimates, we have shown that the computed values of $\hat{c}$ are understated. This case occurs when comparing group‐specific heterogeneity for data in which there is a significant amount of correlation between the two groups being tested.
Therefore, we need to identify both whether there is replication in our sampling data and whether there is an underlying group structure separating the replicates (in our example, the sex of the animals). For models that took group‐specific heterogeneity into account, estimates of the overdispersion parameter were too small to indicate any significant departure from binomial variation, regardless of the degree of survival and recapture correlation. As such, overdispersion due to dyadic correlation in populations that are highly segmented into pairs may not be easily detectable. See Appendix B3 for a technical example demonstrating why this is the case. The small study presented in Appendix B3 shows that these results also apply to the Pearson (Pradel et al., 2005) and Fletcher's (Fletcher, 2012) estimators. The overdispersion introduced by our model does not result in a large violation of the inherent structure of the CJS model. The new parameters are, in essence, controlling how similar the male and female sample data will be to one another. The estimates of ϕ and p will remain largely unbiased because the maximum‐likelihood estimation procedure is unaffected by departures from binomial variation (see the discussion in Pradel et al., 2005). A lack of bias is not surprising when dealing with unmodeled dependence structures in mark–recapture data. For instance, Challenger (2010) found that the CJS model produced reasonably unbiased estimates when modeling data with group‐specific correlations using Bayesian methods. Bischof et al. (2020) also showed that spatial capture–recapture models with induced correlation between groups (of sizes ) did not lead to heavily biased estimates of model parameters. As such, if the estimates of ĉ were able to reliably detect overdispersion introduced by high dyadic correlations, quasi‐likelihood approaches should provide a reasonable adjustment to standard error estimates (Anderson et al., 1994).
The issue is that the ĉ estimator is incapable of reliably detecting overdispersion in replicated data when the replicates are accounted for in the modeling process as groups. Unfortunately, we have shown here that failing to account for correlation between mated pairs has the significant consequence of severely violating the asymptotic assumptions of the likelihood ratio test and understating standard errors in reduced models. Lebreton et al. (1992) suggested that when dealing with highly correlated data between sexes it may be reasonable to consider the sample population of only one sex. Indeed, this approach will mitigate issues of understated standard errors and failings of the variance inflation factor. However, one would need a priori knowledge of the dependence between mated pairs in order to make this judgment, as we have shown that the likelihood ratio test (LRT) for group‐specific differences, sometimes referred to as TEST1 (Burnham et al., 1987), will overly favor the null hypothesis for data with high levels of pair‐specific correlation. In an applied setting, researchers will not be able to determine whether the LRT favors the more parsimonious model because of excessive correlation between mated pairs or because the parameters of interest are the same for both sexes without any large violations of independence. As such, it is important to be conscious of these issues when studying animal populations that are suspected to form correlated known social groupings. If a researcher suspects this to be the case, we suggest analyzing the data for each sex separately in order to isolate the source of overdispersion. For instance, one can simulate estimates of ĉ using the full data with the model (see chapter 5 in Cooch & White, 2020), separate the data by sex, and then repeat the process for each subset of the data.
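The sex-by-sex diagnostic suggested above can be sketched on toy data. This is a minimal illustration under strong assumptions, not the median ĉ procedure of Cooch & White (2020) and not a CJS fit: survival over one interval is observed without error, animals are released in hypothetical replicate groups of mated pairs, and a simple Pearson estimate of ĉ is computed for a constant-survival binomial model. The names `pearson_chat`, `phi`, `rho`, `n_groups`, and `pairs_per_group` are all illustrative.

```python
import numpy as np

def pearson_chat(survivors, n_released):
    """Pearson chi-squared / df for counts assumed Binomial(n_released, phi)
    with one common phi estimated from the data."""
    survivors = np.asarray(survivors, dtype=float)
    phi_hat = survivors.sum() / (n_released * survivors.size)
    expected = n_released * phi_hat
    variance = n_released * phi_hat * (1 - phi_hat)
    x2 = np.sum((survivors - expected) ** 2 / variance)
    return x2 / (survivors.size - 1)  # one parameter (phi) estimated

rng = np.random.default_rng(2024)
phi, rho = 0.7, 0.8                   # assumed survival and within-pair correlation
n_groups, pairs_per_group = 2000, 25  # hypothetical replicate release groups

# Joint outcome probabilities per pair: both survive, male only, female only, neither
probs = [phi**2 + rho * phi * (1 - phi),
         phi * (1 - phi) * (1 - rho),
         phi * (1 - phi) * (1 - rho),
         (1 - phi)**2 + rho * phi * (1 - phi)]
counts = rng.multinomial(pairs_per_group, probs, size=n_groups)
male = counts[:, 0] + counts[:, 1]
female = counts[:, 0] + counts[:, 2]

# Step 1: full data, sexes pooled, model ignores the pair structure
chat_full = pearson_chat(male + female, 2 * pairs_per_group)
# Step 2: repeat on each sex separately
chat_male = pearson_chat(male, pairs_per_group)
chat_female = pearson_chat(female, pairs_per_group)

print(f"full data: {chat_full:.2f}  males: {chat_male:.2f}  females: {chat_female:.2f}")
```

In this toy setting the pooled estimate is inflated to roughly 1 + rho, while each single-sex estimate stays near one because fates within a sex are independent. That is the pattern the text associates with overdispersion arising mainly from pair correlation.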
If the majority of the overdispersion stems from group‐specific correlations, the estimates generated from the data for each specific sex should be close to one. If, however, the estimates remain high for each group, then it is likely that other major sources of extra‐binomial variation are present within the data. When a large majority of the overdispersion comes from association between known pairs, the researcher should either scale the standard errors and information criteria with the estimate from or study the data for only one of the two sexes. A cleaner approach would be to estimate group‐specific correlation explicitly using extended models. Directly estimating group‐specific correlation with mark–recapture models will allow researchers to glean further insights into the social dynamics at play between individuals within the population of interest. For instance, we could study how the effect sizes of meaningful covariates pertaining to survival rates change in the presence of group‐specific correlations. Does having a mate improve or hamper the chance of an animal surviving when facing external selective pressures? There is, however, a whole new set of issues that comes with explicitly modeling group‐specific correlations. The assumption that mated pairs form permanent pairings is unrealistic, even in highly socially monogamous populations, and can lead to issues with parameter estimation (Gimenez et al., 2012). Furthermore, by conditioning on long‐term pair‐bonds already existing we limit the applicability of our proposed model to mature animals, as juveniles cannot be in a long‐term pair before maturity. Divorce is quite common among animals that form long‐term mate pairings (Culina et al., 2013; Gimenez et al., 2012; Ludwig & Becker, 2008; Maness & Anderson, 2008; Smith et al., 1996).
Researchers will need to explicitly model the mate status of each individual animal, their current partner, and their partner transitions, or otherwise risk issues of pseudo‐replication (Culina et al., 2013). The issue of missing data is inflated here as well: what if one of the study participants is mated with an individual who has not yet been tagged? In most capture–recapture studies, social detection is imperfect, even among animals with highly correlated fates (Gimenez et al., 2019; Hoppitt & Farine, 2018). One might suggest omitting the data points for animals that are seen with multiple partners in populations that mostly practice social monogamy (low divorce rates). Unless the population has very few cases of partner swapping, omitting these individuals will likely result in inflated standard errors and biased estimates. The question then becomes: Should we risk understated or overstated standard errors when modeling our data? Finally, estimating the correlations of demographic parameters between different groups of animals (adult versus juvenile, for instance) often requires populations with a large number of marked individuals to achieve a reasonable degree of estimate precision (see Riecke et al., 2019). These issues will need to be addressed in future work if social independence is to be accounted for with an extended and estimable model structure.

CONFLICT OF INTEREST

The authors have no conflicts of interest to declare.

AUTHOR CONTRIBUTION

Alexandru Marian Draghici: Conceptualization (equal); Data curation (lead); Formal analysis (lead); Funding acquisition (supporting); Investigation (equal); Methodology (equal); Software (lead); Validation (lead); Visualization (equal); Writing—original draft (lead); Writing—review and editing (equal). Wendell Challenger: Conceptualization (supporting); Methodology (equal); Validation (equal); Writing—review and editing (supporting). Simon Bonner: Conceptualization (equal); Formal analysis (equal); Funding acquisition (lead); Investigation (equal); Methodology (equal); Software (supporting); Validation (equal); Visualization (equal); Writing—original draft (supporting); Writing—review and editing (equal).

TABLE 3

Median(ĉ) for common estimators across all models

	Estimator
Model	Deviance	Pearson	Fletcher
(ϕ,p)	2.01	1.69	1.73
(ϕ,pG)	0.95	0.80	0.81
(ϕG,p)	0.94	0.80	0.81
(ϕG,pG)	1.04	0.88	0.88
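For readers unfamiliar with the three estimators named in Table 3, the sketch below shows one common way to compute them from observed and fitted cell counts. It is an illustration only: it assumes a Poisson-type variance function (variance equal to the mean) for the cells, takes the residual degrees of freedom `df` as an input rather than deriving it from a fitted CJS model, and is not necessarily the exact form used to produce the table. The function name `chat_estimators` and the example counts are hypothetical.

```python
import numpy as np

def chat_estimators(observed, expected, df):
    """Deviance, Pearson, and Fletcher-type estimates of the variance
    inflation factor, assuming a Poisson-type variance (V(mu) = mu)."""
    y = np.asarray(observed, dtype=float)
    mu = np.asarray(expected, dtype=float)

    # Deviance: 2 * sum[y*log(y/mu) - (y - mu)], using 0*log(0) = 0
    pos = y > 0
    deviance = 2 * (np.sum(y[pos] * np.log(y[pos] / mu[pos])) - np.sum(y - mu))

    # Pearson chi-squared statistic
    pearson = np.sum((y - mu) ** 2 / mu)

    # Fletcher (2012) style correction: divide the Pearson estimate by 1 + s_bar,
    # where s_bar is the mean of (y - mu) / mu under the assumed variance function
    s_bar = np.mean((y - mu) / mu)

    return {
        "deviance": deviance / df,
        "pearson": pearson / df,
        "fletcher": (pearson / df) / (1 + s_bar),
    }

# Hypothetical observed and fitted cell counts with 2 residual degrees of freedom
example = chat_estimators([12, 8, 20, 10], [10, 10, 20, 10], df=2)
print(example)
```

Values near one indicate no evidence of extra‐binomial variation under the fitted model, and values well above one indicate overdispersion. The Fletcher-type correction matters most when expected counts are small, where the plain Pearson estimate can be unstable.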

11 in total

1. EXPLICIT ESTIMATES FROM CAPTURE-RECAPTURE DATA WITH BOTH DEATH AND IMMIGRATION-STOCHASTIC MODEL.

Authors: G M JOLLY
Journal: Biometrika Date: 1965-06 Impact factor: 2.445

2. A NOTE ON THE MULTIPLE-RECAPTURE CENSUS.

Authors: G A SEBER
Journal: Biometrika Date: 1965-06 Impact factor: 2.445

3. Open capture-recapture models with heterogeneity: I. Cormack-Jolly-Seber model.

Authors: Shirley Pledger; Kenneth H Pollock; James L Norris
Journal: Biometrics Date: 2003-12 Impact factor: 2.571

4. Modeling individual effects in the Cormack-Jolly-Seber model: a state-space formulation.

Authors: J Andrew Royle
Journal: Biometrics Date: 2007-08-28 Impact factor: 2.571

5. Estimating overdispersion in sparse multinomial data.

Authors: Farzana Afroz; Matt Parry; David Fletcher
Journal: Biometrics Date: 2019-12-16 Impact factor: 2.571

6. Better the devil you know: common terns stay with a previous partner although pair bond duration does not affect breeding output.

Authors: Maren Rebke; Peter H Becker; Fernando Colchero
Journal: Proc Biol Sci Date: 2017-01-11 Impact factor: 5.349