Literature DB >> 35915739

Circular-linear copulae for animal movement data.

Abstract

Animal movement is often modelled in discrete time, formulated in terms of steps taken between successive locations at regular time intervals. Steps are characterized by the distance between successive locations (step lengths) and changes in direction (turn angles). Animals commonly exhibit a mix of directed movements with large step lengths and turn angles near 0 when travelling between habitat patches and more wandering movements with small step lengths and uniform turn angles when foraging. Thus, step lengths and turn angles will typically be cross-correlated.Most models of animal movement assume that step lengths and turn angles are independent, likely due to a lack of available alternatives. Here, we show how the method of copulae can be used to fit multivariate distributions that allow for correlated step lengths and turn angles.We describe several newly developed copulae appropriate for modelling animal movement data and fit these distributions to data collected on fishers (Pekania pennanti). The copulae are able to capture the inherent correlation in the data and provide a better fit than a model that assumes independence. Further, we demonstrate via simulation that this correlation can impact movement patterns (e.g. rates of dispersion overtime).We see many opportunities to extend this framework (e.g. to consider autocorrelation in step attributes) and to integrate it into existing frameworks for modelling animal movement and habitat selection. For example, copulae could be used to more accurately sample available locations when conducting habitat-selection analyses.

Entities: Chemical

Year: 2022 PMID： 35915739 PMCID： PMC9314651 DOI： 10.1111/2041-210X.13821

Source DB: PubMed Journal: Methods Ecol Evol Impact factor: 8.335

INTRODUCTION

Advances in sensor technologies have led to the proliferation of animal location data at fine temporal and spatial scales (Kays et al., 2015), which in turn has led to the development of a plethora of new methods and software for modelling animal movements (see e.g. Hooten et al., 2017 for an overview of statistical methods and Joo et al., 2020 for a review of the R packages for modelling animal movement). Although movement is inherently a continuous process, we usually observe it at discrete points in time, and the time interval between successive locations is also often, though not always, constant (e.g. global positioning systems are typically programmed to collect data at fixed intervals). Thus, it is easier to conceptualize movement in discrete time, deconstructed into a series of ‘steps’ between successive locations (McClintock et al., 2014). These steps can be characterized by the distance between locations (step length) and the change in direction from the previous step (turn angle). When modelling movement in discrete time, it is important to consider the following characteristics of observed movement trajectories: (a) over short time‐scales, animals will tend to move in a consistent direction; (b) left and right turns are equally likely in the absence of environmental heterogeneity; and (c) animals will tend to move large distances with little change in direction when travelling between habitat patches and short distances with many changes in direction when foraging. These characteristics, in turn, suggest that: (a) the distribution of turn angles is likely to have a mode at 0 (indicative of directional persistence); (b) the distribution of turn angles should be symmetric around 0; and (c) step lengths and turn angles are likely to be correlated. Commonly used circular distributions (e.g. wrapped Cauchy and von Mises) allow for symmetric angles about a mode of 0 and mixtures of circular distributions can accommodate multimodal angular densities. Thus, these distributions satisfy the first two conditions noted above. The third characteristic is more challenging to address. Correlation between direction and speed (equivalent to distance when observations are equally spaced in time) is an inherent property of continuous‐time models (Calabrese et al., 2016; Gurarie et al., 2017; Johnson et al., 2008; McClintock et al., 2014) and discrete‐time models formulated in terms of a velocity process (i.e. changes in position between successive time points; Jonsen et al., 2005). Yet, it is more common to parameterize discrete‐time models using step lengths and turn angles that are statistically independent conditional on environmental covariates or behavioural states (McClintock et al., 2012, 2014; Morales et al., 2004). The main reason for assuming independence has been a lack of available analytical solutions that allow for proper modelling of this correlation. The broad goal of this paper is to address this limitation. Correlation is easily accommodated when data are distributed according to a multivariate normal distribution, but our interest lies in modelling correlation between a linear (i.e. non‐periodic) variable that can only take on positive values (step length), modelled using, for example, a gamma or Weibull distribution, and a circular variable (turn angle), which is often modelled using a von Mises or other circular distribution (Avgar et al., 2016; Morales et al., 2004; Signer et al., 2019). Here, we show how copulae (Nelsen, 2006), cumulative distribution functions (CDFs) with uniform marginals on the interval [0,1], can be used to develop multivariate distributions that allow for correlation between step lengths and turn angles while preserving these intended marginal distributions. Specifically, we introduce several new circular–linear copulae with properties appropriate for modelling movement in discrete time together with corresponding methods for estimating parameters and simulating data. Functions for implementing these methods are available in our open‐source R package cylcop (Hodel & Fieberg, 2021; R Core Team, 2022), which can be seen as an extension of the copula package (Hofert et al., 2020; Jun Yan, 2007; Kojadinovic & Yan, 2010; Marius Hofert and Martin Mächler, 2011; Hofert et al., 2018) and is freely available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R‐project.org/package‐cylcop. Finally, we have also developed an app which allows the reader to interactively visualize all copulae introduced in this paper. The app can be accessed at https://cylcop.shinyapps.io/cylcop‐graphs/. We begin by providing a motivating example using GPS data collected on fishers (Pekania pennanti) from LaPoint et al. (2013a, 2013b). We expect most ecologists will be unfamiliar with copulae, so we follow this example with a short introduction to the basic theory (Section 3) before presenting new methods for producing circular–linear copulae suitable for animal movement data (Section 4). In Section 5, we describe methods for estimating parameters of copulae. In Section 6, we apply the introduced copulae and estimation methods to the fisher data. Finally, in the last section, we discuss possible future research directions.

MOTIVATING EXAMPLE

To illustrate typical correlation in movement data, we will consider tracking data from fishers (Pekania pennanti) available through Movebank (Brown et al., 2012; LaPoint et al., 2013a, 2013b). The data, displayed in Figure 1a, consist of a total of 4,350 step lengths and turn angles of six individuals, resampled to an interval of 10 min with a tolerance of 1 min using the track_resample function in the R package amt (Signer et al., 2019). The marginal density of the turn angles is symmetric around 0 and has modes at 0 and 180 degrees. The mode at 180 degrees could be due to the animals' movements being influenced by linear features in the environment (Mckenzie et al., 2012), or GPS measurement errors (Hurford, 2009). To evaluate measurement error as a potential explanation, we visualized the distribution of turn angles after excluding steps shorter than 25 m (Hurford, 2009), but this did not change the general shape of the distribution. Therefore, rather than excluding short steps, we will consider models that can accommodate multiple modes. The correlation between turn angles and step lengths can be seen in panel b of Figure 1. For the lower quintiles of step lengths, the circular medians and modes of the corresponding turn angles are close to , whereas in the fourth and fifth quintiles, the medians and modes of the corresponding turn angles are close to 0. Thus, in this dataset, step lengths and turn angles are clearly not independent.

FIGURE 1

Left: Scatter plot of step lengths (, in meters) and turn angles () of six fishers from (LaPoint et al., 2013a, 2013b). Kernel density approximations of the marginal densities are plotted in red and blue next to the axes. The step lengths are separated into five quantiles as marked by the colour bar below the x‐axis. Right: Circular box plots of the same data. For each of the five step length quantiles, a box plot of the corresponding angles is shown. The sample medians are defined as the centre of the shortest arc that connects all points; in other words, the angle around which the spread of the data is minimal. Outliers, that is values outside 1.5 times the inter quartile range, are marked in red

COPULA

A bivariate copula is the CDF of a pair of random variables with uniform marginals and its domain restricted to the unit square, . Although we focus on bivariate copulae (referred to hereafter as ‘copulae’), extensions to more than two dimensions are straightforward. The main properties of copulae can be directly derived from well‐known properties of general CDFs: Any CDF must vanish at the lower limits and reach 1 at the upper limits of the domains of its random variables, . The CDF of a uniform random variable (and hence the marginal CDFs of a copula) is equal to the values of the random variable, . A bivariate CDF of a pair of random variables cannot decrease with either increasing or , that is, the probability of taking a value from any subset of the entire domain must be non‐negative. To illustrate the usefulness of copulae, let and be any two random variables with CDFs and . Sklar's theorem (Sklar, 1959) states that with marginal CDFs as arguments, a copula returns the CDF of a multivariate distribution, . Thus, we can describe the joint distribution of and using their specified marginal distributions together with a copula. All information about correlation between and is ‘encoded’ in the copula, and different copulae give different joint distributions for two specified marginal distributions. Consequently, when modelling data, we can focus separately on describing: (a) the marginal distributions of and ; and (b) modelling their dependence structure. To further demonstrate, note that Equation 1 can be used to define new copulae from a given joint distribution and its marginals, since, using the probability integral transform, we can rewrite it as , where and are again uniform random variables. As an example, consider the joint CDF of a bivariate normal distribution with correlation and both variances equal to one. Its marginal distributions are both standard normal distributions with CDF . From this, we can generate a Gaussian or normal copula Now, let and be some (non‐standard) normally distributed random variables with CDFs and . One possible joint distribution with marginals and is then It can be easily shown that is a bivariate normal distribution with correlation (see Figure 2c). However, if we had instead chosen to follow a gamma and an exponential distribution, we could still obtain a valid joint CDF from a Gaussian copula, but the resulting joint distribution would then not be bivariate normal (see Figure 2e). Although the parameter would continue to capture the correlation between and , the exact Pearson correlation coefficient between and would not be , but a function of . There are many other ways of generating copulae besides using known joint distributions, and as long as the obtained function fulfils the three conditions described above, it is a copula.

FIGURE 2

Samples from copulae and corresponding joint distributions with non‐uniform margins. Left column: Gaussian copula with . Right column: Clayton copula with . First row: samples drawn from the copulae. Second row: samples drawn from the joint distribution obtained with the corresponding copula and normal margins with means 0 and standard deviations 2 and 5. Third row: samples drawn from the joint distribution obtained with the corresponding copula, a marginal gamma distribution with shape = 5 and scale = 1 (x‐direction) and marginal exponential distribution with rate = 3 (y‐direction) The probability density function (PDF), , of a joint distribution can also be obtained with the copula where and denote the marginal PDFs of and , respectively, and is the copula density, that is a PDF of a distribution with uniform marginals corresponding to the CDF . Hereafter, we will use ‘copula’ to refer to and ‘copula density’ to refer to its derivative, . To draw samples from a copula, which can then be used to draw samples from a joint distribution with pre‐specified marginals, we follow a procedure based on conditional distributions (Johnson, 2013). The conditional distribution of a copula is defined as To illustrate, we will draw samples from a Gaussian copula and a Clayton copula (left and right columns of Figure 2 respectively). A definition of the Clayton copula, together with other copulae of the Archimedean family can be found in the Appendix Section A2. To generate a sample from , we first draw from a uniform distribution and then draw from . The pair is then a sample from . To draw from , we draw from a uniform distribution and use the probability integral transform to obtain . We repeat this process times to generate a sample of size from (Figure 2a,b). We can then transform this sample from the copula to a sample from a bivariate distribution, , with non‐uniform marginal CDFs, and , via (Figure 2, second row, for normally distributed margins and third row for gamma and exponentially distributed margins). Finally, any copula is bounded by the Fréchet–Hoeffding bounds (Fréchet, 1935; Fréchet, 1951; Hoeffding, 1940) and are themselves copulae (at least in the bivariate case). However, some copulae do not attain one or either bound, in which case they are not capable of capturing strong positive or negative correlations. The Gaussian copula, for example, approaches the lower and upper Fréchet–Hoeffding bounds for and , respectively, and is equal to the product copula for (see Figure A1); the product copula is a copula corresponding to independent random variables. The Clayton copula, on the other hand, cannot reach the lower Fréchet–Hoeffding bound, approaches the upper bound as and approaches the product copula as (from above or below). For those interested in learning more about copulae, we recommend the following excellent reviews (in our subjective order of increasing mathematical difficulty): Trivedi and Zimmer (2007), Nelsen (2006), Joe (2014); for readers with a working knowledge of measure theory, we recommend Durante and Sempi (2015).

Circular–linear copulae

Our main objective is to develop appropriate copulae for modelling joint turn‐angle and step‐length distributions. Circular–circular copulae and circular–linear copulae without the restriction of symmetry (see below) have been a subject of statistical literature starting with Johnson and Wehrly (1977) and Johnson and Wehrly (1978), who, however, did not yet use the term ‘copula’. With these copulae still being an active area of research, we will also draw inspiration from more recent reviews (e.g. Jones et al., 2015) and applied studies (e.g. García‐Portugués et al., 2013). Let be a continuous circular random variable and a continuous linear one. While has support on the real line, , has support on the unit circle, , which we will view as having support on some interval of length , . The PDF then needs to fulfil the boundary condition This condition is somewhat different from (though equivalent to) the definitions found in standard references, such as Mardia and Jupp (2000), where they view as having support on the entire real line, which means that the PDF needs to be ‐periodic. The first definition is more convenient for our purposes, as it mirrors the definition of the density of a circular–linear copula (see Equation 8). Examples of circular distributions commonly used in movement studies include von Mises and wrapped Cauchy distributions (see Mardia & Jupp, 2000; McClintock et al., 2012; Watson, 1983). Any continuous bivariate circular–linear PDF has support on (a subset of) the cylinder, . From Equation 4, it is evident that this means that also the domain of the corresponding circular–linear copula is the surface of a cylinder of unit height and unit circumference (instead of the unit square as for linear–linear copulae) and For the remainder of the text, we will define to be the circular and the linear dimension of the copula density, and for the sake of brevity, we will refer to copulae with densities fulfilling Equation 8 as ‘periodic in u’ or just as ‘periodic’. We will set the support of the turn angle to , with negative angles corresponding to left turns and positive angles to right turns. The support of the linear random variable , the step length, is restricted to the non‐negative real numbers. Turn angles are somewhat different from other circular variables in that it often makes sense to assume that an animal has no inherent bias to turn left or right. This can be achieved with a marginal PDF that is symmetric around and a copula that has not only a periodic density (see Equation 8), but also a symmetric one in direction around For the sake of brevity, we will refer to copulae having the property in Equation 9 as ‘symmetric in u’ or just as ‘symmetric’, even though there are many other possible symmetries for copulae (see Hodel & Fieberg, 2021; Nelsen, 2006). Finally, note that any copula that is ‘symmetric’ will also be ‘periodic’.

CONSTRUCTING CIRCULAR–LINEAR COPULAE

Johnson and Wehrly (1978) showed that circular–linear densities with given marginals can be obtained via where is a ‐periodic density function on the circle. Thus, many have recognized that is a circular–linear copula density (see e.g. García‐Portugués et al., 2013). We have implemented this copula using a von Mises density for (see cyl‐vonmises in our cylcop package; Hodel & Fieberg, 2021). However, there is no obvious way to obtain symmetric joint densities using this approach, which makes it less useful for movement data. We will therefore introduce in the following sections other methods of obtaining copulae with densities not only periodic, but also symmetric in .

Copulae with quadratic or cubic sections

The section in at of a copula is a curve that is defined by the vertical cross‐section of the copula where is held constant at . Consider the following class of circular–linear copulae, for which the sections in the linear dimension are quadratic functions It is easy to see that is not only periodic in , but also symmetric Similarly, the class of circular–linear copulae, below, with cubic sections in the linear dimension will also be periodic and symmetric in the circular direction since . Copulae with quadratic and cubic sections are implemented in the cylcop package as cyl_quadsec and cyl_cubsec; further details including the derivation of their densities and conditional distribution functions are given in Hodel and Fieberg (2021). An example of a copula with quadratic sections in is displayed in Figure 3 and an example of one with cubic sections is shown in Figure 6b. To fully appreciate the difference between copulae with quadratic and cubic sections, we refer the reader to our app as well as Hodel and Fieberg (2021).

FIGURE 3

Left: linear () and circular () samples drawn from a joint distribution obtained using a copula with quadratic sections in (Equation 12, with ), a marginal gamma distribution (shape = 3, scale = 1) and a marginal von Mises distribution (, ). Right: PDF of the quadratic section copula with

FIGURE 6

(a) Kernel density estimate of the pseudo‐observations of the fisher data. (b) Density of the cubic sections copula cub_sec. (c) Copula density obtained from the HMM

Combinations of Copulae

A second way to generate circular–linear copulae that are symmetric in is to combine two or more linear–linear copulae. In doing so, we will make use of the following properties of copulae (Nelsen, 2006): Any convex linear combination of copulae is a copula. Orthogonal reflections of a copula density with respect to the lines or are again densities of a copula.

Linear combinations of reflected Copulae

In some special cases, orthogonal reflections can also be seen as rotations by a multiple of (see Hodel & Fieberg, 2021). It is therefore common (see e.g. Hofert et al., 2018; Patton, 2012; Yoshiba, 2016) to denote a copula obtained by reflecting the density of copula along the line by , along the line by and along both lines and by ; these copulae are often referred to as ‘rotated copulae’. The density and distribution of the copula generated by reflecting with respect to are The CDF can be derived easily by integrating the PDF and taking care of the boundary conditions of a copula (see Hodel & Fieberg, 2021). Taking the arithmetic mean between any linear–linear copula and produces a circular–linear copula with a density that is not only periodic but also symmetric in . This density, which usually has an X‐shaped appearance, can also be ‘periodically shifted’ by 0.5 in the ‐direction, which gives a copula with diamond‐shaped density that is also symmetric and periodic (Figure A5). The structure of the correlation captured by such copulae in the context of movement data is as follows: Small values of step lengths correlate with turn angles around or, equivalently, , step lengths close to the median (i.e. step lengths close to relative rank 0.5) correspond to turn angles close to 0 and large step lengths correlate again with large turn angles around or . For the shifted, ‘diamond‐shaped’ copulae, the opposite is true, with small and large step lengths associated with small turn angles. This approach is implemented in the cylcop package as cyl_rot_combine (Hodel & Fieberg, 2021). For further illustrations, we refer the reader to our app (https://cylcop.shinyapps.io/cylcop‐graphs/).

Rectangular patchworks of Copulae

The last method for generating symmetric circular–linear copulae entails partitioning the unit square into rectangular regions, each containing an appropriately transformed copula. To construct such rectangular patchworks of copulae, we follow the procedure outlined in Durante et al. (2009). Specifically, we define two rectangles that are symmetric about : and with . Next, let (‘bg’ for background) be a copula and a 2‐increasing function obtained by some transformation of a copula so that at the boundaries of each rectangle, . The patchwork copula is then To generate a copula that is periodic and symmetric in , the two copulae from which the functions are derived must be reflections of each other with respect to the line . This can be achieved by defining . With this condition satisfied, we then either set , (see upper row of Figure 4) or we can choose a copula that is periodic and symmetric in for and use any values of and that we desire (see lower row of Figure 4). For illustrative purposes, we have chosen copulae with quite extreme parameter values. For a rectangular patchwork copula with smoother density, which is usually more appropriate for movement data, see Figure A9f. The explicit distribution of a general patchwork copula (of which the above‐described copula is a special case) is given in theorem 2.2 in Durante et al. (2009). The equations and derivations for our two‐rectangle‐copulae can be found in Hodel and Fieberg (2021), and they are implemented in the cylcop package as cyl_rect_combine.

FIGURE 4

Left column: linear (x) and circular (θ) samples drawn from joint distributions obtained with cyl_rect_combine‐copulae, a marginal gamma distribution (shape = 3, scale = 1) and a von Mises distribution (, ). Right column: PDFs of cyl_rect_combine‐copulae. Upper row: the rectangular patchwork of the copula consists of the two rectangles— and . The function in the lower rectangle is obtained by a transformation of a Frank copula (Frank, 1979) with and the function in the upper rectangle by transforming a 90 degrees rotated Frank copula with . Lower row: the rectangular patchwork of the copula consists of the two rectangles—R 1 = [0.1, 0.4] × [0, 1] and . The function in the lower rectangle is obtained by a transformation of a Frank copula () and the function in the upper rectangle by transforming a 90 degrees rotated Frank copula with . The ‘background copula’ is a cyl_quadsec copula with parameter

PARAMETER ESTIMATION AND DEPENDENCE

Although it is possible, in principle, to estimate parameters of the two assumed marginal distributions and the copula simultaneously by maximizing the log‐likelihood using the joint PDF of Equation 4, this leads to a complex optimization problem. Further, the estimators of copula parameters can be sensitive to mis‐specification of the marginal distributions, and the estimators of parameters in the marginal distribution can be influenced by mis‐specification of the copulae. Instead, we implement a maximum pseudo‐likelihood estimator (MPLE, see Oakes, 1994; Genest et al., 1995; Shih & Louis, 1995; Tsukahara, 2005). We begin by finding the optimal parameters of the marginal distributions via maximum likelihood estimation. Next, we calculate a nonparametric estimate for each marginal distribution and then use the resulting empirical marginal CDFs, and , to transform the set of i.i.d. observations of step lengths and turn angles to the pseudo‐observations , that is values drawn from the empirical copula. Finally, we use MPLE to fit the parameters of our selected copula family to the pseudo‐observations by maximizing Compared to maximum likelihood estimations using the full joint distribution, this MPLE approach reduces the number of parameters that have to be optimized simultaneously. It is important to provide ‘good’ starting parameter values for MPLE, especially for rugged parameter hyper‐surfaces, and we obtain them by comparing circular–linear correlation measures for the empirical and parametric copula with different parameter values (see Hodel & Fieberg, 2021 for further details). For rectangular patchwork copulae with rectangles spanning the entire unit square, parameter values (i.e. starting values for MPLE) can be derived analytically from Kendall's tau of the empirical copula (Genest, 1987; Genest & Rivest, 1993; Oakes, 1982). Finally, model selection can be performed using Akaike's information criterion, which is a common (e.g. Chen & Fan, 2005; McNeil et al., 2015; Joe, 2014), but for MPLE not uncontroversial approach (see Grønneberg & Hjort, 2014, but also Jordanger & Tjøstheim, 2014). To assess how well a copula fits the data and to measure the ‘difference’ between copulae, we implemented a function in our cylcop package (Hodel & Fieberg, 2021) to calculate the Wasserstein distance (Villani, 2008) between copula PDFs as well as between copula PDFs and pseudo‐observations (Del Barrio et al., 1999; Hallin et al., 2021; Marti et al., 2016; Schuhmacher et al., 2020). Interestingly, the calculation of the Wasserstein distance itself is also intimately related to the concept of copulae (Alfonsi & Jourdain, 2014; Benth et al., 2021; Marti et al., 2016).

Direction of correlation

For all symmetric copulae introduced in this paper, periodically shifting the density by 0.5 in ‐direction leads again to a symmetric copula. This ‘shifting’ is implemented differently in all copulae: for copulae with quadratic or cubic sections, it means multiplying their parameters by −1. For the patchwork copulae, it means reflecting or rotating by 90 degrees and (and adapting the background copula, if applicable). Finally, for the copulae obtained by taking the arithmetic mean of a linear–linear copula and its reflected counterpart, we implemented the ‘shifting’ explicitly to go from an x‐shaped to a diamond‐shaped density, as mentioned in Section 4.2.1. In any case, periodically shifting the density reverses the correlation between the two random variables. If a copula captures, for example, a correlation of large step lengths with small absolute angles, the shifted copula corresponds to a correlation of small step lengths with small absolute angles.

APPLICATION TO FISHER LOCATION DATA

Methods

We fitted bivariate distributions to the fisher data displayed in Figure 1 using the copulae and methods introduced in the previous sections. We considered several different linear and circular marginal distributions (see Tables A1 and A2) and selected the appropriate ones using AIC. Next, we considered several different copulae based on the characteristics of the data. With our assumption that an animal has no inherent bias to turn left or right, we can ignore the (only briefly mentioned) circular–linear copula by Johnson and Wehrly (1978), since it is not symmetric in . A visual inspection of the data further excludes copulae obtained from linear combinations of reflected copulae with their x‐ or diamond‐shaped densities. We therefore decided to fit copulae with quadratic (quad_sec) and cubic sections (cub_sec), as well as three rectangular patchwork copulae. The latter were restricted to consist of two symmetric rectangles, together covering the entire unit square. They were based on Clayton (rect_combine_Clayton), Gumbel (rect_combine_Gumbel) and Frank copulae (rect_combine_Frank), three commonly used Archimedean copulae (see Appendix, Section A2 for a quick overview of their properties). The functions in the lower rectangles were derived directly from those copulae, the ones in the upper rectangles from their 90 degree rotations. We also attempted to fit a rectangular patchwork copula based on the Frank copula, again with two symmetric rectangles but this time with their lower and upper bound as tuneable parameters and a quadratic sections copula as the background copula. We generated preliminary estimates of copula parameters using measures of circular–linear correlation (see Hodel & Fieberg, 2021), which we then used as starting values for MPLEs. We determined starting values for quad_sec and cub_sec using a grid search, choosing parameter values that gave the best match to the circular–linear correlation coefficient of the data; starting values for the rect_combine copulae were chosen to match Kendall's tau of the underlying linear–linear copulae to the data and could be calculated analytically. When fitting the rectangular patchwork copula with lower and upper bounds as tuneable parameters, the MPLE converged either to rect_combine_Frank or to quad_sec depending on the starting values. Therefore, purely for illustrative purposes, we fixed the rectangles to and , and optimized the parameters of the Frank and the quadratic sections copula (rect_combine_Frank‐quad_sec). We compared copulae using AIC and visually investigated their properties and the properties of the corresponding joint distributions using various plots. We also generated trajectories of different lengths and estimated diffusion coefficients, , by regressing mean‐square displacement against time for the cub_sec copula and for simulations using independent samples of turn angles and step lengths (Appendix, Section A4.2). Since both marginal distributions were best described by mixture distributions with two components (see below), we decided to also fit a two‐state hidden Markov model (HMM) to the data (McClintock et al., 2020; McClintock & Michelot, 2018; Morales et al., 2004; Patterson et al., 2008). The joint PDF of step lengths and turn angles can then be obtained from the HMM using the steady‐state transition probabilities and by assuming independence between step lengths and turn angles, conditional on the state. Using Equation 4, we calculated a copula density from the marginal and joint densities estimated with the HMM and compared it to the copulae described above. This comparison was done visually, as well as by calculating the second Wasserstein distance (using the Euclidean distance) between the pseudo‐observations and the different copula densities.

Results

The marginal distribution of the step lengths was best described by a mixed gamma distribution with two components and with shape parameters 1.47 and 1.70, scale parameters 10.6 and 110.6 and mixing proportions of 0.59 and 0.41. As already remarked in the Introduction, the marginal distribution of the turn angles is bimodal and the smallest AIC score was achieved with a mixture of two von Mises distributions. We fixed the location parameters at 0 and and estimated the concentration parameters to be 0.48 and 6.31, respectively, with mixing proportions of 0.78 and 0.22 (see Tables A1 and A2). Of the copulae we considered, the cubic sections copula (cub_sec) had the lowest AIC score (Table 1). To visualize it, we simulated 4350 step lengths and turn angles from the cub_sec copula and from the independence copula (Figures 5 and 6; data simulated with the other copulae can be found in the Figure A7). Visually, it is clear that the data generated with cub_sec (Figure 5a) provide a better match to the ‘real’ fisher data (Figure 1a) than the data simulated with independent turn angles and step lengths (Figure 5c). The effect of the correlation structure can also be seen from the circular box plots of the turn angles (Figures 1b and 5b,d). We calculated the pseudo‐observations of the fisher data and smoothed them using a two‐dimensional kernel density estimator (KDE) with a bivariate Gaussian kernel. While this KDE is a helpful visualization, note that it is mathematically not a valid empirical, nonparametric (circular–linear) copula (see Section 7). The KDE and the density of cub_sec both highlight that large step lengths tend to be associated with turn angles near 0 and small step lengths with turn angles near (Figure 6). Plots of copula densities for the other copulae (Table 1) can be found in Figure A9. The cub_sec copula resulted in a diffusion coefficient, , smaller than the diffusion coefficient assuming independent steps and turns, .

TABLE 1

Copula		ΔAIC	Starting value	MPLE
a	cub_sec	0.0	a=0.11 b=−0.14	a=0.07 b=−0.16
b	quad_sec	75.5	a=0.11	a=0.12
c	rect_combine, Frank + quad_sec	131.5	NA	α=0.64 a=0.12
d	rect_combine, Frank	147.0	α=1.84	α=1.83
e	rect_combine, Gumbel	281.2	α=1.25	α=1.19
f	rect_combine, Clayton	451.7	α=0.49	α=0.20
Independence		545.8

FIGURE 5

Top row: 4,350 step lengths (, in metres) and turn angles () sampled from a joint distribution obtained with the cubic sections copula cub_sec. Bottom row: 4,350 step lengths and turn angles sampled independently from their marginal distributions. Left column: scatter plots of step lengths and turn angles. Maximum likelihood estimates of the marginal densities are plotted in red and blue next to the axes. The step lengths are separated into five quantiles as marked by the colour bar below the x‐axis. Right column: circular box plots of the same data. For each of the five step length quantiles, a box plot of the corresponding angles is shown

AIC relative to the lowest AIC (cub_sec) and maximum pseudo‐likelihood parameter estimates for six copulae fit to the fisher data. Starting values for copulae a and b were determined using a grid search, choosing parameter values that gave the best match to the circular–linear correlation coefficient of the data; starting values for copulae d–f were chosen to match Kendall's tau of the data Top row: 4,350 step lengths (, in metres) and turn angles () sampled from a joint distribution obtained with the cubic sections copula cub_sec. Bottom row: 4,350 step lengths and turn angles sampled independently from their marginal distributions. Left column: scatter plots of step lengths and turn angles. Maximum likelihood estimates of the marginal densities are plotted in red and blue next to the axes. The step lengths are separated into five quantiles as marked by the colour bar below the x‐axis. Right column: circular box plots of the same data. For each of the five step length quantiles, a box plot of the corresponding angles is shown (a) Kernel density estimate of the pseudo‐observations of the fisher data. (b) Density of the cubic sections copula cub_sec. (c) Copula density obtained from the HMM The estimate of the parameter of the cub_sec copula was on the boundary, and its circular–linear correlation coefficient was slightly lower than the coefficient for the actual fisher data (0.085 vs. 0.102). These results suggest that a closer fit might be possible using a copula with a correlation structure similar to the cub_sec copula, but with a wider range of possible dependencies. The copulae consisting of rectangular patchworks can potentially capture any degree of correlation between both Fréchet–Hoeffding bounds and a wide range of joint density shapes. Thus, it may be beneficial to consider other correlation structures beyond the three Archimedean copulae we selected. The main drawback of these type of copulae, however, is that they can be difficult to optimize since one has to choose and compare copulae for the patches and the background, and estimate parameters for each of these copulae. Introducing more than two rectangular patches is straightforward but would further complicate optimization. Comparing the cub_sec copula to an HMM with two states described by gamma and von Mises distributions, we find that the former provides a better fit to the data. The second Wasserstein distance between the pseudo‐observations and the copula density of cub_sec was 0.019, whereas the Wasserstein distance between the pseudo‐observations and the copula density of the HMM was 0.028. To put these numbers in context, if we assume independence between step lengths and turn angles, that is have a copula density that is 1 everywhere on the unit square, the Wasserstein distance between the pseudo‐observations and that density is 0.052. Although the marginal distributions obtained from the HMM describe the data well (the null hypothesis that the data come from the corresponding mixed distributions cannot be rejected at the 0.05 level of significance according to Cramér–von Mises tests of goodness‐of‐fit), the copula densities from the HMM were not able to fully capture the correlation as can be seen visually from Figure 6c. In particular, the HMM results in a ‘striped’ copula density due to assuming independence between step lengths and turn angles after conditioning on the latent state.

DISCUSSION AND FUTURE RESEARCH

The observed correlation between step lengths and turn angles in our applied example is typical of many modern‐day telemetry datasets. We have shown that this correlation can be captured by new circular–linear copulae that we have developed to also mimic the characteristics of animal movement data (e.g. symmetric turn angle distributions). We have demonstrated methods for estimating copulae parameters and for simulating data from fitted models. Lastly, we have shown that this correlation can impact movement patterns (e.g. dispersion over time). Copulae have parameters that directly quantify the strength of dependence, which is allowed to vary continuously in space. Thus, one nice feature of copulae is that they can inform us as to where the dependence is strongest and may also provide insights into the mechanisms that lead to particular copula structures (e.g. asymmetric tail associations; see Ghosh et al., 2020). In our application, the dependencies were strongest in two regions: (a) the region with large step lengths and turn angles near 0, reflecting highly directed movements; and (b) the region with small step lengths and turn angles near +/− . We suspect the dependencies in the latter region may be driven, in part, by measurement error that is magnified during time periods when the animals are resting (e.g. as a result of non‐optimal collar orientation or signal interruption when animals are sleeping/denning). HMMs are also capable of capturing correlation between step lengths and turn angles through their dependence on one or more latent states. Yet, we expect that HMMs may sometimes require a large number of states to fully capture the range of correlations present in movement data. In our applied example, we found that an HMM with two states did not fit the data as well as the cub_sec copula with the same assumed marginal distributions. Furthermore, given the lack of separation between marginals and correlation when modelling data with HMMs, we expect there will be a trade‐off between modelling the marginal distribution and the correlation correctly, particularly when the marginal distributions are unimodal. We note, however, that it is also possible to fit HMMs with state‐dependent copulae (e.g. Nasri et al., 2020; Ötting et al., 2021). In the future, we plan to further compare these various approaches to modelling correlation in movement data. We see several other opportunities to expand on our work, including opportunities for both theoretical and methodological advances as well as research that will allow copulae to be incorporated into existing frameworks for modelling animal movement and habitat use. One of the main characteristics of movement data, which we have so far neglected, is temporal autocorrelation. Although turn‐angle distributions with modes at 0 allow steps to be autocorrelated (i.e. they allow individuals to maintain a consistent direction), the copulae we have considered do not allow turn angles or step lengths themselves to be autocorrelated. To incorporate this feature, our MPLE procedure would need adapting to allow for non‐i.i.d. data. Often, covariates are recorded along with movement metrics, and the target of modelling is the joint distribution of step lengths and turn angles conditional on the covariates, or, in other words, regression. Similarly, in time‐series analysis one is often interested in the distribution of a random vector conditional on past instances. Both issues can be tackled using conditional copulae (Fermanian & Wegkamp, 2012; Patton, 2006). We have focused this paper on parametric copulae with parameters estimated using frequentist methods. Distributions of copula parameters could also be estimated in a fully Bayesian framework. Alternatively, copulae can be estimated nonparametrically by smoothing the empirical copula using, for example, Bernstein polynomials (Janssen et al., 2012; Sancetta & Satchell, 2004). This has already been done in the general circular–linear case (see Carnicero et al., 2013; García‐Portugués et al., 2013, 2014) and would need only slight adaptation for our symmetric circular–linear data. Other areas where additional research might be rewarding include the development of graphical tools to assess goodness‐of‐fit or to visualize dependence, such as Chi‐plots (Fisher & Switzer, 1985, 2001) or K‐plots (Genest & Boies, 2003). These could be implemented along with a cross‐validation procedure for assessing goodness‐of‐fit (Grønneberg & Hjort, 2014). A more thorough investigation of correlation measures applicable to movement data and the possibility of estimating copula parameters from them could help in pre‐selecting copula families and provide better initial guesses for MPLE. Lastly, we see many opportunities for integrating these methods into existing frameworks for modelling animal movement. In particular, copula could be used to more accurately sample available steps when conducting step‐selection analyses (Fieberg et al., 2021; Forester et al., 2009; Fortin et al., 2005; Thurfjell et al., 2014) or to better capture movement modes in hidden Markov or state space models (McClintock et al., 2020; McClintock & Michelot, 2018; Morales et al., 2004; Patterson et al., 2008).

CONFLICT OF INTEREST

None.

AUTHORS' CONTRIBUTIONS

F.H.H. and J.R.F. conceived the study; F.H.H. wrote the R package; and F.H.H. and J.R.F. wrote the manuscript.

PEER REVIEW

The peer review history for this article is available at https://publons.com/publon/10.1111/2041‐210X.13821. Appendix Click here for additional data file. Code Click here for additional data file.

14 in total

10. GPS measurement error gives rise to spurious 180 degree turning angles and strong directional biases in animal movement data.

Authors: Amy Hurford
Journal: PLoS One Date: 2009-05-20 Impact factor: 3.240

Circular-linear copulae for animal movement data.

INTRODUCTION

MOTIVATING EXAMPLE

COPULA

Circular–linear copulae

CONSTRUCTING CIRCULAR–LINEAR COPULAE

Copulae with quadratic or cubic sections

Combinations of Copulae

Linear combinations of reflected Copulae

Rectangular patchworks of Copulae

PARAMETER ESTIMATION AND DEPENDENCE

Direction of correlation

APPLICATION TO FISHER LOCATION DATA

Methods

Results

DISCUSSION AND FUTURE RESEARCH

CONFLICT OF INTEREST

AUTHORS' CONTRIBUTIONS

PEER REVIEW

Review 1. State-space models of individual animal movement.

Review 2. Navigating through the r packages for movement.

3. Inferences on the association parameter in copula models for bivariate survival data.

4. A 'How to' guide for interpreting parameters in habitat-selection analyses.

5. How linear features alter predator movement and the functional response.

6. When to be discrete: the importance of time formulation in understanding animal movement.

Review 7. Applications of step-selection functions in ecology and conservation.

8. Correlated velocity models as a fundamental unit of animal movement: synthesis and applications.

9. Animal movement tools (amt): R package for managing tracking data and conducting habitat selection analyses.

10. GPS measurement error gives rise to spurious 180 degree turning angles and strong directional biases in animal movement data.