
Ordinal Pattern Dependence in the Context of Long-Range Dependence.

Abstract

Ordinal pattern dependence is a multivariate dependence measure based on the co-movement of two time series. In strong connection to ordinal time series analysis, the ordinal information is taken into account to derive robust results on the dependence between the two processes. This article deals with ordinal pattern dependence for a long-range dependent time series including mixed cases of short- and long-range dependence. We investigate the limit distributions for estimators of ordinal pattern dependence. In doing so, we point out the differences that arise for the underlying time series having different dependence structures. Depending on these assumptions, central and non-central limit theorems are proven. The limit distributions for the latter ones can be included in the class of multivariate Rosenblatt processes. Finally, a simulation study is provided to illustrate our theoretical findings.


Keywords: limit theorems; long-range dependence; multivariate data analysis; ordinal patterns; time series

Year: 2021 PMID: 34073336 PMCID: PMC8230352 DOI: 10.3390/e23060670

Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524

1. Introduction

The concept of ordinal patterns originates in the theory of dynamical systems. The idea is to consider the order of the values within a data vector instead of the full metrical information. The ordinal information is encoded as a permutation (cf. Section 3). Already in the first papers on the subject, the authors considered entropy concepts related to this ordinal structure (cf. [1]). There is an interesting relationship between these concepts and the well-known Kolmogorov–Sinai entropy (cf. [2,3]). Additionally, an ordinal version of the Feigenbaum diagram has been dealt with, e.g., in [4]. In [5], ordinal patterns were used to estimate the Hurst parameter in long-range dependent time series. Furthermore, a test for independence between time series was proposed in [6] (cf. also [7]). Hence, the concept made its way into the area of statistics. Instead of long patterns (or even letting the pattern length tend to infinity), rather short patterns have been considered in this new framework. Furthermore, ordinal patterns have been used in the context of ARMA processes [8] and change-point detection within one time series [9]. In [10], ordinal patterns were used for the first time to analyze the dependence between two time series. Limit theorems for this new concept were proven in a short-range dependent framework in [11]. Ordinal pattern dependence is a promising tool, which has already been applied to financial, biological and hydrological data sets; see also [12] for an analysis of the co-movement of time series focusing on symbols. In particular, in the context of hydrology, the data sets are known to be long-range dependent. Therefore, it is important to also have limit theorems available in this framework. We close this gap in the present article. All of the results presented in this article have been established in the Ph.D. thesis of I. Nüßgen written under the supervision of A. Schnurr.
The article is structured as follows: in the subsequent section, we provide the reader with the mathematical framework. The focus is on (multivariate) long-range dependence. In Section 3, we recall the concept of ordinal pattern dependence and prove our main results. We present a simulation study in Section 4 and close the paper by a short outlook in Section 5.

2. Mathematical Framework

We consider a stationary d-dimensional Gaussian time series (for ), with: such that and for all and . Furthermore, we require the cross-correlation function to fulfil for and , where the component-wise cross-correlation functions are given by for each and . For each random vector , we denote the covariance matrix by , since it is independent of j due to stationarity. Therefore, we have . We specify the dependence structure of and turn to long-range dependence: we assume that for the cross-correlation function for each , it holds that: with for finite constants with , where the matrix has full rank, is symmetric and positive definite. Furthermore, the parameters are called long-range dependence parameters. Therefore, is multivariate long-range dependent in the sense of [13], Definition 2.1. The processes we want to consider have a particular structure, namely for , we obtain for fixed : The following relation holds between the extended process and the primarily regarded process . For all , we have: where . Note that the process is still a centered Gaussian process since all finite-dimensional marginals of follow a normal distribution. Stationarity is also preserved since for all , and , the cross-correlation function of the process is given by and the last line does not depend on j. The covariance matrix of has the following structure: Hence, we arrive at: where , . Note that and , since we are studying cross-correlation functions. Therefore, we finally have to show that based on the assumptions on , the extended process is still long-range dependent. Hence, we have to consider the cross-correlations again: since and , with , and . Let us remark that . Therefore, we are still dealing with a multivariate long-range dependent Gaussian process.
We see in the proofs of the following limit theorems that the crucial parameters that determine the asymptotic distribution are the long-range dependence parameters , of the original process and therefore, we omit the detailed description of the parameters herein. It is important to remark that the extended process is also long-range dependent in the sense of [14], p. 2259, since: with: and can be chosen as any constant that is not equal to zero, so for simplicity, we assume without loss of generality , and therefore, , since the condition in [14] only requires convergence to a finite constant . Hence, we may apply the results in [14] in what follows. We define the following set, which is needed in the proofs of the theorems of this section. and denote the corresponding long-range dependence parameter to each by We briefly recall the concept of Hermite polynomials as they play a crucial role in determining the limit distribution of functionals of multivariate Gaussian processes. (Hermite polynomial, [ The j-th Hermite polynomial Their multivariate extension is given by the subsequent definition. (Multivariate Hermite polynomial, [ Let with Let us remark that the case is excluded here due to the assumption . Analogously to the univariate case, the family of multivariate Hermite polynomials forms an orthogonal basis of , which is defined as The parameter denotes the density of the d-dimensional standard normal distribution, which is already divided into the product of the univariate densities in the formula above. We denote the Hermite coefficients by The Hermite rank of f with respect to the distribution is defined as the largest integer m, such that: Having these preparatory results in mind, we derive the multivariate Hermite expansion given by We focus on the limit theorems for functionals with Hermite rank 2. First, we introduce the matrix-valued Rosenblatt process.
This plays a crucial role in the asymptotics of functionals with Hermite rank 2 applied to multivariate long-range dependent Gaussian processes. We begin with the definition of a multivariate Hermitian–Gaussian random measure with independent entries given by where is a univariate Hermitian–Gaussian random measure as defined in [16], Definition B.1.3. The multivariate Hermitian–Gaussian random measure satisfies: and: where denotes the Hermitian transpose of . Thus, following [14], Theorem 6, we can state the spectral representation of the matrix-valued Rosenblatt process , as where each entry of the matrix is given by The double prime in excludes the diagonals , in the integration. For details on multiple Wiener–Itô integrals, see [17]. The following results were taken from [18], Section 3.2. The corresponding proofs are deferred to Appendix A. Let with where: The matrix where where C denotes the matrix of second order Hermite coefficients, given by It is possible to weaken the assumptions in Theorem 1 to allow for mixed cases of short- and long-range dependence. Instead of demanding in the assumptions of Theorem 1 that ( We assume that: with However, with a mild technical assumption on the covariances of the one-dimensional marginal Gaussian processes that is often fulfilled in applications, there is another way of normalizing the partial sum on the right-hand side in Theorem 1, this time explicitly for the case and , such that the limit can be expressed in terms of two standard Rosenblatt random variables. This makes it possible to study the dependence structure between these two random variables further. In the following theorem, we assume for the reader’s convenience. Under the same assumptions as in Theorem 1 with with
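As a numerical illustration of the Hermite polynomials recalled earlier in this section, the following minimal sketch checks their orthogonality with respect to the standard normal distribution in the univariate case. It assumes the probabilists' normalization (H_0 = 1, H_1 = x, H_2 = x^2 - 1), which is the usual one in this literature but is not spelled out in the text above; the function names `hermite` and `gaussian_inner_product` are hypothetical helpers introduced only for this example.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def hermite(j, x):
    """Evaluate the j-th probabilists' Hermite polynomial at x (assumed normalization)."""
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0  # select the single basis polynomial He_j
    return He.hermeval(x, coeffs)


def gaussian_inner_product(j, k, quad_points=20):
    """Approximate E[H_j(Z) H_k(Z)] for Z ~ N(0, 1) by Gauss-Hermite quadrature."""
    nodes, weights = He.hermegauss(quad_points)
    # hermegauss integrates against exp(-x^2 / 2); dividing by sqrt(2 * pi)
    # turns the weights into weights for the standard normal density.
    weights = weights / math.sqrt(2.0 * math.pi)
    return float(np.sum(weights * hermite(j, nodes) * hermite(k, nodes)))


if __name__ == "__main__":
    print(gaussian_inner_product(2, 2))  # close to 2! = 2
    print(gaussian_inner_product(1, 3))  # close to 0
```

Under this normalization, E[H_j(Z) H_k(Z)] equals j! when j = k and vanishes otherwise, which is the property the Hermite expansion relies on.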

3. Ordinal Pattern Dependence

Ordinal pattern dependence is a multivariate dependence measure that compares the co-movement of two time series based on the ordinal information. First introduced in [10] to analyze financial time series, a mathematical framework including structural breaks and limit theorems for functionals of absolutely regular processes has been built in [11]. In [19], the authors have used the so-called symbolic correlation integral in order to detect the dependence between the components of a multivariate time series. Their considerations focusing on testing independence between two time series are also based on ordinal patterns. They provide limit theorems in the i.i.d.-case and otherwise use bootstrap methods. In contrast, in the mathematical model in the present article, we focus on asymptotic distributions of an estimator of ordinal pattern dependence having a bivariate Gaussian time series in the background but allowing for several dependence structures to arise. As will turn out in the following, this yields central but also non-central limit theorems. We start with the definition of an ordinal pattern and the basic mathematical framework that we need to build up the ordinal model. Let denote the set of permutations in , that we express as (h + 1)-dimensional tuples, assuring that each tuple contains each of the numbers above exactly once. In mathematical terms, this yields: see [11], Section 2.1. The number of permutations in is given by (h + 1)!. In order to get a better intuitive understanding of the concept of ordinal patterns, we have a closer look at the following example, before turning to the formal definition. Formally, the aforementioned procedure can be defined as follows; see [11], Section 2.1. As the ordinal pattern of a vector such that: with The last condition assures the uniqueness of if there are ties in the data sets. In particular, this condition is necessary if real-world data are to be considered.
In Figure 2, all ordinal patterns of length are shown. As already mentioned in the introduction, from the practical point of view, a highly desirable property of ordinal patterns is that they are not affected by monotone transformations; see [5], p. 1783.

Figure 2

Ordinal patterns for .

Mathematically, this means that if is strictly monotone increasing, then: In particular, this includes linear transformations , with and . Following [11], Section 1, the minimal requirement of the data sets we use for ordinal analysis in the time series context, i.e., for ordinal pattern probabilities as well as for ordinal pattern dependence later on, is ordinal pattern stationarity (of order h). This property implies that the probability of observing a certain ordinal pattern of length h remains the same when shifting the moving window of length h through the entire time series and does not depend on the specific points in time. In the course of this work, the time series in which the ordinal patterns occur always either have stationary increments or are even stationary themselves. Note that both properties imply ordinal pattern stationarity. The reason why stationary increments are a sufficient condition is as follows. One fundamental property of ordinal patterns is that they are uniquely determined by the increments of the considered time series. As one can imagine in Example 1, the knowledge of the increments between the data points is sufficient to obtain the corresponding ordinal pattern. In mathematical terms, we can define another mapping , which assigns the corresponding ordinal pattern to each vector of increments; see [5], p. 1783. We define for such that for We define the two mappings, following [5], p. 1784: An illustrative understanding of these mappings is given as follows. The mapping , which is the spatial reversion of the pattern , is the reflection of on a horizontal line, while , the time reversal of , is its reflection on a vertical line, as one can observe in Figure 3.

Figure 3

Space and time reversion of the pattern .
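The properties discussed above (invariance under increasing transformations, determination by the increments, and space and time reversion) can be checked on a small example. The sketch below is a minimal illustration under an assumed convention, since the formal definition is not reproduced in the text above: the pattern of (x_0, ..., x_h) lists the indices in order of decreasing value, and among tied values the larger index comes first. The helper names are hypothetical.

```python
import math
from itertools import accumulate


def ordinal_pattern(x):
    """Indices of x sorted by decreasing value; ties put the larger index first (assumed convention)."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


def pattern_from_increments(increments):
    """Rebuild a path starting at 0 from its increments and return its ordinal pattern."""
    path = list(accumulate(increments, initial=0.0))
    return ordinal_pattern(path)


def spatial_reversal(pi):
    """Pattern obtained by reflecting the data on a horizontal line (x -> -x, no ties)."""
    return tuple(reversed(pi))


def time_reversal(pi):
    """Pattern obtained by reflecting the data on a vertical line (reading x backwards, no ties)."""
    h = len(pi) - 1
    return tuple(h - p for p in pi)


if __name__ == "__main__":
    x = [1.0, 3.0, 2.0]
    print(ordinal_pattern(x))                         # (1, 2, 0)
    print(ordinal_pattern([math.exp(v) for v in x]))  # same pattern
    print(pattern_from_increments([2.0, -1.0]))       # same pattern
    print(spatial_reversal(ordinal_pattern(x)))       # pattern of [-1, -3, -2]
    print(time_reversal(ordinal_pattern(x)))          # pattern of [2, 3, 1]
```

The reversal identities hold as stated only for vectors without ties; with ties, the tie-breaking rule has to be taken into account.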

Based on the spatial reversion, we define a way to divide into two disjoint sets. We define Note that this definition does not yield the uniqueness of . We consider the case We stick to the formal definition of ordinal pattern dependence, as it is proposed in [11], Section 2.1. The considered moving window consists of h + 1 data points, and hence, h increments. We define: and: Then, we define ordinal pattern dependence as The parameter q represents the hypothetical case of independence between the two time series. In this case, p and q would obtain equal values and therefore, would equal zero. Regarding the other extreme, the case in which both processes coincide or one is a strictly monotone increasing transform of the other one, we obtain the value 1. However, in the following, we assume and . Note that the definition of ordinal pattern dependence in (17) only measures positive dependence. This is no restriction in practice, because negative dependence can be investigated in an analogous way, by considering . If one is interested in both types of dependence simultaneously, in [11], the authors propose to use . To keep the notation simple, we focus on as it is defined in (17). We compare whether the ordinal patterns in coincide with the ones in . Recall that it is an essential property of ordinal patterns that they are uniquely determined by the increment process. Therefore, we have to consider the increment processes as defined in (1) for , where , . Hence, we can also express p and q (and consequently ) as a probability that only depends on the increments of the considered vectors of the time series. Recall the definition of for , given by such that with as given in (6). In the course of this article, we focus on the estimation of p. For a detailed investigation of the limit theorems for estimators of , we refer to [18].
We define the estimator of p, the probability of coincident patterns in both time series in a moving window of fixed length, by where: Figure 4 illustrates the way ordinal pattern dependence is estimated by . The patterns of interest that are compared in each moving window are colored in red.

Figure 4

Illustration of estimation of ordinal pattern dependence.
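The moving-window estimation illustrated in Figure 4 can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: the estimator of p is taken as the relative frequency of windows of h + 1 consecutive points in which both series show the same pattern; q is estimated by plugging in the empirical pattern frequencies of each series; and ordinal pattern dependence is computed as (p - q) / (1 - q), a form that matches the two boundary cases described above (0 under independence, 1 for coinciding patterns) but is not reproduced in the text. All function names are hypothetical.

```python
from collections import Counter


def ordinal_pattern(window):
    """Indices sorted by decreasing value; ties put the larger index first (assumed convention)."""
    return tuple(sorted(range(len(window)), key=lambda i: (-window[i], -i)))


def patterns(series, h):
    """Ordinal patterns of all moving windows of h + 1 consecutive points."""
    return [ordinal_pattern(series[j:j + h + 1]) for j in range(len(series) - h)]


def estimate_p(x, y, h):
    """Relative frequency of windows in which both series show the same pattern."""
    px, py = patterns(x, h), patterns(y, h)
    return sum(a == b for a, b in zip(px, py)) / len(px)


def estimate_q(x, y, h):
    """Plug-in estimate of the coincidence probability under independence."""
    px, py = patterns(x, h), patterns(y, h)
    cx, cy = Counter(px), Counter(py)
    return sum(cx[pi] * cy[pi] for pi in cx) / (len(px) * len(py))


def estimate_opd(x, y, h):
    """Assumed form (p - q) / (1 - q); requires q < 1."""
    p, q = estimate_p(x, y, h), estimate_q(x, y, h)
    return (p - q) / (1.0 - q)


if __name__ == "__main__":
    x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
    y = [2.0 * v + 1.0 for v in x]       # increasing transform of x
    print(estimate_p(x, y, 1), estimate_opd(x, y, 1))
    print(estimate_p(x, [-v for v in x], 1))
```

Both series must have the same length; the sketch does not check this.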

Having emphasized the crucial importance of the increments, we define the following conditions on the increment process : let be a bivariate, stationary Gaussian process with , : We assume that fulfills (2) with in . We allow for to be in the range . We assume such that the cross-correlation function of fulfills for : with and holds. Furthermore, in both cases, it holds that for and to exclude ties. We begin with the investigation of the asymptotics of . First, we calculate the Hermite rank of , since the Hermite rank determines for which ranges of the estimator is still long-range dependent. Depending on this range, different limit theorems may hold. The Hermite rank of Following [20], Lemma 5.4 it is sufficient to show the following two properties: , . Note that the conclusion is not trivial, because in general; see [15], Lemma 3.7. Lemma 5.4 in [20] can be applied due to the following reasoning. Ordinal patterns are not affected by scaling, therefore, the technical condition that is positive semidefinite is fulfilled in our case. We can scale the standard deviation of the random vector by any positive real number since for all we have: To show property , we need to consider a multivariate random vector: with covariance matrix . We fix . We divide the set into disjoint sets, namely into , as defined in Definition 5 and the complementary set . Note that: holds. This implies: for . Hence, we arrive at: for . Consequently, . In order to prove , we consider: to be a random vector with independent distributed entries. For and such that , we obtain: since for all . This was shown in the proof of Lemma 3.4 in [20]. All in all, we derive and hence, have proven the lemma. □ The case exhibits the property that the standard range of the long-range dependence parameter has to be divided into two different sets. If , the transformed process is still long-range dependent; see [16], Table 5.1.
If , the transformed process is short-range dependent, which means by definition that the autocorrelations of the transformed process are summable; see [13], Remark 2.3. Therefore, we have two different asymptotic distributions that have to be considered for the estimator of coincident patterns.
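The split into two regimes can be made concrete with a small helper. The sketch below relies on a standard fact that is not written out in the text above: for a functional of Hermite rank m, long-range dependence survives the transformation when the long-range dependence parameter exceeds (1 - 1/m)/2, which gives the threshold 1/4 for m = 2. It also uses the relation d = H - 1/2 between d and the Hurst parameter H of fractional Gaussian noise. The function names and the returned labels are hypothetical, and the boundary value itself is treated separately because it is not covered by either regime.

```python
def hurst_to_d(hurst):
    """Long-range dependence parameter d = H - 1/2 for fractional Gaussian noise."""
    return hurst - 0.5


def limit_regime(d_star, hermite_rank=2):
    """Classify which type of limit theorem applies for a given parameter d_star < 1/2."""
    if d_star >= 0.5:
        raise ValueError("d_star must be smaller than 1/2")
    threshold = 0.5 * (1.0 - 1.0 / hermite_rank)  # equals 1/4 for Hermite rank 2
    if d_star > threshold:
        return "non-central"  # transformed process stays long-range dependent
    if d_star < threshold:
        return "central"      # transformed process is short-range dependent
    return "boundary"


if __name__ == "__main__":
    for hurst in (0.6, 0.8, 0.9):
        print(hurst, limit_regime(hurst_to_d(hurst)))
```

In terms of the Hurst parameter, the threshold 1/4 corresponds to H = 3/4.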

3.1. Limit Theorem for the Estimator of p in Case of Long-Range Dependence

First, we restrict ourselves to the case that at least one of the two parameters and is in . This assures . We explicitly include mixed cases where the process corresponding to is allowed to be long-range as well as short-range dependent. Note that this setting includes the pure long-range dependence case, which means that for , we have , or even . However, in general, the assumptions are weaker, such that we only require for either or and the other parameter is also allowed to be in or . We can, therefore, apply the results of Corollary 1 and obtain the following asymptotic distribution for : Under the assumption in with for each denotes the matrix of second order Hermite coefficients. The proof of this theorem is an immediate application of Corollary 1 and Lemma 1. Note that for it holds that it is square integrable with respect to and that the set of discontinuity points is a null set with respect to the -dimensional Lebesgue measure. This is shown in [18], Equation (4.5). □ Following Theorem 2, we are also able to express the limit distribution above in terms of two standard Rosenblatt random variables by modifying the weighting factors in the limit distribution. Note that this requires slightly stronger assumptions than in Theorem 1. Let with Following [ We turn to an example that deals with the asymptotic variance of the estimator of p in Theorem 3 in the case . We focus on the case We start with the calculation of the second order Hermite coefficients in the case and: Due to Recall that the inverse By using the formula for Plugging the second order Hermite coefficients and the entries of the inverse of the covariance matrix depending on and: Therefore, in the case It is not possible to analytically determine the limit variance for

3.2. Limit Theorem for the Estimator of p in Case of Short-Range Dependence

In this section, we focus on the case of . If , we are still dealing with a long-range dependent multivariate Gaussian process . However, the transformed process is no longer long-range dependent, since we are considering a function with Hermite rank 2, see also [16], Table 5.1. Otherwise, if , the process itself is already short-range dependent, since the cross-correlations are summable. Therefore, we obtain the following central limit theorem by applying Theorem 4 in [14]. Under the assumptions in with: We close this section with a brief review of the results obtained. We established limit theorems for the estimator of p, the probability of coincident patterns in both time series, and hence for the most important parameter in the context of ordinal pattern dependence. The long-range dependent case as well as the mixed case of short- and long-range dependence was considered. Finally, we provided a central limit theorem for a multivariate Gaussian time series that is short-range dependent if transformed by . In the subsequent section, we provide a simulation study that illustrates our theoretical findings. In doing so, we shed light on the Rosenblatt distribution and the distribution of the sum of Rosenblatt distributed random variables.

4. Simulation Study

We begin with the generation of a bivariate long-range dependent fractional Gaussian noise series . First, we simulate two independent fractional Gaussian noise processes and , generated with the R package “longmemo”, for a fixed parameter in both time series. For the reader’s convenience, we denote the long-range dependence parameter d by as is common when dealing with fractional Gaussian noise and fractional Brownian motion. We refer to H as the Hurst parameter, tracing back to the work of [23]. For and we generate samples, for , we choose . We denote the correlation function of univariate fractional Gaussian noise by , . Then, we obtain for : for . Note that this yields the following properties for the cross-correlations of the two processes for : We use and to obtain unit variance in the second process. Note that we chose the same Hurst parameter in both processes to get a better simulation result. The simulations of the processes and are visualized in Figure 5. On the left-hand side, the different fractional Gaussian noises depending on the Hurst parameter H are displayed. They represent the stationary long-range dependent Gaussian increment processes we need in view of the limit theorems we derived in Section 3. The processes in which we are comparing the coincident ordinal patterns, namely and , are shown on the right-hand side in Figure 5. The long-range dependent behavior of the increment processes is clearly visible in these processes: roughly speaking, they become smoother the larger the Hurst parameter gets.

Figure 5

Plots of 500 data points of one path of two dependent fractional Gaussian noise processes (left) and the paths of the corresponding fractional Brownian motions (right) for different Hurst parameters: (top), (middle), (bottom).
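For readers who want to reproduce paths like those in Figure 5 without R, the following sketch generates two dependent fractional Gaussian noise series in Python. It is not the procedure of the "longmemo" package: it uses the exact Cholesky factorization of the fractional Gaussian noise covariance matrix, which is simple but only practical for moderate sample sizes. The mixing construction (second series equal to a * first noise + b * independent noise with a^2 + b^2 = 1) and the weights 0.6 and 0.8 are assumptions chosen for illustration, and all function names are hypothetical.

```python
import numpy as np


def fgn_autocovariance(lags, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise at the given lags."""
    k = np.abs(np.asarray(lags, dtype=float))
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1.0) ** two_h - 2.0 * k ** two_h + np.abs(k - 1.0) ** two_h)


def simulate_fgn(n, hurst, rng):
    """Draw one path of length n via the Cholesky factor of the covariance matrix."""
    idx = np.arange(n)
    cov = fgn_autocovariance(idx[:, None] - idx[None, :], hurst)
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)


def simulate_bivariate_fgn(n, hurst, a=0.6, b=0.8, seed=0):
    """Two dependent series; the second has unit variance when a**2 + b**2 == 1 (assumed construction)."""
    rng = np.random.default_rng(seed)
    first = simulate_fgn(n, hurst, rng)
    independent = simulate_fgn(n, hurst, rng)
    second = a * first + b * independent
    return first, second


if __name__ == "__main__":
    increments_1, increments_2 = simulate_bivariate_fgn(500, hurst=0.8)
    # Cumulative sums give the corresponding fractional Brownian motion paths.
    path_1, path_2 = np.cumsum(increments_1), np.cumsum(increments_2)
    print(path_1[:3], path_2[:3])
```

For long series, an FFT-based method such as circulant embedding would be the more efficient choice; the Cholesky approach costs on the order of n^3 operations.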

We turn to the simulation results for the asymptotic distribution of the estimator . The first limit theorem is given in Theorem 3 for and . In the case of , a different limit theorem holds, see Theorem 5. Therefore, we turn to the simulation results of the asymptotic distribution of the estimator of p, as shown in Figure 6 for pattern length . The asymptotic normality in case can be clearly observed. We turn to the interpretation of the simulation results of the distribution of for and as the weighted sum of the sample (cross-)correlations: we observe in the Q–Q plot for that the samples in the upper and lower tail deviate from the reference line. For , a similar behavior in the Q–Q plot is observed.

Figure 6

Histogram, kernel density estimation and Q–Q plot with respect to the normal distribution () or to the Rosenblatt distribution of with for different Hurst parameters: (top); (middle); (bottom).

We want to verify the result in Theorem 4 that it is possible, by a different weighting, to express the limit distribution of as the distribution of the sum of two independent standard Rosenblatt random variables. The simulated convergence result is provided in Figure 7. We observed the standard Rosenblatt distribution.

Figure 7

Histogram, kernel density estimation and Q–Q plot with respect to the Rosenblatt distribution of for different Hurst parameters: (top); (bottom).

5. Conclusions and Outlook

We considered limit theorems in the context of the estimation of ordinal pattern dependence in the long-range dependence setting. Pure long-range dependence, as well as mixed cases of short- and long-range dependence, were considered alongside the transformed short-range dependent case. Therefore, we complemented the asymptotic results in [11]. Hence, we made ordinal pattern dependence applicable for long-range dependent data sets as they arise in the context of neurology (see [24]) or artificial intelligence (see [25]). As these kinds of data were already investigated using ordinal patterns (see, for example, [26]), this emphasizes the large practical impact of the ordinal approach in analyzing the dependence structure of multivariate time series. This yields various research opportunities in these fields in the future. Our results rely on the assumption of Gaussianity of the considered multivariate time series. If we focus on comparing the coincident ordinal patterns in a stationary long-range dependent bivariate time series, we highly benefit from the property of ordinal patterns not being affected by monotone transformations. It is possible to transform the data set to the Gaussian framework without losing the necessary ordinal information. In applications, this property is highly desirable. If we consider the more general setting, that is, stationary increments, the mathematical theory in the background gets a lot more complex, which leads to the limitations of our results. A crucial argument used in the proofs of the results in Section 2 is given in the Reduction Theorem, originally proven in Theorem 4.1 in [27] in the univariate case and extended to the multivariate setting in Theorem 6 in [14]. For further details, we refer the reader to Appendix A. However, this result only holds in the Gaussian case.
Limit theorems for the sample cross-correlation process of multivariate linear long-range dependent processes with Hermite rank 2 have recently been proven in Theorem 4 in [28]. This is possibly an interesting starting point to adapt the proofs in Appendix A to this larger class of processes without requiring Gaussianity. Since our results are based on a discrete bivariate time series, an interesting extension is given by continuous-time processes and the associated discretization techniques that preserve the ordinal perspective. To think even further beyond our scope, a generalization to categorical data is conceivable and yields an interesting open research opportunity.
