Structural change detection in ordinal time series.

Abstract

Change-point detection in health care data has recently received considerable attention owing to the increased availability of complex data in real time. In many applications, the observed data form an ordinal time series. Two test statistics are proposed to detect a structural change in the cumulative logistic regression model, which is often used to analyze ordinal time series. One is the standardized efficient score vector; the other is the quadratic form of the efficient score vector with a weight function. Under the null hypothesis, we derive the asymptotic distributions of the two test statistics, and we prove their consistency under the alternative hypothesis. We also study the consistency of the change-point estimator, and a binary segmentation procedure is suggested for estimating the locations of possible multiple change-points. Simulation results show that the former statistic performs better when the change-point occurs at the centre of the data, whereas the latter is preferable when the change-point occurs near the beginning or end of the data. Furthermore, the former statistic can identify which parameter caused the rejection of the null hypothesis. Finally, we apply the two test statistics to a set of sleep data; the results show that there is a structural change in the data.

Year: 2021 PMID: 34398909 PMCID: PMC8367010 DOI: 10.1371/journal.pone.0256128

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

In categorical data analysis, ordinal categorical variables are encountered in many contexts, such as health status (very good, good, so-so, bad, very bad) and blood pressure (low, normal, high). Such data, observed hourly or daily, constitute an ordinal time series. The cumulative logistic regression model is often applied to analyze ordinal time series [1]. Sometimes the model changes at unknown time points (change-points) while remaining stable between them. Structural stability is of prime importance in statistical modeling and inference: if the parameters change within the observed sample, inferences can be severely biased and forecasts lose accuracy. It is therefore necessary to detect structural changes. Structural change detection has been a popular research subject in statistics; see Csörgö and Horváth [2], Bai and Perron [3], Lee et al. [4], Perron [5], Gombay [6], Wang et al. [7], Chen et al. [8], Baranowski et al. [9], Wang et al. [10], Chen [11] and Liu et al. [12] for reviews of the field.

Structural change detection in categorical data has been considered as well. Höhle [13] proposed a prospective CUSUM change-point detection procedure to detect a structural change in categorical time series; Wang et al. [10] described a procedure based on a high-dimensional homogeneity test to detect and estimate multiple change-points in multinomial data; Plasse and Adams [14] illustrated a multiple change-point detection method for categorical data streams that adaptively monitors the category probabilities. As generalized linear regression models for categorical time series allow for parsimonious modeling and the incorporation of random time-dependent covariates, Fokianos and Kedem [15] suggested the generalized linear model for categorical time series modeling. For change-point detection in the generalized linear model, Xia et al. [16] introduced two procedures to sequentially detect a structural change in generalized linear models under an independence assumption; Hudecová [17] investigated the detection of change in autoregressive models for binary time series; Fokianos et al. [18] provided a statistical procedure based on the partial likelihood score process to detect a structural change in the binary logistic regression model; Gombay et al. [19] and Li et al. [20] discussed retrospective and sequential change detection in the multinomial logistic regression model. Score tests for detecting changes in time series models have been studied by Gombay and Serban [21] and Gombay et al. [22]. The score test statistic is usually computationally less demanding than the likelihood ratio test statistic.

In this paper, we first propose a test statistic based on the efficient score vector to detect a structural change in the cumulative logistic regression model, which extends the change-point detection of Gombay et al. [19]. Simulation shows that the empirical power of this statistic is low when the change-point occurs near the beginning or end of the data. To address this, we propose a second statistic, which is the quadratic form of the efficient score vector with a weight function. Under the null hypothesis of no change, we derive the asymptotic distributions of the two statistics, and we prove their consistency under the alternative hypothesis. We also study the consistency of the change-point estimator, and a binary segmentation procedure is suggested for estimating the locations of possible multiple change-points. Simulation results show that the empirical size of the two statistics is close to the significance level 0.05 and that the empirical power approaches 1 when the sample size is large. The empirical power of the former statistic is higher when the change-point is located at the centre of the data, whereas the latter performs better when the change-point is located near the beginning or end of the data. Furthermore, the former statistic can identify which parameter caused the rejection of the null hypothesis. Finally, we apply the two statistics to a set of sleep data and find a structural change in the data.

The model and hypotheses

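Consider a categorical time series $\{\mathbf{Y}_t\}$ with $m$ categories, $\mathbf{Y}_t = (Y_{t1}, \ldots, Y_{tq})'$, $q = m - 1$, where for $t = 1, 2, \ldots, n$ and $j = 1, \ldots, q$, $Y_{tj} = 1$ if the $j$-th category is observed at time $t$ and $Y_{tj} = 0$ otherwise. The vector of conditional probabilities $\boldsymbol{\pi}_t = (\pi_{t1}, \ldots, \pi_{tq})'$ is defined by $\pi_{tj} = P(Y_{tj} = 1 \mid \mathcal{F}_{t-1})$ for every $t$, where $\mathcal{F}_{t-1}$ denotes the information available up to time $t - 1$ and $\{\mathbf{Z}_{t-1}\}$ denotes the $p \times q$ covariate matrices. Define an ordinal time series $\{Y_t\}$, where $Y_t = j$ is equivalent to $Y_{tj} = 1$ for $j = 1, 2, \ldots, m$, $t = 1, 2, \ldots, n$.

Let $\{X_t\}$ be a latent variable time series, where $X_t = -\boldsymbol{\beta}'\mathbf{z}_{t-1} + e_t$, $\boldsymbol{\beta} \in \mathbb{R}^d$, $\mathbf{z}_{t-1}$ is a $d$-dimensional covariate vector, and $e_t$ is a white noise process with continuous cumulative distribution function $F$. Suppose that $-\infty = \alpha_0 < \alpha_1 < \cdots < \alpha_m = \infty$ are threshold parameters such that $Y_t$ satisfies

$$Y_t = j \iff \alpha_{j-1} \le X_t < \alpha_j$$

for $j = 1, 2, \ldots, m$. According to the equivalence relation between $Y_t$ and $Y_{tj}$, we have

$$\pi_{tj} = P(\alpha_{j-1} \le X_t < \alpha_j \mid \mathcal{F}_{t-1}) = F(\alpha_j + \boldsymbol{\beta}'\mathbf{z}_{t-1}) - F(\alpha_{j-1} + \boldsymbol{\beta}'\mathbf{z}_{t-1}),$$

then

$$P(Y_t \le j \mid \mathcal{F}_{t-1}) = F(\alpha_j + \boldsymbol{\beta}'\mathbf{z}_{t-1}).$$

If $F(x)$ is the logistic distribution function, then $F^{-1}$ is the logistic link function $\operatorname{logit}(x) = \ln(x/(1 - x))$, $0 < x < 1$. Thus we have

$$\operatorname{logit} P(Y_t \le j \mid \mathcal{F}_{t-1}) = \alpha_j + \boldsymbol{\beta}'\mathbf{z}_{t-1}, \quad j = 1, \ldots, q,$$

which is called the cumulative logistic regression model.

Let $\boldsymbol{\theta} = (\alpha_1, \ldots, \alpha_q, \boldsymbol{\beta}')'$ be a $p$-dimensional parameter vector, $p = q + d$. In this paper we wish to test whether there is a structural change in the parameter $\boldsymbol{\theta}$, that is,

$$H_0: \boldsymbol{\theta}_t = \boldsymbol{\theta}_0,\ t = 1, \ldots, n \quad \text{versus} \quad H_A: \boldsymbol{\theta}_t = \boldsymbol{\theta}_0,\ t = 1, \ldots, k^*;\ \boldsymbol{\theta}_t = \boldsymbol{\theta}_A,\ t = k^* + 1, \ldots, n,$$

where $\boldsymbol{\theta}_0$ is the true value of $\boldsymbol{\theta}$, $k^*$ denotes the change-point, which occurs in some components of the parameter $\boldsymbol{\theta}$, and $\boldsymbol{\theta}_A$, $\boldsymbol{\theta}_0$ and $k^*$ are unknown.

Next, we estimate the parameter vector $\boldsymbol{\theta}$ by the partial likelihood method (Fokianos et al. [18]). The partial likelihood function and the partial log-likelihood function are defined in Gombay et al. [19]. Denote the partial score vector

$$\mathbf{S}_n(\boldsymbol{\theta}) = \sum_{t=1}^{n} \mathbf{Z}_{t-1}\mathbf{D}_t(\boldsymbol{\theta})\boldsymbol{\Sigma}_t^{-1}(\boldsymbol{\theta})\left(\mathbf{Y}_t - \boldsymbol{\pi}_t(\boldsymbol{\theta})\right),$$

where $\boldsymbol{\eta}_t = \mathbf{Z}_{t-1}'\boldsymbol{\theta}$, $\mathbf{D}_t(\boldsymbol{\theta}) = \partial\mathbf{h}(\boldsymbol{\eta}_t)/\partial\boldsymbol{\eta}_t$, $\mathbf{h}(\boldsymbol{\eta}) = (h_1(\boldsymbol{\eta}), \ldots, h_q(\boldsymbol{\eta}))'$, $\boldsymbol{\pi}_t(\boldsymbol{\theta}) = \mathbf{h}(\boldsymbol{\eta}_t)$. The functions $h_1(\boldsymbol{\eta}), \ldots, h_q(\boldsymbol{\eta})$ satisfy

$$\pi_{t1} = h_1(\boldsymbol{\eta}_t) = F(\eta_{t1}), \qquad \pi_{tj} = h_j(\boldsymbol{\eta}_t) = F(\eta_{tj}) - F(\eta_{t(j-1)}), \quad j = 2, \ldots, q,$$

where $\eta_{tj} = \alpha_j + \boldsymbol{\beta}'\mathbf{z}_{t-1}$. $\boldsymbol{\Sigma}_t(\boldsymbol{\theta})$ is the conditional covariance matrix of $\mathbf{Y}_t$ with elements $\sigma_{ij} = \pi_{ti}(1 - \pi_{ti})$ for $i = j$ and $\sigma_{ij} = -\pi_{ti}\pi_{tj}$ for $i \ne j$, $i, j = 1, \ldots, q$ [23].

To obtain the existence, consistency and asymptotic normality of the maximum partial likelihood estimator, we make a few assumptions on the covariate matrices $\{\mathbf{Z}_{t-1}\}$ and the parameter vector $\boldsymbol{\theta}$.

Assumption 1 The parameter vector $\boldsymbol{\theta} \in \Omega$, where Ω is an open set.

Assumption 2 The link function h is twice continuously differentiable and satisfies $\det(\partial\mathbf{h}(\boldsymbol{\eta})/\partial\boldsymbol{\eta}) \ne 0$.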
Assumption 3 The covariate matrix $\mathbf{Z}_{t-1}$ lies almost surely in a non-random compact subset Φ of $\mathbb{R}^{p \times q}$ such that $P\left(\sum_{t=1}^{n}\mathbf{Z}_{t-1}\mathbf{Z}_{t-1}' > 0\right) = 1$, and $\mathbf{Z}_{t-1}'\boldsymbol{\theta}$ lies almost surely in the domain H of h for all $\mathbf{Z}_{t-1}$ ∈ Φ and θ ∈ Ω, where $\mathbf{A} > 0$ means $\boldsymbol{\lambda}'\mathbf{A}\boldsymbol{\lambda} > 0$ for all $\boldsymbol{\lambda} \ne \mathbf{0}$.

Assumptions 1 and 2 ensure that the second derivative of the partial log-likelihood l(θ) is continuous; $\det(\partial\mathbf{h}(\boldsymbol{\eta})/\partial\boldsymbol{\eta}) \ne 0$ implies that the matrix $\partial\mathbf{h}(\boldsymbol{\eta})/\partial\boldsymbol{\eta}$ is not singular (Fokianos and Kedem [24]). From Assumption 3, $\sum_{t=1}^{n}\mathbf{Z}_{t-1}\mathbf{Z}_{t-1}'$ is positive definite with probability one [24].

Since the likelihood estimation employs an assumption regarding ergodicity of the joint process (Fokianos and Truquet [25]), let $\{Y_t\}$ be a time series taking values in a finite set E with cardinality m, and such that where , , , q is a transition kernel. We assume that the maps are measurable, as maps from to (0, 1), where is a sequence, , is such that . Assume that v1 and v2 are two probability measures on E, define For y, y′ ∈ E and a positive integer s, we write if , 0 ≤ i ≤ s − 1 (Truquet [26]).

Assumption 4 The d-dimensional covariate vector $\{\mathbf{z}_{t-1}\}$ is stationary and ergodic.

Assumption 5 Setting for s ≥ 0, we have b0 < 1 and .

Assumptions 4 and 5 guarantee that the joint process is stationary and ergodic [26]. Assumptions 1–5 are required to obtain consistency and asymptotic normality of the maximum likelihood estimator. However, existence of moments for the covariate process is still required to study large sample properties of the maximum likelihood estimator [25]. So we have

Assumption 6 , i = 1, 2, ⋯, d, where , 1 ≤ i ≤ d, are the components of the covariate vector.
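As a rough illustration of the cumulative logistic regression model of this section, the sketch below turns thresholds and a linear predictor into category probabilities. It assumes the latent-variable form X = −β′z + e with logistic F, so that P(Y ≤ j | past) = F(α_j + β′z); the function names `logistic_cdf` and `category_probs` and the example inputs are illustrative choices, not part of the paper.

```python
import math


def logistic_cdf(x):
    """Logistic distribution function F(x) = 1 / (1 + exp(-x)), computed stably."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def category_probs(alphas, beta, z):
    """Return [P(Y = 1), ..., P(Y = m)] under a cumulative logit model.

    alphas: strictly increasing thresholds alpha_1 < ... < alpha_q (q = m - 1)
    beta, z: coefficient and covariate vectors of equal length d
    """
    if any(a >= b for a, b in zip(alphas, alphas[1:])):
        raise ValueError("thresholds must be strictly increasing")
    if len(beta) != len(z):
        raise ValueError("beta and z must have the same length")
    lin = sum(b * x for b, x in zip(beta, z))
    # Cumulative probabilities P(Y <= j) = F(alpha_j + beta'z) for j = 1..q, and 1 for j = m.
    cum = [logistic_cdf(a + lin) for a in alphas] + [1.0]
    # Successive differences of the cumulative probabilities give the category probabilities.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


# Hypothetical inputs: three categories, zero covariates.
probs = category_probs([-0.5, 0.2], [2.0, 0.5, 1.0], [0.0, 0.0, 0.0])
print(probs)
```

Because the thresholds are increasing, every difference is positive and the probabilities sum to one, which is why the threshold ordering is checked up front.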

The proposed testing procedure

Based on the partial likelihood score process, a test statistic is defined by where , is the maximum partial likelihood estimator of θ, which can be obtained by maximizing the partial log-likelihood function (1) (see Fokianos and Kedem [23]). Under the null hypothesis of no change, we derive the asymptotic distribution of the proposed test statistic.

Theorem 1 If Assumptions 1–6 and H0 hold, then we have where , B(t) is a p-dimensional vector of independent Brownian bridges, and means convergence in distribution.

Proof: Since , we can write let denote the i-th element of , i = 1, 2, …, p, and let θ0 be the true value of θ; then we have Next, similarly to the proof of Proposition 3 in Gombay et al. [19], we can prove that By Theorem 4.1 of Fokianos and Kedem [23], we get The error terms involve higher-order products of , and it can be shown that . According to Proposition 1 of Gombay et al. [19] and Slutsky’s theorem, we get as n → ∞.

Remark 1 When using the above test, if there exists some i, 1 ≤ i ≤ p, , the null hypothesis is rejected and a change-point is declared, where $\alpha^* = 1 - (1 - \alpha)^{1/p}$. Let B(u) be a one-dimensional Brownian bridge; Csörgö and Révész [27] suggested that C(α*) could be obtained by solving

$$P\left(\sup_{0 \le u \le 1}|B(u)| > C(\alpha^*)\right) = 2\sum_{k=1}^{\infty}(-1)^{k+1}\exp\left(-2k^2C(\alpha^*)^2\right) = \alpha^*.$$

Simulation shows that W1 performs poorly at the boundaries. In particular, the limiting Brownian bridge is tied down at t = 0 and t = 1 (meaning B(0) = B(1) = 0), which hampers the ability of the test to detect a structural change occurring near the beginning or end of the data. Many authors address this problem by adding a weight function [28]. Therefore, we construct a new test statistic, which is the quadratic form of the efficient score vector with a weight function, where {i1, i2, …, ij} ⊂ {1, 2, …, p}, j = 1, 2, …, p, 0 < l < h < 1.

Theorem 2 If Assumptions 1–6 and H0 hold, then we have

$$W_2 \xrightarrow{d} \sup_{l \le t \le h}\sum_{i=1}^{j}\frac{B_i^2(t)}{t(1 - t)}$$

for each 0 < l < h < 1, where $B_i(t)$, i = 1, …, j, are independent one-dimensional Brownian bridges.

The conclusion of Theorem 2 can be deduced directly from Theorem 1.
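The critical value C(α*) in Remark 1 can be computed numerically. The sketch below assumes the classical Kolmogorov series for the tail of sup|B(u)| of a one-dimensional Brownian bridge and the adjusted level α* = 1 − (1 − α)^{1/p}; the function names and the bisection bounds are illustrative choices rather than the authors' implementation.

```python
import math


def brownian_bridge_sup_tail(x, terms=100):
    """P(sup_{0<=u<=1} |B(u)| > x) via the series 2 * sum_{k>=1} (-1)^(k+1) exp(-2 k^2 x^2)."""
    if x <= 0:
        return 1.0
    return 2.0 * sum(
        (-1) ** (k + 1) * math.exp(-2.0 * k * k * x * x) for k in range(1, terms + 1)
    )


def adjusted_level(alpha, p):
    """Per-component level alpha* = 1 - (1 - alpha)^(1/p) for p independent components."""
    return 1.0 - (1.0 - alpha) ** (1.0 / p)


def critical_value(alpha, p=1, lo=0.3, hi=5.0, tol=1e-10):
    """Solve brownian_bridge_sup_tail(c) = alpha* for c by bisection (the tail decreases in c)."""
    target = adjusted_level(alpha, p)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if brownian_bridge_sup_tail(mid) > target:
            lo = mid  # tail still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


c_one = critical_value(0.05, p=1)  # single tested component
c_two = critical_value(0.05, p=2)  # two tested components, stricter per-component level
print(c_one, c_two)
```

With p = 1 and α = 0.05 the root is close to 1.36, in line with the critical value used in the application to real data; a larger p gives a smaller α* and therefore a larger critical value.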
To obtain the critical values of the asymptotic distribution, Csörgö and Horváth [2] used a result of Vostrikova [29] to show that

$$P\left(\sup_{l \le t \le h}\sum_{i=1}^{j}\frac{B_i^2(t)}{t(1 - t)} \ge x\right) = \frac{x^{j/2}e^{-x/2}}{2^{j/2}\Gamma(j/2)}\left\{\left(1 - \frac{j}{x}\right)\log\frac{(1 - l)h}{l(1 - h)} + \frac{4}{x} + O\left(\frac{1}{x^2}\right)\right\}$$

as x → ∞. For example, when α = 0.05, l = 0.05, h = 0.95, j = 2, the critical value is C(α) = 13.1.

Under the alternative hypothesis there is a structural change in the model, and we now prove the consistency of the two statistics.

Theorem 3 Suppose Assumptions 1–6 and the alternative hypothesis hold. If the coefficient changes from $\boldsymbol{\theta}_0$ to $\boldsymbol{\theta}_A$ at k*, $\theta_{A,j} = \theta_{0,j} + \delta$, where $\theta_{0,j}$ is the jth component of $\boldsymbol{\theta}_0$, j ∈ {1, 2, …, p}, and δ is a constant, δ ≠ 0, then we have where 0 < l < h < 1, ‖⋅‖ denotes the Euclidean norm of a vector, and means convergence in probability.

Proof: Under the alternative hypothesis, $\boldsymbol{\theta}_t = \boldsymbol{\theta}_0$ for t = 1, 2, …, k*, and $\boldsymbol{\theta}_t = \boldsymbol{\theta}_A$ for t = k* + 1, …, n. Suppose that the coefficient changes from $\boldsymbol{\theta}_0$ to $\boldsymbol{\theta}_A$ at k*, $\theta_{A,j} = \theta_{0,j} + \delta$, where $\theta_{0,j}$ is the j-th component of $\boldsymbol{\theta}_0$, 1 < j < p, and δ is a constant, δ ≠ 0. When k* < k < n, where , . For the ith component of , 1 < i < p, we have where has two orders of products of , has two orders of products of . By Theorem 1 we have as n → ∞. Following Assumptions 1–6, we conclude that Since δ ≠ 0, we have as n → ∞. When 1 < k < k*, the proof is similar. The proof of (ii) is similar to the proof of (i).

Once the null hypothesis is rejected, indicating that a change-point may exist, we locate the change-point with the estimator $\hat{k}$ given by (2). The following theorem shows that the change-point estimator is consistent for the true change-point k* as n → ∞.

Theorem 4 Let k* be the true position of the change-point under the alternative hypothesis and let $\hat{k}$ be the estimate of k* given by (2). Under Assumptions 1–6, $\hat{k}$ is consistent for k* as n → ∞.

Proof: First we note that where i = 1, 2, …, p. Since where . And because Therefore increases as k = 1, 2, ⋯, k*, and decreases as k = k* + 1, k* + 2, ⋯, n, so we take (2) as the change-point estimator. By the proof of Theorem 1, we have and . By Theorem 2 of Gombay [6], to prove (2) it is enough to show that and where . To prove (3), assume that there exists a constant K, K < k*, where By Theorem 1, choosing δ > 0 arbitrarily if K is large enough, so (3) is proven. The proof of (4) is the same by symmetry.

To detect multiple structural changes in the sequence, we can employ the binary segmentation method [30]. First, apply the single change-point test. If H0 is rejected, find $\hat{k}$ by (2). Next, divide the sample into two subsamples, the observations up to $\hat{k}$ and those after $\hat{k}$, and test both subsamples for further changes. Continue this segmentation procedure until no subsample contains a further change-point.
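The binary segmentation procedure described above can be sketched generically. In the code below the single change-point test is a pluggable function; the one supplied here, `cusum_mean_test`, is a simple CUSUM test for a shift in mean that stands in for the score-based statistics of this paper (it is not the authors' W1 or W2), and `min_len` and the toy series are hypothetical.

```python
import math


def cusum_mean_test(segment, critical=1.358):
    """Stand-in single-change test: CUSUM for a mean shift.

    Returns (reject, k_hat), where k_hat is the number of observations
    assigned to the first regime of the segment.
    """
    n = len(segment)
    mean = sum(segment) / n
    var = sum((x - mean) ** 2 for x in segment) / n
    if var == 0.0:
        return False, 0  # a constant segment carries no evidence of change
    best_k, best_stat, partial = 0, 0.0, 0.0
    for k in range(1, n):
        partial += segment[k - 1] - mean
        stat = abs(partial) / math.sqrt(n * var)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_stat > critical, best_k


def binary_segmentation(series, single_change_test, min_len=10):
    """Recursively split the series wherever the single-change test rejects."""
    change_points = []
    stack = [(0, len(series))]  # half-open index ranges still to be tested
    while stack:
        start, end = stack.pop()
        if end - start < 2 * min_len:
            continue  # too short to test reliably
        reject, k_hat = single_change_test(series[start:end])
        if not reject:
            continue
        split = start + k_hat
        change_points.append(split)
        stack.append((start, split))
        stack.append((split, end))
    return sorted(change_points)


# Toy series with level shifts after observations 100 and 200.
toy = [0.0] * 100 + [1.0] * 100 + [3.0] * 100
found = binary_segmentation(toy, cusum_mean_test)
print(found)
```

Passing the test as an argument keeps the segmentation logic independent of the statistic, so a score-based test could be substituted without changing the recursion.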

Simulation

To evaluate the finite sample performance of the two proposed test statistics (W1 and W2), we first simulate an ordinal time series {Y_t} with m = 3 categories and length n = 100, 200, 500, 1000. The data are generated by where α1 = −0.5, α2 = 0.2, (β1, β2, β3)′ = (2, 0.5, 1)′, so the parameter vector is θ = (α1, α2, β1, β2, β3)′. All simulation results are based on 1000 replications at the 0.05 significance level. Suppose that we are only interested in α1 and β1 and that the others are nuisance parameters. Table 1 shows the empirical size of the two statistics under the null hypothesis H0. W11 and W12 denote the empirical size of W1 when testing for a change in α1 and in β1, respectively. W1 and W2 denote the empirical size of W1 and W2, respectively, when testing for a change in both α1 and β1.
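A data-generating step of this kind can be sketched through the latent-variable representation of the model. The thresholds and coefficients below are those stated above; everything else is an assumption made for illustration: the covariates are drawn as independent standard normal variables (the covariates used in the paper are not recoverable from the text), and the post-change thresholds in the example are hypothetical.

```python
import math
import random


def simulate_ordinal_series(n, alphas, beta, change=None, seed=0):
    """Simulate Y_1..Y_n from a cumulative logit model via X = -beta'z + e.

    change: optional (k_star, alphas_after, beta_after); parameters switch after k_star.
    Covariates are i.i.d. N(0, 1) here, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    series = []
    for t in range(1, n + 1):
        a, b = alphas, beta
        if change is not None and t > change[0]:
            a, b = change[1], change[2]
        z = [rng.gauss(0.0, 1.0) for _ in b]
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the inverse logistic CDF
            u = rng.random()
        e = math.log(u / (1.0 - u))  # logistic noise by inversion
        x = -sum(bi * zi for bi, zi in zip(b, z)) + e
        # Y = j exactly when alpha_{j-1} <= X < alpha_j, i.e. 1 + number of thresholds at or below X.
        series.append(1 + sum(1 for thr in a if x >= thr))
    return series


alphas0, beta0 = [-0.5, 0.2], [2.0, 0.5, 1.0]
null_series = simulate_ordinal_series(200, alphas0, beta0)
# Hypothetical alternative: alpha_1 and alpha_2 shift after observation 100.
alt_series = simulate_ordinal_series(200, alphas0, beta0, change=(100, [0.0, 0.7], beta0))
print(null_series[:10], alt_series[:10])
```

Using the same seed for both calls makes the two series share their pre-change segment, which isolates the effect of the parameter change.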

Table 1

The empirical size of W1 and W2 under the null hypothesis H0.

n	100	200	500	1000
W11	0.013	0.014	0.018	0.019
W12	0.01	0.018	0.02	0.014
W1	0.038	0.027	0.046	0.037
W2	0.038	0.045	0.033	0.047

It can be seen from Table 1 that the empirical sizes of W1 and W2 stay below the nominal level for every sample size n and are close to the significance level 0.05 when n = 1000. In addition, based on the relation between the probability of type I errors when detecting α1 or β1 and the overall probability of type I errors, that is , and W1 should satisfy . The results show that , which confirms the above inference. Under the alternative hypothesis, we consider the following three different situations: where k* = 0.1n, …, 0.9n. Tables 2–4 summarize the empirical power of W1 and W2 under the alternative hypotheses , and when k* = 0.1n, 0.5n, 0.8n. W11 and W12 denote the empirical power of W1 when testing for a change in α1 and in β1, respectively. W1 and W2 denote the empirical power of W1 and W2 when testing for a change in both α1 and β1. The simulation results show that the empirical power of the two statistics increases with the sample size n and is close to 1 when n = 1000 and k* = 0.5n. In addition, the empirical power of the two statistics varies with the change-point location and generally reaches its maximum when k* = 0.5n. Fig 1 shows the empirical power of the two statistics when k* = 0.1n, …, 0.9n. The empirical power of W1 is higher than that of W2 when the change-point is located at the centre of the data, whereas W2 performs better when the change-point is located near the beginning or end of the data.
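To put "close to the significance level" in context, one can compare the empirical sizes in Table 1 with the Monte Carlo error expected from 1000 replications. The sketch below is a rough check and not part of the authors' analysis; the two-standard-error band is an assumed convention, and the dictionary simply copies the n = 1000 entries for W1 and W2 from Table 1.

```python
import math


def mc_standard_error(p, replications):
    """Standard error of a rejection frequency estimated from independent replications."""
    return math.sqrt(p * (1.0 - p) / replications)


nominal, reps = 0.05, 1000
se = mc_standard_error(nominal, reps)
band = (nominal - 2.0 * se, nominal + 2.0 * se)

# Empirical sizes at n = 1000, copied from Table 1.
table1_n1000 = {"W1": 0.037, "W2": 0.047}
within = {name: band[0] <= size <= band[1] for name, size in table1_n1000.items()}
print(se, band, within)
```

Both n = 1000 entries fall inside this band, so the reported sizes are compatible with a true size of 0.05 up to simulation noise.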

Table 2

The empirical power of W1 and W2 under the alternative hypothesis in which both α1 and β1 change.

	k* = 0.1n				k* = 0.5n				k* = 0.8n
n	100	200	500	1000	100	200	500	1000	100	200	500	1000
W11	0.016	0.016	0.032	0.057	0.055	0.092	0.292	0.6	0.027	0.04	0.094	0.208
W12	0.01	0.028	0.089	0.14	0.069	0.186	0.502	0.859	0.027	0.042	0.112	0.33
W1	0.037	0.059	0.102	0.215	0.117	0.264	0.66	0.946	0.029	0.077	0.213	0.454
W2	0.084	0.153	0.3	0.541	0.068	0.168	0.569	0.924	0.03	0.064	0.259	0.643

Table 4

The empirical power of W1 and W2 under the alternative hypothesis in which only β1 changes.

	k* = 0.1n				k* = 0.5n				k* = 0.8n
n	100	200	500	1000	100	200	500	1000	100	200	500	1000
W11	0.006	0.017	0.019	0.029	0.013	0.026	0.029	0.037	0.011	0.015	0.02	0.029
W12	0.02	0.031	0.077	0.158	0.059	0.19	0.599	0.901	0.014	0.039	0.13	0.415
W1	0.032	0.059	0.097	0.203	0.078	0.231	0.615	0.921	0.037	0.061	0.178	0.433
W2	0.1	0.142	0.265	0.436	0.052	0.142	0.461	0.813	0.022	0.036	0.145	0.429

Fig 1

The empirical power of W1 and W2 under the alternative hypothesis when k* = 100, 200, …, 900 and n = 1000.

In the simulation for Table 2 both α1 and β1 change, whereas in Tables 3 and 4 only α1 and only β1 changes, respectively, at different change-points. Tables 3 and 4 indicate that most of the power stems from the parameter that changed, which means that W1 can not only detect a change in the parameters but also identify which parameter caused the rejection of the null hypothesis.

Table 3

The empirical power of W1 and W2 under the alternative hypothesis in which only α1 changes.

	k* = 0.1n				k* = 0.5n				k* = 0.8n
n	100	200	500	1000	100	200	500	1000	100	200	500	1000
W11	0.018	0.021	0.052	0.086	0.084	0.192	0.541	0.857	0.031	0.084	0.184	0.495
W12	0.012	0.017	0.018	0.026	0.016	0.01	0.028	0.028	0.016	0.022	0.026	0.019
W1	0.028	0.044	0.082	0.098	0.104	0.201	0.545	0.873	0.053	0.067	0.215	0.498
W2	0.029	0.047	0.086	0.222	0.064	0.127	0.394	0.805	0.064	0.095	0.252	0.565

Application to real data

To illustrate the applicability of our results, we use 1000 sleep-state observations {Y_t} of a newborn infant, sampled every 30 seconds (Fokianos and Kedem [23]). The sleep states are classified as follows: (1) quiet sleep, (2) indeterminate sleep, (3) active sleep, (4) awake (Fig 2). According to the newborn’s sleep pattern, the sleep states have the following order: “(4)” < “(1)” < “(2)” < “(3)”, which means that {Y_t} is an ordinal time series. One goal of analyzing these data is to establish a correct model and to predict the sleep state from the covariate information. Following Example 6.3 of [23], the lagged response vector $\mathbf{Y}_{t-1} = (Y_{(t-1)1}, Y_{(t-1)2}, Y_{(t-1)3})'$ is a significant predictor and can be used as a covariate. These data can then be modeled by a cumulative logistic regression model where α1 = −14.722, α2 = −10.389, α3 = −4.078, β1 = 18.663, β2 = 12.173, β3 = 7.566.

Fig 2

1000 sleep data (Y) collected from the sleep state measurements of a newborn infant sampled every 30 seconds.

Let θ = (α1, α2, α3, β1, β2, β3)′. Testing for a structural change in θ with the statistics W1 and W2, we find that a structural change occurs in θ. We then use W1 to check which parameter changes; the result shows a structural change in α2 at observation 596. Specifically, the maximum of W1 is 3.446 and the critical value is 1.35 when p = 1, α = 0.05, which gives a significant result (Fig 3). Re-estimating the parameters based on the first 596 samples and the last 404 samples, we have for the former and for the latter. We obtain AIC = 1646.65 for the adjusted model and AIC = 1652.89 when assuming there is no change-point; the lower AIC indicates that allowing for the change-point improves the model to some extent, which supports more accurate predictions.

Fig 3

The value of W1 when testing for α2, the critical value at α = 0.05 is 1.35, and the location of change-point is 596.

Concluding remark

The cumulative logistic regression model is a generalized linear model with wide application in health care. In this paper, two test statistics based on the efficient score vector are proposed to detect a structural change in the cumulative logistic regression model. Under the null hypothesis of no change, we derive the asymptotic distributions of the two test statistics, and we prove their consistency under the alternative hypothesis. Furthermore, we prove the consistency of the change-point estimator, and a binary segmentation procedure is provided for estimating the locations of possible multiple change-points. The finite sample performance is investigated by a Monte Carlo simulation; the results show that the empirical size of the two statistics is close to the significance level 0.05 and that the empirical power approaches 1 when the sample size is large. In terms of empirical power, the two test statistics have different advantages depending on where the change-point occurs. Furthermore, the proposed statistic W1 can identify which parameter caused the rejection of the null hypothesis. Finally, we apply the two test statistics to 1000 sleep-state observations of a newborn infant sampled every 30 seconds; the results show that there is a structural change in the model.

Simulation data for Table 1.