
Change-point analysis through integer-valued autoregressive process with application to some COVID-19 data.

Subhankar Chattopadhyay1, Raju Maiti2, Samarjit Das2, Atanu Biswas1.   

Abstract

In this article, we consider the problem of change-point analysis for count time series data through an integer-valued autoregressive process of order 1 (INAR(1)) with time-varying covariates. Such features arise in many real-life scenarios, especially in COVID-19 data sets, where the number of active cases first falls and then rises again. To capture these features, we use a Poisson INAR(1) process with a time-varying smoothing covariate. With such a model, we can describe both components of the active cases at time-point t, namely (i) the number of nonrecovery cases carried over from the previous time-point and (ii) the number of new cases at time-point t. We study some theoretical properties of the proposed model along with forecasting. Simulation studies are performed to assess the effectiveness of the proposed method. Finally, we analyze two COVID-19 data sets and compare our proposed model with another PINAR(1) process that has a time-varying covariate but no change-point, to demonstrate the overall performance of our proposed model.
© 2021 Netherlands Society for Statistics and Operations Research.


Keywords:  COVID‐19; INAR(1) process; Poisson distribution; active cases; change‐point; smoothing function; time‐varying covariates

Year:  2021        PMID: 34226773      PMCID: PMC8242783          DOI: 10.1111/stan.12251

Source DB:  PubMed          Journal:  Stat Neerl        ISSN: 0039-0402            Impact factor:   1.239


INTRODUCTION

Time series of count data have been widely studied over the last three decades or so because of their increasing relevance to various fields of science. There are several ways to model count time series data. For example, McKenzie (1985, 1986) and Al-Osh and Alzaid (1987) introduced a class of stationary integer-valued autoregressive (INAR) time series processes based on the binomial thinning operator. This process was further studied and generalized by Alzaid and Al-Osh (1990), Jin-Guan and Yuan (1991), Freeland and McCabe (2004), Ristić, Bakouch, and Nastić (2009), Jazi, Jones, and Lai (2012), Schweer and Weiß (2014), Maiti, Biswas, and Das (2015), and many more. In particular, McKenzie (1986) introduced the integer-valued AR(1) or INAR(1) models with geometric and negative binomial marginals for overdispersed data. McKenzie (1985) and Al-Osh and Alzaid (1987) developed an INAR(1) process with Poisson marginals, well known as the PINAR(1) process, which is very popular due to its simple form. The INAR(1) process was further extended to a more general INAR(p) process by Alzaid and Al-Osh (1990) and Jin-Guan and Yuan (1991). Ristić et al. (2009) and Schweer and Weiß (2014) proposed a new INAR(1) process based on the negative binomial thinning operator, which can also handle the overdispersion problem. Jazi et al. (2012) and Maiti et al. (2015) studied zero-inflated PINAR(1) (ZIPINAR(1)) processes for zero-inflated count data. Apart from these thinning-based INAR processes, Cameron and Trivedi (1986) and Fokianos (2011) studied regression-based time series models for count data. In this article, we employ the INAR process to model the data of COVID-19 active cases, which is an example of count time series data. In an INAR process there are two components at time-point t, namely (i) nonrecovery cases from the previous time-point (survival part) and (ii) new cases entering the process at time-point t (innovation terms).
These INAR processes are mostly stationary since the innovation terms involve no time-varying covariate, that is, the new cases entering the process are not time-dependent. But in real-life scenarios like the COVID-19 data sets, the rapid change in the number of infected cases makes the innovation terms time-dependent. Besides this time-varying nature of the innovation terms, we also notice some change-points in these data sets. In the current scenario of the COVID-19 pandemic, we see mainly two types of curves for daily new cases reported in different parts of the world: (i) a curve that at first increased exponentially but started decreasing after major steps such as "nationwide lockdowns," "social distancing" measures, mass testing, and so on were taken by the respective authorities in different countries, and (ii) a curve that came down and then started to rise again as the respective authorities began to ease those measures in some parts of the world. The curves of daily active cases change in the same way in those parts of the world. Hence we can spot one change-point (upward to downward) for the curve described in Case (i) and two change-points (upward to downward and then downward to upward) for the curve in Case (ii). In this article, we develop a PINAR process based on the binomial thinning operator for count time series data like the COVID-19 data, where we model the innovation terms through some time-varying covariates and a smoothing change-point function without changing the survival part. The PINAR process, introduced by McKenzie (1985) and Al-Osh and Alzaid (1987), has wide application in modeling count time series data. But this PINAR process based on the binomial thinning operator cannot handle count time series data that have both change-points and time-varying innovation terms.
Hence we introduce a new PINAR model that can handle both of these features, which are found in the COVID-19 data sets. To incorporate the change-points in our proposed PINAR model, the innovation terms are modeled with a smoothed version (see Smooth maximum, n.d.) of a time-varying covariate that contains the change-points. The idea of capturing the change-points in the innovation terms through a time-varying smoothing covariate is inspired by Chan and Tong (1986), Hansen (2000), and Fong, Huang, Gilbert, and Permar (2017), whose works are mainly based on continuous data. We use this smoothed version of the time-varying components in our proposed model to capture the changing curvatures in the data of daily active cases. The effectiveness of the proposed model, for both one change-point and two change-points, is assessed later by a simulation study and the analysis of two COVID-19 data sets. We compare our proposed model with another PINAR model that has a time-varying covariate but no change-point, to illustrate the overall performance of the proposed model. The rest of the article is organized as follows. Section 2 discusses two real COVID-19 data sets. Section 3 describes our proposed model along with a brief illustration of the INAR(1) process. We provide the distributional forms of our proposed model and the h-step ahead forecasting distribution in Sections 4 and 5, respectively. In Section 6, we discuss the estimation method for our proposed model. Extensive simulation studies for our proposed model are provided in Section 7. In Section 8, we analyze the COVID-19 data sets. Finally, some conclusions are drawn in Section 9. All the proofs of the theoretical results are provided in the Appendix.

MOTIVATING DATA EXAMPLES: COVID‐19 DATA

The world is now facing, in the COVID-19 pandemic, a global health crisis unlike any in recent times. The outbreak was first identified in Wuhan, China, in early December 2019. The World Health Organization declared the outbreak a Public Health Emergency of International Concern on January 30, 2020 and a pandemic on March 11, 2020. To restrict the spread of this virus in its early stages, the respective authorities in different parts of the world implemented heavy measures such as "nationwide lockdowns," "rapid testing," strict "social distancing," the use of masks and sanitizers in public places, and so on. As a result, the COVID-19 situation improved in certain parts of the world, and the lockdown was eased in those parts. During that period, some Gulf evacuations also took place in different countries, especially in India. Consequently, "community transmission" started in those parts of the world due to the highly infectious nature of this virus, and the number of infected cases began to pile up again. For further discussion in this regard, we explore two real COVID-19 data sets in Sections 2.1 and 2.2.

COVID‐19 data of Italy

This data set is an example of Case (i) described in Section 1. We see only one change-point in the data of active cases of Italy, and hence the study is based on one change-point analysis. The data (see Worldometer, n.d.) are collected from February 15 to June 6, 2020 (113 days in total). Though the first case in this country was detected back in January 2020, the cases started to increase rapidly from the beginning of March. After continuous measures taken by the authorities, the curve of active cases started to come down. As of June 6, 2020, the total number of confirmed cases was more than 234k, and the number of deaths was more than 33.8k. The number of active cases was more than 35,000. Figure 1 displays the data of new daily cases, the data of daily active cases, and the autocorrelation function (ACF) and partial ACF (PACF) plots of daily active cases. From the ACF and PACF plots, an AR(1) process appears to fit the data well.
FIGURE 1

COVID‐19 data of Italy


COVID‐19 data of Kerala

This data set is an example of Case (ii). Here we observe two change-points in the data of daily active cases, and hence the study is based on two change-point analysis. In Kerala, the first case was also detected back in January 2020, but the cases started to pile up from mid-March. Due to heavy measures taken by the state government of Kerala, the curve of active cases came down, but from mid-May the cases started to rise again when the Gulf evacuees began to come into the state. The data for Kerala (see GoK Dashboard, n.d.) are collected from March 9 to June 6, 2020 (90 days in total). More than 1800 cases and a total of 15 deaths were reported in Kerala as of June 6, 2020, and the number of active cases was more than 1000. Figure 2 displays the data of new daily cases, the data of daily active cases, and the ACF and PACF plots of daily active cases. An AR(1) process again appears to fit the data well.
FIGURE 2

COVID‐19 data of Kerala


MODEL

In this section, we develop a new model based on the integer-valued AR(1) process to capture the change-points in count time series data sets like the COVID-19 data sets of Italy and Kerala. We use the INAR(1) process (proposed by McKenzie, 1985 and Al-Osh & Alzaid, 1987), built on the binomial thinning operator (introduced by Steutel & van Harn, 1979), to develop our proposed model for change-point analysis, which is given by

$$Y_t = \alpha \circ Y_{t-1} + \varepsilon_t, \tag{1}$$

where $Y_t$ denotes the number of daily active cases at time-point t and $\varepsilon_t$ represents daily new cases reported at time-point t. We assume that $\varepsilon_t$ follows Poisson($\lambda_t$), where $\lambda_t$ is assumed to have the following form:

$$\lambda_t = \exp\left\{\beta_0 + \beta_1 \frac{(t-\tau)\,e^{\delta_n (t-\tau)}}{1 + e^{\delta_n (t-\tau)}} + \beta_2 t\right\}, \tag{2}$$

where the tuning parameter $\delta_n$ helps to capture the changing curvature of the data and $\tau$ denotes the change-point in the data. The change-point can be easily estimated from the data. Here $\beta = (\beta_0, \beta_1, \beta_2)$ are the regression parameters, and $\beta_1$ is the regression coefficient associated with the time-varying covariate containing the change-point.

The above model is defined only for one change-point. However, we can easily extend the model to more than one change-point. For two change-points, only the form of $\lambda_t$ changes, and the functional form is given by

$$\lambda_t = \exp\left\{\beta_0 + \beta_1 \frac{(t-\tau_1)\,e^{\delta_n (t-\tau_1)}}{1 + e^{\delta_n (t-\tau_1)}} + \beta_2 \frac{(t-\tau_2)\,e^{\delta_n (t-\tau_2)}}{1 + e^{\delta_n (t-\tau_2)}} + \beta_3 t\right\}, \tag{3}$$

where $\tau_1$ and $\tau_2$ are the two change-points in the data. Here $\beta = (\beta_0, \beta_1, \beta_2, \beta_3)$ are the regression parameters, and $\beta_1$ and $\beta_2$ are the regression coefficients associated with the time-varying covariates containing the change-points. In the subsequent section, we provide a general idea about our proposed model.

The use of one tuning parameter $\delta_n$ for data with one change-point can be extended to more than one change-point, for example by using two different tuning parameters $\delta_{1n}$ and $\delta_{2n}$ for data with two change-points. But in our proposed process for two change-points, we use only $\delta_n$ instead of $\delta_{1n}$ and $\delta_{2n}$, mainly because using a single $\delta_n$ reduces the computational difficulty and simplifies the form of the proposed model.
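To make the data-generating mechanism concrete, the following is a minimal simulation sketch for the one change-point case. It assumes the log-linear form $\lambda_t = \exp\{\beta_0 + \beta_1 s_{\delta}(t-\tau) + \beta_2 t\}$ with the smooth maximum $s_{\delta}(x) = x e^{\delta x}/(1+e^{\delta x})$, and it starts the process from $Y_1 \sim$ Poisson($\lambda_1$). The function names, the chunked Poisson sampler, and the parameter values in the usage note are illustrative choices, not taken from the article.

```python
import math
import random


def smooth_plus(x, delta):
    """Smooth version of max(0, x): x * e^{delta*x} / (1 + e^{delta*x})."""
    z = delta * x
    # Evaluate the logistic weight in a form that cannot overflow.
    if z >= 0:
        w = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        w = e / (1.0 + e)
    return x * w


def poisson_draw(lam, rng):
    """Poisson(lam) draw; lam is split into chunks so exp(-chunk) never underflows."""
    total = 0
    while lam > 0:
        chunk = min(lam, 30.0)
        lam -= chunk
        limit, prod = math.exp(-chunk), rng.random()
        while prod > limit:  # Knuth's multiplication method on each chunk
            total += 1
            prod *= rng.random()
    return total


def simulate_pinar1(n, alpha, beta, tau, delta, seed=0):
    """Simulate Y_1..Y_n from Y_t = alpha o Y_{t-1} + eps_t, eps_t ~ Poisson(lambda_t)."""
    rng = random.Random(seed)
    b0, b1, b2 = beta
    y, path = 0, []
    for t in range(1, n + 1):
        lam = math.exp(b0 + b1 * smooth_plus(t - tau, delta) + b2 * t)
        # Binomial thinning: each of the y existing cases survives with prob. alpha.
        survivors = sum(rng.random() < alpha for _ in range(y))
        y = survivors + poisson_draw(lam, rng)
        path.append(y)
    return path
```

For example, `simulate_pinar1(100, 0.5, (1.0, -0.1, 0.05), tau=50, delta=0.5)` produces a path that rises until roughly time-point 50 and falls afterwards; these parameter values are hypothetical.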

Idea behind the model

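The idea behind the form of $\lambda_t$, discussed in Equations (2) and (3), comes from the threshold regression model setup (see Chan & Tong, 1986; Fong et al., 2017; Hansen, 2000). From the concept of the segmented model in the threshold regression setup, we can write the form of $\lambda_t$ for one change-point as $\lambda_t = \exp\{\beta_0 + \beta_1 (t-\tau)_+ + \beta_2 t\}$, where $(t-\tau)_+ = \max(0, t-\tau)$ and $\tau$ is the change-point. In this segmented form of $\lambda_t$, we see a sharp change (upward to downward or downward to upward) in the curve of daily new cases and hence in the curve of daily active cases. But in real-life scenarios like the COVID-19 data, we do not see sharp changes; most of the time we notice changing curvature(s) in these data sets. So we try to capture those changing curvature(s) in the data of daily active cases by modeling the data of daily new cases (innovation terms) in the proposed model through some time-varying covariates and smoothing change-point functions. Moreover, the function $(t-\tau)_+$ is not differentiable at $t = \tau$. So we replace $\max(0, x)$ (for $x = t - \tau$) by a smooth differentiable maximum function (see Smooth maximum, n.d.), which is given by

$$\frac{x\,e^{\delta_n x}}{1 + e^{\delta_n x}}.$$

Hence the functional form of $\lambda_t$ for one change-point is given by

$$\log \lambda_t = \beta_0 + \beta_1 \frac{(t-\tau)\,e^{\delta_n (t-\tau)}}{1 + e^{\delta_n (t-\tau)}} + \beta_2 t,$$

that is,

$$\lambda_t = \exp\left\{\beta_0 + \beta_1 \frac{(t-\tau)\,e^{\delta_n (t-\tau)}}{1 + e^{\delta_n (t-\tau)}} + \beta_2 t\right\}.$$

In a similar way, we can find the functional form of $\lambda_t$ for two change-points, which is given by

$$\lambda_t = \exp\left\{\beta_0 + \beta_1 \frac{(t-\tau_1)\,e^{\delta_n (t-\tau_1)}}{1 + e^{\delta_n (t-\tau_1)}} + \beta_2 \frac{(t-\tau_2)\,e^{\delta_n (t-\tau_2)}}{1 + e^{\delta_n (t-\tau_2)}} + \beta_3 t\right\}.$$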
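As a short worked check of why this replacement is reasonable, the block below writes the smooth maximum as $s_\delta(x) = x\,\sigma(\delta x)$, where $\sigma$ is the logistic function, and records its limits and derivative. The notation $s_\delta$ and $\sigma$ is introduced here for illustration only, and the form of $s_\delta$ is assumed to be $x e^{\delta x}/(1+e^{\delta x})$.

```latex
\begin{align*}
s_\delta(x) &= \frac{x\,e^{\delta x}}{1+e^{\delta x}} = x\,\sigma(\delta x),
  \qquad \sigma(z) = \frac{1}{1+e^{-z}},\\[4pt]
\lim_{\delta\to\infty} s_\delta(x) &=
  \begin{cases} x, & x > 0,\\ 0, & x \le 0, \end{cases}
  \qquad\text{so } s_\delta(x) \to \max(0, x),\\[4pt]
\lim_{\delta\to 0^{+}} s_\delta(x) &= \tfrac{x}{2},\\[4pt]
s_\delta'(x) &= \sigma(\delta x) + \delta x\,\sigma(\delta x)\bigl(1-\sigma(\delta x)\bigr),
  \qquad s_\delta'(0) = \tfrac12 .
\end{align*}
```

The derivative exists everywhere, including at $x = 0$ (that is, at $t = \tau$), which is the point where the segmented term $\max(0, t-\tau)$ has a kink. Larger $\delta$ moves the curve closer to the segmented form.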

Conditions on the $\beta$'s

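The changing behaviors of these data sets depend on some conditions on the $\beta$'s. We provide those conditions through the form of the segmented model of the threshold regression setup for both sets of $\beta$'s ($(\beta_0, \beta_1, \beta_2)$ in Equation (2), and $(\beta_0, \beta_1, \beta_2, \beta_3)$ in Equation (3)), which enable our proposed model to capture the change-point(s). The required conditions for both the one change-point and the two change-point studies are given below.

(i) In the segmented form of $\lambda_t$ for one change-point analysis, we model $\log \lambda_t$ as $\beta_0 + \beta_2 t$ for $t \le \tau$ and $\beta_0 + \beta_1 (t-\tau) + \beta_2 t$ for $t > \tau$. So the derivatives of $\log \lambda_t$ are $\beta_2$ for $t \le \tau$ and $\beta_1 + \beta_2$ for $t > \tau$ in this segmented model setup for one change-point. So for $\beta_2 > 0$ and $\beta_1 + \beta_2 < 0$, $\log \lambda_t$ increases when $t \le \tau$ and decreases when $t > \tau$, that is, $\lambda_t$ increases when $t \le \tau$ and decreases when $t > \tau$. Hence the change-point is the $\tau$th time-point in the data of daily new cases. So for count time series data with one change-point, the condition $\beta_2 > 0$, $\beta_1 + \beta_2 < 0$ must hold.

(ii) Similarly, for the study of two change-points, we model $\log \lambda_t$ as $\beta_0 + \beta_3 t$ for $t \le \tau_1$, $\beta_0 + \beta_1 (t-\tau_1) + \beta_3 t$ for $\tau_1 < t \le \tau_2$, and $\beta_0 + \beta_1 (t-\tau_1) + \beta_2 (t-\tau_2) + \beta_3 t$ for $t > \tau_2$. Hence the derivatives of $\log \lambda_t$ are $\beta_3$ for $t \le \tau_1$, $\beta_1 + \beta_3$ for $\tau_1 < t \le \tau_2$, and $\beta_1 + \beta_2 + \beta_3$ for $t > \tau_2$ in the segmented model for two change-points. So for $\beta_3 > 0$, $\beta_1 + \beta_3 < 0$, and $\beta_1 + \beta_2 + \beta_3 > 0$, $\log \lambda_t$ increases when $t \le \tau_1$, decreases when $\tau_1 < t \le \tau_2$, and increases again when $t > \tau_2$; that is, $\lambda_t$ increases when $t \le \tau_1$, decreases when $\tau_1 < t \le \tau_2$, and increases again when $t > \tau_2$. Here the two change-points are the $\tau_1$th and $\tau_2$th time-points in the data of daily new cases. So the condition $\beta_3 > 0$, $\beta_1 + \beta_3 < 0$, $\beta_1 + \beta_2 + \beta_3 > 0$ must hold for count time series data containing two change-points.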
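The sign conditions above are easy to check programmatically. The sketch below assumes the parameter ordering $(\beta_0, \beta_1, \beta_2)$ with $\beta_2$ multiplying $t$ for one change-point, and $(\beta_0, \beta_1, \beta_2, \beta_3)$ with $\beta_3$ multiplying $t$ for two change-points. The function names and the example values are hypothetical.

```python
def one_cp_slopes(beta):
    """Slopes of log(lambda_t) before and after the change-point (segmented form)."""
    _, b1, b2 = beta
    return (b2, b1 + b2)


def satisfies_one_cp(beta):
    """True if the curve rises before the change-point and falls after it."""
    before, after = one_cp_slopes(beta)
    return before > 0 and after < 0


def two_cp_slopes(beta):
    """Slopes of log(lambda_t) on the three segments (segmented form)."""
    _, b1, b2, b3 = beta
    return (b3, b1 + b3, b1 + b2 + b3)


def satisfies_two_cp(beta):
    """True if the curve rises, then falls, then rises again."""
    first, second, third = two_cp_slopes(beta)
    return first > 0 and second < 0 and third > 0
```

With the illustrative values `(0.15, -0.2, 0.02)`, the slopes are 0.02 before and -0.18 after the change-point, so `satisfies_one_cp` returns `True`; flipping the sign of the middle coefficient makes it return `False`.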

Choices of the tuning parameter

The tuning parameter of our proposed model, $\delta_n$, helps to capture the changing curvature(s) in the data. Here $\delta_n > 0$. To compute the optimal value of the tuning parameter from the data, we consider a grid search method (see Chakraborty, Laber, & Zhao, 2013; James, Witten, Hastie, & Tibshirani, 2013). In this method, we use a goodness-of-fit measure, based on which the optimal value of $\delta_n$ is calculated. The idea of $\delta_n$ comes from the concept of the smooth maximum (Smooth maximum, n.d.), and so, as the value of $\delta_n$ increases, the changing curvature becomes sharper. We show this property in Figures 3 and 4, where we can clearly see that as the value of $\delta_n$ moves from 0.05 to 1, the changing curvatures become sharper for both the one change-point and the two change-point studies. We also add the nonsmoothed version (no use of $\delta_n$) of the generated data, that is, the segmented data.
FIGURE 3

The changing curvatures for the one change-point study for different values of $\delta_n$, along with segmented data (no use of $\delta_n$)

FIGURE 4

The changing curvatures for the two change-point study for different values of $\delta_n$, along with segmented data (no use of $\delta_n$)
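The grid search for the tuning parameter can be sketched as follows. The article does not spell out the goodness-of-fit measure in this section, so the example uses a stand-in: the sum of squared differences between a candidate smooth curve and a noise-free target curve generated with a known tuning value. The grid, the target, and all names are assumptions made for illustration.

```python
import math


def smooth_plus(x, delta):
    """Smooth version of max(0, x): x * e^{delta*x} / (1 + e^{delta*x})."""
    z = delta * x
    w = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
    return x * w


def grid_search(grid, loss):
    """Return the grid value with the smallest loss."""
    return min(grid, key=loss)


# Toy target: the smooth change-point covariate generated with a known delta.
n, tau, true_delta = 100, 50, 0.2
target = [smooth_plus(t - tau, true_delta) for t in range(1, n + 1)]


def sse(delta):
    """Stand-in goodness-of-fit measure: squared distance to the target curve."""
    return sum((smooth_plus(t - tau, delta) - target[t - 1]) ** 2
               for t in range(1, n + 1))


grid = [0.05, 0.1, 0.2, 0.5, 1.0]
best_delta = grid_search(grid, sse)
```

In practice the loss would be computed from the fitted model for each candidate value, for example the minimized conditional least squares criterion, rather than from a known target curve.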


Estimation of change‐point(s)

To estimate the change-point(s), we take the difference between every two consecutive observations (i.e., $Y_t - Y_{t-1}$) and consider the signs of those differences, denoted by $S_t$, where $S_t = +$ if $Y_t - Y_{t-1} > 0$ and $S_t = -$ otherwise. For a data set with one change-point, the sequence $\{S_t\}$ should give us two runs: (1) a run of $+$, and (2) a run of $-$ (see Wald & Wolfowitz, 1940). Depending on whether the curve of $Y_t$ first increases or first decreases, the run of $+$ and the run of $-$ will be interchanged. For example, if the original time series plot of $Y_t$ is bell-shaped (i.e., the observations initially increase and then, after a certain time-point (say, $\tau$), decrease), we will have a run of $+$ first and then, after the time-point $\tau$, a run of $-$. The time-point at which the first run of $+$ ends gives us an estimate of the original change-point $\tau$. However, in real scenarios, time series data with one change-point may not be smooth, and random fluctuations are often present in the data. As a result, there might be many small runs of $+$ and $-$, which make it difficult for the above procedure to locate the true change-point. Hence we employ a presmoothing approach before implementing the above run-based point estimation. That is, instead of working with the actual time series data, we smooth the data with a standard statistical approach such as an m-point moving average or a pth degree polynomial fit.

For time series data with two change-points (say, $\tau_1$ and $\tau_2$), the sequence $\{S_t\}$ should produce three runs: (1) a run of $+$, (2) a run of $-$, and (3) again a run of $+$. Here the run of $+$ and the run of $-$ are interchanged twice, that is, a run of $+$ for the increasing curvature, then a run of $-$ for the decreasing curvature, and another run of $+$ when cases again begin to rise (another increasing curvature). The time-point at which the first run ends gives us an estimate of the first change-point $\tau_1$, and the time-point at which the second run ends provides an estimate of the second change-point $\tau_2$.
However, as in the case of one change-point, such time series data sets are nonsmooth, and hence a presmoothing approach such as an m-point moving average or a pth degree polynomial fit is required. Later, in Section 7.2, we perform a simulation study where we estimate the true change-point(s) and provide confidence interval(s) (CI(s)) based on the normal approximation. We also study the large sample properties by varying the sample size.
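The run-based estimator with presmoothing can be sketched in a few lines. The version below uses a centred m-point moving average (with odd m) and reports the time-points at which a run of equal signs ends. The helper names, the choice of a centred window, and the toy series are illustrative assumptions rather than the article's exact procedure.

```python
def moving_average(y, m):
    """Centred m-point moving average (m odd); the first and last m//2 points are dropped."""
    half = m // 2
    return [sum(y[i - half:i + half + 1]) / m for i in range(half, len(y) - half)]


def run_ends(y_smooth, offset=0):
    """1-based time-points at which a run of '+' or '-' signs ends.

    offset is the number of points dropped at the start by the smoother,
    so that the returned positions refer to the original series.
    """
    # Sign of each consecutive difference: +1 if increasing, -1 otherwise.
    signs = [1 if b - a > 0 else -1 for a, b in zip(y_smooth, y_smooth[1:])]
    ends = []
    for k in range(1, len(signs)):
        if signs[k] != signs[k - 1]:
            ends.append(k + 1 + offset)
    return ends


# Toy bell-shaped series peaking at t = 50 (one change-point).
bell = [3000 - (t - 50) ** 2 for t in range(1, 101)]
one_cp = run_ends(moving_average(bell, 5), offset=2)

# Toy series rising to t = 40, falling to t = 70, then rising (two change-points).
zigzag = [t if t <= 40 else (80 - t if t <= 70 else t - 60) for t in range(1, 101)]
two_cp = run_ends(moving_average(zigzag, 5), offset=2)
```

On noisy data more than the expected number of run ends may remain after smoothing; a wider window or a polynomial fit would then be needed before reading off the change-points.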

DISTRIBUTIONAL PROPERTIES

In this section, we study the conditional and the marginal distributions of the proposed model.

Conditional distribution

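Under our proposed setup, the conditional distribution of $Y_t$ given $Y_{t-1}$ and $\mathcal{X}_t$ (the set of all covariates up to time-point t, including the smooth time-varying and simple time-varying covariates up to time-point t) can be derived as

$$P(Y_t = j \mid Y_{t-1} = i, \mathcal{X}_t) = \sum_{k=0}^{i} \binom{i}{k} \alpha^{k} (1-\alpha)^{i-k}\, \frac{e^{-\lambda_t} \lambda_t^{\,j-k}}{(j-k)!}\, I(k \le j),$$

where $I(\cdot)$ is the indicator function. This is the probability of going from state i to state j in a single step. The conditional mean and variance are given by $E(Y_t \mid Y_{t-1}, \mathcal{X}_t) = \alpha Y_{t-1} + \lambda_t$ and $\mathrm{Var}(Y_t \mid Y_{t-1}, \mathcal{X}_t) = \alpha(1-\alpha) Y_{t-1} + \lambda_t$, respectively.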
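A direct way to sanity-check the one-step transition probability is to evaluate it numerically and compare its mean and variance with $\alpha i + \lambda_t$ and $\alpha(1-\alpha) i + \lambda_t$. The sketch below assumes the standard binomial-Poisson convolution form for a Poisson INAR(1) process; the function name and the numerical values are illustrative.

```python
import math


def transition_pmf(j, i, alpha, lam):
    """P(Y_t = j | Y_{t-1} = i): Binomial(i, alpha) survivors plus Poisson(lam) arrivals."""
    total = 0.0
    # k survivors and j - k new arrivals; k cannot exceed i or j.
    for k in range(0, min(i, j) + 1):
        binom = math.comb(i, k) * alpha ** k * (1 - alpha) ** (i - k)
        pois = math.exp(-lam) * lam ** (j - k) / math.factorial(j - k)
        total += binom * pois
    return total


# Numerical check with illustrative values i = 5, alpha = 0.4, lambda_t = 2.
probs = [transition_pmf(j, 5, 0.4, 2.0) for j in range(60)]
mean = sum(j * p for j, p in enumerate(probs))
var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
```

Here `mean` agrees with $0.4 \cdot 5 + 2 = 4$ and `var` with $0.4 \cdot 0.6 \cdot 5 + 2 = 3.2$ up to truncation and rounding error. For counts in the thousands this direct evaluation becomes numerically fragile, which is consistent with the difficulty reported for likelihood evaluation later in the article.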

Marginal distribution

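Since the marginal distribution of $Y_t$ is difficult to obtain, we find the partial marginal distribution of $Y_t$ given $\mathcal{X}_t$ for $t \ge 1$; henceforth it is called the marginal distribution. Here we derive the probability generating function (PGF) of $Y_t$ given $\mathcal{X}_t$. The derivation is valid for $t \ge 2$, and hence we assume that, given $\mathcal{X}_1$, the marginal distribution of $Y_1$ is Poisson($\lambda_1$). The reason behind this assumption is as follows. The elements that enter the system in the interval $(t-1, t]$ are the innovation term at time-point t ($\varepsilon_t$). Now for $t = 1$, the interval is (0, 1], and there is no previous interval in the system. So in the interval (0, 1], the elements that enter the system can be seen as the first count $Y_1$. Hence we can assume $Y_1 \sim$ Poisson($\lambda_1$). Under the assumptions that $Y_1 \sim$ Poisson($\lambda_1$) and $\varepsilon_t \sim$ Poisson($\lambda_t$), we can show that the PGF of $Y_t$ is

$$G_{Y_t}(s) = \exp\left\{(s-1) \sum_{j=0}^{t-1} \alpha^{j} \lambda_{t-j}\right\},$$

that is, given $\mathcal{X}_t$, $Y_t$ follows a Poisson distribution with mean $\mu_t = \sum_{j=0}^{t-1} \alpha^{j} \lambda_{t-j}$. The derivation of this result is presented in Appendix A. We can also use a recursive formula as an alternative way to derive the marginal distribution, which is given by

$$P(Y_t = j \mid \mathcal{X}_t) = \sum_{i=0}^{\infty} P(Y_{t-1} = i \mid \mathcal{X}_{t-1}) \sum_{k=0}^{i} \binom{i}{k} \alpha^{k} (1-\alpha)^{i-k}\, \frac{e^{-\lambda_t} \lambda_t^{\,j-k}}{(j-k)!}\, I(k \le j),$$

where $I(\cdot)$ is the indicator function. Here the marginal mean and the marginal variance are given by

$$E(Y_t \mid \mathcal{X}_t) = \sum_{j=0}^{t-1} \alpha^{j} \lambda_{t-j} \quad \text{and} \quad \mathrm{Var}(Y_t \mid \mathcal{X}_t) = \sum_{j=0}^{t-1} \alpha^{j} \lambda_{t-j}.$$

Under the above setup, the autocovariance function (ACVF) of $Y_t$ given $\mathcal{X}_{t+h}$, using the equation $Y_{t+h} \overset{d}{=} \alpha^{h} \circ Y_t + \sum_{j=0}^{h-1} \alpha^{j} \circ \varepsilon_{t+h-j}$, can be derived as

$$\gamma_t(h) = \mathrm{Cov}(Y_t, Y_{t+h}) = \alpha^{h}\, \mathrm{Var}(Y_t) = \alpha^{h} \mu_t.$$

The derivation of this result is presented in Appendix B. Hence for $h \ge 0$, the ACF can be derived as follows:

$$\rho_t(h) = \frac{\gamma_t(h)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t+h})}} = \alpha^{h} \sqrt{\frac{\mu_t}{\mu_{t+h}}}.$$

It can be seen that the above expression decays exponentially to 0 as h goes to $\infty$ for $\alpha \in (0, 1)$ and the restricted $\beta$'s discussed in Section 3.2.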
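The marginal mean can be computed either from the closed-form sum or from the one-step recursion $\mu_t = \alpha \mu_{t-1} + \lambda_t$ with $\mu_1 = \lambda_1$, which follows from taking expectations in the model equation. The sketch below checks that the two agree and evaluates the ACF expression $\alpha^{h}\sqrt{\mu_t/\mu_{t+h}}$; the function names and the toy intensities are assumptions made for illustration.

```python
import math


def marginal_means(lams, alpha):
    """mu_1..mu_n from the recursion mu_t = alpha * mu_{t-1} + lambda_t, mu_1 = lambda_1."""
    mus, prev = [], 0.0
    for lam in lams:
        prev = alpha * prev + lam
        mus.append(prev)
    return mus


def marginal_mean_closed_form(lams, alpha, t):
    """mu_t = sum_{j=0}^{t-1} alpha^j * lambda_{t-j}, with 1-based t."""
    return sum(alpha ** j * lams[t - 1 - j] for j in range(t))


def acf(mus, alpha, t, h):
    """rho_t(h) = alpha^h * sqrt(mu_t / mu_{t+h}), with 1-based t."""
    return alpha ** h * math.sqrt(mus[t - 1] / mus[t + h - 1])


lams = [1.0, 2.0, 3.0, 4.0]  # toy values of lambda_1..lambda_4
mus = marginal_means(lams, 0.5)
```

With these toy intensities and $\alpha = 0.5$, the recursion gives $\mu_4 = 6.125$, matching $4 + 0.5 \cdot 3 + 0.25 \cdot 2 + 0.125 \cdot 1$.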

FORECASTING

h‐Step ahead forecasting distribution

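To find the h-step ahead forecasting distribution, we use the following recursive method:

$$Y_{n+h} = \alpha \circ Y_{n+h-1} + \varepsilon_{n+h} \overset{d}{=} \alpha^{h} \circ Y_n + \sum_{j=0}^{h-1} \alpha^{j} \circ \varepsilon_{n+h-j}.$$

Thus the h-step ahead conditional mean and conditional variance are given by

$$E(Y_{n+h} \mid Y_n, \mathcal{X}_{n+h}) = \alpha^{h} Y_n + \sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}$$

and

$$\mathrm{Var}(Y_{n+h} \mid Y_n, \mathcal{X}_{n+h}) = \alpha^{h}(1-\alpha^{h}) Y_n + \sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}.$$

The h-step ahead forecasting distribution of the PINAR(1) process was derived by Freeland and McCabe (2004) using the binomial thinning operator discussed by Al-Osh and Alzaid (1987), and it turned out to be a convolution of binomial and Poisson distributions. Here we can calculate the conditional PGF of $Y_{n+h}$ given $Y_n$ and $\mathcal{X}_{n+h}$ and then derive the forecasting distribution from it. The conditional PGF of $Y_{n+h}$ given $Y_n$ and $\mathcal{X}_{n+h}$ can be shown to be

$$G_{Y_{n+h} \mid Y_n}(s) = \left(1 - \alpha^{h} + \alpha^{h} s\right)^{Y_n} \exp\left\{(s-1) \sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}\right\}.$$

The derivation of this result is presented in Appendix C. From the above result, we can say that the h-step ahead prediction distribution of $Y_{n+h}$ given $Y_n$ and $\mathcal{X}_{n+h}$ is a convolution of Bin($Y_n$, $\alpha^{h}$) and some random variable $W$ having a PGF of the form $\exp\{(s-1) \sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}\}$. Therefore $W$ follows a Poisson distribution with mean $\sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}$. Thus, the prediction distribution can be presented as

$$Y_{n+h} \mid Y_n, \mathcal{X}_{n+h} \sim \mathrm{Bin}(Y_n, \alpha^{h}) * \mathrm{Poisson}\left(\sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}\right),$$

where "$*$" denotes the convolution of two distributions. Using Corollary 1, the h-step ahead forecasting distribution of $Y_{n+h}$ given $Y_n$ and $\mathcal{X}_{n+h}$ can be derived as

$$P(Y_{n+h} = j \mid Y_n = i, \mathcal{X}_{n+h}) = \sum_{k=0}^{i} \binom{i}{k} \alpha_h^{k} (1-\alpha_h)^{i-k}\, \frac{e^{-\lambda_h^{*}} (\lambda_h^{*})^{j-k}}{(j-k)!}\, I(k \le j),$$

where $I(\cdot)$ is the indicator function, $\alpha_h = \alpha^{h}$, and $\lambda_h^{*} = \sum_{j=0}^{h-1} \alpha^{j} \lambda_{n+h-j}$. The derivation of this result is presented in Appendix D.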
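The binomial-Poisson convolution above can be evaluated directly. The sketch below computes the h-step ahead conditional mean, variance, and forecasting probabilities from a list of future intensities $\lambda_{n+1}, \ldots, \lambda_{n+h}$, which are assumed known or already estimated. The function names and the numerical values are illustrative.

```python
import math


def lam_star(alpha, future_lams):
    """sum_{j=0}^{h-1} alpha^j * lambda_{n+h-j}; future_lams = [lambda_{n+1}, ..., lambda_{n+h}]."""
    h = len(future_lams)
    return sum(alpha ** j * future_lams[h - 1 - j] for j in range(h))


def forecast_moments(y_n, alpha, future_lams):
    """h-step ahead conditional mean and variance given Y_n = y_n."""
    a_h = alpha ** len(future_lams)
    ls = lam_star(alpha, future_lams)
    return a_h * y_n + ls, a_h * (1 - a_h) * y_n + ls


def forecast_pmf(j, y_n, alpha, future_lams):
    """P(Y_{n+h} = j | Y_n = y_n): Bin(y_n, alpha^h) convolved with Poisson(lam_star)."""
    a_h = alpha ** len(future_lams)
    ls = lam_star(alpha, future_lams)
    total = 0.0
    for k in range(0, min(y_n, j) + 1):
        binom = math.comb(y_n, k) * a_h ** k * (1 - a_h) ** (y_n - k)
        pois = math.exp(-ls) * ls ** (j - k) / math.factorial(j - k)
        total += binom * pois
    return total


# Two-step ahead forecast from Y_n = 10 with alpha = 0.5, lambda_{n+1} = 2, lambda_{n+2} = 4.
mean2, var2 = forecast_moments(10, 0.5, [2.0, 4.0])
probs2 = [forecast_pmf(j, 10, 0.5, [2.0, 4.0]) for j in range(80)]
```

In this example the mean is $0.25 \cdot 10 + (4 + 0.5 \cdot 2) = 7.5$ and the variance is $0.25 \cdot 0.75 \cdot 10 + 5 = 6.875$, and the probabilities sum to one up to truncation error.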

Descriptive measure of forecasting accuracy

Given an observed data set $\{Y_1, Y_2, \ldots, Y_{n+m}\}$ of size $n + m$, we partition the data into two sets. The training set, containing the first n observations, is used to estimate the parameters of the model; based on the remaining m observations, called the test set, we define the following descriptive measure of forecasting accuracy. The h-step ahead predicted root mean squared error (denoted by PRMSE(h)) is defined as

$$\mathrm{PRMSE}(h) = \sqrt{\frac{1}{m-h+1} \sum_{t=n}^{n+m-h} \left(Y_{t+h} - \hat{Y}_{t+h}\right)^{2}},$$

where $\hat{Y}_{t+h}$ is the mean of the estimated h-step ahead forecasting distribution of $Y_{t+h}$ given $Y_t$ and $\mathcal{X}_{t+h}$ mentioned in Theorem 4. Intuitively, the PRMSE(h) should increase in h.

ESTIMATION METHOD FOR THE MODEL PARAMETERS

Conditional least squares estimation

Conditional least squares (CLS) estimation is commonly used for estimating the regression parameters in the context of time series models. Freeland and McCabe (2004, 2005) used this approach for the PINAR(1) process. To implement the CLS estimation method, we minimize the sum of squared deviations about the conditional expectation $E(Y_t \mid Y_{t-1}, \mathcal{X}_t)$, which is given as

$$Q_n(\theta) = \sum_{t=2}^{n} \left(Y_t - \alpha Y_{t-1} - \lambda_t\right)^{2},$$

instead of $\sum_{t} \left(Y_t - E(Y_t \mid \mathcal{X}_t)\right)^{2}$, with respect to the regression parameters of the model, where $E(Y_t \mid Y_{t-1}, \mathcal{X}_t) = \alpha Y_{t-1} + \lambda_t$ and $\theta = (\alpha, \beta)$ is the vector of regression parameters. Numerical methods are employed to obtain the CLS estimates of the regression parameters, as there are no closed forms for the CLS estimators. In the subsequent section, we carry out an extensive simulation study for both the one change-point and the two change-point cases, and from the simulation results we show consistency of the CLS method.

In maximum likelihood estimation, given a data set of size n, the likelihood function for the process is given by $L(\theta) = P(Y_1 = y_1 \mid \mathcal{X}_1) \prod_{t=2}^{n} P(Y_t = y_t \mid Y_{t-1} = y_{t-1}, \mathcal{X}_t)$. To obtain the maximum likelihood estimators (MLEs), we maximize the log-likelihood function with respect to the regression parameters, which can be written as $\ell(\theta) = \log P(Y_1 = y_1 \mid \mathcal{X}_1) + \sum_{t=2}^{n} \log P(Y_t = y_t \mid Y_{t-1} = y_{t-1}, \mathcal{X}_t)$. Here $Y_1 \sim$ Poisson($\lambda_1$). In real-life scenarios like the COVID-19 data, the number of daily active cases at time-point t (represented by $Y_t$) and the number of daily new cases at time-point t (represented by $\varepsilon_t$) will often be large, and hence in the R programming language we face difficulties in executing the MLE method because of terms like the binomial coefficients $\binom{i}{k}$ and the factorials $(j-k)!$ ("j" is the number of daily active cases at time-point t and "i" is the number of daily active cases at time-point $t-1$) involved in the likelihood function. So the estimation method that we employ for data analysis is the CLS method.
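The two criteria discussed above can be sketched as follows for the one change-point model. The first function evaluates the CLS objective, assuming $\lambda_t = \exp\{\beta_0 + \beta_1 s_{\delta}(t-\tau) + \beta_2 t\}$ with the smooth maximum $s_\delta$. The second evaluates one log transition probability on the log scale with `math.lgamma` and a log-sum-exp step, a generic numerical workaround for overflow in factorial terms; it is not the approach used in the article, which relies on CLS. All names and values are illustrative, and a numerical optimizer would still be needed to minimize the objective.

```python
import math


def smooth_plus(x, delta):
    """Smooth version of max(0, x): x * e^{delta*x} / (1 + e^{delta*x})."""
    z = delta * x
    w = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
    return x * w


def lam_t(t, beta, tau, delta):
    """Assumed innovation mean for one change-point."""
    b0, b1, b2 = beta
    return math.exp(b0 + b1 * smooth_plus(t - tau, delta) + b2 * t)


def cls_objective(y, alpha, beta, tau, delta):
    """Sum over t = 2..n of (Y_t - alpha * Y_{t-1} - lambda_t)^2; y[0] holds Y_1."""
    return sum((y[t - 1] - alpha * y[t - 2] - lam_t(t, beta, tau, delta)) ** 2
               for t in range(2, len(y) + 1))


def log_transition(j, i, alpha, lam):
    """log P(Y_t = j | Y_{t-1} = i), evaluated without forming large factorials."""
    terms = []
    for k in range(0, min(i, j) + 1):
        log_binom = (math.lgamma(i + 1) - math.lgamma(k + 1) - math.lgamma(i - k + 1)
                     + k * math.log(alpha) + (i - k) * math.log(1 - alpha))
        log_pois = -lam + (j - k) * math.log(lam) - math.lgamma(j - k + 1)
        terms.append(log_binom + log_pois)
    top = max(terms)  # log-sum-exp: factor out the largest term before exponentiating
    return top + math.log(sum(math.exp(v - top) for v in terms))
```

For counts of the order seen in the Italy data, for example `log_transition(40000, 35000, 0.9, 8000.0)`, the function returns a finite negative number, whereas a direct evaluation with explicit factorials in floating point would overflow.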

SIMULATION STUDY

General setup

In this section, we perform extensive simulation studies for (a) the estimation of change-point(s), (b) the estimation of model parameters, and (c) the forecasting performance of the proposed model. To perform the studies, we simulate data from (1) the one change-point model and (2) the two change-point model. The simulation studies are performed for varying sample sizes along with different choices of model parameters, tuning parameter, and change-points. For the simulation studies regarding the analysis of one change-point ($\tau$), $\mathcal{X}_n$ (the set of all covariates up to time-point n) is equal to $\{(x_t, t) : t = 1, \ldots, n\}$, where $x_t = (t-\tau)\,e^{\delta_n (t-\tau)} / (1 + e^{\delta_n (t-\tau)})$ is the smooth time-varying component and $t$ is the simple time-varying component. For the simulation studies regarding the analysis of two change-points ($\tau_1$ and $\tau_2$), $\mathcal{X}_n = \{(x_{1t}, x_{2t}, t) : t = 1, \ldots, n\}$, where $x_{1t} = (t-\tau_1)\,e^{\delta_n (t-\tau_1)} / (1 + e^{\delta_n (t-\tau_1)})$ and $x_{2t} = (t-\tau_2)\,e^{\delta_n (t-\tau_2)} / (1 + e^{\delta_n (t-\tau_2)})$; here the $x_{1t}$'s and $x_{2t}$'s are the smooth time-varying components and the $t$'s are the simple time-varying components. In the simulation studies, we use these components to generate data sets of varying sample sizes by the data-generating processes given in Equation (2) for one change-point and Equation (3) for two change-points.

In the simulation study regarding forecasting performance, we compare our proposed model with a PINAR(1) model of the same form, $Y_t = \alpha \circ Y_{t-1} + \varepsilon_t$, where $Y_t$ denotes the daily number of active cases at time-point t and $\varepsilon_t$ represents the daily number of new cases at time-point t. Here $\varepsilon_t$ follows Poisson($\lambda_t$), where $\lambda_t$ depends on simple time-varying covariates only. This model involves no change-point, but the innovation terms still depend on time-varying covariates.

Results on change‐point(s) estimation

Here we perform a simulation study to provide 95% CIs for the true change-points from the simulated data sets and examine the widths of those intervals with increasing sample size. The estimation method for the change-point(s) is discussed in Section 3.4. To perform this simulation study, we simulate data from the proposed model with (1) one change-point (given in Equation (2)) and (2) two change-points (given in Equation (3)). Two sets of regression parameters are considered for each of these two data-generating cases. Three different sample sizes (n) of 400, 450, and 500 are explored. Throughout this simulation study, we consider two different values of $\delta_n$, namely 0.1 and 0.2. All the simulation results are based on 1000 Monte Carlo replications.

Case 1: Analysis of one change‐point

For the one change-point simulation study, we assume the value of the true change-point $\tau$ to be $n/2$, where n is the sample size of the data. The estimation method for the change-point is discussed in Section 3.4. Two sets of regression parameters $(\alpha, \beta_0, \beta_1, \beta_2)$ are used in the data-generating process. For each set of regression parameters and each value of the tuning parameter $\delta_n$, we simulate the data using model (1) with $\lambda_t$ given in Equation (2). Here, for the data-generating method with one change-point, $\mathcal{X}_n$, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 1000 times, and we report the 95% CIs in Tables 1 and 2, where we can see that as the sample size increases, the widths of the CIs decrease.
TABLE 1

95% confidence intervals (CIs) for the true change-point for different sample sizes and different values of $\delta_n$, where the true change-point is at the $(n/2)$th time-point (first set of true regression parameters)

| $\delta_n$ | n | 95% CI | Width |
|---|---|---|---|
| 0.1 | 400 | (196.6439, 203.3801) | 6.7362 |
| 0.1 | 450 | (222.5360, 227.3340) | 4.7980 |
| 0.1 | 500 | (248.1794, 251.7426) | 3.5632 |
| 0.2 | 400 | (197.8427, 202.1033) | 4.2606 |
| 0.2 | 450 | (223.4354, 226.6266) | 3.1912 |
| 0.2 | 500 | (248.9330, 251.0090) | 2.0760 |
TABLE 2

95% confidence intervals (CIs) for the true change-point for different sample sizes and different values of $\delta_n$, where the true change-point is at the $(n/2)$th time-point (second set of true regression parameters)

| $\delta_n$ | n | 95% CI | Width |
|---|---|---|---|
| 0.1 | 400 | (197.9393, 205.0067) | 7.0674 |
| 0.1 | 450 | (224.1986, 228.8794) | 4.6808 |
| 0.1 | 500 | (249.8242, 252.8998) | 3.0756 |
| 0.2 | 400 | (198.4818, 203.1122) | 4.6304 |
| 0.2 | 450 | (224.1968, 227.2532) | 3.0564 |
| 0.2 | 500 | (249.5226, 251.7134) | 2.1908 |

Case 2: Analysis of two change‐points

For the simulation study of two change-points, the true change-points $\tau_1$ and $\tau_2$ are assumed to be $2n/5$ and $3n/5$, respectively. Two sets of values of the regression parameters $(\alpha, \beta_0, \beta_1, \beta_2, \beta_3)$ are used in the data-generating process. For each set of regression parameters and each value of the tuning parameter $\delta_n$, we simulate the data using model (1) with $\lambda_t$ given in Equation (3). The estimation method for the change-points is discussed in Section 3.4. Here, for the data-generating method with two change-points, $\mathcal{X}_n$, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 1000 times, and the 95% CIs are reported in Tables 3 and 4. From the tables, we can see that as the sample size increases, the widths of the CIs decrease.
TABLE 3

95% confidence intervals (CIs) for the true change-points for different sample sizes and different values of $\delta_n$, where the true change-points are at the $(2n/5)$th and $(3n/5)$th time-points (first set of true regression parameters)

| $\delta_n$ | n | 95% CI for first change-point | Width | 95% CI for second change-point | Width |
|---|---|---|---|---|---|
| 0.1 | 400 | (158.7577, 161.6283) | 2.8706 | (235.0461, 245.1539) | 10.1078 |
| 0.1 | 450 | (178.9962, 181.0778) | 2.0816 | (265.5051, 274.5129) | 9.0078 |
| 0.1 | 500 | (199.3109, 200.7131) | 1.4022 | (296.0952, 303.7808) | 7.6856 |
| 0.2 | 400 | (159.1228, 160.9412) | 1.8184 | (236.4025, 243.1755) | 6.7730 |
| 0.2 | 450 | (179.4911, 180.5349) | 1.0438 | (266.7304, 272.4936) | 5.7632 |
| 0.2 | 500 | (199.9124, 200.0876) | 0.1752 | (297.2184, 301.8216) | 4.6032 |
TABLE 4

95% confidence intervals (CIs) for the true change-points for different sample sizes and different values of $\delta_n$, where the true change-points are at the $(2n/5)$th and $(3n/5)$th time-points (second set of true regression parameters)

| $\delta_n$ | n | 95% CI for first change-point | Width | 95% CI for second change-point | Width |
|---|---|---|---|---|---|
| 0.1 | 400 | (158.5181, 162.2079) | 3.6898 | (233.8636, 244.1944) | 10.3308 |
| 0.1 | 450 | (178.6895, 181.8585) | 3.1690 | (264.5409, 273.4491) | 8.9082 |
| 0.1 | 500 | (198.8991, 201.3409) | 2.4418 | (294.9574, 302.7986) | 7.8412 |
| 0.2 | 400 | (158.8499, 161.4501) | 2.6002 | (235.8058, 242.6082) | 6.8024 |
| 0.2 | 450 | (179.0436, 181.0664) | 2.0228 | (266.2779, 272.0281) | 5.7502 |
| 0.2 | 500 | (199.3532, 200.7068) | 1.3536 | (296.5623, 301.5857) | 5.0234 |

Results on estimation of model parameters

Here we perform a simulation study to investigate the consistency of the estimation method used for the proposed model. To perform this simulation study, we simulate data from the proposed model with (1) one change-point (given in Equation (2)) and (2) two change-points (given in Equation (3)). Three sets of regression parameters are considered for each of these two data-generating cases; they are described in the subsequent sections. Three different sample sizes (n) of 100, 200, and 500 are explored. Throughout this simulation study, we consider three different values of $\delta_n$, namely 0.1, 0.5, and 1. All the simulation results are based on 1000 Monte Carlo replications.

For the one change-point simulation study, we fix the change-point $\tau$ relative to the sample size n of the data. Three sets of regression parameters $(\alpha, \beta_0, \beta_1, \beta_2)$ are used in the data-generating process. For each set of regression parameters and each value of the tuning parameter $\delta_n$, we simulate the data using model (1) with $\lambda_t$ given in Equation (2). Then we estimate the regression parameters using the CLS estimation method. Here, for the data-generating method with one change-point, $\mathcal{X}_n$, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 1000 times, and we report the mean estimates and mean squared errors (MSEs) of the regression parameters in Tables 5, 6, and 7. From these tables, we can see that as the sample size increases, the MSEs of the estimated regression parameters decrease. This gives empirical support for the consistency of the CLS estimation.
TABLE 5

Mean estimates of the regression parameters with their respective mean squared errors (MSEs) for different sample sizes and different values of δn, where the true

δn = 0.1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.3840 (0.0317)   0.1896 (0.1905)   0.2264 (0.0245)   0.0207 (0.0002)
200   0.4198 (0.0188)   0.1862 (0.0853)   0.2024 (0.0016)   0.0205 (0.0000)
500   0.4503 (0.0089)   0.1522 (0.0284)   0.1985 (0.0002)   0.0202 (0.0000)

δn = 0.5
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.4007 (0.0290)   0.1596 (1.7079)   0.3031 (0.2222)   0.0220 (0.0011)
200   0.4357 (0.0149)   0.1782 (0.0926)   0.2333 (0.0302)   0.0203 (0.0000)
500   0.4595 (0.0076)   0.1536 (0.0333)   0.1982 (0.0004)   0.0201 (0.0000)

δn = 1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.4015 (0.0290)   0.2021 (0.1847)   0.3268 (0.0006)   0.0211 (0.0003)
200   0.4383 (0.0145)   0.1802 (0.0838)   0.2412 (0.0006)   0.0205 (0.0001)
500   0.4602 (0.0071)   0.1466 (0.0311)   0.1979 (0.0001)   0.0201 (0.0000)

TABLE 6

Mean estimates of the regression parameters with their respective mean squared errors (MSEs) for different sample sizes and different values of δn, where the true

δn = 0.1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.5341 (0.0131)   0.1468 (0.2255)   0.0427 (0.0005)   0.0218 (0.0002)
200   0.5600 (0.0064)   0.1627 (0.0995)   0.0409 (0.0000)   0.0206 (0.0000)
500   0.5701 (0.0040)   0.1676 (0.0270)   0.0401 (0.0000)   0.0202 (0.0000)

δn = 0.5
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.5308 (0.0136)   0.1345 (0.2083)   0.0425 (0.0004)   0.0218 (0.0002)
200   0.5597 (0.0060)   0.1507 (0.0871)   0.0405 (0.0000)   0.0204 (0.0000)
500   0.5778 (0.0036)   0.1884 (0.0257)   0.0402 (0.0000)   0.0202 (0.0000)

δn = 1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.5329 (0.0126)   0.1276 (0.2027)   0.0414 (0.0005)   0.0211 (0.0002)
200   0.5583 (0.0065)   0.1511 (0.0857)   0.0409 (0.0000)   0.0205 (0.0000)
500   0.5752 (0.0037)   0.1776 (0.0240)   0.0401 (0.0000)   0.0201 (0.0000)

TABLE 7

Mean estimates of the regression parameters with their respective mean squared errors (MSEs) for different sample sizes and different values of δn, where the true

δn = 0.1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.7190 (0.0126)   0.0579 (0.3042)   0.0262 (0.0026)   0.0134 (0.0004)
200   0.7586 (0.0045)   0.0924 (0.1265)   0.0204 (0.0002)   0.0105 (0.0001)
500   0.7765 (0.0018)   0.1037 (0.0622)   0.0200 (0.0000)   0.0100 (0.0000)

δn = 0.5
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.7211 (0.0129)   0.0614 (0.3907)   0.0231 (0.0078)   0.0110 (0.0104)
200   0.7625 (0.0039)   0.1007 (0.1208)   0.0209 (0.0001)   0.0106 (0.0000)
500   0.7771 (0.0018)   0.1011 (0.0654)   0.0199 (0.0000)   0.0100 (0.0000)

δn = 1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)
100   0.7234 (0.0116)   0.0582 (0.2681)   0.0240 (0.0006)   0.0135 (0.0003)
200   0.7602 (0.0043)   0.0883 (0.1321)   0.0207 (0.0006)   0.0103 (0.0001)
500   0.7769 (0.0020)   0.1058 (0.0663)   0.0199 (0.0001)   0.0098 (0.0000)

For two change‐point simulation study, the change‐points and are assumed to be and , respectively. Three sets of values of the regression parameters used in the data‐generating process are = , , and . For each set of the regression parameters and each value of the tuning parameter δn, we simulate the data using model (1) with given in Equation (3). Then we estimate the regression parameters using the CLS estimation method for each simulated data set. Here for the data‐generating method of two change‐points, , set of all covariates up to time‐point n, consists of both the smooth time‐varying components and the simple time‐varying components up to time‐point n as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 1000 times, and the mean estimates and MSEs of the regression parameters are reported in Tables 8, 9, and 10. As the sample size increases, the MSEs of the estimated regression parameters generally decrease, which empirically supports the consistency of the CLS estimation.
TABLE 8

Mean estimates of the regression parameters with their respective mean squared errors (MSEs) for different sample sizes and different values of δn, where the true

δn = 0.1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.4294 (0.0140)   0.1483 (0.1438)   0.0531 (0.0008)   0.0415 (0.0007)   0.0218 (0.0002)
200   0.4522 (0.0072)   0.1455 (0.0627)   0.0509 (0.0001)   0.0405 (0.0001)   0.0205 (0.0000)
500   0.4642 (0.0037)   0.1249 (0.0210)   0.0502 (0.0000)   0.0400 (0.0000)   0.0202 (0.0000)

δn = 0.5
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.4222 (0.0155)   0.1727 (0.1312)   0.0529 (0.0007)   0.0414 (0.0007)   0.0215 (0.0001)
200   0.4549 (0.0069)   0.1584 (0.0559)   0.0502 (0.0001)   0.0398 (0.0001)   0.0202 (0.0000)
500   0.4572 (0.0042)   0.1512 (0.0243)   0.0502 (0.0000)   0.0400 (0.0000)   0.0201 (0.0000)

δn = 1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.4231 (0.0154)   0.1713 (0.1180)   0.0512 (0.0007)   0.0394 (0.0007)   0.0212 (0.0001)
200   0.4549 (0.0069)   0.1284 (0.0634)   0.0510 (0.0001)   0.0404 (0.0001)   0.0207 (0.0000)
500   0.4563 (0.0042)   0.1482 (0.0234)   0.0502 (0.0000)   0.0400 (0.0000)   0.0202 (0.0000)

TABLE 9

Mean estimates of the regression parameters with their respective mean squared errors (MSEs) for different sample sizes and different values of δn, where the true

δn = 0.1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.5145 (0.0161)   0.1179 (0.2584)   0.0420 (0.0013)   0.0298 (0.0010)   0.0219 (0.0003)
200   0.5563 (0.0062)   0.1405 (0.0818)   0.0403 (0.0001)   0.0299 (0.0001)   0.0203 (0.0000)
500   0.5669 (0.0040)   0.1677 (0.0419)   0.0402 (0.0000)   0.0300 (0.0000)   0.0202 (0.0000)

δn = 0.5
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.5212 (0.0151)   0.1173 (0.1854)   0.0427 (0.0010)   0.0310 (0.0009)   0.0217 (0.0002)
200   0.5547 (0.0058)   0.1381 (0.0738)   0.0405 (0.0001)   0.0302 (0.0001)   0.0204 (0.0000)
500   0.5576 (0.0045)   0.0985 (0.0462)   0.0399 (0.0000)   0.0299 (0.0000)   0.0200 (0.0000)

δn = 1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.5163 (0.0150)   0.1113 (0.1854)   0.0410 (0.0010)   0.0288 (0.0009)   0.0216 (0.0003)
200   0.5575 (0.0057)   0.1376 (0.0738)   0.0406 (0.0001)   0.0304 (0.0001)   0.0203 (0.0000)
500   0.5594 (0.0042)   0.0948 (0.0462)   0.0398 (0.0000)   0.0299 (0.0000)   0.0199 (0.0000)

TABLE 10

Mean estimates of the regression parameters with their respective mean squared errors (MSEs) for different sample sizes and different values of δn, where the true

δn = 0.1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.6923 (0.0183)   0.1259 (0.4098)   0.0265 (0.0019)   0.0200 (0.0021)   0.0150 (0.0004)
200   0.7489 (0.0053)   0.1638 (0.1152)   0.0206 (0.0001)   0.0195 (0.0001)   0.0108 (0.0000)
500   0.7786 (0.0015)   0.2734 (0.0235)   0.0204 (0.0000)   0.0201 (0.0000)   0.0103 (0.0000)

δn = 0.5
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.6951 (0.0178)   0.0900 (0.2900)   0.0241 (0.0013)   0.0196 (0.0012)   0.0137 (0.0003)
200   0.7500 (0.0050)   0.1546 (0.1238)   0.0202 (0.0001)   0.0195 (0.0001)   0.0105 (0.0000)
500   0.7808 (0.0013)   0.2894 (0.0234)   0.0205 (0.0000)   0.0201 (0.0000)   0.0101 (0.0000)

δn = 1
n     α^ (MSE)          β^0 (MSE)         β^1 (MSE)         β^2 (MSE)         β^3 (MSE)
100   0.7013 (0.0161)   0.1013 (0.3606)   0.0232 (0.0017)   0.0182 (0.0019)   0.0135 (0.0004)
200   0.7501 (0.0052)   0.1605 (0.1238)   0.0211 (0.0002)   0.0205 (0.0001)   0.0106 (0.0000)
500   0.7809 (0.0014)   0.2752 (0.0200)   0.0204 (0.0000)   0.0201 (0.0000)   0.0103 (0.0000)

Results on forecasting performance

Another simulation study examines the h‐step ahead forecasting performance of the proposed process for varying h, relative to the comparison model given in Equation (11). As the forecasting criterion we use PRMSE(h), defined in Equation (10). We simulate data from the proposed model with (1) one change‐point (given in Equation (2)) and (2) two change‐points (given in Equation (3)). Two sets of regression parameters are considered for each of these two data‐generating cases. Throughout this simulation study, we consider two values of δn: 0.1 and 0.2. Each time we generate a total sample of size 100, of which a training set of size 85 is used to fit the two models under comparison and a test set of size 15 is used to compute PRMSE(h) for h = 1, 2, 3. This procedure is repeated 100 times.

For one change‐point simulation study, we assume the value of the change‐point to be where n is the sample size of the data. Two sets of regression parameters used in the data‐generating process are and . For each set of the regression parameters and each value of the tuning parameter δn, we simulate the data using model (1) with given in Equation (2). Here for the data‐generating method of one change‐point, , set of all covariates up to time‐point n, consists of both the smooth time‐varying components and the simple time‐varying components up to time‐point n as described in Section 7.1, where n is the sample size of the simulated data set. We report the h‐step ahead forecasting performance of both the proposed model and the comparison model for h = 1, 2, 3 in Tables 11 and 12, where the average PRMSE(h) of the proposed process is smaller than that of the comparison process at every h. The tables also show that the measure increases with h, which agrees with the theoretical result mentioned in Section 5.2.
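To make the forecasting criterion concrete, the following sketch computes h‐step ahead forecasts and a PRMSE(h)‐type measure. It rests on two assumptions that should be stated plainly: it uses the standard constant‐parameter PINAR(1) forecast moments rather than the article's covariate‐dependent model, and it takes PRMSE(h) to be the root mean squared h‐step ahead prediction error over the test set, since Equation (10) is not reproduced in this section. The function names are hypothetical.

```python
import numpy as np

def forecast_mean(x_t, h, alpha, lam):
    """E[X_{t+h} | X_t] for a constant-parameter PINAR(1) process."""
    a_h = alpha ** h
    return a_h * x_t + lam * (1.0 - a_h) / (1.0 - alpha)

def forecast_var(x_t, h, alpha, lam):
    """Var[X_{t+h} | X_t] for PINAR(1) with Poisson(lam) innovations."""
    a_h = alpha ** h
    return a_h * (1.0 - a_h) * x_t + lam * (1.0 - a_h) / (1.0 - alpha)

def prmse(series, n_train, h, alpha, lam):
    """Root mean squared h-step-ahead error on the test part of `series`.

    Assumed form of PRMSE(h): forecast origins run from the last training
    observation to the last point that still has an observed value h steps ahead.
    alpha and lam are treated as already estimated from the training set.
    """
    series = np.asarray(series, dtype=float)
    origins = np.arange(n_train - 1, len(series) - h)  # 0-based forecast origins
    preds = forecast_mean(series[origins], h, alpha, lam)
    return float(np.sqrt(np.mean((series[origins + h] - preds) ** 2)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha, lam = 0.5, 2.0           # illustrative values only
    x = [rng.poisson(lam / (1 - alpha))]
    for _ in range(99):
        x.append(rng.binomial(x[-1], alpha) + rng.poisson(lam))
    for h in (1, 2, 3):
        print(h, round(prmse(x, n_train=85, h=h, alpha=alpha, lam=lam), 4))
```

In this simplified setting the innovation part of the forecast variance, lam * (1 - alpha**h) / (1 - alpha), grows with h, which gives some intuition for why a PRMSE(h)‐type measure tends to rise with the horizon.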
TABLE 11

Predicted root mean squared error (PRMSE(h)) values for varying h for different δn, where the data‐generating process is our proposed method of one change‐point, and the true

δn = 0.1
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   1.2784                      1.5581
2   1.3285                      2.0101
3   1.3409                      2.2905

δn = 0.2
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   1.2475                      1.4388
2   1.3172                      1.7750
3   1.3226                      2.0305

TABLE 12

Predicted root mean squared error (PRMSE(h)) values for varying h for different δn, where the data‐generating process is our proposed method of one change‐point, and the true

δn = 0.1
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   0.9143                      1.2301
2   0.9193                      1.6439
3   0.9221                      1.9372

δn = 0.2
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   0.8870                      1.1384
2   0.8898                      1.4843
3   0.8922                      1.7281

For the study of two change‐points, the change‐points and are assumed to be and , respectively. Two sets of regression parameters used in the data‐generating process are and . For each set of the regression parameters and each value of the tuning parameter δn, we simulate the data using model (1) with given in Equation (3). Here for the data‐generating method of two change‐points, , set of all covariates up to time‐point n, consists of both the smooth time‐varying components and the simple time‐varying components up to time‐point n as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 100 times, and we report the h‐step ahead forecasting performance of both the proposed model and the comparison model for h = 1, 2, 3 in Tables 13 and 14, where the average PRMSE(h) of the proposed process is smaller than that of the comparison process at every h, although the margin is narrower than in the one change‐point study. The measure again increases with h, which agrees with the theoretical result mentioned in Section 5.2.
TABLE 13

Predicted root mean squared error (PRMSE(h)) values for varying h for different δn, where the data‐generating process is our proposed method of two change‐points, and the true

δn = 0.1
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   1.1941                      1.2209
2   1.2836                      1.3087
3   1.3084                      1.3088

δn = 0.2
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   1.1764                      1.2360
2   1.2545                      1.3146
3   1.2972                      1.3215

TABLE 14

Predicted root mean squared error (PRMSE(h)) values for varying h for different δn, where the data‐generating process is our proposed method of two change‐points, and the true

δn = 0.1
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   1.1532                      1.1551
2   1.2791                      1.2953
3   1.2970                      1.3138

δn = 0.2
h   Proposed model (PRMSE(h))   Comparison model (PRMSE(h))
1   1.2303                      1.2378
2   1.3271                      1.3631
3   1.3880                      1.4039

DATA ANALYSIS

In this section, we consider two real data sets to illustrate the usefulness of our proposed model: (i) Italy COVID‐19 data with a total of 113 observations and (ii) Kerala COVID‐19 data with a total of 90 observations. We compare our proposed model with the comparison model given in Equation (11). To investigate the predictive performance of the two models, we use the first 100 observations of the Italy data as the training set and the remaining 13 observations as the test set; for the Kerala data, we use the first 78 observations as the training set and the remaining 12 observations as the test set.

Data fitting

In this section, we analyze the COVID‐19 data of daily active cases of Italy (described in Section 2.1) through our proposed method. We also fit the comparison model (described in Equation (11)) to this data set. We consider 113 data points from February 15 to June 6. Here , the set of all time‐varying covariates up to time‐point n, contains both the smooth time‐varying covariates which have the change‐point and the simple time‐varying covariates up to n time‐points as described in Section 7.1. For this data set, . From the daily time series plot, we see that there is only one change‐point during that period, and hence we fit the proposed model with one change‐point (given in Equation (2)). The change‐point for the COVID‐19 data of Italy is estimated using the method described in Section 3.4, and the estimated change‐point is the 36th time‐point. This suggests that the number of cases increased up to March 21 (the 36th day counting from February 15), after which the number of daily new cases started decreasing gradually. To estimate the optimal δn, we consider a grid of points in the interval [0.1, 10] with an increment of 0.1. For each δn in the grid, we fit our one change‐point model to the data and calculate the root mean squared error (RMSE) as the goodness‐of‐fit measure. The optimal δn is the value that minimizes this measure. Here δn = 0.1 gives the minimum RMSE, 949.85; hence the estimated value of δn is 0.1. For this data set, the estimates of the regression parameters of our proposed model by CLS method are , and that of the comparison model are . The RMSE of our proposed model for this data set is 949.85, which is much lower than that of the comparison model, 1940.95. Figure 5 plots the RMSE against each δn in the grid on [0.1, 10], and Figure 6 plots the original data along with the fitted values from both the comparison model and the proposed model.
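The grid search over the tuning parameter can be sketched as follows. This is a rough illustration under strong assumptions, not the article's fitting procedure: the smooth change‐point covariate is taken to be a LogSumExp‐type smooth maximum of (t - tau) and 0, the conditional mean is taken to be linear in X_{t-1}, an intercept, t, and that smooth covariate so that the least squares fit is a single linear solve, and the change‐point tau is treated as already estimated. The names `smooth_hinge`, `rmse_for_delta`, and `select_delta` are hypothetical.

```python
import numpy as np

def smooth_hinge(t, tau, delta):
    """Smooth maximum of (t - tau) and 0; approaches max(t - tau, 0) as delta -> 0.

    Assumed LogSumExp form; the article's Equation (2) is not reproduced here.
    """
    return delta * np.logaddexp(0.0, (t - tau) / delta)

def rmse_for_delta(x, tau, delta):
    """In-sample RMSE of a least squares fit with an illustrative linear mean."""
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x), dtype=float)  # time index of the responses x[1:]
    design = np.column_stack(
        [x[:-1], np.ones_like(t), t, smooth_hinge(t, tau, delta)]
    )
    coef, *_ = np.linalg.lstsq(design, x[1:], rcond=None)
    resid = x[1:] - design @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

def select_delta(x, tau, grid=None):
    """Pick the delta on the grid 0.1, 0.2, ..., 10.0 that minimizes RMSE."""
    if grid is None:
        grid = np.round(np.arange(1, 101) * 0.1, 1)
    scores = np.array([rmse_for_delta(x, tau, d) for d in grid])
    best = int(np.argmin(scores))
    return float(grid[best]), float(scores[best]), scores

if __name__ == "__main__":
    # Synthetic series that rises and then falls; values are illustrative only.
    rng = np.random.default_rng(0)
    t = np.arange(113, dtype=float)
    mean_path = 50 + 4 * t - 6 * smooth_hinge(t, 36, 0.5)
    x = rng.poisson(np.clip(mean_path, 1, None))
    delta_hat, best_rmse, _ = select_delta(x, tau=36)
    print(delta_hat, round(best_rmse, 3))
```

The returned `scores` array is what a plot of RMSE against the tuning parameter would display, one value per grid point.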
FIGURE 5

δn versus root mean squared error (RMSE) (for Italy)

FIGURE 6

Fitted data (active cases) by both the comparison model and the proposed model of one change‐point study (for Italy)

A close look at Figure 6 shows that the fitted values from our proposed model overlap with the original data over most of the series, whereas the fitted values from the comparison model can be distinguished from the original data over those same portions. The differences between the two fits look small in Figure 6 because the magnitudes of the observed data points are very high; the RMSEs therefore show the difference between our proposed process and the comparison process more clearly than the plot does. Overall, the fit through our proposed model is good.

Forecasting

To study the forecasting performance, we partition the data into two sets. As described earlier, the training set containing the first 100 observations is used to fit the models, and the test set with the remaining 13 observations is used to compute the forecasting measure PRMSE for both models. For this setup, the estimates of the regression parameters of our proposed model by CLS method are , and that of the comparison model are . For one‐step ahead forecasting (h = 1), the PRMSEs for the proposed model and the comparison model are 1049.62 and 1903.99, respectively. For two‐step ahead forecasting (h = 2), the PRMSEs for the proposed model and the comparison model are 1994.67 and 3788.10, respectively. So for both the one‐step and two‐step ahead forecasting results, our proposed model performs much better than the comparison model.

Here we analyze the COVID‐19 data of daily active cases of Kerala (see Section 2.2) through our proposed method. We also fit the comparison model (described in Equation (11)) to this data set. In this data set of Kerala, we consider 90 data points from March 9 to June 6. Here consists of both the smooth time‐varying covariates which have two change‐points and the simple time‐varying covariates up to n time‐points as described in Section 7.1. For this data set, . From the daily time series plot, we notice that there are two change‐points during that period, and hence we fit the proposed model with two change‐points (given in Equation (3)). The change‐points and are estimated using the method described in Section 3.4, and the estimated change‐points are the 19th and 54th time‐points. This suggests that the number of cases increased up to March 27 (the 19th day counting from March 9), then the number of daily new cases started decreasing gradually, but after May 1 (the 54th day counting from March 9) the cases again began to rise. To estimate the optimal δn for this data set, we follow the same process as for the COVID‐19 data of Italy. We find that δn = 0.2 gives the minimum RMSE, 12.88; hence the estimated value of δn is 0.2. For this data set, the estimates of the regression parameters of our proposed model by CLS method are = , and that of the comparison model are . The RMSE of our proposed model for this data set is 12.88, which is lower than that of the comparison model, 15.10. Figure 7 plots the RMSE against each δn in the grid on [0.1, 10], and Figure 8 plots the original data along with the fitted values from both the comparison model and the proposed model.
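The mapping between time‐points and calendar dates used in the Italy and Kerala analyses can be checked with a few lines of date arithmetic. The year 2020 is an assumption here; it is the year for which February 15 to June 6 spans exactly 113 daily observations, matching the count reported for Italy. The helper name `time_point_to_date` is hypothetical.

```python
from datetime import date, timedelta

def time_point_to_date(start, k):
    """Calendar date of the k-th time-point when time-point 1 falls on `start`."""
    return start + timedelta(days=k - 1)

# Assumed year: 2020 (a leap year, consistent with the reported sample sizes).
italy_start = date(2020, 2, 15)
kerala_start = date(2020, 3, 9)
end = date(2020, 6, 6)

# Number of daily observations in each series (both endpoints included).
italy_n = (end - italy_start).days + 1
kerala_n = (end - kerala_start).days + 1

# Dates of the estimated change-points reported in the text.
italy_cp = time_point_to_date(italy_start, 36)
kerala_cp1 = time_point_to_date(kerala_start, 19)
kerala_cp2 = time_point_to_date(kerala_start, 54)

print(italy_n, kerala_n)
print(italy_cp, kerala_cp1, kerala_cp2)
```

Counting the start date as time‐point 1 is what makes the 36th, 19th, and 54th time‐points line up with March 21, March 27, and May 1.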
FIGURE 7

δn versus root mean squared error (RMSE) (for Kerala)

FIGURE 8

Fitted data (active cases) by both the comparison model and the proposed model of two change‐point study (for Kerala)

A close look at Figure 8 shows that the fitted values from our proposed model overlap with the original data over most of the series, whereas the fitted values from the comparison model can be distinguished from the original data over those same portions. Here also the RMSEs show the difference between the proposed method and the comparison method more clearly than the plot does. Overall, the fit through our proposed model is good.

To study the forecasting performance, we partition the data into two sets. As described earlier, the training set containing the first 78 observations is used to fit the models, and the test set with the remaining 12 observations is used to compute the forecasting measure PRMSE for both models. For this setup, the estimates of the regression parameters of our proposed model by CLS method are = , and the estimates for the comparison model are . For one‐step ahead forecasting (h = 1), the PRMSEs for the proposed model and the comparison model are 154.06 and 275.80, respectively. For two‐step ahead forecasting (h = 2), the PRMSEs for the proposed model and the comparison model are 186.99 and 348.68, respectively. So for both one‐step and two‐step ahead forecasting, our proposed model performs much better than the comparison model.

CONCLUDING REMARKS

The PINAR(1) process (introduced by McKenzie, 1985 and Al‐Osh & Alzaid, 1987) has received significant attention owing to its simplicity and is widely used for count time series data. However, this process cannot model count time series such as the COVID‐19 data, which contain change‐points and time‐varying covariates. In this article, we have developed a new PINAR(1) model based on the binomial thinning operator to handle change‐point analysis through time‐varying covariates. The development of our proposed model is inspired by Chan and Tong (1986), Hansen (2000), and Fong et al. (2017), who mainly worked on continuous data. We have used the concept of Smooth maximum (n.d.) in the proposed model to develop the smoothing change‐point function, which enables the model to capture the changing curvatures in the data. The key feature of our proposed model is its ability to accommodate both change‐points and time‐varying covariates. As described earlier, these features appear in the COVID‐19 data sets, which motivated the development of our proposed model for both the one change‐point and the two change‐point studies. We have studied the distributional forms of our proposed model along with the h‐step ahead forecasting distribution. Because of the difficulty in estimating the regression parameters through the maximum likelihood method, we have employed the CLS estimation method. We have performed an extensive simulation study to examine the CIs for the true change‐points for varying sample sizes and have seen that the widths of the CIs decrease as the sample size increases. Regarding the estimation of parameters, the simulation results support the consistency of the CLS estimation method. In the data applications, our proposed model achieves a lower RMSE than the comparison model, substantially so for the Italy data. Our proposed model also forecasts better than the comparison model in terms of the accuracy measure PRMSE, in both the simulation study and the data analysis. The proposed model can be extended to more than two change‐points in the same way as the one change‐point model has been extended to the two change‐point model. We therefore hope that our proposed model could be a viable choice for modeling these kinds of count time series data sets.
  3 in total

1.  Inference for optimal dynamic treatment regimes using an adaptive m-out-of-n bootstrap scheme.

Authors:  Bibhas Chakraborty; Eric B Laber; Yingqi Zhao
Journal:  Biometrics       Date:  2013-07-11       Impact factor: 2.571

2.  Change-point analysis through integer-valued autoregressive process with application to some COVID-19 data.

Authors:  Subhankar Chattopadhyay; Raju Maiti; Samarjit Das; Atanu Biswas
Journal:  Stat Neerl       Date:  2021-07-11       Impact factor: 1.239

3.  chngpt: threshold regression model estimation and inference.

Authors:  Youyi Fong; Ying Huang; Peter B Gilbert; Sallie R Permar
Journal:  BMC Bioinformatics       Date:  2017-10-16       Impact factor: 3.169
