
Estimation and tests for power-transformed and threshold GARCH models.

Jiazhu Pan¹,², Hui Wang³, Howell Tong².

Abstract

Consider a class of power-transformed and threshold GARCH(p, q) (PTTGARCH(p, q)) models, which is a natural generalization of the power-transformed and threshold GARCH(1,1) model in Hwang and Basawa [2004. Stationarity and moment structure for Box-Cox transformed threshold GARCH(1,1) processes. Statistics & Probability Letters 68, 209-220.] and includes the standard GARCH model and many other models as special cases. We first establish the asymptotic normality for quasi-maximum likelihood estimators (QMLE) of the parameters under the condition that the error distribution has finite fourth moment. For the case of heavy-tailed errors, we propose a least absolute deviations estimation (LADE) for the PTTGARCH(p, q) model, and prove that the LADE is asymptotically normally distributed under very weak moment conditions. This paves the way for statistical inference based on asymptotic normality for heavy-tailed PTTGARCH(p, q) models. As a consequence, we can construct the Wald test for GARCH structure and discuss the order selection problem in heavy-tailed cases. Numerical results show that LADE is more accurate than QMLE for heavy-tailed errors. Furthermore, the theory is applied to the daily returns of the Hong Kong Hang Seng Index, which suggests that asymmetry and nonlinearity could be present in the financial time series and the PTTGARCH model is capable of capturing these characteristics. As for the probabilistic structure of the PTTGARCH(p, q) model, we give in the appendix a necessary and sufficient condition for the existence of a strictly stationary solution of the model, the existence of the moments and the tail behavior of the strictly stationary solution.


Keywords: Asymptotic normality; Least absolute deviations estimation; Order selection; PTTGARCH structure; Power transformation; Quasi-maximum likelihood estimator; Threshold GARCH; Wald test

Year: 2007 PMID: 32287880 PMCID: PMC7116990 DOI: 10.1016/j.jeconom.2007.06.004

Source DB: PubMed Journal: J Econom ISSN: 0304-4076 Impact factor: 2.388

Introduction

The autoregressive conditional heteroscedastic (ARCH) model proposed by Engle (1982) has led to considerable interest in models in which the conditional variance (volatility) of the current observation is a function of the past observations. Engle's ARCH model formulated the conditional variance of the process as “linear” in squared past values. Bollerslev (1986) generalized the ARCH model to allow the conditional variance to depend additionally on its past realizations. Since then many empirical and theoretical aspects of the ARCH/GARCH model have been developed. Shephard (1996) and Rydberg (2000) gave excellent surveys of ARCH/GARCH modelling for financial data. Weiss (1986) and Berkes et al. (2003) established consistency and asymptotic normality of maximum likelihood estimators for ARCH and GARCH models, respectively. The former assumes that the errors have finite fourth moment and the latter requires a moment of the errors slightly higher than the fourth. Hall and Yao (2003) showed that when the error is heavy-tailed (without finite fourth moment), quasi-maximum likelihood estimators (QMLE) are not asymptotically normal and suffer from a slow convergence rate and a complex asymptotic distribution, which do not facilitate, among others, statistical tests and interval estimation in the standard manner; see Hall and Yao (2003), and Mikosch and Straumann (2006). Peng and Yao (2003) pointed out that a kind of least absolute deviations estimator (LADE) is asymptotically normal if the error distribution has finite second moment. Many extensions and generalizations of the ARCH model have appeared (see Engle and Bollerslev, 1986; Higgins and Bera, 1992; Li and Li, 1996; Hwang and Kim, 2004; Hwang and Basawa, 2004). Among all the extensions, the functional form for the conditional variance is of great importance (see Higgins and Bera, 1992). Even Engle (1982) has acknowledged that “it is likely that other formulations of the variance may be more appropriate for the particular applications”. 
Hsieh (1989) found that the GARCH models cannot fit some exchange rates satisfactorily; Scheinkman and LeBaron (1989) found evidence that volatility in stock market data cannot be captured completely by linear ARCH models; Gouriéroux (1997, p. 90) indicated that the heteroscedasticity varies depending on whether the error is positive or negative. This leads to asymmetric threshold ARCH modelling. The study of Li and Li (1996) has shown that threshold-asymmetric modelling provides a better fit than symmetric ARCH in the field of financial time series. Therefore, combining the above ideas, Hwang and Kim (2004) proposed a broad class of power-transformed and threshold ARCH models: where , , , are unknown parameters, . Here, is a sequence of independent and identically distributed random variables, and is independent of for all t. They studied the geometric ergodicity and existence of moments of the model, and investigated a large sample test for ARCH structures based on the uniform local asymptotic normality approach. However, the in Hwang and Kim's (2004) model is only a function of the past p observations. Hwang and Basawa (2004) introduced a Box–Cox transformed threshold GARCH(1,1) model by allowing to depend on and studied the stationarity and moment structure of the model. Liu (2006) investigated the tail behavior of the Box–Cox transformed threshold GARCH(1,1) model. We consider a more general power-transformed and threshold GARCH model, in which is a function of not only the past p observations but also the past q values of itself. A power-transformed and threshold GARCH model (PTTGARCH) is defined as where , , , , are the same as those in model (1.1), and . As we can see, besides the standard GARCH model (Bollerslev, 1986), i.e., and , model (1.2) includes diverse nonlinear and asymmetric models as special cases. 
For example, it becomes a Box–Cox transformed ARCH model (Higgins and Bera, 1992) when , a TARCH model (Li and Li, 1996) when , a power-transformed and threshold ARCH model (Hwang and Kim, 2004) when , and a Box–Cox transformed threshold GARCH(1,1) model (Hwang and Basawa, 2004) when . The main goal of this paper is to study the estimation and tests for model (1.2). We differ from Hwang and Kim (2004) and Hwang and Basawa (2004) in the following ways: (i) Our model is not a pure ARCH model or a simple GARCH(1,1); there are q GARCH terms in model (1.2). (ii) Instead of the uniform local asymptotic normality approach of maximum likelihood estimation (MLE), we consider Gaussian quasi-maximum likelihood estimation (QMLE) for the PTTGARCH model and obtain asymptotic normality of the QMLE under the condition that the error distribution has finite fourth moment. (iii) Our LADE approach relaxes the moment condition for the error distribution to the minimum, and its asymptotic normality enables us to do statistical inference on the PTTGARCH model with heavy-tailed errors. We also give a necessary and sufficient condition for the existence of a strictly stationary solution of model (1.2), and study the existence of the moments and the tail behavior of the model. Furthermore, an order selection method is established by using the Wald statistic based on the asymptotic normality of LADE for a heavy-tailed PTTGARCH model. A simulation study indicates that the LADE is more accurate than the QMLE when the errors are heavy-tailed. We give a real data example to illustrate the practicality of our theory. Our results in this paper are relevant because much empirical evidence shows that financial data often have heavy tails (see Adler et al., 1997; Mittnik and Rachev, 2000). The rest of this paper is organized as follows. In Section 2, asymptotic normality of QMLE and LADE is established. Section 3 investigates tests for GARCH structures and the order selection problem. 
Section 4 presents a simulation study and a real data example. All the proofs of the main results in Sections 2 and 3 are presented in Section 5. The Appendix presents the stationarity and existence of moments for PTTGARCH model. In the sequel, , and denote convergence in distribution, in probability and almost surely, respectively. denotes the transpose of a vector or a matrix A, denotes the Euclidean norm unless declared otherwise and C is a constant which may be different at different places.
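
A minimal simulation sketch of a PTTGARCH(1,1) process may help fix ideas. It assumes that the volatility recursion takes the Box–Cox threshold form written in the docstring, in the spirit of Hwang and Basawa (2004); that form, the Gaussian errors, the start-up values, the parameter values and the function name `simulate_pttgarch11` are illustrative assumptions, not the authors' specification.

```python
import random


def simulate_pttgarch11(n, alpha0, alpha1, alpha2, beta, delta,
                        seed=0, burn_in=500):
    """Simulate n observations X_t = sqrt(h_t) * e_t under the ASSUMED recursion
    h_t^delta = alpha0 + alpha1 * (X_{t-1}^+)^(2*delta)
                       + alpha2 * (X_{t-1}^-)^(2*delta) + beta * h_{t-1}^delta,
    where X^+ = max(X, 0) and X^- = max(-X, 0)."""
    rng = random.Random(seed)
    h_pow = alpha0      # arbitrary start for h_t^delta; burn-in reduces its influence
    x_prev = 0.0        # arbitrary pre-sample observation
    xs = []
    for _ in range(n + burn_in):
        pos = max(x_prev, 0.0)      # positive part of the previous observation
        neg = max(-x_prev, 0.0)     # negative part of the previous observation
        h_pow = (alpha0 + alpha1 * pos ** (2 * delta)
                 + alpha2 * neg ** (2 * delta) + beta * h_pow)
        h = h_pow ** (1.0 / delta)  # conditional variance h_t
        x_prev = (h ** 0.5) * rng.gauss(0.0, 1.0)  # Gaussian errors (assumption)
        xs.append(x_prev)
    return xs[burn_in:]             # drop the burn-in segment


# Hypothetical parameter values chosen only for illustration.
sample = simulate_pttgarch11(1000, alpha0=0.1, alpha1=0.05, alpha2=0.15,
                             beta=0.8, delta=1.0)
```

With `delta=1.0` and `alpha1 == alpha2` the recursion in this sketch reduces to a standard GARCH(1,1); unequal `alpha1` and `alpha2` make the response to positive and negative shocks asymmetric.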

Estimation

Suppose that the data generating process is model (1.2). To avoid pathological cases, we assume that or , and if . Let be the parameter vector with true value . Define Our basic assumptions are as follows. is non-degenerate and symmetrically distributed. Furthermore, for some , and is a compact subset of , is in the interior of , and the Lyapunov exponent for all (see (A.4) in the Appendix). Because of the compactness of , there exist positive constants such that , for any . Under assumptions (A1) and (A2), it may be deduced that (1.2) implies (2.2). The derivatives of , which are very useful in the sequel, may be deduced from (2.2) as follows. In the above expressions, we set if and if . In practice, however, cannot be computed using Eq. (2.2), since is only observed for . We have to use the following approximation for based on .

Quasi-maximum likelihood estimation (QMLE)

In this subsection, we deal with the QMLE of the parameters. The logarithm of the quasi-likelihood function (omitting a constant) is defined as where is defined by (2.2). The QMLE of is . Define In order to obtain the consistency and asymptotic normality of , we need an additional condition, namely , and . If has density at 0, then (2.1) is satisfied for any . The following theorem shows that is consistent and asymptotically normal. Under assumptions (A1)–(A3), it follows that , , where and . As mentioned earlier, we can only observe in practice. So we replace by Similarly, we define . Let where are defined similarly to by replacing by . The next theorem shows that our results for are the same as those for , and and are consistent estimators of and , respectively. Under assumptions (A1)–(A3), it follows that , , and . Based on Theorem 1, we can develop some statistical inference about model (1.2). For example, we can consider a general form of the linear null hypothesis where is a constant matrix with rank s, and is a constant vector. By Theorems 2 and 4 in the next section, the asymptotic distributions of the likelihood ratio (LR) test statistic, the Lagrange multiplier (LM) test statistic and the Wald test statistic are . From the above discussion, it can be seen that if we apply QMLE, we need Assumption (A3), which is quite restrictive on the parameter vector and excludes the heavy-tailed cases.
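
The quasi-likelihood criterion can be sketched for the PTTGARCH(1,1) case as follows. This is a hedged illustration only: it assumes the Gaussian form, the sum of `-(log h_t + X_t^2 / h_t)` with constants and any normalization dropped, and a truncated recursion started from assumed pre-sample values. The function names, the recursion written in the comments and the start-up values are assumptions, not the paper's exact definitions.

```python
import math


def pttgarch11_variances(x, alpha0, alpha1, alpha2, beta, delta):
    """Approximate conditional variances h_t computed from the observed sample only.
    ASSUMED recursion:
    h_t^delta = alpha0 + alpha1*(X_{t-1}^+)^(2*delta)
                       + alpha2*(X_{t-1}^-)^(2*delta) + beta*h_{t-1}^delta."""
    h_pow = alpha0 / (1.0 - beta)   # assumed start-up value for h_0^delta (needs beta < 1)
    x_prev = 0.0                    # assumed pre-sample observation
    hs = []
    for xt in x:
        pos, neg = max(x_prev, 0.0), max(-x_prev, 0.0)
        h_pow = (alpha0 + alpha1 * pos ** (2 * delta)
                 + alpha2 * neg ** (2 * delta) + beta * h_pow)
        hs.append(h_pow ** (1.0 / delta))
        x_prev = xt
    return hs


def gaussian_quasi_loglik(x, alpha0, alpha1, alpha2, beta, delta):
    """Gaussian quasi-log-likelihood with constants dropped (assumed form)."""
    hs = pttgarch11_variances(x, alpha0, alpha1, alpha2, beta, delta)
    return -sum(math.log(h) + xt * xt / h for xt, h in zip(x, hs))
```

A QMLE would maximize `gaussian_quasi_loglik` over a compact parameter set with a numerical optimizer; the optimizer is omitted here to keep the sketch dependency-free.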

Least absolute deviations estimation (LADE)

We have seen from the above that the QMLE requires stringent moment conditions on and . However, empirical evidence indicates that financial data may have heavy tails. In recent years, the problem of statistical inference about GARCH-type models with weak moment conditions on and has attracted much attention (see Hall and Yao, 2003). We introduce LADE for the PTTGARCH model, which only requires conditions for strict stationarity and assumption (A4). Define an objective function as in Peng and Yao (2003) where is a positive number satisfying and as . The LADE is a minimizer of the objective function on the parameter space . Denote where . It is easy to see that , where is a minimizer of Define where We need the following condition on the error distribution instead of Assumption (A3). has zero median and a differentiable positive density function such that and . Suppose that conditions (A1), (A2) and (A4) hold. Then for any given positive random variable M with , there exists a local minimizer of which lies in the random region for which Here is a normal random vector with mean 0 and covariance matrix .
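
A hedged sketch of a least absolute deviations criterion for the PTTGARCH(1,1) case is given here. It assumes the objective is the sum of absolute log-deviations `|log X_t^2 - log h_t(theta)|` over `t > u`, in the style of Peng and Yao (2003); the exact objective, the volatility recursion, the start-up values and the name `lade_objective` are assumptions made for illustration.

```python
import math


def lade_objective(x, alpha0, alpha1, alpha2, beta, delta, u=0):
    """Sum over t > u of |log X_t^2 - log h_t| (assumed form of the criterion).
    The first u observations only warm up the volatility recursion.
    ASSUMED recursion:
    h_t^delta = alpha0 + alpha1*(X_{t-1}^+)^(2*delta)
                       + alpha2*(X_{t-1}^-)^(2*delta) + beta*h_{t-1}^delta.
    Observations must be non-zero so that log X_t^2 is defined."""
    h_pow = alpha0 / (1.0 - beta)   # assumed start-up value for h_0^delta
    x_prev = 0.0                    # assumed pre-sample observation
    total = 0.0
    for t, xt in enumerate(x, start=1):
        pos, neg = max(x_prev, 0.0), max(-x_prev, 0.0)
        h_pow = (alpha0 + alpha1 * pos ** (2 * delta)
                 + alpha2 * neg ** (2 * delta) + beta * h_pow)
        if t > u:
            log_h = math.log(h_pow) / delta   # log h_t, since h_pow = h_t^delta
            total += abs(math.log(xt * xt) - log_h)
        x_prev = xt
    return total
```

In the constant-variance special case (`alpha1 = alpha2 = beta = 0`, `delta = 1`) this criterion is minimized when `alpha0` equals the sample median of the squared observations, which illustrates why a median-type condition on the errors replaces the moment condition needed by the QMLE.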

Order selection and tests for GARCH structure

The LR test, the LM test and the Wald test are the three standard approaches to constructing test statistics for parametric hypotheses. However, the first two depend on the likelihood function and MLE, which are very sensitive to heavy tails (see Hall and Yao, 2003). Therefore, we use a Wald test statistic based on LADE for the heavy-tailed case. We consider a general form of linear null hypothesis (2.10). A Wald test statistic is defined as We reject for large values of . In the above expression, where is defined similarly to by replacing by , is a density function on R, and is a bandwidth. The following theorem gives the limiting distribution of under . Suppose the conditions of Theorem 3 hold. If the kernel function K and the bandwidth satisfy the following assumptions: K is Lipschitz continuous and of finite first moment; and as , then under . For testing an order against a higher order or with and , we can take a in (2.10) such that and Notice that a GARCH model cannot be tested directly against an using the standard technique because of the identification problem already discussed by Bollerslev (1986), and the situation is the same for PTTGARCH models. As pointed out by Ling (2005), the above test procedure is very useful in model building. In fact, we can use it to select the order. Suppose that the order of model (1.2) does not exceed . For a given significance level , we can apply the above test in order for and in (3.2) until and such that . Then we can declare that the order of model (1.2) is . Because model (1.2) is a very general framework, we can test whether some special case is true or not. In the following, we mainly discuss testing problems about GARCH structures for : Let Then the above four testing problems can be written in the form with So the Wald test provides a simple way to test the null hypothesis of a particular specification against wider nonlinear alternatives. 
For example, we can determine whether Bollerslev's standard model provides an adequate description of the data by testing . Bollerslev's standard GARCH: and , . IGARCH: , , and . Symmetric GARCH: , . No power transformation: .
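
The generic shape of a Wald statistic for a linear restriction can be sketched as follows. This is the textbook quadratic form `n * r' (R V R')^{-1} r` with `r = R * theta_hat - c`, where `V` is an estimate of the asymptotic covariance of `sqrt(n) * (theta_hat - theta_0)`; the paper's own statistic (including its kernel density factor) is not reproduced, and all names below are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):                        # back substitution
        tail = sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (m[i][n] - tail) / m[i][i]
    return y


def wald_statistic(theta_hat, avar, restr_matrix, restr_vector, n):
    """Generic Wald statistic n * r' (R V R')^{-1} r for H0: R theta = c."""
    d, s = len(theta_hat), len(restr_matrix)
    # r = R * theta_hat - c
    r = [sum(restr_matrix[i][j] * theta_hat[j] for j in range(d)) - restr_vector[i]
         for i in range(s)]
    # R V, then (R V) R'
    rv = [[sum(restr_matrix[i][k] * avar[k][j] for k in range(d)) for j in range(d)]
          for i in range(s)]
    rvr = [[sum(rv[i][k] * restr_matrix[j][k] for k in range(d)) for j in range(s)]
           for i in range(s)]
    y = solve_linear(rvr, r)
    return n * sum(ri * yi for ri, yi in zip(r, y))
```

For instance, a single restriction row `[1, -1]` with right-hand side `[0]` tests equality of two coefficients, the kind of restriction a symmetry hypothesis imposes; under the null the statistic would be compared with a chi-square quantile whose degrees of freedom equal the number of restriction rows.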

Simulations and empirical results

In this section, we perform a simulation study to demonstrate the accuracy of LADE in the heavy-tailed case and apply the theory in Sections 2 and 3 to the Hong Kong Hang Seng Index (HSI) series. First, we numerically compare LADE and QMLE for the PTTGARCH(1,1) model. The data are generated by the PTTGARCH(1,1) model with the true parameter . We take the errors to have either a standard normal distribution or a standardized Student's t-distribution with degrees of freedom , 3, 4, 5. The sample size is and we draw 1000 independent samples. For LADE, u was set to be . Fig. 1 presents the boxplots of the average absolute error (AAE) for both LADE and QMLE. For samples with heavy-tailed errors, i.e., , and , LADE performs better than QMLE especially for and . As expected, QMLE is better when the errors are and .

Fig. 1

Boxplots of AAE for LADE and QMLE.

Then we apply the PTTGARCH model to the daily HSI from 2001 to 2003, which has a total of 738 observations. The return series is defined as the percentage of the log difference of the index. Figs. 2 and 3 are the time plots of the index and the return, respectively. They display some drastic shocks, which were caused by the terrorist attacks of 11 September 2001 and the outbreak of severe acute respiratory syndrome (SARS) in China in March 2003.
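
The return definition above, the percentage of the log difference of the index, can be sketched as follows; the function name and the toy price list are hypothetical, and no actual HSI data are reproduced.

```python
import math


def percentage_log_returns(prices):
    """Return 100 * (log P_t - log P_{t-1}) for consecutive index levels."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]


# Toy index levels, for illustration only.
toy_returns = percentage_log_returns([100.0, 110.0, 99.0])
```

A series of `n` index levels yields `n - 1` returns.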

Fig. 2

The time plot of the original HSI.

Fig. 3

The time plot of the percentage of the log return of HSI.

We first study whether or not is heavy-tailed. The Hill estimator and the QQ-plot are used for this. The Hill estimator is defined as where are the order statistics of . We plot in Fig. 4, which suggests that has an infinite fourth moment or even probably infinite variance, since the estimator is less than 4 for and less than 2 for . Fig. 5 presents the QQ-plot for , which suggests that is very heavy-tailed.
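
A minimal sketch of a Hill-type tail-index estimate is given below. It uses the standard form, the reciprocal of the mean log-excess of the `k` largest absolute values over the `(k+1)`-th largest; whether the paper applies the estimator to absolute values, and its exact indexing, are assumptions here, as is the function name.

```python
import math


def hill_estimator(data, k):
    """Standard Hill estimate of the tail index from the k largest absolute values.
    Values below 4 (or 2) point to an infinite fourth (or second) moment."""
    xs = sorted((abs(v) for v in data), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(data)")
    threshold = xs[k]                    # the (k+1)-th largest absolute value
    mean_log_excess = sum(math.log(xs[i] / threshold) for i in range(k)) / k
    return 1.0 / mean_log_excess
```

In practice the estimate is plotted against `k`, as in a Hill plot, because it can be sensitive to the number of upper order statistics used.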

Fig. 4

The Hill estimator of .

Fig. 5

The QQ-plot of .

To test whether is white noise, we use the Wald test statistics based on the weighted least absolute deviations estimators with the weight function where and is the 90 percent quantile of the data (see Ling, 2005 for details), since the Box–Pierce statistic is not applicable for the heavy-tailed case. Here and in the following we take the kernel function and (Silverman, 1986, p. 40). We obtain that and . Neither is significant at the 0.05 level. However, for the squared series , we obtain that and , which are highly significant at the 0.05 level. This suggests that the series has conditional heteroscedastic structure. Now, we fit a PTTGARCH(1,1) model using QMLE to the data. The estimates are with standard errors , , , , and , respectively. For the standardized residuals, we obtain and ; for the squared standardized residuals, we obtain and . Based on the significance level of the and distribution, the PTTGARCH(1,1) model fits the data adequately according to both statistics , , , and . Fig. 6 shows the Hill estimator of the standardized residuals, which indicates that the residuals may have infinite fourth moment since the Hill estimator is less than when . The QQ-plot in Fig. 7 also shows that the residuals are heavy-tailed. Thus, we fit a PTTGARCH model to the data with LADE.

Fig. 6

The Hill estimator of the standardized residuals.

Fig. 7

The QQ-plot of the standardized residuals.

For order selection, we assume that for simplicity. Using the procedure for order selection in Section 3, we test , vs , ; , vs , ; and , vs , in order, and all the Wald statistics are less than 1, that is, not significant. Then we test , vs , , and the Wald statistic is 34.9, so we reject the null hypothesis and take , . The LADEs are with standard errors , , , , and , respectively. To check the adequacy of the estimated PTTGARCH(1,1) model, we conduct the white noise test for the residuals and the squared residuals using the same method as for before. We have and for the residuals and and for the squared residuals, none of which is significant at the 0.05 level. Hence, the estimated PTTGARCH(1,1) model is adequate for the data. Notice that for both the residuals and the squared residuals, all the Wald statistics based on LADE are less than those based on QMLE, which suggests that the fitted model based on LADE is more adequate. For the fitted model using LADE, we also test the hypotheses and , respectively. The Wald statistic for the former is 14.24 and is highly significant. The Wald statistic for the latter is 2.41 and is not significant, which may be caused by the small values of and . In fact, as we can see from the estimates, is about five times . This example illustrates that the data are asymmetric and nonlinear and the PTTGARCH model is capable of capturing these characteristics.

Theoretical proofs

We use the same notation as in Sections 2 and 3. Before proving Theorems 1–4, we introduce some lemmas.

Under assumptions (A1) and (A2), there exist positive constants and independent of such that and where is the positive constant in Remark 1. Denote and By Lemma 3.1 of Berkes et al. (2003), there exist some constants and such that where C and r are both independent of . Notice that thus (5.1) holds. Similarly, we can show that (5.2) holds. □

Under assumptions (A1) and (A2), it follows that where m, h are any positive integers, and . We only prove the case . The proof of the case is similar. From (2.11), (2.12), it is sufficient to prove that By the definition of , we have , where is the positive constant in Remark 1. Notice that for any given positive integer h, there exists such that for , where , and are defined, respectively, in Theorem 6 and Remark 1. By Lemma 1, it follows that Noticing that we have which implies (5.3) holds by Theorem 6(i). It is obvious that (5.4) holds for by (2.4), (2.5), (2.6). From Lemma 5.2 of Berkes et al. (2003), we can obtain (5.4) for . Now we prove (5.5). Using the same argument as (A.5), we can obtain , where is defined in (2.3). On the other hand, where is the same one as in Remark 1 and is defined in (2.3). From Hölder's inequality and (5.4), it follows that It can be easily verified that Let and . Noting that (2.1) holds, we obtain Thus, . Similarly, we can obtain that . This completes the proof. □

Suppose that assumptions (A1) and (A2) hold. If for some , it follows that where . We only prove the lemma for , noting that similar arguments apply for . By the definition of , we have By an argument similar to Lemma 5.1 of Berkes et al. (2003), we can obtain that From Hölder's inequality, Lemma 2 and (5.6), we have Using Lemma 2 again, we obtain Thus, . □

Suppose that assumptions (A1)–(A3) hold, and denote for all . Then is well defined and is the unique maximizer of . 
By Lemmas 2 and 3, it is obvious that is well defined. Maximizing is equivalent to minimizing . But, Note that the function for any and reaches its unique minimum value at . Since if and only if , we obtain the result. □

If the conditions of Theorem 1 are satisfied, then as , we have By the ergodic theorem, for any . Using the mean value theorem, we have where , lie on the line from to . We have by Lemmas 2 and 3. Then, which shows that is equicontinuous with probability one. Combining this fact, (5.7) and the compactness of , the uniform convergence of follows. By the same method, we can prove that the results hold for and . □

If the conditions of Theorem 1 are satisfied, then it follows that: Let then By Lemma 1, we have By the mean value theorem, we have where lies between and . Thus we have in a way similar to the proof of from Lemma 1. Therefore, Using the same method as the proof of (5.8), we get This completes the proof. □

(i) By Theorem 4.1.1 and the associated in Amemiya (1985), we have if the following conditions hold: (a) is a compact parameter space; (b) is continuous in for all X and is a measurable function of X for all ; (c) uniformly in ; (d) attains a unique global maximum at . By assumption (A2), Lemma 4 and Lemma 5, we know conditions (a)–(d) are satisfied. Thus, . (ii) By the mean value theorem, we obtain that where lies between and . But, . Then Using Lemma 3 and the continuity of , we obtain It can be easily verified that is a stationary sequence of martingale differences. Therefore, by applying a martingale central limit theorem (Hall and Heyde, 1980), we obtain

(i) By Lemmas 5 and 6, we have Imitating the proof of Theorem 1 (i), we obtain the result. (ii) Notice that From Lemma 5 and the mean value theorem, we have Thus, by Lemma 3 and the continuity of . Then the result follows from Theorem 1. (iii) By Lemmas 3 and 6 and Theorem 2 (ii), we have Therefore, by applying an ergodic theorem to . This completes the proof of Theorem 2. □

Define where . 
By Lemma 1 and the same argument as Lemma 6, we obtain that uniformly on compact sets. Using the equality we have Since , we know is a stationary sequence of martingale differences by assumption (A4) and Lemma 2. Therefore, applying a martingale central limit theorem (Hall and Heyde, 1980), we obtain , where N denotes a random vector. Now turning to , let Then By Lemma 2, we have and . Therefore, we have proved that On the other hand, on the set , we may show that and Therefore, Using the same argument for the second indicator in the summands of , we obtain that Let , then the finite dimensional distributions of converge to those of T. But, since has convex sample paths, this implies that the convergence is in fact on (see the proof of Proposition 1 in Davis and Dunsmuir, 1997). Denote , then we have from Lemma 2. By Taylor expansion and (5.10), it follows that By a similar argument for in Pan et al. (2007), we can obtain that uniformly on compact sets, which implies on from (5.9). By the proof of Theorem 1 in Pan et al. (2007), we obtain the result. □

Based on Theorem 3, Theorem 4 follows immediately from the following two assertions: For the first assertion, defining we have Obviously, by the ergodic theorem. But, by Lemma 2 and Theorem 3. In the above expression, and lie on the line from to . Using Lemma 1, by an argument similar to that leading to Lemma 6, we can conclude that . The proof of the first assertion is completed. For the second assertion, defining we have provided , by a proof similar to that of Lemma 6. Notice that It follows from Lemma 2 and Theorem 3 that On the other hand, since it follows that by assumption (A4). Then, . This completes the proof of the second assertion. □
