
A difference-based approach in the partially linear model with dependent errors.

Zhen Zeng, Xiangdong Liu.

Abstract

We study the asymptotic properties of the estimators of the parametric and nonparametric components in a partially linear model in which the errors are dependent. Using a difference-based and ordinary least squares (DOLS) method, the estimator of the unknown parametric component is given, and the asymptotic normality of the DOLS estimator is obtained. Meanwhile, the estimator of the nonparametric component is derived by the wavelet method, and the asymptotic normality and the weak convergence rate of the wavelet estimator are discussed. Finally, the performance of the proposed estimators is evaluated by a simulation study.


Keywords:  Asymptotic normality; Finite difference; Least square; NSD random variables; Partially linear model

Year:  2018        PMID: 30363843      PMCID: PMC6182447          DOI: 10.1186/s13660-018-1857-x

Source DB:  PubMed          Journal:  J Inequal Appl        ISSN: 1025-5834            Impact factor:   2.491


Introduction

Consider the partially linear model (PLM)

y_i = x_i^T β + f(t_i) + e_i,  i = 1, 2, …, n,   (1)

where the superscript T denotes the transpose, the y_i are scalar response variables, the x_i are d-dimensional explanatory variables, β is a d-dimensional column vector of unknown parameters, f(·) is an unknown function, the t_i are deterministic design points in [0, 1], and the e_i are random errors. The PLM was first considered by Engle et al. [1] and is now one of the most widely used statistical models. It can be applied in almost every field, such as engineering, economics, the medical sciences and ecology. Many authors (see [2-8]) have been concerned with various estimation methods for the parametric and nonparametric components of the partially linear model, and deep results such as the asymptotic normality of the estimators have been obtained.

In this paper, we investigate model (1) by a difference-based approach combined with ordinary least squares and wavelet estimation. Differencing procedures provide a convenient means for introducing nonparametric techniques to practitioners in a way which parallels their knowledge of parametric techniques, and differencing procedures may easily be combined with other procedures. For example, Wang et al. [9] obtained a difference-based approach to the semiparametric partially linear model. Tabakan et al. [10] studied a difference-based ridge estimator in the partially linear model. Duran et al. [11] investigated difference-based ridge and Liu-type estimators in semiparametric regression models. Hu et al. [12] used a difference-based Huber-Dutter (DHD) estimator to obtain the root variance σ and the parametric component β of the partially linear model. Wu [13] constructed a restricted difference-based Liu estimator for the parametric component of the partially linear model. However, in the majority of the previous work it is assumed that the errors are independent, whereas the asymptotic behavior of difference-based estimators in partially linear models with dependent errors is important in practice.

In this paper, we use a difference-based and ordinary least squares method to study the partially linear model with dependent errors. For the dependent errors we confine ourselves to negatively superadditive dependent (NSD) random variables. There are many applications of NSD random variables in multivariate statistical analysis; see [14-23]. Hence, it is meaningful to study the properties of NSD random variables. The formal definition of NSD random variables is the following.

Definition 1

(Kemperman [24]) A function Φ: R^n → R is called superadditive if Φ(x ∨ y) + Φ(x ∧ y) ≥ Φ(x) + Φ(y) for all x, y ∈ R^n, where ∨ stands for componentwise maximum, and ∧ for componentwise minimum.
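For instance, the product of two coordinates is superadditive: for any two points (x_1, y_1) and (x_2, y_2),

(x_1 ∨ x_2)(y_1 ∨ y_2) + (x_1 ∧ x_2)(y_1 ∧ y_2) − x_1 y_1 − x_2 y_2 = |x_1 − x_2| · |y_1 − y_2| · 1{(x_1 − x_2)(y_1 − y_2) < 0} ≥ 0.

Consequently, applying Definition 2 below with Φ(u, v) = uv to any two coordinates of an NSD vector gives E(X_i X_j) ≤ E(X_i)E(X_j), that is, Cov(X_i, X_j) ≤ 0 whenever these expectations exist.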

Definition 2

(Hu [25]) A random vector (X_1, X_2, …, X_n) is said to be NSD if

E Φ(X_1, X_2, …, X_n) ≤ E Φ(X_1*, X_2*, …, X_n*),   (2)

where X_1*, X_2*, …, X_n* are independent random variables such that X_i* and X_i have the same distribution for each i, and Φ is a superadditive function such that the expectations in (2) exist. An infinite sequence of random variables {X_n, n ≥ 1} is said to be NSD if (X_1, X_2, …, X_n) is NSD for all n ≥ 1.

In addition, using the wavelet method (see [26-29]), the weak convergence rate and asymptotic normality of the estimator of the nonparametric component are obtained.

Throughout the paper we fix the following notation. β_0 is the true value of the unknown parameter β. Z is the set of integers, N is the set of natural numbers, and R is the set of real numbers. C, C_1, C_2, … denote positive constants whose values may differ from place to place. For a sequence of random variables {ξ_n, n ≥ 1} and a positive sequence {a_n, n ≥ 1}, write ξ_n = o(a_n) if ξ_n/a_n converges to 0 and ξ_n = O(a_n) if ξ_n/a_n is bounded; the notations o_P(a_n) and O_P(a_n) are defined similarly for stochastic convergence and stochastic boundedness. Weak convergence of distributions is denoted by ⇒, and convergence in distribution of random variables by →_D. ‖x‖ is the Euclidean norm of x.
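As a quick numerical illustration of Definition 2 (a sketch added for intuition; the bivariate Gaussian family, the correlation value −0.5 and the sample size are arbitrary choices, not taken from the paper): a jointly Gaussian pair with non-positive correlation is negatively associated and hence NSD, so applying the definition with the superadditive function Φ(u, v) = uv should give E Φ(X_1, X_2) ≤ E Φ(X_1*, X_2*).

import numpy as np

rng = np.random.default_rng(0)
rho = -0.5                      # negative correlation: the pair is NA, hence NSD (illustrative assumption)
n = 200_000
cov = [[1.0, rho], [rho, 1.0]]

# dependent pair (X1, X2) and an independent counterpart (X1*, X2*) with the same marginals
x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
x_star = rng.standard_normal((n, 2))

phi = lambda a: a[:, 0] * a[:, 1]                 # superadditive Phi(u, v) = u * v
print(phi(x).mean(), phi(x_star).mean())          # roughly -0.5 versus 0, consistent with the NSD inequality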

Estimation method

Define the (n − m) × n differencing matrix D whose i-th row contains the differencing weights d_0, d_1, …, d_m in columns i, i + 1, …, i + m and zeros elsewhere, where the positive integer m is the order of differencing and the weights satisfy

Σ_{j=0}^{m} d_j = 0  and  Σ_{j=0}^{m} d_j^2 = 1.   (3)

This differencing matrix is given by Yatchew [30]. Applying the differencing matrix to model (1), we have

DY = DXβ + Df + De.   (4)

From Yatchew [30], the application of the differencing matrix D to model (1) removes the nonparametric effect in large samples, so we will ignore the presence of Df. Thus, we can rewrite (4) as

Ỹ = X̃β + ẽ,   (5)

where Ỹ = DY, X̃ = DX, ẽ = De, and X̃^T X̃ is nonsingular for large n. As in a usual regression model, the ordinary least squares estimator of the unknown parameter β is given as

β̂_n = (X̃^T X̃)^{-1} X̃^T Ỹ.   (6)

Then the estimator satisfies

β̂_n = β + (X̃^T X̃)^{-1} X̃^T ẽ,   (7)

and hence β̂_n − β is a weighted sum of the differenced errors.

In the following, we use wavelet techniques to estimate the nonparametric component f(·). Suppose that there exists a scaling function ϕ(·) in the Schwartz space S_l and a multiresolution analysis {V_m} in the concomitant Hilbert space L²(R), with the reproducing kernel E_m(t, s) given by

E_m(t, s) = 2^m E_0(2^m t, 2^m s) = 2^m Σ_{k ∈ Z} ϕ(2^m t − k) ϕ(2^m s − k).

Let A_i = [s_{i−1}, s_i) denote intervals that partition [0, 1] with t_i ∈ A_i for 1 ≤ i ≤ n. Then the estimator of the nonparametric component is given by

f̂_n(t) = Σ_{i=1}^{n} (y_i − x_i^T β̂_n) ∫_{A_i} E_m(t, s) ds.
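A minimal numerical sketch of the two estimation steps is given below (written for illustration; the first-order difference weights, the Haar scaling function used as a stand-in for the Schwartz-class ϕ required above, and the toy data-generating choices are all assumptions, not taken from the paper). With the Haar choice, the kernel E_m(t, s) equals 2^m when t and s fall in the same dyadic bin of width 2^{−m} and 0 otherwise, so the plug-in estimator essentially reduces to averaging the residuals y_i − x_i^T β̂_n within the bin containing t.

import numpy as np

def diff_matrix(n, d):
    # (n - m) x n differencing matrix whose rows carry the weights d_0, ..., d_m
    m = len(d) - 1
    D = np.zeros((n - m, n))
    for i in range(n - m):
        D[i, i:i + m + 1] = d
    return D

def dols(y, X, d):
    # difference-based ordinary least squares estimate of beta
    D = diff_matrix(len(y), d)
    Xt, yt = D @ X, D @ y             # differencing removes the smooth f(t_i) asymptotically
    return np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)

def f_hat(t_grid, t, resid, m_level):
    # plug-in nonparametric estimate with a Haar scaling function:
    # average the residuals over the dyadic bin of width 2**(-m_level) containing each grid point
    bins_grid = np.floor(t_grid * 2 ** m_level).astype(int)
    bins_data = np.floor(t * 2 ** m_level).astype(int)
    out = np.empty_like(t_grid)
    for j, b in enumerate(bins_grid):
        mask = bins_data == b
        out[j] = resid[mask].mean() if mask.any() else 0.0
    return out

# toy example (assumed design, not from the paper)
rng = np.random.default_rng(1)
n, beta = 500, np.array([1.5, -2.0])
t = np.linspace(0, 1, n, endpoint=False)
X = rng.normal(size=(n, 2))
f_true = np.sin(2 * np.pi * t)
y = X @ beta + f_true + rng.normal(scale=0.5, size=n)

d1 = np.array([1.0, -1.0]) / np.sqrt(2.0)     # first-order weights: they sum to 0 and their squares sum to 1
beta_hat = dols(y, X, d1)
fhat = f_hat(t, t, y - X @ beta_hat, m_level=4)
print(beta_hat)                               # should be close to beta
print(np.max(np.abs(fhat - f_true)))          # rough uniform error of the nonparametric fit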

Preliminary conditions and lemmas

In this section, we give the following conditions and lemmas, which will be used to obtain the main results.

(C1) A standard regularity condition on the explanatory variables x_i and the design points t_i.

(C2) f(·) belongs to the Sobolev space H^ν for some ν > 1/2, and f(·) is a Lipschitz function of order γ > 0.

(C3) ϕ(·) belongs to S_l, which is a Schwartz space, for l ≥ ν; ϕ is a Lipschitz function of order 1 and has compact support; in addition, ϕ̂(ξ) − 1 = O(ξ) as ξ → 0, where ϕ̂ denotes the Fourier transform of ϕ.

(C4)–(C5) The partition points s_i, the design points t_i and the resolution level m satisfy suitable rate conditions, in particular max_{1≤i≤n} |s_i − s_{i−1}| = O(n^{−1}).

Remark 3.1

Condition (C1) is standard and is often imposed in the estimation of partially linear models; one can refer to Zhao et al. [31]. Conditions (C2)–(C5) are used by Hu et al. [29]. Therefore, our conditions are very mild and can easily be satisfied.

Lemma 3.1

(Hu [25]) Suppose that (X_1, X_2, …, X_n) is NSD. (i) If g_1, g_2, …, g_n are nondecreasing functions, then (g_1(X_1), g_2(X_2), …, g_n(X_n)) is NSD. (ii) For any n ≥ 1 and 1 ≤ i_1 < i_2 < ⋯ < i_k ≤ n, (X_{i_1}, X_{i_2}, …, X_{i_k}) is NSD.

Lemma 3.2

(Wang et al. [17]) Let p > 1. Let {X_n, n ≥ 1} be a sequence of NSD random variables with EX_n = 0 and E|X_n|^p < ∞ for each n ≥ 1. Then, for all n ≥ 1,

E|Σ_{i=1}^{n} X_i|^p ≤ C_p Σ_{i=1}^{n} E|X_i|^p   for 1 < p ≤ 2,

and

E|Σ_{i=1}^{n} X_i|^p ≤ C_p { Σ_{i=1}^{n} E|X_i|^p + ( Σ_{i=1}^{n} EX_i^2 )^{p/2} }   for p > 2,

where C_p is a positive constant depending only on p.

Lemma 3.3

Let p > 1. Let {X_n, n ≥ 1} be a sequence of NSD random variables with EX_n = 0 and E|X_n|^p < ∞ for all n ≥ 1, and let {a_{ni}, 1 ≤ i ≤ n, n ≥ 1} be an array of real constants. Then, for all n ≥ 1,

E|Σ_{i=1}^{n} a_{ni} X_i|^p ≤ C_p Σ_{i=1}^{n} |a_{ni}|^p E|X_i|^p   for 1 < p ≤ 2,   (11)

and, for p > 2,

E|Σ_{i=1}^{n} a_{ni} X_i|^p ≤ C_p { Σ_{i=1}^{n} |a_{ni}|^p E|X_i|^p + ( Σ_{i=1}^{n} a_{ni}^2 EX_i^2 )^{p/2} }.   (12)

Proof

Let a_{ni}^+ = max(a_{ni}, 0) and a_{ni}^- = max(−a_{ni}, 0); then a_{ni} = a_{ni}^+ − a_{ni}^-, and {a_{ni}^+ X_i, 1 ≤ i ≤ n} and {a_{ni}^- X_i, 1 ≤ i ≤ n} are both NSD random variables for all n ≥ 1 by Lemma 3.1. By the c_r-inequality,

E|Σ_{i=1}^{n} a_{ni} X_i|^p ≤ 2^{p−1} ( E|Σ_{i=1}^{n} a_{ni}^+ X_i|^p + E|Σ_{i=1}^{n} a_{ni}^- X_i|^p ).   (13)

In the case 1 < p ≤ 2, it follows from Lemma 3.2 that

E|Σ_{i=1}^{n} a_{ni}^± X_i|^p ≤ C_p Σ_{i=1}^{n} (a_{ni}^±)^p E|X_i|^p.

Noting that a_{ni}^+ ≤ |a_{ni}| and a_{ni}^- ≤ |a_{ni}|, the desired result (11) follows from (13) immediately. In the same way, we also have (12). The proof is completed. □

Remark 3.2

From Lemma 3.3 and Lemma 3.1, we have, for , and, for , where .

Lemma 3.4

Let A and B be disjoint subsets of N, and let {X_n, n ≥ 1} be a sequence of NSD random variables. Let f: R → R and g: R → R be differentiable with bounded derivatives, and let ‖·‖_∞ stand for the sup norm. Then

|Cov( f(Σ_{i∈A} a_i X_i), g(Σ_{j∈B} a_j X_j) )| ≤ ‖f′‖_∞ ‖g′‖_∞ Σ_{i∈A} Σ_{j∈B} |a_i a_j| |Cov(X_i, X_j)|,

provided the covariances on the right-hand side exist, where {a_i} is an array of real numbers.

Proof

For a pair of random variables X, Y, denote by H(x, y) the joint distribution function of (X, Y) and by F(x) and G(y) the marginal distribution functions of X and Y. One gets

Cov(X, Y) = ∫∫ [ H(x, y) − F(x)G(y) ] dx dy;

this relation was established in Lehmann [32] for any two random variables X and Y whose covariance exists. Let f and g be complex-valued functions on R with bounded derivatives f′ and g′; then we have

|Cov( f(X), g(Y) )| ≤ ‖f′‖_∞ ‖g′‖_∞ ∫∫ | H(x, y) − F(x)G(y) | dx dy.

Applying this bound to the sums over A and B yields the stated inequality. The proof is completed. □

Lemma 3.5

Let {X_n, n ≥ 1} be a sequence of NSD random variables with finite second moments. Let I_1, I_2, …, I_k be pairwise disjoint subsets of N and S_j = Σ_{i ∈ I_j} a_i X_i for 1 ≤ j ≤ k. Then

| E exp( i Σ_{j=1}^{k} t_j S_j ) − Π_{j=1}^{k} E exp( i t_j S_j ) | ≤ Σ_{1≤l<j≤k} |t_l t_j| Σ_{u∈I_l} Σ_{v∈I_j} |a_u a_v| |Cov(X_u, X_v)|,   (16)

where the a_i and t_1, …, t_k are real numbers.

Proof

Notice that the result is true for k = 1. For k = 2, let f(x) = exp(i t_1 x) and g(x) = exp(i t_2 x). Then, by Lemma 3.4 and ‖f′‖_∞ = |t_1|, ‖g′‖_∞ = |t_2|, the left-hand side of (16) is bounded by |t_1 t_2| Σ_{u∈I_1} Σ_{v∈I_2} |a_u a_v| |Cov(X_u, X_v)|. Hence, the result is true for k = 2. Moreover, suppose that (16) holds for k − 1 blocks. By Lemma 3.4, we obtain the bound for k blocks after separating the term containing S_k and applying the induction hypothesis to the remaining blocks, which completes the proof. □

Lemma 3.6

(Hu et al. [29]) If Condition (C3) holds, then:

(i) |E_0(t, s)| ≤ C_k / (1 + |t − s|)^k, where k ∈ N and C_k is a constant depending on k only;

(ii) |E_m(t, s)| ≤ 2^m C_k / (1 + 2^m |t − s|)^k;

(iii) sup_{0≤t≤1} ∫_0^1 |E_m(t, s)| ds = O(1);

(iv) sup_{0≤t≤1} Σ_{i=1}^{n} | ∫_{A_i} E_m(t, s) ds | = O(1).

Lemma 3.7

(Rao [33]) Suppose that {Y_i, 1 ≤ i ≤ n} are independent random variables with EY_i = 0 and E|Y_i|^{2+δ} < ∞ for some δ > 0. If

Σ_{i=1}^{n} E|Y_i|^{2+δ} / B_n^{(2+δ)/2} → 0,

then B_n^{-1/2} Σ_{i=1}^{n} Y_i →_D N(0, 1), where B_n = Σ_{i=1}^{n} EY_i^2.

Lemma 3.8

(Yu et al. [34]) Let {e_i, i ≥ 1} be a sequence of NSD random variables satisfying Ee_i = 0 and a covariance decay condition as n → ∞, and let {a_{ni}, 1 ≤ i ≤ n, n ≥ 1} be an array of real numbers with Σ_{i=1}^{n} a_{ni}^2 = O(1) and max_{1≤i≤n} |a_{ni}| → 0. Suppose that {e_i^2, i ≥ 1} is uniformly integrable in i; then

σ_n^{-1} Σ_{i=1}^{n} a_{ni} e_i →_D N(0, 1),

where σ_n^2 = Var( Σ_{i=1}^{n} a_{ni} e_i ).

Main results and their proofs

Theorem 4.1

Under Condition (C1), suppose that {e_i, i ≥ 1} is a sequence of NSD random variables with Ee_i = 0 satisfying (i) sup_{i≥1} E|e_i|^{2+δ} < ∞ for some δ > 0 and (ii) a covariance decay condition as n → ∞. Then the suitably standardized DOLS estimator β̂_n converges in distribution to N(0, I_d), provided that the relevant limiting matrix is positive definite, where I_d is the identity matrix of order d.

Proof

By condition (i), the moments E|e_i|^{2+δ} are uniformly bounded, from which it follows that the second moments of the errors are uniformly bounded for all i. Then we can choose a suitable positive number sequence for the following blocking construction. We define the required integers and, for each block, put the corresponding set of indices; denote the number of blocks of indices so obtained. If the number of remainder terms is not zero when the construction ends, then we put all the remainder terms into one additional block.

By (7), β̂_n − β can be decomposed into a weighted sum of the errors. Then, to prove (17), it is enough to prove (21). Let u be an arbitrary d-dimensional column vector with ‖u‖ = 1 and consider the corresponding linear combination. Then, by the Cramér–Wold device, to prove (21) it suffices to prove (22).

Moreover, using Condition (C1) and applying Lemma 3.3, one of the terms in the decomposition is shown to be negligible in probability by the Markov inequality. Therefore, to prove (22), it suffices to show (24). On the one hand, by the definition of the blocks, an elementary computation together with (23) controls the corresponding term. On the other hand, by Lemma 3.5 and (ii), the characteristic function of the sum over the blocks factorizes asymptotically, which implies that the problem is now reduced to studying the asymptotic behavior of independent and non-identically distributed random variables. To complete the proof of (24), it is enough to show that these random variables satisfy the conditions of Lemma 3.7. By Lemma 3.3 and (27), and recalling condition (i), the Lyapunov-type condition of Lemma 3.7 is verified. Hence, by Lemma 3.7, (24) holds and the proof is completed. □

Corollary 4.1

Under Condition (C1), let {e_i, i ≥ 1} be a sequence of independent random variables with Ee_i = 0 and Ee_i^2 > 0 for all i ≥ 1, and suppose that (i) of Theorem 4.1 holds. Then the conclusion of Theorem 4.1 remains valid, provided that the corresponding limiting matrix is positive definite.

Proof

Since {e_i, i ≥ 1} is a sequence of independent random variables, we have Cov(e_i, e_j) = 0 if i ≠ j, and hence condition (ii) of Theorem 4.1 holds. It follows from the conditions of Corollary 4.1 that the limiting matrix is positive definite. Thus the result follows from (29). □

Theorem 4.2

Assume the conditions of Theorem 4.1, and further assume that Conditions (C2)–(C5) hold. Then f̂_n(·) converges to f(·) in probability at a rate determined by ν through three cases, up to a factor that diverges at an arbitrarily slow rate. We can prove Theorem 4.2 by an argument similar to that for Theorem 3.2 of Hu et al. [12], so we omit the details. □

Theorem 4.3

Under the conditions of Theorem 4.2, the standardized wavelet estimator σ_n^{-1}(t) ( f̂_n(t) − f(t) ) converges in distribution to N(0, 1), where σ_n^2(t) is the variance of the leading stochastic term Σ_{i=1}^{n} e_i ∫_{A_i} E_m(t, s) ds.

Proof

Note that, from the proof of Theorem 3.2 in Hu et al. [12], the bias and remainder terms are negligible, and it therefore remains to prove the asymptotic normality of the leading stochastic term. Let a_{ni} = ∫_{A_i} E_m(t, s) ds; then, by Lemma 3.6 and (C5), Σ_{i=1}^{n} a_{ni}^2 = O(1) and max_{1≤i≤n} |a_{ni}| → 0, and condition (i) implies that {e_i^2, i ≥ 1} is a uniformly integrable family. Then, by Lemma 3.8 and (ii), we have the required convergence. The proof is completed. □

A simulation example

In this section, we perform a simulation example to verify the accuracy of Theorem 4.1 and Theorem 4.3. Consider a partially linear model of the form (1), where the error sequence {e_i} is NSD and is generated as follows. Starting from a sequence of independent and identically distributed random variables with a common probability mass function, the error vector constructed from it is NSD by Theorem 3.1 in Hu [25]. The parameter values are fixed, and the difference sequence is taken from Wang et al. [9]. We first evaluate the normal approximation for the DOLS estimator. Figures 1 and 2 show the results for two sample-size specifications. Panel 1 in Fig. 1 compares the empirical distribution function of the standardized estimator with the standard normal distribution function, and Panel 2 in Fig. 1 gives the corresponding QQ-plot. Figure 1 shows that the normal distribution approximates the distribution of the standardized estimator well even when the sample size is not large. Comparison of Fig. 2 with Fig. 1 indicates that the distribution approximation for the larger sample size is much more accurate than that for the smaller one.
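Below is a minimal Monte Carlo sketch in the spirit of this experiment (not the paper's exact setup: the design, the dimension, the Gaussian MA(1) construction of the NSD errors, the first-order difference weights and all numerical constants are assumptions made for illustration). It standardizes the replicated DOLS estimates of the first coefficient and reports the largest gap between their empirical distribution function and the standard normal distribution function; a small value indicates that the normal approximation of Theorem 4.1 is adequate.

import numpy as np
from math import erf, sqrt

def dols_beta(y, X):
    # first-order differencing with weights (1, -1)/sqrt(2), followed by ordinary least squares
    n = len(y)
    D = (np.eye(n - 1, n, k=0) - np.eye(n - 1, n, k=1)) / sqrt(2.0)
    Xt, yt = D @ X, D @ y
    return np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)

rng = np.random.default_rng(2)
n, reps, beta = 400, 2000, np.array([1.0, 2.0])
t = np.linspace(0, 1, n, endpoint=False)
f_true = np.sin(2 * np.pi * t)

est = np.empty(reps)
for r in range(reps):
    X = rng.normal(size=(n, 2))
    z = rng.normal(size=n + 1)
    e = z[1:] - 0.5 * z[:-1]       # Gaussian MA(1) with negative lag-one covariance: NA, hence NSD
    y = X @ beta + f_true + e
    est[r] = dols_beta(y, X)[0]

zscores = np.sort((est - est.mean()) / est.std())       # standardized replicates of the first coefficient
ecdf = np.arange(1, reps + 1) / reps
ncdf = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in zscores])
print("max |ECDF - Phi| =", np.abs(ecdf - ncdf).max())  # small gap supports the normal approximation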
Figure 1

A comparison of the fitted distribution functions and the corresponding QQ-plot for the standardized DOLS estimator (smaller sample size)

Figure 2

A comparison of the fitted distribution functions and the corresponding QQ-plot for the standardized DOLS estimator (larger sample size)

Choose the Daubechies scaling function as in Hu et al. [29]. Figures 3 and 4 show that the distribution of the standardized wavelet estimator gets closer and closer to the normal limit as the sample size increases.
Figure 3

A comparison of the fitted distribution functions and the corresponding QQ-plot for the standardized wavelet estimator (smaller sample size)

Figure 4

A comparison of the fitted distribution functions and the corresponding QQ-plot for the standardized wavelet estimator (larger sample size)


Conclusions

In this paper, we use a difference-based and ordinary least squares (DOLS) method to obtain the estimator of the unknown parametric component β of the partially linear model with dependent errors. In addition, we investigate the asymptotic normality of the DOLS estimator of β and of the wavelet estimator of the nonparametric component. Thus, we extend some results of Hu et al. [12] to the partially linear model with NSD errors. Furthermore, the class of NSD random variables contains negatively associated random variables. Therefore, it is an interesting subject to investigate further limit properties of the difference-based estimator for a partially linear model with NSD errors in future studies.
Related references in this database (4 in total):

1.  Partially Linear Models with Missing Response Variables and Error-prone Covariates.

Authors:  Hua Liang; Suojin Wang; Raymond J Carroll
Journal:  Biometrika       Date:  2007-03-01       Impact factor: 2.445

2.  A Partially Linear Framework for Massive Heterogeneous Data.

Authors:  Tianqi Zhao; Guang Cheng; Han Liu
Journal:  Ann Stat       Date:  2016-07-07       Impact factor: 4.028

3.  On the strong convergence for weighted sums of negatively superadditive dependent random variables.

Authors:  Bing Meng; Dingcheng Wang; Qunying Wu
Journal:  J Inequal Appl       Date:  2017-10-27       Impact factor: 2.491

4.  M-test in linear models with negatively superadditive dependent errors.

Authors:  Yuncai Yu; Hongchang Hu; Ling Liu; Shouyou Huang
Journal:  J Inequal Appl       Date:  2017-09-22       Impact factor: 2.491

