Pointwise Wavelet Estimations for a Regression Model in Local Hölder Space

This paper considers the problem of estimating an unknown function in a regression model with multiplicative and additive noise. A linear wavelet estimator is first constructed by a wavelet projection operator, and its convergence rate under the pointwise error is studied in local Hölder space. A nonlinear wavelet estimator is then provided by the hard thresholding method in order to obtain an adaptive estimator; its convergence rate matches that of the linear estimator up to a logarithmic factor. Finally, it should be pointed out that the convergence rates of both wavelet estimators are consistent with the optimal convergence rate of pointwise nonparametric estimation.


Introduction
The classical regression model plays an important role in many practical applications. The model is defined by Y_i = f(X_i) + ε_i, i ∈ {1, …, n}. The aim of this conventional regression model is to estimate the unknown regression function f(x) from the observed data (X_1, Y_1), …, (X_n, Y_n). For this classical regression model, many important and interesting results have been obtained by Hart [1], Kerkyacharian and Picard [2], Chesneau [3], Reiß [4], Yuan and Zhou [5], and Wang and Politis [6].
Recently, Chesneau et al. [7] studied the following regression model:

  Y_i = f(X_i) U_i + V_i,  i ∈ {1, …, n},   (1)

where (X_1, Y_1), …, (X_n, Y_n) are independent and identically distributed random variables, f is an unknown function defined on ∆ ⊆ R, U_1, …, U_n are identically distributed random variables, and X_1, …, X_n and V_1, …, V_n are identically distributed random variables. Moreover, X_i and U_i are independent, and U_i and V_i are independent, for any i ∈ {1, …, n}.
The aim of this model is to estimate the unknown function r(x) (r := f²) from the observed data (X_1, Y_1), …, (X_n, Y_n). Model (1) reduces to the classical regression model when U_i ≡ 1; in other words, (1) can be viewed as an extension of the classical regression problem. In addition, model (1) becomes the classical heteroscedastic regression model when V_i is a function of X_i (V_i = g(X_i)). In that case, the function r(x) (r := f²) is called the variance function of the heteroscedastic regression model, which plays a crucial role in the financial and economic fields (Cai and Wang [8], Alharbi and Patili [9]). Furthermore, the regression model (1) is also widely used in global positioning systems (Huang et al. [10]), image processing (Kravchenko et al. [11], Cui [12]), and so on. For this regression model, Chesneau et al. [7] proposed two wavelet estimators and discussed their convergence rates under the mean integrated squared error over Besov spaces. However, that study only considers the global error of the wavelet estimators; pointwise risk estimation for this model is lacking. In this paper, two new wavelet estimators are constructed, and their convergence rates under the pointwise error are studied in local Hölder space. More importantly, both wavelet estimators attain the optimal convergence rate under the pointwise error.
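To see why r = f² is the natural target, note that if the multiplicative noise is centred with unit second moment and independent of the additive noise, then E[Y² | X = x] = r(x) + E[V²], so r(x) can be recovered from second moments. A minimal Monte Carlo sketch of this identity at a fixed design point — the distributions, the function f, and the noise scale below are all illustrative assumptions, not taken from the paper:

```python
import numpy as np

# In the model Y_i = f(X_i) U_i + V_i with E[U] = 0, E[U^2] = 1, and U, V
# independent, E[Y^2 | X = x] = f(x)^2 + E[V^2], so subtracting E[V^2] from
# the second moment of Y recovers r(x) = f(x)^2.

rng = np.random.default_rng(0)
n = 200_000
x0 = 0.3
f = lambda x: np.sin(2 * np.pi * x)    # hypothetical regression function
sigma_V = 0.3                          # additive-noise scale (assumed known)

U = rng.standard_normal(n)             # multiplicative noise: E[U]=0, E[U^2]=1
V = sigma_V * rng.standard_normal(n)   # additive noise, independent of U
Y = f(x0) * U + V                      # n responses observed at the point x0

r_hat = np.mean(Y**2) - sigma_V**2     # moment estimator of r(x0) = f(x0)^2
# r_hat is close to f(x0)^2 ≈ 0.9045 for large n
```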

Assumptions, Local Hölder Space and Wavelets
In this paper, we will consider model (1) under the assumptions A1–A6. It is easy to see that A5 and A6 are opposing conditions. Hence, we define the following two sets of assumptions: H1 := {A1, A2, A3, A4, A5} and H2 := {A1, A2, A3, A4, A6}. Note that the difference between H1 and H2 lies in the relationship between V_i and X_i. Since the assumptions are separated into the two sets H1 and H2, estimators of the function r(x) must be constructed under each condition set separately.

This paper considers nonparametric pointwise estimation in local Hölder space, so we now introduce this space. Recall the classical Hölder condition: for δ ∈ (0, 1] and a fixed constant C > 0,

  |f(x) − f(y)| ≤ C |x − y|^δ,  x, y ∈ R,

and write H^δ(R) for the set of functions satisfying it. Let Ω_{x_0} be a neighborhood of x_0 ∈ R, and define the function space

  H^δ(Ω_{x_0}) := { f : |f(x) − f(y)| ≤ C |x − y|^δ for all x, y ∈ Ω_{x_0} },

where C > 0 is a fixed constant. Clearly, any f ∈ H^δ(R) must be contained in H^δ(Ω_{x_0}); however, the converse does not hold. For s = N + δ > 0 with δ ∈ (0, 1] and N ∈ N (the nonnegative integers), we define the local Hölder space

  H^s(Ω_{x_0}) := { f : f^{(N)} ∈ H^δ(Ω_{x_0}) }.

In order to construct wavelet estimators in later sections, we also introduce some basic wavelet theory.
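As a simple illustration of why the inclusion H^δ(R) ⊂ H^δ(Ω_{x_0}) is strict, the following example (not taken from the paper) takes δ = 1 and f(x) = x²:

```latex
% delta = 1, f(x) = x^2, Omega_{x_0} = (x_0 - 1, x_0 + 1):
|f(x) - f(y)| \;=\; |x + y|\,|x - y| \;\le\; 2\bigl(|x_0| + 1\bigr)\,|x - y|
  \qquad \text{for all } x, y \in \Omega_{x_0},
```

so f ∈ H^1(Ω_{x_0}) with C = 2(|x_0| + 1); but |x + y| is unbounded on R, so no single constant works globally and f ∉ H^1(R).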

Definition 1.
A multiresolution analysis (MRA) is a sequence of closed subspaces {V_j}_{j∈Z} of the square-integrable function space L²(R) satisfying the following properties: (i) V_j ⊆ V_{j+1} for each j ∈ Z; (ii) the union ∪_{j∈Z} V_j is dense in L²(R); (iii) f(2·) ∈ V_{j+1} if and only if f(·) ∈ V_j for each j ∈ Z; (iv) there exists φ ∈ L²(R) (a scaling function) such that {φ(· − k), k ∈ Z} forms an orthonormal basis of V_0 = span{φ(· − k), k ∈ Z}.
Let φ be a scaling function and ψ the corresponding wavelet function such that {φ_{j*,k}, ψ_{j,k} : j ≥ j*, k ∈ Z} constitutes an orthonormal basis of L²(R), where j* is a positive integer, φ_{j*,k}(x) = 2^{j*/2} φ(2^{j*} x − k) and ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k). In this paper, we choose the Daubechies wavelets. Then any h(x) ∈ H^s(Ω_{x_0}) has the expansion

  h(x) = Σ_{k∈Z} α_{j*,k} φ_{j*,k}(x) + Σ_{j=j*}^{∞} Σ_{k∈Z} β_{j,k} ψ_{j,k}(x),

where α_{j,k} = ⟨h, φ_{j,k}⟩ and β_{j,k} = ⟨h, ψ_{j,k}⟩. Further details can be found in Meyer [13] and Daubechies [14].
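As a concrete instance of the projection onto V_j, the following sketch uses the Haar scaling function φ = 1_{[0,1)} instead of the paper's Daubechies wavelets (an assumption made purely for simplicity); for Haar, the projection P_j h is just the average of h on each dyadic interval, and its L² error shrinks as the resolution level j grows:

```python
import numpy as np

# Haar instance of the wavelet projection: with phi_{j,k}(x) = 2^{j/2} phi(2^j x - k)
# and phi = 1_[0,1), the projection P_j h = sum_k <h, phi_{j,k}> phi_{j,k}
# replaces h by its average on each dyadic interval [k/2^j, (k+1)/2^j).

def haar_projection(h_vals, j):
    """Project samples of h on a dyadic grid of [0, 1) onto the Haar space V_j."""
    out = h_vals.copy()
    block = len(h_vals) // 2**j              # grid points per dyadic interval
    for k in range(2**j):
        out[k * block:(k + 1) * block] = h_vals[k * block:(k + 1) * block].mean()
    return out

x = np.linspace(0, 1, 1024, endpoint=False)
h = np.sin(2 * np.pi * x)                    # a smooth test function
err = [np.sqrt(np.mean((h - haar_projection(h, j))**2)) for j in (2, 4, 6)]
# the L2 error decreases as the resolution level j increases
```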
Let P_j be the orthogonal projection operator from L²(R) onto the space V_j with the orthonormal basis {φ_{j,k}, k ∈ Z}:

  P_j h(x) = Σ_{k∈Z} α_{j,k} φ_{j,k}(x).

At this point, we give an important lemma, which will be used in later discussions. Hereafter, we adopt the following notation: A ≲ B denotes A ≤ cB for some constant c > 0; A ≳ B means B ≲ A; A ∼ B stands for both A ≲ B and B ≲ A.

Lemma 1 (see [15]).

Linear Wavelet Estimator
In this section, a linear wavelet estimator is constructed by the wavelet method, and the pointwise convergence rate of this estimator is studied in local Hölder space. We define our linear wavelet estimator as

  r̂^lin_n(x) = Σ_{k∈Z} α̂_{j*,k} φ_{j*,k}(x),   (2)

where the coefficient estimator α̂_{j*,k} is given by (3). According to the definition of v_{j*,k}, it is clear that the structure of this linear wavelet estimator depends on the opposing conditions A5 and A6. Some of the lemmas needed in this section, together with their proofs, are given below.
Lemma 2. For model (1), if H1 or H2 hold, then E[α̂_{j*,k}] = α_{j*,k}.

Proof. According to the definition of α̂_{j*,k}, since U_i is independent of X_i and of V_i, respectively, the expectation factorizes. Condition A3 implies that E[U_1] = 0, so the cross term vanishes. Under H1, the claim then follows from A5, A2 and A4; under H2, it follows from A3 and A2. □

In order to estimate E|α̂_{j*,k} − α_{j*,k}|^p, we need the following Rosenthal inequality.

Rosenthal's inequality. Let X_1, …, X_n be independent random variables such that E[X_i] = 0 and |X_i| ≤ M (i = 1, 2, …, n). Then for p ≥ 2,

  E|Σ_{i=1}^n X_i|^p ≲ Σ_{i=1}^n E|X_i|^p + (Σ_{i=1}^n E[X_i²])^{p/2}.

Lemma 3. Let α̂_{j*,k} be defined by (3). If H1 or H2 hold and 2^{j*} ≤ n, then for 1 ≤ p < ∞,

  E|α̂_{j*,k} − α_{j*,k}|^p ≲ n^{−p/2}.

Proof. By (5) and the definition of α̂_{j*,k}, one can write α̂_{j*,k} − α_{j*,k} = (1/n) Σ_{i=1}^n Z_i with independent, centred random variables Z_i. Using the definition of Z_i and A1, there exists a constant c > 0 such that |Z_i| ≤ c 2^{j*/2}. Furthermore, it follows from A1 and the properties of φ_{j*,k} that E[Z_i²] ≲ 1, and hence E|Z_i|^p ≲ 2^{j*(p/2−1)} for p > 2. When p > 2, Rosenthal's inequality therefore gives

  E|α̂_{j*,k} − α_{j*,k}|^p ≲ n^{−p} (n · 2^{j*(p/2−1)} + n^{p/2}),

and this with 2^{j*} ≤ n implies E|α̂_{j*,k} − α_{j*,k}|^p ≲ n^{−p/2}. For 1 ≤ p ≤ 2, the bound follows from the case p = 2 and Jensen's inequality. □

The convergence rate of the linear wavelet estimator is established next.

Theorem 1.
Let r ∈ H^s(Ω_{x_0}) with s > 0. Then for each 1 ≤ p < ∞, the linear wavelet estimator r̂^lin_n(x) defined in (2) with 2^{j*} ∼ n^{1/(2s+1)} satisfies

  E|r̂^lin_n(x_0) − r(x_0)|^p ≲ n^{−ps/(2s+1)}.

Remark 1. Note that n^{−s/(2s+1)} is the optimal convergence rate over the pointwise error for nonparametric functional estimation (Brown and Low [16]). The above result shows that the linear wavelet estimator attains this optimal rate.
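The choice 2^{j*} ∼ n^{1/(2s+1)} can be motivated by the standard bias–variance balance; the following is a heuristic sketch, not the paper's proof (whose displayed bounds are not reproduced above):

```latex
\mathbb{E}\bigl|\hat r^{\mathrm{lin}}_n(x_0) - r(x_0)\bigr|^p
  \;\lesssim\; \underbrace{\Bigl(\tfrac{2^{j_*}}{n}\Bigr)^{p/2}}_{\text{stochastic term}}
  \;+\; \underbrace{2^{-j_* s p}}_{\text{bias term}},
\qquad
\Bigl(\tfrac{2^{j_*}}{n}\Bigr)^{1/2} = 2^{-j_* s}
  \;\Longleftrightarrow\; 2^{j_*} \sim n^{\frac{1}{2s+1}},
```

with both terms then of order n^{−ps/(2s+1)}, matching the optimal rate noted in Remark 1.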

Nonlinear Wavelet Estimator
According to the definition of the linear wavelet estimator, the scale parameter j* depends on the smoothness parameter s of the unknown function r(x), so the linear estimator is not adaptive. In this section, we solve this problem by constructing a nonlinear wavelet estimator with the hard thresholding method. We define our nonlinear wavelet estimator as

  r̂^non_n(x) = Σ_{k∈Z} α̂_{j*,k} φ_{j*,k}(x) + Σ_{j=j*}^{j_1} Σ_{k∈Z} β̂_{j,k} I{|β̂_{j,k}| ≥ κ t_n} ψ_{j,k}(x),   (16)

where α̂_{j*,k} is defined by (3), β̂_{j,k} is defined by (17), t_n = √(ln n / n), and I_G denotes the indicator function of an event G. The integers j*, j_1 and the constant κ will be given in Theorem 2.
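The hard thresholding rule keeps an empirical detail coefficient only when its magnitude exceeds κ t_n. A minimal sketch of this rule, with illustrative values of κ and of the coefficients (the paper's β̂_{j,k} is not reproduced here):

```python
import numpy as np

# Hard-thresholding rule of the nonlinear estimator: an empirical detail
# coefficient beta_hat is kept only when |beta_hat| >= kappa * t_n, with
# t_n = sqrt(ln n / n). kappa = 2 and the coefficient values are illustrative.

def hard_threshold(beta_hat, n, kappa=2.0):
    t_n = np.sqrt(np.log(n) / n)                     # threshold level
    return np.where(np.abs(beta_hat) >= kappa * t_n, beta_hat, 0.0)

beta_hat = np.array([0.5, 0.01, -0.3, 0.002])
kept = hard_threshold(beta_hat, n=1000)              # kappa * t_n ≈ 0.166
# kept == [0.5, 0.0, -0.3, 0.0]: small coefficients are set to zero
```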

Remark 2.
Compared with the structure of β̂_{j,k} in Chesneau et al. [7], the definition of β̂_{j,k} in this paper does not need a thresholding algorithm. In other words, this paper reduces the complexity of the nonlinear wavelet estimator.

Lemma 4. For model (1), if H1 or H2 hold, then E[β̂_{j,k}] = β_{j,k}.

Lemma 5. Let β̂_{j,k} be defined by (17). If H1 or H2 hold and 2^j ≤ n, then for 1 ≤ p < ∞,

  E|β̂_{j,k} − β_{j,k}|^p ≲ n^{−p/2}.

The proofs of Lemmas 4 and 5 are similar to those of Lemmas 2 and 3, and are therefore omitted. For nonlinear wavelet estimation, Bernstein's inequality plays a crucial role.
Bernstein's inequality. Let X_1, …, X_n be independent random variables such that E[X_i] = 0, |X_i| ≤ M and E[X_i²] = σ². Then for each v > 0,

  P(|(1/n) Σ_{i=1}^n X_i| ≥ v) ≤ 2 exp(− n v² / (2(σ² + vM/3))).

Lemma 6. Let β̂_{j,k} be defined by (17), t_n = √(ln n / n) and 2^j ≤ n / ln n. If H1 or H2 hold, then for each w > 0, there exists a constant κ > 1 such that

  P(|β̂_{j,k} − β_{j,k}| ≥ κ t_n) ≲ n^{−w}.

Proof. According to the definition of β̂_{j,k}, one can write β̂_{j,k} − β_{j,k} as an average of independent, centred and bounded random variables. Using Bernstein's inequality together with t_n = √(ln n / n) and 2^j ≤ n / ln n, the tail probability is bounded by a power of n whose (negative) exponent increases with κ.
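Bernstein's inequality can be checked numerically. A minimal Monte Carlo sketch, with the illustrative choice X_i ~ Uniform[−1, 1] (so M = 1, σ² = 1/3), not taken from the paper:

```python
import numpy as np

# Monte Carlo sanity check of Bernstein's inequality: for independent, centred
# X_i with |X_i| <= M and E[X_i^2] = sigma^2,
#   P(|n^{-1} sum X_i| >= v) <= 2 exp(-n v^2 / (2 (sigma^2 + v M / 3))).
# X_i ~ Uniform[-1, 1] gives M = 1 and sigma^2 = 1/3.

rng = np.random.default_rng(1)
n, reps, v = 500, 10_000, 0.1
M, sigma2 = 1.0, 1.0 / 3.0

means = rng.uniform(-1, 1, size=(reps, n)).mean(axis=1)     # reps sample means
empirical = np.mean(np.abs(means) >= v)                     # observed tail freq.
bound = 2 * np.exp(-n * v**2 / (2 * (sigma2 + v * M / 3)))  # Bernstein bound
# the empirical tail frequency stays below the bound (≈ 0.0022)
```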
Then one chooses κ > 1 large enough that this exponent is at most −w, which gives the stated bound. □

Theorem 2. Let r ∈ H^s(Ω_{x_0}) with s > 0. Then for each 1 ≤ p < ∞, the nonlinear wavelet estimator r̂^non_n(x) defined in (16) with 2^{j*} ∼ n^{1/(2m+1)} (s < m) and 2^{j_1} ∼ n / ln n satisfies

  E|r̂^non_n(x_0) − r(x_0)|^p ≲ (ln n / n)^{ps/(2s+1)}.

Remark 3. Compared with the linear wavelet estimator, the nonlinear wavelet estimator does not depend on the smoothness parameter of r(x); hence, the nonlinear estimator is adaptive. More importantly, the nonlinear estimator also achieves the optimal convergence rate up to an ln n factor.
Hence, choosing j' such that 2^{j'} ∼ (n / ln n)^{1/(2s+1)}, one clearly has 2^{j*} ∼ n^{1/(2m+1)} ≤ 2^{j'} ≤ 2^{j_1} ∼ n / ln n. Similar to the argument of (15), one bounds the linear term by Lemma 3, and the remaining terms are controlled by Lemmas 5 and 6. Then, according to (29), the proof of Theorem 2 is completed. □

Conclusions
This paper studies the pointwise estimation of an unknown function in a regression model with multiplicative and additive noise. Under different sets of assumptions, linear and nonlinear wavelet estimators are constructed; clearly, these estimators take different forms under the different conditions. The convergence rates of the two wavelet estimators under the pointwise risk are given by Theorems 1 and 2. It should be pointed out that both the linear and nonlinear wavelet estimators attain the optimal convergence rate of pointwise nonparametric estimation; more importantly, the nonlinear wavelet estimator is adaptive. In other words, the asymptotic and theoretical performance of the estimators is clear. However, providing numerical experiments remains a difficult problem that requires further investigation and new techniques; we will study it in future work.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.