Article

Scalar-on-Function Relative Error Regression for Weak Dependent Case

by Zouaoui Chikr Elmezouar 1, Fatimah Alshahrani 2, Ibrahim M. Almanjahie 1, Zoulikha Kaid 1, Ali Laksaci 1 and Mustapha Rachdi 3,*

1 Department of Mathematics, College of Science, King Khalid University, Abha 62223, Saudi Arabia
2 Department of Mathematical Sciences, College of Science, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
3 Laboratory AGEIS, University of Grenoble Alpes, UFR SHS, BP. 47, Cedex 09, F38040 Grenoble, France
* Author to whom correspondence should be addressed.
Axioms 2023, 12(7), 613; https://doi.org/10.3390/axioms12070613
Submission received: 15 April 2023 / Revised: 6 June 2023 / Accepted: 16 June 2023 / Published: 21 June 2023
(This article belongs to the Special Issue Theory of Functions and Applications)

Abstract

Analyzing the co-variability between the Hilbert regressor and the scalar output variable is crucial in functional statistics. In this contribution, the kernel smoothing of the Relative Error Regression (RE-regression) is used to address this problem. Precisely, we use the relative square error to establish an estimator of the Hilbertian regression. For the asymptotic results, the Hilbertian observations are assumed to be quasi-associated, and we demonstrate the almost complete consistency of the constructed estimator. The feasibility of this Hilbertian model as a predictor for functional time series data is discussed. Moreover, we give some practical ideas for selecting the smoothing parameter based on the bootstrap procedure. Finally, an empirical investigation is performed to examine the behavior of the RE-regression estimation and its superiority in practice.

1. Introduction

This paper focuses on nonparametric prediction for Hilbertian data, which is an intriguing area of research within functional statistics. Various approaches exist for modeling the relationship between the input Hilbertian variable and the output real variable. Typically, this relationship is modeled through a regression model, where the regression operators are estimated using the least square error. However, this rule is not relevant in some practical cases. Instead, we consider in this paper the relative square error. The primary advantage of this regression is the possibility of reducing the effect of outliers. This kind of relative error is used as a performance measure in many practical situations, notably in time series forecasting. The literature on its nonparametric analysis is limited; most existing works consider a parametric approach. In particular, Narula and Wellington [1] were the first to investigate the use of the relative square error in the estimation method. For practical purposes, relative regression has been applied in areas such as medicine by Chatfield [2] and financial data by Chen et al. [3]. Yang and Ye [4] considered the estimation by RE-regression in multiplicative regression models. Jones et al. [5] also focused on this model, but with a nonparametric estimation method, and stated the convergence of the local linear estimator obtained with the relative error as a loss function. The RE-regression estimation has been deeply studied for time series data in the last few years, specifically by Mechab and Laksaci [6] for quasi-associated time series and Attouch et al. [7] for spatial processes. The nonparametric Hilbertian RE-regression was first developed by Demongeot et al. [8], who focused on strong consistency and gave the asymptotic law of the RE-regression. To summarize, functional statistics is an attractive subject in mathematical statistics; the reader may refer to survey papers such as [9,10,11,12,13,14,15] for recent advances and trends in functional data analysis and/or functional time series analysis.
In this article, we focus on the Hilbertian RE-regression for weakly dependent functional time series data. In particular, the correlation of our observations is modeled by the quasi-association assumption. This setting covers many important Hilbertian time series cases, such as linear and Gaussian processes, as well as positively and negatively associated processes. Our ambition in this contribution is to build a new Hilbertian predictor for Hilbertian time series. This predictor is defined as the ratio of the first and the second inverted conditional moments. We use this explicit expression to construct two estimators based on kernel smoothing and/or the k-Nearest Neighbors (kNN) method. We prove the strong consistency of the constructed estimators, which provides good mathematical support for their use in practice. Treating the functional RE-regression by the kNN method under the quasi-association assumption is a substantial theoretical development that requires nonstandard mathematical tools and techniques. On the one hand, it is well known that establishing asymptotic properties for the kNN method is more difficult than for the classical kernel estimation because of the random nature of the bandwidth parameter. On the other hand, the weak dependence structure of our functional time series data requires additional techniques and mathematical tools, alternative to those used in the mixing case. This theoretical development is very useful in practice because the kNN estimator is more accurate than the kernel method, and the quasi-association structure is sufficiently weak to cover a large class of functional time series data. Furthermore, the applicability of this estimator is highlighted by giving some selection procedures to determine the parameters involved in the estimator. Then, real data are used to emphasize the superiority and impact of this contribution in practice.
This paper is organized as follows. We introduce the estimation algorithms in Section 2. The required conditions, as well as the main asymptotic results, are given in Section 3. We discuss some selectors for the smoothing parameter in Section 4. The constructed estimator’s performance on artificial data is evaluated in Section 5. Finally, we state our conclusion in Section 6 and prove the technical results in Appendix A.

2. The RE-Regression Model and Its Estimation

As discussed in the introduction, we aim to evaluate the relationship between an exogenous Hilbertian variable X and a real endogenous variable Y. Specifically, the pair $(X, Y)$ belongs to $\mathcal{H} \times \mathbb{R}$, where $\mathcal{H}$ is a separable Hilbert space. We assume that the norm $\|\cdot\|$ on $\mathcal{H}$ is associated with the inner product $\langle \cdot, \cdot \rangle$, and we fix a complete orthonormal basis $(e_k)_{k \ge 1}$ of $\mathcal{H}$. In addition, we suppose that Y is strictly positive and that the conditional moments $\mathbb{E}[Y^{-1} \mid X]$ and $\mathbb{E}[Y^{-2} \mid X]$ exist and are almost surely finite. The RE-regression is defined by
$$R(x) = \arg\min_{\theta}\; \mathbb{E}\left[\left(\frac{Y - \theta}{Y}\right)^{2} \,\middle|\, X = x\right]. \qquad (1)$$
By differentiating with respect to θ , we prove that
$$R(x) = \frac{\mathbb{E}\left[Y^{-1} \mid X = x\right]}{\mathbb{E}\left[Y^{-2} \mid X = x\right]}. \qquad (2)$$
Clearly, the RE-regression $R(\cdot)$ is a good alternative to the traditional regression in the sense that the traditional regression, based on the least square error, treats all variables with equal weight. This is inadequate when the observations contain some outliers, and the traditional regression can then lead to irrelevant results. Thus, the main advantage of the RE-regression $R(\cdot)$ compared to the traditional regression is the possibility of reducing the effect of the outliers (see Equation (1)). So, we can say that robustness is one of the main advantages of the RE-regression. Additionally, unlike the classical robust regression (M-regression), the RE-regression is very easy to implement in practice: it has an explicit definition based on the ratio of the first and the second inverted conditional moments (see Equation (2)).
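For completeness, the passage from (1) to (2) is the following elementary computation (our reconstruction of the standard argument). Expanding the relative squared loss gives

$$\mathbb{E}\left[\left(\frac{Y - \theta}{Y}\right)^2 \,\middle|\, X = x\right] = 1 - 2\theta\, \mathbb{E}\!\left[Y^{-1} \mid X = x\right] + \theta^2\, \mathbb{E}\!\left[Y^{-2} \mid X = x\right],$$

and setting the derivative in θ, namely $-2\,\mathbb{E}[Y^{-1} \mid X = x] + 2\theta\,\mathbb{E}[Y^{-2} \mid X = x]$, equal to zero yields the ratio in Equation (2); since the second derivative $2\,\mathbb{E}[Y^{-2} \mid X = x]$ is strictly positive, this critical point is indeed the minimizer.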
Now, consider $(X_i, Y_i)_{i = 1, \dots, n}$ strictly stationary observations, distributed as the pair $(X, Y)$. The Hilbertian time series framework of the present contribution is carried out using the quasi-association setting (see Douge [16] for the definition of quasi-association in Hilbert spaces). We use the kernel estimators of the inverse moments $\mathbb{E}[Y^{-1} \mid X]$ and $\mathbb{E}[Y^{-2} \mid X]$, that is, the conditional expectations of $Y^{-\gamma}$ ($\gamma = 1, 2$) given $X = x$, to estimate $R(x)$ by
$$\widetilde{R}(x) = \frac{\sum_{i=1}^{n} Y_i^{-1}\, K\!\left(\frac{\|x - X_i\|}{h_n}\right)}{\sum_{i=1}^{n} Y_i^{-2}\, K\!\left(\frac{\|x - X_i\|}{h_n}\right)},$$
where $h_n$ is a positive sequence of real numbers and K is a real-valued function called the kernel. The choice of $h_n$ is the determining issue for the applicability of the estimator $\widetilde{R}$. A common solution is to combine kernel smoothing with the kNN estimation, for which
$$\widehat{R}(x) = \frac{\sum_{i=1}^{n} Y_i^{-1}\, K\!\left(\frac{\|x - X_i\|}{A_{n,k}(x)}\right)}{\sum_{i=1}^{n} Y_i^{-2}\, K\!\left(\frac{\|x - X_i\|}{A_{n,k}(x)}\right)},$$
where
$$A_{n,k}(x) = \min\left\{ a_n > 0 \;:\; \sum_{i=1}^{n} \mathbb{1}_{B(x, a_n)}(X_i) = k \right\},$$
where $B(x, a_n)$ is the open ball of radius $a_n > 0$ centered at x. In $\widehat{R}$, the smoothing parameter is the number k of neighbors. Once again, the selection of k is crucial.
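To fix ideas, the following short sketch (not the authors' code; all names and choices, such as the quadratic kernel and the discretized L2 distance between curves, are our illustrative assumptions) shows how the kernel estimator $\widetilde{R}$ and its kNN counterpart $\widehat{R}$ can be computed from a sample of discretized curves.

```python
import numpy as np

def quadratic_kernel(u):
    """K(u) = (3/2)(1 - u^2) on [0, 1), and 0 elsewhere (the kernel used later in Section 5)."""
    u = np.asarray(u, dtype=float)
    return 1.5 * (1.0 - u**2) * ((u >= 0.0) & (u < 1.0))

def l2_dist(X, x):
    """Discretized L2 distance between each sample curve (row of X) and the curve x."""
    return np.sqrt(((X - x) ** 2).sum(axis=1))

def re_regression_kernel(x, X, Y, h):
    """Kernel RE-regression: ratio of kernel-weighted sums of Y^{-1} and Y^{-2}."""
    w = quadratic_kernel(l2_dist(X, x) / h)
    num = np.sum(w / Y)          # sum_i Y_i^{-1} K(||x - X_i|| / h)
    den = np.sum(w / Y**2)       # sum_i Y_i^{-2} K(||x - X_i|| / h)
    return num / den if den > 0 else np.nan

def re_regression_knn(x, X, Y, k):
    """kNN version: the bandwidth A_{n,k}(x) is the smallest radius containing k curves."""
    d = l2_dist(X, x)
    a_nk = np.sort(d)[k - 1] + 1e-12
    w = quadratic_kernel(d / a_nk)
    return np.sum(w / Y) / np.sum(w / Y**2)
```

Both functions assume a strictly positive response Y, as required by the model.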

3. The Consistency of the Kernel Estimator

We demonstrate the almost complete convergence of $\widetilde{R}(\cdot)$ to $R(\cdot)$ at a fixed point x in $\mathcal{H}$. Hereafter, $N_x$ denotes a given neighborhood of x, and $C_1, C_2, C, \dots$ are strictly positive constants. In the sequel, we put $K_i(x) = K(h_n^{-1}\|x - X_i\|)$, $i = 1, \dots, n$, and $R_\gamma(u) = \mathbb{E}[Y^{-\gamma} \mid X = u]$, $\gamma = 1, 2$, and we set
$$\lambda_k := \sup_{s \ge k} \sum_{|i - j| \ge s} \left( \sum_{l=1}^{\infty}\sum_{m=1}^{\infty} \left|\mathrm{Cov}\!\left(X_i^{l}, X_j^{m}\right)\right| + \sum_{l=1}^{\infty} \left|\mathrm{Cov}\!\left(X_i^{l}, Y_j\right)\right| + \sum_{l=1}^{\infty} \left|\mathrm{Cov}\!\left(Y_i, X_j^{l}\right)\right| + \left|\mathrm{Cov}(Y_i, Y_j)\right| \right),$$
where $X_i^{k} := \langle X_i, e_k \rangle$. Moreover, we assume the following conditions:
(D1)
For all $d > 0$, $\phi_x(d) := \mathbb{P}(X \in B(x, d)) > 0$, and $\lim_{d \to 0} \phi_x(d) = 0$.
(D2)
For all $(x_1, x_2) \in N_x^2$,
$$\left|R_\gamma(x_1) - R_\gamma(x_2)\right| \le C\, \|x_1 - x_2\|^{k_\gamma}, \quad \text{for some } k_1, k_2 > 0.$$
(D3)
The covariance coefficients $(\lambda_k)_{k \in \mathbb{N}}$ satisfy $\lambda_k \le C\, e^{-a k}$, for some $a > 0$ and $C > 0$.
(D4)
K is a Lipschitzian kernel function with support $(0, 1)$ satisfying
$$0 < C_2 \le K(\cdot) \le C_3 < \infty.$$
(D5)
The endogenous variable Y satisfies
$$\mathbb{E}\left[\exp\!\left(|Y|^{-\gamma_1}\right)\right] < C \quad \text{and} \quad \forall i \ne j, \;\; \mathbb{E}\left[\,\left|Y_i^{-\gamma_2}\, Y_j^{-\gamma_3}\right| \,\middle|\, X_i, X_j\right] \le C < \infty, \quad \text{for } \gamma_1, \gamma_2, \gamma_3 \in \{1, 2\}.$$
(D6)
For all $i \ne j$,
$$0 < \sup_{i \ne j} \mathbb{P}\left((X_i, X_j) \in B(x, d) \times B(x, d)\right) \le C\, \phi_x^{(a+1)/a}(d).$$
(D7)
There exist $\xi \in (0, 1)$, $\xi_1 \in (0, 1 - \xi)$, and $\xi_2 \in (0, a^{-1})$ such that
$$\frac{\log^5 n}{n^{1 - \xi - \xi_1}} \le \phi_x(h_n) \le \frac{1}{\log^{1 + \xi_2} n}.$$
  • Brief comment on the conditions: The required conditions stated above are standard in the context of Hilbertian time series analysis, and they cover the fundamental axes of this contribution. The functional path of the data is explored through condition (D1), the nonparametric nature of the model is characterized by (D2), and the correlation degree of the Hilbertian time series is controlled by conditions (D3) and (D6). The principal ingredients of the estimator, namely the kernel and the bandwidth parameter, are handled through conditions (D4), (D5), and (D7). These conditions are of a technical nature; they allow us to retain the usual convergence rate of nonparametric Hilbertian time series analysis.
Theorem 1.
Based on the conditions (D1)–(D7), we get
$$\left|\widetilde{R}(x) - R(x)\right| = O\!\left(h_n^{k_0}\right) + O_{a.co.}\!\left(\sqrt{\frac{\log n}{n^{1 - \xi}\,\phi_x(h_n)}}\right),$$
where k 0 = min ( k 1 , k 2 ) .
Proof of Theorem 1.
Firstly, we write
$$\widetilde{R}(x) = \frac{\widetilde{R_N}(x)}{\widetilde{R_D}(x)},$$
where
$$\widetilde{R_N}(x) = \frac{1}{n\, \mathbb{E}\!\left[K\!\left(h_n^{-1}\|x - X_1\|\right)\right]} \sum_{i=1}^{n} Y_i^{-1}\, K\!\left(h_n^{-1}\|x - X_i\|\right),$$
and
$$\widetilde{R_D}(x) = \frac{1}{n\, \mathbb{E}\!\left[K\!\left(h_n^{-1}\|x - X_1\|\right)\right]} \sum_{i=1}^{n} Y_i^{-2}\, K\!\left(h_n^{-1}\|x - X_i\|\right).$$
We use a basic decomposition (see Demongeot et al. [8]) to deduce that Theorem 1 is a consequence of the lemmas below. □
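For the reader's convenience, the decomposition alluded to above can be written as follows (our reconstruction of the standard argument of Demongeot et al. [8]):

$$\widetilde{R}(x) - R(x) = \frac{1}{\widetilde{R_D}(x)}\Big\{\big(\widetilde{R_N}(x) - \mathbb{E}[\widetilde{R_N}(x)]\big) + \big(\mathbb{E}[\widetilde{R_N}(x)] - R_1(x)\big)\Big\} + \frac{R(x)}{\widetilde{R_D}(x)}\Big\{\big(R_2(x) - \mathbb{E}[\widetilde{R_D}(x)]\big) + \big(\mathbb{E}[\widetilde{R_D}(x)] - \widetilde{R_D}(x)\big)\Big\},$$

so Lemma 1 controls the stochastic terms, Lemma 2 controls the bias terms, and Corollary 1 keeps the denominator $\widetilde{R_D}(x)$ away from zero in the almost complete sense.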
Lemma 1.
Using the conditions (D1) and (D3)–(D7), we get
$$\left|\widetilde{R_N}(x) - \mathbb{E}\!\left[\widetilde{R_N}(x)\right]\right| = O_{a.co.}\!\left(\sqrt{\frac{\log n}{n^{1 - \xi}\,\phi_x(h_n)}}\right),$$
and
$$\left|\widetilde{R_D}(x) - \mathbb{E}\!\left[\widetilde{R_D}(x)\right]\right| = O_{a.co.}\!\left(\sqrt{\frac{\log n}{n^{1 - \xi}\,\phi_x(h_n)}}\right).$$
Lemma 2.
Under conditions (D1),(D2), (D4), and (D7), we get
$$\mathbb{E}\!\left[\widetilde{R_N}(x)\right] - R_1(x) = O\!\left(h_n^{k_1}\right),$$
and
$$\mathbb{E}\!\left[\widetilde{R_D}(x)\right] - R_2(x) = O\!\left(h_n^{k_2}\right).$$
Corollary 1.
Using the conditions of Theorem 1, we obtain
$$\sum_{n=1}^{\infty} \mathbb{P}\!\left(\widetilde{R_D}(x) < \frac{R_2(x)}{2}\right) < \infty.$$
Next, to prove the consistency of R ^ ( x ) , we adopt the following postulates:
(K1)
K ( · ) has a bounded derivative on [ 0 , 1 ] ;
(K2)
The function $\phi_x(\cdot)$ can be written as
$$\phi_x(a) = \phi(a)\, L(x) + O\!\left(a^{\alpha} \phi(a)\right) \quad \text{and} \quad \lim_{a \to 0} \frac{\phi(u a)}{\phi(a)} = \zeta(u),$$
where $L(\cdot)$ and $\zeta$ are positive bounded functions, and $\phi$ is an invertible function;
(K3)
There exist $\xi \in (0, 1)$ and $\xi_1, \xi_2 > 0$ such that
$$n^{\xi + \xi_1}\, \log^5 n \;\le\; k \;\le\; \frac{n}{\log^{1 - \xi_2} n}.$$
Theorem 2.
Under conditions (D1)–(D6) and (K1)–(K3), we have
$$\left|\widehat{R}(x) - R(x)\right| = O\!\left(\left(\phi^{-1}\!\left(\frac{k}{n}\right)\right)^{k_0}\right) + O_{a.co.}\!\left(\sqrt{\frac{n^{\xi}\,\log n}{k}}\right).$$
Proof of Theorem 2.
Similarly to Theorem 1, write
$$\widehat{R}(x) = \frac{\widehat{R_N}(x)}{\widehat{R_D}(x)},$$
where
$$\widehat{R_N}(x) = \frac{1}{n\, \mathbb{E}\!\left[K\!\left(A_{n,k}^{-1}\|x - X_1\|\right)\right]} \sum_{i=1}^{n} Y_i^{-1}\, K\!\left(A_{n,k}^{-1}\|x - X_i\|\right),$$
and
$$\widehat{R_D}(x) = \frac{1}{n\, \mathbb{E}\!\left[K\!\left(A_{n,k}^{-1}\|x - X_1\|\right)\right]} \sum_{i=1}^{n} Y_i^{-2}\, K\!\left(A_{n,k}^{-1}\|x - X_i\|\right),$$
and we define, for a sequence $\beta_n \in (0, 1)$ such that $\beta_n - 1 = O\!\left(\left(\phi^{-1}\!\left(\frac{k}{n}\right)\right)^{k_0} + \sqrt{\frac{n^{\xi}\log n}{k}}\right)$, the bandwidths $h_n^{-} = \phi^{-1}\!\left(\frac{\beta_n k}{n}\right)$ and $h_n^{+} = \phi^{-1}\!\left(\frac{k}{n\,\beta_n}\right)$. Using standard arguments (see Bouzebda et al. [17]), we deduce that Theorem 2 is the outcome of Theorem 1 and the two lemmas below. □
Lemma 3.
Under the conditions of Theorem 2, we have
$$\left|\frac{\sum_{i=1}^{n} K\!\left((h_n^{-})^{-1}\|x - X_i\|\right)}{\sum_{i=1}^{n} K\!\left((h_n^{+})^{-1}\|x - X_i\|\right)} - \beta_n\right| = O\!\left(\left(\phi^{-1}\!\left(\frac{k}{n}\right)\right)^{k_0}\right) + O_{a.co.}\!\left(\sqrt{\frac{n^{\xi}\,\log n}{k}}\right).$$
Lemma 4.
Based on the conditions of Theorem 2, we obtain
$$\mathbb{1}_{\left\{h_n^{-} \,\le\, \phi^{-1}\left(\frac{k}{n}\right) \,\le\, h_n^{+}\right\}} \longrightarrow 1, \quad a.co.$$
Corollary 2.
Using the conditions of Theorem 2, we get
$$\left|\widehat{R_N}(x) - R_1(x)\right| = O\!\left(\left(\phi^{-1}\!\left(\frac{k}{n}\right)\right)^{k_0}\right) + O_{a.co.}\!\left(\sqrt{\frac{n^{\xi}\,\log n}{k}}\right),$$
and
$$\left|\widehat{R_D}(x) - R_2(x)\right| = O\!\left(\left(\phi^{-1}\!\left(\frac{k}{n}\right)\right)^{k_0}\right) + O_{a.co.}\!\left(\sqrt{\frac{n^{\xi}\,\log n}{k}}\right).$$

4. Smoothing Parameter Selection

The applicability of the estimator is related to the selection of the parameters used in its construction. In particular, the bandwidth parameter $h_n$ has a decisive effect on the implementation of this regression in practice. In the literature on nonparametric regression analysis, there are several ways to address this issue. In this paper, we adapt two approaches that are common in classical regression to the relative one. The two selections are the cross-validation rule and the bootstrap algorithm.

4.1. Leave-One-Out Cross-Validation Principle

In classical regression, the leave-one-out cross-validation rule is based on the mean square error. This criterion has been employed for predicting Hilbertian time series by several authors in the past (see Ferraty and Vieu [18] for some references). The leave-one-out cross-validation rule is easy to execute and has shown good behavior in practice. However, it is a relatively time-consuming rule. We overcome this inconvenience by reducing the cardinality of the optimization set of the rule. Thus, we adapt this rule to this kind of regression analysis. Specifically, we consider a subset of smoothing parameters (resp. of numbers of neighbors) $H_n$ (resp. $K_n$), and we select the best bandwidth parameter as follows.
$$h_n^{opt} = \arg\min_{h_n \in H_n} \sum_{i=1}^{n} \frac{\left(Y_i - \widetilde{R}_{-i}(X_i)\right)^2}{Y_i^2} \qquad (7)$$
or
$$k^{opt} = \arg\min_{k \in K_n} \sum_{i=1}^{n} \frac{\left(Y_i - \widehat{R}_{-i}(X_i)\right)^2}{Y_i^2},$$
where $\widetilde{R}_{-i}(X_i)$ (resp. $\widehat{R}_{-i}(X_i)$) is the leave-one-out estimator of $\widetilde{R}$ (resp. $\widehat{R}$), calculated without the observation $(X_i, Y_i)$. It is worth noting that the efficiency of this estimator is also linked to the determination of the subset $H_n$ over which the rule (7) is optimized. Often, we distinguish two cases, the local case and the global case. In the local one, the subset $H_n$ is defined with respect to the number of neighbors of the location point. In the global case, the subset $H_n$ consists of quantiles of the vector of distances between the Hilbertian regressors. The choice of $K_n$ is easier: it suffices to take $K_n$ as a subset of the positive integers. This selection procedure has shown good behavior in practice, but there is no theoretical result concerning its asymptotic optimality. This will be a significant prospect for the future.
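As an illustration (a minimal sketch under our own naming conventions, not the authors' implementation), the leave-one-out cross-validation rule with the relative-error loss can be coded as follows, reusing the function re_regression_kernel sketched in Section 2.

```python
import numpy as np

def loo_cv_bandwidth(X, Y, H_n):
    """Return the bandwidth in H_n minimizing sum_i (Y_i - R_{-i}(X_i))^2 / Y_i^2."""
    n = len(Y)
    best_h, best_crit = None, np.inf
    for h in H_n:
        crit = 0.0
        for i in range(n):
            keep = np.arange(n) != i                    # leave out the pair (X_i, Y_i)
            pred = re_regression_kernel(X[i], X[keep], Y[keep], h)
            crit += (Y[i] - pred) ** 2 / Y[i] ** 2      # relative squared error
        if crit < best_crit:
            best_h, best_crit = h, crit
    return best_h

# A possible "global" grid H_n: quantiles of the vector of pairwise L2 distances.
# dists = np.array([l2_dist(X, X[i]) for i in range(len(X))]).ravel()
# H_n = np.quantile(dists[dists > 0], [0.05, 0.10, 0.20, 0.30, 0.50])
```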

4.2. Bootstrap Approach

In addition to the leave-one-out cross-validation rule, the bootstrap method constitutes another important selection method. The principle of the latter is based on the plug-in estimation of the quadratic error. In the rest of this subsection, we describe the principal steps of this selection procedure.
Step 1.
We choose an arbitrary bandwidth h 0 (resp. k 0 ), and we calculate R ˜ h 0 ( x ) (resp. R ^ k 0 ( x ) ).
Step 2.
We compute the residuals $\tilde{\epsilon} = Y - \widetilde{R}_{h_0}(x)$ (resp. $\hat{\epsilon} = Y - \widehat{R}_{k_0}(x)$).
Step 3.
We draw a sample of residuals $\tilde{\epsilon}^{*}$ (resp. $\hat{\epsilon}^{*}$) from the distribution
$$G = \frac{\sqrt{5} + 1}{2\sqrt{5}}\, \delta_{\tilde{\epsilon}\,(1 - \sqrt{5})/2} + \frac{\sqrt{5} - 1}{2\sqrt{5}}\, \delta_{\tilde{\epsilon}\,(1 + \sqrt{5})/2}$$
$$\left(\text{resp.}\;\; G = \frac{\sqrt{5} + 1}{2\sqrt{5}}\, \delta_{\hat{\epsilon}\,(1 - \sqrt{5})/2} + \frac{\sqrt{5} - 1}{2\sqrt{5}}\, \delta_{\hat{\epsilon}\,(1 + \sqrt{5})/2}\right),$$
where δ is the Dirac measure (see Hardle and Marron [19] for more details).
Step 4.
We reconstruct the sample $(Y_i^{*}, X_i)_i = \left(\tilde{\epsilon}_i^{*} + \widetilde{R}_{h_0}(X_i),\, X_i\right)_i$ (resp. $(Y_i^{**}, X_i)_i = \left(\hat{\epsilon}_i^{*} + \widehat{R}_{k_0}(X_i),\, X_i\right)_i$).
Step 5.
We use the sample $(Y_i^{*}, X_i)_i$ to calculate $\widetilde{R}_{h_0}(X_i)$ and the sample $(Y_i^{**}, X_i)_i$ to calculate $\widehat{R}_{k_0}(X_i)$.
Step 6.
We repeat the previous steps $N_B$ times and denote by $\widetilde{R}_{h_0}^{r}(X_i)$ (resp. $\widehat{R}_{k_0}^{r}(X_i)$) the estimators obtained at replication r.
Step 7.
We select h (resp. k) according to the criteria
$$h_{opt}^{Boo} = \arg\min_{h \in H_n} \sum_{r=1}^{N_B} \sum_{i=1}^{n} \left(\widetilde{R}_{h}^{\,r}(X_i) - \widetilde{R}_{h_0}^{\,r}(X_i)\right)^2, \qquad (8)$$
and
$$k_{opt}^{Boo} = \arg\min_{k \in K_n} \sum_{r=1}^{N_B} \sum_{i=1}^{n} \left(\widehat{R}_{k}^{\,r}(X_i) - \widehat{R}_{k_0}^{\,r}(X_i)\right)^2.$$
Once again, the choice of the subset $H_n$ (resp. $K_n$) and of the pilot bandwidth $h_0$ (resp. $k_0$) has a significant impact on the performance of the estimator. It would be very interesting to combine both approaches in order to benefit from the advantages of both selections; however, the computational cost of this idea is substantial.
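The bootstrap selector can be sketched as follows (our illustration, not the authors' code: the residuals are drawn from the two-point distribution G above, and the additive reconstruction $Y^{*} = \widetilde{R}_{h_0}(X_i) + \epsilon^{*}$ is an assumption of this sketch; the responses must remain positive for the RE-regression to make sense).

```python
import numpy as np

def bootstrap_bandwidth(X, Y, H_n, h0, n_boot=50, seed=None):
    rng = np.random.default_rng(seed)
    n = len(Y)
    fit0 = np.array([re_regression_kernel(X[i], X, Y, h0) for i in range(n)])
    eps = Y - fit0                                        # pilot residuals
    # Two-point distribution G: mean 0, variance 1 ("golden section" weights).
    a, b = (1 - np.sqrt(5)) / 2, (1 + np.sqrt(5)) / 2
    p = (np.sqrt(5) + 1) / (2 * np.sqrt(5))
    crit = {h: 0.0 for h in H_n}
    for _ in range(n_boot):
        eps_star = eps * np.where(rng.random(n) < p, a, b)
        y_star = fit0 + eps_star                          # bootstrap responses (assumed additive)
        fit0_star = np.array([re_regression_kernel(X[i], X, y_star, h0) for i in range(n)])
        for h in H_n:
            fit_h = np.array([re_regression_kernel(X[i], X, y_star, h) for i in range(n)])
            crit[h] += np.sum((fit_h - fit0_star) ** 2)
    return min(crit, key=crit.get)
```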

5. Computational Study

5.1. Empirical Analysis

Since this is a theoretical contribution, we wish in this empirical analysis to examine how easily the built estimator $\widetilde{R}$ can be implemented in practice. As the determination of $h_n$ is the principal challenge for the practical computability of $\widetilde{R}$, we compared in this computational study the two selectors discussed in the previous section. For this purpose, we conducted an empirical analysis based on artificial data generated through the following nonparametric regression
$$Y_i = \tau(X_i) + \epsilon_i, \quad i = 1, \dots, n, \qquad (9)$$
where $\tau(\cdot)$ is a known regression operator and $(\epsilon_i)_i$ is a sequence of independent random variables generated from a Gaussian distribution $\mathcal{N}(0, 0.5)$. The model in (9) describes the relationship between the endogenous and the exogenous variable.
On the other hand, in order to explore the dependency of the data, we generated the Hilbertian regressor by using the Hilbertian GARCH process through dgp.fgarch from the R-package rockchalk. We plotted, in Figure 1, a sample of the exogenous curves X(t).
The endogenous variable Y was generated by
$$\tau(x) = 4 \int_{0}^{\pi} \frac{x^2(t)}{1 + x^2(t)}\, dt.$$
For this empirical analysis, we compared the two selectors (7) and (8) with a mixed one obtained by using the optimal h of the rule (7) as the pilot bandwidth in the bootstrap procedure (8). For a fair comparison between the three algorithms, we optimized over the same subset $H_n$ and selected the optimal h for the three selectors, taking $H_n$ as quantiles of the vector of distances between the observed curves $X_i$ (the orders of the quantiles were $\{1/5, 1/10, 1/15, 0.5\}$). Finally, the estimator was computed with a quadratic kernel on $(0, 1)$, and we used the $L^2$ metric associated with the PCA semi-metric based on the $m = 3$ eigenfunctions of the empirical covariance operator associated with its three greatest eigenvalues (see Ferraty and Vieu [18]).
The efficiency of the estimation method was evaluated by plotting the true response value ( Y i ) i versus the predicted values R ^ ( X i ) . In addition, we used the relative error defined by
$$RSE = \sum_{i=1}^{n} \frac{\left(Y_i - \widetilde{R}_{-i}(X_i)\right)^2}{Y_i^2}$$
to evaluate the performance in this simulation study, which was carried out over 150 replications. The prediction results are depicted in Figure 2.
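A rough outline of this Monte Carlo design is given below (our illustration only: smooth random curves stand in for the functional GARCH generator, and re_regression_kernel is the sketch from Section 2).

```python
import numpy as np

rng = np.random.default_rng(0)
n, grid = 200, np.linspace(0.0, np.pi, 100)

# Stand-in exogenous curves (the paper uses a functional GARCH process instead).
X = np.array([np.sin(rng.uniform(1.0, 3.0) * grid) + rng.normal(0.0, 0.1, grid.size)
              for _ in range(n)])

def tau(x):
    integrand = x**2 / (1.0 + x**2)
    return 4.0 * integrand.mean() * np.pi          # Riemann approximation of the integral on [0, pi]

Y = np.array([tau(x) for x in X]) + rng.normal(0.0, 0.5, n)   # regression model (9); Y stays positive here

def rse(h):
    """Relative squared error of the leave-one-out predictions for a bandwidth h."""
    preds = np.array([re_regression_kernel(X[i], np.delete(X, i, axis=0),
                                           np.delete(Y, i), h) for i in range(n)])
    return np.sum((Y - preds) ** 2 / Y ** 2)
```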
Figure 2 shows clearly that the estimator $\widetilde{R}$ of the relative regression is very easy to implement in practice, and both selection algorithms had satisfactory behavior. Overall, the mixed approach performed better than the two separate approaches, with $RSE = 0.35$. On the other hand, the cross-validation rule had a small superiority ($RSE = 0.52$) over the bootstrap approach ($RSE = 0.65$) in this case. Of course, this small superiority is explained by the fact that the efficiency of the bootstrap approach depends on the pilot bandwidth parameter $h_0$, whereas the cross-validation rule is directly linked to the relative error loss function.

5.2. A Real Data Application

We devote this paragraph to a real application of the RE-regression as a predictor. Our ambition is to emphasize the robustness of this new regression. To do this, we compared it to the classical regression defined by the conditional expectation. For this purpose, we considered physical data corresponding to the monthly number of sunspots in the years 1749–2021. These data are available at the website of WDC-SILSO, Royal Observatory of Belgium, Brussels, http://www.sidc.be (accessed on 1 April 2023). The prediction of sunspots is very useful in real life: it can be used to forecast space weather, assess the state of the ionosphere, and define the appropriate conditions for radio shortwave propagation or satellite communications. It is worth noting that this kind of data can be viewed as a continuous-time process, which is a principal source of Hilbertian time series, obtained by cutting the continuous trajectory into consecutive intervals of fixed length. To fix these ideas, we plotted the initial data in Figure 3.
To predict the future value of the sunspot number given its past observations in a continuous path, we regard the whole data set $(Z_t)_{t \in [0, b)}$ as a real-valued process in continuous time. We then constructed, from $Z_t$, n Hilbertian variables $(X_i)_{i = 1, \dots, n}$, where
$$\forall t \in [0, b), \quad X_i(t) = Z_{n^{-1}((i - 1) b + t)}, \qquad Y_i = X_i(b).$$
Thus, our objective was to predict $Y_n$, knowing $(X_i, Y_i)_{i = 1, \dots, n-1}$ and $X_n$. At this stage, $\widetilde{R}(X_n)$ is the predictor of $Y_n$. In this computational study, we aimed to forecast the sunspot number one year ahead, given the observations of the past years. Thus, we fixed a month j in 1, …, 12 and computed the estimator $\widetilde{R}$ from the sample $(Y_i^j, X_i)_{i=1}^{272}$, where $Y_i^j$ is the sunspot number of the jth month of the $(i+1)$th year, and we repeated this estimation procedure for all $j = 1, \dots, 12$.
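In code, this segmentation step is straightforward; the sketch below (illustrative, with assumed array shapes) cuts the monthly series into yearly curves and builds the month-j targets of the following year.

```python
import numpy as np

def build_functional_sample(z, period=12):
    """z: 1-D array of monthly sunspot numbers whose length is a multiple of `period`."""
    n_years = len(z) // period
    curves = z[: n_years * period].reshape(n_years, period)   # X_i = the 12 months of year i
    X = curves[:-1]                                           # regressors: past years
    targets = {j: curves[1:, j] for j in range(period)}       # Y_i^j = month j of year i + 1
    return X, targets
```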
As the main feature of the RE-regression is its insensitivity to outliers, we examined this property by detecting the number of outliers at each prediction step j. To do this, we used the MAD-Median rule (see Wilcox [20]). Specifically, the MAD-Median rule considers an observation $Y_i$ an outlier if
$$\frac{|Y_i - M|}{\mathrm{MAD} / 0.6745} > C,$$
where M and MAD are the medians of $(Y_i)_i$ and $(|Y_i - M|)_i$, respectively, and $C = \chi^2_{0.975}$ (with one degree of freedom). Table 1 summarizes the number of outliers for each step j.
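A direct transcription of this rule (a small sketch; the cut-off follows the text, i.e. the chi-square 0.975 quantile with one degree of freedom, while Wilcox [20] uses its square root, about 2.24) is:

```python
import numpy as np
from scipy.stats import chi2

def mad_median_outliers(y, cutoff=None):
    """Boolean mask of the observations flagged as outliers by the MAD-Median rule."""
    y = np.asarray(y, dtype=float)
    m = np.median(y)
    mad = np.median(np.abs(y - m))
    c = chi2.ppf(0.975, df=1) if cutoff is None else cutoff
    return np.abs(y - m) / (mad / 0.6745) > c
```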
Both estimators R ˜ and
$$\widehat{R}(x) = \frac{\sum_{i=1}^{n} Y_i\, K\!\left(h_n^{-1}\|x - X_i\|\right)}{\sum_{i=1}^{n} K\!\left(h_n^{-1}\|x - X_i\|\right)}$$
were simulated using the quadratic kernel K ( x ) , where
$$K(x) = \frac{3}{2}\left(1 - x^2\right) \mathbb{1}_{[0, 1]}(x),$$
and the $L^2$ norm associated with the PCA-metric with $m = 3$. The cross-validation rule (7) was used to choose the smoothing parameter h. Figure 4 shows the prediction results, where we drew two curves showing the predicted values (the dashed curve for the relative regression and the dotted curve for the classical regression) and the observed values (solid curve).
Figure 4 shows that $\widetilde{R}$ performed better than $\widehat{R}$ in terms of prediction. Even though both predictors behaved well, the ASE of the relative regression (2.09) was smaller than that of the classical regression, which was equal to 2.87.
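For reference, the classical kernel (Nadaraya–Watson) competitor used in this comparison can be sketched in the same way (our illustration; quadratic_kernel and l2_dist are the helpers introduced earlier):

```python
import numpy as np

def classical_kernel_regression(x, X, Y, h):
    """Classical regression estimate: kernel-weighted mean of the responses."""
    w = quadratic_kernel(l2_dist(X, x) / h)
    return np.sum(w * Y) / np.sum(w)       # sum_i Y_i K_i / sum_i K_i
```

Replacing Y by $Y^{-1}$ and $Y^{-2}$ in the two sums recovers the relative regression $\widetilde{R}$, which makes the side-by-side comparison of Figure 4 immediate.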

6. Conclusions

In the current contribution, we focused on the kernel estimation of the RE-regression when the observations exhibit quasi-associated autocorrelation. This constitutes a new predictor for Hilbertian time series, an alternative to the classical regression based on the conditional expectation. Clearly, this new estimator increases the robustness of the classical regression because it reduces the effect of the largest observations, making this Hilbertian model insensitive to outlier observations; this is the main feature of this kind of regression. We provided in this contribution two rules to select the bandwidth parameter. The first was based on adapting the cross-validation rule to the relative error loss function. The second was obtained by adapting the wild bootstrap algorithm. The simulation experiment highlighted the applicability of both selectors in practice. In addition to these features, the present work opens an important number of questions for the future. First, establishing the asymptotic distribution of the present estimator would allow extending the applicability of this model to other applied issues in statistics. A second natural prospect concerns the treatment of alternative Hilbertian time series structures, including the ergodic case, the spatial case, and the β-mixing case, among others. It would also be very interesting to study other types of data (missing, censored, etc.) or other estimation methods, such as the kNN or the local linear method, etc.

Author Contributions

The authors contributed approximately equally to this work. Formal analysis, F.A.; Validation, Z.C.E. and Z.K.; Writing—review & editing, I.M.A., A.L. and M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research project was funded by the Deanship of Scientific Research, Princess Nourah bint Abdulrahman University, through the Program of Research Project Funding after Publication, grant No. (43-PRFA-P-25).

Data Availability Statement

The data used in this study are available through the link http://www.sidc.be (accessed on 1 April 2023).

Acknowledgments

The authors would like to thank the Associate Editor and the referees for their very valuable comments and suggestions which led to a considerable improvement of the manuscript. The authors also thank and extend their appreciation to the Deanship of Scientific Research, Princess Nourah bint Abdulrahman University for funding this work.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In this appendix, we briefly give the proofs of the preliminary results; the proofs of Lemmas 3 and 4 are omitted, as they can be obtained straightforwardly by adapting the proofs of Bouzebda et al. [17].
Proof of Lemma 1.
Clearly, the proofs of the two statements are very similar, so we focus only on the first one. The difficulty in this kind of proof comes from the fact that the quantity $Y_i^{-1}$ is not bounded. To deal with this problem, the truncation method is used: we define
$$\widetilde{R}_N^{*}(x) = \frac{1}{n\,\mathbb{E}[K_1(x)]} \sum_{i=1}^{n} K\!\left(h_n^{-1}\|x - X_i\|\right) Y_i^{-1}\, \mathbb{1}_{\{|Y_i| > \mu_n\}}, \quad \text{with} \quad \mu_n = n^{-\xi/6}.$$
Then, the desired result is a consequence of
$$\mathbb{E}\!\left[\widetilde{R}_N(x)\right] - \mathbb{E}\!\left[\widetilde{R}_N^{*}(x)\right] = O\!\left(\sqrt{\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}}\right), \qquad (A1)$$
$$\widetilde{R}_N(x) - \widetilde{R}_N^{*}(x) = O_{a.co.}\!\left(\sqrt{\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}}\right), \qquad (A2)$$
and
$$\widetilde{R}_N^{*}(x) - \mathbb{E}\!\left[\widetilde{R}_N^{*}(x)\right] = O_{a.co.}\!\left(\sqrt{\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}}\right). \qquad (A3)$$
We start by proving (A3). For this, we write
$$\widetilde{R}_N^{*}(x) - \mathbb{E}\!\left[\widetilde{R}_N^{*}(x)\right] = \sum_{i=1}^{n} \Upsilon_i, \quad \text{where} \quad \Upsilon_i = \frac{1}{n\,\mathbb{E}[K_1(x)]}\, \chi(X_i, Y_i),$$
with
$$\chi(z, w) = w^{-1}\, K\!\left(h_n^{-1}\|x - z\|\right) \mathbb{1}_{\{|w| > \mu_n\}} - \mathbb{E}\!\left[K_1(x)\, Y_1^{-1}\, \mathbb{1}_{\{|Y_1| > \mu_n\}}\right], \quad z \in \mathcal{H}, \; w \in \mathbb{R}.$$
Observe that
$$\|\chi\|_{\infty} \le C\,\mu_n^{-1}\,\|K\|_{\infty} \quad \text{and} \quad \mathrm{Lip}(\chi) \le C\,\mu_n^{-1}\,h_n^{-1}\,\mathrm{Lip}(K).$$
The key tool for proving (A3) is Kallabis and Neumann's inequality (see [21], p. 2), applied to the variables $\Upsilon_i$. It requires evaluating asymptotically two quantities: $\mathrm{Var}\!\left(\sum_{i=1}^{n} \Upsilon_i\right)$ and $\mathrm{Cov}\!\left(\Upsilon_{s_1}\cdots\Upsilon_{s_u},\, \Upsilon_{t_1}\cdots\Upsilon_{t_v}\right)$, for all $(s_1, \dots, s_u) \in \mathbb{N}^u$ and $(t_1, \dots, t_v) \in \mathbb{N}^v$.
Concerning the variance term, we write
$$\mathrm{Var}\!\left(\sum_{i=1}^{n} \Upsilon_i\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j) = n\, \mathrm{Var}(\Upsilon_1) + \sum_{i=1}^{n}\sum_{\substack{j=1 \\ j \ne i}}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j).$$
The above formula has two terms. For $\mathrm{Var}(\Upsilon_1)$, under (D5) we obtain
$$\mathbb{E}\!\left[Y_1^{-2}\, \mathbb{1}_{\{|Y_1| > \mu_n\}}\, K_1^2(x)\right] \le \mathbb{E}\!\left[K_1^2(x)\, \mathbb{E}\!\left(Y_1^{-2} \mid X_1\right)\right] \le C\, \mathbb{E}\!\left[K_1^2(x)\right].$$
Then, we use
$$\mathbb{E}\!\left[K_1^j(x)\right] = O\!\left(\phi_x(h_n)\right), \quad j = 1, 2,$$
to deduce that
$$\mathrm{Var}(\Upsilon_1) = O\!\left(\frac{1}{n\,\phi_x(h_n)}\right). \qquad (A4)$$
Now, we need to examine the covariance term. To do that, we use Masry's technique to obtain the decomposition:
$$\sum_{i=1}^{n}\sum_{\substack{j=1 \\ j \ne i}}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j) = \underbrace{\sum_{i=1}^{n}\sum_{\substack{j=1 \\ 0 < |i-j| \le m_n}}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j)}_{=:\, T_I} \;+\; \underbrace{\sum_{i=1}^{n}\sum_{\substack{j=1 \\ |i-j| > m_n}}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j)}_{=:\, T_{II}}.$$
Note that $(m_n)$ is a sequence of positive integers that tends to infinity as $n \to \infty$.
We use the second part of (D5) to obtain
$$\mathrm{Cov}(\Upsilon_i, \Upsilon_j) \le C\left(\mathbb{E}\!\left[K_i(x) K_j(x)\right] + \mathbb{E}\!\left[K_i(x)\right]\mathbb{E}\!\left[K_j(x)\right]\right) \le C\left(\phi_x^{(a+1)/a}(h_n) + \phi_x^{2}(h_n)\right).$$
Therefore,
$$T_I \le C\, n\, m_n\, \phi_x^{(a+1)/a}(h_n). \qquad (A5)$$
Since the observations are quasi-associated and the kernel K is bounded and Lipschitzian, we obtain
$$T_{II} \le C\,(\mu_n h_n)^{-1}\,\mathrm{Lip}(K)^2 \sum_{i=1}^{n}\sum_{\substack{j=1 \\ |i-j| > m_n}}^{n} \lambda_{i,j} \le C\, n\,(\mu_n h_n)^{-1}\,\mathrm{Lip}(K)^2\, \lambda_{m_n} \le C\, n\,(\mu_n h_n)^{-1}\,\mathrm{Lip}(K)^2\, e^{-a\, m_n}. \qquad (A6)$$
Then, by (A5) and (A6), we obtain
$$\sum_{i=1}^{n}\sum_{\substack{j=1 \\ j \ne i}}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j) \le C\left(n\, m_n\, \phi_x^{(a+1)/a}(h_n) + n\,(\mu_n h_n)^{-1}\,\mathrm{Lip}(K)^2\, e^{-a\, m_n}\right).$$
Putting $m_n = \frac{1}{a}\,\log\!\left(\frac{(\mu_n h_n)^{-1}\,\mathrm{Lip}(K)^2}{\phi_x^{(a+1)/a}(h_n)}\right)$, we obtain
$$\frac{1}{n\,\phi_x(h_n)} \sum_{i=1}^{n}\sum_{\substack{j=1 \\ j \ne i}}^{n} \mathrm{Cov}(\Upsilon_i, \Upsilon_j) \longrightarrow 0, \quad \text{as } n \to \infty. \qquad (A7)$$
Combining together results (A4) and (A7), we show that
$$\mathrm{Var}\!\left(\sum_{i=1}^{n} \Upsilon_i\right) = O\!\left(\frac{1}{n\,\phi_x(h_n)}\right). \qquad (A8)$$
We now evaluate the covariance term $\mathrm{Cov}\!\left(\Upsilon_{s_1}\cdots\Upsilon_{s_u},\, \Upsilon_{t_1}\cdots\Upsilon_{t_v}\right)$ for $(s_1, \dots, s_u, t_1, \dots, t_v) \in \mathbb{N}^{u+v}$.
To do that, we treat the following cases:
  • The first case is t 1 > s u ; based on the definition of quasi-association, we obtain
    $$\left|\mathrm{Cov}\!\left(\Upsilon_{s_1}\cdots\Upsilon_{s_u},\, \Upsilon_{t_1}\cdots\Upsilon_{t_v}\right)\right| \le \frac{(\mu_n h_n)^{-1}\,\mathrm{Lip}(K)^2}{\left(n\,\mathbb{E}[K_1(x)]\right)^{2}}\left(\frac{C}{n\,\mu_n\,\mathbb{E}[K_1(x)]}\right)^{u+v-2} \sum_{i=1}^{u}\sum_{j=1}^{v} \lambda_{s_i, t_j}
    \le \left(\frac{C}{n\,\mu_n\,\mathbb{E}[K_1(x)]}\right)^{u+v} v\, \lambda_{t_1 - s_u}\, h_n^{-1}\,\mathrm{Lip}(K)^2
    \le \left(\frac{C}{n\,\mu_n\,\phi_x(h_n)}\right)^{u+v} v\, h_n^{-1}\,\mathrm{Lip}(K)^2\, e^{-a(t_1 - s_u)}. \qquad (A9)$$
    On the other hand, we have
    $$\left|\mathrm{Cov}\!\left(\Upsilon_{s_1}\cdots\Upsilon_{s_u},\, \Upsilon_{t_1}\cdots\Upsilon_{t_v}\right)\right| \le \left(\frac{C_K}{n\,\mu_n\,\mathbb{E}[K_1(x)]}\right)^{u+v-2}\left(\mathbb{E}\!\left[\Upsilon_{s_u}\Upsilon_{t_1}\right] + \mathbb{E}\!\left[\Upsilon_{s_u}\right]\mathbb{E}\!\left[\Upsilon_{t_1}\right]\right)
    \le \left(\frac{C_K}{n\,\mu_n\,\mathbb{E}[K_1(x)]}\right)^{u+v-2}\left(\frac{C}{n\,\mu_n\,\mathbb{E}[K_1(x)]}\right)^{2}\left(\phi_x^{(a+1)/a}(h_n) + \phi_x^{2}(h_n)\right)
    \le \left(\frac{C}{n\,\mu_n\,\phi_x(h_n)}\right)^{u+v} \phi_x^{(a+1)/a}(h_n). \qquad (A10)$$
    Furthermore, taking the $\frac{1}{2(a+1)}$-power of (A9) and the $\frac{2a+1}{2(a+1)}$-power of (A10), we get, for $1 \le s_1 \le \dots \le s_u \le t_1 \le \dots \le t_v \le n$:
    $$\left|\mathrm{Cov}\!\left(\Upsilon_{s_1}\cdots\Upsilon_{s_u},\, \Upsilon_{t_1}\cdots\Upsilon_{t_v}\right)\right| \le \phi_x(h_n)\left(\frac{C}{n\,\phi_x(h_n)}\right)^{u+v} v\, e^{-a(t_1 - s_u)/(2(a+1))}.$$
  • The second one is where t 1 = s u . In this case, we have
    $$\left|\mathrm{Cov}\!\left(\Upsilon_{s_1}\cdots\Upsilon_{s_u},\, \Upsilon_{t_1}\cdots\Upsilon_{t_v}\right)\right| \le \left(\frac{C_K}{n\,\mu_n\,\mathbb{E}[K_1(x)]}\right)^{u+v} \mathbb{E}\!\left[K_1^2(x)\right] \le \phi_x(h_n)\left(\frac{C}{n\,\mu_n\,\phi_x(h_n)}\right)^{u+v}.$$
So, we are in a position to apply Kallabis and Neumann's inequality to the variables $\Upsilon_i$, $i = 1, \dots, n$, with
$$K_n = \frac{C}{n\,\mu_n\,\phi_x(h_n)}, \qquad M_n = \frac{C}{\mu_n\,\sqrt{n\,\phi_x(h_n)}}, \qquad \mathrm{Var}\!\left(\sum_{i=1}^{n}\Upsilon_i\right) = O\!\left(\frac{1}{n\,\phi_x(h_n)}\right).$$
It allows us to have
$$\mathbb{P}\left(\left|\widetilde{R}_N^{*}(x) - \mathbb{E}\!\left[\widetilde{R}_N^{*}(x)\right]\right| > \eta\,\sqrt{\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}}\right) = \mathbb{P}\left(\left|\sum_{i=1}^{n}\Upsilon_i\right| > \eta\,\sqrt{\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}}\right)$$
$$\le \exp\left(\frac{-\,\eta^2\,\log n\,/\,\big(2\, n^{1-\xi}\,\phi_x(h_n)\big)}{\mathrm{Var}\!\left(\sum_{i=1}^{n}\Upsilon_i\right) + C\,\mu_n^{-1}\left(n\,\phi_x(h_n)\right)^{-1/3}\left(\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}\right)^{5/6}}\right)$$
$$\le \exp\left(\frac{-\,\eta^2\,\log n}{C\left(n^{-\xi} + \mu_n^{-1}\, n^{-\xi/6}\left(\frac{\log^5 n}{n\,\phi_x(h_n)}\right)^{1/6}\right)}\right) \;\le\; C\,\exp\!\left(-C\,\eta^2\,\log n\right).$$
Choosing η adequately completes the proof of (A3).
Next, to prove (A1), we use Hölder's inequality to write
$$\left|\mathbb{E}\!\left[\widetilde{R}_N(x)\right] - \mathbb{E}\!\left[\widetilde{R}_N^{*}(x)\right]\right| \le \frac{1}{n\,\mathbb{E}[K_1(x)]}\, \mathbb{E}\left[\left|\sum_{i=1}^{n} Y_i^{-1}\,\mathbb{1}_{\{|Y_i| < \mu_n\}}\, K_i(x)\right|\right] \le \frac{1}{\mathbb{E}[K_1(x)]}\, \mathbb{E}\!\left[|Y_1|^{-1}\,\mathbb{1}_{\{|Y_1| < \mu_n\}}\, K_1(x)\right] \le C\,\phi_x^{-1/2}(h_n)\,\exp\!\left(-\mu_n^{-1}/4\right).$$
Since $\mu_n = n^{-\xi/6}$, this allows us to obtain
$$\mathbb{E}\!\left[\widetilde{R}_N(x)\right] - \mathbb{E}\!\left[\widetilde{R}_N^{*}(x)\right] = o\!\left(\left(\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}\right)^{1/2}\right).$$
We use Markov’s inequality to obtain the last claimed result (A2). Hence, for all ϵ > 0
$$\mathbb{P}\left(\left|\widetilde{R}_N(x) - \widetilde{R}_N^{*}(x)\right| > \epsilon\right) = \mathbb{P}\left(\frac{1}{n\,\phi_x(h_n)}\left|\sum_{i=1}^{n} Y_i^{-1}\,\mathbb{1}_{\{|Y_i|^{-1} > \mu_n^{-1}\}}\, K_i(x)\right| > \epsilon\right) \le n\,\mathbb{P}\!\left(|Y_1|^{-1} > \mu_n^{-1}\right) \le C\,n\,\exp\!\left(-\mu_n^{-1}\right).$$
Then,
$$\sum_{n \ge 1} \mathbb{P}\left(\left|\widetilde{R}_N(x) - \widetilde{R}_N^{*}(x)\right| > \epsilon_0\,\sqrt{\frac{\log n}{n^{1-\xi}\,\phi_x(h_n)}}\right) \le C\,\sum_{n \ge 1} n\,\exp\!\left(-\mu_n^{-1}\right).$$
Using the definition of $\mu_n$ completes the proof of the lemma. □
Proof of Lemma 2.
Once again, the focus is on the proof of the first statement; the second one is obtained in the same way. The proofs of both results use the stationarity of the pairs $(X_i, Y_i)$. We write
$$\left|\mathbb{E}\!\left[\widetilde{R_N}(x)\right] - R_1(x)\right| = \frac{1}{\mathbb{E}[K_1(x)]}\left|\mathbb{E}\!\left[K_1(x)\left(R_1(x) - \mathbb{E}\!\left[Y_1^{-1} \mid X_1\right]\right)\right]\right|.$$
The conditions (D2) and (D4) imply
$$\left|R_1(X_1) - R_1(x)\right| \le C\, h_n^{k_1}.$$
Hence,
$$\left|\mathbb{E}\!\left[\widetilde{R_N}(x)\right] - R_1(x)\right| \le C\, h_n^{k_1}. \qquad \square$$
Proof of Corollary 1.
Clearly, we can obtain that
$$\left|\widetilde{R_D}(x)\right| \le \frac{R_2(x)}{2} \;\Longrightarrow\; \left|\widetilde{R_D}(x) - R_2(x)\right| \ge \frac{R_2(x)}{2}.$$
So,
$$\mathbb{P}\left(\left|\widetilde{R_D}(x)\right| \le \frac{R_2(x)}{2}\right) \le \mathbb{P}\left(\left|\widetilde{R_D}(x) - R_2(x)\right| \ge \frac{R_2(x)}{2}\right).$$
Consequently,
$$\sum_{n = 1}^{\infty} \mathbb{P}\left(\left|\widetilde{R_D}(x)\right| < \frac{R_2(x)}{2}\right) < \infty. \qquad \square$$

References

1. Narula, S.C.; Wellington, J.F. Prediction, linear regression and the minimum sum of relative errors. Technometrics 1977, 19, 185–190.
2. Chatfield, C. The joys of consulting. Significance 2007, 4, 33–36.
3. Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least absolute relative error estimation. J. Am. Statist. Assoc. 2010, 105, 1104–1112.
4. Yang, Y.; Ye, F. General relative error criterion and M-estimation. Front. Math. China 2013, 8, 695–715.
5. Jones, M.C.; Park, H.; Shin, K.-I.; Vines, S.K.; Jeong, S.-O. Relative error prediction via kernel regression smoothers. J. Stat. Plan. Inference 2008, 138, 2887–2898.
6. Mechab, W.; Laksaci, A. Nonparametric relative regression for associated random variables. Metron 2016, 74, 75–97.
7. Attouch, M.; Laksaci, A.; Messabihi, N. Nonparametric RE-regression for spatial random variables. Stat. Pap. 2017, 58, 987–1008.
8. Demongeot, J.; Hamie, A.; Laksaci, A.; Rachdi, M. Relative-error prediction in nonparametric functional statistics: Theory and practice. J. Multivar. Anal. 2016, 146, 261–268.
9. Cuevas, A. A partial overview of the theory of statistics with functional data. J. Stat. Plan. Inference 2014, 147, 1–23.
10. Goia, A.; Vieu, P. An introduction to recent advances in high/infinite dimensional statistics. J. Multivar. Anal. 2016, 146, 1–6.
11. Ling, N.; Vieu, P. Nonparametric modelling for functional data: Selected survey and tracks for future. Statistics 2018, 52, 934–949.
12. Aneiros, G.; Cao, R.; Fraiman, R.; Genest, C.; Vieu, P. Recent advances in functional data analysis and high-dimensional statistics. J. Multivar. Anal. 2019, 170, 3–9.
13. Aneiros, G.; Horova, I.; Hušková, M.; Vieu, P. On functional data analysis and related topics. J. Multivar. Anal. 2022, 189, 3–9.
14. Chowdhury, J.; Chaudhuri, P. Convergence rates for kernel regression in infinite-dimensional spaces. Ann. Inst. Stat. Math. 2020, 72, 471–509.
15. Li, B.; Song, J. Dimension reduction for functional data based on weak conditional moments. Ann. Stat. 2022, 50, 107–128.
16. Douge, L. Théorèmes limites pour des variables quasi-associées hilbertiennes. Ann. L'Isup 2010, 54, 51–60.
17. Bouzebda, S.; Laksaci, A.; Mohammedi, M. The k-nearest neighbors method in single index regression model for functional quasi-associated time series data. Rev. Mat. Complut. 2023, 36, 361–391.
18. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006.
19. Hardle, W.; Marron, J.S. Bootstrap simultaneous error bars for nonparametric regression. Ann. Stat. 1991, 16, 1696–1708.
20. Wilcox, R. Introduction to Robust Estimation and Hypothesis Testing; Elsevier Academic Press: Burlington, MA, USA, 2005.
21. Kallabis, R.S.; Neumann, M.H. An exponential inequality under weak dependence. Bernoulli 2006, 12, 333–335.
Figure 1. Displayed is a sample of the functional curves.
Figure 2. Prediction results.
Figure 3. Initial data.
Figure 4. Comparison of the prediction results.
Table 1. Number of outliers with respect to j.

Months    1   2   3   4   5   6   7   8   9   10  11  12
Outliers  15  26  13  5   24  25  7   9   11  8   9   15