Estimation Approach for a Linear Quantile-Regression Model with Long-Memory Stationary GARMA Errors

: The aim of this paper is to assess the significant impact of using quantile analysis in multiple fields of scientific research . Here, we focus on estimating conditional quantile functions when the errors follow a GARMA (Generalized Auto-Regressive Moving Average) model. Our key theoretical contribution involves identifying the Quantile-Regression (QR) coefficients within the context of GARMA errors. We propose a modified maximum-likelihood estimation method using an EM algorithm to estimate the target coefficients and derive their statistical properties. The proposed procedure yields estimators that are strongly consistent and asymptotically normal under mild conditions. In order to evaluate the performance of the proposed estimators, a simulation study is conducted employing the minimum bias and Root Mean Square Error (RMSE) criterion. Furthermore, an empirical application is given to demonstrate the effectiveness of the proposed methodology in practice.


Introduction
Linear regression is one of the most common and well understood techniques for modeling the relationship between one or more covariates or predictor variables, X, and the conditional mean E(Y|X = x) of the response variable Y, given X = x.The conditional mean minimizes the expected squared error: the conditional mean of Y, given x, is linear and expressed as µ(x) = x ′ β, then β can be estimated by solving min β ∑(y i − x ′ β) 2 , which is the ordinary least squares solution of the linear regression model.
However, inference from this model requires that specific assumptions be made about the distribution of the error (i.e., linearity, homoscedasticity, independence, or normality).In contrast, quantile regression, as introduced by Koenker and Bassett [1], aims to extend these ideas to the estimation of conditional quantile function models wherein the quantiles of the conditional distribution of the response variable are expressed as functions of observed covariates.For a pair of observed values (x, y) of random vector (X, Y), a quantile regression has the form q τ (x) = in f {y : P(Y ≤ y|x) ≥ τ}, where τ indicates that the specific quantile of Y is the smallest value of y denoted by P(Y ≤ y|x) = E(1 {Y≤y} |X = x) , given without making assumptions about the distribution of the error.In the case of classical quantile regressions, we can assume that these quantiles of the conditional distribution have a linear form, where β τ is a parameter vector (a vector of regression coefficients).
For the following, it may be useful to note that this expression can be written in an equivalent way: Y = x ′ β τ + ϵ τ , with q τ (ϵ τ |x) = 0. Condition 1 is similar to that carried out in the standard linear regression, in which the conditional mean of the variable of interest Y is modelled as a linear expression of the explanatory variables X.One difference is that here we allow the coefficients to differ from quantile to quantile.This provides additional information not apparent from simple linear regression.
In the parameter estimation problem, the form of the quantile regression function is known, but it contains unknown parameters, β τ .The most intuitive way to calculate the standard estimator, β τ , consists of ordering these n variables, the quantile of order τ being provided by the (nτ th ) observation where [nτ] is the smallest integer greater than or equal to nτ.However, it is more useful to notice that the popular method for estimating the unknown parameters, β τ , in a quantile-regression function is by solving the estimating equation q τ (Y) = min where q τ (.) is the check function defined by ρ τ (u) = u(τ − I(u < 0)) for some τ ∈ (0, 1).
Here, I(.) is an indicator function that takes the value of unity when I(.) is true and zero otherwise, and here u = y i − x ′ i β τ .However, this function is not differentiable at zero, and clear solutions to minimization problems are unobtainable [2,3].In quantile-regression methods, linear programming is frequently implemented for parameter estimation.
The Expectation-Maximization (EM) algorithm and the Alternating Least Squares (ALS) algorithm are both iterative optimization techniques commonly employed in statistical modeling to estimate parameters iteratively.In quantile regression, aimed at estimating conditional quantiles of a response variable given covariates, the selection between EM and ALS relies on several considerations.
Opting for EM in quantile regression over ALS is justified due to its capability in managing latent variables and accommodating diverse error distributions, and its alignment with the objectives of quantile regression.Additionally, EM avoids the swapping effect by updating parameters sequentially rather than alternately, enhancing stability in estimation.However, it is essential to note that the bias in EM hinges on factors such as sample size and model assumptions.
Despite potential variations in speed, where the speed of EM varies depending on the complexity of the model and convergence criteria, EM's advantages in handling latent variables and error distributions make it a favorable choice in many quantileregression applications.
There are numerous articles about the parameter estimation of quantile-regression models with the EM algorithm.Tian et al. [4] proposed a new method based on the Expectation-Maximization algorithm for a linear quantile-regression model with symmetric Laplace error distribution.Furthermore, Tian et al. [5] used this method for a linear quantile-regression model with autoregressive errors.Zhou et al. [6] developed the EM algorithm and GEM algorithm for calculating the quantiles of linear and non-linear regression models.
All of these results are only for the errors that are modeled by short-memory, timeseries models.However, long-memory models are very important; they are used in various fields, such us hydrology, chemistry, and economics; see, for example Hurst [7], Jeffreys [8], Student [9].
The most commonly used long-memory model, which deals with the modeling of cyclic behavior, is The Generalized Autoregressive Moving Average model (GARMA).It was proposed by Gray et al. [10] to deal with non-additivity, non-normality, and heteroscedasticity in real time-series data.
This behavior has attracted considerable attention, prompting extensive research efforts over recent decades.For instance, Darmawan et al. [11] utilized a GARMA model to predict COVID-19 data trends in Indonesia, Albarracin et al. [12] analyzed the structure of GARMA models in practical scenarios, while Huntet al. [13] introduced an R package, 'garma', specifically tailored for fitting and forecasting GARMA models in R (Version R-4.4.0).
The foundations of the methods for independent data have now been consolidated, and some computational commands for the analysis of such data are provided by most of the available statistical software (e.g., R (Version 4.4.1 R. Ross Ihaka and Robert Gentleman's creation in Vienna, Austria)/Python (Version 3.13.Guido van Rossum's creation, in Wilmington, DE, USA).
In this paper, we will consider a linear quantile-regression model with Generalized Autoregressive Moving Average error, defined as follows: where y t is the t-th observation of the response variable, ) ′ is a regression parameter vector, and {ϵ t , t ∈ Z} is a stationary long-memory process generated by the GARMA(p,0) model, as follows: where In this work, we aim to estimate the parameters of our models with the EM algorithm (see [14]) based on the method of Tian et al. [4].However, we will derive the asymptotic properties of our estimators.
The outline of this paper is organized as follows: In Section 2, we provide the likelihood function, and Section 3 deals with the estimation of our parameters by the EM algorithm.In Section 4, we derive the asymptotic properties of the estimators under some mild conditions.Some simulation results will be provided in Section 5, and finally a real data example is provided in Section 6.

Estimation Procedure via EM Algorithm
The maximum-likelihood method is a technique that is widely recognized for deriving estimators.One of the principal reasons for the wide popularity of the maximum-likelihood method is that the resulting estimator has many interesting properties.However, in some complicated problems maximum-likelihood estimators are unsuitable or do not exist.This method estimates the parameters of various statistical models, including quantileregression models.In such cases, to estimate the q τ (y t |x t ) at model (1), it is natural to simultaneously estimate the regression coefficients and the GARMA parameters using quantile regression.Hence, in this section we focuses on the estimation of the parameter vector of models ( 3) and ( 4) by (β ′ τ , Φ ′ , σ) ′ .It is also worth noting that maximum-likelihood estimation is related to other optimization techniques.In particular, if we assume a normal distribution then it is equivalent to the ordinary least squares method.In practice, the normality assumption of the random error may be abandoned for many reasons, such as outliers, contaminated data, and heavy-tailed distributions.The Laplace distribution is a good robust alternative in this case [15].Yu and Moyeed [16] found that minimizing the expression (1) is equivalent to maximizing a likelihood function formed by combining the independently distributed asymmetric Laplace error distribution (see [17]).In this work, we adopt a similar idea to build a likelihood function; the distribution of the process {ξ t , t ∈ Z} in model ( 3) is assumed to be a symmetric Laplace distribution with zero mean, scale parameter σ > 0, and skewness parameter τ ∈ (0, 1).Let ξ t ∼ ALD(0, σ, τ), as follows: where ) is the quantile check function.
The mean and the variance of this distribution are given by: ).The Laplace distribution was chosen for the error term in the GARMA model due to several advantages it offers.Firstly, it offers robustness to outliers.The Laplace distribution, also referred to as the double-exponential distribution, has heavier tails compared to the normal distribution.This implies that it is more robust to outliers in the data.In a GARMA error model, which aims to capture the time-series behavior of a process, having robustness to outliers is essential for precise estimation and prediction.Secondly, unlike the normal distribution, the Laplace distribution is symmetric and has the capability to capture skewness in the data.In time-series analysis, it is common to encounter data that exhibit asymmetric patterns.By using the Laplace distribution as the error term in the GARMA model, we align the model's assumptions with the potential skewness in the data, thus improving its ability to capture the underlying patterns.Additionally, the Laplace distribution possesses straightforward mathematical properties that enhance its computational efficiency.This characteristic can be beneficial when fitting the GARMA model to extensive datasets or when performing simulations for different scenarios.
Under such circumstance, the maximum-likelihood estimates for the parameters β τ , Φ, and σ are obtained by maximizing the marginal density, f (y t |β τ , Φ, σ) , which is obtained by replacing ξ t with Φ p (L)(1 The maximization of ( 4) is equivalent to the maximization of the logarithm of the likelihood function: ). (7) Clearly, the objective function has no closed form.Hence, it is not possible the estimate the quantile-regression coefficients because this objective function is non-convex with respect to β τ and the differentiation (the function ρ τ (.) is not differentiable at 0).Therefore, to overcome this problem we suggest the use of the property of the asymmetric Laplace, combining a normal distribution conditional on an exponential distribution, proposed by Yu and Moyeed [16], as follows: where . This proposal should be seen as a ratification to the previous objective function, using the Gegenbauer polynomials (C (See [19]).Then, under the previous results, Equations (3), ( 8) and ( 9) lead to the following explicit expression: Therefore, we can conclude that y t is conditionally normally distributed (see [5] for more details).Then, the quantile-regression model here is represented as a normal regression model.The full joint density for y = (y 1 , ..., y n ) is given by: with z = (z 1 , ..., z n ) and P(L) = (1 The estimators of the unknown quantile-regression parameters, β τ , the scale σ, and the vector parameters of the GARMA model, Φ, may be obtained by maximizing the likelihood function: The corresponding log-likelihood function is given by: Due to the unobserved variables, z t , the maximum likelihood becomes intractable.In this case, solving Equation ( 2) requires an iterative algorithm.The EM algorithm (see [20]) is a broad method of finding the maximum-likelihood estimates of the parameters of a fundamental distribution from a provided dataset that has missing values.In the next section, we present the Expectation-Maximization (EM) algorithm, which is derived from likelihood optimization.

EM Algorithm
The Expectation-Maximization (EM) algorithm [14] is a commonly utilized method for finding maximum-likelihood estimates in statistical models that rely on missing data or unobservable latent variables.A latent variable is a variable that affects our observed data but in ways that we cannot know directly.The EM algorithm is an iterative approach that alternates between two modes.The first mode attempts to estimate the missing or hidden variables; this is known as the estimation-step or E-step.The second mode attempts to optimize the parameters of the model to provide the best explanation of the data; this is known as the maximization-step or M-step.
In the model (10), the EM iterations are based on regarding the random variable (z t = (z 1 , ..., z n )) as a set of unobserved latent variables or missing values.We seek to estimate our model by maximizing the complete-data log likelihood, which is then denoted in Equation (12), log(L(β τ , Φ, σ/y, x, z), where ϑ = (β ′ τ , Φ ′ , σ) ′ are the unknown parameter vectors for which we wish to find the MLE.To apply the EM algorithm for estimation, we must first find the conditional pdf of the unobserved variable z t , in order to estimate the missing variables in the dataset.Indeed, since z t , in Equation (8), is assumed to be exp( 1 σ ) and u t ∼ N (0, 1), it can be observed that the conditional distribution of z t , given y, (z t |y) is given by: where GIG is the Generalized Inverse Gaussian distribution, a three parameter distribution introduced by Good, [21] that has been applied in a variety of fields of statistics.More recently, Sánchez et al.
[22] used the GIG distribution as a mixing distribution to estimate the parameters of quantile-regression models.Furthermore, note that the r-th moment of z t ∼ GIG(α, γ, δ) is given by: and where K is a modified Bassel function of the second kind (see [23] for more details).The above relations ( 14) and ( 15) will be useful in calculating the conditional expectation of the log-likelihood function, with respect to the conditional distribution of z t given y, when applying the following two EM steps, described as follows: Step E To applying EM to model (10) (with GARMA errors), we start by writing down the expected complete log-likelihood, given by Equation (2), known as the Q-function, Q(ϑ/ϑ (h) ) = E(log(ϑ/y, z)/y, ϑ (h) ), where E ϑ (h) ) [.] means that the expectation is being affected using ϑ (h) for ϑ , with respect to the unknown data z given the observed data y and the current parameter estimates ϑ (h) , which is the estimated value of the h-th iteration.Calculating this expectation at the (h − 1)-th iteration yields where we have defined the pseudo-values λ t = E(log(z t )/y, ϑ (h−1) ), µ t = E(z −1 t /y, ϑ (h−1) ), and ν t = E(z t /y, ϑ (h−1) ).
For evaluating (2), it is necessary to compute λ t , µ t , and ν t , as in ( 14) and ( 15) , which will depend of the conditional pdf of z in Equation (11).
These expressions are straightforward to derive using Equations ( 14) and ( 15), as discussed in Eberlein and Hammerstein [24].The elements of the partial derivatives of the t-th moment of z t evaluated at ϑ (h−1) are as follows: . We can calculate ν t explicitly by applying the well-known recursion formula defined by xK a−1 (x) − xK a+1 (x) = −2aK a (x), resulting in the following expression: Step M (Maximization) The above E-step of the EM algorithm simply uses L(y|x, z, ϑ (h−1) ) to calculate the expectation of the unobservable information, z, given the observed data (y, x) and the existing estimates of unknown parameters ϑ (h−1) .
In the M-step, the parameters are re-estimated by maximizing the Q-function to find the new estimate ϑ (h) by solving the (h) th step; the M-step procedures is as follows: where each estimate parameter can be acquired by partially maximizing the objective Q-function: In this regard, starting values, ϑ (0) = (β (0) ′ τ , Φ (0) ′ , σ (0) ) ′ , are necessary in order to initiate the iterative procedure.These can be obtained from the Least Squares Estimates (LSE) method (see [25]).
An estimate of ϑ = (β ′ τ , Φ ′ , σ) ′ can be obtained by equating the score vector, including the first-order partial derivatives of Q(ϑ) with respect to each parameter, denoted by , ..., ∂Q ∂ϕ p , ∂Q ∂σ ) , to zero vector, leading to Q-function equations.In particular, we have the following explicit expressions:

Estimator of β τ
Let β τ be the parameter value at a local maximum of the Q-function in Equation ( 2).Differentiating the previous expression in respect to β τ , we find In matrix form, we have: where: We obtain the h-th iteration estimator of the parameter β τ as follows: 2.1.2.Estimator of Φ Now, we apply the same procedure for our parameters ϕ 1 ,....,ϕ p : For k ∈ {1, ..., p}, we have: Then, we write the matrix form as follows: where: Therefore, the h-th iteration estimators of the parameters ϕ j are given by:

. Estimator of σ
Finally, for σ, we have: Then, we get:

Consistency and Asymptotic Normality
In this section, we establish the consistency and asymptotic results of our estimators in Equations ( 17)- (19).We will begin by stating and discussing a set of basic notations and some assumptions that will be used throughout the remaining parts of this paper.
For the derivation of asymptotic normality of the estimator of β τ , we need consistency and some additional assumptions.Because we try to show the asymptotic normality of √ n( β τ − β τ ), it is convenient to apply the notation introduced in (1)-( 2) and make the following assumptions of regularity conditions: As well as conditions (C1) and (C2), we will use condition (C3) to apply the ergodic theorem and condition(C4) for the martingale central limit theorem of Billingsley [26].Thus, we have the following theorem: Theorem 2. Let Φ (h) the EM estimator of Φ.Under (C3) and (C4), we have: Proof.Note that ξ = J − EΦ, we have: Similarly to Proof 1, we found that: □

Simulation
In this section, we present a simulation study in order to illustrate the performance of our proposed estimation procedure of ϑ = (β ′ τ , Φ ′ , σ) ′ under different scenarios.This study was performed based on the QR model with GARMA errors, designed with two covariates each measuring n samples.The response variable, y t , is generated from the following equation: The simulation scenario considers the following setting: The sample size (n) is fixed at n = 50, n = 100, and n = 200 and combinations of the vector of the true values for the parameters stated as η = −0.9,d = −0.4,ϕ = −0.5, β = (β 1 , β 2 ) ′ = (3, 2) ′ σ = 1, including different degrees of asymmetry considered by choosing τ = 0.25, 0.5, and 0.75 as the quantile points for estimation, with 500 Monte Carlo replications for each n.
The covariate values for x t = (x t,1 , x t,2 ) ′ , t = 1, 2, . . ., n are i.i.d random vectors generated from the standard normal distribution.The data for the error term, ϵ t , are also generated by taking the different distributions, density-Gaussian, N (0, 1), density function for Student distribution with k = 3 degrees of freedom (t 3 ), and the density function for the skewed t distribution with k = 5 degrees of freedom (st 5 ).The performance and recovery of the estimators are assessed by the bias and Root Mean Square Error (RMSE) to ensure that the parameters estimated have small bias and small variance, which are stated, respectively, by: where β τ and β τ (h) are the true parameter value and its respective h − th maximumlikelihood estimate using the EM algorithm.The Monte Carlo simulation experiments were performed using the R software (version 4.4.1);see www.r-project.org,accessed on 31 October 2023.
The results derived from simulation studies are presented in Tables 1-3.A look at the results in all these tables allows us to conclude that, as expected, in general, as the sample size increases, the bias and RMSE of the estimators decrease for all values of quantile points (τ) considered.Moreover, β 1 , β 2 , and ϕ all seem to be consistent and asymptotically normally distributed.

Real Data Example
In this work, to evaluate our proposed quantile-regression model with GARMA errors formulated in this paper, we used a dataset called Engel (Engel food expenditure data), sourced from the R package quantreg.Koenker and Bassett [27] utilized this dataset to exemplify the application of quantile regression in R using authentic, real-world data.
In Figure 1, displayed below, Koenker and Bassett show the scatter plot of the original data on double log axes.Fitted regression quartile lines are superimposed on the same figure.The estimated slope parameters (Engel elasticities), 0.8358, 0.8326, 0.8780, and 0.9170, for the quartiles corresponding to the 20th, 40th, 60th, and 80th percentiles, respectively.
It is interesting to note in this example that the fitted quartile lines indicate an increasing conditional scale effect.To investigate the influence of the GARMA error model's "significance" in quantileregression models, we suggest the following approach with the GARMA errors expressed as: y t = β + x 1,t β 1 + ϵ t , t = 1,2,...,n, where: • y t is the food expenditure of observation t; • x 1,t is the income of observation t; • d = 0.35 and η = −0.8.
Our methodology commences by estimating the parameters β, β 1 , ϕ 1 , and ϕ 2 at the quartiles 0.2, 0.4, 0.6, and 0.8 via our method described in Section 2 with the R software (version 4.4.1).Table 4 shows the estimated coefficients (Estimate) and the corresponding predictions at four quantile levels, τ : 0.2, 0.4, 0.6, and 0.8.Upon examination of Table 4, the estimates associated with the covariate "income" consistently escalate as τ increases, in accordance with our expectations.Moreover, the values of prediction are close to the true value, which is 750.32, and tend to increase as τ increases, as expected.
This closeness to the true value not only validates the efficacy of our method but also underscores its reliability in making accurate predictions.Moreover, akin to the estimates, the predicted values also manifest a tendency to increase with escalating τ, a phenomenon that aligns seamlessly with our anticipated behavior of the model.

Conclusions
In this article, we have used the maximum-likelihood estimation method and the EM algorithm to estimate the unknown parameters of the quantile regression model with Generalized Autoregressive Moving Average error.To asses the performance of our estimators, we conducted a simulation study.The results were promising, demonstrating that our parameter estimates closely approximate the true values.Additionally, we provide a practical illustration using real data.

Table 1 .
The estimation results for the Standard Normal distribution (out of 500 replications).

Table 2 .
The estimation results for the Student (t 3 ) distribution (out of 500 replications).

Table 3 .
The estimation results for the Skew Student (st 5 ) distribution (out of 500 replications).

Table 4 .
Estimated parameters for Engel data with expectation-maximization algorithm.