Nonparametric Estimation of a Conditional Quantile Function in a Fixed Effects Panel Data Model

This paper develops a nonparametric method to estimate a conditional quantile function for a panel data model with additive individual fixed effects. The proposed method is easy to implement, does not require numerical optimization, and automatically ensures quantile monotonicity by construction. Monte Carlo simulations show that the proposed estimator performs well in finite samples.


Introduction
The use of nonparametric techniques to estimate econometric models has received increasing attention among econometricians in recent decades (see, for example, Pagan and Ullah (1999); Hall et al. (2007); Belloni et al. (2016); Lin et al. (2015); Li et al. (2013); Firpo et al. (2009) and Firpo et al. (2018) for the literature on nonparametric methods and applications). The most popular nonparametric model is the conditional mean regression model. However, compared with a conditional mean function, a conditional quantile regression function, when evaluated at different quantiles, can reveal the entire distributional relationship between the covariates and the response variable. Quantile regression therefore has many useful applications in economics and finance. For example, in risk and financial management, researchers are more concerned about the uncertainty or the risk of an asset, which can be characterized by its left tail behavior (corresponding to the lower quantiles) (see Al Rahahleh and Bhatti (2017); Al Rahahleh et al. (2017); Nguyen and Bhatti (2015); Al Rahahleh et al. (2016); Bartram et al. (2018); Al Shubiri and Jamil (2018) for the literature on idiosyncratic risk), and quantile regression can play an important role in this line of research.
The existing work on nonparametric estimation of quantile functions mostly focuses on cross-sectional data or weakly dependent stationary processes. Nonparametric estimation of conditional quantile functions with panel data is more difficult when there exists a fixed effects term that is correlated with the covariates. In this paper, we consider the following nonparametric panel data model with individual fixed effects:

$$Y_{it} = m(X_{it}) + \alpha_i + \epsilon_{it}, \quad i = 1, \ldots, N; \; t = 1, \ldots, T, \qquad (1)$$

where $Y_{it}$ is the outcome variable, $X_{it}$ is a scalar$^{1}$, $\alpha_i$ is the individual fixed effect, which has zero mean and is allowed to be correlated with $X_{it}$ in an unknown form, $m(\cdot)$ is a smooth but otherwise unspecified function, and the idiosyncratic error $\epsilon_{it}$ is i.i.d. with zero mean and finite variance. Without loss of generality, we assume that $E[m(X_{it})] = 0$.$^{2}$ A key attractive feature of panel data for empirical researchers is that it allows one to control for unobserved heterogeneity. Equation (1) has been discussed in Henderson et al. (2008), with a focus on nonparametric estimation and testing of the conditional mean function. Our interest lies in estimating the conditional quantile function of $Y_{it} - \alpha_i = m(X_{it}) + \epsilon_{it}$ given $X_{it} = x$. The application of quantile regression to the panel data framework has been a challenging task (see, for example, Koenker (2004); Abrevaya and Dahl (2008); Kato et al. (2012); Harding and Lamarche (2014)). The check-function method and the inverse-CDF method are the two main methods in quantile regression analysis, with the former the most widely used in the literature. One main challenge with the check-function method is that the objective function is non-differentiable, so numerical optimization is required. This creates a computational burden. Another drawback of the check-function method is the lack of monotonicity, also known as the quantile crossing problem (see Bassett and Koenker (1982) and He (1997)). Researchers often need to impose shape restrictions or use monotone rearrangement to address the quantile crossing problem (Chernozhukov et al. (2010); Qu and Yoon (2015)).
This paper develops a new quantile regression method for the nonparametric panel data model in Equation (1), in the spirit of Fang et al. (2018)$^{3}$. The new method exploits the location-scale structure of Equation (1). Note that the conditional $\tau$-th quantile function of $Y_{it} - \alpha_i$ given $X_{it} = x$, denoted by $q_\tau(x)$, takes a particularly simple closed form:

$$q_\tau(x) = m(x) + Q_\epsilon(\tau) \qquad (2)$$

for all $\tau \in (0, 1)$, where $Q_\epsilon(\tau)$ is the $\tau$-th quantile of $\epsilon_{it}$$^{4}$. Thus, if $\hat m(x)$ is an estimator of $m(x)$, then $q_\tau(x)$ can be estimated by

$$\hat q_\tau(x) = \hat m(x) + \hat Q_\epsilon(\tau), \qquad (3)$$

where $\hat Q_\epsilon(\tau)$ is the empirical quantile function of the (normalized) regression residuals.
For estimation, we first use the first-difference transformation to remove the individual fixed effect $\alpha_i$ and estimate the unknown function $m(\cdot)$ by the series method. We then use the deconvolution method to back out the distribution of the error term $\{\epsilon_{it}\}$, and hence the quantile estimator of $\epsilon_{it}$. Finally, we exploit the location-scale structure of the first-differenced model to derive the quantile estimator of $Y_{it} - \alpha_i$, which is given in Equation (3). The deconvolution step is closely related to Horowitz and Markatou (1996) and Evdokimov (2010), who apply the deconvolution method to recover the density of the panel data error term. Our approach does not require numerical optimization, is computationally easy to implement, and automatically ensures quantile monotonicity by construction. As for the asymptotic properties of the conditional quantile estimator, as long as the series estimator $\hat m(x)$ and $\hat Q_\epsilon(\tau)$ are consistent$^{5}$, the conditional quantile estimator $\hat q_\tau(x)$ is also consistent by Equation (3) and the continuous mapping theorem. While we do not provide theoretical underpinnings for the proposed quantile estimator, Monte Carlo simulation results show that the estimator performs well in finite samples.

1 For ease of exposition, we assume $X_{it}$ is univariate; the extension to the multivariate case is straightforward.
2 This can be achieved by demeaning the dependent variable, i.e., replacing $Y_{it}$ by $Y_{it} - (NT)^{-1} \sum_{j=1}^{N} \sum_{s=1}^{T} Y_{js}$ in Equation (1). For notational simplicity, we still use $Y_{it}$ to denote the dependent variable although it is actually the demeaned version.
3 Recently, Fang et al. (2018) proposed a new nonparametric method for estimating a conditional quantile function with cross-sectional data. We refer readers to Fang et al. (2018) for a detailed discussion.
The remainder of the paper is organized as follows. Section 2 gives a detailed description of the methodology. Section 3 presents a Monte Carlo simulation to examine the finite-sample performance of the proposed quantile estimator. Section 4 considers an extension where the error is heteroskedastic. Section 5 concludes the paper.

Methodology
In this section, we describe the three-step procedure to estimate the conditional quantile function q τ (x).
STEP 1. Use the first-difference to get rid of individual fixed effects and estimate m(•) by the nonparametric series method.
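First differencing Equation (1), we have that

$$Y_{it} - Y_{i,t-1} = m(X_{it}) - m(X_{i,t-1}) + u_{it}, \qquad (4)$$

where $u_{it} \equiv \epsilon_{it} - \epsilon_{i,t-1}$. Note that, regardless of whether one uses a demeaned dependent variable, one obtains the same first-differenced Equation (4), because any additive constant is wiped out by the first-difference transformation. Let $P_K(x) = (p_1(x), \ldots, p_K(x))'$ denote a $K \times 1$ vector of basis functions, where $K$ is the number of basis functions. For example, we may choose power series basis functions, so that the elements of $P_K(x)$ are powers of $x$, or we can choose spline basis functions. By the approximation property of series basis functions, there exists a $K \times 1$ vector of constants $\beta$ such that $m(x) \approx P_K(x)'\beta$. In practice, one can estimate $\beta$ by the least squares method based on

$$Y_{it} - Y_{i,t-1} = [P_K(X_{it}) - P_K(X_{i,t-1})]'\beta + \text{error}_{it}. \qquad (5)$$

We estimate $\beta$ by applying OLS to Equation (5), yielding

$$\hat\beta = (\Delta P' \Delta P)^{-1} \Delta P' \Delta Y,$$

where $\Delta Y$ is the $N(T-1) \times 1$ vector of first-differenced outcome variables, and $\Delta P$ is the $N(T-1) \times K$ matrix whose rows are $[P_K(X_{it}) - P_K(X_{i,t-1})]'$. We therefore obtain the series estimator of $m(X_{it})$: $\hat m(X_{it}) = P_K(X_{it})'\hat\beta$.

STEP 2. Let $f_\epsilon(\cdot)$ denote the density of $\epsilon_{it}$. In this step, we use the deconvolution method to recover $f_\epsilon(\cdot)$.

From Step 1, one can obtain the estimator of $u_{it}$: $\hat u_{it} = Y_{it} - Y_{i,t-1} - [\hat m(X_{it}) - \hat m(X_{i,t-1})]$. To see how the density of $\epsilon_{it}$ can be estimated, let $\phi_u(t) = E(e^{\iota t u_{it}})$ and $\phi_\epsilon(t) = E(e^{\iota t \epsilon_{it}})$ denote the characteristic functions of $u_{it}$ and $\epsilon_{it}$, respectively, where $\iota = \sqrt{-1}$. Assume that the distribution of $\epsilon_{it}$ is such that $\phi_\epsilon(t)$ is real and positive for all $t \in \mathbb{R}$. Then, it is easy to see that

$$\phi_u(t) = E\left[e^{\iota t(\epsilon_{it} - \epsilon_{i,t-1})}\right] = E\left[e^{\iota t \epsilon_{it}} e^{-\iota t \epsilon_{i,t-1}}\right] = E\left[e^{\iota t \epsilon_{it}}\right] E\left[e^{-\iota t \epsilon_{i,t-1}}\right] = [\phi_\epsilon(t)]^2,$$

where in the third equality we use the independence of $\epsilon_{it}$ and $\epsilon_{i,t-1}$, and the fourth equality uses the symmetry of $\epsilon_{i,t-1}$. Therefore,

$$\phi_\epsilon(t) = \sqrt{\phi_u(t)}. \qquad (6)$$

We propose the following steps to obtain the density estimate of $\epsilon_{it}$:

(1) Estimate $\phi_u(t)$ by

$$\hat\phi_u(t) = \frac{1}{N(T-1)} \sum_{i=1}^{N} \sum_{s=2}^{T} e^{\iota t \hat u_{is}}. \qquad (7)$$

(2) By Equations (6) and (7), we estimate $\phi_\epsilon(t)$ by $\hat\phi_\epsilon(t) = \sqrt{\hat\phi_u(t)}$.

(3) By the deconvolution method, we estimate $f_\epsilon(\cdot)$ by

$$\hat f_\epsilon(e) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-\iota t e}\, \hat\phi_\epsilon(t)\, \Phi_k\!\left(\frac{t}{T_n}\right) dt, \qquad (8)$$

where $\Phi_k(t/T_n)$ is the Fourier transform of the kernel function $k(x) = \frac{\sin \pi x}{\pi x}$ with bandwidth $1/T_n$.

STEP 3. We estimate $Q_\epsilon(\tau)$ by $\hat Q_\epsilon(\tau)$ such that, for $\tau \in (0, 1)$, $\hat Q_\epsilon(\tau)$ satisfies the following condition:

$$\int_{-\infty}^{\hat Q_\epsilon(\tau)} \hat f_\epsilon(e)\, de = \tau.$$

Therefore, for $\tau \in (0, 1)$, the $\tau$-th conditional quantile estimator of $Y_{it} - \alpha_i$ given $X_{it} = x$ is $\hat q_\tau(x) = \hat m(x) + \hat Q_\epsilon(\tau)$, where $\hat m(x)$ and $\hat Q_\epsilon(\tau)$ are estimated in Steps 1 and 3, respectively.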
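As an illustration of Step 1, the following is a minimal NumPy sketch of the first-difference series estimator. It is a sketch under stated assumptions rather than the authors' implementation: the function names (`power_basis`, `series_fd_estimator`, `m_hat`), the choice of a power basis without a constant term, and the data generating process are all hypothetical and chosen only for illustration.

```python
import numpy as np

def power_basis(x, K):
    """Power series basis x, x^2, ..., x^K (no constant: it is differenced out)."""
    x = np.asarray(x, dtype=float)
    return np.stack([x ** k for k in range(1, K + 1)], axis=-1)

def series_fd_estimator(Y, X, K):
    """OLS of first-differenced Y on first-differenced basis functions.

    Y and X are (N, T) arrays. Returns beta_hat with shape (K,) and the
    first-difference residuals u_hat with shape (N, T - 1).
    """
    P = power_basis(X, K)                    # shape (N, T, K)
    dP = P[:, 1:, :] - P[:, :-1, :]          # shape (N, T-1, K)
    dY = Y[:, 1:] - Y[:, :-1]                # shape (N, T-1)
    beta_hat, *_ = np.linalg.lstsq(dP.reshape(-1, K), dY.reshape(-1), rcond=None)
    u_hat = dY - dP @ beta_hat               # estimates of eps_it - eps_{i,t-1}
    return beta_hat, u_hat

def m_hat(x, beta_hat):
    """Series estimate of m at the points x."""
    return power_basis(x, beta_hat.size) @ beta_hat

# Hypothetical DGP (an assumption, not the design used in the simulations).
rng = np.random.default_rng(0)
N, T = 400, 10
alpha = rng.standard_normal((N, 1))               # individual fixed effect
X = alpha + rng.uniform(-1.0, 1.0, size=(N, T))   # X is correlated with alpha
eps = 0.5 * rng.standard_normal((N, T))
Y = X - 0.2 * X ** 3 + alpha + eps                # m(x) = x - 0.2 x^3

beta_hat, u_hat = series_fd_estimator(Y, X, K=3)
print(beta_hat)   # should be close to (1, 0, -0.2)
```

Because the fixed effect drops out of the differenced regression, the correlation between `alpha` and `X` in this design does not bias the coefficient estimates. The residuals `u_hat` are the input to the deconvolution step.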
Remark 1. In series estimation, $K/(NT)$ plays a role similar to the bandwidth $h$ in kernel methods. In practice, one can use Mallows's $C_L$ or leave-one-out cross-validation to determine the number of series terms $K$. We refer readers to Li and Racine (2007) for details.
Remark 2. Note that, in Step 2, assuming that $\phi_\epsilon(t)$ is real is equivalent to assuming that the density of $\epsilon_{it}$ is symmetric around 0. We use the assumption that $\phi_\epsilon(t)$ is positive in deriving Equation (6).
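The role of symmetry and of the square-root relation in Step 2 can be checked numerically. The sketch below assumes standard normal errors (an illustrative choice) and uses a hypothetical helper `ecf` for the empirical characteristic function; negative values of the estimated characteristic function are clipped to zero before taking the square root, which is an assumption of this sketch.

```python
import numpy as np

def ecf(z, t):
    """Empirical characteristic function of the sample z at the points t."""
    return np.exp(1j * np.outer(t, z)).mean(axis=1)

rng = np.random.default_rng(0)
eps = rng.standard_normal((50_000, 2))   # columns: eps_{i,t-1} and eps_it
u = eps[:, 1] - eps[:, 0]                # first-differenced error

t = np.linspace(-1.5, 1.5, 7)
phi_eps = ecf(eps[:, 1], t)              # nearly real, since eps is symmetric
phi_u = ecf(u, t)
# Recover phi_eps from phi_u by the square-root relation.
phi_eps_from_u = np.sqrt(np.clip(phi_u.real, 0.0, None))

print(np.abs(phi_eps.imag).max())                       # close to zero
print(np.abs(phi_eps_from_u - phi_eps.real).max())      # close to zero
```

For a symmetric error distribution the imaginary part of the characteristic function vanishes, so the square root of the characteristic function of the differenced error recovers that of the error itself.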
Remark 3. In Step 2, the smoothing parameter $T_n$ depends on the sample size $n = NT$. To guarantee that $\hat\phi_\epsilon(t)$ converges uniformly to $\phi_\epsilon(t)$ over $[-T_n, T_n]$ at a geometric rate with respect to the sample size $n$, Hu and Ridder (2010) suggest a choice of $T_n$ that depends on $n$ and a constant $c > 0$.
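To illustrate how the smoothing parameter enters Steps 2 and 3, the following sketch recovers the error density from first-differenced errors and inverts the implied distribution function. It makes several assumptions that are not taken from the paper: the kernel factor is treated as a simple truncation of the integral to $[-T_n, T_n]$, negative values of the estimated characteristic function and density are clipped to zero, $T_n$ is passed in directly rather than computed from a rule, and the function names and simulated data are hypothetical.

```python
import numpy as np

def deconv_density(u_hat, grid, T_n, n_t=401):
    """Density of eps on `grid`, recovered from u = eps_t - eps_{t-1}."""
    t = np.linspace(-T_n, T_n, n_t)
    # Real part of the empirical characteristic function of u.
    phi_u = np.cos(np.outer(t, u_hat)).mean(axis=1)
    # Square-root relation; negative estimates are set to zero (assumption).
    phi_eps = np.sqrt(np.clip(phi_u, 0.0, None))
    # Trapezoid weights for the integral over [-T_n, T_n].
    w = np.full(n_t, t[1] - t[0])
    w[[0, -1]] *= 0.5
    # phi_eps is real and even in t, so only the cosine part contributes.
    f = (np.cos(np.outer(grid, t)) * phi_eps) @ w / (2.0 * np.pi)
    return np.clip(f, 0.0, None)

def quantile_from_density(f, grid, taus):
    """Invert the normalized cumulative sum of f; nondecreasing in tau."""
    cdf = np.cumsum(f)
    cdf = cdf / cdf[-1]
    idx = np.searchsorted(cdf, taus)
    return grid[np.clip(idx, 0, grid.size - 1)]

# Hypothetical example with standard normal errors.
rng = np.random.default_rng(0)
eps = rng.standard_normal((500, 10))
u = np.diff(eps, axis=1).ravel()
grid = np.linspace(-5.0, 5.0, 1001)
taus = np.array([0.1, 0.25, 0.5, 0.75, 0.9])

f_by_Tn = {T_n: deconv_density(u, grid, T_n) for T_n in (1.0, 2.0)}
q_hat = quantile_from_density(f_by_Tn[2.0], grid, taus)
print({T_n: f[500] for T_n, f in f_by_Tn.items()})   # density estimate at e = 0
print(q_hat)
```

A smaller $T_n$ discards more of the characteristic function and yields a flatter density estimate, which is the kind of bandwidth sensitivity discussed in the simulation section. Because the quantiles are read off a nondecreasing cumulative sum, they cannot cross.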
Remark 4. For inference, we recommend using a residual bootstrap method similar to that of Fang et al. (2018). We leave the proof of the validity of such a bootstrap procedure for future research.

Monte Carlo Simulation
In this section, we conduct Monte Carlo simulations to assess the performance of the proposed conditional quantile estimator.
We consider a data generating process (DGP) of the form of Equation (1), where the error $\epsilon_{it}$ is drawn from either $N(0, 1)$ or $t(3)$ (a t-distribution with 3 degrees of freedom). We conduct 2000 Monte Carlo replications for samples of size $N = 100, 200, 400$ with $T = 10$. We report the mean squared error (MSE) of three estimators: (1) the series estimator $\hat m(x)$, (2) the quantile estimator $\hat Q_\epsilon(\tau)$, and (3) the conditional quantile estimator $\hat q_\tau(x)$. Each of the three quantities is averaged over the 2000 replications.
We first examine the performance of the deconvolution method for recovering the density of the error terms. As an illustration, we only present the result (Figure 1) for the case of $\epsilon_{it} \sim N(0, 1)$, with sample size $N = 100$, $T = 10$. We examine the sensitivity of the estimated density to the choice of bandwidth. We set $c = 1$ and $\gamma = 1/8, 3/16, 1/4, 3/8$. It can be seen from Figure 1 that the performance of the deconvolution method can be somewhat sensitive to the choice of bandwidth. This is a well-known problem of the deconvolution method and is not specific to our approach. When $\gamma$ is small, say $\gamma = 1/8$, the estimated density is flatter than the true density. In general, however, the estimated density tracks the true density$^{6}$.
Tables 1 and 2 report the mean MSE of $\hat m$, $\hat Q_\epsilon(\tau)$ and $\hat q_\tau$. It can be seen that, as the sample size doubles, the MSEs of $\hat m$, $\hat Q_\epsilon(\tau)$, and $\hat q_\tau$ decrease by about one half, which indicates that the proposed estimator behaves well.
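The halving of the MSE as the sample size doubles can be illustrated for the series estimator with a small simulation. The sketch below is not a replication of the reported tables: the DGP, the evaluation grid, the number of replications, and the function names are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_m_hat(Y, X, K):
    """First-difference series estimator; returns a function x -> m_hat(x)."""
    def basis(x):
        x = np.asarray(x, dtype=float)
        return np.stack([x ** k for k in range(1, K + 1)], axis=-1)
    P = basis(X)
    dP = (P[:, 1:] - P[:, :-1]).reshape(-1, K)
    dY = (Y[:, 1:] - Y[:, :-1]).reshape(-1)
    beta, *_ = np.linalg.lstsq(dP, dY, rcond=None)
    return lambda x: basis(x) @ beta

def mean_mse(N, T=10, K=3, reps=400, seed=0):
    """Average over replications of the MSE of m_hat on a fixed grid."""
    rng = np.random.default_rng(seed)
    x_eval = np.linspace(-1.0, 1.0, 41)
    m_true = x_eval - 0.2 * x_eval ** 3           # hypothetical m(.)
    out = np.empty(reps)
    for r in range(reps):
        alpha = rng.standard_normal((N, 1))       # fixed effect
        X = alpha + rng.uniform(-1.0, 1.0, size=(N, T))
        Y = X - 0.2 * X ** 3 + alpha + rng.standard_normal((N, T))
        out[r] = np.mean((fit_m_hat(Y, X, K)(x_eval) - m_true) ** 2)
    return out.mean()

mse = {N: mean_mse(N) for N in (100, 200, 400)}
ratios = (mse[100] / mse[200], mse[200] / mse[400])
print(mse)
print(ratios)   # each ratio should be roughly 2
```

With the number of series terms held fixed, the variance of the least squares coefficients is proportional to one over the number of observations, which is why each doubling of $N$ should roughly halve the MSE in this design.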

6 There is no rule of thumb for choosing the optimal bandwidth in the deconvolution method. In practice, researchers can try different bandwidths as a robustness check to see how the results vary across bandwidths.

Table 1. Mean MSE (×100), N(0, 1) Errors. Columns: Sample Size (N, T); Estimators.

Table 2. Columns: Sample Size (N, T); Estimators.
Extension: Conditional Heteroskedastic Error Case
In this section, we consider an extension where the error term is conditionally heteroskedastic. Specifically, we generalize Equation (1) to the following case$^{7}$:

$$Y_{it} = m(X_{it}) + \alpha_i + \sigma(X_{it})\,\eta_{it},$$

where $\sigma(X_{it}) > 0$ is an unknown function, and $\eta_{it}$ is assumed to be i.i.d. with zero mean and unit variance and independent of $\{X_{js}\}_{j=1,\ldots,N;\; s=1,\ldots,T}$. Without loss of generality, we assume that $E[m(X_{it})] = 0$ (as in the conditionally homoskedastic case). Define $\epsilon_{it} \equiv \sigma(X_{it})\eta_{it}$. The conditional $\tau$-th quantile function of $Y_{it} - \alpha_i$ given $X_{it} = x$, denoted by $q_\tau(x)$, takes the following closed form:

$$q_\tau(x) = m(x) + Q_{\epsilon|X=x}(\tau) \qquad (10)$$

for all $\tau \in (0, 1)$, where $Q_{\epsilon|X=x}(\tau) = \sigma(x) Q_\eta(\tau)$, and $Q_\eta(\tau)$ is the (unconditional) $\tau$-th quantile of $\eta_{it}$.
Remark 5. In deriving Equation (10), we use the fact that $Q_{\epsilon|X=x}(\tau) = \sigma(x) Q_\eta(\tau)$, because $\sigma(x) > 0$ and $X_{it}$ and $\eta_{it}$ are independent of each other.
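The scaling property used in Remark 5 can be verified numerically. In the sketch below, the functions `m` and `sigma`, the evaluation point `x0`, and the standard normal distribution for $\eta_{it}$ are hypothetical choices for illustration only.

```python
import numpy as np

def m(x):
    """Hypothetical regression function."""
    return np.sin(x)

def sigma(x):
    """Hypothetical scale function; strictly positive for all x."""
    return 1.0 + 0.5 * x ** 2

rng = np.random.default_rng(1)
eta = rng.standard_normal(200_000)     # draws of eta, independent of X
taus = np.array([0.1, 0.5, 0.9])
x0 = 0.8

# Closed form: m(x) + sigma(x) * Q_eta(tau).
q_closed = m(x0) + sigma(x0) * np.quantile(eta, taus)
# Direct computation: quantiles of m(x) + sigma(x) * eta at the same x.
q_direct = np.quantile(m(x0) + sigma(x0) * eta, taus)

print(q_closed)
print(q_direct)
```

Because multiplying by a positive constant and adding a constant preserve the ordering of the draws, the two computations agree up to floating-point rounding, and the resulting quantiles are increasing in $\tau$. The positivity of $\sigma(x)$ matters: a negative scale would reverse the ordering.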
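Remark 6. Note that, due to the independence between $X_{it}$ and $\eta_{js}$, $\epsilon_{it}$ and $\epsilon_{i,t-1}$ are conditionally independent given $(X_{it}, X_{i,t-1})$.

We propose the following three-step procedure to estimate the conditional quantile function of $Y_{it} - \alpha_i$ given $X_{it} = x$.

STEP 1. We obtain $\hat m(X_{it}) = P_K(X_{it})'\hat\beta$ by exactly the same procedure as in Step 1 of the conditionally homoskedastic error case.

STEP 2. We use the deconvolution method to estimate $f_{\epsilon_{it}|X_{it}=x}(\cdot)$, the conditional density of $\epsilon_{it}$ given $X_{it} = x$. Define $\Delta Y_{it} \equiv Y_{it} - Y_{i,t-1}$. Assuming that the density of $\eta_{it}$ is symmetric around zero,$^{8}$ and noting that $[m(X_{it}) - m(X_{i,t-1})]\,|_{X_{it}=x,\, X_{i,t-1}=x} = m(x) - m(x) = 0$, we have

$$\phi_{\Delta Y_{it}}(s|x) \equiv E\left[e^{\iota s \Delta Y_{it}} \mid X_{it} = x, X_{i,t-1} = x\right] = E\left[e^{\iota s(\epsilon_{it} - \epsilon_{i,t-1})} \mid X_{it} = x, X_{i,t-1} = x\right] = E\left[e^{\iota s \epsilon_{it}} e^{-\iota s \epsilon_{i,t-1}} \mid X_{it} = x, X_{i,t-1} = x\right] = E\left[e^{\iota s \epsilon_{it}} \mid X_{it} = x\right] E\left[e^{-\iota s \epsilon_{i,t-1}} \mid X_{i,t-1} = x\right] = [\phi_{\epsilon_{it}}(s|x)]^2, \qquad (11)$$

where $\iota = \sqrt{-1}$, the third equality uses the conditional independence property described in Remark 6, and the fourth equality uses the symmetry of $\epsilon_{i,t-1} \mid X_{i,t-1} = x$, which equals $\sigma(x)\eta_{i,t-1}$ with $\eta_{i,t-1}$ symmetric around zero.

Under the assumption that $\phi_{\epsilon_{it}}(s|x)$ is positive, Equation (11) implies that $\phi_{\epsilon_{it}}(s|x) = \sqrt{\phi_{\Delta Y_{it}}(s|x)}$. The left-hand side of Equation (11) can be estimated from data; we denote the resulting estimator by $\hat\phi_{\Delta Y_{it}}(s|x)$. Therefore, we estimate $\phi_{\epsilon_{it}}(s|x)$ by $\hat\phi_{\epsilon_{it}}(s|x) = \sqrt{\hat\phi_{\Delta Y_{it}}(s|x)}$. Let $f_{\epsilon_{it}|X_{it}=x}(\cdot)$ denote the conditional density of $\epsilon_{it} = \sigma(X_{it})\eta_{it}$ given $X_{it} = x$. Then, using the deconvolution method as in the homoskedastic case, one can recover $f_{\epsilon_{it}|X_{it}=x}(\cdot)$ from $\hat\phi_{\epsilon_{it}}(s|x)$ as in Equation (8). We use $\hat f_{\epsilon_{it}|X_{it}=x}(\cdot)$ to denote the resulting estimator.

STEP 3. Analogously to Step 3 of the homoskedastic case, we estimate $Q_{\epsilon|X=x}(\tau)$ by $\hat Q_{\epsilon|X=x}(\tau)$, the value that satisfies $\int_{-\infty}^{\hat Q_{\epsilon|X=x}(\tau)} \hat f_{\epsilon_{it}|X_{it}=x}(e)\, de = \tau$. By Equation (10), the $\tau$-th conditional quantile of $Y_{it} - \alpha_i$ given $X_{it} = x$ is estimated by $\hat q_\tau(x) = \hat m(x) + \hat Q_{\epsilon|X=x}(\tau)$, $\tau \in (0, 1)$, where $\hat m(x) = P_K(x)'\hat\beta$ is obtained in Step 1, and $\hat Q_{\epsilon|X=x}(\tau)$ is obtained in Step 3.

7 Fang et al. (2018) also consider the same form of heteroskedastic error as described here.
8 This implies that the conditional density of $\epsilon_{it}$ given $X_{it} = x$ is symmetric, since, given that $\epsilon_{it} \mid X_{it} = x$ equals $\sigma(x)\eta_{it}$, the symmetry of $\eta_{it}$ is equivalent to the symmetry of $\epsilon_{it}$.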

Conclusions
In this paper, we propose an easy-to-implement nonparametric method to estimate conditional quantile functions in a fixed effects panel data model. There are many directions in which one can extend the results of this paper to more general settings. For example, one can allow for non-stationary panel data as considered in Chen and Khan (2008), or allow the covariate $X_{it}$ to be endogenous. We leave these as possible future research topics.

Figure 1. Recovered densities across different bandwidths with homoskedastic symmetric normal errors.