Article

Interpretation and Semiparametric Efficiency in Quantile Regression under Misspecification

Department of Economics, University of Oxford, Manor Road Building, Manor Road, Oxford OX1 3UQ, UK
Econometrics 2016, 4(1), 2; https://doi.org/10.3390/econometrics4010002
Submission received: 4 October 2015 / Revised: 27 November 2015 / Accepted: 1 December 2015 / Published: 24 December 2015
(This article belongs to the Special Issue Quantile Methods)

Abstract
Allowing for misspecification in the linear conditional quantile function, this paper provides a new interpretation of, and the semiparametric efficiency bound for, the quantile regression parameter $\beta(\tau)$ in Koenker and Bassett (1978). The first result, on interpretation, shows that under a mean-squared loss function, the probability limit of the Koenker–Bassett estimator minimizes a weighted distribution approximation error, defined as $F_Y(X'\beta(\tau)|X) - \tau$, i.e., the deviation of the conditional distribution function, evaluated at the linear quantile approximation, from the quantile level. The second result implies that the Koenker–Bassett estimator semiparametrically efficiently estimates the quantile regression parameter that produces parsimonious descriptive statistics for the conditional distribution. Therefore, quantile regression shares the attractive features of ordinary least squares: interpretability and semiparametric efficiency under misspecification.

1. Introduction

This paper revisits the approximation properties of the linear quantile regression under misspecification ([1,2,3]). The quantile regression estimator, introduced by the seminal paper of Koenker and Bassett [4], offers parsimonious summary statistics for the conditional quantile function and is computationally tractable. Since the development of the estimator, researchers have frequently used quantile regression, in conjunction with ordinary least squares regression, to analyse how the outcome variable responds to the explanatory variables. For example, to model wage structure in labour economics, Angrist, Chernozhukov, and Fernández-Val [1] study returns to education at different points in the wage distribution and changes in inequality over time. A thorough review of recent developments in quantile regression can be found in [5]. The object of interest of this paper is the quantile regression (QR) parameter that is the probability limit of the Koenker–Bassett estimator without assuming the true conditional quantile function to be linear. Two results are presented: a new interpretation and the semiparametric efficiency bound for the QR parameter.
The topic of interest is the conditional distribution function (CDF) of a continuous response variable Y given the regressor vector X, denoted $F_Y(y|X)$. An alternative to the CDF is the conditional quantile function (CQF) of Y given X, defined as $Q_\tau(Y|X) := \inf\{y : F_Y(y|X) \ge \tau\}$ for any quantile index $\tau \in (0,1)$. Assuming integrability, the CQF minimizes the check loss function
$$Q_\tau(Y|X) \in \arg\min_{q \in \mathcal{Q}} E\left[\rho_\tau\left(Y - q(X)\right)\right],$$
where $\mathcal{Q}$ is the set of measurable functions of X, $\rho_\tau(u) = u(\tau - 1\{u \le 0\})$ is known as the check function and $1\{\cdot\}$ is the indicator function. A linear approximation to the CQF is provided by the QR parameter $\beta(\tau)$, which solves the population minimization problem
$$\beta(\tau) := \arg\min_{b \in \mathbb{R}^d} E\left[\rho_\tau\left(Y - X'b\right)\right]$$
assuming the integrability and uniqueness of the solution, where d is the dimension of X. The QR parameter $\beta(\tau)$ provides a simple summary statistic for the CQF. The QR estimator introduced in [4] is the sample analogue
$$\hat{\beta}(\tau) \in \arg\min_{b \in \mathbb{R}^d} \frac{1}{n}\sum_{i=1}^n \rho_\tau\left(Y_i - X_i'b\right)$$
for the random sample $(Y_i, X_i, i \le n)$ on the random variables $(Y, X)$. By the equivalent first-order condition, this estimator $\hat{\beta}(\tau)$ is also the generalized method of moments (GMM) estimator based on the unconditional moment restriction ([6,7])
$$E\left[\left(\tau - 1\{Y \le X'\beta(\tau)\}\right)X\right] = 0.$$
This paper focuses on the population QR parameter defined by (1) or equivalently (3).
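The definitions in (1)–(3) can be sketched numerically. The snippet below is a minimal illustration on simulated data of my own choosing; Nelder–Mead is used only because the check loss is non-smooth and the dimension is small (production code would use a linear-programming solver). Part (i) shows that, with no covariates, minimizing the empirical check loss recovers the sample τ-quantile; part (ii) fits the Koenker–Bassett estimator and verifies the first-order condition (3) at the estimate.

```python
import numpy as np
from scipy.optimize import minimize

# Check function rho_tau(u) = u * (tau - 1{u <= 0}).
def rho(u, tau):
    return u * (tau - (u <= 0))

rng = np.random.default_rng(1)

# (i) With no covariates, minimizing the empirical check loss over constants q
# recovers the empirical tau-quantile.
y0 = rng.exponential(size=100_000)
grid = np.linspace(0.0, 5.0, 2001)
q_star = grid[np.argmin([rho(y0 - q, 0.75).mean() for q in grid])]
# q_star agrees with np.quantile(y0, 0.75) up to grid resolution.

# (ii) Koenker-Bassett estimator: minimize the empirical check loss over b.
def qr_fit(y, X, tau):
    loss = lambda b: rho(y - X @ b, tau).mean()
    b0 = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting value
    return minimize(loss, b0, method="Nelder-Mead",
                    options={"xatol": 1e-8, "fatol": 1e-12}).x

n = 20_000
x = rng.uniform(1.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + x * rng.standard_normal(n)     # true conditional median: 1 + 0.5 x
tau = 0.5
beta = qr_fit(y, X, tau)

# First-order condition (3): E[(tau - 1{Y <= X'b}) X] is ~ 0 at the estimate.
moment = ((tau - (y <= X @ beta))[:, None] * X).mean(axis=0)
print(q_star, beta, moment)
```

Here the conditional median is genuinely linear, so `beta` is close to (1, 0.5); the empirical moment vector is near zero up to the optimizer tolerance and the indicator's 1/n granularity.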
If the CQF is modelled to be linear in the covariates, $Q_\tau(Y|X) = X'\beta(\tau)$ or, equivalently, $F_Y(X'\beta(\tau)|X) = \tau$, the coefficient $\beta(\tau)$ satisfies the conditional moment restriction
$$E\left[\tau - 1\{Y \le X'\beta(\tau)\} \,\middle|\, X\right] = 0$$
almost surely. In the theoretical and applied econometrics literature, this linear QR model is often assumed to be correctly specified. Nevertheless, a well-known crossing problem arises: the fitted quantile functions for different quantiles may cross at some values of X, except when $\beta(\tau)$ is the same for all τ. This violates the logical monotonicity requirement that $Q_\tau(Y|X)$, or its estimator, be weakly increasing in the probability index τ given X. The crossing problem for estimation can be treated by rearranging the estimator (for example, see [8] and the references therein). However, the crossing problem remains for the population CQF, suggesting that the linear QR model (4) is inherently misspecified. That is, there is no $\beta(\tau) \in \mathbb{R}^d$ satisfying the conditional moment (4) almost surely. Therefore, the parameter of interest in this paper is the QR parameter $\beta(\tau)$ defined by (1) or (3) without the linear CQF assumption in (4). We can view $\beta(\tau)$ as the pseudo-true value of the linear QR model under misspecification. As the Koenker–Bassett QR estimator is widely used, it is important to understand the approximation nature of the estimand.
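The crossing problem and the rearrangement fix can be illustrated in a few lines. The fitted coefficients below are invented for illustration (they come from no dataset); sorting the fitted quantiles in τ, pointwise in x, is the core of the rearrangement idea and restores monotonicity without refitting.

```python
import numpy as np

# Fitted linear quantile curves q_hat(tau | x) = b0(tau) + b1(tau) * x can
# cross in tau at some x.  Rearrangement (sorting the fitted values in tau,
# pointwise in x) restores monotonicity without refitting.
taus = np.array([0.10, 0.25, 0.50, 0.75, 0.90])
b0 = np.array([0.0, 0.4, 0.9, 1.3, 1.5])   # hypothetical intercepts
b1 = np.array([1.0, 0.8, 0.5, 0.3, 0.1])   # hypothetical slopes, decreasing in tau

x = 3.0
fitted = b0 + b1 * x              # not monotone in tau at this x
rearranged = np.sort(fitted)      # monotone in tau by construction
print(fitted, rearranged)
```

At x = 3 the hypothetical curves are strictly decreasing in τ, the worst-case violation; after sorting they are weakly increasing, as a quantile function must be.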
For the mean regression counterpart, ordinary least squares (OLS) consistently estimates the linear conditional expectation and minimizes mean-squared error loss for fitting the conditional expectation under misspecification. Chamberlain [9] proves the semiparametric efficiency of the OLS estimator, which provides additional justification for the widespread use of OLS. The attractive features of OLS, interpretability and semiparametric efficiency, under misspecification, motivate my investigation of parallel properties in QR. I study how this QR parameter approximates the CQF and the CDF and calculate its semiparametric efficiency bound.
The first contribution of this paper concerns how $\beta(\tau)$ minimizes the distribution approximation error, defined as $F_Y(X'\beta(\tau)|X) - \tau$, under a mean-squared loss function. The first-order condition (3) can be understood as the orthogonality condition between the covariates X and the distribution approximation error in the projection model. I show that the QR parameter $\beta(\tau)$ minimizes the mean-squared distribution approximation error, inversely weighted by the conditional density function $f_Y(X'\beta(\tau)|X)$. Angrist, Chernozhukov, and Fernández-Val [1] (henceforth ACF) show that $\beta(\tau)$ minimizes the mean-squared quantile specification error, defined as $Q_\tau(Y|X) - X'\beta(\tau)$, using a weight primarily determined by the conditional density. ACF's study, as well as my own results, suggests that QR approximates the CQF more accurately at points with more observations, but at such points the corresponding CDF evaluated at the approximation, $F_Y(X'\beta(\tau)|X)$, is more distant from the targeted quantile level τ. This trade-off is controlled by the conditional density, a feature distinct from OLS approximation of the conditional mean, because the distribution and quantile functions are generally nonlinear operators. This observation is novel and deepens the understanding of how QR summarizes the outcome distribution. A numerical example in Figure 1 in Section 4 illustrates this finding.
The second result is the semiparametric efficiency bound for $\beta(\tau)$. Chamberlain's results in [9] on mean regression based on differentiable moment restrictions cannot be applied to semiparametric efficiency for QR, due to the lack of differentiability of the moment function in (3). Although Ai and Chen [10] provide general results for sequential moment restrictions containing unknown functions, which could cover the quantile regression setting, I calculate the efficiency bound under regularity conditions tailored to the QR parameter $\beta(\tau)$, using the method of Severini and Tripathi [11]. It follows that the misspecification-robust asymptotic variance of the QR estimator $\hat{\beta}(\tau)$ in (2) attains this bound, which means no regular estimator for (3) has smaller asymptotic variance than $\hat{\beta}(\tau)$. This result might be expected for an M-estimator, but, to my knowledge, the QR application has not been demonstrated and discussed rigorously in any publication. Furthermore, I calculate the efficiency bounds for jointly estimating QR parameters at a finite number of quantiles for both the linear projection (3) and linear QR (4) models. Employing the widely-used method of Newey [12], Newey and Powell [13] find the semiparametric efficiency bound for $\beta(\tau)$ under the correctly-specified linear CQF in (4). Note that the efficiency bounds for (3) do not imply the bounds for (4); nor does the converse hold.
In Section 2, I discuss the interpretation of the misspecified QR model in terms of approximating the CDF and the CQF. The theorems for the semiparametric efficiency bounds are in Section 3. In Section 4, I discuss the parallel properties of QR and OLS. The paper is concluded by a review of some existing efficient estimators for linear projection model (3) and linear QR model (4).

2. Interpreting QR under Misspecification

Let Y be a continuous response variable and X a $d \times 1$ regressor vector. The quantile-specific residual is defined as the distance between the response variable and the CQF, $\varepsilon_\tau := Y - Q_\tau(Y|X)$, with conditional density $f_{\varepsilon_\tau}(e|X)$ at $\varepsilon_\tau = e$, or equivalently $f_Y(y|X)$ at $Y = y = e + Q_\tau(Y|X)$, for any $\tau \in (0,1)$. This is a semiparametric problem in the sense that the distribution functions of $\varepsilon_\tau$ and X, as well as the CQF, are unspecified and unrestricted other than by the following assumptions, which are standard in QR models. I assume the following regularity conditions, based on the conditions of Theorem 3 in ACF.
(R1)
$(Y_i, X_i, i \le n)$ are independent and identically distributed on the probability space $(\Omega, \mathcal{F}, P)$ for each n;
(R2)
the conditional density $f_Y(y|X=x)$ exists and is bounded and uniformly continuous in y, uniformly in x over the support of X;
(R3)
$J(\tau) := E\left[f_Y(X'\beta(\tau)|X)XX'\right]$ is positive definite for all $\tau \in (0,1)$, where $\beta(\tau)$ is uniquely defined in (1);
(R4)
$E\|X\|^{2+\epsilon} < \infty$ for some $\epsilon > 0$;
(R5)
$f_Y(X'\beta(\tau)|X)$ is bounded away from zero.
The identification of the pseudo-true parameter $\beta(\tau)$ is assumed in (R3). The bounded conditional density of the continuous response variable Y given X in (R2) is needed for the existence of the CQF for any $\tau \in (0,1)$. The uniform continuity guarantees the existence and differentiability of the distribution function, i.e., $dF_Y(y|X)/dy = f_Y(y|X)$ and $F_Y(y|X) = \int_{-\infty}^{y} f_Y(u|X)\,du$ with probability one. (R4) is used for the asymptotic normality of $\sqrt{n}(\hat{\beta}(\tau) - \beta(\tau))$. The covariates X are allowed to contain discrete components. (R5) guarantees that the objective function defined below in Equation (6) is finite for all $\beta \in \mathbb{R}^d$, where $\beta(\tau)$ is the parameter of interest uniquely defined by Equation (1).
The parameter of interest $\beta(\tau)$ equivalently solves
$$E\left[X\left(F_Y(X'\beta(\tau)|X) - \tau\right)\right] = 0$$
by applying the law of iterated expectations to Equation (3). Equation (5) states that X is orthogonal to the distribution approximation error $F_Y(X'\beta(\tau)|X) - \tau$. The following theorem interprets QR via a weighted mean-squared error loss function on the distribution approximation error.
Theorem 1. 
Assume (R1)–(R5). Then, $\bar{\beta}(\tau) = \beta(\tau)$ solves the equation
$$\bar{\beta}(\tau) = \arg\min_{b \in \mathbb{R}^d} E\left[f_Y(X'\bar{\beta}(\tau)|X)^{-1}\left(F_Y(X'b|X) - \tau\right)^2\right].$$
Furthermore, if $E\left[\left(f_Y(X'b|X) + \left(F_Y(X'b|X) - \tau\right) f_Y'(X'b|X)/f_Y(X'b|X)\right)XX'\right]$ is positive definite at $b = \beta(\tau)$, then $\bar{\beta}(\tau) = \beta(\tau)$ is the unique solution to this problem (6).
Proof of Theorem 1. 
The objective function in (6) is finite by the assumptions. Any fixed point $b = \bar{\beta}(\tau)$ solves the first-order condition $E\left[X\left(F_Y(X'b|X) - \tau\right)\right] = 0$. By the law of iterated expectations, (3) implies this first-order condition. Therefore, $\beta(\tau)$ solves (6). When the second-order condition holds, i.e., $E\left[\left(f_Y(X'b|X) + \left(F_Y(X'b|X) - \tau\right) f_Y'(X'b|X)/f_Y(X'b|X)\right)XX'\right]$ is positive definite at $b = \beta(\tau)$, $\beta(\tau)$ solves (6) uniquely.    ☐
Theorem 1 states that $\beta(\tau)$ is the unique fixed point of an iterated minimum-distance approximation whose weight is a function of X only. The mean-squared loss makes it clear how the linear function matches the CDF to the targeted probability of interest. The loss function puts more weight on points where the conditional density $f_Y(X'\beta(\tau)|X)$ is small. As a result, the distribution approximation error is smaller at points with smaller conditional density.
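The fixed-point property in Theorem 1 can be verified numerically. The DGP below is my own illustrative choice (it is not from the paper): Y given X = x is normal with mean $x^2$ and standard deviation x, and X is uniform on three points, so the true CQF is nonlinear and the linear model is misspecified. The code solves the orthogonality condition (5) for β(τ) and then checks that, with the weight $f_Y(X'\beta(\tau)|X)^{-1}$ held fixed at β(τ), the gradient of the weighted squared distribution error vanishes there.

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import norm

# Illustrative DGP (my own, not from the paper): Y | X = x ~ N(x**2, sd = x),
# X uniform on {1, 2, 3}.  The true CQF x**2 + x * z_tau is nonlinear in x,
# so the linear QR model is misspecified.
tau = 0.5
xs = np.array([1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(xs), xs])    # design points, equal probability

F = lambda y: norm.cdf((y - xs**2) / xs)       # conditional CDF at the design points
f = lambda y: norm.pdf((y - xs**2) / xs) / xs  # conditional density

# beta(tau) from the orthogonality condition (5): E[X (F_Y(X'b|X) - tau)] = 0.
moment = lambda b: (X * (F(X @ b) - tau)[:, None]).mean(axis=0)
beta = fsolve(moment, np.array([-3.3, 4.0]))

# Theorem 1: fixing the weight f_Y(X'beta(tau)|X)^{-1} at beta(tau), the gradient
# of E[w(X) (F_Y(X'b|X) - tau)^2] vanishes at b = beta(tau).
w = 1.0 / f(X @ beta)
grad = lambda b: (X * (2 * w * (F(X @ b) - tau) * f(X @ b))[:, None]).mean(axis=0)
print(beta, grad(beta))
```

At b = β(τ) the weight times the density equals one pointwise, so the gradient reduces to twice the orthogonality condition (5), which is zero by construction; away from β(τ) the two objects differ.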
Now, I discuss the approximation nature of QR based on the distribution approximation error and the quantile specification error. ACF interpret QR as the minimizer of a weighted mean-squared error loss function for the quantile specification error, defined as the deviation between the approximation point $X'\beta(\tau)$ and the true CQF $Q_\tau(Y|X)$:
$$\beta(\tau) = \arg\min_{\beta \in \mathbb{R}^d} E\left[\bar{w}_\tau(X, \beta(\tau))\left(X'\beta - Q_\tau(Y|X)\right)^2\right], \quad \text{where}$$
$$\bar{w}_\tau(X, \beta(\tau)) = \frac{1}{2}\int_0^1 f_{\varepsilon_\tau}\left(u\left(X'\beta(\tau) - Q_\tau(Y|X)\right) \,\middle|\, X\right) du.$$
ACF define $\bar{w}_\tau(X, \beta(\tau))$ in (8) as importance weights, given by the average conditional density of the response variable over the line connecting the approximation point $X'\beta(\tau)$ and the true CQF. ACF note that the regressors contribute disproportionately to the QR estimate and that the primary determinant of the importance weight is the conditional density.
Moreover, the first-order condition implied by (7), $E\left[\bar{w}_\tau(X, \beta(\tau))\,X\left(X'\beta(\tau) - Q_\tau(Y|X)\right)\right] = 0$, is a weighted orthogonality condition on the quantile specification error. A Taylor expansion provides intuition connecting the distribution approximation error and the quantile specification error: $f_Y(X'\beta|X)^{-1}\left(F_Y(X'\beta|X) - \tau\right)^2 \approx f_Y(X'\beta|X)\left(Q_\tau(Y|X) - X'\beta\right)^2$, using $f_Y(X'\beta|X) = f_{\varepsilon_\tau}(X'\beta - Q_\tau(Y|X)|X)$. This observation implies that the quantile specification error is smaller at points where the conditional density $f_Y(X'\beta|X)$ is larger; on the other hand, the distribution approximation error is larger at points with larger $f_Y(X'\beta|X)$. In comparison with OLS, where the mean operator is linear, the CDF and its inverse operator, the CQF, are generally nonlinear. The distribution approximation error can be interpreted as a distance after a nonlinear transformation by the CDF, $F_Y(X'\beta(\tau)|X) - F_Y(Q_\tau(Y|X)|X)$. A Taylor expansion linearizes the distribution function so that this distance becomes the quantile specification error multiplied by the conditional density. The conditional density thus plays a crucial role in weighting both the distribution approximation error and the quantile specification error. The above discussion provides additional insight into how the QR parameter approximates the CQF and fits the CDF to the targeted quantile level.
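The Taylor-expansion link between the two error measures is easy to check numerically. The conditional law below is an illustrative choice of mine (normal with mean $x^2$ and standard deviation x at a single point x = 1.5); for an approximation point close to the true conditional median, the inverse-density-weighted distribution error and the density-weighted quantile specification error nearly coincide.

```python
import numpy as np
from scipy.stats import norm

# Illustrative conditional law (my own choice): Y | X = x ~ N(x**2, sd = x) at x = 1.5.
tau, x = 0.5, 1.5
Q = x**2                                  # true conditional median
F = lambda y: norm.cdf((y - x**2) / x)    # conditional CDF
f = lambda y: norm.pdf((y - x**2) / x) / x  # conditional density

approx = Q + 0.05                         # approximation point X'b near Q
lhs = (F(approx) - tau) ** 2 / f(approx)  # inverse-density-weighted distribution error
rhs = f(approx) * (approx - Q) ** 2       # density-weighted quantile specification error
print(lhs, rhs)                           # nearly equal when X'b is close to Q
```

The agreement degrades as `approx` moves away from Q, exactly as a first-order Taylor expansion of the CDF would predict.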
Remark 1 
(Mean-squared loss under misspecification). The linear function $X'\beta(\tau)$ is the best linear approximation under the check loss function in (1). While $\beta(0.5)$ corresponds to least absolute deviations estimation, the QR parameter $\beta(\tau)$ for $\tau \ne 0.5$ is the best linear predictor for the response variable under the asymmetric loss function $\rho_\tau(\cdot)$ in (1). ACF note that prediction under the asymmetric check loss function is often not the object of interest in empirical work, with the exception of the forecasting literature, for example [15]. For the mean regression counterpart, OLS consistently estimates the linear conditional expectation and minimizes mean-squared error loss for fitting the conditional expectation under misspecification. The robust nature of OLS also motivates research on misspecification in panel data models. For example, Galvao and Kato [16] investigate linear panel data models under misspecification. The pseudo-true value of the fixed-effect estimator provides the best partial linear approximation to the conditional mean given the explanatory variables and the unobservable individual effect.

3. The Semiparametric Efficiency Bounds

Section 3.1 presents the semiparametric efficiency bound for the unconditional moment restriction (3). Section 3.2 discusses the existing results on the semiparametric efficiency bound for the conditional moment restriction (4).

3.1. QR under Misspecification

I calculate the semiparametric efficiency bound for the unconditional moment restriction (3) by the approach of Severini and Tripathi [11].
Theorem 2. 
Assume (R1)–(R4). The semiparametric efficiency bound for estimating the population QR parameter $\beta(\tau)$, defined in (1) or equivalently (3), is $J(\tau)^{-1}\Gamma(\tau,\tau)J(\tau)^{-1}$, where $J(\tau)$ is defined in (R3) and
$$\Gamma(\tau_i, \tau_j) := E\left[\left(\tau_i - 1\{Y < X'\beta(\tau_i)\}\right)\left(\tau_j - 1\{Y < X'\beta(\tau_j)\}\right)XX'\right]$$
for any $\tau_i, \tau_j \in \mathcal{T}$, a closed subset of $[\epsilon, 1-\epsilon]$ for some $\epsilon > 0$.
In general, the semiparametrically-efficient joint asymptotic covariance of the estimators for $(\beta(\tau_1), \beta(\tau_2), \ldots, \beta(\tau_m))$ has blocks $J(\tau_i)^{-1}\Gamma(\tau_i, \tau_j)J(\tau_j)^{-1}$ for any $\tau_i, \tau_j \in \mathcal{T}$, $i, j = 1, 2, \ldots, m$, for a finite integer $m \ge 1$.
Proof of Theorem 2. 
See the Appendix.    ☐
My proof accommodates the regularity assumptions for quantile regression and modifies Section 9 of [11]. For example, the covariate X can contain discrete components, handled by constructing two tangent spaces, for the conditional density of Y given X and the marginal density of X, respectively. In the efficiency bound, $J(\tau) := E\left[f_Y(X'\beta(\tau)|X)XX'\right]$ is obtained by assuming the interchangeability of integration and differentiation for the nonsmooth check function.
The method in [11] has been used, for example, in the monotone binary model in [18], the Lewbel [19] latent variable model in [20] and the partial linear single-index model in [21]. I work in the Hilbert space of tangent vectors of the square-root density functions and use the Riesz–Fréchet representation theorem. Another, equivalent approach in [12] works in a Hilbert space of random variables and uses the projection on the linear space spanned by the scores from the one-dimensional subproblems to find the efficient influence function. The efficiency bound is then the second moment of the efficient influence function, $J(\tau)^{-1}X\left(\tau - 1\{Y \le X'\beta\}\right)$. Newey's efficient influence function is the score function evaluated at the unique representers given by the Riesz–Fréchet theorem used in [11]; a more detailed comparison of these two approaches is given in [11].
ACF show that the QR process $\hat{\beta}(\cdot)$ is asymptotically mean-zero Gaussian with covariance function $J(\tau_1)^{-1}\Gamma(\tau_1, \tau_2)J(\tau_2)^{-1}$ for any $\tau_1, \tau_2 \in \mathcal{T}$, which is the semiparametric efficiency bound in Theorem 2. This asymptotic covariance under misspecification for a single quantile, $J(\tau)^{-1}\Gamma(\tau,\tau)J(\tau)^{-1}$, has been presented in [2] and [3]. Hahn [3] further shows that the QR estimator is well approximated by the bootstrap distribution, even when the linear quantile restriction is misspecified. An alternative estimator for the misspecification-robust asymptotic covariance matrix of $\hat{\beta}(\tau)$ is the nonparametric kernel method in ACF.
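A minimal sketch of the misspecification-robust (sandwich) covariance $J^{-1}\Gamma J^{-1}$ follows, with $J(\tau)$ estimated by a uniform-kernel density method in the spirit of Powell's estimator; the rule-of-thumb bandwidth is my own assumption, not taken from the paper. The usage example checks it against the known asymptotic variance $\tau(1-\tau)/f(0)^2 = \pi/2$ for the median of a standard normal sample.

```python
import numpy as np

def qr_sandwich_cov(y, X, beta, tau, h=None):
    """Misspecification-robust covariance J^{-1} Gamma J^{-1} / n for the QR
    estimator, with J estimated by a uniform-kernel density method."""
    n = len(y)
    u = y - X @ beta
    if h is None:
        h = 1.06 * np.std(u) * n ** (-0.2)             # rule-of-thumb bandwidth (assumption)
    kern = (np.abs(u) < h).astype(float)
    J = (X * kern[:, None]).T @ X / (2 * h * n)        # estimates E[f_Y(X'b|X) XX']
    psi = (tau - (u <= 0).astype(float))[:, None] * X  # moment function in (3)
    Gamma = psi.T @ psi / n                            # estimates Gamma(tau, tau)
    Jinv = np.linalg.inv(J)
    return Jinv @ Gamma @ Jinv / n

# Sanity check: intercept-only median regression of a standard normal sample.
rng = np.random.default_rng(2)
y = rng.standard_normal(100_000)
X = np.ones((100_000, 1))
beta = np.array([np.quantile(y, 0.5)])
cov = qr_sandwich_cov(y, X, beta, 0.5)
print(cov[0, 0] * 100_000)   # close to tau(1-tau)/f(0)^2 = pi/2 for the N(0,1) median
```

With an intercept-only design, the estimator collapses to the familiar scalar formula $\tau(1-\tau)/\hat{f}(0)^2/n$, so the scaled variance lands near $\pi/2 \approx 1.571$.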

3.2. QR for Linear Specification

Assuming the linear QR model in (4) is correctly specified, i.e., $Q_\tau(Y|X) = X'\beta(\tau)$ almost surely, the asymptotic covariance of the QR process $\hat{\beta}(\cdot)$ derived by ACF simplifies to $J(\tau_1)^{-1}\Gamma_0(\tau_1, \tau_2)J(\tau_2)^{-1}$, where $\Gamma_0(\tau_1, \tau_2) := \left(\min\{\tau_1, \tau_2\} - \tau_1\tau_2\right)E[XX']$ for any $\tau_1, \tau_2 \in (0,1)$. The asymptotic covariance $J(\tau)^{-1}\Gamma_0(\tau,\tau)J(\tau)^{-1}$ for a single quantile τ, first derived by Powell [7], is widely used for inference in most empirical studies, which implicitly assume correct specification.
The semiparametric efficiency bound for the correctly-specified quantile regression (4) is $\tau(1-\tau)\left(E\left[XX' f_{\varepsilon_\tau}^2(0|X)\right]\right)^{-1}$, where $f_Y(X'\beta(\tau)|X) = f_{\varepsilon_\tau}(0|X)$ a.s. and $E\left[XX' f_{\varepsilon_\tau}^2(0|X)\right]$ is assumed to be finite and nonsingular. This was first calculated in [13] using the method developed in [12]. If, in addition, the conditional density of $\varepsilon_\tau$ given X is independent of X, i.e., $f_Y(Q_\tau(Y|X)|X) = f_{\varepsilon_\tau}(0|X) = f_{\varepsilon_\tau}(0)$ with $f_{\varepsilon_\tau}(0) > 0$, the semiparametric efficiency bound becomes $\tau(1-\tau)E[XX']^{-1}/f_{\varepsilon_\tau}^2(0)$. This asymptotic covariance is attained by $\hat{\beta}(\tau)$, as first shown in [4]. This has an interesting resemblance to the fact that the OLS estimator is semiparametrically efficient in a homoskedastic regression model, i.e., $e = Y - X'\beta$, $E[e|X] = 0$ and $E[e^2|X] = E[e^2]$.
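The collapse of the sandwich covariance to the homoskedastic bound is a two-line matrix identity: when $f_{\varepsilon_\tau}(0|X) = f_{\varepsilon_\tau}(0)$, $J = f(0)E[XX']$ and $\Gamma_0(\tau,\tau) = \tau(1-\tau)E[XX']$, so $J^{-1}\Gamma_0 J^{-1} = \tau(1-\tau)E[XX']^{-1}/f(0)^2$. The sketch below verifies this with an arbitrary simulated design (the quantile level and density value are arbitrary choices of mine):

```python
import numpy as np

# Homoskedastic case: e_tau independent of X with density f0 at zero.
# Then J = f0 * E[XX'] and Gamma_0 = tau(1-tau) * E[XX'], so the sandwich
# J^{-1} Gamma_0 J^{-1} equals tau(1-tau) * E[XX']^{-1} / f0^2.
rng = np.random.default_rng(3)
x = rng.uniform(1.0, 2.0, 100_000)
X = np.column_stack([np.ones_like(x), x])
Qxx = X.T @ X / len(x)                    # sample analogue of E[XX']
tau, f0 = 0.25, 0.30                      # arbitrary quantile level and density value

J = f0 * Qxx
Gamma0 = tau * (1 - tau) * Qxx
sandwich = np.linalg.inv(J) @ Gamma0 @ np.linalg.inv(J)
bound = tau * (1 - tau) * np.linalg.inv(Qxx) / f0**2
print(np.max(np.abs(sandwich - bound)))   # ~ 0 up to floating-point error
```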
I further show, in general, that the semiparametrically-efficient joint asymptotic covariance of the estimators for $(\beta(\tau_1), \ldots, \beta(\tau_m))$ has blocks
$$\left(\min\{\tau_i, \tau_j\} - \tau_i\tau_j\right)\left(E\left[XX' f_{\varepsilon_{\tau_i}}(0|X) f_{\varepsilon_{\tau_j}}(0|X)\right]\right)^{-1}$$
for any $\tau_i, \tau_j \in \mathcal{T}$, $i, j = 1, 2, \ldots, m$, for any finite integer $m \ge 1$. The regularity conditions imposed, (R1), (R2) and (R4), are weaker than the assumptions in [13]; for example, they assume $f(\varepsilon, X)$ is absolutely continuous in ε, which implies the uniform continuity in (R2). See the Appendix for the detailed proof of (9).

4. Discussion and Conclusions

Misspecification is a generic phenomenon; especially in quantile regression (QR), the true conditional quantile function (CQF) might be nonlinear or a different function of the covariates at different quantiles. Table 1 summarizes the parallel properties of QR and OLS. Under misspecification, the pseudo-true OLS coefficient can be interpreted as the best linear predictor of the conditional mean function $E[Y|X]$, in the sense that the coefficient minimizes the mean-squared error of the linear approximation to the conditional mean. The approximation properties of OLS have been well studied (see, for example, [22]). For the QR counterpart, I present the inverse density-weighted mean-squared error loss function based on the distribution approximation error $F_Y(X'\beta|X) - \tau$. This result complements the interpretation based on the quantile specification error in [1]. My results imply that the Koenker–Bassett estimator is semiparametrically efficient for misspecified linear projection models and for correctly specified linear quantile regression models when $f_Y(Q_\tau(Y|X)|X) = f_{\varepsilon_\tau}(0|X)$ does not depend on X. Alternatively, the smoothed empirical likelihood estimator using the unconditional moment restriction in [23] has the same asymptotic distribution as the Koenker–Bassett estimator and, hence, attains the efficiency bound.
Table 1. Summary properties of OLS and quantile regression (QR).

Linear Projection Model

Objective minimized:
  OLS: $E[(Y - X'\beta)^2]$
  QR:  $E[\rho_\tau(Y - X'\beta(\tau))]$
Interpretation:
  OLS: $E[(E[Y|X] - X'\beta)^2]$
  QR:  $E[\bar{w}_\tau (Q_\tau(Y|X) - X'\beta(\tau))^2]$ and $E[f_Y(X'\beta(\tau)|X)^{-1}(F_Y(X'\beta(\tau)|X) - \tau)^2]$
Unconditional moment:
  OLS: $E[X(Y - X'\beta)] = 0$
  QR:  $E[X(1\{Y \le X'\beta(\tau)\} - \tau)] = 0$
Interpretation:
  OLS: $E[X(E[Y|X] - X'\beta)] = 0$
  QR:  $E[X(F_Y(X'\beta(\tau)|X) - \tau)] = 0$ and $E[\bar{w}_\tau X(X'\beta(\tau) - Q_\tau(Y|X))] = 0$
Efficient estimators:
  OLS: $\arg\min_{\beta \in \mathbb{R}^d} \frac{1}{n}\sum_{i=1}^n (Y_i - X_i'\beta)^2 = (\sum_{i=1}^n X_iX_i')^{-1}(\sum_{i=1}^n X_iY_i)$ (OLS)
  QR:  $\arg\min_{\beta \in \mathbb{R}^d} \frac{1}{n}\sum_{i=1}^n \rho_\tau(Y_i - X_i'\beta)$ (Koenker–Bassett)
Asymptotic covariance:
  OLS: $Q^{-1}\Omega Q^{-1}$ *
  QR:  $J^{-1}\Gamma J^{-1}$
Efficiency bounds:
  OLS: Chamberlain (1987) [9]
  QR:  Theorem 2

Linear Regression Model

Conditional moment:
  OLS: $E[Y|X] = X'\beta$
  QR:  $Q_\tau(Y|X) = X'\beta(\tau)$ or $F_Y(X'\beta(\tau)|X) = \tau$
Efficiency bounds:
  OLS: Chamberlain (1987) [9]
  QR:  Newey and Powell (1990) [13]
Homoskedasticity-type condition:
  OLS: $\mathrm{var}[Y|X] = \sigma^2$
  QR:  $f_{\varepsilon_\tau}(0|X) = f_{\varepsilon_\tau}(0)$
Efficient estimators:
  OLS: OLS
  QR:  Koenker–Bassett

* $Q = E[XX']$ and $\Omega = E[XX'e^2]$, where $e = Y - X'\beta$; the feasible generalized least squares estimator is semiparametrically efficient, for example.
Under the linear quantile regression model, the Koenker–Bassett estimator consistently estimates the true $\beta(\tau)$, although it is not semiparametrically efficient under heteroskedasticity. Researchers have proposed many efficient estimators for the correctly-specified linear quantile regression parameter, for example the one-step score estimator in [13], the smoothed conditional empirical likelihood estimator in [24] and the sieve minimum distance (SMD) estimator in [25,26]. However, for all of these estimators, the pseudo-true values under misspecification are different, and their interpretations have not been thoroughly studied. Therefore, the semiparametric efficiency bounds of these pseudo-true values are also different. For example, an unweighted SMD estimator converges to a pseudo-true value $\beta_{SMD}$ that minimizes $E\left[\left(F_Y(X'\beta|X) - \tau\right)^2\right]$. The first-order condition is $E\left[X\left(F_Y(X'\beta_{SMD}|X) - \tau\right)f_Y(X'\beta_{SMD}|X)\right] = 0$, which is the unconditional moment used in [13] for the semiparametrically efficient GMM estimator under correct specification. The conditional density weight is similar to generalized least squares in mean regression, which uses a weight that is a function of the conditional variance to construct an efficient estimator.
It is interesting to note that the pseudo-true value of the SMD estimator minimizes $E\left[\left(F_Y(X'\beta|X) - \tau\right)^2\right] \approx E\left[f_Y^2(Q_\tau(Y|X)|X)\left(X'\beta - Q_\tau(Y|X)\right)^2\right]$. The distribution approximation error is weighted evenly over the support of X for $\beta_{SMD}$, in contrast to the QR parameter, for which it is weighted more at points with smaller conditional density, as in Theorem 1. Therefore, the SMD estimator might have more desirable and reasonable approximation properties than QR. Nevertheless, the SMD estimator is computationally more demanding than the Koenker–Bassett estimator. A numerical example in Figure 1 illustrates how the Koenker and Bassett (KB) and SMD estimators approximate the CQF and the CDF.
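The population quantities behind the Figure 1 comparison can be approximated by direct numerical integration over the DGP. The grid sizes and the optimizer below are my own choices, so the resulting coefficients approximate rather than reproduce the caption's reported values; the point is that the two pseudo-true values differ.

```python
import numpy as np
from scipy.optimize import minimize

# Figure 1 DGP: X ~ Uniform[1,2], e|X=x ~ Uniform[0,x], Y = cos(2X) + e, so
# F_Y(y|x) = (y - cos(2x))/x on the support and f_Y(y|x) = 1/x.  tau = 0.5.
tau = 0.5
xg = np.linspace(1.0, 2.0, 2001)               # integration grid for X
eg = np.linspace(0.0, 1.0, 201)                # integration grid for e/x in [0,1]
Fc = lambda y, x: np.clip((y - np.cos(2 * x)) / x, 0.0, 1.0)

def kb_objective(b):                           # population check loss E rho_tau(Y - X'b)
    u = (np.cos(2 * xg)[:, None] + eg[None, :] * xg[:, None]
         - (b[0] + b[1] * xg)[:, None])
    return (u * (tau - (u <= 0))).mean()

def smd_objective(b):                          # E[(F_Y(X'b|X) - tau)^2]
    return ((Fc(b[0] + b[1] * xg, xg) - tau) ** 2).mean()

b_kb = minimize(kb_objective, [0.0, 0.0], method="Nelder-Mead").x
b_smd = minimize(smd_objective, b_kb, method="Nelder-Mead").x
print(b_kb, b_smd)                             # the two pseudo-true values differ
```

Both objectives are convex in b, so the Nelder–Mead minima are global up to discretization; comparing the fitted lines against $Q_\tau(Y|x)$ over the grid reproduces the qualitative pattern described in the Figure 1 caption.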
Figure 1. This numerical example is constructed by $X \sim \mathrm{Uniform}[1,2]$, $e|X=x \sim \mathrm{Uniform}[0,x]$ and $Y = \cos(2X) + e$. Therefore, $f_Y(y|X) = 1/X$, $F_Y(y|X) = (y - \cos(2X))/X$ and $Q_\tau(Y|X) = \tau X + \cos(2X)$. Set $\tau = 0.5$ for the median. The red solid line is for the QR parameter $\beta_{KB}$ defined in (3) and estimated by the Koenker–Bassett (KB) estimator. The blue dashed line is the approximation by the SMD estimator $\beta_{SMD}$ minimizing $E[(F_Y(X'\beta|X) - \tau)^2]$. The approximations are $X'\beta_{KB} = 0.324 + 0.161X$ and $X'\beta_{SMD} = 0.204 + 0.078X$. The left panel shows the linear approximations $X'\beta_{KB}$, $X'\beta_{SMD}$ and the true CQF. The green circles are 300 random draws from the DGP. The right panel shows the corresponding CDFs $F_Y(X'\beta_{KB}|X)$ and $F_Y(X'\beta_{SMD}|X)$. For smaller x, where the conditional density is larger, the quantile specification error of SMD is smaller than that of KB in the left panel. For the distribution approximation error in the right panel, SMD weights more evenly over the support of X, while KB has a smaller distribution approximation error at larger x, where the density is smaller.
This discussion leads to open-ended questions: What is an appropriate linear approximation, or a meaningful summary statistic, for a nonlinear CQF? How should economists measure the marginal effect of the covariates on the CQF? An approach that circumvents this problem is to measure the average marginal response of the covariates on the CQF directly. The average quantile derivative, defined as $E[W(X)\,\partial Q_\tau(Y|X)/\partial X]$ where $W(X)$ is a weight function, offers such a succinct summary statistic ([27]). Sasaki [28] investigates the concern that quantile regressions may misspecify true structural functions. He provides a causal interpretation of the derivative of the CQF, which identifies a weighted average of heterogeneous structural partial effects among the subpopulation of individuals at the conditional quantile of interest. Sasaki's work adds economic content to this misspecification question. This paper complements the prior literature on understanding how QR statistically summarizes the outcome distribution.

Acknowledgments

I am grateful to Bruce Hansen and Jack Porter for invaluable guidance and comments. I also thank Yuya Sasaki for helpful discussion. I would also like to thank three anonymous referees for their suggestions. All remaining errors are mine.

Conflicts of Interest

The author declares no conflict of interest.

Appendix

Proof of Theorem 2. 
I implement the main results in [11]. I start with the definitions and construct the Hilbert space. The unknown probability density or mass function of the random vector $(Y,X) \in \Omega = S_Y \times S_X$ with respect to the measure P, the product of the Lebesgue measures $\mu_Y$ and $\mu_X$, is written as $f(Y,X) = f(Y|X)f(X) := \psi_0^2(Y|X)\phi_0^2(X)$. The functionals $\psi_0$ and $\phi_0$ belong to the following sets defined by the regularity conditions:
$$\Psi_Y := \left\{\psi : S_Y \times S_X \to \mathbb{R},\ \psi^2(Y|X) > 0,\ \text{bounded and uniformly continuous in } y \text{, uniformly in } x \text{ over the support of } X,\ \int_{S_Y} \psi^2(y|X)\,dy = 1\right\}$$
and
$$\Phi := \left\{\phi \in L^2(S_X; \mu_X),\ \phi^2(X) > 0,\ \int_{S_X} \phi^2(x)\,\mu_X(dx) = 1,\ \int_{S_X} \|x\|^{2+\epsilon}\phi^2(x)\,\mu_X(dx) < \infty \text{ for some } \epsilon > 0\right\}.$$
Let $\mathcal{A} := \Psi_Y \times \Phi$.
Definition 1. 
A vector $\dot{\xi} = (\dot{\psi}, \dot{\phi})$ is said to be tangent to $\mathcal{A}$ at $\xi_0$ if it is the slope of $\xi_t := (\psi_t, \phi_t)$ at $t = 0$, i.e., $\lim_{t \downarrow 0} \|t^{-1}(\xi_t - \xi_0) - \dot{\xi}\| = 0$.
Definition 2. 
The tangent space to $\mathcal{A}$ at the true value $\xi_0$, denoted $\overline{\mathrm{lin}\, T(\mathcal{A}, \xi_0)}$, is the smallest linear space that is closed under the $L^2$-norm and contains all $\dot{\xi} \in L^2(\Omega; \mu_Y \times \mu_X)$ tangent to $\mathcal{A}$ at $\xi_0$.
Severini and Tripathi [11] show that the tangent space $\overline{\mathrm{lin}\, T(\mathcal{A}, \xi_0)}$ is the product of $\overline{\mathrm{lin}\, T(\Psi_Y, \psi_0)}$ and $\overline{\mathrm{lin}\, T(\Phi, \phi_0)}$, where
$$\overline{\mathrm{lin}\, T(\Psi_Y, \psi_0)} := \left\{\dot{\psi} \in L^2(\Omega; \mu_Y \times \mu_X) : \int_{S_Y} \dot{\psi}(y|X)\psi_0(y|X)\,\mu_Y(dy) = 0 \text{ with probability 1 (w.p.1)}\right\}$$
$$\overline{\mathrm{lin}\, T(\Phi, \phi_0)} := \left\{\dot{\phi} \in L^2(S_X; \mu_X) : \int_{S_X} \dot{\phi}(x)\phi_0(x)\,\mu_X(dx) = 0\right\}.$$
The pseudo-true model is the unconditional moment restriction $E_0\left[\left(\tau - 1\{Y \le X'\beta_0\}\right)X\right] = 0$ in (3). Here, $E_0$ is the expectation with respect to the true density functions $\xi_0 = (\psi_0, \phi_0)$, and $\beta_0$ denotes the pseudo-true $\beta(\tau)$ for notational simplicity. The objective is to derive the efficiency bound for estimating $\beta_0$. Equivalently, I can instead look at the efficiency bound for estimating the functional $\eta(\psi_0, \phi_0) := c'\beta_0$ for an arbitrary vector $c \in \mathbb{R}^d$. Severini and Tripathi [11] parameterize $\xi_0 = (\psi_0, \phi_0) \in \mathcal{A}$ and $\beta_0$ as a one-dimensional subproblem. For some $t_0 > 0$, let $t \mapsto (\xi_t, \beta_t)$ be a curve from $[0, t_0]$ into $\mathcal{A} \times \mathbb{R}^d$ that passes through $(\xi_0, \beta_0)$ at $t = 0$. That is, estimating $\eta(\xi_t) = c'\beta_t = t$ at the true parameter is equivalent to estimating $t = 0$. The likelihood of estimating t using a single observation $(Y, X)$ is given by $\psi_t^2(Y|X)\phi_t^2(X)$. Therefore, the score function for estimating $t = 0$ is
$$S_0(Y,X) := \frac{d}{dt}\log\big(\psi_t^2(Y|X)\phi_t^2(X)\big)\Big|_{t=0} = \frac{2\dot\psi(Y|X)}{\psi_0(Y|X)} + \frac{2\dot\phi(X)}{\phi_0(X)}.$$
Then, the Fisher information at t = 0 can be written as
$$i_F = E\big[S_0(Y,X)^2\big] = \int_{\mathcal{S}_X}\int_{\mathcal{S}_Y}S_0(y,x)^2\,\psi_0^2(y|x)\phi_0^2(x)\,\mu_Y(dy)\,\mu_X(dx) = 4E_X\Big[\int_{\mathcal{S}_Y}\dot\psi(y|X)^2\,\mu_Y(dy)\Big] + 4\int_{\mathcal{S}_X}\dot\phi(x)^2\,\mu_X(dx) =: \big\langle(\dot\psi,\dot\phi),(\dot\psi,\dot\phi)\big\rangle_F,
$$
where the third equality holds because $\dot\xi_0=(\dot\psi_0,\dot\phi_0)\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}$, and $E_X$ denotes the integral with respect to the distribution of $X$. Therefore, the Fisher information inner product $\langle\cdot,\cdot\rangle_F$ and the corresponding norm $\|\cdot\|_F$ are defined as
$$\big\langle\dot\xi_1,\dot\xi_2\big\rangle_F := 4E_X\Big[\int_{\mathcal{S}_Y}\dot\psi_1(y|X)\dot\psi_2(y|X)\,\mu_Y(dy)\Big] + 4\int_{\mathcal{S}_X}\dot\phi_1(x)\dot\phi_2(x)\,\mu_X(dx) \quad\text{and}\quad \|\dot\xi_1\|_F^2 = \|(\dot\psi_1,\dot\phi_1)\|_F^2 := \big\langle(\dot\psi_1,\dot\phi_1),(\dot\psi_1,\dot\phi_1)\big\rangle_F$$
for any $\dot\xi_1,\dot\xi_2\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}$, which is a closed subset of $L^2(\Omega;P)$. Hence, I have constructed the Hilbert space $\big(\overline{\operatorname{lin}T(\mathcal{A},\xi_0)},\langle\cdot,\cdot\rangle_F\big)$.
Now, I am ready to derive the efficiency bounds. It is known that the information inequality holds for all regular estimators, i.e., the asymptotic variance of any regular estimator is at least $1/i_F = \|\dot\xi_0\|_F^{-2}$. The semiparametric bound can be interpreted as the supremum of the asymptotic variance over the parametric submodels. By [11], the lower bound is
$$l.b. = \sup_{\big\{\dot\xi\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}\,:\,\dot\xi\neq 0,\ \eta'(\dot\xi)=1\big\}}\|\dot\xi\|_F^{-2} = \sup_{\big\{\dot\xi\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}\,:\,\|\dot\xi\|_F=1\big\}}\big|\eta'(\dot\xi)\big|^2 = \|\eta'\|^2 = \|\xi^*\|_F^2.$$
The third equality is the norm of the linear functional $\eta'$, the path-wise derivative of $\eta$ (p. 105 in [29]). The fourth equality is from the Riesz–Fréchet theorem: for the continuous linear functional $\eta'$ on the Hilbert space $\big(\overline{\operatorname{lin}T(\mathcal{A},\xi_0)},\langle\cdot,\cdot\rangle_F\big)$, there exists a unique $\xi^*\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}$ such that $\eta'(\dot\xi) = \langle\xi^*,\dot\xi\rangle_F$ for all $\dot\xi\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}$, i.e.,
$$\eta'(\dot\psi,\dot\phi) = c'\dot\beta = \big\langle(\psi^*,\phi^*),(\dot\psi,\dot\phi)\big\rangle_F = 4E_X\Big[\int_{\mathcal{S}_Y}\psi^*\dot\psi\,\mu_Y(dy)\Big] + 4\int_{\mathcal{S}_X}\phi^*\dot\phi\,\mu_X(dx).$$
Therefore, to find the lower bound via (10), I need to find $\xi^*$, the representer of the continuous linear functional $\eta'$.
The submodel $(\psi_t,\phi_t,\beta_t)$ is also required to satisfy the unconditional moment restriction $E_t\big[(1\{Y\le X'\beta_t\}-\tau)X\big]=0$. For any $\tau_1,\tau_2\in(0,1)$, let $\beta_t := \big(\beta_t(\tau_1)',\beta_t(\tau_2)'\big)' =: (\beta_{1t}',\beta_{2t}')'$. I simultaneously estimate $\beta_0=(\beta_{10}',\beta_{20}')'$, a $2d$-dimensional vector, so the unconditional moment restriction is
$$\int_{\mathcal{S}_X}\int_{\mathcal{S}_Y}\begin{pmatrix}\big(1\{y\le x'\beta_{1t}\}-\tau_1\big)x\\ \big(1\{y\le x'\beta_{2t}\}-\tau_2\big)x\end{pmatrix}\psi_t^2(y|x)\phi_t^2(x)\,\mu_Y(dy)\,\mu_X(dx) = 0.$$
Taking the derivative with respect to $t$, evaluated at $t=0$,⁸
$$\begin{aligned}0 = {}&\int_{\mathcal{S}_X}\begin{pmatrix}xx'f_Y(x'\beta_{10}|x)\,\dot\beta_1\\ xx'f_Y(x'\beta_{20}|x)\,\dot\beta_2\end{pmatrix}\phi_0^2(x)\,\mu_X(dx) + 2\int_{\mathcal{S}_X}\int_{\mathcal{S}_Y}\begin{pmatrix}x\,1\{y\le x'\beta_{10}\}\\ x\,1\{y\le x'\beta_{20}\}\end{pmatrix}\psi_0(y|x)\dot\psi(y|x)\,\mu_Y(dy)\,\phi_0^2(x)\,\mu_X(dx)\\ &+ 2\int_{\mathcal{S}_X}\begin{pmatrix}x\big(F_Y(x'\beta_{10}|x)-\tau_1\big)\\ x\big(F_Y(x'\beta_{20}|x)-\tau_2\big)\end{pmatrix}\phi_0(x)\dot\phi(x)\,\mu_X(dx),\end{aligned}$$
where the second term uses the fact that if $\dot\psi\in\overline{\operatorname{lin}T(\Psi_Y,\psi_0)}$, then $\int_{\mathcal{S}_Y}\psi_0\dot\psi\,\mu_Y(dy)=0$. Note that $\int_{\mathcal{S}_X}xx'f_Y(x'\beta_0|x)\phi_0^2(x)\,\mu_X(dx) = E_0\big[XX'f_Y(X'\beta_0|X)\big] = J(\tau)$, which is assumed to be positive definite by (R3), so $J(\tau)^{-1}$ exists. Define
$$D := \begin{pmatrix}J(\tau_1) & 0\\ 0 & J(\tau_2)\end{pmatrix},$$
so $D^{-1}$ exists. Then
$$\begin{pmatrix}\dot\beta_1\\ \dot\beta_2\end{pmatrix} = -2D^{-1}\Bigg[\int_{\mathcal{S}_X}\int_{\mathcal{S}_Y}\begin{pmatrix}x\,1\{y\le x'\beta_{10}\}\\ x\,1\{y\le x'\beta_{20}\}\end{pmatrix}\psi_0(y|x)\dot\psi(y|x)\,\mu_Y(dy)\,\phi_0^2(x)\,\mu_X(dx) + \int_{\mathcal{S}_X}\begin{pmatrix}x\big(F_Y(x'\beta_{10}|x)-\tau_1\big)\\ x\big(F_Y(x'\beta_{20}|x)-\tau_2\big)\end{pmatrix}\phi_0(x)\dot\phi(x)\,\mu_X(dx)\Bigg].$$
I confirm that $\eta'(\dot\xi) = c'\dot\beta$ is a continuous linear functional on $\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}$, so $\eta$ is indeed path-wise differentiable. Using (12) to replace $c'\dot\beta$ on the left-hand side of (11), I can find the representer for $\eta'$ as
$$\phi^*(x) = -\frac{1}{2}\,c'D^{-1}\begin{pmatrix}\big(F_Y(x'\beta_{10}|x)-\tau_1\big)x\\ \big(F_Y(x'\beta_{20}|x)-\tau_2\big)x\end{pmatrix}\phi_0(x) \quad\text{and}\quad \psi^*(y|x) = -\frac{1}{2}\,c'D^{-1}\begin{pmatrix}\big(1\{y\le x'\beta_{10}\}-F_Y(x'\beta_{10}|x)\big)x\\ \big(1\{y\le x'\beta_{20}\}-F_Y(x'\beta_{20}|x)\big)x\end{pmatrix}\psi_0(y|x)$$
because $\dot\psi\in\overline{\operatorname{lin}T(\Psi_Y,\psi_0)}$. It can be easily checked that $(\psi^*,\phi^*)\in\overline{\operatorname{lin}T(\mathcal{A},\xi_0)}$. For notational ease, denote $1_i := 1\{y\le x'\beta_{i0}\}$ and $F_i := F_Y(x'\beta_{i0}|x)$, $i=1,2$. Then, the lower bound for regular $\sqrt{n}$-consistent estimators of $c'\beta_0$ is
$$\begin{aligned}\|(\psi^*,\phi^*)\|_F^2 = {}& c'D^{-1}\Bigg\{\int_{\mathcal{S}_X}\begin{pmatrix}(F_1-\tau_1)^2\,xx' & (F_1-\tau_1)(F_2-\tau_2)\,xx'\\ (F_1-\tau_1)(F_2-\tau_2)\,xx' & (F_2-\tau_2)^2\,xx'\end{pmatrix}\phi_0^2(x)1\{\phi_0(x)>0\}\,\mu_X(dx)\\ &+ E\begin{pmatrix}E\big[(1_1-F_1)^2\,\big|\,X\big]XX' & E\big[(1_1-F_1)(1_2-F_2)\,\big|\,X\big]XX'\\ E\big[(1_1-F_1)(1_2-F_2)\,\big|\,X\big]XX' & E\big[(1_2-F_2)^2\,\big|\,X\big]XX'\end{pmatrix}\Bigg\}D^{-1}c\\ = {}& c'D^{-1}\begin{pmatrix}\Gamma(\tau_1,\tau_1) & \Gamma(\tau_1,\tau_2)\\ \Gamma(\tau_1,\tau_2) & \Gamma(\tau_2,\tau_2)\end{pmatrix}D^{-1}c,\end{aligned}$$
where
$$\begin{aligned}\Gamma(\tau_1,\tau_2) &:= E\Big[E\big[(F_1-\tau_1)(F_2-\tau_2) + (1_1-F_1)(1_2-F_2)\,\big|\,X\big]XX'\Big]\\ &= E\Big[E\big[\tau_1\tau_2 - \tau_1F_2 - \tau_2F_1 + 1_11_2\,\big|\,X\big]XX'\Big]\\ &= E\Big[E\big[\big(\tau_1 - 1\{Y\le X'\beta_{10}\}\big)\big(\tau_2 - 1\{Y\le X'\beta_{20}\}\big)\,\big|\,X\big]XX'\Big]\\ &= E\Big[\big(\tau_1 - 1\{Y\le X'\beta_{10}\}\big)\big(\tau_2 - 1\{Y\le X'\beta_{20}\}\big)XX'\Big]\end{aligned}$$
by the law of iterated expectations, and so $\Gamma(\tau,\tau) = E\big[\big(\tau - 1\{Y\le X'\beta_0\}\big)^2XX'\big]$.
Therefore, the lower bound for estimating $\beta(\tau)$ is $J(\tau)^{-1}\Gamma(\tau,\tau)J(\tau)^{-1}$, and the asymptotic covariance between the estimators of $\beta(\tau_1)$ and $\beta(\tau_2)$ cannot be smaller than $J(\tau_1)^{-1}\Gamma(\tau_1,\tau_2)J(\tau_2)^{-1}$.
   ☐
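As a numerical illustration outside the paper, the equality between the first and last expressions for $\Gamma(\tau_1,\tau_2)$ — which holds only after conditioning on $X$, not observation by observation — can be checked by Monte Carlo in a DGP where $F_Y(\cdot|x)$ is known in closed form. The Gaussian design and the coefficients `b1`, `b2` below are arbitrary illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
n, tau1, tau2 = 200_000, 0.25, 0.75

# DGP with a known conditional distribution: Y | X=x ~ N(x, (1 + 0.5x)^2)
x = rng.uniform(0.0, 2.0, n)
sigma = 1.0 + 0.5 * x
y = x + sigma * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])

# arbitrary linear coefficients at which Gamma is evaluated (illustrative)
b1 = np.array([0.0, 0.8])
b2 = np.array([0.5, 1.2])

Phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))
F1 = Phi((X @ b1 - x) / sigma)     # F_Y(x'b1 | x), known from the DGP
F2 = Phi((X @ b2 - x) / sigma)
I1 = (y <= X @ b1).astype(float)
I2 = (y <= X @ b2).astype(float)

# first line of the display: F-terms plus indicator-deviation terms
w_first = (F1 - tau1) * (F2 - tau2) + (I1 - F1) * (I2 - F2)
# last line: pure indicator form; equal after iterating expectations over Y | X
w_last = (tau1 - I1) * (tau2 - I2)

Gamma_first = (X * w_first[:, None]).T @ X / n
Gamma_last = (X * w_last[:, None]).T @ X / n
print(np.round(Gamma_first, 3))
print(np.round(Gamma_last, 3))
```

With a large sample, the two matrices agree up to Monte Carlo error, mirroring the law-of-iterated-expectations step in the display.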
Remark 2. 
Consider the efficiency bound for estimating a single quantile coefficient $\beta(\tau)$ using Newey's approach in [12]. Severini and Tripathi [11] claim that the efficient influence function for $c'\beta(\tau)$ is $2\psi^*/\psi_0 + 2\phi^*/\phi_0 = c'J(\tau)^{-1}X\big(\tau - F_Y(X'\beta|X)\big) + c'J(\tau)^{-1}X\big(F_Y(X'\beta|X) - 1\{Y\le X'\beta\}\big) = c'J(\tau)^{-1}X\big(\tau - 1\{Y\le X'\beta\}\big)$. Then, the efficient influence function for $\beta_0$ is $(E[SS'])^{-1}S$, where the efficient score is $S = J(\tau)\Gamma(\tau,\tau)^{-1}X\big(\tau - 1\{Y\le X'\beta\}\big)$. Newey shows that the semiparametric bound is $(E[SS'])^{-1}$.
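The influence function in Remark 2 suggests the familiar plug-in estimate $\hat J^{-1}\hat\Gamma\hat J^{-1}/n$ of the bound. Below is a minimal sketch, not the paper's code: the subgradient fit of $\beta(\tau)$, the Gaussian-kernel estimate of $J(\tau)$ (in the spirit of Powell [6]), and the rule-of-thumb bandwidth are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_qr(y, X, tau, lr=0.5, iters=2000):
    """Crude quantile-regression fit by subgradient descent on the check
    loss -- a stand-in for the Koenker-Bassett linear-programming solution."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        u = y - X @ beta
        beta += lr * ((tau - (u <= 0)) @ X) / len(y)
    return beta

def sandwich_cov(y, X, beta, tau, h=None):
    """Misspecification-robust covariance estimate J^{-1} Gamma J^{-1} / n.

    J     via a Gaussian kernel in the residuals (Powell-type estimator),
    Gamma via the sample analog of E[(tau - 1{Y <= X'beta})^2 X X'].
    """
    n = len(y)
    u = y - X @ beta
    if h is None:
        h = 1.06 * u.std() * n ** (-0.2)   # rule-of-thumb bandwidth (assumption)
    k = np.exp(-0.5 * (u / h) ** 2) / np.sqrt(2 * np.pi)
    J = (X * (k / h)[:, None]).T @ X / n
    score = tau - (u <= 0)
    Gamma = (X * (score ** 2)[:, None]).T @ X / n
    Jinv = np.linalg.inv(J)
    return Jinv @ Gamma @ Jinv / n

# heteroskedastic DGP: the linear quantile model is misspecified
n, tau = 5000, 0.75
x = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
y = x + (1.0 + x ** 2) * rng.standard_normal(n)

beta_hat = fit_qr(y, X, tau)
V = sandwich_cov(y, X, beta_hat, tau)
```

The returned `V` is a symmetric positive-definite matrix whose diagonal gives misspecification-robust standard errors for the pseudo-true $\beta(\tau)$.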
Proof of the Semiparametric Efficiency Bound for the Linear QR Model (4). 
Under the linear specification, $F_Y(X'\beta_0|X) = \tau$. The random vectors $(Y,X)$ satisfy the conditional moment restriction $E\big[1\{Y\le X'\beta_0\}-\tau\,\big|\,X\big]=0$, i.e., $\int_{\mathcal{S}_Y}\big(1\{y\le X'\beta_0\}-\tau\big)\psi_0^2(y|X)\,\mu_Y(dy)=0$, where the joint density of $(Y,X)$ is $\psi_0^2(Y|X)\phi_0^2(X)$. The Hilbert space $\big(\overline{\operatorname{lin}T(\mathcal{A},\xi_0)},\langle\cdot,\cdot\rangle_F\big)$ and $\mathcal{A}=\Psi_Y\times\Phi$ are defined in the proof of Theorem 2. Consider any $\tau_1,\tau_2\in(0,1)$ and let $\beta_t := \big(\beta_t(\tau_1)',\beta_t(\tau_2)'\big)' =: (\beta_{1t}',\beta_{2t}')'$. The parameterized submodel $(\psi_t,\phi_t,\beta_t)$ also has to satisfy the moment condition
$$\int_{\mathcal{S}_Y}\begin{pmatrix}1\{y\le X'\beta_{1t}\}-\tau_1\\ 1\{y\le X'\beta_{2t}\}-\tau_2\end{pmatrix}\psi_t^2(y|X)\,\mu_Y(dy) = 0.$$
Taking the derivative with respect to t evaluated at t = 0 , I have
$$\begin{pmatrix}f_Y(X'\beta_{10}|X)X'\dot\beta_1\\ f_Y(X'\beta_{20}|X)X'\dot\beta_2\end{pmatrix} + 2\int_{\mathcal{S}_Y}\begin{pmatrix}1\{y\le X'\beta_{10}\}-\tau_1\\ 1\{y\le X'\beta_{20}\}-\tau_2\end{pmatrix}\psi_0(y|X)\dot\psi(y|X)\,dy = 0,$$
where $f_Y(y|X) = \psi_0^2(y|X)$. Define
$$D(X) := \begin{pmatrix}f_Y(X'\beta_{10}|X)X' & 0\\ 0 & f_Y(X'\beta_{20}|X)X'\end{pmatrix}.$$
Note that (13) is an over-identifying moment restriction that cannot uniquely solve for $\dot\beta$. To locally identify $\dot\beta$, [11] give a sufficient condition: there exists some nonsingular (w.p.1) $2\times 2$ matrix $W(X)$ such that $E[D(X)'W(X)D(X)]$ is nonsingular. By assumption, $E\big[XX'f_Y^2(X'\beta(\tau)|X)\big] = E\big[XX'f_{\varepsilon_\tau}^2(0|X)\big]$ exists and is nonsingular, so the same holds for $E[D(X)'D(X)]$. Hence, I can choose $W(X) = I_2$, the identity matrix. Multiplying (13) by $D(X)'$,
$$D(X)'D(X)\begin{pmatrix}\dot\beta_1\\ \dot\beta_2\end{pmatrix} + 2D(X)'\int_{\mathcal{S}_Y}\begin{pmatrix}1\{y\le X'\beta_{10}\}-\tau_1\\ 1\{y\le X'\beta_{20}\}-\tau_2\end{pmatrix}\psi_0(y|X)\dot\psi(y|X)\,dy = 0.$$
Taking expectations on both sides with respect to $X$ and solving for $\dot\beta$,
$$\begin{pmatrix}\dot\beta_1\\ \dot\beta_2\end{pmatrix} = -2E\big[D(X)'D(X)\big]^{-1}E\Bigg[D(X)'\int_{\mathcal{S}_Y}\begin{pmatrix}1\{y\le X'\beta_{10}\}-\tau_1\\ 1\{y\le X'\beta_{20}\}-\tau_2\end{pmatrix}\psi_0(y|X)\dot\psi(y|X)\,dy\Bigg].$$
Then, for any arbitrary $c\in\mathbb{R}^{2d}$, the representer for $\eta'\big((\dot\psi,\dot\phi)\big) = c'\dot\beta$ is
$$\psi^*(y|X) = -\frac{1}{2}\,c'E\big[D(X)'D(X)\big]^{-1}D(X)'\begin{pmatrix}1\{y\le X'\beta_{10}\}-\tau_1\\ 1\{y\le X'\beta_{20}\}-\tau_2\end{pmatrix}\psi_0(y|X) \in \overline{\operatorname{lin}T(\Psi_Y,\psi_0)}$$
by the conditional moment restriction. Additionally, $\phi^* = 0$, since $\phi_0$ is ancillary in this conditional moment case. Define $A := E\big[D(X)'D(X)\big]^{-1}$ and $1_i := 1\{Y\le X'\beta_{i0}\}$, $i=1,2$, for notational ease. Without loss of generality, assume $\tau_1<\tau_2$. Therefore, the lower bound is
$$\begin{aligned}\|(\psi^*,\phi^*)\|_F^2 &= c'A\,E\Bigg[D(X)'\begin{pmatrix}E\big[(1_1-\tau_1)^2\,\big|\,X\big] & E\big[(1_1-\tau_1)(1_2-\tau_2)\,\big|\,X\big]\\ E\big[(1_1-\tau_1)(1_2-\tau_2)\,\big|\,X\big] & E\big[(1_2-\tau_2)^2\,\big|\,X\big]\end{pmatrix}D(X)\Bigg]A\,c\\ &= c'A\,E\Bigg[D(X)'\begin{pmatrix}\tau_1(1-\tau_1) & \tau_1(1-\tau_2)\\ \tau_1(1-\tau_2) & \tau_2(1-\tau_2)\end{pmatrix}D(X)\Bigg]A\,c\\ &= c'\begin{pmatrix}\tau_1(1-\tau_1)E\big[XX'f_{\varepsilon_{\tau_1}}^2(0|X)\big]^{-1} & \tau_1(1-\tau_2)E\big[XX'f_{\varepsilon_{\tau_1}}(0|X)f_{\varepsilon_{\tau_2}}(0|X)\big]^{-1}\\ \tau_1(1-\tau_2)E\big[XX'f_{\varepsilon_{\tau_1}}(0|X)f_{\varepsilon_{\tau_2}}(0|X)\big]^{-1} & \tau_2(1-\tau_2)E\big[XX'f_{\varepsilon_{\tau_2}}^2(0|X)\big]^{-1}\end{pmatrix}c,\end{aligned}$$
since $f_Y(X'\beta|X) = f_{\varepsilon_\tau}(0|X)$ under correct specification. □
Remark 3. 
Consider the efficiency bound for estimating a single quantile coefficient $\beta(\tau)$ by Newey's approach in [12]. Severini and Tripathi [11] claim that Newey's efficient influence function for $c'\beta(\tau)$ is $2\psi^*/\psi_0 = c'E\big[f_{\varepsilon_\tau}^2(0|X)XX'\big]^{-1}f_{\varepsilon_\tau}(0|X)X\big(\tau - 1\{Y\le X'\beta\}\big)$. Then, the efficient influence function for $\beta_0$ is $(E[SS'])^{-1}S$, where the efficient score is $S = (\tau-\tau^2)^{-1}f_{\varepsilon_\tau}(0|X)X\big(\tau - 1\{Y\le X'\beta\}\big)$. Newey shows that the semiparametric bound is $(E[SS'])^{-1}$.
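A small numerical sketch (not from the paper) of how the two bounds relate: in a hypothetical correctly specified homoskedastic median regression, $f_{\varepsilon_\tau}(0|X)$ is constant, so the misspecification-robust sandwich $J(\tau)^{-1}\Gamma(\tau,\tau)J(\tau)^{-1}$ collapses to the efficient bound $\tau(1-\tau)E\big[XX'f_{\varepsilon_\tau}^2(0|X)\big]^{-1}$. The design and coefficients are illustrative assumptions, and the known constant density is plugged in for $J(\tau)$.

```python
import numpy as np
from math import sqrt, pi

rng = np.random.default_rng(2)
n, tau = 100_000, 0.5

# hypothetical correctly specified homoskedastic design: N(0,1) errors,
# so f_{eps_tau}(0|X) is the constant standard normal density at 0
x = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
beta_tau = np.array([1.0, 2.0])            # true beta(0.5); the median error is 0
y = X @ beta_tau + rng.standard_normal(n)

phi0 = 1.0 / sqrt(2.0 * pi)
EXX = X.T @ X / n

# Newey-type efficient bound: tau(1-tau) E[X X' f^2]^{-1}, f constant here
bound_efficient = tau * (1 - tau) * np.linalg.inv(phi0 ** 2 * EXX)

# misspecification-robust sandwich J^{-1} Gamma J^{-1} at the true beta(tau)
u = y - X @ beta_tau
score = tau - (u <= 0)
Gamma = (X * (score ** 2)[:, None]).T @ X / n
J = phi0 * EXX                             # plug in the known constant density
V_sandwich = np.linalg.inv(J) @ Gamma @ np.linalg.inv(J)
```

With heteroskedastic errors, $f_{\varepsilon_\tau}(0|X)$ varies with $X$ and, by the Cauchy–Schwarz inequality, the sandwich form weakly exceeds the efficient bound; they coincide here only because the density is constant.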

References

1. J. Angrist, V. Chernozhukov, and I. Fernández-Val. “Quantile regression under misspecification, with an application to the U.S. wage structure.” Econometrica 74 (2006): 539–563.
2. T.-H. Kim, and H. White. “Estimation, inference, and specification testing for possibly misspecified quantile regression.” Adv. Econom. 17 (2003): 107–132.
3. J. Hahn. “Bayesian bootstrap of the quantile regression estimator: A large sample study.” Int. Econ. Rev. 38 (1997): 795–808.
4. R. Koenker, and G. Bassett. “Regression quantiles.” Econometrica 46 (1978): 33–50.
5. R. Koenker. Quantile Regression. Econometric Society Monographs. New York, NY, USA: Cambridge University Press, 2005.
6. J. Powell. “Least absolute deviations estimation for the censored regression model.” J. Econom. 25 (1984): 303–325.
7. J. Powell. “Censored regression quantiles.” J. Econom. 32 (1986): 143–155.
8. V. Chernozhukov, I. Fernández-Val, and A. Galichon. “Quantile and probability curves without crossing.” Econometrica 78 (2010): 1093–1125.
9. G. Chamberlain. “Asymptotic efficiency in estimation with conditional moment restrictions.” J. Econom. 34 (1987): 305–334.
10. C. Ai, and X. Chen. “The semiparametric efficiency bound for models of sequential moment restrictions containing unknown functions.” J. Econom. 170 (2012): 442–457.
11. T.A. Severini, and G. Tripathi. “A simplified approach to computing efficiency bounds in semiparametric models.” J. Econom. 102 (2001): 23–66.
12. W.K. Newey. “Semiparametric efficiency bounds.” J. Appl. Econom. 5 (1990): 99–135.
13. W.K. Newey, and J.L. Powell. “Efficient estimation of linear and type I censored regression models under conditional quantile restrictions.” Econom. Theory 6 (1990): 295–317.
14. R. Koenker, S. Leorato, and F. Peracchi. Distributional vs. Quantile Regression. CEIS Working Paper No. 300. 2013. Available online: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2368737 (accessed on 20 November 2015).
15. I. Komunjer. “Quasi-maximum likelihood estimation for conditional quantiles.” J. Econom. 128 (2005): 137–164.
16. A.F. Galvao, and K. Kato. “Estimation and inference for linear panel data models under misspecification when both n and T are large.” J. Bus. Econ. Stat. 32 (2014): 285–309.
17. Y. Lee. “Bias in dynamic panel models under time series misspecification.” J. Econom. 169 (2012): 54–60.
18. T. Magnac, and E. Maurin. “Identification and information in monotone binary models.” J. Econom. 139 (2007): 76–104.
19. A. Lewbel. “Semiparametric latent variable model estimation with endogenous or mismeasured regressors.” Econometrica 66 (1998): 105–122.
20. D.T. Jacho-Chávez. “Efficiency bounds for semiparametric estimation of inverse conditional-density-weighted functions.” Econom. Theory 25 (2009): 847–855.
21. T. Chen, and T. Parker. “Semiparametric efficiency for partially linear single-index regression models.” J. Multivar. Anal. 130 (2014): 376–386.
22. H. White. “Using least squares to approximate unknown regression functions.” Int. Econ. Rev. 21 (1980): 149–170.
23. Y.-J. Whang. “Smoothed empirical likelihood methods for quantile regression models.” Econom. Theory 22 (2006): 173–205.
24. T. Otsu. “Conditional empirical likelihood estimation and inference for quantile regression models.” J. Econom. 142 (2008): 508–538.
25. X. Chen, and D. Pouzo. “Efficient estimation of semiparametric conditional moment models with possibly nonsmooth residuals.” J. Econom. 152 (2009): 46–60.
26. X. Chen, and D. Pouzo. “Estimation of nonparametric conditional moment models with possibly nonsmooth generalized residuals.” Econometrica 80 (2012): 277–321.
27. P. Chaudhuri, K. Doksum, and A. Samarov. “On average derivative quantile regression.” Ann. Stat. 25 (1997): 715–744.
28. Y. Sasaki. “What do quantile regressions identify for general structural functions?” Econom. Theory 31 (2015): 1102–1116.
29. D.G. Luenberger. Optimization by Vector Space Methods. New York, NY, USA: Wiley, 2005.
1. Chernozhukov, Fernández-Val, and Galichon [8] rearrange an estimator $\hat{Q}_\tau(Y|X)$ to be monotonic. The original estimator can be computationally tractable. The rearranged monotonic estimated CDF is $\hat{F}_Y(y|X) = \int_0^1 1\{\hat{Q}_\tau(Y|X)\le y\}\,d\tau$. The rearranged quantile estimate is $\hat{Q}_\tau^*(Y|X) = \inf\{y : \hat{F}_Y(y|X)\ge\tau\}$.
2. See [12] for the definition of regular estimators.
3. For estimation, [14] studies different approaches based on distribution regression and quantile regression.
4. Galvao and Kato [16] show that misspecification affects the bias correction and convergence rate of the estimator and provide a misspecification-robust inference procedure. In panel models under time series misspecification, Lee [17] proposes bias reduction methods for the incidental parameter.
5. Severini and Tripathi construct the tangent space for the continuous and bounded joint density $f(X,Y)$ in Section 9 of [11]. Additionally, they define $J$ on the derivative of the moment restriction.
6. The conditional moment restriction in (4) can be expressed as $m(X,\beta) = \tau - F_Y(X'\beta|X) = 0$. In [26], an unweighted penalized sieve minimum distance estimator minimizes a possibly penalized consistent estimate of the minimum distance criterion, $E[m(X,\beta)^2]$.
7. $\mu_X$ may not be a Lebesgue measure, since I allow discrete components in the covariates $X$.
8. The interchange of differentiation and integration is allowed, as assumed throughout [11], by the smoothness of $\xi_t(Y,X)$ in $t\in[0,t_0]$ from the construction of regular parametric submodels; see [12] for details.

Lee, Y.-Y. Interpretation and Semiparametric Efficiency in Quantile Regression under Misspecification. Econometrics 2016, 4, 2. https://doi.org/10.3390/econometrics4010002
