# Interpretation and Semiparametric Efficiency in Quantile Regression under Misspecification

## Abstract


## 1. Introduction

## 2. Interpreting QR under Misspecification

- (R1) $(Y_{i},X_{i},i\le n)$ are independent and identically distributed on the probability space $(\Omega ,\mathcal{F},P)$ for each $n$;
- (R2) the conditional density $f_{Y}(y|X=x)$ exists and is bounded and uniformly continuous in $y$, uniformly in $x$ over the support of $X$;
- (R3) $J(\tau ):=E\left[f_{Y}(X^{\prime}\beta (\tau )|X)XX^{\prime}\right]$ is positive definite for all $\tau \in (0,1)$, where $\beta (\tau )$ is uniquely defined in (1);
- (R4) $E\|X\|^{2+\epsilon}<\infty $ for some $\epsilon>0$;
- (R5) $f_{Y}(X^{\prime}\beta (\tau )|X)$ is bounded away from zero.

**Theorem 1.**

**Proof of Theorem 1.**

**Remark 1.** The linear function $X^{\prime}\beta (\tau )$ is the best linear approximation under the check loss function in (1). While $\beta (0.5)$ is the least absolute deviations (LAD) estimand, the QR parameter $\beta (\tau )$ for $\tau \ne 0.5$ is the best linear predictor for a response variable under the asymmetric loss function $\rho_{\tau}(\cdot)$ in (1). Angrist, Chernozhukov, and Fernández-Val [1] (ACF) note that prediction under the asymmetric check loss function is often not the object of interest in empirical work, with the exception of the forecasting literature, for example [15]. For the mean regression counterpart, OLS consistently estimates a linear conditional expectation and, under misspecification, minimizes the mean-squared error loss for fitting the conditional expectation. The robust nature of OLS also motivates research on misspecification in panel data models. For example, Galvao and Kato [16] investigate linear panel data models under misspecification, where the pseudo-true value of the fixed effects estimator provides the best partial linear approximation to the conditional mean given the explanatory variables and the unobservable individual effect.${}^{4}$
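As a minimal sketch of the check loss in Remark 1, the block below restricts the predictor to a constant (an assumption made here so that no linear-programming solver is needed) and confirms that the minimizer of the sample mean check loss sits at the sample $\tau$-quantile. The simulated exponential data and the helper names `check_loss`, `mean_check_loss`, and `best_constant` are illustrative and not taken from the paper.

```python
import random

def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def mean_check_loss(y, b, tau):
    """Sample mean of rho_tau(y_i - b)."""
    return sum(check_loss(v - b, tau) for v in y) / len(y)

def best_constant(y, tau):
    """Constant minimizing the mean check loss; a minimizer lies at a data point."""
    return min(y, key=lambda b: mean_check_loss(y, b, tau))

random.seed(0)
y = [random.expovariate(1.0) for _ in range(501)]  # illustrative skewed sample

b_median = best_constant(y, 0.5)   # symmetric loss: LAD / median
b_upper = best_constant(y, 0.9)    # asymmetric loss: 0.9-quantile

# Share of observations at or below each minimizer should be close to tau.
share_median = sum(v <= b_median for v in y) / len(y)
share_upper = sum(v <= b_upper for v in y) / len(y)
print(share_median, share_upper)
```

With covariates, the same loss is minimized over $X^{\prime}\beta$ rather than over a constant, which is the Koenker–Bassett problem.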

## 3. The Semiparametric Efficiency Bounds

#### 3.1. QR under Misspecification

**Theorem 2.**

**Proof of Theorem 2.**

#### 3.2. QR for Linear Specification

## 4. Discussion and Conclusions

| | OLS | QR |
|---|---|---|
| **Linear Projection Model** | | |
| objective minimized | $E[(Y-X^{\prime}\beta )^{2}]$ | $E[\rho_{\tau}(Y-X^{\prime}\beta (\tau ))]$ |
| (interpretation) | $E[(E[Y\mid X]-X^{\prime}\beta )^{2}]$ | $E[\overline{w}_{\tau}(Q_{\tau}(Y\mid X)-X^{\prime}\beta (\tau ))^{2}]$ |
| | | $E[f_{Y}(X^{\prime}\beta (\tau )\mid X)^{-1}(F_{Y}(X^{\prime}\beta (\tau )\mid X)-\tau )^{2}]$ |
| unconditional moment | $E[X(Y-X^{\prime}\beta )]=0$ | $E[X(\mathbf{1}_{\{Y\le X^{\prime}\beta (\tau )\}}-\tau )]=0$ |
| (interpretation) | $E[X(E[Y\mid X]-X^{\prime}\beta )]=0$ | $E[X(F_{Y}(X^{\prime}\beta (\tau )\mid X)-\tau )]=0$ |
| | | $E\left[\overline{w}_{\tau}X\left(X^{\prime}\beta (\tau )-Q_{\tau}(Y\mid X)\right)\right]=0$ |
| efficient estimators | $\arg\min_{\beta \in \mathbb{R}^{d}}\frac{1}{n}\sum_{i=1}^{n}(Y_{i}-X_{i}^{\prime}\beta )^{2}$ | $\arg\min_{\beta \in \mathbb{R}^{d}}\frac{1}{n}\sum_{i=1}^{n}\rho_{\tau}(Y_{i}-X_{i}^{\prime}\beta )$ |
| | $=(\sum_{i=1}^{n}X_{i}X_{i}^{\prime})^{-1}(\sum_{i=1}^{n}X_{i}Y_{i})$ | (Koenker–Bassett) |
| | (OLS) | |
| asymptotic covariance | $Q^{-1}\Omega Q^{-1}$ ${}^{*}$ | $J^{-1}\Gamma J^{-1}$ |
| efficiency bounds | Chamberlain (1987) [9] | Theorem 2 |
| **Linear Regression Model** | | |
| conditional moment | $E[Y\mid X]=X^{\prime}\beta $ | $Q_{\tau}(Y\mid X)=X^{\prime}\beta (\tau )$ |
| | | or $F_{Y}(X^{\prime}\beta (\tau )\mid X)=\tau $ |
| efficiency bounds | Chamberlain (1987) [9] ${}^{\dagger}$ | Newey and Powell (1990) [13] |
| homoscedasticity-type condition | $var[Y\mid X]=\sigma^{2}$ | $f_{\epsilon_{\tau}}(0\mid X)=f_{\epsilon_{\tau}}(0)$ |
| efficient estimators | OLS | Koenker–Bassett |
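The QR entries for the unconditional moment and its two interpretations are linked by a short argument, sketched here. The symbol $\omega_{\tau}(X)$ and its integral form are assumptions introduced for this sketch, since this excerpt does not restate the definition of $\overline{w}_{\tau}$; if $\overline{w}_{\tau}$ is defined as in ACF [1], it should be proportional to $\omega_{\tau}$, and a constant factor does not affect a moment that is set to zero.

$$
\begin{aligned}
0 &= E\left[X\left(\mathbf{1}_{\{Y\le X^{\prime}\beta (\tau )\}}-\tau \right)\right] \\
&= E\left[X\left(F_{Y}(X^{\prime}\beta (\tau )|X)-\tau \right)\right] && \text{(iterated expectations)} \\
&= E\left[X\left(F_{Y}(X^{\prime}\beta (\tau )|X)-F_{Y}(Q_{\tau}(Y|X)|X)\right)\right] && \text{(since } F_{Y}(Q_{\tau}(Y|X)|X)=\tau \text{)} \\
&= E\left[\omega_{\tau}(X)\,X\left(X^{\prime}\beta (\tau )-Q_{\tau}(Y|X)\right)\right], \\
\omega_{\tau}(X) &:= \int_{0}^{1} f_{Y}\left(Q_{\tau}(Y|X)+u\left(X^{\prime}\beta (\tau )-Q_{\tau}(Y|X)\right)\,\middle|\,X\right)du .
\end{aligned}
$$

The last step is the fundamental theorem of calculus applied to $F_{Y}(\cdot |X)$ along the segment between $Q_{\tau}(Y|X)$ and $X^{\prime}\beta (\tau )$, which is valid under the continuity of the conditional density in (R2).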

**Figure 1.** This numerical example is constructed by $X\sim \mathrm{Uniform}[1,2]$, $e|X=x\sim \mathrm{Uniform}[0,x]$ and $Y=\cos(2X)+e$. Therefore, for $y$ in the conditional support $[\cos(2X),\cos(2X)+X]$, $f_{Y}(y|X)=1/X$, $F_{Y}(y|X)=(y-\cos(2X))/X$ and $Q_{\tau}(Y|X)=\tau X+\cos(2X)$. Set $\tau =0.5$ for the median. The red solid line is for the QR parameter $\beta_{KB}$ defined in (3) and estimated by the Koenker–Bassett (KB) estimator. The blue dashed line is the approximation by the SMD estimator $\beta_{SMD}$ minimizing $E[(F_{Y}(X^{\prime}\beta |X)-\tau )^{2}]$. The approximations are $X^{\prime}\beta_{KB}=-0.324+0.161X$ and $X^{\prime}\beta_{SMD}=-0.204+0.078X$. The left panel shows the linear approximations $X^{\prime}\beta_{KB}$, $X^{\prime}\beta_{SMD}$ and the true CQF. The green circles are 300 random draws from the DGP. The right panel shows the corresponding CDFs $F_{Y}(X^{\prime}\beta_{KB}|X)$ and $F_{Y}(X^{\prime}\beta_{SMD}|X)$. For smaller $x$, where the conditional density is larger, the quantile specification error of SMD is smaller than that of KB in the left panel. For the distribution approximation error in the right panel, SMD weights more evenly over the support of $X$, while KB has smaller distribution approximation error at larger $x$, where the density is smaller.
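The population approximations in this example can be checked numerically. The sketch below uses the DGP stated in the caption, $Y=\cos(2X)+e$ with $F_{Y}(y|X)=(y-\cos(2X))/X$, and assumes that both fitted lines stay inside the conditional support, so that $F_{Y}$ is linear in the coefficients and no clipping to $[0,1]$ is needed (the code checks this on the grid). Under that assumption the KB moment condition $E[X(F_{Y}(X^{\prime}\beta |X)-\tau )]=0$ and the SMD least-squares problem each reduce to a 2x2 linear system. The midpoint-rule grid and all function names are choices made for this illustration.

```python
import math

TAU = 0.5
N = 20000  # midpoint-rule nodes on [1, 2]; X ~ Uniform[1, 2] has density 1 there
xs = [1.0 + (i + 0.5) / N for i in range(N)]

def mean(g):
    """E[g(X)] for X ~ Uniform[1, 2], approximated by the midpoint rule."""
    return sum(g(x) for x in xs) / N

def solve2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def c(x):
    """Lower end of the conditional support of Y given X = x."""
    return math.cos(2.0 * x)

def t(x):
    """Target for SMD: F_Y(a + bx | x) - tau = a/x + b - t(x)."""
    return c(x) / x + TAU

# KB: E[(1, X)' (F_Y(a + bX | X) - tau)] = 0 with F_Y(y | X) = (y - c(X)) / X
a_kb, b_kb = solve2(
    mean(lambda x: 1.0 / x), 1.0,
    1.0, mean(lambda x: x),
    mean(lambda x: c(x) / x) + TAU,
    mean(c) + TAU * mean(lambda x: x),
)

# SMD: normal equations for least squares of t(X) on (1/X, 1)
a_smd, b_smd = solve2(
    mean(lambda x: 1.0 / x ** 2), mean(lambda x: 1.0 / x),
    mean(lambda x: 1.0 / x), 1.0,
    mean(lambda x: t(x) / x),
    mean(t),
)

def inside_support(a, b):
    """True if 0 <= F_Y(a + bx | x) <= 1 on the whole grid (no clipping needed)."""
    return all(0.0 <= (a + b * x - c(x)) / x <= 1.0 for x in xs)

print(f"KB : {a_kb:.3f} + {b_kb:.3f} X, inside support: {inside_support(a_kb, b_kb)}")
print(f"SMD: {a_smd:.3f} + {b_smd:.3f} X, inside support: {inside_support(a_smd, b_smd)}")
```

The solutions should land close to the coefficients reported in the caption. If a fitted line left the conditional support, $F_{Y}$ would have to be clipped and the two problems would no longer be linear.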

## Acknowledgments

## Conflicts of Interest

## Appendix

**Proof of Theorem 3.**

**Definition 1.**

**Definition 2.**

**Remark 2.**

**Proof of the Semiparametric Efficiency Bound for the Linear QR Model (4).**

**Remark 3.**

## References

1. J. Angrist, V. Chernozhukov, and I. Fernández-Val. “Quantile regression under misspecification, with an application to the U.S. wage structure.” Econometrica 74 (2006): 539–563.
2. T.-H. Kim, and H. White. “Estimation, inference, and specification testing for possibly misspecified quantile regression.” Adv. Econom. 17 (2003): 107–132.
3. J. Hahn. “Bayesian bootstrap of the quantile regression estimator: A large sample study.” Int. Econ. Rev. 38 (1997): 795–808.
4. R. Koenker, and G. Bassett. “Regression quantiles.” Econometrica 46 (1978): 33–50.
5. R. Koenker. Quantile Regression. Econometric Society Monographs. New York, NY, USA: Cambridge University Press, 2005.
6. J. Powell. “Least absolute deviations estimation for the censored regression model.” J. Econom. 25 (1984): 303–325.
7. J. Powell. “Censored regression quantiles.” J. Econom. 32 (1986): 143–155.
8. V. Chernozhukov, I. Fernández-Val, and A. Galichon. “Quantile and probability curves without crossing.” Econometrica 78 (2010): 1093–1125.
9. G. Chamberlain. “Asymptotic efficiency in estimation with conditional moment restrictions.” J. Econom. 34 (1987): 305–334.
10. C. Ai, and X. Chen. “The semiparametric efficiency bound for models of sequential moment restrictions containing unknown functions.” J. Econom. 170 (2012): 442–457.
11. T.A. Severini, and G. Tripathi. “A simplified approach to computing efficiency bounds in semiparametric models.” J. Econom. 102 (2001): 23–66.
12. W.K. Newey. “Semiparametric efficiency bounds.” J. Appl. Econom. 5 (1990): 99–135.
13. W.K. Newey, and J.L. Powell. “Efficient estimation of linear and type 1 censored regression models under conditional quantile restrictions.” Econom. Theory 6 (1990): 295–317.
14. R. Koenker, S. Leorato, and F. Peracchi. Distributional vs. Quantile Regression. Ceis Working Paper No. 300. 2013. Available online: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2368737 (accessed on 20 November 2015).
15. I. Komunjer. “Quasi-maximum likelihood estimation for conditional quantiles.” J. Econom. 128 (2005): 137–164.
16. A.F. Galvao, and K. Kato. “Estimation and inference for linear panel data models under misspecification when both n and t are large.” J. Bus. Econ. Stat. 32 (2014): 285–309.
17. Y. Lee. “Bias in dynamic panel models under time series misspecification.” J. Econom. 169 (2012): 54–60.
18. T. Magnac, and E. Maurin. “Identification and information in monotone binary models.” J. Econom. 139 (2007): 76–104.
19. A. Lewbel. “Semiparametric latent variable model estimation with endogenous or mismeasured regressors.” Econometrica 66 (1998): 105–122.
20. D.T. Jacho-Chávez. “Efficiency bounds for semiparametric estimation of inverse conditional-density-weighted functions.” Econom. Theory 25 (2009): 847–855.
21. T. Chen, and T. Parker. “Semiparametric efficiency for partially linear single-index regression models.” J. Multivar. Anal. 130 (2014): 376–386.
22. H. White. “Using least squares to approximate unknown regression functions.” Int. Econ. Rev. 21 (1980): 149–170.
23. Y.-J. Whang. “Smoothed empirical likelihood methods for quantile regression models.” Econom. Theory 22 (2006): 173–205.
24. T. Otsu. “Conditional empirical likelihood estimation and inference for quantile regression models.” J. Econom. 142 (2008): 508–538.
25. X. Chen, and D. Pouzo. “Efficient estimation of semiparametric conditional moment models with possibly nonsmooth residuals.” J. Econom. 152 (2009): 46–60.
26. X. Chen, and D. Pouzo. “Estimation of nonparametric conditional moment models with possibly nonsmooth generalized residuals.” Econometrica 80 (2012): 277–321.
27. P. Chaudhuri, K. Doksum, and A. Samarov. “On average derivative quantile regression.” Ann. Stat. 25 (1997): 715–744.
28. Y. Sasaki. “What do quantile regressions identify for general structural functions?” Econom. Theory 31 (2015): 1102–1116.
29. D.G. Luenberger. Optimization by Vector Space Methods. New York, NY, USA: Wiley, 2005.

^{1.} Chernozhukov, Fernández-Val, and Galichon [8] rearrange an estimator $\widehat{Q}_{\tau}(Y|X)$ to be monotonic. The original estimator can be computationally tractable. The rearranged monotonic estimated CDF is $\widehat{F}_{Y}(y|X)=\int_{0}^{1}\mathbf{1}_{\{\widehat{Q}_{\tau}(Y|X)\le y\}}d\tau $. The rearranged quantile estimation is $\widehat{Q}_{\tau}^{*}(Y|X)=\inf\{y:\widehat{F}_{Y}(y|X)\ge \tau \}$.

^{2.} See [12] for the definition of regular estimators.

^{3.} For estimation, [14] studies different approaches based on the distribution regression and quantile regression.

^{4.} Galvao and Kato [16] show that misspecification affects the bias correction and convergence rate of the estimator and provide a misspecification-robust inference procedure. In panel models under time series misspecification, Lee [17] proposes bias reduction methods for the incidental parameter.

^{5.} Severini and Tripathi construct the tangent space for the continuous and bounded joint density $f(X,Y)$ in Section 9 of [11]. Additionally, they define J on the derivative of the moment restriction.

^{7.} $\mu_{X}$ may not be a Lebesgue measure, since I allow discrete components in the covariates X.
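The rearrangement described in footnote 1 can be sketched at a single covariate value as follows. The integral over $\tau$ is approximated by an equal-weight average over a grid of midpoints, which is an assumption of this sketch, and the fitted values in `q_hat` are made-up numbers chosen so that the quantile curve crosses; the function name `rearrange` is also illustrative.

```python
def rearrange(taus, q_hat):
    """Monotone rearrangement of fitted quantiles at one covariate value."""
    m = len(q_hat)

    def cdf(y):
        # F_hat(y) = integral over tau of 1{Q_hat_tau <= y}, as a grid average
        return sum(q <= y for q in q_hat) / m

    candidates = sorted(q_hat)
    # Q*_tau = inf{y : F_hat(y) >= tau}; the infimum is attained at a fitted value
    return [next(y for y in candidates if cdf(y) >= tau) for tau in taus]

m = 10
taus = [(j + 0.5) / m for j in range(m)]  # midpoint grid on (0, 1)
q_hat = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.9, 0.7, 0.8, 1.0]  # made-up, non-monotone

q_star = rearrange(taus, q_hat)
print(q_star)
```

On an equal-weight grid the rearranged curve coincides with the fitted values sorted in increasing order, which is why the procedure is cheap to compute.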

© 2015 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license ( http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Lee, Y.-Y.
Interpretation and Semiparametric Efficiency in Quantile Regression under Misspecification. *Econometrics* **2016**, *4*, 2.
https://doi.org/10.3390/econometrics4010002
