A Joint Specification Test for Response Probabilities in Unordered Multinomial Choice Models

Iwasawa, Masamune

doi:10.3390/econometrics3030667


by Masamune Iwasawa ^{1,2}

¹ Graduate School of Economics, Kyoto University, Yoshida Honmachi, Sakyo-ku, Kyoto 606-8501, Japan

² Japan Society for the Promotion of Science, Kojimachi Business Center Building, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan

Econometrics 2015, 3(3), 667-697; https://doi.org/10.3390/econometrics3030667

Submission received: 4 June 2015 / Revised: 28 August 2015 / Accepted: 9 September 2015 / Published: 16 September 2015

(This article belongs to the Special Issue Recent Developments of Specification Testing)


Abstract:

Estimation results obtained by parametric models may be seriously misleading when the model is misspecified or poorly approximates the true model. This study proposes a test that jointly tests the specifications of multiple response probabilities in unordered multinomial choice models. The test statistic is asymptotically chi-square distributed, consistent against a fixed alternative and able to detect a local alternative approaching the null at a rate slower than the parametric rate. We show that rejection regions can be calculated by a simple parametric bootstrap procedure when the sample size is small. The size and power of the test are investigated by Monte Carlo experiments.

Keywords:

specification test; multinomial choice models; parametric bootstrap; nonparametric methods

JEL classifications:

C12; C25; C15

1. Introduction

Variables of interest in economic research are often discrete and unordered, typically because they indicate the behavior or state of economic agents. Several econometric models have been developed to deal with these discrete and unordered outcomes. Above all, parametric models, such as the multinomial logit (MNL) and probit (MNP) models proposed by [1,2], respectively, are widely employed, for example, in structural econometric analysis (e.g., the economic models of automobile sales in [3,4]) and as part of econometric methods (e.g., selection bias correction of [5,6]). However, results obtained by such parametric models may be seriously misleading when the model is misspecified or poorly approximates the true model. Thus, researchers need to examine the validity of parametric assumptions whenever these assumptions are refutable from data alone.

This study proposes a new specification test that is directly applicable to any multinomial choice models with unordered outcome variables. These models set parametric assumptions on response probabilities that an option is chosen from multiple alternatives, and identical assumptions are often set for all response probabilities. Problems occur when these models do not mimic the true models, because the response probabilities and partial effects of some variables on the probabilities cannot be properly predicted. Moreover, the parameter estimation results may be misleading and their interpretation confusing. The specification test proposed here can be utilized to justify the choice of parametric models and to avoid misspecification problems.

The novelty of the test provided in this study is that it allows us to test the specifications of response probabilities jointly for all choice alternatives. Multinomial choice models with unordered outcomes consist of multiple response probabilities, each of which may be parameterized differently. This implies that one needs to test multiple null hypotheses to justify the parametric assumptions of these models. A substantial number of specification tests have been developed to test a single null hypothesis. To our knowledge, however, no joint specification tests have so far been theoretically suggested for multiple null hypotheses.

The test proposed here is based on moment conditions. We show that the test statistic is asymptotically chi-square distributed, consistent against a fixed alternative and able to detect a local alternative approaching the null at the rate of

1 / \sqrt{n h^{q / 2}}

, where q is the number of independent variables.

One notable feature of our test is that a parametric bootstrap procedure works well to calculate the rejection region for the test statistic. Since the testing method involves nonparametric estimation, a sufficiently large sample size could be required for the chi-squared distribution to be a proper approximation for the distribution of the test statistic. Thus, a simple parametric bootstrap procedure to calculate rejection regions is a practical need.

A crucial point that makes the parametric bootstrap work is that the orthogonality condition holds with bootstrap sampling under both the null and alternative hypotheses. This is different from the specification test for the regression function, which requires the wild bootstrapping procedure to calculate the rejection region, as proven by [7]. It is also noteworthy that the parametric nature of the model leads to substantial savings in the computational cost of bootstrapping.

Methodologically, two different approaches have been developed to construct specification tests. One uses an empirical process and the other a smoothing technique. We call the first type empirical process-based tests and the second type smoothing-based tests. Most of the literature on specification tests can be categorized into one of these two types. Empirical process-based tests are proposed by [8,9,10,11,12,13,14,15,16,17,18], among others. Smoothing-based tests are proposed by [7,19,20,21,22,23,24,25,26,27,28,29,30,31,32], to mention only a few.

These two types of tests are complementary to each other, rather than substitutional, in terms of the power property. For Pitman local alternatives, empirical process-based tests are more powerful than the smoothing-based tests. Empirical process-based tests can detect Pitman local alternatives approaching the null at the parametric rate

n^{- 1 / 2}

, whereas smoothing-based tests can detect them at a rate slower than the parametric rate. Smoothing-based tests are, however, more powerful for a singular local alternative that changes drastically or is of high frequency. Empirical process-based tests can be represented by a kernel-like weight function with a fixed smoothing parameter. Thus, it can be intuitively understood that empirical process-based tests oversmooth the true function and obscure the drastic changes of alternatives. The work in [33] shows that smoothing-based tests can detect singular local alternatives at a rate faster than

n^{- 1 / 2}

.

The test proposed in this study is most related to [30], which proposes a smoothing-based test for functional forms of the regression function. Most of the specification tests developed for functional forms of the regression function can be directly applied to test the parametric specifications of ordered choice models, such as the parametric binary choice models, because ordered choice models have only a single response probability that is equal to the conditional expectation of the outcome. For example, [34] applied several specification tests, originally developed for regression functions, to some ordered discrete choice models for a comparison of their relative merits based on their asymptotic size and power. However, applying them to unordered multinomial choice models, as done in this study, is not a trivial task. Extending empirical process-based tests and rate-optimal tests^{1} to unordered multinomial choice models is a task left for future research.

This paper is organized as follows. Section 2 introduces unordered multinomial choice models and reveals the problems of parametric specification. The new test statistic is proposed in Section 3. The assumptions and asymptotic behavior are provided in Section 4. Section 5 shows how to bootstrap parametrically. We investigate the size and power of the test by conducting Monte Carlo experiments in Section 6. We conclude with Section 7. The proofs of the lemmas and propositions are provided in the Appendix.

2. Unordered Multinomial Choice Models

We have the observations

{{Y_{i, j}, X_{i, j}}_{i = 1}^{n}}_{j = 1}^{J}

, where

Y_{i, j} \in {0, 1}

is a binary response variable that takes one if individual i chooses alternative j and zero otherwise. Each individual chooses one of J alternatives, which implies

Y_{i, m} = 0

for all

m \neq j

if

Y_{i, j} = 1

.

X_{i, j} \in R^{k_{j}}

is a vector of independent variables that affect the choice decision made by individual i. Throughout this paper, we assume that

{X_{i, j}, Y_{i, j}}_{i = 1}^{n}

is independent and identically distributed for each

j = 1, \dots, J

. With i remaining fixed, however,

{X_{i, j}, Y_{i, j}}_{j = 1}^{J}

is not necessarily independent or identical.

Multinomial choice models with unordered response variables are constructed by introducing latent variables

y_{i, j}^{*}

, which may be interpreted as the utility or satisfaction that i can obtain by choosing alternative j. We assume each individual chooses an alternative that maximizes personal utility; that is,

Y_{i, j} = 1

if

y_{i, j}^{*} > y_{i, m}^{*}

for all

m \neq j

. Further,

y_{i, j}^{*}

depends on a function

g_{j} (X_{i, j}, θ)

and unobserved error

ϵ_{i, j}

:

y_{i, j}^{*} = g_{j} (X_{i, j}, θ) + ϵ_{i, j}

, where

ϵ_{i, j}

is independent of

X_{i, j}

and

θ \in Θ

is a parameter in a subset of a finite dimensional space Θ. Then, the response probability that i chooses j can be formulated as follows:

\begin{matrix} P (Y_{i, j} = 1 | X_{i}) & = P (y_{i, j}^{*} > y_{i, m}^{*} \ \forall m \neq j | X_{i}) \\ & = P (ϵ_{i, j} - ϵ_{i, m} > g_{m} (X_{i, m}, θ) - g_{j} (X_{i, j}, θ) \ \forall m \neq j | X_{i}), \end{matrix}

(1)

where

X_{i} \in R^{q}

is a vector consisting of all independent variables. The dimension q of

X_{i}

is equal to

\sum_{j = 1}^{J} k_{j}

when all variables in

X_{i, j}

are alternative-specific for all j. This occurs when no variable in

X_{i, j}

is identical to any of those in

X_{i, m}

whenever

j \neq m

.

A specification of the functional forms of

g (\cdot)

and the distributions of ϵ leads to full parameterization of the model in the sense that parameters and response probabilities can be estimated parametrically. For example, if we assume linearity,

g_{j} (X_{i, j}, θ) = X_{i, j}^{'} β

, and the Type I extreme-value distribution for

ϵ_{i, j}

for all j, we have the MNL model, in which

P (Y_{i, j} = 1 | X_{i}) = \exp (X_{i, j}^{'} β) / \sum_{m = 1}^{J} \exp (X_{i, m}^{'} β)

.^{2} An alternative model suggested by [2] is the MNP model, in which

ϵ_{i, j}

is assumed to be normally distributed. In both cases, the parameters can be inferred by maximum likelihood estimation, and the choice probabilities are obtained by plugging the estimated values into (1).

Specifying the distribution of ϵ in (1) is less restrictive than specifying the distribution of a random variable in general, because the specification is correct whenever ϵ belongs to a suitable family of distributions: the strict inequality in (1) is preserved when any strictly increasing function is applied to both sides. For example, the distribution of

ϵ_{i, j} - ϵ_{i, m}

in (1) could be transformed into a well-known one, such as the normal or Type I extreme-value distribution. In these special cases, the distributions of the ϵ’s may not be an essential specification issue, provided we can specify the right-hand side of the inequality correctly. In other words, distributional assumptions on the error terms that help us simplify the estimation of parametric models could be justified by specifying the functional forms of

g_{j} (\cdot)

prudently.

In empirical studies, however, functional forms of

g_{j} (\cdot)

and distributions of

ϵ_{i, j}

are generally unknown for all j. Moreover, in unordered multinomial choice models, the functional forms of

g_{j} (\cdot)

and the distributions of

ϵ_{i, j}

may be nonidentical across j. Thus, we need joint specification tests that indicate whether parametric specifications provide a good approximation to the true models. The appropriate null and alternative hypotheses are as follows:

\begin{matrix} H_{0} & : P [m_{θ, j} (X_{i}) = P (Y_{i, j} = 1 | X_{i})] = 1 \text{ for some } θ \in Θ \text{ and for all } j, \\ H_{1} & : P [m_{θ, j} (X_{i}) = P (Y_{i, j} = 1 | X_{i})] < 1 \text{ for any } θ \in Θ \text{ and for some } j, \end{matrix}

where

m_{j} (X_{i})

denotes the true response probabilities and

m_{θ, j} (X_{i})

their parameterized variants.

3. Test Statistic

The test statistic proposed in this study is built on the features of response probabilities, that is, the moment conditions that are satisfied when the parametric response probability is true. This implies that we test the specifications of the functional forms of

g_{j} (\cdot)

and the distributions of

ϵ_{i, j}

simultaneously for all j. Rejection of the null hypothesis, thus, indicates that at least one of the parametric specifications of

g_{j} (\cdot)

and

ϵ_{i, j}

is misspecified.

Before presenting the test statistic, we introduce some notations. Let

f_{h} (x)

be the non-parametric density estimator for a continuous point of

X_{i}

as follows:

\begin{matrix} f_{h} (x) & = \frac{1}{n h^{q}} \sum_{i = 1}^{n} K (\frac{X_{i} - x}{h}), \end{matrix}

where

K (\cdot)

is a kernel function and h is a bandwidth depending on n. In addition, we define

K^{(2)}

as the two-times convolution product of the kernel and

K^{(4)}

as that of

K^{(2)}

.
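The density estimator can be sketched in a few lines. The example below is a hedged illustration, not the paper's implementation: it assumes that the multivariate kernel is a product of univariate quartic kernels (the quartic kernel is the one used in the Monte Carlo section; the product form is an assumption made here), and the function names are hypothetical. For a symmetric kernel, the convolution evaluated at zero reduces to the integral of the squared kernel, which equals 5/7 for the univariate quartic kernel.

```python
import numpy as np

def quartic_product_kernel(U):
    """Product of univariate quartic kernels; U has shape (..., q)."""
    k = (15.0 / 16.0) * (1.0 - U**2) ** 2 * (np.abs(U) < 1.0)
    return k.prod(axis=-1)

def density_estimate(x, X, h):
    """f_h(x) = (n h^q)^{-1} * sum_i K((X_i - x) / h) at a single point x."""
    n, q = X.shape
    return quartic_product_kernel((X - x) / h).sum() / (n * h**q)

# K^(2)(0) = integral of K(u)^2 du for a symmetric kernel (univariate quartic case).
K2_AT_ZERO_1D = 5.0 / 7.0

# Numerical check of K^(2)(0) with a simple Riemann sum on [-1, 1].
u = np.linspace(-1.0, 1.0, 200001)
k2_numeric = (quartic_product_kernel(u[:, None]) ** 2).sum() * (u[1] - u[0])

# Density of an evenly spaced grid on [0, 1], evaluated at the midpoint.
grid = np.linspace(0.0, 1.0, 1001).reshape(-1, 1)
f_mid = density_estimate(np.array([0.5]), grid, h=0.2)
```

Under the product-kernel assumption, the q-dimensional value of the convolution at zero is the univariate value raised to the power q.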

The test statistic is based on

Z_{j} \equiv E [u_{θ, i, j} E (u_{θ, i, j} | X_{i}) f (X_{i})],

where

u_{θ, i, j} = Y_{i, j} - m_{θ, j} (X_{i})

and

f (\cdot)

is the marginal density of

X_{i}

. Under the null hypothesis,

Z_{j} = 0

, since

E (u_{θ, i, j} | X_{i}) = 0

. Under the alternative hypothesis,

E [u_{θ, i, j} E (u_{θ, i, j} | X_{i}) f (X_{i})] = E \{ {[E (u_{θ, i, j} | X_{i})]}^{2} f (X_{i}) \} \geq c E \{ {[P (Y_{i, j} = 1 | X_{i}) - m_{θ, j} (X_{i})]}^{2} \} > 0,

for some positive constant c, provided that

f (\cdot)

is bounded away from zero.

The nonparametric estimates of

Z_{j}

, denoted as

Z_{n, j}

, can be obtained as follows:

Z_{n, j} = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) {\hat{u}}_{θ, i, j} {\hat{u}}_{θ, l, j},

where

{\hat{u}}_{θ, i, j} = Y_{i, j} - m_{\hat{θ}, j} (X_{i})

and

m_{\hat{θ}, j} (X_{i})

is the estimate of

m_{θ, j} (X_{i})

. We denote the asymptotic variance of

Z_{n, j}

and the covariance between

Z_{n, j}

and

Z_{n, m}

by

V_{j, j}

and

V_{j, m}

, respectively.
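The double sum defining the estimate can be written as a quadratic form in the residual vector. The following sketch assumes a product quartic kernel and uses hypothetical names (`z_statistic`, `u_hat`); it is meant only to make the leave-one-out structure of the sum explicit.

```python
import numpy as np

def quartic_product_kernel(U):
    """Product of univariate quartic kernels; U has shape (..., q)."""
    k = (15.0 / 16.0) * (1.0 - U**2) ** 2 * (np.abs(U) < 1.0)
    return k.prod(axis=-1)

def z_statistic(X, u_hat, h):
    """Z_{n,j} for one alternative j.

    X     : (n, q) covariates
    u_hat : (n,) residuals Y_{i,j} - m_{theta_hat,j}(X_i)
    h     : bandwidth
    """
    n, q = X.shape
    diffs = (X[:, None, :] - X[None, :, :]) / h      # (n, n, q) scaled pairwise differences
    W = quartic_product_kernel(diffs) / h**q         # kernel weights h^{-q} K((X_i - X_l) / h)
    np.fill_diagonal(W, 0.0)                         # drop the terms with l == i
    return u_hat @ W @ u_hat / (n * (n - 1))

# Small example with simulated inputs (illustrative values only).
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(8, 2))
u_demo = rng.normal(size=8)
z_demo = z_statistic(X_demo, u_demo, h=0.5)
```

The pairwise array has n² × q entries, so for large n a blockwise loop would be preferable to this fully vectorised form.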

We introduce some further notation to present the test statistic. Note that testing the specification of an arbitrary set of

J - 1

response probabilities is sufficient for testing the null hypothesis, because the remaining response probability is then determined by the constraint

\sum_{j = 1}^{J} P (Y_{i, j} = 1 | X_{i}) = 1

for all i. For notational simplicity, we omit the J-th response probability from our test statistic. Let

Z_{n} \equiv {(Z_{n, 1}, \dots, Z_{n, J - 1})}^{'}

be a

(J - 1) \times 1

vector and

\hat{V}

a

(J - 1) \times (J - 1)

variance-covariance matrix whose

(j, m)

elements are estimates of

V_{j, m}

. Then, the test statistic is

C_{n} = n^{2} h^{q} Z_{n}^{'} {\hat{V}}^{- 1} Z_{n},

where

\begin{matrix} {\hat{V}}_{j, j} & = K^{(2)} (0) \frac{2}{n} \sum_{i = 1}^{n} {[{\hat{σ}}_{j}^{2} (X_{i})]}^{2} f_{h} (X_{i}), \\ {\hat{V}}_{j, m} & = K^{(2)} (0) \frac{2}{n} \sum_{i = 1}^{n} {[{\hat{σ}}_{j, m} (X_{i})]}^{2} f_{h} (X_{i}), \end{matrix}

for all

j = 1, \dots, J - 1

and

j \neq m

.

{\hat{σ}}_{j}^{2} (\cdot)

is the estimated conditional variance of

u_{i, j} \equiv Y_{i, j} - m_{j} (X_{i})

, where

E (u_{i, j} | X_{i}) = 0

and

m_{j} (x) \equiv P (Y_{i, j} = 1 | X_{i} = x) = E (Y_{i, j} | X_{i} = x)

, and

{\hat{σ}}_{j, m} (\cdot)

is the estimated covariance between

u_{i, j}

and

u_{i, m}

.

Considering the nature of the model,

{\hat{σ}}_{j}^{2} (\cdot)

and

{\hat{σ}}_{j, m} (\cdot)

can be easily obtained. Since

Y_{i, j}

is a binary variable taking zero or one,

u_{i, j} = [1 - m_{j} (X_{i})] 1 (Y_{i, j} = 1) - m_{j} (X_{i}) 1 (Y_{i, j} = 0)

, where

1 (\cdot)

is an indicator function. The conditional variance of

u_{i, j}

and the covariance between

u_{i, j}

and

u_{i, m}

can then be written straightforwardly as follows:

\begin{matrix} σ_{j}^{2} (X_{i}) & \equiv E (u_{i, j}^{2} | X_{i}) = m_{j} (X_{i}) [1 - m_{j} (X_{i})], \end{matrix}

(2)

\begin{matrix} σ_{j, m} (X_{i}) & \equiv E (u_{i, j} u_{i, m} | X_{i}) = - m_{j} (X_{i}) m_{m} (X_{i}) . \end{matrix}

(3)
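Both identities follow from two facts about the choice indicators: they are binary, so the square of an indicator equals the indicator itself, and exactly one alternative is chosen, so the product of two different indicators is zero. Writing m_j for m_j(X_i) and suppressing the index i, the steps are:

```latex
\begin{aligned}
E(u_{j}^{2} \mid X) &= E(Y_{j}^{2} \mid X) - 2 m_{j} E(Y_{j} \mid X) + m_{j}^{2}
                     = m_{j} - 2 m_{j}^{2} + m_{j}^{2} = m_{j} (1 - m_{j}), \\
E(u_{j} u_{m} \mid X) &= E(Y_{j} Y_{m} \mid X) - m_{j} E(Y_{m} \mid X) - m_{m} E(Y_{j} \mid X) + m_{j} m_{m}
                       = 0 - m_{j} m_{m} - m_{j} m_{m} + m_{j} m_{m} = - m_{j} m_{m}.
\end{aligned}
```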

Thus, consistent parametric estimators of

σ_{j}^{2} (X_{i})

and

σ_{j, m} (X_{i})

under the null hypothesis are

{\hat{σ}}_{j}^{2} (x) = m_{\hat{θ}, j} (x) [1 - m_{\hat{θ}, j} (x)]

and

{\hat{σ}}_{j, m} (x) = - m_{\hat{θ}, j} (x) m_{\hat{θ}, m} (x)

, respectively.
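Putting the pieces together, the statistic can be computed from the covariates, the observed choices and the fitted response probabilities. The sketch below is one possible arrangement under explicit assumptions: a product quartic kernel (so that the convolution at zero equals (5/7)^q), a density estimate that includes the own observation as in the definition of the density estimator, and hypothetical function and variable names. It is not taken from the paper's code.

```python
import numpy as np

def quartic_product_kernel(U):
    """Product of univariate quartic kernels; U has shape (..., q)."""
    k = (15.0 / 16.0) * (1.0 - U**2) ** 2 * (np.abs(U) < 1.0)
    return k.prod(axis=-1)

def c_statistic(X, Y, P_hat, h):
    """Return (C_n, Z_n, V_hat).

    X     : (n, q) covariates
    Y     : (n, J) one-hot observed choices
    P_hat : (n, J) fitted response probabilities m_{theta_hat,j}(X_i)
    """
    n, q = X.shape
    J = Y.shape[1]
    K = quartic_product_kernel((X[:, None, :] - X[None, :, :]) / h) / h**q
    f_hat = K.sum(axis=1) / n                    # f_h(X_i), own observation included
    W = K.copy()
    np.fill_diagonal(W, 0.0)                     # leave-one-out weights for Z_n

    U = Y[:, :J - 1] - P_hat[:, :J - 1]          # residuals; the J-th alternative is omitted
    Z = np.einsum("ij,il,lj->j", U, W, U) / (n * (n - 1))

    P = P_hat[:, :J - 1]
    # sigma[i, j, m] = p_ij * (1{j = m} - p_im): equations (2) and (3) with fitted probabilities
    sigma = P[:, :, None] * (np.eye(J - 1)[None, :, :] - P[:, None, :])
    k2_zero = (5.0 / 7.0) ** q                   # K^(2)(0), assuming a product quartic kernel
    V = k2_zero * (2.0 / n) * np.einsum("ijm,i->jm", sigma**2, f_hat)

    C = n**2 * h**q * Z @ np.linalg.solve(V, Z)
    return C, Z, V

# Illustrative inputs: three alternatives, uniform covariates, flat fitted probabilities.
rng = np.random.default_rng(1)
X_demo = rng.uniform(size=(40, 3))
Y_demo = np.eye(3)[rng.integers(0, 3, size=40)]
P_demo = np.full((40, 3), 1.0 / 3.0)
C_demo, Z_demo, V_demo = c_statistic(X_demo, Y_demo, P_demo, h=0.45)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of the estimated variance-covariance matrix explicitly.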

4. The Asymptotic Behavior

First, we provide sufficient assumptions to show the asymptotic behavior of the test statistic. Asymptotic distributions under the null and alternative hypotheses are then given. Finally, we show the asymptotic behavior of the test statistic under Pitman local alternatives.

4.1. Assumptions

The following are sufficient assumptions to show the test statistic’s asymptotic behavior.

Assumption 1:

X lies on a compact set. The marginal density of

X_{i}

, denoted as

f (\cdot)

, is continuously differentiable and bounded away from zero.

Assumption 2:

m (\cdot)

is continuously differentiable on the support of X.

Assumption 3:

P (Y_{i, j} = 1 | X_{i}) \neq 0

and

P (Y_{i, j} = 1 | X_{i}) \neq 1

, for all i and j. None of the alternatives is a perfect substitute for any other.

Assumption 4:

m_{θ, j} (X)

is differentiable with respect to θ, the derivative

\frac{\partial}{\partial θ} m_{θ, j} (X)

is continuous with respect to X and θ, and

\sup_{θ \in Θ} | m_{θ, j} (x) | < \infty

for all x.

Assumption 5:

There exists a unique value of θ, defined as

θ_{0} = \underset{θ}{\arg \max} \sum_{i = 1}^{n} \sum_{j = 1}^{J} 1 \{Y_{i, j} = 1\} \log [m_{θ, j} (X_{i})]

. Letting

θ_{0} = θ

, it satisfies

\hat{θ} - θ = O_{p} (1 / \sqrt{n})

.

Assumption 1 establishes that the first-order derivative of

f (\cdot)

is bounded. The assumption that X lies on a compact set may be considered too strong, because it excludes some tractable distributions of X, such as the normal. However, it does not restrict empirical applications of the test, because, in general, observations do not take infinite values. The assumption that

f (\cdot)

is bounded away from zero avoids the random denominator problem associated with nonparametric kernel estimation. It is also straightforward to see that the first-order derivative of

m (\cdot)

is also bounded under Assumptions 1 and 2.

Assumption 3 guarantees that

σ_{j}^{2} (X_{i}) \neq 0

and

σ_{j, l} (X_{i}) \neq 0

for any j and

l \neq j

, because

σ_{j}^{2} (X_{i}) = P (Y_{i, j} = 1 | X_{i}) P (Y_{i, j} = 0 | X_{i})

and

σ_{j, l} (X_{i}) = - P (Y_{i, j} = 1 | X_{i}) P (Y_{i, l} = 1 | X_{i})

. It is also clear that

σ_{j}^{2} (X_{i})

and

σ_{j, l} (X_{i})

never tend to infinity, owing to the nature of the model. The fact that no alternatives are perfect substitutes for each other ensures that the variance-covariance matrix V is invertible.

We need Assumption 4 to show the asymptotic behavior of

C_{n}

. The

\sqrt{n}

-consistency of the parametric estimation given in Assumption 5 is obtained, for example, by maximum likelihood estimation of a multinomial probit or logit model.

The kernel function assumption is as follows:

Assumption 6:

The kernel K is a symmetric function and satisfies

\int K (u) d u = 1

,

\int | K (u) | d u < \infty

,

sup | K (u) | < \infty

and

| u K (u) | \to 0

as

| u | \to \infty

.

Assumption 6 is satisfied by commonly-used second-order kernels, such as the Epanechnikov, Gaussian and quartic kernels, and the two-times convolution product of the kernel is bounded under this assumption. Furthermore, the nonparametric density estimator is consistent under Assumptions 1 and 6 (see, for example, Theorem 4.1 of [41]).

4.2. Asymptotic Distribution under the Null Hypothesis

We provide a proposition about the asymptotic distribution of

C_{n}

under the null hypothesis. The proof of the proposition is provided in the Appendix.

Proposition 1.

Let Assumptions 1–6 hold. Then, under the null hypothesis,

C_{n} \overset{d}{\to} χ_{(J - 1)}^{2},

as

h \to 0

and

n h^{q} \to \infty

.

Proposition 1 indicates that the asymptotic distribution of the test statistic

C_{n}

under the null hypothesis is a chi-squared distribution with

J - 1

degrees of freedom. Therefore, at significance level α, we reject the null hypothesis (that the parametric specification of the response probabilities coincides with the true one with probability one) if

C_{n} > t_{α}

, where

t_{α}

is the

(1 - α)

quantile of the chi-squared distribution with

J - 1

degrees of freedom.
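For the three-alternative case (J = 3, the setting of the Monte Carlo experiments below), the critical value has a closed form, because the chi-squared distribution with two degrees of freedom is an exponential distribution with mean two. As a short worked example:

```latex
P(\chi^{2}_{(2)} > t) = e^{- t / 2}
\quad \Longrightarrow \quad
t_{α} = - 2 \ln α,
\qquad
t_{0.05} = - 2 \ln (0.05) \approx 5.99 .
```

For other values of J no such closed form is available in general, and the quantile would be read from chi-squared tables or computed numerically.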

4.3. Asymptotic Distribution under the Alternative Hypothesis

We show that the test is consistent; that is, its asymptotic power is equal to one. The proof of the lemma is provided in the Appendix.

Lemma 1.

Let Assumptions 1–6 hold. Then, under the alternative hypothesis,

\frac{1}{n h^{q / 2}} \frac{n h^{q / 2} Z_{n, j}}{\sqrt{{\hat{V}}_{j, j}}} \overset{p}{\to} \frac{E \{ {[m_{θ, j} (X_{i}) - m_{j} (X_{i})]}^{2} f (X_{i}) \}}{\sqrt{2 K^{(2)} (0) E \{ m_{θ, j} {(X_{i})}^{2} {[1 - m_{θ, j} (X_{i})]}^{2} f (X_{i}) \}}} > 0,

for some j as

n \to \infty

and

h \to 0

.

The proof of Lemma 1 provided in the Appendix implies that

n h^{q / 2} Z_{n, j}

diverges for some j as the sample size n increases and

{\hat{V}}_{j, j}

converges to a constant that is strictly larger than zero. In addition, it is straightforward to see that the probability limit of

{\hat{V}}_{j, m}

under the alternative hypothesis is

2 K^{(2)} (0) E [m_{θ, j} {(X_{i})}^{2} m_{θ, m} {(X_{i})}^{2} f (X_{i})],

which is bounded above by Assumptions 1–3 and 5 for any

j \neq m

. Thus, the following proposition follows immediately.

Proposition 2.

Let Assumptions 1–6 hold. Then, under the alternative hypothesis,

C_{n}

diverges in probability, and thus, the asymptotic power of the test is one.

The proof of Proposition 2 is apparent from Lemma 1 and the discussion on the probability limits of

{\hat{V}}_{j, m}

under the alternative hypothesis mentioned above.

4.4. Asymptotic Distribution under the Pitman Local Alternative

We show that the test statistic

C_{n}

has nontrivial power against Pitman local alternatives approaching the null at the rate of

1 / \sqrt{n h^{q / 2}}

. The proof of the lemma is provided in the Appendix. Let us consider a sequence of local alternatives:

H_{1 n} : P (Y_{i, j} = 1 | X_{i}) = m_{θ, j} (X_{i}) + δ_{n} l_{j} (X_{i}),

where

l_{j} (\cdot)

is a known continuous function with

E [l_{j} {(\cdot)}^{2}] < \infty

for all j and

δ_{n} \to 0

at the rate of

1 / \sqrt{n h^{q / 2}}

.
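To see why this rate is slower than the parametric rate, it helps to factor the bandwidth out of the expression. The bandwidth sequence used in the second line, h = n^{-a} with 0 < a < 1/q, is an assumption introduced only for illustration; it satisfies the conditions h → 0 and nh^q → ∞.

```latex
\begin{aligned}
δ_{n} &= \frac{1}{\sqrt{n h^{q / 2}}} = n^{- 1 / 2} h^{- q / 4},
\qquad
\frac{δ_{n}}{n^{- 1 / 2}} = h^{- q / 4} \to \infty \quad \text{as } h \to 0, \\
h &= n^{- a} \ \Longrightarrow \ δ_{n} = n^{- (1 / 2 - a q / 4)},
\qquad
\text{e.g. } q = 3, \ a = 1 / 6 : \ δ_{n} = n^{- 3 / 8} .
\end{aligned}
```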

Lemma 2.

Let Assumptions 1–6 hold. Then, under the local alternative hypothesis,

\begin{matrix} n h^{q / 2} Z_{n, j} \overset{d}{\to} N (M_{j}, V_{j, j}) \text{ for all } j, \end{matrix}

where

M_{j} \equiv E [l_{j} {(x)}^{2} f (x)] .

Lemma 2 indicates that the limiting distribution of

n h^{q / 2} Z_{n, j} / \sqrt{V_{j, j}}

is the normal distribution with mean

M_{j} V_{j, j}^{- 1 / 2}

and variance one. The following proposition shows that the test statistic can detect the local alternative with nontrivial power.

Proposition 3.

Let Assumptions 1–6 hold. Then, under the local alternative hypothesis, the test statistic

C_{n}

converges to a non-central chi-squared distribution with

J - 1

degrees of freedom:

\begin{matrix} C_{n} \overset{d}{\to} χ_{(J - 1)}^{2} (\tilde{λ}), \end{matrix}

where

\tilde{λ} \equiv M^{'} V^{- 1} M

is a noncentrality parameter.

The proof of Proposition 3 is straightforward from Lemma 2 and the discussion on the probability limits of

{\hat{V}}_{j, m}

for

j \neq m

in the proof of Proposition 1.

5. Bootstrap Methods

This section presents a bootstrapping method that is useful in approximating the distribution of the test statistic when the sample size is small. Specification tests for the regression function usually require the wild bootstrapping procedure to calculate the rejection region, as proven by [7]. In our case, however, the wild bootstrap does not work well. The intuitive reason is that it fails to generate a bootstrap sample for the binary response variable.

We show that the parametric bootstrap procedure works well to calculate the rejection region for the test statistic. Intuitively, this is because the binary bootstrap sample for the response variable, say

Y_{i}^{*}

, can be drawn according to the parametrically-generated response probabilities, and there are no specific conditions that must be satisfied by

Y_{i}^{*}

in multinomial choice models. This is, for example, different from the case of the regression model in which the conditional expectation of the error term should be zero. The proof for the proposition in this section is provided in the Appendix.

The response probability that person i chooses alternative j can be parametrically estimated under the null hypothesis for all i and j by using the observations

{{X_{i, j}, Y_{i, j}}_{i = 1}^{n}}_{j = 1}^{J}

. For each person, we randomly choose one of J alternatives (say, alternative

m_{i}

) with individual probabilities equal to the estimated response probabilities. Then, we derive bootstrap observations

Y_{i}^{*} \equiv {Y_{i, 1}^{*}, Y_{i, 2}^{*}, \dots, Y_{i, m_{i}}^{*}, \dots, Y_{i, J}^{*}}

for each

i = 1, \dots, n

, so that

P (Y_{i, j}^{*} = 1 | X_{i}) = m_{\hat{θ}, j} (X_{i})

, where

Y_{i, m_{i}}^{*} = 1

and

Y_{i, j}^{*} = 0

for

j \neq m_{i}

. We use

{{X_{i, j}, Y_{i, j}^{*}}_{i = 1}^{n}}_{j = 1}^{J}

as the bootstrap observations.
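The drawing step described above amounts to one categorical draw per individual. A minimal sketch, assuming the fitted probabilities are stored as an (n, J) array whose rows sum to one (the function name and the inverse-CDF approach are choices made here, not taken from the paper):

```python
import numpy as np

def draw_bootstrap_choices(P_hat, rng):
    """Draw one-hot bootstrap choices Y* with P(Y*_{i,j} = 1 | X_i) = P_hat[i, j]."""
    n, J = P_hat.shape
    cum = np.cumsum(P_hat, axis=1)            # cumulative probabilities per individual
    u = rng.random(n)[:, None]                # one uniform draw per individual
    chosen = (u > cum).sum(axis=1)            # index m_i of the chosen alternative
    chosen = np.minimum(chosen, J - 1)        # guard against rounding in the last cumulative sum
    Y_star = np.zeros((n, J), dtype=int)
    Y_star[np.arange(n), chosen] = 1          # Y*_{i,m_i} = 1, all other entries stay 0
    return Y_star

# Illustrative check with constant fitted probabilities (0.2, 0.3, 0.5).
rng = np.random.default_rng(0)
P_demo = np.tile([0.2, 0.3, 0.5], (20000, 1))
Y_star_demo = draw_bootstrap_choices(P_demo, rng)
```

The covariates are kept fixed; only the choice indicators are redrawn.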

Assumptions 3 and 5 can be rewritten by using the bootstrap observations as follows:

Assumption 3’:

P (Y_{i, j}^{*} = 1 | X_{i}) \neq 0

and

P (Y_{i, j}^{*} = 1 | X_{i}) \neq 1

, for all i and j. None of the alternatives is a perfect substitute for any other.

Assumption 5’:

There exists a unique value of θ, defined as

θ_{0} = \underset{θ}{\arg \max} \sum_{i = 1}^{n} \sum_{j = 1}^{J} 1 \{Y_{i, j} = 1\} \log [m_{θ, j} (X_{i})]

. Letting

θ_{0} = θ

, it satisfies

{\hat{θ}}^{*} - θ = O_{p} (1 / \sqrt{n})

, where

{\hat{θ}}^{*}

is the estimate of θ obtained by using the bootstrap observations

{{X_{i, j}, Y_{i, j}^{*}}_{i = 1}^{n}}_{j = 1}^{J}

.

Since the bootstrap sample

Y_{i, j}^{*}

is derived in accordance with the parametrically-estimated response probabilities

m_{\hat{θ}, j} (X_{i})

, Assumption 3’ implies that these probabilities do not take the values zero and one; that is,

m_{\hat{θ}, j} (X_{i}) \neq 0

and

m_{\hat{θ}, j} (X_{i}) \neq 1

, for all i and j. Assumption 3’ holds whenever Assumption 3 holds and one applies parametric models whose fitted probabilities lie strictly between zero and one, such as the MNL or MNP model. Assumption 5’ requires that

{\hat{θ}}^{*}

be a consistent estimator of θ. Assumption 5’ is satisfied whenever Assumption 5 holds because the true value of

{\hat{θ}}^{*}

is

\hat{θ}

, which converges to θ in probability.

Bootstrap Methods for $C_{n}$

The test statistic

C_{n}^{*}

is constructed similarly to

C_{n}

by using the bootstrap observations

{{X_{i, j}, Y_{i, j}^{*}}_{i = 1}^{n}}_{j = 1}^{J}

. We obtain the

(1 - α)

quantile

t_{α}^{*}

by Monte Carlo approximation for the distribution of

C_{n}^{*}

. The null hypothesis is rejected if

C_{n} > t_{α}^{*}

. In the following proposition, we show that this parametric bootstrap procedure works: under the null hypothesis,

C_{n}^{*}

converges to the asymptotic distribution of

C_{n}

; under the alternative hypothesis,

C_{n}^{*}

converges to the asymptotic distribution of the test statistic under the null hypothesis.
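The procedure can be outlined as a loop over bootstrap replications. In the sketch below, `fit_probabilities` and `statistic` are hypothetical user-supplied callables standing in for the parametric estimation step and the computation of the test statistic; the toy callables in the usage lines are placeholders chosen only so that the example runs, and are not the paper's model or statistic.

```python
import numpy as np

def bootstrap_critical_value(X, P_hat, fit_probabilities, statistic,
                             alpha=0.05, B=100, seed=0):
    """Approximate the (1 - alpha) quantile of the bootstrap statistic.

    X                 : covariates, held fixed across replications
    P_hat             : (n, J) response probabilities fitted on the original sample
    fit_probabilities : callable (X, Y) -> (n, J) fitted probabilities (hypothetical)
    statistic         : callable (X, Y, P) -> float test statistic (hypothetical)
    """
    rng = np.random.default_rng(seed)
    n, J = P_hat.shape
    stats = np.empty(B)
    for b in range(B):
        # Draw each individual's choice from the fitted response probabilities.
        chosen = np.array([rng.choice(J, p=P_hat[i]) for i in range(n)])
        Y_star = np.eye(J, dtype=int)[chosen]
        P_star = fit_probabilities(X, Y_star)      # re-estimate on the bootstrap sample
        stats[b] = statistic(X, Y_star, P_star)    # bootstrap version of the statistic
    return np.quantile(stats, 1.0 - alpha)

# Usage with toy placeholders: sample frequencies as the "model" and a
# sum of squared residuals as the "statistic".
X_demo = np.zeros((30, 1))
P_demo = np.tile([0.2, 0.3, 0.5], (30, 1))
toy_fit = lambda X, Y: np.tile(Y.mean(axis=0), (len(Y), 1))
toy_stat = lambda X, Y, P: float(((Y - P) ** 2).sum())
crit = bootstrap_critical_value(X_demo, P_demo, toy_fit, toy_stat, B=20, seed=3)
```

The null hypothesis would then be rejected when the statistic computed on the original sample exceeds the returned quantile.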

Proposition 4.

Let Assumptions 1–6 hold. Then, the test statistic obtained with the bootstrap observation converges to a chi-squared distribution with

J - 1

degrees of freedom:

C_{n}^{*} \overset{p}{\to} χ_{(J - 1)}^{2},

as

n \to \infty

and

h \to 0

.

6. Monte Carlo Experiments

The size and power of the test are examined by Monte Carlo experiments. We consider a simple case in which each individual chooses one of three alternatives. To explore the power properties of the test, we consider three different true models.

The null hypothesis to be tested is the following:

H_{0} : P [m_{θ, j} (X_{i}) = \frac{\exp (β_{0} + β_{1} X_{i, j})}{\sum_{m = 1}^{J} \exp (β_{0} + β_{1} X_{i, m})}] = 1,

for some

β_{0}, β_{1} \in R

and for all

j = 1, 2, 3

. The null hypothesis is based on the assumptions that the function

g_{j} (X_{i, j}, θ)

is linear, specifically,

β_{0} + β_{1} X_{i, j}

, and that

ϵ_{i, j}

follows the Type I extreme-value distribution for all j. For simplicity of calculation,

X_{i, j}

is assumed to be one-dimensional.

We consider three different true models. Each of these true models has a specific form of

g_{j} (\cdot)

, which can be generally written as

g_{j} (X_{i, j}, θ) = γ_{j} X_{i, j} + c_{j} {(X_{i, j} - 1 / 2)}^{2} + d_{j} {(2 X_{i, j} - 2 / 3)}^{3}

. By applying specific values in

γ \equiv (γ_{1}, γ_{2}, γ_{3})

,

c \equiv (c_{1}, c_{2}, c_{3})

and

d \equiv (d_{1}, d_{2}, d_{3})

, we propose three true models: Model 1:

γ = (1, 1, 1)

,

c = (0, 0, 1)

, and

d = (0, 0, 0)

; Model 2:

γ = (1, 1, 5)

,

c = (0, 3, 5)

, and

d = (0, 0, 0)

; and Model 3:

γ = (1, 1, 1)

,

c = (0, 3, 5)

, and

d = (0, 3, 5)

. The true distribution of

ϵ_{i, j}

is a Type I extreme-value distribution for all j.

These true models allow us to investigate the power property of the test in the case of misspecification due to nonlinearity and the choice-specific coefficients. We add nonlinearity to the true function of

g_{j} (\cdot)

in all true models by setting

c_{j}

and

d_{j}

at nonzero values. Choice-specific coefficients are inserted into Model 2 by setting

γ_{j}

at different values across j. In this experiment, we do not consider the misspecification originating in the distribution of

ϵ_{i, j}

and the omitted variables.

We draw

{{X_{i, j}}_{j = 1}^{3}}_{i = 1}^{n}

uniformly from [0,1] and

{{ϵ_{i, j}}_{j = 1}^{3}}_{i = 1}^{n}

randomly from the Type I extreme-value distribution. Then, the latent variable

y^{*}

is generated by each true model:

y_{i, j}^{*} = g_{j} (X_{i, j}, θ) + ϵ_{i, j}

. The binary outcome

Y_{i, j}

is chosen to be one, if

y_{i, j}^{*} > y_{i, m}^{*}

for all

m \neq j

, and zero otherwise. Sample sizes are

n = 50

and

n = 100

. The critical value is computed by

B = 100

repetitions of the parametric bootstrap, and all results are based on

M = 1000

simulation runs.
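The data-generating step of this design can be reproduced along the following lines. This is a hedged sketch: the function name and the dictionary `MODELS` are hypothetical, and the Type I extreme-value errors are taken to be standard Gumbel draws (location zero, scale one), which is an assumption, since the text does not state the location and scale.

```python
import numpy as np

# Coefficient vectors (gamma, c, d) of the three true models described above.
MODELS = {
    1: (np.array([1.0, 1.0, 1.0]), np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 0.0])),
    2: (np.array([1.0, 1.0, 5.0]), np.array([0.0, 3.0, 5.0]), np.array([0.0, 0.0, 0.0])),
    3: (np.array([1.0, 1.0, 1.0]), np.array([0.0, 3.0, 5.0]), np.array([0.0, 3.0, 5.0])),
}

def simulate_choices(n, gamma, c, d, rng):
    """Simulate covariates X (n, 3) and one-hot choices Y (n, 3) from one true model."""
    X = rng.uniform(0.0, 1.0, size=(n, 3))                   # X_{i,j} ~ U[0, 1]
    eps = rng.gumbel(loc=0.0, scale=1.0, size=(n, 3))        # standard Gumbel errors (assumed)
    g = gamma * X + c * (X - 0.5) ** 2 + d * (2.0 * X - 2.0 / 3.0) ** 3
    y_star = g + eps                                         # latent utilities
    Y = np.eye(3, dtype=int)[y_star.argmax(axis=1)]          # choose the utility-maximising alternative
    return X, Y

rng = np.random.default_rng(0)
X_sim, Y_sim = simulate_choices(100, *MODELS[2], rng)
```

Because the errors are continuous, ties between latent utilities occur with probability zero, so taking the argmax matches the strict-inequality rule in the text.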

To calculate the test statistics,

X_{i, j}

is considered to be specific to each alternative, namely,

q = 3

. The quartic kernel

K (z) = (15 / 16) {(1 - z^{2})}^{2} 1 (| z | < 1)

is used for nonparametric estimation. Bandwidths for the kernel estimator are chosen to be

h \in {0.30, 0.35, 0.40, 0.45}

.

Table 1 illustrates the size of the test at the

5 %

significance level. The first and second rows of the table show the size of the test, where the critical values are obtained by the parametric bootstrap (

t_{0.05}^{*}

) and asymptotic distribution of the test statistic (

t_{0.05}

), respectively. The first to fourth columns of the table illustrate the results obtained with a sample size of

n = 50

and bandwidths h of

0.30

,

0.35

,

0.40

and

0.45

, respectively. Similarly, the fifth to eighth columns show the result with

n = 100

. Overall, the test tends to over-reject the null hypothesis when the critical values are calculated by parametric bootstrap. The probability of rejection is close to its nominal size when

h = 0.35

and

n = 50

. In contrast, the test tends to under-reject the null hypothesis when the critical values are the

95 %

quantile of the chi-squared distribution with two degrees of freedom. The probability of rejection is close to its nominal size when

h = 0.30

and

n = 50

.
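How a bootstrap critical value such as the one in the first row might be obtained is sketched below. The bootstrap procedure itself is defined earlier in the paper; this sketch assumes that covariates are held fixed, that outcomes are redrawn from the fitted null probabilities, and that the hypothetical callable `stat_fn` re-estimates the null model and returns the test statistic for each bootstrap sample.

```python
import math
import random


def draw_outcomes(probs, rng):
    """Draw one-hot outcomes, one row per observation, from fitted probabilities."""
    Y = []
    for p in probs:
        j = rng.choices(range(len(p)), weights=p, k=1)[0]
        Y.append([1 if m == j else 0 for m in range(len(p))])
    return Y


def bootstrap_critical_value(probs, stat_fn, B=100, alpha=0.05, rng=None):
    """Empirical (1 - alpha) quantile of stat_fn over B parametric bootstrap samples."""
    rng = rng or random.Random()
    stats = sorted(stat_fn(draw_outcomes(probs, rng)) for _ in range(B))
    # Smallest order statistic with at least (1 - alpha) * B draws at or below it.
    k = int(math.ceil((1.0 - alpha) * B - 1e-9)) - 1
    return stats[min(max(k, 0), B - 1)]
```

With B = 100 and a 5% level, the function returns the 95th smallest bootstrap statistic; other quantile conventions are possible and would change the value slightly.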

Table 1. Monte Carlo estimates of the size.

Critical Value\h	$n = 50$				$n = 100$
Critical Value\h	$0.30$	$0.35$	$0.40$	$0.45$	$0.30$	$0.35$	$0.40$	$0.45$
$t_{0.05}^{*}$	$0.069$	$0.047$	$0.062$	$0.058$	$0.058$	$0.064$	$0.063$	$0.077$
$t_{0.05}$	$0.049$	$0.040$	$0.042$	$0.043$	$0.032$	$0.051$	$0.050$	$0.046$

The significance level is

0.05

.
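As a reading aid for Table 1, the short computation below gives the asymptotic critical value and a rough measure of simulation noise. The closed-form quantile holds because a chi-squared variable with two degrees of freedom is exponential with mean 2. The normal-approximation band for the Monte Carlo rejection frequency is an addition made here for illustration and is not part of the paper's analysis.

```python
import math

alpha = 0.05   # nominal size
M = 1000       # Monte Carlo replications

# 95% quantile of the chi-squared distribution with 2 degrees of freedom.
chi2_crit_2df = -2.0 * math.log(alpha)

# Standard error of an estimated rejection frequency when the true size is alpha.
mc_se = math.sqrt(alpha * (1.0 - alpha) / M)

# Approximate 95% band around the nominal size due to simulation noise alone.
band = (alpha - 1.96 * mc_se, alpha + 1.96 * mc_se)

print(round(chi2_crit_2df, 3), round(mc_se, 4), tuple(round(b, 4) for b in band))
```

Entries of Table 1 that fall inside the band are hard to distinguish from the nominal size with M = 1000 runs.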

In comparing the power performance of the test, it is possible to correct size distortion by using the bandwidths corresponding to the nominal size of the tests. In practice, however, this procedure cannot be employed, because we do not know the true model. Thus, we do not correct the size distortion in this experiment. We rather show the power performance with each bandwidth level, since choosing an appropriate bandwidth in practice is outside the scope of this paper.

Before presenting the power results, we illustrate the discrepancy between the true and parametric null models. The response probabilities in this simulation map the unit cube to the unit interval. For ease of illustration, however, we restrict the domain of the response probabilities to the diagonal

\{X_{i} = (X_{i, 1}, X_{i, 2}, X_{i, 3}) : X_{i, j} \in [0, 1] \text{ for all } j \text{ and } X_{i, 1} = X_{i, 2} = X_{i, 3}\}

. In this setting, the fitted values for the response probabilities of the parametric model under the null hypothesis are always

1 / 3

for all j, because the model does not have any alternative-variant coefficients.
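The claim that the fitted null probabilities equal 1/3 on this diagonal can be checked numerically. The sketch assumes the null MNL index is linear in the alternative-specific covariate with a single coefficient `beta` shared by all alternatives; the function name and the coefficient values are illustrative.

```python
import math


def mnl_probs(x, beta):
    """MNL response probabilities with one coefficient shared across alternatives."""
    weights = [math.exp(beta * xj) for xj in x]
    total = sum(weights)
    return [w / total for w in weights]


# On the diagonal X_1 = X_2 = X_3 the indices coincide, so every probability is 1/3
# whatever the value of beta.
for beta in (-2.0, 0.7, 3.0):
    probs = mnl_probs([0.4, 0.4, 0.4], beta)
    assert all(abs(p - 1.0 / 3.0) < 1e-12 for p in probs)
```

Off the diagonal the probabilities differ across alternatives, which is why the restriction is used only for plotting.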

Figure 1 shows how the true and null response probabilities react to the covariates. A larger distance between the true and null response probabilities at a given x indicates that the parametric null model approximates the true model less well. The parametric predictions of response probabilities lie closer to the true response probability in Model 1 than in Models 2 and 3 for all j. For the second and third alternatives, the parametric null response probability appears to lie closer to the true response probability in Model 3 than in Model 2. For the first alternative, however, the distance between the true and null models seems smaller in Model 3. In brief, the null model gives the best response probability predictions in Model 1, but the predictions are less accurate in Models 2 and 3. The prediction precision of the null model could be reflected in the power performance of the test statistic.

Table 2 reports the proportion of rejections of the null hypothesis at the

5 %

significance level. The first to third rows of the table show the power of the test when the true models are Model 1, Model 2 and Model 3, respectively, where the critical values are obtained by parametric bootstrap. Similarly, the fourth to sixth rows of the table show the power, where the critical values are obtained by the

95 %

quantile of the chi-squared distribution with two degrees of freedom. The first to fourth columns of the table illustrate the power results obtained with a sample size of

n = 50

and bandwidths h of

0.30

,

0.35

,

0.40

and

0.45

, respectively. Similarly, the fifth to eighth columns show the result with

n = 100

.

The test has little power beyond its nominal size when Model 1 is true: the rejection rates in Table 2 range from 0.043 to 0.076. Non-rejection of the null hypothesis does not imply that the null model is true. However, as the top three panels of Figure 1 show, the parametric model under the null hypothesis may approximate the response probabilities of Model 1 well, so the low power of the test in this case may be acceptable. In contrast, the test has substantial power when Model 2 or 3 is true. Power increases with the sample size and also depends on the choice of bandwidth.

Figure 1. Discrepancy between true and estimated parametric response probabilities.

Table 2. Proportion of null hypothesis rejections based on Monte Carlo simulation.

Critical Value	Model\h	$n = 50$				$n = 100$
Critical Value	Model\h	$0.30$	$0.35$	$0.40$	$0.45$	$0.30$	$0.35$	$0.40$	$0.45$
$t_{0.05}^{*}$	Model 1	$0.061$	$0.058$	$0.055$	$0.070$	$0.065$	$0.053$	$0.063$	$0.076$
	Model 2	$0.810$	$0.888$	$0.935$	$0.960$	$0.998$	$1.000$	$1.000$	$1.000$
	Model 3	$0.236$	$0.330$	$0.397$	$0.411$	$0.672$	$0.709$	$0.791$	$0.838$
$t_{0.05}$	Model 1	$0.047$	$0.043$	$0.056$	$0.061$	$0.049$	$0.053$	$0.052$	$0.053$
	Model 2	$0.807$	$0.907$	$0.940$	$0.970$	$0.995$	$1.000$	$0.999$	$1.000$
	Model 3	$0.255$	$0.304$	$0.365$	$0.415$	$0.600$	$0.713$	$0.777$	$0.839$

The significance level is

0.05

.

Closer inspection of Table 2 reveals that the test performs better in terms of power when the critical values are obtained by parametric bootstrap, especially when the sample size is

n = 50

. To see this, we compare the results of

h = 0.35

when critical values are

t_{0.05}^{*}

with those of

h = 0.30

when critical values are

t_{0.05}

. We compare the results with different bandwidths because the size of the test is close to its nominal size with these bandwidths (

0.047

and

0.049

, respectively). When the true model is Model 2, the probability of the rejection of the null hypothesis is

0.888

for

t_{0.05}^{*}

. The probability of the rejection is

0.807

for

t_{0.05}

. Similarly, when the true model is Model 3, the probability is

0.330

for

t_{0.05}^{*}

and

0.255

for

t_{0.05}

. Perhaps surprisingly, the test performs reasonably well even when critical values are taken from the asymptotic distribution. However, at least in this setting and for small samples, the test shows higher power when critical values are obtained by parametric bootstrap.

7. Conclusions

This study proposes a consistent specification test for unordered multinomial choice models. It tests the specifications of multiple response probabilities jointly for all choice alternatives. The test statistic is asymptotically chi-square distributed with

J - 1

degrees of freedom, is consistent against a fixed alternative and has nontrivial power against local alternatives approaching the null at the rate of

1 / \sqrt{n h^{q / 2}}

. The rejection region for the test statistic can be calculated through a simple parametric bootstrap procedure when the sample size is small. In Monte Carlo experiments, we test the specification of the MNL model under three true models to examine the power performance of the test. We find that the test has little power beyond its nominal size when the parametric model under the null hypothesis approximates the response probabilities of the true model well, and substantially more power when the approximation is poor. In addition, the test shows higher power when critical values are obtained by parametric bootstrap than when they are obtained from the asymptotic distribution of the test statistic, and the difference is greater when the sample size is small. Size distortion could be reduced by choosing an appropriate bandwidth, but this issue remains for future research.

The test proposed in this study can be applied to testing the parametric specifications of response probabilities for any unordered multinomial choice models, including the MNL and MNP models. However, the test is not able to detect local alternatives approaching the null hypothesis at the parametric rate, nor is it rate-optimal. Extending the testing procedure to incorporate such features is left for future research.

Acknowledgments

The author is grateful to Yoshihiko Nishiyama, Ryo Okui, and Naoya Sueishi for their helpful comments and guidance. The author also thanks Kohtaro Hitomi, Yoon-Jae Whang, and two anonymous referees for constructive comments that improved the paper. This work was supported by JSPS KAKENHI Grant Number 13J06130.

Conflicts of Interest

The author declares no conflict of interest.

Appendix: Proofs

Proof of Proposition 1.

We first prove the following:

\begin{matrix} n h^{q / 2} {\hat{Z}}_{n, j} \overset{d}{\to} N (0, V_{j, j}), \end{matrix}

(4)

\begin{matrix} V_{j, j} - {\hat{V}}_{j, j} = o_{p} (1), \end{matrix}

(5)

\begin{matrix} V_{j, m} - {\hat{V}}_{j, m} = o_{p} (1), \end{matrix}

(6)

where

V_{j, j}

and

V_{j, m}

are the asymptotic variance of

n h^{q / 2} {\hat{Z}}_{n, j}

and the covariance between

n h^{q / 2} {\hat{Z}}_{n, j}

and

n h^{q / 2} {\hat{Z}}_{n, m}

, respectively. We show that they can be written as follows:

\begin{matrix} V_{j, j} \equiv 2 K^{(2)} (0) E \{{[σ_{j}^{2} (x)]}^{2} f (x)\}, \\ V_{j, m} \equiv 2 K^{(2)} (0) E \{{[σ_{j, m} (x)]}^{2} f (x)\} . \end{matrix}

Proof of (4).

Under the null hypothesis, we have

m_{j} (\cdot) = m_{θ, j} (\cdot)

. Thus, it follows that

\begin{matrix} n h^{q / 2} {\hat{Z}}_{n, j} \\ = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{\hat{θ}, i, j} u_{\hat{θ}, l, j} \\ = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i}) + u_{i, j}] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l}) + u_{l, j}] \\ = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j} u_{l, j} \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] u_{l, j} \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] u_{i, j} \\ \equiv Z_{1, j} + Z_{2, j} + Z_{3, j} + Z_{4, j} . \end{matrix}

We will prove the following:

\begin{matrix} Z_{1, j} = o_{p} (1), \end{matrix}

(7)

\begin{matrix} Z_{2, j} \overset{d}{\to} N (0, V_{j, j}), \end{matrix}

(8)

\begin{matrix} Z_{3, j} + Z_{4, j} = o_{p} (1) . \end{matrix}

(9)

Proof of (7).

We show that

Z_{1, j} = o_{p} (1)

. By Assumption 4, there is an interior point

\tilde{θ}

between θ and

\hat{θ}

, such that

m_{\hat{θ}, j} (X_{i}) - m_{θ, j} (X_{i}) = \frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (X_{i}) (\hat{θ} - θ) .

(10)

By using this,

Z_{1, j}

can be represented as follows:

\begin{matrix} Z_{1, j} & = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] \\ = h^{q / 2} \sqrt{n} {(\hat{θ} - θ)}^{'} \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) \frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ} \frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ^{'}} \sqrt{n} (\hat{θ} - θ) \\ \equiv h^{q / 2} \sqrt{n} {(\hat{θ} - θ)}^{'} \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{1} (X_{i}, X_{l}) \sqrt{n} (\hat{θ} - θ), \end{matrix}

where

{\bar{Z}}_{1} (X_{i}, X_{l})

is a symmetric function. We apply Lemma 3.1 of [42] to the second-order U-statistic,

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{1} (X_{i}, X_{l})

. To do this, we need to show that

E [∥ {\bar{Z}}_{1} (X_{i}, X_{l}) ∥^{2}] = o (n)

.

\begin{matrix} E [∥ {\bar{Z}}_{1} (X_{i}, X_{l}) ∥^{2}] & = \frac{1}{h^{2 q}} E [K {(\frac{X_{i} - X_{l}}{h})}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ}∥}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ}∥}^{2}] \\ = \frac{1}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ}∥}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ}∥}^{2} f (x) f (y) d x d y \\ = \frac{1}{h^{q}} \int K {(u)}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ}∥}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (x - u h)}{\partial θ}∥}^{2} f (x) f (x - u h) d x d u \\ = \frac{1}{h^{q}} K^{(2)} (0) \int {∥\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ}∥}^{4} f {(x)}^{2} d x + O (h) \\ = O (h^{- q}) + O [n {(n h^{q})}^{- 1}] = o (n) since n h^{q} \to \infty . \end{matrix}

(11)

Applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{1} (X_{i}, X_{l}) = E [{\bar{Z}}_{1} (X_{i}, X_{l})] + o_{p} (1 / \sqrt{n})

, where

\begin{matrix} E [{\bar{Z}}_{1} (X_{i}, X_{l})] & = \frac{1}{h^{q}} E [K (\frac{X_{i} - X_{l}}{h}) \frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ} \frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ^{'}}] \\ = \frac{1}{h^{q}} \int K (\frac{x - y}{h}) \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ} \frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}} f (x) f (y) d x d y \\ = \int K (u) \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ} \frac{\partial m_{\tilde{θ}, j} (x - u h)}{\partial θ^{'}} f (x) f (x - u h) d x d u \\ = \int \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ} \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}} f {(x)}^{2} d x \\ = O (1) . \end{matrix}

Therefore, we obtain

\begin{matrix} Z_{1, j} & = h^{q / 2} \sqrt{n} {(\hat{θ} - θ)}^{'} \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{1} (X_{i}, X_{l}) \sqrt{n} (\hat{θ} - θ), \\ = O_{p} (h^{q / 2}) = o_{p} (1) . \end{matrix}

Proof of (8).

Note that

Z_{2, j}

can be treated as a second-order degenerate U-statistic:

\begin{matrix} \frac{h^{q / 2}}{n} Z_{2, j} & = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j} u_{l, j} . \end{matrix}

Define

G_{n} (Z_{1}, Z_{2}) = E_{Z_{i}} [\{K [(X_{1} - X_{i}) / h] u_{1, j} u_{i, j}\} \{K [(X_{2} - X_{i}) / h] u_{2, j} u_{i, j}\}]

, where

Z_{i} = \{X_{i}, u_{i}\}

. According to the central limit theorem for degenerate U-statistics proposed by [43],

\frac{Z_{2, j}}{h^{- q / 2} \sqrt{2 E \{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{2}\}}} \overset{d}{\to} N (0, 1),

if

\begin{matrix} \frac{E [G_{n}^{2} (Z_{1}, Z_{2})] + n^{- 1} E \{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{4}\}}{{(E \{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{2}\})}^{2}} \to 0 \text{ as } n \to \infty . \end{matrix}

(12)

Thus, it is enough to show that (12) and

\frac{2}{h^{q}} E \{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{2}\} \to V_{j, j},

(13)

hold.

Proof of (12).

First, straightforward calculation gives

\begin{matrix} E [G_{n}^{2} (Z_{1}, Z_{2})] & = E [{\{E_{Z_{i}} [u_{1, j} u_{2, j} u_{i, j}^{2} K (\frac{X_{1} - X_{i}}{h}) K (\frac{X_{2} - X_{i}}{h})]\}}^{2}] \\ = E \{σ_{j}^{2} (X_{1}) σ_{j}^{2} (X_{2}) {[\int σ_{j}^{2} (z) K (\frac{X_{1} - z}{h}) K (\frac{X_{2} - z}{h}) f (z) d z]}^{2}\} \\ = h^{3 q} K^{(4)} (0) \int {[σ_{j}^{2} (x)]}^{4} f {(x)}^{4} d x + O (h^{3 q + 1}) + o (h^{3 q + 1}) \\ = O (h^{3 q}) . \end{matrix}

(14)

Similarly, it can be shown that

\begin{matrix} \frac{1}{n} E \{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{4}\} & = \frac{1}{n} \int σ_{j}^{4} (x) σ_{j}^{4} (y) {[K (\frac{x - y}{h})]}^{4} f (x) f (y) d x d y \\ = \frac{h^{q}}{n} \int {[σ_{j}^{4} (x)]}^{2} f^{2} (x) d x \int {[K (u)]}^{4} d u + O (\frac{h^{2 q}}{n}) \\ = O (\frac{h^{q}}{n}) . \end{matrix}

(15)

Following some calculation, we obtain

\begin{matrix} E {\{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{2}\}}^{2} & = E {\{σ_{j}^{2} (X_{1}) σ_{j}^{2} (X_{2}) {[K (\frac{X_{1} - X_{2}}{h})]}^{2}\}}^{2} \\ = h^{2 q} {\{K^{(2)} (0) \int {[σ_{j}^{2} (x)]}^{2} f^{2} (x) d x + O (h)\}}^{2} \\ = O (h^{2 q}) . \end{matrix}

(16)

Finally, (14)–(16) indicate that (12) holds because

\frac{O (h^{3 q}) + O (\frac{h^{q}}{n})}{O (h^{2 q})} \to 0

as

h \to 0

and

n h^{q} \to \infty

.
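To spell out this last step: by (16), the denominator equals h^{2q} times a factor that converges to the square of a constant, written c here (notation introduced only for this remark) and assumed to be strictly positive. Dividing each term of the numerator by h^{2q} then gives

\frac{O (h^{3 q}) + O (h^{q} / n)}{h^{2 q} {\{c + O (h)\}}^{2}} = O (h^{q}) + O ({(n h^{q})}^{- 1}) \to 0 ,

where the first term vanishes because h \to 0 and the second because n h^{q} \to \infty.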

Proof of (13).

From Equation (16), it is clear that

\begin{matrix} \frac{2}{h^{q}} E \{{[u_{1, j} u_{2, j} K (\frac{X_{1} - X_{2}}{h})]}^{2}\} & = 2 K^{(2)} (0) \int {[σ_{j}^{2} (x)]}^{2} f^{2} (x) d x + O (h) \\ = 2 K^{(2)} (0) E \{{[σ_{j}^{2} (x)]}^{2} f (x)\} + O (h) \\ \to V_{j, j} . \end{matrix}

(17)

Proof of (9).

We show that

Z_{3, j} + Z_{4, j} = o_{p} (1)

. By using (10),

Z_{3, j} + Z_{4, j}

can be represented as follows:

\begin{matrix} Z_{3, j} + Z_{4, j} & = \frac{1}{(n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q / 2}} K (\frac{X_{i} - X_{l}}{h}) {[m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] u_{l, j} + [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] u_{i, j}} \\ = \frac{1}{(n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q / 2}} K (\frac{X_{i} - X_{l}}{h}) \{[\frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ^{'}} (\hat{θ} - θ)] u_{l, j} + [\frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ^{'}} (\hat{θ} - θ)] u_{i, j}\} \\ = \frac{h^{q / 2}}{(n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) [u_{l, j} \frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ^{'}} + u_{i, j} \frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ^{'}}] (\hat{θ} - θ) \\ \equiv \frac{h^{q / 2}}{(n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{3} (X_{i}, X_{l}) (\hat{θ} - θ), \end{matrix}

where

{\bar{Z}}_{3} (X_{i}, X_{l})

is a symmetric function. We apply Lemma 3.1 of [42] to the second-order U-statistic,

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{3} (X_{i}, X_{l})

. To do this, we need to show that:

E [∥ {\bar{Z}}_{3} (X_{i}, X_{l}) ∥^{2}] = o (n)

.

\begin{matrix} E [| {\bar{Z}}_{3} (X_{i}, X_{l}) |^{2}] & \leq \frac{2}{h^{2 q}} E \{K {(\frac{X_{i} - X_{l}}{h})}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ^{'}}∥}^{2} u_{l, j}^{2}\} \\ + \frac{2}{h^{2 q}} E \{K {(\frac{X_{i} - X_{l}}{h})}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ^{'}}∥}^{2} u_{i, j}^{2}\} \\ = \frac{2}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}}∥}^{2} σ_{j}^{2} (y) f (x) f (y) d x d y \\ + \frac{2}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}}∥}^{2} σ_{j}^{2} (x) f (x) f (y) d x d y \\ = \frac{2}{h^{q}} \int K {(u)}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}}∥}^{2} σ_{j}^{2} (x - u h) f (x) f (x - u h) d x d u \\ + \frac{2}{h^{q}} \int K {(v)}^{2} {∥\frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}}∥}^{2} σ_{j}^{2} (y + v h) f (y + v h) f (y) d v d y \\ = \frac{2}{h^{q}} K^{(2)} (0) \int {∥\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}}∥}^{2} σ_{j}^{2} (x) f {(x)}^{2} d x \\ + \frac{2}{h^{q}} K^{(2)} (0) \int {∥\frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}}∥}^{2} σ_{j}^{2} (y) f {(y)}^{2} d y + O (h) \\ = O (h^{- q}) + O [n {(n h^{q})}^{- 1}] = o (n) since n h^{q} \to \infty . \end{matrix}

(18)

Applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{3} (X_{i}, X_{l}) = E [{\bar{Z}}_{3} (X_{i}, X_{l})] + o_{p} (1 / \sqrt{n})

, where

E [{\bar{Z}}_{3} (X_{i}, X_{l})] = 0

. Therefore,

\begin{matrix} Z_{3, j} + Z_{4, j} & = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{Z}}_{3} (X_{i}, X_{l}) (\hat{θ} - θ) \\ = n h^{q / 2} o_{p} (1 / \sqrt{n}) O_{p} (1 / \sqrt{n}) \\ = o_{p} (h^{q / 2}) = o_{p} (1) . \end{matrix}

Proof of (5) and (6).

Since the asymptotic variance is shown above, we derive the asymptotic covariance between

n h^{q / 2} {\hat{Z}}_{n, j}

and

n h^{q / 2} {\hat{Z}}_{n, m}

, which we denote as

V_{j, m}

. From the results of (7)–(9), it is clear that

E (Z_{2, j} Z_{2, m}) \to V_{j, m}

as

n \to \infty

. Because

E (u_{i, j} u_{l, j}) = 0 if i \neq l

, and

E (u_{i, j} u_{i, m} | X_{i}) = σ_{j, m} (X_{i}) if j \neq m

, it follows that

\begin{matrix} E (Z_{2, j} Z_{2, m}) \\ = \frac{1}{{(n - 1)}^{2} h^{q}} E [\sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j} u_{l, j} \sum_{s = 1}^{n} \sum_{t \neq s}^{n} K (\frac{X_{s} - X_{t}}{h}) u_{s, m} u_{t, m}] \\ = \frac{2}{{(n - 1)}^{2} h^{q}} E \{\sum_{i = 1}^{n} \sum_{l \neq i}^{n} u_{i, j} u_{l, j} u_{i, m} u_{l, m} {[K (\frac{X_{i} - X_{l}}{h})]}^{2}\} \\ = \frac{2 n}{(n - 1) h^{q}} \int σ_{j, m} (x) σ_{j, m} (y) {[K (\frac{x - y}{h})]}^{2} f (x) f (y) d x d y \\ = 2 K^{(2)} (0) \int {[σ_{j, m} (x)]}^{2} f^{2} (x) d x + O (h) \\ \to V_{j, m} . \end{matrix}

(19)

Thus, the proofs of (5) and (6) are straightforward from (17) and (19).

Let

Z_{2} = {(Z_{2, 1}, Z_{2, 2}, \dots, Z_{2, J - 1})}^{'}

. Similarly to the proof of (8), it can be straightforwardly shown that

t^{'} Z_{2} \overset{d}{\to} N (0, t^{'} V t)

for any

(J - 1) \times 1

vector t, where V is a

(J - 1) \times (J - 1)

variance-covariance matrix whose

(j, m)

elements are

V_{j, m}

. Then, by the Cramér–Wold device,

Z_{2}

converges in distribution to a multivariate normal distribution with a

(J - 1) \times 1

zero mean vector and variance-covariance matrix V. Therefore,

C_{n}

, which is the quadratic form of

n h^{q / 2} {\hat{Z}}_{n, j}

, converges to a chi-squared distribution with

J - 1

degrees of freedom. ☐
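For the three-alternative case used in the simulations (J - 1 = 2), the quadratic form can be evaluated in closed form. The sketch below assumes the statistic is z' V^{-1} z, where z stacks the scaled statistics for two alternatives and V is the estimated variance-covariance matrix; the exact definition of the statistic is given in the main text, so this form and the function name are assumptions made for illustration.

```python
def joint_statistic_2x2(z, V):
    """Quadratic form z' V^{-1} z for a 2-vector z and a 2x2 matrix V."""
    (a, b), (c, d) = V
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("V must be positive definite")
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (d * z[0] ** 2 - (b + c) * z[0] * z[1] + a * z[1] ** 2) / det
```

Under the null hypothesis the result would be compared with the 95% quantile of the chi-squared distribution with two degrees of freedom, approximately 5.99, or with a bootstrap critical value.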

Proof of Lemma 1.

Under the alternative hypothesis,

n h^{q / 2} {\hat{Z}}_{n, j}

can be represented as follows:

\begin{matrix} n h^{q / 2} {\hat{Z}}_{n, j} & = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{\hat{θ}, i, j} u_{\hat{θ}, l, j} \\ = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{i}) - m_{\hat{θ}, j} (X_{i}) + u_{i, j}] [m_{j} (X_{l}) - m_{\hat{θ}, j} (X_{l}) + u_{l, j}] \\ = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{i}) - m_{θ, j} (X_{i}) + m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i}) + u_{i, j}] \\ [m_{j} (X_{l}) - m_{θ, j} (X_{l}) + m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l}) + u_{l, j}] \\ = \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{i}) - m_{θ, j} (X_{i})] [m_{j} (X_{l}) - m_{θ, j} (X_{l})] \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{i}) - m_{θ, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{i}) - m_{θ, j} (X_{i})] u_{l, j} \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] [m_{j} (X_{l}) - m_{θ, j} (X_{l})] \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] u_{l, j} \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{l}) - m_{θ, j} (X_{l})] u_{i, j} \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] u_{i, j} \\ + \frac{1}{h^{q / 2} (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j} u_{l, j} \\ \equiv A_{1, j} + A_{2, j} + A_{3, j} + A_{4, j} + A_{5, j} + A_{6, j} + A_{7, j} + A_{8, j} + A_{9, j}, \end{matrix}

where

A_{5, j} = Z_{1, j} = o_{p} (1)

and

A_{6, j} + A_{8, j} = Z_{3, j} + Z_{4, j} = o_{p} (1)

. We show that

\frac{1}{n h^{q / 2}} (A_{2, j} + A_{3, j} + A_{4, j} + A_{7, j} + A_{9, j}) = o_{p} (1) .

(20)

Then,

{\hat{Z}}_{n, j} = \frac{1}{n h^{q / 2}} A_{1, j} + o_{p} (1) .

Thus, it suffices to show (20) and the following:

\begin{matrix} \frac{1}{n h^{q / 2}} A_{1, j} & = E \{{[m_{θ, j} (X_{i}) - m_{j} (X_{i})]}^{2} f (X_{i})\} + o_{p} (1), \end{matrix}

(21)

\begin{matrix} {\hat{V}}_{j, j}^{Z h} & = 2 K^{(2)} (0) E \{m_{θ, j} {(X_{i})}^{2} {[1 - m_{θ, j} (X_{i})]}^{2} f (X_{i})\} + o_{p} (1) . \end{matrix}

(22)

Since

{\hat{σ}}_{j}^{2} (x) = m_{\hat{θ}, j} (x) [1 - m_{\hat{θ}, j} (x)]

converges to

m_{θ, j} (x) [1 - m_{θ, j} (x)]

in probability under the alternative hypothesis, the proof of (22) is straightforward.

Proof of (20).

First, we show

\frac{1}{n h^{q / 2}} (A_{2, j} + A_{4, j}) = o_{p} (1)

.

\frac{1}{n h^{q / 2}} (A_{2, j} + A_{4, j})

can be represented as a second-order U-statistic as follows:

\begin{matrix} \frac{1}{n h^{q / 2}} & (A_{2, j} + A_{4, j}) \\ = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) \\ \{[m_{j} (X_{i}) - m_{θ, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] + [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] [m_{j} (X_{l}) - m_{θ, j} (X_{l})]\} \\ = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) \\ \{[m_{j} (X_{i}) - m_{θ, j} (X_{i})] \frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (X_{l}) + [m_{j} (X_{l}) - m_{θ, j} (X_{l})] \frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (X_{i})\} (\hat{θ} - θ) \\ \equiv \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{2} (X_{i}, X_{l}) (\hat{θ} - θ), \end{matrix}

where

{\bar{A}}_{2} (X_{i}, X_{l})

is a symmetric function. We show that

E [∥ {\bar{A}}_{2} (X_{i}, X_{l}) ∥^{2}] = o (n)

.

\begin{matrix} E [∥ {\bar{A}}_{2} (X_{i}, X_{l}) ∥^{2}] & \leq \frac{2}{h^{2 q}} E [K {(\frac{X_{i} - X_{l}}{h})}^{2} {| m_{j} (X_{i}) - m_{θ, j} (X_{i}) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (X_{l})∥}^{2}] \\ + \frac{2}{h^{2 q}} E [K {(\frac{X_{i} - X_{l}}{h})}^{2} {| m_{j} (X_{l}) - m_{θ, j} (X_{l}) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (X_{i})∥}^{2}] \\ = \frac{2}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {| m_{j} (x) - m_{θ, j} (x) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (y)∥}^{2} f (x) f (y) d x d y \\ + \frac{2}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {| m_{j} (y) - m_{θ, j} (y) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (x)∥}^{2} f (x) f (y) d x d y \\ = \frac{2}{h^{q}} \int K {(u)}^{2} {| m_{j} (x) - m_{θ, j} (x) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (x - u h)∥}^{2} f (x) f (x - u h) d x d u \\ + \frac{2}{h^{q}} \int K {(v)}^{2} {| m_{j} (y) - m_{θ, j} (y) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (y + v h)∥}^{2} f (y + v h) f (y) d v d y \\ = \frac{2}{h^{q}} K^{(2)} (0) \int {| m_{j} (x) - m_{θ, j} (x) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (x)∥}^{2} f {(x)}^{2} d x \\ + \frac{2}{h^{q}} K^{(2)} (0) \int {| m_{j} (y) - m_{θ, j} (y) |}^{2} {∥\frac{\partial}{\partial θ^{'}} m_{\tilde{θ}, j} (y)∥}^{2} f {(y)}^{2} d y + O (h) \\ = O (h^{- q}) + O [n {(n h^{q})}^{- 1}] = o (n) since n h^{q} \to \infty . \end{matrix}

Applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{2} (X_{i}, X_{l}) = E [{\bar{A}}_{2} (X_{i}, X_{l})] + o_{p} (1)

, where

\begin{matrix} E [{\bar{A}}_{2} (X_{i}, X_{l})] & = \frac{1}{h^{q}} \int K (\frac{x - y}{h}) [m_{j} (x) - m_{θ, j} (x)] \frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}} f (x) f (y) d x d y \\ + \frac{1}{h^{q}} \int K (\frac{x - y}{h}) [m_{j} (y) - m_{θ, j} (y)] \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}} f (x) f (y) d x d y \\ = \int K (u) [m_{j} (x) - m_{θ, j} (x)] \frac{\partial m_{\tilde{θ}, j} (x - u h)}{\partial θ^{'}} f (x) f (x - u h) d x d u \\ + \int K (v) [m_{j} (y) - m_{θ, j} (y)] \frac{\partial m_{\tilde{θ}, j} (y + v h)}{\partial θ^{'}} f (y + v h) f (y) d v d y \\ = \int [m_{j} (x) - m_{θ, j} (x)] \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}} f {(x)}^{2} d x \\ + \int [m_{j} (y) - m_{θ, j} (y)] \frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}} f {(y)}^{2} d y + O (h) \\ = O (1) . \end{matrix}

Therefore, we obtain

\begin{matrix} \frac{1}{n h^{q / 2}} & (A_{2, j} + A_{4, j}) = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{2} (X_{i}, X_{l}) (\hat{θ} - θ) = O_{p} (1 / \sqrt{n}) = o_{p} (1) . \end{matrix}

Next, we show

\frac{1}{n h^{q / 2}} (A_{3, j} + A_{7, j}) = o_{p} (1)

.

\frac{1}{n h^{q / 2}} (A_{3, j} + A_{7, j})

can be represented as a second-order U-statistic as follows:

\begin{matrix} \frac{A_{3, j} + A_{7, j}}{n h^{q / 2}} & = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) \{[m_{j} (X_{i}) - m_{θ, j} (X_{i})] u_{l, j} + [m_{j} (X_{l}) - m_{θ, j} (X_{l})] u_{i, j}\} \\ \equiv \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{3} (X_{i}, X_{l}), \end{matrix}

where

{\bar{A}}_{3} (X_{i}, X_{l})

is a symmetric function. We show that

E [∥ {\bar{A}}_{3} (X_{i}, X_{l}) ∥^{2}] = o (n)

.

\begin{matrix} E [∥ {\bar{A}}_{3} (X_{i}, X_{l}) ∥^{2}] & \leq \frac{2}{h^{2 q}} E \{K {(\frac{X_{i} - X_{l}}{h})}^{2} {| m_{j} (X_{i}) - m_{θ, j} (X_{i}) |}^{2} u_{l, j}^{2}\} \\ + \frac{2}{h^{2 q}} E \{K {(\frac{X_{i} - X_{l}}{h})}^{2} {| m_{j} (X_{l}) - m_{θ, j} (X_{l}) |}^{2} u_{i, j}^{2}\} \\ = \frac{2}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {| m_{j} (x) - m_{θ, j} (x) |}^{2} σ_{j}^{2} (y) f (x) f (y) d x d y \\ + \frac{2}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} {| m_{j} (y) - m_{θ, j} (y) |}^{2} σ_{j}^{2} (x) f (x) f (y) d x d y \\ = \frac{2}{h^{q}} \int K {(u)}^{2} {| m_{j} (x) - m_{θ, j} (x) |}^{2} σ_{j}^{2} (x - u h) f (x) f (x - u h) d x d u \\ + \frac{2}{h^{q}} \int K {(v)}^{2} {| m_{j} (y) - m_{θ, j} (y) |}^{2} σ_{j}^{2} (y + v h) f (y + v h) f (y) d v d y \\ = \frac{2}{h^{q}} K^{(2)} (0) \int {| m_{j} (x) - m_{θ, j} (x) |}^{2} σ_{j}^{2} (x) f {(x)}^{2} d x \\ + \frac{2}{h^{q}} K^{(2)} (0) \int {| m_{j} (y) - m_{θ, j} (y) |}^{2} σ_{j}^{2} (y) f {(y)}^{2} d y + O (h) \\ = O (h^{- q}) + O [n {(n h^{q})}^{- 1}] = o (n) since n h^{q} \to \infty . \end{matrix}

Applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{3} (X_{i}, X_{l}) = E [{\bar{A}}_{3} (X_{i}, X_{l})] + o_{p} (1)

, where

E [{\bar{A}}_{3} (X_{i}, X_{l})] = 0

. Therefore, we obtain

\begin{matrix} \frac{A_{3, j} + A_{7, j}}{n h^{q / 2}} & = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{3} (X_{i}, X_{l}) = o_{p} (1) . \end{matrix}

Finally, we show that

n^{- 1} h^{- q / 2} A_{9, j} = o_{p} (1)

. It is clear that

\frac{1}{n h^{q / 2}} A_{9, j}

is a second-order U-statistic. This satisfies the condition for Lemma 3.1 of [42] as follows:

\begin{matrix} E [{|\frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) u_{i, j} u_{l, j}|}^{2}] & = \frac{1}{h^{2 q}} \int K {(\frac{x - y}{h})}^{2} σ_{j}^{2} (x) σ_{j}^{2} (y) f (x) f (y) d x d y \\ = \frac{1}{h^{q}} \int K {(u)}^{2} {[σ_{j}^{2} (x)]}^{2} f {(x)}^{2} d x d u + O (h) \\ = O (h^{- q}) = O (n {(n h^{q})}^{- 1}) = o (n) since n h^{q} \to \infty . \end{matrix}

Applying Lemma 3.1 of [42], we obtain

\frac{1}{n h^{q / 2}} A_{9, j} = E [h^{- q} u_{i, j} u_{l, j} K (\frac{X_{i} - X_{l}}{h})] + o_{p} (1)

, where

E [h^{- q} u_{i, j} u_{l, j} K (\frac{X_{i} - X_{l}}{h})] = 0

.

Proof of (21).

\frac{1}{n h^{q / 2}} A_{1, j}

can be represented as a second-order U-statistic as follows:

\begin{matrix} \frac{1}{n h^{q / 2}} A_{1, j} & = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) [m_{j} (X_{i}) - m_{θ, j} (X_{i})] [m_{j} (X_{l}) - m_{θ, j} (X_{l})] \\ \equiv \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{A}}_{1} (X_{i}, X_{l}), \end{matrix}

where

{\bar{A}}_{1} (X_{i}, X_{l})

is a symmetric function. Similar to (11), it can be straightforwardly shown that

E [∥ {\bar{A}}_{1} (X_{i}, X_{l}) ∥^{2}] = o (n)

. The only difference from (11) is we have

{∥m_{j} (x) - m_{θ, j} (x)∥}^{2}

, which is uniformly bounded, as a part of the integrand instead of

{∥\partial m_{\tilde{θ}, j} (x) / \partial θ∥}^{2}

. By applying Lemma 3.1 of [42], we obtain

\frac{1}{n h^{q / 2}} A_{1, j} = E [{\bar{A}}_{1} (X_{i}, X_{l})] + o_{p} (1)

, where

\begin{matrix} E [{\bar{A}}_{1} (X_{i}, X_{l})] & = \frac{1}{h^{q}} \int K (\frac{x - y}{h}) [m_{j} (x) - m_{θ, j} (x)] [m_{j} (y) - m_{θ, j} (y)] f (x) f (y) d x d y \\ = \int K (u) [m_{j} (x) - m_{θ, j} (x)] [m_{j} (x - u h) - m_{θ, j} (x - u h)] f (x) f (x - u h) d x d u \\ = \int {[m_{j} (x) - m_{θ, j} (x)]}^{2} f {(x)}^{2} d x + O (h) \\ = E {{[m_{θ, j} (X_{i}) - m_{j} (X_{i})]}^{2} f (X_{i})} + O (h) . \end{matrix}

☐

Proof of Lemma 2.

Under the local alternative hypothesis,

n h^{q / 2} Z_{n, j}

can be written as follows:

\begin{matrix} n h^{q / 2} Z_{n, j} = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) {\hat{u}}_{θ, i, j} {\hat{u}}_{θ, l, j} \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i}) + δ_{n} l_{j} (X_{i}) + u_{i, j}] \\ [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l}) + δ_{n} l_{j} (X_{l}) + u_{l, j}] \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] δ_{n} l_{j} (X_{l}) \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] δ_{n} l_{j} (X_{i}) \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] u_{l, j} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] u_{i, j} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) δ_{n}^{2} l_{j} (X_{i}) l_{j} (X_{l}) \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) δ_{n} l_{j} (X_{i}) u_{l, j} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) δ_{n} l_{j} (X_{l}) u_{i, j} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j} u_{l, j} \\ \equiv B_{1, j} + B_{2, j} + B_{3, j} + B_{4, j} + B_{5, j} + B_{6, j} + B_{7, j} + B_{8, j} + B_{9, j}, \end{matrix}

where

B_{1, j} = Z_{1, j} = o_{p} (1)

,

B_{4, j} + B_{5, j} = Z_{3, j} + Z_{4, j} = o_{p} (1)

and

B_{9, j} = Z_{2, j} \overset{d}{\to} N (0, V_{j, j})

. It suffices to show the following:

\begin{matrix} B_{2, j} + B_{3, j} = o_{p} (1), \end{matrix}

(23)

\begin{matrix} B_{6, j} \overset{p}{\to} E [l_{j} {(x)}^{2} f (x)], \end{matrix}

(24)

\begin{matrix} B_{7, j} + B_{8, j} = o_{p} (1) . \end{matrix}

(25)

Proof of (23).

We show that

B_{2, j} + B_{3, j} = o_{p} (1)

.

B_{2, j} + B_{3, j}

can be represented as follows:

\begin{matrix} B_{2, j} + B_{3, j} \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) \{[m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] l_{j} (X_{l}) + [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] l_{j} (X_{i})\} δ_{n} \\ = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) \{[m_{θ, j} (X_{i}) - m_{\hat{θ}, j} (X_{i})] l_{j} (X_{l}) + [m_{θ, j} (X_{l}) - m_{\hat{θ}, j} (X_{l})] l_{j} (X_{i})\} δ_{n} \\ = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) [\frac{\partial m_{\tilde{θ}, j} (X_{i})}{\partial θ^{'}} l_{j} (X_{l}) + \frac{\partial m_{\tilde{θ}, j} (X_{l})}{\partial θ^{'}} l_{j} (X_{i})] (\hat{θ} - θ) δ_{n} \\ \equiv \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{2} (X_{i}, X_{l}) (\hat{θ} - θ) δ_{n}, \end{matrix}

where

{\bar{B}}_{2} (X_{i}, X_{l})

is a symmetric function. Similar to (18), it can be straightforwardly shown that

E [| {\bar{B}}_{2} (X_{i}, X_{l}) |^{2}] = o (n)

. The only difference from (18) is that we have

l_{j} {(\cdot)}^{2}

instead of

σ_{j}^{2} (\cdot)

as a part of the integrand, where

E [l_{j} {(\cdot)}^{2}]

is assumed to be bounded. By applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{2} (X_{i}, X_{l}) = E [{\bar{B}}_{2} (X_{i}, X_{l})] + o_{p} (1)

, where

\begin{matrix} E [{\bar{B}}_{2} (X_{i}, X_{l})] & = \frac{1}{h^{q}} \int K (\frac{x - y}{h}) [\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}} l_{j} (y) + \frac{\partial m_{\tilde{θ}, j} (y)}{\partial θ^{'}} l_{j} (x)] f (x) f (y) d x d y \\ = \int K (u) [\frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}} l_{j} (x - u h) + \frac{\partial m_{\tilde{θ}, j} (x - u h)}{\partial θ^{'}} l_{j} (x)] f (x) f (x - u h) d x d u \\ = 2 \int \frac{\partial m_{\tilde{θ}, j} (x)}{\partial θ^{'}} l_{j} (x) f {(x)}^{2} d x + O (h) \\ = O (1) . \end{matrix}

Therefore,

\begin{matrix} B_{2, j} + B_{3, j} & = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{2} (X_{i}, X_{l}) (\hat{θ} - θ) δ_{n} \\ = n h^{q / 2} O (1) O_{p} (1 / \sqrt{n}) O (1 / \sqrt{n h^{q / 2}}) \\ = O_{p} (\sqrt{h^{q / 2}}) = o_{p} (1) . \end{matrix}
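The order calculation in the last display may be easier to follow with the exponents written out. The following worked step assumes, as the factor O (1 / \sqrt{n h^{q / 2}}) in the display indicates, that δ_{n} is of order n^{- 1 / 2} h^{- q / 4}:

```latex
n h^{q/2} \cdot n^{-1/2} \cdot n^{-1/2} h^{-q/4}
  = n^{1 - 1/2 - 1/2} \, h^{q/2 - q/4}
  = h^{q/4}
  = \sqrt{h^{q/2}} \to 0 \quad \text{as } h \to 0 .
```

The powers of n cancel exactly, so the term vanishes solely because the bandwidth shrinks.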

Proof of (24).

We show that

B_{6, j}

converges to

E [l_{j} {(x)}^{2} f (x)]

as

n \to \infty

.

B_{6, j}

can be represented as follows:

\begin{matrix} B_{6, j} & = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) l_{j} (X_{i}) l_{j} (X_{l}) δ_{n}^{2} \\ \equiv \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{6} (X_{i}, X_{l}) δ_{n}^{2}, \end{matrix}

where

{\bar{B}}_{6} (X_{i}, X_{l})

is a symmetric function. Similar to (11), it can be straightforwardly shown that

E [| {\bar{B}}_{6} (X_{i}, X_{l}) |^{2}] = o (n)

. The only difference from (11) is that we have

l_{j} {(\cdot)}^{2}

instead of

{∥\partial m_{\tilde{θ}, j} (X_{i}) / \partial θ∥}^{2}

as a part of the integrand, where

E [l_{j} {(\cdot)}^{2}]

is assumed to be bounded. By applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{6} (X_{i}, X_{l}) = E [{\bar{B}}_{6} (X_{i}, X_{l})] + o_{p} (1)

, where

\begin{matrix} E [{\bar{B}}_{6} (X_{i}, X_{l})] & = \frac{1}{h^{q}} \int K (\frac{x - y}{h}) l_{j} (y) l_{j} (x) f (x) f (y) d x d y \\ = \int K (u) l_{j} (x - u h) l_{j} (x) f (x) f (x - u h) d x d u \\ = \int l_{j}^{2} (x) f {(x)}^{2} d x + O (h) . \end{matrix}

Therefore,

\begin{matrix} B_{6, j} & = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{6} (X_{i}, X_{l}) δ_{n}^{2} \\ = n h^{q / 2} \{E [l_{j}^{2} (x) f (x)] + o_{p} (1)\} δ_{n}^{2} \overset{p}{\to} E [l_{j}^{2} (x) f (x)] . \end{matrix}
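The smoothing step behind this limit, namely that the kernel-weighted double integral approaches \int l_{j}^{2} (x) f {(x)}^{2} d x as h shrinks, can be checked numerically. The sketch below is only an illustration under assumed choices that do not come from the paper: q = 1, a Gaussian kernel, l(x) = x, and f the Uniform(0, 1) density, for which the limit equals 1/3.

```python
import math

def gaussian_kernel(u: float) -> float:
    """Standard normal density, used here as an illustrative kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def smoothed_moment(h: float, grid_size: int = 400) -> float:
    """Midpoint-rule value of (1/h) * double integral of K((x - y)/h) l(x) l(y) f(x) f(y)
    over the unit square, for the illustrative choices l(x) = x and f = 1 on [0, 1]."""
    step = 1.0 / grid_size
    points = [(k + 0.5) * step for k in range(grid_size)]
    total = 0.0
    for x in points:
        for y in points:
            total += gaussian_kernel((x - y) / h) * x * y
    return total * step * step / h

limit_value = 1.0 / 3.0  # integral of x^2 over [0, 1], the h -> 0 limit
gap_wide = abs(smoothed_moment(0.10) - limit_value)
gap_narrow = abs(smoothed_moment(0.02) - limit_value)
```

In this example the gap shrinks roughly in proportion to h, in line with the O (h) remainder; here it arises from kernel mass falling outside the support near the boundary.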

Proof of (25).

We show that

B_{7, j} + B_{8, j} = o_{p} (1)

.

B_{7, j} + B_{8, j}

can be represented as follows:

\begin{matrix} B_{7, j} + B_{8, j} & = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} \frac{1}{h^{q}} K (\frac{X_{i} - X_{l}}{h}) [l_{j} (X_{i}) u_{l, j} + l_{j} (X_{l}) u_{i, j}] δ_{n} \\ \equiv \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{7} (X_{i}, X_{l}) δ_{n}, \end{matrix}

where

{\bar{B}}_{7} (X_{i}, X_{l})

is a symmetric function. Similar to (18), it can be straightforwardly shown that

E [| {\bar{B}}_{7} (X_{i}, X_{l}) |^{2}] = o (n)

. The only difference from (18) is that we have

l_{j} {(\cdot)}^{2}

instead of

{∥\partial m_{\tilde{θ}, j} (X_{i}) / \partial θ∥}^{2}

as a part of the integrand, where

E [l_{j} {(\cdot)}^{2}]

is assumed to be bounded. By applying Lemma 3.1 of [42], we obtain

\frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{7} (X_{i}, X_{l}) = E [{\bar{B}}_{7} (X_{i}, X_{l})] + o_{p} (1 / \sqrt{n})

, where

E [{\bar{B}}_{7} (X_{i}, X_{l})] = 0

. Therefore,

\begin{matrix} B_{7, j} + B_{8, j} & = \frac{n h^{q / 2}}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} {\bar{B}}_{7} (X_{i}, X_{l}) δ_{n} \\ = n h^{q / 2} o_{p} (1 / \sqrt{n}) δ_{n} = o_{p} (1) . \end{matrix}

☐

Proof of Proposition 4.

Proposition 4 can be proven along the same lines as Proposition 1. Let

u_{i, j}^{*} = Y_{i, j}^{*} - m_{j}^{*} (X_{i})

, where

m_{j}^{*} (X_{i}) \equiv E (Y_{i, j}^{*} | X_{i}) = m_{\hat{θ}, j} (X_{i})

, and therefore,

E (u_{i, j}^{*} | X_{i}) = 0

. Then, the boundedness of

σ_{j}^{* 4} (x) \equiv E [u_{i, j}^{* 4} | X_{i} = x]

corresponding to (15) can be shown straightforwardly because

Y_{i, j}^{*}

is a binary variable taking the values zero and one, and X lies on a compact set by Assumption 1.
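The bootstrap quantities defined above can be illustrated with a short sketch. It assumes that the bootstrap outcome for each observation is a single draw from the fitted response probabilities, which is consistent with E (Y_{i, j}^{*} | X_{i}) = m_{\hat{θ}, j} (X_{i}); the function name and the probability values are hypothetical, and the re-estimation step that produces {\hat{θ}}^{*} is not shown.

```python
import random

def draw_bootstrap_indicators(
    fitted_probs: list[list[float]], rng: random.Random
) -> list[list[int]]:
    """For each observation, draw one alternative from its fitted response
    probabilities and return the 0/1 indicator vector Y*_i."""
    draws = []
    for probs in fitted_probs:
        chosen = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        draws.append([1 if j == chosen else 0 for j in range(len(probs))])
    return draws

# Hypothetical fitted probabilities m_{theta_hat, j}(X_i) for n = 4 and 3 alternatives.
fitted_demo = [
    [0.2, 0.5, 0.3],
    [0.6, 0.1, 0.3],
    [0.1, 0.1, 0.8],
    [0.3, 0.3, 0.4],
]
y_star = draw_bootstrap_indicators(fitted_demo, random.Random(0))

# Bootstrap errors u*_{i,j} = Y*_{i,j} - m_{theta_hat, j}(X_i); each lies in [-1, 1],
# so every conditional moment, including the fourth, is bounded.
u_star = [
    [y - p for y, p in zip(y_row, p_row)]
    for y_row, p_row in zip(y_star, fitted_demo)
]
```

Since both the indicators and the fitted probabilities sum to one across alternatives, the bootstrap errors of each observation sum to zero.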

We first prove the following:

\begin{matrix} n h^{q / 2} {\hat{Z}}_{n, j}^{*} \overset{d}{\to} N (0, V_{j, j}^{*}), \end{matrix}

(26)

\begin{matrix} V_{j, j}^{*} - {\hat{V}}_{j, j}^{*} = o_{p} (1), \end{matrix}

(27)

\begin{matrix} V_{j, m}^{*} - {\hat{V}}_{j, m}^{*} = o_{p} (1), \end{matrix}

(28)

where

V_{j, j}^{*}

and

V_{j, m}^{*}

are the asymptotic variance of

n h^{q / 2} {\hat{Z}}_{n, j}^{*}

and covariance between

n h^{q / 2} {\hat{Z}}_{n, j}^{*}

and

n h^{q / 2} {\hat{Z}}_{n, m}^{*}

, respectively. We show that they can be written as follows:

\begin{matrix} V_{j, j}^{*} \equiv 2 K^{(2)} (0) E \{{[σ_{j}^{* 2} (x)]}^{2} f (x)\}, \\ V_{j, m}^{*} \equiv 2 K^{(2)} (0) E \{{[σ_{j, m}^{*} (x)]}^{2} f (x)\}, \end{matrix}

where

σ_{j}^{* 2} (x)

is the conditional variance of

u_{i, j}^{*}

and

σ_{j, m}^{*} (x)

is the covariance between

u_{i, j}^{*}

and

u_{i, m}^{*}

.

Proof of (26).

Let

u_{\hat{θ}, i, j}^{*} = Y_{i, j}^{*} - m_{{\hat{θ}}^{*}, j} (X_{i})

and

Y_{i, j}^{*} = m_{\hat{θ}, j} (X_{i}) + u_{i, j}^{*}

, where

E (u_{i, j}^{*} | X_{i}) = 0

by definition. Then,

\begin{matrix} n h^{q / 2} {\hat{Z}}_{n, j}^{*} & = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{\hat{θ}, i, j}^{*} u_{\hat{θ}, l, j}^{*} \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i}) + u_{i, j}^{*}] [m_{\hat{θ}, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l}) + u_{l, j}^{*}] \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i})] [m_{\hat{θ}, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j}^{*} u_{l, j}^{*} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i})] u_{l, j}^{*} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] u_{i, j}^{*} \\ \equiv Z_{1, j}^{*} + Z_{2, j}^{*} + Z_{3, j}^{*} + Z_{4, j}^{*} . \end{matrix}

We will prove that

\begin{matrix} Z_{1, j}^{*} = o_{p} (1), \end{matrix}

(29)

\begin{matrix} Z_{2, j}^{*} \overset{d}{\to} N (0, V_{j, j}^{*}), \end{matrix}

(30)

\begin{matrix} Z_{3, j}^{*} + Z_{4, j}^{*} = o_{p} (1) . \end{matrix}

(31)

Proof of (29).

Z_{1, j}^{*}

can be represented as follows:

\begin{matrix} Z_{1, j}^{*} & = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{θ, j} (X_{i}) + m_{θ, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i})] \\ [m_{\hat{θ}, j} (X_{l}) - m_{θ, j} (X_{l}) + m_{θ, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{θ, j} (X_{i})] [m_{\hat{θ}, j} (X_{l}) - m_{θ, j} (X_{l})] \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] \\ + \frac{2}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{θ, j} (X_{i})] [m_{θ, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] \\ \equiv Z_{1, j}^{*^{'}} + Z_{1, j}^{*^{''}} + Z_{1, j}^{*^{'''}}, \end{matrix}

where

Z_{1, j}^{*^{'}} = Z_{1, j} = o_{p} (1)

. By Assumption 4, there is an interior point

{\tilde{θ}}^{*}

between θ and

{\hat{θ}}^{*}

, such that

m_{{\hat{θ}}^{*}, j} (X_{i}) - m_{θ, j} (X_{i}) = \frac{\partial}{\partial θ^{'}} m_{{\tilde{θ}}^{*}, j} (X_{i}) ({\hat{θ}}^{*} - θ) .

(32)

Therefore,

Z_{1, j}^{*^{''}} = o_{p} (1)

and

Z_{1, j}^{*^{'''}} = o_{p} (1)

can also be shown in the same way as the proof for

Z_{1, j} = o_{p} (1)

by using the above mean value theorem instead of (10), because

θ - {\hat{θ}}^{*} = O_{p} (1 / \sqrt{n})

for all j under appropriate parametric models and Assumption 5.

Proof of (30).

It is clear that

Z_{2, j}^{*}

can be treated as a second-order degenerate U-statistic as follows:

\begin{matrix} \frac{h^{q / 2}}{n} Z_{2, j}^{*} & = \frac{1}{n (n - 1)} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j}^{*} u_{l, j}^{*} . \end{matrix}

Define

G_{n}^{*} (Z_{1}^{*}, Z_{2}^{*}) = E_{Z_{i}} [\{K [(X_{1} - X_{i}) / h] u_{1, j}^{*} u_{i, j}^{*}\} \{K [(X_{2} - X_{i}) / h] u_{2, j}^{*} u_{i, j}^{*}\}]

, where

Z_{i}^{*} = \{X_{i}, u_{i}^{*}\}

. According to the central limit theorem for degenerate U-statistics of [43],

\frac{Z_{2, j}^{*}}{h^{- q / 2} \sqrt{2 E {{[u_{1, j}^{*} u_{2, j}^{*} K ((X_{1} - X_{2}) / h)]}^{2}}}} \overset{d}{\to} N (0, 1),

if

\begin{matrix} \frac{E [G_{n}^{* 2} (Z_{1}^{*}, Z_{2}^{*})] + n^{- 1} E \{{[u_{1, j}^{*} u_{2, j}^{*} K ((X_{1} - X_{2}) / h)]}^{4}\}}{{(E \{{[u_{1, j}^{*} u_{2, j}^{*} K ((X_{1} - X_{2}) / h)]}^{2}\})}^{2}} \to 0 \text{ as } n \to \infty . \end{matrix}

(33)

Thus, it suffices to show that (33) and

2 h^{- q} E {{[u_{1, j}^{*} u_{2, j}^{*} K ((X_{1} - X_{2}) / h)]}^{2}} \to V_{j, j}^{*},

(34)

hold.

Proof of (33).

First, note that

\begin{matrix} E [G_{n}^{* 2} (Z_{1}, Z_{2})] & = E [{\{E_{Z_{i}} [u_{1, j}^{*} u_{2, j}^{*} u_{i, j}^{* 2} K (\frac{X_{1} - X_{i}}{h}) K (\frac{X_{2} - X_{i}}{h})]\}}^{2}] \\ = E \{σ_{j}^{* 2} (X_{1}) σ_{j}^{* 2} (X_{2}) {[\int σ_{j}^{* 2} (z) K (\frac{X_{1} - z}{h}) K (\frac{X_{2} - z}{h}) f (z) d z]}^{2}\} \\ = h^{3 q} K^{(4)} (0) \int {[σ_{j}^{* 2} (x)]}^{4} f {(x)}^{4} d x + O (h^{3 q + 1}) + o (h^{3 q + 1}) \\ = O (h^{3 q}) . \end{matrix}

(35)

In the same way as above, we obtain

\begin{matrix} n^{- 1} E \{{[u_{1, j}^{*} u_{2, j}^{*} K (\frac{X_{1} - X_{2}}{h})]}^{4}\} & = n^{- 1} \int σ_{j}^{* 4} (x) σ_{j}^{* 4} (y) {[K (\frac{x - y}{h})]}^{4} f (x) f (y) d x d y \\ = n^{- 1} h^{q} \int {[σ_{j}^{* 4} (x)]}^{2} f^{2} (x) d x \int {[K (u)]}^{4} d u + O (\frac{h^{2 q}}{n}) \\ = O (\frac{h^{q}}{n}) . \end{matrix}

(36)

Following some calculation, we can obtain

\begin{matrix} {(E \{{[u_{1, j}^{*} u_{2, j}^{*} K (\frac{X_{1} - X_{2}}{h})]}^{2}\})}^{2} & = {(E \{σ_{j}^{* 2} (X_{1}) σ_{j}^{* 2} (X_{2}) {[K (\frac{X_{1} - X_{2}}{h})]}^{2}\})}^{2} \\ = h^{2 q} {\{K^{(2)} (0) \int {[σ_{j}^{* 2} (x)]}^{2} f^{2} (x) d x + O (h)\}}^{2} \\ = O (h^{2 q}) . \end{matrix}

(37)

Thus, (33) holds by (35)–(37) because

\frac{O (h^{3 q}) + O (\frac{h^{q}}{n})}{O (h^{2 q})} = \frac{O (h^{q}) + O (\frac{1}{n h^{q}})}{O (1)} \to 0,

as

h \to 0

and

n h^{q} \to \infty

.
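To see how the two bandwidth conditions interact, the order of the ratio in (33), h^{q} + 1 / (n h^{q}), can be evaluated along a concrete bandwidth sequence. The rule h = n^{-1/5} used below is a hypothetical choice made only for illustration; the proof requires just h \to 0 and n h^{q} \to \infty.

```python
def condition_ratio(n: int, q: int = 1, rate: float = 0.2) -> float:
    """Order of the ratio in (33), h^q + 1/(n h^q), under the
    hypothetical bandwidth rule h = n^(-rate)."""
    h = n ** (-rate)
    return h ** q + 1.0 / (n * h ** q)

# The ratio should fall towards zero as the sample size grows.
ratios = [condition_ratio(n) for n in (100, 1_000, 10_000, 100_000)]
```

The first term reflects the shrinking bandwidth and the second the growing effective sample size n h^{q}; a bandwidth that shrinks too fast would make the second term diverge.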

Proof of (34).

From Equation (37), it is clear that

\begin{matrix} \frac{2}{h^{q}} E \{{[u_{1, j}^{*} u_{2, j}^{*} K (\frac{X_{1} - X_{2}}{h})]}^{2}\} & = 2 K^{(2)} (0) \int {[σ_{j}^{* 2} (x)]}^{2} f^{2} (x) d x + O (h) \\ = 2 K^{(2)} (0) E \{{[σ_{j}^{* 2} (x)]}^{2} f (x)\} + O (h) \\ \to V_{j, j}^{*} . \end{matrix}

(38)

Thus, we have

Z_{2, j}^{*} \overset{d}{\to} N (0, V_{j, j}^{*})

.
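The constant K^{(2)} (0) in the variance can be computed for specific kernels. The sketch below assumes that K^{(2)} denotes the convolution of K with itself, so that K^{(2)} (0) = \int K {(u)}^{2} d u for a symmetric kernel; this reading matches how the constant arises in (37) but is an interpretation rather than a definition quoted from the paper. The two kernels and the function names are illustrative choices.

```python
import math

def integrate_squared_kernel(kernel, lower: float, upper: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of K(u)^2 over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(kernel(lower + (k + 0.5) * width) ** 2 for k in range(steps)) * width

def gaussian_kernel(u: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def epanechnikov_kernel(u: float) -> float:
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

# Closed forms for comparison: 1 / (2 * sqrt(pi)) and 3 / 5, respectively.
gaussian_value = integrate_squared_kernel(gaussian_kernel, -10.0, 10.0)
epanechnikov_value = integrate_squared_kernel(epanechnikov_kernel, -1.0, 1.0)
```

For a product kernel in q dimensions, the corresponding constant would be the q-th power of the univariate value.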

Proof of (31).

Z_{3, j}^{*} + Z_{4, j}^{*}

can be represented as follows:

\begin{matrix} Z_{3, j}^{*} + Z_{4, j}^{*} & = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i})] u_{l, j}^{*} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] u_{i, j}^{*} \\ = \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{i}) - m_{θ, j} (X_{i})] u_{l, j}^{*} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{i}) - m_{{\hat{θ}}^{*}, j} (X_{i})] u_{l, j}^{*} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{\hat{θ}, j} (X_{l}) - m_{θ, j} (X_{l})] u_{i, j}^{*} \\ + \frac{1}{(n - 1) h^{q / 2}} \sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) [m_{θ, j} (X_{l}) - m_{{\hat{θ}}^{*}, j} (X_{l})] u_{i, j}^{*} \\ \equiv Z_{3, j}^{*^{'}} + Z_{3, j}^{*^{''}} + Z_{4, j}^{*^{'}} + Z_{4, j}^{*^{''}} . \end{matrix}

The only difference between

Z_{3, j}^{*^{'}} + Z_{4, j}^{*^{'}}

and

Z_{3, j} + Z_{4, j}

in (9) is that the former contains

u_{l, j}^{*}

and

u_{i, j}^{*}

instead of

u_{l, j}

and

u_{i, j}

, respectively. However, since

E (u_{i, j}^{*} | X_{i}) = 0

and

E (u_{l, j}^{*} | X_{l}) = 0

from the definition,

Z_{3, j}^{*^{'}} + Z_{4, j}^{*^{'}} = o_{p} (1)

can be proven in the same way as (9). Moreover,

Z_{3, j}^{*^{''}} + Z_{4, j}^{*^{''}} = o_{p} (1)

can also be proven in the same way as (9) by using (32) instead of (10).

Proof of (27) and (28).

Since the asymptotic variance is shown above, we derive the asymptotic covariance between

n h^{q / 2} {\hat{Z}}_{n, j}^{*}

and

n h^{q / 2} {\hat{Z}}_{n, m}^{*}

, which we denote

V_{j, m}^{*}

. From the results of (29)–(31), it is clear that

E (Z_{2, j}^{*} Z_{2, m}^{*}) \to V_{j, m}^{*}

as

n \to \infty

. Because

E (u_{i, j}^{*} u_{l, j}^{*}) = 0 \text{ if } i \neq l

and

E (u_{i, j}^{*} u_{i, m}^{*} | X_{i}) = σ_{j, m}^{*} (X_{i}) \text{ if } j \neq m

, it follows that

\begin{matrix} E (Z_{2, j}^{*} Z_{2, m}^{*}) & = \frac{1}{{(n - 1)}^{2} h^{q}} E [\sum_{i = 1}^{n} \sum_{l \neq i}^{n} K (\frac{X_{i} - X_{l}}{h}) u_{i, j}^{*} u_{l, j}^{*} \sum_{s = 1}^{n} \sum_{t \neq s}^{n} K (\frac{X_{s} - X_{t}}{h}) u_{s, m}^{*} u_{t, m}^{*}] \\ = \frac{2}{{(n - 1)}^{2} h^{q}} E \{\sum_{i = 1}^{n} \sum_{l \neq i}^{n} u_{i, j}^{*} u_{l, j}^{*} u_{i, m}^{*} u_{l, m}^{*} {[K (\frac{X_{i} - X_{l}}{h})]}^{2}\} \\ = \frac{2 n}{(n - 1) h^{q}} \int σ_{j, m}^{*} (x) σ_{j, m}^{*} (y) {[K (\frac{x - y}{h})]}^{2} f (x) f (y) d x d y \\ = 2 K^{(2)} (0) \int {[σ_{j, m}^{*} (x)]}^{2} f^{2} (x) d x + O (h) \\ \to V_{j, m}^{*} . \end{matrix}

(39)

Thus, proofs of (27) and (28) are straightforward from (38) and (39).

By the Cramér–Wold device and a similar calculation to the proof of (30), it can be straightforwardly shown that

Z_{2}^{*}

converges to a multivariate normal distribution with the

(J - 1) \times 1

zero mean vector and variance-covariance matrix

V^{*}

, where

V^{*}

is a

(J - 1) \times (J - 1)

variance-covariance matrix whose

(j, m)

elements are

V_{j, m}^{*}

. Therefore,

C_{n}^{*}

, which is the quadratic form of

n h^{q / 2} {\hat{Z}}_{n, j}^{*}

, converges to a chi-squared distribution with

J - 1

degrees of freedom. ☐
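The final step, from a normalized vector to a chi-squared statistic, can be sketched for three alternatives (J - 1 = 2). The code assumes the statistic has the form z' V^{-1} z, where z stacks the components n h^{q / 2} {\hat{Z}}_{n, j}^{*} and V is their estimated variance-covariance matrix; this form and all numerical values are illustrative assumptions, since the exact definition of C_{n}^{*} is given elsewhere in the paper.

```python
import math

Matrix2 = tuple[tuple[float, float], tuple[float, float]]

def quadratic_form_2d(z: tuple[float, float], v: Matrix2) -> float:
    """Return z' V^{-1} z for a symmetric positive definite 2 x 2 matrix V."""
    (a, b), (c, d) = v
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("V must be positive definite")
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    z1, z2 = z
    return (d * z1 * z1 - (b + c) * z1 * z2 + a * z2 * z2) / det

def chi2_pvalue_two_df(statistic: float) -> float:
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom,
    which has the closed form exp(-statistic / 2)."""
    return math.exp(-0.5 * statistic)

# Hypothetical normalized components and covariance estimate for J = 3.
z_demo = (1.5, -0.5)
v_demo = ((1.0, 0.2), (0.2, 0.5))
c_demo = quadratic_form_2d(z_demo, v_demo)
p_demo = chi2_pvalue_two_df(c_demo)
```

With more than three alternatives the same idea applies with a general linear solve in place of the explicit 2 x 2 inverse, and the tail probability no longer has this simple closed form.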

References

1. D. McFadden. “Conditional Logit Analysis of Qualitative Choice Behavior.” In Frontiers in Econometrics. Edited by P. Zarembka. New York, NY, USA: Academic Press, 1974, pp. 105–142.
2. J.A. Hausman, and D.A. Wise. “A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences.” Econometrica 46 (1978): 403–426.
3. S. Berry, J. Levinsohn, and A. Pakes. “Automobile Prices in Market Equilibrium.” Econometrica 63 (1995): 841–890.
4. P.K. Goldberg. “Product Differentiation and Oligopoly in International Markets: The Case of the US Automobile Industry.” Econometrica 63 (1995): 891–951.
5. J.J. Heckman. “The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models.” Ann. Econ. Soc. Meas. 5 (1976): 475–492.
6. J.A. Dubin, and D.L. McFadden. “An Econometric Analysis of Residential Electric Appliance Holdings and Consumption.” Econometrica 52 (1984): 345–362.
7. W. Härdle, and E. Mammen. “Comparing Nonparametric versus Parametric Regression Fits.” Ann. Stat. 21 (1993): 1926–1947.
8. H.J. Bierens. “Consistent Model Specification Tests.” J. Econom. 20 (1982): 105–134.
9. H.J. Bierens. “Model Specification Testing of Time Series Regressions.” J. Econom. 26 (1984): 323–353.
10. H.J. Bierens. “A Consistent Conditional Moment Test of Functional Form.” Econometrica 58 (1990): 1443–1458.
11. M.A. Delgado. “Testing the Equality of Nonparametric Regression Curves.” Stat. Probab. Lett. 17 (1993): 199–204.
12. R.M. De Jong. “The Bierens Test under Data Dependence.” J. Econom. 72 (1996): 1–32.
13. D.W. Andrews. “A Conditional Kolmogorov Test.” Econometrica 65 (1997): 1097–1128.
14. H.J. Bierens, and W. Ploberger. “Asymptotic Theory of Integrated Conditional Moment Tests.” Econometrica 65 (1997): 1129–1151.
15. W. Stute. “Nonparametric Model Checks for Regression.” Ann. Stat. 25 (1997): 613–641.
16. M.B. Stinchcombe, and H. White. “Consistent Specification Testing with Nuisance Parameters Present Only under the Alternative.” Econom. Theory 14 (1998): 295–325.
17. X. Chen, and Y. Fan. “Consistent Hypothesis Testing in Semiparametric and Nonparametric Models for Econometric Time Series.” J. Econom. 91 (1999): 373–401.
18. Y.J. Whang. “Consistent Bootstrap Tests of Parametric Regression Functions.” J. Econom. 98 (2000): 27–46.
19. R.L. Eubank, and C.H. Spiegelman. “Testing the Goodness of fit of a Linear Model via Nonparametric Regression Techniques.” J. Am. Stat. Assoc. 85 (1990): 387–392.
20. S. Le Cessie, and J.C. van Houwelingen. “A Goodness-of-fit Test for Binary Regression Models, Based on Smoothing Methods.” Biometrics 47 (1991): 1267–1282.
21. J.M. Wooldridge. “A Test for Functional Form Against Nonparametric Alternatives.” Econom. Theory 8 (1992): 452–475.
22. A.J. Yatchew. “Nonparametric Regression Tests Based on an Infinite Dimensional Least Squares Procedure.” Econom. Theory 8 (1992): 435–451.
23. P.L. Gozalo. “A Consistent Model Specification Test for Nonparametric Estimation of Regression Function Models.” Econom. Theory 9 (1993): 451–477.
24. Y. Aït-Sahalia, P.J. Bickel, and T.M. Stoker. “Goodness-of-fit Tests for Kernel Regression with an Application to Option Implied Volatilities.” J. Econom. 105 (2001): 363–412.
25. M.A. Delgado, and T. Stengos. “Semiparametric Specification Testing of non-Nested Econometric Models.” Rev. Econ. Stud. 61 (1994): 291–303.
26. J.L. Horowitz, and W. Härdle. “Testing a Parametric Model Against a Semiparametric Alternative.” Econom. Theory 10 (1994): 821–848.
27. Y. Hong, and H. White. “Consistent Specification Testing via Nonparametric Series Regression.” Econometrica 63 (1995): 1133–1159.
28. Y. Fan, and Q. Li. “Consistent Model Specification Tests: Omitted Variables and Semiparametric Functional Forms.” Econometrica 64 (1996): 865–890.
29. P. Lavergne, and Q.H. Vuong. “Nonparametric Selection of Regressors: The Nonnested Case.” Econometrica 64 (1996): 207–219.
30. J.X. Zheng. “A Consistent Test of Functional Form via Nonparametric Estimation Techniques.” J. Econom. 75 (1996): 263–289.
31. Q. Li, and S. Wang. “A Simple Consistent Bootstrap Test for a Parametric Regression Function.” J. Econom. 87 (1998): 145–165.
32. P. Lavergne, and Q. Vuong. “Nonparametric Significance Testing.” Econom. Theory 16 (2000): 576–601.
33. Y. Fan, and Q. Li. “Consistent Model Specification Tests.” Econom. Theory 16 (2000): 1016–1041.
34. J. Mora, and A.I. Moro-Egido. “On Specification Testing of Ordered Discrete Choice Models.” J. Econom. 143 (2008): 191–205.
35. J. Fan, and L.S. Huang. “Goodness-of-fit Tests for Parametric Regression Models.” J. Am. Stat. Assoc. 96 (2001): 640–652.
36. J.L. Horowitz, and V.G. Spokoiny. “An Adaptive, Rate-Optimal Test of a Parametric Mean-Regression Model Against a Nonparametric Alternative.” Econometrica 69 (2001): 599–631.
37. V. Spokoiny. “Data-Driven Testing the fit of Linear Models.” Math. Methods Stat. 10 (2001): 465–497.
38. Y. Baraud, S. Huet, and B. Laurent. “Adaptive Tests of Linear Hypotheses by Model Selection.” Ann. Stat. 31 (2003): 225–251.
39. C.M. Zhang. “Adaptive Tests of Regression Functions via Multiscale Generalized Likelihood Ratios.” Can. J. Stat. 31 (2003): 151–171.
40. E. Guerre, and P. Lavergne. “Data-Driven Rate-Optimal Specification Testing in Regression Models.” Ann. Stat. 33 (2005): 840–870.
41. W. Härdle, M. Müller, S. Sperlich, and A. Werwatz. Nonparametric and Semiparametric Models. Berlin Heidelberg, Germany: Springer, 2004, p. 92.
42. J.L. Powell, J.H. Stock, and T.M. Stoker. “Semiparametric Estimation of Index Coefficients.” Econometrica 57 (1989): 1403–1430.
43. P. Hall. “Central Limit Theorem for Integrated Square Error of Multivariate Nonparametric Density Estimators.” J. Multivar. Anal. 14 (1984): 1–16.

^1. Rate-optimal tests are proposed by [36,37,38,39,40,43], among others.
^2. To be accurate, the MNL model has alternative-variant coefficients, with response probabilities given by $P (Y_{j} = 1 | X) = \exp (X^{'} β_{j}) / [1 + \sum_{k = 1}^{J} \exp (X^{'} β_{k})]$. However, a model with alternative-variant coefficients can be transformed, without loss of generality, into a model with alternative-invariant coefficients, which is sometimes called a conditional logit model. In this paper, we describe only the model with alternative-invariant coefficients.

© 2015 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/4.0/).

