Next Article in Journal
Model-Free Feature Screening Based on Data Aggregation for Ultra-High-Dimensional Longitudinal Data
Previous Article in Journal
Goodness-of-Fit Tests via Entropy-Based Density Estimation Techniques
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Expansions for the Conditional Density and Distribution of a Standard Estimate

by
Christopher S. Withers
Formerly Industrial Research Ltd., Lower Hutt 6007, New Zealand
Stats 2025, 8(4), 98; https://doi.org/10.3390/stats8040098
Submission received: 22 August 2025 / Revised: 7 October 2025 / Accepted: 11 October 2025 / Published: 14 October 2025

Abstract

Conditioning is a very useful way of using correlated information to reduce the variability of an estimate. Conditioning an estimate on a correlated estimate, reduces its covariance, and so provides more precise inference than using an unconditioned estimate. Here we give expansions in powers of n 1 / 2 for the conditional density and distribution of any multivariate standard estimate based on a sample of size n. Standard estimates include most estimates of interest, including smooth functions of sample means and other empirical estimates. We also show that a conditional estimate is not a standard estimate, so that Edgeworth-Cornish-Fisher expansions cannot be applied directly.

1. Introduction and Summary

Given correlated estimates w ^ = ( w ^ 1 , w ^ 2 ) of unknown parameters w = ( w 1 , w 2 ) , inference on w 1 can be made more precise by conditioning on w ^ 2 . To see this, suppose that X is a bivariate normal with correlation ρ 0 . Then
v a r ( X 1 | X 2 ) = ( 1 ρ 2 ) v a r ( X 1 ) < v a r ( X 1 ) .
So if w ^ is an estimate of w based on a sample of size n, and it satisfies the Central Limit Theorem, (CLT),
X n = n 1 / 2 ( w ^ w ) L X N 2 ( 0 , V ) , as n ,
then
v a r ( w ^ 1 | w ^ 2 ) ( 1 ρ 2 ) v a r ( w ^ 1 ) < v a r ( w ^ 1 )
if ρ = correlation ( w ^ 1 , w ^ 2 ) is non-zero. A similar result holds when for i = 1 , 2 , w i has dimension q i , which we now assume. To apply this result to obtain inference on w 1 given w ^ 1 | w ^ 2 , we need to approximate its distribution, ideally beyond its first approximation given by the CLT. This paper uses expansions in powers of n 1 / 2 for the density and distribution of X n = n 1 / 2 ( w ^ w ) , for a wide class of estimates, called standard estimates.
Suppose that w ^ is a standard estimate of an unknown parameter w R q of a statistical model, based on a sample of size n. That is, w ^ is a consistent estimate, and for r 1 , its rth order cumulants have magnitude n ( r 1 ) / 2 and can be expanded in powers of n 1 . The coefficients in these expansions are called the cumulant coefficients. This is a very large class of estimates, with potential application to a range of practical problems. For example, w ^ may be a smooth function of one or more sample means, or a smooth functional of one or more empirical distributions. A smooth function of a standard estimate is also a standard estimate: see ref. [1]. Ref. [2] gave the multivariate Edgeworth expansions for the distribution and density of
X n = n 1 / 2 ( w ^ w ) ,
in powers of n 1 / 2 about the multivariate normal in terms of the Edgeworth coefficients of (3). (For typos, see p25 of ref. [1]. Also replace θ ^ by θ ^ / θ on 4th to last line p1121 and in (23). To line 3 p1138, add P 12 = B 23 / 2 ). Ref. [3] gave the Edgeworth coefficients explicitly for the Edgeworth expansions to O ( n 2 ) .
Choosing an estimate can be a tradeoff between simplicity and efficiency. Conventional point estimation emphasises efficiency as measured by mean square error. The maximum liklihood estimate is attractive as it is asymptotically efficient in this sense. However their cumulant coefficients generally take much more work to obtain than those of a moment estimate. See refs. [4,5]. But whether one chooses a simple estimate or a more complicated one, it will generally be a standard estimate.
Turning to conditioning, as noted, this is a very useful way of using correlated information to reduce the variability of estimates, and to make inference on unknown parameters more precise. This is the motivation for this paper. To emphasise when w i , w ^ i are vectors, we bold them. In Section 4 we take q 2 , and write w , w ^ and X n as w 1 w 2 , w ^ 1 w ^ 2 and X n 1 X n 2 of dimensions q 1 q 2 . Just as the distribution of X n allows inference on w, the conditional distribution of X n 1 given X n 2 , allows inference on w 1 for a given w 2 . The covariance of w ^ 1 | w ^ 2 can be substantially less than that of w ^ 1 . Only when w ^ 1 and w ^ 2 are uncorrelated, is there no advantage in conditioning. Given a statistical model, its unknown parameters w, will consist of one or more parameters of primary interest, w 1 , and the others. (For example, for an autoregressive time series with mean μ , autocorrelation ρ , and variance of residuals σ 2 , the parameter of primary interest is μ .) When conditioning one can choose w 2 to be all of the other parameters, or more simply, the single parameter w 2 which maximises the estimated correlation of w ^ 2 with w ^ 1 . This is another trade-off between efficiency and simplicity, as increasing q 2 will reduce the conditional variance.
We shall see that for V the asymptotic covariance of X n ,
as n , X n = n 1 / 2 ( w ^ w ) L X N q ( 0 , V ) ,
the multivariate normal on R q , with density and distribution
ϕ V ( x ) = ( 2 π ) q / 2 ( d e t V ) 1 / 2 exp ( x V 1 x / 2 ) , Φ V ( x ) = x ϕ V ( x ) d x . So , X n 1 | ( X n 2 = x 2 ) L X 1 | ( X 2 = x 2 ) N q 1 ( μ 1 2 , V 1 2 ) ,
where V 1 2 is a function of V, and μ 1 2 is a function of V and is also linear in x 2 . If q 1 = 1 , this leads in Section 4 to 1- or 2-sided confidence intervals for w 1 | ( X n 2 = x 2 ) , of error O ( n 1 / 2 ) or O ( n 1 ) . So unlike traditional confidence regions (including confidence intervals), the conditional versions depend on the value of the unknown w 2 . This gives a new level of sophistication to them over traditional confidence regions. While this paper does not deal with Studentized estimates, that next step can be done using ref. [6] or ref. [1]. The x 2 of most interest are small.
Theorems 1 and 2 give our main results: explicit expansions to O ( n 2 ) for the conditional density and distribution of X n 1 given X n 2 , that is, for the conditional density and distribution of w ^ 1 w 1 given w ^ 2 w 2 . In other words, it gives the likely position of w 1 for any given w 2 . The main difficulty is integrating the density. Theorem 2 does this in terms of I ¯ 1 k of (42), the integral of the multivariate Hermite polynomial, with respect to the conditional normal density. Note 1 gives I ¯ 1 k in terms of derivatives of the multivariate normal distribution. Theorem 3 gives I ¯ 1 k in terms of the partial moments of the conditional distribution. If q 1 = 1 , then Theorem 4 gives I ¯ 1 k in terms of the unit normal distribution and density.
Section 4 specialises to the case q 1 = q 2 = 1 . Examples are the condtional distribution and density of a bivariate sample mean, of entangled gamma random variables, and of a sample mean given the sample variance. Section 5 and Section 6 give conclusions, discussion, and suggestions for future research. Appendix A gives expansions for the conditional moments of X n 1 | ( X n 2 = x 2 ) . It shows that w ^ 1 given X n 2 , is neither a standard estimate, nor a Type B estimate, so that Edgeworth-Cornish-Fisher expansions do not apply to it.
Ref. [7] (pp. 34–36) argue that an ideal choice of conditioning variable w ^ 2 would be one whose distribution does not depend on w 1 . But this is generally not possible except for some exponential families. An example when it is true, is when w 1 and w 2 are location and scale parameters: on p54 they essentially suggest choosing w 2 = n v a r w ^ 1 . This is our motivation for Example 4. For some examples, see ref. [8]. Their (7.5) gave a form for the 3rd order expansion for the conditional density of a sample mean to O ( n 3 / 2 ) , but they did not attempt to integrate it.
Conditional expansions for the sample mean were given in Chapter 12 of [9], and used in Sections 2.3 and 2.5 of [10] to show bootstrap consistency. For some other results on conditional distributions, see refs. [11,12,13,14,15,16].

2. Multivariate Edgeworth Expansions

Suppose that w ^ is a standard estimate of w R q with respect to n. (n is typically the sample size.) That is, E w ^ w as n , where we use E for expected value, and for r 1 and 1 i 1 , , i r q , the rth order cumulants of w ^ = ( w ^ 1 , , w ^ q ) can be expanded as
k ¯ 1 r = k i 1 i r = κ ( w ^ i 1 , , w ^ i r ) d = r 1 n d k ¯ d 1 r , where k ¯ d 1 r = k d i 1 i r ,
where ≈ indicates an asymptotic expansion, and the cumulant coefficients  k ¯ d 1 r may depend on n but are bounded as n . So the bar replaces each i k by k. For example k ¯ 0 1 = w i 1 and k ¯ 1 12 = k 1 i 1 i 2 . We reserve i k for this bar notation to avoid double subscripts. (1) holds with V = ( k ¯ 1 12 ) , q × q . V may depend on n, but we assume that d e t V is bounded away from 0.
Let P ¯ r 1 k = P r i 1 i k be the rth Edgeworth coefficient of w ^ ,
for q 1 r 3 . These are Bell polynomials in the cumulant coefficients of (2), as defined and given in [3]. Their importance lies in their central role in the Edgeworth expansions of X n of (1). (When q = 1 and w ^ is a sample mean, the Edgeworth coefficients were given for all r in [17]. For typos, see pp. 24–25 of [1].)
Set P ( A ) = Probability A is true. By [2], or [1], for w ^ non-lattice, the distribution and density of X n can be expanded as
P ( X n x ) r = 0 n r / 2 P r ( x ) , p X n ( x ) r = 0 n r / 2 p r ( x ) , x R q ,
where P 0 ( x ) = Φ V ( x ) , p 0 ( x ) = ϕ V ( x ) , and for r 1 , P r ( x ) = k = 1 3 r [ P r k ( x ) : k r even ] ,
p r ( x ) / ϕ V ( x ) = k = 1 3 r [ p ˜ r k : k r even ] = p ˜ r ( x ) say ,
P r k ( x ) = P ¯ r 1 k H ¯ 1 k , p ˜ r k = P ¯ r 1 k H ¯ 1 k ,
H ¯ 1 k = H ¯ 1 k ( x , V ) = O ¯ 1 k Φ V ( x ) = x H ¯ 1 k ϕ V ( x ) d x ,
O ¯ 1 k = ( ¯ 1 ) ( ¯ k ) , ¯ k = i k , i = / x i ,
H ¯ 1 k = H i 1 i k = ϕ V ( x ) 1 O ¯ 1 k ϕ V ( x ) = E ( y ¯ 1 + I Y ¯ 1 ) ( y ¯ k + I Y ¯ k )
and I = 1 , y = V 1 x , Y = V 1 X N q ( 0 , V 1 ) .
H ¯ 1 k ( x , V ) = H ¯ 1 k is the multivariate Hermite polynomial. We use the tensor summation convention, repetition of i 1 , , i k in (6) implies their implicit summation over their range, 1 , , q . Ref. [3] gave H ¯ 1 k explicitly for k 6 and for k 9 when q = 2 .
Set μ ¯ 1 2 k = E Y ¯ 1 Y ¯ 2 k = 1.3 ( 2 k 1 ) V ¯ 12 V ¯ 2 k 1 , 2 k ,
where N f ¯ 1 2 k sums f ¯ 1 2 k over all N permutations of i 1 , , i 2 k giving distinct values. For example,
H ¯ 1 = y ¯ 1 , H ¯ 12 = y ¯ 1 y ¯ 2 V ¯ 12 , H ¯ 1 3 = y ¯ 1 y ¯ 2 y ¯ 3 3 y ¯ 1 V ¯ 23 = y ¯ 1 y ¯ 2 y ¯ 3 y ¯ 1 V ¯ 23 y ¯ 2 V ¯ 13 y ¯ 3 V ¯ 12 ,
H ¯ 1 = J ¯ 1 , H ¯ 12 = J ¯ 12 V ¯ 12 Φ V ( x ) , H ¯ 1 3 = J ¯ 123 3 J ¯ 1 V ¯ 23 , where
J ¯ 1 k = J ¯ 1 k ( x , V ) = E Y ¯ 1 Y ¯ k I ( X x ) = V ¯ 1 , k + 1 V ¯ k , 2 k M ¯ V k + 1 2 k ,
and M ¯ V a b = M ¯ a b ( x , V ) = x x ¯ a x ¯ b ϕ V ( x ) d x , for x ¯ a = x i a .
So , P 1 ( x ) = k = 1 , 3 P 1 k ( x ) , P 11 ( x ) = k ¯ 1 1 H ¯ 1 , P 13 ( x ) = k ¯ 2 1 3 H ¯ 1 3 / 6 ,
p ˜ 1 ( x ) = p 1 ( x ) / ϕ V ( x ) = k = 1 , 3 p ˜ 1 k , p ˜ 11 = k ¯ 1 1 H ¯ 1 , p ˜ 13 = k ¯ 2 1 3 H ¯ 1 3 / 6 .
(So the repeated i k + 1 , , i 2 k in (11) implies their repeated summatioin over 1 , , q .) P 2 ( x ) , P 3 ( x ) are given explicitly in [3]. So (4) with the P ¯ r 1 k in [3] give the Edgeworth expansions for the distribution and density of X n of (1) to O ( n 2 ) . p ˜ r k and P r k each have q k terms, but many are duplicates as P ¯ r 1 k is symmetric in i 1 , , i k . This is exploited by the notation of Section 4 of [3] to greatly reduce the number of terms in (6).
By (5), the density of X n relative to its asymptotic value is
p X n ( x ) / ϕ V ( x ) 1 + r = 1 n r / 2 p ˜ r ( x ) = 1 + n 1 / 2 p ˜ 1 ( x ) + O ( n 1 ) , for x R q ,
and for measurable C R q ,
P ( X n C ) Φ V ( C ) + r = 1 n r / 2 p r C , where for r 1 , p r C = E p r ( X ) I ( X C ) = C p r ( x ) ϕ V ( x ) d x = k = 1 3 r [ p ˜ r k ( C ) : k r even ] , p ˜ r k ( C ) = E p ˜ r k ( X ) I ( X C ) = C p ˜ r k ( x ) ϕ V ( x ) d x = P ¯ r 1 k H ¯ 1 k ( C ) , and H ¯ 1 k ( C ) = E H ¯ 1 k ( X , V ) I ( X C ) = C H ¯ 1 k ϕ V ( x ) d x .
If C = C , then for r odd, Q ¯ 1 r = p ˜ r k ( C ) = p r C = 0 , so that
P ( X n C ) Φ V ( C ) + r = 1 n r p 2 r C = Φ V ( C ) + n 1 p 2 C + O ( n 2 ) .
Examples 3 and 4 of [3] gave p 2 C for
C = { x : x V 1 x u } , and C = { x : | ( V 1 / 2 x ) j | u j , j = 1 , , q } .

3. The Conditional Density and Distribution

For q = q 1 + q 2 , q 1 1 , and q 2 1 , partition w , w ^ , X N q ( 0 , V ) , X n = n 1 / 2 ( w ^ w ) , x and y = V 1 x as w 1 w 2 , w ^ 1 w ^ 2 , X 1 X 2 , X n 1 X n 2 , x 1 x 2 and y 1 y 2 , where w i , w ^ i , X i , X n i , x i , y i are vectors of length q i . Partition V , V 1 as ( V i j ) , ( V i j ) , 2 × 2 , where V i j , V i j are q i × q j .
Set X 1 2 = X 1 | ( X 2 = x 2 ) , X n 1 2 = X n 1 | ( X n 2 = x 2 ) ,
w ^ 1 2 = w ^ 1 | ( X n 2 = x 2 ) = w 1 + n 1 / 2 X n 1 2 .
Now we come to the main purpose of this paper. Theorem 1 expands the conditional density of X n 1 2 about the conditional density of X 1 2 . Its derivation is straightforward, the only novel feature being the use of Lemma 2 to find the reciprocal of a series, using Bell polynomials. Theorem 2 integrates the conditional density to obtain the expansion for the conditional distribution of X n 1 2 about the conditional distribution of X 1 2 in terms of I ¯ 1 k of (42) below, the integral of the Hermite polynomial H ¯ 1 k of (8), with respect to the conditional normal density. Note 1 gives I ¯ 1 k in terms of derivatives of the multivariate normal distribution. Theorem 3 gives I ¯ 1 k in terms of the partial moments of the conditional normal distribution. For X 1 2 of (15), set
μ 1 2 = E X 1 2 = V 12 V 22 1 x 2 R q 1 ,
V 1 2 = c o v a r X 1 2 = V 11 V 12 V 22 1 V 21 = V 0 say , R q 1 × q 1 .
So , X 1 2 N q 1 ( μ 1 2 , V 1 2 ) .
Lemma 1.
The elements of ( V i j ) = V 1 are
V 11 = V 1 2 1 , V 12 = V 11 V 12 V 22 1 = V 11 1 V 12 V 22 , V 21 = V 22 V 21 V 11 1 = V 22 1 V 21 V 11 , V 22 = V 2 1 1 , where V 2 1 = V 22 V 21 V 11 1 V 12 R q 2 × q 2 .
For i = 1 , 2 , set A i = V i 1 V 12 V 22 1 + V i 2 . Then A 1 = 0 q 1 × q 2 , A 2 = V 22 1 .
Proof. 
V V 1 = V 1 V = I q gives 8 equations relating { V i j } and { V i j } . Now solve for { V i j } .
So A 1 = 0 q 1 × q 2 , A 2 = V 22 B V 22 1 for B = V 22 V 21 V 11 1 V 12 = ( V 22 ) 1 .
Since Q = V 11 V 1 2 0 q 1 × q 1 in the sense that x Q x 0 for x R q 1 , X 1 2 is less variable than X 1 , and X n 1 2 is less variable than X n 1 , unless X 1 and X 2 are uncorrelated, that is, V 12 is a matrix of zeros.
The conditional density of X n 1 2 is
p n 1 2 ( x 1 ) = p X n ( x ) / p X n 2 ( x 2 ) = ϕ 1 2 ( x 1 ) ( 1 + S ) / ( 1 + S 2 ) ,
where S = p X n ( x ) / ϕ V ( x ) 1 r = 1 n r / 2 p ˜ r ( x ) of ( 6 ) ,
S 2 = p X n 2 ( x 2 ) / ϕ V 22 ( x 2 ) 1 r = 1 n r / 2 f r , for f r = p r ( x 2 ) ,
where p r ( x 2 ) is p ˜ r ( x ) of (6) for X n 2 , and ϕ 1 2 ( x 1 ) is the density of X 1 2 of (15). By (4)–(6), Section 2.5 of [18], for V 0 of (18),
ϕ 1 2 ( x 1 ) = ϕ V ( x ) / ϕ V 22 ( x 2 ) = ϕ V 0 ( u ) , where u = x 1 μ 1 2 R q 1 .
So the distribution of X 1 | ( X 2 = x 2 ) is
Φ 1 2 ( x 1 ) = Φ V 0 ( u ) , for V 0 of ( 18 ) .
For μ 1 2 of (17), V 1 2 of (18), and v R q 1 , set
x 1 ( x 2 , v ) = μ 1 2 + V 1 2 1 / 2 v = V 12 V 22 1 x 2 + V 1 2 1 / 2 v .
Corollary 1.
Suppose that q 1 = 1 . Then for v = V 0 1 / 2 u of (23),
ϕ 1 2 ( x 1 ) = V 0 1 / 2 ϕ ( v ) , P ( X 1 2 < x 1 ( x 2 , v ) ) = Φ ( v ) , P ( w ^ 1 2 < w 1 + n 1 / 2 x 1 ( x 2 , v ) ) = P ( X n 1 2 < x 1 ( x 2 , v ) ) = Φ ( v ) + O ( n 1 / 2 ) , P ( | w 1 w ^ 1 2 | < n 1 / 2 x 1 ( x 2 , v ) )   = P ( | X n 1 2 | < x 1 ( x 2 , v ) ) = 2 Φ ( v ) 1 + O ( n 1 ) if v > 0 , P ( x 1 ( x 2 , v ) < X n 1 2 < x 1 ( x 2 , v ) ) = 2 Φ ( v ) 1 + O ( n 1 ) , if v > 0 .
Replacing V by an estimate will usually give 1- and 2-sided conditional confidence intervals of error O ( n 1 / 2 ) and O ( n 1 ) for w ^ 1 w 1 given w ^ 2 w 2 .
Set H ¯ q 1 k = H ¯ 1 k = H ¯ 1 k ( x , V ) , and H ¯ q 2 1 k = H ¯ 1 k ( x 2 , V 22 ) .
So H ¯ q 2 1 k is given by replacing y = V 1 x and ( V i j ) = V 1 in H ¯ q 1 k by
z = V 22 1 x 2 and ( U i j ) = V 22 1 . For example H ¯ q 2 12 = z ¯ 1 z ¯ 2 U ¯ 12 .
By (5) and (6), for r 1 , p r ( x 2 ) of (22) is given by
p r ( x 2 ) = k = 1 3 r [ p r k : k r even ] , where p r k = P ¯ r 1 k H ¯ q 2 1 k ,
and implicit summation in (28) for i 1 , , i k is now over  q 1 + 1 , , q . So,
p 1 ( x 2 ) = k = 1 , 3 p 1 k , p 11 = i 1 = q 1 + 1 q k ¯ 1 1 H ¯ q 2 1 , p 13 = i 1 , i 2 , i 3 = q 1 + 1 q k ¯ 2 1 3 H ¯ q 2 1 3 / 6 , where H ¯ q 2 1 = z ¯ 1 , H ¯ q 2 1 3 = z ¯ 1 z ¯ 2 z ¯ 3 3 U ¯ 12 z ¯ 3 , p 2 ( x 2 ) = k = 2 , 4 , 6 p 2 k , p 3 ( x 2 ) = k = 1 , 3 , 5 , 7 , 9 p 3 k , for p r k of ( 28 ) .
Ordinary Bell polynomials. For a sequence e = ( e 1 , e 2 , ) from R, the partial ordinary Bell polynomial  B ˜ r s = B ˜ r s ( e ) , is defined by the identity
for s = 0 , 1 , 2 , and z R , S s = r = s z r B ˜ r s ( e ) , where S = r = 1 z r e r .
So , B ˜ r 0 = δ r 0 , B ˜ r 1 = e r , B ˜ r r = e 1 r , B ˜ 32 = 2 e 1 e 2 ,
where δ 00 = 1 , δ r 0 = 0 for r 0 . They are tabled on p309 of [19]. To obtain (21), we use
Lemma 2.
Take B ˜ r s ( e ) of (29). Set S 2 = r = 1 z r f r for f r R . Then
( 1 + S 2 ) 1 = r = 0 z r C r , where C r = B r ( f ) , B r ( e ) = s = 0 r B ˜ r s ( e ) .
So , B 0 ( e ) = 1 , B 1 ( e ) = e 1 , B 2 ( e ) = e 2 + e 1 2 , B 3 ( e ) = e 3 + 2 e 1 e 2 + e 1 3 ,
C 0 = 1 , C 1 = f 1 , C 2 = f 1 2 f 2 , C 3 = f 1 3 + 2 f 1 f 2 f 3 .
Proof. 
( 1 + S 2 ) 1 = s = 0 ( S 2 ) s , and ( S 2 ) s = r = s z r B ˜ r s ( f ) .
Now swap summations. □
Theorem 1.
Take p ˜ r ( x ) of (6) and C r = B r ( f ) of (30) with
f r = p r ( x 2 ) of ( 28 ) .
The conditional density p n 1 2 ( x 1 ) of (21), relative to ϕ 1 2 ( x 1 ) of (23), is
p n 1 2 ( x 1 ) / ϕ 1 2 ( x 1 ) r = 0 n r / 2 D r , where D r = C r p ˜ r ( x ) ,
and for sequences ( a 0 , a 1 , ) and ( b 0 , b 1 , ) , a r b r = i = 0 r a i b r i . So,
D 0 = p ˜ 0 ( x ) = 1 , D 1 = C 1 + p ˜ 1 ( x ) , D 2 = C 2 + C 1 p ˜ 1 ( x ) + p ˜ 2 ( x ) ,
D 3 = C 3 + C 2 p ˜ 1 ( x ) + C 1 p ˜ 2 ( x ) + p ˜ 3 ( x ) .
Proof. 
This follows from (21) and Lemma 2. □
So D 0 , , D 3 of (34) and (35) give the conditional density to O ( n 2 ) . We call (33) the relative conditional density. We now give our main result, an expansion for the conditional distribution of X n 1 | ( X n 2 = x 2 ) . As noted, Theorem 2 gives this in terms of I ¯ 1 k of (42) below, an integral of the Hermite polynomial H ¯ 1 k of (8), and Note 1 gives I ¯ 1 k in terms of derivatives of the multivariate normal distribution. Theorem 3 gives I ¯ 1 k in terms of the partial moments of the conditional distribution Φ 1 2 ( x 1 ) of (24).
When q 1 = 1 , Theorem 4 gives I ¯ 1 k in terms of Φ ( v ) and ϕ ( v ) for
v = V 1 2 1 / 2 u = V 1 2 1 / 2 ( x 1 μ 1 2 ) R q 1 .
Theorem 2.
Take C r , D r of Theorem 1. Set p ˜ 0 ( x ) = 1 . The conditional distribution of X n 1 given X n 2 , about Φ 1 2 ( x 1 ) of (24), has the expansion
P n 1 2 ( x 1 ) = P ( X n 1 2 ) x 1 r = 0 n r / 2 G r ,
where G r = x 1 D r d Φ 1 2 ( x 1 ) = C r g r , and
g r = x 1 p ˜ r ( x ) d Φ 1 2 ( x 1 ) . So , G 0 = g 0 = Φ 1 2 ( x 1 ) = Φ V 0 ( u ) of ( 24 ) ,
for r 1 , g r = k = 1 3 r [ g r k : k r even ] , where for p ˜ r k of ( 6 ) ,
g r k = x 1 p ˜ r k d Φ 1 2 ( x 1 ) = P ¯ r 1 k I ¯ 1 k , and for 1 i 1 , , i k q ,
I ¯ 1 k = I i 1 i k = x 1 H ¯ q 1 k ϕ 1 2 ( x 1 ) d x 1 , for ϕ 1 2 ( x 1 ) of ( 23 ) .
Proof. 
(40) holds by (6). (41) holds by (6). Now use (23). □
So P n 1 2 ( x 1 ) = Φ V 0 ( u ) + r = 1 3 n r / 2 G r + O ( n 2 ) , where G 1 = g 1 f 1 g 0 ,
G 2 = g 2 f 1 g 1 + C 2 g 0 , G 3 = g 3 f 1 g 2 + C 2 g 1 + C 3 g 0 ,
for C r of (31). g 1 = g 11 + g 13 is given by I ¯ 1 , I ¯ 1 3 , g 2 = g 22 + g 24 + g 26 is given by I ¯ 12 , I ¯ 1 4 , I ¯ 1 6 , and g 3 = ( g 3 k : k = 1 , 3 , 5 , 7 , 9 ) is given by I ¯ 1 , I ¯ 1 3 , , I ¯ 1 9 .
Note 1.
Set  = Π i = q 1 + 1 q i .  By (23),
I ¯ 1 k = ϕ V 22 ( x 2 ) 1 L ¯ 1 k , where
L ¯ 1 k = x 1 H ¯ q 1 k ϕ V ( x ) d x 1 = ( ¯ 1 ) ( ¯ k ) Φ V ( x ) .
Comparing  L ¯ 1 k  with the Hermite function H ¯ 1 k of (7), we can call  L ¯ 1 k the partial Hermite function. When  q = 2 , see (53).
By (39), G r in (37) is given by C r of (32) and g r of (40). Viewing H ¯ q 1 k as a polynomial in x 1 = μ 1 2 + u for u of (23), I ¯ 1 k is linear in
x 1 x i 1 x i s ϕ 1 2 ( x 1 ) d x 1 = u ( μ 1 2 + u ) i 1 ( μ 1 2 + u ) i s ϕ V 0 ( u ) d u
for 0 s k , 1 i 1 , , i s q 1 . So I ¯ 1 k can be expanded in terms of the partial moments of
M = Φ 1 2 ( x 1 ) = Φ V 0 ( u ) ,
M ¯ a b = M ¯ a b ( u , V 0 ) = M i a i b ( u , V 0 ) = u u i a u i b ϕ V 0 ( u ) d u .
This has only q 1 integrals, while (12) has q integrals.
Lemma 3.
For u = x 1 μ 1 2 , y = V 1 x = α + Λ u , where
Λ = V 11 V 21 R q × q 1 , α = α 1 α 2 , α 1 = 0 q 1 , α 2 = V 22 1 x 2 .
Proof. 
y = y 1 y 2 , where y i = V i 1 x 1 + V i 2 x 2 = α i + V i 1 u , and α i = A i x 2 for A i of (20). □
Our main result, Theorem 2, gave the conditional distribution expansion in terms of I ¯ 1 k of (42). Note 4.1 gave these in terms of the derivatives of Φ V ( x ) . We now give I ¯ 1 k in terms of J ¯ 1 k , the partial moments of the conditional distribution Φ 1 2 ( x 1 ) of (24). As in (10), for any π = ( m , , n ) , set N c π = c π summed over all, N say, permutations of π giving distinct c π . For example, 2 c 23 = c 23 + c 32 .
Theorem 3.
Take J ¯ 1 k ( x , V ) of (11), u of (24), M of (45), M ¯ a b of (46), Λ , α of (47), and 1 i 1 , , i k q . Set
K ¯ 1 k = K i 1 i k = u ( Λ u ) i 1 ( Λ u ) i k ϕ V 0 ( u ) d u = Λ i 1 j 1 Λ i k j k M j 1 , j k ,
where j 1 , j k sum over their range 1 , , q 1 . So
K ¯ 1 k = Λ ¯ 1 , k + 1 Λ ¯ k , 2 k M ¯ k + 1 2 k . For example , K ¯ 1 = Λ ¯ 12 M ¯ 2 , K ¯ 12 = Λ ¯ 13 Λ ¯ 24 M ¯ 34 , K ¯ 123 = Λ ¯ 14 Λ ¯ 25 Λ ¯ 36 M ¯ 456 .
Set J ¯ 0 1 k = J ¯ 1 k ( u , V 0 ) = u y ¯ 1 y ¯ k ϕ V 0 ( u ) d u : J ¯ 0 1 k = α ¯ 1 α ¯ k M + k 1 α ¯ 1 α ¯ k 1 K ¯ k + k 2 α ¯ 1 α ¯ k 2 K ¯ k 1 , k + + K ¯ 1 k .
For example , J ¯ 0 1 = α ¯ 1 M + K ¯ 1 , J ¯ 0 12 = α ¯ 1 α ¯ 2 M + 2 α ¯ 1 K ¯ 2 + K ¯ 12 , J ¯ 0 1 3 = α ¯ 1 α ¯ 2 α ¯ 3 M + 3 ( α ¯ 1 α ¯ 2 K ¯ 3 + α ¯ 1 K ¯ 23 ) + K ¯ 1 3 , J ¯ 0 1 4 = α ¯ 1 α ¯ 4 M + 4 ( α ¯ 1 α ¯ 2 α ¯ 3 K ¯ 4 + α ¯ 1 K ¯ 234 ) + 6 α ¯ 1 α ¯ 2 K ¯ 34 + K ¯ 1 4 , J ¯ 0 1 5 = α ¯ 1 α ¯ 5 M + 5 ( α ¯ 1 α ¯ 4 K ¯ 5 + α ¯ 1 K ¯ 2 5 ) + 10 ( α ¯ 1 α ¯ 2 α ¯ 3 K ¯ 45 + α ¯ 1 α ¯ 2 K ¯ 3 5 ) + K ¯ 1 5 , J ¯ 0 1 6 = α ¯ 1 α ¯ 6 M + 6 ( α ¯ 1 α ¯ 5 K ¯ 6 + α ¯ 1 K ¯ 2 6 ) + 15 ( α ¯ 1 α ¯ 4 K ¯ 56 + α ¯ 1 α ¯ 2 K ¯ 3 6 ) + 20 α ¯ 1 α ¯ 2 α ¯ 3 K ¯ 4 6 + K ¯ 1 6 , For μ ¯ 1 2 k of ( 10 ) , I ¯ 1 = J ¯ 0 1 , I ¯ 12 = J ¯ 0 12 M V ¯ 12 , I ¯ 1 3 = J ¯ 0 1 3 3 J ¯ 0 1 V ¯ 23 , I ¯ 1 4 = J ¯ 0 1 4 6 J ¯ 0 12 V ¯ 34 + M μ ¯ 1 4 , I ¯ 1 5 = J ¯ 0 1 5 10 J ¯ 0 1 3 V ¯ 45 + 5 J ¯ 0 1 μ ¯ 2 5 , I ¯ 1 6 = J ¯ 0 1 6 15 J ¯ 0 1 4 V ¯ 56 + 15 J ¯ 0 12 μ ¯ 3 6 M μ ¯ 1 6 .
Proof. 
Since x 1 = μ + u , y = Λ x 1 + V 12 V 22 x 2 = α + Λ u R q . Substitute y = α + Λ u into the expressions for H ¯ 1 k . Now multiply by ϕ V 0 ( u ) and integrate from to u. □
This gives the I ¯ 1 k needed for g 1 , g 2 , G 1 , G 2 . The I ¯ 1 k , k = 7 , 9 needed for g 3 , G 3 can be written down similarly in terms of the partial moments using H ¯ q 1 k for k = 7 , 9 . We now show that if q 1 = 1 , we only need the partial moments of Φ ( v ) at v of (36), and that these are easily written in terms of Φ ( v ) and ϕ ( v ) × a polynomial in v of (36).
The case  q 1 = 1 . So w 1 = w 1 , w ^ 1 = w ^ 1 , X 1 = X 1 , X n 1 = X n 1 , V 11 = V 11 .
Theorem 4.
For q 1 = 1 , 1 k 6 , I ¯ 1 k is given by Theorem 3 with
α 1 = 0 , K ¯ 1 k = Λ ¯ 1 Λ ¯ k σ k γ k , where Λ = V 11 V 21 R q ,
σ = V 1 2 1 / 2 of ( 18 ) , γ k = v v k ϕ ( v ) d v , for v of ( 36 ) .
So γ 0 = Φ ( v ) , γ 1 = ϕ ( v ) , γ k = v k 1 γ 1 + ( k 1 ) γ k 2 , for k 2 :
γ 2 = γ 0 + v γ 1 , γ 3 = ( v 2 + 2 ) γ 1 , γ 4 = 3 γ 0 + ( v 3 + 3 v ) γ 1 , γ 5 = ( v 4 + 4 v 2 + 4.2 ) γ 1 , γ 6 = 5.3 γ 0 + ( v 5 + 5 v 3 + 5.3 v ) γ 1 , γ 7 = ( v 6 + 6 v 4 + 6.4 v 2 + 6 . 4.2 ) γ 1 , γ 8 = 7 . 5.3 γ 0 + ( v 7 + 7 v 5 + 7.5 v 3 + 7 . 5.3 v ) γ 1 , γ 9 = ( v 8 + 8 v 6 + 8.6 v 4 + 8 . 6.4 v 2 + 8.6 . 4.2 ) γ 1 ,
where dot denotes multiplication. Also, G 0 = Φ ( v ) .
Proof. 
For v of (36), by (23), ϕ 1 2 ( x 1 ) = σ 1 ϕ ( v ) . (51) follows from integration by parts. By (48), K 1 k = Λ ¯ 1 Λ ¯ k M 1 k where M 1 k = u u k d Φ ( u / σ ) = σ k γ k . That G 0 = Φ ( v ) , follows from (39). □
By (37), for C r of (32) and v of (36), the conditional distribution of X n 1 2 is
P ( X n 1 2 / σ v ) Φ ( v ) + r = 1 n r / 2 G r , where G r = C r g r ,
as in (43), and g r is given by (40) in terms of the integrated Hermite polynomial, I ¯ 1 k of (42) given by Theorems 3 and 4.

4. The Case q 1 = q 2 = 1

Theorem 3.2 gave the conditional Edgeworth expansion in terms of I ¯ 1 k of (42). Theorem 3.3 gave I ¯ 1 k needed for g r k of (41) and G 1 , G 2 of (37), in terms of the partial moments M ¯ a b of (46). When q 1 = 1 , Theorem 4 gave I ¯ 1 k in terms of Φ ( v ) and its partial moments for v of (36). But now q = 2 so that i 1 , , i k = 1 or 2. So for ( I , y , Y ) of (9), we switch notation to
H a b = ( 1 ) a ( 2 ) b ϕ V ( x ) = E ( y 1 + I Y 1 ) a ( y 2 + I Y 2 ) b , H a b = ( 1 ) a ( 2 ) b Φ V ( x ) = x H a b ϕ V ( x ) d x . So H a b = H a 1 , b 1 ϕ V ( x ) if a 2 , b 1 , H 10 = x 2 ϕ V ( x ) d x 2 = 1 Φ V ( x ) , H a 0 = ( 1 ) a 1 H 10 if a 2 , H 01 = x 1 ϕ V ( x ) d x 1 = 2 Φ V ( x ) , H 0 b = ( 2 ) b 1 H 01 if b 1 ,
L 1 a 2 b = ( 1 ) a ( 2 ) b + 1 Φ V ( x ) = H a , b + 1 ,
for L ¯ 1 k of (44). Similarly, write (2) as
κ a b ( w ^ 1 , w ^ 2 ) d = a + b 1 n d k a b d , for a + b 1 , where k a b d = k d 1 a 2 b , and
set K a b = K 1 a 2 b , J a b = J 0 1 a 2 b , I a b = I 1 a 2 b = x 1 H a b ( x ) ϕ 1 2 ( x 1 ) d x 1 .
Also, we switch from P ¯ r 1 k to
P r ( a b ) = a + b a P r 1 a 2 b .
given for r 3 in Section 4 of [3]. So,
p ˜ r k = b = 0 k P r ( k b , b ) H k b , b , P r k ( x ) = b = 0 k P r ( k b , b ) H k b , b .
So , p ˜ r 1 = P r ( 10 ) H 10 + P r ( 01 ) H 01 , p ˜ 11 = k 101 y 1 + k 011 y 2 , p ˜ 13 = P 1 ( 30 ) H 30 + P 1 ( 21 ) H 21 + P 1 ( 12 ) H 12 + P 1 ( 03 ) H 03 .
P r ( b a ) is just P r ( a b ) with 1 and 2 reversed. For the other p ˜ r k and P r k ( x ) needed for r 3 , see Section 4 of [3]. Our main result for this section, Theorem 7, gives simple formulas for I a b and for g r of (40), the main ingredient needed in Theorem 2 for the expansion of the conditional distribution.
Theorem 5.
The conditional density of X n 1 2 of (15), is given by Theorem 1 where f r = p r ( x 2 ) is given by (28) in terms of
p r k = P r ( 0 k ) H k , where H k = H k ( z , V 22 ) and z = V 22 1 x 2 .
For example , H 1 = z , H 2 = z 2 V 22 1 , H 3 = z 3 3 z V 22 1 ,
H 4 = z 4 15 z 2 V 22 1 + 3 V 22 2 , H 5 = z 5 10 z 3 V 22 1 + 15 z V 22 2 ,
H 6 = z 6 15 z 4 V 22 1 + 45 z 2 V 22 2 15 V 22 3 .
Proof. 
This follows from Theorem 1. □
Theorem 6 gives a laborious expression for the conditional distribution.
However Theorem 7 gives a huge simplification.
Theorem 6.
The conditional distribution of X n 1 2 of (15), is given by Theorem 2 with Λ , σ , γ s of Theorem 4 as follows. For k r even, g r k of (41) is given by
g r k = b = 0 k P r ( k b , b ) I k b , b ,
where I a b of (54) is given for a + b = k , as follows in terms of Λ i = V i 1 .
K a b = Λ 1 a Λ 2 b σ a + b γ a + b , and J k 0 = s = 0 k α 1 k s K s 0 , for K s 0 = ( Λ 1 σ ) s γ s .
For k = 1 : I 10 = J 10 = α 1 γ 0 + Λ 1 σ γ 1 . For k = 2 : I 20 = J 20 γ 0 V 11 , I 11 = J 11 γ 0 V 12 ,
J 11 = α 1 α 2 γ 0 + σ γ 1 2 α 1 Λ 2 + Λ 1 Λ 2 σ 2 γ 2 . For k = 3 : I 30 = J 30 3 J 10 V 11 , I 21 = J 21 ( 2 J 10 V 12 + J 01 V 11 ) , J 21 = α 1 2 α 2 γ 0 + X 21 + X 12 + K 21 , where X 21 = α 1 2 K 01 + 2 α 1 α 2 K 10 , X 12 = 2 α 1 K 11 + α 2 K 20 . For k = 4 : I 40 = J 40 6 J 20 V 11 + 3 γ 0 ( V 11 ) 2 , I 31 = J 31 S 6 + γ 0 S 3 , where S 6 = 3 J 20 V 22 + 3 J 11 V 12 , S 3 = 3 V 11 V 12 , J 31 = α 1 3 α 2 γ 0 + X 31 + X 22 + X 13 + K 31 , where X 31 = α 1 3 K 01 + 3 α 1 2 α 2 K 10 , X 22 = 4 α 1 2 K 11 + 2 α 1 α 2 K 20 , X 13 = 3 α 1 K 21 + 6 α 2 K 30 , I 22 = J 22 S 6 + γ 0 S 3 , where S 6 = J 20 V 22 + 4 J 11 V 12 + J 02 V 11 , S 3 = μ 1122 = V 11 V 22 + 2 ( V 12 ) 2 , J 22 = α 1 2 α 2 2 γ 0 + X 31 + X 22 + X 13 + K 22 , where X 31 = 2 α 1 2 α 2 K 01 + 2 α 1 α 2 2 K 02 , X 22 = α 1 2 K 02 + 4 α 1 α 2 K 11 + α 2 2 K 20 , X 13 = 2 α 1 K 12 + 2 α 2 K 21 . For k = 5 : I 50 = J 50 10 J 30 V 11 + 15 J 10 ( V 11 ) 2 , I 41 = J 41 S 10 + S 15 , where S 10 = 6 J 21 V 11 + 4 J 30 V 12 , S 15 = 12 J 10 V 11 V 12 + 3 J 01 ( V 11 ) 2 , J 41 = α 1 4 α 2 γ 0 + X 41 + X 32 + X 23 + X 14 + K 41 , where , X 41 = 4 α 1 3 α 2 K 10 + α 1 4 K 01 , X 32 = 5 α 1 2 α 2 K 20 + 5 α 1 3 K 11 , X 23 = 6 α 1 2 K 21 + 4 α 1 α 2 K 30 , X 14 = 4 α 1 K 31 + α 2 K 40 , I 32 = J 32 S 10 + S 15 , where S 10 = 3 J 12 V 11 + 6 J 21 V 12 + J 30 V 22 , S 15 = 3 J 10 μ 1122 + 6 J 01 V 11 V 12 , J 32 = α 1 3 α 2 2 γ 0 + X 41 + X 32 + X 23 + X 14 + K 32 , where , X 41 = 3 α 1 2 α 2 2 K 10 + 2 α 1 3 α 2 K 01 , X 32 = 3 α 1 α 2 2 K 20 + 6 α 1 2 α 2 K 11 + α 1 3 K 02 , X 23 = 3 α 1 2 K 12 + 6 α 1 α 2 K 21 + α 2 2 K 30 , X 14 = 3 α 1 K 22 + 2 α 2 K 31 . For k = 6 : I 60 = J 60 15 J 40 V 11 + 45 J 20 ( V 11 ) 2 15 γ 0 ( V 11 ) 3 , I 51 = J 51 S 15 + S 45 γ 0 S 15 , where S 15 = 5 V 12 J 40 + 10 V 11 J 31 , S 45 = 30 V 11 V 12 J 20 + 15 ( V 11 ) 2 J 11 , S 15 = 15 γ 0 ( V 11 ) 2 V 12 , J 51 = α 1 5 α 2 γ 0 + X 51 + X 42 + X 33 + X 24 + X 15 + K 51 , where X 51 = α 1 5 K 01 + 5 α 1 4 α 2 K 10 , X 42 = 5 α 1 4 K 11 + 10 α 1 3 α 2 K 20 , X 33 = 10 α 1 3 K 21 + 10 α 1 2 α 2 K 30 , X 24 = 10 α 1 2 K 31 + 5 α 1 α 2 K 40 , X 15 = α 2 K 60 + 5 α 1 K 51 , I 42 = J 42 S 15 + S 45 γ 0 S 15 , where S 15 = V 22 J 40 + 6 V 11 J 22 + 8 V 12 J 31 , S 45 = 3 ( V 11 ) 2 J 02 + 6 μ 1122 J 20 + 24 V 11 V 12 J 11 , S 15 = 3 ( V 11 ) 2 V 22 + 24 ( V 11 V 12 ) 2 + 6 V 11 μ 1122 , J 42 = α 1 4 α 2 2 γ 0 + X 51 + X 42 + X 33 + X 24 + X 15 + K 42 , where X 51 = 2 α 1 5 K 01 + 4 α 1 4 α 2 K 10 , X 42 = α 1 4 K 02 + 8 α 1 3 α 2 K 11 + 6 α 1 2 α 2 2 K 20 , X 33 = 10 α 1 2 α 2 K 12 + 10 α 1 α 2 2 K 21 , X 24 = α 2 2 K 40 + 8 α 1 α 2 K 31 + 6 α 1 2 K 22 , X 15 = 2 α 2 K 50 + 4 α 1 K 41 , I 33 = J 33 S 15 + S 45 γ 0 S 15 , where S 15 = 6 V 11 J 04 + 9 V 12 J 22 + 6 V 22 J 40 , S 45 = 9 V 12 V 22 J 20 + 9 μ 1122 J 11 + 9 V 12 V 11 J 02 , S 15 = 6 V 11 V 12 V 22 + 3 V 12 μ 1122 , J 33 = α 1 3 α 2 3 γ 0 + X 51 + X 42 + X 33 + X 24 + X 15 + K 33 , where X 51 = 2 α 1 2 α 2 3 K 10 + 3 α 1 3 α 2 2 K 01 , X 42 = 6 α 1 α 2 3 K 20 + 3 α 1 3 α 2 3 K 11 + 6 α 1 3 α 2 K 02 , X 33 = α 1 3 K 03 + 9 α 1 2 α 2 K 12 + 9 α 1 α 2 2 K 21 + α 2 3 K 30 , X 24 = 6 α 1 2 K 13 + 3 α 1 α 2 K 22 + 8 α 2 2 K 31 , X 15 = 3 α 1 K 23 + 3 α 2 K 32 .
Also J b a , I b a are J a b , I a b with α 1 , Λ 1 = V 11 and α 2 , Λ 2 = V 21 of (47) reversed, before setting α 1 = 0 and α 2 = z = V 22 1 x 2 by (27). For example, by (61), for Λ , σ , γ s of Theorem 3.4,
I 10 = α 1 γ 0 + Λ 1 σ γ 1 = V 11 σ γ 1 , I 01 = α 2 γ 0 + Λ 2 σ γ 1 = z γ 0 + V 11 σ γ 1 ,
J k 0 = K k 0 = ( V 11 σ ) k γ k , J 0 k = s = 0 k z k s K 0 s = z J 0 , k 1 + K 0 k
where K 0 s = ( V 21 σ ) s γ s .
Proof 
This follows from Theorems 3 and 4. □
This gives the I ¯ a b needed for g 1 , g 2 , G 1 , G 2 for the conditional distribution of (37)–(39) to O ( n 3 / 2 ) . The I ¯ a b needed for g 3 , G 3 can be written down similarly. We now give a much simpler method for obtaining g r k of (41), and so g r by (40), and G r needed for (37) by (38). Theorem 7 gives g r k and g r in terms of I 0 k of (54). Theorem 4.4 gives I 0 k in terms of J 0 k of (63), a function of ( Λ , σ , γ s ) of Theorem 3.4.
Theorem 7.
For v of (36), I a b of (54) is given by
I a b = H a 1 , b σ 1 ϕ ( v ) , for a 1 .
For k r 1 and k r even, g r k of (41) is given by
g r k = P r ( 0 k ) I 0 k b r k σ 1 ϕ ( v ) , for b r k = a = 1 k P r ( a , k a ) H a 1 , k a .
So by (40), for r 1 , g r of (39) is given by
g r = k = 1 3 r [ P r ( 0 k ) I 0 k b r k σ 1 ϕ ( v ) : k r even ] .
Proof. 
By (60), g r k = P r ( 0 k ) I 0 k + a = 1 k P r ( a , k a ) I a , k a .
By (23), ϕ 1 2 ( x 1 ) / ϕ V ( x ) = θ 1 where θ = ϕ V 22 ( x 2 ) and ϕ 1 2 ( x 1 ) = σ 1 ϕ ( v ) .
So for a 1 , H a b ϕ V ( x ) = ( 1 ) a ( 2 ) b = 1 [ H a 1 , b ϕ V ( x ) ] . So I a b = θ 1 x 1 H a b ϕ V ( x ) d x 1 = θ 1 H a 1 , b ϕ V ( x ) = H a 1 , b σ 1 ϕ ( v ) .
This proves (64). So,
g r k = P r ( 0 k ) I 0 k θ 1 ϕ V ( x ) [ P r ( a b ) H a 1 , b : a + b = k , a 1 ] .
(65) follows. (66) now follows from (28). □
Note 2.
b r k  is just  p ˜ r k  of (55) with  ( H 0 b , H a b )  replaced by  ( 0 , H a 1 , b )  for  a 1 .
So for r = 1 , 2 , 3 , b r k is given in terms of P r ( . ) of Section 4, by
b r 1 = P r ( 10 ) = k r 1 = k 10 r , b r 3 = P r ( 30 ) H 20 + P r ( 21 ) H 11 + P r ( 12 ) H 02 ,
b 22 = P 2 ( 20 ) H 10 + P 2 ( 11 ) H 01 ,
b 24 = P 2 ( 40 ) H 30 + P 2 ( 31 ) H 21 + P 2 ( 22 ) H 12 + P 2 ( 13 ) H 03 .
b 26 = P 2 ( 60 ) H 50 + P 2 ( 51 ) H 41 + P 2 ( 42 ) H 32 + P 2 ( 33 ) H 23 + P ( 24 ) H 14 + P 2 ( 15 ) H 05 , b 35 = P 3 ( 50 ) H 40 + P 3 ( 41 ) H 31 + P 3 ( 32 ) H 22 + P 3 ( 23 ) H 13 + P 3 ( 14 ) H 04 , b 37 = P 3 ( 70 ) H 60 + P 3 ( 61 ) H 51 + P 3 ( 52 ) H 42 + P 3 ( 43 ) H 33 + P 3 ( 34 ) H 24 + P 3 ( 25 ) H 15 + P 3 ( 16 ) H 06 ,
b 39 = P 3 ( 90 ) H 80 + P 3 ( 81 ) H 71 + P 3 ( 72 ) H 62 + P 3 ( 63 ) H 53 + P 3 ( 54 ) H 44 + P 3 ( 45 ) H 35 + P 3 ( 36 ) H 26 + P 3 ( 27 ) H 17 + P 3 ( 18 ) H 08 .
This gives g r k and g r of (40) for r 3 , and so the conditional distribution P 1 2 ( x 1 ) of (37), to O ( n 2 ) , in terms of I 0 k of (54) and the coefficients P r ( a b ) .
Theorem 8.
The I 0 k needed for g 1 , g 2 , g 3 of (66) and (38) are given in terms of γ 0 = Φ ( v ) , v of (36), and J 0 k of (63), by
I 01 = J 01 , I 02 = J 02 γ 0 V 22 , I 03 = J 03 3 J 01 V 22 , I 04 = J 04 6 J 02 V 22 + 3 γ 0 ( V 22 ) 2 , I 05 = J 05 10 J 03 V 22 + 15 J 01 ( V 22 ) 2 , I 06 = J 06 15 J 04 V 22 + 45 J 02 ( V 22 ) 2 15 γ 0 ( V 22 ) 3 , I 07 = J 07 21 J 05 V 22 + 105 J 03 ( V 22 ) 2 105 J 01 ( V 22 ) 3 , I 08 = J 08 28 J 06 V 22 + 210 J 04 ( V 22 ) 2 420 J 02 ( V 22 ) 3 + 105 γ 0 ( V 22 ) 4 , I 09 = J 09 36 J 07 V 22 + 378 J 05 ( V 22 ) 2 1260 J 03 ( V 22 ) 3 + 945 J 01 ( V 22 ) 4 .
Proof. 
For k 6 , I 0 k follow from Theorem 2.
By the proof of Theorem 3, I 0 k can be read off [3] and the univariate Hermite polynomials H k ( u ) given in terms of I = 1 by expanding
H k = H k ( u ) = ϕ ( u ) 1 ( d / d u ) k ϕ ( u ) = E ( u + I N ) k , for k 0 .
To summarise, the conditional density of X n 1 2 of (15), is given by Theorem 5, and the conditional distribution is given by (37), (41) in terms of g r of (66) and I 0 k of Theorem 8.
Example 1.
Conditioning when w ^ R 2 is the mean of a sample with cumulants κ a b . The non-zero P r ( a b ) were given in Example 6 of [3]. So b r k = 0 for ( r k ) = ( 11 ) , ( 22 ) , ( 31 ) , ( 33 ) , and for r = 1 , 2 , 3 , other b r k are given by (67)–(70) starting
6 b 13 = κ 30 H 20 + 3 κ 21 H 11 + κ 12 H 02 ,
24 b 24 = κ 40 H 30 + 4 κ 31 H 21 + 6 κ 22 H 12 + 4 κ 13 H 03 ,
72 b 26 = κ 30 2 H 50 + 6 κ 30 κ 21 H 41 + 3 ( 2 κ 30 κ 12 + 3 κ 21 2 ) H 32 + 12 ( κ 30 κ 03 + 9 κ 21 κ 12 ) H 23 + 3 ( 2 κ 03 κ 21 + 3 κ 12 2 ) H 14 + 6 κ 03 κ 12 H 05 .
The relative conditional density is given to O ( n 2 ) by (33) in terms of p ˜ r of (6), p ˜ r k of (55), f r = p r ( x 2 ) of (28) for r 3 , and H k of (56) for k 9 .
So , f 1 = p 13 = P 1 ( 03 ) H 3 , P 1 ( 03 ) = κ 03 / 6 , f 2 = p 24 + p 26 , p 24 = P 2 ( 04 ) H 4 , P 2 ( 04 ) = κ 04 / 24 , p 26 = P 2 ( 06 ) H 6 , P 2 ( 06 ) = κ 03 2 / 72 ,
f 3 = k = 5 , 7 , 9 p 3 k , p 3 k = P 3 ( 0 k ) H k , P 3 ( 05 ) = κ 05 / 120 , P 3 ( 07 ) = κ 04 κ 03 / 144 , P 3 ( 09 ) = ( κ 03 / 6 ) 3 .
The conditional distribution is given by (52) with g r of (66), starting
G 0 = g 0 = Φ ( v ) , g 1 = κ 03 I 03 / 6 b 13 σ 1 ϕ ( v ) ,
for v of ( 36 ) , with σ 2 = V 0 = κ 20 κ 11 2 / κ 02 , μ 1 2 = κ 11 κ 02 1 x 2 ,
I 03 of Theorem 8, and b 13 of (71). As noted this is a far simpler result than using Theorem 6.
Similarly , g 2 = κ 04 I 04 / 24 + κ 03 2 I 06 / 72 ( b 24 + b 26 ) σ 1 ϕ ( v ) , g 3 = k = 5 , 7 , 9 [ P 3 ( 0 k ) I 0 k b 3 k σ 1 ϕ ( v ) ] ,
for b 24 , b 26 of (72), (73) and b 3 k above.
Example 2.
We now build on  the entangled gamma  model of Example 7 of [3], which gave the P r ( a b ) needed. Let G 0 , G 1 , G 2 be independent gamma random variables with means γ = γ 0 , γ 1 , γ 2 . For i = 1 , 2 , set X i = G 0 + G i , w i = E X i = γ + γ i , and let w ^ be the mean of a random sample of size n distributed as ( X 1 , X 2 ) . So, E w ^ = w , and n w ^ = L ( G n 0 + G n 1 , G n 0 + G n 2 ) where G n 0 , G n 1 , G n 2 are independent gamma random variables with means n γ , n γ 1 , n γ 2 . The rth order cumulants of X = ( X 1 , X 2 ) are κ i r = ( r 1 ) ! w i , and otherwise ( r 1 ) ! γ . Now suppose that γ i 1 , the entangled exponential model. So q = 2 , X n 1 and X n 2 have correlation 1 / 2 ,
V = 2 1 1 2 , V 12 V 22 1 = 1 / 2 , V 1 2 = 3 / 2 , V 1 = 2 1 1 2 / 3 , P ( X n 1 | ( X n 2 = x 2 ) < x 1 ( x 2 , u ) ) = Φ ( u ) + O ( n 1 / 2 ) ,
for x 1 ( x 2 , v ) of (25), that is, x 1 ( x 2 , u ) = x 2 / 2 + ( 3 / 2 ) 1 / 2 u . Figure 1 plots the conditional asymptotic quantiles of X n 1 2 , that is, x 1 ( x 2 , u ) , for Φ ( u ) = 0.01 , 0.0025 , 0.1 , 0 , 0.9 , 0.975 , 0.99 . To O ( n 1 / 2 ) , given n and w ^ , this figure is equivalent to a figure of w 1 versus w 2 . That is, Figure 1 shows to O ( n 1 / 2 ) , the likely value of w 1 = w ^ 1 n 1 / 2 x 1 for a given value of w 2 = w ^ 2 n 1 / 2 x 2 . In fact by (26), X n 1 2 = n 1 / 2 ( w 1 w ^ 1 2 ) lies between the outer limits with probability 0.98+ O ( n 1 ) . So although labelled as x 1 versus x 2 , the figure can be viewed as showing the likely value of w 1 = w ^ 1 n 1 / 2 x 1 for a given value of w 2 = w ^ 2 n 1 / 2 x 2 .
We now give C r of (31), D r of (33), H k and p r k of (56), and g r for G r , the coefficients of the expansion for the conditional distribution of (37).
So , P 1 ( 03 ) = 2 / 3 , P 1 ( 21 ) = 1 , P 2 ( 04 ) = 1 / 2 , P 2 ( 31 ) = 1 , P 2 ( 22 ) = 3 / 2 , P 2 ( 06 ) = 2 / 9 , P 2 ( 51 ) = 2 / 3 , P 2 ( 42 ) = 7 / 6 , P 2 ( 33 ) = 26 / 3 P 3 ( 05 ) = 2 / 5 , P 3 ( 41 ) = 1 , P 3 ( 32 ) = 2 , P 3 ( 07 ) = 1 / 3 , P 3 ( 61 ) = P 3 ( 52 ) = 5 / 2 , P 3 ( 43 ) = 3 , P 3 ( 09 ) = 8 / 27 , P 3 ( 81 ) = 4 / 27 , P 3 ( 72 ) = 10 / 27 , P 3 ( 63 ) = 47 / 756 , P 3 ( 54 ) = 59 / 945 . By Theorem 8 , to 3 decimal places , I 03 = J 03 2 J 01 = 0.586 , I 04 = J 04 4 J 02 + 4 γ 0 / 3 = 0.871 , I 05 = J 05 20 J 03 / 3 + 20 J 01 / 9 = 0.709 , I 06 = J 06 10 J 04 + 20 J 02 40 γ 0 / 9 = 3.187 , I 07 = J 07 14 J 05 + 140 J 03 / 3 280 J 01 / 9 = 12.857 , I 08 = J 08 56 J 06 / 3 + 280 J 04 / 3 1120 J 02 / 3 + 560 γ 0 / 27 = 12.077 , I 09 = J 09 24 J 07 + 168 J 05 1120 J 03 + 560 J 01 / 3 = 28.278 .
By Note 2, p ˜ r k of Example 7 of [3], symmetry, and (66),
b 13 = 5 H 20 / 3 + H 11 , b 24 = 3 H 40 / 2 + 2 H 31 , b 26 = [ 7 H 50 + 12 H 41 + 19 H 32 ] / 9 , b 35 = [ 3 H 40 + 2 H 31 + H 22 ] / 5 , b 37 = [ 9 H 60 + 19 H 51 + 30 H 42 + 18 H 33 ] / 6 , b 39 = [ 44 H 80 + 83 H 71 + 206 H 62 + 159 H 53 ] / 27 , g 1 = 2 I 03 / 3 + b 13 σ 1 γ 1 , g 2 = I 04 / 2 + 2 I 06 / 9 b 24 b 26 , g 3 = 2 I 05 / 5 + I 07 / 3 + 8 I 09 / 27 b 35 b 37 b 39 .
Let us work through 2 numerical examples to get the conditional distribution to O ( n 2 ) . We build on Example 7 of [3]. By Theorem 5, if x 2 = 1 , then z = 1 / 2 ,
H 3 = 5 / 8 , H 4 = 17 / 16 , H 6 = 89 / 64 , H 5 = 41 / 32 , H 7 = 461 / 2 7 , H 9 = 6481 / 2 9 , C 1 = f 1 = 5 / 12 , p 24 = 17 / 32 , p 26 = 89 / 288 , f 2 = 121 / 144 , p 35 = 41 / 80 , p 37 = 461 / 384 , p 39 = 6481 / 1728 , f 3 = 52921 / 17280 , C 2 = 83 / 72 , C 3 = 39571 / 17280 .
We worked to 8 significant figures, but display less. If x = ( 1 , 1 ) , then
D 1 = 113 / 324 = 0.349 , D 2 = 120199 / 2 3 3 8 = 2.290 , D 3 = 8896102087 / 2 7 3 12 5 = 26.156 .
So to O ( n 2 ) the relative conditional density of (33) for n = 4 , 16 , 64 , is
( 1 , 1 , 1 ) ( 2 1 , 4 1 , 8 1 ) 0.349 + ( 4 1 , 16 1 , 64 1 ) 2.290 + ( 8 1 , 64 1 , 2 9 ) 26.156 = 1 0.174 + 0.573 + 3.269 1 0.087 + 0.143 + 0.409 1 0.044 + 0.036 + 0.051 ,
so that for n = 4 and 16 we can only include two terms, and for n = 64 , only three terms. We now give the 1st 3 g r , G r , needed by (37) for the conditional distribution to O ( n 2 ) . By (50), σ 2 = 3 / 2 , σ 2 = 1.225 . By (17), μ 1 2 = x 2 / 2 .
For x = ( 1 , 1 ) , μ 1 2 = 1 / 2 , and by ( 18 ) , v = 6 1 / 2 = 0.408 G 0 = g 0 = γ 0 = Φ ( v ) = 0.658 , γ 1 = ϕ ( v ) = 0.367 , γ 2 = 0.509 , γ 3 = 0.795 , γ 4 = 1.501 , γ 5 = 3.191 , γ 6 = 7.500 , γ 7 = 19.150 , γ 8 = 52.500 , γ 9 = 153.200 . K 0 s = ( 6 1 / 2 ) s γ s K 01 = 0.150 , K 02 = 0.0848 , K 03 = 0.0541 , K 04 = 0.0417 , K 05 = 0.0362 , K 06 = 0.03472 , K 07 = 0.0362 , K 08 = 0.0405 , K 09 = 0.0483 . So by ( 63 ) , J 01 = 0.479 , J 02 = 0.0203 , J 03 = 0.372 , J 04 = 0.0738 , J 05 = 0.0731 , J 06 = 0.0713 , J 07 = 0.0718 , J 08 = 0.441 , J 09 = 0.269 . So for x = ( 1 , 1 ) , b 13 = 13 / 27 = 0.481 , g 1 = 0.246 , b 24 = 47 / 54 , b 26 = 2726 / 2107 , g 2 = 0.696 , b 35 = 10 / 27 , b 37 = 9371 / 4374 , b 39 = 163806 / 59049 = 2.774 , g 3 = 3.375 . By ( 38 ) , G 1 = g 1 + C 1 g 0 = 0.0281 , G 2 = 0.040 , G 3 = 0.762 .
For example for n = 4 , 16 , 64 , to O ( n 2 ) , P ( X n 1 < 1 | X n 2 = 1 ) =
0.658 + 0.0141 0.01000 + 0.0952 , n = 4 , 0.658 + 0.00703 0.00250 + 0.0119 , n = 16 , 0.658 + 0.00351 0.000625 + 0.00149 , n = 64 ,
so that divergence begins with the 4th term.
If x 2 = 2 then z = 1 , H 3 = 1 / 2 , H 4 = 23 / 4 , H 6 = 23 / 8 , H 5 = 1 / 4 , H 7 = 29 / 8 , H 9 = 175 / 16 , C 1 = f 1 = 1 / 3 , p 24 = 23 / 4 , p 26 = 23 / 36 , f 2 = 161 / 72 , p 35 = 1 / 10 , p 37 = 29 / 24 , p 39 = 175 / 54 , f 3 = 2303 / 1080 = 2.132 , C 2 = 17 / 8 = 2.125 , C 3 = 733 / 1080 = 0.679 . If x = ( 2 , 2 ) , then D 1 = 37 / 81 = 0.457 , D 2 = 0.387 , D 3 = 13.313 .
So to O ( n 2 ) the relative conditional density of (33) for n = 4 , 16 , 64 , is
( 1 , 1 , 1 ) ( 2 1 , 4 1 , 8 1 ) 0.457 + ( 4 1 , 16 1 , 64 1 ) 0.387 + ( 8 1 , 64 1 , 2 9 ) 13.313 = 1 0.228 + 0.0969 + 1.664 1 0.114 + 0.0242 + 0.208 1 0.0571 + 0.00605 + 0.0260 , ,
so that we can only include three terms. Finally, we now give the 1st three g r , G r , needed by (37) for the conditional distribution to O ( n 2 ) .
For x = ( 2 , 2 ) , μ 1 2 = 1 , v = ( 2 / 3 ) 1 / 2 = 0.816 , G 0 = γ 0 = Φ ( v ) = 0.793 , γ 1 = ϕ ( v ) = 0.286 , γ 2 = 0.559 , γ 3 = 0.762 , γ 4 = 1.522 , γ 5 = 3.176 , γ 6 = 7.511 , γ 7 = 19.142 , γ 8 = 52.505 , γ 9 = 153.190 . K 0 s = ( 6 1 / 2 ) s γ s K 01 = 0.117 , K 02 = 0.0932 , K 03 = 0.0519 , K 04 = 0.0423 , K 05 = 0.0360 , K 06 = 0.0348 , K 07 = 0.0362 , K 08 = 0.0405 , K 09 = 0.0483 . So by ( 63 ) , J 01 = 0.910 , J 02 = 1.0028 , J 03 = 1.055 , J 04 = 1.097 , J 05 = 1.133 , J 06 = 1.168 , J 07 = 1.204 , J 08 = 1.249 , J 09 = 1.293 . I 03 = 0.764 , I 04 = 1.877 , I 05 = 3.877 , I 06 = 6.731 , I 07 = 6.263 , I 08 = 276.110 , I 09 = 848.735 , b 13 = 11 / 27 , g 1 = 0.605 , b 24 = 26 / 27 , b 26 = 1660 / 2187 , g 2 = 0.771 , b 35 = 138 / 405 , b 37 = 20 , 128 / 4374 , b 39 = 1 , 795 , 048 / 3 10 , g 3 = 224.802 . By ( 38 ) , G 1 = g 1 + C 1 g 0 = 0.0180 , G 2 = 2.463 , G 3 = 4.204 .
For example for n = 4 , 16 , 64 , to O ( n 2 ) , P ( X n 1 < 2 | X n 2 = 2 ) =
0.793 + 0.00902 0.616 + 0.525 , n = 4 , 0.793 + 0.00451 0.154 + 0.131 , n = 16 , 0.793 + 0.00226 0.0385 + 0.0164 , n = 64 ,
so that divergence begins with the 3rd term.
Example 3.
Conditioning when the distribution of w ^ is symmetric about w. Then for r odd, C r = D r = g r k = g r = 0 . By (33), the conditional density is
p n 1 2 ( x 1 ) = σ 1 ϕ ( v ) [ 1 + n 1 D 2 + O ( n 2 ) ] , where D 2 = p ˜ 2 ( x ) p 2 ( x 2 ) ,
for p ˜ 2 ( x ) of Example 1 of [3], H k of (56), and
p 2 ( x 2 ) = k 022 H 2 / 2 + k 043 H 4 / 24 .
By (52), the conditional distribution of X n 1 2 is
Φ ( v ) + n 1 G 2 + O ( n 2 ) , where G 2 = g 2 p 2 ( x 2 ) Φ ( v ) , g 2 = k = 2 , 4 [ P 2 ( 0 k ) I 2 ( 0 k ) b 2 k σ 1 ϕ ( v ) ] ,
for b 2 k of (68) and (69).
Example 4.
Discussions of pivotal statistics advocate using the distribution of a sample mean, given the sample variance. So q = 2 . Let w ^ 2 , w ^ 2 be the usual unbiased estimates of the mean and variance from a univariate random sample of size n from a distribution with rth cumulant κ r . So w 1 = κ 1 , w 2 = κ 2 . By the last 2 equations of Section 12.15 and (12.35)–(12.38) of [20], the cumulant coefficients needed for P ¯ r 1 k of (3) for r 3 , – the coefficients needed for the conditional density to O ( n 2 ) , in terms of ( i 1 j 1 i 2 j 2 ) = κ i 1 j 1 κ i 2 j 2 , are
k 201 = κ 2 , k 111 = κ 3 , k 021 = κ 4 + 2 κ 2 2 , V = κ 2 κ 3 κ 3 κ 4 + 2 κ 2 2 , k 101 = k 011 = 0 , k 302 = κ 3 , k 212 = k 122 = 0 , k 032 = ( 6 ) + 12 ( 24 ) + 4 ( 3 2 ) + 8 ( 2 3 ) , k 202 = k 112 = 0 , k 022 = 2 ( 2 2 ) , k 403 = ( 4 ) , k 313 = ( 5 ) , k 223 = k 133 = 0 , k 043 = ( 8 ) + 24 ( 26 ) + 32 ( 35 ) + 32 ( 4 2 ) + 144 ( 2 2 4 ) + 96 ( 23 2 ) + 48 ( 2 4 ) , k 102 = k 012 = k 303 = k 213 = k 123 = 0 , k 033 = 12 ( 24 ) + 16 ( 2 3 ) , k 504 = k 324 = k 234 = k 144 = 0 , k 414 = ( 6 ) , k 054 = ( 10 ) + 40 ( 28 ) + 80 ( 37 ) + 200 ( 46 ) + 96 ( 5 2 ) + 480 ( 2 2 6 ) + 1280 ( 235 ) + 1280 ( 24 2 ) + 960 ( 3 2 4 ) + 1920 ( 2 3 4 ) + 1920 ( 2 2 3 2 ) + 384 ( 2 5 ) .
(33) gives D r in terms of p ˜ r and p r , that is, in terms of p ˜ r k and p r k of (28) in terms of P r ( a b ) . In this example, many of these are 0. The non-zero P r ( a b ) are in order needed,
P 1 ( 30 ) = κ 3 / 6 , P 1 ( 03 ) = k 032 / 6 , P 2 ( 02 ) = κ 2 2 , P 2 ( 40 ) = κ 4 / 24 , P 2 ( 04 ) = k 043 / 24 , P 2 ( 32 ) = κ 5 / 6 , P 2 ( 60 ) = κ 3 2 / 72 , P 2 ( 06 ) = k 032 2 / 72 , P 2 ( 33 ) = κ 3 k 032 / 36 . P 3 ( 03 ) = k 033 / 6 . P 3 ( 05 ) = k 054 / 120 + k 022 k 032 / 12 . P 3 ( 70 ) = κ 3 κ 4 / 144 , P 3 ( 07 ) = k 032 k 042 / 144 , P 3 ( 62 ) = κ 3 k 313 / 36 , P 3 ( 43 ) = k 032 κ 4 / 144 , P 3 ( 34 ) = ( κ 3 k 043 + k 022 k 313 ) / 144 , P 3 ( 90 ) = κ 3 / 6 3 , P 3 ( 09 ) = ( k 032 / 6 ) 3 , P 3 ( 63 ) = 3 κ 3 2 k 032 / 6 3 , P 3 ( 36 ) = 3 κ 3 k 032 2 / 6 3 . So , p ˜ 11 = 0 , p ˜ 13 = P 1 ( 30 ) H 30 + P 1 ( 03 ) H 03 , p ˜ 22 = P 2 ( 02 ) H 02 , p ˜ 24 = P 2 ( 40 ) H 40 + P 2 ( 04 ) H 04 + P 2 ( 32 ) H 31 , p ˜ 26 = P 2 ( 60 ) H 60 + P 2 ( 06 ) H 06 + P 2 ( 33 ) H 33 , p ˜ 31 = 0 , p ˜ 33 = P 3 ( 30 ) H 30 + P 3 ( 03 ) H 03 , p ˜ 35 = P 3 ( 05 ) H 05 , p ˜ 37 = P 3 ( 70 ) H 70 + P 3 ( 2 7 ) H 07 + P 3 ( 62 ) H 61 + P 3 ( 43 ) H 43 + P 3 ( 34 ) H 34 + P 3 ( 25 ) H 25 + P 3 ( 16 ) H 16 + P 3 ( 07 ) H 07 , p ˜ 39 = P 3 ( 90 ) H 90 + P 3 ( 63 ) H 63 + P 3 ( 36 ) H 36 + P 3 ( 09 ) H 09 . Also , b 13 = P 1 ( 30 ) H 20 , b 22 = P 2 ( 20 ) H 10 + P 2 ( 11 ) H 01 , b 24 = P 2 ( 40 ) H 30 + P 2 ( 31 ) H 21 , b 26 = P 2 ( 60 ) H 50 + P 2 ( 33 ) H 23 , b 31 = 0 , b 33 = P 3 ( 30 ) H 20 , b 35 = 0 , b 37 = P 3 ( 70 ) H 60 + P 3 ( 61 ) H 51 + P 3 ( 43 ) H 33 + P 3 ( 34 ) H 24 , b 39 = P 3 ( 90 ) H 80 + P 3 ( 63 ) H 53 + P 3 ( 36 ) H 26 .
For r = 1 , 2 , 3 , p ˜ r ( x ) is now given by (13), p r ( x ) , and Section 2 of [3]. By (4) and (33), this gives the conditional density p n 1 2 ( x 1 ) to O ( n 2 ) . And (66) gives g r needed for the conditional distribution P n 1 2 ( x 1 ) to O ( n 2 ) .

5. Conclusions

Conditioning is a very useful and basic way to use correlated information to reduce the variability of an estimate. These results provide the means to obtain conditional regions and 1- or 2-sided conditional confidence intervals.
Section 4 gave the conditional density and distribution of X n 1 given X n 2 to O ( n 2 ) where X n 1 X n 2 is any partition of X n = n 1 / 2 ( w ^ w ) . The expansion (33) gave the conditional density of any multivariate standard estimate. Our main result, an explicit expansion for the conditional distribution (37) to O ( n 2 ) , is given in terms of the leading I ¯ 1 k of (42). These are given explicitly by Theorems 3.3 and 3.4.
When q 1 = q 2 = 1 , Theorem 4.1 simplified the conditional density expansion, and Theorem 7 gave a huge simplification, and the coefficients of the conditional distribution expansion in terms of I 0 k = I 2 k of Theorem 4.4.
Cumulant coefficients can also be used to obtain estimates of bias O ( n k ) for k 2 : see [21,22,23].

6. Discussion

Ref. [3] gave the density and distribution of X n = n 1 / 2 ( w ^ w ) to O ( n 2 ) , for w ^ R q any standard estimate, in terms of functions of the cumulant coefficients k ¯ j 1 r of (2), called the Edgeworth coefficients, P ¯ r 1 k .
Most estimates of interest are standard estimates, including smooth functions of sample moments, like the sample skewness, kurtosis, correlation, and any multivariate function of k-statistics. (These are unbiased estimates of cumulants and their products, the most common example being that for a variance.) Unbiased estimates are not needed for Edgeworth expansions, although this does simplify the Edgeworth coefficients, as seen in Examples 1, 2, and 4. However unbiased estimates are not available for most parameters or functions of them, such as the ratio of two means or variances, except for special cases of exponential families. Ref. [1] gave the cumulant coefficients for smooth functions of standard estimates.
A good approximation for the distribution of an estimate, is vital for accurate inference. It enables one to explore the distribution’s dependence on underlying parameters. Our analytic method avoids the need for simulation or jack-knife or bootstrap methods while providing greater accuracy than any of them. Ref. [10] used the Edgeworth expansion to show that the bootstrap gives accuracy to O ( n 1 ) . Ref. [24] said that “2nd order correctness usually cannot be bettered”. But this is not true using our analytic method. Simulation, while popular, can at best shine a light on behaviour, only when there is a small number of parameters, and only for limited values of their range.
Estimates based on a sample of independent, but not identically distributed random vectors, are also generally standard estimates. For example, for a univariate sample mean X ¯ = n 1 j = 1 n X j n where X j n has rth cumulant κ r j n , then κ r ( X ¯ ) = n 1 r κ r where κ r = n 1 j = 1 n κ r j n is the average rth cumulant. For some examples, see [2,25,26,27] The last is for a function of a weighted mean of complex random matrices.
For conditions for the validity of multivariate Edgeworth expansions, see [28] and its references, and Appendix C of [3].
While the use of Edgeworth-Cornish-Fisher expansions is widespread, few papers address how to deal with their divergence for small sample sizes. Refs. [29,30] avoided this question as it did not arise in their examples. In contrast we confronted this in Example 2, the examples of Withers (1984), and in Example 7 of [3].
We now turn to conditioning. Conditioning on w ^ 2 makes inference on w 1 more precise by reducing the covariance of the estimate. The covariance of w ^ 1 | w ^ 2 can be substantially less than that of w ^ 1 . See the references at the end of Section 1.
A conditional distribution by tilting, was first given by [31] up to O ( n 1 ) , for a bivariate sample mean. See Chapter 4 of [32], Compare [8]. Tilting (also known as small sample asympotics, or saddlepoint expansioins), was first used in statistics by [33]. He gave an approximation to the density of a sample mean, good for the whole line, not just in the region where the Central Limit Theorem approximation holds.
Future directions.
1. The results here give the first step for constructing confidence intervals and confidence regions of higher order accuracy. See [6,34]. What is needed next, is an application of [1] to obtain the cumulant coefficients of θ ^ i = V ^ i i 1 / 2 ( w ^ i w i ) , i = 1 , 2 , or those of θ ^ = V ^ 1 / 2 ( w ^ w ) . This should be straightforward.
2. When q 1 = 1 , our expansion for the conditional distribution of X n 1 2 of (15), can be inverted using the Lagrange Inversion Theorem, to give expansions for its percentiles. This should be straightforward. (The quantile expansions of [29] and Withers (1984) do not apply as Appendix A shows that conditional estimates of standard estimates are not standard estimates.)
3. Here we have only considered expansions about the normal. However expansions about other distributions can greatly reduce the number of terms by matching the leading bias coefficient. The framework for this is [2], building on [34]. For expansions about a matching gamma, see [35,36].
4. The results here can be extended to tilted (saddlepoint) expansions by applying the results of [2]. The tilted version of the multivariate distribution and density of a standard estimate are given by Corollaries 3, 4 there, and that of the conditional distribution and density follow from these. For the entangled gamma of Example 2, this requires solving a cubic. See also [37].
5. A possible alternative approach to finding the conditional distribution, is to use conditional cumulants, when these can be found. Section 6.2 of [38] uses conditional cumulants to give the conditional density of a sample mean to O ( n 3 / 2 ) . Section 5.6 of [39] gave formulas for the 1st 4 cumulants conditional on X 2 = x 2 only when X 1 and X 2 are uncorrelated. He says that this assumption can be removed, but gives no details how. That is unlikely to give an alternative to our approach, for as well as giving expansions for the first 3 conditional cumulants, Appendix A shows that the conditional estimate is not a standard estimate.
6. Lastly we discuss numerical computation. We have used [40] for our calculations. Its input is V 11 , V 12 , V 22 and y 1 , y 2 , - not V 11 , V 12 , V 22 and x 1 , x 2 . There is a function sub2(sb1,sb2) which takes as argument the two subscripts of mu, and returns the value. If global variables mu20, mu02, mu11 are symbolic variables (defined using sympy) then it returns the answer in terms of those, but if they are numeric then it returns a numeric answer. There is another function called biHermite(n, m, y1, y2) which takes the 2 subscripts of H. If y1 and y2 are symbolic, then it returns a symbolic answer, but if they are numeric it returns a numeric answer. A numerical example is given by Example 2, that is, for the case V 11 = 2 , V 12 = 1 , V 22 = 2 and x 1 = x 2 = 1 or x 1 = x 2 = 2 .
Similar software for numerical calculations for Theorems 5, 7 and 8 would be invaluable, as would software for applying the Lagrange Inversion Theorem. (We mention R-4.4.1 for Windows: dmvnorm for the density function of the multivariate normal, mvtnorm for the multivariate normal, qmvnorm for quantiles, and rmvnorm to generate multivariate normal variables.) On bivariate Hermite polynomials, see cran.r-project.org/web/packages/calculus/vignettes/hermite.html accessed on 20 December 2024.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Conditional Moments

Here we give expansions for the conditional moments of X n 1 2 of (15), in terms of the conditional normal moments of X 1 2 , of (15). And we show that
w ^ 1 2 = w ^ 1 | ( X n 2 = x 2 )
is neither a standard estimate of w 1 , nor a Type B estimate, as defined below.
Consider the case q 1 = 1 . By (19),
X 1 2 = μ + σ N , where μ = μ 1 2 , σ 2 = V 1 2 of ( 18 ) , N N 1 ( 0 , 1 ) . Set M s = E X 1 2 s . So M 1 = μ , M 2 = μ 2 + σ 2 , M 3 = μ 3 + 3 μ σ 2 , M 4 = μ 4 + 6 μ 2 σ 4 + 3 σ 4 . For r = 0 , 1 , , set A r B r = a + b = r A a B b , A r B r C r = a + b + c = r A a B b C c .
Non-central moments.
Theorem A1.
Take C r , D r of Theorem 3.1. Set p ˜ 0 ( x ) = 1 . For s > 0 , the sth conditional moment of X n 1 2 of (15) about Φ 1 2 ( x 1 ) of (24), has the expansion
m n s = E X n 1 2 s = n s / 2 E ( w ^ 1 2 w 1 ) s r = 0 n r / 2 G r s
where G r s = C r g r s , g r s = E X 1 2 s P r , and P r = p ˜ r ( x ) at x 1 = X 1 2 . So , G 0 s = g 0 s = M s = E X 1 2 s .
g r s = k = 1 3 r [ g r k s : k r even ] , for r 1 , where for p ˜ r k of ( 6 ) ,
g r k s = E X 1 2 s P ˜ r k = P ¯ r 1 k I ¯ s 1 k , P ˜ r k = p ˜ r k at x 1 = X 1 2 ,
and for 1 i 1 , , i k q , I ¯ s 1 k = I s i 1 i k = E X 1 2 s H ¯ 1 k ( X 1 2 ) ,
for H ¯ 1 k ( X 1 2 ) = H ¯ 1 k at x 1 = X 1 2 .
Proof. 
This follows from Theorem 1. □
So by (A3), the $s$th conditional moment of $X_n^{1|2}$ is
$$m_n^s = M_s + n^{-1/2} G_1^s + n^{-1} G_2^s + O(n^{-3/2}),$$
where
$$G_1^s = C_1 M_s + g_1^s, \quad G_2^s = C_2 M_s + C_1 g_1^s + g_2^s, \quad g_1^s = g_{11}^s + g_{13}^s, \quad g_2^s = g_{22}^s + g_{24}^s + g_{26}^s,$$
of (A5) and (A6). For example,
$$g_{11}^s = \bar{k}_1^1\, E\,[X^{1|2}]^s \bar{H}_1(X^{1|2}), \quad \text{and} \quad g_{13}^s = \bar{k}_2^{1\cdots 3}\, E\,[X^{1|2}]^s \bar{H}_{1\cdots 3}(X^{1|2})/6. \quad (A5)$$
So $\hat{w}^{1|2}$ of (A1) is not a standard estimate: by (A3), the expansion for its mean is a power series in $n^{-1/2}$, not $n^{-1}$. Is it a Type B estimate? Type B estimates are defined like standard estimates, but with cumulant expansions that are series in $n^{-1/2}$, not $n^{-1}$. We shall see. Take $q_2 = q_1 = 1$. By Theorem 6, for $P_r(ab)$ of (3), $g_{rk}^s$ of (A4) is given by
$$g_{rk}^s = \sum_{b=0}^{k} P_r(k-b, b)\, I_{k-b,b}^s, \quad (A6)$$
where $I_{ab}^s = E\,[X^{1|2}]^s H_{ab}(X^{1|2})$, and $H_{ab}(X^{1|2}) = H_{ab}$ at $x_1 = X^{1|2}$. For example,
$$g_{r1}^s = P_r(10) I_{10}^s + P_r(01) I_{01}^s, \qquad g_{r3}^s = P_r(30) I_{30}^s + P_r(21) I_{21}^s + P_r(12) I_{12}^s + P_r(03) I_{03}^s.$$
Finding the $I_{ab}^s$.
The $H_{ab}$ needed are given in Appendix B of [3] in terms of $y = V^{-1}x$:
$$y_i = V^{i1} x_1 + V^{i2} x_2, \quad \text{so that} \quad y_1 = \mu_{20} x_1 + \mu_{11} x_2, \quad y_2 = \mu_{11} x_1 + \mu_{02} x_2,$$
writing $\mu_{20} = V^{11}$, $\mu_{11} = V^{12}$, $\mu_{02} = V^{22}$ for the elements of $V^{-1}$. For example,
$$H_{10} = y_1 = \mu_{20} x_1 + \mu_{11} x_2, \qquad H_{01} = y_2 = \mu_{11} x_1 + \mu_{02} x_2,$$
$$H_{30} = y_1^3 - 3 y_1 \mu_{20} = (\mu_{20} x_1 + \mu_{11} x_2)^3 - 3(\mu_{20} x_1 + \mu_{11} x_2)\mu_{20},$$
$$H_{03} = y_2^3 - 3 y_2 \mu_{02} = (\mu_{11} x_1 + \mu_{02} x_2)^3 - 3(\mu_{11} x_1 + \mu_{02} x_2)\mu_{02},$$
$$H_{21} = y_2(y_1^2 - \mu_{20}) - 2 y_1 \mu_{11}, \qquad H_{12} = y_1(y_2^2 - \mu_{02}) - 2 y_2 \mu_{11}.$$
Let us write $H_{ab}$ in terms of $M_s$ of (A2), as
$$H_{ab} = \sum_{k=0}^{a+b} H_{ab,k}\, x_1^k. \quad \text{Then} \quad I_{ab}^s = \sum_{k=0}^{a+b} [\, H_{ab,k} M_{s+k} : s+k \text{ even} \,].$$
So $I_{10}^s = H_{10,0} M_s + H_{10,1} M_{s+1}$ and $I_{01}^s = H_{01,0} M_s + H_{01,1} M_{s+1}$: for odd $s$,
$$I_{10}^s = H_{10,1} M_{s+1} = \mu_{20} M_{s+1}, \qquad I_{01}^s = H_{01,1} M_{s+1} = \mu_{11} M_{s+1},$$
and for even $s$,
$$I_{10}^s = H_{10,0} M_s = \mu_{11} x_2 M_s, \qquad I_{01}^s = H_{01,0} M_s = \mu_{02} x_2 M_s.$$
So
$$H_{10,0} = \mu_{11} x_2, \quad H_{10,1} = \mu_{20}, \quad H_{01,0} = \mu_{02} x_2, \quad H_{01,1} = \mu_{11},$$
$$H_{30,0} = (\mu_{11} x_2)^3 - 3 \mu_{11} x_2\, \mu_{20}, \quad H_{30,1} = 3\mu_{20}[(\mu_{11} x_2)^2 - \mu_{20}], \quad H_{30,2} = 3\mu_{20}^2\, \mu_{11} x_2, \quad H_{30,3} = \mu_{20}^3,$$
$$H_{03,0} = (\mu_{02} x_2)^3 - 3 \mu_{02}^2 x_2, \quad H_{03,1} = 3\mu_{11}[(\mu_{02} x_2)^2 - \mu_{02}], \quad H_{03,2} = 3\mu_{11}^2\, \mu_{02} x_2, \quad H_{03,3} = \mu_{11}^3,$$
$$H_{21,0} = \mu_{02} x_2[(\mu_{11} x_2)^2 - \mu_{20}] - 2\mu_{11}^2 x_2, \quad H_{21,1} = \mu_{11}[(\mu_{11} x_2)^2 - \mu_{20}] + 2\mu_{20}\mu_{11}(\mu_{02} x_2^2 - 1),$$
$$H_{21,2} = \mu_{20}\mu_{22} x_2 \ \text{ since } \mu_{22} = \mu_{20}\mu_{02} + 2\mu_{11}^2, \quad H_{21,3} = \mu_{11}\mu_{20}^2,$$
$$H_{12,0} = \mu_{11} x_2[(\mu_{02} x_2)^2 - \mu_{02}] - 2\mu_{02}\mu_{11} x_2, \quad H_{12,1} = \mu_{20}[(\mu_{02} x_2)^2 - \mu_{02}] + 2\mu_{11}^2(\mu_{02} x_2^2 - 1),$$
$$H_{12,2} = \mu_{11}(2\mu_{20}\mu_{02} + \mu_{11}^2) x_2, \quad H_{12,3} = \mu_{20}\mu_{11}^2.$$
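These coefficients can be checked mechanically. The following sympy sketch (our own code) rebuilds $H_{21}$ and $H_{12}$ from the formulas above and reads off the coefficients of $x_1^2$, confirming $H_{21,2}$ and $H_{12,2}$:

```python
# Check two entries of the H_{ab,k} table by expanding H_21 and H_12
# (formulas above) and extracting the coefficient of x1^2.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
mu20, mu11, mu02 = sp.symbols('mu20 mu11 mu02')

y1 = mu20*x1 + mu11*x2
y2 = mu11*x1 + mu02*x2
H21 = sp.expand(y2*(y1**2 - mu20) - 2*y1*mu11)
H12 = sp.expand(y1*(y2**2 - mu02) - 2*y2*mu11)

mu22 = mu20*mu02 + 2*mu11**2
# H_{21,2} = mu20*mu22*x2
assert sp.expand(H21.coeff(x1, 2) - mu20*mu22*x2) == 0
# H_{12,2} = mu11*(2*mu20*mu02 + mu11**2)*x2
assert sp.expand(H12.coeff(x1, 2) - mu11*(2*mu20*mu02 + mu11**2)*x2) == 0
```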
To get a general formula for $H_{ab,k}$, set
$$c_1 = V^{11} x_1, \quad c_2 = iV^{11} X_1, \quad c_3 = V^{12} x_2, \quad c_4 = iV^{12} X_2,$$
so that $y_1 + iY_1 = c_1 + c_2$ and $y_2 + iY_2 = c_3 + c_4$. Then
$$H_{ab} = E(c_1 + c_2)^a (c_3 + c_4)^b = \sum_{j=0}^{a} \binom{a}{j} c_1^{a-j} \sum_{k=0}^{b} \binom{b}{k} c_3^{b-k} C_{jk},$$
where
$$C_{jk} = E\, c_2^j c_4^k = i^{j+k} (V^{11})^j (V^{12})^k \mu'_{jk}, \quad \mu'_{jk} = E\, X_1^j X_2^k.$$
So
$$H_{ab,a-j} = \binom{a}{j} (V^{11})^{a-j} \sum_{k=0}^{b} \binom{b}{k} c_3^{b-k} C_{jk},$$
where $C_{jk} = 0$ if $j+k$ is odd. Here $\mu'_{jk}$ is just $\mu_{jk}$ of Appendix B of [3] with $V$ replaced by $V^{-1}$.
Central moments. Set $m_s(X) = EX^s$ and $\mu_s(X) = E(X - EX)^s$. For $m_s = m_n^s$ of (A3), set
$$\mu_s = \mu_s^{1|2} = \mu_s(X_n^{1|2}) = n^{s/2} \mu_s(\hat{w}^{1|2}).$$
So by (A3), in the convolution notation defined with (A2),
$$\mu_2 = m_2 - m_1^2 \approx \sum_{r=0}^{\infty} n^{-r/2} \mu_{2r}, \quad \text{where } \mu_{2r} = G_r^2 - G_r^1 G_r^1,$$
and
$$\mu_3 = m_3 - 3 m_1 m_2 + 2 m_1^3 \approx \sum_{r=0}^{\infty} n^{-r/2} \mu_{3r}, \quad \text{where } \mu_{3r} = G_r^3 - 3 G_r^1 G_r^2 + 2 G_r^1 G_r^1 G_r^1.$$
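In this convolution notation the coefficients $\mu_{2r}$ and $\mu_{3r}$ are short symbolic computations; a minimal sketch with generic symbols for the $G_a^s$ (our own code):

```python
# Sketch of the convolution notation: given series coefficients
# G[s] = [G_0^s, G_1^s, ...], compute mu_{2r} = G_r^2 - (G^1 G^1)_r and
# mu_{3r} = G_r^3 - 3 (G^1 G^2)_r + 2 (G^1 G^1 G^1)_r.
import sympy as sp

R = 3  # number of coefficients retained
G = {s: [sp.Symbol(f'G{a}^{s}') for a in range(R)] for s in (1, 2, 3)}

def conv(A, B):
    """(A B)_r = sum over a+b=r of A_a B_b, truncated to R terms."""
    return [sum(A[a] * B[r - a] for a in range(r + 1)) for r in range(R)]

G11 = conv(G[1], G[1])
G12 = conv(G[1], G[2])
G111 = conv(G11, G[1])
mu2 = [G[2][r] - G11[r] for r in range(R)]
mu3 = [G[3][r] - 3*G12[r] + 2*G111[r] for r in range(R)]
print(mu2[1])   # G_1^2 - 2*G_0^1*G_1^1
```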
Is the conditional estimate $\hat{w}^{1|2}$ a Type B estimate? This requires its $r$th cumulant to have magnitude $O(n^{1-r})$ for $r \geq 1$. This is true for $r = 1$ and 2 but not for $r = 3$, as $\mu_r(\hat{w}^{1|2})$ has magnitude $O(n^{-r/2})$, since $\mu_s^{1|2} = O(1)$.

References

1. Withers, C.S. 5th-Order Multivariate Edgeworth Expansions for Parametric Estimates. Mathematics 2024, 12, 905. Available online: https://www.mdpi.com/2227-7390/12/6/905/pdf (accessed on 20 December 2024).
2. Withers, C.S.; Nadarajah, S. Tilted Edgeworth expansions for asymptotically normal vectors. Ann. Inst. Stat. Math. 2010, 62, 1113–1142.
3. Withers, C.S. Edgeworth coefficients for standard multivariate estimates. Axioms 2025, 14, 632.
4. Shenton, L.R.; Bowman, K.O. Maximum Likelihood Estimation in Small Samples; Griffin: London, UK, 1977.
5. Withers, C.S.; Nadarajah, S. Asymptotic properties of M-estimators in linear and nonlinear multivariate regression models. Metrika 2014, 77, 647–673.
6. Withers, C.S. Accurate confidence intervals when nuisance parameters are present. Commun. Stat.-Theory Methods 1989, 18, 4229–4259.
7. Barndorff-Nielsen, O.E.; Cox, D.R. Inference and Asymptotics; Chapman and Hall: London, UK, 1994.
8. Barndorff-Nielsen, O.E.; Cox, D.R. Asymptotic Techniques for Use in Statistics; Chapman and Hall: London, UK, 1989.
9. Bhattacharya, R.N.; Rao, R.R. Normal Approximation and Asymptotic Expansions; SIAM: Philadelphia, PA, USA, 2010.
10. Hall, P. The Bootstrap and Edgeworth Expansion; Springer: New York, NY, USA, 1992.
11. Booth, J.; Hall, P.; Wood, A. Bootstrap estimation of conditional distributions. Ann. Stat. 1992, 20, 1594–1610.
12. DiCiccio, T.J.; Martin, M.A.; Young, G.A. Analytical approximations to conditional distribution functions. Biometrika 1993, 80, 781–790.
13. Hansen, B.E. Autoregressive conditional density estimation. Int. Econ. Rev. 1994, 35, 705–730.
14. Klüppelberg, C.; Seifert, M.I. Explicit results on conditional distributions of generalized exponential mixtures. J. Appl. Probab. 2020, 57, 760–774.
15. Moreira, M.J. A conditional likelihood ratio test for structural models. Econometrica 2003, 71, 1027–1048.
16. Pfanzagl, J. Conditional distributions as derivatives. Ann. Probab. 1979, 7, 1046–1050.
17. Withers, C.S.; Nadarajah, S. Charlier and Edgeworth expansions via Bell polynomials. Probab. Math. Stat. 2009, 29, 271–280.
18. Anderson, T.W. An Introduction to Multivariate Statistical Analysis; John Wiley: New York, NY, USA, 1958.
19. Comtet, L. Advanced Combinatorics; Reidel: Dordrecht, The Netherlands, 1974.
20. Stuart, A.; Ord, K. Kendall's Advanced Theory of Statistics, 5th ed.; Griffin: London, UK, 1991; Volume 2.
21. Withers, C.S.; Nadarajah, S. Nonparametric estimates of low bias. REVSTAT Stat. J. 2012, 10, 229–283.
22. Withers, C.S.; Nadarajah, S. Bias reduction: The delta method versus the jackknife and the bootstrap. Pak. J. Stat. 2014, 30, 143–151.
23. Withers, C.S.; Nadarajah, S. Bias reduction for standard and extreme estimates. Commun. Stat.-Simul. Comput. 2023, 52, 1264–1277.
24. Hall, P. Rejoinder: Theoretical comparison of bootstrap confidence intervals. Ann. Stat. 1988, 16, 981–985.
25. Skovgaard, I.M. Edgeworth expansions of the distributions of maximum likelihood estimators in the general (non i.i.d.) case. Scand. J. Stat. 1981, 8, 227–236.
26. Skovgaard, I.M. Transformation of an Edgeworth expansion by a sequence of smooth functions. Scand. J. Stat. 1981, 8, 207–217.
27. Withers, C.S.; Nadarajah, S. The distribution and percentiles of channel capacity for multiple arrays. Sadhana 2020, 45, 1–25.
28. Skovgaard, I.M. On multivariate Edgeworth expansions. Int. Stat. Rev. 1986, 54, 169–186.
29. Cornish, E.A.; Fisher, R.A. Moments and cumulants in the specification of distributions. Rev. Inst. Int. Stat. 1937, 5, 307–322.
30. Fisher, R.A.; Cornish, E.A. The percentile points of distributions having known cumulants. Technometrics 1960, 2, 209–225.
31. Skovgaard, I.M. Saddlepoint expansions for conditional distributions. J. Appl. Probab. 1987, 24, 875–887.
32. Butler, R.W. Saddlepoint Approximations with Applications; Cambridge University Press: Cambridge, UK, 2007; pp. 107–144.
33. Daniels, H.E. Saddlepoint approximations in statistics. Ann. Math. Stat. 1954, 25, 631–650.
34. Hill, G.W.; Davis, A.W. Generalised asymptotic expansions of Cornish-Fisher type. Ann. Math. Stat. 1968, 39, 1264–1273.
35. Withers, C.S.; Nadarajah, S. Generalized Cornish-Fisher expansions. Bull. Braz. Math. Soc. New Ser. 2011, 42, 213–242.
36. Withers, C.S.; Nadarajah, S. Expansions about the gamma for the distribution and quantiles of a standard estimate. Methodol. Comput. Appl. Probab. 2014, 16, 693–713.
37. Jing, B.; Robinson, J. Saddlepoint approximations for marginal and conditional probabilities of transformed variables. Ann. Stat. 1994, 22, 1115–1132.
38. McCullagh, P. Tensor notation and cumulants of polynomials. Biometrika 1984, 71, 461–476.
39. McCullagh, P. Tensor Methods in Statistics; Chapman and Hall: London, UK, 1987.
40. Teal, P. A Code to Calculate Bivariate Hermite Polynomials. 2024. Available online: https://github.com/paultnz/bihermite/tree/main (accessed on 20 December 2024).
Figure 1. $x_1(x_2, v) = x_2/2 + (3/2)^{1/2} v$ of (25) for $\Phi(v) = 0.01, 0.1, 0.5, 0.9, 0.99$. Courtesy of Dr Paul Teal.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
