Article

Testing Group Symmetry of a Multivariate Distribution

by
Lyudmila Sakhanenko
Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824-1027, USA
Symmetry 2009, 1(2), 180-200; https://doi.org/10.3390/sym1020180
Submission received: 15 September 2009 / Accepted: 24 November 2009 / Published: 26 November 2009

Abstract:
We propose and study a general class of tests for group symmetry of a multivariate distribution, which encompasses different types of symmetry, such as ellipsoidal and permutation symmetries, among others. Our approach is based on supremum norms of special empirical processes combined with bootstrap. We show that these tests are consistent against any fixed alternative. This work generalizes the methodology of Koltchinskii and Sakhanenko [7], developed for ellipsoidal symmetry, to the case of group symmetry. It also provides a unified approach to testing different types of symmetry of a multivariate distribution.
Classification:
MSC 62G10, 60F05, 60F99

1. Introduction

Let $\mathcal{S}$ be a compact group of linear transformations (operators) from $\mathbb{R}^d$ to $\mathbb{R}^d$. A Borel probability measure $P$ on $\mathbb{R}^d$ is called $\mathcal{S}$-symmetric if and only if there exist an affine nonsingular transformation $A_0$ from $\mathbb{R}^d$ onto $\mathbb{R}^d$ and an $\mathcal{S}$-invariant Borel probability measure $\Pi_0$ such that $P = \Pi_0 A_0$. In other words, if $X$ is a random vector with the distribution $P$, then there exists an affine nonsingular transformation $A_0$ such that the random vector $Z = A_0 X$ is $\mathcal{S}$-invariant, which in its turn means that $Z \stackrel{d}{=} SZ$ for all transformations $S$ from the group $\mathcal{S}$. Clearly, an affine transformation $A_0$ is not unique, since one could take instead $S A_0$ for an arbitrary transformation $S$ from the group $\mathcal{S}$. On the other hand, one could always fix $A_0$ by normalizing it in a suitable way.
Obviously, we can define $\Pi_0 = \Pi_{0,P}$ as the distribution of the random vector $Z = A_0 X$. The couple $(A_0, \Pi_0)$ will be called the parameters (specifiers) of the $\mathcal{S}$-symmetric probability measure $P$. We denote by $\mathcal{E}(\mathbb{R}^d)$ the set of all $\mathcal{S}$-symmetric distributions on $\mathbb{R}^d$.
We are interested in the following problem. Given an independent identically distributed (i.i.d.) sample $(X_1, \ldots, X_n)$ from the distribution $P$, defined on the probability space $(\Omega, \Sigma, \mathbb{P})$, construct and study tests for $\mathcal{S}$-symmetry of the distribution $P$.
Our group approach unifies the theory of different tests for different types of symmetry. Below we give examples of types of symmetry $\mathcal{S}$ that are common in the literature, together with corresponding choices of $A_0$.
Example 1.
Let $\mathcal{S} = \{S : S(x_1, \ldots, x_d) = (\pm x_1, \ldots, \pm x_d)\}$. Then an $\mathcal{S}$-invariant probability measure $\Pi_0$ is a so-called sign-change symmetric measure. The probability measure $P$ is $\mathcal{S}$-symmetric if there exists an affine transformation $A_0$ of the form $A_0 x = x - \theta_0$ with some $\theta_0 \in \mathbb{R}^d$ such that $P = \Pi_0 A_0$. In this case, $P$ is often called a diagonally or reflectively symmetric measure.
Let $|x|$ denote the Euclidean norm of a vector $x \in \mathbb{R}^d$. If $E|X| < +\infty$, one can define $\theta_0 = \theta_0(P) := EX = \int_{\mathbb{R}^d} x\, P(dx)$. Obviously, the parameter $\theta_0$ and the transformation $A_0$ are uniquely defined for any $P$ such that the corresponding $X$ is integrable (not necessarily $\mathcal{S}$-symmetric). In the same generality we can define $\Pi_0$ as the distribution of the random vector $Z = A_0 X$.
Example 2.
Let $\mathcal{S}$ be the group of all orthogonal transformations in $\mathbb{R}^d$. Then an $\mathcal{S}$-symmetric probability measure $P$ is often called ellipsoidally symmetric, elliptically symmetric, or elliptically contoured, and $\Pi_0$ is a spherically symmetric measure. In this case, if $E|X|^2 < +\infty$, one can define $\theta_0 = \theta_0(P)$ as the mean $EX = \int_{\mathbb{R}^d} x\, P(dx)$ and $V_0 = V_0(P)$ as the square root of the covariance operator of $X$, so that $A_0 x = A_0(P) x = V_0^{-1}(x - \theta_0)$ for any $x$ from $\mathbb{R}^d$. Define $\Pi_0$ as the distribution of the random vector $Z = A_0 X$. As in the previous case, these parameters are defined for any $P$ such that the corresponding $X$ is square-integrable (not necessarily $\mathcal{S}$-symmetric).
We refer to the papers [1,2,3,4,5,6,7,8,9] for results on ellipsoidal and spherical symmetry testing.
Example 3.
Let $d = 2$ and let $\mathcal{S}_k$ be the group of all transformations mapping a regular polygon with $k$ vertices centered at $0$ into itself. Clearly, $\mathcal{S}_k$ is a subgroup of the group of all orthogonal transformations. Thus, an affine transformation $A_0$ can be fixed in the same way as in Example 2.
Example 4.
Let $\mathcal{S}$ be the group of all reflections about the hyperplanes $\{x \in \mathbb{R}^d : x_k = x_l\}$, $k, l = 1, \ldots, d$. Then for each $S \in \mathcal{S}$ there exists a permutation $(i_1, \ldots, i_d)$ such that $S(x_1, \ldots, x_d) = (x_{i_1}, \ldots, x_{i_d})$ for all $x \in \mathbb{R}^d$. In this case an $\mathcal{S}$-invariant probability measure $\Pi_0$ is called a permutation symmetric measure.
As in Examples 2 and 3 we can define $A_0 x = A_0(P) x = V_0^{-1}(x - \theta_0)$ for any $x$ from $\mathbb{R}^d$, where $\theta_0 = \theta_0(P) := EX = \int_{\mathbb{R}^d} x\, P(dx)$ and $V_0 = V_0(P)$ is the square root of the covariance operator of $X$, provided $E|X|^2 < +\infty$ (see also [10,11]).
Tests for symmetry of a multivariate distribution play an important role in statistics and in various fields of science. To name a few, in finance theory the log-returns of assets are assumed to be ellipsoidally symmetric. In genetics, gene expression values are assumed to be diagonally symmetrically distributed. In image analysis, components are assumed to be spherically symmetric. In linear programming, the distribution of feasible solutions is assumed to be permutation symmetric. In statistics, the sliced inverse regression method of Li [12] works for ellipsoidally symmetric distributions. Also, since tests for normality extend to tests for ellipsoidal symmetry, any research field that employs multivariate analysis based on a normality assumption can benefit from relaxing it to an ellipsoidal symmetry assumption. So symmetry tests are clearly needed in applications. See [13] for a detailed survey on the use of symmetry in various scientific fields.
The rest of the paper is organized as follows. We introduce notation and construct the test statistics, with examples, in Section 2. The main results and the bootstrapped test statistics are given in Section 3, followed by a detailed example in Section 4. The proofs are in Section 5. Closing remarks are in Section 6, followed by technical details in the Appendix.

2. Notations and Preliminaries

Let $m$ denote the uniform distribution (the normalized Haar measure) on the group $\mathcal{S}$. Given a bounded Borel function $f$ on $\mathbb{R}^d$, we define
$$ mf(y) := \int_{\mathcal{S}} f(Sy)\, m(dS), \qquad y \in \mathbb{R}^d. $$
It is easy to check that for an $\mathcal{S}$-symmetric distribution $P$ with specifiers $(A_0, \Pi_0)$ we have
$$ \int_{\mathbb{R}^d} f(A_0 x)\, P(dx) = \int_{\mathbb{R}^d} mf(y)\, \Pi_0(dy) \qquad (2) $$
for any bounded Borel function $f$. Indeed, $\int_{\mathbb{R}^d} f(z)\, \Pi_0(dz) = \int_{\mathbb{R}^d} f(Sy)\, \Pi_0(dy)$ for any $S \in \mathcal{S}$. Since $\mathcal{S}$ is a compact group, one can integrate this equality over $\mathcal{S}$ with respect to the uniform measure. Thus, $\int_{\mathbb{R}^d} \int_{\mathcal{S}} f(Sy)\, m(dS)\, \Pi_0(dy) = \int_{\mathbb{R}^d} mf(y)\, \Pi_0(dy)$ implies (2).
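As a purely numerical illustration (not part of the paper's development), the group average $mf$ can be evaluated exactly when $\mathcal{S}$ is finite and by Monte Carlo when it is continuous; for the orthogonal group of Example 2, averaging over $\mathcal{S}$ amounts to averaging over the sphere of radius $|y|$. The helper names below are hypothetical and numpy is assumed.

```python
# Sketch: numerical evaluation of mf(y) = \int f(Sy) m(dS) for two of the groups above.
import numpy as np
from itertools import product

def mf_sign_change(f, y):
    """Exact average over the finite sign-change group of Example 1 (2^d elements)."""
    d = len(y)
    return np.mean([f(np.asarray(eps) * y) for eps in product([-1.0, 1.0], repeat=d)])

def mf_orthogonal(f, y, n_mc=2000, seed=0):
    """Monte Carlo average over the Haar measure of the orthogonal group (Example 2):
    equivalent to averaging f over uniformly distributed points on the sphere |z| = |y|."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_mc, len(y)))
    u /= np.linalg.norm(u, axis=1, keepdims=True)      # uniform directions on S^{d-1}
    return float(np.mean([f(np.linalg.norm(y) * v) for v in u]))
```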
As a result, if a class $\mathcal{F}$ characterizes the distribution, i.e.
$$ \int_{\mathbb{R}^d} f\, dQ_1 = \int_{\mathbb{R}^d} f\, dQ_2 \ \text{ for all } f \in \mathcal{F} $$
implies that $Q_1 = Q_2$, then $P$ is $\mathcal{S}$-symmetric if and only if (2) holds for all $f \in \mathcal{F}$. In general, we call $P$ an $\mathcal{F}$-asymmetric distribution if and only if there exists a function $f \in \mathcal{F}$ such that (2) does not hold. This observation is the key idea behind the tests that we construct and study. Naturally, the class $\mathcal{F}$ should be rich enough and possess good properties for further analysis. Let us describe it. Let $\mathcal{F}$ be a semialgebraic subgraph class as introduced in [7]. Basically, for a function from such a class, its subgraph can be constructed as a union of intersections of a finite number of subgraphs of polynomials of finite degree in $\mathbb{R}^d$. The same should be true for a product of two functions from such a class. For instance, one can use polynomials of bounded degree or trigonometric functions of bounded frequency as $\mathcal{F}$. The precise definition can be found in the Appendix. We will provide a few examples later on.
Let $P_n$ be the empirical distribution based on the sample $(X_1, \ldots, X_n)$. Assume in what follows that $A_n = A_n(P)$ is an $n^{1/2}$-consistent estimator of $A_0 = A_0(P)$ and, furthermore, that there exists a function $\gamma_A : \mathbb{R}^d \to \mathbb{R}$ with $E\gamma_A^2(X) < +\infty$ such that, as $n \to \infty$,
$$ n^{1/2}(A_n - A_0) = n^{1/2} \int_{\mathbb{R}^d} \gamma_A(x)\, d(P_n - P) + o_P(1). $$
For example, assuming $E|X|^2 < +\infty$, let us define $\theta_n$ and $V_n$ as
$$ \theta_n := \theta_n(X_1, \ldots, X_n) := \bar{X}_n := n^{-1} \sum_{i=1}^n X_i $$
and
$$ V_n^2 := V_n^2(X_1, \ldots, X_n) := n^{-1} \sum_{i=1}^n (X_i - \bar{X}_n)(X_i - \bar{X}_n)^T, $$
where all vectors are columns and the superscript $T$ denotes transposition. Then one can define $A_n x = A_n(X_1, \ldots, X_n) x = V_n^{-1}(x - \theta_n)$ in Examples 2–4 and $A_n x = x - \theta_n$ in Example 1 for any $x \in \mathbb{R}^d$. Under the condition $E|X|^4 < +\infty$, $V_n$ is an $n^{1/2}$-consistent estimator of $V_0$. Weaker moment assumptions on $P$ can be imposed if other statistics are used to estimate $\theta_0$ and $V_0$, such as a sample median and an M-estimator of the covariance matrix; see, e.g., [14].
The scaled residuals of the observations $(X_1, \ldots, X_n)$ are defined as
$$ Z_j := Z_{j,n} := A_n X_j, \qquad j = 1, \ldots, n. $$
Let $\Pi_n$ denote the empirical distribution based on the sample $(Z_1, \ldots, Z_n)$.
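For concreteness, a minimal sketch of the moment-based choice of $A_n$ and of the scaled residuals (numpy and scipy assumed; the function name is illustrative):

```python
# Sketch: theta_n, V_n and the scaled residuals Z_j = V_n^{-1}(X_j - theta_n).
import numpy as np
from scipy.linalg import sqrtm

def scaled_residuals(X):
    """X: (n, d) array of observations; returns (theta_n, V_n, Z)."""
    theta_n = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)       # n^{-1} sum (X_i - Xbar)(X_i - Xbar)^T
    V_n = np.real(sqrtm(S))                      # square root of the sample covariance
    Z = np.linalg.solve(V_n, (X - theta_n).T).T  # residuals A_n X_j
    return theta_n, V_n, Z
```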
Our approach to the problem of testing for $\mathcal{S}$-symmetry is to use the sup-norms of the stochastic process
$$ \xi_n(f) := n^{1/2} \Big( \int_{\mathbb{R}^d} f(A_n x)\, P_n(dx) - \int_{\mathbb{R}^d} mf(y)\, d\Pi_n(y) \Big) = n^{-1/2} \sum_{j=1}^n \big( f(Z_j) - mf(Z_j) \big), \qquad f \in \mathcal{F}, $$
as test statistics:
$$ T_n(\mathcal{F}) := \sup_{f \in \mathcal{F}} |\xi_n(f)|. $$
Such functionals can be viewed as “measures of asymmetry” of the empirical distribution because of relationship (2).
Note that a nonsingular affine transformation of the data $(X_1, \ldots, X_n)$ results in an orthogonal transformation of the scaled residuals. If the class $\mathcal{F}$ is invariant with respect to all orthogonal transformations (i.e. for all $f \in \mathcal{F}$ and any orthogonal transformation $O$ we have $f \circ O \in \mathcal{F}$), then the test statistic defined as the sup-norm of the process $\xi_n$ is affine invariant. This is the case in the following examples.
Example 1.1.
Consider $\mathcal{S}$ from Example 1. Let
$$ \mathcal{H} := \big\{ \{x \in \mathbb{R}^d : \langle x, u \rangle \le c\} : u \in S^{d-1}, \ c \in \mathbb{R} \big\} $$
be the class of all half-spaces in $\mathbb{R}^d$, where $S^{d-1}$ denotes the unit sphere in $\mathbb{R}^d$. Consider the class
$$ \mathcal{F} := \{ I_H(x) : H \in \mathcal{H} \}. $$
For $f(\cdot) = I_H(\cdot)$, we have $mf(y) = 2^{-d} \sum_{\varepsilon_i = \pm 1} I_H(\varepsilon_1 y_1, \ldots, \varepsilon_d y_d) =: 2^{-d} \sum_{\varepsilon_i = \pm 1} I_H(\varepsilon y)$, $y \in \mathbb{R}^d$. The process $\xi_n$ becomes
$$ \xi_n(H) = \frac{n^{-1/2}}{2^d} \sum_{j=1}^n \sum_{\varepsilon_i = \pm 1} \big( I_H(Z_j) - I_H(\varepsilon Z_j) \big), \qquad H \in \mathcal{H}. $$
The test statistic is represented as
$$ T_n(\mathcal{F}) := \sup_{H \in \mathcal{H}} |\xi_n(H)| = \frac{n^{-1/2}}{2^d} \sup_{H \in \mathcal{H}} \Big| \sum_{j=1}^n \sum_{\varepsilon_i = \pm 1} \big( I_H(Z_j) - I_H(\varepsilon Z_j) \big) \Big|. $$
Example 1.2.
Consider $\mathcal{S}$ from Example 1. Let $\mathcal{H}$ be the class of all half-spaces in $\mathbb{R}^d$ as in the previous example. Let
$$ f(x) = \sum_{\varepsilon_i = \pm 1} (-1)^{|\varepsilon|} I_{\varepsilon H}(x) $$
for $\varepsilon H = \{x : \langle x, \varepsilon u \rangle \le c\}$ with $|\varepsilon| = \varepsilon_1 + \cdots + \varepsilon_d$. Then we have $mf(y) = 0$. The process $\xi_n$ becomes
$$ \xi_n(H) = n^{-1/2} \sum_{j=1}^n \sum_{\varepsilon_i = \pm 1} (-1)^{|\varepsilon|} I_{\varepsilon H}(Z_j), \qquad H \in \mathcal{H}, $$
and the test statistic is
$$ T_n(\mathcal{F}) = n^{-1/2} \sup_{H \in \mathcal{H}} \Big| \sum_{j=1}^n \sum_{\varepsilon_i = \pm 1} (-1)^{|\varepsilon|} I_{\varepsilon H}(Z_j) \Big|. $$
In the one-dimensional case $d = 1$ this test statistic becomes
$$ T_n := n^{-1/2} \sup_{c \in \mathbb{R}} \Big| \sum_{j=1}^n \big( I(Z_j \le c) - I(-Z_j \le c) \big) \Big|. $$
One gets an expression that resembles a well-known test for symmetry based on the empirical distribution function $P_n$,
$$ \tilde{T}_n = n^{-1/2} \sup_{x \ge 0} \big| n\big( P_n(x) + P_n(-x) - 1 \big) \big|. $$
See, for instance, the discussion in [15].
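In the one-dimensional case the supremum over $c$ can be computed exactly, since the summand is a step function of $c$ with jumps only at the points $\pm Z_j$. A short sketch (numpy assumed; ties among the $\pm Z_j$ have probability zero for continuous data):

```python
# Sketch: exact computation of T_n from Example 1.2 for d = 1.
import numpy as np

def T_n_reflection_1d(Z):
    """Z: 1-d array of residuals Z_j = X_j - theta_n."""
    n = len(Z)
    pts = np.concatenate([Z, -Z])                      # jump locations
    jump = np.concatenate([np.ones(n), -np.ones(n)])   # +1 at Z_j, -1 at -Z_j
    partial = np.cumsum(jump[np.argsort(pts, kind="stable")])
    return np.abs(partial).max() / np.sqrt(n)          # sup over c of |partial sums|
```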
Example 2.1.
Consider $\mathcal{S}$ from Example 2. Let
$$ \mathcal{C} := \big\{ \{v \in S^{d-1} : \langle v, l \rangle \ge c\} : l \in S^{d-1}, \ c \in \mathbb{R}_+ \big\} $$
be the class of “caps” on the unit sphere $S^{d-1}$. Consider the class
$$ \mathcal{F} := \Big\{ I_C\Big(\frac{x}{|x|}\Big)\, I\{0 < |x| \le t\} : C \in \mathcal{C}, \ t > 0 \Big\}. $$
For $f(\cdot) = I_C(\cdot / |\cdot|)\, I\{0 < |\cdot| \le t\}$, we have $mf(\rho) = m(C)\, I_{(0,t]}(\rho)$, $\rho > 0$. The process $\xi_n$ now becomes
$$ \xi_n(C, t) = n^{-1/2} \sum_{j=1}^n \Big( I_C\Big(\frac{Z_j}{|Z_j|}\Big) - m(C) \Big) I\{0 < |Z_j| \le t\}, \qquad C \in \mathcal{C}, \ t > 0. $$
The test statistic $T_n(\mathcal{F}) := \sup_{t > 0, C \in \mathcal{C}} |\xi_n(C, t)|$ can also be represented as
$$ T_n(\mathcal{F}) = n^{-1/2} \max_{1 \le j \le n} \sup_{C \in \mathcal{C}} \Big| \sum_{k=1}^j \Big( I_C\Big(\frac{Z_{[k]}}{|Z_{[k]}|}\Big) - m(C) \Big) \Big|, $$
where $Z_{[j]}$, $j = 1, \ldots, n$, is the rearrangement of $Z_1, \ldots, Z_n$ such that $|Z_{[1]}| \le \cdots \le |Z_{[n]}|$. These tests were studied in [7] and [9].
Example 2.2.
Consider $\mathcal{S}$ from Example 2. Let $G_l$ denote the linear space of spherical harmonics of degree less than or equal to $l$ in $\mathbb{R}^d$, and let $B_l$ be the unit ball in $G_l \subset L_2(S^{d-1}, dm)$. Denote
$$ \mathcal{F} := \Big\{ I\{0 < |x| \le t\}\, \psi\Big(\frac{x}{|x|}\Big) : \psi \in B_l, \ t > 0 \Big\}. $$
Then for $f \in \mathcal{F}$ we have $f(x) = I\{0 < |x| \le t\}\, \psi(x/|x|)$, $x \in \mathbb{R}^d$, and $mf(\rho) = m(\psi)\, I_{(0,t]}(\rho)$, where $m(\psi)$ is the average of $\psi$ on $S^{d-1}$. In this case, the process $\xi_n$ becomes
$$ \xi_n(t, \psi) = n^{-1/2} \sum_{j=1}^n I\{0 < |Z_j| \le t\} \Big( \psi\Big(\frac{Z_j}{|Z_j|}\Big) - m(\psi) \Big), \qquad t > 0, \ \psi \in B_l. $$
The statistic $T_n(\mathcal{F}) := \sup_{t > 0, \psi \in B_l} |\xi_n(t; \psi)|$ becomes
$$ T_n(\mathcal{F}) = n^{-1/2} \max_{1 \le j \le n} \Bigg( \sum_{s=1}^{\dim(G_l)} \Big( \sum_{k=1}^j \Big( \psi_s\Big(\frac{Z_{[k]}}{|Z_{[k]}|}\Big) - \delta_{s1} \Big) \Big)^2 \Bigg)^{1/2}, $$
where $\{\psi_s, s = 1, \ldots, \dim(G_l)\}$ denotes an orthonormal basis of the space $G_l$ with $\psi_1 \equiv 1$, $\delta_{ij} = 1$ for $i = j$ and $0$ otherwise, and $\dim$ denotes the dimension of the space. These tests were studied in [7] and [9], where their superiority over other tests in level preservation and power performance was shown both theoretically and in a simulation study. A similar approach was used to test for multivariate normality in [16]. The authors of [6] developed a different kind of test for ellipsoidal symmetry based on spherical harmonics.
Example 2.3.
Consider $\mathcal{S}$ from Example 2. Let $\mathcal{H}$ be the class of all half-spaces in $\mathbb{R}^d$ as in Example 1.1. For $f := I_H$, where $H := \{x \in \mathbb{R}^d : \langle x, u \rangle \le c\}$, we have $mf(\rho) = \gamma(c/\rho)$, where $\gamma(c) := m\{v : \langle v, u \rangle \le c\}$. The process $\xi_n$ in this case is
$$ \xi_n(u, c) := n^{-1/2} \sum_{j=1}^n \Big( I\{\langle Z_j, u \rangle \le c\} - \gamma\Big(\frac{c}{|Z_j|}\Big) \Big), \qquad u \in S^{d-1}, \ c \in \mathbb{R}, $$
and the test statistic can be defined as
$$ T_n := \sup_{c \in \mathbb{R}} \sup_{u \in S^{d-1}} |\xi_n(u, c)|. $$
This type of test statistic was systematically studied in [7,9,10].
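As an illustration of how this statistic can be evaluated, the sketch below approximates the supremum over $(u, c)$ on a finite grid. The closed form for $\gamma$ through the regularized incomplete beta function is a standard fact about the uniform measure on the sphere, not taken from the paper, and the function names are illustrative.

```python
# Sketch: grid approximation of T_n from Example 2.3.
import numpy as np
from scipy.special import betainc

def gamma_lower(c, d):
    """m{v in S^{d-1} : <v, u> <= c} via the regularized incomplete beta function."""
    c = np.clip(c, -1.0, 1.0)
    half_tail = 0.5 * betainc((d - 1) / 2.0, 0.5, 1.0 - c ** 2)   # measure of {<v,u> >= |c|}
    return np.where(c >= 0, 1.0 - half_tail, half_tail)

def T_n_halfspace(Z, n_dir=200, n_c=100, seed=0):
    """Approximate sup_{u, c} |xi_n(u, c)| over random directions u and a grid of c."""
    n, d = Z.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n_dir, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    R = np.linalg.norm(Z, axis=1)
    best = 0.0
    for u in U:
        proj = Z @ u
        for c in np.linspace(-R.max(), R.max(), n_c):
            best = max(best, abs(np.sum((proj <= c) - gamma_lower(c / R, d))))
    return best / np.sqrt(n)
```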
Example 2.4.
Consider $\mathcal{S}$ from Example 2. Let
$$ \mathcal{F} := \big\{\, |\cdot - t| - |\cdot| - |t| : t \in \mathbb{R}^d, \ t \ne 0 \,\big\}. $$
For $f(\cdot) = |\cdot - t| - |\cdot| - |t|$, we have $mf(\rho) = \rho\,\beta(t) - 1 - |t|$, where $\beta(t) := \int_{S^{d-1}} |v - t|\, m(dv)$. Thus, the process $\xi_n$ becomes
$$ \xi_n(t) := n^{-1/2}\, |t|^{-1} \sum_{i=1}^n \big( |Z_i - t| - |Z_i| - \beta(t) \big), \qquad t \in \mathbb{R}^d, $$
and the test statistic can be chosen as
$$ T_n(\mathcal{F}) := \sup_{t \in \mathbb{R}^d,\, t \ne 0} |\xi_n(t)|. $$
Example 2.5.
Consider $\mathcal{S}$ from Example 2. Consider the class
$$ \mathcal{F} := \{ e^{i\langle t, x \rangle}, \ x \in \mathbb{R}^d : |t| \le 1 \}. $$
For $f(\cdot) = e^{i\langle t, \cdot \rangle}$ we have $mf(\rho) = c_d\, J_{d/2-1}(\rho|t|)\, (\rho|t|)^{-(d/2-1)}$, $\rho > 0$, where $J_l$ denotes the Bessel function of order $l$ and the constant $c_d$ depends only on $d$. The process $\xi_n$ becomes
$$ \xi_n(t) = n^{-1/2} \sum_{j=1}^n \Big( \exp\{ i\langle t, Z_j \rangle \} - c_d\, J_{d/2-1}(|Z_j||t|)\, (|Z_j||t|)^{-(d/2-1)} \Big), \qquad |t| \le 1, $$
and the test statistic can be chosen as
$$ T_n := \sup_{|t| \le 1} |\xi_n(t)|. $$
Consider $\mathcal{S}_k$ from Example 3. Due to the similarity between Examples 2 and 3, one can choose the same classes of functions for $\mathcal{S}_k$. We give just one example as an illustration.
Example 3.1.
Consider the class $\mathcal{F}$ from Example 2.1. Then for $f(\cdot) = I_C(\cdot/|\cdot|)\, I\{0 < |\cdot| \le t\}$, we have $mf(y) = \frac{1}{k} \sum_{l=0}^{k-1} I_C(O_l y / |y|)\, I_{(0,t]}(|y|)$ for all $y \in \mathbb{R}^d$, where $O_l$ is the rotation by the angle $2\pi l / k$, $l = 0, \ldots, k-1$. In this case, the process $\xi_n$ is
$$ \xi_n(C, t) = n^{-1/2} \sum_{j=1}^n \Big( I_C\Big(\frac{Z_j}{|Z_j|}\Big) - \frac{1}{k} \sum_{l=0}^{k-1} I_C\Big(\frac{O_l Z_j}{|Z_j|}\Big) \Big) I\{0 < |Z_j| \le t\}, \qquad C \in \mathcal{C}, \ t > 0. $$
The test statistic $T_n := \sup_{t > 0, C \in \mathcal{C}} |\xi_n(C, t)|$ can also be represented as
$$ T_n(\mathcal{F}) = n^{-1/2} \max_{1 \le j \le n} \sup_{C \in \mathcal{C}} \Big| \sum_{m=1}^j \Big( I_C\Big(\frac{Z_{[m]}}{|Z_{[m]}|}\Big) - \frac{1}{k} \sum_{l=0}^{k-1} I_C\Big(\frac{O_l Z_{[m]}}{|Z_{[m]}|}\Big) \Big) \Big|, $$
where $Z_{[j]}$, $j = 1, \ldots, n$, is the rearrangement of $Z_1, \ldots, Z_n$ such that $|Z_{[1]}| \le \cdots \le |Z_{[n]}|$.
Example 4.1.
Consider $\mathcal{S}$ from Example 4. Let $\mathcal{H}$ be the class of all half-spaces in $\mathbb{R}^d$ as in Example 1.1. Denote
$$ \mathcal{F} := \{ I_H(y)\, I(y_{j_1} \le \cdots \le y_{j_k}) : H \in \mathcal{H}, \ (j_1, \ldots, j_k) \text{ is any combination from } (1, \ldots, d) \}. $$
Then for $f \in \mathcal{F}$ such that $f(y) := I_H(y)\, I(y_{j_1} \le \cdots \le y_{j_k})$ for $y \in \mathbb{R}^d$ we have
$$ mf(y) = m_{H, j_1, \ldots, j_k}(y) = \frac{1}{d!} \sum_{(i_1, \ldots, i_d)} I_H\big( (y_{i_1}, \ldots, y_{i_d}) \big)\, I\big( y_{i_{j_1}} \le \cdots \le y_{i_{j_k}} \big), $$
where the summation is over all permutations $(i_1, \ldots, i_d)$ of $(1, \ldots, d)$. In this case the process $\xi_n$ is
$$ \xi_n(H, j_1, \ldots, j_k) := n^{-1/2} \sum_{j=1}^n \Big( I_H(Z_j)\, I\big( (Z_j)_{j_1} \le \cdots \le (Z_j)_{j_k} \big) - m_{H, j_1, \ldots, j_k}(Z_j) \Big), $$
$$ H \in \mathcal{H}, \ (j_1, \ldots, j_k) \text{ any combination from } (1, \ldots, d), $$
and the test statistic can be defined as
$$ T_n := \sup_{H \in \mathcal{H}}\ \sup_{2 \le k \le d}\ \sup_{(j_1, \ldots, j_k)} |\xi_n(H, j_1, \ldots, j_k)|, $$
where the last supremum is taken over all combinations $(j_1, \ldots, j_k)$ out of $(1, \ldots, d)$. The well-known and frequently used Friedman rank tests are based on a similar choice of the class $\mathcal{F}$; see [17] and [11].
It is not hard to see that the function classes defined in Examples 1.1, 1.2, 2.1–2.5, 3.1, and 4.1 are semialgebraic subgraph classes. In addition, the classes $\mathcal{F}$ characterize the distribution in the case of Examples 1.1, 2.1, 2.3, 2.4, 3.1, and 4.1 above.
We say that the class of transformations $\mathcal{S}$ preserves the semialgebraic property if for any polynomial $p$ on $\mathbb{R}^d$ of degree less than or equal to $r$ the set $\{(x,t) \in \mathbb{R}^{d+1} : p(Sx, t) \ge 0\}$ belongs to $SA_{q,d+1,l}$ for some $q$ and $l$ (see the Appendix for the definition). The classes $\mathcal{S}$ defined in Examples 1–4 preserve the semialgebraic property.
Let
$$ E(f; A) := \int_{\mathbb{R}^d} \big( f(Ax) - mf(Ax) \big)\, P(dx). $$
It follows from (2) that, for an $\mathcal{S}$-symmetric distribution $P$ and for all $f$,
$$ E(f; A_0) = 0. $$
Let $W_P$ denote the $P$-Brownian bridge, i.e. a centered Gaussian process indexed by functions in $L_2(\mathbb{R}^d; dP)$ with the covariance
$$ E\, W_P(f) W_P(g) = P(fg) - P(f)P(g). $$
We will frequently use integral notation for $W_P(f)$:
$$ W_P(f) = \int_{\mathbb{R}^d} f(x)\, W_P(dx). $$
As always, $\ell^\infty(\mathcal{F})$ denotes the space of all uniformly bounded functions on $\mathcal{F}$ with the sup-norm $\|Y\|_{\mathcal{F}} := \sup_{f \in \mathcal{F}} |Y(f)|$, $Y \in \ell^\infty(\mathcal{F})$.
A sequence of stochastic processes $\zeta_n : \mathcal{F} \to \mathbb{R}$ is said to converge weakly in $\ell^\infty(\mathcal{F})$ (in the sense of Hoffmann-Jørgensen) to a stochastic process $\zeta : \mathcal{F} \to \mathbb{R}$ if and only if there exists a Radon probability measure $\gamma$ on $\ell^\infty(\mathcal{F})$ such that $\gamma$ is the distribution of $\zeta$ and, for all bounded and $\|\cdot\|_{\mathcal{F}}$-continuous functionals $\Phi : \ell^\infty(\mathcal{F}) \to \mathbb{R}$, we have $E^* \Phi(\zeta_n) \to \int \Phi(x)\, \gamma(dx)$, where $E^*$ stands for the outer expectation, defined as $E^* \Psi = \inf\{ EU : U \ge \Psi, \ U : \Omega \to [-\infty, \infty] \text{ measurable, } EU \text{ exists} \}$ for a map $\Psi : \Omega \to [-\infty, \infty]$. See, for instance, [18].
We assume in what follows that the class $\mathcal{F}$ satisfies standard measurability assumptions used in the theory of empirical processes (see [19] or [18]). We also need smoothness conditions (S) on $P$ and $\mathcal{F}$, which are given in the Appendix.

3. Main Results

Theorem 1
Suppose that $\mathcal{F}$ is a semialgebraic subgraph class, the smoothness conditions (S) hold, and $E\gamma_A^2(X) < +\infty$. Define the Gaussian stochastic process
$$ \xi_P(f) := W_P\big( f(A_0\,\cdot) - mf(A_0\,\cdot) \big) + E_A'(f; A_0)\Big( \int_{\mathbb{R}^d} \gamma_A(x)\, W_P(dx) \Big), $$
whose distribution is a Radon measure in $\ell^\infty(\mathcal{F})$. Then the sequence of stochastic processes
$$ \{ \xi_n(f) - n^{1/2} E(f; A_0) : f \in \mathcal{F} \} $$
converges weakly in the space $\ell^\infty(\mathcal{F})$ to the process $\xi_P$. In particular, if $P$ is $\mathcal{S}$-symmetric with specifiers $(A_0, \Pi_0)$, then the sequence $\xi_n$ converges weakly in the space $\ell^\infty(\mathcal{F})$ to the process $\xi_P$.
Define the test statistic
$$ T_n := \|\xi_n\|_{\mathcal{F}}. $$
Given $\alpha > 0$, let
$$ t_\alpha := \inf\big\{ t : \mathbb{P}\{ \|\xi_P\|_{\mathcal{F}} \ge t \} \le \alpha \big\}. $$
Let $H_0$ be the hypothesis that $P \in \mathcal{E}(\mathbb{R}^d)$ and let $H_a$ be the alternative that $P \notin \mathcal{E}(\mathbb{R}^d)$. Also, denote by $H_a(\mathcal{F})$ the alternative that $P$ is $\mathcal{F}$-asymmetric.
Theorem 1 and the well-known theorem of Cirel'son on the continuity of the distribution of the sup-norm of a Gaussian process (see [20]) imply the following.
Corollary 1
Suppose all conditions of Theorem 1 hold. Under the hypothesis $H_0$,
$$ \mathbb{P}\{ T_n \ge t_\alpha \} \to \alpha \ \text{ as } n \to \infty, $$
and under the alternative $H_a(\mathcal{F})$,
$$ \mathbb{P}\{ T_n \ge t_\alpha \} \to 1 \ \text{ as } n \to \infty. $$
In particular, if $\mathcal{F}$ characterizes the distribution, then under the alternative $H_a$, i.e. for a fixed $\mathcal{S}$-asymmetric distribution $P$,
$$ \mathbb{P}\{ T_n \ge t_\alpha \} \to 1 \ \text{ as } n \to \infty. $$
In most cases, however, the limit distributions of such statistics as T n depend on the unknown parameters of the distribution P . Thus, to implement the test one has to evaluate the distribution of the test statistic using, for instance, a bootstrap method. We describe below a version of the conditional bootstrap for S -symmetry testing. It is a generalization of the bootstrap method proposed in [7].
Given $P$, let $P^s$ denote the $\mathcal{S}$-symmetric distribution with specifiers $(A_0, \Pi_0)$; it will be called the $\mathcal{S}$-symmetrization of $P$. Denote by $P_n^s$ the $\mathcal{S}$-symmetric distribution with specifiers $(A_n, \Pi_n)$. Let $(X_1^s, \ldots, X_n^s)$ be an i.i.d. sample from the distribution $P_n^s$, defined on a probability space $(\hat{\Omega}, \hat{\Sigma}, \hat{\mathbb{P}})$. One can construct such a sample using the following procedure. Take an i.i.d. sample $(Y_1, \ldots, Y_n)$ from $\Pi_n$, i.e. a resample from $(Z_1, \ldots, Z_n)$, together with an independent i.i.d. sample $(S_1, \ldots, S_n)$ from the uniform distribution $m$ on $\mathcal{S}$, and define
$$ X_j^s := A_n^{-1} S_j Y_j, \qquad j = 1, \ldots, n. $$
Then, conditionally on $(X_1, \ldots, X_n)$, $(X_1^s, \ldots, X_n^s)$ is an i.i.d. sample from the $\mathcal{S}$-symmetric distribution $P_n^s$.
In particular, for $\mathcal{S}$ from Example 1,
$$ X_j^s := \theta_n + \varepsilon_j Y_j, \qquad j = 1, \ldots, n, $$
where $(\varepsilon_1, \ldots, \varepsilon_n)$ is an i.i.d. Rademacher sample, that is, $\varepsilon_j = 1$ or $-1$ with probability $1/2$ each, $j = 1, \ldots, n$, independent of $(Y_1, \ldots, Y_n)$.
For $\mathcal{S}$ from Example 2 one can take an i.i.d. sample $(U_1, \ldots, U_n)$ uniformly distributed on $S^{d-1}$ and an i.i.d. sample $(Y_1, \ldots, Y_n)$ from $\tilde{\Pi}_n$, the empirical distribution based on $(|Z_1|, \ldots, |Z_n|)$, independent of $(U_1, \ldots, U_n)$. In other words, $(Y_1, \ldots, Y_n)$ is a resample from $(|Z_1|, \ldots, |Z_n|)$. Then
$$ X_j^s := \theta_n + V_n U_j Y_j, \qquad j = 1, \ldots, n. $$
For $\mathcal{S}$ from Example 3, let $(\varepsilon_1, \ldots, \varepsilon_n)$ be an i.i.d. sample uniformly distributed on $\{0, 1, \ldots, k-1\}$, independent of $(Y_1, \ldots, Y_n)$; then
$$ X_j^s := \theta_n + V_n O_j Y_j, \qquad j = 1, \ldots, n, $$
where $O_j$ is the rotation by the angle $2\pi\varepsilon_j/k$ about $0$.
Finally, for $\mathcal{S}$ from Example 4, consider $n$ independent permutations $(i_1(j), \ldots, i_d(j))$ of $(1, \ldots, d)$, $j = 1, \ldots, n$, independent of $(Y_1, \ldots, Y_n)$. Then
$$ X_j^s := \theta_n + V_n R_j Y_j, \qquad j = 1, \ldots, n, $$
where $R_j$ is the reflection transformation such that $R_j(x_1, \ldots, x_d) = (x_{i_1(j)}, \ldots, x_{i_d(j)})$ for $x \in \mathbb{R}^d$.
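As an illustration of these resampling recipes, a short sketch for the groups of Examples 3 and 4 (numpy's Generator API assumed, e.g. rng = np.random.default_rng(0); names are illustrative):

```python
# Sketch: one conditional bootstrap sample (X_1^s, ..., X_n^s).
import numpy as np

def bootstrap_sample_polygon(Z, theta_n, V_n, k, rng):
    """Example 3 (d = 2): X_j^s = theta_n + V_n O_j Y_j with O_j a rotation by 2*pi*eps_j/k."""
    n = Z.shape[0]
    Y = Z[rng.integers(0, n, size=n)]                     # resample from (Z_1, ..., Z_n)
    ang = 2.0 * np.pi * rng.integers(0, k, size=n) / k
    c, s = np.cos(ang), np.sin(ang)
    OY = np.column_stack([c * Y[:, 0] - s * Y[:, 1], s * Y[:, 0] + c * Y[:, 1]])
    return theta_n + OY @ V_n.T

def bootstrap_sample_permutation(Z, theta_n, V_n, rng):
    """Example 4: X_j^s = theta_n + V_n R_j Y_j with a random coordinate permutation R_j."""
    n, d = Z.shape
    Y = Z[rng.integers(0, n, size=n)]
    idx = np.argsort(rng.random((n, d)), axis=1)          # one uniform permutation per row
    return theta_n + np.take_along_axis(Y, idx, axis=1) @ V_n.T
```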
Let $\hat{P}_n$ denote the empirical measure based on the sample $(X_1^s, \ldots, X_n^s)$, and let
$$ \hat{A}_n := A_n(X_1^s, \ldots, X_n^s). $$
Define the bootstrapped scaled residuals as
$$ \hat{Z}_j := \hat{Z}_{j,n} := \hat{A}_n X_j^s, \qquad j = 1, \ldots, n. $$
Let $\hat{\Pi}_n$ denote the empirical distribution based on the sample $(\hat{Z}_1, \ldots, \hat{Z}_n)$.
The bootstrap version of $\xi_n$ is the process
$$ \hat{\xi}_n(f) := n^{1/2} \Big( \int_{\mathbb{R}^d} f(\hat{A}_n x)\, \hat{P}_n(dx) - \int_{\mathbb{R}^d} mf(y)\, d\hat{\Pi}_n(y) \Big) = n^{-1/2} \sum_{j=1}^n \big( f(\hat{Z}_j) - mf(\hat{Z}_j) \big), \qquad f \in \mathcal{F}. $$
Let $BL_1(\ell^\infty(\mathcal{F}))$ denote the set of all functionals $\Phi : \ell^\infty(\mathcal{F}) \to \mathbb{R}$ such that $|\Phi(Y)| \le 1$ for all $Y \in \ell^\infty(\mathcal{F})$ and $|\Phi(Y_1) - \Phi(Y_2)| \le \|Y_1 - Y_2\|_{\mathcal{F}}$ for all $Y_1, Y_2 \in \ell^\infty(\mathcal{F})$. Given two stochastic processes $\zeta_1, \zeta_2 : \Omega \times \hat{\Omega} \times \mathcal{F} \to \mathbb{R}$, we define the bounded Lipschitz distance
$$ d_{BL}(\zeta_1, \zeta_2) := \sup_{\Phi \in BL_1(\ell^\infty(\mathcal{F}))} \big| \hat{E}^* \Phi(\zeta_1) - \hat{E}^* \Phi(\zeta_2) \big|, $$
where $\hat{E}^*$ denotes the outer expectation with respect to $\hat{\mathbb{P}}$.
Now we are going to consider a bootstrap version of Theorem 1.
Theorem 2
Suppose that $\mathcal{F}$ is a semialgebraic subgraph class, the smoothness conditions (S) hold, and $E\gamma_A^2(X) < \infty$. Then the sequence of stochastic processes $\{\hat{\xi}_n\}$ converges weakly in the space $\ell^\infty(\mathcal{F})$, in probability $\mathbb{P}$, to a version $\hat{\xi}_{P^s}$ of the process $\xi_{P^s}$ (defined on the probability space $(\hat{\Omega}, \hat{\Sigma}, \hat{\mathbb{P}})$). More precisely,
$$ d_{BL}(\hat{\xi}_n; \hat{\xi}_{P^s}) \to 0 \ \text{ as } n \to \infty \ \text{ in probability } \mathbb{P}. $$
In particular, if $P$ is $\mathcal{S}$-symmetric, then $\hat{\xi}_n$ converges weakly to a version of the process $\xi_P$.
Define the bootstrapped test statistic
$$ \hat{T}_n := \|\hat{\xi}_n\|_{\mathcal{F}}. $$
Given $\alpha > 0$, let
$$ \hat{t}_{n,\alpha} := \inf\big\{ t : \hat{\mathbb{P}}\{ \hat{T}_n \ge t \} \le \alpha \big\}. $$
In other words, $\hat{t}_{n,\alpha}$ is a $(1-\alpha)$-quantile of the distribution of $\hat{T}_n$ conditional on the sample $(X_1, \ldots, X_n)$.
Then Theorems 1 and 2 imply the following.
Corollary 2
Suppose all the conditions of Theorems 1 and 2 hold. Under the hypothesis $H_0$,
$$ \mathbb{P}\{ T_n \ge \hat{t}_{n,\alpha} \} \to \alpha \ \text{ as } n \to \infty, $$
and under the alternative $H_a(\mathcal{F})$,
$$ \mathbb{P}\{ T_n \ge \hat{t}_{n,\alpha} \} \to 1 \ \text{ as } n \to \infty. $$
In particular, if $\mathcal{F}$ characterizes the distribution, the bootstrap test is consistent against any asymmetric alternative (subject to the smoothness conditions (S)): under the alternative $H_a$,
$$ \mathbb{P}\{ T_n \ge \hat{t}_{n,\alpha} \} \to 1 \ \text{ as } n \to \infty. $$
Thus, our method provides tests that are consistent against any S -asymmetric alternative.

4. Detailed Example

In this section we provide an example for which we verify all the assumptions and supply a step-by-step computational algorithm. Let $d = 2$ and consider the problem of testing whether $P$ is an elliptically contoured measure (Example 2). For a vector $x \in \mathbb{R}^2$ let $(r, \varphi) \in [0, \infty) \times [0, 2\pi)$ be its polar coordinates. For a fixed integer $l$ let
$$ \mathcal{F} := \mathrm{lin}\big( \{ I\{0 < r \le t\} \cos(k\varphi),\ I\{0 < r \le t\} \sin(k\varphi) : 1 \le k \le l, \ t > 0 \} \big), $$
where $\mathrm{lin}(G)$ denotes the linear span of $G$ with all the functions bounded by $1$; see Example 2.2. This class $\mathcal{F}$ satisfies the following assumptions.
1. It characterizes the distribution only for $l = \infty$. For a finite $l$ it does not characterize the distribution, since one can find two different distributions $Q_1, Q_2$ such that
$$ \int_0^t \int_0^{2\pi} \cos(k\varphi)\, dQ_1(r\cos\varphi, r\sin\varphi) = \int_0^t \int_0^{2\pi} \cos(k\varphi)\, dQ_2(r\cos\varphi, r\sin\varphi) $$
and
$$ \int_0^t \int_0^{2\pi} \sin(k\varphi)\, dQ_1(r\cos\varphi, r\sin\varphi) = \int_0^t \int_0^{2\pi} \sin(k\varphi)\, dQ_2(r\cos\varphi, r\sin\varphi) $$
for all $t > 0$, $1 \le k \le l$.
2. $\mathcal{F}$ is a semialgebraic subgraph class. Indeed, for $(x_1, x_2)$ the sets
$$ \big\{ (x_1, x_2, s) : I(0 < x_1^2 + x_2^2 \le t^2) \cos\big(k \arctan(x_2/x_1)\big) \ge s \ge 0 \ \text{ or } \ I(0 < x_1^2 + x_2^2 \le t^2) \cos\big(k \arctan(x_2/x_1)\big) \le s \le 0 \big\} $$
can be represented as unions of a finite number of intersections of polynomial sets of finite degree. For instance, for $k = 1$ we have the following representation:
$$ \{ 1 - s^2 \ge 0,\ t^2 - x_1^2 - x_2^2 \ge 0,\ s \ge 0,\ x_1^2 - s^2(x_1^2 + x_2^2) \ge 0 \} \cup \{ 1 - s^2 \ge 0,\ t^2 - x_1^2 - x_2^2 \ge 0,\ s \le 0,\ s^2(x_1^2 + x_2^2) - x_1^2 \ge 0 \}. $$
The representations for any $1 \le k \le l$ can be obtained similarly using trigonometric identities. Obviously, similar arguments work for sines, for linear combinations of sines and cosines, and for products of any two functions from $\mathcal{F}$.
3. $\mathcal{F}$ is invariant with respect to all orthogonal transformations, which here are rotations of the unit circle. Indeed, under a rotation by an angle $\tau$ a vector with polar coordinates $(r, \varphi)$ is transformed into the vector $(r, \varphi + \tau)$. So for any $t > 0$, $1 \le k \le l$ we have $f(r, \varphi + \tau) = I\{0 < r \le t\} \cos(k(\varphi + \tau))$ or $f(r, \varphi + \tau) = I\{0 < r \le t\} \sin(k(\varphi + \tau))$, where both functions belong to the linear span of $\mathcal{F}$ and are bounded by $1$. So any linear combination of such functions that is bounded by $1$ also lies in $\mathcal{F}$.
4. Condition (S2) holds. Indeed, $\mathcal{F}$ is uniformly bounded by $1$. For any $\delta > 0$ and any $f \in \mathcal{F}$ we have $\omega_f(\cdot\,; \delta) = C\delta$ for some constant $C > 0$, so that the measure of the set defined in (S2) is zero for small enough $\delta$.
Also note that the group $\mathcal{S}$ of all orthogonal transformations of $\mathbb{R}^2$ preserves the semialgebraic property. Indeed, for any rotation by an angle $\tau$ the sets
$$ \big\{ (x_1, x_2, s) : I(0 < x_1^2 + x_2^2 \le t^2) \cos\big(k \arctan(x_2/x_1) + \tau\big) \ge s \ge 0 \ \text{ or } \ I(0 < x_1^2 + x_2^2 \le t^2) \cos\big(k \arctan(x_2/x_1) + \tau\big) \le s \le 0 \big\} $$
are semialgebraic. The same holds for sines and for linear combinations of sines and cosines.
We also require the following two conditions on $P$: (S1) holds and $E\gamma_A^2(X) < \infty$. The first condition is satisfied for an absolutely continuous $P$ with a uniformly bounded and continuously differentiable Lebesgue density whose derivative approaches zero at infinity faster than $|x|^{-3}$, $x \in \mathbb{R}^2$. For example, distributions with densities of bounded support and normal distributions satisfy (S1). The last condition is satisfied if $E|X|^4 < \infty$.
Given a random sample $(X_1, \ldots, X_n)$ from a distribution $P$, the step-by-step testing algorithm is as follows.
1. Obtain $\theta_n$, $V_n$. In our example $\theta_n$ is the sample mean and $V_n$ is the square root of the sample covariance of $(X_1, \ldots, X_n)$.
2. Calculate the residuals $Z_j = V_n^{-1}(X_j - \theta_n)$, $j = 1, \ldots, n$.
3. Compute the test statistic, which can be simplified as follows:
$$ T_n(\mathcal{F}) = n^{-1/2} \max_{1 \le j \le n} \Bigg( \sum_{k=1}^{l} \bigg( \Big( \sum_{i=1}^{j} \cos(k\varphi_i) \Big)^2 + \Big( \sum_{i=1}^{j} \sin(k\varphi_i) \Big)^2 \bigg) \Bigg)^{1/2}, $$
where $\varphi_i$ is the polar angle of $Z_{[i]}$ and the residuals are ordered by increasing modulus, $|Z_{[1]}| \le \cdots \le |Z_{[n]}|$, as in Example 2.2.
4. Choose a number of bootstrap repetitions, say $M$. In practice one often takes a large number, for instance $M = 10000$. Then the next four steps are repeated $M$ times.
4.1. Generate a sample $(X_1^s, \ldots, X_n^s)$. In this example, first generate a sample $(U_1, \ldots, U_n)$ from the uniform distribution on the unit circle, independent of $(|Z_1|, \ldots, |Z_n|)$. Secondly, resample with replacement from $(|Z_1|, \ldots, |Z_n|)$ to obtain $(R_1, \ldots, R_n)$. Thirdly, set $X_j^s = \theta_n + V_n R_j U_j$, $j = 1, \ldots, n$.
4.2. Obtain $\hat{\theta}_n$, $\hat{V}_n$. In our example $\hat{\theta}_n$ is the sample mean and $\hat{V}_n$ is the square root of the sample covariance of $(X_1^s, \ldots, X_n^s)$.
4.3. Calculate the residuals $\hat{Z}_j = \hat{V}_n^{-1}(X_j^s - \hat{\theta}_n)$, $j = 1, \ldots, n$.
4.4. Compute the bootstrapped test statistic, which can be simplified as follows:
$$ \hat{T}_n(\mathcal{F}) = n^{-1/2} \max_{1 \le j \le n} \Bigg( \sum_{k=1}^{l} \bigg( \Big( \sum_{i=1}^{j} \cos(k\hat{\varphi}_i) \Big)^2 + \Big( \sum_{i=1}^{j} \sin(k\hat{\varphi}_i) \Big)^2 \bigg) \Bigg)^{1/2}, $$
where $\hat{\varphi}_i$ is the polar angle of $\hat{Z}_{[i]}$ and the bootstrapped residuals are ordered by increasing modulus.
5. Based on $(\hat{T}_n^{(1)}, \ldots, \hat{T}_n^{(M)})$ find the empirical $(1-\alpha)$-quantile of the distribution of $\hat{T}_n(\mathcal{F})$ conditional on $(X_1, \ldots, X_n)$. Denote it by $\hat{t}_{n,\alpha,M}$.
6. If $T_n(\mathcal{F}) \ge \hat{t}_{n,\alpha,M}$, then reject $H_0$: $P$ is an elliptically contoured distribution, at the significance level $\alpha$.
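To make the algorithm concrete, here is a minimal self-contained sketch in Python (numpy/scipy assumed; the function names and the default choices of $l$, $M$ and $\alpha$ are illustrative, not prescriptions of the paper).

```python
# Sketch: bootstrap test of elliptical symmetry for a 2-dimensional sample.
import numpy as np
from scipy.linalg import sqrtm

def trig_statistic(Z, l):
    """T_n(F): partial sums of cos(k*phi), sin(k*phi) over residuals ordered by modulus."""
    n = Z.shape[0]
    order = np.argsort(np.linalg.norm(Z, axis=1))
    phi = np.arctan2(Z[order, 1], Z[order, 0])            # polar angles of Z_[i]
    ks = np.arange(1, l + 1)
    C = np.cumsum(np.cos(np.outer(phi, ks)), axis=0)      # running sums over j
    S = np.cumsum(np.sin(np.outer(phi, ks)), axis=0)
    return np.sqrt((C ** 2 + S ** 2).sum(axis=1)).max() / np.sqrt(n)

def test_elliptical_symmetry(X, l=3, M=1000, alpha=0.05, seed=0):
    """Steps 1-6 above for X of shape (n, 2); returns (T_n, bootstrap quantile, reject?)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    def residuals(W):
        theta = W.mean(axis=0)
        V = np.real(sqrtm(np.cov(W, rowvar=False, bias=True)))
        return theta, V, np.linalg.solve(V, (W - theta).T).T

    theta_n, V_n, Z = residuals(X)                        # steps 1-2
    T_n = trig_statistic(Z, l)                            # step 3

    T_boot = np.empty(M)                                  # steps 4.1-4.4, repeated M times
    for m in range(M):
        U = rng.standard_normal((n, 2))
        U /= np.linalg.norm(U, axis=1, keepdims=True)     # uniform on the unit circle
        R = np.linalg.norm(Z[rng.integers(0, n, size=n)], axis=1)   # resampled moduli
        Xs = theta_n + (R[:, None] * U) @ V_n.T           # X_j^s = theta_n + V_n R_j U_j
        T_boot[m] = trig_statistic(residuals(Xs)[2], l)

    t_hat = np.quantile(T_boot, 1.0 - alpha)              # steps 5-6
    return T_n, t_hat, bool(T_n >= t_hat)
```

Under the assumptions verified above, rejecting $H_0$ when $T_n(\mathcal{F}) \ge \hat{t}_{n,\alpha,M}$ gives a test with asymptotic level $\alpha$ that is consistent against $\mathcal{F}$-asymmetric alternatives.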

5. Proofs

We use ideas and methods from [7]. That technique was developed for ellipsoidal symmetry and needs to be adjusted for group symmetry. Basically, one should change $V^{-1}(x - \theta)$ to $Ax$ throughout the proofs. However, there are technical difficulties associated with using transformations $A$ instead of $(\theta, V)$; they are hidden in the proofs of the lemmas. We give a few details for completeness.
Let $\mathrm{SP}$ denote a subset of all nonsingular linear transformations in $\mathbb{R}^d$. Given a transformation $A \in \mathrm{SP}$, denote $\tau_A f(\cdot) := f(A\,\cdot)$. For a function $f$ on $\mathbb{R}^d$, let
$$ \tilde{f}(x) := f(x) - mf(x), \qquad x \in \mathbb{R}^d. $$
Given a class $G$ of functions on $\mathbb{R}^d$, define $\tilde{G} := \{\tilde{g} : g \in G\}$ and
$$ G_{\mathrm{aff}} := \{ \tau_A g : g \in G, \ A \in \mathrm{SP} \}. $$
Now the process $\xi_n$ can be represented as
$$ \xi_n(f) = n^{1/2} \int_{\mathbb{R}^d} \tau_{A_n} \tilde{f}\, dP_n, \qquad f \in \mathcal{F}, $$
and the process $\hat{\xi}_n$ as
$$ \hat{\xi}_n(f) = n^{1/2} \int_{\mathbb{R}^d} \tau_{\hat{A}_n} \tilde{f}\, d\hat{P}_n, \qquad f \in \mathcal{F}. $$
Clearly,
$$ E(f; A) = \int_{\mathbb{R}^d} \tau_A \tilde{f}\, dP. $$
Define
$$ E^s(f; A) = \int_{\mathbb{R}^d} \tau_A \tilde{f}\, dP^s. $$
Given a function $g$ on $\mathbb{R}^d$, we can write
$$ \int_{\mathbb{R}^d} g(x)\, P_n^s(dx) = \int_{\mathbb{R}^d} \int_{\mathcal{S}} g(A_n^{-1} S y)\, m(dS)\, \Pi_n(dy) = \int_{\mathbb{R}^d} Mg(A_n; x)\, P_n(dx), $$
where
$$ Mg(A; x) := \int_{\mathcal{S}} g(A^{-1} S A x)\, m(dS). $$
A similar computation shows that
$$ \int_{\mathbb{R}^d} g(x)\, P^s(dx) = \int_{\mathbb{R}^d} Mg(A_0; x)\, P(dx) = \int_{\mathbb{R}^d} Mg(A_0; x)\, P^s(dx). $$
Let
$$ \Gamma(g; A) := \int_{\mathbb{R}^d} Mg(A; x)\, P(dx). $$
Given a class $G$ of functions, define
$$ G^2 := \{ gh : g \in G, \ h \in G \}, \qquad M(G) := \{ Mg(A, \cdot) : g \in G, \ A \in \mathrm{SP} \}. $$
We now state the following versions of lemmas from [7], which describe smoothness properties of the functions introduced above and Donsker properties of the classes of functions defined above. The convergence of transformations is with respect to the operator norm on the set of all linear transformations. The smoothness conditions are used in the proof of Lemma 1. Properties of Vapnik–Chervonenkis subgraph classes are used in the proof of Lemma 2. See [18] for details on Vapnik–Chervonenkis, Glivenko–Cantelli, and Donsker classes of functions.
Lemma 1
Suppose that P and F satisfy the smoothness conditions (S). Then the following statements hold:
(C1) If $A \to A_0$, then
$$ \sup_{f \in \mathcal{F}} \int_{\mathbb{R}^d} \big| \tilde{f}(Ax) - \tilde{f}(A_0 x) \big|^2\, P(dx) \to 0 $$
and
$$ \sup_{f \in \mathcal{F}} \int_{\mathbb{R}^d} \big| \tilde{f}(Ax) - \tilde{f}(A_0 x) \big|^2\, P^s(dx) \to 0. $$
(C2) The function $E(f; A)$ is differentiable at the point $A_0$ for any $f \in \mathcal{F}$, and the first-order Taylor expansion
$$ E(f; A) = E(f; A_0) + E_A'(f; A_0)(A - A_0) + o(|A - A_0|) $$
holds uniformly in $f \in \mathcal{F}$.
(C3) Similarly, the function $E^s(f; A)$ is differentiable at the point $A_0$ for any $f \in \mathcal{F}$, and the first-order Taylor expansion
$$ E^s(f; A) = E^s(f; A_0) + (E^s)_A'(f; A_0)(A - A_0) + o(|A - A_0|) $$
holds uniformly in $f \in \mathcal{F}$.
(C4) The function $\Gamma(g; A)$ is continuous with respect to $A$ at $A_0$ uniformly in $g \in ((\tilde{\mathcal{F}})_{\mathrm{aff}})^2$.
(C5) The function $\Gamma(g; A)$ is differentiable at the point $A_0$ for any $g \in (\tilde{\mathcal{F}})_{\mathrm{aff}}$, and the first-order Taylor expansion
$$ \Gamma(g; A) = \Gamma(g; A_0) + \Gamma_A'(g; A_0)(A - A_0) + o(|A - A_0|) $$
holds uniformly in $g \in (\tilde{\mathcal{F}})_{\mathrm{aff}}$. Moreover, the matrix-valued function $A \mapsto \Gamma_A'(\tau_A \tilde{f}; A_0)$ is continuous at $A_0$ uniformly in $f \in \mathcal{F}$.
(C6) If $A \to A_0$, then for all $\delta > 0$
$$ \sup\Big\{ \int_{\mathbb{R}^d} \big| M\tau_B \tilde{f}(A; x) - M\tau_B \tilde{f}(A_0; x) \big|^2\, P(dx) : B \in \mathrm{SP}, \ |B - A_0| \le \delta \Big\} \to 0 $$
and
$$ \sup_{f \in \mathcal{F}} \int_{\mathbb{R}^d} \big| M\tau_A \tilde{f}(A_0; x) - M\tau_{A_0} \tilde{f}(A_0; x) \big|^2\, P(dx) \to 0. $$
Lemma 2
For a uniformly bounded semialgebraic subgraph class $\mathcal{F}$, the classes $(\tilde{\mathcal{F}})_{\mathrm{aff}}$, $M((\tilde{\mathcal{F}})_{\mathrm{aff}})$ and $M(((\tilde{\mathcal{F}})_{\mathrm{aff}})^2)$ are uniformly Donsker and uniformly Glivenko–Cantelli.
Proof of Theorem 1.
Define the process
$$ \eta_n(f; A) := n^{1/2} (P_n - P)(\tau_A \tilde{f}), \qquad f \in \mathcal{F}, \ A \in \mathrm{SP}. $$
Condition (C1) and the fact that $(\tilde{\mathcal{F}})_{\mathrm{aff}}$ is a $P$-Donsker class by Lemma 2 imply, via asymptotic equicontinuity, that
$$ \lim_{\delta \to 0} \limsup_{n \to \infty} \mathbb{P}^* \Big( \Big\{ \sup_{|A - A_0| \le \delta} \sup_{f \in \mathcal{F}} | \eta_n(f; A) - \eta_n(f; A_0) | \ge \varepsilon \Big\} \Big) = 0 $$
for all $\varepsilon > 0$. Clearly,
$$ \xi_n(f) - n^{1/2} E(f; A_0) = \eta_n(f; A_n) + n^{1/2}\big( E(f; A_n) - E(f; A_0) \big) = \eta_n(f; A_0) + n^{1/2}\big( E(f; A_n) - E(f; A_0) \big) + \big( \eta_n(f; A_n) - \eta_n(f; A_0) \big). $$
If $|A_n - A_0| \le \delta$, we have
$$ \sup_{f \in \mathcal{F}} \Big| \xi_n(f) - n^{1/2} E(f; A_0) - \eta_n(f; A_0) - n^{1/2}\big( E(f; A_n) - E(f; A_0) \big) \Big| \le \sup_{|A - A_0| \le \delta} \sup_{f \in \mathcal{F}} | \eta_n(f; A) - \eta_n(f; A_0) |. $$
Note that
$$ \eta_n(f; A_0) = n^{1/2} \int_{\mathbb{R}^d} \tau_{A_0} \tilde{f}\, d(P_n - P). $$
Using (5) and the $n^{1/2}$-consistency of $A_n$ we obtain
$$ \xi_n(f) - n^{1/2} E(f; A_0) = n^{1/2} \int_{\mathbb{R}^d} \tau_{A_0} \tilde{f}\, d(P_n - P) + n^{1/2}\big( E(f; A_n) - E(f; A_0) \big) + o_p(1) $$
as $n \to \infty$ uniformly in $f \in \mathcal{F}$. It follows from (C2) and the $n^{1/2}$-consistency of $A_n$ that
$$ n^{1/2}\big( E(f; A_n) - E(f; A_0) \big) = n^{1/2} E_A'(f, A_0)(A_n - A_0) + o_p(1), \qquad n \to \infty, $$
uniformly in $f \in \mathcal{F}$. Representations (6) and (7), the fact that $(\tilde{\mathcal{F}})_{\mathrm{aff}}$ is a uniformly Donsker class by Lemma 2, and (C1) imply that the sequence $\xi_n(f) - n^{1/2} E(f; A_0)$ converges weakly in the space $\ell^\infty(\mathcal{F})$ to the Gaussian stochastic process $\xi_P$. This implies the first statement of the theorem. If $P$ is $\mathcal{S}$-symmetric then $E(f; A_0) = 0$, which concludes the proof of Theorem 1.
Proof of Theorem 2.
Define the process
$$ \hat{\eta}_n(f; A) := n^{1/2} (\hat{P}_n - P_n^s)(\tau_A \tilde{f}), \qquad f \in \mathcal{F}, \ A \in \mathrm{SP}. $$
By Lemma 2, the class $M(((\tilde{\mathcal{F}})_{\mathrm{aff}})^2)$ is uniformly Glivenko–Cantelli. This, together with (C4) and representations (3), (4), implies that
$$ \int_{\mathbb{R}^d} g h\, dP_n^s \to \int_{\mathbb{R}^d} g h\, dP^s \ \text{ as } n \to \infty \ \text{ a.s.} $$
uniformly in $g, h \in (\tilde{\mathcal{F}})_{\mathrm{aff}}$. Similarly, since the class $M((\tilde{\mathcal{F}})_{\mathrm{aff}})$ is uniformly Glivenko–Cantelli, by (C5) and representations (3), (4), we obtain
$$ \int_{\mathbb{R}^d} g\, dP_n^s \to \int_{\mathbb{R}^d} g\, dP^s \ \text{ as } n \to \infty \ \text{ a.s.} $$
uniformly in $g \in (\tilde{\mathcal{F}})_{\mathrm{aff}}$.
Since $(\tilde{\mathcal{F}})_{\mathrm{aff}}$ is a uniformly Donsker class, we can use Corollary 2.7 in [21] to prove that a.s. $n^{1/2}(\hat{P}_n - P_n^s)$ converges weakly in the space $\ell^\infty((\tilde{\mathcal{F}})_{\mathrm{aff}})$ to the same limit as $n^{1/2}(\tilde{P}_n - P^s)$, where $\tilde{P}_n$ is the empirical measure based on a sample from $P^s$, i.e. to the $P^s$-Brownian bridge $W_{P^s}$. Asymptotic equicontinuity and (C1) yield that for all $\varepsilon > 0$, $\mathbb{P}$-a.s.,
$$ \lim_{\delta \to 0} \limsup_{n \to \infty} \hat{\mathbb{P}}^* \Big( \Big\{ \sup_{|A - A_n| \le \delta} \sup_{f \in \mathcal{F}} | \hat{\eta}_n(f; A) - \hat{\eta}_n(f; A_n) | \ge \varepsilon \Big\} \Big) = 0. $$
Define
$$ \hat{E}_n(f; A) := \int_{\mathbb{R}^d} \tau_A \tilde{f}\, dP_n^s. $$
Since $\hat{E}_n(f; A_n) = 0$, we can write
$$ \hat{\xi}_n(f) = \hat{\eta}_n(f; \hat{A}_n) + n^{1/2} \hat{E}_n(f; \hat{A}_n) = \hat{\eta}_n(f; A_n) + n^{1/2}\big( \hat{E}_n(f; \hat{A}_n) - \hat{E}_n(f; A_n) \big) + \big( \hat{\eta}_n(f; \hat{A}_n) - \hat{\eta}_n(f; A_n) \big). $$
If $|\hat{A}_n - A_n| \le \delta$, we have
$$ \sup_{f \in \mathcal{F}} \Big| \hat{\xi}_n(f) - \hat{\eta}_n(f; A_n) - n^{1/2}\big( \hat{E}_n(f; \hat{A}_n) - \hat{E}_n(f; A_n) \big) \Big| \le \sup_{|A - A_n| \le \delta} \sup_{f \in \mathcal{F}} | \hat{\eta}_n(f; A) - \hat{\eta}_n(f; A_n) |. $$
Note that
$$ \hat{\eta}_n(f; A_n) = n^{1/2} \int_{\mathbb{R}^d} \tau_{A_n} \tilde{f}\, d(\hat{P}_n - P_n^s). $$
Using (8), (9) and standard asymptotic properties of the estimators $A_n$ we obtain
$$ \hat{\xi}_n(f) = n^{1/2} \int_{\mathbb{R}^d} \tau_{A_n} \tilde{f}\, d(\hat{P}_n - P_n^s) + n^{1/2}\big( \hat{E}_n(f; \hat{A}_n) - \hat{E}_n(f; A_n) \big) + o_p(1) $$
as $n \to \infty$ uniformly in $f \in \mathcal{F}$. Here and in what follows the remainder term $o_p(1)$ converges to $0$ as $n \to \infty$ uniformly in $f \in \mathcal{F}$ in probability $\mathbb{P} \times \hat{\mathbb{P}}$.
Applying the asymptotic equicontinuity condition to the process $n^{1/2}(\hat{P}_n - P_n^s)$ and using (C1), we obtain
$$ n^{1/2} \int_{\mathbb{R}^d} \tau_{A_n} \tilde{f}\, d(\hat{P}_n - P_n^s) = n^{1/2} \int_{\mathbb{R}^d} \tau_{A_0} \tilde{f}\, d(\hat{P}_n - P_n^s) + o_p(1) $$
as $n \to \infty$. Now we can write
$$ n^{1/2}\big( \hat{E}_n(f; \hat{A}_n) - \hat{E}_n(f; A_n) \big) = n^{1/2}\big( \hat{E}_n(f; \hat{A}_n) - E^s(f; \hat{A}_n) \big) - n^{1/2}\big( \hat{E}_n(f; A_n) - E^s(f; A_n) \big) + n^{1/2}\big( E^s(f; \hat{A}_n) - E^s(f; A_n) \big). $$
Note that, by (3) and (4),
$$ n^{1/2}\big( \hat{E}_n(f; A) - E^s(f; A) \big) = n^{1/2} (P_n - P)\big( M\tau_A \tilde{f}(A_n; \cdot) \big) + n^{1/2}\big( \Gamma(\tau_A \tilde{f}; A_n) - \Gamma(\tau_A \tilde{f}; A_0) \big) $$
$$ = n^{1/2} (P_n - P)\big( M\tau_A \tilde{f}(A_0; \cdot) \big) + n^{1/2}\Big[ (P_n - P)\big( M\tau_A \tilde{f}(A_n; \cdot) \big) - (P_n - P)\big( M\tau_A \tilde{f}(A_0; \cdot) \big) \Big] + n^{1/2}\big( \Gamma(\tau_A \tilde{f}; A_n) - \Gamma(\tau_A \tilde{f}; A_0) \big). $$
Since, by Lemma 2, $M((\tilde{\mathcal{F}})_{\mathrm{aff}})$ is a uniformly Donsker class and since (C5) and (C6) hold, it is easy to prove the weak convergence of the processes
$$ n^{1/2}\big( \hat{E}_n(f; A) - E^s(f; A) \big), \qquad f \in \mathcal{F}, \ A \in B(A_0), $$
in the space $\ell^\infty(\mathcal{F} \times B(A_0))$, where $B(A_0)$ is a ball in $\mathrm{SP}$ with center $A_0$. Using the asymptotic equicontinuity and (C6), we obtain
$$ n^{1/2}\big( \hat{E}_n(f; \hat{A}_n) - E^s(f; \hat{A}_n) \big) - n^{1/2}\big( \hat{E}_n(f; A_n) - E^s(f; A_n) \big) = o_p(1) \ \text{ as } n \to \infty. $$
It follows from (C3) and standard asymptotic properties of the estimators $A_n$, $\hat{A}_n$ that
$$ n^{1/2}\big( E^s(f; \hat{A}_n) - E^s(f; A_n) \big) = n^{1/2} (E^s)_A'(f, A_0)(\hat{A}_n - A_n) + o_p(1) = (E^s)_A'(f, A_0)\, n^{1/2} \int_{\mathbb{R}^d} \gamma_A(x)\, d(\hat{P}_n - P_n^s) + o_p(1), \qquad n \to \infty, $$
uniformly in $f \in \mathcal{F}$. Relationships (10)–(14), along with, again, Corollary 2.7 in [21], imply the statement of the theorem.

6. Conclusion

We propose and study a general class of tests for group symmetry, which encompasses different types of symmetry, such as ellipsoidal and permutation symmetries. Our approach is based on supremum norms of special empirical processes combined with bootstrap.
There are several advantages to our methodology. First, the test statistics are indexed by classes of functions that are rich enough and still relatively simple to use. This provides some flexibility in choosing a suitable class of functions, and thereby an appropriate test. Second, these tests are consistent against all possible asymmetric alternatives. Third, they enjoy the property of affine invariance. Fourth, these are bootstrap tests; this could be considered a drawback, but it is a way to deal with the complex nature of the asymptotic null distribution of a non-bootstrap semiparametric test, and the resulting tests have good theoretical properties. Fifth, this approach gathers separate ideas and methods developed for various types of symmetry under one umbrella. It provides a unified theory for studying statistical properties of seemingly different tests for different types of symmetry.

7. Appendix

Definition of a semialgebraic set. For any polynomial $p$ on $\mathbb{R}^m$ of degree less than or equal to $r$ we will call $\{x \in \mathbb{R}^m : p(x) \ge 0\}$ a polynomial set of degree less than or equal to $r$ in $\mathbb{R}^m$. Let $P_{r,m}$ denote the class of all polynomial sets in $\mathbb{R}^m$ of degree less than or equal to $r$. Then any set from the union $\bigcup \{ \mathcal{A}(B_1, \ldots, B_l) : B_1, \ldots, B_l \in P_{r,m} \}$ is called a semialgebraic set of degree less than or equal to $r$ and order less than or equal to $l$, where $\mathcal{A}(B_1, \ldots, B_l)$ is the minimal set algebra generated by $B_1, \ldots, B_l$. Let $SA_{r,m,l}$ denote the class of all semialgebraic sets of degree less than or equal to $r$ and order less than or equal to $l$ in $\mathbb{R}^m$.
A class $G$ of functions on $\mathbb{R}^d$ is a semialgebraic subgraph class if and only if, for some $r$ and $l$, for all functions $g$ from $G$ the set $\{(x, t) : g(x) \ge t \ge 0 \ \text{or} \ g(x) \le t \le 0\}$ belongs to $SA_{r,d+1,l}$ and for all functions $g_1, g_2$ from $G$ the set $\{(x, y, t) : g_1(x) g_2(y) \ge t \ge 0 \ \text{or} \ g_1(x) g_2(y) \le t \le 0\}$ belongs to $SA_{r,2d+1,l}$.
Conditions on $P$ and $\mathcal{F}$. We also introduce the following smoothness conditions on the distribution $P$ and the class $\mathcal{F}$:
(S1) $P$ is absolutely continuous with a uniformly bounded and continuously differentiable density $p$ such that for some $C_A > d + 1$
$$ \sup_{x \in \mathbb{R}^d} (1 + |x|)^{C_A} |p'(x)| < +\infty, $$
where $p'$ denotes the derivative of the density $p$.
(S2) The class $\mathcal{F}$ is uniformly bounded and for all $\varepsilon > 0$ and $R > 0$
$$ \sup_{f \in \mathcal{F}} \mathrm{mes}\big\{ x \in \mathbb{R}^d : |x| \le R \ \text{ and } \ \omega_f(x; \delta) \ge \varepsilon \big\} \to 0 \ \text{ as } \delta \to 0. $$
Here $\mathrm{mes}$ denotes the Lebesgue measure in $\mathbb{R}^d$ and
$$ \omega_f(x; \delta) := \sup\{ |f(x_1) - f(x_2)| : |x_1 - x| \le \delta, \ |x_2 - x| \le \delta \}. $$
The classes $\mathcal{F}$ characterize the distribution. The classes $\mathcal{F}$ characterize the distribution in the case of Examples 1.1, 2.1, 2.3, 2.4, 3.1, and 4.1 above. Indeed, this is a well-known property of the classes used in Examples 1.1, 2.3 and 4.1. As to Example 2.4, we refer, e.g., to the paper [22] for similar statements. To prove that this is the case in Example 2.1 (and in Example 3.1, similarly), consider the map $\mathbb{R}^d \setminus \{0\} \ni x \mapsto (|x|, x/|x|) \in \mathbb{R}_+ \times S^{d-1}$. Since this map is a Borel isomorphism (even a homeomorphism), it suffices to show that for any two finite measures $P$, $Q$ on $\mathbb{R}_+ \times S^{d-1}$ the condition
$$ P\big( (0, t] \times C \big) = Q\big( (0, t] \times C \big) \ \text{ for all } t > 0, \ C \in \mathcal{C} $$
implies $P = Q$. We will prove that, in fact, for any two finite measures $P$, $Q$ on $\mathbb{R}_+ \times \mathbb{R}^d$ the condition
$$ P\big( (0, t] \times H \big) = Q\big( (0, t] \times H \big) \ \text{ for all } t > 0, \ H \in \mathcal{H}, $$
where $\mathcal{H}$ is the class of all half-spaces in $\mathbb{R}^d$, implies that $P = Q$ (the previous statement then follows, since one can consider two measures on $\mathbb{R}_+ \times \mathbb{R}^d$ both supported in $\mathbb{R}_+ \times S^{d-1}$). The condition
$$ P\big( (0, t] \times H \big) = Q\big( (0, t] \times H \big) \ \text{ for all } t > 0, \ H \in \mathcal{H} $$
is equivalent to the following one:
$$ \int_{\mathbb{R}_+} \int_{\mathbb{R}^d} I_{(0,t]}(u)\, I_{(-\infty, c]}(\langle l, x \rangle)\, P(du, dx) = \int_{\mathbb{R}_+} \int_{\mathbb{R}^d} I_{(0,t]}(u)\, I_{(-\infty, c]}(\langle l, x \rangle)\, Q(du, dx) $$
for all $l \in S^{d-1}$, $t > 0$, $c \in \mathbb{R}$. Using a standard approximation of Borel functions by simple functions, we extend this to the equality
$$ \int_{\mathbb{R}_+} \int_{\mathbb{R}^d} \varphi(u)\, \psi(\langle l, x \rangle)\, P(du, dx) = \int_{\mathbb{R}_+} \int_{\mathbb{R}^d} \varphi(u)\, \psi(\langle l, x \rangle)\, Q(du, dx), $$
which holds for all bounded Borel functions $\varphi$, $\psi$. If we set $\varphi(u) := e^{isu}$ and $\psi(u) := e^{iu}$, we obtain that the characteristic functions of $P$ and $Q$ are equal, which implies that $P = Q$.

Acknowledgments

The author would like to acknowledge the support by NSF grant DMS-0806176 and to thank anonymous referees for a number of comments and suggestions that improved this paper.

References

  1. Beran, R. Testing for ellipsoidal symmetry of a multivariate density. Ann. Statist. 1979, 7, 150–162.
  2. Romano, J. Bootstrap and randomization tests of some nonparametric hypotheses. Ann. Statist. 1989, 17, 141–159.
  3. Baringhaus, L. Testing for spherical symmetry of a multivariate distribution. Ann. Statist. 1991, 19, 899–917.
  4. Heathcote, C.R.; Rachev, S.T.; Cheng, B. Testing multivariate symmetry. J. Multivariate Analysis 1995, 54, 91–112.
  5. Koltchinskii, V.; Li, L. Testing for spherical symmetry of a multivariate distribution. J. Multivariate Analysis 1998, 65, 228–244.
  6. Manzotti, A.; Pérez, F.J.; Quiroz, A.J. A procedure for testing the null hypothesis of elliptical symmetry. J. Multivariate Analysis 2002, 81, 274–285.
  7. Koltchinskii, V.; Sakhanenko, L. Testing for ellipsoidal symmetry of a multivariate distribution. In High Dimensional Probability II; Giné, E., Mason, D., Wellner, J., Eds.; Progress in Probability; Birkhäuser: Boston, MA, USA, 2000; pp. 493–510.
  8. Huffer, F.W.; Park, C. A test for elliptical symmetry. J. Multivariate Analysis 2007, 98, 256–281.
  9. Sakhanenko, L. Testing for ellipsoidal symmetry: A comparison study. Comput. Stat. & Data Analysis 2008, 53, 565–581.
  10. Beran, R.; Millar, P. Multivariate symmetry models. In Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics; Pollard, D., Torgersen, E., Yang, G., Eds.; Springer: New York, NY, USA, 1997; pp. 13–42.
  11. Wormleighton, R. Some tests of permutation symmetry. Ann. Math. Statist. 1959, 30, 1005–1017.
  12. Li, K.-C. Sliced inverse regression for dimension reduction (with discussion). J. Am. Statist. Assoc. 1991, 86, 316–342.
  13. Petitjean, M. Chirality and symmetry measures: A transdisciplinary review. Entropy 2003, 5, 271–312.
  14. Maronna, R. Robust M-estimators of multivariate location and scatter. Ann. Statist. 1976, 4, 51–67.
  15. Hill, D.; Rao, P. Tests of symmetry based on Cramér–von Mises statistics. Biometrika 1977, 64, 489–494.
  16. Quiroz, A.J.; Dudley, R. Some new tests for multivariate normality. Probab. Theory and Related Fields 1991, 87, 521–546.
  17. Friedman, M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Statist. Assoc. 1937, 32, 675–701.
  18. van der Vaart, A.W.; Wellner, J. Weak Convergence and Empirical Processes with Applications to Statistics; Springer-Verlag: New York, NY, USA, 1996.
  19. Dudley, R.M. Uniform Central Limit Theorems; Cambridge University Press: Cambridge, UK, 1999.
  20. Cirel'son, B. The density of the distribution of the maximum of a Gaussian process. Theory Probab. Appl. 1975, 20, 847–855.
  21. Giné, E.; Zinn, J. Gaussian characterization of uniform Donsker classes. Ann. Probab. 1991, 19, 758–782.
  22. Koldobskii, A.L. Inverse problem for potentials of measures in Banach spaces. In Probab. Theory and Math. Stat.; Grigelionis, B., Prohorov, Y., Sazonov, V., Eds.; VSP-Mokslas: Vilnius, Lithuania, 1990; Vol. 1, pp. 627–637.
