1. Introduction
An interesting problem in a two-way contingency table is to investigate whether there are symmetric patterns in the data: cell probabilities on one side of the main diagonal are a mirror image of those on the other side. This problem was first discussed by Bowker [1], who gave the maximum likelihood estimator as well as a large-sample chi-square-type test for the null hypothesis of symmetry. The minimum discrimination information estimator was proposed in [2] and the minimum chi-squared estimator in [3]. In [4,5,6,7] new families of test statistics, based on $\phi$-divergence measures, were introduced. These families contain as particular cases the test statistic given by [1] as well as the likelihood ratio test.
Let $X$ and $Y$ denote two ordinal response variables, $X$ and $Y$ having $I$ levels. When we classify subjects on both variables, there are $I^{2}$ possible combinations of classifications. The responses $(X,Y)$ of a subject randomly chosen from some population have a probability distribution. Let $p_{ij}=\Pr(X=i,Y=j)$, with $p_{ij}>0$, $i,j=1,\dots,I$. We display this distribution in a rectangular table having $I$ rows for the categories of $X$ and $I$ columns for the categories of $Y$. Consider a random sample of size $n$ on $(X,Y)$, and denote by $n_{ij}$ the observed frequency in the $(i,j)$th cell for $i,j=1,\dots,I$, with $\sum_{i=1}^{I}\sum_{j=1}^{I}n_{ij}=n$.
The classical problem of testing for symmetry is given by
$$H_{0}\colon p_{ij}=p_{ji}\ \text{for all}\ i\neq j \quad\text{versus}\quad H_{1}\colon p_{ij}\neq p_{ji}\ \text{for at least one pair}\ (i,j). \qquad (1)$$
This problem was considered for the first time by Bowker [1] using the Pearson test statistic
$$X^{2}=\sum_{i<j}\frac{(n_{ij}-n_{ji})^{2}}{n_{ij}+n_{ji}}, \qquad (3)$$
for which he established that $X^{2}$ is asymptotically chi-squared distributed with $m$ degrees of freedom for large $n$, where $m=I(I-1)/2$.
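To make the computation concrete, here is a minimal sketch of Bowker's statistic and its asymptotic p-value; the function name and the 3×3 counts are ours (hypothetical), and SciPy is assumed only for the chi-square tail probability.

```python
import numpy as np
from scipy.stats import chi2

def bowker_test(counts):
    """Bowker's chi-square test of symmetry for an I x I table of counts."""
    counts = np.asarray(counts, dtype=float)
    I = counts.shape[0]
    iu = np.triu_indices(I, k=1)            # off-diagonal cells with i < j
    num = (counts[iu] - counts.T[iu]) ** 2  # (n_ij - n_ji)^2
    den = counts[iu] + counts.T[iu]         # n_ij + n_ji (assumed > 0)
    stat = float(np.sum(num / den))
    m = I * (I - 1) // 2                    # degrees of freedom
    return stat, m, chi2.sf(stat, m)

table = [[20, 10, 5],
         [ 7, 30, 6],
         [ 2,  8, 25]]                      # hypothetical data
stat, m, pval = bowker_test(table)
print(f"X2 = {stat:.3f}, df = {m}, p-value = {pval:.4f}")
```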
In some real problems (e.g., medicine, psychology, sociology, etc.) the categorical response variables $X$ and $Y$ represent the measures after and before a treatment. In such situations our interest is to determine the treatment effect, i.e., whether $X$ tends to take larger values than $Y$ (we assume that $X$ represents the measure after the treatment and $Y$ before the treatment). In the following we understand that $X$ is preferred or indifferent to $Y$, according to the joint likelihood ratio ordering, if and only if (iff) $p_{ij}\geq p_{ji}$ for all $i>j$. In this situation the alternative hypothesis is
$$H_{2}\colon p_{ij}\geq p_{ji}\ \text{for all}\ i>j,\ \text{with strict inequality for at least one pair}\ (i,j). \qquad (4)$$
This problem was first considered by El Barmi and Kochar [8], who presented the likelihood ratio test for the problem of testing $H_{0}$ against $H_{2}$ and considered its application to a real-life problem: they tested whether the vision of both eyes, for 7477 women, is the same against the alternative that the right eye has better vision than the left eye. In [5] these results were extended using $\phi$-divergence measures.
In this paper we present an overview on contingency tables with symmetry structure on the basis of divergence measures. We pay special attention to the family of $\phi$-divergence test statistics for testing $H_{0}$ versus $H_{1}$, $H_{0}$ against $H_{2}$, and also for testing $H_{2}$ against the alternative $H_{1}$ of no restrictions over the $p_{ij}$'s, i.e.,
$$H_{1}\colon p_{ij}\ \text{unrestricted},\quad i,j=1,\dots,I. \qquad (5)$$
It is interesting to observe that we consider not only $\phi$-divergence test statistics but also minimum $\phi$-divergence estimators in order to estimate the parameters of the model.
2. Phi-divergence Measures
We consider the set
$$\Delta_{I^{2}}=\Big\{p=(p_{11},\dots,p_{II})^{T}\colon p_{ij}>0,\ \sum_{i=1}^{I}\sum_{j=1}^{I}p_{ij}=1\Big\},$$
and we denote its elements by $p=(p_{11},\dots,p_{II})^{T}$, or equivalently by the $I\times I$ matrix $(p_{ij})_{i,j=1,\dots,I}$.
The $\phi$-divergence between two probability distributions $p,q\in\Delta_{I^{2}}$ was introduced independently by [9] and [10]. It is defined as follows:
$$D_{\phi}(p,q)=\sum_{i=1}^{I}\sum_{j=1}^{I}q_{ij}\,\phi\Big(\frac{p_{ij}}{q_{ij}}\Big),\qquad \phi\in\Phi^{*},$$
where $\Phi^{*}$ is the class of all convex functions $\phi\colon(0,\infty)\to\mathbb{R}$, such that $\phi(1)=0$, $\phi''(1)>0$; and we define $0\,\phi(0/0)=0$ and $0\,\phi(p/0)=p\lim_{u\to\infty}\phi(u)/u$. For every $\phi\in\Phi^{*}$ that is differentiable at $x=1$, the function $\psi$ given by
$$\psi(x)=\phi(x)-\phi'(1)(x-1)$$
also belongs to $\Phi^{*}$. Then we have $D_{\psi}(p,q)=D_{\phi}(p,q)$, and $\psi$ has the additional property that $\psi'(1)=0$. Because the two divergence measures are equivalent, we can consider the set $\Phi^{*}$ to be equivalent to the set $\Phi\equiv\Phi^{*}\cap\{\phi\colon\phi'(1)=0\}$.
An important family of $\phi$-divergences in statistical problems is the power-divergence family
$$\phi_{(\lambda)}(x)=\frac{x^{\lambda+1}-x-\lambda(x-1)}{\lambda(\lambda+1)},\qquad \lambda\neq 0,\ \lambda\neq -1, \qquad (8)$$
which was introduced and studied by [11]. Notice that $\phi_{(0)}(x)=\lim_{\lambda\to 0}\phi_{(\lambda)}(x)=x\log x-x+1$ and $\phi_{(-1)}(x)=\lim_{\lambda\to -1}\phi_{(\lambda)}(x)=-\log x+x-1$, so the family is defined for every $\lambda\in\mathbb{R}$. In the following we shall denote the power-divergence measures by $D_{\phi_{(\lambda)}}(p,q)$, $\lambda\in\mathbb{R}$. For more details about $\phi$-divergence measures see [12].
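As a small illustration of these definitions, the following sketch evaluates $D_{\phi}(p,q)$ for members of the power-divergence family; the function names are ours, and the limit cases $\lambda=0$ and $\lambda=-1$ are coded explicitly.

```python
import numpy as np

def phi_power(lam):
    """phi_(lambda) of the power-divergence family (8); limits at 0 and -1."""
    if lam == 0.0:
        return lambda x: x * np.log(x) - x + 1.0   # Kullback-Leibler
    if lam == -1.0:
        return lambda x: -np.log(x) + x - 1.0      # modified Kullback-Leibler
    return lambda x: (x ** (lam + 1) - x - lam * (x - 1)) / (lam * (lam + 1))

def phi_divergence(p, q, phi):
    """D_phi(p, q) = sum_i q_i * phi(p_i / q_i), for strictly positive p, q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(q * phi(p / q)))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.3, 0.4, 0.3])
for lam in (1.0, 2 / 3, 0.0, -0.5):
    print(f"lambda = {lam:5.2f}  D = {phi_divergence(p, q, phi_power(lam)):.5f}")
```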
3. Hypothesis Testing: $H_0$ versus $H_1$
We define $\theta$ as the vector of cell probabilities $p_{ij}$ with $i\leq j$ (dropping one of them, since the probabilities add up to one), so that $\theta$ ranges over an open set $\Theta\subset\mathbb{R}^{I(I+1)/2-1}$; the hypothesis (1) can be written as
$$H_{0}\colon p=g(\theta),\quad \theta\in\Theta,$$
where the function $g\colon\Theta\to\Delta_{I^{2}}$ is defined by $g(\theta)=(g_{11}(\theta),\dots,g_{II}(\theta))^{T}$, with
$$g_{ij}(\theta)=p_{ij}\ \text{for}\ i\leq j\quad\text{and}\quad g_{ij}(\theta)=p_{ji}\ \text{for}\ i>j.$$
Note that the symmetry model is a regular parametric multinomial model, where the number of free parameters is $I(I+1)/2-1$.
The maximum likelihood estimator (MLE) of $\theta$ can be defined as
$$\hat\theta=\arg\min_{\theta\in\Theta}D_{KL}\big(\hat p,g(\theta)\big),$$
where $D_{KL}$ is the Kullback–Leibler divergence measure (see [13,14]) defined by
$$D_{KL}(p,q)=\sum_{i=1}^{I}\sum_{j=1}^{I}p_{ij}\log\frac{p_{ij}}{q_{ij}}.$$
We denote by $\hat p=(n_{11}/n,\dots,n_{II}/n)^{T}$ the vector of relative frequencies and by $\hat p^{S}=g(\hat\theta)$ the MLE of $p$ under the symmetry model. It is well known that
$$\hat p^{S}_{ij}=\frac{n_{ij}+n_{ji}}{2n},\quad i,j=1,\dots,I.$$
Using the ideas developed in [15], we can consider the minimum $\phi$-divergence estimator ($\hat\theta^{\phi_{2}}$), replacing the Kullback–Leibler divergence by a $\phi$-divergence measure in the following way:
$$\hat\theta^{\phi_{2}}=\arg\min_{\theta\in\Theta}D_{\phi_{2}}\big(\hat p,g(\theta)\big),$$
where $\phi_{2}\in\Phi^{*}$. We denote $\hat p^{\phi_{2}}=g(\hat\theta^{\phi_{2}})$, and we have (see [7,16]) that, under $H_{0}$ with true parameter $\theta_{0}$,
$$\hat\theta^{\phi_{2}}=\theta_{0}+A(\theta_{0})\big(\hat p-g(\theta_{0})\big)+o\big(\|\hat p-g(\theta_{0})\|\big),$$
where $A(\theta_{0})$ is a matrix that depends on the model $g$ but not on $\phi_{2}$; in particular, all the minimum $\phi$-divergence estimators share the same asymptotic distribution. It is not difficult to establish that the asymptotic variance–covariance matrix of $\sqrt n\,(\hat\theta^{\phi_{2}}-\theta_{0})$ can be written as $I_{F}(\theta_{0})^{-1}$, where $I_{F}(\theta)$ is the Fisher information matrix corresponding to the model $g$; that is, $\hat\theta^{\phi_{2}}$ is a best asymptotically normal (BAN) estimator.
If we consider the family of power divergences we get the minimum power-divergence estimator, $\hat p^{(\lambda)}$, of $p$, under the hypothesis of symmetry, whose expression is given by
$$\hat p^{(\lambda)}_{ij}=\frac{\Big(\dfrac{n_{ij}^{\lambda+1}+n_{ji}^{\lambda+1}}{2}\Big)^{1/(\lambda+1)}}{\sum_{a=1}^{I}\sum_{b=1}^{I}\Big(\dfrac{n_{ab}^{\lambda+1}+n_{ba}^{\lambda+1}}{2}\Big)^{1/(\lambda+1)}},\quad i,j=1,\dots,I.$$
For $\lambda=0$ we get
$$\hat p^{(0)}_{ij}=\frac{n_{ij}+n_{ji}}{2n};$$
hence, we obtain the maximum likelihood estimator for symmetry introduced by [1]. For $\lambda=-1$, we obtain as a limit case
$$\hat p^{(-1)}_{ij}=\frac{(n_{ij}n_{ji})^{1/2}}{\sum_{a=1}^{I}\sum_{b=1}^{I}(n_{ab}n_{ba})^{1/2}},$$
i.e., the minimum discrimination information estimator for symmetry introduced and studied in [2]. For $\lambda=1$ we get the minimum chi-squared estimator for symmetry introduced in [3],
$$\hat p^{(1)}_{ij}=\frac{\big((n_{ij}^{2}+n_{ji}^{2})/2\big)^{1/2}}{\sum_{a=1}^{I}\sum_{b=1}^{I}\big((n_{ab}^{2}+n_{ba}^{2})/2\big)^{1/2}}.$$
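A minimal numerical sketch of this closed-form estimator (the function name is ours): for $\lambda=0$ it reproduces the arithmetic-mean MLE $(n_{ij}+n_{ji})/(2n)$, and $\lambda=-1$ is coded as its geometric-mean limit.

```python
import numpy as np

def min_power_div_symmetry(counts, lam):
    """Minimum power-divergence estimator of p under symmetry (closed form)."""
    counts = np.asarray(counts, dtype=float)
    if lam == -1.0:
        a = np.sqrt(counts * counts.T)   # limit case: geometric mean
    else:
        a = ((counts ** (lam + 1) + counts.T ** (lam + 1)) / 2.0) ** (1.0 / (lam + 1))
    return a / a.sum()                   # normalize to a probability vector

counts = np.array([[20., 10., 5.],
                   [ 7., 30., 6.],
                   [ 2.,  8., 25.]])     # hypothetical data
print(min_power_div_symmetry(counts, 0.0))   # MLE: (n_ij + n_ji) / (2n)
print(min_power_div_symmetry(counts, 1.0))   # minimum chi-squared estimator
```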
We denote by $\hat p^{\phi_{2}}=g(\hat\theta^{\phi_{2}})$ the minimum $\phi$-divergence estimator of the probability vector that characterizes the symmetry model. Based on $\hat p^{\phi_{2}}$ it is possible to define a new family of statistics for testing (1) that contains as particular cases the Pearson test statistic as well as the likelihood ratio test. This family of statistics is given by
$$T_{n}^{\phi_{1},\phi_{2}}=\frac{2n}{\phi_{1}''(1)}\,D_{\phi_{1}}\big(\hat p,\hat p^{\phi_{2}}\big). \qquad (13)$$
We can observe that the family (13) involves two functions, $\phi_{1}$ and $\phi_{2}$, both belonging to $\Phi^{*}$. We use the function $\phi_{2}$ to obtain the minimum $\phi$-divergence estimator $\hat p^{\phi_{2}}$ and $\phi_{1}$ to obtain the family of statistics. If we consider $\phi_{1}(x)=\tfrac12(x-1)^{2}$ and $\phi_{2}(x)=x\log x-x+1$ we get the Pearson test statistic whose expression was given in (3), and for $\phi_{1}(x)=\phi_{2}(x)=x\log x-x+1$ we get the likelihood ratio test given by
$$G^{2}=2\sum_{i=1}^{I}\sum_{j=1}^{I}n_{ij}\log\frac{2n_{ij}}{n_{ij}+n_{ji}}.$$
In the following theorem the asymptotic distribution of $T_{n}^{\phi_{1},\phi_{2}}$ is obtained.
Theorem 1 The asymptotic distribution of $T_{n}^{\phi_{1},\phi_{2}}$, under the hypothesis of symmetry, is chi-squared with $m=I(I-1)/2$ degrees of freedom.
Thus, for a given significance level $\alpha$, the critical value of $T_{n}^{\phi_{1},\phi_{2}}$ may be approximated by $\chi^{2}_{m,\alpha}$, the upper $100\alpha$ percentage point of the chi-square distribution with $m$ degrees of freedom; i.e., reject the hypothesis of symmetry iff
$$T_{n}^{\phi_{1},\phi_{2}}\geq\chi^{2}_{m,\alpha}. \qquad (15)$$
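Combining the previous sketches, the family (13) and the rejection rule (15) can be coded in a few lines; `phi_power`, `phi_divergence` and `min_power_div_symmetry` are the helpers defined in the earlier sketches, and all counts are assumed positive so that the divergences are finite.

```python
import numpy as np
from scipy.stats import chi2

def symmetry_test(counts, lam1, lam2, alpha=0.05):
    """T_n^{phi1,phi2} of (13) with phi1 = phi_(lam1) and phi2 = phi_(lam2);
    note phi''(1) = 1 for every member of the power-divergence family."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p_hat = counts / n                               # empirical distribution
    p_sym = min_power_div_symmetry(counts, lam2)     # estimator under symmetry
    stat = 2.0 * n * phi_divergence(p_hat.ravel(), p_sym.ravel(), phi_power(lam1))
    m = counts.shape[0] * (counts.shape[0] - 1) // 2
    return stat, chi2.sf(stat, m), bool(stat >= chi2.ppf(1 - alpha, m))

print(symmetry_test(counts, lam1=1.0, lam2=0.0))   # Bowker's statistic (3)
print(symmetry_test(counts, lam1=0.0, lam2=0.0))   # likelihood ratio test
```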
Now we are going to analyze the power of the test. Let $q=(q_{11},\dots,q_{II})^{T}$ be a point in the alternative hypothesis, i.e., there exist at least two indexes $i$ and $j$ for which $q_{ij}\neq q_{ji}$. We denote by $q^{\phi_{2}}$ the point on $\mathcal{S}$ verifying
$$D_{\phi_{2}}\big(q,q^{\phi_{2}}\big)=\min_{p\in\mathcal{S}}D_{\phi_{2}}(q,p),$$
where $\mathcal{S}$ is given by the set $\{g(\theta)\colon\theta\in\Theta\}$ of symmetric probability vectors. It is clear that $q\notin\mathcal{S}$ and $q^{\phi_{2}}\in\mathcal{S}$, with $q^{\phi_{2}}_{ij}=q^{\phi_{2}}_{ji}$ for all $i,j$. The notation $q^{\phi_{2}}$ indicates that the elements of the vector depend on $\phi_{2}$. For instance, for the power-divergence family $\phi_{(\lambda)}$ we have
$$q^{(\lambda)}_{ij}=\frac{\Big(\dfrac{q_{ij}^{\lambda+1}+q_{ji}^{\lambda+1}}{2}\Big)^{1/(\lambda+1)}}{\sum_{a=1}^{I}\sum_{b=1}^{I}\Big(\dfrac{q_{ab}^{\lambda+1}+q_{ba}^{\lambda+1}}{2}\Big)^{1/(\lambda+1)}}.$$
We also denote by $\hat p$ the vector of relative frequencies and by $\hat p^{\phi_{2}}$ the minimum $\phi_{2}$-divergence estimator; if the alternative $q$ is true we have that $\hat p$ tends to $q$ and $\hat p^{\phi_{2}}$ to $q^{\phi_{2}}$ in probability.
If we define the function $f(p)=D_{\phi_{1}}(p,p^{\phi_{2}})$, we have, by a first-order Taylor expansion,
$$f(\hat p)=f(q)+\nabla f(q)^{T}(\hat p-q)+o(\|\hat p-q\|).$$
Then the random variables $\sqrt n\,\big(f(\hat p)-f(q)\big)$ and $\sqrt n\,\nabla f(q)^{T}(\hat p-q)$ have the same asymptotic distribution. If we define $\Sigma_{q}=\operatorname{diag}(q)-qq^{T}$ and $\sigma^{2}_{\phi_{1}}(q)=\nabla f(q)^{T}\Sigma_{q}\nabla f(q)$, we have
$$\sqrt n\,\big(f(\hat p)-f(q)\big)\xrightarrow{\ L\ }N\big(0,\sigma^{2}_{\phi_{1}}(q)\big),$$
where $\xrightarrow{\ L\ }$ denotes convergence in law.
If we consider the maximum likelihood estimator instead of the minimum $\phi$-divergence estimator, i.e., $\phi_{2}(x)=x\log x-x+1$, we get the same expression with $q^{\phi_{2}}$ replaced by the symmetric vector with entries $(q_{ij}+q_{ji})/2$. It is also interesting to observe, if we consider the power divergence measure, that the gradient $\nabla f(q)$, and hence $\sigma^{2}_{\phi_{1}}(q)$, can be obtained in explicit form. For $\lambda=1$ and $\lambda=0$ we get the Pearson and the likelihood ratio test statistics, respectively, and the corresponding asymptotic variances follow from the general expression for $\sigma^{2}_{\phi_{1}}(q)$.
Based on the previous result we can formulate the following theorem.
Theorem 2 The asymptotic power for the test given in (15), at the alternative $q$, is given by
$$\beta_{n}(q)=1-\Phi_{n}\!\left(\frac{\sqrt n}{\sigma_{\phi_{1}}(q)}\left(\frac{\phi_{1}''(1)}{2n}\,\chi^{2}_{m,\alpha}-D_{\phi_{1}}\big(q,q^{\phi_{2}}\big)\right)\right),$$
where $\{\Phi_{n}\}$ is a sequence of distribution functions tending uniformly to the standard normal distribution function $\Phi$.
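The finite-sample accuracy of this normal approximation can be checked by simulation; the sketch below estimates the power of the test (15) at a fixed asymmetric alternative by Monte Carlo, reusing `symmetry_test` from the previous sketch (the alternative `q_alt` is hypothetical, and a small continuity guard replaces empty cells).

```python
import numpy as np
from numpy.random import default_rng

def mc_power(q, lam1, lam2, n, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo estimate of the power of the test (15) at alternative q."""
    rng = default_rng(seed)
    q = np.asarray(q, dtype=float)
    I = q.shape[0]
    rejections = 0
    for _ in range(reps):
        sample = rng.multinomial(n, q.ravel()).reshape(I, I).astype(float)
        sample = np.maximum(sample, 0.5)   # guard against empty cells
        rejections += symmetry_test(sample, lam1, lam2, alpha)[2]
    return rejections / reps

q_alt = np.array([[0.20, 0.12, 0.05],
                  [0.08, 0.25, 0.08],
                  [0.03, 0.04, 0.15]])     # hypothetical asymmetric alternative
print(mc_power(q_alt, lam1=1.0, lam2=0.0, n=200))
```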
We consider a contiguous sequence of alternative hypotheses that approaches the null hypothesis $p=g(\theta_{0})$, for some unknown $\theta_{0}$, at the rate $O(n^{-1/2})$. Consider the multinomial probability vector
$$p_{n}=g(\theta_{0})+\frac{d}{\sqrt n}, \qquad (18)$$
where $d=(d_{11},\dots,d_{II})^{T}$ is a fixed $I^{2}\times 1$ vector such that $\sum_{i=1}^{I}\sum_{j=1}^{I}d_{ij}=0$; recall that $n$ is the total count parameter of the multinomial distribution, so that $p_{n}\in\Delta_{I^{2}}$ for $n$ large enough. As $n\to\infty$, the sequence of multinomial probabilities $\{p_{n}\}$, with $p_{n}$ given in (18), converges to a multinomial probability in $H_{0}$ at the rate of $O(n^{-1/2})$. Let
$$H_{1,n}\colon p=p_{n}.$$
In the next theorem we present the asymptotic distribution of the family of test statistics $T_{n}^{\phi_{1},\phi_{2}}$ defined in (13), under the contiguous alternative hypotheses given in (18).
Theorem 3 Under $H_{1,n}$, given in (18), the family of test statistics $T_{n}^{\phi_{1},\phi_{2}}$ is asymptotically noncentral chi-squared distributed with $m=I(I-1)/2$ degrees of freedom and noncentrality parameter $\delta$ depending on $d$ and $\theta_{0}$.
An interesting simulation study can be seen in [7]. In that study some interesting alternatives to the classical Pearson and likelihood ratio test statistics emerge.
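Once the noncentrality parameter $\delta$ of Theorem 3 is available, the local power at level $\alpha$ is a one-line computation with the noncentral chi-square distribution; the value of $\delta$ below is purely illustrative.

```python
from scipy.stats import chi2, ncx2

def local_power(delta, m, alpha=0.05):
    """Asymptotic power against a contiguous alternative (Theorem 3)."""
    return ncx2.sf(chi2.ppf(1 - alpha, m), m, delta)

print(local_power(delta=4.0, m=3))   # e.g. I = 3, so m = I(I-1)/2 = 3
```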
4. Hypothesis Testing: $H_0$ versus $H_2$ and $H_2$ versus $H_1$
In this section we consider the three hypotheses $H_{0}$, $H_{2}$, $H_{1}$ given in (1), (4), (5) respectively, and some test statistics based on the $\phi$-divergences
$$D_{\phi}\big(\hat p^{(2)},\hat p^{(0)}\big)\quad\text{and}\quad D_{\phi}\big(\hat p,\hat p^{(2)}\big) \qquad (19)$$
for testing $H_{0}$ against $H_{2}$ and $H_{2}$ against $H_{1}$.
In the expression (19), $\hat p$ is the maximum likelihood estimator (MLE) of $p$ given by $\hat p=(\hat p_{11},\dots,\hat p_{II})^{T}$, where $\hat p_{ij}=n_{ij}/n$, and $\hat p^{(0)}$ and $\hat p^{(2)}$ denote the MLEs of $p$ under $H_{0}$ and $H_{2}$ respectively. These MLEs were obtained by [8]. Let
$$s_{ij}=\frac{n_{ij}+n_{ji}}{2};$$
then $\hat p^{(2)}$ coincides with the unrestricted MLE, $\hat p^{(2)}_{ij}=n_{ij}/n$ (for the pairs $i>j$ with $n_{ij}\geq n_{ji}$), pools the two symmetric cells, $\hat p^{(2)}_{ij}=\hat p^{(2)}_{ji}=s_{ij}/n$ (for the pairs $i>j$ with $n_{ij}<n_{ji}$), and $\hat p^{(2)}_{ii}=n_{ii}/n$. It follows that $\hat p^{(0)}$ and $\hat p^{(2)}$ are given by
$$\hat p^{(0)}_{ij}=\frac{s_{ij}}{n},\quad i,j=1,\dots,I,$$
and
$$\hat p^{(2)}_{ij}=\begin{cases}\max(n_{ij},s_{ij})/n, & i>j,\\ \min(n_{ij},s_{ij})/n, & i<j,\\ n_{ij}/n, & i=j.\end{cases}$$
Then we have $\hat p^{(2)}=\hat p$ whenever the sample itself satisfies the order restriction, and $\hat p^{(2)}=\hat p^{(0)}$ whenever the restriction is reversed in every pair of symmetric cells.
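A sketch of the three estimators as reconstructed above (the function name is ours; counts are assumed positive):

```python
import numpy as np

def mle_three_hypotheses(counts):
    """MLEs of p: unrestricted (H1), under symmetry (H0), and under the
    likelihood-ratio ordering (H2), via the max/min construction of [8]."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p_unres = counts / n                        # MLE under H1
    s = (counts + counts.T) / 2.0               # pooled counts s_ij
    p_sym = s / n                               # MLE under H0
    low = np.tril_indices_from(counts, k=-1)    # cells with i > j
    p_ord = counts.copy()
    p_ord[low] = np.maximum(counts[low], s[low])       # i > j: max(n_ij, s_ij)
    p_ord.T[low] = np.minimum(counts.T[low], s[low])   # i < j: min(n_ij, s_ij)
    return p_unres, p_sym, p_ord / n            # diagonal stays n_ii / n

counts = np.array([[20., 10., 5.],
                   [ 7., 30., 6.],
                   [ 2.,  8., 25.]])            # hypothetical data
p_hat, p0, p2 = mle_three_hypotheses(counts)
```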
To solve the problem of testing $H_{0}$ against $H_{2}$, [8] considered the likelihood ratio test statistic
$$T_{01}=2\sum_{i=1}^{I}\sum_{j=1}^{I}n_{ij}\log\frac{\hat p^{(2)}_{ij}}{\hat p^{(0)}_{ij}}. \qquad (20)$$
This statistic is such that
$$T_{01}=2n\,D_{KL}\big(\hat p^{(2)},\hat p^{(0)}\big),$$
where $D_{KL}$ is the Kullback–Leibler divergence measure defined in Section 3, with $\hat p^{(2)}$ and $\hat p^{(0)}$ defined above. Then the likelihood ratio test statistic is based on the closeness, in terms of the Kullback–Leibler divergence measure, between the probability distributions $\hat p^{(2)}$ and $\hat p^{(0)}$. Thus, one could measure the closeness between the two probability distributions using a more general divergence measure, provided we are able to obtain its asymptotic distribution. One appropriate family of divergence measures for that purpose is the family of $\phi$-divergence measures.
As a generalization of the test statistic given in (20) for testing $H_{0}$ against $H_{2}$, we introduce the family of test statistics
$$S_{01}^{\phi}=\frac{2n}{\phi''(1)}\,D_{\phi}\big(\hat p^{(2)},\hat p^{(0)}\big). \qquad (21)$$
To test $H_{2}$ against $H_{1}$, El Barmi and Kochar [8] considered the likelihood ratio test statistic
$$T_{12}=2\sum_{i=1}^{I}\sum_{j=1}^{I}n_{ij}\log\frac{\hat p_{ij}}{\hat p^{(2)}_{ij}}.$$
It is clear that
$$T_{12}=2n\,D_{KL}\big(\hat p,\hat p^{(2)}\big).$$
As a generalization of this test statistic we consider in this paper the family of test statistics
$$S_{12}^{\phi}=\frac{2n}{\phi''(1)}\,D_{\phi}\big(\hat p,\hat p^{(2)}\big). \qquad (22)$$
If $\phi(x)=x\log x-x+1$ then $S_{01}^{\phi}=T_{01}$ and $S_{12}^{\phi}=T_{12}$, and hence the families of test statistics $S_{01}^{\phi}$ and $S_{12}^{\phi}$ can be considered as generalizations of the test statistics $T_{01}$ and $T_{12}$, respectively.
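Reusing the helpers from the earlier sketches (`phi_power`, `phi_divergence`, `mle_three_hypotheses`), the power-divergence versions of (21) and (22) are immediate; again all counts are assumed positive.

```python
def s01_s12(counts, lam):
    """Power-divergence statistics S01 of (21) and S12 of (22)."""
    n = np.asarray(counts, dtype=float).sum()
    p_hat, p0, p2 = mle_three_hypotheses(counts)
    phi = phi_power(lam)                 # phi''(1) = 1 for this family
    s01 = 2.0 * n * phi_divergence(p2.ravel(), p0.ravel(), phi)
    s12 = 2.0 * n * phi_divergence(p_hat.ravel(), p2.ravel(), phi)
    return s01, s12

print(s01_s12(counts, lam=0.0))          # (T01, T12) of El Barmi and Kochar [8]
```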
In order to get the asymptotic distribution of the test statistics given in (21) and (22), we first define the so-called chi-bar squared distribution with $n$ degrees of freedom, denoted by $\bar\chi^{2}_{n}$.
Definition 4 Let $U=\max(0,Z)$, where $Z\sim N(0,1)$, so that the c.d.f. of $U$ is given by
$$F_{U}(u)=\Phi(u)\ \text{for}\ u\geq 0\quad\text{and}\quad F_{U}(u)=0\ \text{for}\ u<0,$$
where $\Phi$ denotes the standard normal cumulative distribution function. Let $V=\sum_{i=1}^{n}U_{i}^{2}$, where $U_{1},\dots,U_{n}$ are independent and distributed like $U$; then $V\sim\bar\chi^{2}_{n}$.
It is readily shown that
$$\Pr\big(U^{2}\leq c\big)=\tfrac12+\tfrac12\Pr\big(\chi^{2}_{1}\leq c\big),\quad c\geq 0.$$
This distribution is related to the $\chi^{2}$ distribution. It can be readily shown that
$$\Pr\big(\bar\chi^{2}_{n}\leq c\big)=\sum_{l=0}^{n}\binom{n}{l}\Big(\frac{1}{2}\Big)^{n}\Pr\big(\chi^{2}_{l}\leq c\big),$$
where $\chi^{2}_{0}$ denotes the distribution degenerate at zero, by conditioning on $L$, the number of non-zero $U_{i}$s.
Furthermore, like the $\chi^{2}_{n}$ distribution, the $\bar\chi^{2}_{n}$ distribution is stochastically increasing with $n$. If $V\sim\bar\chi^{2}_{n}$ and $V'\sim\bar\chi^{2}_{n'}$, where $n<n'$, then $V$ is stochastically smaller than $V'$. This follows since
$$V'\stackrel{d}{=}V+W,$$
where $W\sim\bar\chi^{2}_{n'-n}$, with $V$ and $W$ independent. For more details about the chi-bar squared distribution see [17].
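The binomial mixture in Definition 4 translates directly into code; the following sketch (our function name) computes the tail probability $\Pr(\bar\chi^{2}_{n}\geq c)$, which is what the tests below need as a p-value.

```python
from math import comb
from scipy.stats import chi2

def chibar2_sf(c, n):
    """P(chibar2_n >= c): a Binomial(n, 1/2) mixture of chi-square tails;
    the l = 0 component is a point mass at zero and contributes nothing."""
    if c <= 0:
        return 1.0
    return sum(comb(n, l) * 0.5 ** n * chi2.sf(c, l) for l in range(1, n + 1))

print(chibar2_sf(4.0, 3))
```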
The following theorem presents the asymptotic distribution of $S_{01}^{\phi}$.
Theorem 5 Under $H_{0}$, as $n\to\infty$,
$$S_{01}^{\phi}\xrightarrow{\ L\ }\bar\chi^{2}_{m},$$
where $m=I(I-1)/2$.
If we consider the family of power divergences given in (8), we have the power divergence family of test statistics defined as
$$S_{01}^{(\lambda)}=\frac{2n}{\lambda(\lambda+1)}\sum_{i=1}^{I}\sum_{j=1}^{I}\hat p^{(2)}_{ij}\left[\Big(\frac{\hat p^{(2)}_{ij}}{\hat p^{(0)}_{ij}}\Big)^{\lambda}-1\right],\quad \lambda\neq 0,-1,$$
with the cases $\lambda=0$ and $\lambda=-1$ obtained as limits, which can be used for testing $H_{0}$ against $H_{2}$. Therefore some important statistics can now be expressed as members of the power divergence family of test statistics $S_{01}^{(\lambda)}$; that is, $S_{01}^{(1)}$ is the Pearson-type test statistic, $S_{01}^{(-1/2)}$ is the Freeman–Tukey test statistic, $S_{01}^{(-2)}$ is the Neyman-modified test statistic, $S_{01}^{(-1)}$ is the modified loglikelihood ratio test statistic, $S_{01}^{(0)}$ is the loglikelihood ratio test statistic ($T_{01}$) introduced by [8], and $S_{01}^{(2/3)}$ is the Cressie–Read test statistic (see [11]).
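As a usage example, the sketch below evaluates several members of this family together with their asymptotic $\bar\chi^{2}_{m}$ p-values, reusing `s01_s12` and `chibar2_sf` from the previous sketches (the counts are the same hypothetical table as before).

```python
I = counts.shape[0]
m = I * (I - 1) // 2
for lam in (1.0, 2 / 3, 0.0, -0.5, -1.0, -2.0):
    s01, _ = s01_s12(counts, lam)
    print(f"lambda = {lam:5.2f}  S01 = {s01:7.3f}  p = {chibar2_sf(s01, m):.4f}")
```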
Theorem 6 Under $H_{2}$, as $n\to\infty$,
$$S_{12}^{\phi}\xrightarrow{\ L\ }\bar\chi^{2}_{M},$$
where $M$ is the number of elements in the set $\{(i,j)\colon i>j,\ p_{ij}=p_{ji}\}$.
If we consider the family of power divergences given in (8), we have the power divergence family of test statistics defined as
$$S_{12}^{(\lambda)}=\frac{2n}{\lambda(\lambda+1)}\sum_{i=1}^{I}\sum_{j=1}^{I}\hat p_{ij}\left[\Big(\frac{\hat p_{ij}}{\hat p^{(2)}_{ij}}\Big)^{\lambda}-1\right],\quad \lambda\neq 0,-1,$$
which we can use for testing $H_{2}$ against $H_{1}$.
Remark 7 In the same way as previously we can obtain the test statistics $S_{12}^{(1)}$, $S_{12}^{(-1/2)}$, $S_{12}^{(-2)}$, $S_{12}^{(-1)}$, $S_{12}^{(0)}$ and $S_{12}^{(2/3)}$.
We will refer here to the example of [18, Section 9.5], where the test proposed by Bowker [1] is applied. The tests proposed in this paper may be used in a situation in which it is hoped that a new formulation of a drug will reduce some side-effects.
Example We consider 158 patients who have been treated with the old formulation and for whom records are available of any side-effects. We now treat each patient with the new formulation and note the incidence of side-effects. Table 1 shows a possible outcome for such an experiment. Do the data in Table 1 provide any evidence of less severe side-effects with the new formulation of the drug?
The two test statistics given in (21) and (22) are appropriate for this problem. For the test statistic $S_{12}^{\phi}$ given in (22), the null hypothesis is that for all off-diagonal counts in the table the associated probabilities are such that $p_{ij}\geq p_{ji}$ for all $i>j$; the alternative places no restriction on the $p_{ij}$'s. We have computed the members of the family $S_{12}^{(\lambda)}$ given in Remark 7 and the corresponding asymptotic p-values.
On the other hand, if we consider the usual Pearson test statistic $X^{2}$ given in (3), using the chi-squared distribution with 3 degrees of freedom (the corresponding asymptotic distribution found by Bowker [1]), the asymptotic p-value also leads to rejection. Then for all the considered statistics there is evidence of a differing incidence rate for side-effects under the two formulations; moreover, this difference is towards less severe side-effects under the new formulation. Therefore, the two considered tests lead to the same conclusion: there is strong evidence of a higher incidence rate of side-effects under the old formulation. The conclusion obtained in [18] is in accordance with our conclusion.