A Generalized Two-Dimensional Index to Measure the Degree of Deviation from Double Symmetry in Square Contingency Tables

Abstract: The double symmetry model satisfies both the symmetry and point symmetry models simultaneously. To measure the degree of deviation from the double symmetry model, a two-dimensional index that can concurrently measure the degree of deviation from symmetry and point symmetry is considered. This two-dimensional index is constructed by combining two existing indexes. Although the existing indexes are constructed using power divergence, the existing two-dimensional index that can concurrently measure both symmetries is constructed using only Kullback-Leibler information, which is a special case of power divergence. Previous studies note the importance of using several indexes of divergence to compare the degrees of deviation from a model for several square contingency tables. This study, therefore, proposes a two-dimensional index based on power divergence in order to measure deviation from double symmetry for square contingency tables. Numerical examples show the utility of the proposed two-dimensional index using two datasets.


Introduction
Consider an $r \times r$ square contingency table that has the same row and column classifications with nominal categories. Let $\pi_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table ($i = 1, \ldots, r$; $j = 1, \ldots, r$).
The symmetry (S) model proposed by Bowker [1] is defined by $\pi_{ij} = \pi_{ji}$ for $i, j = 1, \ldots, r$. This S model is the most commonly used model for analyzing square contingency tables [2][3][4].
The point symmetry (PS) model proposed by Wall and Lienert [5] is defined by $\pi_{ij} = \pi_{i^* j^*}$ for $i, j = 1, \ldots, r$, where $i^* = r + 1 - i$ and $j^* = r + 1 - j$. This PS model takes the center of the square contingency table as the point of symmetry.
The double symmetry (DS) model is defined by $\pi_{ij} = \pi_{ji} = \pi_{i^* j^*} = \pi_{j^* i^*}$ for $i, j = 1, \ldots, r$. This DS model indicates that both the S and PS models hold. When a model does not hold, we may be interested in measuring the degree of deviation from the model. For square contingency tables with nominal categories, Tomizawa et al. [7] proposed an index $\Phi_S^{(\lambda)}$ that represents the degree of deviation from the S model, Tomizawa et al. [8] proposed an index $\Phi_{PS}^{(\lambda)}$ that represents the degree of deviation from the PS model, and Yamamoto et al. [9] proposed an index $\Phi_{DS}^{(\lambda)}$ that represents the degree of deviation from the DS model.
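The three model definitions can be checked directly on a table of cell probabilities. The following is a minimal sketch, assuming NumPy and 0-based indices (so that $i^* = r + 1 - i$ becomes `r - 1 - i`); the function names and the three example tables are illustrative and do not come from the paper.

```python
import numpy as np

def is_symmetric(pi, tol=1e-12):
    """S model: pi[i, j] == pi[j, i] for all i, j."""
    pi = np.asarray(pi, dtype=float)
    return bool(np.allclose(pi, pi.T, atol=tol))

def is_point_symmetric(pi, tol=1e-12):
    """PS model: pi[i, j] == pi[r-1-i, r-1-j] (0-based form of i*, j*)."""
    pi = np.asarray(pi, dtype=float)
    return bool(np.allclose(pi, pi[::-1, ::-1], atol=tol))

def is_double_symmetric(pi, tol=1e-12):
    """DS model: the S and PS models hold simultaneously."""
    return is_symmetric(pi, tol) and is_point_symmetric(pi, tol)

# Made-up 3 x 3 probability tables, each summing to 1.
ds_table = np.array([[0.10, 0.15, 0.05],
                     [0.15, 0.10, 0.15],
                     [0.05, 0.15, 0.10]])   # S and PS both hold
s_only = np.array([[0.20, 0.10, 0.05],
                   [0.10, 0.10, 0.15],
                   [0.05, 0.15, 0.10]])     # S holds, PS does not
ps_only = np.array([[0.10, 0.20, 0.05],
                    [0.10, 0.10, 0.10],
                    [0.05, 0.20, 0.10]])    # PS holds, S does not
```

Reversing both axes with `pi[::-1, ::-1]` is the array form of reflecting each cell through the center of the table.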
This study focuses on the index that represents the degree of deviation from the DS model. Although the DS model satisfies both the S and PS models simultaneously, the index $\Phi_{DS}^{(\lambda)}$ cannot concurrently measure the degrees of deviation from S and PS. To address this gap, Ando et al. [10] proposed a two-dimensional index that can measure both concurrently. This two-dimensional index was constructed by combining the existing indexes $\Phi_S^{(\lambda)}$ and $\Phi_{PS}^{(\lambda)}$ at $\lambda = 0$. Ando et al. [10] pointed out that a two-dimensional index, rather than a univariate index, is necessary because the existing indexes for S and PS are not independent. Ando et al. [10] considered three datasets: (1) the degree of deviation from the S model is large but the degree of deviation from the PS model is small, (2) the degree of deviation from the S model is small but the degree of deviation from the PS model is large, and (3) the degrees of deviation from both the S model and the PS model are large. Using these datasets, which have different structures with respect to the deviation from the DS model, Ando et al. [10] showed that all the values of the index $\Phi_{DS}^{(\lambda)}$ applied to these datasets are the same, whereas all the values of the two-dimensional index are different. Thus, this two-dimensional index gives more detailed results than the index $\Phi_{DS}^{(\lambda)}$. However, the indexes $\Phi_S^{(\lambda)}$, $\Phi_{PS}^{(\lambda)}$, and $\Phi_{DS}^{(\lambda)}$ are constructed using power divergence, while the two-dimensional index is constructed using only Kullback-Leibler information, which is a special case of power divergence. Power divergence includes several divergences as special cases: for example, the power divergence with $\lambda = -0.5$ is equivalent to the Freeman-Tukey type divergence, and the power divergence with $\lambda = 1$ is equivalent to the Pearson chi-squared type divergence. For details on power divergence, see Cressie and Read [11] and Read and Cressie [12]. Previous studies (e.g., [7,8]) pointed out that it is important to use several indexes of divergence to accurately measure the degree of deviation from a model.
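The special cases of power divergence mentioned above can be verified numerically. The sketch below uses the standard Cressie-Read form $I^{(\lambda)}(p : q) = \frac{1}{\lambda(\lambda + 1)} \sum_i p_i \left[ (p_i / q_i)^{\lambda} - 1 \right]$, with the value at $\lambda = 0$ taken as the limit. The function name and the two example distributions are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def power_divergence(p, q, lam):
    """Cressie-Read power divergence I^(lam)(p : q) between two distributions.

    Assumes lam > -1 and q_i > 0 wherever p_i > 0. At lam = 0 the limit
    (Kullback-Leibler information) is returned.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                      # cells with p_i = 0 contribute nothing
    ratio = p[mask] / q[mask]
    if lam == 0:
        return float(np.sum(p[mask] * np.log(ratio)))
    return float(np.sum(p[mask] * (ratio**lam - 1.0)) / (lam * (lam + 1.0)))

# Made-up distributions for illustration only.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

pearson_type = 0.5 * np.sum((p - q) ** 2 / q)                     # lam = 1
freeman_tukey_type = 2.0 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)  # lam = -0.5
kullback_leibler = np.sum(p * np.log(p / q))                      # lam -> 0
```

With these definitions, `power_divergence(p, q, 1.0)` matches `pearson_type`, `power_divergence(p, q, -0.5)` matches `freeman_tukey_type`, and `power_divergence(p, q, 0)` matches `kullback_leibler`, up to floating-point error.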
This study proposes a two-dimensional index that is constructed by combining the existing indexes $\Phi_S^{(\lambda)}$ and $\Phi_{PS}^{(\lambda)}$ based on power divergence. The rest of this paper is organized as follows. In Section 2, we propose a generalized two-dimensional index for measuring the degree of deviation from DS. In Section 3, we develop an approximate confidence region for the proposed two-dimensional index. We then use numerical examples to show the utility of the proposed two-dimensional index in Section 4. We also present results obtained by applying the proposed two-dimensional index to real data. We close with concluding remarks in Section 5.

Two-Dimensional Index to Measure Deviation from DS
We propose a generalized two-dimensional index for measuring deviation from DS in square contingency tables. The proposed two-dimensional index can concurrently measure the degree of deviation from S and PS. The proposed two-dimensional index is based on power divergence.
In order to measure the degree of deviation from DS, we consider the following two-dimensional index: $\Psi^{(\lambda)} = (\Phi_S^{(\lambda)}, \Phi_{PS}^{(\lambda)})$, where $\Phi_S^{(\lambda)}$ and $\Phi_{PS}^{(\lambda)}$ are those considered by Tomizawa et al. [7] and Tomizawa et al. [8], respectively (see Appendixes A and B for the details of these indexes). Note that $\lambda$ is a real value chosen by the user. We recommend choosing values of $\lambda$ (e.g., $-0.5$, $0$, $1$) that correspond to well-known divergences. When $\lambda = 0$, the proposed two-dimensional index is equivalent to the index by Ando et al. [10]. Thus, $\Psi^{(\lambda)}$ is a generalization of the index by Ando et al. [10]. The two-dimensional index $\Psi^{(\lambda)}$ has the following characteristics: (i) $\Psi^{(\lambda)} = (0, 0)$ if and only if the DS model holds; (ii) $\Psi^{(\lambda)} = (1, 1)$ if and only if the degree of deviation from DS is maximum, in the sense that $\pi_{ij} = \pi_{j^* i^*} = 0$ (then $\pi_{ji} > 0$ and $\pi_{i^* j^*} > 0$) or $\pi_{ji} = \pi_{i^* j^*} = 0$ (then $\pi_{ij} > 0$ and $\pi_{j^* i^*} > 0$) for all $i \neq j$, and either $\pi_{ii} = 0$ or $\pi_{i^* i^*} = 0$ for $i = 1, \ldots, r/2$ (when $r$ is even) or $i = 1, \ldots, (r - 1)/2$ (when $r$ is odd); (iii) $\Psi^{(\lambda)} = (1, \ast\ast)$ if and only if the degree of deviation from S is maximum and the degree of deviation from PS is not maximum, in the sense that $\pi_{ij} = 0$ (then $\pi_{ji} > 0$) for all $i \neq j$; and (iv) $\Psi^{(\lambda)} = (\ast\ast, 1)$ if and only if the degree of deviation from PS is maximum and the degree of deviation from S is not maximum, in the sense that $\pi_{ij} = 0$ (then $\pi_{i^* j^*} > 0$) for all $(i, j) \in E$.
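The characteristics (i) and (iii) can be illustrated numerically. The sketch below is not the paper's own code. It assumes that each component index is a power divergence between the conditional cell distribution and its pairwise average, scaled by $\lambda(\lambda + 1)/(2^{\lambda} - 1)$ so that it lies in $[0, 1]$, with $\lambda > -1$. It also assumes that $E$ is the set of all cells, excluding the center cell when $r$ is odd. These forms are recalled from the literature on such indexes and should be checked against Appendixes A and B. The function names are hypothetical.

```python
import numpy as np

def _scaled_divergence(a, b, lam):
    """Divergence of cells `a` from the pair average (a + b) / 2, scaled to [0, 1].

    ASSUMED form: lam(lam+1)/(2^lam - 1) times the power divergence, which
    simplifies to sum a*(ratio^lam - 1) / (2^lam - 1); limit taken at lam = 0.
    Assumes the selected cells have positive total probability.
    """
    total = a.sum()
    a, b = a / total, b / total               # conditional distributions
    mask = a > 0
    ratio = a[mask] / ((a[mask] + b[mask]) / 2.0)
    if lam == 0:
        return float(np.sum(a[mask] * np.log(ratio)) / np.log(2.0))
    return float(np.sum(a[mask] * (ratio**lam - 1.0)) / (2.0**lam - 1.0))

def phi_s(pi, lam):
    """Deviation from S: compares each off-diagonal cell (i, j) with (j, i)."""
    pi = np.asarray(pi, dtype=float)
    off = ~np.eye(pi.shape[0], dtype=bool)
    return _scaled_divergence(pi[off], pi.T[off], lam)

def phi_ps(pi, lam):
    """Deviation from PS: compares each cell (i, j) with (i*, j*)."""
    pi = np.asarray(pi, dtype=float)
    r = pi.shape[0]
    keep = np.ones((r, r), dtype=bool)
    if r % 2 == 1:
        keep[r // 2, r // 2] = False          # center cell is its own partner
    return _scaled_divergence(pi[keep], pi[::-1, ::-1][keep], lam)

def psi(pi, lam):
    """Two-dimensional index (phi_s, phi_ps) under the assumptions above."""
    return phi_s(pi, lam), phi_ps(pi, lam)

# Made-up tables: one satisfying DS, one with all mass on or above the diagonal.
ds_table = np.array([[0.10, 0.15, 0.05],
                     [0.15, 0.10, 0.15],
                     [0.05, 0.15, 0.10]])
upper = np.array([[1.0, 2.0, 3.0],
                  [0.0, 1.0, 4.0],
                  [0.0, 0.0, 1.0]])           # counts; normalized internally
```

Under these assumptions, `psi(ds_table, lam)` is `(0, 0)` for each recommended $\lambda$, while `psi(upper, lam)` has first component 1 and second component strictly between 0 and 1, which corresponds to case (iii).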
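Assume that $n$ has a multinomial distribution with sample size $N$ and probability vector $\pi$. Then $\sqrt{N}(p - \pi)$ has an asymptotically Gaussian distribution with mean zero and covariance matrix $D(\pi) - \pi\pi^{\top}$, where $p = n/N$ and $D(\pi)$ is a diagonal matrix with the elements of $\pi$ on the main diagonal (see, e.g., Agresti [13]). We estimate $\Psi^{(\lambda)}$ by $\hat{\Psi}^{(\lambda)} = (\hat{\Phi}_S^{(\lambda)}, \hat{\Phi}_{PS}^{(\lambda)})$, where $\hat{\Phi}_S^{(\lambda)}$ and $\hat{\Phi}_{PS}^{(\lambda)}$ are given by $\Phi_S^{(\lambda)}$ and $\Phi_{PS}^{(\lambda)}$ with $\pi_{ij}$ replaced by $p_{ij}$, respectively. Using the delta method (see Agresti [13]), $\sqrt{N}(\hat{\Psi}^{(\lambda)} - \Psi^{(\lambda)})$ has an asymptotically bivariate Gaussian distribution with mean zero and covariance matrix
$$\Sigma^{(\lambda)} = \begin{pmatrix} \sigma_{11}^{(\lambda)} & \sigma_{12}^{(\lambda)} \\ \sigma_{12}^{(\lambda)} & \sigma_{22}^{(\lambda)} \end{pmatrix}.$$
By the delta method, the elements $\sigma_{11}^{(\lambda)}$, $\sigma_{12}^{(\lambda)}$, and $\sigma_{22}^{(\lambda)}$ are expressed in general form as
$$\sigma_{st}^{(\lambda)} = \left( \frac{\partial \Psi_s^{(\lambda)}}{\partial \pi} \right)^{\top} \left( D(\pi) - \pi\pi^{\top} \right) \left( \frac{\partial \Psi_t^{(\lambda)}}{\partial \pi} \right), \quad s, t = 1, 2,$$
where $\Psi_1^{(\lambda)} = \Phi_S^{(\lambda)}$ and $\Psi_2^{(\lambda)} = \Phi_{PS}^{(\lambda)}$. Note that the asymptotic covariance matrix $\Sigma^{(\lambda)}$ of $\hat{\Psi}^{(\lambda)}$ is first derived in this study. An approximate bivariate $100(1 - \alpha)\%$ confidence region for the index $\Psi^{(\lambda)}$ is given by
$$N (\hat{\Psi}^{(\lambda)} - \Psi^{(\lambda)})^{\top} (\hat{\Sigma}^{(\lambda)})^{-1} (\hat{\Psi}^{(\lambda)} - \Psi^{(\lambda)}) \leq \chi^2_{(1-\alpha;2)},$$
where $\chi^2_{(1-\alpha;2)}$ is the $100(1 - \alpha)$th percentile of the central chi-square distribution with two degrees of freedom and $\hat{\Sigma}^{(\lambda)}$ is given by $\Sigma^{(\lambda)}$ with $\pi_{ij}$ replaced by $p_{ij}$.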
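Once an estimate of the index and of its covariance matrix are available, the confidence region is an ellipse that can be evaluated directly. The sketch below is one way to do this, assuming NumPy; the function names and the numerical values of `psi_hat`, `sigma_hat`, and `n` are made up for illustration and are not results from the paper. It uses the fact that the chi-square distribution with two degrees of freedom is exponential with mean 2, so its $100(1 - \alpha)$th percentile is $-2 \log \alpha$.

```python
import math
import numpy as np

def chi2_quantile_2df(alpha):
    """100(1 - alpha)th percentile of the chi-square distribution with 2 d.f."""
    return -2.0 * math.log(alpha)

def in_region(psi0, psi_hat, sigma_hat, n, alpha=0.05):
    """True if the point psi0 lies inside the approximate confidence region."""
    d = np.asarray(psi_hat, dtype=float) - np.asarray(psi0, dtype=float)
    stat = n * d @ np.linalg.solve(np.asarray(sigma_hat, dtype=float), d)
    return bool(stat <= chi2_quantile_2df(alpha))

def region_boundary(psi_hat, sigma_hat, n, alpha=0.05, num=200):
    """Points on the boundary ellipse, built from the Cholesky factor of sigma_hat / n."""
    chol = np.linalg.cholesky(np.asarray(sigma_hat, dtype=float) / n)
    t = np.linspace(0.0, 2.0 * np.pi, num)
    circle = math.sqrt(chi2_quantile_2df(alpha)) * np.vstack([np.cos(t), np.sin(t)])
    return np.asarray(psi_hat, dtype=float)[:, None] + chol @ circle  # shape (2, num)

# Hypothetical estimates, for illustration only.
psi_hat = np.array([0.30, 0.20])
sigma_hat = np.array([[0.5, 0.1],
                      [0.1, 0.4]])
n = 500
boundary = region_boundary(psi_hat, sigma_hat, n)
```

The rows of `boundary` give the horizontal (S) and vertical (PS) coordinates of the ellipse and can be passed to any plotting routine.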

Utility of the Proposed Two-Dimensional Index
In this section, we demonstrate the usefulness of employing several divergences to compare the degrees of deviation from DS in several datasets. We consider the two artificial datasets in Table 1. We compare the degrees of deviation from DS for Table 1a,b using the confidence region for $\Psi^{(\lambda)}$. Table 2 gives the estimated values of $\Psi^{(\lambda)}$ and $\Sigma^{(\lambda)}$ for Table 1a,b. From Figure 1, we see that the confidence regions for $\Psi^{(\lambda)}$ do not overlap for the data in Table 1a,b. We can conclude that Tables 1a and 1b have different structures in the degree of deviation from DS. That is, Tables 1a and 1b have different structures with regard to the degree of deviation from S or PS. From Figure 1, when $\lambda = 0$, we can conclude that the degree of deviation from DS for Table 1a is greater than that for Table 1b, but when $\lambda = 1$, we cannot conclude this. We should, therefore, examine the value of the two-dimensional index using several values of $\lambda$ to compare the degrees of deviation from DS for several datasets.

Example with Real Data
Consider the data in Table 3, which are taken from Anderson [14]. We are interested in the DS model for these data. We denote, for example, the probability that the forecast and actual figures are "No change" and "Higher", respectively, by $\pi_{N,H}$, and the probability that they are "Lower" and "No change", respectively, by $\pi_{L,N}$. For Table 3, we are interested in whether the forecast accuracy changes depending on the category. When the forecast accuracy does not depend on the categories, the following holds: (1) the probabilities that the forecast and the actual fall in the same category are equal to one another ($\pi_{H,H} = \pi_{L,L}$); (2) the probabilities that the categories of the forecast and the actual differ by one are also equal ($\pi_{N,H} = \pi_{H,N} = \pi_{N,L} = \pi_{L,N}$); and (3) the probabilities that the categories of the forecast and the actual differ by two are also equal ($\pi_{H,L} = \pi_{L,H}$). This probability structure is the DS model. Moreover, we are interested in whether the degree to which the forecast accuracy depends on the categories is greater for prices than for production, or vice versa. Table 4 shows the estimated index values for Table 3a,b. We compare the degrees of deviation from DS for Table 3a,b using the confidence region for $\Psi^{(\lambda)}$. The estimates of $\Sigma^{(\lambda)}$, applied to the data in Table 3a,b, are shown in Table 4. Figure 2 shows the confidence regions of $\Psi^{(\lambda)}$ applied to the data in Table 3a,b. We see that the confidence regions of $\Psi^{(\lambda)}$ do not overlap for several values of $\lambda$. Therefore, it may be concluded that Tables 3a and 3b have different structures with regard to the degree of deviation from DS, in the sense that they have different structures with regard to the degree of deviation from S. However, we cannot conclude whether the degree of deviation from DS is greater for Table 3a than for Table 3b. This is because we conclude that the degree of deviation from DS is greater for Table 3a than for Table 3b only when the degrees of deviation from both S and PS are greater for Table 3a than for Table 3b, and this is not established here.
Figure 2. Confidence regions for $\Psi^{(\lambda)}$, applied to the data in Table 3a,b, where $\lambda = -0.5, 0, 1$.
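The three equalities that make up the DS structure for a $3 \times 3$ forecast table follow mechanically from the model definition. The sketch below groups cells that DS forces to share one probability, assuming the categories are ordered Higher, No change, Lower. That ordering and the function name are assumptions made for illustration.

```python
def ds_cell_classes(labels):
    """Group the cells of an r x r table into classes that DS forces to be equal.

    Each class is the set {(i, j), (j, i), (i*, j*), (j*, i*)} in 0-based indices.
    """
    r = len(labels)
    seen, classes = set(), []
    for i in range(r):
        for j in range(r):
            if (i, j) in seen:
                continue
            orbit = {(i, j), (j, i),
                     (r - 1 - i, r - 1 - j), (r - 1 - j, r - 1 - i)}
            seen |= orbit
            classes.append(sorted((labels[a], labels[b]) for a, b in orbit))
    return classes

# Assumed category order: H = Higher, N = No change, L = Lower.
classes = ds_cell_classes(["H", "N", "L"])
for cells in classes:
    print(" = ".join(f"pi_{f},{a}" for f, a in cells))
```

The classes recovered are $\{\pi_{H,H}, \pi_{L,L}\}$, the four cells in which the forecast and the actual differ by one category, $\{\pi_{H,L}, \pi_{L,H}\}$, and the center cell $\pi_{N,N}$ on its own, which the DS model leaves unconstrained.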
Next, consider the data in Table 5, which are taken from Tomizawa et al. [15]. We shall compare the degrees of deviation from DS for Table 5a,b using the confidence region for $\Psi^{(\lambda)}$. The estimates of $\Sigma^{(\lambda)}$, applied to the data in Table 5a,b, are shown in Table 6. Figure 3 shows the confidence regions of $\Psi^{(\lambda)}$ applied to the data in Table 5a,b. We see that the confidence regions of $\Psi^{(\lambda)}$ do not overlap along either the horizontal or the vertical axis for several values of $\lambda$. Therefore, we can conclude that the degree of deviation from DS is greater for Table 5b than for Table 5a.
Figure 3. Approximate 95% confidence regions for $\Psi^{(\lambda)}$, applied to the data in Table 5a,b, where $\lambda = -0.5, 0, 1$.
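One reading of "do not overlap along either axis" is that the projections of the two confidence ellipses onto the S axis and onto the PS axis are both disjoint. The sketch below implements that reading. It relies on the fact that the projection of the region onto axis $k$ is $\hat{\Psi}_k \pm \sqrt{\chi^2_{(1-\alpha;2)} \hat{\sigma}_{kk} / N}$. The function names and all numerical inputs are hypothetical and do not come from Tables 5 and 6.

```python
import math

def axis_intervals(psi_hat, sigma_hat, n, alpha=0.05):
    """Projections of the confidence ellipse onto the S axis and the PS axis."""
    c = -2.0 * math.log(alpha)        # chi-square percentile with 2 d.f.
    intervals = []
    for k in range(2):
        half_width = math.sqrt(c * sigma_hat[k][k] / n)
        intervals.append((psi_hat[k] - half_width, psi_hat[k] + half_width))
    return intervals

def dominates(intervals_b, intervals_a):
    """True if b's projection lies entirely above a's on both axes."""
    return all(lo_b > hi_a
               for (lo_b, _), (_, hi_a) in zip(intervals_b, intervals_a))

# Hypothetical estimates for two tables, for illustration only.
region_a = axis_intervals([0.10, 0.05], [[0.2, 0.02], [0.02, 0.1]], n=1000)
region_b = axis_intervals([0.30, 0.25], [[0.4, 0.05], [0.05, 0.3]], n=1000)
b_greater_than_a = dominates(region_b, region_a)
```

Separation on only one axis, as in the Table 3 example, shows a difference in structure but does not order the two tables by their deviation from DS.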

Concluding Remarks
This study proposed a generalized two-dimensional index that concurrently measures the degree of deviation from S and PS. Since the two indexes ($\Phi_S^{(\lambda)}$ and $\Phi_{PS}^{(\lambda)}$) are not independent, it is necessary to concurrently measure the degrees of deviation from S and PS when we measure the degree of deviation from DS. To compare degrees of deviation from DS in several datasets using the proposed two-dimensional index, we should use several values of $\lambda$ rather than one specified value. Therefore, we recommend choosing several values of $\lambda$ (e.g., $-0.5$, $0$, $1$) that correspond to well-known divergences.
The estimator of the proposed two-dimensional index is approximately unbiased when the sample size is large. When the sample size is small, however, the estimator may be biased. Through a simulation study, Tomizawa et al. [16] investigated the performance of the estimator $\hat{\Phi}_S^{(\lambda)}$. Tomizawa et al. [16] showed that (1) when the sample size was less than 300, the estimator $\hat{\Phi}_S^{(\lambda)}$ had a bias, (2) when the sample size was above 300, it had a slight bias, and (3) when the sample size was above 1000, it had almost no bias. We believe that the proposed two-dimensional estimator $\hat{\Psi}^{(\lambda)}$ may behave similarly to the estimator $\hat{\Phi}_S^{(\lambda)}$, although this needs to be verified by a simulation study. This concern will be investigated in future research.
Note that $I_S^{(\lambda)}$ and $I_{PS}^{(\lambda)}$, which appear in the definitions of $\Phi_S^{(\lambda)}$ and $\Phi_{PS}^{(\lambda)}$, are the power divergences between the two conditional distributions, and the value at $\lambda = 0$ is taken to be the limit as $\lambda \to 0$.