Measure of Departure from Conditional Symmetry Based on Cumulative Probabilities for Square Contingency Tables

For the analysis of square contingency tables with ordered categories, a measure was developed to represent the degree of departure from the conditional symmetry model, in which there is an asymmetric structure of the cell probabilities with respect to the main diagonal of the table. The present paper proposes a novel measure of the departure from conditional symmetry based on the cumulative probabilities from the corners of the square table. As an example, the proposed measure is applied to Japanese occupational status data, and its interpretation is illustrated as the departure from a proportional structure of social mobility.


Introduction
Symmetry and asymmetry issues frequently arise in square contingency tables with the same row and column classifications from a broad range of scientific fields, for example, medical, social, and geographical sciences [1,2]. A number of statistical approaches were developed to find the symmetric and asymmetric relationships of the underlying probability distribution for the square table. The authors in [3] considered the symmetry model for cell (joint) probability in the square table, and, following that, various symmetry and asymmetry models were investigated and evaluated through data analysis [4,5]. The researchers in [1,6,7] considered the singular value decomposition of a skew-symmetric matrix, which is composed of the residuals from the symmetry model. Regarding the residuals from the symmetry model, [8] gave the index, which represents the degree of residuals, and [9] considered the correspondence analysis of the residual matrix. The present paper focuses on measuring the degree of departure from an asymmetric structure for cumulative cell probabilities.
Consider an $R \times R$ contingency table with the same row and column classifications with ordered categories. Let $X$ and $Y$ denote the row and column variables, respectively, and let $p_{ij} = \Pr(X = i, Y = j)$ for $i = 1, \ldots, R$; $j = 1, \ldots, R$. The authors in [10] considered the conditional symmetry (CS) model defined by
$$p_{ij} = \begin{cases} \Delta\,\psi_{ij} & (i < j), \\ \psi_{ij} & (i \geq j), \end{cases} \qquad \text{where } \psi_{ij} = \psi_{ji}. \tag{1}$$
In particular, if $\Delta = 1$ holds in the CS model, the symmetry model ([3,4], p. 282) holds. The CS model indicates a structure in which each cell probability is proportional to the symmetric cell probability with respect to the main diagonal, with a ratio common to all pairs of probabilities. When the CS model does not hold, we are interested in measuring how far the probability distribution is from the CS model. The authors in [11] gave a measure to represent the degree of departure from the CS model. Let $\Phi^{(\lambda)}$ denote the measure (see Appendix A for the details). The value of the measure $\Phi^{(\lambda)}$ is not invariant under reordering of the categories of the table. Therefore, $\Phi^{(\lambda)}$ is appropriate for a square table with ordered categories.
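We introduce the cumulative probabilities from the upper-right and lower-left corners of the square table. Let
$$G_{ij} = \sum_{s=1}^{i}\sum_{t=j}^{R} p_{st} \quad (i < j), \qquad G_{ij} = \sum_{s=i}^{R}\sum_{t=1}^{j} p_{st} \quad (i > j).$$
The restriction of the CS model can be expressed by using the cumulative probabilities, that is,
$$G_{ij} = \Delta\, G_{ji} \quad (i < j). \tag{2}$$
Thus, the CS model indicates that each cumulative probability is proportional to the symmetric cumulative probability with the ratio $\Delta$.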
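The corner cumulative probabilities can be illustrated with a short sketch. It assumes that the upper-right cumulative probability for $i < j$ sums the cells with row at most $i$ and column at least $j$, and that the lower-left one for $i > j$ sums the cells with row at least $i$ and column at most $j$. The function names `corner_cumulative` and `make_cs_table`, and the symmetric weights `psi`, are hypothetical and chosen only for illustration.

```python
def corner_cumulative(p):
    """Corner cumulative probabilities of a square table p (0-based indices).

    For i < j: sum of p[s][t] over s <= i and t >= j (upper-right corner).
    For i > j: sum of p[s][t] over s >= i and t <= j (lower-left corner).
    Diagonal entries are left at 0 because they are not used.
    """
    R = len(p)
    G = [[0.0] * R for _ in range(R)]
    for i in range(R):
        for j in range(R):
            if i < j:
                G[i][j] = sum(p[s][t] for s in range(i + 1) for t in range(j, R))
            elif i > j:
                G[i][j] = sum(p[s][t] for s in range(i, R) for t in range(j + 1))
    return G


def make_cs_table(psi, delta):
    """Build cell probabilities with upper-triangle cells scaled by delta.

    psi is a symmetric matrix of positive weights (an illustrative assumption);
    the result is normalized so that the cells sum to 1.
    """
    R = len(psi)
    raw = [[(delta if i < j else 1.0) * psi[i][j] for j in range(R)] for i in range(R)]
    total = sum(sum(row) for row in raw)
    return [[x / total for x in row] for row in raw]


# Hypothetical symmetric weights, not taken from any data set.
psi = [[4.0, 2.0, 1.0], [2.0, 3.0, 2.0], [1.0, 2.0, 5.0]]
p = make_cs_table(psi, delta=2.0)
G = corner_cumulative(p)

# Under conditional symmetry, every ratio G_ij / G_ji (i < j) equals delta.
ratios = [G[i][j] / G[j][i] for i in range(3) for j in range(i + 1, 3)]
print(ratios)
```

In this sketch every upper-right cumulative sum involves only cells above the diagonal, which is why a common ratio for the cells carries over to the cumulative probabilities.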
The cross-classification of the father's and his son's occupational statuses in Japan was explored in 1955 and 1965 [12] (p. 151). The occupational status was classified into five categories: (1) capitalist; (2) new-middle; (3) working; (4) self-employed; and (5) farming. These categories were treated as an ordinal scale by some statisticians; see for example, [13].
Thus, these data are given in the form of 5 × 5 ordinal contingency tables. Noting that the row variable is the father's status and the column variable is the son's status, the father-son pairs in the off-diagonal cells indicate the social mobility to a different status between the father and his son.
For the analysis of these data, if there is a conditional symmetric structure of the cumulative probabilities underlying the occupational status data, there is a proportional relation: the probability of social mobility from father's status $i$ or below to son's status $j$ or above is $\Delta$ times as high as that from father's status $j$ or above to son's status $i$ or below, with a value of the parameter $\Delta$ common to all $i < j$; $i = 1, \ldots, 4$; $j = 2, \ldots, 5$.
The present paper focuses on measuring the degree of departure from the proportional structure of the cumulative probabilities (for the example described above) and comparing the degrees of departure among different tables. The measure $\Phi^{(\lambda)}$ is useful for determining the degree of departure from the conditional symmetric structure (1) for the cell probabilities. On the other hand, because the CS model can also be expressed as (2), we might also be interested in measuring the degree to which the cumulative probabilities $\{G_{ij}\}$ are distant from those with a conditional symmetric structure.
Such a measure should be expressed as a function of the cumulative probabilities. The present paper considers a new measure to represent the degree of departure from CS based on the cumulative probabilities $\{G_{ij}\}$. Such a measure may be useful when we wish to examine the structure of the cumulative probabilities underlying the data rather than the structure of the cell probabilities.
The rest of this paper is organized as follows. In Section 2, we propose a new measure that expresses the degree of departure from the CS model. In Section 3, we obtain the large-sample confidence interval for the proposed measure. In Section 4, we apply the measure to the actual data of the occupational status of father-son pairs in Japan and illustrate the interpretation of the measure. Section 5 provides our discussion.

Measure
For the cumulative probabilities $\{G_{ij}\}$ for an $R \times R$ table, the CS model can be expressed as
$$G_{ij} = \Delta\, G_{ji} \quad (i < j).$$

Consider the measure $\Psi^{(\lambda)}$, which is built, for each pair $(i, j)$ with $i < j$, from Patil and Taillie's [14] diversity index of degree $\lambda$, where $\lambda$ is a real value chosen by the user. Which value of $\lambda$ to use will be discussed in Section 5.
For each pair, the diversity index reflects the degree to which the weighted cumulative probabilities $(G^{c}_{ij}, G^{c}_{ji})$ are distant from the uniform distribution $(1/2, 1/2)$, and the indices are rescaled in order to normalize the value of the measure. The measure $\Psi^{(\lambda)}$ is formulated by combining the rescaled indices into a weighted mean with the weights $Q_{ij}$, which is the relative magnitude of the cumulative probabilities $(G_{ij}, G_{ji})$. The measure evaluated at $\lambda = 0$ is taken to be the limit as $\lambda \to 0$. The value $\Psi^{(\lambda)} = 0$ corresponds to the case in which there is a structure of the CS model in the table. We provide a numerical experiment to examine how the value of $\Psi^{(\lambda)}$ changes for departures from the CS model in Section 5. The value of the measure $\Psi^{(\lambda)}$ is not invariant under reordering of the categories of the table and, thus, incorporates the information in the order of the categories.
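A minimal sketch of how a measure of this kind could be computed for $\lambda > -1$ follows. The normalization used here is an assumption of the sketch, chosen to agree with the verbal description: $G^{U}_{ij} = G_{ij} / \sum\sum_{s<t} G_{st}$, $G^{L}_{ji} = G_{ji} / \sum\sum_{s<t} G_{ts}$, $Q_{ij} = (G^{U}_{ij} + G^{L}_{ji})/2$, $G^{c}_{ij} = G^{U}_{ij}/(2Q_{ij})$, $G^{c}_{ji} = G^{L}_{ji}/(2Q_{ij})$, with the diversity index $\frac{1}{\lambda}\{1 - (G^{c}_{ij})^{\lambda+1} - (G^{c}_{ji})^{\lambda+1}\}$ rescaled by $\lambda 2^{\lambda}/(2^{\lambda} - 1)$, and the Shannon entropy divided by $\log 2$ at $\lambda = 0$. The function names and both example tables are hypothetical.

```python
import math


def corner_cumulative(p):
    """Upper-right (i < j) and lower-left (i > j) corner cumulative sums."""
    R = len(p)
    G = [[0.0] * R for _ in range(R)]
    for i in range(R):
        for j in range(R):
            if i < j:
                G[i][j] = sum(p[s][t] for s in range(i + 1) for t in range(j, R))
            elif i > j:
                G[i][j] = sum(p[s][t] for s in range(i, R) for t in range(j + 1))
    return G


def psi_measure(p, lam=0.0):
    """Sketch of a departure-from-CS measure on cumulative probabilities.

    Assumes the normalization stated in the accompanying text and that both
    corner totals are positive. lam must exceed -1; lam = 0 uses the limit.
    """
    if lam <= -1:
        raise ValueError("lam must be greater than -1")
    R = len(p)
    G = corner_cumulative(p)
    pairs = [(i, j) for i in range(R) for j in range(i + 1, R)]
    total_upper = sum(G[i][j] for i, j in pairs)
    total_lower = sum(G[j][i] for i, j in pairs)
    value = 0.0
    for i, j in pairs:
        g_u = G[i][j] / total_upper
        g_l = G[j][i] / total_lower
        q = (g_u + g_l) / 2.0
        if q == 0.0:
            continue  # a pair with zero weight contributes nothing
        a, b = g_u / (2.0 * q), g_l / (2.0 * q)  # a + b == 1
        if lam == 0:
            entropy = -sum(x * math.log(x) for x in (a, b) if x > 0)
            rescaled = entropy / math.log(2)
        else:
            index = (1.0 - a ** (lam + 1) - b ** (lam + 1)) / lam
            rescaled = lam * 2.0 ** lam / (2.0 ** lam - 1.0) * index
        value += q * (1.0 - rescaled)
    return value


# Hypothetical table with p_ij = 2 * p_ji for all i < j (CS holds).
p_cs = [[x / 27.0 for x in row] for row in [[4, 4, 2], [2, 3, 4], [1, 2, 5]]]
# Hypothetical table in which, for each pair, one of G_ij and G_ji is zero.
p_ext = [[0.2, 0.3, 0.0], [0.0, 0.2, 0.0], [0.0, 0.2, 0.1]]

print(psi_measure(p_cs, 0.0), psi_measure(p_cs, 1.0), psi_measure(p_ext, 0.0))
```

Under these assumptions the sketch returns a value near 0 when the table has the CS structure and 1 in the extreme case where the weighted pair is $(1, 0)$ or $(0, 1)$ for every pair with positive weight.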

Data Analysis
Consider again the Japanese father-son occupational status data examined in 1955 and 1965 [12] (p. 151), introduced in Section 1. The measure $\Psi^{(\lambda)}$ is appropriate for evaluating the degree of departure from the proportional structure for the cumulative probabilities of social mobility and for comparing the degrees of departure between 1955 and 1965. On the other hand, in order to evaluate the degree of departure from the proportional structure for the cell probabilities, the measure $\Phi^{(\lambda)}$ should be applied. Table 1 gives the estimated values and the confidence intervals for $\Psi^{(\lambda)}$ applied to the occupational status data for some fixed values of $\lambda$. Similarly, Table 2 gives the estimated values and the confidence intervals for $\Phi^{(\lambda)}$. We compare the degrees of departure from CS between the occupational statuses in 1955 and in 1965 using the confidence intervals for $\Psi^{(\lambda)}$ given in Table 1. It is inferred that the degree of departure from CS in 1965 is larger than that in 1955. Therefore, these results can be interpreted as showing that the degree of departure from the proportional structure for the cumulative probabilities is larger for 1965 than for 1955. On the other hand, the results in Table 2 show that the degree of departure from CS for the cell probabilities may be larger for 1965 than for 1955.

Discussion
For a square contingency table with the same ordered row and column classifications, we have proposed a measure to represent the degree of departure from the CS model based on the cumulative probabilities. The proposed measure $\Psi^{(\lambda)}$ is useful for comparing the degrees of departure from CS among different ordinal tables, as shown in the example.
The measure $\Psi^{(\lambda)}$ can be reformulated using the power divergence [15].
When $\lambda = 0$, the reformulation is in terms of the Kullback-Leibler (KL) divergence. In particular, when $\lambda = 0$, it can easily be seen that $\Psi^{(0)}$ is expressed through the minimum, over $\{W_{ij}\}$ with $\sum\sum_{i<j} W_{ij} = 1$ and $W_{ij} > 0$, of the sum of the two KL divergences between $\{G^{U}_{ij}\}$ and $\{W_{ij}\}$, and between $\{G^{L}_{ji}\}$ and $\{W_{ij}\}$, where $\{W_{ij}\}$ has the conditional asymmetric structure. Therefore, $Q_{ij}$ in $\Psi^{(\lambda)}$ is the value of $W_{ij}$ that minimizes this sum. Note that $Q_{ij}$ does not minimize the corresponding sum of power divergences for any $\lambda \neq 0$ ($\lambda > -1$).
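The minimizing property of $Q_{ij}$ can be checked with a short Lagrange-multiplier argument. The sketch below assumes that $\{G^{U}_{ij}\}$ and $\{G^{L}_{ji}\}$ are each normalized to sum to 1 over $i < j$; the function $f$ and the multiplier $\mu$ are notation introduced only for this illustration.

```latex
% Assumption: \sum\sum_{i<j} G^{U}_{ij} = \sum\sum_{i<j} G^{L}_{ji} = 1.
\begin{align*}
f(W) &= \sum\sum_{i<j} G^{U}_{ij}\log\frac{G^{U}_{ij}}{W_{ij}}
      + \sum\sum_{i<j} G^{L}_{ji}\log\frac{G^{L}_{ji}}{W_{ij}}, \\
0 &= \frac{\partial}{\partial W_{ij}}
     \Bigl[ f(W) + \mu \Bigl( \sum\sum_{s<t} W_{st} - 1 \Bigr) \Bigr]
   = -\frac{G^{U}_{ij} + G^{L}_{ji}}{W_{ij}} + \mu
   \;\Longrightarrow\; W_{ij} = \frac{G^{U}_{ij} + G^{L}_{ji}}{\mu}, \\
1 &= \sum\sum_{i<j} W_{ij} = \frac{1 + 1}{\mu}
   \;\Longrightarrow\; \mu = 2, \qquad
   W_{ij} = \frac{G^{U}_{ij} + G^{L}_{ji}}{2}.
\end{align*}
```

Because $f$ is convex in $\{W_{ij}\}$, this stationary point is the minimizer, so under the stated assumption the weight $Q_{ij}$ is the average of the two normalized cumulative probabilities.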
The reader may be interested in how to select the value of the parameter $\lambda$ in $\Psi^{(\lambda)}$ to compare different tables in a practical situation. We recommend comparing the results for various values of $\lambda$ rather than relying on the result for a single specified value. This is because it would be impossible to conclude which table has a larger departure from CS if the results differed depending on the value of $\lambda$ (although the authors have not yet encountered such a case). On the other hand, a conclusion can be drawn when the results agree for all the values of $\lambda$ used. Thus, it seems safe to use various values of $\lambda$ and to compare the obtained results. If the reader considers the interpretation of the measure to be important, adopting $\lambda = 0$ may be recommended in light of the discussion in the previous paragraph.
Consider the artificial cell and cumulative probability tables given in Table 3. This table has unspecified probabilities, $p_{12}$ and $p_{21}$, where $p_{12} + p_{21} = 0.61$. In Table 3 (b), the ratios $G_{13}/G_{31}$ and $G_{23}/G_{32}$ are equal to 2. Figure 1 shows the values of the measures $\Phi^{(0)}$ and $\Psi^{(0)}$ for different values of the ratio $G_{12}/G_{21}$. It can be seen from this figure that (1) the value of $\Psi^{(0)}$ equals 0 when $G_{12}/G_{21} = 2$, which is the same as $G_{13}/G_{31}$ and $G_{23}/G_{32}$, namely, when the CS model holds; and (2) the value of $\Psi^{(0)}$ becomes larger as $G_{12}/G_{21}$ moves away from 2. The results for $\lambda \neq 0$ are similar and, thus, are not reported here.
From these results, the measure $\Psi^{(\lambda)}$ would be appropriate to represent the degree of departure from the CS model, because it is natural to consider that the departure from CS increases as $G_{12}/G_{21}$ moves away from 2 in Table 3. Figure 1 shows that the measure $\Psi^{(0)}$ has a trend similar to the measure $\Phi^{(0)}$ in this artificial example.
Figure 1. Values of the measures $\Phi^{(0)}$ and $\Psi^{(0)}$ for values of $G_{12}/G_{21}$ in Table 3.
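An experiment of this kind can be sketched numerically. The cell values below are hypothetical and are not those of Table 3; they only share its design, with $p_{12} + p_{21} = 0.61$ and $G_{13}/G_{31} = G_{23}/G_{32} = 2$. The sketch also assumes the KL form of the measure at $\lambda = 0$, namely the sum of the two KL divergences from the normalized cumulative probabilities to their average, divided by $2 \log 2$. The function names are hypothetical.

```python
import math


def corner_cumulative(p):
    """Upper-right (i < j) and lower-left (i > j) corner cumulative sums."""
    R = len(p)
    G = [[0.0] * R for _ in range(R)]
    for i in range(R):
        for j in range(R):
            if i < j:
                G[i][j] = sum(p[s][t] for s in range(i + 1) for t in range(j, R))
            elif i > j:
                G[i][j] = sum(p[s][t] for s in range(i, R) for t in range(j + 1))
    return G


def psi0_kl(p):
    """Assumed KL form of the measure at lam = 0 (see the accompanying text)."""
    R = len(p)
    G = corner_cumulative(p)
    pairs = [(i, j) for i in range(R) for j in range(i + 1, R)]
    total_upper = sum(G[i][j] for i, j in pairs)
    total_lower = sum(G[j][i] for i, j in pairs)
    kl_sum = 0.0
    for i, j in pairs:
        g_u = G[i][j] / total_upper
        g_l = G[j][i] / total_lower
        q = (g_u + g_l) / 2.0
        for g in (g_u, g_l):
            if g > 0:
                kl_sum += g * math.log(g / q)
    return kl_sum / (2.0 * math.log(2))


def table_for_ratio(r):
    """Hypothetical 3 x 3 table with G12 / G21 = r and p12 + p21 = 0.61.

    Fixed cells: p13 = 0.04, p31 = 0.02, p23 = 0.06, p32 = 0.03, so that
    G13 / G31 = 2 and G23 / G32 = 2; each diagonal cell is 0.08.
    """
    g_sum = 0.61 + 0.04 + 0.02  # G12 + G21
    g21 = g_sum / (1.0 + r)
    g12 = g_sum - g21
    p12, p21 = g12 - 0.04, g21 - 0.02
    return [[0.08, p12, 0.04], [p21, 0.08, 0.06], [0.02, 0.03, 0.08]]


values = {r: psi0_kl(table_for_ratio(r)) for r in (0.5, 1.0, 2.0, 4.0, 8.0)}
for r, v in values.items():
    print(f"G12/G21 = {r}: {v:.4f}")
```

With these hypothetical cells the value is essentially 0 at a ratio of 2 and grows as the ratio moves away from 2 in either direction, which matches the qualitative behaviour described for Figure 1.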
We present another artificial example to clarify the difference between the two measures $\Phi^{(\lambda)}$ and $\Psi^{(\lambda)}$. Consider the artificial cell and cumulative probability tables given in Table 4. Figure 2 shows the values of the measures $\Phi^{(0)}$ and $\Psi^{(0)}$ for different values of the probability $p_{13}$. When $p_{13} = 0.1$, the ratios of the symmetric cell probabilities and of the symmetric cumulative probabilities are almost the same, and the values of the two measures are nearly 0. As the value of $p_{13}$ increases toward 0.5, the proportional structure of the cumulative probabilities is almost preserved.
Consequently, the value of $\Psi^{(0)}$ remains nearly 0. On the other hand, the value of $\Phi^{(0)}$ increases. This may be because the proportional structure of the cell probabilities is not preserved when the probability $p_{13}$ increases. The results for $\lambda \neq 0$ are similar and, thus, are not reported here. Hence, the measure $\Psi^{(\lambda)}$ is appropriate to represent the degree of departure from a conditional symmetric structure of the cumulative probabilities in situations where the measure $\Phi^{(\lambda)}$ is not appropriate.
Figure 2. Values of the measures $\Phi^{(0)}$ and $\Psi^{(0)}$ for values of $p_{13}$ in Table 4.

Appendix A
In the measure $\Phi^{(\lambda)}$ of [11], $I^{(\lambda)}_{ij}$ is the diversity index of degree $\lambda$ [14]; it represents the degree to which the conditional probabilities $(p^{c}_{ij}, p^{c}_{ji})$ are distant from $(1/2, 1/2)$. The measure $\Phi^{(\lambda)}$ is the weighted mean of $I^{(\lambda)}_{ij}$ with the weights $q_{ij}$. The value of the measure evaluated at $\lambda = 0$ is taken to be the limit as $\lambda \to 0$. Note that $I^{(0)}_{ij}$ is the Shannon entropy, which is a special case of the diversity index $I^{(\lambda)}_{ij}$.