Article

Partial Asymmetry Measures for Square Contingency Tables

1
Innovative and Clinical Research Promotion Center, Gifu University Hospital, Gifu City 501-1194, Gifu, Japan
2
Department of Biostatistics, Yokohama City University School of Medicine, Yokohama City 236-0004, Kanagawa, Japan
3
Department of Information Sciences, Faculty of Science and Technology, Tokyo University of Science, Noda City 278-8510, Chiba, Japan
4
Department of Information Sciences, Faculty of Information Sciences, Meisei University, Hino City 191-0042, Tokyo, Japan
*
Author to whom correspondence should be addressed.
Symmetry 2022, 14(9), 1936; https://doi.org/10.3390/sym14091936
Received: 19 August 2022 / Revised: 13 September 2022 / Accepted: 14 September 2022 / Published: 17 September 2022

Abstract

In square contingency table analysis, we consider a partial measure that represents the degree of departure from symmetry for each of several pairs of cells. It may be useful to pool the values of the partial measure into a single summary measure of partial asymmetry. We show that the estimators of the partial measures are asymptotically mutually independent for large sample sizes. The present paper proposes a measure of departure from symmetry, in the class of weighted averages, that differs from those of previous studies. The proposed measure approximates the measure in the class of weighted averages that has the smallest variance.

1. Introduction

In categorical data analysis, contingency tables are a basic tool used to examine the relationship between row and column categories. For example, the Pearson $X^2$ statistic is commonly used to test the null hypothesis of statistical independence (Agresti [1] [p. 75]). When statistical independence is rejected, we are interested in describing the association between the row and column categories. Summary measures of association have been proposed, such as the Cramér V, gamma, and uncertainty coefficient. For details, see for instance Agresti [1] [Sec. 2.4] and Bishop et al. [2] [Sec. 11.3]. Additionally, the recent development of association measures is described, for example, in Beh et al. [3], Lombardo [4], Wei and Kim [5], Wei and Kim [6], Zhang et al. [7], and Wei et al. [8].
Contingency tables with the same row and column classifications are called square contingency tables. These tables are used for unaided distance vision data, social mobility data, and longitudinal data in biomedical research. The analysis of square contingency tables considers the issue of symmetry rather than independence because it is not sensible to treat these data as independent.
Bowker [9] introduced the simple symmetry model and proposed a test for the hypothesis of symmetry. When the symmetry model fits the given data poorly, we are interested in measuring the degree of departure from symmetry. Tomizawa [10] proposed a measure that represents the degree of departure from symmetry expressed using the Shannon entropy or Kullback–Leibler information. In the real world, the Shannon entropy is widely applied as a measure of complexity, for example in Fernandes and Araújo [11]. The measure lies between 0 and 1, and its value equals 0 if and only if the symmetry model holds. Additionally, the degree of departure from symmetry increases as the value of the measure increases.
In the present paper, we propose a measure that represents the degree of departure from symmetry using a different approach. We also consider a partial measure that represents the degree of departure from symmetry for each of several pairs. If the asymmetry appears to be similar in the various pairs, it may be useful to pool the values of the measure into a single summary measure of partial asymmetry. In an analogous manner to Agresti [12] [p. 170], we consider taking a weighted average of the sample values as a summary measure. The properties of the proposed measure are given, and it has a characteristic that is different from that of Tomizawa’s measure.
The rest of this paper is organized as follows. Section 2 describes the background of this study by reviewing previous research. Section 3 proposes the new measure that represents the degree of departure from symmetry. Section 4 shows some numerical examples and discusses the difference between the estimate of $\Phi_T$ and the estimate of the proposed measure. Section 5 gives an example involving the cross-classification of mothers’ and fathers’ birth orders. Section 6 contains concluding remarks.

2. Review of Previous Research

Consider an $r \times r$ square contingency table having the same row and column classifications. Let $p_{ij}$ denote the probability that an observation will fall in the $(i, j)$th cell of the table $(i = 1, \ldots, r;\ j = 1, \ldots, r)$. The simple symmetry model introduced by Bowker [9] is defined by
$p_{ij} = p_{ji} \quad (i \neq j).$
This model indicates the symmetry structure with respect to the cell probabilities. Bowker [9] proposed a test for the hypothesis of symmetry.
When the symmetry model does not hold for a given dataset, we are interested in evaluating the degree of departure from symmetry. Assuming $p_{ij} + p_{ji}$ is not equal to zero for $i < j$, the measure of Tomizawa [10] is defined as
$\Phi_T = \frac{1}{\delta \log 2} \sum\sum_{i \neq j} p_{ij} \log \frac{2 p_{ij}}{p_{ij} + p_{ji}},$
where $\delta = \sum\sum_{i \neq j} p_{ij}$. The measure $\Phi_T$ has three properties: (i) $0 \leq \Phi_T \leq 1$; (ii) the table has a symmetrical structure if and only if $\Phi_T = 0$; (iii) there is a structure for which either $p_{ij} = 0$ or $p_{ji} = 0$ for every pair $i \neq j$ if and only if $\Phi_T = 1$.
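Let $\pi_{ij} = p_{ij} / (p_{ij} + p_{ji})$ for $i = 1, \ldots, r$; $j = 1, \ldots, r$; $i \neq j$. Then $\pi_{ij}$ is the conditional probability that an observation falls in cell $(i, j)$, given that it falls in cell $(i, j)$ or $(j, i)$. It should be noted that the symmetry model can be expressed as
$\pi_{ij} = \pi_{ji} = \frac{1}{2} \quad (i < j).$
The measure $\Phi_T$ can be expressed as
$\Phi_T = \sum\sum_{i < j} \frac{p_{ij} + p_{ji}}{\delta} \, \phi_{ij},$
where
$\phi_{ij} = \frac{1}{\log 2} \left( \pi_{ij} \log \frac{\pi_{ij}}{1/2} + \pi_{ji} \log \frac{\pi_{ji}}{1/2} \right) = 1 + \frac{1}{\log 2} \left( \pi_{ij} \log \pi_{ij} + \pi_{ji} \log \pi_{ji} \right).$
It should be noted that $\phi_{ij}$ is the normalized Kullback–Leibler information between $(\pi_{ij}, \pi_{ji})$ and $(1/2, 1/2)$. That is, the measure $\Phi_T$ is the weighted average of $\phi_{ij}$ with weights $(p_{ij} + p_{ji}) / \delta$.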
We review $\phi_{ij}$ in $\Phi_T$. The partial measure $\phi_{ij}$ represents the degree of departure from symmetry for a pair of symmetric cells because: (i) $0 \leq \phi_{ij} \leq 1$; (ii) there is a symmetrical structure for the pair of $(i, j)$ and $(j, i)$ cells if and only if $\phi_{ij} = 0$; (iii) there is a structure for which either $p_{ij} = 0$ or $p_{ji} = 0$ for the pair of $(i, j)$ and $(j, i)$ cells if and only if $\phi_{ij} = 1$. That is, the measure $\phi_{ij}$ expresses partial asymmetry.
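The three properties of the partial measure can be checked numerically. The following is a minimal sketch, assuming that $\phi_{ij}$ is the Kullback–Leibler information between $(\pi_{ij}, \pi_{ji})$ and $(1/2, 1/2)$ divided by $\log 2$, with the usual convention $0 \log 0 = 0$. The function name `partial_measure` is hypothetical and is not taken from the paper.

```python
import math


def partial_measure(n_ij: int, n_ji: int) -> float:
    """Plug-in estimate of phi_ij from the two counts of a symmetric pair.

    Hypothetical helper: computes the KL information between
    (pi_ij, pi_ji) and (1/2, 1/2), normalized by log 2.
    """
    total = n_ij + n_ji
    if total == 0:
        raise ValueError("the pair must contain at least one observation")
    value = 0.0
    for count in (n_ij, n_ji):
        pi = count / total
        if pi > 0.0:  # convention: 0 * log(0) = 0
            value += pi * math.log(2.0 * pi)
    return value / math.log(2.0)


print(partial_measure(50, 50))    # symmetric pair, lower bound of the measure
print(partial_measure(80, 0))     # one empty cell, upper bound of the measure
print(partial_measure(544, 102))  # cells (1,2) and (2,1) of Table 1(a)
```

For the counts 544 and 102 taken from Table 1(a), the sketch gives a value that agrees with the estimate 0.37075 reported in Table 3 to the printed precision.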

3. The Proposed Measure

Let $n_{ij}$ denote the observed frequency in the $(i, j)$th cell of the table ($i = 1, \ldots, r$; $j = 1, \ldots, r$). We assume that $\{n_{ij}\}$ have a multinomial distribution:
$\frac{n!}{\prod_{i=1}^{r} \prod_{j=1}^{r} n_{ij}!} \prod_{i=1}^{r} \prod_{j=1}^{r} p_{ij}^{n_{ij}},$
where $n = \sum_{i=1}^{r} \sum_{j=1}^{r} n_{ij}$. Let $\hat{p}'$ be the $1 \times r^2$ vector
$\hat{p}' = (\hat{p}_{(12)}', \hat{p}_{(13)}', \ldots, \hat{p}_{(r-1,r)}', \hat{p}_{11}, \ldots, \hat{p}_{rr}),$
where
$\hat{p}_{(ij)}' = (\hat{p}_{ij}, \hat{p}_{ji}), \quad \hat{p}_{ij} = \frac{n_{ij}}{n}, \quad \hat{p}_{ii} = \frac{n_{ii}}{n}.$
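Furthermore, let us define the vector $p$ in terms of the $p_{ij}$ in the same way as $\hat{p}$. Let $\hat{\phi}_{ij}$ denote the sample version of $\phi_{ij}$. Namely, the estimated $\phi_{ij}$ is given as
$\hat{\phi}_{ij} = 1 + \frac{1}{\log 2} \left( \hat{\pi}_{ij} \log \hat{\pi}_{ij} + \hat{\pi}_{ji} \log \hat{\pi}_{ji} \right),$
where $\hat{\pi}_{ij} = \hat{p}_{ij} / (\hat{p}_{ij} + \hat{p}_{ji})$ and $\hat{\pi}_{ji} = \hat{p}_{ji} / (\hat{p}_{ij} + \hat{p}_{ji})$. Let $\hat{\phi}$ be the $r(r-1)/2 \times 1$ vector: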
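$\hat{\phi} = (\hat{\phi}_{12}, \hat{\phi}_{13}, \ldots, \hat{\phi}_{r-1,r})',$
and we define the vector $\phi$ in terms of the $\phi_{ij}$ in a similar manner to $\hat{\phi}$. From Appendix A, $\hat{\phi}$ is asymptotically distributed as normal with mean $\phi$ and covariance matrix
$\sigma^2[\hat{\phi}] = \begin{pmatrix} \sigma_{12}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{13}^2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_{r-1,r}^2 \end{pmatrix},$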
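where
$\sigma_{ij}^2 = \frac{\pi_{ij} \pi_{ji}}{n (p_{ij} + p_{ji})} \left( \frac{\log \pi_{ij} - \log \pi_{ji}}{\log 2} \right)^2 \quad (i < j).$
It should be noted that the estimators $\hat{\phi}_{12}, \ldots, \hat{\phi}_{r-1,r}$ are asymptotically mutually independent for large $n$.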
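As an illustration of how the asymptotic normality can be used, the sketch below computes a plug-in variance and an approximate 95% confidence interval for one pair of cells. It assumes the delta-method variance $\pi_{ij}\pi_{ji}\{\log(\pi_{ij}/\pi_{ji})/\log 2\}^2 / (n_{ij}+n_{ji})$, which follows from the block $\Sigma_{ij}(p)$ in Appendix A, and it assumes both counts are positive. The function name and the constant `z` are illustrative choices, not part of the paper.

```python
import math


def partial_measure_with_ci(n_ij: int, n_ji: int, z: float = 1.959964):
    """Return (phi_hat, variance_hat, (lower, upper)) for one symmetric pair.

    Hypothetical helper; requires n_ij > 0 and n_ji > 0.
    """
    if n_ij <= 0 or n_ji <= 0:
        raise ValueError("both counts must be positive in this sketch")
    total = n_ij + n_ji  # equals n * (p_hat_ij + p_hat_ji)
    pi_ij, pi_ji = n_ij / total, n_ji / total
    phi = 1.0 + (pi_ij * math.log(pi_ij) + pi_ji * math.log(pi_ji)) / math.log(2.0)
    # Derivative of phi with respect to pi_ij (with pi_ji = 1 - pi_ij).
    slope = math.log(pi_ij / pi_ji) / math.log(2.0)
    variance = pi_ij * pi_ji * slope ** 2 / total
    half_width = z * math.sqrt(variance)
    return phi, variance, (phi - half_width, phi + half_width)


phi, var, (lo, hi) = partial_measure_with_ci(544, 102)
print(f"phi={phi:.5f} var={var:.7f} ci=({lo:.3f}, {hi:.3f})")
```

With the counts 544 and 102 from Table 1(a), the result agrees with the row for cells (1,2), (2,1) in Table 3 (estimate 0.37075, variance 0.0012005, interval (0.303, 0.439)) up to rounding. The interval is a Wald-type interval and is not truncated to $[0, 1]$, which is consistent with the negative lower limits reported in Table 3.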
We consider the weighted average of $\{\hat{\phi}_{ij}\}$, that is
$\Phi = \sum\sum_{i < j} w_{ij} \hat{\phi}_{ij},$
where the weights $\{w_{ij}\}$ satisfy $w_{ij} > 0$ for all $i < j$ and $\sum\sum_{i < j} w_{ij} = 1$. We see from Appendix A that the $\hat{\phi}_{ij}$ $(i < j)$ are asymptotically independent and normally distributed as $N(\phi_{ij}, \sigma_{ij}^2)$. Thus, the measure $\Phi$ has an asymptotically normal distribution with mean
$\sum\sum_{i < j} w_{ij} \phi_{ij},$
and variance
$\sigma^2 = \sum\sum_{i < j} (w_{ij})^2 \sigma_{ij}^2.$
In an analogous manner to Agresti [12] [p. 170], we derive the weights $\{w_{ij}^*\}$ so as to minimize the variance of $\Phi$ with the constraint that all $w_{ij} > 0$ and $\sum\sum_{i < j} w_{ij} = 1$. From Appendix B, we obtain
$w_{ij}^* = \frac{1 / \sigma_{ij}^2}{\sum\sum_{s < t} 1 / \sigma_{st}^2}.$
Then, we consider the following measure, which represents the degree of departure from symmetry:
$\Phi_S = \sum\sum_{i < j} w_{ij}^* \hat{\phi}_{ij}.$
The measure $\Phi_S$ has the smallest asymptotic variance among the weighted averages $\Phi$ defined above. Because the variances $\{\sigma_{ij}^2\}$ are unknown, they must be estimated.
We propose the estimated measure as follows:
$\hat{\Phi}_S = \sum\sum_{i < j} \hat{w}_{ij}^* \hat{\phi}_{ij},$
where $\hat{w}_{ij}^*$ is given by $w_{ij}^*$ with $\{p_{ij}\}$ replaced by $\{\hat{p}_{ij}\}$. The proposed measure approximates the measure in the class of weighted averages that has the smallest variance. The estimated measure $\hat{\Phi}_T$ is the weighted average of $\hat{\phi}_{ij}$ using the weights $\{\hat{w}_{ij} = (\hat{p}_{ij} + \hat{p}_{ji}) / \hat{\delta}\}$, where $\hat{\delta} = \sum\sum_{i \neq j} \hat{p}_{ij}$. On the other hand, the proposed measure $\hat{\Phi}_S$ is the weighted average of $\hat{\phi}_{ij}$ using the weights $\{\hat{w}_{ij}^*\}$. It should be noted that (i) $\hat{w}_{ij} = (\hat{p}_{ij} + \hat{p}_{ji}) / \hat{\delta}$ is the estimated conditional probability that an observation falls in cell $(i, j)$ or $(j, i)$, given that it falls in an off-diagonal cell, and (ii) the weight $\hat{w}_{ij}^*$ becomes larger as the variance of the partial measure $\hat{\phi}_{ij}$ decreases.
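To make the difference between the two weighting schemes concrete, the following sketch computes $\hat{\Phi}_S$ and $\hat{\Phi}_T$ for the birth-order data of Table 4. It is an illustration under stated assumptions rather than the authors' software: the function name `summary_measures` is hypothetical, the variance is the plug-in delta-method form, and the sketch raises an error for any pair with an empty cell or with exactly equal counts, because the plug-in variance is then undefined or zero and the inverse-variance weight cannot be formed.

```python
import math
from itertools import combinations

# Table 4: rows are mothers' birth order, columns are fathers' birth order.
TABLE_4 = [
    [224, 179, 53, 22],
    [162, 153, 35, 15],
    [37, 37, 18, 11],
    [12, 7, 3, 5],
]


def summary_measures(table):
    """Return (phi_s, phi_t, w_star, w) with pairs ordered (1,2), (1,3), ..."""
    phis, variances, pair_totals = [], [], []
    for i, j in combinations(range(len(table)), 2):
        n_ij, n_ji = table[i][j], table[j][i]
        if n_ij == 0 or n_ji == 0 or n_ij == n_ji:
            raise ValueError(f"inverse-variance weight undefined for pair {(i + 1, j + 1)}")
        total = n_ij + n_ji
        pi_ij, pi_ji = n_ij / total, n_ji / total
        phi = 1.0 + (pi_ij * math.log(pi_ij) + pi_ji * math.log(pi_ji)) / math.log(2.0)
        slope = math.log(pi_ij / pi_ji) / math.log(2.0)
        phis.append(phi)
        variances.append(pi_ij * pi_ji * slope ** 2 / total)
        pair_totals.append(total)
    # Weights of Phi_T: share of off-diagonal observations in each pair.
    off_diagonal = sum(pair_totals)
    w = [t / off_diagonal for t in pair_totals]
    # Weights of Phi_S: normalized inverse variances.
    inverse = [1.0 / v for v in variances]
    w_star = [x / sum(inverse) for x in inverse]
    phi_t = sum(a * b for a, b in zip(w, phis))
    phi_s = sum(a * b for a, b in zip(w_star, phis))
    return phi_s, phi_t, w_star, w


phi_s, phi_t, w_star, w = summary_measures(TABLE_4)
print(f"Phi_S = {phi_s:.4f}, Phi_T = {phi_t:.4f}")  # Table 5 reports 0.0018 and 0.0184
```

The pair (2,3), (3,2) shows the contrast: its counts are nearly equal, so its estimated variance is small and it receives a much larger weight under $\hat{w}_{ij}^*$ than under $\hat{w}_{ij}$.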

4. Numerical Examples

The objective is to confirm the difference in the single summary measure for symmetry by comparing the weights $\{\hat{w}_{ij}\}$ and $\{\hat{w}_{ij}^*\}$. Consider the artificial data in Table 1(a)–(d) with $n = 1000$ and Table 1(e) with $n = 200$. Table 1(a)–(d) are generated from multinomial random numbers based on the cell probability tables (a), (b), (c), and (d) in Table 2, respectively. Table 1(e) is generated from multinomial random numbers based on the cell probability table in Table 2(a). The artificial cell probability tables of Table 2 focus in particular on the probabilities of cells $(1, 2)$ and $(2, 1)$, and the four patterns (a), (b), (c), and (d) are set according to the combination of partial symmetry/asymmetry. We shall apply the partial measure $\phi_{ij}$. Table 3 shows the estimated partial measure $\hat{\phi}_{ij}$, estimated variance $\hat{\sigma}_{ij}^2$, confidence interval for $\phi_{ij}$, estimated weights $\{\hat{w}_{ij}\}$ for $\hat{\Phi}_T$, and estimated weights $\{\hat{w}_{ij}^*\}$ for $\hat{\Phi}_S$. Figure 1 visualizes the estimated partial measure $\hat{\phi}_{ij}$ and confidence interval for $\phi_{ij}$.
The confidence interval for $\phi_{12}$ applied to the data in Table 1(a) does not contain zero, indicating that there is a partially asymmetric structure in cells $(1, 2)$ and $(2, 1)$. Furthermore, the confidence interval for $\phi_{12}$ does not overlap with the confidence intervals for $\phi_{ij}$ for any other pair of cells, indicating that cells $(1, 2)$ and $(2, 1)$ are partially asymmetric compared to every other pair of cells. The $\hat{w}_{12}^*$ of $\hat{\Phi}_S$ is remarkably smaller than the $\hat{w}_{12}$ of $\hat{\Phi}_T$, and the value of $\hat{\Phi}_S$ is smaller than that of $\hat{\Phi}_T$.
The value of $\hat{\phi}_{12}$ applied to the data in Table 1(b) is as large as the value of $\hat{\phi}_{12}$ applied to the data in Table 1(a). However, it cannot be shown that there is a partially asymmetric structure in cells $(1, 2)$ and $(2, 1)$ in Table 1(b) because the confidence interval for $\phi_{12}$ is wide and contains zero. Both $\hat{\Phi}_S$ and $\hat{\Phi}_T$ in Table 1(b) have small values of $\hat{w}_{12}$ and $\hat{w}_{12}^*$, and in particular, the $\hat{w}_{12}^*$ for $\hat{\Phi}_S$ is remarkably small.
On the other hand, the value of $\hat{\phi}_{12}$ applied to the data in Table 1(c) indicates that there is a partially symmetric structure in cells $(1, 2)$ and $(2, 1)$ because the value of $\hat{\phi}_{12}$ is small and the confidence interval for $\phi_{12}$ contains zero. In addition, the $\hat{w}_{12}^*$ in Table 1(c) is large, indicating that the weight of $\hat{\Phi}_S$ is larger when the pair of cells is more frequent than others and has a partially symmetric structure. Both $\hat{\Phi}_S$ and $\hat{\Phi}_T$ applied to the data in Table 1(c) are close to zero because of the greater weight of the pair of cells $(1, 2)$ and $(2, 1)$ that show partial symmetry compared to the pairs of cells $(4, 5)$ and $(5, 4)$ and $(3, 4)$ and $(4, 3)$ that show partial asymmetry. The values of $\hat{\phi}_{ij}$ applied to the data in Table 1(d) indicate that the pair of cells $(1, 2)$ and $(2, 1)$ and the pair of cells $(1, 4)$ and $(4, 1)$ both have a partially symmetric structure because the confidence intervals of $\phi_{12}$ and $\phi_{14}$ include zero. It can be seen that the value of $\hat{w}_{14}^*$ for $\hat{\Phi}_S$ is large compared to $\hat{w}_{12}^*$.
The value of $\hat{\phi}_{12}$ applied to the data in Table 1(e) is about the same as the value applied to the data in Table 1(a), but the confidence interval is wider due to the smaller sample size, making it relatively difficult to conclude partial asymmetry for the pair of cells $(1, 2)$ and $(2, 1)$. The magnitude of $\hat{\phi}_{12}$ applied to Table 1(e) does not differ much from the results applied to Table 1(a). However, the values of $\hat{w}_{12}^*$ and $\hat{\Phi}_S$ are greater when applied to Table 1(e) than Table 1(a). It should be noted that the weight $\hat{w}_{ij}^*$ becomes larger as the variance of the partial measure $\hat{\phi}_{ij}$ decreases, and the weight $\hat{w}_{ij}$ becomes larger as the proportion $(n_{ij} + n_{ji}) / n$ increases.

5. Example

Consider the data in Table 4, derived from the national survey on educational attitudes of high school students and their mothers in Japan in 2012. To clarify the structure of educational inequalities in contemporary Japanese society and the actual educational awareness of parents and children, a postal survey was conducted among second-year high school students and their mothers throughout Japan, using the same framework as the national survey on the educational awareness of high school students and their mothers conducted in November 2002. The data describe the cross-classification of mothers’ and fathers’ birth orders. For example, for the 179 high school students whose mothers’ birth order is “First” and whose fathers’ birth order is “Second”, the mother is the eldest daughter and the father is the second son. The partial symmetry of cells $(1, 2)$ and $(2, 1)$ means that the probability of high school students whose mother is the eldest daughter and whose father is the second son is equal to that of high school students whose mother is the second daughter and whose father is the eldest son.
Let $G_S^2$ and $X_S^2$ denote the likelihood ratio and Pearson’s chi-squared statistics for testing the goodness of fit of the symmetry model, i.e., $G_S^2 = 2 \sum\sum_{i \neq j} n_{ij} \log ( 2 n_{ij} / ( n_{ij} + n_{ji} ) )$ and $X_S^2 = \sum\sum_{i < j} ( n_{ij} - n_{ji} )^2 / ( n_{ij} + n_{ji} )$. For large samples, $G_S^2$ and $X_S^2$ have a chi-squared null distribution with $r ( r - 1 ) / 2$ degrees of freedom. For the data in Table 4, $G_S^2 = 14.58$ and $X_S^2 = 14.17$ with six degrees of freedom. Both exceed 12.59, the upper 5% point of the chi-squared distribution with six degrees of freedom, which indicates the lack of a symmetrical structure. Note that the exact test introduced by West and Hankin [13] is well known as a test for contingency tables that include structural zeros. As the proposed measure does not require the frequencies of the diagonal cells, this test was also conducted assuming that the diagonal cells are structural zeros. The simulated p-value from this test is 0.085, so the independence of rows and columns is not rejected at the 5% significance level. The value of $\hat{\Phi}_T$ is 0.0184, and the 95% confidence interval is (0.000003, 0.036719), which does not include zero.
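The two goodness-of-fit statistics can be reproduced directly from the off-diagonal counts. The sketch below is a plain implementation of the formulas for $G_S^2$ and $X_S^2$ given in the text, applied to Table 4. The function name `symmetry_statistics` is hypothetical, and pairs with no observations are skipped, which is an assumption of this sketch rather than something the paper specifies.

```python
import math

# Table 4: rows are mothers' birth order, columns are fathers' birth order.
TABLE_4 = [
    [224, 179, 53, 22],
    [162, 153, 35, 15],
    [37, 37, 18, 11],
    [12, 7, 3, 5],
]


def symmetry_statistics(table):
    """Return (G2, X2, df) for testing the symmetry model."""
    r = len(table)
    g2 = 0.0
    x2 = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            n_ij, n_ji = table[i][j], table[j][i]
            total = n_ij + n_ji
            if total == 0:
                continue  # assumption: an empty pair contributes nothing
            x2 += (n_ij - n_ji) ** 2 / total
            for count in (n_ij, n_ji):
                if count > 0:  # convention: 0 * log(0) = 0
                    g2 += 2.0 * count * math.log(2.0 * count / total)
    df = r * (r - 1) // 2
    return g2, x2, df


g2, x2, df = symmetry_statistics(TABLE_4)
print(f"G2 = {g2:.2f}, X2 = {x2:.2f}, df = {df}")
```

For Table 4 the output agrees with the reported values $G_S^2 = 14.58$ and $X_S^2 = 14.17$ on six degrees of freedom. The likelihood ratio statistic is also linked to Tomizawa's measure through $G_S^2 = 2 \log 2 \cdot n \hat{\delta} \hat{\Phi}_T$, which follows from substituting the sample proportions into the definition of $\Phi_T$.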
Next, we measured the degree of departure from partial symmetry for each pair of cells. We shall apply the partial measure $\phi_{ij}$ for the data in Table 4. Table 5 shows the estimated values for $\phi_{ij}$ and $\sigma_{ij}^2$, confidence intervals for $\phi_{ij}$, estimated weights $\{\hat{w}_{ij}\}$ and $\{\hat{w}_{ij}^*\}$, and estimated measures $\hat{\Phi}_S$ and $\hat{\Phi}_T$. According to the magnitudes of the estimates, $\phi_{ij}$ can explain the partial symmetry for each pair of cells in Table 4. The 95% confidence intervals for $\phi_{ij}$ contain zero for all pairs of cells, which indicates that there is a partially symmetrical structure in each birth order category in the mother–father pairs.
Furthermore, the estimated departure from symmetry is smaller with $\hat{\Phi}_S$, which uses different weights than $\hat{\Phi}_T$. Figure 2 plots the estimated weights $\hat{w}_{ij}$ and $\hat{w}_{ij}^*$. Cells $(1, 2)$ and $(2, 1)$ have similar frequencies and are more frequent than the other cells. Consequently, the weights $\hat{w}_{ij}$ and $\hat{w}_{ij}^*$ for this pair are similar and large. On the other hand, the pair of cells $(2, 3)$ and $(3, 2)$ have similar frequencies, but are less frequent than the pair of cells $(1, 2)$ and $(2, 1)$. In such cases, $\hat{w}_{ij}^*$ is larger than $\hat{w}_{ij}$. Therefore, $\hat{\Phi}_S$ gives a pair of cells a higher weight than $\hat{\Phi}_T$ does when the pair has a lower frequency than another pair and the two cells have similar frequencies. Conversely, $\hat{w}_{ij}^*$ is smaller than $\hat{w}_{ij}$ when the frequencies are different, as in the pair of cells $(1, 3)$ and $(3, 1)$. Since the weights $\{\hat{w}_{ij}^*\}$ and $\{\hat{w}_{ij}\}$ take different values, the single summary measures $\hat{\Phi}_S$ and $\hat{\Phi}_T$ also take different values. As mentioned above, there is a partially symmetrical structure in each birth order category in the mother–father pairs. Thus, the proposed measure may be reasonable for expressing the degree of departure from symmetry.

6. Concluding Remarks

We proposed a partial measure to express the degree of departure from partial symmetry. The measure was constructed as the weighted average of partial measures expressed using the Shannon entropy or Kullback–Leibler information. The composition of the proposed measure $\Phi_S$ is similar to that of the measure proposed by Tomizawa [10] in the sense that both belong to the class of weighted averages. However, they differ in that the weights of $\Phi_S$ are constructed so as to minimize the variance of the measure. This measure increases with the degree of departure from symmetry, allowing us to see how far the probability structure of the contingency table departs from symmetry toward complete asymmetry.
The measures $\hat{\Phi}_S$ and $\hat{\Phi}_T$ are invariant under the arbitrary simultaneous permutations of row and column categories, and therefore, it is possible to apply these measures to analyze the data on a nominal scale, as well as on an ordinal scale if one cannot use the information about the order in which the categories are listed.
We compared the weights used to construct the measures $\hat{\Phi}_S$ and $\hat{\Phi}_T$. Those used to construct $\hat{\Phi}_T$ are large when the frequency of the pair of cells is high compared to others. On the other hand, the weight of $\hat{\Phi}_S$ is higher when the frequency of the pair of cells is higher than others and when the structure is partially symmetric. Conversely, when the frequency of the pair of cells is lower than others and the structure is partially asymmetric, the weights of $\hat{\Phi}_S$ are smaller than those of $\hat{\Phi}_T$.
In the present study, confidence intervals for partial measures were used to interpret the partially asymmetric structure of the data. Alternatively, global tests of the null hypothesis that all $\phi_{ij}$ are equal to zero, as well as multiplicity corrections for paired comparisons, also need to be considered and are left as future work.
We should note, however, that $\hat{\Phi}_S$ cannot be calculated if any of the off-diagonal cells are zero. As such, the proposed measure should be used for contingency tables with large sample sizes.

Author Contributions

Conceptualization, T.I., K.T. and K.Y.; methodology, T.I., K.T. and K.Y.; software, T.I.; validation, K.T.; formal analysis, T.I.; investigation, T.I.; writing—original draft preparation, T.I., K.T. and K.Y.; writing—review and editing, T.I., K.T., K.Y. and S.T.; visualization, T.I.; supervision, S.T.; project administration, K.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The data used in the example in Section 5, from the national survey on educational attitudes of high school students and their mothers in Japan in 2012, were provided by the Social Science Japan Data Archive, Center for Social Research and Data Archives, Institute of Social Science, The University of Tokyo. Furthermore, the authors are grateful to an Associate Editor and three Referees for their constructive comments and useful suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

From the central limit theorem, $\hat{p}$ is asymptotically distributed as normal $N(p, \Sigma_1(p))$, where $\Sigma_1(p)$ is the $r^2 \times r^2$ matrix
$\Sigma_1(p) = \frac{1}{n} ( D(p) - p p' ),$
where $D(p)$ denotes a diagonal matrix with the $i$th element of $p$ as the $i$th diagonal element. Then, we also obtain
$\hat{\phi} = \phi + d_1(p) ( \hat{p} - p ) + o( \| \hat{p} - p \| ),$
where $d_1(p) = \partial \phi / \partial p'$ is the $r(r-1)/2 \times r^2$ matrix. Thus, $\hat{\phi}$ is asymptotically distributed as normal $N(\phi, \sigma^2[\hat{\phi}])$, where
$\sigma^2[\hat{\phi}] = d_1(p) \, \Sigma_1(p) \, d_1(p)'.$
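Let $\hat{\pi}'$ be the $1 \times r(r-1)$ vector:
$\hat{\pi}' = ( \hat{\pi}_{(12)}', \hat{\pi}_{(13)}', \ldots, \hat{\pi}_{(r-1,r)}' ),$
where $\hat{\pi}_{(ij)}' = ( \hat{\pi}_{ij}, \hat{\pi}_{ji} )$, and define $\pi$ in terms of the $\pi_{ij}$ in the same way. Noting that $\phi$ is a function of only $\{\pi_{ij}\}$, we obtain
$d_1(p) = \frac{\partial \phi}{\partial \pi'} \cdot \frac{\partial \pi}{\partial p'}.$
It should be noted that $\partial \phi / \partial \pi'$ is the $r(r-1)/2 \times r(r-1)$ matrix and $\partial \pi / \partial p'$ is the $r(r-1) \times r^2$ matrix. By obtaining $\partial \pi / \partial p'$, we can see that $\sigma^2[\hat{\phi}]$ is expressed as
$\sigma^2[\hat{\phi}] = \frac{\partial \phi}{\partial \pi'} \cdot \Sigma_2(p) \cdot \left( \frac{\partial \phi}{\partial \pi'} \right)',$
where $\Sigma_2(p)$ is the $r(r-1) \times r(r-1)$ block-diagonal matrix:
$\Sigma_2(p) = \begin{pmatrix} \Sigma_{12}(p) & 0 & \cdots & 0 \\ 0 & \Sigma_{13}(p) & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \Sigma_{r-1,r}(p) \end{pmatrix},$
where
$\Sigma_{ij}(p) = \frac{1}{n ( p_{ij} + p_{ji} )} \begin{pmatrix} \pi_{ij} ( 1 - \pi_{ij} ) & - \pi_{ij} \pi_{ji} \\ - \pi_{ij} \pi_{ji} & \pi_{ji} ( 1 - \pi_{ji} ) \end{pmatrix} \quad ( i < j ).$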
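Thus, $\sigma^2[\hat{\phi}]$ is also expressed as follows:
$\sigma^2[\hat{\phi}] = \begin{pmatrix} \sigma_{12}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{13}^2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_{r-1,r}^2 \end{pmatrix},$
where
$\sigma_{ij}^2 = \frac{\pi_{ij} \pi_{ji}}{n ( p_{ij} + p_{ji} )} \left( \frac{\log \pi_{ij} - \log \pi_{ji}}{\log 2} \right)^2 \quad ( i < j ).$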

Appendix B

Let $w$ be the $r(r-1)/2 \times 1$ vector:
$w = ( w_{12}, w_{13}, \ldots, w_{r-1,r} )'.$
Then, the weighted average $\Phi$ of Section 3 is expressed as $\Phi = w' \hat{\phi}$.
From Appendix A, the mean and variance of $\Phi$ are approximately calculated as follows:
$E(\Phi) = w' \phi,$
$Var(\Phi) = w' \sigma^2[\hat{\phi}] w.$
Then, we can obtain the following $w^*$ so as to minimize $Var(\Phi)$ with the constraint that $w' 1_d$ is unity, where $d = r(r-1)/2$ and $1_d$ is the $d \times 1$ vector of ones:
$w_{ij}^* = \frac{1 / \sigma_{ij}^2}{\sum\sum_{s < t} 1 / \sigma_{st}^2} \quad ( i < j ).$
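The minimizing weights can be obtained with a Lagrange multiplier. The derivation below is a sketch of one standard route to this result, assuming all $\sigma_{ij}^2 > 0$ and using the diagonal form of $\sigma^2[\hat{\phi}]$ from Appendix A. The multiplier $\lambda$ and the function $L$ are notation introduced here for illustration.

```latex
\begin{align*}
L(w, \lambda) &= \sum\sum_{i<j} w_{ij}^2 \sigma_{ij}^2
                 - \lambda \Bigl( \sum\sum_{i<j} w_{ij} - 1 \Bigr), \\
\frac{\partial L}{\partial w_{ij}} &= 2 w_{ij} \sigma_{ij}^2 - \lambda = 0
  \;\Longrightarrow\; w_{ij} = \frac{\lambda}{2 \sigma_{ij}^2}, \\
\sum\sum_{i<j} w_{ij} = 1
  &\;\Longrightarrow\; \frac{\lambda}{2}
     = \Bigl( \sum\sum_{s<t} \frac{1}{\sigma_{st}^2} \Bigr)^{-1}, \\
w_{ij}^* &= \frac{1/\sigma_{ij}^2}{\sum\sum_{s<t} 1/\sigma_{st}^2},
\qquad
Var(\Phi)\big|_{w = w^*}
  = \Bigl( \sum\sum_{s<t} \frac{1}{\sigma_{st}^2} \Bigr)^{-1}.
\end{align*}
```

Because the objective is a strictly convex quadratic in $w$ when every $\sigma_{ij}^2$ is positive, this stationary point is the unique minimizer, and each $w_{ij}^*$ is positive, so the positivity constraint is satisfied automatically.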

References

1. Agresti, A. Categorical Data Analysis, 3rd ed.; Wiley: Hoboken, NJ, USA, 2013.
2. Bishop, Y.M.M.; Fienberg, S.E.; Holland, P.W. Discrete Multivariate Analysis: Theory and Practice; The MIT Press: Cambridge, MA, USA, 1975.
3. Beh, E.J.; Simonetti, B.; D’Ambra, L. Partitioning a non-symmetric measure of association for three-way contingency tables. J. Multivar. Anal. 2007, 98, 1391–1411.
4. Lombardo, R. Three-way association measure decompositions: The Delta index. J. Stat. Plan. Inference 2011, 141, 1789–1799.
5. Wei, Z.; Kim, D. Subcopula-based measure of asymmetric association for contingency tables. Stat. Med. 2017, 36, 3875–3894.
6. Wei, Z.; Kim, D. Measure of asymmetric association for ordinal contingency tables via the bilinear extension copula. Stat. Probab. Lett. 2021, 178, 109183.
7. Zhang, L.; Lu, D.; Wang, X. The essential dependence for a group of random vectors. Commun. Stat.-Theory Methods 2021, 50, 5836–5872.
8. Wei, Z.; Kim, D.; Conlon, E.M. A Bayesian approach to the analysis of asymmetric association for two-way contingency tables. Comput. Stat. 2022, 37, 1311–1338.
9. Bowker, A.H. A test for symmetry in contingency tables. J. Am. Stat. Assoc. 1948, 43, 572–574.
10. Tomizawa, S. Two kinds of measures of departure from symmetry in square contingency tables having nominal categories. Stat. Sin. 1994, 4, 325–334.
11. Fernandes, L.H.S.; Araújo, F.H.A. Taxonomy of commodities assets via the complexity-entropy causality plane. Chaos Solitons Fractals 2020, 137, 109909.
12. Agresti, A. Analysis of Ordinal Categorical Data; Wiley: New York, NY, USA, 1984.
13. West, L.J.; Hankin, R.K.S. Exact tests for two-way contingency tables with structural zeros. J. Stat. Softw. 2008, 28, 1–19.
Figure 1. Estimate of measure $\phi_{ij}$ and approximate 95% confidence interval for $\phi_{ij}$ applied to Table 1.
Figure 2. The weight for each pair of symmetric cells obtained by applying the proposed method and Tomizawa [10] to Table 4.
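Table 1. Artificial data.

(a)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 37 | 544 | 12 | 7 | 8 |
| (2) | 102 | 26 | 15 | 15 | 12 |
| (3) | 9 | 8 | 29 | 10 | 11 |
| (4) | 9 | 9 | 12 | 40 | 12 |
| (5) | 14 | 9 | 10 | 11 | 29 |

(b)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 47 | 11 | 37 | 44 | 48 |
| (2) | 3 | 38 | 34 | 37 | 49 |
| (3) | 44 | 44 | 52 | 56 | 48 |
| (4) | 38 | 25 | 55 | 45 | 47 |
| (5) | 35 | 25 | 51 | 43 | 44 |

(c)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 33 | 316 | 13 | 18 | 18 |
| (2) | 321 | 37 | 20 | 18 | 20 |
| (3) | 7 | 6 | 26 | 16 | 14 |
| (4) | 5 | 5 | 1 | 30 | 19 |
| (5) | 5 | 10 | 5 | 2 | 35 |

(d)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 39 | 4 | 70 | 42 | 50 |
| (2) | 5 | 34 | 45 | 110 | 84 |
| (3) | 17 | 12 | 54 | 103 | 63 |
| (4) | 31 | 14 | 20 | 39 | 48 |
| (5) | 9 | 29 | 26 | 6 | 46 |

(e)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 7 | 103 | 1 | 1 | 4 |
| (2) | 19 | 10 | 2 | 2 | 4 |
| (3) | 2 | 1 | 6 | 2 | 2 |
| (4) | 3 | 4 | 4 | 5 | 2 |
| (5) | 1 | 2 | 4 | 1 | 8 |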
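Table 2. Artificial cell probability tables.

(a)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 0.030 | 0.570 | 0.010 | 0.010 | 0.010 |
| (2) | 0.010 | 0.030 | 0.010 | 0.010 | 0.010 |
| (3) | 0.010 | 0.010 | 0.030 | 0.010 | 0.010 |
| (4) | 0.010 | 0.010 | 0.010 | 0.030 | 0.010 |
| (5) | 0.010 | 0.010 | 0.010 | 0.010 | 0.030 |

(b)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 0.040 | 0.008 | 0.040 | 0.040 | 0.040 |
| (2) | 0.004 | 0.040 | 0.040 | 0.040 | 0.040 |
| (3) | 0.040 | 0.040 | 0.050 | 0.050 | 0.050 |
| (4) | 0.040 | 0.040 | 0.050 | 0.040 | 0.049 |
| (5) | 0.040 | 0.040 | 0.050 | 0.049 | 0.040 |

(c)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 0.030 | 0.320 | 0.015 | 0.015 | 0.015 |
| (2) | 0.320 | 0.030 | 0.021 | 0.019 | 0.024 |
| (3) | 0.005 | 0.007 | 0.030 | 0.020 | 0.016 |
| (4) | 0.005 | 0.006 | 0.005 | 0.030 | 0.018 |
| (5) | 0.005 | 0.007 | 0.004 | 0.003 | 0.030 |

(d)

|     | (1) | (2) | (3) | (4) | (5) |
|-----|-----|-----|-----|-----|-----|
| (1) | 0.040 | 0.003 | 0.080 | 0.050 | 0.050 |
| (2) | 0.003 | 0.040 | 0.050 | 0.100 | 0.080 |
| (3) | 0.020 | 0.010 | 0.050 | 0.100 | 0.064 |
| (4) | 0.030 | 0.020 | 0.020 | 0.040 | 0.050 |
| (5) | 0.010 | 0.020 | 0.020 | 0.020 | 0.040 |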
Table 3. Estimate of measure $\phi_{ij}$, estimated approximate variance for $\phi_{ij}$, approximate 95% confidence interval for $\phi_{ij}$, weights for measures $\Phi_S$ and $\Phi_T$, and estimates of measures $\Phi_S$ and $\Phi_T$, applied to Table 1(a)–(e). The estimates $\hat{\Phi}_S$ and $\hat{\Phi}_T$ are given once per dataset, in its first row.

| Applied data | Cells | $\hat{\phi}_{ij}$ | $\hat{\sigma}_{ij}^2$ | Confidence interval for $\phi_{ij}$ | $\hat{w}_{ij}^*$ | $\hat{w}_{ij}$ | $\hat{\Phi}_S$ | $\hat{\Phi}_T$ |
|---|---|---|---|---|---|---|---|---|
| Table 1(a) | (1,2), (2,1) | 0.37075 | 0.0012005 | (0.303, 0.439) | 0.057987 | 0.769964 | 0.02624 | 0.29124 |
| | (1,3), (3,1) | 0.01477 | 0.0020088 | (−0.073, 0.103) | 0.034653 | 0.025030 | | |
| | (1,4), (4,1) | 0.01130 | 0.0020219 | (−0.077, 0.099) | 0.034428 | 0.019070 | | |
| | (1,5), (5,1) | 0.05434 | 0.0068561 | (−0.108, 0.217) | 0.010153 | 0.026222 | | |
| | (2,3), (3,2) | 0.06789 | 0.0081116 | (−0.109, 0.244) | 0.008582 | 0.027414 | | |
| | (2,4), (4,2) | 0.04557 | 0.0053039 | (−0.097, 0.188) | 0.013124 | 0.028605 | | |
| | (2,5), (5,2) | 0.01477 | 0.0020088 | (−0.073, 0.103) | 0.034653 | 0.025030 | | |
| | (3,4), (4,3) | 0.00597 | 0.0007797 | (−0.049, 0.061) | 0.089277 | 0.026222 | | |
| | (3,5), (5,3) | 0.00164 | 0.0002246 | (−0.028, 0.031) | 0.309966 | 0.025030 | | |
| | (4,5), (5,4) | 0.00136 | 0.0001710 | (−0.024, 0.027) | 0.407178 | 0.027414 | | |
| Table 1(b) | (1,2), (2,1) | 0.25041 | 0.0422558 | (−0.152, 0.653) | 0.000032 | 0.018088 | 0.00036 | 0.01843 |
| | (1,3), (3,1) | 0.00539 | 0.0001914 | (−0.022, 0.033) | 0.006979 | 0.104651 | | |
| | (1,4), (4,1) | 0.00387 | 0.0001357 | (−0.019, 0.027) | 0.009848 | 0.105943 | | |
| | (1,5), (5,1) | 0.01777 | 0.0006101 | (−0.031, 0.066) | 0.002190 | 0.107235 | | |
| | (2,3), (3,2) | 0.01189 | 0.0004362 | (−0.029, 0.053) | 0.003063 | 0.100775 | | |
| | (2,4), (4,2) | 0.02719 | 0.0012416 | (−0.042, 0.096) | 0.001076 | 0.080103 | | |
| | (2,5), (5,2) | 0.07727 | 0.0028494 | (−0.027, 0.182) | 0.000469 | 0.095607 | | |
| | (3,4), (4,3) | 0.00006 | 0.0000015 | (−0.002, 0.002) | 0.877858 | 0.143411 | | |
| | (3,5), (5,3) | 0.00066 | 0.0000193 | (−0.008, 0.009) | 0.069221 | 0.127907 | | |
| | (4,5), (5,4) | 0.00143 | 0.0000457 | (−0.012, 0.015) | 0.029264 | 0.116279 | | |
| Table 1(c) | (1,2), (2,1) | 0.00004 | 0.0000002 | (−0.001, 0.001) | 0.999900 | 0.759237 | 0.00006 | 0.06270 |
| | (1,3), (3,1) | 0.06593 | 0.0090727 | (−0.121, 0.253) | 0.000022 | 0.023838 | | |
| | (1,4), (4,1) | 0.24463 | 0.0252616 | (−0.067, 0.556) | 0.000008 | 0.027414 | | |
| | (1,5), (5,1) | 0.24463 | 0.0252616 | (−0.067, 0.556) | 0.000008 | 0.027414 | | |
| | (2,3), (3,2) | 0.22065 | 0.0205989 | (−0.061, 0.502) | 0.000010 | 0.030989 | | |
| | (2,4), (4,2) | 0.24463 | 0.0252616 | (−0.067, 0.556) | 0.000008 | 0.027414 | | |
| | (2,5), (5,2) | 0.08170 | 0.0074074 | (−0.087, 0.250) | 0.000027 | 0.035757 | | |
| | (3,4), (4,3) | 0.67724 | 0.0521067 | (0.230, 1.125) | 0.000004 | 0.020262 | | |
| | (3,5), (5,3) | 0.16853 | 0.0225185 | (−0.126, 0.463) | 0.000009 | 0.022646 | | |
| | (4,5), (5,4) | 0.54628 | 0.0432851 | (0.139, 0.954) | 0.000005 | 0.025030 | | |
| Table 1(d) | (1,2), (2,1) | 0.00892 | 0.0028433 | (−0.096, 0.113) | 0.113903 | 0.011421 | 0.11518 | 0.28830 |
| | (1,3), (3,1) | 0.28736 | 0.0075340 | (0.117, 0.457) | 0.042986 | 0.110406 | | |
| | (1,4), (4,1) | 0.01644 | 0.0006424 | (−0.033, 0.066) | 0.504107 | 0.092640 | | |
| | (1,5), (5,1) | 0.38383 | 0.0134101 | (0.157, 0.611) | 0.024150 | 0.074873 | | |
| | (2,3), (3,2) | 0.25751 | 0.0106028 | (0.056, 0.459) | 0.030545 | 0.072335 | | |
| | (2,4), (4,2) | 0.49139 | 0.0071440 | (0.326, 0.657) | 0.045333 | 0.157360 | | |
| | (2,5), (5,2) | 0.17837 | 0.0039745 | (0.055, 0.302) | 0.081484 | 0.143401 | | |
| | (3,4), (4,3) | 0.35950 | 0.0061895 | (0.205, 0.514) | 0.052323 | 0.156091 | | |
| | (3,5), (5,3) | 0.12854 | 0.0037881 | (0.008, 0.249) | 0.085494 | 0.112944 | | |
| | (4,5), (5,4) | 0.49674 | 0.0164609 | (0.245, 0.748) | 0.019674 | 0.068528 | | |
| Table 1(e) | (1,2), (2,1) | 0.37599 | 0.0064089 | (0.219, 0.533) | 0.486330 | 0.743902 | 0.23244 | 0.30922 |
| | (1,3), (3,1) | 0.08170 | 0.0740741 | (−0.452, 0.615) | 0.042077 | 0.018293 | | |
| | (1,4), (4,1) | 0.18872 | 0.1177550 | (−0.484, 0.861) | 0.026469 | 0.024390 | | |
| | (1,5), (5,1) | 0.27807 | 0.1280000 | (−0.423, 0.979) | 0.024350 | 0.030488 | | |
| | (2,3), (3,2) | 0.08170 | 0.0740741 | (−0.452, 0.615) | 0.042077 | 0.018293 | | |
| | (2,4), (4,2) | 0.08170 | 0.0370370 | (−0.295, 0.459) | 0.084155 | 0.036585 | | |
| | (2,5), (5,2) | 0.08170 | 0.0370370 | (−0.295, 0.459) | 0.084155 | 0.036585 | | |
| | (3,4), (4,3) | 0.08170 | 0.0370370 | (−0.295, 0.459) | 0.084155 | 0.036585 | | |
| | (3,5), (5,3) | 0.08170 | 0.0370370 | (−0.295, 0.459) | 0.084155 | 0.036585 | | |
| | (4,5), (5,4) | 0.08170 | 0.0740741 | (−0.452, 0.615) | 0.042077 | 0.018293 | | |
Table 4. Cross-classification of mothers’ and fathers’ birth orders. Rows give the mothers’ birth order and columns give the fathers’ birth order.

| Mothers’ Birth Order | First | Second | Third | Fourth or More | Total |
|---|---|---|---|---|---|
| First | 224 | 179 | 53 | 22 | 478 |
| Second | 162 | 153 | 35 | 15 | 365 |
| Third | 37 | 37 | 18 | 11 | 103 |
| Fourth or more | 12 | 7 | 3 | 5 | 27 |
| Total | 435 | 376 | 109 | 53 | 973 |
Table 5. Estimate of measure $\phi_{ij}$, estimated approximate variance for $\phi_{ij}$, approximate 95% confidence interval for $\phi_{ij}$, estimates of measures of $\Phi_S$ and $\Phi_T$, and weights for measures of $\Phi_S$ and $\Phi_T$, applied to Table 4. The estimates $\hat{\Phi}_S$ and $\hat{\Phi}_T$ are given once, in the first row.

| Cells | $\hat{\phi}_{ij}$ | $\hat{\sigma}_{ij}^2$ | Confidence interval for $\phi_{ij}$ | $\hat{w}_{ij}^*$ | $\hat{w}_{ij}$ | $\hat{\Phi}_S$ | $\hat{\Phi}_T$ |
|---|---|---|---|---|---|---|---|
| (1,2), (2,1) | 0.0018 | 0.00002 | (−0.006, 0.009) | 0.5864 | 0.5951 | 0.0018 | 0.0184 |
| (1,3), (3,1) | 0.0229 | 0.00072 | (−0.030, 0.076) | 0.0123 | 0.1571 | | |
| (1,4), (4,1) | 0.0633 | 0.00514 | (−0.077, 0.204) | 0.0017 | 0.0593 | | |
| (2,3), (3,2) | 0.0006 | 0.00002 | (−0.009, 0.010) | 0.3986 | 0.1257 | | |
| (2,4), (4,2) | 0.0976 | 0.01192 | (−0.116, 0.312) | 0.0007 | 0.0384 | | |
| (3,4), (4,3) | 0.2504 | 0.04226 | (−0.152, 0.653) | 0.0002 | 0.0244 | | |
 Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Ishihara, T.; Yamamoto, K.; Tahata, K.; Tomizawa, S. Partial Asymmetry Measures for Square Contingency Tables. Symmetry 2022, 14, 1936. https://doi.org/10.3390/sym14091936
