Abstract
Contingency tables highlight relationships between categorical variables. Typically, the symmetry or marginal homogeneity of a square contingency table is evaluated. The original symmetry model often fits a dataset poorly because of its strong restrictions. Caussinus proposed the quasi-symmetry model, which serves as a bridge between symmetry and marginal homogeneity in square contingency tables. His study significantly influenced methodological developments in the statistical analysis of categorical data. Herein, recent advances in quasi-symmetry are reviewed with an emphasis on four topics related to the author’s results: (1) modeling based on the f-divergence, (2) the necessary and sufficient condition of symmetry, (3) partition of test statistics for symmetry, and (4) measure of the departure from symmetry. The asymmetry model based on the f-divergence can express various asymmetries. Additionally, these models are useful for deriving necessary and sufficient conditions of symmetry with desirable properties. This review may be useful when considering statistical modeling and measures of symmetry for contingency tables with the same classifications.
1. Introduction
Contingency tables play important roles in various fields, as they highlight relationships between categorical variables. Typically, the analysis of contingency tables focuses on whether row and column variables are independent. If the independence hypothesis is rejected, then the association between the variables is of interest. Various coefficients have been proposed to measure the association, such as gamma, Yule’s Q, Kendall’s tau-b, Kendall’s tau, and Somers’ d. See (Bishop et al. [1], Ch. 11) and (Agresti [2], pp. 184–192). Bishop et al. [1], Agresti [2], and Kateri [3] present overviews of contingency table analysis. Additionally, Kateri [4] reviewed the ϕ-divergence association models related to the independence model for two-way contingency tables. Fujisawa and Tahata [5] proposed the generalized asymmetry plus quasi-uniform association model. These works in the literature provide reviews of contingency table analysis based on association models.
Contingency tables with the same row and column classifications are often called square contingency tables. Square contingency table data arise in many fields; examples include unaided distance vision data, social mobility data, father–son matched educational level data, and longitudinal data in biomedical research. Typically, the analysis of square contingency tables considers the issue of symmetry rather than independence (null association) because observations tend to concentrate on or near the main diagonal. Therefore, many statisticians have considered various symmetry and asymmetry models, and modeling of symmetry is one of the important topics for the analysis of square contingency tables. For example, Bowker [6] proposed a test for the hypothesis of symmetry (S). The S model indicates the symmetry structure of cell probabilities. Additionally, Stuart [7] provided a large-sample test for marginal homogeneity (MH). The MH model indicates the equivalence of marginal probabilities.
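As a concrete illustration of the S hypothesis, the following minimal sketch computes Bowker's chi-square statistic, the sum over pairs i < j of (n_ij − n_ji)² / (n_ij + n_ji), where n_ij is the observed count in cell (i, j). The function name `bowker_statistic` and the 3 × 3 counts are hypothetical. Skipping empty symmetric pairs, with a matching reduction of the degrees of freedom, is a convention assumed here rather than something stated in this review.

```python
def bowker_statistic(table):
    """Return (statistic, df) for Bowker's test of symmetry.

    `table` is a square list of lists of non-negative counts.
    """
    r = len(table)
    stat = 0.0
    df = 0
    for i in range(r):
        for j in range(i + 1, r):
            pair_total = table[i][j] + table[j][i]
            if pair_total > 0:  # assumed convention: skip empty pairs
                stat += (table[i][j] - table[j][i]) ** 2 / pair_total
                df += 1
    return stat, df


# Hypothetical 3 x 3 counts, made up for illustration only.
counts = [
    [10, 5, 3],
    [3, 12, 4],
    [1, 2, 9],
]
stat, df = bowker_statistic(counts)
print(f"X^2 = {stat:.4f}, df = {df}")
```

Under the S model, the statistic is referred to a chi-square distribution with the returned degrees of freedom, which equals R(R − 1)/2 when no symmetric pair is empty.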
Caussinus [8] developed quasi-symmetry (QS). The QS model bridges S and MH in square tables. The QS model indicates the symmetry of the odds ratio with respect to the main diagonal of the square contingency table. If the S model holds, then the MH model holds. However, the converse is not always true. The QS model can be used to show that the S model holds if and only if both the QS and MH models hold. This finding has influenced the methodological developments in statistical analysis of categorical data. See Section 3 for more details.
A special issue of Annales de la Faculté des Sciences de Toulouse, Mathématiques was published in 2002. It contained papers written by internationally distinguished authors on topics related to QS. Agresti [9] described some generalizations of the QS model that have similar connections with generalizations of the Rasch model. Goodman [10] commented on the relationship between the QS model and the quasi-independence model. McCullagh [11] set out to list all hereditary sub-representations by real-valued square matrices and to explain how these may be used in model construction. The special issue also contains Fienberg and van der Heijden [12], Bergsma and Rudas [13], Dossou-Gbété and Grorud [14], Erosheva et al. [15], De Falguerolles and van der Heijden [16], Stigler [17], Thélot [18], and Caussinus [19].
Since then, numerous studies have treated QS from a variety of perspectives. For square contingency tables, Tomizawa [20] introduced modeling based on the cumulative probabilities. Miyamoto et al. [21] proposed the cumulative QS model for square contingency tables with ordinal categories. Kateri et al. [22] considered generalized QS models. To evaluate the goodness of fit, Booth et al. [23] adapted a network algorithm to test QS in square tables. Krampe et al. [24] proposed a method based on algebraic statistics combined with Markov chain Monte Carlo (MCMC) methods for the ordinal QS (OQS) model. Additionally, various topics related to the QS model are discussed in mathematical statistics. For example, see Rapallo [25], Pardo and Martin [26], and Gottard et al. [27]. Tomizawa and Tahata [28] reviewed some topics on various symmetry models and showed the property of test statistics of symmetry for multi-way contingency tables. Tahata and Tomizawa [29] reviewed various models of symmetry and asymmetry and presented the relationships among models.
Topics related to the QS model have been discussed in several papers over the last five years. Bocci and Rapallo [30] reviewed why the synergy between algebraic statistics and quasi-independence has been fruitful. Additionally, see Khan and Tewari [31]. Altun [32] considered various symmetry and asymmetry models for square contingency tables with ordinal categories. Tahata et al. [33] proposed model selection via a penalized likelihood approach. Karadağ et al. [34] applied symmetry models for square contingency tables to cross-classified single nucleotide polymorphism (SNP) interaction data. Altunay and Yilmaz [35] proposed two novel log-linear models to measure the degree of accumulation of the neutral option over the contingency tables based on Likert-type items. Additionally, Ando [36,37,38] proposed models that indicate the structure of asymmetry and gave decompositions of the models. For square contingency tables with ordinal categories, models with various ordered scores were compared in Ando [39].
This review focuses on the further developments of modeling and properties of symmetry in recent years. Additionally, some comments which are not described in referenced papers are added. Herein four topics related to Tahata [40] and Tahata et al. [41] are reviewed: (1) modeling based on the f-divergence, (2) the necessary and sufficient condition of symmetry, (3) partition of test statistics for symmetry, and (4) the measure of departure from symmetry.
The rest of this paper is organized as follows. Section 2 demonstrates modeling based on the f-divergence. Section 3 reviews the necessary and sufficient condition of symmetry. The result given by Caussinus [8] is included as a special case. Aitchison [42], Darroch and Silvey [43], Read [44], Lang and Agresti [45], and Lang [46] discussed the partitioning of goodness of fit statistics. Section 4 presents an overview of the main results presented in Tomizawa and Tahata [28]. Caussinus’ QS model has a good property from a partitioning point of view. Section 5 introduces the measure of departure from symmetry. Section 6 discusses the relationships between the measure and asymmetry model. Finally, Section 7 contains concluding remarks.
2. Modeling Based on the f-Divergence
Consider an $R \times R$ square contingency table with the same row and column classifications. Let X and Y denote the row and column variables, respectively, and let $\pi_{ij}$ denote the probability that an observation will fall in the $(i,j)$th cell of the table ($i = 1, \ldots, R$; $j = 1, \ldots, R$).
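The S model, which is defined as $\pi_{ij} = \pi_{ji}$ for $i \neq j$, indicates the symmetry structure of cell probabilities (Bowker [6]). Additionally, the MH model indicates the equivalence of marginal probabilities and is defined as $\pi_{i\cdot} = \pi_{\cdot i}$ for $i = 1, \ldots, R$, where $\pi_{i\cdot} = \sum_{t=1}^{R} \pi_{it}$ and $\pi_{\cdot i} = \sum_{s=1}^{R} \pi_{si}$ (Stuart [7]). The QS model is defined as
$$\pi_{ij} = \alpha_i \beta_j \psi_{ij} \quad (i = 1, \ldots, R;\ j = 1, \ldots, R), \qquad \psi_{ij} = \psi_{ji}. \tag{1}$$
From (1), the QS model can be expressed using the odds ratio as
$$\theta_{(ij;st)} = \theta_{(st;ij)},$$
where $1 \le i < j \le R$, $1 \le s < t \le R$, and $\theta_{(ij;st)} = \pi_{is}\pi_{jt} / (\pi_{js}\pi_{it})$.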
This indicates the symmetry of the odds ratio with respect to the main diagonal of the square contingency table (Caussinus [8]). For the analysis of square contingency tables, symmetry models have been proposed by using various ideas, for example, McCullagh [47], Agresti [48], and Tomizawa [49].
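The odds-ratio characterization of QS can be checked numerically. The sketch below builds a table in the multiplicative form π_ij ∝ α_i β_j ψ_ij with symmetric ψ (Caussinus's formulation) and verifies that the odds ratio for rows (i, j) and columns (s, t) equals the odds ratio for rows (s, t) and columns (i, j). All function names and parameter values are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import isclose


def odds_ratio(p, i, j, s, t):
    """Odds ratio for rows i, j and columns s, t of probability table p."""
    return (p[i][s] * p[j][t]) / (p[j][s] * p[i][t])


def has_symmetric_odds_ratios(p, rel_tol=1e-9):
    """True if theta(i,j;s,t) equals theta(s,t;i,j) for all i<j and s<t."""
    pairs = list(combinations(range(len(p)), 2))
    return all(
        isclose(odds_ratio(p, i, j, s, t), odds_ratio(p, s, t, i, j), rel_tol=rel_tol)
        for (i, j) in pairs
        for (s, t) in pairs
    )


def quasi_symmetric_table(alpha, beta, psi):
    """Build normalized probabilities proportional to alpha_i * beta_j * psi_ij."""
    r = len(alpha)
    raw = [[alpha[i] * beta[j] * psi[i][j] for j in range(r)] for i in range(r)]
    total = sum(map(sum, raw))
    return [[x / total for x in row] for row in raw]


# Hypothetical parameter values; psi is symmetric, as the QS model requires.
alpha = [1.0, 2.0, 0.5]
beta = [0.7, 1.0, 1.8]
psi = [[3.0, 1.2, 0.4], [1.2, 2.5, 0.9], [0.4, 0.9, 4.0]]
qs = quasi_symmetric_table(alpha, beta, psi)

# Breaking the symmetry of psi in one cell destroys the QS structure.
psi_broken = [row[:] for row in psi]
psi_broken[0][1] = 5.0
not_qs = quasi_symmetric_table(alpha, beta, psi_broken)

print(has_symmetric_odds_ratios(qs), has_symmetric_odds_ratios(not_qs))
```

The row and column effects cancel in every odds ratio, so only the symmetric factor ψ remains, which is why the first table passes the check and the second does not.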
Ireland et al. [50], (Bishop et al. [1], pp. 345–346) and Gilula et al. [51] presented the method for model generation. In particular, Kateri and Papaioannou [52] applied the method for the modeling of symmetry for square contingency tables and proposed a family of models, which includes the QS model as a special case. This family is derived by minimizing the f-divergence under certain conditions. Additionally, Tahata [40] extended the results to the modeling of the asymmetry structure of cell probabilities. This section reviews asymmetry models based on the f-divergence.
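Let $p = (p_{ij})$ and $q = (q_{ij})$ be two discrete finite bivariate probability distributions. Csiszár and Shields ([53], Section 4) introduced the f-divergence between $p$ and $q$. Let f be a convex function on $(0, +\infty)$ with $f(1) = 0$. The f-divergence of a distribution $p$ from $q$ is defined as
$$I^{C}(p : q) = \sum_{i}\sum_{j} q_{ij}\, f\!\left(\frac{p_{ij}}{q_{ij}}\right).$$
Here, we take $f(0) = \lim_{t \to 0} f(t)$, $0 \cdot f(0/0) = 0$, and $0 \cdot f(a/0) = a \lim_{t \to \infty} f(t)/t$.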
Let f be a twice differentiable and strictly convex function, and $F(x) = f'(x)$. Kateri and Papaioannou [52] proposed the generalized QS (denoted by QS[f]) model. It is defined as
$$\pi_{ij} = \pi^{S}_{ij}\, F^{-1}(\alpha_i + \gamma_{ij}) \quad (i = 1, \ldots, R;\ j = 1, \ldots, R),$$
where $\pi^{S}_{ij} = (\pi_{ij} + \pi_{ji})/2$ and $\gamma_{ij} = \gamma_{ji}$. The QS[f] model is the closest to the S model in terms of the f-divergence under the condition where the marginals $\pi_{i\cdot}$ (or $\pi_{\cdot i}$) for $i = 1, \ldots, R$ and the sums $\pi_{ij} + \pi_{ji}$ for $i, j = 1, \ldots, R$ are given. Kateri and Papaioannou [52] noted that if $f(x) = x \log x$, $x > 0$, then the f-divergence is reduced to the Kullback–Leibler divergence (Kullback and Leibler [54]), and the QS[f] model is equivalent to the QS model. Namely, the QS model is the closest to the S model in terms of the Kullback–Leibler divergence under this condition. Additionally, see Kateri [4], Kateri and Agresti [55], Kateri [56], and Tahata [40].
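To make the role of the function f concrete, the following minimal sketch evaluates the f-divergence of a table of cell probabilities from its symmetrized version, whose cells are (p_ij + p_ji)/2. It assumes strictly positive probabilities, so the boundary conventions for zero cells are not needed. The function names, the Pearson-type generator (x − 1)² with its particular scaling, and the 3 × 3 probabilities are illustrative assumptions, not taken from the cited papers.

```python
from math import log


def f_divergence(p, q, f):
    """Sum over cells of q_ij * f(p_ij / q_ij); assumes all p_ij, q_ij > 0."""
    return sum(
        q_ij * f(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
    )


def symmetrized(p):
    """Return the table with cells (p_ij + p_ji) / 2, which satisfies the S model."""
    r = len(p)
    return [[(p[i][j] + p[j][i]) / 2 for j in range(r)] for i in range(r)]


def kl_generator(x):
    return x * log(x)  # yields the Kullback-Leibler divergence


def pearson_generator(x):
    return (x - 1) ** 2  # a Pearson-type choice (scaling is an assumption)


# Hypothetical 3 x 3 cell probabilities (they sum to 1).
p = [
    [0.20, 0.10, 0.05],
    [0.05, 0.20, 0.10],
    [0.02, 0.08, 0.20],
]
p_sym = symmetrized(p)

kl = f_divergence(p, p_sym, kl_generator)
pearson = f_divergence(p, p_sym, pearson_generator)
print(f"KL = {kl:.6f}, Pearson = {pearson:.6f}")
```

Swapping the generator is the only change needed to move between divergences, which mirrors how a different f yields a different member of the model family.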
Next, consider an square contingency table with ordered categories. Let be a set of known scores (with ). Replacing by in the QS[f] model is the ordinal QS[f] (OQS[f]) model (Kateri and Agresti [55]). The OQS[f] model is the closest to the S model in terms of the f-divergence under the condition where (or ) and the sums are given. If , , then (1) the OQS[f] model with is equivalent to the linear diagonals parameter S (LDPS) model (Agresti [48]) and (2) the OQS[f] model is equivalent to the OQS model (Agresti [9], pp. 236–238).
Tahata [40] proposed a model to fill the gap between the QS[f] and OQS[f] models. It should be noted that is a set of known scores . For a given k () the asymmetry model based on the f-divergence (ASk[f]) is defined as
where and . From the relation , the parameters of the ASk[f] model must satisfy
The ASk[f] model is the closest to the S model in terms of the f-divergence under the condition where (or ) for and the sums for are given. Additionally, the ASk[f] model can be expressed as
where . For example, can be set without a loss of generality. There is a one-to-one transformation between a set of in Equation (2) and that of in Equation (3) when . When , the ASk[f] model is reduced to the S model. It should be noted that (1) the AS1[f] model is reduced to the OQS[f] model, (2) the ASk[f] model () is an extension of the OQS[f] model, and (3) the ASR−1[f] model is reduced to the QS[f] model.
If for , then the ASk[f] model becomes
where . Then under Equation (4)
where . Equation (5) with is the kth linear asymmetry (LSk) model proposed by Tahata and Tomizawa [57]. Therefore, the LSk model is the closest to the S model in terms of the Kullback–Leibler divergence under the condition where (or ) for and the sums are given.
Let the conditional probability that an observation will fall in cell when the observation falls in cell or be denoted by . That is, . When , the ASk[f] model becomes
where . Then under Equation (6)
where with . This model is the closest to the S model when the divergence is measured by the Pearson distance.
Equation (5) can be expressed as
Namely, the ASk[f] model with indicates that the log odds is expressed as a polynomial of . On the other hand, the ASk[f] model with (namely, Equation (6)) indicates that the difference between symmetric conditional probabilities is expressed as a polynomial of from Equation (7). Let and the composition of functions F and g be denoted by H. That is, . Then the ASk[f] model can be expressed as
where . This formula indicates that a H-transformation of conditional probability () has a linear combination of parameters. For ,
Therefore, the ASk[f] model is characterized by a structure on the conditional probabilities of symmetric cells. This structure depends on the function f, which is chosen by the user. Namely, modeling based on the f-divergence enables us to construct various asymmetry models and to apply them to a given dataset by changing the function f.
Fujisawa and Tahata [58] proposed the extended ASk[f] model for square contingency tables with ordinal categories. Additionally, Yoshimoto et al. [59] discussed the quasi point-symmetry models based on the f-divergence. For two-way contingency tables, the ϕ-divergence (f-divergence) association models are a family of models which includes the association and correlation models as special cases. Kateri [4] presented this family of models and demonstrated the role of ϕ-divergence in building this family. Additionally, Kateri [56] developed the new families of ϕ-divergence generalized QS models.
3. Necessary and Sufficient Condition of Symmetry
The S model rarely fits the given dataset due to its strong restrictions. More relaxed models, such as the QS and MH models, are often applied to a dataset. Here, we are interested in an extension of the S model, which indicates the asymmetry structure of cell probabilities. Caussinus [8] noted that the S model holds if and only if both the QS and MH models hold. Caussinus’ result is useful to deduce the reason for a poor fit of the S model, and Bishop et al. [1] (Section 8.2.3) derived its proof. Let denote model M. Then
In this section, the necessary and sufficient conditions for the S model are reviewed.
The conditional S (CS) model (Read [44] and McCullagh [47]) is defined as for . The CS model with is reduced to the S model. See also (Bishop et al. [1], pp. 285–286).
Consider the global symmetry (GS) model, which is defined as
$$\sum_{i<j} \pi_{ij} = \sum_{i<j} \pi_{ji}.$$
Namely, this model indicates that $P(X < Y)$ is equal to $P(X > Y)$. If both the CS and GS models hold, then the S model holds. Since the converse holds, Read [44] highlighted that
The LDPS model fits well when there is an underlying bivariate normal distribution (Agresti [48]). Additionally, Tomizawa [49] considered an extended LDPS (ELDPS) model and reported the relationship with bivariate normal distribution. Tahata and Tomizawa [57] considered the LSk model, which is defined by Equation (5) with . This model is a generalization of the LDPS and ELDPS models. That is, the LDPS and ELDPS models are the LS1 and LS2 models, respectively. It should be noted that
For , the difference in the degrees of freedom (df) between and is one.
On the other hand, for a given positive integer k, the kth moment equality (MEk) model is defined as
where and . For example, the ME1 model indicates that the mean of X is equal to that of Y, and the ME2 model indicates the equality of the mean and variance for X and Y. Tahata and Tomizawa [60] proved that the MH model is equivalent to the MER−1 model. Therefore,
For , the difference in df between and is one. From the Caussinus’ result, Equations (10) and (11) provide various relations. For example
and
(Agresti [2], p. 261) and Kateri and Agresti [55] reported similar comments. However, it should be noted that and have more redundant conditions than the S model.
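The moment equality conditions discussed in this section can be illustrated numerically. The sketch below assumes integer scores 1, ..., R and raw moments E(X^h) and E(Y^h) for h = 1, ..., k, which is one reading of the MEk model. It builds a hypothetical table whose marginal means agree but whose second moments differ, so ME1 holds while ME2 (and therefore MH) fails. All names and values are illustrative.

```python
def marginal_moments(p, k):
    """Return ([E(X^h)], [E(Y^h)]) for h = 1..k with assumed scores 1..R."""
    r = len(p)
    row = [sum(p[i]) for i in range(r)]
    col = [sum(p[i][j] for i in range(r)) for j in range(r)]
    x_moments = [sum((i + 1) ** h * row[i] for i in range(r)) for h in range(1, k + 1)]
    y_moments = [sum((j + 1) ** h * col[j] for j in range(r)) for h in range(1, k + 1)]
    return x_moments, y_moments


def holds_me(p, k, tol=1e-12):
    """True if the first k raw moments of X and Y agree within tol."""
    x_moments, y_moments = marginal_moments(p, k)
    return all(abs(a - b) < tol for a, b in zip(x_moments, y_moments))


# Hypothetical margins with equal means (both 2) but different second moments.
row_margin = [0.25, 0.50, 0.25]
col_margin = [0.30, 0.40, 0.30]
p = [[r_i * c_j for c_j in col_margin] for r_i in row_margin]

print(holds_me(p, 1), holds_me(p, 2))
```

With R categories, agreement of the first R − 1 moments forces the two marginal distributions to coincide, because the differences of the marginal probabilities then solve a nonsingular Vandermonde system; this is the intuition behind the equivalence of MH and MER−1.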
Let denote the cell probabilities satisfying both the LSk and MEk models. The LSk model is expressed as
where . Additionally, the MEk model is expressed as
where and . By considering with (namely, the Kullback–Leibler divergence) to measure the difference between and ,
That is, holds for from the property of the Kullback–Leibler divergence. Additionally, if the S model holds, then both LSk and MEk models hold clearly. Therefore, for a given positive integer k, these lead to
Tahata [40] extended the result using the ASk[f] model . Consider the generalized MEk (denoted by GMEk) model defined as
with a set of known scores . Tahata [40] proved the following result:
Tahata et al. [61] proposed the extended LSk (ELSk) model, which is defined as
The ELSk model with γ = 1 is reduced to the LSk model. It should be noted that the ELSR−1 model is equal to the extended QS (EQS) model (Tomizawa [62]), which indicates the asymmetry structure of the odds ratio. The QS model indicates the symmetry structure of the odds ratios. The OQS (LDPS) model indicates both the QS model and the asymmetry structure of cell probabilities and is a special case of the QS model. The EQS model is an extension of the QS model. Tahata et al. [61] noted that
Fujisawa and Tahata [58] proposed the model, which is a generalization of the ELSk model using the f-divergence and gave the generalization of Equation (14).
4. Partition of Test Statistics for Symmetry
Consider the situation where the analyst has found hypothesis unacceptable and has turned their attention to an examination of components and such that . Let denote the statistic for testing the goodness of fit of model M. is asymptotically distributed as a chi-square distribution with the corresponding df. Aitchison [42] noted that the possibility of partitioning the test statistic for into components for testing and must be investigated. Namely, . When hypotheses and are separable, partitioning is possible. Separable implies that the restricted tests of against and that of against use the same critical regions as the unrestricted test of and that of , respectively. It should be noted that df for (denoted by ) is equal to the sum of df for and . That is, . The acceptance of and separately means that and where is the significance level and c denotes the corresponding critical value. Then . Under the separable hypotheses, hypothesis is rejected when . The acceptance of and usually results in the acceptance of because rarely exceeds . Darroch and Silvey [43] considered the separability with respect to the likelihood ratio method, which they called “independence”. Additionally, Lang and Agresti [45] and Lang [46] discussed the partitioning of goodness of fit statistics. This section describes a property of the statistic for testing the goodness of fit of the S model.
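Let $n_{ij}$ denote the observed frequency in cell $(i,j)$ in an $R \times R$ square contingency table with $n = \sum_{i}\sum_{j} n_{ij}$. Additionally, let $m_{ij}$ and $\hat{m}_{ij}$ denote the expected frequency in cell $(i,j)$ and the corresponding maximum likelihood estimate (MLE) under a model, respectively. Assume that a multinomial distribution applies to the table. For the S model, MLEs of $m_{ij}$ are given as
$$\hat{m}_{ij} = \frac{n_{ij} + n_{ji}}{2} \quad (i = 1, \ldots, R;\ j = 1, \ldots, R).$$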
See, for example, Bowker [6] and Bishop et al. [1] (p. 283). Additionally, MLEs of for the CS model are given as
where and . MLEs of for the GS model are given as
It should be noted that the S, CS, and GS models have closed-form estimators. Let denote the likelihood ratio chi-square statistic for testing the goodness of fit of model M. Read [44] noted that
From Equations (9) and (15), (or ). Therefore, the CS and GS models are separable (i.e., exhibit independence).
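Because the S, CS, and GS models have closed-form estimators, the additive relation between their likelihood ratio statistics can be checked directly. The following minimal sketch uses the standard closed-form fitted values under multinomial sampling and confirms that the statistic for S equals the sum of those for CS and GS. The function names and counts are hypothetical, and the sketch assumes that the totals above and below the main diagonal are both positive.

```python
from math import log


def g2(counts, fitted):
    """Likelihood ratio statistic 2 * sum n * log(n / m); 0 * log(0) is taken as 0."""
    return 2.0 * sum(
        n * log(n / m)
        for n_row, m_row in zip(counts, fitted)
        for n, m in zip(n_row, m_row)
        if n > 0
    )


def off_diagonal_totals(n):
    r = len(n)
    upper = sum(n[i][j] for i in range(r) for j in range(i + 1, r))
    lower = sum(n[i][j] for i in range(r) for j in range(i))
    return upper, lower


def fit_s(n):
    r = len(n)
    return [[(n[i][j] + n[j][i]) / 2 for j in range(r)] for i in range(r)]


def fit_cs(n):
    """Conditional symmetry: pair totals are split in the ratio upper : lower."""
    r = len(n)
    upper, lower = off_diagonal_totals(n)
    m = [[float(x) for x in row] for row in n]
    for i in range(r):
        for j in range(i + 1, r):
            pair = n[i][j] + n[j][i]
            m[i][j] = pair * upper / (upper + lower)
            m[j][i] = pair * lower / (upper + lower)
    return m


def fit_gs(n):
    """Global symmetry: each triangle is rescaled to half the off-diagonal total."""
    r = len(n)
    upper, lower = off_diagonal_totals(n)
    half = (upper + lower) / 2
    m = [[float(x) for x in row] for row in n]
    for i in range(r):
        for j in range(i + 1, r):
            m[i][j] = n[i][j] * half / upper
            m[j][i] = n[j][i] * half / lower
    return m


# Hypothetical 3 x 3 counts, made up for illustration only.
counts = [[10, 5, 3], [3, 12, 4], [1, 2, 9]]
g2_s = g2(counts, fit_s(counts))
g2_cs = g2(counts, fit_cs(counts))
g2_gs = g2(counts, fit_gs(counts))
print(f"G2(S) = {g2_s:.6f}, G2(CS) + G2(GS) = {g2_cs + g2_gs:.6f}")
```

The degrees of freedom add up in the same way: R(R − 1)/2 for S, (R + 1)(R − 2)/2 for CS, and 1 for GS.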
Caussinus [8] gave Equation (8). The QS and MH models do not have closed-form estimators. Therefore, it is difficult to obtain a relationship similar to Equation (15). Tomizawa and Tahata [28] proved the relationship between test statistics for the QS and MH models. We review the result briefly.
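Since the QS model has no closed-form estimator, its fitted values must be obtained iteratively. The sketch below uses iterative proportional fitting, which is an assumed approach chosen for brevity and not the method of the cited papers. It relies on the likelihood equations of the log-linear QS model, which match the observed row totals, column totals, and symmetric pair totals n_ij + n_ji. The function name, the number of cycles, and the counts are hypothetical, and all counts are assumed to be positive.

```python
from math import log


def fit_qs_ipf(n, cycles=2000):
    """Fit the QS model by iterative proportional fitting (assumed approach).

    Each cycle rescales the fitted table to match, in turn, the observed
    row totals, column totals, and symmetric pair totals n_ij + n_ji.
    """
    r = len(n)
    row_tot = [sum(n[i]) for i in range(r)]
    col_tot = [sum(n[i][j] for i in range(r)) for j in range(r)]
    m = [[1.0] * r for _ in range(r)]  # the all-ones table satisfies QS
    for _ in range(cycles):
        for i in range(r):  # match row totals
            scale = row_tot[i] / sum(m[i])
            m[i] = [x * scale for x in m[i]]
        for j in range(r):  # match column totals
            scale = col_tot[j] / sum(m[i][j] for i in range(r))
            for i in range(r):
                m[i][j] *= scale
        for i in range(r):  # match diagonal cells and pair totals
            m[i][i] = float(n[i][i])
            for j in range(i + 1, r):
                scale = (n[i][j] + n[j][i]) / (m[i][j] + m[j][i])
                m[i][j] *= scale
                m[j][i] *= scale
    return m


def g2(counts, fitted):
    """Likelihood ratio statistic 2 * sum n * log(n / m) for positive counts."""
    return 2.0 * sum(
        c * log(c / f)
        for c_row, f_row in zip(counts, fitted)
        for c, f in zip(c_row, f_row)
        if c > 0
    )


# Hypothetical 3 x 3 counts, made up for illustration only.
counts = [[10, 5, 3], [3, 12, 4], [1, 2, 9]]
fitted = fit_qs_ipf(counts)
print(f"G2(QS) = {g2(counts, fitted):.6f}")
```

Every rescaling step multiplies the table by a row factor, a column factor, or a factor common to a symmetric pair, so the fitted table keeps the QS structure throughout. The resulting statistic has (R − 1)(R − 2)/2 degrees of freedom.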
The QS model is expressed in the log-linear form as
where . Without the loss of generality, for example, set . Let Then the QS model is expressed as
where , , , and
It should be noted that is the matrix with , where is the vector of the 1 element,
and is the matrix of 1 or 0 elements determined from (16). Additionally, is the identity matrix, is the zero matrix, is the zero vector, and ⊗ denotes the Kronecker product. The matrix is a full column rank, which is K. The linear space spanned by the columns of the matrix is denoted by . The dimension of is K and is a subspace of . Let denote the orthogonal complement of . Let be an , where , full column rank matrix such that the columns of span . Since , the QS model can be expressed as , where .
The MH model is defined as
Let , where . The MH model can be expressed as , where . The columns of belong to subspace . It should be noted that . Since Equation (8) holds, the S model can be expressed as
where .
Let denote the sample proportions, where . From the multivariate central limit theorem, has an asymptotic normal distribution with mean and covariance matrix , where is the diagonal matrix with the elements of on the main diagonal. In an analogous manner to Bhapkar [63], the Wald statistic can be derived for the S model. Let for . Using the delta method, has an asymptotic normal distribution with mean and covariance matrix of
because . Then , where
Let denote the Wald statistic for testing goodness of fit of model M. Since , and correspond to , and , respectively, we obtain
Additionally, from the asymptotic equivalence of the Wald statistic and the likelihood ratio statistic (Rao [64], Section 6e. 3), we obtain
Therefore, the QS and MH models are separable and exhibit asymptotic independence. Caussinus’ result (8) shows a good property, such as Equations (17) and (18).
Tahata and Tomizawa [57] proved the asymptotic separability of the LSk and MEk models for Equation (12). Tahata [40] extended the result for Equation (13). Additionally, Tahata et al. [61] provided the property of the test statistics related to Equation (14). Separable hypotheses for the S model are discussed elsewhere, for example, by Saigusa et al. [65] and Fujisawa and Tahata [58].
5. Measure of the Departure from Symmetry
As described in Section 1, the analysis of contingency tables is interested in whether row and column variables are independent. If the independence hypothesis is rejected, then the association between the variables is of interest. Various coefficients have been proposed to measure the association such as gamma, Yule’s Q, Kendall’s tau-b, Kendall’s tau, and Somers’ d. See (Bishop et al. [1], Ch. 11) and (Agresti [2], pp. 184–192). On the other hand, Tomizawa [66] proposed two kinds of measures to represent the degree of departure from the S model. Tomizawa et al. [67] gave a generalization of these measures. Their generalization is expressed as the average of the power divergence (Cressie and Read [68]). Additionally, Tomizawa et al. [69], Tahata et al. [70], and Iki et al. [71] have proposed measures to represent the degree of departure from symmetry and the marginal homogeneity. This section reviews a measure to represent the degree of departure from the QS model.
Consider an square contingency table with nominal categories. Caussinus [8] proposed the QS model defined as Equation (1). Using the conditional probability , the QS model can be expressed as
where and .
Let
Additionally, let
for . Assuming that for , Tahata et al. [41] proposed a measure defined as
When , this measure is defined by the value taken at the limit as . Thus
is the modification of the power divergence between and . Especially, is the modification of the Kullback–Leibler divergence between them. It should be noted that the user chooses the real value .
In addition, measure can be expressed using the diversity index of degree (Patil and Taillie [72]). Let
Namely, measure is a weighted sum of the diversity index, which includes the Shannon entropy (Shannon [75]) when and the Gini concentration (Giorgi and Gigliarano [76]) when .
Measure lies between 0 and 1. For any , the value of the measure is 0 if and only if the contingency table has a QS structure, while its value is 1 if and only if the degree of departure from QS is the largest, in the sense that (then ) or (then ) for any . When the value of the measure is 1, for any , then or holds. Namely, for any , at least one of , , and is equal to 0, or at least one of , , and is equal to 0. In other words, for any , the complete asymmetry arises for at least one pair of symmetric cells. That is, it gives the partial complete asymmetry of cell probabilities. The partial complete asymmetry is a weaker condition than the complete asymmetry of cell probabilities used in Tomizawa [66] and Tomizawa et al. [67]. Additionally, the QS model indicates the symmetry structure of the odds ratio, while indicates the complete asymmetry of the odds ratio because or ∞ for any . It is natural that the degree of departure from the QS model is the largest.
Consider an approximate standard error and large-sample confidence interval for measure . Let denote the sample version of , which is given by when is replaced by . has an asymptotic normal distribution with mean and variance using the delta method (Bishop et al. [1] (Section 14.6)). See Tahata et al. [41] for details of .
Tahata and Kozai [73] proposed a measure to represent the degree of departure from the EQS model proposed by Tomizawa [62]. Additionally, Tahata et al. [74] developed a different measure to represent the degree of departure from the QS model and partitioned their measure into two components. Ando [77] discussed the bivariate index vector to concurrently analyze both the degree and the direction of departure from the QS model.
6. Discussions
When the QS model does not fit a given dataset well, the next step is often to evaluate whether an extended model is suitable. Tomizawa [62] proposed the EQS model for a square contingency table with ordinal categories. On the other hand, Tahata [78] proposed the quasi-asymmetry (QA) model for a square contingency table with nominal categories. The QA model is defined as
where and () for . Using the odds ratio, the QA model can be expressed as
This states that the odds of the symmetric odds ratio are equal to for some and for other . Using and in Equation (19), the QA model can be expressed as
where . When (), the QA model is reduced to the QS model. If the QA model holds, parameter is useful to visualize the degree of departure from QS because . Additionally, the value of approaches 1 as the value of increases. The parameter may be effective to express the degree of departure from QS.
If a square table has a structure of QA, then measure is simply expressed as
where
Namely, measure can be expressed as the weighted sum of the function of . Herein, we note that (i) when , for any . That is, the value of the measure equals 0 (i.e., the QS model holds). (ii) If we take the limit as , then (then ) or (then ) for any . That is, the value of the measure equals 1. Therefore, the maximum degree of departure from QS can be interpreted as the limitation of QA model with .
7. Concluding Remarks
Previously, Tomizawa and Tahata [28] reviewed topics on various symmetry models to analyze square contingency tables, and Tahata and Tomizawa [29] summarized topics related to the symmetry and asymmetry models. This paper mainly reviews four recent developments in the analysis of square contingency tables: (1) modeling based on the f-divergence, (2) the necessary and sufficient condition of symmetry, (3) partition of test statistics for symmetry, and (4) the measure of departure from symmetry. These are based on the results in Tahata [40] and Tahata et al. [41].
This paper focuses on the analysis of square contingency tables. Previous papers have examined contingency tables with other structures. For example, Tahata and Tomizawa [79] discussed the double symmetry, which is characterized as both symmetry and point symmetry. In this review, the problems of estimating cell probabilities and of goodness-of-fit testing are omitted. For such issues, Kateri [3], Lawal [80], and Tan [81] are useful for studying computational aspects using R or SAS. Additionally, Lang [82,83] provided the methodology and corresponding R code.
In this paper, the models related to the QS model are described. In particular, the characterization of the models and their properties are discussed. The models described in this paper are useful for analyzing real datasets, for example, unaided distance vision data, social mobility data, father–son matched educational level data, and longitudinal data in biomedical research. Such applications can be found in the references. For details of the estimation of cell probabilities and goodness-of-fit tests, please see the corresponding papers.
Various topics are generalized for multi-way contingency tables with the same classifications. Bhapkar and Darroch [84] proposed the QS and MH models for a general order and gave the generalization of Equation (8). Tomizawa and Tahata [28] proved the generalization of Equation (18). For multi-way contingency tables, the structures of asymmetries are considered in Tahata and Tomizawa [57], Tahata et al. [85], Tahata et al. [86] and Shinoda et al. [87]. Moreover, Tahata and Tomizawa [88] and Yoshimoto et al. [59] treated the problem of point symmetry for multi-way contingency tables.
In future work, the generalization of the results described in this review will be considered for multi-way contingency tables. Additionally, goodness-of-fit tests for symmetry and asymmetry models should be considered for sparse contingency tables.
Funding
This research was funded by JSPS KAKENHI (Grant Number 20K03756).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Acknowledgments
I would like to thank the five referees for their valuable comments, which improved this review. Additionally, I would like to thank Sadao Tomizawa for his helpful comments and many suggestions. I would also like to thank Echo Zhao for proposing this special issue, on the occasion of Tomizawa’s retirement from the Tokyo University of Science, and for kind support.
Conflicts of Interest
The author declares no conflict of interest.
References
- Bishop, Y.M.; Fienberg, S.E.; Holland, P.W. Discrete Multivariate Analysis: Theory and Practice; The MIT Press: Cambridge, MA, USA, 1975.
- Agresti, A. Analysis of Ordinal Categorical Data; John Wiley: Hoboken, NJ, USA, 2010.
- Kateri, M. Contingency Table Analysis. Methods and Implementation Using R; Birkhäuser: Basel, Switzerland, 2014.
- Kateri, M. ϕ-divergence in contingency table analysis. Entropy 2018, 20, 324.
- Fujisawa, K.; Tahata, K. Quasi Association Models for Square Contingency Tables with Ordinal Categories. Symmetry 2022, 14, 805.
- Bowker, A.H. A test for symmetry in contingency tables. J. Am. Stat. Assoc. 1948, 43, 572–574.
- Stuart, A. A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 1955, 42, 412–416.
- Caussinus, H. Contribution à l’analyse statistique des tableaux de corrélation. Ann. Fac. Sci. Toulouse Math. 1965, 29, 77–183.
- Agresti, A. Links between binary and multi-category logit item response models and quasi-symmetric loglinear models. Ann. Fac. Sci. Toulouse Math. 2002, 11, 443–454.
- Goodman, L.A. Contributions to the statistical analysis of contingency tables: Notes on quasi-symmetry, quasi-independence, log-linear models, log-bilinear models, and correspondence analysis models. Ann. Fac. Sci. Toulouse Math. 2002, 11, 525–540.
- McCullagh, P. Quasi-symmetry and representation theory. Ann. Fac. Sci. Toulouse Math. 2002, 11, 541–561.
- Fienberg, S.E.; van der Heijden, P.G.M. Introduction to special issue on quasi-symmetry and categorical data analysis. Ann. Fac. Sci. Toulouse Math. 2002, 11, 439–441.
- Bergsma, W.P.; Rudas, T. Modeling conditional and marginal association in contingency tables. Ann. Fac. Sci. Toulouse Math. 2002, 11, 455–468.
- Dossou-Gbété, S.; Grorud, A. Biplots for matched two-way tables. Ann. Fac. Sci. Toulouse Math. 2002, 11, 469–483.
- Erosheva, E.A.; Fienberg, S.E.; Junker, B.W. Alternative statistical models and representations for large sparse multi-dimensional contingency tables. Ann. Fac. Sci. Toulouse Math. 2002, 11, 485–505.
- De Falguerolles, A.; van der Heijden, P.G.M. Reduced rank quasi-symmetry and quasi-skew symmetry: A generalized bi-linear model approach. Ann. Fac. Sci. Toulouse Math. 2002, 11, 507–524.
- Stigler, S. The missing early history of contingency tables. Ann. Fac. Sci. Toulouse Math. 2002, 11, 563–573.
- Thélot, C. L’analyse statistique des tables de mobilité à l’aide du modèle quasisymétrique et de ses dérivés. Ann. Fac. Sci. Toulouse Math. 2002, 11, 575–585.
- Caussinus, H. Some concluding observations. Ann. Fac. Sci. Toulouse Math. 2002, 11, 587–591.
- Tomizawa, S. Diagonals-parameter symmetry model for cumulative probabilities in square contingency tables with ordered categories. Biometrics 1993, 49, 883–887.
- Miyamoto, N.; Ohtsuka, W.; Tomizawa, S. Linear diagonals-parameter symmetry and quasi-symmetry models for cumulative probabilities in square contingency tables with ordered categories. Biom. J. 2004, 46, 664–674.
- Kateri, M.; Gottard, A.; Tarantola, C. Generalised quasi-symmetry models for ordinal contingency tables. Aust. N. Z. J. Stat. 2017, 59, 239–253.
- Booth, J.G.; Capanu, M.; Heigenhauser, L. Exact conditional P value calculation for the quasi-symmetry model. J. Comput. Graph. Stat. 2005, 14, 716–725.
- Krampe, A.; Kateri, M.; Kuhnt, S. Asymmetry models for square contingency tables: Exact tests via algebraic statistics. Stat. Comput. 2011, 21, 55–67.
- Rapallo, F. Algebraic Markov bases and MCMC for two-way contingency tables. Scand. J. Stat. 2003, 30, 385–397.
- Pardo, L.; Martin, N. Minimum phi-divergence estimators and phi-divergence test statistics in contingency tables with symmetry structure: An overview. Symmetry 2010, 2, 1108–1120.
- Gottard, A.; Marchetti, G.M.; Agresti, A. Quasi-symmetric graphical log-linear models. Scand. J. Stat. 2011, 38, 447–465.
- Tomizawa, S.; Tahata, K. The analysis of symmetry and asymmetry: Orthogonality of decomposition of symmetry into quasi-symmetry and marginal symmetry for multi-way tables. J. Soc. Fr. Stat. 2007, 148, 3–36.
- Tahata, K.; Tomizawa, S. Symmetry and asymmetry models and decompositions of models for contingency tables. SUT J. Math. 2014, 50, 131–165.
- Bocci, C.; Rapallo, F. Exact tests to compare contingency tables under quasi independence and quasi-symmetry. J. Algebr. Stat. 2019, 10, 13–29.
- Khan, Z.A.; Tewari, R.C. Markov reversibility, quasi-symmetry, and marginal homogeneity in cyclothymiacs geological successions. Int. J. Geoinform. Geol. Sci. 2021, 8, 9–25.
- Altun, G. Quasi local odds symmetry model for square contingency table with ordinal categories. J. Stat. Comput. Simul. 2019, 89, 2899–2913.
- Tahata, K.; Ochiai, T.; Matsushima, U. Asymmetry models and model selection in square contingency tables with ordinal categories. In Proceedings of the 2020 International Symposium on Information Theory and Its Applications (ISITA), Kapolei, HI, USA, 24–27 October 2020; pp. 573–577.
- Karadağ, O.; Altun, G.; Aktaş, S. Assessment of SNP-SNP interactions by using square contingency table analysis. Ann. Braz. Acad. Sci. 2020, 92, e20190465.
- Altunay, S.A.; Yilmaz, A.E. Median Distance Model for Likert-Type Items in Contingency Table Analysis. Accepted-November 2021. Available online: https://revstat.ine.pt/index.php/REVSTAT/article/view/401 (accessed on 1 February 2022).
- Ando, S. Asymmetry models based on ordered score and separations of symmetry model for square contingency tables. Biom. Lett. 2021, 58, 27–39.
- Ando, S. Orthogonal decomposition of the sum-symmetry model for square contingency tables with ordinal categories: Use of the exponential sum-symmetry model. Biom. Lett. 2021, 58, 95–104.
- Ando, S. Odds-symmetry model for cumulative probabilities and decomposition of a conditional symmetry model in square contingency tables. Aust. N. Z. J. Stat. 2021, 63, 674–684.
- Ando, S. Asymmetry models based on non-integer scores for square contingency tables. J. Stat. Theory Appl. 2022, 21, 21–30.
- Tahata, K. Separation of symmetry for square tables with ordinal categorical data. Jpn. J. Stat. Data Sci. 2020, 3, 469–484.
- Tahata, K.; Miyamoto, N.; Tomizawa, S. Measure of departure from quasi-symmetry and Bradley-Terry models for square contingency tables with nominal categories. J. Korean Stat. Soc. 2004, 33, 129–147.
- Aitchison, J. Large-sample restricted parametric tests. J. R. Stat. Soc. Ser. B-Methodol. 1962, 24, 234–250.
- Darroch, J.N.; Silvey, S.D. On testing more than one hypothesis. Ann. Math. Stat. 1963, 34, 555–567.
- Read, C.B. Partitioning chi-square in contingency tables: A teaching approach. Commun. Stat.-Theory Methods 1977, 6, 553–562.
- Lang, J.B.; Agresti, A. Simultaneously modeling joint and marginal distributions of multivariate categorical responses. J. Am. Stat. Assoc. 1994, 89, 625–632.
- Lang, J.B. On the partitioning of goodness-of-fit statistics for multivariate categorical response models. J. Am. Stat. Assoc. 1996, 91, 1017–1023.
- McCullagh, P. A class of parametric models for the analysis of square contingency tables with ordered categories. Biometrika 1978, 65, 413–418.
- Agresti, A. A simple diagonals-parameter symmetry and quasi-symmetry model. Stat. Probab. Lett. 1983, 1, 313–316.
- Tomizawa, S. An extended linear diagonals-parameter symmetry model for square contingency tables with ordered categories. Metron 1991, 49, 401–409.
- Ireland, C.T.; Ku, H.H.; Kullback, S. Symmetry and Marginal Homogeneity of an r × r Contingency Table. J. Am. Stat. Assoc. 1969, 64, 1323–1341.
- Gilula, Z.; Krieger, A.M.; Ritov, Y. Ordinal Association in Contingency Tables: Some Interpretive Aspects. J. Am. Stat. Assoc. 1988, 83, 540–545.
- Kateri, M.; Papaioannou, T. Asymmetry models for contingency tables. J. Am. Stat. Assoc. 1997, 92, 1124–1131.
- Csiszár, I.; Shields, P.C. Information Theory and Statistics: A Tutorial; Now Publishers Inc.: Hanover, MA, USA, 2004.
- Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
- Kateri, M.; Agresti, A. A class of ordinal quasi-symmetry models for square contingency tables. Stat. Probab. Lett. 2007, 77, 598–603.
- Kateri, M. Families of generalized quasisymmetry models: A ϕ-divergence approach. Symmetry 2021, 13, 2297.
- Tahata, K.; Tomizawa, S. Generalized linear asymmetry model and decomposition of symmetry for multiway contingency tables. J. Biom. Biostat. 2011, 2, 1–6.
- Fujisawa, K.; Tahata, K. Asymmetry model based on f-divergence and orthogonal decomposition of symmetry for square contingency tables with ordinal categories. SUT J. Math. 2020, 56, 39–53.
- Yoshimoto, T.; Tahata, K.; Saigusa, Y.; Tomizawa, S. Quasi point-symmetry models based on f-divergence and decomposition of point-symmetry for multi-way contingency tables. SUT J. Math. 2019, 55, 103–131.
- Tahata, K.; Tomizawa, S. Generalized marginal homogeneity model and its relation to marginal equimoments for square contingency tables with ordered categories. Adv. Data Anal. Classif. 2008, 2, 295–311.
- Tahata, K.; Naganawa, M.; Tomizawa, S. Extended linear asymmetry model and separation of symmetry for square contingency tables. J. Jpn. Stat. Soc. 2016, 46, 189–202.
- Tomizawa, S. Three kinds of decompositions for the conditional symmetry model in a square contingency table. J. Jpn. Stat. Soc. 1984, 14, 35–42.
- Bhapkar, V.P. A note on the equivalence of two test criteria for hypotheses in categorical data. J. Am. Stat. Assoc. 1966, 61, 228–235.
- Rao, C.R. Linear Statistical Inference and Its Applications, 2nd ed.; Wiley: New York, NY, USA, 1973.
- Saigusa, Y.; Tahata, K.; Tomizawa, S. Orthogonal decomposition of symmetry model using the ordinal quasi-symmetry model based on f-divergence for square contingency tables. Stat. Probab. Lett. 2015, 101, 33–37.
- Tomizawa, S. Two kinds of measures of departure from symmetry in square contingency tables having nominal categories. Stat. Sin. 1994, 4, 325–334.
- Tomizawa, S.; Seo, T.; Yamamoto, H. Power-divergence-type measure of departure from symmetry for square contingency tables that have nominal categories. J. Appl. Stat. 1998, 25, 387–398.
- Cressie, N.; Read, T.R.C. Multinomial goodness-of-fit tests. J. R. Stat. Soc. Ser. B-Methodol. 1984, 46, 440–464.
- Tomizawa, S.; Miyamoto, N.; Hatanaka, Y. Measure of asymmetry for square contingency tables having ordered categories. Aust. N. Z. J. Stat. 2001, 43, 335–349.
- Tahata, K.; Iwashita, T.; Tomizawa, S. Measure of departure from conditional marginal homogeneity for square contingency tables with ordered categories. Statistics 2008, 42, 453–466.
- Iki, K.; Tahata, K.; Tomizawa, S. Measure of departure from marginal homogeneity using marginal odds for multi-way tables with ordered categories. J. Appl. Stat. 2012, 39, 279–295.
- Patil, G.P.; Taillie, C. Diversity as a concept and its measurement. J. Am. Stat. Assoc. 1982, 77, 548–561.
- Tahata, K.; Kozai, K. Measuring degree of departure from extended quasi-symmetry for square contingency tables. Rev. Colomb. Estad. 2012, 35, 55–65.
- Tahata, K.; Kozai, K.; Tomizawa, S. Partitioning measure of quasi-symmetry for square contingency tables. Braz. J. Probab. Stat. 2014, 28, 353–366.
- Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
- Giorgi, G.M.; Gigliarano, C. The Gini concentration index: A review of the inference literature. J. Econ. Surv. 2017, 31, 1130–1148.
- Ando, S. A bivariate index vector to measure departure from quasi-symmetry for ordinal square contingency tables. Austrian J. Stat. 2021, 50, 115–126.
- Tahata, K. Quasi-asymmetry model for square tables with nominal categories. J. Appl. Stat. 2012, 39, 723–729.
- Tahata, K.; Tomizawa, S. Double linear diagonals-parameter symmetry and decomposition of double symmetry for square tables. Stat. Methods Appl. 2010, 19, 307–318.
- Lawal, H.B. Using a GLM to decompose the symmetry model in square contingency tables with ordered categories. J. Appl. Stat. 2004, 31, 279–303.
- Tan, T. Doubly Classified Model with R; Springer: Singapore, 2017.
- Lang, J.B. Multinomial-Poisson homogeneous models for contingency tables. Ann. Stat. 2004, 32, 340–383.
- Lang, J.B. Homogeneous Linear Predictor Models for Contingency Tables. J. Am. Stat. Assoc. 2005, 100, 121–134.
- Bhapkar, V.P.; Darroch, J.N. Marginal symmetry and quasi symmetry of general order. J. Multivar. Anal. 1990, 34, 173–184.
- Tahata, K.; Yamamoto, H.; Tomizawa, S. Orthogonality of decompositions of symmetry into extended symmetry and marginal equimoment for multi-way tables with ordered categories. Austrian J. Stat. 2008, 37, 185–194.
- Tahata, K.; Yamamoto, H.; Tomizawa, S. Linear ordinal quasi-symmetry model and decomposition of symmetry for multi-way tables. Math. Methods Stat. 2011, 20, 158–164.
- Shinoda, S.; Tahata, K.; Yamamoto, K.; Tomizawa, S. Marginal continuation odds ratio model and decomposition of marginal homogeneity model for multi-way contingency tables. Sankhya B 2021, 83, 304–324.
- Tahata, K.; Tomizawa, S. Orthogonal decomposition of point-symmetry for multiway tables. AStA Adv. Stat. Anal. 2008, 92, 255–269.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).