1. Introduction
A special type of square contingency table, which occurs often in studies of correlated or repeated categorical measurements (e.g., in panel or social mobility studies), is a table whose row and column classification variables are measured on the same scale, which can be nominal or ordinal. Caussinus, in his pioneering work [1], introduced quasisymmetry (QS) models for such square contingency tables, focusing on their mathematical properties and their connection to the models of complete symmetry (S) and marginal homogeneity (MH). The interpretation aspects were developed later by [2,3,4], among others. The QS model applies to contingency tables that have a common nominal classification scale. An ordinal QS (OQS) model was introduced by [5].
In this work, we focus on the QS model, examining its features through the alternative equivalent definitions of QS in terms of cell probabilities, in terms of local odds ratios (LOR), and as a model measuring departure from the S model. Furthermore, starting from the fact that, under QS, the table of LOR is symmetric, we discuss more special, parsimonious QS models that are based on the family of association models (AMs) with homogeneous row and column scores ([3,6]). In an information-theoretic setup, the QS and OQS models were generalized through the $\phi$-divergence to corresponding families of $\phi$-divergence-based QS and OQS models ([7,8]). Associated orthogonal decomposition properties of the S model were proved for QS-type and OQS-type models by [7,9], respectively. Further variants of $\phi$-divergence QS-type models and S decomposition properties were discussed in [10,11,12]. A detailed literature review on QS models and their links to other models of symmetry and asymmetry can be found in [13]. Here, we introduce a $\phi$-divergence family of special QS models that is rooted in the $\phi$-divergence family of AMs ([14]).
Moving from LOR to generalized odds ratios, e.g., global odds ratios (GOR), QS models were also considered for modeling the symmetry of generalized odds ratios ([15]). AMs for generalized odds ratios ([16]) were extended to a broader family through the $\phi$-divergence ([17]). Combining these two families of models discussed in [15,17], we introduce here new flexible classes of QS models for generalized odds ratios that are based on the $\phi$-divergence.
Overall, we revisit the QS model:
- Aiming at an in-depth discussion of its nature and properties, as consolidated by the alternative possible definitions of QS, and at the consideration of special QS-type models, with an emphasis on QS-type models that are based on homogeneous AMs.
- Reviewing extensions of QS-type models in two directions: (a) in an information-theoretic setup, by replacing the Kullback-Leibler (KL) divergence with the $\phi$-divergence, and (b) by considering them for generalized odds ratios other than LOR.
- Proposing a new $\phi$-divergence family of QS-type models by extending the relation of QS to homogeneous AMs to the $\phi$-divergence AMs.
- Introducing a flexible family of models by extending the models of (b) above in terms of the $\phi$-divergence.
The paper is structured as follows. Section 2 reviews QS and OQS models, focusing on their structural properties, and further discusses AMs with homogeneous scores that are of the QS-type. Section 3 presents QS-type models for generalized odds ratios. Generalized families of QS and OQS models for LOR based on the $\phi$-divergence are briefly reviewed in Section 4. In Section 5, the QS-type models of Section 2.2 and the QS models for generalized odds ratios of Section 3 are expanded to corresponding $\phi$-divergence-based classes of models by modeling $\phi$-scaled generalized odds ratios. A selection of the discussed models is illustrated on two examples in Section 6. Section 7 discusses further possible models that can be investigated. Section 8 summarizes our results.
  2. Quasisymmetry Models for Square Contingency Tables
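Consider an $I \times I$ contingency table cross-classifying two categorical variables X and Y, measured on the same scale, and corresponding to the rows and columns of the table. Let $\pi = (\pi_{ij})$ be the associated probability table with cell entry probabilities $\pi_{ij}$, for $i, j = 1, \ldots, I$, where $\pi_{ij} > 0$ and $\sum_{i,j} \pi_{ij} = 1$. Then, the QS model, initially introduced by Caussinus [1], is expressed in log-linear form as
$$\log \pi_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij}^{XY}, \quad i, j = 1, \ldots, I, \qquad (1)$$
with symmetric interactions, i.e., with
$$\lambda_{ij}^{XY} = \lambda_{ji}^{XY}, \quad i \neq j. \qquad (2)$$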
Parameters in (
1) satisfy some identifiability constraints. We set
      
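The degrees of freedom of (1) equal $(I-1)(I-2)/2$. The differences between the row and column main effects measure the departure from the model of complete symmetry S, under which $\pi_{ij} = \pi_{ji}$, for all $i \neq j$. Indeed, QS reduces to S if the row and column main effects coincide, $\lambda_j^X = \lambda_j^Y$, for all j. Model (1) fits the diagonal entries exactly.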
In terms of cell probabilities, (
1) is equivalently expressed as
      
      with parameters 
 providing insight into sources of marginal inhomogeneity, as discussed in [
7,
18].
Alternatively, the QS model can be defined in terms of the LOR
$$\theta_{ij}^{L} = \frac{\pi_{ij}\, \pi_{i+1,j+1}}{\pi_{i,j+1}\, \pi_{i+1,j}}, \quad i, j = 1, \ldots, I-1, \qquad (5)$$
i.e., the odds ratios of all $2 \times 2$ subtables formed by pairs of successive rows and successive columns. These form an $(I-1) \times (I-1)$ table, and QS is the model that has symmetric LOR
$$\theta_{ij}^{L} = \theta_{ji}^{L}, \quad i \neq j, \qquad (6)$$
and fits the diagonal entries of the probability table exactly, as indicated in [19]. This definition through the LOR highlights a basic structural property of QS, facilitating its physical interpretation and enabling generalizations in new directions by considering alternative types of odds ratios, as we show in Section 3.
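The link between symmetric interactions and symmetric LOR can be checked numerically. The following is only a minimal sketch under stated assumptions: the helper names `qs_table` and `local_odds_ratios`, the table size, and all parameter values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import numpy as np

def local_odds_ratios(p):
    """Return the (I-1) x (I-1) table of local odds ratios of a strictly positive I x I table."""
    p = np.asarray(p, dtype=float)
    return (p[:-1, :-1] * p[1:, 1:]) / (p[:-1, 1:] * p[1:, :-1])

def qs_table(row_effects, col_effects, interactions):
    """Probability table built from log-linear main effects and a symmetric interaction matrix."""
    lam_x = np.asarray(row_effects, dtype=float)
    lam_y = np.asarray(col_effects, dtype=float)
    lam_xy = np.asarray(interactions, dtype=float)
    if not np.allclose(lam_xy, lam_xy.T):
        raise ValueError("interaction matrix must be symmetric")
    unnormalised = np.exp(lam_x[:, None] + lam_y[None, :] + lam_xy)
    return unnormalised / unnormalised.sum()  # normalisation plays the role of the intercept

rng = np.random.default_rng(0)
size = 4                                   # hypothetical number of categories
raw = rng.normal(size=(size, size))
lam_xy = (raw + raw.T) / 2.0               # arbitrary symmetric interactions
pi = qs_table(rng.normal(size=size), rng.normal(size=size), lam_xy)

theta = local_odds_ratios(pi)
print(np.allclose(theta, theta.T))         # checks the symmetry of the LOR table
```

The row and column main effects are drawn independently, so the table itself is not symmetric; only the table of LOR is.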
  2.1. QS Model for Ordinal Classification Variables
Usually, in applications of the QS model, the classification scale is ordinal. However, QS is also applicable to tables with nominal classification variables, since it fulfils the permutation invariance property ([20]). In the case of an ordinal scale, alternative QS-type models are possible that are more parsimonious and provide insightful interpretation. Agresti [5] introduced the ordinal QS (OQS) model
        
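with interaction parameters satisfying (2) and under (3). In other words, OQS is a special, more parsimonious QS model, derived from (1) when the difference between the column and row main effects is linear in the category index. It has just one parameter more than the S model; hence, $df = I(I-1)/2 - 1 = (I+1)(I-2)/2$. Equivalently, (7) can be expressed as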
        
        with 
, or as a departure from symmetry model
        
Under the OQS model, scores equal to the category indices are assigned to the classification categories (score i for category i, i = 1, …, I). It can be easily verified that these scores in (7) and (8) can equivalently be replaced by any equally spaced scores, i.e., by any linear transformation of the indices. More generally, one could consider an OQS model for known but unequally spaced scores, that is, for any set of known ordered scores. This model, however, will no longer be equivalent to (7). Analogously to the QS model, OQS reduces to the S model when its single additional parameter vanishes.
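The invariance of the OQS model under equally spaced scores can be illustrated numerically. The sketch below assumes the common parametrisation in which the log cell probability consists of symmetric main effects, a linear term in the column score, and symmetric interactions; the function name `oqs_table`, the symbol `beta`, and all numerical values are hypothetical.

```python
import numpy as np

def oqs_table(main_effects, beta, interactions, scores=None):
    """Normalised table with log p_ij = lam_i + lam_j + beta * u_j + lam_ij (assumed OQS form)."""
    lam = np.asarray(main_effects, dtype=float)
    lam_xy = np.asarray(interactions, dtype=float)
    if not np.allclose(lam_xy, lam_xy.T):
        raise ValueError("interaction matrix must be symmetric")
    size = lam.size
    # Default scores are the category indices 1, ..., I.
    u = np.arange(1, size + 1, dtype=float) if scores is None else np.asarray(scores, dtype=float)
    log_p = lam[:, None] + lam[None, :] + beta * u[None, :] + lam_xy
    p = np.exp(log_p)
    return p / p.sum()

lam = np.array([0.3, -0.2, 0.1, 0.0])
lam_xy = np.array([[1.0, 0.4, 0.1, 0.0],
                   [0.4, 0.8, 0.3, 0.1],
                   [0.1, 0.3, 0.9, 0.2],
                   [0.0, 0.1, 0.2, 0.7]])
beta = 0.2

p_index = oqs_table(lam, beta, lam_xy)                               # scores 1, 2, 3, 4
p_rescaled = oqs_table(lam, beta / 2.0, lam_xy, scores=[10, 12, 14, 16])
same_model = np.allclose(p_index, p_rescaled)                        # equally spaced scores give the same table

idx = np.arange(1, 5)
log_ratio = np.log(p_index / p_index.T)                              # log(p_ij / p_ji)
expected = beta * (idx[None, :] - idx[:, None])                      # beta * (j - i)
p_sym = oqs_table(lam, 0.0, lam_xy)                                  # beta = 0 gives complete symmetry
```

Rescaling the scores by a factor of 2 and shifting them is compensated by halving `beta`, while the shift is absorbed by the normalisation. With unequally spaced scores no such compensation exists.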
  2.2. Association Models with Homogeneous Scores
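Log-linear models with interactions for two-way contingency tables are saturated. In the case of a square table, the saturated log-linear model is given by
$$\log \pi_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij}^{XY}, \quad i, j = 1, \ldots, I. \qquad (10)$$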
Association models (AMs) impose special structures on the interactions, thus leading to nonsaturated dependence models of sound interpretation. They are also known as Goodman’s AMs (see [
6] and references therein). For a detailed discussion of AMs and the associated literature, we refer to [
21] (Chapter 6); a short presentation is provided in [
22].
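The simplest association model is that of uniform association (U), which is applied to tables with ordinal classification variables and models the interactions through just one parameter of intrinsic association, on the basis of equidistant scores assigned to the categories of the classification variables. The U model for square contingency tables (with classification variables of a common scale) can be expressed as
$$\log \pi_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \varphi \mu_i \mu_j, \quad i, j = 1, \ldots, I, \qquad (11)$$
where $\varphi$ is the intrinsic association parameter and $\mu_1, \ldots, \mu_I$ are known scores assigned to the classification categories, which are homogeneous for rows and columns, and equidistant for successive categories ($\mu_{i+1} - \mu_i = \Delta$, $i = 1, \ldots, I-1$). Under (11), the interaction parameters $\lambda_{ij}^{XY} = \varphi \mu_i \mu_j$ are obviously symmetric. Under the U model, all LORs $\theta_{ij}^{L}$ are equal since
$$\log \theta_{ij}^{L} = \varphi (\mu_{i+1} - \mu_i)(\mu_{j+1} - \mu_j) = \varphi \Delta^2, \quad i, j = 1, \ldots, I-1, \qquad (12)$$
and thus (6) is trivially fulfilled. However, (11) is not of the QS-type since it does not fit the diagonal exactly. Its extension
$$\log \pi_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \varphi \mu_i \mu_j + \delta_i I(i = j), \quad i, j = 1, \ldots, I, \qquad (13)$$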
        where I is the indicator function, is the homogeneous uniform association model with exactly fitted diagonal entries, denoted by 
, introduced by [
3] as the uniform with main diagonal deleted model. It is a quasisymmetric model for ordinal classifications, more parsimonious than QS, with $df = I(I-3)$. It is an alternative to OQS and more parsimonious than OQS for $I > 4$.
Analogously to the 
 model, Model (
13) can be considered for arbitrary known scores 
 (with 
). Furthermore, considering expression (
13) with unknown parametric scores, not necessarily ordered, the homogeneous 
 model with exactly fitted diagonals (
) is derived, which is another 
-type model, less parsimonious than 
 that can apply also to nominal classification variables. For a discussion on 
, 
 and further homogeneous AMs of higher order (i.e., homogeneous 
 models) and their links to 
, we refer e.g., to [
21] (Section 9.4).
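The effect of the diagonal parameters on the LOR can be sketched numerically. The example below is illustrative only: it assumes a uniform association structure with unit-spaced homogeneous scores, optionally extended by one extra parameter per diagonal cell, and the function names and parameter values are hypothetical.

```python
import numpy as np

def local_odds_ratios(p):
    """(I-1) x (I-1) table of local odds ratios of a strictly positive I x I table."""
    p = np.asarray(p, dtype=float)
    return (p[:-1, :-1] * p[1:, 1:]) / (p[:-1, 1:] * p[1:, :-1])

def uniform_association_table(row_effects, col_effects, assoc, scores, diag_effects=None):
    """Normalised table with log p_ij = row_i + col_j + assoc * mu_i * mu_j (+ diag_i if i == j)."""
    mu = np.asarray(scores, dtype=float)
    log_p = (np.asarray(row_effects, dtype=float)[:, None]
             + np.asarray(col_effects, dtype=float)[None, :]
             + assoc * np.outer(mu, mu))
    if diag_effects is not None:
        log_p = log_p + np.diag(np.asarray(diag_effects, dtype=float))
    p = np.exp(log_p)
    return p / p.sum()

scores = np.arange(1, 6)             # equidistant scores with unit spacing
assoc = 0.3                          # hypothetical intrinsic association parameter
rows = np.array([0.0, 0.2, -0.1, 0.4, 0.1])
cols = np.array([0.0, -0.3, 0.2, 0.1, 0.5])

theta_u = local_odds_ratios(uniform_association_table(rows, cols, assoc, scores))
theta_u0 = local_odds_ratios(
    uniform_association_table(rows, cols, assoc, scores,
                              diag_effects=[1.0, 0.5, 0.8, 0.2, 0.6]))

print(np.allclose(theta_u, np.exp(assoc)))         # constant LOR without diagonal terms?
print(np.allclose(theta_u0, theta_u0.T))           # symmetric LOR with diagonal terms?
print(np.isclose(theta_u0[0, 2], np.exp(assoc)))   # LOR not touching the diagonal unchanged?
```

Only the LOR whose $2 \times 2$ subtable contains a diagonal cell are affected by the diagonal parameters, and they are affected symmetrically, which is why the extended model remains of the QS-type.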
  3. Generalized QS Models
In case one or both of the classification variables are ordinal, there exist other types of odds ratios that are alternatives to LOR. In our framework of square tables with classification variables measured on the same scale, of interest are, beyond LOR, odds ratios for tables with an ordinal classification scale. The most popular type for ordinal classification variables is the global odds ratios (GOR), which for an $I \times I$ contingency table are defined as
$$\theta_{ij}^{G} = \frac{\left(\sum_{k \le i} \sum_{l \le j} \pi_{kl}\right)\left(\sum_{k > i} \sum_{l > j} \pi_{kl}\right)}{\left(\sum_{k \le i} \sum_{l > j} \pi_{kl}\right)\left(\sum_{k > i} \sum_{l \le j} \pi_{kl}\right)}, \quad i, j = 1, \ldots, I-1. \qquad (14)$$
The characterization global is because every $\theta_{ij}^{G}$ is based on the whole contingency table, since it dichotomizes X and Y at levels i and j, respectively, and accordingly merges the cell probabilities. GOR treat both classification variables in a symmetric manner. When merging is considered only for one classification variable, for example, Y, while the other is treated locally, then the cumulative odds ratios (COR) are derived
$$\theta_{ij}^{C_Y} = \frac{\left(\sum_{l \le j} \pi_{il}\right)\left(\sum_{l > j} \pi_{i+1,l}\right)}{\left(\sum_{l > j} \pi_{il}\right)\left(\sum_{l \le j} \pi_{i+1,l}\right)}, \quad i, j = 1, \ldots, I-1. \qquad (15)$$
COR can be used in problems of modeling the effect of an explanatory variable on a response. In particular, $\theta_{ij}^{C_Y}$ could be considered if Y is the response. Obviously, $\theta_{ij}^{C_X}$ can be analogously defined. For further types of generalized odds ratios and their detailed study, their inter-relations, and associated positive dependence properties, we refer to [23].
Motivated by the definition of QS through the symmetric LOR, the authors in [15] introduced generalized QS-type models for generalized odds ratios other than the LOR. In this context, the classical QS model that applies to the LOR would be denoted by 
, while analogously to (6), the QS property for the GOR $\theta_{ij}^{G}$ is defined by
$$\theta_{ij}^{G} = \theta_{ji}^{G}, \quad i \neq j, \qquad (16)$$
      and denoted by 
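. On the other hand, the definition of the QS model for the COR also requires interchanging the role of the response variable
$$\theta_{ij}^{C_Y} = \theta_{ji}^{C_X}, \qquad (17)$$
with $\theta^{C_Y}$ and $\theta^{C_X}$ denoting the COR that cumulate over Y and over X, respectively, as explained in [15].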
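The merging of cells behind GOR and COR can be made concrete with a short computation. This is a minimal sketch that assumes the standard definitions of global and cumulative odds ratios; the function names and the small example tables are hypothetical and serve only as an illustration.

```python
import numpy as np

def global_odds_ratios(p):
    """GOR: both variables are dichotomised, X at level i and Y at level j."""
    p = np.asarray(p, dtype=float)
    size = p.shape[0]
    gor = np.empty((size - 1, size - 1))
    for i in range(1, size):
        for j in range(1, size):
            low_low, low_high = p[:i, :j].sum(), p[:i, j:].sum()
            high_low, high_high = p[i:, :j].sum(), p[i:, j:].sum()
            gor[i - 1, j - 1] = low_low * high_high / (low_high * high_low)
    return gor

def cumulative_odds_ratios_y(p):
    """COR with Y cumulated at level j and X treated locally (rows i and i + 1)."""
    p = np.asarray(p, dtype=float)
    size = p.shape[0]
    cor = np.empty((size - 1, size - 1))
    for i in range(size - 1):
        for j in range(1, size):
            cor[i, j - 1] = (p[i, :j].sum() * p[i + 1, j:].sum()
                             / (p[i, j:].sum() * p[i + 1, :j].sum()))
    return cor

def cumulative_odds_ratios_x(p):
    """COR with X cumulated at level i and Y treated locally (columns j and j + 1)."""
    return cumulative_odds_ratios_y(np.asarray(p, dtype=float).T).T

# For a 2 x 2 table every type of odds ratio equals the ordinary odds ratio.
p22 = np.array([[0.4, 0.1], [0.2, 0.3]])
gor22 = global_odds_ratios(p22)[0, 0]

# A completely symmetric 3 x 3 table satisfies both generalized symmetry properties.
counts = np.array([[20.0, 5.0, 2.0], [7.0, 15.0, 6.0], [3.0, 4.0, 18.0]])
p_sym = (counts + counts.T) / (2.0 * counts.sum())
gor = global_odds_ratios(p_sym)
cor_y = cumulative_odds_ratios_y(p_sym)
cor_x = cumulative_odds_ratios_x(p_sym)
print(np.allclose(gor, gor.T), np.allclose(cor_y, cor_x.T))
```

The complete symmetry table is used here only because it is the simplest table for which both properties hold; the generalized QS models are less restrictive.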
  4. $\phi$-Divergence-Based Families of Generalized QS Models
The QS and OQS models can be defined as models of departure from complete symmetry (see (4) and (9)). From a statistical information point of view, they share a common property: under certain conditions (different for each model), each is the closest model to complete symmetry when the distance is measured in terms of the Kullback-Leibler (KL) divergence, as proved by [7,8] for QS and OQS, respectively. Furthermore, the authors in [7,8] introduced and studied general classes of QS and OQS models, derived by replacing the KL divergence by a family of divergences, the $\phi$-divergence, which includes the KL divergence as a special case.
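In particular, the $\phi$-divergence is a general measure of divergence between two probability distributions. In our setup, if $p = (p_{ij})$ and $q = (q_{ij})$ are two discrete finite probability distributions on the cells of the table, the $\phi$-divergence between $p$ and $q$ is given by
$$D_{\phi}(p, q) = \sum_{i,j} q_{ij}\, \phi\!\left(\frac{p_{ij}}{q_{ij}}\right), \qquad (18)$$
where $\phi$ is a real-valued strictly convex function on $[0, \infty)$ with $\phi(1) = \phi'(1) = 0$, $0\,\phi(0/0) = 0$, $0\,\phi(y/0) = y \lim_{x \to \infty} \phi(x)/x$ (see e.g., [24]). The KL divergence
$$D_{KL}(p, q) = \sum_{i,j} p_{ij} \log \frac{p_{ij}}{q_{ij}} \qquad (19)$$
is derived from (18) for $\phi(x) = x \log x - x + 1$. Setting $\phi(x) = (x-1)^2/2$, Pearson's divergence is obtained, while $\phi(x) = \frac{x^{\lambda+1} - x - \lambda(x-1)}{\lambda(\lambda+1)}$, $\lambda \neq -1, 0$, leads to the power divergence measure of Cressie and Read (CR) [25]
$$D_{\lambda}(p, q) = \frac{1}{\lambda(\lambda+1)} \sum_{i,j} p_{ij} \left[\left(\frac{p_{ij}}{q_{ij}}\right)^{\lambda} - 1\right], \qquad (20)$$
which is a flexible parametric family itself, controlled by the parameter $\lambda$. For $\lambda \to 0$, (20) converges to the KL divergence, for $\lambda = 1$ it corresponds to Pearson's divergence, and for $\lambda = -1/2$ to the Hellinger divergence.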
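The special cases of the divergence family can be verified numerically. The sketch below assumes generator functions normalised to vanish, together with their first derivative, at 1, and it handles strictly positive distributions only; the function names and the two example distributions are hypothetical.

```python
import numpy as np

def phi_divergence(p, q, phi):
    """Sum over cells of q * phi(p / q), for strictly positive p and q."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    return float(np.sum(q * phi(p / q)))

def phi_kl(x):
    """Generator of the KL divergence."""
    return x * np.log(x) - x + 1.0

def phi_pearson(x):
    """Generator of Pearson's divergence."""
    return 0.5 * (x - 1.0) ** 2

def phi_power(lam):
    """Generator of the Cressie-Read power divergence for lam not in {-1, 0}."""
    if lam in (0.0, -1.0):
        raise ValueError("lam = 0 and lam = -1 are limiting cases")
    def phi(x):
        return (x ** (lam + 1.0) - x - lam * (x - 1.0)) / (lam * (lam + 1.0))
    return phi

p = np.array([[0.20, 0.10], [0.30, 0.40]])
q = np.array([[0.25, 0.25], [0.25, 0.25]])

kl = phi_divergence(p, q, phi_kl)
kl_direct = float(np.sum(p * np.log(p / q)))               # textbook form of the KL divergence
pearson = phi_divergence(p, q, phi_pearson)
power_one = phi_divergence(p, q, phi_power(1.0))           # power divergence at lam = 1
power_near_zero = phi_divergence(p, q, phi_power(1e-6))    # numerical stand-in for lam -> 0
power_half = phi_divergence(p, q, phi_power(-0.5))         # power divergence at lam = -1/2
hellinger_type = 2.0 * float(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))
```

Cells with zero probability would require the limiting conventions for the generator and are deliberately left out of this sketch.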
The 
-divergence-based 
 family of models (
), introduced by [
7], is defined by
      
      with 
, where 
 is the inverse function of 
. For the KL divergence 
 and (
21) leads to model (
4) while for the CR 
, the 
 family of models
      
      is derived. In this context, the standard 
 model is denoted by 
 as a member of this family. The superscript 
L in these models notation indicates the fact that they impose symmetric restrictions on LOR.
Analogously, the generalized family of 
 models, 
, is defined by
      
      with 
 and 
 known monotonic equidistant scores (see [
8]). 
 reduces to (
9) for 
, while, for the CR divergence, 
 is given by
      
  5. New Families of $\phi$-Divergence Generalized QS Models
The QS-type models of Section 2.2 that are linked to AMs with homogeneous row and column scores can be extended to $\phi$-divergence-based families through the $\phi$-divergence AMs of [14]. A brief presentation of the $\phi$-divergence AMs and the underlying concept can be found in [22]. Here, we focus just on the families corresponding to the U model with homogeneous row and column scores. The associated $\phi$-divergence-based family of models 
 is defined as
      
      where 
 and 
 denote the 
i-th row and 
j-th column marginals respectively, i.e., 
. For the KL divergence, 
 leads to model (
11) while for the CR–divergence it takes the form
      
      denoted by 
. Hence the special 
-type model 
 defined in (
13), can be extended to a family of models
      
      denoted by 
. The model expression corresponding to CR divergence is denoted by 
.
Furthermore, the QS models for generalized odds ratios, introduced in [15] and presented in Section 3, can be extended to a flexible $\phi$-divergence family, based on the $\phi$-divergence generalized AMs of [17], which are briefly presented below, adjusted to our setup.
Analogously to expression (
12) for the 
 model, 
 can alternatively be expressed as
      
      where
      
      for 
, are measures of local dependence, scaled through the 
-divergence and denoted by LOR
. For 
, LOR
 is the log(LOR), modeled in (
12), which in the sequel is denoted as 
, 
. For the CR divergence,(
29) becomes
      
Forcina and Kateri ([
17]) provided expressions for 
-scaled generalized odds ratios and introduced families of 
-divergence AMs for generalized odds ratios, which they studied. Thus, for example, GOR and COR extend to GOR
 and COR
, given by
      
      and
      
      for 
. The merged probabilities in (
31) and (
32) are the same as defined in (
14) and (
15). For the KL divergence, (
31) and (
32) reduce to (
14) and (
15), while for the CR divergence, setting 
, the corresponding expressions GOR
 and COR
 are derived. Through these 
-scaled generalized odds ratios, generalized 
 models, such as 
 and 
, are extended to 
-divergence-based families of models by replacing in the definitions (
16) and (
17) the GOR and COR by GOR
 and COR
. 
-type models for other types of generalized odds ratios, introduced in [
15], can be analogously extended to 
-divergence-based families.
  6. Examples
We first illustrate QS-type models on one of the most classical datasets of square tables, namely, the women vision data provided in Table 1. Apart from the standard QS model, often fitted in the literature, the author in [5] fitted the OQS model to this dataset, while the authors in [
7] the 
 and in [
8] the 
, for 
. Furthermore, the authors in [15] fitted QS models for generalized odds ratios other than the LOR.
Our second example, provided in Table 2, cross-classifies male respondents of the 2008 General Social Survey (GSS) in the USA on the basis of their degree of pride with regard to America's economic vs. scientific and technological achievements. We applied the same models to this dataset as to the women vision data.
In Table 3, we provide the likelihood ratio goodness-of-fit (GOF) test statistic values for QS-type models fitted on the LOR for both examples. Table 4 shows the GOF test statistics for the generalized QS model fitted on the GOR.
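As a point of reference for such GOF values, the likelihood ratio statistic of the complete symmetry model S has a closed form, because its fitted frequencies are the averages of the symmetric pairs of cells. The sketch below rests on two assumptions that should be checked against Table 1: the counts are the classical unaided distance vision data for women as commonly reproduced in the literature (rows: right eye, columns: left eye, grades ordered from highest to lowest), and they coincide with the data analysed here. The function name is hypothetical.

```python
import numpy as np

def symmetry_g2(counts):
    """Likelihood ratio statistic and degrees of freedom of the complete symmetry model."""
    n = np.asarray(counts, dtype=float)
    fitted = (n + n.T) / 2.0                      # MLEs of expected frequencies under S
    off_diag = ~np.eye(n.shape[0], dtype=bool)    # diagonal cells are fitted exactly
    g2 = 2.0 * np.sum(n[off_diag] * np.log(n[off_diag] / fitted[off_diag]))
    df = n.shape[0] * (n.shape[0] - 1) // 2
    return float(g2), df

# Assumed counts of the women vision data; verify against Table 1.
vision = np.array([[1520, 266, 124, 66],
                   [234, 1512, 432, 78],
                   [117, 362, 1772, 205],
                   [36, 82, 179, 492]])

g2_sym, df_sym = symmetry_g2(vision)
print(round(g2_sym, 2), df_sym)
```

QS-type models relax S, so their statistics can only be smaller than this value, at the cost of fewer degrees of freedom. Fitting them requires iterative estimation, which is not sketched here.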
For the women-vision example, 
 models were best. The scale in this case is not of practical importance, since 
 and 
, corresponding to the KL and Pearsonian divergence, are of comparable fit (their maximum likelihood estimates (MLEs) differ on the second decimal place). The MLEs of the expected cell frequencies under 
 are provided in 
Table 1. This model, i.e., (
24) with 
 and 
, 
, takes the final form (see [
8])
      
The MLE for 
 is 
 and hence the probabilities in the lower triangle in 
Table 1 are estimated to be smaller than those in the upper. Hence the vision is worse for the left eye. Under this model, it holds
      
Thus, the odds of an observation falling in a certain subdiagonal under the main diagonal of the table (instead of the corresponding superdiagonal) are estimated as
, 
. Notice that in [
8] the corresponding 
 value is different (=0.119). This is due to rescaling, since there is used a different set of 
 scores (the 
’s are the same).
The situation is different for the second example, where it is clear that the KL divergence should be used for modeling the LOR (see 
Table 3). Furthermore we see that for this data set, the 
 model, that imposes a special parsimonious structure on the interaction terms and not on the main effects (as under the 
 models), is of better fit. However, an impressive fit is provided by the 
 models (see 
Table 4). The best fit is for 
 and thus the MLEs of the expected cell frequencies under 
 are shown in 
Table 2. Hence, for this data set, the QS property is significantly more strongly supported for the global than for the local dependencies.
In our examples, we considered specific choices for the parameter $\lambda$ of the CR divergence. The analysis and interpretation of results follow analogously for other choices of $\lambda$.
  7. Discussion
In future research, it would be interesting to consider more parsimonious 
-type models for generalized odds ratios, analogs to 
 for LOR, as for example, the model of uniform GOR with homogeneous (equidistant) row and column scores, i.e., satisfying
      
that additionally fits the probabilities on the main diagonal cells exactly. AMs that model interactions other than local ones, though naturally defined through the corresponding type of odds ratios, do not provide closed-form expressions for the individual cell probabilities. Hence, a definition of the associated QS-type models by expressions analogous to (13) is not possible. For the same reason, the OQS model cannot be extended to other types of odds ratios by the approach adopted here. Since it imposes a special structure on the main effects, this cannot be captured when defining models in terms of odds ratios; expressions in terms of cell probabilities are required. Recently, the authors in [17] derived expressions for such generalized AMs in terms of suitable associated marginal probabilities. These expressions include parameters for the main effects (see Forms (9) and (10) in [17]). One could generalize 
 and 
 for other types of odds ratios, adopting the framework of [
17].
  8. Conclusions
In this work, we revisited the QS model for square contingency tables with commensurable classification variables and discussed its possible equivalent formulations. QS is mostly expressed in terms of cell probabilities, while it can alternatively be expressed in terms of local odds ratios. Its definition as a model of departure from the more parsimonious model of complete symmetry provides additional interpretation features. Furthermore, we considered the OQS model, a more parsimonious QS-type model, applicable if the classification scale is ordinal, which imposes a special structure on the main effects of the model. On the other hand, further QS-type models can be derived by considering a special structure for the interaction terms. This is possible through AMs with homogeneous row and column scores. In particular, by adding parameters to homogeneous AMs that ensure the exact fit of the diagonal entries of the contingency table, models are derived that model the off-diagonal cells and have symmetric interaction terms. Thus, they are of the QS-type, but more parsimonious than the standard QS model. All these models are related to LOR and model local dependencies of the table. Next, we presented how these models can be defined for other types of generalized odds ratios, reviewing the work of [15].
In a statistical information-theoretic setup, QS-type models and AMs satisfy properties of closeness to a specific reference model, when their divergence from the reference model is measured in terms of the KL divergence. The reference model is that of symmetry (for QS and OQS models) or independence (for AMs). Replacing the KL divergence with the $\phi$-divergence, generalized families of QS models, OQS models, and AMs for LOR were considered by [7,8,14], respectively. The 
 model was linked to GOR in [
16], while [15] introduced and implemented QS models for GOR and other types of generalized odds ratios, without, however, considering the link to divergence measures and associated properties. The possible extension of these models in terms of the $\phi$-divergence was left as a topic for further research in [15]. Here, we extended these models to $\phi$-scaled generalized odds ratios and linked them to corresponding AMs on the basis of the results and models discussed in [17]. We demonstrated the modeling flexibility of the classes of models discussed here by fitting and discussing some of these models on two representative examples.