Recent Extensions to the Cochran–Mantel–Haenszel Tests

The Cochran–Mantel–Haenszel (CMH) methodology is a suite of tests applicable to particular tables of count data. The inference is conditional on the treatment and outcome totals on each stratum being known before sighting the data. The CMH tests are important for analysing randomised blocks data when the responses are categorical rather than continuous. This overview of some recent extensions to CMH testing first describes the traditional CMH tests and then explores new alternative presentations of the ordinal CMH tests. Next, the ordinal CMH tests will be extended so they can be used to test for higher moment effects. Finally, unconditional analogues of the extended CMH tests will be developed.


Introduction
The Cochran–Mantel–Haenszel (CMH) methodology is a suite of tests applicable to particular tables of count data. The inference is conditional on the treatment and outcome totals on each stratum being known before sighting the data. The CMH tests are applicable to more complex designs than randomised blocks, but the analysis of randomised blocks data when the responses are categorical rather than continuous is certainly an important application of the CMH tests.
This paper does not intend to be a general review of the CMH methodology or even of recent CMH methodology. The principal aim is to introduce the reader to particular extensions introduced by the first author. An extensive literature survey of CMH-related topics would give the paper a different focus. We suggest that readers interested in CMH testing more broadly use their preferred search engines.
This overview will first describe the traditional CMH tests and then explore new alternative presentations of the ordinal CMH tests. These could be used to calculate the test statistics, but their main use will perhaps be to develop intuition about CMH testing. Next, the ordinal CMH tests will be extended so they can be used to test for higher moment effects. One rationale for developing these extensions was to enable a comparison with the nonparametric ANOVA tests introduced in References [1–3]. These tests permit univariate moment assessments beyond the mean and bivariate assessments beyond the (order (1, 1)) correlation. The nonparametric ANOVA tests are therefore briefly reviewed before considering the CMH extensions. Finally, unconditional analogues of the extended CMH tests will be developed.

The CMH Tests
The CMH tests are a class of nonparametric tests used to analyse tables $\{N_{ihj}\}$ of count data of a particular structure. Specifically, $N_{ihj}$ counts the number of times treatment $i$ is classified into outcome category $h$ in the $j$th stratum, $i = 1, \ldots, t$, $h = 1, \ldots, c$ and $j = 1, \ldots, b$. Strata are independent and the treatments present in each stratum are fixed by design. As is usual, $N_{ihj}$ is a random variable and $n_{ihj}$ is a particular value of that random variable. Dot notation is used to reflect summation over a subscript. In the traditional CMH tests (see, for example, References [4–7]),

• the strata totals, $\sum_{i,h} N_{ihj} = n_{\bullet\bullet j}$,
• the treatment totals within strata, $\sum_h N_{ihj} = n_{i\bullet j}$, and
• the outcome totals within strata, $\sum_i N_{ihj} = n_{\bullet hj}$,

are all assumed to be known prior to sighting the data; they are not random variables. The inference is conditional.
The four CMH tests considered here assess overall partial association (OPA; $S_{OPA}$, $\chi^2_{b(c-1)(t-1)}$), general association (GA; $S_{GA}$, $\chi^2_{(c-1)(t-1)}$), mean scores (MS; $S_{MS}$, $\chi^2_{t-1}$) and correlation (C; $S_C$, $\chi^2_1$). In parentheses are the symbols used subsequently for each test statistic and the asymptotic null distributions of the test statistics. The test statistics are quadratic forms with vectors using the table counts. Some of the covariance matrices involve Kronecker products.
This design is appropriate for categorical randomised block data. While the CMH methodology is appropriate more generally, it does not accommodate more complex designs such as Latin square, multifactor and many other designs. However, it is an extremely important analysis tool for randomised blocks when the responses are categorical rather than continuous. In consumer studies with just about right (JAR) responses, such as in the Jams Example following, when $c$ is small the randomised block F test will often be invalid in spite of the well-known ANOVA robustness.
Two examples will now be introduced. They will be reconsidered throughout this article.
Homosexual Marriage Example. Scores of 1, 2 and 3 are assigned to the responses agree, neutral and disagree respectively to the proposition "Homosexuals should be able to marry", and scores of 1, 2 and 3 are assigned for the religious categories fundamentalist, moderate and liberal respectively. See Table 1. Agresti [8] finds $S_{GA}$ takes the value 19.76 with $\chi^2_4$ p-value 0.0006, $S_{MS}$ takes the value 17.94 with $\chi^2_2$ p-value 0.0001, and $S_C$ takes the value 16.83 with $\chi^2_1$ p-value less than 0.0001. From these tests, we conclude there is strong evidence of an association between the proposition responses and religion. In particular, there is evidence of mean differences in the responses and of a (linear-linear) correlation between responses and religion.
Jams Example. Three plum jams, A, B and C, are given JAR sweetness codes by eight judges as in Table 2. Here, 1 denotes not sweet enough, 2 not quite sweet enough, 3 just about right, 4 a little too sweet and 5 too sweet. The treatment sums for jams A, B and C are 18, 30 and 23 respectively. We find $S_{MS} = 9.6177$ with $\chi^2_2$ p-value 0.0082 and $S_C = 1.1029$ with $\chi^2_1$ p-value 0.2936. There is evidence of a mean effect; on average the jams are different. However, there is no evidence of a correlation effect: as we pass from jam A to B and then to C there is no evidence of an increasing (or decreasing) response.
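The p-values quoted in the two examples can be checked directly from the reported statistics and their degrees of freedom. The following is a minimal sketch using only the Python standard library; the helper `chi2_sf` is a hypothetical name introduced here, and it relies on the closed-form upper-tail expressions for the chi-squared distribution with integer degrees of freedom rather than on any routine used by the authors.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        m = df // 2
        return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(m))
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    m = (df - 1) // 2
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1)
    )
    return tail

# Statistics quoted in the two examples: (label, value, degrees of freedom).
reported = [
    ("S_GA, marriage", 19.76, 4),
    ("S_MS, marriage", 17.94, 2),
    ("S_C, marriage", 16.83, 1),
    ("S_MS, jams", 9.6177, 2),
    ("S_C, jams", 1.1029, 1),
]

p_values = {label: chi2_sf(value, df) for label, value, df in reported}
for label, p in p_values.items():
    print(f"{label}: p = {p:.4f}")
```

To four decimal places the printed values should agree with those quoted in the examples.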

The Nominal CMH Tests
The CMH MS test assumes that response categories are ordered, while the CMH C test assumes that both treatment categories and response categories are ordered. It is therefore appropriate to label these as ordinal CMH tests, while the OPA and GA CMH tests, which make no assumption about ordering, can be called nominal CMH tests. Scores for the ordered categories are needed to apply the ordinal tests, while no scores are required for the nominal tests.
It is important to note that the traditional CMH tests are conditional tests, conditional on the treatment totals within strata, $\sum_h N_{ihj} = n_{i\bullet j}$, and the outcome totals within strata, $\sum_i N_{ihj} = n_{\bullet hj}$, being known prior to collecting or sighting the data. For randomised block data in which the responses are untied ranks, the marginal totals are known before collecting the data. This is because each treatment is applied once in every block and each response is a rank that is assigned once. This is not the case for data such as in the homosexual marriage example.
For conditional testing, the distribution theory involved in determining means and covariance matrices uses a product extended hypergeometric distribution. For each stratum, since the row and column totals are known, the counts $N_{ihj}$ follow an extended hypergeometric distribution. Moreover, since the strata are mutually independent, a product distribution is appropriate for the collection of these strata counts.
To define the CMH GA test statistic $S_{GA}$, first define the vector of counts on the $j$th stratum, $U_j = (N_{11j}, \ldots, N_{1cj}, \ldots, N_{t1j}, \ldots, N_{tcj})^T$. Summing over strata gives $U_\bullet = \sum_j U_j$. Using the extended hypergeometric distribution, the mean $E[U_\bullet]$ and covariance matrix $\mathrm{cov}(U_\bullet)$ may be derived, as in Reference [4]; the covariance matrix involves the Kronecker or direct product, $\otimes$. The CMH GA test statistic is a quadratic form in $U_\bullet - E[U_\bullet]$ using $\mathrm{cov}(U_\bullet)$. As $\mathrm{cov}(U_\bullet)$ is not of full rank, either a generalised inverse $\mathrm{cov}^-(U_\bullet)$ can be used or dependent variables can be chosen and omitted to produce a covariance matrix of full rank. All unknown parameters are estimated using maximum likelihood under the null hypothesis. Asymptotically, as $n_{\bullet\bullet\bullet}$ becomes large, $S_{GA}$ has the $\chi^2_{(c-1)(t-1)}$ distribution. The test statistic is symmetric in the treatments and outcome categories and independent of the choice of the dependent variables.
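As a hedged sketch rather than a transcription of the displays in Reference [4], the standard extended hypergeometric results for the stratum counts can be written in Kronecker form as follows. The symbols $p_{Tj}$ and $p_{Cj}$, and this particular split of constants between the scalar factor and the two matrices, are assumptions made here for illustration.

```latex
\begin{align*}
p_{Tj} &= (n_{1\bullet j}, \ldots, n_{t\bullet j})^T / n_{\bullet\bullet j}, \qquad
p_{Cj} = (n_{\bullet 1j}, \ldots, n_{\bullet cj})^T / n_{\bullet\bullet j}, \\
E[U_\bullet] &= \sum_j n_{\bullet\bullet j}\, (p_{Tj} \otimes p_{Cj}), \\
\mathrm{cov}(U_\bullet) &= \sum_j \frac{n_{\bullet\bullet j}^2}{n_{\bullet\bullet j} - 1}
  \left\{\mathrm{diag}(p_{Tj}) - p_{Tj} p_{Tj}^T\right\} \otimes
  \left\{\mathrm{diag}(p_{Cj}) - p_{Cj} p_{Cj}^T\right\}, \\
S_{GA} &= (U_\bullet - E[U_\bullet])^T\, \mathrm{cov}^-(U_\bullet)\, (U_\bullet - E[U_\bullet]).
\end{align*}
```

Element by element, the mean has entries $n_{i\bullet j} n_{\bullet hj} / n_{\bullet\bullet j}$ on each stratum, and the ordering of the Kronecker product matches the treatment-major ordering of $U_j$.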
The test statistic S GA is too complicated for routine hand calculation; it is almost always applied using software in packages such as R.
To calculate $S_{OPA}$, the vector of the quadratic form involves the aggregation of the $U_j$ via $U = (U_1^T, \ldots, U_b^T)^T$. Again, the covariance matrix is calculated using the product extended hypergeometric distribution. The overall partial association statistic so derived is denoted $S_{OPA}$. Asymptotically this has the $\chi^2_{b(c-1)(t-1)}$ distribution. One difference between the two tests is that the CMH GA test is seeking to detect the average partial association while the CMH OPA test is seeking to detect the overall partial association. The former is more focused, with $(c-1)(t-1)$ degrees of freedom, compared with $b(c-1)(t-1)$ for the CMH OPA statistic. The degrees of freedom are the dimension of the alternative hypothesis space. Thus, the CMH OPA test is seeking to detect very general alternatives to the null hypothesis and will have a relatively low power for these alternatives. The CMH GA test seeks to detect fewer alternatives and will have more power than the CMH OPA test for these alternatives. However, the CMH OPA test will have some power for alternatives to which the CMH GA test is insensitive. In other contexts, more focused tests have been constructed using components of an omnibus test statistic. This idea will be taken up subsequently.
An alternative test for the overall partial association is the Pearson test, with statistic $T_{OPA} = \sum_j X^2_{Pj}$, in which $X^2_{Pj}$ is the Pearson statistic on the $j$th stratum. This is an unconditional test that does not assume all treatment and outcome totals are known before sighting the data. The difference between $S_{OPA}$ and $T_{OPA}$ is merely the factor $(n_{\bullet\bullet j} - 1)/n_{\bullet\bullet j}$ applied to each stratum. For large stratum counts, this will make little difference in the values of $S_{OPA}$ and $T_{OPA}$.
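The relationship between the two overall partial association statistics can be illustrated numerically. The sketch below assumes, as stated in the text, that the conditional statistic differs from the unconditional one only through the factor $(n_{\bullet\bullet j} - 1)/n_{\bullet\bullet j}$ on each stratum. The counts are hypothetical and the function name `pearson_x2` is introduced here for illustration only.

```python
def pearson_x2(table):
    """Pearson chi-squared statistic and total count for a two-way table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for h, observed in enumerate(row):
            expected = row_totals[i] * col_totals[h] / n
            x2 += (observed - expected) ** 2 / expected
    return x2, n

# Hypothetical counts: two strata, t = 2 treatments (rows), c = 3 outcomes (columns).
strata = [
    [[10, 5, 5], [4, 6, 10]],
    [[8, 7, 5], [5, 5, 10]],
]

t_opa = 0.0
s_opa = 0.0
for table in strata:
    x2, n = pearson_x2(table)
    t_opa += x2                  # unconditional: plain sum of stratum Pearson statistics
    s_opa += (n - 1) / n * x2    # conditional: each stratum scaled by (n - 1)/n

print(f"T_OPA = {t_opa:.4f}, S_OPA = {s_opa:.4f}")
```

With 40 observations in each hypothetical stratum the two statistics differ by the factor 39/40, which shows how quickly the distinction becomes negligible as stratum counts grow.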
Likewise, general association can be tested for using a Pearson test for the two-way table of counts $\{N_{ih\bullet}\}$, $T_{GA}$ say. The Pearson test statistics $T_{OPA}$ and $T_{GA}$ will have the same asymptotic distributions as the corresponding conditional tests. Most users will have more familiarity with the unconditional tests, and most packages will have routines for their calculation even if they do not have routines for the CMH tests.
Subsequently, the main focus will be the CMH ordinal tests.

CMH Mean Scores Test
Suppose, as before, that $N_{ihj}$ counts the number of times treatment $i$ is classified into outcome category $h$ in the $j$th stratum, $i = 1, \ldots, t$, $h = 1, \ldots, c$ and $j = 1, \ldots, b$. Assume that outcomes are ordinal and assign the score $b_{hj}$ to the $h$th response on the $j$th stratum. All marginal totals are assumed to be known, so the product extended hypergeometric model is assumed. The score sum for treatment $i$ in stratum $j$ is $M_{ij} = \sum_h b_{hj} N_{ihj}$. Collect these into $M_j = (M_{1j}, \ldots, M_{tj})^T$. The CMH MS test statistic $S_{MS}$ is a quadratic form in $\sum_j (M_j - E[M_j])$ using $\sum_j \mathrm{cov}(M_j)$, in which all unknown parameters are estimated by maximum likelihood under the null hypothesis.
Then it may be shown that $\mathrm{cov}(M_j) = S_j^2 V_{Tj}$. Since $\mathrm{cov}(M_j)$ is not of full rank, the usual approaches, such as dropping appropriate treatment and/or outcome categories, or using a generalised inverse, can be used.
Note. The mean scores statistic depends on the scores assigned to the response categories. Thus, the statistic could be written $S_{MS}(\{b_{hj}\})$ to emphasise this dependence.
Aside. The derivation of $\mathrm{cov}(M_j)$ requires routine but tedious algebra. If $\delta_{uv}$ is the Kronecker delta, equal to 1 if $u = v$ and zero otherwise, then using standard distribution theory for the product extended hypergeometric distribution, $E[N_{ihj}] = n_{i\bullet j} n_{\bullet hj} / n_{\bullet\bullet j}$ and the covariance between $N_{ihj}$ and $N_{i'h'j}$ is
$$\mathrm{cov}(N_{ihj}, N_{i'h'j}) = \frac{n_{i\bullet j}(\delta_{ii'} n_{\bullet\bullet j} - n_{i'\bullet j})\, n_{\bullet hj}(\delta_{hh'} n_{\bullet\bullet j} - n_{\bullet h'j})}{n_{\bullet\bullet j}^2 (n_{\bullet\bullet j} - 1)}.$$
These lead to the stated covariance matrix. Under the null hypothesis of no treatment effects, the distribution of $S_{MS}$ can be shown to be asymptotically $\chi^2_{t-1}$; see Reference [4].
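The moments used in the aside can be checked by brute force on a small table. The sketch below assumes the standard result that, with both margins fixed, $E[N_{ihj}] = n_{i\bullet j} n_{\bullet hj}/n_{\bullet\bullet j}$ and $\mathrm{cov}(N_{ihj}, N_{i'h'j}) = n_{i\bullet j}(\delta_{ii'} n_{\bullet\bullet j} - n_{i'\bullet j})\, n_{\bullet hj}(\delta_{hh'} n_{\bullet\bullet j} - n_{\bullet h'j}) / \{n_{\bullet\bullet j}^2 (n_{\bullet\bullet j} - 1)\}$. It enumerates every equally likely allocation of a fixed set of outcomes to units in one hypothetical stratum and compares exact moments with these formulas. The data and all function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical single stratum: unit k receives treatments[k]; outcomes are shuffled.
treatments = [0, 0, 1, 1, 1]   # treatment totals 2 and 3
outcomes = [0, 0, 1, 1, 2]     # outcome totals 2, 2 and 1
t, c, n = 2, 3, len(treatments)
row = [treatments.count(i) for i in range(t)]
col = [outcomes.count(h) for h in range(c)]

# Every permutation of the outcomes is equally likely, which gives the
# extended hypergeometric distribution with both sets of margins fixed.
tables = []
for perm in permutations(outcomes):
    counts = [[0] * c for _ in range(t)]
    for i, h in zip(treatments, perm):
        counts[i][h] += 1
    tables.append(counts)

def mean(values):
    return sum(values, Fraction(0)) / len(values)

def exact_cov(i, h, i2, h2):
    x = [Fraction(tab[i][h]) for tab in tables]
    y = [Fraction(tab[i2][h2]) for tab in tables]
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)

def formula_cov(i, h, i2, h2):
    d_t = n if i == i2 else 0   # delta_{ii'} * n
    d_c = n if h == h2 else 0   # delta_{hh'} * n
    return Fraction(row[i] * (d_t - row[i2]) * col[h] * (d_c - col[h2]),
                    n * n * (n - 1))

cells = [(i, h) for i in range(t) for h in range(c)]
mean_ok = all(
    mean([Fraction(tab[i][h]) for tab in tables]) == Fraction(row[i] * col[h], n)
    for i, h in cells
)
cov_ok = all(
    exact_cov(i, h, i2, h2) == formula_cov(i, h, i2, h2)
    for i, h in cells
    for i2, h2 in cells
)
print(mean_ok, cov_ok)
```

Exact rational arithmetic is used so that agreement is an equality rather than a floating point approximation.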

The CMH Correlation Test
The CMH correlation tests assume that the treatment and response variables are both measured on either an ordinal or an interval scale and that, on the $j$th stratum, the treatment scores are $a_{ij}$, $i = 1, \ldots, t$, and the response scores are $b_{hj}$, $h = 1, \ldots, c$, for $j = 1, \ldots, b$.
The null hypothesis of no association between the treatment and response variables, having adjusted for the $b$ strata, is tested against the alternative that across strata there is a consistent association, positive or negative, between the treatment scores and response scores. Take $C = \sum_j C_j$, in which $C_j = \sum_i \sum_h a_{ij} b_{hj} (N_{ihj} - E[N_{ihj}])$. The CMH correlation (CMH C) statistic is $C^2/\mathrm{var}(C) = S_C$ say. The derivation of $\mathrm{var}(C)$ is relatively complex if scalars are used but is routine using Kronecker products.
To derive $\mathrm{var}(C)$, first define $a_j = (a_{1j}, \ldots, a_{tj})^T$, $b_j = (b_{1j}, \ldots, b_{cj})^T$ and $N_j = (N_{11j}, \ldots, N_{1cj}, \ldots, N_{t1j}, \ldots, N_{tcj})^T$, so that $C_j = (a_j \otimes b_j)^T (N_j - E[N_j])$. Recall from Section 3 that $\mathrm{cov}(N_j)$ is proportional to the Kronecker product $V_{Tj} \otimes V_{Cj}$. Hence $\mathrm{cov}(C_j)$ is proportional to $(a_j^T V_{Tj} a_j) \otimes (b_j^T V_{Cj} b_j) = (a_j^T V_{Tj} a_j)(b_j^T V_{Cj} b_j)$, because both factors in the Kronecker product are scalars. Finally, $\mathrm{var}(C) = \sum_j \mathrm{cov}(C_j)$ because counts in different strata are mutually independent. The CMH correlation statistic $S_C$ is now fully specified. The Central Limit Theorem assures the asymptotic normality of $C$, so as the total sample size $n_{\bullet\bullet\bullet}$ becomes large, $S_C$ has asymptotically the $\chi^2_1$ distribution. Again, see Reference [4].

Alternative Presentations of the Ordinal CMH Test Statistics
In this section, the focus is on alternative expressions for the ordinal CMH test statistics. In the case of randomised block data, simple expressions are given for the ready calculation of the test statistics. The expression for the correlation statistic is more general and quite insightful, so that will be considered first.

The CMH Correlation Statistic
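Using the definitions previously given for $V_{Tj}$ and $V_{Cj}$, the quadratic forms $a_j^T V_{Tj} a_j$ and $b_j^T V_{Cj} b_j$ can be written in terms of sums of squares of the scores. On stratum $j$ now define
$$S_{XXj} = \sum_i a_{ij}^2 n_{i\bullet j} - \Big(\sum_i a_{ij} n_{i\bullet j}\Big)^2 / n_{\bullet\bullet j},$$
$$S_{XYj} = \sum_i \sum_h a_{ij} b_{hj} N_{ihj} - \Big(\sum_i a_{ij} n_{i\bullet j}\Big)\Big(\sum_h b_{hj} n_{\bullet hj}\Big) / n_{\bullet\bullet j},$$
$$S_{YYj} = \sum_h b_{hj}^2 n_{\bullet hj} - \Big(\sum_h b_{hj} n_{\bullet hj}\Big)^2 / n_{\bullet\bullet j}.$$
With appropriate divisors the $S_{XXj}$, $S_{XYj}$ and $S_{YYj}$ give unbiased estimators of the stratum variances and covariances. With these definitions, $C_j = S_{XYj}$ and $\mathrm{cov}(C_j) = S_{XXj} S_{YYj} / (n_{\bullet\bullet j} - 1)$. Finally, since $C = \sum_j C_j$ and $\mathrm{var}(C) = \sum_j \mathrm{cov}(C_j)$,
$$S_C = \frac{\big(\sum_j S_{XYj}\big)^2}{\sum_j S_{XXj} S_{YYj} / (n_{\bullet\bullet j} - 1)}.$$
The expression here uses the $S_{XXj}$, $S_{XYj}$ and $S_{YYj}$ familiar in formulae for regression coefficients. Three special cases will be considered: (1) the data consist of one stratum only; (2) the treatment scores are independent of the strata, $a_{ij} = a_i$ for all $i$ and $j$; (3) the randomised block design (in Section 5.2).
In the second case, $S_{XX}$ is constant over strata. This gives a slight simplification of the $S_C$ formula. See the Jams Example below. If the data come from a randomised block design, a considerable simplification is possible if the same treatment and response scores are used on each stratum or block. See Section 5.2.
In the first case, the CMH correlation statistic simplifies to $S_C = (n_{\bullet\bullet 1} - 1) r_P^2$, in which $r_P$ is the Pearson correlation coefficient. This is well known. See, for example, Reference [6], p. 253.
If we now write $r_{Pj}$ for the Pearson correlation in the $j$th stratum, it follows that, since $S_{XYj} = r_{Pj} \sqrt{S_{XXj} S_{YYj}}$,
$$S_C = \frac{\big(\sum_j r_{Pj} \sqrt{S_{XXj} S_{YYj}}\big)^2}{\sum_j S_{XXj} S_{YYj} / (n_{\bullet\bullet j} - 1)}.$$
Thus $S_C$ is proportional to the square of a linear combination of the Pearson correlations in each stratum. The proportionality factor ensures $S_C$ has, asymptotically, the $\chi^2_1$ distribution. This formula demonstrates how the Pearson correlations in each stratum contribute to the overall correlation measure.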
Homosexual Marriage Example. These data were considered in Section 1; they are given in Table 1. Noting that stratum 1 is school and stratum 2 is college, we find $S_{XX1} = 50$, $S_{XY1} = -9$ and $S_{YY1} = 39.7333$, $r_{P1} = -0.2019$, and the CMH C statistic for school takes the value 2.4055 with $\chi^2_1$ p-value 0.1209. Similarly, $S_{XX2} = 51.6712$, $S_{XY2} = -23.8904$, $S_{YY2} = 42.6301$, $r_{P2} = -0.5090$, and the CMH C statistic for college takes the value 18.6558 with $\chi^2_1$ p-value 0.0000. From these, the value of the CMH C statistic and its $\chi^2$ p-value are confirmed: it was previously noted that $S_C$ takes the value 16.8328 with $\chi^2_1$ p-value 0.0000. Clearly, there is an insignificant Pearson correlation for school and a highly significant Pearson correlation for college. The latter dominates the former so that overall there is strong evidence of a correlation effect: as religion becomes increasingly liberal there is greater agreement with the proposition that homosexuals should be able to marry. This is due mainly to the stratum college.
Whiskey Example. O'Mahony [9], p. 363 gave the data in Table 3, which were analysed in Reference [10]. They use mid-rank scores and find the Spearman correlation, which takes the value 0.73. In testing if this is zero against two-sided alternatives they give a Monte Carlo p-value of 0.09 and an asymptotic p-value of 0.04. Using scores 1, 2 and 3 for grade and 1, 5 and 7 for years of maturity, we find $S_{XX} = 6$, $S_{XY} = -12$ and $S_{YY} = 43.5$. It follows that the Pearson correlation is −0.7428 and the CMH C statistic takes the value 3.8621 with $\chi^2_1$ p-value 0.0494. There is some evidence that as maturity increases so does the grade of the whiskey.
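The calculations in the two preceding examples can be reproduced from the stratum summaries alone. The sketch below assumes the forms $C = \sum_j S_{XYj}$ and $\mathrm{var}(C) = \sum_j S_{XXj} S_{YYj}/(n_{\bullet\bullet j} - 1)$. The stratum sizes used (60, 73 and 8) are not quoted in the text; they are assumed values, chosen because they reproduce the reported statistics, and should be checked against Tables 1 and 3. The function names are hypothetical.

```python
import math

def pearson_r(s_xx, s_xy, s_yy):
    """Pearson correlation from corrected sums of squares and products."""
    return s_xy / math.sqrt(s_xx * s_yy)

def cmh_correlation(strata):
    """CMH C statistic from per-stratum summaries (n, S_XX, S_XY, S_YY)."""
    c_total = sum(s_xy for _, _, s_xy, _ in strata)                     # C
    var_c = sum(s_xx * s_yy / (n - 1) for n, s_xx, _, s_yy in strata)   # var(C)
    return c_total ** 2 / var_c

# Summaries quoted in the text; the leading stratum sizes are ASSUMED (see lead-in).
school = (60, 50.0, -9.0, 39.7333)
college = (73, 51.6712, -23.8904, 42.6301)
whiskey = (8, 6.0, -12.0, 43.5)

results = {
    "r school": pearson_r(*school[1:]),
    "r college": pearson_r(*college[1:]),
    "S_C school": cmh_correlation([school]),
    "S_C college": cmh_correlation([college]),
    "S_C overall": cmh_correlation([school, college]),
    "r whiskey": pearson_r(*whiskey[1:]),
    "S_C whiskey": cmh_correlation([whiskey]),
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

For a single stratum the function reduces to $(n - 1) r_P^2$, so the whiskey value also illustrates the first special case.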
Jams Example. As there are eight strata, hand calculation is possible if a little tedious. As $S_{XX}$ is constant over strata, it is not too much extra work to calculate $S_{XY}$, $S_{YY}$, the Pearson correlation, the CMH correlation statistic and its p-value on each stratum (see Table 4). We find $S_{XX} = 2$ on all strata and $S_C = 1.1029$ with $\chi^2_1$ p-value 0.2936. There is no significant correlation effect, which would, if present, indicate that as we pass from jam A to B to C there is increasing (or decreasing) sweetness. It could be that overall there is no correlation effect with the contrary being the case in a minority of strata. That is not the case here; no stratum shows evidence of even a slight correlation effect. Here this is hardly surprising; with only three observations in each stratum, there can be little power in testing for a correlation.
If there is interest in the contributions to the correlation from individual strata (as in the homosexual marriage example), this is a reasonable approach. However, if there is not, then for the randomised blocks design with the same treatment scores in each stratum, a considerable simplification is possible. This is now considered.

The Randomised Blocks Design
As developed, the CMH methodology does not apply to Latin square, multifactor and many other designs. However, it is an extremely important analysis tool for randomised blocks when the responses are categorical rather than continuous. See, for example, Reference [6], Chapter 8, in which CMH methods are used to analyse repeated measures categorical data.
In the case of the randomised block design, a considerable simplification of the CMH MS statistic is possible. In this case, $n_{i\bullet j} = 1$ for all $i$ and $j$. Consequently, $n_{\bullet\bullet j} = t$ for all $j$. Substituting in the definitions given in Section 4 ultimately gives $S_{MS}$ as a function of $F$, in which $F$ is the ANOVA F test statistic. A derivation of this relationship is given in the Supplementary Materials. The significance of this result is that the CMH MS and ANOVA F tests are formally equivalent.
Coincidentally, the given relationship is the same as that relating the Friedman test statistic and the F test statistic using the ranks as data.
p-Values may be obtained by referring $F$ to its $F_{t-1,(b-1)(t-1)}$ distribution or $S_{MS}$ to its $\chi^2_{t-1}$ distribution, or otherwise. An empirical study would be required to assess which is the more reliable in the sense of closeness to, say, permutation test p-values. In the examples we have analysed, there has been little difference in these methods.
The CMH C test statistic with the same scores in every stratum is $S_C = (t - 1) C^2 / (S_{XX} \sum_j S_{YYj})$, in which $C$ is the sum of the products of the treatment sums and the centred treatment scores, $S_{XX}$ is the sum of the squares of the centred treatment scores and $\sum_j S_{YYj}$ may be read from the output for a one-way ANOVA. Again, the derivation is given in the Supplementary Materials.
Jams Example. For the jams data, $a_i = 1$, 2 and 3, so $S_{XX} = 2$. From a one-way ANOVA with judges as treatments, $\sum_j S_{YYj} = 22\tfrac{2}{3}$. This can be confirmed by summing across the $S_{YY}$ row in Table 4. By summing the columns in Table 2, the treatment score sums are found to be 18, 30 and 23, and since the centred treatment scores are −1, 0 and 1, $C = -18 + 0 + 23 = 5$. Substituting gives $S_C = 2 \times 25/(2 \times 68/3) = 75/68 = 1.1029$ with $\chi^2_1$ p-value 0.2936, as previously.
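The arithmetic in the Jams Example can be carried out exactly. This minimal sketch assumes the randomised block form $S_C = (t - 1) C^2 / (S_{XX} \sum_j S_{YYj})$ and uses only quantities quoted in the text; the variable names are illustrative.

```python
from fractions import Fraction

t = 3
treatment_scores = [1, 2, 3]
treatment_sums = [18, 30, 23]      # jams A, B and C
sum_s_yy = Fraction(68, 3)         # sum over judges of S_YYj, that is 22 2/3

# Centre the treatment scores, then form S_XX and C.
mean_score = Fraction(sum(treatment_scores), t)
centred = [a - mean_score for a in treatment_scores]
s_xx = sum(d * d for d in centred)
c_stat = sum(d * m for d, m in zip(centred, treatment_sums))

s_c = (t - 1) * c_stat ** 2 / (s_xx * sum_s_yy)
print(s_xx, c_stat, s_c, round(float(s_c), 4))
```

Working in rational arithmetic makes the intermediate values $S_{XX} = 2$, $C = 5$ and $S_C = 75/68$ directly visible.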

Nonparametric ANOVA
A suite of tests that can analyse the same data as the CMH tests is the nonparametric ANOVA tests of References [1–3]. However, these tests are applicable for a wider range of designs and are more flexible than the CMH tests, in that they assess both moment effects and generalised correlation (GC) effects of all orders, although only lower order tests would be applied in practice. The extension of the CMH tests considered in the next section makes the CMH tests as flexible as the nonparametric ANOVA tests.
The nonparametric ANOVA tests are nonparametric in that they are based on weak multinomial models incorporating smooth alternatives. The sampling distributions of the test statistics are not known, but the usual F distributions provide approximations that agree very well with permutation test results and, of course, are very convenient. Responses for both the CMH and nonparametric ANOVA are categorical, although the use of the term ANOVA suggests continuous responses. In fact, the issue of whether the data are categorical or continuous is of little practical importance. The nonparametric ANOVA tests are robust to even highly categorical data and, while the sampling distributions for the CMH tests are asymptotic, they are quite good even for small sample sizes. A simulation study in Reference [11] showed that for randomised blocks data both suites of tests showed good agreement between their nominal significance levels and the test sizes achieved.
It is interesting to note that if the data are ranked then the first order nonparametric ANOVA F tests are equivalent to the Kruskal–Wallis test when the design is the completely randomised design and the Friedman test for the randomised blocks design. See, for example, Reference [12], Section 2.5.
Nonparametric ANOVA may be applied to designs consistent with the general linear model (see Reference [3]) while, of course, the design for the CMH extensions is much more restrictive. There are two types of nonparametric ANOVA: ordered and unordered. For the unordered analysis, orthonormal polynomials on the response variable up to a given level, typically three, are constructed. For any given ANOVA, the analysis with the responses transformed by the order $r$ orthonormal polynomial is called the order $r$ analysis. Analyses of different orders are uncorrelated, so significance or not at any given order does not affect the significance or not at any other order. For a given design there will usually be several effects, such as main effects, interactions and so on. The order $r$ analysis tests null hypotheses that the responses transformed by the $r$th orthonormal polynomial are consistent across levels of these effects. Suppose, for example, that one effect is the application of treatment A. Analyses of successive orders seek to assess whether or not the responses transformed by the first, second and higher order orthonormal polynomials are consistent across levels of treatment A.
The ordered analysis assumes that at least one of the independent variables is ordered in the sense that the levels of the effects corresponding to that variable are ordered. Orthonormal polynomials are constructed on the response variable and on each ordered independent variable. The responses are transformed by an orthonormal polynomial of a particular order and each independent variable is also transformed by an orthonormal polynomial of a particular order. Then the product of the transformed variables is taken. A new, reduced design is formed from the original. A new response, the product of the orthonormal polynomials, replaces the original response and the ordered independent variables are removed. ANOVAs of interest on the reduced design are then performed. These analyses assess generalised correlations, for which see Reference [13]. As with the unordered analysis, the conclusions from one ANOVA do not affect the conclusions from other ANOVAs. Of most interest are the usual order (1, 1) correlation, which assesses linear-linear effects, and the order (1, 2) and (2, 1) correlations, which assess umbrella effects.
Both the extended CMH tests and the unordered and ordered nonparametric ANOVAs assume the existence of the orthonormal polynomials. An orthonormal polynomial of order $r$ requires the existence of moments up to order $2r$ while the corresponding orthogonal polynomial requires the existence of moments up to order $r + 1$. However, as responses are classified into, say, $c$ classes, it is only possible to calculate moments of the responses up to order $c - 1$. Thus, moments, orthonormal polynomials and hence tests of certain orders are not available for some parameter choices.

Extensions of the CMH Mean Scores and Correlation Tests
To construct extensions to $S_{MS}$, consider the scores $b_{hj}$ on the $j$th stratum. A common choice of scores to give a 'mean' assessment would be the 'natural' scores $1, 2, \ldots, c$ on all strata. A 'dispersion' assessment could be given by choosing scores $1^2, 2^2, \ldots, c^2$, and similarly higher order powers might be of interest: $b_{hj} = h^r$. One problem with using the scores $b_{hj} = h^r$ is that the test statistics are correlated. Thus, the significance or not of the test for any order may affect the significance or not of tests at other orders. We now look at using more general scores with the objective of having uncorrelated test statistics. To this end, now denote order $r$ scores that are not stratum-specific by $b_h^{(r)}$, $h = 1, \ldots, c$. Define the order $r$ score sum for treatment $i$ by $M_i^{(r)} = \sum_h b_h^{(r)} N_{ih\bullet}$. The $i$th score sums of different orders, $M_i^{(r)}$ and $M_i^{(s)}$, are uncorrelated. In this sense, the information provided by the score sums of different orders for the same treatment is not related.
To construct extensions to $S_C$, suppose that instead of a single set of outcome scores $\{b_{hj}\}$, we consider $c$ sets of scores, $\{b_{hj}^{(s)}\}$ for $s = 0, 1, \ldots, c - 1$. Moreover, suppose the scores are orthonormal in the sense that $\sum_h b_{hj}^{(r)} b_{hj}^{(s)} n_{\bullet hj} / n_{\bullet\bullet j} = \delta_{rs}$ with $r, s = 0, 1, \ldots, c - 1$ and with $b_{hj}^{(0)} = 1$ for $h = 1, \ldots, c$. Similarly, instead of a single set of treatment scores $\{a_{ij}\}$, consider $t$ sets of scores, $\{a_{ij}^{(r)}\}$ for $r = 0, 1, \ldots, t - 1$, and suppose these scores are orthonormal in the sense that $\sum_i a_{ij}^{(r)} a_{ij}^{(s)} n_{i\bullet j} / n_{\bullet\bullet j} = \delta_{rs}$ with $r, s = 0, 1, \ldots, t - 1$ and with $a_{ij}^{(0)} = 1$ for $i = 1, \ldots, t$. Both sets of scores may be stratum-specific. Define $S_{Crs}$ as before but using $\{a_{ij}^{(r)}\}$ and $\{b_{hj}^{(s)}\}$. It can be shown that the $S_{Crs}$ are uncorrelated. See Reference [11] for details.
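Scores that are orthonormal with respect to the category proportions $n_{\bullet hj}/n_{\bullet\bullet j}$ can be built in several ways. One possibility, sketched below under the assumption that the scores are polynomials in the natural scores $1, \ldots, c$, is weighted Gram–Schmidt orthogonalisation of the powers $h^r$. The outcome totals are hypothetical and the function names are introduced here for illustration; this is not necessarily the construction used in Reference [11].

```python
import math

def weighted_inner(weights, u, v):
    """Inner product sum_h weights[h] * u[h] * v[h]."""
    return sum(w * x * y for w, x, y in zip(weights, u, v))

def orthonormal_scores(weights, max_order):
    """Rows b[r][h], r = 0..max_order, orthonormal with respect to `weights`.

    Built by Gram-Schmidt on the powers 1, h, h^2, ... of the scores h = 1..c.
    """
    c = len(weights)
    non_empty = sum(1 for w in weights if w > 0)
    if max_order > non_empty - 1:
        raise ValueError("orders above (number of non-empty categories - 1) do not exist")
    basis = []
    for r in range(max_order + 1):
        vec = [float(h) ** r for h in range(1, c + 1)]
        for b in basis:
            # Remove the component of vec along each lower-order score vector.
            proj = weighted_inner(weights, vec, b)
            vec = [x - proj * y for x, y in zip(vec, b)]
        norm = math.sqrt(weighted_inner(weights, vec, vec))
        basis.append([x / norm for x in vec])
    return basis

# Hypothetical outcome totals n_.h on one stratum with c = 5 categories.
outcome_totals = [6, 9, 5, 3, 1]
n = sum(outcome_totals)
weights = [m / n for m in outcome_totals]

scores = orthonormal_scores(weights, 3)
gram = [[weighted_inner(weights, scores[r], scores[s]) for s in range(4)] for r in range(4)]
for row in gram:
    print([round(abs(g), 6) for g in row])
```

The printed matrix should be the identity up to rounding, and the order zero scores are identically one. The guard on `max_order` reflects the point made earlier that only orders up to one less than the number of occupied categories are available.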
Jams Example. For the jams data, the CMH MS and unordered nonparametric ANOVA p-values are exactly the same. This is because for a randomised block design the CMH MS tests using the ANOVA F statistic use the same orthonormal functions, based on the category weights, as the unordered nonparametric ANOVA.
For the jams data, the p-values using the F distribution are 0.0278, 0.2435 and 0.5554 and those using the $\chi^2_{t-1}$ distribution are 0.0082, 0.1116 and 0.3802 respectively. At the 0.05 level, there is evidence of mean differences in the scores for jams but not of higher moment effects.
We may test for generalised correlation effects using both CMH GC and nonparametric ANOVA. In Table 5, the nonparametric ANOVA cells give p-values, first using the t-test and second using the Wilcoxon signed ranks test.
Only the (1, 2) p-value using the t-test is less than 0.05. The treatment sums for jams A, B and C are 18, 30 and 23 respectively. It seems that in passing from jam A to B and then C there is evidence that the sweetness is assessed to increase and then decrease.
Here, and in other examples, we find that the CMH extensions and the corresponding nonparametric ANOVA tests give fairly similar conclusions.
Homosexual Marriage Example. For the homosexual marriage data, the CMH MS extensions of first and second order yield p-values of 0.0000 and 0.2732 respectively.
The CMH C extensions of orders (1, 1), (1, 2), (2, 1) and (2, 2) were calculated. An intermediate step gave the generalised correlations of these orders for each stratum. Again, the (1, 1) correlation gave very strong evidence of being non-zero. The others were not significant at the commonly used levels of significance. See Table 6. The unordered nonparametric ANOVA gave p-values of less than 0.0001 for first order and 0.3719 for second order. As with the CMH MS extensions, there is strong evidence of a mean difference between religions but no evidence of a second order effect.

Development of Unconditional CMH Tests
It was noted in Section 3 that the nominal CMH test statistics $S_{OPA}$ and $S_{GA}$ had analogues denoted by $T_{OPA}$ and $T_{GA}$. The latter are based on Pearson tests. The nominal CMH tests were identified as conditional tests while the analogues, since they are not based on the product extended hypergeometric distribution, are unconditional tests. Since the unconditional Pearson tests are very familiar, they would be the tests of preference for most users. It is of interest to develop unconditional analogues of the ordinal CMH tests. This was done in Reference [14].
For singly ordered two-way tables, an unconditional model is explored in Reference [10], Chapter 4. On the $j$th stratum define orthonormal polynomials $\{\omega_{uj}(h)\}$ on $(p_{\bullet 1j}, \ldots, p_{\bullet cj})$, so that $\sum_h \omega_{uj}(h)\, \omega_{vj}(h)\, p_{\bullet hj} = \delta_{uv}$. Next, define the statistic $V_{uij}$, which reflects the contribution of treatment $i$ to the order $u$ effect on stratum $j$. Rayner and Best [10], Chapter 4 show that $\sum_{u=1}^{c-1} \sum_{i=1}^{t} V_{uij}^2 = X_{Pj}^2$, the Pearson statistic on the $j$th stratum.
The statistic $V_{u1j}^2 + \ldots + V_{utj}^2$ gives an order $u$ assessment of the consistency of treatments in the $j$th stratum. It has asymptotic distribution $\chi^2_{t-1}$. Summing over strata, $\sum_{i,j} V_{uij}^2$ gives an order $u$ assessment of the consistency of treatments over all $b$ strata and has asymptotic distribution $\chi^2_{b(t-1)}$. In particular, $\sum_{i,j} V_{1ij}^2$ gives an overall mean assessment of the consistency of the treatments, $\sum_{i,j} V_{2ij}^2$ gives an overall dispersion assessment of the consistency of the treatments, and so on. The $\sum_{i,j} V_{uij}^2$ partition $\sum_j X_{Pj}^2 = T_{OPA}$.
Average assessments may be obtained by summing over strata. Put $f_j = (\sqrt{n_{1\bullet j}}, \ldots, \sqrt{n_{t\bullet j}})^T$ and $\sum_j (I_t - f_j f_j^T / n_{\bullet\bullet j}) = \Sigma$ say. An average order $u$ assessment of the consistency of treatments can be based on a quadratic form, $T_{Mu}$ say. Of course, $T_{M1}$ gives an unconditional assessment of mean score differences in treatments, in contrast with $S_{MS}$, which gives a conditional assessment of mean score differences in treatments. The $T_{Mu}$ are all asymptotically $\chi^2_{t-1}$ distributed.
Note that the tests for overall and even average association may have quite large degrees of freedom. This means they are seeking to detect very general alternatives to the null hypothesis of no association. The test for the average partial association may well be more focused than that for the overall association but may well still have low power when $(t - 1)(c - 1)$ is quite large: the alternative to the null is too general. The moment tests here, and the generalised correlation tests following, offer focused tests for alternatives that are important and easy to understand.
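The partition of a stratum Pearson statistic into order $u$ components can be checked numerically. The sketch below assumes the form $V_{ui} = \sum_h N_{ih}\, \omega_u(h) / \sqrt{n_{i\bullet}}$ for a single stratum, which is an assumption about the definition in Reference [10] rather than a quotation of it, and builds the orthonormal polynomials on the observed outcome proportions by Gram–Schmidt. The table of counts is hypothetical.

```python
import math

def orthonormal_polys(p):
    """Polynomials omega[u][h], u = 0..c-1, orthonormal on proportions p (all positive)."""
    c = len(p)

    def inner(u, v):
        return sum(w * x * y for w, x, y in zip(p, u, v))

    basis = []
    for u in range(c):
        vec = [float(h) ** u for h in range(1, c + 1)]
        for b in basis:
            proj = inner(vec, b)
            vec = [x - proj * y for x, y in zip(vec, b)]
        norm = math.sqrt(inner(vec, vec))
        basis.append([x / norm for x in vec])
    return basis

# Hypothetical single stratum: t = 3 treatments (rows), c = 4 ordered outcomes (columns).
table = [[8, 6, 4, 2], [3, 7, 6, 4], [2, 4, 7, 7]]
t, c = len(table), len(table[0])
row = [sum(r) for r in table]
col = [sum(col_counts) for col_counts in zip(*table)]
n = sum(row)
omega = orthonormal_polys([m / n for m in col])

# Assumed form of the components: V_ui = sum_h N_ih * omega_u(h) / sqrt(n_i.)
V = [
    [sum(N * w for N, w in zip(table[i], omega[u])) / math.sqrt(row[i]) for i in range(t)]
    for u in range(c)
]
components = [sum(v * v for v in V[u]) for u in range(1, c)]   # orders 1, ..., c - 1

pearson = sum(
    (table[i][h] - row[i] * col[h] / n) ** 2 / (row[i] * col[h] / n)
    for i in range(t)
    for h in range(c)
)
print([round(x, 4) for x in components], round(sum(components), 4), round(pearson, 4))
```

Under this assumed form the components of orders 1 to $c - 1$ sum to the Pearson statistic, so the first component isolates the mean effect, the second the dispersion effect, and so on.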
For correlation tests, results analogous to those in Reference [10], Section 8.2 for two-way tables are needed. There, a single multinomial is assumed: $\{N_{ij}\}$ follows a multinomial distribution with total count $n = N_{\bullet\bullet}$ and cell probabilities $\{p_{ij}\}$ with $p_{\bullet\bullet} = 1$. Instead, we need to assume that, for each $i$, $\{N_{ij}\}$ follows a multinomial distribution with total count $n_{i\bullet}$ and cell probabilities $\{p_{ij}\}$ with $p_{i\bullet} = 1$. Nevertheless, it may be shown that similar results apply. In particular, on the $j$th stratum there are statistics $V_{rsj}$ for which $\sum_{r,s} V_{rsj}^2 = X_{Pj}^2$, the Pearson statistic on the $j$th stratum.
The $V_{rsj}^2$ may be summed over strata to give an overall order $(r, s)$ correlation assessment. As strata are independent, $\sum_j V_{rsj}^2$ will have an asymptotic $\chi^2_b$ distribution. In particular, $\sum_j V_{11j}^2$ will give an overall unconditional assessment of the usual (that is, order (1, 1) generalised) correlation. Since the sum of the strata Pearson statistics is the overall Pearson statistic, the $\sum_j V_{rsj}^2$ numerically partition the overall Pearson statistic. For a comparison of the conditional and unconditional analyses for the homosexual marriage example, see Table 7.
Rayner and Best [14] note that care must be exercised in the application of these tests. Any response category that included no observations would normally be deleted. For the jams data, see Table 2 to confirm that overall there are no null categories; however, that is not the case within each stratum. For example, the first judge gives scores of 2, 3 and 3. The first, fourth and fifth categories include no responses. That means $\sum_i N_{ihj} = n_{\bullet hj} = 0$ for $h = 1$, 4 and 5 when $j = 1$. Thus, $X_{P1}^2$ is not defined, and neither are $S_{OPA}$ and $T_{OPA}$. It is some consolation that $T_{GA}$ is defined.
What is happening with $T_{OPA}$ is that, in order to sum the Pearson statistics from each stratum, the same treatments and responses need to be included in all statistics, and with sparse data that need not be the case. The same applies to $S_{OPA}$, which is using a weighted sum of Pearson statistics.
The sparseness of the data also affects the definition of the unconditional moment and correlation tests. Both require the definition of orthonormal functions on each stratum. If there are three distinct responses, then the orthonormal functions up to order two can be defined; but for the jams data this only happens on strata 4 and 7. All other strata have two distinct responses and so only orthonormal functions of order zero and one can be defined. It is feasible that a judge may not be able to distinguish between the treatments and would therefore give the same score to all treatments. There is then no information about the differences between the treatments and such a judge should be excluded from the analysis. For these data, there are no non-informative judges to be removed but only first order orthonormal functions can be constructed on all strata. Second order orthonormal functions are not defined on all strata, so no second order analysis can be given.
It seems necessary to accept that sparse data yield sparse information. Here, the data give information about the usual (order (1, 1) generalised) correlation but not enough information about, for example, the order (1, 2) generalised correlation, to be able to say anything further.

Supplementary Materials: The following are available online at http://www.mdpi.com/2571-905X/1/1/8/s1. Derivation of the relationship between $S_{MS}$ and $F$; derivation of the CMH C test statistic with the same scores in every stratum.
Author Contributions: J.C.W.R. wrote the manuscript with P.R. adding critical and constructive comment as well as editorial advice. P.R. assisted with analysis of the data in the examples.

Table 1. Opinions on homosexual marriage by religious beliefs and education levels for ages 18 to 25.

Table 2. JAR codes for the sweetness of three plum jams.

Table 3. Cross-classification of age and whiskey.

Table 4. Analysis of Jams data.

Table 5. CMH and nonparametric ANOVA GC p-values when testing for zero generalised correlations for the jam data.

Table 6. CMH p-values for each stratum and overall when testing for zero generalised correlations for the homosexual marriage data.
then the $\{V_{rs}\}$ numerically partition the Pearson statistic and are asymptotically standard normal score test statistics. In particular, $V_{11}/\sqrt{n}$ is the Pearson product moment correlation for grouped data and, if the scores are ranks, $V_{11}/$