Article

Exact Moments of Residuals of Independence

Department of Mathematics and Statistics, Oakland University, 146 Library Drive, Rochester, MI 48309, USA
Mathematics 2024, 12(24), 3987; https://doi.org/10.3390/math12243987
Submission received: 27 November 2024 / Revised: 10 December 2024 / Accepted: 18 December 2024 / Published: 18 December 2024
(This article belongs to the Section D1: Probability and Statistics)

Abstract

The diagnosis of residuals of independence is critical in association analysis and loglinear modeling of two-way contingency tables. Most residual diagnostics depend on large-sample methods, and diagnostic results become dubious when sample sizes are small or data are sparse. In such cases, statistical inference based on non-asymptotic theory or exact inference is desirable. This paper explicitly derives the first four moments of the residuals of independence in a two-way contingency table under a multinomial model. These exact moments are necessary tools for studying the analytical features of the distribution of residuals of independence, such as skewness and kurtosis. Higher-order moments can be found similarly, but the results are more complicated.

1. Introduction

Tests of independence are widely used in the association analysis of two-way contingency tables from cross-sectional studies and in other statistical applications. Ref. [1] investigated the association between levels of paternal education (completed university, partially completed university, completed secondary education, and not completed secondary education) and quartiles of neonatal weight gain ($Q_1$: lowest 25%; $Q_2$: second lowest 25%; $Q_3$: second highest 25%; and $Q_4$: highest 25%) in a cross-sectional study involving 13,262 Belarusian infants born at or after 37 weeks of gestation and weighing at least 2500 g. Table 1 provides the observed frequencies and the expected frequencies under independence (in parentheses).
With an observed $\chi^2$ statistic of 19.016 on nine degrees of freedom, the p-value for testing the independence between levels of paternal education and quartiles of neonatal weight gain is 0.025. At the 0.05 level of significance, the data reject the independence of the levels of paternal education and the quartiles of neonatal weight gain. The nature of the dependence is usually revealed by the differences between the observed and expected frequencies, i.e., the residuals of independence. For example, the number of subjects with the highest 25% neonatal weight gain and partial university paternal education exceeds the expected number, while the number of subjects with the highest 25% neonatal weight gain and secondary paternal education falls below the expected number.
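The figures quoted above can be reproduced from the counts in Table 1. The following is a minimal sketch, not the author's code: the function names `independence_fit` and `chi2_sf_odd_df` are hypothetical, and the tail probability uses a hand-written closed form that only covers odd degrees of freedom (a statistics library would normally be used instead).

```python
import math

# Observed counts from Table 1 (rows: paternal education; columns: Q1..Q4).
observed = [
    [422, 433, 429, 414],      # complete university
    [1493, 1655, 1556, 1605],  # partially complete university
    [1239, 1276, 1243, 1179],  # complete secondary education
    [61, 110, 73, 74],         # incomplete secondary education
]

def independence_fit(table):
    """Return (expected, residuals, chi2) under the independence model."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    expected = [[ri * cj / n for cj in col_tot] for ri in row_tot]
    residuals = [[o - e for o, e in zip(orow, erow)]
                 for orow, erow in zip(table, expected)]
    chi2 = sum(r * r / e
               for rrow, erow in zip(residuals, expected)
               for r, e in zip(rrow, erow))
    return expected, residuals, chi2

def chi2_sf_odd_df(x, df):
    """Upper-tail chi-square probability; closed form valid for odd df only."""
    if df % 2 != 1:
        raise ValueError("this closed form only covers odd df")
    total, double_fact = 0.0, 1.0
    for j in range(1, (df - 1) // 2 + 1):
        double_fact *= 2 * j - 1              # running value of (2j-1)!!
        total += x ** (j - 0.5) / double_fact
    tail = math.sqrt(2 / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail

expected, residuals, chi2 = independence_fit(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf_odd_df(chi2, df)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
print(f"residual (partial university, Q4) = {residuals[1][3]:+.2f}")
print(f"residual (complete secondary, Q4) = {residuals[2][3]:+.2f}")
```

The two printed residuals correspond to the cells singled out in the text: the first is positive and the second is negative.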
Since the analytical form of the distribution of the residuals of independence is not available, the first four moments of the distribution provide vital information about the center, spread, skewness, and kurtosis of the distribution. For example, standardized residuals of independence are commonly used to reduce the heterogeneity from cell to cell. Standardization is usually carried out with the asymptotic mean and variance of the residuals. Non-asymptotic and explicit expressions of the mean and variance of the residuals of independence seem to be missing in the literature. This paper explicitly derives the first four raw moments of the residuals of independence under a multinomial model.

2. Main Results

Consider the following r × c table:
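| | 1 | 2 | $\cdots$ | $c-1$ | $c$ | Total |
|---|---|---|---|---|---|---|
| 1 | $n_{11}$ | $n_{12}$ | $\cdots$ | $n_{1(c-1)}$ | $n_{1c}$ | $n_{1+}$ |
| 2 | $n_{21}$ | $n_{22}$ | $\cdots$ | $n_{2(c-1)}$ | $n_{2c}$ | $n_{2+}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | | $\vdots$ | $\vdots$ | $\vdots$ |
| $r-1$ | $n_{(r-1)1}$ | $n_{(r-1)2}$ | $\cdots$ | $n_{(r-1)(c-1)}$ | $n_{(r-1)c}$ | $n_{(r-1)+}$ |
| $r$ | $n_{r1}$ | $n_{r2}$ | $\cdots$ | $n_{r(c-1)}$ | $n_{rc}$ | $n_{r+}$ |
| Total | $n_{+1}$ | $n_{+2}$ | $\cdots$ | $n_{+(c-1)}$ | $n_{+c}$ | $n_{++}$ |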
The residual of independence of cell $(i,j)$ is defined as $n_{ij} - n_{i+}n_{+j}/n_{++}$. Assume that $(n_{11},\ldots,n_{rc})$ follows a multinomial distribution with $n_{++}$ trials and probability $\pi_{ij}$ for cell $(i,j)$, for $i=1,2,\ldots,r$ and $j=1,2,\ldots,c$, where $\pi_{11}+\cdots+\pi_{rc}=1$, i.e., $(n_{11},\ldots,n_{rc}) \sim \mathrm{Multinomial}(n_{++};\pi_{11},\ldots,\pi_{rc})$.
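As a small illustration of the definition, the sketch below computes the residuals of independence exactly with rational arithmetic and checks that they sum to zero within every row and every column, which follows directly from the definition. The table `toy` and the function name `residuals_of_independence` are hypothetical.

```python
from fractions import Fraction

def residuals_of_independence(table):
    """R_ij = n_ij - n_i+ * n_+j / n_++, computed exactly with Fractions."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[Fraction(nij) - Fraction(ri * cj, n) for nij, cj in zip(row, col_tot)]
            for row, ri in zip(table, row_tot)]

toy = [[3, 1, 2], [0, 4, 5]]      # hypothetical 2 x 3 table of counts
R = residuals_of_independence(toy)
row_sums = [sum(row) for row in R]
col_sums = [sum(col) for col in zip(*R)]
print(R[0][0], row_sums, col_sums)
```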
The following factorial moments of a multinomial distribution are taken from [2] and can be proven straightforwardly.
Lemma 1.
Assume that $(T_1, T_2, \ldots, T_{k-1}, T_k) \sim \mathrm{Multinomial}(n; \pi_1, \pi_2, \ldots, \pi_{k-1}, \pi_k)$. For any nonnegative integer $m$ and random variable $X$, let $X^{(m)} = X(X-1)\cdots(X-m+1)$. Then, for nonnegative integers $m_1, \ldots, m_k$, we have

$$E\left[T_1^{(m_1)} T_2^{(m_2)} \cdots T_k^{(m_k)}\right] = n^{\left(\sum_{i=1}^{k} m_i\right)} \pi_1^{m_1} \pi_2^{m_2} \cdots \pi_k^{m_k}.$$

In particular,

$$\begin{aligned}
E(T_i) &= n\pi_i, && i = 1, 2, \ldots, k; \\
E(T_i^2) &= n(n-1)\pi_i^2 + n\pi_i, && i = 1, 2, \ldots, k; \\
E(T_i T_j) &= n(n-1)\pi_i\pi_j, && i \neq j,\ i, j = 1, 2, \ldots, k; \\
E(T_i^3) &= n(n-1)(n-2)\pi_i^3 + 3n(n-1)\pi_i^2 + n\pi_i, && i = 1, 2, \ldots, k; \\
E(T_i^2 T_j) &= n(n-1)(n-2)\pi_i^2\pi_j + n(n-1)\pi_i\pi_j, && i \neq j,\ i, j = 1, 2, \ldots, k; \\
E(T_i T_j T_l) &= n(n-1)(n-2)\pi_i\pi_j\pi_l, && i \neq j \neq l,\ i, j, l = 1, 2, \ldots, k; \\
E(T_i^4) &= n(n-1)(n-2)(n-3)\pi_i^4 + 6n(n-1)(n-2)\pi_i^3 + 7n(n-1)\pi_i^2 + n\pi_i, && i = 1, 2, \ldots, k; \\
E(T_i^3 T_j) &= n(n-1)(n-2)(n-3)\pi_i^3\pi_j + 3n(n-1)(n-2)\pi_i^2\pi_j + n(n-1)\pi_i\pi_j, && i \neq j,\ i, j = 1, 2, \ldots, k; \\
E(T_i^2 T_j^2) &= n(n-1)(n-2)(n-3)\pi_i^2\pi_j^2 + n(n-1)(n-2)(\pi_i^2\pi_j + \pi_i\pi_j^2) + n(n-1)\pi_i\pi_j, && i \neq j,\ i, j = 1, 2, \ldots, k; \\
E(T_i^2 T_j T_l) &= n(n-1)(n-2)(n-3)\pi_i^2\pi_j\pi_l + n(n-1)(n-2)\pi_i\pi_j\pi_l, && i \neq j \neq l,\ i, j, l = 1, 2, \ldots, k; \\
E(T_i T_j T_l T_m) &= n(n-1)(n-2)(n-3)\pi_i\pi_j\pi_l\pi_m, && i \neq j \neq l \neq m,\ i, j, l, m = 1, 2, \ldots, k; \\
\mathrm{Var}(T_i T_j) &= n(n-1)(6-4n)\pi_i^2\pi_j^2 + n(n-1)(n-2)(\pi_i^2\pi_j + \pi_i\pi_j^2) + n(n-1)\pi_i\pi_j, && i \neq j,\ i, j = 1, 2, \ldots, k; \\
\mathrm{Cov}(T_i T_j, T_l T_m) &= n(n-1)(6-4n)\pi_i\pi_j\pi_l\pi_m, && i \neq j \neq l \neq m,\ i, j, l, m = 1, 2, \ldots, k.
\end{aligned}$$
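Identities of this kind can be spot-checked by enumerating the full support of a small multinomial distribution with exact rational arithmetic. The sketch below does this for one factorial moment, for $E(T_i^2 T_j)$, and for $\mathrm{Var}(T_i T_j)$; the helper names and the choice $n = 5$ with probabilities $(1/2, 1/3, 1/6)$ are illustrative assumptions only.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_support(n, k):
    """Yield every count vector (t_1, ..., t_k) with t_1 + ... + t_k = n."""
    for head in product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            yield head + (n - sum(head),)

def multinomial_pmf(counts, probs):
    """Exact multinomial probability of a count vector."""
    coef = factorial(sum(counts))
    for t in counts:
        coef //= factorial(t)
    p = Fraction(coef)
    for t, pi in zip(counts, probs):
        p *= pi ** t
    return p

def expect(g, n, probs):
    """E[g(T)] by brute-force summation over the support."""
    return sum(g(t) * multinomial_pmf(t, probs)
               for t in multinomial_support(n, len(probs)))

def falling(x, m):
    """Falling factorial x(x-1)...(x-m+1)."""
    out = 1
    for q in range(m):
        out *= x - q
    return out

n = 5
probs = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
pi, pj = probs[0], probs[1]

# Factorial moment with (m_1, m_2, m_3) = (2, 1, 0).
fact_lhs = expect(lambda t: falling(t[0], 2) * falling(t[1], 1), n, probs)
fact_rhs = falling(n, 3) * pi ** 2 * pj

# Raw mixed moment E(T_i^2 T_j).
raw_lhs = expect(lambda t: t[0] ** 2 * t[1], n, probs)
raw_rhs = falling(n, 3) * pi ** 2 * pj + falling(n, 2) * pi * pj

# Var(T_i T_j).
var_lhs = (expect(lambda t: (t[0] * t[1]) ** 2, n, probs)
           - expect(lambda t: t[0] * t[1], n, probs) ** 2)
var_rhs = (n * (n - 1) * (6 - 4 * n) * pi ** 2 * pj ** 2
           + falling(n, 3) * (pi ** 2 * pj + pi * pj ** 2)
           + falling(n, 2) * pi * pj)
print(fact_lhs == fact_rhs, raw_lhs == raw_rhs, var_lhs == var_rhs)
```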
The next result is taken from [3] and can be proven directly from the definition of multinomial distribution.
Lemma 2.
Assume that $(T_1, T_2, \ldots, T_{k-1}, T_k) \sim \mathrm{Multinomial}(n; \pi_1, \pi_2, \ldots, \pi_{k-1}, \pi_k)$ and that $S_1, S_2, \ldots, S_m$ is a set partition of $\{1, 2, \ldots, k\}$. Let $X_i = \sum_{j \in S_i} T_j$ and $p_i = \sum_{j \in S_i} \pi_j$. Then, $(X_1, \ldots, X_m) \sim \mathrm{Multinomial}(n; p_1, p_2, \ldots, p_m)$.
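The collapsing property can be checked numerically in a small case. The sketch below merges four cells into two blocks and compares the resulting distribution with the multinomial distribution on the merged probabilities; the partition, $n = 4$, and the cell probabilities are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def pmf(counts, probs):
    """Multinomial probability of a count vector, as an exact Fraction."""
    coef = factorial(sum(counts))
    for t in counts:
        coef //= factorial(t)
    out = Fraction(coef)
    for t, p in zip(counts, probs):
        out *= p ** t
    return out

n = 4
probs = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
blocks = [(0, 2), (1, 3)]   # hypothetical partition {1,3}, {2,4}, 0-based indices

# Accumulate the distribution of (X_1, X_2) by summing over the full support.
merged = {}
for head in product(range(n + 1), repeat=len(probs) - 1):
    if sum(head) > n:
        continue
    t = head + (n - sum(head),)
    x = tuple(sum(t[j] for j in block) for block in blocks)
    merged[x] = merged.get(x, 0) + pmf(t, probs)

p_merged = [sum(probs[j] for j in block) for block in blocks]
lemma2_holds = all(prob == pmf(x, p_merged) for x, prob in merged.items())
print(lemma2_holds)
```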
The mean and variance of the residuals of independence are given below.
Theorem 1.
Assume that $(n_{11}, \ldots, n_{rc}) \sim \mathrm{Multinomial}(n_{++}; \pi_{11}, \ldots, \pi_{rc})$, where $n_{++} = \sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}$ is a constant, $\pi_{ij} \ge 0$ for $i = 1, \ldots, r$, $j = 1, \ldots, c$, and $\sum_{i=1}^{r}\sum_{j=1}^{c} \pi_{ij} = 1$. For any $i = 1, \ldots, r$ and $j = 1, \ldots, c$, consider the residual of independence of cell $(i,j)$, $R_{ij} = n_{ij} - \frac{n_{i+} n_{+j}}{n_{++}}$, where $n_{i+} = \sum_{j=1}^{c} n_{ij}$ and $n_{+j} = \sum_{i=1}^{r} n_{ij}$. We have, for $i = 1, \ldots, r$ and $j = 1, \ldots, c$,

$$\begin{aligned}
\mu = E(R_{ij}) &= (n_{++}-1)(\pi_{ij} - \pi_{i+}\pi_{+j}), \\
\sigma^2 = \mathrm{Var}(R_{ij}) &= \frac{n_{++}-1}{n_{++}} \Big\{ (6-4n_{++})(\pi_{ij} - \pi_{i+}\pi_{+j})^2 - (n_{++}-2)(\pi_{ij} - \pi_{i+}\pi_{+j})(\pi_{i+} + \pi_{+j} - 2\pi_{ij}) \\
&\qquad + (n_{++}-1)\pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij}) \Big\}, \quad \text{and} \\
E(R_{ij}^2) &= \frac{n_{++}-1}{n_{++}} \Big\{ (n_{++}-2)(n_{++}-3)(\pi_{ij} - \pi_{i+}\pi_{+j})^2 - (n_{++}-2)(\pi_{ij} - \pi_{i+}\pi_{+j})(\pi_{i+} + \pi_{+j} - 2\pi_{ij}) \\
&\qquad + (n_{++}-1)\pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij}) \Big\},
\end{aligned}$$

where $\pi_{i+} = \sum_{j=1}^{c} \pi_{ij}$ and $\pi_{+j} = \sum_{i=1}^{r} \pi_{ij}$.

When independence holds, i.e., $\pi_{ij} = \pi_{i+}\pi_{+j}$,

$$\mathrm{Var}(R_{ij}) = (n_{++}-1)\,\pi_{i+}\pi_{+j}(1-\pi_{i+})(1-\pi_{+j}).$$
Proof. 
For any $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, c$, note that

$$\begin{aligned}
E(n_{i+} n_{+j}) &= E\Big( \sum_{s \neq i}^{r} n_{sj} \sum_{t \neq j}^{c} n_{it} + \sum_{s \neq i}^{r} n_{ij} n_{sj} + \sum_{t \neq j}^{c} n_{ij} n_{it} + n_{ij}^2 \Big) \\
&= \sum_{s \neq i}^{r} \sum_{t \neq j}^{c} E(n_{sj} n_{it}) + \sum_{s \neq i}^{r} E(n_{ij} n_{sj}) + \sum_{t \neq j}^{c} E(n_{ij} n_{it}) + E(n_{ij}^2) \\
&= \sum_{s \neq i}^{r} \sum_{t \neq j}^{c} n_{++}(n_{++}-1)\pi_{sj}\pi_{it} + \sum_{s \neq i}^{r} n_{++}(n_{++}-1)\pi_{ij}\pi_{sj} + \sum_{t \neq j}^{c} n_{++}(n_{++}-1)\pi_{ij}\pi_{it} \\
&\qquad + n_{++}(n_{++}-1)\pi_{ij}^2 + n_{++}\pi_{ij} \\
&= n_{++}(n_{++}-1) \sum_{s=1}^{r} \pi_{sj} \sum_{t=1}^{c} \pi_{it} + n_{++}\pi_{ij} \\
&= n_{++}(n_{++}-1)\pi_{i+}\pi_{+j} + n_{++}\pi_{ij}.
\end{aligned}$$

Recall that $E(n_{ij}) = n_{++}\pi_{ij}$, so we have

$$E(R_{ij}) = n_{++}\pi_{ij} - \frac{n_{++}(n_{++}-1)\pi_{i+}\pi_{+j} + n_{++}\pi_{ij}}{n_{++}} = (n_{++}-1)(\pi_{ij} - \pi_{i+}\pi_{+j}).$$
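To calculate the variance of $R_{ij}$, let $n_{i\cdot} = n_{i+} - n_{ij}$, $n_{\cdot j} = n_{+j} - n_{ij}$, $\pi_{i\cdot} = \pi_{i+} - \pi_{ij}$, and $\pi_{\cdot j} = \pi_{+j} - \pi_{ij}$. Then,

$$R_{ij} = \frac{1}{n_{++}} \left[ n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) - n_{i\cdot} n_{\cdot j} \right].$$

For any $i = 1, \ldots, r$, $j = 1, \ldots, c$ and $i \neq j$,

$$\begin{aligned}
\mathrm{Var}(R_{ij}) &= \frac{1}{n_{++}^2} \Big\{ \mathrm{Var}[n_{ij}(n_{++} - n_{i\cdot} - n_{\cdot j} - n_{ij})] - 2\,\mathrm{Cov}\big(n_{ij}(n_{++} - n_{i\cdot} - n_{\cdot j} - n_{ij}),\, n_{i\cdot} n_{\cdot j}\big) + \mathrm{Var}(n_{i\cdot} n_{\cdot j}) \Big\} \\
&= \frac{1}{n_{++}^2} \Big\{ n_{++}(n_{++}-1)(6-4n_{++})\pi_{ij}^2(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij})^2 \\
&\qquad + n_{++}(n_{++}-1)(n_{++}-2)\big[\pi_{ij}^2(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij}) + \pi_{ij}(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij})^2\big] \\
&\qquad + n_{++}(n_{++}-1)\pi_{ij}(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij}) \\
&\qquad - 2n_{++}(n_{++}-1)(6-4n_{++})\pi_{ij}(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij})\pi_{i\cdot}\pi_{\cdot j} \\
&\qquad + n_{++}(n_{++}-1)(6-4n_{++})\pi_{i\cdot}^2\pi_{\cdot j}^2 + n_{++}(n_{++}-1)(n_{++}-2)(\pi_{i\cdot}^2\pi_{\cdot j} + \pi_{i\cdot}\pi_{\cdot j}^2) + n_{++}(n_{++}-1)\pi_{i\cdot}\pi_{\cdot j} \Big\} \\
&= \frac{n_{++}-1}{n_{++}} \Big\{ (6-4n_{++})\big[\pi_{ij}(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij}) - \pi_{i\cdot}\pi_{\cdot j}\big]^2 \\
&\qquad + (n_{++}-2)\big[\pi_{ij}^2(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij}) + \pi_{ij}(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij})^2 + \pi_{i\cdot}^2\pi_{\cdot j} + \pi_{i\cdot}\pi_{\cdot j}^2\big] \\
&\qquad + \pi_{ij}(1 - \pi_{i\cdot} - \pi_{\cdot j} - \pi_{ij}) + \pi_{i\cdot}\pi_{\cdot j} \Big\} \\
&= \frac{n_{++}-1}{n_{++}} \Big\{ (6-4n_{++})(\pi_{ij} - \pi_{i+}\pi_{+j})^2 \\
&\qquad + (n_{++}-2)\big[\pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij})(1 - \pi_{i+} - \pi_{+j} + 2\pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij})(\pi_{i+} + \pi_{+j} - 2\pi_{ij})\big] \\
&\qquad + \pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij}) \Big\} \\
&= \frac{n_{++}-1}{n_{++}} \Big\{ (6-4n_{++})(\pi_{ij} - \pi_{i+}\pi_{+j})^2 \\
&\qquad + (n_{++}-2)\big[\pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) - (\pi_{ij} - \pi_{i+}\pi_{+j})(\pi_{i+} + \pi_{+j} - 2\pi_{ij})\big] \\
&\qquad + \pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij}) \Big\} \\
&= \frac{n_{++}-1}{n_{++}} \Big\{ (6-4n_{++})(\pi_{ij} - \pi_{i+}\pi_{+j})^2 - (n_{++}-2)(\pi_{ij} - \pi_{i+}\pi_{+j})(\pi_{i+} + \pi_{+j} - 2\pi_{ij}) \\
&\qquad + (n_{++}-1)\pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij}) \Big\}.
\end{aligned}$$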
The second moment can be obtained by noting that $E(R_{ij}^2) = \mathrm{Var}(R_{ij}) + [E(R_{ij})]^2$.
When independence holds, i.e., $\pi_{ij} = \pi_{i+}\pi_{+j}$, it is straightforward to see that $E(R_{ij}) = 0$ and $\mathrm{Var}(R_{ij}) = (n_{++}-1)\,\pi_{i+}\pi_{+j}(1-\pi_{i+})(1-\pi_{+j})$.
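The closed forms for the mean and variance can be compared with a brute-force computation over the full support of a small table. The following sketch does this for cell $(1,1)$ of a $2 \times 2$ table; the function names, the sample size $n_{++} = 6$, and the cell probabilities are assumptions made only for this check.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def pmf(counts, probs):
    """Exact multinomial probability of a count vector."""
    coef = factorial(sum(counts))
    for t in counts:
        coef //= factorial(t)
    out = Fraction(coef)
    for t, p in zip(counts, probs):
        out *= p ** t
    return out

def exact_mean_var(n, pi):
    """Mean and variance of R_11 by enumerating all 2 x 2 tables with total n."""
    flat = [pi[0][0], pi[0][1], pi[1][0], pi[1][1]]
    m1 = m2 = Fraction(0)
    for n11, n12, n21 in product(range(n + 1), repeat=3):
        if n11 + n12 + n21 > n:
            continue
        n22 = n - n11 - n12 - n21
        r = Fraction(n11) - Fraction((n11 + n12) * (n11 + n21), n)
        w = pmf((n11, n12, n21, n22), flat)
        m1 += r * w
        m2 += r * r * w
    return m1, m2 - m1 * m1

def theorem1_mean_var(n, pi):
    """Closed-form mean and variance of R_11 as stated in Theorem 1."""
    p = pi[0][0]
    prow = pi[0][0] + pi[0][1]          # pi_{1+}
    pcol = pi[0][0] + pi[1][0]          # pi_{+1}
    delta = p - prow * pcol
    mean = (n - 1) * delta
    var = Fraction(n - 1, n) * (
        (6 - 4 * n) * delta ** 2
        - (n - 2) * delta * (prow + pcol - 2 * p)
        + (n - 1) * p * (1 - prow - pcol + p)
        + (prow - p) * (pcol - p)
    )
    return mean, var

n = 6
dependent = [[Fraction(1, 10), Fraction(2, 10)], [Fraction(3, 10), Fraction(4, 10)]]
rows, cols = [Fraction(1, 3), Fraction(2, 3)], [Fraction(1, 4), Fraction(3, 4)]
independent = [[a * b for b in cols] for a in rows]

print(exact_mean_var(n, dependent) == theorem1_mean_var(n, dependent))
print(exact_mean_var(n, independent))
```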
Based on the exact variance of the residuals of independence above, the standardized residual of independence of cell $(i,j)$ is

$$\frac{n_{ij} - n_{i+}n_{+j}/n_{++}}{\sqrt{(n_{++}-1)(1 - n_{i+}/n_{++})(1 - n_{+j}/n_{++})\, n_{i+}n_{+j}/n_{++}^2}}.$$

This exact standardized residual is asymptotically equivalent to

$$\frac{n_{ij} - n_{i+}n_{+j}/n_{++}}{\sqrt{(1 - n_{i+}/n_{++})(1 - n_{+j}/n_{++})\, n_{i+}n_{+j}/n_{++}}},$$

which is used in many textbooks, e.g., [4].
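To make the comparison concrete, the sketch below evaluates both standardized residuals for the counts in Table 1. The function name `standardized_residuals` is hypothetical. Since the two denominators differ only by the factor $(n_{++}-1)/n_{++}$ under the square root, the two versions differ by the constant factor $\sqrt{n_{++}/(n_{++}-1)}$ in every cell, which is negligible for a sample this large but not for small tables.

```python
import math

def standardized_residuals(table):
    """Return (exact, asymptotic) standardized residuals for every cell."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    exact, asym = [], []
    for row, ri in zip(table, row_tot):
        exact_row, asym_row = [], []
        for nij, cj in zip(row, col_tot):
            resid = nij - ri * cj / n
            base = (1 - ri / n) * (1 - cj / n) * ri * cj
            exact_row.append(resid / math.sqrt((n - 1) * base / n ** 2))
            asym_row.append(resid / math.sqrt(base / n))
        exact.append(exact_row)
        asym.append(asym_row)
    return exact, asym

# Observed counts from Table 1.
observed = [
    [422, 433, 429, 414],
    [1493, 1655, 1556, 1605],
    [1239, 1276, 1243, 1179],
    [61, 110, 73, 74],
]
exact, asym = standardized_residuals(observed)
for e_row, a_row in zip(exact, asym):
    print(["%.3f / %.3f" % pair for pair in zip(e_row, a_row)])
```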
In order to derive the third and fourth moments of R i j , we need higher-order mixed moments. However, the derivation of higher-order mixed moments from higher-order factorial moments in Lemma 1 is too tedious. Using the differential relationships between the moment-generating function of a distribution and its moments as well as the computer algebra system Wolfram|Alpha, we obtain the following Lemma 3.
Lemma 3.
Assume that $(T_1, T_2, \ldots, T_{k-1}, T_k) \sim \mathrm{Multinomial}(n; \pi_1, \pi_2, \ldots, \pi_{k-1}, \pi_k)$. For any nonnegative integer $m$, let $n^{(m)} = n(n-1)\cdots(n-m+1)$. Then,

$$\begin{aligned}
E(T_i^3 T_j^3) &= n^{(6)}\pi_i^3\pi_j^3 + 3n^{(5)}\pi_i^3\pi_j^2 + 3n^{(5)}\pi_i^2\pi_j^3 + n^{(4)}\pi_i^3\pi_j + n^{(4)}\pi_i\pi_j^3 + 9n^{(4)}\pi_i^2\pi_j^2 \\
&\quad + 3n^{(3)}\pi_i^2\pi_j + 3n^{(3)}\pi_i\pi_j^2 + n^{(2)}\pi_i\pi_j, \qquad i \neq j,\ i, j = 1, 2, \ldots, k; \\
E(T_i^2 T_j^2 T_l T_h) &= n^{(6)}\pi_i^2\pi_j^2\pi_l\pi_h + n^{(5)}\pi_i^2\pi_j\pi_l\pi_h + n^{(5)}\pi_i\pi_j^2\pi_l\pi_h + n^{(4)}\pi_i\pi_j\pi_l\pi_h, \\
&\qquad i \neq j \neq l \neq h,\ i, j, l, h = 1, 2, \ldots, k; \\
E(T_i^4 T_j^4) &= n^{(8)}\pi_i^4\pi_j^4 + 6n^{(7)}\pi_i^4\pi_j^3 + 6n^{(7)}\pi_i^3\pi_j^4 + 7n^{(6)}\pi_i^4\pi_j^2 + 7n^{(6)}\pi_i^2\pi_j^4 + 36n^{(6)}\pi_i^3\pi_j^3 \\
&\quad + 42n^{(5)}\pi_i^3\pi_j^2 + 42n^{(5)}\pi_i^2\pi_j^3 + n^{(5)}\pi_i^4\pi_j + n^{(5)}\pi_i\pi_j^4 + 6n^{(4)}\pi_i^3\pi_j + 6n^{(4)}\pi_i\pi_j^3 \\
&\quad + 49n^{(4)}\pi_i^2\pi_j^2 + 7n^{(3)}\pi_i^2\pi_j + 7n^{(3)}\pi_i\pi_j^2 + n^{(2)}\pi_i\pi_j, \qquad i \neq j,\ i, j = 1, 2, \ldots, k; \\
E(T_i^3 T_j^3 T_l T_h) &= n^{(8)}\pi_i^3\pi_j^3\pi_l\pi_h + 3n^{(7)}\pi_i^3\pi_j^2\pi_l\pi_h + 3n^{(7)}\pi_i^2\pi_j^3\pi_l\pi_h + n^{(6)}\pi_i^3\pi_j\pi_l\pi_h + n^{(6)}\pi_i\pi_j^3\pi_l\pi_h \\
&\quad + 9n^{(6)}\pi_i^2\pi_j^2\pi_l\pi_h + 3n^{(5)}\pi_i^2\pi_j\pi_l\pi_h + 3n^{(5)}\pi_i\pi_j^2\pi_l\pi_h + n^{(4)}\pi_i\pi_j\pi_l\pi_h, \\
&\qquad i \neq j \neq l \neq h,\ i, j, l, h = 1, 2, \ldots, k; \\
E(T_i^2 T_j^2 T_l^2 T_h^2) &= n^{(8)}\pi_i^2\pi_j^2\pi_l^2\pi_h^2 + n^{(7)}\pi_i^2\pi_j^2\pi_l^2\pi_h + n^{(7)}\pi_i^2\pi_j^2\pi_l\pi_h^2 + n^{(7)}\pi_i^2\pi_j\pi_l^2\pi_h^2 + n^{(7)}\pi_i\pi_j^2\pi_l^2\pi_h^2 \\
&\quad + n^{(6)}\pi_i^2\pi_j^2\pi_l\pi_h + n^{(6)}\pi_i^2\pi_j\pi_l^2\pi_h + n^{(6)}\pi_i^2\pi_j\pi_l\pi_h^2 + n^{(6)}\pi_i\pi_j^2\pi_l^2\pi_h + n^{(6)}\pi_i\pi_j^2\pi_l\pi_h^2 + n^{(6)}\pi_i\pi_j\pi_l^2\pi_h^2 \\
&\quad + n^{(5)}\pi_i^2\pi_j\pi_l\pi_h + n^{(5)}\pi_i\pi_j^2\pi_l\pi_h + n^{(5)}\pi_i\pi_j\pi_l^2\pi_h + n^{(5)}\pi_i\pi_j\pi_l\pi_h^2 + n^{(4)}\pi_i\pi_j\pi_l\pi_h, \\
&\qquad i \neq j \neq l \neq h,\ i, j, l, h = 1, 2, \ldots, k.
\end{aligned}$$
Proof. 
Since $(T_1, T_2, \ldots, T_{k-1}, T_k) \sim \mathrm{Multinomial}(n; \pi_1, \pi_2, \ldots, \pi_{k-1}, \pi_k)$, its moment-generating function is

$$M(t_1, \ldots, t_k) = E\left(e^{t_1 T_1 + \cdots + t_k T_k}\right) = \left(\pi_1 e^{t_1} + \cdots + \pi_k e^{t_k}\right)^{n}.$$

The results are obtained by noting that

$$E(T_i^r T_j^s T_l^u T_h^v) = \left. \frac{\partial^{\,r+s+u+v} M(t_1, \ldots, t_k)}{\partial t_i^r\, \partial t_j^s\, \partial t_l^u\, \partial t_h^v} \right|_{t_1 = 0, \ldots, t_k = 0}$$

for $i \neq j \neq l \neq h$, $i, j, l, h = 1, 2, \ldots, k$, and nonnegative integers $r$, $s$, $u$, and $v$.
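The same mixed moments can also be obtained without symbolic differentiation. The sketch below is an alternative route, not the moment-generating-function computation described in the proof: it expands each power as $T^m = \sum_q S(m,q)\,T^{(q)}$ with Stirling numbers of the second kind $S(m,q)$ and then applies the factorial-moment formula of Lemma 1. The function names are hypothetical, and the numerical values $n = 7$, $\pi_i = 1/5$, $\pi_j = 3/10$ are arbitrary.

```python
from fractions import Fraction

def stirling2(m, q):
    """Stirling number of the second kind S(m, q) via the standard recurrence."""
    if m == q:
        return 1
    if q == 0 or q > m:
        return 0
    return q * stirling2(m - 1, q) + stirling2(m - 1, q - 1)

def falling(n, m):
    """Falling factorial n^(m) = n(n-1)...(n-m+1)."""
    out = 1
    for s in range(m):
        out *= n - s
    return out

def mixed_moment(n, probs, powers):
    """E[prod_i T_i^powers[i]] for distinct multinomial cells with the given probs."""
    terms = {0: Fraction(1)}        # total falling-factorial order -> coefficient
    for p, m in zip(probs, powers):
        new_terms = {}
        for order, coef in terms.items():
            for q in range(m + 1):
                s = stirling2(m, q)
                if s:
                    key = order + q
                    new_terms[key] = new_terms.get(key, 0) + coef * s * p ** q
        terms = new_terms
    return sum(coef * falling(n, order) for order, coef in terms.items())

n, pi, pj = 7, Fraction(1, 5), Fraction(3, 10)
f = lambda m: falling(n, m)

# E(T_i^3 T_j^3) as stated in Lemma 3.
lemma3 = (f(6) * pi**3 * pj**3 + 3 * f(5) * (pi**3 * pj**2 + pi**2 * pj**3)
          + f(4) * (pi**3 * pj + pi * pj**3) + 9 * f(4) * pi**2 * pj**2
          + 3 * f(3) * (pi**2 * pj + pi * pj**2) + f(2) * pi * pj)
print(mixed_moment(n, [pi, pj], [3, 3]) == lemma3)
```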
Theorem 2 next provides the explicit expressions of the third and fourth moments of the residuals of independence.
Theorem 2.
Assume that $(n_{11}, \ldots, n_{rc}) \sim \mathrm{Multinomial}(n_{++}; \pi_{11}, \ldots, \pi_{rc})$, where $n_{++} = \sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}$ is a constant, $\pi_{ij} \ge 0$ for $i = 1, \ldots, r$, $j = 1, \ldots, c$, and $\sum_{i=1}^{r}\sum_{j=1}^{c} \pi_{ij} = 1$. For any $i = 1, \ldots, r$ and $j = 1, \ldots, c$, consider the residual of independence of cell $(i,j)$, $R_{ij} = n_{ij} - \frac{n_{i+} n_{+j}}{n_{++}}$, where $n_{i+} = \sum_{j=1}^{c} n_{ij}$ and $n_{+j} = \sum_{i=1}^{r} n_{ij}$. We have, for $i = 1, \ldots, r$ and $j = 1, \ldots, c$, and $i \neq j$,

$$\begin{aligned}
n_{++}^3 E(R_{ij}^3) &= n_{++}^{(6)}(\pi_{ij}-\pi_{i+}\pi_{+j})^3 - 3n_{++}^{(5)}(\pi_{ij}-\pi_{i+}\pi_{+j})^2(\pi_{i+}+\pi_{+j}-2\pi_{ij}) \\
&\quad + 3n_{++}^{(5)}\pi_{ij}(\pi_{ij}-\pi_{i+}\pi_{+j})(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) \\
&\quad + n_{++}^{(4)}\big[\,\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - (\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 9\pi_{ij}(\pi_{ij}-\pi_{i+}\pi_{+j})(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 9(\pi_{ij}-\pi_{i+}\pi_{+j})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})\big] \\
&\quad + 3n_{++}^{(3)}\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(1-\pi_{i+}-\pi_{+j}+2\pi_{ij}) - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})(\pi_{i+}+\pi_{+j}-2\pi_{ij})\big] \\
&\quad + n_{++}^{(2)}(\pi_{ij}-\pi_{i+}\pi_{+j}).
\end{aligned}$$

$$\begin{aligned}
n_{++}^4 E(R_{ij}^4) &= n_{++}^{(8)}(\pi_{ij}-\pi_{i+}\pi_{+j})^4 \\
&\quad + n_{++}^{(7)}\big[\,6\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 + 6\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 \\
&\qquad - 12\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 12\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 12\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 12\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 6(\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij})^3 + 6(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^4\big] \\
&\quad + n_{++}^{(6)}\big[\,7\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 7\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 + 36\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 4\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 30\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 30\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 7(\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij})^2 + 7(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^4 + 36(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(5)}\big[\,\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 \\
&\qquad + 42\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 42\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + (\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij}) + (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^4 \\
&\qquad + 42(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^2 + 42(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(4)}\big[\,6\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 49\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 2\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) + 49(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 + 6(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(3)}\big[\,7\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 7\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 7(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) + 7(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2\big] \\
&\quad + n_{++}^{(2)}\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})\big],
\end{aligned}$$

where $n_{++}^{(m)} = n_{++}(n_{++}-1)\cdots(n_{++}-m+1)$ for any nonnegative integer $m$.
Proof. 
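Let $n_{i\cdot} = n_{i+} - n_{ij}$, $n_{\cdot j} = n_{+j} - n_{ij}$, $\pi_{i\cdot} = \pi_{i+} - \pi_{ij}$, and $\pi_{\cdot j} = \pi_{+j} - \pi_{ij}$. Then,

$$\begin{aligned}
R_{ij} &= \frac{1}{n_{++}} \left[ n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) - n_{i\cdot} n_{\cdot j} \right], \\
E(R_{ij}^3) &= \frac{1}{n_{++}^3} E\left[ n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) - n_{i\cdot} n_{\cdot j} \right]^3 \\
&= \frac{1}{n_{++}^3} \Big\{ E[n_{ij}^3(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^3] - 3E[n_{ij}^2(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^2 n_{i\cdot} n_{\cdot j}] \\
&\qquad + 3E[n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) n_{i\cdot}^2 n_{\cdot j}^2] - E(n_{i\cdot}^3 n_{\cdot j}^3) \Big\}.
\end{aligned}$$

Since, by Lemma 2, $(n_{ij}, n_{i\cdot}, n_{\cdot j}, n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) \sim \mathrm{Multinomial}(n_{++}; \pi_{ij}, \pi_{i\cdot}, \pi_{\cdot j}, 1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})$, we have, from Lemma 3,

$$\begin{aligned}
E[n_{ij}^3(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^3] &= n_{++}^{(6)}\pi_{ij}^3(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^3 + 3n_{++}^{(5)}\pi_{ij}^3(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^2 + 3n_{++}^{(5)}\pi_{ij}^2(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^3 \\
&\quad + n_{++}^{(4)}\pi_{ij}^3(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j}) + n_{++}^{(4)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^3 + 9n_{++}^{(4)}\pi_{ij}^2(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^2 \\
&\quad + 3n_{++}^{(3)}\pi_{ij}^2(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j}) + 3n_{++}^{(3)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^2 + n_{++}^{(2)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j}), \\
E[n_{ij}^2(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^2 n_{i\cdot} n_{\cdot j}] &= n_{++}^{(6)}\pi_{ij}^2(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^2\pi_{i\cdot}\pi_{\cdot j} + n_{++}^{(5)}\pi_{ij}^2(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})\pi_{i\cdot}\pi_{\cdot j} \\
&\quad + n_{++}^{(5)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})^2\pi_{i\cdot}\pi_{\cdot j} + n_{++}^{(4)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})\pi_{i\cdot}\pi_{\cdot j}, \\
E[n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) n_{i\cdot}^2 n_{\cdot j}^2] &= n_{++}^{(6)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})\pi_{i\cdot}^2\pi_{\cdot j}^2 + n_{++}^{(5)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})\pi_{i\cdot}^2\pi_{\cdot j} \\
&\quad + n_{++}^{(5)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})\pi_{i\cdot}\pi_{\cdot j}^2 + n_{++}^{(4)}\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j})\pi_{i\cdot}\pi_{\cdot j}, \\
E[n_{i\cdot}^3 n_{\cdot j}^3] &= n_{++}^{(6)}\pi_{i\cdot}^3\pi_{\cdot j}^3 + 3n_{++}^{(5)}\pi_{i\cdot}^3\pi_{\cdot j}^2 + 3n_{++}^{(5)}\pi_{i\cdot}^2\pi_{\cdot j}^3 + n_{++}^{(4)}\pi_{i\cdot}^3\pi_{\cdot j} + n_{++}^{(4)}\pi_{i\cdot}\pi_{\cdot j}^3 \\
&\quad + 9n_{++}^{(4)}\pi_{i\cdot}^2\pi_{\cdot j}^2 + 3n_{++}^{(3)}\pi_{i\cdot}^2\pi_{\cdot j} + 3n_{++}^{(3)}\pi_{i\cdot}\pi_{\cdot j}^2 + n_{++}^{(2)}\pi_{i\cdot}\pi_{\cdot j}.
\end{aligned}$$

The result is obtained by noting that $\pi_{i\cdot} = \pi_{i+} - \pi_{ij}$, $\pi_{\cdot j} = \pi_{+j} - \pi_{ij}$,

$$\begin{aligned}
&\pi_{ij}(1 - \pi_{ij} - \pi_{i\cdot} - \pi_{\cdot j}) - \pi_{i\cdot}\pi_{\cdot j} = \pi_{ij} - \pi_{i+}\pi_{+j}, \\
&\pi_{ij}^2(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) + \pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij})^2 + (\pi_{i+} - \pi_{ij})^2(\pi_{+j} - \pi_{ij}) + (\pi_{i+} - \pi_{ij})(\pi_{+j} - \pi_{ij})^2 \\
&\qquad = \pi_{ij}(1 - \pi_{i+} - \pi_{+j} + \pi_{ij}) - (\pi_{ij} - \pi_{i+}\pi_{+j})(\pi_{i+} + \pi_{+j} - 2\pi_{ij}).
\end{aligned}$$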
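The fourth moment is obtained from Lemma 3 and the following:

$$\begin{aligned}
E(R_{ij}^4) &= \frac{1}{n_{++}^4} E\left[ n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) - n_{i\cdot} n_{\cdot j} \right]^4 \\
&= \frac{1}{n_{++}^4} \Big\{ E[n_{ij}^4(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^4] - 4E[n_{ij}^3(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^3 n_{i\cdot} n_{\cdot j}] \\
&\qquad + 6E[n_{ij}^2(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j})^2 n_{i\cdot}^2 n_{\cdot j}^2] - 4E[n_{ij}(n_{++} - n_{ij} - n_{i\cdot} - n_{\cdot j}) n_{i\cdot}^3 n_{\cdot j}^3] + E(n_{i\cdot}^4 n_{\cdot j}^4) \Big\}.
\end{aligned}$$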
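The binomial expansions used in this proof can be checked numerically. The sketch below works with cell $(1,1)$ of a $2 \times 2$ table, where $n_{++}R_{11} = n_{11}n_{22} - n_{12}n_{21}$, computes $n_{++}^m E(R_{11}^m)$ for $m = 1, \ldots, 4$ from the expansion with mixed moments, and compares the result with brute-force enumeration. The mixed moments are computed via Stirling numbers of the second kind, which is a substitute for the moment-generating-function route of Lemma 3; all function names and numerical values are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def stirling2(m, q):
    """Stirling number of the second kind S(m, q)."""
    if m == q:
        return 1
    if q == 0 or q > m:
        return 0
    return q * stirling2(m - 1, q) + stirling2(m - 1, q - 1)

def falling(n, m):
    """Falling factorial n^(m)."""
    out = 1
    for s in range(m):
        out *= n - s
    return out

def mixed_moment(n, probs, powers):
    """E[prod_i T_i^powers[i]] for distinct multinomial cells."""
    terms = {0: Fraction(1)}
    for p, m in zip(probs, powers):
        new_terms = {}
        for order, coef in terms.items():
            for q in range(m + 1):
                s = stirling2(m, q)
                if s:
                    new_terms[order + q] = new_terms.get(order + q, 0) + coef * s * p ** q
        terms = new_terms
    return sum(coef * falling(n, order) for order, coef in terms.items())

def scaled_raw_moment(n, a, b, c, d, m):
    """n^m E(R^m) from the expansion of [n11*n22 - n12*n21]^m.

    a, b, c, d are the probabilities of cells (1,1), (2,2), (1,2), (2,1).
    """
    return sum((-1) ** k * comb(m, k)
               * mixed_moment(n, [a, b, c, d], [m - k, m - k, k, k])
               for k in range(m + 1))

def brute_force(n, a, b, c, d, m):
    """The same quantity by direct summation over all 2 x 2 tables."""
    total = Fraction(0)
    for x, y, z in product(range(n + 1), repeat=3):   # x=n11, y=n12, z=n21
        if x + y + z > n:
            continue
        w = n - x - y - z                             # w = n22
        coef = factorial(n) // (factorial(x) * factorial(y) * factorial(z) * factorial(w))
        total += coef * a ** x * c ** y * d ** z * b ** w * (x * w - y * z) ** m
    return total

n = 6
a, c, d, b = Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)
for m in (1, 2, 3, 4):
    print(m, scaled_raw_moment(n, a, b, c, d, m) == brute_force(n, a, b, c, d, m))
```

For a general $r \times c$ table the same check applies after collapsing the table into the four groups used in the proof, which is what Lemma 2 permits.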
The exact third and fourth central moments can be derived straightforwardly by noting that $E(R_{ij} - \mu)^3 = E(R_{ij}^3) - 3\mu E(R_{ij}^2) + 2\mu^3$ and $E(R_{ij} - \mu)^4 = E(R_{ij}^4) - 4\mu E(R_{ij}^3) + 6\mu^2 E(R_{ij}^2) - 3\mu^4$. Note that the first four cumulants of a distribution are its mean, variance, third central moment, and fourth central moment minus three times the squared variance. We can therefore also obtain the exact first four cumulants of the distribution of the residuals of independence. Corollary 1 gives explicit expressions for the third and fourth central moments as well as the fourth cumulant.
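These conversions from raw moments to central moments and cumulants are generic and easy to code. The sketch below is a minimal helper (the name `central_and_cumulants` is hypothetical), illustrated on a Bernoulli variable with success probability $1/4$, whose raw moments all equal $1/4$.

```python
from fractions import Fraction

def central_and_cumulants(m1, m2, m3, m4):
    """Convert the first four raw moments into central moments and cumulants."""
    mu = m1
    var = m2 - mu ** 2
    c3 = m3 - 3 * mu * m2 + 2 * mu ** 3
    c4 = m4 - 4 * mu * m3 + 6 * mu ** 2 * m2 - 3 * mu ** 4
    return {
        "mean": mu,
        "variance": var,
        "third_central": c3,
        "fourth_central": c4,
        "fourth_cumulant": c4 - 3 * var ** 2,
    }

p = Fraction(1, 4)
bernoulli = central_and_cumulants(p, p, p, p)
print(bernoulli)
```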
Corollary 1.
Under the conditions of Theorem 2, the third central moment of $R_{ij}$ is

$$\begin{aligned}
n_{++}^3 E(R_{ij} - \mu)^3 &= n_{++}^3 E(R_{ij}^3) - 3(n_{++}\mu)\,[n_{++}^2 E(R_{ij}^2)] + 2n_{++}^3\mu^3 \\
&= 8n_{++}(n_{++}-1)(5n_{++}^2 - 17n_{++} + 15)(\pi_{ij}-\pi_{i+}\pi_{+j})^3 \\
&\quad + 18n_{++}(n_{++}-1)(n_{++}-2)^2(\pi_{ij}-\pi_{i+}\pi_{+j})^2(\pi_{i+}+\pi_{+j}-2\pi_{ij}) \\
&\quad - 6n_{++}(n_{++}-1)^2(2n_{++}-3)\,\pi_{ij}(\pi_{ij}-\pi_{i+}\pi_{+j})(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) \\
&\quad + 6n_{++}(n_{++}-1)(n_{++}^2 - 7n_{++} + 9)(\pi_{ij}-\pi_{i+}\pi_{+j})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\quad + n_{++}(n_{++}-1)(n_{++}-2)(n_{++}-3)\big[\,\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - (\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + 3n_{++}(n_{++}-1)(n_{++}-2)\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(1-\pi_{i+}-\pi_{+j}+2\pi_{ij}) \\
&\qquad - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})(\pi_{i+}+\pi_{+j}-2\pi_{ij})\big] \\
&\quad + n_{++}(n_{++}-1)(\pi_{ij}-\pi_{i+}\pi_{+j}).
\end{aligned}$$
The fourth central moment of $R_{ij}$ is

$$\begin{aligned}
n_{++}^4 E(R_{ij} - \mu)^4 &= n_{++}^4 E(R_{ij}^4) - 4(n_{++}\mu)\,[n_{++}^3 E(R_{ij}^3)] + 6n_{++}^2\mu^2\,[n_{++}^2 E(R_{ij}^2)] - 3n_{++}^4\mu^4 \\
&= 12n_{++}(n_{++}-1)(4n_{++}^4 - 72n_{++}^3 + 337n_{++}^2 - 629n_{++} + 420)(\pi_{ij}-\pi_{i+}\pi_{+j})^4 \\
&\quad + 6n_{++}^2(n_{++}-1)^2(n_{++}-2)(n_{++}^2 - 13n_{++} + 24)(\pi_{ij}-\pi_{i+}\pi_{+j})^3(\pi_{i+}+\pi_{+j}-2\pi_{ij}) \\
&\quad - 6n_{++}^2(n_{++}-1)^3(n_{++}^2 - 9n_{++} + 12)\,\pi_{ij}(\pi_{ij}-\pi_{i+}\pi_{+j})^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) \\
&\quad - 6n_{++}^2(n_{++}-1)^2(n_{++}-4)(5n_{++}-9)(\pi_{ij}-\pi_{i+}\pi_{+j})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\quad - 4n_{++}^2(n_{++}-1)^2(\pi_{ij}-\pi_{i+}\pi_{+j})^2 \\
&\quad - 4n_{++}^2(n_{++}-1)^2(n_{++}-2)(n_{++}-3)(\pi_{ij}-\pi_{i+}\pi_{+j})\big[\,\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - (\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad - 12n_{++}^2(n_{++}-1)^2(n_{++}-2)(\pi_{ij}-\pi_{i+}\pi_{+j})\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(1-\pi_{i+}-\pi_{+j}+2\pi_{ij}) \\
&\qquad - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})(\pi_{i+}+\pi_{+j}-2\pi_{ij})\big] \\
&\quad + n_{++}^{(7)}\big[\,6\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 + 6\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 \\
&\qquad - 12\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 12\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 12\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 12\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 6(\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij})^3 + 6(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^4\big] \\
&\quad + n_{++}^{(6)}\big[\,7\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 7\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 + 36\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 4\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 30\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 30\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 7(\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij})^2 + 7(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^4 + 36(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(5)}\big[\,\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 \\
&\qquad + 42\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 42\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + (\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij}) + (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^4 \\
&\qquad + 42(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^2 + 42(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(4)}\big[\,6\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 49\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 2\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) + 49(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 + 6(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(3)}\big[\,7\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 7\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 7(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) + 7(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2\big] \\
&\quad + n_{++}^{(2)}\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})\big].
\end{aligned}$$
The fourth cumulant of $R_{ij}$ is

$$\begin{aligned}
n_{++}^4 \kappa_4(R_{ij}) &= n_{++}^4 E(R_{ij} - \mu)^4 - 3\,[n_{++}^2 \mathrm{Var}(R_{ij})]^2 \\
&= -48n_{++}(n_{++}-1)(14n_{++}^3 - 79n_{++}^2 + 155n_{++} - 105)(\pi_{ij}-\pi_{i+}\pi_{+j})^4 \\
&\quad + 6n_{++}^2(n_{++}-1)^2(n_{++}-2)^2(n_{++}-15)(\pi_{ij}-\pi_{i+}\pi_{+j})^3(\pi_{i+}+\pi_{+j}-2\pi_{ij}) \\
&\quad - 3n_{++}^2(n_{++}-1)^2(n_{++}-2)^2(\pi_{ij}-\pi_{i+}\pi_{+j})^2(\pi_{i+}+\pi_{+j}-2\pi_{ij})^2 \\
&\quad - 6n_{++}^2(n_{++}-1)^3(n_{++}^2 - 13n_{++} + 18)\,\pi_{ij}(\pi_{ij}-\pi_{i+}\pi_{+j})^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) \\
&\quad - 6n_{++}^2(n_{++}-1)^2(5n_{++}^2 - 33n_{++} + 42)(\pi_{ij}-\pi_{i+}\pi_{+j})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\quad - 4n_{++}^2(n_{++}-1)^2(\pi_{ij}-\pi_{i+}\pi_{+j})^2 \\
&\quad - 4n_{++}^2(n_{++}-1)^2(n_{++}-2)(n_{++}-3)(\pi_{ij}-\pi_{i+}\pi_{+j})\big[\,\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - (\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad - 12n_{++}^2(n_{++}-1)^2(n_{++}-2)(\pi_{ij}-\pi_{i+}\pi_{+j})\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(1-\pi_{i+}-\pi_{+j}+2\pi_{ij}) \\
&\qquad - (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})(\pi_{i+}+\pi_{+j}-2\pi_{ij})\big] \\
&\quad + 6n_{++}^2(n_{++}-1)^3(n_{++}-2)\,\pi_{ij}(\pi_{ij}-\pi_{i+}\pi_{+j})(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}+\pi_{+j}-2\pi_{ij}) \\
&\quad + 6n_{++}^2(n_{++}-1)^2(n_{++}-2)(\pi_{ij}-\pi_{i+}\pi_{+j})(\pi_{i+}+\pi_{+j}-2\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\quad - 3n_{++}^2(n_{++}-1)^4\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 - 3n_{++}^2(n_{++}-1)^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\quad - 6n_{++}^2(n_{++}-1)^3\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\quad + n_{++}^{(7)}\big[\,6\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 + 6\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 \\
&\qquad - 12\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 12\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 12\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 12\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 6(\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij})^3 + 6(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^4\big] \\
&\quad + n_{++}^{(6)}\big[\,7\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 7\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 + 36\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 4\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 30\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 30\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) \\
&\qquad - 4\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3 \\
&\qquad + 7(\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij})^2 + 7(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^4 + 36(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(5)}\big[\,\pi_{ij}^4(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + \pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^4 \\
&\qquad + 42\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 42\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 6\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) \\
&\qquad - 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2 \\
&\qquad + (\pi_{i+}-\pi_{ij})^4(\pi_{+j}-\pi_{ij}) + (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^4 \\
&\qquad + 42(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij})^2 + 42(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(4)}\big[\,6\pi_{ij}^3(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 49\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 6\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^3 \\
&\qquad - 2\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij}) \\
&\qquad + 6(\pi_{i+}-\pi_{ij})^3(\pi_{+j}-\pi_{ij}) + 49(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij})^2 + 6(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^3\big] \\
&\quad + n_{++}^{(3)}\big[\,7\pi_{ij}^2(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + 7\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij})^2 + 7(\pi_{i+}-\pi_{ij})^2(\pi_{+j}-\pi_{ij}) + 7(\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})^2\big] \\
&\quad + n_{++}^{(2)}\big[\,\pi_{ij}(1-\pi_{i+}-\pi_{+j}+\pi_{ij}) + (\pi_{i+}-\pi_{ij})(\pi_{+j}-\pi_{ij})\big],
\end{aligned}$$

where $\mu = E(R_{ij}) = (n_{++}-1)(\pi_{ij} - \pi_{i+}\pi_{+j})$ and $n_{++}^{(m)} = n_{++}(n_{++}-1)\cdots(n_{++}-m+1)$ for any nonnegative integer $m$.

3. Conclusions

We have explicitly derived the first four moments of the residuals of independence in a two-way contingency table under a multinomial model. From these exact moments, we obtain the exact skewness, $E(R_{ij} - \mu)^3/\sigma^3$, and kurtosis, $E(R_{ij} - \mu)^4/\sigma^4$, of the distribution of the residuals of independence. These explicit but tedious results provide the vital statistical characteristics of the exact distribution of the residuals of independence in the association analysis of two-way contingency tables. Moreover, since the joint probability distribution of independent Poisson random variables, conditional on their sum, is a multinomial distribution, these exact results can also be used in the residual analysis of log-linear models. Higher-order raw moments of the residuals of independence can be found similarly, but the results are more complicated.
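The link to log-linear models rests on the standard fact that independent Poisson counts, given their total, are multinomial with probabilities proportional to the Poisson means. The sketch below checks this numerically for one count vector; the means `lams` and the counts are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability mass at x for mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def conditional_given_sum(counts, lams):
    """P(X = counts | sum X = n) for independent Poisson X_i with means lams."""
    n = sum(counts)
    joint = math.prod(poisson_pmf(x, lam) for x, lam in zip(counts, lams))
    return joint / poisson_pmf(n, sum(lams))   # the total is Poisson(sum of means)

def multinomial_pmf(counts, probs):
    """Multinomial probability of a count vector."""
    coef = math.factorial(sum(counts))
    for x in counts:
        coef //= math.factorial(x)
    return coef * math.prod(p ** x for p, x in zip(probs, counts))

lams = [1.5, 0.5, 2.0, 1.0]          # hypothetical Poisson cell means
counts = (2, 0, 3, 1)
probs = [lam / sum(lams) for lam in lams]
print(conditional_given_sum(counts, lams), multinomial_pmf(counts, probs))
```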
Currently, most residual diagnostics of discrete data depend on large-sample methods. When sample sizes are not large or data are sparse, diagnostic results based on large-sample theory are debatable, and exact methods or methods based on non-asymptotic theory are desirable. The explicit moments of the residuals of independence contribute to exact residual diagnostics significantly. More discussions of and references to the exact analysis of discrete data are given in [5].

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The author appreciates the insightful comments and suggestions from the editors and referees that substantially improved the presentation of the article.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Smithers, L.G.; Lynch, J.W.; Yang, S.; Dahhou, M.; Kramer, M.S. Impact of Neonatal Growth on IQ and Behavior at Early School Age. Pediatrics 2013, 132, 53–60.
  2. Mosimann, J.E. On the Compound Multinomial Distribution, the Multivariate β-Distribution, and Correlations among Proportions. Biometrika 1962, 49, 61–82.
  3. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Discrete Multivariate Distributions; John Wiley & Sons Inc.: Hoboken, NJ, USA, 1997.
  4. Agresti, A. Categorical Data Analysis, 3rd ed.; John Wiley & Sons Inc.: Hoboken, NJ, USA, 2013.
  5. Hirji, K.F. Exact Analysis of Discrete Data; Chapman & Hall/CRC: Boca Raton, FL, USA, 2006.
Table 1. Paternal education and neonatal weight gain.

| | $Q_1$ | $Q_2$ | $Q_3$ | $Q_4$ | Total |
|---|---|---|---|---|---|
| Complete university | 422 (411.63) | 433 (444.79) | 429 (422.64) | 414 (418.93) | 1698 |
| Partially complete university | 1493 (1529.44) | 1655 (1652.65) | 1556 (1570.35) | 1605 (1556.56) | 6309 |
| Complete secondary education | 1239 (1196.84) | 1276 (1293.25) | 1243 (1228.85) | 1179 (1218.06) | 4937 |
| Incomplete secondary education | 61 (77.09) | 110 (83.30) | 73 (79.15) | 74 (78.46) | 318 |
| Total | 3215 | 3474 | 3301 | 3272 | 13,262 |

Qu, Xianggui. 2024. "Exact Moments of Residuals of Independence" Mathematics 12, no. 24: 3987. https://doi.org/10.3390/math12243987
