Article

Parity-Based Statistics and Combinatorial Identities

Department of Mathematics and Statistics, University of Maryland at Baltimore County, Baltimore, MD 21250, USA
Mathematics 2026, 14(13), 2407; https://doi.org/10.3390/math14132407 (registering DOI)
Submission received: 26 April 2026 / Revised: 11 June 2026 / Accepted: 21 June 2026 / Published: 5 July 2026
(This article belongs to the Section D1: Probability and Statistics)

Abstract

Notable discrete probability laws appear as posterior distributions in the estimation of the common mean with heterogeneous variances. These probabilities, which are defined by an arbitrary set of distinct real numbers, also arise in seemingly unrelated areas of polynomial approximation and statistical physics. The corresponding orthogonal polynomials possess an interesting self-duality property, which invites the study of statistical distributions based on data parity. These distributions provide novel, intriguing formulas for the classical hypergeometric function along with several combinatorial identities.

1. Introduction: Common Mean and Unknown Heterogeneous Uncertainties

This paper arose from the statistical analysis of a sequence, $x_1, \ldots, x_n$, in the setting of heterogeneous research synthesis, with $x_j$ representing the estimate of the common mean (say, treatment effect), as reported by the $j$-th study. No conditions are imposed on the unknown accuracies, which cannot be assumed equal. The main statistical challenge is the estimation of the common mean, treated as a shift parameter, when the standard deviations are considered to be unknown nuisance scale parameters.
In some applications, the uncertainty appraisals are either missing or utterly unreliable. The difficulty in accurately valuing the variances of systematic errors, whether due to specific laboratory conditions or hospital protocols, is well acknowledged by data scientists.
The issue of underreported uncertainties, particularly those that stem from asymptotic normal theory, which presupposes large data sets, is prevalent in metrology. Furthermore, the challenge of reproducibility within individual centers may be exacerbated by the nature of the employed measuring instruments.
Another source of artificially small uncertainties may be due to the removal of outliers for purely mathematical reasons. By eliminating “unrepresentative” or “spurious” data points, one typically is left with a part of the sample that is unrealistically accurate. See [1] for further motivation.
Unlike classical statistical models, the scenario suggested here does not require accompanying estimates of uncertainty of $x_j$. Our investigation focuses on the special “self-dual” weights that define the discrete posterior distribution for the unknown mean, set against a non-informative objective prior for both the mean and independent variances.
This line of inquiry, initiated in [2] under the assumption of normality, grapples with the lack of variance information, causing several statistical complications. For instance, the classical maximum likelihood estimator cannot be determined uniquely, as the likelihood function reaches infinity at each data point. Nevertheless, the problem is well-defined. Indeed, estimating the common mean requires determining at most $n$ parameters, the mean itself and, say, $\omega_i = \sigma_i^{-2} / \sum_j \sigma_j^{-2}$, which belong to the unit simplex of dimension $n-1$. Statistical practice needs to determine only $(n-1)$ weights to form the common mean estimator, say, $\sum_{i=1}^{n} \omega_i x_i$. In some applications, there is additional information that allows further dimension reduction.
This paper is motivated by mathematical statistics, but its goal is to explore the mathematical aspects of the issues arising in the statistical problem. Indeed, it is addressed to a general mathematical audience. The main contribution is the construction of a unified probabilistic framework for rank-induced parity distributions and the derivation of related moment and combinatorial formulas, which show the link to the Gauss hypergeometric function.
More specifically, Section 2 examines the polynomial approximation problem over the set $\{x_1, \ldots, x_n\}$, and the orthogonal polynomials that deviate least from zero on this set. The required formulas for the distributions of parity-based sums are established in Section 3. The deep connection between these distributions and the hypergeometric function is demonstrated in Section 4. We present a self-contained proof of formulas for the specific value of this classical function. These formulas are presumably known to specialists, but the author failed to find an easy reference. Our approach yields seemingly new combinatorial identities, (39), (48), (53)–(55), (57) in Section 4 and Section 5. Some useful expressions for partial derivatives are given in Section 6.

2. Self-Dual Probabilities and Orthogonal Polynomials Least Deviating from Zero

We initiate our discussion with a problem that, at first glance, appears unrelated to the main focus. Namely, assuming that all $x$’s are distinct, the best polynomial approximation of a function $f$ over a finite set $\{x_1, \ldots, x_n\}$ is sought.
The optimal uniform approximation by a polynomial of degree $n-1$ (or higher) is achieved through the classical Lagrange interpolation polynomial $L(x)$, which coincides with $f$ on this set, i.e., $L(x_j) = f_j = f(x_j)$ for $j = 1, \ldots, n$. If the polynomial’s degree is $n-2$, the best approximation is derived by subtracting from $L(x)$ a specific multiple of the oscillating polynomial that alternates its sign at each successive $x_j$, attaining the same absolute value at these points.
The probabilities,
$$w_i = \frac{1}{\prod_{j \neq i} |x_i - x_j|} \left[ \sum_k \prod_{j \neq k} \frac{1}{|x_k - x_j|} \right]^{-1} = \frac{s_i}{\prod_{j \neq i} (x_i - x_j)} \left[ \sum_k s_k \prod_{j \neq k} \frac{1}{(x_k - x_j)} \right]^{-1}, \tag{1}$$
are related to the Lagrange formula and to the parities of $x_i$, $i = 1, \ldots, n$,
$$s_i = \operatorname{sign} \prod_{j \neq i} (x_i - x_j).$$
It is known (see Theorem 1.15, [3]) that the approximation error coincides with the absolute value of the average, taken under (1), of products $s_i f_i$. Thus,
$$\Big| \sum_i s_i f_i w_i \Big| = \min_R \max_i | f_i - R(x_i) |,$$
where $R(x)$ runs through all polynomials of degree not exceeding $n-2$. For instance, the approximation error of the oscillating function $f_i = s_i$ is independent of $n$,
$$1 = \min_R \max_i | s_i - R(x_i) |.$$
The barycentric form of the Lagrange interpolation formula,
$$L(x) = \frac{\sum_i w_i s_i f_i / (x - x_i)}{\sum_i w_i s_i / (x - x_i)},$$
provides numerous advantages [4].
Probabilities (1) originate in random matrix theory, where they offer alternative descriptions of a physical ensemble in terms of particles or holes. Many optimization problems involving the discriminant function through electrostatic equilibrium are underpinned by them [5]. From the mathematical perspective, these probabilities are self-dual under the duality definition given in [6] and developed in [7,8,9].
In mathematical statistics, probabilities (1) define the discrete posterior distribution for the location parameter against a non-informative prior for this parameter, mean, and independent variances [10]. The generalized Bayes estimator of the mean under the quadratic loss is
$$\delta = \sum_i w_i x_i = \sum_i \frac{x_i}{\prod_{j \neq i} |x_i - x_j|} \left[ \sum_i \prod_{j \neq i} \frac{1}{|x_i - x_j|} \right]^{-1}.$$
This statistic serves as a semiparametric estimator of the symmetry center in a heterogeneous sample, meaning that it does not depend on the distribution of x’s from a broad class.
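The weights (1), the parities $s_i$, and the estimator $\delta$ can be computed directly from their definitions. The following is a minimal sketch in plain floating-point Python; the function names `self_dual_weights` and `common_mean_estimate` are illustrative choices, not part of the original text.

```python
from math import prod

def self_dual_weights(x):
    """Return (w, s, W) for distinct reals x.

    w : self-dual probabilities from formula (1)
    s : parities, s_i = sign of prod_{j != i} (x_i - x_j)
    W : normalizing constant, W = 1 / sum_i prod_{j != i} 1/|x_i - x_j|
    """
    inv = [1.0 / prod(abs(xi - xj) for j, xj in enumerate(x) if j != i)
           for i, xi in enumerate(x)]
    W = 1.0 / sum(inv)
    w = [W * v for v in inv]
    s = [1 if prod(xi - xj for j, xj in enumerate(x) if j != i) > 0 else -1
         for i, xi in enumerate(x)]
    return w, s, W

def common_mean_estimate(x):
    """Weighted mean delta = sum_i w_i x_i under the weights (1)."""
    w, _, _ = self_dual_weights(x)
    return sum(wi * xi for wi, xi in zip(w, x))

# Small illustration with three points.
w, s, W = self_dual_weights([0.0, 1.0, 3.0])
delta = common_mean_estimate([0.0, 1.0, 3.0])
```

For the points 0, 1, 3 the products of distances are 3, 2, and 6, so the weights are 1/3, 1/2, 1/6, and the estimate equals 1.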
Orthogonal polynomials against (1) exhibit striking symmetry. To see it, let $\delta_m = \sum_k w_k x_k^m$, $m = 0, 1, \ldots$, represent the moments of self-dual weights (1). The monic polynomials $T_m$, $m = -1, 0, 1, \ldots, n$, $T_{-1}(z) = 0$, $T_0(z) = 1$, $T_1(z) = z - \delta_1$, $\ldots$, $T_n(z) = \prod_k (z - x_k)$, are orthogonal in $L_2$,
$$\sum_i w_i T_m(x_i) T_p(x_i) = \delta_{mp} \sqrt{h_m h_p}.$$
With
$$W = \left[ \sum_i \prod_{j \neq i} |x_i - x_j|^{-1} \right]^{-1},$$
the sequence $h_m = \sum_i [T_m(x_i)]^2 w_i$, $h_0 = 1$, $h_1 = \delta_2 - \delta_1^2$, $\ldots$, $h_{n-1} = W^2$, enjoys the mentioned symmetry as $h_m h_{n-m-1} = W^2$.
The orthogonal polynomials $T_m$ are known to satisfy the three-term recurrence,
$$T_{m+1}(z) = (z - \alpha_m) T_m(z) - \beta_m T_{m-1}(z), \tag{5}$$
where the coefficients $\beta_m = h_m / h_{m-1}$, $\beta_m = \beta_{n-m}$, and $\alpha_m = \sum_i x_i [T_m(x_i)]^2 w_i / h_m$, also possess a symmetry property: $\alpha_m = \alpha_{n-1-m}$.
Mathematical induction applied to (5) shows that for $i, m = 1, \ldots, n$,
$$W\, T_m(x_i) = s_i h_m T_{n-m-1}(x_i). \tag{6}$$
Identity (6) can be obtained from the original duality definition [6], according to which $T_m(x_i)$ is a multiple of $T_{n-1}(x_i)\, T_{n-m-1}(x_i)$.
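The symmetry $h_m h_{n-m-1} = W^2$ and identity (6) can be checked on a concrete point set. The sketch below assumes the standard Stieltjes procedure for generating the recurrence coefficients (the text does not prescribe a particular algorithm) and uses exact rational arithmetic so that the comparisons are not affected by rounding; the name `orthogonal_data` and the sample points are illustrative.

```python
from fractions import Fraction
from math import prod

def orthogonal_data(x):
    """Values T_m(x_i) and norms h_m, m = 0..n-1, for the weights (1).

    Uses the three-term recurrence (5) with alpha_m, beta_m computed from
    discrete inner products (Stieltjes procedure, assumed here).
    """
    n = len(x)
    inv = [Fraction(1) / prod(abs(xi - xj) for j, xj in enumerate(x) if j != i)
           for i, xi in enumerate(x)]
    W = 1 / sum(inv)
    w = [W * v for v in inv]
    s = [1 if prod(xi - xj for j, xj in enumerate(x) if j != i) > 0 else -1
         for i, xi in enumerate(x)]
    T_prev = [Fraction(0)] * n   # T_{-1} on the nodes
    T_cur = [Fraction(1)] * n    # T_0 on the nodes
    T, h, h_prev = [], [], None
    for m in range(n):
        hm = sum(wi * t * t for wi, t in zip(w, T_cur))
        alpha = sum(wi * xi * t * t for wi, xi, t in zip(w, x, T_cur)) / hm
        beta = hm / h_prev if h_prev is not None else Fraction(0)
        T.append(T_cur)
        h.append(hm)
        T_next = [(xi - alpha) * t - beta * tp
                  for xi, t, tp in zip(x, T_cur, T_prev)]
        T_prev, T_cur, h_prev = T_cur, T_next, hm
    return T, h, s, W

x = [0, 1, 3, 7, 12]
T, h, s, W = orthogonal_data(x)
n = len(x)
norms_symmetric = all(h[m] * h[n - m - 1] == W ** 2 for m in range(n))
duality_holds = all(W * T[m][i] == s[i] * h[m] * T[n - m - 1][i]
                    for m in range(n) for i in range(n))
```

Both flags evaluate the stated relations for every admissible $m$ and $i$ on this particular set; they are a numerical illustration, not a proof.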
The polynomial,
$$T_{n-1}(z) = \sum_i w_i \prod_{j \neq i} (z - x_j),$$
deviates least from zero in $L_\infty$: $T_{n-1}(x_i) = s_i W$, i.e., for any monic polynomial $R$ of degree not exceeding $n-1$,
$$\max_i | T_{n-1}(x_i) | = W \le \max_i | R(x_i) |.$$
The comparison of the extremes of $T_{n-1}(z)$ and those of the classical monic, degree $n-1$, Chebyshev polynomial on the interval $[\min x_i, \max x_i]$ provides a sharp inequality,
$$W \le \frac{(\max x_i - \min x_i)^{n-1}}{2^{n-2}} = 2 \left[ \frac{\max x_i - \min x_i}{2} \right]^{n-1}, \tag{8}$$
which holds for all distinct $x_1, \ldots, x_n$. The factor $2^{2-n}$ in (8) is given incorrectly in [10].
Equality in (8) is attained if and only if $x$’s are extreme points of the mentioned Chebyshev polynomial on this interval, i.e., when for $j = 1, \ldots, n$,
$$x_j = \frac{\max x_i + \min x_i}{2} + \frac{(\max x_i - \min x_i)}{2} \cos \frac{(j-1)\pi}{n-1}.$$
Then the probabilities (1) have a remarkably simple form,
$$w_j = \frac{1}{n-1}, \quad j = 2, \ldots, n-1, \qquad w_1 = w_n = \frac{1}{2(n-1)}.$$
Since $w_1 < w_2$, $w_n < w_{n-1}$, this provides the closest resemblance of (1) to the uniform distribution.
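The simple form of the weights at the Chebyshev extreme points is easy to confirm numerically. The following sketch assumes an arbitrary interval $[a, b]$ and sample size $n$ chosen only for illustration, and recomputes the weights (1) in floating point.

```python
from math import cos, pi, prod

def weights(x):
    """Self-dual probabilities (1) for distinct reals x (floating point)."""
    inv = [1.0 / prod(abs(xi - xj) for j, xj in enumerate(x) if j != i)
           for i, xi in enumerate(x)]
    total = sum(inv)
    return [v / total for v in inv]

n = 7                      # illustrative sample size
a, b = -2.0, 5.0           # illustrative interval endpoints
nodes = [(a + b) / 2 + (b - a) / 2 * cos((j - 1) * pi / (n - 1))
         for j in range(1, n + 1)]
w = weights(nodes)
expected = [1 / (2 * (n - 1))] + [1 / (n - 1)] * (n - 2) + [1 / (2 * (n - 1))]
max_error = max(abs(wi - ei) for wi, ei in zip(w, expected))
```

Here `max_error` measures the largest deviation of the computed weights from $1/(n-1)$ in the interior and $1/[2(n-1)]$ at the two endpoints.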
In addition to $x_1, \ldots, x_n$, the polynomial $T_{n-1}^2(z) - W^2$ has $n-2$ real roots (which interlace those of $T_{n-1}$), so that, with $R_{n-1}$ denoting the monic polynomial of degree $n-2$ having these roots,
$$T_{n-1}^2(z) - W^2 = T_n(z) R_{n-1}(z). \tag{9}$$
Therefore, $R_{n-1}$ is the associated polynomial to $T_{n-1}$,
$$R_{n-1}(z) = \sum_i \frac{[T_{n-1}(z) - T_{n-1}(x_i)]\, w_i}{z - x_i}. \tag{10}$$
The orthogonal monic polynomial $R_m$ of degree $m-1$ associated with $T_m$, $m = 1, \ldots, n-1$, satisfies the same recurrence (5), but the initial conditions are different: $R_0 = 0$, $R_1 = 1$, so that $R_2(z) = z - \alpha_1$, $\ldots$, $R_n(z) = T_{n-1}(z) = (z - \delta_1) R_{n-1}(z) - \beta_1 R_{n-2}(z)$.
The coefficients $\alpha_{n/2}$, $\beta_{n/2}$ admit an explicit form: for even $n$,
$$\alpha_{(n-2)/2} = \alpha_{n/2} = \frac{\sum_i s_i x_i^2}{2 \sum_i s_i x_i}, \qquad \beta_{n/2} = \frac{\left( \sum_i s_i x_i \right)^2}{4}. \tag{11}$$
If $n$ is odd, then
$$T_{(n-1)/2}(z) = P_o(z) = \prod_{j : s_j = -1} (z - x_j), \tag{12}$$
$h_{(n-1)/2} = W$, and
$$\alpha_{(n-1)/2} = \sum_i s_i x_i, \qquad \beta_{(n-1)/2} = \beta_{(n+1)/2} = \frac{\sum_i s_i x_i^2 - \left( \sum_i s_i x_i \right)^2}{4}. \tag{13}$$
With $P_e(z) = \prod_{j : s_j = 1} (z - x_j)$, other central orthogonal polynomials are of the form
$$T_{n/2}(z) = \frac{P_o(z) + P_e(z)}{2}, \tag{14}$$
$$T_{(n+1)/2}(z) = \frac{\left( z - \sum_i s_i x_i \right) P_o(z) + P_e(z)}{2}, \tag{15}$$
and
$$T_{(n-2)/2}(z) = \frac{P_o(z) - P_e(z)}{\sum_i s_i x_i}. \tag{16}$$
One can represent $R_{n-1}(z) = R_e(z) R_o(z)$ as a product of two monic polynomials (with real roots) of degrees $n_e - 1$ and $n_o - 1$, respectively. Then, with $P_e$ and $P_o$ as above, $T_{n-1}(z) - W = P_e(z) R_o(z)$ and $T_{n-1}(z) + W = P_o(z) R_e(z)$. If $s_i = 1$, $R_e(x_i) = 2W / P_o(x_i)$; when $s_j = -1$, $R_o(x_j) = -2W / P_e(x_j)$. Thus,
$$R_e(z) = 2W \sum_{s_i = 1} \frac{\prod_{s_k = 1,\, k \neq i} (z - x_k)}{P_o(x_i) P_e'(x_i)} = 2 \sum_{s_i = 1} w_i \prod_{s_k = 1,\, k \neq i} (z - x_k),$$
and
$$R_o(z) = 2 \sum_{s_j = -1} w_j \prod_{s_\ell = -1,\, \ell \neq j} (z - x_\ell).$$
Central associated polynomials are $R_{(n-1)/2} = R_o$,
$$R_{n/2}(z) = \frac{R_o(z) + R_e(z)}{2}, \tag{19}$$
$$R_{(n+1)/2}(z) = \frac{\left( z - \sum_i s_i x_i \right) R_o(z) + R_e(z)}{2}, \tag{20}$$
and
$$R_{(n-2)/2}(z) = \frac{R_o(z) - R_e(z)}{\sum_i s_i x_i}. \tag{21}$$
We summarize the main results as a theorem whose detailed proof can be found in [10].
Theorem 1.
Polynomials $T_m$, which are orthogonal with regard to probabilities (1), satisfy (5) and (6), with their central versions in (12), (14)–(16). Their associated polynomials satisfy (9) and (10), with the central versions given in (19)–(21). Central coefficients are given in (11) and (13). Inequality (8) is valid.
Coefficients $\alpha_{(n-1)/2}$, $\beta_{n/2}$ are completely determined by $\sum_i s_i x_i$. Two other coefficients are given by quadratic forms in $s_i x_i$. We refer to these functions of observations as parity-based statistics and embark on their study.

3. Parity-Based Distributions

The main object of interest in this section is the parity-based sums of the form $\sum_i s_i x_i$.
The finite set $\{x_1, \ldots, x_n\}$ in Section 2 can be considered as the representative points of a univariate statistical distribution [11]. Thus, we assume that it is a realization of $n$ independent random variables with common continuous distribution function $F(x) = F_0(x)$, whose density $f(x)$ has all finite moments, $m_p = E x^p = \int x^p f(x)\, dx$, which determine $F$ uniquely.
Denote by $s_1, \ldots, s_n$ the parity sequence corresponding to $x_1, \ldots, x_n$. To start exploring the behavior of the parity-based sums, notice that the distribution of a random parity $s_i$ can be written as
$$P(s_i = \epsilon) = \frac{1}{2n} \sum_{r=1}^{n} \left[ 1 + \epsilon (-1)^{r+n} \right] = \frac{1}{2} \left\{ 1 + \frac{\epsilon [1 - (-1)^n]}{2n} \right\},$$
$\epsilon = \pm 1$, which does not depend on $F$.
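Because the parity of an observation depends only on its rank, the distribution of $s_i$ can be confirmed by listing all orderings of a small sample. The sketch below is an illustration under the assumption that all $n!$ orderings are equally likely (which holds for independent observations from a continuous distribution); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def parity(values, i):
    """s_i = sign of prod_{j != i} (values[i] - values[j])."""
    p = prod(values[i] - v for j, v in enumerate(values) if j != i)
    return 1 if p > 0 else -1

def parity_probability(n, eps):
    """Exact P(s_1 = eps), enumerating all equally likely orderings."""
    hits = total = 0
    for perm in permutations(range(1, n + 1)):
        total += 1
        hits += parity(perm, 0) == eps
    return Fraction(hits, total)

def closed_form(n, eps):
    """Right-hand side of the displayed formula for P(s_i = eps)."""
    return Fraction(1, 2) * (1 + Fraction(eps * (1 - (-1) ** n), 2 * n))

agreement = all(parity_probability(n, eps) == closed_form(n, eps)
                for n in range(2, 7) for eps in (1, -1))
```

For $n = 3$, for instance, the first observation has parity $+1$ when its rank is 1 or 3, so the probability is 2/3, in agreement with the closed form.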
The joint density of $x_i$ and its parity $s_i$ is
$$f(x_i, s_i = \epsilon) = \frac{f(x_i)}{2} \left\{ 1 + \epsilon [2F(x_i) - 1]^{n-1} \right\}. \tag{23}$$
Therefore, for $i = 1, \ldots, n$,
$$P(x_i \le x, s_i = \epsilon) = \frac{F(x)}{2} + \frac{\epsilon \left\{ [2F(x) - 1]^n - (-1)^n \right\}}{4n}.$$
If $f$ is symmetric, the distribution function of $y_i = s_i x_i$ is
$$G(y) = P(x_i < y, s_i = 1) + P(x_i > -y, s_i = -1) = F(y) + \frac{[1 + (-1)^n] \left\{ [2F(y) - 1]^n - 1 \right\}}{4n}.$$
These formulas can be derived from the distribution of order statistics whose rank has the same parity as the largest observation. For example, the conditional density of $x_i$ and $x_j$ for given $s_i = \epsilon_1$, $s_j = \epsilon_2$, has the form
$$f(x_i, x_j \mid s_i = \epsilon_1, s_j = \epsilon_2) = \frac{f(x_i) f(x_j)}{4 P(s_i = \epsilon_1, s_j = \epsilon_2)} \Big\{ 1 + \epsilon_1 (-1)^{n + r_1} [1 - 2F(x_i)]^{n-2} + \epsilon_2 (-1)^{n + r_2} [1 - 2F(x_j)]^{n-2} + \epsilon_1 \epsilon_2 (-1)^{r_1 + r_2} [1 - 2 |F(x_i) - F(x_j)|]^{n-2} \Big\}, \tag{25}$$
where $r_1$ and $r_2$, $1 \le r_1 \neq r_2 \le 2$, are relative ranks for $x_i$ and $x_j$, say, $r_1 = 1$ if $x_i < x_j$.
The next result provides the joint density of $m$-sub-vectors $x_{n_1}, \ldots, x_{n_m}$ and $s_{n_1}, \ldots, s_{n_m}$, the relevant conditional distributions, as well as the form of the moments of parity sums. Here, $m$ is a fixed integer, $1 \le m \le n$. Thus, $M = \{1, \ldots, m\} \subseteq N = \{1, \ldots, n\}$.
Theorem 2.
The exchangeable distribution of $x_{n_1}, \ldots, x_{n_m}$ and $s_{n_1}, \ldots, s_{n_m}$ is provided by (28). The conditional density of $x_{n_j}$, $j \in M$, for a given value of the product $s_{n_1} \cdots s_{n_m}$ satisfies (33). The moments of the parity sum, $\sum_{i=1}^{n} s_i x_i$, can be found from (37).
Proof. 
Let $z_1 < \cdots < z_m$ be the order statistics corresponding to $x_{n_j}$, $j \in M$. The joint distribution of $z_1, \ldots, z_m$ and the parities can be represented as a mixture of the conditional densities for given ranks. A particular density enters this mixture if and only if $s_{n_j} = (-1)^{n + R_j}$, where $R_j$, $1 \le R_j \le n$, is the rank of $x_{n_j}$ among the total sample.
Let $r_i$ denote the rank of $x_{n_i}$ within our subsample. Then $x_{n_j} = z_{r_j}$, and $s_j^M = (-1)^{m + r_j}$ is the parity of $x_{n_j}$ in this subsample.
Since the probability of any rank combination is $\binom{n}{m}^{-1}$, the classical formula for the distribution of several order statistics [12] implies that the joint density of $z_1, \ldots, z_m$ and the corresponding parities is
$$f(z_1, \ldots, z_m; s_{n_1}, \ldots, s_{n_m}) = m! \prod_{1}^{m} f(z_j) \times \sum_{1 \le p_1 < \cdots < p_m \le n} \binom{n-m}{p_1 - 1,\, p_2 - p_1 - 1,\, \ldots,\, n - p_m} \prod_{j=1}^{m} \frac{[1 + s_{n_j} (-1)^{n + p_j}]}{2} \times \Delta_1^{p_1 - 1} \Delta_2^{p_2 - p_1 - 1} \cdots \Delta_m^{p_m - p_{m-1} - 1} \Delta_{m+1}^{n - p_m}. \tag{26}$$
Here, under the convention that $F(z_{m+1}) = 1$, $F(z_0) = 0$, $\Delta_j = F(z_j) - F(z_{j-1})$ are the familiar $m+1$ spacings, $\sum_{j=1}^{m+1} \Delta_j = 1$, which are known to have a Dirichlet distribution $\mathrm{Dir}_{m+1}$ with positive concentration parameters $r_1, r_2 - r_1, \ldots, r_m - r_{m-1}, n + 1 - r_m$ [13]. Therefore,
z 1 < < z m j = 1 m + 1 [ F ( z j ) F ( z j 1 ) ] r j r j 1 1 j = 1 m f ( z j ) d z j
= Γ ( r 1 ) Γ ( r 2 r 1 ) Γ ( r m r m 1 ) Γ ( n + 1 r m ) m ! Γ ( n + 1 ) .
Integration of (26) over $z_1, \ldots, z_m$ gives our first combinatorial identity,
$$2^m \binom{n}{m} P(s_{n_j} = \epsilon_j,\ j = 1, \ldots, m) = \sum_{1 \le p_1 < \cdots < p_m \le n}\ \prod_{j=1}^{m} \left[ 1 + \epsilon_j (-1)^{n + p_j} \right]. \tag{27}$$
By replacing the summation variables in (26) with $p_1 - 1, p_2 - p_1 - 1, \ldots, n - p_m$ and using the multinomial theorem, we arrive at the form of the joint distribution of $x_{n_1}, \ldots, x_{n_m}$ and $s_{n_1}, \ldots, s_{n_m}$,
$$f(x_{n_1}, \ldots, x_{n_m}; s_{n_1}, \ldots, s_{n_m}) = \frac{\prod_j f(x_{n_j})}{2^m} \sum_{k=0}^{m} (-1)^{nk} \sum_{K \subseteq M,\, |K| = k}\ \prod_{i \in K} (-1)^{r_i} \Big[ 1 - 2 \sum_{i \in K} s_i^K F(x_{n_i}) \Big]^{n-m} \prod_{i \in K} s_{n_i}$$
$$= \frac{\prod_j f(x_{n_j})}{2^m} \sum_{k=0}^{m} (-1)^{(n-m)k} \sum_{K \subseteq M,\, |K| = k}\ \prod_{i \in K} s_{n_i} s_i^M \Big[ 1 - 2 \sum_{i \in K} s_i^K F(x_{n_i}) \Big]^{n-m}. \tag{28}$$
Here, for $i \in K$, $s_i^K = \operatorname{sign} \prod_{j \neq i,\, j \in K} (x_{n_i} - x_{n_j})$ is the parity of $x_{n_i}$ in the subsample $x_{n_j}$, $j \in K$, where $k$ is the cardinality of $K$. Thus, this joint density is a linear function of parity products, $\prod_{i \in K} s_{n_i}$, $K \subseteq M$.
The joint Dirichlet distribution of spacings implies that $\sum_{j \in K} s_j^K F(x_j)$ is beta-distributed with parameters $(q, m + 1 - q)$, $q = q_k$, where for any subset $K = \{j_1 < \cdots < j_k\}$ of $M$,
$$q_k = j_k - j_{k-1} + \cdots + (-1)^{k-1} j_1 = \sum_{i=1}^{k} (-1)^{i + k} j_i, \qquad q_k \ge 0 = q_0. \tag{29}$$
Clearly, $(-1)^{q_k} = (-1)^{\sum_{i \in K} i} = (-1)^{\sum_{j \in K} r_j}$. If $k$ is even, $k/2 \le q_k \le m - k/2$; for odd $k$, $(k+1)/2 \le q_k \le m - (k-1)/2$, so that
$$q_m = \sum_{j=1}^{m} (-1)^{j + m} j = \lceil m/2 \rceil = m/2 + [1 - (-1)^m]/4, \tag{30}$$
and $(-1)^{q_m} = (-1)^{m(m+1)/2}$.
Thus, density (28) is related to the classical hypergeometric function, ${}_2F_1(m - n, q; m + 1; z) = F(m - n, q; m + 1; z)$ (actually a polynomial in $z$ of degree $n - m$). Our argument shows the special role of the specific value $z = 2$, which appears as the factor at $\sum s_j^K F(x_j)$ in (28).
Indeed, by using the fundamental integral representation of the hypergeometric function [14], and setting $q = q_k$, one obtains the following expression:
$$E \Big[ 1 - 2 \sum_{j \in K} s_j^K F(x_j) \Big]^{n-m} = \frac{\int_0^1 (1 - 2u)^{n-m} u^{q-1} (1 - u)^{m-q}\, du}{B(q, m + 1 - q)} = F(m - n, q; m + 1; 2). \tag{31}$$
Observe that for $\epsilon_j = \pm 1$, $j \in M$, the $2^m$ functions $\prod_K \epsilon_i$, $K \subseteq M$, are linearly independent. Indeed, they are orthogonal under the natural inner product,
$$\sum_{\epsilon_j = \pm 1,\, j \in M}\ \prod_K \epsilon_i \prod_L \epsilon_i = 2^m \delta_{L, K},$$
where $K$ and $L$ are fixed subsets of $M$. Also,
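$$\sum_{\epsilon_j = \pm 1,\, j \in M,\ \prod_K \epsilon_i = \epsilon}\ \prod_L \epsilon_\ell = 2^{m-1} \left[ \delta_{\emptyset, L} + \epsilon\, \delta_{L, K} \right].$$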
Simplifying the notation from $x_{n_j}$ to $x_j$, we see that the joint density of $x_j$, $j \in M$, and of the product $\prod_K s_j$, $K \subseteq M$, has the form
$$f\Big(x_j, j \in M;\ \prod_{i \in K} s_i = \epsilon\Big) = \frac{\prod_M f(x_j)}{2} \Big\{ 1 + \epsilon (-1)^{nk + \sum_K r_j} \Big[ 1 - 2 \sum_{j \in K} s_j^K F(x_j) \Big]^{n-m} \Big\}.$$
When $K = M$, one obtains
$$E\Big( \prod_M s_j \,\Big|\, x_j, j \in M \Big) = (-1)^{nm + m(m+1)/2} \Big[ 1 - 2 \sum_M s_j^M F(x_j) \Big]^{n-m},$$
so that by (31) with $q_m$ given in (30),
$$E \prod_M s_j = (-1)^{nm + m(m+1)/2} F(m - n, q_m; m + 1; 2).$$
To derive formulas for the moments of the parity-based sum, $\sum_{i=1}^{n} s_i x_i$, we need an extension of (32) for positive integers $\nu_j$, $j = 1, \ldots, m$. For this purpose, the form of the joint density of $x_j$, $j \in M$, and the product of the corresponding parities, $\prod_1^m s_j^{\nu_j}$, is desired.
This density can be obtained from (32), since $\prod_1^m s_j^{\nu_j} = \epsilon$ if and only if, with $D = \{j : j = 1, \ldots, m,\ \nu_j \text{ odd}\}$, one has $\prod_{j \in D} s_j = \epsilon$. Thus, with $d$ denoting the cardinality of $D$ and $r_j$ still denoting the rank of $x_j$ within the subsample,
$$E\Big( \prod_1^m s_j^{\nu_j} \,\Big|\, x_j, j \in M \Big) = (-1)^{nd + \sum_D r_j} \Big[ 1 - 2 \sum_{j \in D} s_j^D F(x_j) \Big]^{n-m},$$
and
$$E \prod_{j=1}^{m} (s_j x_j)^{\nu_j} = (-1)^{nd} E\, (-1)^{\sum_{j \in D} r_j} \Big[ 1 - 2 \sum_{j \in D} s_j^D F(x_j) \Big]^{n-m} \prod_{j=1}^{m} x_j^{\nu_j}$$
$$= (-1)^{nd + d(d+1)/2} E \prod_{j \in D} x_j^{\nu_j} \Big[ 1 - 2 \sum_{j \in D} s_j^D F(x_j) \Big]^{n-m} \prod_{i \notin D} m_{\nu_i} \Big[ 1 - 2 \sum_{j \in D} s_j^D F_{\nu_i}(x_j) \Big], \tag{35}$$
where $F_\nu(x) = \int_{-\infty}^{x} u^\nu f(u)\, du / m_\nu$, $\nu = 0, 2, 4, \ldots$, and $s_i^D = \operatorname{sign} \prod_{j \neq i,\, j \in D} (x_i - x_j)$.
To prove (35), we evaluate the following conditional expectation:
E ( i : ν i even ( 1 ) r i x i ν i | x j , j D ) = ( 1 ) m ( m d ) [ i : i , D sign ( x i x ) ]
× i D m ν i E ( i : ν i e v e n k : ν k o d d sign [ k i ( x i x k ) ] x i ν i | x j , j D )
= ( 1 ) m ( m d ) + ( m d ) ( m d 1 ) / 2 i D m ν i [ 1 2 D s j D F ν i ( x j ) ] .
Identity (35) is valid when some $\nu_i$, $i \notin D$, vanish. Thus, for any non-negative integers $\nu_j$, $j = 1, \ldots, n$,
$$E \prod_{j=1}^{n} (s_j x_j)^{\nu_j} = (-1)^{nd + d(d+1)/2} E \prod_{j \in D} x_j^{\nu_j} \prod_{i \notin D} m_{\nu_i} \Big[ 1 - 2 \sum_{j \in D} s_j^D F_{\nu_i}(x_j) \Big], \tag{36}$$
$D = \{j : j = 1, \ldots, n,\ \nu_j \text{ odd}\}$.
A more general formula involves functions $\phi_j$, $j = 1, \ldots, n$,
$$E \prod_{j=1}^{n} s_j^{\nu_j} \phi_j(x_j) = (-1)^{nd + d(d+1)/2} E \prod_{j \in D} \phi_j(x_j) \times \prod_{i \notin D} \Big[ \int \phi_i(u) f(u)\, du - 2 \sum_{j \in D} s_j^D \int_{-\infty}^{x_j} \phi_i(u) f(u)\, du \Big].$$
According to (36),
$$E \Big( \sum_{1}^{n} s_j x_j \Big)^p = \sum_{\sum \nu_j = p} \binom{p}{\nu_1, \ldots, \nu_n} (-1)^{nd + d(d+1)/2} \times E \prod_{j : \nu_j \text{ odd}} x_j^{\nu_j} \prod_{i : \nu_i \text{ even}} m_{\nu_i} \Big[ 1 - 2 \sum_D s_j^D F_{\nu_i}(x_j) \Big], \tag{37}$$
where $d$ is as above and $p - d$ is even; (37) presents the correct version of Formula (34) in [10]. □
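The expectation of the full parity product obtained in the proof, $E \prod_M s_j = (-1)^{nm + m(m+1)/2} F(m - n, q_m; m + 1; 2)$, lends itself to an exact check. The sketch below assumes only that $s_j = (-1)^{n + R_j}$ and that the set of ranks of the $m$ chosen observations is uniformly distributed over all $m$-subsets of $\{1, \ldots, n\}$; the helper `hyp2f1_at_2` is a hypothetical name for the terminating hypergeometric series at $z = 2$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def hyp2f1_at_2(N, b, c):
    """Terminating series F(-N, b; c; 2) for integers N >= 0, b, c > 0."""
    total, term = Fraction(0), Fraction(1)
    for j in range(N + 1):
        total += term
        # ratio of consecutive terms: (a + j)(b + j) z / ((c + j)(j + 1)), a = -N, z = 2
        term *= Fraction(2 * (j - N) * (b + j), (c + j) * (j + 1))
    return total

def mean_parity_product(n, m):
    """E prod_{j in M} s_j by enumerating equally likely rank sets."""
    total = sum((-1) ** (n * m + sum(ranks))
                for ranks in combinations(range(1, n + 1), m))
    return Fraction(total, comb(n, m))

def hypergeometric_form(n, m):
    """(-1)^{nm + m(m+1)/2} F(m - n, q_m; m + 1; 2) with q_m = ceil(m / 2)."""
    q_m = (m + 1) // 2
    sign = (-1) ** (n * m + m * (m + 1) // 2)
    return sign * hyp2f1_at_2(n - m, q_m, m + 1)

agreement = all(mean_parity_product(n, m) == hypergeometric_form(n, m)
                for n in range(1, 9) for m in range(1, n + 1))
```

For example, with $n = 3$ and $m = 2$ the three rank pairs give parity products $-1, +1, -1$ up to the common factor $(-1)^{nm} = 1$, so the expectation is $-1/3$, which matches $-F(-1, 1; 3; 2) = -1/3$.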
By using (28), one can find the joint (symmetric) density of y n 1 , , y n m . For example, when f is assumed symmetric, the distribution of y 1 = s 1 x 1 and y 2 = s 2 x 2 is exchangeable, so that it suffices to determine its density when | y 1 | < | y 2 | . Then ϵ 1 y 1 ϵ 2 y 2 = | y 2 | , if ϵ 2 = sign ( y 2 ) ; = ϵ 1 y 1 , otherwise. Similarly, ϵ 1 y 1 ϵ 2 y 2 = | y 2 | , when ϵ 2 = sign ( y 2 ) ; = ϵ 1 y 1 , otherwise. Thus,
ϵ 1 y 1 < ϵ 2 y 2 ϵ 1 [ 1 2 F ( ϵ 1 y 1 ) ] n 2 + ϵ 2 y 2 < ϵ 1 y 1 ϵ 2 [ 1 2 F ( ϵ 2 y 2 ) ] n 2
= [ 1 ( 1 ) n ] [ 1 2 F ( y 1 ) ] n 2 2 sign ( y 2 ) [ 1 2 F ( | y 2 | ) ] n 2 ,
and
ϵ 1 y 1 < ϵ 2 y 2 ϵ 2 [ 1 2 F ( ϵ 2 y 2 ) ] n 2 + ϵ 2 y 2 < ϵ 1 y 1 ϵ 1 [ 1 2 F ( ϵ 1 y 1 ) ] n 2
= [ 1 ( 1 ) n ] [ 1 2 F ( y 1 ) ] n 2 + 2 sign ( y 2 ) [ 1 2 F ( | y 2 | ) ] n 2 .
Since
ϵ 1 , ϵ 2 ϵ 1 ϵ 2 [ 1 2 | F ( ϵ 1 y 1 ) F ( ϵ 2 y 2 ) | ] n 2
= 2 sign ( y 2 ) [ 1 2 | F ( y 1 ) F ( | y 2 | ) | ] n 2 [ 1 2 | F ( y 1 ) + F ( | y 2 | ) 1 | ] n 2
= 2 [ 1 2 | F ( y 1 ) F ( y 2 ) | ] n 2 [ 1 2 | F ( y 1 ) + F ( y 2 ) 1 | ] n 2 ,
it follows that the joint density of y 1 , y 2 when | y 1 | < | y 2 | is
g ( y 1 , y 2 ) = ϵ 1 , ϵ 2 = ± 1 f ( ϵ 1 y 1 , ϵ 2 y 2 , s i = ϵ 1 , s j = ϵ 2 )
= f ( y 1 ) f ( y 2 ) { 1 + sign ( y 2 ) [ 1 + ( 1 ) n ] 2 [ 1 2 F ( | y 2 | ) ] n 2
1 2 [ 1 2 | F ( y 1 ) F ( y 2 ) | ] n 2 [ 1 2 | F ( y 1 ) + F ( y 2 ) 1 | ] n 2 .
In the symmetric case, $E \left( \sum_{i=1}^{n} s_i x_i \right)^p = 0$ for odd $n$ and $p$. Indeed, $\sum_{i=1}^{n} s_i x_i$ up to the multiple $(-1)^n$ coincides with the parity sum derived from the sample $(-x_1, \ldots, -x_n)$, which is equidistributed with $x$’s.
If $p = 1$, the first moment can be derived from (23); the form of the second moment follows from (25),
$$E \Big( \sum_{i=1}^{n} s_i x_i \Big)^2 = n m_2 - n(n-1)\, E\, x_1 x_2 \left[ 1 - 2 |F(x_2) - F(x_1)| \right]^{n-2}.$$
When $p = 3 \le n$,
E ( i = 1 n s i x i ) 3 = n E s 1 x 1 3 + 3 n ( n 1 ) E s 1 x 1 x 2 2 + n ( n 1 ) ( n 2 ) E 1 3 s i x i
= n x 3 f ( x ) [ 2 F ( x ) 1 ] n 1 d x
3 n ( n 1 ) x f ( x ) [ 2 F ( x ) 1 ] n 2 m 2 2 x y 2 f ( y ) d y d x
+ 6 n ( n 1 ) ( n 2 ) x < y < z x y z f ( x ) f ( y ) f ( z )
× [ 1 2 F ( z ) + 2 F ( y ) 2 F ( x ) ] n 3 d x d y d z .
When $p = 4 \le n$,
E ( i = 1 n s i x i ) 4 = n m 4 + 3 n ( n 1 ) m 2 2 E x 1 x 2 3 [ 1 2 | F ( x 2 ) F ( x 1 ) | ] n 2
6 n ( n 1 ) ( n 2 ) m 2 E x 1 x 2 x 3 2 [ 1 2 | F 2 ( x 2 ) F 2 ( x 1 ) | ] [ 1 2 | F ( x 2 ) F ( x 1 ) | ] n 3
+ E 1 4 x i [ 1 2 1 4 s j F ( x j ) ] n 4 .

4. Parities and Hypergeometric Function

For fixed $m$, $1 \le m \le n$, the joint distribution of parities $s_{n_1}, \ldots, s_{n_m}$ is obtained in Theorem 2, whose notation we follow. According to (27) and (28), if $1 \le m \le n$,
$$P(s_{n_j} = \epsilon_j,\ j = 1, \ldots, m) = \frac{\sum_{1 \le p_1 < \cdots < p_m \le n} \prod_{j=1}^{m} \left[ 1 + \epsilon_j (-1)^{n + p_j} \right]}{2^m \binom{n}{m}}$$
= 1 2 m k = 0 m ( 1 ) n k + k ( k + 1 ) / 2 F ( k n , q k ; k + 1 ; 2 ) k ! E k ( ffl ) .
Here, for $\boldsymbol{\epsilon} = (\epsilon_1, \ldots, \epsilon_m)$, $E_k(\boldsymbol{\epsilon}) = \sum_{K \subseteq M,\, |K| = k} \prod_K \epsilon_i$ is the $k$-th elementary symmetric function whose values depend only on $d$, the number of $\epsilon$’s equal to $-1$. Indeed
j = 1 m ( z s j ) = ( z + 1 ) d ( z 1 ) m d = k = 0 m ( 1 ) m k E k ( ffl ) z m k ,
so that
E k ( ffl ) = ( 1 ) k d = 0 k ( 1 ) m d d k .
Now we give explicit formulas for $F(m - n, q_k; k + 1; 2)$ involving double factorials. See [15] for a survey of related combinatorial identities, and [16] for further instances of closed-form expressions for this function at specific arguments.
Theorem 3.
If $1 \le k \le m \le n$, $n \ge 2$, are positive integers, the following identities hold for the hypergeometric function:
$$F(m - n, (k+1)/2; k + 1; 2) = \frac{(n - m - 1)!!\ k!!}{(n - m + k)!!}, \quad n - m \text{ even},\ k \text{ odd}, \tag{40}$$
$$F(m - n, (k+1)/2; k + 1; 2) = 0, \quad n - m \text{ odd},\ k \text{ odd}, \tag{41}$$
$$F(m - n, k/2; k + 1; 2) = F(m - n, (k+2)/2; k + 1; 2) = \frac{(n - m - 1)!!\ (k - 1)!!}{(n - m + k - 1)!!}, \quad n - m \text{ even},\ k \text{ even}, \tag{42}$$
$$F(m - n, k/2; k + 1; 2) = -F(m - n, (k+2)/2; k + 1; 2) = \frac{(n - m)!!\ (k - 1)!!}{(n - m + k)!!}, \quad n - m \text{ odd},\ k \text{ even}. \tag{43}$$
For any positive integers $q$, $1 \le q \le m$, and $p$,
$$F(-p, q; m + 1; 2) = (-1)^{p} F(-p, m + 1 - q; m + 1; 2),$$
and (46) is valid.
Proof. 
To prove Formulas (40)–(43), we use well-known facts about the hypergeometric function. According to 15.8.13 in [14],
$$F\big(m - n, (k+1)/2; k + 1; z\big) = \frac{(2 - z)^{n-m}}{2^{n-m}}\, F\left( \frac{m - n}{2}, \frac{m - n + 1}{2}; \frac{k + 2}{2}; \frac{z^2}{(2 - z)^2} \right).$$
If $n - m$ is even, this identity means that
$$F\big(m - n, (k+1)/2; k + 1; z\big) = \frac{1}{2^{n-m}} \sum_{0 \le j \le (n-m)/2} \frac{[(m - n)/2]^{\bar{j}}\ [(m - n + 1)/2]^{\bar{j}}}{[(k + 2)/2]^{\bar{j}}\ j!}\ z^{2j} (2 - z)^{n - m - 2j}. \tag{45}$$
Here, for any real $a$ and non-negative integer $j$, $[a]^{\bar{j}} = a(a+1) \cdots (a + j - 1) = \Gamma(a + j) / \Gamma(a)$ is the ascending factorial.
The only term of the finite series in the right-hand side of (45) without a positive power of $(2 - z)$ corresponds to $j = j_0 = (n - m)/2$. Its coefficient equals
$$\frac{[(m - n)/2]^{\bar{j}_0}\ [(m - n + 1)/2]^{\bar{j}_0}}{[(k + 2)/2]^{\bar{j}_0}\ j_0!} = \frac{\Gamma\left( \frac{n - m + 1}{2} \right) \Gamma\left( \frac{k + 2}{2} \right)}{\Gamma\left( \frac{1}{2} \right) \Gamma\left( \frac{n - m + k + 2}{2} \right)},$$
which is seen to coincide with (40). When $n - m$ is odd, no such term is present and the value at $z = 2$ vanishes, implying (41).
One has
k z 2 F m n , k 2 ; k + 1 ; z
= k F m n , k 2 ; k ; z + k ( z 1 ) F m n 1 , k 2 ; k ; z ,
15.5.16 in [14] leading to (42) and (43).
Identity (39) means that for any positive integers, k m n ,
K M , | K | = k 1 p 1 < < p m n ( 1 ) K p j = ( 1 ) k ( k + 1 ) / 2 n m F ( k n , q k ; k + 1 ; 2 ) .
= ( 1 ) n k n m E K s n j = ( 1 ) n k n m [ 2 P ( K s n j = 1 ) 1 ] .
In the last two formulas, K is any k-element subset of M. □
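The closed forms (40)–(43) can be compared with the terminating hypergeometric series in exact rational arithmetic. The following is a small sketch, with $N = n - m$ and the second parameter taken as $\lceil k/2 \rceil$ (that is, $(k+1)/2$ for odd $k$ and $k/2$ for even $k$); the function names are illustrative, and the convention $(-1)!! = 0!! = 1$ is assumed.

```python
from fractions import Fraction

def hyp2f1_at_2(N, b, c):
    """Terminating series F(-N, b; c; 2) for integers N >= 0, b, c > 0."""
    total, term = Fraction(0), Fraction(1)
    for j in range(N + 1):
        total += term
        term *= Fraction(2 * (j - N) * (b + j), (c + j) * (j + 1))
    return total

def double_factorial(n):
    """n!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def closed_form(N, k):
    """Right-hand sides of (40)-(43) for N = n - m and parameter ceil(k/2)."""
    df = double_factorial
    if k % 2 == 1:                      # (40) and (41)
        return Fraction(df(N - 1) * df(k), df(N + k)) if N % 2 == 0 else Fraction(0)
    if N % 2 == 0:                      # (42)
        return Fraction(df(N - 1) * df(k - 1), df(N + k - 1))
    return Fraction(df(N) * df(k - 1), df(N + k))   # (43)

agreement = all(hyp2f1_at_2(N, (k + 1) // 2, k + 1) == closed_form(N, k)
                for N in range(0, 11) for k in range(1, 9))
# Reflection q -> k + 1 - q changes the value by the factor (-1)^N.
reflection = all(hyp2f1_at_2(N, (k + 2) // 2, k + 1)
                 == (-1) ** N * hyp2f1_at_2(N, k // 2, k + 1)
                 for N in range(0, 11) for k in range(2, 9, 2))
```

As a spot check, $F(-2, 2; 4; 2) = 1/5 = 1!! \cdot 3!! / 5!!$, which is case (40) with $n - m = 2$ and $k = 3$.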
The values of the hypergeometric function entering (39) and other formulas with $0 \le m \le n$, $q_m = \lceil m/2 \rceil$, can be summarized as follows:
$$F(m - n, q_m; m + 1; 2) = \begin{cases} \dfrac{(m - 1)!!\ [1 + (-1)^m]}{2 (n - 1)(n - 3) \cdots (n - m + 1)}, & n \text{ even}, \\[2ex] \dfrac{(m - 1)!!}{n (n - 2) \cdots (n - m + 2)}, & n \text{ odd},\ m \text{ even}, \\[2ex] \dfrac{m!!}{n (n - 2) \cdots (n - m + 1)}, & n \text{ odd},\ m \text{ odd}. \end{cases}$$
When $m = 0$, this function takes its largest value, $F(-n, 0; 1; 2) = 1$. If $q = m + 1$, $F(m - n, m + 1; m + 1; 2) = (-1)^{n-m}$.
It is immediate that when $n \to \infty$ and $m$ is fixed,
$$F(m - n, q_m; m + 1; 2) \sim \begin{cases} \dfrac{(m - 1)!!\ [1 + (-1)^m]}{2\, n^{m/2}}, & n \text{ even}, \\[2ex] \dfrac{(m - 1)!!}{n^{m/2}}, & n \text{ odd},\ m \text{ even}, \\[2ex] \dfrac{m!!}{n^{(m+1)/2}}, & n \text{ odd},\ m \text{ odd}. \end{cases}$$
Now by using (39), one gets the first-order approximation for fixed $m \ge 2$,
$$2^m P(s_{n_j} = \epsilon_j,\ j = 1, \ldots, m) = 1 + \frac{m}{2n} \Big[ \epsilon_m - (-1)^n \epsilon_1 - \sum_{j=1}^{m-1} \epsilon_j \epsilon_{j+1} \Big] + O\left( \frac{1}{n^2} \right),$$
which indicates the deviation from uniformity of the distribution of $(s_{n_1}, \ldots, s_{n_m})$.
Another proof of Formulas (40)–(43) in Theorem 3 uses the following generating function:
k = 0 F ( k , ( p + 1 ) / 2 ; k + 1 ; 2 ) z k
= 1 B ( ( p + 1 ) / 2 , ( p + 1 ) / 2 ) 0 1 u ( p 1 ) / 2 ( 1 u ) ( p 1 ) / 2 d u 1 z 2 ( 1 2 u ) = F 1 , 1 2 ; p + 2 2 ; z 2 ,
which is an even function of z when p is odd. If p is even,
k even F ( k , p / 2 ; p + 1 ; 2 ) z k + k odd F ( k , p / 2 ; p + 1 ; 2 ) k z k p + k
= k even F ( k , p / 2 ; p + 1 ; 2 ) ( z + 1 ) z k
= 1 B ( p / 2 , ( p + 2 ) / 2 ) 0 1 u ( p 2 ) / 2 ( 1 u ) p / 2 d u 1 z 2 ( 1 2 u ) = F 1 , 1 2 ; p + 1 2 ; z 2 .
These facts can be found in 15.15.1 [14].
It is well known that the probabilities defining the classical hypergeometric distribution with parameters m , n e , n o = n n e , can be determined from its probability generating function, which is the (finite) hypergeometric series F ( m n , n e ; n o m + 1 ; z ) .
Therefore, the probability that such a random variable takes an even value (under any positive integers n e and n o , n e + n o = n ) is
max ( 0 , m n e ) k min ( n o , m ) [ 1 + ( 1 ) k ] n o k n e m k 2 n m
= 1 2 1 + ( n m ) ! n o ! n ! ( n o m ) ! F ( m , n e ; n o m + 1 ; 1 ) .
If $n_e - n_o = [1 - (-1)^n]/2$, this probability coincides with $P(s_1 \cdots s_m = 1)$ whose expression through $F(m - n, q_m; m + 1; 2)$ is given in Theorem 3.
The joint distribution of s 1 , , s m , which define traditional hypergeometric random variable ( s 1 + + s m + m ) / 2 ,
1 2 m k = 1 m 1 + ϵ k ( n e n o ϵ k 1 ϵ 1 ) n k + 1 ,
differs from (39). Indeed, the parities $s_1, \ldots, s_m$, $1 \le m \le n$, in (39) are special because of their association with ranks of the subsample.
For two disjoint subsets K and K of M and any L,
ϵ i = ± 1 , K ϵ j = ffl 1 , K ϵ j = ffl 2 L ϵ
= 2 m 2 [ δ , L + ffl 1 δ K , L + ffl 2 δ K , L + ffl 1 ffl 2 δ K K , L ] ,
so that one obtains
4 P ( i K s i = ϵ 1 , i L s i = ϵ 2 | x j , j M ) = 1 + ϵ 1 ( 1 ) n k + K r i [ 1 2 K s i K F ( x i ) ] n m
+ ϵ 2 ( 1 ) n + L r i [ 1 2 L s i L F ( x i ) ] n m
+ ϵ 1 ϵ 2 ( 1 ) n ( k + ) + K L r i [ 1 2 K L s i K L F ( x i ) ] n m .
It follows that
P ( i K s i = ϵ 1 , i L l s i = ϵ 2 ) = 1 4 [ 1 + ( 1 ) n k + q k F ( m n , q k ; m + 1 ; 2 ) ϵ 1
+ ( 1 ) n + q F ( m n , q ; m + 1 ; 2 ) ϵ 2 + ( 1 ) n ( k + ) + q k + F ( m n , q k + ; m + 1 ; 2 ) ϵ 1 ϵ 2 ] ,
with similar formulas for the joint distribution of products of s’s over several disjoint subsets.
As in Theorem 2, all joint probabilities are linear functions of $\epsilon_1$, $\epsilon_2$ and $\epsilon_1 \epsilon_2$,
$$4 P(s_i = \epsilon_1, s_j = \epsilon_2) = 1 + \epsilon_1 \alpha + \epsilon_2 \beta + \epsilon_1 \epsilon_2 \chi$$
with some coefficients $\alpha, \beta, \chi$. The degree of dependence of $s$’s (or of their Bernoulli versions $(s_i + 1)/2$) can be measured via the correlation coefficient between $s_i$ and $s_j$,
$$\rho = \frac{2 P(s_i s_j = 1) - 1 - [2 P(s_i = 1) - 1][2 P(s_j = 1) - 1]}{4 \sqrt{P(s_i = 1) P(s_i = -1) P(s_j = 1) P(s_j = -1)}} = \frac{\chi - \alpha \beta}{\sqrt{(1 - \alpha^2)(1 - \beta^2)}}.$$
In our examples, $\chi < \alpha \beta$, and the dependence is negative.
In (39), if $n$ is even, $\beta = -\alpha = -\chi = 1/(n - 1)$; when $n$ is odd, $\alpha = \beta = -\chi = 1/n$. More generally, in (50) $\alpha = (-1)^{nk + q_k + q_{m-k}} F(k - n, q_k; k + 1; 2)$, $\beta = (-1)^{n(m-k) + q_k + q_{m-k}} F(m - k - n, q_{m-k}; m - k + 1; 2)$, $\delta = (-1)^{nm + q_m} F(m - n, q_m; m + 1; 2)$. Then if $n$ is even, and $m$, $k$ are odd, $\rho = 0$, which means that $\prod_K s_{n_j}$ and $\prod_{M \setminus K} s_{n_j}$ are independent, $P(\prod_M s_{n_j} = \pm 1) = 1/2$.
When $n_e - n_o = [1 - (-1)^n]/2$, one achieves in (49), $\alpha = \beta = [1 - (-1)^n]/(2n)$, $\chi = -[2n + (-1)^n - 1]/[2n(n - 1)]$.
These formulas may find further use in probability modeling and estimation of entropy of binary sequences [17].

5. Parity-Based Sums and Dirichlet Distribution

We start here with the identity, which is similar to the formulas (40)–(43) in Theorem 3. Namely, for any positive integers $p$ and $n$ with $\nu_i \ge 0$, $i = 1, \ldots, n$, forming a weak decomposition of $p$ into $n$ (non-negative) parts, one has
$$\sum_{\nu_1 + \cdots + \nu_n = p}\ \prod_i s_i^{\nu_i} = \begin{cases} \dbinom{(p + n - 2)/2}{p/2}, & n \text{ even},\ p \text{ even}, \\[2ex] 0, & n \text{ even},\ p \text{ odd}, \\[1ex] \dbinom{(p + n - 1)/2}{p/2}, & n \text{ odd},\ p \text{ even}, \\[2ex] \dbinom{(p + n - 2)/2}{(p - 1)/2}, & n \text{ odd},\ p \text{ odd}. \end{cases} \tag{51}$$
Indeed, $\sum_{\nu_1 + \cdots + \nu_n = p} 1 = \binom{p + n - 1}{n - 1}$, so that with $q = q_n$ defined by (30), the sum in the left-hand side of (51) can be written as $f_{n,p}(q)$, where
$$f_{m,p}(q) = \sum_{k=0}^{p} (-1)^k \binom{m - q + k - 1}{m - q - 1} \binom{q - 1 + p - k}{q - 1}, \tag{52}$$
which is the coefficient at $z^p$ in the series expansion of $(1 - z)^{-q} (1 + z)^{q - m}$, cf. Section 1.3 in [18].
One has $f_{m,p}(m - q) = (-1)^p f_{m,p}(q)$, $f_{m,0}(q) = 1$, $f_{0,p}(q) = 0$, $p \ge 1$. A formula similar to (52) for the generating function, $z^m (1 - z)^{-q} (1 + z)^{q - m}$, obtains for the partition of $p$ (strictly positive $\nu$’s, $\sum_{k=1}^{m} \nu_k = p$).
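Identity (51) can be verified by brute force for small $n$ and $p$. The sketch below assumes the alternating parity pattern $s_i = (-1)^{n + i}$ of an ordered sample, enumerates all weak decompositions of $p$ into $n$ parts, and compares the resulting sum with the binomial closed form; all function names are illustrative.

```python
from math import comb

def weak_compositions(p, n):
    """Yield all tuples of n non-negative integers summing to p."""
    if n == 1:
        yield (p,)
        return
    for first in range(p + 1):
        for rest in weak_compositions(p - first, n - 1):
            yield (first,) + rest

def parity_power_sum(n, p):
    """Left-hand side of (51) with s_i = (-1)^(n + i), i = 1..n."""
    s = [(-1) ** (n + i) for i in range(1, n + 1)]
    total = 0
    for nu in weak_compositions(p, n):
        sign = 1
        for si, v in zip(s, nu):
            if v % 2:               # s_i^v equals s_i only for odd v
                sign *= si
        total += sign
    return total

def closed_form(n, p):
    """Right-hand side of (51)."""
    if n % 2 == 0:
        return comb((p + n - 2) // 2, p // 2) if p % 2 == 0 else 0
    if p % 2 == 0:
        return comb((p + n - 1) // 2, p // 2)
    return comb((p + n - 2) // 2, (p - 1) // 2)

agreement = all(parity_power_sum(n, p) == closed_form(n, p)
                for n in range(1, 6) for p in range(1, 8))
```

For $n = 3$ and $p = 2$, the six decompositions contribute $1, 1, 1, -1, 1, -1$, giving 2, which equals $\binom{2}{1}$.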
Theorem 4.
For $f_{m,p}(q)$ given in (52), $q = q_m = \lceil m/2 \rceil$, one has
$$f_{m+1,p}(q) = \binom{m + p}{p} F(-p, q; m + 1; 2) \tag{53}$$
as well as (54) and (55).
Proof. 
The equality (53) holds because of (40)–(43). Indeed, its left-hand part satisfies (51), which corresponds to the weak decomposition of $p$. By comparing this equality with the partition of $p$ into $m$ (positive size) blocks, one obtains a representation of $f_{m,p}(q)$, $q = q_m$,
$$(-1)^{mp} f_{m,p}(q) = \sum_{\nu_1 + \cdots + \nu_m = p}\ \prod_i (-1)^{i \nu_i}$$
$$= \sum_{K \subset M}\ \sum_{\sum_K \nu_i = p,\ \nu_i > 0}\ \prod_{i \in K} (-1)^{i \nu_i} = \sum_{K \subset M} (-1)^{k} \sum_{\sum_K \nu_i = p - k,\ \nu_i \ge 0}\ \prod_{i \in K} (-1)^{i \nu_i}$$
$$= \sum_{K \subset M} (-1)^{k(p-k+1)} f_{k,p-k}\left( \frac{k + \sum_K (-1)^i}{2} \right).$$
Here, $k$, $1 \le k \le m \wedge p$, denotes the cardinality of $K = \{i : \nu_i > 0\}$, so that $[k + \sum_K (-1)^i]/2$ is the multiplicity of 1 among $(-1)^i$, $i \in K$.
The number of possible choices of the set $K$ with the given multiplicity $j$ of 1 in the set $\{(-1)^i, i \in K\}$ is $\binom{q}{j}\binom{m-q}{k-j}$. Therefore,
$$(-1)^{mp} f_{m,p}(q) = \sum_{k=1}^{m \wedge p} (-1)^{k(p-k+1)} \sum_{j=0}^{k} \binom{q}{j} \binom{m-q}{k-j} f_{k,p-k}(j). \tag{54}$$
An alternative representation of $f_{m,p}(q)$, $p \ge 1$, results from the binomial theorem applied to $(1-s)^{-q}(1+s)^{q-m}$, $q = q_m$,
$$f_{m,p}(q) = \begin{cases} \dfrac{1}{2^q} \displaystyle\sum_{j=0}^{q} \binom{q}{j} f_{q,p}(j) & m \text{ even} \\[2ex] \dfrac{1}{2^q} \displaystyle\sum_{j=0}^{q} \binom{q}{j} \left[ f_{q,p}(j) - f_{q,p-1}(j) \right] & m \text{ odd}. \end{cases} \tag{55}$$
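The link between the coefficients $f_{m,p}$ and the terminating Gauss series in (53) can be tested in exact rational arithmetic. The sketch below checks the general relation $\binom{m+p}{p} F(-p, b; m+1; 2) = f_{m+1,p}(m+1-b)$ for every integer $0 \le b \le m+1$, which follows from comparing generating functions. Reading (53) as the case $b = \lceil m/2 \rceil$, for which $m + 1 - b = \lceil (m+1)/2 \rceil$, is an assumption about the notation $q_m$ from (30); the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

def multiset(a, k):
    """Coefficient of z^k in (1 - z)^(-a)."""
    return 1 if k == 0 else comb(a + k - 1, k)

def f(m, p, q):
    """Coefficient of z^p in (1 - z)^(-q) (1 + z)^(q - m), for 0 <= q <= m."""
    return sum((-1) ** k * multiset(m - q, k) * multiset(q, p - k)
               for k in range(p + 1))

def hyp_terminating(p, b, c, x):
    """Gauss series F(-p, b; c; x); it terminates after p + 1 terms."""
    total, term = Fraction(0), Fraction(1)
    for k in range(p + 1):
        total += term
        # ratio of consecutive terms: (a + k)(b + k) x / ((c + k)(k + 1)), a = -p
        term *= Fraction((k - p) * (b + k), (c + k) * (k + 1)) * x
    return total

m, p = 2, 2
b = (m + 1) // 2                      # assumed q_m = ceil(m / 2)
left = f(m + 1, p, m + 1 - b)         # equals f_{m+1,p}(ceil((m + 1) / 2))
right = comb(m + p, p) * hyp_terminating(p, b, m + 1, 2)
print(left, right)
```

Using `Fraction` avoids floating-point cancellation in the alternating series at argument 2.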
The coefficients $f_{m,p}(q_m)$ also appear in the conditional cumulative distribution function corresponding to (32). For positive $N$, put
$$U_m^N(y_1, \ldots, y_m) = \int_{-\infty < x_j < y_j,\ j = 1, \ldots, m} \Big[ (-1)^m - 2 \sum_j (-1)^{r_j} F(x_j) \Big]^N \prod_j f(x_j)\, dx_j$$
$$= (-1)^{mN} \int_{0 < u_j < F(y_j),\ j = 1, \ldots, m} \Big[ 1 - 2 \sum_j s_j^M u_j \Big]^N \prod_j du_j,$$
where $r_j$ denotes the rank of $x_j$, and $s_j^M = \operatorname{sign}\big( \prod_{k \ne j,\ k \in M} (x_j - x_k) \big)$.
Then, the mentioned distribution function can be expressed through $U_m^{n-m}$ as
$$F_\epsilon(x_{n_1}, \ldots, x_{n_m}) = \frac{1}{2 P\big( \prod_j s_{n_j} = \epsilon \big)} \times \Big[ \prod_M F(x_{n_j}) + \epsilon (-1)^{m(m-1)/2}\, U_m^{n-m}(x_{n_1}, \ldots, x_{n_m}) \Big],$$
which implies that $U_m^N(\infty, \ldots, \infty) = (-1)^{mN} F(-N, q_m; m+1; 2)$.
Thus, under the Dirichlet distribution $\mathrm{Dir}_{m+1}$ of $\Delta_1, \ldots, \Delta_{m+1}$, with all concentration parameters equal to 1, one has for even $m$,
$$U_m^N(x, \infty, \ldots, \infty) = \int_0^{F(x)} \int_0^1 \cdots \int_0^1 \left[ 1 - 2 (\Delta_m + \Delta_{m-2} + \cdots + \Delta_2) \right]^N d\,\mathrm{Dir}_{m+1}.$$
Now we use the facts that the marginal distribution of $\Delta_1$ has a beta density $\beta(\Delta_1)$ with parameters $(1, m)$, and that the conditional distribution of $(\Delta_2', \ldots, \Delta_{m+1}') = (\Delta_2/(1 - \Delta_1), \ldots, \Delta_{m+1}/(1 - \Delta_1))$ is $\mathrm{Dir}_m$.
Thus, when m is even,
( 1 ) m N U m N ( x , , , )
= 0 F ( x ) 0 1 0 1 v 1 1 v 1 + 1 2 i = 2 , , m Δ i 1 Δ 1 N ( 1 v 1 ) N β ( v 1 ) d v 1 d Dir m
= k = 0 N N k F k , q m ; m , 2 0 F ( x ) v 1 N k ( 1 v 1 ) k β ( v 1 ) d v 1
= m + N N 1 k = 0 N f m + 1 , k ( q m ) m + k I F ( x ) ( N k + 1 , m + k ) .
Here we took advantage of the binomial theorem, (31) and (53); in the last equality, $I_p(N-k+1, m+k) = \int_0^p v^{N-k} (1-v)^{m+k-1}\, dv / B(N-k+1, m+k)$, $0 \le p \le 1$, denotes the incomplete beta function.
Identity (57) also holds for odd values of $m$. For example,
$$U_1^N(x) = \frac{(-1)^N \{ 1 - [1 - 2F(x)]^{N+1} \}}{2(N+1)},$$
and
$$U_2^N(x, \infty) = \frac{F(x)}{N+1} - \frac{[1 - (-1)^N] \{ 1 - [1 - 2F(x)]^{N+2} \}}{4(N+1)(N+2)}.$$
Interest in U is due to the multivariate integration by parts, which also motivates the study of its derivatives in the next section.
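The closed forms for $U_1^N$ and $U_2^N$ can be cross-checked against the integral definition and against the hypergeometric value at infinity. The sketch below writes $t = F(x)$, uses a plain midpoint rule for the double integral of $[1 - 2|u_1 - u_2|]^N$, and takes $q_1 = q_2 = 1$, an assumed reading of $q_m$ as $\lceil m/2 \rceil$. Function names and the grid size are illustrative choices.

```python
from fractions import Fraction

def hyp_terminating(N, b, c, x):
    """Terminating Gauss series F(-N, b; c; x) in exact arithmetic."""
    total, term = Fraction(0), Fraction(1)
    for k in range(N + 1):
        total += term
        term *= Fraction((k - N) * (b + k), (c + k) * (k + 1)) * x
    return total

def U1_closed(t, N):
    """U_1^N(x) with t = F(x)."""
    return (-1) ** N * (1 - (1 - 2 * t) ** (N + 1)) / (2 * (N + 1))

def U2_closed(t, N):
    """U_2^N(x, infinity) with t = F(x)."""
    return t / (N + 1) - (1 - (-1) ** N) * (1 - (1 - 2 * t) ** (N + 2)) / (
        4 * (N + 1) * (N + 2))

def U2_quadrature(t, N, grid=200):
    """Midpoint rule for the integral of [1 - 2|u1 - u2|]^N over (0, t) x (0, 1)."""
    h1, h2 = t / grid, 1.0 / grid
    total = 0.0
    for i in range(grid):
        u1 = (i + 0.5) * h1
        for j in range(grid):
            u2 = (j + 0.5) * h2
            total += (1 - 2 * abs(u1 - u2)) ** N
    return total * h1 * h2

N = 3
print(U2_closed(0.3, N), U2_quadrature(0.3, N))
print(U2_closed(Fraction(1), N), hyp_terminating(N, 1, 3, 2))
```

Passing `Fraction(1)` for `t` keeps the comparison with the hypergeometric value exact, while the quadrature comparison is only approximate.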

6. Partial Derivatives

For $1 \le m \le n$, we look at the properties of the symmetric function $G = G_m$, which so far is defined almost everywhere (for pairwise different $x$'s),
$$G(x_1, \ldots, x_m) = \Big[ 1 - 2 \sum_{j=1}^{m} s_j F(x_j) \Big]^n, \qquad s_j = \operatorname{sign}\Big( \prod_{i:\ x_i \ne x_j} (x_j - x_i) \Big).$$
If the data consist of clusters, $C_j = \{ i : x_i = x_j \}$, $j = 1, \ldots, k$, then using the definition by continuity put
$$G(x_1, \ldots, x_m) = \Big[ 1 - 2 \sum_{j:\ |C_j| \text{ odd}} s_j F(x_j) \Big]^n,$$
which allows possibly equal $x$'s. Notice that the functions $s_j$ are discontinuous.
If all $x$'s coincide, then under this definition, $G(x, \ldots, x) = 1$ if $m$ is even, and $G(x, \ldots, x) = [1 - 2F(x)]^n$ if $m$ is odd. If there are $m - k$ points equal to $+\infty$, then $G = [1 - 2\sum_{j=1}^{k} s_j F(x_j)]^n$ when $m - k$ is even, and $G = [-1 - 2\sum_{j=1}^{k} s_j F(x_j)]^n$ when $m - k$ is odd. If $m - k$ of the $x$'s are equal to $-\infty$, then $G = [1 - 2\sum_{j=1}^{k} s_j F(x_j)]^n$.
Thus, $G(x_1, \ldots, x_m)$ becomes a continuous function whose absolute value is bounded by 1. Actually, it possesses the Lipschitz property if the density $f$ is bounded.
Our goal here is to determine the generalized derivative of order $m$, $\partial^m G(x) / \prod_j \partial x_j$, $x = (x_1, \ldots, x_m)$, so that for any smooth compactly supported $\varphi$, the multivariate integration by parts formula holds:
$$\int \frac{\partial^m G(x)}{\prod_j \partial x_j}\, \varphi(x)\, dx = (-1)^m \int \frac{\partial^m \varphi(x)}{\prod_j \partial x_j}\, G(x)\, dx.$$
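Theorem 5.
With $(n)_k = n(n-1) \cdots (n-k+1) = \Gamma(n+1)/\Gamma(n+1-k)$ denoting the descending factorial, one has
$$\frac{\partial^m}{\prod_j \partial x_j} G(x_1, \ldots, x_m) = \sum_{k=1}^{m} (-2)^k (n)_k \Big[ 1 - 2 \sum_{j=1}^{m} s_j F(x_j) \Big]^{n-k} \times \binom{m}{k}^{-1} \sum_{K \subset M,\ |K| = k}\ \prod_{j \in K} f(x_j)\, \frac{\partial^{m-k}}{\prod_{i \notin K} \partial x_i} \prod_{j \in K} s_j. \tag{58}$$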
Proof. 
For $1 \le k \le m$, differentiation over $x_1, \ldots, x_k$ shows that
$$\frac{\partial^k}{\partial x_1 \cdots \partial x_k} \Big[ 1 - 2 \sum_{j=1}^{m} s_j F(x_j) \Big]^n = \sum_{p=1}^{k} (-2)^p (n)_p \Big[ 1 - 2 \sum_{j=1}^{m} s_j F(x_j) \Big]^{n-p} A_p^{(k)},$$
where
$$A_p^{(k)} = \prod_{j=1}^{p} f(x_j)\, \frac{\partial^{k-p}}{\prod_{\ell = p+1}^{k} \partial x_\ell} \prod_{j=1}^{p} s_j.$$
For $k \ge 2$, $A_p^{(k)} = A_p^{(k)}(x_1, \ldots, x_k)$ is a symmetric function of its arguments,
$$A_p^{(k)} = \binom{k}{p}^{-1} \sum_{1 \le j_1 < \cdots < j_p \le k}\ \prod_{i=1}^{p} f(x_{j_i})\, \frac{\partial^{k-p}}{\prod_{\ell \ne j_1, \ldots, j_p} \partial x_\ell} \prod_{i=1}^{p} s_{j_i}.$$
For $1 \le p \le k$, these functions can be characterized by the following recursion with $A_0^{(k)} = 0$, $A_0^{(0)} = 1$, $A_p^{(k)} = 0$, $p > k$,
$$A_p^{(k)} = s_k f(x_k) A_{p-1}^{(k-1)} + \frac{\partial}{\partial x_k} A_p^{(k-1)}.$$
The proof is by induction. When $k = 1$,
$$\frac{\partial}{\partial x_i} \Big[ 1 - 2 \sum_{j=1}^{m} s_j F(x_j) \Big]^n = -2 n s_i f(x_i) \Big[ 1 - 2 \sum_{j=1}^{m} s_j F(x_j) \Big]^{n-1}.$$
Indeed, shift invariance of $s_j$ means that for any $i$, $\sum_j \partial s_j / \partial x_i = 0$. Thus, $A_1^{(1)} = s_1 f(x_1)$, and for $k \ge 2$,
$$A_1^{(k)} = f(x_1)\, \frac{\partial^{k-1} s_1}{\partial x_2 \cdots \partial x_k} = (-2)^{k-1} f(x_1) \prod_{j=2}^{k} \delta(x_1 - x_j) \prod_{i=k+1}^{m} \operatorname{sign}(x_1 - x_i),$$
which indeed is a symmetric function of $x_1, \ldots, x_k$.
The following induction steps are straightforward, so that (58) follows. □
For $m = 2$, $s_1 = -s_2$, so that according to (58),
$$\frac{\partial^2}{\partial x\, \partial y} \left[ 1 - 2 |F(x) - F(y)| \right]^n = -4 n (n-1) f(x) f(y) \left[ 1 - 2 |F(x) - F(y)| \right]^{n-2} + 4 n f(x)\, \delta(x - y).$$
For $m = 3$,
$$\frac{\partial^3}{\partial x\, \partial y\, \partial z} \left[ 1 - 2 s_1 F(x) - 2 s_2 F(y) - 2 s_3 F(z) \right]^n$$
$$= 8 n (n-1)(n-2) \left[ 1 - 2 s_1 F(x) - 2 s_2 F(y) - 2 s_3 F(z) \right]^{n-3} f(x) f(y) f(z)$$
$$+ 8 n (n-1) \left[ 1 - 2 s_1 F(x) - 2 s_2 F(y) - 2 s_3 F(z) \right]^{n-2} \left[ s_1 s_2 f(x) \delta(x - y) + s_1 s_3 f(z) \delta(x - z) + s_2 s_3 f(y) \delta(y - z) \right]$$
$$- 8 n f(x)\, \delta(y - x)\, \delta(z - x).$$
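Let
$$\tilde G_m(x_1, \ldots, x_m) = \sum_{k=0}^{m}\ \sum_{K,\ |K| = k} (-1)^{m-k} G_k(x_j, j \in K),$$
where the inner summation is over all subsets $K$ of $M$ of cardinality $k$ (the term with $K = M$ is $G_m$ itself), and $\tilde G_0 = 1$. Then $\tilde G_m$ is grounded, i.e., it vanishes if $x_j = -\infty$ for at least one $j$. If one of the $x$'s is equal to $+\infty$, then $\tilde G_m(x_1, \ldots, x_{m-1}, \infty) = [(-1)^n - 1]\, \tilde G_{m-1}(x_1, \ldots, x_{m-1})$, so that for even $n$, $\tilde G_m$ vanishes if $x_j = +\infty$ for at least one $j$. When $m \ge 2$, $\tilde G_m(x, \infty, \ldots, \infty) = (-1)^m 2^{m-2} [1 - (-1)^n] \{ 1 - [1 - 2F(x)]^n \}$, and $\tilde G_m(+\infty, \ldots, +\infty) = (-1)^m 2^{m-1} [1 - (-1)^n]$.
Thus,
$$\frac{\partial^m}{\prod_j \partial x_j} G(x_1, \ldots, x_m) = \frac{\partial^m}{\prod_j \partial x_j} \tilde G_m(x_1, \ldots, x_m).$$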
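The boundary behaviour of the grounded function can be illustrated numerically. The sketch below is built on assumptions: $F$ is taken to be the uniform distribution function on $(0, 1)$, the signs are computed as $s_j = \operatorname{sign} \prod_{x_i \ne x_j} (x_j - x_i)$ with the odd-cluster rule for ties, and the alternating sum runs over all subsets of the arguments, including the full set. The names `cdf`, `G` and `G_tilde` are hypothetical.

```python
from itertools import combinations
from math import inf

def cdf(x):
    """Uniform(0, 1) distribution function, an illustrative choice of F."""
    return min(max(x, 0.0), 1.0)

def G(xs, n):
    """[1 - 2 * sum of s_j F(x_j) over clusters of odd size]^n; G(()) = 1."""
    total = 0.0
    for v in set(xs):
        if xs.count(v) % 2 == 0:
            continue                      # even clusters do not contribute
        larger = sum(1 for x in xs if x > v)
        s = -1 if larger % 2 else 1       # sign of prod (v - x_i) over x_i != v
        total += s * cdf(v)
    return (1 - 2 * total) ** n

def G_tilde(xs, n):
    """Alternating sum of G over all subsets of the arguments."""
    m = len(xs)
    total = 0.0
    for k in range(m + 1):
        for idx in combinations(range(m), k):
            total += (-1) ** (m - k) * G(tuple(xs[i] for i in idx), n)
    return total

n = 3
print(G_tilde((0.3, 0.7, -inf), n))                       # grounded: 0
print(G_tilde((0.3, 0.7, inf), n),
      ((-1) ** n - 1) * G_tilde((0.3, 0.7), n))           # recursion at +infinity
print(G_tilde((inf, inf), n))
```

Comparing values with `>` rather than subtracting them keeps the sign computation well defined when some arguments are infinite.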
For $m = 2$ and even $n$ it follows that for any integrable function $\phi(x, y) = \phi(y, x)$,
$$\int\!\!\int \phi(x, y) f(x) f(y) \left[ 1 - 2 |F(x) - F(y)| \right]^{n-2} dx\, dy$$
$$= \frac{1}{n-1} \int \phi(x, x) f(x) \left\{ 1 - \frac{[1 - (-1)^n]}{2} [2F(x) - 1]^{n-1} \right\} dx$$
$$- \frac{1}{4 n (n-1)} \int\!\!\int \frac{\partial^2 \phi(x, y)}{\partial x\, \partial y} \Big\{ \left[ 1 - 2F(x \vee y) + 2F(x \wedge y) \right]^n + (-1)^n - \left[ 1 - 2F(x \vee y) \right]^n - \left[ 2F(x \wedge y) - 1 \right]^n \Big\}\, dx\, dy,$$
which is useful to determine the covariance structure.
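A numerical check of this integration-by-parts identity is sketched below for an illustrative setting: $F$ uniform on $(0, 1)$, so $f = 1$ there, and $\phi(x, y) = xy$, whose mixed derivative is 1. The braces in the last integral are read as $[1 - 2F(x \vee y) + 2F(x \wedge y)]^n + (-1)^n - [1 - 2F(x \vee y)]^n - [2F(x \wedge y) - 1]^n$; this placement of maxima and minima is an assumption made here, chosen so that the expression vanishes on the boundary of the unit square. All function names are hypothetical.

```python
def midpoint_1d(h, grid=2000):
    """Midpoint rule on (0, 1)."""
    step = 1.0 / grid
    return sum(h((i + 0.5) * step) for i in range(grid)) * step

def midpoint_2d(h, grid=200):
    """Midpoint rule on the unit square."""
    step = 1.0 / grid
    pts = [(i + 0.5) * step for i in range(grid)]
    return sum(h(x, y) for x in pts for y in pts) * step * step

def lhs(n):
    """Double integral of phi f f [1 - 2|F(x) - F(y)|]^(n - 2), phi = x y."""
    return midpoint_2d(lambda x, y: x * y * (1 - 2 * abs(x - y)) ** (n - 2))

def rhs(n):
    """Diagonal term minus the term with the mixed derivative of phi (equal to 1)."""
    c = (1 - (-1) ** n) / 2
    diag = midpoint_1d(lambda x: x * x * (1 - c * (2 * x - 1) ** (n - 1))) / (n - 1)

    def H(x, y):
        lo, hi = min(x, y), max(x, y)
        return ((1 - 2 * hi + 2 * lo) ** n + (-1) ** n
                - (1 - 2 * hi) ** n - (2 * lo - 1) ** n)

    return diag - midpoint_2d(H) / (4 * n * (n - 1))

for n in (2, 3, 4):
    print(n, lhs(n), rhs(n))
```

For this choice of $\phi$ the exact common values are $1/4$, $7/60$ and $17/180$ for $n = 2, 3, 4$; in this uniform example the two sides agree for odd $n$ as well.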

7. Conclusions

Interesting properties of self-dual probabilities demonstrate their potential in statistical estimation without any additional variance information. The polynomial approximation over a finite set, as well as the Gauss hypergeometric function, are intimately related to data parities and parity-based distributions.
Several presented combinatorial identities may find wider use in probability theory applications, in particular, in random matrices and in statistical physics.

Funding

This research received no external funding.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed. All conclusions are based on mathematical reasoning, and the author is solely responsible for them.

Acknowledgments

Many thanks are due to the anonymous referees for their very careful reading of the original version and for all their critical comments. The role of the Mathematics editorial board in securing such referees is also acknowledged.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Rukhin, A.L. Estimating common mean in a heteroscedastic variances model. Mathematics 2025, 13, 1290.
  2. Rukhin, A.L. Estimation of the common mean from heterogeneous normal observations with unknown variances. J. R. Stat. Soc. Ser. B 2017, 79, 1601–1618.
  3. Rivlin, T.J. An Introduction to Approximation of Functions; Dover: New York, NY, USA, 1969.
  4. Trefethen, L.N. Approximation Theory and Approximation Practice; SIAM: Philadelphia, PA, USA, 2013.
  5. Karlin, S.; Studden, W.J. Tchebysheff Systems: With Applications in Analysis and Statistics; Wiley: New York, NY, USA, 1966.
  6. de Boor, C.; Saff, E.B. Finite sequences of orthogonal polynomials connected by a Jacobi matrix. Linear Algebra Appl. 1986, 75, 43–56.
  7. Borodin, A. Duality of orthogonal polynomials on a finite set. J. Stat. Phys. 2002, 109, 1109–1120.
  8. Vinet, L.; Zhedanov, A. The characterization of classical and semiclassical orthogonal polynomials from their dual polynomials. J. Comp. Appl. Math. 2004, 172, 41–48.
  9. Genest, V.; Tsujimoto, S.; Vinet, L.; Zhedanov, A. Persymmetric Jacobi matrices, isospectral deformations and orthogonal polynomials. J. Math. Anal. Appl. 2017, 450, 915–928.
  10. Rukhin, A.L. Parities and hypergeometric function. Theory Probab. Appl. 2025, 70, 355–374.
  11. Fang, K.-T.; Pan, J. A review: Representative points of statistical distributions and their applications. Mathematics 2023, 11, 2930.
  12. David, H.A.; Nagarajah, N.H. Order Statistics, 3rd ed.; Wiley: New York, NY, USA, 2003.
  13. Ng, K.W.; Tian, G.-L.; Tang, M.-L. Dirichlet and Related Distributions: Theory, Methods and Applications; Wiley: New York, NY, USA, 2011.
  14. Olver, F.W.J.; Lozier, D.W.; Boisvert, R.W.; Clark, C.W. NIST Handbook of Mathematical Functions; NIST: Gaithersburg, MD, USA; U.S. Department of Commerce: Washington, DC, USA; Cambridge University Press: Cambridge, UK, 2010.
  15. Callan, D. A combinatorial survey of identities for the double factorials. arXiv 2009, arXiv:0906.1317.
  16. Li, Y.-W.; Qi, F. A new closed-form formula of the Gauss hypergeometric function at specific arguments. Axioms 2024, 13, 317.
  17. De Gregorio, J.; Sanchez, D.; Toral, R. Entropy estimators for Markovian sequences: A comparative analysis. Entropy 2024, 26, 79.
  18. Riordan, H.J. Combinatorial Identities; Wiley: New York, NY, USA, 1968.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
