Parity-Based Statistics and Combinatorial Identities

Rukhin, Andrew L.

doi:10.3390/math14132407


by

Andrew L. Rukhin

Department of Mathematics and Statistics, University of Maryland at Baltimore County, Baltimore, MD 21250, USA

Mathematics 2026, 14(13), 2407; https://doi.org/10.3390/math14132407 (registering DOI)

Submission received: 26 April 2026 / Revised: 11 June 2026 / Accepted: 21 June 2026 / Published: 5 July 2026

(This article belongs to the Section D1: Probability and Statistics)


Abstract

Notable discrete probability laws appear as posterior distributions in the estimation of the common mean with heterogeneous variances. These probabilities, which are defined by an arbitrary set of distinct real numbers, also arise in seemingly unrelated areas of polynomial approximation and statistical physics. The corresponding orthogonal polynomials possess an interesting self-duality property, which invites the study of statistical distributions based on data parity. These distributions provide novel, intriguing formulas for the classical hypergeometric function along with several combinatorial identities.

Keywords:

Dirichlet distribution; hypergeometric function; Lagrange interpolation formula; parity-based sum; self-dual orthogonal polynomials

MSC:

05A99; 33C05; 41A10; 60C99; 62F15

1. Introduction: Common Mean and Unknown Heterogeneous Uncertainties

This paper arose from statistical analysis of a sequence,

x_{1}, \dots, x_{n}

, in the setting of heterogeneous research synthesis, with

x_{j}

representing the estimate of the common mean (say, treatment effect), as reported by the j-th study. No conditions are imposed on the unknown accuracies, which cannot be assumed equal. The main statistical challenge is the estimation of the common mean, treated as a shift parameter, when the standard deviations are considered to be unknown nuisance scale parameters.

In some applications, the uncertainty appraisals are either missing or utterly unreliable. The difficulty in accurately valuing the variances of systematic errors, whether due to specific laboratory conditions or hospital protocols, is well acknowledged by data scientists.

The issue of underreported uncertainties, particularly those that stem from asymptotic normal theory, which presupposes large data sets, is prevalent in metrology. Furthermore, the challenge of reproducibility within individual centers may be exacerbated by the nature of the employed measuring instruments.

Another source of artificially small uncertainties may be due to the removal of outliers for purely mathematical reasons. By eliminating “unrepresentative” or “spurious” data points, one typically is left with a part of the sample that is unrealistically accurate. See [1] for further motivation.

Unlike classical statistical models, the scenario suggested here does not require accompanying estimates of uncertainty of

x_{j}

. Our investigation focuses on the special “self-dual” weights that define the discrete posterior distribution for the unknown mean, set against a non-informative objective prior for both the mean and independent variances.

This line of inquiry, initiated in [2] under the assumption of normality, grapples with the lack of variance information, causing several statistical complications. For instance, the classical maximum likelihood estimator cannot be determined uniquely, as the likelihood function reaches infinity at each data point. Nevertheless, the problem is well-defined. Indeed, estimating the common mean requires determining at most n parameters, the mean itself and, say,

ω_{i} = σ_{i}^{- 2} / \sum_{j} σ_{j}^{- 2}

, which belong to the unit simplex of dimension

n - 1

. Statistical practice needs to determine only

(n - 1)

weights to form the common mean estimator, say,

\sum_{i = 1}^{n} ω_{i} x_{i}

. In some applications, there is additional information that allows further dimension reduction.

This paper is motivated by mathematical statistics, but its goal is to explore the mathematical aspects of the issues arising in the statistical problem. Indeed, it is addressed to a general mathematical audience. The main contribution is the construction of a unified probabilistic framework for rank-induced parity distributions and the derivation of related moment and combinatorial formulas, which show the link to the Gauss hypergeometric function.

More specifically, Section 2 examines the polynomial approximation problem over the set

{x_{1}, \dots, x_{n}}

, and orthogonal polynomials that deviate least from zero on this set. The required formulas for the distributions of parity-based sums are established in Section 3. The deep connection between these distributions and the hypergeometric function is demonstrated in Section 4. We present a self-contained proof of formulas for the specific value of this classical function. These formulas are presumably known to specialists, but the author could not find a convenient reference. Our approach yields seemingly new combinatorial identities, (39), (48), (53)–(55), (57) in Section 4 and Section 5. Some useful expressions for partial derivatives are given in Section 6.

2. Self-Dual Probabilities and Orthogonal Polynomials Least Deviating from Zero

We initiate our discussion with a problem that, at first glance, appears unrelated to the main focus. Namely, assuming that all x’s are distinct, the best polynomial approximation of a function f over a finite set

{x_{1}, \dots, x_{n}}

is sought.

The optimal uniform approximation by a polynomial of degree

n - 1

(or higher) is achieved through the classical Lagrange interpolation polynomial

L (x)

, which coincides with f on this set, i.e.,

L (x_{j}) = f_{j} = f (x_{j})

for

j = 1, \dots, n

. If the polynomial’s degree is

n - 2

, the best approximation is derived by subtracting from

L (x)

a specific multiple of the oscillating polynomial that alternates its sign at each successive

x_{j}

, attaining the same absolute value at these points.

The probabilities,

w_{i} = \frac{1}{\prod_{j \neq i} | x_{i} - x_{j} |} {[\sum_{k} \prod_{j \neq k} \frac{1}{| x_{k} - x_{j} |}]}^{- 1}

(1)

= \frac{s_{i}}{\prod_{j \neq i} (x_{i} - x_{j})} {[\sum_{k} s_{k} \prod_{j \neq k} \frac{1}{x_{k} - x_{j}}]}^{- 1},

are related to the Lagrange formula and to the parities of

x_{i}

,

i = 1, \dots, n

,

s_{i} = sign (\prod_{j \neq i} (x_{i} - x_{j})) .

(2)
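As a numerical illustration of (1) and (2), the following Python sketch computes the weights and the parities for a small set of points. The function names `self_dual_weights` and `parities` and the sample points are illustrative choices, not notation from the paper.

```python
from math import prod

def parities(x):
    """Parity s_i = sign(prod_{j != i} (x_i - x_j)) as in (2); the points must be distinct."""
    return [1 if prod(xi - xj for j, xj in enumerate(x) if j != i) > 0 else -1
            for i, xi in enumerate(x)]

def self_dual_weights(x):
    """Probabilities w_i of (1): inverse products of distances, normalized to sum to one."""
    inv = [1.0 / prod(abs(xi - xj) for j, xj in enumerate(x) if j != i)
           for i, xi in enumerate(x)]
    total = sum(inv)
    return [v / total for v in inv]

x = [0.0, 1.0, 3.0]
w = self_dual_weights(x)   # proportional to 1/3, 1/2, 1/6
s = parities(x)            # the largest point always has parity +1
```

For these three points the inverse distance products are 1/3, 1/2 and 1/6, which already sum to one, and the parities alternate starting from +1 at the largest observation.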

It is known (see Theorem 1.15, [3]) that the approximation error coincides with the absolute value of the average, taken under (1), of products

s_{i} f_{i}

. Thus,

| \sum_{i} s_{i} f_{i} w_{i} | = \min_{R} \max_{i} | f_{i} - R (x_{i}) |,

where

R (x)

runs through all polynomials of degree not exceeding

n - 2

. For instance, the approximation error of the oscillating function

f_{i} = s_{i}

is independent of n,

1 = \min_{R} \max_{i} | s_{i} - R (x_{i}) | .

The barycentric form of the Lagrange interpolation formula,

L (x) = \frac{\sum_{i} w_{i} s_{i} f_{i} / (x - x_{i})}{\sum_{i} w_{i} s_{i} / (x - x_{i})},

provides numerous advantages [4].
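The barycentric form can be sketched in a few lines of Python. Since only the ratios of the products w_i s_i enter the formula, the sketch below uses the unnormalized quantities 1 / \prod_{j \neq i} (x_i - x_j), which are proportional to w_i s_i; the function name and the test function are assumptions made for illustration.

```python
from math import prod

def barycentric_interpolant(x, f):
    """Return the Lagrange interpolant L(z) in barycentric form.

    The coefficients c_i are proportional to w_i * s_i, so the normalizing
    constant of the probabilities (1) cancels in the ratio.
    """
    c = [1.0 / prod(xi - xj for j, xj in enumerate(x) if j != i)
         for i, xi in enumerate(x)]

    def L(z):
        for xi, fi in zip(x, f):
            if z == xi:            # at a node the formula is 0/0; return the data value
                return fi
        num = sum(ci * fi / (z - xi) for ci, xi, fi in zip(c, x, f))
        den = sum(ci / (z - xi) for ci, xi in zip(c, x))
        return num / den

    return L

nodes = [0.0, 1.0, 3.0, 4.5]
L = barycentric_interpolant(nodes, [t**3 - 2 * t for t in nodes])
value = L(2.0)   # a cubic is reproduced by interpolation at four nodes
```

With four nodes the interpolant has degree three, so the cubic used here is recovered up to rounding error.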

Probabilities (1) originate in random matrix theory, where they offer alternative descriptions of a physical ensemble in terms of particles or holes. Many optimization problems involving the discriminant function through electrostatic equilibrium are underpinned by them [5]. From the mathematical perspective, these probabilities are self-dual under the duality definition given in [6] and developed in [7,8,9].

In mathematical statistics, probabilities (1) define the discrete posterior distribution for the location parameter against a non-informative prior for this parameter, mean, and independent variances [10]. The generalized Bayes estimator of the mean under the quadratic loss is

δ = \sum_{i} w_{i} x_{i} = \sum_{i} \frac{x_{i}}{\prod_{j \neq i} | x_{i} - x_{j} |} {[\sum_{i} \prod_{j \neq i} \frac{1}{| x_{i} - x_{j} |}]}^{- 1} .

(3)

This statistic serves as a semiparametric estimator of the symmetry center in a heterogeneous sample, meaning that it does not depend on the distribution of x’s from a broad class.
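A small exact computation may help to see how (3) behaves. The sketch below, whose function name `bayes_estimate` is an illustrative assumption, evaluates δ with rational arithmetic. For three points a direct calculation shows that the weights are proportional to the distances between the two other points, so that δ equals the middle observation; the example with an outlying third point reflects this.

```python
from fractions import Fraction
from math import prod

def bayes_estimate(x):
    """Estimator (3): the mean of the points under the probabilities (1), computed exactly."""
    xs = [Fraction(v) for v in x]
    inv = [1 / prod(abs(xi - xj) for j, xj in enumerate(xs) if j != i)
           for i, xi in enumerate(xs)]
    return sum(v * xi for v, xi in zip(inv, xs)) / sum(inv)

d_three = bayes_estimate([0, 1, 3])       # middle observation
d_outlier = bayes_estimate([0, 1, 10])    # unchanged when the largest point moves away
d_four = bayes_estimate([0, 1, 2, 3])     # center of symmetry of the four points
```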

Polynomials orthogonal with respect to (1) exhibit a striking symmetry. To see it, let

δ_{m} = \sum_{k} w_{k} x_{k}^{m}, m = 0, 1, \dots

represent the moments of self-dual weights (1). The monic polynomials

T_{m}, m = - 1, 0, 1, \dots, n,

T_{- 1} (z) = 0, T_{0} (z) = 1, T_{1} (z) = z - δ_{1}, \dots, T_{n} (z) = \prod (z - x_{k})

, are orthogonal in

L_{2}

,

\sum_{i} w_{i} T_{m} (x_{i}) T_{p} (x_{i}) = δ_{m p} \sqrt{h_{m} h_{p}} .

(4)

With

W = {[\sum_{i} \prod_{j \neq i} {| x_{i} - x_{j} |}^{- 1}]}^{- 1},

the sequence

h_{m} = \sum_{i} {[T_{m} (x_{i})]}^{2} w_{i}, h_{0} = 1, h_{1} = δ_{2} - δ_{1}^{2}, \dots, h_{n - 1} = W^{2},

enjoys the mentioned symmetry as

h_{m} h_{n - m - 1} = W^{2}

.

The orthogonal polynomials

T_{m}

are known to satisfy the three-term recurrence,

T_{m + 1} (z) = (z - α_{m}) T_{m} (z) - β_{m} T_{m - 1} (z),

(5)

where the coefficients

β_{m} = h_{m} / h_{m - 1}, β_{m} = β_{n - m}

, and

α_{m} = \sum_{i} x_{i} {[T_{m} (x_{i})]}^{2} w_{i} / h_{m},

also possess a symmetry property:

α_{m} = α_{n - 1 - m}

.

Mathematical induction applied to (5) shows that for

i, m = 1, \dots, n

,

W T_{m} (x_{i}) = s_{i} h_{m} T_{n - m - 1} (x_{i}) .

(6)
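The symmetry of the norms and identity (6) can be checked numerically by running the recurrence (5) on the nodes only. The following Python sketch is a minimal illustration under that approach; the function name `orthogonal_data` and the sample points are assumptions, and floating-point arithmetic limits it to moderate n.

```python
from math import prod

def orthogonal_data(x):
    """Values T_m(x_i), m = 0..n-1, norms h_m, parities s_i and the constant W."""
    n = len(x)
    signed = [prod(xi - xj for j, xj in enumerate(x) if j != i)
              for i, xi in enumerate(x)]
    s = [1 if v > 0 else -1 for v in signed]
    W = 1.0 / sum(1.0 / abs(v) for v in signed)
    w = [W / abs(v) for v in signed]              # probabilities (1)
    prev, cur = [0.0] * n, [1.0] * n              # T_{-1} and T_0 on the nodes
    T, h = [], []
    for m in range(n):
        hm = sum(wi * t * t for wi, t in zip(w, cur))
        alpha = sum(wi * xi * t * t for wi, xi, t in zip(w, x, cur)) / hm
        beta = hm / h[-1] if h else 0.0           # beta_m = h_m / h_{m-1}
        T.append(cur)
        h.append(hm)
        # three-term recurrence (5), evaluated at every node
        prev, cur = cur, [(xi - alpha) * t - beta * p
                          for xi, t, p in zip(x, cur, prev)]
    return W, s, T, h

x = [0.0, 1.0, 2.0, 4.0]
W, s, T, h = orthogonal_data(x)
n = len(x)
# largest deviation from h_m h_{n-m-1} = W^2
sym_gap = max(abs(h[m] * h[n - 1 - m] - W**2) for m in range(n))
# largest deviation from identity (6)
dual_gap = max(abs(W * T[m][i] - s[i] * h[m] * T[n - 1 - m][i])
               for m in range(n) for i in range(n))
```

For the points 0, 1, 2, 4 one finds W = 4/3 and norms 1, 8/9, 2, 16/9, so both gaps vanish up to rounding.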

Identity (6) can be obtained from the original duality definition [6], according to which

T_{m} (x_{i})

is a multiple of

T_{n - 1} (x_{i}) T_{n - m - 1} (x_{i})

.

The polynomial,

T_{n - 1} (z) = \sum_{i} w_{i} \prod_{j \neq i} (z - x_{j}),

deviates least from zero in

L_{\infty}

:

T_{n - 1} (x_{i}) = s_{i} W

, i.e., for any monic polynomial R of degree

n - 1

,

\max_{i} | T_{n - 1} (x_{i}) | = W \leq \max_{i} | R (x_{i}) | .

(7)

The comparison of the extremes of

T_{n - 1} (z)

and those of the classical monic, degree

n - 1

, Chebyshev polynomial on the interval,

[\min x_{i}, \max x_{i}]

, provides a sharp inequality,

W \leq \frac{{(\max x_{i} - \min x_{i})}^{n - 1}}{2^{n - 2}} = 2 {(\frac{\max x_{i} - \min x_{i}}{2})}^{n - 1},

(8)

which holds for all distinct

x_{1}, \dots, x_{n}

. The factor

2^{2 - n}

in (8) is given incorrectly in [10].

Equality in (8) is attained if and only if x’s are extreme points of the mentioned Chebyshev polynomial on this interval, i.e., when for

j = 1, \dots, n

,

x_{j} = \frac{\max x_{i} + \min x_{i}}{2} + \frac{(\max x_{i} - \min x_{i})}{2} \cos (\frac{(j - 1) π}{n - 1}) .

Then the probabilities (1) have a remarkably simple form,

w_{j} = \frac{1}{n - 1}, j = 2, \dots, n - 1, w_{1} = w_{n} = \frac{1}{2 (n - 1)} .

Since

w_{1} < w_{2}, w_{n} < w_{n - 1}

, this provides the closest resemblance of (1) to the uniform distribution.
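The simple form of the weights at the Chebyshev extreme points is easy to confirm numerically. The Python sketch below is illustrative only; the helper names and the interval [2, 5] are assumptions.

```python
from math import cos, pi, prod

def chebyshev_extrema(n, lo=-1.0, hi=1.0):
    """Extreme points of the Chebyshev polynomial of degree n-1, mapped to [lo, hi]."""
    return [(hi + lo) / 2 + (hi - lo) / 2 * cos((j - 1) * pi / (n - 1))
            for j in range(1, n + 1)]

def self_dual_weights(x):
    """Probabilities (1): normalized inverse products of distances."""
    inv = [1.0 / prod(abs(xi - xj) for j, xj in enumerate(x) if j != i)
           for i, xi in enumerate(x)]
    total = sum(inv)
    return [v / total for v in inv]

n = 7
w = self_dual_weights(chebyshev_extrema(n, lo=2.0, hi=5.0))
# endpoints carry half the mass of each interior point
end_gap = max(abs(w[0] - 1 / (2 * (n - 1))), abs(w[-1] - 1 / (2 * (n - 1))))
mid_gap = max(abs(v - 1 / (n - 1)) for v in w[1:-1])
```

The weights do not depend on the interval, since shifting and rescaling the points leaves the normalized inverse distance products unchanged.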

In addition to

x_{1}, \dots, x_{n}

, the polynomial

T_{n - 1}^{2} (z) - W^{2}

has

n - 2

real roots (which interlace those of

T_{n - 1}

), so that, with

R_{n - 1}

denoting the monic polynomial of degree

n - 2

having these roots,

T_{n - 1}^{2} (z) - W^{2} = T_{n} (z) R_{n - 1} (z) .

(9)

Therefore,

R_{n - 1}

is the associated polynomial to

T_{n - 1}

,

R_{n - 1} (z) = \sum_{i} \frac{[T_{n - 1} (z) - T_{n - 1} (x_{i})] w_{i}}{z - x_{i}} .

(10)

The monic polynomial

R_{m}

associated with

T_{m}

, of degree

m - 1, m = 1, \dots, n - 1,

satisfies the same recurrence (5), but the initial conditions are different:

R_{0} = 0, R_{1} = 1

, so that

R_{2} (z) = z - α_{1}

,

\dots, R_{n} (z) = T_{n - 1} (z) = (z - δ_{1}) R_{n - 1} (z) - β_{1} R_{n - 2} (z)

.

The coefficients

α_{n / 2}, β_{n / 2}

admit an explicit form: for even n

α_{(n - 2) / 2} = α_{n / 2} = \frac{\sum_{i} s_{i} x_{i}^{2}}{2 \sum_{i} s_{i} x_{i}}, β_{n / 2} = \frac{{(\sum_{i} s_{i} x_{i})}^{2}}{4} .

(11)

If n is odd, then

T_{(n - 1) / 2} (z) = P_{o} (z) = \prod_{j : s_{j} = - 1} (z - x_{j}),

(12)

h_{(n - 1) / 2} = W

, and

α_{(n - 1) / 2} = \sum_{i} s_{i} x_{i}, β_{(n - 1) / 2} = β_{(n + 1) / 2} = \frac{\sum_{i} s_{i} x_{i}^{2} - {(\sum_{i} s_{i} x_{i})}^{2}}{4} .

(13)

With

P_{e} (z) = \prod_{j : s_{j} = 1} (z - x_{j}),

other central orthogonal polynomials are of the form

T_{n / 2} (z) = \frac{P_{o} (z) + P_{e} (z)}{2},

(14)

T_{(n + 1) / 2} (z) = \frac{(z - \sum_{i} s_{i} x_{i}) P_{o} (z) + P_{e} (z)}{2},

(15)

and

T_{(n - 2) / 2} (z) = \frac{P_{o} (z) - P_{e} (z)}{\sum_{i} s_{i} x_{i}} .

(16)

One can represent

R_{n - 1} (z) = R_{e} (z) R_{o} (z)

as a product of two monic polynomials (with real roots) of degrees

n_{e} - 1

and

n_{o} - 1

, respectively. Then, with

P_{e}

and

P_{o}

defined above,

T_{n - 1} (z) - W = P_{e} (z) R_{o} (z)

and

T_{n - 1} (z) + W = P_{o} (z) R_{e} (z)

. If

s_{i} = 1

,

R_{e} (x_{i}) = 2 W / P_{o} (x_{i}),

when

s_{j} = - 1

,

R_{o} (x_{j}) = - 2 W / P_{e} (x_{j})

. Thus,

R_{e} (z) = 2 W \sum_{s_{i} = 1} \frac{\prod_{s_{k} = 1, k \neq i} (z - x_{k})}{P_{o} (x_{i}) P_{e}^{'} (x_{i})} = 2 \sum_{s_{i} = 1} w_{i} \prod_{s_{k} = 1, k \neq i} (z - x_{k}),

(17)

and

R_{o} (z) = 2 \sum_{s_{j} = - 1} w_{j} \prod_{s_{ℓ} = - 1, ℓ \neq j} (z - x_{ℓ}) .

(18)

Central associated polynomials are

R_{(n - 1) / 2} = R_{o},

R_{n / 2} (z) = \frac{R_{o} (z) + R_{e} (z)}{2},

(19)

R_{(n + 1) / 2} (z) = \frac{(z - \sum_{i} s_{i} x_{i}) R_{o} (z) + R_{e} (z)}{2},

(20)

and

R_{(n - 2) / 2} (z) = \frac{R_{o} (z) - R_{e} (z)}{\sum_{i} s_{i} x_{i}} .

(21)

We summarize the main results as a theorem whose detailed proof can be found in [10].

Theorem 1.

Polynomials

T_{m}

, which are orthogonal with regard to probabilities (1), satisfy (5) and (6), with their central versions in (12), (14)–(16). Their associate polynomials satisfy (9) and (10), with the central versions given in (19)–(21). Central coefficients are given in (11) and (13). Inequality (8) is valid.

Coefficients

α_{(n - 1) / 2}, β_{n / 2}

are completely determined by

\sum_{i} s_{i} x_{i}

. Two other coefficients involve quadratic forms involving

s_{i} x_{i}

. We refer to these functions of observations as parity-based statistics and embark on their study.

3. Parity-Based Distributions

The main objects of interest in this section are the parity-based sums of the form

\sum_{i} s_{i} x_{i}

.

The finite set

{x_{1}, \dots, x_{n}}

in Section 2 can be considered as the representative points of a univariate statistical distribution [11]. Thus, we assume that it is a realization of n independent random variables with a common continuous distribution function

F (x) = F_{0} (x)

, whose density

f (x)

has all finite moments,

m_{p} = E x^{p} = \int x^{p} f (x) d x

, which determine F uniquely.

Denote by

s_{1}, \dots, s_{n}

the parity sequence corresponding to

x_{1}, \dots, x_{n}

. To start exploring the behavior of the parity-based sums, notice that the distribution of a random parity

s_{i}

can be written as

P (s_{i} = ϵ) = \frac{1}{2 n} \sum_{r = 1}^{n} [1 + ϵ {(- 1)}^{r + n}] = \frac{1}{2} [1 + \frac{ϵ [1 - {(- 1)}^{n}]}{2 n}],

(22)

ϵ = \pm 1

, which does not depend on F.
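Formula (22) follows from the fact that the rank r of x_i is uniformly distributed on 1, …, n and s_i = (−1)^{n + r}. A short Python check, with illustrative function names, compares the direct count with the closed form.

```python
from fractions import Fraction

def parity_probability(n, eps):
    """P(s_i = eps) by counting ranks r in 1..n with (-1)^(n + r) = eps."""
    hits = sum(1 for r in range(1, n + 1) if (-1) ** (n + r) == eps)
    return Fraction(hits, n)

def closed_form(n, eps):
    """Right-hand side of (22)."""
    return Fraction(1, 2) * (1 + Fraction(eps * (1 - (-1) ** n), 2 * n))

table = {(n, eps): (parity_probability(n, eps), closed_form(n, eps))
         for n in range(1, 11) for eps in (1, -1)}
```

For even n the two parities are equally likely; for odd n the value +1 has probability (n + 1) / (2n).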

The joint density of

x_{i}

and its parity

s_{i}

is

f (x_{i}, s_{i} = ϵ) = \frac{f (x_{i})}{2} \{1 + ϵ {[2 F (x_{i}) - 1]}^{n - 1}\} .

(23)
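The factor in (23) is the conditional mean of the parity: given x_i, each of the other n − 1 observations exceeds x_i independently with probability 1 − F(x_i), and s_i is −1 raised to the number of such observations. The Python sketch below (function name assumed for illustration) sums the corresponding binomial probabilities and compares the result with [2F(x_i) − 1]^{n − 1}.

```python
from math import comb

def conditional_parity_mean(n, F):
    """E(s_i | x_i) with F = F(x_i), summed over the number k of larger observations."""
    return sum(comb(n - 1, k) * (1 - F) ** k * F ** (n - 1 - k) * (-1) ** k
               for k in range(n))

gaps = [abs(conditional_parity_mean(n, F) - (2 * F - 1) ** (n - 1))
        for n in (2, 5, 8) for F in (0.1, 0.5, 0.9)]
```

The conditional probability of s_i = ϵ is then {1 + ϵ [2F(x_i) − 1]^{n − 1}} / 2, which multiplied by f(x_i) gives (23).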

Therefore, for

i = 1, \dots, n

,

P (x_{i} \leq x, s_{i} = ϵ) = \frac{F (x)}{2} + \frac{ϵ \{{[2 F (x) - 1]}^{n} - {(- 1)}^{n}\}}{4 n} .

If f is symmetric, the distribution function of

y_{i} = s_{i} x_{i}

is

G (y) = P (x_{i} < y, s_{i} = 1) + P (x_{i} > - y, s_{i} = - 1)

(24)

= F (y) + \frac{[1 + {(- 1)}^{n}] \{{[2 F (y) - 1]}^{n} - 1\}}{4 n} .

These formulas can be derived from the distribution of order statistics whose rank has the same parity as the largest observation. For example, the conditional density of

x_{i}

and

x_{j}

for given

s_{i} = ϵ_{1}, s_{j} = ϵ_{2}

, has the form

f (x_{i}, x_{j} | s_{i} = ϵ_{1}, s_{j} = ϵ_{2}) = \frac{f (x_{i}) f (x_{j})}{4 P (s_{i} = ϵ_{1}, s_{j} = ϵ_{2})} \{1 + ϵ_{1} {(- 1)}^{n + r_{1}} {[1 - 2 F (x_{i})]}^{n - 2}

(25)

+ ϵ_{2} {(- 1)}^{n + r_{2}} {[1 - 2 F (x_{j})]}^{n - 2} + ϵ_{1} ϵ_{2} {(- 1)}^{r_{1} + r_{2}} {[1 - 2 | F (x_{i}) - F (x_{j}) |]}^{n - 2}\},

where

r_{1}

and

r_{2}

,

1 \leq r_{1} \neq r_{2} \leq 2

, are relative ranks for

x_{i}

and

x_{j}

, say,

r_{1} = 1

if

x_{i} < x_{j}

.

The next result provides the joint density of m-sub-vectors

x_{n_{1}}, \dots, x_{n_{m}}

and

s_{n_{1}}, \dots, s_{n_{m}}

, the relevant conditional distributions, as well as the form of the moments of parity sums. Here, m is a fixed integer,

1 \leq m \leq n

. Thus,

M = {1, \dots, m} \subset N = {1, \dots, n}

.

Theorem 2.

The exchangeable distribution of

x_{n_{1}}, \dots, x_{n_{m}}

and

s_{n_{1}}, \dots, s_{n_{m}}

, is provided by (28). The conditional density of

x_{n_{j}}, j \in M

, for a given value of the product

s_{n_{1}} \dots s_{n_{m}}

satisfies (33). The moments of the parity sum,

\sum_{i = 1}^{n} s_{i} x_{i},

can be found from (37).

Proof.

Let

z_{1} < \dots < z_{m}

be the order statistics corresponding to

x_{n_{j}}, j \in M

. The joint distribution of

z_{1}, \dots, z_{m}

and the parities can be represented as a mixture of the conditional densities for given ranks. A particular density enters this mixture if and only if

s_{n_{j}} = {(- 1)}^{n + R_{j}}

, where

R_{j}, 1 \leq R_{j} \leq n

, is the rank of

x_{n_{j}}

among the total sample.

Let

r_{i}

denote the rank of

x_{n_{i}}

within our subsample. Then

x_{n_{j}} = z_{r_{j}}

, and

s_{j}^{M} = {(- 1)}^{m + r_{j}}

is the parity of

x_{n_{j}}

in this subsample.

Since the probability of any rank combination is

{(\binom{n}{m})}^{- 1}

, the classical formula for the distribution of several order statistics [12] implies that the joint density of

z_{1}, \dots, z_{m}

and the corresponding parities is

f (z_{1}, \dots, z_{m}; s_{n_{1}}, \dots, s_{n_{m}}) = m! \prod_{1}^{m} f (z_{j})

(26)

\times \sum_{1 \leq p_{1} < \dots < p_{m} \leq n} (\begin{matrix} n - m \\ p_{1} - 1 & p_{2} - p_{1} - 1 & \dots & n - p_{m} \end{matrix}) \prod_{j = 1}^{m} \frac{[1 + s_{n_{j}} {(- 1)}^{n + p_{j}}]}{2}

\times Δ_{1}^{p_{1} - 1} Δ_{2}^{p_{2} - p_{1} - 1} \dots Δ_{m}^{p_{m} - p_{m - 1} - 1} Δ_{m + 1}^{n - p_{m}} .

Here, under the convention that

F (z_{m + 1}) = 1

,

F (z_{0}) = 0

,

Δ_{j} = F (z_{j}) - F (z_{j - 1})

are familiar

m + 1

spacings,

\sum_{j = 1}^{m + 1} Δ_{j} = 1

, which are known to have a Dirichlet distribution Dir_{m + 1} with positive concentration parameters

r_{1}, r_{2} - r_{1}, \dots, r_{m} - r_{m - 1}, n + 1 - r_{m}

[13]. Therefore,

\int \dots \int_{z_{1} < \dots < z_{m}} \prod_{j = 1}^{m + 1} {[F (z_{j}) - F (z_{j - 1})]}^{r_{j} - r_{j - 1} - 1} \prod_{j = 1}^{m} f (z_{j}) d z_{j}

= \frac{Γ (r_{1}) Γ (r_{2} - r_{1}) \dots Γ (r_{m} - r_{m - 1}) Γ (n + 1 - r_{m})}{m! Γ (n + 1)} .

Integration of (26) over

z_{1}, \dots, z_{m}

gives our first combinatorial identity,

2^{m} (\binom{n}{m}) P (s_{n_{j}} = ϵ_{j}, j = 1, \dots, m) = \sum_{1 \leq p_{1} < \dots < p_{m} \leq n} \prod_{j = 1}^{m} [1 + ϵ_{j} {(- 1)}^{n + p_{j}}] .

(27)
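Identity (27) lends itself to direct enumeration. In the Python sketch below the j-th sign is paired with the j-th smallest rank p_j, exactly as the right-hand side of (27) is written; the function name is an illustrative assumption.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def parity_vector_probability(n, eps):
    """Right-hand side of (27) divided by 2^m * C(n, m), for a sign vector eps of length m."""
    m = len(eps)
    total = 0
    for ranks in combinations(range(1, n + 1), m):   # p_1 < ... < p_m
        term = 1
        for e, p in zip(eps, ranks):
            term *= 1 + e * (-1) ** (n + p)
        total += term
    return Fraction(total, 2 ** m * comb(n, m))

n, m = 6, 3
probs = {eps: parity_vector_probability(n, eps)
         for eps in product((1, -1), repeat=m)}
```

Each rank combination contributes 2^m to exactly one sign vector, so the values sum to one; for m = 1 the result reduces to (22).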

By replacing the summation variables in (26) with

p_{1} - 1, p_{2} - p_{1} - 1, \dots, n - p_{m}

and using multinomial theorem, we arrive at the form of the joint distribution of

x_{n_{1}}, \dots, x_{n_{m}}

and

s_{n_{1}}, \dots, s_{n_{m}}

,

f (x_{n_{1}}, \dots, x_{n_{m}}; s_{n_{1}}, \dots, s_{n_{m}})

(28)

= \frac{\prod_{j} f (x_{n_{j}})}{2^{m}} \sum_{k = 0}^{m} {(- 1)}^{n k} \sum_{K \subset M, | K | = k} [\prod_{i \in K} {(- 1)}^{r_{i}} {[1 - 2 \sum_{i \in K} s_{i}^{K} F (x_{n_{i}})]}^{n - m}] \prod_{i \in K} s_{n_{i}}

= \frac{\prod_{j} f (x_{n_{j}})}{2^{m}} \sum_{k = 0}^{m} {(- 1)}^{(n - m) k} \sum_{K \subset M, | K | = k} \prod_{i \in K} s_{n_{i}} s_{i}^{M} {[1 - 2 \sum_{i \in K} s_{i}^{K} F (x_{n_{i}})]}^{n - m} .

Here, for

i \in K

,

s_{i}^{K} = sign (\prod_{j \neq i, j \in K} (x_{n_{i}} - x_{n_{j}})),

is the parity of

x_{n_{i}}

in the subsample

x_{n_{j}}, j \in K

, where k is the cardinality of K. Thus, this joint density is a linear function of parity products,

\prod_{i \in K} s_{n_{i}}, K \subset M

.

The joint Dirichlet distribution of spacings implies that

\sum_{j \in K} s_{j}^{K} F (x_{j})

is beta-distributed with parameters

(q, m + 1 - q), q = q_{k},

where for any subset K of M,

q_{k} = j_{k} - j_{k - 1} + \dots - {(- 1)}^{k} j_{1} = \sum_{i \in K} {(- 1)}^{j_{i} + k} j_{i}, q_{k} \geq 0 = q_{0} .

(29)

Clearly,

{(- 1)}^{q_{k}} = {(- 1)}^{\sum_{i \in K} i} = {(- 1)}^{\sum_{j \in K} r_{j}}

. If k is even,

k / 2 \leq q_{k} \leq m - k / 2;

for odd k,

(k + 1) / 2 \leq q_{k} \leq m - (k - 1) / 2,

so that

q_{m} = \sum_{j = 1}^{m} {(- 1)}^{j + m} j = ⌈ m / 2 ⌉ = m / 2 + [1 - {(- 1)}^{m}] / 4,

(30)

and

{(- 1)}^{q_{m}} = {(- 1)}^{m (m + 1) / 2}

.

Thus, density (28) is related to the classical hypergeometric function,

{}_{2}F_{1} (m - n, q; m + 1; z) = F (m - n, q; m + 1; z)

(actually a polynomial in z of degree

n - m

). Our argument shows the special role of the specific value

z = 2,

which appears as the factor at

\sum s_{j}^{K} F (x_{j})

in (28).

Indeed, by using the fundamental integral representation of the hypergeometric function [14], and setting

q = q_{k}

, one obtains the following expression:

E {[1 - 2 \sum_{j \in K} s_{j}^{K} F (x_{j})]}^{n - m} = \frac{\int_{0}^{1} {(1 - 2 u)}^{n - m} u^{q - 1} {(1 - u)}^{m - q} d u}{B (q, m + 1 - q)}

(31)

= F (m - n, q; m + 1; 2) .

Observe that for

ϵ_{j} = \pm 1, j \in M

,

2^{m}

functions,

\prod_{K} ϵ_{i}, K \subset M

, are linearly independent. Indeed, they are orthogonal under the natural inner product,

\sum_{ϵ_{j} = \pm 1, j \in M} \prod_{K} ϵ_{i} \prod_{L} ϵ_{i} = 2^{m} δ_{L, K},

where K and L are fixed subsets of M. Also,

\sum_{ϵ_{j} = \pm 1, j \in M, \prod_{K} ϵ_{i} = ϵ} \prod_{L} ϵ_{ℓ} = 2^{m - 1} [δ_{\emptyset, K} + ϵ δ_{L, K}] .
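The orthogonality of the parity products can be verified by enumerating all sign vectors. The following Python sketch does this for m = 3; the names are illustrative, and subsets are represented by tuples of zero-based indices.

```python
from itertools import combinations, product
from math import prod

def inner_product(K, L, m):
    """Sum over all sign vectors of prod_{i in K} eps_i times prod_{i in L} eps_i."""
    return sum(prod(e[i] for i in K) * prod(e[i] for i in L)
               for e in product((1, -1), repeat=m))

m = 3
subsets = [K for k in range(m + 1) for K in combinations(range(m), k)]
gram = {(K, L): inner_product(K, L, m) for K in subsets for L in subsets}
```

The Gram matrix is 2^m times the identity, which is the stated orthogonality and implies the linear independence of the 2^m products.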

Simplifying the notation from

x_{n_{j}}

to

x_{j}

, we see that the joint density of

x_{j}, j \in M

, and of the product,

\prod_{K} s_{j}

,

K \subset M

, has the form

f (x_{j}, j \in M; \prod_{i \in K} s_{i} = ϵ)

(32)

= \frac{\prod_{M} f (x_{j})}{2} \{1 + ϵ {(- 1)}^{n k + \sum_{K} r_{j}} {[1 - 2 \sum_{j \in K} s_{j}^{K} F (x_{j})]}^{n - m}\} .

When

K = M

, one obtains

E (\prod_{M} s_{j} | x_{j}, j \in M) = {(- 1)}^{n m + m (m + 1) / 2} {[1 - 2 \sum_{M} s_{j}^{M} F (x_{j})]}^{n - m},

(33)

so that by (31) with

q_{m}

given in (30),

E \prod_{M} s_{j} = {(- 1)}^{n m + m (m + 1) / 2} F (m - n, q_{m}; m + 1; 2) .

(34)

To derive formulas for the moments of the parity-based sum,

\sum_{i = 1}^{n} s_{i} x_{i},

we need an extension of (32) for positive integers

ν_{j}, j = 1, \dots, m

. For this purpose, the form of the joint density of

x_{j}, j \in M

, and the product of the corresponding parities,

\prod_{1}^{m} s_{j}^{ν_{j}}

, is desired.

This density can be obtained from (32), since

\prod_{1}^{m} s_{j}^{ν_{j}} = ϵ

, if and only if with

D = {j, j = 1, \dots, m, ν_{j}

odd }, one has

\prod_{j \in D} s_{j} = ϵ .

Thus, with d denoting the cardinality of D and

r_{j}

still denoting the rank of

x_{j}

within the subsample,

E (\prod_{1}^{m} s_{j}^{ν_{j}} | x_{j}, j \in M) = {(- 1)}^{n d + \sum_{D} r_{j}} {[1 - 2 \sum_{j \in D} s_{j}^{D} F (x_{j})]}^{n - m},

and

E \prod_{j = 1}^{m} {(s_{j} x_{j})}^{ν_{j}} = {(- 1)}^{n d} E {(- 1)}^{\sum_{j \in D} r_{j}} {[1 - 2 \sum_{j \in D} s_{j}^{D} F (x_{j})]}^{n - m} \prod_{j = 1}^{m} x_{j}^{ν_{j}}

(35)

= {(- 1)}^{n d + d (d + 1) / 2} E \prod_{j \in D} x_{j}^{ν_{j}} {[1 - 2 \sum_{j \in D} s_{j}^{D} F (x_{j})]}^{n - m} \prod_{i \notin D} m_{ν_{i}} [1 - 2 \sum_{j \in D} s_{j}^{D} F_{ν_{i}} (x_{j})],

where

F_{ν} (x) = \int_{- \infty}^{x} u^{ν} f (u) d u / m_{ν}, ν = 0, 2, 4, \dots,

s_{i}^{D} = sign (\prod_{j \neq i, j \in D} (x_{i} - x_{j}))

.

To prove (35), we evaluate the following conditional expectation:

E (\prod_{i : ν_{i} even} {(- 1)}^{r_{i}} x_{i}^{ν_{i}} | x_{j}, j \in D) = {(- 1)}^{m (m - d)} [\prod_{i \neq ℓ : i, ℓ \notin D} sign (x_{i} - x_{ℓ})]

\times \prod_{i \notin D} m_{ν_{i}} E (\prod_{i : ν_{i} even} \prod_{k : ν_{k} odd} sign [\prod_{k \neq i} (x_{i} - x_{k})] x_{i}^{ν_{i}} | x_{j}, j \in D)

= {(- 1)}^{m (m - d) + (m - d) (m - d - 1) / 2} \prod_{i \notin D} m_{ν_{i}} [1 - 2 \sum_{D} s_{j}^{D} F_{ν_{i}} (x_{j})] .

Identity (35) is valid when some

ν_{i}, i \notin D

vanish. Thus, for any non-negative integers

ν_{j}, j = 1, \dots, n

,

E \prod_{j = 1}^{n} {(s_{j} x_{j})}^{ν_{j}} = {(- 1)}^{n d + d (d + 1) / 2} E \prod_{j \in D} x_{j}^{ν_{j}} \prod_{i \notin D} m_{ν_{i}} [1 - 2 \sum_{j \in D} s_{j}^{D} F_{ν_{i}} (x_{j})],

(36)

D = {j, j = 1, \dots, n, ν_{j}

odd}.

A more general formula involves functions

ϕ_{j}, j = 1, \dots, n

,

E \prod_{j = 1}^{n} s_{j}^{ν_{j}} ϕ_{j} (x_{j}) = {(- 1)}^{n d + d (d + 1) / 2} E \prod_{j \in D} ϕ_{j} (x_{j})

\times \prod_{i \notin D} [\int_{- \infty}^{\infty} ϕ_{i} (u) f (u) d u - 2 \sum_{j \in D} s_{j}^{D} \int_{- \infty}^{x_{j}} ϕ_{i} (u) f (u) d u] .

According to (36),

E {[\sum_{1}^{n} s_{j} x_{j}]}^{p} = \sum_{\sum ν_{j} = p} (\begin{matrix} p \\ ν_{1} & \dots & ν_{n} \end{matrix}) {(- 1)}^{n d + d (d + 1) / 2}

(37)

\times E \prod_{j : ν_{j} odd} x_{j}^{ν_{j}} \prod_{i : ν_{i} even} m_{ν_{i}} [1 - 2 \sum_{D} s_{j}^{D} F_{ν_{i}} (x_{j})]

where d is as above and

p - d

is even; (37) presents the correct version of Formula (34) in [10]. □

By using (28), one can find the joint (symmetric) density of

y_{n_{1}}, \dots, y_{n_{m}}

. For example, when f is assumed symmetric, the distribution of

y_{1} = s_{1} x_{1}

and

y_{2} = s_{2} x_{2}

is exchangeable, so that it suffices to determine its density when

| y_{1} | < | y_{2} |

. Then

ϵ_{1} y_{1} \land ϵ_{2} y_{2} = - | y_{2} |,

if

ϵ_{2} = - sign (y_{2}); = ϵ_{1} y_{1},

otherwise. Similarly,

ϵ_{1} y_{1} \lor ϵ_{2} y_{2} = | y_{2} |,

when

ϵ_{2} = sign (y_{2}); = ϵ_{1} y_{1},

otherwise. Thus,

\sum_{ϵ_{1} y_{1} < ϵ_{2} y_{2}} ϵ_{1} {[1 - 2 F (ϵ_{1} y_{1})]}^{n - 2} + \sum_{ϵ_{2} y_{2} < ϵ_{1} y_{1}} ϵ_{2} {[1 - 2 F (ϵ_{2} y_{2})]}^{n - 2}

= [1 - {(- 1)}^{n}] {[1 - 2 F (y_{1})]}^{n - 2} - 2 sign (y_{2}) {[1 - 2 F (- | y_{2} |)]}^{n - 2},

and

\sum_{ϵ_{1} y_{1} < ϵ_{2} y_{2}} ϵ_{2} {[1 - 2 F (ϵ_{2} y_{2})]}^{n - 2} + \sum_{ϵ_{2} y_{2} < ϵ_{1} y_{1}} ϵ_{1} {[1 - 2 F (ϵ_{1} y_{1})]}^{n - 2}

= [1 - {(- 1)}^{n}] {[1 - 2 F (y_{1})]}^{n - 2} + 2 sign (y_{2}) {[1 - 2 F (| y_{2} |)]}^{n - 2} .

Since

\sum_{ϵ_{1}, ϵ_{2}} ϵ_{1} ϵ_{2} {[1 - 2 | F (ϵ_{1} y_{1}) - F (ϵ_{2} y_{2}) |]}^{n - 2}

= 2 sign (y_{2}) \{{[1 - 2 | F (y_{1}) - F (| y_{2} |) |]}^{n - 2} - {[1 - 2 | F (y_{1}) + F (| y_{2} |) - 1 |]}^{n - 2}\}

= 2 \{{[1 - 2 | F (y_{1}) - F (y_{2}) |]}^{n - 2} - {[1 - 2 | F (y_{1}) + F (y_{2}) - 1 |]}^{n - 2}\},

it follows that the joint density of

y_{1}, y_{2}

when

| y_{1} | < | y_{2} |

is

g (y_{1}, y_{2}) = \sum_{ϵ_{1}, ϵ_{2} = \pm 1} f (ϵ_{1} y_{1}, ϵ_{2} y_{2}, s_{i} = ϵ_{1}, s_{j} = ϵ_{2})

(38)

= f (y_{1}) f (y_{2}) \{1 + \frac{sign (y_{2}) [1 + {(- 1)}^{n}]}{2} {[1 - 2 F (| y_{2} |)]}^{n - 2}

- \frac{1}{2} \{{[1 - 2 | F (y_{1}) - F (y_{2}) |]}^{n - 2} - {[1 - 2 | F (y_{1}) + F (y_{2}) - 1 |]}^{n - 2}\}\} .

In the symmetric case,

E {(\sum_{i = 1}^{n} s_{i} x_{i})}^{p} = 0,

for odd n and p. Indeed,

\sum_{i = 1}^{n} s_{i} x_{i}

up to multiple

{(- 1)}^{n}

coincides with the parity sum derived from the sample

(- x_{1}, \dots, - x_{n})

, which is equidistributed with x’s.

If

p = 1

, the first moment can be derived from (23); the form of the second moment follows from (25),

E {(\sum_{i = 1}^{n} s_{i} x_{i})}^{2} = n m_{2} - n (n - 1) E x_{1} x_{2} {[1 - 2 | F (x_{2}) - F (x_{1}) |]}^{n - 2} .

When

p = 3 \leq n

,

E {(\sum_{i = 1}^{n} s_{i} x_{i})}^{3} = n E s_{1} x_{1}^{3} + 3 n (n - 1) E s_{1} x_{1} x_{2}^{2} + n (n - 1) (n - 2) E \prod_{1}^{3} s_{i} x_{i}

= n \int x^{3} f (x) {[2 F (x) - 1]}^{n - 1} d x

- 3 n (n - 1) \int x f (x) {[2 F (x) - 1]}^{n - 2} [m_{2} - 2 \int_{- \infty}^{x} y^{2} f (y) d y] d x

+ 6 n (n - 1) (n - 2) {\int \int}_{x < y < z} x y z f (x) f (y) f (z)

\times {[1 - 2 F (z) + 2 F (y) - 2 F (x)]}^{n - 3} d x d y d z .

When

p = 4 \leq n

,

E {(\sum_{i = 1}^{n} s_{i} x_{i})}^{4} = n m_{4} + 3 n (n - 1) m_{2}^{2} - 4 n (n - 1) E x_{1} x_{2}^{3} {[1 - 2 | F (x_{2}) - F (x_{1}) |]}^{n - 2}

- 6 n (n - 1) (n - 2) m_{2} E x_{1} x_{2} x_{3}^{2} [1 - 2 | F_{2} (x_{2}) - F_{2} (x_{1}) |] {[1 - 2 | F (x_{2}) - F (x_{1}) |]}^{n - 3}

+ E \prod_{1}^{4} x_{i} {[1 - 2 \sum_{1}^{4} s_{j} F (x_{j})]}^{n - 4} .

4. Parities and Hypergeometric Function

For fixed

m, 1 \leq m \leq n

, the joint distribution of parities

s_{n_{1}}, \dots, s_{n_{m}}

is obtained in Theorem 2, whose notation we follow. According to (27) and (28), if

1 \leq m \leq n,

P (s_{n_{j}} = ϵ_{j}, j = 1, \dots, m) = \frac{\sum_{1 \leq p_{1} < \dots < p_{m} \leq n} \prod_{j = 1}^{m} [1 + ϵ_{j} {(- 1)}^{n + p_{j}}]}{2^{m} (\binom{n}{m})}

(39)

= \frac{1}{2^{m}} \sum_{k = 0}^{m} {(- 1)}^{n k + k (k + 1) / 2} F (k - n, q_{k}; k + 1; 2) k! E_{k} (\boldsymbol{ϵ}) .

Here, for

\boldsymbol{ϵ} = (ϵ_{1}, \dots, ϵ_{m})

,

E_{k} (\boldsymbol{ϵ}) = \sum_{K \subset M, | K | = k} \prod_{i \in K} ϵ_{i},

is the k-th elementary symmetric function whose values depend only on d, the number of

ϵ

’s equal to

- 1

. Indeed

\prod_{j = 1}^{m} (z - ϵ_{j}) = {(z + 1)}^{d} {(z - 1)}^{m - d} = \sum_{k = 0}^{m} {(- 1)}^{k} E_{k} (\boldsymbol{ϵ}) z^{m - k},

so that

E_{k} (\boldsymbol{ϵ}) = {(- 1)}^{k} \sum_{ℓ = 0}^{k} {(- 1)}^{ℓ} (\binom{m - d}{ℓ}) (\binom{d}{k - ℓ}) .

Now we give explicit formulas for

F (m - n, q_{k}; k + 1; 2)

involving double factorials. See [15] for a survey of related combinatorial identities, and [16] for further instances of closed-form expressions for this function at specific arguments.

Theorem 3.

If

1 \leq k \leq m \leq n, n \geq 2

, are positive integers, the following identities hold for the hypergeometric function:

F (m - n, (k + 1) / 2; k + 1; 2) = \frac{(n - m - 1)!! k!!}{(n - m + k)!!}, n - m even, k odd,

(40)

F (m - n, (k + 1) / 2; k + 1; 2) = 0, n - m odd, k odd,

(41)

F (m - n, k / 2; k + 1; 2) = F (m - n, (k + 2) / 2; k + 1; 2)

(42)

= \frac{(n - m - 1)!! (k - 1)!!}{(n - m + k - 1)!!}, n - m even, k even,

F (m - n, k / 2; k + 1; 2) = - F (m - n, (k + 2) / 2; k + 1; 2)

(43)

= \frac{(n - m)!! (k - 1)!!}{(n - m + k)!!}, n - m odd, k even .

For any positive integers

q, 1 \leq q \leq k,

and p,

F (- p, q; m + 1; 2) = {(- 1)}^{p} F (- p, m + 1 - q; m + 1; 2),

(44)

and (46) is valid.

Proof.

To prove Formulas (40)–(43), we use the well-known facts about the hypergeometric function. According to 15.8.13 in [14]

F (m - n, (k + 1) / 2; k + 1; z)

= \frac{{(2 - z)}^{n - m}}{2^{n - m}} F (\frac{m - n}{2}, \frac{m - n + 1}{2}; \frac{k + 2}{2}; \frac{z^{2}}{{(2 - z)}^{2}}) .

If

n - m

is even, this identity means that

F (m - n, (k + 1) / 2; k + 1; z) = \frac{1}{2^{n - m}}

(45)

\times \sum_{0 \leq j \leq (n - m) / 2} \frac{{[(m - n) / 2]}^{\bar{j}} {[(m - n + 1) / 2]}^{\bar{j}}}{{[(k + 2) / 2]}^{\bar{j}} j!} z^{2 j} {(2 - z)}^{n - m - 2 j} .

Here, for any real a and non-negative integer j,

{[a]}^{\bar{j}} = a (a + 1) \dots (a + j - 1) = Γ (a + j) / Γ (a)

is the ascending factorial.

The only term of the finite series in the right-hand side of (45) without a positive power of

(2 - z)

corresponds to

j = j_{0} = (n - m) / 2

. Its coefficient equals

\frac{{[(m - n) / 2]}^{{\bar{j}}_{0}} {[(m - n + 1) / 2]}^{{\bar{j}}_{0}}}{{[(k + 2) / 2]}^{{\bar{j}}_{0}} j_{0}!} = \frac{Γ (\frac{n - m + 1}{2}) Γ (\frac{k + 2}{2})}{Γ (\frac{1}{2}) Γ (\frac{n - m + k + 2}{2})},

which is seen to coincide with (40). This coefficient vanishes when

n - m

is odd, implying (41).

One has

\frac{k z}{2} F (m - n, \frac{k}{2}; k + 1; z)

= k F (m - n - 1, \frac{k}{2}; k; z) + k (z - 1) F (m - n, \frac{k}{2}; k; z),

15.5.16 in [14] leading to (42) and (43).

Identity (39) means that for any positive integers,

k \leq m \leq n

,

\sum_{K \subset M, | K | = k} \sum_{1 \leq p_{1} < \dots < p_{m} \leq n} {(- 1)}^{\sum_{K} p_{j}} = {(- 1)}^{k (k + 1) / 2} (\binom{n}{m}) F (k - n, q_{k}; k + 1; 2)

(46)

= {(- 1)}^{n k} (\binom{n}{m}) E \prod_{K} s_{n_{j}} = {(- 1)}^{n k} (\binom{n}{m}) [2 P (\prod_{K} s_{n_{j}} = 1) - 1] .

In the last two formulas, K is any k-element subset of M. □
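The closed forms (42) and (43), and the reflection (44) taken with p = n − m, are easy to confirm in exact rational arithmetic. The following is a minimal sketch, not part of the original argument; the helper names `hyp2f1_terminating`, `double_factorial` and `closed_form` are illustrative, and N stands for n − m.

```python
from fractions import Fraction


def hyp2f1_terminating(N, b, c, z):
    """Exact value of the finite series F(-N, b; c; z) for an integer N >= 0."""
    total, term = Fraction(0), Fraction(1)
    for j in range(N + 1):
        total += term
        # ratio of consecutive terms: (-N + j)(b + j) z / ((c + j)(j + 1))
        term *= Fraction(-N + j) * (b + j) * z / ((c + j) * (j + 1))
    return total


def double_factorial(n):
    """n!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def closed_form(N, k):
    """Right-hand side of (42) (N even) or (43) (N odd) for even k, N = n - m."""
    if N % 2 == 0:
        return Fraction(double_factorial(N - 1) * double_factorial(k - 1),
                        double_factorial(N + k - 1))
    return Fraction(double_factorial(N) * double_factorial(k - 1),
                    double_factorial(N + k))


for N in range(0, 9):
    for k in (2, 4, 6):
        value = hyp2f1_terminating(N, Fraction(k, 2), k + 1, 2)
        assert value == closed_form(N, k)
        # reflection q -> k + 1 - q with q = k / 2; the sign is (-1)^N
        mirrored = hyp2f1_terminating(N, Fraction(k + 2, 2), k + 1, 2)
        assert value == (-1) ** N * mirrored
```

Working with `Fraction` avoids any rounding, so the assertions test the identities themselves rather than a floating-point approximation.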

The values of the hypergeometric function entering (39) and other formulas with

0 \leq m \leq n

,

q_{m} = ⌈ m / 2 ⌉

, can be summarized as follows:

F (m - n, q_{m}; m + 1; 2) = \{\begin{matrix} \frac{(m - 1)!! [1 + {(- 1)}^{m}]}{2 (n - 1) (n - 3) \dots (n - m + 1)} & n even \\ \frac{(m - 1)!!}{n (n - 2) \dots (n - m + 2)} & n odd, m even \\ \frac{m!!}{n (n - 2) \dots (n - m + 1)} & n odd, m odd . \end{matrix}

(47)

When

m = 0

, this function takes its largest value,

F (- n, 0; 1; 2) = 1

. If

q = m + 1

,

F (m - n, m + 1; m + 1; 2) = {(- 1)}^{n - m}

.

It is immediate that when

n \to \infty

and m is fixed,

F (m - n, q_{m}; m + 1; 2) \sim \{\begin{matrix} \frac{(m - 1)!! [1 + {(- 1)}^{m}]}{2 n^{m / 2}} & n even \\ \frac{(m - 1)!!}{n^{m / 2}} & n odd, m even \\ \frac{m!!}{n^{(m + 1) / 2}} & n odd, m odd . \end{matrix}

Now by using (39), one gets the first-order approximation for fixed

m \geq 2

,

2^{m} P (s_{n_{j}} = ϵ_{j}, j = 1, \dots, m) = 1 + \frac{m}{2 n} [ϵ_{m} - {(- 1)}^{n} ϵ_{1} - \sum_{j = 1}^{m - 1} ϵ_{j} ϵ_{j + 1}] + O (\frac{1}{n^{2}}),

(48)

which indicates the deviation from uniformity of the distribution of

(s_{n_{1}}, \dots, s_{n_{m}})

.

Another proof of Formulas (40)–(43) in Theorem 3 uses the following generating function:

\sum_{k = 0}^{\infty} F (- k, (p + 1) / 2; p + 1; 2) z^{k}

= \frac{1}{B ((p + 1) / 2, (p + 1) / 2)} \int_{0}^{1} \frac{u^{(p - 1) / 2} {(1 - u)}^{(p - 1) / 2} d u}{1 - z^{2} {(1 - 2 u)}^{2}} = F (1, \frac{1}{2}; \frac{p + 2}{2}; z^{2}),

which is an even function of z when p is odd. If p is even,

\sum_{k even} F (- k, p / 2; p + 1; 2) z^{k} + \sum_{k odd} F (- k, p / 2; p + 1; 2) \frac{k z^{k}}{p + k}

= \sum_{k even} F (- k, p / 2; p + 1; 2) (z + 1) z^{k}

= \frac{1}{B (p / 2, (p + 2) / 2)} \int_{0}^{1} \frac{u^{(p - 2) / 2} {(1 - u)}^{p / 2} d u}{1 - z^{2} (1 - 2 u)} = F (1, \frac{1}{2}; \frac{p + 1}{2}; z^{2}) .

These facts can be found in 15.15.1 [14].
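For odd p, the first generating function can be checked coefficient by coefficient. The sketch below assumes that the third parameter of the hypergeometric function is p + 1, as in the even-p display; the function names are illustrative. It compares F(−k, (p + 1)/2; p + 1; 2) with the coefficient of z^k in F(1, 1/2; (p + 2)/2; z^2), which vanishes for odd k.

```python
from fractions import Fraction


def hyp2f1_terminating(N, b, c, z):
    """Exact value of the finite series F(-N, b; c; z) for an integer N >= 0."""
    total, term = Fraction(0), Fraction(1)
    for j in range(N + 1):
        total += term
        term *= Fraction(-N + j) * (b + j) * z / ((c + j) * (j + 1))
    return total


def rising(a, j):
    """Ascending factorial a (a + 1) ... (a + j - 1)."""
    out = Fraction(1)
    for i in range(j):
        out *= a + i
    return out


def series_coefficient(p, k):
    """Coefficient of z**k in F(1, 1/2; (p + 2)/2; z**2)."""
    if k % 2:
        return Fraction(0)
    j = k // 2
    # (1)_j / j! = 1, so only the two remaining ascending factorials survive
    return rising(Fraction(1, 2), j) / rising(Fraction(p + 2, 2), j)


for p in (1, 3, 5, 7):
    for k in range(0, 11):
        lhs = hyp2f1_terminating(k, Fraction(p + 1, 2), p + 1, 2)
        assert lhs == series_coefficient(p, k)
```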

It is well known that the probabilities defining the classical hypergeometric distribution with parameters

m, n_{e}, n_{o} = n - n_{e}

, can be determined from its probability generating function, which is the (finite) hypergeometric series

F (- m, - n_{e}; n_{o} - m + 1; z)

.

Therefore, the probability that such a random variable takes an even value (under any positive integers

n_{e}

and

n_{o}, n_{e} + n_{o} = n

) is

\sum_{\max (0, m - n_{e}) \leq k \leq \min (n_{o}, m)} \frac{[1 + {(- 1)}^{k}] (\binom{n_{o}}{k}) (\binom{n_{e}}{m - k})}{2 (\binom{n}{m})}

= \frac{1}{2} [1 + \frac{(n - m)! n_{o}!}{n! (n_{o} - m)!} F (- m, - n_{e}; n_{o} - m + 1; - 1)] .

If

n_{e} - n_{o} = [1 - {(- 1)}^{n}] / 2,

this probability coincides with

P (s_{1} \dots s_{m} = 1)

whose expression through

F (m - n, q_{m}; m + 1; 2)

is given in Theorem 3.

The joint distribution of

s_{1}, \dots, s_{m}

, which define the traditional hypergeometric random variable

(s_{1} + \dots + s_{m} + m) / 2

,

\frac{1}{2^{m}} \prod_{k = 1}^{m} [1 + \frac{ϵ_{k} (n_{e} - n_{o} - ϵ_{k - 1} - \dots - ϵ_{1})}{n - k + 1}],

(49)

differs from (39). Indeed, the parities

s_{1}, \dots, s_{m}, 1 \leq m \leq n

, in (39) are special because of their association with ranks of the subsample.

For two disjoint subsets K and

K^{'}

of M and any L,

\sum_{ϵ_{i} = \pm 1, \prod_{K} ϵ_{j} = ϵ_{1}, \prod_{K^{'}} ϵ_{j} = ϵ_{2}} \prod_{L} ϵ_{ℓ}

= 2^{m - 2} [δ_{\emptyset, L} + ϵ_{1} δ_{K, L} + ϵ_{2} δ_{K^{'}, L} + ϵ_{1} ϵ_{2} δ_{K ⋃ K^{'}, L}],

so that one obtains

4 P (\prod_{i \in K} s_{i} = ϵ_{1}, \prod_{i \in L} s_{i} = ϵ_{2} | x_{j}, j \in M) = 1 + ϵ_{1} {(- 1)}^{n k + \sum_{K} r_{i}} {[1 - 2 \sum_{K} s_{i}^{K} F (x_{i})]}^{n - m}

(50)

+ ϵ_{2} {(- 1)}^{n ℓ + \sum_{L} r_{i}} {[1 - 2 \sum_{L} s_{i}^{L} F (x_{i})]}^{n - m}

+ ϵ_{1} ϵ_{2} {(- 1)}^{n (k + ℓ) + \sum_{K ⋃ L} r_{i}} {[1 - 2 \sum_{K ⋃ L} s_{i}^{K ⋃ L} F (x_{i})]}^{n - m} .

It follows that

P (\prod_{i \in K} s_{i} = ϵ_{1}, \prod_{i \in L} s_{i} = ϵ_{2}) = \frac{1}{4} [1 + {(- 1)}^{n k + q_{k}} F (m - n, q_{k}; m + 1; 2) ϵ_{1}

+ {(- 1)}^{n ℓ + q_{ℓ}} F (m - n, q_{ℓ}; m + 1; 2) ϵ_{2} + {(- 1)}^{n (k + ℓ) + q_{k + ℓ}} F (m - n, q_{k + ℓ}; m + 1; 2) ϵ_{1} ϵ_{2}],

with similar formulas for the joint distribution of products of s’s over several disjoint subsets.

As in Theorem 2, all joint probabilities are linear functions of

ϵ_{1}, ϵ_{2}

and

ϵ_{1} ϵ_{2}

,

4 P (s_{i} = ϵ_{1}, s_{j} = ϵ_{2}) = 1 + ϵ_{1} α + ϵ_{2} β + ϵ_{1} ϵ_{2} χ

with some coefficients

α, β, χ

. The degree of dependence of s’s (or of their Bernoulli versions

(s_{i} + 1) / 2

) can be measured via the correlation coefficient between

s_{i}

and

s_{j}

,

ρ = \frac{2 P (s_{i} s_{j} = 1) - 1 - [2 P (s_{i} = 1) - 1] [2 P (s_{j} = 1) - 1]}{4 \sqrt{P (s_{i} = 1) P (s_{i} = - 1) P (s_{j} = 1) P (s_{j} = - 1)}} = \frac{χ - α β}{\sqrt{(1 - α^{2}) (1 - β^{2})}} .

In our examples,

χ < α β

, and the dependence is negative.

In (39), if n is even,

β = - α = - χ = 1 / (n - 1);

when n is odd,

α = β = - χ = 1 / n .
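With these coefficients the correlation formula can be evaluated directly: a short calculation gives ρ = −1/n for even n and ρ = −1/(n − 1) for odd n, which confirms the negative dependence. The sketch below merely re-does this arithmetic; `correlation` and `coefficients_39` are illustrative names, and n starts at 3 because the denominator vanishes for n = 2.

```python
from fractions import Fraction
from math import isclose, sqrt


def correlation(alpha, beta, chi):
    """rho = (chi - alpha * beta) / sqrt((1 - alpha^2)(1 - beta^2))."""
    numerator = float(chi - alpha * beta)
    return numerator / sqrt(float((1 - alpha ** 2) * (1 - beta ** 2)))


def coefficients_39(n):
    """(alpha, beta, chi) as quoted in the text for (39)."""
    if n % 2 == 0:
        a = Fraction(1, n - 1)
        return -a, a, -a
    a = Fraction(1, n)
    return a, a, -a


for n in range(3, 12):
    rho = correlation(*coefficients_39(n))
    expected = -1 / n if n % 2 == 0 else -1 / (n - 1)
    assert isclose(rho, expected)
```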

More generally, in (50)

α = {(- 1)}^{n k + q_{k} + q_{m - k}} F (k - n, q_{k}; k + 1; 2), β = {(- 1)}^{n (m - k) + q_{k} + q_{m - k}} F (m - k - n, q_{m - k}; m - k + 1; 2), χ = {(- 1)}^{n m + q_{m}} F (m - n, q_{m}; m + 1; 2)

. Then if n is even, and

m, k

are odd,

ρ = 0

, which means that

\prod_{K} s_{n_{j}}

and

\prod_{K^{'}} s_{n_{j}}

are independent,

P (\prod_{M} s_{n_{j}} = \pm 1) = 1 / 2

.

When

n_{e} - n_{o} = [1 - {(- 1)}^{n}] / 2

, one achieves in (49),

α = β = [1 - {(- 1)}^{n}] / (2 n), χ = - [2 n + {(- 1)}^{n} - 1] / [2 n (n - 1)]

.
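The coefficients just quoted for (49) can be recovered by expanding the product formula for m = 2. The sketch below does this in exact arithmetic; the function names and the shorthand d = n_e − n_o are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def joint_probability(eps, n, d):
    """Product formula (49); d = n_e - n_o, eps is a tuple of +1 / -1 values."""
    prob = Fraction(1, 2 ** len(eps))
    partial_sum = 0                          # eps_1 + ... + eps_{k-1}
    for k, e in enumerate(eps, start=1):
        prob *= 1 + Fraction(e * (d - partial_sum), n - k + 1)
        partial_sum += e
    return prob


def pair_coefficients(n, d):
    """alpha, beta, chi in 4 P(s_1 = e1, s_2 = e2) = 1 + e1 alpha + e2 beta + e1 e2 chi."""
    table = {eps: joint_probability(eps, n, d)
             for eps in product((1, -1), repeat=2)}
    assert sum(table.values()) == 1          # (49) is a probability distribution
    alpha = sum(e1 * pr for (e1, e2), pr in table.items())
    beta = sum(e2 * pr for (e1, e2), pr in table.items())
    chi = sum(e1 * e2 * pr for (e1, e2), pr in table.items())
    return alpha, beta, chi


for n in range(2, 12):
    d = (1 - (-1) ** n) // 2                 # n_e - n_o = [1 - (-1)^n] / 2
    alpha, beta, chi = pair_coefficients(n, d)
    assert alpha == beta == Fraction(d, n)
    assert chi == -Fraction(2 * n + (-1) ** n - 1, 2 * n * (n - 1))
```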

These formulas may find further use in probability modeling and estimation of entropy of binary sequences [17].

5. Parity-Based Sums and Dirichlet Distribution

We start here with the identity, which is similar to the formulas (40)–(43) in Theorem 3. Namely, for any positive integers p and n with

ν_{i} \geq 0, i = 1, \dots n,

forming a weak decomposition of p into n (non-negative) parts, one has

\sum_{ν_{1} + \dots + ν_{n} = p} \prod_{i} s_{i}^{ν_{i}} = \{\begin{matrix} (\binom{(p + n - 2) / 2}{p / 2}) & n even, p even \\ 0 & n even, p odd \\ (\binom{(p + n - 1) / 2}{p / 2}) & n odd, p even \\ (\binom{(p + n - 2) / 2}{(p - 1) / 2}) & n odd, p odd . \end{matrix}

(51)

Indeed,

\sum_{ν_{1} + \dots + ν_{n} = p} 1 = \binom{p + n - 1}{n - 1}

, so that with

q = q_{n}

defined by (30), the sum in the left-hand side of (51) can be written as

f_{m, p} (q) = \sum_{k = 0}^{p} {(- 1)}^{k} (\binom{m - q + k - 1}{m - q - 1}) (\binom{q - 1 + p - k}{q - 1}),

(52)

which is the coefficient at

z^{p}

in the series expansion of

{(1 - z)}^{- q} {(1 + z)}^{q - m}

, cf. Section 1.3 in [18].

One has,

f_{m, p} (m - q) = {(- 1)}^{p} f_{m, p} (q), f_{m, 0} (q) = 1, f_{0, p} (q) = 0, p \geq 1 .
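Identity (51), Formula (52) and the symmetry relation above can be compared by brute force for small n and p. The following sketch assumes the sign pattern s_i = (−1)^{i+1}, so that exactly q_n = ⌈n/2⌉ of the signs equal +1; this indexing and all function names are assumptions made for illustration.

```python
from itertools import product
from math import comb


def neg_binomial_coeff(a, k):
    """Coefficient of z**k in (1 - z)**(-a) for an integer a >= 0."""
    if a == 0:
        return 1 if k == 0 else 0
    return comb(a + k - 1, k)


def f(m, p, q):
    """Formula (52): coefficient of z**p in (1 - z)**(-q) * (1 + z)**(q - m)."""
    return sum((-1) ** k * neg_binomial_coeff(m - q, k) * neg_binomial_coeff(q, p - k)
               for k in range(p + 1))


def brute_force(n, p):
    """Left-hand side of (51) with the assumed signs s_i = (-1)**(i + 1)."""
    signs = [(-1) ** (i + 1) for i in range(1, n + 1)]
    total = 0
    for nu in product(range(p + 1), repeat=n):
        if sum(nu) == p:                     # weak decomposition of p into n parts
            term = 1
            for s, v in zip(signs, nu):
                term *= s ** v
            total += term
    return total


def closed_form_51(n, p):
    """Right-hand side of (51)."""
    if n % 2 == 0:
        return comb((p + n - 2) // 2, p // 2) if p % 2 == 0 else 0
    if p % 2 == 0:
        return comb((p + n - 1) // 2, p // 2)
    return comb((p + n - 2) // 2, (p - 1) // 2)


for n in range(1, 6):
    q = (n + 1) // 2                         # q_n = ceil(n / 2)
    for p in range(1, 6):
        assert brute_force(n, p) == closed_form_51(n, p) == f(n, p, q)
        assert f(n, p, n - q) == (-1) ** p * f(n, p, q)
```

The enumeration grows like (p + 1)^n, so it is only meant for the small ranges used here.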

A formula similar to (52) for the generating function,

z^{m} {(1 - z)}^{- q} {(1 + z)}^{q - m}

, obtains for the partition of p (strictly positive

ν

’s,

\sum_{k = 1}^{m} ν_{k} = p)

.

Theorem 4.

For

f_{m, p} (q)

given in (52),

q = q_{m} = ⌈ m / 2 ⌉

, one has

f_{m + 1, p} (q) = (\binom{m + p}{p}) F (- p, q; m + 1; 2)

(53)

as well as (54) and (55).

Proof.

The equality (53) holds because of (40)–(43). Indeed, its left-hand part satisfies (51), which corresponds to the weak decomposition of p. By comparing this equality with the partition of p into m (positive size) blocks, one obtains a representation of

f_{m, p} (q), q = q_{m}

,

{(- 1)}^{m p} f_{m, p} (q) = \sum_{ν_{1} + \dots + ν_{m} = p} \prod_{i} {(- 1)}^{i ν_{i}}

= \sum_{K \subset M} \sum_{\sum_{K} ν_{i} = p, ν_{i} > 0} \prod_{i \in K} {(- 1)}^{i ν_{i}} = \sum_{K \subset M} {(- 1)}^{k} \sum_{\sum_{K} ν_{i} = p - k, ν_{i} \geq 0} \prod_{i \in K} {(- 1)}^{i ν_{i}}

= \sum_{K \subset M} {(- 1)}^{k (p - k + 1)} f_{k, p - k} (\frac{k + \sum_{K} {(- 1)}^{i}}{2}) .

Here,

k, 1 \leq k \leq m \land p

, denotes the cardinality of

K = \{i : ν_{i} > 0\}

, so that

[k + \sum_{K} {(- 1)}^{i}] / 2

is the multiplicity of 1 among

{(- 1)}^{i}, i \in K

.

The number of possible choices of the set K with the given multiplicity j of 1 in the set

\{{(- 1)}^{i}, i \in K\}

, is

(\binom{q}{j}) (\binom{m - q}{k - j})

. Therefore,

{(- 1)}^{m p} f_{m, p} (q) = \sum_{k = 1}^{m \land p} {(- 1)}^{k (p - k + 1)} \sum_{j = 0}^{k} (\binom{q}{j}) (\binom{m - q}{k - j}) f_{k, p - k} (j) .

(54)

An alternative representation of

f_{m, p} (q), p \geq 1,

results from the binomial theorem applied to

{(1 - s)}^{- q} {(1 + s)}^{q - m}, q = q_{m}

,

f_{m, p} (q) = \{\begin{matrix} \frac{1}{2^{q}} \sum_{j = 0}^{q} \binom{q}{j} f_{q, p} (j) & m even \\ \frac{1}{2^{q}} \sum_{j = 0}^{q} \binom{q}{j} [f_{q, p} (j) + f_{q, p - 1} (j)] & m odd . \end{matrix}

(55)

□

The coefficients

f_{m, p} (q_{m})

also appear in the conditional cumulative distribution function corresponding to (32). For positive N put,

U_{m}^{N} (y_{1}, \dots, y_{m}) = \int \dots \int_{- \infty < x_{j} < y_{j}, j = 1, \dots, m} {[{(- 1)}^{m} - 2 \sum_{j} {(- 1)}^{r_{j}} F (x_{j})]}^{N} \prod_{j} f (x_{j}) d x_{j}

(56)

= {(- 1)}^{m N} \int \dots \int_{0 < u_{j} < F (y_{j}), j = 1, \dots, m} {[1 - 2 \sum_{j} s_{j}^{M} u_{j}]}^{N} \prod_{j} d u_{j},

where

r_{j}

denotes the rank of

x_{j}

,

s_{j}^{M} = sign (\prod_{k \neq j, j, k \in M} (x_{j} - x_{k}))

.

Then, the mentioned distribution function can be expressed through

U_{m}^{n - m}

as

F_{ϵ} (x_{n_{1}}, \dots, x_{n_{m}}) = \frac{1}{2 P (\prod_{j} s_{n_{j}} = ϵ)}

\times [\prod_{M} F (x_{n_{j}}) + ϵ {(- 1)}^{m (m - 1) / 2} U_{m}^{n - m} (x_{n_{1}}, \dots, x_{n_{m}})],

which implies that

U_{m}^{N} (\infty, \dots, \infty) = {(- 1)}^{m N} F (- N, q_{m}; m + 1; 2)

.

Thus, under the Dirichlet distribution {Dir}_{m + 1} of

Δ_{1}, \dots, Δ_{m + 1}

, with all concentration parameters 1, one has for even m,

U_{m}^{N} (x, \infty, \dots, \infty) = \int_{0}^{F (x)} \int_{0}^{1} \dots \int_{0}^{1} {[1 - 2 (Δ_{m} + Δ_{m - 2} + \dots + Δ_{2})]}^{N} d {Dir}_{m + 1} .

Now we use the facts that the marginal distribution of

Δ_{1}

has a beta-density

β (Δ_{1})

with parameters

(1, m)

, and that the conditional distribution of

(Δ_{2}^{'}, \dots, Δ_{m + 1}^{'}) = (Δ_{2} / (1 - Δ_{1}), \dots, Δ_{m + 1} / (1 - Δ_{1}))

is

{Dir}_{m}

.

Thus, when m is even,

{(- 1)}^{m N} U_{m}^{N} (x, \infty, \dots, \infty)

(57)

= \int_{0}^{F (x)} \int_{0}^{1} \dots \int_{0}^{1} {[\frac{v_{1}}{1 - v_{1}} + 1 - 2 \sum_{i = 2, \dots, m} \frac{Δ_{i}}{1 - Δ_{1}}]}^{N} {(1 - v_{1})}^{N} β (v_{1}) d v_{1} d {Dir}_{m}

= \sum_{k = 0}^{N} \binom{N}{k} F (- k, q_{m}; m; 2) \int_{0}^{F (x)} v_{1}^{N - k} {(1 - v_{1})}^{k} β (v_{1}) d v_{1}

= {(\binom{m + N}{N})}^{- 1} \sum_{k = 0}^{N} \frac{f_{m + 1, k} (q_{m})}{m + k} I_{F (x)} (N - k + 1, m + k) .

Here we took advantage of the binomial theorem, (31) and (53); in the last equality

I_{p} (N - k + 1, m + k) = \int_{0}^{p} v^{N - k} {(1 - v)}^{m + k - 1} d v / B (N - k + 1, m + k), 0 \leq p \leq 1,

denotes the incomplete beta function.

Identity (57) also holds for odd values of m. For example,

U_{1}^{N} (x) = \frac{{(- 1)}^{N} \{1 - {[1 - 2 F (x)]}^{N + 1}\}}{2 (N + 1)},

and

U_{2}^{N} (x, \infty) = \frac{F (x)}{N + 1} - \frac{[1 - {(- 1)}^{N}] \{1 - {[1 - 2 F (x)]}^{N + 2}\}}{4 (N + 1) (N + 2)} .

Interest in U is due to the multivariate integration by parts, which also motivates the study of its derivatives in the next section.
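The two closed forms for U_1^N and U_2^N can be compared with a direct quadrature of the second integral in (56). The sketch below works in the variable t = F(x) and uses a plain midpoint rule; the step counts and tolerances are arbitrary illustrative choices, and the function names are not from the paper.

```python
def u1_closed(t, N):
    """Closed form of U_1^N(x) with t = F(x)."""
    return (-1) ** N * (1 - (1 - 2 * t) ** (N + 1)) / (2 * (N + 1))


def u2_closed(t, N):
    """Closed form of U_2^N(x, infinity) with t = F(x)."""
    correction = (1 - (-1) ** N) * (1 - (1 - 2 * t) ** (N + 2))
    return t / (N + 1) - correction / (4 * (N + 1) * (N + 2))


def u1_numeric(t, N, steps=2000):
    """Midpoint rule for (-1)^N * int_0^t (1 - 2u)^N du, the case m = 1 of (56)."""
    h = t / steps
    return (-1) ** N * h * sum((1 - 2 * (i + 0.5) * h) ** N for i in range(steps))


def u2_numeric(t, N, steps=200):
    """Midpoint rule for int_0^t int_0^1 (1 - 2|u - v|)^N dv du, the case m = 2."""
    hu, hv = t / steps, 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * hu
        for j in range(steps):
            v = (j + 0.5) * hv
            total += (1 - 2 * abs(u - v)) ** N
    return total * hu * hv


for N in (1, 2, 3):
    for t in (0.3, 0.8):
        assert abs(u1_numeric(t, N) - u1_closed(t, N)) < 1e-5
        assert abs(u2_numeric(t, N) - u2_closed(t, N)) < 1e-3
```

For m = 2 the signs are s = +1 for the larger argument and s = −1 for the smaller one, which is why the integrand reduces to (1 − 2|u − v|)^N.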

6. Partial Derivatives

For

1 \leq m \leq n

, we look at the properties of the symmetric function

G = G_{m}

, which so far is defined almost everywhere (for pairwise different x’s),

G (x_{1}, \dots, x_{m}) = {[1 - 2 \sum_{j = 1}^{m} s_{j} F (x_{j})]}^{n},

s_{j} = sign (\prod_{i : x_{i} \neq x_{j}} (x_{j} - x_{i}))

.

If the data consists of clusters,

C_{j} = \{i : x_{i} = x_{j}\}, j = 1, \dots, k

, then using the definition by continuity put

G (x_{1}, \dots, x_{m}) = {[1 - 2 \sum_{j : | C_{j} | odd} s_{j} F (x_{j})]}^{n},

which allows possibly equal x’s. Notice that functions

s_{j}

are discontinuous.

If all x’s coincide, then under this definition,

G (x, \dots, x) = 1

, if m is even;

= {[1 - 2 F (x)]}^{n}

, if m is odd. If there are

m - k

points equal to

+ \infty

, then

G = {[1 - 2 \sum_{j = 1}^{k} s_{j} F (x_{j})]}^{n}, m - k

even;

= {[- 1 - 2 \sum_{j = 1}^{k} s_{j} F (x_{j})]}^{n}, m - k

odd. If

m - k

of x’s are equal to

- \infty

, then

G = {[1 - 2 \sum_{j = 1}^{k} s_{j} F (x_{j})]}^{n}

.

Thus,

G (x_{1}, \dots, x_{m})

becomes a continuous function whose absolute value is bounded by 1. Actually, it possesses the Lipschitz property if the density f is bounded.
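The definition by continuity and the special cases listed above can be illustrated with a small exact computation. The sketch below works with the values u_j = F(x_j), assuming F is strictly increasing so that ties among the x's correspond to ties among the u's, with u = 0 and u = 1 standing for −∞ and +∞; the function name `G` mirrors the text, everything else is illustrative.

```python
from collections import Counter
from fractions import Fraction


def G(us, n):
    """[1 - 2 * sum of s_j u_j]^n, summed over clusters of odd size only."""
    counts = Counter(us)
    total = Fraction(0)
    for value, mult in counts.items():
        if mult % 2 == 0:
            continue                         # an even cluster drops out
        # s_j = sign of prod (x_j - x_i): one factor -1 per strictly larger point
        larger = sum(c for other, c in counts.items() if other > value)
        total += (-1 if larger % 2 else 1) * value
    return (1 - 2 * total) ** n


a, b, n = Fraction(1, 5), Fraction(1, 2), 3

# all points coincide: 1 for even m, [1 - 2 F(x)]^n for odd m
assert G([a, a], n) == 1
assert G([a, a, a], n) == (1 - 2 * a) ** n

# points at +infinity (u = 1): unchanged for an even number, sign flip for an odd number
assert G([a, b, 1, 1], n) == G([a, b], n)
assert G([a, b, 1], n) == (-1) ** n * G([a, b], n)

# points at -infinity (u = 0) never change the value
assert G([a, b, 0], n) == G([a, b], n)

# continuity across a tie: G(u, u + h) -> G(u, u) = 1 as h -> 0
h = Fraction(1, 10 ** 6)
assert abs(G([a, a + h], n) - 1) < Fraction(1, 10 ** 5)
```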

Our goal here is to determine the generalized derivative of order m,

\partial^{m} G (x) / \prod_{j} \partial x_{j}, x = (x_{1}, \dots, x_{m})

, so that for any smooth compactly supported φ, the multivariate integration by parts formula holds:

\int \dots \int \frac{\partial^{m} G (x)}{\prod_{j} \partial x_{j}} φ (x) d x = {(- 1)}^{m} \int \dots \int \frac{\partial^{m} φ (x)}{\prod_{j} \partial x_{j}} G (x) d x .

Theorem 5.

With

{(n)}_{k} = n (n - 1) \dots (n - k + 1) = Γ (n + 1) / Γ (n + 1 - k)

, denoting the descending factorial, one has

\frac{\partial^{m}}{\prod_{j} \partial x_{j}} G (x_{1}, \dots, x_{m}) = \sum_{k = 1}^{m} {(- 2)}^{k} {(n)}_{k} {[1 - 2 \sum_{j = 1}^{m} s_{j} F (x_{j})]}^{n - k}

(58)

\times {[(\binom{m}{k})]}^{- 1} \sum_{K \subset M, | K | = k} \prod_{j \in K} f (x_{j}) \frac{\partial^{m - k}}{\prod_{i \notin K} \partial x_{i}} \prod_{j \in K} s_{j} .

Proof.

For

1 \leq k \leq m,

differentiation over

x_{1}, \dots, x_{k}

shows that

\frac{\partial^{k}}{\partial x_{1} \dots \partial x_{k}} {[1 - 2 \sum_{j = 1}^{m} s_{j} F (x_{j})]}^{n} = \sum_{p = 1}^{k} {(- 2)}^{p} {(n)}_{p} {[1 - 2 \sum_{j = 1}^{m} s_{j} F (x_{j})]}^{n - p} A_{p}^{(k)},

where

A_{p}^{(k)} = \prod_{j = 1}^{p} f (x_{j}) \frac{\partial^{k - p}}{\prod_{ℓ = p + 1}^{k} \partial x_{ℓ}} \prod_{j = 1}^{p} s_{j} .

For

k \geq 2

,

A_{p}^{(k)} = A_{p}^{(k)} (x_{1}, \dots, x_{k})

is a symmetric function of its arguments,

A_{p}^{(k)} = {[(\binom{k}{p})]}^{- 1} \sum_{1 \leq j_{1} < \dots < j_{p} \leq k} \prod_{i = 1}^{p} f (x_{j_{i}}) \frac{\partial^{k - p}}{\prod_{ℓ \neq j_{1}, \dots, j_{p}} \partial x_{ℓ}} \prod_{i = 1}^{p} s_{j_{i}} .

For

1 \leq p \leq k

, these functions can be characterized by the following recursion with

A_{0}^{(k)} = 0, A_{0}^{(0)} = 1

,

A_{p}^{(k)} = 0, p > k

,

A_{p}^{(k)} = s_{k} f (x_{k}) A_{p - 1}^{(k - 1)} + \frac{\partial}{\partial x_{k}} A_{p}^{(k - 1)} .

The proof is by induction. When

k = 1

,

\frac{\partial}{\partial x_{i}} {[1 - 2 \sum_{j = 1}^{m} s_{j} F (x_{j})]}^{n} = - 2 n s_{i} f (x_{i}) {[1 - 2 \sum_{j = 1}^{m} s_{j} F (x_{j})]}^{n - 1} .

Indeed, shift invariance of

s_{j}

means that for any i,

\sum_{j} \frac{\partial s_{j}}{\partial x_{i}} = 0 .

Thus,

A_{1}^{(1)} = s_{1} f (x_{1}),

and for

k \geq 2

,

A_{1}^{(k)} = f (x_{1}) \frac{\partial^{k - 1} s_{1}}{\partial x_{2} \dots \partial x_{k}} = {(- 2)}^{k - 1} f (x_{1}) \prod_{j = 2}^{k} δ (x_{1} - x_{j}) \prod_{i = k + 1}^{m} sign (x_{1} - x_{i}),

which indeed is a symmetric function of

x_{1}, \dots, x_{k}

.

The following induction steps are straightforward, so that (58) follows. □

For

m = 2, s_{1} = - s_{2}

, so that according to (58),

\frac{\partial^{2}}{\partial x \partial y} {[1 - 2 | F (x) - F (y) |]}^{n} = - 4 n (n - 1) f (x) f (y) {[1 - 2 | F (x) - F (y) |]}^{n - 2}

+ 4 n f (x) δ (x - y) .
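The m = 2 formula, including its delta term on the diagonal, can be tested numerically through the integration by parts identity. The sketch below makes two simplifying assumptions that are not in the text: F is the uniform distribution function on [0, 1], so that f = 1 and the variables are u = F(x), v = F(y), and the test function is φ(u, v) = sin(πu) sin(πv), which vanishes on the boundary of the unit square so that no boundary terms appear.

```python
from math import cos, pi, sin


def both_sides(n, steps=300):
    """Return (integral of G * phi_uv, integral of the claimed derivative * phi)."""
    h = 1.0 / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    s = [sin(pi * t) for t in grid]
    c = [cos(pi * t) for t in grid]
    lhs = smooth = 0.0
    for i, u in enumerate(grid):
        for j, v in enumerate(grid):
            base = 1.0 - 2.0 * abs(u - v)
            lhs += base ** n * c[i] * c[j]            # G * phi_uv / pi^2
            smooth += base ** (n - 2) * s[i] * s[j]   # absolutely continuous part
    lhs *= pi * pi * h * h
    smooth *= -4 * n * (n - 1) * h * h
    diagonal = 4 * n * h * sum(x * x for x in s)      # 4n * int phi(u, u) du
    return lhs, smooth + diagonal


for n in (2, 3):
    left, right = both_sides(n)
    assert abs(left - right) < 1e-3
```

For n = 2 both sides can also be computed by hand and equal 4 − 32/π², which gives an independent check of the quadrature.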

For

m = 3

,

\frac{\partial^{3}}{\partial x \partial y \partial z} {[1 - 2 s_{1} F (x) - 2 s_{2} F (y) - 2 s_{3} F (z)]}^{n}

= 8 n (n - 1) (n - 2) {[1 - 2 s_{1} F (x) - 2 s_{2} F (y) - 2 s_{3} F (z)]}^{n - 3} f (x) f (y) f (z)

+ 8 n (n - 1) {[1 - 2 s_{1} F (x) - 2 s_{2} F (y) - 2 s_{3} F (z)]}^{n - 2} [s_{1} s_{2} f (x) δ (x - y) + s_{1} s_{3} f (z) δ (x - z)

+ s_{2} s_{3} f (y) δ (y - z)] - 8 n f (x) δ (y - x) δ (z - x) .

Let

{\tilde{G}}_{m} (x_{1}, \dots, x_{m}) = G_{m} (x_{1}, \dots, x_{m}) + \sum_{k = 0}^{m - 1} \sum_{K, | K | = k} {(- 1)}^{m - k} G_{k} (x_{j}, j \in K),

where summation is over all proper subsets K of M,

{\tilde{G}}_{0} = 1

. Then

{\tilde{G}}_{m}

is grounded, i.e., it vanishes if

x_{j} = - \infty

for at least one j. If one of the x’s is equal to

+ \infty

, then

{\tilde{G}}_{m} (x_{1}, \dots, x_{m - 1}, \infty) = [{(- 1)}^{n} - 1] {\tilde{G}}_{m - 1} (x_{1}, \dots, x_{m - 1}),

so that for even n,

{\tilde{G}}_{m}

vanishes if

x_{j} = + \infty

for at least one j. When

m \geq 2

,

{\tilde{G}}_{m} (x, \infty, \dots, \infty) = {(- 1)}^{m} 2^{m - 2} [1 - {(- 1)}^{n}] \{1 - {[1 - 2 F (x)]}^{n}\}

,

{\tilde{G}}_{m} (+ \infty, \dots, + \infty)

= {(- 1)}^{m} 2^{m - 1} [1 - {(- 1)}^{n}]

.

Thus,

\frac{\partial^{m}}{\prod_{j} \partial x_{j}} G (x_{1}, \dots, x_{m}) = \frac{\partial^{m}}{\prod_{j} \partial x_{j}} {\tilde{G}}_{m} (x_{1}, \dots, x_{m}) .
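The boundary behaviour of the grounded function can be verified exactly for small m. The sketch below assumes that the alternating sum defining it runs over all subsets K of M, including K = M itself with coefficient +1 (so that the one-dimensional case is G_1(x) − 1), and it works with u_j = F(x_j), where u = 0 and u = 1 stand for −∞ and +∞. These conventions and the helper names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def G(us, n):
    """[1 - 2 * sum of s_j u_j]^n, summed over clusters of odd size only."""
    counts = Counter(us)
    total = Fraction(0)
    for value, mult in counts.items():
        if mult % 2:
            larger = sum(c for other, c in counts.items() if other > value)
            total += (-1 if larger % 2 else 1) * value
    return (1 - 2 * total) ** n


def G_tilde(us, n):
    """Alternating sum of G over all subsets of the arguments (assumed to include K = M)."""
    m = len(us)
    return sum((-1) ** (m - k) * G(subset, n)
               for k in range(m + 1) for subset in combinations(us, k))


a, b = Fraction(1, 5), Fraction(2, 3)
for n in (2, 3, 4, 5):
    # grounded: vanishes as soon as one argument is -infinity
    assert G_tilde([0, a, b], n) == 0
    # one argument at +infinity reduces the dimension by one
    assert G_tilde([a, b, 1], n) == ((-1) ** n - 1) * G_tilde([a, b], n)
    for m in (2, 3, 4):
        # all but one argument at +infinity
        expected = (-1) ** m * Fraction(2) ** (m - 2) * (1 - (-1) ** n) * (1 - (1 - 2 * a) ** n)
        assert G_tilde([a] + [1] * (m - 1), n) == expected
        # every argument at +infinity
        assert G_tilde([1] * m, n) == (-1) ** m * 2 ** (m - 1) * (1 - (-1) ** n)
```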

For

m = 2

and even n it follows that for any integrable function

ϕ (x, y) = ϕ (y, x)

,

\int \int ϕ (x, y) f (x) f (y) {[1 - 2 | F (x) - F (y) |]}^{n - 2} d x d y

= \frac{1}{n - 1} \int ϕ (x, x) f (x) \{1 - \frac{[1 - {(- 1)}^{n}]}{2} {[2 F (x) - 1]}^{n - 1}\} d x

- \frac{1}{4 n (n - 1)} \int \int \frac{\partial^{2} ϕ (x, y)}{\partial x \partial y} \{{[1 - 2 F (x \lor y) + 2 F (x \land y)]}^{n}

+ {(- 1)}^{n} - {[1 - 2 F (x \lor y)]}^{n} - {[2 F (x \land y) - 1]}^{n}\} d x d y,

which is useful to determine the covariance structure.

7. Conclusions

Interesting properties of self-dual probabilities demonstrate their potential in statistical estimation without any additional variance information. Both polynomial approximation over a finite set and the Gauss hypergeometric function are intimately related to data parities and parity-based distributions.

Several presented combinatorial identities may find wider use in probability theory applications, in particular, in random matrices and in statistical physics.

Funding

This research received no external funding.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed. All conclusions are based on mathematical reasoning, and the author is solely responsible for them.

Acknowledgments

Many thanks are due to the anonymous referees for their very careful reading of the original version and all their critical comments. The role of the Mathematics editorial board in securing such referees is also acknowledged.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Rukhin, A.L. Estimating common mean in a heteroscedastic variances model. Mathematics 2025, 13, 1290.
2. Rukhin, A.L. Estimation of the common mean from heterogeneous normal observations with unknown variances. J. R. Stat. Soc. Ser. B 2017, 79, 1601–1618.
3. Rivlin, T.J. An Introduction to Approximation of Functions; Dover: New York, NY, USA, 1969.
4. Trefethen, L.N. Approximation Theory and Approximation Practice; SIAM: Philadelphia, PA, USA, 2013.
5. Karlin, S.; Studden, W.J. Tchebysheff Systems: With Applications in Analysis and Statistics; Wiley: New York, NY, USA, 1966.
6. de Boor, C.; Saff, E.B. Finite sequences of orthogonal polynomials connected by a Jacobi matrix. Linear Algebra Appl. 1986, 75, 43–56.
7. Borodin, A. Duality of orthogonal polynomials on a finite set. J. Stat. Phys. 2002, 109, 1109–1120.
8. Vinet, L.; Zhedanov, A. The characterization of classical and semiclassical orthogonal polynomials from their dual polynomials. J. Comp. Appl. Math. 2004, 172, 41–48.
9. Genest, V.; Tsujimoto, S.; Vinet, L.; Zhedanov, A. Persymmetric Jacobi matrices, isospectral deformations and orthogonal polynomials. J. Math. Anal. Appl. 2017, 450, 915–928.
10. Rukhin, A.L. Parities and hypergeometric function. Theory Probab. Appl. 2025, 70, 355–374.
11. Fang, K.-T.; Pan, J. A review: Representative points of statistical distributions and their applications. Mathematics 2023, 11, 2930.
12. David, H.A.; Nagaraja, H.N. Order Statistics, 3rd ed.; Wiley: New York, NY, USA, 2003.
13. Ng, K.W.; Tian, G.-L.; Tang, M.-L. Dirichlet and Related Distributions: Theory, Methods and Applications; Wiley: New York, NY, USA, 2011.
14. Olver, F.W.J.; Lozier, D.W.; Boisvert, R.F.; Clark, C.W. NIST Handbook of Mathematical Functions; NIST: Gaithersburg, MD, USA; U.S. Department of Commerce: Washington, DC, USA; Cambridge University Press: Cambridge, UK, 2010.
15. Callan, D. A combinatorial survey of identities for the double factorials. arXiv 2009, arXiv:0906.1317.
16. Li, Y.-W.; Qi, F. A new closed-form formula of the Gauss hypergeometric function at specific arguments. Axioms 2024, 13, 317.
17. De Gregorio, J.; Sanchez, D.; Toral, R. Entropy estimators for Markovian sequences: A comparative analysis. Entropy 2024, 26, 79.
18. Riordan, J. Combinatorial Identities; Wiley: New York, NY, USA, 1968.
