Multivariate Krawtchouk polynomials and composition birth and death processes

This paper defines the multivariate Krawtchouk polynomials, orthogonal on the multinomial distribution, and summarizes their properties as a review. The multivariate Krawtchouk polynomials are symmetric functions of orthogonal sets of functions defined on each of N multinomial trials. The dual multivariate Krawtchouk polynomials, which also have a polynomial structure, are seen to occur naturally as spectral orthogonal polynomials in a Karlin and McGregor spectral representation of transition functions in a composition birth and death process. In this Markov composition process in continuous time there are N independent and identically distributed birth and death processes each with support 0,1, .... The state space in the composition process is the number of processes in the different states 0,1,... Dealing with the spectral representation requires new extensions of the multivariate Krawtchouk polynomials to orthogonal polynomials on a multinomial distribution with a countable infinity of states.

Authors emphasise different approaches to the multivariate orthogonal polynomials. Diaconis and Griffiths's approach is probabilistic and directed to Markov chain applications; Iliev's approach is via Lie groups; and Genest et. al.'s physics approach is as matrix elements of group representations on oscillator states. Xu (2015) studies discrete multivariate orthogonal polynomials which have a triangular construction of products of 1-dimensional orthogonal polynomials. These are particular cases of the polynomials in this paper, see Diaconis and Griffiths (2014). These polynomials extend the Krawtchouk polynomials on the binomial distribution to multi-dimensional polynomials on the multinomial distribution. They appear naturally in composition Markov chains as eigenfunctions in a diagonal expansion of the transition functions. There are many interesting examples of these Markov chains in Zhou and Lange (2009). Binomial and multinomial random variables can be constructed as a sum of independent identically distributed random variables which are indicator functions of the events that occur on each of N trials. The Krawtchouk and multivariate Krawtchouk polynomials are symmetric functions of orthogonal functions sets on each of the trials. The simplest case is the Krawtchouk polynomials where the representation is explained in Section 2.
In the multivariate Krawtchouk polynomials there is not a unique orthogonal function set on trials with multiple outcomes greater than two, so the polynomials depend on which orthogonal function set is taken for a basis on the trials.
A number of classical birth and death processes have a spectral expansion where the orthogonal polynomials are constructed from the Meixner class. This class has a generating function of the form where h(v) is a power series in t with h(0) = 1 and u(v) is a power series with u(0) = 0 and u ′ (0) = 0. Meixner (1934) characterizes the class of weight functions and orthogonal polynomials with the generating function (2). They include the Krawtchouk polynomials, Poisson-Charlier polynomials, scaled Meixner polynomials, and Laguerre polynomials. (The Meixner orthogonal polynomials are a specific set belonging to the Meixner class with a name in common.) In this paper the spectral expansion is extended to composition birth and death processes where there are N independent and identically distributed birth and death processes operating and {X(t)} t≥0 is such that the ith element {X i (t)} t≥0 counts the number of processes in state i at time t. In the analogue of (1) the spectral polynomials are the dual multivariate Krawtchouk polynomials. The dual polynomial system is therefore very important and attention is paid to describing it.
There are extensions of the multivariate Krawtchouk polynomials to multivariate orthogonal polynomials on the multivariate Meixner distribution and multivariate product Poisson distribution where they occur as eigenfunctions of multi-type birth and death processes (Griffiths 2016). This paper defines the multivariate Krawtchouk polynomials, summarizes their properties, then considers how they are found in spectral expansions of composition birth and death processes. It is partly a review of these polynomials and is self-contained. For a fuller treatment see Diaconis and Griffiths (2012). The polynomials are naturally defined by a generating function and so generating functions techniques are used extensively in the paper. Probabilistic notation is used, particularly the expectation operator E which is a linear operator acting on functions of random variables, which take discrete values in this paper. If X 1 , . . . , X d are random variables then Often orthogonal polynomials are regarded as random variables. For example {K n (X; N, p)} N n=0 are the 1-dimensional Krawthcouk polynomials as random variables and where q = 1 − p. A convention of using capital letters for random variables and lower case for values they take is used, except when the random variables are denoted by Greek letters, when they have to be considered in context. Section 2, Theorem 1, shows how the Krawtchouk polynomials can be expressed as elementary symmetric functions of N Bernoulli trials, centered at their mean p. The Meixner orthogonal polynomials on the geometric distribution are also expressed as functions of an infinity of centered Bernoulli trials in Theorem 2. There is some, but not total symmetry in this expression. Krawtchouk polynomials occur naturally as eigenfunctions in Ehrenfest urn processes and the eigenfunction expansion of their transition functions is explained in Section 2.3. Section 3 introduces the multivariate Krawtchouk polynomials, explaining how they are constructed in a symmetric way from a product set of orthogonal functions on N independent multinomial trials. The dual orthogonal system is described and a scaling found so that they are multivariate Krawtchouk polynomials on a different multinomial distribution in Theorem 3. The polynomial structure of the multivariate Krawtchouk polynomals is described in Theorem 4 and the structure in the dual system in Theorem 5. Recurrence relationships are found for the system in Theorem 6 and for the dual system in Theorem 7. The dual recurrence relationship is used to identify the polynomials as eigenfunctions in a d-type Ehrenfest urn in Theorem 8. In Section 3.2 a new extension is made to multivariate Krawtchouk polynomials where there are an infinite number of possibilities in each multinomial trial. These polynomials occur naturally as eigenfunctions in composition birth and death processes in a Karlin and McGregor spectral expansion in Theorem 9. Theorem 10 considers the polynomial structure of the dual polynomials in the spectral expansion. Theorem 11 gives an interesting identity for these spectral polynomials in composition birth and death processes when the spectral polynomials in the individual processes belong to the Meixner class.

Bernoulli trials and orthogonal polynomials
The paper begins with expressing the 1-dimensional Krawtchouk polynomials as symmetric functions of Bernoulli trials. The multivariate Krawtchouk polynomials are extensions of this construction in higher dimensions.

Krawtchouk orthogonal polynomials
The Krawtchouk orthogonal polynomials {K n (x; N, p)} N n=0 are orthogonal on the binomial They have a generating function The scaling is such that the polynomials K n (x; N, p)/n! are monic and If the Krawtchouk polynomials are scaled to be Q n (x) so that Q n (0) = 1, then there is a duality that Q n (x) = Q x (n). A binomial random variable X counts the number of successes in N independent trials, each with a probability p of success. Let ξ i = 1 if the ith trial is a success and ξ i = 0 otherwise.
is a sequence of Bernoulli trials with P(ξ i = 1) = p, P(ξ i = 0) = q, and X = ∑ N i=1 ξ i . It is interesting to express the Krawtchouk polynomials as symmetric functions of {ξ i } N i=1 . If there is just one trial with N = 1, X = ξ 1 and the orthogonal polynomial set on X is {1, ξ 1 − p}. There can only be a constant function and a linear function if there are just two values that ξ 1 can take. A product set of orthogonal functions on and we want to form a smaller basis from these functions to orthogonal polynomials in where S N is the symmetric group on {1, 2, . . . , N}.

Proof.
A generating function for the symmetric functions on the right of (5) is If x of the ξ i are 1 and N − x are 0, then (6) is equal to identical to the right side of (3). Therefore (5) holds since the generating functions of both sides, regarding X as a random variable, are the same.
The representation (5) of the Krawtchouk polynomials appears with a full treatment in Diaconis and Griffiths (2012) and very briefly as the generating function proof above in Griffiths (1971). The author is not aware of any other appearances of (5).

Meixner polynomials on the geometric distribution
The Meixner orthogonal polynomials on the geometric distribution are orthogonal on pq x , x = 0, 1, . . .
be a sequence of Bernoulli trials. Let X count the number of trials ξ i = 0 before the first trial where ξ x+1 = 1. That is The orthogonal polynomials on the geometric distribution are a special case of the general Meixner polynomials and have a generating function

A product set of orthogonal functions on the trials is
It is of interest to express the orthogonal polynomial set {M n (x; 1, q)} ∞ n=0 as a series expansion in the product set (7) as a comparision of what happens with the Krawtchouk polynomials. A calculation is now made of E (ξ i 1 − p) · · · (ξ i r − p)G(z; X) leading to coefficients in the expansion of the Meixner polynomials in the product set of orthogonal functions. Given X = x it must be that ξ j = 0, j = 1, . . . , x, ξ x+1 = 1 and {ξ j } ∞ j=x+2 are distributed as Bernoulli trials. Therefore Taking an expectation conditional on X, then over X, Simplification to the last line is straightforward, so is omitted. Considering the coefficients of z n in (8), and using Theorem 1 gives the following theorem.

Then X has a geometric distribution and the Meixner polynomials on this geometric distribution have a representation for
where X l = ξ 1 + · · · ξ l .

An Ehrenfest urn
The Krawtchouk polynomials appear naturally as eigenfunctions in an Ehrenfest urn model. This is explored in Diaconis and Griffiths (2012). An urn has N balls coloured red or blue. Transitions occur at rate 1 when a ball is chosen at random and the colour of the ball is changed according to a transition matrix That is, if a blue ball is chosen it is changed to red with probability 1, whereas if a red ball is chosen it is changed to blue with probability q/p. {X(t)} t≥0 is a reversible Markov process, which is a birth and death process, with a Binomial (N, p) stationary distribution.
The process is a composition Markov process in the following sense. Label the balls 1, 2, . . . N at time t = 0 and keep the labels over time as their colours change. Let {ξ i (t)} t≥0 describe the colour of ball i at time t: ξ i (t) = 1 if the ith ball is red or 0 if the ball is blue. The processes {ξ i (t)} t≥0 , i = 1, . . . , N are independent, each has a rate of events 1/N when the specified ball is chosen, and Standard Markov process theory gives that where λ = 1/(N p). It is immediate that the stationary distribution of each of the labelled processes is (p, q). An eigenvalue-eigenfunction expansion of P(t) is where π ξ is the stationary distribution with π 0 = q, π 1 = p. It is straightforward to check the agreement with P(t) by substituting the four values of η, ξ = 0, 1.
In the Ehrenfest urn composition process the transitions are made from The Krawtchouk polynomials thus appear naturally as elementary symmetric functions of the individual labelled indicator functions in the Markov process.

Multivariate Krawtchouk polynomials
The multivariate Krawtchouk polynomials with elementary basis u were first constructed by Griffiths (1971). A recent introduction to them is Diaconis and Griffiths (2014). They play an important role in the spectral expansion of transition functions of composition Markov processes. Khare and Zhou (2009), Zhou and Lange (2009) have many interesting examples of such Markov processes. Later in this paper we consider the particular composition processes where there are N particles independently performing birth and death processes.
The multivariate Krawtchouk polynomials are orthogonal on the multinomial distribution with p = {p j } d j=1 a probability distribution. Let J 1 , . . . , J N be independent identically distributed random variables specifying outcomes on the N trials such that Then This notation for the orthogonal set of functions follows Lancaster (1969). There is an equivalence that In this paper {u (l) } d−1 l=0 are usually orthonormal functions with a l = 1, l = 0, 1, . . . , d − 1 unless stated otherwise. The 1-dimensional Krawtchouk polynomials are constructed from a symmetrized product set of orthogonal functions {1, ξ i − p} N i=1 and the construction of the multivariate polynomials follows a similar, but more complicated proceedure. Instead of having two unique elements in each orthogonal function set there is a choice of orthogonal basis and the construction is from the product set Define a collection of orthogonal polynomials Q n (X; u) with n = (n 1 , . . . n d−1 ) and |n| ≤ N on the multinomial distribution as symmetrized elements from the product set such that the sum is over products u (l 1 ) In the 1-dimensional case u which is, of course, the generating function of the Krawtchouk polynomials. x 1 , x 2 are respectively the number of 0 and 1 values in the N trials. It is straightfoward to show, by using the generating function (15), that where n + = (n 0 , . . . , n d−1 ), with n 0 = N − ∑ d−1 j=1 n j . Instead of indexing the polynomials by n = (n 1 , . . . , n d−1 ) they could be indexed by n + . This notation is sometimes convenient to use in the paper. The dual orthogonality relationship is, immediately from (16), Expanding the generating function (15) shows that where · indicates summation over an index and a [b] = a(a − 1) · · · (a − b + 1) for non-negative integers b. The dual generating function is Expanding the generating function The two generating functions (15) and (19) are similar and there is a form of self-duality for the polynomials. Let which apart from the different indexing, and non-constant function ω (0) , generates multivariate Krawtchouk polynomials. Suppose that ω The following theorem is evident from (19) and (21), once the indexing is sorted out.
There is an interesting identity when u is self-dual with an indexing of j beginning from 0 instead of 1. That is u l , j, l = 0, 1, . . . , n.
Then indexing x = (x 0 , . . . , x n ), l . This duality occurs in the scaled Krawtchouk polynomial basis, orthogonal on a Binomial (n, p) distribution.
The emphasis in Theorem 3 is on considering the dual system, obtaining ω from u, however sometimes it is natural to construct u from an orthogonal set ω, particularly when ω (0) i = 1, i = 1, . . . , d and ω = ω. Then the polynomials on the left of (22) are defined by the dual polynomials on the right. Later in the paper it will be seen that this is natural in composition birth and death Markov processes.
The polynomial structure of the multivariate Krawtchouk polynomials is detailed in the next theorem.

Proof.
A method of proof is to consider the transform of Q n (X; u), which is given by where This transform is easily found by taking the transform of the generating function (15). One can see directly that Q n (X; u) is an orthogonal polynomial by considering the transform From (23) and (24), Q n (x) is a polynomial of degree |n| whose only leading term is This is seen by noting that the leading term is found by replacing φ j p j by X j in since we can replace X j [k j ] by X k j in considering the leading term of (23) and setting φ i = 1 for i = 1, . . . , d.
The next theorem explains the polynomial structure in the dual system. There are recurrence relationships for the multivariate Krawtchouk polynomials, which are found here from a generating function approach; for another different proof see Iliev (2012), Theorem 6.1. Note that his multivariate Krawtchouk polynomials are Q n (x; u)( N n + ) −1 . In Theorems 6, 7, 8, u is taken to be orthonormal on p, so a l = 1, l = 0, 1, . . . d − 1 in (14).
Two recursive systems are: and (27) The first recursive equation (25) then follows by an expansion of x j Q n (x; u) as a series in Q n ′ (x; u) dividing the cases in (27) to obtain the coefficients by The second recursion (26) is found by summation, using the orthogonality of u.

The dual orthogonal system when u is orthonormal is
A dual generating function is The generating function (29) arises from considering the coefficient of Theorem 7. A dual recurrence system is, for i = 0, . . . , d − 1 Proof. A derivation of the recurrence system uses a transform method. Consider Therefore non-zero terms with y = x − e j + e l are ∑ {n + :|n + |=N} The dual recurrence is therefore (30).
The reproducing kernel polynomials are invariant under which set of orthonormal functions u is used. They have an explicit form, see Diaconis and Griffiths (2014) and Xu (2015) for details.

An Ehrenfest urn with d-types
A d-type Ehrenfest urn has N balls of d colours {1, . . . , d}. At rate 1 a ball is chosen and if it is of type j it is changed to colour l with probability p jl , l = 1, . . . , d. {X(t)} t≥0 , with |X(t)| = N, is the number of balls of the different colours at time t, which can be regarded as a d-dimensional random walk on |x| = N. The transition functions have an eigenfunction expansion in the multivariate Krawtchouk polynomials, extending the case (12) with two colours.
Theorem 8. Let {X(t)} t≥0 be a d-dimensional random walk on x, |x| = N, where transitions are made from x → x − e j + e l at rate r(x, x − e j + e l ) = (x j /N)p jl . P is a d × d transition matrix, with stationary distribution p such that Then the transition functions of X(t) have an eigenfunction expansion p(x, y; t) = m(y, p) Proof. {X(t)} t≥0 is a reversible Markov process with stationary distribution m(x; p) becuse it satisfies the balance equation The reversibility is a consequence of assuming that P is a reversible transition matrix. The generator of the process acting on f (x) is specified by so the eigenvalues and eigenvectors (λ n , g n (x)) satisfy Lg n (x) = −λ n g n (x).
Now from (30) which is the same as (33), noting that the total rate is 1 away from x. Then (32) holds immediately.

Extensions to the multivariate Krawtchouk polynomials
It is useful in considering spectral expansions of composition Markov processes to allow the following generalizations of the multivariate Krawtchouk polynomials.
• Allow d = ∞ as a possibility and let {u (j) } ∞ j=0 be a complete orthogonal set of functions on p 1 , p 2 , . . .. The multinomial distribution is still well defined as and the generating function for the multivariate Krawtchouk polynomials still holds with d = ∞. • When d = ∞ take {u (j) } ∞ j=1 to be orthogonal on a discrete measure π which is non-negative, but not a probability measure because ∑ ∞ i=1 π i = ∞. • Allow the basis functions u to be orthogonal on π, and take the dual functions {u to be orthogonal on a continuous distribution. An example that occurs naturally in composition birth and death chains is when u i (z), z ≥ 0, i = 0, 1, . . . are the Laguerre polynomials, orthogonal on the density z α Γ(α + 1) e −z , z > 0.

Karlin and McGregor spectral theory
Consider a birth and death process {ξ(t)} t≥0 on {−1, 0, 1, . . .} with birth and death rates λ i , µ i from state i and transition probabilities p ij (t). −1 is an absorbing state which can be reached if µ 0 > 0. We assume that the process is non-explosive so only a finite number of events will take place in any finite time interval. Define orthogonal polynomials {Q n (z)} ∞ n=0 by for n ∈ Z + with Q 0 = 1 and Q −1 = 0. The polynomials are defined by recursion from (34) with Q n+1 defined by knowing Q n and Q n−1 . If µ 0 = 0, then Q n (0) = 1. There is a spectral measure ψ with support on the non-negative axis and total mass 1 so that for i, j = 0, 1, . . . where If µ 0 > 0 then ∑ ∞ j=0 p ij (t) < 1 because of possible absorption into state −1. If µ 0 = 0 but there is no stationary distribution because ∑ ∞ j=0 π j = ∞ then also possibly ∑ ∞ j=0 p ij (t) < 1. Placing t = 0 shows the orthogonality of the polynomials {Q i (z)} i≥0 on the measure ψ because p ij (0) = δ ij . {ξ(t)} t≥0 is clearly reversible with respect to {π j } j≥0 when a stationary distribution exists, or before absorption at 0 if it does not exist since π i p ij (t) = π j p ji (t). As t → ∞ the limit stationary distribution, if µ 0 = 0 and ∑ ∞ k=0 π k < ∞, is Suppose a stationary distribution exists and there is a discrete spectrum with support {ζ l } l≥0 , ζ 0 = 0. Then This is an eigenfunction expansion where u is a set of orthonormal functions on p defined by Several well known birth and death processes give rise to classical orthogonal polynomial systems.
3. λ = µ. The spectral polynomials are related to the Laguerre polynomials by In this case there is a continuous spectrum and the polynomials are orthogonal on the gamma distribution 1 There is no stationary distribution of the process in this case. The Laguerre polynomials have a generating function The process arises from a model with two urns with a and b balls, with N tagged balls. At an event two balls are chosen at random from the urns and interchanged. The state of the process is the number of tagged balls in the first urn. The spectral polynomials are related to the dual Hahn polynomials by Q n (z) = R n (λ(z); a, b, N), n = 0, 1, . . . where There is a hypergeometric stationary distribution in the process of , i = 0, 1, . . . , N.
• An Ehrenfest urn where λ n = (N − n)p, µ n = nq, 0 ≤ n ≤ N, 0 < p < 1 and q = 1 − p. The spectral polynomials are the Krawtchouk polynomials Q n (z) = K n (z; N, p), 0 ≤ n ≤ N, which is also the stationary distribution in the process.

Composition birth and death processes
Consider N identically distributed birth and death processes {ξ i (t)} t≥0 , i = 1, . . . N each with state space 0, 1, . . .. It is assumed that there is no absorbing state at −1 and λ 0 > 0. The transition functions for the labelled processes are p ij (t) := ∏ N k=1 p i k ,j k (t). In composition Markov processes interest is in the unlabelled configuration of ξ(t) specified by X(t), where X k (t) = |{i j = k, j = 1, . . . , N}| for k = 0, 1 . . .. The probability generating function of X(t) conditional on where possibly there are a countable infinity of states with d = ∞. Transitions and rates are, for j = 0, 1, . . ., The total rate from x is ∑ j≥0

This covers the case when a stationary distribution does exist and also when a stationary distribution does not exist because
Proof. The probabilistic structure of {X(t)} t≥0 with probability generating function (40) implies that the multivariate Krawtchouk polynomials are the eigenfunctions of the transition distribution.
The generating function of the dual polynomials where in this generating function n(Z) is regarded as a random variable by taking n l (Z) = |{Z k : Z k = ζ l , k = 1, . . . , N}|.
{Z k } N k=1 are independent identically distributed random variables with probability measure ψ. Without loss of take generality v 0 = 1 in (50) and consider coefficients of ∏ i≥1 v x i i , indexing the dual polynomial by (x 1 , x 2 , . . .) with x 1 + x 2 + · · · ≤ N. Note the scaling that the dual polynomials is 1 when x i = 0, i ≥ 1. , u) is a polynomial of degree x 1 + x 2 + · · · in {N j } j≥1 whose only term of maximal degree is ∏ j≥1 N x j j . The total degree of Z in the dual polynomials indexed by (x 1 , x 2 , . . .) is ∑ j≥1 jx j with a single leading term of this degree.

Theorem 10. Define
Proof. The proof of the first statement follows from Theorem 5. The proof of the second statement is immediate by knowing that N j is of degree j in Z.
The third case of linear birth and death processes composition Markov chains is interesting as having a continuous spectral measure which is a product measure of N gamma distribution measures. The spectral polynomials are well defined by a generating function as coefficients of ∏ ∞ j=1 v however elements of {Z k } N k=1 are distinct, being continuous random variables, and the dual of the dual system is products of dual Laguerre polynomials which are not grouped to an index n as when there is a discrete spectrum.
The polynomials in the Meixner class (2) are additive in the sense that if {Q N m (|z|)} are the orthogonal polynomials on the distribution of |Z| then the generating function for these polynomials is This additivity implies an interesting identity.
Theorem 11. The dual multivariate Krawtchouk polynomials with generating function (50) satisfy the identity N n where x = (x 1 , x 2 , . . .). In this equation n = n(Z) is regarded as a random variable in the sense of (51).

Proof.
Set v j = v j /j!, j = 0, 1, . . . in (50). Then The theorem then follows by equating coefficients of v m on both sides of the generating function.