Article

Moments of the Negative Multinomial Distribution

by
Frédéric Ouimet
Centre de Recherches Mathématiques, Université de Montréal, Montreal, QC H3T 1J4, Canada
Math. Comput. Appl. 2023, 28(4), 85; https://doi.org/10.3390/mca28040085
Submission received: 1 July 2023 / Revised: 18 July 2023 / Accepted: 20 July 2023 / Published: 24 July 2023

Abstract

The negative multinomial distribution arises in many areas of application, such as polarimetric image processing and the analysis of longitudinal count data. In previous studies, general formulas for the falling factorial moments and cumulants of the negative multinomial distribution were obtained. However, despite the availability of the moment generating function, no comprehensive formulas for the moments have been calculated thus far. This paper addresses this gap by presenting general formulas for both central and non-central moments of the negative multinomial distribution. These formulas are expressed in terms of binomial coefficients and Stirling numbers of the second kind. Utilizing these formulas, we provide explicit expressions for all central moments up to the fourth order and all non-central moments up to the eighth order.
MSC:
Primary: 62E15; Secondary: 60E05

1. Introduction

The negative multinomial distribution is a probability distribution that can be used to model count data, where the outcome of interest is the number of occurrences of $d \in \mathbb{N}$ different events when the number of failures (a failure means that, for a given trial, an object is not categorized in any of the $d$ categories) is a fixed value $r \in \mathbb{N}$. It is a multivariate generalization of the well-known negative binomial distribution, for which $d = 1$. For a general reference on the negative multinomial distribution and its properties, refer to Sibuya et al. [1] or Chapter 36 of Johnson et al. [2].
One of the main motivations for using the negative multinomial distribution is its ability to model overdispersion for count vectors, which happens when the variances of the count variables are larger than their means (Fitzmaurice et al. [3]). The Poisson distribution, for instance, assumes that the mean and variance are equal, but in many real-world scenarios this is not the case. The negative multinomial distribution accommodates overdispersion by allowing a different variance for each event type (Cameron and Trivedi [4]).
Another motivation for using the negative multinomial distribution is its ability to handle excess zeros in count data, see, e.g., Haslett et al. [5]. Count data often exhibit zero inflation, where there are more zeros than would be expected under a Poisson distribution. The negative multinomial distribution provides a flexible framework for modeling zero inflation by allowing for different probabilities of zero occurrences for each event type.
A third motivation for using the negative multinomial distribution is its ability to model count data with multiple event types, see, e.g., [6,7,8,9,10,11,12,13]. In many applications, there is more than one type of event that can occur and the negative multinomial distribution allows for modeling the counts of each event type simultaneously. This is particularly useful in fields such as marketing, where the goal is to model the number of purchases of different products, or in ecology, where the goal is to model the counts of different species in a community [14].
Overall, the negative multinomial distribution provides a flexible and powerful tool for modeling count data in a variety of applications such as polarimetric image processing [15], the analysis of RNA-seq. data [16], pollen analysis [17], longitudinal data [6,7,12], etc. Its theoretical properties have been investigated in numerous papers, see, e.g., [18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33]. One can find extensions of the model in [16,34,35,36].
The ability of the negative multinomial distribution to handle overdispersion, zero inflation, and multiple event types makes it a valuable tool for data scientists and statisticians. Whether one is interested in modeling the number of purchases of different products, the counts of different species in a community, or any other count data, the negative multinomial distribution can provide valuable insights and inform decision-making.
In previous studies, Mosimann [17] derived general formulas for the falling factorial moments of the negative multinomial distribution, while Withers and Nadarajah [37] obtained expressions for the cumulants. Despite the availability of the moment generating function, no comprehensive formulas for the moments have been calculated thus far. Our goal in this paper is to address this gap by presenting general formulas for both central and non-central moments of the negative multinomial distribution.
Here is an outline of the paper. In Section 2, the necessary definitions and notations are introduced, along with a preliminary result on factorial moments of the negative multinomial distribution due to Mosimann [17]. The general formulas for the central and non-central moments of the negative multinomial distribution are stated and proved in Section 3. The numerical implementation of those formulas in Mathematica is provided in Section 4. Finally, in Section 5, our general formulas are applied to give explicit expressions for all central moments up to the fourth order and all non-central moments up to the eighth order. Open problems of interest are stated in Section 6.

2. The Negative Multinomial Distribution

For any $d \in \mathbb{N}$, let $x \in [0, 1]^d$ be such that $\|x\|_1 := \sum_{i=1}^d |x_i| < 1$. The probability mass function $k \mapsto P_{r,x}(k)$ of the negative multinomial distribution is defined by
$P_{r,x}(k) := \frac{\Gamma(r + \|k\|_1)}{\Gamma(r) \prod_{i=1}^d \Gamma(k_i + 1)} (1 - \|x\|_1)^r \prod_{i=1}^d x_i^{k_i} = \frac{\Gamma(r + \|k\|_1)}{\Gamma(r) \prod_{i=1}^d \Gamma(k_i + 1)} (1 - \|x\|_1)^{r + \|k\|_1} \prod_{i=1}^d y_i^{k_i}, \quad k \in \mathbb{N}_0^d, \qquad (1)$
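where $r > 0$ is a positive real number and $y_i := x_i / (1 - \|x\|_1)$ for all $i \in \{1, \dots, d\}$. If a random vector $\eta = (\eta_1, \dots, \eta_d)$ follows this distribution, we write for short $\eta \sim \mathrm{Neg\,Multinomial}(r, x)$. In this paper, our main goal is to give general formulas for the non-central and central moments of (1), namely
$E\Big[\prod_{i=1}^d \eta_i^{p_i}\Big] \quad \text{and} \quad E\Big[\prod_{i=1}^d (\eta_i - E[\eta_i])^{p_i}\Big], \quad p_1, \dots, p_d \in \mathbb{N}_0. \qquad (2)$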
We obtain the formulas using a combinatorial argument and the general expression for the falling factorial moments found by Mosimann [17], which we register in the lemma below.
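Lemma 1 (Factorial moments). Let $\eta \sim \mathrm{Neg\,Multinomial}(r, x)$. Then, for all $k_1, \dots, k_d \in \mathbb{N}_0$,
$E\Big[\prod_{i=1}^d \eta_i^{(k_i)}\Big] = (r - 1 + \|k\|_1)^{(\|k\|_1)} \prod_{i=1}^d y_i^{k_i},$
where $m^{(k)} := m (m - 1) \cdots (m - k + 1)$ denotes the $k$th order falling factorial of $m$.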
The formulas we derive for the expectations in Equation (2) will be employed to calculate all the central moments up to the fourth order, as well as all the non-central moments up to the eighth order. For information about the moment generating function, the cumulant generating function, and expressions for the cumulants, refer to Withers and Nadarajah [37].
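To make these definitions concrete, the following Python sketch (not part of the paper, whose own code in Section 4 is written in Mathematica) evaluates the probability mass function (1) on the log scale and compares a brute-force truncated sum for a factorial moment with the closed form $(r - 1 + \|k\|_1)^{(\|k\|_1)} \prod_{i} y_i^{k_i} = \frac{\Gamma(r + \|k\|_1)}{\Gamma(r)} \prod_{i} y_i^{k_i}$ of Lemma 1. The function names, the truncation level `cutoff`, and the parameter values are illustrative assumptions.

```python
from itertools import product
from math import exp, lgamma, log


def neg_multinomial_pmf(k, r, x):
    """P_{r,x}(k) from (1) for counts k, real r > 0 and x >= 0 with sum(x) < 1."""
    if any(ki > 0 and xi == 0 for ki, xi in zip(k, x)):
        return 0.0  # a category with zero probability cannot be observed
    log_p = lgamma(r + sum(k)) - lgamma(r) - sum(lgamma(ki + 1) for ki in k)
    log_p += r * log(1.0 - sum(x))
    log_p += sum(ki * log(xi) for ki, xi in zip(k, x) if ki > 0)
    return exp(log_p)


def falling(m, k):
    """Falling factorial m^{(k)} = m (m - 1) ... (m - k + 1), with m^{(0)} = 1."""
    out = 1.0
    for j in range(k):
        out *= m - j
    return out


def factorial_moment_by_summation(ks, r, x, cutoff=120):
    """Truncated sum of prod_i n_i^{(k_i)} * P_{r,x}(n) over the box [0, cutoff)^d."""
    total = 0.0
    for n in product(range(cutoff), repeat=len(x)):
        weight = 1.0
        for ni, ki in zip(n, ks):
            weight *= falling(ni, ki)
        if weight:
            total += weight * neg_multinomial_pmf(n, r, x)
    return total


def factorial_moment_closed_form(ks, r, x):
    """Closed form of Lemma 1: (r - 1 + |k|)^{(|k|)} * prod_i y_i^{k_i}."""
    s = 1.0 - sum(x)
    out = falling(r - 1 + sum(ks), sum(ks))
    for ki, xi in zip(ks, x):
        out *= (xi / s) ** ki
    return out
```

With the hypothetical values $r = 2.5$, $x = (0.2, 0.3)$ and $k = (2, 1)$, the closed form evaluates to $4.5 \cdot 3.5 \cdot 2.5 \cdot 0.4^2 \cdot 0.6 = 3.78$, and the truncated sum should agree up to floating-point and truncation error.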

3. Results

First, we give a general formula for the non-central moments of the negative multinomial distribution in (1).
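Theorem 1 (Non-central moments). Let $\eta \sim \mathrm{Neg\,Multinomial}(r, x)$. Then, for all $p_1, \dots, p_d \in \mathbb{N}_0$,
$E\Big[\prod_{i=1}^d \eta_i^{p_i}\Big] = \sum_{k_1=0}^{p_1} \cdots \sum_{k_d=0}^{p_d} (r - 1 + \|k\|_1)^{(\|k\|_1)} \prod_{i=1}^d \genfrac{\{}{\}}{0pt}{}{p_i}{k_i} y_i^{k_i},$
where $\genfrac{\{}{\}}{0pt}{}{p}{k}$ denotes a Stirling number of the second kind (i.e., the number of ways to partition a set of $p$ objects into $k$ non-empty subsets); recall that
$y_i := \frac{x_i}{1 - \|x\|_1}, \quad \text{for all } i \in \{1, \dots, d\}.$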
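Proof.
Recall the following well-known relationship between the power $p \in \mathbb{N}_0$ of a number $x \in \mathbb{R}$ and the falling factorials of $x$:
$x^p = \sum_{k=0}^{p} \genfrac{\{}{\}}{0pt}{}{p}{k} x^{(k)}.$
This relationship can be found in [38] (p. 262). By applying this formula to each $\eta_i^{p_i}$ and using the linearity of expectation, we obtain
$E\Big[\prod_{i=1}^d \eta_i^{p_i}\Big] = \sum_{k_1=0}^{p_1} \cdots \sum_{k_d=0}^{p_d} \Big\{\prod_{i=1}^d \genfrac{\{}{\}}{0pt}{}{p_i}{k_i}\Big\} \, E\Big[\prod_{i=1}^d \eta_i^{(k_i)}\Big].$
Therefore, the conclusion is a direct consequence of Lemma 1.    □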
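As a quick check of the identity $x^p = \sum_{k=0}^{p} \genfrac{\{}{\}}{0pt}{}{p}{k} x^{(k)}$ used in this proof, the case $p = 3$ can be expanded by hand with the standard values $\genfrac{\{}{\}}{0pt}{}{3}{0} = 0$, $\genfrac{\{}{\}}{0pt}{}{3}{1} = 1$, $\genfrac{\{}{\}}{0pt}{}{3}{2} = 3$ and $\genfrac{\{}{\}}{0pt}{}{3}{3} = 1$ (this worked example is an illustration and is not taken from the paper):

```latex
\begin{align*}
\sum_{k=0}^{3} \genfrac{\{}{\}}{0pt}{}{3}{k}\, x^{(k)}
  &= 0 \cdot 1 + 1 \cdot x + 3 \cdot x(x-1) + 1 \cdot x(x-1)(x-2) \\
  &= x + (3x^2 - 3x) + (x^3 - 3x^2 + 2x) \\
  &= x^3 .
\end{align*}
```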
We can now derive a comprehensive formula for the central moments of the negative multinomial distribution.
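Theorem 2 (Central moments). Let $\eta \sim \mathrm{Neg\,Multinomial}(r, x)$. Then, for all $p_1, \dots, p_d \in \mathbb{N}_0$,
$E\Big[\prod_{i=1}^d (\eta_i - E[\eta_i])^{p_i}\Big] = \sum_{\ell_1=0}^{p_1} \cdots \sum_{\ell_d=0}^{p_d} \sum_{k_1=0}^{\ell_1} \cdots \sum_{k_d=0}^{\ell_d} (r - 1 + \|k\|_1)^{(\|k\|_1)} (-r)^{\sum_{i=1}^d (p_i - \ell_i)} \prod_{i=1}^d \binom{p_i}{\ell_i} \genfrac{\{}{\}}{0pt}{}{\ell_i}{k_i} y_i^{p_i - \ell_i + k_i},$
where $\binom{p}{\ell}$ denotes the binomial coefficient $\frac{p!}{\ell! (p - \ell)!}$; recall that
$y_i := \frac{x_i}{1 - \|x\|_1}, \quad \text{for all } i \in \{1, \dots, d\}.$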
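Proof.
By applying the binomial formula to each factor $(\eta_i - E[\eta_i])^{p_i}$ and using the fact that $E[\eta_i] = r y_i$ for all $i \in \{1, \dots, d\}$, we have
$E\Big[\prod_{i=1}^d (\eta_i - E[\eta_i])^{p_i}\Big] = \sum_{\ell_1=0}^{p_1} \cdots \sum_{\ell_d=0}^{p_d} \Big\{\prod_{i=1}^d \binom{p_i}{\ell_i} (-r y_i)^{p_i - \ell_i}\Big\} \, E\Big[\prod_{i=1}^d \eta_i^{\ell_i}\Big].$
Therefore, the conclusion is a direct consequence of Theorem 1.    □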
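As an illustration of this argument (a hand computation, not one of the paper's displayed formulas), take $d = 1$ and $p_1 = 2$. The binomial expansion, together with $E[\eta_1] = r y_1$ and the second non-central moment $E[\eta_1^2] = r y_1 + r (r + 1) y_1^2$ obtained from Theorem 1, recovers the familiar negative binomial variance:

```latex
\begin{align*}
E\big[(\eta_1 - r y_1)^2\big]
  &= E[\eta_1^2] - 2 r y_1 \, E[\eta_1] + r^2 y_1^2 \\
  &= r y_1 + r(r+1) y_1^2 - 2 r^2 y_1^2 + r^2 y_1^2 \\
  &= r y_1 (1 + y_1) .
\end{align*}
```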

4. Numerical Codes

The formulas in Theorems 1 and 2 can be put into practice in Mathematica through the following procedure:
(* Non-central moment E[Product_i eta_i^p_i] of Theorem 1; x and p are lists of length d *)
NonCentral[r_, x_, p_, d_] :=
  Sum[FactorialPower[r - 1 + Sum[k[i], {i, 1, d}],
      Sum[k[i], {i, 1, d}]] * Product[StirlingS2[p[[i]], k[i]] *
      (x[[i]] / (1 - Sum[x[[i]], {i, 1, d}])) ^ k[i], {i, 1, d}], ##] & @@
    ({k[#], 0, p[[#]]} & /@ Range[d]);

(* Central moment E[Product_i (eta_i - E[eta_i])^p_i] of Theorem 2 *)
(* Outer sum: ell[i] from 0 to p[[i]]; inner sum: k[i] from 0 to ell[i] *)
Central[r_, x_, p_, d_] :=
  Sum[Sum[FactorialPower[r - 1 + Sum[k[i], {i, 1, d}],
      Sum[k[i], {i, 1, d}]] * (-r) ^ Sum[p[[i]] - ell[i], {i, 1, d}]
      * Product[Binomial[p[[i]], ell[i]] * StirlingS2[ell[i], k[i]]
      * (x[[i]] / (1 - Sum[x[[i]], {i, 1, d}])) ^
        (p[[i]] - ell[i] + k[i]), {i, 1, d}], ##] & @@
    ({k[#], 0, ell[#]} & /@ Range[d]), ##] & @@
    ({ell[#], 0, p[[#]]} & /@ Range[d]);
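For readers without access to Mathematica, the same two formulas can be sketched in Python. The code below is an illustrative translation, not the author's code: the function names are hypothetical, the dimension is inferred from the length of `p` instead of being passed as `d`, and exact rational arithmetic (`fractions.Fraction`) is assumed so that results can be compared with closed forms without rounding. The function `central` applies the binomial expansion from the proof of Theorem 2 and calls `non_central`, which is algebraically the same double sum as in Theorem 2.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product
from math import comb


@lru_cache(maxsize=None)
def stirling2(p, k):
    """Stirling number of the second kind via S(p,k) = k S(p-1,k) + S(p-1,k-1)."""
    if p == k:
        return 1
    if k == 0 or k > p:
        return 0
    return k * stirling2(p - 1, k) + stirling2(p - 1, k - 1)


def falling(m, k):
    """Falling factorial m^{(k)} = m (m - 1) ... (m - k + 1)."""
    out = Fraction(1)
    for j in range(k):
        out *= m - j
    return out


def _odds(x):
    """y_i = x_i / (1 - ||x||_1) as exact fractions."""
    x = [Fraction(xi) for xi in x]
    return [xi / (1 - sum(x)) for xi in x]


def non_central(r, x, p):
    """E[prod_i eta_i^{p_i}] as in Theorem 1."""
    r, y = Fraction(r), _odds(x)
    total = Fraction(0)
    for k in product(*(range(pi + 1) for pi in p)):
        term = falling(r - 1 + sum(k), sum(k))
        for pi, ki, yi in zip(p, k, y):
            term *= stirling2(pi, ki) * yi ** ki
        total += term
    return total


def central(r, x, p):
    """E[prod_i (eta_i - r y_i)^{p_i}] via the binomial expansion and Theorem 1."""
    r, y = Fraction(r), _odds(x)
    total = Fraction(0)
    for ell in product(*(range(pi + 1) for pi in p)):
        coeff = Fraction(1)
        for pi, li, yi in zip(p, ell, y):
            coeff *= comb(pi, li) * (-r * yi) ** (pi - li)
        total += coeff * non_central(r, x, ell)
    return total
```

For instance, with the hypothetical inputs `r = 3` and `x = [Fraction(1, 5), Fraction(3, 10)]` (so that $y = (2/5, 3/5)$), `central(3, x, (2, 0))` returns `Fraction(42, 25)`, which equals $r y_1 (1 + y_1)$, and `central(3, x, (1, 1))` returns `Fraction(18, 25)`, which equals $r y_1 y_2$.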

5. Explicit Formulas

In the two subsections below, we calculate (explicitly) all the non-central moments up to the eighth order and all the central moments up to the fourth order. Here is a table of the Stirling numbers of the second kind that we will use in our calculations:
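The required values $\genfrac{\{}{\}}{0pt}{}{p}{k}$ for $0 \le k \le p \le 8$ follow from the standard recurrence $\genfrac{\{}{\}}{0pt}{}{p}{k} = k \genfrac{\{}{\}}{0pt}{}{p-1}{k} + \genfrac{\{}{\}}{0pt}{}{p-1}{k-1}$ with $\genfrac{\{}{\}}{0pt}{}{0}{0} = 1$. A minimal Python sketch for regenerating them is given here; the function name `stirling2_triangle` is an assumption made for illustration.

```python
def stirling2_triangle(n_max):
    """Rows p = 0..n_max of Stirling numbers of the second kind S(p, k), k = 0..p."""
    rows = [[1]]  # S(0, 0) = 1
    for p in range(1, n_max + 1):
        prev = rows[-1]
        row = [0] * (p + 1)  # S(p, 0) = 0 for p >= 1
        for k in range(1, p + 1):
            above = prev[k] if k < p else 0  # S(p-1, k), which is 0 when k = p
            row[k] = k * above + prev[k - 1]
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Print the triangle up to the eighth order, one row per value of p.
    for p, row in enumerate(stirling2_triangle(8)):
        print(p, row)
```

Each row sums to the corresponding Bell number, which gives a simple consistency check on the output.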

5.1. Computation of the Non-Central Moments up to the Eighth Order

By utilizing the general expression outlined in Theorem 1 and eliminating the Stirling numbers that are equal to zero, we obtain the following results effortlessly.
1st order: For $j_1 \in \{1, \dots, d\}$,
$E[\eta_{j_1}] = r \, y_{j_1}.$
2nd order: For different $j_1, j_2 \in \{1, \dots, d\}$,
3rd order: For different $j_1, j_2, j_3 \in \{1, \dots, d\}$,
4th order: For different $j_1, j_2, j_3, j_4 \in \{1, \dots, d\}$,
5th order: For different $j_1, j_2, j_3, j_4, j_5 \in \{1, \dots, d\}$,
6th order: For different $j_1, j_2, j_3, j_4, j_5, j_6 \in \{1, \dots, d\}$,
7th order: For different $j_1, j_2, j_3, j_4, j_5, j_6, j_7 \in \{1, \dots, d\}$,
8th order: For different $j_1, j_2, j_3, j_4, j_5, j_6, j_7, j_8 \in \{1, \dots, d\}$,
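To illustrate how the non-zero Stirling numbers select the surviving terms, here is a hand computation (an illustrative example, not one of the paper's displayed formulas) of the third-order mixed moment with $p_{j_1} = 2$, $p_{j_2} = 1$ and all other $p_i = 0$, for different $j_1, j_2$. It assumes the formula of Theorem 1, $E\big[\prod_i \eta_i^{p_i}\big] = \sum_{k} (r - 1 + \|k\|_1)^{(\|k\|_1)} \prod_i \genfrac{\{}{\}}{0pt}{}{p_i}{k_i} y_i^{k_i}$. Since $\genfrac{\{}{\}}{0pt}{}{2}{0} = \genfrac{\{}{\}}{0pt}{}{1}{0} = 0$, only $(k_{j_1}, k_{j_2}) \in \{(1, 1), (2, 1)\}$ contribute:

```latex
\begin{align*}
E\big[\eta_{j_1}^2 \eta_{j_2}\big]
  &= (r+1)^{(2)} \genfrac{\{}{\}}{0pt}{}{2}{1} \genfrac{\{}{\}}{0pt}{}{1}{1} \, y_{j_1} y_{j_2}
   + (r+2)^{(3)} \genfrac{\{}{\}}{0pt}{}{2}{2} \genfrac{\{}{\}}{0pt}{}{1}{1} \, y_{j_1}^2 y_{j_2} \\
  &= r (r+1) \, y_{j_1} y_{j_2} + r (r+1) (r+2) \, y_{j_1}^2 y_{j_2} .
\end{align*}
```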

5.2. Computation of the Central Moments up to the Fourth Order

By combining the results of Section 5.1 with some algebraic manipulations, we can now calculate the central moments explicitly. The simplifications leading to the final expressions below were performed in Mathematica: so many terms cancel in every expression that carrying out the simplifications by hand without error would be virtually impossible. In principle, our methodology yields simplified formulas for the central moments of any order (assuming explicit expressions for the appropriate higher-order non-central moments are calculated as in Section 5.1). However, beyond the fourth order, it would be quite time-consuming to input the base formula for the central moments as a function of the non-central moments and let Mathematica carry out the simplifications. Therefore, for the sake of conciseness, we only present explicit simplified formulas for the central moments up to the fourth order. It is worth noting that the numerical formulas of Section 4 are fast for higher orders (i.e., beyond the fourth order) if the categorical probabilities $x_i$ are known; for unknown values of the $x_i$'s, Mathematica has trouble producing simplified general expressions by itself, which is why the approach below is necessary.
2nd order: For different $j_1, j_2 \in \{1, \dots, d\}$,
3rd order: For different $j_1, j_2, j_3 \in \{1, \dots, d\}$,
4th order: For different $j_1, j_2, j_3, j_4 \in \{1, \dots, d\}$,

6. Open Problems

Here are some research questions for the reader that are of interest:
• Using the moment formulas in the present paper, extend to the negative multinomial distribution the local limit theorem, total variation bound and Le Cam distance bound found in Lemma 1, Theorems 3 and 4 of Ouimet [39] for the negative binomial distribution ($d = 1$).
• Using the moment formulas in the present paper, study the asymptotic properties of the Bernstein estimator with a negative multinomial kernel, as was carried out for the Bernstein estimator with a multinomial kernel on the simplex in Ouimet [40].
• Investigate whether the negative multinomial distribution is a completely monotonic function of its parameters. For the multinomial distribution, this question was answered positively by Ouimet [29] and Qi et al. [41], who showed that it is in fact even logarithmically completely monotonic. The same result was extended to a matrix-parametrized generalization by Ouimet and Qi [42].

Funding

F.O. is supported financially by a postdoctoral fellowship (CRM-Simons) from the Centre de Recherches Mathématiques (Montréal, Canada) and the Simons Foundation.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Sibuya, M.; Yoshimura, I.; Shimizu, R. Negative multinomial distribution. Ann. Inst. Stat. Math. 1964, 16, 409–426.
2. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Discrete Multivariate Distributions; Wiley Series in Probability and Statistics: Applied Probability and Statistics; John Wiley & Sons, Inc.: New York, NY, USA, 1997.
3. Fitzmaurice, G.M.; Laird, N.M.; Ware, J.H. Applied Longitudinal Analysis; Wiley Series in Probability and Statistics; Wiley-Interscience (John Wiley & Sons): Hoboken, NJ, USA, 2004.
4. Cameron, A.C.; Trivedi, P.K. Regression Analysis of Count Data. In Econometric Society Monographs, 2nd ed.; Cambridge University Press: Cambridge, UK, 2013; Volume 53.
5. Haslett, J.; Parnell, A.C.; Hinde, J.; de Andrade Moral, R. Modelling excess zeros in count data: A new perspective on modelling approaches. Int. Stat. Rev. 2022, 90, 216–236.
6. Böckenholt, U. Analyzing multiple emotions over time by autoregressive negative multinomial regression models. J. Am. Stat. Assoc. 1999, 94, 757–765.
7. Böckenholt, U. Mixed INAR(1) Poisson regression models: Analyzing heterogeneity and serial dependencies in longitudinal count data. J. Econom. 1999, 89, 317–338.
8. Bonett, D.G. A linear negative multinomial model. Stat. Probab. Lett. 1985, 3, 127–129.
9. Bonett, D.G. The negative multinomial logit model. Comm. Stat. Theory Methods 1985, 14, 1713–1717.
10. Chiarappa, J.A.; Hoover, D.R. Comparative Poisson clinical trials of multiple experimental treatments vs a single control using the negative multinomial distribution. Stat. Med. 2021, 40, 2452–2466.
11. Guo, G. Negative multinomial regression models for clustered event counts. Sociol. Methodol. 1996, 26, 113–132.
12. Waller, L.A.; Zelterman, D. Log-linear modeling with the negative multinomial distribution. Biometrics 1997, 53, 971–982.
13. Zhang, Y.; Zhou, H.; Zhou, J.; Sun, W. Regression models for multivariate count data. J. Comput. Graph. Stat. 2017, 26, 1–13.
14. Chen, Y.; Wu, Y.; Chen, W.; Zhao, T.; Zhang, W.; Shen, T.-J. Application of a negative multinomial model gives insight into rarity-area relationships. Forests 2020, 11, 571.
15. Bernardoff, P.; Chatelain, F.; Tourneret, J.-Y. Masses of negative multinomial distributions: Application to polarimetric image processing. J. Probab. Stat. 2013, 2013, 170967.
16. Kusi-Appiah, A.O. On the Exchangeable Negative Multinomial Distribution and Applications to Analysis of RNA-Seq. Data. Ph.D. Thesis, The University of Memphis, Memphis, TN, USA, 2016. Available online: https://digitalcommons.memphis.edu/etd/1485 (accessed on 1 July 2023).
17. Mosimann, J.E. On the compound negative multinomial distribution and correlations among inversely sampled pollen counts. Biometrika 1963, 50, 47–54.
18. Afendras, G.; Papathanasiou, V. A note on a variance bound for the multinomial and the negative multinomial distribution. Naval Res. Logist. 2014, 61, 179–183.
19. Bernardoff, P. Which negative multinomial distributions are infinitely divisible? Bernoulli 2003, 9, 877–893.
20. Bernardoff, P. Domain of existence of the Laplace transform of negative multinomial distributions and simulations. Stat. Probab. Lett. 2023, 193, 109709.
21. Evans, M.A.; Bonett, D.G. Maximum likelihood estimation for the negative multinomial log-linear model. Comm. Stat. Theory Methods 1989, 18, 4059–4065.
22. Griffiths, R.C. Orthogonal polynomials on the negative multinomial distribution. J. Multivar. Anal. 1975, 5, 271–277.
23. Hamura, Y.; Kubokawa, T. Bayesian shrinkage estimation of negative multinomial parameter vectors. J. Multivar. Anal. 2020, 179, 104653.
24. Janardan, K.G. A characterization of multinomial and negative multinomial distributions. Scand. Actuar. J. 1974, 1974, 58–62.
25. Joshi, S.W. Integral expressions for tail probabilities of the negative multinomial distribution. Ann. Inst. Stat. Math. 1975, 27, 95–97.
26. Le Gall, F. The modes of a negative multinomial distribution. Stat. Probab. Lett. 2006, 76, 619–624.
27. Olkin, I.; Sobel, M. Integral expressions for tail probabilities of the multinomial and negative multinomial distributions. Biometrika 1965, 52, 167–179.
28. Oller, J.M.; Cuadras, C.M. Rao’s distance for negative multinomial distributions. Sankhyā Ser. A 1985, 47, 75–83.
29. Ouimet, F. Complete monotonicity of multinomial probabilities and its application to Bernstein estimators on the simplex. J. Math. Anal. Appl. 2018, 466, 1609–1617.
30. Panaretos, J. A characterization of the negative multinomial distribution. In Statistical Distributions in Scientific Work; NATO Advanced Study Institutes Series, Volume 79; Springer: Dordrecht, The Netherlands, 1981; Volume 4, pp. 331–339.
31. Rufo, M.J.; Pérez, C.J.; Martín, J. Bayesian analysis of finite mixtures of multinomial and negative-multinomial distributions. Comput. Stat. Data Anal. 2007, 51, 5452–5466.
32. Sagae, M.; Tanabe, K. Symbolic Cholesky decomposition of the variance-covariance matrix of the negative multinomial distribution. Stat. Probab. Lett. 1992, 15, 103–108.
33. Withers, C.S.; Nadarajah, S. The spectral decomposition and inverse of multinomial and negative multinomial covariances. Braz. J. Probab. Stat. 2014, 28, 376–380.
34. Charalambides, C.A. q-multinomial and negative q-multinomial distributions. Comm. Stat. Theory Methods 2021, 50, 5873–5898.
35. Dhar, S.K. Extension of a negative multinomial model. Comm. Stat. Theory Methods 1985, 24, 39–57.
36. Patil, G.P. On multivariate generalized power series distribution and its application to the multinomial and negative multinomial. Sankhyā Ser. A 1966, 28, 225–238.
37. Withers, C.S.; Nadarajah, S. Cumulants of multinomial and negative multinomial distributions. Stat. Probab. Lett. 2014, 87, 18–26.
38. Graham, R.L.; Knuth, D.E.; Patashnik, O. Concrete Mathematics, 2nd ed.; Addison-Wesley Publishing Company: Reading, MA, USA, 1994.
39. Ouimet, F. A refined continuity correction for the negative binomial distribution and asymptotics of the median. Metrika 2023, 23.
40. Ouimet, F. Asymptotic properties of Bernstein estimators on the simplex. J. Multivar. Anal. 2021, 185, 104784.
41. Qi, F.; Niu, D.-W.; Lim, D.; Guo, B.-N. Some logarithmically completely monotonic functions and inequalities for multinomial coefficients and multivariate beta functions. Appl. Anal. Discret. Math. 2020, 14, 512–527.
42. Ouimet, F.; Qi, F. Logarithmically complete monotonicity of a matrix-parametrized analogue of the multinomial distribution. Math. Inequal. Appl. 2022, 25, 703–714.