Article

Poisson Mean Homogeneity: Single-Observation Framework with Applications

1
Department of Computer Science, Mathematics, Physics and Statistics, University of British Columbia, Kelowna, BC V1V 1V7, Canada
2
Department of Mathematics and Statistics, York University, Toronto, ON M3J 1P3, Canada
3
Environmental Instruments Canada Inc., Saskatoon, SK S7L 6M3, Canada
*
Author to whom correspondence should be addressed.
Symmetry 2025, 17(10), 1702; https://doi.org/10.3390/sym17101702
Submission received: 7 July 2025 / Revised: 22 August 2025 / Accepted: 3 September 2025 / Published: 10 October 2025
(This article belongs to the Special Issue Mathematics: Feature Papers 2025)

Abstract

Practical problems often drive the development of new statistical methods by presenting real-world challenges. Testing the homogeneity of n independent Poisson means when only one observation per population is available is considered in this paper. This scenario is common in fields where limited data from multiple sources must be analyzed to determine whether different groups share the same underlying event rate or mean. These settings often exhibit underlying structural or spatial symmetries that influence statistical behavior. Traditional methods that rely on large sample sizes are not applicable. Hence, it is crucial to develop techniques tailored to the constraints of single observations. Under the null hypothesis, with large n and a fixed common mean λ, the likelihood ratio test statistic (LRTS) is shown to be asymptotically normally distributed, with the mean and variance being approximated by a truncation method and a parametric bootstrap method. Moreover, with fixed n and large λ, under the null hypothesis, the LRTS is shown to be asymptotically distributed as a chi-square with n − 1 degrees of freedom. The Bartlett correction method is applied to improve the accuracy of the asymptotic distribution of the LRTS. We highlight the practical relevance of the proposed method through applications to wildfire and radioactive event data, where correlated observations and sparse sampling are common. Simulation studies further demonstrate the accuracy and robustness of the test under various scenarios, making it well-suited for modern applications in environmental science and risk assessment.

1. Introduction

The Poisson distribution is widely used to model the number of events that occur randomly over time or space, with applications in areas such as epidemiology, industrial quality control, and environmental statistics. When the data are from multiple sources, it is of particular interest to determine whether the underlying Poisson populations share the same event rate or mean. Note that testing the homogeneity of Poisson distributions is fundamentally a test of symmetry across datasets, and rejection of the null hypothesis corresponds to evidence of asymmetry in the mean parameters, leading to different stochastic behaviors across populations. For this problem, ref. [1] proposed a Bayesian approach. However, the proposed method requires each population to have more than one observation. On the other hand, refs. [2,3] considered the problem of testing the homogeneity of the means of n independent Poisson populations when there is only one observation for each population. Traditional methods that rely on large sample sizes are not applicable in this setting, making it crucial to develop specialized techniques tailored to the constraints of single observations.
Mathematically, let $X_i$, for $i = 1, \ldots, n$, denote independent Poisson random variables with means $\lambda_i$. Let $x_i$ denote the observed value of $X_i$. The problem of interest is to test
$$H_0: \lambda_1 = \cdots = \lambda_n = \lambda \quad \text{vs.} \quad H_a: \text{not all } \lambda_i \text{ are equal}, \tag{1}$$
where $\lambda$ is the unknown common mean parameter. One of the many tests considered by ref. [2] is the likelihood ratio test (LRT), where the likelihood ratio test statistic (LRTS), $W_n$, is given by
$$W_n = 2 \sum_{i=1}^{n} X_i \log \frac{X_i}{\bar{X}}, \tag{2}$$
with $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$, and $\log(\cdot)$ refers to the natural logarithmic function. Ref. [2] stated that as $n \to \infty$, $W_n$ is asymptotically distributed as a Chi-square distribution with $(n-1)$ degrees of freedom, $\chi^2_{n-1}$. However, ref. [3] proved that for a given $\lambda$, the limiting distribution of $W_n$, as $n \to \infty$, is not $\chi^2_{n-1}$. More precisely, ref. [3] showed that, under the null hypothesis $H_0$, for a given $\lambda$, as $n \to \infty$,
$$\sqrt{n} \left( \frac{W_n}{n} - 2\mu(\lambda) \right) \xrightarrow{d} N\bigl(0, 4\nu(\lambda)\bigr) \;\Longrightarrow\; \frac{n \left( \frac{W_n}{n} - 2\mu(\lambda) \right)^2}{4\nu(\lambda)} \xrightarrow{d} \chi^2_1,$$
where $2n\mu(\lambda)$ and $4n\nu(\lambda)$ are the approximate mean and variance of $W_n$, respectively, with
$$\mu(\lambda) = E\left[ X_i \log \frac{X_i}{\lambda} \right] \quad \text{and} \quad \nu(\lambda) = \mathrm{var}\left[ X_i \left( \log \frac{X_i}{\lambda} - 1 \right) \right].$$
Note that if the exact mean and variance of $W_n$, denoted as $E(W_n)$ and $\mathrm{var}(W_n)$, were available, the central limit theorem would imply that
$$\frac{W_n - E(W_n)}{\sqrt{\mathrm{var}(W_n)}} \xrightarrow{d} N(0, 1) \;\Longrightarrow\; \frac{[W_n - E(W_n)]^2}{\mathrm{var}(W_n)} \xrightarrow{d} \chi^2_1.$$
The difficulty lies in the fact that $\mu(\lambda)$, $\nu(\lambda)$, $E(W_n)$, and $\mathrm{var}(W_n)$ do not have closed-form expressions.
In this paper, we propose two methods to approximate $E(W_n)$ and $\mathrm{var}(W_n)$. The first is a truncation method, which provides simple closed-form approximations of $E(W_n)$ and $\mathrm{var}(W_n)$. The second is a parametric bootstrap method, which, based on the observed sample, numerically approximates $E(W_n)$ and $\mathrm{var}(W_n)$. We compare the accuracy of the p-values obtained by these two methods for testing the hypothesis stated in Equation (1).
Furthermore, real-life applications such as online sequential testing in sequential sampling inspection procedures (see ref. [4]) and quality control charts (see ref. [5]) typically have $n$ fixed while $\lambda \to \infty$, a setting different from that considered in ref. [3]. For this setting, we prove that the limiting distribution of the LRTS is indeed $\chi^2_{n-1}$.
The rest of the paper is organized as follows. In Section 2, we derive the LRTS for testing the homogeneity of the means across $n$ independent Poisson populations, which is symmetrical across populations, each with one observation. For a fixed $\lambda$ and $n \to \infty$, we propose two methods to approximate the mean and variance of the LRTS and use them to compute the p-value for testing Equation (1). In Section 3, the case where $n$ is fixed and $\lambda \to \infty$ is considered, and the asymptotic distribution of the LRTS is derived for testing Equation (1). With $\lambda \to \infty$, a simple closed-form approximation of the mean of the LRTS is derived and used in the Bartlett correction to provide a more accurate approximation of the p-value.
In addition, a parametric bootstrap method is also proposed to approximate the mean of the LRTS, and, using the Bartlett correction again, an accurate approximation of the p-value is obtained. Section 4 presents numerical examples that demonstrate the application and accuracy of the proposed methods. Simulation results show that even for small $n$ and small $\lambda$, the Bartlett correction using the bootstrap-approximated mean of the LRTS yields highly accurate results, although it is computationally intensive. Some concluding remarks are given in Section 5.

2. Asymptotic Distribution of LRTS When n Is Large and λ Is Fixed

Let $X_i$, $i = 1, \ldots, n$, denote independent Poisson($\lambda_i$) random variables with only one observation, $x_i$, from each distribution. To test the symmetry across the $n$ independent Poisson populations, the null and alternative hypotheses are stated in Equation (1). For this problem, the log-likelihood function is
$$\ell(\lambda_1, \ldots, \lambda_n) = \sum_{i=1}^{n} \left[ -\log(x_i!) - \lambda_i + x_i \log \lambda_i \right].$$
It can be shown that the overall maximum likelihood estimate (MLE) of $(\lambda_1, \ldots, \lambda_n)$ is given by $(\hat{\lambda}_1, \ldots, \hat{\lambda}_n)$, where $\hat{\lambda}_i = x_i$ for $i = 1, \ldots, n$. The log-likelihood function evaluated at the overall MLE is
$$\hat{\ell} = \ell(\hat{\lambda}_1, \ldots, \hat{\lambda}_n) = \sum_{i=1}^{n} \left[ -\log(x_i!) - x_i + x_i \log x_i \right].$$
When $H_0$ is true, the constrained log-likelihood function is
$$\ell(\lambda, \ldots, \lambda) = \sum_{i=1}^{n} \left[ -\log(x_i!) - \lambda + x_i \log \lambda \right].$$
Then, the constrained MLE of $\lambda$ is $\tilde{\lambda} = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x}$, and the log-likelihood function evaluated at the constrained MLE is
$$\tilde{\ell} = \ell(\tilde{\lambda}, \ldots, \tilde{\lambda}) = \sum_{i=1}^{n} \left[ -\log(x_i!) - \bar{x} + x_i \log \bar{x} \right].$$
Thus, as given in [2], the LRTS is
$$W_n = 2 \left( \hat{\ell} - \tilde{\ell} \right) = 2 \sum_{i=1}^{n} X_i \log \frac{X_i}{\bar{X}}.$$
Note that this LRTS, $W_n$, measures deviation from symmetry across the $n$ independent Poisson populations. Under the null hypothesis, the means are assumed to be equal, reflecting a symmetric structure. A significant value of $W_n$, therefore, signals a symmetry-breaking event, indicating heterogeneity among the populations. Accurately deriving and approximating the distribution of $W_n$ is crucial as it underpins valid statistical inference about such structural deviations. Beyond its inferential utility, this analysis contributes to the broader understanding of invariant properties and symmetry-related transformations in statistical systems, which are core themes within the scope of Symmetry. By identifying and quantifying departures from symmetry, this work aligns with the journal’s emphasis on both the theoretical foundations and applied implications of symmetry and asymmetry in mathematical and scientific contexts.
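As a minimal sketch of how the LRTS could be evaluated from one count per population, the Python function below computes $W_n$ directly from its definition. The function name `lrts` and the example counts are hypothetical, and the convention $0 \log 0 = 0$ for zero counts is an assumption that the text does not state explicitly.

```python
import numpy as np

def lrts(counts):
    """Likelihood ratio test statistic W_n = 2 * sum_i x_i * log(x_i / xbar)."""
    x = np.asarray(counts, dtype=float)
    xbar = x.mean()
    if xbar == 0:
        # All counts are zero: every term is 0 * log 0, taken as 0 (assumption).
        return 0.0
    pos = x > 0  # zero counts contribute nothing under the 0 * log 0 = 0 convention
    return 2.0 * np.sum(x[pos] * np.log(x[pos] / xbar))

# Hypothetical counts, one observation per population.
print(lrts([3, 5, 4, 0, 6]))
```

Identical counts give $W_n = 0$, and the statistic grows as the counts spread away from their common average.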
In ref. [2], the authors stated that as $n \to \infty$,
$$W_n \xrightarrow{d} \chi^2_{n-1}.$$
However, ref. [3] showed that this result is incorrect. More specifically, the authors showed that for a fixed $\lambda$, and as $n \to \infty$,
$$\sqrt{n} \left( \frac{W_n}{n} - 2\mu(\lambda) \right) \xrightarrow{d} N\bigl(0, 4\nu(\lambda)\bigr) \;\Longrightarrow\; \frac{n \left( \frac{W_n}{n} - 2\mu(\lambda) \right)^2}{4\nu(\lambda)} \xrightarrow{d} \chi^2_1,$$
where
$$\mu(\lambda) = E\left[ X_i \log \frac{X_i}{\lambda} \right] \quad \text{and} \quad \nu(\lambda) = \mathrm{var}\left[ X_i \left( \log \frac{X_i}{\lambda} - 1 \right) \right].$$
Since $\lambda$ is unknown, and assuming $H_0$ is true, $\lambda$ can be estimated by $\tilde{\lambda} = \bar{x}$. However, $\mu(\tilde{\lambda})$ and $\nu(\tilde{\lambda})$ do not have closed-form expressions.
In the following subsections, we first propose a truncation method to approximate $\mu(\tilde{\lambda})$ and $\nu(\tilde{\lambda})$ with closed-form expressions. Thus, $E(W_n)$ and $\mathrm{var}(W_n)$ can be approximated. Alternatively, we propose a parametric bootstrap method to numerically estimate $E(W_n)$ and $\mathrm{var}(W_n)$. Once $E(W_n)$ and $\mathrm{var}(W_n)$ are available, either through the truncation method or the parametric bootstrap method, we apply the central limit theorem to obtain the p-value for testing the symmetry across the $n$ independent distributions:
$$p\text{-value} = P\left( \chi^2_1 > \frac{[w - E(W_n)]^2}{\mathrm{var}(W_n)} \right), \tag{6}$$
where $w$ is the observed LRTS. A small p-value suggests evidence of symmetry breaking, while a large p-value indicates no evidence against the assumed symmetry.

2.1. Truncation Method

The truncated Poisson random variable in mixed effects models was introduced in ref. [6]. Based on the saddlepoint approximation, ref. [7] approximated the nonlinear moments of truncated Poisson random variables. In this section and in Appendix A, we follow the arguments in ref. [7] and derive closed-form expressions for the mean and variance of the LRTS. More specifically, let $X_i$, $i = 1, \ldots, n$, be independent Poisson($\lambda$) variables. Assume that $m$ and $\ell$ are positive integers. Then,
$$E(X_i^m \log^\ell X_i) = \sum_{k=1}^{\infty} k^m \log^\ell k \, \frac{e^{-\lambda} \lambda^k}{k!} = \sum_{k=1}^{\infty} \lambda k^{m-1} \log^\ell k \, \frac{e^{-\lambda} \lambda^{k-1}}{(k-1)!} = \lambda E\left[ (X_i + 1)^{m-1} \log^\ell (X_i + 1) \right].$$
Hence, for $m = 1$ and $\ell = 1$, we have
$$E(X_i \log X_i) = \lambda E[\log(X_i + 1)] = \lambda \left\{ E[\log(X_i + 1) I(X_i \le n\lambda)] + E[\log(X_i + 1) I(X_i > n\lambda)] \right\},$$
where
$$I(X_i \le n\lambda) = \begin{cases} 1 & \text{if } X_i \le n\lambda, \\ 0 & \text{otherwise}. \end{cases}$$
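The shift identity $E(X_i^m \log^\ell X_i) = \lambda E[(X_i+1)^{m-1} \log^\ell (X_i+1)]$ used above can be checked numerically. The sketch below compares both sides by summing the Poisson series up to a cutoff; the function names and the cutoff `kmax = 400` are illustrative assumptions, with the cutoff chosen large enough that the neglected tail is negligible for the small values of $\lambda$ tried here.

```python
import numpy as np
from scipy.stats import poisson

def moment_lhs(lam, m, ell, kmax=400):
    """E[X^m log^ell X], summed over k = 1..kmax (the k = 0 term is zero)."""
    k = np.arange(1, kmax + 1, dtype=float)
    return np.sum(k**m * np.log(k)**ell * poisson.pmf(k, lam))

def moment_rhs(lam, m, ell, kmax=400):
    """lam * E[(X+1)^(m-1) log^ell (X+1)], summed over k = 0..kmax."""
    k = np.arange(0, kmax + 1, dtype=float)
    return lam * np.sum((k + 1.0)**(m - 1) * np.log(k + 1.0)**ell * poisson.pmf(k, lam))

# Compare both sides for a few hypothetical settings.
for lam in (0.5, 3.0, 10.0):
    print(lam, moment_lhs(lam, 1, 1), moment_rhs(lam, 1, 1))
```

The two columns should agree to numerical precision, which is what allows $E(X_i \log X_i)$ to be rewritten in terms of $E[\log(X_i + 1)]$.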
Moreover, for $X_i \ge 1$, we have $\log(X_i + 1) \le e^{X_i/2}$. Thus,
$$E[\log(X_i + 1) I(X_i > n\lambda)] \le E\left[ e^{X_i/2} I(X_i > n\lambda) \right] \le e^{-n\lambda/2} E\left( e^{X_i} \right) = e^{\lambda(e-1) - n\lambda/2}.$$
Therefore, for a given $\lambda$, as $n \to \infty$, $E[\log(X_i + 1) I(X_i > n\lambda)] \to 0$. Since $\mu(\lambda)$ can be expressed as
$$\mu(\lambda) = E(X_i \log X_i) - \lambda \log \lambda = \lambda E[\log(X_i + 1)] - \lambda \log \lambda = \lambda \left\{ E[\log(X_i + 1) I(X_i \le n\lambda)] + E[\log(X_i + 1) I(X_i > n\lambda)] \right\} - \lambda \log \lambda,$$
as $n \to \infty$, for a given $\lambda$, we have
$$\mu(\lambda) \approx \lambda E[\log(X_i + 1) I(X_i \le n\lambda)] - \lambda \log \lambda = \lambda \sum_{k=1}^{n\lambda} \log(k+1) \, p_\lambda(k) - \lambda \log \lambda, \tag{8}$$
where $p_\lambda(\cdot)$ is the probability mass function of the Poisson($\lambda$) distribution.
Following the same argument, in Appendix A, we show that for a given $\lambda$, and as $n \to \infty$, applying the truncation method gives
$$\begin{aligned}
\nu(\lambda) \approx{} & \lambda^2 \sum_{k=0}^{n\lambda} \log^2(k+2) \, p_\lambda(k) - 2\lambda^2 (1 + \log \lambda) \sum_{k=0}^{n\lambda} \log(k+2) \, p_\lambda(k) \\
& + \lambda \sum_{k=0}^{n\lambda} \log^2(k+1) \, p_\lambda(k) - 2\lambda (1 + \log \lambda) \sum_{k=0}^{n\lambda} \log(k+1) \, p_\lambda(k) \\
& + (\lambda + \lambda^2) [1 + \log \lambda]^2 - [\mu(\lambda) - \lambda]^2,
\end{aligned}$$
where $\mu(\lambda)$ is given in Equation (8). Finally, the mean and variance of $W_n$ can be approximated by $2n\mu(\lambda)$ and $4n\nu(\lambda)$, respectively.
In the case that $\lambda$ is unknown, the estimate of $\lambda$ under $H_0$ is the constrained MLE $\tilde{\lambda} = \bar{x}$. Hence, $E(W_n)$ and $\mathrm{var}(W_n)$ can be approximated by $2n\mu(\tilde{\lambda})$ and $4n\nu(\tilde{\lambda})$, respectively. Thus, the p-value for testing Equation (1) can be obtained using Equation (6). Finally, evidence of symmetry across the $n$ independent Poisson populations can be determined from the p-value.
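The truncation method can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the upper summation limit is taken as the integer part of $n\tilde{\lambda}$, zero counts are handled with $0 \log 0 = 0$, and the sample mean is assumed to be strictly positive.

```python
import numpy as np
from scipy.stats import poisson, chi2

def truncated_moments(lam, n):
    """Truncated-sum approximations of mu(lam) and nu(lam).

    Assumption: the sums stop at floor(n * lam); requires lam > 0.
    """
    K = int(np.floor(n * lam))
    k = np.arange(0, K + 1)
    p = poisson.pmf(k, lam)
    c = 1.0 + np.log(lam)
    l1, l2 = np.log(k + 1.0), np.log(k + 2.0)
    # The k = 0 term of the mu sum is log(1) = 0, so starting at 0 or 1 is the same.
    mu = lam * np.sum(l1 * p) - lam * np.log(lam)
    second_moment = (lam**2 * np.sum(l2**2 * p) - 2 * lam**2 * c * np.sum(l2 * p)
                     + lam * np.sum(l1**2 * p) - 2 * lam * c * np.sum(l1 * p)
                     + (lam + lam**2) * c**2)
    nu = second_moment - (mu - lam)**2
    return mu, nu

def truncation_pvalue(counts):
    """p-value from Equation (6) with E(W_n) ~ 2 n mu and var(W_n) ~ 4 n nu."""
    x = np.asarray(counts, dtype=float)
    n, lam_tilde = x.size, x.mean()
    pos = x > 0
    w = 2.0 * np.sum(x[pos] * np.log(x[pos] / lam_tilde))
    mu, nu = truncated_moments(lam_tilde, n)
    stat = (w - 2 * n * mu)**2 / (4 * n * nu)
    return chi2.sf(stat, df=1)

# Hypothetical data: 100 populations simulated with a common mean of 3.
rng = np.random.default_rng(0)
print(truncation_pvalue(rng.poisson(3.0, size=100)))
```

Because every sum is finite and short, the cost is dominated by evaluating the Poisson probabilities once per data set.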

2.2. Parametric Bootstrap Method

As an alternative to the truncation method proposed in the previous subsection, we propose a parametric bootstrap method to numerically approximate the mean and variance of $W_n$. The idea of approximating the mean of an LRTS has been demonstrated in ref. [8] via cointegration testing, and also in ref. [9] for testing the equality of Gaussian graphical models. In this paper, we adopt the same approach to approximate the mean and variance of $W_n$. The procedure is summarized in Algorithm 1.
Algorithm 1: Approximating the mean and variance of the LRTS
Have: From the observed data $x_1, \ldots, x_n$, calculate the observed LRTS $w$ by
$$w = 2 \sum_{i=1}^{n} x_i \log \frac{x_i}{\bar{x}},$$
where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \tilde{\lambda}$.
Step 1: Obtain a parametric bootstrap sample of size $n$, $x_1^b, \ldots, x_n^b$, from Poisson($\tilde{\lambda}$).
Step 2: From the bootstrap sample, calculate the observed LRTS $w^b$ by
$$w^b = 2 \sum_{i=1}^{n} x_i^b \log \frac{x_i^b}{\bar{x}^b},$$
where $\bar{x}^b = \frac{1}{n} \sum_{i=1}^{n} x_i^b$.
Step 3: Repeat Steps 1 and 2 $M$ times, where $M$ is large, to obtain $w_1^b, \ldots, w_M^b$.
Step 4: Then
$$\bar{w}^b = \frac{1}{M} \sum_{i=1}^{M} w_i^b \quad \text{and} \quad s^2_{w^b} = \frac{1}{M-1} \sum_{i=1}^{M} \left( w_i^b - \bar{w}^b \right)^2$$
are the unbiased estimates of $E(W_n)$ and $\mathrm{var}(W_n)$, respectively.
Hence, the p-value for testing Equation (1) can be obtained from Equation (6). Finally, evidence of symmetry across the n independent Poisson populations can be determined from the p-value.
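Algorithm 1 translates naturally into vectorized code. The sketch below is one possible Python rendering and not the authors' code: the function names, the default `M = 5000`, the fixed seed, and the $0 \log 0 = 0$ convention are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import chi2

def lrts_rows(x):
    """Row-wise W_n for a 2-D array of counts (0 * log 0 taken as 0)."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(x > 0, x * np.log(x / xbar), 0.0)
    return 2.0 * terms.sum(axis=1)

def bootstrap_pvalue(counts, M=5000, seed=0):
    """Parametric bootstrap estimates of E(W_n), var(W_n) and the Equation (6) p-value."""
    x = np.asarray(counts, dtype=float)
    n, lam_tilde = x.size, x.mean()
    w = lrts_rows(x[None, :])[0]                         # observed LRTS
    rng = np.random.default_rng(seed)
    wb = lrts_rows(rng.poisson(lam_tilde, size=(M, n)))  # Steps 1 to 3
    mean_b, var_b = wb.mean(), wb.var(ddof=1)            # Step 4
    p = chi2.sf((w - mean_b)**2 / var_b, df=1)
    return p, mean_b, var_b

# Hypothetical data: 50 populations simulated with a common mean of 4.
data = np.random.default_rng(1).poisson(4.0, size=50)
print(bootstrap_pvalue(data, M=2000))
```

All $M$ bootstrap samples are drawn in a single call, so memory grows as $M \times n$; for very large $M$ the draws could be processed in batches instead.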

2.3. Numerical Examples

Example 1. 
We generated one set of data from each of the following five cases stated in Table 1:
Table 2 reports the p-values for testing
$$H_0: \lambda_1 = \cdots = \lambda_n \quad \text{vs.} \quad H_a: \text{not all } \lambda_i \text{ are equal},$$
calculated from the four methods discussed in this paper:
  • BZ: method discussed in ref. [2];
  • FWT: method discussed in ref. [3];
  • Truncated: truncated method;
  • Bootstrap: parametric bootstrap method.
Note that rejecting $H_0$ corresponds to evidence of asymmetry in the mean parameters. For FWT, calculating $\mu(\tilde{\lambda})$ and $\nu(\tilde{\lambda})$ requires an infinite sum, which we evaluated numerically using the first 1,000,000 terms. Also, for Bootstrap, we use M = 1,000,000 resamples. Note that for cases 1 to 4, the data were generated using the same mean parameters. Therefore, we expect to obtain large p-values. In contrast, for case 5, the data were generated using different mean parameters, so we expect small p-values. The results are recorded in Table 2.
It is clear from Table 2 that the results from ref. [2] are substantially different from those of the other three methods, and theoretically, we know they are incorrect. The results from ref. [3] and the results from the proposed truncation method are the same. However, in terms of computation time, the method in ref. [3] takes, on average, 10 to 15 times longer to obtain the results than the proposed truncation method. The parametric bootstrap method gives results similar to those obtained by the truncation method. However, due to the large number of simulations required (we use M = 1,000,000), the parametric bootstrap method takes, on average, 600 times longer than the truncation method to obtain the results.
Example 2. 
We consider the five cases studied in Example 1. For each case, we perform the following calculations:
1. 
Generate a sample of size n.
2. 
Record the p-value calculated from the truncation method and the parametric bootstrap method. We did not include the method from ref. [2] because it is incorrect, and we did not include the method from ref. [3] because it gives the same answer as the truncation method.
3. 
Repeat the above steps N times.
4. 
Report the proportion of p-values that are less than α out of these N simulated samples, where α is a preset significance level.
In our study, we set α = 0.05 and N = 10,000. The corresponding standard error is $\sqrt{0.05 (1 - 0.05) / 10{,}000} = 0.0022$. To reduce computation time, we use M = 5000 resamples for the parametric bootstrap method in this example.
As discussed in ref. [10], if the data are generated under $H_0$ (symmetry across populations), then the proportion of p-values less than α should be close to the nominal level α with standard error $\sqrt{\alpha (1 - \alpha) / N}$. On the other hand, if the data are generated under $H_a$ (asymmetry in mean parameters), then the proportion of p-values less than α corresponds to the power of the test at a 5% level of significance. Hence, a higher proportion corresponds to a more powerful test.
For our simulation, data in cases 1 to 4 are generated from independent Poisson distributions with the same mean parameter. The best method is the one that yields the proportion of p-values less than α closest to the nominal level α. However, for case 5, data are generated from independent Poisson distributions with different mean parameters. The best method is the one that yields the largest proportion of p-values less than α.
The results of this simulated study are recorded in Table 3. We observed that the parametric bootstrap method gives results that are slightly closer to the nominal α than those obtained by the truncation method (see cases 1 to 4). Moreover, the parametric bootstrap method also has slightly higher power than the truncation method (see case 5). But the results are not significantly different. The advantage of the truncation method is the simplicity in the calculation, whereas the parametric bootstrap method is very time-consuming because of the required bootstrapping.

3. Asymptotic Distribution of LRTS When n Is Small but λ Is Large

In many applied contexts, especially in real-time monitoring systems, one often encounters a small number of independent Poisson-distributed observations with large expected counts. A relevant example arises in online sequential testing, where a detector continuously monitors radiation by recording the number of emissions per second. The detector maintains a moving average of the most recent $n$ counts, with $n$ being fixed and small, while the underlying Poisson means are large. Under normal operating conditions, these counts are expected to be statistically similar, reflecting distributional symmetry over time. However, when a newly observed emission count deviates substantially from the previous ones, this symmetry is broken, suggesting an anomaly or system shift. In such cases, the detector triggers a reset mechanism. Currently, systems often rely on a moving average of four observations to determine when this symmetry-breaking event occurs. This scenario can be formulated as a statistical hypothesis testing problem. Let $x_i$ denote the $i$th observed number of emissions, assumed to follow an independent Poisson($\lambda_i$) distribution, for $i = 1, \ldots, n$. Each $\lambda_i$ represents the mean emission rate per second and is assumed to be large. The goal is to test symmetry across $n$ independent Poisson populations based on the observed sample $(x_1, \ldots, x_n)$, and the statistical hypothesis is stated in Equation (1). The LRTS $W_n$ is defined as in Section 2, measuring the deviation from symmetry of the independent populations. However, the problem differs from the setting discussed in Section 2, as here, $n$ is small and the common mean $\lambda$ is large. Thus, the asymptotic distribution of $W_n$ is different from the one derived in Section 2.
To address this problem, we first derive the asymptotic distribution of $W_n$ under the condition that $n$ is fixed and $\lambda \to \infty$. Then, we propose three methods for approximating the p-value for testing the hypothesis stated in Equation (1). The first method directly utilizes the asymptotic distribution of $W_n$. The second and third methods are modifications of the Bartlett correction method (see ref. [11]), where the mean of $W_n$ is approximated by using a special case of the truncation method described in Section 2, and by the parametric bootstrap method, as given in Section 2. An accurate approximation of the distribution of $W_n$ is essential as it enables precise statistical inference regarding potential departures from symmetry in the underlying system. Such approximations not only support the detection of symmetry-breaking phenomena but also contribute to the broader understanding of invariant structures and transformations that align with the theoretical and applied focus of Symmetry.

3.1. Obtaining p-Value Based on LRTS

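Since we are considering the case where $\lambda \to \infty$, we can rewrite $\lambda = m \lambda_0$, where $\lambda_0$ is fixed, and $m \to \infty$. The problem can then be reformulated as $X_i = \sum_{j=1}^{m} Y_j$, where each $Y_j$ follows a Poisson distribution with parameter $\lambda_0$. Hence, as $m \to \infty$, the central limit theorem gives
$$\frac{X_i - \lambda}{\sqrt{\lambda}} \xrightarrow{d} N(0, 1) \;\Longrightarrow\; \sum_{i=1}^{n} \frac{(X_i - \lambda)^2}{\lambda} \xrightarrow{d} \chi^2_n.$$
Since $\bar{X} / \lambda$ converges to 1 in probability, it follows that
$$\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\bar{X}} \xrightarrow{d} \chi^2_{n-1}.$$
The LRTS given in Equation (2) is
$$W_n = 2 \sum_{i=1}^{n} X_i \log \frac{X_i}{\bar{X}}.$$
Using the Taylor expansion of $W_n$, we have
$$W_n = 2 \sum_{i=1}^{n} X_i \log \left( \frac{X_i}{\bar{X}} - 1 + 1 \right) = 2 \sum_{i=1}^{n} X_i \left( \frac{X_i}{\bar{X}} - 1 \right) - \sum_{i=1}^{n} X_i \left( \frac{X_i}{\bar{X}} - 1 \right)^2 + o_p(1).$$
The first term simplifies to
$$2 \sum_{i=1}^{n} X_i \left( \frac{X_i}{\bar{X}} - 1 \right) = 2 \sum_{i=1}^{n} X_i \, \frac{X_i - \bar{X}}{\bar{X}} = 2 \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\bar{X}} + 2 \sum_{i=1}^{n} (X_i - \bar{X}) = 2 \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\bar{X}},$$
and the second term simplifies to
$$\sum_{i=1}^{n} X_i \left( \frac{X_i}{\bar{X}} - 1 \right)^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^3}{\bar{X}^2} + \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\bar{X}} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\bar{X}} + o_p(1).$$
Therefore, $W_n = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\bar{X}} + o_p(1)$, and it follows that $W_n \xrightarrow{d} \chi^2_{n-1}$. Thus, the p-value for testing the hypothesis stated in Equation (1) is
$$p\text{-value} = P\left( \chi^2_{n-1} > w \right), \tag{10}$$
where $w$ is the observed LRTS.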
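To make the large-$\lambda$ result concrete, the following sketch computes the LRTS, the Pearson-type statistic $\sum_i (x_i - \bar{x})^2 / \bar{x}$ that it approximates, and the $\chi^2_{n-1}$ p-value. The function name and the example counts are hypothetical, and `scipy.stats.chi2.sf` is used for the upper-tail probability.

```python
import numpy as np
from scipy.stats import chi2

def lrt_pvalue(counts):
    """Return (W_n, Pearson-type statistic, chi-square(n-1) p-value)."""
    x = np.asarray(counts, dtype=float)
    xbar = x.mean()
    pos = x > 0  # 0 * log 0 taken as 0 (assumption)
    w = 2.0 * np.sum(x[pos] * np.log(x[pos] / xbar))
    pearson = np.sum((x - xbar)**2) / xbar
    return w, pearson, chi2.sf(w, df=x.size - 1)

# Hypothetical counts with a large common mean and n = 4.
w, pearson, p = lrt_pvalue([1010, 985, 1032, 990])
print(w, pearson, p)
```

For counts of this size, the two statistics differ only by the lower-order terms that vanish in the Taylor expansion above.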

3.2. Bartlett Correction of LRTS

As stated in ref. [12], Equation (10) has a convergence rate of order $O(n^{-1})$. Bartlett (see ref. [11]) proposed the Bartlett correction method to improve the convergence rate of the p-value to $O(n^{-2})$, and ref. [13] gives a detailed summary of the Bartlett correction method. Although the Bartlett correction method has a high convergence rate, as is demonstrated in refs. [14,15], the exact Bartlett correction factor is very difficult to obtain. An alternative way of explaining the Bartlett correction method is to obtain a transformation of $W_n$ such that the limiting distribution of the transformed statistic remains a Chi-square distribution but the mean of the transformed statistic exactly matches the mean of the limiting distribution. This can be achieved by using the scale transformation
$$W_n^* = \frac{W_n}{E(W_n) / (n - 1)},$$
which ensures that the mean of $W_n^*$ is $(n - 1)$. Thus, according to ref. [11],
$$p\text{-value} = P\left( \chi^2_{n-1} > w^* \right),$$
where $w^*$ is the observed value of $W_n^*$, and this method achieves a convergence rate of $O(n^{-2})$. A more accurate approximation of the p-value provides stronger evidence for detecting whether the underlying symmetry of mean homogeneity across Poisson populations is preserved or broken. This leads to more precise decision-making and directly supports the journal’s focus on identifying and understanding symmetry-breaking phenomena in statistical structures.
However, $E(W_n)$ does not have a closed-form expression. As discussed in Section 2, we propose two methods to approximate $E(W_n)$: a version of the truncation method, and a parametric bootstrap method.

3.2.1. Truncation Method

Let $X$ be a Poisson random variable with mean $m \lambda_0$. By following the arguments in [7], we have
$$E[\log(1 + X)] = \log(1 + m\lambda_0) - \frac{\dfrac{m^2 \lambda_0}{(1 + m\lambda_0)^2}}{2m} + \frac{-\dfrac{18 m^4 \lambda_0^2}{(1 + m\lambda_0)^4} + \dfrac{8 m^3 \lambda_0}{(1 + m\lambda_0)^3}}{24 m^2} + O(m^{-3}).$$
Let $\lambda = m \lambda_0$; then, we have
$$E[\log(1 + X)] = \log(1 + \lambda) - \frac{\lambda (1 + \lambda)^{-2}}{2} - \frac{5}{12 \lambda^2} + O(\lambda^{-3}).$$
Moreover, let $S_n = \sum_{i=1}^{n} X_i$, where $S_n \sim$ Poisson($n\lambda$). Then,
$$E[\log(1 + S_n)] = \log(1 + n\lambda) - \frac{n\lambda (1 + n\lambda)^{-2}}{2} - \frac{5}{12 n^2 \lambda^2} + O(\lambda^{-3}).$$
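Since $\log n + \log(1 + \lambda) - \log(1 + n\lambda) = \log(1 + \lambda^{-1}) - \log\{1 + (n\lambda)^{-1}\}$, by Taylor expansion of $\log(1 + x)$, we have
$$\log n + \log(1 + \lambda) - \log(1 + n\lambda) = \frac{1}{\lambda} - \frac{1}{2\lambda^2} - \frac{1}{n\lambda} + \frac{1}{2 n^2 \lambda^2} + O(\lambda^{-3}).$$
The remaining second-order terms, $-\frac{5}{12\lambda^2}$ and $+\frac{5}{12 n^2 \lambda^2}$, carry over directly from the expansions of $E[\log(1 + X)]$ and $E[\log(1 + S_n)]$.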
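Furthermore, by Taylor expansion of $\frac{\lambda (1 + \lambda)^{-2}}{2}$, we have $\frac{\lambda (1 + \lambda)^{-2}}{2} = \frac{1}{2\lambda} - \frac{1}{\lambda^2} + O(\lambda^{-3})$, and similarly, $\frac{n\lambda (1 + n\lambda)^{-2}}{2} = \frac{1}{2n\lambda} - \frac{1}{n^2 \lambda^2} + O(\lambda^{-3})$. Finally,
$$\begin{aligned}
E(W_n) &= 2 \left[ n E(X \log X) - E(S_n \log S_n) + n\lambda \log n \right] \\
&= 2 n \lambda \left\{ E[\log(1 + X)] - E[\log(1 + S_n)] + \log n \right\} \\
&= 2 n \lambda \left[ \frac{1}{\lambda} - \frac{1}{2\lambda^2} - \frac{1}{n\lambda} + \frac{1}{2 n^2 \lambda^2} - \frac{1}{2\lambda} + \frac{1}{\lambda^2} + \frac{1}{2n\lambda} - \frac{1}{n^2 \lambda^2} - \frac{5}{12\lambda^2} + \frac{5}{12 n^2 \lambda^2} + O(\lambda^{-3}) \right] \\
&= n - 1 + \frac{n^2 - 1}{6 n \lambda} + O(\lambda^{-2}).
\end{aligned}$$
Therefore, under $H_0$, we have
$$E(W_n) \approx n - 1 + \frac{n^2 - 1}{6 n \tilde{\lambda}}. \tag{15}$$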
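A short sketch of the Bartlett correction with the closed-form mean in Equation (15) follows. It is illustrative only: the function name and the example counts are hypothetical, $n \ge 2$ and a positive sample mean are assumed, and the uncorrected p-value is returned alongside the corrected one for comparison.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_truncated_pvalue(counts):
    """Return (uncorrected LRT p-value, Bartlett-corrected p-value).

    E(W_n) is approximated by n - 1 + (n^2 - 1) / (6 n lam_tilde), as in Equation (15).
    """
    x = np.asarray(counts, dtype=float)
    n, lam_tilde = x.size, x.mean()
    pos = x > 0  # 0 * log 0 taken as 0 (assumption)
    w = 2.0 * np.sum(x[pos] * np.log(x[pos] / lam_tilde))
    mean_w = (n - 1) + (n**2 - 1) / (6.0 * n * lam_tilde)
    w_star = w / (mean_w / (n - 1))  # rescaled so that its mean is about n - 1
    return chi2.sf(w, df=n - 1), chi2.sf(w_star, df=n - 1)

# Hypothetical counts with a moderate common mean.
print(bartlett_truncated_pvalue([12, 7, 15, 9]))
```

Since the approximate mean exceeds $n - 1$, the rescaled statistic is smaller than $w$, so the corrected p-value is never below the uncorrected one; the gap shrinks as $\tilde{\lambda}$ grows.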

3.2.2. Parametric Bootstrap Method

We can also apply the parametric bootstrap method discussed in Section 2 to this problem. Using Algorithm 1 given in Section 2, we obtain $\bar{w}^b$, which is an unbiased estimate of $E(W_n)$.
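Combining the bootstrap mean with the scale transformation gives the BCBootstrap p-value. The sketch below is an assumption-laden illustration: the function names, `M = 5000`, the fixed seed, and the example counts are hypothetical, and only the bootstrap mean (not the variance) is used, in line with the Bartlett correction described above.

```python
import numpy as np
from scipy.stats import chi2

def lrts_rows(x):
    """Row-wise W_n for a 2-D array of counts (0 * log 0 taken as 0)."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(x > 0, x * np.log(x / xbar), 0.0)
    return 2.0 * terms.sum(axis=1)

def bc_bootstrap_pvalue(counts, M=5000, seed=0):
    """Bartlett-corrected p-value with E(W_n) estimated by parametric bootstrap."""
    x = np.asarray(counts, dtype=float)
    n, lam_tilde = x.size, x.mean()
    w = lrts_rows(x[None, :])[0]
    rng = np.random.default_rng(seed)
    mean_b = lrts_rows(rng.poisson(lam_tilde, size=(M, n))).mean()
    w_star = w * (n - 1) / mean_b  # scale so the bootstrap mean maps to n - 1
    return chi2.sf(w_star, df=n - 1)

# Hypothetical counts with a small common mean and small n.
print(bc_bootstrap_pvalue([6, 3, 9]))
```

The result carries Monte Carlo error of order $1/\sqrt{M}$ through the estimated mean, which is the price paid for not relying on a large-$\lambda$ expansion.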

3.3. Numerical Studies

Example 3. 
We generated a dataset from each of the following six cases stated in Table 4:
Table 5 reports the p-values calculated from the three methods for testing the hypothesis stated in Equation (1): the likelihood ratio method (LRT), the Bartlett correction method where the mean of the LRTS is approximated by Equation (15) (BCTruncated), and the Bartlett correction method where the mean of the LRTS is approximated by the parametric bootstrap method (BCBootstrap). For the parametric bootstrap, we use M = 1,000,000 resamples. Note that rejecting $H_0$ corresponds to evidence of asymmetry in the mean parameters. Similar to Example 1, for the first four cases, the data were generated using the same mean parameter. Hence, we expect large p-values. For cases 5 and 6, the data were generated using different mean parameters, so we expect small p-values.
In Table 5, the p-values across all three methods are evidently similar. The parametric bootstrap method takes approximately 500 times longer to obtain the result than the other two methods.
Example 4. 
To compare the accuracy of the three methods discussed in Example 3, simulation studies with N = 10,000 were performed using the parameter settings given in Example 3. At the 5% level of significance, the standard error of each simulated study is 0.0022. Table 6 reports the proportion of p-values that are less than the chosen α = 0.05 . For cases 1 to 4, the best method is the one which gives results closest to α. However, for cases 5 and 6, the best method is the one that yields the largest value. As shown in Table 6, the results are not significantly different from each other.
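A simulation of this kind can be sketched as follows. The code estimates the proportion of p-values below α under a common mean, either for the plain LRT or for the Bartlett correction with the Equation (15) mean. The function name, the settings `n = 4`, `lam = 100`, and the reduced `N` are hypothetical choices for illustration, not the cases of Table 4, and every simulated sample is assumed to have a positive mean.

```python
import numpy as np
from scipy.stats import chi2

def rejection_rate(n, lam, N=2000, alpha=0.05, seed=1, bartlett=True):
    """Proportion of p-values below alpha under H0, with its standard error."""
    rng = np.random.default_rng(seed)
    x = rng.poisson(lam, size=(N, n)).astype(float)
    xbar = x.mean(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        w = 2.0 * np.where(x > 0, x * np.log(x / xbar), 0.0).sum(axis=1)
    if bartlett:
        # Closed-form mean from Equation (15), evaluated at each sample mean.
        mean_w = (n - 1) + (n**2 - 1) / (6.0 * n * xbar[:, 0])
        w = w * (n - 1) / mean_w
    p = chi2.sf(w, df=n - 1)
    rate = np.mean(p < alpha)
    se = np.sqrt(alpha * (1 - alpha) / N)
    return rate, se

print(rejection_rate(4, 100.0, bartlett=False))  # plain LRT
print(rejection_rate(4, 100.0, bartlett=True))   # Bartlett-corrected
```

Under $H_0$, an estimated rate within roughly two standard errors of α indicates a well-calibrated test; rerunning with unequal means in place of a common `lam` would estimate power instead.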
Example 5. 
We consider additional combinations of n = 3, 5, and 10 and λ = 5 and 10, where both n and λ are small. Table 7 reports the proportion of p-values less than α, obtained using the method in [3] (FWT), the truncation method discussed in Section 2 (Truncated), the parametric bootstrap method discussed in Section 2 (Bootstrap), the LRT method discussed in Section 3 (LRT), the Bartlett correction method where the mean of the LRTS is approximated by the truncation method discussed in Section 3 (BCTruncated), and the Bartlett correction method where the mean of the LRTS is approximated by the parametric bootstrap method discussed in Section 3 (BCBootstrap).
The nominal α value is 0.05, and with N = 10,000, the standard error of each of the simulated studies is 0.0022. Again, the criterion for the best method is given in ref. [10]. It is evident that results from FWT and Truncated are not satisfactory when n is small. This suggests that the approximated mean and variance of the LRTS may not be accurate in such cases. The LRT method discussed in Section 3 performed reasonably well, but still falls short of being ideal. The results from BCTruncated and Bootstrap are much better, with Bootstrap appearing to produce slightly more accurate results. Overall, BCBootstrap is the most accurate method regardless of the size of n and λ. However, the trade-off is that both Bootstrap and BCBootstrap are computationally intensive. If computational time is not a constraint, we recommend using either Bootstrap or BCBootstrap. If computational time is a concern, then we recommend using BCTruncated. Truncated should only be used when n is large relative to λ. LRT is acceptable if λ is large, regardless of the size of n.
Other simulation studies have also been performed, and they give similar results to those reported here. These results are available from the authors upon request.

4. Real Data Analysis

4.1. Wildfire Data

Impacts of wildfires include the deterioration of air quality, the loss of property, crops, resources, animals, and human lives, and the onset of mental health problems (see https://www.who.int/health-topics/wildfires (accessed on 17 August 2023)). Studying the distribution and variability of wildfires can improve incident management and enhance collaboration with other fire centers and local communities.
In Canada, British Columbia (BC) experiences the highest number of wildfires. There are six regions: Coastal, Northwest, Prince George, Kamloops, Southeast, and Cariboo (see Figure 1 originating from https://www2.gov.bc.ca/gov/content/safety/wildfire-status/about-bcws/fire-centres#cariboo (accessed on 17 August 2023)). We consider two years of wildfire summary data. The wildfire numbers are listed in Table 8 (see https://www2.gov.bc.ca/gov/content/safety/wildfire-status/about-bcws/wildfire-history/wildfire-season-summary (accessed on 17 August 2023)).
The number of wildfires in each region for each period is modeled using a Poisson distribution. The main question of interest is, for each region, whether the mean number of wildfires in 2023 is different from the mean number of wildfires in the previous year. In other words, for each region, we are testing the symmetry of the mean across time. For each region, we have two years of data (n = 2) and only one observation per year. The p-values obtained using the methods discussed in this paper are recorded in Table 9. Note that the Truncated and Bootstrap results are based on the discussion in Section 2, where the derivations assume large n and finite λ. The LRT, BCTruncated, and BCBootstrap results are based on the methods described in Section 3, where the derivations assume small n and large λ.
We can observe from Table 9 that the p-values obtained by the methods discussed in Section 2 are different from those obtained by the methods in Section 3. Moreover, the results obtained by the methods in Section 3 are almost identical. For this example, we have small n; thus, the methods in Section 3 should be more appropriate. At the 5% level of significance, based on the reported results in Table 9, the wildfire numbers in the Cariboo, Kamloops, and Prince George regions undergo no significant changes over the two years studied because the p-values are all greater than 5%. In other words, we identify a symmetry of means across time in these three regions. On the other hand, the numbers of wildfires in the Coastal, Northwest, and Southeast regions show significant changes over the same period because the p-values are all less than 5%; that is, we find asymmetry of means across time in these three regions. More specifically, the Northwest and Southeast regions seem to undergo an increase in the number of wildfires from 2022 to 2023, whereas the Coastal region underwent a decrease in the number of wildfires over the same period.

4.2. Radioactive Data

The hand-held radiation detectors currently manufactured by Environmental Instruments Canada Inc. operate in one of two modes for recalculating the radiation level measured by the instrument. In the first mode, the instrument averages over a fixed time interval. In the second mode, the instrument updates after a fixed number of radiation interactions are detected. At a constant radiation level, the number of radiation interactions counted in a given time follows a Poisson distribution. The problem with the first mode is that the reading may not be accurate enough at low radiation levels, or the instrument may be slow to respond at high radiation levels. In the second mode, the response time decreases as the count rate increases. Moving from low radiation levels to high levels, the instrument responds quickly. However, in moving from high radiation levels to low, it takes a long time for the minimum number of counts to be accumulated. A better algorithm would be one that detects when a change in count rate has taken place and then resets the calculation. That way, the user will know very quickly that the radiation level is no longer high.
Our aim is to test radiation detectors and to add new functionality to existing instruments. A CT008-F radiation detector was exposed to radiation from a Thorium-232 containing Welsbach Mantle. We used the instrument to collect the number of counts received in a 3 s interval. The counts are independently observed and assumed to follow a Poisson distribution. A promising algorithm should have the advantage of filtering homogeneous counts using the mean and resetting the count when the new count differs from the previous counts.
For illustration purposes, Figure 2 demonstrates a scenario with an increasing trend in radiation, showing the values recorded by the instrument, whether a reset is required, and the values retained after each reset. We began with an observed count of 2238, followed by a second observation of 2219 three seconds later. To test whether the two mean counts are homogeneous at the 5% significance level, we applied the BCBootstrap method. With n = 2 and finite λ , the resulting p-value was 0.7760, indicating that no reset was necessary, so both observations were retained. Three seconds later, a third count of 2539 was recorded. Testing homogeneity across the three observations ( n = 3 ), the BCBootstrap method yielded a p-value approximately equal to 0, suggesting that a reset was required. As a result, only the most recent count (2539) was kept. Another three seconds later, a new count of 3197 was observed. With n = 2 , testing homogeneity again resulted in a p-value near 0, indicating that another reset was necessary, leaving only a count of 3197.
Similarly, Figure 3 demonstrates a scenario with a decreasing trend in radiation, showing the values recorded by the instrument, whether a reset is required, and the values retained after each reset. We began with an observed count of 3350, followed three seconds later by a count of 1632. The BCBootstrap method yielded a p-value close to 0, indicating that a reset was necessary, and only the value of 1632 was retained. Next, a count of 1633 was observed. The resulting p-value was approximately 0.9860, suggesting that no reset was needed, so both 1632 and 1633 were kept. Subsequently, a count of 614 was recorded. The p-value was again approximately 0, indicating that another reset was required, leaving only the value of 614.
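The reset logic described in the two scenarios above can be sketched as follows. This is only an illustration: it uses the uncorrected LRT with a chi-square reference distribution in place of the BCBootstrap method used in the text, and the helper names (`chi2_sf`, `lrt_statistic`, `filter_counts`) are hypothetical. For counts in the thousands the two approaches are expected to give very similar p-values.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(t))
    else:
        a, q = 1.0, math.exp(-t)
    while a < df / 2.0:
        if t > 0:
            q += math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))
        a += 1.0
    return q

def lrt_statistic(counts):
    """Standard Poisson LRT statistic for equal means (0 * log 0 taken as 0)."""
    mean = sum(counts) / len(counts)
    return 2.0 * sum(x * math.log(x / mean) for x in counts if x > 0)

def filter_counts(counts, alpha=0.05):
    """Process counts in order; reset the retained window when homogeneity is rejected.

    Returns one tuple per count: (count, p_value, reset, retained_window).
    """
    window, history = [], []
    for c in counts:
        window.append(c)
        if len(window) >= 2:
            p = chi2_sf(lrt_statistic(window), len(window) - 1)
            reset = p < alpha
            if reset:
                window = [c]          # keep only the most recent count
        else:
            p, reset = None, False    # nothing to test with a single count
        history.append((c, p, reset, list(window)))
    return history

upward = filter_counts([2238, 2219, 2539, 3197])      # counts from Figure 2
downward = filter_counts([3350, 1632, 1633, 614])     # counts from Figure 3

for c, p, reset, kept in upward + downward:
    print(c, None if p is None else round(p, 4), reset, kept)
```

Each step tests all counts retained since the last reset, so the number of populations n grows by one until a rejection brings it back to one.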

5. Conclusions

This paper investigates the problem of testing the homogeneity of n independent Poisson means when only one observation per population is available, a common situation in fields such as epidemiology, environmental statistics, and industrial quality control. Traditional large-sample methods are not applicable in this setting. For large n and fixed λ, we proposed an LRTS-based approach in which the mean and variance of the LRTS are approximated using either a truncation method or a parametric bootstrap. These approximations enable p-value computation via the central limit theorem. For small n and large λ, we also derived the asymptotic chi-square distribution of the LRTS. The mean of the LRTS can be approximated analytically or via bootstrapping, allowing for Bartlett correction to improve test accuracy. Simulations show that the bootstrap-based correction provides the most accurate results but is computationally intensive. In contrast, the truncation-based method is faster but less precise.
The framework also extends to constructing confidence intervals for the common mean λ by inverting the test based on p ( λ ) values. The resulting interval uses chi-square critical values, with degrees of freedom depending on the chosen method. Overall, this work provides a practical and flexible approach to testing Poisson mean homogeneity in data-limited settings.
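To make the bootstrap-based Bartlett correction more concrete, here is a minimal sketch under stated assumptions: the statistic is the standard Poisson LRT statistic W, its null mean is estimated by a parametric bootstrap at the pooled mean, and W is rescaled by (n − 1) divided by that bootstrap mean before being referred to a chi-square distribution with n − 1 degrees of freedom. The paper's exact BCBootstrap procedure may differ in detail; the function names, the number of bootstrap replicates, and the seed are all illustrative choices.

```python
import math
import random

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(t))
    else:
        a, q = 1.0, math.exp(-t)
    while a < df / 2.0:
        if t > 0:
            q += math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))
        a += 1.0
    return q

def lrt_statistic(counts):
    """Standard Poisson LRT statistic for equal means (0 * log 0 taken as 0)."""
    mean = sum(counts) / len(counts)
    return 2.0 * sum(x * math.log(x / mean) for x in counts if x > 0)

def poisson_sample(lam, rng):
    """Poisson(lam) draw via Knuth's method, in chunks so exp(-chunk) stays representable."""
    total, remaining = 0, float(lam)
    while remaining > 0:
        chunk = min(remaining, 500.0)
        limit = math.exp(-chunk)
        k, prod = 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        total += k
        remaining -= chunk
    return total

def bartlett_bootstrap_pvalue(counts, n_boot=2000, seed=0):
    """Bartlett-corrected LRT p-value with the null mean of W estimated by bootstrap."""
    rng = random.Random(seed)
    n = len(counts)
    lam_hat = sum(counts) / n            # pooled estimate of the common mean
    if lam_hat == 0:
        return 1.0                       # all counts are zero: no evidence against H0
    w_obs = lrt_statistic(counts)
    boot_mean = sum(
        lrt_statistic([poisson_sample(lam_hat, rng) for _ in range(n)])
        for _ in range(n_boot)
    ) / n_boot
    w_corrected = w_obs * (n - 1) / boot_mean
    return chi2_sf(w_corrected, n - 1)

# Cariboo counts from Table 8
p_cariboo = bartlett_bootstrap_pvalue([228, 270])
print(round(p_cariboo, 4))
```

Because the bootstrap mean is itself a Monte Carlo estimate, the corrected p-value varies slightly with `seed` and `n_boot`; when λ is large the correction factor is close to one, which is consistent with the near-identical LRT and BCBootstrap rows of Table 9.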

Author Contributions

Conceptualization, X.S., A.W. and K.K.; methodology, X.S. and A.W.; validation, X.S., A.W. and K.K.; formal analysis, X.S., A.W. and K.K.; investigation, X.S. and A.W.; resources, X.S., A.W. and K.K.; data curation, X.S. and K.K.; writing—original draft preparation, X.S., A.W. and K.K.; writing—review and editing, X.S., A.W. and K.K.; visualization, X.S., A.W. and K.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada under Grant RGPIN-2022-03264, Mathematics of Information Technology and Complex Systems Accelerate, the NSERC Alliance International Catalyst Grant ALLRP 590341-23, and the University of British Columbia Okanagan (UBC-O) Vice Principal Research in collaboration with UBC-O Irving K. Barber Faculty of Science.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

We would like to sincerely thank the reviewers for their thoughtful and constructive comments. Their feedback has been invaluable in helping us to improve the clarity, depth, and overall quality of this manuscript.

Conflicts of Interest

Author K.K. was employed by the company Environmental Instruments Canada Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

The aim of this Appendix is to derive the closed-form approximation of $\nu(\lambda)$. From Equation (7), we have
$$E\left[X^{m}\log^{\ell}X\right]=\lambda\,E\left[(X+1)^{m-1}\log^{\ell}(X+1)\right].$$
Consider the following special cases:
  • For $m=1$ and $\ell=1$, we have $E(X\log X)=\lambda E[\log(X+1)]$.
  • For $m=2$ and $\ell=1$, we have
    $$E(X^{2}\log X)=\lambda E[(X+1)\log(X+1)]=\lambda E[X\log(X+1)]+\lambda E[\log(X+1)]=\lambda^{2}E[\log(X+2)]+\lambda E[\log(X+1)].$$
  • For $m=1$ and $\ell=2$, we have $E(X\log^{2}X)=\lambda E[\log^{2}(X+1)]$.
  • For $m=2$ and $\ell=2$, we have
    $$E(X^{2}\log^{2}X)=\lambda E[(X+1)\log^{2}(X+1)]=\lambda^{2}E[\log^{2}(X+2)]+\lambda E[\log^{2}(X+1)].$$
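We can rewrite $\nu(\lambda)$ as
$$\nu(\lambda)=\mathrm{var}\left\{X\left(\log\frac{X}{\lambda}-1\right)\right\}=E\left\{\left[X\left(\log\frac{X}{\lambda}-1\right)\right]^{2}\right\}-\left\{E\left[X\left(\log\frac{X}{\lambda}-1\right)\right]\right\}^{2}.$$
Since
$$\begin{aligned}
\left[X\left(\log\frac{X}{\lambda}-1\right)\right]^{2}
&=X^{2}\left[\log^{2}\frac{X}{\lambda}-2\log\frac{X}{\lambda}+1\right]\\
&=X^{2}\log^{2}\frac{X}{\lambda}-2X^{2}\log X+2X^{2}\log\lambda+X^{2}\\
&=X^{2}\log^{2}X-2X^{2}\log X\log\lambda+X^{2}\log^{2}\lambda-2X^{2}\log X+2X^{2}\log\lambda+X^{2},\\
E(X^{2}\log^{2}X)&=\lambda^{2}E[\log^{2}(X+2)]+\lambda E[\log^{2}(X+1)],\\
E(X^{2}\log X)&=\lambda^{2}E[\log(X+2)]+\lambda E[\log(X+1)],
\end{aligned}$$
we have
$$\begin{aligned}
E\left\{\left[X\left(\log\frac{X}{\lambda}-1\right)\right]^{2}\right\}
&=E(X^{2}\log^{2}X)-2E(X^{2}\log X)\log\lambda+E(X^{2})\log^{2}\lambda-2E(X^{2}\log X)+2E(X^{2})\log\lambda+E(X^{2})\\
&=\lambda^{2}E[\log^{2}(X+2)]-2\lambda^{2}(1+\log\lambda)E[\log(X+2)]+\lambda E[\log^{2}(X+1)]\\
&\quad-2\lambda(1+\log\lambda)E[\log(X+1)]+(\lambda+\lambda^{2})[1+\log\lambda]^{2}.
\end{aligned}$$
Moreover,
$$E\left[X\left(\log\frac{X}{\lambda}-1\right)\right]=E\left[X\log\frac{X}{\lambda}\right]-E(X)=\mu(\lambda)-\lambda.$$
Thus,
$$\begin{aligned}
\nu(\lambda)&=\lambda^{2}E[\log^{2}(X+2)]-2\lambda^{2}(1+\log\lambda)E[\log(X+2)]+\lambda E[\log^{2}(X+1)]\\
&\quad-2\lambda(1+\log\lambda)E[\log(X+1)]+(\lambda+\lambda^{2})[1+\log\lambda]^{2}-[\mu(\lambda)-\lambda]^{2}.
\end{aligned}$$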
For a fixed $\lambda$, and as $n\to\infty$, applying the truncation method, $\nu(\lambda)$ can be approximated by using
$$\begin{aligned}
E[\log^{2}(X+2)]&\approx\sum_{k=0}^{n\lambda}\log^{2}(k+2)\,p_{\lambda}(k), &
E[\log(X+2)]&\approx\sum_{k=0}^{n\lambda}\log(k+2)\,p_{\lambda}(k),\\
E[\log^{2}(X+1)]&\approx\sum_{k=0}^{n\lambda}\log^{2}(k+1)\,p_{\lambda}(k), &
E[\log(X+1)]&\approx\sum_{k=0}^{n\lambda}\log(k+1)\,p_{\lambda}(k),
\end{aligned}$$
where $p_{\lambda}(\cdot)$ is the probability mass function of the Poisson($\lambda$) distribution, and $\mu(\lambda)$ is approximated in Equation (8).
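The truncation idea can be checked numerically. The sketch below evaluates the closed-form expression for ν(λ) with truncated sums and compares it against a direct truncated computation of the variance of X(log(X/λ) − 1). Several pieces are assumptions made for illustration: the truncation point `upper` is a free parameter here rather than the nλ limit used above, μ(λ) is obtained from the m = 1 identity as λ(E[log(X + 1)] − log λ) rather than from Equation (8), and all function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Poisson(lam) probability mass at k, evaluated on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def truncated_expectation(g, lam, upper):
    """Approximate E[g(X)] for X ~ Poisson(lam) by summing over k = 0..upper."""
    return sum(g(k) * poisson_pmf(k, lam) for k in range(upper + 1))

def nu_truncated(lam, upper):
    """Closed-form expression for nu(lam) with each expectation truncated."""
    c = 1.0 + math.log(lam)
    e_log1 = truncated_expectation(lambda k: math.log(k + 1), lam, upper)
    e_log2 = truncated_expectation(lambda k: math.log(k + 2), lam, upper)
    e_sqlog1 = truncated_expectation(lambda k: math.log(k + 1) ** 2, lam, upper)
    e_sqlog2 = truncated_expectation(lambda k: math.log(k + 2) ** 2, lam, upper)
    # mu(lam) = E[X log(X / lam)] = lam * E[log(X + 1)] - lam * log(lam)
    mu = lam * (e_log1 - math.log(lam))
    second_moment = (
        lam ** 2 * e_sqlog2
        - 2.0 * lam ** 2 * c * e_log2
        + lam * e_sqlog1
        - 2.0 * lam * c * e_log1
        + (lam + lam ** 2) * c ** 2
    )
    return second_moment - (mu - lam) ** 2

def nu_direct(lam, upper):
    """Direct truncated variance of h(X) = X (log(X / lam) - 1), with h(0) = 0."""
    def h(k):
        return k * (math.log(k / lam) - 1.0) if k > 0 else 0.0
    m1 = truncated_expectation(h, lam, upper)
    m2 = truncated_expectation(lambda k: h(k) ** 2, lam, upper)
    return m2 - m1 ** 2

lam, upper = 2.0, 100
print(nu_truncated(lam, upper), nu_direct(lam, upper))
```

For small λ the Poisson tail beyond the truncation point is negligible, so the two computations should agree to within floating-point error.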

References

  1. Giron, F.J.; Martel-Escobar, M.; Vazquez-Polo, F. A Bayesian homogeneity test for comparing Poisson populations. Appl. Stoch. Model. Bus. Ind. 2022, 8, 1158–1171.
  2. Brown, L.D.; Zhao, L.H. A test for the Poisson distribution. Sankhya 2002, 64, 611–625.
  3. Feng, C.; Wang, H.; Tu, X.M. The asymptotic distribution of a likelihood ratio test statistic for the homogeneity of Poisson distribution. Sankhya 2012, 74 Pt 2, 263–268.
  4. Dodge, H.F.; Romig, H.G. A method of sampling inspection. Bell Syst. Tech. J. 1929, 8, 613–631.
  5. Shewhart, W.A. Economic Control of Manufactured Products; Van Nostrand Reinhold: New York, NY, USA, 1931.
  6. Nunes, C.; Moreira, E.; Ferreira, S.; Ferreira, D.; Mexia, J. Considering the sample sizes as truncated Poisson random variables in mixed effects models. J. Appl. Stat. 2019, 47, 2641–2657.
  7. Shi, X.; Wang, X.S.; Reid, N. Saddlepoint approximation of nonlinear moments. Stat. Sin. 2014, 24, 1597–1611.
  8. Jacobson, T.; Larsson, R. Bartlett corrections in cointegration testing. Comput. Stat. Data Anal. 1999, 28, 203–225.
  9. Banzato, E.; Chiogna, M.; Djordjilovic, V.; Risso, D. A Bartlett-type correction for likelihood ratio tests with application to testing equality of Gaussian graphical models. Stat. Probab. Lett. 2023, 193, 109732.
  10. Malezadeh, A.; Kharrati-Kopaei, M. Inferences on the common mean of several heterogeneous log-normal distributions. J. Appl. Stat. 2019, 46, 1066–1083.
  11. Bartlett, M.S. Properties of sufficiency and statistical tests. Proc. R. Soc. Lond. 1937, 160, 268–282.
  12. Barndorff-Nielsen, O.E.; Cox, D.R. Inference and Asymptotics; Chapman and Hall: New York, NY, USA, 1994.
  13. Cordeiro, G.M.; Cribari-Neto, F. An Introduction to Bartlett Correction and Bias Reduction; Springer: Berlin, Germany, 2014.
  14. DiCiccio, T.J. Approximate inference for the generalized gamma distribution. Am. J. Stat. 1987, 29, 33–40.
  15. Johansen, S. A Bartlett correction factor for tests on the cointegrating relations. Econom. Theory 2000, 16, 740–778.
Figure 1. Six regional fire centers in BC, each responsible for wildfire management within its boundaries.
Figure 2. An illustration of data sequence filtering process with an upward trend via multiple sequential tests.
Figure 3. Illustration of data sequence filtering process with a downward trend via multiple sequential tests.
Table 1. Cases generated for Example 1.
| Case | Sample Size n | Data Generated |
|---|---|---|
| 1 | 100 | Poisson distribution with λ = 1 |
| 2 | 100 | Poisson distribution with λ = 2 |
| 3 | 200 | Poisson distribution with λ = 1 |
| 4 | 200 | Poisson distribution with λ = 2 |
| 5 | 100 | 50 data from Poisson distribution with λ = 1 and 50 data from Poisson distribution with λ = 2 |
Table 2. p-values obtained by BZ, FWT, Truncated, and Bootstrap methods.
| Case | BZ | FWT | Truncated | Bootstrap |
|---|---|---|---|---|
| 1 | 0.0521 | 0.5149 | 0.5149 | 0.4605 |
| 2 | 0.0422 | 0.4929 | 0.4929 | 0.4491 |
| 3 | 0.0255 | 0.5422 | 0.5422 | 0.5034 |
| 4 | 0.0965 | 0.9264 | 0.9264 | 0.9633 |
| 5 | 0.0007 | 0.0164 | 0.0164 | 0.0132 |
Table 3. Proportion of p-values obtained via Truncated and Bootstrap methods using α = 0.05.
| Case | Truncated | Bootstrap |
|---|---|---|
| 1 | 0.0462 | 0.0470 |
| 2 | 0.0490 | 0.0500 |
| 3 | 0.0445 | 0.0452 |
| 4 | 0.0522 | 0.0517 |
| 5 | 0.8566 | 0.8769 |
Table 4. Cases generated for Example 3.
| Case | Sample Size | Data Generated |
|---|---|---|
| 1 | 4 | Poisson distribution with λ = 20 |
| 2 | 4 | Poisson distribution with λ = 30 |
| 3 | 6 | Poisson distribution with λ = 20 |
| 4 | 6 | Poisson distribution with λ = 30 |
| 5 | 4 | 3 data from Poisson distribution with λ = 30 and 1 datum from Poisson distribution with λ = 5 |
| 6 | 6 | 3 data from Poisson distribution with λ = 10 and 3 data from Poisson distribution with λ = 15 |
Table 5. p-values obtained by LRT, BCTruncated, and BCBootstrap.
| Case | LRT | BCTruncated | BCBootstrap |
|---|---|---|---|
| 1 | 0.6450 | 0.6394 | 0.6397 |
| 2 | 0.8909 | 0.8919 | 0.8921 |
| 3 | 0.9754 | 0.9760 | 0.9760 |
| 4 | 0.8212 | 0.8232 | 0.8228 |
| 5 | 1.4690 × 10⁻⁶ | 1.6681 × 10⁻⁶ | 1.6659 × 10⁻⁶ |
| 6 | 0.0302 | 0.0321 | 0.0322 |
Table 6. Proportion of p-values obtained from LRT, BCTruncated, and BCBootstrap using α = 0.05.
| Case | LRT | BCTruncated | BCBootstrap |
|---|---|---|---|
| 1 | 0.0514 | 0.0491 | 0.0497 |
| 2 | 0.0520 | 0.0506 | 0.0504 |
| 3 | 0.0527 | 0.0512 | 0.0505 |
| 4 | 0.0576 | 0.0561 | 0.0561 |
| 5 | 0.9990 | 0.9990 | 0.9990 |
| 6 | 0.9521 | 0.9502 | 0.9507 |
Table 7. Proportion of p-values obtained from FWT, Truncated, Bootstrap, LRT, BCTruncated, and BCBootstrap using α = 0.05.
| n | λ | FWT | Truncated | Bootstrap | LRT | BCTruncated | BCBootstrap |
|---|---|---|---|---|---|---|---|
| 3 | 5 | 0.0234 | 0.0216 | 0.0530 | 0.0610 | 0.0548 | 0.0517 |
| 5 | 5 | 0.0280 | 0.0280 | 0.0514 | 0.0644 | 0.0567 | 0.0539 |
| 10 | 5 | 0.0341 | 0.0341 | 0.0505 | 0.0644 | 0.0567 | 0.0539 |
| 3 | 10 | 0.0199 | 0.0199 | 0.0511 | 0.0538 | 0.0504 | 0.0496 |
| 5 | 10 | 0.0264 | 0.0264 | 0.0525 | 0.0591 | 0.0547 | 0.0543 |
| 10 | 10 | 0.0286 | 0.0286 | 0.0443 | 0.0561 | 0.0516 | 0.0507 |
Table 8. Two-year wildfire summary data in six regions.
| Time Period | Cariboo | Coastal | Kamloops | Northwest | Prince George | Southeast |
|---|---|---|---|---|---|---|
| 1 April 2022–28 March 2023 | 228 | 281 | 453 | 119 | 248 | 429 |
| 1 April 2021–28 March 2022 | 270 | 216 | 459 | 58 | 274 | 367 |
Table 9. p-values for each region.
| Method | Cariboo | Coastal | Kamloops | Northwest | Prince George | Southeast |
|---|---|---|---|---|---|---|
| Truncated | 0.4401 | 0.0011 | 0.3270 | 0.0000 | 0.7243 | 0.1567 |
| Bootstrap | 0.0725 | 0.0000 | 0.4970 | 0.0000 | 0.8343 | 0.0065 |
| LRT | 0.0597 | 0.0035 | 0.8425 | 0.0000 | 0.2550 | 0.0279 |
| BCTruncated | 0.0598 | 0.0035 | 0.8426 | 0.0000 | 0.2553 | 0.0280 |
| BCBootstrap | 0.0596 | 0.0035 | 0.8426 | 0.0000 | 0.2551 | 0.0279 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shi, X.; Wong, A.; Kaletsch, K. Poisson Mean Homogeneity: Single-Observation Framework with Applications. Symmetry 2025, 17, 1702. https://doi.org/10.3390/sym17101702
