Article

Simultaneous Confidence Intervals for Pairwise Differences of Means in Zero-Inflated Rayleigh Distributions with an Application to Road Accident Fatalities Data

by Warisa Thangjai 1, Sa-Aat Niwitpong 2,*, Narudee Smithpreecha 3 and Arunee Wongkhao 4

1 Department of Statistics, Faculty of Science, Ramkhamhaeng University, Bangkok 10240, Thailand
2 Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand
3 Department of Mathematics and Statistics, Faculty of Science and Technology, Rajamangala University of Technology Phra Nakhon, Bangkok 10800, Thailand
4 Faculty of Science and Agricultural of Technology, Rajamangala University of Technology Lanna, Tak 63000, Thailand
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(3), 569; https://doi.org/10.3390/math14030569
Submission received: 12 January 2026 / Revised: 2 February 2026 / Accepted: 3 February 2026 / Published: 5 February 2026
(This article belongs to the Special Issue Statistical Inference: Methods and Applications)

Abstract

This paper develops simultaneous confidence intervals (SCIs) for pairwise differences of means with zero-inflated Rayleigh (ZIR) distributions, a flexible framework for modeling positively skewed data with excess zeros. Closed-form expressions for the ZIR mean are derived, and several competing interval estimation procedures are investigated, including generalized confidence interval (GCI), parametric bootstrap (PB), method of variance estimates recovery (MOVER), delta-method normal approximation, and highest posterior density (HPD) intervals. The finite-sample performance of the proposed SCIs is examined via extensive Monte Carlo simulations, focusing on empirical coverage probabilities (CPs) and average interval lengths (ALs) over a broad range of parameter configurations and zero-inflation levels. A real data application to road accident fatality counts demonstrates the practical utility of the proposed methodology. The results show that the HPD method consistently achieves the most favorable balance between coverage accuracy and interval efficiency. Overall, this study advances reliable simultaneous inference for zero-inflated models commonly encountered in environmental, biomedical, and reliability studies.

1. Introduction

Statistics is a branch of mathematics that deals with collecting, organizing, analyzing, interpreting, and presenting data. It is broadly classified into descriptive statistics and inferential statistics, which together provide a framework for understanding data and making informed conclusions about a population based on sample observations. Descriptive statistics focus on summarizing and presenting the key features of a dataset in a clear and structured way, using numerical measures and graphical displays. Among descriptive measures, the mean (or average) is a fundamental statistic that represents the central tendency of quantitative data and is widely used due to its mathematical simplicity and practical usefulness. Inferential statistics, on the other hand, extends data analysis by allowing conclusions about a population to be drawn from sample data using probability theory. Key inferential tools include point estimation, confidence intervals, and hypothesis testing, with the population mean being a central parameter of interest. These methods account for sampling variability and provide a basis for reliable statistical inference beyond descriptive summaries.
The Rayleigh distribution has been widely used to model nonnegative data across a broad range of disciplines, including acoustics, medical research, quality control, communications engineering, reliability analysis, and aerospace applications. Owing to its practical relevance and mathematical simplicity, numerous extensions of the Rayleigh distribution have been introduced in the literature to improve its flexibility and modeling capability for complex real-world data. Notable contributions in this area include the works of Bashir and Rasul [1], Abdulhakim [2], Krishnamoorthy [3], and Almongy et al. [4], among others.
Data that contain a substantial proportion of zero values along with positively skewed nonzero observations commonly arise in areas such as reliability engineering, environmental science, biomedical research, and actuarial studies. In these settings, conventional continuous distributions are often inadequate because they fail to accommodate the excess zeros resulting from structural or process-related factors. Consequently, zero-inflated models have become an essential modeling framework as they explicitly incorporate a point mass at zero together with a continuous distribution to describe the positive outcomes.
The zero-inflated Rayleigh (ZIR) distribution offers a flexible and effective framework for modeling nonnegative data characterized by an excess number of zero observations. In this distribution, positive outcomes are modeled using a Rayleigh distribution, while a mixing parameter controls the probability of structural zeros. This dual-component structure enables the ZIR distribution to capture both the frequency of zero occurrences and the distributional behavior of positive measurements. Owing to its parsimonious formulation and interpretable parameters, the ZIR distribution is well suited for applications in lifetime and reliability analysis, signal amplitude modeling, and environmental studies, where zero values commonly occur alongside positively skewed data. A notable advantage of the ZIR distribution is that its mean reflects contributions from both the zero-inflation mechanism and the positive component, thereby summarizing the overall level of the underlying process. In comparative analyses involving multiple populations or experimental conditions, differences between the means of ZIR distributions provide a natural and informative measure for assessing group effects. Such comparisons allow researchers to simultaneously evaluate changes in the likelihood of zero outcomes and the expected magnitude of positive observations, offering a more comprehensive assessment than approaches based solely on the positive component. Consequently, statistical inference for differences between ZIR means is of considerable practical importance in many applied settings. Several studies have investigated properties and applications of the ZIR distribution, including those by Fuxiang et al. [5] and Kijsason et al. [6].
In studies that compare multiple populations or treatment groups, the primary focus is often on differences between group means rather than on individual mean estimates. When multiple groups are examined simultaneously, constructing separate pairwise confidence intervals can result in an inflated family-wise error rate. Simultaneous confidence intervals (SCIs) for all pairwise differences of means offer a coherent approach to joint inference, providing reliable conclusions while appropriately controlling for multiple comparisons. Although inferential methods for zero-inflated models have gained increasing attention, the development of SCIs for all differences of means within the framework of ZIR distributions remains relatively unexplored. This difficulty stems from the involvement of multiple parameters—specifically, the zero-inflation proportion and the Rayleigh scale parameter—which together define the mean and complicate its sampling behavior. Consequently, standard normal approximation techniques may yield unsatisfactory performance, particularly in settings with small to moderate sample sizes or substantial zero inflation.
Motivated by these issues, this study proposes the construction of SCIs for all pairwise differences of means of ZIR distributions using a range of inferential techniques, where simultaneous refers to the concurrent construction of confidence intervals for all pairwise contrasts within a unified inferential framework. Specifically, SCIs are developed based on the generalized confidence interval (GCI) method, the parametric bootstrap (PB) method, the method of variance estimates recovery (MOVER), and the delta-method normal approximation. In addition, a Bayesian framework is considered, with simultaneous inference conducted using highest posterior density (HPD) credible intervals for all pairwise contrasts. The performance of the proposed frequentist and Bayesian methods is systematically evaluated through extensive Monte Carlo simulations, with emphasis on the marginal coverage probability (CP) of individual pairwise intervals and the average interval length (AL) for various parameter settings and sample sizes. An application to road accident fatality counts is presented to demonstrate the practical effectiveness and applicability of the proposed methods.

2. Methods

The Rayleigh distribution is adopted for modeling the positive values of X because the observed nonzero data are continuous, nonnegative, and right-skewed, rendering discrete distributions inappropriate. As a member of the Weibull family, the Rayleigh distribution offers analytical tractability and computational efficiency, which are beneficial for statistical inference and simulation-based construction of simultaneous confidence intervals. Therefore, the zero-inflated Rayleigh model constitutes a flexible and theoretically sound framework for data exhibiting both excess zeros and continuous positive measurements.
Let $X$ be a random variable following a ZIR distribution with parameters $p$ and $\sigma$, where $p \in [0, 1)$ denotes the probability of an additional point mass at zero and $\sigma > 0$ is the scale parameter of the Rayleigh component. The probability density function (PDF) of $X$ is given by
$$f(x; p, \sigma) = \begin{cases} p, & x = 0, \\ (1 - p)\,\dfrac{x}{\sigma^2} \exp\!\left(-\dfrac{x^2}{2\sigma^2}\right), & x > 0, \\ 0, & \text{otherwise}. \end{cases}$$
This distribution represents a mixture of a degenerate distribution at zero with probability $p$ and a Rayleigh distribution with mixing weight $1 - p$. The corresponding cumulative distribution function (CDF) is
$$F(x; p, \sigma) = \begin{cases} 0, & x < 0, \\ p, & x = 0, \\ p + (1 - p)\left[1 - \exp\!\left(-\dfrac{x^2}{2\sigma^2}\right)\right], & x > 0. \end{cases}$$
The mean of $X$ is
$$E(X) = (1 - p)\,\sigma \sqrt{\frac{\pi}{2}},$$
where $\pi$ denotes the mathematical constant.
The variance of $X$ is
$$\mathrm{Var}(X) = (1 - p)\,\sigma^2 \left[2 - \frac{\pi}{2}(1 - p)\right].$$
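As a quick numerical check of the two moment formulas above, the following minimal Python sketch draws from a ZIR distribution and compares the sample mean and variance with the closed forms. The function names (`zir_mean`, `zir_variance`, `sample_zir`) and the parameter values are illustrative assumptions and are not taken from the paper; the Rayleigh component is drawn by inverting its CDF.

```python
import math
import random

def zir_mean(p, sigma):
    """Closed-form ZIR mean: (1 - p) * sigma * sqrt(pi / 2)."""
    return (1.0 - p) * sigma * math.sqrt(math.pi / 2.0)

def zir_variance(p, sigma):
    """Closed-form ZIR variance: (1 - p) * sigma^2 * (2 - (pi / 2) * (1 - p))."""
    return (1.0 - p) * sigma**2 * (2.0 - (math.pi / 2.0) * (1.0 - p))

def sample_zir(n, p, sigma, rng):
    """Draw n values: 0 with probability p, otherwise Rayleigh(sigma)
    obtained by inverting F(x) = 1 - exp(-x^2 / (2 sigma^2))."""
    out = []
    for _ in range(n):
        if rng.random() < p:
            out.append(0.0)
        else:
            u = rng.random()  # u in [0, 1), so 1 - u is never 0
            out.append(sigma * math.sqrt(-2.0 * math.log(1.0 - u)))
    return out

# Hypothetical parameter values chosen only for illustration.
rng = random.Random(2026)
x = sample_zir(200_000, p=0.3, sigma=2.0, rng=rng)
m = sum(x) / len(x)
v = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
print("sample mean", m, "theory", zir_mean(0.3, 2.0))
print("sample var ", v, "theory", zir_variance(0.3, 2.0))
```

With a large sample the empirical and theoretical values should agree to roughly two decimal places, which is a convenient sanity check before building interval procedures on top of the mean.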
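In this study, inference for the mean of the ZIR distribution across multiple groups is considered. Let $\theta_i$ be the population mean of the ZIR distribution in the $i$-th group, for $i = 1, 2, \ldots, k$. The mean $\theta_i$ is defined as
$$\theta_i = (1 - p_i)\,\sigma_i \sqrt{\frac{\pi}{2}},$$
where $p_i$ and $\sigma_i$ are the zero-inflation probability and the Rayleigh scale parameter for group $i$, respectively.
A plug-in estimator of $\theta_i$ is given by
$$\hat{\theta}_i = (1 - \hat{p}_i)\,\hat{\sigma}_i \sqrt{\frac{\pi}{2}},$$
where $\hat{p}_i$ and $\hat{\sigma}_i$ are the corresponding MLEs.
The asymptotic variance of $\hat{\theta}_i$ is denoted by
$$\mathrm{Var}(\hat{\theta}_i) = \frac{\pi}{2}\left[\frac{\hat{\sigma}_i^2\,\hat{p}_i(1 - \hat{p}_i)}{n_i} + \frac{(1 - \hat{p}_i)^2\,\hat{\sigma}_i^2}{4 n_{i(1)}}\right].$$
To compare group means, consider the vector of mean parameters
$$\underline{\theta} = (\theta_1, \theta_2, \ldots, \theta_k)^{T}.$$
For any two groups $i$ and $l$ ($i \neq l$), the difference in means is defined as
$$\theta_{il} = \theta_i - \theta_l = (1 - p_i)\,\sigma_i \sqrt{\frac{\pi}{2}} - (1 - p_l)\,\sigma_l \sqrt{\frac{\pi}{2}}.$$
The estimator of $\theta_{il}$ is
$$\hat{\theta}_{il} = \hat{\theta}_i - \hat{\theta}_l = (1 - \hat{p}_i)\,\hat{\sigma}_i \sqrt{\frac{\pi}{2}} - (1 - \hat{p}_l)\,\hat{\sigma}_l \sqrt{\frac{\pi}{2}}.$$
In Appendix A, assuming independence between groups, the asymptotic variance of $\hat{\theta}_{il}$ is
$$\mathrm{Var}(\hat{\theta}_{il}) = \mathrm{Var}(\hat{\theta}_i - \hat{\theta}_l) = \mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_l) = \frac{\pi}{2}\left[\frac{\hat{\sigma}_i^2\,\hat{p}_i(1 - \hat{p}_i)}{n_i} + \frac{(1 - \hat{p}_i)^2\,\hat{\sigma}_i^2}{4 n_{i(1)}}\right] + \frac{\pi}{2}\left[\frac{\hat{\sigma}_l^2\,\hat{p}_l(1 - \hat{p}_l)}{n_l} + \frac{(1 - \hat{p}_l)^2\,\hat{\sigma}_l^2}{4 n_{l(1)}}\right].$$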

2.1. Generalized Confidence Interval Method

The GCI method, introduced by Weerahandi, provides an effective inferential framework for complex models in which the exact sampling distributions of estimators are analytically intractable. Unlike classical confidence intervals, the GCI method is constructed using generalized pivotal quantities (GPQs), whose distributions are free of unknown parameters. This method is particularly advantageous for mixture models and zero-inflated distributions, such as the ZIR distribution, where standard asymptotic approximations may perform poorly, especially in small or moderate samples.
The GPQ for $\sigma_i$ is
$$R_{\sigma_i} = \sqrt{\frac{\sum_{i: X_i > 0} X_i^2}{\chi^2_{2 n_{i(1)}}}}.$$
Let $\theta_i$ be the mean of the ZIR distribution for the $i$-th group, where $i = 1, 2, \ldots, k$. Let $\hat{\theta}_i$ be its estimator. The GPQ for $\theta_i$ is
$$R_{\theta_i} = (1 - \hat{p}_i)\,R_{\sigma_i} \sqrt{\frac{\pi}{2}}.$$
Similarly, let $\theta_{il} = \theta_i - \theta_l$ be the difference between the means of groups $i$ and $l$. The corresponding GPQ is defined as
$$R_{\theta_{il}} = R_{\theta_i} - R_{\theta_l} = (1 - \hat{p}_i)\,R_{\sigma_i} \sqrt{\frac{\pi}{2}} - (1 - \hat{p}_l)\,R_{\sigma_l} \sqrt{\frac{\pi}{2}}.$$
It should be noted that the proposed GPQs are constructed using plug-in estimators for the zero-inflation probability and therefore constitute approximate GPQs. As a result, the corresponding GCI-based intervals are approximate and may exhibit deviations from nominal coverage, particularly in settings with high zero inflation or small sample sizes.
The distribution of $R_{\theta_{il}}$ is free of unknown parameters and can be approximated via Monte Carlo simulation. Consequently, a $100(1 - \alpha)\%$ two-sided confidence interval for $\theta_{il}$ based on the GCI method is given by
$$\mathrm{CI}_{il}^{(\mathrm{GCI})} = \left[L_{il}^{(\mathrm{GCI})}, U_{il}^{(\mathrm{GCI})}\right] = \left[R_{\theta_{il}}(\alpha/2),\; R_{\theta_{il}}(1 - \alpha/2)\right],$$
where $R_{\theta_{il}}(\alpha/2)$ and $R_{\theta_{il}}(1 - \alpha/2)$ denote the $(\alpha/2)$-th and $(1 - \alpha/2)$-th quantile of the simulated GPQ distribution, respectively. The detailed steps are presented in Algorithm 1.
Algorithm 1 GCI.
  • Estimates $\hat{p}_i$, $\hat{\sigma}_i$, and $\sum_{i: X_i > 0} X_i^2$, $i = 1, 2, \ldots, k$;
  • Number of GPQ draws $m$;
  • Significance level $\alpha$;
  • For $b = 1, 2, \ldots, m$ do
  •         For $i = 1, 2, \ldots, k$ do
  •                 If $n_{i(1)} > 0$ then
  •                         Draw $\chi^2_{2 n_{i(1)}}$
  •                         Set $R_{\sigma_i} = \sqrt{\sum_{i: X_i > 0} X_i^2 / \chi^2_{2 n_{i(1)}}}$
  •                 Else set $R_{\sigma_i} = \hat{\sigma}_i$
  •                 Set $R_{\theta_i} = (1 - \hat{p}_i)\,R_{\sigma_i} \sqrt{\pi / 2}$
  •         End for
  •         For all $i < l$ do
  •                 Set $R_{\theta_{il}} = R_{\theta_i} - R_{\theta_l}$
  •         End for
  • End for
  • For all $i < l$ do
  •         Set $L_{il}^{(\mathrm{GCI})} = R_{\theta_{il}}(\alpha/2)$ and $U_{il}^{(\mathrm{GCI})} = R_{\theta_{il}}(1 - \alpha/2)$
  • End for
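The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary keys (`n`, `n0`, `sum_sq`), the linear-interpolation quantile rule, and the example summary statistics are all hypothetical. The chi-square variate with $2 n_{i(1)}$ degrees of freedom is drawn as a gamma variate with shape $n_{i(1)}$ and scale 2, which is the same distribution.

```python
import math
import random
from itertools import combinations

def empirical_quantile(values, q):
    """Linear-interpolation quantile of a list of numbers, 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def gci_pairwise(groups, m=5000, alpha=0.05, seed=1):
    """groups: list of dicts with n (sample size), n0 (number of zeros)
    and sum_sq (sum of squared positive observations)."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(groups)), 2))
    draws = {pair: [] for pair in pairs}
    for _ in range(m):
        r_theta = []
        for g in groups:
            n1 = g["n"] - g["n0"]
            p_hat = g["n0"] / g["n"]
            if n1 > 0:
                # chi-square with 2*n1 df equals Gamma(shape=n1, scale=2)
                chi2 = rng.gammavariate(n1, 2.0)
                r_sigma = math.sqrt(g["sum_sq"] / chi2)
            else:
                # Algorithm 1 falls back to sigma_hat here; since p_hat = 1
                # in this case, the product below is 0 whatever value is used.
                r_sigma = 0.0
            r_theta.append((1.0 - p_hat) * r_sigma * math.sqrt(math.pi / 2.0))
        for i, l in pairs:
            draws[(i, l)].append(r_theta[i] - r_theta[l])
    return {
        pair: (empirical_quantile(v, alpha / 2), empirical_quantile(v, 1 - alpha / 2))
        for pair, v in draws.items()
    }

# Hypothetical summary statistics for k = 3 groups.
groups = [
    {"n": 50, "n0": 10, "sum_sq": 320.0},
    {"n": 50, "n0": 15, "sum_sq": 140.0},
    {"n": 50, "n0": 5, "sum_sq": 90.0},
]
ci = gci_pairwise(groups, m=2000)
for pair, (lo, hi) in ci.items():
    print(pair, round(lo, 3), round(hi, 3))
```

Because $\hat{p}_i$ is held fixed across draws, only the uncertainty in $\sigma_i$ enters the simulated distribution, which is consistent with the remark above that these GPQs are approximate.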

2.2. Parametric Bootstrap Method

Bootstrapping is a widely used resampling technique for assessing the sampling distribution of estimators and constructing confidence intervals when analytic results are difficult or unavailable. Among bootstrap methods, the parametric bootstrap is particularly effective when the underlying data-generating mechanism can be reasonably modeled by a parametric family.
In the parametric bootstrap, it is assumed that the observed data arise from a distribution F ( · ; θ ̲ ) indexed by an unknown parameter vector θ ̲ . The parameter is first estimated from the original sample, yielding θ ̲ ^ . Bootstrap samples are then generated from the fitted parametric model F ( · ; θ ̲ ^ ) , rather than directly resampling from the empirical distribution. This procedure allows the bootstrap samples to preserve structural features implied by the assumed model, such as skewness, tail behavior, or zero inflation.
Let T = T ( X ̲ ) denote a statistic of interest computed from the original data X ̲ . For each bootstrap replication, a synthetic sample is generated from F ( · ; θ ̲ ^ ) , and the corresponding bootstrap statistic T * is calculated. Repeating this process a large number of times yields an empirical approximation to the sampling distribution of T, which can be used to estimate bias, variance, and confidence intervals.
Parametric bootstrap confidence intervals are commonly constructed using the percentile method, where the interval endpoints are obtained from the empirical quantiles of the bootstrap distribution. Compared with the nonparametric bootstrap, the parametric bootstrap often exhibits improved efficiency and smoother sampling distributions when the assumed model is correctly specified. However, its performance depends critically on the validity of the parametric assumption; model misspecification may lead to biased inference.
Due to its flexibility and computational simplicity, the parametric bootstrap has been widely applied in complex inference problems, including small-sample settings, censored or zero-inflated data, and situations involving nonlinear estimators or functions of multiple parameters.
To generate a bootstrap sample, let $X_{i1}^*, X_{i2}^*, \ldots, X_{in_i}^*$ be i.i.d. observations from a ZIR distribution with parameters $\hat{p}_i$ and $\hat{\sigma}_i$. Define $n_{i(0)}^* = \sum_{j=1}^{n_i} 1\{X_{ij}^* = 0\}$ and $n_{i(1)}^* = n_i - n_{i(0)}^*$. The bootstrap estimators of $p_i$ and $\sigma_i$ are
$$\hat{p}_i^* = \frac{n_{i(0)}^*}{n_i}$$
and
$$\hat{\sigma}_i^* = \sqrt{\frac{\sum_{i: X_i^* > 0} (X_i^*)^2}{2 n_{i(1)}^*}}.$$
The bootstrap parameter of the mean for group $i$ is
$$\theta_i^* = (1 - \hat{p}_i^*)\,\hat{\sigma}_i^* \sqrt{\frac{\pi}{2}}.$$
In the parametric bootstrap framework, the fitted ZIR model with parameter estimates p ^ i and σ ^ i is employed as a plug-in approximation to the unknown true distribution, from which bootstrap samples are subsequently generated. The bootstrap parameter of the mean is defined as the corresponding population mean evaluated at these fitted parameters and is treated as fixed conditional on the observed data, thereby serving as the true parameter within the bootstrap resampling scheme. Equation (7) is standard in parametric bootstrap inference and ensures coherence between the original estimation problem and its bootstrap counterpart.
The bootstrap difference of the means is
$$\theta_{il}^* = \theta_i^* - \theta_l^* = (1 - \hat{p}_i^*)\,\hat{\sigma}_i^* \sqrt{\frac{\pi}{2}} - (1 - \hat{p}_l^*)\,\hat{\sigma}_l^* \sqrt{\frac{\pi}{2}}.$$
Therefore, a $100(1 - \alpha)\%$ two-sided confidence interval for $\theta_{il}$ based on the parametric bootstrap method is given by
$$\mathrm{CI}_{il}^{(\mathrm{PB})} = \left[L_{il}^{(\mathrm{PB})}, U_{il}^{(\mathrm{PB})}\right] = \left[\theta_{il}^*(\alpha/2),\; \theta_{il}^*(1 - \alpha/2)\right],$$
where $\theta_{il}^*(\alpha/2)$ and $\theta_{il}^*(1 - \alpha/2)$ denote the $(\alpha/2)$-th and $(1 - \alpha/2)$-th quantiles of the simulated bootstrap replications, respectively. Algorithm 2 presents the detailed procedure.
Algorithm 2 Parametric Bootstrap Confidence Interval.
  • Estimates $\hat{p}_i$, $\hat{\sigma}_i$, $i = 1, 2, \ldots, k$;
  • Number of bootstrap samples $m$;
  • Significance level $\alpha$;
  • For $b = 1, 2, \ldots, m$ do
  •         For $i = 1, 2, \ldots, k$ do
  •                 Generate bootstrap sample $X_i^{*(b)} \sim \mathrm{ZIR}(\hat{p}_i, \hat{\sigma}_i)$ of size $n_i$
  •                 Compute $\hat{p}_i^{*(b)} = n_{i(0)}^{*(b)} / n_i$
  •                 Compute $\hat{\sigma}_i^{*(b)} = \sqrt{\sum_{i: X_i^* > 0} (X_i^{*(b)})^2 / \left(2 n_{i(1)}^{*(b)}\right)}$
  •                 Set $\theta_i^{*(b)} = (1 - \hat{p}_i^{*(b)})\,\hat{\sigma}_i^{*(b)} \sqrt{\pi / 2}$
  •         End for
  •         For all $i < l$ do
  •                 Set $\theta_{il}^{*(b)} = \theta_i^{*(b)} - \theta_l^{*(b)}$
  •         End for
  • End for
  • For all $i < l$ do
  •         Set $L_{il}^{(\mathrm{PB})} = \theta_{il}^*(\alpha/2)$ and $U_{il}^{(\mathrm{PB})} = \theta_{il}^*(1 - \alpha/2)$
  • End for
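A compact Python sketch of the percentile parametric bootstrap in Algorithm 2 is given below. It is illustrative only: the function names, the tuple layout `(n, p_hat, sigma_hat)`, the order-statistic quantile rule, and the fitted values are assumptions. One further assumption is flagged in the code: when a resample contains no positive values, $\hat{\sigma}_i^*$ is undefined, and the sketch sets the bootstrap mean to 0 because $1 - \hat{p}_i^* = 0$.

```python
import math
import random
from itertools import combinations

SQRT_HALF_PI = math.sqrt(math.pi / 2.0)

def boot_mean(n, p_hat, sigma_hat, rng):
    """One bootstrap replicate of the plug-in ZIR mean for a single group."""
    # Number of positive observations in a resample of size n.
    n_pos = sum(1 for _ in range(n) if rng.random() >= p_hat)
    if n_pos == 0:
        return 0.0  # assumption: all-zero resample, so p* = 1 and the mean is 0
    # A squared Rayleigh(sigma) draw equals sigma^2 * (-2 ln(1 - U)).
    sum_sq = sum(
        -2.0 * sigma_hat**2 * math.log(1.0 - rng.random()) for _ in range(n_pos)
    )
    p_star = 1.0 - n_pos / n
    sigma_star = math.sqrt(sum_sq / (2 * n_pos))
    return (1.0 - p_star) * sigma_star * SQRT_HALF_PI

def pb_pairwise(fits, m=1000, alpha=0.05, seed=1):
    """fits: list of (n, p_hat, sigma_hat), one tuple per group."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(fits)), 2))
    diffs = {pair: [] for pair in pairs}
    for _ in range(m):
        theta = [boot_mean(n, p, s, rng) for (n, p, s) in fits]
        for i, l in pairs:
            diffs[(i, l)].append(theta[i] - theta[l])
    out = {}
    for pair, v in diffs.items():
        v.sort()
        # Percentile limits taken as order statistics of the replicates.
        lo = v[math.floor((alpha / 2) * (m - 1))]
        hi = v[math.ceil((1 - alpha / 2) * (m - 1))]
        out[pair] = (lo, hi)
    return out

# Hypothetical fitted values for k = 3 groups.
fits = [(50, 0.2, 2.0), (50, 0.3, 1.5), (50, 0.1, 1.0)]
ci = pb_pairwise(fits, m=1000)
for pair, (lo, hi) in ci.items():
    print(pair, round(lo, 3), round(hi, 3))
```

Only the count of positives and the sum of squared positives are needed to recompute the estimators, so the sketch never stores the full resample.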

2.3. Method of Variance Estimates Recovery

The MOVER is a general methodology for constructing confidence intervals for functions of parameters when direct variance estimation or joint distributional assumptions are impractical. The core principle of MOVER is that the uncertainty of a target function can be recovered from marginal confidence intervals of the individual parameters and combined in a least favorable configuration to ensure nominal coverage.
The MOVER was extended by Donner and Zou [7], who demonstrated that valid closed-form confidence intervals for nonlinear functions of scale parameters, such as the normal standard deviation, can be obtained using endpoint-based constructions. Their work establishes a unifying theoretical framework in which confidence intervals for both linear and nonlinear functions can be derived directly from marginal confidence limits.
Consider a ZIR distribution with parameters $(p_i, \sigma_i)$, where $p_i$ denotes the probability of a structural zero and $\sigma_i > 0$ is the Rayleigh scale parameter. The mean of the distribution for group $i$ is
$$\theta_i = (1 - p_i)\,\sigma_i \sqrt{\frac{\pi}{2}}.$$
Let $[p_{i,L}, p_{i,U}]$ be the marginal $100(1 - \alpha)\%$ two-sided confidence interval for $p_i$. The lower confidence limit $p_{i,L}$ and upper confidence limit $p_{i,U}$ are defined as
$$p_{i,L} = \begin{cases} 0, & n_{i(0)} = 0, \\ B^{-1}\!\left(\alpha/2;\; n_{i(0)},\; n_i - n_{i(0)} + 1\right), & 0 < n_{i(0)} < n_i, \\ (\alpha/2)^{1/n_i}, & n_{i(0)} = n_i \end{cases}$$
and
$$p_{i,U} = \begin{cases} 1 - (\alpha/2)^{1/n_i}, & n_{i(0)} = 0, \\ B^{-1}\!\left(1 - \alpha/2;\; n_{i(0)} + 1,\; n_i - n_{i(0)}\right), & 0 < n_{i(0)} < n_i, \\ 1, & n_{i(0)} = n_i, \end{cases}$$
where $B^{-1}(q; a, b)$ denotes the $q$-th quantile of the $\mathrm{Beta}(a, b)$ distribution.
Let $[\sigma_{i,L}, \sigma_{i,U}]$ be the marginal $100(1 - \alpha)\%$ two-sided confidence interval for $\sigma_i$. The lower confidence limit $\sigma_{i,L}$ and upper confidence limit $\sigma_{i,U}$ are defined as
$$\sigma_{i,L} = \hat{\sigma}_i - z_{\alpha/2} \sqrt{\frac{\hat{\sigma}_i^2}{4 n_{i(1)}}}$$
and
$$\sigma_{i,U} = \hat{\sigma}_i + z_{\alpha/2} \sqrt{\frac{\hat{\sigma}_i^2}{4 n_{i(1)}}},$$
where $z_{\alpha/2}$ is the upper $(\alpha/2)$-quantile of the standard normal distribution.
Following the endpoint-based MOVER principle, the confidence interval for $\theta_i$ is constructed as
$$\theta_{i,L} = (1 - p_{i,U})\,\sigma_{i,L} \sqrt{\frac{\pi}{2}}$$
and
$$\theta_{i,U} = (1 - p_{i,L})\,\sigma_{i,U} \sqrt{\frac{\pi}{2}}.$$
Under the MOVER variance recovery principle, the variance of $\hat{\theta}_i$ at the lower and upper confidence limits can be approximated by
$$\mathrm{Var}(\hat{\theta}_i)\big|_{\theta_{i,L}} = \left(\frac{\hat{\theta}_i - \theta_{i,L}}{z_{\alpha/2}}\right)^2$$
and
$$\mathrm{Var}(\hat{\theta}_i)\big|_{\theta_{i,U}} = \left(\frac{\theta_{i,U} - \hat{\theta}_i}{z_{\alpha/2}}\right)^2.$$
These recovered variances implicitly account for the combined uncertainty arising from both the zero-inflation parameter $p_i$ and the scale parameter $\sigma_i$, without requiring their joint sampling distribution.
For any two independent groups $i$ and $l$ among the $k$ groups, define the difference $\theta_{il} = \theta_i - \theta_l$. Given the marginal MOVER confidence intervals $[\theta_{i,L}, \theta_{i,U}]$ and $[\theta_{l,L}, \theta_{l,U}]$, the endpoint-based MOVER confidence interval for $\theta_{il}$ is
$$\theta_{il,L} = \theta_{i,L} - \theta_{l,U}$$
and
$$\theta_{il,U} = \theta_{i,U} - \theta_{l,L}.$$
From the variance recovery perspective, the recovered variance of $\hat{\theta}_{il} = \hat{\theta}_i - \hat{\theta}_l$ at the lower and upper limits is given by
$$\mathrm{Var}(\hat{\theta}_{il})\big|_{\theta_{il,L}} = \mathrm{Var}(\hat{\theta}_i)\big|_{\theta_{i,L}} + \mathrm{Var}(\hat{\theta}_l)\big|_{\theta_{l,U}}$$
and
$$\mathrm{Var}(\hat{\theta}_{il})\big|_{\theta_{il,U}} = \mathrm{Var}(\hat{\theta}_i)\big|_{\theta_{i,U}} + \mathrm{Var}(\hat{\theta}_l)\big|_{\theta_{l,L}},$$
assuming independence between the two samples ($i$ and $l$).
This leads to the closed-form endpoint-based MOVER confidence interval. Therefore, in Appendix B, the $100(1 - \alpha)\%$ two-sided confidence interval for $\theta_{il}$ based on the MOVER method is given by
$$\mathrm{CI}_{il}^{(\mathrm{MOVER})} = \left[L_{il}^{(\mathrm{MOVER})}, U_{il}^{(\mathrm{MOVER})}\right] = \left[\theta_{i,L} - \theta_{l,U},\; \theta_{i,U} - \theta_{l,L}\right].$$
The complete procedure is described in Algorithm 3.
Algorithm 3 MOVER Confidence Interval.
  • Estimates $\hat{p}_i$ and $\hat{\sigma}_i$, $i = 1, 2, \ldots, k$;
  • Significance level $\alpha$;
  • For $i = 1, 2, \ldots, k$ do
  •         Obtain confidence interval $[p_{i,L}, p_{i,U}]$ for $p_i$
  •         Obtain confidence interval $[\sigma_{i,L}, \sigma_{i,U}]$ for $\sigma_i$
  •         Set $\theta_{i,L} = (1 - p_{i,U})\,\sigma_{i,L} \sqrt{\pi / 2}$ and $\theta_{i,U} = (1 - p_{i,L})\,\sigma_{i,U} \sqrt{\pi / 2}$
  • End for
  • For all $i < l$ do
  •         Set $L_{il}^{(\mathrm{MOVER})} = \theta_{i,L} - \theta_{l,U}$ and $U_{il}^{(\mathrm{MOVER})} = \theta_{i,U} - \theta_{l,L}$
  • End for
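The endpoint-based construction in Algorithm 3 might be sketched in Python as follows. The names and example inputs are hypothetical, and the sketch assumes every group has at least one positive observation. To stay within the standard library, the Beta-quantile limits for $p_i$ are obtained by bisection on the binomial tail probabilities, which is the usual equivalent definition of the exact (Clopper-Pearson) limits; this substitution is a choice made here, not something stated in the paper.

```python
import math
from itertools import combinations
from statistics import NormalDist

SQRT_HALF_PI = math.sqrt(math.pi / 2.0)

def binom_cdf(x, n, p):
    """P(Bin(n, p) <= x)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(x + 1))

def solve_decreasing(f, target, iters=60):
    """Bisection on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_limits(x, n, alpha):
    """Exact two-sided limits for a proportion with x events in n trials."""
    a = alpha / 2
    lower = 0.0 if x == 0 else solve_decreasing(lambda p: binom_cdf(x - 1, n, p), 1 - a)
    upper = 1.0 if x == n else solve_decreasing(lambda p: binom_cdf(x, n, p), a)
    return lower, upper

def mover_group(n, n0, sigma_hat, alpha):
    """Endpoint limits (theta_L, theta_U) for one group; assumes n - n0 > 0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    p_lo, p_hi = exact_limits(n0, n, alpha)
    half = z * math.sqrt(sigma_hat**2 / (4 * (n - n0)))
    s_lo, s_hi = sigma_hat - half, sigma_hat + half
    return (1 - p_hi) * s_lo * SQRT_HALF_PI, (1 - p_lo) * s_hi * SQRT_HALF_PI

def mover_pairwise(groups, alpha=0.05):
    """groups: list of (n, n0, sigma_hat), one tuple per group."""
    limits = [mover_group(n, n0, s, alpha) for (n, n0, s) in groups]
    return {
        (i, l): (limits[i][0] - limits[l][1], limits[i][1] - limits[l][0])
        for i, l in combinations(range(len(groups)), 2)
    }

# Hypothetical inputs for k = 3 groups.
groups = [(50, 10, 2.0), (50, 15, 1.5), (50, 5, 1.0)]
ci = mover_pairwise(groups)
for pair, (lo, hi) in ci.items():
    print(pair, round(lo, 3), round(hi, 3))
```

Pairing the lower limit of one group with the upper limit of the other makes the resulting interval wide by construction, which is one way to see why this procedure tends toward conservative coverage.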

2.4. Delta-Method Normal Approximation

The delta method is a widely used asymptotic technique for approximating the sampling distribution of a function of an estimator. It is particularly useful when the parameter of interest is a nonlinear function of one or more parameters whose estimators are asymptotically normal. The delta-method normal approach relies on a first-order Taylor series expansion to obtain a normal approximation for the transformed estimator.
The delta-method normal approximation is used to construct confidence intervals for pairwise differences of parameters. The method relies on asymptotic normality and variance approximation obtained via the delta method.
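Let $\hat{\theta}_i$ be the estimator of $\theta_i$ for group $i$. Using the delta method, the asymptotic variance of $\hat{\theta}_i$ is approximated by
$$\mathrm{Var}(\hat{\theta}_i) = \frac{\pi}{2}\left[\frac{\hat{\sigma}_i^2\,\hat{p}_i(1 - \hat{p}_i)}{n_i} + \frac{(1 - \hat{p}_i)^2\,\hat{\sigma}_i^2}{4 n_{i(1)}}\right].$$
Assuming independence between estimators from different groups, the variance of the difference $\hat{\theta}_{il} = \hat{\theta}_i - \hat{\theta}_l$ is approximated by
$$\mathrm{Var}(\hat{\theta}_{il}) = \mathrm{Var}(\hat{\theta}_i - \hat{\theta}_l) = \mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_l).$$
Therefore, in Appendix C, the $100(1 - \alpha)\%$ two-sided confidence interval for $\theta_{il}$ based on the delta-method normal approximation is given by
$$\mathrm{CI}_{il}^{(\mathrm{Delta})} = \left[L_{il}^{(\mathrm{Delta})}, U_{il}^{(\mathrm{Delta})}\right] = \left[\hat{\theta}_{il} - z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_l)},\; \hat{\theta}_{il} + z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_l)}\right].$$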
Algorithm 4 presents the detailed procedure.
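Algorithm 4 Delta-Method Normal Approximation Confidence Interval.
  • Estimates $\hat{p}_i$ and $\hat{\sigma}_i$, $i = 1, 2, \ldots, k$;
  • Significance level $\alpha$;
  • For $i = 1, 2, \ldots, k$ do
  •         If $n_{i(1)} > 0$ then
  •                 Set $\mathrm{Var}(\hat{\theta}_i) = \frac{\pi}{2}\left[\frac{\hat{\sigma}_i^2\,\hat{p}_i(1 - \hat{p}_i)}{n_i} + \frac{(1 - \hat{p}_i)^2\,\hat{\sigma}_i^2}{4 n_{i(1)}}\right]$
  •         Else set $\mathrm{Var}(\hat{\theta}_i) =$
  • End for
  • For all $i < l$ do
  •         Set $\hat{\theta}_{il} = \hat{\theta}_i - \hat{\theta}_l$
  •         Set $L_{il}^{(\mathrm{Delta})} = \hat{\theta}_{il} - z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_l)}$
  •         Set $U_{il}^{(\mathrm{Delta})} = \hat{\theta}_{il} + z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_l)}$
  • End for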
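For concreteness, the delta-method interval of Algorithm 4 can be computed with a few lines of Python. This is a sketch under assumptions: the function names and input tuples `(n, n0, sigma_hat)` are hypothetical, and only the branch with $n_{i(1)} > 0$ is covered, since the fallback value for the other branch is not recoverable from the text.

```python
import math
from itertools import combinations
from statistics import NormalDist

def theta_hat(p_hat, sigma_hat):
    """Plug-in ZIR mean (1 - p) * sigma * sqrt(pi / 2)."""
    return (1 - p_hat) * sigma_hat * math.sqrt(math.pi / 2)

def delta_var(n, n0, sigma_hat):
    """Delta-method variance of the plug-in mean; assumes n - n0 > 0."""
    n1 = n - n0
    p_hat = n0 / n
    return (math.pi / 2) * (
        sigma_hat**2 * p_hat * (1 - p_hat) / n
        + (1 - p_hat) ** 2 * sigma_hat**2 / (4 * n1)
    )

def delta_pairwise(groups, alpha=0.05):
    """groups: list of (n, n0, sigma_hat), one tuple per group."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    est = [theta_hat(n0 / n, s) for (n, n0, s) in groups]
    var = [delta_var(n, n0, s) for (n, n0, s) in groups]
    out = {}
    for i, l in combinations(range(len(groups)), 2):
        d = est[i] - est[l]
        half = z * math.sqrt(var[i] + var[l])
        out[(i, l)] = (d - half, d + half)
    return out

# Hypothetical inputs for k = 3 groups.
groups = [(50, 10, 2.0), (50, 15, 1.5), (50, 5, 1.0)]
ci = delta_pairwise(groups)
for pair, (lo, hi) in ci.items():
    print(pair, round(lo, 3), round(hi, 3))
```

The interval is symmetric about $\hat{\theta}_{il}$ by construction, so it cannot reflect skewness in the sampling distribution; this is the usual limitation of a first-order normal approximation in small samples.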

2.5. Highest Posterior Density Method

The Bayesian method is a probabilistic framework for statistical inference and decision-making. In contrast to frequentist methodology, which assumes model parameters are fixed but unknown constants, the Bayesian paradigm treats parameters as random variables and quantifies uncertainty through probability distributions.
Bayesian inference is founded on Bayes’ theorem, which provides a systematic way to revise prior assumptions about unknown parameters in light of observed data. Prior beliefs are expressed through a prior distribution, while the information contained in the data is captured by the likelihood function. The combination of these elements results in the posterior distribution, representing updated knowledge about the parameters after the data have been observed.
A key strength of the Bayesian framework is its ability to formally incorporate prior information or expert knowledge into the analysis. This feature is especially advantageous when dealing with small samples or complex statistical models. Moreover, Bayesian methods yield a comprehensive probabilistic characterization of uncertainty, enabling direct probability statements about parameters and derived quantities, such as percentiles or reliability indices.
Bayesian inference also supports interval estimation via credible intervals, including both equal-tailed intervals and HPD credible intervals, which are often easier to interpret than classical confidence intervals. Furthermore, modern computational tools, particularly Markov Chain Monte Carlo (MCMC) algorithms, have greatly expanded the applicability of Bayesian methods to problems where closed-form solutions are not available.
In summary, the Bayesian method provides a flexible and robust alternative to traditional statistical techniques and has been widely applied in areas such as reliability engineering, survival analysis, and numerous other scientific disciplines.
For $i = 1, 2, \ldots, k$, let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$ be independent samples from $k$ populations. For each population, the model is characterized by a mixing probability $p_i$ and a scale parameter $\sigma_i$. The parameters are assigned independent prior distributions given by
$$p_i \sim \mathrm{Beta}(a_p, b_p)$$
and
$$\sigma_i^2 \sim \mathrm{IG}(a_\sigma, b_\sigma),$$
where $\mathrm{IG}(\cdot, \cdot)$ denotes the inverse-gamma distribution.
Let $n_i$ be the sample size in group $i$, $n_{i(0)}$ the number of zero observations, and $n_{i(1)} = n_i - n_{i(0)}$ the number of positive observations. Conditional on the observed data, the posterior distribution of $p_i$ is
$$p_i \mid X_i \sim \mathrm{Beta}\!\left(n_{i(0)} + a_p,\; n_i - n_{i(0)} + b_p\right).$$
When $n_{i(1)} > 0$, the posterior distribution of $\sigma_i^2$ is
$$\sigma_i^2 \mid X_i \sim \mathrm{IG}\!\left(a_\sigma + n_{i(1)},\; b_\sigma + \frac{1}{2}\sum_{i: X_i > 0} X_i^2\right),$$
whereas if $n_{i(1)} = 0$, the posterior of $\sigma_i^2$ coincides with its prior.
Posterior samples $(p_i, \sigma_i)$ are generated by Monte Carlo simulation. For each draw, the parameter of interest is defined as
$$\theta_i = (1 - p_i)\,\sigma_i \sqrt{\frac{\pi}{2}}.$$
For any pair of groups $(i, l)$, posterior samples of the difference are obtained as
$$\theta_{il} = \theta_i - \theta_l = (1 - p_i)\,\sigma_i \sqrt{\frac{\pi}{2}} - (1 - p_l)\,\sigma_l \sqrt{\frac{\pi}{2}}.$$
Therefore, the $100(1 - \alpha)\%$ two-sided credible interval for $\theta_{il}$ based on the HPD method is given by
$$\mathrm{CI}_{il}^{(\mathrm{HPD})} = \left[L_{il}^{(\mathrm{HPD})}, U_{il}^{(\mathrm{HPD})}\right],$$
where $L_{il}^{(\mathrm{HPD})}$ and $U_{il}^{(\mathrm{HPD})}$ are obtained using the HPD interval function in R software (version 2024.12.0 + 467). The complete procedure is described in Algorithm 5.
Algorithm 5 HPD Credible Interval.
  • Priors $p_i \sim \mathrm{Beta}(a_p, b_p)$ and $\sigma_i^2 \sim \mathrm{IG}(a_\sigma, b_\sigma)$;
  • Number of posterior draws $m$;
  • Credibility level $1 - \alpha$;
  • For $b = 1, 2, \ldots, m$ do
  •         For $i = 1, 2, \ldots, k$ do
  •                 Draw $p_i^{(b)} \sim \mathrm{Beta}\!\left(n_{i(0)} + a_p,\; n_i - n_{i(0)} + b_p\right)$
  •                 Draw $\sigma_i^{2(b)} \sim \mathrm{IG}\!\left(a_\sigma + n_{i(1)},\; b_\sigma + \frac{1}{2}\sum_{i: X_i > 0} X_i^2\right)$
  •                 Set $\theta_i^{(b)} = (1 - p_i^{(b)})\,\sigma_i^{(b)} \sqrt{\pi / 2}$
  •         End for
  •         For all $i < l$ do
  •                 Set $\theta_{il}^{(b)} = \theta_i^{(b)} - \theta_l^{(b)}$
  •         End for
  • End for
  • For all $i < l$ do
  •         Compute $L_{il}^{(\mathrm{HPD})}$ and $U_{il}^{(\mathrm{HPD})}$
  • End for
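The sampling scheme of Algorithm 5 can be illustrated with the Python sketch below. Several things here are assumptions rather than details from the paper: the hyperparameter values (`a_p`, `b_p`, `a_s`, `b_s`), the function names, the example data, and in particular the rule used to turn draws into an HPD interval. The sketch takes the shortest interval that contains a $1 - \alpha$ share of the sorted draws, a common empirical approach for unimodal posteriors; it is not claimed to match the R function used by the authors.

```python
import math
import random
from itertools import combinations

def hpd_interval(draws, alpha):
    """Shortest interval containing a (1 - alpha) share of the sorted draws."""
    s = sorted(draws)
    n = len(s)
    w = max(1, math.ceil((1 - alpha) * n))  # number of draws kept inside
    best = min(range(n - w + 1), key=lambda j: s[j + w - 1] - s[j])
    return s[best], s[best + w - 1]

def hpd_pairwise(groups, a_p=1.0, b_p=1.0, a_s=1.0, b_s=1.0,
                 m=5000, alpha=0.05, seed=1):
    """groups: list of (n, n0, sum_sq); hyperparameters are hypothetical."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(groups)), 2))
    diffs = {pair: [] for pair in pairs}
    for _ in range(m):
        theta = []
        for (n, n0, sum_sq) in groups:
            n1 = n - n0
            p = rng.betavariate(n0 + a_p, n - n0 + b_p)
            shape, rate = a_s + n1, b_s + 0.5 * sum_sq
            # If G ~ Gamma(shape, scale=1/rate), then 1/G ~ IG(shape, rate).
            # With n1 = 0 and sum_sq = 0 this reduces to a draw from the prior.
            sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
            theta.append((1 - p) * math.sqrt(sigma2) * math.sqrt(math.pi / 2))
        for i, l in pairs:
            diffs[(i, l)].append(theta[i] - theta[l])
    return {pair: hpd_interval(v, alpha) for pair, v in diffs.items()}

# Hypothetical summary statistics for k = 3 groups.
groups = [(50, 10, 320.0), (50, 15, 140.0), (50, 5, 90.0)]
ci = hpd_pairwise(groups)
for pair, (lo, hi) in ci.items():
    print(pair, round(lo, 3), round(hi, 3))
```

Unlike an equal-tailed interval, the shortest-interval rule lets the two tail probabilities differ, so for a skewed posterior it returns a narrower interval at the same credibility level.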

3. Simulation Studies

A detailed simulation study was conducted to evaluate the performance of the proposed SCIs. The investigation focused on two main performance metrics: coverage probability (CP) and average interval length (AL). All computational experiments were performed in R software, with each scenario replicated a sufficiently large number of times to ensure stable and reliable results. For each interval estimation method, the empirical CP was calculated as the proportion of simulated intervals that contained the true parameter value, while the AL was defined as the mean width of the intervals across replications. These measures formed the basis for comparing the efficiency and robustness of the SCIs with various parameter settings and sample sizes.
To determine the most suitable interval method, preference was given to methods achieving an empirical CP at or above the nominal confidence level of 0.95. Among the methods satisfying this criterion, the one yielding the smallest AL was regarded as the most efficient. This dual evaluation framework ensures that both the accuracy and precision of the interval estimators are appropriately considered.
The SCIs were constructed using five different methods: the GCI method, the PB method, the MOVER method, the delta-method normal approximation, and the HPD method. The simulation study examined three scenarios corresponding to $k = 3$, $6$ and $10$ groups. The sample sizes were denoted by $n(k)$; the probabilities of zero inflation by $p(k)$; and the Rayleigh scale parameters by $\sigma(k)$. Note that $n(k)$, $p(k)$, and $\sigma(k)$ indicate that the same values $n$, $p$, and $\sigma$, respectively, are repeated across the $k$ groups. For every sample generated, an additional 1000 simulations were carried out following Algorithms 1, 2 and 5. For each set of parameter values, 1000 random samples were generated using Algorithm 6.
Algorithm 6 CP and AL.
  • True pairwise difference $\theta_{il}$;
  • Coverage indicators $C_{\text{method}}^{(r)}$;
  • Interval lengths $L_{\text{method}}^{(r)}$;
  • $r = 1, 2, \ldots, B$ for each method
  • For each method $\in \{\text{GCI}, \text{PB}, \text{MOVER}, \text{delta}, \text{HPD}\}$ do
  •         Compute average coverage probability $\mathrm{CP} = \frac{1}{B} \sum_{r=1}^{B} C^{(r)}$
  •         Compute average interval length $\mathrm{AL} = \frac{1}{B} \sum_{r=1}^{B} L^{(r)}$
  • End for
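The averaging in Algorithm 6 and the selection rule described earlier (CP at or above 0.95, then smallest AL) can be sketched in Python as follows. The intervals and the method labels `narrow` and `wide` are made up for illustration and are not simulation output. The sketch also assumes that coverage is checked for a single pairwise difference, since the listing does not spell out how the indicator is defined for a whole set of simultaneous intervals.

```python
def cp_and_al(intervals, truth):
    """intervals: list of (lower, upper) over B replications; truth: true theta_il."""
    B = len(intervals)
    covered = [1 if lo <= truth <= hi else 0 for lo, hi in intervals]
    lengths = [hi - lo for lo, hi in intervals]
    return sum(covered) / B, sum(lengths) / B


def pick_method(summary, nominal=0.95):
    """Among methods with CP >= nominal, return the one with the smallest AL."""
    ok = {m: al for m, (cp, al) in summary.items() if cp >= nominal}
    return min(ok, key=ok.get) if ok else None


# Hypothetical intervals for B = 4 replications (illustrative values only).
truth = 1.0
narrow = [(0.5, 1.5), (0.8, 1.6), (1.1, 1.9), (0.2, 1.2)]
wide = [(0.0, 2.0), (0.1, 2.1), (-0.2, 1.8), (0.3, 2.3)]

summary = {
    "narrow": cp_and_al(narrow, truth),
    "wide": cp_and_al(wide, truth),
}
print(summary, pick_method(summary))
```

In this toy input the shorter intervals miss the true value once, so only the wider method meets the coverage criterion and is selected.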
From Figure 1, Figure 2 and Figure 3, the results for k = 3 indicate that the GCI method consistently produced the lowest CPs, with values typically falling below the nominal 0.95 level. This under-coverage became more pronounced when the third group exhibited higher zero-inflation probabilities or when the scale parameter showed greater heterogeneity. Although the GCI yielded relatively narrow ALs, these intervals were often too short to achieve nominal coverage.
In contrast, both the PB method and delta-method normal approximation attained CPs close to the targeted 0.95 across nearly all parameter settings. Their CP performance improved as sample sizes increased while still maintaining moderate ALs. Between the two, the PB method generally produced slightly shorter intervals for a given CP, particularly in balanced sample-size scenarios.
The MOVER method consistently achieved CPs equal to or near 1.0000 in every configuration, indicating substantial over-coverage. However, this gain in CP came at the expense of much wider intervals—often more than double the ALs of the PB method and delta-method normal approximation, especially with greater scale disparities. Although the MOVER method is the most conservative method, its excessive interval lengths make it less efficient for practical applications.
The HPD method performed similarly to the PB method and delta-method normal approximation, with CPs typically ranging from 0.94 to 0.96 depending on the scenario. Its ALs were also comparable to those of the PB method and delta-method normal approximation and consistently far shorter than those of the MOVER method. Moreover, the HPD method maintained stable CP and AL behavior across variations in sample size and zero-inflation probability, demonstrating robustness.
Increasing sample sizes led to improved CPs for all approaches except the MOVER method (which had already attained 1.0000) and generally reduced ALs. This trend was especially noticeable for the PB method, delta-method normal approximation, and HPD method, whose CPs approached the nominal level more closely as n increased to 100 or 200.
Overall, the PB method, delta-method normal approximation, and HPD method provided the most favorable balance between coverage accuracy and interval length. The MOVER method offered the highest CPs but at the cost of substantially inflated ALs, whereas the GCI method persistently under-covered across most settings. These patterns identify the PB method, delta-method normal approximation, and HPD method as the most efficient SCI procedures for the three-sample ZIR scenarios.
Figure 4, Figure 5 and Figure 6 summarize the empirical CPs and ALs of the 95% two-sided SCIs for all pairwise mean differences in various six-sample configurations. The overall trends observed in the three-sample setting remain consistent and become even more pronounced when extended to six samples.
Across every parameter setup, the GCI method produced the lowest CPs, typically ranging from 0.83 to 0.91. Its under-coverage worsened in scenarios where the latter groups had higher zero-inflation probabilities (e.g., $p_4 = p_5 = p_6 = 0.3$) or when scale heterogeneity was substantial (e.g., $\sigma_4 = \sigma_5 = \sigma_6 = 0.75$). Although the GCI method generated the shortest intervals among all methods, this gain in precision came at the cost of inadequate coverage, demonstrating that its intervals were too narrow to maintain the nominal 0.95 confidence level.
The MOVER interval again displayed extreme conservatism, yielding CPs essentially equal to 1.0000 in every scenario. However, this was accompanied by substantially inflated ALs, often two to three times larger than those of the PB, delta-method normal approximation, and HPD intervals. The effect was particularly evident when the scale parameters increased from 0.25 to 0.75, producing noticeably wider MOVER intervals. Although the MOVER interval ensures near-certain coverage, its inefficiency renders it the least practical option.
Sample size played a major role in shaping CP and AL across all methods. As sample sizes increased from 30 to 200, CPs improved and ALs consistently decreased. The PB, delta-method normal approximation, and HPD methods showed the greatest gains, exhibiting highly stable CPs close to the nominal level and substantially shorter intervals with larger samples. In contrast, the GCI method continued to under-cover even with larger n, whereas the MOVER method persistently over-covered regardless of sample size.
Overall, the six-sample results reinforce the conclusions drawn from the three-sample analyses. The PB, delta-method normal approximation, and HPD methods provide the most effective balance between achieving nominal coverage and maintaining reasonably short intervals. The GCI remains overly liberal, while the MOVER interval remains excessively conservative. Thus, the PB, delta-method normal approximation, and HPD procedures stand out as the most dependable and efficient SCI approaches for six-sample ZIR scenarios.
Figure 7, Figure 8 and Figure 9 report the empirical CPs and ALs of the 95% SCIs for all pairwise mean differences when k = 10 . The general trends observed in the three- and six-sample situations persist and become even more pronounced as the dimensionality increases.
Across every configuration, the GCI procedure again produced CPs well below the nominal level, with values roughly between 0.83 and 0.90. Its under-coverage intensified when the last five groups exhibited higher zero-inflation probabilities (i.e., $p_6 = p_7 = p_8 = p_9 = p_{10} = 0.3$) or when the scale parameters were larger (e.g., $\sigma_6 = \sigma_7 = \sigma_8 = \sigma_9 = \sigma_{10} = 0.75$). Although the GCI continued to yield the shortest intervals, these reduced ALs did not offset the considerable loss in coverage, reaffirming its overly liberal nature.
The PB, delta-method normal approximation, and HPD methods exhibited strong and consistent performance across nearly all settings. Their CPs remained close to the desired 0.95 level, with only minor decreases under higher zero-inflation conditions. These three methods also showed clear gains from increased sample size, with improvements in both CP and AL as the smallest group size rose from 30 to 200. Regarding interval width, the PB, delta-method normal approximation, and HPD intervals stayed within a moderate range, substantially shorter than those of the MOVER interval yet appropriately wider than the GCI, making them practical and efficient options.
As in the previous figures, the MOVER method produced CPs essentially equal to 1.0000 across all scenarios, underscoring its conservativeness. However, this came with extremely large interval lengths, often two to three times greater than those obtained from the PB, delta-method normal approximation, and HPD methods. The inefficiency of the MOVER interval became even more pronounced when scale parameters increased, resulting in notably wider intervals.
Sample size exerted a strong and consistent influence. As sample sizes increased from $n = 30$ to $n = 200$, all methods produced shorter intervals, with PB, delta-method normal approximation, and HPD showing the most substantial improvements in precision and stability. The GCI continued to under-cover even in the largest samples, while MOVER persistently over-covered across all settings.
In summary, the findings for the ten-sample case (Figure 7, Figure 8 and Figure 9) reinforce the conclusions from the three- and six-sample cases. The PB, delta-method normal approximation, and HPD methods provide the best compromise between nominal coverage and reasonable interval widths. The GCI remains overly liberal and unreliable, whereas the MOVER interval, although ensuring high CP, is excessively conservative and inefficient. Therefore, in high-dimensional ZIR scenarios, the PB, delta-method normal approximation, and HPD methods stand out as the most dependable and effective procedures for constructing SCIs.

4. Empirical Application

Global road traffic accidents represent a critical public health crisis, serving as a major contributor to premature death and injury. Beyond the immediate human suffering, these incidents impose substantial economic burdens through rising healthcare expenditures, reduced workforce productivity, and the long-term costs associated with disability rehabilitation. The causes of road traffic accidents are multifaceted, arising from a complex interplay between risky driving behaviors—such as speeding and alcohol impairment—and external factors including inadequate infrastructure and weak law enforcement. Mitigating this challenge requires an integrated strategy that combines legislative reforms, public education initiatives, and infrastructure improvements. Consequently, robust statistical modeling is essential for developing evidence-based interventions aimed at reducing both the frequency and severity of traffic crashes.
This study examines fatality data obtained from the Road Accident Victims Protection Company Limited (Thai RSC) (https://www.thairsc.com/, accessed on 25 December 2025) for the period from 1 January to 24 December 2025, as presented in Table 1. The analysis encompasses 36 districts across Chachoengsao, Uttaradit, and Chaiyaphum provinces. As shown in Figure 10, Figure 11 and Figure 12, preliminary analysis indicates that the distributions of non-zero fatalities in these regions are markedly right-skewed. As shown in Table 2, model selection based on the Akaike Information Criterion (AIC) shows that the Rayleigh distribution provides the best fit for the non-zero observations. However, to accommodate the substantial proportion of zero-fatality reports, the ZIR model is identified as the most appropriate framework for modeling the complete dataset.
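The Rayleigh entry of the AIC comparison can be checked by hand. The Python sketch below does this for the non-zero Chachoengsao counts; it is an illustration, not the authors' R code. The individual counts are a reading of the run-together digits in Table 1, checked against the sample size and scale estimate in Table 3, and the AIC is taken as 2k − 2 log L with k = 1 free parameter.

```python
import math

# Non-zero fatality counts for Chachoengsao, as read from Table 1.
x = [33, 29, 24, 17, 14, 13, 13, 6, 4, 2]

n1 = len(x)
sigma2_hat = sum(v * v for v in x) / (2 * n1)  # Rayleigh MLE of sigma^2

# Rayleigh log-density: log(x) - log(sigma^2) - x^2 / (2 sigma^2)
loglik = sum(
    math.log(v) - math.log(sigma2_hat) - v * v / (2 * sigma2_hat) for v in x
)
aic = 2 * 1 - 2 * loglik  # one free parameter (sigma)
print(round(aic, 4))
```

The result should agree with the Rayleigh value of 75.5971 reported for Chachoengsao in Table 2 up to rounding.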
Table 3 reports the descriptive statistics of fatality counts for Chachoengsao, Uttaradit, and Chaiyaphum provinces. The estimated pairwise mean differences are 6.3476 (Chachoengsao–Uttaradit), 7.0852 (Chachoengsao–Chaiyaphum), and 0.7376 (Uttaradit–Chaiyaphum). As shown in Table 4, the 95% SCIs produced by all considered methods contain the estimated mean differences. The MOVER approach yields the widest intervals, in agreement with the simulation results presented in the previous section. In contrast, the GCI, PB, and delta intervals are shorter than the HPD credible interval.
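The estimates in Table 3 and the pairwise differences quoted above can be recomputed from the raw counts. The following Python sketch is one way to do so; the helper name `zir_estimates` is an assumption, and the per-district counts are a reading of the flattened digits in Table 1.

```python
import math
from itertools import combinations

# District-level fatality counts as read from Table 1.
data = {
    "Chachoengsao": [33, 29, 24, 17, 14, 13, 13, 6, 4, 2, 0],
    "Uttaradit": [22, 14, 10, 9, 6, 4, 3, 2, 0],
    "Chaiyaphum": [21, 12, 11, 11, 9, 9, 9, 8, 8, 6, 5, 4, 4, 3, 1, 0],
}


def zir_estimates(x):
    """Return (p_hat, sigma_hat, theta_hat) for one province."""
    n = len(x)
    pos = [v for v in x if v > 0]
    p_hat = (n - len(pos)) / n                        # share of zero counts
    sigma_hat = math.sqrt(sum(v * v for v in pos) / (2 * len(pos)))
    theta_hat = (1 - p_hat) * sigma_hat * math.sqrt(math.pi / 2)
    return p_hat, sigma_hat, theta_hat


est = {name: zir_estimates(x) for name, x in data.items()}
diffs = {(a, b): est[a][2] - est[b][2] for a, b in combinations(data, 2)}

for name, (p_hat, sigma_hat, theta_hat) in est.items():
    print(name, round(p_hat, 4), round(sigma_hat, 4), round(theta_hat, 4))
for pair, d in diffs.items():
    print(pair, round(d, 4))
```

The printed values should match Table 3 and the three quoted differences up to rounding in the last digit.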
It is important to note that, unlike the simulation study based on 1000 replications, this analysis corresponds to a single empirical dataset. The simulation results indicate that the CPs of the GCI, PB, and delta intervals fall below the nominal 0.95 level, rendering them unsuitable for constructing reliable 95% SCIs for mean differences in this setting. Conversely, the HPD interval consistently attains CPs exceeding the nominal level. Therefore, the HPD interval is recommended for constructing 95% SCIs for pairwise mean differences in fatalities among the three provinces.
It should be noted that the empirical analysis serves to demonstrate the practical implementation of the proposed simultaneous interval estimation procedures, while detailed model diagnostics and causal interpretation are beyond the scope of the present study.

5. Discussion

Overall, the simulation study reveals clear and systematic performance differences among the competing SCI methods across all scenarios. The GCI method consistently under-covered, with empirical coverage probabilities falling well below the nominal 0.95 level, particularly with higher zero inflation, greater scale heterogeneity, and increasing numbers of groups. Although the GCI produced the shortest intervals, these were generally too narrow to support reliable inference. In contrast, the MOVER method was highly conservative, achieving coverage probabilities close to 1.0000 in all settings but at the cost of substantially inflated interval widths, which severely limited its practical efficiency.
By comparison, the PB, delta-method normal approximation, and HPD methods provided the most favorable trade-off between coverage accuracy and interval length. Their CPs remained close to the nominal level across a broad range of sample sizes, zero-inflation levels, and dimensions while yielding intervals that were moderate in length and markedly shorter than those of the MOVER method. Moreover, their performance improved with increasing sample size, leading to greater stability and precision. These results identify the PB, delta-method normal approximation, and HPD methods as the most reliable and efficient approaches for constructing SCIs for pairwise mean differences in ZIR distributions.

6. Conclusions

In this work, SCIs for all pairwise differences of means in ZIR distributions were investigated in a variety of sampling and distributional settings. Several competing methods were examined, including the GCI, PB, delta-method normal approximation, MOVER, and HPD methods. Extensive simulation studies were conducted to evaluate their performance in terms of empirical CP and AL across different sample sizes, probabilities of zero inflation, scale parameters, and numbers of groups.
The results indicate substantial differences among the methods. The GCI method consistently failed to achieve nominal coverage, particularly in scenarios with higher zero inflation and increased scale heterogeneity, despite producing relatively short intervals. In contrast, the MOVER method was overly conservative, yielding near-unit coverage probabilities at the expense of excessively wide intervals. Among the methods considered, the PB, delta-method normal approximation, and HPD methods demonstrated the most favorable performance, providing CPs close to the nominal level with moderate interval lengths and improved stability as sample sizes increased, even in higher-dimensional settings.
Overall, the PB, delta-method normal approximation, and HPD methods are recommended for constructing SCIs for pairwise mean differences in ZIR distributions due to their reliability and efficiency. These findings contribute to the growing literature on inference for zero-inflated models and offer practical guidance for applied researchers working with skewed data containing excess zeros. Future research may extend the proposed framework to other zero-inflated distributions, assess robustness under model misspecification conditions, and explore computational enhancements for large-scale applications.

Author Contributions

Conceptualization, S.-A.N. and W.T.; methodology, S.-A.N. and W.T.; software, W.T. and N.S.; validation, S.-A.N. and W.T.; formal analysis, S.-A.N. and W.T.; investigation, S.-A.N. and A.W.; resources, W.T. and N.S.; data curation, W.T. and A.W.; writing—original draft preparation, W.T.; writing—review and editing, S.-A.N. and W.T.; visualization, W.T. and N.S.; supervision, S.-A.N.; project administration, S.-A.N.; funding acquisition, S.-A.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research budget was allocated by the National Science, Research, and Innovation Fund (NSRF) and King Mongkut’s University of Technology North Bangkok: KMUTNB-FF-69-B-07.

Data Availability Statement

The data presented in this study are available at https://www.thairsc.com.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Let $X$ be a random variable that follows a ZIR distribution with parameters $p$ and $\sigma$. The population mean is $\theta = (1 - p)\sigma\sqrt{\pi/2}$. Let $\hat{\theta} = g(\hat{p}, \hat{\sigma})$ be the estimator of $\theta$, where $\hat{p}$ and $\hat{\sigma}$ are the maximum likelihood estimators (MLEs) of $p$ and $\sigma$, respectively. Then, an asymptotic estimator of the variance of $\hat{\theta}$ is given by
$$\operatorname{Var}(\hat{\theta}) = \frac{\pi}{2}\left[\frac{\hat{\sigma}^2 \hat{p}(1 - \hat{p})}{n} + \frac{(1 - \hat{p})^2 \hat{\sigma}^2}{4 n_1}\right],$$
where $n$ is the total sample size and $n_1$ is the number of positive observations.
Proof. 
Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) observations from a ZIR distribution with parameters $p$ and $\sigma$. Because the Rayleigh density is zero at $x = 0$, all zero observations arise from the point mass at zero. The log-likelihood therefore factorizes into two independent components: one involving $p$ and the other involving $\sigma$. The estimator $\hat{p}$ is the sample proportion of zeros and satisfies
$$\sqrt{n}\,(\hat{p} - p) \xrightarrow{d} N\big(0,\; p(1 - p)\big).$$
The estimator $\hat{\sigma}$ is the MLE of the Rayleigh scale parameter based on the positive observations only. For a single positive Rayleigh observation, the Fisher information for $\sigma$ is $4/\sigma^2$. Since the expected number of positive observations is $n(1 - p)$,
$$\sqrt{n}\,(\hat{\sigma} - \sigma) \xrightarrow{d} N\!\left(0,\; \frac{\sigma^2}{4(1 - p)}\right).$$
Because the likelihood factorizes with respect to $p$ and $\sigma$, $\hat{p}$ and $\hat{\sigma}$ are asymptotically independent. Define
$$g(p, \sigma) = (1 - p)\,\sigma\sqrt{\frac{\pi}{2}},$$
so that $\hat{\theta} = g(\hat{p}, \hat{\sigma})$. The gradient of $g$ is
$$\nabla g(p, \sigma) = \left(\frac{\partial g}{\partial p},\; \frac{\partial g}{\partial \sigma}\right)^{\top} = \left(-\sigma\sqrt{\frac{\pi}{2}},\; (1 - p)\sqrt{\frac{\pi}{2}}\right)^{\top}.$$
By the multivariate delta method, the variance of $\hat{\theta}$ is
$$\operatorname{Var}(\hat{\theta}) \approx \nabla g(p, \sigma)^{\top} \begin{pmatrix} \dfrac{p(1 - p)}{n} & 0 \\ 0 & \dfrac{\sigma^2}{4n(1 - p)} \end{pmatrix} \nabla g(p, \sigma).$$
Evaluating the quadratic form yields
$$\operatorname{Var}(\hat{\theta}) = \frac{\pi}{2}\left[\frac{\sigma^2 p(1 - p)}{n} + \frac{(1 - p)^2 \sigma^2}{4n(1 - p)}\right].$$
Since $n_1/n \xrightarrow{p} (1 - p)$, the large-sample approximation is
$$\frac{1}{n(1 - p)} \approx \frac{1}{n_1}.$$
Replacing the unknown parameters in $\operatorname{Var}(\hat{\theta})$ by their MLEs gives
$$\operatorname{Var}(\hat{\theta}) = \frac{\pi}{2}\left[\frac{\hat{\sigma}^2 \hat{p}(1 - \hat{p})}{n} + \frac{(1 - \hat{p})^2 \hat{\sigma}^2}{4 n_1}\right].$$
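As a numerical illustration of the variance estimator in Appendix A, the short Python sketch below evaluates it for the Chachoengsao values in Table 3 (n = 11, one zero, scale estimate 13.0096). The function name `zir_mean_var` is an assumption made for this example. The two terms are kept separate to show how much of the variance comes from estimating p and how much from estimating σ.

```python
import math


def zir_mean_var(p_hat, sigma_hat, n, n1):
    """Delta-method variance of theta_hat = (1 - p_hat) * sigma_hat * sqrt(pi / 2)."""
    zero_part = sigma_hat**2 * p_hat * (1 - p_hat) / n       # from estimating p
    scale_part = (1 - p_hat) ** 2 * sigma_hat**2 / (4 * n1)  # from estimating sigma
    return (math.pi / 2) * (zero_part + scale_part)


# Chachoengsao: n = 11 districts, n1 = 10 non-zero counts (Table 3).
var_c = zir_mean_var(p_hat=1 / 11, sigma_hat=13.0096, n=11, n1=10)
print(round(var_c, 4), round(math.sqrt(var_c), 4))
```

For these inputs the scale term is the larger of the two contributions, which is what one would expect when the zero-inflation probability is small.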

Appendix B

For $i = 1, 2, \ldots, k$, let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$ be independent random samples from ZIR distributions with parameters $p_i$ and $\sigma_i$. The mean of group $i$ is $\theta_i = (1 - p_i)\sigma_i\sqrt{\pi/2}$. Let $[p_{i,L}, p_{i,U}]$ and $[\sigma_{i,L}, \sigma_{i,U}]$ be the marginal $100(1 - \alpha)\%$ two-sided confidence intervals for $p_i$ and $\sigma_i$, respectively. Define $\theta_{i,L} = (1 - p_{i,U})\sigma_{i,L}\sqrt{\pi/2}$ and $\theta_{i,U} = (1 - p_{i,L})\sigma_{i,U}\sqrt{\pi/2}$. For $i, l = 1, 2, \ldots, k$ and $i \neq l$, the $100(1 - \alpha)\%$ two-sided confidence interval for $\theta_{il} = \theta_i - \theta_l$ using the endpoint-based MOVER confidence interval is
$$P\{\theta_{i,L} - \theta_{l,U} \leq \theta_i - \theta_l \leq \theta_{i,U} - \theta_{l,L},\; i \neq l\} \geq 1 - \alpha.$$
Proof. 
By construction of the marginal confidence intervals, $P(p_{i,L} \leq p_i \leq p_{i,U}) \geq 1 - \alpha$ and $P(\sigma_{i,L} \leq \sigma_i \leq \sigma_{i,U}) \geq 1 - \alpha$, where $i = 1, 2, \ldots, k$. Since the mean function $\theta_i(p_i, \sigma_i) = (1 - p_i)\sigma_i\sqrt{\pi/2}$ is monotonically decreasing in $p_i$ and monotonically increasing in $\sigma_i$, it follows that $\theta_{i,L} \leq \theta_i \leq \theta_{i,U}$ for all admissible values $(p_i, \sigma_i) \in [p_{i,L}, p_{i,U}] \times [\sigma_{i,L}, \sigma_{i,U}]$, where $i = 1, 2, \ldots, k$. Therefore, whenever both marginal confidence intervals simultaneously cover their true parameters, $\theta_i \in [\theta_{i,L}, \theta_{i,U}]$ and $\theta_l \in [\theta_{l,L}, \theta_{l,U}]$. For any $\theta_i \in [\theta_{i,L}, \theta_{i,U}]$ and $\theta_l \in [\theta_{l,L}, \theta_{l,U}]$, the difference $\theta_{il} = \theta_i - \theta_l$ satisfies
$$\theta_{i,L} - \theta_{l,U} \leq \theta_i - \theta_l \leq \theta_{i,U} - \theta_{l,L}.$$
Hence,
$$\{\theta_i \in [\theta_{i,L}, \theta_{i,U}],\; \theta_l \in [\theta_{l,L}, \theta_{l,U}]\} \subseteq \{\theta_{il} \in [\theta_{i,L} - \theta_{l,U},\; \theta_{i,U} - \theta_{l,L}]\}.$$
Thus, whenever the marginal confidence intervals for $\theta_i$ and $\theta_l$ jointly cover their true values, the endpoint-based MOVER interval necessarily covers the true difference $\theta_{il}$. Consequently, the interval $CI_{\text{MOVER}} = [\theta_{i,L} - \theta_{l,U},\; \theta_{i,U} - \theta_{l,L}]$ achieves at least the nominal coverage probability $1 - \alpha$ under the least favorable configuration, which implies that
$$P\{\theta_{i,L} - \theta_{l,U} \leq \theta_i - \theta_l \leq \theta_{i,U} - \theta_{l,L},\; i \neq l\} \geq 1 - \alpha.$$
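The endpoint construction in Appendix B can be illustrated with a small Python sketch. The marginal intervals below are hypothetical numbers chosen only for the example, not values from the paper, and the function names `mean_bounds` and `endpoint_interval` are likewise assumptions. The loop checks the set inclusion used in the proof: every parameter pair inside the marginal boxes gives a mean difference inside the endpoint-based interval.

```python
import math

C = math.sqrt(math.pi / 2)


def mean_bounds(p_ci, sigma_ci):
    """Map marginal intervals for p and sigma to (theta_L, theta_U)."""
    p_lo, p_hi = p_ci
    s_lo, s_hi = sigma_ci
    # theta decreases in p and increases in sigma, so pair opposite endpoints.
    return (1 - p_hi) * s_lo * C, (1 - p_lo) * s_hi * C


def endpoint_interval(bounds_i, bounds_l):
    """Interval for theta_i - theta_l from the two pairs of mean bounds."""
    return bounds_i[0] - bounds_l[1], bounds_i[1] - bounds_l[0]


# Hypothetical marginal intervals for two groups (illustrative values only).
p_i, s_i = (0.05, 0.25), (0.9, 1.3)
p_l, s_l = (0.10, 0.30), (0.6, 0.9)

lower, upper = endpoint_interval(mean_bounds(p_i, s_i), mean_bounds(p_l, s_l))

# Grid check: any (p, sigma) inside the boxes gives a difference in [lower, upper].
steps = [t / 10 for t in range(11)]
inside = True
for a in steps:
    for b in steps:
        for c in steps:
            for d in steps:
                th_i = (1 - (p_i[0] + a * (p_i[1] - p_i[0]))) * (s_i[0] + b * (s_i[1] - s_i[0])) * C
                th_l = (1 - (p_l[0] + c * (p_l[1] - p_l[0]))) * (s_l[0] + d * (s_l[1] - s_l[0])) * C
                inside = inside and (lower - 1e-12 <= th_i - th_l <= upper + 1e-12)
print(lower, upper, inside)
```

Because the bounds are attained only at opposite corners of the two boxes, the resulting interval is wide, which is in line with the conservative behavior of the MOVER interval reported in the simulation study.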

Appendix C

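For $i, l = 1, 2, \ldots, k$ and $i \neq l$, let $\theta_{il} = \theta_i - \theta_l$ be the difference between the means of groups $i$ and $l$. Let $\hat{\theta}_{il}$ be the estimator of $\theta_{il}$. Assume that the estimators $\hat{\theta}_i$ and $\hat{\theta}_l$ are independent. The variance of $\hat{\theta}_{il}$ is approximated by $\operatorname{Var}(\hat{\theta}_{il}) = \operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)$. Therefore, the confidence interval for $\theta_{il}$ based on the delta-method normal approximation is given by
$$P\left\{\hat{\theta}_{il} - z_{\alpha/2}\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)} \leq \theta_i - \theta_l \leq \hat{\theta}_{il} + z_{\alpha/2}\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)},\; i \neq l\right\} \approx 1 - \alpha.$$
Proof. 
Let $\theta_i = g(\eta_i)$ be a scalar mean parameter for group $i$, where $\eta_i = (p_i, \sigma_i^2)^{\top}$ is a vector of underlying parameters, and $\hat{\eta}_i = (\hat{p}_i, \hat{\sigma}_i^2)^{\top}$ is its estimator. Assume that $\hat{\eta}_i$ satisfies the multivariate asymptotic normality condition
$$\sqrt{n_i}\,(\hat{\eta}_i - \eta_i) \xrightarrow{d} N(0, \Sigma_i),$$
where $\Sigma_i$ is a finite, positive-definite covariance matrix.
Since $g(\cdot)$ is differentiable at $\eta_i$, a first-order Taylor expansion of $\hat{\theta}_i = g(\hat{\eta}_i)$ around $\eta_i$ yields
$$\hat{\theta}_i = \theta_i + \nabla g(\eta_i)^{\top}(\hat{\eta}_i - \eta_i) + o_p(n_i^{-1/2}).$$
Subtracting $\theta_i$ and multiplying both sides by $\sqrt{n_i}$ gives
$$\sqrt{n_i}\,(\hat{\theta}_i - \theta_i) = \nabla g(\eta_i)^{\top}\sqrt{n_i}\,(\hat{\eta}_i - \eta_i) + o_p(1).$$
By Slutsky’s theorem,
$$\sqrt{n_i}\,(\hat{\theta}_i - \theta_i) \xrightarrow{d} N\big(0,\; \nabla g(\eta_i)^{\top}\Sigma_i \nabla g(\eta_i)\big).$$
Hence,
$$\hat{\theta}_i \overset{\text{approx.}}{\sim} N\big(\theta_i,\; \operatorname{Var}(\hat{\theta}_i)\big),$$
where $\operatorname{Var}(\hat{\theta}_i)$ is consistently estimated by
$$\operatorname{Var}(\hat{\theta}_i) = \frac{\pi}{2}\left[\frac{\hat{\sigma}_i^2 \hat{p}_i(1 - \hat{p}_i)}{n_i} + \frac{(1 - \hat{p}_i)^2 \hat{\sigma}_i^2}{4 n_{i(1)}}\right].$$
Consider two independent estimators $\hat{\theta}_i$ and $\hat{\theta}_l$. By independence and asymptotic normality,
$$\hat{\theta}_i - \hat{\theta}_l \overset{\text{approx.}}{\sim} N\big(\theta_i - \theta_l,\; \operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)\big).$$
Then, by Slutsky’s theorem,
$$\frac{(\hat{\theta}_i - \hat{\theta}_l) - (\theta_i - \theta_l)}{\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)}} \xrightarrow{d} N(0, 1).$$
Let $z_{\alpha/2}$ be the upper $\alpha/2$ quantile of the standard normal distribution. From the limiting result above,
$$P\left(-z_{\alpha/2} \leq \frac{(\hat{\theta}_i - \hat{\theta}_l) - (\theta_i - \theta_l)}{\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)}} \leq z_{\alpha/2}\right) \approx 1 - \alpha.$$
Rearranging the inequality gives
$$P\left\{\hat{\theta}_{il} - z_{\alpha/2}\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)} \leq \theta_i - \theta_l \leq \hat{\theta}_{il} + z_{\alpha/2}\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)},\; i \neq l\right\} \approx 1 - \alpha.$$
Therefore, the delta-method normal approximation confidence interval for the difference $\theta_{il} = \theta_i - \theta_l$ is $\left[\hat{\theta}_{il} - z_{\alpha/2}\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)},\; \hat{\theta}_{il} + z_{\alpha/2}\sqrt{\operatorname{Var}(\hat{\theta}_i) + \operatorname{Var}(\hat{\theta}_l)}\right]$ and is asymptotically valid with coverage probability $1 - \alpha$. □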
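The interval in Appendix C can be evaluated directly from the summary statistics in Table 3. The Python sketch below is an illustration rather than the authors' R code; the helper name `mean_and_var` is an assumption, and the standard normal quantile is used without any multiplicity adjustment, as in the displayed formula.

```python
import math
from itertools import combinations
from statistics import NormalDist

# (n, n1, sigma_hat) per province, taken from Table 3.
groups = {
    "Chachoengsao": (11, 10, 13.0096),
    "Uttaradit": (9, 8, 7.6076),
    "Chaiyaphum": (16, 15, 6.5853),
}


def mean_and_var(n, n1, sigma_hat):
    """ZIR mean estimate and its delta-method variance estimate."""
    p_hat = (n - n1) / n
    theta = (1 - p_hat) * sigma_hat * math.sqrt(math.pi / 2)
    var = (math.pi / 2) * (
        sigma_hat**2 * p_hat * (1 - p_hat) / n
        + (1 - p_hat) ** 2 * sigma_hat**2 / (4 * n1)
    )
    return theta, var


z = NormalDist().inv_cdf(0.975)  # upper 0.025 quantile for a 95% interval
intervals = {}
for a, b in combinations(groups, 2):
    (ta, va), (tb, vb) = mean_and_var(*groups[a]), mean_and_var(*groups[b])
    half = z * math.sqrt(va + vb)
    intervals[(a, b)] = (ta - tb - half, ta - tb + half)

for pair, (lo, hi) in intervals.items():
    print(pair, round(lo, 4), round(hi, 4))
```

Under these assumptions the three intervals should agree with the delta row of Table 4 to about three decimal places, with small discrepancies coming from the rounded scale estimates.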

References

  1. Bashir, S.; Rasul, M. A new weighted Rayleigh distribution: Properties and applications on lifetime data. Open J. Stat. 2018, 8, 640–650.
  2. Al-Babtain, A.A. A new extended Rayleigh distribution. J. King Saud-Univ.-Sci. 2020, 32, 2576–2581.
  3. Krishnamoorthy, K.; Waguespack, D.; Hoang-Nguyen-Thuy, N. Confidence interval, prediction interval and tolerance limits for a two-parameter Rayleigh distribution. J. App. Stat. 2020, 47, 160–175.
  4. Almongy, H.M.; Almetwally, E.M.; Aljohani, H.M.; Alghamdi, A.S.; Hafez, E.H. A new extended Rayleigh distribution with applications of COVID-19 data. Results Phy. 2021, 23, 104012.
  5. Fuxiang, L.; Jianing, L.; Peng, X. A novel zero-inflated Rayleigh distribution and its properties. Results Phy. 2023, 51, 106634.
  6. Kijsason, S.; Niwitpong, S.-A.; Niwitpong, S. Confidence intervals for the parameter mean of zero-inflated two-parameter Rayleigh distribution. Symmetry 2025, 17, 1019.
  7. Donner, A.; Zou, G.Y. Closed-form confidence intervals for function of the normal standard deviation. Stat. Meth. Med. Res. 2010, 21, 347–359.
Figure 1. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across three sample size scenarios: (A) coverage probabilities and (B) average lengths.
Figure 2. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across three zero-inflation probability scenarios: (A) coverage probabilities and (B) average lengths.
Figure 3. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across three Rayleigh scale parameter scenarios: (A) coverage probabilities and (B) average lengths.
Figure 4. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across six sample size scenarios: (A) coverage probabilities and (B) average lengths.
Figure 5. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across six zero-inflation probability scenarios: (A) coverage probabilities and (B) average lengths.
Figure 6. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across six Rayleigh scale parameter scenarios: (A) coverage probabilities and (B) average lengths.
Figure 7. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across ten sample size scenarios: (A) coverage probabilities and (B) average lengths.
Figure 8. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across ten zero-inflation probability scenarios: (A) coverage probabilities and (B) average lengths.
Figure 9. Comparison of the CPs and ALs of the SCIs for pairwise differences of means across ten Rayleigh scale parameter scenarios: (A) coverage probabilities and (B) average lengths.
Figure 10. Histogram (A) and cumulative distribution function (CDF) (B) of the number of fatalities in Chachoengsao province.
Figure 11. Histogram (A) and cumulative distribution function (CDF) (B) of the number of fatalities in Uttaradit province.
Figure 12. Histogram (A) and cumulative distribution function (CDF) (B) of the number of fatalities in Chaiyaphum province.
Table 1. The number of fatalities from road accidents in Chachoengsao, Uttaradit, and Chaiyaphum provinces.
| Provinces | Number of Fatalities |
| --- | --- |
| Chachoengsao | 33, 29, 24, 17, 14, 13, 13, 6, 4, 2, 0 |
| Uttaradit | 22, 14, 10, 9, 6, 4, 3, 2, 0 |
| Chaiyaphum | 21, 12, 11, 11, 9, 9, 9, 8, 8, 6, 5, 4, 4, 3, 1, 0 |
Source: Road Accident Victims Protection Company Limited (https://www.thairsc.com/).
Table 2. AIC values for eight distributions in Chachoengsao, Uttaradit, and Chaiyaphum provinces.
| Distributions | AIC: Chachoengsao | AIC: Uttaradit | AIC: Chaiyaphum |
| --- | --- | --- | --- |
| Normal | 78.2539 | 56.0499 | 92.7017 |
| Log-normal | 78.2990 | 52.6572 | 92.5681 |
| Weibull | 76.4777 | 53.0505 | 90.0834 |
| Gamma | 76.8536 | 52.8295 | 90.3140 |
| Exponential | 76.8168 | 52.7049 | 94.6322 |
| Logistic | 78.9707 | 56.0882 | 91.5391 |
| Cauchy | 82.6001 | 58.2615 | 94.9144 |
| Rayleigh | 75.5971 | 52.5331 | 88.3944 |
Note: The Rayleigh distribution has the lowest AIC value in all three provinces.
Table 3. Statistics of the number of fatalities in Chachoengsao, Uttaradit, and Chaiyaphum provinces.
| Statistics | Chachoengsao | Uttaradit | Chaiyaphum |
| --- | --- | --- | --- |
| Total sample size ($n_i$) | 11 | 9 | 16 |
| Number of zero observations ($n_{i(0)}$) | 1 | 1 | 1 |
| Number of non-zero observations ($n_{i(1)}$) | 10 | 8 | 15 |
| Estimator of zero-inflation probability ($\hat{p}_i$) | 0.0909 | 0.1111 | 0.0625 |
| Estimator of scale parameter ($\hat{\sigma}_i$) | 13.0096 | 7.6076 | 6.5853 |
| Estimator for mean ($\hat{\theta}_i$) | 14.8228 | 8.4753 | 7.7377 |
Table 4. The 95% two-sided SCIs for pairwise differences of means in ZIR distributions.
| Comparison | | Chachoengsao–Uttaradit | Chachoengsao–Chaiyaphum | Uttaradit–Chaiyaphum |
| --- | --- | --- | --- | --- |
| GCI | $CI_{\text{GCI}}$ | [0.9969, 13.4857] | [2.7051, 14.3792] | [−2.7753, 5.3810] |
| | Length | 12.4888 | 11.6741 | 8.1563 |
| PB | $CI_{\text{PB}}$ | [−0.0504, 12.9515] | [1.2607, 12.6435] | [−3.4796, 4.4960] |
| | Length | 13.0019 | 11.3828 | 7.9756 |
| MOVER | $CI_{\text{MOVER}}$ | [−6.1946, 18.0843] | [−3.7180, 17.0077] | [−7.1009, 8.5009] |
| | Length | 24.2789 | 20.7257 | 15.6018 |
| Delta | $CI_{\text{Delta}}$ | [−0.0734, 12.7686] | [1.2916, 12.8788] | [−3.4153, 4.8905] |
| | Length | 12.8420 | 11.5872 | 8.3058 |
| HPD | $CI_{\text{HPD}}$ | [−0.8202, 12.8196] | [0.9181, 12.7709] | [−3.3553, 5.6676] |
| | Length | 13.6398 | 11.8528 | 9.0229 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
