Article

Regression Extensions of the New Polynomial Exponential Distribution: NPED-GLM and Poisson–NPED Count Models with Applications in Engineering and Insurance

by
Halim Zeghdoudi
1,
Sandra S. Ferreira
2,3,*,
Vinoth Raman
4 and
Dário Ferreira
2,3
1
Laboratory of Probability and Statistics (LaPS), Badji Mokhtar-Annaba University, Annaba 23000, Algeria
2
Mathematics Department, University of Beira Interior, 6201-001 Covilhã, Portugal
3
Centre of Mathematics and Applications, University of Beira Interior, 6201-001 Covilhã, Portugal
4
Deanship of Quality and Academic Accreditation, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia
*
Author to whom correspondence should be addressed.
Computation 2026, 14(1), 26; https://doi.org/10.3390/computation14010026
Submission received: 16 December 2025 / Revised: 8 January 2026 / Accepted: 13 January 2026 / Published: 21 January 2026
(This article belongs to the Section Computational Engineering)

Abstract

The New Polynomial Exponential Distribution (NPED), introduced by Beghriche et al. (2022), provides a flexible one-parameter family capable of representing diverse hazard shapes and heavy-tailed behavior. Regression frameworks based on the NPED, however, have not yet been established. This paper introduces two methodological extensions: (i) a generalized linear model (NPED-GLM) in which the distribution parameter depends on covariates, and (ii) a Poisson–NPED count regression model suitable for overdispersed and heavy-tailed count data. Likelihood-based inference, asymptotic properties, and simulation studies are developed to investigate the performance of the estimators. Applications to engineering failure-count data and insurance claim frequencies illustrate the advantages of the proposed models relative to classical Poisson, negative binomial, and Poisson–Lindley regressions. These developments substantially broaden the applicability of the NPED in actuarial science, reliability engineering, and applied statistics.

1. Introduction

Count data are frequently encountered in engineering reliability, actuarial science, public health, and numerous other applied fields. The Poisson distribution, together with its associated regression framework, forms the classical foundation for modeling such data. However, the standard Poisson model imposes the restrictive equality of mean and variance, an assumption that is rarely satisfied in practice. Empirical count data commonly exhibit overdispersion or heavy-tailed behavior, leading to underestimated uncertainty and systematic lack of fit when the Poisson model is used [1,2].
One widely used strategy for accommodating overdispersion is to introduce unobserved heterogeneity through mixed-Poisson models, in which the Poisson mean is treated as a latent random variable. This construction yields a broad class of flexible models with richer variance structures. A comprehensive overview of mixed-Poisson families and their probabilistic properties is provided in [3], who discuss their interpretability, tractability, and applications in a variety of domains. Mixed-Poisson models are particularly appealing because they retain closed-form probability mass functions under many choices of mixing distribution, enabling likelihood-based inference within a regression context.
The need for flexible count models is especially evident in engineering reliability. Failure processes are often driven by unobserved factors such as latent degradation, variability in material properties, and changes in operating conditions. These sources of heterogeneity can induce substantial deviations from Poisson-like behavior, producing skewed or overdispersed failure counts [4]. In such settings, classical Poisson regression is inadequate, necessitating models that allow dispersion and tail behavior to vary more freely.
Motivated by these challenges, this paper develops a new class of count regression models based on the New Polynomial Exponential Distribution (NPED). Originally introduced as a flexible family for continuous data, the NPED mixes naturally with the Poisson distribution to yield analytically tractable mixed-Poisson models possessing wide-ranging dispersion and tail characteristics. We show that the resulting Poisson–NPED distribution nests several well-known mixed-Poisson models as special or limiting cases, while offering substantially greater flexibility for capturing complex forms of heterogeneity.
Building on this distributional foundation, we propose a generalized linear modeling (GLM) framework—referred to as the Poisson–NPED regression model—that allows covariates to influence both the mean and the dispersion structure through the NPED mixing mechanism. The model remains fully likelihood-based and admits efficient numerical estimation, enabling practitioners to apply it using standard tools. Recent literature continues to expand the toolbox for modeling overdispersed count data. In particular, several recent studies have substantially extended this area; for example, a comprehensive survey of overdispersed count distributions highlights the limitations of classical models and reviews a range of alternative formulations such as COM–Poisson, generalized Poisson, and mixed-Poisson families [5]. New mixed-Poisson constructions continue to appear: a one-parameter Poisson–Lindley type model based on a “second-degree Lindley” mixing distribution was recently proposed [6], and alternative mixtures such as the Poisson–Transmuted-Janardan distribution have been introduced to capture heavy tails and flexible variance structures [7]. A very recent contribution has developed a general statistical framework for overdispersed count data, underlining the ongoing demand for flexible yet tractable models [8]. Even for classical mixtures, methodological refinements continue: e.g., a generalized ridge estimator was recently developed for the Poisson–Inverse Gaussian regression model to handle multicollinearity in covariates [9]. These developments show a growing recognition that standard count models are often insufficient when real data exhibit high heterogeneity and complex dispersion behavior—reinforcing the motivation for a flexible, well-grounded model such as the proposed Poisson–NPED regression.
The contributions of this paper are threefold:
  • We develop the Poisson–NPED distribution and establish its probabilistic properties, including closed-form expressions for its pmf and moments.
  • We propose a GLM-type regression framework that incorporates NPED-based heterogeneity in a tractable and interpretable way, subsuming several standard mixed-Poisson models.
  • We demonstrate the effectiveness of the proposed model through simulation studies and real-data applications in engineering reliability and insurance claim analysis, where the data exhibit strong overdispersion that classical models fail to capture adequately.

Paper Overview

Section 2 introduces the Poisson–NPED distribution and establishes its main analytical properties. In Section 3, we present the NPED-based regression framework and discuss its implementation. Section 4 develops the likelihood-based inference procedures for the proposed model and establishes the consistency and asymptotic normality of the MLE. Section 5 assesses the finite-sample performance of the estimators through simulation studies. Section 6 presents two real-data applications drawn from engineering and insurance. Finally, Section 7 offers concluding remarks and outlines potential directions for future research.

2. On New Polynomial Exponential Models

2.1. Review of the New Polynomial Exponential Distribution (NPED)

According to Beghriche et al. (2022) [10], a random variable X follows NPED(θ) if its pdf is given by
$$ f_{\mathrm{NPED}}(x;\theta) = \frac{P(x,\theta)\, e^{-\theta x}}{\sum_{k=0}^{m} a_{k,\theta}\,\frac{k!}{\theta^{k+1}}}, \qquad x > 0, \tag{1} $$
where
$$ P(x,\theta) = \sum_{k=0}^{m} a_{k,\theta}\, x^{k}, $$
and $a_{k,\theta}$ denotes the $k$-th parameter-dependent coefficient that shapes the polynomial $P(x,\theta)$ and thus governs how the polynomial part of the density changes with respect to $\theta$. Expressions for the survival function, hazard rate, and moments are derived in the original paper (pp. 2–6; see [10]).
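As a concrete numerical sketch of the density above, the snippet below evaluates it for user-supplied coefficients. This is illustrative only: the hypothetical argument `coeffs` treats the coefficients $a_{k,\theta}$ as constants, whereas in general they may depend on $\theta$.

```python
import math

def nped_pdf(x, theta, coeffs):
    """NPED(theta) density; coeffs[k] stands in for a_{k,theta},
    held constant in theta here purely for illustration."""
    if x <= 0 or theta <= 0:
        return 0.0
    poly = sum(a * x**k for k, a in enumerate(coeffs))
    norm = sum(a * math.factorial(k) / theta**(k + 1) for k, a in enumerate(coeffs))
    return poly * math.exp(-theta * x) / norm

# With coeffs = [1, 1] and theta = 1 the NPED reduces to the Lindley(1)
# density f(x) = (1 + x) e^{-x} / 2, so f(1) = e^{-1}.
print(nped_pdf(1.0, 1.0, [1.0, 1.0]))  # ≈ 0.3679
```

With `coeffs = [1, 1]` the formula recovers the Lindley density, one of the special cases listed next, which gives a quick correctness check.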
Important special cases of the NPED as a continuous distribution include several well-known lifetime and reliability models:
  • Lindley distribution—A classical one-parameter lifetime model with an increasing failure rate, widely used in survival analysis and reliability studies.
  • X–Lindley distribution—An extended form of the Lindley distribution that introduces additional shape flexibility, allowing unimodal and more complex hazard functions.
  • New X–Lindley distribution—A recent generalization of the X–Lindley family, designed to capture heavier tails and a wider range of hazard shapes.
  • X–Gamma distribution—A compound lifetime distribution that blends features of the Gamma and Lindley families, offering greater modeling flexibility for skewed data.
  • Zeghdoudi distribution—A polynomial–exponential lifetime model obtained by modifying the Lindley distribution through additional polynomial terms.
  • Shanker family of distributions—A broad class of polynomial–exponential lifetime models (including the Akash, Ishita, Sujatha, and other distributions), unified by their common exponential–polynomial kernel.
These models all belong to the general class of polynomial–exponential distributions, for which the density can be expressed in the form
$$ f(x) \propto p(x)\, e^{-\theta x}, \qquad x > 0, $$
where $p(x)$ is a non-negative polynomial. The NPED extends this class by allowing $p(x)$ to take a flexible polynomial form of arbitrary degree, with parameters controlling tail weight, shape, and failure-rate behavior.
The NPED is able to produce:
  • A wide range of density shapes, including decreasing, unimodal, bathtub-shaped, and increasing forms.
  • Flexible tail behavior, from light to heavy tails, depending on the degree and coefficients of the polynomial component.
  • Multiple well-known lifetime distributions as special or limiting cases, unifying several families into a single polynomial–exponential framework.
  • Closed-form expressions for the pdf, cdf, and moments for many parameter configurations, maintaining analytical tractability.
  • Greater adaptability in modeling failure-time data, allowing the NPED to accommodate datasets where classical one-parameter models (e.g., Lindley or Gamma) fail to capture the observed skewness or tail thickness.
These features make the NPED an excellent foundation for regression generalizations.

2.2. NPED-GLM Regression Model

In this section, we develop a regression framework in which the parameter of the NPED depends on a set of covariates through a link function. This yields a generalized linear model (GLM)-type structure, which we refer to as the NPED-GLM. Unlike classical GLMs, where covariate effects are typically introduced through the mean parameter of an exponential family, the NPED-GLM links covariates directly to the parameter θ , which jointly governs scale, tail behavior, and dispersion of the response distribution. This construction preserves the analytical tractability of the NPED, while allowing regression effects to influence higher-order distributional features beyond the mean, such as skewness and tail thickness. As a result, the NPED-GLM provides a flexible yet interpretable alternative to standard overdispersed count and lifetime regression models.

2.3. Model Specification

Let $Y_1, \dots, Y_n$ be independent positive responses, and for each observation $i$ let $\mathbf{x}_i = (x_{i1}, \dots, x_{ip})^{\top}$ denote a $p$-dimensional vector of covariates. Following [10], we say that $Y_i$ follows an NPED distribution with parameter $\theta_i > 0$ if its probability density function (pdf) is given by
$$ f_{\mathrm{NPED}}(y_i;\theta_i) = \frac{P(y_i,\theta_i)\exp(-\theta_i y_i)}{C(\theta_i)}, \qquad y_i > 0,\ \theta_i > 0, $$
where
$$ P(y,\theta) = \sum_{k=0}^{m} a_{k,\theta}\, y^{k} \quad\text{and}\quad C(\theta) = \sum_{k=0}^{m} a_{k,\theta}\,\frac{k!}{\theta^{k+1}}. \tag{2} $$
Here, $m$ is a fixed nonnegative integer, independent of the sample size $n$, and $a_{k,\theta}$ are real-valued functions of $\theta$ such that $P(y,\theta) \ge 0$ for all $y > 0$ and $C(\theta) \in (0,\infty)$.
To introduce covariate effects, we assume that the NPED parameter $\theta_i$ depends on $\mathbf{x}_i$ through a link function. Let
$$ \eta_i = \mathbf{x}_i^{\top}\boldsymbol{\beta}, \qquad i = 1, \dots, n, \tag{3} $$
denote the linear predictor, where $\boldsymbol{\beta} = (\beta_1, \dots, \beta_p)^{\top}$ is a vector of unknown regression coefficients. We then postulate a one-to-one, continuously differentiable, monotone link function $g : (0,\infty) \to \mathbb{R}$ such that
$$ g(\theta_i) = \eta_i \quad \Longleftrightarrow \quad \theta_i = g^{-1}(\eta_i). \tag{4} $$
A common and convenient choice is the log link,
$$ \theta_i = \exp(\eta_i), \tag{5} $$
which ensures that $\theta_i > 0$ for all $i$.
Under the log link, regression coefficients quantify the multiplicative effect of covariates on the NPED parameter θ , which in turn governs both the scale and tail behavior of the response distribution.
Under this specification, the conditional density of $Y_i$ given $\mathbf{x}_i$ is
$$ f(y_i \mid \mathbf{x}_i; \boldsymbol{\beta}) = \frac{P\big(y_i, \theta_i(\boldsymbol{\beta})\big)\exp\big(-\theta_i(\boldsymbol{\beta})\, y_i\big)}{C\big(\theta_i(\boldsymbol{\beta})\big)}, \qquad y_i > 0, $$
with $\theta_i(\boldsymbol{\beta}) = g^{-1}(\mathbf{x}_i^{\top}\boldsymbol{\beta})$.

2.4. Log-Likelihood, Score Function and Hessian

For a sample of size $n$, the log-likelihood function for $\boldsymbol{\beta}$, up to an additive constant, is given by
$$ \ell(\boldsymbol{\beta}) = \sum_{i=1}^{n} \ell_i(\boldsymbol{\beta}), \qquad \ell_i(\boldsymbol{\beta}) = \log P(y_i,\theta_i) - \theta_i y_i - \log C(\theta_i), $$
where $\theta_i = \theta_i(\boldsymbol{\beta})$ is defined in (4)–(5).
To obtain the score function, note that by the chain rule
$$ \frac{\partial \ell(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} = \sum_{i=1}^{n} \frac{\partial \ell_i(\boldsymbol{\beta})}{\partial \theta_i}\,\frac{\partial \theta_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \boldsymbol{\beta}} = \sum_{i=1}^{n} U_{\theta}(\theta_i; y_i)\,\theta_i'(\eta_i)\,\mathbf{x}_i, \tag{8} $$
where
$$ U_{\theta}(\theta; y) := \left.\frac{\partial \ell_i(\boldsymbol{\beta})}{\partial \theta_i}\right|_{\theta_i = \theta} = \frac{\partial}{\partial\theta}\Big[\log P(y,\theta) - \theta y - \log C(\theta)\Big], \tag{9} $$
and $\theta_i'(\eta_i) = d\theta_i/d\eta_i$. For the log link (5), we have $\theta_i'(\eta_i) = \theta_i$.
We now compute $U_{\theta}(\theta; y)$ explicitly. Differentiating (9) with respect to $\theta$ yields
$$ U_{\theta}(\theta; y) = \frac{P_{\theta}(y,\theta)}{P(y,\theta)} - y - \frac{C'(\theta)}{C(\theta)}, \tag{10} $$
where
$$ P_{\theta}(y,\theta) := \frac{\partial}{\partial\theta} P(y,\theta) = \sum_{k=0}^{m} \frac{\partial a_{k,\theta}}{\partial\theta}\, y^{k}, $$
and, using (2),
$$ C'(\theta) = \sum_{k=0}^{m} \left[a'_{k,\theta}\,\frac{k!}{\theta^{k+1}} - a_{k,\theta}\,\frac{(k+1)\,k!}{\theta^{k+2}}\right], \tag{11} $$
with $a'_{k,\theta} = \partial a_{k,\theta}/\partial\theta$.
Substituting (10) into (8), we obtain the NPED-GLM score vector
$$ U(\boldsymbol{\beta}) := \frac{\partial \ell(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} = \sum_{i=1}^{n} \left[\frac{P_{\theta}(y_i,\theta_i)}{P(y_i,\theta_i)} - y_i - \frac{C'(\theta_i)}{C(\theta_i)}\right] \theta_i'(\eta_i)\,\mathbf{x}_i. \tag{12} $$
Theorem 1
(Unbiasedness of the score). Assume that, for each fixed $\theta > 0$, the density $f_{\mathrm{NPED}}(y;\theta)$ in (1) is correctly specified, and that differentiation under the integral sign is valid. Then, at the true parameter value $\boldsymbol{\beta}_0$, the score function satisfies
$$ \mathbb{E}\big[U(\boldsymbol{\beta}_0)\big] = \mathbf{0}. $$
Proof. 
Fix $i \in \{1, \dots, n\}$ and write $\theta_i^0 = g^{-1}(\mathbf{x}_i^{\top}\boldsymbol{\beta}_0)$. From (9)–(10) and the definition of the log-likelihood, we have
$$ \mathbb{E}\big[U_{\theta}(\theta_i^0; Y_i)\big] = \int_0^{\infty} \left.\frac{\partial}{\partial\theta}\log f_{\mathrm{NPED}}(y;\theta)\right|_{\theta=\theta_i^0} f_{\mathrm{NPED}}(y;\theta_i^0)\, dy. $$
Using the identity
$$ \frac{\partial}{\partial\theta}\log f_{\mathrm{NPED}}(y;\theta) = \frac{1}{f_{\mathrm{NPED}}(y;\theta)}\,\frac{\partial}{\partial\theta} f_{\mathrm{NPED}}(y;\theta), $$
and interchanging integration and differentiation (by the assumed regularity), we obtain
$$ \mathbb{E}\big[U_{\theta}(\theta_i^0; Y_i)\big] = \int_0^{\infty} \frac{1}{f_{\mathrm{NPED}}(y;\theta_i^0)} \left.\frac{\partial}{\partial\theta} f_{\mathrm{NPED}}(y;\theta)\right|_{\theta=\theta_i^0} f_{\mathrm{NPED}}(y;\theta_i^0)\, dy = \int_0^{\infty} \left.\frac{\partial}{\partial\theta} f_{\mathrm{NPED}}(y;\theta)\right|_{\theta=\theta_i^0} dy. $$
Since $f_{\mathrm{NPED}}(\cdot;\theta)$ is a pdf for every $\theta > 0$, $\int_0^{\infty} f_{\mathrm{NPED}}(y;\theta)\,dy = 1$, and differentiating this identity with respect to $\theta$ yields
$$ \int_0^{\infty} \frac{\partial}{\partial\theta} f_{\mathrm{NPED}}(y;\theta)\, dy = 0. $$
Therefore, $\mathbb{E}\big[U_{\theta}(\theta_i^0; Y_i)\big] = 0$. Using (12), we then have
$$ \mathbb{E}\big[U(\boldsymbol{\beta}_0)\big] = \sum_{i=1}^{n} \theta_i'(\eta_i^0)\,\mathbf{x}_i\, \mathbb{E}\big[U_{\theta}(\theta_i^0; Y_i)\big] = \sum_{i=1}^{n} \theta_i'(\eta_i^0)\,\mathbf{x}_i \cdot 0 = \mathbf{0}, $$
where $\eta_i^0 = \mathbf{x}_i^{\top}\boldsymbol{\beta}_0$. This proves the claim. □
The Hessian matrix is obtained by differentiating (12) once more with respect to $\boldsymbol{\beta}$. Define
$$ \ell_{\theta\theta}(\theta; y) := \frac{\partial^{2}}{\partial\theta^{2}}\Big[\log P(y,\theta) - \theta y - \log C(\theta)\Big]. $$
Then, a direct calculation gives
$$ \ell_{\theta\theta}(\theta; y) = \frac{P_{\theta\theta}(y,\theta)}{P(y,\theta)} - \frac{P_{\theta}(y,\theta)^{2}}{P(y,\theta)^{2}} - \frac{C''(\theta)}{C(\theta)} + \frac{C'(\theta)^{2}}{C(\theta)^{2}}, \tag{13} $$
where
$$ P_{\theta\theta}(y,\theta) = \frac{\partial^{2}}{\partial\theta^{2}} P(y,\theta), \qquad C''(\theta) = \frac{\partial^{2}}{\partial\theta^{2}} C(\theta). $$
Using the chain rule again, the Hessian of $\ell(\boldsymbol{\beta})$ is
$$ H(\boldsymbol{\beta}) := \frac{\partial^{2}\ell(\boldsymbol{\beta})}{\partial\boldsymbol{\beta}\,\partial\boldsymbol{\beta}^{\top}} = \sum_{i=1}^{n}\Big[\ell_{\theta\theta}(\theta_i; y_i)\,\theta_i'(\eta_i)^{2} + U_{\theta}(\theta_i; y_i)\,\theta_i''(\eta_i)\Big]\,\mathbf{x}_i\mathbf{x}_i^{\top}, \tag{14} $$
where $\theta_i''(\eta_i) = d^{2}\theta_i/d\eta_i^{2}$. For the log link (5), we have $\theta_i'(\eta_i) = \theta_i$ and $\theta_i''(\eta_i) = \theta_i$, so that (14) simplifies accordingly.

2.5. Asymptotic Properties of the MLE

Let $\hat{\boldsymbol{\beta}}$ denote the maximum likelihood estimator (MLE) that solves the score equations
$$ U(\hat{\boldsymbol{\beta}}) = \mathbf{0}. $$
Under suitable regularity conditions (identifiability of the model, interior parameter point β 0 , differentiability of f NPED ( y ; θ ) and the link, finite Fisher information, and bounded regressors), standard likelihood theory can be applied to derive consistency and asymptotic normality of β ^ .
Let
$$ I(\boldsymbol{\beta}) = -\mathbb{E}\big[H(\boldsymbol{\beta})\big] $$
denote the Fisher information matrix for $\boldsymbol{\beta}$, where $H(\boldsymbol{\beta})$ is given in (14).
Theorem 2
(asymptotic normality of the NPED-GLM MLE). Assume that
(i) 
the model is correctly specified and identifiable at the true parameter β 0 ;
(ii) 
the log-likelihood ( β ) is twice continuously differentiable in a neighborhood of β 0 ;
(iii) 
I ( β 0 ) is finite and positive definite;
(iv) 
$\max_{1 \le i \le n} \|\mathbf{x}_i\| = o(\sqrt{n})$ as $n \to \infty$.
Then, the MLE β ^ is consistent and
$$ \sqrt{n}\big(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_0\big) \xrightarrow{d} N\big(\mathbf{0},\, I(\boldsymbol{\beta}_0)^{-1}\big), \qquad n \to \infty. $$
Proof. 
By Theorem 1, E [ U ( β 0 ) ] = 0 . Under (ii) and the usual dominated convergence arguments, one can show that U ( β ) / n converges in probability to E [ U ( β ) ] , uniformly on compact sets. Assumption (i) implies that E [ U ( β ) ] has a unique zero at β 0 , which, together with the law of large numbers, yields the consistency of β ^ as the solution of the score equations.
For asymptotic normality, we perform a first-order Taylor (mean-value) expansion of the score around $\boldsymbol{\beta}_0$:
$$ U(\hat{\boldsymbol{\beta}}) = U(\boldsymbol{\beta}_0) + H(\tilde{\boldsymbol{\beta}})\big(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_0\big), $$
for some β ˜ on the line segment between β ^ and β 0 . Using U ( β ^ ) = 0 and rearranging, we obtain
$$ \sqrt{n}\big(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_0\big) = \Big[-\tfrac{1}{n} H(\tilde{\boldsymbol{\beta}})\Big]^{-1}\, \frac{1}{\sqrt{n}}\, U(\boldsymbol{\beta}_0). $$
Under assumptions (ii)–(iv), a central limit theorem for the sum of independent scores implies that $\frac{1}{\sqrt{n}} U(\boldsymbol{\beta}_0) \xrightarrow{d} N\big(\mathbf{0},\, I(\boldsymbol{\beta}_0)\big)$, while $-\frac{1}{n} H(\tilde{\boldsymbol{\beta}})$ converges in probability to $I(\boldsymbol{\beta}_0)$ by the law of large numbers. Slutsky’s theorem then gives the desired asymptotic normality of $\hat{\boldsymbol{\beta}}$ with covariance matrix $I(\boldsymbol{\beta}_0)^{-1}$. □
The result in Theorem 2 justifies the use of Wald-type confidence intervals and hypothesis tests for linear combinations of the components of β in the NPED-GLM setting.

3. Poisson–NPED Count Regression

Count data frequently exhibit overdispersion, heavy tails, or excess heterogeneity that cannot be adequately modeled by the standard Poisson distribution. A classical remedy is to assume that the Poisson rate parameter is itself random, yielding a mixture model. Well-known examples include the Poisson–Gamma (negative binomial) and Poisson–Lindley mixtures. In this section we introduce a new and more flexible class of count models based on a Poisson–NPED mixture, where the Poisson rate parameter follows the New Polynomial Exponential Distribution (NPED) introduced by [10].

3.1. Model Definition

Let Y i denote a count response such as number of failures, claims, defects, or events. We assume a mixed-Poisson structure of the form
$$ Y_i \mid \Lambda_i \sim \mathrm{Poisson}(\Lambda_i), \qquad \Lambda_i \sim \mathrm{NPED}(\theta_i), \tag{15} $$
where $\theta_i$ is a positive parameter possibly depending on covariates. Using the NPED pdf
$$ f_{\mathrm{NPED}}(\lambda;\theta) = \frac{P(\lambda,\theta)\exp(-\theta\lambda)}{C(\theta)}, \qquad \lambda > 0, \tag{16} $$
with definitions of $P$ and $C$ as in (2), the marginal pmf of $Y_i$ is obtained by integrating out the mixing variable $\Lambda_i$:
$$ \Pr(Y_i = y) = \frac{1}{C(\theta_i)\, y!} \int_0^{\infty} \lambda^{y}\, P(\lambda,\theta_i)\, \exp\big(-(\theta_i+1)\lambda\big)\, d\lambda, \qquad y = 0, 1, 2, \dots \tag{17} $$
The integral in (17) admits a closed-form expression as shown next.
Theorem 3
(Closed-form pmf of the Poisson–NPED mixture). Let $P(\lambda,\theta) = \sum_{k=0}^{m} a_{k,\theta}\,\lambda^{k}$ and let $C(\theta)$ be defined as in (2). Then, the pmf of the Poisson–NPED mixture is
$$ \Pr(Y_i = y) = \frac{1}{C(\theta_i)\, y!} \sum_{k=0}^{m} a_{k,\theta_i}\, \frac{\Gamma(y+k+1)}{(\theta_i+1)^{y+k+1}}, \tag{18} $$
for $y = 0, 1, 2, \dots$
Proof. 
Substituting the polynomial form of $P(\lambda,\theta_i)$ into (17), we obtain
$$ \Pr(Y_i = y) = \frac{1}{C(\theta_i)\, y!} \sum_{k=0}^{m} a_{k,\theta_i} \int_0^{\infty} \lambda^{y+k} \exp\big(-(\theta_i+1)\lambda\big)\, d\lambda. $$
The integral is recognized as a Gamma function:
$$ \int_0^{\infty} \lambda^{y+k}\, e^{-(\theta_i+1)\lambda}\, d\lambda = \frac{\Gamma(y+k+1)}{(\theta_i+1)^{y+k+1}}. $$
Substituting into the expression completes the proof. □
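As a numerical sanity check on this closed form, the pmf should sum to one over $y = 0, 1, 2, \dots$. The sketch below again treats the coefficients $a_{k,\theta}$ as user-supplied constants (an illustrative assumption), and the truncation point 120 is arbitrary.

```python
import math

def poisson_nped_pmf(y, theta, coeffs):
    """Closed-form Poisson-NPED pmf; coeffs[k] stands in for a_{k,theta},
    held constant in theta purely for illustration."""
    norm = sum(a * math.factorial(k) / theta**(k + 1) for k, a in enumerate(coeffs))
    s = sum(a * math.gamma(y + k + 1) / (theta + 1)**(y + k + 1)
            for k, a in enumerate(coeffs))
    return s / (norm * math.factorial(y))

# The pmf must sum to one over y = 0, 1, 2, ...
total = sum(poisson_nped_pmf(y, 2.0, [1.0, 1.0]) for y in range(120))
print(total)  # ≈ 1.0
```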

3.2. Regression Structure

To relate the heterogeneity parameter $\theta_i$ to covariates $\mathbf{x}_i = (x_{i1}, \dots, x_{ip})^{\top}$, we specify the regression model
$$ g(\theta_i) = \eta_i := \mathbf{x}_i^{\top}\boldsymbol{\beta}, \qquad \theta_i = g^{-1}(\eta_i), \tag{19} $$
where g is a link function. As in the NPED-GLM, the log link θ i = exp ( η i ) is a convenient and natural choice. The resulting model provides a flexible description of overdispersion through covariate-dependent heterogeneity.

3.3. Moments and Overdispersion

The unconditional mean and variance of Y i follow from standard mixed-Poisson arguments:
$$ \mathbb{E}(Y_i) = \mathbb{E}(\Lambda_i), $$
$$ \mathrm{Var}(Y_i) = \mathbb{E}(\Lambda_i) + \mathrm{Var}(\Lambda_i). $$
Since Var ( Λ i ) > 0 for any non-degenerate NPED distribution, it follows immediately that Var ( Y i ) > E ( Y i ) , that is, the Poisson–NPED model always exhibits overdispersion. More formal results follow.
Theorem 4
(Strict overdispersion). Under the Poisson–NPED mixture (15),
$$ \mathrm{Var}(Y_i) > \mathbb{E}(Y_i) \qquad \text{for all } \theta_i > 0. $$
Proof. 
Using the law of total variance,
$$ \mathrm{Var}(Y_i) = \mathbb{E}\big[\mathrm{Var}(Y_i \mid \Lambda_i)\big] + \mathrm{Var}\big[\mathbb{E}(Y_i \mid \Lambda_i)\big]. $$
Since $\mathrm{Var}(Y_i \mid \Lambda_i) = \Lambda_i$ and $\mathbb{E}(Y_i \mid \Lambda_i) = \Lambda_i$, we have
$$ \mathrm{Var}(Y_i) = \mathbb{E}(\Lambda_i) + \mathrm{Var}(\Lambda_i). $$
The NPED distribution is non-degenerate for all θ i > 0 , implying Var ( Λ i ) > 0 , and the result follows. □
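Theorem 4 can be checked numerically from the closed-form pmf. The sketch below again assumes constant illustrative coefficients; with `coeffs = [1, 1]` the mixture reduces to the Poisson–Lindley case, whose mean equals the Lindley mean $(\theta+2)/\big(\theta(\theta+1)\big)$, giving an independent check.

```python
import math

def pmf(y, theta, coeffs):
    """Poisson-NPED pmf with constant coefficients a_k (illustrative)."""
    norm = sum(a * math.factorial(k) / theta**(k + 1) for k, a in enumerate(coeffs))
    s = sum(a * math.gamma(y + k + 1) / (theta + 1)**(y + k + 1)
            for k, a in enumerate(coeffs))
    return s / (norm * math.factorial(y))

theta, coeffs = 1.5, [1.0, 1.0]   # Poisson-Lindley special case
ys = range(120)                   # truncation point chosen so tail mass is negligible
mean = sum(y * pmf(y, theta, coeffs) for y in ys)
var = sum(y**2 * pmf(y, theta, coeffs) for y in ys) - mean**2
print(mean, var)  # the variance strictly exceeds the mean (overdispersion)
```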

3.4. Log-Likelihood

Let $Y_1, \dots, Y_n$ be independent observations. From (18), the log-likelihood for $\boldsymbol{\beta}$ is
$$ \ell(\boldsymbol{\beta}) = \sum_{i=1}^{n} \log\left[\frac{1}{C(\theta_i)\, Y_i!} \sum_{k=0}^{m} a_{k,\theta_i}\, \frac{\Gamma(Y_i+k+1)}{(\theta_i+1)^{Y_i+k+1}}\right], $$
where $\theta_i = g^{-1}(\mathbf{x}_i^{\top}\boldsymbol{\beta})$.
Define
$$ M_i(\theta_i; Y_i) := \sum_{k=0}^{m} a_{k,\theta_i}\, \frac{\Gamma(Y_i+k+1)}{(\theta_i+1)^{Y_i+k+1}}, $$
so that the log-likelihood can be written compactly as
$$ \ell(\boldsymbol{\beta}) = \sum_{i=1}^{n} \Big[\log M_i(\theta_i; Y_i) - \log C(\theta_i) - \log(Y_i!)\Big]. $$

3.5. Score Function

Using the chain rule, the score function is
$$ U(\boldsymbol{\beta}) = \sum_{i=1}^{n} \left[\frac{\partial}{\partial\theta_i}\log M_i(\theta_i; Y_i) - \frac{\partial}{\partial\theta_i}\log C(\theta_i)\right] \theta_i'(\eta_i)\,\mathbf{x}_i, \tag{23} $$
where $\theta_i'(\eta_i) = d\theta_i/d\eta_i$. For the log link, $\theta_i'(\eta_i) = \theta_i$.
We compute the derivatives explicitly. First,
$$ \frac{\partial}{\partial\theta}\log M_i(\theta; Y_i) = \frac{M_i'(\theta; Y_i)}{M_i(\theta; Y_i)}, $$
where
$$ M_i'(\theta; Y_i) = \sum_{k=0}^{m} \left[a'_{k,\theta}\, \frac{\Gamma(Y_i+k+1)}{(\theta+1)^{Y_i+k+1}} - a_{k,\theta}\, \frac{(Y_i+k+1)\,\Gamma(Y_i+k+1)}{(\theta+1)^{Y_i+k+2}}\right]. $$
Similarly,
$$ \frac{\partial}{\partial\theta}\log C(\theta) = \frac{C'(\theta)}{C(\theta)}, $$
with $C'(\theta)$ from (11). Substituting these derivatives into (23) yields the score vector
$$ U(\boldsymbol{\beta}) = \sum_{i=1}^{n} \left[\frac{M_i'(\theta_i; Y_i)}{M_i(\theta_i; Y_i)} - \frac{C'(\theta_i)}{C(\theta_i)}\right] \theta_i'(\eta_i)\,\mathbf{x}_i. $$

3.6. Information Matrix and Asymptotic Theory

The Hessian matrix is obtained by differentiating (23). This yields
$$ H(\boldsymbol{\beta}) = \sum_{i=1}^{n} \left[\frac{M_i''}{M_i} - \Big(\frac{M_i'}{M_i}\Big)^{2} - \frac{C''}{C} + \Big(\frac{C'}{C}\Big)^{2}\right] \theta_i'(\eta_i)^{2}\,\mathbf{x}_i\mathbf{x}_i^{\top} + \sum_{i=1}^{n} \left[\frac{M_i'}{M_i} - \frac{C'}{C}\right] \theta_i''(\eta_i)\,\mathbf{x}_i\mathbf{x}_i^{\top}. $$
For the log link, $\theta_i'(\eta_i) = \theta_i''(\eta_i) = \theta_i$.
Let
I ( β ) = E H ( β )
denote the Fisher information matrix. The usual regularity conditions (identifiability, smoothness, finite information, non-explosive covariates) then imply.
Theorem 5
(asymptotic normality). Let β ^ be the MLE for the Poisson–NPED model. Then, under standard regularity assumptions,
$$ \sqrt{n}\big(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_0\big) \xrightarrow{d} N\big(\mathbf{0},\, I(\boldsymbol{\beta}_0)^{-1}\big), \qquad n \to \infty. $$
Proof. 
The argument parallels that of Theorem 2 for the NPED-GLM but uses the pmf (18). Unbiasedness of the score follows from differentiating the identity y = 0 Pr ( Y i = y ) = 1 . A Taylor expansion of the score around β 0 , combined with a central limit theorem for independent summands and Slutsky’s theorem, gives the result. □

3.7. Special and Limiting Cases of the Poisson–NPED

The Poisson–NPED framework encompasses a variety of well-known mixed-Poisson models as special or limiting cases. Table 1 summarizes these connections, providing readers with a quick reference for understanding where the proposed model fits within the broader literature.

3.8. A New One-Parameter Particular Case of the NPED Family

In addition to well-known submodels such as the Lindley, XLindley, XGamma and Zeghdoudi distributions, the NPED framework also accommodates new one-parameter particular cases that, to the best of our knowledge, have not been studied previously. Here, we introduce a simple but flexible cubic polynomial specification.

3.8.1. Cubic NPED (CNPED)

Consider the choice
$$ P(\lambda, \theta) = 1 + \lambda + \theta\lambda^{3}, \qquad \theta > 0, $$
which preserves the single-parameter structure of the NPED family while introducing both linear and cubic terms in the mixing kernel. Substituting this into the generic NPED density yields the new Cubic NPED model:
$$ f_{\mathrm{CNPED}}(\lambda;\theta) = \frac{\big(1 + \lambda + \theta\lambda^{3}\big)\, e^{-\theta\lambda}}{C(\theta)}, \qquad C(\theta) = \frac{1}{\theta} + \frac{1}{\theta^{2}} + \frac{6}{\theta^{3}}, \qquad \lambda > 0. $$
This special case is distinct from the existing Lindley-type and XGamma-type constructions, since its polynomial component simultaneously involves constant, linear and cubic terms. It therefore offers an alternative one-parameter mixing mechanism with enhanced tail flexibility while remaining analytically tractable.
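A minimal numerical sketch of the CNPED density, with a midpoint-rule check that it integrates to one; the step size and upper cutoff below are arbitrary illustrative choices.

```python
import math

def cnped_pdf(lam, theta):
    """CNPED density f(lam; theta) = (1 + lam + theta*lam^3) e^{-theta*lam} / C(theta)."""
    C = 1/theta + 1/theta**2 + 6/theta**3
    return (1 + lam + theta * lam**3) * math.exp(-theta * lam) / C

# Midpoint-rule check that the density integrates to one.
theta, h, upper = 2.0, 1e-3, 40.0
total = sum(cnped_pdf((j + 0.5) * h, theta) * h for j in range(int(upper / h)))
print(total)  # ≈ 1.0
```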

3.8.2. Poisson–CNPED Mixture

By mixing the Poisson distribution with the CNPED, we obtain a new Poisson–CNPED count model. If Y Λ Poisson ( Λ ) and Λ CNPED ( θ ) , then
$$ \Pr(Y = y) = \frac{1}{C(\theta)\, y!} \left[\frac{\Gamma(y+1)}{(\theta+1)^{y+1}} + \frac{\Gamma(y+2)}{(\theta+1)^{y+2}} + \frac{\theta\,\Gamma(y+4)}{(\theta+1)^{y+4}}\right], \qquad y = 0, 1, 2, \dots $$
This Poisson–CNPED specification inherits the overdispersion and tail flexibility of the NPED mixing distribution and provides a novel alternative to classical Poisson–Gamma and Poisson–Lindley mixtures for modeling heterogeneous count data.

4. Statistical Inference for NPED-Based Models

In this section, we develop a rigorous likelihood-based inferential framework for the parameter θ of the Poisson–CNPED mixture introduced in Section 3. Although Section 2 and Section 3 focus on regression extensions of the NPED family, the present section is intentionally devoted to a baseline mixed-Poisson model without covariates. The objective of this section is twofold. First, it establishes the theoretical properties of the maximum likelihood estimator (MLE), including consistency, asymptotic normality and the Fisher information, within an analytically tractable one-parameter setting. Second, it provides an underlying inferential standard for the more general NPED-based regression models, since the likelihood structure, score functions and asymptotic arguments employed here directly underpin the regression frameworks developed earlier. This organization ensures that all subsequent regression-based inference is grounded in a well-understood and theoretically validated special case of the Poisson–NPED family.

4.1. Model Specification and Notation

Let Y 1 , , Y n be independent count observations. We assume that each Y i follows the Poisson–CNPED mixture model:
$$ Y_i \mid \Lambda_i \sim \mathrm{Poisson}(\Lambda_i), \qquad \Lambda_i \sim \mathrm{CNPED}(\theta), $$
where $\theta > 0$ is an unknown parameter. The CNPED density, introduced in Section 3, is given by
$$ f_{\mathrm{CNPED}}(\lambda;\theta) = \frac{\big(1 + \lambda + \theta\lambda^{3}\big)\exp(-\theta\lambda)}{C(\theta)}, \qquad \lambda > 0, $$
with normalizing constant
$$ C(\theta) = \int_0^{\infty} \big(1 + \lambda + \theta\lambda^{3}\big)\exp(-\theta\lambda)\, d\lambda = \frac{1}{\theta} + \frac{1}{\theta^{2}} + \frac{6}{\theta^{3}}. \tag{27} $$
Proof 
(Derivation of $C(\theta)$). Using the standard integral $\int_0^{\infty} \lambda^{k} e^{-\theta\lambda}\, d\lambda = k!/\theta^{k+1}$, we compute each term separately:
$$ \int_0^{\infty} e^{-\theta\lambda}\, d\lambda = \frac{0!}{\theta} = \frac{1}{\theta}, \qquad \int_0^{\infty} \lambda\, e^{-\theta\lambda}\, d\lambda = \frac{1!}{\theta^{2}} = \frac{1}{\theta^{2}}, \qquad \int_0^{\infty} \theta\lambda^{3}\, e^{-\theta\lambda}\, d\lambda = \theta\cdot\frac{3!}{\theta^{4}} = \frac{6}{\theta^{3}}. $$
Summing these contributions yields (27). □
By Theorem 3, the marginal probability mass function (pmf) of Y i is obtained by integrating out Λ i :
$$ p(y;\theta) := \Pr(Y_i = y) = \frac{1}{C(\theta)\, y!} \left[\frac{\Gamma(y+1)}{(\theta+1)^{y+1}} + \frac{\Gamma(y+2)}{(\theta+1)^{y+2}} + \frac{\theta\,\Gamma(y+4)}{(\theta+1)^{y+4}}\right], \tag{28} $$
for $y = 0, 1, 2, \dots$
Proof 
(Derivation of the pmf). Starting from the mixture representation, we have
$$ \Pr(Y_i = y) = \int_0^{\infty} \frac{\lambda^{y} e^{-\lambda}}{y!} \cdot \frac{\big(1 + \lambda + \theta\lambda^{3}\big)\, e^{-\theta\lambda}}{C(\theta)}\, d\lambda = \frac{1}{C(\theta)\, y!} \int_0^{\infty} \lambda^{y}\big(1 + \lambda + \theta\lambda^{3}\big)\, e^{-(\theta+1)\lambda}\, d\lambda. $$
Expanding the integrand and applying $\int_0^{\infty} \lambda^{k} e^{-\alpha\lambda}\, d\lambda = \Gamma(k+1)/\alpha^{k+1}$ with $\alpha = \theta + 1$,
$$ \int_0^{\infty} \lambda^{y} e^{-(\theta+1)\lambda}\, d\lambda = \frac{\Gamma(y+1)}{(\theta+1)^{y+1}}, \qquad \int_0^{\infty} \lambda^{y+1} e^{-(\theta+1)\lambda}\, d\lambda = \frac{\Gamma(y+2)}{(\theta+1)^{y+2}}, \qquad \int_0^{\infty} \theta\lambda^{y+3} e^{-(\theta+1)\lambda}\, d\lambda = \frac{\theta\,\Gamma(y+4)}{(\theta+1)^{y+4}}. $$
Substituting into the expression above yields (28). □
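This derivation can be cross-checked by comparing the closed form against direct numerical integration of the mixture; the step size and integration cutoff below are illustrative choices.

```python
import math

def cnped_mix_pmf(y, theta):
    """Poisson-CNPED pmf via the closed form."""
    C = 1/theta + 1/theta**2 + 6/theta**3
    g = math.gamma
    return (g(y+1)/(theta+1)**(y+1) + g(y+2)/(theta+1)**(y+2)
            + theta*g(y+4)/(theta+1)**(y+4)) / (C * math.factorial(y))

def cnped_mix_pmf_numeric(y, theta, h=1e-3, upper=60.0):
    """Same pmf by integrating Poisson(lam) against the CNPED density."""
    C = 1/theta + 1/theta**2 + 6/theta**3
    total = 0.0
    for j in range(int(upper / h)):
        lam = (j + 0.5) * h
        total += (lam**y * math.exp(-lam) / math.factorial(y)
                  * (1 + lam + theta * lam**3) * math.exp(-theta * lam) / C) * h
    return total

print(cnped_mix_pmf(3, 1.0), cnped_mix_pmf_numeric(3, 1.0))  # the two agree
```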

4.2. Log-Likelihood Function

For an observed sample $\mathbf{y} = (y_1, \dots, y_n)$, the log-likelihood function for $\theta$ is
$$ \ell_n(\theta) = \sum_{i=1}^{n} \log p(y_i;\theta) = \sum_{i=1}^{n} \Big[\log M(y_i;\theta) - \log C(\theta) - \log(y_i!)\Big], $$
where we define the auxiliary function
$$ M(y;\theta) := \frac{\Gamma(y+1)}{(\theta+1)^{y+1}} + \frac{\Gamma(y+2)}{(\theta+1)^{y+2}} + \frac{\theta\,\Gamma(y+4)}{(\theta+1)^{y+4}}. \tag{30} $$
Since $\log(y_i!)$ does not depend on $\theta$, maximizing $\ell_n(\theta)$ is equivalent to maximizing
$$ \tilde{\ell}_n(\theta) = \sum_{i=1}^{n} \Big[\log M(y_i;\theta) - \log C(\theta)\Big]. $$
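This simplified objective can be maximized with a one-dimensional search. The sketch below uses a golden-section search (which assumes the log-likelihood is unimodal, consistent with the uniqueness discussion later in the paper); the sample `ys`, bracket, and iteration count are purely illustrative.

```python
import math

def profile_loglik(theta, ys):
    """~l_n(theta) for the Poisson-CNPED model (theta-free log(y_i!) terms dropped)."""
    C = 1/theta + 1/theta**2 + 6/theta**3
    g = math.gamma
    ll = 0.0
    for y in ys:
        M = (g(y+1)/(theta+1)**(y+1) + g(y+2)/(theta+1)**(y+2)
             + theta*g(y+4)/(theta+1)**(y+4))
        ll += math.log(M) - math.log(C)
    return ll

def mle(ys, lo=1e-3, hi=50.0, iters=100):
    """Golden-section search for the maximizer (assumes a unimodal objective)."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if profile_loglik(c, ys) > profile_loglik(d, ys):
            b = d
        else:
            a = c
    return (a + b) / 2

ys = [0, 1, 1, 2, 0, 3, 1, 0, 2, 1]   # toy sample, purely illustrative
print(mle(ys))  # theta-hat for this sample
```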

4.3. Score Function

The score function is defined as the derivative of the log-likelihood with respect to θ :
$$ U_n(\theta) := \frac{\partial \ell_n(\theta)}{\partial\theta} = \sum_{i=1}^{n} u(y_i;\theta), $$
where the individual score contribution is
$$ u(y;\theta) := \frac{\partial}{\partial\theta}\log p(y;\theta) = \frac{M'(y;\theta)}{M(y;\theta)} - \frac{C'(\theta)}{C(\theta)}. \tag{33} $$
We now compute M ( y ; θ ) and C ( θ ) explicitly.
Lemma 1
(derivative of $M(y;\theta)$). For $y \in \{0, 1, 2, \dots\}$ and $\theta > 0$,
$$ M'(y;\theta) = -\frac{(y+1)\,\Gamma(y+1)}{(\theta+1)^{y+2}} - \frac{(y+2)\,\Gamma(y+2)}{(\theta+1)^{y+3}} + \frac{\Gamma(y+4)}{(\theta+1)^{y+4}} - \frac{\theta\,(y+4)\,\Gamma(y+4)}{(\theta+1)^{y+5}}. \tag{34} $$
Proof. 
We differentiate each term of $M(y;\theta)$ in (30) with respect to $\theta$.
First term: $\dfrac{\partial}{\partial\theta}\dfrac{\Gamma(y+1)}{(\theta+1)^{y+1}} = -\dfrac{(y+1)\,\Gamma(y+1)}{(\theta+1)^{y+2}}$.
Second term: $\dfrac{\partial}{\partial\theta}\dfrac{\Gamma(y+2)}{(\theta+1)^{y+2}} = -\dfrac{(y+2)\,\Gamma(y+2)}{(\theta+1)^{y+3}}$.
Third term: using the product rule,
$$ \frac{\partial}{\partial\theta}\left[\frac{\theta\,\Gamma(y+4)}{(\theta+1)^{y+4}}\right] = \frac{\Gamma(y+4)}{(\theta+1)^{y+4}} - \frac{\theta\,(y+4)\,\Gamma(y+4)}{(\theta+1)^{y+5}}. $$
Combining all terms yields (34). □
Lemma 2
(derivative of C ( θ ) ). For θ > 0 ,
$$ C'(\theta) = -\frac{1}{\theta^{2}} - \frac{2}{\theta^{3}} - \frac{18}{\theta^{4}}. \tag{35} $$
Proof. 
Differentiating (27) term by term,
$$ \frac{d}{d\theta}\frac{1}{\theta} = -\frac{1}{\theta^{2}}, \qquad \frac{d}{d\theta}\frac{1}{\theta^{2}} = -\frac{2}{\theta^{3}}, \qquad \frac{d}{d\theta}\frac{6}{\theta^{3}} = -\frac{18}{\theta^{4}}. $$
Summing these gives (35). □
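Both closed-form derivatives in Lemmas 1 and 2 invite a quick finite-difference check; the evaluation point, count value, and step size below are arbitrary illustrative choices.

```python
import math

def C(theta):
    return 1/theta + 1/theta**2 + 6/theta**3

def C_prime(theta):
    # Lemma-2 closed form
    return -1/theta**2 - 2/theta**3 - 18/theta**4

def M(y, theta):
    g = math.gamma
    return (g(y+1)/(theta+1)**(y+1) + g(y+2)/(theta+1)**(y+2)
            + theta*g(y+4)/(theta+1)**(y+4))

def M_prime(y, theta):
    # Lemma-1 closed form
    g = math.gamma
    return (-(y+1)*g(y+1)/(theta+1)**(y+2) - (y+2)*g(y+2)/(theta+1)**(y+3)
            + g(y+4)/(theta+1)**(y+4) - theta*(y+4)*g(y+4)/(theta+1)**(y+5))

h, t, y = 1e-6, 1.7, 3
err_C = abs((C(t+h) - C(t-h)) / (2*h) - C_prime(t))
err_M = abs((M(y, t+h) - M(y, t-h)) / (2*h) - M_prime(y, t))
print(err_C, err_M)  # both ≈ 0
```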
Substituting (34) and (35) into (33) yields the explicit form of the individual score function.

4.4. Fisher Information

The Fisher information per observation is defined as
$$ I(\theta) := \mathrm{Var}_{\theta}\big[u(Y;\theta)\big] = \mathbb{E}_{\theta}\big[u(Y;\theta)^{2}\big], $$
where the second equality holds because $\mathbb{E}_{\theta}[u(Y;\theta)] = 0$ under regularity conditions (see Proposition 1 below).
Proposition 1
(zero mean of the score). Assume that $p(y;\theta) > 0$ for all $y \in \{0, 1, 2, \dots\}$ and $\theta > 0$, and that differentiation under the summation sign is valid. Then,
$$ \mathbb{E}_{\theta}\big[u(Y;\theta)\big] = 0. $$
Proof. 
We have
$$ \mathbb{E}_{\theta}\big[u(Y;\theta)\big] = \sum_{y=0}^{\infty} \frac{\partial}{\partial\theta}\log p(y;\theta)\cdot p(y;\theta) = \sum_{y=0}^{\infty} \frac{1}{p(y;\theta)}\,\frac{\partial p(y;\theta)}{\partial\theta}\cdot p(y;\theta) = \sum_{y=0}^{\infty} \frac{\partial p(y;\theta)}{\partial\theta}. $$
Since $\sum_{y=0}^{\infty} p(y;\theta) = 1$ for all $\theta > 0$, differentiating both sides with respect to $\theta$ yields
$$ \frac{\partial}{\partial\theta}\sum_{y=0}^{\infty} p(y;\theta) = \sum_{y=0}^{\infty} \frac{\partial p(y;\theta)}{\partial\theta} = 0, $$
where the interchange of differentiation and summation is justified by the assumed regularity. □
In practice, I ( θ ) is computed numerically as
$$ I(\theta) \approx \sum_{y=0}^{y_{\max}} u(y;\theta)^{2}\, p(y;\theta), $$
where $y_{\max}$ is chosen large enough that $\sum_{y > y_{\max}} p(y;\theta)$ is negligible.
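A direct implementation of this truncated sum is sketched below. To keep the sketch short, the score $u(y;\theta)$ is obtained by central differencing of $\log p(y;\theta)$ rather than by retyping the closed-form derivatives; the choices $\theta = 1.5$ and $y_{\max} = 60$ are illustrative.

```python
import math

def pmf(y, theta):
    """Poisson-CNPED pmf p(y; theta) from the closed form."""
    C = 1/theta + 1/theta**2 + 6/theta**3
    g = math.gamma
    M = (g(y+1)/(theta+1)**(y+1) + g(y+2)/(theta+1)**(y+2)
         + theta*g(y+4)/(theta+1)**(y+4))
    return M / (C * math.factorial(y))

def score(y, theta, h=1e-6):
    # u(y; theta) by central differencing of log p(y; theta)
    return (math.log(pmf(y, theta + h)) - math.log(pmf(y, theta - h))) / (2*h)

def fisher_info(theta, y_max=60):
    # Truncated sum; y_max chosen so the neglected tail mass is negligible
    return sum(score(y, theta)**2 * pmf(y, theta) for y in range(y_max + 1))

theta = 1.5
print(fisher_info(theta))                                   # positive, finite
print(sum(score(y, theta) * pmf(y, theta) for y in range(61)))  # ≈ 0 (zero-mean score)
```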

4.5. Existence and Uniqueness of the MLE

The maximum likelihood estimator (MLE) of θ is defined as
$$ \hat{\theta}_n := \arg\max_{\theta > 0}\ \ell_n(\theta). $$
Theorem 6
(existence of the MLE). For any sample $\mathbf{y} = (y_1, \dots, y_n)$ with at least one $y_i > 0$, there exists a value $\hat{\theta}_n \in (0, \infty)$ that maximizes $\ell_n(\theta)$.
Proof. 
We establish existence by analyzing the behavior of n ( θ ) at the boundaries and applying the extreme value theorem.
Step 1: Behavior as $\theta \to 0^{+}$. From (27), $C(\theta) \to \infty$ as $\theta \to 0^{+}$ because $C(\theta) \sim 6/\theta^{3}$. Since
$$ \ell_n(\theta) = \sum_{i=1}^{n} \log M(y_i;\theta) - n\log C(\theta) + \text{const}, $$
and $\log M(y_i;\theta)$ remains bounded while $\log C(\theta) \to \infty$, we have $\ell_n(\theta) \to -\infty$ as $\theta \to 0^{+}$.
Step 2: Behavior as $\theta \to \infty$. As $\theta \to \infty$, $(\theta+1)^{y+k} \to \infty$ for $k \ge 1$, so $M(y;\theta) \to 0$ for each $y \ge 0$. Hence, $\log M(y_i;\theta) \to -\infty$, implying $\ell_n(\theta) \to -\infty$.
Step 3: Continuity and compactness. The function $\ell_n(\theta)$ is continuous on $(0,\infty)$. Since $\ell_n(\theta) \to -\infty$ at both boundaries, for any finite value $\ell_n(\theta_0)$, the set $\{\theta : \ell_n(\theta) \ge \ell_n(\theta_0)\}$ is a compact subset of $(0,\infty)$. By the Weierstrass theorem, $\ell_n$ attains its maximum on this compact set. □
Uniqueness of the MLE is guaranteed if $\ell_{n}(\theta)$ is strictly concave. This can be verified numerically for specific samples, and holds generically when $n$ is sufficiently large.
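The existence argument suggests a simple computational strategy: the log-likelihood tends to $-\infty$ at both ends of $(0,\infty)$, so a bounded search finds an interior maximizer. The sketch below (ours) illustrates this on data simulated from a Poisson–Lindley mixture, used as a stand-in for the Poisson–NPED likelihood; the paper itself uses BFGS, and the grid search here is only for illustration.

```python
import math
import random

def pl_pmf(y, theta):
    """Poisson-Lindley pmf: a closed-form mixed-Poisson stand-in (Table 1)."""
    return theta**2 * (y + theta + 2) / (theta + 1) ** (y + 3)

def loglik(theta, data):
    return sum(math.log(pl_pmf(y, theta)) for y in data)

def r_lindley(theta, rng):
    """Lindley(theta) draw via its two-component gamma mixture."""
    if rng.random() < theta / (theta + 1):
        return rng.expovariate(theta)                        # Gamma(1, theta)
    return rng.expovariate(theta) + rng.expovariate(theta)  # Gamma(2, theta)

def r_poisson(lam, rng):
    """Knuth's inversion sampler; adequate for the small means used here."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(42)
theta0, n = 1.0, 2000
data = [r_poisson(r_lindley(theta0, rng), rng) for _ in range(n)]

# Interior maximizer over a bounded grid (Theorem 6 guarantees one exists).
grid = [0.02 * k for k in range(1, 501)]
theta_hat = max(grid, key=lambda t: loglik(t, data))
print("theta_hat =", theta_hat)
```

With $n=2000$ the grid maximizer lands close to the true value $\theta_0 = 1$, consistent with the consistency result that follows.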

4.6. Consistency of the MLE

Let $\theta_{0}$ denote the true parameter value. We establish that $\hat{\theta}_{n}\xrightarrow{p}\theta_{0}$ as $n\to\infty$.
Theorem 7
(consistency). Assume that
(C1)
the true parameter $\theta_{0}$ lies in the interior of $(0,\infty)$;
(C2)
the model is identifiable: $p(y;\theta)=p(y;\theta')$ for all $y$ implies $\theta=\theta'$;
(C3)
$E_{\theta_{0}}\big[\,|\log p(Y;\theta)|\,\big]<\infty$ for all $\theta$ in a neighborhood of $\theta_{0}$.
Then $\hat{\theta}_{n}\xrightarrow{p}\theta_{0}$ as $n\to\infty$.
Proof. 
Define the average log-likelihood
$$\bar{\ell}_{n}(\theta):=\frac{1}{n}\ell_{n}(\theta)=\frac{1}{n}\sum_{i=1}^{n}\log p(Y_{i};\theta).$$
By the strong law of large numbers, for each fixed $\theta$,
$$\bar{\ell}_{n}(\theta)\xrightarrow{a.s.}L(\theta):=E_{\theta_{0}}\big[\log p(Y;\theta)\big].$$
The expected log-likelihood $L(\theta)$ is uniquely maximized at $\theta=\theta_{0}$. To see this, note that
$$L(\theta_{0})-L(\theta)=E_{\theta_{0}}\!\left[\log\frac{p(Y;\theta_{0})}{p(Y;\theta)}\right]=D_{\mathrm{KL}}\big(P_{\theta_{0}}\,\|\,P_{\theta}\big),$$
where $D_{\mathrm{KL}}$ denotes the Kullback–Leibler divergence. Since $D_{\mathrm{KL}}(P_{\theta_{0}}\,\|\,P_{\theta})\ge 0$, with equality if and only if $\theta=\theta_{0}$ (by identifiability), the maximum of $L(\theta)$ is attained uniquely at $\theta_{0}$.
Under condition (C3), standard arguments show that the convergence $\bar{\ell}_{n}(\theta)\to L(\theta)$ is uniform on compact subsets of $(0,\infty)$. Combined with the boundary behavior established in Theorem 6, the argmax continuous mapping theorem implies $\hat{\theta}_{n}\xrightarrow{p}\theta_{0}$. □

4.7. Asymptotic Normality

We now derive the asymptotic distribution of the MLE.
Theorem 8
(asymptotic normality). Under conditions (C1)–(C3) and the additional assumptions
(C4)
the log-likelihood is twice continuously differentiable in a neighborhood of $\theta_{0}$;
(C5)
the Fisher information satisfies $I(\theta_{0})\in(0,\infty)$;
the MLE satisfies
$$\sqrt{n}\,\big(\hat{\theta}_{n}-\theta_{0}\big)\xrightarrow{d}N\big(0,\,I(\theta_{0})^{-1}\big)\quad\text{as }n\to\infty.$$
Proof. 
We proceed in three steps.
Step 1: Taylor expansion of the score. Since $\hat{\theta}_{n}$ satisfies the score equation $U_{n}(\hat{\theta}_{n})=0$, a first-order Taylor expansion around $\theta_{0}$ gives
$$0=U_{n}(\hat{\theta}_{n})=U_{n}(\theta_{0})+U_{n}'(\tilde{\theta}_{n})\big(\hat{\theta}_{n}-\theta_{0}\big),$$
where $\tilde{\theta}_{n}$ lies between $\hat{\theta}_{n}$ and $\theta_{0}$, and $U_{n}'(\theta)=\partial U_{n}(\theta)/\partial\theta=\ell_{n}''(\theta)$ is the second derivative of the log-likelihood.
When rearranged,
$$\hat{\theta}_{n}-\theta_{0}=-\frac{U_{n}(\theta_{0})}{U_{n}'(\tilde{\theta}_{n})}=-\frac{U_{n}(\theta_{0})}{\ell_{n}''(\tilde{\theta}_{n})}.$$
Step 2: Central limit theorem for the score. The total score is a sum of i.i.d. random variables:
$$U_{n}(\theta_{0})=\sum_{i=1}^{n}u(Y_{i};\theta_{0}).$$
By Proposition 1, $E_{\theta_{0}}[u(Y;\theta_{0})]=0$, and by definition $\operatorname{Var}_{\theta_{0}}[u(Y;\theta_{0})]=I(\theta_{0})$. The central limit theorem yields
$$\frac{1}{\sqrt{n}}U_{n}(\theta_{0})=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}u(Y_{i};\theta_{0})\xrightarrow{d}N\big(0,\,I(\theta_{0})\big).$$
Step 3: Convergence of the Hessian. Define the observed information per observation as
$$\hat{I}_{n}(\theta):=-\frac{1}{n}\ell_{n}''(\theta)=-\frac{1}{n}\sum_{i=1}^{n}\frac{\partial^{2}}{\partial\theta^{2}}\log p(Y_{i};\theta).$$
By the law of large numbers,
$$\hat{I}_{n}(\theta)\xrightarrow{p}-E_{\theta_{0}}\!\left[\frac{\partial^{2}}{\partial\theta^{2}}\log p(Y;\theta)\right]=I(\theta)$$
for each fixed $\theta$ (under regularity). Since $\tilde{\theta}_{n}\xrightarrow{p}\theta_{0}$ by consistency, continuity of $I(\theta)$ implies $\hat{I}_{n}(\tilde{\theta}_{n})\xrightarrow{p}I(\theta_{0})$.
Step 4: Combining the results. From (40),
$$\sqrt{n}\,\big(\hat{\theta}_{n}-\theta_{0}\big)=\frac{1}{\hat{I}_{n}(\tilde{\theta}_{n})}\cdot\frac{1}{\sqrt{n}}U_{n}(\theta_{0}).$$
By Steps 2 and 3 and Slutsky's theorem,
$$\sqrt{n}\,\big(\hat{\theta}_{n}-\theta_{0}\big)\xrightarrow{d}\frac{1}{I(\theta_{0})}\cdot N\big(0,\,I(\theta_{0})\big)=N\big(0,\,I(\theta_{0})^{-1}\big).$$
This completes the proof. □

4.8. Practical Inference Procedures

We now describe practical methods for confidence interval construction and hypothesis testing.

4.8.1. Observed Information and Variance Estimation

The observed information at the MLE is
$$\hat{I}_{n}:=-\frac{1}{n}\ell_{n}''(\hat{\theta}_{n}).$$
By the results above, $\hat{I}_{n}\xrightarrow{p}I(\theta_{0})$, so the asymptotic variance of $\hat{\theta}_{n}$ is estimated by $(n\hat{I}_{n})^{-1}$.

4.8.2. Wald Confidence Interval

A $(1-\alpha)$ Wald confidence interval for $\theta$ is
$$\mathrm{CI}_{\mathrm{Wald}}=\left[\hat{\theta}_{n}-z_{1-\alpha/2}\sqrt{\frac{1}{n\hat{I}_{n}}},\ \ \hat{\theta}_{n}+z_{1-\alpha/2}\sqrt{\frac{1}{n\hat{I}_{n}}}\right],$$
where $z_{1-\alpha/2}$ denotes the $(1-\alpha/2)$ quantile of the standard normal distribution.
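A minimal sketch of the Wald construction follows, using the Poisson–Lindley pmf as a closed-form stand-in likelihood and hypothetical count data (not from the paper's datasets). The MLE is found by grid search (the paper uses BFGS), and the observed information is obtained by a second-order central difference of the log-likelihood:

```python
import math

def pl_pmf(y, theta):
    """Poisson-Lindley pmf, a closed-form stand-in likelihood (Table 1)."""
    return theta**2 * (y + theta + 2) / (theta + 1) ** (y + 3)

def loglik(theta, data):
    return sum(math.log(pl_pmf(y, theta)) for y in data)

# Hypothetical count data for illustration only.
data = [0] * 40 + [1] * 30 + [2] * 15 + [3] * 10 + [4] * 5
n = len(data)

# Grid-search MLE, then observed information per observation via a
# second-order central difference of the log-likelihood at theta_hat.
grid = [0.01 * k for k in range(10, 1001)]
theta_hat = max(grid, key=lambda t: loglik(t, data))
h = 1e-4
obs_info = -(loglik(theta_hat + h, data) - 2 * loglik(theta_hat, data)
             + loglik(theta_hat - h, data)) / (h**2 * n)

z = 1.959963984540054                      # 97.5% standard normal quantile
se = math.sqrt(1 / (n * obs_info))         # estimated SE = (n I_hat)^(-1/2)
ci_wald = (theta_hat - z * se, theta_hat + z * se)
print("theta_hat =", theta_hat, "95% Wald CI =", ci_wald)
```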

4.8.3. Likelihood-Ratio Confidence Interval

The likelihood-ratio (LR) confidence set is defined as
$$\mathrm{CI}_{\mathrm{LR}}=\left\{\theta>0:\ 2\big[\ell_{n}(\hat{\theta}_{n})-\ell_{n}(\theta)\big]\le\chi^{2}_{1,1-\alpha}\right\},$$
where $\chi^{2}_{1,1-\alpha}$ is the $(1-\alpha)$ quantile of the chi-squared distribution with one degree of freedom.
By Wilks' theorem, $2\big[\ell_{n}(\hat{\theta}_{n})-\ell_{n}(\theta_{0})\big]\xrightarrow{d}\chi^{2}_{1}$ under $H_{0}:\theta=\theta_{0}$, so this interval has asymptotically correct coverage. The LR interval adapts to the curvature of the log-likelihood and is often more accurate than the Wald interval in finite samples.
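In one dimension, the LR endpoints can be located by root-finding on the deviance drop. The sketch below (ours) uses bisection, with the Poisson–Lindley pmf as a stand-in likelihood and hypothetical count data:

```python
import math

def pl_pmf(y, theta):
    """Poisson-Lindley pmf, a closed-form stand-in likelihood (Table 1)."""
    return theta**2 * (y + theta + 2) / (theta + 1) ** (y + 3)

def loglik(theta, data):
    return sum(math.log(pl_pmf(y, theta)) for y in data)

# Hypothetical count data for illustration only.
data = [0] * 40 + [1] * 30 + [2] * 15 + [3] * 10 + [4] * 5

grid = [0.01 * k for k in range(10, 1001)]
theta_hat = max(grid, key=lambda t: loglik(t, data))
l_max = loglik(theta_hat, data)
chi2_95 = 3.841458820694124        # 95% quantile of chi-squared with 1 df

def drop(theta):
    """2 [l(theta_hat) - l(theta)] - chi2; negative inside the LR set."""
    return 2 * (l_max - loglik(theta, data)) - chi2_95

def bisect(lo, hi, tol=1e-6):
    """Bisection for a root of `drop`, assuming a sign change on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if drop(lo) * drop(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lower = bisect(0.05, theta_hat)   # drop > 0 far left, < 0 at theta_hat
upper = bisect(theta_hat, 20.0)   # drop < 0 at theta_hat, > 0 far right
print("95% LR interval:", (lower, upper))
```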

4.8.4. Bootstrap Confidence Interval

For finite samples, bootstrap methods provide an alternative that does not rely on asymptotic approximations. The percentile bootstrap proceeds as follows:
  • For $b=1,\dots,B$:
    (a)
    Draw a bootstrap sample $(Y_{1}^{*b},\dots,Y_{n}^{*b})$ by sampling with replacement from $(Y_{1},\dots,Y_{n})$.
    (b)
    Compute the bootstrap MLE $\hat{\theta}_{n}^{*b}$ by maximizing $\ell_{n}^{*}(\theta)$ based on the bootstrap sample.
  • Compute the $(\alpha/2)$ and $(1-\alpha/2)$ empirical quantiles of $\{\hat{\theta}_{n}^{*1},\dots,\hat{\theta}_{n}^{*B}\}$:
$$\mathrm{CI}_{\mathrm{Boot}}=\left[\hat{\theta}_{n}^{*(\alpha/2)},\ \hat{\theta}_{n}^{*(1-\alpha/2)}\right].$$
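The percentile bootstrap can be sketched as follows; as before, the Poisson–Lindley pmf serves as a stand-in likelihood, the data are hypothetical, and a coarse grid search substitutes for BFGS so the example stays fast:

```python
import math
import random

def pl_pmf(y, theta):
    """Poisson-Lindley pmf, a closed-form stand-in likelihood (Table 1)."""
    return theta**2 * (y + theta + 2) / (theta + 1) ** (y + 3)

def mle(data):
    """Grid-search MLE over a bounded range (illustrative substitute for BFGS)."""
    grid = [0.05 * k for k in range(1, 201)]
    return max(grid, key=lambda t: sum(math.log(pl_pmf(y, t)) for y in data))

rng = random.Random(7)
# Hypothetical count data for illustration only.
data = [0] * 40 + [1] * 30 + [2] * 15 + [3] * 10 + [4] * 5
n, B, alpha = len(data), 100, 0.05

# Refit the MLE on each of B resamples drawn with replacement.
boot = sorted(
    mle([data[rng.randrange(n)] for _ in range(n)])
    for _ in range(B)
)
lo = boot[int(math.floor(alpha / 2 * B))]
hi = boot[int(math.ceil((1 - alpha / 2) * B)) - 1]
print("percentile bootstrap CI:", (lo, hi))
```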

5. Simulation Study

We conduct a Monte Carlo simulation study to evaluate the finite-sample performance of the maximum likelihood estimator θ ^ n for the Poisson–CNPED model, and to compare the empirical coverage of Wald, likelihood-ratio, and bootstrap confidence intervals.

5.1. Simulation Design

We consider three values of the true parameter:
$$\theta_{0}\in\{0.5,\ 1.0,\ 2.0\},$$
representing small, moderate, and large values of the CNPED shape parameter. For each θ 0 , we examine four sample sizes:
$$n\in\{50,\ 100,\ 200,\ 500\}.$$
This yields a total of 3 × 4 = 12 simulation scenarios.
For each scenario, we generate $M=1000$ independent datasets. This number of replications ensures that the Monte Carlo standard error of the coverage estimates is at most $\sqrt{0.95\times 0.05/1000}\approx 0.007$, providing precise comparisons.
For each replication $m=1,\dots,M$:
  • Generate mixing variables. Draw $\Lambda_{1}^{(m)},\dots,\Lambda_{n}^{(m)}$ independently from $\mathrm{CNPED}(\theta_{0})$ using the finite gamma-mixture representation (see Section 2).
  • Generate count data. For each $i=1,\dots,n$, draw $Y_{i}^{(m)}\sim\mathrm{Poisson}(\Lambda_{i}^{(m)})$.
  • Compute the MLE. Maximize the log-likelihood (29) numerically using the BFGS algorithm to obtain $\hat{\theta}_{n}^{(m)}$.
  • Compute the observed information. Evaluate $\hat{I}_{n}^{(m)}$ at $\hat{\theta}_{n}^{(m)}$ using numerical differentiation.
  • Construct confidence intervals. Compute $\mathrm{CI}_{\mathrm{Wald}}^{(m)}$, $\mathrm{CI}_{\mathrm{LR}}^{(m)}$, and (for selected scenarios) $\mathrm{CI}_{\mathrm{Boot}}^{(m)}$ with $B=500$ bootstrap replications.
  • Record coverage indicators. For each interval type, record whether the interval contains $\theta_{0}$.
We evaluate the MLE using the following metrics:
  • Bias:
$$\mathrm{Bias}(\hat{\theta}_{n})=\frac{1}{M}\sum_{m=1}^{M}\big(\hat{\theta}_{n}^{(m)}-\theta_{0}\big).$$
  • Empirical standard deviation:
$$\mathrm{SD}(\hat{\theta}_{n})=\sqrt{\frac{1}{M-1}\sum_{m=1}^{M}\big(\hat{\theta}_{n}^{(m)}-\bar{\theta}\big)^{2}},\qquad \bar{\theta}=\frac{1}{M}\sum_{m=1}^{M}\hat{\theta}_{n}^{(m)}.$$
  • Root mean squared error:
$$\mathrm{RMSE}(\hat{\theta}_{n})=\sqrt{\frac{1}{M}\sum_{m=1}^{M}\big(\hat{\theta}_{n}^{(m)}-\theta_{0}\big)^{2}}.$$
  • Empirical coverage probability:
$$\mathrm{Coverage}=\frac{1}{M}\sum_{m=1}^{M}\mathbf{1}\big\{\theta_{0}\in\mathrm{CI}^{(m)}\big\},$$
where $\mathbf{1}\{\cdot\}$ denotes the indicator function.
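Given a vector of replicate estimates, these metrics are a few lines of code. The sketch below uses synthetic stand-in "estimates" (not output of the actual Poisson–CNPED simulation) purely to illustrate the formulas, and also checks the identity $\mathrm{RMSE}^{2}=\mathrm{Bias}^{2}+\frac{M-1}{M}\,\mathrm{SD}^{2}$:

```python
import math
import random

# Synthetic stand-in replicate estimates; a hypothetical small positive bias
# and SE are assumed for illustration only.
rng = random.Random(0)
theta0, M, se0 = 1.0, 1000, 0.07
est = [theta0 + rng.gauss(0.01, se0) for _ in range(M)]

bias = sum(e - theta0 for e in est) / M
mean = sum(est) / M
sd = math.sqrt(sum((e - mean) ** 2 for e in est) / (M - 1))
rmse = math.sqrt(sum((e - theta0) ** 2 for e in est) / M)

# Coverage of hypothetical Wald-type intervals est[m] +/- 1.96 * se0:
cover = sum(1 for e in est if abs(e - theta0) <= 1.96 * se0) / M
print(f"bias={bias:.4f} sd={sd:.4f} rmse={rmse:.4f} coverage={cover:.3f}")
```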

5.1.1. Bias, Standard Deviation, and RMSE

Table 2 presents the bias, empirical standard deviation, and RMSE of the MLE across all scenarios.
We present the simulation findings in Figure 1, Figure 2, Figure 3, Figure 4 and Figure 5, which display the empirical bias, standard deviation, root mean squared error, confidence interval coverage, and a combined summary of all performance metrics. Figure 1 demonstrates that the bias of the MLE decreases to zero as $n$ increases, confirming consistency. While larger values of $\theta_{0}$ exhibit greater bias in small samples, all curves converge rapidly, validating Theorem 7. The empirical standard deviation (Figure 2) decreases at the predicted $O(n^{-1/2})$ rate, with greater variability for larger $\theta_{0}$ values due to decreased Fisher information; this is consistent with the asymptotic normality established in Theorem 8. Since bias is negligible, the RMSE (Figure 3) is dominated by variance and follows the same $O(n^{-1/2})$ decay. For $\theta_{0}=2.0$, the RMSE decreases from 0.24 at $n=50$ to 0.07 at $n=500$. Figure 4 shows that both Wald and likelihood-ratio 95% confidence intervals achieve near-nominal coverage for $\theta_{0}=1.0$. The likelihood-ratio interval maintains coverage above 94% across all sample sizes, while the Wald interval shows slight undercoverage (93.7%) at $n=100$. Figure 5 summarizes all performance metrics, confirming that the MLE provides reliable inference at moderate sample sizes ($n\ge 200$).

5.1.2. Coverage Probabilities

Table 3 reports the empirical coverage probabilities of 95% confidence intervals.

5.2. Discussion of Simulation Results

The simulation results presented in Table 2 and Table 3 lead to the following conclusions.
-
The MLE exhibits small positive bias across all scenarios, ranging from 0.0007 to 0.0238 . The bias decreases systematically as the sample size increases, consistent with the asymptotic unbiasedness established in Theorem 8. For instance, when θ 0 = 1 , the bias decreases from 0.0109 at n = 50 to 0.0008 at n = 500 , representing a reduction by a factor of approximately 14.
-
The empirical standard deviation decreases at a rate consistent with $O(n^{-1/2})$, as predicted by the asymptotic theory. For $\theta_{0}=1$, the standard deviation decreases from 0.1030 at $n=50$ to 0.0311 at $n=500$. The ratio $0.1030/0.0311\approx 3.31$ is close to $\sqrt{500/50}=\sqrt{10}\approx 3.16$, confirming the theoretical convergence rate. The RMSE values are nearly identical to the standard deviations, indicating that variance, rather than bias, dominates the estimation error.
-
The standard deviation increases with θ 0 for fixed sample size. At n = 100 , the standard deviation increases from 0.0296 when θ 0 = 0.5 to 0.1606 when θ 0 = 2.0 . This pattern reflects the decrease in Fisher information per observation as θ increases, a feature inherent to the Poisson–CNPED model.
-
Both Wald and likelihood-ratio intervals achieve empirical coverage close to the nominal 95% level across all scenarios. The Wald interval coverage ranges from 93.7 % to 96.7 % , while the likelihood-ratio interval coverage ranges from 94.2 % to 96.3 % . The slight undercoverage observed for n = 100 and θ 0 = 1 (Wald: 93.7 % , LR: 94.2 % ) is within the expected Monte Carlo variability for M = 1000 replications.
Based on these findings, we conclude that
  • The MLE of $\theta$ in the Poisson–CNPED model performs well in finite samples, with negligible bias and a standard deviation decreasing at the expected $O(n^{-1/2})$ rate.
  • Both Wald and likelihood-ratio confidence intervals provide reliable coverage for sample sizes as small as n = 50 .
  • For routine applications, the Wald interval is computationally simpler and performs adequately. The likelihood-ratio interval may be preferred when the sample size is small or when the log-likelihood exhibits noticeable asymmetry.
All maximum likelihood estimates were obtained by numerical maximization of the log-likelihood using the BFGS quasi-Newton algorithm. In the simulation study, convergence was achieved in all Monte Carlo replications without the need for parameter constraints or penalization. The number of iterations required for convergence was typically small and did not increase substantially with the sample size. To assess sensitivity to initial values, the optimization was initialized from multiple starting points for θ , including values close to zero and moderately large values. In all cases, the algorithm converged to the same maximizer, suggesting a well-behaved likelihood surface for the Poisson–CNPED model. In terms of computational cost, the proposed model was found to be moderately more expensive than standard negative binomial regression due to the evaluation of the closed-form mixture likelihood. However, for sample sizes commonly encountered in reliability and insurance applications, computation times remained negligible and well within practical limits.

6. Real-Data Applications

We illustrate the usefulness of the proposed Poisson–NPED regression model by analyzing two well-known real datasets frequently used in reliability engineering and actuarial science. The first dataset concerns mechanical failure counts for diesel engine valve-seat components [4]. The second dataset is the French Motor Third-Party Liability (MTPL) insurance portfolio [14]. For each dataset, we compare the Poisson–NPED regression with three standard competitors: Poisson, negative binomial (Poisson–Gamma), and Poisson–Lindley regressions. All computations were performed in R.

6.1. Engineering Failure Counts: Valve-Seat Data

The valve-seat failure dataset, originally analyzed by [4], records the number of failures of diesel engine valve seats across different engines and operating conditions. The dataset contains n = 100 engines with the following covariates:
$x_{i1}$ = engine age (years), $x_{i2}$ = operating hours (thousands), $x_{i3}$ = engine type indicator (0/1).

Model Comparison

Table 4 reports the maximized log-likelihood, Akaike information criterion (AIC) and Bayesian information criterion (BIC) for each model. The Poisson model performs poorly due to excessive overdispersion. The negative binomial and Poisson–Lindley regressions improve the fit, but the Poisson–NPED model achieves the largest log-likelihood and the lowest AIC/BIC values.
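The information criteria used in this comparison are standard. As a quick check (ours), the AIC values in Table 4 are consistent with four estimated parameters per model (an intercept plus the three covariates, an assumption on our part); for instance, the Poisson row's AIC follows directly from its log-likelihood:

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2 log-likelihood + 2 (parameter count)."""
    return -2 * loglik + 2 * k

def bic(loglik, k, n):
    """Bayesian information criterion: -2 log-likelihood + k log(n)."""
    return -2 * loglik + k * math.log(n)

# Assuming k = 4 parameters, the Table 4 AIC values are reproduced:
print(aic(-142.7, 4))        # Poisson row
print(aic(-121.9, 4))        # Poisson-NPED row
print(bic(-121.9, 4, 100))   # BIC penalizes more heavily when log(n) > 2
```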
Table 5 displays the maximum likelihood estimates, standard errors, and z-statistics for the Poisson–NPED regression. Engine age and operating hours significantly increase the heterogeneity parameter θ i , resulting in heavier failure-count tails.
Figure 6 plots the Pearson residuals against fitted values for all four models. Only the Poisson–NPED regression produces a roughly symmetric residual cloud without strong patterns, indicating an adequate fit.

6.2. Insurance Claim Counts: French MTPL Data

The second dataset comes from the French MTPL (Motor Third-Party Liability) insurance portfolio, publicly available in the freMTPL2 dataset [14]. The data contain individual policy records, including exposure time, claim counts, and multiple risk-factor variables. For illustration, we analyze a subsample of n = 10,000 policies with the following covariates:
  • $x_{i1}$ = driver age
  • $x_{i2}$ = vehicle age
  • $x_{i3}$ = annual mileage
  • $x_{i4}$ = urban area indicator (0/1)
The empirical distribution of the number of claims exhibits strong overdispersion, making this dataset well-suited for evaluating mixed-Poisson regression models.
Table 6 shows that the Poisson model is clearly inadequate. The Poisson–NPED regression achieves the best fit overall, with substantial gains in model likelihood relative to the negative binomial and Poisson–Lindley regressions.
The estimated coefficients for the Poisson–NPED regression are listed in Table 7. Younger drivers, older vehicles, high mileage, and urban residence are associated with increased heterogeneity and heavier claim tails.
Figure 7 displays Pearson residuals for all four models. Only the Poisson–NPED regression removes the strong funnel-shaped pattern seen in the competing models.
In both the engineering and insurance applications, the Poisson–NPED regression consistently outperforms the Poisson, negative binomial, and Poisson–Lindley models. The NPED mixing distribution provides additional flexibility to capture heavy-tailed behavior and complex forms of overdispersion, resulting in improved goodness of fit, more stable parameter estimates, and better predictive accuracy.

7. Conclusions and Future Research

This paper introduced the Poisson–NPED distribution and its associated regression framework as a flexible and analytically tractable alternative for modeling heterogeneous count data. By exploiting the polynomial–exponential structure of the NPED mixing distribution, the proposed model accommodates a wide range of dispersion patterns and tail behaviors while retaining closed-form likelihoods and compatibility with standard GLM methodology. Simulation studies and real-data applications from engineering reliability and insurance claim modeling demonstrated clear advantages of the Poisson–NPED model over classical Poisson-based approaches, particularly in the presence of strong overdispersion. Despite these strengths, several avenues for further research remain. These include extending the NPED framework to zero-inflated and hurdle models, exploring multivariate or hierarchical NPED-based constructions, developing robust estimation procedures under model misspecification, and investigating Bayesian implementations that allow prior information on the mixing parameters. Additional work on diagnostic tools and goodness-of-fit measures tailored to NPED mixtures would further enhance their practical applicability. Overall, the proposed model opens promising directions for advancing flexible count-data analysis.

It is worth noting that alternative flexible count-data models, such as COM–Poisson-type regressions, can also accommodate underdispersion and overdispersion. However, such models typically require numerical evaluation of normalizing constants and more involved estimation procedures. In contrast, the Poisson–NPED regression proposed in this paper retains closed-form expressions for the marginal likelihood while achieving substantial flexibility in tail behavior and overdispersion through the NPED mixing distribution.
The empirical results presented in Section 6 indicate that this combination of analytical tractability and modeling flexibility leads to improved goodness of fit and stable inference in practical applications.

Author Contributions

Conceptualization, H.Z. and D.F.; Methodology, S.S.F. and V.R.; Validation, S.S.F., V.R. and D.F.; Formal analysis, V.R.; Investigation, H.Z., S.S.F. and V.R.; Writing—original draft, H.Z., S.S.F. and D.F.; Writing—review & editing, D.F.; Visualization, D.F.; Supervision, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the Portuguese Foundation for Science and Technology through the projects UIDB/00212/2020, UIDB/04630/2020 and UIDB/00297/2020.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cameron, A.C.; Trivedi, P.K. Regression Analysis of Count Data, 2nd ed.; Cambridge University Press: Cambridge, UK, 2013. [Google Scholar]
  2. Hilbe, J.M. Negative Binomial Regression, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  3. Karlis, D.; Xekalaki, E. Mixed Poisson distributions. Int. Stat. Rev. 2005, 73, 35–58. [Google Scholar] [CrossRef]
  4. Lawless, J.F. Negative binomial and mixed Poisson regression. Can. J. Stat. 1987, 15, 209–225. [Google Scholar] [CrossRef]
  5. Azevedo, A.M.; Silva, Í.J.; Nery, M.C.; Rocha, H.P.; Santana, R.A. Counting models for overdispersed data: A review with applications. Biom. Bras. J. 2023, 41, 274–286. [Google Scholar] [CrossRef]
  6. Alomair, G.; Tajuddin, R.R.M.; Bakouch, H.S.; Almohisen, A. A statistical model for count data analysis and population size estimation: Introducing a mixed Poisson–Lindley distribution and its zero truncation. Axioms 2024, 13, 125. [Google Scholar] [CrossRef]
  7. Bodhisuwan, W.; Wongpun, W. The Poisson–transmuted Janardan distribution for overdispersed count data. Trends Sci. 2022, 19, 2898. [Google Scholar]
  8. Sindhu, T.N.; Abdalla, G.S.S.; Shafiq, A.; Abushal, T.A. A new statistical framework for over-dispersed count data. J. Stat. Methods Appl. 2025, 18, 101987. [Google Scholar]
  9. Almulhim, F.A.; Hammad, A.T.; Bakr, M.E.; Balogun, O.S.; Habineza, A.; El-Raouf, M.M.A. Development of the generalized ridge estimator for the Poisson-inverse Gaussian regression model under multicollinearity. Sci. Rep. 2025, 15, 31162. [Google Scholar] [CrossRef] [PubMed]
  10. Beghriche, A.; Zeghdoudi, H.; Raman, V.; Chouia, S. New polynomial exponential distribution: Properties and applications. Stat. Transit. New Ser. 2022, 23, 95–112. [Google Scholar] [CrossRef]
  11. Sankaran, M. The discrete Poisson–Lindley distribution. Biometrics 1970, 26, 145–149. [Google Scholar] [CrossRef]
  12. Kus, C. A new lifetime distribution. Comput. Stat. Data Anal. 2007, 51, 4497–4509. [Google Scholar] [CrossRef]
  13. Barreto-Souza, W.; Cribari-Neto, F. A generalization of the exponential–Poisson distribution. arXiv 2008, arXiv:0809.1894. [Google Scholar] [CrossRef]
  14. Dutang, C.; Charpentier, A. freMTPL2: French Motor Third-Party Liability Insurance Claims Dataset. R Package CASdatasets. 2020. Available online: https://www.kaggle.com/datasets/karansarpal/fremtpl2-french-motor-tpl-insurance-claims (accessed on 7 January 2026).
Figure 1. Bias of the MLE θ ^ n as a function of sample size n for three values of the true parameter θ 0 . The bias decreases toward zero as n increases, confirming the consistency of the estimator. Larger values of θ 0 exhibit greater bias for small samples, but all curves converge to zero as n .
Figure 2. Empirical standard deviation of the maximum likelihood estimator θ ^ n as a function of sample size n for three values of the true parameter θ 0 { 0.5 , 1.0 , 2.0 } , based on M = 1000 Monte Carlo replications. The standard deviation decreases at the theoretical rate O ( n 1 / 2 ) as predicted by the asymptotic normality result in Theorem 8. For fixed sample size, larger values of θ 0 yield greater variability, reflecting the decrease in Fisher information as the parameter increases. These results validate the asymptotic theory and provide guidance on the sample sizes required for precise estimation.
Figure 3. Root mean squared error (RMSE) of the maximum likelihood estimator θ ^ n as a function of sample size n for three values of the true parameter θ 0 { 0.5 , 1.0 , 2.0 } , based on M = 1000 Monte Carlo replications. The RMSE combines both bias and variance into a single measure of estimation accuracy. Since the bias is negligible relative to the standard deviation (see Table 2), the RMSE is dominated by the variance component and decreases at rate O ( n 1 / 2 ) . For θ 0 = 2.0 , the RMSE at n = 50 is approximately 0.24 , decreasing to 0.07 at n = 500 , demonstrating substantial improvement in estimation precision with increasing sample size.
Figure 4. Empirical coverage probabilities of 95% confidence intervals for the parameter θ in the Poisson–CNPED model with true value θ 0 = 1.0 , based on M = 1000 Monte Carlo replications. The horizontal dashed line indicates the nominal 95% coverage level. Both the Wald interval (blue circles) and the likelihood-ratio interval (red triangles) achieve empirical coverage close to the nominal level across all sample sizes considered ( n { 50 , 100 , 200 , 500 } ). The Wald interval exhibits slight undercoverage at n = 100 (93.7%), while the likelihood-ratio interval maintains coverage above 94% throughout. These results confirm that both inference procedures provide reliable uncertainty quantification for the Poisson–CNPED model, even at moderate sample sizes.
Figure 5. Finite-sample performance of the MLE θ ^ n for the Poisson–CNPED model. (a) Bias decreases toward zero as n increases, confirming consistency. (b) Standard deviation decreases at rate O ( n 1 / 2 ) , as predicted by asymptotic theory. (c) RMSE follows the same pattern as SD, indicating that variance dominates the estimation error. (d) Coverage probabilities of 95% Wald and likelihood-ratio intervals for θ 0 = 1 ; both methods achieve coverage close to the nominal level.
Figure 6. Residual diagnostics for valve-seat failure data.
Figure 7. Residual diagnostics for French MTPL claim-frequency data.
Table 1. Special and limiting cases of the Poisson–NPED framework.
Mixing Law | Resulting Mixed-Poisson Model | Reference
Degenerate (point mass) | Poisson | classical
Gamma | Negative binomial | standard
Lindley | Poisson–Lindley | [11]
Exponential | Poisson–Exponential | [12,13]
Table 2. Finite-sample performance of the maximum likelihood estimator θ ^ for the Poisson–CNPED model based on M = 1000 Monte Carlo replications.
n | θ0 | Bias | Empirical SD | RMSE
50 | 0.5 | 0.0052 | 0.0437 | 0.0440
50 | 1.0 | 0.0109 | 0.1030 | 0.1035
50 | 2.0 | 0.0238 | 0.2380 | 0.2390
100 | 0.5 | 0.0024 | 0.0296 | 0.0297
100 | 1.0 | 0.0051 | 0.0720 | 0.0721
100 | 2.0 | 0.0175 | 0.1606 | 0.1614
200 | 0.5 | 0.0010 | 0.0214 | 0.0215
200 | 1.0 | 0.0028 | 0.0476 | 0.0476
200 | 2.0 | 0.0085 | 0.1174 | 0.1176
500 | 0.5 | 0.0007 | 0.0136 | 0.0136
500 | 1.0 | 0.0008 | 0.0311 | 0.0310
500 | 2.0 | 0.0020 | 0.0729 | 0.0729
Table 3. Empirical coverage probabilities (in %) of nominal 95% confidence intervals for θ ( M = 1000 replications).
n | θ0 | Wald | Likelihood-Ratio
50 | 0.5 | 95.1 | 94.6
50 | 1.0 | 95.0 | 94.6
50 | 2.0 | 94.7 | 95.1
100 | 0.5 | 96.7 | 96.3
100 | 1.0 | 93.7 | 94.2
100 | 2.0 | 95.3 | 95.5
200 | 0.5 | 94.8 | 94.9
200 | 1.0 | 95.5 | 95.3
200 | 2.0 | 94.8 | 94.4
500 | 0.5 | 94.9 | 94.8
500 | 1.0 | 94.7 | 94.6
500 | 2.0 | 94.6 | 95.0
Table 4. Model comparison for the valve-seat failure data ( n = 100 ).
Model | Log-Likelihood | AIC | BIC
Poisson | −142.7 | 293.4 | 302.0
Negative binomial | −128.5 | 265.0 | 275.1
Poisson–Lindley | −125.2 | 258.4 | 268.5
Poisson–NPED | −121.9 | 251.8 | 263.3
Table 5. Poisson–NPED regression: valve-seat data.
Covariate | Estimate | SE | z-Statistic
Intercept | 1.12 | 0.29 | 3.86
Engine age | 0.47 | 0.12 | 3.92
Operating hours | 0.33 | 0.10 | 3.30
Engine type (1 vs. 0) | 0.21 | 0.09 | 2.33
Table 6. Model comparison for the French MTPL claim-frequency data ( n = 10,000).
Model | Log-Likelihood | AIC | BIC
Poisson | −13,382.5 | 26,775.0 | 26,830.1
Negative binomial | −12,269.2 | 24,548.5 | 24,615.0
Poisson–Lindley | −12,188.7 | 24,393.4 | 24,460.0
Poisson–NPED | −11,992.1 | 23,996.2 | 24,075.4
Table 7. Poisson–NPED regression: French MTPL claim-frequency data.
Covariate | Estimate | SE | z-Statistic
Intercept | 2.05 | 0.06 | 34.2
Driver age | 0.018 | 0.002 | 9.0
Vehicle age | 0.052 | 0.004 | 13.0
Mileage | 0.147 | 0.006 | 24.5
Urban area (1 vs. 0) | 0.233 | 0.015 | 15.5