Article

Modified Kibria–Lukman Estimator for the Conway–Maxwell–Poisson Regression Model: Simulation and Application

by Nasser A. Alreshidi 1, Masad A. Alrasheedi 2, Adewale F. Lukman 3,*, Hleil Alrweili 1 and Rasha A. Farghali 4

1 Department of Mathematics, College of Science, Northern Border University, Arar 73213, Saudi Arabia
2 Department of Management Information Systems, College of Business Administration, Taibah University, Al-Madinah Al-Munawara 42353, Saudi Arabia
3 Department of Mathematics and Statistics, University of North Dakota, Grand Forks, ND 58202, USA
4 Department of Mathematics, Insurance and Applied Statistics, Helwan University, Cairo 11795, Egypt
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(5), 794; https://doi.org/10.3390/math13050794
Submission received: 10 January 2025 / Revised: 13 February 2025 / Accepted: 20 February 2025 / Published: 27 February 2025
(This article belongs to the Special Issue Application of Regression Models, Analysis and Bayesian Statistics)

Abstract: This study presents a novel estimator that combines the Kibria–Lukman and ridge estimators to address the challenges of multicollinearity in Conway–Maxwell–Poisson (COMP) regression models. The conventional COMP Maximum Likelihood Estimator (CMLE) is notably susceptible to the adverse effects of multicollinearity, underscoring the necessity for alternative estimation strategies. We comprehensively compare the proposed COMP Modified Kibria–Lukman estimator (CMKLE) against existing methodologies to mitigate multicollinearity effects. Through rigorous Monte Carlo simulations and real-world applications, our results demonstrate that the CMKLE exhibits superior resilience to multicollinearity while consistently achieving lower mean squared error (MSE) values. Additionally, our findings underscore the critical role of larger sample sizes in enhancing estimator performance, particularly in the presence of high multicollinearity and over-dispersion. Importantly, the CMKLE outperforms traditional estimators, including the CMLE, in predictive accuracy, reinforcing the imperative for judicious selection of estimation techniques in statistical modeling.

1. Introduction

Generalized linear models (GLMs) are powerful tools for modeling relationships between explanatory variables (predictors) and response variables, extending traditional linear regression to accommodate various data types, including those following distributions within the exponential family [1]. A key application of GLMs is modeling count data, which arises in social sciences, healthcare, economics, auto insurance claims, and medical research. Unlike continuous data, count data are discrete and often exhibit patterns of dispersion that complicate standard modeling approaches [2,3]. Several specialized GLMs have been developed, including Poisson and negative binomial regression models, to model count data [4,5].
Each model has strengths and limitations, depending on the specific data characteristics. The Poisson regression model, for instance, is commonly used for discrete data but assumes that the mean and variance are equal (equi-dispersion). This assumption rarely holds in practice, as count data often exhibit greater variability (over-dispersion), necessitating alternatives like the negative binomial model [2]. The negative binomial regression model addresses over-dispersion by introducing a dispersion parameter that allows for a variance greater than the mean, making it more suitable for real-world applications. However, the negative binomial model is limited to over-dispersed data and cannot accommodate under-dispersion, where the variance is less than the mean. In contrast, the COM–Poisson regression model offers greater flexibility by handling both over-dispersion and under-dispersion [6,7]. This versatility allows it to be applied across a broader range of count data scenarios. The COM–Poisson distribution, introduced by Conway and Maxwell [7], generalizes the Poisson distribution and includes the geometric, Poisson, and Bernoulli distributions as special cases, making it increasingly popular for count data analysis.
In recent studies, the Conway–Maxwell–Poisson (COM–Poisson or COMP) regression model has gained considerable attention due to its ability to accommodate data with varying levels of dispersion [8,9]. This flexibility makes it a robust alternative to traditional models such as the Poisson and negative binomial regression models, which are limited to handling equi-dispersion and over-dispersion, respectively. The COMP model is useful when dealing with count data that exhibit complex dispersion patterns, as it can model both over-dispersion (variance greater than the mean) and under-dispersion (variance less than the mean). This versatility has led to its growing adoption and has attracted increasing attention in recent research. The parameters of the COM–Poisson model are typically estimated using the COM–Poisson Maximum Likelihood Estimator (CMLE). However, the CMLE has a notable limitation: it becomes inefficient and produces unstable estimates in the presence of multicollinearity, a common issue where the explanatory variables are highly correlated [10,11]. Multicollinearity can inflate the variance of parameter estimates, leading to unreliable inference and predictions.
To address this issue, researchers have developed alternative estimators, which can mitigate the impact of multicollinearity by introducing a bias that stabilizes the estimates. Among these alternatives are the COM–Poisson Ridge estimator (CRE) [11], COM–Poisson Liu estimator (CLE) [12], COM–Poisson Liu-type estimator (CLTE) [9], and the COM–Poisson Kibria–Lukman estimator (CKLE) [8], among others. These shrinkage estimators combine the robustness of the COM–Poisson model with strategies to handle multicollinearity, thereby providing more stable and efficient parameter estimates. Each of these estimators has its strengths depending on the severity of multicollinearity and the specific structure of the data. For instance, the CRE introduces a penalty proportional to the size of the coefficients, effectively shrinking them towards zero to reduce variance. The CLE and CLTE, on the other hand, adjust the bias–variance trade-off more finely, making them particularly useful when the multicollinearity is moderate.
Recent advancements in alternative estimation techniques have helped mitigate the effects of multicollinearity in the COMP regression model, leading to more stable parameter estimates. As research in this field continues to evolve, the COMP model combined with shrinkage estimators shows great potential for applications involving count data, particularly in scenarios where multicollinearity exists among explanatory variables. Aladeitan et al. [13] introduced the Modified Kibria–Lukman (MKL) estimator to address multicollinearity in Poisson regression, showing that the MKL outperforms the Poisson Maximum Likelihood Estimator (PMLE), Poisson Ridge Regression Estimator (PRE), Poisson Liu Estimator (PLE), and Poisson Kibria–Lukman (PKL) estimator.
This paper aims to extend these methodologies by proposing a new COMP Modified Kibria–Lukman estimator (CMKLE) as an alternative for handling multicollinearity in the COMP regression model. We compare the performance of the proposed CMKLE with existing estimators such as the CRE, CLE, CKLE, and CLTE using the scalar mean squared error (SMSE) criterion. Theoretical findings are substantiated through simulation studies and real data analysis to demonstrate the superior performance of the proposed estimator over its competitors.
The paper is structured as follows: Section 2 outlines the fundamentals of the COMP regression model, reviews existing estimators, and introduces the new CMKLE. Section 3 presents a theoretical comparison of the proposed estimator with its competitors. Section 4 reports the results of an extensive Monte Carlo simulation study, assessing the performance of the proposed and competing estimators based on the MSE criterion. Section 5 showcases a real-world data example that further validates the superiority of the proposed estimator. Section 6 provides concluding remarks, summarizing the key findings and implications for future research.

2. COMP Regression Model and Estimation

The Conway–Maxwell–Poisson (COMP) distribution, introduced by Conway and Maxwell [7], is a versatile distribution designed to handle both over-dispersed and under-dispersed count data, addressing the limitations of the traditional Poisson distribution. Its probability mass function (pmf) is given as follows:
$$P(Y=y \mid \lambda,\nu)=\frac{\lambda^{y}}{(y!)^{\nu}\,Z(\lambda,\nu)},\qquad y=0,1,2,\ldots,$$
where $\lambda$ is the rate parameter, $\nu$ is the dispersion parameter, and $Z(\lambda,\nu)=\sum_{s=0}^{\infty}\lambda^{s}/(s!)^{\nu}$ is the normalizing constant. The flexibility of the COMP distribution stems from the dispersion parameter, allowing it to model a range of distributions, from Poisson (when $\nu=1$) to Bernoulli-like (as $\nu\to\infty$) to geometric (when $\nu=0$).
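For concreteness, the pmf can be sketched in a few lines of R. The helper below is illustrative, not the authors' code: the infinite series for $Z(\lambda,\nu)$ is truncated at an arbitrary point, and the COMPoissonReg package used later in this paper ships its own density function.

```r
# A minimal sketch of the COMP pmf (illustrative, not the authors' code).
# The infinite series for Z(lambda, nu) is truncated at 'upper', an
# illustrative choice that should be large enough for the given lambda, nu.
dcomp <- function(y, lambda, nu, upper = 200) {
  s <- 0:upper
  logterms <- s * log(lambda) - nu * lgamma(s + 1)   # log of lambda^s / (s!)^nu
  logZ <- max(logterms) + log(sum(exp(logterms - max(logterms))))  # stable log Z
  exp(y * log(lambda) - nu * lgamma(y + 1) - logZ)
}

dcomp(0:3, lambda = 2, nu = 1)   # nu = 1 recovers dpois(0:3, 2)
```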
The COM–Poisson regression model, introduced by Sellers and Shmueli [14], incorporates this distribution to model count data with varying levels of dispersion. The expected value of the COMP distribution can be approximated as follows:
$$E(Y)\approx\lambda^{1/\nu}+\frac{1}{2\nu}-\frac{1}{2},$$
while the variance is similarly expressed as follows:
$$V(Y)\approx\frac{1}{\nu}\,\lambda^{1/\nu}.$$
The COMP regression model is a generalized linear model with a mean link function, typically the log link function:
$$\log(\mu_{i})=x_{i}^{T}\beta,\qquad i=1,2,\ldots,n,$$
where $\mu_{i}$ is the expected count, $x_{i}$ is a vector of covariates, and $\beta$ is the corresponding vector of regression coefficients. The dispersion parameter can be either fixed or modeled as a function of covariates. Given the nonlinear nature of the likelihood function in COMP regression, maximum likelihood estimation (MLE) requires iterative optimization methods. The log-likelihood function for the COMP regression model is expressed as follows:
$$\ell(\beta)=\sum_{i=1}^{n}\left[y_{i}\log\lambda_{i}-\nu\log(y_{i}!)\right]-\sum_{i=1}^{n}\log Z(\lambda_{i},\nu),$$
where $\lambda_{i}=\exp(x_{i}^{T}\beta)$ is the rate parameter. Since the likelihood is nonlinear in $\beta$, iterative algorithms such as iteratively reweighted least squares (IRLS) or the Newton–Raphson method are employed to find the CMLE of $\beta$. The CMLE $\hat{\beta}_{CMLE}$ is obtained by solving the following:
$$\hat{\beta}_{CMLE}=(X^{T}\hat{W}X)^{-1}X^{T}\hat{W}\hat{z},$$
where the weight matrix $\hat{W}$ is based on the second derivative of the log-likelihood and $\hat{z}$ is the adjusted response variable. For a more comprehensive treatment of the derivation and implementation of the CMLE, refer to [8,9]. The CMLE is consistent and asymptotically efficient but suffers from high variance and instability when multicollinearity is present in the model. Consequently, biased estimation methods such as the CRE, CLE, and CKLE are often preferred over the CMLE.
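The weighted least-squares form of this update is easy to express in base R. The sketch below is schematic: it assumes $\hat{W}$ and $\hat{z}$ have already been built from the current iterate, whose exact construction follows the derivations in [8,9].

```r
# Schematic one-step form of the CMLE update, assuming the weight matrix W
# and working response z have already been formed from the current iterate.
cmle_step <- function(X, W, z) {
  solve(t(X) %*% W %*% X, t(X) %*% W %*% z)   # (X'WX)^{-1} X'W z
}
```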
The matrix mean squared error (MMSE) of the CMLE is given as follows:
$$MMSE(\hat{\beta}_{CMLE})=\hat{\nu}\,(X^{T}\hat{W}X)^{-1}.$$
When multicollinearity is present, the weighted cross-product matrix of the design matrix, $X^{T}\hat{W}X$, becomes poorly conditioned, leading to instability in the coefficient estimates obtained from the conventional CMLE. To address this issue, shrinkage estimators provide a robust alternative by introducing regularization, stabilizing the regression estimates, and reducing the effects of multicollinearity. The following section will review several existing shrinkage methods specifically developed for the COMP regression model and discuss their theoretical properties and practical applications.

2.1. Shrinkage Estimators

To tackle the issue of multicollinearity in linear regression, several shrinkage estimators have been introduced, including the ridge regression estimator by Hoerl and Kennard [15], as well as the Liu and Liu-type estimators [16,17], the Kibria–Lukman estimator [18], and two-parameter estimators [19,20]. These methods have been adapted to address multicollinearity in generalized linear models, extending their application to models such as Poisson regression [4,21], negative binomial regression (NBR) [5,22], and the Conway–Maxwell–Poisson (COMP) regression model [5,6,23]. These extensions have proven effective in maintaining estimator stability in the presence of highly collinear predictors.
Sami et al. [11] developed the COMP ridge estimator (CRE) and defined it as follows:
$$\hat{\beta}_{CRE}=D^{-1}X^{T}\hat{W}X\,\hat{\beta}_{CMLE},\qquad k>0,$$
where $D=X^{T}\hat{W}X+kI$ and $k$ is the ridge parameter, defined in this study as $k=p/\sum_{j=1}^{p}\hat{\beta}_{CMLE,j}^{2}$. The matrix mean squared error (MMSE) is expressed as follows:
$$MMSE(\hat{\beta}_{CRE})=\hat{\nu}\,D^{-1}X^{T}\hat{W}X\,D^{-1}+k^{2}D^{-1}\beta\beta^{T}D^{-1}.$$
Akram et al. [12] expanded the application of the Liu estimator (LE) to the COMP regression model in the following way:
$$\hat{\beta}_{CLE}=C_{d}\,\hat{\beta}_{CMLE},\qquad 0<d<1,$$
where $C_{d}=(X^{T}\hat{W}X+I)^{-1}(X^{T}\hat{W}X+dI)$. The Liu parameter $d$ in this study is defined as $d=1-\min(\hat{\beta}_{CMLE,j}^{2})$. The MMSE is expressed as follows:
$$MMSE(\hat{\beta}_{CLE})=\hat{\nu}\,C_{d}(X^{T}\hat{W}X)^{-1}C_{d}+(1-d)^{2}(X^{T}\hat{W}X+I)^{-1}\beta\beta^{T}(X^{T}\hat{W}X+I)^{-1}.$$
Kibria and Lukman [18] introduced a ridge-type estimator as an effective solution to the problem of multicollinearity in linear regression. Recognizing the need for similar strategies in more complex models, Abonazel et al. [8] extended this approach to the COMP regression model, resulting in what is now referred to as the COMP Kibria–Lukman estimator (CKLE). This estimator is designed to address multicollinearity by balancing bias and variance through regularization, thereby improving the stability of estimates in the COMP model. The CKLE is defined as follows:
$$\hat{\beta}_{CKLE}=C_{k}\,\hat{\beta}_{CMLE},\qquad 0<k<1,$$
where $C_{k}=(X^{T}\hat{W}X+kI)^{-1}(X^{T}\hat{W}X-kI)$. The MMSE is expressed as follows:
$$MMSE(\hat{\beta}_{CKLE})=\hat{\nu}\,C_{k}(X^{T}\hat{W}X)^{-1}C_{k}+4k^{2}(X^{T}\hat{W}X+kI)^{-1}\beta\beta^{T}(X^{T}\hat{W}X+kI)^{-1}.$$
Liu [17] introduced the Liu-type estimator to address multicollinearity in linear regression, merging the strengths of both the ridge and Liu estimators. This hybrid approach leverages the benefits of reduced variance and controlled bias. Building on this concept, Tanış and Asar [9] extended the Liu-type estimator to the Conway–Maxwell–Poisson (COMP) regression model, referred to as the COMP Liu-type estimator (CLTE). The formulation of this estimator is as follows:
$$\hat{\beta}_{CLTE}=C_{kd}\,\hat{\beta}_{CMLE},\qquad k>0,\ 0<d<1,$$
where $C_{kd}=(X^{T}\hat{W}X+kI)^{-1}(X^{T}\hat{W}X-dI)$. The MMSE is expressed as follows:
$$MMSE(\hat{\beta}_{CLTE})=\hat{\nu}\,C_{kd}(X^{T}\hat{W}X)^{-1}C_{kd}+(k+d)^{2}(X^{T}\hat{W}X+kI)^{-1}\beta\beta^{T}(X^{T}\hat{W}X+kI)^{-1}.$$
Sami et al. [23] introduced a two-parameter estimator for the Conway–Maxwell–Poisson (COMP) regression model, building upon the work of Asar and Genc [21]. This estimator, known as the two-parameter estimator (CTPE), was designed to provide greater flexibility in handling multicollinearity by incorporating two tuning parameters. The CTPE estimator is defined as follows:
$$\hat{\beta}_{CTPE}=H_{kd}\,\hat{\beta}_{CMLE},\qquad k>0,\ 0<d<1,$$
where $H_{kd}=(X^{T}\hat{W}X+kI)^{-1}(X^{T}\hat{W}X+kdI)$. The MMSE is expressed as follows:
$$MMSE(\hat{\beta}_{CTPE})=\hat{\nu}\,H_{kd}(X^{T}\hat{W}X)^{-1}H_{kd}+k^{2}(d-1)^{2}(X^{T}\hat{W}X+kI)^{-1}\beta\beta^{T}(X^{T}\hat{W}X+kI)^{-1}.$$
Additionally, Sami et al. [23] developed a two-parameter estimator for the Conway–Maxwell–Poisson (COMP) regression model, drawing from the work of Huang and Yang [22] on negative binomial regression (NBRM). This new estimator, the COMP Huang and Yang estimator (CHYE), was designed to extend the flexibility of biased estimation methods to the COMP framework. The CHYE is defined as follows:
$$\hat{\beta}_{CHYE}=M_{kd}\,X^{T}\hat{W}X\,\hat{\beta}_{CMLE},\qquad k>0,\ 0<d<1,$$
where $M_{kd}=(X^{T}\hat{W}X+I)^{-1}(X^{T}\hat{W}X+dI)(X^{T}\hat{W}X+kI)^{-1}$. The MMSE is expressed as follows:
$$MMSE(\hat{\beta}_{CHYE})=\hat{\nu}\,M_{kd}\,X^{T}\hat{W}X\,M_{kd}+G N_{kd}\,\beta\beta^{T}\,N_{kd}G,$$
where $N_{kd}=(X^{T}\hat{W}X+I)^{-1}(X^{T}\hat{W}X+kI)^{-1}$ and $G=(k-d+1)X^{T}\hat{W}X+kI$.

2.2. Proposed Estimator

In this study, we introduce a new estimator that builds upon the work of Aladeitan et al. [13], who proposed the Modified Kibria–Lukman estimator specifically for the Poisson regression model. The estimator combines the advantages of the Kibria–Lukman estimator with those of the ridge estimator. While many of the estimators discussed earlier can be classified as either single-parameter or two-parameter methods, the Modified Kibria–Lukman estimator, despite having a single biasing parameter, demonstrates superior performance compared to some two-parameter estimators. Recognizing the potential of this approach, we extended it to the Conway–Maxwell–Poisson (COMP) regression model, developing what we call the COMP Modified Kibria–Lukman estimator (CMKLE). The CMKLE offers improved estimator stability and accuracy in scenarios where traditional estimators may struggle due to high collinearity among predictor variables. The CMKLE is defined as follows:
$$\hat{\beta}_{CMKLE}=C_{k}\,\hat{\beta}_{CRE},\qquad 0<k<1,$$
where $C_{k}=(X^{T}\hat{W}X+kI)^{-1}(X^{T}\hat{W}X-kI)$. The bias of the estimator is expressed as follows:
$$Bias(\hat{\beta}_{CMKLE})=-k\,D^{-1}F\,D^{-1}\beta,$$
where $F=3X^{T}\hat{W}X+kI$. The covariance matrix is defined as follows:
$$Cov(\hat{\beta}_{CMKLE})=\hat{\nu}\,C_{k}D^{-1}X^{T}\hat{W}X\,D^{-1}C_{k}.$$
Hence, the MMSE is expressed as follows:
$$MMSE(\hat{\beta}_{CMKLE})=\hat{\nu}\,C_{k}D^{-1}X^{T}\hat{W}X\,D^{-1}C_{k}+k^{2}D^{-2}F\,\beta\beta^{T}\,F D^{-2}.$$
Let $Q^{T}X^{T}\hat{W}XQ=E=\operatorname{diag}(e_{j})$, $j=1,2,\ldots,p$, where $e_{1}\geq e_{2}\geq\cdots\geq e_{p}$ are the ordered eigenvalues of $X^{T}\hat{W}X$ and $Q$ is the $p\times p$ matrix whose columns are the normalized eigenvectors of $X^{T}\hat{W}X$. Thus, we can express $\beta$ in terms of $\theta=Q^{T}\beta$, so that $\hat{\theta}_{CMLE}=Q^{T}\hat{\beta}_{CMLE}$, leading to the canonical form of the CMLE. Consequently, we express the other estimators in this study in canonical form:
$$\hat{\theta}_{CRE}=(E+kI)^{-1}E\,\hat{\theta}_{CMLE}.$$
$$\hat{\theta}_{CLE}=(E+I)^{-1}(E+dI)\,\hat{\theta}_{CMLE}.$$
$$\hat{\theta}_{CKLE}=(E+kI)^{-1}(E-kI)\,\hat{\theta}_{CMLE}.$$
$$\hat{\theta}_{CLTE}=(E+kI)^{-1}(E-dI)\,\hat{\theta}_{CMLE}.$$
$$\hat{\theta}_{CTPE}=(E+kI)^{-1}(E+kdI)\,\hat{\theta}_{CMLE}.$$
$$\hat{\theta}_{CHYE}=(E+I)^{-1}(E+dI)(E+kI)^{-1}E\,\hat{\theta}_{CMLE}.$$
$$\hat{\theta}_{CMKLE}=(E+kI)^{-1}(E-kI)(E+kI)^{-1}E\,\hat{\theta}_{CMLE}.$$
Furthermore, for ease of computation, we applied the same shrinkage parameters across all estimators in this study. This approach ensured an impartial and uniform comparison without favoring any estimator. Specifically, the parameters $k$ and $d$ are defined as $k=p/\sum_{j=1}^{p}\hat{\beta}_{CMLE,j}^{2}$ and $d=1-\min(\hat{\beta}_{CMLE,j}^{2})$, respectively.
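A base-R sketch of these canonical forms is given below. It assumes that X, W, and beta_cmle are available from a fitted COMP model (they are not defined here), and the elementwise arithmetic is valid because E is diagonal.

```r
# Sketch: canonical-form shrinkage estimators from a fitted COMP model.
# X, W, and beta_cmle are assumed inputs; k and d follow the definitions above.
XtWX <- t(X) %*% W %*% X
eig  <- eigen(XtWX, symmetric = TRUE)
E <- eig$values                       # ordered eigenvalues e_1 >= ... >= e_p
Q <- eig$vectors                      # normalized eigenvectors
theta_cmle <- drop(t(Q) %*% beta_cmle)

k <- length(beta_cmle) / sum(beta_cmle^2)
d <- 1 - min(beta_cmle^2)

theta_cre   <- E / (E + k) * theta_cmle
theta_cle   <- (E + d) / (E + 1) * theta_cmle
theta_ckle  <- (E - k) / (E + k) * theta_cmle
theta_clte  <- (E - d) / (E + k) * theta_cmle
theta_ctpe  <- (E + k * d) / (E + k) * theta_cmle
theta_chye  <- (E + d) / (E + 1) * E / (E + k) * theta_cmle
theta_cmkle <- (E - k) / (E + k) * E / (E + k) * theta_cmle

beta_cmkle <- drop(Q %*% theta_cmkle)  # back-transform to the original scale
```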
We can also evaluate the performance of the estimators using the scalar mean squared error (SMSE) instead of the MMSE. The SMSE is defined as follows:
$$SMSE(\tilde{\vartheta})=\operatorname{tr}\!\left(Cov(\tilde{\vartheta})\right)+\operatorname{tr}\!\left[\operatorname{bias}(\tilde{\vartheta})\,\operatorname{bias}(\tilde{\vartheta})^{T}\right],$$
where $Cov(\tilde{\vartheta})$ is the covariance matrix of $\tilde{\vartheta}$ and $\operatorname{bias}(\tilde{\vartheta})=E(\tilde{\vartheta})-\vartheta$. This formulation allows us to express the matrix mean squared error in a scalar form by summing the trace of the covariance and the trace of the squared bias, making it easier to compare the performances of different estimators. Consequently, we reformulate the MMSE of all the estimators in terms of the SMSE for a more concise comparison. The SMSEs of the estimators mentioned above are obtained as follows:
$$SMSE(\hat{\theta}_{CMLE})=\hat{\nu}\sum_{j=1}^{p}\frac{1}{e_{j}}.$$
$$SMSE(\hat{\theta}_{CRE})=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}}{(e_{j}+k)^{2}}+\sum_{j=1}^{p}\frac{\theta_{j}^{2}k^{2}}{(e_{j}+k)^{2}}.$$
$$SMSE(\hat{\theta}_{CLE})=\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}+d)^{2}}{(e_{j}+1)^{2}e_{j}}+(d-1)^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+1)^{2}}.$$
$$SMSE(\hat{\theta}_{CKLE})=\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}-k)^{2}}{(e_{j}+k)^{2}e_{j}}+4k^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+k)^{2}}.$$
$$SMSE(\hat{\theta}_{CLTE})=\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}-d)^{2}}{(e_{j}+k)^{2}e_{j}}+(d+k)^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+k)^{2}}.$$
$$SMSE(\hat{\theta}_{CTPE})=\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}+kd)^{2}}{(e_{j}+k)^{2}e_{j}}+k^{2}(d-1)^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+k)^{2}}.$$
$$SMSE(\hat{\theta}_{CHYE})=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}+d)^{2}}{(e_{j}+1)^{2}(e_{j}+k)^{2}}+\sum_{j=1}^{p}\frac{\left[(1+k-d)e_{j}+k\right]^{2}\theta_{j}^{2}}{(e_{j}+k)^{2}(e_{j}+1)^{2}}.$$
$$SMSE(\hat{\theta}_{CMKLE})=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}.$$
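These expressions translate directly into code. The sketch below implements three of them (the remaining estimators follow the same template), taking the eigenvalues, canonical coefficients, dispersion estimate, and biasing parameter as inputs.

```r
# Sketch: SMSE formulas for the CMLE, CRE, and CMKLE (others are analogous).
smse_cmle <- function(E, nu_hat) nu_hat * sum(1 / E)

smse_cre <- function(E, theta, nu_hat, k) {
  nu_hat * sum(E / (E + k)^2) + sum(theta^2 * k^2 / (E + k)^2)
}

smse_cmkle <- function(E, theta, nu_hat, k) {
  nu_hat * sum(E * (E - k)^2 / (E + k)^4) +
    k^2 * sum((3 * E + k)^2 * theta^2 / (E + k)^4)
}
```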

3. Theoretical Comparisons

3.1. CMKLE and CMLE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CMLE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CMLE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}(3e_{j}+k)^{2}\theta_{j}^{2}e_{j}<(e_{j}+k)^{4}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CMLE})$ is obtained as follows:
$$\begin{aligned}\Delta_{1}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CMLE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{1}{e_{j}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}(3e_{j}+k)^{2}\theta_{j}^{2}e_{j}-(e_{j}+k)^{4}}{e_{j}(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{1}<0$ if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}(3e_{j}+k)^{2}\theta_{j}^{2}e_{j}<(e_{j}+k)^{4}$ for $\hat{\nu}>0$ and $k>0$. □
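The condition is easy to probe numerically. The check below uses arbitrary illustrative values of $e_j$, $\theta_j$, $\hat{\nu}$, and $k$, not quantities taken from the paper.

```r
# Illustrative numeric check of the condition in Section 3.1.
e <- c(5, 1, 0.1); theta <- c(0.5, 0.5, 0.7); nu_hat <- 1; k <- 0.2
lhs <- e^2 * (e - k)^2 + nu_hat * k^2 * (3 * e + k)^2 * theta^2 * e
rhs <- (e + k)^4
all(lhs < rhs)   # TRUE here, so the CMKLE beats the CMLE for these values
```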

3.2. CMKLE and CRE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CRE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CRE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}(e_{j}-k)^{2}+\hat{\nu}k^{2}(3e_{j}+k)^{2}\theta_{j}^{2}<(e_{j}+\hat{\nu}\theta_{j}^{2}k^{2})(e_{j}+k)^{2}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CRE})$ is obtained as follows:
$$\begin{aligned}\Delta_{2}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CRE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}}{(e_{j}+k)^{2}}-\sum_{j=1}^{p}\frac{\theta_{j}^{2}k^{2}}{(e_{j}+k)^{2}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}+\hat{\nu}k^{2}(3e_{j}+k)^{2}\theta_{j}^{2}-(e_{j}+\hat{\nu}\theta_{j}^{2}k^{2})(e_{j}+k)^{2}}{(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{2}<0$ if $e_{j}(e_{j}-k)^{2}+\hat{\nu}k^{2}(3e_{j}+k)^{2}\theta_{j}^{2}<(e_{j}+\hat{\nu}\theta_{j}^{2}k^{2})(e_{j}+k)^{2}$ for $\hat{\nu}>0$ and $k>0$. □

3.3. CMKLE and CLE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CLE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CLE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}^{2}(e_{j}+1)^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(e_{j}+1)^{2}(3e_{j}+k)^{2}<(e_{j}+d)^{2}(e_{j}+k)^{4}+\hat{\nu}(e_{j}+k)^{4}(d-1)^{2}\theta_{j}^{2}e_{j}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CLE})$ is obtained as follows:
$$\begin{aligned}\Delta_{3}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CLE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}+d)^{2}}{(e_{j}+1)^{2}e_{j}}-(d-1)^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+1)^{2}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}^{2}(e_{j}+1)^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(e_{j}+1)^{2}(3e_{j}+k)^{2}-(e_{j}+d)^{2}(e_{j}+k)^{4}-\hat{\nu}(e_{j}+k)^{4}(d-1)^{2}\theta_{j}^{2}e_{j}}{e_{j}(e_{j}+1)^{2}(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{3}<0$ if $e_{j}^{2}(e_{j}+1)^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(e_{j}+1)^{2}(3e_{j}+k)^{2}<(e_{j}+d)^{2}(e_{j}+k)^{4}+\hat{\nu}(e_{j}+k)^{4}(d-1)^{2}\theta_{j}^{2}e_{j}$ for $\hat{\nu}>0$ and $k>0$. □

3.4. CMKLE and CKLE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CKLE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CKLE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}<(e_{j}-k)^{2}(e_{j}+k)^{2}+4\hat{\nu}k^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CKLE})$ is obtained as follows:
$$\begin{aligned}\Delta_{4}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CKLE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}-k)^{2}}{(e_{j}+k)^{2}e_{j}}-4k^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+k)^{2}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}-(e_{j}-k)^{2}(e_{j}+k)^{2}-4\hat{\nu}k^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}}{e_{j}(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{4}<0$ if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}<(e_{j}-k)^{2}(e_{j}+k)^{2}+4\hat{\nu}k^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}$ for $\hat{\nu}>0$ and $k>0$. □

3.5. CMKLE and CLTE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CLTE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CLTE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}<(e_{j}-d)^{2}(e_{j}+k)^{2}+\hat{\nu}(d+k)^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CLTE})$ is obtained as follows:
$$\begin{aligned}\Delta_{5}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CLTE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}-d)^{2}}{(e_{j}+k)^{2}e_{j}}-(d+k)^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+k)^{2}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}-(e_{j}-d)^{2}(e_{j}+k)^{2}-\hat{\nu}(d+k)^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}}{e_{j}(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{5}<0$ if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}<(e_{j}-d)^{2}(e_{j}+k)^{2}+\hat{\nu}(d+k)^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}$ for $\hat{\nu}>0$ and $k>0$. □

3.6. CMKLE and CTPE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CTPE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CTPE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}<(e_{j}+kd)^{2}(e_{j}+k)^{2}+\hat{\nu}k^{2}(d-1)^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CTPE})$ is obtained as follows:
$$\begin{aligned}\Delta_{6}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CTPE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{(e_{j}+kd)^{2}}{(e_{j}+k)^{2}e_{j}}-k^{2}(d-1)^{2}\sum_{j=1}^{p}\frac{\theta_{j}^{2}}{(e_{j}+k)^{2}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}-(e_{j}+kd)^{2}(e_{j}+k)^{2}-\hat{\nu}k^{2}(d-1)^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}}{e_{j}(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{6}<0$ if $e_{j}^{2}(e_{j}-k)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}e_{j}(3e_{j}+k)^{2}<(e_{j}+kd)^{2}(e_{j}+k)^{2}+\hat{\nu}k^{2}(d-1)^{2}(e_{j}+k)^{2}\theta_{j}^{2}e_{j}$ for $\hat{\nu}>0$ and $k>0$. □

3.7. CMKLE and CHYE

The estimator $\hat{\theta}_{CMKLE}$ is superior to the estimator $\hat{\theta}_{CHYE}$ in the sense of the SMSE criterion, i.e., $SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CHYE})<0$ for $\hat{\nu}>0$ and $k>0$, if $e_{j}(e_{j}-k)^{2}(e_{j}+1)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}(e_{j}+1)^{2}(3e_{j}+k)^{2}<e_{j}(e_{j}+d)^{2}(e_{j}+k)^{2}+\hat{\nu}\left[(1+k-d)e_{j}+k\right]^{2}\theta_{j}^{2}(e_{j}+k)^{2}$.
Proof. 
The difference between $SMSE(\hat{\theta}_{CMKLE})$ and $SMSE(\hat{\theta}_{CHYE})$ is obtained as follows:
$$\begin{aligned}\Delta_{7}&=SMSE(\hat{\theta}_{CMKLE})-SMSE(\hat{\theta}_{CHYE})\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}}{(e_{j}+k)^{4}}+k^{2}\sum_{j=1}^{p}\frac{(3e_{j}+k)^{2}\theta_{j}^{2}}{(e_{j}+k)^{4}}-\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}+d)^{2}}{(e_{j}+1)^{2}(e_{j}+k)^{2}}-\sum_{j=1}^{p}\frac{\left[(1+k-d)e_{j}+k\right]^{2}\theta_{j}^{2}}{(e_{j}+k)^{2}(e_{j}+1)^{2}}\\&=\hat{\nu}\sum_{j=1}^{p}\frac{e_{j}(e_{j}-k)^{2}(e_{j}+1)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}(e_{j}+1)^{2}(3e_{j}+k)^{2}-e_{j}(e_{j}+d)^{2}(e_{j}+k)^{2}-\hat{\nu}\left[(1+k-d)e_{j}+k\right]^{2}\theta_{j}^{2}(e_{j}+k)^{2}}{(e_{j}+1)^{2}(e_{j}+k)^{4}}\end{aligned}$$
The difference $\Delta_{7}<0$ if $e_{j}(e_{j}-k)^{2}(e_{j}+1)^{2}+\hat{\nu}k^{2}\theta_{j}^{2}(e_{j}+1)^{2}(3e_{j}+k)^{2}<e_{j}(e_{j}+d)^{2}(e_{j}+k)^{2}+\hat{\nu}\left[(1+k-d)e_{j}+k\right]^{2}\theta_{j}^{2}(e_{j}+k)^{2}$ for $\hat{\nu}>0$ and $k>0$. □

4. Simulation Study

In this section, we design a Monte Carlo simulation to assess the estimators’ performance under varying conditions of multicollinearity and dispersion. The predictors are generated using the following equation:
$$x_{ij}=(1-\gamma^{2})^{1/2}\,n_{ij}+\gamma\,n_{i,p+1},\qquad i=1,2,\ldots,n,\quad j=1,2,\ldots,p,$$
where $n_{ij}$ denotes independent standard normal random numbers, and $\gamma$ controls the degree of correlation between predictors. Specifically, as $\gamma$ increases, the predictors become more correlated. We investigated different levels of multicollinearity with $\gamma=0.8,\ 0.9,\ 0.95$, and $0.99$. The response variable $y_i$ was drawn from the Conway–Maxwell–Poisson distribution:
$$y_{i}\sim \mathrm{COMP}(\mu_{i},\nu),$$
where $\mu_{i}=\exp(x_{i}^{T}\beta)$ and $\nu$ represents the dispersion parameter, with $\nu=0.9$ indicating under-dispersion and $\nu=1.25$ denoting over-dispersion. The sample sizes are set to n = 30, 50, 100, and 200, and two scenarios for the number of predictors are considered: p = 4 and p = 8. The true regression coefficients $\beta$ are chosen such that $\beta^{T}\beta=1$, following the approach of similar studies in the literature [24,25].
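A sketch of this design in R is shown below. rcmp is the random-number generator in the COMPoissonReg package used in this study; its exact signature is assumed from the package documentation, and the COMP rate is taken to be $\mu_i$ as in the model above.

```r
# Sketch of one simulated dataset under the design above.
library(COMPoissonReg)   # assumed to provide rcmp(n, lambda, nu)
set.seed(1)
n <- 100; p <- 4; gamma <- 0.95; nu <- 1.25
Z <- matrix(rnorm(n * (p + 1)), n, p + 1)
X <- sqrt(1 - gamma^2) * Z[, 1:p] + gamma * Z[, p + 1]  # correlated predictors
beta <- rep(1 / sqrt(p), p)                             # so that beta'beta = 1
mu <- drop(exp(X %*% beta))
y <- rcmp(n, lambda = mu, nu = nu)
```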
The primary criterion for evaluating the estimators’ performances is the estimated mean squared error (MSE), which is computed as follows:
$$MSE=\frac{1}{1000}\sum_{r=1}^{1000}\sum_{j=1}^{p}\left(\hat{\beta}_{rj}-\beta_{j}\right)^{2},$$
where $\hat{\beta}_{rj}$ denotes the estimate of the $j$-th parameter obtained in the $r$-th replication of the simulation, and $\beta_{j}$ is its true value. Each simulation was repeated 1000 times to ensure robust and reliable results. The COMPoissonReg package in R [26] was used to generate data and estimate the parameters of the Conway–Maxwell–Poisson regression model. The model fitting was performed using the glm.cmp function from the package, which provides maximum likelihood estimates for COMP regression models [14,27]. The COMP model is fitted without standardization and without an intercept. The simulation results are summarized in Table 1 and Table 2.
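Continuing the sketch above, the replication loop can be written as follows. glm.cmp is the COMPoissonReg fitting routine named in the text, but the formula and coefficient extraction shown here are assumptions about its interface rather than code from the paper.

```r
# Sketch of the Monte Carlo MSE computation for the CMLE.
nrep <- 1000
sq_err <- numeric(nrep)
for (r in 1:nrep) {
  y <- rcmp(n, lambda = drop(exp(X %*% beta)), nu = nu)
  fit <- glm.cmp(y ~ X - 1)                # no intercept, as in the text
  sq_err[r] <- sum((coef(fit)[1:p] - beta)^2)  # keep the p mean-model coefficients
}
mse <- mean(sq_err)
```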
In this section, we evaluate the performance of the estimators based on their mean squared error (MSE) values across varying sample sizes, levels of multicollinearity, dispersion parameters, and numbers of predictors. The results in Table 1 and Table 2 demonstrate a consistent pattern. MSE values are generally smaller under under-dispersion (ν = 0.9) than under over-dispersion (ν = 1.25), especially as the sample size increases. The best-performing estimators exhibit lower MSEs when ν = 0.9, particularly with larger sample sizes and lower multicollinearity. As the dispersion parameter increases to ν = 1.25, the MSE also increases, reflecting the higher variability inherent in over-dispersed data. This trend is observed across all sample sizes, multicollinearity levels, and predictor counts.
Across both tables, the MSE decreases as the sample size increases, highlighting the advantages of larger datasets. For instance, when p = 4 and γ = 0.80, the MSE for n = 30 is relatively high (e.g., 0.4473 for the CMLE), whereas for n = 200, it drops significantly (e.g., 0.0239 for the CMLE). This trend is consistent across all levels of multicollinearity and dispersion, demonstrating that increasing the sample size reduces the estimator’s variance and improves overall performance. Conversely, small sample sizes (e.g., n = 30) lead to higher MSEs, particularly when combined with high multicollinearity (e.g., γ = 0.99) and over-dispersion. In these cases, the performance gap between estimators is most pronounced, with the proposed estimator showing more resilience to these conditions than its counterparts.
When multicollinearity is low (e.g., γ=0.8), all estimators tend to perform well, with the MSEs decreasing significantly as the sample size increases. For example, with ν = 0.9 and n = 200, the MSE values for all estimators are relatively close, indicating minimal performance differences under low correlation. However, as the correlation between predictors increases (e.g., γ = 0.99), the MSE rises, particularly for smaller sample sizes. For instance, with p = 4 and n = 30, the MSE for the CMLE is 3.5256, reflecting poor performance under severe multicollinearity. In contrast, the proposed estimator (CMKLE) consistently shows better robustness to multicollinearity. Although the performance difference diminishes as the sample size increases, the CMKLE still maintains superior results.
Increasing the number of predictors from p = 4 (Table 1) to p = 8 (Table 2) results in higher MSE values across all conditions. For example, with γ = 0.95 and n = 200, the MSE for p = 4 is 0.0412 (CMKLE), while for p = 8, it is 0.0470, reflecting the increased complexity in estimation with more predictors. The performance advantage of the CMKLE is more pronounced in smaller sample sizes, where it demonstrates greater robustness to both multicollinearity and dispersion effects. Even as the sample size increases and the performance gap narrows, the CMKLE maintains its leading position, reflecting its robustness and efficiency across a range of model complexities and data conditions.
Figure 1 and Figure 2, respectively, clearly illustrate the trends in the relationship between mean squared error (MSE) and sample sizes and MSE and multicollinearity levels. These figures provide compelling evidence of how variations in sample size and the degree of multicollinearity influence the predictive accuracy of the models under consideration.

5. Applications

In this study, we modeled two real-life datasets: one exhibiting over-dispersion and the other under-dispersion. By analyzing datasets with different dispersion characteristics, we aimed to demonstrate the versatility and advantages of the COM–Poisson regression model, which is uniquely capable of handling both types of dispersion. This allows for the more accurate modeling of count data compared to traditional models that are limited to addressing either over-dispersion or equi-dispersion scenarios.

5.1. Example 1

In this study, we analyzed the nuts dataset, initially introduced by Hilbe [28] and most recently examined by Lukman et al. [2]. The dataset, available through the ‘COUNT’ package, comprises 52 observations and seven explanatory variables. It focuses on squirrel activity and various forest characteristics within Scotland’s Abernathy Forest. The response variable measures the number of cones stripped by squirrels per plot. The explanatory variables include the following: the number of trees per plot ($x_1$), the number of trees with a diameter at breast height (DBH) per plot ($x_2$), mean tree height per plot ($x_3$), canopy closure percentage ($x_4$), standardized number of trees per plot ($x_5$), standardized mean tree height per plot ($x_6$), and standardized canopy closure percentage ($x_7$).
In this analysis, following Lukman et al. [2], we excluded the variables $x_5$ and $x_6$ due to the modeling difficulties they posed. Subsequent investigation revealed the presence of multicollinearity, as indicated by variance inflation factors (VIFs) of 4.474, 16.332, 39.61, and 40.89 for the remaining variables. These high VIF values suggest strong correlations between several regressors, which, combined with over-dispersion, complicate the model-fitting process. Over-dispersion was confirmed by a dispersion parameter of 13.86763, signaling that the variance far exceeds the mean. This combination of multicollinearity and over-dispersion is critical, as it can distort the estimation of regression coefficients, compromise model accuracy, and reduce the reliability of statistical inferences.
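As a side note, VIFs of this kind can be computed from the retained regressors alone, since the VIF of the $j$-th regressor equals the $j$-th diagonal element of the inverse of their correlation matrix. A one-line sketch, with X assumed to hold the retained columns, is:

```r
# Sketch: VIFs as the diagonal of the inverse correlation matrix of the
# retained regressors (X is assumed to hold those columns).
round(diag(solve(cor(X))), 3)
```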
Table 3 presents the estimated regression coefficients derived from various estimators employed in this study, which include the CMLE, CRE, CLE, CKLE, CLTE, CTPE, CHYE, and CMKLE. The performances of these estimators were evaluated using the scalar mean squared error (SMSE), as defined in Equations (32)–(39). The optimal estimator minimizes the SMSE, reflecting a lower prediction error. According to the results, the CMKLE demonstrates the lowest SMSE, with a value of 0.8626, suggesting that it provides the most precise predictions among the estimators considered in this analysis. The CHYE follows closely with an SMSE of 0.9609, also exhibiting strong performance. On the other hand, the CMLE has the highest SMSE at 17.3187, indicating that it is the least efficient estimator for this dataset.
The confidence intervals (CIs) computed for each estimator provide insights into the estimated coefficients’ statistical significance, precision, and reliability. A coefficient’s statistical significance is determined by whether its CI includes zero. If a CI does not contain zero, the coefficient is deemed statistically significant at the 5% level, indicating a meaningful contribution to the model. The CMKLE provides the most stable and precise estimates, as indicated by the lowest SMSE. However, not all coefficients are statistically significant. The results reveal that $x_2$, $x_3$, and $x_5$ are significant, while $x_1$ and $x_4$ are insignificant.
In summary, the CMKLE outperforms the other methods, providing the smallest SMSE and, consequently, the best prediction accuracy. Estimators with higher SMSE values, such as the CMLE, demonstrate reduced efficiency for this dataset due to their larger prediction errors. These findings align with the simulation results presented in Table 1 and Table 2, further reinforcing the robustness of the CMKLE approach.

5.2. Example 2

The second application involves analyzing aircraft data, where the response variable represents the count of damaged locations on the aircraft, which is modeled as following a Poisson distribution [29,30]. The explanatory variables include the type of aircraft, coded as 0 for A-4 and 1 for A-6 ($x_1$), the bomb load in tons ($x_2$), and the total months of aircrew experience ($x_3$). Recent work by Algamal et al. [31] applied the COM–Poisson regression model for more suitable data modeling. Our analysis indicates that the dataset exhibits under-dispersion, characterized by lower variability in counts than expected under traditional Poisson or negative binomial models. The COM–Poisson model effectively accommodates this behavior, as evidenced by the low dispersion parameter of 0.8756, which aligns with the characteristics of the dataset. Lukman et al. [4] identified significant multicollinearity issues, indicated by a condition number of 219.3654.
The results in Table 4 summarize the estimated regression coefficients for various estimators applied to the aircraft data, alongside the corresponding SMSE values. Among the estimators, the CMKLE demonstrates the best performance, achieving the lowest SMSE of 0.0253, indicating superior precision in coefficient estimation. The next best estimator, the CKLE, exhibits an SMSE of 0.0555, further reinforcing its reliability. In contrast, the CMLE shows relatively poorer performance, emphasizing the negative impact of multicollinearity on its performance. Given that the CMKLE outperforms other estimators in terms of having the lowest SMSE, we base our interpretation of the confidence intervals in Table 4 on the CMKLE estimates. The results from the CMKLE provide valuable insights into the relationship between the predictor variables and the response. Notably, the estimated coefficient for $x_1$ is 0.1312, with a 95% confidence interval of (−0.2520, 0.5144). Since this interval includes zero, we cannot conclude that $x_1$ has a statistically significant effect on the response variable at the given confidence level. This suggests that while $x_1$ may contribute to the model, its influence is not strong enough to be distinguished from random variation. In contrast, both $x_2$ and $x_3$ are statistically significant. The coefficient for $x_2$ is 0.1834, with a confidence interval of (0.1121, 0.2546), indicating a positive and significant relationship with the response variable. Similarly, $x_3$ has a negative coefficient of −0.0166, with a confidence interval of (−0.0256, −0.0076), confirming a significant negative effect.
Traditional estimators, such as the CMLE, may exhibit limitations in estimation accuracy, particularly in model complexities such as multicollinearity or non-normal error structures. This underscores the importance of selecting robust estimators tailored to the data’s characteristics to enhance predictive reliability. Notably, these findings align with the simulation results presented in Table 1 and Table 2, further validating the superior robustness and efficiency of the CMKLE approach.

6. Concluding Remarks

This study addresses the significant challenges posed by multicollinearity in COMP regression models by introducing a novel estimator that integrates the Kibria–Lukman and ridge estimators. This advancement significantly contributes to the existing literature, offering new methodologies that help mitigate the adverse effects of multicollinearity. Through rigorous Monte Carlo simulation studies and real-life applications using two distinct datasets, we comprehensively evaluated the performance of the proposed estimators. The simulation results reveal valuable insights into the behavior of the estimators under varying conditions, including different sample sizes, numbers of independent variables, dispersion parameters, and levels of multicollinearity. Our analysis of mean squared error (MSE) values across these variables demonstrated a consistent pattern: MSE values were generally lower under conditions of under-dispersion (ν = 0.9) compared to over-dispersion (ν = 1.25), particularly as sample sizes increased. The best-performing estimators exhibited significantly lower MSEs with larger sample sizes and reduced multicollinearity, underscoring their robustness.
Moreover, our findings emphasize the benefits of larger datasets, where the MSE consistently decreased with increasing sample size. This trend highlights the critical importance of sufficient sample sizes in enhancing estimator performance, particularly when there is high multicollinearity and over-dispersion. Conversely, smaller sample sizes resulted in high MSEs, illustrating the pronounced performance gap among estimators under these challenging conditions. Notably, the proposed CMKLE demonstrated superior resilience to multicollinearity, maintaining outstanding performance even as sample sizes increased. As we increased the number of predictors from p = 4 to p = 8, we observed a corresponding increase in MSE values across all conditions, confirming the complexities introduced by additional predictors. Notably, the CMKLE displayed a pronounced advantage in smaller sample sizes, demonstrating its robustness against multicollinearity and dispersion effects.
Furthermore, the application results reinforced the superior performance of the CMKLE. The scalar mean squared error (SMSE) evaluations indicate that the CMKLE achieved the lowest SMSE, demonstrating the highest predictive accuracy among the estimators analyzed. In contrast, the traditional CMLE displayed significantly higher SMSE values, highlighting its reduced efficiency for these datasets. In conclusion, the CMKLE not only surpasses other methodologies in predictive accuracy but also underscores the necessity for careful selection of estimation techniques tailored to specific data characteristics. This research paves the way for future studies to explore the broader application of these methodologies across diverse fields, ultimately enhancing the robustness of statistical models in the presence of multicollinearity and varying dispersion conditions.
Additionally, future research could explore Bayesian approaches to COM–Poisson regression, which have been proposed in the literature as an alternative methodology for handling dispersed count data. Bayesian inference methods, such as those developed by Chanialidis et al. [32], offer flexible estimation techniques that may provide further improvements in model performance and computational efficiency.

Author Contributions

Conceptualization: All the authors; Methodology: All the authors; Formal analysis: All the authors; Resources: All the authors; Writing—original draft preparation, All the authors; Writing—review and editing, All the authors; Visualization, A.F.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Deputyship for Research & Innovation, Ministry of Education, Saudi Arabia through project number “NBU-FPEJ-2025-7-01”.

Data Availability Statement

Data will be made available upon request.

Acknowledgments

The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia for funding this research work through the project number “NBU-FPEJ-2025-7-01”.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Francis, R.A.; Geedipally, S.R.; Guikema, S.D.; Dhavala, S.S.; Lord, D.; LaRocca, S. Characterizing the performance of the Conway-Maxwell Poisson generalized linear model. Risk Anal. Int. J. 2012, 32, 167–183. [Google Scholar] [CrossRef] [PubMed]
  2. Lukman, A.F.; Albalawi, O.; Arashi, M.; Allohibi, J.; Alharbi, A.A.; Farghali, R.A. Robust Negative Binomial Regression via the Kibria–Lukman Strategy: Methodology and Application. Mathematics 2024, 12, 2929. [Google Scholar] [CrossRef]
  3. Walters, G.D. Using Poisson class regression to analyze count data in correctional and forensic psychology: A relatively old solution to a relatively new problem. Crim. Justice Behav. 2007, 34, 1659–1674. [Google Scholar] [CrossRef]
  4. Lukman, A.F.; Adewuyi, E.; Månsson, K.; Kibria, B.G. A new estimator for the multicollinear Poisson regression model: Simulation and application. Sci. Rep. 2021, 11, 3732. [Google Scholar] [CrossRef]
  5. Månsson, K. On ridge estimators for the negative binomial regression model. Econ. Model. 2012, 29, 178–184. [Google Scholar] [CrossRef]
  6. Conway, R.W.; Maxwell, W.L. A queuing model with state dependent service rates. J. Ind. Eng. Int. 1962, 12, 132–136. [Google Scholar]
  7. Chatla, S.B.; Shmueli, G. Efficient estimation of COM–Poisson regression and a generalized additive model. Comput. Stat. Data Anal. 2018, 121, 71–88. [Google Scholar] [CrossRef]
  8. Abonazel, M.R.; Saber, A.A.; Awwad, F.A. Kibria–Lukman estimator for the Conway–Maxwell Poisson regression model: Simulation and applications. Sci. Afr. 2023, 19, e01553. [Google Scholar] [CrossRef]
  9. Tanış, C.; Asar, Y. Liu-type estimator in Conway–Maxwell–Poisson regression model: Theory, simulation and application. Statistics 2024, 58, 65–86. [Google Scholar] [CrossRef]
  10. Abonazel, M.R. New modified two-parameter Liu estimator for the Conway–Maxwell Poisson regression model. J. Stat. Comput. Simul. 2023, 93, 1976–1996. [Google Scholar] [CrossRef]
  11. Sami, F.; Amin, M.; Butt, M.M. On the ridge estimation of the Conway-Maxwell Poisson regression model with multicollinearity: Methods and applications. Concurr. Comput. Pract. Exp. 2022, 34, e6477. [Google Scholar] [CrossRef]
  12. Akram, M.N.; Amin, M.; Sami, F.; Mastor, A.B.; Egeh, O.M.; Muse, A.H. A new Conway Maxwell–Poisson Liu regression estimator—Method and application. J. Math. 2022, 2022, 3323955. [Google Scholar] [CrossRef]
  13. Aladeitan, B.B.; Adebimpe, O.; Lukman, A.F.; Oludoun, O.; Abiodun, O.E. Modified Kibria-Lukman (MKL) estimator for the Poisson Regression Model: Application and simulation. F1000Research 2021, 10, 548. [Google Scholar] [CrossRef]
  14. Sellers, K.F.; Shmueli, G. A flexible regression model for count data. Ann. Appl. Stat. 2010, 4, 943–961. [Google Scholar] [CrossRef]
  15. Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67. [Google Scholar] [CrossRef]
  16. Liu, K. A new class of biased estimate in linear regression. Commun. Stat. Theory Methods 1993, 22, 393–402. [Google Scholar]
  17. Liu, K. Using Liu-type estimator to combat collinearity. Commun. Stat. Theory Methods 2003, 32, 1009–1020. [Google Scholar] [CrossRef]
  18. Kibria, B.G.; Lukman, A.F. A new ridge-type estimator for the linear regression model: Simulations and applications. Scientifica 2020, 2020, 9758378. [Google Scholar] [CrossRef] [PubMed]
  19. Özkale, M.R.; Kaciranlar, S. The restricted and unrestricted two-parameter estimators. Commun. Stat. Theory Methods 2007, 36, 2707–2725. [Google Scholar] [CrossRef]
  20. Sakallıoğlu, S.; Kaçıranlar, S. A new biased estimator based on ridge estimation. Stat. Pap. 2008, 49, 669–689. [Google Scholar] [CrossRef]
  21. Asar, Y.; Genç, A. A new two-parameter estimator for the Poisson regression model. Iran. J. Sci. Technol. Trans. A Sci. 2018, 42, 793–803. [Google Scholar] [CrossRef]
  22. Huang, J.; Yang, H. A two-parameter estimator in the negative binomial regression model. J. Stat. Comput. Simul. 2014, 84, 124–134. [Google Scholar] [CrossRef]
  23. Sami, F.; Butt, M.M.; Amin, M. Two parameter estimators for the Conway–Maxwell–Poisson regression model. J. Stat. Comput. Simul. 2023, 93, 2137–2157. [Google Scholar] [CrossRef]
  24. Kibria, B.G. Performance of some new ridge regression estimators. Commun. Stat. -Simul. Comput. 2003, 32, 419–435. [Google Scholar] [CrossRef]
  25. Oranye, H.E.; Ugwuowo, F.I. Modified jackknife Kibria–Lukman estimator for the Poisson regression model. Concurr. Comput. Pract. Exp. 2022, 34, e6757. [Google Scholar] [CrossRef]
  26. Sellers, K.; Lotze, T.; Raim, A. COMPoissonReg: Conway-Maxwell Poisson (COM-Poisson) Regression; R Package Version 0.4; The Comprehensive R Archive Network: Vienna, Austria, 2017; Volume 1. [Google Scholar]
  27. Sellers, K.F.; Raim, A. A flexible zero-inflated model to address data dispersion. Comput. Stat. Data Anal. 2016, 99, 68–80. [Google Scholar] [CrossRef]
  28. Hilbe, J.M. Negative Binomial Regression, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  29. Myers, R.H.; Montgomery, D.C.; Vining, G.; Robinson, T.J. Generalized Linear Models: With Applications in Engineering and the Sciences; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  30. Amin, M.; Akram, M.N.; Amanullah, M. On the James-Stein estimator for the Poisson regression model. Commun. Stat.-Simul. Comput. 2022, 51, 5596–5608. [Google Scholar] [CrossRef]
  31. Algamal, Z.Y.; Abonazel, M.R.; Awwad, F.A.; Eldin, E.T. Modified jackknife ridge estimator for the Conway-Maxwell-Poisson model. Sci. Afr. 2023, 19, e01543. [Google Scholar] [CrossRef]
  32. Chanialidis, C.; Evers, L.; Neocleous, T.; Nobile, A. Efficient Bayesian inference for COM-Poisson regression models. Stat. Comput. 2018, 28, 595–608. [Google Scholar] [CrossRef]
Figure 1. A sample plot of the sample size against the mean squared error.
Figure 2. A sample plot of the mean squared error against the level of multicollinearity.
Table 1. Estimated MSE values for p = 4.

| γ | Estimator | n = 30 (ν = 0.9) | n = 50 (ν = 0.9) | n = 100 (ν = 0.9) | n = 200 (ν = 0.9) | n = 30 (ν = 1.25) | n = 50 (ν = 1.25) | n = 100 (ν = 1.25) | n = 200 (ν = 1.25) |
|---|---|---|---|---|---|---|---|---|---|
| 0.80 | θ̂_CMLE | 0.4473 | 0.1531 | 0.0475 | 0.0239 | 0.7623 | 0.2634 | 0.0796 | 0.0433 |
| 0.80 | θ̂_CRE | 0.4087 | 0.1470 | 0.0470 | 0.0238 | 0.6209 | 0.2380 | 0.0769 | 0.0422 |
| 0.80 | θ̂_CLE | 0.4407 | 0.1547 | 0.0468 | 0.0236 | 0.7051 | 0.2614 | 0.0771 | 0.0421 |
| 0.80 | θ̂_CKLE | 0.3741 | 0.1413 | 0.0466 | 0.0236 | 0.5060 | 0.2148 | 0.0742 | 0.0411 |
| 0.80 | θ̂_CLTE | 0.3549 | 0.1378 | 0.0462 | 0.0234 | 0.4355 | 0.2030 | 0.0723 | 0.0402 |
| 0.80 | θ̂_CTPE | 0.4479 | 0.1565 | 0.0481 | 0.0242 | 0.7266 | 0.2702 | 0.0824 | 0.0448 |
| 0.80 | θ̂_CHYE | 0.3847 | 0.1418 | 0.0464 | 0.0235 | 0.5556 | 0.2199 | 0.0739 | 0.0408 |
| 0.80 | θ̂_CMKLE | 0.3488 | 0.1364 | 0.0461 | 0.0234 | 0.4279 | 0.1817 | 0.0717 | 0.0400 |
| 0.90 | θ̂_CMLE | 0.6040 | 0.1938 | 0.0533 | 0.0297 | 1.0544 | 0.3726 | 0.1044 | 0.0652 |
| 0.90 | θ̂_CRE | 0.5120 | 0.1812 | 0.0525 | 0.0295 | 0.7361 | 0.3122 | 0.0988 | 0.0620 |
| 0.90 | θ̂_CLE | 0.5765 | 0.1960 | 0.0531 | 0.0300 | 0.8863 | 0.3701 | 0.1029 | 0.0627 |
| 0.90 | θ̂_CKLE | 0.4362 | 0.1695 | 0.0518 | 0.0293 | 0.5127 | 0.2601 | 0.0934 | 0.0589 |
| 0.90 | θ̂_CLTE | 0.4004 | 0.1630 | 0.0513 | 0.0291 | 0.4534 | 0.2364 | 0.0900 | 0.0569 |
| 0.90 | θ̂_CTPE | 0.5914 | 0.1989 | 0.0542 | 0.0305 | 0.9288 | 0.3775 | 0.1092 | 0.0688 |
| 0.90 | θ̂_CHYE | 0.4651 | 0.1713 | 0.0517 | 0.0291 | 0.6202 | 0.2756 | 0.0933 | 0.0586 |
| 0.90 | θ̂_CMKLE | 0.3929 | 0.1603 | 0.0512 | 0.0291 | 0.4416 | 0.1973 | 0.0887 | 0.0562 |
| 0.95 | θ̂_CMLE | 0.9898 | 0.2719 | 0.0656 | 0.0378 | 1.8485 | 0.5998 | 0.1522 | 0.1092 |
| 0.95 | θ̂_CRE | 0.7345 | 0.2385 | 0.0641 | 0.0374 | 1.0410 | 0.4376 | 0.1379 | 0.0990 |
| 0.95 | θ̂_CLE | 0.8584 | 0.2729 | 0.0656 | 0.0385 | 1.2521 | 0.5653 | 0.1500 | 0.1038 |
| 0.95 | θ̂_CKLE | 0.5521 | 0.2092 | 0.0626 | 0.0369 | 0.5899 | 0.3107 | 0.1245 | 0.0893 |
| 0.95 | θ̂_CLTE | 0.4837 | 0.1945 | 0.0614 | 0.0363 | 0.4999 | 0.2422 | 0.1172 | 0.0838 |
| 0.95 | θ̂_CTPE | 0.8971 | 0.2784 | 0.0674 | 0.0397 | 1.3720 | 0.5762 | 0.1613 | 0.1177 |
| 0.95 | θ̂_CHYE | 0.6375 | 0.2165 | 0.0623 | 0.0365 | 0.8309 | 0.3589 | 0.1257 | 0.0896 |
| 0.95 | θ̂_CMKLE | 0.4445 | 0.1913 | 0.0611 | 0.0364 | 0.4882 | 0.2264 | 0.1136 | 0.0813 |
| 0.99 | θ̂_CMLE | 3.5256 | 0.8813 | 0.1737 | 0.0890 | 7.7852 | 2.3927 | 0.5374 | 0.4632 |
| 0.99 | θ̂_CRE | 1.2995 | 0.5255 | 0.1542 | 0.0833 | 2.0085 | 0.9558 | 0.3676 | 0.3119 |
| 0.99 | θ̂_CLE | 1.5153 | 0.7421 | 0.1324 | 0.0947 | 1.7984 | 1.3395 | 0.5239 | 0.4625 |
| 0.99 | θ̂_CKLE | 0.5764 | 0.2926 | 0.1361 | 0.0779 | 1.0838 | 0.2777 | 0.2338 | 0.1930 |
| 0.99 | θ̂_CLTE | 0.6637 | 0.2293 | 0.1268 | 0.0738 | 1.1947 | 0.2619 | 0.1825 | 0.1484 |
| 0.99 | θ̂_CTPE | 1.8144 | 0.7610 | 0.1843 | 0.1037 | 2.6849 | 1.4624 | 0.5323 | 0.4717 |
| 0.99 | θ̂_CHYE | 0.9402 | 0.3935 | 0.1589 | 0.0763 | 1.4206 | 0.6236 | 0.2801 | 0.2320 |
| 0.99 | θ̂_CMKLE | 0.4811 | 0.2199 | 0.1220 | 0.0734 | 0.5135 | 0.2484 | 0.1683 | 0.1351 |
Table 2. Estimated MSE values for p = 8.

| γ | Estimator | n = 30 (ν = 0.9) | n = 50 (ν = 0.9) | n = 100 (ν = 0.9) | n = 200 (ν = 0.9) | n = 30 (ν = 1.25) | n = 50 (ν = 1.25) | n = 100 (ν = 1.25) | n = 200 (ν = 1.25) |
|---|---|---|---|---|---|---|---|---|---|
| 0.80 | θ̂_CMLE | 3.5278 | 0.3224 | 0.0779 | 0.0276 | 4.9666 | 0.5866 | 0.1546 | 0.0564 |
| 0.80 | θ̂_CRE | 2.3067 | 0.3055 | 0.0770 | 0.0275 | 2.9292 | 0.5163 | 0.1465 | 0.0551 |
| 0.80 | θ̂_CLE | 2.6045 | 0.3300 | 0.0790 | 0.0278 | 3.0746 | 0.5999 | 0.1610 | 0.0580 |
| 0.80 | θ̂_CKLE | 1.5119 | 0.2898 | 0.0761 | 0.0274 | 1.8735 | 0.4541 | 0.1387 | 0.0539 |
| 0.80 | θ̂_CLTE | 1.3990 | 0.2816 | 0.0752 | 0.0272 | 1.6875 | 0.4299 | 0.1337 | 0.0526 |
| 0.80 | θ̂_CTPE | 2.8118 | 0.3406 | 0.0807 | 0.0281 | 3.4730 | 0.6235 | 0.1706 | 0.0605 |
| 0.80 | θ̂_CHYE | 2.0071 | 0.2907 | 0.0756 | 0.0273 | 2.5762 | 0.4700 | 0.1374 | 0.0532 |
| 0.80 | θ̂_CMKLE | 0.9374 | 0.2769 | 0.0752 | 0.0273 | 1.6162 | 0.4105 | 0.1319 | 0.0527 |
| 0.90 | θ̂_CMLE | 3.8053 | 0.4251 | 0.0951 | 0.0308 | 8.4293 | 0.8115 | 0.2285 | 0.0751 |
| 0.90 | θ̂_CRE | 2.6196 | 0.3914 | 0.0932 | 0.0303 | 3.9368 | 0.6670 | 0.2095 | 0.0724 |
| 0.90 | θ̂_CLE | 2.7438 | 0.4357 | 0.0971 | 0.0304 | 3.4833 | 0.8158 | 0.2409 | 0.0782 |
| 0.90 | θ̂_CKLE | 1.6335 | 0.3611 | 0.0913 | 0.0299 | 2.2877 | 0.5470 | 0.1927 | 0.0696 |
| 0.90 | θ̂_CLTE | 1.6141 | 0.3471 | 0.0897 | 0.0295 | 2.1661 | 0.5066 | 0.1820 | 0.0673 |
| 0.90 | θ̂_CTPE | 2.9699 | 0.4509 | 0.1003 | 0.0315 | 4.6848 | 0.8416 | 0.2591 | 0.0831 |
| 0.90 | θ̂_CHYE | 2.4057 | 0.3659 | 0.0905 | 0.0298 | 3.3791 | 0.5887 | 0.1930 | 0.0685 |
| 0.90 | θ̂_CMKLE | 1.3297 | 0.3388 | 0.0896 | 0.0295 | 1.8513 | 0.4768 | 0.1892 | 0.0672 |
| 0.95 | θ̂_CMLE | 6.2094 | 0.5876 | 0.1264 | 0.0452 | – | 1.2691 | 0.3689 | 0.1182 |
| 0.95 | θ̂_CRE | 2.8826 | 0.5053 | 0.1211 | 0.0438 | 14.0703 | 0.9128 | 0.3092 | 0.1106 |
| 0.95 | θ̂_CLE | 3.1569 | 0.6012 | 0.1311 | 0.0451 | 4.8303 | 1.2062 | 0.3953 | 0.1253 |
| 0.95 | θ̂_CKLE | 2.0292 | 0.4361 | 0.1160 | 0.0424 | 3.8755 | 0.6504 | 0.2565 | 0.1035 |
| 0.95 | θ̂_CLTE | 1.8357 | 0.4091 | 0.1123 | 0.0414 | 2.8603 | 0.5793 | 0.2344 | 0.0981 |
| 0.95 | θ̂_CTPE | 3.5840 | 0.6264 | 0.1384 | 0.0470 | 2.5531 | 1.2307 | 0.4299 | 0.1364 |
| 0.95 | θ̂_CHYE | 2.4111 | 0.4548 | 0.1147 | 0.0422 | 5.7158 | 0.7665 | 0.2641 | 0.1016 |
| 0.95 | θ̂_CMKLE | 1.3799 | 0.3940 | 0.1116 | 0.0412 | 4.0365 | 0.5385 | 0.2203 | 0.0980 |
| 0.99 | θ̂_CMLE | 13.3742 | 1.5293 | 0.3829 | 0.1628 | – | 4.5841 | 1.5118 | 0.4452 |
| 0.99 | θ̂_CRE | 3.1855 | 0.8722 | 0.3106 | 0.1383 | 42.2326 | 1.8904 | 0.8478 | 0.3402 |
| 0.99 | θ̂_CLE | 2.9229 | 1.3090 | 0.4124 | 0.1316 | 6.8225 | 2.6154 | 1.4272 | 0.4906 |
| 0.99 | θ̂_CKLE | 2.5074 | 0.4925 | 0.2490 | 0.1162 | 4.1452 | 0.7664 | 0.4131 | 0.2531 |
| 0.99 | θ̂_CLTE | 2.3037 | 0.4393 | 0.2239 | 0.1051 | 10.6213 | 0.8124 | 0.3177 | 0.2177 |
| 0.99 | θ̂_CTPE | 4.0679 | 1.3346 | 0.4518 | 0.1773 | 4.1796 | 2.7503 | 1.4471 | 0.5458 |
| 0.99 | θ̂_CHYE | 2.4182 | 0.6708 | 0.2598 | 0.1191 | 7.7287 | 1.3896 | 0.5985 | 0.2683 |
| 0.99 | θ̂_CMKLE | 1.8325 | 0.3979 | 0.2108 | 0.0998 | 5.5485 | 0.5626 | 0.2797 | 0.2033 |
Table 3. Estimated regression coefficients for the nuts dataset.

| Coef. | Estimator | Estimate | Lower CI | Upper CI | SMSE |
|---|---|---|---|---|---|
| x1 | θ̂_CMLE | 0.0255 | 0.0093 | 0.0417 | 17.3187 |
| x2 | | 2.4850 | 0.3523 | 4.6177 | |
| x3 | | 0.0367 | −0.0540 | 0.1273 | |
| x4 | | 0.0093 | −0.0103 | 0.0289 | |
| x5 | | 0.5178 | 0.0274 | 1.0082 | |
| x1 | θ̂_CRE | 0.0192 | 0.0044 | 0.0340 | 3.6395 |
| x2 | | 0.5328 | 0.0967 | 0.9688 | |
| x3 | | 0.0768 | −0.0010 | 0.1547 | |
| x4 | | 0.0087 | −0.0106 | 0.0280 | |
| x5 | | 0.5225 | 0.1237 | 0.9213 | |
| x1 | θ̂_CLE | 0.0220 | 0.0150 | 0.0467 | 5.7150 |
| x2 | | 1.3704 | 2.2750 | 6.0085 | |
| x3 | | 0.0603 | −0.0863 | 0.0894 | |
| x4 | | 0.0087 | −0.0094 | 0.0299 | |
| x5 | | 0.5508 | −0.0204 | 0.9579 | |
| x1 | θ̂_CKLE | 0.0129 | −0.0017 | 0.0274 | 6.8430 |
| x2 | | −1.4194 | −0.3236 | 0.2969 | |
| x3 | | 0.1170 | 0.0417 | 0.1923 | |
| x4 | | 0.0081 | −0.0111 | 0.0273 | |
| x5 | | 0.5272 | 0.1723 | 0.8822 | |
| x1 | θ̂_CLTE | 0.0189 | 0.0003 | 0.0295 | 1.4546 |
| x2 | | 0.4364 | 0.1538 | 0.9872 | |
| x3 | | 0.0788 | 0.0282 | 0.1804 | |
| x4 | | 0.0087 | −0.0109 | 0.0275 | |
| x5 | | 0.5228 | 0.1564 | 0.8951 | |
| x1 | θ̂_CTPE | 0.0202 | 0.0181 | 0.0488 | 2.9229 |
| x2 | | 0.8489 | 3.5568 | 6.2770 | |
| x3 | | 0.0703 | −0.0967 | 0.0700 | |
| x4 | | 0.0088 | −0.0096 | 0.0296 | |
| x5 | | 0.5218 | 0.2901 | 0.9949 | |
| x1 | θ̂_CHYE | 0.0184 | 0.0054 | 0.0353 | 0.9609 |
| x2 | | 0.3040 | 0.289 | 1.0766 | |
| x3 | | 0.0812 | −0.0094 | 0.1500 | |
| x4 | | 0.0088 | −0.0108 | 0.0281 | |
| x5 | | 0.5098 | 0.1019 | 0.9811 | |
| x1 | θ̂_CMKLE | 0.0164 | −0.0150 | 0.0105 | 0.8626 |
| x2 | | −0.2723 | −0.1213 | −0.2456 | |
| x3 | | 0.0902 | 0.1621 | 0.2793 | |
| x4 | | 0.0099 | −0.0148 | 0.0235 | |
| x5 | | 0.3891 | 0.4309 | 1.1188 | |
Table 4. Estimated regression coefficients for the aircraft data.

| Coef. | Estimator | Estimate | Lower CI | Upper CI | SMSE |
|---|---|---|---|---|---|
| x1 | θ̂_CMLE | 0.6257 | −0.2538 | 1.5051 | 0.6084 |
| x2 | | 0.1455 | 0.0500 | 0.2410 | |
| x3 | | −0.0166 | −0.0257 | −0.0076 | |
| x1 | θ̂_CRE | 0.4101 | −0.1587 | 0.9789 | 0.2565 |
| x2 | | 0.1622 | 0.0833 | 0.2411 | |
| x3 | | −0.0166 | −0.0257 | −0.0076 | |
| x1 | θ̂_CLE | 0.7855 | −0.1798 | 1.4106 | 0.5881 |
| x2 | | 0.1331 | 0.0557 | 0.2370 | |
| x3 | | −0.0166 | −0.0257 | −0.0076 | |
| x1 | θ̂_CKLE | 0.1945 | −0.1887 | 0.5777 | 0.0555 |
| x2 | | 0.1789 | 0.1076 | 0.2501 | |
| x3 | | −0.0167 | −0.0257 | −0.0077 | |
| x1 | θ̂_CLTE | 0.2058 | −0.1762 | 0.8525 | 0.1684 |
| x2 | | 0.1780 | 0.0914 | 0.2442 | |
| x3 | | −0.0167 | −0.0257 | −0.0076 | |
| x1 | θ̂_CTPE | 0.9608 | −0.0906 | 1.2988 | 0.5663 |
| x2 | | 0.1196 | 0.0621 | 0.2323 | |
| x3 | | −0.0166 | −0.0257 | −0.0076 | |
| x1 | θ̂_CHYE | 0.5135 | −0.2361 | 1.0430 | 0.2480 |
| x2 | | 0.1543 | 0.0804 | 0.2450 | |
| x3 | | −0.0166 | −0.0257 | −0.0076 | |
| x1 | θ̂_CMKLE | 0.1312 | −0.2520 | 0.5144 | 0.0253 |
| x2 | | 0.1834 | 0.1121 | 0.2546 | |
| x3 | | −0.0166 | −0.0256 | −0.0076 | |

