1. Introduction
Multicollinearity, a situation where predictor variables in a regression model are highly correlated, poses a significant challenge to statistical modeling. It inflates the variances of parameter estimates, making them unstable and unreliable. Mathematically, the regression model is defined as

$y = X\beta + \varepsilon$, (1)

where $y$ is the $n \times 1$ response vector, $X$ is an $n \times p$ design matrix, $\beta$ is the $p \times 1$ vector of parameters of the model, and $\varepsilon$ is the error vector of order n × 1. The OLS parameter estimates are given below:

$\hat{\beta}_{OLS} = (X'X)^{-1}X'y$. (2)

Here $X'X$ is the cross-product matrix of the predictors and $X'y$ is the cross-product of the predictors with the response, with the variance–covariance matrix given by

$\mathrm{Cov}(\hat{\beta}_{OLS}) = \sigma^{2}(X'X)^{-1}$. (3)
The condition number (CN) measures the multicollinearity and is mathematically defined as Equation (4):

$\mathrm{CN} = \lambda_{\max}/\lambda_{\min}$, (4)

where $\lambda_{\max}$ and $\lambda_{\min}$ represent the maximum and minimum eigenvalues, respectively, of the matrix $X'X$. In Equation (4), large values of CN (commonly CN > 1000) show significant multicollinearity, and the OLS estimates are unstable. Another tool to examine multicollinearity issues in data is the variance inflation factor (VIF). Mathematically, for the $j$th predictor, the VIF is computed as

$\mathrm{VIF}_{j} = 1/(1 - R_{j}^{2})$, (5)

where $R_{j}^{2}$ represents the measure of how well the $j$th predictor can be explained by the remaining predictors in Equation (5). A $\mathrm{VIF}_{j} > 10$ suggests high multicollinearity and the need for remedial measures. In the presence of multicollinearity, the matrix $X'X$ becomes nearly singular, resulting in inflated variances of the OLS estimates. For reliable parameter estimation in the presence of significant multicollinearity, a shrinkage parameter is used to mitigate the issue. Initially, ref. [1] introduced the shrinkage parameter ($k > 0$) for reliable parameter estimation of the regression model under multicollinearity. The ridge estimate of the model is defined as

$\hat{\beta}_{R} = (X'X + kI_{p})^{-1}X'y$, $k > 0$, (6)

where $k$ is the ridge parameter and $I_{p}$ is the $p \times p$ identity matrix. This adjustment effectively replaces the eigenvalues $\lambda_{i}$ of the $X'X$ matrix with $\lambda_{i} + k$, improving numerical stability.
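For concreteness, the following minimal R sketch computes the OLS estimate of Equation (2), the condition number of Equation (4), the VIFs of Equation (5), and the ridge estimate of Equation (6). The simulated `X` and `y` and the penalty value `k` are illustrative choices, not data from this paper.

```r
# Minimal R sketch (illustrative only): OLS, condition number, VIF,
# and ridge estimates for a generic design matrix X and response y.
set.seed(1)
n <- 50; p <- 4
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 1] + rnorm(n, sd = 0.05)                # induce collinearity
y <- X %*% c(1, 1, 1, 1) + rnorm(n)

XtX     <- crossprod(X)                               # X'X
b_ols   <- solve(XtX, crossprod(X, y))                # Equation (2)
ev      <- eigen(XtX, symmetric = TRUE)$values
cn      <- max(ev) / min(ev)                          # Equation (4)
vif     <- diag(solve(cor(X)))                        # Equation (5), standardized predictors
k       <- 0.1                                        # an arbitrary ridge penalty
b_ridge <- solve(XtX + k * diag(p), crossprod(X, y))  # Equation (6)
```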
Most researchers have introduced basic ridge parameters for highly collinear data. Garg [2] modified the ridge estimators to mitigate multicollinearity in regression analysis, offering a more stable solution when predictors are highly correlated with each other. Kibria [3] introduced average-based ridge estimators to improve the efficiency of parameter estimation for highly correlated predictor variables, while [4] developed generalized ridge regression to handle multicollinearity more comprehensively than earlier methods. Ahmed et al. [5] explored kernel ridge-type estimators for partial multicollinearity. Similarly, ref. [6] improved estimation strategies to effectively mitigate multicollinearity issues in the data. Gregory [7] modified the ridge regression method into a helpful tool for handling the challenges of multicollinearity in regression analysis. Reference [8] introduced rank ridge estimators to tackle highly correlated genetic data, while [9,10,11,12] introduced estimators for complex multicollinearity, making ridge regression a key tool across fields such as genetics, environmental science, and econometrics.
In most cases, the basic ridge parameter does not perform well under severe multicollinearity. To address this limitation, ref. [13] introduced the two-parameter ridge estimator, which incorporates an additional scaling factor $q$ alongside the penalty term $k$. The two-parameter ridge estimator of Equation (7) is expressed as

$\hat{\beta}(k, q) = q(X'X + kI_{p})^{-1}X'y$, (7)

where the scale factor $q$ is provided additionally to enable more flexible solutions to ridge regression problems and mitigate the effect of severe multicollinearity. The parameter $q$ is mathematically defined in Equation (8) as the least-squares-optimal scaling for a given $k$:

$\hat{q} = \dfrac{y'X\hat{\beta}_{R}(k)}{\hat{\beta}_{R}(k)'X'X\hat{\beta}_{R}(k)}$. (8)

The estimator $\hat{\beta}(k, q)$ reduces to the OLS estimator when $k = 0$ and $q = 1$, and to the ridge estimator when $q = 1$ and $k > 0$.
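To make the two-parameter construction concrete, a minimal R sketch of Equation (7) with the least-squares scale factor of Equation (8) follows; it reuses the illustrative `X` and `y` from the previous sketch, and the function name is our own.

```r
# Illustrative R sketch of the two-parameter ridge estimator, Equation (7).
two_param_ridge <- function(X, y, k) {
  p   <- ncol(X)
  b_k <- solve(crossprod(X) + k * diag(p), crossprod(X, y))  # ridge component
  # Scale factor q chosen by least squares for the given k (cf. Equation (8)):
  q <- as.numeric(crossprod(y, X %*% b_k) / crossprod(X %*% b_k))
  q * b_k                                                    # Equation (7)
}
b_kq <- two_param_ridge(X, y, k = 0.1)  # reduces to OLS as k -> 0 (then q = 1)
```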
Toker and Kaçıranlar [14] introduced and explored ridge estimators, their applications in cross-sectional analysis, and their role in improving stability under multicollinearity scenarios.
Many researchers have focused on two-parameter estimators rather than the basic ridge parameter to address severe multicollinearity among predictors in linear regression models, because two-parameter ridge estimators tend to perform better under such conditions than the basic ridge estimator; see references [15,16,17,18]. A recent study by Akhtar and Alharthi [19] introduced three new two-parameter ridge estimators based on the number of predictors, the eigenvalues, and the condition number.
However, existing CAREs remain sensitive to high noise levels and instability in variance, particularly under severe multicollinearity. This limitation motivates the present study. To address this gap, we propose four modified estimators, MSRE1, MSRE2, MSRE3, and MSRE4, that incorporate a variance-stabilizing term to overcome these limitations under severe multicollinearity and noise. By jointly penalizing multicollinearity through the condition number and estimation uncertainty through variance adjustment, the proposed estimators provide more stable performance across different regression settings.
This paper is organized as follows:
Section 2 covers materials and methodology, including existing and newly proposed estimators. The simulation algorithm is outlined in
Section 3, while
Section 4 focuses on the three real-life applications. Finally,
Section 5 provides concluding remarks.
2. Materials and Methodology
The canonical form of Equation (1) is expressed as

$y = Z\theta + \varepsilon$, (9)

where $Z = XQ$ is a modified design matrix and $\theta$ represents the vector of the parameters in the transformed space. The matrix $Q$ is an orthogonal matrix obtained from the eigenvectors of $X'X$, satisfying $Q'Q = QQ' = I_{p}$, where $I_{p}$ is the $p \times p$ identity matrix. The transformation $Z = XQ$ adjusts the design matrix with the principal components of $X'X$, simplifying the structure of the regression problem. Here, $Z'Z = Q'X'XQ = \Lambda$ is a diagonal matrix containing the eigenvalues $\lambda_{1} \le \lambda_{2} \le \cdots \le \lambda_{p}$, arranged in increasing order. The canonical parameters are related to the original parameters of the model by $\theta = Q'\beta$, enabling the model to function in the canonical space. The OLS estimator in Equation (2) can be written in canonical form as

$\hat{\theta}_{OLS} = \Lambda^{-1}Z'y$. (10)

The ridge estimate in Equation (6) can be expressed in canonical form as

$\hat{\theta}_{R} = (\Lambda + kI_{p})^{-1}Z'y$, (11)

which stabilizes the solution by adding $k$ to the diagonal elements of $\Lambda$, effectively shrinking the contributions of smaller eigenvalues and enhancing numerical stability, or

$\hat{\theta}_{R} = (\Lambda + kI_{p})^{-1}\Lambda\,\hat{\theta}_{OLS}$. (12)

The two-parameter ridge estimator in Equation (7) can be expressed in canonical form as

$\hat{\theta}(k, q) = q(\Lambda + kI_{p})^{-1}Z'y$. (13)
The MSEs of these estimators are given below in Equations (14)–(16):

$\mathrm{MSE}(\hat{\theta}_{OLS}) = \sigma^{2}\sum_{i=1}^{p} \dfrac{1}{\lambda_{i}}$, (14)

$\mathrm{MSE}(\hat{\theta}_{R}) = \sigma^{2}\sum_{i=1}^{p} \dfrac{\lambda_{i}}{(\lambda_{i} + k)^{2}} + k^{2}\sum_{i=1}^{p} \dfrac{\theta_{i}^{2}}{(\lambda_{i} + k)^{2}}$, (15)

$\mathrm{MSE}(\hat{\theta}(k, q)) = \sigma^{2}q^{2}\sum_{i=1}^{p} \dfrac{\lambda_{i}}{(\lambda_{i} + k)^{2}} + \sum_{i=1}^{p}\left(\dfrac{q\lambda_{i}}{\lambda_{i} + k} - 1\right)^{2}\theta_{i}^{2}$, (16)

where $\sigma^{2}$ represents the error variance of the linear regression model in Equation (1), $\hat{\theta}_{i}$ is the $i$th estimated value of the parameter $\theta$, and $\lambda_{i}$ is the $i$th eigenvalue of the matrix $X'X$.
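The canonical quantities are straightforward to compute numerically. The R sketch below, continuing the illustrative data from above with helper names of our own, forms $Z$ and $\Lambda$ and evaluates the theoretical ridge MSE of Equation (15) for a known $\theta$ and $\sigma^{2}$.

```r
# Illustrative R sketch: canonical form (Equations (9)-(13)) and the
# theoretical ridge MSE of Equation (15) for known theta and sigma^2.
canonical <- function(X) {
  e <- eigen(crossprod(X), symmetric = TRUE)
  list(Z = X %*% e$vectors, lambda = e$values, Q = e$vectors)
}
mse_ridge <- function(k, lambda, theta, sigma2) {
  sigma2 * sum(lambda / (lambda + k)^2) +   # variance term
    k^2 * sum(theta^2 / (lambda + k)^2)     # squared-bias term
}
cf    <- canonical(X)
theta <- drop(t(cf$Q) %*% c(1, 1, 1, 1))    # theta = Q' beta for the true beta above
mse_ridge(k = 0.1, lambda = cf$lambda, theta = theta, sigma2 = 1)
```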
2.1. Existing Ridge Estimators
Given that $\hat{\sigma}^{2}$ represents the estimated error variance of the model, $\lambda_{\max}$ denotes the maximum eigenvalue of the matrix $X'X$, and $\hat{\theta}_{i}$ denotes the $i$th estimated regression coefficient in canonical form, several established ridge estimators, widely used for addressing multicollinearity in regression analysis, are discussed below.
Hoerl and Kennard [1] introduced the shrinkage parameter to address multicollinearity issues and estimate reliable parameters of the regression model, referred to as the HK estimator in this study. Its mathematical expression is as follows:

$\hat{k}_{HK} = \dfrac{\hat{\sigma}^{2}}{\hat{\theta}_{\max}^{2}}$. (17)

Hoerl et al. [20] modified the HK estimator to better handle the multicollinearity issue, referring to it as the HKB estimator, mathematically defined as

$\hat{k}_{HKB} = \dfrac{p\hat{\sigma}^{2}}{\sum_{i=1}^{p}\hat{\theta}_{i}^{2}}$. (18)

Kibria [3] developed three ridge estimators, based on the arithmetic mean, the geometric mean, and the median, to address significant collinearity issues in the data. These are mathematically formulated as

$\hat{k}_{KAM} = \dfrac{1}{p}\sum_{i=1}^{p}\dfrac{\hat{\sigma}^{2}}{\hat{\theta}_{i}^{2}}$, (19)

$\hat{k}_{KGM} = \dfrac{\hat{\sigma}^{2}}{\left(\prod_{i=1}^{p}\hat{\theta}_{i}^{2}\right)^{1/p}}$, (20)

$\hat{k}_{KMed} = \mathrm{median}\left(\dfrac{\hat{\sigma}^{2}}{\hat{\theta}_{i}^{2}}\right)$. (21)
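A minimal R sketch of these classical shrinkage parameters, computed from the canonical quantities of the previous sketch, is given below; the variable names are illustrative.

```r
# Illustrative R sketch of the classical ridge parameters in
# Equations (17)-(21), computed from canonical OLS quantities.
theta_hat <- solve(diag(cf$lambda), crossprod(cf$Z, y))  # canonical OLS estimate
resid     <- y - cf$Z %*% theta_hat
sigma2hat <- sum(resid^2) / (n - p)                      # estimated error variance
ki        <- as.numeric(sigma2hat / theta_hat^2)         # sigma^2 / theta_i^2

k_HK   <- sigma2hat / max(theta_hat^2)         # Hoerl-Kennard, Equation (17)
k_HKB  <- p * sigma2hat / sum(theta_hat^2)     # Hoerl-Kennard-Baldwin, Equation (18)
k_KAM  <- mean(ki)                             # arithmetic mean, Equation (19)
k_KGM  <- sigma2hat / prod(theta_hat^2)^(1/p)  # geometric mean, Equation (20)
k_KMed <- median(ki)                           # median, Equation (21)
```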
The eigenvalue-based ridge estimator, referred to in this study as the KMS estimator, was explored by [21] and is given in Equation (22).
Reference [13] introduced the two-parameter ridge estimator technique for highly correlated data; it is referred to as the LC estimator in this research. Here, $k$ is a penalty parameter and $q$ is a scale factor, calculated using Equation (17) and Equation (8), respectively.
Similarly, Toker and Kaçıranlar [14] established a dual-parameter ridge estimator to handle highly collinear data, referred to in this study as the TK estimator, with the optimal values for $k$ and $q$ derived in their work.
Yasin et al. [22] modified two-parameter ridge estimators based on averages, denoted as MTPR1, MTPR2, and MTPR3. In these formulations, the modified ridge parameter for the $i$th predictor is calculated from Equation (23), and the corresponding scale factor $q$ is determined using Equation (8).
Most recently, Akhtar and Alharthi [19] developed three ridge estimators, referred to as CARE1, CARE2, and CARE3, to handle severe multicollinearity issues. They are mathematically defined as follows:
- i. Case 1 (CARE1):
- ii. Case 2 (CARE2):
- iii. Case 3 (CARE3):
For the second parameter $q$, Equation (8) is used. Among these three ridge estimators, we select Case 1, CARE1, as the competing estimator in this study. The following existing estimators are used in this study: HK, HKB, KAM, KGM, KMed, KMS, LC, TK, MTPR1, MTPR2, MTPR3, and CARE1. These estimators are compared with our newly proposed estimators.
2.2. Proposed Ridge Estimators
In this section, we modify the condition-adjusted ridge estimators (CAREs) to address their limitations under high multicollinearity and noise conditions. While the CARE framework effectively adjusts the ridge penalty based on the multicollinearity structure of the design matrix, it does not incorporate any direct mechanism to account for the error variance or the scale of the regression coefficients.
To overcome this shortcoming, we propose improved CAREs, denoted as MSRE1, MSRE2, MSRE3, and MSRE4, which augment the original penalty formulation by introducing a variance-stabilizing term based on the estimated error variance. This modification allows for more robust shrinkage behavior in severe multicollinearity settings, while maintaining adaptability to the eigenvalue structure of the design matrix.
The improved estimators retain the core idea of scaling the ridge penalty by the condition number, but now operate within a two-component structure that jointly penalizes based on both multicollinearity and estimation uncertainty. We propose the following multiscale ridge estimators (a hypothetical code sketch of this two-component structure appears below, after the component definitions).
The exponents $a$, $b$, and $f$ control the degree of shrinkage applied to the eigenvalues, the coefficient magnitudes, and the condition number, respectively. Their selected values represent different levels of penalization to improve stability under multicollinearity. To address potential arbitrariness, a sensitivity analysis is conducted, showing that the estimators are robust to moderate changes in these parameters. In Equations (28)–(31):
- $\lambda_{i}$ are the eigenvalues of the $X'X$ matrix;
- $\hat{\theta}_{i}$ are the estimated coefficients;
- $\hat{\theta}_{\max}$ is the maximum estimated regression coefficient;
- $\mathrm{CN}$ is the condition number of the design matrix, measuring collinearity;
- the term involving $\hat{\sigma}^{2}$ captures estimation uncertainty;
- $a$, $b$, and $f$ are non-negative exponents that control the influence of each component.
This formulation generalizes the original CARE penalty and leads to the development of four specific multiscale ridge estimators, given in Equations (28)–(31); the corresponding $q$-value is obtained from Equation (8).
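Because Equations (28)–(31) are not reproduced above, the sketch below is only a hypothetical illustration of the two-component structure just described: a condition-number-driven factor combined with the variance-stabilizing term, governed by the exponents a, b, and f. The function `msre_k_hypothetical` and its combination rule are placeholders of our own, not the authors' estimator.

```r
# HYPOTHETICAL sketch only: the exact penalties of Equations (28)-(31)
# are not reproduced here. This placeholder illustrates the two-component
# idea: a condition-number factor plus a variance-stabilizing term.
msre_k_hypothetical <- function(lambda, theta_hat, sigma2hat,
                                a = 1, b = 1, f = 1) {
  cn <- max(lambda) / min(lambda)                      # condition number
  collinearity_part <- cn^f / max(lambda)^a            # placeholder scaling
  variance_part     <- sigma2hat / max(theta_hat^2)^b  # stabilizing term
  collinearity_part + variance_part                    # k > 0 whenever sigma2hat > 0
}
k_msre <- msre_k_hypothetical(cf$lambda, theta_hat, sigma2hat)
b_msre <- two_param_ridge(X, y, k = k_msre)            # q from Equation (8)
```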
Main Theoretical Results.
2.3. Theoretical Comparison
We now present two key theorems that establish the theoretical properties of the proposed estimators.
Theorem 1 (Consistency and Performance Under Severe Multicollinearity). Under standard regularity conditions, the proposed estimators are consistent. Moreover, when severe multicollinearity is present and the error variance is positive, there exists a threshold $\kappa_{0}$ such that for all $\kappa > \kappa_{0}$, the proposed estimators achieve a lower MSE than the existing CARE.
Proof. As the sample size $n$ increases, the eigenvalues $\lambda_{i}$ grow proportionally to $n$. From the construction of the proposed estimators in Equations (28)–(31), the ridge parameter $k$ remains bounded (i.e., $O(1)$), because the variance-based component converges to a finite constant while the eigenvalue-based terms diverge. Consequently, $k/(\lambda_{i} + k) \to 0$ for each $i$. Additionally, from Equation (8), the scaling parameter $q$ converges to 1; see [23]. Therefore, the proposed estimator converges to the OLS estimator $\hat{\theta}_{OLS}$, which is known to be consistent. □
Under severe multicollinearity, the smallest eigenvalue $\lambda_{\min}$ approaches zero. For the CARE proposed by Akhtar and Alharthi [19], the ridge parameter tends to zero as $\lambda_{\min} \to 0$. In contrast, the proposed estimators include the additional variance-stabilizing term, ensuring that $k > 0$ for sufficiently large condition numbers. Following the classic result of [1], the derivative $\partial\,\mathrm{MSE}/\partial k$ is negative for small values of $k$ when multicollinearity is present. The reduction in variance is of order $O(k)$, while the increase in squared bias is only of order $O(k^{2})$. For sufficiently small $k$, the variance reduction outweighs the bias increase, leading to a net decrease in MSE [3,24].
Theorem 2 (Condition for Superiority Over Existing Estimators). Let $(k_{1}, q_{1})$ denote the parameters for the proposed estimator, and let $(k_{2}, q_{2})$ denote the parameters for any competing two-parameter ridge estimator. Then the proposed estimator yields a smaller MSE than the competing estimator if the difference between their MSE expressions, obtained from Equation (16), is positive; that is, the proposed estimator has a smaller MSE when the expression involving the parameters of both estimators satisfies this inequality.
Proof. The mean squared error difference between the two estimators is obtained by subtracting their respective MSE expressions derived from Equation (16). When this difference is positive, the proposed estimator has a smaller MSE. The inequality above directly represents this condition. In the specific case of comparing with CARE under severe multicollinearity, Theorem 1 guarantees that $k > 0$ and that $k$ typically exceeds the CARE penalty, which satisfies the required inequality. □
2.3.1. Asymptotic Properties
Under the assumptions that $X'X/n \to \Sigma$ as $n \to \infty$, where $\Sigma$ is a positive definite matrix, and that the error terms are independent with mean zero and finite variance, the proposed estimators are consistent estimators of $\theta$ as $n \to \infty$.
Proof. As $n \to \infty$, the eigenvalues $\lambda_{i}$ grow proportionally to $n$. Consequently, $\lambda_{i} \to \infty$ for each $i$. The proposed ridge parameter $k$ from Equations (28)–(31) satisfies $k = O(1)$, because its variance-based component converges to a finite constant by the consistency of the OLS estimators, while the eigenvalues diverge. Thus, $k/(\lambda_{i} + k) \to 0$. Moreover, from Equation (8), $q \to 1$ as $n \to \infty$ (see [23], Lemma 1). □
Therefore, the proposed estimators converge to $\hat{\theta}_{OLS}$, which is consistent [25,26].
Discussion of Theoretical Findings
The theoretical results presented above lead to several important observations about the proposed estimators:
When the variance-stabilizing term is set to zero, the proposed estimators simplify to the CARE introduced by [19]. This demonstrates that our approach serves as a natural extension of existing methodology, rather than a completely unrelated contribution.
A key limitation of many existing ridge estimators is that they can produce shrinkage parameters arbitrarily close to zero under certain conditions, effectively reverting to the unstable OLS estimator. By incorporating the variance-stabilizing term, our proposed estimators ensure that $k > 0$ in all practical scenarios, providing consistent stabilization.
The flexibility to choose the exponents $a$, $b$, and $f$ allows the estimator to adapt to varying degrees of multicollinearity. The four specific estimators proposed (MSRE1 through MSRE4) represent a spectrum of shrinkage intensities. MSRE1 and MSRE2 offer moderate penalization suitable for mild to moderate collinearity, while MSRE3 and MSRE4 provide more aggressive shrinkage designed for severe multicollinearity scenarios [1].
Theorem 1 establishes that under conditions of severe multicollinearity with non-negligible error variance, precisely the situation where standard ridge estimators struggle, the proposed estimators theoretically outperform the existing CARE. This theoretical finding will be empirically validated in the simulation studies and real-world applications presented in subsequent sections.
Remark 1. The specific exponent choices for the four proposed estimators are (a, b, f) = (1, 1, 1) for MSRE1, (2, 1, 1) for MSRE2, (6, 3, 3) for MSRE3, and (4, 4, 4) for MSRE4. These values were selected to represent a range of shrinkage intensities, with higher exponents providing stronger penalization suitable for more severe multicollinearity conditions.
By doing so, we not only improve the CAREs but also provide a structured classification of enhanced ridge estimators that better handle multicollinearity and error variance structures. These estimators are compared based on Monte Carlo simulation, with details provided in
Section 3.
3. Monte Carlo Simulation Design
Predictor variables under controlled multicollinearity settings are generated as follows. The predictors are simulated using Equation (32); many researchers (see references [17,27]) have adopted this method for the generation of predictor variables:

$x_{ij} = (1 - \rho^{2})^{1/2} z_{ij} + \rho z_{i,p+1}$, $i = 1, \ldots, n$, $j = 1, \ldots, p$. (32)

Here, $\rho$ denotes the correlation between predictors, $z_{ij}$ represents random samples drawn from a standard normal distribution, $p$ is the total number of predictors, and $n$ is the sample size. Several levels of correlation (0.80, 0.90, and 0.99) are considered, and different values of $p$ and $n$ are chosen to evaluate the model across different scenarios. Mathematically, the model is expressed as

$y_{i} = \beta_{1}x_{i1} + \beta_{2}x_{i2} + \cdots + \beta_{p}x_{ip} + \varepsilon_{i}$. (33)
The regression model parameters ($\beta$) are calculated by determining the optimal direction, following the methodology outlined by [28]. The error term ($\varepsilon_{i}$) is generated with mean zero and different values of the error variance ($\sigma^{2}$); three levels of error variance were considered. On this basis, the method ensures that the parameters of the model are oriented to minimize errors and enhance the model's predictive accuracy.
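For reference, a minimal R implementation of the generation scheme in Equation (32) follows; the sample size, dimension, and correlation level shown are illustrative.

```r
# Illustrative R sketch of the predictor-generation scheme, Equation (32):
# x_ij = sqrt(1 - rho^2) * z_ij + rho * z_i(p+1), with z ~ N(0, 1).
gen_predictors <- function(n, p, rho) {
  Z <- matrix(rnorm(n * (p + 1)), n, p + 1)
  sqrt(1 - rho^2) * Z[, 1:p] + rho * Z[, p + 1]
}
X_sim <- gen_predictors(n = 100, p = 4, rho = 0.99)
cor(X_sim)   # pairwise correlations are close to rho^2
```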
3.1. Estimated Mean Squared Errors
The estimators are biased; therefore, we measure their effectiveness based on estimated MSEs, calculated over $R$ Monte Carlo replications using the following formula:

$\widehat{\mathrm{MSE}}(\hat{\theta}) = \dfrac{1}{R}\sum_{r=1}^{R}\left(\hat{\theta}_{(r)} - \theta\right)'\left(\hat{\theta}_{(r)} - \theta\right)$,

where $\hat{\theta}_{(r)}$ is the estimate obtained in the $r$th replication.
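A compact Monte Carlo sketch of the replication-averaged MSE defined above is shown next. The wrapper `est_mse`, the replication count, and the fixed unit-norm coefficient vector are illustrative assumptions; the paper instead orients β along the optimal direction following [28].

```r
# Illustrative Monte Carlo sketch: replication-averaged estimated MSE
# of a generic estimator function(X, y) -> coefficient vector.
est_mse <- function(estimator, n, p, rho, sigma, R_rep = 1000) {
  beta <- rep(1 / sqrt(p), p)            # fixed unit-norm beta (simplification)
  sq_err <- replicate(R_rep, {
    X  <- gen_predictors(n, p, rho)
    y  <- X %*% beta + rnorm(n, sd = sigma)
    bh <- estimator(X, y)
    sum((bh - beta)^2)
  })
  mean(sq_err)
}
# Example: the two-parameter ridge estimator with a fixed penalty k = 0.1.
est_mse(function(X, y) two_param_ridge(X, y, k = 0.1),
        n = 50, p = 4, rho = 0.90, sigma = 1)
```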
The performance of the existing and proposed estimators is difficult to evaluate theoretically; therefore, Equations (10)–(31) were used alongside a specific algorithm to estimate MSEs. The accuracy of the estimators was assessed based on the MSE criterion. All the simulation analyses were carried out using the R programming language. These results, summarized in Tables A1–A6 in Appendix A, report the estimated MSEs for each ridge estimator, with the lowest values bolded for easy reference in the tables. A lower MSE reflects an estimator's ability to closely approximate the true parameters, indicating higher accuracy and reliability. The simulation results are further discussed in Section 3.2.
We created a summary in Table 1, based on the results in the simulation tables, highlighting the performance of the various ridge estimators across different scenarios. The summary in Table 1 makes clear that our newly modified estimators perform better in all scenarios than the other existing methods.
3.2. Simulation Results Analysis
Effect of Sample Size (n): As the sample size increases, all estimators show a decrease in the estimated MSEs. The newly proposed MSRE estimators, in particular, perform well across all sample sizes, often surpassing traditional OLS and other existing methods, especially for larger n values.
Effect of Number of Predictors (p): When the number of predictors increases, MSEs generally rise. However, MSRE estimators remain more stable than OLS and other methods as p grows, showing better resilience in high-dimensional settings.
Effect of Multicollinearity (ρ): High multicollinearity (ρ = 0.99) significantly impacts OLS, leading to higher MSEs. The MSRE estimators handle multicollinearity more effectively, providing lower MSEs compared to OLS and other methods.
Effect of Error Variance (σ²): As the error variance increases, estimated MSEs rise for all estimators. The newly proposed MSRE estimators are less sensitive to larger error variances, maintaining superior performance compared to other methods, especially in high error variance scenarios.
5. Conclusions
This research study introduced four new ridge estimators, named MSRE1, MSRE2, MSRE3, and MSRE4, to tackle the severe challenges of multicollinearity in regression models. The study comprehensively evaluated the performance of various estimators under different scenarios, using Monte Carlo simulations and three real-world datasets to highlight their effectiveness in addressing severe multicollinearity. The newly proposed estimators showed better performance than the other existing estimators, with the best-performing MSRE variant consistently achieving the minimum MSE across different datasets and scenarios.
In the simulation results, the proposed estimators excelled under different levels of multicollinearity, sample sizes, and predictor counts, striking an effective balance between bias and variance. The real-world datasets further validated these findings: in the Economic Indicators dataset, the best-performing MSRE variant outperformed the others by minimizing estimation errors even under severe multicollinearity; in the Body Fat dataset, it showed resilience against high condition numbers and variance inflation factors, delivering the best MSE; and in the Automobile Demand dataset, it achieved the lowest MSE, proving its robustness and reliability in practical applications.
Overall, the adaptive regularization techniques employed in the proposed estimators provide a clear advantage over traditional ridge regression methods. The best-performing MSRE variant, in particular, stood out as a highly effective and versatile estimator for managing multicollinearity, reducing estimation errors, and improving predictive accuracy in both simulated and real-world settings.
Future research could focus on extending the applicability of the newly proposed estimators to dynamic systems, non-linear models, and high-dimensional datasets, while also exploring improvements through advanced computational and machine learning techniques.