Article

Improved Data-Driven Shrinkage Estimators for Regression Models Under Severe Multicollinearity

by
Ali Rashash R. Alzahrani
1,* and
Asma Ahmad Alzahrani
2
1
Mathematics Department, Faculty of Sciences, Umm Al-Qura University, Makkah 24382, Saudi Arabia
2
Department of Mathematics, Faculty of Science, Al-Baha University, Al-Baha 65522, Saudi Arabia
*
Author to whom correspondence should be addressed.
Mathematics 2026, 14(8), 1245; https://doi.org/10.3390/math14081245
Submission received: 8 March 2026 / Revised: 30 March 2026 / Accepted: 5 April 2026 / Published: 9 April 2026
(This article belongs to the Special Issue Statistical Machine Learning: Models and Its Applications)

Abstract

Multicollinearity is a critical issue in regression analysis, often resulting in inflated variances and unstable parameter estimates. Ridge regression is a widely adopted solution to address this challenge; however, existing ridge estimators are typically tailored to specific scenarios, limiting their universal applicability. Akhtar and Alharthi developed condition-adjusted ridge estimators (CAREs) to handle severe multicollinearity. However, their approach did not account for the error variance in the estimation process. In this study, we propose improvements to these CAREs by incorporating the error variance, resulting in the development of multiscale ridge estimators (MSRE1, MSRE2, MSRE3 and MSRE4) that more effectively address the challenges posed by severe multicollinearity. We compare the performance of our newly proposed estimators with ordinary least squares (OLS) and other existing ridge estimators using both simulation studies and real-life datasets. The evaluation, based on estimated mean squared error (MSE), demonstrates that the proposed estimators consistently outperform existing methods, particularly in scenarios with significant multicollinearity, larger sample sizes, and higher predictor dimensions. Results from three real-life datasets further validate the proposed estimators' ability to reduce estimation error and improve predictive accuracy across diverse practical applications.

1. Introduction

Multicollinearity, a situation where predictor variables in a regression model are highly correlated, poses a significant challenge to the analysis of statistical models. It inflates the variances of parameter estimates, making them unstable and unreliable. Mathematically, the regression model is defined as
y = X φ + ϵ ,
In Equation (1), y ∈ R^(n×1) is the response vector, X ∈ R^(n×p) is the design matrix, φ ∈ R^(p×1) is the vector of model parameters, and ϵ is the error vector of order n × 1. The OLS parameter estimates are given below:
φ̂_OLS = S^(−1) ω,  (2)
where S = X′X and ω = X′y, with the variance–covariance matrix given by
Cov(φ̂_OLS) = σ² S^(−1),  (3)
The condition number (CN) measures multicollinearity and is mathematically defined in Equation (4):
κ = λ_max / λ_min,  (4)
where λ_max and λ_min represent the maximum and minimum eigenvalues, respectively, of the matrix S = X′X. Values of κ ≥ 30 indicate significant multicollinearity, under which the OLS estimates are unstable. Another tool to examine multicollinearity in data is the variance inflation factor (VIF). Mathematically, for the i-th predictor, the VIF is computed as
VIF_i = 1 / (1 − R_i²),  (5)
Here, R_i² in Equation (5) measures how well the i-th predictor can be explained by the remaining predictors. A VIF_i > 10 suggests high multicollinearity and the need for remedial measures. In the presence of multicollinearity, the matrix S becomes nearly singular, resulting in inflated variances of the OLS estimates. To obtain reliable parameter estimates under significant multicollinearity, a shrinkage parameter is used to mitigate the issue. Ref. [1] first introduced the shrinkage parameter (k) to obtain reliable regression parameter estimates under multicollinearity. The ridge estimate of the model is defined as
φ̂_ridge = (S + kI)^(−1) ω,  (6)
where k is the ridge parameter and I is the identity matrix. This adjustment effectively replaces the eigenvalues λ_i of S with λ_i + k, improving numerical stability. Many researchers have introduced basic ridge parameters for highly collinear data. Garg [2] modified ridge estimators to mitigate multicollinearity in regression analysis, offering a more stable solution when predictors are highly correlated with each other. Kibria [3] introduced average-based ridge estimators to improve the efficiency of parameter estimation for highly correlated predictor variables, while [4] developed generalized ridge regression to handle multicollinearity more comprehensively than earlier methods. Ahmed et al. [5] explored kernel ridge-type estimators for partial multicollinearity.
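As a concrete illustration of Equations (2), (4) and (6), the following NumPy sketch (illustrative only; the paper's own computations were carried out in R) builds a nearly collinear design, evaluates the condition number of X′X, and contrasts the OLS and ridge solutions. The simulated data and the value k = 0.1 are arbitrary choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3

# Two nearly identical predictors plus one independent one.
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 1.0, 1.0]) + 0.5 * rng.normal(size=n)

S = X.T @ X                                   # the Gram matrix X'X
eigvals = np.linalg.eigvalsh(S)
kappa = eigvals.max() / eigvals.min()         # condition number, Eq. (4)

k = 0.1
phi_ols = np.linalg.solve(S, X.T @ y)                    # OLS, Eq. (2)
phi_ridge = np.linalg.solve(S + k * np.eye(p), X.T @ y)  # ridge, Eq. (6)

# Ridge shifts every eigenvalue of X'X from lambda_i to lambda_i + k.
shifted = np.linalg.eigvalsh(S + k * np.eye(p))
print(kappa > 30, np.allclose(shifted, eigvals + k))
```

Because each canonical component of the ridge solution is the OLS component scaled by λ_i/(λ_i + k) < 1, the ridge coefficient vector always has a strictly smaller norm than the OLS vector.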
Similarly, ref. [6] improved estimation strategies to effectively mitigate multicollinearity issues of the data. Gregory [7] modified the ridge regression method to be a helpful tool for handling the challenges of multicollinearity in regression analysis. Reference [8] introduced rank ridge estimators to tackle highly correlated genetic data, while [9,10,11,12] introduced estimators for complex multicollinearity, making it a key tool across fields like genetics, environmental science, and econometrics of correlated datasets.
In most cases, the basic ridge parameter does not perform well under severe multicollinearity. To address this limitation, ref. [13] introduced the two-parameter ridge estimator, which incorporates an additional scaling factor d alongside the penalty term k. The two-parameter ridge estimator in Equation (7) is expressed as
φ̂(d, k) = d (S + kI)^(−1) ω,  (7)
where d is provided additionally to enable more flexible solutions to ridge regression problems and to mitigate the effect of severe multicollinearity. The parameter d is mathematically defined in Equation (8):
d̂ = [ω′ (S + kI)^(−1) ω] / [ω′ (S + kI)^(−1) S (S + kI)^(−1) ω],  (8)
The estimator φ̂(d, k) reduces to the OLS estimator when d = 1 and k = 0, and to the ridge estimator when d = 1 and k > 0. Toker and Kaçıranlar [14] introduced and explored ridge estimators, their applications in cross-sectional analysis, and their role in improving stability under multicollinearity scenarios.
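The scale factor d̂ of Equation (8) is a ratio of two quadratic forms in ω = X′y, so it is positive whenever X′X is positive definite. A minimal NumPy sketch follows (synthetic data; the choice k = 0.5 is arbitrary, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 3
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=n), rng.normal(size=n)])
y = X @ np.ones(p) + 0.3 * rng.normal(size=n)

S, w = X.T @ X, X.T @ y            # X'X and omega = X'y
k = 0.5
A = np.linalg.inv(S + k * np.eye(p))

# Data-driven scale factor d, Eq. (8): ratio of two positive quadratic forms.
d = (w @ A @ w) / (w @ A @ S @ A @ w)

phi_ridge = A @ w                  # plain ridge (d = 1)
phi_dk = d * phi_ridge             # two-parameter estimator, Eq. (7)
phi_ols = np.linalg.solve(S, w)
print(d, phi_dk)
```

Setting d = 1 and k = 0 in these last lines recovers the OLS solution, matching the reduction property stated above.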
Many researchers have focused on two-parameter estimators rather than the basic ridge parameter to address severe multicollinearity among predictors in linear regression models, because two-parameter ridge estimators tend to perform better under such conditions compared to the basic ridge estimator; see references [15,16,17,18]. A recent study by Akhtar and Alharthi [19] introduced three new two-parameter ridge estimators based on the number of predictors, eigenvalues and the condition number.
However, existing CAREs remain sensitive to high noise levels and unstable variances, particularly under severe multicollinearity. This limitation motivates the present study. To address this gap, we propose four modified estimators, MSRE1, MSRE2, MSRE3 and MSRE4, that incorporate a variance-stabilizing term to overcome these limitations under severe multicollinearity and noise. By jointly penalizing multicollinearity through the condition number and estimation uncertainty through a variance adjustment, the proposed estimators provide more stable performance across different regression settings.
This paper is organized as follows: Section 2 covers materials and methodology, including existing and newly proposed estimators. The simulation algorithm is outlined in Section 3, while Section 4 focuses on the two real-life applications. Finally, Section 5 provides concluding remarks.

2. Materials and Methodology

The canonical form of Equation (1) is expressed as
y = Wθ + ϵ,  (9)
where W = XQ is the transformed design matrix and θ represents the vector of parameters in the transformed space. The matrix Q is an orthogonal matrix obtained from the eigenvectors of S = X′X, satisfying Q′Q = I_p, where I_p is the p × p identity matrix. The transformation W = XQ aligns the design matrix with the principal components of X, simplifying the structure of the regression problem.
Λ = Q′ S Q,  (10)
Here, Λ is a diagonal matrix containing the eigenvalues λ_1, λ_2, …, λ_p, arranged in increasing order. The canonical parameters are related to the original model parameters by θ = Q′φ, enabling the model to operate in the canonical space. The OLS estimator in Equation (2) can be written in canonical form as
θ̂ = Λ^(−1) W′y,  (11)
The ridge estimate in Equation (6) can be expressed in canonical form as
θ̂_k = (Λ + k I_p)^(−1) W′y,  (12)
which stabilizes the solution by adding k to the diagonal elements of Λ, effectively shrinking the contributions of smaller eigenvalues and enhancing numerical stability; more generally, a diagonal penalty K = Diag(k_1, k_2, …, k_p), with k_i > 0 for i = 1, 2, …, p, may be used. The two-parameter ridge estimator in Equation (7) can be expressed in canonical form as
θ̂(d, k) = d (Λ + k I_p)^(−1) W′y,  (13)
The MSEs of these estimators are given below in Equations (14)–(16):
MSE(θ̂_OLS) = Σ_{i=1}^p σ̂² / λ_i,  (14)
MSE(θ̂_K) = σ̂² Σ_{i=1}^p λ_i / (λ_i + k)² + Σ_{i=1}^p k² θ̂_i² / (λ_i + k)²,  (15)
MSE(θ̂(d, k)) = d² Σ_{i=1}^p λ_i σ̂² / (λ_i + k)² + Σ_{i=1}^p (d λ_i / (λ_i + k) − 1)² θ̂_i²,  (16)
where σ̂² is the estimated error variance of the linear regression model in Equation (1), θ̂_i is the i-th estimated value of the parameter vector θ (the canonical OLS estimate), and λ_i is the i-th eigenvalue of the matrix S = X′X.
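The canonical quantities are conveniently obtained from an eigendecomposition of X′X; numpy.linalg.eigh returns the eigenvalues in increasing order, matching the convention above. A small sketch with arbitrary synthetic data (NumPy here, though the paper's computations were in R):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

S = X.T @ X
lam, Q = np.linalg.eigh(S)        # Lambda = Q' (X'X) Q, eigenvalues ascending
W = X @ Q                         # canonical design matrix

theta_ols = (W.T @ y) / lam       # Lambda^{-1} W'y (Lambda is diagonal)
phi_ols = np.linalg.solve(S, X.T @ y)

# Canonical and original estimates are linked by theta = Q' phi.
print(np.allclose(theta_ols, Q.T @ phi_ols))

k = 0.2
theta_k = (W.T @ y) / (lam + k)   # ridge in canonical form
```

Working in the canonical space turns every matrix inverse into an elementwise division, which is why the MSE expressions above decompose into per-eigenvalue sums.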

2.1. Existing Ridge Estimators

Given that σ̂² represents the estimated error variance of the model, λ_max denotes the maximum eigenvalue of the matrix S = X′X, and θ̂_max = Max(θ̂_1, θ̂_2, …, θ̂_p) denotes the largest estimated regression coefficient, several established ridge estimators, widely used for addressing multicollinearity in regression analysis, are discussed below.
Hoerl and Kennard [1] introduced the shrinkage parameter to address multicollinearity issues and estimate reliable parameters of the regression model, referred to as the HK estimator in this study. Its mathematical expression is as follows:
k̂_HK = σ̂² / θ̂_max²,  (17)
Hoerl et al. [20] modified the HK estimator to better handle multicollinearity, referring to it as the HKB estimator, mathematically defined as
k̂_HKB = p σ̂² / Σ_{i=1}^p θ̂_i²,  (18)
Kibria [3] developed three average-based ridge estimators, using the arithmetic mean, the geometric mean and the median, to address significant collinearity in the data. These are mathematically formulated as
k̂_AM = (1/p) Σ_{i=1}^p σ̂² / θ̂_i²,  k̂_GM = σ̂² / (Π_{i=1}^p θ̂_i²)^(1/p),  k̂_Med = Med(σ̂² / θ̂_i²).
The eigenvalue-based ridge estimator, referred to in this study as the KMS estimator, was explored by [21] and is expressed as
k̂_KMS = λ_max Σ_{i=1}^p θ̂_i (σ̂² / θ̂_max²),  where λ_max = Max(λ_1, λ_2, …, λ_p).
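Once the canonical estimates θ̂_i and σ̂² are available, these closed-form ridge parameters are one-liners. The NumPy sketch below evaluates the HK, HKB and Kibria-type rules on synthetic data (the KMS rule is omitted here); the data are arbitrary, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 60, 4
X = rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.normal(size=n)

S = X.T @ X
lam, Q = np.linalg.eigh(S)
theta = ((X @ Q).T @ y) / lam                 # canonical OLS estimates
resid = y - X @ np.linalg.solve(S, X.T @ y)
sigma2 = resid @ resid / (n - p)              # estimated error variance

k_HK  = sigma2 / np.max(theta**2)             # Hoerl-Kennard
k_HKB = p * sigma2 / np.sum(theta**2)         # Hoerl-Kennard-Baldwin
k_AM  = np.mean(sigma2 / theta**2)            # Kibria, arithmetic mean
k_GM  = sigma2 / np.prod(theta**2)**(1 / p)   # Kibria, geometric mean
k_Med = np.median(sigma2 / theta**2)          # Kibria, median
print(k_HK, k_HKB, k_AM, k_GM, k_Med)
```

Since k̂_HK is the minimum of the ratios σ̂²/θ̂_i², it never exceeds the arithmetic-mean or median variants.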
Reference [13] introduced the two-parameter ridge estimation technique for highly correlated data; it is referred to as the LC estimator in this research, where k is a penalty parameter and d is a scale factor. k and d are calculated using Equation (17) and Equation (8), respectively.
Similarly, Toker and Kaçıranlar [14] established a dual-parameter ridge estimator to handle highly collinear data, referred to in this study as the TK estimator, with the optimal values d̂_opt and k̂_opt derived as follows:
d̂_opt = Σ_{i=1}^p [θ̂_i² λ_i / (λ_i + k)] / Σ_{i=1}^p [(θ̂_i² λ_i² + σ̂² λ_i) / (λ_i + k)²],
k̂_opt = [d̂_opt Σ_{i=1}^p σ̂² λ_i + (d̂_opt − 1) Σ_{i=1}^p θ̂_i² λ_i²] / Σ_{i=1}^p θ̂_i² λ_i.
Yasin et al. [22] modified two-parameter ridge estimators based on averages, denoted as MTPR1, MTPR2, and MTPR3, with their respective mathematical forms expressed as
k̂*_MTPR1 = (1/p) Σ_{i=1}^p k_i*,  k̂*_MTPR2 = (Π_{i=1}^p k_i*)^(1/p),  k̂*_MTPR3 = p / Σ_{i=1}^p (1/k_i*).
In these formulations, the modified ridge parameter for the i-th predictor is calculated as
k_i* = λ_i θ̂_i k̂_opt,  (23)
and the k-values of the estimators MTPR1, MTPR2, and MTPR3 obtained from Equation (23) are used together with Equation (8) to determine d.
Most recently, Akhtar and Alharthi [19] developed three ridge estimators, referred to as CARE1, CARE2, and CARE3, to handle severe multicollinearity issues. They are mathematically defined as follows:
k̂_CARE_T = (γ/p) Σ_{i=1}^p λ_i^r θ̂_i / (1 + cond(C))^ω,  T = 1, 2, 3
i.
Case 1 (CARE1): ω = 1, r = 1 and γ = 1
k̂_CARE1 = (1/p) Σ_{i=1}^p λ_i θ̂_i / (1 + cond(C)),
ii.
Case 2 (CARE2): ω = 2, r = 1 and γ = 2
k̂_CARE2 = (2/p) Σ_{i=1}^p λ_i θ̂_i / (1 + cond(C))²,
iii.
Case 3 (CARE3): ω = 3, r = 2 and γ = 1
k̂_CARE3 = (1/p) Σ_{i=1}^p λ_i² θ̂_i / (1 + cond(C))³
For the second parameter, Equation (8) is used. Among these three ridge estimators, we select Case 1 (CARE1) as the competitor estimator in this study. The following existing estimators are used in this study: HK, HKB, KAM, KGM, KMed, KMS, LC, TK, MTPR1, MTPR2, MTPR3, and CARE1. These estimators are compared with our newly proposed estimators.
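A NumPy sketch of the CARE1–CARE3 penalties on synthetic correlated data follows. Two assumptions are made here and flagged in the comments: the condition number is computed from X′X, and absolute coefficient values are used so the penalty stays positive (the sign handling is left implicit in the original formulas).

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 50, 4
z = rng.normal(size=(n, p + 1))
rho = 0.95
X = np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]   # correlated predictors
y = X @ np.ones(p) + rng.normal(size=n)

S = X.T @ X
lam, Q = np.linalg.eigh(S)
theta = ((X @ Q).T @ y) / lam
cond = lam.max() / lam.min()           # condition number of X'X (assumption)

t = np.abs(theta)                      # absolute coefficients (assumption)
k_CARE1 = (1 / p) * np.sum(lam * t) / (1 + cond)
k_CARE2 = (2 / p) * np.sum(lam * t) / (1 + cond)**2
k_CARE3 = (1 / p) * np.sum(lam**2 * t) / (1 + cond)**3
print(k_CARE1, k_CARE2, k_CARE3)
```

Because each case divides by a higher power of (1 + cond), a larger condition number drives the CARE2 and CARE3 penalties toward zero faster than CARE1; this sensitivity is the behavior the modifications in the next subsection target.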

2.2. Proposed Ridge Estimators

In this section, we modify the condition-adjusted ridge estimators (CAREs) to address their limitations under high multicollinearity and noise conditions. While the CAREs framework effectively adjusts the ridge penalty based on the multicollinearity structure of the design matrix, it does not incorporate any direct mechanism to account for the error variance or the scale of regression coefficients.
To overcome this shortcoming, we propose improved CAREs, denoted as MSRE1, MSRE2, MSRE3 and MSRE4, which augment the original penalty formulation by introducing a variance-stabilizing term of the form σ̂²/θ̂_max². This modification allows for more robust shrinkage behavior in severe multicollinearity settings, while maintaining adaptability to the eigenvalue structure of the design matrix.
The improved estimators retain the core idea of scaling the ridge penalty by the condition number of S = X′X, but now operate within a two-component structure that penalizes jointly for multicollinearity and estimation uncertainty. We propose the following multiscale ridge estimators:
k̂_j = (1/p) Σ_{i=1}^p λ_i^a θ̂_i^b / (1 + cond(X′X))^f + σ̂² / θ̂_max²,  where j = 1, 2, 3 and 4.  (27)
a, b, and f control the degree of shrinkage applied to eigenvalues, coefficient magnitudes, and the condition number, respectively. Their selected values represent different levels of penalization to improve stability under multicollinearity. To address potential arbitrariness, a sensitivity analysis is conducted, showing that the estimators are robust to moderate changes in these parameters.
  • λ_i are the eigenvalues of the matrix S = X′X.
  • θ̂_i are the estimated coefficients.
  • θ̂_max² is the square of the largest estimated regression coefficient.
  • cond(X′X) is the condition number of the design matrix, measuring collinearity.
  • The term σ̂²/θ̂_max² captures estimation uncertainty.
  • a, b, and f are non-negative exponents that control the influence of each component.
    k̂_1 = (1/p) Σ_{i=1}^p λ_i θ̂_i / (1 + cond(X′X)) + σ̂²/θ̂_max²,  for a = 1, b = 1 and f = 1.  (28)
    k̂_2 = (1/p) Σ_{i=1}^p λ_i² θ̂_i / (1 + cond(X′X)) + σ̂²/θ̂_max²,  for a = 2, b = 1 and f = 1.  (29)
    k̂_3 = (1/p) Σ_{i=1}^p λ_i⁶ θ̂_i³ / (1 + cond(X′X))³ + σ̂²/θ̂_max²,  for a = 6, b = 3 and f = 3.  (30)
    k̂_4 = (2/p) Σ_{i=1}^p λ_i⁴ θ̂_i⁴ / (1 + cond(X′X))⁴ + σ̂²/θ̂_max²,  for a = 4, b = 4 and f = 4.  (31)
This formulation generalizes the original CARE penalty and leads to the four specific multiscale ridge estimators in Equations (28)–(31); the corresponding d-value is obtained from Equation (8).
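The whole family reduces to one small function. The NumPy sketch below (synthetic data; absolute coefficient values and a condition number computed from X′X are assumptions, as the original formulas leave both implicit) evaluates the four MSRE parameters of Equations (28)–(31):

```python
import numpy as np

def msre_k(lam, theta, sigma2, a, b, f, scale=1.0):
    """Multiscale ridge parameter: CARE-style penalty plus variance term."""
    p = lam.size
    cond = lam.max() / lam.min()
    penalty = (scale / p) * np.sum(lam**a * np.abs(theta)**b) / (1 + cond)**f
    return penalty + sigma2 / np.max(theta**2)   # variance-stabilizing term

rng = np.random.default_rng(5)
n, p = 50, 4
z = rng.normal(size=(n, p + 1))
rho = 0.99
X = np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]
y = X @ np.ones(p) + rng.normal(size=n)

S = X.T @ X
lam, Q = np.linalg.eigh(S)
theta = ((X @ Q).T @ y) / lam
resid = y - X @ np.linalg.solve(S, X.T @ y)
sigma2 = resid @ resid / (n - p)

k1 = msre_k(lam, theta, sigma2, 1, 1, 1)             # MSRE1, Eq. (28)
k2 = msre_k(lam, theta, sigma2, 2, 1, 1)             # MSRE2, Eq. (29)
k3 = msre_k(lam, theta, sigma2, 6, 3, 3)             # MSRE3, Eq. (30)
k4 = msre_k(lam, theta, sigma2, 4, 4, 4, scale=2.0)  # MSRE4, Eq. (31)
print(k1, k2, k3, k4)
```

Every MSRE value is bounded below by σ̂²/θ̂_max², i.e., by the HK parameter, so the penalty cannot collapse to zero however large the condition number becomes.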
Main Theoretical Results.

2.3. Theoretical Comparison

We now present two key theorems that establish the theoretical properties of the proposed estimators.
Theorem 1 
(Consistency and Performance Under Severe Multicollinearity). Under standard regularity conditions, the proposed estimators θ̂_prop^(j) are consistent. Moreover, when severe multicollinearity is present (κ ≥ 30) and the error variance is positive (σ² > 0), there exists a threshold κ₀ such that for all κ > κ₀, the proposed estimators achieve a lower MSE than the existing CARE:
MSE(θ̂_prop^(j)) < MSE(θ̂_CARE).
Proof. 
As the sample size n increases, the eigenvalues λ_i grow proportionally to n. From the construction of the proposed estimators in Equations (28)–(31), the ridge parameter k_prop^(j) remains bounded relative to the eigenvalues, because σ̂²/θ̂_max² converges to a finite constant while the denominator terms λ_i in k_prop^(j)/(λ_i + k_prop^(j)) diverge. Consequently, k_prop^(j)/(λ_i + k_prop^(j)) → 0. Additionally, from Equation (8), the scaling parameter d_prop converges to 1; see [23]. Therefore, θ̂_prop^(j) converges to the OLS estimator θ̂_OLS, which is known to be consistent. □
Under severe multicollinearity, the smallest eigenvalue λ_min approaches zero. For the CARE proposed by Akhtar and Alharthi [19], the ridge parameter is k̂_CARE = (1/p) Σ_{i=1}^p λ_i θ̂_i / (1 + κ), which tends to zero as λ_min → 0. In contrast, the proposed estimators include the additional term σ̂²/θ̂_max² > 0, ensuring that k_prop^(j) > k_CARE for sufficiently large condition numbers. Following the classic result of [1], the derivative ∂MSE/∂k is negative for small values of k when multicollinearity is present. The reduction in variance is of order O(1/λ_min), while the increase in squared bias is only of order O(k²). For sufficiently small λ_min, the variance reduction outweighs the bias increase, leading to a net decrease in MSE [3,24].
Theorem 2 
(Condition for Superiority Over Existing Estimators). Let k_prop^(j) and d_prop^(j) denote the parameters of MSRE_j, and let k_exist and d_exist denote the parameters of any competing two-parameter ridge estimator. Then MSRE_j yields a smaller MSE than the competing estimator if the following inequality holds:
MSE(θ̂(d_exist, k_exist)) − MSE(θ̂(d_prop^(j), k_prop^(j))) > 0; that is, the proposed estimator has the smaller MSE whenever the parameters of the two estimators satisfy this inequality.
Proof. 
The mean squared error difference between the two estimators is obtained by subtracting their respective MSE expressions derived from Equation (16). When this difference is positive, the proposed estimator has the smaller MSE. The inequality above directly represents this condition. In the specific case of comparing with CARE under severe multicollinearity, Theorem 1 guarantees that k_prop^(j) > k_CARE and typically d_prop^(j) < 1, which satisfies the required inequality. □

2.3.1. Asymptotic Properties

Under the assumptions that lim_{n→∞} (1/n) X′X = Σ, where Σ is a positive definite matrix, and ϵ_i ~ i.i.d.(0, σ²), the proposed estimators θ̂_prop^(j) are consistent estimators of θ as n → ∞.
Proof. 
As n → ∞, the eigenvalues λ_i grow proportionally to n; consequently, λ_i → ∞ for each i. The proposed ridge parameter k_prop^(j) from Equations (28)–(31) satisfies k_prop^(j) = O(1) relative to the eigenvalues, because σ̂²/θ̂_max² > 0 converges to a finite constant by the consistency of the OLS estimators. Thus, k_prop^(j)/(λ_i + k_prop^(j)) → 0. Moreover, from Equation (8), d_prop → 1 as n → ∞ (see [23], Lemma 1). □
Therefore, θ̂_prop^(j) → Λ^(−1) W′y = θ̂_OLS, which is consistent [25,26].
Discussion of Theoretical Findings
The theoretical results presented above lead to several important observations about the proposed estimators:
When the variance-stabilizing term σ̂²/θ̂_max² is set to zero, the proposed estimators simplify to the CAREs introduced by [19]. This demonstrates that our approach serves as a natural extension of existing methodology, rather than a completely unrelated contribution.
A key limitation of many existing ridge estimators is that they can produce shrinkage parameters arbitrarily close to zero under certain conditions, effectively reverting to the unstable OLS estimator. By incorporating σ̂²/θ̂_max², our proposed estimators ensure that k > 0 in all practical scenarios, providing consistent stabilization.
The flexibility to choose the exponents a, b, and f allows the estimator to adapt to varying degrees of multicollinearity. The four specific estimators proposed (MSRE1 through MSRE4) represent a spectrum of shrinkage intensities. MSRE1 and MSRE2 offer moderate penalization suitable for mild to moderate collinearity, while MSRE3 and MSRE4 provide more aggressive shrinkage designed for severe multicollinearity scenarios [1].
Theorem 1 establishes that under conditions of severe multicollinearity with non-negligible error variance, precisely the situation where standard ridge estimators struggle, the proposed estimators theoretically outperform the existing CARE. This theoretical finding is empirically validated in the simulation studies and real-world applications presented in subsequent sections.
Remark 1. 
The specific exponent choices for the four proposed estimators are (a, b, f) = (1, 1, 1) for MSRE1, (2, 1, 1) for MSRE2, (6, 3, 3) for MSRE3, and (4, 4, 4) for MSRE4. These values were selected to represent a range of shrinkage intensities, with higher exponents providing stronger penalization suitable for more severe multicollinearity conditions.
By doing so, we not only improve the CAREs but also provide a structured classification of enhanced ridge estimators that better handle multicollinearity and error variance structures. These estimators are compared based on Monte Carlo simulation, with details provided in Section 3.

3. Monte Carlo Simulation Design

Predictor variables under controlled multicollinearity settings are generated as follows. The predictors x_ji are simulated using Equation (32); many researchers have adopted this method for generating predictor variables, see references [17,27].
x_ji = (1 − ρ²)^(1/2) w_ji + ρ w_{j,p+1},  j = 1, 2, …, n and i = 1, 2, …, p.  (32)
Here, ρ denotes the correlation between predictors, w_ji represents random samples drawn from a standard normal distribution N(0, I), p is the total number of predictors, and n is the sample size. To evaluate various correlation levels, ρ = 0.80, 0.90 and 0.99, n = 20, 50, 100 and p = 4, 10 are chosen to assess the model across different scenarios. Mathematically, the model is expressed as
y_j = θ_0 + Σ_{i=1}^p θ_i x_ji + ϵ_j,  j = 1, 2, …, n.  (33)
The regression model parameters θ_i are calculated by determining the optimal direction, following the methodology outlined by [28]. The error term ϵ_j has mean zero, and three levels of the error standard deviation were considered (σ = 0.5, 6, 12). On this basis, the method ensures that the model parameters are oriented to minimize errors and enhance the model's predictive accuracy.
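A sketch of the generator in Equation (32) follows (NumPy rather than the R used in the paper). Note that with this construction any two distinct predictors share correlation ρ², so ρ = 0.99 yields pairwise correlations near 0.98:

```python
import numpy as np

def gen_predictors(n, p, rho, rng):
    # Eq. (32): x_ji = sqrt(1 - rho^2) w_ji + rho w_{j,p+1}.
    # The shared column w_{j,p+1} induces correlation rho^2 between predictors.
    w = rng.normal(size=(n, p + 1))
    return np.sqrt(1 - rho**2) * w[:, :p] + rho * w[:, [p]]

rng = np.random.default_rng(6)
X = gen_predictors(5000, 4, 0.99, rng)
emp = np.corrcoef(X, rowvar=False)
off = emp[np.triu_indices(4, k=1)]     # the six pairwise correlations
print(off.round(3))
```

The large sample (n = 5000) is used only to make the empirical correlations visibly close to their target; the simulation study itself uses n = 20, 50 and 100.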

3.1. Estimated Mean Squared Errors

The estimators are biased; therefore, we measure their effectiveness based on estimated MSEs, calculated over N = 2500 replications using the following formula:
MSE(θ̂) = (1/N) Σ_{i=1}^N (θ̂_i − θ)′ (θ̂_i − θ),  (34)
The performance of the existing and proposed estimators is difficult to evaluate theoretically; therefore, Equations (10)–(31) were used alongside a dedicated algorithm to estimate MSEs, and accuracy was assessed using the MSE criterion. All simulation analyses were carried out using the R programming language. The results, summarized in Table A1, Table A2, Table A3, Table A4, Table A5 and Table A6 in Appendix A, report the estimated MSEs for each ridge estimator, with the lowest values bolded for easy reference. A lower MSE reflects an estimator's ability to closely approximate the true parameters, indicating higher accuracy and reliability. The simulation results are discussed further in Section 3.2.
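A reduced version of the simulation loop can be sketched as follows (NumPy instead of the paper's R; 500 replications instead of 2500; a single HK-ridge competitor; the unit-norm choice of true coefficients is an assumption standing in for the optimal-direction construction of [28]):

```python
import numpy as np

def estimated_mse(estimates, theta_true):
    # Average squared distance between estimates and truth over replications.
    diffs = estimates - theta_true
    return np.mean(np.sum(diffs**2, axis=1))

rng = np.random.default_rng(7)
n, p, rho, sigma, N = 50, 4, 0.99, 6.0, 500
theta_true = np.ones(p) / np.sqrt(p)          # unit-norm coefficients (assumption)

ols_runs, ridge_runs = [], []
for _ in range(N):
    w = rng.normal(size=(n, p + 1))
    X = np.sqrt(1 - rho**2) * w[:, :p] + rho * w[:, [p]]   # Eq. (32)
    y = X @ theta_true + sigma * rng.normal(size=n)
    S = X.T @ X
    ols = np.linalg.solve(S, X.T @ y)
    lam, Q = np.linalg.eigh(S)
    theta_c = (Q.T @ X.T @ y) / lam
    s2 = (y - X @ ols) @ (y - X @ ols) / (n - p)
    k = s2 / np.max(theta_c**2)               # HK ridge parameter
    ols_runs.append(ols)
    ridge_runs.append(np.linalg.solve(S + k * np.eye(p), X.T @ y))

mse_ols = estimated_mse(np.array(ols_runs), theta_true)
mse_ridge = estimated_mse(np.array(ridge_runs), theta_true)
print(mse_ols, mse_ridge)
```

At ρ = 0.99 and σ = 6 the OLS variance term σ² Σ 1/λ_i dominates, so even this simple ridge competitor comes in well below OLS, mirroring the pattern reported in the appendix tables.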
We created a summary in Table 1, based on the results in simulation tables, highlighting the performance of various ridge estimators across different scenarios.
From the summary in Table 1, it is clear that our newly proposed estimators perform better in all scenarios than the other existing methods.

3.2. Simulation Results Analysis

  • Effect of Sample Size (n): As the sample size increases, all estimators show a decrease in the estimated MSEs. The newly proposed MSRE estimators, in particular, perform well across all sample sizes, often surpassing traditional OLS and other existing methods, especially for larger n values.
  • Effect of Number of Predictors ( p ) : When the number of predictors increases, MSEs generally rise. However, MSRE estimators remain more stable than OLS and other methods as p grows, showing better resilience in high-dimensional settings.
  • Effect of Multicollinearity ( ρ ) : High multicollinearity ( ρ   =   0.99 ) significantly impacts OLS, leading to a higher MSE. MSRE estimators, particularly M S R E 3 , handle multicollinearity more effectively, providing lower MSE compared to OLS and other methods.
  • Effect of Error Variance ( σ ) : As the error variance increases, estimated MSEs rise for all estimators. The newly proposed MSRE estimators are less sensitive to larger error variances, maintaining superior performance compared to other methods, especially in higher error variance scenarios ( σ   =   12 ) .

4. Practical Applications

In this section, we utilized three datasets similar in nature to the simulation settings to examine the performance of our proposed estimators.
(a)
The first dataset, the Economic Report of the President Dataset [29], was accessed via the U.S. Government Printing Office and provided key economic indicators.
(b)
The second dataset, the Body Fat Dataset [30], includes detailed anthropometric measurements and is publicly available online.
(c)
The third dataset, the Automobile Demand (car passenger) dataset, was sourced from the U.S. Department of Commerce (1986) [31].

4.1. Economic Dataset

The Economic Indicators dataset includes the dependent variable Y (outstanding mortgage debt, in trillions) and three independent variables, X_1 (personal consumption, in trillions), X_2 (personal income, in trillions), and X_3 (consumer credit, in trillions), covering the 1990–2006 period. The high correlations make it ideal for studying multicollinearity and econometric performance. The regression model for this dataset is
Y = θ_0 + θ_1 X_1 + θ_2 X_2 + θ_3 X_3 + ϵ.
The eigenvalues of the dataset are λ_1 = 2.94, λ_2 = 0.058, and λ_3 = 0.0017, and significant multicollinearity is confirmed through the CN and VIF. The CN, based on the ratio of the maximum and minimum eigenvalues of the data, is approximately 41.59, indicating a significant multicollinearity issue in the data. The calculated VIF values are 79.90, 521.82 and 919.65, corresponding to R² values of 0.9875, 0.99808 and 0.99891; all VIF values are greater than 10. Furthermore, Figure 1 graphically illustrates the highly correlated nature of the dataset.
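The VIF diagnostics reported here follow Equation (5): each VIF_i comes from regressing predictor i on the remaining predictors. The sketch below implements that auxiliary-regression recipe in NumPy on a synthetic stand-in (it does not reproduce the actual Economic Report data or the figures quoted above):

```python
import numpy as np

def vifs(X):
    # VIF_i = 1 / (1 - R_i^2), Eq. (5): regress predictor i on the others.
    n, p = X.shape
    out = []
    for i in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, i], rcond=None)
        resid = X[:, i] - others @ beta
        tss = np.sum((X[:, i] - X[:, i].mean())**2)
        r2 = 1 - resid @ resid / tss
        out.append(1 / (1 - r2))
    return np.array(out)

rng = np.random.default_rng(8)
n = 17                          # 1990-2006 gives 17 annual observations
x1 = rng.normal(size=n)
X = np.column_stack([x1,
                     x1 + 0.05 * rng.normal(size=n),
                     x1 + 0.05 * rng.normal(size=n)])  # collinear stand-in
print(vifs(X).round(1))
```

With predictors this strongly related, every VIF lands far above the usual threshold of 10, the same qualitative picture as in the real dataset.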
The results in Table 2 clearly show the superior performance of the proposed MSRE estimators compared to OLS and other methods in terms of the MSE criterion, aligning with the simulation results.

4.2. Model Selection Criteria

To evaluate the performance of the various estimation methods, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) are employed. These criteria provide a balanced assessment of model fit and complexity, with lower values indicating a more parsimonious and well-fitting model. Table 3 presents the AIC and BIC values for each estimator applied to the economic indicators’ dataset.
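For a Gaussian linear model these criteria can be computed directly from the residual sum of squares; the sketch below uses the common forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln n (one of several equivalent conventions; the paper does not state which it used), on synthetic stand-in data:

```python
import numpy as np

def aic_bic(y, y_hat, n_params):
    # Gaussian-likelihood information criteria from the residual sum of squares.
    n = y.size
    rss = np.sum((y - y_hat)**2)
    aic = n * np.log(rss / n) + 2 * n_params
    bic = n * np.log(rss / n) + n_params * np.log(n)
    return aic, bic

rng = np.random.default_rng(9)
n = 17
X = rng.normal(size=(n, 3))
y = X @ np.ones(3) + 0.1 * rng.normal(size=n)
A = np.column_stack([np.ones(n), X])       # intercept + three predictors
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
aic, bic = aic_bic(y, A @ beta, n_params=4)
print(round(aic, 2), round(bic, 2))
```

With n = 17 the BIC penalty k ln n exceeds the AIC penalty 2k, so BIC is always the larger of the two here; both criteria reward the fit–complexity balance described above.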
Table 3 presents the AIC and BIC values for the various estimation methods applied to the economic indicators dataset. The methods compared include the existing estimators along with the newly proposed ones (MSRE1 to MSRE4).
From Table 3, we observe that the newly proposed MSRE estimators (MSRE1 to MSRE4) consistently demonstrate lower AIC and BIC values compared to OLS and other methods, suggesting superior performance in terms of model selection criteria. Among these, MSRE3 has the lowest AIC and BIC values, indicating it may be the most optimal model for this dataset.

4.3. Medical Dataset

The Body Fat dataset (medical data) contains body composition and physical measurements for 252 individuals. It includes variables such as body fat percentage (Y), density (X_1), age (X_2), weight (X_3), height (X_4), adiposity (X_5), and various circumference measurements (neck (X_6), chest (X_7), abdomen (X_8), hip (X_9), thigh (X_10), knee (X_11), ankle (X_12), biceps (X_13), forearm (X_14), and wrist (X_15)). The dataset is suitable for analyzing the relationships between body fat and physical attributes, as well as for developing predictive regression models in health and fitness research. We utilized the following regression model:
Y = θ 0 + i = 1 15 θ i X i + ϵ
In this analysis, multicollinearity is assessed using the CN, eigenvalues, and VIFs. Additionally, a heatmap of the correlation matrix in Figure 2 visually depicts the strong correlations and provides insight into the multicollinearity structure of the predictors. The eigenvalues for the body fat data range from 0.65 to 9.77, with a CN value of 1234.89, which exceeds 30 and indicates significant multicollinearity in the data. The VIF values vary between 2.31 and 62.63; since the usual VIF threshold is 10, this also confirms high collinearity among the predictors. The CN and VIF results suggest possible numerical instability in regression models due to inflated coefficient variances. To address the severe multicollinearity, we applied the newly proposed and existing ridge estimators, as they mitigate multicollinearity and enhance the stability and interpretability of the model.
As shown in Table 4, the real data validate the simulation results.

4.4. Automobile Demand Dataset

The Automobile demand dataset includes 16 observations, with the dependent variable (Y) representing new car sales (thousands), and independent variables: X 1 (CPI for new cars), X 2 (overall CPI), X 3 (disposable income), X 4 (interest rate), and X 5 (labor force). A linear regression model that can be used is
Y = θ 0 + θ 1 X 1 + θ 2 X 2 + θ 3 X 3 + θ 4 X 4 + θ 5 X 5 + ϵ ,
The eigenvalues of the dataset are 4.2867, 1.3603, 0.3252, 0.0214, 0.0054, and 0.0010, with the very small eigenvalues, particularly 0.0010 and 0.0054, strongly indicating the presence of multicollinearity. The CN (the ratio of the largest eigenvalue to the smallest eigenvalue) is 4125.53, which is greater than 30, confirming significant collinearity in the data. Additionally, the VIF values further support this conclusion, with Y having a VIF of 4.07, while X 1 , X 2 , X 3 , and X 4 show extremely high VIF values of 255.27, 602.26, 290.57, and 42.74, respectively, all significantly exceeding the critical threshold of 10.
The analysis in Table 5 also shows the superior performance of the proposed MSRE estimators over OLS and the other methods, consistent with the simulation results.
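All of the ridge-type estimators compared here share the closed form β̂(k) = (X′X + kI)⁻¹X′y and differ only in how the shrinkage parameter k is chosen; the proposed MSRE rules are defined in the methodology section and are not reproduced in this sketch. As a minimal illustration on a hypothetical toy design, the following code computes the ridge solution over a grid of k values; on collinear data the coefficient norm shrinks steadily as k grows, which is what stabilizes the estimates:

```python
import numpy as np

def ridge_path(X, y, ks):
    """Ridge solutions beta(k) = (X'X + k I)^{-1} X'y for each k in ks."""
    XtX = X.T @ X
    p = X.shape[1]
    return [np.linalg.solve(XtX + k * np.eye(p), X.T @ y) for k in ks]

# Toy collinear design: the second column is nearly a copy of the first.
rng = np.random.default_rng(0)
x1 = rng.normal(size=30)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=30), rng.normal(size=30)])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=30)
norms = [np.linalg.norm(b) for b in ridge_path(X, y, [0.0, 0.1, 1.0, 10.0])]
# The coefficient norm decreases monotonically as k increases.
```

Under near-orthogonal predictors the choice of k matters little; it is precisely under collinearity that different shrinkage rules separate, which is what the MSE comparisons in Table 5 measure.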

5. Conclusions

This study introduced four new ridge estimators, M S R E 1 , M S R E 2 , M S R E 3 , and M S R E 4 , to tackle the severe challenges of multicollinearity in regression models. It comprehensively evaluated the performance of the estimators under different scenarios, using Monte Carlo simulations and three real-world datasets to highlight their effectiveness in addressing severe multicollinearity. The newly proposed estimators outperformed the existing estimators, particularly M S R E 3 , which consistently achieved the minimum MSE across datasets and scenarios.
In the simulation results, M S R E 3 excelled across different levels of multicollinearity, sample sizes, and predictor counts, striking an effective balance between bias and variance. The real-world datasets further validated these findings: in the Economic Indicators dataset, M S R E 3 outperformed the others by minimizing estimation error even under severe multicollinearity; in the Body Fat dataset, it showed resilience against high condition numbers and variance inflation factors, delivering the best MSE; and in the Automobile Demand dataset, it again achieved the lowest MSE, demonstrating its robustness and reliability in practical applications.
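A simulation of the kind summarized above can be sketched as follows. The design generates predictors with a controlled correlation level via x_ij = (1 − ρ²)^(1/2) z_ij + ρ z_i,p+1, as in Kibria [3]; for brevity, the sketch contrasts OLS with the classical HKB ridge estimator [20] rather than the proposed MSREs, and the function name and settings are illustrative:

```python
import numpy as np

def simulate_mse(n=50, p=4, rho=0.99, sigma=6, reps=500, seed=1):
    """Estimate the MSE of OLS vs. an HKB-type ridge estimator under the
    standard correlated-predictor design (Kibria, 2003)."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)            # unit-norm true coefficients
    mse_ols = mse_ridge = 0.0
    for _ in range(reps):
        Z = rng.normal(size=(n, p + 1))
        # Predictors with pairwise correlation rho**2 between columns.
        X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]
        y = X @ beta + sigma * rng.normal(size=n)
        XtX = X.T @ X
        b_ols = np.linalg.solve(XtX, X.T @ y)
        resid = y - X @ b_ols
        s2 = resid @ resid / (n - p)
        k = p * s2 / (b_ols @ b_ols)          # HKB shrinkage parameter
        b_ridge = np.linalg.solve(XtX + k * np.eye(p), X.T @ y)
        mse_ols += np.sum((b_ols - beta) ** 2)
        mse_ridge += np.sum((b_ridge - beta) ** 2)
    return mse_ols / reps, mse_ridge / reps
```

At high ρ and σ the ridge estimator's estimated MSE should fall well below that of OLS, in line with the pattern in Tables A1–A6, where HKB dominates OLS at every setting.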
Overall, the adaptive regularization techniques employed in the proposed estimators provide a clear advantage over traditional ridge regression methods. M S R E 3 , in particular, stood out as a highly effective and versatile estimator for managing multicollinearity, reducing estimation errors, and improving predictive accuracy in both simulated and real-world settings.
Future research could focus on extending the applicability of the newly proposed estimators to dynamic systems, non-linear models, and high-dimensional datasets, while also exploring improvements through advanced computational and machine learning techniques.

Author Contributions

Conceptualization, A.R.R.A.; Methodology, A.A.A.; Software, A.R.R.A.; Validation, A.A.A.; Formal analysis, A.R.R.A.; Investigation, A.A.A.; Resources, A.R.R.A. and A.A.A.; Data curation, A.A.A.; Writing—original draft, A.R.R.A.; Writing—review & editing, A.A.A.; Visualization, A.R.R.A.; Supervision, A.A.A.; Project administration, A.R.R.A. and A.A.A.; Funding acquisition, A.R.R.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. MSE of the estimators for n = 20 and p = 4.
Columns: σ = 0.5, 6, 12; within each σ, ρ = 0.80, 0.90, 0.99.
OLS0.17690.32212.781926.119549.8266399.484104.374182.9441731.48
HK0.15390.21031.17389.0718.3251139.92936.001662.8926616.844
HKB0.10650.16380.74166.835112.111795.631531.743449.6878407.248
KAM0.1720.30882.484322.519143.1769342.9889.9829156.5681494.36
KGM0.08340.11190.52433.11694.283316.977810.624214.112153.0114
KMed0.11390.13120.62413.23794.748532.239111.111916.1339159.762
KMS0.1390.22031.47219.617239.3612353.0087.9476156.441622.99
LC0.01670.02460.1593.29792.00731.801214.466512.34576.4945
TK1.43830.98480.136711.112714.11222.028640.655548.505393.2063
MTPR10.00480.04710.00362.9311.96165.729614.4113.8467.3822
MTPR20.00610.00940.01634.86876.5259.626.43537.722399.51
MTPR30.00590.09160.94288.745418.721200.9643.85481.0091086.2
CARE10.00580.09030.94197.670114.702347.92023.903426.00144.289
M S R E 1 0.05780.08030.09216.7149.910546.35521.20627.345225.76
M S R E 2 0.00580.00430.00352.47642.11177.240111.73911.71743.878
M S R E 3 0.00520.00390.00342.07771.24583.257210.62210.04513.067
M S R E 4 0.15890.27991.91623.56734.265422.59813.72413.27169.631
The bold values in the table represent the minimum MSE.
Table A2. MSE of the estimators for n = 50 and p = 4.
Columns: σ = 0.5, 6, 12; within each σ, ρ = 0.80, 0.90, 0.99.
OLS0.07750.15751.920110.604522.4686275.90943.131295.3144986.359
HK0.07250.13590.83753.46326.210489.570715.28332.4084296.546
HKB0.0560.0950.51442.72855.371865.035212.108923.501220.437
KAM0.07630.15281.70569.092218.9141232.74636.902180.6004813.955
KGM0.04810.05160.28531.28171.934712.90584.08065.30425.8388
KMed0.08670.06310.33581.28891.778619.22373.80036.082360.3166
KMS0.06780.11830.98437.22215.8038237.10834.239177.449889.259
LC0.00590.00920.06620.92970.6640.34485.62032.93414.093
TK0.83420.45840.15284.74994.33753.463216.216916.229927.858
MTPR10.00230.00160.00150.56710.35180.22184.51452.62144.0617
MTPR20.0030.00170.00390.75380.75697.46325.9555.465354.012
MTPR30.00220.00160.21271.37732.560482.84410.718.375351.71
CARE10.028160.03400.04151.69202.41039.67168.56028.601252.012
M S R E 1 0.0270.03110.04041.69062.40249.67188.5588.567253.478
M S R E 2 0.00250.00160.00140.4550.31940.28734.43042.542912.326
M S R E 3 0.00240.00150.00140.41550.26110.21333.96982.18955.3441
M S R E 4 0.07340.14351.34410.64050.69354.9325.15853.119417.907
The bold values in the table represent the minimum MSE.
Table A3. MSE of the estimators for n = 100 and p = 4.
Columns: σ = 0.5, 6, 12; within each σ, ρ = 0.80, 0.90, 0.99.
OLS0.03830.07340.70115.13310.2144101.15721.326244.1344402.998
HK0.03720.11360.43521.75443.513933.41687.040813.6317132.064
HKB0.03340.05360.23381.55582.713823.38385.660610.823585.2707
KAM0.0380.07240.65194.51638.824885.522418.465637.9827338.645
KGM0.03060.03360.16490.84211.41136.65912.38063.663416.1988
KMed0.07040.04820.17710.85091.32187.89022.27173.710928.3569
KMS0.03580.06370.37963.29866.692781.123515.815934.3292351.584
LC0.00370.00450.03890.43290.47930.22962.14591.69850.5414
TK0.73580.4810.10923.0132.52884.26158.16918.82046.4584
MTPR10.00150.00620.00080.17870.15830.1651.38551.21610.4298
MTPR20.00140.00120.00220.23160.30211.89331.65441.72755.1174
MTPR30.00090.00280.04530.42120.696112.752.28983.641672.161
CARE10.01400.14780.01620.44110.80111.67014.0293.9208.0198
M S R E 1 0.01300.01380.01680.64210.81061.50613.02973.4857.4386
M S R E 2 0.00110.00080.00070.14240.12070.08351.42841.18920.498
M S R E 3 0.00110.00080.00070.13910.11860.08321.34271.0310.408
M S R E 4 0.03660.0690.54990.21440.21910.70681.60121.34850.8862
The bold values in the table represent the minimum MSE.
Table A4. MSE of the estimators for n = 20 and p = 10.
Columns: σ = 0.5, 6, 12; within each σ, ρ = 0.80, 0.90, 0.99.
OLS1.15382.622731.972159.889378.9094775.58598.2421457.8717856.2
HK0.70381.20811.306161.4178141.1571626.53220.418543.0176031.92
HKB0.34210.55075.538225.933865.1459797.847110.41235.7082703.19
KAM1.11772.503729.7906150.55354.2044436.91562.4521361.5816549.9
KGM0.09340.14721.2516.126410.8812106.70720.489837.5026356.483
KMed0.09930.16821.58757.086312.7194153.86826.788246.7362611.17
KMS0.63241.324620.1563132.834317.5234416.04532.6391314.1317061.8
LC0.00630.00780.04621.89061.03290.42514.13737.54042.8092
TK0.7280.19070.805930.634227.954271.8887106.5648.82046.4584
MTPR10.00310.00190.00152.63871.0964.30620.8881.21610.4298
MTPR20.00410.00330.033210.51318.675387.1767.361.72755.1174
MTPR30.15730.377822.18250.188134.992928.4209.953.641672.161
CARE10.35900.55031.502133.99866.129300.01111.0343.82037.980
M S R E 1 0.34610.52561.473334.7963.946321.57106.273.4857.4386
M S R E 2 0.0070.00370.00152.77472.67955.874819.7671.18920.498
M S R E 3 0.0030.00180.00111.13830.58080.181911.3251.0310.408
M S R E 4 0.91431.970322.62513.08724.154276.6134.1771.34850.8862
The bold values in the table represent the minimum MSE.
Table A5. MSE of the estimators for n = 50 and p = 10.
Columns: σ = 0.5, 6, 12; within each σ, ρ = 0.80, 0.90, 0.99.
OLS0.24440.51025.410636.170576.0053752.029149.405301.7083026.61
HK0.21820.22772.452115.109230.0347289.14264.8937127.1511230.78
HKB0.12350.19911.15157.026814.1168131.00130.763151.4977553.235
KAM0.24230.50235.219634.848372.8123718.292143.84289.2992892.11
KGM0.0360.04890.36592.25693.522929.46337.934213.14492.9034
KMed0.04050.05530.50412.89774.533541.782810.426318.76164.809
KMS0.18050.30022.859827.276658.4745662.566126.427258.9682822.74
LC0.00180.00220.01560.33960.26980.27423.07652.53620.5005
TK0.29460.86080.05377.25325.98053.395246.63351.052354.1969
MTPR10.000910.00080.00510.20090.15760.08124.06942.49740.4447
MTPR20.00130.0010.00150.75451.139312.80612.18518.87267.09
MTPR30.00090.02520.63912.11324.936105.3331.89466.1361165.5
CARE10.09400.12800.34014.502114.02940.03730.09844.890217.93
M S R E 1 0.09310.12610.23878.761512.2939.70728.90942.908214.89
M S R E 2 0.00110.00080.00060.19040.11170.10122.96473.28886.1599
M S R E 3 0.00090.000810.00050.15080.10150.07722.14471.97090.3637
M S R E 4 0.22910.44974.08381.3181.958616.7614.7215.91430.741
The bold values in the table represent the minimum MSE.
Table A6. MSE of the estimators for n = 100 and p = 10.
Columns: σ = 0.5, 6, 12; within each σ, ρ = 0.80, 0.90, 0.99.
OLS0.12220.25112.869316.798138.6759386.16772.0525154.8711552.27
HK0.11480.21821.49747.247216.6853146.27329.802960.2581596.186
HKB0.07470.11820.62253.25817.445863.650613.533828.476270.669
KAM0.12160.24842.772316.165937.0887367.23269.3161148.1821479.04
KGM0.02430.02560.20181.22342.199214.46144.19727.409747.7355
KMed0.02590.02840.26791.69612.882920.90895.328410.201379.2249
KMS0.09930.16451.407411.365427.8233325.09357.7646126.621411.43
LC0.0010.00090.00510.14340.17660.17220.77710.54242.1638
TK0.73430.01610.00072.42581.86730.904320.349922.450916.1109
MTPR10.00060.00050.00030.1030.05460.03590.71630.30762.0594
MTPR20.00080.00070.00040.15060.13141.01952.27813.256438.929
MTPR30.00150.00080.00110.25960.908816.3237.009712.456250.31
CARE10.05700.07310.13024.3035.93013.60212.78018.40166.013
M S R E 1 0.04740.06060.11293.69375.772213.04811.15517.65967.958
M S R E 2 0.00050.00040.00020.07620.05340.03590.5850.38522.4822
M S R E 3 0.00040.00030.00020.07120.05140.03560.48030.26122.1363
M S R E 4 0.11760.23162.18160.39410.77916.4361.00621.48956.5394
The bold values in the table represent the minimum MSE.

References

1. Hoerl, A.E.; Kennard, R.W. Ridge Regression: Applications to Nonorthogonal Problems. Technometrics 1970, 12, 69–82.
2. Garg, R. Ridge Regression in the Presence of Multicollinearity. Psychol. Rep. 1984, 54, 559–566.
3. Kibria, B.M.G. Performance of Some New Ridge Regression Estimators. Commun. Stat. Part B Simul. Comput. 2003, 32, 419–435.
4. Marquardt, D.W. Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation. Technometrics 1970, 12, 591–612.
5. Ahmed, S.E.; Yilmaz, E.; Aydın, D. Kernel Ridge-Type Shrinkage Estimators in Partially Linear Regression Models with Correlated Errors. Mathematics 2025, 13, 1959.
6. Bashtian, M.H.; Arashi, M.; Tabatabaey, S.M.M. Using Improved Estimation Strategies to Combat Multicollinearity. J. Stat. Comput. Simul. 2011, 81, 1773–1797.
7. Schreiber-Gregory, D.N. Ridge Regression and Multicollinearity: An In-Depth Review. Model Assist. Stat. Appl. 2018, 13, 359–365.
8. Arashi, M.; Roozbeh, M.; Hamzah, N.A.; Gasparini, M. Ridge Regression and Its Applications in Genetic Studies. PLoS ONE 2021, 16, e0245376.
9. McDonald, G.C. Ridge Regression. WIREs Comput. Stat. 2009, 1, 93–100.
10. Chandrasekhar, C.K.; Bagyalakshmi, H.; Srinivasan, M.R.; Gallo, M. Partial Ridge Regression under Multicollinearity. J. Appl. Stat. 2016, 43, 2462–2473.
11. Al-Momani, M.; Yüzbaşı, B.; Bataineh, M.S.; Abdallah, R.; Moideenkutty, A. Shrinkage Approaches for Ridge-Type Estimators under Multicollinearity. Mathematics 2025, 13, 3733.
12. Magklaras, A.; Gogos, C.; Alefragis, P.; Birbas, A. Enhancing Parameters Tuning of Overlay Models with Ridge Regression: Addressing Multicollinearity in High-Dimensional Data. Mathematics 2024, 12, 3179.
13. Lipovetsky, S.; Conklin, W.M. Ridge Regression in Two-Parameter Solution. Appl. Stoch. Model. Bus. Ind. 2005, 21, 525–540.
14. Toker, S.; Kaçıranlar, S. On the Performance of Two-Parameter Ridge Estimator under the Mean Square Error Criterion. Appl. Math. Comput. 2013, 219, 4718–4728.
15. Dar, I.S.; Chand, S. Bootstrap-Quantile Ridge Estimator for Linear Regression with Applications. PLoS ONE 2024, 19, e0302221.
16. Batah, F.S.; Salih, M.M.; Salih, M.K.; Erdal, Ş.C. On Modified Unbiased Ridge Regression Estimator in Linear Regression Model. AIP Conf. Proc. 2023, 2820, 040007.
17. Özbay, N. Two-Parameter Ridge Estimation for the Coefficients of Almon Distributed Lag Model. Iran. J. Sci. Technol. Trans. A Sci. 2019, 43, 1819–1828.
18. Alharthi, M.F.; Akhtar, N. Newly Improved Two-Parameter Ridge Estimators: A Better Approach for Mitigating Multicollinearity in Regression Analysis. Axioms 2025, 14, 186.
19. Akhtar, N.; Alharthi, M.F. Enhancing Accuracy in Modelling Highly Multicollinear Data Using Alternative Shrinkage Parameters for Ridge Regression Methods. Sci. Rep. 2025, 15, 10774.
20. Hoerl, A.E.; Kannard, R.W.; Baldwin, K.F. Ridge Regression: Some Simulations. Commun. Stat. 1975, 4, 105–123.
21. Khalaf, G.; Månsson, K.; Shukur, G. Modified Ridge Regression Estimators. Commun. Stat. Theory Methods 2013, 42, 1476–1487.
22. Yasin, S.; Salem, S.; Ayed, H.; Kamal, S.; Suhail, M.; Khan, Y.A. Modified Robust Ridge M-Estimators in Two-Parameter Ridge Regression Model. Math. Probl. Eng. 2021, 2021, 1845914.
23. Özkale, M.R. A Stochastic Restricted Ridge Regression Estimator. J. Multivar. Anal. 2009, 100, 1706–1716.
24. Gower, J.C. Growth-Free Canonical Variates and Generalized Inverses. Bull. Int. Statist. Inst. 1978, 47, 77–86.
25. Greene, W.H. Econometric Analysis, 8th ed.; Pearson Education: Chennai, India, 2017.
26. White, H. Asymptotic Theory for Econometricians; Academic Press: San Diego, CA, USA, 2014.
27. Akhtar, N.; Alharthi, M.F.; Khan, M.S. Mitigating Multicollinearity in Regression: A Study on Improved Ridge Estimators. Mathematics 2024, 12, 3027.
28. Halawa, A.M.; El Bassiouni, M.Y. Tests of Regression Coefficients under Ridge Regression Models. J. Stat. Comput. Simul. 2000, 65, 341–356.
29. Council of Economic Advisers. Economic Report of the President; United States Government Printing Office: Washington, DC, USA, 2008. Available online: https://www.govinfo.gov/app/details/ERP-2008 (accessed on 12 February 2008).
30. Fisher, A.G. Body Fat Dataset. 1994. Available online: https://www.kaggle.com/datasets/fedesoriano/body-fat-prediction-dataset (accessed on 5 October 1994).
31. Gujarati, D.N.; Porter, D.C. Basic Econometrics, 5th ed.; McGraw-Hill/Irwin: New York, NY, USA, 2009.
Figure 1. Economic Indicator Dataset Display.
Figure 2. Medical Dataset Display.
Table 1. Performance Summary of the Estimators (best-performing estimator per setting).

| n | σ | p = 4, ρ = 0.80 | ρ = 0.90 | ρ = 0.99 | p = 10, ρ = 0.80 | ρ = 0.90 | ρ = 0.99 |
|---|---|---|---|---|---|---|---|
| 20 | 0.5 | MTPR1 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 |
| 20 | 6 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 |
| 20 | 12 | MSRE3 | MSRE3 | MSRE1 | MSRE3 | MSRE3 | MSRE3 |
| 50 | 0.5 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE2 | MSRE3 |
| 50 | 6 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 |
| 50 | 12 | MSRE3 | MSRE3 | MTRE1 | MSRE3 | MSRE3 | MSRE3 |
| 100 | 0.5 | MSRE3 | MSRE3 | MSRE3 | MSRE2 | MSRE3 | MSRE3 |
| 100 | 6 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 |
| 100 | 12 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 | MSRE3 |
Table 2. MSEs and estimated parameters of the models for economic indicator data.

| Methods | MSE | θ̂0 | θ̂1 | θ̂2 | θ̂3 |
|---|---|---|---|---|---|
| OLS | 2.59541 | −0.49885 | −0.16383 | −0.49901 | −0.10491 |
| HK | 2.37807 | −0.47585 | 0.028385 | −0.01013 | 0.014182 |
| HKB | 2.48442 | −0.30751 | −0.49832 | −0.00057 | −0.50005 |
| KAM | 2.57722 | 0.166985 | −0.44789 | 5.52 × 10−5 | −0.38898 |
| KGM | 2.46806 | −0.49633 | −0.17819 | −0.4995 | −0.08394 |
| KMed | 2.47841 | −0.36642 | 0.033126 | −0.04427 | 0.01053 |
| KMS | 2.45759 | −0.06876 | −0.49808 | −0.00265 | −0.49957 |
| LC | 2.39194 | 0.008212 | −0.43629 | 0.00026 | −0.04941 |
| TK | 2.25374 | −0.49838 | −0.14969 | −0.50059 | −0.00299 |
| MTPR1 | 2.24719 | −0.45073 | 0.024307 | −0.16365 | 0.000293 |
| MTPR2 | 2.25046 | −0.1866 | −0.50023 | −0.01305 | −0.49898 |
| MTPR3 | 2.27872 | 0.036241 | −0.36929 | 0.001317 | −0.0081 |
| CARE1 | 2.428506 | −0.44247 | −0.49901 | 0.014182 | 0.014188 |
| MSRE1 | 2.42847 | −0.47419 | 0.008277 | −0.46752 | 4.39 × 10−5 |
| MSRE2 | 2.25114 | 0.136262 | −0.06638 | 0.076072 | −0.47557 |
| MSRE3 | 2.2466 | −0.49821 | −0.00416 | −0.49983 | −0.30539 |
| MSRE4 | 2.591845 | −0.44247 | 0.00041 | −0.40961 | 0.160769 |

The bold values in the table represent the estimator with the minimum MSE.
Table 3. AIC and BIC Values for Different Estimators in the Economic Indicators Model.

| Methods | AIC | BIC | Methods | AIC | BIC |
|---|---|---|---|---|---|
| OLS | 24.2129 | 26.9392 | TK | 21.8176 | 24.5439 |
| HK | 22.7237 | 25.4500 | MTPR1 | 21.7632 | 24.4895 |
| HKB | 23.4751 | 26.2014 | MTPR2 | 21.7904 | 24.5167 |
| KAM | 24.0990 | 26.8253 | MTPR3 | 21.9978 | 24.7241 |
| KGM | 23.3612 | 26.0875 | MSRE1 | 23.0790 | 25.8053 |
| KMed | 23.4343 | 26.1606 | MSRE2 | 21.7955 | 24.5218 |
| KMS | 23.2864 | 26.0127 | MSRE3 | 21.7564 | 24.4827 |
| LC | 22.8257 | 25.5520 | MSRE4 | 24.1908 | 26.9171 |
| CARE1 | 23.16 | 26.4098 |  |  |  |

The bold values in the table represent the lowest AIC and BIC among the estimators.
Table 4. Estimated MSE for the medical dataset.

| Methods | OLS | HK | HKB | KAM | KGM | KMed | KMS | LC |
|---|---|---|---|---|---|---|---|---|
| MSE | 1.46984 | 1.18140 | 1.11318 | 1.45609 | 1.05897 | 1.04723 | 1.10757 | 0.943481 |

| Methods | TK | MTPR1 | MTPR2 | MTPR3 | CARE1 | MSRE1 | MSRE2 | MSRE3 | MSRE4 |
|---|---|---|---|---|---|---|---|---|---|
| MSE | 3.074038 | 0.898925 | 0.946894 | 1.000825 | 1.39279 | 1.36888 | 1.026669 | 0.89886 | 1.195515 |

The bold values in the table represent the lowest MSE among the estimators.
Table 5. Estimated MSE and parameters of the models of the Automobile demand dataset.

| Methods | MSE | θ̂0 | θ̂1 | θ̂2 | θ̂3 | θ̂4 | θ̂5 |
|---|---|---|---|---|---|---|---|
| OLS | 4.283758 | −0.47378 | −0.47378 | −0.47377 | −0.47655 | −0.47378 | −0.47624 |
| HK | 2.397191 | 0.130851 | 0.13085 | 0.130842 | 0.042616 | 0.130852 | 0.111277 |
| HKB | 3.566581 | 0.150425 | 0.150422 | 0.150382 | 0.011799 | 0.150425 | 0.07447 |
| KAM | 4.255817 | −0.43255 | −0.4324 | −0.43067 | −0.00224 | −0.43246 | −0.02437 |
| KGM | 2.787107 | −0.13251 | −0.13232 | −0.13023 | −0.00017 | −0.13239 | −0.00194 |
| KMed | 2.508995 | −1.3665 | −1.35631 | −1.25367 | −0.00034 | −1.35992 | −0.00393 |
| KMS | 3.985858 | −0.47325 | −0.47365 | −0.47434 | −0.47687 | −0.47434 | −0.47653 |
| LC | 2.364552 | 0.130387 | 0.130734 | 0.130586 | 0.052334 | 0.130569 | 0.042028 |
| TK | 2.187984 | 0.148221 | 0.149865 | 0.147919 | 0.015794 | 0.147801 | 0.011578 |
| MTPR1 | 2.182174 | −0.35289 | −0.40933 | −0.33404 | −0.00308 | −0.33058 | −0.00219 |
| MTPR2 | 2.184134 | −0.0696 | −0.10799 | −0.06055 | −0.00024 | −0.05907 | −0.00017 |
| MTPR3 | 2.204982 | −0.24165 | −0.62985 | −0.19158 | −0.00047 | −0.18431 | −0.00034 |
| CARE1 | 2.353281 | 0.14401 | 0.13639 | 0.093041 | 0.13403 | 0.13993 | 0.11940 |
| MSRE1 | 2.359325 | 0.130824 | 0.13059 | 0.072519 | 0.121189 | 0.13058 | 0.130851 |
| MSRE2 | 2.197826 | −0.42695 | −0.38397 | −0.00569 | −0.04961 | −0.33278 | −0.43255 |
| MSRE3 | 2.182052 | −0.12590 | −0.08796 | −0.00044 | −0.00414 | −0.0600 | −0.13250 |
| MSRE4 | 4.282798 | −1.07555 | −0.37865 | −0.00088 | −0.00849 | −0.18888 | −1.36615 |

The bold values in the table represent the estimator with the minimum MSE.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
