Article

A Simulation-Based Comparative Analysis of Two-Parameter Robust Ridge M-Estimators for Linear Regression Models

1 Department of Statistics, University of Peshawar, Peshawar 25000, Pakistan
2 Department of Statistics, Government Superior Science College Peshawar, Peshawar 25000, Pakistan
3 Department of Mathematics and Statistics, Florida International University, Miami, FL 33199, USA
* Author to whom correspondence should be addressed.
Stats 2025, 8(4), 84; https://doi.org/10.3390/stats8040084
Submission received: 28 August 2025 / Revised: 17 September 2025 / Accepted: 18 September 2025 / Published: 24 September 2025

Abstract

Traditional regression estimators such as Ordinary Least Squares (OLS) and classical ridge regression often fail under multicollinearity and outlier contamination, respectively. Although recently developed two-parameter ridge regression (TPRR) estimators improve efficiency by introducing dual shrinkage parameters, they remain sensitive to extreme observations. This study develops a new class of Two-Parameter Robust Ridge M-Estimators (TPRRM) that integrate dual shrinkage with robust M-estimation to simultaneously address multicollinearity and outliers. A Monte Carlo simulation study, conducted under varying sample sizes, predictor dimensions, correlation levels, and contamination structures, compares the proposed estimators with OLS, ridge, and the most recent TPRR estimators. The results demonstrate that TPRRM consistently achieves the lowest Mean Squared Error (MSE), particularly in heavy-tailed and outlier-prone scenarios. Application to the Tobacco and Gasoline Consumption datasets further validates the superiority of the proposed methods in real-world conditions. The findings confirm that the proposed TPRRM fills a critical methodological gap by offering estimators that are not only efficient under multicollinearity, but also robust against departures from normality.

1. Introduction

Linear regression (LR) is a foundational statistical technique that models the relationship between a dependent variable and one or more independent variables. It can range from simple linear regression, which uses a single predictor, to multiple linear regression (MLR), which incorporates numerous predictors, thus allowing for a more comprehensive understanding of how various factors influence the dependent variable. The mathematical form of the multiple linear regression model (MLRM) [1] is
y = X β + ε
where y denotes an n × 1 vector of response variables, and X represents a known n × p matrix of predictor variables. The vector β is a p × 1 vector of unknown regression parameters, while ε is an n × 1 vector of error terms, satisfying E(ε) = 0 and V(ε) = σ²Iₙ, where Iₙ is the n × n identity matrix. Typically, the Ordinary Least Squares (OLS) method is used to estimate the model parameters; it is a widely adopted approach due to its efficiency and ease of interpretation [2]. OLS estimation calculates the regression coefficients by minimizing the sum of squared residuals, thereby fitting the best linear line to the data. The simplicity of this method has made it a cornerstone in statistical analysis, but it requires various assumptions to ensure the validity and reliability of its predictions. The OLS estimator of β is defined as
\hat{\beta}_{OLS} = V^{-1} X' y
where V = X′X. OLS is the best linear unbiased estimator under the classical assumptions of no correlation among predictors (i.e., no multicollinearity), no outliers, finite error variance, and correct model specification. Studies have shown that the OLS estimator can yield inaccurate results when these model assumptions are violated [3]. Even in the presence of multicollinearity, the OLS estimator retains optimal prediction properties under the classical assumptions of no outliers and finite error variance; in finite samples, however, or when variance inflation becomes severe, the OLS estimates may become unstable, leading to poor practical performance [4]. In such cases, ridge regression [5] introduces a small bias that reduces variance, yielding more stable coefficient estimates and, in practice, improved predictive performance, particularly in finite samples or in data prone to noise and outliers. This adjustment helps to improve the model’s accuracy and prevents the coefficient estimates from becoming excessively large in the presence of highly correlated predictors. Ridge regression is therefore valuable in refining the predictive capacity of MLR models, especially when multicollinearity is unavoidable due to the nature of the data. Since 1970, almost 400 estimators have been suggested in the literature to combat the problem of multicollinearity (MC) [6]. The ridge regression estimator of β is defined as
\hat{\beta}_k = (V + kI)^{-1} X' y,
where k > 0 is the ridge or shrinkage parameter. To enhance the efficiency of the ridge estimator, ref. [7] introduced an additional parameter, termed ‘q’, alongside k, resulting in the two-parameter ridge regression (TPRR) estimator, which is defined as
\hat{\beta}_{q,k} = \hat{q} \, (X'X + kI)^{-1} X' y,
where
\hat{q} = \frac{(X'y)' (X'X + kI)^{-1} X'y}{(X'y)' (X'X + kI)^{-1} X'X \, (X'X + kI)^{-1} X'y}
The value of ‘q’ in the above equation is derived by maximizing the coefficient of determination, while the value of ‘k’ is chosen to minimize the Mean Squared Error (MSE). It is important to note that k and q are shrinkage parameters, whereas the ridge regression and its two-parameter extensions are the corresponding estimators that utilize these parameters. This distinction is crucial, as the efficiency of an estimator depends on how appropriately its shrinkage parameters are chosen.
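To make the closed forms in Equations (2) and (3) concrete, the following R sketch computes the OLS and ridge coefficients directly; the simulated data and the fixed value of k are illustrative assumptions, not choices made in this paper.

```r
# Minimal sketch: OLS vs. ridge coefficients from their closed forms.
set.seed(1)
n <- 50; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% rep(1, p) + rnorm(n))

V <- crossprod(X)                                      # V = X'X
beta_ols   <- solve(V, crossprod(X, y))                # OLS: V^{-1} X'y
k <- 0.5                                               # illustrative shrinkage parameter
beta_ridge <- solve(V + k * diag(p), crossprod(X, y))  # ridge: (V + kI)^{-1} X'y
```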
Notable work in the area of two-parameter estimators has been conducted in [8,9,10,11,12]. To simultaneously address multicollinearity and outliers, robust ridge M-estimators were introduced in [13] based on the work of [14,15]. The objective function for the robust M ridge regression used in [13] was based on Huber’s loss function [16].
\hat{\beta}_M = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \rho\!\left( \frac{y_i - x_i^T \beta}{\delta} \right) + \lambda \lVert \beta \rVert^2 \right\}
These M-based ridge regression estimators demonstrate a smaller Mean Squared Error (MSE) compared to traditional ridge regression estimators. Recent advancements in this area have been made in [17,18,19]. The same concept of using M estimators to derive a robust version of a ridge estimator was extended to the two-parameter ridge regression approach by Ertaş et al. in 2015, by combining the benefits of robust M-estimation and the most efficient two-parameter ridge regression in order to handle both MC and outliers effectively [20]. Recent work in this area was conducted in [21].
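As a rough illustration of how Huber’s penalized objective above can be minimized in practice, the sketch below uses a general-purpose optimizer; the tuning constants c, delta, and lambda are assumed values for demonstration, not those used in the cited studies.

```r
# Hedged sketch: robust ridge M-estimation by direct minimization of
# sum(rho((y - X b) / delta)) + lambda * ||b||^2 with Huber's rho.
huber_rho <- function(u, c = 1.345) {
  ifelse(abs(u) <= c, u^2 / 2, c * abs(u) - c^2 / 2)
}
ridge_m <- function(X, y, lambda = 1, delta = 1) {
  obj <- function(b) sum(huber_rho((y - X %*% b) / delta)) + lambda * sum(b^2)
  optim(rep(0, ncol(X)), obj, method = "BFGS")$par
}
```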
Now, motivated by the idea of Ertaş et al. [20], the purpose of this study is to refine existing techniques and develop more adaptive and resilient estimation frameworks. This study proposes and evaluates a new TPRRM approach to tackle the combined challenges of MC and outliers. Through simulation studies and numerical examples, the research aims to demonstrate that these new estimators outperform existing methods. The goal of this paper is to improve the reliability and accuracy of regression models in datasets that are rich in outliers, contributing to more informed and effective decision-making in statistical modeling.
The organization of the paper is as follows: Section 2 presents the methodology, along with an introduction to the proposed estimator. A simulation study is conducted in Section 3. Section 4 illustrates the advantages of the proposed estimator using the Tobacco and Gasoline Consumption datasets. Some concluding remarks are presented in Section 5.

2. Materials and Methods

For mathematical convenience, the regression model is expressed in canonical form. Equation (1), in orthogonal form, is
y = P\alpha + \varepsilon
Here, P = XC and \alpha = C'\beta, where C is an orthogonal matrix containing the eigenvectors of X′X such that C'C = I_p and \Lambda = C'X'XC = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p), the \lambda_i being the ordered eigenvalues of X′X; I_p is the identity matrix of order p. The canonical forms of Equations (2) and (3) are
\hat{\alpha}_{OLS} = \Lambda^{-1} P' y
\hat{\alpha}_k = (\Lambda + k I_p)^{-1} \Lambda \hat{\alpha}_{OLS}
The canonical form of the M estimator of the above equation is
\hat{\alpha}_{k,M} = (\Lambda + k I_p)^{-1} \Lambda \hat{\alpha}_M
The canonical form of Equation (4) is
\hat{\alpha}_{k,TP} = q (\Lambda + k I_p)^{-1} \Lambda \hat{\alpha}_{OLS}
The canonical form of the M estimator of the above equation is
\hat{\alpha}_{k,TP,M} = q (\Lambda + k I_p)^{-1} \Lambda \hat{\alpha}_M
Here, \hat{\alpha}_M is an M-estimator; since \alpha = C'\beta, MSE(\hat{\alpha}) = MSE(\hat{\beta}). The canonical forms of the OLS, ridge, robust ridge, and two-parameter ridge estimators are presented in Equations (8)–(12). The Mean Squared Errors (MSEs) of these estimators are given in (13)–(18), with robust versions based on [22].
MSE(\hat{\alpha}_{OLS}) = \sum_{i=1}^{p} \frac{\sigma^2}{\lambda_i}
MSE(\hat{\alpha}_M) = \sum_{i=1}^{p} \frac{\hat{A}^2}{\lambda_i} = \sum_{i=1}^{p} \Omega_{ii}
MSE(\hat{\alpha}_k) = \sum_{i=1}^{p} \frac{\lambda_i \sigma^2}{(\lambda_i + k)^2} + \sum_{i=1}^{p} \frac{k^2 \hat{\alpha}_{OLS,i}^2}{(\lambda_i + k)^2}
MSE(\hat{\alpha}_{k,M}) = \sum_{i=1}^{p} \frac{\lambda_i \hat{A}^2}{(\lambda_i + k)^2} + \sum_{i=1}^{p} \frac{k^2 \hat{\alpha}_{M,i}^2}{(\lambda_i + k)^2}
MSE(\hat{\alpha}_{k,TP}) = q^2 \sum_{i=1}^{p} \frac{\lambda_i \sigma^2}{(\lambda_i + k)^2} + \sum_{i=1}^{p} \left( \frac{q \lambda_i}{\lambda_i + k} - 1 \right)^2 \hat{\alpha}_{OLS,i}^2
MSE(\hat{\alpha}_{k,TP,M}) = q^2 \sum_{i=1}^{p} \frac{\lambda_i \hat{A}^2}{(\lambda_i + k)^2} + \sum_{i=1}^{p} \left( \frac{q \lambda_i}{\lambda_i + k} - 1 \right)^2 \hat{\alpha}_{M,i}^2
Here, \hat{A}^2 = s^2 \, \dfrac{\frac{1}{n-p} \sum_{i=1}^{n} \psi^2(e_i/s)}{\left[ \frac{1}{n} \sum_{i=1}^{n} \psi'(e_i/s) \right]^2}; the error variance \sigma^2 is replaced by \hat{A}^2, a robust estimator based on the M-estimator proposed by [23].

2.1. Existing Estimators

1. The groundbreaking ridge estimator introduced by Hoerl & Kennard (1970a) is as follows [5]:
\hat{k}_{HK1} = \frac{\hat{\sigma}^2}{\hat{\alpha}_i^2}, \quad i = 1, 2, \ldots, p
where \hat{\sigma}^2 = \frac{(y - X\hat{\beta})'(y - X\hat{\beta})}{n - p - 1}.
2. The second ridge estimator introduced by Hoerl & Kennard (1970b) is as follows [24]:
\hat{k}_{HK2} = \frac{\hat{\sigma}^2}{\hat{\alpha}_{\max}^2}
3. Ref. [14] generalized the idea of Hoerl & Kennard (1970a) and suggested a new estimator, denoted as HKB:
\hat{k}_{HKB} = \frac{p \hat{\sigma}^2}{\sum_{i=1}^{p} \hat{\alpha}_i^2}
4. Kibria (2003) suggested three new ridge regression estimators by taking the arithmetic mean (AM), geometric mean (GM), and median of Hoerl & Kennard (1970a, ref. [5]) [25]:
\hat{k}_{AM} = \frac{1}{p} \sum_{i=1}^{p} \frac{\hat{\sigma}^2}{\hat{\alpha}_i^2}
\hat{k}_{GM} = \frac{\hat{\sigma}^2}{\left( \prod_{i=1}^{p} \hat{\alpha}_i^2 \right)^{1/p}}
\hat{k}_{MED} = \operatorname{Med}\left( \frac{\hat{\sigma}^2}{\hat{\alpha}_i^2} \right)
5. Similarly, Khalaf et al. (2013) proposed an estimator by combining the idea of Hoerl & Kennard (1970b, ref. [24]) with the concept of a weight, as follows [26]:
\hat{k}_{KMS} = \frac{\lambda_{\max}}{\sum_{i=1}^{p} \hat{\alpha}_i^2} \cdot \frac{\hat{\sigma}^2}{\hat{\alpha}_{\max}^2}
These estimators reduce instability in coefficient estimates caused by multicollinearity, yet they remain sensitive to outliers. Their performance deteriorates under contamination, which limits their applicability in real-world data, where assumptions of normality and clean samples rarely hold.
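For reference, the estimators listed above translate into one-line R functions of the canonical coefficients \hat{\alpha}, the residual variance estimate, and the eigenvalues of X′X; the KMS form follows the weighted reconstruction given above, so this is a sketch rather than the original authors’ code.

```r
# Classical ridge shrinkage parameters as functions of (alpha, sigma2, lambda).
k_hk2 <- function(alpha, sigma2) sigma2 / max(alpha^2)
k_hkb <- function(alpha, sigma2) length(alpha) * sigma2 / sum(alpha^2)
k_am  <- function(alpha, sigma2) mean(sigma2 / alpha^2)              # arithmetic mean
k_gm  <- function(alpha, sigma2) sigma2 / prod(alpha^2)^(1 / length(alpha))
k_med <- function(alpha, sigma2) median(sigma2 / alpha^2)
k_kms <- function(alpha, sigma2, lambda)                             # weighted HK2 (sketch)
  (max(lambda) / sum(alpha^2)) * sigma2 / max(alpha^2)
```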

2.2. Two-Parameter Ridge Regression Estimator

1. To improve the fit quality of the one-parameter ridge regression, ref. [7] recommended using the k value defined in Equation (20) along with the parameter ‘q’ from Equation (5), thus introducing the concept of the two-parameter ridge regression estimator (TPRRE).
2. Inspired by [7], Toker & Kaçıranlar (2013) introduced a new TPRRE based on optimal selection of the k and q values [8]:
q_{opt} = \frac{\sum_{i=1}^{p} \frac{\hat{\alpha}_i^2 \lambda_i}{\lambda_i + k}}{\sum_{i=1}^{p} \frac{\sigma^2 \lambda_i + \hat{\alpha}_i^2 \lambda_i^2}{(\lambda_i + k)^2}}
k_{opt} = \frac{\hat{q}_{opt} \sum_{i=1}^{p} \sigma^2 \lambda_i + (\hat{q}_{opt} - 1) \sum_{i=1}^{p} \hat{\alpha}_i^2 \lambda_i^2}{\sum_{i=1}^{p} \hat{\alpha}_i^2 \lambda_i}
3. The latest advancements in the field of TPRRE were introduced by Khan et al. (2024) [11], who proposed six new estimators with the goal of enhancing the accuracy of ridge estimation:
\hat{k}_{SH1} = \frac{\lambda_{\max} - 100 \lambda_{\min}}{99}
\hat{k}_{SH2} = \lambda_{\max} - 100 \lambda_{\min}
\hat{k}_{SH3} = \lambda_{\max} - \lambda_{\min}
\hat{k}_{SH4} = \frac{\lambda_{\max}}{\lambda_{\min}}
\hat{k}_{SH5} = \lambda_{\max}
\hat{k}_{SH6} = \lambda_{\min}
Recent contributions by [11] propose six new TPRR estimators, further improving estimation under MC. However, their framework does not account for robustness against non-normal errors and outliers. As such, while TPRR improves efficiency through two shrinkage parameters, it still inherits the vulnerability of ridge-type estimators to contaminated data.
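Since the proposed estimators of Section 2.3 reuse these six penalties, it is convenient to compute them together. The closed forms below follow the expressions as reconstructed above from the eigenvalue spectrum of X′X and the k values reported in the application tables, so treat this as a sketch rather than the authors’ code.

```r
# The six SH shrinkage parameters from the eigenvalues of X'X.
k_sh <- function(lambda) {
  lmax <- max(lambda); lmin <- min(lambda)
  c(SH1 = (lmax - 100 * lmin) / 99,
    SH2 =  lmax - 100 * lmin,
    SH3 =  lmax - lmin,
    SH4 =  lmax / lmin,        # the condition number of X'X
    SH5 =  lmax,
    SH6 =  lmin)
}
```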

2.3. Proposed Estimator

Now, motivated by the idea of [12,20,21], we propose Two-Parameter Robust Ridge M-Estimators (TPRRM). The novelty lies in embedding dual shrinkage parameters within a robust M-estimation framework, thereby achieving efficiency under multicollinearity while ensuring resilience to outliers. Unlike existing two-parameter estimators, the proposed TPRRM preserves the strengths of dual shrinkage while mitigating contamination effects. Through both simulation and application to the Tobacco and Gasoline Consumption datasets, we empirically demonstrate that TPRRM consistently outperforms classical ridge, robust ridge, and the most recent TPRR estimators.
The proposed estimators are as follows.
Ref. [26] suggested a weight for ridge estimators as follows:
w_{KMS} = \frac{\lambda_{\max}}{\sum_{i=1}^{p} \hat{\alpha}_i^2}
Following [19], we modified Equation (34) and used its M-version, as follows:
w_m = \frac{\lambda_{\max}}{\frac{1}{p} \sum_{i=1}^{p} \hat{\alpha}_{i,M}}
We multiplied Equations (28)–(33) by Equation (35) and obtained our six new proposed estimators, BAD1, BAD2, BAD3, BAD4, BAD5, and BAD6, which are given below:
\hat{k}_{BAD1} = \frac{\lambda_{\max} - 100 \lambda_{\min}}{99} \times w_m
\hat{k}_{BAD2} = (\lambda_{\max} - 100 \lambda_{\min}) \times w_m
\hat{k}_{BAD3} = (\lambda_{\max} - \lambda_{\min}) \times w_m
\hat{k}_{BAD4} = \frac{\lambda_{\max}}{\lambda_{\min}} \times w_m
\hat{k}_{BAD5} = \lambda_{\max} \times w_m
\hat{k}_{BAD6} = \lambda_{\min} \times w_m
The above values of k are used in Equation (5) to calculate \hat{q}; both \hat{q} and \hat{k} are then substituted into Equation (4) to estimate β.
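A compact end-to-end sketch of this recipe is given below for the BAD4 variant. The use of MASS::rlm() with Huber weights matches the simulation description in Section 3.1, while the form of the weight w_m follows Equation (35) as reconstructed above and is therefore an assumption.

```r
# Sketch of the proposed TPRRM pipeline (BAD4 shown): robust M-fit -> w_m ->
# k -> q -> two-parameter robust ridge estimate of beta.
library(MASS)
tprrm_bad4 <- function(X, y) {
  V   <- crossprod(X)
  eig <- eigen(V); lambda <- eig$values; C <- eig$vectors
  alpha_m <- drop(t(C) %*% coef(rlm(y ~ X - 1, psi = psi.huber)))  # canonical M-estimate
  w_m <- max(lambda) / (sum(alpha_m) / length(alpha_m))  # Eq. (35), as reconstructed
  k   <- (max(lambda) / min(lambda)) * w_m               # BAD4 penalty
  Xy  <- crossprod(X, y)
  A   <- solve(V + k * diag(ncol(X)))
  q   <- drop(crossprod(Xy, A %*% Xy)) /
         drop(crossprod(Xy, A %*% V %*% A %*% Xy))       # Eq. (5)
  drop(q * A %*% Xy)                                     # Eq. (4)
}
```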
A simulation study is performed in the section below to compare the performance of the suggested estimator with that of existing estimators.

3. Simulation Study

3.1. Simulation Design

To examine the performance of the proposed and existing estimators under diverse conditions, a Monte Carlo simulation experiment was conducted, incorporating varying levels of multicollinearity, error variance, sample size, and number of predictors. The data-generating process was adapted from the work of [11,18,27,28].
The predictor variables were generated as follows:
x_{ij} = (1 - \rho^2)^{1/2} z_{ji} + \rho z_{j, p+1}, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, n
where zij ∼ N(0,1) and ρ denotes the correlation between regressors. This design ensures that the predictors exhibit the desired degree of multicollinearity.
To capture a wide range of conditions, we consider pairwise correlations of ρ = {0.85, 0.90, 0.95, 0.99, 0.999}, ranging from moderate to near-perfect multicollinearity. The number of predictors is p = {4, 10} and the sample size is n = {25, 50, 100}.
The regression coefficients β were generated using the Most Favorable (MF) direction approach, which ensures comparability across estimators [12]. Without loss of generality, the intercept term was set to zero. The error term εᵢ was generated from two distributions: the normal distribution, εᵢ ∼ N(0, σ²), and the heavy-tailed distribution, εᵢ ∼ Cauchy(0, 1). The variance parameter was varied across four levels, σ² = {0.5, 1, 5, 10}, allowing us to assess estimator performance under both low- and high-noise conditions.
To assess robustness, contamination was introduced by replacing randomly selected response values (yⱼ) with values shifted in the y-direction. Two contamination levels were considered, 10% and 20%, generated by taking the error variance as ‘σ² + 10’ and ‘σ² + 20’, respectively. For the robust ridge M-type estimators, the Huber objective function is used in conjunction with the rlm() function to calculate the M-estimator.
The response variable was then computed as follows:
y_j = \alpha_0 + \alpha_1 x_{1j} + \alpha_2 x_{2j} + \cdots + \alpha_p x_{pj} + \varepsilon_j, \quad j = 1, 2, \ldots, n
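The following R sketch reproduces this data-generating design, including the y-direction contamination; the coefficient direction used here is a simple placeholder for the Most Favorable direction described above.

```r
# Sketch of the simulation design: correlated predictors, normal errors,
# and 10% y-direction outliers generated via an inflated error variance.
sim_data <- function(n, p, rho, sigma2, contam = 0.10) {
  Z <- matrix(rnorm(n * (p + 1)), n, p + 1)
  X <- sqrt(1 - rho^2) * Z[, 1:p] + rho * Z[, p + 1]
  beta <- rep(1 / sqrt(p), p)              # placeholder for the MF direction
  eps  <- rnorm(n, sd = sqrt(sigma2))
  idx  <- sample(n, ceiling(contam * n))   # randomly selected responses
  eps[idx] <- rnorm(length(idx), sd = sqrt(sigma2 + 10))
  list(X = X, y = drop(X %*% beta + eps))
}
```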

3.2. Performance Evaluation Criteria

Unlike OLS, ridge estimators introduce a certain degree of bias. The MSE criterion provides a more appropriate basis for evaluating and comparing such biased estimators, as mentioned by [10,12]. Furthermore, the existing literature consistently emphasizes the use of the minimum MSE criterion as the standard for identifying the most efficient estimator [12,19,27]. The MSE is defined as
MSE(\hat{\alpha}) = E\left[ (\hat{\alpha} - \alpha)' (\hat{\alpha} - \alpha) \right]
Each scenario was replicated 5000 times to ensure stable results. The estimated MSE (EMSE) was computed as follows:
\widehat{MSE}(\hat{\alpha}) = \frac{1}{5000} \sum_{j=1}^{5000} (\hat{\alpha}_j - \alpha)' (\hat{\alpha}_j - \alpha)
All simulations and calculations were performed in the R programming language, version 4.5.1, and the results are summarized in both tabular and graphical form.
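A minimal version of that replication loop looks as follows; est_fun and gen_fun are placeholder hooks standing in for any of the compared estimators and for the data generator sketched above.

```r
# Estimated MSE over 5000 Monte Carlo replications of a given estimator.
emse <- function(est_fun, gen_fun, alpha_true, reps = 5000) {
  mean(replicate(reps, {
    d <- gen_fun()
    a_hat <- est_fun(d$X, d$y)
    sum((a_hat - alpha_true)^2)   # (alpha_hat - alpha)'(alpha_hat - alpha)
  }))
}
```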

3.3. Simulation Results Discussion

Table 1, Table 2, Table 3 and Table 4 (and Supplementary Tables S1–S8) report the estimated Mean Squared Error (MSE) of the estimators for normally distributed errors with 10% and 20% outliers in the response variable. Table 5 contains the corresponding results for the Cauchy distribution. Table 6 provides a bird’s-eye view of all the situations considered in the simulation study and summarizes the top-performing estimators in each case. Figure 1 is a diagnostic plot confirming the presence of three outliers (green dots), which are clearly separated from the fitted model (red dots and line), supporting the use of the BAD4 estimator. The plot visually demonstrates how the estimator is not influenced by these extreme data points, thereby producing a more reliable regression fit.
Bold values indicate the smallest MSE. The simulation results in Table 1, Table 2, Table 3, Table 4 and Table 5 reveal that the BAD estimators (BAD1 to BAD6) demonstrate exceptional robustness against outliers across the varying scenarios presented in the tables. Overall, the BAD estimators maintain low Mean Squared Error (MSE) values, demonstrating superior performance compared to other methods such as OLS, HK, and HKB. Among them, BAD1 and BAD4 consistently yield the lowest MSEs, making them reliable choices, particularly in scenarios with lower correlation levels (0.85 and 0.95) and 10% outliers. As the correlation level increases (up to 0.999) or the percentage of outliers rises to 20%, BAD4 remains especially effective, highlighting its versatility and robustness under extreme conditions. BAD2, BAD3, and BAD5 also perform well, with minimal MSE variation, particularly at higher correlation levels. Although BAD6 shows slightly higher MSE under less extreme conditions, it proves advantageous in cases involving severe outliers and high correlation levels, indicating its suitability for more challenging datasets. In summary, BAD4 emerges as the most robust and versatile estimator overall, while BAD1 is optimal for scenarios with moderate outlier influence. For conditions involving a higher proportion of outliers and near-perfect correlation, BAD2, BAD5, and BAD6 also deliver strong, reliable performance, making them valuable options for practical applications with noisy data.
The results from Table 5 reveal that, in the case of the heavy-tailed standardized Cauchy distribution, where outliers strongly affect the performance of traditional methods, the BAD estimators consistently demonstrate superior performance by maintaining low Mean Squared Error (MSE) values. Unlike traditional estimators such as OLS, HK, and HKB, which exhibit significantly higher MSE values, especially under high multicollinearity (0.99 and 0.999), the BAD estimators prove to be much more robust. The BAD4 estimator, in particular, achieves remarkably low MSE values even under extreme multicollinearity conditions. These findings highlight the robustness and accuracy of the BAD series in such challenging scenarios, making them a preferable choice when working with heavy-tailed data.
Table 6 provides a bird’s-eye view of all the scenarios explored in the simulation study.

4. Real-Life Applications

4.1. Tobacco Data

This section includes an empirical case that demonstrates the performance of the novel estimators. We followed [29,30,31] in analyzing tobacco data. The data includes thirty (30) observations of different tobacco mixes. The percentage concentrations of four essential components are employed as predictor variables, with the quantity of heat created by the tobacco during smoking serving as a response variable. The regression model for the tobacco data is shown below.
y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon
Figure 2 illustrates the relationships between the variables X₁, X₂, X₃, X₄, and y. The lower triangular matrix displays scatterplots visualizing the linear relationships between pairs of variables. The upper triangular matrix shows the Pearson correlation coefficients (r), with asterisks (***) indicating strong positive correlations (>0.9). The diagonal panels show the kernel density distribution of each variable. The condition number of these data is 1855.526, which indicates severe multicollinearity among the predictors. Based on the covariance matrix, observations 4, 18, and 29 are identified as outliers.
Table 7 provides a comparative analysis of various regression estimators, including traditional methods like OLS, HK, and HKB, as well as robust estimators such as SH, RM, and the novel BAD series (BAD1 to BAD6), applied to the tobacco dataset. The MSE values indicate the error level for each estimator, with a lower MSE representing better performance.
The results highlight the limitations of traditional methods like OLS, which has the highest MSE (32.4972), suggesting poor robustness against multicollinearity and outliers. On the other hand, the novel BAD estimators demonstrate significantly lower MSE values. In particular, BAD6 achieves the lowest MSE (1.7834), indicating superior performance and robustness, especially in handling the severe multicollinearity and outliers present in the tobacco data. This is further supported by the regression coefficients, where BAD6 shows consistent and stable values.
BAD1, BAD3, and BAD4 also exhibit strong performance, with MSE values of 2.5162, 2.8647, and 2.9447, respectively, indicating their effectiveness as well. Notably, BAD4 achieves the lowest regression coefficient estimates, suggesting high precision and stability.
The robust performance of BAD6, followed closely by BAD1 and BAD4, suggests that these novel estimators provide a reliable alternative to traditional methods, particularly in datasets affected by multicollinearity and outliers. The bolded values indicate the estimators with the minimum MSE, further emphasizing the effectiveness of BAD estimators in this empirical case study.
To compare the statistical properties of the estimators, the 95% CI was computed for each regression coefficient \hat{\beta}_i (i = 0, 1, 2, 3, 4) under OLS, ridge regression, robust ridge, and two-parameter robust ridge regression. The confidence intervals were constructed using asymptotic variance–covariance matrices. The variance–covariance matrix for OLS is given by
Var(\hat{\beta}_{OLS}) = \hat{\sigma}^2 (X'X)^{-1}
and for RR, by
Var(\hat{\beta}_{RR}) = \hat{\sigma}^2 (X'X + kI)^{-1} (X'X) (X'X + kI)^{-1}
The standard errors (SE) were obtained from the diagonal elements of these matrices, and the 95% CIs were obtained as follows:
CI = \hat{\beta}_i \pm t_{1-\alpha/2,\, n-p} \sqrt{Var(\hat{\beta}_i)}
where t_{1−α/2, n−p} is the (1 − α/2) quantile of Student’s t-distribution with n − p degrees of freedom.
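Put together, the interval construction amounts to a few lines of R; the ridge case is sketched below with k treated as given, following the sandwich variance displayed above.

```r
# 95% confidence intervals for ridge coefficients via the sandwich variance.
ridge_ci <- function(X, y, k, level = 0.95) {
  n <- nrow(X); p <- ncol(X)
  A <- solve(crossprod(X) + k * diag(p))
  beta_k <- drop(A %*% crossprod(X, y))
  sigma2 <- sum((y - X %*% beta_k)^2) / (n - p)      # residual variance estimate
  V <- sigma2 * A %*% crossprod(X) %*% A             # Var(beta_RR)
  tq <- qt(1 - (1 - level) / 2, df = n - p)
  cbind(lower = beta_k - tq * sqrt(diag(V)),
        upper = beta_k + tq * sqrt(diag(V)))
}
```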
The 95% CIs for the predictors are presented in Table 8. We denote L ( β ^ i ) and U ( β ^ i ) as the lower and upper bounds of the confidence interval, respectively [10].
In Table 8, where the CI values are presented, it is clear that traditional estimators such as OLS, HK, HKB, LC, and RM exhibit wide ranges, reflecting higher variability and lower precision. KAM and KMS provide somewhat narrower bounds, with KMS showing improved stability compared to OLS. The shrinkage-type estimators (ST, SH1–SH6) considerably reduce interval widths, with SH2–SH5 in particular producing highly compact bounds that suggest strong reliability. Most notably, the BAD family (BAD1–BAD6) achieves the narrowest intervals overall, with BAD4 displaying the smallest and most consistent ranges across all coefficients, marking it as the most precise and stable estimator among those compared.
Figure 3 illustrates a response plot based on the BAD4 estimator. It shows the relationship between the fitted values (predicted values) and the observed values of the variable Y. The x-axis represents the fitted values (Ŷ) from BAD4, which are the predicted values from a statistical model using the BAD4 estimator. The y-axis represents the observed Y, i.e., the actual data points. The plot illustrates how well the BAD4 estimator fits the data while highlighting potential outliers. Most points (blue) align with the model’s predictions, indicating a good overall fit. One X-outlier (orange) stands out with unusually high leverage. This diagnostic tool helps to visually identify influential observations that may affect model performance.

4.2. Gasoline Consumption Data

To further evaluate the performance of the proposed estimators under severe multicollinearity with y-direction outliers, the Gasoline Consumption dataset was analyzed [33]. This dataset comprises 30 automobile models, with fuel consumption (miles per gallon, y) as the response variable and four predictors: displacement (X1), horsepower (X2), torque (X3), and width (X4). The linear regression model is expressed as follows:
y = β 0 + β 1 X 1 + β 2 X 2 + β 3 X 3 + β 4 X 4 + ε
The eigenvalues (λ₁ = 3.606, λ₂ = 0.335, λ₃ = 0.052, λ₄ = 0.007) yield a condition number of 531.128 and a condition index of 23.046, confirming severe multicollinearity. Outlier detection using Cook’s distance identified observations 8 and 24 (values 0.01 and 0.00, respectively) as having negligible influence on the regression estimates. Correlation analysis (Figure 4) revealed strong positive associations among the predictors, while gasoline consumption showed a moderate negative relationship with all explanatory variables.
The results in Table 9 demonstrate that the proposed estimators significantly outperform OLS and RM, which yield very high MSE values (41.67 and 43.39, respectively), indicating poor predictive accuracy under multicollinearity and outliers. In contrast, estimators such as BAD2 (MSE = 1.9266), BAD3 (MSE = 1.9338), and BAD5 (MSE = 1.9339) achieve the lowest MSE values, demonstrating superior efficiency.
Table 9 below demonstrates that in the case of severe multicollinearity along with outliers, as in the gasoline consumption data, all the BAD estimators perform well compared to other traditional and newly developed TPREs, among which the BAD2 estimator outperforms all the others with a lower MSE; the same results are presented in a bar chart (Figure 5).
Table 10 presents the 95% CI results for the proposed estimators and demonstrates that the most striking results come from the BAD estimators (BAD1–BAD6), which consistently generate extremely narrow and stable confidence intervals. Among them, BAD4 demonstrates the smallest and most uniform ranges for all coefficients, establishing it as the most precise and reliable estimator for the gasoline consumption data.

5. Some Concluding Remarks

This study presented a comparative analysis of traditional regression estimators (OLS), one-parameter ridge estimators and recently proposed two-parameter ridge estimators (SH1–SH6), traditional robust ridge, and a novel class of Two-Parameter Robust Ridge M-Estimators (BAD1–BAD6). Both simulation experiments and real-life datasets (Tobacco and Gasoline Consumption) were used to evaluate performance, with MSE serving as the primary criterion. The findings demonstrate that OLS is highly sensitive to multicollinearity and outliers, producing unstable and inefficient estimates under such conditions. Recently developed two-parameter ridge estimators improved estimation accuracy over classical ridge forms by introducing an additional shrinkage parameter, yet they remain vulnerable to extreme data contamination. Robust ridge estimators demonstrated resilience to outliers, but were less efficient under severe multicollinearity. In contrast, the proposed BAD series of Two-Parameter Robust Ridge M-Estimators successfully integrates the dual benefits of robustness and efficiency. Specifically, BAD2, BAD4, and BAD6 consistently achieved the lowest MSE values across diverse scenarios, showing greater stability compared to both earlier two-parameter ridge estimators and robust single-parameter methods. These results provide justification for the theoretical superiority and practical utility of the proposed class.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/stats8040084/s1: Table S1. Estimated MSE with n = 50 , p = 4 , and 10% outliers in the y-direction. Table S2. Estimated MSE with n = 50 , p = 4 , and 20% outliers in the y-direction. Table S3. Estimated MSE with n = 100 , p = 4 , and 10% outliers in the y-direction. Table S4. Estimated MSE with n = 100 , p = 4 , and 20% outliers in the y-direction. Table S5. Estimated MSE with n = 50 , p = 10 , and 10% outliers in the y-direction. Table S6. Estimated MSE with n = 50 , p = 10 , and 20% outliers in the y-direction. Table S7. Estimated MSE with n = 100 , p = 10 , and 10% outliers in the y-direction. Table S8. Estimated MSE with n = 100 , p = 10 , and 20% outliers in the y-direction.

Author Contributions

B.H. contributed to the methodology, data analysis, and manuscript writing. S.M.A. supervised the research work. D.W. co-supervised the research and developed the R code. B.M.G.K. provided proofreading of the final manuscript and addressed the reviewers’ comments. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request. Further, no experiments on humans and/or use of human tissue samples were involved in this study.

Acknowledgments

The authors are thankful to the editors and reviewers for their valuable comments and suggestions, which have certainly helped to improve the quality and presentation of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mao, S. Statistical Derivation of Linear Regression. In International Conference on Statistics, Applied Mathematics, and Computing Science (CSAMCS 2021); SPIE: Bellingham, WA, USA, 2022; Volume 12163, pp. 893–901. [Google Scholar]
  2. Maulud, D.; Abdulazeez, A.M. A Review on Linear Regression Comprehensive in Machine Learning. J. Appl. Sci. Technol. Trends 2020, 1, 140–147. [Google Scholar] [CrossRef]
  3. Lukman, A.F.; Mohammed, S.; Olaluwoye, O.; Farghali, R.A. Handling Multicollinearity and Outliers in Logistic Regression Using the Robust Kibria–Lukman Estimator. Axioms 2024, 14, 19. [Google Scholar] [CrossRef]
  4. Lukman, A.F.; Jegede, S.L.; Bellob, A.B.; Binuomote, S.; Haadi, A. Modified Ridge-Type Estimator with Prior Information. Int. J. Eng. Res. Technol. 2019, 12, 1668–1676. [Google Scholar]
  5. Hoerl, A.E.; Kennard, R.W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 1970, 12, 55–67. [Google Scholar] [CrossRef]
  6. Mermi, S.; Göktaş, A.; Akkuş, Ö. How Well Do Ridge Parameter Estimators Proposed so Far Perform in Terms of Normality, Outlier Detection, and MSE Criteria? Commun. Stat. Comput. 2024. [Google Scholar] [CrossRef]
  7. Lipovetsky, S.; Conklin, W.M. Ridge Regression in Two-Parameter Solution. Appl. Stoch. Model. Bus. Ind. 2005, 21, 525–540. [Google Scholar] [CrossRef]
  8. Toker, S.; Kaçiranlar, S. On the Performance of Two Parameter Ridge Estimator under the Mean Square Error Criterion. Appl. Math. Comput. 2013, 219, 4718–4728. [Google Scholar] [CrossRef]
  9. Abdelwahab, M.M.; Abonazel, M.R.; Hammad, A.T.; El-Masry, A.M. Modified Two-Parameter Liu Estimator for Addressing Multicollinearity in the Poisson Regression Model. Axioms 2024, 13, 46. [Google Scholar] [CrossRef]
  10. Alharthi, M.F.; Akhtar, N. Modified Two-Parameter Ridge Estimators for Enhanced Regression Performance in the Presence of Multicollinearity: Simulations and Medical Data Applications. Axioms 2025, 14, 527. [Google Scholar] [CrossRef]
  11. Shakir, M.; Ali, A.; Suhail, M.; Sadun, E. On the Estimation of Ridge Penalty in Linear Regression: Simulation and Application. Kuwait J. Sci. 2024, 51, 100273. [Google Scholar] [CrossRef]
  12. Khan, M.S. Adaptive Penalized Regression for High-Efficiency Estimation in Correlated Predictor Settings: A Data-Driven Shrinkage Approach. Mathematics 2025, 13, 2884. [Google Scholar] [CrossRef]
  13. Silvapulle, M.J. Robust Ridge Regression Based On An M-Estimator. Austral. J. Stat. 1991, 33, 319–333. [Google Scholar]
  14. Hoerl, A.E.; Kannard, R.W.; Baldwin, K.F. Ridge Regression: Some Simulations. Commun. Stat. Methods 1975, 4, 105–123. [Google Scholar] [CrossRef]
  15. Lawless, J.F.; Wang, P. A Simulation Study Of Ridge And Other Regression Estimators. Commun. Stat.—Theory Methods 1976, 5, 307–323. [Google Scholar] [CrossRef]
  16. Begashaw, G.B.; Yohannes, Y.B. Review of Outlier Detection and Identifying Using Robust Regression Model. Int. J. Syst. Sci. Appl. Math. 2020, 5, 4–11. [Google Scholar] [CrossRef]
  17. Majid, A.; Ahmad, S.; Aslam, M.; Kashif, M. A Robust Kibria–Lukman Estimator for Linear Regression Model to Combat Multicollinearity and Outliers. Concurr. Comput. Pract. Exp. 2023, 35, e7533. [Google Scholar] [CrossRef]
  18. Suhail, M.; Chand, S.; Aslam, M. New Quantile Based Ridge M-Estimator for Linear Regression Models with Multicollinearity and Outliers. Commun. Stat. Simul. Comput. 2023, 52, 1418–1435. [Google Scholar] [CrossRef]
  19. Wasim, D.; Suhail, M.; Albalawi, O.; Shabbir, M. Weighted Penalized M-Estimators in Robust Ridge Regression: An Application to Gasoline Consumption Data. J. Stat. Comput. Simul. 2024, 94, 3427–3456. [Google Scholar] [CrossRef]
  20. Ertaş, H.; Toker, S.; Kaçıranlar, S. Robust Two Parameter Ridge M-Estimator for Linear Regression. J. Appl. Stat. 2015, 42, 1490–1502. [Google Scholar] [CrossRef]
  21. Yasin, S.; Salem, S.; Ayed, H.; Kamal, S.; Suhail, M.; Khan, Y.A. Modified Robust Ridge M-Estimators in Two-Parameter Ridge Regression Model. Math. Probl. Eng. 2021, 2021, 1845914. [Google Scholar] [CrossRef]
  22. Wasim, D.; Khan, S.A.; Suhail, M. Modified Robust Ridge M-Estimators for Linear Regression Models: An Application to Tobacco Data. J. Stat. Comput. Simul. 2023, 93, 2703–2724. [Google Scholar] [CrossRef]
  23. Huber, P.J. Robust Statistical Procedures; SIAM: Philadelphia, PA, USA, 1996. [Google Scholar]
  24. Hoerl, A.E.; Kennard, R.W. Ridge Regression: Applications to Nonorthogonal Problems. Technometrics 1970, 12, 69–82. [Google Scholar] [CrossRef]
  25. Kibria, B.M.G. Performance of Some New Ridge Regression Estimators. Commun. Stat. Part B Simul. Comput. 2003, 32, 419–435. [Google Scholar] [CrossRef]
  26. Khalaf, G.; Mansson, K.; Shukur, G. Modified Ridge Regression Estimators. Commun. Stat.—Theory Methods 2013, 42, 1476–1487. [Google Scholar] [CrossRef]
  27. Dar, I.S.; Chand, S.; Shabbir, M.; Kibria, B.M.G. Condition-Index Based New Ridge Regression Estimator for Linear Regression Model with Multicollinearity. Kuwait J. Sci. 2023, 50, 91–96. [Google Scholar] [CrossRef]
  28. Ali, A.; Suhail, M.; Awwad, F.A. On the Performance of Two-Parameter Ridge Estimators for Handling Multicollinearity Problem in Linear Regression: Simulation and Application. AIP Adv. 2023, 13, 115208. [Google Scholar] [CrossRef]
  29. Ertaş, H. A Modified Ridge M-Estimator for Linear Regression Model with Multicollinearity and Outliers. Commun. Stat. Simul. Comput. 2018, 47, 1240–1250. [Google Scholar] [CrossRef]
  30. Suhail, M.; Chand, S.; Kibria, B.M.G. Quantile-Based Robust Ridge m-Estimator for Linear Regression Model in Presence of Multicollinearity and Outliers. Commun. Stat. Simul. Comput. 2021, 50, 3194–3206. [Google Scholar] [CrossRef]
  31. Wasim, D.; Khan, S.A.; Suhail, M.; Shabbir, M. New Penalized M-Estimators in Robust Ridge Regression: Real Life Applications Using Sports and Tobacco Data. Commun. Stat. Simul. Comput. 2025, 54, 1746–1765. [Google Scholar] [CrossRef]
  32. Myers, R.H.; Sliema, B. Classical and Modern Regression with Applications; Duxbury Thomson Learning: Pacific Grove, CA, USA, 1990. [Google Scholar]
  33. Babar, I.; Ayed, H.; Chand, S.; Suhail, M.; Khan, Y.A.; Marzouki, R. Modified Liu Estimators in the Linear Regression Model: An Application to Tobacco Data. PLoS ONE 2021, 16, e0259991. [Google Scholar] [CrossRef]
Figure 1. Diagnostic plot of BAD4 estimator (observed vs. predicted values).
Figure 2. Pairwise correlation between variables of tobacco data of ref. [32].
Figure 3. Diagnosis of influential observation and BAD4 estimator.
Figure 4. Pairwise correlation between variables of gasoline consumption data.
Figure 5. Estimated MSE of OLS, RR, TPRR, and TPRRE on gasoline consumption data.
Table 1. Estimated MSE values with n = 25, p = 4, and 10% outliers in the y-direction. The first four columns correspond to σ² = 0.1 and the last four to σ² = 1.

| Estimator | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|
| OLS | 6.54715199 | 20.95106882 | 105.5304231 | 1066.107042 | 8.78480056 | 27.33002393 | 139.1731178 | 1345.611102 |
| HK | 3.33311058 | 10.47577522 | 52.14910175 | 525.7322212 | 4.51098404 | 13.83365763 | 70.61252185 | 669.4597999 |
| HKB | 1.94987753 | 5.57549348 | 26.39481325 | 263.6889613 | 2.59025794 | 7.31764713 | 36.07807873 | 334.5928801 |
| KAM | 0.51929794 | 0.39756238 | 0.24831816 | 0.21161024 | 0.7286564 | 0.77529207 | 0.86278377 | 0.44779355 |
| KMS | 1.42824372 | 7.761373 | 69.62319025 | 972.668171 | 2.19101398 | 11.39166435 | 97.70809437 | 1239.587147 |
| LC | 4.00838268 | 12.25215963 | 60.2990279 | 605.6202206 | 5.51814168 | 16.45496312 | 82.64997568 | 783.1625333 |
| ST | 36,880.4599 | 11,807.5597 | 1319.621316 | 138.7851291 | 719,398.9654 | 117,193.6689 | 336.9727693 | 190.8436776 |
| SH1 | 14.81471775 | 14.05215153 | 8.92650916 | 1.35685621 | 8.52235209 | 18.64904267 | 12.75262638 | 1.85909349 |
| SH2 | 0.08559237 | 0.19857319 | 0.07770096 | 0.07376037 | 0.11696336 | 0.29742759 | 0.11235643 | 0.09623571 |
| SH3 | 0.19472094 | 0.09973364 | 0.07699223 | 0.07375463 | 0.28193051 | 0.14878645 | 0.11081441 | 0.09622856 |
| SH4 | 0.3077928 | 0.09033178 | 0.07391053 | 0.0734318 | 0.45020753 | 0.13388056 | 0.10210345 | 0.09582625 |
| SH5 | 0.19211242 | 0.09953795 | 0.07698625 | 0.07375458 | 0.27800616 | 0.14848181 | 0.110801 | 0.09622849 |
| SH6 | 3.2720966 | 9.75407515 | 47.52873911 | 475.6987902 | 4.50813068 | 13.10802796 | 64.55541601 | 616.1533227 |
| RM | 0.01250155 | 0.03866251 | 0.1923938 | 1.9469604 | 1.18578052 | 3.72506188 | 18.26307215 | 181.1788334 |
| BAD1 | 0.00019202 | 0.00019042 | 0.0001415 | 0.00014299 | 0.02006897 | 0.04835483 | 0.02844459 | 0.09703252 |
| BAD2 | 0.00020615 | 0.00016157 | 0.00013975 | 0.00014243 | 0.02035481 | 0.01533975 | 0.01407996 | 0.01421816 |
| BAD3 | 0.00020656 | 0.00016143 | 0.00013975 | 0.00014243 | 0.02048838 | 0.01527999 | 0.01407695 | 0.01421774 |
| BAD4 | 0.0002068 | 0.0001614 | 0.00013974 | 0.00014242 | 0.02056896 | 0.01527077 | 0.01406013 | 0.01419418 |
| BAD5 | 0.00020655 | 0.00016143 | 0.00013975 | 0.00014243 | 0.02048631 | 0.01527981 | 0.01407692 | 0.01421774 |
| BAD6 | 0.00021634 | 0.00017176 | 0.00016521 | 0.00157747 | 0.02462344 | 0.02483693 | 0.33748625 | 37.31415504 |

Note: Bold values show the minimum MSE of estimators.
Table 2. Estimated MSE values with n = 25, p = 4, and 20% outliers in the y-direction. The first four columns correspond to σ² = 0.1 and the last four to σ² = 1.

| Estimator | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|
| OLS | 10.34256 | 31.88066 | 157.9254 | 1569.838901 | 13.22569722 | 40.38766039 | 202.9435159 | 1936.173262 |
| HK | 3.96519 | 11.97882 | 58.46895 | 580.7742392 | 5.19255344 | 15.82247589 | 79.48724404 | 734.9093383 |
| HKB | 2.04085 | 5.709157 | 26.36743 | 262.5894907 | 2.63296273 | 7.47857277 | 36.36066518 | 330.6130589 |
| KAM | 0.763378 | 0.793959 | 0.600503 | 0.22312187 | 0.826874 | 0.86341291 | 0.65346602 | 0.23961787 |
| KMS | 1.601057 | 9.110767 | 92.12482 | 1404.612699 | 2.46446791 | 13.9291572 | 130.8824147 | 1760.583628 |
| LC | 5.39856 | 15.54303 | 74.13663 | 729.3512174 | 7.22677512 | 20.89479407 | 102.1910372 | 949.0051419 |
| ST | 4102.904 | 693.9592 | 1887.411 | 70.7013897 | 19,454.36377 | 2825.472693 | 8103.529331 | 61,495.9898 |
| SH1 | 19.50584 | 23.27372 | 17.81338 | 3.75526815 | 23.94366267 | 29.74590963 | 24.49427889 | 3.6694769 |
| SH2 | 0.098567 | 0.335446 | 0.090245 | 0.08256641 | 0.13126183 | 0.4963387 | 0.27441331 | 0.10774046 |
| SH3 | 0.321303 | 0.132414 | 0.088944 | 0.08255631 | 0.48453326 | 0.20317726 | 0.2687795 | 0.10771952 |
| SH4 | 0.552865 | 0.113572 | 0.083324 | 0.081998 | 0.81256928 | 0.17380233 | 0.22732701 | 0.10669985 |
| SH5 | 0.315981 | 0.13202 | 0.088933 | 0.08255621 | 0.47675686 | 0.20257605 | 0.26872923 | 0.10771931 |
| SH6 | 5.903449 | 17.31502 | 83.06208 | 818.0131153 | 7.66088642 | 22.26423997 | 108.6451659 | 1031.511647 |
| RM | 0.016182 | 0.04885 | 0.243162 | 2.44535836 | 1.44260978 | 4.48611961 | 21.97417925 | 217.3198248 |
| BAD1 | 0.000217 | 0.00022 | 0.000158 | 0.00015759 | 0.0254554 | 0.07285979 | 0.04312353 | 0.18796203 |
| BAD2 | 0.000239 | 0.000177 | 0.000155 | 0.0001565 | 0.02512139 | 0.016518 | 0.01507368 | 0.01527896 |
| BAD3 | 0.000239 | 0.000177 | 0.000155 | 0.0001565 | 0.02543111 | 0.01642001 | 0.01506854 | 0.01527816 |
| BAD4 | 0.00024 | 0.000177 | 0.000155 | 0.0001565 | 0.02561886 | 0.01640499 | 0.0150402 | 0.01523528 |
| BAD5 | 0.000239 | 0.000177 | 0.000155 | 0.0001565 | 0.02542629 | 0.01641973 | 0.0150685 | 0.01527815 |
| BAD6 | 0.000255 | 0.000192 | 0.000196 | 0.00309636 | 0.0359694 | 0.03309437 | 0.59384201 | 53.76964961 |

Note: Bold values show the minimum MSE of estimators.
Table 3. Estimated MSE values with n = 25, p = 10, and 10% outliers in the y-direction. The first four columns correspond to σ² = 0.1 and the last four to σ² = 1.

| Estimator | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|
| OLS | 78.30101157 | 246.2671071 | 1282.484366 | 12,681.89645 | 97.90390012 | 297.3277576 | 1581.400823 | 15,639.95389 |
| HK | 59.70684765 | 185.1787591 | 960.3022424 | 9447.402076 | 73.30407106 | 219.851777 | 1165.903069 | 11,516.97663 |
| HKB | 24.36907358 | 74.99720452 | 390.3026983 | 3860.345127 | 29.66030948 | 88.31886929 | 471.0762527 | 4702.277742 |
| KAM | 0.6135111 | 1.02795064 | 3.21425694 | 3.12285301 | 0.89011157 | 1.43331643 | 3.38579123 | 2.47170042 |
| KMS | 53.1599361 | 198.6214028 | 1198.017093 | 12,561.75534 | 67.73458451 | 241.8175466 | 1482.81145 | 15,497.30392 |
| LC | 63.55698725 | 196.4577794 | 1018.226315 | 10,015.06346 | 78.69633868 | 234.945313 | 1246.347394 | 12,308.08624 |
| ST | 40,876.59522 | 2597.150442 | 5612.97101 | 67,589.01596 | 8917.600689 | 3248.416821 | 212.4032474 | 161.2248544 |
| SH1 | 14.98125309 | 10.00843044 | 3.30767597 | 0.35385091 | 9.83737185 | 13.32517305 | 4.52547536 | 0.45598036 |
| SH2 | 0.109851 | 0.04961304 | 0.03755306 | 0.03380807 | 0.14702082 | 0.10129581 | 0.0481887 | 0.04236365 |
| SH3 | 0.09775184 | 0.04909948 | 0.03753775 | 0.03380792 | 0.12925867 | 0.10005417 | 0.04816896 | 0.04236346 |
| SH4 | 0.0813807 | 0.04414316 | 0.03660443 | 0.03371711 | 0.10497725 | 0.08061929 | 0.04696956 | 0.04224744 |
| SH5 | 0.0976654 | 0.0490947 | 0.0375376 | 0.03380792 | 0.12913108 | 0.10004234 | 0.04816877 | 0.04236346 |
| SH6 | 34.98795682 | 110.9290664 | 584.0681269 | 5812.991517 | 45.09842247 | 137.4538888 | 740.8571305 | 7377.198788 |
| RM | 0.18697271 | 0.584734 | 3.01649314 | 30.73515333 | 15.64477956 | 48.39233284 | 251.8368436 | 2529.300274 |
| BAD1 | 0.00014484 | 0.00010145 | 0.00008836 | 0.00008188 | 0.03349657 | 0.01868657 | 0.03515327 | 0.28948076 |
| BAD2 | 0.00014087 | 0.00010058 | 0.00008793 | 0.00008108 | 0.0125961 | 0.00903863 | 0.00782488 | 0.00732175 |
| BAD3 | 0.00014086 | 0.00010058 | 0.00008793 | 0.00008108 | 0.01257577 | 0.00903695 | 0.00782459 | 0.00732166 |
| BAD4 | 0.00014084 | 0.00010058 | 0.00008793 | 0.00008108 | 0.01254239 | 0.00901624 | 0.00780328 | 0.00727014 |
| BAD5 | 0.00014086 | 0.00010058 | 0.00008793 | 0.00008108 | 0.01257562 | 0.00903694 | 0.00782459 | 0.00732166 |
| BAD6 | 0.00015539 | 0.00012606 | 0.00060024 | 0.18068773 | 0.16795153 | 0.69643475 | 22.6771572 | 1184.093386 |

Note: Bold values show the minimum MSE of estimators.
Table 4. Estimated MSE values with n = 25, p = 10, and 20% outliers in the y-direction. The first four columns correspond to σ² = 0.1 and the last four to σ² = 1.

| Estimator | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|
| OLS | 114.5824414 | 361.587 | 1888.761 | 18,724.23 | 139.0351 | 434.956 | 2291.063 | 23,059.4307 |
| HK | 70.53059487 | 218.4392 | 1143.448 | 11,128.21 | 84.73393 | 261.6553 | 1369.973 | 13,805.1276 |
| HKB | 21.88435176 | 67.16994 | 357.5224 | 3487.577 | 26.2948 | 80.86005 | 426.6157 | 4368.603253 |
| KAM | 0.92014958 | 1.029153 | 1.728026 | 0.927162 | 1.033155 | 1.202124 | 1.769106 | 0.84455065 |
| KMS | 65.56693152 | 263.3651 | 1706.408 | 18,451.98 | 82.85135 | 327.1805 | 2097.182 | 22,783.94644 |
| LC | 79.90803527 | 244.9398 | 1278.349 | 12,459.42 | 97.12311 | 296.866 | 1553.46 | 15,649.26391 |
| ST | 7643.92498 | 4936.112 | 1263.459 | 351.03767 | 285,876.369 | 44.461 | 20,035.39 | 72.26809438 |
| SH1 | 27.71243324 | 20.4231 | 27.04482 | 10.768801 | 35.10892 | 26.86367 | 11.23122 | 0.87839776 |
| SH2 | 0.21860749 | 0.091586 | 0.067434 | 0.065076 | 0.323573 | 0.235869 | 0.132894 | 0.07912513 |
| SH3 | 0.18605972 | 0.090179 | 0.067406 | 0.065076 | 0.279072 | 0.231278 | 0.132104 | 0.07912482 |
| SH4 | 0.14091301 | 0.077381 | 0.065756 | 0.064917 | 0.216052 | 0.17176 | 0.088677 | 0.07893107 |
| SH5 | 0.18582474 | 0.090166 | 0.067406 | 0.065076 | 0.278749 | 0.231235 | 0.132096 | 0.07912482 |
| SH6 | 57.31405626 | 178.9567 | 936.075 | 9340.3577 | 71.19536 | 220.7642 | 1161.877 | 11,802.61724 |
| RM | 9.07072174 | 28.48965 | 163.0869 | 1509.564 | 34.80965 | 109.5238 | 553.2336 | 5853.776643 |
| BAD1 | 0.11094867 | 0.307179 | 0.356093 | 5.742583 | 0.424227 | 2.610388 | 0.750185 | 24.01817765 |
| BAD2 | 0.00909016 | 0.006582 | 0.005895 | 0.00536 | 0.07027 | 0.031189 | 0.018725 | 0.02142807 |
| BAD3 | 0.00904067 | 0.006568 | 0.005894 | 0.005358 | 0.060203 | 0.030799 | 0.018722 | 0.02141773 |
| BAD4 | 0.00896176 | 0.006418 | 0.005844 | 0.004721 | 0.049771 | 0.026617 | 0.018587 | 0.0175246 |
| BAD5 | 0.00904029 | 0.006567 | 0.005894 | 0.005358 | 0.06014 | 0.030795 | 0.018722 | 0.02141763 |
| BAD6 | 0.40524965 | 4.699362 | 37.93707 | 1104.892 | 1.430925 | 12.33273 | 115.2717 | 3981.876529 |

Note: Bold values show the minimum MSE of estimators.
Table 5. Estimated MSE values from the heavy-tailed distribution, i.e., the standardized Cauchy distribution. The first four columns correspond to n = 25 and the last four to n = 50.

p = 4

| Estimator | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|
| OLS | 327.773 | 1063.451 | 6,268,975.814 | 100,402.3703 | 4,617,964.808 | 352.2587 | 19,921.322 | 11,053.8726 |
| HK | 189.935 | 378.6552 | 5,010,510.541 | 46,279.6884 | 2,201,062.274 | 197.93 | 11,162.192 | 2,670.4271 |
| HKB | 121.5978 | 164.8643 | 3,341,404.932 | 20,747.83637 | 680,283.2972 | 100.5954 | 4761.4449 | 209.271088 |
| KAM | 6.147392 | 3.682777 | 2190.607255 | 17.2023034 | 242.9484754 | 67.32855 | 99.72372 | 26.26712893 |
| KGM | 46.82245 | 26.956064 | 27,858.6245 | 892.3312934 | 5001.186514 | 26.485413 | 15.316854 | 2.2703827 |
| LC | 265.8935 | 616.8498 | 5,957,957.417 | 60,018.41381 | 4,164,713.177 | 272.4177 | 15,175.533 | 5,094.67056 |
| ST | 2568.241 | 1696.132 | 20,761.0386 | 431.57042703 | 36,294.57483 | 1761.21 | 128.6355 | 59.91351106 |
| SH1 | 442.1007 | 880.3275 | 2,989,821.812 | 417.1621511 | 4,909,038.352 | 95.8112 | 2211.58 | 1029.955269 |
| SH2 | 6.3003685 | 3.94683 | 3029.511305 | 18.56646637 | 9123.135724 | 16.02873 | 18.3849 | 44.84214598 |
| SH3 | 69.94817 | 24.27765 | 2896.131609 | 18.56580021 | 84,911.7016 | 27.391764 | 16.9755 | 44.83676737 |
| SH4 | 105.4299 | 21.30047 | 2295.865319 | 18.52384285 | 44,190.95 | 28.236192 | 12.424054 | 4.51518299 |
| SH5 | 68.87312 | 24.21705 | 2895.003736 | 18.56579358 | 82,802.9464 | 57.37234 | 16.96386 | 44.8367139 |
| SH6 | 266.9576 | 740.8614 | 5,538,159.019 | 52,561.7822 | 3,672,641.926 | 242.0055 | 10,702.791 | 67,915.1946 |
| RM | 0.052478 | 0.157602 | 0.8291292 | 98.45217319 | 0.02025559 | 0.056358 | 0.282383 | 2.73867735 |
| BAD1 | 0.000961 | 0.001 | 0.00078341 | 0.00082409 | 0.00040181 | 0.000364 | 0.000235 | 0.00023702 |
| BAD2 | 0.001039 | 0.000757 | 0.00066915 | 0.00062743 | 0.00042194 | 0.0003 | 0.000234 | 0.00023601 |
| BAD3 | 0.001041 | 0.000756 | 0.0006691 | 0.00062743 | 0.00042247 | 0.000299 | 0.000234 | 0.00023601 |
| BAD4 | 0.001042 | 0.000756 | 0.00066877 | 0.00062731 | 0.0004234 | 0.000299 | 0.000234 | 0.00023601 |
| BAD5 | 0.001041 | 0.000756 | 0.00066909 | 0.00062743 | 0.00042246 | 0.000299 | 0.000234 | 0.00023601 |
| BAD6 | 0.0011 | 0.000834 | 0.00214831 | 0.26228891 | 0.00043533 | 0.000323 | 0.000255 | 0.00261408 |

p = 10

| Estimator | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|
| OLS | 1962.465031 | 30,316.14864 | 1,143,828.418 | 140,897.3123 | 146,739.8517 | 3,645,514.819 | 13,285,691 | 3,169,694,630 |
| HK | 1221.364651 | 19,700.14084 | 763,572.5955 | 69,907.96299 | 98,328.61683 | 1,310,513.345 | 4,946,739 | 2,020,877,309 |
| HKB | 575.9853444 | 9406.327958 | 396,470.2673 | 25,064.08437 | 38,577.65478 | 480,423.8818 | 1,478,068 | 883,580,608 |
| KAM | 4.94997946 | 8.10652667 | 12.3067322 | 3.36912377 | 406.4612545 | 1589.516305 | 3198.1814 | 2,966.64378 |
| KGM | 50.20853656 | 1337.444523 | 4,208.675292 | 717.5106726 | 972.4917377 | 9,112.74147 | 122,181 | 61,197,547.54 |
| LC | 1551.863026 | 24,223.87956 | 922,548.394 | 91,379.3289 | 120,271.6096 | 2,305,440.218 | 6,811,175 | 2,395,866,595 |
| ST | 13,689.51613 | 1,379,972.324 | 7959.799135 | 14.57273233 | 16,128.3279 | 1,217,601.802 | 3932.842 | 47,924.76525 |
| SH1 | 1360.21416 | 12,070.84632 | 41,504.0424 | 800.6949586 | 75,359.33336 | 1,401,920.536 | 371,453.3 | 30,246,242.64 |
| SH2 | 222.778941 | 595.9614179 | 955.9557277 | 71.77309945 | 10,441.58518 | 3340.472875 | 3684.657 | 39,857.14819 |
| SH3 | 159.968455 | 285.7275385 | 555.3796896 | 61.77289538 | 8763.684835 | 3205.675921 | 3682.009 | 39,856.55337 |
| SH4 | 130.7920201 | 36.30574976 | 32.3187646 | 41.70545238 | 9832.474763 | 2364.856794 | 3552.198 | 39,530.93114 |
| SH5 | 159.575517 | 685.638631 | 155.3740094 | 71.77289332 | 8751.614784 | 3204.448434 | 3681.983 | 39,856.54738 |
| SH6 | 1480.758112 | 22,928.06242 | 854,495.7785 | 97,626.04957 | 94,579.9674 | 2,811,101.673 | 8,812,898 | 2,122,229,381 |
| RM | 0.3606254 | 1.32566275 | 5.59342398 | 51.79212546 | 9.83669772 | 32.5811842 | 1166.1978 | 1679.712846 |
| BAD1 | 0.00122618 | 0.05193378 | 0.00069859 | 4.37533073 | 0.18818027 | 0.0575509 | 0.073741 | 8.31706426 |
| BAD2 | 0.00097651 | 0.00178907 | 0.00049695 | 0.00054845 | 0.11393138 | 0.01474653 | 0.011736 | 0.01139534 |
| BAD3 | 0.00097576 | 0.00170428 | 0.00049694 | 0.00054837 | 0.11386395 | 0.01474037 | 0.011734 | 0.01139511 |
| BAD4 | 0.00097543 | 0.00134147 | 0.00049661 | 0.00052335 | 0.11390606 | 0.01469784 | 0.011626 | 0.01127013 |
| BAD5 | 0.00097575 | 0.00170356 | 0.00049694 | 0.00054837 | 0.1138635 | 0.01474031 | 0.011734 | 0.0113951 |
| BAD6 | 0.00139616 | 0.10623231 | 0.06870748 | 5.99888172 | 73.79663749 | 1.30656839 | 16.802832 | 179.907373 |

Note: Bold values show the minimum MSE of estimators.
Table 6. Summary table of recommended estimators in terms of minimum MSE values. The first four ρ columns correspond to p = 4 and the last four to p = 10.

| Outliers | n | σ² | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 | ρ = 0.85 | ρ = 0.95 | ρ = 0.99 | ρ = 0.999 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10% | 25 | 0.1 | BAD1 | BAD4 | BAD4 | BAD4 | BAD4 | BAD2345 | BAD2345 | BAD2345 |
| | | 1 | BAD1 | BAD4 | BAD4 | BAD4 | BAD4 | BAD4 | BAD4 | BAD4 |
| | 50 | 0.1 | BAD1 | BAD2345 | BAD2345 | BAD2345 | BAD2345 | BAD2345 | BAD2345 | BAD2345 |
| | | 1 | BAD1 | BAD4 | BAD4 | BAD4 | BAD5 | BAD4 | BAD4 | BAD4 |
| | 100 | 0.1 | BAD1 | BAD35 | BAD2345 | BAD2345 | BAD2345 | BAD2345 | BAD2345 | BAD2345 |
| | | 1 | BAD1 | BAD35 | BAD4 | BAD4 | BAD35 | BAD4 | BAD4 | BAD4 |
| 20% | 25 | 0.1 | BAD1 | BAD4 | BAD4 | BAD2345 | BAD4 | BAD4 | BAD4 | BAD4 |
| | | 1 | BAD2 | BAD4 | BAD4 | BAD4 | BAD4 | BAD4 | BAD4 | BAD4 |
| | 50 | 0.1 | BAD1 | BAD4 | BAD2345 | BAD2345 | BAD345 | BAD4 | BAD4 | BAD4 |
| | | 1 | BAD1 | BAD4 | BAD4 | BAD4 | BAD5 | BAD4 | BAD4 | BAD4 |
| | 100 | 0.1 | BAD1 | BAD35 | BAD2345 | BAD2345 | BAD35 | BAD2345 | BAD2345 | BAD2345 |
| | | 1 | BAD1 | BAD5 | BAD4 | BAD4 | BAD35 | BAD4 | BAD4 | BAD4 |
Table 7. Estimated regression coefficients and MSE values of tobacco data of ref. [32].

| Estimator | MSE | k̂ | β̂1 | β̂2 | β̂3 | β̂4 |
|---|---|---|---|---|---|---|
| OLS | 32.4972 | 0.0229 | 0.4857 | −0.6728 | 1.0744 | 1.4438 |
| HK | 3.6366 | 0.048619 | 0.4856 | −0.6438 | 0.9561 | 1.0548 |
| HKB | 3.4022 | 0.093024 | 0.4856 | −0.6438 | 0.9561 | 1.0548 |
| KAM | 3.4452 | 0.067052 | 0.4856 | −0.6438 | 0.9561 | 1.0548 |
| KMS | 3.6284 | 0.0229 | 0.4856 | −0.6435 | 0.9549 | 1.0515 |
| LC | 3.6391 | 0.0229 | 0.4867 | −0.6453 | 0.9582 | 1.0572 |
| ST | 3.4095 | 0.037977 | 0.2959 | 0.2242 | −0.0961 | −0.0394 |
| SH1 | 3.4286 | 3.759727 | 0.487 | −0.628 | 0.8942 | 0.8987 |
| SH2 | 3.7361 | 3.971751 | 0.4863 | −0.0831 | 0.0521 | 0.0243 |
| SH3 | 3.7394 | 1855.526 | 0.4863 | −0.0793 | 0.0496 | 0.023 |
| SH4 | 3.9102 | 3.973893 | 0.4857 | −0.0032 | 0.0018 | 0.0008 |
| SH5 | 3.7394 | 0.002142 | 0.4863 | −0.0792 | 0.0495 | 0.023 |
| SH6 | 12.5412 | 0.203844 | 0.4858 | −0.6701 | 1.0624 | 1.3961 |
| RM | 6.9082 | 20.18056 | 0.4888 | −0.65 | 1.232 | 0.8844 |
| BAD1 | 2.5162 | 21.31861 | 0.491 | −0.4672 | 0.5899 | 0.2078 |
| BAD2 | 2.8615 | 9959.644 | 0.489 | −0.0188 | 0.0132 | 0.0032 |
| BAD3 | 2.8647 | 21.3301 | 0.489 | −0.018 | 0.0126 | 0.003 |
| BAD4 | 2.9447 | 0.011495 | 0.4888 | −0.0029 | 0.002 | 0.0005 |
| BAD5 | 2.8647 | 0.0229 | 0.489 | −0.018 | 0.0126 | 0.003 |
| BAD6 | 1.7834 | 0.048619 | 0.4893 | −0.6364 | 1.1613 | 0.7471 |

Note: Bold values show the minimum MSE of estimators.
Table 8. Confidence intervals (95%) for tobacco data.

| Estimator | L(β̂1) | U(β̂1) | L(β̂2) | U(β̂2) | L(β̂3) | U(β̂3) | L(β̂4) | U(β̂4) |
|---|---|---|---|---|---|---|---|---|
| OLS | 0.005905409 | 3.00888230 | −1.081589605 | 0.03945686 | −1.939348004 | 0.25614058 | −0.158615271 | 1.80203453 |
| HK | 0.086753653 | 2.2905041 | −0.992556171 | 0.0693747 | −1.472925571 | 0.2829355 | 0.002113015 | 1.6676833 |
| HKB | 0.13701839 | 1.83847598 | −0.91527338 | 0.09201028 | −1.15820596 | 0.30739028 | 0.09383117 | 1.53919381 |
| KAM | 0.1830328 | 1.4101702 | −0.8106986 | 0.1178389 | −0.8334673 | 0.3345412 | 0.1710530 | 1.3623161 |
| KMS | 0.087449377 | 2.28429487 | −0.991640189 | 0.06966175 | −1.468742092 | 0.28323228 | 0.003428956 | 1.66620587 |
| LC | 0.086753653 | 2.2905041 | −0.992556171 | 0.0693747 | −1.472925571 | 0.2829355 | 0.002113015 | 1.6676833 |
| ST | 0.086753653 | 2.2905041 | −0.992556171 | 0.0693747 | −1.472925571 | 0.2829355 | 0.002113015 | 1.6676833 |
| SH1 | 0.11945717 | 1.9974122 | −0.94523488 | 0.0836136 | −1.27146581 | 0.2981569 | 0.06258856 | 1.5897296 |
| SH2 | 0.23338782 | 0.3065626 | 0.09152924 | 0.2272470 | 0.17599167 | 0.2728077 | 0.23910454 | 0.3340377 |
| SH3 | 0.23255299 | 0.3031387 | 0.09761709 | 0.2274706 | 0.17824594 | 0.2710924 | 0.23811747 | 0.3291206 |
| SH4 | 0.01307684 | 0.01553164 | 0.01275406 | 0.01521052 | 0.01296331 | 0.01541858 | 0.01309625 | 0.01555145 |
| SH5 | 0.23254468 | 0.3031058 | 0.09767539 | 0.2274726 | 0.17826721 | 0.2710757 | 0.23810770 | 0.3290733 |
| SH6 | 0.01593272 | 2.91995177 | −1.07199693 | 0.04285575 | −1.88313513 | 0.25867334 | −0.13789222 | 1.78860732 |
| RM | 0.00590541 | 3.00888230 | −1.08158961 | 0.03945686 | −1.93934800 | 0.25614058 | −0.15861527 | 1.80203453 |
| BAD1 | 0.2258518 | 0.9607248 | −0.6281803 | 0.1526566 | −0.4481616 | 0.3590181 | 0.2352873 | 1.0772170 |
| BAD2 | 0.1950698 | 0.2326699 | 0.1683647 | 0.2127897 | 0.1852614 | 0.2249406 | 0.1973076 | 0.2366952 |
| BAD3 | 0.1932249 | 0.2303480 | 0.1679842 | 0.2113445 | 0.1839646 | 0.2229785 | 0.1953685 | 0.2341155 |
| BAD4 | 0.00255732 | 0.00303745 | 0.00249660 | 0.00297679 | 0.00253601 | 0.00301615 | 0.00256080 | 0.00304094 |
| BAD5 | 0.1932065 | 0.2303250 | 0.1679798 | 0.2113299 | 0.1839515 | 0.2229589 | 0.1953492 | 0.2340901 |
| BAD6 | 0.05284466 | 2.59234445 | −1.03357815 | 0.05606784 | −1.67271552 | 0.26987847 | −0.06347370 | 1.73209874 |
Table 9. Estimated regression coefficients and MSE of gasoline consumption data.

| Estimator | MSE | k̂ | β̂1 | β̂2 | β̂3 | β̂4 |
|---|---|---|---|---|---|---|
| OLS | 41.6654 | --- | −1.3852 | −0.0776 | 0.7395 | −0.1818 |
| HK | 2.8849 | 0.125136 | −1.0441 | 0.0062 | 0.3216 | −0.1841 |
| HKB | 2.4426 | 0.391949 | −0.7742 | 0.0343 | 0.0345 | −0.192 |
| KAM | 2.4143 | 11.0074 | −0.265 | −0.1474 | −0.2048 | −0.206 |
| KMS | 2.6778 | 0.180161 | −0.9613 | 0.0201 | 0.2275 | −0.1857 |
| LC | 2.8957 | 0.125136 | −1.0513 | 0.0062 | 0.3238 | −0.1854 |
| ST | 2.3682 | 0.125136 | 0.4766 | −0.6728 | 0.093 | −0.2518 |
| SH1 | 5.2235 | 0.029569 | −1.2763 | −0.0468 | 0.5998 | −0.1825 |
| SH2 | 2.3456 | 2.927352 | −0.4054 | −0.0778 | −0.2043 | −0.2255 |
| SH3 | 2.3522 | 3.59956 | −0.3809 | −0.0942 | −0.2108 | −0.2275 |
| SH4 | 2.4876 | 531.1282 | −0.243 | −0.2195 | −0.2354 | −0.2124 |
| SH5 | 2.3523 | 3.60635 | −0.3807 | −0.0943 | −0.2108 | −0.2275 |
| SH6 | 14.0126 | 0.00679 | −1.3577 | −0.0695 | 0.7039 | −0.1819 |
| RM | 43.3854 | --- | −1.2593 | −0.0761 | 0.6094 | −0.1586 |
| BAD1 | 2.1431 | 0.230061 | −0.8418 | 0.013 | 0.108 | −0.1658 |
| BAD2 | 1.9266 | 22.77608 | −0.2618 | −0.1912 | −0.2292 | −0.2112 |
| BAD3 | 1.9338 | 28.00615 | −0.2573 | −0.1958 | −0.2296 | −0.2106 |
| BAD4 | 1.9738 | 4132.409 | −0.2373 | −0.2178 | −0.2317 | −0.2054 |
| BAD5 | 1.9339 | 28.05898 | −0.2572 | −0.1958 | −0.2297 | −0.2106 |
| BAD6 | 3.6023 | 0.052829 | −1.1027 | −0.0342 | 0.4113 | −0.16 |

Note: Bold values show the minimum MSE of estimators.
Table 10. Confidence intervals (95%) for gasoline consumption data.

| Estimator | L(β̂1) | U(β̂1) | L(β̂2) | U(β̂2) | L(β̂3) | U(β̂3) | L(β̂4) | U(β̂4) |
|---|---|---|---|---|---|---|---|---|
| OLS | −2.8055562 | 0.03505633 | −0.8672192 | 0.71210810 | −1.0835724 | 2.56266260 | −0.5072373 | 0.14366063 |
| HK | −1.9825876 | −0.1055709 | −0.6418893 | 0.6542284 | −0.8043292 | 1.4475464 | −0.5008870 | 0.1327180 |
| HKB | −1.3771884 | −0.1712334 | −0.4905127 | 0.5591917 | −0.5980824 | 0.6671426 | −0.4939614 | 0.1099976 |
| KAM | −0.3487039 | −0.18136093 | −0.2543979 | −0.04044512 | −0.2747117 | −0.13486242 | −0.3469731 | −0.06510444 |
| KMS | −0.8061589 | −0.18855485 | −0.3582714 | 0.31477895 | −0.4132073 | 0.08948395 | −0.4736704 | 0.05204371 |
| LC | −0.9645881 | −0.18691899 | −0.3937764 | 0.40638817 | −0.4623507 | 0.22131849 | −0.4830179 | 0.07408722 |
| ST | −1.7915100 | −0.1311066 | −0.5928314 | 0.6331199 | −0.7391093 | 1.1940173 | −0.4991013 | 0.1277443 |
| SH1 | −1.9825876 | −0.1055709 | −0.6418893 | 0.6542284 | −0.8043292 | 1.4475464 | −0.5008870 | 0.1327180 |
| SH2 | −1.9825876 | −0.1055709 | −0.6418893 | 0.6542284 | −0.8043292 | 1.4475464 | −0.5008870 | 0.1327180 |
| SH3 | −0.5420450 | −0.18992433 | −0.3014015 | 0.12048930 | −0.3352603 | −0.06977346 | −0.4357091 | −0.00148512 |
| SH4 | −0.04862990 | −0.03127241 | −0.04484962 | −0.02730786 | −0.04736591 | −0.03003570 | −0.04395334 | −0.02587430 |
| SH5 | −0.5415998 | −0.18992344 | −0.3013062 | 0.12012548 | −0.3351310 | −0.06997787 | −0.4355971 | −0.00160123 |
| SH6 | −2.7351129 | 0.02166785 | −0.8470784 | 0.70808765 | −1.0597851 | 2.46656350 | −0.5067362 | 0.14312908 |
| RM | −2.8055562 | 0.03505633 | −0.8672192 | 0.71210810 | −1.0835724 | 2.56266260 | −0.5072373 | 0.14366063 |
| BAD1 | −1.6971556 | −0.14224850 | −0.5690558 | 0.62040530 | −0.7068929 | 1.07056110 | −0.4981252 | 0.12469070 |
| BAD2 | −0.2815145 | −0.16555419 | −0.2274766 | −0.08380654 | −0.2441799 | −0.13846204 | −0.2835324 | −0.08844829 |
| BAD3 | −0.2636527 | −0.15873567 | −0.2179411 | −0.09060437 | −0.2336887 | −0.13602504 | −0.2636417 | −0.09219561 |
| BAD4 | −0.00777730 | −0.00500441 | −0.00721815 | −0.00444137 | −0.00760737 | −0.00483503 | −0.00698284 | −0.00419447 |
| BAD5 | −0.2634937 | −0.15866962 | −0.2178510 | −0.09065279 | −0.2335898 | −0.13599593 | −0.2634617 | −0.09222115 |
| BAD6 | −2.3824579 | −0.04213433 | −0.7485814 | 0.68629135 | −0.9403875 | 1.98667893 | −0.5041413 | 0.13955386 |
