A New Biased Estimator to Combat the Multicollinearity of the Gaussian Linear Regression Model

In a multiple linear regression model, the ordinary least squares estimator is inefficient when the multicollinearity problem exists. Many authors have proposed different estimators to overcome the multicollinearity problem for linear regression models. This paper introduces a new regression estimator, called the Dawoud–Kibria estimator, as an alternative to the ordinary least squares estimator. Theory and simulation results show that this estimator performs better than other regression estimators under some conditions, according to the mean squares error criterion. Real-life datasets are used to illustrate the findings of the paper.


Introduction
Consider the following linear regression model: y = Xβ + ε, where y is an n × 1 vector of the dependent variable, X is a known n × p full-rank matrix of explanatory variables, β is a p × 1 vector of unknown regression parameters, and ε is an n × 1 vector of disturbances with zero mean and variance-covariance matrix Cov(ε) = σ²Iₙ, where Iₙ is the identity matrix of order n × n. The ordinary least squares (OLS) estimator of β in (1) is defined by β̂ = S⁻¹X′y, where S = X′X. Under the normality assumption of the disturbances, β̂ follows the N(β, σ²S⁻¹) distribution. In a multiple linear regression model, it is assumed that the explanatory variables are independent. However, in real-life situations, there may be strong or near-to-strong linear relationships among the explanatory variables. This causes the problem of multicollinearity. In the presence of multicollinearity, it is difficult to estimate the unique effect of individual variables in the regression equations. Moreover, the OLS estimator becomes unstable or inefficient and may produce coefficients with the wrong sign (see Hoerl and Kennard [1]). To overcome these problems, many authors have introduced different kinds of one- and two-parameter estimators: to mention a few, Stein [2], Massy [3], Hoerl and Kennard [1], Mayer and Willke [4], Swindel [5], Liu [6], Akdeniz and Kaçiranlar [7], Ozkale and Kaçiranlar [8], Sakallıoglu and Kaçıranlar [9], Yang and Chang [10], Roozbeh [11], Akdeniz and Roozbeh [12], Lukman et al. [13,14], and, very recently, Kibria and Lukman [15], among others.
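As a quick numerical illustration (a sketch, not taken from the paper; all variable names and values are illustrative), the following Python snippet generates nearly collinear regressors, computes the OLS estimator β̂ = S⁻¹X′y, and reports the condition number of S = X′X, whose large value signals the instability described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, rho = 50, 3, 0.99

# Nearly collinear regressors: any two columns share correlation close to rho.
z = rng.standard_normal((n, p + 1))
X = np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]

beta = np.ones(p) / np.sqrt(p)          # chosen so that beta'beta = 1
y = X @ beta + rng.standard_normal(n)

# OLS: beta_hat = S^{-1} X'y with S = X'X
S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)

# A large condition number of S signals multicollinearity.
print(np.linalg.cond(S))
```

With ρ = 0.99, the condition number is far above the usual rules of thumb, and the individual OLS coefficients become highly unstable across repeated samples.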
The objective of this paper is to introduce a new two-parameter estimator for the regression parameter when the explanatory variables are correlated and to compare the performance of the new estimator with the OLS estimator, the ordinary ridge regression (ORR) estimator, the Liu estimator, the Kibria-Lukman (KL) estimator, the two-parameter (TP) estimator proposed by Ozkale and Kaciranlar [8], and the new two-parameter (NTP) estimator proposed by Yang and Chang [10].

Some Alternative Biased Estimators and the Proposed Estimator
The canonical form of Equation (1) is y = Zα + ε, where Z = XP, α = P′β, and P is the orthogonal matrix whose columns are the eigenvectors of S = X′X, so that Z′Z = P′X′XP = Λ = diag(λ₁, . . . , λ_p).

The new two-parameter (NTP) estimator of α was proposed by Yang and Chang [10]. The proposed new two-parameter estimator of α is obtained by minimizing (y − Zα)′(y − Zα) subject to (α + α̂)′(α + α̂) = c, where c is a constant; here, k and (1 + d) are the Lagrangian multipliers. The solution of the corresponding objective function for the KL estimator was obtained by Kibria and Lukman [15] and is given in Equation (10). The solution to (16) gives the proposed estimator as α̂_DK = [Λ + k(1 + d)I]⁻¹[Λ − k(1 + d)I]α̂, k > 0, 0 < d < 1. The proposed estimator will be called the Dawoud-Kibria (DK) estimator and is denoted by α̂_DK. Moreover, the proposed DK estimator can also be obtained by augmenting −√(k(1 + d)) α̂ = √(k(1 + d)) α + ε′ to Equation (3) and then using the OLS estimate. The MSEM of the DK estimator is given by MSEM(α̂_DK) = σ²[Λ + k(1 + d)I]⁻¹[Λ − k(1 + d)I]Λ⁻¹[Λ − k(1 + d)I][Λ + k(1 + d)I]⁻¹ + 4k²(1 + d)²[Λ + k(1 + d)I]⁻¹αα′[Λ + k(1 + d)I]⁻¹. The main differences between the KL estimator and the proposed DK estimator are as follows:
- The KL is a one-parameter estimator, while the proposed DK is a two-parameter estimator.
- The KL estimator is obtained from the objective function (y − Zα)′(y − Zα) + k[(α + α̂)′(α + α̂) − c], while the proposed DK estimator is obtained from a different objective function, (y − Zα)′(y − Zα) + k(1 + d)[(α + α̂)′(α + α̂) − c].
- The KL estimator is a function of the shrinkage parameter k alone, while the proposed DK estimator is a function of both k and d.

- Since the KL estimator has one parameter and the proposed DK estimator has two parameters, their MSEs are different.
- In the KL estimator, the shrinkage parameter k needs to be estimated, while in the proposed DK estimator, both k and d need to be estimated.
- The KL estimator is a special case of the proposed DK estimator when d = 0, so the proposed DK estimator is the more general estimator.
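In canonical form, these estimators are diagonal shrinkage rules and are easy to sketch in code. The closed forms below (KL as (Λ + kI)⁻¹(Λ − kI)α̂, and DK as the same rule with k replaced by k(1 + d)) are my reading of the definitions and should be treated as assumptions; the d = 0 special-case relationship between them, however, is stated in the text. The eigenvalues used in the example are illustrative.

```python
import numpy as np

def kl_estimator(alpha_ols, lam, k):
    """Kibria-Lukman shrinkage in canonical form (assumed closed form):
    alpha_KL = (Lambda + kI)^{-1} (Lambda - kI) alpha_hat."""
    return (lam - k) / (lam + k) * alpha_ols

def dk_estimator(alpha_ols, lam, k, d):
    """Proposed Dawoud-Kibria estimator (assumed closed form): the KL
    shrinkage with k replaced by k(1 + d); d = 0 recovers KL."""
    kk = k * (1 + d)
    return (lam - kk) / (lam + kk) * alpha_ols

# lam holds the eigenvalues of S = X'X; alpha_ols is the canonical OLS estimate.
lam = np.array([44676.206, 5965.422, 809.952, 105.419])   # illustrative values
alpha_ols = np.array([0.8, -0.3, 0.5, 0.1])               # illustrative values
print(dk_estimator(alpha_ols, lam, k=0.9, d=0.8))
```

Because d enters only through the product k(1 + d), the DK rule shrinks every component strictly more than the KL rule for the same k, which is the mechanism behind the MSE comparisons in the theorems that follow.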
The following lemmas will be used to make some theoretical comparisons among estimators in the following section.
Lemma 3 [18]. Let α̂ᵢ = Bᵢy, i = 1, 2, be two linear estimators of α. Suppose that D = Cov(α̂₁) − Cov(α̂₂) > 0, where Cov(α̂ᵢ), i = 1, 2, is the covariance matrix of α̂ᵢ, and bᵢ = Bias(α̂ᵢ) = (BᵢX − I)α, i = 1, 2. Consequently, MSEM(α̂₁) − MSEM(α̂₂) > 0 if and only if b₂′[D + b₁b₁′]⁻¹b₂ < 1.
The rest of this article is organized as follows: In Section 2, we give the theoretical comparisons among the abovementioned estimators and derive the biasing parameters of the proposed DK estimator. A simulation study is conducted in Section 3. Two numerical examples are illustrated in Section 4. Finally, some concluding remarks are given in Section 5.

Comparison among the Estimators
Proof. The difference of the dispersion matrices will be positive definite (pd) if and only if the stated condition holds; we observed that this is the case for k > 0 and 0 < d < 1.

Theorem 2. When λmax(HG⁻¹) < 1, the proposed estimator α̂_DK is superior to the estimator α̂(k) if and only if the corresponding bias condition of Lemma 3 is satisfied, where G and H are the positive definite matrices appearing in the dispersion difference.

Proof.
It is clear that for k > 0 and 0 < d < 1, G > 0 and H > 0. It is obvious that G − H > 0 if and only if λmax(HG⁻¹) < 1, where λmax(HG⁻¹) is the maximum eigenvalue of the matrix HG⁻¹. Consequently, V₁ is positive definite.
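The eigenvalue condition in Theorem 2 is easy to verify numerically. The sketch below (with illustrative matrices standing in for G and H, not quantities from the paper) checks that for positive definite G and H, the difference G − H is positive definite exactly when the largest eigenvalue of HG⁻¹ is below 1.

```python
import numpy as np

def is_pd(M, tol=1e-10):
    """Symmetric positive definiteness via eigenvalues."""
    return bool(np.all(np.linalg.eigvalsh(M) > tol))

# Illustrative diagonal matrices standing in for G and H (both pd).
G = np.diag([4.0, 3.0, 2.0])
H = np.diag([1.0, 2.0, 1.5])

# G - H > 0 exactly when the largest eigenvalue of H G^{-1} is below 1.
lam_max = np.max(np.linalg.eigvals(H @ np.linalg.inv(G)).real)
print(is_pd(G - H), lam_max < 1)
```

Replacing H with diag(5, 1, 1) pushes λmax(HG⁻¹) above 1, and G − H then fails the positive definiteness check, matching the equivalence used in the proof.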

Theorem 3. The proposed estimator α̂_DK is superior to the estimator α̂(d) if and only if the corresponding bias condition of Lemma 3 is satisfied.
Proof. Using the difference between the dispersion matrices, the difference will be pd if and only if the stated condition holds; consequently, it is positive definite.

Proof. Using the difference between the dispersion matrices, the difference will be pd if and only if the stated condition holds. Obviously, this is the case for k > 0 and 0 < d < 1; consequently, the difference is positive definite.

Theorem 5. The proposed estimator α̂_DK is superior to the estimator α̂_TP if and only if the corresponding bias condition of Lemma 3 is satisfied.
Proof. The difference of the dispersion matrices will be positive definite if and only if the stated condition holds. Clearly, this is the case for k > 0 and 0 < d < 1.

Proof. The difference of the dispersion matrices will be pd if and only if the stated condition holds; clearly, this is the case for k > 0 and 0 < d < 1.

Determination of the Parameters k and d
Since both biasing parameters k and d are unknown and need to be estimated from the observed data, we give a short discussion of their estimation in this subsection. The biasing parameter k in the ORR estimator and the biasing parameter d in the Liu estimator were derived by Hoerl and Kennard [1] and Liu [6], respectively. Different authors have proposed different estimators of k and d for different kinds of models: to mention a few, Hoerl et al. [19], Kibria [20], Kibria and Banik [21], Lukman and Ayinde [22], Mansson et al. [23], and Khalaf and Shukur [24], among others. Now, we discuss the estimation of the optimal values of k and d for the proposed DK estimator. First, we assume that d is fixed; then the optimal value of k can be obtained by minimizing the scalar mean squared error m(k, d). Differentiating m(k, d) with respect to k and setting ∂m(k, d)/∂k = 0, we obtain kᵢ = σ²λᵢ/[(1 + d)(σ² + 2λᵢαᵢ²)]. Since this optimal value of k depends on the unknown parameters σ² and αᵢ², we replace them with their corresponding unbiased estimators. Consequently, we have k̂_min(DK) = minᵢ{σ̂²λᵢ/[(1 + d)(σ̂² + 2λᵢα̂ᵢ²)]}. Furthermore, the optimal value of d can be obtained by differentiating m(k, d) with respect to d for a fixed k and setting ∂m(k, d)/∂d = 0, which gives dᵢ = (σ²λᵢ − m)/m, where m = k(σ² + 2λᵢαᵢ²). Additionally, the optimal d with the unknown parameters replaced by their estimators is d̂_min(DK) = minᵢ{(σ̂²λᵢ − m̂ᵢ)/m̂ᵢ}, with m̂ᵢ = k̂(σ̂² + 2λᵢα̂ᵢ²). The parameters k and d in α̂_DK are estimated iteratively as follows: Step 1: Obtain an initial estimate of d using d̂ = min(σ̂²/α̂ᵢ²).
Step 4: In case d̂_min(DK) is not between 0 and 1, use d̂_min(DK) = d̂. Additionally, Hoerl et al. [19] defined the biasing parameter k for the ORR estimator as k̂ = pσ̂²/Σᵢα̂ᵢ². The biasing parameter d given by Ozkale and Kaciranlar [8] is adopted for the Liu estimator. Then, Kibria and Lukman [15] found the biasing parameter estimator for the KL estimator as k̂_min = minᵢ{σ̂²/(2α̂ᵢ² + σ̂²/λᵢ)}. In addition, k̂_min of the KL estimator is also obtained by setting d = 0 in the derived biasing parameter estimator k̂_min(DK) for the proposed DK estimator.
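When a closed-form rule is inconvenient, or the iterative scheme above returns inadmissible values, one practical alternative (my sketch, not the paper's procedure) is to pick (k, d) by a grid search minimizing the estimated scalar MSE of the DK estimator. The MSE expression below follows from treating the DK estimator as the diagonal shrinkage rule with factors cᵢ = (λᵢ − k(1 + d))/(λᵢ + k(1 + d)), which is an assumption about its closed form; all numerical values are illustrative.

```python
import numpy as np

def dk_mse(k, d, lam, alpha, sigma2):
    """Scalar MSE of the DK estimator, assuming the diagonal shrinkage form
    c_i = (lam_i - k(1+d)) / (lam_i + k(1+d)):
    MSE = sum_i [ c_i^2 * sigma2 / lam_i + (c_i - 1)^2 * alpha_i^2 ]."""
    c = (lam - k * (1 + d)) / (lam + k * (1 + d))
    return float(np.sum(c**2 * sigma2 / lam + (c - 1)**2 * alpha**2))

def grid_search_kd(lam, alpha_hat, sigma2_hat,
                   ks=np.linspace(0.01, 1.0, 100),
                   ds=np.linspace(0.0, 1.0, 101)):
    """Pick (k, d) minimizing the estimated scalar MSE over a grid."""
    best = min((dk_mse(k, d, lam, alpha_hat, sigma2_hat), k, d)
               for k in ks for d in ds)
    return best[1], best[2]

# Illustrative eigenvalues and canonical OLS estimates (not from the paper).
lam = np.array([10.0, 1.0, 0.1])
alpha_hat = np.array([0.5, 0.5, 0.5])
k_opt, d_opt = grid_search_kd(lam, alpha_hat, sigma2_hat=1.0)
```

Plugging in the unbiased estimates σ̂² and α̂ᵢ² makes this a fully data-driven choice, at the cost of a small optimization over the grid.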

Simulation Study
To support the theoretical comparisons of the estimators, a Monte Carlo simulation study is conducted in this section. The section contains (i) the simulation technique and (ii) a discussion of the results.

Simulation Technique
Following Gibbons [25] and Kibria [20], we generated the explanatory variables using the following equation: x_ij = (1 − ρ²)^(1/2) z_ij + ρ z_i,p+1, i = 1, 2, . . . , n, j = 1, 2, . . . , p, where z_ij are independent standard normal pseudo-random numbers and ρ represents the correlation between any two explanatory variables, considered here to be 0.90 and 0.99. We consider p = 3 in the simulation. These variables are standardized so that X′X and X′y are in correlation form. The n observations for the dependent variable y are determined by y_i = β₁x_i1 + β₂x_i2 + β₃x_i3 + e_i, where the e_i are i.i.d. N(0, σ²). The values of β are chosen such that β′β = 1 [26]. Since we aimed to compare the performance of the DK estimator with the OLS, ORR, Liu, KL, TP, and NTP estimators, we chose k ∈ {0.3, 0.6, 0.9} between 0 and 1, as did Wichern and Churchill [27] and Kan et al. [28], where ORR gives better results, and d ∈ {0.2, 0.5, 0.8}. The simulation is replicated 1000 times for the sample sizes n = 50 and 100 and σ² = 1, 25, and 100. For each replicate, we computed the mean square error (MSE) of the estimators as MSE(α*) = (1/1000) Σⱼ (α*ⱼ − α)′(α*ⱼ − α), where α*ⱼ is the vector of estimator values at the j-th replication and α is the vector of true parameter values. The estimated MSEs of the estimators are shown in Tables 1-4.
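The simulation loop above can be sketched as follows (a compressed illustration comparing only OLS with the DK estimator; the DK closed form is assumed to be the shrinkage rule with factor (λᵢ − k(1 + d))/(λᵢ + k(1 + d)), and the function and its defaults are illustrative, not the paper's exact configuration).

```python
import numpy as np

def simulate_mse(n=50, p=3, rho=0.9, sigma=1.0, k=0.3, d=0.5, reps=1000, seed=1):
    """Monte Carlo MSE of OLS vs. the DK estimator under the
    x_ij = sqrt(1 - rho^2) z_ij + rho z_{i,p+1} design."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)            # chosen so that beta'beta = 1
    mse = {"OLS": 0.0, "DK": 0.0}
    for _ in range(reps):
        z = rng.standard_normal((n, p + 1))
        X = np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]
        Xc = X - X.mean(axis=0)
        X = Xc / np.linalg.norm(Xc, axis=0)   # correlation form: unit-length columns
        y = X @ beta + sigma * rng.standard_normal(n)
        lam, P = np.linalg.eigh(X.T @ X)      # spectral decomposition of S
        alpha = P.T @ beta                    # true canonical parameters
        a_ols = (X @ P).T @ y / lam           # canonical OLS: Z'Z = Lambda (diagonal)
        kk = k * (1 + d)
        a_dk = (lam - kk) / (lam + kk) * a_ols
        mse["OLS"] += np.sum((a_ols - alpha) ** 2)
        mse["DK"] += np.sum((a_dk - alpha) ** 2)
    return {name: v / reps for name, v in mse.items()}

print(simulate_mse(reps=200))
```

Under strong collinearity the small eigenvalues inflate the OLS variance term σ²/λᵢ, while the DK shrinkage trades a little bias for a large variance reduction, which is the pattern the tables report.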

Simulation Results and Discussion
From Tables 1-4, it appears that as σ and ρ increase, the estimated MSE values increase, while as n increases, the estimated MSE values decrease. As expected, when the multicollinearity problem exists, the OLS estimator gives the highest MSE values and performs the worst among all estimators. Additionally, the results show that the proposed DK estimator performs better than the rest of the estimators, followed by the NTP and KL estimators, most of the time for all conditions. The NTP estimator gives better MSE values when d and k are near zero. The proposed DK estimator always performs better than the KL estimator. The NTP estimator's performance lies between the KL and DK estimators most of the time, while the KL estimator's performance lies between the NTP estimator and the proposed DK estimator some of the time. Thus, the simulation results are consistent with the theoretical results. It appears from Figure 1 that as ρ increases, the MSE values of the estimators increase for σ = 10, n = 50, k = 0.9, and d = 0.8, and the proposed DK estimator has the smallest MSE value among all estimators.


Figure 2 shows that as n increases, the MSE values of the estimators decrease for σ = 10, ρ = 0.99, k = 0.9, and d = 0.8, and the proposed DK estimator has the smallest MSE value among all estimators. Figure 3 shows the behavior of σ: as σ increases, the MSE values of the estimators increase for n = 100, ρ = 0.99, k = 0.9, d = 0.8, and for other values of these factors. Figure 4 shows the behavior of the estimators for different values of d when k = 0.3: the proposed DK estimator gives the smallest MSE values when d is greater than 0.3, while the NTP estimator gives better results when d is less than 0.3 for n = 100, ρ = 0.99, σ = 10, and for other values of these factors. Figures 5 and 6 show the behavior of the estimators for different values of k when d = 0.5 and d = 0.8, respectively; in both cases, the proposed DK estimator gives the smallest MSE values among all estimators for n = 100, ρ = 0.99, σ = 10, and for other values of these factors.

Portland Cement Data
We use the Portland cement data, which was originally adopted by Woods et al. [29] to explain their theoretical results. The data were analyzed by various researchers: to mention a few, Kaciranlar et al. [30], Li and Yang [31], Lukman et al. [13], and, recently, Kibria and Lukman [15], among others.
The regression model for these data is defined as y = β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + ε. For more details about these data, see Woods et al. [29].
The variance inflation factors are VIF₁ = 38.50, VIF₂ = 254.42, VIF₃ = 46.87, and VIF₄ = 282.51. The eigenvalues of S are λ₁ = 44676.206, λ₂ = 5965.422, λ₃ = 809.952, and λ₄ = 105.419, and the condition number of S is approximately 20.58. The VIFs, the eigenvalues, and the condition number all indicate that severe multicollinearity exists. The estimated parameters and the MSE values of the estimators are presented in Table 5. It appears from Table 5 that the proposed DK estimator performs the best among the mentioned estimators, as it gives the smallest MSE value.
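Diagnostics of this kind can be reproduced from the raw design matrix. The sketch below (data loading omitted; the √(λmax/λmin) convention for the condition number is my assumption, chosen because it reproduces the magnitudes reported for both datasets in this paper) computes VIFs as the diagonal of the inverse correlation matrix.

```python
import numpy as np

def multicollinearity_diagnostics(X):
    """VIF_j = 1 / (1 - R_j^2), i.e. the diagonal of the inverse of the
    correlation matrix of the columns of X; the condition number is taken
    as sqrt(lam_max / lam_min) of S = X'X."""
    R = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(R))
    lam = np.linalg.eigvalsh(X.T @ X)       # ascending eigenvalues of S
    cond = float(np.sqrt(lam[-1] / lam[0]))
    return vif, cond
```

Applied to a matrix with two nearly identical columns, the first two VIFs blow up while the condition number rises well above the usual thresholds, mirroring the pattern reported for the Portland cement data.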

Longley Data
Longley data were originally used by Longley [32] and later by other authors (Yasin and Murat [33]; Lukman and Ayinde [22]). The regression model for these data is defined as y = β₁x₁ + β₂x₂ + . . . + β₅x₅ + β₆x₆ + ε. For more details about these data, see Longley [32]. The variance inflation factors are VIF₁ = 135.53, VIF₂ = 1788.51, VIF₃ = 33.62, VIF₄ = 3.59, VIF₅ = 399.15, and VIF₆ = 758.98. The eigenvalues of S are 2.76779 × 10¹², 7,039,139,179, 11,608,993.96, 2,504,761.021, 1738.356, and 13.309, and the condition number of S is approximately 456,070. The VIFs, the eigenvalues, and the condition number all indicate that severe multicollinearity exists. The estimated parameters and the MSE values of the estimators are presented in Table 6. It appears from Table 6 that the proposed DK estimator performs the best among the mentioned estimators, as it gives the smallest MSE value.

Summary and Concluding Remarks
In this paper, we introduced a new two-parameter estimator, namely, the Dawoud-Kibria (DK) estimator, to combat the multicollinearity problem in linear regression models. We theoretically compared the proposed DK estimator with several existing estimators: the ordinary least squares (OLS) estimator, the ordinary ridge regression (ORR) estimator, the Liu (1993) estimator, the modified ridge-type estimator of Kibria and Lukman (KL; 2020), the two-parameter (TP) estimator of Ozkale and Kaciranlar (2007), and the new two-parameter (NTP) estimator of Yang and Chang (2010), and we derived the biasing parameters k and d of the proposed DK estimator. A simulation study was conducted to compare the performance of the OLS, ORR, Liu, KL, TP, NTP, and proposed DK estimators. It is evident from the simulation results that the proposed DK estimator gives better results than the rest of the estimators under some conditions. Real-life datasets were analyzed to illustrate the findings of the paper. We hope the paper will be useful for practitioners in various fields.