The VIF and MSE in Raise Regression

The raise regression has been proposed as an alternative to ordinary least squares estimation when a model presents collinearity. In order to analyze whether the problem has been mitigated, it is necessary to develop measures to detect collinearity after the application of the raise regression. This paper extends the concept of the variance inflation factor to be applied in a raise regression. The relevance of this extension is that it can be applied to determine the raising factor which allows an optimal application of this technique. The mean square error is also calculated since the raise regression provides a biased estimator. The results are illustrated by two empirical examples where the application of the raise estimator is compared to the application of the ridge and Lasso estimators that are commonly applied to estimate models with multicollinearity as an alternative to ordinary least squares.


Introduction
In the last fifty years, different methods have been developed to avoid the instability of estimates derived from collinearity (see, for example, Kiers and Smilde [1]). Some of these methods can be grouped within a general denomination known as penalized regression.
In general terms, penalized regression starts from the linear model (with p variables and n observations), Y = Xβ + u, and obtains a regularization of the estimated parameters by minimizing the following objective function:

(Y − Xβ)^t (Y − Xβ) + P(β),

where P(β) is a penalty term that can take different forms. One of the most common penalty terms is the bridge penalty ([2,3]), given by

P(β) = λ · Σ_j |β_j|^α,

where λ is a tuning parameter. Note that the ridge ([4]) and the Lasso ([5]) regressions are obtained when α = 2 and α = 1, respectively. Penalties have also been called soft thresholding ([6,7]). These methods are applied not only for the treatment of multicollinearity but also for the selection of variables (see, for example, Dupuis and Victoria-Feser [8], Li and Yang [9], Liu et al. [10], or Uematsu and Tanaka [11]), which is a crucial issue in many areas of science when the number of variables exceeds the sample size. Zou and Hastie [12] proposed the elastic net regularization by using the penalty terms λ_1 and λ_2 that combine the Lasso and ridge regressions:

P(β) = λ_1 · Σ_j |β_j| + λ_2 · Σ_j β_j².

Thus, the Lasso regression usually selects only one of the regressors from among all those that are highly correlated, while the elastic net regression selects several of them. In the words of Tutz and Ulbricht [13], "the elastic net catches all the big fish", meaning that it selects the whole group. From a different point of view, other authors have also presented different techniques and methods well suited for dealing with collinearity problems: continuum regression ([14]), least angle regression ([15]), generalized maximum entropy ([16–18]), principal component analysis (PCA) regression ([19,20]), the principal correlation components estimator ([21]), penalized splines ([22]), partial least squares (PLS) regression ([23,24]), or the surrogate estimator focused on the solution of the normal equations presented by Jensen and Ramirez [25].
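As a brief illustration of these penalized fits (not part of the paper's examples), the following R sketch estimates ridge, Lasso, and elastic net coefficients on simulated collinear data with the glmnet package; the data-generating process and variable names are hypothetical, and glmnet's alpha mixing parameter and internal standardization differ slightly from the bridge notation used above.

```r
# Illustrative sketch (not from the paper's examples): ridge, Lasso, and elastic
# net fits on simulated collinear data using the glmnet package; the data and
# variable names are hypothetical.
library(glmnet)

set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)                # nearly collinear with x1
x3 <- rnorm(n)
X  <- cbind(x1, x2, x3)
y  <- 2 + 1.5 * x1 + 0.5 * x3 + rnorm(n)

ridge_fit <- glmnet(X, y, alpha = 0)          # alpha = 0: ridge
lasso_fit <- glmnet(X, y, alpha = 1)          # alpha = 1: Lasso
enet_fit  <- glmnet(X, y, alpha = 0.5)        # 0 < alpha < 1: elastic net

# Coefficients at a given penalty; a larger s (lambda) shrinks the estimates more.
coef(ridge_fit, s = 0.5)
coef(lasso_fit, s = 0.5)
coef(enet_fit,  s = 0.5)
```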
Focusing on collinearity, ridge regression is one of the most commonly applied methodologies, and its estimator is given by the following expression:

β̂(K) = (X^t · X + K · I)^{−1} · X^t · Y, (1)

where I is the identity matrix with adequate dimensions and K is the ridge factor (ordinary least squares (OLS) estimators are obtained when K = 0). Although ridge regression has been widely applied, it presents some problems in current practice in the presence of multicollinearity, and the estimators derived from the penalties above run into these same problems whenever n > p:
• In relation to the calculation of the variance inflation factors (VIFs), measures that quantify the degree of multicollinearity existing in a model from the coefficient of determination of the regression between the independent variables (for more details, see Section 2), García et al. [26] showed that the application of the original data when working with the ridge estimate leads to non-monotone VIF values when the VIF is considered as a function of the penalty term. Logically, the Lasso and the elastic net regressions inherit this property.
• Following Marquardt [27]: "The least squares objective function is mathematically independent of the scaling of the predictor variables (while the objective function in ridge regression is mathematically dependent on the scaling of the predictor variables)". That is to say, the penalized objective function will bring problems derived from the standardization of the variables. This fact has to be taken into account both for obtaining the estimators of the regressors and for the application of measures that detect whether the collinearity has been mitigated. Other penalized regressions (such as the Lasso and elastic net regressions) are not scale invariant either and hence yield different results depending on the predictor scaling used.
• Some of the properties of the OLS estimator that are deduced from the normal equations are not verified by the ridge estimator; among others, the estimated values of the endogenous variable are not orthogonal to the residuals. As a result, the following decomposition is verified:

Σ_i (Y_i − Ȳ)² = Σ_i (Ŷ_i(K) − Ȳ)² + Σ_i e_i(K)² + 2 · Σ_i (Ŷ_i(K) − Ȳ) · e_i(K).

When the OLS estimators are obtained (K = 0), the third term is null. However, this term is not null when K is not zero (as illustrated in the sketch after this list). Consequently, the relationship TSS(K) = ESS(K) + RSS(K) is not satisfied in ridge regression, and the definition of the coefficient of determination may not be suitable. This fact not only limits the analysis of the goodness of fit but also affects the global significance since the critical coefficient of determination is also questioned. Rodríguez et al. [28] showed that the estimators obtained from the penalties mentioned above inherit this problem of the ridge regression in relation to the goodness of fit.
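The following minimal R sketch, on hypothetical simulated data, computes the ridge estimator in Equation (1) for several values of K and evaluates the cross-product term of the decomposition above, showing that it is only null for K = 0.

```r
# Minimal sketch (hypothetical simulated data): ridge estimator of Equation (1)
# and the cross-product term that breaks TSS = ESS + RSS when K > 0.
set.seed(2)
n <- 40
X <- cbind(1, rnorm(n), rnorm(n))             # design matrix with intercept
Y <- X %*% c(1, 2, -1) + rnorm(n)

ridge_beta <- function(K) solve(t(X) %*% X + K * diag(ncol(X)), t(X) %*% Y)

for (K in c(0, 0.5, 2)) {
  b     <- ridge_beta(K)
  fit   <- X %*% b
  res   <- Y - fit
  cross <- sum((fit - mean(Y)) * res)         # third term of the decomposition
  cat("K =", K, " cross term =", round(cross, 6), "\n")
}
# The cross term is (numerically) zero only for K = 0 (OLS), so for K > 0 the
# decomposition TSS(K) = ESS(K) + RSS(K) no longer holds.
```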
In order to overcome these problems, this paper is focused on the raise regression (García et al. [29] and Salmerón et al. [30]) based on the treatment of collinearity from a geometrical point of view. It consists in separating the independent variables by using the residuals (weighted by the raising factor) of the auxiliary regression traditionally used to obtain the VIF. Salmerón et al. [30] showed that the raise regression presents better conditions than ridge regression and, more recently, García et al. [31] showed, among other questions, that the ridge regression is a particular case of the raise regression.
This paper presents the extension of the VIF to the raise regression, showing that, although García et al. [31] proved that the application of the raise regression guarantees a decrease of the VIF, it is not guaranteed that its value falls below the threshold traditionally established as troubling. Thus, it will be concluded that a single application of the raise regression does not guarantee the mitigation of the multicollinearity. Consequently, this extension complements the results presented by García et al. [31] and determines, on the one hand, whether it is necessary to apply a successive raise regression (see García et al. [31] for more details) and, on the other hand, the most adequate variable to raise and the optimal value of the raising factor in order to guarantee the mitigation of the multicollinearity.
On the other hand, the transformation of variables is common when strong collinearity exists in a linear model. The transformation to unit length (see Belsley et al. [32]) or standardization (see Marquardt [27]) is typical. Although the VIF is invariant to these transformations when it is calculated after estimation by OLS (see García et al. [26]), this invariance is not guaranteed either in the case of the raise regression or in the case of the ridge regression, as shown by García et al. [26]. The analysis of this fact is one of the goals of this paper.
Finally, since the raise estimator is biased, it is interesting to calculate its mean square error (MSE). It is studied whether the MSE of the raise regression is less than the one obtained by OLS. In this case, this study could be used to select an adequate raising factor similar to what is proposed by Hoerl et al. [33] in the case of the ridge regression. Note that estimators with MSE less than the one from OLS estimators are traditionally preferred (see, for example, Stein [34], James and Stein [35], Hoerl and Kennard [4], Ohtani [36], or Hubert et al. [37]). In addition, this measure allows us to conclude whether the raise regression is preferable, in terms of MSE, to other alternative techniques.
The structure of the paper is as follows: Section 2 briefly describes the VIF and the raise regression, and Section 3 extends the VIF to this methodology. Some desirable properties of the VIF are analyzed, and its asymptotic behavior is studied. It is also concluded that the VIF is invariant to data transformation. Section 4 calculates the MSE of the raise estimator, showing that there is a minimum value that is less than the MSE of the OLS estimator. Section 5 illustrates the contribution of this paper with two numerical examples. Finally, Section 6 summarizes the main conclusions of this paper.

Variance Inflation Factor
The following model with p independent variables and n observations is considered:

Y = Xβ + u, (2)

where Y is a vector of dimension n × 1 that contains the observations of the dependent variable, X = [1 X_2 . . . X_i . . . X_p] (with 1 being a vector of ones with dimension n × 1) is a matrix of order n × p that contains (by columns) the observations of the independent variables, β is a vector of dimension p × 1 that contains the coefficients of the independent variables, and u is a vector of dimension n × 1 that represents the random disturbance, which is assumed to be spherical (E[u] = 0 and Var(u) = σ² · I, where 0 is a vector of zeros with dimension n × 1 and I is the identity matrix with adequate dimensions, in this case n × n).
Given the model in Equation (2), the variance inflation factor (VIF) is obtained as follows:

VIF(k) = 1 / (1 − R²_k), k = 2, . . . , p,

where R²_k is the coefficient of determination of the regression of the variable X_k as a function of the rest of the independent variables of the model in Equation (2):

X_k = X_{−k} · α_{−k} + v, (4)

where X_{−k} corresponds to the matrix X after the elimination of column k (variable X_k). If the variable X_k has no linear relationship (i.e., is orthogonal) with the rest of the independent variables, the coefficient of determination will be zero (R²_k = 0) and VIF(k) = 1. As the linear relationship increases, the coefficient of determination (R²_k) and, consequently, VIF(k) will also increase. Thus, the higher the VIF associated with the variable X_k, the greater the linear relationship between this variable and the rest of the independent variables in the model in Equation (2). Collinearity is considered to be troubling for values of the VIF higher than 10. Note that the VIF ignores the role of the constant term (see, for example, Salmerón et al. [38] or Salmerón et al. [39]); consequently, this extension will be useful when the multicollinearity is essential, that is to say, when there is a linear relationship between at least two independent variables of the regression model without considering the constant term (see, for example, Marquardt and Snee [40] for the definitions of essential and nonessential multicollinearity).
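A direct way to obtain the VIFs is to run the auxiliary regression in Equation (4) for each regressor, as in the following R sketch on hypothetical simulated data.

```r
# Sketch of the VIF computed from its definition (hypothetical simulated data):
# each regressor is regressed on the remaining ones and VIF(k) = 1 / (1 - R^2_k).
set.seed(3)
n  <- 60
x2 <- rnorm(n)
x3 <- x2 + rnorm(n, sd = 0.1)                 # strongly related to x2
x4 <- rnorm(n)
dat <- data.frame(x2, x3, x4)

vif <- sapply(names(dat), function(k) {
  aux <- lm(reformulate(setdiff(names(dat), k), response = k), data = dat)
  1 / (1 - summary(aux)$r.squared)
})
round(vif, 3)   # values above 10 are traditionally considered troubling
```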

Raise Regression
Raise regression, presented by García et al. [29] and developed further by Salmerón et al. [30], uses the residuals of the model in Equation (4), e_k, to raise the variable k as X̃_k = X_k + λ · e_k with λ ≥ 0 (called the raising factor), verifying that e_k^t · X_{−k} = 0, where 0 is a vector of zeros with adequate dimensions. In that case, the raise regression consists in the OLS estimation of the following model:

Y = X̃ · β + u, (5)

where X̃ = [1 X_2 . . . X̃_k . . . X_p]. García et al. [29] showed (Theorem 3.3) that this technique does not alter the global characteristics of the initial model. That is to say, the models in Equations (2) and (5) have the same coefficient of determination and the same experimental statistic for the global significance test. Figure 1 illustrates the raise regression for two independent variables that are geometrically separated by using the residuals weighted by the raising factor λ. Thus, the selection of an adequate value for λ is essential, analogously to what occurs with the ridge factor K. A preliminary proposal on how to select the raising factor in a model with two standardized independent variables can be found in García et al. [41]. Other recently published papers introduce and highlight the various advantages of raise estimators for statistical analysis: Salmerón et al. [30] presented the raise regression for p = 3 standardized variables and showed that it presents better properties than the ridge regression and that the individual inference of the raised variable is not altered; García et al. [31] showed that all the VIFs associated with the model in Equation (5) are guaranteed to diminish but that it is not possible to quantify the decrease; García and Ramírez [42] presented the successive raise regression; and García et al. [31] showed, among other questions, that ridge regression is a particular case of raise regression.
Figure 1. Representation of the raise method.
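The construction of the raised variable and the invariance of the global characteristics can be illustrated with the following R sketch (hypothetical simulated data; the raising factor λ = 5 is arbitrary).

```r
# Sketch of the raise transformation (hypothetical simulated data): x3 is replaced
# by x3 + lambda * e3, where e3 are the residuals of regressing x3 on the remaining
# regressors; the global R^2 and F statistic of the model are unchanged.
set.seed(4)
n  <- 60
x2 <- rnorm(n)
x3 <- x2 + rnorm(n, sd = 0.1)
x4 <- rnorm(n)
y  <- 1 + x2 + 0.5 * x3 - x4 + rnorm(n)

e3      <- resid(lm(x3 ~ x2 + x4))            # residuals of the auxiliary regression
lambda  <- 5                                  # arbitrary raising factor
x3_rais <- x3 + lambda * e3                   # raised variable

ols   <- lm(y ~ x2 + x3 + x4)
raise <- lm(y ~ x2 + x3_rais + x4)

c(R2_ols = summary(ols)$r.squared, R2_raise = summary(raise)$r.squared)
c(F_ols  = unname(summary(ols)$fstatistic[1]),
  F_raise = unname(summary(raise)$fstatistic[1]))
```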
The following section presents the extension of the VIF to be applied after the estimation by raise regression since it will be interesting to know whether, after raising one independent variable, the VIF falls below 10. It will also be analyzed when a successive raise regression is recommendable (see García and Ramírez [42]).

VIF in Raise Regression
To calculate the VIF in the raise regression, two cases have to be differentiated depending on the dependent variable, X_k, of the auxiliary regression:
1. If it is the raised variable, X̃_i with i = 2, . . . , p, the coefficient of determination, R²_i(λ), of the following auxiliary regression has to be calculated:

X̃_i = X_{−i} · α_{−i}(λ) + v. (6)

2. If it is not the raised variable, X_j with j = 2, . . . , p and j ≠ i, the coefficient of determination, R²_j(λ), of the following auxiliary regression has to be calculated:

X_j = X̃_i · α_i(λ) + X_{−i,−j} · α_{−i,−j}(λ) + v, (7)

where X_{−i,−j} corresponds to the matrix X after the elimination of columns i and j (variables X_i and X_j). The same notation is used for α_{−i,−j}(λ).
Once these coefficients of determination are obtained (as indicated in the following subsections), the VIF of the raise regression will be given by the following:

VIF(k, λ) = 1 / (1 − R²_k(λ)), k = 2, . . . , p. (8)

VIF Associated with the Raised Variable
In this case, for i = 2, . . . , p, the coefficient of determination of the regression in Equation (6) is given by

R²_i(λ) = ESS_i / (ESS_i + (1 + λ)² · RSS_i) = ESS_i / (TSS_i + λ · (λ + 2) · RSS_i), (9)

since the part of X̃_i explained by X_{−i} coincides with that of X_i, while its residual is (1 + λ) · e_i, where TSS_i, ESS_i, and RSS_i are the total sum of squares, explained sum of squares, and residual sum of squares of the model in Equation (4). Note that it has been taken into account that e_i is orthogonal to X_{−i} and that TSS_i = ESS_i + RSS_i. From Equation (9), it is evident that:
1) R²_i(0) = R²_i;
2) lim_{λ→+∞} R²_i(λ) = 0; and
3) R²_i(λ) decreases as λ increases.
Finally, from properties 1) and 3), it is deduced that R²_i(λ) ≤ R²_i for all λ ≥ 0.

VIF Associated with Non-Raised Variables
In this case, for j = 2, . . . , p, with j ≠ i, the coefficient of determination of the regression in Equation (7) is given by

R²_j(λ) = 1 − RSS_j(λ) / TSS_j, (10)

where TSS_j is the total sum of squares of X_j (which does not depend on λ) and RSS_j(λ) is the residual sum of squares of the regression in Equation (7), whose expression is obtained from Appendices A and B. From Equation (10), it is evident that:
1) R²_j(0) = R²_j;
2) R²_j(λ) presents a horizontal asymptote as λ → +∞; and
3) R²_j(λ) decreases as λ increases.
Finally, from properties 1) and 3), it is deduced that R²_j(λ) ≤ R²_j for all λ.

Properties of VIF(k, λ)
From the conditions verified by the coefficients of determination in Equations (9) and (10), it is concluded that VIF(k, λ) (see the expression in Equation (8)) verifies the following:
1. The VIF associated with the raise regression is continuous in zero because the coefficients of determination of the auxiliary regressions in Equations (6) and (7) are also continuous in zero. That is to say, for λ = 0, it coincides with the VIF obtained for the model in Equation (2) when it is estimated by OLS: VIF(k, 0) = VIF(k) for k = 2, . . . , p.
2. The VIF associated with the raise regression decreases as λ increases since this is the behavior of the coefficients of determination of the auxiliary regressions in Equations (6) and (7). Consequently, VIF(k, λ) ≤ VIF(k) for all λ ≥ 0.
3. The VIF associated with the raised variable is always higher than one since VIF(i, λ) = 1 / (1 − R²_i(λ)) ≥ 1, with lim_{λ→+∞} VIF(i, λ) = 1.
4. The VIF associated with the non-raised variables has a horizontal asymptote since lim_{λ→+∞} VIF(j, λ) = 1 / (1 − R²_{ij}), where R²_{ij} is the coefficient of determination of the regression in Equation (12),

X_j = X_{−i,−j} · α_{−i,−j} + w, (12)

for j = 2, . . . , p and j ≠ i. Indeed, this asymptote corresponds to the VIF, VIF_{−i}(j), of the variable X_j in the regression Y = X_{−i} · ξ + w and, consequently, will also always be equal to or higher than one.
Thus, from properties (1) to (4), VIF(k, λ) has the very desirable properties of being continuous, monotone in the raising factor, and higher than one, as presented in García et al. [26].
In addition, property (4) can be applied to determine the variable to be raised by considering only the one with the lowest horizontal asymptote. If the asymptote is lower than 10 (the threshold traditionally established as worrying), the extension could be applied to determine the raising factor by selecting, for example, the first λ that verifies VIF(k, λ) < 10 for k = 2, . . . , p, as shown in the sketch below. If none of the p − 1 asymptotes is lower than the established threshold, it will not be enough to raise one independent variable, and a successive raise regression will be recommended (see García and Ramírez [42] and García et al. [31] for more details). Note that, if it were necessary to raise more than one variable, it is guaranteed that there will be values of the raising parameters that mitigate the multicollinearity since, in the extreme case where all the variables of the model are raised, all the VIFs associated with the raised variables tend to one.
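The following R sketch illustrates this selection strategy on hypothetical simulated data: the horizontal asymptotes are approximated by the VIFs of the model that excludes each candidate variable, and λ is then increased until every VIF of the raise regression falls below 10. The variable names, the data, and the λ grid are assumptions of the example.

```r
# Hedged sketch (hypothetical simulated data) of the selection strategy described
# above: (i) approximate the horizontal asymptote of each candidate by the VIFs of
# the model that excludes that variable, and (ii) for a chosen variable, find the
# smallest lambda in a grid for which every VIF of the raise regression is below 10.
set.seed(5)
n  <- 80
x2 <- rnorm(n)
x3 <- x2 + rnorm(n, sd = 0.1)                 # strongly related to x2
x4 <- rnorm(n)
dat <- data.frame(x2, x3, x4)

vif_of <- function(d) sapply(names(d), function(k) {
  aux <- lm(reformulate(setdiff(names(d), k), response = k), data = d)
  1 / (1 - summary(aux)$r.squared)
})

# Horizontal asymptotes when raising each variable: VIFs of the model without it.
asymptotes <- lapply(names(dat), function(i) round(vif_of(dat[setdiff(names(dat), i)]), 3))
names(asymptotes) <- paste("raising", names(dat))
asymptotes

# Scan lambda for the raised variable (x3 here, as an example) until all VIFs < 10.
e3 <- resid(lm(x3 ~ x2 + x4, data = dat))
for (lambda in seq(0, 50, by = 0.5)) {
  d_raised <- transform(dat, x3 = x3 + lambda * e3)
  if (all(vif_of(d_raised) < 10)) { cat("smallest lambda in the grid:", lambda, "\n"); break }
}
```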

Transformation of Variables
The transformation of data is very common when working with models where strong collinearity exists. For this reason, this section analyzes whether the transformation of the data affects the VIF obtained in the previous section.
Since the expression given by Equation (9) can be expressed, for i = 2, . . . , p, as a function of R²_i:

R²_i(λ) = R²_i / (R²_i + (1 + λ)² · (1 − R²_i)),

it is concluded that it is invariant to origin and scale changes and, consequently, the VIF calculated from it will also be invariant.
On the other hand, from Appendices A and B, the expression given by Equation (10) can be rewritten, for j = 2, . . . , p with j ≠ i, in terms of the sums of squares RSS^{−i}_i and RSS^{−i,−j}_i (Equation (13)), whose subscript indicates the dependent variable of the corresponding regression. In this case, by following García et al. [26], the variable X_i is transformed as X*_i = (X_i − a_i)/b_i. Taking into account that X_i is the dependent variable in the regressions that define RSS^{−i}_i and RSS^{−i,−j}_i, both sums of squares are divided by the same factor b²_i after the transformation. Then, the expression given by Equation (13) is invariant to data transformations (as long as the dependent variable of the regressions defining RSS^{−i}_i and RSS^{−i,−j}_i is transformed in the same form: for example, (a) considering that a_i is its mean and b_i its standard deviation (typification); (b) considering that a_i is its mean and b_i its standard deviation multiplied by the square root of the number of observations (standardization); or (c) considering that a_i is zero and b_i the square root of the sum of squares of the observations (unit length)) and, consequently, the VIF calculated from it will also be invariant.
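A quick numerical check of this invariance, on hypothetical simulated data, is sketched below: the VIFs of the raise regression computed with the original regressors coincide, up to rounding, with those computed after transforming the regressors to unit length.

```r
# Quick numerical check (hypothetical simulated data): the VIFs of the raise
# regression computed with the original regressors coincide, up to rounding,
# with those computed after transforming the regressors to unit length.
set.seed(6)
n  <- 60
x2 <- rnorm(n); x3 <- x2 + rnorm(n, sd = 0.1); x4 <- rnorm(n)
dat <- data.frame(x2, x3, x4)

vif_of <- function(d) sapply(names(d), function(k) {
  aux <- lm(reformulate(setdiff(names(d), k), response = k), data = d)
  1 / (1 - summary(aux)$r.squared)
})
raise <- function(d, var, lambda) {
  e <- resid(lm(reformulate(setdiff(names(d), var), response = var), data = d))
  d[[var]] <- d[[var]] + lambda * e
  d
}
unit_length <- function(d) as.data.frame(scale(d, center = FALSE,
                                               scale = sqrt(colSums(d^2))))

lambda <- 3
round(vif_of(raise(dat, "x3", lambda)), 6)
round(vif_of(raise(unit_length(dat), "x3", lambda)), 6)   # same values
```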

MSE for Raise Regression
Since the estimator β obtained from Equation (5) is biased, it is interesting to study its Mean Square Error (MSE).
Taking into account that, for k = 2, . . . , p, X̃_k = X_k + λ · e_k = (1 + λ) · X_k − λ · X_{−k} · α̂_{−k}, where α̂_{−k} is the OLS estimator of the regression in Equation (4), it is obtained that the matrix X̃ of the expression in Equation (5) can be rewritten as X̃ = X · M_λ, where M_λ coincides with the identity matrix except for its kth column, which contains the coefficients of this linear combination. Thus, we have β̃(λ) = (X̃^t · X̃)^{−1} · X̃^t · Y = M_λ^{−1} · β̂, and then, the estimator of β obtained from Equation (5) is biased unless M_λ = I, which only occurs when λ = 0, that is to say, when the raise regression coincides with OLS. Moreover, var(β̃(λ)) = σ² · (X̃^t · X̃)^{−1} = σ² · M_λ^{−1} · (X^t · X)^{−1} · (M_λ^{−1})^t, and its total variance is σ² · tr((X̃^t · X̃)^{−1}), where tr denotes the trace of a matrix.
In that case, the MSE for raise regression is

MSE(β̃(λ)) = σ² · tr((X̃^t · X̃)^{−1}) + β^t · (M_λ^{−1} − I)^t · (M_λ^{−1} − I) · β.

We can estimate the MSE from the estimated values of σ² and β_k obtained from the model in Equation (2). On the other hand, once the estimations are obtained and taking into account Appendix C, the estimated MSE reaches its minimum for a value λ_min of the raising factor; that is to say, if the goal is exclusively to minimize the MSE (as in the work presented by Hoerl et al. [33]), λ_min should be selected as the raising factor.
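The plug-in computation just described can be sketched in R as follows (hypothetical simulated data): MSE(λ) = σ̂² · tr((X̃^t X̃)^{−1}) + ‖β̃(λ) − β̂‖² is evaluated over a grid of values of λ and the minimizing value is reported. This is a numerical illustration of the idea, not the closed-form expression of Appendix C.

```r
# Hedged numerical sketch (hypothetical simulated data): plug-in MSE of the raise
# estimator, MSE(lambda) = sigma2_hat * tr((Xr' Xr)^{-1}) + ||b(lambda) - b_ols||^2,
# evaluated over a grid of lambda values.
set.seed(7)
n  <- 60
x2 <- rnorm(n)
x3 <- x2 + rnorm(n, sd = 0.1)                 # nearly collinear with x2
x4 <- rnorm(n)
y  <- 1 + x2 + 0.5 * x3 - x4 + rnorm(n)
X  <- cbind(Intercept = 1, x2, x3, x4)

ols      <- lm(y ~ x2 + x3 + x4)
beta_ols <- coef(ols)
sigma2   <- summary(ols)$sigma^2
e3       <- resid(lm(x3 ~ x2 + x4))           # residuals used to raise x3

mse_raise <- function(lambda) {
  Xr <- X
  Xr[, "x3"] <- x3 + lambda * e3              # raised design matrix
  XtXi <- solve(t(Xr) %*% Xr)
  b    <- XtXi %*% t(Xr) %*% y                # raise estimator
  sigma2 * sum(diag(XtXi)) + sum((b - beta_ols)^2)  # estimated variance + squared bias
}

grid <- seq(0, 20, by = 0.01)
mses <- sapply(grid, mse_raise)
c(lambda_min = grid[which.min(mses)], mse_min = min(mses), mse_ols = mse_raise(0))
```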

Numerical Examples
To illustrate the results of the previous sections, two different sets of data will be used that correspond to the two situations shown in the graphs of Figures A1 and A2. The second example also compares the results obtained by the raise regression to those obtained by the application of the ridge and Lasso regressions.
The data set includes different financial variables from 15 Spanish companies for the year 2016 (consolidated accounts and results between €800,000 and €9,000,000) obtained from the Sistema de Análisis de Balances Ibéricos (SABI) database. The relationship is studied between the number of employees, E, and the fixed assets (€), FA; operating income (€), OI; and sales (€), S. The model is expressed as

E = β_1 + β_2 · FA + β_3 · OI + β_4 · S + u. (15)

Table 1 displays the results of the estimation by OLS of the model in Equation (15), which presents troubling essential collinearity. In contrast, due to the fact that the coefficients of variation of the independent variables (1.015027, 0.7469496, and 0.7452014) are higher than 0.1002506, the threshold established as troubling by Salmerón et al. [39], it is possible to conclude that the nonessential multicollinearity is not troubling. Thus, the extension of the VIF seems appropriate to check whether the application of the raise regression has mitigated the multicollinearity.
Remark 1. λ(1) and λ(2) will denote the raising factors of the first and second raising, respectively.
Table 1. Estimations of the models in Equations (15)–(18): the standard deviation is inside the parentheses, R² is the coefficient of determination, F_{3,11} is the experimental value of the joint significance test, and σ̂² is the estimated variance of the random disturbance.

A possible solution could be to apply the raise regression to try to mitigate the collinearity. To decide which variable should be raised, the thresholds (horizontal asymptotes) of the VIFs associated with the raise regression are calculated, with the goal of raising the variable that presents the smallest horizontal asymptotes. In addition to raising the variable that presents the lowest VIF, it would be interesting to obtain a lower mean squared error (MSE) after raising. For this, λ(1)_min is calculated for each case. Results are shown in Table 2. Note that the variable to be raised should be the second or the third since their asymptotes are lower than 10, although in both cases λ(1)_min is lower than 1, so, for the values of λ(1) that mitigate the collinearity, it is not guaranteed that the MSE of the raise regression will be less than the one obtained from the estimation by OLS of the model in Equation (15). For this reason, this table also shows the values of λ(1) that make the MSE of the raise regression coincide with the MSE of the OLS regression, λ(1)_mse, and the minimum value of λ(1) that leads to values of VIF less than 10, λ(1)_vif. Figure 2 displays the VIFs associated with the raise regression for 0 ≤ λ(1) ≤ 900 after raising the second variable. It is observed that the VIFs are always higher than their corresponding horizontal asymptotes.
The model after raising the second variable will be given by

E = β_1 + β_2 · FA + β_3 · ÕI + β_4 · S + u, (16)

where ÕI = OI + λ(1) · e_OI, with e_OI the residuals of the regression

OI = α_1 + α_2 · FA + α_3 · S + v.

Remark 2. The coefficient of variation of ÕI for λ(1) = 24.5 is equal to 0.7922063; that is to say, it was slightly increased.
As can be observed from Table 3, the collinearity in Equation (16) is not mitigated by considering λ(1) equal to λ(1)_min or λ(1)_mse. For this reason, Table 1 only shows the values of the model in Equation (16) for the value of λ(1) that leads to VIFs lower than 10.
Table 3. VIFs of the regression in Equation (16) for λ(1) equal to λ(1)_min, λ(1)_mse, and λ(1)_vif.

Transformation of Variables
After the first raising, it is interesting to verify that the VIF associated with the raise regression is invariant to data transformations. With this goal, the second variable has been raised, obtaining VIF(FA, λ(1)), VIF(ÕI, λ(1)), and VIF(S, λ(1)) for λ(1) ∈ {0, 0.5, 1, 1.5, 2, . . . , 9.5, 10}, considering original, unit-length, and standardized data. Next, the three possible pairwise differences and the average of the VIFs associated with each variable are obtained. Table 4 displays the results, from which it is possible to conclude that the differences are almost null and, consequently, that the VIF associated with the raise regression is invariant to the most common data transformations.
Table 4. Effect of data transformations on the VIF associated with the raise regression.

Second Raising
After the first raising, we can use the results obtained for the value of λ that makes all VIFs less than 10, or consider the results obtained for λ_min or λ_mse and continue the procedure with a second raising. Following the second option, we start from the value λ(1) = λ(1)_min = 0.42 obtained after the first raising. From Table 5, the third variable is selected to be raised. Table 6 shows the VIFs associated with the following model for λ(2)_min, λ(2)_mse, and λ(2)_vif:

E = β_1 + β_2 · FA + β_3 · ÕI + β_4 · S̃ + u, (17)

where S̃ = S + λ(2) · e_S, with e_S the residuals of the regression of S on the rest of the independent variables (FA and ÕI).
Remark 3. The coefficient of variation of ÕI for λ(1) = 0.42 is equal to 0.7470222, and the coefficient of variation of S̃ for λ(2) = 17.5 is equal to 0.7473472. In both cases, they were slightly increased.
Note that it is only possible to state that the collinearity has been mitigated when λ(2) = λ(2)_vif = 17.5. The results of this estimation are displayed in Table 1.
Table 5. Horizontal asymptotes for the VIFs after raising each variable in the second raising, for λ(2)_min, λ(2)_mse, and λ(2)_vif.

Remark 4. The coefficient of variation of ÕI for λ(1) = 1.43 is equal to 0.7473033, and the coefficient of variation of S̃ for λ(2) = 10 is equal to 0.7651473. In both cases, they were slightly increased.

Comparing the models in Table 1, the following can be highlighted:
2. In the model in Equation (17) (in which the second variable is raised considering the value of λ that minimizes the MSE, λ(1) = 0.42, and, after that, the third variable is raised considering the smallest λ that makes all the VIFs less than 10, λ(2) = 17.5), there is no difference in the individual significance of the coefficients.
3. In the model in Equation (18) (in which the second variable is raised considering the value of λ that makes the MSE of the raise regression coincide with that of OLS, λ(1) = 1.43, and, next, the third variable is raised considering the smallest λ that makes all the VIFs less than 10, λ(2) = 10), there is no difference in the individual significance of the coefficients.
4. Although the coefficient of the variable OI is not significantly different from zero in any case, the unexpected negative sign obtained in the model in Equation (15) is corrected in the models in Equations (17) and (18).
5. In the models with one or two raisings, all the global characteristics coincide with those of the model in Equation (15). Furthermore, there is a relevant decrease in the estimated standard deviations of the second and third variables.
6. In the models with one or two raisings, the MSE increases, with the model in Equation (16) being the one that presents the smallest MSE among the biased models.
Thus, in conclusion, the model in Equation (16) is selected as it presents the smallest MSE and there is an improvement in the individual significance of the variables.

Example 2: h > 1
This example uses the following model, previously applied by Klein and Goldberger [43], about consumption and salaries in the United States from 1936 to 1952 (1942 to 1944 were war years, and data are not available):

C = β_1 + β_2 · WI + β_3 · NWI + β_4 · FI + u, (19)

where C is consumption, WI is wage income, NWI is non-wage, non-farm income, and FI is farm income. Its estimation by OLS is shown in Table 9. However, this estimation is questionable since no estimated coefficient is significantly different from zero while the model is globally significant (at the 5% significance level), and the VIFs associated with each variable (12.296, 9. . . ) indicate troubling collinearity. In addition, since the value 0.03713592 is lower than the threshold recommended by García et al. [44] (1.013 · 0.1 + 0.00008626 · n − 0.01384 · p = 0.04714764, with n = 14 and p = 4), the conclusion that the near multicollinearity existing in this model is troubling is maintained. Once again, the values of the coefficients of variation (0.2761369, 0.2597991, and 0.2976122) indicate that the nonessential multicollinearity is not troubling (see Salmerón et al. [39]). Thus, the extension of the VIF seems appropriate to check whether the application of the raise regression has mitigated the near multicollinearity.
Next, the estimation of the model by raise regression is presented, and the results are compared to the estimations obtained by ridge and Lasso regression.

Raise Regression
When calculating the thresholds that would be obtained for the VIFs by raising each variable (see Table 10), it is observed that, in all cases, they are less than 10. However, when calculating λ_min in each case, a value higher than one is only obtained when raising the third variable. Figure 3 displays the MSE for λ ∈ [0, 37). Note that MSE(β̃(λ)) is always less than the one obtained by OLS, 49.434, and presents a horizontal asymptote.
Figure 3. MSE of the model in Equation (19) after raising the third variable.
Table 9. Estimation of the original and raised models: the standard deviation is inside the parentheses, R² is the coefficient of determination, F_{3,10} is the experimental value of the joint significance test, and σ̂² is the estimated variance of the random disturbance.
The following model is obtained by raising the third variable:

C = β_1 + β_2 · WI + β_3 · NWI + β_4 · F̃I + u, (20)

where F̃I = FI + λ · e_FI, with e_FI the residuals of the regression

FI = α_1 + α_2 · WI + α_3 · NWI + v.

Remark 6. The coefficient of variation of F̃I for λ(1) = 6.895 is 1.383309. Thus, the application of the raise regression has mitigated the nonessential multicollinearity in this variable.
Table 9 shows the results for the model in Equation (20) with λ = 6.895. In this case, the MSE is the lowest one among all possible values of λ and is lower than the one obtained by OLS for the model in Equation (19). Furthermore, in this case, the collinearity is no longer strong since all the VIFs are lower than 10 (9.098, 9.049, and 1.031, respectively). However, the individual significance of the variables was not improved.
With the purpose of improving this situation, another variable is raised. If the first variable is selected to be raised, the following model is obtained:

C = β_1 + β_2 · W̃I + β_3 · NWI + β_4 · FI + u, (21)

where W̃I = WI + λ · e_WI, with e_WI the residuals of the regression

WI = α_1 + α_2 · NWI + α_3 · FI + v.

Remark 7. The coefficient of variation of W̃I for λ(1) = 0.673 is 0.2956465. Thus, it is noted that the raise regression has slightly mitigated the nonessential multicollinearity of this variable.
Table 9 shows the results for the model in Equation (21) with λ = 0.673. In this case, the MSE is lower than the one obtained by OLS for the model in Equation (19). Furthermore, in this case, the collinearity is not strong since all the VIFs are lower than 10 (5.036024, 4.705204, and 2.470980, respectively). Note that, when raising this variable, the values of the VIFs are lower than when raising the third variable, but the MSE is higher. However, this model is selected as preferable since the individual significance is better in this model and the MSE is lower than the one obtained by OLS.

Ridge Regression
This subsection presents the estimation of the model in Equation (19) by ridge regression (see Hoerl and Kennard [4] or Marquardt [45]). The first step is the selection of the appropriate value of K.
The following suggestions are addressed:
• Hoerl et al. [33] proposed the value K_HKB = p · σ̂² / (β̂^t · β̂) since, with probability higher than 50%, it leads to an MSE lower than the one from OLS.
• García et al. [26] proposed the value of K, denoted as K_VIF, that leads to values of VIF lower than 10 (the threshold traditionally established as troubling).
• García et al. [44] proposed the values K_exp, K_linear, and K_sq.
The following values are obtained: K_HKB = 0.417083, K_VIF = 0.013, K_exp = 0.04020704, K_linear = 0.04022313, and K_sq = 0.02663591. Tables 11 and 12 show the estimations obtained from the ridge estimators (expression (1)) and the individual significance intervals obtained by bootstrap considering percentiles 5 and 95 for 5000 repetitions (the results for K_linear are not considered as they are very similar to those obtained with K_exp). The goodness of fit, following the results shown by Rodríguez et al. [28], and the MSE are also calculated.
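As a hedged illustration (hypothetical data, not the Klein–Goldberger data set), the following R sketch computes K_HKB = p · σ̂² / (β̂^t β̂) from an OLS fit, as in the bullet above, and the corresponding ridge estimates; implementations differ on whether the regressors are standardized and the intercept is excluded, and this sketch simply applies the formula as written to the slope coefficients.

```r
# Hedged sketch (hypothetical data): Hoerl-Kennard-Baldwin ridge factor
# K_HKB = p * sigma2_hat / (beta_hat' beta_hat) and the corresponding ridge fit.
set.seed(8)
n  <- 40
x2 <- rnorm(n); x3 <- x2 + rnorm(n, sd = 0.2); x4 <- rnorm(n)
y  <- 1 + x2 + 0.5 * x3 - x4 + rnorm(n)

ols   <- lm(y ~ x2 + x3 + x4)
b     <- coef(ols)[-1]                        # slope coefficients
s2    <- summary(ols)$sigma^2
K_HKB <- length(b) * s2 / sum(b^2)

X <- cbind(Intercept = 1, x2, x3, x4)
ridge_coef <- drop(solve(t(X) %*% X + K_HKB * diag(ncol(X)), t(X) %*% y))
round(cbind(OLS = coef(ols), ridge = ridge_coef), 3)
```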
Note that only the constant term can be considered significantly different from zero and that, curiously, the value of K proposed by Hoerl et al. [33] leads to a value of the MSE higher than the one from OLS, while the values proposed by García et al. [26] and García et al. [44] lead to values of the MSE lower than the one obtained by OLS. All cases lead to values of VIF lower than 10; see García et al. [26]. In any case, the lack of individual significance justifies the selection of the raise regression as preferable in comparison to the models obtained by ridge regression.
Table 11. Estimation of the ridge models for K_HKB = 0.417083 and K_VIF = 0.013: the confidence interval at the 10% significance level, obtained by bootstrap, is inside the parentheses, and R² is the coefficient of determination obtained from Rodríguez et al. [28].

Lasso Regression
The Lasso regression (see Tibshirani [5]) is a method initially designed to select variables by constraining coefficients to zero, being especially useful in models with a high number of independent variables. However, this estimation methodology has also been widely applied in situations where the model presents worrying near multicollinearity. Table 13 shows the results obtained by the application of the Lasso regression to the model in Equation (19) by using the package glmnet of the programming environment R (R Core Team [46]). Note that these estimations are obtained for the optimal value λ = 0.1258925 obtained after k-fold cross-validation. The inference obtained by the bootstrap methodology (with 5000 repetitions) allows us to conclude that, in at least 5% of the cases, the coefficient of NWI is constrained to zero. Thus, this variable should be eliminated from the model.
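A sketch of this Lasso workflow on hypothetical data is shown below: the penalty is chosen by k-fold cross-validation with cv.glmnet, and a simple bootstrap records how often each coefficient is shrunk exactly to zero. The data, the number of resamples, and the variable names are assumptions of the example.

```r
# Sketch of the Lasso workflow described above on hypothetical data: penalty chosen
# by cross-validation and bootstrap frequency of coefficients shrunk to zero.
library(glmnet)

set.seed(9)
n  <- 40
x2 <- rnorm(n); x3 <- x2 + rnorm(n, sd = 0.2); x4 <- rnorm(n)
X  <- cbind(x2, x3, x4)
y  <- 1 + x2 + 0.5 * x3 - x4 + rnorm(n)

cv <- cv.glmnet(X, y, alpha = 1)              # k-fold cross-validation for the penalty
coef(cv, s = "lambda.min")                    # Lasso estimates at the selected penalty

# Bootstrap proportion of resamples in which each coefficient is set exactly to zero.
B <- 500
zeros <- replicate(B, {
  idx <- sample(n, replace = TRUE)
  fit <- glmnet(X[idx, ], y[idx], alpha = 1, lambda = cv$lambda.min)
  as.numeric(as.matrix(coef(fit))) == 0
})
rowMeans(zeros)                               # order: intercept, x2, x3, x4
```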
However, we consider that this situation should be avoided and, as an alternative to the elimination of the variable, that is, as an alternative to the following model:

C = τ_1 + τ_2 · WI + τ_3 · FI + v, (22)

the estimation by raise or ridge regression is proposed.
It could also be appropriate to apply the residualization method (see, for example, York [47], Salmerón et al. [48], and García et al. [44]), which consists in the estimation of the following model:

C = π_1 + π_2 · WI + π_3 · FI + π_4 · res_NWI + u, (23)

where, for example, res_NWI represents the residuals of the regression of NWI as a function of WI, which will be interpreted as the part of NWI not related to WI. In this case (see García et al. [44]), it is verified that π_i = τ_i for i = 1, 2, 3. That is to say, the model in Equation (23) estimates the same relationship between WI and FI with C as the model in Equation (22), with the benefit that the variable NWI is not eliminated since a part of it is still considered.

Conclusions
The Variance Inflation Factor (VIF) is, together with the Condition Number (CN), one of the most applied measures to diagnose collinearity. Once collinearity is detected, different methodologies can be applied (for example, the raise regression), but it is then required to check whether the methodology has effectively mitigated the collinearity. This paper extends the concept of the VIF to be applied after the raise regression and presents an expression of the VIF that verifies the following desirable properties (see García et al. [26]):
1. It is continuous in zero. That is to say, when the raising factor (λ) is zero, the VIF obtained in the raise regression coincides with the one obtained by OLS;
2. It is decreasing as a function of the raising factor (λ). That is to say, the degree of collinearity diminishes as λ increases; and
3. It is always equal to or higher than 1.
The paper also shows that the VIF in the raise regression is invariant to the scale transformations that are very common when working with models with collinearity. Thus, it yields identical results regardless of whether the calculations are based on untransformed or standardized predictors. In contrast, the VIFs obtained from other penalized regressions (ridge regression, Lasso, and elastic net) are not scale invariant and hence yield different results depending on the predictor scaling used.
Another contribution of this paper is the analysis of the asymptotic behavior of the VIF associated with the raised variable (verifying that its limit is equal to 1) and with the rest of the variables (which present a horizontal asymptote). This analysis allows us to conclude the following:

• It is possible to know a priori how far each of the VIFs can decrease simply by calculating their horizontal asymptotes. This can be used as a criterion to select the variable to be raised, choosing the one with the lowest horizontal asymptote.

• If there is an asymptote under the threshold established as worrying, the extension of the VIF can be applied to select the raising factor by considering the value of λ that verifies VIF(k, λ) < 10 for k = 2, . . . , p.

• It is possible that the collinearity is not mitigated for any value of λ. This can happen when at least one horizontal asymptote is greater than the threshold. In that case, a second variable has to be raised. García and Ramírez [42] and García et al. [31] show the successive raising procedure.
On the other hand, since the raise estimator is biased, the paper analyzes its mean square error (MSE), showing that there is a value of λ that minimizes it and that this minimum is lower than the MSE obtained by OLS. However, it is not guaranteed that the VIFs for this value of λ present values less than the established thresholds. The results are illustrated with two numerical examples, and in the second one, the results obtained by OLS are compared to the results obtained with the raise, ridge, and Lasso regressions, which are widely applied to estimate models with worrying multicollinearity. It is shown that the raise regression can compete with and even overcome these methodologies.
Finally, we propose the following questions as future lines of research:
• The examples showed that the coefficients of variation increase after raising the variables. This fact is associated with an increase in the variability of the variable and, consequently, with a decrease in the near nonessential multicollinearity. Although a deeper analysis is required, it seems that the raise regression mitigates this kind of near multicollinearity.

• The value of the ridge factor traditionally applied, K_HKB, leads to estimators with smaller MSEs than the OLS estimators with probability greater than 0.5. In contrast, the value of the raising factor λ_min always leads to estimators with smaller MSEs than the OLS estimators. Thus, it is deduced that the ridge regression provides estimators with MSEs higher than the MSEs of the OLS estimators with probability lower than 0.5. These questions seem to indicate that, in terms of MSE, the raise regression can present better behaviour than the ridge regression. However, the confirmation of this judgment will require a more complete analysis, including other aspects such as interpretability and inference.