Detection of Near-Multicollinearity through Centered and Noncentered Regression

Abstract: This paper analyzes the diagnosis of near-multicollinearity in a multiple linear regression from auxiliary centered (with intercept) and noncentered (without intercept) regressions. From these auxiliary regressions, the centered and noncentered variance inflation factors (VIFs) are calculated. An expression that relates both of them is also presented. In addition, this paper analyzes why the VIF is not able to detect the relation between the intercept and the rest of the independent variables of an econometric model. At the same time, an analysis is also provided to determine how the auxiliary regression applied to calculate the VIF can be useful to detect this kind of multicollinearity.


Introduction
Consider the following multiple linear model with $n$ observations and $k$ regressors:

$$y = X \beta + u, \tag{1}$$

where $y$ is a vector with the observations of the dependent variable, $X$ is a matrix containing the observations of the regressors and $u$ is a vector representing a random disturbance (that is assumed to be spherical). Generally, the first column of matrix $X$ is composed of ones to denote that the model contains an intercept. Thus, $X = [1 \; X_2 \dots X_k]$ where $1_{n \times 1} = (1 \; 1 \dots 1)^t$. This model is considered to be centered. When this model presents worrying near-multicollinearity (hereinafter, multicollinearity), that is, when the linear relation between the regressors affects the numerical and/or statistical analysis of the model, the usual approach is to transform the regressors (see, for example, Belsley [1], Marquardt [2] or, more recently, Velilla [3]). Because these transformations (centering, typification or standardization) eliminate the intercept from the model, the transformed models are considered to be noncentered. Note that even after transforming the data, it is possible to recover the original (centered) model from the estimations of the transformed (noncentered) model. However, in this paper, we refer to the centered and noncentered model depending on whether the intercept is initially included or not. Thus, it is considered that the model is centered if $X = [1 \; X_2 \dots X_k]$ and noncentered if $X = [X_1 \; X_2 \dots X_k]$, given that $X_j \neq 1$ for $j = 1, \dots, k$.
From the intercept, it is also possible to distinguish between essential and nonessential multicollinearity. This paper analyzes why the centered VIF is unable to detect the nonessential multicollinearity and, for this, the centered coefficient of determination of the centered auxiliary regression used to calculate the centered VIF is analyzed. This analysis will be applied to propose a methodology to detect the nonessential multicollinearity from the centered auxiliary regression. The structure of the paper is as follows: Section 2 presents the detection of multicollinearity in noncentered models from the noncentered auxiliary regressions, Section 3 analyzes the effects of high values of the noncentered VIF on the statistical analysis of the model and Section 4 presents the detection of multicollinearity in centered models from the centered auxiliary regressions. Section 5 illustrates the contribution of the paper with two empirical applications. Finally, Section 6 summarizes the main conclusions.

Auxiliary Noncentered Regressions
This section presents the calculation of the uncentered VIF, VIFnc, considering that the auxiliary regression is noncentered, that is, it has no intercept. First, the method used to calculate the coefficient of determination for noncentered models is presented.

Noncentered Coefficient of Determination
Given the linear regression of Equation (1) with or without the intercept, the following decomposition for the sum of squares is verified:

$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \hat{y}_i^2 + \sum_{i=1}^{n} e_i^2, \tag{2}$$

where $\hat{y}$ represents the estimation of the dependent variable of the model that is fit by employing ordinary least squares (OLS) and $e = y - \hat{y}$ are the residuals obtained from that fit. In this case, the coefficient of determination is obtained by the following expression:

$$R^2_{nc} = \frac{\sum_{i=1}^{n} \hat{y}_i^2}{\sum_{i=1}^{n} y_i^2} = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} y_i^2}. \tag{3}$$

Comparing the decomposition of the sums of squares given by (2) with the traditionally applied method to calculate the coefficient of determination in models with the intercept, as in model (1):

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} e_i^2, \tag{4}$$

it is noted that both coincide if the dependent variable has zero mean. If the mean is different from zero, both decompositions present the same residual sum of squares but different explained and total sums of squares. Thus, they lead to the same value for the coefficient of determination (and, as a consequence, for the VIF) only if the dependent variable presents a mean equal to zero.
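The comparison between the two decompositions can be checked numerically. The following is a minimal sketch on simulated data (the data, the seed and the helper names `r2_noncentered`, `r2_centered` and `ols_fit` are assumptions for illustration, not part of the paper): it computes both coefficients of determination for a dependent variable with nonzero mean and again after removing that mean.

```python
import numpy as np

def r2_noncentered(y, y_hat):
    """R^2 based on raw sums of squares: 1 - sum(e^2) / sum(y^2)."""
    e = y - y_hat
    return 1.0 - float(e @ e) / float(y @ y)

def r2_centered(y, y_hat):
    """Usual R^2 for models with intercept: 1 - sum(e^2) / sum((y - mean)^2)."""
    e = y - y_hat
    d = y - y.mean()
    return 1.0 - float(e @ e) / float(d @ d)

def ols_fit(X, y):
    """Fitted values of the OLS regression of y on the columns of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return X @ beta

# Simulated data (an assumption for illustration, not data from the paper).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 10.0 + 2.0 * x + rng.normal(size=n)  # dependent variable with nonzero mean

y_hat = ols_fit(X, y)
r2_nc, r2_c = r2_noncentered(y, y_hat), r2_centered(y, y_hat)

# Same regression after removing the mean of the dependent variable.
y0 = y - y.mean()
y0_hat = ols_fit(X, y0)
r2_nc0, r2_c0 = r2_noncentered(y0, y0_hat), r2_centered(y0, y0_hat)

print(f"nonzero mean: R2nc = {r2_nc:.4f}, R2 = {r2_c:.4f}")
print(f"zero mean:    R2nc = {r2_nc0:.4f}, R2 = {r2_c0:.4f}")
```

With a nonzero mean the noncentered coefficient is the larger of the two, because the total sum of squares in its denominator also contains the term $n \bar{y}^2$; once the mean is removed, both values agree.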

Noncentered Variance Inflation Factor
The VIFnc is obtained from the expression:

$$\mathrm{VIFnc}(j) = \frac{1}{1 - R^2_{nc}(j)}, \tag{5}$$

where $R^2_{nc}(j)$ is the coefficient of determination, calculated by following (3), of the noncentered auxiliary regression:

$$X_j = X_{-j} \delta + w, \tag{6}$$

where $X_{-j}$ is equal to the matrix $X$ after eliminating the variable $X_j$, for $j = 1, \dots, k$, and it does not have a vector of ones representing the intercept. In this case:

$$\sum_{i=1}^{n} X_{ij}^2 = X_j^t X_j, \qquad \sum_{i=1}^{n} e_i^2 = X_j^t X_j - X_j^t X_{-j} \left( X_{-j}^t X_{-j} \right)^{-1} X_{-j}^t X_j = RSS_j.$$

Then:

$$\mathrm{VIFnc}(j) = \frac{X_j^t X_j}{X_j^t X_j - X_j^t X_{-j} \left( X_{-j}^t X_{-j} \right)^{-1} X_{-j}^t X_j} = \frac{X_j^t X_j}{RSS_j}. \tag{7}$$

Thus, the VIFnc coincides with the expression given by Stewart [9] for the VIF and is denoted as $k_j^2$, that is, $\mathrm{VIFnc}(j) = k_j^2$. However, recently, Salmerón Gómez et al. [12] showed that the index presented by Stewart has been misleadingly identified as the VIF, verifying the following relation between both measures:

$$k_j^2 = \mathrm{VIF}(j) + n \cdot \frac{\bar{X}_j^2}{RSS_j}, \tag{8}$$

where $\bar{X}_j$ is the mean of the $j$-th variable of $X$. This expression is also shown by Salmerón Gómez et al. [10], where it is used to quantify the proportion of essential and nonessential multicollinearity existing in a concrete independent variable. Note that the expression:

$$\mathrm{VIFnc}(j) = \mathrm{VIF}(j) + n \cdot \frac{\bar{X}_j^2}{RSS_j}, \tag{9}$$

is obtained by Chennamaneni et al. [17] (expression (6), page 174), although it is also limited to the particular case of the moderated regression $Y = \alpha_0 \cdot 1 + \alpha_1 \cdot U + \alpha_2 \cdot V + \alpha_3 \cdot U \times V + \nu$ where $U$ and $V$ are ratio-scaled explanatory variables in $n$-dimensional data vectors. Indeed, these authors proposed a new measure to detect multicollinearity in moderated regression models that is derived from the noncentered coefficient of determination. However, this use of the noncentered coefficient of determination lacks the statistical contextualization provided by this paper. Finally, from expression (9), it is shown that the VIFnc and the VIF only coincide if the associated variable has zero mean, analogously to what happens in the decomposition of the sum of squares.
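The relation between Stewart's index and the VIF can be derived in two lines. The sketch below assumes that $RSS_j$ denotes the residual sum of squares of the auxiliary regression of $X_j$ on the remaining columns, intercept included, so that $\mathrm{VIF}(j) = \sum_i (X_{ij} - \bar{X}_j)^2 / RSS_j$ and $k_j^2 = X_j^t X_j / RSS_j$:

```latex
\begin{aligned}
X_j^t X_j &= \sum_{i=1}^{n} \left( X_{ij} - \bar{X}_j \right)^2 + n \bar{X}_j^2, \\
k_j^2 = \frac{X_j^t X_j}{RSS_j}
  &= \frac{\sum_{i=1}^{n} \left( X_{ij} - \bar{X}_j \right)^2}{RSS_j} + \frac{n \bar{X}_j^2}{RSS_j}
   = \mathrm{VIF}(j) + n \cdot \frac{\bar{X}_j^2}{RSS_j}.
\end{aligned}
```

The second term vanishes exactly when $\bar{X}_j = 0$, which is why both measures coincide only for a variable with zero mean.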
Note that this expression also clarifies why Stewart's collinearity indices diminish when the variables are centered, which the author attributed to errors in regression variables: "This phenomenon is a consequence of the fact that our definition of collinearity index compels us to work with relative errors."
Example 1. Considering k = 4 in model (1), we use the noncentered coefficient of determination, R 2 nc , to calculate the noncentered variance inflation factor, VIFnc. For this, we consider the values displayed in Table 1.
Note that variables y, X 2 and X 3 were originally used by Belsley [1] and we have added a new variable, X 4 , that has been randomly generated (from a normal distribution with a mean equal to 4 and a variance equal to 16) to obtain a variable that is linearly independent with respect to the rest. In these data, the existence of nonessential multicollinearity is suspected. This is confirmed by the small values of the coefficient of variation (CV) in two of the independent variables and by the following conclusions obtained from the values of the condition indices and the proportions of the variance (see, for example, Belsley et al. [18] and Belsley [16] for more details) shown in Table 2:
• Variables X 2 and X 3 present a CV lower than 0.06674082 and 0.1002506, the values presented by Salmerón Gómez et al. [10] as thresholds to indicate that a variable may be related to the constant and that the model will present strong and moderate nonessential multicollinearity, respectively.
• The second index is associated with a high proportion of the variance of the variable X 4 , although it is not worrisome since it does not present a high value.
• The third index presents a value higher than the established thresholds (20 for moderate multicollinearity and 30 for strong multicollinearity), and it is also associated with high proportions in the variables X 2 and X 3 .
• The last index, identified as the condition number, is clearly related to the intercept, and at the same time, it includes the relation between X 2 and X 3 as previously commented.
• Finally, the condition number, 1614.829, is higher than the threshold traditionally established as indicative of worrisome multicollinearity.
Now, other models are proposed apart from the initial model for k = 4 (Mod0):

$$\begin{aligned}
\text{Mod1:} \quad & y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u, \\
\text{Mod2:} \quad & y = \beta_1 + \beta_2 X_2 + \beta_4 X_4 + u, \\
\text{Mod3:} \quad & y = \beta_1 + \beta_3 X_3 + \beta_4 X_4 + u.
\end{aligned}$$

Table 3 presents the VIF and the VIFnc of these models. Note that by using the original variables applied by Belsley (Mod1), the traditional VIF (from the centered model, see Theil [19]) provides a value equal to 1 (its minimum possible value), while the VIFnc is equal to 100,032.1. If the additional variable X 4 is included (Mod0), the traditional VIFs are also close to one while the noncentered VIFs present values higher than 100,000. The conclusion is that the VIF is not detecting the existence of nonessential multicollinearity (see Salmerón et al. [8]) while the VIFnc "does detect it". However, since the calculation of the VIFnc excludes the constant term, the detected relation refers to the one between X 2 and X 3 , and not to the relation between X 2 and/or X 3 with the intercept. This fact is supported by the values obtained for the VIF and VIFnc of the second and fourth variables (Mod2) and of the third and fourth variables (Mod3).
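The contrast between the two factors can be reproduced on synthetic data. The following is a minimal sketch, not the Belsley data of Table 1: the three variables are simulated independently (two with very little variability around a nonzero mean, one with a large variance), and the helper names `rss`, `vif_centered` and `vif_noncentered` are hypothetical.

```python
import numpy as np

def rss(target, regressors):
    """Residual sum of squares of the OLS regression of target on regressors."""
    coef = np.linalg.lstsq(regressors, target, rcond=None)[0]
    resid = target - regressors @ coef
    return float(resid @ resid)

def vif_centered(X, j):
    """Traditional VIF: the auxiliary regression includes an intercept."""
    n = X.shape[0]
    xj = X[:, j]
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    tss = float(np.sum((xj - xj.mean()) ** 2))
    return tss / rss(xj, others)

def vif_noncentered(X, j):
    """VIFnc: the auxiliary regression has no intercept (raw sums of squares)."""
    xj = X[:, j]
    return float(xj @ xj) / rss(xj, np.delete(X, j, axis=1))

# Simulated regressors (assumed for illustration, not the data of the paper).
rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(5.0, 0.01, n)   # little variability
x3 = rng.normal(-4.0, 0.01, n)  # little variability
x4 = rng.normal(4.0, 4.0, n)    # large variability
X = np.column_stack([x2, x3, x4])

vif = [vif_centered(X, j) for j in range(3)]
vifnc = [vif_noncentered(X, j) for j in range(3)]
for name, v, vnc in zip(["X2", "X3", "X4"], vif, vifnc):
    print(f"{name}: VIF = {v:.3f}, VIFnc = {vnc:.1f}")
```

In this setting the centered VIFs stay close to one for all three variables, while the VIFnc is very large only for the two variables with little variability, which is the pattern described for Mod0.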

What Kind of Multicollinearity Detects the VIFnc?
The results of Example 1 for Mod0 suggest a new definition of nonessential multicollinearity as the relation between at least two variables with little variability. Thus, the particular case when one of these variables is the intercept leads to the definition initially given by Marquardt and Snee [6]. Consequently, the initial idea that it is not possible to find nonessential collinearity in a noncentered model needs to be nuanced.
By following Salmerón et al. [8] and Salmerón Gómez et al. [10], it can be concluded that the VIF only detects the essential multicollinearity and, with these results, the VIFnc detects the nonessential multicollinearity but in its generalized definition since the intercept is eliminated in the corresponding auxiliary regression.
This seems to contradict the fact that the VIFnc coincides with the index of Stewart, see expression (7), since this measure is able to detect the nonessential multicollinearity (see Salmerón Gómez et al. [10]). The explanation is that the VIFnc can be fooled by including the constant as an independent variable in a model without the intercept, that is:

$$y = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + u,$$

where $X_1$ is a column of ones but is not considered as the intercept.
Example 2. Now, we start from Mod1 in the Belsley example but include the constant as an independent variable in a model without the intercept (Mod4), together with two additional models (Mod5 and Mod6). Table 4 presents the VIFnc obtained from expression (5) in Models 4-6. Results indicate that, considering the centered model and calculating the coefficient of determination of the auxiliary regressions as if the model were noncentered, it is possible to detect the nonessential multicollinearity. Thus, the contradiction indicated at the beginning of this subsection is resolved.
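The effect of treating the constant as one more regressor can be sketched numerically. The code below uses simulated data (an assumption; it does not reproduce Table 4) with a column of ones stored as an ordinary column, computes the VIFnc of every column, and checks that for one variable it equals the centered VIF plus the term $n \bar{X}_j^2 / RSS_j$.

```python
import numpy as np

def rss(target, regressors):
    """Residual sum of squares of the OLS regression of target on regressors."""
    coef = np.linalg.lstsq(regressors, target, rcond=None)[0]
    resid = target - regressors @ coef
    return float(resid @ resid)

def vif_noncentered(M, j):
    """VIFnc of column j of M; no intercept is added to the auxiliary regression."""
    xj = M[:, j]
    return float(xj @ xj) / rss(xj, np.delete(M, j, axis=1))

# Simulated data (assumed for illustration): two variables with little variability.
rng = np.random.default_rng(2)
n = 100
ones = np.ones(n)
x2 = rng.normal(5.0, 0.01, n)
x3 = rng.normal(-4.0, 0.01, n)

# The constant enters as an ordinary regressor of a model without intercept.
W = np.column_stack([ones, x2, x3])
vifnc = [vif_noncentered(W, j) for j in range(3)]

# Centered VIF of x2 and the extra term n * mean^2 / RSS for the same regression.
rss_2 = rss(x2, np.column_stack([ones, x3]))
vif_2 = float(np.sum((x2 - x2.mean()) ** 2)) / rss_2
identity_rhs = vif_2 + n * x2.mean() ** 2 / rss_2

print("VIFnc (constant, X2, X3):", [round(v, 1) for v in vifnc])
print("VIF(X2) + n * mean^2 / RSS:", round(identity_rhs, 1))
```

Here the centered VIF of the variable remains close to one, so almost all of its VIFnc comes from the mean term, that is, from its relation with the column of ones.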

Effects of the VIFnc on the Statistical Analysis of the Model
Given the model (1), the expression obtained for the variance of the estimator is given by:

$$\mathrm{var}(\hat{\beta}_j) = \frac{\sigma^2}{RSS_j}, \quad j = 1, \dots, k, \tag{10}$$

where $RSS_j$ is the residual sum of squares of the auxiliary regression of the $j$-th independent variable as a function of the rest of the independent variables (see expression (6)). From expression (10), and considering that expression (7) can be rewritten as:

$$RSS_j = \frac{X_j^t X_j}{\mathrm{VIFnc}(j)},$$

it is possible to obtain:

$$\mathrm{var}(\hat{\beta}_j) = \frac{\sigma^2}{X_j^t X_j} \cdot \mathrm{VIFnc}(j). \tag{11}$$

Establishing a model as a reference is required to conclude whether the variance has been inflated (see, for example, Cook [20]). Thus, if the variables in $X$ are orthogonal, it is verified that $X^t X = \mathrm{diag}(X_1^t X_1, \dots, X_k^t X_k)$, and consequently, the variance of the estimated coefficients in the hypothetical orthogonal case is given by the following expression:

$$\mathrm{var}(\hat{\beta}_{j,o}) = \frac{\sigma^2}{X_j^t X_j}.$$

In this case:

$$\frac{\mathrm{var}(\hat{\beta}_j)}{\mathrm{var}(\hat{\beta}_{j,o})} = \mathrm{VIFnc}(j),$$

and it is then possible to state that the VIFnc is a factor that inflates the variance. As a consequence, high values of $\mathrm{VIFnc}(j)$ imply high values of $\mathrm{var}(\hat{\beta}_j)$ and a tendency not to reject the null hypothesis in the individual significance test of model (1). Thus, the statistical analysis of the model will be affected.
Note from expression (11) that this negative effect can be offset by low values of the estimation of σ 2 , that is, low values of the residual sum of squares of model (1) or high values of the number of observations, n. This is similar to what happens with the VIF (see O'Brien [7] for more details).
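The role of the VIFnc as an inflation factor can be verified directly from the matrix $(X^t X)^{-1}$. The following sketch uses a simulated design matrix (an assumption for illustration) and sets $\sigma^2 = 1$, so that the variance of each estimator is the corresponding diagonal element of $(X^t X)^{-1}$; the helper `rss` is hypothetical.

```python
import numpy as np

def rss(target, regressors):
    """Residual sum of squares of the OLS regression of target on regressors."""
    coef = np.linalg.lstsq(regressors, target, rcond=None)[0]
    resid = target - regressors @ coef
    return float(resid @ resid)

# Simulated design matrix (assumed for illustration).
rng = np.random.default_rng(4)
n, k = 200, 3
X = np.column_stack([
    rng.normal(2.0, 1.0, n),
    rng.normal(3.0, 1.0, n),
    rng.normal(-1.0, 1.0, n),
])

# With sigma^2 = 1, var(beta_j) is the j-th diagonal element of (X^t X)^{-1}.
unit_var = np.diag(np.linalg.inv(X.T @ X))

rss_aux = np.array([rss(X[:, j], np.delete(X, j, axis=1)) for j in range(k)])
norms2 = np.array([float(X[:, j] @ X[:, j]) for j in range(k)])
vifnc = norms2 / rss_aux

# Variance in the hypothetical orthogonal case is 1 / (X_j^t X_j).
inflation = unit_var * norms2

print("var(beta_j) / sigma^2:", unit_var)
print("1 / RSS_j:            ", 1.0 / rss_aux)
print("inflation vs. VIFnc:  ", inflation, vifnc)
```

The ratio between the actual variance and the orthogonal reference reproduces the VIFnc, which is never below one because the residual sum of squares of the auxiliary regression cannot exceed $X_j^t X_j$.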

Auxiliary Centered Regressions
The use of the coefficient of determination of the auxiliary regression (6), where matrix X −j contains a column of ones that represents the intercept, is a very common approach to detect the linear relations between the independent variables of the model (1). The motivation is that the stronger the relation between X j and the rest of the independent variables, that is, the higher the multicollinearity, the higher the value of that coefficient of determination.
However, since the coefficient of determination ignores the role of the intercept, this measure is unable to detect the nonessential linear relations. The question is evident: Does another measure exist related to the auxiliary regression that allows detection of the nonessential multicollinearity?

Case When There Is Only Nonessential Multicollinearity
Example 3. Suppose that 100 observations are simulated for variables X, Z and W from normal distributions with a mean of 5, 4 and −4 and a standard deviation of 0.01, 4 and 0.01, respectively. Note that X and W present little variability and, for this reason, it is expected that the model presents nonessential multicollinearity.
Then, y = 1 + X + Z − W + v is generated by simulating v as a normal distribution with a mean equal to 0 and a standard deviation equal to 2.
The second column of Table 5 presents the results obtained after the estimation by ordinary least squares (OLS) of model y = β 1 · 1 + β 2 · X + β 3 · Z + β 4 · W + u. Note that the estimations of the coefficients of the model differ substantially from the real values used to generate y, except for the coefficient of the variable Z, which is the variable free of multicollinearity (indeed, it is the only coefficient significantly different from zero at the 5% significance level, the value used by default in this paper). This situation illustrates the fact that if the interest is to estimate the effect of variable Z on y, the analysis will not be influenced by the linear relations between the rest of the independent variables. This table also shows the results obtained from the estimations of the centered auxiliary regressions. Note that the coefficients of determination are very small, and consequently, the associated VIFs do not detect the degree of multicollinearity. However, note that in the auxiliary regressions corresponding to variables X and W:
• The estimation of the coefficient of the intercept almost coincides with the mean from which each variable was generated, 5 and −4, and, at the same time, the coefficients of the rest of the independent variables are almost zero.
• The estimations of the coefficients of the intercept are the only ones that are significantly different from zero.
Thus, note that the auxiliary regressions are capturing the existence of nonessential multicollinearity. The problem is that it is not transferred to its coefficient of determination but to another characteristic.
From this finding, it is possible to propose a way to detect the nonessential multicollinearity from the centered auxiliary regression traditionally applied to calculate the VIF:

Condition 1 (C1):
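Quantify the contribution of the estimation of the intercept to the total sum of the estimations of the coefficients of model (6), that is, calculate:

$$C1 = \frac{|\hat{\delta}_1|}{\sum_{i \neq j} |\hat{\delta}_i|} \cdot 100\%,$$

where $\hat{\delta}_1$ is the estimated intercept and $\hat{\delta}_i$, $i \neq j$, are all the estimated coefficients of the centered auxiliary regression of $X_j$.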

Condition 2 (C2):
Calculate the number of independent variables with coefficients significantly different from zero and quantify the contribution of the intercept.
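Both conditions can be computed from a single OLS fit of the centered auxiliary regression. The following is a minimal sketch on data simulated as in Example 3 (the seed and the function name `c1_c2` are assumptions, so the figures will not match Table 5). Significance is judged with the large-sample critical value 1.96 instead of the exact Student's t quantile, which is a simplifying assumption; C2 is returned as `None` when no coefficient is significant, mirroring the NA case.

```python
import numpy as np

def c1_c2(target, others, t_crit=1.96):
    """Conditions C1 and C2 for the centered auxiliary regression of target.

    C1: share (in %) of |intercept| in the sum of absolute estimated coefficients.
    C2: share (in %) of the intercept among the significant coefficients,
        or None when no coefficient is significant.
    """
    n = len(target)
    D = np.column_stack([np.ones(n), others])  # intercept is column 0
    coef, *_ = np.linalg.lstsq(D, target, rcond=None)
    resid = target - D @ coef
    s2 = float(resid @ resid) / (n - D.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(D.T @ D)))
    significant = np.abs(coef / se) > t_crit  # approximate 5% two-sided test
    c1 = 100.0 * abs(coef[0]) / float(np.sum(np.abs(coef)))
    n_sig = int(significant.sum())
    c2 = None if n_sig == 0 else 100.0 * float(significant[0]) / n_sig
    return float(c1), c2

# Data simulated as described in Example 3 (seed chosen arbitrarily).
rng = np.random.default_rng(3)
n = 100
X = rng.normal(5.0, 0.01, n)
Z = rng.normal(4.0, 4.0, n)
W = rng.normal(-4.0, 0.01, n)

c1_x, c2_x = c1_c2(X, np.column_stack([Z, W]))
c1_w, c2_w = c1_c2(W, np.column_stack([X, Z]))
c1_z, c2_z = c1_c2(Z, np.column_stack([X, W]))

print(f"X: C1 = {c1_x:.3f}%, C2 = {c2_x}")
print(f"W: C1 = {c1_w:.3f}%, C2 = {c2_w}")
print(f"Z: C1 = {c1_z:.3f}%, C2 = {c2_z}")
```

For the two variables with little variability, the intercept is expected to dominate C1 and to be among the significant coefficients, whereas for Z the two conditions should be read together, since a high C1 alone is not conclusive.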
A Monte Carlo simulation is presented considering the model (1) where k = 3, the variable $X_2$ has been generated as a normal distribution with mean $\mu_2 \in A$ and variance $\sigma_2^2 \in B$, and the variable $X_3$ has been generated as a normal distribution with mean $\mu_3 \in A$ and variance $\sigma_3^2 \in B$. Considering the thresholds established by Salmerón Gómez et al. [10], 90% of the simulations present values for condition C1 between 99.402% and 99.999% if CV < 0.06674082 and between 95.485% and 99.999% if CV < 0.1002506. Thus, we can consider that values of condition C1 higher than 95.485% will indicate that the auxiliary centered regressions are detecting the presence of nonessential multicollinearity. Table 7 shows that a high value is obtained for the condition C1 even if no estimated coefficient is significantly different from zero (C2 = NA).
Thus, the previous threshold, 95.485%, will be considered as valid if it is accompanied by a high value in the second condition.
Example 4. Applying these criteria to the data of Example 1 for Mod1, it is obtained that:
• In the auxiliary regression X 2 = δ 1 · 1 + δ 3 · X 3 + w, the estimation of the intercept is equal to 99.988% of the total, and the individual significance of the intercept corresponds to 100% of the significant estimated coefficients.
• In the auxiliary regression X 3 = δ 1 · 1 + δ 2 · X 2 + w, the estimation of the intercept is equal to 99.988% of the total, and the individual significance of the intercept corresponds to 100% of the significant estimated coefficients.
Thus, the symptoms shown in the previous simulation also appear, and consequently, in both situations, the nonessential multicollinearity will be detected.
Replicating both situations where the VIFnc was not able to detect the nonessential multicollinearity, it is obtained that:
• For Mod2:
  - In the auxiliary regression X 2 = δ 1 · 1 + δ 4 · X 4 + w, the estimation of the intercept is equal to 99.978% of the total, and the individual significance of the intercept corresponds to 100% of the significant estimated coefficients.
  - In the auxiliary regression X 4 = δ 1 · 1 + δ 2 · X 2 + w, the estimation of the intercept is equal to 50.138% of the total, and none of the estimated coefficients are significantly different from zero.
• For Mod3:
  - In the auxiliary regression X 3 = δ 1 · 1 + δ 4 · X 4 + w, the estimation of the intercept is equal to 99.984% of the total, and the individual significance of the intercept corresponds to 100% of the significant estimated coefficients.
  - In the auxiliary regression X 4 = δ 1 · 1 + δ 3 · X 3 + w, the estimation of the intercept is equal to 50.187% of the total, and none of the estimated coefficients are significantly different from zero.
Once again, it was shown that with this procedure, it is possible to detect the nonessential multicollinearity and the variables that are causing it.

Relevance of a Variable in a Regression Model
Note that the conditions C1 and C2 are focused on measuring the relevance of one of the variables, in this case, the intercept, within the multiple linear regression model. It is interesting to analyze the behavior of other measures with this same goal as, for example, the index ı j of Stewart [9]. Given model (1), Stewart defined the relevance of the j-th variable as a number, ı j , based on the usual Euclidean norm || · ||. Stewart considered that a variable with a relevance higher than 0.5 should not be ignored. Table 8 presents the calculation of ı j for the situations shown in Example 1. Note that in all cases, the intercept will be considered relevant, even when the variable X 4 is analyzed as a function of X 2 or X 3 , even though it was previously shown that the intercept was not relevant in these situations (at least in relation to nonessential multicollinearity).

Thus, the application of ı j does not seem to be appropriate, contrary to what happens with conditions C1 and C2.

Case When There Is Generalized Nonessential Multicollinearity
Example 6. Suppose that the previous simulation is repeated, except for the generation of the variable Z, which, in this case, is considered to be given by Z i = 2 · X i − a i , for i = 1, . . . , 100, where a i is generated from a normal distribution with a mean equal to 2 and a standard deviation equal to 0.01. Table 9 presents the results of the estimation by OLS of the model y = β 1 · 1 + β 2 · X + β 3 · Z + β 4 · W + u and its possible auxiliary regressions.
In this case, none of the coefficients are significantly different from zero and the coefficients are very far from the real values used in the simulation. In relation to the auxiliary regressions, it is possible to conclude that:
• When the dependent variable is X, the coefficients that are significantly different from zero are the ones of the intercept and the variable Z. At the same time, the estimation of the coefficient of the intercept differs from the mean from which the variable X was generated. In this case, the contribution of the estimation of the intercept is equal to 83.837% of the total and represents 50% of the coefficients significantly different from zero.
• When the dependent variable is Z, the coefficients significantly different from zero are the ones of the intercept and the variable X. In this case, the contribution of the estimation of the intercept is equal to 53.196% of the total and represents 50% of the coefficients significantly different from zero.
• When the dependent variable is W, the signs shown in the previous section are maintained. In this case, the contribution of the intercept is equal to 95.829% of the total and represents 100% of the coefficients significantly different from zero.
• Finally, although it will require a deeper analysis, the last results indicate that the estimated coefficients that are significantly different from zero in the auxiliary regression identify the variables responsible for the existing linear relation (intercept included).
Note that the existence of generalized nonessential multicollinearity distorts the symptoms previously detected. In a centered auxiliary regression, a contribution (in absolute terms) of the estimation of the intercept to the total sum (in absolute value) of all estimations close to 100%, together with the intercept being the only estimation significantly different from zero, are indications of nonessential multicollinearity. However, it is possible that these symptoms are not manifested even though worrisome nonessential multicollinearity exists. Thus, these conditions are sufficient but not necessary.
However, in the situations shown in Example 6 where conditions C1 and C2 are not verified, the VIFnc will be equal to 1,109,259.3, 758,927.7 and 100,912.7. These results complement those presented in the previous section in relation to the VIFnc: the VIFnc detects generalized nonessential multicollinearity, while conditions C1 and C2 detect the traditional nonessential multicollinearity given by Marquardt and Snee [6].

Empirical Applications
In order to illustrate the contribution of this study, this section presents two empirical applications with financial and economic real data. Note that in a financial prediction model, a financial variable with low variance means low risk and a better prediction, because the standard deviation and volatility are lower. However, as discussed above, a lower variance of the independent variable may mean greater nonessential multicollinearity in a GLR model. Thus, the existence of worrisome nonessential collinearity may be relatively common in financial econometric models and this idea can be extended in general to economic applications. Note that the objective is to diagnose the type of multicollinearity existing in the model and indicate the most appropriate treatment (without applying it).

Financial Empirical Application
The following model of Euribor (100%) is specified from the data set composed of 47 Eurozone observations for the period January 2002 to July 2013 (quarterly and seasonally adjusted data) and previously applied by Salmerón Gómez et al. [10]:

$$\text{Euribor} = \beta_1 + \beta_2 \, \text{HICP} + \beta_3 \, \text{BC} + u, \tag{13}$$

where HICP is the Harmonized Index of Consumer Prices (100%), BC is the Balance of Payments to net current account (millions of euros) and u is a random disturbance (centered, homoscedastic, and uncorrelated). Table 10 presents the analysis of model (13) and its corresponding auxiliary regressions. The values of the VIFs, which are very close to one, indicate that there is no essential multicollinearity. The correlation coefficient between HICP and BC is 0.231 and the determinant of the correlation matrix is 0.946. Both values indicate that there is no essential multicollinearity, see García García et al. [21] and Salmerón Gómez et al. [22].
However, the condition number is higher than 30, indicating a strong multicollinearity associated, see conditions C1 and C2, with variable HICP. The values of conditions C1 and C2 are conclusive in the case of variable HICP. In the case of variable BC, although condition C1 presents a high value, none of the coefficients of the auxiliary regression is significantly different from zero (condition C2). By following the simulation presented in Section 4.1, this indicates that the variable BC is not related to the intercept. This conclusion is in line with the value of the coefficient of variation of variable HICP, which is lower than 0.1002506, the threshold established by Salmerón Gómez et al. [10] for moderate nonessential multicollinearity. Table 11 presents the calculation of the VIFnc. Note that it is not detecting the nonessential multicollinearity. As previously commented, the VIFnc only detects the essential and the generalized nonessential multicollinearity. This table also presents the VIFnc calculated in a model without intercept but including the constant as an independent variable (see Section 2.3). In this case, the VIFnc is able to detect the nonessential multicollinearity between the intercept and the variable HICP.
In conclusion, this model will present nonessential multicollinearity caused by the variable HICP. This problem can be mitigated by centering that variable (see, for example, Marquardt and Snee [6] and Salmerón Gómez et al. [10]).

Economic Empirical Application
From the French economy data of Chatterjee and Hadi [23], also analyzed by Malinvaud [24], Zhang and Liu [25] and Kibria and Lukman [26], among others, the following model is analyzed:

$$I = \beta_1 + \beta_2 \, DP + \beta_3 \, SF + \beta_4 \, DC + u, \tag{14}$$

for years 1949 through 1966, where imports (I), domestic production (DP), stock formation (SF) and domestic consumption (DC) are all measured in billions of French francs and u is a random disturbance (centered, homoscedastic, and uncorrelated). Table 12 presents the analysis of model (14) and its corresponding auxiliary regressions. The values of the VIFs of variables DP and DC indicate strong essential multicollinearity. The condition number is higher than 30, also indicating a strong multicollinearity.
Note that the values of condition C1 for variables DP and DC are lower than the threshold shown in the simulation. Only the variable SF presents a higher value but, in this case, condition C2 indicates that none of the estimated coefficients of the auxiliary regression are significantly different from zero. This conclusion is in line with the coefficients of variation, which are higher than the threshold established by Salmerón Gómez et al. [10], indicating that there is no nonessential multicollinearity. Table 13 presents the calculation of the VIFnc. Note that it is detecting the essential multicollinearity. This table also presents the VIFnc calculated in a model without intercept but including the constant as an independent variable. In this case, the VIFnc is also detecting the essential multicollinearity between the variables DP and DC. From the thresholds established by Salmerón Gómez et al. [10] for simple linear regression (k = 2), the value 60.0706 will not be worrisome and, consequently, the nonessential multicollinearity will not be worrisome. To conclude, this model presents essential multicollinearity caused by the variables DP and DC. In this case, the problem will be mitigated by applying estimation methods other than OLS such as ridge regression (see, for example, Hoerl and Kennard [27], Hoerl et al. [28], Marquardt [29]), LASSO regression (see Tibshirani [30]), raise regression (see, for example, García et al. [31], Salmerón et al. [32], García and Ramírez [33], Salmerón et al. [34]), residualization (see, for example, York [35], García et al. [36]) or the elastic net regularization (see Zou and Hastie [37]).

Conclusions
The distinction between essential and nonessential multicollinearity and its diagnosis has not been adequately treated in either the scientific literature or in statistical software, and this lack of information has led to mistakes in some relevant papers, for example Velilla [3] or Jensen and Ramirez [13]. This paper analyzes the detection of essential and nonessential multicollinearity from auxiliary centered and noncentered regressions, obtaining two complementary measures that are able to detect both kinds of multicollinearity. The relevance of the results is that they are obtained within an econometric context, encompassing the distinction between centered and noncentered models, and not only from a numerical perspective, as was the case presented, for example, in Salmerón Gómez et al. [12] or Salmerón Gómez et al. [10]. An undoubtedly interesting point of view of this situation is the one presented by Spanos [38], who stated: "It is argued that many confusions in the collinearity literature arise from erroneously attributing symptoms of statistical misspecification to the presence of collinearity when the latter is misdiagnosed using unreliable statistical measures." That is, the distinction related to the econometric model provides confidence in the measures of detection and avoids the problems commented on by Spanos.
From a computational point of view, this debate clarifies what is calculated when the VIF is obtained for centered and noncentered models. It also clarifies, see Section 2.3, what type of multicollinearity is detected (and why) when the uncentered VIF is calculated in a centered model. At the same time, a definition of nonessential multicollinearity is presented that generalizes the definition given by Marquardt and Snee [6]. Note that this generalization can be understood as a particular kind of essential multicollinearity: a near-linear relation between two independent variables with little variability. However, it is shown that this kind of multicollinearity is not detected by the VIF, and for this reason, we consider it more appropriate to include it within the nonessential multicollinearity.
In relation to the application of the VIFnc, this paper shows that the VIFnc detects the essential and the generalized nonessential multicollinearity and even the traditional nonessential multicollinearity if it is calculated in a regression without the intercept but including the constant as an independent variable. Note that the VIF, although widely applied in many different fields, only detects the essential multicollinearity. This paper has also analyzed why the VIF is unable to detect the nonessential multicollinearity, and two conditions are presented as sufficient (but not required) to establish the existence of nonessential multicollinearity. Since these conditions, C1 and C2, are based on the relevance of the intercept within the centered auxiliary regression to calculate the VIF, this scenario was compared to the measure proposed by Stewart [9], ı j , to measure the relative importance of a variable within a multiple linear regression. It is shown that conditions C1 and C2 are preferable to the calculation of ı j .
To summarize:
• A centered model can present essential, generalized nonessential and traditional nonessential collinearity (given by Marquardt and Snee [6]), while in a noncentered model it is only possible to find the essential and the generalized nonessential collinearity.
• The VIF only detects the essential collinearity, the VIFnc detects the generalized nonessential and essential collinearity, and the conditions C1 and C2 detect the traditional nonessential collinearity.
• When there is generalized nonessential collinearity, it is understood that there is also traditional nonessential collinearity, but this is not detected by the conditions C1 and C2. Thus, in this case it is necessary to use alternative measures such as the coefficient of variation or the condition number.
To conclude, in order to detect the kind of multicollinearity and its degree, the greatest possible number of measures should be used (variance inflation factors, condition number, correlation matrix and its determinant, coefficient of variation, conditions C1 and C2, etc.), as in Section 5, and it is inefficient to limit oneself to only a few. Similarly, it is necessary to know what kind of multicollinearity each one of them is capable of detecting.
Finally, the following will be interesting as future lines of inquiry:
• to establish the threshold for the VIFnc,
• to extend the Monte Carlo simulation of Section 4.1 to models with k > 3 regressors,
• a deeper analysis to conclude if the variable responsible for the existing linear relation can be identified as the one whose estimated coefficient is significantly different from zero in the auxiliary regression (see Example 6) and
• the development of a specific package in R [39] to perform the calculation of the VIFnc and conditions C1 and C2.