Abstract
Linear transformations such as min–max normalization and z-score standardization are commonly used in logistic regression for the purpose of scaling. However, the work in the literature on linear transformations in logistic regression has two major limitations. First, most work focuses on improving the fit of the regression model. Second, the effects of transformations are rarely discussed. In this paper, we first generalized a linear transformation for a single variable to multiple variables by matrix multiplication. We then studied various effects of a generalized linear transformation in logistic regression. We showed that an invertible generalized linear transformation has no effect on predictions, multicollinearity, quasi-complete separation and complete separation. We also showed that multiple linear transformations do not affect the variance inflation factor (VIF). Numeric examples with real data were presented to validate our results. These invariance results justify the use of linear transformations in logistic regression.
Keywords:
logistic regression; linear transformations; predictions; ordinary least squares estimator; maximum likelihood estimator
MSC:
62J12
1. Introduction
Logistic regression is one of the most commonly used techniques for modeling the relationship between the dependent variable and one or more independent variables.
In data analysis and machine learning, a transformation refers to a mapping of a variable into a new variable. A transformation can be linear or nonlinear, depending on whether the mapping is linear or nonlinear. Linear transformations can be used to improve interpretability of coefficients in linear regression and make a fitted model easier to understand [1], whereas nonlinear transformations are often used to improve the fit of the model on the data [2].
Three types of linear transformations are commonly used in machine learning prior to model fitting, namely, min–max normalization, z-score standardization and simple scaling. Since variables measured on different scales may not contribute equally to model fitting, min–max normalization transforms all continuous variables into the same range [0, 1] to avoid a possible bias. Essentially, min–max normalization subtracts the minimum value of a continuous variable from each value and then divides by the range of the variable. z-score standardization rescales a continuous variable to the standard scale, i.e., the number of standard deviations each value lies from the mean. Mathematically, z-score standardization subtracts the mean value of a continuous variable from each value and then divides by the standard deviation of the variable. Simple scaling multiplies a continuous variable by a constant, shrinking variables with large values and expanding those with small values. All three types of linear transformations are discussed by Adeyemo, Wimmer and Powell [3] for logistic regression.
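As a concrete illustration, the three linear transformations above can be sketched in a few lines of Python (Python is used here purely for illustration; the data and the scaling constant are made up):

```python
import numpy as np

# Toy column of values for one continuous variable (illustrative only)
x = np.array([2.0, 4.0, 6.0, 10.0])

# Min-max normalization: subtract the minimum, divide by the range -> [0, 1]
min_max = (x - x.min()) / (x.max() - x.min())

# z-score standardization: subtract the mean, divide by the standard deviation
z_score = (x - x.mean()) / x.std(ddof=1)

# Simple scaling: divide by a constant (10 is an arbitrary choice here)
scaled = x / 10.0

assert min_max.min() == 0.0 and min_max.max() == 1.0  # lies in [0, 1]
assert abs(z_score.mean()) < 1e-12                    # centered at 0
```

All three are linear maps of the form ax + b with different choices of slope and shift.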
However, the work in the literature on transformations in regression has some limitations. First, most work focuses on improving the fit of the regression model [4,5,6,7,8,9]. Second, the effects of transformations are rarely discussed. Morrell, Pearson, and Brant [10] examined how linear transformations affected a linear mixed-effect model and the tests of significance of fixed effects in the model. They showed how linear transformations modified the random effects, their covariance matrix, and the value of the restricted log-likelihood. Zeng [11] studied invariant properties of some statistical measures under monotonic transformations for univariate logistic regression. Zeng [12] derived analytic properties of some well-known category encodings such as ordinal encoding, order encoding and one-hot encoding in multivariate logistic regression by means of linear transformations. Adeyemo, Wimmer and Powell [3] compared the prediction accuracy of the three types of linear transformations, min–max normalization, z-score standardization and simple scaling, in logistic regression, by means of simulation.
In this paper, we first generalized a linear transformation for a single variable to multiple variables by matrix multiplication. We then studied various effects of a generalized linear transformation in logistic regression. We showed that an invertible generalized linear transformation has no effect on predictions, multicollinearity, quasi-complete separation, and complete separation. We also showed that multiple linear transformations do not affect the variance inflation factor (VIF). Numeric examples with randomly generated transformations on real data were presented to illustrate our theoretical results.
The remainder of this paper is organized as follows. In Section 2, we give two definitions of a generalized linear transformation and show that they are equivalent. In Section 3, we study the effects of a generalized linear transformation on logistic regression. In Section 4, we present numeric examples to validate our theoretical results. Finally, the paper is concluded in Section 5.
Throughout the paper, we concentrate on transformations of independent variables, which are also sometimes called explanatory variables.
2. A Generalized Linear Transformation in Logistic Regression
Let $x = (x_1, x_2, \dots, x_p)'$ be the vector of independent variables and $y$ be the dependent variable. Let us consider a sample of $n$ independent observations $(x_{i1}, x_{i2}, \dots, x_{ip}, y_i)$, $i = 1, 2, \dots, n$, where $y_i$ is the value of $y$ and $x_{i1}, \dots, x_{ip}$ are the values of the independent variables for the $i$-th observation. Without loss of generality, we assume $x_1, \dots, x_p$ are all continuous variables, since otherwise they can be converted into continuous variables.
Let us adopt the matrix notation:
$$ y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, $$
where $x_{i0} = 1$ for all $i$ (used for the intercept $\beta_0$) and the $n \times (p+1)$ matrix $X$ is called the design matrix. Here, $\beta_0, \beta_1, \dots, \beta_p$ are called regression coefficients or regression parameters.
Without causing confusion, we also use $x_0, x_1, \dots, x_p$ to denote the columns or column vectors of $X$. We further use the capital letter $X_i$ to denote the row vector $(1, x_{i1}, \dots, x_{ip})$ of $X$ for $i = 1, 2, \dots, n$.
Definition 1.
A linear transformation is a linear function of a variable which maps or transforms the variable into a new one. Specifically, a linear transformation of a variable $x$ can be defined as $x^t = ax + b$, where $a$ and $b$ are constants and $a$ is nonzero. For convenience, let us call a linear transformation of a single variable a simple linear transformation. By multiple linear transformations, we mean a set of simple linear transformations. Here, we use the letter $t$ in the superscript to denote the new variable after a transformation.
Note that $a$ and $b$ in Definition 1 are not vectors since $x$ is a variable.
Definition 1 can be generalized naturally by matrix multiplication to transform a set of variables to a new set of variables.
Definition 2.
A generalized linear transformation is a linear matrix-vector expression
$$ (x_1^t, x_2^t, \dots, x_p^t)' = A(x_1, x_2, \dots, x_p)' + b $$
that transforms or maps the independent variables $x_1, \dots, x_p$ into new independent variables $x_1^t, \dots, x_p^t$, where $A$ is a $p \times p$ matrix of real numbers and the entries of $b = (b_1, \dots, b_p)'$ are real constants. Here, $x_1, \dots, x_p$ are variables, not vectors.
It should not be confused with a linear transformation between two vector spaces, in which there is no vector $b$. Here and hereafter, we use the prime symbol ′ in the superscript for the transpose of a vector or a matrix. The new variables in component form are
$$ x_j^t = a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jp}x_p + b_j, \quad j = 1, \dots, p. $$
Consider a simple linear transformation, $x_k^t = ax_k + b_0$, for some $k$ with $1 \le k \le p$. Without loss of generality, assume $k = 1$. Let $A$ be the $p$-dimensional diagonal matrix with $a_{11} = a$ and $a_{jj} = 1$ for $2 \le j \le p$. Let $b = (b_0, 0, \dots, 0)'$ be a $p$-dimensional column vector. Then $x_1, \dots, x_p$ are transformed into $x_1^t, \dots, x_p^t$ according to Definition 2. Similarly, consider a set of simple linear transformations, say, $x_j^t = a_jx_j + b_j$ for $1 \le j \le m$ with $m \le p$. Let $A$ be the $p$-dimensional diagonal matrix with $a_{jj} = a_j$ for $1 \le j \le m$ and $a_{jj} = 1$ for $m < j \le p$. Let $b = (b_1, \dots, b_m, 0, \dots, 0)'$ be a $p$-dimensional column vector. Then $x_1, \dots, x_p$ are transformed into $x_1^t, \dots, x_p^t$ according to Definition 2. Hence, both a simple linear transformation and multiple linear transformations are special cases of a generalized linear transformation.
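The diagonal construction just described can be sketched numerically as follows (Python used for illustration; the slopes and shifts are arbitrary):

```python
import numpy as np

p = 4                     # number of independent variables (illustrative)
m = 2                     # transform only the first m variables
a = np.array([2.0, 0.5])  # slopes a_j of the simple transformations
b_small = np.array([1.0, -3.0])  # shifts b_j

# Build A (diagonal: a_j for j <= m, 1 otherwise) and b (b_j then zeros)
A = np.diag(np.concatenate([a, np.ones(p - m)]))
b = np.concatenate([b_small, np.zeros(p - m)])

x = np.array([1.0, 2.0, 3.0, 4.0])  # one observation of the p variables
x_new = A @ x + b                   # generalized linear transformation

# Componentwise this is exactly the simple transformations,
# with the last p - m variables left unchanged:
assert np.allclose(x_new, [2*1.0 + 1.0, 0.5*2.0 - 3.0, 3.0, 4.0])
```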
However, Definition 2 is not convenient to use since the new design matrix is somewhat complicated. Therefore, we give another definition that incorporates the design matrix.
Definition 3.
A generalized linear transformation is a matrix multiplication $X^t = XC$ that transforms $X$ into $X^t$, where the 2nd to the last columns of $X^t$ are the new variables $x_1^t, \dots, x_p^t$ and $C$ is a $(p+1) \times (p+1)$ matrix of real numbers as follows:
$$ C = \begin{pmatrix} 1 & c_{01} & \cdots & c_{0p} \\ 0 & c_{11} & \cdots & c_{1p} \\ \vdots & \vdots & & \vdots \\ 0 & c_{p1} & \cdots & c_{pp} \end{pmatrix}. \quad (2) $$
Note that we require the first column of $C$ to be 0 except the first entry (which is 1) in order for $X^t = XC$ to be the new design matrix.
For convenience, let us partition $C$ into 4 blocks such that
$$ C = \begin{pmatrix} 1 & C_1 \\ 0 & C_{11} \end{pmatrix}, \quad (3) $$
where $C_1$ is the $p$-dimensional row vector $(c_{01}, \dots, c_{0p})$, 0 is the $p$-dimensional column vector of all 0's, and $C_{11}$ is the submatrix obtained by deleting the first column and the first row of $C$, that is, $C_{11} = (c_{ij})_{1 \le i,j \le p}$.
In the following, we prove that the two definitions of a generalized linear transformation are equivalent.
Theorem 1.
Definition 2 and Definition 3 are equivalent.
Proof.
Let us begin with Definition 2. Its new design matrix is
$$ X^t = \begin{pmatrix} 1 & (A(x_{11}, \dots, x_{1p})' + b)' \\ \vdots & \vdots \\ 1 & (A(x_{n1}, \dots, x_{np})' + b)' \end{pmatrix} = X \begin{pmatrix} 1 & b' \\ 0 & A' \end{pmatrix}. $$
Hence, the new design matrix of Definition 2 is in the form of Definition 3 with
$$ C = \begin{pmatrix} 1 & b' \\ 0 & A' \end{pmatrix}. $$
Note that the submatrix obtained by deleting the first row and first column of the matrix $C$ above is the transpose of $A$, that is, $C_{11} = A'$.
Next, let us begin with Definition 3.
The second, third, …, last columns of $X^t = XC$ are from the linear transformation
$$ x_j^t = c_{1j}x_1 + c_{2j}x_2 + \cdots + c_{pj}x_p + c_{0j}, \quad j = 1, \dots, p, $$
respectively. Hence, Definition 3 is in the form of Definition 2 with
$$ A = C_{11}' $$
and
$$ b = C_1'. $$
We have concluded our proof. □
If we expand along the first column to find the determinant of $C$ in (2), we immediately see that the determinant of $C$ is equal to the determinant of $C_{11}$. Therefore, $C$ is nonsingular (or invertible) if and only if $C_{11}$ in (3) is nonsingular. In addition, it follows from the proof of Theorem 1 that $C$ is nonsingular if and only if $A$ in Definition 2 is nonsingular.
Moreover, it is easy to see that if $C_{11}$ is nonsingular, then the inverse of $C$ can be written as
$$ C^{-1} = \begin{pmatrix} 1 & -C_1C_{11}^{-1} \\ 0 & C_{11}^{-1} \end{pmatrix}. \quad (9) $$
From now on we will use Definition 3 unless otherwise specified. For convenience, let us call the generalized linear transformation $X^t = XC$ invertible if $C$ is invertible.
3. Effects of a Generalized Linear Transformation
In logistic regression, the dependent variable $y$ is binary with the two values 0 and 1. Let the conditional probability that $y = 1$ be denoted by
$$ \pi = P(y = 1 \mid x_1, \dots, x_p). $$
Logistic regression assumes logit linearity between the log odds and the independent variables:
$$ \log\frac{\pi}{1-\pi} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p. \quad (10) $$
Equation (10) above can be written as
$$ \pi = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}}. \quad (11) $$
The following log likelihood is used in logistic regression:
$$ l(\beta) = \sum_{i=1}^{n} \left[ y_i \log \pi_i + (1 - y_i) \log(1 - \pi_i) \right]. \quad (12) $$
The maximum likelihood method is used to estimate the parameters in logistic regression. Specifically, the maximum likelihood estimators (MLE) are the values of the parameters that maximize (12). The vector $\hat\beta$ of the MLE estimators of $\beta_0, \beta_1, \dots, \beta_p$ satisfies [13]
$$ \sum_{i=1}^{n} (y_i - \pi_i)\,x_{ij} = 0, \quad j = 0, 1, \dots, p, \quad (13) $$
or in matrix-vector form
$$ X'(y - \pi) = 0, \quad (14) $$
where $\pi = (\pi_1, \dots, \pi_n)'$ and $\pi_i = 1/(1 + e^{-X_i\beta})$ for $i = 1, \dots, n$. Note that after a generalized linear transformation, (12) and (14) hold with the design matrix $X$ replaced by the new design matrix $X^t = XC$.
Equation (13) or (14) represents $(p+1)$ nonlinear equations in $\beta_0, \beta_1, \dots, \beta_p$ and cannot be solved explicitly in general [14]. Rather, they can be solved numerically by the Newton-Raphson algorithm [15] as follows:
$$ \beta^{(k+1)} = \beta^{(k)} + (X'WX)^{-1}X'(y - \pi), \quad (15) $$
where $W$ is the diagonal matrix with diagonal elements $\pi_i(1 - \pi_i)$, $i = 1, \dots, n$. Both $W$ and $\pi$ are evaluated at $\beta^{(k)}$ in (15).
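Iteration (15) can be sketched in a few lines. The following Python code (with simulated data, not the paper's) implements the Newton-Raphson update above and checks that the score equations (14) hold at convergence:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design matrix with intercept
true_beta = np.array([0.5, 1.0, -1.0])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_beta))).astype(float)

beta = np.zeros(p + 1)
for _ in range(25):
    pi = 1 / (1 + np.exp(-X @ beta))   # current fitted probabilities
    W = np.diag(pi * (1 - pi))         # diagonal weight matrix in (15)
    step = np.linalg.solve(X.T @ W @ X, X.T @ (y - pi))
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

# At the MLE, the score equations X'(y - pi) = 0 from (14) hold:
score = X.T @ (y - 1 / (1 + np.exp(-X @ beta)))
assert np.max(np.abs(score)) < 1e-8
```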
If $X'WX$ is nonsingular and the data are not completely separable or quasi-completely separable [16], then the MLE estimator $\hat\beta$ exists and is unique.
The MLE estimator $\hat\beta$ can be used to predict $y$ by the linear combination of the variables:
$$ \hat\pi = \frac{1}{1 + e^{-(\hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_p x_p)}}. \quad (16) $$
In particular, we have fitted values $\hat\pi_i = 1/(1 + e^{-X_i\hat\beta})$, $i = 1, \dots, n$.
3.1. Effects on MLE Estimator and Predictions
Theorem 2.
For logistic regression, if the MLE estimator of $\beta$ is $\hat\beta$, then the MLE estimator after a generalized linear transformation $X^t = XC$ is $C^{-1}\hat\beta$, assuming $C$ is nonsingular. Moreover, the generalized linear transformation does not affect predictions.
Proof.
Since $\hat\beta$ is the maximum likelihood estimator of $\beta$, (14) is satisfied by $\hat\beta$. Multiplying both sides of (14) by $C'$, we obtain
$$ C'X'(y - \pi) = 0. \quad (17) $$
Clearly, (17) can be rewritten as
$$ (XC)'(y - \pi) = 0. \quad (18) $$
Writing $X_i\hat\beta$ as $(X_iC)(C^{-1}\hat\beta)$ for $i = 1, \dots, n$, we have
$$ \pi_i = \frac{1}{1 + e^{-(X_iC)(C^{-1}\hat\beta)}}, \quad i = 1, \dots, n. \quad (19) $$
It follows from (18) and (19) that $C^{-1}\hat\beta$ satisfies (14) for the new design matrix $X^t = XC$. Hence, $C^{-1}\hat\beta$ is the new MLE estimator after the generalized linear transformation $X^t = XC$.
Let us now predict $y$ for a set of values of the variables in the new system after the generalized linear transformation $X^t = XC$, using the new MLE estimator $C^{-1}\hat\beta$. Let $v_1, \dots, v_p$ be specific values of $x_1, \dots, x_p$, respectively. Then the row vector $V = (1, v_1, \dots, v_p)$ in the original system becomes $VC$ in the new system. By (16), the predicted conditional probability of $y = 1$ when $(x_1, \dots, x_p) = (v_1, \dots, v_p)$ in the new system is
$$ \frac{1}{1 + e^{-(VC)(C^{-1}\hat\beta)}} = \frac{1}{1 + e^{-V\hat\beta}}. \quad (20) $$
The right-hand side of (20) is the predicted conditional probability of $y = 1$ when $(x_1, \dots, x_p) = (v_1, \dots, v_p)$ in the original system. □
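Theorem 2 can be checked numerically. The Python sketch below (simulated data; fit_logistic is an illustrative helper, not from the paper) fits the model before and after a transformation X → XC and confirms that the new MLE is C⁻¹β̂ and that the linear predictors, hence the predictions, coincide:

```python
import numpy as np

def fit_logistic(X, y, iters=50):
    # Illustrative Newton-Raphson MLE solver following (15)
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        pi = 1 / (1 + np.exp(-X @ beta))
        W = np.diag(pi * (1 - pi))
        beta = beta + np.linalg.solve(X.T @ W @ X, X.T @ (y - pi))
    return beta

rng = np.random.default_rng(2)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ np.array([0.3, 1.0, -0.5])))).astype(float)

# An invertible generalized linear transformation X -> XC (Definition 3):
# the first column of C is (1, 0, 0)' so XC is again a design matrix.
C = np.array([[1.0, 2.0, -1.0],
              [0.0, 1.5,  0.5],
              [0.0, 0.3,  2.0]])

beta_orig = fit_logistic(X, y)
beta_new = fit_logistic(X @ C, y)

# Theorem 2: the new MLE is C^{-1} beta_orig, and predictions are unchanged
assert np.allclose(beta_new, np.linalg.solve(C, beta_orig), atol=1e-6)
assert np.allclose(X @ beta_orig, (X @ C) @ beta_new, atol=1e-6)
```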
3.2. Effects on Multicollinearity
Perfect multicollinearity or complete multicollinearity or multicollinearity, in short, refers to a situation in logistic regression in which two or more independent variables are linearly related [17]. In particular, if two independent variables are linearly related, then it is called collinearity.
Mathematically, multicollinearity means there exist constants $c_0, c_1, \dots, c_p$ such that
$$ c_1x_1 + c_2x_2 + \cdots + c_px_p = c_0, \quad (21) $$
where at least two of $c_1, \dots, c_p$ are nonzero. If we treat the constant variable $x_0 = 1$ as an independent variable, then we just require at least one of $c_1, \dots, c_p$ to be nonzero.
Multicollinearity is a common issue in logistic regression. If there is multicollinearity, the design matrix $X$ will not have full column rank $p + 1$. Hence, the matrix $X'WX$ in (15) will have rank less than $p + 1$. Thus, the inverse matrix $(X'WX)^{-1}$ in (15) does not exist, which makes the iteration in (15) impossible.
If there is near multicollinearity and there is no separation of the data points, theoretically $X'WX$ in (15) has an inverse and the iteration in (15) can proceed. Yet, iteration (15) may fail to compute an accurate inverse and hence may produce unstable estimates and inaccurate variances [18].
Some authors define multicollinearity in logistic regression to be a high correlation between independent variables [19,20,21]. Let us call multicollinearity with high correlation near multicollinearity, and reserve the term multicollinearity for perfect or complete multicollinearity.
Let us define the VIF now. Let $R_j^2$ be the R-squared that results when $x_j$ is linearly regressed against the other independent variables. Then the VIF for $x_j$ is defined as
$$ \mathrm{VIF}_j = \frac{1}{1 - R_j^2}. \quad (22) $$
Near multicollinearity can be detected by using the VIF [2]. The larger the VIF of an independent variable, the larger the correlation between this independent variable and the others. However, there is no standard for acceptable levels of VIF. Multicollinearity can be combated by a generalized cross-validation (GCV) criterion in partially linear regression models [22,23].
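The definition in (22) can be illustrated directly. In the Python sketch below (simulated data; the helper vif is ours, not the function of the same name in R's car package), each VIF is computed from the R-squared of the auxiliary regression, and it is also checked that simple linear transformations of the columns leave the VIFs unchanged:

```python
import numpy as np

def vif(X):
    # VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the others
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
        resid = X[:, j] - others @ coef
        r2 = 1 - resid @ resid / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out[j] = 1 / (1 - r2)
    return out

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 2] + 0.9 * X[:, 0]   # correlated columns inflate the VIF

v = vif(X)
assert v[0] > 1 and v[2] > 1

# Multiple simple linear transformations x_j -> a_j x_j + b_j leave each VIF unchanged
Xt = X * np.array([2.0, 0.5, 3.0]) + np.array([1.0, -2.0, 4.0])
assert np.allclose(vif(Xt), v)
```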
3.2.1. Preliminary Results in Linear Regression
As the VIF is related to linear regression, let us briefly introduce some preliminary results in linear regression. As for logistic regression, we consider independent variables $x_1, \dots, x_p$. Unlike logistic regression, the dependent variable $y$ in linear regression is a continuous variable. We shall adopt the same notation as in logistic regression unless otherwise specified. In particular, $X$ is the design matrix.
In linear regression, the relationship between $y$ and $x_1, \dots, x_p$ is formulated as a linear combination
$$ y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon, \quad (23) $$
where $\varepsilon$ is a random error, or in matrix notation
$$ y = X\beta + \varepsilon. \quad (24) $$
The ordinary least squares (OLS) estimator $\hat\beta$ of $\beta$ satisfies [2]
$$ X'X\hat\beta = X'y. \quad (25) $$
Assuming the $(p+1)$-dimensional square matrix $X'X$ is nonsingular, the OLS estimator is unique and can be written explicitly as
$$ \hat\beta = (X'X)^{-1}X'y. \quad (26) $$
The OLS estimator can be used to predict $y$ by the linear combination of the variables as follows:
$$ \hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_p x_p. \quad (27) $$
Like Gelman and Hill [1] and Chatterjee and Hadi [2], we will call a predicted value a fitted value if the values of $x_1, \dots, x_p$ come from one of the $n$ observations. So, we have fitted values $\hat{y}_i = X_i\hat\beta$, $i = 1, \dots, n$.
Therefore, the $n$-dimensional column vector $\hat{y} = (\hat{y}_1, \dots, \hat{y}_n)'$ of the fitted values can be expressed as
$$ \hat{y} = X\hat\beta = X(X'X)^{-1}X'y. \quad (28) $$
It is easy to show that the OLS estimator is $C^{-1}\hat\beta$ after an invertible generalized linear transformation $X^t = XC$. Moreover, the generalized linear transformation does not affect predictions. Indeed, let us now predict $y$ for a set of values $v_1, \dots, v_p$ of the variables $x_1, \dots, x_p$, which could be any set of values, not necessarily from one of the $n$ observations. We first transform the row vector $V = (1, v_1, \dots, v_p)$ of values into $VC$. Next, we apply (27) and obtain $(VC)(C^{-1}\hat\beta) = V\hat\beta$, which is the predicted value of the original model.
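A quick numeric check of this claim, with simulated data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.1, size=n)

# Invertible C with first column (1, 0, 0)', as required for a design matrix
C = np.array([[1.0, 1.0, 2.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.5, 3.0]])

beta = np.linalg.lstsq(X, y, rcond=None)[0]
beta_new = np.linalg.lstsq(X @ C, y, rcond=None)[0]

assert np.allclose(beta_new, np.linalg.solve(C, beta))   # OLS estimator becomes C^{-1} beta
assert np.allclose(X @ beta, (X @ C) @ beta_new)         # fitted values unchanged
```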
In linear regression, the coefficient of determination, denoted by $R^2$ and also called R-squared, is given by Chatterjee and Hadi [2] as
$$ R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}, \quad (29) $$
where $\bar{y}$ is the mean of the dependent variable $y$, that is, $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, and $\hat{y}_i$ is the fitted value of the $i$-th observation.
The coefficient of determination can be related to the square of the correlation between $y$ and $\hat{y}$ as follows [2]:
$$ R^2 = [\mathrm{Corr}(y, \hat{y})]^2, \quad (30) $$
where
$$ \mathrm{Corr}(y, \hat{y}) = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2}}. \quad (31) $$
Theorem 3.
$R^2$ in linear regression is invariant under invertible generalized linear transformations.
Proof.
Expressing the sum $\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$ in the numerator of the 2nd equation in (29) in matrix form and applying (26), we obtain
$$ \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 = \hat{y}'\hat{y} - n\bar{y}^2 = y'X(X'X)^{-1}X'y - n\bar{y}^2. \quad (32) $$
Substituting (32) into (29) yields
$$ R^2 = \frac{y'X(X'X)^{-1}X'y - n\bar{y}^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}. \quad (33) $$
Now let $X^t = XC$ be an invertible generalized linear transformation. Then the OLS estimator after the transformation becomes $C^{-1}\hat\beta$. In this case, the numerator in (33) becomes
$$ y'XC(C'X'XC)^{-1}C'X'y - n\bar{y}^2 = y'X(X'X)^{-1}X'y - n\bar{y}^2, $$
which returns to $R^2$ in (29) before the generalized linear transformation. □
3.2.2. Effects on Logistic Regression
In Definitions 2 and 3, we defined a generalized linear transformation only for independent variables. Since an independent variable is used as the dependent variable in order to find its VIF, we consider a simple linear transformation for the dependent variable in the following result.
Lemma 1.
Consider a linear regression with $y$ as the dependent variable and $x_1, \dots, x_p$ as the independent variables. If we make a simple linear transformation on $y$ such as $y^t = ay + b$ and a generalized linear transformation $X^t = XC$ on the independent variables with nonsingular $C$, then $C^{-1}(a\hat\beta + be')$ is the OLS estimator of the new linear regression after the transformations, where $XC$ is the design matrix, $e = (1, 0, \dots, 0)$ is a $(p+1)$-dimensional row vector and $\hat\beta$ is the OLS estimator of the original linear regression.
Proof.
Since the new linear regression has design matrix $XC$ and the dependent variable can be expressed as $y^t = ay + bXe'$, where $e = (1, 0, \dots, 0)$ is a $(p+1)$-dimensional row vector (so that $Xe'$ is the column vector of all 1's), it is sufficient to show that $C^{-1}(a\hat\beta + be')$ satisfies
$$ (XC)'(XC)\,C^{-1}(a\hat\beta + be') = (XC)'y^t. \quad (34) $$
Substituting $C^{-1}(a\hat\beta + be')$ into the left-hand side of (34) and replacing $X'X\hat\beta$ with $X'y$, we obtain
$$ C'X'X(a\hat\beta + be') = C'X'(ay + bXe') = (XC)'y^t, $$
which is the right-hand side of (34). □
Theorem 4.
VIF for each independent variable is invariant under multiple linear transformations in logistic regression.
Proof.
Without loss of generality, we assume multiple simple linear transformations for the first $m$ independent variables, $x_j^t = a_jx_j + b_j$ for $1 \le j \le m$, where $m \le p$. To find the VIFs, we do a linear regression for each $x_j^t$, $1 \le j \le m$, making $x_j^t$ the dependent variable and the remaining variables the independent variables. Similarly, we do a linear regression for each $x_j$, $m < j \le p$, making $x_j$ the dependent variable and the remaining variables the independent variables. We only prove the invariance of the VIF for $x_1$ and of the VIF for $x_p$, as the invariance of the VIF for $x_2, \dots, x_m$ can be proved similarly to $x_1$ and the invariance of the VIF for $x_{m+1}, \dots, x_{p-1}$ can be proved similarly to $x_p$.
To find the VIF for $x_1$, we do a linear regression making $x_1^t$ the dependent variable and $x_2^t, \dots, x_m^t, x_{m+1}, \dots, x_p$ the independent variables. In this case, the dependent variable and the independent variables result from a generalized linear transformation, where the design matrix is formed from the independent variables and the transformation matrix is the upper triangular matrix as follows
Since the determinant of this upper triangular matrix is nonzero, the OLS estimator after the multiple linear transformations follows from Lemma 1. By (29), it is sufficient to prove the following identity:
Since the denominator of the left-hand side of (35) is
It is sufficient to show that
Expressing the left-hand side of (36) as the multiplication of vectors
where is the -dimensional column vector with all elements of and is the -dimension vector of fitted values for
Applying (28) for the vector of fitted values and design matrix and applying Lemma 1, we obtain
Hence, and so (37) becomes
which is the right-hand side of (36).
To find the VIF for $x_p$, we do a linear regression making $x_p$ the dependent variable and $x_1^t, \dots, x_m^t, x_{m+1}, \dots, x_{p-1}$ the independent variables. In this case, the independent variables result from a generalized linear transformation, where the design matrix is formed from the independent variables and the transformation matrix is the upper triangular matrix as follows
Since the determinant of this upper triangular matrix is nonzero, by Theorem 3 the VIF for $x_p$ after the generalized linear transformation is the same as the VIF for $x_p$ prior to the generalized linear transformation. □
Remark 1.
VIFs are not necessarily invariant under an invertible generalized linear transformation $X^t = XC$. For instance, let $x_1^t = x_2$ and $x_2^t = x_1$ and keep $x_3, \dots, x_p$ unchanged. Then $x_1^t, \dots, x_p^t$ result from the generalized linear transformation $X^t = XC$, with $C$ the permutation matrix that swaps the columns corresponding to $x_1$ and $x_2$.
Since the determinant of $C$ is −1, $C$ is nonsingular. However, the VIF for $x_1^t$ after the generalized linear transformation equals the VIF for $x_2$ prior to the generalized linear transformation, and the VIFs for $x_1$ and $x_2$ are unequal in general.
The following result is immediate.
Theorem 5.
Multicollinearity exists in logistic regression if, and only if, it exists after an invertible generalized linear transformation.
Remark 2.
All the results about multicollinearity and VIF also apply to machine learning algorithms in which multicollinearity is applicable such as linear regression.
3.3. Effects on Linear Separation
Albert and Anderson [16] first assumed the design matrix $X$ to have full column rank, that is, no multicollinearity. They then introduced the concepts of separation (including complete separation and quasi-complete separation) and overlap in logistic regression with intercept. They showed that separation leads to nonexistence of a (finite) MLE and that overlap leads to a finite and unique MLE. Therefore, like multicollinearity, separation is a common issue in logistic regression.
Definition 4.
There is a complete separation of data points if there exists a vector $b$ that correctly allocates all observations to their response groups; that is,
$$ X_ib > 0 \text{ if } y_i = 1 \quad \text{and} \quad X_ib < 0 \text{ if } y_i = 0, \quad i = 1, \dots, n. \quad (38) $$
Definition 5.
There is quasi-complete separation if the data are not completely separable, but there exists a vector $b$ such that
$$ X_ib \ge 0 \text{ if } y_i = 1 \quad \text{and} \quad X_ib \le 0 \text{ if } y_i = 0, \quad i = 1, \dots, n, \quad (39) $$
and equality holds for at least one subject in each response group.
Definition 6.
If neither a complete nor a quasi-complete separation exists, then the data is said to have overlap.
Theorem 6.
An invertible generalized linear transformation does not affect the data configuration of logistic regression.
Proof.
We consider three cases.
Case 1. There is a complete separation of data points in the original system. Then (38) holds for some vector $b$. The $i$-th row of the design matrix is $X_iC$ for $i = 1, \dots, n$ after the invertible generalized linear transformation $X^t = XC$. Let $b^t = C^{-1}b$; then $b^t$ is a constant column vector of dimension $(p + 1)$. Since $(X_iC)b^t = X_ib$, (38) holds after the generalized linear transformation. Therefore, there is also a complete separation of data points after the generalized linear transformation $X^t = XC$.
Case 2. There is a quasi-complete separation of data points in the original system. It can be proved similarly to Case 1.
Case 3. The original data points have overlap. Then the new data points after the generalized linear transformation also have overlap. We prove this by contradiction. Assume otherwise that the new data points after the generalized linear transformation do not have overlap. Then there is either a complete separation or a quasi-complete separation of data points. Let us first assume there is a complete separation of data points after the generalized linear transformation $X^t = XC$. Then there is a vector $b^t$ such that (38) holds. Row $i$ of the design matrix after the generalized linear transformation is $X_iC$ for $i = 1, \dots, n$. Let $b = Cb^t$; then (38) holds with $b$ in the original system, which is a contradiction. Next, let us assume there is a quasi-complete separation after the generalized linear transformation $X^t = XC$. It can be proved similarly. □
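Case 1 can also be illustrated numerically. In the Python sketch below (toy data, illustrative only), a vector b completely separates the original data, and C⁻¹b separates the transformed data XC exactly as in the proof:

```python
import numpy as np

# Toy design matrix (intercept column first) and binary responses
X = np.array([[1.0, -2.0, 0.5],
              [1.0, -1.0, 1.0],
              [1.0,  2.0, -0.5],
              [1.0,  3.0, 0.0]])
y = np.array([0, 0, 1, 1])

b = np.array([0.0, 1.0, 0.0])   # X b > 0 iff y = 1: complete separation (38)
assert np.all((X @ b > 0) == (y == 1))

# Invertible generalized linear transformation (first column of C is (1, 0, 0)')
C = np.array([[1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0],
              [0.0, 1.0, 4.0]])
b_new = np.linalg.solve(C, b)   # b^t = C^{-1} b

# Since (X C) b^t = X b, the same separation holds after the transformation
assert np.all(((X @ C) @ b_new > 0) == (y == 1))
```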
4. Numeric Examples
In this section, we use real data, the well-known German Credit Data from a German bank, to validate our theoretical results. The German Credit Data can be found in the UCI Machine Learning Repository [24]. The original dataset is in the file "german.data", which contains categorical/symbolic attributes. It has 1000 observations representing 1000 loan applicants. The statistical software R (version 3.4.2) and RStudio are employed for our analyses. Since there are only 1000 records, we do not split them into training and test sets. We read german.data using R's read_table function, call the result german_credit_raw, and use the colnames function to rename the columns.
There are 21 variables or attributes in german_credit_raw, including the following 8 numerical ones, which are denoted by $x_1, x_2, \dots, x_8$, respectively:
- duration: Duration in months;
- credit_amount: Credit amount;
- installment_rate: Installment rate in percentage of disposable income;
- current_address_length: Present residence since;
- age: Age in years;
- num_credits: Number of existing credits at this bank;
- num_dependents: Number of people being liable to provide maintenance for;
- credit_status: Credit status: 1 for good loans and 2 for bad loans.
Let us define a new variable called default as default = credit_status - 1. With the new variable default, 0 is for good loans and 1 is for bad loans. Since it is not easy to interpret categorical variables, we will only consider the numerical variables.
4.1. Validation of Invariance of Separation
Let us first build a logistic regression model logit_model_1 using all 8 numerical variables and the glm function in R. In the following, we italicize statements in R, use ">" for the R prompt and set outputs from R in bold.
> logit_model_1 <- glm(default ~ duration + credit_amount + installment_rate + current_address_length + age + num_credits + num_dependents + credit_status, data = german_credit_raw, family = "binomial")
Warning message:
glm.fit: algorithm did not converge
We see a warning message as above. It indicates a separation in the data. Indeed, this separation comes from the variable credit_status: since default is completely determined by credit_status, (38) holds. By Definition 4, there is a complete separation of data points.
Now let us make a generalized linear transformation. We randomly generate the matrix $C_{11}$, shown in Table 1, and the 8-dimensional row vector $C_1$ in (3) by calling the R function runif, which generates random values from a uniform distribution, by default between 0 and 1. We set a seed for reproducibility. We denote $C_{11}$ and $C_1$ by C_11 and C_1 in R, respectively. We call R's function det to calculate the determinant of $C_{11}$.
Table 1.
Matrix C_11.
> set.seed(1)
> C_11 <- matrix(runif(64),nrow = 8)
We use R function det to find the determinant of C_11 to be 0.01433565.
Vector $C_1$ is generated as follows:
> set.seed(10)
> C_1 = runif(n = 8, min = 1, max = 20)
[1] 10.642086 6.828602 9.111246 14.168940 2.617583 5.283296
6.216080 6.173796
Since $C_{11}$ is nonsingular, so is $C$ by (9). Now we transform $X$ into $X^t = XC$ as in (6) and denote the transformed variables accordingly in R.
Let us build a logistic regression model logit_model_2 for the eight transformed variables.
We also see the warning message as for the eight original variables. Therefore, after a nonsingular generalized linear transformation, the separation in data remains.
4.2. Validation of MLE
Let us drop credit_status and rebuild a logistic regression model called logit_model_3. The main output is shown in Table 2.
Table 2.
Coefficients and statistics for model 3.
The output indicates that, with credit_status dropped, the data has overlap; we will see below that the overlap remains after a transformation, which, together with Section 4.1, validates Theorem 6.
We see variables current_address_length, num_credits and num_dependents are not significant at the 0.05 level. Since we are not focused on building a model, let us still keep these variables. Let us extract the coefficients and put them in a vector called model_coef_3 as follows:
> model_coef_3 <- data.frame(coef(logit_model_3))
> model_coef_3 <- as.matrix(model_coef_3)
Next, let us make a generalized linear transformation. We use the letter $D$ rather than $C$ to distinguish this case from that in Section 4.1. We randomly generate the matrix $D_{11}$ and the 7-dimensional row vector $D_1$ in (3) by calling the R function runif. Again, we denote $D_{11}$, shown in Table 3, and $D_1$ by D_11 and D_1 in R, respectively.
Table 3.
Matrix D_11.
> set.seed(2)
> D_11 <- matrix(runif(49),nrow = 7)
> det(D_11)
[1] 0.2851758
> set.seed(20)
> D_1 = runif(n = 7, min = 1, max = 20)
[1] 17.672906 15.602131 6.300300 11.054110 19.295234 19.626737
2.735319
Since the determinant of $D_{11}$ is nonzero, $D$ is nonsingular by (9); $D$ is shown in Table 4:
Table 4.
Matrix D.
We use the R function solve to find its inverse, called inv_D in R (see Table 5).
Table 5.
Inverse matrix of D.
Now we transform $X$ into $X^t = XD$ as in (6) and denote the transformed variables accordingly in R. Let us build a logistic regression model for the seven transformed variables and call it logit_model_4. The main output is shown in Table 6:
Table 6.
Coefficients and statistics of model 4.
Let us extract the coefficients into model_coef_4 to get more digits, as shown in Table 7:
Table 7.
Coefficients of model 4.
> model_coef_4 <- data.frame(coef(logit_model_4))
Let us find the product of inv_D and the vector model_coef_3 in R as follows:
> inv_D%*%model_coef_3
The result of the product is shown in Table 8 below.
Table 8.
Product of inverse matrix of D and coefficients of model 3.
This is exactly the same as model_coef_4. Next, we calculate the predicted values for all 1000 records using both models logit_model_3 and logit_model_4 by calling the R function predict, and then use the all.equal utility to check that the two predictions are nearly equal:
> model_3_predictions = predict(logit_model_3, german_credit_raw, type="response")
> model_4_predictions = predict(logit_model_4, german_credit_raw, type="response")
> all.equal(model_3_predictions, model_4_predictions, tolerance = 1e-13)
[1] "Mean relative difference: 0.0000000000005060054"
We see that the two predictions are identical up to rounding errors. Thus, we have validated Theorem 2.
Note that a nonlinear transformation, even a one-to-one correspondence, does not have the properties in Theorem 2, even for a single variable. For instance, let us define a one-to-one nonlinear correspondence age_6 of the variable age. Let us build a univariate logistic regression model called logit_model_5 for age and a univariate logistic regression model called logit_model_6 for age_6. Next, we apply these two models to predict the values for german_credit_raw.
> model_5_predictions = predict(logit_model_5, german_credit_raw, type="response")
> model_6_predictions = predict(logit_model_6, german_credit_raw, type="response")
> all.equal(model_5_predictions, model_6_predictions, scale=1)
[1] “Mean absolute difference: 0.008512868”
We see that the predictions from logit_model_5 are in general different from the predictions from logit_model_6.
4.3. Validation of Invariance of VIF
For the logistic regression model logit_model_3 in Section 4.2, we use the vif function in the car package of R to find the VIF for all 7 variables. The result is shown in Table 9.
Table 9.
VIF for model 3.
> car::vif(logit_model_3)
Next, we randomly generate multiple simple linear transformations as follows:
> set.seed(30)
> A = runif(n = 7)
> set.seed(40)
> B = runif(n = 7, min = 1, max = 10)
> german_credit_raw$duration_7 = A[1] * german_credit_raw$duration + B[1]
> german_credit_raw$credit_amount_7 = A[2] * german_credit_raw$credit_amount + B[2]
…
> german_credit_raw$num_dependents_7 = A[7] * german_credit_raw$num_dependents + B[7]
We build a logistic regression for the variables after multiple simple linear transformations and call it logit_model_7. We then find VIF as follows and display the result in Table 10
Table 10.
VIF for model 7.
> car::vif(logit_model_7)
Hence, we have validated Theorem 4. There is no need to validate Theorem 5 (the invariance of multicollinearity) as its analytical proof is straightforward.
5. Conclusions
In this paper, we first generalized a linear transformation for a single variable to multiple variables by matrix multiplication. We then studied various effects of a generalized linear transformation in logistic regression. We showed that an invertible generalized linear transformation has no effect on predictions, multicollinearity, quasi-complete separation, and complete separation. We also showed that multiple linear transformations do not affect the variance inflation factor (VIF). Numeric examples with real data were presented to validate our theoretical results.
Author Contributions
Writing—original draft, G.Z.; Writing—review & editing, S.T. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/, accessed on 6 November 2022.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Gelman, A.; Hill, J. Data Analysis Using Regression and Multilevel/Hierarchical Models; Cambridge University Press: Cambridge, UK, 2006.
2. Chatterjee, S.; Hadi, A.S. Regression Analysis by Example, 5th ed.; John Wiley & Sons: New York, NY, USA, 2013.
3. Adeyemo, A.; Wimmer, H.; Powell, L.M. Effects of normalization techniques on logistic regression in data science. J. Inf. Syst. Appl. Res. 2019, 12, 37–44.
4. Box, G.E.P.; Tidwell, P.W. Transformation of the Independent Variables. Technometrics 1962, 4, 531–550.
5. Whittemore, A.S. Transformations to Linearity in Binary Regression. SIAM J. Appl. Math. 1983, 43, 703–710.
6. Kay, R.; Little, S. Transformations of the explanatory variables in the logistic regression model for binary data. Biometrika 1987, 74, 495–501.
7. Feng, C.; Wang, H.; Lu, N.; Chen, T.; He, H.; Lu, Y.; Tu, X.M. Log-transformation and its implications for data analysis. Shanghai Arch. Psychiatry 2014, 26, 105–109.
8. Zhang, M.; Chen, S.; Rain, S.C. Evaluating Continuous Variable Transformations in Logistic Regression. In Proceedings of the Midwest SAS User Group Conference 2015, Omaha, NE, USA, 18–20 October 2015.
9. Lee, D.K. Data transformation: A focus on the interpretation. Korean J. Anesthesiol. 2020, 73, 503–508.
10. Morrell, C.H.; Pearson, J.D.; Brant, L.J. Linear Transformations of Linear Mixed-Effects Models. Am. Stat. 1997, 51, 338–343.
11. Zeng, G. Invariant Properties of Logistic Regression Model in Credit Scoring under Monotonic Transformations. Commun. Stat. Theory Methods 2017, 46, 8791–8807.
12. Zeng, G. On the analytical properties of category encodings in logistic regression. Commun. Stat. Theory Methods 2021, advance online publication.
13. Hosmer, D.W.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression, 3rd ed.; John Wiley & Sons, Inc.: New York, NY, USA, 2013.
14. Zeng, G. On the Existence of an Analytical Solution in Multiple Logistic Regression. Int. J. Appl. Math. Stat. 2021, 60, 53–67.
15. Refaat, M. Credit Risk Scorecards: Development and Implementation Using SAS; Lulu.com: Raleigh, NC, USA, 2011.
- Albert, A.; Anderson, J.A. On the existence of maximum likelihood estimates in logistic regression models. Biometrika 1984, 71, 1–10. [Google Scholar] [CrossRef]
- Zeng, G.; Zeng, E. On the Relationship between Multicollinearity and Separation in Logistic Regression. Commun. Stat. Simul. Comput. 2021, 50, 1989–1997. [Google Scholar] [CrossRef]
- Shen, L.; Gao, Y.; Xiao, J. Simulation of Hydrogen Production from Biomass Gasification in Interconnected Fluidized Beds. Biomass Bioenergy 2008, 32, 120–127. [Google Scholar] [CrossRef]
- Vatcheva, K.P.; Lee, M.; McCormick, J.B.; Rahbar, M.H. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies. Epidemiology 2016, 6, 227. [Google Scholar] [CrossRef] [PubMed]
- Cincotta, K. Multicollinearity in Zero Intercept Regression: They Are Not Who We Thought They Were. In Proceedings of the Society of Cost Estimating and Analysis (SCEA) Conference, Albuquerque, NM, USA, 6–10 June 2011. [Google Scholar]
- Dohoo, I.R.; Ducrot, C.; Fourichon, C.; Donald, A.; Hurnik, D. An overview of techniques for dealing with large numbers of independent variables in epidemiologic studies. Prev. Vet. Med. 1997, 29, 221–239. [Google Scholar] [CrossRef] [PubMed]
- Amini, M.; Roozbeh, M. Optimal partial ridge estimation in restricted semiparametric regression models. J. Multivar. Anal. 2015, 136, 26–40. [Google Scholar] [CrossRef]
- Roozbeh, M. Optimal QR-based estimation in partially linear regression models with correlated errors using GCV criterion. Comput. Stat. Data Anal. 2018, 117, 45–61. [Google Scholar] [CrossRef]
- Lichman, M. UCI Machine Learning Repository; University of California, School of Information and Computer Science: Irvine, CA, USA, 2013; Available online: http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/ (accessed on 16 December 2022).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).