# Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review


## Abstract


## 1. Introduction

## 2. What Is Multicollinearity?

#### 2.1. Effects of Multicollinearity

#### 2.2. Ways to Measure Multicollinearity
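Among the measures covered here, the variance inflation factor (VIF) is the one the later optimization methods lean on directly (e.g., Tamura et al. [35] use it as their indicator). A minimal NumPy sketch, not taken from any reviewed paper (the `vif` helper is illustrative): VIF_j = 1/(1 − R_j²), where R_j² comes from regressing column j on the remaining columns.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features):
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on all remaining columns (with an intercept)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        yj = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, yj, rcond=None)
        r2 = 1.0 - ((yj - Z @ beta) ** 2).sum() / ((yj - yj.mean()) ** 2).sum()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                   # unrelated to x1
x3 = x1 + rng.normal(scale=0.05, size=200)  # nearly collinear with x1
X = np.column_stack([x1, x2, x3])
print(vif(X))  # VIFs for x1 and x3 far exceed 10; x2 stays near 1
```

A VIF above 10 is the conventional rule-of-thumb threshold for serious multicollinearity.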

## 3. Reducing the Effects of Multicollinearity

#### 3.1. Variable Selection Methods

$$Q(Z_{I})\,y = Q(Z_{I})X\beta + Q(Z_{I})\varepsilon,$$

#### 3.2. Modified Estimators Methods

#### 3.3. Machine Learning Methods

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Conflicts of Interest

## References

1. Schroeder, M.A.; Lander, J.; Levine-Silverman, S. Diagnosing and dealing with multicollinearity. West. J. Nurs. Res. **1990**, 12, 175–187.
2. Algamal, Z.Y. Biased estimators in Poisson regression model in the presence of multicollinearity: A subject review. Al-Qadisiyah J. Adm. Econ. Sci. **2018**, 20, 37–43.
3. Bollinger, J. Using Bollinger bands. Stock. Commod. **1992**, 10, 47–51.
4. Iba, H.; Sasaki, T. Using genetic programming to predict financial data. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), Washington, DC, USA, 6–9 July 1999; pp. 244–251.
5. Lafi, S.; Kaneene, J. An explanation of the use of principal-components analysis to detect and correct for multicollinearity. Prev. Vet. Med. **1992**, 13, 261–275.
6. Alin, A. Multicollinearity. Wiley Interdiscip. Rev. Comput. Stat. **2010**, 2, 370–374.
7. Mason, C.H.; Perreault, W.D., Jr. Collinearity, power, and interpretation of multiple regression analysis. J. Mark. Res. **1991**, 28, 268–280.
8. Neter, J.; Kutner, M.H.; Nachtsheim, C.J.; Wasserman, W. Applied Linear Statistical Models; WCB McGraw-Hill: New York, NY, USA, 1996.
9. Weisberg, S. Applied Linear Regression; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 528.
10. Belsley, D.A.; Kuh, E.; Welsch, R.E. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity; John Wiley & Sons: Hoboken, NJ, USA, 2005.
11. Tamura, R.; Kobayashi, K.; Takano, Y.; Miyashiro, R.; Nakata, K.; Matsui, T. Best subset selection for eliminating multicollinearity. J. Oper. Res. Soc. Jpn. **2017**, 60, 321–336.
12. Askin, R.G. Multicollinearity in regression: Review and examples. J. Forecast. **1982**, 1, 281–292.
13. Ralston, A.; Wilf, H.S. Mathematical Methods for Digital Computers; Wiley: New York, NY, USA, 1960.
14. Hamaker, H. On multiple regression analysis. Stat. Neerl. **1962**, 16, 31–56.
15. Hocking, R.R.; Leslie, R. Selection of the best subset in regression analysis. Technometrics **1967**, 9, 531–540.
16. Gorman, J.W.; Toman, R. Selection of variables for fitting equations to data. Technometrics **1966**, 8, 27–51.
17. Mallows, C. Choosing Variables in a Linear Regression: A Graphical Aid; Central Regional Meeting of the Institute of Mathematical Statistics: Manhattan, KS, USA, 1964.
18. Kashid, D.; Kulkarni, S. A more general criterion for subset selection in multiple linear regression. Commun. Stat.-Theory Methods **2002**, 31, 795–811.
19. Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2021.
20. Vrieze, S.I. Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychol. Methods **2012**, 17, 228.
21. Misra, P.; Yadav, A.S. Improving the classification accuracy using recursive feature elimination with cross-validation. Int. J. Emerg. Technol. **2020**, 11, 659–665.
22. Wold, H. Soft modeling: The basic design and some extensions. Syst. Under Indirect Obs. **1982**, 2, 343.
23. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. **2001**, 58, 109–130.
24. Chong, I.-G.; Jun, C.-H. Performance of some variable selection methods when multicollinearity is present. Chemom. Intell. Lab. Syst. **2005**, 78, 103–112.
25. Maitra, S.; Yan, J. Principle component analysis and partial least squares: Two dimension reduction techniques for regression. Appl. Multivar. Stat. Models **2008**, 79, 79–90.
26. Onur, T. A comparative study on regression methods in the presence of multicollinearity. İstatistikçiler Derg. İstatistik Ve Aktüerya **2016**, 9, 47–53.
27. Li, C.; Wang, H.; Wang, J.; Tai, Y.; Yang, F. Multicollinearity problem of CPM communication signals and its suppression method with PLS algorithm. In Proceedings of the Thirteenth ACM International Conference on Underwater Networks & Systems, Shenzhen, China, 3–5 December 2018; pp. 1–5.
28. Willis, M.; Hiden, H.; Hinchliffe, M.; McKay, B.; Barton, G.W. Systems modelling using genetic programming. Comput. Chem. Eng. **1997**, 21, S1161–S1166.
29. Castillo, F.A.; Villa, C.M. Symbolic regression in multicollinearity problems. In Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation, Washington, DC, USA, 25–29 June 2005; pp. 2207–2208.
30. Bies, R.R.; Muldoon, M.F.; Pollock, B.G.; Manuck, S.; Smith, G.; Sale, M.E. A genetic algorithm-based, hybrid machine learning approach to model selection. J. Pharmacokinet. Pharmacodyn. **2006**, 33, 195.
31. Katrutsa, A.; Strijov, V. Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria. Expert Syst. Appl. **2017**, 76, 1–11.
32. Hall, M.A. Correlation-Based Feature Selection for Machine Learning; The University of Waikato: Hamilton, New Zealand, 1999.
33. Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. **2005**, 27, 1226–1238.
34. Senawi, A.; Wei, H.-L.; Billings, S.A. A new maximum relevance-minimum multicollinearity (MRmMC) method for feature selection and ranking. Pattern Recognit. **2017**, 67, 47–61.
35. Tamura, R.; Kobayashi, K.; Takano, Y.; Miyashiro, R.; Nakata, K.; Matsui, T. Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor. J. Glob. Optim. **2019**, 73, 431–446.
36. Zhao, N.; Xu, Q.; Tang, M.L.; Jiang, B.; Chen, Z.; Wang, H. High-dimensional variable screening under multicollinearity. Stat **2020**, 9, e272.
37. Fan, J.; Lv, J. Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B **2008**, 70, 849–911.
38. Chen, C.W.; Tsai, Y.H.; Chang, F.R.; Lin, W.C. Ensemble feature selection in medical datasets: Combining filter, wrapper, and embedded feature selection results. Expert Syst. **2020**, 37, e12553.
39. Larabi-Marie-Sainte, S. Outlier detection based feature selection exploiting bio-inspired optimization algorithms. Appl. Sci. **2021**, 11, 6769.
40. Singh, S.G.; Kumar, S.V. Dealing with multicollinearity problem in analysis of side friction characteristics under urban heterogeneous traffic conditions. Arab. J. Sci. Eng. **2021**, 46, 10739–10755.
41. Horel, A. Applications of ridge analysis to regression problems. Chem. Eng. Progress. **1962**, 58, 54–59.
42. Duzan, H.; Shariff, N.S.B.M. Ridge regression for solving the multicollinearity problem: Review of methods and models. J. Appl. Sci. **2015**, 15, 392–404.
43. Assaf, A.G.; Tsionas, M.; Tasiopoulos, A. Diagnosing and correcting the effects of multicollinearity: Bayesian implications of ridge regression. Tour. Manag. **2019**, 71, 1–8.
44. Roozbeh, M.; Arashi, M.; Hamzah, N.A. Generalized cross-validation for simultaneous optimization of tuning parameters in ridge regression. Iran. J. Sci. Technol. Trans. A Sci. **2020**, 44, 473–485.
45. Singh, B.; Chaubey, Y.; Dwivedi, T. An almost unbiased ridge estimator. Sankhyā Indian J. Stat. Ser. B **1986**, 48, 342–346.
46. Kejian, L. A new class of biased estimate in linear regression. Commun. Stat.-Theory Methods **1993**, 22, 393–402.
47. Liu, K. Using Liu-type estimator to combat collinearity. Commun. Stat.-Theory Methods **2003**, 32, 1009–1020.
48. Inan, D.; Erdogan, B.E. Liu-type logistic estimator. Commun. Stat.-Simul. Comput. **2013**, 42, 1578–1586.
49. Huang, J.; Yang, H. A two-parameter estimator in the negative binomial regression model. J. Stat. Comput. Simul. **2014**, 84, 124–134.
50. Türkan, S.; Özel, G. A new modified Jackknifed estimator for the Poisson regression model. J. Appl. Stat. **2016**, 43, 1892–1905.
51. Chandrasekhar, C.; Bagyalakshmi, H.; Srinivasan, M.; Gallo, M. Partial ridge regression under multicollinearity. J. Appl. Stat. **2016**, 43, 2462–2473.
52. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B **1996**, 58, 267–288.
53. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B **2005**, 67, 301–320.
54. Mosier, C.I. Problems and designs of cross-validation 1. Educ. Psychol. Meas. **1951**, 11, 5–11.
55. Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R. Least angle regression. Ann. Stat. **2004**, 32, 407–499.
56. Roozbeh, M.; Babaie-Kafaki, S.; Aminifard, Z. A nonlinear mixed-integer programming approach for variable selection in linear regression model. Commun. Stat.-Simul. Comput. **2021**, 1–12.
57. Roozbeh, M.; Babaie-Kafaki, S.; Aminifard, Z. Improved high-dimensional regression models with matrix approximations applied to the comparative case studies with support vector machines. Optim. Methods Softw. **2022**, 1–18.
58. Nguyen, V.C.; Ng, C.T. Variable selection under multicollinearity using modified log penalty. J. Appl. Stat. **2020**, 47, 201–230.
59. Kibria, B.; Lukman, A.F. A new ridge-type estimator for the linear regression model: Simulations and applications. Scientifica **2020**, 2020, 9758378.
60. Arashi, M.; Norouzirad, M.; Roozbeh, M.; Mamode Khan, N. A high-dimensional counterpart for the ridge estimator in multicollinear situations. Mathematics **2021**, 9, 3057.
61. Qaraad, M.; Amjad, S.; Manhrawy, I.I.; Fathi, H.; Hassan, B.A.; El Kafrawy, P. A hybrid feature selection optimization model for high dimension data classification. IEEE Access **2021**, 9, 42884–42895.
62. Obite, C.; Olewuezi, N.; Ugwuanyim, G.; Bartholomew, D. Multicollinearity effect in regression analysis: A feed forward artificial neural network approach. Asian J. Probab. Stat. **2020**, 6, 22–33.
63. Garg, A.; Tai, K. Comparison of regression analysis, artificial neural network and genetic programming in handling the multicollinearity problem. In Proceedings of the International Conference on Modelling, Identification and Control, Wuhan, China, 24–26 June 2012; pp. 353–358.
64. Kim, J.-M.; Wang, N.; Liu, Y.; Park, K. Residual control chart for binary response with multicollinearity covariates by neural network model. Symmetry **2020**, 12, 381.
65. Huynh, H.T.; Won, Y. Regularized online sequential learning algorithm for single-hidden layer feedforward neural networks. Pattern Recognit. Lett. **2011**, 32, 1930–1935.
66. Ye, Y.; Squartini, S.; Piazza, F. Online sequential extreme learning machine in nonstationary environments. Neurocomputing **2013**, 116, 94–101.
67. Gu, Y.; Liu, J.; Chen, Y.; Jiang, X.; Yu, H. TOSELM: Timeliness online sequential extreme learning machine. Neurocomputing **2014**, 128, 119–127.
68. Guo, L.; Hao, J.-h.; Liu, M. An incremental extreme learning machine for online sequential learning problems. Neurocomputing **2014**, 128, 50–58.
69. Mahadi, M.; Ballal, T.; Moinuddin, M.; Al-Saggaf, U.M. A recursive least-squares with a time-varying regularization parameter. Appl. Sci. **2022**, 12, 2077.
70. Nobrega, J.P.; Oliveira, A.L. A sequential learning method with Kalman filter and extreme learning machine for regression and time series forecasting. Neurocomputing **2019**, 337, 235–250.
71. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE **1998**, 86, 2278–2324.
72. Hoseinzade, E.; Haratizadeh, S. CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst. Appl. **2019**, 129, 273–285.
73. Kim, T.; Kim, H.Y. Forecasting stock prices with a feature fusion LSTM-CNN model using different representations of the same data. PLoS ONE **2019**, 14, e0212320.
74. Elman, J.L. Finding structure in time. Cogn. Sci. **1990**, 14, 179–211.
75. Young, T.; Hazarika, D.; Poria, S.; Cambria, E. Recent trends in deep learning based natural language processing. IEEE Comput. Intell. Mag. **2018**, 13, 55–75.
76. Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv **2014**, arXiv:1409.0473.
77. Zhang, L.; Wang, S.; Liu, B. Deep learning for sentiment analysis: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. **2018**, 8, e1253.
78. Zhang, X.; Liu, S.; Zheng, X. Stock price movement prediction based on a deep factorization machine and the attention mechanism. Mathematics **2021**, 9, 800.
79. Kim, R.; So, C.H.; Jeong, M.; Lee, S.; Kim, J.; Kang, J. Hats: A hierarchical graph attention network for stock movement prediction. arXiv **2019**, arXiv:1908.07999.
80. Hua, Y. An efficient traffic classification scheme using embedded feature selection and LightGBM. In Proceedings of the Information Communication Technologies Conference (ICTC), Nanjing, China, 29–31 May 2020; pp. 125–130.
81. Katrutsa, A.; Strijov, V. Stress test procedure for feature selection algorithms. Chemom. Intell. Lab. Syst. **2015**, 142, 172–183.
82. Garg, A.; Tai, K. Comparison of statistical and machine learning methods in modelling of data with multicollinearity. Int. J. Model. Identif. Control **2013**, 18, 295–312.

| Author | Year | Objective | Method | Pros | Cons |
|---|---|---|---|---|---|
| Ralston and Wilf [13] | 1960 | Develop a method for model selection | Forward selection and backward elimination | Simple to understand and use | Final model depends on the order of entry/removal |
| Mallows [17] | 1964 | A criterion for subset selection | Cp criterion | Graphically compares quality between models | Sensitive to outliers and non-normality |
| Gorman and Toman [16] | 1966 | Fractional factorial design for model selection | Fractional factorial design with the Cp criterion | Avoids computing all possible models | Heuristic technique |
| Kashid and Kulkarni [18] | 2002 | A more general selection criterion than Cp when least squares is not best | Sp criterion | Applicable to any estimator | Computationally difficult; results not consistent |
| Misra and Yadav [21] | 2020 | Improve classification accuracy with small sample sizes | Recursive feature elimination with cross-validation | Does not delete records | Evaluated only on a small sample |
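The stepwise procedures in the table above (e.g., forward selection [13]) amount to a greedy search over candidate columns. A minimal NumPy sketch, assuming a plain residual-sum-of-squares criterion rather than Cp or Sp (the `forward_select` helper is illustrative, not from any reviewed paper):

```python
import numpy as np

def forward_select(X, y, k):
    """Greedy forward selection: repeatedly add the column that most
    reduces the residual sum of squares of an OLS fit (with intercept)."""
    n, p = X.shape
    chosen = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in chosen:
                continue
            Z = np.column_stack([np.ones(n), X[:, chosen + [j]]])
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            rss = ((y - Z @ beta) ** 2).sum()
            if rss < best_rss:
                best_j, best_rss = j, rss
        chosen.append(best_j)
    return chosen

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=100)
print(forward_select(X, y, 2))  # recovers the two informative columns, 0 and 3
```

Because the criterion is evaluated one column at a time, the final subset can depend on the order of inclusion, which is the "order" weakness the table notes for [13].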

| Author | Year | Objective | Method | Pros | Cons |
|---|---|---|---|---|---|
| Wold [22] | 1982 | Create new components using the relationship between predictors and response | Partial least squares (PLS) | Supervised component extraction | Cannot capture significant non-linear characteristics |
| Lafi and Kaneene [5] | 1992 | Use PCA to perform regression | Principal component analysis (PCA) | Reduces dimensions | Does not account for the relationship with the response variable |
| Bies et al. [30] | 2006 | Genetic algorithm-based approach to model selection | Genetic algorithm | Less subjectivity in model choice | Poor at finding local minima |
| Katrutsa and Strijov [31] | 2017 | Quadratic programming approach | Quadratic programming | Investigates both relevance and redundancy of features | Cannot evaluate multicollinearity between quantitative and nominal random variables |
| Senawi et al. [34] | 2017 | Feature selection and ranking | Maximum relevance-minimum multicollinearity (MRmMC) | Works well on classification problems | Non-exhaustive |
| Tamura et al. [11] | 2017 | Mixed integer optimization | Mixed integer semidefinite optimization (MISDO) | Uses backward elimination to reduce computation | Only applies to a small number of variables |
| Tamura et al. [35] | 2019 | Mixed integer optimization | Mixed integer quadratic optimization (MIQO) | Uses VIF as the indicator | Only applies to a small number of variables |
| Chen et al. [38] | 2020 | Combine the results of filter, wrapper, and embedded feature selection | Ensemble feature selection | Overcomes the local-optima problem | Higher computational cost than a single method |
| Zhao et al. [36] | 2020 | Variable screening based on sure independence screening (SIS) | Preconditioned profiled independence screening (PPIS) | Variable screening in a high-dimensional setting | Requires decorrelation of the predictors |
| Larabi-Marie-Sainte [39] | 2021 | Feature selection based on outlier detection | Projection pursuit | Found outliers correlated with irrelevant features | Does not work well when features are noisy |
| Singh and Kumar [40] | 2021 | Create new variables | Linear combinations and ratios of independent variables | Does not remove any variables | Based on trial and error |
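The PCA-based remedy attributed to Lafi and Kaneene [5] in the table amounts to regressing on principal component scores instead of the original collinear columns. A minimal NumPy sketch of principal component regression in that spirit (`pcr_fit` is an illustrative name, not their code; the component count is chosen by hand here rather than by any criterion):

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: project centred X onto its top
    principal components, then run OLS on the (uncorrelated) scores."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T          # loadings of the kept components
    T = Xc @ V                       # component scores replace raw columns
    Z = np.column_stack([np.ones(len(y)), T])
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return V, gamma, Z @ gamma       # loadings, coefficients, fitted values

rng = np.random.default_rng(2)
x1 = rng.normal(size=150)
x2 = x1 + rng.normal(scale=0.01, size=150)   # nearly collinear pair
x3 = rng.normal(size=150)
X = np.column_stack([x1, x2, x3])
y = 2 * x1 + rng.normal(scale=0.1, size=150)
V, gamma, yhat = pcr_fit(X, y, n_components=2)
```

Dropping the smallest component discards exactly the near-singular direction that inflates variance, which is the table's "reduces dimensions" pro; the cons column applies too, since the components are chosen without looking at y.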

| Author | Year | Objective | Method | Pros | Cons |
|---|---|---|---|---|---|
| Hoerl [41] | 1962 | Add bias in exchange for lower variance | Ridge regression | Reduces overfitting | Introduces a significant amount of bias |
| Singh et al. [45] | 1986 | Address the significant bias of ridge regression | Jackknife procedure | Simple way to obtain confidence intervals for the regression parameters | Larger variance than ridge regression |
| Liu [46] | 1993 | Simple procedure to find the ridge parameter | Liu estimator | The estimate is a linear function of the ridge parameter | Does not work under severe multicollinearity |
| Tibshirani [52] | 1996 | Address the interpretability of stepwise and ridge regression | Lasso regression | Shrinks coefficients to exactly zero | Worse performance than ridge; does not work when p > n |
| Liu [47] | 2003 | Existing methods do not work under severe multicollinearity | Liu-type estimator | Allows a large shrinkage parameter | Requires estimating two parameters |
| Efron et al. [55] | 2004 | Computational simplicity | Least angle regression (LARS) | Computationally simpler than Lasso | Very sensitive to outliers |
| Zou and Hastie [53] | 2005 | Combine ridge and Lasso regression | Elastic net | Achieves a grouping effect | No parsimony |
| Chandrasekhar et al. [51] | 2016 | Apply the ridge parameter only to variables with high collinearity | Partial ridge regression | More precise parameter estimates | Subjective measure of high collinearity |
| Assaf et al. [43] | 2019 | A conditionally conjugate prior for the biasing constant | Bayesian approach to finding the ridge parameter | Produces a marginal posterior of the parameter given the data | Focuses only on a single parameter |
| Nguyen and Ng [58] | 2020 | Strictly concave penalty function | Modified log penalty | Parsimonious variable selection under multicollinearity | No grouping effect |
| Kibria and Lukman [59] | 2020 | Alternative to the ordinary least squares estimator | Kibria-Lukman estimator | Outperforms ridge and Liu-type regression | Results depend on certain conditions |
| Arashi et al. [60] | 2021 | High-dimensional alternative to ridge and Liu estimators | Two-parameter estimator | Has asymptotic properties | Lower efficiency in sparse models |
| Qaraad et al. [61] | 2021 | Tune the alpha parameter of elastic net | Optimized elastic net | Effective with imbalanced and multiclass data | Accuracy metric not discussed |
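Ridge regression [41] has a closed form that makes the table's bias-variance trade-off concrete: adding λI to X′X stabilizes the inversion that multicollinearity makes ill-conditioned. A minimal NumPy sketch (λ = 10 is an arbitrary illustrative value, not a tuned one):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^(-1) X'y; lam = 0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)   # near-duplicate predictor
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.1, size=100)

b_ols = ridge(X, y, 0.0)     # unstable: the two coefficients can swing apart
b_ridge = ridge(X, y, 10.0)  # shrunk toward a stable shared solution
print(b_ols, b_ridge)
```

The deliberate bias is visible in the code: λI pulls both coefficients below their true values, but it collapses the huge variance along the near-singular (1, −1) direction that OLS cannot control.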

| Author | Year | Objective | Method |
|---|---|---|---|
| Huynh and Won [65] | 2011 | Multi-objective optimization function to minimize error | Regularized OS-ELM algorithm |
| Garg and Tai [63] | 2012 | Hybrid method of PCA and ANN | Factor analysis-artificial neural network (FA-ANN) |
| Ye et al. [66] | 2013 | Input weights that change with time | OS-ELM time-varying (OS-ELM-TV) |
| Gu et al. [67] | 2014 | Penalty factor in the weight-adjustment matrix | Timeliness online sequential ELM algorithm |
| Guo et al. [68] | 2014 | Smoothing parameter to adjust output weights | Least squares incremental ELM algorithm |
| Hoseinzade and Haratizadeh [72] | 2019 | Model the correlation among different features from a diverse set of inputs | CNNpred |
| Kim and Kim [73] | 2019 | Use features from different representations of the same data to predict stock movement | LSTM-CNN |
| Nóbrega and Oliveira [70] | 2019 | Kalman filter to adjust output weights | Kalman learning machine (KLM) |
| Hua [80] | 2020 | Decision trees to select features | LightGBM |
| Obite et al. [62] | 2020 | Compare ANN and OLS regression in the presence of multicollinearity | Artificial neural network |
| Zhang et al. [78] | 2021 | Apply attention to capture intraday interactions between input features | CNN-deep factorization machine and attention mechanism (FA-CNN) |
| Mahadi et al. [69] | 2022 | Regularization parameter that varies with time | Regularized recursive least squares |
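The regularized online-learning line in the table (e.g., Huynh and Won [65]; Mahadi et al. [69]) builds on the classical recursive least-squares (RLS) update, in which the initialization P0 = I/λ acts like an l2 penalty that fades as samples arrive. A minimal NumPy sketch of plain RLS with that ridge-style start, assuming a fixed λ rather than the time-varying parameters those papers propose:

```python
import numpy as np

def rls(X, y, lam=1.0):
    """Recursive least squares processed one sample at a time.
    P starts at I/lam, so early estimates behave like ridge estimates."""
    p = X.shape[1]
    w = np.zeros(p)
    P = np.eye(p) / lam
    for x_t, y_t in zip(X, y):
        Px = P @ x_t
        k = Px / (1.0 + x_t @ Px)     # gain vector
        w = w + k * (y_t - x_t @ w)   # correct by the prediction error
        P = P - np.outer(k, Px)       # update the inverse-covariance proxy
    return w

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.05, size=500)
print(rls(X, y))  # converges close to [1, -2, 0.5]
```

Each pass touches only one sample, which is why this family suits the online and streaming settings the OS-ELM variants in the table target.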

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Chan, J.Y.-L.; Leow, S.M.H.; Bea, K.T.; Cheng, W.K.; Phoong, S.W.; Hong, Z.-W.; Chen, Y.-L.
Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review. *Mathematics* **2022**, *10*, 1283.
https://doi.org/10.3390/math10081283

**AMA Style**

Chan JY-L, Leow SMH, Bea KT, Cheng WK, Phoong SW, Hong Z-W, Chen Y-L.
Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review. *Mathematics*. 2022; 10(8):1283.
https://doi.org/10.3390/math10081283

**Chicago/Turabian Style**

Chan, Jireh Yi-Le, Steven Mun Hong Leow, Khean Thye Bea, Wai Khuen Cheng, Seuk Wai Phoong, Zeng-Wei Hong, and Yen-Lin Chen.
2022. "Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review" *Mathematics* 10, no. 8: 1283.
https://doi.org/10.3390/math10081283