Predicting Stock Market Risk Using Machine Learning Classification Models
Abstract
1. Introduction
- We operationalize stock market risk as extreme downside return events.
- We systematically compare nine classification models using out-of-sample validation and find that an interpretable model (Logistic Regression) outperforms more complex alternatives in predicting tail-risk events.
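One way to read the tail-risk definition above is as a binary label attached to each period's return. The following is a minimal sketch under stated assumptions: the empirical-quantile threshold, the function names `empirical_quantile` and `label_downside_events`, the 10% cut-off, and the sample returns are all hypothetical and are not taken from the paper.

```python
import math

def empirical_quantile(values, q):
    """Return the q-th empirical quantile using linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def label_downside_events(returns, q=0.05):
    """Label 1 (risk) when a return is at or below the q-quantile, else 0."""
    threshold = empirical_quantile(returns, q)
    return [1 if r <= threshold else 0 for r in returns], threshold

# Hypothetical daily returns; not the paper's data set.
sample_returns = [0.004, -0.012, 0.007, -0.051, 0.002, 0.011, -0.003,
                  0.009, -0.027, 0.001, 0.006, -0.008, 0.013, -0.002,
                  0.005, -0.064, 0.003, 0.008, -0.001, 0.010]

# Assumed cut-off: the lowest 10% of returns count as risk events.
labels, threshold = label_downside_events(sample_returns, q=0.10)
print(f"threshold={threshold:.4f}, risk events={sum(labels)}")
```

Because the threshold is a low quantile, the resulting labels are imbalanced by construction, which is why per-class precision and recall matter more than overall accuracy for this kind of target.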
2. Data and Models
2.1. Data
2.2. Model
2.2.1. Logistic Regression Model
2.2.2. k-Nearest Neighbor Model
2.2.3. Decision Tree Model
2.2.4. Random Forest Model
2.2.5. Linear Discriminant Analysis Model
2.2.6. Naive Bayes Model
2.2.7. Quadratic Discriminant Analysis Model
2.2.8. AdaBoost Model
2.2.9. Gradient Boosting Model
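The nine classifiers listed in this section can be put through the same fit-and-score loop. The sketch below uses scikit-learn with default hyperparameters on synthetic, imbalanced data. Everything in it is an assumption for illustration: the data generator, the stratified random hold-out split (a chronological split may suit real return series better), and the default settings are not the paper's data, validation scheme, or tuning.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with roughly 7% "risk" labels (class 1).
X, y = make_classification(n_samples=2000, n_features=8, n_informative=5,
                           n_redundant=0, weights=[0.93], random_state=0)

# Assumed hold-out scheme: stratified 80/20 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "k-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
    "Naive Bayes": GaussianNB(),
    "Quadratic Discriminant Analysis": QuadraticDiscriminantAnalysis(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Column 1 of predict_proba is the estimated probability of class 1 (risk).
    risk_prob = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "risk_f1": f1_score(y_test, model.predict(X_test), pos_label=1),
        "auc": roc_auc_score(y_test, risk_prob),
    }

for name, scores in results.items():
    print(f"{name:32s} risk F1={scores['risk_f1']:.4f}  AUC={scores['auc']:.4f}")
```

Scores from this synthetic run say nothing about the paper's findings; the point is only that all nine models share the same `fit` / `predict` / `predict_proba` interface, so a like-for-like out-of-sample comparison takes a single loop.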
3. Experimental Results
3.1. Performance Metrics
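The AUC column in the result tables summarizes how well a model ranks risk periods above non-risk periods across all thresholds. As a minimal sketch, assuming the standard pairwise (Mann-Whitney) reading of AUC, it can be computed directly from labels and predicted risk scores. The function name `roc_auc` and the sample values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = risk) and predicted risk probabilities.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_score = [0.05, 0.20, 0.80, 0.10, 0.35, 0.40, 0.02, 0.90]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.4f}")
```

The quadratic loop is kept for readability; a rank-based formulation gives the same value in O(n log n) for larger samples.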
3.2. Performance Analysis Results
4. Conclusions
Funding
Data Availability Statement
Conflicts of Interest



| Actual outcome | Predicted: Non-Risk | Predicted: Risk |
|---|---|---|
| Non-Risk | True Positive (TP) | False Negative (FN) |
| Risk | False Positive (FP) | True Negative (TN) |

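
Under the labeling in the confusion matrix above, Non-Risk is the positive class. Assuming the paper uses the standard definitions, the per-class metrics reported in the result tables follow from the four cell counts as sketched here; the subscript $c$ is notation introduced for this sketch.

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + FN + FP + TN} \\
\text{Precision}_{\text{Non-Risk}} &= \frac{TP}{TP + FP}, &
\text{Recall}_{\text{Non-Risk}} &= \frac{TP}{TP + FN} \\
\text{Precision}_{\text{Risk}} &= \frac{TN}{TN + FN}, &
\text{Recall}_{\text{Risk}} &= \frac{TN}{TN + FP} \\
F1_c &= \frac{2\,\text{Precision}_c\,\text{Recall}_c}{\text{Precision}_c + \text{Recall}_c}, &
c &\in \{\text{Non-Risk},\ \text{Risk}\}
\end{aligned}
$$
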
| Model | Accuracy | Non-Risk Precision | Non-Risk Recall | Risk Precision | Risk Recall | Non-Risk F1 Score | Risk F1 Score | AUC |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | 0.9676 | 0.9772 | 0.9885 | 0.7895 | 0.6522 | 0.9828 | 0.7143 | 0.9878 |
| k-Nearest Neighbors | 0.9638 | 0.9782 | 0.9833 | 0.7264 | 0.6696 | 0.9808 | 0.6968 | 0.9921 |
| Decision Tree | 0.9617 | 0.9793 | 0.9799 | 0.6930 | 0.6870 | 0.9796 | 0.6900 | 1.0 |
| Random Forest | 0.9617 | 0.9793 | 0.9799 | 0.6930 | 0.6870 | 0.9796 | 0.6900 | 1.0 |
| Linear Discriminant Analysis | 0.9644 | 0.9691 | 0.9937 | 0.8451 | 0.5217 | 0.9813 | 0.6452 | 0.9878 |
| Naïve Bayes | 0.9633 | 0.9686 | 0.9931 | 0.8310 | 0.5130 | 0.9807 | 0.6344 | 0.9878 |
| Quadratic Discriminant Analysis | 0.9633 | 0.9686 | 0.9931 | 0.8310 | 0.5130 | 0.9807 | 0.6344 | 0.9878 |
| AdaBoost | 0.9692 | 0.9861 | 0.9810 | 0.7339 | 0.7913 | 0.9836 | 0.7615 | 0.9876 |
| Gradient Boosting | 0.9655 | 0.9810 | 0.9822 | 0.7257 | 0.7130 | 0.9816 | 0.7193 | 0.9988 |

| Model | Accuracy | Non-Risk Precision | Non-Risk Recall | Risk Precision | Risk Recall | Non-Risk F1 Score | Risk F1 Score | AUC |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | 0.9714 | 0.9744 | 0.9956 | 0.9048 | 0.6129 | 0.9849 | 0.7308 | 0.9913 |
| k-Nearest Neighbors | 0.9571 | 0.9760 | 0.9782 | 0.6667 | 0.6452 | 0.9771 | 0.6557 | 0.9432 |
| Decision Tree | 0.9509 | 0.9738 | 0.9738 | 0.6129 | 0.6129 | 0.9738 | 0.6129 | 0.7655 |
| Random Forest | 0.9509 | 0.9738 | 0.9738 | 0.6129 | 0.6129 | 0.9738 | 0.6129 | 0.9384 |
| Linear Discriminant Analysis | 0.9591 | 0.9601 | 0.9978 | 0.9231 | 0.3871 | 0.9786 | 0.5455 | 0.9913 |
| Naïve Bayes | 0.9611 | 0.9621 | 0.9978 | 0.9286 | 0.4194 | 0.9796 | 0.5778 | 0.9913 |
| Quadratic Discriminant Analysis | 0.9611 | 0.9621 | 0.9978 | 0.9286 | 0.4194 | 0.9796 | 0.5778 | 0.9913 |
| AdaBoost | 0.9550 | 0.9760 | 0.9760 | 0.6452 | 0.6452 | 0.9760 | 0.6452 | 0.9883 |
| Gradient Boosting | 0.9509 | 0.9738 | 0.9738 | 0.6129 | 0.6129 | 0.9738 | 0.6129 | 0.9834 |
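
As a worked check of how the columns relate, the sketch below recomputes the Logistic Regression row of the table directly above from a set of confusion-matrix counts. The counts (456, 2, 12, 19) are not reported in the source: they are inferred here from the rounded metrics and should be treated as an assumption. The function name `class_metrics` is likewise hypothetical. Non-Risk is treated as the positive class, matching the confusion matrix.

```python
def class_metrics(tp, fn, fp, tn):
    """Per-class metrics with Non-Risk as the positive class."""
    def f1(precision, recall):
        return 2 * precision * recall / (precision + recall)

    nonrisk_precision = tp / (tp + fp)
    nonrisk_recall = tp / (tp + fn)
    risk_precision = tn / (tn + fn)
    risk_recall = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "nonrisk_precision": nonrisk_precision,
        "nonrisk_recall": nonrisk_recall,
        "risk_precision": risk_precision,
        "risk_recall": risk_recall,
        "nonrisk_f1": f1(nonrisk_precision, nonrisk_recall),
        "risk_f1": f1(risk_precision, risk_recall),
    }

# Inferred counts (assumption): 458 actual Non-Risk and 31 actual Risk cases.
metrics = class_metrics(tp=456, fn=2, fp=12, tn=19)
rounded = {name: round(value, 4) for name, value in metrics.items()}
for name, value in rounded.items():
    print(f"{name:18s} {value:.4f}")
```

With these counts, a Risk recall of 0.6129 corresponds to 19 of 31 risk cases detected, which shows how a model can reach an accuracy above 0.97 while still missing more than a third of the tail events.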
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Noh, S.-H. Predicting Stock Market Risk Using Machine Learning Classification Models. Risks 2026, 14, 92. https://doi.org/10.3390/risks14040092
