Systematic Review

Factors, Forecasts, and Simulations of Volatility in the Stock Market Using Machine Learning

by Juan Mansilla-Lopez 1, David Mauricio 2,* and Alejandro Narváez 3
1 Facultad de Ingeniería Industrial y de Sistemas, Universidad Nacional de Ingeniería, 210 Túpac Amaru Ave, Lima 15333, Peru
2 Facultad de Ingeniería de Sistemas e Informática, Universidad Nacional Mayor de San Marcos, 375 Carlos Germán Amezaga Ave, Lima 15081, Peru
3 Facultad de Ciencias Administrativas, Universidad Nacional Mayor de San Marcos, 375 Carlos Germán Amezaga Ave, Lima 15081, Peru
* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2025, 18(5), 227; https://doi.org/10.3390/jrfm18050227
Submission received: 24 February 2025 / Revised: 19 March 2025 / Accepted: 18 April 2025 / Published: 24 April 2025
(This article belongs to the Special Issue Machine Learning Based Risk Management in Finance and Insurance)

Abstract

Volatility is a risk indicator for the stock market, and its measurement is important for investors’ decisions; however, few studies have investigated it. Only two systematic reviews focusing on volatility have been identified. In addition, with the advance of artificial intelligence, several machine learning algorithms should be reviewed. This article provides a systematic review of the factors, forecasts, and simulations of volatility in the stock market using machine learning (ML) in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) review selection guidelines. From the initial 105 articles identified in the Scopus and Web of Science databases, 40 articles met the inclusion criteria and were included in the review. The findings show that publication trends exhibit growing interest in stock market volatility; fifteen factors influence volatility in six categories: news, politics, irrationality, health, economics, and war; twenty-seven prediction models based on ML algorithms, many of them hybrid, have been identified, including recurrent neural networks, long short-term memory, support vector machines, support vector regression, and artificial neural networks; and finally, five hybrid simulation models that combine Monte Carlo simulations with other optimization techniques are identified. In conclusion, the review process shows a movement in volatility studies from classic to ML-based simulations owing to the greater precision obtained by hybrid algorithms.

1. Introduction

Investors have shown interest in the forecast and simulation of time series in finance (Sezer et al., 2020). Several studies have been conducted to analyze, explain, simulate, and predict the stock market as well as to achieve improved models, combining financial and mathematical techniques with computational algorithms. According to Litimi et al. (2018), financial time series are chaotic, non-linear, and complex. De Gaetano (2019) specifies that understanding the stochastic process that underlies stock returns is crucial for making accurate investment decisions and offers insights into the level of risk in investments. Volatility measures the level of uncertainty in stock markets by representing investors’ moods regarding the trends in a country’s global and local economies. According to McMillan (2002), volatility describes how quickly a stock, index, or futures price changes.
In this sense, volatility has been prevalent since the emergence of the first financial bubble, in tulip bulb prices in the 1600s (Hayes, 2022). Over time, the pattern repeated itself in the valorization of the South Sea Company and the crisis of 1929. During the Great Depression in 1931, the cumulative decline was 47%; during the oil crisis in 1974, it was 28.8%; and in 2002, with the bursting of the dot-com bubble, it was 28.6%. With the COVID-19 pandemic of 2020, it was 36%, and most recently, at the beginning of 2022, it was 25% because of the Russia–Ukraine war (Dow Jones–DJIA–100 Year Historical Chart, 2023). Specific events can also magnify volatility for a limited period, even a few days, as on 5 February 2018, when the intraday change in the VIX exceeded 100%, a situation not observed since 2007 (CBOE Volatility Index (^VIX), 2022). Thus, if an investor expects a positive return and volatility increases, the probability of a negative return also increases (Bekiros & Georgoutsos, 2008a).
Studies on financial market volatility focus on various aspects, such as factors, forecasting, and simulation. Studies on factors refer to events or information that influence volatility, such as positive or negative news (Liu & Wang, 2012). Forecasting financial market volatility has become an indicator of the uncertainty associated with the profitability of financial assets, including stocks, indices, and commodities, and is a challenging problem because it is affected by different factors, many of which are unknown (Ayala et al., 2021). Trierweiler Ribeiro et al. (2020) proposed a hybrid model based on an echo state neural network to forecast stock price return volatility. Furthermore, a few studies have been conducted on volatility simulation, one of which is by Khashanah and Alsulaiman (2016), who presented a meta-model that simulates the stock market at the macro- and micro-levels and determines the effect of its parameters on volatility and market capitalization.
Some researchers have published state-of-the-art articles on volatility owing to the large number of studies on this topic. Poon and Granger (2003) reviewed 93 studies on volatility forecasting and identified traditional techniques, such as generalized autoregressive conditional heteroskedasticity (GARCH), the exponentially weighted moving average (EWMA), historical volatility (HISVOL) models, implied standard deviation (ISD) models, stochastic volatility (SV) models, and the implied volatility index (VIX), none of which use deep learning. Sezer et al. (2020) performed a systematic review of the literature that covered several aspects; regarding volatility, however, they reviewed only its forecast, considering seven studies (three from conferences and four published in journals) from 2005–2019 and identifying the following ML algorithms as more accurate than traditional techniques: the convolutional neural network, recurrent neural network (RNN), long short-term memory (LSTM), LSTM + GARCH, recurrent mixture density network (RMDN) + GARCH, and heterogeneous autoregressive process (HAR) + genetic algorithm support vector regressor (GASVR).
Our motivation was to provide a state-of-the-art snapshot of the ML models developed for the stock market and volatility. Given the significant advancements in recent years in the application of ML in the field of finance and the importance of volatility in the stock market, more researchers are writing articles on its prediction. This study aims to fill this gap through a comprehensive overview of volatility forecasting, with results serving as recommendations for future research.
An analysis identifying the most important characteristics of each study in journals indexed in Scopus and the Web of Science (WoS) was conducted. Our focus was solely on factors, forecasts, and simulations of stock market volatility. Other aspects, including index direction predictions, commodity volatility, and stock price predictions, were not considered in this study. The main contributions of this study are as follows: to provide an overview of stock market volatility, specifically in terms of factors, ML, and simulations, over the last 20 years; to provide a definition of volatility that integrates previous concepts and provides new contexts; to provide a wide range of bibliographic references that can be used to understand and investigate volatility in stock markets through ML.
The rest of the article is structured as follows. In Section 2, a review of the literature is presented. Section 3 outlines the materials and methods. Section 4 exhibits the results obtained. Section 5 discusses the findings, and finally, Section 6 presents the conclusions and summarizes and suggests future research directions.

2. Literature Review

Volatility is an indicator of uncertainty associated with the return of an asset (Ramos-Pérez et al., 2019), plays an important role in various areas of finance as a measure of uncertainty, and is a key factor in investment portfolios, the pricing of derivatives, and risk management (Lahmiri, 2015; Naidu, 2018). Thus, understanding its nature and evolution is valuable for financial analysts (Petneházi & Gáll, 2019). Volatility quantifies the dispersion of returns but is not directly observable (Hördahl & Packer, 2007), and its forecast is an essential factor in portfolio management, risk management, monetary policy, and security pricing. Therefore, improvements in volatility forecasting models are important (Seo et al., 2019). There are several types of volatility, such as historical (realized) and implied. Historical volatility is calculated from the recent trading activity of an asset, which makes it factual and known, but it does not indicate a stock’s future movement. Implied volatility refers to forward-looking volatility derived from the price of a stock’s options traded in the market and is considered an indicator of risk associated with the underlying financial instrument (Rhoads, 2011). Historical volatility can be measured more accurately using intraday data when the sampling frequency increases (Berger et al., 2009).
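As an illustration of how historical volatility is obtained from recent trading activity, the following minimal sketch computes the sample standard deviation of log returns and annualizes it. The function names, the closing prices, and the convention of 252 trading days per year are assumptions introduced here for illustration; they are not taken from the reviewed studies.

```python
import math
import statistics


def log_returns(prices):
    """Return the log returns ln(P_t / P_{t-1}) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def historical_volatility(prices, periods_per_year=252):
    """Annualized sample standard deviation of log returns.

    periods_per_year=252 is an assumed trading-day convention.
    """
    returns = log_returns(prices)
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


# Hypothetical closing prices, for illustration only.
closes = [100.0, 101.5, 100.8, 102.3, 101.9, 103.4]
vol = historical_volatility(closes)
print(f"annualized historical volatility: {vol:.2%}")
```

The result describes only the sampled past prices; as noted above, it says nothing by itself about the stock's future movement.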
Volatility has persisted since 1997 (Stapf & Werner, 2003), which can be explained by the connection between electronic trading systems and the Internet. It can also be decomposed into two components, market and idiosyncratic, the latter being more influential than the former. Its increase is due to factors such as the growth in institutional investors and the volatility of long-term interest rates (Campbell & Viceira, 2002).
However, the first state-of-the-art review (Poon & Granger, 2003) is more than 20 years old, and the second (Sezer et al., 2020) included only seven studies on volatility forecasting. Since 2003, no systematic review has been dedicated exclusively to volatility forecasting. In addition, other important aspects of volatility exist, such as the factors that influence it and its simulation. It is important to know the factors and their categories so that institutional investors can understand in advance the conditions that could cause volatility. Forecasting allows next-day volatility to be estimated, which helps in making investment decisions. Simulations help reproduce the historical behavior of volatility, from which its future behavior can be inferred.

3. Materials and Methods

3.1. Objectives and Research Question

A systematic literature review (SLR) is a method for aggregating evidence to answer a research question in a clear and replicable way. This process seeks to include all available formal evidence on the subject and assess its quality (Lame, 2019). The systematic review process followed the PRISMA protocol (Page et al., 2020) for the methodical identification, selection, compilation, and analysis of key studies and their outcomes; it also draws on the SLR approach applied by Poon and Granger (2003), Villegas-Ortega et al. (2021), and Shiguihara et al. (2021). As established in the introduction, this work investigates cutting-edge contributions to stock market volatility research, covering machine learning, simulation models, and key determinant factors. This study aims to answer the following research question:
RQ:
What has been the progress of stock market volatility analysis using machine learning (ML) in the last 20 years?
To answer the research question, the following sub-questions on stock market volatility are posed:
RQ1:
What is stock market volatility?
RQ2:
What are the factors that determine it?
RQ3:
Which ML algorithms have been applied to its forecast?
RQ4:
Which simulation models are available?

3.2. Search Strategy

For our exhaustive examination of the existing literature on machine learning approaches, data sources, and predictive performance, we employed the principal scientific repositories Scopus and Web of Science (WoS) via their dedicated search engines. Following a standard systematic review protocol, the search terms were first selected. The complexity of volatility has led to the use of various terms to describe the same idea, such as “volatility” or “Vix”, which has been applied to various subjects, such as “Nasdaq”, “Dow Jones”, “S&P500”, “market index”, “market”, “order book”, “stock price”, “financial time”, and “trading”.
The current research builds upon a previously published study examining trends in volatility over the past 20 years (Hoffmann et al., 2021). We conducted a systematic literature review, as per Mongeon and Paul-Hus (2016), examining WoS/Scopus-indexed publications (2003–2023) through Scopus’s “title-abs-key” and WoS’s “topic” search functions with the following query:
(“volatility” OR “vix”) AND (“Nasdaq” OR “s&p500” OR “dow jones” OR “market index” OR “order book” or “financial time” and “trading” and “stock price”) AND “market” AND (“trend” OR predict* OR forecast* OR “explicable” OR “explainability” OR xia OR “pattern” OR behav* OR model* OR simulat* or factor* OR captur* OR influenc* OR casual*) AND (“deep learning” OR “support vector machine” OR “neural network” OR “machine learning” OR “lstm” OR “back propagation neural network” OR “computational intelligence” OR “narx” OR “svr”).
In addition, the inclusion and selection criteria given in Table 1 were considered.
The systematic review, conducted in accordance with PRISMA standards (Figure 1), identified studies for a comprehensive assessment based on predetermined selection parameters (Table 1). This process facilitated robust analysis of critical volatility drivers, machine learning approaches, and simulation models. We employed SCImago Journal Rank (SJR) to validate the academic rigor and impact of included publications.
The initial search query across Scopus and Web of Science yielded 105 articles for preliminary screening. After removing 9 duplicate records, 96 articles underwent a title and abstract review, leading to the exclusion of 29 irrelevant studies and leaving 67 for further assessment. Subsequent filtering prioritized studies focusing on machine learning applications, simulation techniques, volatility forecasting, and completed research, narrowing the pool to 23 articles. To enhance comprehensiveness, 17 additional studies addressing volatility-related aspects were incorporated from supplementary sources. The final selection comprised 40 eligible articles, which were thoroughly analyzed for their research proposals and key findings. A categorical breakdown of the selected literature revealed 8 studies on volatility concepts, 10 studies on volatility-influencing factors, 27 studies on machine learning forecasting models, and 5 studies on simulation models.
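The selection flow described above can be checked with a few lines of arithmetic. This is a sketch only: the variable names are introduced here, and the number excluded at the eligibility stage is derived from the reported totals rather than stated in the text.

```python
# Counts reported in the selection flow above.
identified = 105                   # records returned by Scopus and WoS
duplicates = 9
excluded_on_title_abstract = 29
retained_after_eligibility = 23    # reported figure, not derived
added_from_other_sources = 17

screened = identified - duplicates
assessed = screened - excluded_on_title_abstract
excluded_at_eligibility = assessed - retained_after_eligibility  # derived here
included = retained_after_eligibility + added_from_other_sources

# Reported breakdown of the included studies by aspect.
by_aspect = {"concepts": 8, "factors": 10, "forecasting": 27, "simulation": 5}
aspect_total = sum(by_aspect.values())

print(screened, assessed, excluded_at_eligibility, included, aspect_total)
```

The aspect counts sum to more than the number of included articles, which suggests that some studies were counted under more than one aspect.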

3.3. Study Extraction and Synthesis

Three independent investigators (J-M, D-M, A-N) performed dual-stage screening, first evaluating the titles and abstracts before proceeding to a full-text assessment. Our extraction protocol captured four critical elements for each included study: (1) authorship, (2) publication date, (3) methodological framework, and (4) principal conclusions. The lead researcher (J-M) conducted initial data extraction, with all co-authors (D-M, A-N) participating in cross-verification to ensure accuracy.
To maintain methodological rigor, we excluded non-English publications and conference abstracts, as these often lack peer-review scrutiny and may introduce bias. While this review followed best-practice guidelines, we note it was not preregistered in a public repository. Following Sezer et al.’s (2020) structured approach, we systematically organized, synthesized, and tabulated all research findings to facilitate comparative analysis.

4. Results

After analyzing the 40 selected articles, the following statistics were obtained.

4.1. Trends in Publications

The number of publications on stock market volatility based on ML has increased over the years, as shown in Figure 2. There is a greater interest in the topic following significant volatility events, such as the increase in interest rates or the post-COVID period.

4.2. Quality of Publications

Figure 3 presents the number of publications by journal quartile, showing that 87.5% of the selected articles were published in quartile 1 (Q1) and quartile 2 (Q2) journals, indicating that the results are reliable and that the topic is of interest.

4.3. Publications by Continent

Figure 4 shows that Asia and Europe had the highest number of researcher affiliations (39/47), accounting for 83%, with China accounting for 24% of the total affiliations, the United Kingdom for 8%, Korea for 6%, and the United States for 6%. Table 2 lists the articles selected based on study aspects related to the individual research questions, as detailed in Appendix A.

4.4. Publications by Author Impact

Table 2 outlines the top 20 authors according to their impact and total number of citations. Poon’s work was cited 1139 times in Scopus, and Bekiros has the highest h-index of 42.

5. Discussion

The results of this systematic review provide information related to stock market volatility with respect to its definition, the factors that affect it, ML algorithms for its prediction, and behavior simulation models. Researchers can use these to understand the behavior and prediction of volatility. The validity of the presented information is confirmed, as 75% of the studies reviewed belong to first- and second-quartile journals, strengthening the findings presented in this section. The research questions of this study are addressed as follows.

5.1. Volatility Analysis

5.1.1. Stock Market Volatility

Eight studies presented definitions of stock market volatility, and the four keywords in these definitions were return, risk, indicator, and period (Table 3).
The keyword definitions are as follows:
  • The return on an asset or portfolio is the percentage change in its initial value after a period of time and is also referred to as the return (on investment) (McMillan, 2002). It is referred to as the expected return when this period is in the future (Ross, 2018).
  • Uncertainty is the lack of knowledge about an event that reduces confidence in the conclusions derived from the data (Mulcahy, 2022).
  • The indicator is data or information on volatility that serves to determine its presence or absence and to measure its current condition as well as its financial forecast or economic trend (https://www.investopedia.com/terms/i/indicator.asp, accessed on 25 November 2024).
  • A period is the time during which an action is performed or an event takes place.
Table 3 shows that a recurring term in the definitions of volatility is risk or hazard (D01, D02, D03, D05, and D07), which is defined as a potential disaster that may occur and is measured by the probability that it will happen and how the damage may negatively affect the return (Wallace & Webber, 2011). Because risk is the effect of uncertainty on objectives (International Organization for Standardization, 2018), meaning the return, it can be stated that risk depends on the return and uncertainty (risk = return × uncertainty).
Table 3 shows the following:
  • Volatility is an indicator.
  • Some definitions are confusing, such as D02, in which no distinction is made between risk and uncertainty.
  • Four definitions (D02, D03, D05, and D07) indicate that volatility is an indicator of risk.
  • Three definitions (D02, D04, and D06) state that it is an indicator directly associated with profitability, whereas three others (D03, D05, and D07) state that it is indirectly associated with profitability. In addition, volatility formulas measure variations in profitability [A32]. Therefore, we conclude that this is an indicator of profitability.
  • Definitions (D01, D02, and D04) state that volatility is an indicator of uncertainty, which is incorrect because uncertainty can exist without generating volatility.
From the above findings, we can infer the following:
Volatility is an indicator of market risk, which measures the variation in the returns of a financial asset over a period of time.
The proposed definition covers the essential aspects of volatility, such as the return, risk, period, and indicator, and can also be applied to its typologies: annualized, realized, implied, and stochastic. Annualized volatility refers to volatility over a period of one year and realized volatility over different periods (instantaneous, daily, weekly, and monthly) [A9]. Implied volatility is future volatility based on market expectations about the maturity of options over a period [A17] measured by the VIX [A29]. Stochastic volatility is implied volatility to which a probabilistic component is added [A29].
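The realized and annualized typologies can be written compactly. The expressions below follow a common textbook convention and are offered as a sketch; the symbols are notation introduced here, not taken from the reviewed studies. Here $r_{d,i}$ denotes the $i$-th of $M$ intra-period log returns in period $d$, $P_{d,i}$ the corresponding price, and $N$ the assumed number of such periods in a year.

```latex
r_{d,i} = \ln\frac{P_{d,i}}{P_{d,i-1}}, \qquad
RV_d = \sqrt{\sum_{i=1}^{M} r_{d,i}^{2}}, \qquad
\sigma_{\text{annual}} = RV_d \,\sqrt{N}
```

Under this convention, a higher sampling frequency increases $M$, and the same expression applies to daily, weekly, or monthly periods by changing what $d$ indexes.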

5.1.2. Factors That Determine Stock Market Volatility

Using the proposed definition of volatility given in Section 5.1.1, 15 factors were identified in 10 studies and grouped into six categories (Table 4). In general, these studies do not define the factor but rather provide an understanding of it. Table A2 shows the identified and categorized factors, with the news category having the most factors (six) and studies (ten).

5.1.3. ML Algorithms Applied to Forecast Stock Market Volatility

Several algorithms for predicting volatility were found in the literature, including LSTM, RNN, SVR, SVM, ANN, and the echo state network, which are detailed in Table A3. Hybrid models involving more than one of these algorithms were predominantly used in previous studies. The research considered the following volatilities for prediction: realized, historical, implied, and stochastic. A significant portion of the studies used time windows for volatility assessment, including 5, 7, 14, 21, 50, and 100 days, with the aim of predicting the next day’s volatility. Meanwhile, a minority focused on more immediate predictions, such as the volatility of the next 5 min. Various metrics have been used to evaluate the results of these systems, for example, MAPE, MSE, RMSE, and MAE, among others.
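The windowing and evaluation setup described above can be sketched as follows. A 5-day window is paired with the next day's value, and a naive window-mean forecast stands in for an ML model so that the metrics have something to score. The helper names, the sample values, and the naive baseline are assumptions for illustration and do not reproduce any reviewed model.

```python
import math


def make_windows(series, window):
    """Pair each run of `window` values with the value that follows it."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and MAPE (in percent) for two equal-length lists."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE is undefined when an actual value is zero.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape}


# Hypothetical daily volatility values (percent), for illustration only.
daily_vol = [1.2, 1.4, 1.1, 1.3, 1.8, 1.6, 1.5, 1.7]
samples = make_windows(daily_vol, window=5)

# Naive stand-in forecast: the mean of each window predicts the next day.
actual = [target for _, target in samples]
predicted = [sum(w) / len(w) for w, _ in samples]
metrics = error_metrics(actual, predicted)
print(metrics)
```

In a forecasting study, `predicted` would come from a trained model; the window length and the choice of metric are among the settings that vary across the reviewed studies.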
The literature review revealed several advanced RNN-based approaches for volatility prediction. W. Wang et al. (2021) [A4] developed a hybrid framework combining an Elman recurrent neural network with a factorization machine to forecast realized volatility. Nguyen et al. (2023) [A18] introduced a statistical recurrent stochastic volatility algorithm integrated with RNNs for stochastic volatility modeling. Kaczmarek et al. (2022) [A37] employed RNNs to characterize market states through volatility predictions. Additionally, Bekiros and Georgoutsos (2008b) [A35] proposed an RNN-based model specifically designed for VIX index volatility forecasting.
The review identified several sophisticated applications of long short-term memory (LSTM) networks for volatility prediction. Petneházi and Gáll (2019) [A14] employed an LSTM recurrent neural network to forecast range-based volatility, while Moon and Kim (2019) [A15] utilized an LSTM algorithm to simultaneously predict stock market indices and their volatility across five major markets. For daily realized volatility forecasting, Petrozziello et al. (2022) [A31] leveraged assets from the Dow Jones, S&P 500, and Nasdaq indices within a deep LSTM architecture. Y. Lin et al. (2022) [A32] advanced the field by developing a hybrid LSTM-CEEMDAN model to enhance realized volatility predictions. Notable hybrid implementations included that of H. Y. Kim and Won (2018) [A10], with an integration of LSTM with GARCH models for KOSPI 200 index volatility, and those of W. Zhang et al. (2023) [A34] and Yu et al. (2023) [A40], with complex LSTM-based hybrid frameworks.
The review identified several neural network-based approaches for volatility prediction. Ramos-Pérez et al. (2019) [A8] employed an artificial neural network (ANN) to forecast the true realized volatility of the S&P 500 index. Similarly, Allen and Hooper (2018) [A29] developed an ANN-based model to predict realized volatility using data from both the S&P 500 and VIX indices. Recent advancements include hybrid neural network architectures, as demonstrated by Christensen et al. (2023) [A38], with a neural network hybrid for realized volatility, and Fatima and Uddin (2022) [A39], with an integrated deep learning framework. Additionally, Chkili and Hamdi (2021) [A23] proposed a novel combination of neural networks, with ARCH for volatility forecasting in the Dow Jones Islamic Market Index.
The review identified multiple SVR-based approaches in the volatility prediction literature. Notably, Ou and Wang (2014) [A16] proposed an SVR model integrated with a CGA (compact genetic algorithm) for forecasting stochastic volatility in the Nasdaq index. Bezerra and Albuquerque (2019) [A20] developed a hybrid SVR-GMWK (GARCH–MIDAS–wavelet kernel) framework to model volatility in the S&P 500 and Bovespa indices. In addition, Hung (2016) [A22] applied a fuzzy SVR algorithm to forecast realized volatility (2010–2013).
Beyond SVR, support vector machine (SVM) implementations were employed by W. Wang et al. (2021) [A4], Shen et al. (2018) [A7], and B. Wang et al. (2013) [A13]. Additional advanced models included Trierweiler Ribeiro et al.’s (2020) [A9] proposed hybrid framework for realized volatility. H. Kim et al. (2021) [A26] developed a machine learning ensemble for implied volatility, and Cho and Lee (2022) [A36] demonstrated deep learning-based volatility prediction.
Few studies indicated the tools used to make the predictions. Specific implementations included LIBSVM (B. Wang et al., 2013) [A13], Keras with TensorFlow (Petneházi & Gáll, 2019) [A14], MATLAB (Nguyen et al., 2023) [A18], Python with LIBSVM (H. Kim et al., 2021) [A26], and R’s GMDShell library (Allen & Hooper, 2018) [A29]. Y. Lin et al. (2022) also employed Python and TensorFlow [A32]. Notably, predictive outcomes varied significantly across studies—for instance, Hung (2015) [A27] demonstrated particular effectiveness at the 21-day forecasting horizon.

5.1.4. Simulation Models for Volatility in the Stock Market

Simulation models in stock market volatility analysis establish relationships between influencing factors and volatility dynamics, enabling the examination of market behavior under diverse conditions. Our systematic review identified five key studies employing simulation-based approaches to volatility prediction (Table A4).
Notable contributions include Nguyen et al. (2023) [A18], who developed a statistical recurrent stochastic volatility model implemented in MATLAB for stochastic volatility simulation. Xu et al. (2016) [A21] utilized Monte Carlo simulation for value-at-risk (VaR) computation. H. Kim et al. (2021) [A26] implemented a self-attention mechanism with the SABR model for volatility surface prediction. Hung (2015) [A27] proposed a robust Kalman filter integrated with fuzzy GARCH and particle swarm optimization (PSO) for volatility forecasting in the TWSE, HSI, and N225 indexes. Khashanah and Alsulaiman (2016) [A30] constructed a meta-model combining quantitative simulation with Monte Carlo methods and OptQuest Machine for market capitalization volatility prediction.
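To make the idea of Monte Carlo simulation for VaR concrete, the sketch below draws simulated one-period returns and reads the loss at a chosen quantile. It is a generic textbook illustration and not the method of any study cited above; the normal-return assumption, the parameter values, and the function name are all assumptions introduced here.

```python
import random


def monte_carlo_var(mu, sigma, confidence=0.95, n_paths=100_000, seed=0):
    """Estimate one-period VaR as a loss quantile of simulated returns.

    Assumes normally distributed returns with mean `mu` and
    volatility `sigma` (an illustrative simplification).
    """
    rng = random.Random(seed)
    simulated = sorted(rng.gauss(mu, sigma) for _ in range(n_paths))
    # Return located at the (1 - confidence) quantile of the simulation.
    cutoff = simulated[int((1 - confidence) * n_paths)]
    return -cutoff  # report VaR as a positive loss


# Hypothetical daily mean return and volatility.
var_95 = monte_carlo_var(mu=0.0005, sigma=0.012)
print(f"1-day 95% VaR: {var_95:.2%} of portfolio value")
```

Because the estimate scales with `sigma`, the quality of the volatility input drives the quality of the simulated risk figure, which is where the forecasting models discussed earlier would enter.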

5.2. Thematic Analysis

5.2.1. Volatility

Volatility is an indicator of market risk that measures the variation in the returns on financial assets over time. Compared with previous studies, this definition includes four common terms found in the reviewed literature—return, risk, indicator, and period—in which the most used indicator is the quality of measurement. Several volatility indicators have been created, including VIX, which measures implied volatility, VIX3M at three months, and VIXY-IV in the short term. A limitation to calculating volatility is access to real-time stock market data. Although it is true that the indicators of the main American stock exchanges are reported in real time, those of other stock exchanges are delayed by 15 min.

5.2.2. Factors

Fifteen factors were identified: company-related events, announcements, market news, viral posts, analysis reports, insider information, pandemics, greed, fear, financial crises, stock overvaluation, commodity price drops, Brexit, new regulations, and terrorist attacks. These were grouped into six categories: news, politics, irrationality, health, economics, and war, which cover the dimensions of volatility affectation and do not overlap. The most frequently mentioned factor is market news because of its immediate effect on the market. Notably, there is a gap in the literature in terms of investigation into the proposed factors. In general, these factors are based on common sense, and they can be limited to the visible region. A future direction for their identification is based on the analysis of theories.
News refers to significant events that generate volatility over short periods, which, if repeated, can motivate federal reserves to adopt economic policies to address or control such events, generating volatility with greater intensity and over a longer period; however, such policies can generate other types of events, indicating a move to a conjunctural state. Similarly, several continuous or parallel conjunctures could cause a structural change in politics and the economy, which in turn generates significant volatility, possibly over a longer period. In other words, there is an intrinsic relationship between the volatility factors and the fundamental aspects of the economy (event, conjuncture, and structure as shown in Figure 5).

5.2.3. ML Algorithms

Most of the algorithms applied to volatility forecasting are hybrid algorithms, with 18 of the 27 studies combining ML algorithms with stochastic algorithms. This hybridization may be explained by the fact that researchers seek to leverage the benefits that multiple approaches bring to this complex problem, which exhibits considerable uncertainty. Among the 27 identified ML algorithms, RNNs and LSTMs (a type of RNN) are the most used, with six and nine studies, respectively, as these algorithms capture characteristics that fit the behavior of volatility, such as feedback and temporality. Figure 6 presents the use of ML algorithms in volatility forecasting, where the predominance of recurrent networks, SVM, and SVR algorithms is observed.
Regarding the datasets, 85% of the studies analyzed the American market, mainly using the S&P 500 index. This is because the information is publicly available, while the remaining studies evaluated other markets, such as those of Asia, Europe, and Latin America. Furthermore, 40% of the studies considered datasets with periods of 10 years, 11% analyzed periods of 15 years, and 30% evaluated spans of 20 years.
Studies such as [A10] compared the results of GARCH-based models with a hybrid model combining LSTM networks and GARCH to replicate realized volatility. They found performance improvements of more than 10% in favor of machine learning models based on indicators such as MAE and MSE. The best results were obtained with the proposed GEW-LSTM model.
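For readers unfamiliar with the econometric side of such comparisons, the sketch below shows the standard GARCH(1,1) variance recursion. It is not the model of [A10]: the parameter values are hypothetical rather than estimated from data, and the function name is introduced here for illustration.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion:

    sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    # Start the recursion at the unconditional (long-run) variance.
    sigma2 = [omega / (1 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2  # the last element is the one-step-ahead forecast


# Hypothetical daily returns and parameters, for illustration only.
returns = [0.004, -0.012, 0.007, -0.031, 0.015]
path = garch11_variance(returns, omega=1e-6, alpha=0.08, beta=0.90)
print(f"next-day volatility forecast: {path[-1] ** 0.5:.2%}")
```

In a hybrid setup, outputs of econometric models of this kind can serve as inputs to a neural network alongside other features, which is one way of combining the two approaches.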
The CEEMDAN + LSTM hybrid algorithm has demonstrated enhanced accuracy in volatility forecasting due to its ability to decompose complex, non-linear, and non-stationary time-series data into intrinsic mode functions (IMFs). This decomposition, enabled by CEEMDAN, extracts multi-scale volatility patterns, while the LSTM network effectively captures temporal dependencies and long-term trends. The hybrid model overcomes the limitations of standalone methods, such as oversimplification by traditional econometric models or noise sensitivity in single deep learning approaches. Empirical studies [A32] highlight its superiority, particularly in turbulent market conditions, making it a robust tool for volatility forecasting.
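The decomposition-then-forecast idea can be written compactly as follows. This is a sketch under stated assumptions: the symbols K (number of IMFs), r (residual), W (window), h (horizon), and the per-component forecasters f_k and f_r are notation introduced here for illustration, and the exact configuration used in [A32] may differ.

```latex
x(t) = \sum_{k=1}^{K} \mathrm{IMF}_k(t) + r(t),
\qquad
\hat{x}(t+h) = \sum_{k=1}^{K} f_k\big(\mathrm{IMF}_k(t-W+1), \dots, \mathrm{IMF}_k(t)\big)
             + f_r\big(r(t-W+1), \dots, r(t)\big)
```

The first identity states that the IMFs and the residual add up to the original series; the second assumes that each component is forecast separately (for example, by an LSTM) and that the component forecasts are summed.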
Additionally, algorithms such as GVMD-Q-DBN-LSTM-GRU, SVR + GMWK, SVR + MSM, and HARQ + J + LSTM have also exhibited strong performance. Some of these models are more advanced, as they integrate multiple algorithms. However, their effectiveness is influenced by factors such as hyperparameters, dataset selection, time windows, and forecasting horizons. The variety of algorithms and results indicates a growing trend toward developing hybrid models capable of efficiently handling different datasets and stock markets.
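The role of the time window (W) and the forecasting horizon (H) can be illustrated with a short Python sketch that turns a volatility series into supervised samples. The function name `make_windows` and its arguments are hypothetical; individual studies may build their samples differently.

```python
def make_windows(series, window, horizon):
    """Build (inputs, targets) pairs from a series.

    Each input holds `window` consecutive observations; its target is the
    value observed `horizon` steps after the last observation in the window.
    """
    inputs, targets = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        end = start + window
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets
```

For example, with W = 22 and H = 1, each sample uses the previous 22 observations to predict the next one. Changing either value changes both the number of samples and the difficulty of the task, which is one reason why results reported under different W and H settings are hard to compare directly.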
The present research found 20 more studies on volatility prediction than Sezer et al. (2020), whose 7 studies included only 3 that used hybrid ML algorithms. In the research by Poon and Granger (2003), the application of ML techniques is not evident, but most of the performance indicators noted there are still used to evaluate models.

5.2.4. Simulation Models for Stock Market Volatility

Few studies on simulation models that considered ML algorithms were identified from 2003 to 2023. This may be because ML algorithms are increasingly used to reproduce the behavior of volatility more accurately than simulation-based models. Moreover, these algorithms may be used to build models that reflect data trends, as opposed to simulation models, which approximate the behavior of the data using established equations. However, the five simulation models that were identified mostly (80%) cover Asian and American markets, with the Monte Carlo and GARCH models being the most used.
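As an illustration of how Monte Carlo simulation and a GARCH model can be combined, the following minimal Python sketch simulates GARCH(1,1) paths, where the conditional variance follows sigma²(t) = omega + alpha · epsilon²(t−1) + beta · sigma²(t−1). The function names, the parameters, and the choice of the final-step volatility as the summary statistic are assumptions made for this example and do not reproduce any of the reviewed models.

```python
import math
import random


def simulate_garch_path(omega, alpha, beta, steps, rng):
    """Simulate one GARCH(1,1) path and return its conditional variances."""
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    variances = []
    for _ in range(steps):
        variances.append(var)
        shock = math.sqrt(var) * rng.gauss(0.0, 1.0)  # epsilon_t = sigma_t * z_t
        var = omega + alpha * shock ** 2 + beta * var
    return variances


def monte_carlo_volatility(omega, alpha, beta, steps, n_paths, seed=0):
    """Average the final-step volatility over n_paths simulated paths."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite unconditional variance")
    rng = random.Random(seed)
    finals = [
        math.sqrt(simulate_garch_path(omega, alpha, beta, steps, rng)[-1])
        for _ in range(n_paths)
    ]
    return sum(finals) / n_paths
```

Here the equations are fixed in advance and only the random shocks vary between paths, which reflects the contrast drawn above between equation-based simulation and ML models fitted to the data.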

5.2.5. Future Work

In the coming years, researchers and investors will refine their prediction models using hybrid algorithms or explore new ones incorporating sophisticated mathematical functions to achieve better and faster results by leveraging cloud technologies and even robots equipped with these algorithms. This advancement will enable quicker investment decisions; however, faster decision-making and the more widespread use of these algorithms could potentially lead to more immediate and prolonged volatility events.
A gap has been identified in the literature regarding the explainability of artificial intelligence in stock market analysis. This aspect could be of interest to both academia and society on a global scale, as it would be useful in an environment characterized by its complexity and chaotic nature. Its application could enhance both local and global understanding of financial markets.
The analysis of factors affecting volatility could be refined using a different approach based on theories that enable a better understanding of them, proposing a mix of different evaluation methods.

6. Conclusions

This study provided a structured literature review of factors, forecasting, and simulations of volatility in the stock market using ML techniques from 2003 until 2023. The four research questions concern the volatility definition (RQ1), factors determining volatility (RQ2), ML algorithms (RQ3), and simulation models (RQ4). This study selected 40 articles from the WoS and Scopus databases. Among these, 83% came from Asia and Europe and 6% came from the United States. Notably, 87.5% of the selected articles belonged to first- and second-quartile journals, which supports the reliability of the results presented. The state-of-the-art study by Sezer et al. (2020) considered financial time-series forecasting using deep learning methods and addressed volatility only briefly, in seven previously published studies; by comparison, our study involved 40 articles and also considered the factors that affect volatility and its simulation.
Seven definitions were found for volatility, with the keywords (1) return, (2) risk, (3) indicator, and (4) period. As no definition involving all these keywords was found, volatility was defined as an indicator of market risk that measures the variation in the returns of a financial asset over time. We identified 15 factors that influence volatility, and they were classified into six categories: news, politics, irrationality, health, economics, and war. Of these, news was the most studied, and the most relevant factors affecting this category were market news and financial crises. In addition, an intrinsic relationship was noted among the factors affecting volatility and the fundamental aspects of the economy (events, conjunctures, and structures).
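One common way to make this definition operational is the historical volatility of log returns, sketched below in Python. The function names and the annualization factor of 252 trading periods per year are assumptions for illustration; the reviewed studies use several different volatility measures (for example, realized and implied volatility).

```python
import math
from statistics import stdev


def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def historical_volatility(prices, periods_per_year=252):
    """Sample standard deviation of log returns, scaled to an annual figure."""
    returns = log_returns(prices)
    return stdev(returns) * math.sqrt(periods_per_year)
```

The sketch needs at least three prices, since the sample standard deviation requires two or more returns.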
Regarding algorithms used for volatility forecasting, 18 of the 27 publications used hybrid algorithms, many of which combined RNNs and LSTM recurrent neural networks with other models. In addition, the hybrid algorithms that exhibited the best forecasting performance were CEEMDAN + LSTM, GVMD-Q-DBN-LSTM-GRU, SVR + GMWK, SVR + MSM, and HARQ + J + LSTM, but their results depend on their hyperparameters as well as on the dataset, window, and horizon. These settings differed from those of the other models. The hybrid algorithms with the best performance results include the LSTM and SVR algorithms.
Regarding simulation models, three articles combined Monte Carlo simulation with optimization techniques and neural networks, with PSO being the most used stochastic algorithm, as used in three studies. However, no algorithm was favored. There was progress from simulation-based studies toward the use of ML algorithms. Future studies should consider the use of algorithms based on generative artificial intelligence and other new algorithms.
The findings of this study may benefit researchers by presenting the accumulated knowledge regarding volatility. Moreover, these findings will assist researchers in promoting academic studies based on broad, transparent, and replicable methodologies. They also provide an improved understanding of volatility, which may be used to improve the modeling of this complex concept to better analyze the stock market.
Although a comprehensive and detailed literature review was conducted, there is always the possibility that not all relevant articles were identified since the present work is limited to WoS and Scopus journals, the English language, and a 20-year period and excludes conference papers and other sources that may provide further research regarding volatility.
Future studies should identify new volatility factors based on scientific theory, as well as ML algorithms suitable for forecasting, such as NARX. The future (recurrence) effects of these factors should be investigated, and stochastic algorithms, such as GARCH, based on solid theoretical foundations of financial theory and statistics can be used. Explainable artificial intelligence should be applied to examine volatility and to determine its role in the stock market.

7. Limitations

This systematic review reveals that while volatility forecasting has been successfully implemented through machine learning techniques, simulation models, and traditional methods, several limitations emerge from the current literature. Regarding data, there is significant variability across the datasets, and model performance is time dependent, being sensitive to the specific period studied and to the parameter configuration of the algorithms. Regarding methodological constraints, the exclusion of non-English publications may have limited the geographical diversity of the research, and the systematic exclusion of conference proceedings could have omitted cutting-edge developments.

Author Contributions

Conceptualization, J.M.-L., D.M. and A.N.; methodology, J.M.-L., D.M. and A.N.; formal analysis, J.M.-L. and D.M.; investigation, J.M.-L., D.M. and A.N.; writing—original draft preparation, J.M.-L., D.M. and A.N.; writing—review and editing, J.M.-L., D.M. and A.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was conducted without external financial support.

Data Availability Statement

The research methodology was limited to comprehensive literature analysis, without any primary data collection or computational processing.

Conflicts of Interest

The research was conducted impartially, without any commercial or financial relationships that might be deemed a conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AHE: Asymmetric Hurst exponents
AI: Artificial intelligence
ANN: Artificial neural network
ANOVA: Analysis of variance
ARCH: Autoregressive conditional heteroscedasticity
CAC40: France Market Index
CAVIAR: Conditional autoregressive value at risk
CBOE: Chicago Board Options Exchange
CEEMDAN: Complete ensemble empirical mode decomposition with adaptive noise
CGA: Chaotic genetic algorithm
COVID: Coronavirus disease
CSM: Combination score model
DAX30: German Stock Index
DCC-GARCH-MANN: Dynamic conditional correlation GARCH multivariate ANN
DJIA: Dow Jones Industrial Average Index
ESN: Echo state network
EWMA: Exponentially weighted moving average
FIAPARCH-NNBP: Fractionally integrated asymmetric power autoregressive conditional heteroscedasticity with neural network back propagation
FM: Factorization machine
FSVM: Fuzzy support vector machine
FSVR: Fuzzy vector machine for regression
FTSE: Financial Times Stock Exchange
GA: Genetic algorithm
GARCH: Generalized autoregressive conditional heteroscedasticity
GARCH-EVT: Generalized autoregressive conditional heteroscedastic autoregressive model with extreme value theory
GASVR: Genetic algorithm support vector regressor
GEW: GARCH, Exponential GARCH, and EWMA models
GMWK: Gaussian and Morlet wavelet kernels
GVMD-Q-DBN-LSTM-GRU: Grey wolf optimizer variational mode decomposition-Q-learning-deep belief network–LSTM-gated recurrent unit
HAR: Heterogeneous autoregressive
HARQ-J: Heterogeneous autoregressive quarticity jump model
HARQ-F-J: Full heterogeneous autoregressive quarticity jump model
HISVOL: Historical volatility
HMAE: MAE-adjusted heteroscedastic
HMSE: MSE-adjusted heteroscedastic
HSI: Hang Seng Index
HS150: Hong Kong Stock Index
HS300: Chinese Shanghai and Shenzhen 300 Composite Index
IPC Mexico: Mexico Price and Quotations Index
ISD: Implied standard deviation
KOSPI: South Korean Stock Index
LIBSVM: Library for Support Vector Machines
LSTM: Long short-term memory
LS-SVM-IPSO: Least square SVM-improved PSO
L-HAR-X: Log volatility HAR with exogenous variables
L-NN: Log volatility neural network
MAE: Mean absolute error
MAPE: Mean absolute percentage error
MAFE: Mean absolute forecast error
ML: Machine learning
MLP: Multilayer perceptron
MPFE: Mean percentage forecast error
MSE: Mean square error
MSFE: Mean-squared forecast error
MSPE: Mean-squared prediction error
MSM: Markov switching multifractal
N225: Nikkei Index of the Tokyo Stock Exchange
NARX: Non-linear autoregressive network with exogenous inputs
NMSE: Normalized mean square error
PSO: Particle swarm optimization
PRISMA: Preferred Reporting Items for Systematic Review and Meta-Analysis
QARNN: Quantile autoregressive neural network
QLIKE: Quasi-likelihood
QRNN: Quantile regression neural network model
RMDN: Recurrent mixture density network
RMSE: Root mean square error
RNN: Recurrent neural network
SABR: Stochastic alpha beta rho model
SJR: SCImago Journal Rank
SLR: Systematic literature review
SMAPE: Symmetric mean absolute percentage error
SR-SV: Statistical recurrent stochastic volatility
SSE: Shanghai Stock Exchange
SSEC: Index of securities traded on the Shanghai Stock Exchange
SSM: Single score model
SVA: Surface and variational autoencoders
SV: Stochastic volatility
SVM: Support vector machine
SVR: Support vector regression
SZSC: Shenzhen Stock Exchange Composite Index
SZSE: Shenzhen Securities Composite Index
TSWE: Taiwan Stock Exchange Weighted Stock Index
TSX250: Toronto Stock Exchange Index
VAR: Value at risk
VECHAR: Vector error correction heterogeneous autoregressive
VIX: Implied volatility index
VIX3M: CBOE S&P 500 3-month volatility
VIXY-IV: Proshares VIX short-term future
WoS: Web of Science

Appendix A

Table A1. Selected papers.

Appendix B

Table A2. Stock market volatility factors.

| Category | ID | Factor | Description | Source |
|---|---|---|---|---|
| News | F1 | Company-related events | News about decisions made by the CEOs of companies listed on stock markets. | [A1] |
| News | F2 | Announcements | Announcements of new investments or developments of innovative services or products that are promoted or postponed. | [A1] |
| News | F3 | Market news | Sectoral news of the impact of performance on the economy. Among them, microblog news [A24], bad news, and good news [A33]. | [A2], [A24], [A33] |
| News | F4 | Viral posts | Posts about rumors or about the spread of news on social networks such as X, e.g., high-volume posts [A1]. | [A1] |
| News | F5 | Technical analysis reports | Valuation reports of companies listed on stock markets by risk rating agencies or valuation companies. | [A1] |
| News | F6 | Privileged information | Advance proprietary information leaking into the marketplace. | [A3] |
| Health | F7 | COVID-19 pandemic | An example of this is the health crisis caused by the COVID-19 virus, which caused a high degree of uncertainty and volatility in stock markets [A25]. | [A25] |
| Irrationality | F8 | Greed | Behavior that induces investors to buy more shares without considering the fundamentals and is characteristic of the last stages of bull markets because of the effect of investors’ irrationality [A25]. | [A3], [A5], [A25] |
| Irrationality | F9 | Fear | Behavior that induces the selling of shares due to concerns about unfavorable events or economic crises resulting from reversal events and volatility [A25]. | [A3], [A5], [A25] |
| Economics | F10 | Financial crises | A structural crisis involving the banking system and the monetary system that manifests itself in bank failures, credit crunches, increased public deficits, and sovereign debt. Examples are the 2008 subprime crisis [A8], the European sovereign debt crisis, the Greek crisis due to external debt default, and the 2015 Chinese market financial turbulence [A12]. | [A8], [A12] |
| Economics | F11 | Overvaluation of shares | An example is the dot-com crash caused by the overvaluation of technology companies in 2000. | [A8] |
| Economics | F12 | Falling commodity prices | The fall in oil prices between 2014 and 2015 of nearly 70% in the face of oversupply and the weak reaction of economic activity in emerging countries. | [A12] |
| Politics | F13 | 2016 Brexit referendum | Withdrawal of the United Kingdom from the European Union by referendum. | [A12] |
| Politics | F14 | New regulations | Regulations imposed on a sector. An example is the decision to stop producing combustion cars in the European Union as of 2035. | [A8] |
| War | F15 | Terrorist attacks | Terrorist attack of 11 September 2001, which impacted financial markets in the affected countries. | [A17] |
Table A3. Algorithms used for volatility forecasting. Where a cell lists several entries separated by semicolons, they appear in the same order as in the source table.

| Reference | Dataset (b) | Period | Forecast | Window (W)/Horizon (H), days | Algorithm/Method (a) | Performance (c) |
|---|---|---|---|---|---|---|
| [A4] | S&P 500; DJIA; SSE; SZSE | 2000–2011 | Realized volatility | H = one day ahead | Elman + RNN + FM | MAPE (2.0552), MAE (228.2216), RMSE (298.6005); MAPE (2.3703), MAE (227.0823), RMSE (290.28); MAPE (1.6545), MAE (47.0132), RMSE (68.2408); MAPE (1.6545), MAE (47.0132), RMSE (293.7208) |
| [A6] | S&P500; HS300 | 2010–2017 | Returns | - | LS + SVM + IPSO | MAE (0.00083), MSE (0.0000029), RMSE (0.00328); MAE (0.00051), MSE (0.0000017), RMSE (0.00311) |
| [A7] | S&P500 | 2006–2010 | Price volatility | H = 1 | FSVM | NMSE (0.7425), MAE (0.0081) |
| [A8] | S&P500 | 2000–2007; 2001–2008; 2002–2009; 2009–2016; 2010–2017 | True realized volatility | W = 10 | ANN | RMSE (0.00274) |
| [A9] | Nasdaq companies: CAT, EBAY, and MSFT | 2745 days | Realized volatility | H = 1, 5 days, 21 days | HAR + PSO + ESN | R² (0.297), MSE (1.11 × 10^−7) |
| [A10] | KOSPI 200 Stock index | 2001–2011 | Realized volatility | W = 7, 15, and 22 days/H = 1, 14, and 21 days | GEW + LSTM | MAE (0.01069), MSE (0.00149), HMAE (0.42911), HMSE (0.23492) |
| [A11] | S&P500; Dow Jones; Nasdaq | 2019–2020 | Historical volatility | W = 39 for S&P500 and Dow Jones/35 days for Nasdaq | LSTM + likelihood-based loss function | MSE (no comparable values) |
| [A13] | SSEC; SZSC | 1991–2010 | Volatility | H = 50–100 | SVM + MSM | MSE (3.669 × 10^−7), R² (0.512); MSE (1.40 × 10^−7), R² (0.76) |
| [A14] | DJIA | 2008–2017 | Volatility estimators | W = 21; H = 1 day | LSTM | RMSE (0.0131), SMAPE (5.01) |
| [A15] | S&P500; Nasdaq; DAX; KOSPI; IPC Mexico | 2010–2016 | Volatility | H = one day ahead | LSTM | MSE (0.000022), MAPE (0.138758); MSE (0.000030), MAPE (0.139860); MSE (0.000044), MAPE (0.133250); MSE (0.000024), MAPE (0.214972); MSE (0.000021), MAPE (0.136609) |
| [A16] | Nasdaq | 2001–2010 | Actual volatility; Stochastic volatility | - | SVR + CGA | RMSE (3.8076), NMSE (0.6304), QLIKE (1.2517) |
| [A18] | DAX30; HS150; CAC40; S&P500; TSX250 | 2004–2016; 2003–2015; 2004–2016; 2004–2016; 2004–2016 | Stochastic volatility | - | SR + SV + RNN | MSE (0.098), MAE (0.023), QLIKE (0.851), R²LOG (0.40); MSE (0.060), MAE (0.150), QLIKE (0.361), R²LOG (0.291); MSE (0.090), MAE (0.209), QLIKE (0.896), R²LOG (0.34); MSE (0.108), MAE (0.224), QLIKE (0.316), R²LOG (0.692); MSE (0.081), MAE (0.195), QLIKE (0.119), R²LOG (0.597) |
| [A19] | SSE of 50 stocks | 2015–2020 | Realized volatility | W = 1037 5 min block; H = 5 min ahead | HARQ + J + LSTM; HARQ + FJ + LSTM | MSE (1 × 10^−6), HMSE (1), MAE (0.1056), HMAE (0.0614), R²LOG (0.0194), QLIKE (1); MSE (1 × 10^−6), HMSE (1), MAE (1 × 10^−2), HMAE (1), R²LOG (1), QLIKE (1) |
| [A20] | S&P500; Bovespa | 2008–2016; 2007–2016 | Volatility | H = 1 period | SVR + GMWK | MSE (2.541976 × 10^−8); MSE (1.620612 × 10^−7) |
| [A22] | TSWE; NASDAQ; HSI; SSE | 2010–2013 | Realized volatility | W = 22; H = one step ahead | FSVR | MAFE (0.1278), MPFE (0.2506), MSFE (0.0476); MAFE (0.1409), MPFE (0.2973), MSFE (0.0551); MAFE (0.2957), MPFE (0.3459), MSFE (0.2114); MAFE (0.5695), MPFE (0.3781), MSFE (1.5807) |
| [A23] | Dow Jones Islamic | 1999–2016 | Returns | H = one step ahead | FIAPARCH + NNBP | MSE (0.506785126), RMSE (0.711888423), NMSE (0.466885252) |
| [A26] | S&P500; KOSPI200 | 2016–2019 | Implied volatility surface | W = 128 | SABR | Relative error (not comparable data) |
| [A29] | S&P 500; VIX | 2000–2017 | Realized volatility (VIX index forecast) | W = 5 min; H = one step ahead | ANN | - |
| [A31] | Dow Jones (28 assets); S&P500 (92 assets); NASDAQ (100 assets) | 2012–2017 | Realized volatility | W = 20; H = one step ahead | LSTM | MSE (0.17), QLIKE (0.1620), Pearson (0.72) |
| [A32] | CSI300; S&P500; STOXX50 | 2005–2020 | Realized volatility | W = 5, 21, and 252; H = one step ahead | LSTM + CEEMDAN | MSE (1.54 × 10^−9), MAE (2.39 × 10^−5), HMSE (0.1346), HMAE (0.2572); MSE (3.85 × 10^−10), MAE (9.44 × 10^−6), HMSE (0.5556), HMAE (0.4972); MSE (6.37 × 10^−10), MAE (1.47 × 10^−5), HMSE (6.41 × 10^2), HMAE (1.4923) |
| [A34] | Options of S&P500 | 2009–2020 | Implied volatility | H = one step ahead | SVA + DNN + LSTM | RMSE (0.0205); MAPE (7.60%) |
| [A35] | S&P500 | 1998–2002 | CBOE-implied volatility index (VIX) | W = 20; H = following day | RNN | MSPE (0.033) |
| [A36] | S&P500 | 2000–2020 | Realized volatility | W = 90 s; H = following day | AHE | MFE (0.000449), MSE (0.000046); MAPE (4.853296), RAE (0.578949) |
| [A37] | S&P500 | 2000–2020 | Realized volatility | H = one day ahead | RNN | MSE (0.0031), MAPE (0.2618) |
| [A38] | Dow Jones | 2001–2017 | Realized volatility | H = one day ahead | L-HAR-X; L-NN | MSE (0.880) |
| [A39] | S&P 500; FTSE-100; KSE-100; KLSE; BSESN | 2013–2020 | Realized volatility | H = one day ahead | DCC-GARCH-MANN | RMSE (7.123), MAE (1.467), RMAE (0.336) |
| [A40] | SSEC; S&P500; FTSE | 2000–2022 | Realized volatility | H = 5/22 min ahead | GVMD-Q-DBN-LSTM-GRU | MAE (3.5036 × 10^−5), MSE (8.6934 × 10^−9), HMAE (0.5650), HMSE (1.0084) |

(a) Abbreviations: SVR: support vector regression; CGA: chaotic genetic algorithm; FSVM: fuzzy support vector machine; RNN: recurrent neural network; ANN: artificial neural network; ESN: echo state network; SVM: support vector machine; PSO: particle swarm optimization; LS-SVM-IPSO: least square SVM-improved PSO; HAR: heterogeneous autoregressive; L-HAR-X: log volatility HAR with exogenous variables; L-NN: log volatility neural network; GEW: GARCH, EGARCH, and EWMA; MSM: Markov switching multifractal; VECHAR: vector error correction heterogeneous autoregressive; SR-SV: statistical recurrent stochastic volatility; FSVR: fuzzy vector machine for regression; SABR: stochastic alpha beta rho model; SVA: surface and variational autoencoders; AHE: asymmetric Hurst exponents; GMWK: Gaussian and Morlet wavelet kernels; FM: factorization machine; HARQ-J: heterogeneous autoregressive quarticity jump model; HARQ-F-J: full heterogeneous autoregressive quarticity jump model; DCC-GARCH-MANN: dynamic conditional correlation GARCH multivariate ANN; GVMD-Q-DBN-LSTM-GRU: grey wolf optimizer variational mode decomposition-Q-learning-deep belief network–LSTM-gated recurrent unit; CEEMDAN: complete ensemble empirical mode decomposition with adaptive noise; DNN: deep neural networks; FIAPARCH-NNBP: fractionally integrated asymmetric power autoregressive conditional heteroscedasticity with neural network back propagation.
(b) Abbreviations: HS300: Chinese Shanghai and Shenzhen 300 Composite Index; SSEC: Shanghai Stock Exchange Composite Index; SZSC: Shenzhen Stock Exchange Composite Index; DAX30: German Stock Index; HS150: Hong Kong Stock Index; CAC40: France Market Index; TSX250: Canada Market Index; DJIA: Dow Jones Industrial Average Index; TSWE: Taiwan Stock Exchange Weighted Stock Index; HSI: Hang Seng Index; SSE: Shanghai Stock Exchange; SZSE: Shenzhen Securities Composite Index; IPC Mexico: Mexico price and quotations index.
(c) Abbreviations: MSE: mean square error; MAE: mean absolute error; RMSE: root mean square error; NMSE: normalized mean square error; VAR: value at risk; SMAPE: symmetric mean absolute percentage error; MAFE: mean absolute forecast error; MPFE: mean percentage forecast error; QLIKE: quasi-likelihood; HMSE: MSE-adjusted heteroscedastic; HMAE: MAE-adjusted heteroscedastic; MSPE: mean square prediction error; RMAE: relative mean absolute error.
Table A4. Simulation models to predict volatility. Where a cell lists several entries separated by semicolons, they appear in the same order as in the source table.

| Article | Simulation Model | Dataset (a) | Period | Measurement | Method (b) | Performance Criteria (c) |
|---|---|---|---|---|---|---|
| [A18] | Statistical Recurrent Stochastic Volatility Model | DAX30; HSI50; CAC40; S&P500; TSX250 | 2004–2016; 2003–2015; 2004–2016; 2004–2016; 2004–2016 | Stochastic Volatility | SR-SV using Density Tempered Sequential Monte Carlo | PPS (1.720), QLIKE (0.588), R²LOG (0.399), MSE (0.733), MAE (0.548); PPS (1.127), QLIKE (0.355), R²LOG (0.294), MSE (0.060), MAE (0.152); PPS (1.381), QLIKE (0.856), R²LOG (0.354), MSE (0.095), MAE (0.210); PPS (1.381), QLIKE (0.856), R²LOG (0.354), MSE (0.095), MAE (0.210); PPS (1.113), QLIKE (0.330), R²LOG (0.527), MSE (0.091), MAE (0.209) |
| [A21] | Quantile Autoregression Neural Network Model | HSI; S&P500; FTSE100 | 2008–2013 | Value at Risk | Riskmetric; GARCH + EVT; CAViaR; QRNN; APARCH; PCC; QARNN-1; QARNN-2 (Monte Carlo) | Risk (9.00), RMSE (5.17), MAE (3.90); Risk (0.35), RMSE (3.65), MAE (2.89); Risk (0.42), RMSE (4.15), MAE (1.65); Risk (1.80), RMSE (1.45), MAE (0.41); Risk (11.57), RMSE (4.63), MAE (2.56); Risk (16.68), RMSE (4.86), MAE (1.89); Risk (0.08), RMSE (1.18), MAE (1.72); Risk (0.19), RMSE (0.69), MAE (0.19) |
| [A26] | Candidate Point Selection Using a Self-Attention Mechanism | S&P500; KOSPI200 | 2016–2019 | Volatility Surface | SABR Model | Relative error (mean and standard deviation): SSM + CSM (transformer): 1.2741, 1.0532; SSM only (transformer): 1.4739, 1.2784; SSM + CSM (MLP): 1.5850, 1.1235; SSM only (MLP): 1.7403, 1.1373; SSM + CSM (CNN): 1.6732, 1.2315; SSM only (CNN): 1.8518, 1.2682; SSM + CSM (SVR): 1.7702, 1.1239; SSM only (SVR): 2.2789, 1.0545; SSM + CSM (transformer): 1.3130, 0.7186; SSM only (transformer): 1.4989, 0.8686; SSM + CSM (MLP): 2.1013, 1.6081; SSM only (MLP): 2.6868, 1.9327; SSM + CSM (CNN): 1.2863, 0.7259; SSM only (CNN): 1.5581, 0.8248; SSM + CSM (SVR): 1.9798, 1.4533; SSM only (SVR): 2.7641, 1.9963 |
| [A27] | Robust Kalman Filter Based on a Fuzzy GARCH Model Using Particle Swarm Optimization | TWSE; HSI; N225 | 1992–2012 | Volatility | Fuzzy + GARCH Model + PSO | MAFE (0.0715), MPFE (0.0564), MSFE (0.0127); MAFE (0.0684), MPFE (0.0473), MSFE (0.0113); MAFE (0.0760), MPFE (0.0334), MSFE (0.1469) |
| [A30] | Meta-Model | S&P500 | 2010–2014 | Volatility; Market Capitalization | Quantitative Simulative Empirical Model; Monte Carlo | ANOVA |

(a) Abbreviations: SSE: Shanghai Stock Exchange index of the Shanghai Stock Exchange; HSI: Hang Seng Index of the Hong Kong Stock Exchange; FTSE: index of the London Stock Exchange; SSEC: index of securities traded on the Shanghai Stock Exchange; KOSPI: South Korean Stock Index; N225: Nikkei Index of the Tokyo Stock Exchange.
(b) Abbreviations: SR-SV: stochastic volatility recurrent statistics; CAViaR: conditional autoregressive value at risk; QRNN: quantile regression neural network model; GARCH-EVT: generalized autoregressive conditional heteroscedastic autoregressive model with extreme value theory; QARNN: quantile autoregressive neural network; GA: genetic algorithm; PSO: particle swarm optimization; SSM: single score model; CSM: combination score model; MLP: multilayer perceptron; SABR: stochastic alpha beta rho model.
(c) Abbreviations: MSFE: mean-squared forecast error; ANOVA: analysis of variance.

References

  1. Allen, D., & Hooper, V. (2018). Generalized correlation measures of causality and forecasts of the VIX using non-linear models. Sustainability, 10, 2695.
  2. Alostad, H., & Davulcu, H. (2019). Directional prediction of stock prices using breaking news on Twitter. Web Intelligence, 15, 1–17.
  3. Ayala, J., García-Torres, M., Noguera, J., Vázquez, J., Gómez-Vela, F., & Divina, F. (2021). Technical analysis strategy optimization using a machine learning approach in stock market indices. Knowledge-Based Systems, 225, 107119.
  4. Bekiros, S., & Georgoutsos, D. (2008a). Direction-of-change forecasting using a volatility-based recurrent neural network. Journal of Forecasting, 27, 407–417.
  5. Bekiros, S., & Georgoutsos, D. (2008b). Non-linear dynamics in financial asset returns: The predictive power of the CBOE volatility index. The European Journal of Finance, 14, 397–408.
  6. Berger, D., Chaboud, A. P., & Hjalmarsson, E. (2009). What drives volatility persistence in the foreign exchange market? Journal of Financial Economics, 94, 192–213.
  7. Bezerra, P., & Albuquerque, P. (2019). Volatility forecasting: The support vector regression can beat the random walk. Economic Computation and Economic Cybernetics Studies and Research, 53, 115–126.
  8. Campbell, H., & Viceira, J. (2002). Strategic asset allocation. Oxford University Press.
  9. CBOE Volatility Index (^VIX). (2022). Available online: https://finance.yahoo.com/chart/%5EVIX? (accessed on 20 October 2022).
  10. Chkili, W., & Hamdi, M. (2021). An artificial neural network augmented GARCH model for Islamic stock market volatility: Do asymmetry and long memory matter? International Journal of Islamic and Middle Eastern Finance and Management, 14, 853–873.
  11. Cho, P., & Lee, M. (2022). Forecasting the volatility of the stock index with deep learning using asymmetric hurst exponents. Fractal and Fractional, 6, 394.
  12. Christensen, K., Siggaard, M., & Veliyev, B. (2023). A machine learning approach to volatility forecasting. Journal of Financial Econometrics, 21, 1680–1727.
  13. De Gaetano, D. (2019). Forecasting volatility using combination across estimation windows: An application to S&P500 stock market index. Mathematical Biosciences and Engineering, 16, 7125–7216.
  14. Dow Jones–DJIA–100 Year Historical Chart. (2023). Available online: https://www.macrotrends.net/1319/dow-jones-100-year-historical-chart (accessed on 20 October 2023).
  15. Fatima, S., & Uddin, M. (2022). On the forecasting of multivariate financial time series using hybridization of DCC-GARCH model and multivariate ANNs. Neural Computing and Applications, 34, 21911–21925.
  16. Gao, T., & Chai, Y. (2018). Improving stock closing price prediction using recurrent neural network and technical indicators. Neural Computation, 30, 2833–2854.
  17. Gao, Y., He, D., Mu, Y., & Zhao, H. (2023). Realised volatility prediction of high-frequency data with jumps based on machine learning. Connection Science, 35, 2210625.
  18. Gong, X., Liu, X., Xiong, X., & Zhuang, X. (2019). Forecasting stock volatility process using improved least square support vector machine approach. Soft Computing, 23, 11867–11881.
  19. Greenspan, A. (2008). The age of turbulence: Adventures in a new world. Penguin Books.
  20. Hayes, A. (2022). Tulipmania: About the Dutch tulip bulb market bubble. Available online: https://www.investopedia.com/terms/d/dutch_tulip_bulb_market_bubble.asp (accessed on 25 March 2024).
  21. Hoffmann, F., Allers, K., Rombey, T., Helbach, J., Hoffmann, A., Mathes, T., & Pieper, D. (2021). Nearly 80 systematic reviews were published each day: Observational study on trends in epidemiology and reporting over the years 2000–2019. Journal of Clinical Epidemiology, 138, 1–11.
  22. Hördahl, P., & Packer, F. (2007). Understanding asset prices: An overview. Bank for International Settlements. Available online: https://www.bis.org/publ/bppdf/bispap34.htm (accessed on 14 July 2023).
  23. Hu, H., Tang, L., Zhang, S., & Wang, H. (2018). Predicting the direction of stock markets using optimized neural networks with Google Trends. Neurocomputing, 285, 188–195.
  24. Hung, J. (2015). Robust Kalman filter based on a fuzzy GARCH model to forecast volatility using particle swarm optimization. Soft Computing, 19, 2861–2869.
  25. Hung, J. (2016). Fuzzy support vector regression model for forecasting stock market volatility. Journal of Intelligent & Fuzzy Systems, 31, 1987–2000.
  26. International Organization for Standardization. (2018). Second edition: Risk management—Guidelines: 8.2 (ISO Standard 31000:2018). International Organization for Standardization.
  27. Kaczmarek, T., Będowska-Sójka, B., Grobelny, P., & Perez, K. (2022). False safe haven assets: Evidence from the target volatility strategy based on recurrent neural network. Research in International Business and Finance, 60, 101610.
  28. Khan, W., Ghazanfar, M. A., Azam, M. A., Karami, A., Alyoubi, K. H., & Alfakeeh, A. S. (2022). Stock market prediction using machine learning classifiers and social media, news. Journal of Ambient Intelligence and Humanized Computing, 13, 3433–3456.
  29. Khashanah, K., & Alsulaiman, T. (2016). Network theory and behavioral finance in a heterogeneous market environment. Complexity, 21, 530–554.
  30. Kim, H., Park, K., Jeon, J., Song, C., Bae, J., Kim, Y., & Kang, M. (2021). Candidate point selection using a self-attention mechanism for generating a smooth volatility surface under the SABR model. Expert Systems with Applications, 173, 11640.
  31. Kim, H. Y., & Won, C. H. (2018). Forecasting the volatility of stock price index: A hybrid model integrating LSTM with multiple GARCH-type models. Expert Systems with Applications, 103, 25–37.
  32. Lahmiri, S. (2015). Intraday stock price forecasting based on variational mode decomposition. Journal of Computational Science, 12, 23–27.
  33. Lame, G. (2019). Systematic literature reviews: An introduction. Proceedings of the Design Society: International Conference on Engineering Design, 1(1), 1633–1642.
  34. Lin, C., Chen, C. S., & Chen, A. (2018). Using intelligent computing and data stream mining for behavioral finance associated with market profile and financial physics. Applied Soft Computing, 68, 756–764.
  35. Lin, Y., Lin, Z., Liao, Y., Li, Y., Xu, J., & Yan, Y. (2022). Forecasting the realized volatility of stock price index: A hybrid model integrating CEEMDAN and LSTM. Expert Systems with Applications, 206, 117736.
  36. Litimi, H., BenSaïda, A., Belkacem, L., & Abdallah, O. (2018). Chaotic behavior in financial market volatility. Journal of Risk, 21, 27–53.
  37. Liu, F., & Wang, J. (2012). Fluctuation prediction of stock market index by Legendre neural network with random time strength function. Neurocomputing, 83, 12–21.
  38. McMillan, L. (2002). Options as a strategic investment (5th ed.). Penguin.
  39. Mongeon, P., & Paul-Hus, A. (2016). The journal coverage of Web of Science and Scopus: A comparative analysis. Scientometrics, 106, 213–228.
  40. Moon, K., & Kim, H. (2019). Performance of deep learning in prediction of stock market volatility. Economic Computation and Economic Cybernetics Studies and Research, 53, 77–92.
  41. Mulcahy, R. (2022). PMP exam prep, what you really need to know to pass the exam (10th ed.). RMC Publications.
  42. Naidu, S. (2018). Managing fiscal volatility in the Pacific. MPFD Policy Briefs No. 75. United Nations ESCAP. Available online: https://econpapers.repec.org/scripts/redir.pf?u=http%3A%2F%2Fwww.unescap.org%2Fsites%2Fdefault%2Ffiles%2FMPFD%2520Policy%2520Brief%252075_Managing%2520fiscal%2520volatility%2520in%2520the%2520Pacific.pdf;h=repec:unt:pbmpdd:pb75 (accessed on 28 February 2023).
  43. Nguyen, T., Tran, M., Gunawan, D., & Kohn, R. (2023). A statistical recurrent stochastic volatility model for stock markets. Journal of Business and Economic Statistics, 41, 414–428.
  44. Ou, P., & Wang, H. (2014). Volatility modelling and prediction by hybrid support vector regression with chaotic genetic algorithms. The International Arab Journal of Information Technology, 11, 287–292. Available online: https://iajit.org/PDF/vol.11,no.3/4788.pdf (accessed on 28 February 2023).
  45. Page, M., Moher, D., Bossuyt, P., Boutron, I., Hoffmann, T., Mulrow, C., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., … McKenzie, J. E. (2021). PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ, 372, n160.
  46. Petneházi, G., & Gáll, J. (2019). Exploring the predictability of range-based volatility estimators using recurrent neural networks. Intelligent Systems in Accounting, Finance and Management, 4, 774–785.
  47. Petrozziello, A., Troiano, L., Serra, A., Jordanov, I., Storti, G., Tagliaferri, R., & La Rocca, M. (2022). Deep learning for volatility forecasting in asset management. Soft Computing, 26, 8553–8574.
  48. Poon, S., & Granger, C. (2003). Forecasting volatility in financial markets: A review. Journal of Economic Literature, 41, 478–539.
  49. Ramos-Pérez, E., Alonso-González, P., & Núñez-Velázquez, J. (2019). Forecasting volatility with a stacked model based on a hybridized artificial neural network. Expert Systems with Applications, 129, 1–9.
  50. Rhoads, R. (2011). Trading VIX derivatives: Trading and hedging strategies using VIX futures, options, and exchange-traded notes. John Wiley & Sons, Inc.
  51. Ross, S. (2018). Finanzas corporativas (13th ed.). McGraw-Hill.
  52. Seo, M., Lee, S., & Kim, G. (2019). Forecasting the volatility of stock market index using the hybrid models with Google domestic trends. Fluctuation Noise Letters, 18, 1950006.
  53. Sezer, O. B., Gudelek, M. U., & Ozbayoglu, A. M. (2020). Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Applied Soft Computing, 90, 106181.
  54. Shen, C., Feng, L., & Li, Y. (2018). A hybrid information capturing methodology for price volatility and its application to financial markets. Journal of Intelligent & Fuzzy Systems, 35, 405–414.
  55. Sheu, S., Lin, C., Lu, S., Tsai, H., & Chen, Y. (2016). Forecasting the volatility of a combined multi-country stock index using GWMA algorithms. Expert Systems, 35, e12248.
  56. Shiguihara, P., De Andrade Lopes, A., & Mauricio, D. (2021). Dynamic Bayesian network modeling, learning, and inference: A survey. IEEE Access, 9, 117639–117648.
  57. Stapf, J., & Werner, T. (2003). How wacky is the DAX? The changing structure of German stock market volatility. (Bundesbank Series 1 Discussion Paper No. 2003, 18p). SSRN.
  58. Stoner, J., & Freeman, R. (1992). Management. Prentice Hall.
  59. Tello, N. (1998). Periodismo actual: Guía para la acción. Ediciones Colihue SRL.
  60. Trierweiler Ribeiro, G., Portela Santos, A., Cocco Mariani, V., & dos Santos Coelho, L. (2020). Novel hybrid model based on echo state neural network applied to the prediction of stock price return volatility. Expert Systems with Applications, 184, 115490.
  61. Villegas-Ortega, J., Bellido-Boza, L., & Mauricio, D. (2021). Fourteen years of manifestations and factors of health insurance fraud, 2006–2020: A scoping review. Health and Justice, 9, 26.
  62. Wallace, M., & Webber, L. (2011). The disaster recovery handbook (2nd ed.). Amacom.
  63. Wang, B., Huang, H., & Wang, X. (2013). A support vector machine based MSM model for financial short-term volatility forecasting. Neural Computing and Applications, 22, 21–28.
  64. Wang, W., Tang, S., & Li, M. (2021). Advantages of combining factorization machine with Elman neural network for volatility forecasting of stock market. Complexity, 2021, 6641298.
  65. Wang, W., & Yang, F. (2018). The shale revolution, geopolitical risk, and oil price volatility. Social Science Research Network.
  66. Withington, J. (2013). Historia mundial de los desastres. Turner.
  67. Xu, Q., Liu, X., Jiang, C., & Yu, K. (2016). Quantile autoregression neural network model with applications to evaluating value at risk. Applied Soft Computing, 49, 1–12.
  68. Yu, Y., Lin, Y., Hou, X., & Zhang, X. (2023). Novel optimization approach for realized volatility forecast of stock price index based on deep reinforcement learning model. Expert Systems with Applications, 233, 120880.
  69. Zhang, S., & Fang, W. (2021). Multifractal behaviors of stock indices and their ability to improve forecasting in a volatility clustering period. Entropy, 23, 1018.
  70. Zhang, W., Li, L., & Zhang, G. (2023). A two-step framework for arbitrage-free prediction of the implied volatility surface. Quantitative Finance, 23, 21–34.
Figure 1. Review process for factors, forecasts, and simulations of volatility in the stock market using machine learning based on PRISMA guidelines.
Figure 2. Publications by year.
Figure 3. Number of publications per journal quartile.
Figure 4. Publications by continent.
Figure 5. Impact of events, conjunctures, and structures on the magnitude and duration of volatility.
Figure 6. Machine learning algorithms applied to volatility forecasting.
Table 1. Criteria for the inclusion and exclusion of articles.
| Inclusion Criteria | Exclusion Criteria |
| --- | --- |
| Responds to at least one research question. | Areas other than computer science, finance, and economics. |
| Type of article: journals indexed in Scopus and WoS. | Other objects of study, such as cryptocurrency volatility, exchange rates, and commodities. |
| Language: English. | Other aspects of study, such as econometric methods. |
| Period: January 2003–December 2023. | Conference papers are not considered. |
Table 2. Authors' impact.
| Rank | Author | h-Index | Citation(s) in Scopus |
| --- | --- | --- | --- |
| 1 | Bekiros S.D. | 42 | 37 |
| 2 | Xu, Qifa | 26 | 41 |
| 3 | Allen, David E. | 21 | 13 |
| 4 | Hu, Hongping | 18 | 151 |
| 5 | Poon, S.-H. | 17 | 1139 |
| 6 | Gong, Xiao-Li | 12 | 26 |
| 7 | Christensen, Kim | 12 | 46 |
| 8 | Moon, Kyoung-Sook | 12 | 39 |
| 9 | Lin, Yu | 11 | 64 |
| 10 | Chkili, Walid | 11 | 8 |
| 11 | Hung, Jui-Chung | 10 | 10 |
| 12 | Petrozziello, Alessio | 10 | 12 |
| 13 | Kim, Ha Young | 7 | 522 |
| 14 | Gao, Tingwei | 7 | 70 |
| 15 | Khashanah, Khaldoun | 7 | 18 |
| 16 | Wang, Baohua | 6 | 24 |
| 17 | Wang, Fang | 6 | 3 |
| 18 | Cho, Poongjin | 5 | 11 |
| 19 | Ou, Phichlang | 5 | 2 |
| 20 | Khan, Wasiat | 3 | 120 |

Source: authors.
Table 3. Concept of volatility.
| Id | Concept | Source | Keyword: Return | Keyword: Risk | Keyword: Indicator | Keyword: Period |
| --- | --- | --- | --- | --- | --- | --- |
| D01 | Indicator of uncertainty associated with the profitability of an asset, which tends to play an important role in risk models. | [A8] | X | X | X | - |
| D02 | Degree to which the price of an asset fluctuates and measures the level of uncertainty or risk. | [A10], [A11] | - | X | X | - |
| D03 | Measure of hazard, quantifying dispersion that is not directly observable. | [A14] | - | - | X | - |
| D04 | Variance of returns, serving as a measure of the uncertainty of the returns. | [A18] | X | X | X | - |
| D05 | Measurement of risk in financial markets. | [A22] | - | - | X | - |
| D06 | Speed with which the price or value of a stock, future, or index changes over a period. | [A17] | - | - | X | X |
| D07 | Standard measure of risk in the financial market. | [A34] | - | - | X | - |
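As a rough illustration of definition D04 (volatility as the variance of returns), the following Python sketch computes the sample variance of log returns and an annualized volatility from a price series. It is not taken from any of the reviewed articles: the function names, the closing prices, and the 252-trading-day annualization convention are all assumptions made for the example.

```python
import math


def log_returns(prices):
    """Return the log returns ln(P_t / P_{t-1}) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def historical_volatility(prices, periods_per_year=252):
    """Annualized volatility: sqrt(variance of log returns * periods per year).

    periods_per_year=252 is an assumed trading-day convention.
    """
    return math.sqrt(sample_variance(log_returns(prices)) * periods_per_year)


# Hypothetical closing prices, for illustration only.
closes = [100.0, 101.5, 100.8, 102.3, 101.9, 103.4]
daily_var = sample_variance(log_returns(closes))
annual_vol = historical_volatility(closes)
```

Here `daily_var` corresponds to the variance in D04, while `annual_vol` is its square root scaled to a yearly horizon, which is the form in which volatility is usually quoted as a risk indicator.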
Table 4. Categories of factors that determine volatility.
| Factor Category | Description |
| --- | --- |
| News | The narration of events that interest the largest number of readers with or without a connection to those events (Tello, 1998). |
| Politics | A set of activities related to group decision-making and power relations among individuals, including resource allocation or status distribution. Political variables refer to factors that can impact an organization’s operations because of the political environment (Stoner & Freeman, 1992). |
| Irrationality | Irrational exuberance in the face of a warning of a possible overvaluation of stock markets (Greenspan, 2008). |
| Health | The collection of entities that exist in the world, whether naturally occurring or modified without human intervention, and pertains to the significant loss of materials and human lives resulting from natural events or phenomena, such as earthquakes, floods, tsunamis, and landslides (Withington, 2013). |
| Economics | General conditions and trends that can be factors in an organization’s operations (Stoner & Freeman, 1992). |
| War | The impact of geopolitical uncertainty generates volatility in the price of resources (oil, food, raw materials), which impacts real economic activities (W. Wang & Yang, 2018). |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Mansilla-Lopez, J.; Mauricio, D.; Narváez, A. Factors, Forecasts, and Simulations of Volatility in the Stock Market Using Machine Learning. J. Risk Financial Manag. 2025, 18, 227. https://doi.org/10.3390/jrfm18050227