Review

Navigating AI-Driven Financial Forecasting: A Systematic Review of Current Status and Critical Research Gaps

1 Department of Agricultural Logistics, Trade and Marketing, Institute of Agricultural and Food Economics, Hungarian University of Agriculture and Life Sciences, 7400 Kaposvár, Hungary
2 Department of Statistics, Finances and Controlling, Széchenyi István University, 9026 Győr, Hungary
3 HUN-REN, Centre for Economic and Regional Studies, Institute of Economics, 1097 Budapest, Hungary
* Author to whom correspondence should be addressed.
Forecasting 2025, 7(3), 36; https://doi.org/10.3390/forecast7030036
Submission received: 22 May 2025 / Revised: 10 July 2025 / Accepted: 11 July 2025 / Published: 14 July 2025
(This article belongs to the Section Forecasting in Computer Science)

Abstract

This systematic literature review explores the application of artificial intelligence (AI) and machine learning (ML) in financial market forecasting, with a focus on four asset classes: equities, cryptocurrencies, commodities, and foreign exchange markets. Guided by the PRISMA methodology, the study identifies the most widely used predictive models, particularly LSTM, GRU, XGBoost, and hybrid deep learning architectures, as well as key evaluation metrics, such as RMSE and MAPE. The findings confirm that AI-based approaches, especially neural networks, outperform traditional statistical methods in capturing non-linear and high-dimensional dynamics. However, the analysis also reveals several critical research gaps. Most notably, current models are rarely embedded into real or simulated trading strategies, limiting their practical applicability. Furthermore, the sensitivity of widely used metrics like MAPE to volatility remains underexplored, particularly in highly unstable environments such as crypto markets. Temporal robustness is also a concern, as many studies fail to validate their models across different market regimes. While data covering one to ten years is most common, few studies assess performance stability over time. By highlighting these limitations, this review not only synthesizes the current state of the art but also outlines essential directions for future research. Specifically, it calls for greater emphasis on model interpretability, strategy-level evaluation, and volatility-aware validation frameworks, thereby contributing to the advancement of AI’s real-world utility in financial forecasting.

1. Introduction

The application of artificial intelligence (AI) and machine learning (ML) in the financial sector is profoundly transforming the industry, with implications that extend to other fields and society as a whole (Li and Tang, 2020) [1]. From traditional hedge funds to the banking sector and fintech providers, many financial institutions are investing substantial resources in building expertise in data science and machine learning (Wall, 2018) [2]. The increasing availability of machine-readable data in finance, supported by growing computing power and storage capacities, has had a significant impact on the industry. Concurrently, this data-driven approach underscores the need to reassess and refine regulatory frameworks. In the aftermath of the 2007–2008 global financial crisis, substantial structural changes in financial regulation have focused on data-driven innovation and oversight, prompting a reevaluation of bank loan contracts and trading book stress-testing programs in Europe and the United States (Goodell et al., 2021) [3]. Financial professionals are increasingly interested in “alternative data” beyond standard corporate fundamentals, stock prices, and macroeconomic indicators. These sources include audio recordings, news articles, social media posts, and satellite imagery, all of which now significantly influence trading decisions (In et al., 2019) [4]. De Prado (2019) [5] highlights the complexity of these datasets, as they are often unquantifiable, unstructured, incomplete, and uncategorized. Consequently, such datasets are typically high-dimensional, meaning that the number of variables can match or exceed the number of observations (Duan et al., 2022) [6]. These characteristics make classical, linear econometric models unsuitable for deriving predictive and deterministic models from alternative data, because essential economic information remains hidden (Goulet Coulombe et al., 2022) [7]. 
Common geometric constructs, such as covariance matrices, fail to capture the topological properties characterizing network relationships in alternative datasets. In contrast, machine learning models offer the computational power and functional flexibility needed to map and interpret intricate patterns in high-dimensional spaces. Recent advances in machine learning have also improved our ability to apply scientific theories to determine relationships among variables, leading to deeper insights, more accurate forecasts, and more intuitive data visualization (Dixon et al., 2020) [8]. Classical econometric approaches cannot fully recognize and learn from the underlying interactions, often resulting in substantial biases. Machine learning thus provides solutions for identifying outliers, extracting features, and running classification or regression models effectively within these complex environments.
The potential of AI has garnered considerable scholarly interest, particularly in its financial applications. Researchers focus on key areas such as exchange rate forecasting (Nazareth and Reddy, 2023) [9], financial modeling (Chan and Hale, 2020) [10], financial risk prevention (Gao, 2022) [11], algorithmic trading (Martínez et al., 2019) [12], asset and derivative pricing (Houlihan and Creamer, 2021) [13], automation (Kokina et al., 2020) [14], fraud detection (Teng and Lee, 2019) [15], credit and insurance underwriting (Bee et al., 2021) [16], risk management (Li et al., 2021) [17], and sentiment analysis (Chen et al., 2020) [18]. Among these areas, our literature review concentrates on machine learning-based predictive models for exchange rate forecasting. This domain is inherently diverse and complex, producing an extensive body of academic work suitable for anchoring the current review.

1.1. Applications of Artificial Intelligence in Finance

1.1.1. The Efficient Market Hypothesis and Its Challenges

The Efficient Market Hypothesis (EMH), developed by Eugene Fama (1970) [19], claims that the prices of financial instruments fully reflect all available information and that new information is immediately incorporated into prices. Three forms of the theory are distinguished: weak, semi-strong, and strong efficiency. In weak-form efficient markets, past prices give investors no trading advantage; in semi-strong-form efficient markets, all publicly available information is incorporated into prices; and in strong-form efficient markets, even insider information provides no trading advantage (Fama, 1991) [20]. The EMH rests in part on random walk theory, which has been the focus of much academic research (Fama, 1965; Samuelson, 1965; Van Horne and Parker, 1967) [21,22,23]. The theory holds that market prices move randomly and therefore cannot be predicted. Levy's (1967) [24] study was among the first to attempt to refute the random walk theory, concluding that the powerful data-processing capabilities of electronic computers are invaluable for selecting investment strategies and for developing and evaluating market-timing methods. Empirical results have confirmed that technical analysis tools have a number of properties useful for predicting market price movements, leading Levy to conclude that stock prices follow recognizable trends and patterns that play an important role in forecasting. However, research and empirical results over the past decades increasingly show that markets do not always behave in a fully efficient way (Shiller, 2003) [25]. Market anomalies such as the momentum effect (Jegadeesh and Titman, 1993) [26] or the value effect (Fama and French, 1992) [27] suggest that certain investment strategies can consistently achieve better-than-market returns.
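The random walk claim can be illustrated with a short simulation. The sketch below (an illustrative example, not drawn from the cited studies) generates a driftless geometric random walk and confirms that its daily returns exhibit essentially no lag-1 autocorrelation, which is why past prices alone offer no predictive edge under the theory:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate 2500 daily log-returns (~10 trading years) of a driftless random walk.
returns = rng.normal(loc=0.0, scale=0.01, size=2500)
prices = 100.0 * np.exp(np.cumsum(returns))

# Lag-1 autocorrelation of returns: for a random walk it should be near zero,
# i.e., yesterday's return tells us nothing about today's.
autocorr = np.corrcoef(returns[:-1], returns[1:])[0, 1]
print(f"lag-1 return autocorrelation: {autocorr:.4f}")
```

Technical analysis in the spirit of Levy amounts to betting that real return series deviate from this benchmark, i.e., that the measured autocorrelation (or some richer pattern) is reliably nonzero.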
The use of artificial intelligence and machine learning in financial forecasting has opened a new dimension for challenging the efficient market and random walk theories, among others. AI-based models can detect complex patterns that are difficult to capture with traditional statistical models. According to Chopra and Sharma (2021) [28], deep learning can probe the limits of market efficiency by successfully identifying non-linear relationships in financial time series.

1.1.2. Intertwining Adaptive Markets Theory and Artificial Intelligence

The Adaptive Market Hypothesis (AMH) offers an alternative to the efficient markets hypothesis. AMH assumes that markets are not fully efficient but dynamically change and adapt as new information emerges. Classical economics views investors as rational decision-makers, but behavioral economics has shown that human psychological factors play a significant role in financial decisions (Kahneman and Tversky, 1979) [29]. Investors often adjust their decisions based on past experience and the evolution of the market environment, resulting in dynamic and adaptive markets (Lo and Zhang, 2024) [30]. AMH can explain market anomalies such as the momentum and value effects, which adaptive market theory attributes to investors adapting their behavior in pursuit of more profitable strategies. Machine learning can play a key role under the AMH because of its ability to adapt to market changes and continuously learn from new information, giving investors a significant advantage in maximizing profits. Dixon et al. (2020) [8] highlight in their study that neural networks and other AI-based models support adaptive behavior in market forecasting.

1.1.3. Asymmetric Information and the Role of Alternative Data Sources

According to the theory of asymmetric information (Akerlof, 1978; Stiglitz, 2000) [31,32], market participants hold very different amounts of information, which distorts their decision-making processes (Williams, 2005) [33]. Traditional financial analysis relied mostly on macroeconomic indicators and fundamental data, but trends over the last decades show that alternative data (social media, satellite imagery, search trends) play an increasingly important role in market decision-making (In et al., 2019) [4]. Ahern and Peress (2023) [34] argue that financial and social media play a critical role in transmitting information between market participants, improving decision-making and increasing market efficiency. Research by Sun et al. (2024) [35] suggests that social media platforms significantly influence investor decisions by providing rapid and widespread access to financial information. The integration of alternative data and artificial intelligence could therefore revolutionize financial decision-making: AI-based models can analyze large amounts of complex, unstructured data quickly and accurately, extracting information that would not be available to traditional models. This allows investors to better understand market dynamics and predict future price movements.

1.1.4. Big Data and the Issue of Information Efficiency

Shiller (2020) [36] argues that the efficiency of financial markets depends to a large extent on the type of information that investors take into account. Big Data-based financial models allow for the inclusion of other sources of information in the decision-making process beyond traditional fundamental analysis. Big Data, as the name implies, refers to data of a volume and variety that would be difficult or impossible to process and analyze in detail using traditional methodologies. Sources of such data may include transactional or geolocation data, social media posts, web data collections, and various mobile app data (Hasan et al., 2020) [37]. These data create the potential for financial institutions to gain real-time actionable insights into market trends and economic performance (Goldstein et al., 2024) [38]. Studies by Goulet Coulombe et al. (2022) [7] show that traditional econometric models are often unable to adequately handle these high-dimensional datasets, while machine learning algorithms are better adapted to dynamically changing market conditions. This enables new approaches to forecasting exchange rates and assessing market efficiency.

1.2. Novelty and Significance of the Study

Several recent reviews on applications of AI and machine learning for financial time-series prediction are available. Gandhmal and Kumar (2019) [39], Li and Bastos (2020) [40], and Kumbure et al. (2022) [41] review machine learning techniques applied to stock market trend or point predictions. Gandhmal and Kumar (2019) [39] conclude that Artificial Neural Networks (ANNs) and fuzzy-based techniques are the most promising among the reviewed machine learning approaches for accurate stock market predictions, a conclusion supported by Shi and Zhuang's (2019) [42] review of soft computing approaches, which finds ANN architectures to consistently outperform other machine learning models in point prediction accuracy. Li and Bastos (2020) [40] and Sezer et al. (2020) [43] show that RNN-based models such as Long Short-Term Memory (LSTM) implementations are the most popular in deep learning. Kumbure et al. (2022) [41] conclude that the most frequently utilized models are ANNs and Support Vector Machines (SVM), but that deep learning models like LSTM attract growing interest due to reports of robust and improved predictions. Khattak et al. (2023) [44] provide an in-depth review of machine learning methods applied to forecast various financial assets between 2018 and 2023 and find new hybrid integrations of LSTM and SVM architectures to be more effective than traditional models for point predictions. Gunnarsson et al. (2024) [45] survey the relevance of AI models for volatility predictions and report promising results across asset classes.
This systematic review differs from previous surveys in several key aspects. First, unlike general overviews that focus on machine learning in finance broadly, this study delivers a product-oriented analysis, systematically comparing four major asset classes: equities, cryptocurrencies, commodities, and foreign exchange markets. This segmentation enables a more granular understanding of how model performance and methodological choices vary across different financial environments.
Second, this review provides a critical synthesis of 100 selected studies published between 2014 and 2023, using rigorous inclusion and exclusion criteria. Many previous reviews either lacked explicit filtering procedures or did not eliminate low-quality sources. In contrast, this study emphasizes scholarly quality by excluding papers from non-peer-reviewed venues and by avoiding duplication and thematic noise.
Third, while prior reviews typically concentrate on model architectures and prediction accuracy, the present study integrates additional evaluation dimensions, such as evaluation metrics, hyperparameter tuning, feature selection, and explainability.
Moreover, this paper introduces an application-oriented critical perspective. It highlights the gap between predictive accuracy and real-world utility, especially in the context of dynamic, high-volatility markets such as crypto and commodities. To this end, the study includes dedicated sections on strategy-level model integration, robustness across market regimes, and volatility-aware evaluation frameworks, which are often missing from traditional literature reviews.
Finally, the paper aims not only to summarize but also to reframe the discourse by identifying underexplored research niches and suggesting a roadmap for future studies. These include the use of explainable AI, Generative AI, and meta-learning in finance, strategy-congruent validation approaches, and longitudinal model evaluation.
By addressing both thematic depth and methodological gaps, this study contributes a unique, high-resolution view of AI-driven financial forecasting and supports the advancement of both academic knowledge and practical implementation.

1.3. Research Questions

This systematic literature review seeks to synthesize the current state of research on the application of machine learning (ML) techniques in the prediction of financial time series. In a rapidly evolving domain where predictive accuracy is paramount for informed investment decisions, understanding the methodological landscape is crucial. The study is structured around seven specific research questions (RQs) that aim to map the scope, tools, and limitations of the existing literature. These research questions also serve as the backbone of the analytical framework employed throughout the review.
RQ1: Which product group is most frequently examined for predictive modeling?
This question aims to reveal which financial asset classes (e.g., equities, cryptocurrencies, commodities, forex) dominate the current ML forecasting literature. Identifying these focal points helps highlight any potential overrepresentation or underrepresentation of certain markets and sets the stage for recognizing methodological generalizability or asset-specific constraints.
RQ2: Which models are most commonly employed?
Given the diversity of machine learning approaches, from traditional regression models to advanced deep learning architectures, this question investigates which algorithms have gained the most traction in recent years. Special attention is paid to the frequency of usage of recurrent neural networks (RNNs), long short-term memory networks (LSTMs), convolutional neural networks (CNNs), and ensemble methods such as XGBoost. These models have been prioritized in the review due to their proven capacity to capture temporal dependencies, handle non-linearities, and adapt to volatile financial time series. In particular, LSTMs and GRUs are widely adopted for sequential data modeling, while ensemble methods like XGBoost offer robustness and strong generalization. Their increasing prevalence across academic and applied domains underscores their relevance and practical effectiveness in financial forecasting contexts.
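For readers unfamiliar with the gating mechanism that lets GRUs (and, analogously, LSTMs) capture temporal dependencies, the following is a minimal NumPy sketch of a single GRU cell's forward step. The dimensions and random weights are purely illustrative; the reviewed studies rely on trained library implementations (e.g., Keras or PyTorch):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU forward step: gates decide how much past state to keep."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # blend old and new state

rng = np.random.default_rng(0)
d_in, d_h = 4, 8  # illustrative sizes: 4 input features, 8 hidden units
# Alternate input-to-hidden (d_h x d_in) and hidden-to-hidden (d_h x d_h) weights.
params = [rng.normal(scale=0.1, size=(d_h, d_in)) if i % 2 == 0
          else rng.normal(scale=0.1, size=(d_h, d_h)) for i in range(6)]

h = np.zeros(d_h)
for t in range(5):                 # run over a toy 5-step input window
    x = rng.normal(size=d_in)
    h = gru_step(x, h, *params)
print("final hidden state shape:", h.shape)
```

The update gate `z` is what allows information from many steps back to persist in `h`, which is the property that makes these architectures suited to sequential financial data.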
RQ3: What performance benchmarking metrics are used?
Evaluation metrics such as RMSE (Root Mean Square Error), MAE (Mean Absolute Error), and MAPE (Mean Absolute Percentage Error) are widely adopted but may vary in suitability depending on asset class or market conditions. This question focuses on identifying the most frequently used metrics and critically reflecting on their limitations in high-volatility environments.
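The relative behavior of these metrics is easy to demonstrate. The sketch below (with illustrative values, not drawn from any reviewed study) computes RMSE, MAE, and MAPE, and shows how identical absolute errors produce a vastly larger MAPE when actual values are small and volatile, the weakness flagged above:

```python
import numpy as np

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    # Percentage error explodes when actual values approach zero -- a known
    # weakness in volatile, low-priced series such as some crypto assets.
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

y_stable = np.array([100.0, 101.0, 102.0, 103.0])   # calm, high-priced series
y_volatile = np.array([1.0, 0.5, 2.0, 0.2])          # volatile, low-priced series
err = np.array([0.5, -0.5, 0.5, -0.5])               # identical absolute errors

print("stable   MAPE:", mape(y_stable, y_stable - err))      # below 1%
print("volatile MAPE:", mape(y_volatile, y_volatile - err))  # far larger
```

RMSE and MAE report the same 0.5 error magnitude for both series; only MAPE's denominator makes the volatile series look orders of magnitude worse, which is why metric choice deserves critical reflection in high-volatility settings.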
RQ4: What is the typical length of databases used by researchers?
The predictive power of ML models is heavily dependent on the quantity and quality of historical data. By mapping the typical data span (e.g., 1–3 years vs. 10+ years) used across studies, this question sheds light on the temporal robustness of findings and whether models are tested under varying macroeconomic regimes.
RQ5: What statistical methods are applied?
Beyond ML models, many studies incorporate complementary statistical tools for feature selection, validation, or benchmarking. This question explores the integration of classical econometric techniques with machine learning frameworks, helping to understand how hybrid methodologies are employed in empirical settings.
RQ6: Which model is considered the most reliable and accurate?
This question seeks to synthesize performance findings across multiple studies to identify consensus on the most effective predictive algorithms for specific asset classes. It also considers under what market conditions (e.g., stable vs. volatile) certain models outperform others.
RQ7: What are the main research gaps and directions?
Finally, this question aims to uncover underexplored aspects of the field, including limitations in model generalizability, lack of integration with trading strategies, inadequate testing under stress periods, and insufficient treatment of transaction costs. Addressing these gaps not only strengthens the interpretability of the existing literature but also lays the groundwork for future empirical contributions.
Scientific Contribution
The contribution of this review is threefold. First, it offers a structured and comparative overview of methodological choices in financial forecasting with ML, grounded in a large body of peer-reviewed literature. Second, it critically evaluates the practical relevance and robustness of these models across different market regimes. Third, it identifies key research gaps that hinder the effective translation of predictive accuracy into actionable strategies, especially in turbulent economic periods. Together, these elements provide a comprehensive foundation for scholars, data scientists, and financial practitioners seeking to apply or extend ML-based forecasting models in finance.

2. Methodology

Using a standardized methodology to conduct a systematic literature review not only enhances the quality of the review but also enables other researchers to replicate and compare the findings. With this in mind, we conducted our literature search following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Nazareth and Reddy, 2023; Ardabili, 2022) [9,46].
The databases selected were Google Scholar, Science Direct, and Web of Science. Among these, Science Direct and Web of Science allowed more refined, multi-parameter searches. In contrast, Google Scholar generated the highest number of initial hits but was less effective in narrowing down the most relevant studies.
Defining and refining keywords is crucial in a systematic literature review. After careful consideration, we concluded that an approach allowing comparison at the product group level would be advantageous. Thus, the following search phrase was applied to all three databases: ‘stock’ OR ‘cryptocurrency’ OR ‘commodity’ OR ‘forex’ AND ‘machine learning’ AND ‘forecast.’
For all three platforms, we limited the search to the 2014–2023 period and applied additional filtering criteria to retain only publications with a clear research or review profile. However, these steps still produced a large number of results, so we further required that the keywords appear in both the title and the abstract. Under these conditions, the search conducted on 31 December 2023 yielded 1670 publications from Google Scholar, 750 from Science Direct, and 371 from Web of Science. After that date, we included a few additional studies, but their number was negligible, and thus they are not highlighted separately.
There was substantial overlap in the initial results. Removing duplicates left approximately 683 articles. Screening titles and abstracts reduced this to 267 publications. Subsequently, as shown in Table 1, we removed 157 articles due to concerns about the quality of the journals in which they appeared. We also excluded 10 literature review studies. Eventually, we obtained a final sample of 100 studies that we analyzed in detail.
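The screening funnel described above can be tallied directly from the reported counts; the short bookkeeping sketch below simply reproduces the arithmetic stated in the text:

```python
# PRISMA-style screening funnel, using the counts reported in the text.
identified = {"Google Scholar": 1670, "Science Direct": 750, "Web of Science": 371}
total_identified = sum(identified.values())   # records before deduplication

after_dedup = 683        # left after removing cross-database duplicates
after_screening = 267    # left after title/abstract screening
excluded_quality = 157   # removed over journal-quality concerns
excluded_reviews = 10    # literature-review papers excluded

final_sample = after_screening - excluded_quality - excluded_reviews
print("records identified:", total_identified)  # 2791
print("final sample:", final_sample)            # 100 studies analyzed in detail
```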
The objective of this systematic literature review was to identify publications examining the performance of machine learning models. Given the extensive body of literature on this topic, there are numerous high-quality studies that provided a solid foundation for our analysis. From the compiled database, we selected and examined 100 papers, which we categorized into four distinct product groups.
By employing well-defined keywords and research questions, we successfully filtered out articles not closely aligned with our research focus. To illustrate the methodology, we developed a PRISMA flow diagram (Figure 1) that presents an overview of the search, screening, and selection processes used to build our literature database.
Most of the relevant studies were drawn from the Science Direct database, while the Web of Science yielded the fewest. However, there was considerable overlap among these sources. The following sections detail the processing of the publication database and provide an in-depth discussion of the four product groups examined: equity markets, cryptocurrencies, commodity markets, and foreign exchange markets.

3. Literature Review of the Examined Markets

3.1. Stock Market Forecasts

Stock price prediction has long been one of the most extensively studied topics in financial forecasting using artificial intelligence. This subsection explores how various machine learning models have been applied to equity markets, highlighting their input features, forecasting performance, and practical limitations. The focus is on identifying which techniques are most effective in capturing the complex dynamics of stock price movements.
Artificial intelligence-based models offer several advantages over traditional statistical approaches, primarily due to their complexity and capacity to learn from data, enabling them to accurately detect patterns, including non-linear dynamics. Equity market data often exhibit non-stationary, non-linear behaviors that conventional statistical models struggle to capture, whereas AI-based methodologies have proven more adept over time. Recurrent Neural Networks (RNNs) are particularly noteworthy, as their architecture allows for feedback connections. Recent research by Hajiabotorabi et al. (2019) [48], comparing the predictive capabilities of ANNs and RNNs, concluded that RNNs can outperform more traditional neural networks. Moreover, the long short-term memory (LSTM) model—a specialized form of RNN widely applied to sequential datasets—has demonstrated substantial adaptability for time-series analysis.
Nabipour et al. (2020) [49] assessed the forecasting performance of nine different machine learning algorithms and two deep learning methods (RNN and LSTM) on stock data from the Tehran Stock Exchange (covering financial, oil, non-metallic mineral, and metallic materials companies). They found that RNN and LSTM outperformed all other models. Hiransha et al. (2018) [50] predicted stock prices on the Indian Stock Exchange (NSE) and NYSE using a multilayer perceptron, RNN, LSTM, and a convolutional neural network (CNN). Their empirical analysis identified CNN as the best-performing method, even outperforming ARIMA-based results. Beniwal et al. (2023) [51], examining data from the Nifty50, DJIA, DAX, Nikkei225, and SSE indices, concluded that a hybrid OGA-SVR model surpassed both SVR and LSTM in forecasting accuracy.
Two subsequent studies emphasize LSTM’s strengths. Rather (2021) [52] examined NIFTY-50 data (January 2018 to April 2021) and found that the LSTM-DNN model significantly outperformed MLP and ARIMA approaches. Similarly, Banik et al. (2022) [53] analyzed Indian stock market shares and enhanced the LSTM model with technical indicators, resulting in improved efficiency over competing methodologies.
Fischer and Krauss (2018) [54] examined S&P500 data spanning 1992–2015 and compared random forest, logistic regression, and LSTM models, concluding that LSTM produced the most accurate forecasts. Similarly, Siami-Namini et al. (2018) [55] analyzed the Nikkei 225, S&P500, NASDAQ Composite, Hang Seng, and Dow Jones indices from 1985–2018. Their results showed that LSTM outperformed ARIMA in terms of estimation accuracy. Liu (2019) [56] focused on predicting the S&P500 index and Apple stock prices, finding that LSTM and SVM models outperform GARCH over longer forecasting horizons. These studies collectively highlight LSTM’s strong predictive performance. However, one limitation of LSTM is its inability to capture multi-frequency characteristics of time series, constraining its capacity to model data in the frequency domain. Bhandari et al. (2022) [57] investigated S&P500 price movements (2006–2020) using single- and multilayer LSTM models. Contrary to expectations, the single-layer LSTM provided a better fit and higher accuracy than multilayer architectures, challenging the assumption that additional hidden layers necessarily improve predictive performance. Liang et al. (2019) [58] introduced a wavelet transform-enhanced LSTM (WT-LSTM) and demonstrated improved performance over a standard LSTM for S&P500 data. Qiu et al. (2020) [59] confirmed these findings with WT-LSTM and WT-GRU models on S&P500, DJIA, and HSI indices. In most cases, WT-LSTM offered superior results. An exception was found in Skehin et al. (2018) [60]: for FAANG stocks, ARIMA outperformed WT-LSTM except when predicting Apple’s stock prices.
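The wavelet-enhanced models above (WT-LSTM, WT-GRU) first decompose the price series into a smooth approximation and a detail component before the recurrent network is trained. As an illustrative stand-in for library routines (the cited studies do not publish their preprocessing code), a one-level Haar decomposition can be written in a few lines of NumPy:

```python
import numpy as np

def haar_dwt(series):
    """One-level Haar DWT: scaled pairwise sums form the approximation
    (low-frequency trend), scaled pairwise differences form the detail
    (high-frequency noise); each is half the original length."""
    x = np.asarray(series, dtype=float)
    if len(x) % 2:                 # drop a trailing sample if length is odd
        x = x[:-1]
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse transform: recover and interleave the even/odd samples."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

prices = np.array([100.0, 101.0, 99.0, 102.0, 104.0, 103.0, 105.0, 106.0])
a, d = haar_dwt(prices)
assert np.allclose(haar_idwt(a, d), prices)   # perfect reconstruction
print("approximation:", np.round(a, 2))
```

In a WT-LSTM-style pipeline, the recurrent model is typically fit on the denoised approximation (or on each component separately), with the detail coefficients thresholded or modeled independently.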
Cao et al. (2019) [61] developed CEEMDAN-LSTM, a hybrid model combining ensemble empirical mode decomposition with LSTM, which outperformed both hybrid and conventional methods for S&P500 and HSI data. Furthermore, Hajiabotorabi et al. (2019) [48] employed hybrid architectures (BSd-RNN, DWT-RNN, DWT-FFNN) and found that the BSd-RNN was more efficient at forecasting non-linear volatility patterns in S&P500, NASDAQ Composite, DJIA, and NYSE indices than ARIMA and GARCH. Finally, Gülmez (2023) [62] applied LSTM, ANN, and hybrid LSTM models to DJIA stocks (2018–2023), revealing that LSTM-ARO (Artificial Rabbits Optimization) achieved the best results.
Zhang et al. (2017) [63] examined the price dynamics of 50 different stocks from 2007 to 2016 using AR, LSTM, and SFM (State Frequency Memory) models. They reported that the SFM model outperformed AR and LSTM, attributing this superiority to SFM’s enhanced multi-frequency pattern recognition capability. Hanauer et al. (2023) [64] analyzed stocks from 32 countries over the period July 1990 to December 2021. They employed LR, elastic net, RF, GBRT (Gradient Boosted Regression Trees), and ANN methodologies. Their findings indicated that tree-based methods and neural networks are more effective at identifying non-linear relationships and interactions. Furthermore, yield predictions derived from machine learning models surpassed those generated by conventional linear models.
Nelson et al. (2017) [65] applied multilayer perceptron, random forest, and LSTM models to the Brazilian stock market and sought to determine which method offered the highest forecasting accuracy. They found that LSTM delivered the best performance. Similarly, Kristjanpoller et al. (2014) [66] combined ANN and GARCH models to forecast stock market indices in Latin America (Brazil, Chile, Mexico) and demonstrated that ANN improved the predictive accuracy of the GARCH(1,1) model.
Nikou et al. (2019) [67] analyzed the daily price movements of the iShares MSCI UK exchange-traded fund from January 2015 to June 2018. They employed ANN, SVM, random forest, and LSTM models. Among these, LSTM achieved the highest accuracy, followed by SVM. Similarly, Ayala et al. (2021) [68] examined the IBEX, DAX, and DJIA indices between January 2011 and December 2019 using LR, RF, ANN, and SVR methodologies. Their findings indicate that ANN demonstrated outstanding performance. Moreover, they observed that integrating machine learning models with technical analysis strategies improves the identification of trading signals and the competitiveness of applied rules.
Ballings et al. (2015) [69] compared AdaBoost, random forest, Kernel Factory, SVM, KNN, logistic regression, and ANN using European stock price data, aiming to predict price trajectories one year in advance. Their results showed that random forest outperformed the other models, including neural networks. Basak et al. (2019) [70] forecasted stock market trends for companies such as Apple, Amazon, Microsoft, Facebook, Twitter, Nike, Tata, and Osram using XGBoost, logistic regression, SVM, ANN, and random forest. They also concluded that random forest outperformed the other methods.
Chong et al. (2017) [71] focused on 38 stocks with the largest market capitalization on the Korea Stock Exchange from January 2010 to December 2014. Their results indicated that DNN outperformed AR and ANN, and that incorporating an AR-DNN hybrid approach could further enhance predictive performance. Md et al. (2023) [72] examined Samsung stock price data for the period 2016–2021 and tested LR, SVR, LSTM, RNN, CNN, and MLP. LSTM proved superior to the others. They also noted that LSTM can be prone to overfitting, emphasizing the need for careful model optimization.
Liu et al. (2021) [73] examined the SSE 50 stock index data from May 2017 to July 2019 using RNN, GRU, and LSTM models. Their results indicated that the LSTM model outperformed GRU, and the simple RNN proved least effective. Similarly, Long et al. (2020) [74] utilized machine learning (random forest, Adaptive Boosting), bidirectional deep learning (BiLSTM), and other neural network models to investigate the predictability of Chinese stock price trends. BiLSTM achieved the best performance, far surpassing other forecasting methods. Jiang et al. (2023) [75] analyzed the CSI100, CSI200, CSI500, GEI100, NASDAQ100, and NYSE100 stock indices from January 2015 to December 2022. Their study was comprehensive, employing a wide range of methods (AR, logistic regression, ARIMA, GARCH, LightGBM, CNN, RNN, LSTM, TCN, ALSTM, SFM, ADA-RNN, CATN, HMM-ALSTM, MLP). The HMM-ALSTM proposed by the researchers emerged as the best predictor of return expectations and market direction for all examined indices.
Yao et al. (2023) [76] studied the SSE, DJIA, FTSE100, CAC40, Nikkei225, RTSI, and STI indices between January 2010 and August 2022. Their results showed that forecast accuracy improved significantly by combining different decomposition algorithms with a TCN model. Additionally, the MEMD-TCN model proved superior to several competing methods, including ARIMA, RF, SVR, GRU, and other decomposed deep learning variants. Behera et al. (2023) [77] employed RF, XGBoost, AdaBoost, SVR, KNN, ANN, and MVaR to examine stocks from BSE, Tokyo SE, and SSE indices, finding that the AdaBoost model combined with MVaR performed best. Yu et al. (2023) [78] used various neural network models to forecast the SSE, S&P500, and FTSE indices from January 2000 to June 2022. Their proposed hybrid model (GVMD-Q-DBN-LSTM-GRU) provided more accurate predictions in 1-, 5-, and 22-step horizons. The study offers valuable insights for investors to adjust trading decisions promptly, thus avoiding risks and maximizing returns. Furthermore, it can assist regulators in detecting anomalies and ensuring stable market development.
Jing et al. (2021) [79] examined stocks listed on the Shanghai Stock Exchange and concluded that incorporating technical indicators as input features to an LSTM model significantly improved forecasting accuracy relative to models without them. Alzaman (2023) [80] applied LSTM and GA algorithms to TSE (Toronto Stock Exchange) stocks and found that hyperparameter optimization greatly enhanced model performance; in particular, the performance of the GA-based methodology improved by approximately 40% following optimization.
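The gains from hyperparameter tuning reported above can be illustrated with a generic search loop. The sketch below uses random search over a toy validation objective as a simple stand-in for the genetic-algorithm tuning in [80]; the objective function, parameter ranges, and optimum are purely illustrative assumptions, not taken from the cited study:

```python
import numpy as np

def validation_loss(lr, units):
    """Stand-in for training a model and measuring validation error.
    The optimum (lr = 0.01, units = 64) is chosen arbitrarily for this demo."""
    return (np.log10(lr) + 2.0) ** 2 + ((units - 64) / 64.0) ** 2

rng = np.random.default_rng(42)
best_loss, best_params = float("inf"), None
for _ in range(200):                       # evaluate 200 random configurations
    lr = 10 ** rng.uniform(-4, -1)         # learning rate, sampled log-uniformly
    units = int(rng.integers(16, 257))     # hidden-layer width
    loss = validation_loss(lr, units)
    if loss < best_loss:
        best_loss, best_params = loss, (lr, units)
```

A genetic algorithm differs in how candidates are generated (selection, crossover, mutation rather than independent sampling), but the evaluate-and-keep-the-best structure is the same.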
Table 2 summarizes the 10 most influential studies from the literature related to stock markets.

Summary and Critical Reflection

The literature on machine learning for stock market forecasting is dominated by LSTM, GRU, XGBoost, random forest, and CNN models. While the majority of studies report high predictive accuracy, typically measured by RMSE, MAE, or MAPE, these results are usually obtained over short time horizons (1–2 years), so their generalizability over time is severely limited. Many models do not account for the impact of different market regimes, or for how the performance of the same model changes under extreme volatility or economic crises. A further problem is that input variables are often restricted to price data or technical indicators, while market sentiment, trading volume, news flow, and macroeconomic variables (e.g., GDP growth, inflation, interest rates) are rarely included; models’ predictive ability is thus often assessed in isolation from the broader market context. The methodologies are frequently not transparent: little information is provided on hyperparameter choices, validation strategies, or measures taken to prevent overfitting. It is also rare for forecasts to be tested within concrete trading strategies (e.g., backtesting, drawdown analysis), without which predictive value is difficult to translate into real market decisions. Overall, the stock market literature is technologically advanced but often lacks an economic-trading context, which hinders the practical adaptation of models.
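As a concrete example of the missing strategy-level evaluation, the sketch below converts hypothetical directional forecasts into a long/flat position series and reports net return and maximum drawdown. The prices, signals, and cost level are illustrative assumptions, not taken from any cited study:

```python
import numpy as np

def backtest(prices, signals, cost=0.001):
    """Toy long/flat backtest: signal 1 = hold the asset, 0 = stay in cash.
    A proportional transaction cost is charged whenever the position changes."""
    rets = np.diff(prices) / prices[:-1]          # one-step simple returns
    pos = np.asarray(signals[:-1], dtype=float)   # position held over each return
    trades = np.abs(np.diff(np.concatenate([[0.0], pos])))
    strat = pos * rets - trades * cost            # strategy returns net of costs
    equity = np.cumprod(1.0 + strat)              # equity curve starting at 1
    drawdown = 1.0 - equity / np.maximum.accumulate(equity)
    return equity[-1] - 1.0, drawdown.max()

prices = np.array([100.0, 102.0, 101.0, 105.0, 103.0, 108.0])
signals = np.array([1, 1, 0, 1, 1, 1])            # hypothetical model forecasts
total_return, max_dd = backtest(prices, signals)
```

Reporting such strategy-level numbers alongside RMSE/MAPE is what translates forecast accuracy into decision-relevant evidence.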

3.2. Commodity Market Forecasts

Commodity markets present unique forecasting challenges due to the influence of macroeconomic, geopolitical, and environmental factors. This subsection reviews AI-based methods used to predict the prices of energy, metals, and agricultural products. It particularly focuses on the use of hybrid models, deep learning algorithms, and attention-based mechanisms to deal with the volatility and seasonality of commodity time series.
Liang et al. (2022) [81] examined gold price dynamics using a novel decomposition-based forecasting model. The time series is first decomposed into sub-layers of varying frequencies; each sub-layer is then forecast with a combination of long short-term memory (LSTM) networks, convolutional neural networks (CNN), and a convolutional block attention module (CBAM), and the sub-results are aggregated to form the final prediction. The findings indicate that integrating LSTM, CNN, and CBAM enhances both modeling capability and forecast accuracy. Furthermore, the ICEEMDAN decomposition algorithm was shown to further improve predictive accuracy, outperforming other comparable methods.
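The decompose-forecast-aggregate pipeline can be sketched as follows. For simplicity, the ICEEMDAN step is replaced here by a crude moving-average split into a trend and a detail band, and the per-component LSTM/CNN/CBAM forecasters are replaced by naive persistence; only the pipeline structure mirrors the cited approach:

```python
import numpy as np

def decompose(series, window=5):
    """Crude two-component decomposition: a moving-average 'trend' plus the
    residual 'detail' band. Stands in for ICEEMDAN, which splits the series
    into many intrinsic mode functions of varying frequency."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode="same")
    return trend, series - trend

def forecast_component(comp):
    """Placeholder one-step forecaster: last observed value (persistence).
    In the cited work each sub-layer is modelled with LSTM/CNN/CBAM."""
    return comp[-1]

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(0.0, 1.0, 200)) + 100.0   # synthetic price path
trend, detail = decompose(series)
# Forecast each sub-layer separately, then aggregate the sub-results.
prediction = forecast_component(trend) + forecast_component(detail)
```

The benefit in the real models comes from each component being simpler (smoother or more stationary) than the raw series, so each specialized forecaster has an easier task.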
Livieris et al. (2020) [82] similarly highlighted the advantages of integrating CNN and LSTM layers, finding that their CNN-LSTM model significantly outperformed support vector regression (SVR), feed-forward neural networks (FFNN), and standalone LSTM configurations for gold price prediction.
Focusing on medium- and long-term price forecasting of nickel, aluminum, copper, gold, iron, lead, silver, and zinc, Ozdemir et al. (2022) [83] employed two advanced deep learning architectures: LSTM and GRU. Both models demonstrated robust predictive capabilities, with the MAPE indicator used to evaluate forecasting performance. Notably, GRU networks were 33% faster than LSTM networks, suggesting that computational efficiency may be a consideration in high-resolution forecasting contexts. Shi et al. (2023) [84] combined these methodologies for aluminum, aluminum alloy, copper, lead, tin, gold, palladium, platinum, and silver, accounting for the impact of COVID-19. Their results suggest that LSTM-based hybrid models achieved high predictive accuracy, with the LSTM-GRU combination performing best. Similarly, Zhang et al. (2021) [85] examined copper price forecasting using MLP, SVR, RF, KNN, and Gradient Boosting, concluding that MLP (with deep learning integration) provided the most accurate monthly price forecasts.
Machine learning-based price forecasting has also gained considerable attention in agricultural commodity markets. Fang et al. (2020) [86] analyzed price trends of vegetable flour, soybean meal, dried rice, durum wheat, Zheng cotton, and early Indica rice using ARIMA, ANN, and SVR models. Their findings indicated that the two non-linear models (ANN and SVR) outperformed the linear ARIMA approach, though the difference between the two non-linear methods was not substantial. In particular, the results showed enhanced accuracy in predicting high-frequency components using SVR and ANN.
Similarly, Ribeiro et al. (2020) [87] employed Gradient Boost, XGBoost, RF, SVR, KNN, MLP, and the KNN-XGB-SVR hybrid model to forecast soybean and wheat prices. They concluded that the hybrid (KNN-XGB-SVR) approach yielded superior accuracy and lower forecasting errors than the standalone methodologies. Liu et al. (2022) [88] focused on soybean prices and also favored hybrid models. Their comprehensive analysis, including CNN, GRU, CNN-GRU, LSTM, MLP, SVR, CEEMDAN-CNN, CEEMDAN-CNN-GRU, CEEMDAN-GRU, CEEMDAN-LSTM, CEEMDAN-MLP, and CEEMDAN-SVR, demonstrated that combining fuzzy entropy K-means clustering, the attention-based CNN-GRU network, and CEEMDAN decomposition produced the most accurate soybean spot price forecasts. Additionally, they noted that factors such as climate change uncertainties, natural disasters, international trade, and macroeconomic policies significantly influence soybean prices.
RL et al. (2021) [89] examined cottonseed, mustard seed, soybean seed, castor seed, and guar seed data using ARIMA, LSTM, and TDNN (Time-Delay Neural Network) models. LSTM consistently provided the highest forecasting accuracy, surpassing both TDNN and ARIMA. Moreover, TDNN and LSTM performed markedly better than the linear model in predicting directional changes, a critical aspect for understanding market sentiment and business cycles. Deepa et al. (2021) [90] investigated cotton prices using LR, BLR, Gradient Boost, and RF. Their experimental results confirmed that Gradient Boost performed best, while both LR and Bayesian linear regression models also yielded satisfactory results. However, RF did not perform effectively for that specific dataset.
Ouyang et al. (2019) [91] conducted a time-series analysis on cotton, sugar, beans, soybean oil, cardamom, durum wheat, corn, coffee, cocoa, and orange juice concentrate using LSTNet, CNN, RNN, ARIMA, and VAR models. Their empirical evidence indicated that LSTNet, which integrates convolutional and recurrent layers with an autoregressive component, significantly improved forecasting results for multivariate agricultural futures. Similarly, Liang and Jia (2022) [92] examined corn, soybean, egg, PVC, and rebar prices using both individual and hybrid methods. Their GWO-CNN-LSTM model demonstrated a substantial performance improvement, effectively detecting patterns and handling extreme volatility. This finding is beneficial for policymakers, market participants, and producers navigating complex futures markets.
Weng et al. (2019) [93] analyzed cucumber price trends in China, concluding that the RNN approach achieved the highest forecasting accuracy. Additionally, they noted that the ARIMA model performed well when strong periodicity existed in the data. With advancements in data acquisition (“webcrawler” technologies), deep learning shows promise in forecasting scenarios essential for efficient crop area planning.
Deina et al. (2022) [94] studied arabica and robusta coffee price dynamics using AR, ARIMA, MLP, and ELM. Their results showed that the ELM-based methodology outperformed all linear models and proved better than MLP in 71% of cases. While linear models are straightforward to implement, the significant data variability in agricultural products often favors neural network approaches for more accurate forecasts.
Liang et al. (2023) [95] examined crude and Brent oil prices. Their proposed DNPP (Dynamic Noisy Proximal Policy) algorithm performed notably well in forecasting crude oil prices. Compared to other cutting-edge approaches, DNPP demonstrated superior performance metrics, accuracy, and better representation of actual price fluctuations. Sadefo Kamdem et al. (2020) [96] analyzed Brent and crude oil, wheat, and silver prices using ARIMA and LSTM models. They found that the LSTM model generally outperformed ARIMA for commodity market forecasting. They further observed that the predictive accuracy of both models deteriorated during the COVID-19 period due to increased market volatility.
Urolagin et al. (2021) [97] investigated gold and crude oil price trajectories using LSTM and hybrid models (RMT-LSTM, RZT-LSTM). Their hybrid RZT-LSTM model effectively managed outliers and optimized explanatory variable selection, resulting in significant improvements over the standard LSTM. Xu et al. (2023) [98] conducted empirical analyses of crude oil, Brent oil, gold, silver, platinum, palladium, and rhodium. Their results indicated that the proposed GINN (Generalized Improved Hybrid Neural Network) model outperformed alternative models. Additionally, it was shown that neural networks generally offered better forecasting performance than econometric models.
Niu et al. (2020) [99] focused on crude oil price prediction and concluded that their proposed VMD-CNN-GRU model achieved higher accuracy compared to various benchmark models (WNN, GRNN, CNN-GRU, EMD-CNN-GRU). In a similar vein, Guliyev and Mustafayev (2022) [100] applied logistic regression, Decision Tree, random forest, AdaBoost, and XGBoost to crude oil data. They concluded that XGBoost surpassed the other models on all evaluated criteria, with random forest ranking second. Further statistical testing (DeLong test) revealed no significant difference in performance between XGBoost and random forest, suggesting that both are suitable for prediction tasks. Additionally, Wang et al. (2020) [101] combined individual (LR, SVR, ANN) and hybrid (ABC-SVR) methods for crude oil forecasting. Their findings highlight that hybrid approaches can enhance the performance of standalone models by selecting optimal explanatory variables. Among the tested configurations, ABC-SVR proved the most robust.
Čeperić (2017) [102] analyzed the price dynamics of natural gas, fuel oil, crude oil, and coal using both machine learning and autoregressive methodologies. The SSA-SVR models examined produced substantially better results than the standard SVR model without feature selection (FS). Incorporating FS algorithms also provided a significant advantage when implementing neural network models. Similarly, Zheng et al. (2023) [103] investigated natural gas price variations using SVR as the foundational algorithm. Their results indicated that employing a genetic algorithm to optimize SVR hyperparameters markedly enhanced the accuracy of natural gas price forecasts. During the period of the Russian–Ukrainian conflict, the FS-GA-SVR hybrid model delivered more stable and accurate predictions of natural gas spot prices compared to the baseline SVR model. Prior to the onset of the conflict, Wang et al. (2021) [104] analyzed natural gas prices using 23 distinct models. They concluded that their newly proposed hybrid model (CEEMDAN-SE-PSO-ALSGRU) could forecast weekly natural gas prices with notably high precision.
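The benefit of a feature-selection stage can be illustrated with a minimal correlation-based filter. This is a crude stand-in for the SSA-based selection used in [102]; the data, effect sizes, and cutoff are entirely synthetic:

```python
import numpy as np

def select_top_k(X, y, k=3):
    """Rank candidate inputs by absolute Pearson correlation with the target
    and keep the k strongest -- a simple filter-style feature selector."""
    corrs = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return np.argsort(corrs)[::-1][:k]

rng = np.random.default_rng(1)
n = 300
informative = rng.normal(size=(n, 2))             # two genuinely predictive inputs
noise = rng.normal(size=(n, 4))                   # four irrelevant distractors
X = np.hstack([informative, noise])
y = 2.0 * informative[:, 0] - informative[:, 1] + 0.1 * rng.normal(size=n)
kept = select_top_k(X, y, k=2)                    # should recover columns 0 and 1
```

Feeding only the selected columns to a downstream SVR or neural network reduces noise and dimensionality, which is the mechanism behind the reported accuracy gains.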
Table 3 summarizes the 10 most influential studies from the literature related to commodity markets.

Summary and Critical Reflection

Commodity markets, especially the energy and precious metals segments (e.g., oil, gold, silver), have received disproportionately little attention in the machine learning literature compared to equity or cryptocurrency markets. Although some studies apply RNN-, LSTM-, or SVR-based models to commodity market data, they often operate in a simplistic framework: input variables are mostly limited to historical prices or basic technical indicators. Most research ignores global macroeconomic, political, and environmental influences, despite the key role these play in commodity markets. Oil and gas prices, for example, are closely linked to geopolitical tensions, trade wars, and production quotas (e.g., OPEC decisions), while gold often acts as a safe haven during financial crises, inflation, or geopolitical instability. Ignoring these factors significantly reduces the practical applicability of the models. In addition, model validation is often static (e.g., a single train-test split) and therefore does not measure predictive performance continuously in changing market environments; explicit validation on crisis periods (the COVID-19 pandemic or the Russian–Ukrainian conflict) is rarely reported. Little research examines how a given model performs across different commodity market instruments (e.g., oil vs. copper), so generalizability is also questionable. The commodity market thus remains largely uncharted territory for predictive modelling.
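The static-validation problem noted above has a standard remedy: rolling-origin (walk-forward) evaluation, in which the model is repeatedly re-fitted on a sliding window and tested on the period immediately after it. A minimal index-generating sketch, with illustrative window sizes:

```python
import numpy as np

def walk_forward_splits(n, train_size, test_size, step):
    """Generate rolling-origin (walk-forward) train/test index pairs for a
    series of length n, instead of a single static train-test split."""
    splits = []
    start = 0
    while start + train_size + test_size <= n:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size, start + train_size + test_size)
        splits.append((train_idx, test_idx))
        start += step
    return splits

# e.g., 500 daily observations, re-fit every 50 days on the past 250 days
splits = walk_forward_splits(n=500, train_size=250, test_size=50, step=50)
```

Reporting the error on each successive test window, rather than one aggregate figure, is what reveals whether accuracy survives regime changes and crisis periods.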

3.3. Cryptocurrency Forecasts

The extreme volatility of cryptocurrencies makes them both an attractive and difficult target for forecasting. This section examines how models such as LSTM, CNN, SVM, and various hybrid combinations have been utilized to forecast short- and mid-term trends in digital asset markets. Special attention is paid to Bitcoin and alternative cryptocurrencies, as well as the datasets and evaluation metrics used in the literature.
Sun et al. (2020) [105] examined 42 different cryptocurrencies between January and June 2018 using Light Gradient Boosting Machine (LightGBM), Support Vector Machine, and random forest algorithms. Their model incorporated 40 distinct feature variables. They found that LightGBM outperformed the other two models in terms of robustness. Wang et al. (2022) [106] investigated the predictability of 12 cryptocurrency returns from August 2017 to March 2021, employing random forest, logistic regression, Support Vector Machine, LSTM, and ANN models. They concluded that LSTM provided the highest estimation accuracy. Moreover, LSTM’s performance improved significantly when trading-related feature variables were included. Oyedele et al. (2023) [107] modeled the closing price predictions of six different cryptocurrencies using Adaptive Boosting (ADA), Gradient Boosting Machines (GBM), Extreme Gradient Boosting (XGB), Deep Feed-Forward Neural Networks (DFNN), Gated Recurrent Units (GRU), and convolutional neural networks (CNN). Their results indicated that CNNs yielded the best outcomes regarding both accuracy and consistency.
Akyildirim et al. (2021) [108] analyzed the predictability of 12 cryptocurrency exchange rates from April 2013 to June 2018 using support vector regression, logistic regression, random forest, and ANN models. Their findings suggest that SVR was the most effective method. Borges and Neves (2020) [109] reached a similar conclusion, noting that SVR outperformed other methods when analyzing the price movements of several cryptocurrencies. Additionally, they found that employing ML-based trading strategies achieved higher returns compared to the buy-and-hold approach.
Zhang et al. (2021) [110] compared ARIMA, random forest, XGBoost, MLP, LSTM, GRU, CNN, and various hybrid models (including LSTM + CNN, GRU + CNN, and WAMC—Weighted and Attentive Memory Channels) to predict exchange rates for six different cryptocurrencies. They concluded that well-constructed hybrid methods can further enhance predictive performance, with their proposed WAMC algorithm exhibiting superior results. Alonso-Monsalve et al. (2020) [111] conducted a short-term trend analysis of six popular cryptocurrencies, utilizing CNN, a CNN-LSTM hybrid, MLP, and Radial Basis Function Neural Networks. The hybrid architecture consistently outperformed the others and was the only approach capable of predicting Dash and Ripple trends with minimal error (approximately 4%).
Bitcoin also features prominently in scientific research, as it appears in the majority of cryptocurrency-related publications. For instance, Cavalli and Amoretti (2021) [112] aimed to predict Bitcoin’s price trend using exchange rate data from April 2013 to February 2020, applying both LSTM and CNN (convolutional neural network) models. Their findings indicated that the CNN model achieved superior estimation accuracy compared to LSTM.
Several studies highlight the strengths of the LSTM model for Bitcoin price forecasting. Chen et al. (2020) [18] employed logistic regression, linear discriminant analysis, random forest (RF), XGBoost, quadratic discriminant analysis, Support Vector Machine (SVM), and LSTM models to forecast Bitcoin exchange rates. Their results showed that for 5 min exchange rate data, LSTM exhibited the best estimation performance, while for daily data, logistic regression produced the highest accuracy. Jaquart et al. (2021) [113] utilized feed-forward neural networks, LSTM, GRU, random forest, and Gradient Boosting models to predict short-term Bitcoin price directions. They found that LSTM yielded the highest estimation accuracy. Their analysis of technical, blockchain-based, sentiment/interest-based, and asset-based feature sets revealed that technical features are predominantly influential. Over longer horizons, the importance becomes more evenly distributed among other features, such as transactions per second and weighted sentiment. Alkhodhairi et al. (2021) [114] estimated Bitcoin’s opening, highest, lowest, and closing prices using LSTM and GRU over 4, 12, and 24 h forecasting intervals. LSTM outperformed GRU in their tests. Chen et al. (2021) [115] examined Bitcoin exchange rate prediction using random forest, ANN, LSTM, ARIMA, SVM, ANFIS, and GA models. LSTM again emerged as the most accurate model. Moreover, their research showed that the inclusion of explanatory variables can significantly enhance predictive performance. Mudassir et al. (2020) [116] investigated Bitcoin price prediction using ANN, Stacked ANN, SVM, and LSTM. Although actual BTC exchange rates could be predicted with minimal error, forecasting its increase or decrease proved more challenging. LSTM still showed the best classification performance.
Contrary results were reported by Mallqui and Fernandes (2019) [117], who studied the maximum, minimum, and closing Bitcoin prices using MLP and SVR. SVR achieved higher prediction accuracy. They also noted that proper attribute selection and algorithm choice improved forecast accuracy by more than 10% over previously published methods. Jang and Lee (2017) [118] analyzed Bitcoin exchange rate predictability using linear regression, Bayesian Neural Networks, and SVM, concluding that Bayesian Neural Networks provided superior predictive performance. Al-Nefaie and Aldhyani (2022) [119] employed GRU and MLP between January 2021 and June 2022, finding MLP slightly more effective than GRU. These findings are valuable for asset pricing, especially under uncertainty from digital currencies.
Cocco et al. (2021) [120] integrated Bayesian Neural Networks, Feed-Forward Networks, LSTM, SVM, and hybrid models, demonstrating that combining models can enhance prediction performance. Tapia and Kristjanpoller (2022) [121] combined econometric and ML methods with AMEM and LSTM, significantly improving Bitcoin volatility forecasts. Dutta et al. (2020) [122] compared GRU and LSTM models for Bitcoin price analysis. Their proposed hybrid model with GRU-based recurrent selection produced more accurate predictions than other algorithms.
Lahmiri and Bekiros (2019) [123] employed both LSTM and GRNN (Generalized Regression Neural Network) models to predict the exchange rates of Bitcoin, Digital Cash, and Ripple. Their findings indicate that the predictive ability of LSTM significantly surpasses that of the GRNN benchmark. Although the computational load of the LSTM model is higher due to its complex non-linear pattern recognition capabilities, deep learning ultimately proved highly effective in capturing the inherent chaotic dynamics of cryptocurrency markets.
Serrano (2022) [124] compared Random Neural Network (RNN), LSTM, and linear regression models for predicting Bitcoin, Ethereum, and Ripple prices. The results showed that the neural network models performed similarly and both outperformed linear regression. A similar outcome was observed by Uras et al. (2020) [125], who examined exchange rate variability and predictability for Bitcoin, Ethereum, and Litecoin. Their analysis included univariate (closing prices only) and multivariate (including volume, highest, and lowest prices) linear regression and LSTM models. They concluded that Ethereum and Litecoin were more predictable than Bitcoin, and that the univariate LSTM model yielded the most accurate results. The authors also noted a substantial difference in computation time favoring the regression models.
By contrast, Sebastião and Godinho (2021) [126] analyzed Bitcoin, Ethereum, and Litecoin exchange rates from August 2015 to March 2019 using linear regression, random forest, and Support Vector Machine methodologies, subsequently translating model predictions into trading strategies. Surprisingly, linear regression produced the most accurate results, performing best for Ethereum and Litecoin.
In their study, Poongodi et al. (2020) [127] employed both linear regression (LR) and Support Vector Machine (SVM) models to predict the Ethereum exchange rate. They concluded that the SVM model produced results with significantly higher accuracy than LR. Similarly, Zoumpekas et al. (2020) [128] investigated the Ethereum exchange rate using a range of neural network models (CNN, LSTM, SLSTM, BiLSTM, GRU) and found that the standard LSTM algorithm was the most effective for predicting the examined exchange rate data.
Patel et al. (2020) [129] explored the predictability of Litecoin and Monero cryptocurrencies over 1-, 3-, and 7-day horizons. Their proposed GRU-LSTM hybrid model outperformed the conventional LSTM approach. Furthermore, Peng et al. (2018) [130] examined the volatility predictability of Bitcoin, Ethereum, and Dash using GARCH, SVR, and hybrid models, concluding that the SVR-GARCH hybrid approach was more efficient than all other tested methodologies.
Table 4 summarizes the 10 most influential studies from the literature related to cryptocurrency markets.

Summary and Critical Reflection

Machine learning models applied to cryptocurrency markets have received considerable attention in recent years, especially LSTM, CNN, and various hybrid architectures. While the literature often demonstrates outstanding accuracy over short time horizons, most applications ignore the extremely high volatility, structural breaks, and price bubbles that characterize crypto markets. As a result, reported predictive performance is not sufficiently regime-sensitive: it remains unclear how results differ between a normal market environment and, for example, a crash. In addition, error metrics alone do not ensure interpretability, since crypto asset prices can move by 20–30% within hours; such extremes can distort what the metrics mean, yet models are rarely fine-tuned against robust stress scenarios. Moreover, forecasts are seldom embedded in a real trading concept that would allow their practical performance to be judged. Little research addresses how a given model performs across different crypto assets (e.g., Bitcoin vs. altcoins) or how predictive performance varies with market cycles. This is particularly critical because the decentralized and unregulated nature of the crypto market means that market behavior may require periodic re-learning. The field would therefore be an excellent testbed for resilience- and volatility-sensitive algorithms, but this opportunity is only partially exploited in the literature.
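The metric-distortion issue can be made concrete with MAPE: the same absolute forecast error produces very different percentage errors on a calm series versus a crashing one, because the denominator shrinks with the price level. The series below are invented for illustration:

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

# Identical absolute forecast error (5 units) on a calm vs. a crashing series:
calm = np.array([100.0, 101.0, 102.0])
crash = np.array([100.0, 60.0, 25.0])     # crypto-style collapse in price level
pred_calm = calm - 5.0
pred_crash = crash - 5.0

calm_error = mape(calm, pred_calm)        # ~5% despite the same unit error
crash_error = mape(crash, pred_crash)     # more than double, driven by low prices
```

This asymmetry is why volatility-aware validation frameworks, rather than a single MAPE figure, are needed before crypto forecasting results can be compared.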

3.4. Foreign Exchange Market Forecasts

Forecasting currency exchange rates is particularly complex due to the global and multifactorial nature of FX markets. This subsection reviews machine learning models applied to foreign exchange forecasting, comparing the performance of LSTM, GRU, SVR, and ensemble methods. It also considers how these models handle high-frequency data and adapt to changing market regimes, with emphasis on their potential for real-time trading applications.
Islam and Hossain (2020) [131] modeled the exchange rate movements of EUR-USD, GBP-USD, USD-CAD, and USD-CHF currency pairs using LSTM, GRU, and GRU-LSTM deep learning methodologies. Their experimental results indicated that the GRU-LSTM hybrid model predicted the examined currency crosses more accurately than the widely used LSTM and GRU models. Yıldırım et al. (2021) [132] focused on the EUR-USD exchange rate. In terms of methods, they employed only LSTM and its variants: ME-LSTM (employing macroeconomic data) and TI-LSTM (employing technical indicators). The findings suggest that incorporating macroeconomic data marginally improved accuracy relative to using technical indicators. However, this improvement was minimal, and integrating both macroeconomic and technical indicator features did not yield a substantial increase in predictive accuracy.
Ahmed et al. (2020) [133] also investigated the EUR-USD exchange rate but extended their analysis to a broader range of models (ARIMA, RNN, LSTM, FLF-RNN, FLF-LSTM). Similarly, Escudero et al. (2021) [134] examined the EUR-USD pair using ARIMA, Elman Neural Network (ENN), and LSTM. Their comparison revealed that, for short-term forecasts (a 22-day horizon), LSTM surpassed ARIMA and ENN. ARIMA produced relatively static forecasts, while LSTM demonstrated higher accuracy for short-run predictions, and ENN proved superior in the long run.
In contrast to the studies favoring LSTM-based approaches, some researchers prefer SVR-based methods. Jubert de Almeida et al. (2018) [135] analyzed the EUR-USD exchange rate using SVR, GA, and GA-SVR models. They concluded that a hybrid approach, combining SVR and GA, enabled more accurate trend classification. Similarly, Sadeghi et al. (2021) [136] emphasized the advantages of combined methodologies, reporting that their EMC-SVR model was more effective in detecting upward, sideways, and downward trends in the EUR-USD exchange rate compared to standalone SVR.
Ni et al. (2019) [137] analyzed the EUR-USD, AUD-USD, GBP-JPY, EUR-JPY, GBP-USD, USD-CHF, USD-JPY, and USD-CAD currency pairs using CNN, LSTM, and CNN-RNN models. Their empirical findings indicate that the CNN-RNN hybrid model yields significantly better forecasting performance than either CNN or LSTM alone. Lin et al. (2020) [138] predicted USD-AUD exchange rate fluctuations using CEEMDAN-LSTM and compared its accuracy with that of SVM, RNN, MRNN (Multilayer RNN), ARIMA, and Bayesian approaches. They concluded that CEEMDAN-LSTM delivers higher accuracy relative to these other models.
Dautel et al. (2020) [139] investigated the EUR-USD, GBP-USD, JPY-USD, and CHF-USD currency pairs. Their results suggest that LSTM and GRU, given their conceptual advantages over traditional RNNs, achieve excellent forecasting accuracy. However, they also observed that conventional Feed-Forward Neural Networks (FFNNs) can remain competitive, with no statistically significant difference in some cases. Abedin et al. (2021) [140] examined 21 currency pairs using Lasso regression, Ridge regression, DT, SVR, RF, LSTM, Bi-LSTM, Bagging regression (averaging multiple regressors trained on bootstrap samples to reduce variance), and Bi-LSTM-BR. Their proposed Bi-LSTM-BR deep learning hybrid model outperformed all other methods tested. Notably, the study found a deterioration in predictive performance during the COVID-19 period. The exchange rates of countries most adversely affected by the pandemic exhibited higher volatility, which contributed to reduced forecasting accuracy.
Qi et al. (2020) [141] analyzed the GBP-USD, EUR-GBP, AUD-USD, and CAD-CHF currency pairs using various deep learning models, including RNN, LSTM, Bi-LSTM, and GRU. Their experimental results indicate that combining an event-driven selection of explanatory variables with the Bi-LSTM model yields a robust forecasting framework, facilitating the development of accurate trading strategies with minimal risk. Rundo (2019) [142] focused on the EUR-USD, GBP-USD, and EUR-GBP currency pairs, comparing the performance of LSTM with an LSTM-RL model (reinforcement learning correction block). The findings show that the LSTM-RL hybrid model is substantially more effective than the standard LSTM in detecting trends in currency crosses. Baffour et al. (2019) [143] investigated five exchange rates (AUD-USD, CAD-USD, CHF-USD, EUR-USD, GBP-USD) using ANN-GJR, GARCH, and APGARCH models. Empirical evidence demonstrated that the hybrid ANN-GJR model significantly outperforms all benchmark models employed in the study.
Xueling et al. (2023) [144] employed a variety of machine learning and deep learning techniques including CNN, LSTM, GRU, XGBoost-LSTM, CNN-Bi-LSTM-AM, and CNN-LSTM to analyze the exchange rate movements of the AUD-RMB, EUR-RMB, USD-RMB, and RMB-JPY currency pairs. Their experimental results indicate that the proposed CNN-LSTM model outperformed all other methods in predicting the trends of the examined currency crosses. In contrast, Sako et al. (2022) [145], who investigated ZAR-USD, NGN-USD, GBP-USD, EUR-USD, RMB-USD, and JPY-USD exchange rates using RNN, LSTM, and GRU models, concluded that the GRU approach yielded the best predictive accuracy for both univariate and multivariate forecasting tasks. Similarly, Cao et al. (2020) [146] examined USD-CNY exchange rate forecasting using a range of traditional and machine learning methods (ARIMA, SVR, CNN, LSTM, DC-LSTM) and found that the DC-LSTM model substantially outperformed seven other algorithms, underscoring its value for investment decision-making.
Table 5 summarizes the 10 most influential studies from the literature related to foreign exchange markets.

Summary and Critical Reflection

Machine learning models applied to foreign exchange forecasting rely mostly on technical indicators and algorithms such as XGBoost, SVM, random forest, or GRU. Most studies, however, do not adequately account for fundamental economic factors such as interest rate differentials, central bank policy, inflation expectations, or geopolitical events, even though these play a key role in the long-run evolution of exchange rates. In general, the models under consideration also do not distinguish between different market structures (trending, consolidating, etc.), although model performance may differ significantly across these phases. Without this differentiation, reported predictive values remain aggregated, making it difficult to determine in which environment a model performs well. In addition, studies often make idealized assumptions: transaction costs, slippage (the difference between the execution price and the price at which the trade order was placed), and liquidity constraints are ignored when evaluating model performance. This gives a distorted picture of practical applicability, especially for currency pairs where spreads are low but volume and trading frequency are extremely high. The inevitable market noise of real-time trading environments, and the ability of algorithms to adapt to it, are rarely assessed, and the directional forecasts that models produce are seldom tested embedded in trading strategies, making performance difficult to measure at the level of investor decision-making. Foreign exchange market modelling thus remains rather theoretical and often not directly linked to the realities of market practice.

4. Evaluation Metrics for Financial Forecasting Models

The efficacy of financial forecasting models hinges not only on the chosen algorithmic architecture but also significantly on the appropriate selection of input variables and, crucially, on how their predictive performance is evaluated. In the realm of financial time series, where phenomena like volatility, seasonal patterns, and market shocks are inherent, understanding the various types of errors and their implications is paramount. The “garbage in, garbage out” principle extends to evaluation: choosing the right metrics ensures that model optimization aligns with genuine financial objectives, leading to more robust and actionable insights. This section delves into key evaluation metrics for both regression and classification models, highlighting their relevance and interplay with model hyperparameters in a financial context.

4.1. Regression Model Evaluation Metrics

Regression models in finance aim to forecast continuous values, such as stock prices, commodity prices, or foreign exchange rates. Assessing their predictive performance requires metrics sensitive to different aspects of error magnitude and scale.

4.1.1. Mean Absolute Error (MAE)

The Mean Absolute Error (MAE) is one of the simplest and most widely used measures for assessing prediction accuracy. It quantifies the average of the absolute differences between actual and forecasted values.
$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$
A significant advantage of MAE is its intuitive interpretability, as it provides the average error in the same units as the target variable. This directness makes it easy for financial practitioners to understand the typical magnitude of forecast errors (Hyndman and Athanasopoulos, 2018) [147]. However, a key drawback is its insensitivity to large, outlying errors. Since all deviations are weighted equally, MAE might not sufficiently penalize models that occasionally make very large prediction mistakes, which can be critical in risk management or high-stakes trading. When optimizing hyperparameters to minimize MAE, the model may become less aggressive in reducing extreme errors, potentially leading to a smoother but less robust prediction curve, especially in volatile market regimes.

4.1.2. Mean Squared Error (MSE)

The Mean Squared Error (MSE) calculates the average of the squared differences between actual and forecasted values.
$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$
This metric inherently penalizes larger errors disproportionately due to the squaring operation. This makes MSE particularly useful in scenarios where minimizing significant outliers is crucial, such as in portfolio risk management or derivative pricing, where large errors can lead to substantial financial losses (Granger and Newbold, 1986) [148]. A common challenge with MSE is its interpretability, as its units are the square of the target variable’s units. When a model’s hyperparameters are tuned to minimize MSE, the optimization process will naturally prioritize reducing larger errors, often leading to models that are more sensitive to outliers. This can sometimes result in more conservative predictions in extremely volatile periods.

4.1.3. Root Mean Squared Error (RMSE)

The Root Mean Squared Error (RMSE) is simply the square root of the MSE:
$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$
RMSE is arguably the most frequently used error metric in financial time-series modeling. It combines the advantages of MSE’s sensitivity to large errors with the interpretability of MAE by returning the error to the original units of the target variable. This dual benefit makes it highly practical for communicating performance. However, like MSE, RMSE is still susceptible to being heavily influenced by a few extreme errors. The choice between MAE and RMSE often reflects a trade-off between robustness to outliers and the desire to penalize large errors more severely during hyperparameter tuning (Chai and Draxler, 2014) [149].
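The contrast between MAE and RMSE is easy to see on a toy forecast. The following minimal plain-Python sketch (illustrative numbers, no library dependencies) shows how a single large miss inflates RMSE far more than MAE:

```python
import math

def mae(y, y_hat):
    # Average absolute deviation, in the units of the target variable.
    return sum(abs(a - f) for a, f in zip(y, y_hat)) / len(y)

def mse(y, y_hat):
    # Squaring penalizes large errors disproportionately.
    return sum((a - f) ** 2 for a, f in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    return math.sqrt(mse(y, y_hat))

# Actual vs. forecast prices; the last forecast misses badly (one outlier).
actual   = [100.0, 101.0, 102.0, 103.0]
forecast = [100.5, 100.5, 102.5, 110.0]

print(mae(actual, forecast))   # 2.125 -- a modest average miss
print(rmse(actual, forecast))  # ~3.53 -- pulled up by the single large error
```

The gap between the two values is exactly the outlier sensitivity discussed above: a model tuned to minimize RMSE will work hardest on that last point.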

4.1.4. Mean Absolute Percentage Error (MAPE)

The Mean Absolute Percentage Error (MAPE) expresses the deviation in percentage form, making it a scale-independent metric:
$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$
MAPE is particularly advantageous when comparing the performance of models across different financial instruments or markets that operate on vastly different scales (e.g., comparing a stock price forecast with a bond yield forecast). It provides a clear relative error measure. However, MAPE is highly sensitive to small actual values ($y_i$), where even small absolute errors can result in disproportionately large percentage errors (Makridakis, 1993) [150]. It also becomes undefined when $y_i = 0$. When optimizing hyperparameters using MAPE as the objective, models might prioritize accuracy for larger values, potentially neglecting the quality of predictions for assets with very low prices.

4.1.5. Symmetric MAPE (sMAPE)

To address the bias inherent in MAPE, the Symmetric MAPE (sMAPE) offers an alternative:
$sMAPE = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|y_i - \hat{y}_i\right|}{\left(y_i + \hat{y}_i\right)/2}$
This metric provides a more symmetric measure of error and is less prone to overstating errors when actual values are small, because the denominator averages the actual and forecast values rather than dividing by $y_i$ alone. It also remains defined when $y_i = 0$, provided $\hat{y}_i$ is not also zero. While sMAPE mitigates some of MAPE’s biases, it can still suffer from instability if both $y_i$ and $\hat{y}_i$ are close to zero (Kim and Kim, 2016) [151]. For hyperparameter tuning, sMAPE offers a more balanced objective function for percentage errors, encouraging models to perform robustly across various scales without excessively penalizing small-value predictions.
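The small-denominator behaviour of MAPE, and sMAPE’s partial correction of it, can be illustrated with a toy comparison (a minimal sketch with hypothetical prices, both metrics reported in percent):

```python
def mape(y, y_hat):
    # Undefined when any y_i == 0; here all actuals are assumed non-zero.
    return 100.0 / len(y) * sum(abs((a - f) / a) for a, f in zip(y, y_hat))

def smape(y, y_hat):
    # Denominator averages actual and forecast, softening small-value spikes.
    return 100.0 / len(y) * sum(
        abs(a - f) / ((abs(a) + abs(f)) / 2) for a, f in zip(y, y_hat)
    )

# The same absolute error (0.05) on a high-priced and a tiny-priced asset.
print(mape([100.0], [100.05]))  # 0.05% -- negligible
print(mape([0.10], [0.15]))     # 50%  -- the small actual value inflates MAPE
print(smape([0.10], [0.15]))    # 40%  -- sMAPE is less extreme on the same miss
```

The identical absolute error of 0.05 is rated a thousand times worse on the low-priced asset by MAPE, which is exactly the bias a MAPE-minimizing tuner inherits.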

4.1.6. R-Squared (R2)

The coefficient of determination, R-squared (R2), measures the proportion of the variance in the dependent variable that is predictable from the independent variables:
$R^2 = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_i\right)^2}{\sum_{i}\left(y_i - \bar{y}\right)^2}$
R-squared ranges from 0 to 1 for in-sample fits (it can even turn negative for poorly fitting out-of-sample predictions), with higher values indicating a better fit. It provides a measure of how well the model explains the variability of the target variable relative to a simple mean-based model. However, R2 does not always provide a reliable picture for time-series models, as it is not sensitive to structural breaks or temporal shifts within the series (Hyndman and Koehler, 2006) [152]. For financial time series, a high R2 does not guarantee profitable trading signals or accurate directional forecasts. When models are optimized for R2, they might prioritize fitting the overall trend well but neglect the short-term fluctuations that are crucial for trading decisions.

4.2. Classification Model Evaluation Metrics

Classification models in finance aim to categorize outcomes, such as predicting the direction of an asset’s price (e.g., increase or decrease), identifying fraudulent transactions, or assessing credit default probabilities. The evaluation of these models focuses on the distribution and success rate of predictions across different classes.

4.2.1. Accuracy

Accuracy is the simplest metric, representing the proportion of correctly classified instances:
$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$
where TP (True Positives), TN (True Negatives), FP (False Positives), and FN (False Negatives) refer to the counts from a confusion matrix. While suitable for balanced class distributions, accuracy can be misleading in cases of class imbalance, which is common in finance (e.g., fraud detection, where fraudulent transactions are rare). If 99% of transactions are legitimate, a model predicting “no fraud” for all transactions would achieve 99% accuracy but be useless. When hyperparameters are optimized solely for accuracy in imbalanced datasets, the model might simply learn to predict the majority class, leading to poor performance on the minority (often more critical) class (Powers, 2011) [153].
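The classic failure mode under class imbalance can be reproduced in a few lines (a minimal sketch with a hypothetical 1% fraud rate):

```python
# 1,000 transactions, 10 of them fraudulent (1%). A degenerate model that
# always predicts "legitimate" still scores 99% accuracy while catching
# no fraud at all.
y_true = [1] * 10 + [0] * 990   # 1 = fraud, 0 = legitimate
y_pred = [0] * 1000             # always "no fraud"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall_fraud = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / 10

print(accuracy)      # 0.99
print(recall_fraud)  # 0.0 -- useless on the minority (critical) class
```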

4.2.2. Precision and Recall

Precision measures the correctness of positive predictions, while Recall (Sensitivity) measures the proportion of actual positives that were correctly identified:
$Precision = \frac{TP}{TP + FP}$
$Recall = \frac{TP}{TP + FN}$
The precision–recall trade-off is a frequent subject of analysis in financial forecasting. For instance, in an “asset buy” signal, high precision means fewer false buy signals (avoiding bad investments), while high recall means not missing too many good investment opportunities. The choice between prioritizing precision or recall (and thus influencing hyperparameter tuning) depends on the business objective: for fraud detection, high recall (catching all fraud) might be critical even if it means some false alarms; for high-frequency trading, high precision (reliable signals) might be preferred (Powers, 2011) [153].

4.2.3. F1-Score

The F1-score is the harmonic mean of precision and recall:
$F_1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$
The F1-score is particularly useful for imbalanced datasets or when it is important to optimize both precision and recall simultaneously (e.g., when both false positives and false negatives have significant costs). It provides a single metric that balances these two aspects, making it a common objective for hyperparameter optimization in financial applications like credit scoring or market trend prediction (Powers, 2011) [153].
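Computing precision, recall, and their harmonic mean from confusion-matrix counts is straightforward; the counts below are hypothetical, standing in for a "buy signal" classifier:

```python
# Hypothetical confusion-matrix counts for a buy-signal classifier.
TP, FP, FN = 40, 10, 20

precision = TP / (TP + FP)   # 0.8   -- how reliable the issued buy signals are
recall    = TP / (TP + FN)   # ~0.667 -- how many true opportunities are caught
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 3))  # 0.727 -- harmonic mean sits closer to the weaker of the two
```

Note that the harmonic mean penalizes imbalance: a classifier with precision 1.0 and recall 0.1 would score a far lower F1 than one with 0.55 on both.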

4.2.4. Specificity

Specificity measures the true negative rate, or the proportion of actual negatives that were correctly identified:
$Specificity = \frac{TN}{TN + FP}$
Specificity is crucial in financial risk applications where false positives (false alarms) can be costly (e.g., incorrectly flagging a legitimate transaction as fraudulent, leading to customer inconvenience). A high specificity indicates that the model is good at identifying true negative cases, which can be critical for maintaining operational efficiency and customer trust. Hyperparameter tuning might aim to balance specificity with recall, depending on the risk appetite and operational constraints (Powers, 2011) [153].

4.2.5. ROC Curve and AUC

The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (Recall/Sensitivity) against the False Positive Rate (1 − Specificity) at various threshold settings. The Area Under the Curve (AUC) summarizes the ROC curve’s performance across all possible classification thresholds:
  • An AUC of 0.5 indicates random guessing.
  • An AUC close to 1 suggests excellent classification performance.
The ROC-AUC is a robust metric for evaluating classifier performance, especially on imbalanced datasets, as it considers the trade-off between sensitivity and specificity across all thresholds. It helps in selecting an optimal operating point for a model based on specific financial risk–reward considerations (Fawcett, 2006) [154]. When performing hyperparameter optimization, optimizing for AUC aims to find a model that performs well across a range of thresholds, providing flexibility in deployment.
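AUC also has a useful rank-based interpretation: it equals the probability that a randomly chosen positive instance is scored above a randomly chosen negative one (ties counting one half). A minimal sketch using that definition directly, on toy labels and scores:

```python
def auc(y_true, scores):
    # Rank interpretation of AUC: the fraction of positive/negative pairs
    # in which the positive receives the higher score (ties count as 0.5).
    # O(n_pos * n_neg) -- fine for a sketch, not for large datasets.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.5, 0.6, 0.4, 0.8, 0.2]  # model's predicted probabilities
print(auc(y_true, scores))  # ~0.889 -- one positive is out-ranked by a negative
```

Because only the ranking of scores matters, AUC is invariant to any monotone rescaling of the model’s outputs, which is why it evaluates a classifier across all thresholds at once.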

4.3. Summary

The judicious selection and interpretation of evaluation metrics are indispensable for developing high-performing and trustworthy financial forecasting models. While regression metrics like RMSE and MAE offer insights into prediction error magnitudes, classification metrics such as Precision, Recall, F1-score, and ROC-AUC are vital for understanding the performance of categorical predictions, especially in the context of class imbalance.
The choice of evaluation metric is not merely a post-training assessment; it critically influences the hyperparameter tuning process. Optimizing for a specific metric (e.g., minimizing RMSE or maximizing F1-score) directly guides the model’s learning trajectory, impacting crucial hyperparameters like the learning rate, number of epochs, batch size, dropout rate, and even the choice of activation function or optimizer parameters. For instance, a model tuned to minimize MSE will prioritize large error reduction, while one tuned for F1-score will seek a balance between precision and recall.
In the complex and often noisy financial landscape, a multi-faceted evaluation approach, leveraging a suite of metrics tailored to the specific business objective, is often more informative than relying on a single metric. This, combined with careful hyperparameter optimization guided by these chosen metrics on robust validation sets, ensures the development of models that are not only accurate but also reliable and actionable for real-world financial decision-making.

5. The Role of Hyperparameter Tuning in the Performance of Financial Forecasting Models

The predictive performance of deep learning and machine learning models is not solely dependent on their architectural structure; it is also profoundly influenced by the optimal selection of hyperparameters. In the context of financial time-series forecasting, where volatility, seasonal effects, and market shocks are prevalent, the model’s sensitivity to these settings becomes even more pronounced. Effective hyperparameter tuning is crucial for achieving superior generalization capabilities and robust performance in such dynamic and often noisy environments.

5.1. Key Hyperparameters

Hyperparameters are external configurations that are set before the training process begins, distinguishing them from model parameters (weights and biases) that are learned during training. Tuning these parameters significantly impacts how a model learns, its capacity, and its ability to generalize to unseen data.

5.1.1. Learning Rate

This hyperparameter determines the step size at which the model adjusts its internal weights during gradient-based optimization. A learning rate that is too high can cause the optimization process to diverge, leading to unstable training and potentially missing the optimal solution. Conversely, a learning rate that is too low can result in slow convergence, making the training process inefficient and potentially trapping the model in suboptimal local minima. Finding an appropriate learning rate is often one of the most critical steps in tuning (Mounjid and Lehalle, 2024) [155].

5.1.2. Number of Epochs

An epoch represents one complete pass through the entire training dataset. An insufficient number of epochs can lead to underfitting, where the model has not learned enough from the data and performs poorly on both training and unseen data. Conversely, an excessive number of epochs can cause overfitting, where the model learns the training data too well, including its noise, leading to poor generalization on new financial data. Early stopping, a common regularization technique, is often employed to mitigate overfitting by halting training when validation performance ceases to improve (Song and Choi, 2023) [156].
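The early-stopping logic described above can be sketched in a few lines; the validation-loss trajectory below is hypothetical, standing in for a real training loop:

```python
def train_with_early_stopping(val_losses, patience=3):
    # Stop once validation loss has failed to improve for `patience`
    # consecutive epochs; return the (1-based) best epoch and its loss,
    # i.e., the checkpoint whose weights would be restored.
    best_loss, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch, best_loss

# Validation loss improves, then drifts upward as the model overfits.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.54, 0.60, 0.65]
print(train_with_early_stopping(losses))  # (4, 0.5) -- halts after epoch 7
```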

5.1.3. Batch Size

This hyperparameter defines the number of samples processed before the model’s weights are updated. Smaller batch sizes (e.g., 1–32) yield noisier gradient estimates but can sometimes lead to better generalization by escaping sharp local minima and providing a regularization effect. Larger batch sizes (e.g., 64–256 or more) provide more stable gradient estimates and can lead to faster convergence per epoch but might sometimes converge to flatter, less generalizable minima (Liu et al., 2023) [157]. The choice often depends on computational resources and dataset characteristics.

5.1.4. Number of Hidden Layers and Neurons per Layer

These architectural hyperparameters directly influence the complexity and learning capacity of a neural network. Too few layers or neurons can restrict the model’s ability to capture intricate non-linear relationships in financial data, leading to underfitting. Conversely, too many can increase the risk of overfitting and significantly raise computational costs. Finding optimal values often involves a process of trial and error, guided by domain knowledge and computational constraints (Yan and Ouyang, 2018) [158].

5.1.5. Dropout Rate

This regularization technique is specifically designed to reduce overfitting in neural networks (Livieris et al., 2022) [159]. During training, a specified percentage (dropout rate) of neurons are randomly “dropped out” (set to zero) in each update, preventing complex co-adaptations between neurons. This forces the network to learn more robust features that are not dependent on the presence of specific neurons, enhancing generalization. Typical values range from 0.1 to 0.5.
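A minimal sketch of inverted dropout (the variant used by most modern frameworks), applied to a hypothetical activation vector:

```python
import random

def dropout(activations, rate=0.5, training=True, seed=None):
    # Inverted dropout: zero each activation with probability `rate` and
    # rescale survivors by 1/(1-rate) so the expected activation is
    # unchanged -- hence no rescaling is needed at inference time.
    if not training or rate == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

layer_output = [0.8, -0.3, 1.2, 0.5]
print(dropout(layer_output, rate=0.5, seed=42))       # some entries zeroed
print(dropout(layer_output, rate=0.5, training=False))  # unchanged at inference
```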

5.1.6. Activation Function

Activation functions introduce non-linearity into a neural network, allowing it to learn complex patterns. Common choices include ReLU (Rectified Linear Unit), Tanh (Hyperbolic Tangent), and Sigmoid. Different functions handle non-linearities differently. For financial time series, Tanh is often preferred over ReLU for its symmetric output range (−1,1), which can be beneficial when dealing with centered or standardized financial data, leading to more stable gradients. ReLU, by only activating for positive values, can suffer from the “dying ReLU” problem where neurons become inactive (Vancsura et al., 2024) [160]. Sigmoid, producing outputs between 0 and 1, is often used in the output layer for binary classification problems but can suffer from vanishing gradients in deeper networks.

5.1.7. Optimizer

While sometimes considered a training algorithm, the choice of optimizer (e.g., Adam, SGD, RMSprop) and its specific parameters (e.g., Adam’s β1, β2, ϵ) significantly impact training dynamics. Adam (Makinde, 2024) [161] is a popular choice for its adaptive learning rates and robustness in many scenarios, often requiring less manual tuning than Stochastic Gradient Descent (SGD).

5.2. Hyperparameter Optimization Techniques

Manually tuning hyperparameters can be time-consuming and inefficient, especially for complex deep learning models. Automated optimization techniques help systematically explore the hyperparameter space to find optimal configurations.

5.2.1. Grid Search

This exhaustive method defines a discrete set of values for each hyperparameter and then evaluates the model’s performance for every possible combination. While simple and guaranteed to find the best combination within the defined grid, it is extremely computationally intensive, making it impractical for high-dimensional hyperparameter spaces or models with long training times (Hoque and Aljamaan, 2021) [162].

5.2.2. Random Search

Instead of an exhaustive search, Random Search (Flavia and Mio, 2025) [163] samples hyperparameter combinations randomly from specified distributions. This method is generally faster and surprisingly effective, often finding good configurations in fewer iterations than Grid Search, especially when only a few hyperparameters are truly important.
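Random Search reduces to sampling configurations and keeping the best. The sketch below uses a stand-in objective with a hypothetical optimum, rather than a real train/validate cycle:

```python
import random

def validation_loss(lr, batch_size):
    # Stand-in for a full training run; assumes a hypothetical optimum
    # near lr = 1e-3 and batch_size = 64.
    return abs(lr - 1e-3) / 1e-3 + abs(batch_size - 64) / 64

def random_search(n_trials=50, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-5, -1)           # log-uniform learning rate
        bs = rng.choice([16, 32, 64, 128, 256])  # discrete batch sizes
        loss = validation_loss(lr, bs)
        if best is None or loss < best[0]:
            best = (loss, lr, bs)
    return best

best_loss, best_lr, best_bs = random_search()
print(best_lr, best_bs)
```

Sampling the learning rate log-uniformly reflects the common practice of searching it across orders of magnitude rather than on a linear scale.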

5.2.3. Bayesian Optimization

This sophisticated technique (Liu et al., 2024) [164] iteratively searches for optimal combinations by building a probabilistic model (often a Gaussian process) of the objective function (e.g., validation loss). It uses this model to intelligently select the next set of hyperparameters to evaluate, balancing exploration (sampling new, unproven regions) and exploitation (sampling promising regions). Bayesian Optimization is particularly effective for expensive training cycles and complex, non-linear hyperparameter landscapes, making it well-suited for deep learning models in finance. Tools like Hyperopt or Optuna implement this.

5.2.4. Evolutionary Algorithms (e.g., Genetic Algorithms, Particle Swarm Optimization)

These heuristic search strategies are inspired by natural evolution or swarm intelligence. They maintain a population of hyperparameter configurations, evaluate their performance, and then use mechanisms like mutation, crossover, or swarm movement to generate new, potentially better configurations. Several recent studies have successfully applied these algorithms to financial forecasting problems, demonstrating their ability to find robust hyperparameter settings (Beniwal et al., 2023) [51].

5.3. Challenges in Financial Applications

Hyperparameter tuning in financial forecasting faces unique challenges due to the inherent characteristics of financial data (Liang et al., 2023) [95]:
  • Non-stationarity and Evolving Patterns: Financial markets are notoriously non-stationary, meaning their statistical properties change over time. This implies that hyperparameters optimized for one period (e.g., a bull market) might perform poorly in another (e.g., a crisis period). Continuous fine-tuning and adaptive strategies are often required.
  • High Risk of Overfitting: The dynamic and often noisy nature of financial data, coupled with the complexity of deep learning models, increases the risk of overfitting. Models tuned too precisely to historical data over a long period may fail dramatically when market regimes change. Robust validation strategies, such as rolling-window cross-validation, are crucial.
  • Time-Varying Optimality: Hyperparameters may not be universally optimal across different time periods. A model optimized for a crisis period might significantly underperform in a calm market environment, highlighting the need for dynamic hyperparameter adaptation.
  • Computational Cost: Tuning complex deep learning models for financial time series is computationally expensive, especially when using exhaustive search methods or models with long training times.
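The rolling-window validation mentioned above can be sketched as a simple split generator (illustrative window sizes; real studies would tune the lengths to the data):

```python
def rolling_window_splits(n_obs, train_size, test_size, step=None):
    # Yield (train_indices, test_indices) pairs that walk forward in time,
    # so the model is always validated on data that follows its training
    # window -- never on past data, which would leak future information.
    step = step or test_size
    start = 0
    while start + train_size + test_size <= n_obs:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += step

for train_idx, test_idx in rolling_window_splits(n_obs=10, train_size=4, test_size=2):
    print(train_idx[0], train_idx[-1], "->", test_idx)
```

Evaluating hyperparameter candidates across all such windows, rather than on a single holdout period, is what guards against regime-specific overfitting.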

6. Feature Selection and Explainability in Financial Forecasting Models

The effectiveness of financial forecasting models is significantly influenced not only by the type of algorithm employed but also by the appropriate selection of input variables. The “garbage in, garbage out” principle is particularly salient in AI-based systems, where extracting relevant features from high-dimensional datasets is crucial for model performance, stability, and interpretability (Nazareth and Reddy, 2023; Goulet Coulombe et al., 2022) [7,9].

6.1. Feature Selection Techniques

Feature selection aims to identify a subset of relevant features that optimize model performance while reducing complexity. This process helps mitigate overfitting, improve generalization, and enhance model interpretability by focusing on the most influential variables. Commonly employed procedures include the following.

6.1.1. Filter Methods

These methods select features based on their statistical properties, independent of any learning algorithm. They assess the relevance of features by analyzing their correlation with the target variable or their variance (MA et al., 2025) [165]. Examples include the following:
  • Correlation analysis: Selecting features highly correlated with the target variable (e.g., Pearson correlation, Spearman’s rank correlation). For financial time series, this can involve analyzing lead–lag relationships or co-movements between assets (e.g., between a stock and an economic indicator).
  • ANOVA (Analysis of Variance): Used for categorical features to determine if there are statistically significant differences in means across groups.
  • Chi-squared test: For categorical features, assessing their independence from the target variable.
  • Mutual Information (MI): Measures the dependency between two variables (feature and target), capturing both linear and non-linear relationships. Higher MI indicates greater relevance. MI is particularly useful for financial data where complex non-linear dependencies are prevalent.
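A correlation-based filter reduces to computing one statistic per feature and thresholding it. A minimal sketch with hypothetical feature series and an illustrative 0.5 cut-off:

```python
import math

def pearson(x, y):
    # Pearson correlation coefficient between two equal-length series.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical candidate features vs. next-day return of a target asset.
target = [0.01, -0.02, 0.03, 0.00, 0.02]
features = {
    "lagged_return": [0.012, -0.018, 0.028, 0.001, 0.019],  # tracks the target
    "random_noise":  [0.20, 0.30, 0.25, 0.35, 0.30],        # unrelated series
}

# Keep features whose absolute correlation with the target exceeds 0.5.
selected = [name for name, col in features.items()
            if abs(pearson(col, target)) > 0.5]
print(selected)  # ['lagged_return']
```

As the text notes, such a filter captures only linear dependence; mutual information would be needed to detect the non-linear relationships common in financial data.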

6.1.2. Wrapper Methods

These methods use a specific learning algorithm to evaluate the performance of different feature subsets. They are generally more accurate but computationally more expensive as they involve training and evaluating the model multiple times (Tsai et al., 2021) [166]:
  • Recursive Feature Elimination (RFE): This iterative method trains a model, ranks features by importance, and then eliminates the least important features, repeating the process until the desired number of features is reached. It is often used with linear models or tree-based models.
  • Sequential Feature Selection (SFS): This can be forward (adding features one by one) or backward (removing features one by one), evaluating the model’s performance at each step to determine the optimal subset.

6.1.3. Embedded Methods

These procedures perform feature selection as an integral part of the model training process. They are generally more efficient than wrapper methods because they do not require repeated model training (MA et al., 2025) [165]:
  • Lasso Regression (Least Absolute Shrinkage and Selection Operator): This linear regression technique adds an L1 regularization penalty to the loss function, which has the effect of shrinking the coefficients of less important variables to exactly zero, effectively performing feature selection.
  • Tree-based Algorithms (e.g., XGBoost, LightGBM, random forest): These algorithms inherently provide “feature importance” scores based on how often a feature is used in decision splits or how much it reduces impurity (e.g., Gini impurity for classification, variance reduction for regression). Features with higher importance scores are considered more relevant.
  • Regularization in Neural Networks: Techniques such as L1 regularization (similar to Lasso) or Dropout can implicitly act as feature selection mechanisms by penalizing or randomly deactivating weights of less important connections, driving them towards zero or reducing their influence.

6.2. Explainable AI (XAI) in Financial Forecasting

Beyond variable selection, explaining the predictions of complex “black-box” models (e.g., deep learning) has gained significant emphasis. Explainable AI (XAI) aims to make these models more transparent and interpretable, a critical requirement in the highly regulated and trust-sensitive financial sector.

6.2.1. SHAP (SHapley Additive exPlanations)

One of the most promising and widely adopted techniques is SHAP (SHapley Additive exPlanations). SHAP values are rooted in cooperative game theory, using the concept of Shapley values to fairly attribute the contribution of each input feature to a model’s prediction. For any given prediction, SHAP assigns a SHAP value to each input feature, quantifying its positive or negative impact on the predicted outcome. This allows analysts to not only examine the accuracy of a model’s forecast but also to understand its explainability. For example, in a neural network predicting stock price direction, SHAP can identify how much changes in interest rates, market volume, or historical volatility influenced the model’s decision. A key advantage of SHAP is its compatibility with various machine learning techniques, including XGBoost, SVM, and LSTM, offering a unified framework for comparative interpretation. However, its main drawback lies in the computational intensity, making it challenging to generate a full SHAP map for highly complex models. Nevertheless, advancements like “TreeSHAP” and “DeepSHAP” have significantly improved its applicability in tree-based and deep learning environments, respectively (Goodell et al., 2023) [167].
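The Shapley attribution underlying SHAP can be computed exactly for a toy model with a handful of features; SHAP’s contribution is making this tractable at scale via approximations such as TreeSHAP. A minimal sketch with a hypothetical prediction function (the feature names and effect sizes are invented for illustration):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    # Exact Shapley attribution: weight each feature's marginal contribution
    # over all subsets of the other features. Exponential in the feature
    # count, hence only feasible for tiny models.
    n = len(features)
    phis = {}
    for f in features:
        others = [g for g in features if g != f]
        phi = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += weight * (value_fn(set(subset) | {f}) - value_fn(set(subset)))
        phis[f] = phi
    return phis

# Hypothetical model output: baseline 100, additive feature effects, plus
# an interaction between "rates" and "volatility".
def model_output(active):
    out = 100.0
    if "rates" in active:      out += 2.0
    if "volume" in active:     out += 1.0
    if "volatility" in active: out -= 3.0
    if {"rates", "volatility"} <= active:
        out -= 1.0  # interaction term, split fairly between the two features
    return out

phi = shapley_values(["rates", "volume", "volatility"], model_output)
print(phi)  # contributions sum to model_output(all) - model_output(empty)
```

Note how the interaction penalty is split evenly between "rates" and "volatility" (each receives −0.5 of it), while "volume", which participates in no interaction, is attributed exactly its additive effect.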

6.2.2. LIME (Local Interpretable Model-Agnostic Explanations)

LIME offers another powerful model-agnostic approach by focusing on local interpretability. For a specific prediction, LIME perturbs the input instance, obtains predictions from the black-box model for these perturbed samples, and then trains a simple, interpretable model (e.g., a linear model or decision tree) on this local neighborhood. The weights or coefficients of this local model explain the black-box model’s decision for that particular instance. While SHAP provides both global and local interpretations, LIME concentrates on explaining individual local decisions, making it excellent for identifying anomalies or reverse-engineering the causes of prediction errors (Rane et al., 2023) [168].

6.2.3. Attention Mechanisms

Increasing attention is also being paid to attention mechanisms, originating from natural language processing but highly adaptable to financial time series. These mechanisms quantify which parts of a time series most influence a model’s decision, for example, which past time points were most decisive in the current prediction. “Self-attention” architectures, such as Transformer models, are gaining increasing traction in financial forecasting, inherently providing transparency by revealing the relative importance of different time steps or features within a sequence (Chen and Ge, 2019) [169].

6.3. Summary

The integrated management of feature selection and explainability is essential for developing robust and trustworthy financial machine learning models. Automated feature selection methods mitigate the risk of overfitting and improve model generalizability by ensuring that only the most relevant information is fed into the model. Concurrently, interpretability techniques—particularly SHAP, LIME, and attention mechanisms—provide the means for transparent model operation, strengthening their role in business decision support and regulatory compliance. As financial models grow in complexity, the synergy between precise feature engineering and clear model explainability will be paramount for their successful and responsible deployment.

7. Advanced Machine Learning Architectures in Financial Time-Series Forecasting and Explainability (XAI)

The dynamic and complex nature of financial markets constantly presents new challenges for predictive models. Volatility, non-linearity, data scarcity, and frequent structural changes necessitate the application of advanced machine learning architectures capable of capturing hidden patterns and managing uncertainty. Recent years have seen the emergence of several innovative deep learning approaches that are revolutionizing time-series forecasting, particularly within the financial sector. Among these, Transformer-based networks, Graph Neural Networks (GNNs), Generative Adversarial Networks (GANs), and hybrid and ensemble strategies stand out. Crucially, the explainability (Explainable AI, XAI) of these complex systems has also become paramount, especially in the regulated financial environment.

7.1. Transformer-Based Models: Long-Term Dependencies and Explainability

Transformer architectures, particularly the Temporal Fusion Transformer (TFT) and related BERT-like models, have brought about a revolutionary development in sequential data and time-series forecasting. Their main strength lies in the self-attention mechanism, which allows the model to weight the relationships between different positions in the input sequence, irrespective of their distance (Souto and Moradi, 2024) [170]. This makes Transformer-based models exceptionally well-suited for simultaneously accounting for long-term dependencies and seasonal trends, a challenge often faced by recurrent neural networks (RNNs) like LSTMs and GRUs. The TFT (Lim et al., 2021) [171] was specifically designed to handle the complexities inherent in time-series data. It not only delivers excellent predictive performance but also prioritizes explainability: it can highlight which input variables contribute most significantly to a forecast and visualizes which past time steps the model focused on. This transparency is especially crucial in the regulated financial environment, where justifying model decisions is essential for building trust and ensuring compliance. Recent publications indicate that Transformer-based models have achieved convincing results across various benchmark datasets (e.g., S&P500, Forex), frequently outperforming LSTM- and GRU-based solutions (Kabir et al., 2025) [172]. Challenges include high memory and computational demands, as well as the complexity of data structuring, which sparse attention mechanisms and more efficient architectures aim to address.

7.2. Graph-Based Models (Graph Neural Networks, GNN): Modeling Network Interactions

Graph Neural Networks (GNNs) represent a relatively new but highly promising approach in financial forecasting, where relationships between various assets (e.g., correlations, co-movements, industry connections, ownership structures) can be modeled as a graph. This allows the algorithm to learn not only the internal dynamics of individual instruments but also the interdependencies and network effects among them (Cheng et al., 2022) [173]. Methods like Graph Convolutional Networks (GCNs) or Graph Attention Networks (GATs) enable the consideration of dynamic relationships. GATs are particularly advantageous as they adaptively weight the contributions of neighboring nodes, accounting for the changing strength of connections. This is especially beneficial in portfolio modeling and optimization (Feng et al. 2025) [174], market stress contagion modeling, and risk analysis. These models are better equipped to capture non-linear interactions that traditional statistical models often miss, thereby providing deeper insights into market mechanisms. Challenges in applying GNNs include constructing relevant graph topologies, scalability, and handling dynamic graphs.
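The adaptive neighbour weighting that makes GATs attractive can be illustrated with a toy example. Everything below is hypothetical (a 4-asset graph, random features, and random parameters in place of trained ones); the point is the mechanics: attention logits are computed only for connected pairs, softmax-normalized over each node's neighbours, and used to aggregate neighbour information:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical toy graph of 4 assets; edges could encode sector links or
# return correlations. Self-loops are added, as in standard GAT practice.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])
np.fill_diagonal(A, 1.0)

H = rng.normal(size=(4, 3))   # node features (e.g., returns, volatility, volume)
W = rng.normal(size=(3, 3))   # shared linear transform
a = rng.normal(size=6)        # attention vector over concatenated node pairs

Z = H @ W
logits = np.full((4, 4), -np.inf)            # -inf masks non-neighbours
for i in range(4):
    for j in range(4):
        if A[i, j]:
            e = a @ np.concatenate([Z[i], Z[j]])
            logits[i, j] = e if e > 0 else 0.2 * e   # LeakyReLU
alpha = np.exp(logits - logits.max(axis=1, keepdims=True))
alpha /= alpha.sum(axis=1, keepdims=True)    # softmax over each node's neighbours
H_next = alpha @ Z                           # attention-weighted aggregation
```

The matrix `alpha` is the interpretable by-product: it quantifies how much each asset's update draws on each connected neighbour, with zeros for unconnected pairs.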

7.3. GAN-Based Time-Series Models (TimeGAN): Synthetic Data Generation and Extreme Conditions

The application of Generative Adversarial Networks (GANs) to time-series data (Takahashi et al., 2019) [175], particularly the TimeGAN, has revolutionized time-series-based data generation and forecasting. TimeGAN combines the generative capabilities of GANs with the assurance of temporal consistency, through a supervised learning objective and a combined loss function (Vuletić et al., 2024) [176]. This enables it to generate realistic synthetic time series that are not only statistically but also dynamically faithful to the original data. This capability is especially useful in data-scarce environments or during stress testing, where insufficient historical data exist to model rare or extreme market events. A major advantage of GAN-based forecasting models is their ability to provide reliable estimates under unusual, extreme conditions, where historical data are sparse or noisy. However, their training can be unstable, and overfitting is a common issue without proper regularization (e.g., Wasserstein GAN, gradient penalty), which remains a key area of research.
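The combined objective can be made concrete with a deliberately loose sketch. Random linear maps stand in for the trained generator and discriminator, the toy "price path" is a random walk, the increment-matching term is only a rough stand-in for TimeGAN's latent one-step-ahead objective, and the 0.5 weight is arbitrary; the point is the structure of the loss, not a working model:

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

T, d_noise = 8, 4
G = rng.normal(size=(d_noise,)) * 0.1     # "generator": noise -> one value per step
D = rng.normal(size=(T,)) * 0.1           # "discriminator": sequence -> realism score

real = np.cumsum(rng.normal(size=T))      # toy random-walk "price path"
noise = rng.normal(size=(T, d_noise))
fake = noise @ G                          # generated sequence, shape (T,)

# Adversarial (unsupervised) loss: classify real vs. generated sequences.
d_real, d_fake = sigmoid(real @ D), sigmoid(fake @ D)
adv_loss = -np.log(d_real) - np.log(1.0 - d_fake)

# Supervised loss: tie generated increments to the real increments, a loose
# stand-in for TimeGAN's latent one-step-ahead consistency term.
sup_loss = np.mean(((fake[1:] - fake[:-1]) - (real[1:] - real[:-1])) ** 2)

combined_loss = adv_loss + 0.5 * sup_loss  # weighted sum of the two objectives
```

In the actual TimeGAN, both networks are recurrent, the supervised term operates in a learned embedding space, and the two losses are minimized jointly during training.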

7.4. Hybrid and Ensemble-Based Models: The Synergy of Algorithms

In financial forecasting, increasing emphasis is placed on hybrid models and ensemble strategies, which aim to combine the strengths of multiple different algorithms to enhance predictive performance and model robustness. Hybrid models fuse different neural network architectures; an example is a CNN + LSTM combination, where the CNN extracts short-term, local features from the time series, while the LSTM models sequential dynamics and long-term dependencies (Livieris et al., 2020) [82]. Similarly, Transformer and GNN hybrids could simultaneously handle temporal and network-based relationships. Ensemble strategies—such as bagging (e.g., random forest), boosting (e.g., XGBoost, LightGBM), or stacking—improve predictive performance and model robustness by combining the forecasts of multiple models (Ingle and Deshmukh, 2021) [177]. Stacking is particularly advanced, often employing meta-models that learn which sub-model performs better in specific situations, such as varying market conditions. According to recent literature, hybrid and ensemble models generally outperform individually applied architectures, especially when operating in heterogeneous market environments, by reducing the risk of individual model errors and improving generalizability. However, complexity and computational demands can pose challenges (Gul, 2025) [178].
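The stacking idea described above can be sketched with scikit-learn's `StackingRegressor`. The data here is purely synthetic (a hypothetical matrix of lagged-return-style features and a noisy non-linear target), so this is an illustration of the mechanism rather than any study's actual pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

rng = np.random.default_rng(4)

# Hypothetical features (e.g., lagged returns) and a noisy non-linear target.
X = rng.normal(size=(300, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=300)

# Stacking: out-of-fold predictions of the base learners become features
# for a Ridge meta-model, which learns which sub-model to trust where.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("svr", SVR(C=1.0)),
    ],
    final_estimator=Ridge(),
    cv=5,
)
stack.fit(X[:250], y[:250])
pred = stack.predict(X[250:])
```

The `cv=5` argument is what makes stacking honest: the meta-model is trained on out-of-fold base predictions, so it never sees a base learner's in-sample fit.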

7.5. Meta-Learning and Few-Shot Learning: Learning from Scarce Data

Data scarcity in financial data, particularly for low-liquidity products or novel market situations, limits the effectiveness of traditional models. Meta-learning and few-shot learning techniques offer a solution by enabling models to learn quickly from limited data. Meta-learning, or “learning to learn,” allows a model to acquire background knowledge from various tasks and then leverage this knowledge to rapidly master a completely new task with minimal examples. Few-shot learning techniques, for instance, can forecast new currency pairs or asset groups based on just a few samples, provided they have accumulated sufficient background knowledge from other assets (Batool et al., 2025; Noor and Fatima, 2025) [179,180]. These techniques are especially useful in stress tests and simulations where insufficient historical data exist for analogous cases. Popular meta-learning strategies include Model-Agnostic Meta-Learning (MAML) and the reptile algorithm, both successfully applicable to financial prediction problems (Liu et al., 2025) [43]. While task definition and scalability can pose challenges, meta-learning offers a promising path towards more robust and adaptive financial modeling.
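The Reptile update is simple enough to sketch end to end. In this toy setup each "task" is a small linear regression whose weights vary around a shared base vector, a hypothetical stand-in for, say, related currency pairs; the outer loop nudges the meta-initialisation toward each task's adapted weights, after which a new task can be fit from only five observations:

```python
import numpy as np

rng = np.random.default_rng(5)

def sgd_steps(w, X, y, lr=0.05, steps=10):
    """Inner loop: a few gradient steps on squared loss for one task."""
    for _ in range(steps):
        w = w - lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

# Each task's weights are a small variation on a shared base vector.
w_base = np.array([1.0, -2.0, 0.5])
def sample_task(n=20):
    w_task = w_base + 0.1 * rng.normal(size=3)
    X = rng.normal(size=(n, 3))
    return X, X @ w_task

# Reptile outer loop: move the meta-initialisation a fraction of the way
# toward each task's adapted weights.
meta_w = np.zeros(3)
for _ in range(200):
    X, y = sample_task()
    meta_w += 0.1 * (sgd_steps(meta_w.copy(), X, y) - meta_w)

# Few-shot adaptation: fit a "new asset" from only 5 observations,
# starting from the learned initialisation.
X_new, y_new = sample_task(n=5)
few_shot = sgd_steps(meta_w.copy(), X_new, y_new)
```

Because the meta-initialisation ends up near the shared structure of the task family, the inner loop needs only a handful of samples and steps to specialise, which is the few-shot property the text describes.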
In the evolving field of financial time-series forecasting, the advanced machine learning architectures discussed (Transformer-based models, GNNs, GANs, hybrid and ensemble approaches, and meta-learning) collectively offer robust and innovative solutions for managing market complexity. These methods enable the capture of long-term dependencies, the modeling of network interactions, the generation of realistic synthetic data, the enhancement of model performance and robustness, and rapid adaptation in data-scarce environments.

8. Results of the Literature Database Analysis

In earlier years, traditional statistical methodologies dominated, but over the past 6–7 years there has been a notable surge in interest in artificial intelligence and machine learning models, as reflected in the literature (Henrique et al., 2019; Ghoddusi et al., 2019; Sezer et al., 2020; Ozbayoglu et al., 2020; Goodell et al., 2021; Rouf et al., 2021; Kumbure et al., 2022; Kumar et al., 2022; Ahmed et al., 2022) [3,41,181,182,183,184,185,186,187]. Within our dataset, the number of publications peaked during 2020–2021, a trend that can be partially attributed to the COVID-19 pandemic and the increased productivity of researchers working under its constraints. Our study includes no publications from 2016: after our screening excluded low-rated journals, no studies from that year remained (Figure 2).
In general, as illustrated in Figure 3, equity markets have attracted the greatest research interest regarding the predictive capabilities of machine learning models. Concurrently, the cryptocurrency market has also experienced substantial growth in scientific publications. The commodity market, while less frequently studied than equities or cryptocurrencies, still holds significance not only for investment decisions but also for optimizing raw material and input demands in manufacturing. Among the four examined product categories, the foreign exchange market is associated with the fewest studies. This may be partially attributable to the relatively lower volatility of major currency pairs compared to equities and cryptocurrencies. In the latter two markets, pronounced non-linearity and extreme price movements favor machine learning methods, which can capture complex patterns more effectively than traditional statistical models.
The distribution of publications by product category and time period (Figure 4) indicates that equity markets dominated research until 2019. During the COVID-19 period (2020–2021), however, cryptocurrencies assumed this leading role. A plausible explanation is that these decentralized products, less dependent on conventional economic fundamentals, gained prominence in times of crisis. A similarly noteworthy shift occurred in 2022, the year of the Russian–Ukrainian conflict outbreak, when scholarly focus moved toward the commodity market—primarily energy carriers like oil and gas due to both theoretical and practical considerations. By 2023, it appears that researchers have once again turned their attention back to equity markets.
The following statistics focus primarily on the methods employed within the research area. Figure 5 clearly indicates that the LSTM model is the most frequently used approach among the studies reviewed. SVR and RF follow in second and third place, while ARIMA—a traditional statistical method—appears notably less common. The publications generally emphasize the superiority of machine learning algorithms over autoregressive models. Overall, neural networks and their various subtypes dominate the literature.
A key consideration in evaluating model performance is the choice of benchmarking metrics. Figure 6 illustrates the five most commonly used metrics identified in the reviewed publications. The RMSE indicator appears most frequently, although each metric has its own strengths and weaknesses. When comparing the predictability of identical product groups, researchers have a wide array of metrics at their disposal. However, to compare models applied to different types of data, a scale-independent measure is required. In this context, MAPE—ranked third in usage—proves particularly useful. Most of the studies employ some form of regression modeling, which explains the prevalence of regression-based metrics among the top four measures. Notably, the top five also include Accuracy, a classification-oriented metric.
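The scale-dependence distinction drawn above is easy to demonstrate. In the sketch below (with made-up prices), two assets carry the same relative forecast error at very different price levels: RMSE differs by the price scale, while MAPE is identical for both, which is what makes it usable for cross-asset comparison:

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Scale-independent percentage error; unstable when prices approach zero."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

# The same relative forecast error on two assets at different price levels:
hi, lo = np.array([100.0, 102.0, 101.0]), np.array([1.00, 1.02, 1.01])
pred_hi, pred_lo = hi + 1.0, lo + 0.01

# rmse(hi, pred_hi) is 100x rmse(lo, pred_lo), but the two MAPE values match.
```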
Figure 7 presents the duration of the datasets analyzed in the reviewed publications. The majority of studies rely on databases spanning one to ten years, while the second most frequently used range is eleven to twenty years. Notably, among the studies examined, two utilized exceptionally long datasets of 47 years. Both of these focused on the foreign exchange market.
Figure 8 shows that regression-based methodologies dominate the analyzed studies, followed by classification approaches. In some cases, both methods were employed concurrently. Since our research primarily focuses on predictive modeling and forecasting capabilities, articles emphasizing regression-based techniques are particularly pertinent to our objectives.
Figure 9 highlights the distribution of publications according to the journals in which they were published. Notably, a quarter of the papers examined appeared in Expert Systems with Applications and Applied Soft Computing. These journals not only focus on model specifications in detail but also provide valuable insights into the complexities of machine learning. Such comprehensive discourse facilitates a deeper exploration of algorithmic intricacies and supports a richer understanding of the evolving methodologies in this field.

9. Identifying Research Gaps

The literature review has clearly shown that, although the application of machine learning models to financial time-series forecasting is a very active research area, several critical gaps remain. Below we highlight three key research niches.

9.1. Practical Applicability of Models in Trading Strategies

Most studies focus on the accuracy of predictive models, typically using static metrics (MAPE, RMSE, MAE) to evaluate results. However, these models are rarely incorporated into concrete, simulated, or real trading strategies. As a result, they do not examine the actual financial value of predictions or take into account practical factors such as transaction costs, slippage, market liquidity, or risk/return profile. The “isolation” of models from the practical application context is a major limitation, especially for assets with high volatility or low liquidity (e.g., cryptocurrencies, commodity markets). This clearly identifies that the real utility of prediction models is still an under-researched area.
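Even a minimal strategy-level evaluation illustrates why this gap matters. The sketch below is entirely synthetic: made-up daily returns, a hypothetical noisy forecast standing in for a model's predictions, and an assumed 5-basis-point cost per unit traded. It converts forecasts into positions and shows how transaction costs eat into the gross edge that accuracy metrics alone would never reveal:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic daily returns and a hypothetical model forecast: the true
# return plus noise, standing in for an imperfect predictive signal.
ret = rng.normal(scale=0.01, size=500)
forecast = ret + rng.normal(scale=0.02, size=500)

pos = np.sign(forecast)                        # go long/short on the sign
turnover = np.abs(np.diff(pos, prepend=0.0))   # position changes
cost = 0.0005 * turnover                       # 5 bps per unit traded (assumed)
strat = pos * ret - cost                       # net strategy returns

gross_pnl = float(np.sum(pos * ret))           # what the forecast "earns"
net_pnl = float(np.sum(strat))                 # what remains after costs
```

A full evaluation would also model slippage, liquidity constraints, and risk-adjusted measures such as the Sharpe ratio, which is precisely the kind of context-sensitive assessment the reviewed studies rarely perform.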

9.2. The Relationship Between Volatility and Predictive Performance Indicators, with a Special Focus on MAPE

The second research gap stems from the relationship between volatility and prediction errors. MAPE (Mean Absolute Percentage Error) is widely used as a performance measure in financial forecasting, but it is sensitive to small absolute errors at low price levels, which translate into disproportionately large percentage errors. In addition, MAPE can be distorted during extreme price movements, raising questions about its reliability, especially for cryptocurrencies and currency pairs. Studies that systematically analyze how predictive performance varies under different market regimes (e.g., high vs. low volatility) are rare in the literature. Filling this gap would be particularly important for understanding how models behave in stressed or unstable environments.
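The low-price-level distortion can be shown directly. In this contrived example, a model makes identical absolute errors before and after a hypothetical price collapse; nothing about forecast quality changes, yet MAPE deteriorates by exactly the ratio of the two price levels:

```python
import numpy as np

def mape(y_true, y_pred):
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

rng = np.random.default_rng(6)
err = rng.normal(scale=0.5, size=100)   # identical absolute forecast errors

calm = np.full(100, 50.0)               # asset trading near 50
crash = np.full(100, 2.0)               # the same asset after a collapse

m_calm = mape(calm, calm + err)
m_crash = mape(crash, crash + err)
# Same absolute errors, yet MAPE is 25 times larger at the lower price level.
```

This is why comparing MAPE across regimes with very different price levels, common in crypto studies, can misattribute a metric artifact to a loss of model skill.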

9.3. Temporal Robustness and Stability of Models in Different Market Environments

The third major research gap concerns the temporal robustness of models. Much research uses a single train-test split and rarely applies time-dependent validation techniques such as rolling-window or walk-forward testing. Without these, however, it is not possible to determine whether a model’s performance is stable over the longer term or across different market cycles (e.g., crisis, boom, bust, consolidation). Since financial time series are non-stationary and have a time-dependent structure, the temporal reliability of a model is crucial. This factor is currently neglected in a large part of the literature, which also opens up a research opportunity toward a new dimension of predictive algorithm evaluation.
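The walk-forward scheme mentioned above amounts to generating a sequence of chronologically ordered train/test index pairs. A minimal expanding-window helper (illustrative parameter values) looks like this:

```python
import numpy as np

def walk_forward_splits(n, train_size, test_size, step):
    """Expanding-window walk-forward splits for a series of length n.
    Each fold trains on all observations before a cut-off and tests on
    the next block, so evaluation never looks into the future."""
    splits, start = [], train_size
    while start + test_size <= n:
        splits.append((np.arange(start), np.arange(start, start + test_size)))
        start += step
    return splits

# Four folds over 100 observations: train on [0, 60), test on [60, 70), etc.
splits = walk_forward_splits(n=100, train_size=60, test_size=10, step=10)
```

Reporting the metric per fold, rather than one aggregate number, is what reveals whether performance holds up across regimes; a rolling-window variant would simply drop the oldest observations as the window advances.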
The common intersection of the above three research niches is the realization that financial applications of machine learning models are predominantly conducted in theoretical or simulated frameworks, while lacking robust, context-sensitive evaluation. Addressing these gaps could help to better understand the actual market applicability of models and lay the foundations for a new valuation framework that takes into account the dynamics of volatility, market regimes, and stability over time.

10. Conclusions

This thorough search and review process enabled us to address the previously posed research questions. The first key finding is that stock markets emerged as the most commonly studied domain within the analyzed literature, reflecting the largest proportion of the examined publications. Cryptocurrency markets also received substantial attention. Given their inherent volatility, predicting cryptocurrency prices poses a major challenge for both models and researchers. Another significant conclusion is the dominance of neural network-based approaches, with the LSTM model appearing most frequently. In most cases, LSTM or its hybrid variants produced the most accurate forecasts. Moreover, we identified the key benchmarking metrics crucial for comparing model performance: RMSE, which featured most prominently, and MAPE, which is scale-independent and thus valuable for comparing accuracy across diverse product categories.
In terms of data length, we found that time horizons of one to ten years are the most commonly used in forecasting studies. Although intervals of 11–20 years are also frequent, shorter time spans predominate for cryptocurrencies, reflecting their relatively recent emergence in financial markets. Regarding statistical methodologies, regression-based models were favored, although several studies employed classification techniques.
From this comprehensive literature review, we conclude that neural network models typically deliver superior forecasting accuracy, with LSTM and GRU models particularly noteworthy. However, our findings also indicate that periods of economic stress, such as the COVID-19 pandemic or the Russian–Ukrainian war, tend to erode predictive performance. Notably, relatively few publications focus on cross-product comparisons to ascertain which types of assets are most predictable. Instead, most work concentrates on model-centric performance assessments, leaving a gap in the literature regarding product-specific predictability. Similarly, no clear consensus has emerged on whether univariate or multivariate approaches are preferable, nor has a definitive pattern emerged for hybrid methodologies, with reports of both improved and reduced performance outcomes. These issues highlight avenues for future research and exploration.
Furthermore, it is evident that although the application of machine learning models in financial time-series forecasting is a highly active research area, several critical gaps remain. One of the most significant shortcomings is the lack of practical applicability: most studies assess the performance of predictive models using static accuracy metrics such as MAPE or RMSE, yet rarely evaluate their real financial utility through concrete, simulated, or live trading strategies. The second gap concerns the relationship between volatility and predictive performance indicators, especially the potential distortions of the MAPE metric under low price levels or during extreme market fluctuations. Closely related to this is the third major gap, which involves the temporal robustness of models. Many studies rely on a single train-test split, without employing dynamic validation techniques such as rolling-window or walk-forward testing, making it difficult to assess the consistency of model performance across different market regimes. The common thread among these gaps is that the use of machine learning in financial applications largely remains theoretical or simulation-based, lacking a robust, context-sensitive evaluation framework. Addressing these gaps not only enhances the interpretability of existing research but also provides a solid foundation for developing a new evaluation approach that accounts for volatility dynamics, market regime shifts, and temporal stability.

Author Contributions

Conceptualization, L.V., T.T. and T.B.; methodology, L.V.; software, L.V.; validation, T.B. and T.T.; formal analysis, T.B.; investigation, T.T.; resources, T.T.; data curation, L.V.; writing—original draft preparation, L.V., T.B. and T.T.; writing—review and editing, L.V., T.B. and T.T.; visualization, L.V.; supervision, T.B.; project administration, T.T.; funding acquisition, L.V. and T.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data used in the present study are publicly available.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LR: Linear Regression
BLR: Bayesian Linear Regression
DT: Decision Tree
RF: Random Forest
ELM: Extreme Learning Machine
Adaboost: Adaptive Boosting Regression
Gradient Boost: Gradient Boosting Regression
XGBoost: Extreme Gradient Boosting
LightGBM: Light Gradient Boosting Machine
SVR: Support Vector Regression
SVM: Support Vector Machine
MVaR: Mean Value-at-Risk
ABC-SVR: Artificial Bee Colony-Support Vector Regression
SSA-SVR: Strategic Seasonality-Adjusted-Support Vector Regression
OGA-SVR: Optimized Genetic Algorithm-Based-Support Vector Regression
FS-GA-SVR: Feature Selection-Genetic Algorithm-Support Vector Regression
WAMC: Weighted Memory Channels Regression
ANFIS: Adaptive Network Fuzzy Inference System
TDNN: Time-Delay Neural Network
GINN: Generalized Improved Neural Network
DNPP: Dynamic Noisy Proximal Policy
DBN: Deep Belief Network
GVMD-Q-DBN-LSTM-GRU: Generalized Variational Mode Decomposition-Q-Learning Algorithm-Deep Belief Network-Long Short-Term Memory-Gated Recurrent Unit
KNN: K-Nearest Neighbors
KNN-XGB-SVR: K-Nearest Neighbors-Extreme Gradient Boosting-Support Vector Regression
ENN: Elman Neural Network
ANN: Artificial Neural Networks
DNN: Deep Neural Network
RNN: Recurrent Neural Network
BSd-RNN: B-Spline-Recurrent Neural Network
DWT-RNN: Discrete Wavelet Transform-Recurrent Neural Network
CNN: Convolutional Neural Network
CNN-RNN: Convolutional Neural Network-Recurrent Neural Network
TCN: Temporal Convolutional Network
GRU: Gated Recurrent Unit
LSTNet: Long- and Short-Term Time-Series Network
LSTM: Long Short-Term Memory
LSTM-GRU: Long Short-Term Memory-Gated Recurrent Unit
GRU-LSTM: Gated Recurrent Unit-Long Short-Term Memory
BiLSTM: Bidirectional Long Short-Term Memory
BiLSTM-BR: Bidirectional Long Short-Term Memory-Bagging Ridge
SLSTM: Stacked Long Short-Term Memory
LSTM-DNN: Long Short-Term Memory-Deep Neural Network
XGBoost-LSTM: Extreme Gradient Boosting-Long Short-Term Memory
CNN-Bi-LSTM-AM: Convolutional Neural Network-Bidirectional Long Short-Term Memory-Attention Mechanism
SFM: State Frequency Memory
GA: Genetic Algorithms
MLP: Multilayer Perceptron
CNN-LSTM: Convolutional Neural Network-Long Short-Term Memory
LSTM-CNN: Long Short-Term Memory-Convolutional Neural Network
LSTM-ARO: Long Short-Term Memory-Artificial Rabbits Optimization
DC-LSTM: Deep Coupled-Long Short-Term Memory
GWO-CNN-LSTM: Gray Wolf Optimizer-Convolutional Neural Network-Long Short-Term Memory
CNN-GRU: Convolutional Neural Network-Gated Recurrent Unit
EMD-CNN-GRU: Empirical Mode Decomposition-Convolutional Neural Network-Gated Recurrent Unit
VMD-CNN-GRU: Variational Mode Decomposition-Convolutional Neural Network-Gated Recurrent Unit
AR: Autoregressive
AR-DNN: Autoregressive-Deep Neural Network
ARIMA: Autoregressive Integrated Moving Average
VAR: Vector Autoregression
GARCH: Generalized Autoregressive Conditional Heteroskedasticity
APGARCH: Asymmetric Power Generalized Autoregressive Conditional Heteroskedasticity
WT-LSTM: Wavelet Transform-Long Short-Term Memory
WT-GRU: Wavelet Transform-Gated Recurrent Unit
WT-TCN: Wavelet Transform-Temporal Convolutional Network
LSTM-RL: Long Short-Term Memory-Reinforcement Learning
HMM-ALSTM: Hidden Markov Model-Attentive Long Short-Term Memory
RMT-LSTM: Removing Outliers Mahalanobis Transformation-Long Short-Term Memory
RZT-LSTM: Removing Outliers Z-Score Transformation-Long Short-Term Memory
ANN-GARCH: Artificial Neural Networks-Generalized Autoregressive Conditional Heteroskedasticity
ANN-GJR: Artificial Neural Networks-Glosten, Jagannathan, and Runkle
CEEMDAN-LSTM: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Long Short-Term Memory
CEEMDAN-SVR: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Support Vector Regression
CEEMDAN-MLP: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Multilayer Perceptron
CEEMDAN-GRU: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Gated Recurrent Unit
CEEMDAN-CNN: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Convolutional Neural Network
CEEMDAN-CNN-GRU: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Convolutional Neural Network-Gated Recurrent Unit
CEEMDAN-TCN: Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Temporal Convolutional Network
MEMD-GRU: Multivariate Empirical Mode Decomposition-Gated Recurrent Unit
MEMD-TCN: Multivariate Empirical Mode Decomposition-Temporal Convolutional Network
CATN: Cross Attentive Tree-Aware Network
ICEEMDAN-LSTM-CNN-CBAM: Improved Complete Ensemble Empirical Mode Decomposition Adaptive Noise-Long Short-Term Memory-Convolutional Neural Network-Convolutional Block Attention Module
DWT-FFNN: Discrete Wavelet Transform-Feed-Forward Neural Network
WNN: Wavelet Neural Network
GRNN: Generalized Regression Neural Networks
FFNN: Feed-Forward Neural Network
NSE: National Stock Exchange of India
Nifty50: Indian Stock Market Benchmark Index
DJIA: Dow Jones Industrial Average
DAX: Deutscher Aktienindex
Nikkei225: Leading index of Japan’s top 225 companies traded on the Tokyo Stock Exchange
SSE: Shanghai Stock Exchange
NYSE: New York Stock Exchange
S&P500: Standard and Poor’s 500
FAANG: Facebook, Amazon, Apple, Netflix, and Google
HSI: Hang Seng Index
NASDAQ: NASDAQ Composite Index
IBEX: IBerian IndEX
SSE50: Shanghai Stock Exchange Index
CSI100: China Securities Index 100
CSI200: China Securities Index 200
CSI500: China Securities Index 500
GEI100: Growth Enterprise Index 100
NASDAQ100: NASDAQ 100 Index
NYSE100: New York Stock Exchange Index 100
CAC40: French Stock Market Index (“Cotation Assistée en Continu”)
RTSI: Russian Trading System Index
FTSE100: Financial Times Stock Exchange 100
BSE: Bombay Stock Exchange
EUR-USD: Euro-US Dollar
GBP-USD: Pound Sterling-US Dollar
USD-CHF: US Dollar-Swiss Franc
USD-CAD: US Dollar-Canadian Dollar
AUD-USD: Australian Dollar-US Dollar
GBP-JPY: Pound Sterling-Japanese Yen
EUR-JPY: Euro-Japanese Yen
USD-JPY: US Dollar-Japanese Yen
USD-AUD: US Dollar-Australian Dollar
JPY-USD: Japanese Yen-US Dollar
CHF-USD: Swiss Franc-US Dollar
EUR-GBP: Euro-Pound Sterling
CAD-CHF: Canadian Dollar-Swiss Franc
CAD-USD: Canadian Dollar-US Dollar
AUD-RMB: Australian Dollar-Renminbi
EUR-RMB: Euro-Renminbi
USD-RMB: US Dollar-Renminbi
RMB-JPY: Renminbi-Japanese Yen
ZAR-USD: South African Rand-US Dollar
NGN-USD: Nigerian Naira-US Dollar
RMB-USD: Renminbi-US Dollar
USD-CNY: US Dollar-Chinese Yuan

References

  1. Li, X.; Tang, P. Stock Index Prediction Based on Wavelet Transform and FCD-MLGRU. J. Forecast. 2020, 39, 1229–1237. [Google Scholar] [CrossRef]
  2. Wall, L.D. Some financial regulatory implications of artificial intelligence. J. Econ. Bus. 2018, 100, 55–63. [Google Scholar] [CrossRef]
  3. Goodell, J.W.; Kumar, S.; Lim, W.M.; Pattnaik, D. Artificial Intelligence and Machine Learning in Finance: Identifying Foundations, Themes, and Research Clusters from Bibliometric Analysis. J. Behav. Exp. Financ. 2021, 32, 100577. [Google Scholar] [CrossRef]
  4. In, S.Y.; Rook, D.; Monk, A. Integrating Alternative Data (Also Known as ESG Data) in Investment Decision Making. Glob. Econ. Rev. 2019, 48, 237–260. [Google Scholar] [CrossRef]
  5. López de Prado, M. Beyond Econometrics: A Roadmap Towards Financial Machine Learning. SSRN Electron. J. 2019. [Google Scholar] [CrossRef]
  6. Duan, Y.; Goodell, J.W.; Li, H.; Li, X. Assessing Machine Learning for Forecasting Economic Risk: Evidence from an Expanded Chinese Financial Information Set. Financ. Res. Lett. 2022, 46, 102273. [Google Scholar] [CrossRef]
  7. Goulet Coulombe, P.; Leroux, M.; Stevanovic, D.; Surprenant, S. How Is Machine Learning Useful for Macroeconomic Forecasting? J. Appl. Econom. 2022, 37, 920–964. [Google Scholar] [CrossRef]
  8. Dixon, M.F.; Halperin, I.; Bilokon, P. Machine Learning in Finance; Springer International Publishing: New York, NY, USA, 2020. [Google Scholar] [CrossRef]
  9. Nazareth, N.; Reddy, Y.Y.R. Financial Applications of Machine Learning: A Literature Review. Expert Syst. Appl. 2023, 219, 119640. [Google Scholar] [CrossRef]
  10. Chan, T.L.; Hale, N. Pricing European-Type, Early-Exercise and Discrete Barrier Options Using an Algorithm for the Convolution of Legendre Series. Quant. Financ. 2020, 20, 1307–1324. [Google Scholar] [CrossRef]
  11. Gao, B. The Use of Machine Learning Combined with Data Mining Technology in Financial Risk Prevention. Comput. Econ. 2022, 59, 1385–1405. [Google Scholar] [CrossRef]
  12. Gómez Martínez, R.; Prado Román, M.; Plaza Casado, P. Big Data Algorithmic Trading Systems Based on Investors’ Mood. J. Behav. Financ. 2019, 20, 227–238. [Google Scholar] [CrossRef]
  13. Houlihan, P.; Creamer, G.G. Leveraging Social Media to Predict Continuation and Reversal in Asset Prices. Comput. Econ. 2021, 57, 433–453. [Google Scholar] [CrossRef]
  14. Kokina, J.; Gilleran, R.; Blanchette, S.; Stoddard, D. Accountant as Digital Innovator: Roles and Competencies in the Age of Automation. Account. Horiz. 2021, 35, 153–184. [Google Scholar] [CrossRef]
  15. Teng, H.W.; Lee, M. Estimation procedures of using five alternative machine learning methods for predicting credit card default. Rev. Pac. Basin Financ. Mark. Policies 2019, 22, 1950021. [Google Scholar] [CrossRef]
  16. Bee, M.; Hambuckers, J.; Trapin, L. Estimating Large Losses in Insurance Analytics and Operational Risk Using the g-and-h Distribution. Quant. Financ. 2021, 21, 1207–1221. [Google Scholar] [CrossRef]
  17. Li, Q.; Xu, Z.; Shen, X.; Zhong, J. Predicting Business Risks of Commercial Banks Based on BP-GA Optimized Model. Comput. Econ. 2021, 59, 1423–1441. [Google Scholar] [CrossRef]
  18. Chen, Z.; Li, C.; Sun, W. Bitcoin Price Prediction Using Machine Learning: An Approach to Sample Dimension Engineering. J. Comput. Appl. Math. 2020, 365, 112395. [Google Scholar] [CrossRef]
  19. Fama, E.F. Efficient Capital Markets: A Review of Theory and Empirical Work. J. Financ. 1970, 25, 383–417. [Google Scholar] [CrossRef]
  20. Fama, E.F. Efficient capital markets: II. J. Financ. 1991, 46, 1575–1617. [Google Scholar] [CrossRef]
  21. Fama, E.F. Random Walks in Stock Market Prices. Financ. Anal. J. 1965, 21, 55–59. [Google Scholar] [CrossRef]
  22. Samuelson, P.A. Proof That Properly Anticipated Prices Fluctuate Randomly. Ind. Manag. Rev. 1965, 6, 41–49. [Google Scholar]
  23. Van Horne, J.C.; Parker, G.G. The Random-Walk Theory: An Empirical Test. Financ. Anal. J. 1967, 23, 87–92. [Google Scholar] [CrossRef]
  24. Levy, R.A. Random Walks: Reality or Myth. Financ. Anal. J. 1967, 23, 69–77. [Google Scholar] [CrossRef]
  25. Shiller, R.J. From efficient markets theory to behavioral finance. J. Econ. Perspect. 2003, 17, 83–104. [Google Scholar] [CrossRef]
  26. Jegadeesh, N.; Titman, S. Returns to buying winners and selling losers: Implications for stock market efficiency. J. Financ. 1993, 48, 65–91. [Google Scholar] [CrossRef]
  27. Fama, E.F.; French, K.R. The cross-section of expected stock returns. J. Financ. 1992, 47, 427–465. [Google Scholar] [CrossRef]
  28. Chopra, R.; Sharma, G.D. Application of artificial intelligence in stock market forecasting: A critique, review, and research agenda. J. Risk Financ. Manag. 2021, 14, 526. [Google Scholar] [CrossRef]
  29. Kahneman, D.; Tversky, A. Prospect theory: An analysis of decision under risk. Econometrica 1979, 47, 263–291. [Google Scholar] [CrossRef]
  30. Lo, A.W.; Zhang, R. The Adaptive Markets Hypothesis: An Evolutionary Approach to Understanding Financial System Dynamics; Oxford University Press: Oxford, UK, 2024. [Google Scholar] [CrossRef]
  31. Akerlof, G.A. The market for “lemons”: Quality uncertainty and the market mechanism. In Uncertainty in Economics; Academic Press: New York, NY, USA, 1978; pp. 235–251. [Google Scholar] [CrossRef]
  32. Stiglitz, J.E. The contributions of the economics of information to twentieth century economics. Q. J. Econ. 2000, 115, 1441–1478. [Google Scholar] [CrossRef]
  33. Williams, L.V. (Ed.) Information Efficiency in Financial and Betting Markets; Cambridge University Press: Cambridge, UK, 2005. [Google Scholar] [CrossRef]
  34. Ahern, K.R.; Peress, J. The role of media in financial decision-making. In Handbook of Financial Decision Making; Edward Elgar Publishing: Gloucestershire, UK, 2023; pp. 192–212. [Google Scholar] [CrossRef]
  35. Sun, Y.; Liu, L.; Xu, Y.; Zeng, X.; Shi, Y.; Hu, H.; Jiang, J.; Abraham, A. Alternative data in finance and business: Emerging applications and theory analysis. Financ. Innov. 2024, 10, 127. [Google Scholar] [CrossRef]
  36. Shiller, R.J. Narrative Economics: How Stories Go Viral and Drive Major Economic Events; Princeton University Press: Princeton, NJ, USA, 2020. [Google Scholar]
  37. Hasan, M.M.; Popp, J.; Oláh, J. Current landscape and influence of big data on finance. J. Big Data 2020, 7, 21. [Google Scholar] [CrossRef]
  38. Goldstein, I.; Spatt, C.S.; Ye, M. The Next Chapter of Big Data in Finance. Rev. Financ. Stud. 2024, 38, hhae083. [Google Scholar] [CrossRef]
  39. Gandhmal, D.P.; Kumar, K. Systematic Analysis and Review of Stock Market Prediction Techniques. Comput. Sci. Rev. 2019, 34, 100190. [Google Scholar] [CrossRef]
  40. Li, A.W.; Bastos, G.S. Stock Market Forecasting Using Deep Learning and Technical Analysis: A Systematic Review. IEEE Access 2020, 8, 185232–185242. [Google Scholar] [CrossRef]
  41. Kumbure, M.M.; Lohrmann, C.; Luukka, P.; Porras, J. Machine Learning Techniques and Data for Stock Market Forecasting: A Literature Review. Expert Syst. Appl. 2022, 197, 116659. [Google Scholar] [CrossRef]
  42. Noor, K.; Fatima, U. Meta Learning Strategies for Comparative and Efficient Adaptation to Financial Datasets. IEEE Access 2025, 13, 24158–24170. [Google Scholar] [CrossRef]
  43. Shi, C.; Zhuang, X. A Study Concerning Soft Computing Approaches for Stock Price Forecasting. Axioms 2019, 8, 116. [Google Scholar] [CrossRef]
  44. Khattak, B.H.A.; Shafi, I.; Khan, A.S.; Flores, E.S.; Lara, R.G.; Samad, M.A.; Ashraf, I. A Systematic Survey of AI Models in Financial Market Forecasting for Profitability Analysis. IEEE Access 2023, 11, 125359–125380. [Google Scholar] [CrossRef]
  45. Gunnarsson, E.S.; Isern, H.R.; Kaloudis, A.; Risstad, M.; Vigdel, B.; Westgaard, S. Prediction of Realized Volatility and Implied Volatility Indices Using AI and Machine Learning: A Review. Int. Rev. Financ. Anal. 2024, 93, 103221. [Google Scholar] [CrossRef]
  46. Ardabili, S.; Abdolalizadeh, L.; Mako, C.; Torok, B.; Mosavi, A. Systematic Review of Deep Learning and Machine Learning for Building Energy. Front. Energy Res. 2022, 10, 1–19. [Google Scholar] [CrossRef]
  47. Moher, D.; Liberati, A.; Altman, D.G. PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses. PLoS Med. 2009, 6, e1000097. [Google Scholar] [CrossRef] [PubMed]
  48. Hajiabotorabi, Z.; Kazemi, A.; Samavati, F.F.; Ghaini, F.M.M. Improving DWT-RNN Model via B-Spline Wavelet Multiresolution to Forecast a High-Frequency Time Series. Expert Syst. Appl. 2019, 138, 112842. [Google Scholar] [CrossRef]
  49. Nabipour, M.; Nayyeri, P.; Jabani, H.; Shahab, S.; Mosavi, A. Predicting Stock Market Trends Using Machine Learning and Deep Learning Algorithms via Continuous and Binary Data; A Comparative Analysis on the Tehran Stock Exchange. IEEE Access 2020, 8, 150199–150212. [Google Scholar] [CrossRef]
  50. Hiransha, M.; Gopalakrishnan, E.A.; Menon, V.K.; Soman, K.P. NSE Stock Market Prediction Using Deep-Learning Models. Procedia Comput. Sci. 2018, 132, 1351–1362. [Google Scholar] [CrossRef]
  51. Beniwal, M.; Singh, A.; Kumar, N. Forecasting Long-Term Stock Prices of Global Indices: A Forward-Validating Genetic Algorithm Optimization Approach for Support Vector Regression. Appl. Soft Comput. 2023, 145, 110566. [Google Scholar] [CrossRef]
  52. Rather, A.M. LSTM-Based Deep Learning Model for Stock Prediction and Predictive Optimization Model. Eur. J. Decis. Process. 2021, 9, 100001. [Google Scholar] [CrossRef]
  53. Banik, S.; Sharma, N.; Mangla, M.; Mohanty, S.N.; Shitharth, S. LSTM Based Decision Support System for Swing Trading in Stock Market. Knowl.-Based Syst. 2022, 239, 107994. [Google Scholar] [CrossRef]
  54. Fischer, T.; Krauss, C. Deep Learning with Long Short-Term Memory Networks for Financial Market Predictions. Eur. J. Oper. Res. 2018, 270, 654–669. [Google Scholar] [CrossRef]
  55. Siami-Namini, S.; Tavakoli, N.; Namin, A.S. A comparison of ARIMA and LSTM in forecasting time series. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17–20 December 2018; pp. 1394–1401. [Google Scholar] [CrossRef]
  56. Liu, Y. Novel Volatility Forecasting Using Deep Learning–Long Short Term Memory Recurrent Neural Networks. Expert Syst. Appl. 2019, 132, 99–109. [Google Scholar] [CrossRef]
  57. Bhandari, H.N.; Rimal, B.; Pokhrel, N.R.; Rimal, R.; Dahal, K.R.; Khatri, R.K. Predicting Stock Market Index Using LSTM. Mach. Learn. Appl. 2022, 9, 100320. [Google Scholar] [CrossRef]
  58. Liang, X.; Ge, Z.; Sun, L.; He, M.; Chen, H. LSTM with Wavelet Transform Based Data Preprocessing for Stock Price Prediction. Math. Probl. Eng. 2019, 2019, 1–8. [Google Scholar] [CrossRef]
  59. Qiu, J.; Wang, B.; Zhou, C. Forecasting Stock Prices with Long-Short Term Memory Neural Network Based on Attention Mechanism. PLoS ONE 2020, 15, e0227222. [Google Scholar] [CrossRef]
  60. Skehin, T.; Crane, M.; Bezbradica, M. Day ahead forecasting of FAANG stocks using ARIMA, LSTM networks and wavelets. In CEUR Workshop Proceedings; RWTH Aachen University: Aachen, Germany, 2018. [Google Scholar]
  61. Cao, J.; Li, Z.; Li, J. Financial Time Series Forecasting Model Based on CEEMDAN and LSTM. Phys. A 2019, 519, 127–139. [Google Scholar] [CrossRef]
  62. Gülmez, B. Stock Price Prediction with Optimized Deep LSTM Network with Artificial Rabbits Optimization Algorithm. Expert Syst. Appl. 2023, 227, 120346. [Google Scholar] [CrossRef]
  63. Zhang, L.; Aggarwal, C.; Qi, G.-J. Stock Price Prediction via Discovering Multi-Frequency Trading Patterns. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 13–17 August 2017. [Google Scholar] [CrossRef]
  64. Hanauer, M.X.; Kalsbach, T. Machine Learning and the Cross-Section of Emerging Market Stock Returns. Emerg. Mark. Rev. 2023, 55, 101022. [Google Scholar] [CrossRef]
  65. Nelson, D.M.; Pereira, A.C.; De Oliveira, R.A. Stock Market’s Price Movement Prediction with LSTM Neural Networks. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1419–1426. [Google Scholar] [CrossRef]
  66. Kristjanpoller, W.; Fadic, A.; Minutolo, M.C. Volatility Forecast Using Hybrid Neural Network Models. Expert Syst. Appl. 2014, 41, 2437–2442. [Google Scholar] [CrossRef]
  67. Nikou, M.; Mansourfar, G.; Bagherzadeh, J. Stock Price Prediction Using Deep Learning Algorithm and Its Comparison with Machine Learning Algorithms. Intell. Syst. Account. Financ. Manag. 2019, 26, 164–174. [Google Scholar] [CrossRef]
  68. Ayala, J.; García-Torres, M.; Noguera, J.L.V.; Gómez-Vela, F.; Divina, F. Technical Analysis Strategy Optimization Using a Machine Learning Approach in Stock Market Indices. Knowl.-Based Syst. 2021, 225, 107119. [Google Scholar] [CrossRef]
  69. Ballings, M.; Van den Poel, D.; Hespeels, N.; Gryp, R. Evaluating Multiple Classifiers for Stock Price Direction Prediction. Expert Syst. Appl. 2015, 42, 7046–7056. [Google Scholar] [CrossRef]
  70. Basak, S.; Kar, S.; Saha, S.; Khaidem, L.; Dey, S.R. Predicting the Direction of Stock Market Prices Using Tree-Based Classifiers. N. Am. J. Econ. Financ. 2019, 47, 552–567. [Google Scholar] [CrossRef]
  71. Chong, E.; Han, C.; Park, F.C. Deep Learning Networks for Stock Market Analysis and Prediction: Methodology, Data Representations, and Case Studies. Expert Syst. Appl. 2017, 83, 187–205. [Google Scholar] [CrossRef]
  72. Md, A.Q.; Kapoor, S.; AV, C.J.; Sivaraman, A.K.; Tee, K.F.; Sabireen, H.; Janakiraman, N. Novel Optimization Approach for Stock Price Forecasting Using Multi-Layered Sequential LSTM. Appl. Soft Comput. 2023, 134, 109830. [Google Scholar] [CrossRef]
  73. Liu, K.; Zhou, J.; Dong, D. Improving Stock Price Prediction Using the Long Short-Term Memory Model Combined with Online Social Networks. J. Behav. Exp. Financ. 2021, 30, 100507. [Google Scholar] [CrossRef]
  74. Long, J.; Chen, Z.; He, W.; Wu, T.; Ren, J. An Integrated Framework of Deep Learning and Knowledge Graph for Prediction of Stock Price Trend: An Application in Chinese Stock Exchange Market. Appl. Soft Comput. 2020, 106205. [Google Scholar] [CrossRef]
  75. Jiang, J.; Wu, L.; Zhao, H.; Zhu, H.; Zhang, W. Forecasting Movements of Stock Time Series Based on Hidden State Guided Deep Learning Approach. Inf. Process. Manag. 2023, 60, 103328. [Google Scholar] [CrossRef]
  76. Yao, Y.; Zhang, Z.Y.; Zhao, Y. Stock index forecasting based on multivariate empirical mode decomposition and temporal convolutional networks. Appl. Soft Comput. 2023, 142, 110356. [Google Scholar] [CrossRef]
  77. Behera, J.; Pasayat, A.K.; Behera, H.; Kumar, P. Prediction Based Mean-Value-at-Risk Portfolio Optimization Using Machine Learning Regression Algorithms for Multi-National Stock Markets. Eng. Appl. Artif. Intell. 2023, 120, 105843. [Google Scholar] [CrossRef]
  78. Yu, Y.; Lin, Y.; Hou, X.; Zhang, X. Novel optimization approach for realized volatility forecast of stock price index based on deep reinforcement learning model. Expert Syst. Appl. 2023, 233, 120880. [Google Scholar] [CrossRef]
  79. Jing, N.; Wu, Z.; Wang, H. A Hybrid Model Integrating Deep Learning with Investor Sentiment Analysis for Stock Price Prediction. Expert Syst. Appl. 2021, 178, 115019. [Google Scholar] [CrossRef]
  80. Alzaman, C. Forecasting and Optimization Stock Predictions: Varying Asset Profile, Time Window, and Hyperparameter Factors. Syst. Soft Comput. 2023, 5, 200052. [Google Scholar] [CrossRef]
  81. Liang, Y.; Lin, Y.; Lu, Q. Forecasting Gold Price Using a Novel Hybrid Model with ICEEMDAN and LSTM-CNN-CBAM. Expert Syst. Appl. 2022, 206, 117847. [Google Scholar] [CrossRef]
  82. Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM Model for Gold Price Time-Series Forecasting. Neural Comput. Appl. 2020, 32, 17351–17360. [Google Scholar] [CrossRef]
  83. Ozdemir, A.C.; Buluş, K.; Zor, K. Medium- to Long-Term Nickel Price Forecasting Using LSTM and GRU Networks. Resour. Policy 2022, 78, 102906. [Google Scholar] [CrossRef]
  84. Shi, T.; Li, C.; Zhang, W.; Zhang, Y. Forecasting on metal resource spot settlement price: New evidence from the machine learning model. Resour. Policy 2023, 81, 103360. [Google Scholar] [CrossRef]
  85. Zhang, H.; Nguyen, H.; Vu, D.-A.; Bui, X.-N.; Pradhan, B. Forecasting monthly copper price: A comparative study of various machine learning-based methods. Resour. Policy 2021, 73, 102189. [Google Scholar] [CrossRef]
  86. Fang, Y.; Guan, B.; Wu, S.; Heravi, S. Optimal Forecast Combination Based on Ensemble Empirical Mode Decomposition for Agricultural Commodity Futures Prices. J. Forecast. 2020, 39, 877–886. [Google Scholar] [CrossRef]
  87. Ribeiro, M.H.D.M.; dos Santos Coelho, L. Ensemble Approach Based on Bagging, Boosting and Stacking for Short-Term Prediction in Agribusiness Time Series. Appl. Soft Comput. 2020, 86, 105837. [Google Scholar] [CrossRef]
  88. Liu, D.; Tang, Z.; Cai, Y. A Hybrid Model for China’s Soybean Spot Price Prediction by Integrating CEEMDAN with Fuzzy Entropy Clustering and CNN-GRU-Attention. Sustainability 2022, 14, 15522. [Google Scholar] [CrossRef]
  89. RL, M.; Mishra, A.K. Forecasting Spot Prices of Agricultural Commodities in India: Application of Deep-Learning Models. Intell. Syst. Account. Financ. Manag. 2021, 28, 72–83. [Google Scholar] [CrossRef]
  90. Deepa, S.; Alli, A.; Gokila, S. Machine Learning Regression Model for Material Synthesis Prices Prediction in Agriculture. Mater. Today Proc. 2021, 81, 989–993. [Google Scholar] [CrossRef]
  91. Ouyang, H.; Wei, X.; Wu, Q. Agricultural Commodity Futures Prices Prediction via Long-and Short-Term Time Series Network. J. Appl. Econ. 2019, 22, 468–483. [Google Scholar] [CrossRef]
  92. Liang, J.; Jia, G. China Futures Price Forecasting Based on Online Search and Information Transfer. Data Sci. Manag. 2022, 5, 187–198. [Google Scholar] [CrossRef]
  93. Weng, Y.; Wang, X.; Hua, J.; Wang, H.; Kang, M.; Wang, F.-Y. Forecasting Horticultural Products Price Using ARIMA Model and Neural Network Based on a Large-Scale Data Set Collected by Web Crawler. IEEE Trans. Comput. Soc. Syst. 2019, 1–7. [Google Scholar] [CrossRef]
  94. Deina, C.; do Amaral Prates, M.H.; Alves, C.H.R.; Martins, M.S.R.; Trojan, F.; Stevan Jr, S.L.; Siqueira, H.V. A Methodology for Coffee Price Forecasting Based on Extreme Learning Machines. Inf. Process. Agric. 2022, 9, 556–565. [Google Scholar] [CrossRef]
  95. Liang, X.; Luo, P.; Li, X.; Wang, X.; Shu, L. Crude Oil Price Prediction Using Deep Reinforcement Learning. Resour. Policy 2023, 81, 103363. [Google Scholar] [CrossRef]
  96. Sadefo Kamdem, J.; Bandolo Essomba, R.; Njong Berinyuy, J. Deep Learning Models for Forecasting and Analyzing the Implications of COVID-19 Spread on Some Commodities Markets Volatilities. Chaos Solitons Fractals 2020, 140, 110215. [Google Scholar] [CrossRef] [PubMed]
  97. Urolagin, S.; Sharma, N.; Datta, T.K. A combined architecture of multivariate LSTM with Mahalanobis and Z-Score transformations for oil price forecasting. Energy 2021, 231, 120963. [Google Scholar] [CrossRef]
  98. Xu, Z.; Mohsin, M.; Ullah, K.; Ma, X. Using econometric and machine learning models to forecast crude oil prices: Insights from economic history. Resour. Policy 2023, 83, 103614. [Google Scholar] [CrossRef]
  99. Niu, T.; Wang, J.; Lu, H.; Yang, W.; Du, P. A Learning System Integrating Temporal Convolution and Deep Learning for Predictive Modeling of Crude Oil Price. IEEE Trans. Ind. Inform. 2020, 17, 4602–4612. [Google Scholar] [CrossRef]
  100. Guliyev, H.; Mustafayev, E. Predicting the Changes in the WTI Crude Oil Price Dynamics Using Machine Learning Models. Resour. Policy 2022, 77, 102664. [Google Scholar] [CrossRef]
  101. Wang, J.; Zhou, H.; Hong, T.; Li, X.; Wang, S. A multi-granularity heterogeneous combination approach to crude oil price forecasting. Energy Econ. 2020, 91, 104790. [Google Scholar] [CrossRef]
  102. Čeperić, E.; Žiković, S.; Čeperić, V. Short-Term Forecasting of Natural Gas Prices Using Machine Learning and Feature Selection Algorithms. Energy 2017, 140, 893–900. [Google Scholar] [CrossRef]
  103. Zheng, Y.; Luo, J.; Chen, J.; Chen, Z.; Shang, P. Natural gas spot price prediction research under the background of Russia-Ukraine conflict-based on FS-GA-SVR hybrid model. J. Environ. Manag. 2023, 344, 118446. [Google Scholar] [CrossRef] [PubMed]
  104. Wang, J.; Cao, J.; Yuan, S.; Cheng, M. Short-term forecasting of natural gas prices by using a novel hybrid method based on a combination of the CEEMDAN-SE-and the PSO-ALS-optimized GRU network. Energy 2021, 233, 121082. [Google Scholar] [CrossRef]
  105. Sun, X.; Liu, M.; Sima, Z. A novel cryptocurrency price trend forecasting model based on LightGBM. Financ. Res. Lett. 2020, 32, 101084. [Google Scholar] [CrossRef]
  106. Wang, Y.; Wang, C.; Sensoy, A.; Yao, S.; Cheng, F. Can Investors’ Informed Trading Predict Cryptocurrency Returns? Evidence from Machine Learning. Res. Int. Bus. Financ. 2022, 62, 101683. [Google Scholar] [CrossRef]
  107. Oyedele, A.A.; Ajayi, A.O.; Oyedele, L.O.; Bello, S.A.; Jimoh, K.O. Performance Evaluation of Deep Learning and Boosted Trees for Cryptocurrency Closing Price Prediction. Expert Syst. Appl. 2023, 213, 119233. [Google Scholar] [CrossRef]
  108. Akyildirim, E.; Goncu, A.; Sensoy, A. Prediction of Cryptocurrency Returns Using Machine Learning. Ann. Oper. Res. 2021, 297, 3–36. [Google Scholar] [CrossRef]
  109. Borges, T.A.; Neves, R.F. Ensemble of Machine Learning Algorithms for Cryptocurrency Investment with Different Data Resampling Methods. Appl. Soft Comput. 2020, 90, 106187. [Google Scholar] [CrossRef]
  110. Zhang, Z.; Dai, H.N.; Zhou, J.; Mondal, S.K.; García, M.M.; Wang, H. Forecasting cryptocurrency price using convolutional neural networks with weighted and attentive memory channels. Expert Syst. Appl. 2021, 183, 115378. [Google Scholar] [CrossRef]
  111. Alonso-Monsalve, S.; Suárez-Cetrulo, A.L.; Cervantes, A.; Quintana, D. Convolution on Neural Networks for High-Frequency Trend Prediction of Cryptocurrency Exchange Rates Using Technical Indicators. Expert Syst. Appl. 2020, 149, 113250. [Google Scholar] [CrossRef]
  112. Cavalli, S.; Amoretti, M. CNN-Based Multivariate Data Analysis for Bitcoin Trend Prediction. Appl. Soft Comput. 2021, 101, 107065. [Google Scholar] [CrossRef]
  113. Jaquart, P.; Dann, D.; Weinhardt, C. Short-Term Bitcoin Market Prediction via Machine Learning. J. Financ. Data Sci. 2021, 7, 45–66. [Google Scholar] [CrossRef]
  114. Alkhodhairi, R.K.; Aljalhami, S.R.; Rusayni, N.K.; Alshobaili, J.F.; Al-Shargabi, A.A.; Alabdulatif, A. Bitcoin Candlestick Prediction with Deep Neural Networks Based on Real Time Data. Comput. Mater. Contin. 2021, 68, 3215–3233. [Google Scholar] [CrossRef]
  115. Chen, W.; Xu, H.; Jia, L.; Gao, Y. Machine Learning Model for Bitcoin Exchange Rate Prediction Using Economic and Technology Determinants. Int. J. Forecast. 2021, 37, 28–43. [Google Scholar] [CrossRef]
  116. Mudassir, M.; Bennbaia, S.; Unal, D.; Hammoudeh, M. Time-Series Forecasting of Bitcoin Prices Using High-Dimensional Features: A Machine Learning Approach. Neural Comput. Appl. 2020, 32, 17763–17778. [Google Scholar] [CrossRef]
  117. Mallqui, D.C.; Fernandes, R.A. Predicting the Direction, Maximum, Minimum and Closing Prices of Daily Bitcoin Exchange Rate Using Machine Learning Techniques. Appl. Soft Comput. 2019, 75, 596–606. [Google Scholar] [CrossRef]
  118. Jang, H.; Lee, J. An Empirical Study on Modeling and Prediction of Bitcoin Prices with Bayesian Neural Networks Based on Blockchain Information. IEEE Access 2017, 6, 5427–5437. [Google Scholar] [CrossRef]
  119. Al-Nefaie, A.H.; Aldhyani, T.H. Bitcoin Price Forecasting and Trading: Data Analytics Approaches. Electronics 2022, 11, 4088. [Google Scholar] [CrossRef]
  120. Cocco, L.; Tonelli, R.; Marchesi, M. Predictions of Bitcoin Prices through Machine Learning Based Frameworks. PeerJ Comput. Sci. 2021, 7, e413. [Google Scholar] [CrossRef]
  121. Tapia, S.; Kristjanpoller, W. Framework based on multiplicative error and residual analysis to forecast bitcoin intraday-volatility. Phys. A Stat. Mech. Its Appl. 2022, 589, 126613. [Google Scholar] [CrossRef]
  122. Dutta, A.; Kumar, S.; Basu, M. A Gated Recurrent Unit Approach to Bitcoin Price Prediction. J. Risk Financ. Manag. 2020, 13, 23. [Google Scholar] [CrossRef]
  123. Lahmiri, S.; Bekiros, S. Cryptocurrency Forecasting with Deep Learning Chaotic Neural Networks. Chaos Solitons Fractals 2019, 118, 35–40. [Google Scholar] [CrossRef]
  124. Serrano, W. The random neural network in price predictions. Neural Comput. Appl. 2022, 34, 855–873. [Google Scholar] [CrossRef]
  125. Uras, N.; Marchesi, L.; Marchesi, M.; Tonelli, R. Forecasting Bitcoin closing price series using linear regression and neural networks models. PeerJ Comput. Sci. 2020, 6, e279. [Google Scholar] [CrossRef] [PubMed]
  126. Sebastião, H.; Godinho, P. Forecasting and trading cryptocurrencies with machine learning under changing market conditions. Financ. Innov. 2021, 7, 1–30. [Google Scholar] [CrossRef] [PubMed]
  127. Poongodi, M.; Sharma, A.; Vijayakumar, V.; Bhardwaj, V.; Sharma, A.P.; Iqbal, R.; Kumar, R. Prediction of the Price of Ethereum Blockchain Cryptocurrency in an Industrial Finance System. Comput. Electr. Eng. 2019, 81, 106527. [Google Scholar] [CrossRef]
  128. Zoumpekas, T.; Houstis, E.; Vavalis, M. Eth analysis and predictions utilizing deep learning. Expert Syst. Appl. 2020, 162, 113866. [Google Scholar] [CrossRef]
  129. Patel, M.M.; Tanwar, S.; Gupta, R.; Kumar, N. A Deep Learning-Based Cryptocurrency Price Prediction Scheme for Financial Institutions. J. Inf. Secur. Appl. 2020, 55, 102583. [Google Scholar] [CrossRef]
  130. Peng, Y.; Albuquerque, P.H.M.; de Sá, J.M.C.; Padula, A.J.A.; Montenegro, M.R. The Best of Two Worlds: Forecasting High Frequency Volatility for Cryptocurrencies and Traditional Currencies with Support Vector Regression. Expert Syst. Appl. 2018, 97, 177–192. [Google Scholar] [CrossRef]
  131. Islam, M.S.; Hossain, E. Foreign Exchange Currency Rate Prediction Using a GRU-LSTM Hybrid Network. Soft Comput. Lett. 2020, 3, 100009. [Google Scholar] [CrossRef]
  132. Yıldırım, D.C.; Toroslu, I.H.; Fiore, U. Forecasting directional movement of Forex data using LSTM with technical and macroeconomic indicators. Financ. Innov. 2021, 7, 1. [Google Scholar] [CrossRef]
  133. Ahmed, S.; Hassan, S.-U.; Aljohani, N.R.; Nawaz, R. FLF-LSTM: A Novel Prediction System Using Forex Loss Function. Appl. Soft Comput. 2020, 97, 106780. [Google Scholar] [CrossRef]
  134. Escudero, P.; Alcocer, W.; Paredes, J. Recurrent Neural Networks and ARIMA Models for Euro/Dollar Exchange Rate Forecasting. Appl. Sci. 2021, 11, 5658. [Google Scholar] [CrossRef]
  135. Jubert de Almeida, B.; Ferreira Neves, R.; Horta, N. Combining Support Vector Machine with Genetic Algorithms to Optimize Investments in Forex Markets with High Leverage. Appl. Soft Comput. 2018, 64, 596–613. [Google Scholar] [CrossRef]
  136. Sadeghi, A.; Daneshvar, A.; Zaj, M.M. Combined ensemble multi-class SVM and fuzzy NSGA-II for trend forecasting and trading in Forex markets. Expert Syst. Appl. 2021, 185, 115566. [Google Scholar] [CrossRef]
  137. Ni, L.; Li, Y.; Wang, X.; Zhang, J.; Yu, J.; Qi, C. Forecasting of Forex Time Series Data Based on Deep Learning. Procedia Comput. Sci. 2019, 147, 647–652. [Google Scholar] [CrossRef]
  138. Lin, H.; Sun, Q.; Chen, S.Q. Reducing Exchange Rate Risks in International Trade: A Hybrid Forecasting Approach of CEEMDAN and Multilayer LSTM. Sustainability 2020, 12, 2451. [Google Scholar] [CrossRef]
  139. Dautel, A.J.; Härdle, W.K.; Lessmann, S.; Seow, H.-V. Forex Exchange Rate Forecasting Using Deep Recurrent Neural Networks. Digit. Financ. 2020, 2, 69–96. [Google Scholar] [CrossRef]
  140. Abedin, M.Z.; Moon, M.H.; Hassan, M.K.; Hajek, P. Deep Learning-Based Exchange Rate Prediction during the COVID-19 Pandemic. Ann. Oper. Res. 2021, 345, 1335–1386. [Google Scholar] [CrossRef] [PubMed]
  141. Qi, L.; Khushi, M.; Poon, J. Event-Driven LSTM for Forex Price Prediction. In Proceedings of the 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Gold Coast, Australia, 16–18 December 2020. [Google Scholar] [CrossRef]
  142. Rundo, F. Deep LSTM with Reinforcement Learning Layer for Financial Trend Prediction in FX High Frequency Trading Systems. Appl. Sci. 2019, 9, 4460. [Google Scholar] [CrossRef]
  143. Baffour, A.A.; Jingchun, F.; Taylor, E.K. A Hybrid Artificial Neural Network-GJR Modelling Approach to Forecasting Currency Exchange Rate Volatility. Neurocomputing 2019, 365, 285–301. [Google Scholar] [CrossRef]
  144. Xueling, L.; Xiong, X.; Yucong, S. Exchange rate market trend prediction based on sentiment analysis. Comput. Electr. Eng. 2023, 111, 108901. [Google Scholar] [CrossRef]
  145. Sako, K.; Mpinda, B.N.; Rodrigues, P.C. Neural networks for financial time series forecasting. Entropy 2022, 24, 657. [Google Scholar] [CrossRef]
  146. Cao, W.; Zhu, W.; Wang, W.; Demazeau, Y.; Zhang, C. A Deep Coupled LSTM Approach for USD/CNY Exchange Rate Forecasting. IEEE Intell. Syst. 2020, 1, 1. [Google Scholar] [CrossRef]
  147. Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice; OTexts: Melbourne, Australia, 2018. [Google Scholar]
  148. Granger, C.W.J.; Newbold, P. Forecasting Economic Time Series, 2nd ed.; Academic Press: Orlando, FL, USA, 1986. [Google Scholar]
  149. Chai, T.; Draxler, R.R. Root Mean Square Error (RMSE) or Mean Absolute Error (MAE)? Arguments against Relying on RMSE in Model Evaluation. Geosci. Model Dev. 2014, 7, 1247–1250. [Google Scholar] [CrossRef]
  150. Makridakis, S. Accuracy Measures: Theoretical and Practical Concerns. Int. J. Forecast. 1993, 9, 527–529. [Google Scholar] [CrossRef]
  151. Kim, S.; Kim, H. A New Metric of Absolute Percentage Error for Intermittent Demand Forecasts. Int. J. Forecast. 2016, 32, 669–679. [Google Scholar] [CrossRef]
  152. Hyndman, R.J.; Koehler, A.B. Another Look at Measures of Forecast Accuracy. Int. J. Forecast. 2006, 22, 679–688. [Google Scholar] [CrossRef]
  153. Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. Int. J. Mach. Learn. Technol. 2011, 2, 37–63. [Google Scholar] [CrossRef]
  154. Fawcett, T. An Introduction to ROC Analysis. Pattern Recognit. Lett. 2006, 27, 861–874. [Google Scholar] [CrossRef]
  155. Mounjid, O.; Lehalle, C.A. Improving reinforcement learning algorithms: Towards optimal learning rate policies. Math. Financ. 2024, 34, 588–621. [Google Scholar] [CrossRef]
  156. Song, H.; Choi, H. Forecasting stock market indices using the recurrent neural network based hybrid models: CNN-LSTM, GRU-CNN, and ensemble models. Appl. Sci. 2023, 13, 4644. [Google Scholar] [CrossRef]
  157. Liu, C.; Tran, M.N.; Wang, C.; Gerlach, R.; Kohn, R. Data Scaling Effect of Deep Learning in Financial Time Series Forecasting. arXiv 2023, arXiv:2309.02072. [Google Scholar] [CrossRef]
  158. Yan, H.; Ouyang, H. Financial time series prediction based on deep learning. Wirel. Pers. Commun. 2018, 102, 683–700. [Google Scholar] [CrossRef]
  159. Livieris, I.E.; Stavroyiannis, S.; Pintelas, E.; Kotsilieris, T.; Pintelas, P. A dropout weight-constrained recurrent neural network model for forecasting the price of major cryptocurrencies and CCi30 index. Evolving Syst. 2022, 1–16. [Google Scholar] [CrossRef]
  160. Vancsura, L.; Tatay, T.; Bareith, T. Investigating the Role of Activation Functions in Predicting the Price of Cryptocurrencies During Critical Economic Periods. Virtual Econ. 2024, 7, 64–91. [Google Scholar] [CrossRef] [PubMed]
  161. Makinde, A. Optimizing Time Series Forecasting: A Comparative Study of Adam and Nesterov Accelerated Gradient on LSTM and GRU networks Using Stock Market data. arXiv 2024, arXiv:2410.01843. [Google Scholar] [CrossRef]
  162. Hoque, K.E.; Aljamaan, H. Impact of hyperparameter tuning on machine learning models in stock price forecasting. IEEE Access 2021, 9, 163815–163830. [Google Scholar] [CrossRef]
  163. Flavia, A.; Mio, C. Performance Comparison of Standard LSTM and LSTM with Random Search Optimization for Spark New Zealand Limited Stock Price Prediction. Int. J. Artif. Intell. Inform. 2025, 3, 60–66. [Google Scholar] [CrossRef]
  164. Liu, W.; Suzuki, Y.; Du, S. Forecasting the Stock Price of Listed Innovative SMEs Using Machine Learning Methods Based on Bayesian optimization: Evidence from China. Comput. Econ. 2024, 63, 2035–2068. [Google Scholar] [CrossRef]
  165. Ma, D.; Shu, M.; Zhang, H. Feature selection optimization for employee retention prediction: A machine learning approach for human resource management. Appl. Comput. Eng. 2025, 141, 120–130. [Google Scholar] [CrossRef]
  166. Tsai, C.F.; Sue, K.L.; Hu, Y.H.; Chiu, A. Combining feature selection, instance selection, and ensemble classification techniques for improved financial distress prediction. J. Bus. Res. 2021, 130, 200–209. [Google Scholar] [CrossRef]
  167. Goodell, J.W.; Jabeur, S.B.; Saâdaoui, F.; Nasir, M.A. Explainable artificial intelligence modeling to forecast bitcoin prices. Int. Rev. Financ. Anal. 2023, 88, 102702. [Google Scholar] [CrossRef]
  168. Rane, N.; Choudhary, S.; Rane, J. Explainable Artificial Intelligence (XAI) approaches for transparency and accountability in financial decision-making. SSRN Electron. J. 2023, 4640316. [Google Scholar] [CrossRef]
  169. Chen, S.; Ge, L. Exploring the Attention Mechanism in LSTM-Based Hong Kong Stock Price Movement Prediction. Quant. Financ. 2019, 19, 1507–1515. [Google Scholar] [CrossRef]
  170. Souto, H.G.; Moradi, A. Can transformers transform financial forecasting? China Financ. Rev. Int. 2024, ahead-of-print. [Google Scholar] [CrossRef]
  171. Lim, B.; Arık, S.Ö.; Loeff, N.; Pfister, T. Temporal Fusion Transformers for Interpretable Multi-Horizon Time Series Forecasting. Int. J. Forecast. 2021, 37, 1748–1764. [Google Scholar] [CrossRef]
  172. Kabir, M.R.; Bhadra, D.; Ridoy, M.; Milanova, M. LSTM–Transformer-Based Robust Hybrid Deep Learning Model for Financial Time Series Forecasting. Science 2025, 7, 7. [Google Scholar] [CrossRef]
  173. Cheng, D.; Yang, F.; Xiang, S.; Liu, J. Financial Time Series Forecasting with Multi-Modality Graph Neural Network. Pattern Recognit. 2022, 121, 108218. [Google Scholar] [CrossRef]
  174. Feng, R.; Jiang, S.; Liang, X.; Xia, M. STGAT: Spatial–Temporal Graph Attention Neural Network for Stock Prediction. Appl. Sci. 2025, 15, 4315. [Google Scholar] [CrossRef]
  175. Takahashi, S.; Chen, Y.; Tanaka-Ishii, K. Modeling financial time-series with generative adversarial networks. Phys. A 2019, 527, 121261. [Google Scholar] [CrossRef]
  176. Vuletić, M.; Prenzel, F.; Cucuringu, M. Fin-gan: Forecasting and classifying financial time series via generative adversarial networks. Quant. Financ. 2024, 24, 175–199. [Google Scholar] [CrossRef]
  177. Ingle, V.; Deshmukh, S. Ensemble Deep Learning Framework for Stock Market Data Prediction (EDLF-DP). Glob. Transit. Proc. 2021, 2, 47–66. [Google Scholar] [CrossRef]
  178. Gul, A. A Novel Hybrid Ensemble Framework for Stock Price Prediction: Combining Bagging, Boosting, Dagging, and Stacking. Comput. Econ. 2025, 1–31. [Google Scholar] [CrossRef]
  179. Batool, K.; Baig, M.M.; Fatima, U. Accuracy and Efficiency in Financial Markets Forecasting Using Meta-Learning Under Resource Constraints. Mach. Learn. Appl. 2025, 21, 100681. [Google Scholar] [CrossRef]
  180. Liu, A.; Ma, J.; Zhang, G. Adapting to the Unknown: Robust Meta-Learning for Zero-Shot Financial Time Series Forecasting. arXiv 2025, arXiv:2504.09664. [Google Scholar] [CrossRef]
  181. Henrique, B.M.; Sobreiro, V.A.; Kimura, H. Literature Review: Machine Learning Techniques Applied to Financial Market Prediction. Expert Syst. Appl. 2019, 124, 226–251. [Google Scholar] [CrossRef]
  182. Ghoddusi, H.; Creamer, G.G.; Rafizadeh, N. Machine Learning in Energy Economics and Finance: A Review. Energy Econ. 2019, 81, 709–727. [Google Scholar] [CrossRef]
  183. Sezer, O.B.; Gudelek, M.U.; Ozbayoglu, A.M. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Appl. Soft Comput. 2020, 90, 106181. [Google Scholar] [CrossRef]
  184. Ozbayoglu, A.M.; Gudelek, M.U.; Sezer, O.B. Deep Learning for Financial Applications: A Survey. Appl. Soft Comput. 2020, 93, 106384. [Google Scholar] [CrossRef]
  185. Rouf, N.; Malik, M.B.; Arif, T.; Sharma, S.; Singh, S.; Aich, S.; Kim, H.C. Stock Market Prediction Using Machine Learning Techniques: A Decade Survey on Methodologies, Recent Developments, and Future Directions. Electronics 2021, 10, 2717. [Google Scholar] [CrossRef]
  186. Kumar, D.; Sarangi, P.K.; Verma, R. A Systematic Review of Stock Market Prediction Using Machine Learning and Statistical Techniques. Mater. Today Proc. 2022, 49, 3187–3191. [Google Scholar] [CrossRef]
  187. Ahmed, S.; Alshater, M.M.; El Ammari, A.; Hammami, H. Artificial Intelligence and Machine Learning in Finance: A Bibliometric Review. Res. Int. Bus. Financ. 2022, 61, 101646. [Google Scholar] [CrossRef]
Figure 1. PRISMA flow diagram of the search, appraisal, and synthesis process. Source: Own editing based on Moher et al. (2009) [47].
Figure 2. Number of publications per year (n = 100). Source: Own editing.
Figure 3. Distribution of publications by product category (n = 100). Source: Own editing.
Figure 4. Distribution of publications by time and product categories. Source: Own editing.
Figure 5. Top 10 applied methodologies (n = 100). Counts are based on 100 publications; a single methodology may appear in multiple articles. Source: Own editing.
Figure 6. Top 5 performance evaluation metrics (n = 100). Counts are based on 100 publications; a single metric may appear in multiple articles. Source: Own editing.
Figure 7. Distribution of publications by dataset length (n = 100). Source: Own editing.
Figure 8. Distribution of publications by statistical methods (n = 100). Source: Own editing.
Figure 9. Number of publications by journal (top 15). Source: Own editing.
Table 1. Quality criteria for inclusion and exclusion.

Quality Criteria | Reason for Inclusion and Exclusion

Inclusion criteria
Year of publication | We focused on publications from 2014–2023.
Articles in English | All articles processed in the systematic literature review were published in English.
Thematic scope | We focused on the forecasting performance of machine learning models in relation to financial and stock market time series.
Scholarly published articles | Verified, relevant, and quality papers on machine learning models.

Exclusion criteria
Articles addressing other sectors | Time-series analyses outside the four main product groups (e.g., electricity demand, precipitation, or sea level forecasts).
Conference papers, books, working papers, technical reports, theses | Excluded so that all included articles are professionally peer-reviewed, ensuring quality and consistency.
Source: Own editing.
Table 2. Most influential studies of stock market literature.

Reference | Citations | Assets | Data Sources | Main Methods | Prediction Type | Dataset Period | Performance Measures
Fischer and Krauss (2018) [54] | 2733 | S&P500 | Thomson Reuters | RF, LR, LSTM | Direction | 1992–2015 | Accuracy
Siami-Namini et al. (2018) [55] | 1680 | Nikkei225, S&P500, NASDAQ Composite, Hang Seng, Dow Jones | Yahoo Finance | ARIMA, LSTM | Price | 1985–2018 | RMSE
Nelson et al. (2017) [65] | 1124 | Ibovespa index | BM&F Bovespa stock exchange | MLP, RF, LSTM | Direction | 2008–2015 | F1-score, Accuracy, Recall, Precision
Chong et al. (2017) [71] | 1054 | KOSPI stocks | Korean Stock Exchange | AR, ANN, DNN, AR-DNN, DNN-AR | Return | 2010–2014 | MSE, RMSE, MAE
Ballings et al. (2015) [69] | 816 | European stocks | Amadeus Database | AdaBoost, RF, KF, SVM, KNN, Logistic regression, ANN | Return | 2009 | AUC
Cao et al. (2019) [61] | 784 | S&P500, Hang Seng, DAX, SSE | Yahoo Finance | LSTM, SVM, MLP, CEEMDAN-LSTM, CEEMDAN-SVM, CEEMDAN-MLP | Price | 2007–2017 | RMSE, MAPE, MAE
Hiransha et al. (2018) [50] | 748 | NSE and NYSE | NSE and NYSE | MLP, RNN, CNN, LSTM, ARIMA | Price | 1996–2015 | MAPE
Zhang et al. (2017) [63] | 532 | 50 different global stocks | Yahoo Finance | AR, LSTM, SFM (State Frequency Memory) | Price | 2007–2016 | RMSE
Nabipour et al. (2020) [49] | 487 | Stocks from Tehran Stock Exchange | TSETMC | DT, RF, AdaBoost, XGBoost, SVC, Naïve Bayes, KNN, Logistic regression, ANN, RNN, LSTM | Direction | 2009–2019 | F1-score, Accuracy, ROC-AUC
Basak et al. (2019) [70] | 484 | 10 different global stocks | Yahoo Finance | XGBoost, Logistic regression, SVM, ANN, RF | Direction | From publicly available to 2017 | F1-score, Accuracy, Recall, Precision, AUC
Source: Own editing.
Table 3. Most influential studies of commodity market literature.

Reference | Citations | Assets | Data Sources | Main Methods | Prediction Type | Dataset Period | Performance Measures
Livieris et al. (2020) [82] | 893 | Gold | Yahoo Finance | SVR, FFNN, LSTM, CNN-LSTM | Direction, Price | 2014–2018 | MAE, RMSE, Accuracy, AUC
Ribeiro et al. (2020) [87] | 562 | Soybean, Wheat | CME Group | Gradient Boosting, XGBoost, RF, SVR, KNN, MLP, KNN-XGB-SVR | Price | 2001–2018 | MSE, RMSE, MAE, MAPE
Liang et al. (2022) [81] | 164 | Gold | COMEX | LSTM, CNN, CBAM, LSTM-CNN-CBAM, ICEEMDAN-LSTM-CNN-CBAM | Price | 2010–2020 | RMSE, MAE, MAPE, SMAPE
Weng et al. (2019) [93] | 151 | Cucumber | Beijing Xinfadi Market | ARIMA, BPNN, RNN | Price | 2010–2018 | MAPE
Wang et al. (2021) [104] | 116 | Natural gas | US Energy Information Administration | BP network, SVR, RNN, LSTM, GRU, PSO-GRU, and 17 hybrid models | Price | 2000–2019 | MSE, RMSE, MAE, MAPE
Fang et al. (2020) [86] | 104 | Vegetable meal, soybean meal, stalked rice, strong wheat, Zheng cotton, and early Indica rice | Wind | ARIMA, ANN, SVR | Price | 2014–2017 | RRMSE
Urolagin et al. (2021) [97] | 99 | Crude oil, Gold | Investing | LSTM, RMT-LSTM, RZT-LSTM | Price | 2000–2019 | MAE, MSE, RMSE
Čeperić et al. (2017) [102] | 97 | Natural gas, heating oil, crude oil, coal | Bloomberg | ANN, Naive Bayes, AR, ARIMA, SVR, SSA-SVR | Price | 2010–2014 | RMSE, MAPE
RL et al. (2021) [89] | 84 | Cotton seed, castor seed, rape mustard seed, soybean seed, and guar seed | National Commodity and Derivatives Exchange | ARIMA, LSTM, TDNN | Price | 2009–2019 | RMSE
Guliyev and Mustafayev (2022) [100] | 82 | Crude oil, Gold | FRED, Yahoo Finance | Logistic regression, DT, RF, AdaBoost, XGBoost | Direction | 1991–2021 | Accuracy, AUC, ROC
Source: Own editing.
Table 4. Most influential studies of cryptocurrency market literature.

Reference | Citations | Assets | Data Sources | Main Methods | Prediction Type | Dataset Period | Performance Measures
Jang and Lee (2017) [118] | 607 | Bitcoin | Yahoo Finance | LR, BNN, SVR | Price | 2011–2017 | RMSE, MAPE
Sun et al. (2020) [105] | 526 | 42 different cryptocurrencies | Investing | LightGBM, RF, SVC | Direction | 2018 | AUC, Accuracy
Chen et al. (2020) [18] | 498 | Bitcoin | CoinMarketCap | Logistic regression, RF, XGBoost, SVC, LSTM, LDA, QDA | Direction | 2017–2019 | F1-score, Accuracy, Recall, Precision
Lahmiri and Bekiros (2019) [123] | 407 | Bitcoin, Digital Cash, Ripple | CoinMarketCap | GRNN, LSTM | Price | 201–2018 | RMSE
Poongodi et al. (2020) [127] | 396 | Ethereum | Ethereumchain | LR, SVR | Direction | 2015–2018 | Accuracy
Patel et al. (2020) [129] | 339 | Litecoin, Monero | Investing | LSTM, LSTM-GRU | Price | 2015–2020 | MSE, RMSE, MAE, MAPE
Mallqui and Fernandes (2019) [117] | 295 | Bitcoin | Bitcoinchart, Investing | ANN, RNN, SVR, LSTM, ARIMA | Direction, Price | 2013–2017 | Accuracy, AUC, RMSE, MAE, MAPE
Akyildirim et al. (2021) [108] | 295 | Bitcoin Cash, Bitcoin, Dash, EOS, Ethereum Classic, Ethereum, Iota, Litecoin, OmiseGO, Monero, Ripple, Zcash | Bitfinex | Logistic regression, SVC, RF, ANN | Return | 2013–2018 | Accuracy
Peng et al. (2018) [130] | 288 | Bitcoin, Ethereum, Dash | Alt19 | GARCH, EGARCH, SVR-GARCH | Volatility | 2016–2017 | RMSE, MAE
Dutta et al. (2020) [122] | 269 | Bitcoin | Yahoo Finance, Bitcoin.com | RNN, LSTM, GRU | Price | 2010–2019 | RMSE
Source: Own editing.
Table 5. Most influential studies of foreign exchange market literature.

Reference | Citations | Assets | Data Sources | Main Methods | Prediction Type | Dataset Period | Performance Measures
Islam and Hossain (2020) [131] | 161 | EUR-USD, GBP-USD, USD-CAD, USD-CHF | Histdata | LSTM, GRU, GRU-LSTM | Price | 2017–2020 | MSE, RMSE, MAE
Yıldırım (2021) [132] | 145 | EUR-USD | ECB Statistical Data Warehouse, Yahoo Finance, Federal Reserve Economic Data, Bureau of Labor Statistics Data | LSTM, ME-LSTM, TI-LSTM | Direction | 2013–2018 | Accuracy
Ni et al. (2019) [137] | 144 | EUR-USD, AUD-USD, GBP-JPY, EUR-JPY, GBP-USD, USD-CHF, USD-JPY, USD-CAD | Foreign exchange tester website | CNN, LSTM, CNN-RNN | Price | 2008–2018 | RMSE
Abedin et al. (2021) [140] | 116 | AUD-USD, EUR-USD, NZD-USD, GBP-USD, BRL-USD, CNY-USD, HKD-USD, INR-USD, KRW-USD, MXN-USD, ZAR-USD, SGD-USD, DKK-USD, JPY-USD, MYR-USD, NOK-USD, SEK-USD, LKR-USD, CHF-USD, TWD-USD, THB-USD | Kaggle, Oanda | Lasso regression, Ridge regression, DT, SVR, RF, LSTM, Bi-LSTM, Bagging regression, Bi-LSTM-BR | Price | 2000–2020 | RMSE, MAE, MAPE
Dautel et al. (2020) [139] | 93 | EUR-USD, GBP-USD, JPY-USD, CHF-USD | Oanda | FFNN, RNN, LSTM, GRU | Direction | 1971–2017 | Accuracy, AUC
Jubert de Almeida et al. (2018) [135] | 90 | EUR-USD | Dukascopy | SVM, GA, GA-SVM | Direction | 2013–2016 | Precision, Recall, Accuracy
Rundo (2019) [142] | 88 | EUR-USD, GBP-USD, EUR-GBP | Yahoo Finance | LSTM, LSTM-RL (reinforcement learning correction block) | Direction | 2004–2018 | Accuracy
Sako et al. (2022) [145] | 78 | ZAR-USD, NGN-USD, GBP-USD, EUR-USD, RMB-USD, JPY-USD | Yahoo Finance | RNN, LSTM, GRU | Price | 2008–2021 | RMSE, MAE
Baffour et al. (2019) [143] | 73 | AUD-USD, CAD-USD, CHF-USD, EUR-USD, GBP-USD | Yahoo Finance | ANN-GJR, GARCH, APGARCH | Price | 2001–2013 | MSE, MAD, MAPE
Ahmed et al. (2020) [133] | 55 | EUR-USD | XM broker | ARIMA, RNN, LSTM, FLF-RNN, FLF-LSTM | Price | 2015–2018 | MAE
Source: Own editing.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Vancsura, L.; Tatay, T.; Bareith, T. Navigating AI-Driven Financial Forecasting: A Systematic Review of Current Status and Critical Research Gaps. Forecasting 2025, 7, 36. https://doi.org/10.3390/forecast7030036
