Algorithmic Stability in Turbulent Markets: Unveiling the Superiority of Shallow Learning over Deep Architectures in Cryptocurrency Forecasting

Kaygın, Ceyda Yerdelen; Gün, Musa; Akarsu, Osman Nuri; Bağcı, Haşim; Yanık, Ahmet

doi:10.3390/math14060989


by Ceyda Yerdelen Kaygın ¹, Musa Gün ²,*, Osman Nuri Akarsu ³, Haşim Bağcı ⁴ and Ahmet Yanık ²

¹ Faculty of Economics and Administrative Sciences, Kafkas University, Kars 36100, Türkiye

² Faculty of Economics and Administrative Sciences, Recep Tayyip Erdoğan University, Rize 53100, Türkiye

³ Independent Researcher, Erzurum 25070, Türkiye

⁴ Faculty of Health Sciences, Aksaray University, Aksaray 68100, Türkiye

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(6), 989; https://doi.org/10.3390/math14060989

Submission received: 10 February 2026 / Revised: 9 March 2026 / Accepted: 12 March 2026 / Published: 14 March 2026

(This article belongs to the Special Issue Recent Computational Techniques to Forecast Cryptocurrency Markets)


Abstract

Forecasting cryptocurrency prices is challenging due to extreme volatility, nonlinear dynamics, and frequent structural shifts in digital asset markets. While recent research increasingly applies deep learning architectures, the predictive advantage of highly complex models in noisy financial environments remains uncertain. This study evaluates the forecasting performance of shallow and deep learning approaches by comparing Support Vector Machines (SVM), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models, along with hybrid configurations (GRU + SVM, LSTM + SVM, and GRU + LSTM). Using daily data spanning from 1 October 2020 to 23 September 2025 for five major cryptocurrencies—Bitcoin, Ethereum, Binance Coin, Solana, and Ripple—the models are estimated within a consistent framework and assessed using out-of-sample performance metrics, including MAE, MAPE, MSE, and R². The results indicate that greater algorithmic complexity does not necessarily improve forecasting accuracy. In several cases, the parsimonious SVM model outperforms deep neural network architectures, particularly for highly volatile assets, while hybrid models fail to provide systematic improvements and sometimes amplify prediction errors. SHapley Additive exPlanations analysis further shows that immediate price-based variables dominate predictive power, whereas many lagged technical indicators contribute relatively limited explanatory value. Overall, the findings underscore the importance of algorithmic parsimony, suggesting that simpler machine learning models may deliver more robust forecasts in highly volatile cryptocurrency markets.

Keywords:

cryptocurrency price prediction; machine learning; deep learning; LSTM; GRU; SVM; SHAP analysis

MSC:

68T37; 62M10; 91G80; 68T07

1. Introduction

The digital finance revolution has fundamentally restructured economic interactions, introducing cryptocurrencies not merely as alternative payment systems but as a distinct asset class characterized by decentralized infrastructure and algorithmic governance [1,2]. Since the inception of Bitcoin in 2008 as a response to systemic financial fragility, the ecosystem has expanded exponentially. By late 2025, over 9000 cryptocurrencies were actively traded globally, underscoring a massive shift in capital allocation driven by technological innovation and investor demand [3]. However, unlike traditional equities or fiat currencies, which are anchored by macroeconomic fundamentals and cash flows, digital assets exhibit chaotic volatility clustering, non-stationary pricing dynamics, and heavy-tailed distributions.

These stochastic characteristics present a formidable challenge for predictive modeling. The absence of fundamental valuation anchors renders cryptocurrency prices highly susceptible to speculative sentiment, information dispersion, and market microstructure noise. Consequently, classical econometric approaches and linear time-series methods frequently fail to capture the abrupt regime shifts and nonlinear dependencies inherent in this domain [4,5]. In response to these limitations, the forecasting literature has aggressively pivoted toward Machine Learning (ML) and Deep Learning (DL) architectures [6,7]. Prevailing bibliometric data indicate a consensus that advanced algorithms—capable of modeling high-dimensional relationships—offer superior efficacy compared to traditional benchmarks [8].

Specifically, the field has seen a surge in the application of Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models [9]. These architectures are explicitly designed to handle temporal dependencies and have demonstrated localized success in cryptocurrency markets [10,11]. Concurrently, various studies have proposed hybrid ensembles and decomposition-based frameworks to further enhance predictive accuracy [12,13,14,15,16,17,18,19]. However, this “rush to complexity” has created an epistemological gap: does increasing algorithmic sophistication systematically yield robust predictive performance in high-entropy environments?

While some literature champions the use of complex hybrid models [14,20], other evidence suggests that simpler, asset-specific specifications may offer better generalization [21,22]. This uncertainty points to a potential “Complexity–Performance Paradox,” where over-parameterized deep learning models risk fitting stochastic noise rather than the underlying deterministic trend, particularly in assets prone to extreme volatility. Furthermore, the “black box” nature of deep networks exacerbates the trade-off between accuracy and interpretability [23,24]. For predictive models to be actionable for risk managers and policymakers, they must be transparent. Although Explainable AI (XAI) methods like SHapley Additive exPlanations (SHAP) have emerged to address this, their application in comparative crypto forecasting remains limited and fragmented [25,26].

This study addresses these unresolved issues by revisiting the Principle of Parsimony (Occam’s Razor) in cryptocurrency forecasting. We posit that simpler, robust algorithms may outperform complex deep architectures when market volatility disrupts neural network gradients. To test this, we employ a multi-asset framework encompassing Bitcoin (BTC), Ethereum (ETHUM), Binance Coin (BNB), Solana (SOLU), and Ripple (XRP), which represent heterogeneous liquidity and volatility regimes [27,28,29,30].

Methodologically, we contrast the performance of sequence-learning models (LSTM, GRU) with that of kernel-based Shallow Learning, specifically Support Vector Machines (SVM). While LSTM and GRU are adept at capturing long-term dependencies [31], SVM operates on the principle of Structural Risk Minimization (SRM), which, in theory, offers superior resistance to overfitting in high-dimensional spaces compared to Empirical Risk Minimization (ERM), often used in neural networks [32]. By integrating these models with SHAP analysis, this study challenges the “deeper is better” narrative and aims to provide a transparent, regime-sensitive forecasting framework.

Accordingly, this research is guided by the following questions:

RQ1. Do ML and DL models differ significantly in out-of-sample forecasting accuracy across cryptocurrencies with heterogeneous liquidity and volatility profiles?

RQ2. Does increased model complexity and hybridization systematically lead to superior predictive performance, or does it succumb to the “Complexity–Performance Paradox”?

RQ3. Which technical indicators and price-based features exert the strongest influence on model predictions as identified through SHAP-based interpretability analysis?

RQ4. Are the predictive drivers and model stability consistent across different cryptocurrencies and volatility regimes?

The remainder of this paper is organized as follows: Section 2 reviews the relevant literature on ML and DL in crypto forecasting. Section 3 details the data, feature engineering, and methodological framework. Section 4 presents empirical results and SHAP analysis. Section 5 discusses the implications of the findings and concludes the study.

2. Literature Review

The forecasting of cryptocurrency dynamics has become a pivotal domain in financial econometrics, driven by the asset class’s rapid expansion and integration into portfolio management strategies. Unlike traditional fiat currencies or equities, digital assets exhibit chaotic volatility clustering, structural breaks, and heavy-tailed distributions that defy the stationarity assumptions of classical linear models. Consequently, the academic discourse has progressively shifted from deterministic rules to stochastic and algorithmic learning frameworks [27,33].

2.1. Limitations of Technical Heuristics and the Shift to Machine Learning

Early empirical research relied heavily on technical analysis indicators—such as Moving Averages, Relative Strength Index (RSI), and MACD—as proxies for market momentum [4,5,34,35]. While studies suggest that strategies based on these indicators can capture short-term inefficiencies, their predictive power remains fragile. Empirical evidence indicates that the efficacy of these tools is highly sensitive to parameter selection and prone to failure during regime shifts [36,37]. Ultimately, these linear tools have proven insufficient for modeling the rapid information diffusion and nonlinear nature of crypto markets, necessitating more robust computational approaches.

2.2. Conventional Machine Learning and Structural Risk Minimization

The adoption of ML models marked a significant advance in handling nonlinearity, grounded in statistical learning theory [38]. Research utilizing Support Vector Machines, Random Forests, and k-nearest neighbors has consistently demonstrated performance improvements over linear econometric counterparts [6,39,40,41]. Notably, ML models have shown effectiveness in high-frequency domains when utilizing historical price action and technical indicators, with further enhancements observed when integrating macroeconomic and cross-asset variables [32].

Recent extensions have introduced trigonometric-based procedures and data-driven pricing mechanisms to better simulate logarithmic returns [42,43]. However, a critical trade-off persists: while ML-based portfolio optimization can minimize downside risk [44], the risk of overfitting increases with model complexity. As warned by [7], if model complexity is not carefully calibrated, in-sample gains can lead to significant out-of-sample degradation.

2.3. The Hybridization Trend and Ensemble Frameworks

To address asset heterogeneity, recent scholarship has emphasized “hybridization,” stacking multiple algorithms to leverage their complementary strengths [14]. While ensemble models have shown superiority over single-model specifications in high-volatility environments [45], the literature lacks consensus on the universality of these gains. Forecasting accuracy is highly variable depending on market maturity and liquidity depth [46]. Furthermore, a systematic review by [47] notes that the lack of standardized evaluation protocols restricts the comparability of hybrid models, raising questions about whether the increased computational cost yields statistically significant improvements.

2.4. Deep Learning: The Hegemony of Temporal Structures

DL has emerged as the dominant paradigm for modeling temporal dependencies [48]. Recurrent Neural Networks (RNNs), specifically Long Short-Term Memory (LSTM), are extensively used to capture long-term correlations in price sequences [10,12,49]. Comparative analyses often position LSTM and GRU as superior to deep feed-forward networks, particularly for short- to medium-term horizons [11].

The field has subsequently moved toward increasingly complex architectures, such as CNN-LSTM and CNN-BiLSTM, which combine convolutional feature extraction with recurrent temporal learning [50,51]. Other innovations include frequency decomposition [16] and attention-based mechanisms [20]. While proponents argue for the inherent superiority of DL over traditional ML [52,53], this “deeper is better” narrative often overlooks the “Complexity–Performance Paradox”. These models frequently suffer from a loss of interpretability and may amplify noise in high-entropy regimes, a limitation this study aims to investigate [53].

2.5. Feature Dimensionality and Alternative Data

Parallel to algorithmic advancements, the feature space for forecasting has expanded. Beyond price-volume data, researchers increasingly incorporate alternative data sources, including blockchain transaction metrics [13], Google Trends sentiment [54,55,56], and macroeconomic indicators [32,57] such as interest rates and gold prices [27,40]. While rich feature sets can enhance explanatory power, they also introduce dimensional challenges. Recent findings indicate that parsimonious indicator selection often outperforms models saturated with noisy variables, particularly when market microstructure noise is prevalent [21,58]. In a complementary vein, ref. [59] demonstrated that combining socio-economic variables, Bitcoin price dynamics, and sentiment signals derived from Twitter and news sources substantially improves the directional prediction of altcoin prices over a seven-year high-frequency dataset, underscoring the informational value of multi-source feature engineering in cryptocurrency forecasting.

2.6. Systemic Integration and Technological Constraints

The deployment of forecasting models in real-world environments introduces technological and infrastructural challenges. Studies have proposed blockchain-integrated cloud–edge architectures to handle latency and token-based frameworks to balance cost, time, and quality [60,61]. Others emphasize the importance of data preprocessing [62] and feature selection [63] in enhancing the reliability of neural network forecasting [64]. These contributions highlight that algorithmic accuracy must be paired with robust system architecture.

2.7. The Interpretability Deficit and Research Gap

Despite the proliferation of sophisticated models, a significant gap remains regarding the “Black Box” nature of DL algorithms. Most studies prioritize statistical accuracy metrics (e.g., MSE, MAPE) while overlooking economic relevance and interpretability [28,30]. Although Explainable AI (XAI) tools like SHAP have been introduced to decipher model decisions [26,65], their application in cryptocurrency forecasting remains fragmented and limited to single-asset studies. This lack of transparency constrains the practical utility of advanced models for risk managers and policymakers.

The main contributions of this study are as follows:

This study provides a comprehensive comparative evaluation of shallow and deep learning algorithms in cryptocurrency price forecasting. By examining five major cryptocurrencies—BTC, BNB, ETHUM, SOLU, and XRP—the analysis captures heterogeneous market characteristics and offers a broader assessment of model performance across assets with different liquidity and volatility profiles.
The study contributes to the literature by empirically investigating the “complexity–performance paradox” in financial forecasting models. Although much of the recent literature highlights increasingly complex deep learning architectures, the empirical results show that higher algorithmic complexity does not necessarily result in better predictive performance. In several cases, the parsimonious SVM model outperforms deep neural network architectures such as LSTM and GRU in out-of-sample forecasting [52,53].
The research highlights the importance of algorithmic parsimony and structural risk minimization. The findings show that the SVM model, grounded in the SRM principle, produces more stable and accurate predictions in highly volatile cryptocurrency markets compared to over-parameterized deep learning models that are more prone to noise amplification and overfitting.
The study extends existing research by systematically evaluating hybrid forecasting architectures, including GRU + SVM, LSTM + SVM, and GRU + LSTM. The results indicate that hybridization does not universally improve forecasting accuracy and may even deteriorate predictive performance in certain assets, thereby providing new empirical evidence on the limitations of combining multiple complex architectures.
The study enhances the interpretability of machine learning models by integrating SHAP. Rather than focusing solely on prediction accuracy, the analysis identifies the key drivers of model predictions and distinguishes between informative price signals and less informative technical indicators. This explainable AI framework improves transparency and provides deeper insights into the mechanisms underlying cryptocurrency price predictions [26,65].

3. Materials and Methodology

This study adopts a rigorous computational framework to evaluate the “Complexity–Performance” trade-off in cryptocurrency forecasting. We specify the closing price as the dependent variable. The algorithms are not selected arbitrarily; instead, the methodological design is grounded in specific theoretical hypotheses regarding the handling of stochastic noise and structural breaks in financial time series.

3.1. Data Acquisition and Feature Space Construction

The dataset used in this analysis was automatically downloaded from the cryptocurrency exchange platform www.binance.com using the Python (version 3.12.1) programming language. Data were retrieved directly through the platform’s public Application Programming Interface (API) and subsequently structured for empirical analysis. The sampling window spans a multi-year period (from 1 October 2020 to 23 September 2025) to ensure the inclusion of diverse market regimes. Given the relatively recent emergence of many cryptocurrencies, longer historical datasets remain limited. Nevertheless, using daily data yields more than 1800 observations, which is sufficient for training machine learning and deep learning models [11].

The empirical analysis focuses on five high-capitalization cryptocurrencies—Bitcoin (BTC), Ethereum (ETHUM), Binance Coin (BNB), Solana (SOLU), and Ripple (XRP)—selected to represent distinct liquidity tiers and investor behaviors. To model the complex mechanisms underlying price formation, we constructed a feature space that goes beyond simple price history. This structure includes:

Market Microstructure Variables: These include the stock opening price (SOP), closing price (SCP), lowest price (LSP), highest price (HSP), volume-weighted average price (SWA), trading volume (STV), and number of transactions (NST). Empirical literature suggests these variables encapsulate the primary information flow and liquidity constraints of the asset [39,50,51,66,67,68,69,70,71].
Technical Oscillators: We incorporated several technical indicators to proxy for market momentum and volatility clustering. These include Bollinger Bands: middle (BBm), upper (BBu), and lower (BBl) for volatility clustering [72]; the Commodity Channel Index (CCI) for overbought/oversold identification [73]; and Moving Averages (MA) for trend smoothing [74]. Furthermore, the Moving Average Convergence Divergence (MACD), MACD signal (MACDs), MACD histogram (MACDh) [35], momentum (MOM) [75], Relative Strength Index (RSI) [34], and Stochastic Oscillator—%K (STOk) and %D (STOd) [76]—were integrated to capture short-term price velocity. Prior studies emphasize that these indicators are instrumental in capturing market momentum, volatility dynamics, and trend reversals, thereby providing substantial improvements in predictive performance [5,66,77,78,79].
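As an illustration of how several of the oscillators listed above can be derived from a closing-price series, the following minimal sketch computes a moving average, Bollinger Bands, momentum, and a simple-average RSI. The window lengths (20, 10, 14), the band width k = 2, the use of the population standard deviation, and the function names are conventional or hypothetical choices assumed here; the source does not report the settings actually used.

```python
from statistics import mean, pstdev


def sma(values, window):
    """Simple moving average; None until `window` observations are available."""
    return [
        mean(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


def bollinger(values, window=20, k=2.0):
    """Return aligned (lower, middle, upper) Bollinger Bands."""
    lower, middle, upper = [], [], []
    for i in range(len(values)):
        if i < window - 1:
            lower.append(None)
            middle.append(None)
            upper.append(None)
            continue
        chunk = values[i - window + 1 : i + 1]
        m, s = mean(chunk), pstdev(chunk)  # population std (assumed convention)
        lower.append(m - k * s)
        middle.append(m)
        upper.append(m + k * s)
    return lower, middle, upper


def momentum(values, lag=10):
    """Price change over `lag` periods; None where the lag is unavailable."""
    return [values[i] - values[i - lag] if i >= lag else None
            for i in range(len(values))]


def rsi(values, window=14):
    """RSI from simple (not Wilder-smoothed) sums of gains and losses."""
    out = [None] * len(values)
    for i in range(window, len(values)):
        diffs = [values[j] - values[j - 1] for j in range(i - window + 1, i + 1)]
        gains = sum(d for d in diffs if d > 0)
        losses = sum(-d for d in diffs if d < 0)
        out[i] = 100.0 if losses == 0 else 100.0 - 100.0 / (1.0 + gains / losses)
    return out
```

Each function returns a list aligned with the input series, so the resulting columns can be appended to the price-based variables once the leading `None` rows are dropped.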

As shown in Table 1, the assets exhibit distinct statistical properties. The Jarque–Bera statistics (p < 0.01) confirm that all series deviate from normality, characterized by heavy tails (excess kurtosis) and skewness. This non-normality underscores the necessity of the nonlinear modeling approaches detailed below. In addition, the comprehensive feature set described above allows us to test whether complex models can effectively filter out the “noise” generated by lagging technical indicators or instead succumb to overfitting.
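For readers who want to reproduce a normality check of this kind, the Jarque–Bera statistic can be computed from the sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4). The sketch below is a plain-Python illustration with a hypothetical function name; it is not the authors' code, and it assumes the series is not constant.

```python
def jarque_bera(x):
    """Return (JB statistic, skewness, kurtosis) using population moments."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - m) ** 4 for v in x) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt
```

Under the null hypothesis of normality, JB is asymptotically chi-square distributed with two degrees of freedom, so values above roughly 9.21 correspond to rejection at the 1% level.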

Table 2 demonstrates that all cryptocurrency price series are non-stationary at conventional significance levels. This finding is consistent with the well-established properties of financial asset prices.

Using a Bai–Perron-type multiple structural break detection procedure [80,81], multiple structural break points were identified in the log-return series of each digital asset (see Table 3). The results indicate that the return dynamics of cryptocurrency markets exhibit various structural shifts throughout the sample period, reflecting changes in market conditions and volatility patterns. In addition to the structural break analysis, the nonlinear dependence structure of the cryptocurrency return series was examined using the Brock–Dechert–Scheinkman (BDS) test [82]. The results are reported in Table 4.

The BDS test results reveal statistically significant nonlinear dependence (p < 0.01) in all cryptocurrency return series, thereby supporting the application of nonlinear modeling approaches such as machine learning and deep learning algorithms.

Table 5 presents the results of the Zivot–Andrews structural break unit root test [83], which was applied to account for potential structural changes in the cryptocurrency price series. The results indicate that the majority of digital asset price series remain non-stationary even after accounting for a structural break. However, the XRP series appears to be stationary at conventional significance levels. This finding may be attributed to the distinctive market dynamics and structural regulatory developments observed in the XRP market during the sample period. These results reinforce the assumption that cryptocurrency markets exhibit strongly nonstationary dynamics and structural shifts, which, in turn, justify the use of nonlinear machine learning and deep learning models for forecasting applications.

3.2. Algorithmic Frameworks

The core contribution of this methodology is the comparative assessment of “Deep” versus “Shallow” learning paradigms.

3.2.1. Deep Learning: LSTM and GRU

To capture long-term temporal dependencies, we deployed Recurrent Neural Networks (RNNs). Specifically, LSTM networks were utilized to mitigate the vanishing gradient problem via their specialized gating mechanisms (input, forget, and output gates). The mathematical propagation of the LSTM cell is governed by the interaction between the cell state (c_{t}) and the hidden state (h_{t}), as detailed in Equations (1)–(6) [84,85].

i_{t} = σ (ε_{i} L_{t} + ε_{i} h_{t - 1} + κ_{i})

(1)

f_{t} = σ (ε_{f} L_{t} + ε_{f} h_{t - 1} + κ_{f})

(2)

o_{t} = σ (ε_{o} L_{t} + ε_{o} h_{t - 1} + κ_{o})

(3)

{\tilde{c}}_{t} = \tanh (ε_{c} h_{t - 1} + ε_{c} L_{t} + κ_{c})

(4)

c_{t} = f_{t} \times c_{t - 1} + i_{t} \times {\tilde{c}}_{t}

(5)

h_{t} = o_{t} \times \tanh (c_{t})

(6)

In these formulations, σ denotes the sigmoid activation function, while L represents the input data. The terms i_{t}, f_{t}, and o_{t} correspond to the outputs of the input, forget, and output gates, respectively. The variable c_{t} indicates the cell state at time t, whereas {\tilde{c}}_{t} denotes the candidate cell state. The parameters ε_{f}, ε_{c}, ε_{o}, and ε_{i} represent the weight coefficients associated with the forget gate, candidate cell state, output gate, and input gate, respectively. κ_{f}, κ_{i}, κ_{o}, and κ_{c} capture bias terms. The hidden state at time t is denoted by h_{t}, with h_{t - 1} representing the hidden state at time t − 1. The candidate cell state reflects the new information that may be incorporated into the LSTM.
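The recursion in Equations (1)–(6) can be traced with a single-unit (scalar) LSTM step. This is a minimal sketch for intuition only: the equations reuse one weight symbol per gate, whereas the code assumes separate input weights (`w_*`) and recurrent weights (`u_*`), and the parameter dictionary and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(L_t, h_prev, c_prev, p):
    """One scalar LSTM step; `p` maps names to weights (w_*, u_*) and biases (k_*)."""
    i_t = sigmoid(p["w_i"] * L_t + p["u_i"] * h_prev + p["k_i"])  # input gate, Eq. (1)
    f_t = sigmoid(p["w_f"] * L_t + p["u_f"] * h_prev + p["k_f"])  # forget gate, Eq. (2)
    o_t = sigmoid(p["w_o"] * L_t + p["u_o"] * h_prev + p["k_o"])  # output gate, Eq. (3)
    c_tilde = math.tanh(p["w_c"] * L_t + p["u_c"] * h_prev + p["k_c"])  # candidate, Eq. (4)
    c_t = f_t * c_prev + i_t * c_tilde  # cell state, Eq. (5)
    h_t = o_t * math.tanh(c_t)          # hidden state, Eq. (6)
    return h_t, c_t
```

With all weights and biases set to zero, every gate equals 0.5 and the candidate is zero, so the step simply halves the previous cell state; this makes the role of the forget gate easy to verify by hand.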

Complementarily, we employed GRU, a more parsimonious variant of LSTM that merges the forget and input gates into a single “update gate.” This architectural simplification reduces the parameter space, theoretically enhancing computational efficiency without significantly compromising the ability to model temporal sequences [86,87]. The GRU dynamics are formalized in Equations (7)–(10).

The update gate controls how much information from the previous hidden state is carried forward to the current time step. Specifically, it takes the prior hidden state h_{t - 1} and the current input x_{t} as inputs and produces an activation G_{u} (Equation (7)) bounded between zero and one. This mechanism enables the GRU to balance historical memory and new information, thereby adapting flexibly to evolving temporal dynamics.

G_{u} = σ (W_{x u} x_{t} + W_{h u} h_{t - 1} + b_{u})

(7)

The reset gate controls how much past information is discarded. It takes the previous hidden state, h_{t - 1}, and the current input, x_{t}, as inputs and produces an activation, denoted by G_{r} (Equation (8)), bounded between zero and one. This mechanism enables the model to selectively attenuate historical information during hidden-state updates.

G_{r} = σ (W_{x r} x_{t} + W_{h r} h_{t - 1} + b_{r})

(8)

The candidate activation process computes a new candidate activation value by combining the previous hidden state with the current input. Specifically, it takes the prior hidden state, the current input, and the reset gate as inputs and produces the candidate activation output, denoted by {\tilde{h}}_{t} (Equation (9)).

{\tilde{h}}_{t} = τ (W_{x h} x_{t} + W_{h h} (G_{r} h_{t - 1}) + b_{h})

(9)

The hidden state is obtained by integrating the previous hidden state with the candidate activation, yielding the current hidden state denoted by h_{t} (Equation (10)).

h_{t} = (1 - G_{u}) h_{t - 1} + G_{u} {\tilde{h}}_{t}

(10)
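A single-unit (scalar) version of Equations (7)–(10) makes the interaction of the two gates concrete. This sketch assumes that τ is the hyperbolic tangent and that the bias b_h is added outside the reset-gated term, as in the standard GRU formulation; the parameter dictionary and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(x_t, h_prev, p):
    """One scalar GRU step; `p` maps names to weights (W_*) and biases (b_*)."""
    g_u = sigmoid(p["W_xu"] * x_t + p["W_hu"] * h_prev + p["b_u"])  # update gate, Eq. (7)
    g_r = sigmoid(p["W_xr"] * x_t + p["W_hr"] * h_prev + p["b_r"])  # reset gate, Eq. (8)
    # Candidate activation, Eq. (9): the reset gate scales the previous state.
    h_tilde = math.tanh(p["W_xh"] * x_t + p["W_hh"] * (g_r * h_prev) + p["b_h"])
    # Hidden state, Eq. (10): convex combination of old state and candidate.
    return (1.0 - g_u) * h_prev + g_u * h_tilde
```

Because the new hidden state is a convex combination, it always lies between the previous hidden state and the candidate, and the GRU needs three weight groups instead of the four used by the LSTM.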

3.2.2. Shallow Learning and Structural Risk Minimization: Support Vector Machines

In contrast to RNNs, including LSTM and GRU architectures, which are grounded in the ERM principle, SVMs are theoretically founded on the Structural Risk Minimization (SRM) principle, originally introduced by [38]. While ERM-based models focus on minimizing in-sample training error, SVM seeks to minimize an upper bound on the generalization error, thereby explicitly balancing model complexity and empirical loss [88].

Methodologically, SVM maps input vectors into a high-dimensional feature space through kernel functions and identifies an optimal separating hyperplane by maximizing the margin between classes (or regression bounds). This margin-maximization framework enables SVM to construct decision boundaries that are less sensitive to noise, particularly in finite-sample and high-volatility environments. The corresponding optimization problem is formulated via the Lagrangian dual representation, allowing nonlinear relationships to be captured efficiently through kernelized inner products (Equations (11)–(13)) [88].

Let the dataset be defined as G = \{(x_{i}, y_{i}) \mid i = 1, 2, \dots, ϵ\}, where x_{i} denotes the feature vector in the input space, and y_{i} represents the scalar output variable. The parameter ϵ is the upper limit of the index i, that is, the number of observations in the sample. Within this framework, SVM seeks to identify a nonlinear mapping ϕ that projects the original input vectors x_{i} into a higher-dimensional feature space. This transformation enables the construction of a linear decision function in the transformed space, even when the original problem exhibits strong nonlinearities.

The core optimization problem of the SVM model is formulated as follows (Equation (11)):

\begin{matrix} \min \frac{1}{2} ∥ w ∥^{2} + C \sum_{i = 1}^{ϵ} (ξ_{i} + ξ_{i}^{*}) \\ y_{i} - w x_{i} - b \leq 1 + ξ_{i} \\ w x_{i} + b - y_{i} \leq 1 + ξ_{i}^{*}, \quad i = 1, 2, \dots, ϵ \end{matrix}

(11)

When the Lagrangian function associated with the optimization problem is formulated, the following expression is obtained (Equation (12)).

\begin{matrix} \max (- \frac{1}{2} \sum_{i = 1}^{ϵ} \sum_{j = 1}^{ϵ} (α_{i} - α_{i}^{*}) (α_{j} - α_{j}^{*}) K (x_{i}, x_{j})) \\ + \sum_{i = 1}^{ϵ} y_{i} (α_{i} - α_{i}^{*}) - ε \sum_{i = 1}^{ϵ} (α_{i} + α_{i}^{*}) \end{matrix}

(12)

For the classification task, the decision function is given by Equation (13):

f (x) = \operatorname{sign} (\sum_{i = 1}^{ϵ} (α_{i} - α_{i}^{*}) K (x_{i}, x) + b)

(13)

This theoretical distinction is central to our Parsimony Hypothesis. Whereas deep learning models such as LSTM and GRU often achieve lower training error at the cost of increased overfitting—especially in noisy and stochastic settings like cryptocurrency markets—SVM’s SRM-based framework prioritizes generalization performance over in-sample fit. By minimizing an explicit bound on expected risk rather than empirical risk alone, SVM is theoretically more robust to overfitting and better suited for datasets with high noise and limited effective sample sizes.
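To make the kernel expansion inside Equation (13) concrete, the sketch below evaluates the sum of (α_i − α_i*) K(x_i, x) plus b for a given set of support vectors, together with the ε-insensitive loss that underlies the ε term in Equation (12). The radial basis function kernel, the value of `gamma`, the tube width `eps`, and all function names are assumptions for illustration; the kernel used in the study is not stated in this section.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """K(a, b) = exp(-gamma * ||a - b||^2); gamma is an assumed value."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def kernel_expansion(x, support_vectors, dual_coefs, b, kernel=rbf_kernel):
    """Evaluate sum_i (alpha_i - alpha_i^*) K(x_i, x) + b.

    `dual_coefs` holds the differences (alpha_i - alpha_i^*), one per support vector.
    """
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + b


def eps_insensitive_loss(y, y_hat, eps=0.1):
    """Errors inside the eps-tube cost nothing; larger errors grow linearly."""
    return max(0.0, abs(y - y_hat) - eps)
```

For regression, the value returned by `kernel_expansion` is used directly as the forecast, whereas taking its sign yields the classification rule. Only observations with a nonzero coefficient difference contribute to the sum, which is one source of the model's parsimony.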

3.2.3. Hybrid Ensembles

To test the “Complexity Paradox,” we constructed hybrid architectures (GRU + SVM, LSTM + SVM, GRU + LSTM). These ensembles aim to combine the feature-extraction capabilities of RNNs with the margin-maximization of SVMs. However, we hypothesize that without careful regularization, such hybridization may amplify error variance in high-volatility regimes.

3.2.4. Proposed Prediction Framework

The proposed forecasting framework integrates various machine learning and deep learning models to evaluate their predictive performance in cryptocurrency markets. To ensure the transparency and reproducibility of the empirical analysis, the final hyperparameter configurations used to implement each model are reported in the algorithm below. These parameters represent the optimal settings used during training. The price prediction procedure consists of the following sequential steps:

Input: Cryptocurrency dataset
Output: Predicted values and performance metrics (MSE, MAE, R², MAPE)

Load cryptocurrency dataset.
Remove non-predictive variables such as date.
Split the dataset into training and test sets (70–30).
Apply MinMaxScaler normalization to both input variables and the target variable [89].
Reshape the dataset for recurrent neural network (RNN) models into the form (n, 1, p).
Train prediction models:
- GRU [86]
- LSTM [9]
- SVM [90]
- GRU + SVM [86,90]
- LSTM + SVM [9,90]
- GRU + LSTM [9,86]
Generate predictions for the test dataset.
Apply inverse transformation to obtain original price values.
Compute prediction errors and analyze error distributions using KDE.
Evaluate model performances using MSE, MAE, R², and MAPE metrics.
Ensure reproducibility by fixing the random seed (SEED = 42) [89].
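The preprocessing steps listed above (split, scale, reshape, inverse transform) can be sketched in plain Python as follows. Two points are assumptions rather than statements from the source: the split is taken to be chronological without shuffling, and the scaler is fitted on the training portion only so that no test-period information leaks into the scaling. The `MinMax` class is a hypothetical stand-in that mirrors the idea of scikit-learn's MinMaxScaler, not its implementation.

```python
def chronological_split(rows, train_pct=70):
    """Split rows in time order; the test set is strictly later than the training set."""
    cut = len(rows) * train_pct // 100
    return rows[:cut], rows[cut:]


class MinMax:
    """Column-wise min-max scaling to [0, 1] based on the fitted data."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.lo = [min(c) for c in cols]
        # Guard against constant columns to avoid division by zero.
        self.span = [(max(c) - min(c)) or 1.0 for c in cols]
        return self

    def transform(self, rows):
        return [[(v - lo) / s for v, lo, s in zip(r, self.lo, self.span)]
                for r in rows]

    def inverse_transform(self, rows):
        return [[v * s + lo for v, lo, s in zip(r, self.lo, self.span)]
                for r in rows]


def to_rnn_shape(rows):
    """Reshape (n, p) into (n, 1, p): one time step per sample."""
    return [[list(r)] for r in rows]


rows = [[float(i), 2.0 * i] for i in range(10)]  # toy data, not real prices
train, test = chronological_split(rows)
scaler = MinMax().fit(train)
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)   # may exceed 1 when test prices are higher
rnn_input = to_rnn_shape(train_scaled)
restored = scaler.inverse_transform(test_scaled)
```

With a scaler fitted on the training window, scaled test values can fall outside [0, 1] when prices reach new highs or lows, which is a realistic situation in trending cryptocurrency series.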

3.3. Data Partitioning and Evaluation Metrics

The dataset is partitioned into a training set (70%) and a testing set (30%) to ensure a strict out-of-sample evaluation of model performance (Figure 1). The split point is deliberately set so that the test sample includes periods with structural breaks and abrupt volatility. This design enables a rigorous stress test of the models’ robustness and adaptability under turbulent market conditions.

Forecast accuracy is assessed using four widely adopted metrics: MAE, MAPE, MSE, and R². While all metrics are reported, MAE and MAPE are emphasized in the analysis, as they provide more robust, interpretable measures of forecast errors in the presence of outliers than squared-error-based metrics. The R² statistic is employed to evaluate the explanatory power of the models in capturing the variance of the dependent variable, with higher values indicating superior goodness of fit [41,91]. A robust forecasting model is therefore expected to minimize error measures while maximizing R² as close to unity as possible.
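The four metrics can be written compactly as follows. This is a minimal sketch with hypothetical function names; MAPE is expressed in percent here, which is an assumption about the reporting convention, and it requires nonzero actual values.

```python
def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(y, y_hat)) / len(y)


def mse(y, y_hat):
    """Mean squared error; large errors are penalized quadratically."""
    return sum((a - f) ** 2 for a, f in zip(y, y_hat)) / len(y)


def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(y, y_hat)) / len(y)


def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot
```

Because MSE squares each error, a single extreme miss can dominate it, while MAE and MAPE grow only linearly with the size of the miss. R² computed out of sample can also be negative when a model forecasts worse than the test-sample mean.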

3.4. Hyperparameter Configuration of the Models

To ensure the reproducibility of the experiments [92], the final hyperparameter configurations employed in each machine learning and deep learning model are presented in Table 6.

The hyperparameter values reported in Table 6 were determined based on preliminary experiments and settings commonly adopted in the literature [89]. These configurations were kept consistent across all cryptocurrency datasets to ensure a fair comparison among the proposed models. Furthermore, all experiments were conducted with a fixed random seed (seed = 42) to ensure reproducibility.

3.5. Explainable AI (XAI) via SHAP

To mitigate the “black box” opacity inherent in machine learning and deep learning models, we employ SHAP analysis. Grounded in cooperative game theory, SHAP assigns each feature a contribution value reflecting its marginal impact on the model’s predictions [93]. Unlike heuristic feature-importance techniques, SHAP delivers consistent, locally accurate, and globally coherent explanations of model behavior. This interpretability framework enables us to assess whether predictive performance is driven by economically meaningful signals—such as price dynamics—or by overfitting to noisy technical indicators [94,95]. Consequently, SHAP-based analysis ensures that the empirical findings are not only statistically robust but also economically valid [24,25].
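
The game-theoretic idea behind SHAP can be illustrated with an exact Shapley computation on a toy model. The sketch below is not the SHAP library and not the authors' code: it enumerates all feature coalitions, replaces features outside a coalition with a single baseline value (an assumption; practical SHAP explainers rely on background data and approximations), and uses a hypothetical linear predictor with made-up weights.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(coalition: frozenset) -> float:
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear model over three scaled inputs (weights are illustrative).
weights = [0.6, 0.3, 0.01]


def toy_predict(z: Sequence[float]) -> float:
    return sum(w * v for w, v in zip(weights, z))


x = [2.0, 1.0, 4.0]
base = [0.0, 0.0, 0.0]
phi = shapley_values(toy_predict, x, base)
# Local accuracy: contributions sum to f(x) - f(baseline).
gap = toy_predict(x) - toy_predict(base)
print(phi, sum(phi), gap)
```

The enumeration grows exponentially with the number of features, so this form is only practical for a handful of inputs.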

4. Findings

In this study, we employed six distinct configurations: GRU, LSTM, SVM, GRU + SVM, GRU + LSTM, and LSTM + SVM. To rigorously evaluate the “Complexity–Performance” hypothesis, we analyze these models using a range of performance metrics, forecast plots, and percentage error visuals.

4.1. Model Performance Results and the Parsimony–Accuracy Trade-Off

Table 7 reports the out-of-sample performance outcomes. A critical examination of these metrics reveals a distinct hierarchy in model performance, determined by the asset’s volatility profile.

4.1.1. The Superiority of Parsimony in High Volatility

The empirical analysis reveals a striking pattern: the SVM model consistently achieves the lowest error metrics (MAE, MAPE, MSE) across the majority of assets, specifically BTC, ETHUM, and XRP. For instance, in predicting Bitcoin—the most capitalized yet volatile asset—SVM achieved an R² of 0.9991 and a MAPE of 0.55%, significantly outperforming the complex GRU (R² = 0.9651, MAPE = 2.93%) and LSTM (R² = 0.9562, MAPE = 3.13%) models. Similarly, for XRP, which exhibits extreme kurtosis, SVM achieved the lowest MAPE of 1.81%, whereas GRU and LSTM lagged significantly at 8.12% and 6.84%, respectively.

This finding supports our hypothesis that, in turbulent markets, the structural risk minimization (SRM) principle underlying SVM filters out stochastic noise more effectively than gradient-based updates in deep neural networks. While RNNs attempt to capture every temporal nuance, often overfitting to noise in the process, SVMs instead construct a robust decision boundary that generalizes more effectively to unseen volatility.

4.1.2. The Failure of Hybridization and Noise Amplification

Contrary to the expectation that hybrid models combine the “best of both worlds,” our results show that the GRU + LSTM configuration yielded the poorest performance for BTC (MAPE = 4.91%) and XRP (MAPE = 9.33%). The result suggests that stacking recurrent layers on chaotic time-series data may induce “noise amplification,” where the errors of one model propagate to the next, degrading out-of-sample accuracy.

However, a regime-dependent exception is observed. For SOLU, the hybrid LSTM + SVM model achieved the best performance (MAE = 2.31, R² = 0.9908). This finding indicates that hybridization is not universally flawed but is context-dependent; it may offer advantages in specific market microstructures, though it lacks the universal stability of the standalone SVM.

4.2. Visual Analysis of Forecast Stability

An examination of the forecasting plots for BTC (Figure 2) highlights the dynamic behavioral differences between “Shallow” and “Deep” learning. The GRU and LSTM models generate smoother prediction paths; however, during periods of heightened volatility (indices 250–350 and 450–520), they struggle to capture local peaks, exhibiting a “lagging” behavior. In contrast, the SVM model exhibits a sharper response function. Despite its simplicity, SVM captures the turning points with greater precision, validating the argument that parsimonious models adapt more rapidly to structural breaks than deep architectures burdened by memory gates.

For BNB (Figure 3), performance differences intensify in the final 100 observations where volatility spikes. The SVM model follows the actual price trajectory most closely, exhibiting accurate responses to sudden upward movements. While GRU-based models track the general direction, they display increased divergence (over-smoothing) in volatile segments. The hybrid LSTM + SVM and GRU + LSTM models underperform both SVM and standalone LSTM, further supporting the observation that added complexity does not necessarily translate into greater signal retention.

The ETHUM plots (Figure 4) reinforce the dominance of parsimony. The SVM model generates predictions that align most closely with the actual price trajectory during trend reversals. The hybrid architectures fail to deliver meaningful improvements, with the GRU + LSTM combination displaying the weakest alignment. This confirms that for established assets like Ethereum, the noise-filtering capacity of SVM is superior to the stacked memory layers of hybrid RNNs.

SOLU (Figure 5) represents a unique case in which hybridization proves effective. The LSTM + SVM model produces the smoothest forecasts and closest alignment. The finding shows that, for certain volatility regimes, combining LSTM-based temporal feature extraction with SVM’s regression capability can be synergistic. However, the GRU + LSTM hybrid continues to display pronounced deviations, confirming that “Deep+Deep” stacking is generally inferior to “Deep + Shallow” or standalone “Shallow” approaches.

The XRP results (Figure 6) offer the strongest visual evidence for the “Complexity–Performance Paradox.” The SVM model clearly outperforms all others, delivering substantially lower prediction errors. Conversely, the complex GRU + LSTM configuration yields the weakest results. This divergence illustrates that in assets with heavy tails and abrupt jumps (like XRP), deeply coupled architectures lose tracking capability, likely due to gradient saturation or noise overfitting.

4.3. Error Distribution Analysis

The normalized heatmaps (Figure 7) visually summarize the dominance of parsimony. The SVM model consistently clusters within the “purple” (high-performance) areas across almost all metrics and coins. By contrast, the GRU + LSTM model is often concentrated in the “yellow” (low-performance) zones. This scale-free comparison confirms that SVM’s advantage is not an artifact of a single metric but a robust property of the algorithm across the dataset.

Figure 8 illustrates the error distributions via KDE. The SVM model consistently produces the narrowest, most peaked (leptokurtic) error distributions, centered closely around zero. This indicates a high degree of algorithmic stability. In contrast, the GRU + LSTM model exhibits broader, platykurtic distributions with shifted means. This pattern suggests that greater complexity increases prediction variance, reducing the model’s reliability for risk management.
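
To make the KDE comparison concrete, the sketch below estimates the density of two sets of prediction errors with a Gaussian kernel. The kernel choice and the normal-reference bandwidth (1.06 · σ · n^(-1/5)) are assumptions, since the text does not state which settings were used, and the error values are invented for illustration.

```python
from math import exp, pi, sqrt
from statistics import stdev
from typing import Callable, Optional, Sequence


def gaussian_kde(errors: Sequence[float], bandwidth: Optional[float] = None) -> Callable[[float], float]:
    """Return a density function estimated from the errors with a Gaussian kernel."""
    n = len(errors)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not taken from the paper).
        bandwidth = 1.06 * stdev(errors) * n ** (-1 / 5)
    norm = n * bandwidth * sqrt(2 * pi)

    def density(x: float) -> float:
        return sum(exp(-0.5 * ((x - e) / bandwidth) ** 2) for e in errors) / norm

    return density


# Illustrative errors: a tightly clustered model versus a widely dispersed one.
narrow_errors = [-1.0, -0.5, 0.0, 0.5, 1.0]
wide_errors = [-6.0, -2.0, 1.0, 4.0, 8.0]

narrow_pdf = gaussian_kde(narrow_errors)
wide_pdf = gaussian_kde(wide_errors)

# A narrow, zero-centred error distribution has more density mass at zero.
print(narrow_pdf(0.0) > wide_pdf(0.0))
```

A density concentrated near zero corresponds to the narrow curves described for SVM, whereas a flatter density with a shifted centre corresponds to the pattern described for GRU + LSTM.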

4.4. Interpretability and Feature Importance

To understand why the simpler SVM model outperforms deep architectures, we employed SHAP analysis (Figure 9).

The results elucidate the mechanism behind SVM’s success:

Dominance of Price Action: The model assigns the highest predictive weight to immediate price variables—Highest Price (HSP), Opening Price (SOP), and Volume-Weighted Average Price (SWA).

Irrelevance of Lagging Indicators: Conversely, widely used technical indicators such as MACD and Stochastic Oscillator (STOd) exhibit lower marginal contributions.

This interpretability finding is crucial. It implies that in efficient but volatile crypto markets, the most reliable signal is the immediate price action itself. DL models may obscure these direct signals by trying to learn complex temporal patterns from lagging indicators. SVM’s success stems from its ability to focus on these primary variables while treating technical oscillators as secondary noise, effectively implementing a “data-driven Occam’s Razor”.

Given that the GRU + LSTM model initially exhibited the weakest predictive performance for BTC, a SHAP-based feature importance analysis [96] was conducted to identify and eliminate low-impact technical indicators from the dataset. After removing these less informative variables, the model was retrained on the reduced feature set [89]. The resulting forecast curve shows closer alignment between the predicted and actual BTC price movements, indicating improved model generalization (see Figure 10). Moreover, the updated performance metrics show a notable increase in R² and a reduction in MAPE [97], confirming that SHAP-based feature reduction effectively mitigated input-space noise and enhanced the predictive capability of the GRU + LSTM architecture.
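
One simple way to operationalize such a SHAP-based reduction is to rank features by their mean absolute SHAP value and drop those below a relative threshold. The sketch below assumes this rule; the elimination criterion actually used is not stated in the text, so the function names, the `min_share` threshold, and the SHAP values are all hypothetical. Only the feature abbreviations (HSP, SOP, SWA, MACD, STOd) come from the analysis above.

```python
from typing import List, Sequence, Tuple


def rank_by_mean_abs_shap(
    shap_rows: Sequence[Sequence[float]], names: Sequence[str]
) -> List[Tuple[str, float]]:
    """Rank features by mean |SHAP| over samples (rows = samples, columns = features)."""
    n = len(shap_rows)
    scores = [sum(abs(row[j]) for row in shap_rows) / n for j in range(len(names))]
    return sorted(zip(names, scores), key=lambda item: item[1], reverse=True)


def select_features(
    shap_rows: Sequence[Sequence[float]], names: Sequence[str], min_share: float = 0.05
) -> List[str]:
    """Keep features whose share of total mean |SHAP| is at least min_share."""
    ranked = rank_by_mean_abs_shap(shap_rows, names)
    total = sum(score for _, score in ranked)
    if total == 0:
        return list(names)  # nothing to discriminate on; keep everything
    return [name for name, score in ranked if score / total >= min_share]


# Invented SHAP values for two samples (illustration only).
names = ["HSP", "SOP", "SWA", "MACD", "STOd"]
shap_rows = [
    [0.50, -0.30, 0.20, 0.01, -0.02],
    [-0.40, 0.35, -0.25, -0.02, 0.01],
]
kept = select_features(shap_rows, names)
print(kept)
```

The retained columns would then be used to rebuild the training and test matrices before retraining; the threshold itself is a tuning choice and should be validated out of sample.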

5. Discussion and Conclusions

In this study, we revisited the efficacy of computational forecasting techniques in the cryptocurrency domain through a rigorous comparative evaluation of ML and DL models. Rather than relying solely on in-sample fit—a metric often inflated by overfitting—this analysis emphasized out-of-sample predictive accuracy, algorithmic stability across heterogeneous volatility regimes, and the interpretability of decision boundaries. By jointly assessing forecasting outcomes and SHAP-based explanations, the paper offers direct responses to the research questions outlined in the introduction. It challenges the prevailing “complexity bias” in the existing literature. In addition, structural break and stationarity tests were conducted to ensure that latent structural changes in the data did not drive the empirical results.

Addressing RQ1, which asks, “Do ML and DL models differ significantly in out-of-sample forecasting accuracy across cryptocurrencies with heterogeneous liquidity and volatility profiles?”, the empirical results yield a definitive response. We observe a substantial divergence in model performance dictated by asset-specific volatility dynamics. Contrary to the assumption that deep architectures are inherently superior, the parsimonious SVM model consistently outperformed LSTM and GRU networks in assets characterized by extreme volatility and structural breaks, such as Bitcoin and Ripple. This outperformance is robust across all error metrics, suggesting that SVM provides greater forecast stability during sudden price movements. DL approaches, conversely, tend to exhibit advantages only when price dynamics evolve smoothly and temporal dependence is the dominant factor. These findings align with evidence reported by [6,32,40], refuting the universality of DL superiority in crypto markets.

Regarding RQ2, which investigates, “Does increased model complexity and hybridization systematically lead to superior predictive performance, or does it succumb to the ‘Complexity–Performance Paradox’?”, our data indicate that the paradox is real. We find no systematic evidence that increased complexity enhances accuracy in high-entropy environments. The complex GRU + LSTM hybrid architecture notably underperformed simpler single-model specifications in highly volatile assets. This suggests that stacking recurrent layers can amplify noise rather than signal, leading to overfitting and weakened out-of-sample robustness when regime shifts occur frequently—consistent with [7,44]. A conditional exception was observed for Solana, where the LSTM + SVM hybrid delivered strong performance. This nuance indicates that hybridization is only beneficial when model design is tailored to asset-specific liquidity and volatility characteristics, supporting the asset-based perspectives of [14,16].

Turning to RQ3, concerning “Which technical indicators and price-based features exert the strongest influence on model predictions as identified through SHAP-based interpretability analysis?”, the SHAP results provide a clear hierarchy. Across all configurations, immediate price-related variables—specifically Volume-Weighted Average Price, daily High, and Opening price—dominate the prediction mechanism, particularly in SVM-based models. Technical indicators occupy a secondary role; their explanatory power is limited in isolation and becomes meaningful only when combined with core price variables. This finding is consistent with prior evidence reported by [25,63], and it supports the argument that short-term momentum and volatility dynamics play a central role in cryptocurrency price formation [28]. While ref. [59] demonstrates that incorporating Twitter sentiment, news sentiment, and socio-economic variables alongside Bitcoin prices markedly improves altcoin directional accuracy, the SHAP-based evidence of the present study reveals that, within a purely price-based technical feature set, instantaneous price variables consistently dominate secondary indicators. This complementary finding suggests that the marginal contribution of external sentiment signals may be context-dependent and asset-specific.

Lastly, RQ4 addresses, “Are the predictive drivers and model stability consistent across different cryptocurrencies and volatility regimes?” The results reveal significant heterogeneity. While core price variables remain universally important, the directional influence and relative weight of technical indicators vary markedly across assets and volatility regimes. This instability reflects differences in market maturity, liquidity depth, and trading behavior, confirming that cryptocurrency markets are not homogeneous [27,29].

Beyond statistical significance, the empirical findings carry important economic implications for cryptocurrency markets. The relatively strong performance of SVM models in highly volatile assets such as BTC and XRP suggests that simpler machine learning approaches may deliver more reliable forecasting signals under turbulent market conditions. For investors and portfolio managers, this implies that simpler models may yield more stable predictions during periods of sharp price fluctuations, thereby improving portfolio allocation and risk management decisions. In contrast, deep learning models such as LSTM and GRU tend to perform better under calmer market dynamics, where temporal dependencies dominate price movements. This finding indicates that the economic value of forecasting models depends substantially on prevailing market conditions rather than on algorithmic complexity alone [6].

Furthermore, the dominance of price-based variables identified through SHAP analysis underscores that short-term price formation in cryptocurrency markets is driven primarily by instantaneous price dynamics rather than secondary technical indicators. From an economic standpoint, this finding suggests that investors and analysts should prioritize core price variables when constructing forecasting frameworks or trading strategies. From a regulatory and market surveillance perspective, the results demonstrate that cryptocurrency markets exhibit heterogeneous dynamics across assets, implying that risk assessment and monitoring frameworks should account for asset-specific liquidity and volatility characteristics when evaluating market stability.

Taken together, the results demonstrate that algorithmic complexity does not guarantee predictive superiority. In assets prone to frequent regime shifts, parsimonious SVM architectures deliver superior stability relative to DL models. The empirical success of SVM supports the principle of SRM over the ERM paradigm commonly associated with neural networks. While DL models excel at stationary pattern recognition tasks, their performance deteriorates in non-stationary financial environments. SVM’s ability to control the upper bound of generalization error renders it more robust to the stochastic noise inherent in cryptocurrency markets—an interpretation supported by [98,99,100].

The findings provide several practical implications for investors, analysts, risk managers, and policymakers. From a portfolio management perspective, the results indicate that model selection should account for market regimes and asset-specific volatility characteristics rather than relying solely on increasingly complex deep learning architectures. In highly volatile market conditions, simpler machine learning approaches such as Support Vector Machines (SVM) tend to provide more robust and stable forecasting performance. In contrast, during relatively stable market phases where price dynamics evolve more smoothly, deep learning architectures such as LSTM and GRU may offer stronger predictive capabilities. This regime-aware modeling perspective suggests that effective forecasting frameworks should remain flexible and volatility-sensitive, a conclusion consistent with previous evidence in the literature [101,102].

From a broader market and regulatory perspective, the results also reveal that cryptocurrency markets exhibit heterogeneous dynamics across different assets, reflecting variations in liquidity, volatility, and structural characteristics. These differences imply that risk assessment and market monitoring frameworks should incorporate asset-specific features when evaluating market stability. Consequently, both portfolio management strategies and regulatory surveillance systems may benefit from adopting analytical frameworks that explicitly account for volatility regimes and cross-asset heterogeneity.

Despite these contributions, several limitations should be acknowledged. First, the analysis focuses on a limited number of major cryptocurrencies—BTC, BNB, ETHUM, SOLU, and XRP—which may restrict the generalizability of the findings to smaller or less liquid digital assets. Second, the empirical analysis relies on daily data covering the 2020–2025 period; therefore, the results may partly reflect the specific market conditions, structural shifts, and heightened volatility episodes observed during this timeframe. Third, although the study evaluates multiple machine learning and deep learning models, it does not encompass all potential forecasting architectures or alternative feature engineering strategies that may influence predictive performance. In addition, the analysis primarily focuses on forecasting accuracy and model interpretability, while other practical considerations—such as real-time trading implementation, portfolio optimization, and transaction costs—remain outside the scope of the current framework.

Another limitation relates to the explanatory variables used in the models. The present study relies exclusively on price-based technical indicators and does not incorporate sentiment signals from social media platforms or broader macroeconomic variables. Previous research suggests that integrating such information sources can improve directional prediction accuracy in cryptocurrency markets [59]. Incorporating these additional features within the proposed regime-sensitive modeling framework represents a promising direction for future research.

Future research may extend this framework by using longer historical datasets, incorporating lower-liquidity cryptocurrencies, and testing alternative hybrid forecasting architectures. Additionally, adopting regime-switching or real-time forecasting frameworks could provide further insights into model robustness under rapidly changing market conditions and enhance the practical applicability of cryptocurrency forecasting systems.

In conclusion, the findings suggest that effective cryptocurrency forecasting depends less on methodological complexity and more on regime sensitivity and asset-specific dynamics. The results demonstrate that (i) machine learning and deep learning models exhibit significant differences in out-of-sample performance, (ii) increasing model complexity does not necessarily improve predictive accuracy and may sometimes reduce it, (iii) price-based indicators remain dominant predictors, and (iv) the effects of explanatory variables vary across both assets and volatility regimes. Accordingly, cryptocurrency forecasting should adopt a dynamic, asset-aware, and volatility-sensitive modeling perspective rather than defaulting to increasingly complex predictive architectures.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math14060989/s1, File S1: Dataset used in this study. Each worksheet contains the data for a specific cryptocurrency included in the analysis.

Author Contributions

Conceptualization, C.Y.K., M.G., O.N.A. and H.B.; methodology, O.N.A., M.G. and H.B.; validation, C.Y.K. and A.Y.; formal analysis, O.N.A. and M.G.; investigation, C.Y.K., O.N.A., H.B. and A.Y.; resources, C.Y.K. and H.B.; data curation, O.N.A. and M.G.; writing—original draft preparation, C.Y.K., M.G., O.N.A., H.B. and A.Y.; writing—review and editing, C.Y.K., M.G., O.N.A., H.B. and A.Y.; visualization, O.N.A. and M.G.; supervision, C.Y.K. and M.G.; project administration, C.Y.K., M.G., O.N.A., H.B. and A.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the Supplementary Materials. Further inquiries can be directed to the corresponding author.

Acknowledgments

During the preparation of this manuscript, we used ChatGPT (OpenAI, GPT-5.2) and Grammarly (free version) for language editing and grammar refinement. This assistance was limited to improving clarity and readability. The scientific content, analysis, interpretation, and scholarly insights of this work are entirely those of the authors. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Dupuis, D.; Gleason, K. Money Laundering with Cryptocurrency: Open Doors and the Regulatory Dialectic. J. Financ. Crime 2021, 28, 60–74.
Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System; Klaus Nordby: Oslo, Norway, 2008.
CoinMarketCap. Cryptocurrencies. Available online: https://coinmarketcap.com/ (accessed on 25 December 2025).
Safi, S.K. Comparative Study on Forecasting Accuracy among Moving Average Models with Simulation and PALTEL Stock Market Data in Palestine. Am. J. Theor. Appl. Stat. 2013, 2, 202.
Teixeira, L.A.; de Oliveira, A.L.I. Predicting Stock Trends through Technical Analysis and Nearest Neighbor Classification. In Proceedings of the 2009 IEEE International Conference on Systems, Man and Cybernetics, San Antonio, TX, USA, 11–14 October 2009; IEEE: New York, NY, USA, 2009; pp. 3094–3099.
Akyildirim, E.; Goncu, A.; Sensoy, A. Prediction of Cryptocurrency Returns Using Machine Learning. Ann. Oper. Res. 2021, 297, 3–36.
Jaquart, P.; Köpke, S.; Weinhardt, C. Machine Learning for Cryptocurrency Market Prediction and Trading. J. Financ. Data Sci. 2022, 8, 331–352.
Attri, S.; Singh, S. Exploring the Landscape of Cryptocurrency Forecasting Research: A Bibliometric Perspective. Glob. Knowl. Mem. Commun. 2025. ahead-of-print.
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
Hamayel, M.J.; Owda, A.Y. A Novel Cryptocurrency Price Prediction Model Using GRU, LSTM and Bi-LSTM Machine Learning Algorithms. AI 2021, 2, 477–496.
Seabe, P.L.; Moutsinga, C.R.B.; Pindza, E. Forecasting Cryptocurrency Prices Using LSTM, GRU, and Bi-Directional LSTM: A Deep Learning Approach. Fractal Fract. 2023, 7, 203.
Nasirtafreshi, I. Forecasting Cryptocurrency Prices Using Recurrent Neural Network and Long Short-Term Memory. Data Knowl. Eng. 2022, 139, 102009.
Guo, H.; Zhang, D.; Liu, S.; Wang, L.; Ding, Y. Bitcoin Price Forecasting: A Perspective of Underlying Blockchain Transactions. Decis. Support Syst. 2021, 151, 113650.
Murray, K.; Rossi, A.; Carraro, D.; Visentin, A. On Forecasting Cryptocurrency Prices: A Comparison of Machine Learning, Deep Learning, and Ensembles. Forecasting 2023, 5, 196–209.
Ammer, M.A.; Aldhyani, T.H.H. Deep Learning Algorithm to Predict Cryptocurrency Fluctuation Prices: Increasing Investment Awareness. Electronics 2022, 11, 2349.
Jin, C.; Li, Y. Cryptocurrency Price Prediction Using Frequency Decomposition and Deep Learning. Fractal Fract. 2023, 7, 708.
Jay, A.; Berlanga, R. Algorithmic Complexity vs. Market Efficiency: Evaluating Wavelet–Transformer Architectures for Cryptocurrency Price Forecasting. Algorithms 2026, 19, 101.
Luo, X.; Yin, W. Counterfactual Explanation-Based Cryptocurrency Price Prediction. Entropy 2026, 28, 65.
Lian, T.K.; Al-Hadi, I.A.A.-Q.; Alomari, M.A.; Al-Andoli, M.N.; Jasser, M.B.; Gaid, A.S.A. A Comparative Study of Deep Learning Models for Bitcoin Price Prediction Using NeuralProphet, RNN, and LSTM. Eng. Technol. Appl. Sci. Res. 2026, 16, 31263–31273.
Mahdi, E.; Martin-Barreiro, C.; Cabezas, X. A Novel Hybrid Approach Using an Attention-Based Transformer + GRU Model for Predicting Cryptocurrency Prices. Mathematics 2025, 13, 1484.
Rodrigues, F.; Machado, M. High-Frequency Cryptocurrency Price Forecasting Using Machine Learning Models: A Comparative Study. Information 2025, 16, 300.
Poudel, S.; Paudyal, R.; Cankaya, B.; Sterlingsdottir, N.; Murphy, M.; Pandey, S.; Vargas, J.; Poudel, K. Cryptocurrency Price and Volatility Predictions with Machine Learning. J. Mark. Anal. 2023, 11, 642–660.
Nosratabadi, S.; Mosavi, A.; Duan, P.; Ghamisi, P.; Filip, F.; Band, S.; Reuter, U.; Gama, J.; Gandomi, A. Data Science in Economics: Comprehensive Review of Advanced Machine Learning and Deep Learning Methods. Mathematics 2020, 8, 1799.
Wang, M.; Zheng, K.; Yang, Y.; Wang, X. An Explainable Machine Learning Framework for Intrusion Detection Systems. IEEE Access 2020, 8, 73127–73141.
Fatahi, R.; Nasiri, H.; Dadfar, E.; Chehreh Chelgani, S. Modeling of Energy Consumption Factors for an Industrial Cement Vertical Roller Mill by SHAP-XGBoost: A “Conscious Lab” Approach. Sci. Rep. 2022, 12, 7543.
Badar, W.; Ramzan, S.; Raza, A.; Fitriyani, N.L.; Syafrudin, M.; Lee, S.W. Enhanced Interpretable Forecasting of Cryptocurrency Prices Using Autoencoder Features and a Hybrid CNN-LSTM Model. Mathematics 2025, 13, 1908.
Šťastný, T.; Koudelka, J.; Bílková, D.; Marek, L. Clustering and Modelling of the Top 30 Cryptocurrency Prices Using Dynamic Time Warping and Machine Learning Methods. Mathematics 2022, 10, 3672.
Ftiti, Z.; Louhichi, W.; Ben Ameur, H. Cryptocurrency Volatility Forecasting: What Can We Learn from the First Wave of the COVID-19 Outbreak? Ann. Oper. Res. 2023, 330, 665–690.
Lapitskaya, D.; Eratalay, M.H.; Sharma, R. Prediction of Cryptocurrency Prices with the Momentum Indicators and Machine Learning. Comput. Econ. 2025, 66, 2483–2501.
Mostafa, F.; Saha, P.; Islam, M.R.; Nguyen, N. GJR-GARCH Volatility Modeling under NIG and ANN for Predicting Top Cryptocurrencies. J. Risk Financ. Manag. 2021, 14, 421.
Li, Z.; Tran, M.-N.; Wang, C.; Gerlach, R.; Gao, J. A Bayesian Long Short-Term Memory Model for Value at Risk and Expected Shortfall Joint Forecasting. arXiv 2021, arXiv:2001.08374.
Dimitriadou, A.; Gregoriou, A. Predicting Bitcoin Prices Using Machine Learning. Entropy 2023, 25, 777.
Zhang, J.; Cai, K.; Wen, J. A Survey of Deep Learning Applications in Cryptocurrency. iScience 2024, 27, 108509.
Chong, T.; Ng, W.-K.; Liew, V. Revisiting the Performance of MACD and RSI Oscillators. J. Risk Financ. Manag. 2014, 7, 1–12.
Wang, J.; Kim, J. Predicting Stock Price Trend Using MACD Optimized by Historical Volatility. Math. Probl. Eng. 2018, 2018, 9280590.
Zatwarnicki, M.; Zatwarnicki, K.; Stolarski, P. Effectiveness of the Relative Strength Index Signals in Timing the Cryptocurrency Market. Sensors 2023, 23, 1664.
Kapur, G.; Manohar, S.; Mittal, A.; Jain, V.; Trivedi, S. Cryptocurrency Price Fluctuation and Time Series Analysis through Candlestick Pattern of Bitcoin and Ethereum Using Machine Learning. Int. J. Qual. Reliab. Manag. 2024, 41, 2055–2074.
Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995; ISBN 978-1-4757-2442-4.
Kara, Y.; Acar Boyacioglu, M.; Baykan, Ö.K. Predicting Direction of Stock Price Index Movement Using Artificial Neural Networks and Support Vector Machines: The Sample of the Istanbul Stock Exchange. Expert Syst. Appl. 2011, 38, 5311–5319.
Liu, Y.; Li, Z.; Nekhili, R.; Sultan, J. Forecasting Cryptocurrency Returns with Machine Learning. Res. Int. Bus. Finance 2023, 64, 101905.
Moghaddam, A.H.; Moghaddam, M.H.; Esfandyari, M. Stock Market Index Prediction Using Artificial Neural Network. J. Econ. Financ. Adm. Sci. 2016, 21, 89–93.
Zhao, Y.; Salem, S.; AL-Zaydi, A.M.; Seong, J.-T.; Alghamdi, F.M.; Yusuf, M. On Fitting and Forecasting the Log-Returns of Bitcoin and Ethereum Exchange Rates via a New Sine-Based Logistic Model and Robust Regression Methods. Alex. Eng. J. 2024, 96, 225–236.
Brini, A.; Lenz, J. Pricing Cryptocurrency Options with Machine Learning Regression for Handling Market Volatility. Econ. Model. 2024, 136, 106752.
Koker, T.E.; Koutmos, D. Cryptocurrency Trading Using Machine Learning. J. Risk Financ. Manag. 2020, 13, 178.
Kiranmai Balijepalli, N.S.S.; Thangaraj, V. Prediction of Cryptocurrency’s Price Using Ensemble Machine Learning Algorithms. Eur. J. Manag. Bus. Econ. 2025. ahead of print.
Cortez, K.; del Pilar Rodríguez-García, M.; Mongrut, S. Exchange Market Liquidity Prediction with the K-Nearest Neighbor Approach: Crypto vs. Fiat Currencies. Mathematics 2021, 9, 56.
Boozary, P.; Sheykhan, S.; GhorbanTanhaei, H. Forecasting the Bitcoin Price Using the Various Machine Learning: A Systematic Review in Data-Driven Marketing. Syst. Soft Comput. 2025, 7, 200209.
Alnami, H.; Mohzary, M.; Assiri, B.; Zangoti, H. An Integrated Framework for Cryptocurrency Price Forecasting and Anomaly Detection Using Machine Learning. Appl. Sci. 2025, 15, 1864.
Liu, M.; Li, G.; Li, J.; Zhu, X.; Yao, Y. Forecasting the Price of Bitcoin Using Deep Learning. Financ. Res. Lett. 2021, 40, 101755.
Lu, W.; Li, J.; Wang, J.; Qin, L. A CNN-BiLSTM-AM Method for Stock Price Prediction. Neural Comput. Appl. 2021, 33, 4741–4753.
Wang, H.; Wang, J.; Cao, L.; Li, Y.; Sun, Q.; Wang, J. A Stock Closing Price Prediction Model Based on CNN-BiSLSTM. Complexity 2021, 2021, 5360828.
Nouira, A.Y.; Bouchakwa, M.; Amara, M. Role of Social Networks and Machine Learning Techniques in Cryptocurrency Price Prediction: A Survey. Soc. Netw. Anal. Min. 2024, 14, 152.
Shukla, A.; Das, T.K.; Roy, S.S. TRX Cryptocurrency Profit and Transaction Success Rate Prediction Using Whale Optimization-Based Ensemble Learning Framework. Mathematics 2023, 11, 2415.
Arratia, A.; López-Barrantes, A.X. Do Google Trends Forecast Bitcoins? Stylized Facts and Statistical Evidence. J. Bank. Financ. Technol. 2021, 5, 45–57.
Morozova, E.; Panov, V. Bitcoin Price Modelling via Analysis of Google Trends Data: Lévy-Based Approach. Financ. Res. Lett. 2025, 86, 108301.
Wang, Y.; Wang, C.; Sensoy, A.; Yao, S.; Cheng, F. Can Investors’ Informed Trading Predict Cryptocurrency Returns? Evidence from Machine Learning. Res. Int. Bus. Financ. 2022, 62, 101683.
Chen, Z. From Disruption to Integration: Cryptocurrency Prices, Financial Fluctuations, and Macroeconomy. J. Risk Financ. Manag. 2025, 18, 360.
Mudassir, M.; Bennbaia, S.; Unal, D.; Hammoudeh, M. Time-Series Forecasting of Bitcoin Prices Using High-Dimensional Features: A Machine Learning Approach. Neural Comput. Appl. 2025, 37, 22979–22993.
Gupta, A.; Pandey, G.; Gupta, R.; Das, S.; Prakash, A.; Garg, K.; Sarkar, S. Machine Learning-Based Approach for Predicting the Altcoins Price Direction Change from a High-Frequency Data of Seven Years Based on Socio-Economic Factors, Bitcoin Prices, Twitter and News Sentiments. Comput. Econ. 2024, 64, 2981–3026.
Alenazi, M.M.; Jaskani, F.H. Hybrid Cloud–Edge Architecture for Real-Time Cryptocurrency Market Forecasting: A Distributed Machine Learning Approach with Blockchain Integration. Mathematics 2025, 13, 3044.
Kabashkin, I.; Perekrestov, V.; Pivovar, M. Token-Based Digital Currency Model for Aviation Technical Support as a Service Platforms. Mathematics 2025, 13, 1297. [Google Scholar] [CrossRef]
Mohanty, S.; Dash, R. A New Dual Normalization for Enhancing the Bitcoin Pricing Capability of an Optimized Low Complexity Neural Net with TOPSIS Evaluation. Mathematics 2023, 11, 1134. [Google Scholar] [CrossRef]
Cohen, G.; Qadan, M. The Complexity of Cryptocurrencies Algorithmic Trading. Mathematics 2022, 10, 2037. [Google Scholar] [CrossRef]
Ciano, T. Bitcoin Price Prediction and Machine Learning Features: New Financial Scenarios. In Encyclopedia of Monetary Policy, Financial Markets and Banking; Elsevier: Amsterdam, The Netherlands, 2025; pp. 683–693. [Google Scholar]
Fang, F.; Chung, W.; Ventre, C.; Basios, M.; Kanthan, L.; Li, L.; Wu, F. Ascertaining Price Formation in Cryptocurrency Markets with Machine Learning. Eur. J. Financ. 2024, 30, 78–100. [Google Scholar] [CrossRef]
Kumar, G.; Singh, U.P.; Jain, S. Hybrid Evolutionary Intelligent System and Hybrid Time Series Econometric Model for Stock Price Forecasting. Int. J. Intell. Syst. 2021, 36, 4902–4935. [Google Scholar] [CrossRef]
Huang, Y.; Capretz, L.F.; Ho, D. Machine Learning for Stock Prediction Based on Fundamental Analysis. In Proceedings of the 2021 IEEE Symposium Series on Computational Intelligence (SSCI), Orlando, FL, USA, 4–7 December 2021; IEEE: New York, NY, USA, 2021; pp. 1–10. [Google Scholar]
Zheng, H.; Zhou, Z.; Chen, J. RLSTM: A New Framework of Stock Prediction by Using Random Noise for Overfitting Prevention. Comput. Intell. Neurosci. 2021, 2021, 8865816. [Google Scholar] [CrossRef]
Jeong, S.H.; Lee, H.S.; Nam, H.; Oh, K.J. Using a Genetic Algorithm to Build a Volume Weighted Average Price Model in a Stock Market. Sustainability 2021, 13, 1011. [Google Scholar] [CrossRef]
Cheng, E.; Lam, C. The Important Legal and Regulatory Influences in the Operation of Contemporary, Sophisticated and Organized Exchanges. J. Bus. Econ. Policy 2019, 6, 10–15. [Google Scholar] [CrossRef]
Qiu, J.; Wang, B.; Zhou, C. Forecasting Stock Prices with Long-Short Term Memory Neural Network Based on Attention Mechanism. PLoS ONE 2020, 15, e0227222. [Google Scholar] [CrossRef]
Bernis, G.; Brunel, N.; Kornprobst, A.; Scotti, S. Stochastic Evolution of Distributions and Functional Bollinger Bands. Appl. Stoch. Models Bus. Ind. 2022, 38, 370–390. [Google Scholar] [CrossRef]
Shahvaroughi Farahani, M.; Razavi Hajiagha, S.H. Forecasting Stock Price Using Integrated Artificial Neural Network and Metaheuristic Algorithms Compared to Time Series Models. Soft Comput. 2021, 25, 8483–8513. [Google Scholar] [CrossRef]
Glabadanidis, P. Timing the Market with a Combination of Moving Averages. Int. Rev. Financ. 2017, 17, 353–394. [Google Scholar] [CrossRef]
Han, M. Commodity Momentum and Reversal: Do They Exist, and If so, Why? J. Futures Mark. 2023, 43, 1204–1237. [Google Scholar] [CrossRef]
Wang, G.; Peskin, C.S. Entrainment of a Cellular Circadian Oscillator by Light in the Presence of Molecular Noise. Phys. Rev. E 2018, 97, 062416. [Google Scholar] [CrossRef]
Khairi, T.W.A.; Zaki, R.M.; Mahmood, W.A. Stock Price Prediction Using Technical, Fundamental and News Based Approach. In Proceedings of the 2019 2nd Scientific Conference of Computer Sciences (SCCS), Baghdad, Iraq, 27–28 March 2019; IEEE: New York, NY, USA, 2019; pp. 177–181. [Google Scholar]
Vargas, M.R.; dos Anjos, C.E.M.; Bichara, G.L.G.; Evsukoff, A.G. Deep Leaming for Stock Market Prediction Using Technical Indicators and Financial News Articles. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 7–12 July 2018; IEEE: New York, NY, USA, 2018; pp. 1–8. [Google Scholar]
Prachyachuwong, K.; Vateekul, P. Stock Trend Prediction Using Deep Learning Approach on Technical Indicator and Industrial Specific Information. Information 2021, 12, 250. [Google Scholar] [CrossRef]
Bai, J.; Perron, P. Estimating and Testing Linear Models with Multiple Structural Changes. Econometrica 1998, 66, 47. [Google Scholar] [CrossRef]
Bai, J.; Perron, P. Computation and Analysis of Multiple Structural Change Models. J. Appl. Econom. 2003, 18, 1–22. [Google Scholar] [CrossRef]
Broock, W.A.; Scheinkman, J.A.; Dechert, W.D.; LeBaron, B. A Test for Independence Based on the Correlation Dimension. Econom. Rev. 1996, 15, 197–235. [Google Scholar] [CrossRef]
Zivot, E.; Andrews, D.W.K. Further Evidence on the Great Crash, the Oil-Price Shock, and the Unit-Root Hypothesis. J. Bus. Econ. Stat. 1992, 10, 251–270. [Google Scholar] [CrossRef]
Ehteram, M.; Najah Ahmed, A.; Khozani, Z.S.; El-Shafie, A. Graph Convolutional Network—Long Short Term Memory Neural Network- Multi Layer Perceptron-Gaussian Progress Regression Model: A New Deep Learning Model for Predicting Ozone Concertation. Atmos. Pollut. Res. 2023, 14, 101766. [Google Scholar] [CrossRef]
Kafi, F.; Yousefi, E.; Ehteram, M.; Ashrafi, K. Stabilized Long Short Term Memory (SLSTM) Model: A New Variant of the LSTM Model for Predicting Ozone Concentration Data. Earth Sci. Inform. 2025, 18, 311. [Google Scholar] [CrossRef]
Cho, K.; van Merrienboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Stroudsburg, PA, USA, 2014; pp. 1724–1734. [Google Scholar]
Ma, X.; Hou, M.; Zhan, J.; Zhong, R. Enhancing Production Prediction in Shale Gas Reservoirs Using a Hybrid Gated Recurrent Unit and Multilayer Perceptron (GRU-MLP) Model. Appl. Sci. 2023, 13, 9827. [Google Scholar] [CrossRef]
Song, X.; Zhang, Y. The Implementation of Dynamic Heteroskedasticity Convertible SVM Model in Financial Time Series. In Proceedings of the 2014 IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA), Ottawa, ON, Canada, 29–30 September 2014; IEEE: New York, NY, USA, 2014; pp. 281–285. [Google Scholar]
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
Van Dao, D.; Adeli, H.; Ly, H.-B.; Le, L.M.; Le, V.M.; Le, T.-T.; Pham, B.T. A Sensitivity and Robustness Analysis of GPR and ANN for High-Performance Concrete Compressive Strength Prediction Using a Monte Carlo Simulation. Sustainability 2020, 12, 830. [Google Scholar] [CrossRef]
Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
Hu, C.; Li, L.; Li, Y.; Wang, F.; Hu, B.; Peng, Z. Explainable Machine-Learning Model for Prediction of In-Hospital Mortality in Septic Patients Requiring Intensive Care Unit Readmission. Infect. Dis. Ther. 2022, 11, 1695–1713. [Google Scholar] [CrossRef]
Kumar, V.; Sznajder, K.K.; Kumara, S. Machine Learning Based Suicide Prediction and Development of Suicide Vulnerability Index for US Counties. npj Ment. Health Res. 2022, 1, 3. [Google Scholar] [CrossRef] [PubMed]
Frie, C.; Riza Durmaz, A.; Eberl, C. Exploration of Materials Fatigue Influence Factors Using Interpretable Machine Learning. Fatigue Fract. Eng. Mater. Struct. 2024, 47, 2752–2773. [Google Scholar] [CrossRef]
Lundberg, S.M.; Allen, P.G.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the Advances in Neural Information Processing Systems 30, 31st Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
Hyndman, R.J.; Koehler, A.B. Another Look at Measures of Forecast Accuracy. Int. J. Forecast. 2006, 22, 679–688. [Google Scholar] [CrossRef]
Ji, S.; Kim, J.; Im, H. A Comparative Study of Bitcoin Price Prediction Using Deep Learning. Mathematics 2019, 7, 898. [Google Scholar] [CrossRef]
Al Giffary, N.F.; Sulianta, F. Prediction Of Cryptocurrency Prices Using LSTM, SVM And Polynomial Regression. arXiv 2024, arXiv:2403.03410. [Google Scholar] [CrossRef]
Ahmed Al-Zakhali, O.; Abdulazeez, A.M. Comparative Analysis of Machine Learning and Deep Learning Models for Bitcoin Price Prediction. Indones. J. Comput. Sci. 2024, 13, 407–419. [Google Scholar] [CrossRef]
Köse, N.; Gür, Y.E.; Ünal, E. Deep Learning and Machine Learning Insights Into the Global Economic Drivers of the Bitcoin Price. J. Forecast. 2025, 44, 1666–1698. [Google Scholar] [CrossRef]
Kalyani, K.; Subbiah Parvathy, V.A.M.; Abdeljaber, H.; Satyanarayana Murthy, T.; Acharya, S.; Prasad Joshi, G.; Won Kim, S. Effective Return Rate Prediction of Blockchain Financial Products Using Machine Learning. Comput. Mater. Contin. 2023, 74, 2303–2316. [Google Scholar] [CrossRef]

Figure 1. Dataset partitioning.

Figure 2. BTC—Performance Plots of Actual and Predicted Values across Models.

Figure 3. BNB—Performance Plots of Actual and Predicted Values across Models.

Figure 4. ETHUM—Performance Plots of Actual and Predicted Values across Models.

Figure 5. SOLU—Performance Plots of Actual and Predicted Values across Models.

Figure 6. XRP—Performance Plots of Actual and Predicted Values across Models.

Figure 7. Normalized Heatmaps of Performance Metrics.

Figure 8. Kernel Density Estimation (KDE) Curves of Percentage Error Distributions.

Figure 9. SHAP Analysis of the Best-Performing Model (BTC–SVM).

Figure 10. BTC price prediction using GRU + LSTM with SHAP-based feature reduction.

Table 1. Descriptive statistics and normality test results.

Statistic	BTC	BNB	ETHUM	SOLU	XRP
Mean	49,997.85	406.70	2379.14	90.16	0.92
Std	28,292.52	199.18	977.66	72.43	0.77
Min	10,542.06	26.86	340.75	1.19	0.21
25%	27,261.95	269.11	1661.99	23.17	0.46
50%	42,753.97	354.64	2291.68	72.77	0.58
75%	64,241.09	581.65	3128.75	151.27	1.00
Max	123,306.43	1047.82	4832.07	261.97	3.55
Skew	0.85	0.21	0.24	0.43	1.65
Kurtosis	−0.25	−0.41	−0.50	−1.19	1.49
JB stat	223.89	27.25	38.27	163.10	994.80
JB p	0.00 *	0.00 *	0.00 *	0.00 *	0.00 *

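Note: Price statistics (mean, standard deviation, minimum, quartiles, and maximum) are expressed in U.S. dollars (USD); skewness, kurtosis, and the Jarque–Bera (JB) statistic are dimensionless. * denotes statistical significance at the 1% level.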
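
The JB statistics in Table 1 can be cross-checked against the reported skewness and kurtosis. The sketch below is a hedged illustration, not the authors' code: it assumes one observation per calendar day over the sample period stated in the abstract (both end dates included) and treats the reported kurtosis as excess kurtosis, which is consistent with the negative values in the table. The function name `jarque_bera` is hypothetical.

```python
from datetime import date


def jarque_bera(n: int, skew: float, excess_kurtosis: float) -> float:
    """Jarque-Bera statistic: JB = n/6 * (S^2 + K^2/4), with K the excess kurtosis."""
    return n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)


# Assumption: daily data with no gaps from 1 October 2020 to 23 September 2025.
n_obs = (date(2025, 9, 23) - date(2020, 10, 1)).days + 1

# Rounded BTC skewness and kurtosis as printed in Table 1.
jb_btc = jarque_bera(n_obs, skew=0.85, excess_kurtosis=-0.25)
print(n_obs, round(jb_btc, 2))
```

Under these assumptions the sample has 1819 observations and the recomputed BTC statistic is about 223.8, close to the 223.89 in the table; the small gap is what one would expect from using inputs rounded to two decimals.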

Table 2. Unit root test results (ADF).

Asset	ADF Statistic	p-Value	Stationarity
BTC	−0.550	0.881	Non-stationary
BNB	−0.779	0.819	Non-stationary
ETHUM	−2.331	0.161	Non-stationary
SOLU	−1.387	0.588	Non-stationary
XRP	−0.927	0.778	Non-stationary

Table 3. Multiple structural break dates (Bai–Perron-type) identified in log-return series.

Asset	Number of Breaks	Structural Break Dates
BTC	5	15 December 2020, 9 January 2021, 24 January 2021, 10 November 2021, 20 November 2022
BNB	5	29 January 2021, 3 February 2021, 18 February 2021, 25 March 2021, 14 April 2021
ETHUM	5	24 April 2021, 14 May 2021, 19 May 2021, 23 January 2025, 8 April 2025
SOLU	5	25 December 2020, 23 February 2021, 28 July 2021, 11 September 2021, 30 December 2022
XRP	5	30 November 2020, 4 April 2021, 14 April 2021, 24 April 2021, 29 April 2021
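
Table 2 tests price levels, whereas Table 3 is computed on log-returns. As a minimal sketch of that transformation (the helper name `log_returns` and the sample prices are hypothetical, and the authors' preprocessing may differ in detail):

```python
import math


def log_returns(prices):
    """Return r_t = ln(P_t / P_{t-1}) for each pair of consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, for illustration only.
closes = [100.0, 110.0, 121.0]
returns = log_returns(closes)
print(returns)
```

A series of n prices yields n − 1 log-returns, and two consecutive 10% price increases give two identical log-returns of ln(1.1).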

Table 4. Brock–Dechert–Scheinkman (BDS) test results for nonlinearity.

Asset	BDS Statistic	p-Value
BTC	4.717	0.000
BNB	13.385	0.000
ETHUM	6.141	0.000
SOLU	9.638	0.000
XRP	12.008	0.000

Table 5. Zivot–Andrews structural-break unit root test results.

Asset	ZA Statistic	p-Value	Break Index	Stationarity
BTC	−3.5577	0.6421	13	Non-stationary
BNB	−3.1292	0.8680	5	Non-stationary
ETHUM	−3.9285	0.3952	17	Non-stationary
SOLU	−3.6771	0.5630	0	Non-stationary
XRP	−6.4511	0.0004	21	Stationary

Table 6. Hyperparameter configuration of the models.

Model	Hyperparameter	Value
GRU	Units	64
	Activation	tanh
	Dense Layer	32 neurons (ReLU)
	Optimizer	Adam
	Learning Rate	0.001
	Loss Function	Mean Squared Error
	Epochs	20
	Batch Size	16
	Input Shape	(timesteps = 1, features = p)
LSTM	Units	64
	Activation	tanh
	Dense Layer	32 neurons (ReLU)
	Optimizer	Adam
	Learning Rate	0.001
	Loss Function	Mean Squared Error
	Epochs	20
	Batch Size	16
	Input Shape	(timesteps = 1, features = p)
SVM	Kernel	RBF (radial basis function)
	C	800
	Gamma	0.001
	Epsilon	0.01
GRU + SVM	GRU Units	64
	Activation	tanh
	SVM Kernel	RBF
	C	600
	Gamma	0.001
	Epsilon	0.01
LSTM + SVM	LSTM Units	64
	Activation	tanh
	SVM Kernel	RBF
	C	600
	Gamma	0.001
	Epsilon	0.01
GRU + LSTM	GRU Units	64
	LSTM Units	64
	Dense Layer	32 neurons (ReLU)
	Optimizer	Adam
	Learning Rate	0.001
	Loss Function	Mean Squared Error
	Epochs	20
	Batch Size	16
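
To make the standalone SVM row of Table 6 concrete, the sketch below instantiates a support vector regressor with those values in scikit-learn. Only `kernel`, `C`, `gamma`, and `epsilon` come from the table. The min-max scaling step, the synthetic random-walk data, the single lagged-close feature, and the 80/20 chronological split are all assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Hyperparameters taken from the standalone SVM row of Table 6.
svr = SVR(kernel="rbf", C=800, gamma=0.001, epsilon=0.01)

# Assumption: features are min-max scaled before fitting.
model = make_pipeline(MinMaxScaler(), svr)

# Synthetic stand-in data: a random walk, with the previous close as the only feature.
rng = np.random.default_rng(0)
close = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=300))
X = close[:-1].reshape(-1, 1)
y = close[1:]

# Assumption: chronological 80/20 split, so the test set follows the training set in time.
split = int(0.8 * len(X))
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
print(pred.shape)
```

Splitting by position rather than at random keeps future observations out of the training set, which matters for any out-of-sample evaluation of a time series.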

Table 7. Model Performance Results.

Coin	Model	MAE	MAPE (%)	R²	MSE
BTC	GRU	2866.3682	2.9310	0.9651	14,288,580.0865
BTC	LSTM	3105.1338	3.1342	0.9562	17,917,769.4491
BTC	SVM	460.5172	0.5455	0.9991	356,745.8254
BTC	GRU + SVM	2341.7460	2.3469	0.9748	10,306,609.6366
BTC	GRU + LSTM	4941.4201	4.9098	0.8930	43,832,097.7282
BTC	LSTM + SVM	2419.2396	2.4403	0.9743	10,497,885.7610
BNB	GRU	11.2273	1.6381	0.9729	258.4689
BNB	LSTM	6.8132	1.0445	0.9907	88.5141
BNB	SVM	5.1133	0.7976	0.9949	48.2872
BNB	GRU + SVM	8.6014	1.2467	0.9839	153.2239
BNB	GRU + LSTM	10.2425	1.5346	0.9809	181.5925
BNB	LSTM + SVM	10.6795	1.5439	0.9776	213.1521
ETHUM	GRU	41.8236	1.4087	0.9924	4013.1953
ETHUM	LSTM	31.3793	1.0507	0.9968	1670.6561
ETHUM	SVM	26.1128	0.8863	0.9978	1157.1261
ETHUM	GRU + SVM	34.1498	1.1324	0.9954	2458.4263
ETHUM	GRU + LSTM	36.6249	1.2276	0.9945	2930.3937
ETHUM	LSTM + SVM	32.2126	1.0513	0.9957	2290.0729
SOLU	GRU	3.5855	2.1374	0.9817	20.3610
SOLU	LSTM	4.2487	2.4844	0.9732	29.7986
SOLU	SVM	3.1835	1.8619	0.9839	17.9050
SOLU	GRU + SVM	3.8195	2.3518	0.9791	23.2488
SOLU	GRU + LSTM	7.1321	4.2370	0.9391	67.6740
SOLU	LSTM + SVM	2.3111	1.3826	0.9908	10.2058
XRP	GRU	0.1843	8.1237	0.9359	0.0666
XRP	LSTM	0.1635	6.8404	0.9484	0.0536
XRP	SVM	0.0368	1.8149	0.9970	0.0031
XRP	GRU + SVM	0.1001	4.2435	0.9788	0.0221
XRP	GRU + LSTM	0.2267	9.3263	0.8976	0.1065
XRP	LSTM + SVM	0.1443	5.8884	0.9558	0.0460
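
The four columns of Table 7 follow the usual textbook definitions of MAE, MAPE, MSE, and R². The sketch below computes them in plain Python on a small made-up series; the function name `forecast_metrics` and the sample values are hypothetical, and the authors' exact implementation is not shown in the source.

```python
def forecast_metrics(actual, predicted):
    """Compute MAE, MAPE (in percent), MSE, and R^2 for paired observations."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE divides by the actual value, so it is undefined if any actual is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "R2": r2}


# Hypothetical actual and predicted prices, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

MAE and MSE carry the units of the price (USD and USD squared), which is why the BTC rows show far larger values than the XRP rows. MAPE and R² are scale-free and are the columns better suited to comparing models across coins.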
