An Integrative System Based on Signal Processing and Tuned Regression Gaussian Process by Grey Wolf Optimization Algorithm for Bitcoin Price Forecasting

Lahmiri, Salim; Bekiros, Stelios

doi:10.3390/math14101615

by Salim Lahmiri ¹,² and Stelios Bekiros ³,*

¹ Department of Supply Chain and Business Technology Management, John Molson School of Business, Concordia University, Montreal, QC H3H 0A1, Canada
² Chaire Innovation et Économie Numérique, ESCA École de Management, Casablanca 20250, Morocco
³ Valter Cantino Department of Management, University of Turin, 10124 Turin, Italy
* Author to whom correspondence should be addressed.

Mathematics 2026, 14(10), 1615; https://doi.org/10.3390/math14101615

Submission received: 23 March 2026 / Revised: 21 April 2026 / Accepted: 30 April 2026 / Published: 9 May 2026

(This article belongs to the Special Issue Applied Time Series and Artificial Intelligence in Economics and Finance)

Abstract

We propose several hybrid predictive systems to forecast the next-day price of Bitcoin. In particular, we apply signal decomposition methods, namely the maximum overlap discrete wavelet transform (MODWT), empirical wavelet transform (EWT), empirical mode decomposition (EMD), and variational mode decomposition (VMD), to extract features from the original price series. The extracted features are then fed to machine learning models for training and forecasting. We implement five machine learning models: regression Gaussian process (RGP), support vector regression (SVR), k-nearest neighbors algorithm (kNN), regression trees (RT), and feedforward neural networks (FFNN). The grey wolf optimization (GWO) algorithm is employed for hyperparameter optimization of the machine learning models. The root mean squared error (RMSE) is used for the evaluation and comparison of the 20 hybrid predictive systems. The simulation results show that the RGP-GWO-VMD hybrid predictive system achieved the lowest forecasting error. In addition, RGP-GWO yielded on average the lowest forecasting error across all of the machine learning systems. Furthermore, among signal decomposition methods, the lowest forecasting error is generally achieved under the EWT. These results from 20 hybrid prediction systems can serve as a baseline for future work and as guidance for traders, investors, and portfolio managers.

Keywords:

MODWT; EWT; EMD; VMD; signal processing; signal decomposition; machine learning; grey wolf optimization; Bitcoin price forecasting

MSC:

68T01

1. Introduction

A cryptocurrency is a digital currency traded over computer networks that is not subject to a government or any central authority. Nowadays, cryptocurrency trading is attracting many investors and scholars, as cryptocurrencies offer a high level of openness and high returns, albeit with high risk and volatility. Bitcoin is broadly acknowledged as one of the most prominent and most traded cryptocurrencies. Therefore, investors and traders have shifted their attention to the trading of cryptocurrencies, especially Bitcoin. In this regard, forecasting Bitcoin prices is crucial for investors and traders, primarily for better portfolio management, and specifically in the short term to generate high profits from active trading. However, it is challenging to accurately predict the Bitcoin price since it is particularly volatile. Therefore, scholars have proposed various models to predict the Bitcoin price in recent years.

For instance, Atsalakis et al. [1] employed a hybrid neuro-fuzzy controller to forecast the direction of the change in the daily price of Bitcoin. They concluded that it outperformed the adaptive neuro-fuzzy system (ANFIS) and the standard artificial neural network. Lahmiri and Bekiros [2] found that the predictability of long short-term memory (LSTM) neural network topologies is significantly higher when compared to the generalized regression neural network (GRNN). Mallqui and Fernandes [3] proposed a predictive system based on the combination of recurrent neural networks (RNN) and decision tree classifiers and found that their system obtained the best results to predict the Bitcoin price direction compared to LSTM and RNN. In addition, the support vector machines (SVM) algorithm obtained the best results to forecast the Bitcoin price compared to the standard artificial neural networks and RNN. Lahmiri and Bekiros [4] implemented and compared the performance of support vector regression (SVR), Gaussian Poisson regressions (GRP), regression trees (RT), k-nearest neighbours (kNN), feedforward neural networks (FFNN), Bayesian regularization neural networks (BRNN), and radial basis function networks (RBFNN). The optimal parameters of the SVR, GRP, and kNN were determined by using the Bayesian optimization method. They found that the BRNN achieved superior accuracy compared to the remaining predictive models. Guo et al. [5] combined the wavelet transform (WT) with a causal multi-head attention temporal convolutional network and found that their proposed model improves the price forecasting performance compared to the autoregressive integrated moving average (ARIMA), ARIMA with additional explanatory variables (ARIMAX), convolutional neural networks (CNN), multilayer perceptron (MLP), LSTM, sequence-to-sequence (Seq2Seq) network, Bayesian neural networks (BNN), and state-frequency memory (SFM) recurrent neural networks.
Koo and Kim [6] found that the LSTM yielded the highest accuracy in predicting price direction movement, followed by RNN and standard MLP. In addition, it was shown that the flattening distribution strategy (FDS), used to eliminate the difference between outliers and non-outliers, significantly improves the accuracy of the LSTM, RNN, and MLP. Lahmiri and Bekiros [7] investigated the effect of standard numerical training algorithms, including conjugate gradient with Powell-Beale restarts, the resilient algorithm, and the Levenberg–Marquardt algorithm, on the accuracy of the deep feedforward neural network (DFFNN) and concluded that the DFFNN trained with the Levenberg–Marquardt algorithm outperforms the DFFNN trained with the Powell-Beale restarts algorithm and the DFFNN trained with the resilient algorithm in forecasting intraday prices of Bitcoin. Rajabi et al. [8] concluded that the learnable window size-MLP (LWS-MLP) is superior to the SVR, ARIMA, random forests (RF), LSTM, MLP, and WaveNet. Rathore et al. [9] found that Facebook’s Prophet performed better than the Naïve model by using open, high, low, volume, and market capitalization as inputs. Hajek et al. [10] concluded that the bagged support vector regression (BSVR) trained with sentiment index outperformed ARIMA, feedforward neural network, random forest (RF), radial basis function neural network (RBFNN), stacked artificial neural network (SANN), and bidirectional long short-term memory (Bi-LSTM). Zou and Herremans [11] proposed a hybrid multimodal model that consists of a support vector machine (SVM) trained with price data, which is fused with a text-based CNN. They concluded that their model can be used to build a profitable trading strategy with a reduced risk over a hold or moving average strategy. Cheng et al. [12] found that LSTM has a noticeable improvement compared to seasonal autoregressive integrated moving average (SARIMA) and Facebook Prophet. Abul Basher and Sadorsky [13] concluded that RF predicts Bitcoin price directions with a higher degree of accuracy than logit models.

A common limitation of the above studies is that they do not directly integrate, within a single forecasting framework, signal processing algorithms that extract intrinsic information from Bitcoin data with tuned machine learning models that improve accuracy. Instead, they rely on direct inputs such as historical observations or exogenous variables to train machine learning systems. This classical forecasting process may introduce noisy data, which can degrade accuracy. Other studies used deep learning models pretrained on very large datasets composed of heterogeneous signals and images; such pretrained models are not necessarily suitable for analyzing and predicting Bitcoin data. In addition, previous works have generally not optimized the machine learning models to improve their effectiveness.

Given the complexity of the dynamics underlying Bitcoin prices and the inherent uncertainty in forecasting such volatile prices, this paper puts forth novel hybrid predictive systems for the purpose of forecasting Bitcoin’s next-day price. Specifically, this study proposes the hybridization of signal processing techniques and machine learning to forecast Bitcoin’s future price. First, signal decomposition methods are used to decompose the original price time series to fully leverage the information contained within the data. Second, machine learning models are trained with decomposition-based information to predict the Bitcoin next-day price, and the grey wolf optimization (GWO) [14] algorithm is employed to optimize the hyperparameters of each machine learning model, thereby enhancing its performance. Finally, the performance of each hybrid predictive system is evaluated based on the root mean squared error (RMSE). The signal decomposition methods include the maximum overlap discrete wavelet transform (MODWT) [15], empirical wavelet transform (EWT) [16], empirical mode decomposition (EMD) [17], and variational mode decomposition (VMD) [18]. The machine learning models include the regression Gaussian process (RGP) [19], support vector regression (SVR) [20], k-nearest neighbors algorithm (kNN) [21], regression trees (RT) [22], and feedforward neural networks (FFNN) [23].

We rely on signal processing/decomposition methods, as they are basically multiresolution techniques used to decompose the original signal into intrinsic components to provide its representation in time-frequency space. Such components provide a meaningful description of the oscillations in the original signal. In addition, machine learning has become a revolutionary tool in time series forecasting. It includes various algorithms that allow computers to learn patterns from data and make predictions with no prior assumptions. For the hybrid system tuning, the grey wolf optimization (GWO) algorithm is chosen. Indeed, the GWO is a swarm intelligence optimization algorithm inspired by the hunting behaviour of grey wolves in nature. It is simple, as it uses only one operator to determine the positions of the solutions (wolves) for problem-solving [14]. Figure 1 shows the flow chart of the hybrid predictive systems.

The main contributions of our paper are as follows:

We implement various signal decomposition algorithms to highlight multiresolution components of the original data.
We implement various machine learning models for forecasting purposes.
We design 20 predictive systems that integrate signal decomposition and machine learning models.
Heuristic optimization is employed to tune all integrative predictive systems based on the GWO algorithm.
To the best of our knowledge, this is the first time a comprehensive set of hybrid predictive systems has been designed and implemented to forecast Bitcoin’s next-day price.
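As a quick illustration of how the 20 systems arise, the sketch below enumerates every pairing of the four decomposition methods with the five GWO-tuned models. The label pattern MODEL-GWO-DECOMPOSITION mirrors names such as RGP-GWO-VMD used in this paper; the variable and function names are illustrative assumptions, not the authors' code.

```python
from itertools import product

# Decomposition methods and machine learning models named in the paper.
DECOMPOSITIONS = ["MODWT", "EWT", "EMD", "VMD"]
MODELS = ["RGP", "SVR", "kNN", "RT", "FFNN"]


def hybrid_system_names(models=MODELS, decompositions=DECOMPOSITIONS):
    """Return one label per (model, decomposition) pair, e.g. 'RGP-GWO-VMD'."""
    return [f"{model}-GWO-{decomp}" for model, decomp in product(models, decompositions)]


# 5 models x 4 decompositions = 20 hybrid systems.
systems = hybrid_system_names()
```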

The paper is structured as follows. Section 2 introduces the signal processing techniques, machine learning models, and grey wolf optimization algorithm. Section 3 presents the data and the simulation results from each hybrid predictive system. Lastly, Section 4 concludes the paper.

2. Methods

To extract intrinsic characteristics of Bitcoin price data, we implement various signal processing algorithms, including MODWT, EWT, EMD, and VMD. Thanks to their ability to analyze signals, they have been applied in various engineering applications. For instance, the MODWT was successful in various applications, including the prediction of pan evaporation [24], solar radiation [25], irrigation flow [26], and wave height [27]. EMD was successful in diverse tasks, including the prediction of tourism demand [28], CO₂ concentration [29], water quality [30], and oil production [31]. The EWT was effective in different problems, including the prediction of wave height [32], wind speed [33], and wave power [34], and fault detection in the wheelset-bearing system [35]. The VMD was used in numerous problems, including the prediction of ship motion attitude [36], signals of the pumped storage hydropower unit [37], and wind speed [39], and fault detection in cryogenic rolling bearings [38]. In addition, the GWO algorithm is adopted in our study to fine-tune machine learning systems, as it was found to be useful in tuning the hyperparameters of the MLP [40,41], support vector machines [42], and AdaBoost, Bagging, and backpropagation neural networks [43]. The variational mode decomposition, regression Gaussian process, and grey wolf optimization algorithm are described next.

2.1. Variational Mode Decomposition (VMD)

The variational mode decomposition (VMD) [18] is an adaptive signal processing technique used to decompose the original signal into variational mode functions (VMFs), where each VMF is centred on a specific frequency s_k estimated during the decomposition process. Determining the centre frequencies and bandwidths of the component signals requires seeking K modal functions such that the total bandwidth of all decomposed VMFs is minimized and their sum equals the original signal. The amplitude and frequency of each VMF at any given point in time are determined using the Hilbert transform, which generates a unilateral spectrum. The bandwidth of each VMF is calculated based on the following steps: (a) apply the Hilbert transform to each VMF to obtain its spectrum, (b) shift the spectral distribution of each VMF to its corresponding frequency baseband, and (c) determine the signal bandwidth using the Wiener filter. The decomposition algorithm is represented in the form of a constrained optimization problem:

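\min_{\{d_{k}\}, \{s_{k}\}} \left\{ \sum_{k = 1}^{K} \left\| ∂_{t} \left[ \left( δ(t) + \frac{j}{π t} \right) * d_{k}(t) \right] e^{- j s_{k} t} \right\|_{2}^{2} \right\}

(1)

Subject to:

\sum_{k = 1}^{K} d_{k}(t) = x(t)

(2)

where s_k is the centre frequency of the kth mode, d_k(t) is the kth mode of the signal x(t), ∂_t denotes the partial derivative with respect to time, δ(t) is the Dirac distribution, j is the imaginary unit, and * denotes convolution. In this work, the number of modes is set to five.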
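For reference, the constrained problem in Equations (1) and (2) is commonly solved as in the original VMD formulation [18], by forming an augmented Lagrangian and alternating updates in the frequency domain. The block below is a sketch of those standard updates, not a description of the specific implementation used in this study. The penalty weight α, the Lagrange multiplier λ, the step size τ, the iteration index n, and the frequency variable ω are symbols introduced here and do not appear in the text above; hats denote Fourier transforms.

```latex
\begin{align}
\mathcal{L}\big(\{d_k\}, \{s_k\}, \lambda\big)
  &= \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * d_k(t) \right] e^{-j s_k t} \right\|_2^2
   + \left\| x(t) - \sum_{k=1}^{K} d_k(t) \right\|_2^2
   + \left\langle \lambda(t),\; x(t) - \sum_{k=1}^{K} d_k(t) \right\rangle \\
\hat{d}_k^{\,n+1}(\omega)
  &= \frac{\hat{x}(\omega) - \sum_{i \neq k} \hat{d}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}
          {1 + 2\alpha \left( \omega - s_k^{n} \right)^2} \\
s_k^{n+1}
  &= \frac{\int_0^{\infty} \omega \, \big| \hat{d}_k^{\,n+1}(\omega) \big|^2 \, d\omega}
          {\int_0^{\infty} \big| \hat{d}_k^{\,n+1}(\omega) \big|^2 \, d\omega} \\
\hat{\lambda}^{n+1}(\omega)
  &= \hat{\lambda}^{n}(\omega) + \tau \left( \hat{x}(\omega) - \sum_{k=1}^{K} \hat{d}_k^{\,n+1}(\omega) \right)
\end{align}
```

The mode update has the form of a Wiener filter centred on s_k, and the centre frequency update is the power-weighted mean frequency of the mode.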

2.2. Regression Gaussian Process

The regression Gaussian process (RGP) [19] is a non-parametric model that is useful for modelling uncertainty in the data. The implicit function f(x) is assumed to follow a Gaussian process:

f(x) \sim \mathcal{GP}(m(x), k_{f}(x, x^{'}))

(3)

where m(x) is the mean function and k_f(x,x′) is the kernel (covariance) function. In this work, the squared exponential kernel is chosen:

k_{f}(x, x^{'}) = σ_{f}^{2} \exp\left(- \frac{{(x - x^{'})}^{2}}{2 l^{2}}\right)

(4)

where σ_f^2 is the variance of the kernel function and l is the length scale. Then, the observed output y_n can be expressed in terms of the implicit function f(x) as:

y_{n} = f(x_{n}) + ξ_{n}

(5)

where ξ_n∼N(0, σ_n^2) is the Gaussian noise.
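To make Equations (3)–(5) concrete, the following minimal sketch computes the textbook Gaussian process posterior mean and variance at a new input [19]. It assumes a zero prior mean m(x) = 0, scalar inputs, and fixed hyperparameters; the function names and default values are hypothetical and are not taken from the authors' implementation.

```python
import math


def sq_exp_kernel(x1, x2, sigma_f=1.0, length=1.0):
    """Squared exponential kernel of Equation (4) for scalar inputs."""
    return sigma_f ** 2 * math.exp(-((x1 - x2) ** 2) / (2.0 * length ** 2))


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) w = b by forward and then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (z[i] - sum(low[k][i] * w[k] for k in range(i + 1, n))) / low[i][i]
    return w


def gp_predict(x_train, y_train, x_new, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Posterior mean and variance of f(x_new), assuming a zero prior mean."""
    n = len(x_train)
    # Covariance of the noisy observations: K + sigma_n^2 * I.
    cov = [[sq_exp_kernel(x_train[i], x_train[j], sigma_f, length)
            + (sigma_n ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    k_star = [sq_exp_kernel(x_new, xi, sigma_f, length) for xi in x_train]
    weights = solve_cholesky(low, list(y_train))
    mean = sum(k * w for k, w in zip(k_star, weights))
    v = solve_cholesky(low, k_star)
    var = sq_exp_kernel(x_new, x_new, sigma_f, length) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var
```

In this sketch the length scale `length` plays the role of l in Equation (4), which is the hyperparameter the paper reports tuning with GWO.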

2.3. Grey Wolf Optimization

The grey wolf optimization (GWO) [14] is an evolutionary metaheuristic optimization algorithm based on swarm intelligence that simulates the hunting behaviour and social leadership of grey wolves. To replicate the leadership hierarchy, the GWO algorithm includes four types of grey wolves: α, β, δ, and ω. In this social hierarchy, α is considered the fittest solution, β and δ are the second and third best solutions, respectively, and the ω wolves rank below these three. Specifically, hunting operations are commonly led by α and β, while δ wolves may hunt intermittently. The GWO algorithm consists of three main steps: encircling, hunting, and attacking prey.

The grey wolves encircle the prey, where their positions are specified by:

\vec{X} (t + 1) = {\vec{X}}_{p} (t) - \vec{A} \cdot \vec{D}

(6)

\vec{D} = |\vec{C} \cdot {\vec{X}}_{p} (t) - \vec{X} (t)|

(7)

where t is the iteration number, \vec{X}_p is the position of the prey, \vec{X} is the position of a grey wolf, and:

\vec{A} = 2 \vec{a} \cdot \vec{r_{1}} - \vec{a}

(8)

\vec{C} = 2 \cdot \vec{r_{2}}

(9)

where \vec{a} decreases linearly from 2 to 0 over the iterations, and \vec{r_{1}} and \vec{r_{2}} are random vectors in [0, 1]. Grey wolf hunting is based on updating the prey position with the help of the three leading wolves (α, β, and δ) as follows:

\vec{D_{α}} = |\vec{C_{1}} \cdot \vec{X_{α}} - \vec{X}|

(10)

\vec{D_{β}} = |\vec{C_{2}} \cdot \vec{X_{β}} - \vec{X}|

(11)

\vec{D_{δ}} = |\vec{C_{3}} \cdot \vec{X_{δ}} - \vec{X}|

(12)

Finally, depending on the estimated positions of α, β, and δ, the solutions are updated for every iteration as follows:

\vec{X_{1}} = \vec{X_{α}} - \vec{A_{1}} \cdot (\vec{D_{α}})

(13)

\vec{X_{2}} = \vec{X_{β}} - \vec{A_{2}} \cdot (\vec{D_{β}})

(14)

\vec{X_{3}} = \vec{X_{δ}} - \vec{A_{3}} \cdot (\vec{D_{δ}})

(15)

\vec{X} (t + 1) = \frac{\vec{X_{1}} + \vec{X_{2}} + \vec{X_{3}}}{3}

(16)

In this study, the GWO algorithm is employed to optimize the parameters and kernels of each predictive system. For RGP, it is used to optimize the length scale of the kernel function. For SVR, it is used to optimize the spatial distribution status parameter of the radial basis function kernel. For RT, it is employed to determine the optimal number of splits (divisions). For kNN, GWO is used to determine the optimal distance metric and the number of neighbors. Finally, for FFNN, it is employed to determine the number of hidden layers, the number of neurons in the hidden layers, and the type of activation function, for instance, sigmoid, hyperbolic tangent, ReLU, or softplus.
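The update rules in Equations (8)–(16) can be sketched in a few lines. The example below is a minimal illustration that minimizes a generic objective over box bounds; it keeps the three best positions found so far as α, β, and δ, and clips positions to the bounds. The function name, defaults, and clipping step are assumptions made for this sketch rather than details reported in the paper.

```python
import random


def gwo_minimize(objective, bounds, n_wolves=20, n_iter=100, seed=0):
    """Minimal grey wolf optimizer; needs n_wolves >= 3. Returns (position, value)."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # best-so-far alpha, beta, delta positions
    for it in range(n_iter):
        # Rank current wolves together with previous leaders; keep the top three.
        leaders = sorted(wolves + leaders, key=objective)[:3]
        a = 2.0 * (1.0 - it / n_iter)  # decreases linearly from 2 towards 0
        updated = []
        for wolf in wolves:
            position = []
            for j, (lo, hi) in enumerate(bounds):
                total = 0.0
                for leader in leaders:
                    big_a = 2.0 * a * rng.random() - a      # Equation (8)
                    big_c = 2.0 * rng.random()              # Equation (9)
                    dist = abs(big_c * leader[j] - wolf[j])  # Equations (10)-(12)
                    total += leader[j] - big_a * dist        # Equations (13)-(15)
                # Equation (16): average of the three candidates, clipped to bounds.
                position.append(min(max(total / 3.0, lo), hi))
            updated.append(position)
        wolves = updated
    best = sorted(wolves + leaders, key=objective)[0]
    return best, objective(best)
```

For hyperparameter tuning, `objective` would typically map a candidate setting (for example, a kernel length scale) to a validation error such as the RMSE; the paper does not specify the exact objective, so that choice is left open here.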

2.4. Performance Measures

For performance evaluation, three common performance metrics are used including the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The RMSE calculates the error magnitude that is in the same units as the target variable; MAE offers a straightforward measure of prediction accuracy; and MAPE facilitates interpretation of the performance in terms of relative error.

The RMSE computes the square root of the average of squared differences between predicted values ŷ_i and actual values y_i over n observations. It is given by:

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y}_{i})}^{2}}

(17)

The MAE calculates the average of absolute differences between predicted and actual values, focusing on the average magnitude of errors, regardless of their direction. It is expressed as follows:

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - \hat{y}_{i}|

(18)

The MAPE computes the average percentage difference between predicted and actual values. It is given by:

\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right| \times 100

(19)

(19)

The lower the performance metric, the better the accuracy of the predictive system.
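The three metrics in Equations (17)–(19) translate directly into code. The sketch below is a plain illustration with hypothetical function names; note that the MAPE is undefined whenever an actual value equals zero.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, Equation (17)."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, Equation (18)."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def mape(y_true, y_pred):
    """Mean absolute percentage error, Equation (19); requires nonzero y_true."""
    n = len(y_true)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / n
```

For actual values [100, 200, 300] and predictions [110, 190, 300], the errors are −10, 10, and 0, so the MAE is 20/3 and the MAPE is (10% + 5% + 0%)/3 = 5%.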

3. Results

We used Bitcoin’s daily price data in US dollars from 27 March 2018 to 27 March 2023, obtained from Yahoo Finance. Figure 2 displays the price series of Bitcoin. One can observe that the Bitcoin price data are non-stationary and noisy. We used the first 80% of the data for training the models and the remaining 20% for testing. As indicated previously, the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are employed to evaluate the performance of each predictive system. RMSE and MAE are expressed in US dollars.
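The 80%/20% split described above is chronological: the earliest observations form the training set and the most recent ones form the test set, with no shuffling. A minimal sketch, using a hypothetical helper name, is:

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    # Everything before the cut is training data; the remainder is test data.
    return list(series[:cut]), list(series[cut:])
```

Keeping the order intact means that every test observation comes after all training observations in time, which avoids look-ahead when forecasting.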

Table 1 provides the RMSE obtained by each forecasting approach that combines a signal processing technique with a machine learning model. As indicated, the RGP model trained with VMD-resulting VMFs and optimized by the GWO algorithm (RGP-GWO-VMD) achieved the lowest forecasting error measured by RMSE, namely 0.0003. In contrast, the kNN trained with EMD-resulting intrinsic mode functions (IMFs) (kNN-GWO-EMD) yielded the highest forecasting error, namely RMSE = 27,983. In addition, Table 2 provides the MAE obtained by each proposed prediction system. As shown, the RGP-GWO-VMD system yielded the lowest prediction error measured by MAE, namely 0.1980. Conversely, the FFNN-GWO-EMD obtained the highest prediction error, namely MAE = 31,256.6425. Also, Table 3 shows the MAPE obtained by each integrative forecasting system. The lowest MAPE (0.0006) is obtained by the RGP-GWO-VMD system, whilst the highest MAPE (36,839) is obtained by the kNN-GWO-VMD. From Tables 1–3, we conclude that all three performance measures identify the same best system: the RGP-GWO-VMD is the best predictive integrative system to forecast the Bitcoin price, whilst the hybrid kNN-GWO-EMD is the worst in terms of RMSE and MAPE, and the FFNN-GWO-EMD is the worst in terms of MAE.

In addition, we performed the Diebold-Mariano (DM) test to verify the null hypothesis of equal forecast accuracy between each baseline hybrid predictive system and RGP-GWO-VMD at a 5% statistical significance level. The results are provided in Table 4. As shown, the null hypothesis is rejected for all pairs of comparisons, as the probability values of the test (p-value) are always below 5%. Thus, the forecasts from the RGP-GWO-VMD are consistently different from those of the other integrative predictive systems.
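A basic version of the Diebold-Mariano statistic can be sketched as follows. The paper does not state the loss function or variance estimator used, so this example assumes a squared-error loss, a one-step-ahead horizon (no autocovariance correction), and the asymptotic standard normal distribution for the two-sided p-value; the function name is hypothetical.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """Return (DM statistic, two-sided p-value) for two forecast error series."""
    if len(errors_a) != len(errors_b):
        raise ValueError("error series must have the same length")
    # Loss differential under squared-error loss.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((di - mean_d) ** 2 for di in d) / n
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

A negative statistic indicates that the first forecast has the lower average squared error; the null hypothesis of equal accuracy is rejected at the 5% level when the p-value is below 0.05.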

Finally, to summarize the results from simulations, we computed the average RMSE obtained by each predictive system. We found that the lowest average RMSE across signal processing algorithms was obtained by RGP-GWO (2.59), followed by SVR-GWO (2.8), RT-GWO (15.31), FFNN-GWO (28.00), and kNN-GWO (20,142.63). Also, we computed the average RMSE for each signal processing algorithm. The lowest average RMSE was obtained by MODWT (1795.7887), followed by VMD (3845.0194), EWT (4903.3795), and EMD (5608.8752).

4. Conclusions

Forecasting cryptocurrencies is receiving growing attention as they represent alternative investments with high profits compared to traditional equities [44,45,46,47,48,49,50]. In this regard, various studies implemented statistical and machine learning models to forecast prices [51,52,53,54,55,56] and volatility of Bitcoin [57,58,59,60] which is the largest traded cryptocurrency.

In this study, we proposed novel integrative predictive systems to forecast Bitcoin’s next-day price. Specifically, we designed new integrative forecasting systems based on combinations of signal processing algorithms (namely, MODWT, EWT, EMD, and VMD) and machine learning systems (namely, RGP, SVR, kNN, RT, and FFNN). In addition, grey wolf optimization was adopted to find the best hyperparameter combinations to further enhance the performance of the integrative prediction systems and alleviate the risk of overfitting. In total, twenty integrative systems were implemented and tested.

The experimental results showed that the RGP-GWO-VMD hybrid predictive system was the best predictive integrative system based on three performance measures: RMSE, MAE, and MAPE. In addition, according to the Diebold-Mariano test, we concluded that the forecasts from the RGP-GWO-VMD are consistently different from those of the other integrative predictive systems. In this regard, the superiority of the RGP-GWO-VMD can be explained by the fact that the RGP is effective in modelling uncertainties inherent in the Bitcoin generating process. In addition, based on Wiener filtering and the Hilbert transform, and by imposing band-limited constraints on the decomposition process, VMD leads to a better representation of complex temporal patterns like trends, noise, and seasonality in the original sequence. Also, the integration of GWO further enhances the RGP-VMD by optimizing the length scale parameter of RGP, which leads to reduced uncertainty in predictions.

Furthermore, the MODWT multiresolution algorithm yielded the lowest average forecasting error measured by RMSE across machine learning systems. However, if the average RMSE is calculated for each signal processing algorithm across all machine learning systems when the weakest performer, kNN-GWO, is excluded, the MODWT algorithm obtains the highest value of RMSE (18.6109). The EWT, EMD, and VMD obtain 7.22445, 15.3440, and 7.5242, respectively. In other words, the performance of the weakest performer, kNN-GWO, improves with MODWT, as it offers better time-frequency localization characteristics that can be exploited by kNN-GWO.

In summary, the empirical results showed that the proposed integrative forecasting system, RGP-GWO-VMD, achieved the lowest forecasting errors among the systems considered. Our study has the merit of shedding light on the optimal design of an integrative predictive system used to forecast the Bitcoin next-day price based on the integration of signal processing, machine learning, and the grey wolf optimization algorithm.

Future work will consider extending the study to other cryptocurrencies, stocks, and commodities.

Author Contributions

Conceptualization, S.L. and S.B.; methodology, S.L. and S.B.; software, S.L.; validation, S.B.; formal analysis, S.L.; investigation, S.L.; resources, S.L. and S.B.; data curation, S.L. and S.B.; writing—original draft preparation, S.L.; writing—review and editing, S.L. and S.B.; visualization, S.L. and S.B.; supervision, S.L. and S.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are openly available in Yahoo Finance at https://finance.yahoo.com/ (accessed on 1 May 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

MODWT	maximum overlap discrete wavelet transform
EWT	empirical wavelet transform
EMD	empirical mode decomposition
VMD	variational mode decomposition
RGP	regression Gaussian process
SVR	support vector regression
kNN	k-nearest neighbors algorithm
RT	regression trees
FFNN	feedforward neural networks
GWO	grey wolf optimization
RMSE	root mean of squared errors

References

1. Atsalakis, G.S.; Atsalaki, I.G.; Pasiouras, F.; Zopounidis, C. Bitcoin price forecasting with neuro-fuzzy techniques. Eur. J. Oper. Res. 2019, 276, 770–780.
2. Lahmiri, S.; Bekiros, S. Cryptocurrency forecasting with deep learning chaotic neural networks. Chaos Solitons Fractals 2019, 118, 35–40.
3. Mallqui, D.C.A.; Fernandes, R.A.S. Predicting the direction, maximum, minimum and closing prices of daily Bitcoin exchange rate using machine learning techniques. Appl. Soft Comput. 2019, 75, 596–606.
4. Lahmiri, S.; Bekiros, S. Intelligent forecasting with machine learning trading systems in chaotic intraday Bitcoin market. Chaos Solitons Fractals 2020, 133, 109641.
5. Guo, H.; Zhang, D.; Liu, S.; Wang, L.; Ding, Y. Bitcoin price forecasting: A perspective of underlying blockchain transactions. Decis. Support Syst. 2021, 151, 113650.
6. Koo, E.; Kim, G. Prediction of Bitcoin price based on manipulating distribution strategy. Appl. Soft Comput. 2021, 110, 107738.
7. Lahmiri, S.; Bekiros, S. Deep learning forecasting in cryptocurrency high-frequency trading. Cogn. Comput. 2021, 13, 485–487.
8. Rajabi, S.; Roozkhosh, P.; Farimani, N.M. MLP-based learnable window size for bitcoin price prediction. Appl. Soft Comput. 2022, 129, 109584.
9. Rathore, R.K.; Mishra, D.; Mehra, P.S.; Pal, O.; Hashim, A.S.; Shapi’I, A.; Ciano, T.; Shutaywi, M. Real-world model for bitcoin price prediction. Inf. Process. Manag. 2022, 59, 102968.
10. Hajek, P.; Hikkerova, L.; Sahut, J.-M. How well do investor sentiment and ensemble learning predict Bitcoin prices? Res. Int. Bus. Financ. 2023, 64, 101836.
11. Zou, Y.; Herremans, D. PreBit—A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin. Expert Syst. Appl. 2023, 233, 120838.
12. Cheng, J.; Tiwari, S.; Khaled, D.; Mahendru, M.; Shahzad, U. Forecasting Bitcoin prices using artificial intelligence: Combination of ML, SARIMA, and Facebook Prophet models. Technol. Forecast. Soc. Change 2024, 198, 122938.
13. Abul Basher, S.; Sadorsky, P. Forecasting Bitcoin price direction with random forests: How important are interest rates, inflation, and market volatility? Mach. Learn. Appl. 2022, 9, 100355.
14. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
15. Percival, D.B.; Walden, A.T. Wavelet Methods for Time Series Analysis; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, MA, USA; New York, NY, USA, 2000.
16. Gilles, J. Empirical wavelet transform. IEEE Trans. Signal Process. 2013, 61, 3999–4010.
17. Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.L.C.; Shih, H.H.; Zheng, Q.N.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Math. Phys. Eng. Sci. 1998, 454, 903–995.
18. Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
19. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; The MIT Press: Cambridge, MA, USA, 2006.
20. Vapnik, V.; Chapelle, O. Bounds on error expectation for support vector machines. Neural Comput. 2000, 12, 2013–2036.
21. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
22. Breiman, L.; Friedman, J.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Chapman & Hall/CRC: Boca Raton, FL, USA, 1998.
23. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition; David, E., McClelland, J.L., PDP Research Group, Eds.; MIT Press: Cambridge, MA, USA, 1986; Volume 1.
Ghaemi, A.; Rezaie-Balf, M.; Adamowski, J.; Kisi, O.; Quilty, J. On the applicability of maximum overlap discrete wavelet transform integrated with MARS and M5 model tree for monthly pan evaporation prediction. Agric. For. Meteorol. 2019, 278, 107647.
Ghimire, S.; Deo, R.C.; Raj, N.; Mi, J. Wavelet-based 3-phase hybrid SVR model trained with satellite-derived predictors, particle swarm optimization and maximum overlap discrete wavelet transform for solar radiation prediction. Renew. Sustain. Energy Rev. 2019, 113, 109247.
Mouatadid, S.; Adamowski, J.F.; Tiwari, M.K.; Quilty, J.M. Coupling the maximum overlap discrete wavelet transform and long short-term memory networks for irrigation flow forecasting. Agric. Water Manag. 2019, 219, 72–85.
Altunkaynak, A.; Çelik, A.; Mandev, M.B. Dynamic adaptive wavelet based fuzzy framework for extended significant wave height forecasting. Ocean Eng. 2024, 295, 116814.
Liao, Z.; Ren, C.; Sun, F.; Tao, Y.; Li, W. EMD-based model with cooperative training mechanism for tourism demand forecasting. Expert Syst. Appl. 2024, 244, 122930.
Yang, G.; Yuan, E.; Wu, W. Predicting the long-term CO2 concentration in classrooms based on the BO-EMD-LSTM model. Build. Environ. 2022, 224, 109568.
Zhang, Y.; Li, C.; Jiang, Y.; Sun, L.; Zhao, R.; Yan, K.; Wang, W. Accurate prediction of water quality in urban drainage network with integrated EMD-LSTM model. J. Clean. Prod. 2022, 354, 131724.
Cao, Y.; Liu, S.; Cao, X.; Liu, X.; Hu, H.; Zhang, T.; Yu, L. EMD-based multi-algorithm combination model of variable weights for oil well production forecast. Energy Rep. 2022, 8, 13389–13398.
Karbasi, M.; Jamei, M.; Ali, M.; Abdulla, S.; Chu, X.; Yaseen, Z.M. Developing a novel hybrid auto encoder decoder bidirectional gated recurrent unit model enhanced with empirical wavelet transform and Boruta-Catboost to forecast significant wave height. J. Clean. Prod. 2022, 379, 134820.
Yang, Q.; Huang, G.; Li, T.; Xu, Y.; Pan, J. A novel short-term wind speed prediction method based on hybrid statistical-artificial intelligence model with empirical wavelet transform and hyperparameter optimization. J. Wind Eng. Ind. Aerodyn. 2023, 240, 105499.
Ni, C.; Peng, W. An integrated approach using empirical wavelet transform and a convolutional neural network for wave power prediction. Ocean Eng. 2023, 276, 114231.
Ding, J. A double impulsiveness measurement indices-bilaterally driven empirical wavelet transform and its application to wheelset-bearing-system compound fault detection. Measurement 2021, 175, 109135.
Zhou, T.; Yang, X.; Ren, H.; Li, C.; Han, J. The prediction of ship motion attitude in seaway based on BSO-VMD-GRU combination model. Ocean Eng. 2023, 288, 115977.
Fang, M.; Zhang, F.; Yang, Y.; Tao, R.; Xiao, R.; Zhu, D. The influence of optimization algorithm on the signal prediction accuracy of VMD-LSTM for the pumped storage hydropower unit. J. Energy Storage 2024, 78, 110187.
Wang, B.; Guo, Y.; Zhang, Z.; Wang, D.; Wang, J.; Zhang, Y. Developing and applying OEGOA-VMD algorithm for feature extraction for early fault detection in cryogenic rolling bearing. Measurement 2023, 216, 112908.
Parri, S.; Teeparthi, K.; Kosana, V. A hybrid methodology using VMD and disentangled features for wind speed forecasting. Energy 2024, 288, 129824.
Liang, J.; Du, Y.; Xu, Y.; Xie, B.; Li, W.; Lu, Z.; Li, R.; Bal, H. Using adaptive chaotic grey wolf optimization for the daily streamflow prediction. Expert Syst. Appl. 2024, 237, 121113.
Yang, Z. Competing leaders grey wolf optimizer and its application for training multi-layer perceptron classifier. Expert Syst. Appl. 2024, 239, 122349.
Wang, Z.; Shang, P.; Mao, X. Feature recognition of complex systems using cumulative residual Tsallis signal entropy and grey wolf optimized support vector machine. Expert Syst. Appl. 2024, 238, 122246.
Li, H.-W.; Wang, L.; Liu, J.-N.; Yang, Y.; Lu, G.-L. Maximizing power density in proton exchange membrane fuel cells: An integrated optimization framework coupling multi-physics structure models, machine learning, and improved gray wolf optimizer. Fuel 2024, 358, 130351.
Papadimitriou, T.; Gogas, P.; Athanasiou, A.F. Forecasting Bitcoin Spikes: A GARCH-SVM Approach. Forecasting 2022, 4, 752–766.
Chevallier, J.; Guégan, D.; Goutte, S. Is It Possible to Forecast the Price of Bitcoin? Forecasting 2021, 3, 377–420.
Seabe, P.L.; Pindza, E.; Moutsinga, C.R.B.; Aphane, M. Temporal Attention-Enhanced Stacking Networks: Revolutionizing Multi-Step Bitcoin Forecasting. Forecasting 2025, 7, 2.
Mba, J.C.; Mwambi, S.M.; Pindza, E. A Monte Carlo Approach to Bitcoin Price Prediction with Fractional Ornstein–Uhlenbeck Lévy Process. Forecasting 2022, 4, 409–419.
Murray, K.; Rossi, A.; Carraro, D.; Visentin, A. On Forecasting Cryptocurrency Prices: A Comparison of Machine Learning, Deep Learning, and Ensembles. Forecasting 2023, 5, 196–209.
Ladhari, A.; Boubaker, H. Deep Learning Models for Bitcoin Prediction Using Hybrid Approaches with Gradient-Specific Optimization. Forecasting 2024, 6, 279–295.
Ampountolas, A. Comparative Analysis of Machine Learning, Hybrid, and Deep Learning Forecasting Models: Evidence from European Financial Markets and Bitcoins. Forecasting 2023, 5, 472–486.
Qureshi, M.; Iftikhar, H.; Rodrigues, P.C.; Rehman, M.Z.; Salar, S.A.A. Statistical Modeling to Improve Time Series Forecasting Using Machine Learning, Time Series, and Hybrid Models: A Case Study of Bitcoin Price Forecasting. Mathematics 2024, 12, 3666.
Lamothe-Fernández, P.; Alaminos, D.; Lamothe-López, P.; Fernández-Gámez, M.A. Deep Learning Methods for Modeling Bitcoin Price. Mathematics 2020, 8, 1245.
Ye, Z.; Wu, Y.; Chen, H.; Pan, Y.; Jiang, Q. A Stacking Ensemble Deep Learning Model for Bitcoin Price Prediction Using Twitter Comments on Bitcoin. Mathematics 2022, 10, 1307.
Sattarov, O.; Makhmudov, F. Risk-Aware Crypto Price Prediction Using DQN with Volatility-Adjusted Rewards Across Multi-Period State Representations. Mathematics 2025, 13, 3012.
Kaygın, C.Y.; Gün, M.; Akarsu, O.N.; Bağcı, H.; Yanık, A. Algorithmic Stability in Turbulent Markets: Unveiling the Superiority of Shallow Learning over Deep Architectures in Cryptocurrency Forecasting. Mathematics 2026, 14, 989.
Alenazi, M.M.; Jaskani, F.H. Hybrid Cloud–Edge Architecture for Real-Time Cryptocurrency Market Forecasting: A Distributed Machine Learning Approach with Blockchain Integration. Mathematics 2025, 13, 3044.
Nakakita, M.; Toyabe, T.; Nakatsuma, T. Bayesian Analysis of Bitcoin Volatility Using Minute-by-Minute Data and Flexible Stochastic Volatility Models. Mathematics 2025, 13, 2691.
Rubio, L.; Alba, K.V.; Velasquez, C.E.; Ramos, F.R. Stacked ML-GARCH for Bitcoin Risk Forecasting: A Novel Ensemble Approach for Superior Value-at-Risk Estimation. Mathematics 2026, 14, 624.
Kim, J.-M.; Jun, C.; Lee, J. Forecasting the Volatility of the Cryptocurrency Market by GARCH and Stochastic Volatility. Mathematics 2021, 9, 1614.
Azman, S.; Pathmanathan, D.; Thavaneswaran, A. Forecasting the Volatility of Cryptocurrencies in the Presence of COVID-19 with the State Space Model and Kalman Filter. Mathematics 2022, 10, 3190.

Figure 1. Flowchart of the proposed predictive systems for forecasting the Bitcoin price. Each system combines a signal processing technique for price decomposition, a machine learning model for training and testing, and grey wolf optimization (GWO) for tuning the hyperparameters of the machine learning model.

Figure 2. Plot of the Bitcoin price time series, with the sample index on the x-axis and the price level on the y-axis.

Table 1. RMSE obtained by each predictive system.

Model	MODWT	EWT	EMD	VMD
FFNN-GWO	23.7585	26.9270	35.7726	25.5540
RGP-GWO	9.5656	0.0004	0.8004	0.0003
kNN-GWO	8904.5	24,488	27,983	19,195
SVR-GWO	10.8910	0.0129	0.2843	0.0079
RT-GWO	30.2288	1.9575	24.5190	4.5349

Table 2. MAE obtained by each predictive system.

Model	MODWT	EWT	EMD	VMD
FFNN-GWO	21,404.8508	25,120.2785	31,256.6425	22,502.4740
RGP-GWO	8127.6220	0.2500	684.7072	0.1980
kNN-GWO	8876.8000	22,850	19,071	16,114
SVR-GWO	8245.3294	12.0220	262.1148	6.8629
RT-GWO	25,916.7054	1473.6921	23,708.1446	3737.2589

Table 3. MAPE obtained by each predictive system.

Model	MODWT	EWT	EMD	VMD
FFNN-GWO	71.6620	83.7710	115.2839	78.3467
RGP-GWO	29.3046	0.0008	2.3804	0.0006
kNN-GWO	7176.3	71.2246	19,183	36,839
SVR-GWO	27.0891	0.0417	0.8715	0.0233
RT-GWO	90.0347	5.0136	84.4070	13.1088
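
The three error measures reported in Tables 1–3 can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions of RMSE, MAE, and MAPE, with MAPE expressed in percent; the function names and the sample prices are hypothetical and are not taken from the study.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))


def mae(actual, predicted):
    """Mean absolute error."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(a - p)))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no zero actual values)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs((a - p) / a)))


# Hypothetical prices for illustration only; not data from the paper.
actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 190.0, 380.0])

print("RMSE:", rmse(actual, predicted))
print("MAE: ", mae(actual, predicted))
print("MAPE:", mape(actual, predicted))
```

For the same set of forecasts, RMSE is never smaller than MAE, so comparing the two is a quick consistency check on units and scaling.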

Table 4. Results from D-M test: RGP-GWO-VMD versus reference models.

Reference Model	D-M Statistic	p-Value	Reference Model	D-M Statistic	p-Value
FFNN-GWO-CWT	5.5243	6.31 × 10⁻⁸	kNN-GWO-CWT	27.9533	1.2439 × 10⁻⁹²
FFNN-GWO-MODWT	−6.9929	1.30 × 10⁻¹¹	kNN-GWO-MODWT	27.7543	7.3203 × 10⁻⁹²
FFNN-GWO-EMD	−28.1139	2.98 × 10⁻⁹³	kNN-GWO-EMD	3.4482	6.3042 × 10⁻⁴
FFNN-GWO-VMD	−23.9177	1.18 × 10⁻⁷⁶	kNN-GWO-VMD	23.5887	2.5358 × 10⁻⁷⁵

SVR-GWO-CWT	43.0342	7.74 × 10⁻¹⁴⁵	RT-GWO-CWT	15.4041	1.611 × 10⁻⁴¹
SVR-GWO-MODWT	−5.2386	2.74 × 10⁻⁷	RT-GWO-MODWT	−16.876	1.30 × 10⁻⁴⁷
SVR-GWO-EMD	−15.7695	4.57 × 10⁻⁴³	RT-GWO-EMD	−76.3537	4.14 × 10⁻²²⁶
SVR-GWO-VMD	19.2096	2.75 × 10⁻⁵⁷	RT-GWO-VMD	−5.7176	2.25 × 10⁻⁸

The null hypothesis is that RGP-GWO-VMD (the overall best model) and the reference model have equal predictive accuracy. The test is performed at the 5% significance level.
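
A minimal sketch of a Diebold-Mariano (D-M) test of this null hypothesis is given below. It is an illustration under stated assumptions, not the implementation behind Table 4: it assumes a squared-error loss, a one-step-ahead horizon, a two-sided alternative, and a standard normal approximation for the test statistic. The function name `diebold_mariano`, the sign convention (loss of model A minus loss of model B), and the synthetic series are all hypothetical.

```python
import math

import numpy as np


def diebold_mariano(actual, forecast_a, forecast_b, horizon=1):
    """Return (statistic, two-sided p-value) for equal predictive accuracy.

    Assumes squared-error loss; a negative statistic means forecast_a has
    the lower average loss under this sign convention.
    """
    y = np.asarray(actual, dtype=float)
    e_a = y - np.asarray(forecast_a, dtype=float)
    e_b = y - np.asarray(forecast_b, dtype=float)
    d = e_a ** 2 - e_b ** 2  # loss differential series
    n = d.size
    d_bar = d.mean()
    centered = d - d_bar
    # Long-run variance: lag-0 autocovariance plus twice the
    # autocovariances up to lag horizon - 1 (none when horizon == 1).
    long_run_var = np.dot(centered, centered) / n
    for lag in range(1, horizon):
        long_run_var += 2.0 * np.dot(centered[lag:], centered[:-lag]) / n
    stat = float(d_bar / math.sqrt(long_run_var / n))
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Synthetic example, not data from the paper: model A is more accurate.
rng = np.random.default_rng(0)
actual = 100.0 + np.cumsum(rng.normal(size=500))
forecast_a = actual + rng.normal(scale=1.0, size=500)
forecast_b = actual + rng.normal(scale=3.0, size=500)

stat, p_value = diebold_mariano(actual, forecast_a, forecast_b)
print("D-M statistic:", stat, "p-value:", p_value)
```

With this convention, a p-value below 0.05 rejects equal predictive accuracy at the 5% level, and the sign of the statistic indicates which of the two forecasts has the lower average loss.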
