1. Introduction
Over the past few years, technological progress and the advent of digital transformation have brought about a paradigm shift in various industries, including the business sector [1]. The rapid progress of digital transformation has undeniably sparked the emergence of fintech (financial technology). These innovations are considered by many to be among the most important developments in the financial sector. The financial landscape is also facing a revolution owing to digital solutions, which are challenging and transforming long-established techniques and practices.
Moreover, fintech provides a broad spectrum of services. These include mobile banking, digital wallets, peer-to-peer payment systems, e-insurance, e-payments, and even cryptocurrencies such as Bitcoin.
Cryptocurrencies, which have taken the world by storm, are a relatively complex medium of exchange. Bitcoin was the first of them, and the heterogeneous nature of the cryptocurrencies that followed has made it difficult to establish a reliable method for predicting their prices using the conventional econometric models, or even the deep learning models, that have been employed to predict trends in other exchange media. Madan et al. [2] used various machine learning methodologies, such as generalized linear models and random forest, to address the Bitcoin prediction challenge. Jiang [3] proposed deep learning methods to predict the Bitcoin price; the study shows that long short-term memory (LSTM) provides the best prediction.
In addition, cryptocurrencies can undergo rapid and substantial price swings over short periods, making them an especially risky investment. This volatility is driven by a range of factors, including speculative trading, market sentiment, and external events. Another distinctive characteristic of cryptocurrency markets is their continuous 24/7 trading. Unlike conventional stock markets, which have fixed trading hours, digital assets can be bought and sold at any time. Predicting the price of any digital financial asset is therefore considered one of the most challenging tasks, and this volatility makes it difficult for investors to stay well informed.
Our study contributes to the existing literature by proposing novel approaches for Bitcoin price prediction using network models and high-frequency data. We employ a network-based method to capture the interdependencies and relationships between different cryptocurrencies and market variables. Additionally, we utilize high-frequency data to capture rapid price fluctuations and market dynamics. Our approach aims to provide a more comprehensive and accurate prediction of Bitcoin prices, addressing the limitations of previous models. The first model integrates long short-term memory (LSTM) and attention mechanisms for sequence learning, with gradient-aware optimization introduced to improve the model’s forecasts and support informed trading decisions. The second model combines ANN-LSTM with gradient-aware optimization to improve its forecasting and trading decision-making capabilities. These models stand out because they use sophisticated techniques that allow them to adapt to the conditions of the ever-changing Bitcoin market.
These models are a helpful tool for participants in the currency trading industry because of their adaptive characteristics, which allow them to remain consistent across a variety of market conditions. Predicting the price of this volatile asset is challenging because it depends on various external factors. The dataset in question contains cryptocurrency information that is dynamic and subject to change as the world transforms and develops. The literature sheds light on the dynamic nature of Bitcoin data, including changing market dynamics, constant updates to cryptocurrency temporal data, and the behavior of cryptocurrency rates of return.
Moreover, social media plays a significant part. A single tweet or news report can send cryptocurrency prices soaring or plummeting. This dynamic relationship between social media and cryptocurrency markets adds an extra layer of complexity.
In recent years, deep learning techniques have been applied to time series forecasting, especially in popular real-world application areas such as cryptocurrencies, owing to the market’s instability and dynamism. The majority of these models employ advanced deep learning strategies based on long short-term memory (LSTM), attention mechanisms, and gradient-based optimization techniques, among others.
Deep learning models have shown superior performance in predicting cryptocurrency prices compared with traditional machine learning models. Together, these tools offer a powerful framework that is well suited to exploring the complex and highly volatile cryptocurrency landscape (Sun et al. [4]). Hence, researchers have devoted considerable effort to advancing time series forecasting models, investigating different combinations to identify the most effective approach for price forecasting. As it stands now, investing in cryptocurrencies, or even setting their exchange rates, is a gamble.
The paper is organized as follows. In Section 2, we review the literature on the market under consideration, examining the methods used to estimate cryptocurrency values. Section 3 describes the methods for estimating Bitcoin prices as well as the research contributions. Section 4 presents our methodology. Section 5 presents the research findings, which are discussed in Section 6. Finally, Section 7 concludes the paper.
2. Literature Review
The trading and exchanging of cryptocurrencies across the globe have increased significantly over the last decade. This upsurge has pushed their market value to hundreds of billions of dollars globally. In January 2021, this figure reached USD 1 trillion [5]. In financial market modeling, accurate forecasting and sound investment choices depend on a solid grasp of the dynamics of asset prices, entry points, and market behavior. Gu [6] built upon the model of Tramontana et al., constructing a new two-dimensional discontinuous piecewise linear (PWL) map with three branches and introducing trend followers, who adhere to the most recent price trend, to drive the financial market model.
Forecasting the value of digital currencies is a challenge because they are volatile and rely on unique systems. According to analysts, prices keep changing because the underlying technologies are still emerging and have no clear future monetary value. Bitcoin has recently attracted the attention of the media and investors, yet its price, like that of other cryptocurrencies, remains difficult to estimate. Earlier findings suggest that deep learning algorithms can boost accuracy in forecasting cryptocurrency values by uncovering intricate patterns in complex and dynamic datasets. Through these techniques, behaviors or movements within unstable cryptocurrency markets can be identified. To achieve more accurate predictions, Bangroo et al. [7] used machine learning algorithms such as the random forest regressor and the gradient boosting regressor to predict the prices of cryptocurrencies such as Bitcoin, XRP, Ethereum, and Stellar. Sun et al. [4] proposed three models, namely a support vector machine (SVM), a random forest (RF) model, and a light gradient boosting machine, to forecast prices in the cryptocurrency market. Lahmiri et al. [8] presented two deep learning methods, a deep learning neural network (DLNN) and a generalized regression neural network (GRNN), to forecast the price of Bitcoin. Modi et al. [9] investigated the use of deep neural networks, specifically a shallow bidirectional-LSTM (Bi-LSTM) model, to forecast daily closing prices for Bitcoin. Tripathi and Sharma [10] explored how to model Bitcoin values using deep learning, Bayesian optimization, and signal processing techniques. Chen [11] focused on the prediction of Bitcoin prices using deep learning algorithms such as CNN, LSTM, and GRU.
Additionally, Zhou et al. [12] focus on deep learning within the financial markets and offer insights into the potential applications of deep learning techniques for Bitcoin returns.
Our findings build on previous research using deep learning approaches to estimate the price of Bitcoin and other cryptocurrencies. Indeed, various deep learning models have been used over the previous five years and have been shown to be among the most effective tools for forecasting cryptocurrency prices. The research of Kristjanpoller and Minutolo [13] significantly advanced the area by introducing a hybrid MLP neural network-GARCH model for predicting Bitcoin price volatility. They conducted a comprehensive assessment of multiple GARCH models and identified the benefits of combining linear and nonlinear models for better forecasting of Bitcoin price volatility. Also, Nakano et al. [14] used an MLP neural network to estimate Bitcoin returns based on a variety of technical indicators.
Further, in 2023, Akila et al. [15] recommended LSTM networks, a deep learning technique, for forecasting cryptocurrency prices. Their method used historical price data and technical indicators as inputs to the LSTM model, a choice motivated by LSTM’s ability to identify underlying patterns and trends in data. The outcomes revealed that the LSTM predicted future cryptocurrency prices effectively. Moreover, Gurgul et al. [16] connect their method with recent research on artificial intelligence risk measurement and safe artificial intelligence, emphasizing the importance of considering both financial and textual data when forecasting cryptocurrency prices. This is especially important for investors, traders, and policymakers, who rely on accurate forecasts to make sound judgments.
Bitcoin, one of the most important decentralized cryptocurrencies, was introduced by Satoshi Nakamoto [17] on 31 October 2008. Another notable study focused on Bitcoin is that of Liu et al. [18]. Building on the advancements in deep learning for cryptocurrency price prediction, they used a different deep learning technique, stacked denoising autoencoders (SDAEs), to forecast Bitcoin’s price. The SDAE outperformed other models in both directional and level prediction of the Bitcoin price.
Furthermore, deep learning algorithms have achieved great advances in past research, producing excellent results in a variety of domains such as image-to-language conversion, speech recognition, and computer vision. According to research, combining deep learning algorithms yields the lowest prediction errors. For example, Patel et al. [19] proposed a hybrid cryptocurrency prediction system based on LSTM and GRU. Their results demonstrate high price-prediction accuracy, and the combination of LSTM and GRU can be used to predict the prices of multiple cryptocurrencies (Monero, Litecoin, and Bitcoin). In the same context, a range of hybrid deep learning techniques are employed for estimating cryptocurrency prices, combining the strengths of different deep learning models to produce better predictions.
A variety of hybrid approaches have been utilized in order to achieve better performance. For example, the combination of a convolutional neural network (CNN) and a stacked gated recurrent unit (GRU) suggested by Kang et al. [20] was evaluated on three different cryptocurrency datasets: Bitcoin, Ethereum, and Ripple.
In addition, Petrovic et al. [21] proposed a novel price prediction method that combines hybrid machine learning with a swarm intelligence approach, drawing on the power of both techniques. In a similar vein, Li et al. [22] proposed a novel data decomposition-based hybrid bidirectional deep learning model for forecasting the daily price change in the Bitcoin market. Their results show that the model outperforms benchmark econometric, machine learning, and deep learning models. Likewise, Li et al. [23] studied a Bitcoin price forecasting method based on a CNN-LSTM hybrid neural network model. The findings demonstrate that the proposed model performs well in forecasting Bitcoin.
Along the same lines, Zahouani and Boubaker [24] investigated the efficacy of several mixed forecasting models for predicting daily oil prices, including ANN-LSTM, CNN-LSTM, BRNN-LSTM, and LSTM-Attention. The investigation shows that the hybrid LSTM-Attention model beats the other hybrid models in terms of accuracy, with the lowest error rate.
Our study seeks to increase forecast accuracy and ensure reliable results by introducing additional optimization and a refinement algorithm into hybrid models.
3. Proposed Algorithms
3.1. Long Short-Term Memory (LSTM)
Long short-term memory (LSTM) networks are a type of deep learning technique and a refined version of the recurrent neural network (RNN). LSTM has been employed in prediction tasks such as forecasting the prices of cryptocurrencies, including Ethereum, Litecoin, and particularly Bitcoin (Livieris et al. [25]). Its utility extends to time series and other sequential prediction problems, such as machine translation and speech recognition. The fundamental component of LSTM is the memory cell, which is regulated by three gates: the input gate, the output gate, and the forget gate.
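The calculation formulas are:
\[
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
\]
where $x_t$ is the input at time $t$, $h_t$ is the hidden state at time $t$, $c_t$ is the cell state at time $t$, $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, $W$, $U$, and $b$ are the weight matrices and bias vectors of each gate, $\odot$ denotes element-wise multiplication, $\sigma$ is the sigmoid function, and tanh is the hyperbolic tangent function.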
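The gate structure described in this subsection can be illustrated with a minimal NumPy sketch of a single LSTM time step. It follows the standard textbook formulation; the stacked weight layout, the dimensions, and the names `lstm_step`, `W`, `U`, and `b` are assumptions made for illustration and are not taken from the models trained in this paper.

```python
import numpy as np

def sigmoid(z):
    """Logistic function; squashes values into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (standard formulation, illustrative only).

    W has shape (4 * hidden, n_in), U has shape (4 * hidden, hidden) and
    b has shape (4 * hidden,). The rows are stacked in the order
    forget, input, candidate, output; this layout is an assumption.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f_t = sigmoid(z[0:hidden])                # forget gate
    i_t = sigmoid(z[hidden:2 * hidden])       # input gate
    g_t = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell state
    o_t = sigmoid(z[3 * hidden:4 * hidden])   # output gate
    c_t = f_t * c_prev + i_t * g_t            # updated cell state
    h_t = o_t * np.tanh(c_t)                  # updated hidden state
    return h_t, c_t

# Hypothetical sizes and random weights, used only to exercise the function.
rng = np.random.default_rng(0)
n_in, hidden = 3, 4
W = rng.normal(scale=0.1, size=(4 * hidden, n_in))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(5, n_in)):        # five synthetic time steps
    h, c = lstm_step(x_t, h, c, W, U, b)
```

Because the hidden state is the product of a sigmoid output and a tanh output, every component of `h` stays strictly between -1 and 1, whatever the input scale.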
3.2. Artificial Neural Network (ANN)
Artificial neural networks (ANNs) are AI tools that enable machines to simulate human cognitive abilities. ANNs serve as a powerful computing tool in fields such as telecommunications, materials research, health care, neurology, and finance (Hong et al. [26]). They are used as algorithms for classification and regression problems. In an ANN, the output layer receives information from the input layer through the hidden layers.
A neural network may include three layers. The first is the input layer, where the activity of the input units represents the raw data delivered to the network. The second is the hidden layer, where the activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and hidden units; the number of hidden layers can vary. Finally, the behavior of the output layer is determined by the activity of the hidden units and the weights between the hidden and output units.
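The three-layer structure described above can be sketched as a single forward pass. The following is a minimal illustration, assuming a tanh hidden activation, a linear output unit (as is common for regression), and arbitrary layer sizes; none of these choices are specified in the paper.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass through an input -> hidden -> output network."""
    # Hidden activity depends on the input activity and input-to-hidden weights.
    hidden = np.tanh(W1 @ x + b1)
    # Output depends on the hidden activity and hidden-to-output weights.
    output = W2 @ hidden + b2
    return hidden, output

# Hypothetical sizes and random weights for illustration only.
rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 4, 8, 1
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

x = np.array([0.2, -0.1, 0.4, 0.3])   # one synthetic input vector
hidden_act, y_hat = forward(x, W1, b1, W2, b2)
```

Training such a network amounts to adjusting `W1`, `b1`, `W2`, and `b2` so that `y_hat` approaches the target values.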
3.3. Attention Mechanisms
Attention mechanisms are a crucial component of deep learning models and have been proven to be effective in various sectors.
In medical image analysis, Li et al. [27] examined deep learning models that exploit inter-spatial information to improve the accuracy of image classification and segmentation. In the area of cryptocurrency price forecasting, Yazhini et al. [28] combined attention mechanisms with long short-term memory, bidirectional-LSTM, and gated recurrent unit models to anticipate the future closing prices of Bitcoin and Ethereum.
Several recent publications have demonstrated how attention mechanisms can boost the predictive ability of deep learning models of virtual currency prices. In addition, attention mechanisms allow models to focus on specific areas of input or output data, resulting in improved performance for tasks such as machine translation, sentiment analysis, and time series prediction. In fact, they can help deep learning models focus on relevant data in order to improve their accuracy and efficiency.
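To make the idea of focusing on relevant parts of the input concrete, the sketch below applies a simple attention step to a sequence of hidden states, such as those produced by an LSTM. Scaled dot-product scoring and the use of the last hidden state as the query are assumptions chosen for illustration; the paper does not state which scoring function its LSTM-Attention model uses.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

def attention_pool(H, query):
    """Weight the time steps of H by their relevance to the query.

    H has shape (T, d): one d-dimensional hidden state per time step.
    query has shape (d,).
    """
    scores = H @ query / np.sqrt(H.shape[1])   # one relevance score per step
    weights = softmax(scores)                  # non-negative, sums to 1
    context = weights @ H                      # weighted average of the states
    return context, weights

# Synthetic hidden states standing in for LSTM outputs.
rng = np.random.default_rng(2)
H = rng.normal(size=(6, 4))    # six time steps, four hidden units
query = H[-1]                  # assumption: the last state acts as the query
context, weights = attention_pool(H, query)
```

The `weights` vector indicates how strongly each past time step contributes to the `context` vector that would then feed the prediction layer.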
3.4. Gradient Descent
Gradient descent is widely used as an optimization method to train machine learning models and neural networks. It reduces the discrepancy between predicted and actual outcomes and can be combined with deep learning algorithms such as LSTM to enhance prediction precision (Elsayed et al. [29]). For a convex cost function, gradient descent can be thought of as finding the lowest point on the curve by repeatedly moving in the direction of steepest descent. The technique updates the model parameters according to the estimated gradient, allowing the model to learn and improve over time. The idea is similar to estimating the line of best fit in linear regression.
Additionally, selecting an appropriate type of gradient descent plays a significant part in the training process and depends on factors such as dataset size, noise, and stability, as well as the hyperparameters. There are three gradient descent learning algorithms: batch gradient descent (BGD), stochastic gradient descent (SGD), and mini-batch gradient descent. BGD is the traditional approach: it computes the gradient over the whole training set, which produces a stable error gradient and stable convergence, and it is suitable for smaller datasets that fit into memory. In contrast, SGD updates the parameters after each individual training example within an epoch, and hence it is suitable for larger datasets. Lastly, mini-batch gradient descent combines ideas from both BGD and SGD, balancing the speed of SGD with the computational efficiency of BGD.
The mathematical expression of batch gradient descent is:
\[
\theta = \theta - \eta \, \nabla_{\theta} J(\theta)
\]
where:
$\theta$ denotes the parameters under optimization.
$\eta$ is the learning rate that determines the step size in the parameter space.
$J(\theta)$ is a cost function that evaluates the model’s performance.
$\nabla_{\theta} J(\theta)$ represents the gradient of the cost function with respect to the parameters.
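❖ Stochastic Gradient Descent (SGD)
The mathematical expression of stochastic gradient descent (SGD) is:
\[
\theta = \theta - \eta \, \nabla_{\theta} J\left(\theta; x^{(i)}, y^{(i)}\right)
\]
where:
$\theta$ denotes the parameters under optimization.
$\eta$ is the learning rate that determines the step size in the parameter space.
$J\left(\theta; x^{(i)}, y^{(i)}\right)$ is the cost function that measures the model’s performance for a specific training example $\left(x^{(i)}, y^{(i)}\right)$.
$\nabla_{\theta} J\left(\theta; x^{(i)}, y^{(i)}\right)$ denotes the gradient of the cost function with respect to the parameters for a specific training example.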
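❖ Mini-Batch Gradient Descent
The mathematical expression of mini-batch gradient descent is:
\[
\theta = \theta - \eta \, \nabla_{\theta} J\left(\theta; x^{(i:i+n)}, y^{(i:i+n)}\right)
\]
where:
$\theta$ denotes the parameters under optimization.
$\eta$ is the learning rate that determines the step size in the parameter space.
$J\left(\theta; x^{(i:i+n)}, y^{(i:i+n)}\right)$ is the cost function that measures the model’s performance for a mini-batch of $n$ training examples $\left(x^{(i:i+n)}, y^{(i:i+n)}\right)$.
$\nabla_{\theta} J\left(\theta; x^{(i:i+n)}, y^{(i:i+n)}\right)$ denotes the gradient of the cost function with respect to the parameters for a mini-batch of training examples.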
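The three variants differ only in how many training examples enter each parameter update. The sketch below illustrates this on a synthetic linear regression problem with a mean squared error cost; the data, learning rates, epoch counts, and the helper names `mse_gradient` and `gradient_descent` are assumptions chosen for illustration and do not reproduce the training setup of this paper.

```python
import numpy as np

def mse_gradient(theta, X, y):
    """Gradient of J(theta) = mean((X @ theta - y) ** 2) w.r.t. theta."""
    residual = X @ theta - y
    return 2.0 * X.T @ residual / len(y)

def gradient_descent(X, y, eta=0.05, epochs=200, batch_size=None, seed=0):
    """Gradient descent with a configurable batch size.

    batch_size=None -> batch gradient descent (all examples per update)
    batch_size=1    -> stochastic gradient descent (one example per update)
    otherwise       -> mini-batch gradient descent
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    batch_size = n if batch_size is None else batch_size
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(n)             # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            theta = theta - eta * mse_gradient(theta, X[idx], y[idx])
    return theta

# Synthetic data: y = 1.5 - 2.0 * x + small noise.
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_theta = np.array([1.5, -2.0])
y = X @ true_theta + rng.normal(scale=0.1, size=200)

theta_bgd = gradient_descent(X, y)                                      # BGD
theta_sgd = gradient_descent(X, y, eta=0.01, epochs=20, batch_size=1)   # SGD
theta_mb = gradient_descent(X, y, epochs=50, batch_size=32)             # mini-batch
```

All three runs should approach `true_theta`; SGD typically needs a smaller learning rate because each single-example gradient is a noisy estimate of the full gradient.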
6. Discussion
Our study found that the hybrid LSTM-Attention model outperforms the hybrid ANN-LSTM model at predicting cryptocurrencies such as Bitcoin, as shown in Table 7. With the combination of LSTM and attention, we attained an accuracy of 99.84 percent. Furthermore, our findings suggest that, in this scenario, gradient-specific optimization can enhance forecasting accuracy, as illustrated in Figure 6 and Figure 11.
Deep learning approaches, such as LSTM-Attention, have shown potential in predicting Bitcoin prices due to their capacity to detect complicated patterns and correlations in data. Gradient-specific optimization is a strategy that uses gradient information gathered during the training process to optimize model parameters. Using this strategy allows the model to learn more efficiently and precisely, resulting in higher forecasting accuracy.
In summary, the LSTM-Attention hybrid model outperforms the ANN-LSTM hybrid model in terms of forecast accuracy, as evidenced by lower error metrics and a higher accuracy rate. Furthermore, the DM statistic of 4.253 indicates that the difference in forecast accuracy between the LSTM-Attention and ANN-LSTM models is substantial.
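For readers unfamiliar with this kind of comparison, the sketch below shows how such a statistic is commonly computed, assuming that DM refers to the Diebold-Mariano test, that forecasts are one step ahead, and that the loss is the squared error. The data are synthetic, and the sign convention (a negative value favors the first forecast) is an assumption that may differ from the one used to obtain the value reported above.

```python
import numpy as np
from math import erfc, sqrt

def diebold_mariano(actual, forecast_a, forecast_b):
    """Diebold-Mariano statistic for one-step-ahead forecasts.

    Uses squared-error loss and no autocovariance correction, which is
    only appropriate for a forecast horizon of one step.
    """
    actual = np.asarray(actual, dtype=float)
    loss_a = (actual - np.asarray(forecast_a, dtype=float)) ** 2
    loss_b = (actual - np.asarray(forecast_b, dtype=float)) ** 2
    d = loss_a - loss_b                         # loss differential series
    n = d.size
    dm_stat = d.mean() / sqrt(d.var(ddof=1) / n)
    # Two-sided p-value under the asymptotic standard normal distribution.
    p_value = erfc(abs(dm_stat) / sqrt(2.0))
    return dm_stat, p_value

# Synthetic series: forecast_a is deliberately more accurate than forecast_b.
rng = np.random.default_rng(4)
actual = np.cumsum(rng.normal(size=300)) + 100.0
forecast_a = actual + rng.normal(scale=0.5, size=300)
forecast_b = actual + rng.normal(scale=2.0, size=300)

dm_stat, p_value = diebold_mariano(actual, forecast_a, forecast_b)
```

A statistic that is large in absolute value, together with a small p-value, suggests that the two forecasts differ in accuracy by more than sampling noise would explain.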
7. Conclusions
Predicting Bitcoin values is a difficult task owing to numerous market variables. Recent advances in deep learning and artificial intelligence have yielded more accurate and reliable predictive models than previously dominant methods such as time series analysis or econometric modeling. Hybridization, which combines several of these modeling approaches, can further increase the precision of Bitcoin price predictions.
Firstly, we employed LSTM with an attention mechanism, followed by gradient-specific optimization, to enhance our predictions of Bitcoin prices over the last five years. Secondly, we merged ANN and LSTM and incorporated gradient-specific optimization. The findings show that LSTM-Attention with gradient-specific optimization performs well in Bitcoin forecasting and is the more appropriate of the two models, producing forecasts that are closer to observed prices than those of the second model.
Therefore, our findings have major implications for investors, traders, and policymakers, who rely on precise forecasting to make educated decisions. Although our hybrid LSTM-Attention model with gradient-specific optimization was quite successful, it is important to realize that no model is perfect. In some situations, the model may underperform or fail to anticipate cryptocurrency prices appropriately. Our models have a few limitations: they do not take into consideration sentiment analysis in the Bitcoin market, and they cannot measure the intensity of sentiment from text-based sources such as social media platforms. There is therefore room for improvement in terms of forecast accuracy, prediction error reduction, and model robustness to changing market conditions. In the future, we want to increase the accuracy of forecasts by adding more models, adjusting hyperparameters, and improving the hybrid models that are already in place, so that the resulting forecasts are trustworthy and account for changes in the market.