Data
  • Article
  • Open Access

31 October 2022

Cryptocurrency Price Prediction with Convolutional Neural Network and Stacked Gated Recurrent Unit

Faculty of Information Science and Technology, Multimedia University, Melaka 75450, Malaysia
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue Data Analysis for Financial Markets

Abstract

Virtual currencies have been declared financial assets and are widely recognized as exchange currencies. Cryptocurrency trading has caught the attention of investors because cryptocurrencies can be highly profitable investments. To optimize the profit of cryptocurrency investments, accurate price prediction is essential. Because price prediction is a time series task, a hybrid deep learning model is proposed to predict the future price of a cryptocurrency. The hybrid model integrates a 1-dimensional convolutional neural network and a stacked gated recurrent unit (1DCNN-GRU). Given the cryptocurrency price data over time, the 1-dimensional convolutional neural network encodes the data into a high-level discriminative representation. Subsequently, the stacked gated recurrent unit captures the long-range dependencies of the representation. The proposed hybrid model was evaluated on three different cryptocurrency datasets, namely Bitcoin, Ethereum, and Ripple. Experimental results demonstrated that the proposed 1DCNN-GRU model outperformed the existing methods with the lowest RMSE values of 43.933 on the Bitcoin dataset, 3.511 on the Ethereum dataset, and 0.00128 on the Ripple dataset.

1. Introduction

Cryptocurrencies serve as peer-to-peer digital currencies in which every transaction is carried out securely. The transactions are stored in blocks that are linked into a chain, known as a blockchain. These security features have made cryptocurrencies popular and well known among investors. Cryptocurrencies have been growing dramatically, gaining popularity and capitalization. Bitcoin is the first decentralized cryptocurrency, developed by Satoshi Nakamoto [], and it has become the world’s most valuable cryptocurrency. With the vast transaction volume of cryptocurrencies, many other currencies have been introduced into the cryptocurrency world. Some well-known cryptocurrencies are Ethereum and Ripple, among others.
This study focuses on cryptocurrency price prediction. The cryptocurrency price prediction is a time series problem that can be solved by using deep learning regression techniques. Although price prediction of cryptocurrency is challenging, developing cryptocurrency price prediction algorithms is worthwhile because it plays a vital role for cryptocurrency traders. Inspired by the success of deep learning regression models in a wide spectrum of applications, this paper proposes a hybrid regression model that amalgamates a 1-dimensional convolutional neural network (1DCNN) and a stacked gated recurrent unit (GRU), into the 1DCNN-GRU model for cryptocurrency price prediction. Three cryptocurrency historical price datasets are first collected from the cryptocurrency exchange website. Subsequently, the datasets are subjected to some data pre-processing, including normalization and missing value removal before passing into the 1DCNN-GRU model for representation learning and price prediction. The 1DCNN layer plays the role of extracting the salient features in the historical price data. The extracted features are then passed into the stacked GRU for temporal encoding where the long-range dependencies are captured. The temporal encoding is then leveraged for the cryptocurrency price prediction. The predicted price is compared against the real price and the root mean square error is computed. The main contributions of this paper are as follows.
  • The cryptocurrency historical price data are acquired from the cryptocurrency exchange website. As the daily or hourly interval data are susceptible to information loss, this study leverages the one-minute interval data for more accurate price prediction.
  • The feature scaling is performed on the cryptocurrency historical price data by normalization. In addition, the data is further pre-processed to remove the missing values that might affect model learning. The clean data are then partitioned into the training set and testing set for model learning and price prediction.
  • A hybrid 1DCNN-GRU model is proposed for representation learning and cryptocurrency price prediction. The 1DCNN model encodes the prominent patterns in the historical price data, hence producing discriminative features to represent the historical price data. Thereafter, the stacked GRU model captures the long-range dependencies in the features, thus alleviating the gradient vanishing problems.

3. Cryptocurrency Price Prediction with 1-Dimensional Convolutional Neural Network and Stacked Gated Recurrent Unit (1DCNN-GRU)

This section details the proposed 1DCNN-GRU model for cryptocurrency price prediction. The historical price data of three cryptocurrencies are first acquired, namely Bitcoin, Ethereum, and Ripple. Subsequently, the collected data are pre-processed to clean missing values. Thereafter, the cleaned data are fed into the hybrid 1DCNN-GRU model for model learning and price prediction. Figure 1 illustrates the process flow of the cryptocurrency price prediction.
Figure 1. The process flow of the cryptocurrency price prediction.

3.1. Data Acquisition

Three datasets were used for the cryptocurrency price prediction, namely Bitcoin, Ethereum, and Ripple.
The Bitcoin historical data [] were acquired from the Kaggle website. The one-minute interval data range from 1 January 2012 to 31 March 2021 and contain approximately 4.8 million samples, including NaN values. The columns include the open, high, low, and close (OHLC) prices, the volume, and the weighted price. All timestamps are in UNIX time. A NaN value indicates that no trade or activity happened at that time. Figure 2 visualizes the Bitcoin closing price for the years 2012 to 2021.
Figure 2. The historical price of Bitcoin (2012–2021).
The Ethereum historical data were collected from the Bitstamp exchange website. The data comprise around 396,403 samples at one-minute intervals. The Ethereum closing price of the year 2021 is shown in Figure 3.
Figure 3. The historical price of Ethereum (2021).
Ripple is another widely known cryptocurrency, which has slightly lower values compared to other cryptocurrencies. The Ripple historical data were also gathered from the Bitstamp exchange website. The historical data consist of around 396,403 samples, as displayed in Figure 4.
Figure 4. The historical price of Ripple (2021).

3.2. Data Pre-Processing

Some pre-processing steps are performed to clean the cryptocurrency historical data, including feature selection, timestamp conversion, missing values removal, train-test split, and min-max scaling normalization.
As each dataset consists of many features, this work utilizes only three features for price prediction, namely timestamp, date, and closing price. Subsequently, timestamp conversion is carried out, where the UNIX timestamp is converted into the YY:MM:DD date format. The zeros and NaNs are filtered out by dropping the associated rows. To avoid huge data losses and to provide more timely and detailed prediction, the samples are taken at one-minute intervals. Due to the inconsistency of historical data and high sampling rates, the historical data of one week are used. With these settings, the number of samples is 10,797 for Bitcoin and 10,834 for both Ethereum and Ripple. The samples are further partitioned into six days for the training set and one day for the testing set. Apart from that, the features are subjected to min-max scaling normalization that transforms each feature into the range [0, 1]. The min-max scaling suppresses the effects of outliers while preserving the relationships among the data values. The min-max scaling is computed as
$$x_{\mathrm{norm}} = \frac{x - \min(x)}{\max(x) - \min(x)}.$$
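The pre-processing steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the helper names (`clean_rows`, `split_by_day`, `min_max_scale`), the ISO date string, the toy data, and the choice to reuse the training-set minimum and maximum when scaling the test set are all assumptions.

```python
import math
from datetime import datetime, timezone


def clean_rows(rows):
    """Drop rows whose closing price is NaN or zero and attach a UTC date.

    rows: iterable of (unix_timestamp, closing_price) pairs.
    """
    cleaned = []
    for ts, close in rows:
        if math.isnan(close) or close == 0:
            continue  # no usable price recorded for this minute
        date = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        cleaned.append((ts, date, close))
    return cleaned


def split_by_day(rows, train_days=6):
    """Chronological split: the first `train_days` dates train, the rest test."""
    dates = sorted({date for _, date, _ in rows})
    train_dates = set(dates[:train_days])
    train = [row for row in rows if row[1] in train_dates]
    test = [row for row in rows if row[1] not in train_dates]
    return train, test


def min_max_scale(values, lo, hi):
    """Apply x_norm = (x - min) / (max - min) with the given min and max."""
    if hi == lo:
        raise ValueError("min and max must differ")
    return [(v - lo) / (hi - lo) for v in values]


# Toy data (hypothetical): 7 days x 3 one-minute samples from 2021-01-01 UTC.
rows = [(1609459200 + d * 86400 + m * 60, 100.0 + 10 * d + m)
        for d in range(7) for m in range(3)]
rows[1] = (rows[1][0], float("nan"))  # simulate a minute without trades
rows[5] = (rows[5][0], 0.0)           # simulate a zero entry

train, test = split_by_day(clean_rows(rows))
train_close = [close for _, _, close in train]
test_close = [close for _, _, close in test]
lo, hi = min(train_close), max(train_close)
train_scaled = min_max_scale(train_close, lo, hi)
test_scaled = min_max_scale(test_close, lo, hi)  # may fall outside [0, 1]
```

Taking the minimum and maximum from the training set only keeps test-day information out of the scaling, at the cost that test values can land slightly outside [0, 1]. The text does not say which variant was used.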

3.3. 1-Dimensional Convolutional Neural Network and Gated Recurrent Unit

In this work, a hybrid model that integrates 1DCNN and GRU is proposed for cryptocurrency price prediction. The architecture of the proposed 1DCNN-GRU model is depicted in Figure 5. The proposed 1DCNN-GRU comprises a 1D convolutional layer and two GRU layers with 256 units each.
Figure 5. The architecture of the proposed 1DCNN-GRU model.
The cryptocurrency historical price is a kind of time series data that captures the closing price over time. Using the raw price data as the input might introduce noise and outliers, causing the regression model to learn from insignificant data. Therefore, a 1DCNN is leveraged to extract the prominent patterns from the historical price data. In the 1-dimensional convolutional layer (Conv1D), the kernel slides along the temporal axis and encodes the price data into representative features. The Conv1D layer in the proposed model sets both kernel size and stride to 1; hence, the convolution window reads one time step at a time. The Conv1D layer consists of 256 output filters, thus producing a 256-dimensional output at each time step. The output of the Conv1D layer is passed into the subsequent GRU layer.
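With kernel size and stride both set to 1, the convolution reduces to the same linear map applied independently at every time step. The sketch below illustrates this for a single input channel. The function name `conv1d_kernel1`, the random weights, and the toy prices are assumptions, and the activation function is left out for brevity.

```python
import random


def conv1d_kernel1(series, weights, biases):
    """Conv1D with kernel_size=1 and stride=1 on a single-channel series.

    Each of the F filters holds one weight and one bias, so every time step
    is mapped to F values independently of its neighbours.
    Returns a list of T rows, each with F features.
    """
    return [[w * x + b for w, b in zip(weights, biases)] for x in series]


random.seed(0)
num_filters = 256  # number of output filters stated in the text
weights = [random.uniform(-1, 1) for _ in range(num_filters)]
biases = [0.0] * num_filters
prices = [0.10, 0.12, 0.11, 0.15]  # hypothetical normalized closing prices

features = conv1d_kernel1(prices, weights, biases)
print(len(features), len(features[0]))  # 4 256
```

Each time step of the series is expanded into a 256-dimensional feature vector, and the sequence length is unchanged, so the recurrent layers that follow still see one vector per minute.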
Two GRU layers are leveraged to encode the long-term dependencies of the extracted features. The ability of capturing long-term dependencies in GRU is attributable to the gating mechanisms. There are two gates in the GRU, namely update gate and reset gate. The update gate $z_t$ at time step $t$ determines the information from the previous time steps to be passed to the future, defined as
$$z_t = \sigma\left(W^{(z)} x_t + U^{(z)} h_{t-1}\right),$$
where the weights $W^{(z)}$ and $U^{(z)}$ are multiplied with the input $x_t$ and hidden states $h_{t-1}$, respectively. The results of the multiplication are summed and passed into a sigmoid activation function to squash the values between 0 and 1.
The reset gate $r_t$ determines the past information to forget, where the computation is defined as
$$r_t = \sigma\left(W^{(r)} x_t + U^{(r)} h_{t-1}\right),$$
where the input $x_t$ and hidden states $h_{t-1}$ are multiplied with their corresponding weights $W^{(r)}$ and $U^{(r)}$. The sum of the results is likewise fed into a sigmoid activation function to limit the output to the range between 0 and 1.
A new memory content $h'_t$ is then leveraged to store past information, defined as
$$h'_t = \tanh\left(W x_t + r_t \odot U h_{t-1}\right),$$
where ⊙ denotes the element-wise product. The new memory content is determined by first multiplying the input $x_t$ and hidden states $h_{t-1}$ with the corresponding weights $W$ and $U$. Thereafter, the element-wise product of the reset gate $r_t$ and $U h_{t-1}$ is calculated. The product operation diminishes the information from the previous time step when the values of $r_t$ are close to 0. Then, the sum of $W x_t$ and $r_t \odot U h_{t-1}$ is regulated by a tanh function to keep the output between −1 and 1.
Following that, the final memory at the current time step $h_t$, which determines the information to be passed to the next time step, is calculated as
$$h_t = z_t \odot h_{t-1} + \left(1 - z_t\right) \odot h'_t.$$
Values of $z_t$ close to 1 retain most of the previous information, whereas values of $z_t$ close to 0 keep most of the current information.
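The gating computations described above can be traced step by step in plain Python. The following is a minimal sketch of a single GRU cell without bias terms, matching the form of the equations in this section. The function and parameter names (`gru_step`, `W_z`, `U_z`, and so on), the tiny layer sizes, and the random weights are assumptions made for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, p):
    """One GRU time step: update gate, reset gate, new memory, final memory."""
    # Update gate: how much of the previous hidden state to keep.
    z = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev))]
    # Reset gate: how much past information feeds the new memory content.
    r = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev))]
    # New memory content: tanh(W x_t + r_t * (U h_prev)), element-wise product.
    u_h = matvec(p["U"], h_prev)
    h_cand = [math.tanh(a + r_i * u_i)
              for a, r_i, u_i in zip(matvec(p["W"], x_t), r, u_h)]
    # Final memory: z_t * h_prev + (1 - z_t) * new memory content.
    return [z_i * h_i + (1.0 - z_i) * c_i
            for z_i, h_i, c_i in zip(z, h_prev, h_cand)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
input_size, hidden_size = 2, 3  # tiny sizes for illustration only
params = {
    name: random_matrix(
        hidden_size, input_size if name.startswith("W") else hidden_size, rng)
    for name in ("W_z", "U_z", "W_r", "U_r", "W", "U")
}

h = [0.0] * hidden_size
for x_t in ([0.1, 0.2], [0.3, 0.1], [0.2, 0.4]):  # hypothetical feature vectors
    h = gru_step(x_t, h, params)
```

Because the final memory is a convex combination of the previous state and a tanh output, every entry of `h` stays strictly between −1 and 1 when the initial state does.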
Lastly, the output from the GRU layers is passed into a dense layer with one hidden unit for price prediction. The layer-wise architecture of the proposed 1DCNN-GRU is presented in Table 2.
Table 2. The layer-wise architecture of the proposed 1DCNN-GRU model.

4. Hyperparameter Tuning

A hyperparameter tuning by grid search is performed to determine the optimal settings of the 1DCNN-GRU model. The hyperparameters that are involved in the hyperparameter tuning are optimizer, activation function, and batch size. The optimizers play the role of optimizing the model learning process to ensure the model converges optimally. In this work, four optimizers are considered, namely Adam, SGD, Adamax, and RMSProp. The activation function is the function in the Conv1D layer and GRU layers that transforms the input, enabling the model to learn and perform more complex tasks. Five activation functions are explored, which are sigmoid, softmax, ReLU, tanh, and linear. The batch size defines the number of samples that is used for error gradient computation in each model weights update. The RMSE is adopted as the evaluation metric of the cryptocurrency price prediction models. The RMSE is the square root of the average squared distance between actual and predicted values, defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},$$
where $n$ is the total number of predictions, $y$ is the real price, and $\hat{y}$ denotes the predicted price. The optimal settings are set to the hyperparameter values with the lowest RMSE.
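The selection procedure can be sketched as an exhaustive loop over the hyperparameter combinations that keeps the one with the lowest RMSE. In the sketch below, `rmse` follows the definition above, while `grid_search`, the `toy_evaluate` stand-in, and the candidate batch sizes are hypothetical. In practice `evaluate` would train the model with the given settings and return its test RMSE, and the scores produced here carry no meaning.

```python
import itertools
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def grid_search(evaluate, optimizers, activations, batch_sizes):
    """Try every combination and return (lowest RMSE, best settings)."""
    best = None
    for opt, act, bs in itertools.product(optimizers, activations, batch_sizes):
        score = evaluate(opt, act, bs)
        if best is None or score < best[0]:
            best = (score, {"optimizer": opt, "activation": act, "batch_size": bs})
    return best


def toy_evaluate(optimizer, activation, batch_size):
    """Hypothetical stand-in for 'train, predict, and return the test RMSE'."""
    actual = [10.0, 11.0, 12.0]
    offset = 0.1 * (len(optimizer) + len(activation)) + 0.01 * batch_size
    return rmse(actual, [y + offset for y in actual])


best_score, best_config = grid_search(
    toy_evaluate,
    optimizers=["Adam", "SGD", "Adamax", "RMSProp"],
    activations=["sigmoid", "softmax", "ReLU", "tanh", "linear"],
    batch_sizes=[16, 32],  # assumed candidates; the full list is not given
)
```

The grid here contains 4 × 5 × 2 = 40 combinations, each of which requires a full training run, so the cost grows multiplicatively with every added hyperparameter.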
Table 3 shows the experimental results of different hyperparameter values on the Bitcoin dataset. The lowest RMSE of 43.933 is obtained on the Bitcoin dataset when SGD optimizer, sigmoid activation function, and batch size of 16 are used. The experimental results on the Ethereum dataset are presented in Table 4. It is observed that the lowest RMSE of 3.511 is achieved with the Adamax optimizer, softmax activation function, and batch size of 32. As for the Ripple dataset, the lowest RMSE of 0.00128 is recorded when the Adam optimizer, softmax activation function, and batch size of 32 are set, as shown in Table 5. The experimental settings are given in Table 6.
Table 3. RMSE of 1DCNN-GRU with different hyperparameters on the Bitcoin dataset.
Table 4. RMSE of 1DCNN-GRU with different hyperparameters on the Ethereum dataset.
Table 5. RMSE of 1DCNN-GRU with different hyperparameters on the Ripple dataset.
Table 6. The experimental settings of the proposed 1DCNN-GRU model.

5. Experimental Results and Analysis

In this section, the performance of the proposed 1DCNN-GRU model is compared with the existing prediction models. All models are trained on the same one-minute interval historical data.
Table 7 presents the comparison results of the methods on Bitcoin, Ethereum, and Ripple datasets. In general, the RMSE of all methods on the Bitcoin dataset is the highest, followed by the Ethereum dataset, and the Ripple dataset yields the lowest RMSE. This is due to the difference in the price where higher prices tend to result in higher RMSE.
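The dependence of RMSE on the price level follows directly from its definition. As a short worked derivation (the scale factor $c > 0$ is introduced here purely for illustration), multiplying both the real and the predicted prices by $c$ multiplies the RMSE by the same factor:

```latex
\mathrm{RMSE}(c\,y,\; c\,\hat{y})
  = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(c\,y_i - c\,\hat{y}_i\right)^2}
  = \sqrt{\frac{c^2}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
  = c\,\mathrm{RMSE}(y,\; \hat{y}).
```

Two models with the same relative accuracy therefore report very different RMSE values on assets priced at different levels, so the RMSE figures are best compared across methods within one dataset rather than across datasets.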
Table 7. Experimental results on Bitcoin, Ethereum, and Ripple datasets.
The experimental results show that the proposed 1DCNN-GRU outshines the methods in comparison. The proposed 1DCNN-GRU model records an RMSE of 43.933 on the Bitcoin dataset, 3.511 on the Ethereum dataset, and 0.00128 on the Ripple dataset. Compared to the GRU model [] alone, adding 1DCNN has reduced the RMSE on all datasets. This is attributable to 1DCNN that is able to learn local relationships and encode the cryptocurrency historical data into discriminative features. In doing so, the noise, outliers and insignificant data in the input are suppressed.
Apart from that, the proposed 1DCNN-GRU also showed much improvement in relation to the CNN-LSTM model []. The RMSE has reduced from 47.537 to 43.933 on the Bitcoin dataset, from 3.516 to 3.511 on the Ethereum dataset, and from 0.00135 to 0.00128 on the Ripple dataset. Both LSTM and GRU have their own strengths and perform well in different applications in which they utilize gating mechanisms to retain the historical information. In this application, the improvement corroborates the effectiveness of stacked GRU in capturing the long-range dependencies of the features, thus alleviating the vanishing gradient problems. The real and predicted prices of the Bitcoin, Ethereum, and Ripple are illustrated in Figure 6, Figure 7 and Figure 8, respectively.
Figure 6. The price prediction by 1DCNN-GRU model on the Bitcoin dataset.
Figure 7. The price prediction by 1DCNN-GRU model on the Ethereum dataset.
Figure 8. The price prediction by 1DCNN-GRU model on the Ripple dataset.

6. Conclusions

This paper presents a hybrid deep learning model that harnesses the strengths of 1DCNN and stacked GRU for cryptocurrency price prediction. The historical prices of three cryptocurrencies, namely Bitcoin, Ethereum, and Ripple, are acquired. The collected data are normalized and pre-processed to remove the missing values. Subsequently, the pre-processed data are passed into the hybrid 1DCNN-GRU model. The 1DCNN model transforms the price data into a discriminative representation that captures the significant patterns in the price data. Subsequently, the stacked GRU model encodes the long-range dependencies in the representation to mitigate past information loss problems. The gating mechanism of GRU determines the past and current information to be updated and reset, thus alleviating diminishing gradient problems. The experimental results demonstrate that the proposed 1DCNN-GRU outperforms the methods in comparison with the lowest RMSE values of 43.933 on the Bitcoin dataset, 3.511 on the Ethereum dataset, and 0.00128 on the Ripple dataset.
As a proof of concept and due to the limitations in computing resources, this study only utilizes the historical data for one week. Training the model on cryptocurrency data over a longer time span should further improve the generalization capability of the model. In addition to the closing price, other factors such as seasonality trends, government policies and laws, and social media can also be considered as inputs to the price prediction model.

Author Contributions

Conceptualization, C.Y.K. and C.P.L.; methodology, C.Y.K. and C.P.L.; software, C.Y.K. and C.P.L.; validation, C.Y.K. and C.P.L.; formal analysis, C.Y.K.; investigation, C.Y.K.; resources, C.Y.K.; data curation, C.Y.K. and C.P.L.; writing—original draft preparation, C.Y.K.; writing—review and editing, C.P.L. and K.M.L.; visualization, C.Y.K. and C.P.L.; supervision, C.P.L. and K.M.L.; project administration, C.P.L.; funding acquisition, C.P.L. All authors have read and agreed to the published version of the manuscript.

Funding

The research in this work was supported by Telekom Malaysia Research & Development under grant number RDTC/221045, Fundamental Research Grant Scheme of the Ministry of Higher Education under award number FRGS/1/2021/ICT02/MMU/02/4, and Multimedia University Internal Research Grant with award number MMUI/220021.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
1DCNN-GRU: 1-dimensional Convolutional Neural Network and Gated Recurrent Unit
1DCNN: 1-dimensional Convolutional Neural Network
GRU: Gated Recurrent Unit
ANN: Artificial Neural Network
MLP: Multilayer Perceptron
RMSE: Root Mean Square Error
LSTM: Long Short-Term Memory
RNN: Recurrent Neural Networks
TCN: Temporal Convolutional Networks
Conv1D: 1D convolutional layer

References

  1. Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. In Decentralized Business Review; Seoul, Korea, 2008; p. 21260. Available online: https://www.debr.io/article/21260-bitcoin-a-peer-to-peer-electronic-cash-system (accessed on 19 June 2022).
  2. Lim, J.Y.; Lim, K.M.; Lee, C.P. Stacked Bidirectional Long Short-Term Memory for Stock Market Analysis. In Proceedings of the 2021 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), Kota Kinabalu, Malaysia, 13–15 September 2021; pp. 1–5.
  3. Chong, L.S.; Lim, K.M.; Lee, C.P. Stock Market Prediction using Ensemble of Deep Neural Networks. In Proceedings of the 2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), Kota Kinabalu, Malaysia, 26–27 September 2020; pp. 1–5.
  4. Islam, M.R.; Nguyen, N. Comparison of financial models for stock price prediction. J. Risk Financ. Manag. 2020, 13, 181.
  5. Koukaras, P.; Nousi, C.; Tjortjis, C. Stock Market Prediction Using Microblogging Sentiment Analysis and Machine Learning. Telecom 2022, 3, 358–378.
  6. Park, J.; Seo, Y.S. A Deep Learning-Based Action Recommendation Model for Cryptocurrency Profit Maximization. Electronics 2022, 11, 1466.
  7. Manujakshi, B.; Kabadi, M.G.; Naik, N. A Hybrid Stock Price Prediction Model Based on PRE and Deep Neural Network. Data 2022, 7, 51.
  8. Shahbazi, Z.; Byun, Y.C. Knowledge Discovery on Cryptocurrency Exchange Rate Prediction Using Machine Learning Pipelines. Sensors 2022, 22, 1740.
  9. Patel, M.M.; Tanwar, S.; Gupta, R.; Kumar, N. A deep learning-based cryptocurrency price prediction scheme for financial institutions. J. Inf. Secur. Appl. 2020, 55, 102583.
  10. Pintelas, E.; Livieris, I.E.; Stavroyiannis, S.; Kotsilieris, T.; Pintelas, P. Investigating the problem of cryptocurrency price prediction: A deep learning approach. In Proceedings of the IFIP International Conference on Artificial Intelligence Applications and Innovations, Neos Marmaras, Greece, 5–7 June 2020; Springer: Cham, Switzerland, 2020; pp. 99–110.
  11. Gao, P.; Zhang, R.; Yang, X. The application of stock index price prediction with neural network. Math. Comput. Appl. 2020, 25, 53.
  12. Carta, S.; Medda, A.; Pili, A.; Reforgiato Recupero, D.; Saia, R. Forecasting e-commerce products prices by combining an autoregressive integrated moving average (ARIMA) model and google trends data. Future Internet 2018, 11, 5.
  13. Abraham, J.; Higdon, D.; Nelson, J.; Ibarra, J. Cryptocurrency price prediction using tweet volumes and sentiment analysis. SMU Data Sci. Rev. 2018, 1, 1.
  14. Dutta, A.; Kumar, S.; Basu, M. A gated recurrent unit approach to bitcoin price prediction. J. Risk Financ. Manag. 2020, 13, 23.
  15. Sin, E.; Wang, L. Bitcoin price prediction using ensembles of neural networks. In Proceedings of the 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guilin, China, 29–31 July 2017; pp. 666–671.
  16. Yenidoğan, I.; Çayir, A.; Kozan, O.; Dağ, T.; Arslan, Ç. Bitcoin forecasting using ARIMA and PROPHET. In Proceedings of the 2018 3rd International Conference on Computer Science and Engineering (UBMK), Sarajevo, Bosnia and Herzegovina, 20–23 September 2018; pp. 621–624.
  17. McNally, S.; Roche, J.; Caton, S. Predicting the price of bitcoin using machine learning. In Proceedings of the 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), Cambridge, UK, 21–23 March 2018; pp. 339–343.
  18. Phaladisailoed, T.; Numnonda, T. Machine learning models comparison for bitcoin price prediction. In Proceedings of the 2018 10th International Conference on Information Technology and Electrical Engineering (ICITEE), Bali, Indonesia, 24–26 July 2018; pp. 506–511.
  19. Jiang, X. Bitcoin price prediction based on deep learning methods. J. Math. Financ. 2019, 10, 132–139.
  20. Politis, A.; Doka, K.; Koziris, N. Ether price prediction using advanced deep learning models. In Proceedings of the 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), Sydney, Australia, 3–6 May 2021; pp. 1–3.
  21. Tanwar, S.; Patel, N.P.; Patel, S.N.; Patel, J.R.; Sharma, G.; Davidson, I.E. Deep learning-based cryptocurrency price prediction scheme with inter-dependent relations. IEEE Access 2021, 9, 138633–138646.
  22. Livieris, I.E.; Kiriakidou, N.; Stavroyiannis, S.; Pintelas, P. An advanced CNN-LSTM model for cryptocurrency forecasting. Electronics 2021, 10, 287.
  23. Zhang, Z.; Dai, H.N.; Zhou, J.; Mondal, S.K.; García, M.M.; Wang, H. Forecasting cryptocurrency price using convolutional neural networks with weighted and attentive memory channels. Expert Syst. Appl. 2021, 183, 115378.
  24. Jay, P.; Kalariya, V.; Parmar, P.; Tanwar, S.; Kumar, N.; Alazab, M. Stochastic neural networks for cryptocurrency price prediction. IEEE Access 2020, 8, 82804–82818.
  25. Sebastião, H.; Godinho, P. Forecasting and trading cryptocurrencies with machine learning under changing market conditions. Financ. Innov. 2021, 7, 1–30.
  26. Saadah, S.; Whafa, A.A. Monitoring Financial Stability Based on Prediction of Cryptocurrencies Price Using Intelligent Algorithm. In Proceedings of the 2020 International Conference on Data Science and Its Applications (ICoDSA), Bandung, Indonesia, 5–6 August 2020; pp. 1–10.
  27. Derbentsev, V.; Datsenko, N.; Babenko, V.; Pushko, O.; Pursky, O. Forecasting Cryptocurrency Prices Using Ensembles-Based Machine Learning Approach. In Proceedings of the 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&T), Kharkiv, Ukraine, 6–9 October 2020; pp. 707–712.
  28. Zielak. Bitcoin historical Data. Available online: https://www.kaggle.com/mczielinski/Bitcoin-historical-data (accessed on 17 May 2022).
  29. Jaquart, P.; Dann, D.; Weinhardt, C. Short-term bitcoin market prediction via machine learning. J. Financ. Data Sci. 2021, 7, 45–66.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
