Deep Learning Methods for Modeling Bitcoin Price

Abstract: A precise prediction of Bitcoin price is an important aspect of digital financial markets because it improves the valuation of an asset belonging to a decentralized control market. Numerous studies have examined the accuracy of models built from a set of factors. However, previous literature shows that models for the prediction of Bitcoin price suffer from poor performance and do not select the most significant variables, so more progress is needed on predictive models. This paper presents a comparison of deep learning methodologies for forecasting Bitcoin price and, as a result, a new prediction model with the ability to estimate accurately. A sample of 29 initial factors was used, which made it possible to apply explanatory factors covering different aspects of the formation of the price of Bitcoin. Different methods were applied to the sample under study to achieve a robust model, namely, deep recurrent convolutional neural networks, which have shown the importance of transaction costs and difficulty in Bitcoin price, among others. Our results have a great potential impact on the adequacy of asset pricing against the uncertainties derived from digital currencies, providing tools that help to achieve stability in cryptocurrency markets. Our models offer high and stable success results for a future prediction horizon, something useful for the asset valuation of cryptocurrencies like Bitcoin.


Introduction
Bitcoin is a cryptocurrency built by free software based on peer-to-peer networks as an irreversible private payment platform. Bitcoin lacks a physical form, is not backed by any public body, and therefore requires no intervention by a government agency or other agent to transact [1]. These transactions are made through the blockchain system. Blockchain is an open ledger that records transactions between two parties efficiently and permanently, in a way that cannot be erased, which makes it a decentralized validation protocol that is difficult to manipulate and carries a low risk of fraud. The blockchain system is not controlled by any individual entity [2].
The concept of Bitcoin originated from that of cryptocurrency, or virtual currency [3]. Cryptocurrencies are a monetary medium that is not affected by public regulation and is not subject to a regulatory body; they are governed only by the activity and rules set by their developers. Cryptocurrencies are virtual currencies that can be created and stored only electronically [4]. A cryptocurrency is a subset of digital currency designed to serve as a medium of exchange, and it uses cryptography to secure transactions and to control the creation of new units.
Forecasting Bitcoin price is vitally important for both asset managers and independent investors. Although Bitcoin is a currency, it cannot be studied like a traditional currency, for which economic theories about uncovered interest rate parity, the future cash-flows model, and purchasing power parity matter, since several standard factors of the relationship between supply and demand cannot be applied in a digital currency market like that of Bitcoin [5]. Moreover, Bitcoin has characteristics that make it useful for the agents who invest in it, such as transaction speed, dissemination, decentrality, and the large virtual community of people interested in discussing and providing relevant information about digital currencies, mainly Bitcoin [6].
Velankar and colleagues [7] attempted to predict the sign of the daily price change as accurately as possible using Bayesian regression and a generalized linear model. To do this, they considered the daily trends of the Bitcoin market and focused on the characteristics of Bitcoin transactions, reaching an accuracy of 51% with the generalized linear model. McNally and co-workers [8] studied the precision with which the direction of the Bitcoin price in United States Dollar (USD) can be predicted. They used a recurrent neural network (RNN), a long short-term memory (LSTM) network, and the autoregressive integrated moving average (ARIMA) method. The LSTM network obtained the highest classification accuracy, 52%, with a root mean square error (RMSE) of 8%. As expected, the non-linear deep learning methods outperformed the ARIMA forecasts. For their part, Yogeshwaran and co-workers [9] applied convolutional and recurrent neural networks to predict the price of Bitcoin using data from a time interval of 5 min to 2 h, with convolutional neural networks showing a lower level of error, at around 5%. Demir and colleagues [10] predicted the price of Bitcoin using methods such as long short-term memory networks, naïve Bayes, and the nearest neighbor algorithm. These methods achieved accuracy rates between 81.2% and 97.2%. Rizwan, Narejo, and Javed [11] continued with the application of deep learning methods using RNN and LSTM techniques. Their results showed an accuracy of 52% and an 8% RMSE for the LSTM. Linardatos and Kotsiantis [12] reached similar results using eXtreme Gradient Boosting (XGBoost) and LSTM; they concluded that the latter technique yielded a lower RMSE, of 0.999. Despite the superiority of computational techniques, Felizardo and colleagues [13] showed that ARIMA had a lower error rate than methods such as random forest (RF), support vector machine (SVM), LSTM, and WaveNets in predicting the future price of Bitcoin.
Finally, other works presented new deep learning methods, such as Dutta, Kumar, and Basu [14], who applied both LSTM and the gated recurrent unit (GRU) model; the latter showed the best error result, with an RMSE of 0.019. Ji and co-workers [15] predicted the price of Bitcoin with different methodologies such as a deep neural network (DNN), the LSTM model, and a convolutional neural network. They obtained a precision of 60%, leaving the improvement of precision with deep learning techniques and a better definition of significant variables as a future line of research. These authors show the need for stable prediction models, not only with in-sample and out-of-sample data, but also in forecasts of future results.
To contribute to the robustness of Bitcoin price prediction models, the present study develops a comparison of deep learning methodologies to predict and model the Bitcoin price and, as a consequence, a new model that generates better forecasts of the Bitcoin price and its future behavior. This model achieves accuracy levels above 95% and was constructed from a sample of 29 variables. Different methods were applied to build a reliable model, which is contrasted with various methodologies used in previous works to check which technique achieves a high predictive capacity; specifically, deep recurrent convolutional neural networks, deep neural decision trees, and deep support vector machines were used. Furthermore, this work seeks a model that is not only highly accurate but also robust and stable when predicting new observations over a future horizon, something that has not yet been reported by previous works [7-15], but which some authors demand for the development of these models and their real contribution [9,12].
We make two main contributions to the literature. First, we consider new explanatory variables for modeling the Bitcoin price and test the importance of variables that have not been considered so far. This has important implications for investors, who will know which indicators provide reliable and accurate forecasts of the Bitcoin price. Second, we improve the prediction accuracy relative to that obtained in previous studies by using innovative methodologies.
This study is structured as follows: Section 2 explains the theory of methods applied. Section 3 offers details of the data and the variables used in this study. Section 4 develops the results obtained. Section 5 provides conclusions of the study and the purposes of the models obtained.

Deep Learning Methods
As previously stated, different deep learning methods have been applied to develop Bitcoin price prediction models. We use this type of methodology because of the high predictive capacity it has shown in the previous literature on asset pricing, which helps to meet one of the objectives of this study: achieving a robust model. Specifically, deep recurrent convolutional neural networks, deep neural decision trees, and deep learning linear support vector machines have been used. The characteristics of each technique are detailed below. In addition, we describe the sensitivity analysis used in the present study, namely the method of Sobol [16], which is needed to determine the level of significance of the variables used in the prediction of Bitcoin price and which addresses the need for feature selection raised by the previous literature [15].

Deep Recurrent Convolution Neural Network (DRCNN)
Recurrent neural networks (RNN) have been applied in different fields for prediction due to their strong predictive performance. In an RNN, the result at each step is formed from the calculations made at previous steps [17]. Given an input sequence vector x, the hidden nodes of a layer s and the output of a hidden layer y can be estimated as explained in Equations (1) and (2).
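In Equations (1) and (2), W_xs, W_ss, and W_so denote the weights from the input layer x to the hidden layer s, from the hidden layer to itself, and from the hidden layer to the output layer, respectively, along with the biases of the hidden layer and the output layer; σ and o are the activation functions. Equation (3) defines the transform T(τ, ω) of the input signal.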
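As a minimal sketch of the recurrence described by Equations (1) and (2), the following pure-Python code runs a simple RNN over a short sequence. The exact form of the equations does not survive in the text, so the usual formulation is assumed: s_t = σ(W_xs x_t + W_ss s_{t-1} + b_s) and y_t = o(W_so s_t + b_y), with tanh as σ and the identity as o. The bias names `b_s` and `b_y` and the helper `matvec` are illustrative choices, not the authors' notation.

```python
import math


def matvec(W, v):
    """Multiply a matrix W (list of rows) by a vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def rnn_forward(xs, W_xs, W_ss, W_so, b_s, b_y):
    """Run a simple RNN over a sequence of input vectors xs.

    Assumed form (the source equations are not available):
        s_t = tanh(W_xs x_t + W_ss s_{t-1} + b_s)
        y_t = W_so s_t + b_y          (identity output activation)
    Returns the list of outputs and the final hidden state.
    """
    s = [0.0] * len(b_s)  # initial hidden state s_0 = 0
    outputs = []
    for x in xs:
        pre = [a + b + c
               for a, b, c in zip(matvec(W_xs, x), matvec(W_ss, s), b_s)]
        s = [math.tanh(p) for p in pre]
        y = [a + b for a, b in zip(matvec(W_so, s), b_y)]
        outputs.append(y)
    return outputs, s


# Toy usage: one input feature, one hidden unit, one output, two time steps.
outputs, final_state = rnn_forward(
    xs=[[1.0], [0.0]],
    W_xs=[[1.0]], W_ss=[[0.5]], W_so=[[2.0]],
    b_s=[0.0], b_y=[0.1],
)
```

The second output depends on the first input only through the hidden state, which is the sense in which previous calculations form the result.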
In Equation (3), z(t) is the vibration signal, ω(t) is the Gaussian window function centered around 0, and T(τ, ω) is the function that represents the vibration signal. The hidden layers are then calculated with the convolutional operation, as given in Equations (4) and (5).
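The description of Equation (3), a signal z(t) multiplied by a Gaussian window and mapped to a function T(τ, ω), matches a short-time Fourier transform. Assuming that reading, T(τ, ω) = ∫ z(t) ω(t − τ) e^{−jωt} dt, the sketch below approximates it with a Riemann sum over sampled data. The function names, the sampling step `dt`, and the window width `sigma` are hypothetical; the code names the window `gaussian_window` to avoid the clash between the window ω(t) and the frequency ω.

```python
import cmath
import math


def gaussian_window(t, sigma=1.0):
    """Gaussian window centered at 0 with standard deviation sigma."""
    return math.exp(-0.5 * (t / sigma) ** 2)


def stft_point(z, tau, omega, dt=1.0, sigma=1.0):
    """Approximate T(tau, omega) for samples z[k] taken at times t = k * dt.

    Assumed definition (short-time Fourier transform):
        T(tau, omega) = integral of z(t) * window(t - tau) * exp(-j*omega*t) dt
    """
    total = 0j
    for k, z_k in enumerate(z):
        t = k * dt
        total += z_k * gaussian_window(t - tau, sigma) * cmath.exp(-1j * omega * t)
    return total * dt


# Toy usage: a cosine with angular frequency 0.5 rad per sample.
signal = [math.cos(0.5 * k) for k in range(200)]
on_frequency = abs(stft_point(signal, tau=100.0, omega=0.5, sigma=10.0))
off_frequency = abs(stft_point(signal, tau=100.0, omega=1.5, sigma=10.0))
```

In this toy case the magnitude is large at the true frequency of the signal and close to zero away from it, which is the time-frequency picture the convolutional layers would operate on under this assumption.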
In Equations (4) and (5), W denotes the convolution kernels. Recurrent convolutional neural networks (RCNN) can be stacked to establish a deep architecture, called the deep recurrent convolutional neural network (DRCNN) [18,19]. To use the DRCNN method in the predictive task, Equation (6) determines how the last phase of the model serves as a supervised learning layer, where W_h is the weight and b_h is the bias. In the training stage, the model calculates the residuals, that is, the differences between the predicted and the actual observations [20]. Stochastic gradient descent is applied to learn the parameters. Considering that the data at time t is r, the loss function is determined as shown in Equation (7).
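The training step described above can be sketched as follows. Since Equations (6) and (7) are not reproduced in the text, this example assumes a linear output layer r̂ = W_h · h + b_h and a half squared-error loss L = ½ (r − r̂)²; both are assumptions, as are the function names and the learning rate `lr`.

```python
def predict(h, W_h, b_h):
    """Supervised output layer: r_hat = W_h . h + b_h (linear, by assumption)."""
    return sum(w * x for w, x in zip(W_h, h)) + b_h


def half_squared_error(r, r_hat):
    """Assumed loss: L = 0.5 * (r - r_hat) ** 2."""
    return 0.5 * (r - r_hat) ** 2


def sgd_step(h, r, W_h, b_h, lr=0.1):
    """One stochastic gradient descent update on a single observation (h, r)."""
    residual = predict(h, W_h, b_h) - r  # dL/d r_hat
    new_W = [w - lr * residual * x for w, x in zip(W_h, h)]
    new_b = b_h - lr * residual
    return new_W, new_b


# Toy usage: one update reduces the loss on the observation it was computed from.
h, r = [1.0, 2.0], 3.0
W_h, b_h = [0.0, 0.0], 0.0
loss_before = half_squared_error(r, predict(h, W_h, b_h))
W_h, b_h = sgd_step(h, r, W_h, b_h)
loss_after = half_squared_error(r, predict(h, W_h, b_h))
```

In the full model the gradient would also be propagated back through the recurrent convolutional layers; only the last layer is shown here.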

Deep Neural Decision Trees (DNDT)
Deep neural decision trees are decision tree (DT) models implemented by deep learning neural networks, where each setting of the DNDT weights corresponds to a specific decision tree, so its information can be interpreted [21]. Stochastic gradient descent (SGD) is used to optimize all the parameters simultaneously; learning is partitioned into mini-batches, and the model can be attached to a larger standard neural network (NN) for end-to-end learning with backpropagation. In contrast, standard DTs learn through a greedy and recursive splitting of features, which can make feature selection more efficient [22]. The method starts by applying a soft binning function to compute the residual rate for each node, making it possible to make split decisions in DNDTs [23]. The input of a binning function is a real scalar x, and its output is the index of the bin to which x belongs.
The activation function of the DNDT algorithm is carried out based on the NN represented in Equation (8).
In Equation (8), w is a constant with value w = [1, 2, ..., n + 1], τ > 0 is a temperature factor, and b is the bias term of the NN. The coding of the binning function for x is given by the NN according to the expression of Equation (9) [24]. The key idea is to build the DT by applying the Kronecker product to the binning functions defined above. Connecting every feature x_d with its NN f_d(x_d), we can determine all the final nodes of the DT as shown in Equation (10).
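The soft binning function of Equation (8) can be sketched as a one-layer network followed by a softmax with temperature τ. The text gives w = [1, 2, ..., n + 1] but does not show how b is built; the sketch assumes the usual DNDT construction from n sorted cut points β_1 < ... < β_n, namely b = [0, −β_1, −β_1 − β_2, ...]. That construction, the function names, and the example cut points are assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax of a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def soft_bin(x, cut_points, tau=0.1):
    """Soft assignment of scalar x to one of len(cut_points) + 1 bins.

    Assumed form: softmax((w * x + b) / tau) with
        w = [1, 2, ..., n + 1]
        b = [0, -beta_1, -beta_1 - beta_2, ...]   (cumulative, assumed)
    As tau approaches 0 the output approaches a one-hot bin indicator.
    """
    betas = sorted(cut_points)
    w = [float(i) for i in range(1, len(betas) + 2)]
    b = [0.0]
    for beta in betas:
        b.append(b[-1] - beta)
    logits = [(w_i * x + b_i) / tau for w_i, b_i in zip(w, b)]
    return softmax(logits)


# Toy usage: two cut points define three bins; 0.5 falls in the middle one.
probabilities = soft_bin(0.5, cut_points=[0.33, 0.66], tau=0.01)
```

Because the output is differentiable in the cut points, they can be learned with SGD together with the rest of the network.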
In Equation (10), z is the leaf node index reached by instance x, expressed in vector form. The complexity of the model is determined by the number of cut points of each node. Some cut points may be inactive, since the values of the cut points are usually not bounded.
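The Kronecker-product step of Equation (10) can be illustrated as follows, assuming that z = f_1(x_1) ⊗ f_2(x_2) ⊗ ... ⊗ f_D(x_D), where each f_d(x_d) is the (soft) bin vector of feature d. The function names and the hard-coded bin vectors are illustrative.

```python
def kron(u, v):
    """Kronecker product of two vectors given as lists."""
    return [a * b for a in u for b in v]


def leaf_vector(binned_features):
    """Combine per-feature bin vectors into one vector over all leaves.

    With one-hot inputs the result is one-hot, and the position of the 1
    is the index of the leaf reached by the instance.
    """
    z = [1.0]
    for bins in binned_features:
        z = kron(z, bins)
    return z


# Toy usage: feature 1 falls in bin 1 of 3, feature 2 in bin 0 of 2.
z = leaf_vector([[0.0, 1.0, 0.0], [1.0, 0.0]])
leaf_index = z.index(max(z))
```

The number of leaves is the product of the numbers of bins per feature (3 × 2 = 6 here), which is why the number of cut points controls the complexity of the model.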

Deep Learning Linear Support Vector Machines (DSVR)
Support vector machines (SVMs) were created for binary classification. Training data and their labels are denoted by (x_n, t_n), n = 1, ..., N, with x_n ∈ R^D and t_n ∈ {−1, +1}; SVMs are optimized according to Equation (11), where ξ_n are slack variables that penalize observations that do not meet the margin requirements [25].
The optimization problem is defined as appears in Equation (12).

Usually, the Softmax or 1-of-K encoding method is applied in the classification task of deep learning algorithms. When working with 10 classes, the Softmax layer is composed of 10 nodes denoted by p_i, where i = 1, ..., 10; p_i specifies a discrete probability distribution, so that Σ_i p_i = 1. Equation (13) defines the total input into the Softmax layer in terms of h, the activation of the penultimate layer nodes, and W, the weights connecting the penultimate layer to the Softmax layer. Equation (14) gives the resulting probabilities.
The predicted class î is then given by Equation (15).
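The Softmax layer and the predicted class described above can be sketched as follows. The formulas are not shown in the text, so the standard definitions are assumed: total input a_i = Σ_k h_k W_ki, probabilities p_i = exp(a_i) / Σ_j exp(a_j), and predicted class î = argmax_i p_i. The symbol `a` for the total input and the function name are assumptions.

```python
import math


def softmax_layer(h, W):
    """Compute the total input, class probabilities, and predicted class.

    h: activations of the penultimate layer (length K).
    W: K x C weight matrix linking the penultimate layer to the Softmax layer.
    """
    n_classes = len(W[0])
    a = [sum(h[k] * W[k][i] for k in range(len(h))) for i in range(n_classes)]
    m = max(a)  # subtract the maximum for numerical stability
    exps = [math.exp(a_i - m) for a_i in a]
    total = sum(exps)
    p = [e / total for e in exps]
    i_hat = max(range(n_classes), key=lambda i: p[i])
    return a, p, i_hat


# Toy usage with two penultimate nodes and three classes.
a, p, i_hat = softmax_layer(h=[1.0, 2.0], W=[[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
```

Because the exponential is monotonic, the class with the largest probability is also the class with the largest total input.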
Since the objective of the linear SVM is not differentiable, a popular variation, known here as the DSVR, minimizes the squared hinge loss instead, as indicated in Equation (16).
The target of the DSVR is to train deep neural networks for prediction [24,25]. Equation (17) expresses the derivative of the objective l(w) with respect to the activation of the penultimate layer, obtained by replacing the input x with the activation h, where I{·} is the indicator function. Likewise, for the DSVR, we have Equation (18).
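The two gradients referred to in Equations (17) and (18) can be sketched for a single observation. The equations themselves are missing from the text, so the standard forms are assumed: for the hinge loss C · max(1 − t wᵀh, 0), the gradient with respect to h is −C t w I{1 > t wᵀh}; for the squared hinge loss C · max(1 − t wᵀh, 0)², it is −2 C t w max(1 − t wᵀh, 0). The function names and the constant `C` are illustrative.

```python
def margin_violation(w, h, t):
    """Return 1 - t * (w . h); positive when the margin is violated."""
    return 1.0 - t * sum(w_i * h_i for w_i, h_i in zip(w, h))


def hinge_grad_h(w, h, t, C=1.0):
    """Gradient of C * max(1 - t * w.h, 0) with respect to h (assumed form)."""
    indicator = 1.0 if margin_violation(w, h, t) > 0 else 0.0
    return [-C * t * w_i * indicator for w_i in w]


def squared_hinge_loss(w, h, t, C=1.0):
    """Squared hinge loss for one observation: C * max(1 - t * w.h, 0) ** 2."""
    return C * max(margin_violation(w, h, t), 0.0) ** 2


def squared_hinge_grad_h(w, h, t, C=1.0):
    """Gradient of the squared hinge loss with respect to h (assumed form)."""
    m = max(margin_violation(w, h, t), 0.0)
    return [-2.0 * C * t * w_i * m for w_i in w]


# Toy usage: an observation inside the margin yields a non-zero gradient.
w, h, t = [1.0, -1.0], [0.5, 0.0], 1
grad_hinge = hinge_grad_h(w, h, t)
grad_squared = squared_hinge_grad_h(w, h, t)
```

The squared version scales the gradient by the size of the margin violation, so it changes continuously at the margin, whereas the hinge gradient switches on and off through the indicator function.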

Sensitivity Analysis
Data mining methods have the virtue of offering a large amount of explanation about the problem under study. To quantify this degree of explanation, a sensitivity analysis is performed. This analysis tries to quantify the relative importance of the independent variables with respect to the dependent variable [26,27]. To do this, the set of initial variables is reduced, leaving only the most significant ones. A variance-based criterion is used: a variable is considered significant if its contribution to the variance is large relative to that of the rest of the variables as a whole. The Sobol method [16] is applied to decompose the variance of the total output V(Y) through the set of equations expressed in Equation (19). The sensitivity indexes are obtained from this decomposition, with S_ij being the effect of the interaction between two variables. The Sobol decomposition allows the estimation of a total sensitivity index, S_Ti, which measures the sum of all the sensitivity effects involving a given independent variable.
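As a hedged illustration of the Sobol indices, the sketch below estimates first-order indices S_i and total indices S_Ti by Monte Carlo for a function of independent inputs that are uniform on [0, 1]. It uses the common "pick-freeze" estimators (a Saltelli-type estimator for S_i and the Jansen estimator for S_Ti); the paper does not state which estimator was used, so this choice, the function name, and the sample size are assumptions.

```python
import random


def sobol_indices(f, dim, n=20000, seed=0):
    """Monte Carlo estimate of first-order and total Sobol indices.

    f takes a list of `dim` inputs, each assumed uniform on [0, 1].
    Returns (first_order, total), two lists of length `dim`.
    """
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(dim)] for _ in range(n)]
    B = [[rng.random() for _ in range(dim)] for _ in range(n)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]
    mean = sum(fA + fB) / (2 * n)
    var = sum((y - mean) ** 2 for y in fA + fB) / (2 * n - 1)

    first_order, total = [], []
    for i in range(dim):
        # A with its i-th column replaced by the i-th column of B.
        fAB = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(fB, fAB, fA)) / n
        v_ti = sum((ya - yab) ** 2 for ya, yab in zip(fA, fAB)) / (2 * n)
        first_order.append(v_i / var)
        total.append(v_ti / var)
    return first_order, total


# Toy usage: Y = X1 + 2 * X2, with X3 having no effect on the output.
first_order, total = sobol_indices(lambda x: x[0] + 2.0 * x[1] + 0.0 * x[2], dim=3)
```

For this additive toy function the exact values are S_1 = 0.2, S_2 = 0.8, and S_3 = 0, and the total indices coincide with the first-order ones because there are no interactions. A variable with a total index near zero is a candidate for removal from the initial set.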

Data and Variables
The sample period selected is from 2011 to 2019, with a quarterly frequency of data. The independent variables were obtained from the IMF's International Financial Statistics (IFS), the World Bank, FRED (Federal Reserve Bank of St. Louis), Google Trends, Quandl, and Blockchain.info.
The dependent variable used in this study is the Bitcoin price, defined as the value of Bitcoin in USD. In addition, we used 29 independent variables, classified into demand and supply variables, attractiveness variables, and macroeconomic and financial variables, as possible predictors of the future Bitcoin price (Table 1). These variables have been used throughout the previous literature [1,3,4,14].
The sample is split into three mutually exclusive parts: one for training (70% of the data), one for validation (10% of the data), and one for testing (20% of the data). The training data are used to build the models, while the validation data are used to assess whether those models are overtrained. The test data serve to evaluate the built model and measure its predictive capacity. Accuracy is the percentage of correctly classified cases, and RMSE measures the level of error. Furthermore, 10-fold cross-validation with 500 iterations was used to distribute the sample data across these three phases [28,29].
Table 2 shows a statistical summary of the independent variables for predicting Bitcoin price. For every variable, the standard deviation is not higher than the mean, so the data show initial stability. On the other hand, there is a large difference between the minimum and maximum values. Variables such as mining commissions and cost per transaction show a small minimum value compared with their mean value, and the same holds for the hash variable. These extremes, however, do not affect the standard deviations of the respective variables.

Empirical Results
Table 3 and Figures 1-3 show the level of accuracy, the root mean square error (RMSE), and the mean absolute percentage error (MAPE). In all models, the level of accuracy exceeds 92.61% for testing data, and the RMSE and MAPE levels are adequate. The model with the highest accuracy is the deep recurrent convolutional neural network (DRCNN) with 97.34%, followed by the deep neural decision trees (DNDT) model with 96.94% on average. Taken together, these results provide a level of accuracy far superior to that of previous studies: Ji and co-workers [15] report an accuracy of around 60%, McNally and co-workers [8] close to 52%, and Rizwan, Narejo, and Javed [11] approximately 52%.
Finally, Table 4 shows the most significant variables by method after applying the Sobol method for the sensitivity analysis. Block size, cost per transaction, and difficulty were significant in all three models. This demonstrates the importance of the cost of carrying out a Bitcoin transaction, of the block of Bitcoins to buy, and of the difficulty miners face in finding new Bitcoins as the main factors determining the price of Bitcoin. This contrasts with previous studies, where these variables are not significant or are not included in the initial set of variables [5,7,8]. The best results were obtained by the DRCNN method, where, in addition to the aforementioned variables, the transaction value, transaction volume, block size, dollar exchange rate, Dow Jones, and gold were also significant. This shows that the demand and supply variables of the Bitcoin market are essential to predict its price, as some previous works have shown [1,30]. Yet macroeconomic and financial variables have not been observed as important factors by other recent works [30,31], which found that they did not influence Bitcoin price fluctuations. In our results, the macroeconomic variables of Dow Jones and gold were significant in all methods.
On the other hand, the models built by the DNDT and DSVR methods show high levels of precision, although lower than those obtained by the DRCNN. Furthermore, these methods show some different significant variables. One of them is forum posts, a variable popularly used as a proxy for the level of future demand for Bitcoin, although previous works diverge on its significance for predicting the price of Bitcoin, and some find that it is not significant [11,14]. Finally, these methods show another macroeconomic variable to be more significant, namely the dollar exchange rate. This indicates that changes in the USD price of Bitcoin can be decisive in estimating the possible demand and, therefore, a change in price. In previous works, this variable, like the rest of the macroeconomic variables, has not been shown to be significant [5,31].
This set of significant variables represents a group of novel factors that determine the price of Bitcoin and is therefore different from that shown in the previous literature.
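The evaluation protocol behind these results, a 70/10/20 partition with accuracy, RMSE, and MAPE as metrics, can be sketched as follows. The split shown is a simple ordered split, which is an assumption (the study distributes the data through cross-validation), and the function names are hypothetical. The example uses 36 observations, the number of quarters from 2011 to 2019.

```python
import math


def split_70_10_20(rows):
    """Split rows, in order, into training (70%), validation (10%), and test (rest)."""
    n = len(rows)
    n_train = int(n * 0.7)
    n_val = int(n * 0.1)
    return (rows[:n_train],
            rows[n_train:n_train + n_val],
            rows[n_train + n_val:])


def accuracy(y_true, y_pred):
    """Percentage of correctly classified cases."""
    hits = sum(a == b for a, b in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


# Toy usage: 36 quarterly observations split into the three parts.
train, validation, test = split_70_10_20(list(range(36)))
```

With so few quarterly observations the test set is small, which is one reason resampling schemes such as cross-validation are commonly paired with this kind of split.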

Post-Estimations
In this section, we estimate models to generate forecasts over a future horizon. For this, we used the multiple-step-ahead prediction framework with the iterative strategy, in which models are trained to predict one step ahead [32]. At time t, a prediction is made for moment t + 1, and this prediction is then used to predict moment t + 2, and so on. This means that the predicted data for t + 1 are treated as real data and appended to the available data [33]. Table 5 and Figures 4-6 show the accuracy and error results for the t + 1 and t + 2 forecasting horizons. For t + 1, the accuracy of the three methods ranges from 88.34% to 94.19% on average, with the highest accuracy obtained by the DRCNN (94.19%). For t + 2, it ranges from 85.76% to 91.37%, with the highest accuracy again obtained by the DRCNN (91.37%). These results show the high precision and great robustness of the models.
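The iterative strategy described above can be sketched in a few lines. The one-step model used here, `drift_model`, is a toy stand-in (last value plus last change) and not one of the models of this study; any trained one-step predictor with the same call signature could be substituted. All names are hypothetical.

```python
def iterative_forecast(history, one_step_model, horizon=2):
    """Multiple-step-ahead forecast using the iterative strategy.

    Each prediction is appended to the data as if it had been observed
    and is then used as input for the next step.
    """
    data = list(history)  # copy, so the caller's history is not modified
    forecasts = []
    for _ in range(horizon):
        y_next = one_step_model(data)
        forecasts.append(y_next)
        data.append(y_next)
    return forecasts


def drift_model(data):
    """Toy one-step model: last value plus the last observed change."""
    return data[-1] + (data[-1] - data[-2])


# Toy usage: forecasts for t + 1 and t + 2 from three observations.
forecasts = iterative_forecast([1.0, 2.0, 4.0], drift_model, horizon=2)
```

Because each step feeds on the previous prediction, errors can accumulate with the horizon, which is consistent with the lower accuracy reported for t + 2 than for t + 1.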

Conclusions
This study developed a comparison of methodologies to predict Bitcoin price and, as a result, created a new model to forecast this price. The period selected was from 2011 to 2019. We applied different deep learning methods to build a robust Bitcoin price prediction model: deep recurrent convolutional neural networks, deep neural decision trees, and deep support vector machines. The DRCNN model obtained the highest levels of precision. Our aim was to increase the performance of Bitcoin price prediction models compared to previous literature. This research has shown significantly higher precision than previous works, achieving a precision hit range of 92.61-95.27%. Likewise, it was possible to identify a new set of significant variables for the prediction of the price of Bitcoin, with the models developed offering great stability when predicting over future horizons of one and two years.
This research extends the results and conclusions of previous works on the price of Bitcoin, both in terms of precision and error and in terms of significant variables. A set of significant variables has been selected for each methodology applied, and some of these variables recur in all three methods. This represents an important addition to the field of cryptocurrency pricing. The conclusions are relevant to central bankers, investors, asset managers, private forecasters, and business professionals in the cryptocurrency market, who are generally interested in knowing which indicators provide reliable and accurate forecasts of price changes. Our study suggests new and significant explanatory variables that allow these agents to predict the Bitcoin price. These results provide a new Bitcoin price forecasting model developed using three methods, with the DRCNN model as the most accurate, thus contributing to existing knowledge in the field of machine learning and, especially, deep learning. This new model can be used as a reference for asset pricing and for improved investment decision-making.
In summary, this study contributes to the field of finance, since the results obtained have significant implications for the future decisions of asset managers, making it possible to avoid large price changes and their potential associated costs. It also helps these agents send warning signals to financial markets and avoid massive losses derived from an increase in price volatility.
Opportunities for further research in this field include developing predictive models that consider the volatility correlation of other new alternative assets and of safe-haven assets such as gold or stable currencies, and that evaluate different scenarios of portfolio choice and optimization.

Funding: This research was funded by Cátedra de Economía y Finanzas Sostenibles, University of Malaga, Spain.

Conflicts of Interest:
The authors declare no conflict of interest.