An advanced CNN-LSTM model for cryptocurrency forecasting

: Nowadays, cryptocurrencies are established and widely recognized as an alternative


Introduction
Cryptocurrencies have been established and widely recognized as a new electronic alternative exchange currency method, which has considerable implications for emerging economies and for the global economy in general [1]. They have infiltrated most financial transactions and, as a result, cryptocurrency trading is generally considered one of the most popular and promising types of profitable investment. Nevertheless, this constantly growing financial market is characterized by significant volatility and strong price fluctuations over time. Nowadays, cryptocurrency forecasting is generally considered one of the most challenging time-series prediction problems, due to the large number of unpredictable factors involved and the significant volatility of cryptocurrency prices, which result in complicated temporal dependencies [2][3][4].
During the last years, deep learning methodologies have been applied to time-series prediction, focusing on popular real-world application domains such as the cryptocurrency market. Most of these models exploit advanced deep learning techniques and special architectural designs based on convolutional and long short-term memory (LSTM) layers [5][6][7][8][9][10]. Convolutional layers are utilized to filter out the noise in complex time-series data and to extract new valuable features, while LSTM layers are used to efficiently capture sequence patterns as well as long-term and short-term dependencies [9].
Nevertheless, although advanced deep learning models are able to address highly nonlinear time-series problems, they have been shown to produce inefficient and unreliable cryptocurrency forecasts. More specifically, Pintelas et al. [11] and Livieris et al. [12] presented comprehensive studies and highlighted that the difficulties in cryptocurrency forecasting stem from two main reasons: first, cryptocurrency time-series are close to a random walk process, which implies that the prediction problem is very complex; second, the inefficiency of deep learning models is mainly due to the existence of autocorrelation in the errors and the lack of stationarity [2,13]. It is worth mentioning that stationarity constitutes an important property in time-series modeling, as well as for the reliability of the prediction models. Notice that non-stationary series possess high volatility and trend, are frequently characterized by heteroskedasticity, and have significant properties such as mean, frequency, variance and kurtosis that vary over time.
Along this line, Livieris et al. [2] introduced a novel framework for enhancing deep learning forecasting models. The major novelty of their work was the enforcement of a time-series to become "suitable" for fitting a deep learning model based on the stationarity property, together with a framework for the development of accurate and reliable prediction models. To impose stationarity, the authors performed a series of transformations based on first differences or returns, without the loss of any embedded information. Additionally, they performed an extensive study focusing on the evaluation of the prediction accuracy of deep learning models, as well as on the reliability of their forecasts, by examining the existence of autocorrelation in the errors. Based on their experimental and theoretical evidence, the authors concluded that their proposed framework secures the "suitability" of a time-series for fitting a deep learning model and that it is essential for developing accurate and reliable deep learning time-series models.
Based on the previous works, the objective of this research is two-fold: firstly, to investigate whether the forecasting accuracy of a cryptocurrency deep learning model can indeed be enhanced by utilizing data from various cryptocurrencies; secondly, to develop a deep learning model with advanced forecasting accuracy.
In this work, we propose a multiple-input deep neural network model, called MICDL, for the prediction of cryptocurrency price and movement. Initially, all cryptocurrency time-series data are transformed based on the returns transformation in order to satisfy the stationarity property and be "suitable" for fitting the proposed deep neural network model [2]. Subsequently, the proposed prediction model uses as inputs the transformed data from various cryptocurrencies and handles them independently, in the sense that the data of each cryptocurrency are fed to different convolutional layers, so that the information of each cryptocurrency is exploited and processed separately. Finally, the processed data from each cryptocurrency are merged and further processed for issuing the final prediction. The rationale for the utilization of a multi-input neural network is that this type of model was originally proposed for more efficiently exploiting mixed data, that is, multiple types of independent data [14]. In the literature, these models have been successfully applied for addressing a variety of difficult real-world problems, reporting promising results, while they were found to outperform traditional single output models [14][15][16][17][18]. The main idea behind these models is to extract valuable information from each category of mixed data independently and then concatenate the information for issuing the final prediction. Additionally, we conducted an empirical study utilizing almost four consecutive years (Jan-01, 2017 to Oct-31, 2020) of cryptocurrency data from the three cryptocurrencies with the highest market capitalization, i.e. Bitcoin (BTC), Ethereum (ETH) and Ripple (XRP). The numerical experiments report that the proposed model provided reliable price movement predictions, outperforming traditional deep learning models, as well as accurate price forecasts. Moreover, the detailed experimental analysis highlights that MICDL has the ability to efficiently exploit mixed cryptocurrency data and reduces overfitting with lower computational cost compared to a traditional fully-connected deep neural network.
The remainder of this paper is organized as follows: Section 2 presents a brief review of deep learning models for cryptocurrency price and movement forecasting. Section 3 presents a detailed description of the proposed framework, focusing on its architecture and benefits. Section 4 presents the data preparation and reports the descriptive statistics, describing the basic features of each dataset. Section 5 presents the detailed experimental analysis, focusing on the evaluation of the proposed framework. Section 6 summarizes the main findings of this research and presents the conclusions and some interesting future directions.

Related work
Cryptocurrency price analysis and forecasting constitutes a considerably complicated problem in time-series analysis and a challenging research area. Its complexity and difficulty are caused by the significant fluctuations and volatility of cryptocurrency time-series, which are highly influenced by an enormous number of factors. In the literature, recent research efforts have adopted deep learning methodologies for predicting cryptocurrency price and directional movement in order to improve forecasting accuracy. Some interesting findings and useful conclusions are briefly presented.
Derbentsev et al. [3] attempted to model the short-term dynamics of the three most capitalized cryptocurrencies, i.e. Bitcoin, Ethereum and Ripple, using several sophisticated prediction models. More specifically, they evaluated the prognostic performance of an Artificial Neural Network (ANN), a Random Forest (RF) and a Binary Autoregressive Tree (BART) model. The utilized data contained 1583 daily cryptocurrency prices from Aug-1, 2015 to Dec-1, 2019. Their experimental results reported that the ANN and BART models exhibited 63% average accuracy for predicting directional movement, which was considerably higher than that of the "naive" model.
Chowdhury et al. [4] applied advanced machine learning prediction models on the index and constituents of cryptocurrencies for forecasting future values. More analytically, their primary aim was the prediction of the closing price of the CCI30 index, as well as of nine major cryptocurrencies, in order to assist cryptocurrency investors in trading. In their work, they utilized a variety of machine learning models including Gradient Boosted Trees, ANNs, k-nearest neighbor as well as robust ensemble learning models. Their utilized data contained daily closing prices from Jan-01, 2017 to Jan-31, 2019. Ensemble models and Gradient Boosted Trees exhibited the best prediction performance, which was competitive with, and sometimes better than, that of similar state-of-the-art models proposed in the literature.
Pintelas et al. [11] conducted an interesting study, evaluating sophisticated deep learning models for predicting cryptocurrency prices and movements. Their research revealed the significant limitations of deep learning models in providing reliable forecasts. Based on their experimental analysis, the authors highlighted the need for adopting more advanced algorithmic approaches for the development of efficient and reliable cryptocurrency models. Along this line, Livieris et al. [12] sought to improve the forecasting performance and reliability of deep learning models utilizing three widely used ensemble strategies, i.e. Averaging, Bagging and Stacking. The authors utilized hourly prices of Bitcoin, Ethereum and Ripple from Jan-1, 2018 to Aug-31, 2019. Additionally, they conducted an exhaustive performance evaluation of various ensemble models using several Conv-based and LSTM-based learners as base models. Their analysis highlighted that deep learning and ensemble learning may be efficiently adapted to develop strong and reliable cryptocurrency prediction models, but at a significant computational cost.
Patel et al. [19] proposed a hybrid cryptocurrency prediction approach, which focuses on Litecoin and Monero cryptocurrencies.
A common limitation of all presented and discussed studies is that they focused on achieving better forecasting performance by exploiting more sophisticated models and techniques, usually ignoring the development of a sophisticated training dataset containing more useful information. In other words, most approaches treat each cryptocurrency independently, ignoring its conceivable relations with other cryptocurrencies, and they do not take into consideration the complexity and non-stationarity of cryptocurrency time-series data.
In this research, we propose a different approach and present a new model for the development of accurate and reliable forecasting models. The novelty of the proposed model is based on the utilized training data as well as its special architectural design. More specifically, in this work, we propose a multiple-input deep neural network model, which utilizes as inputs various cryptocurrency data and handles them independently, so that the information of each cryptocurrency is exploited and processed separately. The processed data from each cryptocurrency are merged and further processed for issuing the final prediction. To the best of our knowledge, this is the first approach which focuses on exploiting data from various cryptocurrencies for providing more accurate forecasts.
Following previous approaches [11,12], we provide a comprehensive performance evaluation for price prediction and directional movement.

Multiple-input cryptocurrency deep learning model
In this section, we present the proposed Multiple-Input Cryptocurrency Deep Learning (MICDL) model. The proposed approach is based on the idea of not processing all cryptocurrency data simultaneously. Instead, the data of each cryptocurrency are processed and handled independently, and then the processed data from each cryptocurrency are merged and further processed for estimating the final prediction. The rationale behind the proposed approach is to develop a learning model which is able to independently extract useful information from various cryptocurrency data and subsequently process this information for achieving accurate and reliable predictions.
Suppose that we have data from N cryptocurrencies. The data of each cryptocurrency are utilized as input to a unique convolutional layer, which is followed by a pooling layer and an LSTM layer. The proposed approach focuses on exploiting the ability of convolutional layers to extract useful knowledge by learning the internal representation of each cryptocurrency independently, as well as the effectiveness of LSTM layers in identifying short-term and long-term dependencies. Then, the output vectors of all LSTM layers are merged by a concatenate layer. This layer is followed by a series of layers which constitute the classical structure of a deep neural network, i.e. a dense layer, a batch normalization layer, a dropout layer, a dense layer, a batch normalization layer, a dropout layer and a final output layer of one neuron. The architecture of MICDL is presented in Figure 1.
Notice that although a traditional deep neural network model is able to analyze and encode any complex function, the convergence of its training process may be degraded due to the number of weights, which exponentially increases as the number of layers increases, and due to the vanishing gradient problem, which usually occurs in large networks. In contrast, the significant advantages of the proposed model's architecture are that it provides more flexibility and adaptivity for low computational effort compared to a fully connected neural network with a similar number of layers, as well as greater resistance to the vanishing gradient problem, due to its sparse structure [16,20].
Subsequently, we present a brief description of the main elements of the proposed MICDL model, i.e. convolutional and pooling layers, LSTM layers, dense layers, batch-normalization layers and dropout layers.

• Convolutional layer: Convolutional layers [21] constitute a novel class of neural network layers which are characterized by their remarkable ability to learn the internal representation of their inputs. This is performed by applying convolution operations between the input data and convolution kernels, called "filters", for developing new feature values.
• Pooling layer: Pooling layers [21] are utilized to reduce the spatial dimensions, aiming to reduce the number of operations required for all following layers. Notice that less spatial … Max pooling and average pooling layers are probably the most widely utilized choices, which use the maximum value and the average value from each cluster of outputs of the previous layer, respectively [22].
• LSTM layer: LSTM layers [23] belong to the class of recurrent neural network layers, enhanced with a separate memory cell and adaptive gate units (input, forget and output) for controlling the information flow. The utilization of gates in each cell implies that data can be filtered, discarded or added, therefore maintaining useful information in the memory cell for longer periods of time. The advantage of LSTM layers is their ability to identify both short-term and long-term correlation features within time series and to considerably mitigate the vanishing gradient problem [23].
• Dense layer: Dense layers constitute the most popular and widely utilized choice for composing the hidden layers of a deep neural network [24]. In particular, each dense layer is composed of neurons which are connected with all neurons of the previous layer. Generally, dense layers add a non-linearity property, and theoretically a neural network composed of dense layers is able to model any mathematical function [25].
• Batch-normalization layer: Batch normalization constitutes an elegant technique for training deep neural networks which focuses on stabilizing the learning process by standardizing the inputs of the next layer for each mini-batch [21]. Batch normalization significantly reduces the problem of coordinating updates across many layers and usually accelerates training by considerably reducing the number of epochs.
• Dropout layer: Dropout constitutes one of the most famous regularization methods for preventing neural networks from overfitting. The dropout layer is a non-learnable layer which is added between existing layers of a neural network model. It is applied to the outputs of the prior layer, which are fed to the next layer, and temporarily sets a random subset of these outputs to zero with a pre-defined probability p, called the dropout rate. The key idea in dropout and its motivation is to make each layer less sensitive to statistical fluctuations in the inputs [26].
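The convolution and pooling steps described in the list above can be illustrated on a single short window. The following is a minimal sketch in plain Python, not the authors' implementation: the input values, the filter weights `[1.0, -1.0]` and the helper names are hypothetical, and a trained layer would learn many such filters.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Slide `kernel` over `sequence` without padding.

    This is the cross-correlation form used by most deep learning libraries.
    """
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, sequence[i:i + k])) + bias
        for i in range(len(sequence) - k + 1)
    ]


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def average_pool(values, size=2):
    """Non-overlapping average pooling; a trailing incomplete window is dropped."""
    return [
        sum(values[i:i + size]) / size
        for i in range(0, len(values) - size + 1, size)
    ]


# Seven toy daily returns (illustrative values, not market data).
window = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04]
kernel = [1.0, -1.0]  # hypothetical filter of size 2

features = relu(conv1d_valid(window, kernel))  # 7 inputs -> 6 feature values
pooled = average_pool(features, size=2)        # 6 values -> 3 pooled values
```

With a filter of size 2 and pooling of size 2, a window of 7 values is reduced to 3 time steps, which is the sequence an LSTM layer would then consume.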

Data
The data utilized in this research concern daily historical data from Jan-01, 2017 to Oct-31, 2020 of BTC, ETH and XRP in USD, which constitute the cryptocurrencies with the highest market capitalization. The data for all cryptocurrencies were collected from the website https://coinmarketcap.com. Next, following the framework originally proposed by Livieris et al. [2], all cryptocurrency data are initially transformed based on the returns transformation in order to satisfy the stationarity property and to be "suitable" for fitting a deep neural network model. Finally, it is worth noticing that all deep learning models were trained using the transformed series, and the inverse transformations were applied for calculating the predictions for the levels of the original time-series.
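As an illustration of the transformation and its inverse, the sketch below assumes simple (arithmetic) returns, r_t = (p_t - p_{t-1}) / p_{t-1}; the text does not state which returns definition was used, and the function names and prices are hypothetical.

```python
def to_returns(prices):
    """Simple returns r_t = (p_t - p_{t-1}) / p_{t-1}.

    The output has one element fewer than the input.
    """
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def to_price(previous_price, predicted_return):
    """Map a one-step return forecast back to the price level."""
    return previous_price * (1.0 + predicted_return)


prices = [100.0, 110.0, 99.0]   # toy values, not real market data
returns = to_returns(prices)    # approximately [0.1, -0.1]

# Applying the inverse to the true returns recovers the original levels.
restored = [to_price(p, r) for p, r in zip(prices, returns)]
```

In a forecasting setting, `to_price` would be called with the last observed price and the model's predicted return, so that errors such as MAE and RMSE can be reported on the original price scale.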

Numerical experiments
In this section, we conduct an extensive experimental analysis to examine and evaluate the performance of the proposed multi-input deep learning model in forecasting the cryptocurrency prices of Bitcoin, Ethereum and Ripple.
The proposed model was evaluated against two CNN-LSTM models, Model 1 and Model 2. Model 1 is trained with data from only one cryptocurrency (i.e. BTC, ETH or XRP), whereas Model 2, like the proposed MICDL model, is trained with data from all three cryptocurrencies.
• Model 1 consists of a convolutional layer of 16 filters of size (2,), followed by an average pooling layer of size 2, an LSTM layer of 50 units, a batch normalization layer, a dropout layer with p = 0.4, a dense layer of 64 neurons, a batch normalization layer, a dropout layer with p = 0.2 and an output layer of one neuron.
• Model 2 consists of a convolutional layer of 32 filters of size (2,), followed by an average pooling layer of size 2, an LSTM layer of 50 units, a batch normalization layer, a dropout layer with p = 0.5, a dense layer of 128 neurons, a batch normalization layer, a dropout layer with p = 0.2 and an output layer of one neuron.
• MICDL model consists of 3 convolutional layers with 16 filters of size (2,), each of which takes as input the time-series data of a unique cryptocurrency, i.e. BTC, ETH and XRP. Each convolutional layer is followed by an average pooling layer of size (2,) and an LSTM layer with 50 units. The outputs of the LSTM layers are merged by a concatenate layer, which is followed by a dense layer of 256 neurons, a batch normalization layer, a dropout layer with p = 0.3, a dense layer of 64 neurons, a batch normalization layer, a dropout layer with p = 0.2 and a final output layer of one neuron.
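To give a feel for the size of the MICDL configuration listed above, the following sketch counts its weights layer by layer. It is an estimate under stated assumptions, not a figure reported by the authors: each branch is assumed to receive a single feature per time step, the LSTM is assumed to use the standard formulation with four gate blocks and biases, and batch normalization is counted with four values per feature (scale, shift, and the tracked mean and variance). Pooling, dropout and concatenation layers carry no weights.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters


def lstm_params(input_dim, units):
    """Four gate blocks, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(inputs, outputs):
    """Fully connected weights plus one bias per output neuron."""
    return (inputs + 1) * outputs


def batchnorm_params(features):
    """Scale and shift are trained; moving mean and variance are tracked."""
    return 4 * features


n_inputs, filters, lstm_units = 3, 16, 50  # BTC, ETH, XRP branches

# One branch: Conv1D(16 filters, size 2) -> AveragePooling -> LSTM(50).
branch = conv1d_params(2, 1, filters) + lstm_params(filters, lstm_units)

# Merged head: Dense(256) -> BN -> Dropout -> Dense(64) -> BN -> Dropout -> Dense(1).
merged = n_inputs * lstm_units
head = (
    dense_params(merged, 256) + batchnorm_params(256)
    + dense_params(256, 64) + batchnorm_params(64)
    + dense_params(64, 1)
)

total = n_inputs * branch + head
```

Under these assumptions the count does not depend on the Lag value, since convolutional and LSTM weights are shared across time steps.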
Each cryptocurrency prediction model was trained utilizing two different Lag values, i.e. 7 (1 week) and 14 (2 weeks), while its hyper-parameters were optimized through exhaustive experimentation (various numbers of filters in convolutional layers, units in LSTM layers, neurons in dense layers and values of the dropout rate were tested).
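The role of the Lag value can be sketched as follows: each training sample consists of one window of length Lag per cryptocurrency, and the target is the next value of the series being forecast. This is an illustrative construction in plain Python; the function name `make_windows` and the toy series are hypothetical, and the authors' exact data pipeline is not described in the text.

```python
def make_windows(series_by_coin, target, lag):
    """Build aligned input windows and one-step-ahead targets.

    inputs[coin][i] is the i-th window of length `lag` for that coin;
    targets[i] is the value of the `target` series right after window i.
    """
    length = len(series_by_coin[target])
    if any(len(s) != length for s in series_by_coin.values()):
        raise ValueError("all series must have the same length")

    inputs = {coin: [] for coin in series_by_coin}
    targets = []
    for end in range(lag, length):
        for coin, series in series_by_coin.items():
            inputs[coin].append(series[end - lag:end])
        targets.append(series_by_coin[target][end])
    return inputs, targets


# Toy integer series so the windows are easy to read (not market data).
toy = {
    "BTC": list(range(0, 10)),
    "ETH": list(range(100, 110)),
    "XRP": list(range(200, 210)),
}
inputs, targets = make_windows(toy, target="BTC", lag=7)
```

Each entry of `inputs` would feed a separate input branch of a multi-input model, while a single-input model trained on all three series would instead receive the three windows stacked together.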
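For evaluating the regression performance of all forecasting models we utilized the performance metrics Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and coefficient of determination $R^2$, which are respectively defined by

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N} \left| y_t - \hat{y}_t \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N} \left( y_t - \hat{y}_t \right)^2}, \qquad R^2 = 1 - \frac{\sum_{t=1}^{N} \left( y_t - \hat{y}_t \right)^2}{\sum_{t=1}^{N} \left( y_t - \bar{y} \right)^2},$$

where $N$ is the number of forecasts, $y_t$ is the actual value, $\hat{y}_t$ is the predicted value and $\bar{y} = \frac{1}{N}\sum_{t=1}^{N} y_t$ is the mean of the actual values.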
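For concreteness, the three regression metrics can be computed as in the short sketch below, which uses their standard definitions; the function names and the toy values are illustrative only.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]      # toy values
predicted = [1.0, 2.0, 3.0, 6.0]   # a single error of size 2

scores = (mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

Here one large error yields MAE 0.5 but RMSE 1.0, which illustrates that RMSE weighs large deviations more heavily than MAE.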
Furthermore, for the binary classification problem of directional movement (price increase or decrease on the following day with respect to today's price), we utilized the metrics Accuracy (Acc), Geometric Mean (GM), Sensitivity (Sen) and Specificity (Spe), which are defined in terms of the following quantities: TP stands for the number of values which were correctly identified as increased, TN stands for the number of values which were correctly identified as decreased, FP (type I error) stands for the number of values which were misidentified as increased and FN (type II error) stands for the number of values which were misidentified as decreased. Moreover, the performance metric Area Under Curve (AUC), which is obtained from the Receiver Operating Characteristic (ROC) curve, was included in our experimental analysis. Notice that the ROC curve is created by plotting the true positive rate (Sensitivity) against the false positive rate (1 − Specificity) at various threshold settings.
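The directional-movement evaluation can be sketched as follows. Accuracy, Sensitivity and Specificity follow their standard definitions; the geometric mean is taken here as sqrt(Sen × Spe), which is one common definition and is an assumption, since the exact formula used in the experiments is not shown in this text and may differ (for example, a count-based variant). Function names and data are hypothetical.

```python
import math


def movement_labels(prices):
    """1 if the next day's price is higher than today's, otherwise 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(prices, prices[1:])]


def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN), treating 'increase' (1) as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Acc, Sen, Spe and an assumed GM = sqrt(Sen * Spe).

    No guard against empty classes is included in this sketch.
    """
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    return {
        "Acc": (tp + tn) / (tp + tn + fp + fn),
        "Sen": sen,
        "Spe": spe,
        "GM": math.sqrt(sen * spe),
    }


labels = movement_labels([10.0, 11.0, 10.5, 10.5, 12.0])  # toy prices

actual = [1, 1, 1, 0, 0, 0, 1, 0]       # toy ground-truth movements
predicted = [1, 1, 0, 0, 0, 1, 1, 0]    # toy model output
metrics = classification_metrics(*confusion_counts(actual, predicted))
```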
All models were trained with Root Mean Square Propagation (RMSProp) [29]. The Rectified Linear Unit (ReLU) was utilized as activation function, except for the output layer where a linear activation was used. In all layers, the kernel and bias initializers were left at their default settings, as was the recurrent initializer in the LSTM layers. Additionally, in order to avoid overfitting, we used the early stopping technique based on the validation loss.
At this point, it is worth mentioning that the performance metrics AUC and GM, as well as the balance between Sen and Spe, present the information provided by a confusion matrix in compact form; hence, they constitute the proper metrics to evaluate the ability of a model not to overfit the training data. Finally, it is worth mentioning that Model 1 and Model 2 reported better classification performance for Lag value 14, while MICDL reported similar performance for both Lag values.
The interpretation of Tables 4-6 shows that Model 1 has overfitted the training data and is not able to make reliable price movement predictions, which implies that the utilization of all cryptocurrencies in the training data has benefited the development of prediction models with better classification performance. Additionally, by comparing the performance of the proposed model MICDL with that of Model 2, we point out that the special architecture of MICDL has better exploited the training data and is able to predict price movements with higher accuracy and reliability. Finally, it is worth mentioning that MICDL considerably outperformed both Model 1 and Model 2, reporting 12.5%-54% and 4.33%-22.95% higher GM score for Lag values 7 and 14, respectively, as well as presenting the best balance between the Sen and Spe metrics.
Based on the previous analysis, we are able to conclude that the utilization of all cryptocurrencies in the training data and, most significantly, the multi-input architecture of the proposed MICDL have resulted in a forecasting model with the best regression and classification performance. Although the $R^2$ metric indicates that all models have equally fitted the training data and are able to provide accurate cryptocurrency predictions, the metrics GM, Sen and Spe reveal that the utilization of all series in the training data provided models with improved performance regarding the directional movement problem. This indicates that although Model 1 is able to predict a value close to the next value, it cannot provide any reliable information on whether the cryptocurrency price will increase or decrease the next day.
Moreover, the architecture of MICDL resulted in a model which has efficiently exploited the information provided in the training set and is able to provide accurate and reliable price movement predictions without degrading its regression performance. Next, we attempt to provide statistical evidence about the efficiency and reliability of the proposed MICDL model's forecasts. In more detail, for rejecting the hypothesis $H_0$ that all cryptocurrency models performed equally well for a given level, we utilized the non-parametric Friedman Aligned Ranking (FAR) [30] test. In addition, in order to examine whether the differences in the performance of the models are statistically significant, we applied the post-hoc Finner test [31] with significance level α = 5%.
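The ranking step behind an aligned-ranks test can be sketched as follows: each score is first centred by the mean score of its problem, all centred values are then ranked jointly, and the ranks are averaged per model. This is a simplified illustration of the ranking step only, with hypothetical names and toy scores; it does not compute the test statistic, the p-value or the Finner correction, for which a statistical package should be used.

```python
def aligned_ranks(scores):
    """Average aligned rank per model.

    scores[i][j] is the error of model j on problem i (lower is better).
    Tied aligned values share the average of the ranks they span.
    """
    n, k = len(scores), len(scores[0])

    # Centre every score by the mean of its row (the problem's "location").
    aligned = []
    for row in scores:
        location = sum(row) / k
        for j, value in enumerate(row):
            aligned.append((value - location, j))

    # Rank all n * k aligned values jointly, from smallest to largest.
    ordered = sorted(aligned, key=lambda item: item[0])
    rank_sum = [0.0] * k
    pos = 0
    while pos < len(ordered):
        end = pos
        while end + 1 < len(ordered) and ordered[end + 1][0] == ordered[pos][0]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average rank of the tied group
        for _, j in ordered[pos:end + 1]:
            rank_sum[j] += shared
        pos = end + 1

    return [total / n for total in rank_sum]


# Toy errors for 3 models on 2 problems (illustrative values only).
toy_scores = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
]
ranks = aligned_ranks(toy_scores)
```

For an error metric such as MAE, the model with the lowest average aligned rank is the best performer; for metrics where higher is better, the scores would need to be negated first.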
Tables 7, 8 and 9 report the statistical analysis, performed by nonparametric multiple comparison, relative to the MAE, RMSE and $R^2$ performance metrics. Regarding the regression performance of the evaluated models, conclusions similar to those of the previous analysis can be drawn. More specifically, the interpretation of Tables 7-9 reveals that all models performed equally well, since the differences in their regression performance are not statistically significant.
In the sequel, for examining the superiority of MICDL regarding the problem of predicting future cryptocurrency directional movement, we conduct a nonparametric multiple comparison relative to the AUC and GM metrics. Additionally, to measure the difference in the balance of the Sen and Spe metrics, we utilize a new metric defined as the product of these two metrics, i.e. Sen×Spe.
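A small worked example, with made-up values, shows why the product rewards balance: two classifiers with the same average of Sen and Spe can have very different products.

```python
def balance_score(sen, spe):
    """Product of Sensitivity and Specificity (Sen x Spe)."""
    return sen * spe


# Both hypothetical classifiers have (Sen + Spe) / 2 = 0.5.
one_sided = balance_score(0.9, 0.1)   # almost always predicts "increase"
balanced = balance_score(0.5, 0.5)    # treats both classes alike
```

The one-sided classifier scores about 0.09 while the balanced one scores 0.25, so a model that predicts nearly a single class is penalized even though its average rate is unchanged.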

Discussion, conclusions & future research
In this research, we proposed a deep neural network model based on a multi-input architecture for the prediction of cryptocurrency price and movement. The proposed prediction model uses as inputs cryptocurrency data and handles them independently, so that the information of each cryptocurrency is initially exploited and processed separately. More specifically, the data of each cryptocurrency are fed to different convolutional and LSTM layers, which are utilized for learning the internal representation and identifying short-term and long-term dependencies of each cryptocurrency, respectively. Next, the model merges the processed data obtained from the output vectors of the LSTM layers and further processes them for making the final prediction. It is worth noticing that all utilized cryptocurrency time-series were transformed based on the returns transformation in order to satisfy the stationarity property and be "suitable" for fitting the proposed model.
We conducted a comprehensive experimental analysis using a sufficient amount of cryptocurrency data from the three cryptocurrencies with the highest market capitalization, i.e. Bitcoin, Ethereum and Ripple. The detailed experimental analysis highlighted that the proposed model has the ability to efficiently exploit mixed cryptocurrency data, reduce overfitting and secure lower computational cost compared to a traditional fully-connected deep neural network, in terms of a lower number of weights (and consequently less computational time).
Based on the experimental analysis, we are able to conclude that the utilization of all cryptocurrencies in the training data but most significantly the multi-input architecture of the proposed MICDL has developed a forecasting model with the best regression and classification performance.
It is worth taking into consideration that although the regression metrics reported that all models are equivalent in performance, which has also been statistically confirmed, in practice they are not. The classification metrics, especially GM and Sen×Spe, highlighted that the utilization of all cryptocurrency data can assist the development of prediction models which exhibit better directional movement prediction, and that the architecture of the proposed model has efficiently exploited the information provided in the training data and is able to provide accurate and reliable price and movement predictions.
A common limitation of traditional approaches is that they focused on achieving better forecasting performance by exploiting more sophisticated models and techniques, usually ignoring the development of a sophisticated training dataset containing more useful information. In more detail, they treat each cryptocurrency independently, ignoring its conceivable relations with other cryptocurrencies, and they do not take into consideration the complexity and non-stationarity of cryptocurrency time-series data. In this research, we proposed a different approach and also presented a new methodology for the development of accurate and reliable forecasting models.
It is worth mentioning that, for making proper investment decisions, cryptocurrency investors and financial researchers are more interested in future cryptocurrency price movements than in knowing the exact future price [2,11]. By taking into consideration that the directional movement prediction problem is of higher significance than the price prediction problem, we can conclude that the proposed model is generally preferable for supporting policy decision-making and the analysis of cryptocurrency market behavior.
During the last decade, machine learning and deep learning have been widely adopted for assisting financial researchers and cryptocurrency investors in decision support and portfolio management. Nevertheless, a natural question which arises is "how would the widespread adoption of prediction models feed back into future predictions?" Currently, cryptocurrencies follow a random walk process [2,11,12]; however, the increasing usage of prediction models may possibly change the behavior of cryptocurrencies in the future. In other words, the increasing dependency of investors on forecasting models' predictions for portfolio optimization will ultimately affect investors' decisions and cryptocurrencies' fluctuations and prices.
In addition, the cryptocurrencies utilized in this research were selected because they constitute the cryptocurrencies with the highest market capitalization. As a result, the proposed work should be considered as a first approach for obtaining better forecasting performance regarding future cryptocurrency prediction. Clearly, the proposed methodology could be extended with the adoption of more cryptocurrencies. Such an extension could introduce new criteria, which may conceivably influence and improve the forecasting performance; thus, more experiments are certainly needed and this is our major concern for future research. Moreover, one significant issue which should also be thoroughly investigated in the future is the adoption of cryptocurrency information such as average daily price, open daily price, close daily price, high and low daily prices, as well as the daily volume of trades or even economic and technical trading indicators [32,33]. However, the arising questions "which cryptocurrencies are more correlated" and "which features have greater impact on price prediction" are still under consideration. Furthermore, another interesting direction for future research could be the evaluation of the proposed model on high-frequency data.
Finally, since our experiments are quite encouraging, a promising idea is to enhance the proposed MICDL model with sophisticated pre-processing techniques based on moving average and exponential smoothing.

The model proposed by Patel et al. [19] for the Litecoin and Monero cryptocurrencies is based on a recurrent neural network architecture which utilizes LSTM and GRU layers. The data in their study contained daily Litecoin data from Aug-24, 2016 to Feb-23, 2020 and Monero data from Jan-30, 2015 to Feb-23, 2020, concerning average price, open price, close price, high and low prices as well as the volume of trades. The reported experiments demonstrated that the proposed hybrid model outperforms traditional LSTM networks, exhibiting some promising results.

For evaluation purposes, the cryptocurrency data were divided into a training set, a validation set and a testing set. More analytically, the training set comprised daily data from Jan-01, 2017 to Feb-18, 2020 (1153 datapoints), the validation set covered Mar-01, 2020 to May-31, 2020 (94 datapoints), while the testing set consisted of data from Jun-01, 2020 to Oct-31, 2020 (152 datapoints), which ensured a considerable amount of unseen out-of-sample datapoints for testing. Finally, it is worth noticing that all utilized datasets contained values that include the recent COVID-19 crisis in the beginning of 2020, which are characterized by considerable volatility and deviations from the regular behavior, as well as structural breaks.
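A chronological split of this kind, in which no shuffling takes place and later data never leak into earlier sets, can be sketched as below. The function name, the ten-day toy series and the two boundary dates are hypothetical and chosen only to keep the example small; they are not the boundaries used in the experiments.

```python
from datetime import date, timedelta


def chronological_split(records, train_end, valid_end):
    """Split (date, value) records by date, without shuffling.

    Records up to `train_end` form the training set, records after it and
    up to `valid_end` form the validation set, and the rest is the test set.
    """
    train = [r for r in records if r[0] <= train_end]
    valid = [r for r in records if train_end < r[0] <= valid_end]
    test = [r for r in records if r[0] > valid_end]
    return train, valid, test


# Ten consecutive toy days with made-up values.
start = date(2020, 1, 1)
records = [(start + timedelta(days=i), 0.01 * i) for i in range(10)]

train, valid, test = chronological_split(
    records,
    train_end=date(2020, 1, 6),
    valid_end=date(2020, 1, 8),
)
```

Keeping the validation period strictly between the training and testing periods matches the use of early stopping on the validation loss while leaving the test period fully out of sample.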

Figure 2 presents the daily price of the cryptocurrencies BTC, ETH and XRP. The interpretation of Figure 2 reveals that Ripple does not have as large variability as Bitcoin and Ethereum.

Tables 4, 5 and 6 summarize the performance of all forecasting models on BTC, ETH and XRP data, respectively. Clearly, all models exhibited similar performance regarding the metrics MAE, RMSE and R². By comparing the performance of Model 2 and MICDL with that of Model 1, we conclude that utilizing all three series in the training data did not yield a forecasting model with better regression performance; all models reported almost identical regression performance. In contrast, regarding the classification problem of forecasting the price movement, both MICDL and Model 2 considerably outperformed Model 1 for all Lag values and cryptocurrencies. More specifically, for Lag value 7, Model 1 reported GM scores of 20.058, 25.77 and 22.031 for BTC, ETH and XRP, while Model 2 reported 26.567, 23.961 and 22.274 and MICDL reported 30.886, 29.582 and 25.053, respectively. For Lag value 14, Model 1 exhibited GM scores of 27.962, 27.888 and 22.418 for BTC, ETH and XRP, while Model 2 exhibited 23.727, 27.930 and 23.351 and MICDL exhibited 29.173, 30.461 and 26.157, respectively. Additionally, both MICDL and Model 2 reported a better balance between the Sen and Spe metrics than Model 1 for all cryptocurrencies and Lag values.
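As an illustration of the classification metrics discussed above, the following sketch computes Sen, Spe, their product and GM from binary up/down labels. It assumes the standard definitions (Sen = TP/(TP+FN), Spe = TN/(TN+FP), GM = sqrt(Sen × Spe)); the function name and the toy labels are hypothetical, and the values are returned on a 0 to 1 scale, which may differ from the scale used in the tables.

```python
import math


def movement_metrics(y_true, y_pred):
    """Sensitivity, specificity, their product and geometric mean.

    Labels are 1 for an upward and 0 for a downward price movement (assumed encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    spe = tn / (tn + fp) if (tn + fp) else 0.0
    return {"Sen": sen, "Spe": spe, "SenxSpe": sen * spe, "GM": math.sqrt(sen * spe)}


# Toy labels, not taken from the experiments.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]
scores = movement_metrics(y_true, y_pred)
print(scores)
```

Because GM and Sen×Spe both collapse toward zero when either class is predicted poorly, they penalize a model that always forecasts the same direction, unlike plain accuracy.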
It is worth mentioning that Ripple differs from the other two cryptocurrencies in that it is not mineable (it is pre-mined) and exhibits small variability. Nevertheless, since Ripple is highly ranked in market capitalization, it is traditionally included in most research works on the cryptocurrency market. Furthermore, Table 1 illustrates the descriptive statistics, including Mean, Median, Maximum, Minimum, Standard Deviation (Std. Dev.), Skewness and Kurtosis, for each cryptocurrency and the CCi30 index, while Table 2 summarizes the up and down movements in the prices and the corresponding percentages.

Table 1. Descriptive statistics for BTC, ETH and XRP data

Table 2. The number of up and down movements of BTC, ETH and XRP data

Table 3 presents the t-statistics and the associated p-values of the Augmented Dickey-Fuller (ADF) test [27,28] performed on the levels (Levels) of the cryptocurrency series as well as on the corresponding transformed time-series. Notice that (*) denotes statistical significance at the 5% critical level. The interpretation of Table 3 reveals that the BTC, ETH and XRP time-series possess a unit root, which implies that these series are non-stationary. Additionally, the corresponding p-values of the transformed series are practically zero, which denotes that they satisfy the stationarity property and are "suitable" for fitting a deep learning model.
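The logic of the unit-root check can be illustrated with a simplified sketch. The code below is not the ADF test of [27,28]: it fits the plain Dickey-Fuller regression Δy_t = a + g·y_{t-1} + e_t without lagged differences and returns the t-statistic of g. The simulated random walk, the use of first differences as the transformation, and the approximate 5% asymptotic critical value of -2.86 (regression with a constant) are assumptions made for illustration only.

```python
import math
import random


def dickey_fuller_tstat(y):
    """t-statistic of g in  dy_t = a + g * y_{t-1} + e_t  (no lagged differences)."""
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    g = sxd / sxx                      # OLS slope
    a = mean_d - g * mean_x            # OLS intercept
    rss = sum((di - a - g * xi) ** 2 for xi, di in zip(x, d))
    se_g = math.sqrt(rss / (n - 2) / sxx)
    return g / se_g


# Simulated random walk standing in for a price series (not real data).
random.seed(0)
levels = [100.0]
for _ in range(1000):
    levels.append(levels[-1] + random.gauss(0.0, 1.0))

# First differencing is used here only as an example transformation.
diffs = [b - a for a, b in zip(levels[:-1], levels[1:])]

CRITICAL_5PCT = -2.86  # approximate asymptotic value, regression with constant
t_levels = dickey_fuller_tstat(levels)
t_diffs = dickey_fuller_tstat(diffs)
print("levels:", round(t_levels, 3), "differences:", round(t_diffs, 3))
```

A t-statistic below the critical value rejects the unit-root hypothesis, which is what one expects for the differenced series; in practice a full ADF implementation with lag selection from a statistics library should be preferred.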

Table 3. ADF unit root test of all cryptocurrency time-series

Table 4. Performance of the evaluated models for all BTC data.

Table 5. Performance of the evaluated models for all ETH data.

Table 6. Performance of the evaluated models for all XRP data.

It is worth noticing that the AUC, GM and Sen×Spe metrics evaluate the ability of a model to avoid overfitting the training data, presenting the information provided by a confusion matrix in compact form.

Table 7. FAR test and Finner post hoc test based on MAE metric

Table 8. FAR test and Finner post hoc test based on RMSE metric

Table 9. FAR test and Finner post hoc test based on R² metric

Tables 10, 11 and 12 present the statistical analysis, performed by nonparametric multiple comparison, relative to the AUC, GM and Sen×Spe metrics. More specifically, the interpretation of Tables 10-12 provides statistical evidence that MICDL outperformed both Model 1 and Model 2 and provides more reliable forecasts.
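For readers unfamiliar with the post hoc step, the sketch below shows how Finner-adjusted p-values are commonly computed from the unadjusted p-values of the comparisons against a control method. It follows the usual textbook formulation, APV_i = min{1, max_{j≤i} [1 − (1 − p_j)^{m/j}]} with the p-values sorted in ascending order and m the number of comparisons. The function name and the input p-values are hypothetical, and the Friedman Aligned Ranks statistic itself is not computed here.

```python
def finner_adjust(pvalues):
    """Finner step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        candidate = 1.0 - (1.0 - pvalues[idx]) ** (m / rank)
        running_max = max(running_max, candidate)       # enforce monotonicity
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# Hypothetical unadjusted p-values for two comparisons against a control model.
raw = [0.04, 0.01]
adj = finner_adjust(raw)
print(adj)
```

A comparison is declared significant at level α when its adjusted p-value does not exceed α, so the adjustment controls the error rate across all comparisons rather than for each one separately.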

Table 10. FAR test and Finner post hoc test based on AUC metric

Table 11. FAR test and Finner post hoc test based on GM metric

Table 12. FAR test and Finner post hoc test based on Sen×Spe metric