GJR-GARCH Volatility Modeling under NIG and ANN for Predicting Top Cryptocurrencies

Abstract: Cryptocurrencies are currently traded worldwide, with hundreds of different currencies in existence and even more on the way. This study implements some statistical and machine learning approaches for cryptocurrency investments. First, we implement GJR-GARCH over the GARCH model to estimate the volatility of ten popular cryptocurrencies based on market capitalization: Bitcoin, Bitcoin Cash, Bitcoin SV, Chainlink, EOS, Ethereum, Litecoin, TETHER, Tezos, and XRP. Then, we use Monte Carlo simulations to generate the conditional variance of the cryptocurrencies using the GJR-GARCH model, and calculate the value at risk (VaR) of the simulations. We also estimate the tail-risk using VaR backtesting. Finally, we use an artificial neural network (ANN) for predicting the prices of the ten cryptocurrencies. The graphical analysis and mean square errors (MSEs) from the ANN models confirm that the predicted prices are close to the market prices. For some cryptocurrencies, the ANN models perform better than traditional ARIMA models.


Introduction
A cryptocurrency is a digital currency used as a medium of exchange. It combines three attributes: anonymity, freedom from central authority, and protection against double-spending attacks; Lánskỳ (2016). Cryptocurrency, also known as crypto, is one of the fastest-growing asset classes, with a total market capitalization of $761.46 billion as of December 2020. One underlying reason is that cryptocurrency offers hedging properties when combined with traditional asset classes such as stocks and bonds. The hedging benefit is more pronounced in developed markets. Prior studies of hedging and diversification benefits have focused largely on Bitcoin: Bitcoin is appealing for diversification purposes Shahzad et al. (2021); Bitcoin is a strong hedge and safe-haven against some categorical economic policy uncertainty metrics, including fiscal policy, taxes, national security, and trade policy under bullish market conditions Mokni et al. (2021); Bitcoin is a good hedging instrument during high uncertainty periods Mokni et al. (2020); Bitcoin can provide diversification benefits while optimizing the risk and return with a new approach Hatemi-J et al. (2019); Bitcoin is a good hedge against both newspaper and internet-search-based measures of economic uncertainty Bouri and Gupta (2019). Cryptocurrency is a digital or virtual currency with no regulatory council and, to date, without any government intervention, which is itself a risk factor for cryptocurrency investment. Moreover, there are no indices to measure the volatility of cryptocurrency returns. Therefore, it is essential to understand the risk-return properties of cryptocurrency. It is also necessary for investors to be able to predict future prices.
The main objective of this study is to offer an appropriate model for estimating volatility and predicting the price of cryptocurrency as the risk-return tradeoff of cryptocurrency is distinct. It can only be explained by cryptocurrency market-specific factors Liu and Tsyvinski (2018).
The prior literature has widely studied volatility and price prediction using different models; however, these studies mostly include asset classes other than cryptocurrency. Borri (2019) examines the volatility of four cryptocurrencies using GARCH models. However, one of the key caveats of estimating volatility for cryptocurrency is to account for fat-tails. As a result, we apply the GJR-GARCH model against the GARCH model in our study. Besides, there has been limited research in price prediction for cryptocurrency. Therefore, we apply artificial neural networks (ANNs) for predicting the price of the cryptocurrency.
In this study, we assess risk and forecast price using ANNs for ten cryptocurrencies selected based on market capitalization. Our sample period is from 11 November 2018 to 31 December 2020. Initially, we identify an appropriate time series model, GJR-GARCH Glosten et al. (1993), to forecast volatility and estimate VaR, which allows us to measure risk more accurately than the GARCH Bollerslev (1986) model. The existing literature provides evidence that cryptocurrencies suffer from extreme tail-risk Borri (2019); Feng et al. (2018). Using the GJR-GARCH model at the 1% significance level, we confirm that cryptocurrencies suffer from tail-risk and show that Bitcoin SV and Chainlink have the highest tail-risks of −19.78% and −15.57%, respectively. In contrast, TETHER suffers from the least tail-risk among our selected cryptocurrencies. Value at risk (VaR) backtesting Danielsson (2011) is a statistical risk management tool that helps investors to monitor and quantify the risk level associated with an investment portfolio. We perform VaR backtesting Nieppola (2009) to examine the accuracy of the VaR models. Then, we apply ANNs McCulloch and Pitts (1943), which are considered one of the most promising methods for time series prediction, to predict price in a complex environment. Most prior studies implement ANNs in predicting stock prices Hassan et al. (2007); White (1988). Some other studies implement this technique in predicting stock indices Yao et al. (1999) and the credit rating process Hájek (2011). We show that ANNs can forecast next-day cryptocurrency prices with higher precision.
In this paper, we try to answer three essential questions for cryptocurrency investment: (1) Does GJR-GARCH perform better in estimating volatility than the GARCH model? (2) Does VaR Backtesting show any interesting results for top cryptocurrencies? (3) Does an ANN perform better than traditional ARIMA models for predicting prices of cryptocurrencies? This study contributes to the growing literature of cryptocurrencies as follows. First, the study contributes to tail-risk literature by providing a well-established methodology, the GJR-GARCH model. Second, we estimate confidence intervals for risk and returns and compare them with simulated data by Markov Chain Monte Carlo. Third, value at risk backtesting is used to measure the accuracy of the value at risk calculations, which determines how well the mentioned cryptocurrency investment strategy would perform using historical data. Finally, this study contributes to the literature of ANNs by showing the application in predicting prices of cryptocurrencies.

Literature Review
There has been a sufficient amount of research regarding measuring risk in finance. The GARCH model has been widely implemented to estimate volatility in different types of security markets. Moschini and Myers (2002) developed a multivariate generalized ARCH (GARCH) parameterization for the optimal futures hedge ratio using corn prices. Alberg et al. (2008) implemented different GARCH models in the Tel Aviv stock exchange to estimate conditional variance. De Goeij and Marquering (2004) found strong evidence of conditional heteroskedasticity in the covariance between stock and bond market returns using the GARCH model. However, the GARCH model may not be appropriate if the asset returns suffer from fat-tail risk. In such cases, the GJR-GARCH model is more suited as it accounts for fat-tail distribution better than the GARCH model. Therefore, many other studies implement the GJR-GARCH models. Ma et al. (2020) examined whether gold can be used as a safe haven for stocks. They used the GJR-GARCH model to estimate the extreme tail risk. Moreover, Maciel (2013) applied a similar methodology in the stock market. As cryptocurrencies suffer from fat-tail risk, we prefer to implement the GJR-GARCH model to accurately estimate the volatility of cryptocurrencies.
However, price predictions were mainly used for the stock market. The first significant study of neural network models used IBM's daily stock returns for prediction White (1988). After that, significant research was performed to check neural networks' accuracy of prediction to forecast the stock market. Hassan et al. (2007) proposed a fusion model by combining the hidden Markov model (HMM), an artificial neural network (ANN), and genetic algorithms (GA) to forecast financial market behavior. Adebiyi et al. (2014) compared the forecasting performance by ARIMA and artificial neural networks for stock data. Some other studies used ANNs to predict the stock market index Moghaddam et al. (2016); Yao et al. (1999). Other than predicting stock prices, some studies implemented ANNs in different fields of finance. Previous studies have shown that ANNs exhibit better performance in bankruptcy prediction Quah and Srinivasan (1999) and the credit rating process Hájek (2011). Previous studies have applied different methodologies such as ordinary least squares (OLS) regression, support vector regression (SVR), and least absolute shrinkage and selection operator (LASSO) techniques from the field of machine learning to predict the price of cryptocurrency Plakandaras et al. (2021). Therefore, we present ANNs as an alternative mechanism for predicting cryptocurrency prices. Motivated by prior studies, we implement the ANN models in a relatively new financial market, which is cryptocurrency.

Methodology
In this section, we will explain data selections and methods that have been used to analyze the data. First, we describe the data source, how the cryptocurrencies were selected, and some properties of the data. Second, we describe the ARIMA process and its subsection describes robust model selection. Third, we present the GJR-GARCH and GARCH for volatility, and perform the back testing of the models. Finally, we describe how to use the supervised machine learning approach, the ANN model, for predicting cryptocurrency prices.

Data Selections
We collected daily prices, dollar volume, and market capitalization for 2500 cryptocurrencies from coingecko.com (accessed on 31 January 2021). Then, we sorted all the cryptocurrencies based on their market capitalization. Finally, we chose the top ten cryptocurrencies from the sorted list. The top ten selected cryptocurrencies were Bitcoin, Bitcoin Cash, Bitcoin SV, Chainlink, EOS, Ethereum, Litecoin, TETHER, Tezos, and XRP. The combined market capitalization of our selected cryptocurrencies covers around 80 percent of the coingecko-based market. Our sample period ran from 11 November 2018 to 31 December 2020 with a total of 781 trading days. Since the starting dates of our selected cryptocurrencies were not the same, we chose a starting date of 11 November 2018, which was the first trading date of the most recently launched cryptocurrency in our sample. Traditional markets typically have 252 business days per year; a cryptocurrency investor, however, can invest and exercise on every single day of the year, that is, 365 days a year.

Exploratory Data Analysis
Exploratory data analysis has been applied to summarize the main characteristics of the cryptocurrencies. We plot the prices and tabulate summary statistics to give a quick description of the data.
In Figure 1, prices are scaled as follows: Bitcoin is divided by 10; XRP is multiplied by 1000; Chainlink, EOS, TETHER and Tezos are multiplied by 100; Litecoin is multiplied by 10. Data are daily prices of the cryptocurrencies from 11 November 2018 to 31 December 2020. Figure 1 demonstrates a clear surge in the prices of the cryptocurrencies. In this Figure, we see that most of the cryptocurrencies have either an upward or a downward trend except for Chainlink, TETHER, and XRP, which seem less volatile.  Table 1 presents descriptive statistics for ten cryptocurrencies where results have been calculated based on the daily log returns of the assets. The mean daily log returns of the cryptocurrencies are between −0.24% and 0.30%. From the sample period, Chainlink observed the highest return whereas Bitcoin Cash observed the lowest return. In terms of volatility as measured by standard deviation, Bitcoin SV was the most volatile cryptocurrency whereas TETHER was the least volatile cryptocurrency. We use the heatmap in Figure 2 to understand the correlation between the ten cryptocurrencies, which assists us in determining risk in further analysis. The heatmap shows that TETHER is uncorrelated with the other nine cryptocurrencies of our sample.
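The daily log returns underlying the descriptive statistics can be computed as in the following minimal sketch. The price list is hypothetical and is used only for illustration; it is not data from the study, and the function name `log_returns` is our own.

```python
import math
import statistics

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices, for illustration only (not data from the study).
prices = [100.0, 102.0, 99.0, 101.0, 105.0]
returns = log_returns(prices)

mean_return = statistics.fmean(returns)   # average daily log return
volatility = statistics.stdev(returns)    # sample standard deviation
print(f"mean = {mean_return:.4%}, std = {volatility:.4%}")
```

Log returns are additive over time, so the sum of the daily values equals the log return over the whole window.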

Higher Moments, Correlations, and Value at Risk
For each cryptocurrency, we measure the degree of asymmetry of the probability distribution through skewness. Because the normal distribution has zero skewness, we use skewness to assess whether each distribution is normal or not. If the daily prices are skewed, a model that assumes normality will always underestimate skewness risk in its predictions. The more skewed the price, the less accurate the financial model will be. Mathematically, we calculate the skewness as:

$$\text{Skewness} = \frac{E\left[(x - \mu)^3\right]}{\sigma^3}.$$

Kurtosis measures extreme values in either tail of a fitted distribution. We use it to see whether distributions with large kurtosis exhibit tail data exceeding the tails of the normal distribution. Kurtosis risk reflects occasional extreme returns. Mathematically, kurtosis is evaluated by the following formula:

$$\text{Kurtosis} = \frac{E\left[(x - \mu)^4\right]}{\sigma^4},$$

where x = cryptocurrency returns, µ = mean, σ = standard deviation. Excess kurtosis is the kurtosis minus 3, the value for the normal distribution. We find positive excess kurtosis because the distribution of each cryptocurrency has many outliers (we also checked Q-Q plots), which produces fat tails relative to the bell-shaped curve. Log prices of all cryptocurrencies have been used to calculate the correlation. We use the following formula:

$$R = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}},$$

where R = correlation coefficient, $x_i$ = values of the x-variable in a sample, $\bar{x}$ = mean of the values of the x-variable, $y_i$ = values of the y-variable in a sample, $\bar{y}$ = mean of the values of the y-variable. Value at risk, VaR, is applied to the cryptocurrency returns to quantify losses in the left tail of the return distribution at a given significance level. The following idea has been used to find VaR:

$$\text{VaR} = \left[\text{expected weighted return of the portfolio} - \left(z\text{-score of the confidence interval} \times \text{standard deviation of the portfolio}\right)\right] \times \text{portfolio value}.$$
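The moment, correlation, and VaR calculations described above can be sketched as follows. This is an illustrative implementation under stated assumptions: population (divide-by-n) moments are used, the VaR function takes the z-score from a standard normal distribution as in the verbal formula, and all function names are our own rather than the study's.

```python
import math
from statistics import NormalDist, fmean

def moments(x):
    """Return (mean, std, skewness, excess kurtosis) using population moments."""
    n = len(x)
    mu = fmean(x)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    skew = sum(((v - mu) / sigma) ** 3 for v in x) / n
    kurt = sum(((v - mu) / sigma) ** 4 for v in x) / n
    return mu, sigma, skew, kurt - 3.0   # subtract 3 to get excess kurtosis

def pearson(x, y):
    """Pearson correlation coefficient R between two equal-length samples."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gaussian_var(mu, sigma, level=0.99, value=1.0):
    """VaR = (expected return - z * std) * portfolio value (normal z-score)."""
    z = NormalDist().inv_cdf(level)
    return (mu - z * sigma) * value

sample = [-2.0, -1.0, 0.0, 1.0, 2.0]      # hypothetical, symmetric sample
print(moments(sample))                    # zero skewness, negative excess kurtosis
print(gaussian_var(0.0, 1.0, level=0.99)) # left-tail threshold for N(0, 1)
```

With this sign convention the VaR is a negative return, which matches the way tail-risk figures are quoted elsewhere in the paper.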

ARIMA Model for Cryptocurrency Prediction
For time series analysis, stationarity is a very important property. However, most real-world data, especially market data (stocks, cryptocurrencies, and other derivatives), are not stationary. To make the data stationary, we difference the data. A non-stationary series $X_t$ follows an Autoregressive Integrated Moving Average process, ARIMA(p, d, q), if

$$\varphi(B)\,(1 - B)^d X_t = \theta(B)\,\varepsilon_t,$$

where B is the backshift operator that transforms a variable into its lagged version, $B X_t = X_{t-1}$, d is the order of differencing, and $\varepsilon_t$ is the error term. Here

$$\varphi(B) = 1 - \varphi_1 B - \cdots - \varphi_p B^p, \qquad \theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q,$$

where p, q are the autoregression and moving average orders, respectively, which are determined from the analysis of the autocorrelation of the series. φ and θ are the autoregression and moving average parameters, respectively, which we estimate from the model based on the values of p, d, and q.
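The differencing step, that is, applying the operator (1 − B) d times, can be illustrated with a short sketch. The function name `difference` and the example series are our own and purely illustrative.

```python
def difference(series, d=1):
    """Apply the differencing operator (1 - B) to a series d times."""
    out = list(series)
    for _ in range(d):
        # (1 - B) X_t = X_t - X_{t-1}; each pass shortens the series by one.
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out

# A quadratic trend becomes constant after two differences.
quadratic = [1, 4, 9, 16, 25]
print(difference(quadratic, d=1))  # first differences
print(difference(quadratic, d=2))  # second differences
```

A series that needs d differences to become stationary corresponds to the middle parameter of ARIMA(p, d, q).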

Robust Model Selection
To identify the appropriate model for this study, we initially look at the AIC Akaike (1974) to choose which ARIMA(p, d, q) is the best fit for our prediction. This strategy has been followed for all ten cryptocurrencies. The Box-Jenkins Box et al. (2015) method has been applied to cryptocurrency returns to fit the ARIMA model. We test normality with Q-Q plots, find non-normal behavior, and use the NIG distribution instead (see Figure 3). To account for the leverage effect, we use GJR-GARCH under the NIG distribution instead of general GARCH models.
Figure 3 illustrates why a hyperbolic distribution and asymmetric GARCH modelling are necessary. We present results for Bitcoin only in Figure 3 to avoid redundancy; the same results were found for the other nine cryptocurrencies. The empirical and theoretical densities reveal heavy tails, and the empirical and theoretical CDFs show similar results. We also show Q-Q and P-P plots, which confirm our assumption of non-normality. We fit several ARIMA(p, d, q) models and then compare the performance of each model in forecasting future values. To fit these models, we needed to make sure that the return was a stationary series. We used the Augmented Dickey Fuller (ADF) Enders (2008) test to check the stationarity of the series. The null hypothesis was "the series is not stationary".
From Table 2 we see that the p-value from the ADF test of the original series is greater than 0.05 except for TETHER and XRP, for which it is 0.01 and 0.013, respectively. So, for the remaining cryptocurrencies, at the 5% level of significance we do not have sufficient evidence against the null hypothesis that the original series is non-stationary. The null hypothesis for the KPSS test is that the data are stationary. For this test, we do not want to reject the null hypothesis. Except for TETHER, all cryptocurrencies show p-values less than or equal to 0.01 in the KPSS tests. As a result, we took the first or second difference, depending on the data, and ran different models. The best ARIMA(p, d, q) model was chosen based on the criterion that a lower AIC (Akaike's Information Criterion) indicates a better model, as shown in Table 3. However, some values of the parameters p, d, q give smaller AIC values but the corresponding models were found to overfit. For example, ARIMA(1, 1, 2) has the most negative AIC value for Bitcoin, but ARIMA(0, 2, 1) was found to make the best prediction. Similarly, all overfitted cases were excluded and the best models are shown in bold font (see Table 3). Next, we wanted to check for an ARCH effect in our data, so we ran an ARCH-LM test. Table 4 shows the results of the ARCH-LM test. The null hypothesis in this test is that there is no ARCH effect, while the alternative hypothesis is that there is an ARCH effect. Although we show both chi-squared values and p-values, we made the decision to reject the null hypothesis based on p-values. The results confirm the existence of heteroskedasticity, that is, ARCH effects, in our data. The ARCH model defines the variance of the current error term or innovation as a function of the actual sizes of the error terms from prior time periods.
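The idea behind the ARCH-LM test can be sketched as follows: regress the squared residuals on their own lags and compare the statistic n·R² with a chi-squared distribution. The sketch below assumes a single lag (the lag order used in the study is not stated), and the function name and residual series are hypothetical.

```python
import math

def arch_lm_lag1(residuals):
    """ARCH-LM test with one lag: LM = n * R^2 from regressing e_t^2 on e_{t-1}^2.

    Returns (LM statistic, p-value). Under the null of no ARCH effect,
    LM is asymptotically chi-squared with 1 degree of freedom.
    """
    sq = [e * e for e in residuals]
    y, x = sq[1:], sq[:-1]
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("squared residuals have zero variance")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r_squared = (sxy * sxy) / (sxx * syy)  # R^2 of a one-regressor OLS fit
    lm = n * r_squared
    # Survival function of chi-squared(1): P(Z^2 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

# Hypothetical residuals: a calm regime followed by a turbulent one.
calm = [0.1 * (-1) ** t for t in range(20)]
turbulent = [2.0 * (-1) ** t for t in range(20)]
lm, p = arch_lm_lag1(calm + turbulent)
print(f"LM = {lm:.2f}, p-value = {p:.2e}")
```

A small p-value leads to rejecting the null of no ARCH effect, which is the decision rule described in the text.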

Volatility Modeling with GJR-GARCH
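Let $r_t$ denote the log return of a portfolio between periods $t-1$ and $t$, and let $\mathcal{F}_t$ denote the information filtration generated by these returns. We have

$$r_t = \mu_t + e_t, \qquad e_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i e_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2,$$

where $\mu_t$ and $\sigma_t^2$ are the conditional mean and variance of $r_t$ given $\mathcal{F}_{t-1}$, and $\varepsilon_t$ are independent and identically distributed random variables with NIG distribution. Parameter constraints are

$$\alpha_0 > 0, \qquad \alpha_i \ge 0, \qquad \beta_j \ge 0, \qquad \sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j < 1.$$

The unobserved volatility series can be estimated by conditional variance models. Engle (1982) introduced the Autoregressive Conditional Heteroskedasticity (ARCH) model to forecast the variance of a time series. The ARCH model assumes that, just as the level of a regular AR process depends on its own past values, the variance of the current error term depends on the sizes of previous error terms.
Bollerslev (1986) extended the ARCH model and developed the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model. A GARCH(1, 1) specification, with three parameters, can describe complex volatility structures and is sufficient for most applications. We can forecast future volatility by using the GARCH model below:

$$\hat{\sigma}_{t+\tau}^2 = \sigma^2 + (\alpha_1 + \beta_1)^{\tau - 1}\left(\sigma_{t+1}^2 - \sigma^2\right),$$

where $\sigma^2 = \alpha_0 / (1 - \alpha_1 - \beta_1)$ is the unconditional variance of the innovations $e_t$. Note that because $\alpha_1 + \beta_1 < 1$, as $\tau \to \infty$ we obtain $\hat{\sigma}_{t+\tau}^2 \to \sigma^2$. So, the volatility forecast converges asymptotically to the unconditional variance. Returns of an asset have positive excess kurtosis. Thus, we consider a fat-tailed distribution for the innovations.
The GJR-GARCH model differs slightly from the original ARCH or GARCH model. We consider GJR-GARCH(p, q) with

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i e_{t-i}^2 + \sum_{k} \gamma_k I_{t-k}\, e_{t-k}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \qquad I_{t-k} = \begin{cases} 1 & \text{if } e_{t-k} < 0, \\ 0 & \text{otherwise,} \end{cases}$$

where a coefficient $\gamma_k > 0$ produces extra volatility for negative innovations, which captures the asymmetric impact on volatility. The leverage order is automatically set equal to 1, and all parameter constraints are similar to those of the GARCH(p, q) model. Barndorff-Nielsen (1977) proposed the NIG distribution, a hyperbolic-type distribution whose tails accommodate more extreme observations than a Gaussian distribution. The probability density function of the random variable G can be defined as

$$f(x; \alpha, \kappa, \mu, \delta) = \frac{\alpha \delta}{\pi} \exp\left(\delta \sqrt{\alpha^2 - \kappa^2} + \kappa (x - \mu)\right) \frac{C_1\left(\alpha\, q(x)\right)}{q(x)},$$

where $q(x) = \sqrt{(x - \mu)^2 + \delta^2}$, $\alpha > 0$ determines the shape, $\delta > 0$ is the scaling parameter, $\mu \in (-\infty, \infty)$ is the location, and $0 \le |\kappa| \le \alpha$ determines the skewness. $C_1(x)$ is the modified Bessel function of the second kind with index one. We consider all innovations under NIG.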
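To make the asymmetric response concrete, the following sketch runs the conditional variance recursion of a GJR-GARCH(1, 1) model with leverage order 1. The parameter values are hypothetical, chosen only for illustration, and are not estimates from the study; the function and argument names are our own.

```python
def gjr_garch_variance(residuals, alpha0, alpha1, gamma1, beta1, sigma2_0):
    """One-step-ahead conditional variances of a GJR-GARCH(1, 1) model.

    sigma2[t+1] = alpha0 + (alpha1 + gamma1 * I_t) * e_t^2 + beta1 * sigma2[t],
    where I_t = 1 if e_t < 0 and 0 otherwise. The returned list has
    len(residuals) + 1 entries; the last one is the forecast for the next period.
    """
    sigma2 = [sigma2_0]
    for e in residuals:
        indicator = 1.0 if e < 0 else 0.0   # leverage term switches on for bad news
        next_var = alpha0 + (alpha1 + gamma1 * indicator) * e * e + beta1 * sigma2[-1]
        sigma2.append(next_var)
    return sigma2

# Hypothetical parameters (not estimates from the study).
params = dict(alpha0=1e-5, alpha1=0.05, gamma1=0.10, beta1=0.90, sigma2_0=4e-4)

after_gain = gjr_garch_variance([0.02], **params)[-1]
after_loss = gjr_garch_variance([-0.02], **params)[-1]
print(after_gain, after_loss)  # the loss of equal size raises variance more
```

Setting `gamma1=0` recovers the symmetric GARCH(1, 1) recursion, which is one way to compare the two specifications on the same residuals.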

Confidence Bound by Markov Chain Monte Carlo Simulation
Furthermore, we consider independent random draws from a specified probabilistic model, each giving a realization of an entire sample path of length K, $y_1, y_2, \cdots, y_K$. For a large number of simulated draws, we have M sample paths, each of length K. We use Monte Carlo simulation Hertz (1964) to generate the next conditional variance by applying GJR-GARCH. Because the Monte Carlo method relies on a finite number of simulations, it carries some error, which can be controlled by increasing the number of paths M. We estimate the probability of a future event using the sample proportion of the event occurrence across the M simulations:

$$\hat{p} = \frac{C}{M},$$

where C is the number of times the event occurs in the M simulated sample paths. Finally, we estimate the Monte Carlo error of the simulated paths.
We plotted the average and the 97.5% and 2.5% percentiles of the simulated paths and compared the simulation statistics to the original data.
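The simulation procedure can be sketched as below. This is a minimal illustration under stated assumptions: a GJR-GARCH(1, 1) recursion with hypothetical parameters, standard normal innovations as a stand-in for NIG draws (the standard library has no NIG sampler), and the usual binomial standard error for the event proportion. All names are our own.

```python
import math
import random
from statistics import fmean

def simulate_gjr_paths(n_paths, horizon, alpha0, alpha1, gamma1, beta1, sigma2_0, seed=0):
    """Simulate M = n_paths return and conditional-variance paths of length K = horizon."""
    rng = random.Random(seed)
    returns, variances = [], []
    for _ in range(n_paths):
        s2, r_path, v_path = sigma2_0, [], []
        for _ in range(horizon):
            e = math.sqrt(s2) * rng.gauss(0.0, 1.0)  # assumption: Gaussian, not NIG
            r_path.append(e)
            v_path.append(s2)
            leverage = gamma1 if e < 0 else 0.0
            s2 = alpha0 + (alpha1 + leverage) * e * e + beta1 * s2
        returns.append(r_path)
        variances.append(v_path)
    return returns, variances

def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def percentile_band(paths, lower=0.025, upper=0.975):
    """Per-period (2.5% percentile, mean, 97.5% percentile) across simulated paths."""
    band = []
    for t in range(len(paths[0])):
        column = [p[t] for p in paths]
        band.append((percentile(column, lower), fmean(column), percentile(column, upper)))
    return band

def event_probability(return_paths, threshold):
    """Estimate P(some daily return < threshold) as C / M, with its standard error."""
    m = len(return_paths)
    c = sum(1 for path in return_paths if min(path) < threshold)
    p_hat = c / m
    return p_hat, math.sqrt(p_hat * (1.0 - p_hat) / m)

rets, variances = simulate_gjr_paths(500, 30, 1e-5, 0.05, 0.10, 0.85, 4e-4, seed=42)
print(percentile_band(variances)[-1])
print(event_probability(rets, threshold=-0.05))
```

Because the standard error shrinks like 1/sqrt(M), quadrupling the number of paths roughly halves the Monte Carlo error.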

Calculation of Sharpe Ratio
We use the Sharpe ratio for calculating risk-adjusted returns. It is used to evaluate a portfolio's past performance (ex-post), where actual returns are used in the formula. The Sharpe ratio measures the performance of an investment as the excess return per unit of risk (measured by standard deviation) of that investment. Higher Sharpe ratios suggest better performance. The Sharpe ratio, $S_p$, is calculated as follows:

$$S_p = \frac{r_c - r_f}{\sigma_c},$$

where $r_c$ and $\sigma_c$ represent the return and standard deviation of the cryptocurrencies and $r_f$ represents the risk-free rate. It indicates whether a cryptocurrency portfolio's excess returns are due to smart investment decisions or are the result of taking too much risk. Note that a negative Sharpe ratio does not convey any useful meaning. For daily cryptocurrency investors, the Sharpe ratio is a useful tool with which to gauge future returns, but it is not the only one.
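As a small worked example of the ratio just described, the sketch below computes an ex-post Sharpe ratio from a list of periodic returns. The return values and the zero risk-free rate are hypothetical, and the sample (n − 1) standard deviation is our assumption.

```python
from statistics import fmean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Ex-post Sharpe ratio: mean excess return divided by its standard deviation."""
    excess = [r - risk_free for r in returns]
    return fmean(excess) / stdev(excess)

# Hypothetical daily returns, for illustration only.
daily_returns = [0.01, 0.03, 0.02]
print(sharpe_ratio(daily_returns))
```

A constant risk-free rate shifts the numerator only, so it lowers the ratio without changing the volatility estimate.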

Value at Risk Backtesting
Next, we examine how our ex-ante risk forecast performs on an ex-post basis. To monitor and quantify the various levels of risk associated with cryptocurrency investment, we apply the VaR statistical risk management technique, where backtesting measures the accuracy of the value at risk calculations. Using backtesting, we determine how well a crypto investment strategy would perform using historical data for each cryptocurrency. We obtain VaR at a 1% level, which is $\text{VaR}_{0.01}(t + 1)$. If X is the random variable indicating loss, then for a given parameter $0 < \alpha < 1$ the α-VaR of X is

$$\text{VaR}_{\alpha}(X) = \min\{c : P(X \le c) \ge \alpha\},$$

where $\text{VaR}_{\alpha}(X)$ is the minimum loss that will not be exceeded with probability α. Equivalently, it is the smallest loss in the $(1 - \alpha) \times 100\%$ worst cases. Thus, our backtesting is based on our assumed distribution at a $(1 - \alpha)$ confidence level in VaR. We also performed Kupiec's tests and Christoffersen's tests for all ten cryptocurrencies; however, we have not included these in our paper.
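The backtesting logic can be sketched as counting exceedances of the VaR forecast and comparing their frequency with the nominal level. The second function implements Kupiec's proportion-of-failures likelihood ratio in its standard textbook form; it is offered as an illustration, since the paper does not report its Kupiec results, and all names and inputs are hypothetical.

```python
import math

def count_exceedances(returns, var_forecasts):
    """Number of days on which the realized return fell below the VaR forecast."""
    return sum(1 for r, v in zip(returns, var_forecasts) if r < v)

def kupiec_pof(n_obs, n_exceed, p=0.01):
    """Kupiec proportion-of-failures test.

    Returns (LR statistic, p-value); LR is asymptotically chi-squared(1)
    under the null that the true exceedance probability equals p.
    """
    def log_lik(q):
        ll = 0.0
        if n_obs - n_exceed > 0:
            ll += (n_obs - n_exceed) * math.log(1.0 - q)
        if n_exceed > 0:
            ll += n_exceed * math.log(q)
        return ll

    p_hat = n_exceed / n_obs
    lr = -2.0 * (log_lik(p) - log_lik(p_hat))
    lr = max(lr, 0.0)                            # guard against tiny negative rounding
    p_value = math.erfc(math.sqrt(lr / 2.0))     # chi-squared(1) survival function
    return lr, p_value

# Hypothetical example: 3 daily returns against a constant -10% VaR forecast.
print(count_exceedances([-0.05, 0.01, -0.20], [-0.10, -0.10, -0.10]))
print(kupiec_pof(n_obs=1000, n_exceed=10, p=0.01))
```

When the observed exceedance rate equals the nominal 1%, the statistic is zero and the model is not rejected; many more exceedances than expected produce a large statistic and a small p-value.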

Supervised Machine Learning Approach: Artificial Neural Network
In an ANN, inputs are provided to an artificial neuron, and a weight is associated with each input. The weight increases the steepness of the activation function. This means that the weight decides how fast the activation function will trigger, whereas the bias β is used to delay the triggering of the activation function. Fundamentally, the mechanism has three layers: input, hidden, and output layers. Each layer may contain many nodes or neurons, and the hidden part may also contain several layers. However, for time series analysis and forecasting, the single-hidden-layer feed-forward network is the most widely used model structure Zhang et al. (1998). For a typical neuron, if the inputs are $x_1, x_2, \cdots, x_n$ then the synaptic weights to be applied to them are denoted as $w_{pi}$ for $i = 1, 2, \cdots, n$. The following equation represents the idea:

$$Y_t = W_0 + \sum_{j=1}^{q} W_j \, g\left(W_{0,j} + \sum_{i=1}^{p} W_{i,j} Y_{t-i}\right) + \varepsilon_t,$$

where $W_{i,j}$ and $W_j$ for $i = 1, 2, \cdots, p$, $j = 1, 2, \cdots, q$ are known as connection weights, $W_0$ is the initial bias, $W_{0,j}$ are the biases of the hidden nodes, and g is the activation function. Here $Y_t$ is the return $r_t$. The weight shows the effectiveness of a particular input: the higher the weight of an input, the more impact it has on the network. The network has been created and trained in open loop form, as shown in Figure 4. In fact, open loop (single-step) training is more efficient than closed loop (multi-step) training. Open loop allows us to supply the network with correct past outputs as we train it to produce the correct current outputs. The performance of the ANN Crick (1989) was evaluated using the coefficient of determination $R^2$ and the mean square error (MSE) of the output of the model. MSE indicates the average squared difference between the predicted values estimated from a model and the actual values. The MSE is calculated as follows based on cryptocurrency return:

$$\text{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left(r_t - \hat{r}_t\right)^2,$$

where $r_t$ and $\hat{r}_t$ are actual and predicted prices of the cryptocurrencies. We chose the Bayesian regularization training algorithm as used in MacKay and Mac Kay (2003) and Selvamuthu et al. (2019). Training stops according to adaptive weight minimization (regularization). To examine whether our model was performing well or not, we checked the mean squared error (MSE).
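The forward pass of a single-hidden-layer feed-forward network on lagged inputs, together with the MSE criterion, can be sketched as follows. The logistic activation, the weight values, and the function names are assumptions made for illustration; training (for example by Bayesian regularization) is not shown.

```python
import math

def sigmoid(z):
    """Logistic activation function (an assumed choice of g)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(lags, w0, w_out, b_hidden, w_hidden):
    """Single-hidden-layer prediction from p lagged values.

    lags     : [Y_{t-1}, ..., Y_{t-p}]
    w0       : output bias
    w_out    : q hidden-to-output weights
    b_hidden : q hidden-node biases
    w_hidden : q lists of p input-to-hidden weights
    """
    hidden = [
        sigmoid(b + sum(w * y for w, y in zip(ws, lags)))
        for b, ws in zip(b_hidden, w_hidden)
    ]
    return w0 + sum(wj * h for wj, h in zip(w_out, hidden))

def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical network with p = 2 lags and q = 2 hidden nodes.
prediction = forward(
    lags=[0.01, -0.02],
    w0=0.0,
    w_out=[0.5, -0.5],
    b_hidden=[0.0, 0.1],
    w_hidden=[[1.0, 0.5], [-0.3, 0.8]],
)
print(prediction)
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In open loop form the `lags` argument is filled with observed past values at every step, whereas a closed loop forecast would feed the network's own predictions back in.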

Results and Discussion
In this section, we present the results of the GJR-GARCH and GARCH models for estimating the volatilities of the ten selected cryptocurrencies, simulated cryptocurrency volatility, and predicted cryptocurrency prices.

GJR-GARCH and GARCH Models for Cryptocurrency Volatilities
First, we present descriptive statistics of the ten cryptocurrencies in Table 5. Sharpe ratio is the average return earned in excess of the risk-free rate per unit of volatility or total risk. Intuitively, it measures the excess return earned per unit of risk. In our sample, Chainlink has the highest Sharpe ratio of 0.046, whereas Bitcoin has the second-highest Sharpe ratio of 0.029 in our sample. Some cryptocurrencies have negative Sharpe ratios, which signify that their average return is less than the risk-free rate in our sample. Table 5 shows that cryptocurrency returns exhibit fat-tail distributions. Generally, prior literature has used the GARCH model over the autoregressive conditional heteroscedastic (ARCH) model. However, to control for asymmetric responses of volatility to innovation fluctuations, we used the GJR-GARCH model. From Table 5, we observe all positive values of excess kurtosis, which means all of the returns have leptokurtic distributions. This refers to a heavy degree of risks with extreme return values. Moreover, except Bitcoin SV and TETHER, all the other cryptocurrencies show skewness with negative values, which are compensated with high future returns for higher volatility. These two higher moments play with investors' sentiments.  Figure 5 depicts the comparison between the GARCH(p,q) and GJR-GARCH(p,q) models to estimate volatility using daily data. The top left panel of this Figure shows the time-series and the right panel shows the scatter plot of the volatility estimate. The middle panel shows a comparison in volatility forecasting. The bottom panel shows the news impact curve. We used the daily data for Bitcoin and we found similar results for all others. Therefore, to avoid redundancy and to show volatility model selection, we only show results for a single cryptocurrency. We also use the lowest AIC value for determining the best p and q for a robust volatility GJR-GARCH(p,q) model for each cryptocurrency. 
Furthermore, there is a valid reason for choosing GJR-GARCH over the GARCH model: it is empirically found that negative cryptocurrency returns at time t − 1 have a stronger impact on the volatility at time t than positive cryptocurrency returns. This asymmetric process is known as the leverage effect. The increment of the risk is realized to come from the increased leverage induced by the negative returns of cryptos Glosten et al. (1993). In Figure 6, we forecast the volatility for the ten cryptocurrencies. It shows daily data where the blue line indicates predicted volatility. If we examine the historical return series, it does not show conditional mean offset and thus exhibits volatility clustering. However, we initially forecast the volatility of Bitcoin using both GARCH and GJR-GARCH in Figure 5 and found that GJR-GARCH provides the better estimate. That is why we implemented the GJR-GARCH(p,q) model to forecast the return and volatility for our selected cryptocurrencies. In Figure 6, we forecast the conditional variance (vs. absolute value of returns) for all ten cryptocurrencies.

Monte Carlo Simulations of the Cryptocurrencies' Volatility Using GJR-GARCH Model
Next, we simulated the conditional variance of each cryptocurrency's returns from a fully specified GJR-GARCH model based on historical data. In Figure 7, we plotted the average and the 97.5% and 2.5% percentiles of the simulated paths and compared the simulation statistics to the original data for each cryptocurrency. The red band indicates the confidence bound, and all of our selected cryptocurrencies lie within the simulated Monte Carlo paths. We used the GJR-GARCH(p,q) model at the 1% significance level and daily frequency to estimate VaR. We have presented our results both in Table 5 and Figure 7. Our results show that Bitcoin, Bitcoin Cash, Ethereum, and Litecoin suffer the most from tail-risk whereas TETHER has the least tail-risk. In Figure 8, the jagged red line running across the bottom of the plot indicates the portfolio's (negative) one-day 99% value-at-risk. Any instance of a cryptocurrency's return falling below that line is an exceedance. A 99% value-at-risk measure would be expected to experience very few exceedances, on roughly 1% of days, over our horizon. We observe that Bitcoin, Bitcoin Cash, EOS, Ethereum, and Litecoin are highly risky assets, because they show many exceedances of the jagged line. Referring to Figure 8, Bitcoin has at least four exceedances of the jagged line. Thus, it supports the hypothesis that Bitcoin is one of the highly volatile cryptocurrencies.

ARIMA and ANN for Forecasting Cryptocurrencies' Prices
Figure 9 presents a window plot of the predicted price of each cryptocurrency together with the actual prices. From close observation we see that ARIMA models also give high accuracy for next-day cryptocurrency price prediction. We use ANN models for forecasting the future price of the ten cryptocurrencies. However, we show the complete process of forecasting future prices using the ANN only for Chainlink. After training, the network may be converted to closed loop form. We randomly divide the daily price data of Chainlink into three sets: a training set (70%), a testing set (15%), and a validation set (15%).
In Figure 10, we find no overfitting of the model. In Figure 10a, we evaluate the regression fit by the value of the regression coefficient. Figure 10b allows us to check the training progress of our neural network; it shows the MSE against epochs for the training and test sets. To observe whether our ANN is training well, we look at the loss and accuracy of the training set and whether they converge as the number of epochs increases. As the loss and accuracy from both sets are not diverging from each other, there is no sign of overfitting in our ANN model. It shows a good fit for Chainlink. We also observe similar output for the other nine cryptocurrencies.
In our case, we show it for 9 epochs in Figure 10c. All types of data sets are approximately overlapping at 3 epochs. Therefore, we do not have to change our ANN structure. Next, we tested our model against actual data to see how well it performed. We found that for all of the ten cryptocurrencies, our ANN model could predict the price, which means it is performing quite well. In Figure 10, R denotes $R^2$, the coefficient of determination of the model. Table 6 reports the MSE of each model. In the studied supervised machine learning technique, we evaluated the model by using MSE. For some cryptocurrencies, the ANN predictions yield a lower MSE, and for others ARIMA performs better. As we know, the model with the least MSE is considered the best fit. Therefore, from this study, it is not possible to say that the ANN is the best model, but one can implement the ANN as an alternative procedure. For Chainlink and the other nine cryptocurrencies, K-fold validations are compared using the mean square error (MSE) metric. We observe that the MSE of the fitted model is less than the cross-validation MSE for all ten cryptocurrencies. For our example, the cross-validation MSE of Chainlink is approximately 0.53, which is obviously less than the MSE of ARIMA and ANN prediction. The window plot in Figure 11 presents only a one-month prediction output. All of the models show good output. In most of the cases, the ANN performs better than ARIMA. Overall, in this paper we have contributed two separate studies for cryptocurrencies: one for volatility/risk and another for return. An investor may consider both of these properties of the financial market. Cheikh et al. (2020) investigated the presence of asymmetric volatility dynamics in Bitcoin, Ethereum, Ripple, and Litecoin using threshold GARCH models. We have extended this strand of the literature.
We compared GARCH and GJR-GARCH volatility estimates under the normal inverse Gaussian (NIG) distribution and showed that the GJR-GARCH model under NIG is the better choice for estimating the volatility of cryptocurrencies. Similarly, for predicting cryptocurrency prices, we applied an ANN and compared it with ARIMA, rather than with the ordinary least squares (OLS) regression, support vector regression (SVR), and least absolute shrinkage and selection operator (LASSO) methods of Bouri et al. (2021). Our findings have implications for money managers, who can implement an ANN for predicting the prices of cryptocurrencies.
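To make the difference between the two volatility models concrete, the sketch below evaluates the standard GJR-GARCH(1,1) conditional variance recursion, in which negative shocks receive an extra weight gamma; setting gamma to zero gives the symmetric GARCH(1,1) model. This is an illustrative Python example only: the residuals and parameter values are made up, the function name `gjr_garch_variance` is hypothetical, and no estimation is performed. The innovation distribution (NIG in this paper) enters through the likelihood used to estimate the parameters, not through the recursion itself.

```python
def gjr_garch_variance(residuals, omega, alpha, gamma, beta):
    """Conditional variances from a GJR-GARCH(1,1) recursion:
    sigma2[t] = omega + (alpha + gamma * I[e[t-1] < 0]) * e[t-1]**2
                + beta * sigma2[t-1]
    Setting gamma = 0 recovers the symmetric GARCH(1,1) model."""
    # Start from the sample variance of the residuals (an assumed, common choice)
    n = len(residuals)
    mean = sum(residuals) / n
    sigma2 = [sum((e - mean) ** 2 for e in residuals) / n]
    for t in range(1, n):
        prev = residuals[t - 1]
        # The leverage term applies only after a negative shock
        leverage = gamma if prev < 0 else 0.0
        sigma2.append(omega + (alpha + leverage) * prev ** 2 + beta * sigma2[-1])
    return sigma2

# Made-up daily return residuals and parameter values, for illustration only
residuals = [0.01, -0.03, 0.02, -0.05, 0.04, 0.00, -0.02]
params = dict(omega=1e-5, alpha=0.05, beta=0.85)

garch = gjr_garch_variance(residuals, gamma=0.0, **params)
gjr = gjr_garch_variance(residuals, gamma=0.10, **params)

for t, (g, j) in enumerate(zip(garch, gjr)):
    print(f"t={t}  GARCH={g:.6f}  GJR-GARCH={j:.6f}")
```

With a positive gamma, the two variance paths coincide until the first negative residual and the GJR-GARCH path is higher afterwards, which is the asymmetric response to bad news that the symmetric model cannot capture.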

Conclusions
We applied a standard learning approach to cryptocurrency investment by fitting two time-series models, GARCH(p,q) and GJR-GARCH(p,q), to predict cryptocurrency risk. We concluded that GJR-GARCH(p,q) provides a better estimate of the volatility of cryptocurrency returns than GARCH(p,q). We also estimated the tail-risk of cryptocurrencies using VaR, with backtesting used to compare the Monte Carlo simulated VaR measure with the actual volatility of cryptocurrency returns, and showed that cryptocurrencies suffer from excessive tail-risk. Finally, we compared ARIMA and ANN forecasts of cryptocurrency prices with market values. Based on the results, the ANN can be considered an alternative to the ARIMA model for predicting cryptocurrency prices. Our analysis indicates that Bitcoin, Bitcoin Cash, EOS, Ethereum, and Litecoin are highly risky cryptocurrencies. Although the paper focuses on comparing some common financial models for evaluating cryptocurrency risk and predicting prices, it is among the few papers in the literature that have studied the cryptocurrency market, and it therefore contributes to cryptocurrency risk management. The paper analyzes some top cryptocurrencies, comparing two models for each objective: estimating risk and predicting prices. GJR-GARCH under NIG for volatility estimation and the ANN for price prediction can serve as powerful alternative techniques. Future research may extend the analysis by fitting more models to more cryptocurrencies. More specifically, we want to introduce more statistical machine learning models for cryptocurrency price prediction. We did not use machine learning techniques for volatility prediction, nor did we apply portfolio optimization methods such as CVaR, mean-variance, or Black-Litterman optimization to cryptocurrency asset allocation.
We plan to contribute in this area in future work. Because machine learning techniques are data driven, they will help portfolio managers understand the future risk of investors' assets more thoroughly.
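As a concrete illustration of the Monte Carlo VaR and backtesting steps summarized in these conclusions, the following Python sketch simulates one-step-ahead returns, reads off the VaR as an empirical quantile of the simulated losses, and counts the days on which the realized loss exceeds it. It is a simplified example under stated assumptions: the shocks are Gaussian rather than NIG, the volatility is held fixed instead of following the fitted GJR-GARCH model, the realized returns are made up, and all function names are hypothetical.

```python
import random

def simulate_returns(n_paths, sigma, seed=0):
    """Draw one-step-ahead returns r = sigma * z with z ~ N(0, 1).
    Gaussian shocks are an assumption here; the paper uses NIG innovations."""
    rng = random.Random(seed)
    return [sigma * rng.gauss(0.0, 1.0) for _ in range(n_paths)]

def value_at_risk(simulated, level=0.95):
    """VaR as the loss exceeded in about (1 - level) of the simulated outcomes."""
    losses = sorted(-r for r in simulated)
    index = min(int(level * len(losses)), len(losses) - 1)
    return losses[index]

def count_violations(realized_returns, var):
    """Number of days on which the realized loss exceeded the VaR estimate."""
    return sum(1 for r in realized_returns if -r > var)

var_95 = value_at_risk(simulate_returns(10_000, sigma=0.04), level=0.95)

# Made-up realized daily returns used only to demonstrate the backtest
realized = [0.01, -0.02, -0.09, 0.03, -0.01, -0.12, 0.02, 0.00, -0.03, 0.05]
violations = count_violations(realized, var_95)
print(f"95% VaR: {var_95:.4f}; violations: {violations} of {len(realized)}")
```

A violation rate well above the nominal 5% in a backtest of this kind is what signals that the VaR model understates tail-risk.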
Author Contributions: Data collection: P.S. and F.M.; methodology development: F.M., P.S., and M.R.I.; writing: all authors contributed equally; funding and overall supervision: N.N. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Data Availability Statement: In this study, publicly available data sets were used. These data can be found at coingecko.com (accessed on 31 January 2021).

Acknowledgments: The authors are grateful to Svetlozar Rachev of the Department of Mathematics and Statistics, Texas Tech University, for teaching Advanced Statistical Methods in Finance in the summer graduate course. The authors also thank the anonymous reviewers for their helpful suggestions, which made this paper more sound. Code was written in MATLAB R2021a under a Texas Tech University authorized license. Some analyses were done in R, an open-source statistical software package.

Conflicts of Interest: The authors declare no conflict of interest.

Note 1: coinmarketcap.com (accessed on 31 January 2021).