Sentiment-Induced Bubbles in the Cryptocurrency Market

Abstract: Cryptocurrencies lack clear measures of fundamental value and are often associated with speculative bubbles. This paper introduces a new way of testing for speculative bubbles based on StockTwits sentiment, which is used as the transition variable in a smooth transition autoregression. The model allows for conditional heteroskedasticity and fat tails of the conditional distribution of the error term, and volatility may depend on the constructed sentiment index. We apply the model to the CRIX index, for which several bubble periods are identified. The detected locally explosive price dynamics, given the specified bubble regime controlled by a smooth transition function, are more akin to the notion of a speculative bubble that is driven by exuberant sentiment. Furthermore, we find that volatility increases as the sentiment index decreases, which is analogous to what is commonly called the leverage effect.


Introduction
The current literature on bubble tests is confronted with the difficulty of concluding that a price bubble is not caused by time-varying or regime-switching fundamentals (Gürkaynak 2008). The recent tests proposed by Phillips et al. (2011, 2015) are powerful, are essentially based on the supremum of sequential unit root test statistics, and have been applied to cryptocurrency markets by Cheung et al. (2015), Corbet et al. (2018), and Hafner (2018), where the latter accounts for time-varying volatility. These tests, however, are purely statistical in nature and do not allow us to infer whether structural breaks detected in the time series processes of asset prices are evidence of bubbles or are due to breaks in the underlying (unobserved) fundamentals (Pesaran and Johnsson 2018). Including extracted sentiment information, which represents the sentiment of the crypto community with its specific linguistic features, helps to resolve this ambiguity and adds economic and behavioral information to the statistical setting.
Alternative bubble tests have been proposed, e.g., by Pavlidis et al. (2017), based on the gap between spot and futures prices and applied to equities and exchange rates, and by Pavlidis et al. (2018), using market expectations of futures prices and applied to the oil market; see also Kruse and Wegener (2019). Given the lack of liquidity in cryptocurrency futures, it seems difficult to apply these tests to crypto markets today. Further bubble tests include Cheah and Fry (2015), who use a continuous time model to identify bubbles via anomalous behavior of the drift and volatility components, and Fry and Cheah (2016), who develop models for financial bubbles and crashes based on statistical physics, with applications to Bitcoin and Ripple.
Bubbles are more prone to emerge in the crypto market than in stock markets. The theoretical grounds for market efficiency rely crucially on the stabilizing power of rational speculation; see, e.g., De Long et al. (1990), Glosten and Milgrom (1985), and Yang and Brown (2016). Given the presence of limits to arbitrage (e.g., no short-sale venue) and the limited fundamental information in the cryptocurrency market, rational speculation that pulls prices close to their fundamental values is not possible. These constraints hinder price discovery.
In this paper, we postulate that a bubble-like behavior of prices is characterized by a smooth transition function that dynamically assigns the probability (loading) to the explosive regime and the random walk regime, given the exogenous sentiment information. By this construction, the speculative bubble can only be pumped up with anomalous sentiment. We therefore develop an econometric framework and a test for a sentiment-induced price bubble.
We target the cryptocurrency-related messages on StockTwits, which attracts the crypto community to share information, opinions and moods. We use the sentiment measures constructed by Nasekin and Chen (2018) from this social media platform, as their newly constructed sentiment index is viewed as representative of the sentiment of the crypto community, taking its specific linguistic features into account. Its information content is relevant for future market performance and can be used to predict the evolution of prices and volatility. As mentioned before, due to the limited knowledge of a fundamental value in this new digital asset class, the mispricing caused by sentiment cannot be promptly corrected, and prices cannot quickly revert to their fundamental value. This is why sentiment entails short-run predictability: an inefficient crypto market defers the price correction process. This slow correction allows sentiment to accumulate and amplify; as a consequence, the bubble is able to grow and will probably collapse once the sentiment bias is finally corrected.
The econometric framework is that of a smooth transition autoregressive (STAR) model, where the transition variable is the sentiment index. The idea is that, in times of a very high sentiment index, corresponding to excessively bullish evaluations, the price dynamics will be driven by an explosive autoregression, while otherwise they follow a random walk. We allow for conditional heteroskedasticity and fat tails by specifying recently proposed score-driven models that are shown to fit the data well. Volatility is allowed to depend explicitly on the sentiment index. This complements previous studies on cryptocurrency volatility such as Conrad et al. (2018) and Kjaerland et al. (2018).
We apply the model to the CRIX index, a value-weighted index of the cryptocurrency market whose number of constituents is determined endogenously using statistical criteria. The reallocation of the CRIX happens on a monthly and quarterly basis; see Trimborn and Härdle (2018) and thecrix.de for details. We identify several bubble periods, primarily in 2017. Volatility depends negatively on the sentiment index, meaning that bad sentiment or news increases volatility, a feature commonly called the leverage effect in classical financial markets. Here, the leverage effect is explicitly driven by the sentiment index.
The paper is organized as follows. We first present the sentiment index for cryptocurrencies in Section 2. Then, we introduce the econometric model in Section 3 and discuss its application to the CRIX. Section 4 provides the conclusions.

Cryptocurrencies and a Sentiment Index
This section outlines the dataset analyzed and the methodology used to quantify sentiment and construct the sentiment index.

StockTwits Data
StockTwits¹ is a social microblogging platform dedicated to financial and economic discussion among investors and traders. Each message, by StockTwits policy, should start with a "cashtag" that explicitly refers to a specific financial asset. Through the cashtag, one can easily link the message content to the asset symbol and subsequently, after textual analysis, associate the symbol with the sentiment of the message. Sentiment analysis is particularly feasible on StockTwits because of its built-in sentiment disclosure: users can express their sentiment by labeling their messages as "Bearish" (negative) or "Bullish" (positive) via a toggle button. The availability of labeled data benefits textual analysis, which typically relies on a training dataset.
Since 2014, StockTwits has added streams and symbology for cryptocurrencies and tokens, expanding from 100 cryptos in the beginning to more than 400 recently. This vibrant new asset class has attracted considerable attention from its large community and from newcomers. New cryptocurrencies are regularly added to the list of cashtags supported by StockTwits.² A cashtag refers to a cryptocurrency if and only if it ends with ".X" (e.g., $BTC.X for Bitcoin, $LTC.X for Litecoin). We use this convention and the StockTwits Application Programming Interface (API) to download all messages containing a cashtag referring to a cryptocurrency. For each message, the StockTwits API also provides the user's unique identifier, the time at which it was posted with one-second precision, and the sentiment assigned by the user ("Bullish", "Bearish" or unclassified). Our final dataset contains 1,220,728 messages from 33,613 distinct users, posted between March 2013 and May 2018, and related to 425 cryptocurrencies. Overall, 472,255 messages are classified as bullish (38.6%) and 92,033 as bearish (7.5%), and the remaining messages are unclassified. The imbalance between the numbers of positive and negative messages shows that online investors are optimistic on average, as previously found by Avery et al. (2016) and Kim and Kim (2014).
StockTwits, with its focus on financial discussion, offers an advantage for extracting the speculative sentiment that may ultimately trigger a speculative bubble. Another advantage is the availability of sentiment labels provided by the users themselves, which makes it possible to apply supervised learning schemes. The statistical learning model applied to the StockTwits dataset is described in the following subsection.
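The ".X" cashtag convention lends itself to a simple text filter. The following is a minimal sketch, not the authors' download code: it assumes the messages are already available as plain strings and that a ticker consists of letters and digits only, and the name `crypto_symbols` is hypothetical.

```python
import re

# "$" + ticker + ".X" marks a cryptocurrency cashtag (e.g. $BTC.X).
# Assumption: tickers contain only letters and digits.
CRYPTO_CASHTAG = re.compile(r"\$([A-Za-z0-9]+)\.X\b")


def crypto_symbols(message):
    """Return the crypto tickers mentioned in a message, without '$' and '.X'."""
    return CRYPTO_CASHTAG.findall(message)


messages = [
    "$BTC.X breaking out, $LTC.X lagging",
    "$AAPL earnings tonight",  # equity cashtag: no '.X' suffix, so it is skipped
]

# Keep only the messages that mention at least one cryptocurrency.
crypto_messages = [m for m in messages if crypto_symbols(m)]
```

Filtering on the suffix rather than on a fixed ticker list means that newly added cryptocurrencies are picked up without changing the code.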

Sentiment Prediction
Nasekin and Chen (2018) propose a state-of-the-art methodology for semantic sentiment prediction in the cryptocurrency domain. The long short-term memory (LSTM) type of recurrent neural network (RNN), together with a word embedding technique, provides superior performance in predicting domain-specific sentiment. The key advantage of the LSTM is that it keeps context-specific dependence encoded, so that important information about the semantic structure of a sentence is not lost.
A general architecture of a sentiment prediction LSTM/RNN network is presented in Figure 1 of Nasekin and Chen (2018). This architecture consists of the input sequence, an embedding lookup matrix, several layers of LSTM cells/units, an output sequence, and mean pooling and softmax layers. The core of this structure is the LSTM cell, whose structure is presented in Figure 1. Its specification involves four steps: (1) introducing the cell state $C_t$ to keep information about the previous states of the LSTM cells. The amount of information stored in the cell state is controlled by the "gates": an input gate $i_t$, a forget gate $f_t$ and an output gate $g_t$. The first to act is the forget gate $f_t$: it determines how much of the previous state $C_{t-1}$ will be kept, based on the values of the previous hidden state $h_{t-1}$ and the current input $x_t$. The sigmoid function $\sigma(x) = 1/(1 + \exp(-x))$ outputs a value between 0 and 1 for each number in the cell state $C_{t-1}$; (2) generating an update to $C_{t-1}$ through a new candidate value of the cell state, $\tilde{C}_t$, and deciding how much of the new candidate state $\tilde{C}_t$ will be passed into $C_t$; (3) updating the value of the cell state $C_t$ as a weighted sum of the previous cell state value $C_{t-1}$ and the new candidate value $\tilde{C}_t$; and (4) updating the hidden state $h_t$ as a filtered value of the cell state $C_t$, which is put through the tanh nonlinearity and multiplied element-wise by the values of the output gate $g_t$.
¹ https://stocktwits.com/.
² This list can be found at https://api.stocktwits.com/symbol-sync/symbols.csv.
Details of the RNN algorithm can be found in Nasekin and Chen (2018), who also document its performance in labeling sentiment as bullish or bearish, with an accuracy of 84%.
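The four steps of the LSTM cell described above can be written compactly in the usual textbook form. This is a sketch in standard notation rather than a transcription from Nasekin and Chen (2018): the weight matrices $W_\cdot$ and biases $b_\cdot$ are assumed symbols, $[h_{t-1}, x_t]$ denotes concatenation, and $\odot$ is the element-wise product. The output gate is written $g_t$ to match the text.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C [h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)} \\
g_t &= \sigma\left(W_g [h_{t-1}, x_t] + b_g\right) && \text{(output gate)} \\
h_t &= g_t \odot \tanh(C_t) && \text{(hidden state)}
\end{aligned}
```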

Sentiment Index and Cryptocurrency Index
A trained RNN model is used to predict the sentiment labels of unlabeled messages, which constitute about 60% of the StockTwits message dataset. More specifically, the LSTM setup with pre-trained Word2Vec embeddings is employed for this purpose. Aggregated sentiment in Nasekin and Chen (2018) is constructed from $M_t^{Bu}$ and $M_t^{Be}$, the numbers of bullish and bearish messages on day $t$, respectively. Equation (1) defines it as a logarithmic rate of change between the numbers of bullish and bearish messages on day $t$. This aggregate sentiment is viewed as representative of the crypto community on StockTwits, with its specific linguistic features. Its information content is relevant for future market performance and can be used to predict the evolution of prices and volatility, given the limited knowledge of fundamental value. More importantly, due to the limited knowledge of fundamental value in this new digital asset class, the mispricing due to sentiment cannot be promptly corrected, and prices cannot quickly revert to their fundamental value. This is the reason why sentiment carries short-run predictability. This slow correction allows sentiment to accumulate and amplify; as a consequence, the bubble is able to grow and will probably collapse once the sentiment bias is finally corrected.
The CRIX (CRyptocurrency IndeX) was created by Trimborn and Härdle (2018) and is used to track the performance of the entire cryptocurrency market as closely as possible. It is constructed robustly in the sense that it accounts for a frequently changing market structure, so that representativity and tracking performance are assured. Accordingly, the number of constituents changes over time, depending on market conditions and the relative dominance among cryptos. The data series, starting from July 2014, can be downloaded from thecrix.de.
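Equation (1) itself is not reproduced in the text above, so the following is only a sketch of a log-ratio index that is consistent with the verbal description. The add-one terms, which keep the index finite on days with no bullish or no bearish messages, are an assumption, as are the function names and the (day, label) input format.

```python
import math
from collections import Counter


def sentiment_index(n_bullish, n_bearish):
    """Log ratio of bullish to bearish message counts for one day.

    Assumption: add-one smoothing, so that days without bullish or
    bearish messages still give a finite value.
    """
    return math.log((1 + n_bullish) / (1 + n_bearish))


def daily_sentiment(labelled_messages):
    """Map each day to its index, given an iterable of (day, label) pairs.

    Labels other than "Bullish" and "Bearish" are ignored.
    """
    bullish, bearish = Counter(), Counter()
    for day, label in labelled_messages:
        if label == "Bullish":
            bullish[day] += 1
        elif label == "Bearish":
            bearish[day] += 1
    days = sorted(set(bullish) | set(bearish))
    return {d: sentiment_index(bullish[d], bearish[d]) for d in days}
```

Under this form the index is positive on days when bullish messages outnumber bearish ones, negative in the opposite case, and zero when the two counts are equal.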

A Sentiment-Based Model for Locally Explosive Crypto Prices
Suppose we have a series of log prices for the CRIX, denoted $y_t$, and a series of sentiment indices for the crypto market, denoted $s_t$. The idea is to allow for bubble-like behavior of prices, given by a locally explosive autoregressive process, where the explosive regime is determined by the sentiment index of the crypto market. The transition between the random walk and the explosive regime is driven by a smooth transition function, as in classical smooth transition AR (STAR) models. Furthermore, we take into account conditional heteroskedasticity of the error term and fat tails of the conditional distribution. The model can be written as
$$\Delta y_t = \mu_1 + \{\alpha y_{t-1} + \mu_2\}\, g(s_{t-1}) + \exp(h_t)\, \varepsilon_t,$$
where $\alpha > 0$, $\varepsilon_t$ is an i.i.d. error term with mean zero and unit variance, $h_t$ is the (log) volatility, and $g(s)$ is the logistic function, i.e.,
$$g(s) = \frac{1}{1 + \exp(-\gamma(s - \tau))},$$
with "steepness" parameter $\gamma$ and "threshold" parameter $\tau$. Essentially, the dynamics of $y_t$ are a mixture of two regimes. When the index $s_t$ is large, $g(\cdot)$ is close to unity and more weight is given to the explosive regime, while if it is small, $g(\cdot)$ is close to zero and more weight is given to the random walk regime. In the limiting case $\gamma \to \infty$, one obtains the threshold autoregressive model as a special case, as $g(s_t)$ degenerates to the indicator function $I(s_t - \tau > 0)$. It is for this reason that we interpret the situation $s_t - \tau > 0$ as the bubble regime and $s_t - \tau < 0$ as the non-bubble regime, although in the smooth transition model there is, strictly speaking, a continuum of regimes. See also van Dijk et al. (2002), who adopt the same interpretation.
Estimation of the model can be done by nonlinear least squares; see, e.g., Teräsvirta (1994). However, it is more efficient to take into account conditional heteroskedasticity and fat tails of the distribution of $\varepsilon_t$ by using maximum likelihood estimation (MLE).
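To make the two regimes concrete, here is a minimal sketch of the logistic transition function and of the conditional mean of returns, $\mu_1 + \{\alpha y_{t-1} + \mu_2\} g(s_{t-1})$, as it is stated later in the text. The function names and all parameter values in the usage lines are illustrative assumptions, not estimates from the paper.

```python
import math


def transition(s, gamma, tau):
    """Logistic transition g(s) = 1 / (1 + exp(-gamma * (s - tau))).

    Written in a numerically stable form so that a very steep
    transition (large gamma) does not overflow.
    """
    z = gamma * (s - tau)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def conditional_mean_return(y_prev, s_prev, mu1, mu2, alpha, gamma, tau):
    """Conditional mean of the log return given yesterday's price and sentiment."""
    return mu1 + (alpha * y_prev + mu2) * transition(s_prev, gamma, tau)


# Illustrative values only: log price 7.0, threshold 0.5, steepness 3.
low = conditional_mean_return(7.0, -1.0, mu1=0.001, mu2=0.0, alpha=0.01, gamma=3.0, tau=0.5)
high = conditional_mean_return(7.0, 2.0, mu1=0.001, mu2=0.0, alpha=0.01, gamma=3.0, tau=0.5)
```

With a very large `gamma` the function behaves like the indicator $I(s_t - \tau > 0)$, which is the threshold autoregressive limit mentioned above.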
The volatility part of the model is taken to be the Beta-t-EGARCH model of Creal et al. (2011) and Harvey (2013). That is, we assume that $\varepsilon_t$ follows a Student-t distribution with mean zero, scale one, and $\eta$ degrees of freedom, and that the volatility dynamics are driven by the score of the likelihood function, i.e.,
$$h_{t+1} = \omega + \phi h_t + \kappa u_t, \quad |\phi| < 1,$$
where $u_t$ denotes the conditional score. By Proposition 12 of Harvey (2013), we can alternatively write $u_t = (\eta + 1) b_t - 1$, where $b_t = \varepsilon_t^2/(\varepsilon_t^2 + \eta)$ is an i.i.d. beta distributed random variable. The reason for using a score-driven EGARCH rather than the classical EGARCH model of Nelson (1991) is that many recent empirical studies have found that the news impact function of the classical EGARCH tends to overweight the impact of large shocks on volatility, while the impact functions of score-driven models tend to give a more accurate account of the impact of large shocks; see, e.g., Harvey (2013) for a detailed discussion and motivation for score-driven models.
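A practical consequence of the score-driven update is that $u_t$ is bounded: since $b_t$ lies in $[0, 1)$, $u_t = (\eta + 1) b_t - 1$ lies in $[-1, \eta)$, so even an extreme shock has a limited effect on next period's log volatility. A small sketch follows; the function names are hypothetical and the parameter values are illustrative only.

```python
def score(eps, eta):
    """Score term u_t = (eta + 1) * b_t - 1 with b_t = eps^2 / (eps^2 + eta)."""
    b = eps * eps / (eps * eps + eta)   # b_t lies in [0, 1)
    return (eta + 1.0) * b - 1.0        # u_t lies in [-1, eta)


def next_log_volatility(h, eps, omega, phi, kappa, eta):
    """One step of h_{t+1} = omega + phi * h_t + kappa * u_t."""
    return omega + phi * h + kappa * score(eps, eta)


# The response flattens out as the shock grows (illustrative eta = 2.6).
for eps in (0.0, 1.0, 5.0, 50.0):
    print(eps, score(eps, eta=2.6))
```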
The exponential form of volatility makes it convenient to augment the volatility equation with explanatory variables without having to worry about the positivity of the variance. We consider an additional term based on the first difference of the sentiment index, $\Delta s_t$, i.e., the volatility equation becomes
$$h_{t+1} = \omega + \phi h_t + \kappa u_t + \delta \Delta s_t.$$
The motivation for using the first differences of the sentiment index rather than the index level is that changes in the index might be more informative for explaining price uncertainty, and hence volatility, than the index itself. The sign of the parameter $\delta$ is not clear a priori, as volatility may increase when the sentiment index either increases or decreases. We have tried other functional forms for the impact of the sentiment index, such as $\delta(\Delta s_{t-1} - c)^2$ in the equation for $h_t$, where $c$ is a constant, for example the sample mean of the sentiment index. However, and perhaps surprisingly, the best form turned out to be the linear one.
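The role of the sign of $\delta$ can be illustrated by filtering the log volatility with a linear sentiment term added to the score-driven recursion. This is a sketch under stated assumptions: the timing convention (the change $\Delta s_t$ enters $h_{t+1}$), the function name and all numerical values are illustrative and not taken from the paper's estimates.

```python
def log_volatility_path(eps, d_sent, h0, omega, phi, kappa, delta, eta):
    """Filter h_{t+1} = omega + phi*h_t + kappa*u_t + delta*d_sent_t.

    eps    : standardized residuals
    d_sent : first differences of the sentiment index, aligned with eps
    """
    h = [h0]
    for e, ds in zip(eps, d_sent):
        b = e * e / (e * e + eta)      # beta-distributed term b_t
        u = (eta + 1.0) * b - 1.0      # bounded score u_t
        h.append(omega + phi * h[-1] + kappa * u + delta * ds)
    return h


params = dict(h0=-4.0, omega=-0.04, phi=0.99, kappa=0.05, delta=-0.1, eta=2.6)
after_drop = log_volatility_path([0.0], [-0.5], **params)[1]  # sentiment falls
after_rise = log_volatility_path([0.0], [0.5], **params)[1]   # sentiment rises
```

With a negative `delta`, the same shock is followed by a higher log volatility after a fall in sentiment than after a rise, which is the leverage-type asymmetry discussed in the text.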
Estimation of the transition parameter $\gamma$ is often problematic when this parameter is large, as the transition function is then steep and a large number of observations in a neighborhood of $s_t = \tau$ is required to obtain a reliable estimate of $\gamma$; see, e.g., Granger and Teräsvirta (1993) for a detailed discussion. In that case, they suggest first reparameterizing $g$ as $1/(1 + \exp(-\gamma(s_{t-1} - \tau)/\hat{\sigma}))$, where $\hat{\sigma}$ is the sample standard deviation of the sentiment index, then setting $\gamma$ to a fixed value, e.g., unity, and estimating the remaining parameters by MLE. The procedure can be reiterated using a set of fixed values for $\gamma$ on a grid. We follow their advice here, using the grid of integers from 1 to 10, and find that, after rescaling of the transition function, $\gamma = 3$ maximizes the likelihood and gives the best results.
We compare our model with one that ignores the sentiment index in the conditional mean and variance, i.e.,
$$\Delta y_t = \mu_1 + \exp(h_t)\, \varepsilon_t, \qquad h_{t+1} = \omega + \phi h_t + \kappa u_t.$$
We call this model M0, as opposed to the complete model M1 above, and we would like to test model M1 versus model M0 to see whether the sentiment index contributes significantly to explaining locally explosive behavior and volatility. Testing is, however, non-standard because under the null hypothesis, $H_0: \alpha = \mu_2 = 0$, the parameters $\tau$ and $\gamma$ are unidentified. Thus, likelihood ratio test statistics do not have a chi-square distribution under the null. This is a well-known problem in STAR models; see, e.g., Granger and Teräsvirta (1993) and van Dijk et al. (2002) for an overview. The simplest solution is to use an LM-type test: estimate an auxiliary regression, Equation (3), with coefficient $b$ and error term $e_t$, and then test the hypothesis $H_0': b = 0$. As shown by Luukkonen et al. (1988), testing $H_0'$ is equivalent to testing $H_0$, as the mean term in the auxiliary regression is a first-order Taylor expansion of the mean function of the STAR model.
Estimation results are reported in Table 1. In the sentiment-free model M0, the constant $\mu_1$ is positive and significant, while in model M1 the combined term of $\mu_1$ and $\mu_2$ is closer to zero.
The estimate of $\alpha$ is small but significant, indicating that the explosive regime is important. In addition, the difference in the log-likelihood values suggests that the goodness-of-fit of M1 is substantially higher than that of M0. A classical likelihood ratio test would clearly reject M0 in favor of M1. However, as outlined above, this test is non-standard in our context due to unidentified parameters under the null hypothesis. Instead, we use the auxiliary regression approach in Equation (3) and obtain the least squares estimate $\hat{b} = 0.0011$ with a standard error of 0.0002, so that the p-value of the t-test for $H_0': b = 0$ is very close to zero. Hence, we reject the hypothesis of a random walk in favor of STAR nonlinearity. Rather than estimating the degrees-of-freedom parameter $\eta$ directly, we estimate its inverse, $1/\eta$, as this often yields numerically more stable results; see, e.g., Harvey (2013). To summarize, our testing approach suggests a significant contribution of the sentiment index to explaining locally explosive behavior of the CRIX.
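Two computational steps described above can be sketched briefly: choosing $\gamma$ on a grid by profiling the likelihood, and turning the reported estimate and standard error of $b$ into a p-value. Both functions are illustrative assumptions rather than the authors' code: `profile_loglik` is a hypothetical user-supplied function that fixes $\gamma$, estimates the remaining parameters by MLE, and returns the maximized log-likelihood, and the p-value uses a standard normal approximation to the t-ratio.

```python
import math


def select_gamma(profile_loglik, grid=range(1, 11)):
    """Return the gamma on the grid with the highest profiled log-likelihood."""
    return max(grid, key=profile_loglik)


def two_sided_p_value(estimate, std_error):
    """Two-sided p-value of a t-ratio under a standard normal approximation."""
    t_ratio = estimate / std_error
    return math.erfc(abs(t_ratio) / math.sqrt(2.0))


# Reported auxiliary-regression estimate of b and its standard error;
# the implied t-ratio is 0.0011 / 0.0002 = 5.5.
p_value = two_sided_p_value(0.0011, 0.0002)
```

A t-ratio of 5.5 is far in the tail of the reference distribution, which is why the text can describe the p-value as very close to zero.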
The estimated transition function is given by $g(\cdot) = 1/(1 + \exp(-3(\cdot - \hat{\tau})/\hat{\sigma}))$, where $\hat{\sigma} = 0.3358$ is the standard deviation of the sentiment index. This function is shown in Figure 3, indicating the "bubble regime" for $s_t > \hat{\tau}$. Empirically, this regime occurs in about 16% of the sample period, as the sentiment index is larger than the estimated value of $\tau$ for 219 of 1340 observations.
Note that the estimated volatility parameters, except for the sentiment term, are rather similar for the two models and are characterized by high persistence, i.e., $\phi$ is close to one, and by fat tails of the conditional Student-t distribution, with a degrees-of-freedom parameter $\eta$ of about 2.6 for both models. However, the parameter related to the sentiment index, $\delta$, is significant and negative, indicating that volatility increases whenever there is a drop in the sentiment index. This is similar to classical financial markets, where negative news tends to have a stronger impact on volatility than positive news, an effect often referred to as the "leverage effect" and first noted by Black (1976) and Christie (1982); see Bauwens et al. (2012) for a recent overview. In our case, the asymmetry in the impact of positive and negative innovations on volatility is explicitly modeled by the change of the sentiment index.
Figure 4 shows the estimated log volatility process together with the estimated conditional mean of returns, i.e., $\mu_1 + \{\alpha y_{t-1} + \mu_2\} g(s_{t-1})$. The shaded areas highlight the estimated bubble periods, which mainly occurred in 2017 and parts of 2018. Not surprisingly, the shaded bubble periods correspond to substantially higher conditional mean returns, while the conditional mean is close to zero in the non-shaded areas. Unlike Hafner (2018), who finds a single bubble regime starting in May 2017 and whose sample ends in December 2017, we find multiple bubble periods, mainly during the period from May 2017 to April 2018.
Hence, the starting date of these periods coincides with that of the single regime of Hafner (2018), but, due to the volatility of the sentiment index, this regime is decomposed into several sub-regimes. While the procedure advocated by Hafner (2018) and Phillips et al. (2011) identifies bubble periods that are of long duration and quite inert to price decreases, our approach produces regimes of shorter duration because, as the sentiment index drops, one quickly leaves a bubble regime.
Furthermore, we find that volatility is generally higher in the bubble regimes, with an average log volatility of −3.58 compared with −4.04 outside of a bubble. However, short-term movements of volatility tend to react negatively to changes in the sentiment index, as reflected by the negative estimate of $\delta$ in model (2). Hence, our approach of using a sentiment index for modelling cryptocurrencies not only identifies locally explosive bubble periods, but also measures the impact of sentiment on volatility. Moreover, it can be used as a predictive device, on a daily basis, for both returns and volatility. The method we propose also has regulatory implications for cryptocurrency markets. Scams are likely to come into play given investors' irrational exuberance and the surge of initial coin offerings (ICOs), and they pose a particular challenge to regulators in the presence of bubbles.
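The regime classification, $s_t > \hat{\tau}$, explains why the detected bubble periods are short: a period ends as soon as sentiment falls back to or below the threshold. The sketch below marks such periods on a made-up sentiment path. The threshold value and the path are hypothetical (the estimate of $\tau$ is not reproduced here), while the steepness 3 and $\hat{\sigma} = 0.3358$ are the values reported in the text.

```python
import math


def rescaled_transition(s, tau, gamma=3.0, sigma=0.3358):
    """Logistic transition with the sentiment gap scaled by its std. deviation."""
    z = gamma * (s - tau) / sigma
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def bubble_periods(sentiment, tau):
    """Return (start, end) index pairs of consecutive days with s_t > tau."""
    periods, start = [], None
    for t, s in enumerate(sentiment):
        if s > tau and start is None:
            start = t                        # a bubble period opens
        elif s <= tau and start is not None:
            periods.append((start, t - 1))   # it closes when sentiment drops
            start = None
    if start is not None:
        periods.append((start, len(sentiment) - 1))
    return periods


# Hypothetical sentiment path and threshold, for illustration only.
s = [0.2, 0.9, 1.1, 0.3, 1.2, 0.4]
print(bubble_periods(s, tau=0.8))   # [(1, 2), (4, 4)]

share = 219 / 1340                  # reported share of days in the bubble regime
```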

Conclusions
Our model allows testing for speculative bubbles in cryptocurrencies using a sentiment index, which drives the transition in a regime-switching autoregression. For a popular cryptocurrency index, we find statistically significant regime nonlinearity and identify the corresponding bubble periods. Furthermore, volatility is specified as a score-driven EGARCH-type model augmented with the daily changes of the sentiment index. We find that volatility increases as the sentiment index decreases, and vice versa. This is similar to the leverage effect in classical financial markets, where bad news has a stronger effect on volatility than good news, but here this effect is explicitly driven by the sentiment index.
Several extensions of the present analysis are possible. First, it is possible to do forecasting. One-step-ahead forecasting is trivial, but multi-step ahead is not, due to the nonlinearity of the conditional mean function. Several approaches could be employed including bootstrap and Monte Carlo simulation-see, e.g., van Dijk et al. (2002). In addition, the time series properties of the sentiment index would have to be investigated to build a model that explicitly takes the sentiment dynamics into account. Second, we could compare the statistical properties of our testing approach with those of a pure time series based approach such as Phillips et al. (2011). The latter approach uses less information and hence should have less power if the true data generating process is close to a smooth transition autoregression. This is left for future research.
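As an illustration of the Monte Carlo route to multi-step forecasts, the sketch below simulates log-price paths from the conditional mean $\mu_1 + \{\alpha y_{t-1} + \mu_2\} g(s_{t-1})$ with Student-t shocks. It rests on strong simplifying assumptions that the text itself flags as open issues: the future sentiment path is supplied as a fixed scenario rather than modeled, volatility is held at a constant `scale` instead of following the score-driven recursion, and all names and parameter values are hypothetical.

```python
import math
import random


def transition(s, gamma, tau):
    """Numerically stable logistic transition function."""
    z = gamma * (s - tau)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def student_t(eta, rng):
    """Draw a Student-t variate with eta degrees of freedom (scale one)."""
    chi2 = rng.gammavariate(eta / 2.0, 2.0)   # chi-square with eta d.o.f.
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / eta)


def simulate_final_prices(y0, s_path, params, scale, n_paths=1000, seed=0):
    """Simulate log prices len(s_path) steps ahead and return the end values.

    s_path[k] is the sentiment value entering the transition function at
    step k + 1 (an assumed scenario); `scale` is a constant stand-in
    for exp(h_t).
    """
    rng = random.Random(seed)
    mu1, mu2, alpha, gamma, tau, eta = params
    finals = []
    for _ in range(n_paths):
        y = y0
        for s in s_path:
            mean = mu1 + (alpha * y + mu2) * transition(s, gamma, tau)
            y += mean + scale * student_t(eta, rng)
        finals.append(y)
    return finals


# Illustrative parameters: (mu1, mu2, alpha, gamma, tau, eta).
params = (0.001, 0.0, 0.01, 3.0, 0.5, 2.6)
finals = simulate_final_prices(7.0, [1.0] * 5, params, scale=0.03)
point_forecast = sum(finals) / len(finals)
```

Averaging the simulated end values gives a point forecast, and their empirical quantiles give forecast intervals; a bootstrap variant would resample estimated residuals instead of drawing from the Student-t distribution.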