Statistical Arbitrage in Cryptocurrency Markets
- Development of an advanced, machine-learning-based statistical arbitrage approach for the cryptocurrency space: we build our approach on the ideas of Fischer and Krauss (2018); Huck (2009, 2010); Krauss et al. (2017); Moritz and Zimmermann (2014); Takeuchi and Lee (2013), who have developed similar methods for U.S. cash equities, but on much lower frequencies (days to months). With the present manuscript, we successfully show that relative-value arbitrage opportunities exist in this young and aspiring market, given that a random forest is able to produce daily returns of 7.1 bps after transaction costs.
- Consideration of microstructural effects: advancing to higher frequencies, e.g., minute-binned data, brings along substantial challenges. First, trading volume needs to be taken into account. In cash equities, many strategies are backtested on the closing price, which captures 7 percent of daily liquidity for NYSE-listed stocks (see Intercontinental Exchange 2018). In stark contrast, liquidity needs to be carefully assessed for every minute bar in the cryptocurrency space, especially for smaller coins. We incorporate this effect in our study and only execute trades when liquidity is present. Second, microstructural effects, especially the bid-ask bounce, need to be considered. We therefore introduce a lag between the price on which the prediction is generated and the subsequent price on which execution takes place. Hence, we eliminate the bid-ask bounce (see, e.g., Gatev et al. 2006) and render the strategy realistic in the digital age, given that there is sufficient time for signal generation, order routing, and order execution.
- Shining light into the black box: machine learning models often have the downside of being opaque. Hence, we analyze feature importances, and we compare the random forest to the transparent logistic regression. We find that both methods capture short-term characteristics in the data, with returns over the past 60 min contributing most when explaining future returns over the subsequent 120 min.
2. Data and Software
3.1. Generation of Training and Trading Set
3.2. Feature and Target Generation
3.2.1. Features—Multiperiod Returns
3.3.1. Logistic regression
3.3.2. Random forest
3.4. Forecasting, Ranking and Trading
- Execution gap: We create the trading signal at the end of minute t and place the order for execution at the closing price of the following minute t+1. In other words, we introduce a one-period gap between signal generation and execution to account for the time frame required for data processing, prediction making, and order management.
- Volume constraint (opening of position): A position is only opened when at least one unit of the currency pair is traded at the respective point in time. Otherwise, the order is canceled and the capital earmarked for the position is kept in cash for the two-hour period.
- Volume constraint (closing of position): Once the position has reached its two-hour lifetime, a closing order is triggered and executed at the first bar with sufficient volume.
- Elimination of starting point bias: To avoid any bias related to the starting point (point in time at which the first portfolio is opened), we open a new portfolio at every minute and average the results across the 120 portfolios that are opened at each time t.
- Transaction costs: We assume 15 bps per half turn, based on analyses on transaction costs and liquidity costs provided in Schnaubelt et al. (2019) on cryptocurrency limit order book data.
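The trading rules above can be sketched for a single trade as follows. This is a minimal illustration, not the backtesting engine used in the study: the `Bar` layout, the function name `trade_return`, and the simple subtraction of round-trip costs from the gross return are assumptions made for this example. The one-minute gap, the 120-minute holding period, the one-unit volume threshold, and the 15 bps per half turn are taken from the text.

```python
from typing import List, Optional, Tuple

# One-minute bar as (closing price, traded volume); this layout is an assumption.
Bar = Tuple[float, float]

GAP_MINUTES = 1              # signal at end of minute t, execution in minute t+1
HOLDING_MINUTES = 120        # two-hour holding period
MIN_VOLUME = 1.0             # at least one unit must be traded in the bar
COST_PER_HALF_TURN = 0.0015  # 15 bps per half turn


def trade_return(bars: List[Bar], t: int, direction: int) -> Optional[float]:
    """Net return of a trade signalled at the end of minute t.

    direction is +1 for a long and -1 for a short position. Returns 0.0 when
    the opening order is cancelled (capital stays in cash) and None when the
    data end before the position can be opened or closed.
    """
    entry = t + GAP_MINUTES
    if entry >= len(bars):
        return None
    entry_price, entry_volume = bars[entry]
    if entry_volume < MIN_VOLUME:
        return 0.0  # volume constraint on opening: order is cancelled

    # Volume constraint on closing: first bar with volume after the lifetime.
    for exit_price, exit_volume in bars[entry + HOLDING_MINUTES:]:
        if exit_volume >= MIN_VOLUME:
            gross = direction * (exit_price / entry_price - 1.0)
            return gross - 2 * COST_PER_HALF_TURN  # two half turns per trade
    return None


# Toy data: flat prices, an illiquid bar at the scheduled exit, then a liquid one.
bars = [(100.0, 5.0)] * 121 + [(102.0, 0.0), (103.0, 2.0)]
print(trade_return(bars, t=0, direction=+1))  # long trade, exits at 103
print(trade_return(bars, t=0, direction=-1))  # short trade on the same bars
```

Applying such a function to the signals generated at every minute and averaging the results would mirror the overlapping-portfolio scheme used to avoid starting point bias.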
4.1. Trade-Level Results
- Positive mean returns: Both models yield positive and statistically significant mean returns, with the RF (3.8 bps) outperforming the LR (2.0 bps) by a factor of almost two. Looking at the contribution from long trades and short trades, we find that the latter are more profitable (−2.1 bps vs. 5.6 bps for the LR and 0.2 bps vs. 6.4 bps for the RF), a finding that is likely driven by the overall decline of the cryptocurrency market during this period.
- Extreme price movements: Looking at the minimum (−42.8 percent) and maximum returns (34.4 percent), we find astonishingly high values given the two-hour holding period. However, these observations can be attributed to the extreme price movements in cryptocurrency markets (see Osterrieder and Lorenz 2017). The first and third quartiles are less extreme, with values between −1.2 and 1.3 percent for both models.
- Negative median: We further notice that both the RF and the LR model have negative median returns. In other words, more trades lead to a loss than to a profit. However, taking into account the magnitude of the profits and losses, we find that the average profit of a winning trade surpasses the average loss of a losing trade by approximately 5 bps (LR) and 10 bps (RF); simply speaking, more money is made when the model is right than lost when it is wrong. As a result, the mean trade of the RF is positive.
- Skewness and kurtosis: Both LR and RF exhibit positive skewness, which is a favorable property for investors, given that the right tail tends to be more pronounced than the left tail. By contrast, kurtosis values above 9 indicate leptokurtic behavior and that significant risk lies in the extremes (see Osterrieder and Lorenz 2017).
- Differing number of trades: Finally, we observe that the number of executed trades differs between the two models as well as the long and short leg. As described in the previous section, our backtesting engine cancels orders in case no volume is available to execute the respective trade. We may therefore cautiously conclude that the RF model selects a larger share of less liquid coins (119,829 executed trades) compared to the LR model (158,408 trades). Note: the overall high number of trades results from the backtesting logic in which we open a new portfolio with three long orders and three short orders by the end of each minute to avoid starting point bias.
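The interplay between hit rate, average win, and average loss described above can be checked with a short calculation. The inputs are the "Share > 0", "Mean return positive trade", and "Mean return negative trade" values reported in the trade-level summary table. Assigning them to the combined long-short results of the LR and the RF is an assumption of this sketch, as is treating every non-positive trade as a losing trade.

```python
def mean_trade_return(share_positive: float,
                      mean_positive: float,
                      mean_negative: float) -> float:
    """Mean return per trade, split into winning and losing trades."""
    return share_positive * mean_positive + (1.0 - share_positive) * mean_negative


# Assumed mapping of table values to the two models (long and short legs combined).
lr = mean_trade_return(0.49897, 0.01739, -0.01691)
rf = mean_trade_return(0.49587, 0.01774, -0.01669)

print(f"LR: {lr * 1e4:.1f} bps per trade")  # LR: 2.0 bps per trade
print(f"RF: {rf * 1e4:.1f} bps per trade")  # RF: 3.8 bps per trade
```

Under these assumptions, a hit rate slightly below 50 percent still yields a positive mean because the average win exceeds the average loss in absolute terms.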
4.2. Return Development over Time
- Panel A (daily return characteristics): With regard to mean return, the random forest surpasses the logistic regression by 2.2 bps per day (7.1 bps vs. 4.9 bps). We further observe that the minimum and maximum daily returns remain within reasonable levels of −2.6 percent (LR) and +2.1 percent (RF), respectively. The underlying reason is the large number of active positions at each point in time (see Section 3.4), which also explains the low standard deviation of 66 bps (LR) and 53 bps (RF). Looking at Bitcoin (BTC) and the general market (MKT), we find mean returns of −0.5 bps and −28.1 bps per day, respectively.
- Panel B (risk metrics): Panel B reveals favorable risk metrics for the random forest, with a 1-percent value at risk of −1.0 percent compared to −1.5 percent for the logistic regression. Moreover, we find a substantially smaller maximum drawdown of −2.4 percent for the RF and −5.9 percent for the LR, compared to −26.7 percent for Bitcoin and −32.9 percent for the general market. The difference is caused by the short leg of the portfolio, i.e., the investment in the flop-3 coins, which helps in eliminating market risk.
- Panel C (annualized risk-return metrics): Finally, Panel C depicts annualized risk-return metrics. We observe annualized returns of 29.0 percent for the random forest and 18.8 percent for the logistic regression, compared to vastly negative results for the buy-and-hold benchmarks. Given the low volatility, these results translate into Sharpe ratios of 1.4 (LR) and 2.5 (RF), respectively, thereby outperforming both Bitcoin and the general market by a clear margin.
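As a rough plausibility check, the reported daily means and standard deviations can be related to the annualized Sharpe ratios, and a maximum drawdown can be computed from a daily return series. The sketch below assumes 365 trading days per year and a zero risk-free rate; neither convention is stated in this section, and the function names are hypothetical.

```python
import math
from typing import Sequence

DAYS_PER_YEAR = 365  # assumption: cryptocurrencies trade every calendar day


def sharpe_from_moments(daily_mean: float, daily_std: float) -> float:
    """Annualized Sharpe ratio from daily mean and standard deviation,
    assuming a zero risk-free rate."""
    return daily_mean / daily_std * math.sqrt(DAYS_PER_YEAR)


def max_drawdown(daily_returns: Sequence[float]) -> float:
    """Largest peak-to-trough loss of the compounded wealth curve (<= 0)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


# Daily mean and standard deviation as quoted in Panel A (in decimal form).
print(sharpe_from_moments(0.00071, 0.0053))  # random forest
print(sharpe_from_moments(0.00049, 0.0066))  # logistic regression

# Toy series: +10 percent, -50 percent, +20 percent.
print(max_drawdown([0.10, -0.50, 0.20]))
```

Under these assumptions the check gives roughly 2.6 for the random forest and 1.4 for the logistic regression, close to the reported 2.5 and 1.4; rounding of the quoted inputs may account for the small difference.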
4.3. Beyond Returns—Shedding Light Into the Patterns Exploited for Trading
- Feature importance analysis: The upper half of the figure shows the features (explanatory variables) used by the random forest, sorted by feature importance in descending order. The most important features are the returns over the past 20, 40 and 60 min. In other words, the random forest pays most attention to the price development over the past hour. By contrast, the longer term price development (past 12–24 h) does not seem to have a substantial contribution to predicting the price change over the next two hours.
- Coefficient analysis: Looking at the lower part of the figure, we take advantage of the high transparency and explanatory value of the logistic regression model. The largest regression coefficient in absolute terms, approximately −6.5, belongs to the return over the past 20 min, followed by the coefficients for the 40 and 60 min returns. Moreover, we find that almost all regression coefficients exhibit a negative sign. In other words, the model likely produces a positive forecast (long) if the respective coin has experienced a decline in the recent past (negative feature values are multiplied with negative regression coefficients), and vice versa. We may therefore cautiously conclude that the model capitalizes on short-term mean-reversion (see Jegadeesh 1990; Lehmann 1990).
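The sign argument above can be illustrated with a small sketch of multiperiod return features fed into a logistic function. Only the coefficient of −6.5 for the 20 min return comes from the text; the other two coefficients, the zero intercept, the simple-return definition of the features, and all function names are assumptions chosen for illustration.

```python
import math
from typing import Dict, Sequence

# Hypothetical coefficients per lookback in minutes; only -6.5 is from the text.
COEFFICIENTS: Dict[int, float] = {20: -6.5, 40: -4.0, 60: -3.0}


def multiperiod_return(prices: Sequence[float], t: int, m: int) -> float:
    """Simple return over the m minutes ending at minute t (assumed definition)."""
    return prices[t] / prices[t - m] - 1.0


def long_probability(prices: Sequence[float], t: int,
                     intercept: float = 0.0) -> float:
    """Logistic forecast: values above 0.5 correspond to a long signal."""
    z = intercept + sum(coef * multiperiod_return(prices, t, m)
                        for m, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


# Toy price paths over 61 minutes: one steadily falling, one steadily rising.
falling = [100.0 - 0.05 * i for i in range(61)]
rising = [100.0 + 0.05 * i for i in range(61)]

print(long_probability(falling, t=60))  # above 0.5: recent decline, long forecast
print(long_probability(rising, t=60))   # below 0.5: recent rise, short forecast
```

With negative coefficients throughout, a recent decline pushes the forecast above 0.5 and a recent rise pushes it below, which is the mean-reversion pattern described in the coefficient analysis.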
5. Discussion—Limits to Arbitrage
Conflicts of Interest
| MKT | market, i.e., an equal investment in all coins at the beginning of the trading period |
| VaR | value at risk |
- Balcilar, Mehmet, Elie Bouri, Rangan Gupta, and David Roubaud. 2017. Can volume predict Bitcoin returns and volatility? A quantiles-based approach. Economic Modelling 64: 74–81.
- Baur, Dirk G., and Thomas Dimpfl. 2018. Asymmetric volatility in cryptocurrencies. Economics Letters 173: 148–51.
- Beneki, Christina, Alexandros Koulis, Nikolaos A. Kyriazis, and Stephanos Papadamou. 2019. Investigating volatility transmission and hedging properties between Bitcoin and Ethereum. Research in International Business and Finance 48: 219–27.
- Berkson, Joseph. 1953. A statistically precise and relatively simple method of estimating the bio-assay with quantal response, based on the logistic function. Journal of the American Statistical Association 48: 565–99.
- Bowen, David A., and Mark C. Hutchinson. 2016. Pairs trading in the UK equity market: Risk and return. The European Journal of Finance 22: 1363–87.
- Breiman, Leo. 1996. Bagging predictors. Machine Learning 24: 123–40.
- Breiman, Leo. 2001. Random forests. Machine Learning 45: 5–32.
- coinmarketcap.com. 2018. Overview of available cryptocurrencies. Available online: coinmarketcap.com (accessed on 27 July 2018).
- Colianni, Stuart, Stephanie Rosales, and Michael Signorotti. 2015. Algorithmic Trading of Cryptocurrency Based on Twitter Sentiment Analysis. Working Paper. Stanford, CA, USA: Stanford University.
- cryptocompare.com. 2018. Overview of CryptoCompare API. Available online: cryptocompare.com (accessed on 6 September 2018).
- Dyhrberg, Anne Haubo. 2016. Bitcoin, gold and the dollar – A GARCH volatility analysis. Finance Research Letters 16: 85–92.
- Enke, David, and Suraphan Thawornwong. 2005. The use of data mining and neural networks for forecasting stock market returns. Expert Systems with Applications 29: 927–40.
- Fama, Eugene F. 1970. Efficient capital markets: A review of theory and empirical work. The Journal of Finance 25: 383–417.
- Fischer, Thomas, and Christopher Krauss. 2018. Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research 270: 654–69.
- Garcia, David, and Frank Schweitzer. 2015. Social signals and algorithmic trading of Bitcoin. Royal Society Open Science 2: 150288.
- Gatev, Evan, William N. Goetzmann, and K. Geert Rouwenhorst. 2006. Pairs trading: Performance of a relative-value arbitrage rule. Review of Financial Studies 19: 797–827.
- Hutto, Clayton J., and Eric Gilbert. 2014. VADER: A parsimonious rule-based model for sentiment analysis of social media text. Paper presented at the Eighth International Conference on Weblogs and Social Media, Ann Arbor, MI, USA, June 1–4.
- Gregoriou, Greg N. 2012. Handbook of Short Selling. Amsterdam and Boston: Academic Press.
- Ha, Sungjoo, and Byung-Ro Moon. 2018. Finding attractive technical patterns in cryptocurrency markets. Memetic Computing 10: 301–6.
- Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2008. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Series in Statistics; New York: Springer.
- Ho, Tin Kam. 1995. Random decision forests. Paper presented at the third International Conference on Document Analysis and Recognition, Montreal, QC, Canada, August 14–16; vol. 1, pp. 278–82.
- Ho, Tin Kam. 1998. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20: 832–44.
- Huck, Nicolas. 2009. Pairs selection and outranking: An application to the S&P 100 index. European Journal of Operational Research 196: 819–25.
- Huck, Nicolas. 2010. Pairs trading and outranking: The multi-step-ahead forecasting case. European Journal of Operational Research 207: 1702–16.
- IEEE, and The Open Group. 2018. The open group base specifications. 7. Available online: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_16 (accessed on 6 September 2018).
- Intercontinental Exchange. 2018. Behind the Scenes—An insider’s guide to the NYSE closing auction. Available online: https://www.nyse.com/article/nyse-closing-auction-insiders-guide (accessed on 30 December 2018).
- Jegadeesh, Narasimhan. 1990. Evidence of predictable behavior of security returns. The Journal of Finance 45: 881.
- Jiang, Zhengyao, and Jinjun Liang. 2017. Cryptocurrency portfolio management with deep reinforcement learning. arXiv, arXiv:1612.01277v5.
- Jones, Eric, Travis Oliphant, and Pearu Peterson. 2014. SciPy: open source scientific tools for Python. Available online: http://www.scipy.org/ (accessed on 30 December 2018).
- Kim, Y. Bin, Jun G. Kim, Wook Kim, Jae H. Im, Tae H. Kim, Shin J. Kang, and Chang H. Kim. 2016. Predicting fluctuations in cryptocurrency transactions based on user comments and replies. PLoS ONE 11: e0161197.
- Kleinbaum, David G., and Mitchel Klein. 2010. Logistic Regression: A Self-Learning Text. New York: Springer.
- Koutmos, Dimitrios. 2018. Return and volatility spillovers among cryptocurrencies. Economics Letters 173: 122–27.
- Krauss, Christopher. 2017. Statistical arbitrage pairs trading strategies: Review and outlook. Journal of Economic Surveys 31: 513–45.
- Krauss, Christopher, Xuan Anh Do, and Nicolas Huck. 2017. Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500. European Journal of Operational Research 259: 689–702.
- Lehmann, Bruce N. 1990. Fads, martingales, and market efficiency. The Quarterly Journal of Economics 105: 1.
- Leung, Mark T., Hazem Daouk, and An-Sing Chen. 2000. Forecasting stock indices: A comparison of classification and level estimation models. International Journal of Forecasting 16: 173–90.
- Lintilhac, Paul S., and Agnes Tourin. 2017. Model-based pairs trading in the Bitcoin markets. Quantitative Finance 17: 703–16.
- Liu, Bo, Lo-Bin Chang, and Hélyette Geman. 2017. Intraday pairs trading strategies on high frequency data: The case of oil companies. Quantitative Finance 17: 87–100.
- Madan, Isaac, Shaurya Saluja, and Aojia Zhao. 2015. Automated Bitcoin Trading via Machine Learning Algorithms. Working Paper. Stanford, CA, USA: Stanford University.
- McKinney, Wes. 2010. Data structures for statistical computing in python. Paper presented at the 9th Python in Science Conference, Austin, TX, USA, June 28–July 3; vol. 445, pp. 51–56.
- McNally, Sean, Jason Roche, and Simon Caton. 2018. Predicting the price of Bitcoin using machine learning. Paper presented at the 26th International Conference on Parallel, Distributed and Network-Based Processing, Cambridge, UK, March 21–23; pp. 339–43.
- Moritz, Benjamin, and Tom Zimmermann. 2014. Deep Conditional Portfolio Sorts: The Relation between Past and Future Stock Returns. Working Paper. Munich, Germany: LMU Munich, Cambridge, MA, USA: Harvard University.
- Osterrieder, Joerg, and Julian Lorenz. 2017. A statistical risk assessment of Bitcoin and its extreme tail behavior. Annals of Financial Economics 12: 1750003.
- Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12: 2825–30.
- Python Software Foundation. 2016. Python 3.5.2 Documentation. Available online: https://docs.python.org/3.5/ (accessed on 15 December 2018).
- Quantopian Inc. 2016. Empyrical: Common Financial Risk Metrics. Available online: https://github.com/quantopian/empyrical (accessed on 15 December 2018).
- Raschka, Sebastian. 2015. Python Machine Learning. Birmingham: Packt Publishing.
- Schnaubelt, Matthias, Jonas Rende, and Christopher Krauss. 2019. Testing Stylized Facts of Bitcoin Limit Order Books. Journal of Risk and Financial Management 12: 25.
- Shah, Devavrat, and Kang Zhang. 2014. Bayesian regression and Bitcoin. Paper presented at the 52nd Conference on Communication, Control, and Computing, Monticello, IL, USA, October 1–3; pp. 409–14.
- Takeuchi, Lawrence, and Yu-Ying Lee. 2013. Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks. Working Paper. Stanford, CA, USA: Stanford University.
- Tourin, Agnès, and Raphael Yan. 2013. Dynamic pairs trading using the stochastic control approach. Journal of Economic Dynamics and Control 37: 1972–81.
- Van der Walt, S., S. C. Colbert, and G. Varoquaux. 2011. The NumPy array: A structure for efficient numerical computation. Computing in Science & Engineering 13: 22–30.
- Warriner, Amy Beth, Victor Kuperman, and Marc Brysbaert. 2013. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods 45: 1191–207.
The emotional valence and opinion polarization are computed on a daily basis as proposed by Warriner et al. (2013).
Not all time-series examined are complete in the sense that they cover the whole period from January to September 2018. This could be due to several reasons such as the delisting of a coin. It is noteworthy that such time-series are not eliminated but traded according to the available data.
More precisely, by executing at the opening price of minute t+1, we still leave a small gap compared to an execution at the closing price of minute t (which is used to make the prediction).
| Share > 0 | 0.46677 | 0.52671 | 0.49897 | 0.45622 | 0.52395 | 0.49587 |
| Mean return positive trade | 0.01726 | 0.01750 | 0.01739 | 0.01802 | 0.01757 | 0.01774 |
| Mean return negative trade | −0.01551 | −0.01828 | −0.01691 | −0.01508 | −0.01799 | −0.01669 |
| | Share > 0 | 0.51807 | 0.53012 | 0.50602 | 0.50602 |
| B | Historic VaR 1% | −0.01523 | −0.01025 | −0.09112 | −0.10461 |
| | Historic VaR 5% | −0.00809 | −0.00756 | −0.05482 | −0.05978 |
| | Gap 0 | Gap 1 | Gap 2 | Gap 3 | Gap 4 | Gap 5 |
|---|---|---|---|---|---|---|
| Share > 0 | 0.52342 | 0.49587 | 0.49294 | 0.49070 | 0.48810 | 0.48605 |
| Mean return positive trade | 0.01869 | 0.01774 | 0.01763 | 0.01756 | 0.01755 | 0.01745 |
| Mean return negative trade | −0.01623 | −0.01669 | −0.01666 | −0.01660 | −0.01656 | −0.01651 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Fischer, T.G.; Krauss, C.; Deinert, A. Statistical Arbitrage in Cryptocurrency Markets. J. Risk Financial Manag. 2019, 12, 31. https://doi.org/10.3390/jrfm12010031