# Price Movement Prediction of Cryptocurrencies Using Sentiment Analysis and Machine Learning

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Market Data

#### 2.2. Social Data

- Was created during the study period: earlier tweets are not taken into account, even though they may influence current behavior, because such analysis is outside the scope of this study.
- Contained the name (e.g., bitcoin) or the ticker symbol (e.g., btc) of one of the analyzed currencies in either its text fields or its tags: this gives a high degree of confidence that the tweet is at least related to one of the cryptocurrencies in question.
- Was written in English: being dictionary based, our sentiment analysis tool only works with the English language.
- Was not duplicated: re-tweets were allowed because they may signal a sentiment trend, but duplicated tweets were not taken into consideration because this type of activity is mainly displayed by bot accounts.
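The four criteria above can be sketched as a single filter function. This is a minimal illustration, not the authors' collection code: the tweet dictionary layout (`text`, `tags`, `lang`, `created_at`, `is_retweet`), the keyword set, and the dates are all assumptions made for the example.

```python
from datetime import datetime

# Assumed search terms: names and ticker symbols of the analyzed currencies.
KEYWORDS = {"bitcoin", "btc", "ethereum", "eth", "ripple", "xrp", "litecoin", "ltc"}


def keep_tweet(tweet, start, end, seen_texts):
    """Return True if a tweet (hypothetical dict layout) passes all four criteria."""
    # 1. Created during the study period.
    if not (start <= tweet["created_at"] <= end):
        return False
    # 2. Mentions a currency name or ticker symbol in its text or tags.
    tokens = set(tweet["text"].lower().split()) | {t.lower() for t in tweet["tags"]}
    tokens = {tok.strip("#$.,!?:;") for tok in tokens}
    if not tokens & KEYWORDS:
        return False
    # 3. Written in English.
    if tweet["lang"] != "en":
        return False
    # 4. Not a duplicate. Re-tweets are allowed, so only non-retweets are checked.
    if not tweet["is_retweet"]:
        if tweet["text"] in seen_texts:
            return False
        seen_texts.add(tweet["text"])
    return True


# Placeholder dates for the example only, not the study's actual period.
start, end = datetime(2018, 2, 1), datetime(2018, 4, 1)
base = {"tags": [], "lang": "en", "created_at": datetime(2018, 3, 1), "is_retweet": False}
tweets = [
    {**base, "text": "Bitcoin is rallying! #btc"},                      # kept
    {**base, "text": "Bitcoin is rallying! #btc"},                      # duplicate
    {**base, "text": "Bitcoin is rallying! #btc", "is_retweet": True},  # re-tweet, kept
    {**base, "text": "El bitcoin sube", "lang": "es"},                  # not English
    {**base, "text": "Nice weather today"},                             # no keyword
    {**base, "text": "btc to the moon", "created_at": datetime(2017, 1, 1)},  # too early
]
seen = set()
kept = [t for t in tweets if keep_tweet(t, start, end, seen)]
```

Matching duplicates on exact text is the simplest possible choice; a real pipeline might normalize whitespace or URLs first.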

#### 2.3. Sentiment Analysis

#### 2.4. Feature Vectors

- $neu$ is the average of neutral sentiments over tweets, defined as $\frac{\sum_{i=1}^{n} t_{neu,i}}{n}$
- $neg$ is the average of negative sentiments over tweets, defined as $\frac{\sum_{i=1}^{n} t_{neg,i}}{n}$
- $norm$ is the average, over tweets, of the sum of the valence scores of each word, defined as $\frac{\sum_{i=1}^{n} t_{norm,i}}{n}$
- $pos$ is the average of positive sentiments over tweets, defined as $\frac{\sum_{i=1}^{n} t_{pos,i}}{n}$
- $pol$ is the geometric mean of $pos$ and $neg$, defined as $\sqrt{V_{pos} V_{neg}}$
- close is the closing price in the time period
- high is the highest price in the time period
- low is the lowest price in the time period
- open is the opening price in the time period
- volumeto is the trading volume for the time period
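As a rough sketch of how the ten features listed above could be assembled for one time period, assuming each tweet has already been scored: the per-tweet dictionary keys (`neu`, `neg`, `norm`, `pos`), the `candle` dictionary, and the function name are hypothetical, and the element order simply follows the list.

```python
from math import sqrt
from statistics import mean


def feature_vector(tweet_scores, candle):
    """Build [neu, neg, norm, pos, pol, close, high, low, open, volumeto].

    tweet_scores: list of per-tweet sentiment dicts (assumed keys).
    candle: market data for the same time period (assumed keys).
    """
    neu = mean([s["neu"] for s in tweet_scores])
    neg = mean([s["neg"] for s in tweet_scores])
    norm = mean([s["norm"] for s in tweet_scores])
    pos = mean([s["pos"] for s in tweet_scores])
    # Geometric mean of the averaged positive and negative sentiments.
    pol = sqrt(pos * neg)
    return [neu, neg, norm, pos, pol,
            candle["close"], candle["high"], candle["low"],
            candle["open"], candle["volumeto"]]


# Made-up values for illustration only.
scores = [
    {"neu": 0.5, "neg": 0.1, "norm": 0.6, "pos": 0.4},
    {"neu": 0.7, "neg": 0.3, "norm": -0.2, "pos": 0.0},
]
candle = {"close": 105.0, "high": 110.0, "low": 95.0, "open": 100.0, "volumeto": 1234.5}
fv = feature_vector(scores, candle)
```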

#### 2.5. Multi-Layer Perceptron

#### 2.6. Support Vector Machines

#### 2.7. Random Forests

#### 2.8. Training

## 3. Results

#### 3.1. Setup

#### 3.2. Evaluation

- ${t}_{p}$ = Number of true positive values
- ${t}_{n}$ = Number of true negative values
- ${f}_{p}$ = Number of false positive values
- ${f}_{n}$ = Number of false negative values.
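For reference, the standard binary-classification definitions of accuracy, precision, recall and $F_1$ in terms of these four counts can be sketched as follows. The function name is hypothetical, and this excerpt does not show how the reported scores were averaged across classes, so the sketch may not reproduce the tabulated values exactly.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration only.
m = classification_metrics(tp=8, tn=5, fp=2, fn=3)
```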

#### 3.3. Results

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Data Availability


Cryptocurrency | Collected Tweets | Share of Total |
---|---|---|
Bitcoin | 13,096,598 | 63.00% |
Ethereum | 5,366,126 | 25.81% |
Ripple | 1,143,634 | 5.50% |
Litecoin | 1,183,214 | 5.69% |
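The percentage column can be checked directly from the tweet counts. A small sketch, using only the numbers in the table above (the variable names are illustrative):

```python
# Tweet counts per cryptocurrency, copied from the table above.
collected = {
    "Bitcoin": 13_096_598,
    "Ethereum": 5_366_126,
    "Ripple": 1_143_634,
    "Litecoin": 1_183_214,
}

total = sum(collected.values())
# Each currency's share of all collected tweets, in percent, rounded to two decimals.
shares = {name: round(100 * count / total, 2) for name, count in collected.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```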

Cryptocurrency | Price Increased (days) | Price Decreased (days) |
---|---|---|
Bitcoin | 28 | 32 |
Ethereum | 28 | 32 |
Ripple | 23 | 37 |
Litecoin | 29 | 31 |

**Table 3.** Results of applying multi-layer perceptron (MLP), support vector machine (SVM) and random forest (RF) using Twitter data, market data or both for predicting daily market movements for Bitcoin.

Model | Accuracy (95% CI) | Precision | Recall | $F_1$ Score |
---|---|---|---|---|
MLP Twitter | 0.39 (±0.02) | 0.38 | 0.39 | 0.38 |
MLP Market | 0.72 (±0.03) | 0.74 | 0.72 | 0.71 |
MLP Twitter and Market | 0.72 (±0.06) | 0.76 | 0.72 | 0.72 |
SVM Twitter | 0.50 (±0.03) | 0.29 | 0.50 | 0.37 |
SVM Market | 0.55 (±0.03) | 0.53 | 0.56 | 0.47 |
SVM Twitter and Market | 0.55 (±0.03) | 0.31 | 0.56 | 0.40 |
RF Twitter | 0.44 (±0.04) | 0.50 | 0.80 | 0.62 |
RF Market | 0.61 (±0.04) | 0.67 | 0.25 | 0.36 |
RF Twitter and Market | 0.44 (±0.04) | 0.28 | 0.44 | 0.34 |
Random | 0.50 (±0.28) | 0.49 | 0.50 | 0.50 |
Majority | 0.55 (±0.00) | 0.31 | 0.56 | 0.40 |

**Table 4.** Results of applying MLP, SVM and RF using Twitter data, market data or both for predicting daily market movements for Ethereum.

Model | Accuracy (95% CI) | Precision | Recall | $F_1$ Score |
---|---|---|---|---|
MLP Twitter | 0.39 (±0.02) | 0.44 | 0.39 | 0.38 |
MLP Market | 0.44 (±0.02) | 0.44 | 0.39 | 0.35 |
MLP Twitter and Market | 0.44 (±0.03) | 0.56 | 0.44 | 0.39 |
SVM Twitter | 0.39 (±0.03) | 0.15 | 0.39 | 0.22 |
SVM Market | 0.39 (±0.03) | 0.15 | 0.39 | 0.22 |
SVM Twitter and Market | 0.39 (±0.03) | 0.15 | 0.39 | 0.22 |
RF Twitter | 0.33 (±0.03) | 0.14 | 0.33 | 0.19 |
RF Market | 0.28 (±0.03) | 0.12 | 0.28 | 0.17 |
RF Twitter and Market | 0.39 (±0.03) | 0.15 | 0.39 | 0.22 |
Random | 0.50 (±0.28) | 0.54 | 0.50 | 0.49 |
Majority | 0.61 (±0.00) | 0.37 | 0.61 | 0.46 |

**Table 5.** Results of applying MLP, SVM and RF using Twitter data, market data or both for predicting daily market movements for Ripple.

Model | Accuracy (95% CI) | Precision | Recall | $F_1$ Score |
---|---|---|---|---|
MLP Twitter | 0.54 (±0.03) | 0.50 | 0.50 | 0.50 |
MLP Market | 0.64 (±0.04) | 0.68 | 0.67 | 0.66 |
MLP Twitter and Market | 0.56 (±0.02) | 0.56 | 0.56 | 0.55 |
SVM Twitter | 0.53 (±0.04) | 0.60 | 0.56 | 0.50 |
SVM Market | 0.50 (±0.04) | 0.50 | 0.50 | 0.41 |
SVM Twitter and Market | 0.50 (±0.04) | 0.25 | 0.50 | 0.33 |
RF Twitter | 0.39 (±0.03) | 0.39 | 0.39 | 0.39 |
RF Market | 0.50 (±0.03) | 0.50 | 0.50 | 0.41 |
RF Twitter and Market | 0.44 (±0.03) | 0.44 | 0.44 | 0.44 |
Random | 0.50 (±0.28) | 0.50 | 0.50 | 0.49 |
Majority | 0.50 (±0.00) | 0.25 | 0.50 | 0.33 |

**Table 6.** Results of applying MLP, SVM and RF using Twitter data, market data or both for predicting daily market movements for Litecoin.

Model | Accuracy (95% CI) | Precision | Recall | $F_1$ Score |
---|---|---|---|---|
MLP Twitter | 0.59 (±0.05) | 0.61 | 0.61 | 0.61 |
MLP Market | 0.61 (±0.04) | 0.78 | 0.61 | 0.54 |
MLP Twitter and Market | 0.61 (±0.04) | 0.62 | 0.61 | 0.60 |
SVM Twitter | 0.52 (±0.04) | 0.50 | 0.50 | 0.41 |
SVM Market | 0.52 (±0.04) | 0.25 | 0.50 | 0.33 |
SVM Twitter and Market | 0.66 (±0.04) | 0.80 | 0.67 | 0.62 |
RF Twitter | 0.50 (±0.03) | 0.50 | 0.50 | 0.49 |
RF Market | 0.50 (±0.03) | 0.50 | 0.50 | 0.49 |
RF Twitter and Market | 0.61 (±0.03) | 0.66 | 0.61 | 0.58 |
Random | 0.50 (±0.28) | 0.50 | 0.50 | 0.50 |
Majority | 0.50 (±0.00) | 0.25 | 0.50 | 0.33 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Valencia, F.; Gómez-Espinosa, A.; Valdés-Aguirre, B.
Price Movement Prediction of Cryptocurrencies Using Sentiment Analysis and Machine Learning. *Entropy* **2019**, *21*, 589.
https://doi.org/10.3390/e21060589
