Enhancing Cryptocurrency Price Forecasting by Integrating Machine Learning with Social Media and Market Data
Abstract
1. Introduction
2. Related Work
- Price Trend (PT): identifying and exploiting the direction in which the price of a cryptocurrency is moving (e.g., upward trend, downward trend).
- Instant of Trading (IT): discovering the specific moment to buy/sell a cryptocurrency.
- Publication Frequency (PF): exploiting social media data mentioning a particular cryptocurrency.
- Impressions (I): evaluating the overall market sentiment or public perception about a specific cryptocurrency.
- Sentiment Analysis (SA): analyzing users’ sentiments and opinions to gauge the market’s mood regarding a particular cryptocurrency.
- Bot Analysis (BA): detecting the activities of bots on social media, as the posts they publish can influence the price prediction process.
- Correlation among Coins (CC): measuring the relationship between the price movements of different cryptocurrencies, such as correlation and causality.
- Trading Indicators (TI): exploiting market metrics and signals used by traders to make trading decisions, such as moving averages, the relative strength index (RSI), and moving average convergence divergence (MACD).
- Deep Learning: using deep learning approaches for price forecasting.
- Type of Coins: defining the type of cryptocurrencies considered for price prediction, which can belong to the four categories discussed in Section 3.1, i.e., Solid Project (SP), High Capitalization (HC), Influential Meme (IM), and Volatile Meme (VM) coins.
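To make the Trading Indicators (TI) dimension concrete, the following is a minimal sketch of the three indicators named above, written in plain Python. It is an illustration only and not the implementation used in any of the surveyed works: the function names, the default periods (12, 26, 9 for MACD and 14 for RSI, which are common conventions), and the use of simple averages in the RSI are all assumptions.

```python
from typing import List, Optional, Tuple


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def ema(prices: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1),
    seeded with the first price (an assumed initialization)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out


def macd(
    prices: List[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[List[float], List[float]]:
    """MACD line (fast EMA minus slow EMA) and its signal line (EMA of the MACD line)."""
    fast_ema, slow_ema = ema(prices, fast), ema(prices, slow)
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    return line, ema(line, signal)


def rsi(prices: List[float], period: int = 14) -> float:
    """RSI over the last `period` price changes, using simple averages of
    gains and losses (not Wilder's smoothed variant)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    window = prices[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window: RSI saturates by convention
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


if __name__ == "__main__":
    # Hypothetical closing prices, for illustration only.
    closes = [100.0 + 0.5 * i + (3.0 if i % 4 == 0 else -1.0) for i in range(40)]
    macd_line, signal_line = macd(closes)
    print("SMA(5):", sma(closes, 5)[-1])
    print("RSI(14):", rsi(closes))
    print("MACD - signal:", macd_line[-1] - signal_line[-1])
```

In a forecasting pipeline, values such as these would typically be appended to each time step as additional input features alongside the raw market data.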
3. Proposed Methodology
3.1. Data Collection and Preprocessing
- High Capitalization (HC): this category includes cryptocurrencies such as Bitcoin and Ethereum, which are highly popular and have a significant impact on the world of cryptocurrencies.
- Solid Project (SP): this category includes cryptocurrencies backed by a robust project, although they may be less popular. Examples include Solana and Conflux, which form the foundation for various types of blockchains, as well as projects like The Sandbox, which is associated with the metaverse and NFT-related initiatives.
- Influential Meme (IM): this category includes coins that do not rely on solid projects (i.e., meme coins). Despite their lower capitalization and the absence of substantial projects, they have a significant influence on the world of cryptocurrencies due to their history and popularity on social media.
- Volatile Meme (VM): this category comprises cryptocurrencies created purely for speculative purposes, characterized by high volatility and substantial price fluctuations within short time periods.
3.2. Data Enrichment
3.2.1. Correlation between Social Media and Market Data
3.2.2. Causal Connection in Market Data
3.2.3. Textual Analysis of Social Data
3.3. Training Machine Learning Models
3.4. Trading Recommendation
- Impact of commissions: commission costs depend on the trading platform used, and thus, the algorithm is designed to take into account a certain percentage of the invested capital to be paid as transaction fees.
- Identification of strong trends: the algorithm implements a heuristic to limit the number of transactions, starting a new one only in the presence of a significant event. In this way, it is possible to avoid imprudent operations during phases of price uncertainty, with notable benefits in terms of profits.
- Use of take-profit: it leads the algorithm to close operations when the profit percentage exceeds a certain threshold.
- Use of stop-loss: it closes operations when the loss percentage exceeds a certain threshold.
Algorithm 1. Pseudocode of the trading algorithm.
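The four design points listed above (commissions, strong-trend entry, take-profit, and stop-loss) can be illustrated with a small simulation. The sketch below is not Algorithm 1, whose body is not reproduced in this text. It is a hypothetical long-only loop in which the function name, all parameter names, and all default thresholds are assumptions chosen for illustration.

```python
from typing import List, Optional


def simulate_trades(
    prices: List[float],
    predicted: List[float],
    fee_pct: float = 0.001,         # assumed fee per transaction, as a fraction of capital
    trend_threshold: float = 0.02,  # assumed minimum predicted rise needed to open a position
    take_profit: float = 0.05,      # assumed profit fraction at which a position is closed
    stop_loss: float = 0.03,        # assumed loss fraction at which a position is closed
    capital: float = 1000.0,
) -> float:
    """Return the final capital of a long-only strategy.

    predicted[t] is taken to be the model forecast for the price following
    prices[t]; this alignment is an assumption of the sketch.
    """
    entry_price: Optional[float] = None
    for t, price in enumerate(prices):
        if entry_price is None:
            # Strong-trend filter: open only when the forecast rise is large enough.
            if t < len(predicted) and (predicted[t] - price) / price >= trend_threshold:
                capital *= 1 - fee_pct  # commission paid on entry
                entry_price = price
        else:
            change = (price - entry_price) / entry_price
            # Take-profit or stop-loss closes the open position.
            if change >= take_profit or change <= -stop_loss:
                capital *= (1 + change) * (1 - fee_pct)  # commission paid on exit
                entry_price = None
    if entry_price is not None:
        # Close any position still open at the end of the series.
        change = (prices[-1] - entry_price) / entry_price
        capital *= (1 + change) * (1 - fee_pct)
    return capital


if __name__ == "__main__":
    # Hypothetical prices and forecasts, for illustration only.
    prices = [100.0, 102.0, 106.0, 104.0]
    predicted = [103.0, 104.0, 105.0, 104.0]
    print("with fees:   ", simulate_trades(prices, predicted))
    print("without fees:", simulate_trades(prices, predicted, fee_pct=0.0))
```

Running the same series with and without `fee_pct` shows how commissions reduce the return of an otherwise identical sequence of operations, and raising `trend_threshold` reduces the number of positions opened.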
4. Experimental Results
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Härdle, W.K.; Harvey, C.R.; Reule, R.C. Understanding cryptocurrencies. J. Financ. Econom. 2020, 18, 181–208.
- Hashemi Joo, M.; Nishikawa, Y.; Dandapani, K. Cryptocurrency, a successful application of blockchain technology. Manag. Financ. 2020, 46, 715–733.
- Kraaijeveld, O.; De Smedt, J. The predictive power of public Twitter sentiment for forecasting cryptocurrency prices. J. Int. Financ. Mark. Inst. Money 2020, 65, 101188.
- Cakici, N.; Fieberg, C.; Metko, D.; Zaremba, A. Do Anomalies Really Predict Market Returns? New Data and New Evidence. Rev. Financ. Forthcom. 2023, rfad025.
- Gu, S.; Kelly, B.; Xiu, D. Empirical asset pricing via machine learning. Rev. Financ. Stud. 2020, 33, 2223–2273.
- Bianchi, D.; Büchner, M.; Tamoni, A. Bond risk premiums with machine learning. Rev. Financ. Stud. 2021, 34, 1046–1089.
- Bali, T.G.; Beckmeyer, H.; Moerke, M.; Weigert, F. Option return predictability with machine learning and big data. Rev. Financ. Stud. 2023, 36, 3548–3602.
- Zhou, X.; Zhou, H.; Long, H. Forecasting the equity premium: Do deep neural network models work? Mod. Financ. 2023, 1, 1–11.
- Valencia, F.; Gómez-Espinosa, A.; Valdés-Aguirre, B. Price movement prediction of cryptocurrencies using sentiment analysis and machine learning. Entropy 2019, 21, 589.
- Abraham, J.; Higdon, D.; Nelson, J.; Ibarra, J. Cryptocurrency price prediction using tweet volumes and sentiment analysis. SMU Data Sci. Rev. 2018, 1, 1.
- Branda, F.; Marozzo, F.; Talia, D. Ticket Sales Prediction and Dynamic Pricing Strategies in Public Transport. Big Data Cogn. Comput. 2020, 4, 36.
- Hitam, N.A.; Ismail, A.R. Comparative performance of machine learning algorithms for cryptocurrency forecasting. Ind. J. Electr. Eng. Comput. Sci 2018, 11, 1121–1128.
- Khedr, A.M.; Arif, I.; El-Bannany, M.; Alhashmi, S.M.; Sreedharan, M. Cryptocurrency price prediction using traditional statistical and machine-learning techniques: A survey. Intell. Syst. Account. Financ. Manag. 2021, 28, 3–34.
- Jay, P.; Kalariya, V.; Parmar, P.; Tanwar, S.; Kumar, N.; Alazab, M. Stochastic neural networks for cryptocurrency price prediction. IEEE Access 2020, 8, 82804–82818.
- Patel, M.M.; Tanwar, S.; Gupta, R.; Kumar, N. A deep learning-based cryptocurrency price prediction scheme for financial institutions. J. Inf. Secur. Appl. 2020, 55, 102583.
- Lahmiri, S.; Bekiros, S. Cryptocurrency forecasting with deep learning chaotic neural networks. Chaos Solitons Fractals 2019, 118, 35–40.
- Ammer, M.A.; Aldhyani, T.H. Deep learning algorithm to predict cryptocurrency fluctuation prices: Increasing investment awareness. Electronics 2022, 11, 2349.
- Poongodi, M.; Nguyen, T.N.; Hamdi, M.; Cengiz, K. Global cryptocurrency trend prediction using social media. Inf. Process. Manag. 2021, 58, 102708.
- Lamon, C.; Nielsen, E.; Redondo, E. Cryptocurrency price prediction using news and social media sentiment. SMU Data Sci. Rev. 2017, 1, 1–22.
- Fleischer, J.P.; von Laszewski, G.; Theran, C.; Parra Bautista, Y.J. Time Series Analysis of Cryptocurrency Prices Using Long Short-Term Memory. Algorithms 2022, 15, 230.
- Van Tran, L.; Le, S.T.; Tran, H.M. Empirical Study of Cryptocurrency Prices Using Linear Regression Methods. In Proceedings of the 2022 RIVF International Conference on Computing and Communication Technologies (RIVF), Ho Chi Minh City, Vietnam, 20–22 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 701–706.
- Sun, J.; Zhou, Y.; Lin, J. Using machine learning for cryptocurrency trading. In Proceedings of the 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS), Taipei, Taiwan, 6–9 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 647–652.
- Hamayel, M.J.; Owda, A.Y. A novel cryptocurrency price prediction model using GRU, LSTM and bi-LSTM machine learning algorithms. AI 2021, 2, 477–496.
- Livieris, I.E.; Pintelas, E.; Stavroyiannis, S.; Pintelas, P. Ensemble deep learning models for forecasting cryptocurrency time-series. Algorithms 2020, 13, 121.
- Rathan, K.; Sai, S.V.; Manikanta, T.S. Crypto-currency price prediction using decision tree and regression techniques. In Proceedings of the 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 23–25 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 190–194.
- Mirtaheri, M.; Abu-El-Haija, S.; Morstatter, F.; Ver Steeg, G.; Galstyan, A. Identifying and analyzing cryptocurrency manipulations in social media. IEEE Trans. Comput. Soc. Syst. 2021, 8, 607–617.
- Wołk, K. Advanced social media sentiment analysis for short-term cryptocurrency price prediction. Expert Syst. 2020, 37, e12493.
- Vo, A.D.; Nguyen, Q.P.; Ock, C.Y. Sentiment analysis of news for effective cryptocurrency price prediction. Int. J. Knowl. Eng. 2019, 5, 47–52.
- Sebastião, H.; Godinho, P. Forecasting and trading cryptocurrencies with machine learning under changing market conditions. Financ. Innov. 2021, 7, 3.
- Hutto, C.; Gilbert, E. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the International AAAI Conference on Web and Social Media, Ann Arbor, MI, USA, 1–4 June 2014; Volume 8, pp. 216–225.
- Loria, S. textblob Documentation. Release 0.16 2018, 2, 269.
- Pano, T.; Kashef, R. A complete VADER-based sentiment analysis of bitcoin (BTC) tweets during the era of COVID-19. Big Data Cogn. Comput. 2020, 4, 33.
- Kim, G.; Shin, D.H.; Choi, J.G.; Lim, S. A deep learning-based cryptocurrency price prediction model that uses on-chain data. IEEE Access 2022, 10, 56232–56248.
- Tanwar, S.; Patel, N.P.; Patel, S.N.; Patel, J.R.; Sharma, G.; Davidson, I.E. Deep learning-based cryptocurrency price prediction scheme with inter-dependent relations. IEEE Access 2021, 9, 138633–138646.
- Shahbazi, Z.; Byun, Y.C. Improving the cryptocurrency price prediction performance based on reinforcement learning. IEEE Access 2021, 9, 162651–162659.
- Belcastro, L.; Cantini, R.; Marozzo, F.; Orsino, A.; Talia, D.; Trunfio, P. Programming big data analysis: Principles and solutions. J. Big Data 2022, 9, 1–50.
- Al Guindy, M. Cryptocurrency price volatility and investor attention. Int. Rev. Econ. Financ. 2021, 76, 556–570.
- Aslanidis, N.; Bariviera, A.F.; López, Ó.G. The link between cryptocurrencies and Google Trends attention. Financ. Res. Lett. 2022, 47, 102654.
- Mardjo, A.; Choksuchat, C. HyVADRF: Hybrid VADER–Random Forest and GWO for Bitcoin Tweet Sentiment Analysis. IEEE Access 2022, 10, 101889–101897.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; Volume 31.
- Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019.
- Hua, Y.; Zhao, Z.; Li, R.; Chen, X.; Liu, Z.; Zhang, H. Deep learning with long short-term memory for time series prediction. IEEE Commun. Mag. 2019, 57, 114–119.
- Moustafa, H.; Malli, M.; Hazimeh, H. Real-time Bitcoin price tendency awareness via social media content tracking. In Proceedings of the 2022 10th International Symposium on Digital Forensics and Security (ISDFS), Istanbul, Turkey, 6–7 June 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6.
- Maqsood, U.; Khuhawar, F.Y.; Talpur, S.; Jaskani, F.H.; Memon, A.A. Twitter Mining based Forecasting of Cryptocurrency using Sentimental Analysis of Tweets. In Proceedings of the 2022 Global Conference on Wireless and Optical Technologies (GCWOT), Malaga, Spain, 14–17 February 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6.
Related Work | PT | IT | PF (Social Media Data) | I (Social Media Data) | SA (Social Media Data) | BA (Social Media Data) | CC (Market Data) | TI (Market Data) | Deep Learning | Type of Coins |
---|---|---|---|---|---|---|---|---|---|---|
Hitam et al. [12] | x | - | - | - | - | - | x | x | x | HC, SP |
Abraham et al. [10] | x | - | - | - | x | - | - | - | - | HC |
Lahmiri et al. [16] | x | - | - | - | - | - | - | x | x | HC |
Vo et al. [28] | x | - | - | - | x | - | - | - | x | HC |
Rathan et al. [25] | x | - | - | - | - | - | - | - | - | HC |
Valencia et al. [9] | x | - | - | - | x | - | - | x | x | HC |
Wołk [27] | x | - | x | x | x | - | - | - | x | HC, SP |
Patel et al. [15] | x | - | - | - | - | - | - | - | x | - |
Livieris et al. [24] | x | x | - | - | - | - | x | x | x | - |
Khedr et al. [13] | x | - | - | - | - | - | - | x | x | HC, SP |
Jay et al. [14] | x | - | - | - | - | - | - | x | x | HC, SP |
Poongodi et al. [18] | x | - | - | - | - | - | - | x | x | - |
Mirtaheri et al. [26] | - | - | x | x | x | x | - | - | - | HC |
Hamayel et al. [23] | x | - | - | - | - | - | - | - | x | HC |
Tanwar et al. [34] | x | - | - | - | - | - | x | - | x | HC |
Shahbazi et al. [35] | x | - | - | - | - | - | - | x | x | HC, SP |
Kim et al. [33] | x | - | - | - | - | - | - | - | x | HC |
Van Tran et al. [21] | x | x | - | - | - | - | x | x | - | HC, SP, IM |
Ammer et al. [17] | x | x | - | - | - | - | x | x | x | HC, SP, IM, VM |
Fleischer et al. [20] | x | x | - | - | - | - | x | x | x | - |
Sun et al. [22] | x | x | - | - | - | - | - | - | - | HC, SP, IM |
Sebastião et al. [29] | x | x | - | - | - | - | - | x | - | HC, SP |
Lamon et al. [19] | x | - | - | - | x | - | - | - | - | HC |
Our work | x | x | x | x | x | x | x | x | x | HC, SP, IM |
Category | Acronym | Cryptocurrencies |
---|---|---|
High Capitalization | HC | Bitcoin (BTC), Ethereum (ETH), Polygon (MATIC), Polkadot (DOT), Solana (SOL), Cosmos (ATOM), Stellar (XLM), Avalanche (AVAX), Tron (TRX), Litecoin (LTC) |
Solid Project | SP | Conflux (CFX), Stacks (STX), Fantom (FTM), Quant (QNT), Loopring (LRC), The Sandbox (SAND), Gala (GALA), Lido DAO (LDO), Cronos (CRO), Zilliqa (ZIL), Chiliz (CHZ), Neo (NEO), VeThor Token (VTHO), Bancor (BNT), The Graph (GRT) |
Influential Meme | IM | Dogecoin (DOGE), Shiba Inu (SHIB), Decentraland (MANA) |
Volatile Meme | VM | Babydoge Coin (BabyDoge), Floki (Floki), Catecoin (CATE), Dogelon Mars (ELON), Volt Inu v2 (VOLT), Dejitaru Tsuka (TSUKA), Kishu Inu (KISHU), Shiba Predator (SHIBAP), Pitbull (PIT), Akita Inu (AKITA) |
Category | Shiba Inu (Pearson) | Shiba Inu (Spearman) | Floki (Pearson) | Floki (Spearman) | CateCoin (Pearson) | CateCoin (Spearman) |
---|---|---|---|---|---|---|
Tweets | 0.723 | 0.841 | 0.868 | 0.909 | 0.797 | 0.880 |
Followers | 0.659 | 0.761 | 0.523 | 0.858 | 0.320 | 0.751 |
Likes | 0.752 | 0.849 | 0.896 | 0.913 | 0.414 | 0.854 |
Retweets | 0.796 | 0.850 | 0.889 | 0.913 | 0.674 | 0.831 |
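The two coefficients reported in the table can be computed as in the sketch below, which uses only the Python standard library. The function names and the sample series are hypothetical and are not the data behind the table. Pearson measures linear association, while Spearman is Pearson applied to ranks and therefore captures any monotonic relationship. A Spearman value above the Pearson value, as in most cells of the table, is consistent with a monotonic but not strictly linear relationship.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed series."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical daily tweet counts and closing prices, for illustration only.
    tweets = [120.0, 150.0, 400.0, 380.0, 900.0, 2500.0]
    price = [1.0, 1.1, 1.6, 1.5, 2.0, 2.4]
    print("Pearson: ", round(pearson(tweets, price), 3))
    print("Spearman:", round(spearman(tweets, price), 3))
```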
Model | Hyperparameters |
---|---|
Random forest | max_features: ; min_samples_split: 5; estimators: 300 |
XGBoost | eta: 0.01; gamma: 150; n_estimators: 100; subsample: 1 |
CatBoost | depth: 6; iterations: 200; learning_rate: 0.1; l2_leaf_reg: 0.2 |
Conv1D | conv1d_layer: [units: 256; kernel_size: 2; activation: ReLU]; flatten_layer: yes; dense_layer_1: [units: 8; activation: ReLU]; dense_layer_2: [units: 1; activation: linear]; optimizer: Adam; learning_rate: 0.0001; epoch: 200 |
GRU | gru_layer_units: 256; dense_layer_1: [units: 8; activation: ReLU]; dense_layer_2: [units: 1; activation: linear]; optimizer: Adam; learning_rate: 0.0001; epoch: 200 |
LSTM | lstm_layer_units: 32; lstm_layer_2_units: 64; dense_layer_1: [units: 8; activation: ReLU]; dense_layer_2: [units: 1; activation: linear]; optimizer: Adam; learning_rate: 0.0001; epoch: 200 |
Model | Category | RMSE | MAE | R² | MAPE |
---|---|---|---|---|---|
Random forest | Tree-based | 0.085 | 0.055 | 0.75 | 5.2% |
XGBoost | Tree-based | 0.110 | 0.070 | 0.68 | 6.8% |
CatBoost | Tree-based | 0.025 | 0.035 | 0.92 | 1.7% |
Conv1D | CNN | 0.005 | 0.003 | 0.95 | 1.4% |
GRU | RNN | 0.004 | 0.002 | 0.96 | 1.3% |
LSTM | RNN | 0.003 | 0.002 | 0.97 | 1.2% |
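For reference, the error metrics in the table follow standard definitions, sketched below in plain Python. The function names and sample values are hypothetical. The coefficient of determination is included on the assumption that the column of values between 0.68 and 0.97 reports R², since that column's header did not survive in the extracted table.

```python
from math import sqrt
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; true values must be non-zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical normalized prices and forecasts, for illustration only.
    actual = [0.50, 0.52, 0.55, 0.53, 0.58]
    forecast = [0.49, 0.53, 0.54, 0.54, 0.57]
    print("RMSE:", rmse(actual, forecast))
    print("MAE: ", mae(actual, forecast))
    print("R2:  ", r2(actual, forecast))
    print("MAPE:", mape(actual, forecast), "%")
```

For any set of errors, RMSE is at least as large as MAE, which gives a quick consistency check when reading reported values.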
Cryptocurrency | Acronym | Category | Profit % with Fees | Profit % without Fees |
---|---|---|---|---|
Bitcoin | BTC | HC | −14.91 | +227.38 |
Ethereum | ETH | HC | +6.41 | +16.35 |
Polygon | MATIC | HC | +164.55 | +178.43 |
Polkadot | DOT | HC | −47.80 | +28.13 |
Solana | SOL | HC | −33.56 | −10.55 |
Cosmos | ATOM | HC | −42.27 | +13.89 |
Stellar | XLM | HC | +40.01 | +48.26 |
Avalanche | AVAX | HC | +23.26 | +29.37 |
TRON | TRX | HC | 0.00 | 0.00 |
Litecoin | LTC | HC | +10.58 | +85.98 |
Conflux | CFX | SP | +112.46 | +202.52 |
Stacks | STX | SP | +59.89 | +109.21 |
Fantom | FTM | SP | −25.42 | −22.88 |
Quant | QNT | SP | +74.17 | +83.96 |
Loopring | LRC | SP | +3.73 | +4.73 |
The Sandbox | SAND | SP | −80.04 | −30.06 |
Gala | GALA | SP | +259.01 | +297.28 |
Lido DAO | LDO | SP | −79.72 | +85.95 |
Cronos | CRO | SP | −15.32 | +5.68 |
Zilliqa | ZIL | SP | +38.22 | +58.44 |
Chiliz | CHZ | SP | 0.00 | 0.00 |
Neo | NEO | SP | +132.66 | +240.80 |
VeThor Token | VTHO | SP | +26.57 | +32.84 |
Bancor | BNT | SP | −7.44 | −5.50 |
The Graph | GRT | SP | −25.37 | −18.11 |
Dogecoin | DOGE | IM | +27.68 | +39.41 |
Shiba Inu | SHIB | IM | +432.47 | +771.74 |
Decentraland | MANA | IM | +2247.29 | +2962.72 |
Mean profit for HC coins | | | +10.63 | +61.72 |
Mean profit for SP coins | | | +31.56 | +69.66 |
Mean profit for IM coins | | | +902.48 | +1257.96 |
Overall mean profit | | | +117.40 | +194.14 |
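The summary rows are plain arithmetic means of the per-coin values. As a worked check, the sketch below recomputes the "with fees" means from the rows of the table. The variable names are illustrative, and the three meme coins (Dogecoin, Shiba Inu, Decentraland) are grouped under IM, matching the summary row.

```python
from statistics import mean

# "Profit % with Fees" values copied from the table, grouped by category.
profit_with_fees = {
    "HC": [-14.91, 6.41, 164.55, -47.80, -33.56, -42.27, 40.01, 23.26, 0.00, 10.58],
    "SP": [112.46, 59.89, -25.42, 74.17, 3.73, -80.04, 259.01, -79.72, -15.32,
           38.22, 0.00, 132.66, 26.57, -7.44, -25.37],
    "IM": [27.68, 432.47, 2247.29],
}

# Mean per category, rounded to two decimals as in the table.
category_means = {cat: round(mean(vals), 2) for cat, vals in profit_with_fees.items()}

# Overall mean is taken over all coins, not over the three category means.
all_values = [v for vals in profit_with_fees.values() for v in vals]
overall_mean = round(mean(all_values), 2)

if __name__ == "__main__":
    for cat, value in category_means.items():
        print(f"Mean profit for {cat} coins: {value:+.2f}")
    print(f"Overall mean profit: {overall_mean:+.2f}")
```

Because the overall figure averages all 28 coins, it is dominated by the few very large IM values rather than by the more numerous HC and SP coins.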
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Belcastro, L.; Carbone, D.; Cosentino, C.; Marozzo, F.; Trunfio, P. Enhancing Cryptocurrency Price Forecasting by Integrating Machine Learning with Social Media and Market Data. Algorithms 2023, 16, 542. https://doi.org/10.3390/a16120542