Cryptocurrency Futures Portfolio Trading System Using Reinforcement Learning
Abstract
1. Introduction
2. Background and Related Works
2.1. Blockchain and Cryptocurrency
2.2. Reinforcement Learning
2.3. System Trading and Portfolio Management
3. Materials and Methods
3.1. Stage 1: Construction of Input Dataset
3.2. Stage 2: Model Training and Portfolio Construction
- The agent observes the current state and selects an action using the policy network.
- Based on the selected action, the agent receives the next state and reward from the environment.
- The reward comprises immediate and delayed components; the immediate reward reflects the instantaneous profit or loss resulting from the action, whereas the delayed reward considers the long-term performance.
- The agent calculates the loss function using its experienced states, actions, and rewards, and updates the parameters of the policy and value networks. The loss function is defined as follows:
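
As a minimal sketch of the final step, assuming the commonly used A2C form (a policy term weighted by the advantage, plus a squared-error value term, minus an entropy bonus), the loss for one short trajectory could be computed as shown here. The function names, the coefficients `value_coef` and `entropy_coef`, the discount factor `gamma`, and the two-action toy trajectory are illustrative assumptions, not values or definitions taken from this study.

```python
import math


def discounted_returns(rewards, gamma, bootstrap_value=0.0):
    """Compute G_t = r_t + gamma * G_{t+1}, working backwards in time."""
    returns, running = [], bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_loss(action_probs, actions, values, rewards,
             gamma=0.99, value_coef=0.5, entropy_coef=0.01):
    """Return (total, policy_loss, value_loss, entropy), averaged over steps.

    action_probs[t] is the policy's probability vector at step t,
    actions[t] is the index of the chosen action, and values[t] is the
    critic's estimate V(s_t). All coefficients are illustrative assumptions.
    """
    returns = discounted_returns(rewards, gamma)
    n_steps = len(rewards)
    policy_loss = value_loss = entropy = 0.0
    for probs, action, value, ret in zip(action_probs, actions, values, returns):
        advantage = ret - value  # how much better the outcome was than expected
        policy_loss += -math.log(probs[action]) * advantage
        value_loss += (ret - value) ** 2
        entropy += -sum(p * math.log(p) for p in probs if p > 0.0)
    policy_loss /= n_steps
    value_loss /= n_steps
    entropy /= n_steps
    total = policy_loss + value_coef * value_loss - entropy_coef * entropy
    return total, policy_loss, value_loss, entropy


# Toy two-step trajectory with hypothetical actions (0 = short, 1 = long).
total, pol, val, ent = a2c_loss(
    action_probs=[[0.5, 0.5], [0.2, 0.8]],
    actions=[1, 1],
    values=[0.3, 0.1],
    rewards=[0.02, -0.01],
)
print(f"total={total:.4f} policy={pol:.4f} value={val:.4f} entropy={ent:.4f}")
```

In an automatic-differentiation framework the advantage would normally be treated as a constant in the policy term, so that the policy gradient does not flow into the value network.
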
3.3. Stage 3: Evaluation of CPTS
4. Experiments
4.1. Data Description
4.2. CPTS Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
CPTS | Cryptocurrency portfolio trading system |
A2C | Advantage actor–critic |
ANOVA | Analysis of variance |
DQN | Deep Q-network |
PnL | Profit and loss |
Ticker Symbol | Cryptocurrency |
---|---|
BTCUSDT | Bitcoin |
ETHUSDT | Ethereum |
ADAUSDT | Cardano |
SOLUSDT | Solana |
THETAUSDT | Theta Network |
NEARUSDT | NEAR Protocol |
BNBUSDT | Binance Coin |
AVAXUSDT | Avalanche |
DOTUSDT | Polkadot |
BCHUSDT | Bitcoin Cash |
TRXUSDT | TRON |
FTMUSDT | Fantom |
ALGOUSDT | Algorand |
EGLDUSDT | Elrond |
ATOMUSDT | Cosmos |
XTZUSDT | Tezos |
KAVAUSDT | Kava |
ENJUSDT | Enjin Coin |
Ticker Symbol | 10 min ROR (%) | 30 min ROR (%) | 60 min ROR (%) | Daily ROR (%) |
---|---|---|---|---|
ADAUSDT | 6.700 | 5.939 | 6.515 | 2.829 |
ALGOUSDT | 0.237 | 0.198 | 0.186 | 0.931 |
ATOMUSDT | 7.458 | 6.901 | 7.043 | 3.191 |
AVAXUSDT | 1.789 | 1.595 | 1.543 | 1.499 |
BCHUSDT | 5.315 | 5.332 | 4.976 | 3.201 |
BNBUSDT | 16.572 | 16.832 | 19.048 | 7.590 |
BTCUSDT | 16.624 | 17.036 | 17.210 | 6.594 |
DOTUSDT | 3.747 | 3.735 | 3.852 | 2.119 |
EGLDUSDT | 2.322 | 2.362 | 2.353 | 1.881 |
ENJUSDT | 1.614 | 1.496 | 1.407 | 1.103 |
ETHUSDT | 12.522 | 13.552 | 13.618 | 6.107 |
FTMUSDT | 2.572 | 2.128 | 2.241 | 1.105 |
KAVAUSDT | 2.562 | 2.708 | 2.486 | 1.439 |
NEARUSDT | 1.081 | 1.059 | 1.040 | 0.744 |
SOLUSDT | 1.041 | 0.988 | 1.038 | 1.322 |
THETAUSDT | 3.322 | 3.676 | 3.767 | 2.485 |
TRXUSDT | 27.092 | 26.416 | 26.015 | 10.526 |
XTZUSDT | 3.885 | 3.570 | 3.936 | 1.950 |
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F | p-Value |
---|---|---|---|---|---|
Timeframe | 216,834.9 | 3 | 72,278.29 | 15.563 | <0.001 |
Residual | 6,669,306 | 1436 | 4644.36 | | |
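
The F statistic in this table follows directly from the sums of squares and degrees of freedom: each mean square is SS/df, and F is the ratio of the timeframe mean square to the residual mean square. The degrees of freedom (3 and 1436) correspond to four timeframes and 1440 observations. The short sketch below recomputes the ratio from the tabulated values; the function name is a hypothetical helper, and the p-value is not recomputed.

```python
def one_way_anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS_between, MS_within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom copied from the table above.
ms_b, ms_w, f_stat = one_way_anova_f(216_834.9, 3, 6_669_306.0, 1436)
print(f"MS_between={ms_b:.2f}  MS_within={ms_w:.2f}  F={f_stat:.3f}")
```

Small differences in the last decimal place of the mean squares can arise because the tabulated sums of squares are themselves rounded.
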
Comparison Group | Mean Difference (%) | p-Value |
---|---|---|
10 m–30 m | 2.987 | 0.319 |
10 m–60 m | −0.545 | 0.989 |
30 m–60 m | −3.532 | 0.181 |
10 m–1 d | 48.991 | <0.001 |
30 m–1 d | 46.005 | <0.001 |
60 m–1 d | 49.537 | <0.001 |
Ticker Symbol | Average ROR (%) |
---|---|
TRXUSDT | 26.51 |
BNBUSDT | 17.48 |
BTCUSDT | 16.96 |
ETHUSDT | 13.23 |
ATOMUSDT | 7.13 |
ADAUSDT | 6.51 |
BCHUSDT | 5.31 |
XTZUSDT | 3.94 |
THETAUSDT | 3.77 |
DOTUSDT | 3.75 |
Ticker Symbol | Average ROR (%) |
---|---|
TRXUSDT | 10.53 |
BNBUSDT | 7.59 |
BTCUSDT | 6.59 |
ETHUSDT | 6.11 |
BCHUSDT | 3.20 |
ATOMUSDT | 3.19 |
ADAUSDT | 2.83 |
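
The seven tickers in this table coincide with the seven highest daily values in the per-timeframe ROR table. A minimal sketch of such a ranking step is given below, assuming the portfolio is formed by sorting tickers on average ROR and keeping the top k; the ranking rule, the helper name `select_top_k`, and k = 7 are inferred from the tables rather than stated selection criteria.

```python
# Daily average ROR (%) per ticker, copied from the per-timeframe table.
DAILY_ROR = {
    "ADAUSDT": 2.829, "ALGOUSDT": 0.931, "ATOMUSDT": 3.191, "AVAXUSDT": 1.499,
    "BCHUSDT": 3.201, "BNBUSDT": 7.59, "BTCUSDT": 6.594, "DOTUSDT": 2.119,
    "EGLDUSDT": 1.881, "ENJUSDT": 1.103, "ETHUSDT": 6.107, "FTMUSDT": 1.105,
    "KAVAUSDT": 1.439, "NEARUSDT": 0.744, "SOLUSDT": 1.322, "THETAUSDT": 2.485,
    "TRXUSDT": 10.526, "XTZUSDT": 1.950,
}


def select_top_k(ror_by_ticker, k):
    """Return the k tickers with the highest ROR, best first."""
    ranked = sorted(ror_by_ticker.items(), key=lambda item: item[1], reverse=True)
    return [ticker for ticker, _ in ranked[:k]]


daily_portfolio = select_top_k(DAILY_ROR, k=7)
print(daily_portfolio)
# ['TRXUSDT', 'BNBUSDT', 'BTCUSDT', 'ETHUSDT', 'BCHUSDT', 'ATOMUSDT', 'ADAUSDT']
```
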
Ticker Symbol | Average ROR, 10 min (%) | Average ROR, 30 min (%) | Average ROR, 60 min (%) |
---|---|---|---|
ADAUSDT | 3.256 | 3.329 | 3.356 |
ATOMUSDT | 7.396 | 7.996 | 3.458 |
BCHUSDT | 7.897 | 7.997 | 3.458 |
BNBUSDT | 7.489 | 8.000 | 7.989 |
BTCUSDT | 3.358 | 7.990 | 3.458 |
DOTUSDT | 3.123 | 7.996 | 3.423 |
ETHUSDT | 3.423 | 7.997 | 3.423 |
THETAUSDT | 3.223 | 7.998 | 3.423 |
TRXUSDT | 9.512 | 8.003 | 8.012 |
XTZUSDT | 3.058 | 7.992 | 3.458 |
Averages | 5.174 | 7.530 | 4.346 |
Ticker Symbol | Average ROR, Daily (%) |
---|---|
ADAUSDT | 47.491 |
ATOMUSDT | 32.114 |
BCHUSDT | 69.761 |
BNBUSDT | 28.571 |
BTCUSDT | 49.616 |
ETHUSDT | 35.049 |
TRXUSDT | 38.843 |
Average | 43.064 |
Frequency | N | Mean ROR (%) | Standard Deviation | t-Statistic | p-Value |
---|---|---|---|---|---|
High | 300 | 5.68 | 2.36 | −19.377 | <0.001 |
Low | 70 | 43.06 | 16.10 | | |
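
The tabulated t value can be recovered from the summary statistics alone. The variant of the t-test is not stated in this table; the sketch below assumes Welch's unequal-variance form, which reproduces the reported statistic to within rounding when applied to the listed means, standard deviations, and sample sizes. The function name is a hypothetical helper.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var1, var2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t_stat = (mean1 - mean2) / math.sqrt(var1 + var2)
    dof = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
    return t_stat, dof


# High-frequency group first, low-frequency (daily) group second.
t_stat, dof = welch_t(5.68, 2.36, 300, 43.06, 16.10, 70)
print(f"t={t_stat:.3f}  dof={dof:.1f}")
```

The much larger standard deviation of the low-frequency group (16.10 versus 2.36) is the reason an unequal-variance test is the natural assumption here.
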
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F | p-Value |
---|---|---|---|---|---|
Type of Frequency (A) | 299,444.23 | 1 | 299,444.23 | 1157.75 | 0.001 |
Portfolio Selection (B) | 2147.51 | 1 | 2147.51 | 8.30 | 0.004 |
AB Interaction | 10,378.82 | 1 | 10,378.82 | 40.13 | 0.001 |
Residual | 185,188.61 | 716 | 258.64 | | |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chun, J.H.; Lee, S.J. Cryptocurrency Futures Portfolio Trading System Using Reinforcement Learning. Appl. Sci. 2025, 15, 9400. https://doi.org/10.3390/app15179400