Reinforcement Learning in Energy Finance: A Comprehensive Review
Abstract
1. Introduction
1.1. Overview
1.2. Illustration
1.3. Literature Review and Research Gaps
1.3.1. Reinforcement Learning in Financial Markets
1.3.2. Energy Finance Modeling Approaches
1.3.3. Energy System Operations with Learning-Based Methods
1.3.4. Research Gaps and Contributions
2. Theoretical Foundations of Reinforcement Learning in Energy Finance
2.1. Reinforcement Learning Framework
- A set of states representing the environment
- A set of actions available to the agent
- Transition probabilities defining how actions lead to new states
- A reward function providing feedback on action quality
- A discount factor determining the relative importance of immediate versus future rewards
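The five components listed above can be made concrete with a small sketch. The example below is illustrative only: the two-state storage problem, the state and action names, the reward values, and the discount factor of 0.9 are all assumptions chosen for readability and are not taken from the reviewed literature. Transitions are deterministic here (every probability equals 1.0), which is a simplification. Value iteration is used only as one standard way to solve a small, fully known problem of this kind.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = str
Action = str


@dataclass(frozen=True)
class MDP:
    """Container for the five components: states, actions, transitions, rewards, discount."""
    states: List[State]
    actions: List[Action]
    # transitions[(s, a)] maps each reachable next state to its probability
    transitions: Dict[Tuple[State, Action], Dict[State, float]]
    # rewards[(s, a)] is the immediate reward for taking action a in state s
    rewards: Dict[Tuple[State, Action], float]
    # gamma in [0, 1) weights future rewards relative to immediate ones
    gamma: float


def value_iteration(mdp: MDP, tol: float = 1e-10, max_iter: int = 10_000) -> Dict[State, float]:
    """Repeatedly apply the Bellman optimality update until the values stop changing."""
    v = {s: 0.0 for s in mdp.states}
    for _ in range(max_iter):
        new_v = {
            s: max(
                mdp.rewards[(s, a)]
                + mdp.gamma * sum(p * v[s2] for s2, p in mdp.transitions[(s, a)].items())
                for a in mdp.actions
            )
            for s in mdp.states
        }
        delta = max(abs(new_v[s] - v[s]) for s in mdp.states)
        v = new_v
        if delta < tol:
            break
    return v


# Hypothetical toy problem: a storage unit that is either empty or full.
# Charging costs 1 unit, discharging a full unit earns 2 units (assumed numbers).
toy = MDP(
    states=["empty", "full"],
    actions=["charge", "discharge", "idle"],
    transitions={
        ("empty", "charge"): {"full": 1.0},
        ("empty", "discharge"): {"empty": 1.0},
        ("empty", "idle"): {"empty": 1.0},
        ("full", "charge"): {"full": 1.0},
        ("full", "discharge"): {"empty": 1.0},
        ("full", "idle"): {"full": 1.0},
    },
    rewards={
        ("empty", "charge"): -1.0,
        ("empty", "discharge"): 0.0,
        ("empty", "idle"): 0.0,
        ("full", "charge"): 0.0,
        ("full", "discharge"): 2.0,
        ("full", "idle"): 0.0,
    },
    gamma=0.9,
)

values = value_iteration(toy)
print(values)
```

In this toy setting the optimal behaviour alternates between charging and discharging, and the values converge to the fixed point V(full) = 1.1/0.19 and V(empty) = 0.8/0.19. Lowering the assumed discount factor shrinks both values, which reflects a lower weight on future rewards relative to immediate ones.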
2.2. RL Algorithms Relevant to Financial Applications
2.3. RL vs. Traditional Financial Modeling Approaches
3. Characteristics of Energy Markets Relevant to RL Applications
3.1. Price Dynamics and Volatility
3.2. Seasonality and Cyclicality
3.3. Regulatory and Market Structure Considerations
3.4. Physical Constraints and Real Options
3.5. Market Incompleteness and Liquidity Constraints
4. RL for Energy Price Forecasting and Trading Strategies
4.1. Energy Price Forecasting with RL
4.2. Optimal Trading Strategies
4.3. Risk Management Applications
5. RL for Derivatives Valuation in Energy Markets
5.1. Option Pricing Fundamentals
5.2. RL Approaches to Option Pricing and Applications
5.3. Real Options Analysis
6. Option Value in Power Systems
6.1. Real Options Framework for Smart Grid Technologies
6.2. Stochastic Optimization Approach to Quantifying Option Value
6.3. Methodological Considerations for RL-Based Option Valuation
6.4. Future Research Directions
7. Conclusions: Limitations, Future Directions, and Policy Recommendations
7.1. Current Limitations of RL in Energy Finance
7.2. Promising Research Directions
7.2.1. Methodological Advancements for Energy Finance
7.2.2. Enhanced Valuation of Energy Assets and Flexibility
7.2.3. Uncertainty Modeling and Risk Assessment
7.2.4. Comparative Analysis of Decision-Making Frameworks
7.2.5. Sustainable Communities and Energy Equity
7.3. Policy Recommendations
7.4. Synthesis and Outlook
8. Conclusions
Funding
Conflicts of Interest
References
- Eydeland, A.; Wolyniec, K. Energy and Power Risk Management: New Developments in Modeling, Pricing, and Hedging; John Wiley & Sons: Hoboken, NJ, USA, 2003. [Google Scholar]
- Fischer, T.G. Reinforcement Learning in Financial Markets—A Survey; FAU Discussion Papers in Economics No. 12/2018; Friedrich-Alexander-Universität Erlangen Nürnberg: Erlangen, Germany, 2018. [Google Scholar]
- Weron, R. Electricity price forecasting: A review of the state-of-the-art with a look into the future. Int. J. Forecast. 2014, 30, 1030–1081. [Google Scholar] [CrossRef]
- Giannelos, S.; Moreira, A.; Papadaskalopoulos, D.; Borozan, S.; Pudjianto, D.; Konstantelos, I.; Sun, M.; Strbac, G. A Machine Learning Approach for Generating and Evaluating Forecasts on the Environmental Impact of the Buildings Sector. Energies 2023, 16, 2915. [Google Scholar] [CrossRef]
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef] [PubMed]
- Kyriakarakos, G. Artificial Intelligence and the Energy Transition. Sustainability 2025, 17, 1140. [Google Scholar] [CrossRef]
- Most, D.; Giannelos, S.; Yueksel-Erguen, I.; Beulertz, D.; Haus, U.-U.; Charousset-Brignol, S.; Frangioni, A. A Novel Modular Optimization Framework for Modelling Investment and Operation of Energy Systems at European Level; ZIB-Report—20–08; Zuse Institute: Berlin, Germany, 2020. [Google Scholar]
- Lawryshyn, Y. Using Reinforcement Learning in Applied Real Options Modelling. J. Risk Financ. Manag. 2023, 16, 320. [Google Scholar]
- Lee, J.S.; Chun, W.; Roh, K.; Heo, S.; Lee, J.H. Applying real options with reinforcement learning to assess commercial CCU deployment. J. CO2 Util. 2023, 77, 102613. [Google Scholar] [CrossRef]
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971. [Google Scholar]
- Lippert, I.; Sareen, S. Alleviation of energy poverty through transitions to low-carbon energy infrastructure. Energy Res. Soc. Sci. 2023, 100, 103087. [Google Scholar] [CrossRef]
- Liu, X.; Wu, J.; Chen, S. Efficient Hyperparameters optimization Through Model-based Reinforcement Learning and Meta-Learning. In Proceedings of the IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Virtual, 14–16 December 2020; pp. 1036–1041. [Google Scholar]
- Longstaff, F.A.; Schwartz, E.S. Valuing American options by simulation: A simple least-squares approach. Rev. Financ. Stud. 2001, 14, 113–147. [Google Scholar] [CrossRef]
- López-Vargas, A.; Ledezma-Espino, A.; Sanchis-De-Miguel, A. Methods, data sources and applications of the Artificial Intelligence in the Energy Poverty context: A review. Energy Build. 2022, 268, 112233. [Google Scholar] [CrossRef]
- Lu, J.; Yang, H.; Wei, Y.; Huang, J. Planning of soft open point considering demand response. In Proceedings of the 2019 IEEE Sustainable Power and Energy Conference (iSPEC), Beijing, China, 21–23 November 2019; pp. 246–251. [Google Scholar]
- Majeske, N.; Vaidya, S.S.; Roy, R.; Rehman, A.; Sohrabpoor, H.; Miller, T.; Li, W.; Fiddyment, C.R.; Gumennik, A.; Acharya, R.; et al. Industrial energy forecasting using dynamic attention neural networks. Energy AI 2025, 20, 100504. [Google Scholar] [CrossRef]
- Marzban, S.; Delage, E.; Li, J.Y.-M. Deep reinforcement learning for option pricing and hedging under dynamic expectile risk measures. Quant. Financ. 2023, 23, 1411–1430. [Google Scholar] [CrossRef]
- Meinshausen, N.; Hambly, B.M. Monte Carlo methods for the valuation of multiple-exercise options. Math. Financ. 2004, 14, 557–583. [Google Scholar] [CrossRef]
- Mhlanga, D. Artificial intelligence in the Industry 4.0, and its impact on poverty, innovation, infrastructure development, and the Sustainable Development Goals: Lessons from emerging economies? Sustainability 2021, 13, 5788. [Google Scholar] [CrossRef]
- Carmona, R.; Coulon, M. A Survey of commodity markets and structural models for electricity prices. In Quantitative Energy Finance; Springer: Berlin/Heidelberg, Germany, 2014; pp. 41–83. [Google Scholar]
- Benth, F.E.; Šaltytė Benth, J.; Koekebakker, S. Stochastic Modelling of Electricity and Related Markets; World Scientific: Singapore, 2008. [Google Scholar]
- Conejo, A.J.; Carrión, M.; Morales, J.M. Decision Making Under Uncertainty in Electricity Markets; Springer Nature: Dordrecht, The Netherlands, 2010. [Google Scholar]
- Thompson, M.; Davison, M.; Rasmussen, H. Natural gas storage valuation and optimization: A real options application. Nav. Res. Logist. NRL 2009, 56, 226–238. [Google Scholar] [CrossRef]
- Konstantelos, I.; Giannelos, S.; Strbac, G. Strategic valuation of smart grid technology options in distribution networks. IEEE Trans. Power Syst. 2017, 32, 1293–1303. [Google Scholar] [CrossRef]
- Giannelos, S.; Konstantelos, I.; Strbac, G. Option Value of Demand-Side Response Schemes Under Decision-Dependent Uncertainty. IEEE Trans. Power Syst. 2018, 33, 5103–5113. [Google Scholar] [CrossRef]
- Giannelos, S.; Konstantelos, I.; Strbac, G. Stochastic optimisation-based valuation of smart grid options under firm DG contracts. In Proceedings of the 2016 IEEE International Energy Conference (ENERGYCON), Leuven, Belgium, 4–8 April 2016; pp. 1–7. [Google Scholar]
- Vanegas Cantarero, M.M. Of renewable energy, energy democracy, and sustainable development: A roadmap to accelerate the energy transition in developing countries. Energy Res. Soc. Sci. 2020, 70, 101716. [Google Scholar] [CrossRef]
- Carbonneau, A. Pricing and Hedging Financial Derivatives with Reinforcement Learning Methods. Ph.D. Thesis, Concordia University, Montreal, QC, Canada, 2021. [Google Scholar]
- Cao, J.; Chen, J.; Farghadani, S.; Hull, J.; Poulos, Z.; Wang, Z.; Yuan, J. Gamma and vega hedging using deep distributional reinforcement learning. Front. Artif. Intell. 2023, 6, 1129370. [Google Scholar] [CrossRef]
- Caputo, C.; Cardin, M.-A. Analyzing real options and flexibility in engineering systems design using decision rules and deep reinforcement learning. J. Mech. Des. 2022, 144, 021705. [Google Scholar] [CrossRef]
- Chauhan, V.S.; Sharma, R.; Shah, H. Exploring sustainability through clean energy, artificial intelligence, and machine learning: Ethical perspectives. In AI Applications for Clean Energy and Sustainability; Riswandi, B., Singh, B., Kaunert, C., Vig, K., Eds.; IGI Global Scientific Publishing: Hershey, PA, USA, 2024; pp. 119–138. [Google Scholar] [CrossRef]
- Charousset-Brignol, S.; van Ackooij, W.; Oudjane, N.; Daniel, D.; Noceir, S.; Haus, U.-U.; Lazzaro, A.; Frangioni, A.; Lobato, R.; Ghezelsofla, A.; et al. Synergistic approach of multi-energy models for a European optimal energy system management tool. Proj. Repos. J. 2021, 9, 113–116. [Google Scholar]
- Chen, A.-S.; Leung, M.T.; Pan, S.; Chou, C.-Y. Financial hedging in energy market by cross-learning machines. Neural Comput. Appl. 2020, 32, 10321–10335. [Google Scholar] [CrossRef]
- Chen, B.; Wang, J.; Wang, L.; He, Y.; Wang, Z. Robust Optimization for Transmission Expansion Planning: Minimax Cost vs. Minimax Regret. IEEE Trans. Power Syst. 2014, 29, 3069–3077. [Google Scholar] [CrossRef]
- Chen, C.-F.; Napolitano, R.; Hu, Y.; Kar, B.; Yao, B. Addressing machine learning bias to foster energy justice. Energy Res. Soc. Sci. 2024, 116, 103653. [Google Scholar] [CrossRef]
- Che, X.; Zhu, B.; Wang, P. Assessing global energy poverty: An integrated approach. Energy Policy 2021, 149, 112099. [Google Scholar] [CrossRef]
- Cheraghi, Y.; Bratvold, R.B.; Muhammad, R.B. Value Creation in Sustainable Energy Transition Using Reinforcement Learning. Energies 2024, 17, 854. [Google Scholar]
- Chronopoulos, M.; Hagspiel, V.; Fleten, S.-E. Stepwise green investment under policy uncertainty. Energy J. 2016, 37, 87–108. [Google Scholar] [CrossRef]
- Cramton, P.; Ockenfels, A.; Stoft, S. Capacity market fundamentals. Econ. Energy Environ. Policy 2013, 2, 27–46. [Google Scholar] [CrossRef]
- Dalal, G.; Gilboa, E.; Mannor, S. Hierarchical decision making in electricity grid management. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 1153–1162. [Google Scholar]
- del Guayo, Í.; Cuesta, Á. Towards a just energy transition: A critical analysis of the existing policies and regulations in Europe. J. World Energy Law Bus. 2022, 15, 212–222. [Google Scholar] [CrossRef]
- Deng, J.L. Control problems of grey systems. Syst. Control Lett. 1982, 1, 288–294. [Google Scholar]
- Deng, S.; Oren, S. Electricity derivatives and risk management. Energy 2006, 31, 940–953. [Google Scholar] [CrossRef]
- Dong, Z.; Zhang, X.; Zhang, L.; Giannelos, S.; Strbac, G. Flexibility enhancement of urban energy systems through coordinated space heating aggregation of numerous buildings. Appl. Energy 2024, 374, 123971. [Google Scholar] [CrossRef]
- Du, Y.; Li, F.; Zandi, H.; Xue, Y. Approximating Nash Equilibrium in Day-ahead Electricity Market Bidding with Multi-agent Deep Reinforcement Learning. J. Mod. Power Syst. Clean Energy 2021, 9, 534–544. [Google Scholar] [CrossRef]
- Ersen, H.Y.; Tas, O.; Ugurlu, U. Solar energy investment valuation with intuitionistic fuzzy trinomial lattice real option model. IEEE Trans. Eng. Manag. 2023, 70, 2584–2593. [Google Scholar] [CrossRef]
- Fabra, N.; Reguant, M. Pass-through of emissions costs in electricity markets. Am. Econ. Rev. 2014, 104, 2872–2899. [Google Scholar] [CrossRef]
- Famoso, F.; Oliveri, L.M.; Brusca, S.; Chiacchio, F. A Dependability Neural Network Approach for Short-Term Production Estimation of a Wind Power Plant. Energies 2024, 17, 1627. [Google Scholar] [CrossRef]
- FERC (Federal Energy Regulatory Commission). The February 2021 Cold Weather Outages in Texas and the South Central United States; FERC, NERC and Regional Entity Staff Report; FERC: Washington, DC, USA, 2021.
- Frestad, D.; Benth, F.E.; Koekebakker, S. Modeling term structure dynamics in the Nordic electricity swap market. Energy J. 2010, 31, 53–86. [Google Scholar] [CrossRef]
- Fuad, K.S.; Hafezi, H.; Kauhaniemi, K.; Laaksonen, H. Soft open point in distribution networks. IEEE Access 2020, 8, 210550–210565. [Google Scholar] [CrossRef]
- Gawusu, S.; Jamatutu, S.A.; Ahmed, A. Predictive modeling of energy poverty with machine learning ensembles: Strategic insights from socioeconomic determinants for effective policy implementation. Int. J. Energy Res. 2024, 2024, 9411326. [Google Scholar] [CrossRef]
- Gawusu, S.; Jamatutu, S.A.; Zhang, X.; Moomin, S.T.; Ahmed, A.; Mensah, R.A.; Das, O.; Ackah, I. Spatial analysis and predictive modeling of energy poverty: Insights for policy implementation. Environ. Dev. Sustain. 2024, 1–48. [Google Scholar] [CrossRef]
- Giannelos, S.; Borozan, S.; Strbac, G. A Backwards Induction Framework for Quantifying the Option Value of Smart Charging of Electric Vehicles and the Risk of Stranded Assets under Uncertainty. Energies 2022, 15, 3334. [Google Scholar] [CrossRef]
- Giannelos, S.; Borozan, S.; Aunedi, M.; Zhang, X.; Ameli, H.; Pudjianto, D.; Konstantelos, I.; Strbac, G. Modelling Smart Grid Technologies in Optimisation Problems for Electricity Grids. Energies 2023, 16, 5088. [Google Scholar] [CrossRef]
- Giannelos, S.; Borozan, S.; Konstantelos, I.; Strbac, G. Option value, investment costs and deployment levels of smart grid technologies. Sustain. Energy Res. 2024, 11, 47. [Google Scholar] [CrossRef]
- Giannelos, S.; Borozan, S.; Strbac, G.; Zhang, T.; Kong, W. Vehicle-to-Grid: Quantification of its contribution to security of supply through the F-Factor methodology. Sustain. Energy Res. 2024, 11, 32. [Google Scholar] [CrossRef]
- Giannelos, S.; Borozan, S.; Moreira, A.; Strbac, G. Techno-Economic Analysis of Smart EV Charging for Expansion Planning Under Uncertainty. In Proceedings of the 2023 IEEE Belgrade PowerTech, Belgrade, Serbia, 25–29 June 2023; pp. 1–7. [Google Scholar]
- Giannelos, S.; Djapic, P.; Pudjianto, D.; Strbac, G. Quantification of the Energy Storage Contribution to Security of Supply through the F-Factor Methodology. Energies 2020, 13, 826. [Google Scholar] [CrossRef]
- Giannelos, S.; Jain, A.; Borozan, S.; Falugi, P.; Moreira, A.; Bhakar, R.; Mathur, J.; Strbac, G. Long-Term Expansion Planning of the Transmission Network in India under Multi-Dimensional Uncertainty. Energies 2021, 14, 7813. [Google Scholar] [CrossRef]
- Giannelos, S.; Konstantelos, I.; Strbac, G. A new class of planning models for option valuation of storage technologies under decision-dependent innovation uncertainty. In Proceedings of the 2017 IEEE Manchester PowerTech, Manchester, UK, 18–22 June 2017; pp. 1–6. [Google Scholar]
- Giannelos, S.; Konstantelos, I.; Strbac, G. Option value of dynamic line rating and storage. In Proceedings of the IEEE International Energy Conference (ENERGYCON), Limassol, Cyprus, 3–7 June 2018. [Google Scholar]
- Giannelos, S.; Konstantelos, I.; Strbac, G. Endogenously stochastic demand side response participation on transmission system level. In Proceedings of the IEEE International Energy Conference (ENERGYCON), Limassol, Cyprus, 3–7 June 2018. [Google Scholar]
- Giannelos, S.; Zhang, T.; Pudjianto, D.; Konstantelos, I.; Strbac, G. Investments in Electricity Distribution Grids: Strategic versus Incremental Planning. Energies 2024, 17, 2724. [Google Scholar] [CrossRef]
- Giannelos, S.; Zhang, X.; Zhang, T.; Strbac, G. Multi-Objective Optimization for Pareto Frontier Sensitivity Analysis in Power Systems. Sustainability 2024, 16, 5854. [Google Scholar] [CrossRef]
- Giannelos, S.; Pudjianto, D.; Zhang, T.; Strbac, G. Energy Hub Operation Under Uncertainty: Monte Carlo Risk Assessment Using Gaussian and KDE-Based Data. Energies 2025, 18, 1712. [Google Scholar] [CrossRef]
- Glasserman, P. Monte Carlo Methods in Financial Engineering; Springer: Berlin/Heidelberg, Germany, 2003. [Google Scholar]
- Gomez, A.A.; Consigli, G.; Liu, J. Multi-period portfolio selection with interval-based conditional value-at-risk. Ann. Oper. Res. 2024, 1–39. [Google Scholar] [CrossRef]
- Goyal, A.; Bhattacharya, K. Optimal design of a decarbonized sector-coupled microgrid: Electricity-heat-hydrogen-transport sectors. IEEE Access 2024, 12, 38399–38409. [Google Scholar] [CrossRef]
- Greenwood, D.M.; Djapic, P.; Sarantakos, I.; Giannelos, S.; Strbac, G.; Creighton, A. Pragmatic method for assessing the security of supply in future smart distribution networks. Cired—Open Access Proc. J. 2020, 2020, 221–224. [Google Scholar] [CrossRef]
- Guo, Y.; Wang, N.; Xu, Z.-Y.; Wu, K. The internet of things-based decision support system for information processing in intelligent manufacturing using data mining technology. Mech. Syst. Signal Process. 2020, 142, 106630. [Google Scholar] [CrossRef]
- Halkos, G.E.; Tsirivis, A.S. Value-at-risk methodologies for effective energy portfolio risk management. Econ. Anal. Policy 2019, 62, 197–212. [Google Scholar] [CrossRef]
- Halperin, I. QLBS: Q-Learner in the Black-Scholes (-Merton) worlds. J. Deriv. 2019, 26, 99–123. [Google Scholar]
- Hilliard, J.E.; Reis, J. Valuation of commodity futures and options under stochastic convenience yields, interest rates, and jump diffusions in the spot. J. Financ. Quant. Anal. 1998, 33, 61. [Google Scholar] [CrossRef]
- Higgs, H.; Worthington, A. Stochastic price modeling of high volatility, mean-reverting, spike-prone commodities: The Australian wholesale spot electricity market. Energy Econ. 2008, 30, 3172–3185. [Google Scholar] [CrossRef]
- Hogan, W.W. Contract networks for electric power transmission. J. Regul. Econ. 1992, 4, 211–242. [Google Scholar] [CrossRef]
- Hosseini, E.; Saeedpour, B.; Banaei, M.; Ebrahimy, R. Optimized deep neural network architectures for energy consumption and PV production forecasting. Energy Strategy Rev. 2025, 59, 101704. [Google Scholar] [CrossRef]
- Holttinen, H.; Kiviluoma, J.; Helistö, N.; Levy, T.; Menemenlis, N.; Jun, L.; Cutululis, N.; Koivisto, M.; Das, K.; Orths, A.; et al. Design and Operation of Energy Systems with Large Amounts of Variable Generation: Final Summary Report, IEA Wind TCP Task 25; VTT Technical Research Centre of Finland, VTT Technology: Espoo, Finland, 2021. [Google Scholar] [CrossRef]
- Hull, J.C. Options, Futures, and Other Derivatives, 10th ed.; Pearson: New York, NY, USA, 2017. [Google Scholar]
- Ilo, A.; Prata, R.; Strbac, G.; Giannelos, S.; Bissell, G.R.; Kulmala, A.; Constantinescu, N.; Samovich, N.; Iliceto, A. White Paper ETIP SNET—Holistic Architectures for Power Systems. 2019. Available online: http://hdl.handle.net/20.500.12708/39729 (accessed on 6 February 2025).
- Jain, V.; Mitra, A. Artificial intelligence and machine learning for sustainable development: Enhancing health, equity, and environmental sustainability. In Machine and Deep Learning Solutions for Achieving the Sustainable Development Goals; Ruiz-Vanoye, J., Díaz-Parra, O., Eds.; IGI Global Scientific Publishing: Hershey, PA, USA, 2025; pp. 107–124. [Google Scholar] [CrossRef]
- Janczura, J.; Trück, S.; Weron, R.; Wolff, R.C. Identifying spikes and seasonal components in electricity spot price data: A guide to robust modeling. Energy Econ. 2013, 38, 96–110. [Google Scholar] [CrossRef]
- Janner, M.; Fu, J.; Zhang, M.; Levine, S. When to trust your model: Model-based policy optimization. arXiv 2019, arXiv:1906.08253. [Google Scholar]
- Jiang, D.R.; Powell, W.B. Risk-averse approximate dynamic programming with quantile-based risk measures. Math. Oper. Res. 2018, 43, 554–579. [Google Scholar] [CrossRef]
- Jiang, X.; Zhou, Y.; Ming, W.; Yang, P.; Wu, J. An overview of soft open points in electricity distribution networks. IEEE Trans. Smart Grid 2022, 13, 1899–1910. [Google Scholar] [CrossRef]
- Joskow, P.L. Lessons learned from electricity market liberalization. Energy J. 2008, 29, 9–42. [Google Scholar] [CrossRef]
- Jorion, P. Value at Risk: The New Benchmark for Managing Financial Risk, 3rd ed.; McGraw-Hill: Columbus, OH, USA, 2007. [Google Scholar]
- Judson, E.; Fitch-Roy, O.; Soutar, I. Energy democracy: A digital future? Energy Res. Soc. Sci. 2022, 91, 102732. [Google Scholar] [CrossRef]
- Watkins, C.J.C.H.; Dayan, P. Q-learning. Mach. Learn. 1992, 8, 279–292. [Google Scholar] [CrossRef]
- Williams, R.J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 1992, 8, 229–256. [Google Scholar] [CrossRef]
- Schulman, J.; Levine, S.; Abbeel, P.; Jordan, M.; Moritz, P. Trust region policy optimization. In Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, 6–11 July 2015; pp. 1889–1897. [Google Scholar]
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017. [Google Scholar] [CrossRef]
- Sutton, R.S. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, Austin, TX, USA, 21–23 June 1990; pp. 216–224. [Google Scholar]
- Bertsekas, D.P. Dynamic Programming and Optimal Control; Approximate dynamic programming; Athena Scientific: Nashua, NH, USA, 2012; Volume II. [Google Scholar]
- Wang, W.; Huang, Y.; Wang, S. First-order adversarial vulnerability of neural networks and input dimension. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6543–6552. [Google Scholar]
- Prete, C.L.; Blumsack, S. Enhancing the reliability of bulk power systems against the threat of extreme weather: Lessons from the 2021 Texas electricity crisis. Econ. Energy Environ. Policy 2023, 12, 31–48. [Google Scholar] [CrossRef]
- Schwartz, E.S. The stochastic behavior of commodity prices: Implications for valuation and hedging. J. Financ. 1997, 52, 923–973. [Google Scholar] [CrossRef]
- Zareipour, H.; Bhattacharya, K.; Canizares, C. Forecasting the hourly Ontario energy price by multivariate adaptive regression splines. In Proceedings of the 2006 IEEE Power Engineering Society General Meeting, Montreal, QC, Canada, 18–22 June 2006. [Google Scholar]
- Busby, J.W.; Baker, K.; Bazilian, M.D.; Gilbert, A.Q.; Grubert, E.; Rai, V.; Rhodes, J.D.; Shidore, S.; Smith, C.A.; Webber, M.E. Cascading risks: Understanding the 2021 winter blackout in Texas. Energy Res. Soc. Sci. 2021, 77, 102106. [Google Scholar] [CrossRef]
- Nick, S.; Thoenes, S. What drives natural gas prices?—A structural VAR approach. Energy Econ. 2014, 45, 517–527. [Google Scholar] [CrossRef]
- Knittel, C.R.; Roberts, M.R. An empirical examination of restructured electricity prices. Energy Econ. 2005, 27, 791–817. [Google Scholar] [CrossRef]
- Suenaga, H.; Smith, A.; Williams, J. Volatility dynamics of NYMEX natural gas futures prices. J. Futur. Mark. 2008, 28, 438–463. [Google Scholar] [CrossRef]
- Brown, S.P.A.; Yücel, M.K. What drives natural gas prices? Energy J. 2008, 29, 45–60. [Google Scholar] [CrossRef]
- Paraschiv, F.; Erni, D.; Pietsch, R. The impact of renewable energies on EEX day-ahead electricity prices. Energy Policy 2015, 73, 196–210. [Google Scholar] [CrossRef]
- Ketterer, J.C. The impact of wind power generation on the electricity price in Germany. Energy Econ. 2014, 44, 270–280. [Google Scholar] [CrossRef]
- Potomac Economics; Electric Reliability Council of Texas. 2019 State of the Market Report for the ERCOT Electricity Markets. May 2020. Available online: https://www.potomaceconomics.com/wp-content/uploads/2020/06/2019-State-of-the-Market-Report.pdf (accessed on 3 February 2025).
- Staffell, I.; Green, R. Is there still merit in the merit order stack? The impact of dynamic constraints on optimal plant mix. IEEE Trans. Power Syst. 2015, 31, 43–53. [Google Scholar] [CrossRef]
- Woo, C.; Horowitz, I.; Moore, J.; Pacheco, A. The impact of wind generation on the electricity spot-market price level and variance: The Texas experience. Energy Policy 2011, 39, 3939–3944. [Google Scholar] [CrossRef]
- Staffell, I.; Rustomji, M. Maximising the value of electricity storage. J. Energy Storage 2016, 8, 212–225. [Google Scholar] [CrossRef]
- Bunn, D.W.; Oliveira, F.S. Agent-based analysis of technological diversification and specialization in electricity markets. Eur. J. Oper. Res. 2007, 181, 1265–1278. [Google Scholar] [CrossRef]
- Brinkmann, E.J.; Rabinovitch, R. Regional limitations on the hedging effectiveness of natural gas futures. Energy J. 1995, 16, 113–124. [Google Scholar] [CrossRef]
- Karakatsani, N.V.; Bunn, D.W. Forecasting electricity prices: The impact of fundamentals and time-varying coefficients. Int. J. Forecast. 2008, 24, 764–785. [Google Scholar] [CrossRef]
- Boukas, I.; Ernst, D.; Théate, T.; Bolland, A.; Huynen, A.; Buchwald, M.; Wynants, C.; Cornélusse, B. A deep reinforcement learning framework for continuous intraday market bidding. Mach. Learn. 2021, 110, 2335–2387. [Google Scholar] [CrossRef]
- Al Moti, M.M.; Uddin, R.S.; Hai, A.; Bin Saleh, T.; Alam, G.R.; Hassan, M.M.; Hassan, R. Blockchain Based Smart-Grid Stackelberg Model for Electricity Trading and Price Forecasting Using Reinforcement Learning. Appl. Sci. 2022, 12, 5144. [Google Scholar] [CrossRef]
- Pannakkong, W.; Vinh, V.T.; Tuyen, N.N.M.; Buddhakulsomsiri, J. A reinforcement learning approach for ensemble machine learning models in peak electricity forecasting. Energies 2023, 16, 5099. [Google Scholar] [CrossRef]
- Mulliez, M. Dynamic Hedging in the Presence of Basis Risk: A Reinforcement Learning Approach. Master’s Thesis, Imperial College London, London, UK, 2021. [Google Scholar]
- Madahi, S.S.K.; Claessens, B.; Develder, C. Distributional reinforcement learning-based energy arbitrage strategies in imbalance settlement mechanism. J. Energy Storage 2024, 104, 114377. [Google Scholar] [CrossRef]
- Bahrami, S.; Hooshmand, R.-A.; Parastegari, M. Short term electric load forecasting by wavelet transform and grey model improved by PSO (particle swarm optimization) algorithm. Energy 2014, 72, 434–442. [Google Scholar] [CrossRef]
- Wu, L.; Liu, S.; Yao, L.; Yan, S. The effect of sample size on the grey system prediction. Appl. Math. Model. 2013, 37, 6577–6583. [Google Scholar] [CrossRef]
- Zhao, H.; Wu, Q.; Hu, S.; Xu, H.; Rasmussen, C.N. Review of energy storage system for wind power integration support. Appl. Energy 2015, 137, 545–553. [Google Scholar] [CrossRef]
- Mystakidis, A.; Koukaras, P.; Tsalikidis, N.; Ioannidis, D.; Tjortjis, C. Energy Forecasting: A Comprehensive Review of Techniques and Technologies. Energies 2024, 17, 1662. [Google Scholar] [CrossRef]
- Boucetta, L.N.; Amrane, Y.; Chouder, A.; Arezki, S.; Kichou, S. Enhanced Forecasting Accuracy of a Grid-Connected Photovoltaic Power Plant: A Novel Approach Using Hybrid Variational Mode Decomposition and a CNN-LSTM Model. Energies 2024, 17, 1781. [Google Scholar] [CrossRef]
- Dos Reis, J.R.; Tabora, J.M.; de Lima, M.C.; Monteiro, F.P.; Monteiro, S.C.D.A.; Bezerra, U.H.; Tostes, M.E.D.L. Medium and long term energy forecasting methods: A literature review. IEEE Access 2025, 13, 29305–29326. [Google Scholar] [CrossRef]
- Sadeghi, M.; Shavvalpour, S. Energy risk management and value at risk modeling. Energy Policy 2006, 34, 3367–3373. [Google Scholar] [CrossRef]
- Wang, X.; Liu, H.; Yao, Y. Value-at-Risk forecasting for the Chinese new energy stock market: An explainable quantile regression neural network method. Procedia Comput. Sci. 2024, 242, 1096–1103. [Google Scholar] [CrossRef]
- Abdullah, B.U.D.; Khanday, S.A.; Islam, N.U.; Lata, S.; Fatima, H.; Nengroo, S.H. Comparative Analysis Using Multiple Regression Models for Forecasting Photovoltaic Power Generation. Energies 2024, 17, 1564. [Google Scholar] [CrossRef]
- Zhang, S.; Tu, L.; Duan, Q.; Chao, Z.; Tang, X.; Wanyan, X.; Chen, X. Conditional Value at Risk Model of New Power System Reserve Assessment Considering Primary Energy Supply Risk. In Energy Power and Automation Engineering; ICEPAE Lecture Notes in Electrical Engineering; Springer: Singapore, 2023; Volume 1118. [Google Scholar]
- Trabelsi, N.; Tiwari, A.K.; Ghallabi, F.; Khemakhem, I. Nexus of crude oil and clean energy stock indices: Evidence from time-vector-auto-regression in conjunction with conditional-autoregressive-value-at-risk. Heliyon 2025, 11, e40970. [Google Scholar] [CrossRef]
- Barrera-Rivera, R.R.; Valencia-Herrera, H. Hedging and optimization of energy asset portfolios. In Artificial Intelligence and Soft Computing for Energy Systems; Springer: Berlin/Heidelberg, Germany, 2022; pp. 113–126. [Google Scholar]
- Syalsabila, A.; Prastyo, D.D.; Akbar, M.S.; Rahayu, S.P.; Deivanayagampillai, N. Conditional Value-At-Risk Modelling Using Hybrid LASSO-QRNN to Quantify the Market Risk Dependence on Oil and Gas Companies’ Stock in Indonesia. In Advances in Manufacturing Processes and Smart Manufacturing Systems, GCMM 2023; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2024; Volume 1215. [Google Scholar] [CrossRef]
- Wilmott, P.; Dewynne, J.; Howison, S. Option Pricing: Mathematical Models and Computation; Oxford Financial Press: Oxford, UK, 1995. [Google Scholar]
- Buehler, H.; Gonon, L.; Teichmann, J.; Wood, B. Deep hedging. Quant. Financ. 2019, 19, 1271–1291. [Google Scholar] [CrossRef]
- Becker, S.; Cheridito, P.; Jentzen, A. Deep optimal stopping. J. Mach. Learn. Res. 2019, 20, 1–25. [Google Scholar]
- Becker, S.; Cheridito, P.; Jentzen, A. Pricing and hedging American-style options with deep learning. J. Risk Financ. Manag. 2020, 13, 158. [Google Scholar] [CrossRef]
- Song, C. Design and application of financial market option pricing system based on high-performance computing and deep reinforcement learning. Sci. Program. 2022, 2022, 8525361. [Google Scholar] [CrossRef]
- Boogert, A.; de Jong, C. Gas storage valuation using a Monte Carlo method. J. Deriv. 2008, 15, 81–98. [Google Scholar] [CrossRef]
- Oren, S.S. Integrating real and financial options in demand-side electricity contracts. Decis. Support Syst. 2001, 30, 279–288. [Google Scholar] [CrossRef]
- Tseng, C.-L.; Barz, G. Short-Term Generation Asset Valuation: A Real Options Approach. Oper. Res. 2002, 50, 297–310. [Google Scholar] [CrossRef]
- Nadarajah, S.; Secomandi, N. A review of the operations literature on real options in energy. Eur. J. Oper. Res. 2022, 309, 469–487. [Google Scholar] [CrossRef]
- Alqubaisi, A. Deep Real Options: Valuation of Real Options on Green Energy using Deep Learning Methods. Energy Econ. 2023, 120, 106553. [Google Scholar]
- Beulertz, D.; Charousset, S.; Most, D.; Giannelos, S.; Yueksel-Erguen, I. Development of a modular framework for future energy system analysis. In Proceedings of the 2019 54th International Universities Power Engineering Conference (UPEC), Bucharest, Romania, 3–6 September 2019; pp. 1–6.
- Siddiqui, A.S.; Maribu, K. Investment and upgrade in distributed generation under uncertainty. Energy Econ. 2009, 31, 25–37.
- Nick, M.; Cherkaoui, R.; Paolone, M. Optimal planning of distributed energy storage systems in active distribution networks embedding grid reconfiguration. IEEE Trans. Power Syst. 2017, 33, 1577–1590.
- Wogrin, S.; Galbally, D.; Reneses, J. Optimizing storage operations in medium- and long-term power system models. IEEE Trans. Power Syst. 2015, 31, 3129–3138.
- Papadaskalopoulos, D.; Strbac, G. Nonlinear and sequential pricing models for active demand response. IEEE Trans. Smart Grid 2017, 8, 1349–1359.
- Amann, G.; Escobedo Bermúdez, V.R.; Boskov-Kovacs, E.; Gallego Amores, S.; Giannelos, S.; Iliceto, A.; Ilo, A.; Chavarro, J.R.; Samovich, N.; Schmitt, L.; et al. E-Mobility Deployment and Impact on Grids: Impact of EV and Charging Infrastructure on European T&D Grids: Innovation Needs; Gallego Amores, S., Ed.; MJ-09-22-246-EN-N; Publications Office of the European Union: Luxembourg, 2022.
- Borozan, S.; Giannelos, S.; Strbac, G. Strategic network expansion planning with electric vehicle smart charging concepts as investment options. Adv. Appl. Energy 2022, 5, 100077.
- Borozan, S.; Giannelos, S.; Aunedi, M.; Strbac, G. Option value of EV smart charging concepts in transmission expansion planning under uncertainty. In Proceedings of the 2022 IEEE 21st Mediterranean Electrotechnical Conference (MELECON), Palermo, Italy, 14–16 June 2022; pp. 63–68.
- Giannelos, S.; Konstantelos, I.; Strbac, G. Investment Model for Cost-effective Integration of Solar PV Capacity under Uncertainty using a Portfolio of Energy Storage and Soft Open Points. In Proceedings of the 2019 IEEE Milan PowerTech, Milan, Italy, 2019; pp. 1–6.
- Borozan, S.; Giannelos, S.; Falugi, P.; Moreira, A.; Strbac, G. Machine Learning-Enhanced Benders Decomposition Approach for the Multi-Stage Stochastic Transmission Expansion Planning Problem. Electr. Power Syst. Res. 2024, 237, 110985.
- Khurana, U.; Samulowitz, H.; Turaga, D. Feature Engineering for Predictive Modeling Using Reinforcement Learning. Proc. AAAI Conf. Artif. Intell. 2018, 32, 3407–3414.
- Moran, M.; Gordon, G. Deep Curious Feature Selection: A Recurrent, Intrinsic-Reward Reinforcement Learning Approach to Feature Selection. IEEE Trans. Artif. Intell. 2023, 5, 1174–1184.
- Kaloev, M.; Krastev, G. Experiments Focused on Exploration in Deep Reinforcement Learning. In Proceedings of the 2021 5th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), Ankara, Turkey, 21–23 October 2021; pp. 351–355.
- Tittaferrante, A.; Yassine, A. Benchmarking Offline Reinforcement Learning. In Proceedings of the 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), Nassau, Bahamas, 12–14 December 2022; pp. 259–263.
- Kamruzzaman; Duan, J.; Shi, D.; Benidris, M. A Deep Reinforcement Learning-Based Multi-Agent Framework to Enhance Power System Resilience Using Shunt Resources. IEEE Trans. Power Syst. 2021, 36, 5525–5536.
- Nkurunziza, F.; Kabanda, R.; McSharry, P. Enhancing poverty classification in developing countries through machine learning: A case study of household consumption prediction in Rwanda. Cogent Econ. Financ. 2024, 13, 2444374.
- Raghavendra, A.H.; Majhi, S.G.; Mukherjee, A.; Bala, P.K. Role of artificial intelligence (AI) in poverty alleviation: A bibliometric analysis. VINE J. Inf. Knowl. Manag. Syst. 2023, 55, 710–729.
- Satria, D.; Permani, R.; Winarno, K.; Kaluge, D.; Indraswari, C.R.; Handrito, R.P. An exploratory study of high-educated poverty through machine learning approach: A case study of East Java, Indonesia. J. Bus. Manag. Econ. Eng. 2025, 23, 92–107.
- Kwilinski, A.; Lyulyov, O.; Pimonenko, T. Energy Poverty and Democratic Values: A European Perspective. Energies 2024, 17, 2837.
- Dall-Orsoletta, A.; Cunha, J.; Araújo, M.; Ferreira, P. A systematic review of social innovation and community energy transitions. Energy Res. Soc. Sci. 2022, 88, 102625.
- Palma, G.; Guiducci, L.; Stentati, M.; Rizzo, A.; Paoletti, S. Reinforcement Learning for Energy Community Management: A European-Scale Study. Energies 2024, 17, 1249.
- Ponse, K.; Kleuker, F.; Fejér, M.; Serra-Gómez, Á.; Plaat, A.; Moerland, T. Reinforcement learning for sustainable energy: A survey. arXiv 2024.
- Pillan, M.; Costa, F.; Caiola, V. How could people and communities contribute to the energy transition? Conceptual maps to inform, orient, and inspire design actions and education. Sustainability 2023, 15, 14600.
- Neij, L.; Palm, J.; Busch, H.; Bauwens, T.; Becker, S.; Bergek, A.; Buzogány, A.; Candelise, C.; Coenen, F.; Devine-Wright, P.; et al. Energy communities—Lessons learnt, challenges, and policy recommendations. Oxf. Open Energy 2025, 4, oiaf002.
- Abbas, K.; Butt, K.M.; Xu, D.; Ali, M.; Baz, K.; Kharl, S.H.; Ahmed, M. Measurements and determinants of extreme multidimensional energy poverty using machine learning. Energy 2022, 251, 123977.
- Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M. A review of machine learning approaches to power system security and stability. IEEE Access 2020, 8, 113512–113531.
- Piras, G.; Muzi, F.; Ziran, Z. Open tool for automated development of renewable energy communities: Artificial intelligence and machine learning techniques for methodological approach. Energies 2024, 17, 5726.
- Kaur, S.; Kumar, R.; Singh, K.; Huang, Y. Leveraging Artificial Intelligence for Enhanced Sustainable Energy Management. J. Sustain. Energy 2024, 3, 1–20.
- Nalli, P.K.; Manikandan, K.P.; Padmapriya, G.; Bhatt, D.; Talukdar, N.; Premkumar, R. Optimizing energy systems using machine learning and artificial intelligence. In Integrating Artificial Intelligence Into the Energy Sector; Derbali, A., Ed.; IGI Global Scientific Publishing: Hershey, PA, USA, 2025; pp. 493–514.
- Alturif, G.; Saleh, W.; El-Bary, A.A.; Osman, R.A. Using artificial intelligence tools to predict and alleviate poverty. Entrep. Sustain. Issues 2024, 12, 400–413.
- Bose, S.; Kremers, E.; Mengelkamp, E.M.; Eberbach, J.; Weinhardt, C. Reinforcement learning in local energy markets. Energy Inform. 2021, 4, 7.

Study | RL Algorithm | Energy Application | Data Characteristics | Key Findings |
---|---|---|---|---|
Jiang and Powell (2018) [85] | Value iteration with function approximation | Ensemble forecasting of electricity prices | PJM hourly price data | RL-based ensembles adapt better to regime changes than static ensemble methods |
Boukas et al. (2020) [114] | Proximal policy optimization | Intraday electricity trading | Nord Pool intraday market | RL strategy outperforms benchmark strategies by 15–28% in risk-adjusted returns |
Du et al. (2021) [46] | Multi-agent DQN | Bidding strategy in day-ahead markets | ERCOT market data | Multi-agent approach effectively approximates Nash equilibrium solutions |
Moti (2022) [115] | Q-learning | Electricity price prediction in blockchain-based grid | Simulated smart grid environment | RL framework mediates operator–consumer interactions for price prediction |
Pannakkong et al. (2023) [116] | Double deep Q-network | Peak electricity demand forecasting | Thailand’s electricity demand data | DDQN outperformed individual ML models by dynamically selecting optimal models |
Guo and Wang (2020) [72] | Deep Q-network | Adaptive model selection for price forecasting | ISO New England data | RL framework reduced MAPE by 18% compared to the best individual model |
Cao et al. (2023) [30] | Deep distributional RL | Options portfolio hedging | Simulated and empirical energy data | Outperformed delta hedging by 22–30% in managing non-linear risks |
Mulliez (2021) [117] | Q-learning | Dynamic hedging with basis risk | Natural gas basis spreads | Adaptive hedging outperformed traditional approaches under time-varying risks |
Chen et al. (2020) [34] | Hybrid RL + supervised learning | Energy portfolio hedging | Futures and spot price data | Cross-learning improved profit-risk tradeoffs vs. static hedging |
Karimi Madahi et al. (2024) [118] | Distributional RL | Battery storage arbitrage | UK imbalance settlement prices | Captured asymmetric risk profiles better than expected value methods |

Study | Valuation Problem | RL Methodology | Energy Focus | Key Contribution |
---|---|---|---|---|
Halperin (2019) [74] | General option pricing | Q-learning | Energy options | Model-free approach deriving pricing functions directly from empirical data |
Buehler et al. (2019) [133] | Hedging under market frictions | Deep reinforcement learning | Options hedging | Framework accommodating transaction costs and market incompleteness |
Becker et al. (2019) [134] | Optimal stopping | Deep Q-network | American-style options | Direct learning of exercise policies without explicit continuation values |
Becker et al. (2020) [135] | Gas swing option valuation | Deep reinforcement learning | Natural gas contracts | RL approach superior to LSMC for contracts with complex constraints |
Marzban et al. (2023) [18] | Risk-aware option pricing | Actor-critic with risk measures | Energy derivatives | Incorporation of expectile risk measures for risk-averse valuation |
Song (2022) [136] | Computationally efficient pricing | Deep RL with high-performance computing | Energy option pricing | Real-time pricing under dynamic market conditions |
Carbonneau (2021) [29] | Equal risk pricing | Neural networks with RL | Energy derivatives | Pricing framework reflecting actual hedging costs and residual risks |
Dalal et al. (2016) [41] | Generation asset valuation | Deep deterministic policy gradient (DDPG) | Power generation | Operating policies maximizing value under technical constraints |
Boogert and de Jong (2008) [137] | Gas storage valuation | Least-squares Monte Carlo (simulation-based, non-RL benchmark) | Natural gas storage | Capturing complex intertemporal tradeoffs in storage operations |
Lee et al. (2023) [10] | CCU investment valuation | RL with real options | Carbon capture | Framework for identifying optimal investment timing under uncertainty |
Caputo and Cardin (2022) [31] | Waste-to-energy system valuation | Deep RL for flexibility analysis | Energy systems | DRL models improved economic outcomes by up to 69% vs. traditional approaches |
Cheraghi et al. (2024) [38] | Energy transition investment | RL for sustainable planning | Renewable energy | Dynamic optimization considering environmental and regulatory uncertainty |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Giannelos, S. Reinforcement Learning in Energy Finance: A Comprehensive Review. Energies 2025, 18, 2712. https://doi.org/10.3390/en18112712