Hybrid Deep Reinforcement Learning Considering Discrete-Continuous Action Spaces for Real-Time Energy Management in More Electric Aircraft
Abstract
1. Introduction
- (1) A new MEA energy management model is developed that takes into account the operating states of the generators, the connection-priority relationships between the generators and the buses, and the shedding-priority relationships of the loads.
- (2) For the MEA model, a hybrid deep reinforcement learning (HDRL) algorithm incorporating D3QN and DDPG is proposed to solve the MEA energy management problem, taking the generator-bus connection relationships and the load-shedding relationships as the discrete action space, and the generator output powers and the ESS charging/discharging power as the continuous action space. The proposed HDRL algorithm inherits the advantage of deep Q-learning in handling discrete action spaces and exploits the advantage of DDPG in handling continuous action spaces. During training, the discrete actions are trained first and then the continuous actions, alternating between the two until full convergence.
- (3) Simulation studies based on data from [12] demonstrate the effectiveness of the proposed HDRL method for the generators under different operating conditions. The real-time capability of the method is verified by applying it over different management horizons T.
2. MEA System Description
2.1. EPS Model
2.2. Mathematical Formulations
2.2.1. Load and Bus Priority Constraints
2.2.2. Power Balance Constraint
2.2.3. Bus Connection and Generator State Constraints
2.2.4. Generator and Bus Power Capacity Constraints
2.2.5. Generator Optimum Operating Range Constraint
2.2.6. ESS Constraints
2.2.7. MEA Energy Management Optimization Model
- (1) The cost function that limits the generator to operate within its optimal range [29];
- (2) The cost function that implements the predefined connection rules between the generators and the buses;
- (3) The cost function that reduces non-critical load shedding;
- (4) The cost function that accounts for the lifetime of the battery in the ESS.
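These four terms are typically combined into a single weighted objective for the optimizer to minimize. A minimal sketch, assuming a quadratic out-of-band penalty for the generator term and purely illustrative weights (the paper's exact cost formulas are not reproduced here):

```python
def range_cost(p, p_lo, p_hi):
    """Illustrative term (1): quadratic penalty when generator output
    leaves its optimal operating band [p_lo, p_hi]."""
    if p < p_lo:
        return (p_lo - p) ** 2
    if p > p_hi:
        return (p - p_hi) ** 2
    return 0.0

def total_cost(costs, weights):
    """Weighted sum of the four cost terms
    (range, connection rules, load shedding, battery lifetime)."""
    return sum(w * c for w, c in zip(weights, costs))
```

For example, `total_cost([range_cost(12, 0, 10), 0.0, 1.0, 2.0], (0.25, 0.25, 0.25, 0.25))` evaluates the objective for one hypothetical operating point.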
3. HDRL-Based Solution
3.1. MDP Formulation
3.2. HDRL
3.2.1. D3QN
3.2.2. DDPG
3.2.3. HDRL Algorithm Process
4. Case Study
4.1. System Setup
4.2. Model Training
4.3. Simulation Results
4.4. On-Line Hardware-in-the-Loop Test
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Mohamed, M.A.; Abdullah, H.M.; El-Meligy, M.A.; Sharaf, M.; Soliman, A.T.; Hajjiah, A. A novel fuzzy cloud stochastic framework for energy management of renewable microgrids based on maximum deployment of electric vehicles. Int. J. Electr. Power Energy Syst. 2021, 129, 106845. [Google Scholar] [CrossRef]
- Sarlioglu, B.; Morris, C.T. More electric aircraft: Review, challenges, and opportunities for commercial transport aircraft. IEEE Trans. Transp. Electrif. 2015, 1, 54–64. [Google Scholar] [CrossRef]
- Wheeler, P.; Bozhko, S. The more electric aircraft: Technology and challenges. IEEE Electrif. Mag. 2014, 2, 6–12. [Google Scholar] [CrossRef]
- Rosero, J.A.; Ortega, J.A.; Aldabas, E.; Romeral, L.A.R.L. Moving towards a more electric aircraft. IEEE Aerosp. Electron. Syst. Mag. 2007, 22, 3–9. [Google Scholar] [CrossRef]
- Chen, J.; Song, Q. A decentralized energy management strategy for a fuel cell/supercapacitor-based auxiliary power unit of a more electric aircraft. IEEE Trans. Ind. Electron. 2019, 66, 5736–5747. [Google Scholar] [CrossRef]
- Chen, J.; Song, Q.; Yin, S.; Chen, J. On the decentralized energy management strategy for the all-electric APU of future more electric aircraft composed of multiple fuel cells and supercapacitors. IEEE Trans. Ind. Electron. 2019, 67, 6183–6194. [Google Scholar] [CrossRef]
- Li, X.; Wu, X. Autonomous energy management strategy for a hybrid power system of more-electric aircraft based on composite droop schemes. Int. J. Electr. Power Energy Syst. 2021, 129, 106828. [Google Scholar] [CrossRef]
- Mohamed, M.A.; Yeoh, S.S.; Atkin, J.; Hussaini, H.; Bozhko, S. Efficiency focused energy management strategy based on optimal droop gain design for more electric aircraft. IEEE Trans. Transp. Electrif. 2022. Available online: https://ieeexplore.ieee.org/abstract/document/9734043 (accessed on 14 March 2022).
- Wang, Y.; Xu, F.; Mao, S.; Yang, S.; Shen, Y. Adaptive online power management for more electric aircraft with hybrid energy storage systems. IEEE Trans. Transp. Electrif. 2020, 6, 1780–1790. [Google Scholar] [CrossRef]
- Zhang, Y.; Peng, G.O.H.; Banda, J.K.; Dasgupta, S.; Husband, M.; Su, R.; Wen, C. An energy efficient power management solution for a fault-tolerant more electric engine/aircraft. IEEE Trans. Ind. Electron. 2018, 66, 5663–5675. [Google Scholar] [CrossRef]
- Zhang, Y.; Yu, Y.; Su, R.; Chen, J. Power scheduling in more electric aircraft based on an optimal adaptive control strategy. IEEE Trans. Ind. Electron. 2019, 67, 10911–10921. [Google Scholar] [CrossRef]
- Zhang, Y.; Chen, J.; Yu, Y. Distributed power management with adaptive scheduling horizons for more electric aircraft. Int. J. Electr. Power Energy Syst. 2021, 126, 106581. [Google Scholar] [CrossRef]
- Maasoumy, M.; Nuzzo, P.; Iandola, F.; Kamgarpour, M.; Sangiovanni-Vincentelli, A.; Tomlin, C. Optimal load management system for aircraft electric power distribution. In Proceedings of the 52nd IEEE Conference on Decision and Control, Florence, Italy, 10–13 December 2013. [Google Scholar]
- Barzegar, A.; Su, R.; Wen, C.; Rajabpour, L.; Zhang, Y.; Gupta, A.; Lee, M.Y. Intelligent power allocation and load management of more electric aircraft. In Proceedings of the 11th IEEE International Conference on Power Electronics and Drive Systems, Sydney, NSW, Australia, 9–12 June 2015. [Google Scholar]
- Zou, H.; Tao, J.; Elsayed, S.K.; Elattar, E.E.; Almalaq, A.; Mohamed, M.A. Stochastic multi-carrier energy management in the smart islands using reinforcement learning and unscented transform. Int. J. Electr. Power Energy Syst. 2021, 130, 106988. [Google Scholar] [CrossRef]
- François-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G.; Pineau, J. An introduction to deep reinforcement learning. Found. Trends Mach. Learn. 2018, 11, 219–354. [Google Scholar] [CrossRef]
- Ji, Y.; Wang, J.; Xu, J.; Fang, X.; Zhang, H. Real-time energy management of a microgrid using deep reinforcement learning. Energies 2019, 12, 2291. [Google Scholar] [CrossRef]
- Du, G.; Zou, Y.; Zhang, X.; Liu, T.; Wu, J.; He, D. Deep reinforcement learning based energy management for a hybrid electric vehicle. Energy 2020, 201, 117591. [Google Scholar] [CrossRef]
- Wu, Y.; Tan, H.; Peng, J.; Zhang, H.; He, H. Deep reinforcement learning of energy management with continuous control strategy and traffic information for a series-parallel plug-in hybrid electric bus. Appl. Energy 2019, 247, 454–466. [Google Scholar] [CrossRef]
- Yu, L.; Xie, W.; Xie, D.; Zou, Y.; Zhang, D.; Sun, Z.; Jiang, T. Deep reinforcement learning for smart home energy management. IEEE Internet Things J. 2019, 7, 2751–2762. [Google Scholar] [CrossRef]
- Fan, L.; Zhang, J.; He, Y.; Liu, Y.; Hu, T.; Zhang, H. Optimal scheduling of microgrid based on deep deterministic policy gradient and transfer learning. Energies 2021, 14, 584. [Google Scholar] [CrossRef]
- Zhu, Z.; Weng, Z.; Zheng, H. Optimal Operation of a Microgrid with Hydrogen Storage Based on Deep Reinforcement Learning. Electronics 2022, 11, 196. [Google Scholar] [CrossRef]
- Yu, L.; Qin, S.; Zhang, M.; Shen, C.; Jiang, T.; Guan, X. A review of deep reinforcement learning for smart building energy management. IEEE Internet Things J. 2021, 8, 12046–12063. [Google Scholar] [CrossRef]
- Gao, G.; Li, J.; Wen, Y. DeepComfort: Energy-efficient thermal comfort control in buildings via reinforcement learning. IEEE Internet Things J. 2020, 7, 8472–8484. [Google Scholar] [CrossRef]
- Huang, C.; Zhang, H.; Wang, L.; Luo, X.; Song, Y. Mixed Deep Reinforcement Learning Considering Discrete-continuous Hybrid Action Space for Smart Home Energy Management. J. Mod. Power Syst. Clean Energy 2022, 10, 743–754. [Google Scholar] [CrossRef]
- Ye, Y.; Qiu, D.; Wu, X.; Strbac, G.; Ward, J. Model-free real-time autonomous control for a residential multi-energy system using deep reinforcement learning. IEEE Trans. Smart Grid 2020, 11, 3068–3082. [Google Scholar] [CrossRef]
- Huang, Y.; Wei, G.; Wang, Y. VD D3QN: The variant of double deep Q-learning network with dueling architecture. In Proceedings of the 2018 37th Chinese Control Conference (CCC), Wuhan, China, 25–27 July 2018; pp. 9130–9135. [Google Scholar]
- Chen, J.; Wang, C.; Chen, J. Investigation on the selection of electric power system architecture for future more electric aircraft. IEEE Trans. Transp. Electrif. 2018, 4, 563–576. [Google Scholar] [CrossRef]
- Xu, B.; Guo, F.; Xing, L.; Wang, Y.; Zhang, W.A. Accelerated and Adaptive Power Scheduling for More Electric Aircraft via Hybrid Learning. IEEE Trans. Ind. Electron. 2022. Available online: https://ieeexplore.ieee.org/abstract/document/9714232 (accessed on 15 February 2022).
- Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016. [Google Scholar]
- Wang, Z.; Schaul, T.; Hessel, M.; Hasselt, H.; Lanctot, M.; Freitas, N. Dueling network architectures for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 1995–2003. [Google Scholar]
| Bus | Priority: 1st | Priority: 2nd | Priority: 3rd |
|---|---|---|---|
| Bus1 | Main Gen1 | Main Gen2 | APU |
| Bus2 | Main Gen2 | Main Gen1 | APU |
| Parameters | Value | Parameters | Value |
|---|---|---|---|
| | 100, 600 | | 0.98 |
| | 480, 480, 480 | | 0.95 |
| | 600 | | 0.5 |
| | −100, 100 | | 400 |
| | 0.2, 1 | | |
| | 1, 20 | | 0.001, 0.2495, 0.2495, 0.5 |
| Management length T (time slots) | | 20 | 40 | 60 | 80 |
|---|---|---|---|---|---|
| Number of integer variables | | 420 | 840 | 1260 | 1680 |
| Number of continuous variables | | 400 | 800 | 1200 | 1600 |
| Optimal objective value | DDPG | 11.90 | 21.69 | 31.24 | 38.36 |
| | DQN + DDPG | 13.36 | 22.45 | 31.56 | 40.87 |
| | Gurobi | 8.32 | 16.66 | 24.64 | 31.34 |
| | Proposed | 8.65 | 18.71 | 26.41 | 35.87 |
| Solution time (s) | DDPG | 1.73 × 10⁻¹ | 1.81 × 10⁻¹ | 1.83 × 10⁻¹ | 1.90 × 10⁻¹ |
| | DQN + DDPG | 1.75 × 10⁻¹ | 1.78 × 10⁻¹ | 1.85 × 10⁻¹ | 1.95 × 10⁻¹ |
| | Gurobi | 13.42 | 109.42 | 2270.09 | 7169.95 |
| | Proposed | 1.67 × 10⁻¹ | 1.72 × 10⁻¹ | 1.86 × 10⁻¹ | 1.88 × 10⁻¹ |
| Speedup (Gurobi/Proposed) | | 80.36 | 636.16 | 12,204.78 | 38,138.03 |
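The speedup row is simply the ratio of the Gurobi and proposed solution times; a quick check of the reported numbers:

```python
gurobi_s = [13.42, 109.42, 2270.09, 7169.95]       # Gurobi solution times (s)
proposed_s = [1.67e-1, 1.72e-1, 1.86e-1, 1.88e-1]  # proposed HDRL times (s)
speedup = [g / p for g, p in zip(gurobi_s, proposed_s)]
# close to the reported 80.36, 636.16, 12,204.78, 38,138.03 (to rounding)
```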
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, B.; Xu, B.; He, T.; Yu, W.; Guo, F. Hybrid Deep Reinforcement Learning Considering Discrete-Continuous Action Spaces for Real-Time Energy Management in More Electric Aircraft. Energies 2022, 15, 6323. https://doi.org/10.3390/en15176323