# Deep Learning Optimal Control for a Complex Hybrid Energy Storage System


## Abstract


## 1. Introduction

## 2. Methodology

#### 2.1. System Description

… m$^{2}$ distributed over two floors, each having a living surface area of 50 m$^{2}$, and it was assumed to be inhabited by four people. The ceiling/floor heights considered were 2.5 m/3.0 m, while the building width/depth were 6.5 m/8.0 m. The glazing ratio considered was 20% on the south side, 10% on the north side and 12% on the east and west sides. The energy demand profiles for cooling, heating and DHW of the building were obtained within the HYBUILD project [23] activities; the details of the energy demand calculations are outside the scope of this paper.

#### 2.2. Components Models and Operating Modes Description

#### 2.2.1. Fresnel Collectors

… (in W/m$^{2}$) is the direct normal irradiance at the specified location [24], ${T}_{m}=95$ °C is the mean receiver temperature, ${T}_{amb}$ (in °C) is the ambient air temperature [24], ${v}_{w}$ (in m/s) is the wind speed [24] and ${A}_{Fres}=60$ m$^{2}$ is the total surface area of the solar collectors.

#### 2.2.2. PV Panels

… (in W/m$^{2}$) is the plane of array (POA) irradiance at the specified location and ${A}_{PV}=20.9$ m$^{2}$ is the PV panels surface area. The efficiency of auxiliary components related to the PV system (DC/DC converter, connections, etc.) was assumed to be accounted for in ${\eta}_{PV}$.

… (in W/m$^{2}$) is the sum of three contributions, as shown in Equation (3):

… (in W/m$^{2}$) is the POA beam component, ${E}_{g}$ (in W/m$^{2}$) is the POA ground-reflected component and ${E}_{d}$ (in W/m$^{2}$) is the POA sky-diffuse component.
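The PV model described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: Equations (2) and (3) are not reproduced in this text, so the product form `eta * E_POA * A` and the beam symbol `e_b` are assumptions, and the function names are hypothetical. The efficiency value 0.16 is the reference value listed in the robustness table.

```python
A_PV = 20.9    # m^2, PV panel surface area (from the text)
ETA_PV = 0.16  # reference PV efficiency (from the robustness table)


def poa_irradiance(e_b: float, e_g: float, e_d: float) -> float:
    """Sum of beam, ground-reflected and sky-diffuse POA components, in W/m^2."""
    return e_b + e_g + e_d


def pv_power(e_poa: float, eta_pv: float = ETA_PV, area: float = A_PV) -> float:
    """Assumed form P = eta_PV * E_POA * A_PV, returned in W."""
    return eta_pv * e_poa * area


# Illustrative irradiance values only, not data from the paper.
e_poa = poa_irradiance(e_b=700.0, e_g=50.0, e_d=150.0)
print(f"E_POA = {e_poa:.0f} W/m^2, P_PV = {pv_power(e_poa):.1f} W")
```

Because auxiliary losses are folded into ${\eta}_{PV}$, no separate converter efficiency appears in the sketch.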

#### 2.2.3. Heat Pump and PCM Tank

#### 2.2.4. Sorption Chiller

#### 2.2.5. Dry Cooler

#### 2.2.6. DHW Tank

#### 2.2.7. Buffer Tank

… m$^{2}$ is the surface area of the buffer tank edge (lateral surface area) and ${A}_{base}=0.62$ m$^{2}$ is the surface area of the base of the buffer tank.

#### 2.2.8. DC-Bus

#### 2.2.9. Summary of the Main Model Parameters

- Surface of the Fresnel solar collectors: 60 m$^{2}$.
- PV panels surface: 20.9 m$^{2}$.
- PV panels orientation: 0° (south).
- PV panels inclination: 30°.
- PCM tank storage capacity: $\approx 43,200$ kJ (12 kWh).
- DHW tank capacity: 250 L.
- DHW electric heater power: 2 kW.
- Buffer tank capacity: 800 L.
- Battery energy storage capacity: 7.3 kWh.
- Maximum battery charging/discharging power: 3 kW.

#### 2.3. DRL Control Description

#### 2.3.1. General Description

- A set of states $S$ that represents the environment, where ${S}_{t}\in S$ is the environment state at time $t$.
- A set of actions $A$ that can be taken by the agent, where ${A}_{t}\in A\left(s\right)$ is the action taken at time $t$ from $A\left(s\right)$, the subset of actions available in state $s$.
- A numerical reward for the newly visited state, ${R}_{t+1}\in \mathbb{R}$, which depends on the trajectory: ${S}_{0},{A}_{0},{R}_{1},{S}_{1},{A}_{1},{R}_{2},$ …, ${S}_{t},{A}_{t},{R}_{t+1}$.
- Assuming that the system dynamics are Markovian, the random variables ${S}_{t}$ and ${R}_{t}$ depend only on the immediately preceding state and action, with a probability distribution $p$ that characterizes the system, defined as in Equation (28): $$p\left({s}^{\prime},r\mid s,a\right)\doteq Pr\left\{{S}_{t}={s}^{\prime},{R}_{t}=r\mid {S}_{t-1}=s,{A}_{t-1}=a\right\}.$$
- An agent policy, $\pi $, which determines the action chosen in a given state. Defined as a probability, $\pi \left(a|s\right)$ is the probability of choosing action $a$ in state $s$.
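The elements listed above can be illustrated with a toy Markov decision process. The two states, the actions, the transition table `P` and the policy `PI` below are hypothetical and unrelated to the HYBUILD model; the sketch only shows how a trajectory is sampled from $p\left({s}^{\prime},r|s,a\right)$ and $\pi \left(a|s\right)$.

```python
import random

# Hypothetical dynamics p(s', r | s, a): each entry lists (probability, next_state, reward).
P = {
    ("low", "charge"): [(1.0, "high", -1.0)],
    ("low", "idle"): [(1.0, "low", 0.0)],
    ("high", "discharge"): [(0.9, "low", 2.0), (0.1, "high", 0.0)],
    ("high", "idle"): [(1.0, "high", 0.0)],
}

# Hypothetical stochastic policy pi(a | s).
PI = {
    "low": {"charge": 0.8, "idle": 0.2},
    "high": {"discharge": 0.7, "idle": 0.3},
}


def sample_action(state, rng):
    """Draw an action from pi(. | state)."""
    actions = list(PI[state])
    weights = [PI[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def step(state, action, rng):
    """Draw (next_state, reward) from p(., . | state, action)."""
    outcomes = P[(state, action)]
    probs = [o[0] for o in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=probs, k=1)[0]
    return next_state, reward


def rollout(s0, n_steps, seed=0):
    """Return a list of (S_t, A_t, R_{t+1}, S_{t+1}) tuples."""
    rng = random.Random(seed)
    trajectory, state = [], s0
    for _ in range(n_steps):
        action = sample_action(state, rng)
        next_state, reward = step(state, action, rng)
        trajectory.append((state, action, reward, next_state))
        state = next_state
    return trajectory
```

The Markov property shows up in `step`, which needs only the current state and action, never the earlier history.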

#### 2.3.2. Policy Gradient Algorithms

#### 2.3.3. HYBUILD Control Model

- Thermal energy demand for cooling/heating in the current time slot ($T{E}_{t}^{dem}$).
- Thermal energy demand for domestic hot water (DHW) in the current time slot ($T{E}_{t}^{dhw}$).
- Ambient temperature (${T}_{amb,t}$).
- Energy cost for electric demand in the current time slot (${C}_{t}$).
- Charge level of the PCM tank subsystem (${E}_{PCM,t}$), as explained in Section 2.2.3. Not used in heating mode.
- Buffer tank top temperature (${T}_{buffer,top,t}$), as explained in Section 2.2.7.
- Battery state of charge in the DC-bus subsystem (${B}_{S,t}$), as explained in Section 2.2.8.

Here, $t$ is the corresponding time slot, and all state variables were standard-normalized according to their ranges.

- Charging/discharging power is set to a fixed value, namely 3 kW.
- If from the high-level control the DC-bus is forced to operate in charging, buffer or discharging mode, the pair of values $\left({E}_{1},{E}_{2}\right)$ is set to three fixed levels: (75, 90), (10, 90) and (10, 25), respectively, as a percentage of the battery state of charge, ${B}_{S}$.
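The two simplifications above amount to a small lookup from the high-level battery mode to the DC-bus set-points. A minimal sketch, with hypothetical names (`DC_BUS_MODES`, `dc_bus_setpoints`) and only the values stated in the text:

```python
# (E1, E2) thresholds as a percentage of the battery state of charge, per mode.
DC_BUS_MODES = {
    "charging": (75, 90),
    "buffer": (10, 90),
    "discharging": (10, 25),
}
BATTERY_POWER_KW = 3.0  # fixed charging/discharging power


def dc_bus_setpoints(mode: str) -> dict:
    """Return the fixed set-points the high-level control passes to the DC-bus."""
    e1, e2 = DC_BUS_MODES[mode]
    return {"E1": e1, "E2": e2, "power_kw": BATTERY_POWER_KW}


print(dc_bus_setpoints("buffer"))
```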

- If there is some energy demand, cooling/heating mode 0 is not an option.
- Otherwise, any cooling/heating mode behaves as mode 0 during the control slot ${T}_{s}$.
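These two rules can be read as a mask on the action set. The sketch below is one possible reading, with hypothetical function names; how the authors enforce the constraint inside their model is not stated here.

```python
def available_modes(demand: float, all_modes) -> list:
    """Mode 0 (no cooling/heating) is excluded whenever there is demand."""
    if demand > 0:
        return [m for m in all_modes if m != 0]
    return list(all_modes)


def effective_mode(demand: float, chosen_mode: int) -> int:
    """Without demand, whatever mode was chosen behaves as mode 0."""
    return chosen_mode if demand > 0 else 0


print(available_modes(1.2, [0, 1, 2, 3, 4]))  # mode 0 removed
print(effective_mode(0.0, 3))                 # collapses to mode 0
```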

- ${N}_{inp,heat}=7$ and ${N}_{inp,cool}=8$ are the numbers of inputs, defined by the dimension of the system state. Their values are standard-normalized according to their corresponding ranges.
- ${N}_{hid,heat}$ and ${N}_{hid,cool}$ are the hidden layer sizes for heating and cooling modes, respectively. They are usually much larger than the numbers of inputs and outputs. The number of hidden layers, their sizes, the type of activation functions and other parameters will be tuned in a future hyper-parameter study, which is outside the scope of this paper. The values ${N}_{hid,heat}=100$ and ${N}_{hid,cool}=1000$ were adopted here, with exponential linear unit (ELU) activation functions and a dropout rate of 0.8.
- ${N}_{out,heat}=3$ and ${N}_{out,cool}=21$ are the numbers of outputs, corresponding to the cardinality of the action sets. The outputs are the softmax of the logits, and the action is sampled from a multinomial distribution over the logarithm of the outputs.
- Learning rate, $\alpha =0.0005$.
- Discount rate, $\gamma =0.99$.
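The parameters above can be put together in a short forward-pass sketch for the cooling-mode network. It uses NumPy rather than the framework the paper cites, assumes a single hidden layer, omits dropout (inference only) and uses random, untrained weights; all function names are hypothetical. The discounted-return helper shows how $\gamma $ enters a policy gradient update.

```python
import numpy as np

N_INP, N_HID, N_OUT = 8, 1000, 21  # cooling-mode sizes from the text
GAMMA = 0.99                       # discount rate from the text


def elu(x):
    """Exponential linear unit: x for x > 0, exp(x) - 1 otherwise."""
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))


def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def init_params(rng):
    """Random weights for an assumed single-hidden-layer network."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(N_INP), (N_INP, N_HID)),
        "b1": np.zeros(N_HID),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(N_HID), (N_HID, N_OUT)),
        "b2": np.zeros(N_OUT),
    }


def policy(params, state):
    """Action probabilities pi(. | state) as the softmax of the logits."""
    hidden = elu(state @ params["W1"] + params["b1"])
    return softmax(hidden @ params["W2"] + params["b2"])


def sample_action(params, state, rng):
    """Draw one action index from the categorical distribution pi(. | state)."""
    return int(rng.choice(N_OUT, p=policy(params, state)))


def discounted_returns(rewards, gamma=GAMMA):
    """G_t = R_{t+1} + gamma * G_{t+1}, computed backwards over an episode."""
    g, out = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    return out[::-1]
```

Sampling from the softmax output, instead of taking the most probable action, keeps the policy stochastic during training, which is what policy gradient methods rely on for exploration.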

#### 2.3.4. Minimum Cost Control Policy

- $E{E}_{t}^{fg}$ is the electrical energy bought from the grid in slot $t$, either to feed the DC-bus or other equipment, such as the electric resistance of the DHW tank.
- $E{E}_{t}^{tg}$ is the electrical energy sold to the grid in slot $t$. A discount factor of 0.5 was considered.
- $T{E}_{t}^{hp}$ is the thermal energy provided by the heat pump subsystem for cooling/heating in slot $t$.
- $T{E}_{t}^{pcm}$ is the thermal energy provided by the PCM tank for cooling/heating in slot $t$.
- $Penalty$ is the cost assumed for a non-covered demand. A value much higher than the energy cost is used.
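The cost equation itself is not reproduced in this text, so the following is only one plausible reading of the terms defined above: energy bought is charged at the slot price, energy sold is credited at half that price, and a penalty is added when the heat pump and PCM tank together do not cover the demand. The function name, the penalty value and the tolerance are hypothetical.

```python
SELL_DISCOUNT = 0.5  # discount factor applied to energy sold (from the text)
PENALTY = 100.0      # hypothetical value, chosen much larger than any energy cost


def slot_cost(c_t, ee_fg, ee_tg, te_hp, te_pcm, te_dem,
              penalty=PENALTY, tol=1e-9):
    """Assumed cost of one control slot, in the currency of c_t.

    c_t: grid price; ee_fg / ee_tg: energy bought from / sold to the grid;
    te_hp / te_pcm: thermal energy supplied; te_dem: thermal energy demand.
    """
    energy_cost = c_t * (ee_fg - SELL_DISCOUNT * ee_tg)
    uncovered = (te_hp + te_pcm) < (te_dem - tol)
    return energy_cost + (penalty if uncovered else 0.0)


# Buying 1 kWh at 0.2 €/kWh with demand fully covered by the heat pump.
print(slot_cost(c_t=0.2, ee_fg=1.0, ee_tg=0.0, te_hp=2.0, te_pcm=0.0, te_dem=2.0))
```

In a reinforcement learning setting the reward would typically be the negative of this cost, so that minimizing cost and maximizing return coincide.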

#### 2.3.5. Rule-Based Control Policies

- Battery mode—charging, buffer or discharging—is determined by two battery state of charge thresholds (${B}_{min}^{th}$ and ${B}_{max}^{th}$) and the grid cost (${C}_{t}$).
- Cooling mode 1 (PCM tank charging) is set if there is no cooling demand. Otherwise, cooling mode 2 (PCM tank discharging) is set if PCM energy (${E}_{PCM,t}$) is larger than a threshold factor ($PC{M}_{f}^{th}$) times the cooling demand ($T{E}_{t}^{dem}$). Otherwise, cooling mode 3 (simultaneous PCM tank charging and cooling supply to the building) or 4 (cooling supply using the standard HP evaporator) is set according to the energy stored in the PCM tank in relation to the PCM energy threshold (${E}_{PCM}^{th}$).
- Sorption chiller mode is set depending on the buffer tank temperature threshold ($B{T}^{th}$) in comparison to the buffer tank temperature at the top region (${T}_{buffer,top,t}$).
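The cooling and sorption rules above can be sketched as two small functions. Several details are assumptions, since the text does not fix them: mode 3 is chosen when the normalized PCM energy is below ${E}_{PCM}^{th}$ and mode 4 otherwise, and the sorption chiller is switched on when the buffer tank top temperature reaches $B{T}^{th}$. The default thresholds are the tuned values reported in Section 3; the battery rule is omitted because its exact logic is not given.

```python
def rbc_cooling_mode(te_dem, e_pcm, e_pcm_norm,
                     pcm_f_th=1.98, e_pcm_th=0.19):
    """Rule-based heat pump cooling mode (1 to 4).

    te_dem: cooling demand; e_pcm: PCM energy in the same units as te_dem;
    e_pcm_norm: PCM energy normalized to its range.
    """
    if te_dem <= 0:
        return 1  # no demand: charge the PCM tank
    if e_pcm > pcm_f_th * te_dem:
        return 2  # enough stored energy: discharge the PCM tank
    # Assumed direction: low stored energy -> charge while cooling (mode 3).
    return 3 if e_pcm_norm < e_pcm_th else 4


def rbc_sorption_on(t_buffer_top, bt_th=76.7):
    """Assumed rule: run the sorption chiller when the buffer top is hot enough."""
    return t_buffer_top >= bt_th


print(rbc_cooling_mode(te_dem=1.0, e_pcm=3.0, e_pcm_norm=0.5))
print(rbc_sorption_on(80.0))
```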

#### 2.3.6. Implementation Aspects

#### 2.4. Network Training

#### 2.4.1. Training and Test Data

- 0.2 €/kWh from 13:00 to 23:00 h.
- 0.1 €/kWh for the rest of the day.
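The two-level tariff above maps directly to a function of the hour of day. In this sketch the peak period is taken as including 13:00 and excluding 23:00, which is an assumption about the boundaries; the function name is hypothetical.

```python
def grid_cost(hour: float) -> float:
    """Electricity price in €/kWh for a given hour of day (0 <= hour < 24)."""
    return 0.2 if 13 <= hour < 23 else 0.1


for h in (0, 12.5, 13, 22.5, 23):
    print(h, grid_cost(h))
```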

#### 2.4.2. Training Times

- Time granularity for model computation: $\Delta t=3$ min for cooling mode and $\Delta t=15$ s for heating mode. With longer time steps in heating mode, the heat delivered while the heat pump is switched on would greatly exceed the heating demand of that time step, because the heat pump has a higher coefficient of performance in heating mode.
- Time slot between control decisions: ${T}_{s}=30$ min.
- Batch size of 6 days. Test set consists of 3 batches (18 days or 864 control slots).
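The slot counts implied by these settings can be checked with a few lines of arithmetic. The variable names are illustrative; only the durations come from the text.

```python
from datetime import timedelta

CONTROL_SLOT = timedelta(minutes=30)  # T_s
DT_COOLING = timedelta(minutes=3)     # model step in cooling mode
DT_HEATING = timedelta(seconds=15)    # model step in heating mode
BATCH_DAYS, TEST_BATCHES = 6, 3

slots_per_day = timedelta(days=1) // CONTROL_SLOT
steps_per_slot_cooling = CONTROL_SLOT // DT_COOLING
steps_per_slot_heating = CONTROL_SLOT // DT_HEATING
slots_per_batch = BATCH_DAYS * slots_per_day
test_slots = TEST_BATCHES * slots_per_batch

print(slots_per_day, steps_per_slot_cooling, steps_per_slot_heating)
print(slots_per_batch, test_slots)
```

Each control decision therefore spans 10 model steps in cooling mode and 120 in heating mode, and the 18-day test set contains 864 control slots, matching the figure quoted above.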

#### 2.5. Robustness Analysis

## 3. Results and Discussion

- Cooling demand (‘Demand’) and global horizontal solar irradiation (‘GHI tilted’) on the tilted plane (PV surface). Green and orange areas show how the cooling demand was met: whether from the heat pump (‘From HP’) or from the PCM tank (‘From PCM’).
- The state of charge of the PCM tank (‘PCM SoC’), heat pump cooling mode (‘Cool. mode’) and mode of operation of the sorption chiller (‘Sorption act.’).
- The values of E1 and E2 thresholds of the DC-bus subsystem as detailed in Section 2.2.8. The state of charge of the battery is also shown (‘Battery SoC’), along with the cost of electricity (‘Grid cost’) as binary (0 corresponds to 0.1 €/kWh and 1 to 0.2 €/kWh).
- Domestic hot water demand (‘Demand DHW’) and top region temperature of the buffer tank (‘Buffer Tank top temp.’). Green and orange areas show how the DHW demand was met: whether from the electric heater (‘From elect’) or from the buffer tank (‘From BT’).
- Cumulative cost associated with the energy delivered to and taken from the power grid during valley (‘Ener. sold 0’ and ‘Ener. bought 0’, respectively) and peak (‘Ener. sold 1’ and ‘Ener. bought 1’, respectively) electricity tariff, along with the total cumulative cost (‘Cost’), defined as $\sum_{i=0}^{t}{R}_{i}$. The total amount of electricity consumption is also plotted (‘Cumm. elec. energ.’).

- The operating cost for the 18 days of the test set is 11.1 €. As seen below, this is far lower than the 23.5 € obtained with the RBC policy tested under the same scenario, indicating that the deep learning control approach is highly efficient.
- Cooling demand is always covered, either from the HP or the PCM tank, in order to avoid penalties.
- Cooling modes 1 (PCM tank charging) and 4 (operation of the HP with the standard evaporator) are never (or rarely) used.
- All energy storage modules (PCM tank, buffer tank and electric battery) are fully exploited by charging and discharging them as much as possible on a daily basis within the allowed thresholds.
- The sorption chiller is also activated on a daily basis to assist the operation of the HP, which is beneficial for the overall system performance.

- Minimum and maximum battery thresholds: ${B}_{min}^{th}=0.01$ and ${B}_{max}^{th}=0.94$, respectively.
- Threshold factor for PCM tank discharging: $PC{M}_{f}^{th}=1.98$.
- Buffer tank temperature threshold: $B{T}^{th}=76.7$ °C.
- Threshold of the (normalized) amount of energy stored in the PCM tank: ${E}_{PCM}^{th}=0.19$.

- The operating cost for the 18 days of the test set is 23.5 €, which is more than double the cost obtained using a DRL policy.
- Cooling demand is always covered, either from the HP or the PCM tank.
- All cooling modes are used by the HP, with no clear predilection for a specific operating mode.
- Sorption chiller activation is much more irregular as compared with the DRL case.
- The full potential of the PCM tank is hardly exploited, while the buffer tank is charged and discharged as much as possible on a daily basis.
- The electric battery is reasonably well exploited, but the main difference with respect to the DRL policy is that it is not discharged when the electricity cost is high and the electricity demand of the system is low.

- The upper plot (first) shows how the heating demand is covered, whether by the heat pump (‘From HP’) or the buffer tank (‘From BT’).
- It can be observed, in the third plot, how the buffer tank temperature in the middle layer drops when heat is provided to the building from the buffer tank.
- The cumulative cost is negative (bottom plot), meaning that an economic benefit is obtained from selling energy to the grid. This is achieved by charging/discharging the battery during the corresponding valley/peak tariff periods, as observed in the second plot.
- The bottom plot shows that the amount of energy sold in valley/peak tariff periods is larger than the energy bought during the same periods. As mentioned previously, an energy retailer may not reward energy reinjection when the amount of sold energy surpasses the bought energy. If this is the case, the cumulative cost will be zero instead of negative.
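The cumulative cost and the zero floor mentioned in the last point can be sketched as follows. The per-slot values in the example are made up; `billed_cost` and its flag `reward_net_export` are hypothetical names for the case where the retailer does not pay for net reinjection.

```python
from itertools import accumulate


def cumulative_cost(slot_costs):
    """Running sum of the per-slot costs R_i, one value per control slot."""
    return list(accumulate(slot_costs))


def billed_cost(total, reward_net_export=True):
    """Final cost, floored at zero if net reinjection is not rewarded."""
    return total if reward_net_export else max(0.0, total)


costs = cumulative_cost([0.5, -1.0, 0.25])  # illustrative slot costs in €
print(costs)
print(billed_cost(costs[-1], reward_net_export=False))
```

Under this reading, the heating-mode result of −2.4 € for the DRL policy would be billed as 0 € by a retailer that does not reward net reinjection.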

## 4. Conclusions and Future Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A

## References

1. Afram, A.; Janabi-Sharifi, F. Theory and applications of HVAC control systems–A review of model predictive control (MPC). Build. Environ. **2014**, 72, 343–355.
2. Thieblemont, H.; Haghighat, F.; Ooka, R.; Moreau, A. Predictive control strategies based on weather forecast in buildings with energy storage system: A review of the state-of-the art. Energy Build. **2017**, 153, 485–500.
3. Cupelli, L.; Schumacher, M.; Monti, A.; Mueller, D.; De Tommasi, L.; Kouramas, K. Simulation Tools and Optimization Algorithms for Efficient Energy Management in Neighborhoods. In Energy Positive Neighborhoods and Smart Energy Districts; Elsevier BV: Amsterdam, The Netherlands, 2017; pp. 57–100.
4. Boudon, M.; L’Helguen, E.; De Tommasi, L.; Bynum, J.; Kouramas, K.; Ridouane, E.H. Real Life Experience—Demonstration Sites. In Energy Positive Neighborhoods and Smart Energy Districts; Monti, A., Pesch, D., Ellis, K.A., Mancarella, P., Eds.; Elsevier BV: Amsterdam, The Netherlands, 2017; pp. 227–250.
5. Tarragona, J.; Fernández, C.; de Gracia, A. Model predictive control applied to a heating system with PV panels and thermal energy storage. Energy **2020**, 197, 117229.
6. Gholamibozanjani, G.; Tarragona, J.; De Gracia, A.; Fernández, C.; Cabeza, L.F.; Farid, M.M. Model predictive control strategy applied to different types of building for space heating. Appl. Energy **2018**, 231, 959–971.
7. Achterberg, T. SCIP: Solving constraint integer programs. Math. Program. Comput. **2009**, 1, 1–41.
8. Vigerske, S.; Gleixner, A. SCIP: Global optimization of mixed-integer nonlinear programs in a branch-and-cut framework. Optim. Methods Softw. **2018**, 33, 563–593.
9. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; A Bradford Book; MIT Press: Cambridge, MA, USA, 2018; 427p.
10. Watkins, C.J.C.H.; Dayan, P. Technical Note: Q-Learning. Mach. Learn. **1992**, 8, 279–292.
11. Liu, S.; Henze, G.P. Experimental analysis of simulated reinforcement learning control for active and passive building thermal storage inventory: Part 1. Theoretical foundation. Energy Build. **2006**, 38, 142–147.
12. Liu, S.; Henze, G.P. Experimental analysis of simulated reinforcement learning control for active and passive building thermal storage inventory: Part 2: Results and analysis. Energy Build. **2006**, 38, 148–161.
13. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with Deep Reinforcement Learning. arXiv **2013**, arXiv:1312.5602. Available online: https://arxiv.org/abs/1312.5602 (accessed on 30 April 2021).
14. Wei, T.; Wang, Y.; Zhu, Q. Deep Reinforcement Learning for Building HVAC Control. In Proceedings of the 54th Annual Design Automation Conference, Austin, TX, USA, 18–22 June 2017.
15. Mason, K.; Grijalva, S. A review of reinforcement learning for autonomous building energy management. Comput. Electr. Eng. **2019**, 78, 300–312.
16. Yu, L.; Qin, S.; Zhang, M.; Shen, C.; Jiang, T.; Guan, X. Deep Reinforcement Learning for Smart Building Energy Management: A Survey. arXiv **2020**, arXiv:2008.05074. Available online: https://arxiv.org/abs/2008.05074 (accessed on 30 April 2021).
17. Wang, Z.; Hong, T. Reinforcement learning for building controls: The opportunities and challenges. Appl. Energy **2020**, 269, 115036.
18. Cheng, C.-C.; Lee, D. Artificial Intelligence-Assisted Heating Ventilation and Air Conditioning Control and the Unmet Demand for Sensors: Part 1. Problem Formulation and the Hypothesis. Sensors **2019**, 19, 1131.
19. Liu, S.; Henze, G.P. Evaluation of Reinforcement Learning for Optimal Control of Building Active and Passive Thermal Storage Inventory. J. Sol. Energy Eng. **2006**, 129, 215–225.
20. De Gracia, A.; Fernández, C.; Castell, A.; Mateu, C.; Cabeza, L.F. Control of a PCM ventilated facade using reinforcement learning techniques. Energy Build. **2015**, 106, 234–242.
21. De Gracia, A.; Barzin, R.; Fernández, C.; Farid, M.M.; Cabeza, L.F. Control strategies comparison of a ventilated facade with PCM – energy savings, cost reduction and CO2 mitigation. Energy Build. **2016**, 130, 821–828.
22. HYBUILD. Available online: http://www.hybuild.eu/ (accessed on 4 December 2020).
23. Macciò, C.; Porta, M.; Dipasquale, C.; Trentin, F.; Mandilaras, Y.; Varvagiannis, S. Deliverable D1.1-Requirements: Context of Application, Building Classification and Dynamic Uses Consideration. 2018. Available online: http://www.hybuild.eu/2018/12/20/requirements-context-of-application-building-classification-and-dynamic-uses-consideration-deliverable-released/ (accessed on 30 April 2021).
24. Weather Data by Location. All Regions—Europe WMO Region 6—Greece. Available online: https://energyplus.net/weather-location/europe_wmo_region_6/GRC//GRC_Athens.167160_IWEC (accessed on 4 December 2020).
25. Solar PV Panel Module Aleo S79 Characteristics. Bosch Solar Services. Available online: https://bit.ly/2VQ91l1 (accessed on 16 September 2019).
26. Zebner, H.; Zambelli, P.; Taylor, S.; Obinna Nwaogaidu, S.; Michelsen, T.; Little, J. Pysolar. Available online: https://github.com/pingswept/pysolar (accessed on 15 December 2020).
27. Reindl, D.; Beckman, W.; Duffie, J. Diffuse fraction correlations. Sol. Energy **1990**, 45, 1–7.
28. Reindl, D.; Beckman, W.; Duffie, J. Evaluation of hourly tilted surface radiation models. Sol. Energy **1990**, 45, 9–17.
29. Loutzenhiser, P.; Manz, H.; Felsmann, C.; Strachan, P.; Frank, T.; Maxwell, G. Empirical validation of models to compute solar irradiance on inclined surfaces for building energy simulation. Sol. Energy **2007**, 81, 254–267.
30. Varvagiannis, E.; Charalampidis, A.; Zsembinszki, G.; Karellas, S.; Cabeza, L.F. Energy assessment based on semi-dynamic modelling of a photovoltaic driven vapour compression chiller using phase change materials for cold energy storage. Renew. Energy **2021**, 163, 198–212.
31. Palomba, V.; Vasta, S.; Freni, A.; Pan, Q.; Wang, R.; Zhai, X. Increasing the share of renewables through adsorption solar cooling: A validated case study. Renew. Energy **2017**, 110, 126–140.
32. Palomba, V.; Dino, G.E.; Frazzica, A. Coupling sorption and compression chillers in hybrid cascade layout for efficient exploitation of renewables: Sizing, design and optimization. Renew. Energy **2020**, 154, 11–28.
33. Chandra, Y.P.; Matuska, T. Stratification analysis of domestic hot water storage tanks: A comprehensive review. Energy Build. **2019**, 187, 110–131.
34. Duffie, J.A.; Beckman, W.A. Solar Energy Thermal Processes; John Wiley & Sons Inc.: Hoboken, NJ, USA, 1974; ISBN 9780471223719.
35. Bellman, R. A Markovian Decision Process. J. Math. Mech. **1957**, 6, 679–684.
36. Bellman, R. Dynamic Programming; Princeton University Press: Princeton, NJ, USA, 2010; 392p.
37. Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Driessche, G.V.D.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature **2016**, 529, 484–489.
38. Abadi, M.; Barham, P.B.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
39. Sutton, R.S.; Mcallester, D.; Singh, S.; Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 27–30 November 2000; pp. 1057–1063.
40. Williams, R.J. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. Mach. Learn. **1992**, 8, 229–256.
41. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference for Learning Representations, San Diego, CA, USA, 7–9 May 2015; Bengio, Y., LeCun, Y., Eds.; Scientific Research Publisher: Wuhan, China, 2015.
42. Bergstra, J.; Yamins, D.; Cox, D.D. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, (PART 1), Atlanta, GA, USA, 16–21 June 2013; Dasgupta, S., McAllester, D., Eds.; PMLR: New York, NY, USA, 2013; pp. 115–123.
43. Van Rossum, G.; Drake Jr., F.L. Python Tutorial; 12th Media Services: Suwanee, GA, USA, 1995; pp. 1–156.

| Mode | Description | Active Pumps | Dry Cooler | Fan-Coils |
|---|---|---|---|---|
| Cooling 1 | PCM tank is charged by the heat pump, no cooling is provided to the building | P5–P8 (if sorption is on); P7 and P8 (if sorption is off) | On | Off |
| Cooling 2 | PCM tank is discharging to provide cooling to the building | P9 and P10 | Off | On |
| Cooling 3 | Cooling is provided by the HP through the PCM tank | P5–P10 (if sorption is on); P7–P10 (if sorption is off) | On | On |
| Cooling 4 | Cooling is provided by the HP through the standard evaporator | P5–P10 (if sorption is on); P7–P10 (if sorption is off) | On | On |
| Heating | Heating is provided by the HP | P7 and P10 | On | On |
| 0 | No cooling or heating is provided | None | Off | Off |

| Mode | Name | ${E}_{1}$ (%) | ${E}_{2}$ (%) |
|---|---|---|---|
| 1 | Charging | 75 | 90 |
| 2 | Discharging | 10 | 25 |
| 3 | Buffer | 10 | 90 |

| Variable | Symbol | Reference Value | Error Range | Units |
|---|---|---|---|---|
| Optical efficiency Fresnel | ${\eta}_{opt}$ | Data from [22] | Ref·(1 ± 0.2·n) | - |
| PV efficiency | ${\eta}_{PV}$ | 0.16 | Ref·(1 ± 0.25·n) | - |
| Maximum battery charging or discharging power | MaxB | 3.0 | Ref·(1 ± 0.2·n) | kW |
| Battery charging efficiency | ${\eta}_{B}$ | 0.9 | Ref·(1 ± 0.11·n) | - |
| Sorption thermal efficiency | $CO{P}_{th}$ | 0.55 | Ref·(1 ± 0.09·n) | - |
| Dry cooler electricity consumption | ${\dot{W}}_{dc}$ | Equation (10) | Ref·(1 ± 0.2·n) | kW |
| Heat pump cooling power | ${\dot{Q}}_{evap}$ | Data from [30] | Ref·(1 ± 0.2·n) | kW |
| Heat produced by the compressor | ${\dot{Q}}_{comp}$ | Data from [30] | Ref·(1 ± 0.2·n) | kW |
| Buffer tank thermal resistance | ${R}_{buffer}$ | 430.3 | Ref·(1 ± 0.13·n) | K/kW |
| RPW-HEX thermal resistance | ${R}_{PCM}$ | 424.5 | Ref·(1 ± 0.18·n) | K/kW |
| DHW tank thermal resistance | ${R}_{DHW}$ | 830.8 | Ref·(1 ± 0.19·n) | K/kW |

| Operating Mode | DRL Policy Cost (€) | RBC Policy Cost (€) |
|---|---|---|
| Cooling | 11.1 | 23.5 |
| Heating | −2.4 | −0.1 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zsembinszki, G.; Fernández, C.; Vérez, D.; Cabeza, L.F.
Deep Learning Optimal Control for a Complex Hybrid Energy Storage System. *Buildings* **2021**, *11*, 194.
https://doi.org/10.3390/buildings11050194
