Artificial Neural Network Simulation of Energetic Performance for Sorption Thermal Energy Storage Reactors

Sorption thermal heat storage is a promising solution to improve the development of renewable energies and to promote a rational use of energy, both for industry and households. These systems store thermal energy through physico-chemical sorption/desorption reactions, also termed hydration/dehydration. Their introduction to the market requires assessing their energy performance, usually analysed by numerical simulation of the overall system. To address this, physical models are commonly developed and used. However, simulations based on such models are time-consuming, which precludes their use for yearly simulations. Artificial neural network (ANN)-based models, which are known for their computational efficiency, may overcome this issue. Therefore, the main objective of this study is to investigate the use of an ANN model, instead of a physical model, to simulate a sorption heat storage system. The neural network is trained on experimental results in order to evaluate this approach on actual systems. Using a recurrent neural network (RNN) and the Deep Learning Toolbox in MATLAB, good accuracy is reached, and the predicted results are close to the experimental results. The root mean squared error for the prediction of the temperature difference during the thermal energy storage process is less than 3 K for both hydration and dehydration, the maximal temperature differences being about 90 K and 40 K, respectively.


Introduction
Renewable energy deployment is often applied in conjunction with Thermal Energy Storage (TES) to balance energy production and demand, e.g., storing summer heat for winter heating. Chemical and sorption TES have been identified as promising technologies to solve the seasonal mismatch of solar energy storage [1], offering high energy densities of around 600 kW·h·m⁻³ and 200 kW·h·m⁻³, respectively [2]. Despite their high energy densities, chemical and sorption TES suffer from their low technology readiness level (typically two to three), which justifies the intensive research on this topic over the last ten years [1][2][3][4][5][6][7][8]. The main scientific bottlenecks are improving the heat and mass transfer in the TES during hydration and better integrating the TES within the system to increase the overall efficiency. Recently, the analysis of the energy chain of a zeolite TES has shown that approximately 70% of the absorbed energy is converted into useful heat released on discharge. However, approximately half of this heat is then directly lost at the outlet of the adsorbent bed, giving an overall system efficiency of 36% [9]. Numerical simulation is usually used to address such problems, with tools such as TRNSYS or EnergyPlus that require robust and time-efficient TES numerical models compatible with yearly simulation.
Detailed physical models of TES have been developed and reported in the literature [9,10]. Usually based on the analysis of heat and mass transfer in the reactor, these models are complex and require large computational effort, making them difficult to use for yearly simulations. One possible strategy to overcome this difficulty is the use of regression methods based on machine learning, which are considerably less demanding in terms of computation. Support-vector regression (SVR) is one classic option [11] for reproducing complex time series, but with the advances in machine learning, SVR approaches are nowadays generally outperformed by artificial neural network (ANN)-based techniques. Basically, an ANN is composed of several layers: an input layer receiving the information given to the network, hidden layers capturing the interactions among different elements, and an output layer producing the desired results [12].
Recently, Scapino et al. [5] developed a potassium carbonate (K2CO3) TES model based on an artificial neural network (ANN). The network was trained with a set of simulated data produced by a physics-based model. It showed a significant reduction in simulation time and acceptable discrepancies with respect to the physics-based model. Considering these promising results, the objective of our research work is to develop a neural network that assesses the performance of a zeolite TES. In our approach, however, experimental data instead of simulation data are used for the training and the validation of the neural network. We chose to use a recurrent neural network (RNN), as this category of ANN was designed for sequence modelling and therefore naturally applies to time series forecasting [13]. The main reason for the effectiveness of RNNs is their ability to capture past information from the data and use it for upcoming sequence steps.
The studied sorption thermal energy storage system is presented in Figure 1. The main part of the system consists of two vertical packed-bed reactors (top and bottom, Figure 1b,d). Each cylindrical reactor (72 cm in diameter) can be loaded with 40 kg of zeolite material (Alfa Aesar, 13X, beads of 1.6 mm to 2.5 mm, see Figure 1a). An air treatment system (Figure 1c) drives the sorbate flow at various temperature and humidity levels into the reactors according to the different dehydration and hydration tests. At the inlet and outlet of the reactors, temperature sensors, hygrometers and flow meters are installed to measure the properties of the airflow. More details about the reactor system are available in [2]. The performance of the system can be explored by varying the charging temperature, airflow rate, relative humidity in desorption mode, bed thickness, and serial or parallel reactor test configurations [2]. Since we aimed to approach daily use cases, several sets of results were chosen (see Section 2.1) for ANN training and validation.

Recurrent Neural Network
Recurrent neural networks (RNNs) form a category of ANNs that may be seen as general sequence processors. Contrary to more widespread ANNs such as multilayer perceptrons, which propagate data forwards only, RNNs also feed information back through recurrent connections. This feature allows RNNs to process sequences of data and, notably, to handle time-dependent situations. In this study, we chose to use long short-term memory (LSTM) recurrent neural networks [13]. LSTM is able to overcome the vanishing and exploding gradient problems [15] often encountered in recurrent networks. Basically, this is achieved via special gates that determine which past information should be passed on and which should be forgotten.
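As a minimal sketch of the gating mechanism described above (illustrative NumPy code, not the MATLAB Deep Learning Toolbox implementation actually used in this work; the function name and weight layout are our own), a single LSTM time step can be written as:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps the concatenation [h_prev; x] to the
    four stacked gate pre-activations (forget, input, candidate, output)."""
    z = W @ np.concatenate([h_prev, x]) + b
    H = h_prev.size
    f = sigmoid(z[0:H])        # forget gate: which parts of the cell state to keep
    i = sigmoid(z[H:2*H])      # input gate: which new information to write
    g = np.tanh(z[2*H:3*H])    # candidate cell update
    o = sigmoid(z[3*H:4*H])    # output gate: which parts of the state to expose
    c = f * c_prev + i * g     # new cell state: the "long-term" memory
    h = o * np.tanh(c)         # new hidden state passed to the next step
    return h, c
```

The cell state c is what lets the network carry information across many time steps without the gradient vanishing, since it is updated additively rather than through repeated matrix multiplications.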
In this work, the data were organised to fit the created RNN model. The relevant parameters (temperature, air flow, air humidity, temperature difference) were selected based on a study of the material properties. Firstly, the same experimental data covering both cycles (hydration/dehydration) were used both to train and to validate the RNN model, in order to verify the validity of this approach. Then, the data were separated into distinct training and validation sets and used with a second RNN in order to provide a reliable evaluation of the accuracy of the model.
Training a neural network requires both a learning dataset and a validation dataset, with inputs and corresponding outputs. A large amount of learning data is recommended to obtain a well-performing neural network; the amount of validation data is usually significantly smaller. The root mean square error (RMSE, Equation (1)) was used to evaluate the accuracy of the model during the validation phase:

RMSE = sqrt( (1/n) · Σᵢ₌₁ⁿ (ŷᵢ − yᵢ)² )    (1)

where ŷ₁, ŷ₂, . . . , ŷₙ are the predicted values, y₁, y₂, . . . , yₙ are the observed values, and n is the number of observations. The RMSE is the square root of the mean squared residual, an absolute measure of fit: the lower its value, the more accurate the model. However, since the RMSE depends on the scale of the data, there is no normalised threshold value for assessing the reliability of the model.
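As a direct illustration (a sketch, not part of the original MATLAB code), Equation (1) can be implemented in a few lines with NumPy:

```python
import numpy as np

def rmse(y_pred, y_obs):
    """Root mean square error between predicted and observed values (Equation (1))."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_obs) ** 2)))
```

For instance, a perfect prediction gives an RMSE of 0, while its value otherwise carries the same units as the predicted quantity (here, kelvin).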
In order to construct a model that is as accurate as possible, the different parameters of an RNN (e.g., the number of hidden layers, the number of epochs and the batch size) are generally tested to find the best combination. The number of epochs corresponds to the number of times the whole training set is processed. As the number of epochs increases, the output of the neural network goes from underfitting through an optimum and finally to overfitting. The training data are divided into batches; the batch size corresponds to the number of training samples used in one pass. Therefore, the smaller this number is, the longer the training of the neural network will take, since a smaller amount of data is used for each propagation. The number of hidden layers corresponds to the size of the network and the number of neurons used for learning. The larger this number is, the larger the neural network will be and the longer it will take to train.
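The interplay between batch size and epochs can be made concrete: each epoch processes every batch once, so the total number of weight updates grows with the number of epochs and shrinks with the batch size. A small illustrative helper (our own, not from the original code):

```python
import math

def training_iterations(n_samples, batch_size, n_epochs):
    """Total number of weight updates: one update per batch, all batches per epoch."""
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * n_epochs
```

With the training set used in this study (15,700 measurements), a batch size of 200 and 800 epochs correspond to ceil(15700/200) × 800 = 63,200 updates; halving the batch size doubles that count, which is why smaller batches lengthen training.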

Data Processing
The data sets originating from [2] must be processed before training the neural network. The parameters selected from the data as inputs are: the relative humidity measured at the reactor inlet (RH_in), the temperature of the air flow at the reactor inlet (T_in), and the air flow rate (Q_v) inside the reactor. The parameter selected as the output is the temperature difference (∆T) between the inlet and the outlet of the reactor. These parameters are used for both hydration and dehydration in order to construct a neural network that can predict both phases. The MATLAB code used for training and validation is given in Appendix A.
Six experiments of hydration and three experiments of dehydration are used for the training, corresponding to a total of 15,700 measurements. For the validation of the model, three experiments of hydration and one of dehydration (5500 measurements in total) are used.
All experiments used during the training process have to be in the same database; yet, for proper training of the neural network, the different experiments must be kept separate. Therefore, during data processing, all experiments are numbered: all the data from the first experiment are labelled 1, and so on. Thanks to a data processing code (see Appendix B), the neural network can treat each experiment as a separate cycle. The experiments were arranged in random order.
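The grouping step just described can be sketched as follows (a hypothetical Python equivalent of the Appendix B processing code; the row layout `(exp_id, RH_in, T_in, Qv, dT)` is our own assumption):

```python
import random

def build_sequences(rows, seed=0):
    """Group measurement rows into per-experiment sequences and shuffle
    the experiment order. Each row is (exp_id, RH_in, T_in, Qv, dT):
    the experiment label, the three inputs, and the output."""
    sequences = {}
    for exp_id, rh_in, t_in, qv, dt in rows:
        seq = sequences.setdefault(exp_id, {"inputs": [], "targets": []})
        seq["inputs"].append((rh_in, t_in, qv))
        seq["targets"].append(dt)
    order = sorted(sequences)
    random.Random(seed).shuffle(order)  # present the experiments in random order
    return [sequences[k] for k in order]
```

Keeping each experiment as one sequence prevents the recurrent network from carrying hidden state across the boundary between two unrelated tests.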

Determination of the Optimal RNN Parameters for Training
A parametric study on three RNN parameters (mini-batch size, maximum number of epochs, number of hidden layers) was performed by analysing the RMSE of the outlet temperature; the lower the RMSE, the better the result. One parameter is modified at a time while the other two are held at a constant value of 200. Figure 2 summarises the influence of each parameter and gives the RMSE as a function of time. Figure 2a shows the influence of the batch size: a small mini-batch size gives better results than a larger one. Indeed, fewer data go through each pass, and the RNN therefore updates itself more effectively. Figure 2b shows the influence of the maximum number of epochs. The corresponding curve decreases erratically, showing that there is an optimum number of epochs (800 or 1300) that is not necessarily the largest one. Moreover, the curve has a repetitive shape, suggesting some determinism in the number of epochs for which training is efficient. Finally, Figure 2c shows an optimum number of hidden layers (400) that is not the highest one. It also suggests that an even number of hidden layers works better than an odd number.
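The one-at-a-time protocol used for this study can be sketched generically (illustrative code; `train_and_rmse` is a placeholder for the actual MATLAB training and validation run, not a function from the original work):

```python
def one_at_a_time_sweep(train_and_rmse, grid, baseline=200):
    """Vary one RNN hyperparameter over its grid while the other two are
    held at the baseline value (200, as in the study). Returns the RMSE
    obtained for each (parameter, value) combination."""
    defaults = {"batch": baseline, "epochs": baseline, "hidden": baseline}
    results = {}
    for name, values in grid.items():
        for v in values:
            params = dict(defaults, **{name: v})
            results[(name, v)] = train_and_rmse(
                params["batch"], params["epochs"], params["hidden"])
    return results
```

This design keeps the number of training runs linear in the number of tested values, at the cost of ignoring interactions between parameters, which a full grid search would capture.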
From Figure 2, an optimal combination can be extracted: a mini-batch size between 100 and 300, 400 hidden layers, and 800 or 1300 epochs. These values gave the lowest RMSE in each simulation. Note that the calculation time increases considerably when the number of epochs or the number of hidden layers increases. For our data set, the calculation time remained below 3 h.

Validation of the Model with the Same Training and Verification Dataset
According to the previous section, the parameters of the RNN were set to (200, 800, 400). First, the model was trained with the data sets referenced as training in Tables 1 and 2, nine training experiments in total. Then, with the same data sets, the results were compared in order to determine whether the model is able to reproduce the results of the training data. The RMSEs for the nine trainings are shown in Table 3. They range from 0.9 to 2.31, which is relatively low, and the mean value over these nine simulations is 1.60. This result indicates that the network correctly predicted the outcome of the data set that was used to create it.

Table 1. Hydration data [2,14].

Table 2. Dehydration data [2,14].

Figures 3 and 4 compare the temperature difference as a function of the volume of data (the time step is 30 s for all the experiments) for both hydration and dehydration. As seen in Figures 3 and 4, the prediction curve correctly follows the shape of the experimental curve but is slightly offset. The average RMSE is around 1.60, which validates the above method using RNNs. Given this precision, the model can still be considered accurate.

Validation of the Model with Specific Dataset
Many simulations computing different sets of parameters were run and can be found in Appendix C. The (200, 1300, 400) set of parameters gives the most accurate results for our dataset (Figure 5). The RMSE values for this dataset are shown in Table 4; the average RMSE of the simulation is 2.37. The predictions given by the RNN are accurate but not perfect. For example, in diagram (d), the prediction curve follows the experimental curve well but is offset during all the drops in temperature. The other prediction curves, (a), (b) and (c), are more acceptable. Comparing Figure 5 to Figure 3, it can be observed that the prediction is less satisfactory when the neural network has never encountered the data. The RMSE is not the same for all the hydration simulations. This result could perhaps still be improved using better-suited parameters.

Conclusions
An RNN model suited to zeolite-based thermal energy storage was constructed. For the quoted experimental data, the RNN model gave accurate predictions of the temperature of the air flow leaving the reactor over complete cycles of both dehydration and hydration of zeolite. The calculation time of the RNN model is lower than that of physics-based models (less than 2 to 3 h to compute more than 15,700 estimations). Moreover, once the RNN model is trained, the calculation time to predict any given data set is almost instantaneous (less than a minute for more than 7500 estimations).
It should, however, be mentioned that a gap remains between the experimental results and the values predicted by the RNN. The accuracy of the predictions could still be improved, especially when significant changes in values occur over a short period of time. A possible cure for this issue could be to use more appropriate metrics such as dynamic time warping (DTW) [16], which is less sensitive to time shifts although more computationally demanding.
The results from the RNN are nevertheless promising. It should be noted that the predictions for the dehydration cycles are often more accurate than for the hydration cycles, which may be due to the fact that the variation of the output parameter is twice as large in the dehydration cases as in the hydration cases. However, fewer experimental data are available for the dehydration cycles. Furthermore, the experiments covered only full hydration and dehydration cycles. It would be more realistic to model the reactors' behaviour using experiments combining both hydration and dehydration within one cycle, to see whether an RNN is able to model such cases accurately.
Alternative machine learning based approaches for time series prediction could also be considered in future work. We plan to experiment with the gated recurrent unit (GRU) approach [17], which can be seen as a simplified version of LSTM and has been shown to often perform better than LSTM on small datasets such as ours. We may also evaluate the use of convolutional neural network (CNN)-based models. CNNs were originally devised for pattern recognition but have been successfully applied to time series forecasting in recent years [18]. They are known to be resistant to noise, which might be beneficial when working with experimental datasets.