Forecasting of Heat Production in Combined Heat and Power Plants Using Generalized Additive Models

: The paper presents a developed methodology of short-term forecasting for heat production in combined heat and power (CHP) plants using a big data-driven model. An accurate prediction of an hourly heat load in the day-ahead horizon allows a better planning and optimization of energy and heat production by cogeneration units. The method of training and testing the predictive model with the use of generalized additive model (GAM) was developed and presented. The weather data as an input variables of the model were discussed to show the impact of weather conditions on the quality of predicted heat load. The new approach focuses on an application of the moving window with the learning data set from the last several days in order to adaptively train the model. The inﬂuence of the training window size on the accuracy of forecasts was evaluated. Different versions of the model, depending on the set of input variables and GAM parameters were compared. The results presented in the paper were obtained using a data coming from the real district heating system of a European city. The accuracy of the methods during the different periods of heating season was performed by comparing heat demand forecasts with actual values, coming from a measuring system located in the case study CHP plant. As a result, a model with an averaged percentage error for the analyzed period (November–March) of less than 7% was obtained.


Introduction
District heating systems (DHS) are common forms of heat distribution in large urban areas. The largest systems consist of a district heating network (DHN) supplied by a combined heat and power (CHP) plant. In recent years, combined-cycle gas turbine plants, as well as gas turbines with heat recovery units, have become popular due to their high flexibility, short start-up time, and lower environmental impact compared to coal plants [1]. In CHP plants, the basic product is useful heat, whose demand in a DHS must be covered at any time. This means that actual electricity production depends on the current heat demand. On the other hand, dynamic changes in electricity prices are observed. On the European energy markets, the price changes every hour, with maximum daily variations of 50% and more. An accurate prediction of an hourly heat load in the day-ahead horizon allows better planning and optimization of heat and electricity production by cogeneration units.
New solutions to optimize electricity generation and keep the produced heat load on the required level taking into account economic aspects are under constant development. Fang et al. [2] proposed an optimization model, where the heat demand and electricity price forecasts are used as an input to obtain a heat storage operation plan. Wand et al. [3] studied the flexibility of two different CHP units considering the day-ahead market and real-time wind power balancing. Nowadays, the progressive development of digitization and the use of advanced data analysis methods is a trend in the so-called 4th generation district heating [4,5]. The main element is short-term (up to several days) planning of energy production, based on the expected heat load profile [6]. An important issue is that estimated production on an hourly basis must be contracted in a day-ahead electricity market. Similarly fuel such as natural gas can also be purchased on the spot market. Moreover, inaccurate estimates of expected volumes may result in the need to switch on the peak units (e.g., heat only boilers), whose energy and environmental efficiency is lower.
Prediction of an hourly heat demand on a large urban scale is a complex problem. The heat for heating of buildings depends mainly on the weather data, while domestic hot water consumption is strongly related to consumers' behavior over the day and week. An important aspect are transient DHN effects such as heat losses, thermal inertia of buildings, etc. [7]. Forecasting methods are based on a data-driven approach. It means that heat demand and its relation to predictor variables is found in historical data. Spoladore et al. [8] analyzed data of heat demand for town-level aggregation and developed a model of hourly gas consumption for heating purposes. Nigitz et al. [9] proposed a model, where changes in consumer behavior are covered by continuous adaptation by using historical data for the ambient temperature and the heat load. Mosavi et al. [10] and Bourdeau et al. [11] gave an overview of data-driven methods that can be applied to heat load forecasting. There are models on a scale of individual buildings as well as the entire district heating network. Generally, predictive models are based on the supervised learning technique and supplied with weather data such as temperature, wind speed, cloud cover, among others. Dotzdauer [12] developed a simple model based on stepwise regression of ambient temperature. Baltputnis et al. [13] used a multiple linear regression of meteorological parameters. Autoregressive integrated moving average ARIMA method of heat load and ambient temperature time series were used by Grosswindhager et al. [14] and Fang et al. [15]. Artificial Neural Networks ANN have been shown by Wojdyga [16] to be effective approach to analyze data from previous heating seasons. In recent years, there has been a clear increase in interest in the use of machine learning methods, such as support vector machines (SVM) [17], random forest [18], deep learning [19] and gradient boosting [20]. This results from the intensive development of IT tools used in large industrial installations, database capacity, and the availability of data acquisition systems. An important issue in the data-driven models is the selection of input variables and the way of training the model. Machine learning techniques enable analyzing many variables besides weather data, such as operating data from DHN or calendar data [21,22]. This paper focuses on the use of the generalized additive model (GAM) method to develop the heat demand model in a medium-sized heating system supplied from a CHP plant. In the GAM method, the forecast variable is estimated by smoothing the input variables with functional relationships [23]. It is useful extension of the generalized linear model (GLM), able to effectively map the seasonality and non-linearity which is normally presented in the heat load data. In the literature, there are many applications of the GAM method to forecast electricity load [24]. Kim et al. [25] decomposed the load into the components on different temporal scales, related to the annual, weekly and daily cycles. The non-linear impact of ambient air temperature on the load level was incorporated with the cubic spline by Sigauke [26]. Pathak et al. [27] used GAM to forecast gas usage for buildings taking into account weather data supplemented by features such as hour of the day, day of the week, month, etc. Khamma et al. [28] interpreted the relationships between predictors using GAM and evaluate their impacts on energy consumption in office buildings.
In this paper, the GAM method was applied to build an hourly heat demand model based on the weather data as ambient temperature, solar irradiation and wind speed. The presented method was extended by the use of variable representing the hour of the day. Application of day hour variable can improve accuracy of predicted heat load, taking into account DHN behavior changes during normal and weekend days. Additionally, the optimal size of the sliding window with data selected to the calibration layer was analyzed as a function days number. Particular attention was paid to parametrization and calibration of the model in order to obtain high accuracy in the day-ahead time horizon.

Case Study District Heating System
The case study DHS consists of a heating network with a total length of about 260 km. The network is supplied from a CHP plant with two identical gas turbines (total 106 MWe) equipped with heat recovery units (total 238 MWt). The flexibility of heat production in the summer season is provided by a 600 MWh heat accumulator. Four heat-only-boilers are used as the peak source. The share of individual units in total heat capacity of the plant is presented in Figure 1. The presented state refers to the period when the maximum value of produced heat in the CHP is delivered to the DHN. During the whole year, the diverse scenarios of heat production are realized and values presented in Figure 1 can rich different values. The operation strategy is strongly dependent on current heat demand as well as the overhaul time of the selected unit. The dependence between the instantaneous heat load and the ambient air temperature during the entire heating season is depicted in Figure 2. The maximum heat load of the analyzed DHN does not exceed 250 MWt, when the minimal value of heat delivered to the final consumers during summertime is always above 20 MWt. Obviously, the heat load increases as the temperature drops. A strong correlation can be seen, however there is a relatively large spread for the same temperature level. Produced heat can vary in the range of around 50 MW. It results from the influence of other factors such as a month, day of the week, hour of the day and other meteorological parameters. Therefore, it is required to include additional input data in order to increase accuracy of the heat forecasts.

Physical Model of the District Heating System
The heat is supplied to the district heating network, where it is distributed to the endusers through heat distribution centers. Additional installations in the system are heating substations that connect individual sections of the network and ensure heat distribution to specific areas of the city. The instantaneous heat output transferred to the heating network by the CHP plant depends on the mass flow rate of network water and the temperature difference between supply and return side, according to Equation (1): where . Q s -thermal power, kW; c p -water specific heat, kJ/(kg·K); . m-mass flow rate, kg/s; T s -supply temperature, K; T R -return temperature, K and t-time, s The regulation of thermal power consists of changing both the supply temperature and the mass flow rate at the output [29]. The water temperature at the outlet from the CHP plant is in accordance with the regulatory table according to Equation (2). Supply temperature depends on the ambient temperature and a coefficient depending on the wind speed and solar radiation. For example, if the weather is windy and cloudy, the coefficient c takes values above 1, increasing the supply temperature. The maximum supply temperature in the analyzed system is 115 • C, at −20 • C ambient air temperature. During summertime, the supply side is maintained at 70 • C: where: T s -supply water temperature, K; c-coefficient; W s -wind speed, m/s; I r -solar radiation, W/m 2 ; T a -ambient air temperature, K and f -individual system function based on the DHN regulatory table, K.

Operation of the District Heating System
During operation of the system, heat production in the source is continuously adapted to the real demand in the DHN. Due to the thermal behavior of buildings as well as transient operation of DHN there is a time delay between the current parameters at the supply and return side. The time course of parameters for selected periods during the heating season and the summer period is presented in Figures 3 and 4, respectively. There is a strong correlation of the ambient air temperature with the generated thermal power, as well as the mass flow rate of water can be observed. In the summertime, dynamic fluctuations in the water mass flow rate and heat load are visible. Outside the heating season, heat demand results from the domestic hot water consumption. Peaks associated with increased hot water consumption in the evening and morning can be observed.

Heat Demand Model with the Use of GAM
Generalized Additive Model (GAM) is a class of statistical models in which the usual linear relationship between the response and predictors are replaced by several non-linear smooth functions [30]. The equation becomes as follow (Equation (3)): where: y i -dependent variable; α 0 -intercept; x 1 , . . . , x p -independent variables; f 1 , . . . , f p -smoothing functions and ε i -random error The GAM model is able to capture the non-linear effect of individual variables. The response variable is obtained as a summation of individual effects, represented with one or more terms. The smoothing function f consists of the base functions b and the corresponding regression coefficients β (see Equation (4)). The base function b can take the form of a linear or cubic spline, P-spline, and other [31]. The smoothing function can also include two input variables, according to Equation (5), where δ is a vector of regression coefficients: where: I, J-the dimension of the spline basis; b(x)-the corresponding spline function; β-the corresponding regression coefficient and δ-the corresponding vector of regression coefficient.
One of the advantages of GAM models is their flexibility. The method summarizes the contribution of each predictor using smoothing terms. In addition, a GAM algorithm captures nonlinearity and interactions in a learning dataset. Predictive methods with more complex mathematical approach, such as artificial neural networks (ANN) are typical black-box models. In an ANN model, interactions with a forecasted variable are created implicitly when propagating through the hidden layers as each hidden unit is a nonlinear combination of the input. Besides, in order to build an accurate model with black box models, many variables, especially over a long period of time, must be taken into account [16]. In the study presented in this paper, the investigated gas-fired power plant is relatively new and there are no data from the heating seasons of previous years. For this reason, the training window is relatively short (the last several days), due to the needs of permanent adaptation to the current operation of the system, and this can be achieved by the use of GAM methods. The heat demand model for the case study DHS was built using the mcgv package [32] containing the implementation of libraries with the GAM method in R programming language. The forecasts generated by the model were compared with real values using the root mean square error RMSE and the mean absolute percentage error MAPE (Equations (6) and (7), respectively): where: N-number of hours in the analyzed period; t-hour; Q pred -predicted heat load, kW and Q real -real heat load, kW where: λ-penalty parameter. The model is fitted by minimalizing a penalized residual sum of squares RSS presented in Equation (8) for one dimensional basis functions. The fitting involves finding all coefficients to minimize residual sum of square with the use of the general cross validation (GCV) criteria proposed by Craven et al. [33]. The degree of smoothness in a spline can be controlled by a penalty parameter λ in order to avoid overfitting. The iterative process of minimalization is stopped when the change of value of GCV between successive iterations are less than 0.01. The function minimizing RSS provides a compromise between a regression spline fit and a linear fit. When λ is near to zero the fit will be close to the data.

Input Variables
The weather data listed in Table 1 were taken into consideration as predictors. The table contains the calculated Pearson's correlation coefficient, which is a simple statistic that measures the linear correlation between two variables. Coefficients were calculated using the data of the entire heating season. It has a value between +1 and −1, where 1 is a total positive linear correlation, and −1 is a total negative linear correlation. Correlations for forecasted and actual weather data are presented. There is less correlation with the forecasted values due to additional forecast errors. It should be noted that in a real application, the predictive model determines output based on the weather forecasts in a forthcoming day horizon.

Flow Diagram of the Model
In Figure 5 a flow chart that includes input and output of the predictive model is presented. The model generates an hourly heat load forecast in a day-ahead time horizon, starting from 00:00. This time horizon results from the conditions related to participation in the electricity market. The model operation on the timeline diagram is illustrated in Figure 6.  • Before issuing the results, the model is calibrated with the real weather data and corresponding heat output from CHP within the moving time window. • When the weather forecast file comes at 7:00, the previously calibrated model generates load forecasts for the assumed time horizon.

Model Parametrization and Validation
Three variants of the model depending on the set of input data were considered (Equations (9)-(11)). For the first case, the model is supplied with ambient temperature, radiation and wind speed (Equation (9)). In addition to climatic parameters, variable representing the hour of the day was used, which takes values from 0 to 23. In the model M2, ambient temperate and hour of the day were implemented (Equation (10)). The model M3 only considers ambient air temperature (Equation (11)):  The detailed description of calculation Q M3 using M3 model (Equation (11)) is presented in Equation (12). The calculated coefficients are can be found in Table 2, where value of heat demand obtained from the model was also presented for the selected ambient air temperature: where: I-dimension of the spline basis (number of knots, I = 9); β T,M3,i -the corresponding regression coefficient for M3 model (see Table 2) and b T,M3,i (Temp)-the corresponding cubic-spline regression function for M3 model When calibrating the model (see Figure 5), input variables are fitted using smoothing terms. The signal from the heat meter on the output from the CHP to the DHN was used as the instantaneous heat load. Base functions are in the form of cubic spline functions. Other smooth functions such as thin plate regression spline or P-splines were examined. It was found that with a sufficiently high number of knots for splines, the type of function is of marginal importance. The additive term in M3 model, dependent on the ambient air temperature is shown in Equation (12). Other functions used in the remaining models (M2 and M1) are analogous. After the calibration stage, Equations (9)-(11) are used to find the forecasted heat load based on the new weather data.
In each model variant, a combination of input data and additive functions were selected to obtain the most accurate result in a day-ahead horizon. For example, it was observed that taking into account wind speed by using the two-dimensional additive function f (Wind, Temp) gives a better fit. Table 3 presents the most relevant parameters used (as an argument for calculations within the mgcv package) [32]. Table 3. Input parameters for mgcv package used for building the predictive model.

Parameter
Description Value

Family
The family object specifying the distribution and link to use in fitting. Gaussian

Method
The smoothing parameter estimation method. GCV (generalized cross validation)

Optimizer
The numerical method to optimize the smoothing parameter estimation criterion. Newton

Smoothing functions
Indicating the smoothing basis to use.
Cubic regression splines Figure 7 shows the impact of individual variables on the heat load for a sample training dataset in moving window (12 days) for the M1 model which contains four input variables. It can be clearly seen that as the ambient temperature together with solar irradiation increase, the values of the additive functions take lower values. For the variable representing hour, the daily variability of heat production is visible as a reduced power at night and increased in the evening (Figure 8a). The two-dimensional smoothing function, including the combined effect of wind speed and air temperature, can be seen in Figure 8b, where contour plot of additive term on the heat load is presented. The estimated value of the two dimensional additive function is marked on individual contour lines. One can notice that increasing the wind speed in the low temperature range leads to an increase in the heat load. Table 4 contain some results from the fitting procedure of the M1 model, obtained for a selected training period. The table was generated as an output from mgcv package. It can be seen that the default maximum degrees of freedom for the smoothers used in the model are sufficient for all species, as the effective degrees of freedom (EDF) for all estimated smoothers are below their maximum possible value (k ). The p-value for the observed k-index is not significant. The k-index is a measure of the remaining pattern in the residuals, and the p-value is calculated based on the distribution of the k-index after randomizing the order of the residuals [34]. The data was fitted with GCV = 36.5 and R 2 = 0.95.

Effect of the Moving Window Size with Learning Data
In order to determine the optimal size of the sliding window, the effect of the number of days in the window was investigated. The analysis was based on the data from the entire heating season, simulating the operation of the investigated models day by day for different window sizes. The RMSE error was calculated for the test dataset (in the next day horizon), for forecasted as well as actual weather data. The results are illustrated in Figure 9. The graph shows that the optimal window size with the smallest corresponding RMSE error is 12 days. This window size allows the model to be adapted to the current DHN demand and end-users' behavior. The use of a time-shifting window potentially can lead to inaccurate forecasts in the next day's horizon (e.g., when the ambient air temperature is expected to rapidly decrease outside the range in the learning window). In that case, the forecasts will be extrapolated from smoothing functions.

Results of the Heat Demand Model during the Heating Season
The obtained heat load models have been tested using the data coming from the DHS system during the heating season from November until March. The accuracy of the method was determined by means of a comparison of the forecasts with real values at the corresponding time. Figure 10 presents the time course of forecasts obtained by considered models along with the actual heat load for a selected period. It can be clearly seen that the M3 model is inaccurate compared with the M1 and M2 model. Including a variable representing the hour of the day significantly improves the results. This approach allows taking into account the daily profiles of heat production (e.g., morning and evening peaks). From the point of view of optimizing the operation of the CHP plant, a high accuracy forecast for the forthcoming days is needed. It should be borne in mind that a model that fits well into the learning dataset may not give a good accuracy on the new data set. The Figure 11 shows the aggregated MAPE error for each model. The MAPE was calculated for the entire analyzed period (November-March). The metric was determined for the training dataset in a learning window (12 days) and test dataset (day-ahead horizon). In addition to weather data forecasts, simulation of model operation on real weather values was also included. This approach enables to assess the accuracy of the predictive model without weather forecast error. The M1 model gives a better fit to the training set because of including more input data (solar radiation and wind speed). A significant improvement can be observed while taking into account the hour of the day feature. This enables the daily seasonality to be taken into account. The box plot of absolute percentage errors for test dataset is presented in Figure 12. In the M1 and M2 model, the vast majority of hourly percentage errors are between 2 and 12%. The maximum observed errors exceed 20%. It can be seen that the M1 model gives slightly better results, particularly outliers are on a lower level. The Table 5 presents the results of the MAPE and RMSE metrics aggregated into individual months of the heating season (including M1 and M2 model). In the period from December to February, the models have similar accuracy. During transition periods such as November and March, where the heat demand is lower, the M1 model gives more accurate results. In these months, due to the relatively large daily amplitude of changes in ambient temperature as well as dynamic changes of mass flow rate in DHN, there are additional difficulties in forecasting. Standard deviation of the heat demand in these periods was greater than in typically winter months. It also should be noted that in these periods the weather forecast error increases. For example, the RMSE error of forecasts in ambient air temperature in January and February was approximately 1 • C. In March, RMSE increased to 1.5 • C.  Including solar radiation in the learning data set allows to obtain better accuracy in March, when the influence of radiation in the thermal gains of buildings is greater. This can be observed in Figure 13. During periods of increased radiation over a day, the M1 model gives a better fit compared with the M2 model. The accuracy of two models is similar at night. The plot shows the hourly errors during several days from the end of March. During this period, the greatest instantaneous errors were observed. The time corresponds to the transient period between heating season and summer. Poor repeatability of heat production profile for similar climatic data was observed as a reason of inaccuracy. Therefore, the instantaneous errors exceed 20%, however the daily mean values are below 10%.

Conclusions
The paper presents the results of heat demand forecasting in the complex system supplied from a gas fired combined heat and power (CHP) plant with seven independent heat sources. The idea and results presented in the paper deal with the actual challenge of increasing energy efficiency during heat and electricity production. To maintain high production efficiency with reduced pollutants emission, high quality production forecasting is needed. For the purposes of optimizing heat production in cogeneration, hourly forecasts of the expected load demand in the forthcoming day are necessary. Accurate prediction enables efficient planning of heat production, taking into account the cooperation of the cogeneration heat sources with the district heating network. The main benefit comes from effective planning of electricity sales on the spot market, keeping the heat production at the required level. This is because the electricity production depends on the current heat load in the investigated system, which is a gas-fired combined heat and power plant. Electricity generation in gas-fired CHP plants allows reducing the emission of gaseous pollutants as CO 2 , SO x , and NO x and stopping the emission of dust. The CO 2 emission level is around 50% lower in gas-fired power plants than coal-fired power plants. The main benefit of effective electricity generation planning in the gas-fired CHP plants is improving energy generation efficiency and reducing CO 2 emission level. The generalized additive model (GAM) method was successfully used to build a predictive model based on weather data and a variable representing an hour of the day. The heat load profile can change over time because of thermo-modernization of buildings, changing the regulation of heating nodes or new connections to the heating network. For this reason, it is important to calibrate the model properly and continuously. In presented paper the application of the adaptive training of the model using moving window was investigated. The results confirmed that the last 12 days give the opportunity to take into account the current conditions of DHN operation and heat consumption behavior by end users. The moving window was adopted in all three variants of the heat demand forecasting models. The model M1 is supplied with ambient temperature, radiation and wind speed. Model M2 was additionally supplied with variable representing the hour of the day, and model M3 only consider ambient air temperature. The results of model M1, including additional weather data such as irradiation and wind speed gives the best results, particularly between the winter and summer period when high fluctuations of radiation during a whole day exist. It should be noted that the model is also burdened with independent factors such as weather forecast error and uncertainty in measuring the thermal power at the DHN supply side (about 1.5-2%).  Data Availability Statement: Restrictions apply to the availability of these data. Data was obtained from PGE Energia Ciepła S.A. and are available from the authors with the permission of PGE Energia Ciepła S.A.