Article

Development and Evaluation of Neural Network Architectures for Model Predictive Control of Building Thermal Systems

1
Institute of Numerical Modelling, University of Latvia, LV-1004 Riga, Latvia
2
Lafivents SIA, LV-1004 Riga, Latvia
3
Institute of Applied Computer Systems, Riga Technical University, LV-1048 Riga, Latvia
*
Author to whom correspondence should be addressed.
Buildings 2025, 15(15), 2702; https://doi.org/10.3390/buildings15152702
Submission received: 10 June 2025 / Revised: 3 July 2025 / Accepted: 22 July 2025 / Published: 31 July 2025

Abstract

The operational performance and indoor environmental quality of buildings have a significant impact on global energy consumption and human quality of life. One of the key directions for improving building performance is the optimization of building control systems. In modern buildings, the presence of numerous actuators and monitoring points makes manually designed control algorithms potentially suboptimal due to system complexity and human factors. To address this challenge, model predictive control based on artificial neural networks can be employed. The advantage of this approach lies in the model’s ability to learn the dynamic behavior of the building from monitoring datasets. It should be noted that the effectiveness of such control models directly depends on the forecasting accuracy of the neural networks. In this study, we adapt neural network architectures such as GRU and TCN for use in the context of building model predictive control. Furthermore, we propose a novel hybrid architecture that combines the strengths of recurrent and convolutional neural networks. These architectures were compared using real monitoring data collected with a custom-developed device introduced in this work. The results indicate that, under the given experimental conditions, the proposed hybrid architecture outperforms both GRU and TCN models, particularly when processing large sequential input vectors.

1. Introduction

Buildings are among the largest consumers of energy, accounting for approximately 40% of total energy consumption [1]. At the same time, people spend about 90% of their lives indoors [2]. This implies that the energy efficiency and indoor environmental quality of buildings have a significant impact on the economy, the environment, and human quality of life.
Nowadays, numerous engineering approaches, combined with innovative materials, make it possible to substantially improve both the energy efficiency of buildings and the quality of the indoor climate. However, it is important to note that modifying the building structure or applying new materials is not always economically effective.
One promising direction for enhancing the energy efficiency and indoor climate quality of buildings is the optimization of building management. It is important to note that during the configuration of such management systems, errors may occur due to human factors—especially in complex buildings with multiple systems operating in dynamic climate conditions.
Thus, to improve building management, control models can be employed that are capable of evaluating the building’s dynamics on their own. As a result, these models can assess the effectiveness of different control strategies and select the one closest to optimal [3]. One such approach is model predictive control (MPC) based on artificial neural network models.
In MPC, one of the key components is the model used to analyze the outcomes of different control strategies. The accuracy of this model directly affects the efficiency of the control system, which may be assessed using various metrics such as comfort level, energy efficiency, and others.
There are three main directions for developing such models. The first approach involves white-box models, which are primarily based on physical processes. The advantage of these models lies in their transparency to users and their potential for high accuracy, provided that accurate construction properties are available [4]. However, in practice, not all parameters, such as building material properties, detailed geometry, or HVAC system details, are known. As a result, such models are rarely used in real-world applications.
The second type of model is the grey-box model, which partially relies on physical principles, while historical data is used to estimate the model coefficients. An example of such a model is the lumped capacitance model. There are several practical implementations of grey-box models, and in some cases, researchers have achieved fairly accurate results [5,6,7,8]. However, this approach still requires manual configuration of the model, which can directly impact its accuracy and demands a certain level of expertise. Results from integrating such models into ongoing control suggest that standardized solutions are required and that AI models could reduce the manual effort needed for system modeling [6].
The third category of models is black-box models, which are entirely data-driven. These models often include those based on artificial neural networks, which, according to several literature reviews, represent the most prevalent approach within this category [9]. Some studies suggest that neural network-based Model Predictive Control may even outperform white-box models, offering a substantial reduction in model development time [10].
Despite their potential, practical research in this area remains limited. In one notable study, the authors implemented models based on Input Convex Recurrent Neural Networks (ICRNNs) and Long Short-Term Memory (LSTM) neural network architectures. Their results indicate that the application of such control systems can achieve approximately 20% energy savings, alongside a significant enhancement in indoor comfort [11]. Among the approaches evaluated, the neural network-based models exhibited the best overall performance.
However, it is important to highlight that the authors conducted their study using simulated data rather than real monitoring data. Consequently, the results may differ significantly under real-world conditions due to measurement noise or factors not accounted for by the simulation. Furthermore, the authors utilized a fixed prediction horizon as well as a fixed input sequence length [11].
It is important to note that in real-world applications of model predictive control, the required prediction horizon length may vary significantly depending on the specific characteristics of the building control system, its inertia, and the control objectives [12]. Additionally, it should be emphasized that an effective forecast may require different lengths of historical data. For instance, in buildings with low thermal inertia, only a few hours or even minutes of historical data may suffice to accurately estimate the current state. In contrast, for systems with pronounced inertia, determining the current state may necessitate historical data spanning an entire day or even longer.
It may seem that using the maximum amount of input data for prediction should yield the best results. However, in practice, this is not always the case, as not all neural network models can efficiently utilize large volumes of input data. In certain situations, increasing the amount of input data may even lead to a decrease in model accuracy [13].
In [11], it remains unclear how the model’s accuracy varies with the length of the input data and the forecasting horizon, even though, in theory, increasing the amount of input data should improve performance.
In another study, based on a literature review in the field of model-based building control, the authors initially identify the most suitable neural network architectures, such as LSTM, Gated Recurrent Unit (GRU), and convolutional neural networks (CNNs) [14]. Subsequently, the authors train various architectures using monitoring data from a real building, experimenting with different input and prediction horizon lengths.
However, the results are presented in the form of a single mean squared error value for each architecture, which makes it unclear how the error varies with different input and output lengths, and which architecture performs better within specific ranges of input and output data lengths. It is also important to note that the study does not fully clarify which specific neural network architectures were applied. For example, in the case of the CNN architecture, it is not specified whether 1D or 2D convolutions were used, nor is the information flow within the architectures sufficiently described.
According to the literature reviews conducted by these authors and several other researchers, the most popular neural network architecture for assessing building dynamics in MPC is LSTM [9,11,14].
Nowadays, artificial neural network models are developing rapidly. New large language models such as GPT, which are based on transformer architectures and attention mechanisms, demonstrate outstanding, near-human performance on numerous benchmarks. However, in Model Predictive Control tasks, particularly in the context of building management, the ability to work efficiently with time series data is crucial. Some studies suggest that transformer architectures are not well suited for time series tasks due to their quadratic complexity in long time series forecasting [15].
At the same time, several studies in the field of long time series forecasting indicate that the classical LSTM can be outperformed by other architectures.
Some studies on long time series forecasting with recurrent neural networks indicate that the LSTM architecture may be inferior to its simplified variant, GRU, when forecasting data similar to that encountered by MPC models in building management. For example, various authors have reached such conclusions when forecasting solar power generation and electrical load [16,17,18]. A few works also indicate that GRU outperforms LSTM as a core model for MPC building control [14].
Similarly, within convolutional neural networks, new architectures tailored to time series analysis have appeared, for instance, the Temporal Convolutional Network (TCN), whose authors argue that it outperforms classical recurrent architectures, such as LSTM, in certain tasks [19]. This has also been confirmed by other research forecasting parameters close to building dynamics, such as seasonal energy consumption [20].
This makes TCN and GRU good candidates for the core of MPC for buildings. Importantly, TCN and GRU architectures have not yet been studied in comparison in the context of model predictive control (MPC) for buildings, nor has there been an investigation into how their performance depends on varying input and output sequence lengths.
One can hypothesize that, due to its sequential processing structure, the GRU architecture is better suited to capturing the context of closely positioned sequential data. In contrast, the TCN is likely more effective at capturing the context of data points that are further apart within the input vector, as its data processing flow is more symmetrical than that of GRU. Based on this, it can be assumed that combining these architectures may lead to a model capable of more effectively processing longer input sequences.
In order to address the question of how the most suitable architectures for MPC, such as GRU and TCN, perform under varying input and forecast data dimensions, as well as how these architectures may complement each other, a dataset specifically oriented toward MPC control was generated in this study. This dataset was subsequently used to compare the performance of different architectures.

2. Materials and Methods

2.1. Integration of MPC Control and Specific Data Monitoring Solution

Data collection for training MPC models based on artificial neural networks may require additional information beyond classical monitoring parameters, such as temperatures in various zones, system targets (e.g., target temperatures), and so forth. In this study, a practical monitoring solution was developed to enable the optimization of both schedule-based control and the maintenance of a specified operating condition. This solution allows for the tracking and modification of control signals for the heating and cooling systems. The combined control scheme and the heating/cooling system layout are shown in Figure 1.
In Figure 1, it can be seen that the heat source is a heat pump. The heat pump is connected to a water storage tank, from which water circulates through a heat exchanger. The regulation of energy supply to the capillary system (an alternative to traditional radiators) is performed via a 3-way valve, the position of which is controlled by a servomotor.
This part of the experimental setup represents a simplified heating system similar to those commonly used in serial-type multi-apartment buildings in the Baltic region. The main difference is that, in typical residential buildings, hot water is supplied by a central network, and radiators are used instead of a capillary system. However, the underlying energy control model remains the same.
In a conventional setup, the servomotor is typically controlled by a thermostat, which is used to set the desired (target) temperature. Monitoring the target temperature, in combination with temperature distribution throughout the building and external parameters, can serve as a dataset for training MPC models aimed at optimizing schedule-based control. Specifically, using such data, a model could theoretically predict (within a certain degree of accuracy) when a comfortable temperature would be reached during transitions, such as from night to day mode. Consequently, it would be possible to determine the optimal time to initiate the transition so that the desired temperature is achieved by a specified moment (within the system’s capabilities).
However, for developing a model focused on maintaining a set temperature, target temperature data alone is insufficient. In classical systems, maintaining the desired regime is typically managed by a PID controller, which generates a control signal for the servomotor based on the current deviation from the setpoint and the dynamics of that deviation. Therefore, in order to improve the regulation regime (i.e., reduce deviations from the setpoint), it is crucial to monitor the control signal itself—something that is usually not included in standard monitoring parameters.
To enable the monitoring of control signals, a custom ESP32-based board was developed, featuring both input and output ranges of 0–10 V. As shown in Figure 1, this device is labeled as the MPC monitoring device and is installed along the control signal line between the thermostat and the servo valve. The device was programmed such that, during the data collection phase, it received the signal from the thermostat, replicated that signal to the valve, and simultaneously transmitted the control signal data to a central module.
The MPC monitoring device was also equipped with an RS-485 interface for Modbus communication. Using this protocol, data was later collected by a central module based on a Raspberry Pi (RPI). Additionally, the device was programmed with a mode that allowed the RPI, via Modbus, to set a specific control signal, regardless of the signal being sent by the thermostat.
For the experiment, a thermostat with native Modbus communication capabilities over RS-485 was used. This enabled the monitoring of the set temperature and operational mode (cooling/heating) and allowed these parameters to be programmatically controlled.
Heat meters (shown in yellow and orange in Figure 1), which measure the temperature difference between the inlet and outlet water and calculate energy based on the fluid flow, were installed to account for the energy emitted both through the piping and the capillary system. These heat meters are also equipped with RS-485 interfaces for Modbus communication. As a result, the RPI can read, in real time, the energy flow, water temperatures, and flow rate in different segments of the heating system.

2.2. MPC AI Model Development

It is important to note that there are two primary approaches to configuring model predictive control (MPC) based on artificial neural network models. One approach involves developing a model that receives historical data along with the desired outcome, and in response, predicts the required control signal needed to achieve the specified target [14].
However, this approach can present challenges, as the same outcome may be achieved through multiple different control strategies. As a result, such models may suffer from poor convergence.
An alternative approach involves using a model that predicts the outcome of a given control strategy, taking historical data into account. In this case, each control input corresponds to a single outcome. Such a model can later be used to evaluate different control strategies. Then, optimization based on the model could be used to find a suitable control strategy.
An additional advantage of this type of model is its interpretability. For example, if a strategy is implemented in which the system operates at full capacity, the model should logically predict an increase in temperature.
It is important to note that not all classical architecture representations are directly suitable for model predictive control tasks in their original form. This is primarily due to the mismatch between the lengths of historical input data and predicted output sequence.
For an understanding of this, the reader may refer to the data structure of an MPC model, as illustrated in Figure 2.
The input data for the predictive component of an MPC model can be categorized into two types: historical measurements that characterize the current state of the controlled system, and data that describe the future control strategy.
For a minimal configuration of the predictive component based on artificial neural networks, the historical data may include reference signal measurements (e.g., indoor temperature). In Figure 2, these data are represented by the orange dashed line. In contrast, data representing the potential future control strategy (e.g., valve position) are shown in green in Figure 2.
Accordingly, the length of the control strategy data matches the length of the forecast that describes the outcome of the applied strategy (e.g., predicted indoor temperature), which is illustrated by the blue dashed line in Figure 2.
The selection of the prediction horizon length depends on both the dynamic inertia of the building’s behavior and the capabilities of the neural network architecture. The prediction length essentially represents a trade-off: if it is too long, forecast accuracy tends to decrease, the neural network becomes more difficult to train, and a larger dataset is required. On the other hand, if the prediction horizon is too short, the model may become less effective in practical applications, as the physical processes involved exhibit inertia and require a longer time span to be accurately captured and represented.
In turn, to determine the current state of the building under various control strategies, a dataset is required that contains sufficiently detailed information to enable predictions with a certain level of accuracy. In practical applications, it is generally unreasonable to use an input vector length exceeding one day, as, in most cases, key processes, such as heat transfer, thermal storage charging, and temperature stabilization, tend to reach a steady state within that time frame.
Theoretically, it is possible to develop models where the length of the control strategy input differs from the length of the predicted outcome. However, such models tend to be less flexible and, as a result, may be less effective due to poor generalization.
To address the issue of differing input and output dimensionality, the TCN architecture in this study was modified accordingly, as illustrated in Figure 3.
To achieve equal dimensionality of the input data at the initial stage, the control strategy data are expanded to match the dimensionality of the data representing the building’s state by means of partial or complete repeated duplication. The resulting data are then combined and fed into the TCN architecture.
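As a concrete illustration of this duplication step, the following minimal Python sketch (function name illustrative) expands a control-strategy vector to the state-history length using complete repeats plus a partial repeat at the end:

```python
def upscale_control(control, state_len):
    """Expand the control-strategy vector to the state-history length
    by complete repeats plus a partial repeat at the end."""
    reps = -(-state_len // len(control))  # ceiling division
    return (control * reps)[:state_len]
```

For example, a two-step control vector expanded to five state samples yields `upscale_control([0.2, 0.8], 5)` → `[0.2, 0.8, 0.2, 0.8, 0.2]`.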
In this implementation, the number of layers in the TCN architecture depends on the length of the input data and the filter size, which allows the model to scale with the input length. As a base, we used the dilated TCN implementation described in [19]; for the practical implementation, we used the keras-tcn library for Python 3.8 [19]. To limit the exponential growth of dilated convolutions while keeping the number of parameters manageable, we used a filter size k of 2 in this part of the architecture, as well as a fixed dilation base b of 2. Using the same value for b and k ensures that the architecture does not skip any input data or intermediate convolution results between layers. As a result, the dilation d is a function of the layer index i and can be described as follows [19]:
d = b^i
At the same time, the layer number required to cover all data can be described using the following equation [19]:
1 + (k − 1) · (b^n − 1) / (b − 1) ≥ l
where l is the input sequence length, and n is the layer number. In the case of b = 2 and k = 2, the required number of layers is the following:
n = ⌈log₂(l)⌉
The height of the output matrix is determined by the number of filters specified in the TCN architecture; this number was grid-searched for each input sequence length. The width of the output matrix depends on the input size, as the return-sequence option is enabled. The resulting matrix is then processed by a single 2D convolutional filter whose dimensions were configured so that the output after convolution matches the length of the forecast vector. Thus, the adapted TCN architecture contains no fully connected or recurrent layers and relies solely on convolutions for data processing.
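For readers who wish to verify the layer-count calculation, the following minimal Python sketch (function names illustrative) evaluates the receptive-field inequality above:

```python
def receptive_field(k, b, n):
    """Receptive field of a dilated TCN with filter size k, dilation base b,
    and n layers: 1 + (k - 1) * (b**n - 1) / (b - 1)."""
    return 1 + (k - 1) * (b**n - 1) // (b - 1)

def layers_to_cover(l, k=2, b=2):
    """Smallest layer count whose receptive field covers input length l."""
    n = 0
    while receptive_field(k, b, n) < l:
        n += 1
    return n
```

With k = b = 2, the receptive field is exactly 2^n, so a 24 h history at 3 min resolution (l = 480 samples) requires `layers_to_cover(480)` → 9 layers, since 2^9 = 512 ≥ 480.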
Since GRU is a recurrent neural network, unlike convolutional neural networks, it is less sensitive to fixed input dimensionality and can process input sequences of varying lengths. To achieve input symmetry in the GRU architecture, an upscaling mechanism using separate GRU units was applied to the control strategy data, as shown in Figure 4.
For practical implementation, we used the GRU layer provided by Keras. The number of upscaling GRU units matches the length of the input data representing the building’s state.
After the upscaling step, the control strategy data were combined with the state data and processed using a GRU layer, where the number of units was set equal to the length of the output vector. In this configuration, each GRU unit processes a sequence of input data, and only its final output is used. As a result, intermediate GRU outputs are not utilized. This allows for the individual optimization of the weights of each unit, with each unit responsible for generating a single output element.
It is important to note that each architecture has its own advantages and limitations. The GRU architecture processes input data sequentially, which theoretically enables better analysis of the context and relationships between closely positioned data points compared to TCN. However, due to its more symmetric data processing structure, TCN may offer a better understanding of relationships between data points that are farther apart in the input sequence than GRU.
To combine and potentially leverage the advantages of both architectures, the same upscaling method that was used in the previously described GRU architecture was applied. After upscaling, the input data were merged and then independently processed by both the TCN and GRU models described above; see Figure 5.
The TCN output was adjusted to match the length of the prediction vector by cropping excess data and applying a 2D convolutional layer.
Finally, the outputs of the GRU and TCN branches were concatenated and passed through a fully connected layer, with the number of neurons equal to the sum of the input dimensions. The result was then processed by a second fully connected layer, with the number of neurons equal to the length of the output vector.
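A sketch of this hybrid data flow in tf.keras is given below. The sizes are illustrative (8 h history and a 30 min horizon at 3 min resolution), the TCN branch is approximated by plain causal dilated Conv1D layers rather than the keras-tcn implementation used in the study, and residual connections, normalization, and the grid-searched filter counts are omitted:

```python
import tensorflow as tf

# Illustrative sizes: 8 h history and a 30 min horizon at 3 min resolution.
state_len, horizon, n_feats = 160, 10, 7

state_in = tf.keras.Input(shape=(state_len, n_feats), name="state_history")
ctrl_in = tf.keras.Input(shape=(horizon, 1), name="control_strategy")

# Upscale the control strategy to the state length by repeated duplication.
reps = -(-state_len // horizon)  # ceiling division
ctrl_up = tf.keras.layers.Lambda(
    lambda t: tf.tile(t, [1, reps, 1])[:, :state_len, :])(ctrl_in)
merged = tf.keras.layers.Concatenate(axis=-1)([state_in, ctrl_up])

# GRU branch: one unit per forecast step; only the final-step output is kept.
gru_out = tf.keras.layers.GRU(horizon)(merged)

# TCN-style branch: causal convolutions with doubling dilation (k = b = 2).
x = merged
for i in range(8):  # 2**8 = 256 >= state_len, per the receptive-field bound
    x = tf.keras.layers.Conv1D(16, kernel_size=2, dilation_rate=2**i,
                               padding="causal", activation="relu")(x)
# Crop to the horizon and collapse the channel dimension.
tcn_out = tf.keras.layers.Flatten()(
    tf.keras.layers.Conv1D(1, kernel_size=1)(x[:, -horizon:, :]))

# Concatenate the branches, then the two dense layers described in the text.
h = tf.keras.layers.Concatenate()([gru_out, tcn_out])
h = tf.keras.layers.Dense(2 * horizon)(h)
output = tf.keras.layers.Dense(horizon)(h)

model = tf.keras.Model([state_in, ctrl_in], output)
```

Here the first dense layer has 2 × horizon neurons, matching the sum of the two branch outputs, and the final dense layer matches the forecast length.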
For all neural network architectures employed in this study, the adaptive learning rate algorithm RAdam [21] was utilized. A grid search over the learning rate revealed that higher values, such as 0.01, often led to non-convergence in several configurations. Conversely, learning rates smaller than 0.001 resulted in negligible differences across all configurations. Therefore, a learning rate of 0.001 was selected as a stable and effective value for all architectures.
The Huber loss function was chosen as the cost function for optimization. This loss function combines the absolute and quadratic loss components, balanced by a parameter delta [22]. To optimize performance, delta was tuned via grid search over the range [0, 1] with a step size of 0.25, independently for each input sequence length.
The optimal number of training epochs was determined automatically using early stopping. The patience parameter was set to 300 (with an average epoch number of around 12 K), and the early stopping criterion was based on the mean absolute error (MAE) evaluated on the validation dataset.
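For reference, the Huber loss used above can be sketched in scalar form as follows (a minimal illustration of the definition in [22], not the exact library implementation):

```python
def huber(residual, delta):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)
```

Small residuals are thus penalized quadratically, while large residuals (e.g., measurement spikes) contribute only linearly, which makes the loss less sensitive to outliers than pure MSE.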

2.3. Baseline Model

For a more effective comparison between models, Random Forest was selected as the baseline model. In contrast to neural network architectures such as TCN and GRU, this model does not require sequential input data. Therefore, the matrix representing the building’s state and the control strategy were combined and presented to the model as a single vector. Separate models were employed for multi-step ahead forecasting. For example, to forecast 10 future time steps, the same vector containing information about the building’s state and the future control strategy was input into 10 different Random Forest models, each of which predicted one future value.
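The direct multi-step scheme described above can be sketched as follows. This is a dependency-free illustration: in the study, each per-step model was a Random Forest, whereas here a trivial mean predictor stands in so the pattern itself is visible.

```python
class DirectMultiStep:
    """Direct multi-step forecasting: one independent regressor per horizon
    step (RandomForestRegressor in the study; any fit/predict model works)."""
    def __init__(self, make_model, horizon):
        self.models = [make_model() for _ in range(horizon)]

    def fit(self, X, Y):
        # Y[i][h] is the target value h steps ahead for sample i.
        for h, model in enumerate(self.models):
            model.fit(X, [row[h] for row in Y])

    def predict_one(self, x):
        return [model.predict([x])[0] for model in self.models]


class MeanModel:
    """Stand-in regressor used only to keep the sketch dependency-free."""
    def fit(self, X, y):
        self.mean = sum(y) / len(y)

    def predict(self, X):
        return [self.mean for _ in X]
```

For example, `DirectMultiStep(MeanModel, horizon=2)` trains two separate models, one for each future step, and concatenates their single-value predictions into the forecast vector.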

3. Experiment

To determine the optimal neural network configuration, a series of experiments was conducted.

Dataset

Real monitoring data were used from an experimental building located in the Botanical Garden of the University of Latvia in Riga. Detailed parameters of the building were presented in our previous publication [23].
The data were collected over a period of 76 days, covering the entire intensive heating season of one year in Latvia. As a result, the dataset does not include days when the heating system was inactive. It should be noted that this dataset is suitable only for dynamic models aimed at predicting building behavior during the heating period. For control under cooling conditions or during transitional phases, the most effective approach would likely be to use separate models trained on corresponding datasets, rather than relying on a single universal model designed to predict dynamics across all operating modes.
It is also important to emphasize that this dataset cannot be used to assess the absolute performance metrics of models when applied to other buildings, as all buildings differ in their characteristics. However, it can be useful for evaluating the potential differences in the behavior of various models.
The dataset included outdoor temperature, the temperature of the internal heating circuit (between the heat pump and the heat exchanger), the temperature of the capillary system (which radiates heat through the walls and ceiling), indoor air temperature, the temperature read by the thermostat, the temperature at the center of the room, and the voltage applied to the three-way valve that controls the system’s output power. Additionally, time was encoded as a cosine phase generated from the timestamp to provide temporal context to the model.
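The cosine time encoding can be sketched as follows; a daily period is assumed here, since the paper does not state the exact phase convention:

```python
import math

def time_cosine(ts_seconds):
    """Encode time of day as a cosine phase (daily period assumed;
    the exact phase convention is an assumption of this sketch)."""
    day = 24 * 3600
    return math.cos(2 * math.pi * (ts_seconds % day) / day)
```

This gives the model a smooth, periodic daytime signal: midnight maps to 1, noon to −1, and timestamps 24 h apart map to the same value.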
All data were normalized to a 0–1 range. For this, the minimum value of each parameter was subtracted from its column, and then all values from the column were divided by its maximum. It is important to note that the data were sourced from multiple systems: the previously described MPC monitoring device (for all heating signals), a weather station, and a wireless monitoring system described in one of our other publications [24]. The data recording frequency during monitoring was one record per minute.
Due to the use of multiple data acquisition systems, perfect time synchronization was not possible. To address this and to maintain a consistent temporal resolution, the raw data were first averaged over one-minute intervals. Then, linear interpolation was applied to eliminate the potential NaN values of each data column, which could have resulted from electronic malfunctions or time misalignments. Finally, to compress the data, the values were averaged over three-minute intervals.
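These preprocessing steps (min-max normalization, linear gap interpolation, and block averaging) can be sketched in dependency-free Python as follows. The study's actual tooling is not specified, and the interpolation sketch assumes non-NaN boundary values:

```python
def min_max_normalize(column):
    """Shift by the column minimum, then divide by the shifted maximum,
    mapping the column onto the 0-1 range."""
    lo = min(column)
    shifted = [v - lo for v in column]
    hi = max(shifted)
    return [v / hi for v in shifted]

def interpolate_nans(column):
    """Linearly fill interior NaN gaps left by electronic malfunctions or
    time misalignment (assumes the first and last values are not NaN)."""
    out = list(column)
    i = 0
    while i < len(out):
        if out[i] != out[i]:  # NaN is the only value unequal to itself
            j = i
            while out[j] != out[j]:
                j += 1
            left, right = out[i - 1], out[j]
            for k in range(i, j):
                out[k] = left + (right - left) * (k - i + 1) / (j - i + 1)
            i = j
        i += 1
    return out

def block_average(column, n=3):
    """Compress the series by averaging consecutive n-sample blocks."""
    return [sum(column[i:i + n]) / n
            for i in range(0, len(column) - len(column) % n, n)]
```

For example, `min_max_normalize([10, 20, 30])` yields `[0.0, 0.5, 1.0]`, and `block_average` over one-minute records with `n=3` produces the three-minute resolution used for training.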
Each training sample for the neural network architecture consists of three components: two input blocks and one output block.
The first input block is a matrix of data columns that describe the state of the building and its environment: weather station data, temperatures within the heating system, previous control signals, and past room temperatures. The length of this input block is determined by the input data length parameter.
The second input block is a vector representing the future control strategy; in this case, the voltage applied to the servo valve. The length of this input depends on the forecast horizon.
Finally, the output block is a vector corresponding to the predicted future room temperature, specifically at the center of the room.
The dataset was divided into three subsets: training, validation, and testing using a 70/20/10% split.
Then, to increase the number of samples, a sliding window approach was used instead of segmenting the data into non-overlapping chunks. The window size was set to the sum of the input length and the forecast horizon. This window was moved across the monitoring data with a stride of one sample, generating a larger number of training, validation, and test examples for the neural network.
It is important to note that, as a result of this method, in the context of a single dataset split (e.g., the training set), certain data segments may appear in multiple examples.
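The sliding-window sample generation can be sketched as follows (function name illustrative):

```python
def sliding_samples(series, input_len, horizon, stride=1):
    """Cut overlapping (history, future) pairs from a monitoring series
    with a one-sample stride, as described above."""
    window = input_len + horizon
    return [(series[s:s + input_len], series[s + input_len:s + window])
            for s in range(0, len(series) - window + 1, stride)]
```

A series of N samples thus yields N − (input_len + horizon) + 1 training examples, far more than non-overlapping segmentation would produce, at the cost of the overlap noted above.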
Each neural network architecture configuration was trained five times to eliminate the influence of random weight initialization. For comparison, the best result out of the five runs was selected based on the mean absolute error (MAE) metric. In this comparison, the mean absolute error (MAE) of temperature, measured in degrees Celsius, was chosen as the primary evaluation metric, as its values directly reflect the potential average physical deviation of the forecast. This, in turn, can be used to assess the impact of model accuracy on control performance outcomes.
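The best-of-five selection by MAE can be sketched as follows (function names illustrative):

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the same physical units as the data
    (degrees Celsius for the temperature forecasts here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def best_run(predictions_per_run, y_true):
    """Pick, from repeated trainings of one configuration, the run whose
    predictions have the lowest MAE against the reference values."""
    return min(predictions_per_run, key=lambda pred: mae(y_true, pred))
```

Because MAE is kept in degrees Celsius, the selected score can be read directly as the average physical forecast deviation.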
Since different architectures may, in theory, process data with varying effectiveness, and this effectiveness may depend on both the amount of input data and the prediction horizon, we evaluated the architectures using input lengths ranging from 3 h to 1 day and forecasting horizons ranging from 30 min to 3 h.
This choice was based on the physical behavior of the system, where most heat transfer processes are expected to stabilize within a 24-hour period. Conversely, using very short histories or forecasting horizons below 30 min is generally impractical due to the thermal inertia of the building, which can be partially assessed from the monitoring data. For example, as shown in Figure 6, even when the heating system operates at full capacity, it takes approximately three hours for the temperature to stabilize.

4. Results

All architectures were trained five times to eliminate the influence of weight initialization and to obtain stable results.
The best results for the TCN, GRU, the proposed TCN-GRU hybrid architectures, and the reference Random Forest models, depending on input sequence length and prediction horizon, are presented in Table 1, Table 2, Table 3, and Table 4, respectively. The header row labeled Input gives the input sequence length in minutes; dividing this value by three (accounting for the previously applied 3 min averaging) yields the actual size of the input vector used in the models. Similarly, the column labeled Out gives the prediction horizon in minutes, which should likewise be divided by three to obtain the true length of the predicted vector.
It is important to note that the absolute error values in the tables have been inverse-normalized and, therefore, correspond to real temperature values.
It can be observed that when forecasting 30 min ahead (equivalent to 10 samples), all models exhibit very low errors; see Figure 7.
However, it is important to note that within such a short time interval, the system undergoes only minimal changes, which could be accurately captured even by a simple model, such as Random Forest.
As shown in Figure 6, even when the system operates at maximum power, the temperature changes by only approximately 0.35 °C over a 30 min period.
However, since the system does not continuously switch between 0% and 100% power, the average temperature deviation over this period is significantly smaller than 0.35 °C.
All architectures also exhibit a consistent and logical trend: when the input sequence length is held constant and the prediction horizon is increased, the error also increases. This is expected, as the neural network must approximate a more complex function; the longer the prediction horizon, the less effectively it can be modeled by a simple linear function. The baseline model shows the same behavior; see Figure 8.
Comparing the results, the TCN architecture demonstrates the highest error across all configurations of input and forecast lengths relative to the other models, including Random Forest. It is also noteworthy that, unlike the GRU and the combined GRU-TCN architectures, the TCN exhibits an increase in error as the input sequence length increases while the forecast horizon remains constant; see Table 1. The baseline Random Forest model also appears to capture similar dynamics, albeit to a lesser extent (see Table 4). One possible explanation is that, in this implementation, the temporal convolutional network (TCN) has the fewest trainable parameters of the neural network models. For example, with a 180 min input length, the TCN has approximately half as many parameters as the GRU model, and this disparity grows as the input length increases. Another contributing factor may be the fixed minimum size of the filters used in the TCN layers. Slightly better results might have been achieved if these parameters were optimized for each specific input dimension; however, from a practical perspective, such optimization is resource-intensive, as it requires training and validating numerous model configurations.
The highest mean absolute error (MAE) of 0.36 °C for the TCN architecture occurred with an input length of one day (1440 min) and a prediction horizon of 3 h, as shown in Table 1.
A natural question may arise: Is an MAE of 0.36 °C considered large or small?
To address this, consider that the model is intended for optimizing schedule-based control. In other words, it needs to determine when to activate the daytime mode in order to reach a comfortable indoor temperature by a specific time.
Based on the system dynamics illustrated in Figure 6, it can be seen that under maximum power, the transition from night mode to daytime comfort conditions takes approximately 180 min, during which the temperature increases by 2.5 °C. From this, we can estimate a simplified linear rate of temperature change of approximately 0.013 °C per minute. Therefore, with a mean absolute error of 0.36 °C, the system could potentially switch to daytime mode up to 25 min too early or too late.
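This back-of-envelope estimate can be reproduced directly; the input figures below come from Figure 6 and Tables 1 and 3, and the small difference from the quoted 25 min arises from rounding the per-minute rate:

```python
# Simplified linear estimate of scheduling deviation caused by forecast error.
rise_c, rise_min = 2.5, 180        # full-power warm-up observed in Figure 6
rate = rise_c / rise_min           # ~0.014 degC per minute

dev_tcn = 0.36 / rate              # worst TCN MAE (Table 1) -> ~26 min
dev_hybrid = 0.108 / rate          # best hybrid MAE (Table 3) -> ~8 min
```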
Examining the error variations in the GRU architecture under different input lengths and forecasting horizons, one can observe that in certain cases, for example, when predicting 3 h ahead, increasing the input length leads to a reduction in error (see Table 2). However, this trend is not consistent across all forecasting horizons; see Figure 9.
For instance, when forecasting 90 min ahead, using the longest input sequence (1440 min) results in an increase in error. This indicates that the model is unable to effectively utilize the additional input data. This issue may be related to the known limitation of recurrent neural networks, which tend to forget the values of elements located near the beginning of long input sequences, an issue highlighted by the authors of the TCN architecture in their publication [19].
As shown in Table 3, unlike the GRU, the proposed hybrid TCN-GRU architecture exhibits a clear trend of decreasing prediction error with increasing input length when forecasting 180 min ahead. This indicates that the architecture is capable of effectively utilizing additional input data when making long-term forecasts. At the same time, it outperforms all models in the 180 min forecast horizon; see Figure 10.
It is also noteworthy that when forecasting 180 min ahead, the proposed TCN-GRU hybrid architecture achieves a 30% lower error than the GRU, and an error more than three times lower than that of the TCN architecture.
Furthermore, in the cases of 60 and 90 min forecasts, the prediction error is very close to that of GRU. This may simply indicate that, for this particular building, forecasting temperature changes over a 60 or 90 min horizon does not require additional data beyond a certain threshold.
The reason why the hybrid architecture may yield more accurate long-term forecasts can be attributed to the ability of the final fully connected layers to effectively integrate the outputs from both the GRU and TCN components. The TCN component potentially retains more information from the earlier parts of the time series, as well as long-range dependencies between distant data points. At the same time, the GRU component can complement the TCN by providing more precise information about the influence of recent data. Thus, the combination of these architectures can enhance the model’s memory capacity.
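The fusion described here can be sketched in PyTorch as follows; this is a minimal illustrative model, and the layer sizes, the two-layer dilated TCN branch, and the concatenation-based fusion are all assumptions rather than the authors' published configuration:

```python
import torch
import torch.nn as nn

def causal_block(in_ch, out_ch, dilation):
    # Left-only padding keeps the convolution causal and the length unchanged.
    pad = (3 - 1) * dilation
    return nn.Sequential(
        nn.ConstantPad1d((pad, 0), 0.0),
        nn.Conv1d(in_ch, out_ch, kernel_size=3, dilation=dilation),
        nn.ReLU(),
    )

class HybridTcnGru(nn.Module):
    """Hypothetical TCN-GRU hybrid: both branches read the state history,
    and fully connected layers fuse them with the planned control vector."""
    def __init__(self, n_features, horizon, hidden=64, channels=32):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.tcn = nn.Sequential(
            causal_block(n_features, channels, dilation=1),
            causal_block(channels, channels, dilation=2),
        )
        self.head = nn.Sequential(
            nn.Linear(hidden + channels + horizon, 128),
            nn.ReLU(),
            nn.Linear(128, horizon),
        )

    def forward(self, x_state, x_control):
        # x_state: (batch, time, features); x_control: (batch, horizon)
        _, h = self.gru(x_state)               # recent-history summary
        t = self.tcn(x_state.transpose(1, 2))  # long-range dilated features
        z = torch.cat([h[-1], t[:, :, -1], x_control], dim=1)
        return self.head(z)                    # predicted temperature vector
```

The GRU branch supplies a summary biased toward recent dynamics, while the dilated convolutions preserve long-range dependencies; the head then weighs both against the future control plan.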
In conclusion, if this model is evaluated in the context of schedule-based control using reasoning similar to that applied to the TCN, a 180 min forecast horizon with a 1440 min input length would, on average, lead to a deviation in system activation time of approximately 8 min from the optimum.

5. Conclusions

In the course of this study, a monitoring system was developed to enable the generation of real physical datasets specifically tailored for the development and validation of neural network-based MPC models.
Although the use of such data does not allow for directly determining the absolute performance of different architectures across various buildings and operating modes, especially depending on input and output length, the results obtained from this dataset can still be useful for assessing the relative behavior of the tested models. These behavioral trends may persist across other buildings as well.
The results indicate that excessively increasing the forecast horizon can often have a negative impact on model performance. The simpler baseline model, Random Forest, has shown promising results, suggesting its potential suitability for short-term forecasting tasks.
The comparative analysis shows that, in the current implementation, TCN architectures underperform compared to Random Forest and GRU models. However, the results obtained with the new combined TCN-GRU architecture suggest that, despite the weak performance of standalone TCNs, their combination with a GRU can significantly outperform GRU models in accuracy at longer forecasting horizons. This supports the hypothesis stated at the beginning of the study.
For 60 and 90 min forecast horizons, the GRU and GRU-TCN models produced similar results, with little variation in performance across different input lengths. Since the hybrid architecture requires more computational resources for training, the GRU may be a more efficient choice for medium-range forecasting horizons.
The proposed hybrid architecture may be further applied to the optimization of schedule-based building control, as initial tests demonstrated a potentially small temporal deviation (less than 10 min) in activation timing. However, this time may vary depending on the optimization strategy used in combination with dynamic models, the study of which could be the focus of future research.

Author Contributions

Conceptualization, J.T., A.K. and A.N.; methodology, J.T.; software, J.T.; validation, J.T.; formal analysis, J.T. and A.K.; investigation, J.T.; resources, J.T.; data curation, J.T.; writing—original draft preparation, J.T., A.K. and A.N.; writing—review and editing, J.T., A.K. and A.N.; visualization, J.T. and A.K.; supervision, J.T., A.K. and A.N.; project administration, J.T. and A.K.; funding acquisition, A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was conducted based on an agreement with SIA “ETKC” (Centre of Competence for Energy and Transportation) within the framework of project Nr. 5.1.1.2.i.0/2/24/A/CFLA/002, co-funded by The Recovery and Resilience Facility.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

Author Andris Krumins was employed by the company Lafivents SIA. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Cao, X.; Dai, X.; Liu, J. Building energy-consumption status worldwide and the state-of-the-art technologies for zero-energy buildings during the past decade. Energy Build. 2016, 128, 198–213. [Google Scholar] [CrossRef]
  2. Klepeis, N.; Nelson, W.; Ott, W.; Robinson, J.; Tsang, A.; Switzer, P.; Behar, J.; Hern, S.; Engelmann, W. The National Human Activity Pattern Survey (NHAPS): A resource for assessing exposure to environmental pollutants. J. Expo. Anal. Environ. Epidemiol. 2001, 11, 231–252. [Google Scholar] [CrossRef]
  3. Široký, J.; Oldewurtel, F.; Cigler, J.; Prívara, S. Experimental analysis of model predictive control for an energy efficient building heating system. Appl. Energy 2011, 88, 3079–3087. [Google Scholar] [CrossRef]
  4. Taheri, S.; Hosseini, P.; Razban, A. Model predictive control of heating, ventilation, and air conditioning (HVAC) systems: A state-of-the-art review. J. Build. Eng. 2022, 60, 105067. [Google Scholar] [CrossRef]
  5. Oldewurtel, F.; Parisio, A.; Jones, C.N.; Gyalistras, D.; Gwerder, M.; Stauch, V.; Lehmann, B.; Morari, M. Use of model predictive control and weather forecasts for energy efficient building climate control. Energy Build. 2012, 45, 15–27. [Google Scholar] [CrossRef]
  6. Tomás, L.; Lämmle, M.; Pfafferott, J. Demonstration and Evaluation of Model Predictive Control (MPC) for a Real-World Heat Pump System in a Commercial Low-Energy Building for Cost Reduction and Enhanced Grid Support. Energies 2025, 18, 1434. [Google Scholar] [CrossRef]
  7. Uytterhoeven, A.; Van Rompaey, R.; Bruninx, K.; Helsen, L. Chance constrained stochastic MPC for building climate control under combined parametric and additive uncertainty. J. Build. Perform. Simul. 2022, 15, 410–430. [Google Scholar] [CrossRef]
  8. Nagpal, H.; Avramidis, I.I.; Capitanescu, F.; Heiselberg, P. Optimal energy management in smart sustainable buildings—A chance-constrained model predictive control approach. Energy Build. 2021, 248, 111163. [Google Scholar] [CrossRef]
  9. Michailidis, P.; Michailidis, I.; Gkelios, S.; Kosmatopoulos, E. Artificial Neural Network Applications for Energy Management in Buildings: Current Trends and Future Directions. Energies 2024, 17, 570. [Google Scholar] [CrossRef]
  10. Stoffel, P.; Berktold, M.; Gall, A.; Kümpel, A.; Mueller, D. Comparative study of neural network based and white box model predictive control for a room temperature control application. J. Phys. Conf. Ser. 2021, 2042, 012043. [Google Scholar] [CrossRef]
  11. Paré, M.C.; Dermardiros, V.; Lesage-Landry, A. Efficient Data-Driven MPC for Demand Response of Commercial Buildings. arXiv 2024, arXiv:2401.15742. [Google Scholar] [CrossRef]
  12. Laguna, G.; Mor, G.; Lazzari, F.; Gabaldon, E.; Erfani, A.; Saelens, D.; Cipriano, J. Dynamic horizon selection methodology for model predictive control in buildings. Energy Rep. 2022, 8, 10193–10202. [Google Scholar] [CrossRef]
  13. Kim, H.; Ejaz, M.A.; Lee, K.; Cho, H.M.; Kim, D.H. Predictive Optimal Control Mechanism of Indoor Temperature Using Modbus TCP and Deep Reinforcement Learning. Appl. Sci. 2025, 15, 7248. [Google Scholar] [CrossRef]
  14. Abida, A.; Richter, P. HVAC control in buildings using neural network. J. Build. Eng. 2023, 65, 105558. [Google Scholar] [CrossRef]
  15. Zeng, A.; Chen, M.; Zhang, L.; Xu, Q. Are Transformers Effective for Time Series Forecasting? arXiv 2022, arXiv:2205.13504. [Google Scholar] [CrossRef]
  16. Dey, R.; Salem, F. Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks. arXiv 2017, arXiv:1701.05923. [Google Scholar] [CrossRef]
  17. Huynh, A.; Nguyen, T. The Comparison of GRU and LSTM in Solar Power Generation Forecasting Application. Int. J. Sci. Res. Arch. 2024, 13, 1360–1370. [Google Scholar] [CrossRef]
  18. Abumohsen, M.; Owda, A.Y.; Owda, M. Electrical Load Forecasting Using LSTM, GRU, and RNN Algorithms. Energies 2023, 16, 2283. [Google Scholar] [CrossRef]
  19. Bai, S.; Kolter, J.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. arXiv 2018, arXiv:1803.01271. [Google Scholar] [CrossRef]
  20. Shaikh, A.K.; Nazir, A.; Khalique, N.; Shah, A.S.; Adhikari, N. A new approach to seasonal energy consumption forecasting using temporal convolutional networks. Results Eng. 2023, 19, 101296. [Google Scholar] [CrossRef]
  21. Liu, L.; Jiang, H.; He, P.; Chen, W.; Liu, X.; Gao, J.; Han, J. On the Variance of the Adaptive Learning Rate and Beyond. arXiv 2021, arXiv:1908.03265. [Google Scholar] [CrossRef]
  22. Gokcesu, K.; Gokcesu, H. Generalized Huber Loss for Robust Learning and its Efficient Minimization for a Robust Statistics. arXiv 2021, arXiv:2108.12627. [Google Scholar] [CrossRef]
  23. Telicko, J.; Jakovics, A. Applying Dynamic U-Value Measurements for State Forecasting in Buildings. Latv. J. Phys. Tech. Sci. 2023, 60, 81–94. [Google Scholar] [CrossRef]
  24. Telicko, J.; Jakovics, A. Power efficient wireless monitoring system based on ESP8266. In Proceedings of the 2022 IEEE 63rd International Scientific Conference on Power and Electrical Engineering of Riga Technical University (RTUCON), Riga, Latvia, 10–12 October 2022; pp. 1–6. [Google Scholar] [CrossRef]
Figure 1. Heating system configuration.
Figure 2. MPC data scheme.
Figure 3. TCN architecture data flow.
Figure 4. GRU architecture data flow.
Figure 5. Combined GRU-TCN architecture data flow.
Figure 6. Building response to 100% heating power.
Figure 7. Mean absolute error of different models when forecasting 30 min ahead.
Figure 8. Change in model mean absolute error depending on the forecast horizon with a fixed 540 min data input.
Figure 9. GRU MAE trends.
Figure 10. Model comparison for long, 180 min forecasts.
Table 1. Temperature (degrees Celsius) mean absolute error for the TCN architecture under different input and output shapes.

Out \ Input    180 min    540 min    1440 min
30 min         0.086      0.079      0.084
60 min         0.112      0.122      0.134
90 min         0.160      0.166      0.192
180 min        0.248      0.298      0.360
Table 2. Temperature (degrees Celsius) mean absolute error for the GRU architecture under different input and output shapes.

Out \ Input    180 min    540 min    1440 min
30 min         0.066      0.078      0.070
60 min         0.092      0.104      0.080
90 min         0.118      0.114      0.126
180 min        0.182      0.160      0.162
Table 3. Temperature (degrees Celsius) mean absolute error for the GRU-TCN hybrid architecture under different input and output shapes.

Out \ Input    180 min    540 min    1440 min
30 min         0.064      0.078      0.074
60 min         0.108      0.106      0.104
90 min         0.094      0.100      0.112
180 min        0.152      0.142      0.108
Table 4. Temperature (degrees Celsius) mean absolute error for Random Forest under different input and output shapes.

Out \ Input    180 min    540 min    1440 min
30 min         0.055      0.057      0.060
60 min         0.087      0.093      0.097
90 min         0.117      0.122      0.126
180 min        0.187      0.190      0.198