Neural Approach in Short-Term Outdoor Temperature Prediction for Application in HVAC Systems

Abstract: An accurate air-temperature prediction makes it possible to estimate energy consumption and system load in advance, both of which are crucial in optimising the operation of HVAC (heating, ventilation, air conditioning) systems as a way of reducing energy losses, operating costs, and pollution and dust emissions while maintaining residents’ thermal comfort. This article presents the results of an outdoor air-temperature time-series prediction for a multifamily building with the use of artificial neural networks during the heating period (October–May). The aim of the research was to analyse in detail the created neural models with a view to selecting the best combination of predictors and the optimal number of neurons in the hidden layer. To that end, the Akaike information criterion (AIC) was used. The most accurate results were obtained by MLP 3-3-1 (r = 0.986, AIC = 1300.098, SSE = 4467.109), with the ambient-air-temperature time series observed 1, 2, and 24 h before the predicted temperature as predictors. The AIC proved to be a useful method for optimum model selection in machine-learning modelling. Moreover, the neural network models provided the most accurate prediction when compared with linear regression (LR) and support vector regression (SVR). Additionally, the obtained temperature predictions were used in HVAC applications: entering-water temperature and indoor temperature modelling.


General Context
The building sector is responsible for a major part of global energy consumption. It uses about 40% of the generated power, which is more than that used in industry and transport [1,2]. Depending on the climate, heating and cooling consume as much as 70% of the power used by buildings [1]. At the same time, this sector, being the highest energy consumer, has great potential for improving energy efficiency thanks to modernisation activities, including insulation and the use of modern HVAC (heating, ventilation, air conditioning) systems that are managed in a smart way [3,4]. A rational HVAC system operation, which consists of reducing energy losses, operational costs, and the volume of pollution and dust emissions while maintaining thermal comfort, should be based on heating-load prediction. Several factors influence the heating load, including the energy-efficiency standard of the building, its heating capacity, and the position of the premises in relation to each other or to external walls. Another important aspect is the orientation of the building with respect to the cardinal directions and the density of development in the surrounding area [5]. Research has shown that the operational habits and routines of residents in relation to their temperature preferences also have a significant influence on the heating demand [6,7]. The so-called human factor that affects the internal heating-load balance has been discussed, among others, in the work of Stevenson and Leaman [8].
To estimate the amount of heating consumed in buildings, the analyses usually attempt to find empirical correlations between weather conditions and heating demand. Among the basic meteorological parameters, outdoor air temperature is the most important element in calculating the heating demand, which constitutes the basis for designing heating systems and heating sources of buildings [9]. What is more, it is also used in indoor thermal-comfort forecasting [10,11].
Kováč et al. [12] used historical data of outdoor temperature, heat load, heat-consumer configuration, and the seasons for short-term prediction of energy consumption and heat-production planning for the period from 1 November 2014 to 31 March 2015 using wavelet transform and neural networks. The authors concluded that outdoor air temperature is one of the most important inputs in heat-consumption forecasting.
Manasis et al. [13] studied the impact of outdoor temperature on the performance of an open-cycle gas turbine. They tested Kalman filtering as a preprocessing technique on temperature data, which was an input variable for turbine output power as well as power-generation prediction in power plants. The obtained results indicated that accurate temperature prediction leads to output-power prediction improvement by 65% and power-generation prediction improvement by 73%.
Zhao et al. [14] tested the Monte Carlo method (MCM) to preprocess meteorological data (including temperature) before 24 h cooling-load prediction modelling using support vector regression (SVR). The authors pointed out that the most important input variable was the one-hour-ahead temperature, for which the load-prediction error decreased with MCM from 11.54% to 10.92%.
Futawatari et al. [15] investigated the energy-saving capacity of HVAC systems in data centres, where information-technology equipment needs permanent cooling and ventilating. Among others, they predicted computer-room air conditioning (CRAC) power consumption on the basis of outside temperature. Using air temperature in Tokyo in 5 °C classes ranging from −10 °C to 35 °C, they created a method for optimizing the total energy consumption, with a reduction of 10%.
Kim and Nam [16] developed a performance-prediction equation for a heat exchanger to provide cost reduction as well as operation improvement for a ground-source heat pump. The authors considered ground and outdoor air temperature as potential factors affecting the exchanger's performance and concluded that the former had a larger influence. They also calculated errors in the equation and numerical data and obtained an RMSE standard deviation not exceeding 5.20%, indicating that the obtained models were valid.
Another issue that emerges from the present research is the importance of the application of computational intelligence methods (including the neural approach) in widely understood engineering. Numerous studies have raised the subject. For example, Sadeghzadeh et al. [17] developed a multilayer perceptron (MLP-ANN), a radial basis function (RBF-ANN), and Elman backpropagation (Elman BP-ANN) to improve the thermal efficiency of a flat-plate solar collector. Baghban et al. [18] implemented machine-learning methods such as MLP, adaptive neuro-fuzzy inference system (ANFIS), and least squares support vector machine (LSSVM) in the heat transfer of a helically coiled tube. Ahmadi et al. [19] proposed LSSVM and a genetic algorithm to improve Al2O3/EG nanofluid thermal-conductivity ratio forecasting.

Scope
The literature indicates that outdoor temperature prediction is crucial for more energy-efficient HVAC systems in buildings, leading to the optimisation of their management and operation [1,20-23]. This is because an accurate air-temperature prediction several hours ahead makes it possible to estimate the energy consumption and system load in advance. It was estimated that by using 24 h temperature forecasts, the U.S. electricity-generating industry saves $166 million annually [9]. Air-temperature forecasting is one of the major issues in the meteorology and climatology domains, which explains why it is fairly well-researched. However, that research concerns daily, monthly, or annual temperature time series measured with low and medium time resolution (long- and medium-term predictions) [24,25]. The time series of meteorological parameters for use in heating systems must concern much smaller time intervals (hours, minutes), and the methodology for forecasting such short-term data has yet to be developed. What is more, Kováč et al. [12] considered the lack of precise instructions for architectural design as the biggest problem for using neural networks. Among the research conducted so far, it is difficult to find work devoted solely to this subject. Usually, meteorological parameter prediction is used at an early stage in heating-load estimation without delving into its methodology [20]. Moreover, Zhao et al. [14] pointed out that most studies ignore the fact that actual and predicted meteorological data tend to differ, an issue that is not adequately discussed in terms of heating or cooling load forecasting.
Wang et al. [22] developed and evaluated outdoor air-temperature prediction 3 h ahead based on an approximate pattern-matching algorithm (called TPAM) using historical meteorological data from five cities in China (Beijing, Harbin, Shanghai, Guangzhou, and Kunming) that represent different climatic zones. The parameters of the algorithm were defined for each city based on its historical data from January 2000 to December 2018 and then tested for 2019. The aim of the research was to implement temperature prediction by using only one meteorological parameter and skipping the time window. The obtained outdoor temperature forecast performance was between 71.9% and 95.6%, with an average of 80.0%.
Demirezen and Fung [21] introduced a computer program based on artificial neural networks to predict the outdoor temperature at time resolutions of 2 min and 1 h covering data from March 2018 in Strathroy, Canada. Input variables included relative humidity (RH), wind direction, wind velocity, and global horizontal irradiance (GHI) measured with the same time resolution as ambient temperature. The practical objective of the research was to assess the possibility of replacing temperature sensors with a data-driven model using weather-station data in automated systems, in hybrid energy systems, or other energy systems requiring accurate ambient-temperature forecasts. The accuracy of the temperature prediction, as described by the correlation coefficient, was 0.95 on average, indicating that ANN (artificial neural networks) is a powerful tool in outdoor temperature prediction.
Zhang et al. [23] proposed an air-temperature forecasting model that was based on the physical relationship between air temperature, direct solar radiation, cloud and ground heat radiation, and heat convection. They used sub-hourly data for the period 1 April to 8 April 2012 from a weather station situated at Nanyang Technological University, Singapore. The solar radiation profile of the past 12 h was used as the model input. The error (MAE) obtained for 0.5 h was less than 1 K when the forecasting horizon was less than 2 h. The authors noted that the model accuracy decreases as the prediction interval increases because of the uncertainty in the variability of air characteristics over an extended time interval.
Papantoniou and Kolokotsa [26] developed ANN models for outdoor temperature prediction with a 4-24 h horizon for four European cities, namely Ancona (Italy), Chania (Greece), Granada, and Mollet (Spain), mostly on the basis of data for 1 year. As input variables, time (minutes of the day), outdoor temperature, solar radiation, wind speed, and relative air humidity were all selected. For each city, the authors tested different neural networks (feed forward, cascade, and Elman) to identify one most suitable network. The annual performance of ANN models was r² > 0.9 and RMSE < 2.0 °C.
Rodriguez et al. [27] predicted the ambient air temperature in Vitoria-Gasteiz, Basque Country for the next 10 min with a combination of the multilayer perceptron and the optimal n-nearest station methods with control parameters in solar photovoltaic generation. The authors established the season, time, distance, and relative position between the target stations and selected stations, with the air temperature of the target and selected stations in the last 24 h as the input variables. The obtained results indicate that the accumulated difference between the measured and predicted temperature was lower than 1% in 96.60% of days in the validation subset, with a root mean square error of 0.2557 °C.

Objective
Artificial neural network models were used with the aim of finding a technique for predicting the hourly time series of ambient air temperature that is simple yet as effective as hybrid models. The multilayer perceptron (MLP) was chosen as the most popular network type [27]. The research presented in this paper refers to the heating supply in the Polish building sector (a multifamily building). The authors created prediction models of outdoor air-temperature changes based on historical time series [1], using combinations of only one parameter (temperature) [22] as input variables, for the reasonable management of a boiler in a multifamily residential building during the heating season.
The aim of the research was to analyse in detail the created neural models with a view to addressing the following questions: how many and which hours preceding the predicted values may be used as the basis for the most accurate prediction of temperature; and is it possible to improve the quality of such prediction by increasing the number of input variables as well as the number of neurons in the hidden layer [27]? Thus, the study concerns model-input selection for outdoor temperature prediction and the selection of the optimum neural model. To that end, the Akaike information criterion was used [28,29].
The research results may be applied in the optimization of the heating-system control strategy by using the prediction of outdoor air temperature in a timeframe of several hours, which allows for improving the process of controlling heating systems and implementing an efficient and reasonable fuel-management strategy [30]. Due to the climate in Poland, a major part of the energy sold to households (65.1% in 2018) is used for heating premises during winter [31]. However, as a result of the climate change that we are currently witnessing, irrespective of whether it is due to natural or anthropogenic reasons, not to mention the heatwaves that tend to occur during summer, the demand for ventilation and air-conditioning systems is increasing as well. As the reasonable use of such systems should also be based on the prediction of outdoor air temperature, the results of the present study may be successfully and effectively used for the optimisation of the operation of air-conditioning systems in summer.
The main contributions of the paper are the following:
i. Short-term outdoor-temperature time-series prediction using neural networks.
ii. Research into the past moments of outdoor air temperature as model input.
iii. Detailed analysis of the created models in terms of the selection of predictors and the number of neurons in the hidden layer.
iv. Evaluation of the Akaike information criterion's suitability for the selection of the most accurate model.
v. Assessment of ambient-temperature prediction results in HVAC applications.

Database
The authors used an archived time series of outdoor air temperature values obtained from the monitoring of the heating system in a multifamily building during the heating period (October-May). The analysed structure consisted of 31 residential premises located across five storeys. Heat was supplied to the building from a local gas boiler located in the basement, which was used by the central-heating and warm-water supply systems. The boiler-control process was based on heating curves that enabled it to adjust the heating source to the needs of the building.
The database for analyses consisted of a matrix that includes 4335 lines with the following hours in the analysed period and 24 columns, where the hourly time series of ambient temperature were placed in one column (output), with an air temperature

Neural Network Modelling
The initial assumption for creating neural models of outdoor air-temperature changes was that their architecture should be as simple as possible. Nash and Sutcliffe [32], as well as Wheater et al. [33], pointed to the need for simplicity (and lack of duplication) in the model structure as the main principles for building models with optimized parameters. Multilayer perceptron neural networks were used in the study, as they are most commonly used in practice in various applications [27,34-36].
To achieve this aim, the multilayer perceptron (MLP) model was used in the study. MLP is a feedforward (unidirectional) neural network consisting of one or two hidden layers of nonlinear activation nodes, which makes it suitable for modelling nonlinear phenomena. The data stream goes in one direction, from the input layer, through the hidden layer (or layers), to the output layer. This very popular neural network is trained with a supervised learning method called the backpropagation (BP) algorithm. Classically, it works in four steps: (i) initialization of weights with low random values; (ii) feedforward propagation (each input neuron receives a signal and sends it to the hidden neurons; the hidden neurons calculate the activation function and send a signal to the output; the output neuron calculates the output signal); (iii) backpropagation (the output value is compared with the actual value, and the error is calculated and sent back to all units); (iv) weight correction [37].
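The four backpropagation steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes tanh hidden units, a linear output neuron, per-sample gradient descent, and a hypothetical learning rate, epoch count, and toy data set (`make_toy_samples`).

```python
import math
import random


def init_weights(n_in, n_hidden, rng):
    """Step (i): initialise all weights with small random values."""
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]  # last = bias
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]  # last = bias
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Step (ii): propagate the inputs through the hidden layer to the output."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in w_hidden]
    output = sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1]
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.05):
    """Steps (iii) and (iv): send the error back and correct the weights."""
    hidden, output = forward(x, w_hidden, w_out)
    error = output - target  # derivative of 0.5 * error**2 w.r.t. the output
    # Error signal of each hidden neuron; tanh'(a) = 1 - tanh(a)**2.
    delta_hidden = [error * w_out[j] * (1.0 - hidden[j] ** 2)
                    for j in range(len(hidden))]
    for j, h in enumerate(hidden):          # output-layer correction
        w_out[j] -= lr * error * h
    w_out[-1] -= lr * error
    for j, row in enumerate(w_hidden):      # hidden-layer correction
        for i, v in enumerate(x):
            row[i] -= lr * delta_hidden[j] * v
        row[-1] -= lr * delta_hidden[j]
    return 0.5 * error ** 2


def make_toy_samples(n=20, seed=1):
    """Hypothetical data: three scaled inputs and a smooth target."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        samples.append((x, 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[2]))
    return samples


def train(samples, n_hidden=3, epochs=200, seed=0):
    """Train a 3-3-1 style network and return weights plus the loss per epoch."""
    rng = random.Random(seed)
    w_hidden, w_out = init_weights(len(samples[0][0]), n_hidden, rng)
    history = []
    for _ in range(epochs):
        history.append(sum(train_step(x, t, w_hidden, w_out)
                           for x, t in samples))
    return w_hidden, w_out, history
```

Calling `train(make_toy_samples())` returns the trained weights and the summed loss per epoch, which should fall as training proceeds. The second-order methods named in the study (CG, BFGS, Levenberg-Marquardt) replace only the weight-correction rule of step (iv).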
The research employed a one- or two-stage learning process with the use of the backpropagation algorithm, modern second-order algorithms such as conjugate gradients (CG), the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) procedure, and the Levenberg-Marquardt method, as well as weight reduction by the Weigend method to avoid overfitting of the network. CG, BFGS, and Levenberg-Marquardt are iterative techniques for solving nonlinear optimization problems; they are faster than BP and therefore more readily used. Linear input and output layers, hyperbolic hidden-layer neuron-activation functions, and a varied number of epochs (50 to 2000) were implemented. The time series were first divided into three subsets: learning (50% of the total number of observations, or 2167 cases), validation, and test (25% of observations each, or 1084 cases each).
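The subset sizes quoted above follow from the 4335 hourly observations. A minimal sketch of such a split is given below; the rounding rule and the use of consecutive (chronological) blocks are assumptions, since the text does not state how cases were assigned to the subsets.

```python
def split_sizes(n_total):
    """Return (learning, validation, test) sizes for a 50/25/25 split."""
    n_learn = n_total // 2
    n_val = (n_total - n_learn) // 2
    n_test = n_total - n_learn - n_val
    return n_learn, n_val, n_test


def split_series(series):
    """Split a sequence into three consecutive blocks (assumed ordering)."""
    n_learn, n_val, _ = split_sizes(len(series))
    return (series[:n_learn],
            series[n_learn:n_learn + n_val],
            series[n_learn + n_val:])


print(split_sizes(4335))  # (2167, 1084, 1084)
```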
Neural models with a varied number of inputs (1-10), different network architectures in terms of the number of neurons in the hidden layer (2-10), and different time lags of the input time series with respect to the target air temperature (1-24 h) were analysed. The number of neurons in the hidden layer ranged from 2 to 10 because too few neurons would render the network incapable of correctly representing the function, while too many would lengthen the learning time and could lead to overfitting. The analyses involved several variants of the ANN architecture for various input variables and delays. The results presented in Tables 1 and 2 concern the best combination of predictors.
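The lagged inputs described above can be built from a single hourly series as follows. This is a sketch only: the function name `build_lagged_dataset` is hypothetical, and the default lags (1, 2, and 24 h) are the combination the study reports as best.

```python
def build_lagged_dataset(series, lags=(1, 2, 24)):
    """Pair each target value with the values observed `lags` hours earlier."""
    max_lag = max(lags)
    inputs, targets = [], []
    # The first `max_lag` hours have no complete history and are skipped.
    for t in range(max_lag, len(series)):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return inputs, targets


# Toy series 0, 1, ..., 29 standing in for hourly temperatures.
X, y = build_lagged_dataset(list(range(30)))
print(X[0], y[0])  # [23, 22, 0] 24
```

Changing the `lags` tuple gives the other input combinations examined in the study, for example `(2, 3, 24)` or `(10, 11, 24)`.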

Quality Metrics
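Model quality was assessed based on the value of the sum of squared errors (SSE), which is determined by the relationship:
$$\mathrm{SSE} = \sum_{i=1}^{n} (x_i - y_i)^2$$
where n is the number of observations, $x_i$ is the predicted data, and $y_i$ is the observed data. The Pearson linear correlation coefficient, r, between the actual values and those obtained from the model was assessed as well:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where the symbols are the same as above, and $\bar{x}$ and $\bar{y}$ are the mean predicted and observed values.
Additionally, the quotient of standard deviations, ξ, was calculated according to the following formula:
$$\xi = \frac{\sigma_\delta}{\sigma_x}$$
where $\sigma_\delta$ is the standard deviation of prediction errors and $\sigma_x$ is the standard deviation of the target variable.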
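The three metrics described above (SSE, r, and ξ) can be computed as in the following sketch. It assumes the standard textbook definitions and uses hypothetical function names and toy values; ξ does not depend on whether population or sample standard deviations are used, as long as both are of the same kind.

```python
import math
import statistics


def sse(predicted, observed):
    """Sum of squared differences between predicted and observed values."""
    return sum((x - y) ** 2 for x, y in zip(predicted, observed))


def pearson_r(predicted, observed):
    """Pearson linear correlation between predicted and observed values."""
    mean_x = sum(predicted) / len(predicted)
    mean_y = sum(observed) / len(observed)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    var_x = sum((x - mean_x) ** 2 for x in predicted)
    var_y = sum((y - mean_y) ** 2 for y in observed)
    return cov / math.sqrt(var_x * var_y)


def std_ratio(predicted, observed):
    """Quotient of the error standard deviation and the target's."""
    errors = [x - y for x, y in zip(predicted, observed)]
    return statistics.pstdev(errors) / statistics.pstdev(observed)


predicted = [1.0, 2.0, 3.0, 5.0]  # toy values, not data from the study
observed = [1.0, 2.0, 4.0, 4.0]
print(sse(predicted, observed))   # 2.0
print(round(pearson_r(predicted, observed), 4))
print(round(std_ratio(predicted, observed), 4))
```

A perfect prediction gives SSE = 0, r = 1, and ξ = 0, so lower SSE and ξ and higher r indicate a better model.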
To select the best prediction model (with the optimal combination of input variables), the Akaike information criterion (AIC) was applied [38-40]. The AIC is one of the most widespread tools in statistical modelling. It is also one of the metrics of the model fit and a criterion for choosing between statistical models with a different number of predictors. It was introduced in 1973 by Hirotugu Akaike as an extension of the maximum likelihood principle and became the first model-selection criterion to gain common acceptance. Maximum likelihood is used to estimate the model parameters when the model's structure and dimension have been defined. Akaike's crucial idea was to combine the processes of estimation as well as structural and dimensional determination into a single procedure [41].
In general, a model with more predictors gives more accurate predictions but is also more likely to overfit, so it is important to find the optimum combination of input variables.
The Akaike criterion value is calculated based on the following equation:
$$\mathrm{AIC} = -2\ln(m) + 2k$$
where m is the maximum likelihood and k is the number of estimated parameters of the model [28,29].
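For least-squares models, the maximum likelihood in the AIC is often evaluated under the assumption of independent Gaussian errors whose variance is estimated as SSE/n. The sketch below follows that assumption; the text does not state which likelihood form or which n was used, so it need not reproduce the AIC values reported in the tables. The function names are hypothetical.

```python
import math


def gaussian_log_likelihood(sse, n):
    """Maximised log-likelihood for i.i.d. Gaussian errors, variance SSE/n."""
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(sse / n) + 1.0)


def aic(log_likelihood, k):
    """AIC = -2 ln(m) + 2k, where ln(m) is the maximised log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * k


def aic_from_sse(sse, n, k):
    """AIC of a least-squares model under the Gaussian-error assumption."""
    return aic(gaussian_log_likelihood(sse, n), k)
```

With the fit held fixed, each extra parameter raises the AIC by 2, so a more complex model is preferred only if it lowers the error enough to offset that penalty.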

Validation Modelling
To validate the obtained results, linear regression (LR) and support vector regression (SVR) models were created for the best combination of input variables.
In the linear regression approach, the authors assumed that the predicted value is a linear combination of all input variables (hourly time series of various combinations of lagged outdoor temperature). The optimal values of the coefficients can be found using linear algebra methods or gradient-based methods (such as stochastic gradient descent, SGD). As one of the simplest machine-learning methods, LR will probably not lead to overfitting.
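The gradient-based route mentioned above can be sketched with full-batch gradient descent, a simpler relative of SGD that uses all observations in every update. The learning rate, epoch count, and synthetic data are hypothetical, and inputs are assumed to be scaled to a similar range (raw temperatures would normally be standardised first).

```python
import random


def fit_linear_regression(inputs, targets, lr=0.1, epochs=2000):
    """Fit y ≈ bias + weights·x by gradient descent on half the MSE."""
    n, n_features = len(inputs), len(inputs[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(inputs, targets):
            err = bias + sum(w * v for w, v in zip(weights, x)) - y
            grad_b += err
            for i, v in enumerate(x):
                grad_w[i] += err * v
        # Average the gradients over all observations, then step downhill.
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Synthetic example: three scaled "lag" inputs and an exactly linear target.
rng = random.Random(0)
toy_inputs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(50)]
toy_targets = [1.0 + 0.5 * a + 0.3 * b - 0.2 * c for a, b, c in toy_inputs]
weights, bias = fit_linear_regression(toy_inputs, toy_targets)
print([round(w, 3) for w in weights], round(bias, 3))
```

Because the toy target is exactly linear, the fitted coefficients should approach 0.5, 0.3, and −0.2 with an intercept near 1.0.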
Support vector regression belongs to the family of support vector machine techniques. It is a linear regressor that is more tolerant of outliers and does not count as errors predictions that lie close enough to the correct value. It can model nonlinear dependencies by using the kernel trick, which allows for the expression of the value of some points by their relation to others. In the present research, the hourly outdoor air temperature was predicted with the use of an SVR with a radial basis function (RBF) kernel. The parameters of the SVR model that defined the width of the margin of trust and the maximum value of weight that may be determined for the given vector were ε = 0.1 and C = 10.0. The RBF was determined by the gamma parameter, γ (width of the kernel function), which was equal to 0.33.
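Two ingredients of the SVR described above can be illustrated directly: the ε-insensitive loss, which ignores errors within the margin, and the RBF kernel. The sketch assumes the common parameterisation exp(−γ‖a − b‖²) for the kernel; the text gives γ = 0.33 but not the exact kernel formula, and the function names are hypothetical.

```python
import math


def epsilon_insensitive_loss(predicted, observed, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(predicted - observed) - epsilon)


def rbf_kernel(a, b, gamma=0.33):
    """RBF kernel exp(-gamma * ||a - b||^2) between two input vectors."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-gamma * squared_distance)


print(epsilon_insensitive_loss(5.05, 5.0))            # 0.0, inside the margin
print(round(epsilon_insensitive_loss(5.5, 5.0), 3))   # 0.4
print(rbf_kernel([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))   # 1.0 for identical inputs
```

The parameter C = 10.0 then weights the summed ε-insensitive losses against the regularisation term, so a larger C penalises points outside the margin more strongly.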
The performance of comparative models was assessed by SSE, AIC, r, and ξ.

Outdoor Air-Temperature Prediction Results in HVAC Applications
The last stage of the research was to use the obtained results in HVAC applications. The best combination of ambient air-temperature lags (1, 2, and 24 h) was implemented in the forecasting models of the entering-water temperature (EWT) to the system, as well as indoor temperature in the two flats (with the lowest and the highest average temperature in the heating season). Models were created using LR, MLP, and SVR methods. Such models, obtained using the above lag combinations, were compared with models using the current temperature at a given hour as predictor. Their accuracy was assessed by MSE, r, and ξ.
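MSE was calculated according to the following formula:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (x_i - y_i)^2$$
where n is the number of observations, $x_i$ is the predicted data, and $y_i$ is the observed data.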

Neural Network Modelling
The first stage of research into outdoor temperature prediction consists of the evaluation of how the number of model inputs, understood as the number of input variables (air temperature time series delayed by 1 to 24 h), influences the prediction quality, as well as the determination of the best combination of input variables (Table 1). In the table below, the best models are presented.
In the case of one input variable, the prediction of outdoor temperature was based on the time series delayed by one hour in relation to the predicted time series. Ten input variables consisted of separate time series delayed by 1 to 9 h and by 24 h (Table 1).
Based on the AIC, it was determined that the best ability to predict outdoor temperature (the line marked in grey in the table) was noted for the ANN with the following architecture: three input variables, three neurons in the hidden layer, and one neuron in the output layer (MLP 3-3-1). The three input variables are the time series delayed by 1, 2, and 24 h in relation to the target time series. In this case, the sum of squared errors SSE was 4467.109, the quotient of standard deviations ξ was 0.170, and the AIC was 1300.098. By comparison, the model whose input values were delayed by 2, 3, and 24 h produced a much higher SSE (10,004.706), and its AIC was nearly three times higher (3642.126) (Table 2). Very high correlation coefficients between the actual and predicted temperature, equal to 0.98, were obtained in all cases presented in the table. The model with the poorest characteristics was the network that used a single time series delayed by 1 h in relation to the predicted one as the input variable. It was characterised by the highest SSE of 6199.361, AIC of 2223.594, and ξ of 0.201, which indicates that, in comparison to other models, the time series delayed by one hour alone contained significantly less of the information necessary to create the prediction.
The scatter plot for the best neural model (MLP 3-3-1) shows that the predictions match the observed data well (Figure 1a), as does the plot of the hourly time series (Figure 2). The residual histogram shows that in 1900 cases the differences between the two time series were in the range of (−1, 0] °C, and in 1500 cases the predicted outdoor temperature differed from the observed one by (0, 1] °C. Residuals in (1, 2] °C were noted 350 times and in (−2, −1] °C 300 times (Figure 1b).
The conducted analyses demonstrate that the input time series delayed by 24 h has a major influence on the quality of outdoor temperature prediction prepared with the use of ANN. This results from the cyclical nature of daily changes in ambient air temperature. Models that did not take this variable into account were characterised by a much lower quality. This occurred in the case described above (network 1) and in model no. 4 (Table 1).
The obtained results are consistent with those presented in the relevant literature. The values of SSE errors for the hourly data of the created models ranged from 4467 to 6199 (Table 1), which corresponds to MSE errors of 1.0-1.4 °C. Demirezen and Fung [21], who predicted outdoor air temperature with the use of neural networks for a cloud-based smart dual-fuel switching system using 2 min resolution data, obtained MSE in the range of 0.9-1.4 °C, while the accuracy of their prediction of hourly time series of temperature ranged from 1.0 to 5.1 °C. Papantoniou and Kolokotsa [26] forecasted temperature in four European cities (Ancona, Italy; Chania, Greece; Granada, Spain; and Mollet, Spain) with the use of neural networks with an MSE of up to 1.4 °C. Wang et al. [22] attempted short-term ambient-temperature prediction using approximate pattern matching for five cities in China. The models were described by MSE values ranging from 1.2 to 9.5 °C. Huang et al. [41] modelled solar greenhouse air temperature using Laplace transform, with accuracy measured by MSE = 2.59 °C. Tran et al. [42], who created medium-term air-temperature forecasts (several days ahead) using ANN, recurrent neural network (RNN), and long short-term memory (LSTM), obtained the best prediction results (average MSE = 7.4 °C) for LSTM. Bączkiewicz et al. [37] created MLP temperature models with MSE in a range of 4.7-6.7 °C. These researchers confirmed that even an MLP model with a simple construction could provide satisfactory results.
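The conversion from SSE to MSE quoted above can be checked by dividing each SSE by the number of observations. The sketch assumes n = 4335, the number of hourly rows given in the Database section; the text does not state explicitly which subset the SSE values refer to.

```python
N_OBSERVATIONS = 4335  # assumed: all hourly rows in the database


def mse_from_sse(sse, n=N_OBSERVATIONS):
    """Mean squared error obtained from a sum of squared errors."""
    return sse / n


best = mse_from_sse(4467.109)   # MLP 3-3-1, lags of 1, 2, and 24 h
worst = mse_from_sse(6199.361)  # single input lagged by 1 h
print(round(best, 2), round(worst, 2))  # 1.03 1.43
```

Under this assumption, the results agree with the 1.0-1.4 range stated in the text.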
Further analysis was carried out on the influence of the number of neurons in the hidden layer and the length of delay on the prediction performance (Table 2). The model characterized by the best AIC criterion (MLP 3-3-1) was analysed, and then the number of neurons in the hidden layer was increased from three to four (MLP 3-4-1) and then five (MLP 3-5-1).
The increased number of neurons in the hidden layer did not significantly improve the prediction quality. The values of the criteria selected for its evaluation in the three analysed models differed only slightly. In fact, the calculated r, ξ, and SSE values were almost identical in MLP 3-3-1, MLP 3-4-1, and MLP 3-5-1, with the difference between them having been observed only for AIC. For the first combination of delayed temperature (1, 2, 24), they equalled 1300, 1310, and 1320 and successively increased with the following lags. This suggests that there is no need to use more complex neural structures for predicting the outdoor air temperature. The initial assumption that networks with the least complex architecture should be used in the study has been confirmed by the obtained results.
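One possible reading of the AIC values above (1300, 1310, and 1320) is that, with almost identical fit, only the penalty term 2k changes between the three architectures. The sketch below counts the parameters of a one-hidden-layer perceptron, assuming that k includes all weights and biases; the text does not say how k was counted, so this is an interpretation rather than the authors' stated method.

```python
def mlp_parameter_count(n_inputs, n_hidden, n_outputs=1):
    """Weights plus biases of a perceptron with one hidden layer."""
    hidden_params = (n_inputs + 1) * n_hidden   # input weights + hidden biases
    output_params = (n_hidden + 1) * n_outputs  # output weights + output bias
    return hidden_params + output_params


for n_hidden in (3, 4, 5):
    k = mlp_parameter_count(3, n_hidden)
    print(f"MLP 3-{n_hidden}-1: k = {k}, penalty 2k = {2 * k}")
```

Under this assumption, each additional hidden neuron adds five parameters (three input weights, one bias, and one output weight), which raises the penalty by 10 and matches the step between the reported AIC values.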
The following stage in the analysis involved the application of further delays for the three input variables (the first column in Table 2). Although the number of neurons in the input layer was three in all presented cases, different lags in relation to the predicted outdoor air temperature were used. One of them was always 24 h, while the others ranged from 1 to 11 h. Past temperature time series between 12 and 24 h were also considered, but the results were much worse and were thus not included in this paper. A 24 h delay was included in all combinations, as it appeared to have the greatest impact on outdoor air-temperature prediction, which was shown at an earlier stage of the research (Table 1). Along with the increase in delay, the values of the sum of squared errors SSE, the AIC, and the quotient of standard deviations ξ also increased, while the value of the correlation coefficients decreased slowly but steadily. AIC for 10, 11, and 24 h lags amounted to around 6900, with the lowest for MLP 3-5-1 and the highest for MLP 3-3-1.

Validation Modelling
To validate the neural predictive modelling results, linear regression and support vector regression models were created for the input combination assessed as the best (lags of 1, 2, and 24 h in relation to the target time series). Although the correlation coefficients for LR and SVR were the same as for the MLP model (0.986), the SSE and AIC values were much higher than those obtained for MLP 3-3-1 (Table 3). The SSE obtained for MLP equalled 4467.109; for LR, it was 6952.597, and for SVR, 7428.476. The AIC calculated for the neural model was significantly lower than those of the two comparative models (AIC_LR = 3212.325; AIC_SVR = 1700.423). The lowest ξ value was noted for SVR (0.146); for MLP, it was slightly worse (0.170), whereas the highest was for LR (0.923). The validation results, obtained using other computational intelligence methods such as basic linear regression and the more advanced support vector regression, indicated that the best outdoor temperature prediction was still created using ANN. Accuracy metrics converted to MSE values were consistent with those found in the literature, quoted in Section 3.1.

Outdoor Air-Temperature Prediction Results in HVAC Applications
Verification of the results of air-temperature prediction in HVAC applications is presented in Table 4. Predictive models of the indoor temperature in two flats (those with the highest and lowest mean indoor temperature), as well as of EWT, were created using LR, MLP, and SVR. The best performance, according to different metrics, was marked in red. The obtained results indicated that more research should be conducted in that field. Taking into account only the outdoor temperature with the three lag combinations as inputs (T_0, T_{1,2,24}, and T_all), there was no unambiguous indication that T_{1,2,24} was the most accurate. Moreover, in many cases, e.g., temp. no. 4 and no. 8 predicted with SVR, or EWT predicted with linear regression, the most accurate models used all temperature-lag predictors (1-24 h). The best correlation coefficients were obtained on the basis of T_{1,2,24} only for the LR model of temp. no. 8, the MLP model of temp. no. 4, and the SVR model of EWT. The MLP model of the indoor temperature in flat no. 4 also achieved the best ξ value. The obtained MSE values were the lowest for temp. no. 8 (0.387-3.352); for temp. no. 4, they were 0.422-10.261, and the highest were for EWT (11.118-26.869).
The highest correlation coefficients between the observed and predicted values were obtained for entering-water temperature predictions (0.8-0.9), and the lowest for the indoor air temperature in apartment number 8 (0.086-0.579). In the case of apartment no. 8, where the highest average internal temperature was recorded among all flats during the analysed period, the thermal comfort of its residents was much less dependent on the outside temperature than on their habits. The internal mean temperature in flat no. 4 was the lowest compared to other apartments, and it was also the most correlated with the outdoor temperature as well as EWT.
Nevertheless, in the literature, many different meteorological parameters (apart from the outdoor temperature) were used for that purpose, such as wind speed, wind direction, dew-point temperature, air pressure, relative air humidity, and solar radiation [43]. Temperature characteristics of the building were also considered, such as its massive wall, inner and outer surface temperature, and heat flux [44], as well as parameters connected with occupancy: motion, air flow rate, or window opening [45]. On the other hand, the authors of [14,46] have also indicated that the outdoor temperature is the most sensitive meteorological parameter, with the greatest impact on the energy demand.
Ramadan et al. [47] made use of seven machine-learning models in their indoor temperature forecasting for a room of the Laboratory of Civil Engineering and Geo-Environment at Lille University, France. They concluded that the best prediction was obtained by ANN and extra trees (ET) regressor methods. The performance of the models was MSE < 1 and r > 0.9. Tzuc et al. [48] obtained a similarly high performance for indoor building temperature models created using MLP, a radial basis function neural network (RBF), and the group method of data handling (GMDH). The authors reported that r = 0.931 and MSE = 1.033 for the training subset and r = 0.929 and MSE = 1.321 for the testing subset. Fang et al. [49] predicted indoor temperature with the long short-term memory (LSTM) model in a low-energy-consumption building located in Grenoble, France. They obtained a mean MSE of around 0.25 °C.
Marmaras et al. [50] implemented a control technique to optimise a ground-source heat pump (GSHP) operation. They pointed out that using algorithms to control entering-water temperature may benefit the systems' efficiency. Using the proposed method, a 3% reduction in the energy consumption of the building was obtained. Do and Haberl [51] developed a ground-heat exchanger to calculate the EWT of the GSHP system. A comparison of the measured and calculated EWT indicated that the difference between them was 1.2 °C for the heating season and 1.6 °C for the cooling season.

Conclusions
Addressing the needs of various recipients in the field of heating, ventilation, and air conditioning in a sustainable manner while maintaining thermal comfort is one of the major challenges of modern society. To meet this challenge, the potential of intelligent technologies in HVAC systems is currently being evaluated in developed countries. Using computational intelligence methods has provided an opportunity to rationalise their operation by introducing a new class of regulators based on changes in weather conditions. Outdoor air-temperature prediction is crucial for a more energy-efficient HVAC system and leads to the optimisation of its management and operation because its accurate forecast several hours ahead can provide the energy consumption and system load in advance.
The presented research concerns neural short-term outdoor temperature prediction using past temperature values as input. The best model performance was obtained when one of the predictors was the temperature measured 24 h earlier, which is probably related to the daily cycle of changes in air temperature. Based on the Akaike information criterion, it was noted that the most accurate results were obtained with the use of an MLP 3-3-1 type of network, where the predictors were ambient air temperature time series observed 1, 2, and 24 h before the predicted temperature. Its performance, according to the correlation coefficient, Akaike information criterion, and sum of squared error, was 0.986, 1300.098, and 4467.109, respectively.
The obtained results have also shown that even neural models of the least complex structure can offer accurate prediction of the hourly outdoor air temperature. Multilayer perceptrons with one hidden layer, a linear activation function in the input and output layers, and a hyperbolic one in the hidden layer may be used successfully in that field. The AIC proved to be a useful method for the best-model selection in machine-learning modelling. What is more, neural network models provide the most accurate prediction when compared with other computational intelligence methods, like LR and SVR.
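The structure of such a network can be illustrated with a minimal NumPy sketch of the forward pass of an MLP 3-3-1. It assumes that the hyperbolic activation in the hidden layer is the hyperbolic tangent and that the output layer is linear; the weights are random placeholders rather than the trained values from this study, and the function name mlp_forward is hypothetical.

```python
import numpy as np

def mlp_forward(X, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer MLP.

    Hidden layer: tanh activation (assumed form of the hyperbolic function).
    Output layer: linear (identity) activation.
    """
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ W2 + b2

# Placeholder weights for a 3-3-1 network (not the trained model).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 3))   # 3 inputs -> 3 hidden neurons
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1))   # 3 hidden neurons -> 1 output
b2 = np.zeros(1)

# Five hypothetical input rows: temperatures 1, 2, and 24 h earlier.
X_demo = rng.normal(size=(5, 3))
y_demo = mlp_forward(X_demo, W1, b1, W2, b2)   # shape (5, 1)
```

A network of this size has only 16 adjustable parameters, which is consistent with the observation that a simple structure can be sufficient for this task.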
Validation of the outdoor air-temperature prediction in HVAC applications (indoor air temperature as well as entering-water temperature forecasting) has indicated that more research should be conducted in that field. The results did not clearly show that T_{1,2,24} gives the best forecasts in the aforementioned HVAC applications. Using the results obtained with the MLP 3-3-1 model for outdoor temperature as the only input, the MSE of the predicted indoor temperature was 0.569-6.131, and that of the predicted EWT was 12.581-26.869.
Although the study directly concerns ambient air-temperature prediction based on its measurement during the heating season, the demand for ventilation and air-conditioning systems is increasing as well due to climate change. As the reasonable use of such systems should also be based on the prediction of outdoor air temperature, the results of the present study may be used with success for the optimisation of the operation of air-conditioning systems in the summer.