Double-Target Based Neural Networks in Predicting Energy Consumption in Residential Buildings

Abstract: A reliable prediction of sustainable energy consumption is key to designing environmentally friendly buildings. In this study, three novel hybrid intelligent methods, namely the grasshopper optimization algorithm (GOA), wind-driven optimization (WDO), and biogeography-based optimization (BBO), are employed to optimize the multitarget prediction of heating loads (HLs) and cooling loads (CLs) in heating, ventilation, and air conditioning (HVAC) systems. Concerning the optimization of the applied algorithms, a series of swarm-based iterations are performed, and the best structure is proposed for each model. The GOA, WDO, and BBO algorithms are combined with a class of feedforward artificial neural networks (ANNs) called a multi-layer perceptron (MLP) to predict the HL and CL. According to the sensitivity analysis, the WDO with a swarm size of 500 yields the best-fitted ANN. The proposed WDO-ANN provided an accurate prediction of the heating load (training: R² = 0.977, RMSE = 0.183; testing: R² = 0.973, RMSE = 0.190) and the best-fitted prediction of the cooling load (training: R² = 0.99, RMSE = 0.147; testing: R² = 0.99, RMSE = 0.148).


Introduction
Concerns about reducing energy consumption in smart cities are growing. Indeed, the rate of energy consumption has risen markedly over the last two decades. For instance, an approximate increase in building energy consumption of 0.9% per year in the USA was reported by the US Energy Information Administration (EIA) [1]. Residential buildings (the main concern of the present study) consume approximately 38% of electrical energy; hence, almost one-fourth of the world's energy is consumed in residential buildings [2]. Considering climate change, environmental concerns, and the overall significant contribution of buildings and their high energy demands, new intelligent methods are required to reduce this energy consumption. The energy consumption of buildings has been investigated by various researchers, such as Zhou, et al. [3], Ahmad, et al. [4], and Mocanu, et al. [5].
Zhou, et al. [3] applied a feedforward artificial neural network (ANN) combined with a hybrid system (for example, considering a multivariable optimization process). Their results showed that the generated ANN-based data-driven training algorithm is more reliable than the traditional "lsqcurvefit" fitting method. Asadi, et al. [94] investigated the prediction of energy consumption for building retrofits. They used a combination of an ANN and a genetic algorithm to find the most proper structure of a hybrid algorithm and to assess the interaction among conflicting objectives. Mocanu, et al. [5] stated that deep learning-based methods aim to enhance the estimation capability of the predictive network by allowing a higher level of abstraction. Ahmad, et al. [4] stated that managing energy-efficient buildings and predicting energy consumption for all buildings are critical tasks for decision-makers, both to save energy effectively and to develop smart cities; they reviewed prediction techniques for estimating building electrical energy using the most up-to-date artificial intelligence (AI)-based solutions. Zhou, et al. [3] also investigated a new hybrid system combining active cooling, phase change materials, and hybrid ventilation.

HL and CL-Related Studies
Many researchers have also aimed to investigate the links between heating loads (HLs) and cooling loads (CLs) and their most influential parameters, such as wind speed, changing climate conditions, and light levels. Budaiwi and Abdou [95] successfully employed heating, ventilation, and air conditioning (HVAC) operational strategies to reduce the amount of energy consumed in buildings, using the example of buildings with intermittent occupancy. Nasruddin, et al. [96] studied HVAC system optimization to measure energy consumption in a particular building dataset, using an ANN as well as a multi-objective GA. Their findings provided possible design variables that can help achieve the best-fit HVAC system, and the results proved practical in terms of both annual energy consumption and thermal comfort. Min, et al. [97] explored indirect evaporative cooling energy recovery systems (for example, to pursue a high-quality indoor thermal environment) through a new statistical modeling approach; the proposed model was validated against a series of experimental data. Roy, et al. [98] predicted the heating load in buildings by employing a hybrid model of multivariate adaptive regression splines (MARS) and an extreme learning machine (ELM). Liu, et al. [99] studied hierarchical modeling and temperature-interval techniques to optimize the prediction of a multi-layer hybrid building's cooling and heating loads; the obtained results were compared with basic estimation models, showing that the accuracy of the proposed models is significantly enhanced [99]. Kavaklioglu [100] proposed a new robust modeling of cooling and heating loads for efficient residential building design. Moayedi, et al. [101] investigated the social behavior of elephant herds in order to predict the cooling load of residential buildings.
In this sense, the technique of elephant herding optimization (EHO) was employed to optimize the trend of the HVAC system. In another study, the k-fold cross-validation method was used to validate the proposed hybrid models, as discussed by Qiao, et al. [102]. Conventional modeling such as ordinary least squares was employed to enable a better comparison between the obtained results and traditional techniques. The results showed that the developed methods are excellent in providing links between the influential factors and the targets, namely the heating and cooling loads.

Novelty and Proposed Methods
Given the applicability of metaheuristic algorithms to a wide range of optimization problems (for example, [13,93,103–111]), this study is particularly concerned with thermal load analysis for residential buildings, whose real operation highly relies on building design criteria. Building design codes try to reduce energy consumption by considering two key terms: CLs and HLs. HVAC is a well-known technology for indoor and vehicular environmental comfort. The concept of HVAC is a key issue in designing residential building structures such as apartment buildings, hotels, family homes, senior living facilities, office and industrial buildings (for example, medium to large), hospitals, and skyscrapers. Indeed, the subject of HVAC is not limited to buildings; it can also be considered for commonly used vehicles such as trains, cars, airplanes, and large ships, as well as submarines operating in marine environments. The main objective of an HVAC system is to provide acceptable indoor air quality and thermal comfort. Although there are many new techniques to calculate HLs and CLs, there is still a lack of proper intelligent techniques for predicting these parameters. Reviewing the current literature, numerous studies have used popular optimization methods [112,113], but the lack of multiobjective consideration of both HL and CL, as well as of more up-to-date hybrid intelligent solutions, constitutes a gap of knowledge in the simulation of HVAC systems. Therefore, this article recommends three novel multiobjective intelligent solutions based on the grasshopper optimization algorithm (GOA), wind-driven optimization (WDO), and biogeography-based optimization (BBO) to predict the multiple targets of HLs and CLs of residential buildings. These algorithms enhance the current ANN-based solutions.
In this regard, over 33,000 model iterations are performed to provide the best-fit structure of combined algorithms.

Methodology and Theoretical Background
The steps taken to fulfill the objective of the current study are illustrated here as well as in Figure 1: (a) First, data preprocessing: in this step, the collected dataset of heating loads and cooling loads is separated randomly into two main sections, called the training dataset and the testing dataset. Following a well-established train/test selection, 70% of the whole dataset is used for training the models, to create a proper predictive network and to connect both targets (HLs and CLs) with their influential factors. The remaining 30% is reserved for testing and validation; none of the testing data were involved in the learning process. (b) Second, MATLAB 2018 is used to (i) define the most proper structure of the multi-layer perceptron (MLP) neural network (in terms of the number of hidden neurons), and (ii) build three hybrid models by optimizing the computational parameters of the MLP (that is, its biases and assigned weights) using the GOA, WDO, and BBO algorithms. After a series of trial-and-error parametric processes (used to find the most appropriate hyperparameters of the hybrid models), the best population size is determined for each model. Next, the outputs (HLs and CLs) are predicted by the proposed models. (c) In the last step, by employing the remaining 30% of the data (the testing dataset), the error performance of the proposed models is calculated from the differences between the real measured values and the predicted values obtained from the proposed models.
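The random 70/30 partition described in step (a) can be sketched as follows (a minimal illustration in Python with synthetic data; the function and variable names are our own, not from the study):

```python
import numpy as np

def split_dataset(X, y, train_ratio=0.7, seed=42):
    """Randomly partition the samples into training and testing subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle the sample indices
    n_train = int(train_ratio * len(X))    # 70% of the data for training
    return X[idx[:n_train]], y[idx[:n_train]], X[idx[n_train:]], y[idx[n_train:]]

# 768 samples, 8 influential factors, 2 targets (HL and CL), as in the study
X = np.random.rand(768, 8)
y = np.random.rand(768, 2)
X_tr, y_tr, X_te, y_te = split_dataset(X, y)
print(len(X_tr), len(X_te))  # 537 231
```

With 768 samples, this yields exactly the 537/231 training/testing split reported later in the paper.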
The following sections describe the MLP, GOA, WDO, and BBO algorithms that were used.

Multilayer Perceptron
These tools are widely applied for modeling complex engineering problems. Artificial neural networks (ANNs), also called neural networks, were first introduced by McCulloch and Pitts [114]. An ANN is a computing system vaguely inspired by biological neural networks, an idea primarily taken from animal brains. This artificial intelligence (AI)-based technique tries to relate a series of input layers to one or more output layers by establishing non-linear equations [115]. A common ANN structure includes input, hidden, and output layers (Figure 2).

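The 8 × 5 × 2 MLP structure adopted later in this study can be illustrated with a minimal forward pass (a sketch only: the random weights below are placeholders standing in for the values found by GOA/WDO/BBO training, and the hidden activation is assumed to be the tangent-sigmoid discussed at the end of this paper):

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a single-hidden-layer perceptron.

    The hidden layer uses the tangent-sigmoid activation; the output
    layer is linear, a common choice for regression targets.
    """
    h = np.tanh(W1 @ x + b1)    # hidden layer: 5 neurons
    return W2 @ h + b2          # output layer: 2 targets (HL and CL)

# 8 inputs -> 5 hidden neurons -> 2 outputs, as in the final structure
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((5, 8)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((2, 5)), rng.standard_normal(2)
y = mlp_forward(rng.standard_normal(8), W1, b1, W2, b2)
print(y.shape)  # (2,)
```

The metaheuristics described next only replace how `W1`, `b1`, `W2`, and `b2` are found; the forward computation itself is unchanged.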

Grasshopper Optimization Algorithm
Grasshoppers may be seen in nature individually or in swarms. They are plant-feeding insects and are considered pests (that is, organisms detrimental to crops) because they can severely harm crops and pasture [116]. The idea of the grasshopper optimization algorithm (GOA) is taken from the natural behavior of grasshoppers; Saremi, et al. [117] developed the GOA for solving search optimization problems. Like many previously published nature-inspired algorithms (for example, the whale optimization algorithm, the ant lion optimizer, and the salp swarm algorithm), the GOA operates in two major steps: exploration and exploitation. As in grasshoppers' foraging behavior, these steps are performed to seek a food source. In the exploitation phase, the search agents fly locally over the search area. This process is illustrated in Figure 3, which is adapted from [117]. According to Mafarja, et al. [118], who explored the behavior of these relations for different levels of the algorithm parameters, when the distance between two grasshoppers is between 0 and 2.079, repulsion takes place between them; if the distance exceeds 2.079, the grasshopper enters a comfort zone.

Equation (1) expresses the movement of grasshoppers, where X_i is the position of the ith grasshopper:

X_i = r1 S_i + r2 G_i + r3 A_i, (1)

where r1, r2, and r3 are random values between 0 and 1, and S_i, G_i, and A_i denote the social interaction, the gravity force, and the wind advection, respectively.
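Equation (1) and the 2.079 comfort-zone boundary mentioned above can be sketched as follows (the social-interaction function s(d) = f·exp(−d/l) − exp(−d) with defaults f = 0.5 and l = 1.5 follows the original GOA paper; with these values, attraction and repulsion balance at d ≈ 2.079):

```python
import numpy as np

def grasshopper_position(S_i, G_i, A_i, rng):
    """Equation (1): X_i = r1*S_i + r2*G_i + r3*A_i,
    with r1, r2, r3 drawn uniformly from [0, 1]."""
    r1, r2, r3 = rng.random(3)
    return r1 * S_i + r2 * G_i + r3 * A_i

def social_force(d, f=0.5, l=1.5):
    """Social interaction s(d) = f*exp(-d/l) - exp(-d).

    Negative values mean repulsion, positive values attraction;
    the zero crossing sits at d ~= 2.079, the comfort-zone boundary."""
    return f * np.exp(-d / l) - np.exp(-d)

print(social_force(1.0) < 0, social_force(3.0) > 0)  # True True (repulsion, attraction)
```

Evaluating `social_force` near d = 2.079 confirms the boundary reported by Mafarja, et al. [118].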


Wind-Driven Optimization
The technique of wind-driven optimization (WDO) was introduced by Bayraktar, et al. [119]. The early idea of WDO was initially formed for electromagnetics applications. In this technique, the four most influential forces are the Coriolis force (F_C), the pressure gradient force (F_PG), the frictional force (F_F), and the gravitational force (F_G). The main idea behind the WDO method is that, to reduce computational complexity, the air parcels are considered weightless and dimensionless. Assuming that δV and ∇P are the finite volume and pressure gradient of the air, Equation (6) gives the main force due to the pressure gradient. The frictional force F_F (shown in Equation (7)) opposes the air movement triggered by F_PG. The gravitational force F_G (Equation (8)) pulls the parcels toward the earth's center from every dimension, and the Coriolis force F_C (Equation (9)) accounts for the deflection of the air parcel motions.
In Equations (6)-(9), u is the wind velocity vector, ρ is the density of a small air parcel, g is the gravitational constant, θ symbolizes the earth's rotation, and α is a frictional coefficient.
By considering the above-mentioned forces together with the ideal gas equation, the finalized update equations are provided by Derick, et al. [120]. Since the air velocity mainly depends on the pressure value, the velocity is adjusted to bring the system toward higher pressure; accordingly, the parcels are ranked in descending order of pressure, and Equation (10) is adapted. The position and velocity are then updated by the following equations, where i denotes the rank: U_cur and U_new signify the velocity in the current and next iteration, respectively; x is the air parcel position, and x_opt and x_cur are the optimal and current positions. The term U_other-direction is set equal to F_C, while the constant C is calculated as C = −2RT. This progressive update continues until the stopping criteria are met; these criteria can be an objective-function threshold or a pre-defined number of iterations. Full details of the WDO technique are given in previous studies such as Bayraktar, et al. [119] and Bayraktar, et al. [121].
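The velocity and position updates at the core of WDO can be sketched as follows (a schematic of the canonical formulation by Bayraktar, et al.; the coefficient values below are illustrative placeholders, not the tuned values of this study):

```python
import numpy as np

# Illustrative WDO coefficients (friction, gravity, RT, Coriolis, speed cap);
# these are NOT the tuned values used in the paper.
ALPHA, G, RT, C, VMAX = 0.4, 0.2, 3.0, 0.4, 0.3

def wdo_update(x_cur, u_cur, x_opt, rank, rng):
    """One WDO step for an air parcel of pressure rank `rank` (1 = best).

    The new velocity combines friction, gravity, the pressure-gradient
    pull toward the best position, and a Coriolis-like cross term taken
    from the velocity of a randomly permuted dimension.
    """
    u_other = u_cur[rng.permutation(len(u_cur))]   # velocity from another dimension
    u_new = ((1 - ALPHA) * u_cur
             - G * x_cur
             + RT * abs(1.0 / rank - 1.0) * (x_opt - x_cur)
             + C * u_other / rank)
    u_new = np.clip(u_new, -VMAX, VMAX)            # limit the wind speed
    return x_cur + u_new, u_new                    # unit time step

rng = np.random.default_rng(2)
x, u = rng.uniform(-1, 1, 4), np.zeros(4)
x_new, u_new = wdo_update(x, u, np.zeros(4), rank=2, rng=rng)
print(x_new.shape)  # (4,)
```

In a full optimizer this step would be applied to every parcel each iteration, followed by re-ranking by pressure (here, the MLP cost).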

Biogeography-Based Optimization
BBO was established based on the distribution of species in nature and on biogeography knowledge. BBO was first developed by Simon [54]. It was later expanded by other researchers, notably Mirjalili et al. [55], who enhanced the initial algorithm and combined it with the MLP structure to form BBO-MLP. Due to its excellent predictive ability, many scholars have recommended it for complex engineering problems [122,123]. The combination with the MLP aims to optimize the MLP's performance, since BBO is a population-based search technique. The flowchart of the original algorithm, before being combined with the MLP, is shown in Figure 4, which is adapted from [54,55]. Similar to many other population-based optimization algorithms, BBO starts by generating a random population, called a habitat.
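The migration step at the heart of BBO can be sketched as follows (a simplified linear migration model; the rates and population sizes are illustrative only, not those of the study):

```python
import numpy as np

def bbo_migration(habitats, fitness, rng):
    """One simplified BBO migration sweep.

    Habitats are ranked by fitness; better habitats get higher
    emigration rates (mu) and lower immigration rates (lam).
    Each decision variable may be replaced by the same variable
    from an emigrating habitat chosen in proportion to mu.
    """
    n, d = habitats.shape
    order = np.argsort(fitness)                 # best (lowest cost) first
    mu = np.linspace(1.0, 0.0, n)               # emigration rate by rank
    lam = 1.0 - mu                              # immigration rate by rank
    new = habitats.copy()
    for r, i in enumerate(order):
        for j in range(d):
            if rng.random() < lam[r]:           # habitat i imports variable j
                src = order[rng.choice(n, p=mu / mu.sum())]
                new[i, j] = habitats[src, j]
    return new

rng = np.random.default_rng(3)
pop = rng.uniform(-1, 1, (6, 4))                # 6 habitats, 4 variables
out = bbo_migration(pop, rng.uniform(0, 1, 6), rng)
print(out.shape)  # (6, 4)
```

In BBO-MLP, each habitat vector would hold the flattened weights and biases of the MLP, and the fitness would be the network's prediction error.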

Data Collection and Statistical Analysis
The primary dataset of this study is obtained from Tsanas and Xifara [124], as uploaded to an open-source machine learning repository (http://archive.ics.uci.edu/ml/datasets/Energy+efficiency, accessed on 26 December 2020). The dataset was prepared using the Ecotect computer software [125], where 12 different types of residential buildings (hypothetically located in Athens, Greece, each containing 7 persons, with a volume of 771.75 m³ composed of 18 elementary cubes with dimensions of 3.5 m × 3.5 m × 3.5 m) were simulated with changes in their influential design parameters. Common materials with U-values of 1.780, 0.860, 0.500, and 2.260 for the walls, floors, roofs, and windows, respectively, were used for simulating the buildings. Calculating the glazing areas as a ratio of the floor area, three ratios were considered: 10%, 25%, and 40%. The applied distributions of this parameter are as follows: (a) uniform, with 25% glazing on each side; (b) north, 55% on the north side and 15% on each other side; (c) east, 55% on the east side and 15% on each other side; (d) south, 55% on the south side and 15% on each other side; and (e) west, 55% on the west side and 15% on each other side. Some samples had no glazing areas. The main objective was to measure the values of HL and CL (both in kWh/m²) for the designated residential buildings. Each of the proposed residential buildings follows a unique structure, varying in design parameters such as relative compactness, roof area, surface area, wall area, orientation, glazing area distribution, overall height, and glazing area: for instance, four orientations, five distribution scenarios, and four sets of glazing areas (0, 10, 25, and 40% of the entire floor area). Therefore, a total of 768 design examples were simulated, accounting for eight key design factors. The targets were the HL and CL required for the particular residential buildings.
More description of the employed dataset can be found in other related studies, such as Chou and Bui [126] and Tsanas and Xifara [124]. In this article, both the HL and the CL are taken into the calculation procedure as the main target variables, to be predicted by the three proposed hybrid intelligence techniques: GOA-MLP, WDO-MLP, and BBO-MLP. Of the 768 samples in the main dataset, 537 rows (that is, 70%) are chosen for designing the training pattern, and the remaining 231 rows (that is, 30%) are used to validate and test the trained networks. The scatter plots of Figure 5 show the influential factors that have been used as the main inputs and their relationship with both chosen targets, HL and CL. To help understand the variability of the provided datasets, descriptive statistics of the input variables are tabulated in Table 1.
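As a sanity check, the total of 768 samples follows from the design factors described above (the factorization below into shapes, orientations, and glazing scenarios is our reading of the dataset description):

```python
# Design-space arithmetic for the Tsanas and Xifara dataset:
# 12 building shapes x 4 orientations x glazing scenarios.
shapes = 12
orientations = 4
glazing_ratios = 3          # 10%, 25%, 40% of floor area
distributions = 5           # uniform, north, east, south, west
no_glazing = 1              # samples without any glazing area

scenarios = glazing_ratios * distributions + no_glazing   # 16
samples = shapes * orientations * scenarios
print(samples)  # 768

# 70/30 split used in the study
print(int(0.7 * samples), samples - int(0.7 * samples))  # 537 231
```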

Results and Discussion
This investigation aimed to predict HLs and CLs in buildings utilizing artificial intelligence-based predictive tools. An MLP neural network is employed to estimate the HLs and CLs. The prepared dataset is separated into two sections for training and testing the models. The first section, comprising 70% of the whole database, is used for training the ANN models (the training dataset), while the remaining 30% is used for evaluating the network performance (the testing dataset). The testing dataset, selected in each stage of the network simulation, contains data not seen during the training step. Two statistical indices, the coefficient of determination (R²) and the root mean square error (RMSE), are employed to compute the network error and the regression between the target values and the system outputs for HLs and CLs. These widely used indices are presented in Equations (9) and (10), respectively.

where Y_i^actual, Y_i^predicted, and Y_mean indicate, for each simulation step, the measured value, the predicted value, and the mean of the target (HL or CL), respectively, and N is the number of data points.
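The two indices can be computed as follows (a minimal Python sketch of RMSE = sqrt(mean((Y_actual − Y_predicted)²)) and R² = 1 − SS_res/SS_tot; the small example values are illustrative):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = np.array([10.1, 11.8, 14.2, 15.9])
print(round(rmse(y_true, y_pred), 3), round(r_squared(y_true, y_pred), 4))  # 0.158 0.995
```

A perfect prediction gives RMSE = 0 and R² = 1; the reported scores in this study are computed in exactly this way for each target and dataset.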

Optimization of MLP Model
Before training the models using GOA, WDO, and BBO, a reliable primary structure of the MLP should be determined. In a multilayer perceptron-based network, the numbers of neurons in the input and output layers are fixed, equal to the numbers of inputs and outputs, respectively. The number of neurons in the hidden layer, however, is a free factor that depends on the data at hand. So, in this step, eight different hidden-neuron counts were tested to create a strong MLP structure. For more trustworthiness, each of the proposed networks was implemented four times (denoted TR1, TR2, TR3, and TR4 for the first to fourth attempts), so that thirty-two (= 4 × 8) structures were built in total to identify the most proper one. Figure 6 shows the results of this analysis. This trial-and-error procedure helps determine the best structure of the optimized network; the minimum error is achieved when the number of neurons in the hidden layer is 5. Accordingly, to have a simplified solution and as shown in Figure 6, five was selected as the number of hidden neurons. So, the network structure is 8 × 5 × 2 (that is, eight input neurons, five hidden neurons, and two output neurons).
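The trial-and-error selection over eight candidate hidden-layer sizes with four repetitions can be sketched as follows (the actual training routine is replaced here by a synthetic error function with its minimum at five neurons, standing in for the real MLP runs):

```python
import numpy as np

def evaluate_mlp(n_hidden, trial, rng):
    """Placeholder for training an 8-n_hidden-2 MLP and returning its
    error; a synthetic curve with its minimum at 5 neurons stands in
    for the real training run of the study."""
    return (n_hidden - 5) ** 2 * 0.01 + 0.1 + 0.005 * rng.random()

rng = np.random.default_rng(4)
candidates = range(1, 9)                       # eight tested neuron counts
avg_error = {}
for n in candidates:
    # four repetitions per structure (TR1..TR4), as in the study
    avg_error[n] = np.mean([evaluate_mlp(n, t, rng) for t in range(4)])

best = min(avg_error, key=avg_error.get)
print(best)  # 5
```

In the real procedure, `evaluate_mlp` would train the network and return its RMSE; the selection logic (average over repetitions, pick the minimum) is the same.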

The mathematical equation of the obtained MLP structure was given (as a problem function) to the GOA, WDO, and BBO optimization algorithms to be trained. This process consists of finding the best set of weights and biases for a raw MLP structure. The prediction is then executed by the optimized MLP.
The next step deals with selecting the proper cost function, objective function, and assessment criteria (for example, R² and RMSE). In this step, we try to provide the best-fit relationship between the actual and predicted values. In the current study, two simultaneous targets were simulated, HLs and CLs. As described, at the end of each iteration the value of the cost function is used to assess the accuracy of the proposed model. A sensitivity analysis was then performed for the proposed models based on the population size (also known as the swarm size), one of the most critical settings of hybrid intelligent algorithms. The hybrid predictive networks were therefore assessed for various swarm sizes (nine sets in this study), leading to 11,000 iterations for each of the proposed hybrid solutions. The swarm size varied between 25 and 500, as shown in Figure 7. All the algorithms ran for 1000 iterations to have enough opportunity to decrease the error, reported here as the mean square error. The convergence curves shown in Figure 7 are the first result of the fitness function after the swarm-size sensitivity analysis; this part is also called the convergence analysis, as depicted in Figure 7a.

Assessment of the Proposed Models
The reliability of the proposed models is assessed by evaluating the measured and predicted values of the HLs and the CLs. Two error criteria, MSE and RMSE, are used to estimate the error and check the performance on samples from both the training and testing datasets. It is worth noting that the output fitness of the training and testing phases reflects the training capability and the generalization power, respectively. The computed error values in the simultaneous HL and CL prediction for the GOA-MLP, WDO-MLP, and BBO-MLP techniques are also shown in Figure 8. Figure 9 (for GOA-MLP) and Figure 10 (for WDO-MLP) display a graphical comparison between the predicted and measured patterns of the heating and cooling loads (for both the training and testing datasets). Similarly, the accuracy of the proposed multitarget BBO-MLP technique is shown in Figure 11. As can be seen, all proposed models produce a valid estimate of the HL and CL patterns: the calculated R2 was (training = 0.9529, testing = 0.9512) for the GOA-MLP-HLs and (training = 0.9998, testing = 0.9998) for the CLs estimation. In the case of the WDO-MLP-HLs, the R2 was (training = 0.9551 and testing = 0.9464), while for the WDO-MLP-CLs, the R2 was (training = 0.9995 and testing = 0.9994). Lastly, for the BBO-MLP-HLs, the R2 was (training = 0.9563 and testing = 0.943), and for the BBO-MLP-CLs, the R2 was (training = 0.9981 and testing = 0.9975). Considering the three proposed techniques and after performing a nonlinear swarm optimization process, it can be seen that the best-fit model is obtained from the proposed structure of BBO-MLP. The result of the error evaluations also shows excellent accuracy, which proves that BBO-MLP is the most effective hybrid technique for training the MLP.
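The R2 and RMSE criteria used above can be computed directly from the measured and predicted load vectors; a minimal sketch of both metrics (per target, e.g. HL on the testing set) is:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted loads."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Applying these to the training and testing predictions separately yields the paired (training, testing) scores reported for each hybrid model.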

Simplified Multitarget Equation
To obtain a more practical formula that can predict both the HLs and CLs with the minimum required error, a simple MLP network (having two neurons in its hidden layer) is provided in this section. Also, for a better understanding, the MLP best-fit equation is prepared according to the optimized weights and biases; these terms are unique to any particular database. Note that, for practical use of this formula, the well-known tangent-sigmoid (Tansig) function, which is the MLP activation function, is expressed by Equation (14). The R2 accuracy of the proposed formula for the simultaneous prediction of the HL and CL targets was 0.98547, 0.9866, and 0.98374 for the training, testing, and validation datasets, respectively.

HL = 2.5988 × Y1 − 2.7980 × Y2 − 1.1800 (11)
CL = 2.2978 × Y1 − 2.5449 × Y2 − 1.1861 (13)

in which A, B, C, D, E, F, G, and H represent the relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area, and glazing area distribution, respectively.
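The simplified formula above can be sketched in code. Two caveats: Equation (14) is not reproduced in this excerpt, so the standard tangent-sigmoid definition, tansig(x) = 2/(1 + e^(-2x)) − 1, is assumed here; and the input-side expressions giving Y1 and Y2 from the inputs A–H are likewise elided, so Y1 and Y2 are taken as already-computed hidden-neuron outputs.

```python
import math

def tansig(x):
    """Assumed tangent-sigmoid activation: 2/(1 + exp(-2x)) - 1
    (mathematically identical to tanh(x))."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def loads(y1, y2):
    """Output layer of the simplified two-neuron MLP.
    y1, y2 are the tansig outputs of the two hidden neurons;
    their expressions over the inputs A..H are given in the
    full paper but not reproduced here."""
    hl = 2.5988 * y1 - 2.7980 * y2 - 1.1800
    cl = 2.2978 * y1 - 2.5449 * y2 - 1.1861
    return hl, cl
```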

Further Discussion and Future Works
The high significance of predicting the thermal load of buildings has driven many scholars toward developing sophisticated methods for predicting the heating load and cooling load. This subject is particularly important when it comes to energy-efficient buildings [127]. The findings of this study are in accordance with earlier efforts regarding the applicability of artificial intelligence models for simulating the HL and CL. These studies have focused on various types of buildings including office, residential, and industrial buildings [128,129].
Data-driven methods (for example, the MLP ANN) are capable of analyzing the relationship between a number of effective parameters and a target parameter that is to be predicted. The HL and CL of a residential building were successfully modeled in this study using three hybrid techniques. Each model represented an MLP neural network optimized with a metaheuristic algorithm. The optimizer components were the grasshopper optimization algorithm, wind-driven optimization, and biogeography-based optimization. These algorithms can globally find the solution to a given problem. In this work, evaluating different population sizes for the mentioned hybrids (Figure 7) showed that, although the results of similar population sizes are somewhat close, the effect of this parameter should be investigated in order to select an appropriate configuration.
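The population-size sweep described above can be illustrated schematically. The stand-in optimizer below is a deliberately simple population-based random search, used only to show the sweep structure; in the study itself, GOA, WDO, or BBO would replace the search step, and the MLP MSE cost would replace the `sphere` placeholder.

```python
import numpy as np

def sphere(theta):
    """Placeholder cost function (replace with the MLP MSE cost)."""
    return float(np.sum(theta ** 2))

def random_swarm_optimize(cost, dim, swarm_size, iters, seed=0):
    """Minimal population-based optimizer, illustrative only:
    GOA/WDO/BBO would supply the real search step."""
    rng = np.random.default_rng(seed)
    swarm = rng.uniform(-1.0, 1.0, size=(swarm_size, dim))
    best = min(swarm, key=cost)
    for _ in range(iters):
        # sample the next population around the current best member
        swarm = best + 0.1 * rng.normal(size=(swarm_size, dim))
        cand = min(swarm, key=cost)
        if cost(cand) < cost(best):
            best = cand
    return best, cost(best)

# sensitivity analysis over a range of swarm sizes, as in Figure 7
for n in (25, 50, 100, 200, 500):
    _, err = random_swarm_optimize(sphere, dim=10, swarm_size=n, iters=100)
    print(n, err)
```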
Looking at the dataset that was used (Figure 5), most of the inputs (for example, glazing area distribution) do not show an explicit correlation with the target parameters, which indicates the complexity of the problem. In such cases, a powerful algorithm can prevent the network from being misled. As the results showed, the GOA, WDO, and BBO could deal with this problem nicely, meaning that they could predict the loads for buildings with both high and low input values.
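The kind of input-target correlation check mentioned above can be done with a few lines of numpy; the function below (a hypothetical helper, not from the paper) returns the Pearson correlation of each input column with a target vector such as the HL:

```python
import numpy as np

def input_target_correlations(X, y, names):
    """Pearson correlation of each input column of X with target y."""
    return {name: float(np.corrcoef(X[:, j], y)[0, 1])
            for j, name in enumerate(names)}
```

A weak correlation for an input (as reported for the glazing area distribution) does not mean the input is useless; it only means the relationship is nonlinear or interactive, which is exactly where a trained MLP can help.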
Due to the excellent performance of all the algorithms that were used, determining the best-fitted model can be disputable; but overall, considering both the correlation and error evaluations for both targets, the BBO-based model was slightly better. This indicates that the MLP parameters optimized by the biogeography-based theory are more efficient than those found by the grasshopper optimization algorithm and wind-driven optimization.
The formula provided at the end of this research is reliable due to the excellent correlation of its outputs. However, considering the dataset that was used, it may be limited to residential buildings. As explained above, the data of this study were obtained from a computer simulation of various types of residential buildings. An idea for future work is to use the building type as an input to achieve a more generalizable solution.

Conclusions
This article is concerned with the simultaneous calculation of the HLs and CLs of residential buildings through the HVAC system. To limit the drawbacks of the MLP technique, hybrid intelligent techniques were implemented. The focus of the three proposed models, abbreviated as GOA-MLP, WDO-MLP, and BBO-MLP, was to introduce several reliable novel hybrid models to simulate both the HLs and CLs. Overall, and after 33,000 iterations of the three proposed algorithms, it was shown that all three hybrid techniques (GOA-MLP,