Building Heating and Cooling Load Prediction Using Ensemble Machine Learning Model

Building energy consumption prediction has become an important research problem within the context of sustainable homes and smart cities. Data-driven approaches have been regarded as the most suitable for integration into smart houses. With the wide deployment of IoT sensors, the data generated from these sensors can be used for modeling and forecasting energy consumption patterns. Existing studies lag in prediction accuracy and various attributes of buildings are not very well studied. This study follows a data-driven approach in this regard. The novelty of the paper lies in the fact that an ensemble model is proposed, which provides higher performance regarding cooling and heating load prediction. Moreover, the influence of different features on heating and cooling load is investigated. Experiments are performed by considering different features such as glazing area, orientation, height, relative compactness, roof area, surface area, and wall area. Results indicate that relative compactness, surface area, and wall area play a significant role in selecting the appropriate cooling and heating load for a building. The proposed model achieves 0.999 R2 for heating load prediction and 0.997 R2 for cooling load prediction, which is superior to existing state-of-the-art models. The precise prediction of heating and cooling load, can help engineers design energy-efficient buildings, especially in the context of future smart homes.


Introduction
Internet of Things (IoT) is revolutionizing many sectors to improve productivity and advance the human lifestyle. The smart home is one such example of IoT applications to enhance energy efficiency and sustainability [1]. Several physical devices are connected to the Internet for monitoring and tracking the status of the physical environment and house activities. The human being does not need to be at home to perform these activities. As billions of IoT devices need power to operate, energy resources are scarce, and preserving energy resources is very important. For estimating the cooling or heating load of a building, temperature characteristics or profiles of smart homes should be analyzed. If the relation between the building structure and energy requirements is known, then the builder designs and architects the building architecture, which can save energy and effectively use the energy to heat or cool the building. Therefore, the heating load (HL) and cooling load (CL) of a building estimation have been a problem in the building energy efficiency field.
Energy consumption prediction is an important research area as it consumes approximately 30% of the total energy and its carbon emissions are approximately 33% in 2021 [2]. Despite developments in the building sector, the current efforts are not enough to achieve the 1.5 • C scenario. Smart sustainable infrastructures are needed to meet the rapidly growing urbanization. From this perspective, energy consumption prediction and modeling is an important task to obtain smart and low energy-requiring infrastructures. Three primary approaches for building energy consumption modeling and forecasting are physical models, data-driven models, and hybrid models [3]. Data-driven approaches have been regarded as the most suitable for integration into smart houses.
With the wide deployment of IoT sensors in smart homes and appliances, such sensors generate large amounts of data. Data-driven approaches leverage the data for modeling and forecasting energy consumption patterns. Furthermore, the use of machine learning and deep learning models provide flexible and reliable solutions in this regard [4]. Several solutions exist for building energy prediction in the literature [5][6][7]. However, such solutions lack several aspects. First, predominantly, studies focus on the use of individual machine learning models, and ensemble models are not properly investigated. Second, the performance of such models still needs improvement regarding prediction accuracy. Third, often the models are optimized by fine-tuning different hyperparameters, and the potential of feature selection is partially investigated. In addition, the role of different attributes such as roof area, wall area, glazing, etc. is not very well studied. As a result, such studies are not very well suited for predicting the energy needs of modern smart homes. This study aims to overcome these issues.
This study follows a machine learning-based data-driven approach for building energy consumption prediction and makes the following contributions: • A novel ensemble model is proposed that combines three random forest models (3RF) for predicting the heating and cooling load of the buildings. Performance comparison of the proposed approach is carried out with reference to K-nearest neighbor (KNN), linear regression (LR), random forest (RF), general additive model (GAM), and multilayer perceptron (MLP). In addition, convolutional neural networks (CNN), long short-term memory (LSTM), and their ensemble CNN-LSTM are also used for experiments. • The influence of different features related to building is investigated. The performance of the models is analyzed regarding different features from the dataset such as glazing area, orientation, height, relative compactness, roof area, surface area, and wall area. Mean absolute error, root mean squared error, mean absolute percentage error, and coefficient of determination R-squared is used for performance evaluation. • Performance is also evaluated within the context of existing state-of-the-art studies regarding prediction accuracy and computational complexity.
The rest of the article is organized as follows. Section 2 includes state-of-the-art works using machine learning and deep learning models to predict the building energy efficiency performance. Section 3 discusses the proposed methodology to predict heat loading and cool loading in buildings. Section 3.3 covers the description of the building energy efficiency simulated datasets. Section 4 discusses the performance results of the proposed machine learning and deep learning models to predict heat and cool loading. Section 5 concludes the article.

Literature Review
Energy usage optimization and energy saving of the buildings are needed to conserve energy. Statistical modeling techniques are commonly used to predict the energy consumption of the building given the building characteristics. Researchers applied various machine learning models to predict energy consumption and efficiency. The building energy efficiency is measured in terms of heat loading and cool loading parameters.
Kumar et al. [5] proposed extreme learning machine (ELM) and online sequential ELM (OSELM) methods to predict the building HL and CL parameters. The combinations of ELM methods are used to compare the prediction performance of the models with well-known ML models such as artificial neural network (ANN), support vector machine (SVM), gradient boosting (GB), etc. The authors also varied the attribute combinations to compare the performance of the proposed methods. The reported results indicate that ELM with group 4 attribute combination produced the best result with a 0.0456 mean absolute error (MAE) for HL and 0.0358 for CL prediction. An ANN-based solution is proposed in [6] to predict the cooling and heating load in the buildings. A component-based approach is taken to predict the energy efficiency of the buildings. The ANN is applied at each component level to obtain the final HL and CL performance results. The Energyplus software is used to generate the simulation-based datasets. The reported results show that R 2 is 0.974 for HL and 0.999 for CL. The proposed solution is a component-based approach with the advantage of the reusability of the ML models. However, the ML model may suffer from data or concept drift.
The authors presented an LSTM-based deep learning solution to predict the HL and CL performance in [8]. Transfer learning and multitask-based learning are also embedded in the solution to predict building energy efficiency. One of the advantages of deep learning models is the computation speed compared to building performance simulation (BPS). The method obtained an R 2 of 0.983 for CL and 0.848 for HL performance. The HL performance still requires improvement. Additionally, the training time for LSTM-based solutions is much higher than the ANN and machine learning-based solutions. XGboosting algorithm is proposed in [9] to predict the building energy load in terms of HL and CL. The performance metric's root mean squared error (RMSE), R 2 , MAE, and mean absolute percentage error (MAPE) are measured for both the HL and CL prediction. The reported results show that the obtained R 2 for HL is 0.9993, while CL has a 0.998 R 2 . Chakraborty et al. [10] performed an in-depth study of feature selection, feature engineering, and parameter optimization to predict the building energy loads on the generated synthetic dataset. Machine learning and deep learning techniques such as XGBoost and ANN are used to conduct the experiments and predict the HL and CL. Results indicate that the XGBoost obtained RN_RMSE of 2.95 for CL and 3.90 for HL, while the R 2 value of 0.98 for CL and 0.95 for HL are obtained when the feature engineering is performed. However, the feature selection and hyperparameter optimization are not included. The authors pointed out that the scarcity of datasets hinders the research progress in this context. Octahedral regression-based building energy load prediction is proposed in [7]. Octahedral regression is a kernel averaging methodology proposed by authors to predict the building energy loads. Experimental results indicate an MAE value of 0.945 for HL, whereas the MAE for CL is 1.113. The prediction performance still has room for improvement.
The authors propose a multi-objective optimization model in [11] to predict the building energy load. The method MOO was also presented to improve the hyperparameter selection process. The authors claim that the proposed optimized model enhances the energy prediction accuracy and computation time compared to the grid search model parameter selection. The experimental results on simulated building datasets show that the HL and CL obtained higher R 2 values of 0.992 and 0.993, respectively. Sadeghi et al. [12] implemented a deep neural network (DNN) model to predict the building energy load prediction. The authors show that the DNN performed better than ANN to estimate the energy load. Normalized RMASE (NRMSE) and normalized MAE (NMAE) are used for evaluation. The R 2 value is 1 for the HL case, whereas 0.994 for the CL case. However, the training time for DNN is higher than for machine learning models. Song et al. [13] proposed a temporal convolution neural network (TCN) to predict hourly heat loading. The TCN has the advantage of rapidly extracting complex features due to the presence of a convolution neural network (CNN) and recursive neural network (RNN). The hourly heat loading performance accuracy is reported as 97.9% and an MAE value of 0.102 on the test dataset. However, the applicability of the TCN algorithm in building energy load prediction is not explored in the study.
Sajjad et al. [14] presented a sequential learning model framework to predict the HL and CL as multi-outputs. The authors claimed that their work is the first to propose a multioutput HL and CL prediction using a gated recurrent unit (GRU) to eliminate the tedious task of training separately for HL and CL prediction. The HL obtained a MAPE of 0.9315, and CL obtained a MAPE of 1.0132. However, the training time for the neural networks is much higher than for the machine learning models. The training time-based comparison is not discussed in the article. A feature construction method along with ensemble learning algorithms is proposed to predict the CL prediction in [15]. Different methods, such as K-clustering, discrete wavelet transform (DWT), etc., were used for feature construction, and three models of gradient boosting machine (GBM), random forest (RF), and cubist algorithms were used to evaluate the cooling load accuracy. The dataset was obtained from the Tianjin office building in China. The reported results indicate that the CL achieved an R 2 of 99.8% when the combination of DWT and the Cubist algorithm was used for evaluation. However, this work is limited to cooling load prediction.
Olu et al. [16] performed building energy performance prediction at the building design stages. The authors explored various machine learning techniques to evaluate energy performance. The feature selection and hyperparameter selection were also considered to improve the performance. The GB performed better than all other machine learning models with an accuracy of 67% and an F1 score of 65%. A shuffled complex evolution (SCE) performance optimization technique is proposed in [17] to enhance the performance of multi-layer perceptron (MLP). The authors claim that SCE improved the prediction accuracy by 22.84% and outperformed the benchmark optimizer such as moth-flame and optics-inspired models. The CL prediction results show that the SCE_MLP model achieved a 0.9227 R 2 for the well-known building energy efficiency datasets [18]. Table 1 provides an analytical overview of the state-of-the-art works, which worked on progressively improving the building energy efficiency. Several works in the literature can be found that have worked on improving the performance of HL and CL in solving the building energy efficiency problem. Although the reported results showed significant prediction performance, there is still scope for improvement. Additionally, the detailed study of the input features and output is not well explored in the literature. The prior works do not address an ideal solution that improves the performance and reduces the computation and training time. Thus, a detailed study on the building energy datasets is performed using machine learning and deep learning techniques, feature selection, and hyperparameter tuning to obtain an accurate, generalized, and low training time solution.

Proposed Methodology
Our motivation for this study is to leverage the data analytic models for predicting energy resource consumption in buildings. To improve the prediction performance of the state-of-the-art energy efficiency HL and CL, the selected machine learning models are thoroughly evaluated. This study proposes an ensemble model 3RandomForest (3RF) to predict the HL and CL of a building effectively. The architecture of the proposed 3RF model is presented in Figure 1. Three RF models are used to test the dataset and predict the HL and CL. The advantage of our method is to leverage the combinations of more decision trees from three RF models and achieve optimal performance results.

3Random Forest
The decision tree models showed promising results for predicting the HL and DL. In order to improve the performance, 3RF is proposed to repeat the RF model regression prediction three times on the energy efficiency dataset. This repetition helps to use more combinations of the input dataset decision trees. The voting method is used to predict the final value from three RF models.
Let b = {1, 2, . . . , B} be the number of decision trees.Ĉ b (x) denotes the regression prediction value of the b th decision tree [19]. Then, the output of the RF can be denoted as: The 3RF voting-based output prediction is represented as follows: The flow of the steps used to train and test the HL and DL datasets is depicted in Figure 2. The dataset is split into the training and testing dataset with proportions of 80% and 20%, respectively. The relation between the dataset features and the HL and CL output is analyzed to select the machine learning models. The feature versus HL and CL trends are also investigated to present the feature relation with the HL and CL output. The performance of the proposed model is evaluated in comparison to several well-known machine learning models.

Machine Learning Models
A brief overview of the employed machine learning models is described in this section. The grid search method is used to find the best hyperparameter setting. The model parameters are tuned between specific ranges to get the best results. Model hyperparameter setting and tuning range are shown in Table 2. KNN can be used for both classification and prediction. The prediction works based on feature similarity. The nearest neighbors were selected based on different distance measures. The uniform weight was used to assign equal weights to all neighbors. The number of neighbors was selected as three in our evaluation. The average of the nearest neighbor data values was assigned as the final predicted values. The Euclidean distance (E) was used to determine the nearest neighbors [20].
where k denotes the number of neighbors, and x i and y i are the data points in i th dimension in the Equation (3).

Linear Regression
LR works on the principle that the input independent variables are linearly related to the output dependent variable. The multiple linear regression is represented as [21]: where x 1 · · · x r are the input independent variables, r is the number of input features, and β r is the r th input feature coefficient in Equation (4).
In order to compensate for model overfitting, the error ε is modeled as a Gaussian distribution. The multiple linear regression in matrix form is represented as follows.

Random Forest
RF is a family of decision tree-based machine learning algorithms. Ensemble learning is used to perform the classification and predictions. The bootstrapping RF is used to perform the predictions in this work. The bootstrapping method combines ensemble learning and the random selection of the decision trees to determine the prediction output as the average value of all the decision tree predictions.
Let b = 1 and B be the number of decision trees;Ĉ b (x) denotes the regression prediction value of the b th decision tree [19], then the regression prediction of the RF forest is defined as:

General Additive Model
GAM is a slight variation of the linear regression model [22]. The linear form is replaced with the sum of the smooth functions in GAM. The function is represented as f i (x i ) for the i th input.This technique is helpful to detect the nonlinear covariate effects between input and output. The GAM is defined as follows: where f (x) is the smooth function, r denotes the number of input features, E(Y) signifies the linear exponential output Y distribution, and g() is the link function in Equation (7). The link function can be selected as an identity or log function.

Multilayer Perceptron
MLP is a global approximator and is well suited for mapping the nonlinear inputoutput combination. Typically, MLPs consist of three layers. The input layer feeds the input values to the neural network. The output layer performs the classification or prediction of the given problem. The hidden layer includes the neurons and supports the computations to process the input data and forwards the processed data as input to the output layer. The number of hidden layers can be arbitrary. The neuron processing unit is represented as follows [23].
where b denotes the bias value, W i denotes the i th neuron weight, and x i denotes the input to the i th neuron unit. Φ is the nonlinear activation function and f (x) is the neuron processing unit output in the Equation (8).

Dataset Description
The heat and cool loading specifications must be determined when designing an energy-efficient building. The building feature characteristics will be used to determine the heat loading and cool loading. Thus, the building designer and designers expect to use the building features to estimate the heat loading and cool loading. Therefore, the target output parameters in the datasets are heat loading and cool loading for energy efficiency prediction. The dataset was collected from the University of California Irvine Machine learning laboratory.
The total surface (S) is determined as: The Relative Compactness (RC) is measured as: The energy efficiency dataset was acquired from the UCI repository, which was generated using Ecotect simulation software and presented by [18]. The dataset comprises 768 building samples, and each sample is represented with eight features. To generate 768 combinations, 12 building shapes are simulated in Ecotect. The 12 building shapes had the same volume (771.75 m 3 ) and the features were varied. The variables orientation, glazing, and glazing distribution feature varied further to make the dataset. The orientation includes the building's north, south, east, and west orientation. The glazing area or window-to-floor ratio varied from 10-40%. The four glazing area values taken were 0%, 10%, 25%, and 40% in the dataset. The glazing area distribution combinations included uniform, majority of the distribution in either north, south, east, or west. Overall, the total number of building combinations was (12×3×5×4)+(12×4)= 768. The detailed description of 8 features, the feature value ranges, and metrics are reported in Table 3. The amount of latent and sensible heat removed from the required area to maintain the acceptable temperature

Performance Metrics
The performance of the models was estimated using the metrics MAE, RMSE, MAPE, and the coefficient of determination R 2 .
MAE measures the predicted value deviations from the original value. The MAE is the average difference between the predicted and original values. Given the CL and HL dataset output original value, y i and the dataset output predicted value wasŷ i , and the total number of values i = 0 . . . N, then the MAE was calculated as: RMSE is the standard deviation of the prediction errors. The prediction error measures the distance of the prediction values from the regression line data values. RMSE is measured as: MAPE is the average of the predicted absolute percentage errors. The MAPE is obtained in percentages, and the sign of the predicted and original difference value does not matter. MAPE is measured as: All three error metrics determine the deviation of the predicted value from the original value. Therefore, the lower the error metric value, the better the model performs.
The R 2 determines how well the model fits the data. The R 2 value lies between 0 and 1, and if the R 2 value for the model is close to 1, then the model performs well. Considering the mean of the dataset output predicted value isŷ i , the R 2 is defined as:

Results and Discussion
In this section, a detailed description of the experimental results obtained on the cooling load and heading load datasets is provided. The experiments were run on a standalone Linux machine with a system configuration of 8 GB RAM and an eight-core processor. A notebook web application runs locally on the Linux machine to perform the experiments. The software packages scikit-learn installed, and the Python programming language was used to implement machine learning models. Figure 3 shows the cooling load output value distribution for the energy efficiency dataset input features. The eight features, glazing area distribution, glazing area, orientation, overall height, relative compactness, roof area, surface area, and wall area mapping with cooling load values, were included in Figure 3. The circle indicates the real data value and the color lines represent cooling load values obtained when various machine learning techniques were applied for prediction. Results indicate that the real values highly deviate from the MLP line that represents the cooling load. On the other hand, the HM maps the input feature values more accurately with the cooling load data. Figure 3a-c look similar because the building features gazing area distribution, gazing area, and orientation have a similar dependency on the required cooling load values. Figure 3a shows that the cooling load varies significantly when the building height was either 3.5 or 7.0. When the building height was between 3.5 and 7.0, the cooling load was around 22. The other features, such as relative compactness, roof area, surface area, and wall area, had irregular dependency patterns with respect to the cooling load. Table 4 displays the cooling load performance metrics for machine learning model regression analysis. It shows that the 3RF performed well compared to other machine learning models. The 3RF obtained the best R 2 of 0.971 for the cooling load. The 3RF error metrics MAE, MSE, and RMSE values were 2.875, 1.076, and 1.695, respectively. The MLP and LR did not perform well in measuring the cooling load accurately. The supervised models such as KNN, decision tree algorithms RF, 3RF, and GAM performed comparatively well with an R 2 value greater than 0.95. A similar pattern was found in the error metrics for the machine learning techniques in cooling load determination.  Table 5 lists the accuracy (mean R 2 ) and standard deviation of the machine learning techniques when testing the cooling load estimation. The 10-fold cross-validation was performed to split the dataset into training and testing datasets and measure the performance accuracy of the employed techniques. The performance accuracy of the machine learning models followed a similar trend of the R 2 metric value performance in Table 5. Here, 3RF obtained the best accuracy of 96% among the used models for cooling load estimation.   The best-performing 3RF model performance was analyzed with respect to the input features in cooling load estimation. Figure 4 presents the mapping of the feature data with the output cooling load when the 3RF model is used for estimation. The results indicate that most of the feature data samples match with the 3RF predicted cooling load output. Hence, the 3RF performance is better than the other ML models.     The experiments were repeated for heating load performance estimation. As shown in Figure 5, the heating load mapping with respect to the input features is performed when the data is trained using machine learning models. The trends follow patterns similar to cooling load estimation. Figure 5 shows that the MLP model is missing some of the real data in the heating load estimation line for most of the input features. On the other hand, the 3RF matches the real data into the heating load prediction line. It means that 3RF performed well in accurately estimating the heat loading in a building.   Table 6 shows the performance of employed models in estimating a building's heat loading. Results indicate that the 3RFF outperformed all other models in predicting heat loading. The 3RFF obtained the best performance in terms of R 2 with a 0.998 score and the lowest error metrics value. The RF and GAM comparatively performed well with more than 0.99 R 2 value. The MLP and LR were the least-performing models to estimate heat loading. Overall, the results indicate that decision tree-based techniques work well to comprehend the energy efficiency input features and correctly estimate the heat and cool loading values. Table 7 displays the performance of machine learning models for heating load prediction problems. The 10-fold cross-validation method is used to split the datasets into train and test sets. Results show that the RF achieved the best performance accuracy on the heat loading estimation. A similar trend in Table 6 is observed for the machine learning performance accuracy results. The decision tree algorithms performed well to obtain the best accuracy for the heating load case. The standard deviation for each model indicates that the accuracy can be slightly varied, and the obtained accuracy results may not be the same when repeated cross-validation experiments are carried out. Table 8 shows the per epochs score for 3RF in both cooling and heating cases.  Table 8 shows the fold-wise cross-validation results using the best performing 3RF model. Fold-wide results are provided to analyze the fold-wise variance in the prediction accuracy of the 3RF model. It can be observed that, except for the first fold, the results are consistent with small variations in the prediction accuracy of building heating and cooling. The best performing RF model for heat loading prediction per the energy efficiency dataset features is shown in Figure 6. The subfigures show the real data points, and the RF predicted heat loading estimation line graph maps for all the feature data. Overall, the features such as grazing area distribution, glazing area, orientation, relative compactness, surface area, and wall area greatly influence the heat loading required in a building. Therefore, selecting the optimal building parameters is crucial for building energy efficiency.     Figure 7 depicts the performance comparison of both HL and CL prediction using the machine learning techniques KNN, LR, RF, and 3RFF. The results indicate that the proposed 3RFF performed better than the other ML models for HL and CL prediction. The 3RFF obtained an accuracy of 0.95 for both HL and CL prediction. The SD of the proposed HL accuracy prediction is higher than the CL accuracy prediction. The KNN is the least performed model for both HL and DL prediction. The MLP model was the least performed model among the selected HL and CL evaluation models. The ensemble models such as RF performed better than the linear regression model for energy efficiency problems. Our findings match the prior works, which showed that the ensemble learning models obtained the best HL and CL performance prediction. Overall, the performance accuracy results in Figure 7 proved that the proposed model performed well and outperformed the employed models in this work. Besides using the 3RF model, different ensemble schemes were used to achieve the optimal results such as different combinations of RF, LR, and KNN. For the most part, the existing literature uses only two models for ensemble models. Keeping this in mind, three models were used for the ensemble. For this purpose, the 3LR ensemble was also tried but the performance of the model was not as good as using the 3RF. In addition, 4RF and 5RF were also implemented but there was no improvement in results after 3RF. To avoid computational costs for 4RF and 5RF, a 3RF ensemble was used. Table 9 is provided to show the performance of 2RF, 3RF, 4RF, and 5RF regarding the prediction accuracy and computational complexity. It can be observed that increasing the RF models from 3 to 4 and 5 does not add any value regarding accuracy. Instead, the computational time increased. Thus, the best results were obtained using 3RF for building heating and cooling prediction.

Performance of Deep Learning Models
State-of-the-art deep learning algorithms such as convolution neural network (CNN), long short-term memory (LSTM), and the combination of CNN and LSTM have been considered to evaluate the HL and CL performance.
The CNN comprises the embedding layer, 1D convolution layer, 1D max-pooling layer, flatten layer, and dense layer. The ReLu activation function is used to perform the convolution. A pool size value of four was also considered to downsample the feature map. The 'Adam' optimizer and MAE loss function was used to train the CNN models. The dataset was split as training and testing data with 80% and 20% ratios. The batch size of 128 and 2000 epochs were selected to train the CNN models. Complete details of the CNN hyperparameters considered for HL and CL evaluation are given in Table 10.
LSTM is a type of recurrent neural network (RNN). A 100-unit LSTM layer was used in this work to design LSTM. Dropout and dense layers were also included to train the LSTM model for HL and CL prediction. The same CNN batch size, epochs, and optimizer, loss function parameters were considered to train the LSTM model. The CNN and LSTM models were combined to evaluate the HL and CL performance. The convolution, max pooling, and LSTM models were included in the CNN-LSTM model. The set of hyperparameters used in CNN and LSTM was also used in the CNN-LSTM. Table 10 displays all the hyperparameters used for CNN, LSTM, and CNN-LSTM models.

Cooling Load Prediction using Deep Learning
The cooling load estimation was also performed using deep learning techniques such as CNN, LSTM, and CNN-LSTM. Figure Table 11 displays the performance metrics of the three deep learning models when tested on the cooling load estimation. LSTM performed well compared to CNN and CNN-LSTM models with an R 2 value of 0.933. The combination of LSTM and CNN degraded the performance compared to the LSTM and CNN alone. The error metrics followed the inverse trend of the R 2 on all three models. The best performing deep learning model LSTM obtained the MAE, MSE, and RMSE values of 5.82, 1.92, and 2.41, respectively. Overall, the proposed 3RF obtained the best performance in cooling load estimation for building energy efficiency.

Heat Load Prediction using Deep Learning
Deep learning models CNN, LSTM, and CNN-LSTM are used for the estimation of the heating load. Figure 9 shows the model loss of the three DL models when the epochs vary from 0 to 2000 for both train and validation datasets. The model loss for both the cool loading and heat loading estimation followed a similar pattern. When CNN was trained and validated, the model loss abruptly dropped to 2% when epoch one was completed. On the other hand, the LSTM model needed at least 60 epochs to reach the 2% model loss. The CNN-LSTM and CNN models did not change the model loss value from epochs 30 to 2000. Further, CNN obtained the best model accuracy within the first few epochs compared to the LSTM. Table 12 presents the performance of deep learning models for the heating load case. When the error metrics and R 2 values were considered, the LSTM model performed better than the CNN and CNN-LSTM. The LSTM obtained the R 2 value of 0.95 and the RMSE value of 2.22. The CNN produced the R 2 value of 0.90, slightly less than the LSTM. The CNN-LSTM had the worst performance and obtained an R 2 value below 0.90. The result indicates that the combination of CNN and LSTM does not perform well for regression problems, where the output value varies based on the number of input features.  Results of 10-fold cross-validation are also added for deep learning models. Results indicate that LSTM achieved better results for both heating load and cooling load prediction as it achieved 0.90 for heating load prediction and 0.88 for cooling load prediction, as shown in Table 13. CNN and CNN-LSTM perform marginally lower than LSTM.

Performance Comparison with State-of-the-Art Studies
The proposed 3RF-based building HL and CL prediction are compared with the prior works, which utilized the same dataset for experiments. For the HL case, the ensemble learning models obtained good performances with R 2 values ranging from 0.98 to 0.998. The prior works [24][25][26] leveraged the RF or ensemble learning models and obtained a 0.998 R 2 . Ref. [6] utilized the component-based machine learning techniques to predict the HL. The performance of the component-based work is much lower than the best HL prediction performance. The proposed 3RF model obtained the best performances as compared to existing ensemble models with an R 2 value of 0.999. The proposed model is also efficient in terms of computational cost, as it takes 0.83 seconds(s) for the cooling prediction case for model training and testing, and for the heating case it takes only 0.61 s.
Existing work [24] on CL performance results reported an R 2 value of 0.986 when the ensemble model was used for evaluation. As shown in Table 14, the proposed method obtained an R 2 value of 0.997. It indicates that the proposed model performed well to predict the CL. It is also observed that for CL prediction, the proposed model performed better than the RF model proposed in [26] with an R 2 value of 0.991 and the XGBoost model proposed in [10] with an R 2 value of 0.94. The multi-layer perceptron model proposed in [17] obtained the CL prediction with a 0.9222 R 2 , which is far less than the current work. It is also important to note that deep learning models require higher computational resources and require a large amount of data to train the model. Table 14. Comparison with state-of-the-art approaches (metric: R 2 ).

Authors
Year

Discussion and Limitations
The building energy efficiency problem is a significant problem for sustainable and smart homes. The accurate prediction of the cooling and heating load for the building structure helps the designers to design energy-saving building structures for homeowners and indirectly helps society by saving energy consumption. This study extensively analyzes the building features, cooling load, and heating load dependency on the features using data analytics models. The RF-based decision tree models estimate the cooling and heating load most accurately per the building features. The relative compactness, surface area, and wall area play a significant role in selecting the appropriate cooling and heating load for a building. However, the dataset used in this study is limited to 768 building designs and the corresponding cooling and heating load values. For further validation, a larger dataset is needed to test the regression performance of heating and cooling load estimation. The deep learning models CNN and LSTM did not perform well for energy load prediction. Further experiments using deep learning models with a larger dataset are intended.

Conclusions
Energy consumption prediction is an important research problem for sustainable homes and modern smart cities. Cooling and heating load prediction helps designers make energy-sustaining architectures. Modern buildings and smart cities employ a large number of IoT devices that generate large amounts of data regularly. Data-driven approaches leverage this data for cooling and heating load prediction. This study proposed a novel ensemble model, 3RF, to predict buildings' cooling and heating profiles using different features such as glazing area, orientation, height, relative compactness, roof area, surface area, and wall area. Prediction within the context of various features indicates that relative compactness, surface area, and wall area play a significant role in selecting the appropriate cooling and heating load for a building. Experimental results using the proposed model suggest that the model achieves 0.999 R 2 for heating load prediction and 0.997 R 2 for cooling load prediction. The model's performance is superior to both the employed and existing state-of-the-art models. Precise prediction of energy requirements of modern buildings can be very important for energy efficiency and sustainability. Moreover, such predictions can be used for a large number of applications such as energy monitoring, real-time energy planning, and identifying high energy-consuming targets, which can help optimize energy efficiency. The current dataset consists of energy profiles for 768 buildings and seems insufficient for deep learning models. Enlarging the dataset and improving the performance of deep learning models is also intended as future work.