Short-Term Building Electrical Energy Consumption Forecasting by Employing Gene Expression Programming and GMDH Networks

Abstract: Over the past decade, advances in smart grid technologies have made energy forecasting applications ubiquitous, not only on the grid side of electric power systems but also on the customer side for load and demand prediction. Within this context, short-term electrical energy consumption forecasting is a requisite for the energy management and planning of all buildings, from small-scale households and residences to large-scale building complexes. Popular machine learning algorithms in the literature commonly forecast short-term building electrical energy consumption by generating an abstruse analytical expression between explanatory variables and response variables. In this study, gene expression programming (GEP) and group method of data handling (GMDH) networks are employed to create easily understandable mathematical models between predictor variables and target variables and to forecast the short-term electrical energy consumption of a large hospital complex situated in the Eastern Mediterranean. The results yield mean absolute percentage errors of 0.620% for GMDH networks and 0.641% for GEP models, which reveals that the forecasting process can be accomplished and formulated simultaneously via the proposed algorithms without the need to apply feature selection methods.


Introduction
More recently, the ubiquity of the internet of things has made distributed energy systems smarter by optimizing energy efficiency to reduce losses, and it has created a new era named the internet of energy (IoE), which is equipped with intelligent forecasting systems that employ meteorological forecasts and other explanatory information to predict future energy consumption. IoE brings energy forecasting to the forefront along with smart grids and microgrids, wherein buildings account for the majority of the energy consumption. According to one of the latest reports of the International Energy Agency, buildings account for the largest portion of global final energy use with a share of 36%, which increases the significance of building energy forecasting for balancing supply and demand and for a more energy-efficient future for the next generations [1].
An accepted standard is still not available for the classification of energy forecasting, but Hong and Fan grouped forecasting categories as very short-term, short-term, medium-term, and long-term with cut-off horizons of one day, two weeks, and three years [2]. Principally, short-term forecasts refer
• An application of real-time short-term electrical energy consumption forecasting with comprehensive meteorological observations is conducted for a large-scale hospital complex, including data acquisition, wrangling, and visualization in detail. Studies on short-term building electrical energy consumption forecasting are limited, especially for detailed real-time applications, and this study is intended to bridge that gap and strengthen the literature.
• Among various machine learning algorithms, GEP and GMDH networks are selected as forecasting methods for their capability of generating simple model equations between predictor variables and target variables without the necessity of performing feature selection. As far as is known, this study is the first attempt in the literature to compare GEP and GMDH networks for the prediction of short-term building electrical energy consumption. Both methods are implemented under identical constraints over a one-year period. Performing the analyses with the same criteria reveals the genuine performance of each method for benchmarking purposes with respect to the coefficient of determination ($R^2$), root mean squared error (RMSE), and mean absolute percentage error (MAPE). For the first time, the overall results of GEP and GMDH networks are interpreted in terms of accuracy, number of input parameters, complexity of the model equations, and computational time. In addition, the model equations generated in this study can be employed in future studies on buildings with similar climatological conditions and electrical energy consumption profiles.
• To the best of the authors' knowledge, this is the first in-depth investigation of the performance metrics of short-term building electrical energy consumption forecasts in terms of several explanatory variables. The effects of short-wave irradiation, the start and end of shift hours, weekends and holidays, and seasonal transitions on short-term building electrical energy consumption forecasting are deduced along with the hourly, daily, and monthly trends of prediction complexity with reference to MAPE.
The rest of the study is organized as follows: Section 2 presents the state-of-the-art review consisting of review studies intersecting building electrical energy consumption forecasting with artificial intelligence (AI), case studies in the field of short-term building electrical energy consumption prediction focusing on statistical and AI techniques for nonresidential buildings, and research studies utilizing GEP and GMDH networks for forecasting short-term electric load, demand, or electrical energy consumption; Section 3 introduces data source and acquisition, data wrangling, data set properties, and forecasting methods comprising the fundamentals of GEP and GMDH networks; Section 4 hosts discussion and experimental results of in-depth analyses; and finally, Section 5 concludes the study by emphasizing the prominent results for future studies.

Related Work
The literature contains a variety of reviews that summarize building energy consumption forecasting methodologies from diverse perspectives. Firstly, Zhao and Magoules reviewed building energy consumption forecasting by classifying the methodologies into engineering methods, statistical methods, and AI methods [4]. Ahmad et al. summarized the applications of artificial neural networks (ANN) and support vector machines (SVM) for building energy consumption prediction by emphasizing the potential of a hybrid method that merges GMDH networks with least squares SVM (LSSVM) [5]. Raza and Khosravi conducted a review study on AI-based load demand forecasting techniques not only for buildings but also for smart grids by explaining all phases of short-term load forecasting comprehensively [6]. Daut et al. reviewed the prediction of building electrical energy consumption by dividing the methodologies into conventional, AI, and hybrid methods [7]. Wang and Srinivasan compared single and ensemble models for AI-based building energy consumption forecasting within a review study [8]. Wei et al. presented a review of data-driven approaches for both prediction and classification of building energy consumption, including practical applications of the approaches [9]. In a similar manner, Amasyali and El-Gohary reviewed data-driven building energy consumption forecasting studies by focusing particularly on the scopes of prediction, data properties and preprocessing methods, machine learning algorithms, and performance measures [10]. Lastly, Runge and Zmeureanu presented a review of forecasting energy use in buildings with ANN by highlighting applications, data, forecasting models, and performance metrics [11].
There are a limited number of studies in the literature that concentrate on short-term electrical energy consumption forecasting based on statistical and AI techniques for nonresidential buildings. Initially, Fan et al. presented a rigorous work on day-ahead building energy consumption forecasting, which employs an ensemble model whose weights are optimized by a genetic algorithm (GA); the ensemble model consists of a single ANN, auto-regressive integrated moving average (ARIMA), boosting tree (BT), k-nearest neighbors (kNN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), random forests (RF), and support vector regression (SVR) [12]. Ke et al. analyzed the load profile and implemented hours-ahead building load forecasts by obtaining data from a substation feeder at the Centennial Campus of North Carolina State University and using the similar day approach (SDA), direct curve fitting (DCF) with polynomial regression (PR), and MLR [13]. Wang et al. applied ensemble bagging trees (EBT) for forecasting the hour-ahead energy consumption of the Rinker Hall building at the University of Florida against a regression tree (RT) model [14]. Shabani and Zavalani utilized an incremental ANN approach against target mean (TM) for forecasting hour-ahead loads of a commercial building [15]. Zhu et al. compared the performances of ANN by applying different strategies for neuron numbers, activation functions, data filtering, and regrouping for forecasting day-ahead loads acquired from two buildings at the City University of Hong Kong [16]. Yong et al. suggested a combination of SDA and long short-term memory (LSTM) networks in comparison with ANN and a hybrid approach containing particle swarm optimization (PSO) and ANN for short-term load forecasting of a hotel building in Shanghai [17]. Ahmad et al. conducted a comprehensive work by obtaining data from a hotel building in Madrid and applied deep highway networks (DHN), SVR, and a tree-based ensemble (TBE) model for forecasting hour-ahead building heating, ventilation, and air-conditioning (HVAC) energy consumption [18]. Fang et al. tried to improve forecast accuracy by performing wavelet decomposition (WD) and ARIMA together as compared to the Holt-Winters method (HWM), LSTM, and seasonal auto-regressive integrated moving average (SARIMA) for daily energy consumption prediction of an office building in Qingdao, Shandong [19]. Fan et al. assessed deep network strategies including gated recurrent unit (GRU), LSTM, and recurrent neural networks (RNN) with several prediction approaches, such as direct, multi-input and multi-output (MIMO), and recursive approaches, in order to forecast the day-ahead energy consumption of an educational building in Hong Kong [20]. Finally, Divina et al. benchmarked different forecasting strategies, including ANN, ARIMA, ensemble, evolutionary algorithms (EA) for regression trees (EVTree), extreme gradient boosting (XGBoost), generalized boosted regression models (GBM), MLR, RF, and recursive partitioning and regression trees (RPart) for forecasting the short-term electrical energy consumption of thirteen buildings belonging to a university campus in the south of Spain [21]. A comparative analysis of the aforementioned studies on short-term building electrical energy consumption forecasting is tabulated in Table 1 according to performed models, building type, temporal granularity of the data set, forecast horizon, benchmark models, and performance results, respectively.
The literature comprises several studies that employed GEP and GMDH networks for short-term electrical energy consumption forecasting. Huo et al. developed an improved GEP model for short-term load forecasting and compared their model with traditional models of genetic programming (GP) and GEP [22]. Fan and Zhu indicated that a combination of empirical mode decomposition (EMD) and GEP may achieve higher accuracy than the combination of WD and GEP for short-term load forecasting [23]. Hosseini and Gandomi compared GEP models with multiple least squares regression (MLSR) and generalized regression neural networks (GRNN) for forecasting the day-ahead peak and total loads of a North American electric utility [24]. Deng et al. used an artificial fish swarm based hybrid GEP along with cloud computing in order to model distributed electric load forecasting in comparison with ANN, PSO-SVM, SVR, and traditional GEP on the data set of the EUNITE competition [25].
Sforna used GMDH networks for acquiring a function between electric load and temperature variables and compared GMDH networks with ANN on electrical and meteorological data of four major Italian cities, namely Florence, Milan, Naples, and Rome [26]. Huang and Shih utilized a combination of fuzzy modeling and GMDH networks on Taiwan's electric load data in order to improve the performance of their short-term load forecast model against ANN and ARIMA [27]. Abdel-Aal employed GMDH networks on Seattle's electrical and weather data to obtain analytical expressions between input and output variables in forecasting hourly and daily electric loads with different variations of ANN, abductive networks, and network committees (NC) [28][29][30]. Elattar et al. proposed a generalized locally weighted GMDH networks based EA for short-term load forecasting and evaluated the algorithm along with local support vector regression (LSVR), locally weighted GMDH networks (LWGMDH), locally weighted support vector regression (LWSVR), and traditional GMDH networks on two different data sets belonging to New York City and the Victorian electricity market of Australia [31]. Xu et al. applied GMDH networks in comparison with ARIMA for short-term load forecasting of New South Wales in Australia [32]. Koo et al. presented a comparative study that applied ANN, simple exponential smoothing (SES), and GMDH networks for forecasting Korean electric load data on an hourly basis [33], and another study in which the wavelet transform was first applied for decomposition before the implementation of the Holt-Winters method, ANN, and GMDH networks for one-day-ahead forecasting of hourly electric loads [34]. Jacob et al. employed GMDH networks and linear regression (LR) for forecasting the short-term electrical energy consumption of a university campus in Nigeria [35]. Zjavka and Snasel proposed a method named differential polynomial neural network, which merges the functionality of GMDH networks with differential equation substitutions, and carried out short-term load forecasting against ANN, SVM, and GMDH networks for the UK electricity transmission network and Canadian detached houses [36]. Yuniarti et al. tried to integrate the wavelet transform with GMDH networks for short-term load forecasting of a power company in Sumatra, Indonesia, and compared it with the coefficient method (CM), which is currently used by the company [37]. Liu et al. enhanced GMDH networks by introducing elastic net regression and difference degree weighting optimization for forecasting hourly loads in data sets pertaining to three locations in China against ANN, SVM, least absolute shrinkage and selection operator (LASSO), ridge regression (RR), and traditional GMDH networks [38]. For South Korea's hourly load data, Yu et al. suggested a forecasting methodology based on SVR, which implements GMDH networks and bootstrap methods for the input selection procedure, in comparison with different variations of linear correlation (LC) and mutual information (MI) based filter methods [39]. Izzatillaev and Yusupov analyzed hourly electrical energy consumption forecasting in a grid-connected microgrid within a commercial bank by employing GMDH networks and ANN [40].
A benchmark analysis of the studies that utilized GEP and GMDH networks for short-term electrical energy consumption forecasting is presented in Table 2 in terms of performed models, application type, forecast horizon, and compared models, respectively.
Table 2. Details of studies that employed gene expression programming (GEP) and group method of data handling (GMDH) networks for short-term electric load, demand, or electrical energy consumption forecasting.

Material and Methods
This section describes the material and the methods of the study. The material is the data set, and the methods are the forecasting methods that can generate model equations for the prediction task.

Material
Material of the study is the data set, which is firstly acquired, then wrangled, and lastly prepared as electrical, meteorological, and calendar data. Steps are described as follows.

Data Source and Acquisition
Hospitals may be described as highly sophisticated organizations in terms of functional, technological, economic, managerial, and procedural aspects. The reliability of continuous energy flow is of utmost importance for hospitals owing to their uninterrupted 24/7 operation. Çukurova University Balcalı Health Application and Research Hospital, to give its full name, is a large hospital complex and a pioneer health institution situated in Campus Balcalı of Çukurova University in the Sarıçam district of Adana, Turkey. Since 1987, the hospital has been serving a region in Southern Turkey without interruption, which requires a continuous supply of electricity for an emergency service, 42 polyclinics, 12 intensive care units, 23 operating rooms, 43 clinical services, 5 laboratories, a radiology unit, a nuclear medicine unit, a blood center, a burn unit, a sterilization unit, and a pharmacy, as well as surgery rooms, laundries, kitchens, and a morgue [41]. The hospital has 1200 beds, serves more than 3500 patients per day with over 4000 academic and administrative staff, and has an installed transformer capacity of around 18 MVA [42]. An aerial view of the hospital is illustrated in Figure 1. The data acquisition stage covers the interval between 2 October 2017 and 1 October 2018 with a resolution of 10 min. The data acquisition terminal for the hospital is the medium-voltage switchgear building where the electricity meter of the hospital is located. Electrical data of the hospital were obtained from the hospital's electricity meter via a three-phase energy logger during that interval. The logger is also connected to an on-site temperature-humidity transducer that measures ambient temperature and relative humidity. The logger records data through the current and voltage transformer connections in the terminal box of the electricity meter. The energy logger settings are adjusted to the multiplying factors of the current and voltage transformers.
Other meteorological data were acquired from MERRA-2 (Modern-Era Retrospective Analysis for Research and Applications, Version 2), which is a worldwide database of meteorological variables hosted by NASA and generated by the Goddard Space Flight Centre. The spatial resolution is approximately 50 km, which geographically corresponds to 0.5° in latitude and 0.625° in longitude [43]. The data acquisition stage is visualized in Figure 2.

Data Wrangling
Data wrangling can be stated as importing, tidying, and transforming data from its raw form to another format with an intention of making the data more valuable and suitable for sophisticated tasks.
The temporal granularity of the gathered data is converted from 10-min to 1-h via a forecast time horizon converter proposed in [42]. During the conversion process, missing values (a ratio below 1%, occurring sporadically due to power outages at the hospital) and outliers are first detected and then treated via ARIMA with Kalman smoothing, owing to its frequent use in recent energy studies [44,45] and its superior performance in comparison with a variety of imputation methods employed in [42]. In brief, the ARIMA with Kalman smoothing imputation method performs Kalman smoothing on the state-space representation of an ARIMA model [46]. Analytically, Kalman filters are applied in two phases that are fundamentally based on the state-space models indicated in the following equations as
$$x_t = F_t x_{t-1} + \varepsilon_t \quad (1)$$
$$y_t = H_t x_t + \omega_t \quad (2)$$
where $x_t$ is the state vector of a given system at an instant in time $t$, $y_t$ is the corresponding measurement vector at $t$, $F_t$ is the state-transition parameter of the system, $\varepsilon_t$ is the random state noise term, $H_t$ is the measurement parameter, and $\omega_t$ is the measurement error term. In the first phase, the state and the corresponding variance of the system are estimated by using Equation (1). In the second phase, the estimated state is updated by using both Equations (1) and (2). The ARIMA with Kalman smoothing imputation method utilizes an automatic function that carries out a search in order to find the best ARIMA model [47]. After data wrangling, the raw data with 52,416 rows and 19 columns are reduced to a cleansed data set with 8736 rows and 18 columns representing input and target variables.
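The two-phase recursion can be illustrated with a minimal sketch. It assumes a scalar local-level model ($F_t = H_t = 1$) with hypothetical noise variances `q` and `r`, and it runs only the forward filter; it is not the ARIMA state-space model or the smoother used in the study, and the function name `kalman_fill` is illustrative.

```python
from typing import List, Optional


def kalman_fill(y: List[Optional[float]], q: float = 0.01, r: float = 0.1) -> List[float]:
    """Fill missing readings (None) using a scalar local-level Kalman filter.

    Assumed model (illustrative only): x_t = x_{t-1} + eps_t, y_t = x_t + omega_t,
    with Var(eps_t) = q and Var(omega_t) = r.
    """
    # Initialise the state with the first available observation.
    x = next(v for v in y if v is not None)
    p = 1.0  # initial state variance (assumption)
    filled = []
    for obs in y:
        # Phase 1 (prediction): propagate state and variance through the state equation.
        x_pred, p_pred = x, p + q
        if obs is None:
            # No measurement available: the prediction becomes the imputed value.
            x, p = x_pred, p_pred
            filled.append(x)
        else:
            # Phase 2 (update): correct the prediction with the measurement.
            gain = p_pred / (p_pred + r)
            x = x_pred + gain * (obs - x_pred)
            p = (1.0 - gain) * p_pred
            filled.append(obs)  # observed values are kept as measured
    return filled


series = [10.0, 10.2, None, 10.6, None, None, 11.0]
result = kalman_fill(series)
print(result)
```

When a reading is missing, the update phase is skipped and the state variance grows by `q`, which is the mechanism that lets a state-space model bridge short outages.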

Data Set Properties
The data set employed for short-term building electrical energy consumption forecasting in this study has 3 input categories and 17 input variables that are summarised in Table 3.
Electrical variables standing for historical electrical energy consumption, meteorological variables taken from temperature-humidity transducer and MERRA-2, and calendar variables constitute the input variables of the data set.
The electrical energy consumption values of the previous 1 h, 1 day, and 1 week form the retrospective electrical variables. Meteorological variables contain transducer device temperature and relative humidity, which are gathered from the on-site temperature-humidity transducer, and outdoor temperature and relative humidity, pressure, wind speed and direction, rainfall, and short-wave irradiation, which are acquired from MERRA-2. Calendar variables are obtained from the date and time logs of the energy logger and then evaluated as hour of day (0-23), day of month (1-31), type of day (0 for working days and 1 for weekends and public holidays), week of year, and month of year (1-12), respectively. Graphs of actual electrical energy consumption, transducer device temperature, outdoor temperature, and short-wave irradiation between October 2017 and October 2018 are illustrated in Figure 3.
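As an illustration of how the retrospective and calendar variables can be derived from an hourly series, the following sketch builds one feature row per hour. The function name, the dictionary keys, the ISO week convention, and the `holidays` argument are assumptions made for the example; the meteorological columns are omitted.

```python
from datetime import date, datetime, timedelta


def build_features(timestamps, energy, holidays=frozenset()):
    """Return one feature dict per hour that has a full week of history."""
    rows = []
    for t in range(168, len(energy)):  # 168 h = one week of look-back
        ts = timestamps[t]
        rows.append({
            "E_h": energy[t - 1],      # consumption of the previous hour
            "E_d": energy[t - 24],     # previous day, same hour
            "E_w": energy[t - 168],    # previous week, same hour
            "hour_of_day": ts.hour,    # 0-23
            "day_of_month": ts.day,    # 1-31
            # 1 for weekends and public holidays, 0 for working days
            "type_of_day": int(ts.weekday() >= 5 or ts.date() in holidays),
            "week_of_year": ts.isocalendar()[1],  # ISO week number (assumption)
            "month_of_year": ts.month,  # 1-12
            "target": energy[t],
        })
    return rows


start = datetime(2017, 10, 2, 0, 0)  # first logged day in the study
stamps = [start + timedelta(hours=h) for h in range(200)]
values = [float(h) for h in range(200)]  # placeholder consumption values
rows = build_features(stamps, values)
print(rows[0])
```

Rows without a complete week of history are dropped in this sketch, which is one simple way to handle the start of the series.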

Forecasting Methods
The fundamentals of GEP and GMDH networks are explained, respectively, in this subsection. Both methods can constitute analytical expressions between input variables and target variables without the need for feature selection.

Gene Expression Programming
GEP is an enhanced methodology primarily based on GA and GP [48]. GEP contains five basic components, namely function set, terminal set, fitness function, control parameters, and termination condition.
Although a parse tree representation is used in traditional GP, GEP employs fixed-length character strings ([+, *, *, $\beta_1$, $x_1$, $\beta_2$, $x_2$] for the expression tree in Figure 4) for representing solutions to the problems, which are then visualized as parse trees [24]. A tree in GEP is named an expression tree, and an example is shown in Figure 4. The expression tree shown in Figure 4 corresponds to Equation (3).
$$y = \beta_1 x_1 + \beta_2 x_2 \quad (3)$$
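The mapping from the linear string to the expression tree can be sketched as follows. The decoder fills the tree level by level from left to right, which is the reading order of the string given for Figure 4; the names `decode`, `evaluate`, `b1`, and `b2` are illustrative, and only binary operators are supported in this sketch.

```python
ARITY = {"+": 2, "-": 2, "*": 2}
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def decode(genes):
    """Build a (symbol, children) tree from a breadth-first gene string."""
    nodes = [(g, []) for g in genes]
    next_child = 1
    for symbol, children in nodes:
        for _ in range(ARITY.get(symbol, 0)):
            if next_child == len(nodes):
                raise ValueError("gene string ends before all arguments are filled")
            children.append(nodes[next_child])
            next_child += 1
    return nodes[0]


def evaluate(node, env):
    """Evaluate a decoded tree for the variable and constant values in env."""
    symbol, children = node
    if symbol in OPS:
        left, right = (evaluate(child, env) for child in children)
        return OPS[symbol](left, right)
    return env[symbol]


tree = decode(["+", "*", "*", "b1", "x1", "b2", "x2"])
value = evaluate(tree, {"b1": 2.0, "x1": 3.0, "b2": 4.0, "x2": 5.0})
print(value)
```

With these values the tree evaluates `b1 * x1 + b2 * x2`, that is, the sum of the two products encoded by the string.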
The flowchart of the GEP algorithm is indicated in Figure 5. In short, the mechanism starts with the random production of chromosomes to generate the first population. Afterwards, the expression of chromosomes and the evaluation of each individual's fitness are carried out consecutively. Next, individuals are selected with respect to fitness for reproduction with modification. The process is repeated for a determined number of generations or until a solution is found [48]. In other words, mathematical evolution starts with producing candidate functions, followed by mutation, breeding, and lastly, natural selection in order to model the data as closely as possible. In addition to functions and variables, an expression can possess constants. The constants can evolve by assigning the values explicitly or randomly. For the optimization of random constants, nonlinear regression algorithms, such as differential evolution, Gauss-Newton, Levenberg-Marquardt, or a combination of them, can be employed for refining the constants. Advantages and disadvantages of GEP are described in Table 4.

Table 4. Advantages and disadvantages of GEP.
Advantages:
1. Extremely versatile
2. Easy to understand with its linear and ramified structure
3. Faster than old GAs
4. Has no invalid individuals
5. Overcomes the shortcomings of GA and GP
Disadvantages:
1. Does not ensure that the levels of functional complexity in the phenotype are also directly reflected in the genotype
2. The best individual is maintained, but some of the better individuals may be lost
3. Needs much additional computation owing to mutations, crossovers, and rotations before reaching an optimal solution
4. Indicates premature convergence

Among GEP applications, symbolic regression is a broadly utilized method to obtain an analytical expression for a desired output from the input variables of a given data set. Each sample of the data set contains input variables and outputs, which can be stated as
$$(x_{i,1}, x_{i,2}, \ldots, x_{i,n}, o_{i,1}, o_{i,2}, \ldots, o_{i,m})$$
where $n$ represents the number of input variables, $m$ corresponds to the number of outputs, and $x_{i,j}$ and $o_{i,j}$ are the $j$th input and output of the $i$th sample. MSE or RMSE is frequently used to measure the accuracy of fitting. Symbolic regression needs to find the optimal $\Gamma^*$ that minimizes the error for the given data set
$$\Gamma^* = \arg\min_{\Gamma} f(\Gamma)$$
where $\Gamma$ is a candidate formula and $f(\Gamma)$ gives the fitting error of $\Gamma$ [49].
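The selection of the formula with the smallest fitting error can be sketched for a single output ($m = 1$). The candidate formulas below are hand-written stand-ins for evolved individuals, and the data are synthetic; both are assumptions for illustration, with RMSE used as the fitting error.

```python
import math


def rmse(formula, samples):
    """Fitting error of a candidate formula over (inputs, output) samples."""
    squared = [(formula(*inputs) - output) ** 2 for inputs, output in samples]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic data generated by a known relation (assumption for the example).
samples = [((x1, x2), 2.0 * x1 + 0.5 * x2) for x1 in range(5) for x2 in range(5)]

# Hand-written candidates standing in for individuals of a GEP population.
candidates = {
    "x1 + x2": lambda x1, x2: x1 + x2,
    "2*x1 + 0.5*x2": lambda x1, x2: 2.0 * x1 + 0.5 * x2,
    "x1 * x2": lambda x1, x2: x1 * x2,
}

# The best formula is the one that minimizes the fitting error.
best = min(candidates, key=lambda name: rmse(candidates[name], samples))
print(best, rmse(candidates[best], samples))
```

In a full GEP run the candidate set is not fixed; it is regenerated every generation by selection and genetic operators, while the same error measure ranks the individuals.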

GMDH Networks
GMDH networks, also known as polynomial neural networks, principally operate as self-organizing networks in which the neuron connections, the number of selected neurons, the number of layers, and the number of neurons in hidden layers are not fixed but are determined automatically during training in order to reach an optimal model for maximum accuracy without overfitting [50]. To do so, GMDH networks use least squares regression to find the best mathematical relation between input and output variables by means of a reference function, which can be expressed as
$$y = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijk} x_i x_j x_k + \cdots$$
where $y$ corresponds to the output, $X = (x_1, x_2, \ldots, x_n)$ represents the input vector, and $a$ symbolizes either the coefficient or the weight vector [51]. Ordinarily, the previous equation is utilized in the quadratic form of two variables such that
$$y = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2 \quad (5)$$
In GMDH networks, the input layer contains a neuron for each input variable indicated by $x$, as illustrated in Figure 6. Each neuron in the first layer acquires its inputs from two of the neurons in the input layer. The neurons in the second and the third layers obtain their inputs from two of the neurons in the previous layer, and this process continues up to the output layer. The output layer takes its two inputs from the previous layer and generates the final result, which is the most suitable analytical expression for the relationship between input and output variables. The flowchart of GMDH networks is indicated in Figure 7 [52].
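Fitting a single neuron with the quadratic reference function of two variables amounts to an ordinary least squares problem with six coefficients. The sketch below solves the normal equations with a small Gaussian elimination routine; the function names, the coefficient ordering, and the synthetic data are assumptions for illustration.

```python
def quadratic_terms(xi, xj):
    """Terms of the two-variable quadratic: 1, xi, xj, xi*xj, xi^2, xj^2."""
    return [1.0, xi, xj, xi * xj, xi * xi, xj * xj]


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_neuron(xi, xj, y):
    """Least squares coefficients [a0..a5] of one quadratic neuron."""
    design = [quadratic_terms(a, b) for a, b in zip(xi, xj)]
    gram = [[sum(z[p] * z[q] for z in design) for q in range(6)] for p in range(6)]
    moment = [sum(z[p] * t for z, t in zip(design, y)) for p in range(6)]
    return solve(gram, moment)


# Synthetic data from a known quadratic (assumption for the example).
true_coeffs = [1.0, 2.0, -1.0, 0.5, 0.25, -0.1]
grid = [(float(a), float(b)) for a in range(4) for b in range(4)]
xs_i = [a for a, _ in grid]
xs_j = [b for _, b in grid]
ys = [sum(c * t for c, t in zip(true_coeffs, quadratic_terms(a, b))) for a, b in grid]
coeffs = fit_neuron(xs_i, xs_j, ys)
print([round(c, 6) for c in coeffs])
```

A GMDH layer repeats this fit for every pair of inputs and keeps only the neurons with the lowest error on the control data.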
If $n$ is the number of neurons in a layer in GMDH networks, then the number of candidate neurons in the next layer is calculated as
$$\binom{n}{2} = \frac{n(n-1)}{2}$$
for two-variable polynomials. Additionally, it should be noted that a neuron may also skip layers and connect directly from the input variables to one of the later layers in GMDH networks, as demonstrated with the dashed lines from $x_5$ to $z_6$ in Figure 6 as an example. During the training process, two different sets of input data are employed, namely the main training data and the control data, the latter of which is used to detect overfitting. The control data generally contain about 20% as many rows as the main training data. During the training algorithm, MSE is computed for each neuron on the control data as well. If the MSE of the best neuron in the current layer as measured with the control data is lower than the MSE of the best neuron in the previous layer, and the maximum number of layers has not yet been reached, the training process continues to construct the next layer. Otherwise, the training process halts. It should be noted that when overfitting starts, the error as measured with the control data increases, and therefore the training process stops. Pros and cons of GMDH networks are stated in Table 5.

Table 5. Pros and cons of GMDH networks [53,54].
Pros:
1. Presents adaptive network topologies which can be customized to the given problem
2. Finds locally good weights owing to the reliability of the fitting technique
3. Can be trained rapidly by sparse connectivity
Cons:
1. Tends to produce quite complex polynomials for simple systems
2. Does not guarantee building up the true structure
3. Biased estimates of coefficients due to the least squares method
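The candidate count and the layer-growing stopping rule described before Table 5 can be sketched as follows. The function names and the list of control-data errors are hypothetical; the errors stand for the MSE of the best neuron of each successive layer.

```python
from itertools import combinations


def candidate_pairs(n_neurons):
    """All unordered pairs of neurons that can feed a two-variable polynomial."""
    return list(combinations(range(n_neurons), 2))


def layers_to_keep(control_mse_per_layer, max_layers):
    """Number of layers kept when training stops at the first non-improving layer."""
    kept = 1
    pairs = zip(control_mse_per_layer, control_mse_per_layer[1:])
    for previous, current in pairs:
        # Stop when the layer limit is reached or the control error stops decreasing.
        if kept >= max_layers or current >= previous:
            break
        kept += 1
    return kept


n_candidates = len(candidate_pairs(17))  # 17 input variables in the data set
kept_layers = layers_to_keep([0.9, 0.5, 0.4, 0.45, 0.3], max_layers=20)
print(n_candidates, kept_layers)
```

The later drop to 0.3 in the example is never reached, because the rule halts as soon as the control error rises, which is the behavior the text describes for the onset of overfitting.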

Results and Discussion
All computations in the scope of this work were performed on a Macintosh computer with OS version of 10.15.2, a processor of 2.4 GHz (Intel Core i5), and a memory size of 8 GB. For all computing tasks, RStudio was used as an integrated development environment for R programming language, which is one of the most popular languages for statistical computing and data analytics with elegant graphics [55].
Values stored in the input variables of the data set are scaled between 0 and 1 for normalization, which eliminates the units of the various data types, reduces computational time and memory use, and allows multiple data columns to be benchmarked in a similar way. In the assessment of the performances of GEP and GMDH networks, $R^2$, RMSE, and MAPE are utilized in this study. Formulae of the performance metrics are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$
where $y_i$ is the actual or measured output, $\hat{y}_i$ is the predicted output, $\bar{y}$ is the mean of $y_i$, and $n$ indicates the number of observations [41]. For model testing and evaluation, random sampling is applied to GEP and GMDH networks in such a manner that 80% of the data set is randomly selected to constitute the training data, and the remaining 20% forms the validation data.

Parameters of GEP
Figure 8 and yields the following equation
where $\hat{E}$ is the predicted electrical energy consumption, $T_O$ and $I_{SW}$ represent the outdoor temperature and short-wave irradiation values taken from MERRA-2, $E_h$ corresponds to the electrical energy consumption value for the previous one hour, and $h_{od}$ is the value of the calendar variable standing for hour of day.

Parameters of GMDH Networks
For GMDH networks, the quadratic reference function with two variables stated in Equation (5) is employed. Parameters for the GMDH networks are predetermined as 20 for the number of both maximum network layers and neurons per layer, 16 for maximum polynomial order, and 10 −4 for convergence tolerance. Allowed network configuration for the neurons in the next layer is designated as the selection of neurons in the previous layer and original input variables. A hold-out sample of 20% is utilized for protection control in order to avoid overfitting.
The best GMDH network model having seven input variables is found as Equation (7), where $N$ corresponds to the neurons from $N_1$ to $N_{16}$ such that each neuron represents a quadratic equation, $T_D$ and $H_D$ stand for transducer device temperature and relative humidity, $t_{od}$ symbolizes the calendar variable type of day, and $E_d$ indicates the electrical energy consumption for the previous day at the same hour. Detailed parameters and coefficients of Equation (7) are given in Table 6.

Overall Results
Before the overall results are presented, the correlation coefficients of the input, target, and predicted variables are visualized as a map in Figure 9 according to Pearson's correlation. Pearson's correlation yields a number between −1 and 1 that shows the extent to which two variables are linearly correlated. It should be emphasized that blank squares within the correlation map represent statistically insignificant correlations, that is, p-values larger than 0.01.
Figure 9. Correlation map of input, target, and predicted variables.
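For reference, the Pearson coefficient behind each cell of such a map can be computed as in the following sketch. The function name and the sample columns are illustrative, and the significance test that decides which cells are left blank is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rising = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
falling = pearson([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
print(rising, falling)
```

A perfectly increasing linear relation gives a coefficient at the upper bound and a perfectly decreasing one at the lower bound, which matches the range stated in the text.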
When the overall performances of the applied methods are evaluated in terms of accuracy, GMDH networks give slightly better results than GEP according to $R^2$, RMSE, and MAPE for the short-term building electrical energy consumption forecasting problem, as shown in Table 7. However, it should be noted that the best GMDH network model employs seven input variables in several equations of high polynomial order, while the best GEP model uses four input variables in one simple equation. Consistent with this simplicity, the computational time required to reach the best model with GEP is one fourth of the time needed for GMDH networks, as indicated in Figure 10. Thus, the choice of method for the short-term building electrical energy consumption forecasting problem depends on the order of priorities. If accuracy is more important than computational time and simplicity, GMDH networks are recommended. Otherwise, GEP is suggested for its low computational complexity and run time.
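The normalization and the three metrics used for this comparison can be computed as in the following sketch. The function names and the sample values are assumptions for illustration and are unrelated to the results in Table 7.

```python
import math


def min_max_scale(column):
    """Scale a numeric column to the range [0, 1]."""
    low, high = min(column), max(column)
    return [(v - low) / (high - low) for v in column]


def r_squared(actual, predicted):
    """Coefficient of determination."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(r_squared(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
print(min_max_scale([2.0, 4.0, 6.0]))
```

MAPE divides by the actual value, so it is only defined when no actual consumption value is zero, which holds for a hospital that operates continuously.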

Additionally, the actual values and the values predicted by GEP and GMDH networks are plotted in Figure 11.

Discussion of In-Depth Investigation Results
Daylight utilization is a crucial topic not only for electrical energy efficiency studies but also for architectural indoor lighting studies. Short-wave irradiation is present during daylight and is considered a prominent variable affecting the energy consumption of sustainable buildings. One distinctive finding of this study concerns the effect of short-wave irradiation on short-term building electrical energy consumption forecasting. $I_{SW}$ appears in both model equations; hence, it is advised to be included as an explanatory variable in further studies. Short-wave irradiation affects outdoor and indoor temperature, which in turn affect the building HVAC temperature set point and thereby the electrical energy consumption. This study reveals that when short-wave irradiation is nonzero, the difficulty of short-term building electrical energy consumption prediction increases significantly, as indicated in Table 8.
Another novel contribution of this work, which to the best of the authors' knowledge has not previously been reported in the literature, is the in-depth investigation of the error-related performance metrics of the short-term forecasts with respect to hour of day, name of day, type of day, and name of month. The short-term forecasts are examined according to the hour of day in order to identify the challenging hours in building electrical energy consumption prediction. The results presented in Table 9 show that the two hours preceding the shift start (06:00-07:00 and 07:00-08:00) and the hour preceding the shift end (16:00-17:00) have the largest errors and are difficult to predict. The forecasts are also analyzed in detail in terms of the name of day, and the results are shared in Table 10. According to Table 10, the difficulty of prediction tends to decrease from the first day of the week (Monday) to the end of the week (Sunday).
For the forecasts according to the type of day, the in-depth analyses indicate that forecasting working days is more difficult than predicting weekends and holidays, as illustrated in Table 11.
Months with peak errors are presented in Table 12, wherein October and April possess the largest errors in comparison with the others, owing to the fact that significant meteorological changes occur in these months due to the seasonal transitions from summer to winter and vice versa.
Key results of the in-depth investigations are summarized in Figure 12. The effect of shift start and end on an hourly basis, the decreasing trend from Monday to Sunday, and the peak monthly errors during seasonal transitions are highlighted in Figure 12 for both GEP and GMDH networks.
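The kind of breakdown described above can be sketched as a grouped MAPE computation, in which the absolute percentage errors are averaged per hour of day, day of week, or month. The helper `mape_by_group` and the records below are hypothetical and only illustrate the procedure; they do not reproduce the values of Tables 9 to 12.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, Hashable, Iterable, Tuple

Record = Tuple[datetime, float, float]  # (timestamp, actual, predicted)


def mape_by_group(records: Iterable[Record],
                  key: Callable[[datetime], Hashable]) -> Dict[Hashable, float]:
    """Return MAPE (in percent) for each group defined by key(timestamp)."""
    sums: Dict[Hashable, float] = defaultdict(float)
    counts: Dict[Hashable, int] = defaultdict(int)
    for timestamp, actual, predicted in records:
        group = key(timestamp)
        sums[group] += abs(actual - predicted) / abs(actual)
        counts[group] += 1
    return {group: 100.0 * sums[group] / counts[group] for group in sums}


# Made-up records for illustration only (1 April 2019 was a Monday).
records = [
    (datetime(2019, 4, 1, 6), 100.0, 104.0),   # Monday, 06:00
    (datetime(2019, 4, 1, 12), 200.0, 201.0),  # Monday, 12:00
    (datetime(2019, 4, 7, 6), 100.0, 102.0),   # Sunday, 06:00
    (datetime(2019, 4, 7, 12), 200.0, 200.0),  # Sunday, 12:00
]

by_hour = mape_by_group(records, lambda ts: ts.hour)
by_weekday = mape_by_group(records, lambda ts: ts.weekday())  # 0 = Monday
by_month = mape_by_group(records, lambda ts: ts.month)
```

The same helper could group by type of day or by whether short-wave irradiation is zero, provided the grouping key is derived from the corresponding variable instead of the timestamp.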

Conclusions
When the share of building energy consumption in the global final energy use and the evolution of existing electric power systems toward smart grids and the IoE are considered together, the significance of short-term building electrical energy consumption forecasting becomes clear. The complexity of the forecasting process stems from the fact that many factors influence building energy consumption and that every building has its own characteristics, such as physical properties and operational schedule.
Recent studies in the literature show an interest in the application of machine learning algorithms to predict short-term building electrical energy consumption. However, most of them produce an abstruse analytical expression between explanatory variables and response variables. In this study, GEP and GMDH networks are employed to forecast short-term building electrical energy consumption for a large hospital complex in the Eastern Mediterranean, owing to their capability of generating easily understandable model equations between input variables and target variables without the need for feature selection. Both methods are run under identical constraints and evaluated in terms of $R^2$, RMSE, and MAPE.
According to the results of the analyses, the best MAPE scores of GMDH networks and GEP are calculated as 0.620% and 0.641%, respectively. GEP can be chosen for its low computational complexity and run time, while GMDH networks may be selected when slightly better accuracy is required. In-depth investigations are carried out in this study to highlight the increase in forecasting difficulty during challenging transitional periods by examining MAPE values. The acquired results reveal the effects of short-wave irradiation, start and end of the working hours, weekends and holidays, and seasonal transitions on short-term building electrical energy consumption forecasting, along with the hourly, daily, and monthly trends of the prediction difficulty with respect to MAPE.
Consequently, it should be emphasized that this study is the first attempt in the literature that benchmarks GEP and GMDH networks for short-term building electrical energy consumption forecasting to create genuine and simple model equations, interpreting the results with regard to accuracy, number of input parameters, complexity of model equations, and computational time. Furthermore, the model equations produced in this study can be utilized in future studies related to buildings possessing similar meteorological conditions and electrical energy consumption profiles.

Acknowledgments: The authors are grateful and would like to thank the anonymous reviewers for their valuable comments and suggestions.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: