Application of Artificial Neural Networks for Virtual Energy Assessment

A virtual energy assessment (VEA) refers to the assessment of the energy flow in a building without physical data collection. It was occasionally conducted on residential and commercial buildings before the COVID-19 pandemic. However, there is no established framework for conducting this type of energy assessment. The COVID-19 pandemic has catalysed the implementation of remote energy assessments and remote facility management. In this paper, a novel framework for VEA is developed and tested on case study buildings at the University of Melbourne. The proposed method is a hybrid of top-down and bottom-up approaches: gathering general information about the building and its historical data, then investigating and modelling the electrical consumption with an artificial neural network (ANN) to project future consumption. Through sensitivity analysis, the outdoor temperature was found to be the most influential parameter for electrical consumption. The lockdown of the buildings provided an invaluable opportunity to assess the electrical baseload with zero occupancy and usage of the buildings. Furthermore, comparing the baseload with the consumption projected through ANN modelling accurately quantifies the energy consumption attributed to occupation and operational use, referred to as ‘operational energy’ in this paper. Differentiating and quantifying the baseload and the operational energy may aid energy conservation measures that specifically target each of these two distinct components of energy consumption.


Introduction
Energy efficiency is the hidden fuel for meeting the current global demand, yet it remains underutilised despite its proven potential [1]. According to the 2018 International Energy Efficiency Scorecard Report [2], Australia ranked 17th out of the top 25 countries, which together consume 78% of all energy on the planet, indicating the urgency of rallying innovative endeavour in the field of energy efficiency. In conjunction with this, the state of Victoria's Climate Change Act 2017 targets net-zero greenhouse gas emissions by 2050, which includes transforming the commercial building sector to counter the pressure that a rapidly growing population places on building emissions [3]. To curb the gradual rise of building energy consumption, it is therefore critical to develop a holistic framework for identifying energy flow and waste within a building, replacing the traditional periodic walk-through energy audits. Furthermore, the gradual rise of energy costs has heightened the urgency of energy conservation and efficiency, which has created a need for energy audits to be conducted regularly in the commercial and industrial sectors [4,5].
An energy audit is known as the first step in identifying energy flow within a building, which helps to achieve energy savings, improve indoor user comfort and productivity, and reduce environmental impact [6]. This assessment can be used by large and small businesses and even households to achieve greater energy efficiency improvements and carbon emissions reductions. A survey from the U.S. Energy Department shows that energy audits had saved about 20% of energy for commercial buildings as of 2020, and the savings are expected to reach 30% by 2030 with the utilisation of current technology. However, if emerging technologies are introduced, this number is expected to reach 55% within five years [7]. Joshi [8] claims that a reduction in the amount of energy input can be achieved without reducing the level of useful output services, which leads to the implementation of energy conservation measures throughout a building's lifecycle as an emerging response to the rising energy consumption and carbon emissions of the global building sector. A traditional energy audit involves physical functional observation of the building, a series of onsite data acquisitions of indoor quality parameters, and operational tests such as leak tests, infrared imaging, blower door tests, and equipment sub-metering (as defined by the standards in [6]).
Traditional onsite energy audits require a substantial financial commitment in terms of human resources allocation, travelling time, and installation of instrumentation. In some circumstances, the overall implementation and energy audit cost could exceed the potential energy cost saving. A traditional energy audit provides a periodic snapshot of the building energy behaviour [9]. On the other hand, a virtual energy assessment (VEA) does not involve a physical walk-around of the building and aims to cover the activities of the physical energy audit remotely, without travelling time or the allocation of a physical auditor onsite [10]. Conceivably, VEA is a way to save time and cost and, more importantly, it has the potential to provide a more frequent assessment of the building energy behaviour, to the extent of autonomous continuous assessment with machine learning. The concept of VEA is not entirely novel; it was occasionally conducted before the COVID-19 pandemic in several cases of residential and commercial buildings, as summarised in Table 1. However, it was not widely used, as building owners preferred the traditional energy audit [11]. The unprecedented time of the COVID-19 pandemic has catalysed the application of VEA and provided the opportunity to unlock the market for VEA.
Energy modelling such as VEA can be conducted through two approaches: the bottom-up approach and the top-down approach [12]. The bottom-up approach entails thermal analysis of an individual building, with data related to the building's enclosure, schedules, external weather conditions, internal loads, and potential systems. A white-box model (a bottom-up approach) is commonly used for virtual energy assessments, where the prediction relies on simplified heat balance calculations, aggregated weather libraries, fixed schedules, and thousands of inputs describing the building's characteristics. Consequently, inaccurate conclusions about energy efficiency can be reached, leading to a gap between the simulated models and actual building interactions. Conversely, a black-box model (also a bottom-up approach) is based on the building's actual energy consumption datasets and provides insight into the existing building performance without requiring extensive background information on the building. The artificial neural network (ANN) has been a high-performing tool for VEAs [13].

Table 1. Literature review of building energy.

Study: [14]
Building type/Assessment type: Residential/VEA
Tool: Regression model
Summary: The virtual assessment was performed based on top-down modelling approaches. The model was based on large publicly available sample data of residential houses from one region and has never been tested in another region. Furthermore, publicly available data might contain incorrect entries that distort the accuracy of the models.

Study: Using artificial neural networks to assess HVAC-related energy saving in retrofitted office buildings [15].
Summary: Two prediction models were developed, MLR and ANN (feedforward multilayer perceptron), using large datasets obtained during energy audits. ANN has superior performance to the MLR model. However, it lacks explanations of its internal parameters and takes longer to train on a trial-and-error basis, whereas the MLR model provides a transparent understanding of the linear relationship between the dependent and independent variables. The variable selection process is similar for both models and the selected variables overlap. There may be other variables that were not considered in the process.

Study: Neural networks for smart homes and energy efficiency [16].
Building type/Assessment type: Residential/N.A.
Tool: Neural network
Summary: The paper discussed theoretical approaches to a self-regulated heating system for each unit in communal housing, controlled by a smart home system that includes neural networks trained on tenant preferences using data acquired from sensors and live feedback. A simple recurrent network was deemed sufficiently effective; however, the appropriate function depends on the required number of dimensions and output data. The discussion did not include any examples where the approach was practically implemented.

Study: Energy analysis of a building using artificial neural network: A review [17].
Building type/Assessment type: Various building classifications/N.A.
Tool: Neural network
Summary: The paper reviewed diverse applications of ANN in the prediction of building energy consumption, with the three most used networks being feedforward, competitive, and recurrent networks. The paper also stated that indoor air temperature is often regarded as the only control variable whilst other thermal comfort factors such as humidity were rarely considered; hence it might be beneficial to develop control strategies based on thermal comfort. The performance and adaptability of ANN models in a constantly changing environment also needed to be considered.

Building type/Assessment type: Industrial and Commercial/Traditional onsite
Tool: Various auditing tools such as Heating Assessment and Survey Tool programming
Summary: Energy efficiency measures were gauged for six industrial process case studies to reduce fuel consumption in the U.S. The energy assessment procedure targeted specific processes and depended on walk-through bottom-up approaches and basic thermal analysis tools. The dependency on averaging and simple calculations in the case studies led to overestimating the reduction in energy consumption. In addition, the wide variety of energy processes limits the versatility of auditing procedures, which should only describe a broad framework of audits.

Study: Application of multiple linear regression and an artificial neural network model for the heating performance analysis [19].
Tool: Artificial neural network and regression model
Summary: MLR and ANN models were developed as the measurement and verification baseline for probable future energy conservation measures in a ground source heat pump (GSHP) system. Various MLR models were developed to specify the factors influencing GSHP performance and to establish prediction accuracy for the optimal ANN architecture. The deep belief network (DBN) was used as the ANN model to counter the impact of backpropagation sensitivity. This research highlighted the potential future application of ANN as a smart energy audit tool to provide energy conservation solutions.

Study: Applying computer-based simulation to energy auditing [20].
Building type/Assessment type: Commercial/N.A.
Tool: eQuest simulation software tool
Summary: A bottom-up approach was investigated through a case study of a high-rise tower in the U.S. The energy assessment required extensive knowledge of the building architecture and calibration, in addition to the building's internal loads and HVAC systems. The research pinpointed the limitations imposed by data, such as information accessibility, which prevent the models from reflecting reality.

Study: Random Forest-based hourly building energy prediction [21].
Building type/Assessment type: Commercial (Educational)/N.A.
Tool: Random forest prediction model
Summary: This paper proposed the use of a random forest prediction model to estimate the hourly energy consumption of a building. Randomisation of building variables is applied to generate initial training sets to develop a tree splitting process based on a collection of regression trees. The performance of the random forest prediction model was tested on educational buildings at the University of Florida. The paper showcased the ability of the random forest algorithm to predict hourly energy consumption.

Table 1 implies that VEA and neural networks are commonly regarded as two independent concepts. The studies it lists cover a range of building energy assessment approaches that paved the way for machine learning tools, which have so far rarely been applied to VEA. Ready access to large recorded databases has facilitated the rapid growth of machine learning (ML) tools such as ANN, which extract valuable insights and predictions to assist energy consumption performance [22]. ANN applications are surging, making ANN one of the most popular artificial intelligence (A.I.) models. Data analytics, supported by machine learning models and big data, has the potential to explore new solutions for pressing energy consumption issues [23].
In line with Victoria's Climate Change Act 2017 and Australia's national target, the paper proposes a framework with a hybrid of top-down and bottom-up approaches for a VEA, implementing ANN and conducting a "virtual" walk-around simultaneously. This framework is demonstrated using case study buildings on the University of Melbourne campus to highlight the energy gap revealed during the COVID-19 lockdown. Hence, this application can contribute to the University of Melbourne Sustainability Plan 2022-2025 [24] and to Victoria's commitment to reduce greenhouse gas emissions from electricity consumption by 2030.
The methodology is described in Section 2, which is divided into three sub-sections: data collection and quality checks, modelling with neural networks, and uncertainty analysis. The results are presented in Section 3, which depicts the historical timeline of the electrical consumption, the performance of the neural networks with a confidence interval, and the correlation between occupancy and electrical consumption. The building baseload, parameter sensitivities, and future work on the VEA are discussed in Section 4.

Methodology
In this paper, a hybrid methodology was developed consisting of exploratory data collection for the case study buildings and computer-aided neural network modelling using MATLAB, thus combining the top-down and bottom-up approaches. This section starts with a description of the buildings being assessed, followed by energy consumption data collection and quality checks, and data modelling with ANN function selection and architecture.

Case Study
The VEA was performed on four educational buildings at the University of Melbourne Parkville Campus during the COVID-19 pandemic lockdown in August 2020. To showcase the capabilities of the proposed methodology, multifunctional buildings were selected that contained public areas, lecture halls, laboratory facilities and office space. The selected buildings are described in Table 2. The buildings varied in age from 12 to 62 years, had usable areas of 3000 to 13,000 m², and used a range of enclosure materials such as concrete, brick and unglazed/glazed glass. The case study buildings are equipped with a real-time energy monitoring platform that is accessible through a secure web server. Energy data were collected from this platform for the case study buildings, together with external weather conditions and occupancy rates. The external weather data were collected from the nearest weather station and the occupancy rate was obtained from university security. All of the above-mentioned data can be accessed for most new buildings and for most existing buildings that have been upgraded with remote data access. The authors believe that the proposed methodology is replicable for any building with such datasets. Abnormalities and outliers were removed as part of the data quality checks.
Electrical consumption data were collected for the total load of each building separately using Clariti. End-use submetering was unavailable, so the meter data represent the total load consumption. Load consumption at 15-min intervals was gathered from June 2015 to January 2021 as the basis of the assessment. With the buildings closed due to COVID-19, details of the use of space, the building layouts and the building enclosures were obtained from facility managers. Furthermore, the electrical consumption data were screened for outliers; the Clariti tool is very precise and, for the cases at hand, no outlier was detected.
For the external conditions of the buildings, 10 years of detailed meteorological data in 15-min increments were procured from the Australian Bureau of Meteorology for the nearest weather station. These data included temperature, humidity, rainfall, and solar irradiance. Online images of the sides and tops of the buildings were obtained as a supplement using tools such as Google Earth and Nearmap.
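The outlier screening described here can be illustrated with a short sketch. The paper does not state which screening rule was used, so the median absolute deviation (MAD) rule below, the threshold `k`, the function name `flag_outliers`, and the sample readings are all assumptions chosen for illustration only.

```python
from statistics import median

def flag_outliers(readings, k=3.0):
    """Return indices of readings farther than k scaled MADs from the median."""
    med = median(readings)
    mad = median(abs(r - med) for r in readings)
    if mad == 0:
        return []  # no spread to judge against, so nothing is flagged
    scale = 1.4826 * mad  # scales the MAD to be comparable to a standard deviation
    return [i for i, r in enumerate(readings) if abs(r - med) > k * scale]

# Hypothetical 15-min load readings (kWh) containing one obvious spike
load_kwh = [41.2, 40.8, 42.1, 41.5, 400.0, 41.9, 40.5, 41.1]
outlier_idx = flag_outliers(load_kwh)
print(outlier_idx)
```

A robust rule such as this is less affected by the spike itself than a mean and standard deviation rule would be, which is why it is used in the sketch.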

Variables Selection
To improve the learning efficiency and compilation of the ANN model, it was necessary to reduce the input data variables to those with the most influence on the energy use in each building. For instance, occupants' behaviour is typically a key element of a building's energy performance. In this paper, information on occupants' behaviour was scarce and limited to certain years, which compromised its validity in the training process of the neural network; however, if available, it would have provided further fertile testing grounds for the neural network's ability to detect the end-use energy performance.

Artificial Neural Network Modelling
The advent of ANN enables the analysis of large and complicated datasets, extracting patterns and trends to provide future projections, with the potential to be trained for projections of unpredicted circumstances. It is currently widely used, with applications ranging from sales forecasting and web searching to visual imaging. An ANN model is a simplified version of a biological neural network that combines data and stores relationships between independent and dependent variables. Through training, the ANN can self-study the historical data and users' preferences to improve the accuracy of its predictions. It can also continue learning, which allows it to adapt well to a new environment [25]. Compared to the sophisticated calculations of statistical models, the ANN model trains on data for prediction, which is more suitable for larger datasets [17].
As illustrated in Figure 1, a basic neural network consists of several connected nodes, or "neurons", which produce a sequence of activations. Input neurons (in the input layer) are activated using an initialiser, while other neurons in hidden layers are activated through weighted connections from previously activated neurons. After checking the outcome against the desired outputs, the neurons adjust their weight functions to get a more accurate result [17]. This procedure is technically called "learning". In general, the traditional learning approach in a neural network requires significant amounts of data and long chains of computational stages to obtain accurate results. To handle this obstacle, various other networks have been developed to accurately assign the weight functions by adopting the training stages with an unsupervised learning technique.
A branch of the ANN model, the Nonlinear Autoregressive Network with Exogenous Inputs (NARX), is a recurrent dynamic network with feedback connections enclosing several layers of the network to account for time-series modelling [26]. A NARX model is less sensitive to the problem of long-term dependencies and has an outstanding learning capability and generalisation performance for time-series data. The superiority of NARX is also reflected in its ability to model multi-dimensional data and its ability to predict price fluctuations accurately. Compared to logarithmic multiple linear regression and multiple linear regression, the NARX artificial neural network has higher accuracy [27].
Figure 2 shows the architecture of the NARX network used in this paper. The main features of this network are: (1) number of neurons, r = 10; (2) number of input variables, n = 5; (3) time lag (delay), ∆t = 3; and (4) number of layers, N = 1. The construction of the NARX network encompasses a feedforward network baseline that incorporates the nonlinear regression function of y shown in Equation (1), where β_out is the output bias, ω_out is the output weight, ∅ is a linear activation function, ∅(j) = j, and h_k is the hidden layer output. Equation (2) gives the output of the input layer, h_k, where β_k is the input bias, ω_k is the input weight for the input variables, θ_k is the input weight for the delayed output, Ψ is the sigmoid activation function, Ψ(j) = 1/(1 + e^(−j)), and h_s is the delayed dynamic autoregression output.
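The equations themselves did not survive in this copy of the text, so the block below gives one plausible form that is consistent with the symbol definitions above. It is an assumed reconstruction and not the authors' typeset equations: the input symbol x_i, the per-neuron indexing of the weights, and all summation ranges (taken from r = 10, n = 5 and ∆t = 3) are illustrative assumptions.

```latex
% Assumed reconstruction; x_i and all index ranges are illustrative.
\begin{align*}
y   &= \phi\Big(\beta_{out} + \sum_{k=1}^{r} \omega_{out,k}\, h_k\Big),
       \qquad \phi(j) = j \\
h_k &= \Psi\Big(\beta_k + \sum_{i=1}^{n} \omega_{k,i}\, x_i
       + \sum_{s=1}^{\Delta t} \theta_{k,s}\, h_s\Big),
       \qquad \Psi(j) = \frac{1}{1 + e^{-j}}
\end{align*}
```

Substituting the second expression into the first gives a single expression for y in terms of the inputs and the delayed outputs, which is the kind of combined form the text refers to when it merges all layers into one equation.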
To incorporate all layers of NARX into one equation, Equation (3) can be reproduced as Equation (4) where the operations of all layers are demonstrated.
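To make the layer operations concrete, the following is a minimal sketch of a single forward pass through a network with the stated dimensions (10 neurons, 5 inputs, 3 delayed outputs). It is not the MATLAB model used in the paper: the weights are random placeholders, the names `narx_step` and `random_params` are hypothetical, and feeding only the current inputs plus three delayed outputs is a simplifying assumption.

```python
import math
import random

def sigmoid(j):
    """Sigmoid activation, 1 / (1 + e^(-j))."""
    return 1.0 / (1.0 + math.exp(-j))

def narx_step(x, y_delayed, params):
    """One forward pass: sigmoid hidden layer followed by a linear output."""
    hidden = []
    for beta_k, omega_k, theta_k in zip(params["beta"], params["omega"], params["theta"]):
        j = beta_k
        j += sum(w * xi for w, xi in zip(omega_k, x))           # exogenous inputs
        j += sum(t * ys for t, ys in zip(theta_k, y_delayed))   # delayed outputs
        hidden.append(sigmoid(j))
    # Linear output activation: weighted sum of hidden outputs plus output bias
    return params["beta_out"] + sum(w * h for w, h in zip(params["omega_out"], hidden))

def random_params(n_inputs=5, n_neurons=10, delay=3, seed=0):
    """Untrained placeholder weights, for illustration only."""
    rng = random.Random(seed)
    def u():
        return rng.uniform(-0.5, 0.5)
    return {
        "beta": [u() for _ in range(n_neurons)],
        "omega": [[u() for _ in range(n_inputs)] for _ in range(n_neurons)],
        "theta": [[u() for _ in range(delay)] for _ in range(n_neurons)],
        "beta_out": u(),
        "omega_out": [u() for _ in range(n_neurons)],
    }

params = random_params()
x_t = [0.6, 0.4, 0.0, 0.3, 0.8]   # five scaled input variables (hypothetical)
y_lags = [0.52, 0.50, 0.49]       # three previous scaled loads (hypothetical)
y_hat = narx_step(x_t, y_lags, params)
print(y_hat)
```

In multi-step forecasting, each new `y_hat` would be pushed into `y_lags` so that the prediction feeds back into the next step, which is the feedback connection that distinguishes NARX from a plain feedforward network.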

Neural Network Training
Three neural networks were trained to find the optimal model architecture: Multilayer Perceptron, Feedforward, and NARX. The following methodology is tailored to the NARX architecture. The NARX model was fed a 70/15/15 split of the data (70% training, 15% validation, and 15% testing). Figure 3 demonstrates the input-output data and the process of training the NARX network. As shown in Figure 3, the input data consist of five independent variables that were recorded during the period of interest, i.e., before COVID-19 (March 2015-March 2020). The output variable is the electricity consumption of the building of interest, also recorded for the period of interest. The input and output variables were recorded at 15-min intervals. It should be noted that although timesteps are used for indexing all variables, time is not an input variable.
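The data partition can be sketched as follows. The text does not say whether the samples were divided chronologically or at random, so the chronological split below is an assumption, as are the function name `chronological_split` and the toy series length.

```python
def chronological_split(samples, train_pct=70, val_pct=15):
    """Split an ordered series into training, validation and testing blocks."""
    n = len(samples)
    n_train = n * train_pct // 100   # integer arithmetic avoids float rounding
    n_val = n * val_pct // 100
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]  # whatever remains (15% here)
    return train, val, test

# Hypothetical index of ten days of 15-min timesteps (96 per day)
samples = list(range(960))
train_set, val_set, test_set = chronological_split(samples)
print(len(train_set), len(val_set), len(test_set))
```

Keeping the blocks in time order means the testing block always lies after the training block, which avoids evaluating the model on timesteps that are interleaved with those it was trained on.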

Neural Network Validation
To verify the accuracy of the network and test for overfitting, the trained network is validated using the remaining 15% of the data, following the process shown in Figure 4.
Comparing the three neural network architectures, the NARX network gave the most accurate prediction (Table 3) without systematic biases. Annual cyclical variation in electricity use is also evident. The specifics of each network in Table 3 are as follows: (1) Feedforward: a simple feedforward network using a backpropagation training function and 10 hidden layers, which can produce reasonably accurate predictions; (2) CNN: a conventional neural network which, in this article, uses a backpropagation training function and 10 hidden layers; (3) NARX: the network discussed extensively in the previous section. Figure 2 shows that this network uses 10 hidden neurons and a backpropagation module that uses updated weights. Table 3 also shows that the NARX network achieves a better MAPE than the other networks. This can be attributed to the elements embedded in the NARX network (Figure 2), which can be regarded as improvements on the CNN and Feedforward networks: the NARX network not only uses the backpropagation method but also uses delayed inputs and modified weighting that improve the outputs. This progression from Feedforward to CNN (with backpropagation) and finally to NARX further demonstrates how critical network design considerations are. Figure 5 shows the individual R values for the training, testing and all data inputs for the NARX training and testing process. Overfitting occurs when the network is overtrained on the training dataset, so that it produces accurate results for the training data but less accurate results for the test dataset. The results presented in Figure 5 demonstrate that overfitting has not occurred, as the testing data show a highly accurate result.
After completing the network training with the training dataset and then testing with the testing dataset, the network is used to produce a complete prediction using the existing datasets. Figure 6 shows the network prediction vs. the actual electricity load for all buildings in this paper; the NARX network prediction accurately maps the actual electricity load and its fluctuations.
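The two accuracy measures mentioned above, MAPE and the correlation coefficient R, can be computed as in the sketch below. The standard textbook definitions are used, which is an assumption about how the reported values were obtained, and the load values are hypothetical and are not results from the paper.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical metered and predicted loads (kW)
actual_kw = [100.0, 120.0, 150.0, 130.0]
predicted_kw = [110.0, 114.0, 147.0, 143.0]

error_pct = mape(actual_kw, predicted_kw)
r_value = pearson_r(actual_kw, predicted_kw)
print(error_pct, r_value)
```

Comparing R on the training block with R on the testing block is one simple overfitting check: a testing value far below the training value would suggest the network has memorised the training data.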

Neural Network Forecasting
With the NARX network validated and trained following the Figure 7 process, the COVID-19 impacted input data was then given to the network and the network prediction was compared with the actual electricity load. These comparisons will be presented in the next section.


Baseline Due to Operation Interruption Caused by COVID-19
Figure 8 compares the predicted electricity consumption from the NARX model and the actual metered energy consumption for each of the case study buildings. The comparison includes one year of regular operation leading up to March 2020, against which the COVID-19 impacted period from March 2020 to January 2021 can be compared. The metered energy use data from March 2020 to January 2021 show the reduction in electricity use in the buildings during the period, and provide a better understanding of the base electrical load of the buildings over a significant length of time whilst the city of Melbourne endured two significant lockdowns due to health orders arising from the local spread of the COVID-19 coronavirus:
• Staff and graduate students were beginning to return in late 2020; however, the University did not re-open to students until 3 January 2021.
The impacts of these events were notable in the metered energy use (green) from 17 March 2020 onward in the figures, though not consistently across buildings, depending on what activity remained throughout the lockdowns. For B1, B2, and B4 (Figure 8a, b and d, respectively), between March 2020 and January 2021 the lower bound of the actual data is a strong measurement of the building baseload on evenings and weekends, and the upper bound is a strong indicator of the daytime building loads without the added thermal load due to occupancy.
The situation was different for B3 (Library) (Figure 8c), which remained open with minimal staff to provide continued support to the university staff and students who continued to work and study remotely during the study period. Electricity consumption was lower than predicted and irregular but, due to the continued activity, no definitive measurement of the baseload was captured for the library. During the break between semesters in July 2020, the library had higher than predicted electrical consumption, which aligned with the short period between the first and second lockdowns and the start of the delayed second semester of the 2020 school year, possibly due to the significant activity surrounding planning for a full semester to be completed online due to the lockdown. Some university services beyond libraries, identified as essential services, were open during the lockdowns, albeit only for staff to be onsite for student support and for fulfilling student requests for pickup. Similarly, some faculties were permitted to continue where online studies were not possible (e.g., medical, biological labs, etc.), but under strict health protocols as directed by the government of Victoria.
Overall, the amplitude of the metered energy use between daytime peaks and baseload decreased significantly. The spread between the actual overnight and daytime loads is roughly 20% of the spread of the predicted electrical consumption (red); the remaining 80% of avoided electrical consumption could be attributed to the occupation/operational use of the building. The offset between the lower bounds of the predicted and actual consumption could be due to the additional thermal mass accumulation and additional HVAC programs during regular operation (with added plug loads).
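The 20%/80% split described above can be illustrated with a short calculation. The sketch below is only a minimal illustration on synthetic hourly loads: the function names (`load_spread`, `operational_share`) and all numbers are hypothetical and are not taken from the metered data, and the bounds are assumed here to be the simple minimum and maximum of each series.

```python
import numpy as np

def load_spread(hourly_kw):
    """Spread between the upper (daytime) and lower (overnight) bounds of a load series."""
    hourly_kw = np.asarray(hourly_kw, dtype=float)
    return float(hourly_kw.max() - hourly_kw.min())

def operational_share(predicted_kw, actual_kw):
    """Fraction of the predicted spread that is absent from the metered lockdown load."""
    return 1.0 - load_spread(actual_kw) / load_spread(predicted_kw)

# Synthetic week of hourly data (assumption): a flat overnight level plus a daytime step.
hours = np.arange(24 * 7)
daytime = ((hours % 24 >= 8) & (hours % 24 < 18)).astype(float)
predicted_kw = 100.0 + 50.0 * daytime   # hypothetical forecast for regular operation
actual_kw = 90.0 + 10.0 * daytime       # hypothetical metered load during lockdown

share = operational_share(predicted_kw, actual_kw)
baseload_offset_kw = float(predicted_kw.min() - actual_kw.min())
print(f"share of spread attributed to occupation/operation: {share:.0%}")
print(f"offset between lower bounds: {baseload_offset_kw:.1f} kW")
```

With real metered data, percentile-based bounds would likely be more robust to outliers than the minimum and maximum used here.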
Limited information was available from the university facilities departments on what changes, if any, were made to the buildings' internal systems for energy savings when the lockdown was initially announced in March 2020. It was assumed that B1, B2, and B4 had their HVAC systems put to holiday settings and that some IT systems, such as the A/V in the lecture halls, were powered down. In September 2020, leading into the hotter summer weather in Melbourne, additional measures were taken to reduce HVAC operation by shutting curtains and minimising lighting demand to only safety/security requirements.

Sensitivity Analysis
A sensitivity analysis was applied to determine the impact of the key variable conditions contributing to the buildings' energy consumption forecast and the VEA accuracy. To understand the influence of each of these input conditions on the neural network output, a meta-model-based sensitivity method using white Gaussian noise was applied to the case study data sets: noise values were randomly generated and the NARX network was retrained. This is a sampling-based probabilistic method that maintains a well-validated model code. Once the retraining process was over, the mean absolute percentage error of each of the regenerated data sets was calculated and compared with the performance of other neural network architectures, which allowed the most influential data sets affecting the performance of the neural networks and the forecast certainty to be determined [28]. From Figure 9, the most influential variable for the neural model performance for B1, B2, and B3 was maximum temperature, while the most influential factors for the electrical engineering building (B4) were the maximum temperature and solar irradiance. This was due to the variation of the buildings' envelope material from Table 2, as the B4 fenestration material was unglazed glass.

Figure 9. The NARX network sensitivity analysis for four input variables: solar irradiance, rainfall, minimum temperature, and maximum temperature.
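The noise-injection procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: an ordinary least-squares fit stands in for retraining the NARX network, the data are synthetic, and the function names, noise scale, and number of repeats are all hypothetical. The idea shown is only that the input whose perturbation raises the mean absolute percentage error (MAPE) the most is ranked as the most influential.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def fit_predict(X, y):
    """Stand-in for retraining the network (assumption): least squares with an intercept."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

def noise_sensitivity(X, y, names, noise_scale=1.0, repeats=20, seed=0):
    """Increase in MAPE when white Gaussian noise is added to one input at a time."""
    rng = np.random.default_rng(seed)
    baseline = mape(y, fit_predict(X, y))
    increase = {}
    for j, name in enumerate(names):
        errors = []
        for _ in range(repeats):
            noisy = X.copy()
            sigma = noise_scale * X[:, j].std()
            noisy[:, j] += rng.normal(0.0, sigma, size=len(X))
            errors.append(mape(y, fit_predict(noisy, y)))  # refit on the perturbed data
        increase[name] = float(np.mean(errors)) - baseline
    return increase

# Synthetic daily data (assumption): load driven mainly by maximum temperature.
rng = np.random.default_rng(1)
n = 365
solar = rng.uniform(5.0, 30.0, n)
rain = rng.exponential(2.0, n)
tmin = rng.normal(10.0, 4.0, n)
tmax = rng.normal(20.0, 6.0, n)
X = np.column_stack([solar, rain, tmin, tmax])
y = 500.0 + 2.0 * solar + 0.5 * tmin + 20.0 * tmax + rng.normal(0.0, 5.0, n)

names = ["solar_irradiance", "rainfall", "min_temp", "max_temp"]
result = noise_sensitivity(X, y, names)
ranking = sorted(result, key=result.get, reverse=True)
print(ranking)
```

A nonlinear model would replace `fit_predict` without changing the rest of the loop.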

Uncertainty Analysis
Although NARX networks have been employed for several time series applications, they are not immune to uncertainty caused by factors such as vanishing and exploding gradients, inappropriate selection of the neural network architecture, and false convergence to local optima instead of global optima during the training process [29]. To quantify the level of uncertainty associated with the energy consumption forecast, an uncertainty analysis was conducted (Figure 10) using confidence interval principles for an unknown sample distribution. Energy consumption of the case study buildings depends on stochastic weather data, building use, occupancy rate, and unplanned events such as the COVID-19 lockdown. Expressing the forecast energy consumption results with prediction intervals provides a measure of certainty for the neural network outputs. The predicted energy consumption population sample has an unknown mean, estimated by $\bar{x}$, and an unknown standard deviation, estimated through the standard error $s/\sqrt{n}$. Under the concept of an unknown sample distribution, the sample standard deviation $s$ replaces the unknown standard deviation in the estimated standard error, and this estimate approaches the true value for a large sample number $n$ [30]. The confidence interval for the sample corresponds to the 95% (i.e., $1-2\alpha$ level) upper and lower levels of confidence, representing the 2.5th ($q_L$) and 97.5th ($q_U$) percentiles of the distribution of every simulated monthly prediction. This can be expressed as (5):

$$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}} \qquad (5)$$

where $\bar{x}$ is the monthly average of the predicted energy consumption, $t^{*}$ is the critical value of the $t$ distribution with unlimited degrees of freedom, and $s/\sqrt{n}$ is the standard error of the monthly predicted values.

To measure and compare the uncertainty for each case study and model, the P-factor and D-factor are used as the objective criteria [31]. The P-factor indicates the percentage of observed data (the actual energy consumption) bracketed by the upper and lower limits of the confidence interval. The D-factor is the average distance between the upper and lower limits relative to the variability of the observations, calculated as follows:

$$\text{D-factor} = \frac{\bar{d}_x}{\sigma_x}, \qquad \bar{d}_x = \frac{1}{k} \sum_{i=1}^{k} \left( X_{u,i} - X_{l,i} \right)$$

where $\bar{d}_x$ is the average distance between the upper ($X_u$) and lower ($X_l$) levels of the confidence band over the $k$ months, and $\sigma_x$ is the standard deviation of the observed energy consumption [32]. The best outcome of the uncertainty analysis of a simulated prediction is a P-factor of 100%, with all observations falling within the confidence band, and a D-factor close to zero. Table 4 indicates that the observations of the electricity consumption for the chosen buildings fall entirely within the prediction interval of the forecast energy consumption. Thus, this analysis indicates that the uncertainty associated with the NARX energy consumption predictions is fully bracketed by the prediction intervals.
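As an illustration of how the confidence band, P-factor, and D-factor fit together, the following sketch computes them for a small hand-made data set. It is a minimal example under stated assumptions: the function names and numbers are hypothetical, the critical value is taken as 1.96 (the 95% value for unlimited degrees of freedom), and the sample standard deviation with `ddof=1` is assumed for both $s$ and $\sigma_x$.

```python
import numpy as np

T_STAR_95 = 1.96  # 95% critical value of the t distribution with unlimited degrees of freedom

def confidence_band(monthly_predictions):
    """Mean and 95% band per month; input has shape (n_months, n_values_per_month)."""
    values = np.asarray(monthly_predictions, dtype=float)
    n = values.shape[1]
    mean = values.mean(axis=1)
    standard_error = values.std(axis=1, ddof=1) / np.sqrt(n)
    half_width = T_STAR_95 * standard_error
    return mean, mean - half_width, mean + half_width

def p_factor(observed, lower, upper):
    """Percentage of observations bracketed by the band."""
    observed = np.asarray(observed, dtype=float)
    inside = (observed >= lower) & (observed <= upper)
    return float(inside.mean() * 100.0)

def d_factor(observed, lower, upper):
    """Average band width divided by the standard deviation of the observations."""
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(upper - lower) / np.std(observed, ddof=1))

# Hypothetical example: two months, four predicted values each.
predictions = np.array([[10.0, 12.0, 14.0, 16.0],
                        [20.0, 22.0, 24.0, 26.0]])
observed = np.array([12.0, 26.0])

mean, lower, upper = confidence_band(predictions)
p = p_factor(observed, lower, upper)
d = d_factor(observed, lower, upper)
print(f"P-factor = {p:.1f}%, D-factor = {d:.3f}")
```

In this toy case only the first observation lies inside its band, so the P-factor is 50%; a result such as the one reported in Table 4 would correspond to a P-factor of 100%.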

Discussion
The unprecedented time of the COVID-19 pandemic presented a unique situation that has unlocked the market of VEA and boosted its application commercially. The authors developed and validated a numerical model for VEA and have demonstrated the tool through selected multifunctional case study buildings. The advantages and limitations of the model are discussed in this section.
One of the advantages of the model is the ability to evaluate the energy gap during unprecedented times. From Figures 3, 4 and 6, the NARX model provides a reasonable forecast horizon for the typical energy usage from March 2020 to January 2021. This has provided a rare opportunity to accurately quantify the energy use attributed to occupation and operational use of the building (i.e., the difference between typical building energy use and the unoccupied building base load during COVID-19, defined as the "energy gap"). This energy gap places more emphasis on behavioural energy efficiency measures to reduce the operational energy load.
Central to the framework proposed in this paper is a hybrid of bottom-up and top-down approaches, using a 'black-box' model and feature extraction, respectively.
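The energy gap defined above is simply the forecast for typical operation minus the metered load, summed over the period of interest. The sketch below is a minimal illustration of that arithmetic; the function names, the hourly time step, and the four sample values are hypothetical and do not come from the case study data.

```python
import numpy as np

def energy_gap_kwh(predicted_kw, metered_kw, hours_per_step=1.0):
    """Energy gap: forecast typical use minus metered use, integrated over time."""
    predicted_kw = np.asarray(predicted_kw, dtype=float)
    metered_kw = np.asarray(metered_kw, dtype=float)
    return float(np.sum(predicted_kw - metered_kw) * hours_per_step)

def operational_fraction(predicted_kw, metered_kw):
    """Share of the forecast energy that is attributed to occupation/operation."""
    predicted_kw = np.asarray(predicted_kw, dtype=float)
    metered_kw = np.asarray(metered_kw, dtype=float)
    return float(np.sum(predicted_kw - metered_kw) / np.sum(predicted_kw))

# Hypothetical four-hour window.
predicted_kw = [120.0, 150.0, 150.0, 120.0]  # forecast for regular operation
metered_kw = [90.0, 100.0, 100.0, 90.0]      # metered load while unoccupied

gap = energy_gap_kwh(predicted_kw, metered_kw)
fraction = operational_fraction(predicted_kw, metered_kw)
print(f"energy gap = {gap:.0f} kWh ({fraction:.1%} of forecast use)")
```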
A black-box model, such as the NARX neural network employed in this study, enabled a virtual study of a building envelope with an unknown internal system, with external input data such as weather and solar exposure generating the outputs of the building's energy use. This was beneficial for gaining insights into the building operation with minimal input data, which would have been difficult with common bottom-up white-box methods that require substantial upfront knowledge of the building's construction [12]. In addition, the neural network was able to cope with the non-linearity and stochastic nature of the raw input temperature and raw output energy data, which are not considered by white-box methods that use averaged and typical data files for more generalised results. This implies that the NARX model should provide significantly more reliable predictions, especially if a significant amount of historical data is available.
However, a notable limitation of the black-box approach is the limited adaptability of the model to other buildings. A neural network is tailored to the building on which it was trained, which prevents the extrapolation of results to other buildings and therefore limits the macro-level understanding of the campus. In addition, the NARX model can only produce predictions as accurate as the input data for weather and energy use. Deb and Schlueter [12] similarly noted that the top-down approaches were "unsuitable for individual building retrofits", hindering the proportioning of the total consumption into energy end uses.
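The black-box idea can be made concrete with the input structure of a NARX-type model, in which the load at time t is predicted from lagged loads and lagged exogenous inputs only. The sketch below is a simplified stand-in under stated assumptions: a linear least-squares map replaces the nonlinear network used in this study, the data are synthetic, and the function names and lag count are hypothetical.

```python
import numpy as np

def build_narx_features(load, exog, lags):
    """Rows of [load(t-lags..t-1), exog(t-lags..t-1)] paired with the target load(t)."""
    load = np.asarray(load, dtype=float)
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        exog = exog[:, None]
    rows = [np.concatenate([load[t - lags:t], exog[t - lags:t].ravel()])
            for t in range(lags, len(load))]
    return np.array(rows), load[lags:]

def fit_linear_arx(features, targets):
    """Linear stand-in for the network (assumption): least squares with an intercept."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef

def predict(features, coef):
    return coef[0] + features @ coef[1:]

# Synthetic series (assumption): load depends on its previous value and on temperature.
rng = np.random.default_rng(0)
temperature = 20.0 + 5.0 * rng.standard_normal(200)
load = np.zeros(200)
load[0] = 100.0
for t in range(1, 200):
    load[t] = 10.0 + 0.5 * load[t - 1] + 2.0 * temperature[t - 1]

features, targets = build_narx_features(load, temperature, lags=1)
coef = fit_linear_arx(features, targets)
max_error = float(np.max(np.abs(predict(features, coef) - targets)))
print(features.shape, np.round(coef, 3), max_error)
```

No knowledge of the building's construction enters the model; only the lagged time series do, which is also why such a fit does not transfer to a different building.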
The NARX model was supplemented by the top-down 'feature extraction method', represented by the sensitivity analysis, to determine the influential factors causing the variations in energy consumption based on the case study buildings' ages and envelope material. The sensitivity analysis prioritised the input data variables required to describe the building energy consumption and reduced the training time of the NARX model. In this study, the most influential variables for the buildings' energy consumption were hourly meteorological maximum temperature and solar irradiance, as shown earlier in Figure 9.
This combination of the two approaches has enabled the VEA to avoid the uncertainty inherent in the bottom-up aggregated simplified approaches and to gain an understanding of the interactions of the building beyond the black-box limitations.
Further work could be conducted to examine the resiliency of each building by expanding the envelope of these influential variables to account for more extreme climate scenarios. This could provide valuable insight into how a building would react in its current form in such scenarios, and could guide future retrofit priorities and the design of more tailored and sustainable energy conservation measures.

Conclusions
The unprecedented time of the COVID-19 pandemic presented a unique situation that has unlocked the market of VEA and boosted its application commercially. The authors developed and validated a numerical model for VEA and have demonstrated the tool through a case study of multifunctional educational buildings. The advantages and limitations of the model have been elaborated in detail. The key opportunity presented was to evaluate the "operational energy" over a significant amount of time that encompassed all four seasons in continuity (i.e., the difference between typical building energy use and the unoccupied building base load during COVID-19, as demonstrated in [33]). In this paper, a combination of bottom-up ('black-box' model) and top-down ('feature extraction') approaches was employed to conduct the VEA. This combination reduced the inherent uncertainty of typical bottom-up aggregated simplified approaches under similar data collection scenarios and provided meaningful insight into the behaviour of the building beyond what the 'black-box' methodology alone can offer. A neural network model has limited applicability since it is tailored using the data attributed to one specific building. Supplementing the NARX model with a separate top-down approach (i.e., the sensitivity analysis) identified the most influential factors defining the variations in energy consumption, which not only optimised model performance but also provided insight into where to target energy conservation measure (ECM) activity for each building.