Impact Assessment in the Process of Propagating Climate Change Uncertainties into Building Energy Use

Buildings are subject to significant stresses due to climate change, and design strategies for climate-resilient buildings are rife with uncertainties that can make energy use distributions difficult to interpret. This study aims to provide a more robust and credible estimate of the uncertainties in building energy performance under climate change and to support their interpretation. A four-step climate uncertainty propagation approach, which propagates downscaled future weather file uncertainties into building energy use, is examined. The four-step approach integrates dynamic building simulation, fitting a distribution to average annual weather variables, a regression model (between average annual weather variables and energy use) and random sampling. The impact of fitting different distributions to the weather variables (such as Normal, Beta, Weibull, etc.) and of the regression model (Multiple Linear and Principal Component Regression) on the cooling and heating energy use distributions is evaluated for a sample reference office building. Results show that selecting a full principal component regression model together with a best-fit distribution for each principal component of the weather variables can reduce the variation of the output energy distribution compared to simulated data. The results offer a way of understanding compound building energy use distributions and parsing the uncertain nature of climate projections.


Introduction
In the United States, commercial and residential buildings are major energy consumers and contribute to carbon emissions. Studies have shown that efficient buildings are interlinked with a sustainable built environment [1]. Building energy use is mainly dependent on exterior climate conditions, occupant behavior, envelope characteristics, and equipment efficiency, which are also the main sources of uncertainty in quantifying building performance [2]. With the current trends of climate change, there is a need to evaluate buildings so as to ensure a resilient and sustainable design [3]. Building simulation tools are widely used to assess building energy performance and occupant comfort [4]. However, many require weather files with high spatial and temporal resolution in the format of a Typical Meteorological Year (TMY). Additionally, due to climate change, existing TMY files (based on historical data) should not be used to assess building energy performance for the future [5,6]. On the other hand, Global Climate Models (GCMs) and Regional Climate Models (RCMs) are coarse in resolution. Therefore, they cannot be used directly in building simulation tools and need to be downscaled to hourly temporal resolution. This is done using statistical downscaling techniques (e.g., stochastic weather generators) [7] or the morphing process [8]. Downscaling methods are useful for developing weather files that can be directly incorporated in building energy simulations in order to assess building performance under climate change scenarios [9]. However, weather files that are developed from downscaled climate models carry many levels of uncertainty [10,11], and most building simulation tools lack the ability to propagate various sources of uncertainty [12]. For example, climate models, which generate various projections based on emissions scenarios, are uncertain in nature, and their uncertainties need to be accounted for in building models.
In recent years, developments in building standards and regulations have promoted improved building efficiency (e.g., ASHRAE 90.1). Occupant behaviors and dynamic schedules have also been incorporated in building energy tools [13,14]. However, with global temperatures projected to increase by 4.8 °C by the end of the century (IPCC 2014 [15]), buildings will operate differently from their original design loads [16]. Given these various sources of uncertainty, there is a need to assess building energy performance using probabilistic approaches [17]. One method is the Monte Carlo technique, which is commonly used to propagate input variations (e.g., climate, building operation, etc.) through the model to determine uncertainties in the output (e.g., energy consumption) [18,19]. A probabilistic approach to propagating climate uncertainties into building energy use has driven modeling efforts [20,21].
In this regard, fitting distributions to input variables and conducting an uncertainty analysis with random sampling in building simulations is common practice [22,23]. Sun et al. (2014) looked into the uncertainties of microclimate variables in building energy models and presented a framework for uncertainty quantification that uses a detailed specification of an urban form model to quantify microclimate conditions. They compared their results, which included quantifying different sources of uncertainty using statistical methods [24]. Gang et al. (2015) examined the uncertainties of nine factors covering building parameters, weather and indoor conditions and developed a probability distribution of the cooling energy use and capital cost [25]. Their study used three typical distributions: normal, uniform and triangular. De Wilde and Tian (2010) evaluated the impact of the selection of performance metrics and of the assumptions made in modeling building emissions, overheating and office work performance under climate change, using the UKCIP02 climate scenarios and following a uniform distribution in Monte Carlo sampling [26]. Wang et al. (2012) investigated uncertainties in energy consumption arising from different building operational practices and weather data. For the analysis of weather-related uncertainties, they compared the energy results from historical weather data to those from TMY3 weather files [27]. González et al. (2019) provided an extensive review of critical uncertainty indices [28]. Brohus et al. (2012) conducted a comprehensive uncertainty analysis in which the inputs followed different distributions [29]. A Kolmogorov-Smirnov goodness of fit test was used to determine the distributions for stochastic sampling.
Regression models are also used to predict building energy use [30] and are useful for relating thermophysical, weather and occupant factors to building energy use. For example, Braun et al. (2014) used a multiple regression model with humidity ratio and relative humidity as regressors to predict gas and electricity consumption for future climate periods in northern England [31]. In another study, Catalina et al. (2013) developed a regression model between multiple input variables and building energy demand [32]. They found a regression fit for predicting heating energy demand to be simple and widely applicable to building simulation.
The process of conducting a Monte Carlo analysis can be complex and computationally intensive, and recent studies have attempted to reduce the cost of the uncertainty analysis. Yassaghi, Gurian and Hoque (2020) presented a four-step climate uncertainty propagation method that could reduce computational effort and is applicable to regions where only limited hourly weather data can be developed to evaluate climate change impacts on building energy performance [33]. However, probabilities are not always easy to assess. In many cases it is difficult to fit a distribution to the input (the distribution may not be known) or to develop an appropriate correlation between the model input and output, which could then add to the total uncertainty. In addition, the output distributions of energy use from an uncertainty analysis can be confounded and span a broad range, and can therefore be difficult to interpret. Studies have investigated the impact of selecting various input variable distributions on propagating uncertainties to building energy use using Monte Carlo methods [25,29]. This paper evaluates the simultaneous impact of selecting the distribution of the average annual weather variables and the regression model between the weather variables and energy use when propagating climate change uncertainties. This approach has not been fully investigated and is not completely understood. The results of the impact assessment aim to provide a more robust and credible estimate of the uncertainties in building energy performance under climate change and to support their interpretation. The goal is to offer a way of understanding compound building energy use distributions and parsing the uncertain nature of climate projections.
The present work looks into the impact of selecting various distributions for the mean annual weather variables and of following different regression models in a four-step uncertainty propagation method. A sample office reference building was selected as the case study for the climate conditions of Philadelphia, PA, USA. Downscaling methods were used to generate hourly weather files based on the Intergovernmental Panel on Climate Change (IPCC) emissions scenarios and to develop a database of current and future TMY files that were incorporated into EnergyPlus. The outcome of this study offers a more nuanced estimate of the uncertainties and interpretations of building energy performance under climate change. Section 2 (Methodology) presents the steps taken to conduct the input weather distribution and regression model impact assessment.

Study Approach
This study assesses the impact of the distributions chosen for the average annual weather variables, and of the regression model selected, on building energy use distributions when conducting an uncertainty propagation analysis. We follow the four-step uncertainty propagation technique presented by Yassaghi, Gurian and Hoque (2020) [33] to conduct our impact assessment analysis. The four-step climate uncertainty propagation method consists of dynamic building simulation, development of a regression model, fitting distributions to the weather variables and random sampling (Figure 1).
Figure 1. The four-step uncertainty propagation method followed in the impact assessment (adapted from the method proposed by [33]).
In Figure 1, the method shows four major steps. First, multiple building simulations are conducted for all available weather files (current and future). From the simulation results (e.g., for energy use) a multiple linear regression model is fit between the annual energy use results and the annual average weather variables of the TMY files used in the simulation. The process would develop a model between energy outputs and weather variables. Next, a distribution is fit to the average annual weather variables following a Kolmogorov-Smirnov (KS) goodness of fit test and a random sampling is conducted. The best-fit distribution is the distribution with the highest p-value in the Kolmogorov-Smirnov goodness of fit test. Finally, by having a regression model (between the average annual weather variables and energy use) and input average annual weather variables distribution, the uncertainties can be propagated to the energy use distribution. In the uncertainty propagation method presented by [33] the use of different regression models or input average annual weather variable distributions could alter the output energy use distribution results.
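The four steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the weather and energy values are synthetic stand-ins for EnergyPlus results, only two weather variables are used instead of seven, and a Normal distribution is fit to every variable for brevity. The variable names are hypothetical and not taken from [33].

```python
import numpy as np

rng = np.random.default_rng(0)

# Steps a/b stand-in: synthetic "simulation results" for 46 weather files.
# Real values would come from EnergyPlus runs; these numbers are made up.
n_files = 46
dry_bulb = rng.normal(13.0, 1.5, n_files)   # average annual dry bulb, deg C
wind = rng.normal(4.0, 0.3, n_files)        # average annual wind speed, m/s
X = np.column_stack([dry_bulb, wind])
heating = 3000.0 - 150.0 * dry_bulb + 40.0 * wind + rng.normal(0, 20, n_files)  # GJ

# Step c: multiple linear regression (ordinary least squares with intercept).
A = np.column_stack([np.ones(n_files), X])
beta, *_ = np.linalg.lstsq(A, heating, rcond=None)

# Step d: fit a distribution to each weather variable (Normal here)
# and draw 100 random samples per variable.
n_samples = 100
means = X.mean(axis=0)
stds = X.std(axis=0, ddof=1)
samples = rng.normal(means, stds, size=(n_samples, X.shape[1]))

# Step e: propagate the sampled inputs through the regression model.
heating_dist = beta[0] + samples @ beta[1:]

print(f"simulated  min/mean/max: {heating.min():.0f} {heating.mean():.0f} {heating.max():.0f}")
print(f"propagated min/mean/max: {heating_dist.min():.0f} {heating_dist.mean():.0f} {heating_dist.max():.0f}")
```

Note that this sketch samples each weather variable independently, so any correlation among the variables is ignored when the samples are drawn.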
To conduct the model and distribution impact assessment (presented in red in Figure 1), we first compare the use of full multiple linear regression (LR) model with a stepwise multiple linear regression (LRS) model in the four-step uncertainty propagation method for heating and cooling energy use distributions. The regression models are conducted between energy use (dependent) and their associated average annual weather variables (regressors) of the weather files used in a dynamic building simulation. Then, we fit various distributions to the average annual weather variables of 46 current and future weather files. The input weather variable distributions are used in the regression model to propagate uncertainties to energy use. Next, we conduct a Principal Component Analysis (PCA) and develop principal components for the average annual weather variables. Then we develop full Principal Component Regression (PCR), stepwise principal component regression (PCRS), and 2-factor principal component regression (PCR2) models between the principal components and heating and cooling energy use. Then various distributions are fit to the principal components which are then used in the principal component regression models to propagate the uncertainties. Results are then compared to assess the impact of the regression model selection and distribution fit on the output energy use of the uncertainty propagation method. Below is a summary of the steps taken:

1.
Current and future weather files are created. Current weather files are obtained from existing resources representing different historical periods. Future weather files are developed using weather generators (Step a in Figure 1).

2.
Multiple dynamic building simulations are conducted, one for each weather file, and annual heating and cooling energy use are determined (Step b in Figure 1).

3.
The average annual values (arithmetic mean values) for each weather variable of the weather files are calculated and the corresponding principal components of the weather variables are developed.

4.
Regression models between the average annual weather variables (and principal components) as regressors and annual heating and cooling energy use (as dependent variables) are developed (Step c in Figure 1).

5.
Sample distributions are fit to the average annual weather variables of the weather files data sets and their corresponding principal components (Step d in Figure 1). The sample distribution is repeated for all weather variables and PCs except for the best fit scenario. For example, when the Erlang distribution is selected, all weather variables are assumed to follow an Erlang distribution.

6.
Parameters of the distributions are obtained and a random sampling of a sample size of 100 is conducted for all weather variables and principal components.

7.
The uncertainties of the input variables (weather variables and their associated principal components) are then propagated to the heating and cooling energy use using the regression model developed (Step e in Figure 1).
Details of the weather files developed and the building case study are presented in Section 2.2. Sections 2.3 and 2.4 provide details of the regression model development and the distribution fitting process, respectively. Results are then presented in Section 3.

Case Study and Weather Files
Stochastic weather generators such as the AdvancedWEatherGENerator [34-36], the CCWorldWeatherGen [37], which adapts the morphing technique [8], and Meteonorm were used to downscale future weather files based on emissions scenarios. The downscaling methods were used to develop future TMY files that were incorporated into EnergyPlus. Appendix A shows a summary of the weather files, their sources and the period of generation used here. In total, 46 weather files were developed.
The DOE large office reference building in the 4A Philadelphia climate category was analyzed. The office building has 12 floors plus a basement, with a total area of 46,321.45 m² and a window-to-wall ratio of 40%. The heating and cooling systems used in the building are a boiler and a chiller, which use gas and electricity, respectively. The wall, roof and window U-values, lighting load density, window-to-wall ratio and equipment type, following ASHRAE Standard 90.1-2004, are presented in Appendix B.

Selection of Regression Model
A model describing the association between the input weather files and the output energy use, along with input distributions, is required to propagate the climate uncertainties. However, developing a suitable regression model can be difficult, and, due to lack of data, there can be doubt about the appropriate distribution of the input parameters. In the first part of this study, the model is developed using a full multiple linear regression (LR) between the input weather variables and heating and cooling energy use. We also develop a stepwise multiple linear regression (LRS) model to identify the most influential weather variables that impact the building energy use.
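To illustrate how a stepwise model differs from a full model, the following sketch runs a greedy forward selection over seven regressors. It is an assumption-laden illustration: the source does not state which stepwise criterion was used, so the Akaike information criterion (AIC) is swapped in here, and the data are synthetic with only columns 0 and 3 actually driving the response.

```python
import numpy as np

def ols_rss(A, y):
    """Residual sum of squares of a least-squares fit of y on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_stepwise(X, y):
    """Greedy forward selection: add the regressor that lowers AIC the most.

    AIC is an assumed criterion for this sketch, not the paper's stated one.
    """
    n, p = X.shape
    selected, remaining = [], list(range(p))

    def aic(cols):
        A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        return n * np.log(ols_rss(A, y) / n) + 2 * A.shape[1]

    best = aic([])
    while remaining:
        score, col = min((aic(selected + [c]), c) for c in remaining)
        if score >= best:
            break  # no remaining regressor improves the criterion
        best = score
        selected.append(col)
        remaining.remove(col)
    return sorted(selected)

# Synthetic example: 46 "weather files", 7 variables, 2 of them influential.
rng = np.random.default_rng(3)
X = rng.normal(size=(46, 7))
y = 500.0 + 30.0 * X[:, 0] - 25.0 * X[:, 3] + rng.normal(0, 2.0, 46)
selected = forward_stepwise(X, y)
print("selected columns:", selected)
```

A p-value-based entry and removal rule, as offered by many statistics packages, could replace the AIC comparison without changing the overall structure.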
The regression models are developed for both heating and cooling energy use and take the general multiple linear form
$$E = \beta_0 + \sum_{i=1}^{7} \beta_i x_i + \varepsilon$$
where $E$ is the annual heating or cooling energy use, $x_i$ are the seven average annual weather variables, $\beta_0$ is the intercept, $\varepsilon$ is the error term and $\beta_i$ are the regression coefficients. In addition, full, stepwise, and 2-factor principal component regressions are also developed to further investigate the impact of the model selection on the output of the weather uncertainty propagation method (Figure 1). Principal Component Analysis (PCA) is a statistical analysis that transforms original variables with possible collinearity into new uncorrelated variables called principal components [38]. A Principal Component Regression (PCR) model is a linear regression that uses the Principal Components (PCs) of the data set as the independent variables. The PCR model can remove possible collinearity among the independent variables of a linear regression [39]. Three different distributions are considered to analyze how the selection of the input weather distribution impacts the output results of the uncertainty propagation.
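The principal component regression described above can be sketched with NumPy alone. This is a minimal sketch under assumptions: the data are synthetic and deliberately collinear, the variables are standardized before the PCA, and the "2-factor" model is taken to mean a regression on the first two principal components, which the source does not spell out. The helper names `fit_pcr` and `predict_pcr` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 46 weather files x 7 average annual weather variables,
# built from 3 latent factors so that the columns are strongly collinear.
n_files, n_vars = 46, 7
latent = rng.normal(size=(n_files, 3))
X = latent @ rng.normal(size=(3, n_vars)) + 0.1 * rng.normal(size=(n_files, n_vars))
y = 1000.0 + 50.0 * latent[:, 0] - 20.0 * latent[:, 1] + rng.normal(0, 5, n_files)

# Standardize, then obtain the principal components from the SVD.
mu, sigma = X.mean(axis=0), X.std(axis=0, ddof=1)
Z = (X - mu) / sigma
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = Z @ Vt.T                  # principal component scores (uncorrelated columns)
explained = s**2 / np.sum(s**2)    # share of variance carried by each component

def fit_pcr(scores, y, k):
    """Least-squares regression of y on the first k principal components."""
    A = np.column_stack([np.ones(len(y)), scores[:, :k]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_pcr(coef, scores):
    """Predict y from principal component scores using fitted coefficients."""
    k = len(coef) - 1
    return coef[0] + scores[:, :k] @ coef[1:]

full = fit_pcr(scores, y, n_vars)  # full PCR: all seven components
two = fit_pcr(scores, y, 2)        # 2-factor PCR: first two components only

for name, coef in [("PCR", full), ("PCR2", two)]:
    resid = y - predict_pcr(coef, scores)
    print(name, "RMSE:", round(float(np.sqrt(np.mean(resid**2))), 2))
```

Because the component scores are uncorrelated, distributions can be fit to each component separately and sampled independently before being passed through `predict_pcr`.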
The literature shows a strong linear correlation between building energy use and exterior climate conditions, even though the main weather variables acting as the driving force for energy use can differ depending on the location and case study building. In this study we consider all seven weather variables to be essential to the case study. This is the key reason why we focused on developing linear regression models. If a different regression model were used (e.g., non-linear), the results could change but would also contradict our current physics-based understanding of building performance and exterior climate conditions. On the other hand, we have limited knowledge about the climate and the statistical distribution of weather variables, due to the lack of sufficient climate data. To address this limitation, the regression models are not developed for a specific time period, and all future weather files are used in developing the regression model. Indeed, as our knowledge of the climate and climate change increases, the uncertainties due to the selection of the weather variable distributions will likely be reduced. Details of the weather variable distribution selection are given hereafter.

Selection of Average Annual Weather Variables Distributions
The distributions of the average annual weather variables and their associated principal components were selected following a Kolmogorov-Smirnov (KS) goodness of fit test [40]:
$$D_n = \sup_x \left| F_n(x) - F(x) \right| \qquad (3)$$
In Equation (3), $D_n$ is the test statistic, $F$ is the theoretical cumulative distribution, and $F_n$ is the empirical distribution for $n$ observations. The KS test compares the theoretical Cumulative Distribution Function (CDF) with the Empirical Cumulative Distribution Function (ECDF) of the sample data and indicates whether the sample data follow a certain distribution. The KS test was selected since it does not require normality assumptions and can be used when only small sample sizes are available. The test ranks multiple distributions that could potentially fit the data (based on their p-values). Once the distributions are fit, random sampling with a sample size of 100 is conducted and the results are incorporated into the regression models to propagate the uncertainties of the weather files to the building energy use.
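The best-fit selection described above (highest KS p-value among candidate distributions, followed by 100 random draws) might look as follows with SciPy. This is a sketch under assumptions: the data are synthetic, the candidate set is illustrative and uses SciPy's distribution names, and the location parameter is fixed at zero for the positive-support families purely to keep the fits stable. None of these choices are taken from the source.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical average annual dry bulb temperatures (deg C) for 46 weather files.
data = rng.normal(13.5, 1.2, 46)

# Candidate families (illustrative subset, SciPy naming).
candidates = {
    "Normal": stats.norm,
    "Lognormal": stats.lognorm,
    "Logistic": stats.logistic,
    "Gamma": stats.gamma,
    "Weibull": stats.weibull_min,
}

results = {}
for name, dist in candidates.items():
    # Assumption: fix loc=0 for positive-support families to stabilize the fit.
    if name in ("Lognormal", "Gamma", "Weibull"):
        params = dist.fit(data, floc=0)
    else:
        params = dist.fit(data)
    d_n, p_value = stats.kstest(data, dist.cdf, args=params)
    results[name] = (params, d_n, p_value)

# Best fit = candidate with the highest KS p-value, as described in the text.
best = max(results, key=lambda name: results[name][2])
best_params = results[best][0]

# Draw 100 random samples from the selected distribution.
samples = candidates[best].rvs(*best_params, size=100, random_state=rng)

for name, (_, d_n, p_value) in results.items():
    print(f"{name:10s} D_n={d_n:.3f} p={p_value:.3f}")
print("best fit:", best)
```

One caveat worth keeping in mind: when the distribution parameters are estimated from the same sample that is tested, the standard KS p-values tend to be optimistic, so they are better read as a ranking device than as strict significance levels.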
The output energy use distributions developed for each regression model case and each sample input distribution are then analyzed. We rely on scatter plots, box and whisker plots and mean absolute error values to assess the results.
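As a rough illustration of how a propagated distribution could be compared with the simulated data, the sketch below computes the percentage difference of the minimum, mean and maximum, and a mean absolute error over those three summary values. The source does not give its exact error definition, so `mean_absolute_error` here is an assumed definition, and all numbers are made up.

```python
import numpy as np

def summary(values):
    """Return (min, mean, max) of a 1-D array."""
    values = np.asarray(values, dtype=float)
    return np.array([values.min(), values.mean(), values.max()])

def percent_difference(propagated, simulated):
    """Percentage difference of min/mean/max relative to the simulated data."""
    p, s = summary(propagated), summary(simulated)
    return 100.0 * (p - s) / s

def mean_absolute_error(propagated, simulated):
    """MAE between the (min, mean, max) summaries; an assumed definition."""
    return float(np.mean(np.abs(summary(propagated) - summary(simulated))))

# Made-up energy use values in GJ, for illustration only.
simulated = np.array([900.0, 1000.0, 1100.0, 1200.0])
series = {
    "Normal": np.array([700.0, 1050.0, 1400.0]),
    "BestFit": np.array([880.0, 1040.0, 1230.0]),
}
for name, values in series.items():
    print(name,
          percent_difference(values, simulated).round(1),
          round(mean_absolute_error(values, simulated), 2))
```

In this toy example both series match the simulated mean, but the wider series has much larger errors at the bounds, which is the kind of behaviour the box plots are meant to expose.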

Summary of Current and Future Weather Data & Energy Use
Climate models are inherently uncertain. We can, however, generate snapshots of what the future climate might look like. In addition, the process of propagating the climate uncertainties is not absolute and comes with uncertainties regarding the selection of appropriate parameters and models. Figure 2 shows the dry bulb temperature (DB) for each weather file, reflecting the impact of climate change on the current and future weather files, together with the heating (HeatingGas) and cooling (CoolingElec) energy use of the sample office building when an EnergyPlus simulation is conducted for each weather file. The horizontal axis shows the weather files (current and future) used in this study, sorted by temperature in ascending order. Details of the weather files are presented in Appendices A and C. Air temperature is, in many building cases, the most influential driving force for heating and cooling energy use, although this can differ depending on the location and the building. As temperature increases, heating energy consumption decreases and cooling energy use increases (Figure 2). Using the morphed TMY3 weather file, adapted with the IPCC A2 scenario (which is the second most extreme scenario), there is an increase of 1454.9 GJ in cooling energy use by the end of the century compared to the currently available TMY3 file for Philadelphia. Other variables that also impact energy performance and are considered in this study are direct normal irradiation, diffuse horizontal irradiation, global horizontal irradiation, dew point and wind speed/direction. The main weather variables influencing a building's energy use can vary from building to building, depending on the building's physical characteristics, ventilation type, location and purpose of use.

Developed Regression Models
Multiple regression models are developed after conducting dynamic simulations of the case study building for the full weather file database. The regression coefficient values of the average annual weather variables of the weather files and their corresponding principal components are presented in Table 1. The regression models are between the annual heating and cooling energy use and the average annual weather variables (and their associated principal components for PCR). One set of regression models consists of full and stepwise multiple linear regressions between annual heating and cooling energy use and the average annual weather variables of the weather files used in the simulations. Another set consists of full, stepwise and 2-factor principal component regression models between annual heating and cooling energy use and the principal components associated with the average annual weather variables. Appendix E shows a more detailed summary of the regression model coefficients, their errors and p-values.

Input Variables Distributions
Results of the KS test used to develop the distributions of the input weather variables and their corresponding principal components are summarized in Appendix F. For the multiple linear regression models, the Normal, Lognormal, Logistic, Gamma, Weibull, Fisher-Tippett, Erlang and BestFit distributions were selected for the average annual weather variables. BestFit denotes selecting, for each weather variable separately, the distribution with the highest p-value in the Kolmogorov-Smirnov goodness of fit test; these distributions are then used in the regression model to propagate the uncertainties to the energy use. Appendix G provides details of the best fit characteristics. For the PCRs, we fit the Beta, Fisher-Tippett, GEV, Logistic, Normal and BestFit distributions to the principal components associated with the average annual weather variables. Figure 3 shows a summary of the input distributions selected for the impact assessment analysis. The distributions selected for the weather variables in the multiple linear regression model and for the principal components in the principal component regression model are slightly different. This is because the parameter fitting test produced distributions (with relatively reasonable p-values) that differed between the weather variables and their corresponding principal components.

Impact of Regression Model Selection on Energy Use
The heating and cooling energy use prediction developed from the regression models of interest (LR, LRS, PCR, PCRS and PCR2) are presented in Figure 4. The standardized coefficient factors of the regression models are summarized in Appendix D which also shows which variables were determined to be statistically less influential on the heating and cooling energy use in the stepwise regression models and therefore were left out of the regression model. Details of the regression factors values are presented in Appendix E. A scatter plot of the predicted energy use and actual energy use is given in Figure 4.
As can be seen from Figure 4, the cooling energy use predicted by the regression models fits the actual cooling energy use more closely than the predicted heating energy use fits the actual heating energy use. In general, most regression models for heating and cooling energy use predict the actual data relatively closely, except for the 2-factor principal component regression model. The regression models developed in this step are used to propagate the uncertainties of the input parameters to the building energy use.

Impact Assessment of Regression Model and Input Average Annual Variables Distributions on Energy Use Distributions
The sample distributions for the weather variables and their corresponding principal components are developed, and random samples (100 samples per variable) are generated and used in the regression models to propagate their uncertainties to the energy use. A summary of the parameters of each distribution is presented in Appendix F. We rely on box plots and mean absolute errors to assess the propagated energy use obtained with the various regression models and input distributions of the uncertainty propagation method. Outliers were excluded from the box plots. Figure 5 is a consolidation of the graphs and is presented as a guideline for interpreting the box plots. Note that the names of the distributions at the bottom of each graph indicate the input variable distribution series selected to obtain the energy use probability density function in the uncertainty propagation method.
In the assessments hereafter, we refer to the output energy distributions obtained from different input weather variable distributions in the uncertainty propagation method as "series". For example, the Normal distribution series for heating energy use refers to the distribution of heating energy use obtained from the uncertainty propagation method when a normal distribution was fit to the average annual weather variables. Figures 6 and 7 show the maximum, mean and minimum values of the EnergyPlus simulation output. From Figure 6, following a full multiple linear regression, the selection of different input distributions can result in large differences in the lower and upper bounds of the output heating energy use distribution. In some cases (Normal, Weibull and Fisher-Tippett), the lower bound differed by more than 100% from the minimum of the data obtained from EnergyPlus simulation. However, the average for all input distribution selections is close to the average of the simulated data. The heating distribution obtained with the BestFit distribution of the input values (BestFit series) showed smaller deviations from the upper and lower bounds of the simulated heating energy use. In addition, among the distribution series, BestFit shows a smaller difference between the maximum and minimum values of the heating energy use distribution. Figure 7 shows the results of heating energy use when the stepwise multiple linear regression was selected in the uncertainty propagation method and compares the heating energy distributions obtained with various input distributions for the weather variables.
As can be seen from Figure 7, the average values for each input distribution selection are similar to the average of the simulated data. However, the lower and upper bounds for some distributions differ considerably from the simulated data. Yet the differences between the upper and lower bounds of the heating energy use distribution and those of the simulated data are generally smaller when following a stepwise linear regression (Figure 7) than a full linear regression (Figure 6). In Figure 7, the heating energy distribution following the BestFit input distribution is relatively closer to the upper and lower bounds of the simulated data (compared to the Erlang, Fisher-Tippett, Weibull and Normal series) and shows a lower percentage change between the max and min of the distribution. However, the Logistic, Gamma and Lognormal series showed values closer to the upper and lower bounds and a smaller percentage difference between the max and min of the distribution than BestFit. A smaller difference between the upper and lower bounds of a distribution could enhance the credibility of the output distribution when interpreting the data. This can also be seen in Table 2, which shows the mean absolute error for each series following LR and LRS. As can be seen from Table 2, the sum and average of the mean absolute errors of all the distribution series are lower for the stepwise multiple linear regression than for the full multiple linear regression. This suggests that following a stepwise regression model in the uncertainty propagation method could reduce the uncertainties associated with the selection of the weather variable input distribution and reduce the variation of the output results between the lower and upper bounds. This can be explained by the fact that the stepwise regression model has fewer weather variables than the full regression model and, as a result, less of the uncertainty associated with the variables is propagated to the output energy use.
It should be noted that both the stepwise and the full regression models were shown to be a good fit to the data. Figures 8-10 show the heating energy use distributions obtained with a PCR, a stepwise PCR and a 2-factor PCR, respectively, when selecting various input distributions for the principal component variables in the uncertainty propagation method. In Figures 8-10, the average values of all distribution series show a relatively good fit to the mean of the simulated data. As can be seen from Figure 8, the series show relatively smaller variations between the maximum and minimum of the heating energy use distribution than in Figure 6, which followed a full multiple linear regression model on the weather variables. Although the distributions fitted to the principal components were not the same as those fitted to the input weather variables, comparing the common ones (Fisher-Tippett, Normal, Logistic and BestFit) with Figures 6 and 7 shows that in Figures 8-10 not only are the differences between the upper and lower bounds and the simulated data smaller, but the percentage change between the max and min of the output distribution is also small. This is also evident in the sum and average of the mean absolute errors of the principal component regression approach relative to the multiple linear regression approach in Table 3. From Table 3, it can also be seen that following a two-factor principal component regression model in the uncertainty propagation gave a lower average and sum of mean absolute errors across all series than the cases shown in the other figures. In Figure 8, the BestFit series showed the smallest percentage change between the max and min of the distribution compared to all other series, and a relatively smaller change than most other series in Figures 9 and 10; this was also seen in Figure 6 and for most series in Figure 7. Figures 11 and 12 show the results of cooling energy use when propagating uncertainties with the regression models for various input weather file distributions.
In Figures 11 and 12, the average cooling energy use distributions show a close fit to the mean of the simulated data. In addition, the percentage changes between the max and min of the cooling energy distributions are relatively smaller than those shown for heating energy consumption (Figures 6 and 7). This is also apparent in the average and sum of the mean absolute errors of the series for both regression models shown in Table 4. In Figures 11 and 12, the lower and upper bounds of the cooling energy use distribution also deviate less from the simulated data than the heating energy use results presented in Figures 6 and 7. Compared to most series of both regression models, the BestFit distribution shows a relatively smaller percentage change between the max and min of the distribution and smaller variations in the lower and higher bounds relative to the simulated data.
Similar to the previous assessments (Figures 6-12), the cooling energy use distributions for all principal component regression cases and all distribution series show a close fit to the average and relatively low variations in the lower and higher bounds compared to the simulated data (Figures 13-15). The BestFit series shows the lowest percentage change between the min and max of the distribution of all series for every regression case. The sum of the mean absolute errors of the series is lower for the principal component regression cases (Figures 13-15) than for the linear regression models (Figures 11 and 12), but the average of the mean absolute errors of the series is higher (Table 5).

Discussion
In general, the variations in the cooling energy use distributions were relatively lower than those in the heating energy use distributions for all regression cases and all distribution series. One reason for the small variations among the cooling energy use distributions following the full and stepwise multiple linear regression can be found in the scatter plots of Figure 4, which show that the regression models had a relatively better fit to the cooling energy use than to the heating energy use. In addition, when propagating input uncertainties into building energy use through a regression model and a parameter fitting process, the selection of the regression model can significantly affect the uncertainties of the higher and lower bounds of the output energy use under different input distributions, and could change the percentage change between the maximum and minimum of the energy use distribution.
The use of a principal component regression model also showed small variation between the lower and higher bounds of the results, especially for heating energy use. The drawback of the principal component regression model is that principal components cannot be attributed to any specific weather variable. In other words, we cannot determine which weather variables have the highest impact on the energy results. In addition, unlike a multiple linear regression model, where the developed correlation can be used for other cases, the principal component regression model is only valid for the case under examination.
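The interpretability drawback can be illustrated with a minimal principal component regression sketch in Python. The weather variables, their values and the heating response below are synthetic and hypothetical, chosen only to show the mechanics; they are not the study's data, and the two retained components are an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30  # hypothetical number of future weather files

# Synthetic average annual weather variables (illustrative only).
temp = rng.normal(12.0, 1.0, n)
humidity = 0.5 * temp + rng.normal(60.0, 2.0, n)  # correlated with temp
solar = rng.normal(150.0, 10.0, n)
X = np.column_stack([temp, humidity, solar])

# Synthetic annual heating energy use.
heating = 500.0 - 20.0 * temp + 0.5 * solar + rng.normal(0.0, 5.0, n)

# Standardize the variables, then obtain principal directions via SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

k = 2                    # number of retained components (assumed)
loadings = Vt[:k]        # each row mixes all three weather variables
scores = Z @ loadings.T  # principal component values per weather file

# Ordinary least squares of heating energy on the component scores.
A = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(A, heating, rcond=None)

print("loadings (rows = components, columns = temp, humidity, solar):")
print(np.round(loadings, 3))
print("regression coefficients (intercept, PC1, PC2):", np.round(coef, 3))
```

Each row of `loadings` generally has non-zero weights on every weather variable, so a coefficient on a component cannot be read as the effect of a single variable. The loadings also depend on the particular data set, which is consistent with the model being tied to the case under examination.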

Conclusions
Given the current trends of climate change, a probabilistic approach to assessing building energy performance is necessary. Propagating climate uncertainties into building energy performance results in a distribution of future building energy use. This offers the opportunity to assess the risk of variations in building cooling and heating energy use during the design phase. However, it is important to understand that the uncertainty propagation process can itself introduce new uncertainties which, if not considered, could yield unreliable energy results. We may not have a complete understanding of the distributions of the weather variables, and in many cases fitting a proper regression model between the weather variables and energy use is a challenge. In this study, we investigated the simultaneous impact of selecting the average annual weather variable distribution and the regression model between the weather variables and energy use when propagating climate change uncertainties.
To propagate climate uncertainties into building energy use, we followed a four-step propagation method which consists of dynamic building simulation, regression model development, input distribution fitting and random generation. We then assessed the impact of the selection of input average annual weather variable distributions and regression models on the heating and cooling energy distributions. The impact assessment addresses the uncertainties stemming from the selection of input parameters and the development of regression models.
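The overall flow of such a propagation can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the study's implementation: the building simulation step is replaced by synthetic placeholder data, only a Normal distribution is fitted to each variable, the regression is a plain multiple linear regression, and the variables are sampled independently. All variable names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1 (placeholder): in practice each future weather file is run through
# a dynamic building simulation; here the inputs and outputs are synthetic.
n_files = 40
temp = rng.normal(11.0, 0.8, n_files)    # average annual dry-bulb temperature
solar = rng.normal(160.0, 8.0, n_files)  # average annual solar radiation
cooling = 200.0 + 15.0 * temp + 0.3 * solar + rng.normal(0.0, 3.0, n_files)

# Step 2: fit a distribution to each average annual weather variable.
# Assumption: a Normal fit (mean, sample standard deviation) for both.
params = {
    "temp": (temp.mean(), temp.std(ddof=1)),
    "solar": (solar.mean(), solar.std(ddof=1)),
}

# Step 3: regression between the weather variables and energy use.
A = np.column_stack([np.ones(n_files), temp, solar])
beta, *_ = np.linalg.lstsq(A, cooling, rcond=None)

# Step 4: random sampling from the fitted distributions, pushed through
# the regression model to obtain an energy use distribution.
n_samples = 10_000
temp_s = rng.normal(params["temp"][0], params["temp"][1], n_samples)
solar_s = rng.normal(params["solar"][0], params["solar"][1], n_samples)
cooling_dist = beta[0] + beta[1] * temp_s + beta[2] * solar_s

print("simulated  min/mean/max:", cooling.min(), cooling.mean(), cooling.max())
print("propagated min/mean/max:",
      cooling_dist.min(), cooling_dist.mean(), cooling_dist.max())
```

Swapping the Normal fit in step 2 for another distribution family, or the regression in step 3 for a stepwise or principal component model, is what changes the bounds of `cooling_dist` while leaving its mean close to the simulated mean.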
Results show that, regardless of the regression model or the input variable distributions selected, the average energy use varied little across the series and differed relatively little from the simulated data. This means we can have higher confidence in interpreting the mean values when considering climate change uncertainties in building energy use. However, in many cases extreme conditions (higher and lower bounds) are of interest, and finding the appropriate distribution to fit the input variables is not trivial. Our findings show that selecting the BestFit distributions for weather variables and principal components could, in most cases, significantly reduce the variations in the higher and lower bounds of the heating and cooling energy use distributions relative to the simulated data. In addition, the BestFit distribution showed a relatively smaller percentage change between the maximum and minimum of the heating and cooling energy use than most distribution series. In some cases for the heating energy use distribution, however, the Logistic distribution series proved to be a better fit. Our findings suggest that when the modeler is uncertain about input distributions, selecting the appropriate regression model can reduce the variations between the high and low bounds of the output energy. For heating energy use distributions, results showed that a two-factor principal component regression model gives a lower average and sum of mean absolute errors across the distribution series of the weather variables and their corresponding principal components. For cooling energy use distributions, however, following a stepwise multiple linear regression yields a lower average mean absolute error for all input distribution series, whereas following a two-factor principal component regression in the uncertainty propagation method results in a lower sum of mean absolute errors of the output cooling energy use distribution series.
The purpose of presenting the results is to show the impact of the selection of different regression models and input distribution fits on the output energy distribution. Selecting the most appropriate combination can be very complex and would require a large amount of input data. Furthermore, it would necessitate developing simulations with high certainty, which is difficult. The results presented in this study offer a way of understanding how to parse the uncertain nature of climate projections and the limited number of future hourly weather generator methods available.