Energy Management through Cost Forecasting for Residential Buildings in New Zealand

Abstract: Over the last two decades, the residential building sector has been one of the largest energy consumption sectors in New Zealand. The relationship between that sector and household energy consumption should be carefully studied in order to optimize the energy consumption structure and satisfy energy demands. Researchers have made efforts in this field; however, few have concentrated on the association between household energy use and the cost of residential buildings. This study examined the correlation between household energy use and residential building cost. Analysis of the data indicates that they are significantly correlated. Hence, this study proposes time series methods, including the exponential smoothing method and the autoregressive integrated moving average (ARIMA) model, for forecasting the building costs of five categories of residential buildings (one-storey house, two-storey house, townhouse, residential apartment and retirement village building) in New Zealand. Moreover, the artificial neural network (ANN) model was used to forecast the future usage of three types of household energy (electricity, gas and petrol) using the residential building costs. The t-test was used to validate the effectiveness of the obtained ANN models. The results indicate that the ANN models can generate acceptable forecasts. The primary contributions of this paper are twofold: (1) it identifies the close correlation between household energy use and residential building costs; (2) it provides a new perspective for optimizing energy management.


Introduction
The residential sector is one of the greatest energy consumers in New Zealand [1]. A significant amount of energy is used to provide comfortable indoor environments. Recently, the residential building sector has been growing due to economic development, social progress, and an increasing population [2]. This growth places great pressure on household energy consumption. If the increasing demand is not appropriately evaluated, it may have a negative effect on long-term energy management and on economic development and structure. Decision makers need to know the impact of this growth on energy consumption. Hence, it is essential to establish a relationship between the residential building sector and the energy management sector, and examining how the residential building sector influences energy consumption has significant theoretical and practical implications for energy management. This study examines the effects of residential building costs on household energy consumption to establish a link between the residential building and energy sectors. Moreover, many factors may affect the future trend of energy consumption in the residential sector, and assessing every factor in detail is impractical and time-consuming. Hence, the residential building cost can be used as an indicator to forecast future household energy consumption, which enables decision makers to obtain accurate forecasts of future energy consumption in the residential building sector efficiently and provides a new perspective for energy management.
To obtain a comprehensive cost indicator, the building costs of five main types of residential buildings in New Zealand are considered: one-storey house (AR1), two-storey house (AR2), townhouse (AR3), residential apartment (AR4), and retirement village building (AR5). This study uses time series modelling techniques to forecast the building costs of these five categories of residential building. The two most widely used time series forecasting methods are exponential smoothing and the autoregressive integrated moving average (ARIMA) model. The exponential smoothing method was originally introduced in [3,4] for short-term sales forecasting in support of supply chain management and production planning. Its widespread use is mainly due to its relative simplicity: it requires only a small sample and has a comprehensible statistical framework and model parameters. Exponential smoothing models are based on the trend and seasonality in a time series, while ARIMA models describe the autocorrelations in the time series. A framework for exponential smoothing methods was developed based on state-space models [5]. The ARIMA approach is a widely used linear method [6]. It offers flexibility by representing various components of a time series, including autoregressive (AR), moving average (MA), and combined AR and MA components. It is considered an efficient approach for short-term forecasting of rapidly changing series. Forecasts from ARIMA models are explained by previous (lagged) values and stochastic error terms [7]. The performance of the two forecasting techniques was evaluated in terms of error measures.
Furthermore, an artificial intelligence (AI) method, the artificial neural network (ANN) model, is used to forecast household energy consumption, with influencing variables such as residential building costs as inputs. The ANN technique is an intelligent method that can learn and change its architecture as input variables are fed into it. An ANN is a mathematical model inspired by biological neural networks and developed to mimic the networking of the human brain [8]. In this study, multilayer perceptron artificial neural network (MLP ANN) models are developed to forecast household energy consumption. To improve the adequacy of the models, a trial-and-error process is used to optimize the parameters of the ANN models, including the number of hidden layers and neurons, the activation function, and the training algorithm. The forecasts of household energy consumption obtained from the proposed ANN models are compared with the actual values. The t-test is adopted to validate the effectiveness of the ANN models.
The rest of this study is organized as follows. Section 2 reviews previous studies on factors influencing residential energy use, cost forecasting methods, and energy consumption forecasting methods. Section 3 introduces the correlation analysis, exponential smoothing method, ARIMA model, ANN model, and t-test. The correlation analysis results, the exponential smoothing and ARIMA models for the cost series, and the ANN models for household energy consumption are presented in Section 4. The results are discussed in Section 5, and the final section presents the conclusion.
Moreover, several studies have shown that some socio-economic variables affect energy consumption [35][36][37]. In addition, some studies have comprehensively evaluated the life-cycle cost of energy-saving buildings to promote their adoption. In [38], the construction costs of conventional buildings and net-zero energy buildings were compared. Construction professionals usually consider that nearly-zero energy buildings may cost more, so cost is regarded as a barrier to promoting net-zero buildings. However, the results indicate no significant difference between the costs of conventional and net-zero energy buildings. Study [39] investigated the energy savings and cost effectiveness of refurbishing the Italian housing stock. The energy performance of a building before and after refurbishment, the refurbishment cost, and the energy cost savings after refurbishment were evaluated by building typology in order to choose an optimal package plan for the building. Study [40] provided a method to identify cost-optimal levels of minimum energy performance requirements in residential buildings. The study evaluated the cost of individual factors (HVAC, building envelopes, technologies used, etc.) and the energy-saving cost to develop a cost-optimal residential building.

Cost Forecasting Methods
Several studies have focused on construction cost forecasting. For example, [41] explained how neural networks can be applied to forecast changes in the construction cost index. In [42], a dynamic regression model that forecasts building cost using economic variables was presented. Although these methods are effective in identifying the leading cost drivers and producing appropriate estimates at the inception of a project, they have difficulty handling time-varying variables and reflecting time lag effects. Since much time-related data are dependent or autocorrelated [43], time series techniques can be adopted to overcome these limitations.
To address the time-related problems of these methods, time series techniques, which estimate future values of a variable from its own past values and random shocks, have been adopted to forecast building costs. For example, [44] used time-series models to provide reliable forecasts of building costs, tender prices, and the impacts of economic inflation on building projects. In [45], regression analysis and ARIMA techniques were integrated to predict a tender price index for Hong Kong building projects. Reference [46] developed a Box-Jenkins model to estimate the labor market of the Hong Kong construction industry. Reference [47] illustrated a time series method that estimates future values from past values and the corresponding random errors to produce a reliable prediction of construction cost.

Energy Consumption Forecasting
Many researchers are now working in the field of energy consumption forecasting [8,48,49]. Forecasting techniques fall into three main groups: statistical, engineering, and artificial intelligence (AI) approaches [50]. Statistical approaches utilize historical data to predict important parameters. Regression models, autoregressive integrated moving average (ARIMA) models, Gaussian mixture techniques, and conditional demand analysis are a few examples of statistical forecasting approaches [51][52][53][54]. Engineering approaches investigate energy consumption based on the architectural features and climate behavior of buildings, including building design; thermal characteristics of building materials; heating, ventilation and air conditioning (HVAC) systems; and weather conditions [12][55][56][57][58][59][60]. AI methods provide different processes for modelling parameters based on the same historical data used by statistical forecasting methods [50]. Among AI techniques, neural networks have the advantage of being able to model varying, non-linear relationships between input and output variables. Several AI techniques, including genetic algorithms (GA), genetic programming (GP), fuzzy logic (FL), support vector machines (SVM), and ANNs, have been employed in various studies [61][62][63].
One study [64] used statistical methods, including exponential smoothing models and bootstrap aggregating ARIMA models, to predict electrical energy consumption in 2019 for different countries based on monthly electricity consumption data. Another study [61] used three AI methods, GP, SVM, and ANNs, to develop forecasting models of electricity consumption for five Asian countries based on long-term electricity consumption data. Study [65] used both multiple regression and GP methods to predict natural gas consumption for a steel company in Slovenia; the results indicated that GP outperforms the multiple regression method. Study [66] employed a GP method involving various social, political and economic parameters to develop a long-term forecasting model of energy consumption for a Brazilian industrial sector. An additional study [67] used the ANN technique to develop a forecasting model of energy consumption in the residential building sector. ANN models have also been used in the oil industry [68,69]. Study [69] employed multiple regression, ANN, and SVM methods to forecast daily electricity consumption by oil-driven pumps in China.

Data
The data used comprise the average quarterly values of household energy usage and residential building costs observed for the period 2001:Q1-2018:Q4: (a) household electricity usage, (b) household gas usage, (c) household petrol usage, (d) one-storey house building cost, (e) two-storey house building cost, (f) townhouse building cost, (g) apartment building cost, and (h) retirement village building cost. Household energy usage data are available from Statistics New Zealand. Electricity, gas, and petrol are the most used forms of energy in the residential sector of New Zealand, as shown in Figure 1a [1]. Energy in the residential sector of New Zealand is mainly used for space heating, water heating, refrigeration, and home appliances, as shown in Figure 1b [1]. The cost indices are provided by the QV cost-builder, which is widely accepted in New Zealand. The building cost is composed of the capital construction cost, the associated capital cost, and the client-related cost [70]. The construction cost consists of material, labor, and equipment costs. The associated capital and client-related costs are added as a percentage of the capital construction cost.
The data (72 observations) were split into two parts: the training part for model fitting and the testing part for evaluating the forecasting performance by comparing forecasts with observations [71]. There is no clear rule for this division; in this study, about 72% of the data

Correlation Analysis
Correlation analysis is a statistical method used to evaluate the strength of the relationship between two variables [72]. A strong correlation indicates that two variables are closely related, while a weak correlation indicates that they are only weakly related. Pearson's correlation coefficient, also called the linear correlation coefficient, assesses the linear relationship between two variables [73]. Let i and j be two variables observed on the same sample of size n. Pearson's correlation coefficient r is calculated using Equation (1):

$$ r = \frac{\sum_{k=1}^{n} (i_k - \bar{i})(j_k - \bar{j})}{\sqrt{\sum_{k=1}^{n} (i_k - \bar{i})^2}\,\sqrt{\sum_{k=1}^{n} (j_k - \bar{j})^2}} \tag{1} $$

where $\bar{i} = \frac{1}{n}\sum_{k=1}^{n} i_k$ and $\bar{j} = \frac{1}{n}\sum_{k=1}^{n} j_k$ are the means of the variables i and j, respectively. The correlation coefficient r ranges between −1 and +1. The sign of r indicates the direction of the relationship: if the linear correlation between i and j is positive (an increase in one variable is associated with an increase in the other), then r > 0, whereas if it is negative (an increase in one variable is associated with a decrease in the other), then r < 0. The magnitude of r indicates the strength of the correlation. If r is close to 1, the two variables are strongly and positively associated; if r is close to −1, they are strongly and negatively associated. The value r = 0 indicates the absence of any linear association between i and j.
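As a minimal sketch of Equation (1), the following Python function computes Pearson's correlation coefficient from two equally long samples. The function name and the sample values are illustrative assumptions and are not taken from this study, in which the analysis was carried out in SPSS.

```python
from math import sqrt


def pearson_r(i_values, j_values):
    """Pearson's linear correlation coefficient of two paired samples."""
    n = len(i_values)
    if n != len(j_values) or n < 2:
        raise ValueError("both samples must have the same length (at least 2)")
    i_bar = sum(i_values) / n
    j_bar = sum(j_values) / n
    # Numerator: sum of cross-products of deviations from the means.
    cross = sum((a - i_bar) * (b - j_bar) for a, b in zip(i_values, j_values))
    # Denominator: square root of the product of the sums of squared deviations.
    ss_i = sum((a - i_bar) ** 2 for a in i_values)
    ss_j = sum((b - j_bar) ** 2 for b in j_values)
    return cross / sqrt(ss_i * ss_j)


# Hypothetical values for illustration only (not data from this study).
cost_index = [100.0, 102.0, 105.0, 109.0, 114.0]
energy_use = [50.2, 50.9, 52.8, 54.1, 57.3]
r = pearson_r(cost_index, energy_use)
```

A value of r near +1 for such a pair would correspond to the strong positive association described above. The function raises a division-by-zero error if either sample is constant, since the coefficient is undefined in that case.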

Exponential Smoothing
Exponential smoothing, which has been in use since the 1950s, is one of the most effective forecasting methods when a time series has a trend that changes over time [74]. It weights the observed time series values unequally: more recent observations are weighted more heavily than more remote ones, and the weights decrease exponentially as one moves further into the past. A smoothing constant determines the rate at which the weights of older observations decrease. Exponential smoothing techniques include simple exponential smoothing, linear trend corrected exponential smoothing, Holt-Winters methods, and damped trend exponential smoothing [74].
According to [75], exponential smoothing models have been widely used in many research fields and industry practices due to their relative simplicity, their good overall forecasting performance, and their ability to account for trends, seasonality and other features of the data. A large number of existing studies also indicate their extensive industrial applications [76,77]. In this study, the Holt-Winters exponential smoothing method was adopted.
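The exponentially decreasing weights described above can be illustrated with simple exponential smoothing, the most basic member of this family. The sketch below is an illustration under stated assumptions rather than the procedure used in this study: the function names, the smoothing constant `alpha`, the choice of initialising the level with the first observation, and the sample values are all hypothetical.

```python
def smoothing_weights(alpha, n_obs):
    """Weight on the observation k periods back: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_obs)]


def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    The level is initialised with the first observation, which is one
    common convention among several.
    """
    level = series[0]
    levels = [level]
    for y in series[1:]:
        # New level: weighted average of the new observation and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


# Hypothetical values for illustration only.
weights = smoothing_weights(alpha=0.3, n_obs=5)
levels = simple_exponential_smoothing([10.0, 12.0, 11.0, 13.0], alpha=0.5)
```

A larger `alpha` makes the weights decay faster, so the smoothed level reacts more quickly to recent observations.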

Holt-Winters Method
Holt-Winters methods are designed for time series that exhibit a linear trend and seasonal variation, and comprise the additive and the multiplicative Holt-Winters methods [74]. An advantage of these methods is that they model seasonality directly, without first transforming the data to stationarity. If a time series has a linear trend and an additive seasonal pattern, the additive Holt-Winters method is appropriate. The time series can then be described by Equation (2):

$$ y_t = \beta_0 + \beta_1 t + S_t + \varepsilon_t \tag{2} $$

where $\beta_0$ is the intercept of the linear trend; $\beta_1$ is the growth rate; $S_t$ is the seasonal pattern; and $\varepsilon_t$ is the error term. For such time series, the mean, the growth rate, and the seasonal variation may change over time. A state space model for these changing components is given in Equations (3)-(6).
To begin the estimation, initial values for the level, growth rate and seasonal variation are required. First, a least squares regression model is fitted to the available data, as expressed in Equation (7). The initial values $l_0$ and $b_0$ are obtained from this model, as are the estimated values $\hat{y}_t$ for each time period. The initial seasonal factor for each of the L seasons is then calculated using Equation (8).
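The updating of the level, growth rate and seasonal factors can be sketched in Python using the standard textbook recursions of the additive Holt-Winters method. This is an illustration only: the notation may differ from Equations (3)-(6), the smoothing constants `alpha`, `gamma` and `delta` and all function and argument names are assumptions, and the initial values are supplied by the caller rather than estimated by regression as described above.

```python
def additive_holt_winters(series, season_length, alpha, gamma, delta,
                          level0, trend0, seasonals0, horizon):
    """Additive Holt-Winters smoothing with caller-supplied initial values.

    alpha, gamma and delta smooth the level, growth rate and seasonal
    factors; seasonals0 holds one initial seasonal factor per season.
    Returns (one-step-ahead fitted values, forecasts for `horizon` periods).
    """
    level, trend = level0, trend0
    seasonals = list(seasonals0)
    fitted = []
    for t, y in enumerate(series):
        idx = t % season_length
        s_prev = seasonals[idx]
        # One-step-ahead forecast made before observing y.
        fitted.append(level + trend + s_prev)
        # Update level, growth rate and the seasonal factor of this season.
        new_level = alpha * (y - s_prev) + (1 - alpha) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        seasonals[idx] = delta * (y - new_level) + (1 - delta) * s_prev
        level = new_level
    n = len(series)
    forecasts = [level + h * trend + seasonals[(n + h - 1) % season_length]
                 for h in range(1, horizon + 1)]
    return fitted, forecasts


# Hypothetical quarterly series (L = 4): level 10, growth 1 per quarter,
# and a fixed additive seasonal pattern. Not data from this study.
pattern = [2.0, -1.0, 0.0, -1.0]
series = [10.0 + (t + 1) + pattern[t % 4] for t in range(8)]
fitted, forecasts = additive_holt_winters(
    series, season_length=4, alpha=0.2, gamma=0.1, delta=0.1,
    level0=10.0, trend0=1.0, seasonals0=pattern, horizon=4)
```

Because the hypothetical series follows Equation (2) exactly with no error term and the initial values are exact, the fitted values reproduce the series; with real data the smoothing constants would be chosen to minimise the forecast errors.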

Multiplicative Holt-Winters Method
If a time series has a linear trend with multiplicative seasonal variation, the multiplicative Holt-Winters method is appropriate. The state space model for this method is given in Equations (9)-(12).
The seasonal factors are computed using Equation (13).

Autoregressive Integrated Moving Average (ARIMA)
The Box-Jenkins approach selects an appropriate model for time series data in four steps: model identification, parameter estimation, diagnostic checking, and forecasting [78]. The development process of an ARIMA model is shown in Figure 3. ARIMA models are flexible and adaptive since they forecast the values of a time series as a linear combination of its past values and past errors (in terms of univariate analysis). To take the seasonality of the time series into account, a seasonal ARIMA model, denoted ARIMA(p,d,q)(P,D,Q)$_L$, is introduced, where p, d, and q are the non-seasonal autoregressive, differencing, and moving average orders; P, D, and Q are the seasonal autoregressive, differencing, and moving average orders; and L is the number of seasons. A seasonal ARIMA model can be written as Equation (14):

$$ \phi_p(B)\,\varphi_P(B^L)\,(1-B)^d (1-B^L)^D y_t = \delta + \theta_q(B)\,\vartheta_Q(B^L)\,a_t \tag{14} $$

where

$$ \phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p, \qquad \varphi_P(B^L) = 1 - \varphi_{1,L} B^{L} - \varphi_{2,L} B^{2L} - \cdots - \varphi_{P,L} B^{PL}, $$

$$ \theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q, \qquad \vartheta_Q(B^L) = 1 - \vartheta_{1,L} B^{L} - \vartheta_{2,L} B^{2L} - \cdots - \vartheta_{Q,L} B^{QL}; $$

B is the backshift operator; L is the number of seasons in a year (L = 4 for quarterly data and L = 12 for monthly data); $\delta$ is a constant term; $a_t, a_{t-1}, \cdots$ are random shocks; $\phi_1, \phi_2, \cdots, \phi_p$ are non-seasonal autoregressive parameters; $\varphi_{1,L}, \varphi_{2,L}, \cdots, \varphi_{P,L}$ are seasonal autoregressive parameters; $\theta_1, \theta_2, \cdots, \theta_q$ are non-seasonal moving average parameters; and $\vartheta_{1,L}, \vartheta_{2,L}, \cdots, \vartheta_{Q,L}$ are seasonal moving average parameters.
Classical ARIMA models are usually used to describe stationary time series. The sample autocorrelation function (SAC) can be used to determine whether a time series is stationary. If the SAC of the time series values either dies down quickly or cuts off quickly at both seasonal and non-seasonal lags, the time series can be considered stationary. If the SAC dies down extremely slowly at either seasonal or non-seasonal lags, it is reasonable to conclude that the time series is non-stationary. A non-stationary time series should be transformed into a stationary one by differencing, as shown in Equation (15):

$$ z_t = (1-B)^d (1-B^L)^D y_t \tag{15} $$

where $z_t$ is the differenced series, d and D are the non-seasonal and seasonal differencing orders, and L is the number of seasons in a year (L = 4 for quarterly data and L = 12 for monthly data). The parameters of the ARIMA model are usually estimated by the least squares method. The fitted models should then be checked to confirm that the ARIMA assumptions are satisfied. The Ljung-Box test is usually used to examine whether the autocorrelation of the residuals differs significantly from that expected of a white noise process. If the p-value is greater than 0.05, there is no significant autocorrelation in the residuals and the model is considered adequate [79].
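The identification and diagnostic steps above rely on three simple computations: differencing, the sample autocorrelation function, and the Ljung-Box statistic. The sketch below shows one way to compute them in plain Python under stated assumptions; the function names and sample values are hypothetical, and the statistic is given in its usual textbook form, $Q = n(n+2)\sum_{k=1}^{K} r_k^2/(n-k)$, rather than as implemented in any particular software package.

```python
def difference(series, lag=1):
    """Difference a series at the given lag (lag=1 regular, lag=L seasonal)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sample_autocorrelation(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q computed from the residual autocorrelations."""
    n = len(residuals)
    acf = sample_autocorrelation(residuals, max_lag)
    return n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(acf, start=1))


# Hypothetical quarterly series with a linear trend and seasonal pattern.
pattern = [3.0, -2.0, 1.0, -2.0]
y = [100.0 + 2.0 * t + pattern[t % 4] for t in range(16)]
seasonal_diff = difference(y, lag=4)   # removes the seasonal pattern
acf_raw = sample_autocorrelation(y, max_lag=4)
```

To obtain a p-value, Q is compared with a chi-square distribution whose degrees of freedom depend on the number of lags and estimated parameters; a statistics library (for example `scipy.stats.chi2.sf`, if SciPy is available) can supply that tail probability.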

Error Measure for Model Comparison
Although a model may fit the historical data well, this does not establish that it has good forecasting performance. The forecasting performance of a model can only be determined by the accuracy of its out-of-sample forecasts [80].
Accuracy is the most widely used criterion for forecasting models and can be measured in several ways, including the root mean square error (RMSE) [81], mean absolute error (MAE) [82], and mean absolute percentage error (MAPE) [83]. This study evaluated the accuracy of the forecasts by the MAPE between the actual and predicted values of the building cost; the lower the MAPE, the better the forecasting performance of the proposed model. Denote the actual observations of the time series by $y_i$ and the forecast values for the same series by $\hat{y}_i$, for i = 1, ..., n. The MAPE is computed using Equation (16):

$$ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{16} $$
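For illustration, the MAPE can be computed with a few lines of Python. This is a minimal sketch; the function name and the example values are hypothetical and do not come from the cost series analysed in this study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error (in percent) of forecasts vs. actuals."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the actual value,
    # so actual values must be non-zero.
    total = sum(abs((y - f) / y) for y, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical out-of-sample values for illustration only.
actual_cost = [1500.0, 1520.0, 1545.0, 1560.0]
predicted_cost = [1490.0, 1530.0, 1540.0, 1575.0]
error_pct = mape(actual_cost, predicted_cost)
```

Because each error is scaled by the actual value, the MAPE allows the five cost series to be compared even when their cost levels differ.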

Multilayer Artificial Neural Networks
The multilayer perceptron artificial neural network (MLP ANN) technique has been widely used in forecasting models [84] and typically includes an input layer, an intermediate (hidden) layer, and an output layer. The hidden layer contains a number of neurons, which is tuned to optimize the ANN model. All the neurons in an ANN are arranged in layers, and neurons in different layers are interconnected according to the designed architecture. The architecture of the ANN model is shown in Figure 4. The ANN model is expressed in Equation (17):

$$ y_j = f\left( \sum_{i} w_{ij} x_i + b_j \right) \tag{17} $$

where $x_i$ is the i-th input signal, $f$ is the activation function, the output $y_j$ is used as an input signal in the next layer, $w_{ij}$ is the weight of the connection between the i-th and j-th elements, and $b_j$ is the bias.
A training process based on the historical data is performed to determine the architecture of the network and the weights and biases of the ANN model. During the training process, the ANN model learns the relationship between the input and output variables and captures important information.
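The layer-by-layer computation of an MLP can be sketched as follows. This is a forward pass only, with a sigmoid hidden layer and a linear output layer chosen as assumptions for illustration; the weights, biases, input values and function names are hypothetical and are not the trained models of this study, and the training step itself is omitted.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    """Linear activation, often used for a regression output neuron."""
    return z


def layer_forward(inputs, weights, biases, activation):
    """Outputs of one layer: y_j = f(sum_i w_ij * x_i + b_j).

    weights[i][j] is the weight from input i to neuron j.
    """
    return [
        activation(sum(weights[i][j] * x for i, x in enumerate(inputs)) + biases[j])
        for j in range(len(biases))
    ]


def mlp_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Forward pass through one hidden layer and a linear output layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b, sigmoid)
    return layer_forward(hidden, out_w, out_b, identity)


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output.
hidden_w = [[0.4, -0.2], [0.1, 0.3]]
hidden_b = [0.0, 0.1]
out_w = [[0.7], [-0.5]]
out_b = [0.2]
prediction = mlp_forward([0.5, 0.8], hidden_w, hidden_b, out_w, out_b)
```

In practice the inputs would be scaled building cost values and the weights would be fitted by a training algorithm; both are outside the scope of this sketch.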

t-Test
The t-test, also referred to as Student's t-test, is a statistical method that includes the one-sample t-test, the independent two-sample t-test, and the paired-sample t-test [85]. Unlike some statistical methods whose effectiveness relies heavily on sample size, the t-test can be used with small samples (such as n < 30) [86]. The one-sample t-test tests the difference between the mean of one sample and a set value. The two-sample t-test compares the means of two samples. The paired-sample t-test tests the mean difference between paired samples, for example before and after an experiment. In this study, the paired-sample t-test was used. The t-value was calculated using Equation (18):

$$t = \frac{\sum D}{\sqrt{\dfrac{n \sum D^2 - \left(\sum D\right)^2}{n - 1}}} \qquad (18)$$

where D is the difference between samples x and y and n is the sample size. The t-critical value is then determined from the sample size n (n - 1 degrees of freedom). If the absolute t-value is smaller than the t-critical value (p > 0.05), there is no significant difference between the two samples. If the absolute t-value is greater than the t-critical value (p < 0.05), there is a significant difference between the two samples.
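The paired-sample comparison described above can be sketched in Python with only the standard library. The sketch computes the t-value as the mean of the paired differences divided by its standard error, which is assumed to be equivalent to Equation (18). The function names and sample numbers are illustrative, and the critical value is supplied by the caller (for example from a t-table) rather than computed.

```python
import math
from statistics import mean, stdev


def paired_t_value(x, y):
    """Paired-sample t-value: mean difference divided by its standard error."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    d = [a - b for a, b in zip(x, y)]  # paired differences D
    n = len(d)
    if n < 2:
        raise ValueError("at least two pairs are required")
    s_d = stdev(d)  # sample standard deviation (n - 1 in the denominator)
    if s_d == 0:
        raise ValueError("differences have zero variance")
    return mean(d) / (s_d / math.sqrt(n))


def differs_significantly(t_value, t_critical):
    """Two-tailed decision: True when |t| exceeds the critical value."""
    return abs(t_value) > t_critical


# Illustrative (made-up) forecasts and actual values, five pairs.
forecasts = [10.2, 11.1, 9.8, 12.0, 10.5]
actuals = [10.0, 11.3, 9.9, 11.8, 10.6]
t_val = paired_t_value(forecasts, actuals)
# 2.776 is the two-tailed 0.05 critical value for 4 degrees of freedom.
print(t_val, differs_significantly(t_val, 2.776))
```

Passing the critical value in explicitly keeps the sketch free of third-party dependencies; a statistics package would normally report the p-value directly.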

Correlation Analysis Results
Correlation analysis was performed to examine whether household energy use and residential building costs are significantly correlated, and therefore whether residential building costs can reliably indicate the future trend of household energy use. The correlation coefficients show a significant correlation between the two sets of variables, with significance assessed by two-tailed tests at the 0.05 level.
The correlation analysis was conducted using the statistical software SPSS (Statistical Package for the Social Sciences, version 23). The results indicate that residential building costs are positively and significantly correlated with household energy use, which supports the research hypothesis. This result is essential for establishing that residential building costs can serve as an indicator of future trends in household energy use.
The results of the correlation analysis between the household energy use variables and residential building costs are shown in Table 1. All household energy use variables are positively correlated with residential building costs. The highest correlation coefficient was obtained between household gas use and the residential building cost of a two-storey house (AR2), with r = 0.994. The weakest correlation was observed between household petrol use and the residential building cost of a retirement village (AR5), with r = 0.640. Although some correlations are relatively weak, all the household energy use variables correlate with the residential building costs at a significant level.
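As a minimal illustration of the kind of coefficient reported in Table 1, the sketch below computes a Pearson correlation coefficient and the associated t-statistic for testing zero correlation. It assumes that the reported coefficients are Pearson coefficients, which the text does not state explicitly. The function names and the data are hypothetical and are not values from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t-statistic for H0: no correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical quarterly building cost index and energy use values.
cost = [1000.0, 1020.0, 1045.0, 1060.0, 1090.0, 1110.0]
energy = [52.0, 53.5, 54.1, 55.8, 56.9, 58.2]
r = pearson_r(cost, energy)
print(r, correlation_t_statistic(r, len(cost)))
```

The t-statistic is compared with a two-tailed critical value at the 0.05 level, which mirrors the significance check described above.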

Exponential Models for Building Cost
Both the additive and the multiplicative Holt-Winters models were applied to the five cost series, and the model parameters were estimated following the methods outlined in [75]. The results of the exponential smoothing models for the cost of the five categories of residential building are displayed in Table 2. The p-values of the model parameters indicate that the parameters are statistically significant. The model fit measure R-squared and the error measures, namely the root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE), were also computed, together with the Bayesian Information Criterion (BIC) as a measure of model parsimony. The residuals were examined with the Ljung-Box Q test and the Shapiro-Wilk test; the results are shown in Table 3. All of these results indicate that the Holt-Winters models fit the cost series fairly well because they capture both the trend and the seasonal variation.
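The additive Holt-Winters recursion can be sketched as follows. This is a simplified illustration, not the estimation procedure of [75]. The smoothing parameters are passed in rather than estimated, the initial level, trend, and seasonal terms come from a simple heuristic based on the first two seasons, and the function name and toy series are assumptions made for the example.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    y: observed series; m: seasonal period (4 for quarterly data);
    alpha, beta, gamma: level, trend, and seasonal smoothing parameters.
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Heuristic initial values from the first two seasons (an assumption).
    first = sum(y[:m]) / m
    second = sum(y[m:2 * m]) / m
    trend = (second - first) / m
    level = first + trend * (m - 1) / 2  # level at the end of season one
    seasonal = [y[i] - (first + trend * (i - (m - 1) / 2)) for i in range(m)]

    for t in range(m, len(y)):
        s_prev = seasonal[t % m]
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal[t % m] = (gamma * (y[t] - prev_level - prev_trend)
                           + (1 - gamma) * s_prev)

    n = len(y)
    # Forecast = last level + h * trend + most recent matching seasonal term.
    return [level + h * trend + seasonal[(n - 1 + h) % m]
            for h in range(1, horizon + 1)]


# Toy quarterly series: linear trend plus a fixed seasonal pattern.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [10.0 + 2.0 * t + pattern[t % 4] for t in range(12)]
print(holt_winters_additive(series, 4, 0.3, 0.1, 0.2, 4))
```

The multiplicative variant replaces the additive seasonal term with a seasonal factor that scales the level and trend.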

Seasonal ARIMA Models
In this study, 52 observations from 2001:Q1 to 2013:Q4 of each cost series were used to fit the proposed models. The stationarity of the five cost series was examined using the autocorrelation function (ACF) and partial autocorrelation function (PACF); the results are shown in Figure 5. The ACFs of all five building cost series decay very slowly at both non-seasonal and seasonal lags, which indicates non-stationarity, so an appropriate order of differencing had to be determined for each series. For four of the cost series, a seasonal (four-quarter) difference was taken to remove seasonality and a regular difference was taken to remove the trend. The cost series for the two-storey house is the exception: only a regular difference was needed to make it stationary. The ACFs and PACFs of the five differenced cost series are shown in Figure 6. To select suitable seasonal ARIMA models, models with various combinations of regular orders (p and q) and seasonal orders (P and Q) were evaluated. The selected seasonal ARIMA models for the five cost series are shown in Table 4. Because the data are quarterly, the seasonal period in the ARIMA models is L = 4.
The Ljung-Box Q test was employed to examine the autocorrelation of the model residuals. If the p-value is greater than 0.05, the null hypothesis that the residuals are not autocorrelated cannot be rejected [6]. The Shapiro-Wilk test was applied to examine the normality of the residuals. If the p-value of this test is greater than 0.05, there is no evidence to reject the null hypothesis that the residuals follow a normal distribution [87]. As seen from Table 5, the residuals of all the models pass both tests, indicating that the proposed models are adequate. According to the estimation results, the model fit measures and error measures are acceptable, which suggests that the proposed models fit the data fairly well. The model fit parameters of the exponential models are also shown in Table 5.
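The differencing steps described above can be sketched in a few lines of Python. The helper names and the toy quarterly series are illustrative assumptions and do not come from the study's data.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


# Toy quarterly series with a linear trend and a fixed seasonal pattern.
toy = [10 + 2 * t + [1, -1, 2, -2][t % 4] for t in range(12)]
seasonally_differenced = difference(toy, lag=4)   # removes the seasonal pattern
fully_differenced = difference(seasonally_differenced, lag=1)  # removes the trend
print(acf(toy, 4))
print(seasonally_differenced, fully_differenced)
```

A slowly decaying ACF on the raw series, as in the toy example, is the pattern that motivates differencing before fitting a seasonal ARIMA model.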
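For reference, the Ljung-Box statistic in its standard textbook form is shown below. The symbols are conventional notation and are not taken from the source: $n$ is the number of residuals, $h$ is the number of lags tested, and $\hat{r}_k$ is the sample autocorrelation of the residuals at lag $k$.

```latex
Q = n\,(n + 2) \sum_{k=1}^{h} \frac{\hat{r}_k^{\,2}}{n - k}
```

Under the null hypothesis of no residual autocorrelation, $Q$ approximately follows a chi-square distribution whose degrees of freedom equal $h$ minus the number of estimated ARMA parameters, so a p-value above 0.05 gives no evidence of remaining autocorrelation.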

ANN Models
An adequate MLP ANN model, with a determined architecture (number of hidden layers and neurons) and activation functions, is obtained through a trial-and-error process, which is not a trivial effort [88]. This process aims to select a suitable number of hidden layers and neurons, an activation function, and a training algorithm to optimize the MLP ANN model. To select an optimal model, a large number of candidate models are developed and examined. Three ANN models, one for each type of household energy, were developed to forecast household energy consumption. The optimal ANN models for electricity, gas, and petrol consumption are shown in Tables 6-8, respectively. Following the training process, the ANN model for household electricity consumption contains one hidden layer with four neurons, as shown in Figure 7. The ANN model for household gas consumption includes two hidden layers, with three neurons in the first hidden layer and two neurons in the second, as shown in Figure 8. The ANN model for household petrol consumption includes one hidden layer with three neurons, as shown in Figure 9. The ANN models were then used to generate forecasts of household energy usage.
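To make the layer structure concrete, the sketch below runs a forward pass through an MLP with one hidden layer of four neurons, the shape reported for the electricity model. Everything else is an assumption for illustration: the five inputs stand in for scaled building costs of the five categories, the tanh hidden activation and linear output are not confirmed by the text, and the weights are placeholders rather than trained values.

```python
import math


def identity(value):
    """Linear activation for the output layer."""
    return value


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(sum(w * x) + b) per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, layers):
    """Pass inputs through each (weights, biases, activation) layer in turn."""
    signal = list(inputs)
    for weights, biases, activation in layers:
        signal = dense(signal, weights, biases, activation)
    return signal


# Placeholder network: 5 inputs -> 4 hidden neurons -> 1 output.
hidden_weights = [[0.1 * (j + 1)] * 5 for j in range(4)]  # not trained values
hidden_biases = [0.0, 0.1, -0.1, 0.05]
output_weights = [[0.5, -0.25, 0.75, 0.1]]
output_biases = [0.2]
electricity_net = [
    (hidden_weights, hidden_biases, math.tanh),
    (output_weights, output_biases, identity),
]

scaled_costs = [0.2, 0.4, 0.1, 0.3, 0.5]  # hypothetical scaled inputs
print(mlp_forward(scaled_costs, electricity_net))
```

A two-hidden-layer model such as the one reported for gas would add one more tuple to the layer list; training the weights is outside the scope of this sketch.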

Results Discussion
The correlation analysis was performed to test the correlation between household energy use and residential building costs. The results demonstrate that significant correlations exist between household energy use (electricity, gas, and petrol) and residential building costs. The significant positive correlations indicate that an increase in residential building costs tends to be accompanied by an increase in household energy usage.
The forecasting performance of the cost forecasting methods was evaluated using the MAPE. The MAPE values of the proposed models for all five cost series are presented in Table 9, where bold type identifies the lowest MAPE values. As the results show, no single forecasting method is best for all data series. Among the five cost series of residential buildings in New Zealand, the exponential smoothing models performed best for the one-storey house and the two-storey house, whereas the seasonal ARIMA models produced more accurate forecasts for the townhouse, apartment, and retirement village building. The exponential smoothing (ES) approach and the ARIMA technique are both effective time series forecasting methods because both describe trend movement in a time series fairly well, but each has strengths and weaknesses. For example, the ARIMA approach is more readily extended to model interventions, outliers, variations, and variance changes in the time series, but it is a relatively sophisticated technique. Given the different data patterns and the limited sample size, it is inappropriate to conclude that one time series forecasting method is better than the other in general. Therefore, both the exponential smoothing method and the ARIMA approach should be given a chance to demonstrate their potential in any empirical case study.
ANN models were developed for three types of household energy: electricity, gas, and petrol. The forecasts generated by the ANN models were compared with the actual values of household energy consumption. The paired-sample t-test, performed using SPSS, was used to test the mean difference between the forecasts and the actual values. The obtained t-values are all smaller in absolute value than the t-critical value (t-critical = 1.994, n = 72, p > 0.05), which indicates that no significant difference exists between the forecasts and the actual values. The t-values and p-values for the three ANN models are shown in Tables 6-8, respectively. The proposed ANN models for household energy consumption are therefore acceptable. The results indicate that the residential building cost is a good indicator for use in household energy management and that the ANN model is an effective method for forecasting household energy consumption from residential building costs. Additionally, the relationship identified between residential building cost and household energy consumption provides a new route to better energy management.
Table 9. Forecast values for building cost of one-storey house in New Zealand.
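The MAPE comparison described above can be sketched as follows. The function name and all numbers are hypothetical and do not reproduce the values in Table 9.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty samples of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical hold-out costs and forecasts from two competing models.
actual_cost = [1500.0, 1520.0, 1545.0, 1570.0]
candidate_forecasts = {
    "exponential smoothing": [1495.0, 1524.0, 1550.0, 1561.0],
    "seasonal ARIMA": [1510.0, 1512.0, 1560.0, 1590.0],
}
scores = {name: mape(actual_cost, values)
          for name, values in candidate_forecasts.items()}
best_model = min(scores, key=scores.get)  # lowest MAPE is preferred
print(scores, best_model)
```

Selecting the model with the lowest MAPE per series corresponds to the bold entries in the comparison table.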

Conclusions
Several methods have been used to predict future household energy use, but most of them are based on factors such as the thermal envelope or HVAC systems. In this study, the residential building cost was instead used as an indicator of future trends in household energy use. Quarterly household energy consumption data (electricity, gas, and petrol) and quarterly building cost data for five categories of residential building (one-storey house, two-storey house, townhouse, apartment, and retirement village) in New Zealand over the 18-year period 2001:Q1-2018:Q4 were analyzed.
The correlation analysis showed a significant positive correlation between household energy usage and residential building cost. Based on the identified characteristics of the cost series, two time series forecasting techniques, the exponential smoothing method and the ARIMA approach, were adopted to account for variations in residential building costs when predicting their future trends. Both methods produced acceptable forecasts. In addition, the ANN method was used to forecast future household energy use with residential building costs as influencing factors. The ANN models generated acceptable forecasts of three types of household energy usage (electricity, gas, and petrol).
This result not only reveals the strong link between residential building cost and household energy consumption but also establishes a relationship between the residential building sector and the energy sector. Decision makers can estimate future energy consumption in the residential sector from residential building costs, without evaluating every component of different types of buildings, which facilitates the decision-making process, improves energy management performance, and provides a new perspective for energy management. Although this study used QV's residential building cost index and the energy variables of Statistics New Zealand, the proposed methods can be applied to similar data sets in other cities as well as globally.