Comparison of Forecasting Energy Consumption in Shandong , China Using the ARIMA Model , GM Model , and ARIMA-GM Model

To scientifically predict the future energy demand of Shandong province, this study chose the past energy demand of Shandong province during 1995–2015 as the research object. Based on building model data sequences, the GM-ARIMA model, the GM (1,1) model, and the ARIMA model were used to predict the energy demand of Shandong province for the 2005–2015 data, the results of which were then compared to the actual result. By analyzing the relative average error, we found that the GM-ARIMA model had a higher accuracy for predicting the future energy demand data. The operation steps of the GM-ARIMA model were as follows: first, preprocessing the date and determining the dimensions of the GM (1,1) model. This was followed by the establishment of the metabolism GM (1,1) model and by calculation of the forecast data. Then, the ARIMA residual error was used to amend and test the model. Finally, the obtained prediction results and errors were analyzed. The prediction results show that the energy demand of Shandong province in 2016–2020 will grow at an average annual rate of 3.9%, and in 2020, the Shandong province energy demand will have increased to about 20% of that in 2015.


Introduction
The energy consumption of Shandong province was 36,759.2 million tons of standard coal equivalent in 2015, and the average growth rate has been increasing by up to 5% over the past years [1-3].Energy consumption inspired an economic boom, and an exact prediction of the future energy demand is required for accurate forecasting of the required energy supply.In addition, energy consumption typically increases more rapidly than the GDP, and high-speed economic growth is also largely dependent on massive energy consumption [4][5][6][7][8][9][10][11][12].A significant number of studies have focused on energy demand prediction, especially during recent years [13][14][15][16].In 1976 and based on the ARMA model, Box and Jenkins [17] proposed the autoregressive integrated moving average model (ARIMA model).To obtain a stationary sequence, they processed the non-stationary time series with differential treatment, disposed the stationary sequence with the autoregression and moving average processes, and determined the parameters of the ARMA model, using both a correlation coefficient and a partial autocorrelation coefficient.Grey theory is a truly multidisciplinary and generic theory, originally developed by Deng [18].Countries pay more attention to their energy supply, focusing on areas such as the development of shale gas [19][20][21] and renewable energy [12,22,23], to ensure a sufficient energy supply as the foundation of economic development.The grey theory has been applied in many applications of systems analysis [24][25][26].
This study utilized the energy demand of Shandong province for the years of 1995-2015 as the object of study.The innovation point is that the adopted GM-ARIMA model forms a combination of the grey model and the ARIMA model.Firstly, for the nonlinear and large amounts of data sequences, the metabolic GM (1,1) model was used to fit the trend component of the time series.Secondly, establishing the differential equation and keeping new data in process generated forecast dates.Lastly, the ARIMA model was used to analyze and test the residuals to achieve the purpose of the prediction.This model can effectively improve the accuracy of the prediction, and provide a scientific and reliable theoretical basis for the prediction of the energy demand in Shandong province and inform related policy decisions.This paper is organized as follows: The first section describes the research background.The second section introduces the construction of the single prediction models, mainly the ARIMA model and the GM (1,1) model.The third section introduces the construction of the GM-ARIMA model, verifies the rationality of this model construction, compares the average relative error of the single model and the GM-ARIMA model, measures the accuracy of the combined model prediction, and obtains the forecast results for the next five years.The fourth section summarizes the entire text, suggests prospects of future work, and develops relevant suggestions.

Single Model Based Prediction of Future Energy Demand Based on Past Energy Demand
Energy is an important factor to maintain the development of society, and it is the most important factor to restrict the development of productive forces, while ensuring steady and healthy growth of the economy [12,[27][28][29].At present, the energy demand of Shandong province is increasing annually, the demand gap is large, and the main source of supply depends on a single supply.According to the "statistical yearbook of Shandong" survey data, coal consumption accounted for 79.48% of the total energy consumption in Shandong province in 2015 [1].This shows a prominent contradiction between energy supply and demand, and the coal supply gap is relatively large.To ensure a healthy structure for energy consumption as well as to ensure green development when using large amounts of energy to promote economic development, a rational analysis of the factors that affect energy demand is critical.This study selected the energy demand of the years of 1995-2015 of Shandong Province, as shown in Table 1, as reference data, generated forecasts, and analyzed the energy demand of Shandong Province for the next five years.The ARIMA model describes the approximate process of data sequences over time, uses the own lag term and random error term of variable Y t to explain variables, and consequently predicts the future with a certain mathematical model [30][31][32].The acronym stems from the univariate Box-Jenkins autoregressive integrated moving average (ARIMA) [7] analysis, which has been widely used for modeling and forecasting in many fields, such as medical, environmental, financial, and engineering applications [17,[33][34][35][36]. Furthermore, the forecasting of energy consumption is significant to control carbon emissions [37,38].In recent years, many countries pay increasing attention on developing greenhouse gas (GHG) emission inventories to reduce global warming [12,[39][40][41].China has realized this importance and initiated many actions to reduce carbon emissions and climate change [12,[42][43][44].Its specific form can be expressed as ARIMA (p, d, q), where 'p' represents the order of autoregressive processes, 'd' represents the order of difference, and 'q' represents the order of the moving average process [45,46].The autoregressive process is the return of current and previous data, which can obtain the AR function represented via the correlation coefficient; the moving average (MA) process is the return of both current error and previous errors, thus the MA function can be drawn via association between current data and errors.The letter 'd' represents the order of difference: if the sequence is stationary, then d is 0; if the sequence is non-stationary, it needs to be different to the order of d to smooth it, and then modeling starts.
The advantage of the ARIMA model is that it can use a combination of auto regression, difference, and moving average of different order to obtain the ARIMA (p, d, q) model, expressing various types of information of time series when fitting time series.Accurate predictions can be obtained by selecting appropriate parameters.This study used the ARIMA model to predict the model parameters and used SPSS software to determine model parameters and fit the sequence.Then, we obtained prediction data and average relative error.

The Construction of the ARIMA Model
An important assumption of the classical linear regression model is that random error items in the overall regression function satisfy the same variance, i.e., they all have the same variance.If this assumption is not satisfied, i.e., the random error term has different variances and the linear regression model is heteroscedastic.If the linear regression model displays such heteroscedasticity, the estimated parameters are not valid estimators after using the traditional least squares method to estimate the model.At this point, the significance test was invalid.To eliminate the possibility of heteroscedasticity in the original time series Y, we used the log of Y, i.e., Q = log(Y), and then tested the stationarity of the sequence via the Augmented Dickey Fuller (ADF) unit root test [47][48][49] (Test results are shown in Table 2. ADF test results show that the original sequence was stable after zero order difference, i.e., d = 0.After the difference order d had been determined, the data sequence was a stationary sequence; thus, the autoregressive and moving average (ARMA) processes were started.Before modeling the ARMA model, it was necessary to first determine the order p and q of each process.This order was obtained by calculating both autocorrelation coefficients and partial autocorrelation coefficients.After observing the autocorrelation function and the partial autocorrelation function, obtaining the order of magnitude, which leads to a function of zero, and researching truncation and tailing, the model type and its corresponding order could be determined.If the autocorrelation (or partial autocorrelation) coefficient were to be suddenly converged to the critical level range, and the value of them were suddenly to become very small, we would call this truncation.If the autocorrelation (or partial autocorrelation) coefficient were to drag a long tail, and the value of them were to be slowly reduced, we would call this trailing.These are the definitions of truncation and trailing.If the sample autocorrelation coefficient (or partial autocorrelation coefficient) were obviously larger than the standard deviation range of 2 times in the initial d order, almost 95% of the sample autocorrelation (partial autocorrelation) coefficients would fall within 2 times the standard deviation range, and the process of decaying from non-zero autocorrelation (or partial autocorrelation) to small fluctuations would be very sudden, and consequently, we would consider autocorrelation (or partial autocorrelation) coefficient truncation.If more than 5% of the samples were to fall outside the standard deviation range of 2 times, or a process of fluctuation, attenuated by a significant non-zero correlation function, were to be slow or very continuous, we would consider autocorrelation (or partial autocorrelation) coefficient trailing.The basic principles of model recognition are shown in Table 3.After fitting of a series using SPSS software, the autocorrelation coefficients and partial autocorrelation coefficients can be obtained as shown in Figure 1.
order of magnitude, which leads to a function of zero, and researching truncation and tailing, the model type and its corresponding order could be determined.If the autocorrelation (or partial autocorrelation) coefficient were to be suddenly converged to the critical level range, and the value of them were suddenly to become very small, we would call this truncation.If the autocorrelation (or partial autocorrelation) coefficient were to drag a long tail, and the value of them were to be slowly reduced, we would call this trailing.These are the definitions of truncation and trailing.If the sample autocorrelation coefficient (or partial autocorrelation coefficient) were obviously larger than the standard deviation range of 2 times in the initial d order, almost 95% of the sample autocorrelation (partial autocorrelation) coefficients would fall within 2 times the standard deviation range, and the process of decaying from non-zero autocorrelation (or partial autocorrelation) to small fluctuations would be very sudden, and consequently, we would consider autocorrelation (or partial autocorrelation) coefficient truncation.If more than 5% of the samples were to fall outside the standard deviation range of 2 times, or a process of fluctuation, attenuated by a significant non-zero correlation function, were to be slow or very continuous, we would consider autocorrelation (or partial autocorrelation) coefficient trailing.The basic principles of model recognition are shown in Table 3.

Autocorrelation Function Partial Autocorrelation Function Type of Model
Trailing P order truncation AR (p) Q order truncation Trailing MA (q) Trailing Trailing ARMA (p, q) After fitting of a series using SPSS software, the autocorrelation coefficients and partial autocorrelation coefficients can be obtained as shown in Figure 1.After the previous explanation, it can be seen that the p values are very small, thus the sequence can be determined as a sequence of non-white noise [50].However, the autocorrelation function is Sustainability 2017, 9, 1181 5 of 19 not zero after a certain step, but follows the form of a sine wave; thus, the autocorrelation function is trailing.Furthermore, the partial autocorrelation function is zero after the 6th order and has a truncation property.In summary, the sequence should be tested with the AR (1) model.
After the model type had been determined, the SPSS software could be used to fit the model.The fitting results are shown in Tables 4 and 5.As is shown by Tables 4 and 5, the fitting degree was very good, the coefficient of AR was −0.382, and the constant was 47,096.394.Then, the autocorrelation and partial autocorrelation maps of the residuals as shown in Figure 2 revealed stationary fitted sequences, therefore, it was reasonable to use AR (1) model fitting, and the relative average error was 6.309%.After the previous explanation, it can be seen that the p values are very small, thus the sequence can be determined as a sequence of non-white noise [50].However, the autocorrelation function is not zero after a certain step, but follows the form of a sine wave; thus, the autocorrelation function is trailing.Furthermore, the partial autocorrelation function is zero after the 6th order and has a truncation property.In summary, the sequence should be tested with the AR (1) model.
After the model type had been determined, the SPSS software could be used to fit the model.The fitting results are shown in Tables 4 and 5.As is shown by Tables 4 and 5, the fitting degree was very good, the coefficient of AR was −0.382, and the constant was 47,096.394.Then, the autocorrelation and partial autocorrelation maps of the residuals as shown in Figure 2 revealed stationary fitted sequences, therefore, it was reasonable to use AR (1) model fitting, and the relative average error was 6.309%.Thus, the result of the ARIMA model was: ∆X t = 47096.394− 0.382∆X t−1 + ε t .Finally, via fitting prediction as shown in Figure 3, it could be verified that the fitting effect was good.

Grey GM (1,1) Model
The application of the grey model has been extended to many fields, such as economy, energy, technology, management, and others.If a system has fuzzy hierarchy, structure relation, dynamic randomness, is incomplete, or contains uncertainty of index data, we call the system grey.Energy consumption dates have these characteristics.The advantage of using the grey model is that it does not require a large number of samples, and the effect of short-term prediction is good.This is the central reason why the grey model can be used for energy consumption forecasting.
From previous forecasting results, we know that grey models are widely used in forecasting carbon emissions, which provides reference for future economic development.In this study, we used the grey GM model to explain the relationship between energy consumption and economic development.Since the energy consumption is closely linked with many economic aspects, it would not only help to adjust the future energy supply and demand, but also spur the search for a combination point between economic development and new energy technology.

The Principle of the GM (1,1) Model
The GM (1,1) model is characterized by a small amount of data (typically 5-10) to predict future trends [18,[24][25][26]; therefore, the grey model features several advantages over the prediction of a small capacity sample.The basic idea of the grey theory is: to combine the known time series according to specific rules to form either a dynamic or non-dynamic combination, and then to solve the law of future development according to specific criterion and solution [26,51,52], which adopt the essential part of the Grey system theory [53].The core of grey theory is the establishment of differential equations.The reason why differential equations can be established for complex data and systems is that a certain rule must be present in the data sequence.The solution of the differential equation is solved with the least square method, and the parameters are taken into account to solve the predicted data sequence.In this paper, the five-dimensional GM (1,1) model prediction method was used, and the model was constructed with Excel to solve the parameters.Due to the significant change of oil in the last years [54][55][56][57][58], forecasting the energy consumption relatively exactly is of great importance.After the prediction data were obtained, we compared the prediction dates with the data of the years of 1996-2015, and obtained the average relative error of the model.

Grey GM (1,1) Model
The application of the grey model has been extended to many fields, such as economy, energy, technology, management, and others.If a system has fuzzy hierarchy, structure relation, dynamic randomness, is incomplete, or contains uncertainty of index data, we call the system grey.Energy consumption dates have these characteristics.The advantage of using the grey model is that it does not require a large number of samples, and the effect of short-term prediction is good.This is the central reason why the grey model can be used for energy consumption forecasting.
From previous forecasting results, we know that grey models are widely used in forecasting carbon emissions, which provides reference for future economic development.In this study, we used the grey GM model to explain the relationship between energy consumption and economic development.Since the energy consumption is closely linked with many economic aspects, it would not only help to adjust the future energy supply and demand, but also spur the search for a combination point between economic development and new energy technology.

The Principle of the GM (1,1) Model
The GM (1,1) model is characterized by a small amount of data (typically 5-10) to predict future trends [18,[24][25][26]; therefore, the grey model features several advantages over the prediction of a small capacity sample.The basic idea of the grey theory is: to combine the known time series according to specific rules to form either a dynamic or non-dynamic combination, and then to solve the law of future development according to specific criterion and solution [26,51,52], which adopt the essential part of the Grey system theory [53].The core of grey theory is the establishment of differential equations.The reason why differential equations can be established for complex data and systems is that a certain rule must be present in the data sequence.The solution of the differential equation is solved with the least square method, and the parameters are taken into account to solve the predicted data sequence.In this paper, the five-dimensional GM (1,1) model prediction method was used, and the model was constructed with Excel to solve the parameters.Due to the significant change of oil in the last years [54][55][56][57][58], forecasting the energy consumption relatively exactly is of great importance.After the prediction data were obtained, we compared the prediction dates with the data of the years of 1996-2015, and obtained the average relative error of the model.

Building the GM (1,1) Model
The data of energy demand in Shandong Province for the past 1995-1999 years were selected to establish the original time series X 0 = (x 0 1 , x 0 2 , • • • , x 0 m ).Among these, x 0 m was the energy demand for each year.To reduce the randomness of time series, we calculated the primary accumulation sequence The GM (1,1) model was established via X 1 , and the corresponding differential equation was [24,59,60]: Besides, 'a' is the coefficient of development of GM (1,1); 'b' is a grey function.
Using least squares to estimate parameters: According to related data of the energy demand in Shandong during 1995-1999, we obtained: The A and B introduced into the differential model corresponding to the GM (1,1) model, and the response function could be solved: After restoring the response function, the GM (1,1) prediction model is: The residual of the K moment is: k , which is the original demand for the K moment; x1 k is the predicted demand for the K moment.The mean of the original demand is: The residual mean is: The variance of the original demand is: The residual variance is: The posterior difference ratio is: The posterior difference ratio C is the index of the accuracy of the reaction model.The smaller C, the higher the accuracy of the model will be.After testing, C = 14.6433  39.1499 = 0.37 < 0.65.The inspection standard revealed that the GM model was feasible if the value of C is below 0.65.Apparently, the C value conformed to the inspection standard.Therefore, the GM (1,1) model could be used to test the energy demand of Shandong province.The prediction results and residual values of the GM (1,1) model are shown in Table 6.The formula for the arithmetic mean of the average relative error is: After calculation, the average relative error predicted by the GM (1,1) model (5th dimension) was 44.38%.

GM-ARIMA Model Based Prediction of Energy Demand Based on Energy Demand
The average relative errors of the ARIMA model and the GM (1,1) model were 6.309% and 44.38%, respectively.The average relative error was not ideal, and each model had its own shortcomings: the ARIMA model required the data sequence to not be following a trend, while the GM (1,1) model required less data.Combined with the original data characteristics shown in Table 1, we find that the original data of this paper followed a trend, and the amount of data was large.Therefore, using the ARIMA model or the GM (1,1) model to predict data would result in vulnerabilities.The combination model had the characteristics of pursuing advantages, while avoiding disadvantages.Via combination of both models, the advantages of a single model could be achieved, while drawbacks could be avoided.This is also the reason that forces us to build combinatorial models and achieve lower relative errors.
The GM-ARIMA model constructed in this paper is a combination model, combining the metabolic GM (1,1) model and the ARIMA model [61][62][63].The metabolic GM (1,1) model is a dynamic process that adds new data, while removing old data.By adding new data into the model, the factors that influence the system over time can be overcome.The metabolic GM model encourages the system to update and develop; therefore, it is more practical for larger sets of data.The ARIMA model treats the data sequence of the predicted object over time as a random sequence, and uses a mathematical model to approximately describe its sequence.In the GM-ARIMA model, we used the ARIMA model to predict the residual sequence, to correct the prediction results, and improve accuracy.
The core of this model is the priority of the utilized model.We used the metabolic GM (1,1) model to first predict the data.After comparing the predicted data with the original data, we obtained the residual error.The ARIMA model was used to later correct the residual sequence.The specific building process was as follows: First, the raw sequence was pre-processed so that it had the appropriate condition to apply the metabolic GM (1,1) model.Then, the dimension of the metabolic GM (1,1) model was determined with the criterion of the minimum residual sum of squares, and then the data was predicted.Next, the ARIMA model was used to predict the results of this residual correction.Finally, combined with the correction results, the prediction data were modified, thus obtaining the final prediction data.We used the average relative error value to judge the accuracy of the composite model.Compared to the average relative error value of both ARIMA and GM (1,1) models, and if the average relative error of composite model was small, we could take it for granted that the application of the GM-ARIMA model increased the prediction accuracy of energy demand compared to that of a single model, and it would arrive at a higher predictive value.

Data Analysis
The premise of modeling that uses the ARMA model is that the time series is a non-white noise sequence.The sequence diagram shows that the time series has an apparent tendency; therefore, it is a white noise sequence, and does not have the characteristics that enable using the ARIMA model.Furthermore, the GM (1,1) modeling condition was 5-10 data points, and the amount of data points used in this problem was 20, which does not conform to the condition of pure grey system modeling.Therefore, metabolic GM (1,1) should be used to predict the data.It was necessary to verify the nonlinearity of the original sequence when we used the grey model of metabolism.The judgment of the nonlinear sequence [64][65][66] could be made via histogram, skewness, variance, conditional mean estimation, and two-order linear indexing.In this study, we adopted the histogram judgment method.The criterion was: if the histogram were normal, the sequence would be linear.If the histogram were not normal, the sequence would be non-linear.The following provides the trend analysis and nonlinear judgment of the original sequence.

The Tendency of Raw Data Sequences
The energy consumption of Shandong province from 1995 to 2015 is shown in Table 1 The sequence diagram is shown in Figure 4.The diagram shows that the data sequence follows an apparent tendency.Therefore, a white noise sequence can be obtained with the traditional difference method, and the model cannot be modeled via ARMA.Therefore, the metabolic grey GM (1,1) model was considered.The premise of establishing a metabolic grey model is that the data is required to contain nonlinear sequences, making it necessary to test for nonlinearity of the data first.

Verifying the Nonlinearity of the Raw Data Sequence
According to known theory, for a nonlinear sequence, whether the histogram of data is normal curve can be used as the basis to judge whether the data is linear.The raw data histogram, drawn from the SPSS software, is shown in Figure 5. Figure 5 shows that the distribution of the sequence is obviously different from the normal distribution, and the data sequence could be determined as a nonlinear sequence.Therefore, the grey GM (1,1) model could be considered for the data sequence.

Establishing the Metabolic GM (1,1) Model
The GM (1,1) model is a modeling method that is based on small capacity samples, and the optimal amount of data is 5-10.Directly building the grey model with all 21 data of 1995-2015 years sample would violate this modeling criterion of the grey model; therefore, the establishment of a

Verifying the Nonlinearity of the Raw Data Sequence
According to known theory, for a nonlinear sequence, whether the histogram of data is normal curve can be used as the basis to judge whether the data is linear.The raw data histogram, drawn from the SPSS software, is shown in Figure 5.

Verifying the Nonlinearity of the Raw Data Sequence
According to known theory, for a nonlinear sequence, whether the histogram of data is normal curve can be used as the basis to judge whether the data is linear.The raw data histogram, drawn from the SPSS software, is shown in Figure 5. Figure 5 shows that the distribution of the sequence is obviously different from the normal distribution, and the data sequence could be determined as a nonlinear sequence.Therefore, the grey GM (1,1) model could be considered for the data sequence.

Establishing the Metabolic GM (1,1) Model
The GM (1,1) model is a modeling method that is based on small capacity samples, and the optimal amount of data is 5-10.Directly building the grey model with all 21 data of 1995-2015 years sample would violate this modeling criterion of the grey model; therefore, the establishment of a Figure 5 shows that the distribution of the sequence is obviously different from the normal distribution, and the data sequence could be determined as a nonlinear sequence.Therefore, the grey GM (1,1) model could be considered for the data sequence.
However, the problem of grey model computation is how to choose the dimension of the data in the metabolic GM (1,1) model.To improve accuracy, the data of 5-10 dimensions were fitted respectively, and the square sum of residuals was calculated according to the result of each fitting and the original data.The sum of squares of residuals of each dimension was calculated as shown in Table 7.As shown in Table 7, the square sum of residuals in the 5th dimension is minimum; consequently, the data length was 5.
The specific steps for establishing a metabolic model are as follows: First, we selected the first 5 data points and accumulated the data: Then, we fitted the data of the cumulative series with the aid of Excel, and obtained the predicted values: x(1) (t) , t = , the accumulation date should be returned to the original data sequence according to the following formula: x(0 (1) = x(1) (1); Finally, we removed the first date and added the most recent data point using the metabolism method, repeating this to obtain all the predicted data.
The fitting results are shown in Table 8.This study compared the predicted results with the data in the Shandong statistical yearbook, and calculated the average relative error.Consequently, the average relative error of the fitting model using metabolic GM (1,1) was 6.57%.

Residual Correction via the ARIMA Model
After using the metabolic GM (1,1) model to predict and compare to the original data, a series of residuals of the data was obtained.The procedure for making residual corrections using the ARIMA model is described in the following: First, the residual sequence diagram is drawn; then, a stationarity test is conducted to determine the difference order.Subsequently, the parameters p, q, and types of the model are determined via smearing and truncating the autocorrelation function and the partial autocorrelation function graph.Next, the model was established and the residual sequence was fitted to obtain the residual prediction results.Finally, the residuals of the fitted residuals were detected, and the correctness of the prediction model was verified.

Residual Sequence Diagram
The residual sequence could be obtained by comparing the predicted results with the actual results presented in the second chapter, and the residual sequence diagrams could be drawn using SPSS software as shown in Figure 6.
This study compared the predicted results with the data in the Shandong statistical yearbook, and calculated the average relative error.Consequently, the average relative error of the fitting model using metabolic GM (1,1) was 6.57%.

Residual Correction via the ARIMA Model
After using the metabolic GM (1,1) model to predict and compare to the original data, a series of residuals of the data was obtained.The procedure for making residual corrections using the ARIMA model is described in the following: First, the residual sequence diagram is drawn; then, a stationarity test is conducted to determine the difference order.Subsequently, the parameters p, q, and types of the model are determined via smearing and truncating the autocorrelation function and the partial autocorrelation function graph.Next, the model was established and the residual sequence was fitted to obtain the residual prediction results.Finally, the residuals of the fitted residuals were detected, and the correctness of the prediction model was verified.

Residual Sequence Diagram
The residual sequence could be obtained by comparing the predicted results with the actual results presented in the second chapter, and the residual sequence diagrams could be drawn using SPSS software as shown in Figure 6.

Stationary Test of the Residual Sequence
The premise for establishing the ARIMA model for residual data is that the residual sequence needs to be stationary; therefore, it is necessary to test for the stability of the residual sequence.If the model is stationary, the model can be directly built.If it is not smooth, several orders of difference need to be made to stabilize it.The number of differences is indicated by the parameter d in the ARIMA model.Using SPSS software, the results of unit root tests for residual sequences are shown in Table 9.

Stationary Test of the Residual Sequence
The premise for establishing the ARIMA model for residual data is that the residual sequence needs to be stationary; therefore, it is necessary to test for the stability of the residual sequence.If the model is stationary, the model can be directly built.If it is not smooth, several orders of difference need to be made to stabilize it.The number of differences is indicated by the parameter d in the ARIMA model.Using SPSS software, the results of unit root tests for residual sequences are shown in Table 9.Thus, the differential first-order sequence is stationary, and d = 1.

Autocorrelation Coefficients and Partial Autocorrelation Coefficients
After the parameter d was determined, it became necessary to determine the parameters p and q.The method for judging p and q has been introduced before, and the partial autocorrelation coefficient map was drawn via SPSS software, which is determined by the graph as follows.
The diagrams as shown in Figures 7 and 8 show that the autocorrelation coefficients and the partial autocorrelation coefficients of the residuals were all trailing; therefore, ARIMA (p, 1, q) models are chosen.The sequence was fitted via SPSS software, and the values of p and q were repeatedly determined.The confidence levels of the fitted sequences were compared between different p and q values, and following that, the final model type was determined as ARIMA (2,1,2).The parameters of the fitting result are shown in Table 10.
Therefore, the original residual sequence model is established as:  Thus, the differential first-order sequence is stationary, and d = 1.

Autocorrelation Coefficients and Partial Autocorrelation Coefficients
After the parameter d was determined, it became necessary to determine the parameters p and q.The method for judging p and q has been introduced before, and the partial autocorrelation coefficient map was drawn via SPSS software, which is determined by the graph as follows.
The diagrams as shown in Figures 7 and 8 show that the autocorrelation coefficients and the partial autocorrelation coefficients of the residuals were all trailing; therefore, ARIMA (p, 1, q) models are chosen.The sequence was fitted via SPSS software, and the values of p and q were repeatedly determined.The confidence levels of the fitted sequences were compared between different p and q values, and following that, the final model type was determined as ARIMA (2,1,2).The parameters of the fitting result are shown in Table 10.
Therefore, the original residual sequence model is established as:

Test of the Built Model
After the residuals have been fitted with the ARIMA (2,1,2) model, the residual sequence of residuals was obtained.The autocorrelation coefficients and the partial autocorrelation coefficients are shown as follows.
As shown in Figure 9, the fitted sequence is a white noise sequence; therefore, it could be used for prediction.

Test of the Built Model
After the residuals have been fitted with the ARIMA (2,1,2) model, the residual sequence of residuals was obtained.The autocorrelation coefficients and the partial autocorrelation coefficients are shown as follows.
As shown in Figure 9, the fitted sequence is a white noise sequence; therefore, it could be used for prediction.

Test of the Built Model
After the residuals have been fitted with the ARIMA (2,1,2) model, the residual sequence of residuals was obtained.The autocorrelation coefficients and the partial autocorrelation coefficients are shown as follows.
As shown in Figure 9, the fitted sequence is a white noise sequence; therefore, it could be used for prediction.Table 11 shows that the ARIMA model is capable of residual correction.The average relative error was 5.11%; therefore, the error was smaller and the prediction result was better.In combination with the above three methods, each accuracy could be calculated as shown in Table 12.The results show that the average relative error of the GM-ARIMA composite model constructed in this paper was smaller than the resulting errors for each single model.Thus, we can conclude that the GM-ARIMA model has higher accuracy.This analysis showed that the GM-ARIMA model can be used to forecast the energy demand of Shandong Province for the years of 2016-2020 and accurate forecast data can be obtained.The results are shown in Table 13.

Conclusions and Future Work
The GM-ARIMA model is very effective for systems with large sample size that follow trends, and it can be appropriately applied for the future energy prediction of Shandong province.This combination model was derived from the GM model and the ARIMA model.Through the improvement of the calculation process, we used the metabolic GM (1,1) model to predict the data.Then, the ARIMA model was used to obtain the results for residual correction.Finally, the residual correction process enabled further improvement in the accuracy of the predicted results on the original prediction.This was the major innovation of this study.Due to the complex data sequence of the energy demand, once the model was set up, we could use the currently available data to forecast the future data.
By analyzing and comparing the average relative error of different models, we found that the accuracies of GM (1,1), ARIMA (1,2,0), metabolic GM (1,1), and GM-ARIMA model were 44.38%, 6.309%, 6.57%, and 5.11%, respectively.It became obvious that the combination GM-ARIMA model has higher precision than each of the single models.The GM-ARIMA model combined advantages of both models, while avoiding their limitations.The advantage of the combination GM-ARIMA model is that it not only overcomes the shortcomings of the grey GM (1,1) model with less sample capacity, but also that it uses the ARIMA model to correct the prediction results and to improve accuracy.During the prediction process, the differential equation was used to determine the parameters, which provides a certain degree of accuracy and is more practical for non-stationary sequences.In summary, the results of energy consumption were reasonable.
The predicted results show that the energy consumption in Shandong province will still follow a slow upward trend in the future.A situation where the supply is not adequate to meet the demand may appear.To adapt to the increased energy demand, the government should take rational policy measures, increase investment in new energy technologies, promote energy structure adjustments, while at the same time, taking the development of green economic development into account.
The composite GM-ARIMA model provides a way to predict future energy consumption.This provides a reference value for future energy development planning, policy formulation, and technical direction.In the future, GM-ARIMA model predictions will be widely used in many areas: carbon emissions from fossil fuel combustion, energy shortages, economic development indicators, and a number of green energies.These forecasts are aimed at laying a solid foundation for the adjustment of energy structure and for the rational adjustment of the levels of both energy supply and demand.Through these efforts, we expect that the future energy development of Shandong will be more diversified, balanced, and orderly.

Figure 1 .
Figure 1.Autocorrelation and partial autocorrelation function diagram of energy demand in Shandong Province.(a) indicated that the autocorrelation function is tailed; (b) indicated that the partial autocorrelation function is truncated.

Figure 1 .
Figure 1.Autocorrelation and partial autocorrelation function diagram of energy demand in Shandong Province.(a) indicated that the autocorrelation function is tailed; (b) indicated that the partial autocorrelation function is truncated.

Figure 2 .
Figure 2. The autocorrelation and partial autocorrelation maps of the residuals.Thus, the result of the ARIMA model was: ∆ = 47096.394− 0.382∆ + .Finally, via fitting prediction as shown in Figure3, it could be verified that the fitting effect was good.

Figure 2 .
Figure 2. The autocorrelation and partial autocorrelation maps of the residuals.

Figure 3 .
Figure 3.The fitting prediction based on ARIMA model.Note: "UCL" means upper control line; "LCL" means lower control line.

Figure 3 .
Figure 3.The fitting prediction based on ARIMA model.Note: "UCL" means upper control line; "LCL" means lower control line.

Figure 7 .
Figure 7. Autocorrelation function of residual sequence.Figure 7. Autocorrelation function of residual sequence.

Figure 7 .
Figure 7. Autocorrelation function of residual sequence.Figure 7. Autocorrelation function of residual sequence.

Figure 9 .
Figure 9. Autocorrelation coefficients and partial autocorrelation coefficients of residual residuals.

Figure 9 .
Figure 9. Autocorrelation coefficients and partial autocorrelation coefficients of residual residuals.

Figure 9 .
Figure 9. Autocorrelation coefficients and partial autocorrelation coefficients of residual residuals.

Table 1 .
The energy consumption of Shandong province from 1995 to 2015.

Table 2 .
The stationarity test of energy demand time series in Shandong Province.
Note: Augmented Dickey Fuller (ADF).Q* is the first order difference of Q. Q** is the two-order difference of Q.

Table 3 .
The basic principles of model recognition.

Table 3 .
The basic principles of model recognition.

Table 6 .
The prediction results and residual values of GM (1,1) model.

Table 7 .
The sum of squares of residuals of each dimension.

Table 8 .
Trend item fitting results.

Table 9 .
Unit root tests for residual sequences. 2

Table 9 .
Unit root tests for residual sequences.
Note: Q* is the first order difference of Q.

Table 12 .
Comparison results of different methods.

Table 13 .
Forecast data of energy demand in Shandong in 2016-2020.