Forecasting of Chinese Primary Energy Consumption in 2021 with GRU Artificial Neural Network

The forecasting of energy consumption in China is a key requirement for national energy security and energy planning. In this study, multivariable linear regression (MLR), support vector regression (SVR), and a gated recurrent unit (GRU) artificial neural network were used to establish forecasting models of Chinese energy consumption. The models were built on four economic variables: gross domestic product (GDP), population, imports, and exports. The performance of the forecasting models was assessed via the mean absolute percentage error (MAPE) and the root mean square error (RMSE), and three scenarios were configured based on different sources of variable data. In predicting Chinese energy consumption from 2015 to 2021, the GRU model, which had the highest predictive accuracy, indicated that Chinese energy consumption in 2021 would likely lie between 2954.04 Mtoe and 5618.67 Mtoe.


Introduction
Energy is a vital resource for socio-economic development, and it is of growing concern to governments and economic sectors because of its extensive application and the strong dependence of production and consumption on it [1]. In recent years, with the rapid development of the Chinese economy and the growth of its population, Chinese energy demand and consumption have risen rapidly [2]. The main Chinese energy sources include hard coal, lignite, hydropower, oil, natural gas, geothermal, solar, wind, and nuclear energy, but the efficiency of energy production and utilization is low. To meet domestic energy demand, the volume of energy imports is increasing year by year. China should therefore develop its own energy production plan to meet rising domestic demand. To ensure energy security, it is important to predict annual energy consumption over a 5- to 10-year period so that an appropriate energy plan can be established. Energy consumption is affected by many socio-economic factors, among which gross domestic product (GDP), population, and import and export trade are particularly important [3].
Energy consumption models are usually based on historical consumption data and on historical data related to energy consumption, such as the economy, population, and climate [4]. At present, there are three mainstream approaches to energy consumption forecasting: planning models, economic models, and machine learning models. The planning model uses linear and nonlinear programming to find the parameters that best fit the historical data. O'Neill was the first to apply the planning model to predict energy consumption in the US [5]. This method has also been applied to coal, oil, natural gas, power demand, and other fields [6]. The economic model combines energy demand with other microeconomic variables and predicts future energy demand through the inherent interaction between economic variables. The choice of economic variables is the key to predictive accuracy [7]. The machine learning model overcomes the accuracy constraints of the earlier mathematical approaches. It identifies the relationships between data features by means of artificial intelligence and predicts future energy consumption by training on a large amount of historical data. There are many machine learning models. Those applied to energy consumption include the Autoregressive Integrated Moving Average (ARIMA) model [8], the Artificial Neural Network (ANN) model [9], the Ant Colony Optimization (ACO) model [10], and the Particle Swarm Optimization (PSO) model [11].
In view of the dynamic nature of energy consumption, the Gated Recurrent Unit (GRU) can effectively reduce the error caused by the spatiotemporal evolution of energy consumption. It has gating units that modulate the flow of information inside the unit. Compared with earlier machine learning methods, GRU is a deep learning method, as it can use the memory units in a network to process input sequences of arbitrary length. Its ability to learn time series is therefore greatly superior [12]. The GRU can not only learn time series with long spans but also automatically determine the optimal time lag for prediction. In recent years, GRU has been successfully applied to handwriting recognition, human motion identification, and robot control [13], but it has rarely been applied to economic forecasting. In this study, we selected three energy consumption forecasting models: multivariable linear regression (MLR), support vector regression (SVR), and the Gated Recurrent Unit (GRU). By comparing these three models, we verified the superiority of the GRU model in simulating Chinese energy consumption from 2008 to 2015. We then designed several scenarios to forecast Chinese primary energy consumption from 2015 to 2021. The results will help the government to develop a reasonable energy plan.

Energy Consumption Forecast
In recent years, scholars from all over the world have studied the prediction of energy supply and consumption at the national and regional levels [14]. Sözen (2006) employed an artificial neural network to derive a formula for predicting net energy consumption; the results showed that the error of the net energy consumption obtained with the artificial neural network method was very small [15]. Deka (2016) compared five forecasting techniques that use economic and demographic factors to simulate US energy needs, with in-depth discussion [16]. Torrini (2016) proposed a fuzzy logic approach to extract rules from input variables and to provide a long-term annual electricity demand forecast for Brazil [17]. Philip (2012) used ARDL and PAM to measure the short-term and long-term factors influencing energy consumption in Ghana and forecasted Ghana's energy consumption in 2020 [18]. Gokhan (2015) predicted Turkey's primary energy consumption (PEC) with a model derived by regression analysis from population, gross domestic product (GDP), and energy consumption [19]. Some scholars combine energy consumption with carbon dioxide emissions to establish a joint forecasting model. Hasiao (2012) applied the improved nonlinear grey Bernoulli model to analyze the characteristics of carbon dioxide emissions, energy consumption, and actual output in China and to establish a predictive model based on numerical iteration [20]. Pani (2010) applied correlation analysis to study the relationship between energy consumption, GDP, and carbon emissions [21]. Wenying (2015) conducted a bottom-up analysis of energy consumption and carbon dioxide emissions in the Chinese steel industry [22]. Jain's (2014) findings suggested that a sensor-based energy prediction model was suitable for multi-family residential buildings [23]. Wang (2011) analyzed the impacts of implementing new and expected energy and environmental policies with the Long-range Energy Alternatives Planning (LEAP) modeling tool [24]. Blanca Moreno (2016) used a combined grey neural network and input-output model to predict primary energy consumption in the Spanish economic sectors [25]. Xie (2015) applied an optimized single-variable discrete grey prediction model to predict China's total energy production and consumption, and proposed a new Markov method based on a quadratic programming model to predict the trend of China's energy production and consumption structure [26].

Multiple Linear Regression
Multiple Linear Regression (MLR) is an important method in multivariate statistical analysis. It makes it possible to estimate the future regression coefficients and model accuracy without sampling the future system. At present, MLR is widely used in many disciplines. Prakasvudhisarn (2015) predicted the electricity consumption of Thailand using multiple linear regression and ANN models [27]. Abuella (2015) presented a multiple linear regression model for probabilistic solar power forecasting [28]. Cleland (2010) applied multiple linear regression to analyze the total energy consumption of the New Zealand food manufacturing industry [29]. Amral (2008) investigated short-term load forecasting for the South Suleai power system with multiple linear regression and concluded that the model was relatively easy to develop and to update regularly, and was widely used in commercial computing software [30]. Tuaimah (2014) used multiple linear regression to produce a short-term load forecast for Iraq's power system [31]. Torkzadeh (2014) combined multiple linear regression with principal component analysis (MLR-PCA) to predict the weekly electrical peak load of Yazd city and concluded that the error of the proposed method was quite small [32]. Rahman (2014) presented a method for characterizing river water quality with multiple linear regression [33]. Mata (2011) compared MLR and ANN models for characterizing dam behavior under environmental loads [34]. Abushikhah (2011) proposed multivariable linear and non-linear regression models that use hourly daily load data to predict the next year's hourly load, and the results obtained with the proposed method suggested that its performance was close [35].

Support Vector Machine
The Support Vector Machine (SVM) is a supervised learning algorithm for data mining with high prediction accuracy [36]. Support vector machines can be used to solve nonlinear regression problems and to predict time series. At present, they are widely used in regression, classification, nonlinear fitting, and other fields. Their use is grounded in their strength at solving nonlinear problems, and they have also been applied to forecast energy consumption. Li (2009) applied SVM to predict the air conditioning energy consumption of office buildings; the results showed that the SVM model was more accurate than the BP neural network [37]. Hou (2009) predicted the energy consumption of a Heating, Ventilating, and Air-Conditioning (HVAC) system, and the results showed that the SVM model was more accurate than the Autoregressive Integrated Moving Average (ARIMA) model [38]. Jain (2014) used the SVR model to predict energy consumption in multi-tenant buildings in New York and verified that the temporal and spatial granularity of the data can affect the accuracy of the forecast [23]. Wang (2015) applied an instance-weighted variant of the SVM with both 1-norm and 2-norm formulations to deal with the class imbalance problem [39]. Furthermore, Zhang (2013) studied the application of support vector machines to face recognition [40].

Gated Recurrent Unit
The Gated Recurrent Unit (GRU) departs from conventional supervised machine learning by carrying memory units with a forgetting mechanism. Although the GRU deep learning model has drawn attention of late, its application is still relatively rare and is mainly concentrated in computer-related areas. Le (2017) proposed a GRU based on the Recurrent Neural Network (RNN) to construct an energy disaggregation classifier with deep learning, and trained the model on the UK-DALE dataset. From the experiment, Le concluded that the deep learning method was very effective for non-intrusive load monitoring (NILM) [41]. Chung (2014) evaluated RNNs with three widely used recurrent units: a traditional tanh unit, a Long Short-Term Memory (LSTM) unit, and a GRU, and confirmed the superiority of the GRU [42]. Jozefowicz (2015) compared the GRU and LSTM models and found that the GRU achieved results comparable to the LSTM on multiple tasks while being easier to train [43]. Zhou's (2016) experiments showed that the GRU had some advantages in learning recurrent neural networks, with stable performance and relatively few parameters [44]. Tang (2016) investigated recurrent approaches to question detection and built different RNN and bidirectional RNN (BRNN) models based on gated recurrent units to extract efficient features at the segment and utterance levels. Tang concluded that a particular advantage of the GRU was that it can determine a proper time scale for extracting high-level contextual features [45]. Rana's (2016) speech experiments with eight different types of noise showed that the run time of the GRU was reduced by 18.16% and that the GRU was comparable to the long short-term memory (LSTM), the most popular recurrent neural network [46]. Huang (2017) verified that GRU-ELC units achieved state-of-the-art performance on three standard scene labeling datasets. This comprehensive experiment showed that the new GRU-ELC unit benefited scene labeling because it could encode longer contextual dependencies in the image more effectively than the traditional RNN unit [47].

Multiple Linear Regression Model
The Multiple Linear Regression model describes the relationship between an output variable and multiple explanatory variables. The purpose of the analysis is to predict the output variable from the values of the explanatory variables. The main limitation of the model is that the correlation between the variables changes with time and space [48]. Let the output variable be $y_i$ and the explanatory variables be $x_{i,h}$. The relationship between the output variable and the explanatory variables can then be expressed as:
$$y_i = b_0 + \sum_{h=1}^{k} b_h x_{i,h} + e_i \quad (1)$$
$$\hat{y}_i = b_0 + \sum_{h=1}^{k} b_h x_{i,h} \quad (2)$$
Here, $x_{i,h}$ is the value of the $h$th explanatory variable for year $i$, $b_0$ is the constant term of the regression, $b_h$ is the coefficient of the $h$th explanatory variable, $k$ is the number of explanatory variables, $y_i$ is the value of the output variable for year $i$, $\hat{y}_i$ is the estimated value of the output variable for year $i$, and $e_i$ is the prediction error, defined as:
$$e_i = y_i - \hat{y}_i \quad (3)$$
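As a minimal sketch of how the regression coefficients might be estimated, the following pure-Python example fits the constant term and the coefficients by ordinary least squares through the normal equations. The estimation method, the function names (`solve_linear_system`, `fit_mlr`, `predict_mlr`), and the toy data are assumptions made for illustration; they are not taken from the study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on a and b together.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(rows, targets):
    """Return [b_0, b_1, ..., b_k] minimizing the sum of squared errors."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries b_0
    p = len(design[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_mlr(coeffs, row):
    """Estimated output: constant term plus weighted explanatory variables."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Toy data generated from y = 1 + 2*x1 + 3*x2 (not the study's data).
toy_rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
toy_targets = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in toy_rows]
toy_coeffs = fit_mlr(toy_rows, toy_targets)
```

With real indicators of very different magnitudes (for example GDP and population), the normal equations can become ill-conditioned, so scaling the inputs first would be a sensible precaution.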

Support Vector Regression Model
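The Support Vector Regression model obtains an approximating function $g(x)$ from the known historical sample $G = \{(x_i, y_i)\}_{i=1}^{N}$ of the correlated variables. The data $x$ are mapped nonlinearly to a high-dimensional feature space, and linear regression is then carried out in this feature space [49]:
$$g(x) = \sum_{i=1}^{D} w_i \varphi_i(x) + b \quad (4)$$
In Equation (4), $\varphi_i(x)$ are the features, and $b$ and $w_i$ are coefficients that can be estimated from the data. In this way, nonlinear regression in the low-dimensional input space becomes linear regression in the high-dimensional feature space. In the standard $\varepsilon$-insensitive formulation, the coefficients $w_i$ are obtained by minimizing:
$$R(w) = \sum_{i=1}^{N} \left| g(x_i) - y_i \right|_{\varepsilon} + \lambda \lVert w \rVert^2 \quad (5)$$
In Equation (5), $\lambda$ is a regularization constant, and the function $\left| g(x_i) - y_i \right|_{\varepsilon}$ is defined as:
$$\left| g(x_i) - y_i \right|_{\varepsilon} = \max\left(0, \left| g(x_i) - y_i \right| - \varepsilon\right) \quad (6)$$
The minimizer can also be expressed as follows:
$$g(x) = \sum_{i=1}^{N} (\alpha_i^* - \alpha_i)\, k(x_i, x) + b \quad (7)$$
In addition, the kernel function is the scalar product in the $D$-dimensional feature space:
$$k(x, x') = \sum_{j=1}^{D} \varphi_j(x)\, \varphi_j(x') \quad (8)$$
The coefficients $\alpha_i$ and $\alpha_i^*$ are obtained by maximizing:
$$W(\alpha, \alpha^*) = -\varepsilon \sum_{i=1}^{N} (\alpha_i^* + \alpha_i) + \sum_{i=1}^{N} y_i (\alpha_i^* - \alpha_i) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\, k(x_i, x_j) \quad (9)$$
The constraint is
$$\sum_{i=1}^{N} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le \frac{1}{2\lambda}.$$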
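The ε-insensitive error and the kernel expansion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the Gaussian (RBF) kernel, the function names, and the hand-picked coefficients are hypothetical, and no dual optimization is performed here.

```python
import math


def eps_insensitive(residual, eps):
    """Epsilon-insensitive error: deviations smaller than eps cost nothing."""
    return max(0.0, abs(residual) - eps)


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; the choice of kernel is an assumption here."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svr_predict(x, support_vectors, dual_coeffs, bias, gamma=1.0):
    """Kernel expansion g(x) = sum_i c_i * k(x_i, x) + b.

    dual_coeffs[i] holds the difference alpha_i^* - alpha_i for sample i.
    """
    return bias + sum(
        c * rbf_kernel(sv, x, gamma) for sv, c in zip(support_vectors, dual_coeffs)
    )


# Hypothetical, hand-picked values for illustration (not fitted to any data).
demo_support = [(0.0,), (1.0,)]
demo_coeffs = [0.5, -0.25]
demo_value = svr_predict((0.0,), demo_support, demo_coeffs, bias=0.1)
```

In practice the coefficients would come from a solver for the dual problem, for which an established library is the usual choice.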

Gated Recurrent Unit Neural Network Model
The Gated Recurrent Unit (GRU) neural network model captures dependencies on a variety of time scales by means of recurrent units [43] whose gates modulate the flow of information. Assuming that the input of the model is expressed as $x = (x_1, x_2, \cdots, x_T)$, the logical calculation process is shown in Figure 1.
The activation $h_t^j$ of the GRU at time $t$ is a linear interpolation between the activation $h_{t-1}^j$ at the previous time point and the candidate activation $\tilde{h}_t^j$:
$$h_t^j = (1 - z_t^j)\, h_{t-1}^j + z_t^j\, \tilde{h}_t^j \quad (10)$$
The update gate $z_t^j$ determines how much the unit updates its activation and how much of the existing activation it retains. The update gate is computed as follows:
$$z_t^j = \sigma\left(W_z x_t + U_z h_{t-1}\right)^j \quad (11)$$
The whole procedure sums the existing state and the newly computed state; however, the GRU cannot control the range of the state update, and every calculation updates the whole state.
The candidate activation $\tilde{h}_t^j$ is computed similarly to that of the simple RNN:
$$\tilde{h}_t^j = \tanh\left(W x_t + U (r_t \otimes h_{t-1})\right)^j \quad (12)$$
where $r_t$ is the vector of reset gates and $\otimes$ denotes element-wise multiplication. When the reset gate is closed ($r_t^j \approx 0$), the unit reads the input sequence as if the past state had been forgotten. The reset gate $r_t^j$ is calculated as follows:
$$r_t^j = \sigma\left(W_r x_t + U_r h_{t-1}\right)^j \quad (13)$$
Here $\sigma$ is the logistic sigmoid function, and $W$, $U$, $W_z$, $U_z$, $W_r$, and $U_r$ are weight matrices. The tanh function used above is well established and widely used [45].
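To make the gate computations concrete, here is a minimal pure-Python sketch of a single GRU time step. It assumes the common convention in which the update gate weights the candidate activation and bias terms are omitted; the helper names, the dictionary keys, and the all-zero demonstration weights are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def gru_step(x_t, h_prev, params):
    """One GRU update; params holds weight matrices W_z, U_z, W_r, U_r, W, U."""
    # Update gate z_t and reset gate r_t, one value per hidden unit j.
    z = [sigmoid(v) for v in add(matvec(params["W_z"], x_t), matvec(params["U_z"], h_prev))]
    r = [sigmoid(v) for v in add(matvec(params["W_r"], x_t), matvec(params["U_r"], h_prev))]
    # Candidate activation uses the reset-gated previous state (element-wise product).
    gated = [r_j * h_j for r_j, h_j in zip(r, h_prev)]
    h_cand = [math.tanh(v) for v in add(matvec(params["W"], x_t), matvec(params["U"], gated))]
    # New activation: linear interpolation between the old state and the candidate.
    return [(1.0 - z_j) * h_j + z_j * c_j for z_j, h_j, c_j in zip(z, h_prev, h_cand)]


# Hypothetical sizes and weights: 2 inputs, 2 hidden units, all-zero weights.
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
demo_params = {name: zero_weights for name in ("W_z", "U_z", "W_r", "U_r", "W", "U")}
demo_h = gru_step([1.0, -1.0], [0.4, -0.2], demo_params)
```

With all-zero weights both gates equal 0.5 and the candidate is 0, so the new state is simply half of the previous state, which gives a quick sanity check of the interpolation.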
Energies 2017, 10, 1453

Data Sources
To verify the predictive accuracy of the above three models for Chinese primary energy consumption, and following the research of Zong (2009) [50], we chose five variables: gross domestic product (GDP), population, import trade volume, export trade volume, and energy consumption. GDP, population, import trade volume, and export trade volume were treated as independent variables, and energy consumption as the dependent variable. The data cover 1965 to 2015. The data for GDP, population, import trade volume, and export trade volume were taken from the World Development Indicators [51], and the Chinese primary energy consumption data were taken from the "BP World Energy Statistics Yearbook" [52]. These data are shown in Table 1. The total number of data samples was 51.

Analysis of Results
MLR and SVR are deterministic mathematical methods, and stable results can be obtained from the formulas given above. The GRU model is a deep learning neural network whose structure must be constructed further. The GRU model has three layers: an input layer, a hidden layer, and an output layer. The input layer consists of four input variables: GDP, population, imports, and exports. The hidden layer consists of three GRU units with time steps of 1, 4, and 6, each containing 32 cells, and the output layer is the target variable, primary energy consumption. The structure of the model is shown in Figure 2.

The training and testing of the GRU model were completed using the Keras library in Python, with the optimizer set to "RMSprop", the loss function set to "MAPE", the learning rate set to 0.0001, and the number of epochs set to 2000. To prevent overfitting, a validation step was added to the training process to determine whether the best model parameters had been reached.
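One plausible reading of the time steps of 1, 4, and 6 is that each GRU unit receives that many consecutive years of the four indicators. Under that assumption, the input sequences could be prepared as sketched below; the function name `make_windows`, the placeholder data, and the choice to pair each window with the consumption of its final year are hypothetical and are not described in the study.

```python
def make_windows(features, targets, time_steps):
    """Pair each target with the `time_steps` most recent feature rows.

    features: list of per-year rows (e.g. [GDP, population, imports, exports])
    targets:  list of per-year energy consumption values
    """
    samples = []
    for end in range(time_steps - 1, len(features)):
        window = features[end - time_steps + 1 : end + 1]
        samples.append((window, targets[end]))
    return samples


# Placeholder rows standing in for 10 years of the four indicators.
toy_features = [[float(year), 0.0, 0.0, 0.0] for year in range(10)]
toy_targets = [float(year) for year in range(10)]
windows_by_steps = {
    steps: make_windows(toy_features, toy_targets, steps) for steps in (1, 4, 6)
}
```

Longer windows yield fewer samples from the same series, which matters when only 51 annual observations are available.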
The main research goal of this paper is to compare the accuracy of the MLR, SVR, and GRU models for medium-term forecasting of Chinese primary energy consumption. To express the advantages and disadvantages of the three models, the paper uses MAPE (mean absolute percentage error) and RMSE (root mean square error) as error measures. The two error formulas are as follows:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \quad (14)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \quad (15)$$
In Equations (14) and (15), $y$ represents the actual primary energy consumption in China, $\hat{y}$ represents the predicted value of Chinese primary energy consumption, and $n$ is the number of samples.
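The two error measures are simple to compute. The sketch below uses the standard definitions of MAPE (expressed in percent) and RMSE; the function names and the toy numbers are illustrative and are not values from the study.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy values for illustration only.
toy_actual = [100.0, 200.0, 400.0]
toy_predicted = [110.0, 180.0, 400.0]
toy_mape = mape(toy_actual, toy_predicted)
toy_rmse = rmse(toy_actual, toy_predicted)
```

MAPE is scale-free, whereas RMSE is expressed in the units of the series (here Mtoe) and penalizes large errors more heavily, which is why the two are reported together.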
The experiments using MLR and SVR were straightforward, but an interesting problem emerged in the GRU experiment. When all of the training data were used for the GRU prediction experiment, the MAPE of the prediction was 14, which defeats the purpose of improving predictive accuracy. However, the MAPE was 5.63 when the GRU prediction experiment used only the 8 years of data from 2000 to 2007, which yielded the optimal model parameters. The main reason is that the input variables grow year by year: recent data provide more information for the forecast, whereas earlier data have a detrimental effect on it. The training and test errors in Table 2 show that the GRU model is more accurate than the MLR and SVR models on both the MAPE and RMSE indicators when forecasting Chinese primary energy consumption. Figure 3 compares the actual primary energy consumption in China from 1965 to 2015 with the values predicted by the various models. In summary, the GRU model is the best of the three methods for predicting Chinese primary energy consumption, and it is used for the medium-term prediction of Chinese primary energy consumption.

Chinese Primary Energy Consumption Forecasts Based on Different Scenarios
After comparing the model errors, the GRU model was used to predict Chinese primary energy consumption from 2016 to 2021. An attempt was made to reduce the uncertainty of the forecast by setting appropriate scenarios and by using suitable forecast data for each scenario. For gross domestic product (GDP), the forecast data for Chinese GDP published by the International Monetary Fund (IMF) [53] were employed. According to the World Population Prospects (2015) [54], the Chinese population will reach 1.424 billion by 2030. Using Equation (16) to calculate the annual growth rate of the population, it can be inferred that the approximate growth rate of the Chinese population from 2016 to 2021 will be 0.25%. Since there is no authoritative estimate of import and export trade, the initial growth rate, average growth rate, and minimum growth rate can only be calculated from historical growth data. Taking into account the country's ongoing industrial transformation and upgrading, the minimum growth rate is set at the lowest non-negative growth rate from 1965 to 2015. Owing to the uncertainty of the import and export trade volume, the forecast of Chinese primary energy consumption from 2016 to 2021 is set up as four possible scenarios, as shown in Table 3. Following the calculation described above, the estimates of Chinese gross domestic product and population from 2016 to 2021 are shown in Table 4. The estimated Chinese import and export trade levels at different growth rates are shown in Table 5.
$$\mathrm{CAGR}(t_0, t_n) = \left( \frac{V(t_n)}{V(t_0)} \right)^{1/n} - 1 \quad (16)$$
In Equation (16), CAGR is the annual growth rate, $V(t_0)$ is the value in the beginning year, $V(t_n)$ is the value in the ending year, and $n$ is the number of years in the whole period.
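The growth-rate calculation of Equation (16) can be checked with a short Python sketch. The function names `cagr` and `project` are illustrative; the numerical inputs are the 2015 and 2021 consumption figures quoted in the scenario discussion.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after growing at a constant annual `rate` for `years` years."""
    return start_value * (1.0 + rate) ** years


# Energy consumption (Mtoe) quoted for 2015 and 2021 in two of the scenarios.
scenario_1 = cagr(3013.96, 2970.19, 6)  # slightly negative growth
scenario_2 = cagr(3013.96, 5618.67, 6)  # fastest growth
```

The same `project` helper expresses how an indicator such as population or trade volume would be extrapolated at a fixed annual rate when building the scenario inputs.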
The data of the four independent variables are substituted into the trained GRU energy consumption forecasting model; the results of the four scenarios are compared in Figure 4. Although more scenarios would assess the behavior of the predictive model more thoroughly, the four scenarios were chosen because they rest on realistic assumptions, performed well computationally, and hopefully represent the spectrum of possible outcomes.
In Scenario 1, Chinese primary energy consumption is predicted to decline from 3013.96 Mtoe in 2015 to 2970.19 Mtoe in 2021, a reduction of 1.45% over 6 years. According to Equation (16), the annual growth rate is −0.24%.
In Scenario 2, the forecast indicates an increase of 86.42%, from 3013.96 Mtoe in 2015 to 5618.67 Mtoe in 2021. According to Equation (16), the annual growth rate is 10.9%. Chinese primary energy consumption increases fastest in this scenario.
In Scenario 3, the forecast suggests an increase of 23.6%, from 3013.96 Mtoe in 2015 to 3725.2 Mtoe in 2021, and an annual growth rate of 3.6% based on Equation (16).
In Scenario 4, the forecast reveals a decrease from 3013.96 Mtoe in 2015 to 2954.04 Mtoe in 2021, a reduction of 1.99% over six years. The annual growth rate is −0.33% according to Equation (16), and Chinese primary energy consumption declines fastest in this scenario. To sum up, the four scenarios predict that Chinese energy consumption in 2021 will lie between 2954.04 Mtoe and 5618.67 Mtoe.
The growing energy demand requires the government to make the right decisions in terms of energy planning.If energy planning results in incorrect underestimates of energy needs, there will be a shortage of energy supply, resulting in an energy deficit.Due to the strong correlation between energy consumption and greenhouse gas emissions, the prediction of future energy consumption can also affect the Chinese reaction to climate change.Through the accurate prediction of energy consumption, environmental managers can not only determine the major sources of carbon emissions, but can also determine whether all kinds of energy have an impact on climate change.Reliable energy forecasts can ensure the energy security of the country, achieving the sustainable development of energy and economy.

Conclusions
Chinese primary energy consumption forecasting is a key element of national energy security and energy planning. Based on economic and demographic factors, three Chinese energy forecasting models (multivariable linear regression, support vector regression, and gated recurrent unit) were established for forecasting energy consumption in China from 2016 to 2021. The study yielded the following three important findings:
1. Deep learning is a hotspot of current research, and the GRU captures the internal relations between the four economic variables (gross domestic product (GDP), population, import trade volume, and export trade volume) and energy consumption. The four economic variables can be used to forecast primary energy consumption in China;
2. The GRU model is a gated model with long- and short-term memory for learning time series data. Compared with the MLR model and the SVR model, the GRU model is superior for processing time series data, and the mean absolute percentage error of the predicted result can be as low as 5.63. However, when applying this model, the amount of training data is a key factor in accurate prediction. In particular, for the prediction of macroeconomic variables, recent data are more important to the final forecast because of uncertainties in socio-economic change; and
3. The GRU model was used to forecast energy consumption in China from 2016 to 2021, with the finding that Chinese energy consumption in 2021 will lie between 2954.04 Mtoe and 5618.67 Mtoe.
The proposed model could be one of the more effective deep learning techniques for energy consumption forecasting. Although this is the first study that applies the GRU model to the prediction of Chinese primary energy consumption, other predictive techniques and methods can also be implemented. Two directions can be pursued further. First, the model structure and parameter settings of the GRU forecasting method can be refined to increase the accuracy of the final energy consumption forecast; second, other economic variables related to energy consumption can be selected for the forecast.

Figure 1. The computing logic of the Gated Recurrent Unit (GRU).
The 51-item data sample was divided so that the test samples made up about 15% of the total. The training samples were used to fit the models, and the test samples were used to judge the accuracy of the models. The experiment used 43 data items from 1965 to 2007 as training samples and 8 data items from 2008 to 2015 as test samples.
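The split described here is chronological rather than random, which is the usual choice for time series. A minimal sketch, with a hypothetical helper name:

```python
def chronological_split(years, n_test):
    """Hold out the most recent `n_test` years; no shuffling for time series."""
    return years[:-n_test], years[-n_test:]


all_years = list(range(1965, 2016))      # 51 annual observations
n_test = round(0.15 * len(all_years))    # 15% of 51, rounded to 8
train_years, test_years = chronological_split(all_years, n_test)
```

Keeping the test years strictly after the training years avoids leaking future information into the fitted model.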

Figure 2. The structure of the GRU neural network.

Table 1. Primary energy demand and indicator data of China.

Table 2. Comparison of forecasting error for various models.

Table 3. The forecasting scenarios of Chinese energy consumption (2015-2021). Note: The initial growth rate is the growth rate calculated in 2015, the average growth rate is the average growth rate from 1965 to 2015, and the minimum growth rate is the non-negative minimum growth rate from 1965 to 2015.

Table 5. China's import and export estimation (2016-2021).