1. Introduction
India, with a total population of more than 1.3 billion, has become the world’s third-largest emitter of greenhouse gases while rapidly industrializing and urbanizing. Coal is one of the most important causes of this problem. As the world’s second largest coal consumer and importer, India’s future coal changes will have a huge impact on the global coal trading market [
1,
2,
3]. Domestically, India relied on coal to produce three-quarters of its electricity in 2016. Coal therefore plays an important supporting role in India and affects the balance of supply and demand in the domestic energy market. Against this background, accurately predicting India’s future coal consumption supports the formulation of both environmental and energy policy. On the one hand, managing coal consumption trends helps to control India’s greenhouse gas emissions. On the other, anticipating changes in India’s coal demand helps the global coal trading market to adjust ahead of time and helps to balance domestic energy supply and demand.
Most existing research focuses on the relationship between India’s energy intensity and economic development [
4,
5,
6], renewable energy [
7,
8,
9], and total energy consumption [
2,
10,
11]. Few existing studies on Indian energy are focused on coal. For example, in terms of the relationship between energy and the economy, Ahmad et al. [
12] studied the relationship between carbon emissions [
13], energy consumption and economic development, and concluded that all energy sources have a positive impact on carbon emissions. Dasgupta et al. [
14] derived and analyzed the energy intensity trends of seven energy-intensive manufacturing industries in India in the past. The conclusions show that structural changes have little effect on energy demand. In addition, there are also studies on the overall state of energy [
15,
16,
17,
18]. Wang et al. [
19] predicted the total energy consumption of China and India. The research results show that India’s future energy growth rate will be 2–4 times that of China. Singh et al. [
20] evaluated the potential index of solar energy development and provided reasonable suggestions for promoting solar energy development in India. Das et al. [
21] constructed a modeling framework for linear dynamics and estimated India’s energy demand and carbon dioxide emissions for the cement industry in 2021. In terms of renewable energy [
22,
23,
24], Sharma et al. [
25] conducted a comprehensive assessment of the availability, environmental impact and development prospects of renewable energy in India. Sonal et al. [
26] identified obstacles to solar energy development and provided countermeasures for India’s adoption of solar technology. Mohanty et al. [
27] developed a new combination method based on statistical methods, fuzzy algorithms and neural networks, and applied it to solar forecasting in India. The results show that India’s future solar energy will rise to 35GW in 2020. Jiang et al. [
28] developed and studied a multi-stage intelligent method based on integrated learning to predict 5-day global horizontal radiation in four regions of India. The final result confirms the effectiveness of the method. Bhattacharya and Ahmed [
29] compared the return prediction performance of the GARCH model with the GARCH-ANN model using the root mean square error as a standard for crude oil prices in India. The results show that the hybrid model of ANN and EGARCH has the best performance.
In existing research on energy forecasting, most studies rely on single prediction methods; few apply several combined methods to the same research object. For example, in applications of the grey model, Chen et al. [
30] proposed two grey interval prediction methods: the interval grey model (abbreviated as: GM (1,1)) and the interval nonlinear grey Bernoulli model (NGBM (1,1)) for the problem of estimation range, which respectively predict minority and uncertain time series data. Yuan et al. [
31] also used the GM (1,1) model and the Autoregressive Integrated Moving Average model (ARIMA) to predict the total energy consumption in China. The results show that China’s future energy consumption will grow at a rate of 4%. In the application of a neural network model, Jebaraj et al. [
32] used a single neural network model to simultaneously predict and validate various energy sources in India. The verification results confirm that the neural network model can make accurate predictions in most cases. Wang et al. [
33] used the linear ARIMA to correct NMGM residuals to forecast China’s dependency on foreign oil; they reported that China’s dependency on foreign oil will exceed 80% of its energy expenditures by 2030. Hossain et al. [
34] used artificial neural network models to simultaneously predict new solar and wind energy and applied them to the climate of Queensland. In the application of the ARIMA model, Oliveira et al. [
35] used the bagging ARIMA model to predict medium- and long-term power consumption. Wang et al. [
36] applied hybrid ARIMA and the metabolic grey technique to forecast shale gas output in the United States. Sen et al. [
37] selected the correct ARIMA model and predicted energy consumption and greenhouse gas emissions of Indian pig iron manufacturing institutions. Li et al. [
38] applied data mining and BP neural network models to the prediction of air pollution, and found the applicability of BP model to atmospheric data. Wang et al. [
39] adopted single- and non-linear forecasting techniques to predict shale oil output in the United States. Xu et al. [
40] adopted two single models: the ARIMA model and the BP neural network model to predict the monthly exchange rate of RMB. Through these applications, the study found that the average relative error of the two single models was 15% and 16%, respectively. Ray et al. [
41] also used genetic algorithms and neural networks to predict electrical load, and found that genetic algorithms provide better prediction results than backpropagation.
A review of the above literature yields the following points: (1) Existing forecasting literature on India concentrates on renewable energy, carbon emissions and individual social issues. (2) Single predictive models have been unable to deliver high-precision predictions. (3) Combined models perform well and are valued in the field of forecasting. On this basis, forecasting India’s coal consumption is a gap in current research, and combined models provide a suitable tool for filling it.
In order to fill this gap, this study uses several hybrid time series forecasting models to forecast India’s coal consumption comprehensively. The innovations of this research are as follows. (1) This study uses high-precision hybrid time series models to predict coal consumption in India. The forecast results provide a reference for future energy planning and the economic development of India. (2) The models selected in this study include two traditional single models, the metabolic grey model (MGM) and the Back-ProPagation Network (BP), and two newly developed hybrid models based on the error correction principle: the metabolic grey model combined with the Autoregressive Integrated Moving Average model (MGM-ARIMA), and the Back-ProPagation Network combined with the Autoregressive Integrated Moving Average model (BP-ARIMA). Using multiple models simultaneously allows a more comprehensive forecast, helps to cross-check forecasting accuracy, and increases the credibility of the predicted data, which can provide a reference for the development of follow-up policies.
The remainder of this paper is organized as follows: Section 2 describes the forecasting methods used. The forecasting process is presented in Section 3. Section 4 presents the accuracy and results of the predictions. A summary of the full text is given in Section 5.
2. Method
2.1. Metabolic Grey Model
The metabolic grey model (MGM) improves on the traditional grey model (GM) by adding an element replacement process. Traditional grey model theory, abbreviated as the GM model, was developed in 1982 by Professor Deng Julong [42]. The theory aims at an accurate understanding of system behavior from a limited amount of known information. In operation, the GM model first accumulates (or differences) a random sequence $x^{(0)}$ to make it regular: $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$. A differential equation is then established for this regular sequence: $\frac{dx^{(1)}}{dt} + a x^{(1)} = b$. Solving the differential equation, $\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak} + \frac{b}{a}$, yields predictions of the future data of the system. However, the GM model places clear requirements on the data. For example, it is well suited to approximately 5–10 data points. If the amount of data is too large or the data fluctuate strongly, the predictions of the GM model will be unsatisfactory.
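The accumulate, fit and solve procedure described above can be sketched in a few lines of Python. This is a minimal illustration of the standard GM(1,1) form, not the authors' implementation: the function name `gm11_forecast`, the least-squares estimation of the two parameters, and the use of adjacent means as background values are assumptions made for the example.

```python
import numpy as np

def gm11_forecast(x0, steps):
    """Fit a GM(1,1) model to x0 and forecast `steps` values ahead.

    Hypothetical helper for illustration; assumes the standard GM(1,1)
    form with parameters a (development coefficient) and b (grey input).
    """
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                       # accumulated sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])            # background values (adjacent means)
    B = np.column_stack((-z1, np.ones(n - 1)))
    # Least-squares estimate of a and b in x0(k) + a * z1(k) = b
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(n + steps)
    # Solution of the differential equation (time response function)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    # Undo the accumulation to return to the original scale
    x0_hat = np.diff(x1_hat, prepend=0.0)
    return x0_hat[n:]

# Synthetic data growing by 10% per period (not real consumption figures)
history = [100 * 1.1 ** k for k in range(5)]
print(gm11_forecast(history, steps=2))
```

Because the model is fitted once to a fixed set of points, every forecast step reuses the same parameters, which is the limitation the metabolic variant addresses.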
In order to solve this problem, data replacement can be added to the GM model. The improved model is called the metabolic grey model (abbreviated as the MGM model) [43]. The MGM model retains the calculation procedure of the GM model: its data processing method and its construction of the differential equation are the same. The difference is that the MGM model divides the prediction process of the GM model into a number of prediction rounds, and the data used in each round are different. To illustrate the differences between the two models, Figure 1 shows the corresponding prediction processes of the GM model and the MGM model. Each colored circle represents a known block of data, while each open circle represents a block of data that needs to be predicted in the future.
As shown in Figure 1, the GM model uses only five pieces of data to predict all the unknown data. The MGM model predicts only one unknown piece of data per round. Furthermore, the data used by the GM model for prediction never change, whereas the data used in each round of the MGM model are different. The replacement principle mirrors the physiological process of metabolism. Assume that five data points are used for grey prediction. After the first round of the MGM model, the oldest data point is discarded, and the latest data point reflecting the characteristics of the system is added. By analogy, the data set used in each round of MGM prediction is the one that best reflects the current system dynamics. This overcomes the traditional grey model’s inaccurate prediction of strongly fluctuating data. With this improvement, the MGM model can be applied to large and volatile data sequences, and the accuracy of the prediction is also greatly improved.
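The rolling replacement can be illustrated with a short sketch. It assumes a window of five points, as in the example above, and assumes that when no new observation is available the value added to the window is the model's own latest prediction. The names `gm11_next` and `mgm_forecast` are hypothetical.

```python
import numpy as np

def gm11_next(window):
    """One-step-ahead GM(1,1) forecast from a short data window."""
    x0 = np.asarray(window, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)
    z1 = 0.5 * (x1[1:] + x1[:-1])
    B = np.column_stack((-z1, np.ones(n - 1)))
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    c = x0[0] - b / a
    # Difference of two consecutive accumulated predictions
    return c * (np.exp(-a * n) - np.exp(-a * (n - 1)))

def mgm_forecast(history, steps, window=5):
    """Metabolic rolling forecast: one new value per round."""
    data = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        nxt = gm11_next(data)
        forecasts.append(nxt)
        # Metabolism: drop the oldest point, append the newest one
        data = data[1:] + [nxt]
    return forecasts

# Synthetic series for illustration only
history = [100 * 1.1 ** k for k in range(8)]
print(mgm_forecast(history, steps=3))
```

Refitting in every round costs more computation than a single GM fit, but each forecast is based on the most recent window rather than on increasingly stale data.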
2.2. Back-ProPagation Network Forecasting Model
The Back-ProPagation Network, also known as the Back Propagation Neural Network, continuously corrects the network weights and thresholds by training on the sample data, so that the error function decreases along the negative gradient direction and the output approaches the desired output. It is a widely used neural network model, mostly applied to function approximation, pattern recognition and classification, data compression and time series prediction. The calculation process is as follows.
Step 1: Data preprocessing. In this step, the training data and test data are preprocessed using a normalized approach. After the model is established, the inverse normalization method can be used to restore the predicted data into meaningful data.
Step 2: Select the number of hidden layer neurons. An empirical formula is often used for this step: $n = \sqrt{a + b} + c$. Here, ‘a’ is the number of neurons in the input layer, ‘n’ is the number of neurons in the hidden layer, ‘b’ is the number of neurons in the output layer, and ‘c’ is a constant. Let ‘c’ take values from 1 to 10, thereby varying ‘n’, and compare the resulting models one by one to find the most accurate.
Step 3: Set parameters. To obtain the most effective model, the training process must be constrained by a set of parameters. In this paper, the minimum training error is $1 \times 10^{-7}$, the maximum number of training iterations is 1000, and the learning rate is 0.01 (as shown in Figure 2).
Step 4: Model prediction and testing. To obtain a usable BP neural network model, the model must be evaluated by making predictions and calculating the prediction error.
2.3. The Autoregressive Integrated Moving Average Model
For time series predictions, the ARIMA model is one of the most commonly used statistical models [44]. Its principle is to first convert a non-stationary time series into a stationary one. The dependent variable is then regressed only on its own lagged values and on the current and lagged values of the random error term. The advantage of the ARIMA model is therefore that the prediction process requires only endogenous variables and no exogenous variables. However, the ARIMA model requires that the sequence be stationary after differencing.
Specifically, the prediction process includes the following steps [39].
Step 1: Make the time series stationary by differencing. Stationarity ensures that the curve fitted to the sampled time series can continue along its existing form for a short time into the future; that is, the mean and variance of the data should, in theory, not change appreciably over time.
Step 2: Establish an autoregressive (AR) model. The autoregressive model describes the relationship between the current value and historical values, predicting a variable from its own historical data. Its formula is as follows:
$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \epsilon_t \quad (1)$$
where $y_t$ is the current value; $\mu$ is the constant term; p is the order; $\gamma_i$ is the autocorrelation coefficient; $\epsilon_t$ is the error.
Step 3: Establish a moving average (MA) model. The moving average model focuses on the accumulation of error terms in the autoregressive model. It can effectively eliminate random fluctuations in predictions. Its formula is as follows:
$$y_t = \mu + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i} \quad (2)$$
Here, each symbol has the same meaning as in (1), and $\theta_i$ is the correlation coefficient of the MA formula.
Step 4: Combine AR and MA to construct an autoregressive moving average (ARMA) model. The specific formula is as follows:
$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i} \quad (3)$$
In this formula, p and q are the orders of the autoregressive model and the moving average model, respectively; $\gamma_i$ and $\theta_i$ are the correlation coefficients of the two models, respectively, and need to be estimated.
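The differencing and autoregression steps can be illustrated with a simplified sketch. It covers only the special case without moving average terms (q = 0), because estimating MA coefficients requires iterative likelihood methods that do not fit in a short example. The constant term and the autoregressive coefficients are estimated by ordinary least squares, and all function names are hypothetical.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares estimate of the constant and the p AR coefficients."""
    y = np.asarray(y, dtype=float)
    # Each row holds 1 followed by the p most recent lags, newest first
    rows = [np.r_[1.0, y[t - p:t][::-1]] for t in range(p, len(y))]
    coef = np.linalg.lstsq(np.array(rows), y[p:], rcond=None)[0]
    return coef[0], coef[1:]

def forecast_ar(y, const, coefs, steps):
    """Iterate the fitted AR recursion `steps` periods ahead."""
    hist = list(y)
    p = len(coefs)
    for _ in range(steps):
        lags = hist[-1:-p - 1:-1]            # most recent value first
        hist.append(const + float(np.dot(coefs, lags)))
    return np.array(hist[len(y):])

def arima_p_d_0(series, p, d, steps):
    """Difference d times, fit AR(p), forecast, then undo the differencing."""
    levels = [np.asarray(series, dtype=float)]
    for _ in range(d):
        levels.append(np.diff(levels[-1]))   # Step 1: differencing
    const, coefs = fit_ar(levels[-1], p)     # Step 2: AR model
    fc = forecast_ar(levels[-1], const, coefs, steps)
    for lev in reversed(levels[:-1]):        # integrate back to the original scale
        fc = lev[-1] + np.cumsum(fc)
    return fc

# Synthetic quadratic trend: one difference turns it into a linear sequence
series = [t ** 2 for t in range(10)]
print(arima_p_d_0(series, p=1, d=1, steps=3))
```

In practice the orders p, d and q are usually chosen from autocorrelation plots or information criteria, and a statistics package would estimate the full model including the MA terms.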
2.4. Two Combined Linear Modified Linear (MGM-ARIMA) and Linear Modified Nonlinear (BP-ARIMA) Models
Although each single model has its own range of applicability, each also has unavoidable flaws, which motivates combined models. A combined model can reduce the shortcomings of each single model and let the strengths of the two single models complement each other (as shown in Table 1) [45]. Common ways of combining models include the equal weight method, the minimum variance method, and so on. The predicted values produced by these methods are obtained by weighting the individual prediction results according to their precision.
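For comparison with the approach adopted in this study, the two weighting schemes named above can be sketched as follows. The inverse-variance weights shown here are one common reading of the minimum variance method and assume that the errors of the individual models are uncorrelated and have zero mean. The function names and the numbers are illustrative only.

```python
import numpy as np

def equal_weight(preds):
    """Average the individual model predictions with equal weights."""
    return np.mean(np.asarray(preds, dtype=float), axis=0)

def inverse_variance_weight(preds, past_errors):
    """Weight each model by the inverse of its past error variance.

    Assumes zero-mean, mutually uncorrelated errors, so the mean squared
    past error serves as the variance estimate.
    """
    var = np.mean(np.asarray(past_errors, dtype=float) ** 2, axis=1)
    w = (1.0 / var) / np.sum(1.0 / var)
    return w @ np.asarray(preds, dtype=float), w

# Two hypothetical models forecasting two periods
preds = [[10.0, 20.0], [12.0, 22.0]]
past_errors = [[1.0, -1.0], [2.0, -2.0]]
combined, weights = inverse_variance_weight(preds, past_errors)
print(equal_weight(preds), combined, weights)
```

Both schemes blend finished forecasts and leave each model's own error untouched.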
Unlike these traditional combinations, the approach used in this study combines prediction steps [33]. Assume that the combined model includes two single models. The first is called the base model, and the second the modified model. The base model is used to make the prediction, and the modified model is then used to predict and correct the error of the base model, thereby reducing the overall error. The specific steps are as follows.
Step 1: Use the base model to predict the original data sequence $x^{(0)}(t)$. The prediction follows the usual steps of the base model. At the end of this step, the preliminary predictions $\hat{x}^{(0)}(t)$ are obtained.
Step 2: Calculate the initial error. Comparing the prediction result with the real value gives the prediction error of the base model, which is called the initial error. The relevant formula is $e(t) = \hat{x}^{(0)}(t) - x^{(0)}(t)$, where $e(t)$ is the error value at time point ‘t’, $\hat{x}^{(0)}(t)$ is the predicted value, and $x^{(0)}(t)$ is the true value.
Step 3: Predict the initial error sequence $e(t)$ using the modified model, following the usual processing steps of that model. The error sequence obtained at this stage is called the new error sequence $\hat{e}(t)$.
Step 4: Combine the preliminary predictions $\hat{x}^{(0)}(t)$ and the new error sequence $\hat{e}(t)$, and obtain the final predictions from this formula: $\hat{x}(t) = \hat{x}^{(0)}(t) - \hat{e}(t)$.
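The four-step error correction can be sketched generically. To keep the example short, a least-squares trend line stands in for the base model (MGM or BP in this study) and a first-order autoregression stands in for the modified model (ARIMA in this study). Both stand-ins, the function names, and the sign convention (error defined as predicted minus true, so the predicted error is subtracted in the last step) are assumptions for illustration.

```python
import numpy as np

def linear_trend_base(y, steps):
    """Stand-in base model: fitted values plus `steps` forecasts of a trend line."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    t_all = np.arange(len(y) + steps)
    return slope * t_all + intercept

def ar1_error_model(e, steps):
    """Stand-in modified model: AR(1) fitted to the error sequence by least squares."""
    A = np.column_stack((np.ones(len(e) - 1), e[:-1]))
    const, coef = np.linalg.lstsq(A, e[1:], rcond=None)[0]
    fitted = np.r_[e[0], const + coef * e[:-1]]   # in-sample one-step predictions
    future, last = [], e[-1]
    for _ in range(steps):
        last = const + coef * last
        future.append(last)
    return np.r_[fitted, future]

def hybrid_forecast(y, steps):
    y = np.asarray(y, dtype=float)
    base = linear_trend_base(y, steps)       # Step 1: preliminary predictions
    e = base[:len(y)] - y                    # Step 2: initial error (predicted - true)
    e_hat = ar1_error_model(e, steps)        # Step 3: new error sequence
    return base - e_hat                      # Step 4: corrected predictions

# Synthetic data for illustration only
y = np.array([3.0, 4.1, 5.6, 6.2, 7.9, 8.4, 10.2, 10.9, 12.5, 13.1])
print(hybrid_forecast(y, steps=2)[-2:])
```

Replacing the two stand-in functions with an MGM or BP forecaster and an ARIMA forecaster would leave `hybrid_forecast` unchanged, which is the practical appeal of combining models at the level of prediction steps.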
In this study, two combined models, linear-modified-linear (MGM-ARIMA) and linear-modified-nonlinear (BP-ARIMA), were developed to predict India’s coal consumption. The two models are similar in that both use the ARIMA model as the modified model. They differ in that the MGM-ARIMA model uses the MGM model as the base model, whereas the BP-ARIMA model uses the BP model. Since the principles of the three single models involved have already been explained, Figure 3 briefly illustrates how they are combined.