Forecasting Economic Growth of the Group of Seven via Fractional-Order Gradient Descent Approach

This paper establishes a model of economic growth for the G7 countries from 1973 to 2016, in which the gross domestic product (GDP) is related to land area, arable land, population, school attendance, gross capital formation, exports of goods and services, general government final consumer spending and broad money. Fractional-order gradient descent and integer-order gradient descent are used to estimate the model parameters, fit the GDP and forecast GDP from 2017 to 2019. The results show that fractional-order gradient descent converges faster and achieves better fitting accuracy and prediction performance.

Gradient descent is generally used to solve unconstrained optimization problems and is widely used in evaluation and in other aspects. The rise of fractional calculus provides a new idea for advancing the gradient descent method. Although numerous achievements have been made separately in the two fields of fractional calculus and gradient descent, research combining the two is still in its infancy. Recently, ref. [11] applied fractional-order gradient descent to image processing and solved the problem of blurred image edges and texture details that arises with traditional integer-order denoising methods. Next, ref. [12] improved the fractional-order gradient descent method and used it to identify the parameters of the discrete deterministic system in advance. Thereafter, ref. [13] applied fractional-order gradient descent to the training of backpropagation (BP) neural networks and proved the monotonicity and convergence of the method.
Compared with traditional integer-order gradient descent, the combination of fractional calculus and gradient descent provides an additional degree of freedom, the order, and adjusting the order opens new possibilities for the algorithm. In this paper, economic growth models of seven countries are established, and their cost functions are minimized by gradient descent (fractional- and integer-order). To compare the performance of fractional- and integer-order gradient descent, we visualize the rate of convergence of the cost function, evaluate the model with the MSE, MAD and $R^2$ indicators, and predict the GDP of the seven countries in 2017-2019 according to the trained parameters.

The Group of Seven (G7)
The G6 was set up by France after western countries were hit by the first oil shock. In 1976, Canada's accession marked the birth of the G7, whose members are seven developed countries: the United States, the United Kingdom, France, Germany, Japan, Italy and Canada. The annual G7 summit focuses on major issues of common interest, such as inclusive economic growth, world peace and security, climate change and oceans, and has had a profound impact on global economic and political governance. In addition to the G7 members, there are a number of developing countries with large economies, such as China, India and Brazil. In the context of economic globalization, the study of G7 economic trends and economy-related factors can provide a useful reference for these countries' development.
The economic crisis broke out in western countries in 1973, and data for the seven countries are available from that year onward, so the data in this paper cover the period from 1973 to 2016. Some G7 members (France, Germany, Italy and the United Kingdom) were members of the European Union (EU) during this period, so this paper also establishes an economic growth model of the EU. Data for this article are from the World Bank.

Model Description
Variables are generally predicted with time series models [14] (for example, ARIMA and SARIMA) or with artificial neural networks [15,16], which have been very popular in recent years. A time series model mainly predicts the future trend of a variable, but it has difficulty reflecting changes in unexpected factors. A neural network model, in turn, has many parameters to adjust and a large space of network structures to choose from, its training efficiency is limited, and it overfits easily. By contrast, a linear model is simple in form and easy to build, and its weights intuitively express the importance of each attribute, so it has good explanatory ability. It is therefore reasonable to build a linear regression model of economic growth, from which one can clearly learn which factors have an impact on the economy.
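Next, we chose eight explanatory variables to describe economic growth. The explained variable is y, which refers to GDP and is a function of these variables. The expression for y is as follows:

$$y_i = \theta_0 + \sum_{j=1}^{8} \theta_j x_{ij} + \varepsilon_i, \quad i = 1, 2, \ldots, t, \tag{1}$$

where t is the number of years (t = 44), $\theta_0$ is the intercept, $\varepsilon_i$ is an unobservable random error term and $\theta_j$ represents the weight of each variable. The eight explanatory variables are:
$x_1$: land area (km$^2$)
$x_2$: arable land (hm$^2$)
$x_3$: population
$x_4$: school attendance (years)
$x_5$: gross capital formation
$x_6$: exports of goods and services
$x_7$: general government final consumer spending
$x_8$: broad money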

Fractional-Order Derivative
Depending on the conditions imposed, fractional calculus admits several definitions, the most common of which are Grünwald-Letnikov, Riemann-Liouville and Caputo. In this article, we adopt the definition of the fractional-order derivative in the Caputo form. Given the function f(t), the Caputo fractional-order derivative of order α is defined as follows:

$${}^{\mathrm{Caputo}}_{c}D^{\alpha}_{t} f(t) = \frac{1}{\Gamma(1-\alpha)} \int_{c}^{t} (t-\tau)^{-\alpha} f'(\tau)\, d\tau,$$

where ${}^{\mathrm{Caputo}}_{c}D^{\alpha}_{t}$ is the Caputo derivative operator, α is the fractional order with α ∈ (0, 1), Γ(·) is the gamma function and c is the initial value. For simplicity, ${}_{c}D^{\alpha}_{t}$ is used in this paper to represent the Caputo fractional derivative operator instead of ${}^{\mathrm{Caputo}}_{c}D^{\alpha}_{t}$. The Caputo fractional derivative has good properties. For example, its Laplace transform (taking c = 0) is as follows:

$$\mathcal{L}\left\{ {}_{0}D^{\alpha}_{t} f(t) \right\} = s^{\alpha} F(s) - \sum_{k=0}^{n-1} s^{\alpha-k-1} f^{(k)}(0),$$

where F(s) is the Laplace transform of f(t) and $n = \lceil \alpha \rceil$ is α rounded up to the nearest integer. It can be seen from the Laplace transform that the initial values required by the Caputo derivative are of the same form as those of integer-order differential equations and have a definite physical meaning. Therefore, Caputo fractional differentiation has a wide range of applications.
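As a numerical sanity check on the Caputo definition, the following minimal Python sketch approximates the defining integral for an order α in (0, 1) and can be compared with the known closed form for power functions, ${}_0D^{\alpha}_t\, t^2 = \frac{2}{\Gamma(3-\alpha)} t^{2-\alpha}$. The function name `caputo_derivative`, the midpoint rule and the substitution $u = (t-\tau)^{1-\alpha}$ are illustrative choices made here, not part of the method used in this paper.

```python
import math


def caputo_derivative(f_prime, t, alpha, c=0.0, steps=20000):
    """Approximate the Caputo derivative of order alpha in (0, 1) at t > c.

    f_prime is the ordinary first derivative of f. The weakly singular kernel
    (t - tau)^(-alpha) is removed with the substitution
    u = (t - tau)^(1 - alpha), which leaves the smooth integral
    (1 / (1 - alpha)) * integral_0^{(t-c)^(1-alpha)} f'(t - u^(1/(1-alpha))) du.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if t <= c:
        raise ValueError("t must be greater than the lower terminal c")
    beta = 1.0 - alpha
    upper = (t - c) ** beta
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h                  # midpoint of the k-th subinterval
        tau = t - u ** (1.0 / beta)        # map back to the original variable
        total += f_prime(tau)
    integral = total * h / beta
    return integral / math.gamma(beta)


# Compare with the closed form for f(t) = t^2, c = 0, alpha = 0.5 at t = 2.
numeric = caputo_derivative(lambda tau: 2.0 * tau, t=2.0, alpha=0.5)
closed_form = 2.0 / math.gamma(2.5) * 2.0 ** 1.5
print(numeric, closed_form)
```

For a linear function f(t) = t - c, whose derivative is 1, the same routine returns $(t-c)^{1-\alpha}/\Gamma(2-\alpha)$, which is the standard Caputo derivative of a first-degree power function.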

The Cost Function
The cost function (also known as the loss function) is essential for the majority of machine learning algorithms. Optimizing the model means minimizing the cost function, and the partial derivatives of the cost function with respect to the parameters form the gradient used in gradient descent. To select appropriate parameters θ for model (1) and minimize the modeling error, we introduce the cost function:

$$C(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2,$$

where $h_\theta(x^{(i)})$ is a modification of model (1), $h_\theta(x) = \theta_0 + \theta_1 x_1 + \cdots + \theta_j x_j$, which represents the output value of the model, $x^{(i)}$ are the sample features, $y^{(i)}$ is the true data and m is the number of samples (m = 44).
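The cost function can be sketched in a few lines of Python. The snippet assumes the usual least-squares form with a 1/(2m) factor, $C(\theta) = \frac{1}{2m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})^2$. The helper names `predict` and `cost` are hypothetical, and the toy data are made up for illustration; they are not the World Bank series.

```python
def predict(theta, x):
    """Model output h_theta(x) = theta_0 + theta_1*x_1 + ... + theta_j*x_j."""
    return theta[0] + sum(w * v for w, v in zip(theta[1:], x))


def cost(theta, X, y):
    """Least-squares cost with the assumed 1/(2m) normalisation."""
    m = len(y)
    squared_errors = ((predict(theta, xi) - yi) ** 2 for xi, yi in zip(X, y))
    return sum(squared_errors) / (2 * m)


# Toy data generated from y = 2 * x_1 (one feature, two samples).
X = [[1.0], [2.0]]
y = [2.0, 4.0]
print(cost([0.0, 2.0], X, y))  # exact parameters give zero cost
print(cost([0.0, 1.0], X, y))  # errors -1 and -2 give (1 + 4) / 4 = 1.25
```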

The Integer-Order Gradient Descent
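The first step of the integer-order gradient descent is to take the partial derivative of the cost function C(θ):

$$\frac{\partial C(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)},$$

with $x_0^{(i)} = 1$ for the intercept, and the update function is as follows:

$$\theta_j := \theta_j - \eta\, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}, \tag{4}$$

where η is the learning rate, η > 0.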
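A minimal sketch of the integer-order update follows, assuming the least-squares cost with a 1/(2m) factor so that the partial derivative is $\frac{1}{m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})\, x_j^{(i)}$. The function names, the learning rate and the toy data are illustrative assumptions, not the settings used in this paper.

```python
def predict(theta, x):
    """Model output h_theta(x) = theta_0 + theta_1*x_1 + ... + theta_j*x_j."""
    return theta[0] + sum(w * v for w, v in zip(theta[1:], x))


def gradient(theta, X, y):
    """Partial derivatives of C(theta) = 1/(2m) * sum of squared errors."""
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        error = predict(theta, xi) - yi
        grad[0] += error                     # intercept term, x_0 = 1
        for j, v in enumerate(xi, start=1):
            grad[j] += error * v
    return [g / m for g in grad]


def gradient_descent(X, y, theta, eta, iterations):
    """Apply theta_j := theta_j - eta * dC/dtheta_j, using all m samples per step."""
    theta = list(theta)
    for _ in range(iterations):
        grad = gradient(theta, X, y)
        theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta


# Toy data generated from y = 1 + 2 * x_1.
X = [[-1.0], [0.0], [1.0]]
y = [-1.0, 1.0, 3.0]
theta = gradient_descent(X, y, [0.0, 0.0], eta=0.5, iterations=200)
print(theta)  # approaches [1.0, 2.0]
```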

The Fractional-Order Gradient Descent
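The first step of fractional-order gradient descent is to find the fractional derivative of the cost function C(θ). According to Caputo's definition of the fractional derivative, from [17] we know that if g(h(t)) is a composite function of t, then its fractional derivative of order α with respect to t is

$${}_{c}D^{\alpha}_{t}\, g(h(t)) = \frac{\partial g(h)}{\partial h} \cdot {}_{c}D^{\alpha}_{t}\, h(t). \tag{5}$$

It can be seen from (5) that the fractional derivative of a composite function can be expressed as the product of an integer-order derivative and a fractional-order derivative. Therefore, ${}_{c}D^{\alpha}_{\theta_j} C(\theta)$ is calculated as follows:

$${}_{c}D^{\alpha}_{\theta_j} C(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) {}_{c}D^{\alpha}_{\theta_j} h_\theta(x^{(i)}) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}\, \frac{(\theta_j - c)^{1-\alpha}}{\Gamma(2-\alpha)},$$

and the update function is as follows:

$$\theta_j := \theta_j - \eta\, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}\, \frac{(\theta_j - c)^{1-\alpha}}{\Gamma(2-\alpha)}, \tag{6}$$

where η is the learning rate, η > 0, α is the fractional order, 0 < α < 1, and c is the initial value of Caputo's fractional derivative, with c < min{θ_j}.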
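The fractional-order update can be sketched in the same way. The snippet assumes that the Caputo derivative of the linear model with respect to $\theta_j$ is $x_j (\theta_j - c)^{1-\alpha} / \Gamma(2-\alpha)$, so each component of the ordinary gradient is rescaled by $(\theta_j - c)^{1-\alpha} / \Gamma(2-\alpha)$. The function names, the values of η, α and c, and the toy data are illustrative assumptions rather than the settings of this paper.

```python
import math


def predict(theta, x):
    """Model output h_theta(x) = theta_0 + theta_1*x_1 + ... + theta_j*x_j."""
    return theta[0] + sum(w * v for w, v in zip(theta[1:], x))


def fractional_gradient(theta, X, y, alpha, c):
    """Ordinary gradient rescaled by (theta_j - c)^(1 - alpha) / Gamma(2 - alpha)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if min(theta) <= c:
        # The power below needs a positive base, hence c < min{theta_j}.
        raise ValueError("c must be smaller than every theta_j")
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        error = predict(theta, xi) - yi
        grad[0] += error                     # intercept term, x_0 = 1
        for j, v in enumerate(xi, start=1):
            grad[j] += error * v
    scale = 1.0 / math.gamma(2.0 - alpha)
    return [g / m * scale * (t - c) ** (1.0 - alpha) for g, t in zip(grad, theta)]


def fractional_gradient_descent(X, y, theta, eta, alpha, c, iterations):
    """Iterate theta_j := theta_j - eta * (fractional derivative of C)."""
    theta = list(theta)
    for _ in range(iterations):
        grad = fractional_gradient(theta, X, y, alpha, c)
        theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta


# Toy data generated from y = 1 + 2 * x_1.
X = [[-1.0], [0.0], [1.0]]
y = [-1.0, 1.0, 3.0]
theta = fractional_gradient_descent(
    X, y, [0.5, 0.5], eta=0.3, alpha=0.9, c=-1.0, iterations=500
)
print(theta)  # approaches [1.0, 2.0]
```

Because the rescaling factor is strictly positive whenever $\theta_j > c$, the fractional gradient vanishes exactly where the ordinary gradient does, so in this sketch both methods share the same stationary point and differ only in the path and speed of the iterates.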

Model Evaluation Indexes
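We use the absolute relative error (ARE) to measure the prediction error:

$$\mathrm{ARE}_i = \left| \frac{y_i - \hat{y}_i}{y_i} \right|.$$

To evaluate the fitting quality of gradient descent on the model, the following three indicators are calculated.

The mean square error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2.$$

The coefficient of determination ($R^2$):

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}.$$

The mean absolute deviation (MAD):

$$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|.$$

In these formulas, n is the number of years (n = 44), $y_i$ and $\hat{y}_i$ are the real value and the model output, respectively, and $\bar{y}$ is the mean of the GDP.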
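The four indexes are straightforward to compute. The sketch below assumes their standard definitions (ARE as $|(y_i - \hat{y}_i)/y_i|$, MSE and MAD as averages over n, and $R^2$ as one minus the ratio of residual to total sum of squares). The function names and the three-point example are hypothetical.

```python
def absolute_relative_error(y_true, y_pred):
    """Per-sample ARE; requires non-zero true values."""
    return [abs((yt - yp) / yt) for yt, yp in zip(y_true, y_pred)]


def mean_square_error(y_true, y_pred):
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n


def mean_absolute_deviation(y_true, y_pred):
    n = len(y_true)
    return sum(abs(yt - yp) for yt, yp in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values: only the last prediction is off, by 1.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(mean_square_error(y_true, y_pred))        # 1/3
print(mean_absolute_deviation(y_true, y_pred))  # 1/3
print(r_squared(y_true, y_pred))                # 1 - 1/2 = 0.5
print(absolute_relative_error(y_true, y_pred))  # last entry is 1/3
```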

Main Results
In this article, we standardize the data for each country before running the algorithm, and each iteration that updates θ uses all m samples. The grid search method is used to select the appropriate learning rate and initial weight interval, and the effects of different fractional orders are compared to select the best order (see Table 1). The learning rate and the initial weight interval are applied to both fractional-order gradient descent and integer-order gradient descent.
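The text does not state which standardization is used. As one plausible reading, the sketch below applies a z-score to each column (subtract the mean, divide by the population standard deviation). The function names are hypothetical, and the handling of constant columns is an assumption added here so that a variable that does not change over the sample period does not cause a division by zero.

```python
import statistics


def standardize(values):
    """Z-score one column: (v - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        # Assumed convention: a constant column is mapped to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def standardize_columns(X):
    """Standardize every feature column of a row-major data matrix."""
    columns = [standardize(col) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]


# Made-up matrix: the second column is constant.
X = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
Z = standardize_columns(X)
print(Z)
```

After this step every non-constant column has mean 0 and unit variance, which keeps a single learning rate usable across variables measured in very different units.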

Comparison of Convergence Rate of Fractional and Integer Order Gradient Descent
To facilitate visual comparison, the update rules (4) and (6) are each iterated 50 times and their convergence rates are plotted (see Figure 1).
As shown in Figure 1, for each dataset and after the same number of iterations, fractional-order gradient descent converges faster than integer-order gradient descent. This indicates that combining fractional-order derivatives with gradient descent improves the convergence rate of the update equation over traditional integer-order gradient descent.

Fitting Result
Then, we fit GDP with integer-order gradient descent and fractional-order gradient descent, respectively. We first set a threshold and stop iterating when the gradient falls below it. The fitted curves are shown in Figure 2, and the performance evaluation of the model is shown in Table 2. It can be seen from Table 2 that the MSE, $R^2$ and MAD results of the GDP fitted by fractional-order gradient descent are better than those of the GDP fitted by integer-order gradient descent. This indicates that, under the same number of iterations, learning rate and initial weight interval, fractional-order gradient descent fits the data better than integer-order gradient descent.
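The threshold-based stopping rule can be sketched as a generic loop that accepts any gradient function, integer- or fractional-order. The text does not say how the size of the gradient is measured, so the snippet assumes the largest absolute component; the name `descend_until`, the tolerance and the iteration cap are likewise hypothetical.

```python
def descend_until(grad_fn, theta, eta, tol=1e-6, max_iter=10000):
    """Run gradient steps until every gradient component is below tol.

    grad_fn maps a parameter list to a gradient list of the same length.
    Returns the final parameters and the number of updates performed.
    """
    theta = list(theta)
    for iteration in range(max_iter):
        grad = grad_fn(theta)
        if max(abs(g) for g in grad) < tol:   # assumed stopping measure
            return theta, iteration
        theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta, max_iter


# One-dimensional example: minimise (theta - 3)^2, whose gradient is 2*(theta - 3).
theta, updates = descend_until(
    lambda th: [2.0 * (th[0] - 3.0)], [0.0], eta=0.25, tol=1e-8
)
print(theta, updates)
```

The iteration cap guards against a learning rate for which the gradient never falls below the threshold.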

Predicted Results
Finally, to test the prediction performance of fractional- and integer-order gradient descent on GDP, we forecast the GDP from 2017 to 2019 and use the ARE index to measure the prediction error (see Table 3).

Conclusions
In this paper, the gradient descent method is used to study linear model problems, which differs from [18,19]. The results show that, in addition to least squares estimation, the gradient descent method can also solve the regression analysis problem by iteratively minimizing the cost function, and it obtains good results without complicating the model. It also improves the interpretability of explanatory variables. We apply the fractional derivative to gradient descent and compare the performance of fractional-order gradient descent with that of integer-order gradient descent. It was found that fractional-order gradient descent has a faster convergence rate, higher fitting accuracy and lower prediction error than integer-order gradient descent. This provides an alternative method for fitting and forecasting GDP and has a certain reference value.