Fractional Derivatives for Economic Growth Modelling of the Group of Twenty: Application to Prediction

This paper studies the economic growth of the countries in the Group of Twenty (G20) in the period 1970–2018. It presents dynamic models for the world’s most important national economies, including for the first time several economies which are not highly developed. Additional care has been devoted to the number of years needed for an accurate short-term prediction of future outputs. Integer order and fractional order differential equation models were obtained from the data. Their output is the gross domestic product (GDP) of a G20 country. Models are multi-input; GDP is found from all or some of the following variables: country’s land area, arable land, population, school attendance, gross capital formation (GCF), exports of goods and services, general government final consumption expenditure (GGFCE), and broad money (M3). Results confirm the better performance of fractional models. This has been established employing several summary statistics. Fractional models do not require increasing the number of parameters, nor do they sacrifice the ability to predict GDP evolution in the short term. It was found that data over 15 years allows building a model with a satisfactory prediction of the evolution of the GDP.


Introduction
In this paper, models of economic growth are developed. The economies considered are those of the member countries of the Group of Twenty (G20). The period under consideration consists of the years from 1970 until 2018. The gross domestic product (GDP) is obtained as the output of a dynamic system with eight input variables. The best models found employ derivatives of fractional order. These models are compared with alternative versions with integer order derivatives only. The comparison employs several statistical tools commonly used to assess the ability of a model to predict future outputs. In this manner, the ability to predict the evolution of the GDP in the short term is demonstrated. This paper follows a sequence of similar models obtained for Portugal, Spain [1], France, Italy [2], all the EU member-states [3], the Group of Seven [4], and China [5]. The success of fractional order models for this purpose has been justified as consistent with the mechanisms of economic growth, and is supported by the results.
Of the countries mentioned above, only China is not a highly developed country. Hence, this paper presents, for the first time, long-term fractional order economic growth models for several countries which are not highly developed, helping to show that such models are also suitable for this particular case. It also includes a study of the number of years that should be included in each model to optimise results.
The remainder of this paper is organised in the following manner. Section 2 describes the G20 and presents a short state-of-the-art. Section 3 explains the methodology followed for economic growth

GDP Models
The models presented below rely on the following assumption: the evolution of the GDP is a result of variables of two types. Variables of the first type reflect available resources; variables of the second type reflect impacts on the economy. Consequently, the first model structure conceived for the GDP is a linear integer order differential equation, which is, for each country of the G20, given by:

$$y(t) = C_1 x_1(t) + C_2 x_2(t) + C_3 x_3(t) + C_4 x_4(t) + C_5 \int x_5(t)\,dt + C_6 x_6(t) + C_7 x_7(t) + C_8 x_8(t) + C_9 \frac{dx_9(t)}{dt} \qquad (1)$$

Variables are as follows: $y$ is the GDP; $x_1$ is the country's land area; $x_2$ is the arable land; $x_3$ is the population; $x_4$ is the school attendance; $x_5$ and $x_9$ are both the gross capital formation (GCF); $x_6$ is the exports of goods and services; $x_7$ is the general government final consumption expenditure (GGFCE); and $x_8$ is broad money (M3).

Keynesian models for the dynamics of economies usually consider as inputs variables that have short-term impacts on the economy. Growth accounting usually favors a more long-term approach (see examples in [20][21][22], and the discussion in [23] about the factors economic growth depends upon). The variables above combine both. Notice that, since GCF appears twice in the model with different roles, two different variables ($x_5$ and $x_9$) are used to denote it, to make its role clearer.
As explained below in Section 4.1, not all variables in (1) have the same importance for the accuracy of the model. Their relative importance was found for each country for the whole time period. In this manner simpler models could be obtained. In particular, a second integer order model, with five variables only, was taken as an alternative:

$$y(t) = C_1 x_1(t) + C_3 x_3(t) + C_5 \int x_5(t)\,dt + C_6 x_6(t) + C_7 x_7(t) \qquad (2)$$

Impacts on the economy have effects that are felt for an extended period of time. Of course, this effect wanes over time. Such a behavior can be modelled with fractional derivatives, since fractional derivatives are operators with memory [15,24]. In other words, the fractional derivative of a function is not a local operator: its value depends on past values of the function. Depending on the particular order employed, this memory of past values can correspond to weights of the said past values that vanish for older time instants. This is the reason fractional derivatives are used to model phenomena such as distributions corresponding to power laws, long tails in general, or chaotic systems [25].
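The fading-memory statement can be made concrete with one common discretisation of a fractional derivative. The Grünwald-Letnikov form below is used here only as an illustration, and is not necessarily the definition adopted in this paper; the step $h$ and the weights $w_j^{(\alpha)}$ are notation introduced for this sketch.

```latex
D^{\alpha} x(t) \approx h^{-\alpha} \sum_{j=0}^{\lfloor t/h \rfloor} w_j^{(\alpha)}\, x(t - jh),
\qquad
w_j^{(\alpha)} = (-1)^j \binom{\alpha}{j},
\qquad
\left| w_j^{(\alpha)} \right| \sim \frac{j^{-\alpha-1}}{\left|\Gamma(-\alpha)\right|}
\quad (j \to \infty,\ \alpha \notin \mathbb{N}_0).
```

Under this form, for any non-integer order $\alpha > -1$ the weight given to a sample $j$ steps in the past decays as a power law in $j$, whereas for $\alpha = -1$ (plain integration) all past samples are weighted equally.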
Hence, a fractional generalization of model (2) was considered. Rather than using more variables, and thus recovering model (1), the variables of model (2) representing impacts, and only those, are subjected to a fractional derivative. Such variables are $x_5$, $x_6$ and $x_7$, so the considered model was:

$$y(t) = C_1 x_1(t) + C_3 x_3(t) + C_5 D^{\alpha_5} x_5(t) + C_6 D^{\alpha_6} x_6(t) + C_7 D^{\alpha_7} x_7(t) \qquad (3)$$

The differentiation orders $\alpha_5$, $\alpha_6$ and $\alpha_7$ can be positive or negative. This type of generalization has already been successfully used in our works [1][2][3][4].
Notice that the resulting fractional model has eight parameters. This is one fewer than the nine parameters of the original integer model (1). As the number of parameters is similar, the comparison between the performance of the two is fair. The extra parameter of the integer model gives it a slight advantage; consequently, should the performance turn out to be the same, the fractional model will be considered better, since it achieves the same results with one parameter fewer.
The fractional differentiation operator $D^{\alpha_k}$ was numerically implemented following the Caputo definition [24], as ${}_0D_t^{\alpha_k} x_k(t)$. Years are counted from 1970, which thus corresponds to the lower terminal 0. Terms for initial conditions were not included. Consequently, the effects of inputs are considered only from 1970 on. This approximation reduces the statistical data needed to develop models and was used in previously published works, where it has provided acceptable results [4].
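As a rough illustration of how a fractional operator with lower terminal 0 and no initial-condition terms can be evaluated on yearly samples, the following minimal sketch uses a Grünwald-Letnikov finite-difference approximation. This is a swapped-in textbook scheme, not the numerical implementation used by the authors; the function names and the sample series are hypothetical.

```python
def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_j = (-1)^j * binom(alpha, j) for j = 0..n."""
    w = [1.0]
    for j in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / j))
    return w


def gl_derivative(x, alpha, h=1.0):
    """Approximate the order-alpha derivative of samples x, lower terminal at x[0].

    Samples before x[0] are treated as zero (no initial-condition terms).
    alpha > 0 differentiates, alpha < 0 integrates, alpha = 0 returns x unchanged.
    """
    w = gl_weights(alpha, len(x) - 1)
    scale = h ** (-alpha)
    return [scale * sum(w[j] * x[n - j] for j in range(n + 1))
            for n in range(len(x))]


# Hypothetical yearly series (arbitrary units), first sample standing for 1970.
series = [1.0, 2.0, 4.0, 7.0, 11.0]
print(gl_derivative(series, 1.0))   # backward differences
print(gl_derivative(series, -1.0))  # running sums (rectangle-rule integral)
print(gl_derivative(series, 0.5))   # half-order derivative
```

The integer cases are a useful sanity check: order 1 reduces to a backward difference and order -1 to a running sum, so a single routine covers both positive and negative orders.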

Optimizing and Assessing Performance
A fitting procedure implemented in MATLAB was used to find models (1)-(3) for each of the G20 countries. This procedure relies on Nelder-Mead's simplex search method. MATLAB's implementation from function fminsearch was used. The objective was the minimization of the mean square error (MSE):

$$\mathrm{MSE} = \frac{1}{N} \sum_{j=1}^{N} \left( y_j - \hat{y}_j \right)^2 \qquad (4)$$

where $N$ is the number of years ($N = 49$ in this case), and $y_j$ and $\hat{y}_j$ are the GDP and the model's GDP estimate, respectively. Several performance indexes other than the MSE were used from function regstats to further evaluate the quality of the resulting models, including the t-values and p-values for each variable.
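The fitting step can be sketched as follows. This is a compact textbook variant of the Nelder-Mead simplex search written for illustration only; it is not MATLAB's fminsearch, and the two-coefficient linear model and the data are hypothetical.

```python
def mse(y, y_hat):
    """Mean square error between observations y and estimates y_hat."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def nelder_mead(f, x0, step=1.0, tol=1e-12, max_iter=5000):
    """Textbook Nelder-Mead: reflection, expansion, inside contraction, shrink."""
    n = len(x0)
    simplex = [list(x0)] + [
        [v + (step if i == k else 0.0) for i, v in enumerate(x0)]
        for k in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if f(worst) - f(best) < tol:
            break
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [b + 0.5 * (v - b) for b, v in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


# Hypothetical inputs and output generated with coefficients (2.0, 0.5).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
y = [2.0 * a + 0.5 * b for a, b in zip(x1, x2)]


def cost(c):
    """MSE of the toy linear model y_hat = c[0]*x1 + c[1]*x2."""
    return mse(y, [c[0] * a + c[1] * b for a, b in zip(x1, x2)])


coeffs = nelder_mead(cost, [0.0, 0.0])
print(coeffs, cost(coeffs))
```

A derivative-free search of this kind is convenient when the differentiation orders are among the unknowns, because the cost then depends on them in a way that is awkward to differentiate analytically.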
In Section 4 it will be shown that not all nine variables $x_1, x_2, \ldots, x_9$ were necessary for every single model given by (1). This was already the case in models for other countries [1][2][3][4]. This result was established in three ways. First, from the t- and p-values for each variable. Second, from performance indexes MAD and $R^2$, which should not be significantly worse when one or more variables are removed from the model, if those variables are indeed unnecessary. Third, from the Akaike information criterion (AIC), which penalizes the number $K$ of model parameters. The value of the AIC itself does not give information about the quality of a model, but comparing the AIC values of different models does. With such a comparison, it is possible to find out which models have a higher probability of being good models for the data. In fact, a lower value of the AIC denotes a higher probability of a model being the best.
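Assuming that there are $M$ models, the probability of model $i$ being the best can be normalized as the Akaike weight $w_i$, $i = 1, \ldots, M$, by:

$$w_i = \frac{\exp\left(-\frac{1}{2}\Delta_i\right)}{\sum_{m=1}^{M} \exp\left(-\frac{1}{2}\Delta_m\right)}, \qquad \Delta_i = \mathrm{AIC}_i - \min_{m} \mathrm{AIC}_m$$

In this way, models given by (2) and (3) were developed.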
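A minimal sketch of the model comparison follows. The least-squares form AIC = N ln(MSE) + 2K is an assumption made for this example and may differ from the exact expression used in the paper; the Akaike weights follow their standard definition. The MSE values are hypothetical, while the parameter counts (9, 5, and 8) are those of models (1), (2), and (3).

```python
import math


def aic_least_squares(mse_value, n_obs, n_params):
    """Assumed least-squares form of the AIC: N * ln(MSE) + 2 * K."""
    return n_obs * math.log(mse_value) + 2 * n_params


def akaike_weights(aic_values):
    """Normalised relative likelihoods exp(-delta_i / 2), delta_i = AIC_i - min AIC."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical MSE values for models (1), (2), (3) fitted to N = 49 years.
candidates = [(4.0, 9), (5.0, 5), (2.0, 8)]
aics = [aic_least_squares(m, 49, k) for m, k in candidates]
weights = akaike_weights(aics)
print(aics)
print(weights)
```

Subtracting the smallest AIC before exponentiating leaves the weights unchanged and avoids overflow or underflow when the AIC values are large in magnitude.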

Models Found from Data for Different Numbers of Years
For each of the expressions (1)-(3), four models were obtained. The first uses the data for the entire 1970-2018 period, so as to obtain a long-term fit. This method, however, may lead to overfitting, which can be caused by data from too many years in the past having an excessive influence on the parameters. Furthermore, it is impossible to assess the capability of this model of economic growth to predict the future evolution of the economy, because there are no additional years of data for testing the prediction ability.
To improve on this, three models for shorter time ranges were obtained. To find out which numbers of years could be reasonable, trend lines were found for the GDP of each country. Both linear and exponential trend lines were obtained; the former provided a better fit for some countries, and the latter for others. Finally, a fast Fourier transform (FFT) was used to obtain, for the different trends, the spectral content of the oscillations $y(t) - \tilde{y}(t)$. This was done in [4] to obtain the best time ranges of models. In the present case, Figure 1 shows the spectral content of these oscillations for all countries, normalized so that every curve peaks at 1. It can be seen that, in the G20, economies do not have similar periods of oscillation around the corresponding trends. Within the frequencies where most peaks take place, three reasonable values of time ranges were chosen: periods of 5, 10, and 15 years. In this way, for each country, using (4) as cost function, 34 models were found for $N = 15$, for the periods 1970-1985, 1971-1986, 1972-1987, and so on, in the manner of a moving window; and similarly for $N = 10$ (39 models) and $N = 5$ (44 models). This was done separately for models given by (1)-(3). Each of these models can be tested using the data of the years following its fitting period. In this manner, it is possible to check how good the model is at predicting GDP values which were not used to adjust its parameters. The quality of the prediction was measured with performance indicators MSE, $R^2$, MAD, AIC, and $w$ for each country.
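The moving-window scheme described above can be sketched as follows. The convention that a window runs from a start year to start + N is inferred from the quoted periods (1970-1985 for N = 15) and from the stated model counts; the function names are hypothetical.

```python
FIRST_YEAR, LAST_YEAR = 1970, 2018


def fitting_windows(n_years, first=FIRST_YEAR, last=LAST_YEAR):
    """Return (start, end) year pairs with end = start + n_years, sliding by one year."""
    return [(s, s + n_years) for s in range(first, last - n_years + 1)]


def split_for_prediction(years, window):
    """Split years into those used for fitting and those left for testing predictions."""
    start, end = window
    fit = [y for y in years if start <= y <= end]
    future = [y for y in years if y > end]
    return fit, future


for n in (15, 10, 5):
    windows = fitting_windows(n)
    print(n, len(windows), windows[0], windows[-1])

fit_years, future_years = split_for_prediction(
    list(range(FIRST_YEAR, LAST_YEAR + 1)), fitting_windows(15)[0]
)
print(fit_years[0], fit_years[-1], future_years[0], future_years[-1])
```

With this convention the counts come out as 34, 39, and 44 windows for N = 15, 10, and 5, matching the numbers of models quoted in the text.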
The GDP of different countries has different orders of magnitude. To make model performance comparison easier, the figures below present the $R^2$ performance index, to show the quality of predictions obtained with each N-year model. The $R^2$ is always in a normalized range, irrespective of the magnitude of the variable under study, which makes it particularly suited for this visual purpose. In this way, all the important characteristics of the different models in relation to the others can be studied.
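The scale independence of this index can be checked with a short sketch. The definition used below, one minus the ratio of the residual sum of squares to the total sum of squares, is the usual coefficient of determination and is assumed here; the data are hypothetical.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical GDP values and estimates in arbitrary units.
gdp = [1.0, 2.0, 3.0, 4.0, 5.0]
estimate = [1.1, 1.9, 3.2, 3.8, 5.0]

small = r_squared(gdp, estimate)
# Same data expressed in units one million times smaller.
large = r_squared([1e6 * v for v in gdp], [1e6 * v for v in estimate])
print(small, large)
```

Both calls give the same value up to rounding, which is why economies of very different sizes can share one axis when this index is plotted.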

Results
This section presents the models obtained, as well as their performance predicting the GDP of G20 countries. Due to its length, a full tabulation of results is not included in the paper, but is available in [26]. Data sources are described in Appendix A.

Performance indices of the models are given in Tables 1-3. In those tables, the t-values given in bold are those corresponding to variables which, assuming a 5% significance level, are necessary for the model. This information is also given in Table 4. It turns out that the variables important for modelling six or more countries are $x_1$, $x_3$, $x_5$, $x_6$, and $x_7$. That is why model (1) could be simplified into model (2), which is in turn generalized to fractional orders by model (3), only considering $x_5$, $x_6$, and $x_7$ to have fractional influence.

Figure caption (partially recovered): [...] (1) and (2) and with fractional model (3). $R^2$ values are given to show the quality of the results of each model. As GDPs have different orders of magnitude, different scales were used in the y-axis for different countries.

Table 2. Performance indices of the different models obtained for the G20 members in Figure 3; for an explanation of performance assessment see Section 3.2.

Table 3. Performance indices of the different models obtained for the G20 members in Figure 4; for an explanation of performance assessment see Section 3.2.

As can be observed, the MSE, $R^2$ and MAD lead to the same conclusion: the performance of models given by (3) is clearly better than the performance of integer models, as regards the quality of the fit during the period used to build each model. This happens for all sixteen countries. The Akaike weight, summarized in the last row for every country, also supports that models (3) are the best of the three for this purpose.
Figures 5-9 show the performance of integer and fractional models of a group of selected countries (one per continent), namely Australia, the European Union, India, South Africa, and the United States of America, for N = 5, 10 and 15 years, predicting the future evolution of the GDP. Showing results for all countries would take too much space; the results obtained with all models and performance indices can therefore be found in [26] for all countries.

Models for N-Year Period
Notice that models obtained with data from periods beginning in the 1970s can be used to predict the GDP for many years until 2018. On the other hand, models developed with data from periods ending in the 21st century can be used to predict the GDP for a few years only. Furthermore, predictions for many years into the future have, as can be expected, a lower performance than those for years close to the end of the data from which the model was obtained. In fact, the performances in Figures 5-9 deteriorate over time, but are quite good at prediction for a short period, and here again fractional models show their better performance, as their $R^2$ values decrease less.
As far as the number of years used for prediction is concerned, it was observed that the smaller the value of N, the better the fit (the MSE obtained for every N-year period was very close to zero) but the lower the ability to predict the GDP in the future: the values of $R^2$ were the smallest of the three cases. This was especially clear for integer model (1). Conversely, the larger the value of N, the poorer the fit (the higher the MSE), but the better the prediction. Notice that the values of $R^2$ for N = 15 were close to 1, especially for predictions with fractional model (3).
Hence, in order to predict the economic growth of a G20 country reliably, it is necessary to consider a relatively long period of years.

Conclusions
The models of economic growth, of both integer and fractional order, presented in this paper for countries of the Group of Twenty (G20), from 1970 to 2018, are satisfactory. The variables chosen to predict variations of gross domestic product (GDP) prove to be suitable for the desired purpose.
It is clear from the results obtained that the performance of fractional models is superior. This statement is quantitatively backed by several indexes. Fractional models do not require additional parameters, nor do they sacrifice the ability to predict the evolution of the GDP in the short term. As to the number of years needed to build acceptable models, results show that N = 15 years leads to the best results.
The methodology followed in this paper can be further applied to more countries, and eventually generalised to more variables. Database [27], for instance, includes many time series, usually of good coherence, for all countries, that could be tested in the systematic manner described. The main difficulty is the disparity in the number of years for which time series are available; while for the G20 we could complete the missing values for eight variables, sixteen countries, and forty-nine years, this would likely be very difficult or even impossible if the number of variables, countries or years were increased. So it would be necessary to improve this methodology in a manner that would cope with missing data and still be able to find, validate and compare models.

Funding:
This research was supported in part by the Consejería de Economía, Ciencia y Agenda Digital (Junta de Extremadura) under the grant "Ayuda a Grupos de Investigación de Extremadura" (no. GR18159), in part by the European Regional Development Fund "A way to make Europe", and in part by FCT, through IDMEC, under LAETA, project UID/EMS/50022/2019.

Acknowledgments:
The authors would like to thank José Emilio Traver for his help in editing the graphical abstract.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Data Sources
This appendix lists the sources of the data used in this paper. The data are not tabulated here because of their size; they are available in [28].

• As in [4], variables for the EUU were the sum of the figures for its member states in each year. The only exception was $x_4$, addressed below.
• The source for the GDP, $x_1$, $x_2$, $x_3$, $x_5$, $x_6$, and $x_7$ was [27].
• Variable $x_2$ was available until 2016 only. It was assumed that $x_2(2017{:}2018) = x_2(2016)$. For Belgium and Luxembourg, which are member states of the EUU, there is no $x_2$ data until 2000. Thus, $x_2$ was assumed constant until that year. This approximation corresponds, in the worst case, to an error in $x_2$ of 1.9% for the EUU during those years.
• The source for $x_1$ and $x_3$ for DEU until 1990 was [29]. In the same period, figures for $x_2$ were reduced in the same proportion.
• The source for $x_4$ was [30] until 2010. Figures are available with a 5-year period only, and were interpolated with a third-order spline. The figure for 2010 was extended into the future, using the increase rate of the figures in [31], also interpolated with a third-order spline. However, figures for the following member states of the EUU are not found in [30]: Croatia, Estonia, Latvia, Lithuania, Slovenia, Slovak Republic. The source for $x_4$ for these states was [27]. The EUU figure for $x_4$ is a weighted average of the figures for the member states in each year. The weight is the share of each state in $x_3$.
• Figures for $x_5$, $x_6$, and $x_7$ for JPN and USA for 2018 are those of 2017, updated with the yearly growth rate of the index in [32].
• The source for $x_7$ for ARG until 1992 was [27]. In the 1993-2018 period, the figure for 1992 was updated with the yearly growth rate of the index in [32].
• The source for $x_8$ for CAN until 2008 was [27]. In the 2009-2018 period, the figure for 2008 was updated with the yearly growth rate of the index in [32].
• The source for $x_8$ for DEU, FRA, ITA and other states of the EUU until 2015 was [33]. Figures were converted to 2010 US$ using the price index in [27]. In the 2016-2018 period, the figure for 2015 was updated with the growth rate in [34][35][36] for DEU, FRA, and ITA, respectively. However, figures for $x_8$ for Luxembourg and Romania in [33] are only available until 2011 and 2013, respectively. The figure for the last year was updated with the growth rate of [27].
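Two of the data-completion steps listed above, extending a series with the yearly growth rate of an index and forming a weighted average across member states, can be sketched as follows. This is an illustrative reading of those steps, not the authors' code; the function names and all figures are hypothetical.

```python
def extend_with_growth(last_value, index_values):
    """Extend a series past its last known value using an index's year-on-year growth.

    index_values[0] is the index in the last year with data; the remaining
    entries are the index in the following years.
    """
    extended = []
    value = last_value
    for previous, current in zip(index_values, index_values[1:]):
        value *= current / previous
        extended.append(value)
    return extended


def weighted_average(values, weights):
    """Average of values weighted by weights (e.g., each state's share of population)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical figure for the last available year and an index growing 10% per year.
print(extend_with_growth(100.0, [50.0, 55.0, 60.5]))

# Hypothetical school-attendance figures for two states, weighted by population.
print(weighted_average([10.0, 20.0], [1.0, 3.0]))
```

Because only ratios of consecutive index values are used, the index may be expressed in any unit or base year.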