1. Introduction
This paper builds an economic model from 1973–2016 data for the Group of Seven (G7), using the adaptive lasso method and BP neural network models to describe the evolution of the gross domestic product (GDP) in terms of several factors. The quality of the models is assessed, showing the advantage of using BP neural network models for this purpose. The ability to predict the short-term evolution of the GDP is also assessed. Possible explanations are presented for why this is so, and for the mechanism behind the BP neural network models. The results show how the weights of the neural network and the number of hidden layers can be used to better fit the data, and the models are expected to forecast future GDP growth.
The BP neural network model has long been used to develop financial and economic models. However, there is still room for improving the accuracy of such models. In recent years, applications of the BP neural network to economic growth have been studied in [1,2,3,4]. The grey prediction model [5] is widely used to forecast economic growth, and experimental results show that the BP neural network model is superior to the grey prediction model. In addition, fractional calculus is widely used to construct economic models because it incorporates the effects of memory in evolutionary processes, and experimental results show that fractional order models are superior to integer order models, as in [6,7,8,9,10,11,12,13,14].
Recently, a better way to eliminate redundant variables was proposed in [15], and Ming et al. [9] improved the fractional EGM model of [16], but the error is still difficult to control.
In this paper, we adopt the idea in [15] and the BP neural network economic model in [5] to study GDP growth in the Group of Seven countries. In order to compare the fitting performance of the BP neural network and the fractional order model, we use the mean absolute deviation, the coefficient of determination, and the Bayesian information criterion (BIC). Finally, we use a relative error evaluation to show the prediction performance of the model.
In summary, based on the BP neural network model, this paper models the economic growth of the Group of Seven countries. Through a case study, it is shown that the BP neural network yields a smaller error in forecasting GDP.
The G7
The G7 is a forum for major industrial countries to meet and discuss policy, including the United States (USA), the United Kingdom (GBR), Germany (DEU), France (FRA), Japan (JPN), Italy (ITA), Canada (CAN), and the European Union (EUU). In the early 1970s, after the first oil crisis hit the western economy, six major industrial countries (the United States, Britain, Germany, France, Japan, and Italy) established the G6 at the initiative of France in November 1975. Canada joined the following year, and the Group of Seven (G7) was born. The addition of Russia in 1997 transformed the G7 into the G8; the Group of Seven (G7) is thus the predecessor of the Group of Eight (G8). On 4 June 2014, the G7 leaders' meeting, hosted by the European Union, took place in Brussels, Belgium on the evening of the 4th. This was the first time Russia had been excluded since joining the group in 1997. The summit discussed foreign policy, economic, trade, and energy security issues.
Some of the members of the G7 have economies of comparable size, but not the economies of rapidly developing countries such as China and India. Therefore, it is necessary to study the factors related to the changes in GDP of these countries in order to compare the similarities and differences in GDP growth across countries.
We collected data for the G7 countries over the years 1973–2016, a total of 44 years, and used the eight variables obtained to establish different BP neural network models that describe the changes in GDP in the different countries.
2. Model Description
We selected the following eight explanatory variables (see Table 1) in this paper: land area (LA) (km²), arable land (AL) (hectares), total population (TP) (million), average years of schooling (AYS) (years), gross capital formation (GCF) (dollars), exports of goods and services (EGS) (dollars), general government final consumption expenditure (GGFCE) (dollars), and broad money (BM) (dollars). The data in this section were recorded by the World Bank over 1973–2016.
In order to express this more simply, we have defined the symbols as follows:
Table 1.
Symbol definitions.

| $x_1$ | $x_2$ | $x_3$ | $x_4$ | $x_5$ | $x_6$ | $x_7$ | $x_8$ | $y$ | $t$ |
|---|---|---|---|---|---|---|---|---|---|
| LA | AL | TP | AYS | GCF | EGS | GGFCE | BM | GDP | year |
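For readers following the setup in code, the correspondence of Table 1 can be kept as a simple mapping; a minimal sketch is shown below, where the descriptive strings are our own phrasing of the World Bank variables rather than text taken from the paper.

```python
# Symbols of Table 1 mapped to the explanatory variables, the response, and time.
SYMBOLS = {
    "x1": "land area (LA, km^2)",
    "x2": "arable land (AL, hectares)",
    "x3": "total population (TP, million)",
    "x4": "average years of schooling (AYS, years)",
    "x5": "gross capital formation (GCF, dollars)",
    "x6": "exports of goods and services (EGS, dollars)",
    "x7": "general government final consumption expenditure (GGFCE, dollars)",
    "x8": "broad money (BM, dollars)",
    "y":  "GDP",
    "t":  "year",
}
```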
Thus, the adaptive lasso method and the BP neural network model are considered as follows.
2.1. Adaptive Lasso Method
Although the lasso method is widely used in high-dimensional data analysis, it has some shortcomings; for example, it does not possess the so-called oracle property proposed by Fan and Li [17], namely unbiasedness, sparsity, and continuity [17]. The lasso is essentially an improvement on ridge regression: its penalty exploits the singularity of the derivative of the absolute value function at zero, which allows it to compress the coefficients of one or more unimportant variables exactly to zero. However, because the coefficients of all variables are compressed to some extent, the estimator does not satisfy the unbiasedness requirement. Thus, Zou [15] proposed the adaptive lasso method, which possesses the oracle property.
The method uses the least squares coefficient estimates under the full model to construct variable-specific penalty weights. Specifically, a variable whose coefficient has a large absolute value is likely to belong to the true model, so it receives a small penalty; conversely, a variable whose coefficient has a small absolute value receives a large penalty. Based on this idea, the adaptive lasso is defined as follows:
$$\hat{\beta} = \arg\min_{\beta}\left\{\left\|y - \sum_{j=1}^{p}x_j\beta_j\right\|^{2} + \lambda\sum_{j=1}^{p}\hat{w}_j\left|\beta_j\right|\right\}, \qquad \hat{w}_j = \frac{1}{\left|\hat{\beta}_j^{(0)}\right|^{\gamma}}, \tag{1}$$
where $\lambda$ and $\gamma$ are two non-negative adjustment parameters, and $\hat{\beta}^{(0)}$ is the initial least squares coefficient estimate. Therefore, the lasso algorithm can be used directly in the computation of the adaptive lasso and, because the weights $\hat{w}_j$ are introduced, the compression of the non-zero coefficients by the lasso is weakened, thereby reducing the bias and attaining the asymptotic oracle properties. Here, we point out that the adaptive lasso is a convex optimization problem, so we do not have to worry about multiple local minima, and it can be solved quickly with the related algorithms. For the selection of the adjustment parameters that yields the optimal solution, please refer to Shojaie and Michailidis [18]. In addition, Fan and Li [17] proposed the local quadratic approximation (LQA) algorithm, which can also be used to compute the adaptive lasso estimate, and each of these algorithms has its advantages and disadvantages. The R package used in this paper is based on the LQA algorithm, so the algorithm is briefly introduced as follows. Let $p_{\lambda_j}(|\beta_j|) = \lambda\hat{w}_j|\beta_j|$ denote the penalty applied to the $j$-th coefficient, where $\beta^{(0)}$ is the selected initial solution. Take $\beta^{(0)}$ as the starting point; when $\beta_j^{(0)} \neq 0$, by simple calculation,
$$\frac{\partial}{\partial\beta_j}\,p_{\lambda_j}(|\beta_j|) = p'_{\lambda_j}(|\beta_j|)\operatorname{sgn}(\beta_j)$$
and
$$p'_{\lambda_j}(|\beta_j|)\operatorname{sgn}(\beta_j) \approx \frac{p'_{\lambda_j}(|\beta_j^{(0)}|)}{|\beta_j^{(0)}|}\,\beta_j.$$
Perform a second-order Taylor expansion on the penalty term of Equation (1), omitting the higher-order infinitesimal terms:
$$p_{\lambda_j}(|\beta_j|) \approx p_{\lambda_j}(|\beta_j^{(0)}|) + \frac{1}{2}\,\frac{p'_{\lambda_j}(|\beta_j^{(0)}|)}{|\beta_j^{(0)}|}\left(\beta_j^{2} - \beta_j^{(0)2}\right).$$
Then, use the Newton-Raphson iterative method to compute the solution, where the process is as follows:
Calculate the lasso solution by using the LARS algorithm mentioned above and take it as the initial solution $\beta^{(0)}$;
Let $\beta^{(k+1)} = \left(X^{\top}X + \Sigma_{\lambda}(\beta^{(k)})\right)^{-1}X^{\top}y$, where $\Sigma_{\lambda}(\beta^{(k)}) = \operatorname{diag}\left\{p'_{\lambda_j}(|\beta_j^{(k)}|)/|\beta_j^{(k)}|\right\}$;
For a sufficiently small positive number $\varepsilon$, the algorithm stops when $\left\|\beta^{(k+1)} - \beta^{(k)}\right\| < \varepsilon$.
The method is relatively fast and the algorithm is relatively stable, but its disadvantage is that once a regression parameter becomes 0 during the iterations, the corresponding variable is permanently excluded from the model. In addition, the result of the algorithm depends on the selected precision $\varepsilon$: different values of $\varepsilon$ may lead to some differences in the sparsity of the model and in the parameter estimates. For more details, see Fan and Li [17].
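To illustrate the procedure described above, a minimal numerical sketch of the adaptive lasso solved by local quadratic approximation is given below. It assumes a least-squares loss, adaptive weights computed from an ordinary least squares fit, and a stopping rule based on the precision eps; it is a sketch under these assumptions, not the R implementation used in this paper.

```python
import numpy as np

def adaptive_lasso_lqa(X, y, lam=1.0, gamma=1.0, eps=1e-6, max_iter=100):
    """Sketch of the adaptive lasso via local quadratic approximation (LQA).

    Minimizes 0.5 * ||y - X b||^2 + lam * sum_j w_j * |b_j|,
    with adaptive weights w_j = 1 / |b_ols_j| ** gamma.
    Assumed setup for illustration, not the paper's exact implementation.
    """
    n, p = X.shape
    # Initial solution: ordinary least squares on the full model.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (np.abs(beta) ** gamma + 1e-12)      # adaptive penalty weights

    XtX, Xty = X.T @ X, X.T @ y
    active = np.abs(beta) > eps                    # variables still in the model
    for _ in range(max_iter):
        if not active.any():
            break
        # Quadratic approximation of the penalty around the current beta:
        # lam * w_j * |b_j|  ~  const + 0.5 * (lam * w_j / |b_j^(k)|) * b_j ** 2.
        d = lam * w[active] / np.maximum(np.abs(beta[active]), eps)
        # Newton-Raphson (ridge-type) update on the active set.
        beta_new = np.zeros(p)
        beta_new[active] = np.linalg.solve(
            XtX[np.ix_(active, active)] + np.diag(d), Xty[active]
        )
        converged = np.linalg.norm(beta_new - beta) < eps
        beta = beta_new
        # A coefficient driven to (numerical) zero stays out of the model.
        active = np.abs(beta) > eps
        beta[~active] = 0.0
        if converged:
            break
    return beta
```

In practice, the adjustment parameters lam and gamma would be selected, for example, by cross-validation or an information criterion such as the BIC.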
2.2. BP Neural Network
The BP neural network, short for error back-propagation neural network, consists of an input layer, one or more hidden layers, and an output layer. Each layer is composed of neurons; neurons in adjacent layers are fully connected, while neurons within the same layer are not connected to each other. The n input signals enter the network from the input layer, are transformed by the excitation function, reach the hidden layer, and are then transformed by the excitation function of the output layer to form the m output signals [5].
In Figure 1, the relationship between the output $y$ and the inputs $x_1, x_2, \dots, x_m$ is as follows:
$$y = \sum_{j=1}^{n} w_j\,\varphi\!\left(\sum_{i=1}^{m} v_{ij}x_i + \theta_j\right).$$
In the above formula, $v_{ij}$, $\theta_j$, and $w_j$ are the parameters of the model, $\varphi(\cdot)$ is the excitation function, m is the number of nodes of the input layer, and n is the number of nodes of the hidden layer.
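To make the formula concrete, a minimal forward-pass sketch is given below. The sigmoid activation and the parameter shapes are our assumptions for illustration; the paper refers only to an excitation function and does not fix them.

```python
import numpy as np

def forward(x, V, theta, w):
    """Forward pass of a single-hidden-layer network.

    x     : inputs, shape (m,)
    V     : input-to-hidden weights, shape (n, m)
    theta : hidden-layer thresholds, shape (n,)
    w     : hidden-to-output weights, shape (n,)
    """
    phi = lambda z: 1.0 / (1.0 + np.exp(-z))   # assumed sigmoid excitation function
    hidden = phi(V @ x + theta)                # hidden-layer states
    return w @ hidden                          # scalar output y
```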
Algorithm steps of the BP neural network (a minimal sketch in code follows these steps):
Step 1: Normalize the sample input and output parameters to a fixed interval;
Step 2: Initialize the weights and thresholds with random values;
Step 3: Calculate the states of the hidden layer of the network and the output value of the output layer;
Step 4: Calculate the error between the output value and the actual value;
Step 5: Determine whether the sample error is within the acceptable range. If it is, the training ends; if not, continue to modify the weights and thresholds and return to Step 3 until the error reaches an acceptable range.
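As referenced above, the following is a minimal sketch of these five steps for a single hidden layer, using min-max normalization to [0, 1], uniform random initialization, a sigmoid hidden activation, a linear output, and plain gradient descent. All of these concrete choices are assumptions for illustration rather than the exact settings used in this paper.

```python
import numpy as np

def train_bp(X, y, n_hidden=8, lr=0.05, tol=1e-4, max_epochs=5000, seed=0):
    """Minimal BP training sketch for one hidden layer (assumed setup)."""
    rng = np.random.default_rng(seed)

    # Step 1: normalize inputs and outputs (here to [0, 1], an assumption).
    Xn = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)
    yn = (y - y.min()) / (y.max() - y.min() + 1e-12)

    # Step 2: random initialization of weights and thresholds.
    m = Xn.shape[1]
    V = rng.uniform(-1.0, 1.0, size=(n_hidden, m))   # input -> hidden weights
    theta = rng.uniform(-1.0, 1.0, size=n_hidden)    # hidden thresholds
    w = rng.uniform(-1.0, 1.0, size=n_hidden)        # hidden -> output weights

    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    N = Xn.shape[0]

    for _ in range(max_epochs):
        # Step 3: forward pass through the hidden and output layers.
        H = sigmoid(Xn @ V.T + theta)      # hidden states, shape (N, n_hidden)
        y_hat = H @ w                      # linear output layer, shape (N,)

        # Step 4: error between the network output and the actual values.
        err = y_hat - yn
        mse = np.mean(err ** 2)

        # Step 5: stop if the error is acceptable, otherwise back-propagate
        # (the constant factor 2 of the MSE gradient is absorbed into lr).
        if mse < tol:
            break
        grad_w = H.T @ err / N
        delta_h = (err[:, None] * w) * H * (1.0 - H)   # error signal at hidden layer
        grad_V = delta_h.T @ Xn / N
        grad_theta = delta_h.mean(axis=0)
        w -= lr * grad_w
        V -= lr * grad_V
        theta -= lr * grad_theta

    # Predictions from (V, theta, w) are on the normalized scale and would
    # need to be mapped back to the original units.
    return V, theta, w
```

With the 70/30 split described below, the same routine would be fitted on the training portion and evaluated on the test portion.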
The BP neural network parameters, including the learning rate of the network, are given in Table 2.
In order to measure the performance and predictive capabilities of the BP neural network models, we selected 70% of the data as the training sample and the remaining 30% as the test sample. In addition, the mean absolute deviation (MAD) and the coefficient of determination ($R^2$) were used to evaluate the model, and we used the absolute error to describe the prediction performance of the model. The MAD, $R^2$, and the absolute error are defined as follows:
$$\mathrm{MAD} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2},$$
and
$$\text{absolute error}_i = \left|y_i - \hat{y}_i\right|,$$
where $y_i$ is the observed value, $\hat{y}_i$ is the value fitted or predicted by the model, $\bar{y}$ is the mean of the observed values, and $N$ is the number of observations. We usually use the BIC to evaluate the quality of a model; the smaller the BIC value, the better the model.
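For reference, the BIC mentioned here is the standard criterion, with $k$ the number of estimated parameters, $N$ the sample size, and $\hat{L}$ the maximized likelihood:
$$\mathrm{BIC} = k\ln N - 2\ln\hat{L}.$$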