Regional Population Forecast and Analysis Based on Machine Learning Strategy
Abstract
1. Introduction
2. Related Works
2.1. Essential Factors of Population Growth
2.2. Deep Learning Application in Decision Support
2.3. Potential Disadvantage of Conventional Models
3. Boosting Regression-Based Method and Recurrent Neural Network
3.1. Gradient Boosting-Based Method
3.2. XGBoost Algorithm
Algorithm 1. XGBoost algorithm 
Input: Data ${\left\{\left({x}_{i},{y}_{i}\right)\right\}}_{i=1}^{n}$ and a differentiable loss function, as in Equation (1): $l\left({y}_{i},{\widehat{y}}_{i}\right)=\frac{1}{2}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}$, where ${\widehat{y}}_{i}=F\left({x}_{i}\right)$
Step 1: Initialize the model with a constant value: ${F}_{0}\left(x\right)=\underset{r}{\arg\min}{\sum}_{i=1}^{n}l\left({y}_{i},r\right)$
Step 2: for m = 1 to M: 




Step 3: Output ${F}_{M}\left(x\right)$ 
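As a rough illustration of Steps 1 to 3, the following is a minimal sketch of plain gradient boosting with the squared loss of Equation (1), written in pure Python. It is not the XGBoost implementation: it omits XGBoost's second-order terms and regularization, and the loop body follows the textbook gradient boosting update, which is an assumption here. The names `fit_stump` and `gradient_boost`, the one-split "stump" base learner, the one-dimensional input, and the `learning_rate` parameter are all hypothetical choices made for the example.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (1-D inputs only)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    if best is None:
        # All inputs are identical: fall back to a constant prediction.
        constant = mean(residuals)
        return lambda x: constant
    _, threshold, left_value, right_value = best
    return lambda x: left_value if x <= threshold else right_value


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Return a predictor F_M built by gradient boosting with squared loss."""
    # Step 1: for the loss 1/2 (y - r)^2, the minimizing constant r is the mean.
    f0 = mean(ys)
    stumps = []
    predictions = [f0] * len(xs)
    # Step 2: each round fits a base learner to the negative gradient,
    # which for the squared loss is simply the residual y - F(x).
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for p, x in zip(predictions, xs)]

    # Step 3: the output model is the initial constant plus all scaled stumps.
    def predict(x):
        return f0 + learning_rate * sum(stump(x) for stump in stumps)

    return predict
```

Calling `gradient_boost(xs, ys)` on a small training set returns a function that can be evaluated at any `x`. With `n_rounds=0` it reduces to the constant model of Step 1.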
3.3. Gain
3.4. XGBoost Regression Model
3.5. Long Short-Term Memory Network
4. Simulation Experiment
4.1. Data Description
4.2. Experiment Design
The mean absolute percentage error (MAPE) is used as the criterion for evaluating modelling performance in the comparison, as shown in Table 1. A close fit between the real historical data and the forecasted data from 2009 to 2018 further supports the reliability of the forecast results for 2019 to 2025.
Three inference models are compared in this work: the Linear Regression model (the conventional method), the LSTM model, and the XGBoost Regression model. The comparison results are summarized in Table 1.
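For reference, the MAPE criterion and the per-model average reported in Table 1 can be sketched as follows. The function names `mape` and `average_mape` are hypothetical, and the percent scaling is an assumption, since the text does not state the unit of the tabulated values. The dictionary holds the Linear_Regression_3Y row of Table 1, whose average column is the plain mean of the six feature-wise values.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumes nonzero actual values)."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def average_mape(per_feature):
    """Average the feature-wise MAPE values of one model."""
    return sum(per_feature.values()) / len(per_feature)


# Feature-wise MAPE values for Linear_Regression_3Y, copied from Table 1.
linear_regression_3y = {
    "Birth": 0.30265,
    "City Annual": 0.36806,
    "Death": 0.24133,
    "Immigration": 6.35127,
    "Income": 0.17148,
    "Population": 0.23123,
}

print(round(average_mape(linear_regression_3y), 5))  # 1.27767
```

The printed value matches the Average MAPE entry for that row, which shows how strongly the Immigration feature dominates the averages of the Linear Regression and LSTM models.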
4.3. Near Future Forecasting with Linear Regression, XGBoost Regression, and LSTM Models
4.4. Feature Importance in the Present, across a Known Time to the Near Future
5. Conclusions
Feature                         Birth    City Annual  Death    Immigration  Income   Population  Average MAPE
Models in Different Year Range
Linear_Regression_3Y            0.30265  0.36806      0.24133   6.35127     0.17148  0.23123     1.27767
Linear_Regression_4Y            0.36432  0.39890      0.26973  26.03689     0.18115  0.26782     4.58647
Linear_Regression_5Y            0.34876  0.37034      0.28464  11.57862     0.15104  0.25222     2.16427
LSTM_3Y                         1.40973  1.47480      1.43107  10.09746     0.29646  1.31306     2.67043
LSTM_4Y                         1.34646  1.48777      1.42434  11.45912     0.28670  1.30690     2.88521
LSTM_5Y                         1.21405  1.27438      1.41467  13.70877     0.27888  1.30739     3.19969
XGBoost_3Y                      0.01310  0.00396      0.00210   0.42950     0.00149  0.00017     0.07505
XGBoost_4Y                      0.00725  0.00179      0.00101   0.11286     0.00080  0.00012     0.02064
XGBoost_5Y                      0.00201  0.00069      0.00062   0.13376     0.00047  0.00009     0.02294
