1. Introduction
Population and environment are coupled to form a mutually restrictive and interdependent organic whole. The population has always been a key issue in the development of human society, which directly affects environmental change, economic development and social stability [
1]. Only by properly handling the relationship between population, resources and environment can we promote the sustainable development of society and realize the comprehensive and coordinated development of man and nature [
2]. In addition, population serves as an indicator of social progress and reflects the level of economic development of a region. From an international perspective, the trend of world population development is an important basis for the United Nations to formulate a global population strategy. As in other countries, the formulation of basic state policies, the arrangement of labor and employment, the development of social welfare, and even the national economic and social development strategies in China are profoundly affected by population change. China currently faces an unbalanced birth rate, insufficient labor supply, a weakening demographic dividend and a serious aging trend. Under these conditions, understanding the future trend of population development, accurately grasping the current population status, and scientifically predicting the future population size are of great significance for sound population policy and the construction of a harmonious society.
A population forecast estimates the future trend of population development from existing information, so that population activities and economic development can be steered in a favorable direction. Choosing a reasonable model to describe and forecast the population and its indicators accurately is decisive for formulating a population development plan that is coordinated with social and economic development and that provides a scientific basis for decision-making. Graunt and Erle et al. studied population prediction 300 years ago, focusing on age distribution and population evolution [
3]. Later, Leslie et al. began to pay attention to the influence of deterministic models on population prediction and introduced population dynamics into population prediction [
4]. With the rapid growth of the world population, the impact of population on society is increasing day by day. The population problem has become one of the major issues causing concern for the world. Common models for population prediction include the differential equation model, Leslie model, stochastic prediction model, mathematical statistics method, grey model, BP neural network model, linear time series model, deep learning models, etc. [
5,
6,
7,
8]. For the convenience of a comparative study, we summarize the mechanisms, assumptions, merits and drawbacks of mainstream population prediction methods, as shown in 
Table 1.
Each population prediction method has its own advantages and disadvantages, so the choice must be made according to the specific problem. For example, although the prediction accuracy of the differential equation model is high, the equation becomes more complicated and harder to solve as the number of variables increases. The Leslie matrix model is better suited to prediction problems with a long time span. However, because birth and death rates differ between age groups, it is not suitable for analyzing population growth under natural conditions, which limits its practical use. The linear programming model is only suitable for problems with a linear relationship, whereas population development is a dynamic process that inevitably involves nonlinear relationships, so the linear model has a limited role in population prediction. Although the grey prediction model reflects the regularity of data, it cannot completely reflect the influence of irregular social factors on the prediction index. Improving the grey prediction model so that it exploits the regularity of the data while limiting the influence of irregular social factors on the indicators has therefore become a research hotspot.
After Professor Deng Julong first proposed grey system theory in 1982, the theory was gradually enriched and improved over 40 years of development [9]. As a system analysis method for problems with “few data, poor information and uncertain problem”, it has gradually become one of the main approaches to studying uncertain systems. The inherent laws of the data sequence are mined to make scientific and quantitative predictions, so as to reveal the future development trend of the system [
10]. Since the grey prediction model is not strict about the sample size and the probability distribution of the original sequence, it has a unique advantage in the prediction of small sample data and effectively solves a large number of practical problems in production and life. At present, a large number of research results with practical value have been obtained in various dynamic systems such as public health [
11,
12], transportation [
13,
14,
15], energy development [
16,
17,
18], air quality [
19,
20] and so on. At the same time, many scholars have carried out abundant optimization studies on the traditional grey prediction model based on the characteristics of the grey prediction model [
21], the construction of the initial value and background value of the model [
22,
23], the evolution of the accumulative generation operator [
24,
25], the expansion of the modeling equation [
26], and the combined prediction with other models [
27].
Many scholars have explored grey system theory. In the field of population prediction in particular, most research targets the traditional GM(1,1) model, which is formally a system with linear characteristics. In practical applications, however, most systems are nonlinear, and the dynamic changes of population time series result from the synergistic effects of many factors (primary and secondary, direct and indirect, known and unknown, obvious and implicit) that are interrelated and mutually restrictive. These factors, partly known and partly unknown, interact to determine the grey quantity observed in reality, the total population. Because population evolution is a dynamic process with nonlinear characteristics, a linear prediction model alone is not suitable for all systems. The Nonlinear Grey Bernoulli Model (NGBM), as a generalized nonlinear grey model, is a natural direction for research and development. This paper therefore uses the NGBM model to describe the population system, which is characterized by multi-factor control, a complex evolution law and uncertain data fluctuation. A nonlinear model can fit the evolution law of the population system more effectively.
As a general grey prediction model, the structure of the NGBM model changes with the power exponent. When the power exponent is zero, it is the traditional GM(1,1) model. When the power exponent is two, the model is a special grey Verhulst model [
28]. Meanwhile, the power exponent of the NGBM model has the flexibility to change appropriately with the trend of the original sequence. When the value of the power exponent is appropriate, the NGBM model can achieve high prediction accuracy. Therefore, the NGBM model should have more ideal results than the traditional linear model to predict the future population. However, in practice, it is still found that the fitting and prediction effect of this model are unstable. In order to make the model capture the changing trend of different original behavior sequences more closely, make full use of the value of new information, and further improve the stability and computational efficiency of the model, a new accumulation operator can be applied in the NGBM model. A large number of studies were conducted on how to properly preprocess the original data and many cumulative operators suitable for different characteristics of the original data series were proposed, such as fractional order accumulation [
29], complex order accumulation [
30], seasonal accumulation [
31,
32], new information priority accumulation [
33], etc.
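To make the role of the power exponent concrete, the two special cases mentioned above can be written out on the whitening equation of the model. This is a sketch in standard NGBM notation, which is assumed here rather than taken from the text: $a$ is the development coefficient, $b$ the grey action quantity, $\gamma$ the power exponent and $x^{(1)}$ the accumulated sequence.

```latex
\begin{align*}
\text{NGBM:} \quad & \frac{dx^{(1)}}{dt} + a\,x^{(1)} = b\left(x^{(1)}\right)^{\gamma}, \qquad \gamma \neq 1 \\
\gamma = 0: \quad & \frac{dx^{(1)}}{dt} + a\,x^{(1)} = b \qquad \text{(traditional GM(1,1))} \\
\gamma = 2: \quad & \frac{dx^{(1)}}{dt} + a\,x^{(1)} = b\left(x^{(1)}\right)^{2} \qquad \text{(grey Verhulst)}
\end{align*}
```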
Grey system theory states that original data with an exponential growth trend are better suited to grey system modeling. Population data combine exponential properties with a saturated growth trend, so they are well suited to the NGBM model. However, when predicting this kind of data, the traditional accumulation operator makes the predicted values grow too quickly in the short term, which causes a large deviation between prediction and reality. In addition, with the traditional accumulation operator the model cannot accurately capture mutation points in the data series, so it cannot accurately describe the trend of population change. New accumulation methods, such as fractional order or complex order accumulation, extend the order of accumulation from the integers to the real or complex numbers, which can greatly increase the prediction accuracy of the model and improve the goodness of fit with the real data. However, these methods are computationally complex, which hinders application and theoretical analysis. Therefore, to avoid exponential growth or rapid decline of the predicted data and to make the prediction conform to the law of population growth as far as possible, Liu Lianyi et al. proposed an accumulation operator with a damping trend, the damping accumulated operator, which is concise and easy to understand and apply [
34]. By giving different weights to the information at different time points, the proportion of new information in the system is higher than that of old information, so as to highlight the role of new information in the sequence and effectively enhance the smoothness of the prediction data sequence and the stability of the prediction model. Through research, it was found that the combination of the damping accumulated operator and the traditional GM(1,1) model provides better performance compared with the traditional grey prediction model.
In this paper, the damping accumulated generating operator is combined with the nonlinear grey Bernoulli model to form the damping accumulated nonlinear grey Bernoulli model (DA–NGBM). The aim is to improve the prediction accuracy and universality of the nonlinear model while enhancing the stability of the predictions and the reliability of the results. After the model is studied in detail, three cases on the population size and structure of China indicate that the damping accumulated operator improves the predictive performance of the original NGBM model: it captures original sequences with different trends more accurately and is suitable for population prediction. It thereby expands the application scope of the grey prediction model.
The rest of this article is organized as follows. Section 2 reviews the traditional NGBM model and its solution process, together with the damping accumulated generating operator and its properties. Section 3 constructs the new DA–NGBM model and discusses its parameter estimation method, error reduction and perturbation boundary analysis. Section 4 selects three cases on the population size and structure of China to verify the validity of the new model by comparison with other models. Section 5 visualizes the fitting, testing and forecasting results, discusses the performance of the six prediction models on the actual cases, and analyzes the prediction advantages of the DA–NGBM model in detail. Finally, Section 6 gives the conclusions of this paper and the prospects for future work.
  2. The Definition of the Damping Accumulated Generating Operator
  2.1. Traditional Grey Accumulation Generating Operator and Its Generating Sequence
The real data, to a great extent, reflect the actual situation of a certain problem in the objective world. Even if the type of data is in a discrete form or there is no law that can be found intuitively, as a direct manifestation of the behavior of the system, it must contain the deep evolution law of the system. In order to mine valuable and usable information from discrete and chaotic original data, the concepts of accumulation operator and accumulation generating sequence are proposed. By applying the summation operator on the original sequence items of the system, the intuitive data with irregular and large fluctuation can be transformed into usable data with obvious regularity and a linearly increasing relationship. Such operations facilitate the subsequent model building and the data have a powerful ability to display the actual behavior of the system, so as to more clearly highlight the evolution process and internal laws of the system.
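Definition 1  ([35]). Suppose $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$ is the non-negative original data sequence, $X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right)$ is the first-order accumulation sequence of $X^{(0)}$, and $D$ is the sequence operator with $X^{(1)} = X^{(0)}D$, where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, $k = 1, 2, \ldots, n$. $D$ is the first-order cumulative generating operator of $X^{(0)}$, and is denoted as 1-AGO. $X^{(r)} = X^{(0)}D^{r}$, with $x^{(r)}(k) = \sum_{i=1}^{k} x^{(r-1)}(i)$, is called the r-order accumulation generated sequence of the original sequence $X^{(0)}$.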
The traditional grey accumulation generating operator improves the smoothness of the sequence, reduces the fluctuation among the discrete data and enhances the regularity of the data, which supports the accuracy and stability of the fitting and prediction results of the grey prediction model. However, as an equal-weight summation it does not distinguish between old and new data: the data at every time point carry the same weight in the system, which can reduce the fitting accuracy of the prediction model. Assigning the same weight to old and new information also fails to reflect the “new information priority principle” required by grey system theory. As a result, recent data, which are more timely and of greater practical significance, cannot play their full role, and this limits the prediction performance of the model to a certain extent.
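The equal-weight accumulation described above can be illustrated with a minimal sketch. The function names `ago` and `iago` are hypothetical, the r-order case is taken to mean applying 1-AGO r times for an integer r, and the numbers are illustrative rather than data from the paper.

```python
from itertools import accumulate


def ago(x0, order=1):
    """Apply the equal-weight accumulation (1-AGO) `order` times."""
    seq = [float(v) for v in x0]
    for _ in range(order):
        seq = list(accumulate(seq))  # running sum with equal weights
    return seq


def iago(x1):
    """Undo one accumulation: first differences, keeping the first term."""
    return [x1[0]] + [x1[k] - x1[k - 1] for k in range(1, len(x1))]


x0 = [3.0, 1.0, 4.0, 1.0, 5.0]  # illustrative values, not data from the paper
x1 = ago(x0)                    # first-order accumulated sequence
x2 = ago(x0, order=2)           # second-order accumulated sequence
print(x1, x2, iago(x1))
```

Every term of `x0` enters the running sum with weight one, which is exactly why the operator cannot favor recent observations.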
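Definition 2  ([35]). Suppose $X^{(0)}$ is the original data sequence, $X^{(1)}$ is the first-order accumulative generated sequence of $X^{(0)}$, and $Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$, with $z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right)$, is the action sequence of the adjacent mean value operator of $X^{(1)}$. Then
$$x^{(0)}(k) + a z^{(1)}(k) = b\left(z^{(1)}(k)\right)^{\gamma}$$
is called the nonlinear grey Bernoulli model, where $a$ is the development coefficient, $b$ is the grey action quantity, $\gamma$ is the power exponent, and $\gamma \neq 1$. The solution process of the NGBM model is described as follows: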
Step 1. Suppose $X^{(0)}$ is the non-negative original data sequence, $X^{(1)}$ is the first-order accumulative generated sequence of $X^{(0)}$ and $Z^{(1)}$ is the action sequence as described in Definition 2 above.
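Step 2. Suppose $X^{(0)}$, $X^{(1)}$ and $Z^{(1)}$ are as described in Step 1, the grey differential equation of the NGBM model is
$$x^{(0)}(k) + a z^{(1)}(k) = b\left(z^{(1)}(k)\right)^{\gamma}, \quad k = 2, 3, \ldots, n,$$
and the whitening differential equation can be expressed as
$$\frac{dx^{(1)}(t)}{dt} + a x^{(1)}(t) = b\left(x^{(1)}(t)\right)^{\gamma}.$$
By solving the above differential equations, the time response of the NGBM model can be obtained as
$$x^{(1)}(t) = \left\{\left[\left(x^{(1)}(1)\right)^{1-\gamma} - \frac{b}{a}\right] e^{-a(1-\gamma)(t-1)} + \frac{b}{a}\right\}^{\frac{1}{1-\gamma}}. \tag{6}$$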
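Step 3. The parameter sequence $\hat{a} = (a, b)^{T}$ of the model can be estimated by least squares based on the principle of minimum error sum of squares
$$\hat{a} = \left(B^{T}B\right)^{-1}B^{T}Y,$$
where
$$B = \begin{bmatrix} -z^{(1)}(2) & \left(z^{(1)}(2)\right)^{\gamma} \\ -z^{(1)}(3) & \left(z^{(1)}(3)\right)^{\gamma} \\ \vdots & \vdots \\ -z^{(1)}(n) & \left(z^{(1)}(n)\right)^{\gamma} \end{bmatrix}, \quad Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}.$$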
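 Step 4. By substituting the parameters $a$ and $b$ into the time response function (6) and setting $t = k$, the time response of the original sequence can be obtained as
$$\hat{x}^{(1)}(k) = \left\{\left[\left(x^{(1)}(1)\right)^{1-\gamma} - \frac{b}{a}\right] e^{-a(1-\gamma)(k-1)} + \frac{b}{a}\right\}^{\frac{1}{1-\gamma}},$$
where $k = 1, 2, \ldots, n$, $\hat{x}^{(1)}(k)$ is the fitting value at the $k$th moment, from which the sequence $\hat{X}^{(1)}$ can be obtained.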
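 Step 5. The sequence $\hat{X}^{(1)}$ is reduced by the first order, from which the fitting sequence of the original data can be obtained as $\hat{X}^{(0)} = \left(\hat{x}^{(0)}(1), \hat{x}^{(0)}(2), \ldots, \hat{x}^{(0)}(n)\right)$, and the predicted values of the model are $\hat{x}^{(0)}(n+1)$, $\hat{x}^{(0)}(n+2)$, $\ldots$, where
$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1), \quad k = 2, 3, \ldots, \qquad \hat{x}^{(0)}(1) = x^{(0)}(1).$$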
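The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ngbm_fit_predict` is hypothetical, the power exponent `gamma` is passed in as a fixed value instead of being optimized, the two-parameter least squares problem is solved through its normal equations, and the input series is synthetic.

```python
import math


def ngbm_fit_predict(x0, gamma, horizon=0):
    """Fit an NGBM with a fixed power exponent and return (a, b, fitted values)."""
    if gamma == 1:
        raise ValueError("the power exponent must differ from 1")
    n = len(x0)
    # Step 1: 1-AGO sequence and adjacent-mean background values z(2..n).
    x1 = [sum(x0[:k + 1]) for k in range(n)]
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    # Step 3: least squares for x0(k) = -a*z(k) + b*z(k)**gamma,
    # solved through the 2x2 normal equations (B^T B) [a, b]^T = B^T Y.
    u = [-zk for zk in z]
    v = [zk ** gamma for zk in z]
    y = x0[1:]
    suu = sum(p * p for p in u)
    suv = sum(p * q for p, q in zip(u, v))
    svv = sum(q * q for q in v)
    suy = sum(p * t for p, t in zip(u, y))
    svy = sum(q * t for q, t in zip(v, y))
    det = suu * svv - suv * suv
    a = (suy * svv - svy * suv) / det
    b = (suu * svy - suv * suy) / det
    # Step 4: time response of the accumulated sequence (index 0 is k = 1).
    c = x0[0] ** (1 - gamma) - b / a
    x1_hat = [
        (c * math.exp(-a * (1 - gamma) * k) + b / a) ** (1 / (1 - gamma))
        for k in range(n + horizon)
    ]
    # Step 5: first-order reduction back to the original scale.
    x0_hat = [x1_hat[0]] + [
        x1_hat[k] - x1_hat[k - 1] for k in range(1, n + horizon)
    ]
    return a, b, x0_hat


# Synthetic geometric series (illustrative only); gamma = 0 gives GM(1,1).
x0 = [10.0, 12.0, 14.4, 17.28, 20.736]
a, b, fitted = ngbm_fit_predict(x0, gamma=0.0, horizon=2)
print(a, b, fitted)
```

The last `horizon` entries of `fitted` are the out-of-sample predictions. For other values of `gamma` the quantity inside the braces of the time response must stay positive for the fractional power to be real, so a production implementation would need to check that condition.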
   2.2. Damping Accumulated Generating Operator and Its Generating Sequence
The traditional grey accumulated generating operator (abbreviated as AGO) turns original behavior data that are discrete and show no intuitive regularity into a more regular sequence. However, because it gives the same weight to the behavioral data at every time point, AGO cannot distinguish old information from new information, which can cause a large deviation between the fitted model and the real data. To address this and further improve the prediction accuracy of the grey prediction model, the damping accumulated operator is introduced. Its damping parameter distinguishes the weights of old and new information in the system, so the model conforms to the “new information priority principle” and more satisfactory prediction results can be obtained.
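Definition 3  ([34]). Suppose $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$ is the non-negative original data sequence, its $\zeta$-order damping accumulation sequence is defined as $X^{(\zeta)} = \left(x^{(\zeta)}(1), x^{(\zeta)}(2), \ldots, x^{(\zeta)}(n)\right)$, where $x^{(\zeta)}(k) = \sum_{i=1}^{k} \frac{x^{(0)}(i)}{\zeta^{i-1}}$, $k = 1, 2, \ldots, n$, is called the $\zeta$-order damping summation operator of the sequence $X^{(0)}$, the damping coefficient $\zeta \in (0, 1]$.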
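According to the definition of $x^{(\zeta)}(k)$, the matrix form of the sequence generated by $\zeta$-order damping accumulation can be expressed as
$$X^{(\zeta)} = X^{(0)} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 0 & \frac{1}{\zeta} & \cdots & \frac{1}{\zeta} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\zeta^{n-1}} \end{bmatrix}.$$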
When $\zeta = 1$, it can be found that the $\zeta$-order damping accumulated operator is equivalent to 1-AGO and the $\zeta$-order damping accumulation sequence of the original sequence is equivalent to the traditional first-order accumulation generation sequence proposed in Definition 1.
The damping accumulated operator can be used to optimize the traditional accumulation operator. Through summation, 1-AGO turns original discrete data with large fluctuations into a sequence whose growth is easier to observe, but its equal-weight addition ignores the difference in information between old and new data and fails to exploit the greater practical value of the most recent observations. The damping accumulated operator addresses this point. By introducing a damping parameter, the data at each time point receive their own weight, so that old and new information in the system are treated differently. Newer information therefore accounts for a larger share of the accumulated sequence, which improves the fitting and prediction ability of the model.
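A minimal sketch of the damping accumulation follows. It assumes the operator takes the form $x^{(\zeta)}(k) = \sum_{i=1}^{k} x^{(0)}(i)/\zeta^{i-1}$ with $\zeta \in (0, 1]$. The function names are hypothetical, and the inverse helper is derived directly from that assumed form rather than taken from the source.

```python
def damping_ago(x0, zeta):
    """Damping accumulation: item i (0-based) is weighted by 1 / zeta**i."""
    if not 0 < zeta <= 1:
        raise ValueError("zeta is assumed to lie in (0, 1]")
    total, out = 0.0, []
    for i, value in enumerate(x0):
        total += value / zeta ** i  # later items get larger weights when zeta < 1
        out.append(total)
    return out


def damping_iago(xz, zeta):
    """Invert the damping accumulation (follows from the assumed definition)."""
    return [xz[0]] + [
        (xz[k] - xz[k - 1]) * zeta ** k for k in range(1, len(xz))
    ]


x0 = [1.0, 2.0, 4.0]               # illustrative values only
plain = damping_ago(x0, 1.0)       # zeta = 1 reduces to the ordinary 1-AGO
damped = damping_ago(x0, 0.5)      # zeta < 1 emphasizes recent items
print(plain, damped, damping_iago(damped, 0.5))
```

With `zeta = 1` every weight equals one, which reproduces the equivalence with 1-AGO noted above.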
Property 1  ([34]). Let $x^{(\zeta)}(k)$ be the $\zeta$-order damping accumulated operator, then the damping accumulation sequence $X^{(\zeta)}$ monotonically increases with respect to the time variable $k$.
  Proof of Property 1. For $k = 2, 3, \ldots, n$, we have
$$x^{(\zeta)}(k) - x^{(\zeta)}(k-1) = \frac{x^{(0)}(k)}{\zeta^{k-1}}.$$
Since $X^{(0)}$ is a non-negative sequence and $\zeta > 0$, $x^{(\zeta)}(k) - x^{(\zeta)}(k-1) \geq 0$ can be obtained. That is, $x^{(\zeta)}(k)$ is monotonically increasing with respect to the time variable $k$. □
 In the model, the more recent the year, the more information the new data contain and the greater their research value. However, population systems generally evolve without abrupt changes, and the annual population generally shows slow growth or decline over time. This property is therefore broadly consistent with the actual behavior of the population system.
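Property 2  ([34]). The value of $x^{(\zeta)}(k)$ in the damping accumulation sequence $X^{(\zeta)}$ is negatively correlated with the magnitude of the parameter $\zeta$.
  Proof of Property 2. For $k = 2, 3, \ldots, n$, set damping parameters $0 < \zeta_1 < \zeta_2 \leq 1$, then $x^{(\zeta_1)}(k) = \sum_{i=1}^{k} \frac{x^{(0)}(i)}{\zeta_1^{i-1}}$, and $x^{(\zeta_2)}(k) = \sum_{i=1}^{k} \frac{x^{(0)}(i)}{\zeta_2^{i-1}}$.
Since $\zeta_1 < \zeta_2$, we obtain $x^{(\zeta_1)}(k) \geq x^{(\zeta_2)}(k)$.
That is, the value $x^{(\zeta)}(k)$ is negatively correlated with the magnitude of the parameter $\zeta$. □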
 By setting the value of the damping parameter, the proportion of each item of the original sequence in the damping accumulation sequence can be controlled. The items of the original sequence are no longer added with equal weights in the damping accumulation term, and the smaller the value of $\zeta$ is, the greater the weight of the items of the original sequence in the new sequence will be, that is, they will occupy a larger proportion in the system research.
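A short worked example may make the weighting effect concrete. It assumes the damping accumulation takes the form $x^{(\zeta)}(k) = \sum_{i=1}^{k} x^{(0)}(i)/\zeta^{i-1}$ and uses an illustrative sequence $X^{(0)} = (2, 1, 3, 2)$ that is not taken from the paper.

```latex
\begin{align*}
\zeta = 1: \quad & X^{(1)} = (2,\ 2+1,\ 3+3,\ 6+2) = (2,\ 3,\ 6,\ 8) \\
\zeta = 0.5: \quad & X^{(0.5)} = \left(2,\ 2 + \tfrac{1}{0.5},\ 4 + \tfrac{3}{0.5^{2}},\ 16 + \tfrac{2}{0.5^{3}}\right) = (2,\ 4,\ 16,\ 32) \\
\text{share of } x^{(0)}(4) \text{ in the last term:} \quad & \tfrac{2}{8} = 25\% \ (\zeta = 1), \qquad \tfrac{16}{32} = 50\% \ (\zeta = 0.5)
\end{align*}
```

Both sequences are non-decreasing, as in Property 1, every term for $\zeta = 0.5$ is at least as large as the corresponding term for $\zeta = 1$, as in Property 2, and the most recent observation contributes twice the share under the smaller damping parameter.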
Definition 4. Suppose $X = \left(x(1), x(2), \ldots, x(n)\right)$ is a non-negative time series, define $\Delta x(k) = x(k) - x(k-1)$ as the information difference between $x(k)$ and $x(k-1)$.
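  Property 3. Suppose $X^{(\zeta)}$ is the sequence generated from the original data by damping accumulation. The information difference $\Delta x^{(\zeta)}(k)$ of the sequence $X^{(\zeta)}$ is an increasing function of the time variable $k$ and a decreasing function of the damping parameter $\zeta$.
 Proof of Property 3. According to the definition of information difference and damping accumulated generating operator,
$$\Delta x^{(\zeta)}(k) = x^{(\zeta)}(k) - x^{(\zeta)}(k-1) = \frac{x^{(0)}(k)}{\zeta^{k-1}}, \quad k = 2, 3, \ldots, n.$$
Then, the information difference $\Delta X^{(\zeta)} = \left(\Delta x^{(\zeta)}(2), \ldots, \Delta x^{(\zeta)}(n)\right)$ of $X^{(\zeta)}$ can be expressed in matrix form as
$$\Delta X^{(\zeta)} = \left(x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)\right) \begin{bmatrix} \frac{1}{\zeta} & 0 & \cdots & 0 \\ 0 & \frac{1}{\zeta^{2}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\zeta^{n-1}} \end{bmatrix}.$$
Obviously, when $\zeta \in (0, 1)$, the information difference $\Delta x^{(\zeta)}(k)$ of the accumulated damping sequence $X^{(\zeta)}$ is an increasing function of the time variable $k$ and a decreasing function of the damping parameter $\zeta$. □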
 The larger the value of the time variable $k$ is, the more recent the year represented by the data value is, which is consistent with reality. This property means that the information difference of newer data is larger, so newer data carry more practical value and have more important research significance.
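The behavior of the information difference can be checked numerically with a small sketch. It assumes the damping accumulation $x^{(\zeta)}(k) = \sum_{i=1}^{k} x^{(0)}(i)/\zeta^{i-1}$. The helper names are hypothetical, and a constant input series is used so that any change in the differences comes from the damping weights alone.

```python
def damping_ago(x0, zeta):
    """Damping accumulation with weights 1 / zeta**i (assumed definition)."""
    total, out = 0.0, []
    for i, value in enumerate(x0):
        total += value / zeta ** i
        out.append(total)
    return out


def information_difference(x0, zeta):
    """Differences of consecutive damping-accumulated terms, k = 2..n."""
    acc = damping_ago(x0, zeta)
    return [acc[k] - acc[k - 1] for k in range(1, len(acc))]


x0 = [1.0, 1.0, 1.0, 1.0]  # constant illustrative series
d_small = information_difference(x0, 0.5)
d_mid = information_difference(x0, 0.8)
d_one = information_difference(x0, 1.0)
print(d_small, d_mid, d_one)
```

For this input the differences grow with the time index when the damping parameter is below one, and they shrink as the damping parameter rises towards one, where the operator gives every observation the same weight.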
  4. Case Analysis of Population Prediction
The proposed DA–NGBM model provides a new idea for solving practical problems by using the grey prediction model. The behavior sequence of the model is constructed by using the damping accumulated method, which can not only satisfy the “new information first” criterion in grey modeling but also enhance the prediction accuracy and stability of the model to a certain extent. In order to verify the prediction effect of the DA–NGBM model after changing the accumulative method, three numerical calculation cases related to the prediction of population size and structure in China are selected in this section. Compared with the NGBM model, the original GM(1,1) model, the DA–GM (1,1) model [
34], the deep learning model–LSTM [
37] and the traditional statistical model–Exponential smoothing, the prediction ability of the improved DA–NGBM model was analyzed and evaluated.
Each of the three cases selected in this paper contains ten data points. The first eight data points are used as the training set. These eight points are used to fit the model parameters: the whale optimization algorithm searches for the power exponent and the damping parameter that minimize the fitting error of the model, with the number of search iterations set to 50. The last two data points are taken as the testing set. Once the model parameters are obtained, the DA–NGBM model and the traditional grey prediction models are used for two-step prediction, and the prediction results are compared with the testing set to evaluate the prediction ability of each model. The specific division of sample data is shown in 
Figure 2.
  4.1. China’s Birth Rate Forecast from 2011 to 2020
According to the birth rate data of China from 2011 to 2020 in the China Statistical Yearbook 2021 [
38], the DA–NGBM model, the traditional NGBM model, the traditional GM(1,1) model, the DA–GM(1,1) model, LSTM and Exponential smoothing are compared in fitting and predicting the birth rate of the Chinese population. The birth rate from 2011 to 2018 was used to fit the models and the birth rate from 2019 to 2020 was used as the testing set. The mean absolute percentage error (MAPE) was used as the criterion for judging the quality of each prediction model, and the prediction accuracy of the four grey prediction models, one deep learning model and one traditional statistical model was compared. After the damping accumulated factor and the intelligent optimization algorithm are added, the fitting and prediction accuracy of the DA–NGBM model improves considerably on the original model even when the data fluctuate markedly.
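The evaluation protocol can be sketched as follows: split a ten-point series into eight fitting points and two testing points, then score predictions with MAPE. The function names and all numbers are hypothetical placeholders. They are not values from the China Statistical Yearbook or from any of the compared models.

```python
def split_series(series, n_train=8):
    """Return the first n_train points for fitting and the rest for testing."""
    return series[:n_train], series[n_train:]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


series = [float(v) for v in range(1, 11)]  # placeholder ten-point series
train, test = split_series(series)

# Placeholder predictions for the two testing points (illustrative only).
example_error = mape([100.0, 200.0], [110.0, 180.0])
print(len(train), len(test), example_error)
```

In the paper's setting, `train` would hold the 2011 to 2018 values and `test` the 2019 to 2020 values, and the same `mape` function would be applied to both the fitted and the predicted values of each model.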
The comparison and prediction results of the six models are shown in 
Table 3. The LSTM prediction model has the highest fitting accuracy after 5000 training iterations. In terms of prediction, however, this is a small-sample problem and the LSTM model lacks sufficient training samples, so its prediction performance is poor. To some extent, deep learning algorithms that require large samples are not suitable for small-sample prediction problems. The average fitting error of the DA–NGBM model is 4.63%, the prediction error for 2019 is 2.40%, and the prediction error for 2020 is 17.96%. China’s birth rate fluctuated due to the COVID-19 outbreak in 2020. Even under such markedly fluctuating data, the DA–NGBM model gives the best prediction among the compared models. This indicates that introducing the damping accumulated factor into the NGBM model further improves its prediction ability and that the method is an effective improvement.
In this example, the optimal power index and damping parameter values of the DA–NGBM ,  are calculated by MATLAB software and the model parameters obtained by least square are , , respectively.
  4.2. Prediction of Registered Population in Jiangsu Province from 2011 to 2020
According to the data of the registered total population in Jiangsu Province from 2011 to 2020 in the Jiangsu Statistical Yearbook 2021 [
39], six kinds of prediction models are used for prediction, the total registered population of Jiangsu from 2011 to 2018 is taken as the fitting data, and the total registered population of Jiangsu from 2019 to 2020 is taken as the testing data. The comparison and prediction results of the six models are shown in 
Table 4. It can be seen that the total registered population in Jiangsu is a kind of data that has an overall trend of monotonically increasing. In the fitting calculation of this type of data series, the average fitting error of the DA–NGBM model is the first among six types of prediction models with a small error of 0.06%, which is further improved compared with the original NGBM’s average fitting error of 0.08%.
For the prediction data, compared with the prediction results of 2019, the prediction error of the original NGBM model is 0.14%, the prediction error of the original GM(1,1) model is 0.47%, and the prediction error of the DA–GM(1,1) model is about 0.2% lower than the original model. The DA–NGBM model still ranks first among the four models with a prediction error of only 0.1%. Similarly, the prediction data of 2020 were observed and the prediction error of the DA–NGBM model was 0.29% as the optimal prediction model. The performance of the deep learning model and the traditional statistical method in this case also has a certain gap with that of the new grey forecasting method. Therefore, in this kind of monotonically increasing data prediction, the prediction ability of the DA–NGBM should be affirmed.
In this case, the optimal power index and damping parameter values of the DA–NGBM model ,  are calculated by MATLAB software and the model parameters obtained by least square are , , respectively.
  4.3. Prediction of the Male Population of Nanjing City from 2011 to 2020
According to the data of the male population of Nanjing city from 2011 to 2020 in the Nanjing Statistical Yearbook 2021 [
40], four kinds of grey prediction models, a deep learning model and a traditional statistical method, are used for prediction. The total male population of Nanjing city from 2011 to 2018 is taken as the fitting data and the male population of Nanjing city from 2019 to 2020 is taken as the testing data. The comparison and prediction results of the six models are shown in 
Table 5. It can be seen that the total male population of Nanjing city is a kind of data with an overall trend that is monotonically increasing.
In the fitting calculation of this type of data series, the fitting effect of the LSTM method is the most accurate. However, the prediction effect of the LSTM model is worse than the DA–NGBM model in the testing stage and the prediction results of the DA–NGBM for two years are closest to the real value, with prediction errors of only 0.36% and 0.09%, which has the best prediction ability among the six predicted models. By comparison, the grey method shows superior ability in solving small sample prediction problems. It should be noted that the GM(1,1) model and the DA–GM(1,1) model obtained the same result in this case because the optimal value of the damping parameter of the DA–GM(1,1) model was selected as 1, therefore the DA–GM (1,1) model degenerated into the classical GM(1,1) model.
In this example, the optimal power index and damping parameter values of the DA–NGBM ,  are calculated by MATLAB software and the model parameters obtained by least square are , , respectively.
  5. Discussion
In order to show the fitting and prediction ability of the DA–NGBM model more vividly, this example is divided into two experimental areas. The birth rate of China from 2011 to 2018 is regarded as the fitting data and the birth rate from 2019 to 2020 is regarded as the testing data. The birth rate of China in 2021–2022 predicted by the six types of comparison models in this example is regarded as the forecasting data, and the fitting, testing and forecasting results of the six types of models are visualized by the MATLAB program, as shown in 
Figure 3.
The data in this case fluctuate greatly and show an overall downward trend. The GM(1,1) model and the NGBM model, as traditional grey prediction methods, match the real values at only a few points and do not track the changing trend of the data series well. The optimized GM(1,1) model improves the data fitting to some extent, but its test results still deviate from the real values. The performance of the traditional statistical method in this case is also unsatisfactory. The predictions of these models do not correspond well to the population changes, their description of the trend is unclear, and they show large errors in the fitting and testing stages, so the accuracy of their future predictions may be limited. In the fitting area, the curve of the LSTM model is almost exactly consistent with the real data. In the testing area, however, the prediction curve of the LSTM model differs significantly from the real values and shows the opposite trend. This suggests that the LSTM model retains excellent fitting ability in small-sample problems but is not well suited to prediction in small-sample cases.
It is not difficult to find that the DA–NGBM model performs better than the other three grey models in fitting and predicting original behavior sequences with large fluctuations. The model is also stable: an anomaly at a single point does not distort the overall prediction result. In the fitting stage the data curve is smooth and conforms to the law of population growth, which suits the prediction of a population system. In the prediction process the values at multiple points are consistent with the real values, and in the test results the curve is close to the real trend of population change. 
The registered population of Jiangsu Province from 2011 to 2018 is taken as the fitting data and the registered population of Jiangsu Province from 2019 to 2020 is taken as the testing data. The registered population of Jiangsu Province from 2021 to 2022 is taken as the forecasting data of the six prediction models. The fitting, testing and forecasting results of the six types of models are visualized through the MATLAB program. The result is shown in Figure 4.
The data in this case fluctuate little and show an overall upward trend. The fitting and testing results of the several grey models agree well with the real data because the original sequence changes regularly. The traditional statistical method can also describe the changing trend of the population, but it deviates considerably from the actual values. The LSTM model gives more accurate results in the fitting and testing areas, but its predicted values in the prediction area change sharply, in a way that does not conform to the law of population change in reality. Although the results obtained by the four types of grey prediction models are satisfactory, careful observation shows that the NGBM model, with its nonlinear characteristics, captures data changes better than the GM(1,1) model. It is more consistent with the real data in the fitting stage and has significantly fewer errors in the test stage than the GM(1,1) model, and the deviation between its predicted curve and the real curve is small. This also reflects that the linear model has certain disadvantages. To make use of the advantages of the nonlinear model in capturing data changes, this paper has optimized the NGBM model.
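To make the contrast between the linear GM(1,1) model and the nonlinear NGBM model concrete, the following minimal sketch fits the standard NGBM(1,1) form x0(k) + a·z1(k) = b·z1(k)^γ by least squares for a fixed power index γ; setting γ = 0 recovers GM(1,1). It is only an illustration under stated assumptions: it uses the ordinary first-order accumulation rather than the damping accumulation of this paper, it takes γ as given instead of optimizing it, and the function names and the sample series are hypothetical, not the paper's MATLAB program or data.

```python
import numpy as np

def fit_ngbm(x0, gamma=0.0):
    """Estimate (a, b) of NGBM(1,1) for a fixed power index gamma != 1.

    gamma = 0 reduces the model to the classic GM(1,1).
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                    # ordinary first-order accumulation (1-AGO)
    z1 = 0.5 * (x1[1:] + x1[:-1])         # background values z1(k), k = 2..n
    # Least squares for x0(k) = -a*z1(k) + b*z1(k)**gamma
    B = np.column_stack([-z1, z1 ** gamma])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    return a, b

def predict_ngbm(x0, a, b, gamma=0.0, horizon=0):
    """Return fitted values for the sample plus `horizon` forecast steps."""
    x0 = np.asarray(x0, dtype=float)
    k = np.arange(len(x0) + horizon, dtype=float)
    p = 1.0 - gamma
    # Time response of the whitening (Bernoulli) equation
    x1_hat = ((x0[0] ** p - b / a) * np.exp(-a * p * k) + b / a) ** (1.0 / p)
    # Inverse accumulation restores the original-scale series
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)
    return x0_hat

# Hypothetical, synthetic series (NOT data from the paper): 8 fitting points
series = 100.0 * 1.05 ** np.arange(8)
a_lin, b_lin = fit_ngbm(series, gamma=0.0)      # GM(1,1)
a_nl, b_nl = fit_ngbm(series, gamma=0.1)        # NGBM with an assumed gamma
gm_curve = predict_ngbm(series, a_lin, b_lin, gamma=0.0, horizon=2)
ngbm_curve = predict_ngbm(series, a_nl, b_nl, gamma=0.1, horizon=2)
```

Comparing the γ = 0 and γ ≠ 0 curves on the same series is one way to see how the power index changes the fitted and extrapolated values.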
After the improvement of the original model, the prediction curve of the DA–NGBM model coincides closely with the actual data curve in the fitting stage. Compared with the DA–GM(1,1) model, the NGBM model, the LSTM model and exponential smoothing, the DA–NGBM model shows an obvious improvement and the best fitting ability. In the testing stage, the deviation between its predicted results and the real data is the smallest among the six models. This shows that the DA–NGBM model has high prediction accuracy and that the optimization of the NGBM model is effective.
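The comparison of deviations in the fitting and testing stages can be quantified with an error measure computed separately on each window. The sketch below uses the mean absolute percentage error (MAPE) as an assumed metric; the error measures actually used in this paper are defined elsewhere and may differ. The function names and the numbers are hypothetical placeholders, not results from the paper.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def stage_errors(actual, predicted, n_fit):
    """Split a series at n_fit and return (fitting MAPE, testing MAPE)."""
    fit_err = mape(actual[:n_fit], predicted[:n_fit])
    test_err = mape(actual[n_fit:], predicted[n_fit:])
    return fit_err, test_err

# Hypothetical placeholder values: 8 fitting points followed by 2 testing points
actual = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0, 114.0, 116.0, 118.0]
model_outputs = {
    "model_A": [100.0, 102.5, 103.5, 106.5, 107.5, 110.5, 111.5, 114.5, 117.0, 119.0],
    "model_B": [100.0, 101.0, 105.0, 105.0, 109.0, 109.0, 113.0, 113.0, 120.0, 123.0],
}

# Rank the candidate models by their testing-stage error
ranking = sorted(model_outputs,
                 key=lambda name: stage_errors(actual, model_outputs[name], 8)[1])
```

Reporting the two windows separately matters here because a model can fit the sample almost perfectly and still extrapolate poorly, as the LSTM results in these cases illustrate.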
Next, we took the male population of Nanjing city from 2011 to 2018 as the fitting data and the male population of Nanjing city from 2019 to 2020 as the testing data. The male population of Nanjing city from 2021 to 2022 is taken as the forecasting data of the four types of grey prediction models, the LSTM model and exponential smoothing. The fitting, testing and forecasting results of the six types of models are visualized through the MATLAB program. The result is shown in Figure 5.
The six plots for this case demonstrate the superior fitting and prediction ability of the DA–NGBM model more clearly. Although the NGBM model without optimization also fits well, the prediction result of the DA–NGBM model is closer to the real value curve in the test stage. The predicted curves of the GM(1,1) model and the improved GM(1,1) model differ considerably from the real values, especially in the testing phase, and cannot accurately describe the changing trend of this population data. The LSTM model shows superior ability in the fitting region, but the prediction deviations of the LSTM model and the traditional statistical model are larger than that of the DA–NGBM model and their results are less reliable.
Based on the above three cases, the DA–NGBM model, compared with the other five models, can explore the fluctuation form of the population system in depth and closely capture both the overall evolution trend and the individual changes of this dynamic system, whether the original data fluctuate markedly or the data series grows (or declines) monotonically. It is more reliable in fitting and prediction accuracy, especially in its prediction effect. This work therefore has practical significance for population prediction. An effective grasp of the trend of population change can guide a country in formulating relevant population policies and plays an important role in strengthening social governance, so the model is suitable for further study and application in population system prediction analysis.
After verifying the effectiveness of the DA–NGBM model, the DA–NGBM model was used to predict the birth rate of China, the number of registered residents in Jiangsu Province, and the male population of Nanjing city from 2021 to 2025. The prediction results are shown in Figure 6.
6. Conclusions and Future Work
Correctly understanding and handling the relationship between population development and resources, environment and economy, and implementing the strategy of sustainable development, is a major issue related to the rise and fall of a country and the survival of a nation, and it is an enduring theme for mankind. As a direct means of adjusting and controlling population indicators, population decision-making plays an important role in economic and social development, and the accurate prediction of the population is directly related to scientific decision-making on population issues. Therefore, selecting a reasonable and effective population prediction model, grasping the size of the population accurately, and understanding its development have far-reaching significance for the formulation of the national economic plan and social development strategy.
6.1. Conclusions
Considering the importance that countries around the world attach to population prediction, and the need for such prediction in formulating population policies and social development strategies, improving social welfare facilities and promoting the sustainable development of society, this paper establishes a grey Bernoulli model with damping accumulation. As an optimization of the traditional grey prediction model, this model clearly improves the prediction of population system change. The NGBM model is essentially a nonlinear grey model whose flexible power index allows it to handle a broader range of nonlinear problems and gives it stronger universality than the traditional GM(1,1) model. In this paper, the damping accumulated operator is integrated into the data sequence construction of the NGBM model, and the whale optimization algorithm is used to search for the parameter values of the model. This makes the prediction model more sensitive to changes in the original data, enhances its accuracy and stability, and reduces the computational complexity so that the model is convenient to use.
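As a reminder of how a damping accumulated operator acts on a data sequence, the sketch below implements one commonly used form, x1(k) = Σ_{i=1..k} x0(i) / ζ^(i−1) with a damping parameter ζ in (0, 1], together with its inverse. This form is an assumption made for illustration; the operator's exact definition is the one given earlier in this paper and may differ in detail. The function names are hypothetical, and ζ is supplied by hand here rather than searched for by the whale optimization algorithm.

```python
import numpy as np

def damping_ago(x0, zeta):
    """Damping accumulation (assumed form): x1(k) = sum_{i<=k} x0(i) / zeta**(i-1)."""
    x0 = np.asarray(x0, dtype=float)
    idx = np.arange(len(x0), dtype=float)      # i - 1 = 0, 1, ..., n-1
    return np.cumsum(x0 / zeta ** idx)

def damping_iago(x1, zeta):
    """Inverse operator: x0(k) = zeta**(k-1) * (x1(k) - x1(k-1)), with x1(0) = 0."""
    x1 = np.asarray(x1, dtype=float)
    idx = np.arange(len(x1), dtype=float)
    return np.diff(x1, prepend=0.0) * zeta ** idx

# Hypothetical series (not data from the paper)
raw = np.array([12.0, 11.5, 12.3, 11.8, 11.0])
plain = damping_ago(raw, 1.0)       # zeta = 1 reproduces the ordinary accumulation
damped = damping_ago(raw, 0.9)      # zeta < 1 gives later observations larger weights
restored = damping_iago(damped, 0.9)
```

With ζ = 1 the operator coincides with the ordinary first-order accumulation, so under this assumed form the classical NGBM construction is a special case; values of ζ below 1 change the relative weight of recent observations in the accumulated sequence.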
Using three cases, namely the birth rate in China, the total registered population in Jiangsu Province and the male population of Nanjing city, this paper compares the DA–NGBM model with three grey prediction models, the LSTM model and exponential smoothing. The results show that the new prediction model adapts to original series with different development trend characteristics and has the best fitting and prediction performance among the models compared. The predicted curves obtained by the DA–NGBM model agree better with the real data curves in both the fitting and testing stages. At the same time, the addition of the damping accumulated operator can effectively adjust the exponential growth rate of the predicted results and helps keep the predictions reliable and stable for fluctuating original data series. As a result, the DA–NGBM model can predict future population changes reliably. It is therefore believed that the DA–NGBM model has broad research prospects and application space in population prediction and is worth extending to applied research on related population problems.
6.2. Future Work
As an important part of grey system theory, the grey prediction model reduces the randomness of the original sequence through an accumulation operator and then introduces the data into the construction equation with biased and partial differential characteristics. Through this method, the inherent law of the data series is mined, scientific and quantitative predictions are made, and the future development trend of the system is revealed. Grey system theory is a system science method first proposed by Julong Deng in 1982 and has been under development for only about 40 years. Therefore, some theories in this discipline are still not mature and many valuable theories have not yet been deeply explored. The DA–NGBM model, as a kind of grey prediction model, naturally has the advantages of grey system theory, but some aspects remain to be improved. For example, spatial stratified heterogeneity and temporal nonstationarity in grey forecasting models have not been discussed here. This issue will be a direction for further research.
In addition, in the case analysis of this paper, although the LSTM model shows excellent fitting ability, it does not have the same advantage as the grey model in predicting from data series with small or insufficient sample sizes under a low number of training iterations. Deep learning models such as LSTM are generally considered suitable for problems with big data characteristics, while the grey model in this paper is aimed at uncertain systems with small samples. Although more population information could be obtained from China's long-term population series, using such long series is not appropriate because China's population data are easily affected by national policies, such as the family planning policy, the two-child policy or the three-child policy. Therefore, the longer the population series of China is, the less reliable it is, and deep learning models that require big data seem unable to deal well with the problem of population prediction in China. However, as a current research hotspot, deep learning deserves serious consideration regarding its integration with grey system theory in future research, and combining grey prediction models with deep learning models to obtain more advanced results is a worthwhile direction. Finally, to better solve grey multivariate problems, extending the new accumulation operator to high-dimensional grey models is also a valuable idea.