# Research on the Current Situation of Employment Mobility and Retention Rate Predictions of “Double First-Class” University Graduates Based on the Random Forest and BP Neural Network Models


## Abstract


## 1. Introduction

## 2. Literature Review

#### 2.1. The Connotation of “Double First-Class” Construction

#### 2.2. Graduate Employment Migration

#### 2.3. Forecast Methods

## 3. Status Quo of Employment Migration of “Double First-Class” University Graduates

#### 3.1. Regional Differences in Employment Mobility for “Double First-Class” University Graduates Exist

#### 3.2. The Employment Mobility of “Double First-Class” University Graduates Is Sticky

#### 3.3. The Employment Mobility of “Double First-Class” University Graduates Exhibits Concentration

## 4. Analysis of Influencing Factors of Employment Mobility of “Double First-Class” University Graduates

#### 4.1. Fixed-Effects Model Building

#### 4.2. Variable Description

#### 4.3. Empirical Analysis

## 5. Prediction of the Employment Mobility of Graduates from “Double First-Class” Universities

The prediction procedure comprised the following steps:

- Normalization of the original data;
- Use of PCA to reduce the data from 22 dimensions to 9 dimensions;
- Random forest model prediction results;
- BP neural network model prediction results;
- Comparative analysis of the models and their prediction accuracy.
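The first two steps can be sketched as a standard preprocessing pipeline. The snippet below is illustrative only: it assumes scikit-learn and uses randomly generated placeholder data in place of the study's 156 provincial observations of 22 influencing factors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
# Placeholder data: 156 observations x 22 influencing factors.
X = rng.normal(size=(156, 22))

# Normalize each factor to [0, 1], then reduce 22 dimensions to 9 with PCA.
prep = make_pipeline(MinMaxScaler(), PCA(n_components=9))
X_reduced = prep.fit_transform(X)
print(X_reduced.shape)  # (156, 9)
```

The fitted pipeline can then be reused to transform new observations with the same scaling and projection.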

#### 5.1. The Principle of PCA

Suppose there are m original indicators, denoted E_{1}, E_{2}, …, E_{m}, which together form an m-dimensional random vector E = (E_{1}, E_{2}, …, E_{m})′. After centering E at its mean, a linear transformation turns E into a set of new comprehensive variables W_{1}, W_{2}, …, W_{n}, each of which can be linearly represented by the original variables, satisfying the following formulas:

$${W}_{1}={q}_{11}{E}_{1}+{q}_{12}{E}_{2}+\dots +{q}_{1m}{E}_{m}$$

$${W}_{2}={q}_{21}{E}_{1}+{q}_{22}{E}_{2}+\dots +{q}_{2m}{E}_{m}$$

$$\vdots$$

$${W}_{n}={q}_{n1}{E}_{1}+{q}_{n2}{E}_{2}+\dots +{q}_{nm}{E}_{m}$$

The coefficients q_{ij} are determined according to the following principles:

- (1) ${q}_{i1}^{2}+{q}_{i2}^{2}+\dots +{q}_{im}^{2}=1$ (i = 1, 2, …, n);
- (2) W_{i} is linearly independent of W_{j} (i ≠ j; i, j = 1, 2, …, n);
- (3) W_{1} is the linear combination of E_{1}, E_{2}, …, E_{m} with the largest variance; W_{2} has the largest variance among all linear combinations of E_{1}, E_{2}, …, E_{m} that are uncorrelated with W_{1}; and, in general, W_{n} has the largest variance among all linear combinations uncorrelated with W_{1}, W_{2}, …, W_{n−1}.

The variables W_{1}, W_{2}, …, W_{n} determined in this way are called the first, second, …, and nth principal components of the original indicators E_{1}, E_{2}, …, E_{m}. The variance of W_{1} accounts for the largest proportion of the total variance, and the variances of W_{1}, W_{2}, …, W_{n} decrease in turn. When analyzing practical problems, it is common to retain only the first few principal components, which not only reduces the number of variables but also captures the main structure of the problem and simplifies the relationships between variables. In this paper, the fit method of PCA was applied to all training data, yielding the trained PCA model.
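The three principles above can be illustrated numerically: the principal components are the projections of the centered data onto the unit-norm eigenvectors of its covariance matrix, sorted by decreasing eigenvalue. A minimal sketch with synthetic data (all dimensions and variable names here are illustrative, not the study's data):

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 samples of m = 5 correlated indicators.
E = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

Ec = E - E.mean(axis=0)                     # center the random vector at its mean
eigvals, Q = np.linalg.eigh(np.cov(Ec, rowvar=False))
order = np.argsort(eigvals)[::-1]           # sort by variance, largest first
eigvals, Q = eigvals[order], Q[:, order]

# Column i of W is the i-th principal component W_i = sum_j q_ij * E_j.
W = Ec @ Q

# Principle (1): each coefficient vector q_i has unit norm.
# Principle (2): distinct components are uncorrelated.
# Principle (3): component variances decrease in turn.
```

Retaining only the first few columns of `W` (those with the largest variances) gives the reduced-dimension representation used in the rest of the pipeline.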

#### 5.2. Random Forest Prediction Model

After k rounds of training, a set of decision trees {h_{1}(x), h_{2}(x), …, h_{k}(x)} is obtained. For any given new sample x, the prediction result is the average of the results of the k trees:

$$\widehat{h}(x)=\frac{1}{k}{\displaystyle \sum}_{i=1}^{k}{h}_{i}(x)$$

The modeling steps are as follows:

- (1) Bootstrap sampling was used to randomly draw N training subsets from the original training set of 22 influencing factors, with each subset being approximately two-thirds the size of the original training set. Across repeated draws, some samples are never selected; these samples form M out-of-bag data sets, which serve as the test sample set of the random forest.
- (2) At each node of each decision tree, m variables are randomly selected as candidate splitting variables, where m is smaller than the number of original variables, and the optimal split is then chosen according to the branch goodness criterion.
- (3) Each decision tree grows by recursive splitting from the top down, with the minimum leaf-node size set to five as the termination condition for the growth of the regression tree. The random forest model is then assembled from the generated decision trees.
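These steps map directly onto scikit-learn's `RandomForestRegressor`; the sketch below uses synthetic placeholder data, and the hyperparameter choices (50 trees, `sqrt` candidate variables per split) are illustrative assumptions, with `min_samples_leaf=5` matching the leaf-size termination condition described above. The final lines confirm that the forest's prediction is the average of the individual trees' outputs:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 22))  # placeholder: 22 influencing factors
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=200)

rf = RandomForestRegressor(
    n_estimators=50,       # k rounds -> k trees
    max_features="sqrt",   # m randomly chosen candidate variables per split
    min_samples_leaf=5,    # minimum leaf size, the growth termination condition
    bootstrap=True,        # each tree sees a bootstrap sample of the training set
    oob_score=True,        # out-of-bag samples act as a built-in test set
    random_state=0,
).fit(X, y)

pred = rf.predict(X[:5])
tree_avg = np.mean([t.predict(X[:5]) for t in rf.estimators_], axis=0)
```

Here `rf.oob_score_` reports accuracy on the out-of-bag samples, playing the role of the test sample set described in step (1).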

#### 5.3. BP Neural Network Model

Suppose the input vector is X = (X_{1}, X_{2}, X_{3}, …, X_{n}) and the corresponding expected output vector is Y = (Y_{1}, Y_{2}, Y_{3}, …, Y_{m}); w_{ij} and w_{jk} are the connection weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively; n and m are the number of input nodes and the number of output nodes, respectively; and l is the number of hidden-layer nodes. The algorithm proceeds as follows:

- (1) Assign random values in the interval [−1, +1] to the connection weights w_{ij}, w_{jk} and the thresholds a, b;
- (2) From the input vector X, the input-to-hidden connection weights w_{ij}, and the hidden-layer thresholds a, compute the hidden-layer output T:$${T}_{j}=f\left({{\displaystyle \sum}}_{i=1}^{n}{w}_{ij}{x}_{i}-{a}_{j}\right),j=1,2,3,\dots ,l$$
- (3) From the hidden-layer output T, the hidden-to-output weights w_{jk}, and the thresholds b, compute the actual output value C of each output-layer unit through the transfer function;
- (4) From the expected output Y and the actual output C, compute the correction error of each output-layer unit, e_{k} = Y_{k} − C_{k} (k = 1, 2, 3, …, m), and update the weights and thresholds:$${w}_{ij}={w}_{ij}+\mu {T}_{j}(1-{T}_{j}){x}_{i}$$$${w}_{jk}={w}_{jk}+\mu {T}_{j}{e}_{k}$$$${a}_{j}={a}_{j}+\mu {T}_{j}(1-{T}_{j}),j=1,2,3,\dots ,l$$$${b}_{k}={b}_{k}+{e}_{k},k=1,2,3,\dots ,m$$
- (5) Check whether the global error meets the specified accuracy requirement or the number of iteration steps exceeds the specified maximum. If either condition holds, the algorithm terminates; otherwise, it returns to step (2).
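The loop above is standard backpropagation training. A compact library-based equivalent can be sketched with scikit-learn's `MLPRegressor` (an assumption on our part, not the authors' implementation of the exact update formulas); the data and hyperparameters below are illustrative placeholders:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 9))  # e.g. inputs after PCA reduction to 9 dimensions
y = X.sum(axis=1) + rng.normal(scale=0.05, size=300)

bp = MLPRegressor(
    hidden_layer_sizes=(6,),  # l hidden-layer nodes
    activation="logistic",    # sigmoid transfer function f
    solver="sgd",             # gradient-descent weight updates
    learning_rate_init=0.05,  # the learning rate mu
    max_iter=500,             # stop after the specified number of iterations...
    tol=1e-4,                 # ...or once the error stops improving, as in step (5)
    random_state=0,
).fit(X, y)
```

After fitting, `bp.loss_curve_` records the training error at each epoch, which corresponds to the global-error check in step (5).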

## 6. Conclusions and Policy Implications

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest


**Figure 1.**The number of graduates from “Double First-Class” universities in each province after migration from 2014 to 2019.

**Figure 2.** Proportion of “Double First-Class” university graduates remaining for local employment in 2019.

**Figure 3.** Main mobility paths of “Double First-Class” university graduates (points represent provincial capital cities; lines represent the direction of graduate migration between provinces).

**Figure 4.** Forecast employment retention rates of “Double First-Class” university graduates.

| Category | Variable Name | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|---|
| Economic level | GDP | 156 | 29,668 | 21,393 | 3449 | 107,987 |
| | Average wage | 156 | 59,527 | 29,675 | 25,202 | 164,563 |
| | Commodity house price | 156 | 8677 | 6268 | 3629 | 37,420 |
| | Unemployment rate | 156 | 3.173 | 0.647 | 1.300 | 4.500 |
| Industrial structure level | Rationalization of industrial structure | 156 | 1.431 | 0.789 | 0.704 | 5.234 |
| | Advanced industrial structure | 156 | 0.861 | 0.307 | 0.251 | 1.757 |
| | High-tech industrial structure | 156 | 0.147 | 0.103 | 0.0222 | 0.468 |
| | Producer service industry structure | 156 | 2.266 | 0.726 | 1.381 | 5.049 |
| Urban development level | Fixed asset investment | 156 | 14,686 | 8812 | 1689 | 37,664 |
| | Fiscal decentralization | 156 | 6.102 | 2.602 | 3.150 | 14.89 |
| | Urbanization rate | 156 | 0.599 | 0.119 | 0.400 | 0.896 |
| | FDI/GDP | 156 | 0.0217 | 0.0170 | 0.000385 | 0.126 |
| Science and technology education level | R&D full-time equivalent | 156 | 106,560 | 131,806 | 1779 | 642,490 |
| | Science and technology fiscal expenditure | 156 | 16,061 | 18,089 | 1238 | 116,879 |
| | Number of regular high schools | 156 | 95.15 | 34.46 | 17 | 167 |
| | Number of elementary schools | 156 | 6460 | 4573 | 698 | 25,578 |
| Living environment | Road mileage per square kilometer | 156 | 1.081 | 0.447 | 0.304 | 2.115 |
| | Number of medical and health beds | 156 | 28.07 | 15.00 | 3.450 | 64.01 |
| | Harmless treatment of domestic waste | 156 | 24,740 | 19,377 | 3880 | 134,543 |
| Quality of life | Internet access users | 156 | 12,066 | 8371 | 1203 | 38,016 |
| | Number of performing arts groups | 156 | 507.8 | 524.2 | 39 | 2859 |
| | Number of people participating in pension insurance | 156 | 1430 | 1029 | 224.9 | 5392 |

| Dependent Variable: Number of University Graduates after Migration in Each Province | Model I | Model II | Model III | Model IV | Model V | Model VI | Model VII |
|---|---|---|---|---|---|---|---|
| (Pergdp) | 1.5550 *** | 1.6446 *** | 1.6782 *** | 2.8055 * | | | |
| | (3.39) | (3.03) | (3.08) | (1.83) | | | |
| (Wage) | −0.7835 ** | −0.6687 | −0.6524 | 0.4577 | | | |
| | (−2.31) | (−1.45) | (−1.41) | (0.31) | | | |
| (House) | −0.1435 | 0.4958 | −1.2683 | | | | |
| | (−0.38) | (0.21) | (−1.29) | | | | |
| (House_sq) | −0.0345 | | | | | | |
| | (−0.30) | | | | | | |
| (Tech) | 0.2803 ** | 0.1703 | 0.1652 | 0.1642 | 0.1215 | | |
| | (2.53) | (1.65) | (1.61) | (1.61) | (0.56) | | |
| (Univer) | 1.8275 ** | 1.7631 ** | 1.5173 *** | 1.4563 *** | 1.4181 ** | 2.3227 ** | |
| | (2.77) | (2.77) | (2.95) | (3.03) | (2.70) | (2.37) | |
| (Prim) | −0.2154 | −0.4432 | −0.7075 ** | −0.7205 ** | −0.7071 * | −1.0944 *** | |
| | (−0.61) | (−1.24) | (−2.15) | (−2.17) | (−2.03) | (−2.70) | |
| (Road) | 0.1377 | 0.1381 | −0.0142 | −0.3225 | −0.3614 | −0.3755 | −0.3731 |
| | (0.52) | (0.70) | (−0.07) | (−1.22) | (−1.09) | (−1.16) | (−0.74) |
| (Health) | 0.9721 ** | 0.6609 | −0.1039 | −0.4356 | −0.4046 | −0.4598 | −1.2901 |
| | (2.16) | (1.19) | (−0.16) | (−0.56) | (−0.53) | (−0.56) | (−0.96) |
| (Net) | −0.2027 * | −0.3181 ** | −0.2888 ** | −0.3883 ** | −0.4090 ** | −0.4176 ** | −0.7672 * |
| | (−1.77) | (−2.38) | (−2.42) | (−3.46) | (−3.18) | (−3.31) | (−1.84) |
| (Constant) | 3.2375 | 1.0818 | 9.2633 | 11.9025 * | 11.5202 * | 8.8332 | 8.3677 |
| | (0.92) | (0.17) | (1.40) | (1.84) | (1.87) | (0.77) | (0.80) |
| Observations | 156 | 156 | 156 | 156 | 156 | 156 | 130 |
| Number of provinces | 26 | 26 | 26 | 26 | 26 | 26 | 26 |

| Number | 10 | 10 | 10 | 50 | 50 | 50 | 100 | 100 | 100 | 500 | 500 | 500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Depth | 10 | 50 | 100 | 10 | 50 | 100 | 10 | 50 | 100 | 10 | 50 | 100 |
| MSE | 0.1806 | 0.1799 | 0.1866 | 0.1785 | 0.1761 | 0.1801 | 0.1795 | 0.1794 | 0.1804 | 0.1792 | 0.1782 | 0.1802 |
| MAE | 0.1469 | 0.1463 | 0.1523 | 0.1457 | 0.1443 | 0.1468 | 0.1464 | 0.1461 | 0.1469 | 0.1460 | 0.1454 | 0.1467 |

| Epoch | Training Set MSE | Training Set MAE | Validation Set MSE | Validation Set MAE |
|---|---|---|---|---|
| 1 | 0.6977 | 0.6777 | 0.1484 | 0.1239 |
| 2 | 0.1643 | 0.1364 | 0.1411 | 0.1164 |
| 3 | 0.1595 | 0.1332 | 0.1400 | 0.1160 |
| 4 | 0.1573 | 0.1314 | 0.1392 | 0.1155 |
| 5 | 0.1560 | 0.1304 | 0.1385 | 0.1150 |
| 6 | 0.1549 | 0.1295 | 0.1378 | 0.1146 |
| 7 | 0.1539 | 0.1288 | 0.1372 | 0.1143 |
| 8 | 0.1533 | 0.1282 | 0.1367 | 0.1140 |
| 9 | 0.1527 | 0.1277 | 0.1362 | 0.1137 |
| 10 | 0.1521 | 0.1272 | 0.1358 | 0.1135 |
| … | … | … | … | … |
| 90 | 0.1301 | 0.1088 | 0.1159 | 0.0930 |
| 91 | 0.1298 | 0.1085 | 0.1158 | 0.0928 |
| 92 | 0.1295 | 0.1083 | 0.1158 | 0.0927 |
| 93 | 0.1292 | 0.1080 | 0.1158 | 0.0926 |
| 94 | 0.1289 | 0.1078 | 0.1158 | 0.0926 |
| 95 | 0.1287 | 0.1076 | 0.1158 | 0.0925 |
| 96 | 0.1284 | 0.1073 | 0.1158 | 0.0924 |
| 97 | 0.1282 | 0.1071 | 0.1158 | 0.0923 |
| 98 | 0.1279 | 0.1069 | 0.1158 | 0.0922 |
| 99 | 0.1277 | 0.1066 | 0.1159 | 0.0922 |
| 100 | 0.1274 | 0.1064 | 0.1159 | 0.0921 |

**Table 5.**Loss function (MSE) and quality metric (MAE) values of the training and validation set after PCA.

| Epoch | Training Set MSE | Training Set MAE | Validation Set MSE | Validation Set MAE |
|---|---|---|---|---|
| 1 | 0.3319 | 0.2833 | 0.2085 | 0.1811 |
| 2 | 0.2097 | 0.1789 | 0.1286 | 0.1015 |
| 3 | 0.1617 | 0.1315 | 0.1315 | 0.1107 |
| 4 | 0.1553 | 0.1277 | 0.1255 | 0.1032 |
| 5 | 0.1534 | 0.1250 | 0.1252 | 0.1047 |
| 6 | 0.1521 | 0.1239 | 0.1238 | 0.1036 |
| 7 | 0.1511 | 0.1228 | 0.1233 | 0.1034 |
| 8 | 0.1503 | 0.1219 | 0.1230 | 0.1031 |
| 9 | 0.1495 | 0.1212 | 0.1228 | 0.1030 |
| 10 | 0.1489 | 0.1206 | 0.1227 | 0.1028 |
| … | … | … | … | … |
| 90 | 0.1167 | 0.0890 | 0.1156 | 0.0941 |
| 91 | 0.1164 | 0.0888 | 0.1155 | 0.0941 |
| 92 | 0.1161 | 0.0886 | 0.1154 | 0.0941 |
| 93 | 0.1158 | 0.0883 | 0.1154 | 0.0941 |
| 94 | 0.1154 | 0.0882 | 0.1152 | 0.0940 |
| 95 | 0.1151 | 0.0880 | 0.1152 | 0.0940 |
| 96 | 0.1148 | 0.0878 | 0.1151 | 0.0940 |
| 97 | 0.1145 | 0.0877 | 0.1152 | 0.0942 |
| 98 | 0.1142 | 0.0876 | 0.1152 | 0.0942 |
| 99 | 0.1140 | 0.0874 | 0.1152 | 0.0943 |
| 100 | 0.1137 | 0.0873 | 0.1152 | 0.0943 |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhao, Y.; He, F.; Feng, Y.
Research on the Current Situation of Employment Mobility and Retention Rate Predictions of “Double First-Class” University Graduates Based on the Random Forest and BP Neural Network Models. *Sustainability* **2022**, *14*, 8883.
https://doi.org/10.3390/su14148883
