Big Data Spatio-Temporal Correlation Analysis and LRIM Model Based Targeted Poverty Alleviation through Education

Abstract: There are many factors affecting poverty, among which education is an important one. Firstly, from the perspective of digital statistics, this research quantitatively analyzes the correlation between average education years (AEY) and Gross Domestic Product per capita (GDP/C), and finds that there is a significant positive correlation between AEY and GDP/C in provinces of China. Furthermore, from the perspective of spatial distribution and geostatistics, this research analyzes the correlation between AEY and the distribution of poor counties, revealing the inherent connection between education and poverty. Based on the data processing of nighttime light remote sensing images, this research adopts the machine learning method of random forest to extract the distribution status of spatio-temporal sequences for poor counties. Through the analysis, it is found that poor counties are characterized by centralized distribution and spatial autocorrelation spatially, and the number of poor counties decreases year by year in temporal evolution. On this basis, we analyze the correlation between education levels and the distribution of poor counties. It is found that, on the spatial scale, AEY in poor counties is relatively low, while AEY in non-poor counties is relatively high, showing a significant negative correlation between the two. On the temporal scale, the number of poor counties gradually decreased from 2000 to 2010, and at the same time, the education levels of poor counties also gradually improved. Finally, from the perspective of improving education levels to promote poverty elimination, we analyze the main factors affecting education using Principal Component Analysis (PCA) and other methods and obtain a regression model. This research proposes the Linear and Residual Integration Model (LRIM) to more accurately predict AEY in each province in 2020 based on historical data, and identifies the regions with low AEY as key regions for targeted poverty alleviation through education (TPAE) in the future. This research provides a decision-making basis to achieve TPAE means, helping to achieve the victory of the national education poverty elimination battle.


Introduction
The Chinese government innovatively proposed a strategy for targeted poverty alleviation (TPA) and poverty elimination [1,2] in 2013 and implemented a major policy of TPA in 2015. TPA is a poverty alleviation method that accurately identifies, assists and manages poverty alleviation objects based on the actual condition of different poor regions and different poor peasant households [3]. There are different understandings for the definition of TPA. Li and Ye [4] believed that TPA is to carry out precision poverty alleviation in different poor regions according to scientific standards, and introduced a dynamic development mechanism for poverty alleviation according to the local actual condition. Dong [5] thought that not only the targets of TPA should be precise to assist truly poor regions, but also poverty alleviation measures and effects should be precise. At the same time, many scholars had conducted research on TPA. Liu et al. [6] analyzed the relevant policy system, mechanism innovation and future challenges of poverty alleviation. Wu [7] analyzed the dilemmas of rural public management under TPA and proposed targeted solutions. Ruoqi et al. [8] analyzed the existing problems of TPA mechanism in industrial poverty alleviation through local poverty alleviation effects and experience, and innovated the mechanism to help poor households get rid of poverty and become rich. Zhao et al. [9] proposed a poverty alleviation mode combined with a big data platform and put forward targeted solutions and suggestions. Zhang et al. [10] simulated the interactive relationship between local government and poor households through an evolutionary game to improve the effects of the poverty elimination battle.
Poverty alleviation through education (PAE) is the most effective and direct way of TPA [11] and is an important breakthrough in realizing regional poverty alleviation [12]. "To alleviate poverty, we should first support education, and to eliminate poverty, we should first eliminate ignorance." PAE is to fundamentally change the status and situation of poverty by the way of improving education levels [13,14]. Liu [15] pointed out that for precise assistance to poor regions, priority development of education must be ensured in these regions. Sun and Guo [16] analyzed the effects and defects of targeted poverty alleviation through education (TPAE) for the poor and found that there are many problems in the current TPAE. Yuan [17] thought that education must be developed in order to change the state of poverty and achieve sustainable development in poor regions. Peters and Besley [18] pointed out that lack of learning opportunities led to more children falling into crisis and poverty in New Zealand. Meng et al. [19] found through the research that there is a certain correlation between gross domestic product per capita (GDP/C) and average education years (AEY) in Gansu Province. Sun and Liu [20] explored the dynamic change trend of education equity degrees between regions. Liu [21] analyzed poverty alleviation through finance and education using big data and studied new modes and ways of TPA in Henan. Ma et al. [22] explored the factors affecting AEY and analyzed the effects of different influencing factors on AEY. Janjua and Kamal [23] found through research a significant correlation between education improvement and poverty population reduction. Chen et al. [24] established educational poverty alleviation systems at each level by comparing and analyzing education poverty alleviation policies in developed countries such as the United States and the United Kingdom. Liu et al. [25] investigated the spatial agglomeration effect and driving forces of rural education levels and poverty in 27 provinces of China. Xu et al. [26] constructed a mathematical model of the spatial dynamic panel for the relationship between fiscal education expenditure on poverty alleviation and rural poverty, measuring the direct and indirect poverty alleviation effects of financial expenditure on PAE. Paraschiv [27] explored the role of education in poverty alleviation and studied the social and economic impacts of poverty on countries in the Organization for Economic Co-operation and Development.
An analysis of the existing research shows that many studies address poverty alleviation, while few address PAE, and research on TPAE is still at a preliminary stage. For TPAE, the existing studies have the following three problems. First, most of the research focuses on the concept, relevant policies and measures of TPA from the perspective of social science, and some carry out case studies in specific regions. Scientific methods are rarely used to study the correlation between education and poverty or how to carry out TPAE. Second, although the idea that education can reduce poverty has been widely accepted, the theoretical basis behind this phenomenon is lacking, and few studies explore the causal effect of education on poverty reduction. Third, achieving the goal of TPAE requires clarifying the inner connection between education and poverty, which qualitative research alone cannot do. At present, there is a lack of quantitative research on the relationship between education and poverty, and relying solely on statistical analysis is not feasible.
In response to the above problems, the main contributions of this research are as follows. In terms of method, the research introduces GIS spatial analysis and geostatistical analysis and combines them with traditional mathematical-statistical analysis to carry out TPAE research. In terms of application, taking AEY as the indicator of education development levels, the research analyzes the internal relationship between education development levels and poverty from two aspects: the relationship between AEY and GDP/C, and the relationship between AEY and the distribution of poor counties. It quantitatively analyzes these correlations and explores their causal effects, providing a theoretical basis for TPAE. On this basis, we further explore the influencing factors of education levels and adopt the Principal Component Analysis (PCA) method to extract the main factors and obtain the regression model. Finally, this research combines the advantages of the Autoregressive Integrated Moving Average (ARIMA) model for linear time series prediction and the Back Propagation (BP) neural network model for nonlinear residual prediction, and builds the Linear and Residual Integration Model (LRIM) to accurately predict AEY in provinces. The regions with low AEY are then determined as the key regions for TPAE in the future, providing a decision-making basis for TPAE.
The organization of the rest of the paper is as follows. Section 2 introduces the research area and data, including the types and sources of data, the total population with different education levels, GDP/C indicator and nighttime light remote sensing data. Section 3 describes the research methods used in the paper, including the proposed research framework, correlation analysis method, random forest classification method for nighttime light image, analysis method for influencing factors of education, etc., and proposes LRIM integrated model at the end of this section to predict education levels. Section 4 mainly presents the results of the correlation analysis between education and poverty obtained by the above methods, and the prediction results of the AEY obtained by the constructed model and analyzes and discusses the results. Section 5 draws the research conclusions, and based on the conclusions, determines the targeted regions for TPAE and the effective measures that should be adopted.

Data
The research area is the Chinese mainland, including 22 provinces, five autonomous regions, and four municipalities, and does not involve Taiwan, Hong Kong, and Macao regions of China.
The research uses three data sources of Census Data, Statistical Yearbook Data, and nighttime light remote sensing data to calculate the required indicators.

Population Number with Different Education Levels
Education levels are divided into five levels: illiteracy, elementary school education level (EEL), junior high school education level (JHEL), high school and technical secondary school education level (HEL), and university and above education level (UEL). The population at each of the five education levels is counted using the Sixth Census Data, the 2011 Statistical Yearbook, and the 2011 Education Statistical Data. The results are shown in Table 1 and are used to analyze the spatial pattern of the national education levels.

GDP/C Indicator
According to the population and the total GDP of each province in Sixth Census Data, GDP/C indicator of each province in 2010 is obtained, as shown in the last column of Table 1.

Nighttime Remote Sensing Data
In order to match Census Data and Education Statistics Yearbook Data, the research selects nighttime light remote sensing data (DMSP/OLS) for the four periods of 1995, 2000, 2005, and 2010. We crop the research area image based on the national vector map and perform DN outlier processing, radiation correction and other preprocessing. For the preprocessed data, based on the pixel scale in each county, we extract 11 classification characteristics (Table 2) of nighttime light images from four angles: quantitative characteristics, dispersion degree, distribution characteristics, and spatial characteristics. The table of classification characteristics is as follows.
The research selects 100 poor counties and 110 non-poor counties as classification samples and extracts characteristic indicators of A1-A11 based on nighttime light remote sensing data. Considering space limitations, we select an indicator value from each of the four categories of characteristics, displaying the characteristic values of each county in 2010, as shown in Figure 1. The A1 indicator is selected as the quantitative characteristic, the A4 indicator is selected as the dispersion degree, the A8 indicator is selected as the distribution characteristic, and the A11 is selected as the spatial characteristic.

Table 2. Classification characteristics of nighttime light images (columns: Categories, Indicators, Characteristics Description).

Proposed Research Framework
The proposed research framework of the research is shown in Figure 2, and mainly includes three parts, respectively the correlation analysis between education and poverty, the analysis of the influencing factors of education levels, and the prediction of education levels for the purpose of determining the poverty alleviation region.

Calculation of AEY
AEY generally refers to the average number of years of academic education received by the population aged 6 and above. Specifically, AEY is calculated for each province from the populations at the five education levels using Formula (1).
In the formula, $Y_{ave}$ is AEY, and $NP_{Lev1}$ to $NP_{Lev5}$ are respectively the populations at the five education levels: illiteracy, EEL, JHEL, HEL, and UEL. The coefficients are set according to the length of each stage in the current education system of China. Table 3 shows AEY from 1989 to 2019 in China.
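As a minimal sketch of this calculation, the population-weighted average can be written as below. The year weights (0, 6, 9, 12, 16) are an assumption chosen for illustration, since the text only states that the coefficients follow China's education system; the function name and the example counts are likewise hypothetical.

```python
# Years of schooling assigned to each education level.
# ASSUMPTION: illustrative weights only; the source does not list its coefficients.
ASSUMED_YEARS = {
    "illiteracy": 0,
    "EEL": 6,
    "JHEL": 9,
    "HEL": 12,
    "UEL": 16,
}


def average_education_years(population_by_level, years=ASSUMED_YEARS):
    """Population-weighted mean of schooling years across the five levels."""
    total_population = sum(population_by_level.values())
    if total_population == 0:
        raise ValueError("total population must be positive")
    total_years = sum(years[level] * count
                      for level, count in population_by_level.items())
    return total_years / total_population


# Hypothetical province; the counts are made up for illustration.
example_province = {"illiteracy": 50, "EEL": 300, "JHEL": 400,
                    "HEL": 150, "UEL": 100}
example_aey = average_education_years(example_province)
```

Passing a different `years` mapping lets the same function be reused if the coefficients in Formula (1) differ from the assumed ones.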

Calculation of Correlation Coefficient
This research first uses the Pearson Correlation Coefficient (PCC) [30,31] to calculate the relationship between education and poverty, and then uses the Spearman Correlation Coefficient (SCC) [28,29], which is not sensitive to data errors and extreme values, to calculate the same relationship. The advantage of using the two correlation coefficients is that the results can be mutually verified. PCC [32,33] measures the linear correlation between interval variables by examining whether two data sets lie on the same line; it is calculated by Formula (2). SCC [34] builds on PCC, using the ranks of the elements within their respective sets to measure the monotonic relationship between the two sets; it is calculated by Formula (3).
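The two coefficients can be illustrated with a short sketch. It assumes the textbook definitions (Pearson as normalized covariance, Spearman as Pearson applied to average ranks), which may differ in form from the paper's Formulas (2) and (3); the sample values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up data: the second series rises monotonically but not linearly.
aey = [6.0, 7.0, 8.0, 9.0, 11.0]
gdp = [1.0, 2.0, 4.0, 8.0, 30.0]
pcc = pearson(aey, gdp)
scc = spearman(aey, gdp)
```

Because the made-up relationship is monotonic but curved, the rank-based coefficient reaches 1 while the linear coefficient stays below it, which is why reporting both allows a cross-check.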

Random Forest Classification Algorithm
This research uses the random forest algorithm of machine learning to extract the poor counties from nighttime light remote sensing images [35,36]. The random forest algorithm is based on the Bagging algorithm. N training sets, each with the same sample size as the original sample set, are randomly drawn from the original sample set with replacement, and N decision trees $\{h(x, \theta_n),\ n = 1, 2, \ldots, N\}$ are built from these N training sets to form a "forest" [37,38]. Samples that are not selected are out-of-bag data, and the probability that a sample is out-of-bag is $(1 - 1/N)^N$. When N is large enough, this probability approaches $1/e \approx 36.8\%$. When each decision tree selects characteristic variables as it grows, F variables are drawn randomly and with equal probability from all E characteristic variables (usually $F = \lfloor \log_2 E \rfloor + 1$) to construct the tree. No decision tree is pruned during its growth, so that each grows to its maximum extent. The final result of random forest classification is determined by a vote over the classification results of the N decision trees. Here, x denotes the independent and dependent variables input to the model, and $\theta_n$ is an independent and identically distributed random vector.
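The bagging quantities described above can be checked numerically. The following is only a sketch of the sampling and voting steps, not a full random forest; all function names are hypothetical, and the sample size of 210 simply reuses the 100 poor and 110 non-poor sample counties mentioned in the data section.

```python
import math
import random
from collections import Counter


def oob_probability(n):
    """Chance that one sample is never drawn in n draws with replacement."""
    return (1 - 1 / n) ** n


def features_per_split(total_features):
    """F = floor(log2(E)) + 1, the rule of thumb quoted in the text."""
    return int(math.floor(math.log2(total_features))) + 1


def bootstrap_indices(n, rng):
    """Indices of one training set, drawn with replacement."""
    return [rng.randrange(n) for _ in range(n)]


def majority_vote(tree_predictions):
    """Class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)
sample = bootstrap_indices(210, rng)   # one bootstrap training set
oob = set(range(210)) - set(sample)    # samples left out of that set
n_split_features = features_per_split(11)   # 11 characteristics A1-A11
```

With 11 classification characteristics, the rule of thumb gives four candidate variables per tree, and roughly a third of the samples fall outside any single bootstrap set.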

Spatial Autocorrelation Analysis
Moran's I [39-42] is a spatial autocorrelation coefficient, and this research uses it to analyze the spatial autocorrelation of the distribution of poor counties. The value of Moran's I lies in [−1, 1] and is used to judge whether there is an autocorrelation relationship in space.
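For illustration, the sketch below computes the global Moran's I in its standard textbook form, assuming a binary contiguity weight matrix. The weighting scheme actually used in this research is not stated, so the matrix and the toy values here are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    squared = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / squared)


# Four units in a row; neighbours share an edge (binary contiguity weights).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 0, 0], chain)     # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain)   # neighbours always differ
```

The clustered pattern gives a positive value and the alternating pattern a negative one, matching the reading of Moran's I as positive or negative spatial autocorrelation.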

Influencing Indicator Selection
There are many factors affecting education development levels in a region. Generally speaking, the factors can be divided into the following categories: population distribution, economic development, funding support, and teachers. Seven influencing factors are extracted from these four categories. The natural population growth rate factor $X_1$, the urban-rural population ratio factor $X_2$, the population sex ratio factor $X_3$, the industry population structure factor $X_4$, and GDP/C $X_5$ are selected as the demographic factors, the compulsory education funding factor $X_6$ is selected as the funding factor, and the ratio of students per teacher $X_7$ is selected as the teacher factor.
Using the 2010 Census Data and the 2011 Statistical Yearbook Data, the above seven factors are statistically calculated. Among them, the urban-rural population ratio, industry population structure, compulsory education funding, and teacher factors are calculated from the quantities defined as follows.
$NP_{rural}$ is the rural population, $NP$ is the national total population, $NP_{low}$ is the industry population with low education levels, $NP_{indu}$ is the national total industry population, $PEF_{lev3}$ is the per capita education funding of different education levels, $SPT_{Lev}$ is the number of students per teacher at each education level, and $lev2$ to $lev5$ are respectively EEL, JHEL, HEL, and UEL. Table 4 shows the data of the seven influencing factors in provinces of China in 2010.

Principal Component Analysis
Principal Component Analysis (PCA) [43-46], based on the idea of dimensionality reduction, converts multiple factors into a smaller number of mutually uncorrelated comprehensive indicators while keeping information loss low. First, a few linear combinations of the original variables are constructed to generate new variables that are uncorrelated with each other; then several of these variables, which together contain most of the information in the original variables, are extracted to explain the original variables. The extracted new variables are the principal components. PCA can thus be expressed as a mathematical model in which each principal component is a linear combination of the standardized variables, where $a_{in}$ is the characteristic vector and $ZX_n$ is the value of the original variable after standardization ($i = 1, 2, 3, \ldots, n$).
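A minimal sketch of the first principal component is given below. It assumes PCA is run on the correlation matrix of the standardized variables and uses plain power iteration in place of a full eigendecomposition; the data are hypothetical and the function names are illustrative only.

```python
import math


def standardize(columns):
    """Z-score each variable (one list per variable)."""
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        result.append([(v - mean) / std for v in col])
    return result


def correlation_matrix(z_columns):
    """Correlation matrix of already standardized variables."""
    m = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (m - 1) for cj in z_columns]
            for ci in z_columns]


def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    # Non-uniform start so it is unlikely to be orthogonal to the answer.
    vec = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, vec


# Hypothetical data: three indicators observed in five regions.
indicators = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.1, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
z = standardize(indicators)
eigenvalue, loadings = leading_component(correlation_matrix(z))
contribution_rate = eigenvalue / len(indicators)  # share of total variance
```

Because the trace of a correlation matrix equals the number of variables, dividing an eigenvalue by that number gives the contribution rate of the corresponding component.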

Proposed LRIM Model
In this research, a new LRIM is constructed to predict education development levels from time series data. The basic idea of the model is as follows: since real-world time-series data commonly combine linear and nonlinear structure, the time series is divided into two parts, a linear autocorrelation main body and a nonlinear residual. ARIMA [47,48] is used to predict the linear main body of the data, and the residual between the prediction results and the real values is calculated. The BP neural network model is then used to predict the nonlinear residual, and the final prediction results are obtained by integrating the two.
In the ARIMA(p, d, q) model, AR denotes the autoregressive model and p is the number of autoregressive terms; MA denotes the moving average model and q is the number of moving average terms; d is the number of differencing operations needed to make the sequence stationary.
In the model expression, $\Delta^d y_t$ represents the sequence y after d-th order differencing, and $\varepsilon_t$ is the random error at time t, which follows a normal distribution with zero mean and constant variance. The errors form a mutually independent white noise sequence, and the autoregressive and moving average coefficients are the parameters to be estimated in the model.
The process of using ARIMA for time-series prediction is as follows:
(1) Preprocessing: preprocess the collected time series data and perform a stationarity test and a white noise test. ARIMA can be used for prediction only when the sequence is found to be stationary and not white noise.
(2) Model recognition: select a model that matches the given time-series data from the known prediction models.
(3) Order determination: determine the order of the chosen prediction model by using the Bayesian Information Criterion (BIC).
(4) Parameter estimation: estimate the model parameters by methods such as least squares estimation, maximum likelihood estimation and correlation matrix estimation.
(5) Model test: verify the fitting effect of the prediction model.
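Two of the steps above lend themselves to a small sketch: the differencing that underlies the d in ARIMA(p, d, q), and a BIC score for comparing candidate orders. The BIC form shown assumes Gaussian errors and a least-squares fit (up to an additive constant); the helper names are hypothetical, and a complete workflow would rely on a statistics package rather than these functions.

```python
import math


def difference(series, d=1):
    """Apply first differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def undifference(last_value, forecast_diffs):
    """Turn forecasts of first differences back into levels."""
    levels = []
    current = last_value
    for delta in forecast_diffs:
        current += delta
        levels.append(current)
    return levels


def gaussian_bic(rss, n_obs, n_params):
    """BIC for a least-squares fit with Gaussian errors (constant dropped)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


# A quadratic trend becomes constant after two rounds of differencing.
squares = [1, 4, 9, 16]
twice_differenced = difference(squares, d=2)
restored = undifference(16, [9, 11])   # continue the series from 16
```

When two candidate models have the same number of parameters, the one with the smaller residual sum of squares receives the lower (better) BIC; adding parameters is penalized through the second term.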
The BP neural network, one of the most widely used neural networks, is a multilayer feed-forward neural network trained by the error back-propagation algorithm. The basic idea is to train the network by gradient descent, propagating the output error backwards to adjust the weights and thresholds of the hidden layer. Through repeated learning, an optimized model is obtained whose outputs are consistent with the pattern of the input data. This makes it well suited to mining nonlinear patterns in time-series data.
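The training idea can be sketched with a very small network. The architecture here (one hidden layer of four tanh units, a linear output, full-batch gradient descent) and the toy target are assumptions made for illustration, since the text does not specify the network used in this research.

```python
import math
import random


def train_bp(inputs, targets, hidden=4, lr=0.1, epochs=2000, seed=0):
    """One-hidden-layer network trained by back-propagated gradient descent."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
             for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def mse():
        return sum((forward(x)[1] - t) ** 2
                   for x, t in zip(inputs, targets)) / len(inputs)

    initial_loss = mse()
    for _ in range(epochs):
        gw1 = [[0.0] * n_in for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x, t in zip(inputs, targets):
            h, y = forward(x)
            err = 2 * (y - t) / len(inputs)     # d(MSE)/d(output)
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                back = err * w2[j] * (1 - h[j] ** 2)   # error through tanh
                gb1[j] += back
                for i in range(n_in):
                    gw1[j][i] += back * x[i]
        # Gradient descent step on every weight and threshold.
        for j in range(hidden):
            w2[j] -= lr * gw2[j]
            b1[j] -= lr * gb1[j]
            for i in range(n_in):
                w1[j][i] -= lr * gw1[j][i]
        b2 -= lr * gb2

    def predict(x):
        return forward(x)[1]

    return predict, initial_loss, mse()


# Toy nonlinear target: y = x^2 on nine points in [-1, 1].
xs = [[-1 + 0.25 * k] for k in range(9)]
ts = [x[0] ** 2 for x in xs]
predict, initial_loss, final_loss = train_bp(xs, ts)
```

The returned losses make it easy to confirm that training reduced the mean squared error on the toy data before the network is used for anything else.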
Assume that the time series $y_t$ is composed of two parts, the linear autocorrelation main body $L_t$ and the nonlinear residual $F_t$, namely:
$$y_t = L_t + F_t$$
The newly proposed LRIM predicts the time series data as follows:
(1) Model the linear autocorrelation main body $L_t$ with the ARIMA model: determine the parameters of ARIMA(p, d, q) to establish a prediction model and obtain the prediction result $L_{ct}$. Then subtract the prediction result $L_{ct}$ from the original time series $y_t$ to obtain the residual $F_t = y_t - L_{ct}$.
(2) The residual sequence $F_t$ contains the nonlinear part of the original sequence, and the BP neural network model is used to describe this nonlinear relationship. Assuming that BP takes $a$ input values, each residual is expressed as a nonlinear function of the $a$ preceding residuals.
(3) Integrate the predicted values of the two parts: the final prediction is the sum of the ARIMA prediction $L_{ct}$ and the residual predicted by the BP neural network.
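The three steps can be sketched as a small pipeline. To keep the example short, it deliberately swaps in simple stand-ins: a least-squares AR(1) fit plays the role of the ARIMA stage, and the residual predictor is passed in as a function (here a plain window average, not a neural network). All names are hypothetical; only the decompose, predict, and recombine structure reflects the description above.

```python
def fit_ar1(series):
    """Least-squares AR(1) with intercept: y_t ~ c + phi * y_(t-1).

    Stand-in for the ARIMA stage; not a full ARIMA implementation.
    """
    x, y = series[:-1], series[1:]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    phi = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
           / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - phi * mean_x
    return lambda previous: c + phi * previous


def lrim_forecast(series, residual_model, a=2):
    """One-step forecast: linear prediction plus predicted residual."""
    linear = fit_ar1(series)
    fitted = [linear(prev) for prev in series[:-1]]           # L_ct
    residuals = [y - l for y, l in zip(series[1:], fitted)]   # F_t = y_t - L_ct
    linear_next = linear(series[-1])
    residual_next = residual_model(residuals[-a:])   # uses the last a residuals
    return linear_next + residual_next


def mean_of_window(window):
    """Placeholder residual predictor (NOT a BP neural network)."""
    return sum(window) / len(window)


# Made-up series with a purely linear trend, so the residuals are zero.
trend = [1.0, 2.0, 3.0, 4.0, 5.0]
next_value = lrim_forecast(trend, mean_of_window)
```

Keeping the residual model as a plug-in argument means a trained network's prediction function could be supplied in place of `mean_of_window` without changing the integration step.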

Similarity of Spatial Distribution Pattern
Education levels are divided into five levels: illiteracy, EEL, JHEL, HEL, and UEL. The populations at the five education levels in China (Table 1) are spatialized to obtain the spatial distributions shown in Figure 3a-e. Figure 3f shows the distribution of GDP/C in provinces of China. Figure 3 shows that the proportions of the illiterate population and the population with EEL by province decrease overall from southwest to northeast. These populations account for a large proportion in Tibet, Qinghai, Guizhou, Yunnan and other places. For example, illiteracy in the Tibet Autonomous Region accounts for more than 35%, indicating that education levels in these regions are relatively backward. The distribution of the population with JHEL, HEL and UEL by province shows a decreasing trend from east to west. These populations account for a large proportion in Beijing, Shanghai, Tianjin and other coastal areas. For example, the population with UEL in Beijing accounts for more than 32%, indicating that education levels in these regions are relatively high.
Economic development levels differ among provinces in China, as reflected in the differences in GDP/C by province. The research calculates the GDP/C of each province in 2010, shown in the last column of Table 1, and its spatial distribution is shown in Figure 3f. The figure shows that GDP/C in 2010 decreases from east to west, and that GDP/C in coastal areas is significantly higher than in central and inland areas. GDP/C is high in Beijing and Shanghai, reaching 77,205 yuan in Shanghai, compared with 9214 yuan in Guizhou.
It can be seen according to Figure 3 that the distribution of GDP/C in provinces of China has a similar spatial pattern with the distribution of higher education levels (HEL and UEL).

Correlation between AEY and GDP/C
In order to quantitatively analyze the relationship between education and poverty, this research uses two correlation coefficients to calculate the correlation between AEY and GDP/C. AEY by province in 2010 is calculated using Formula (1), and its geographic distribution is mapped in Figure 4a. The distribution of AEY shows a decreasing trend from east to west as a whole. In regions such as Beijing and Shanghai, AEY is more than 11 years, while in Tibet, Qinghai, Gansu, Guizhou, Yunnan and other provinces, AEY is only about 5 to 8 years, which does not reach the standard of compulsory education.
AEY and GDP/C in provinces of China are generated into a line graph, as shown in Figure 4b. In the figure, it can be seen that the trends of AEY and GDP/C curve are roughly similar, showing a positive correlation.
The correlation coefficient is used to further calculate the correlation between AEY and GDP/C. Formulas (2) and (3) are used to calculate PCC and SCC values between AEY and GDP/C in 2010, and the results are shown in Table 5.
As can be seen in Table 5, the PCC of AEY and GDP/C is 0.729 and the SCC is 0.754, indicating a significant positive correlation between AEY and GDP/C in 2010. Because SCC builds on PCC by using the rank ordering of the elements within their respective sets to calculate the correlation between the two sets, it gives a higher correlation here than PCC.
Taken together, there is a significant correlation between GDP/C and local AEY. Therefore, education levels in the provinces of China are correlated with local poverty levels. Improving education levels can promote economic development and reduce poverty in a region, and at the same time the economic development of a region can drive the improvement of local education levels. Improving education levels is therefore an important way to prevent the continued spread of poverty in a region.

Correlation between AEY and Poor Counties Distribution
On the basis of finding a significant positive correlation between AEY and GDP/C, this research analyzes the correlation between AEY and the distribution of poor counties from another aspect, so as to explore causality to a certain extent based on the two correlations.
Nighttime light remote sensing image reflects the economic activities of humans and can reflect the economic condition of the region to a certain extent. Based on nighttime light remote sensing images, the research uses the random forest classification algorithm of machine learning to extract the spatio-temporal distribution of poor counties in China from 1995 to 2010, and further analyzes the correlation between AEY and the distribution of poor counties in provinces of China.
In March 2020, 100 poor counties and 100 non-poor counties from the existing data on national key poor counties are selected as accuracy-test data, and the accuracy of the recognition results for the nighttime light data in 1995, 2000, 2005 and 2010 is calculated. The resulting overall accuracy and Kappa coefficients are shown in Table 6. It can be seen in Table 6 that the classification results based on the random forest algorithm in 1995, 2000, 2005 and 2010 are good. The accuracy of classification and recognition of poor counties is relatively high, with an overall accuracy above 95%. The Kappa coefficients are all above 0.85, so the classification can well distinguish the spatio-temporal dynamic distribution of poor counties from 1995 to 2010.
The LISA map and Moran's I scatter plot of the distribution of poor counties in China are shown in Figure 6. From the spatial perspective, poor counties of China are still concentrated in the southwest, central, and northeast of China, while non-poor counties are mainly distributed along the northern coast of China. From the perspective of temporal evolution, the number of poor counties decreases year by year.
Using Formulas (2) and (3), the PCC between AEY and the distribution of poor counties is calculated to be −0.794 in 2000 and −0.748 in 2010. It can be seen that there is a significant negative correlation between AEY and the distribution of poor counties.
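The overall accuracy and Kappa coefficient used to assess the classification can be computed from a confusion matrix, as in the sketch below. It assumes the standard Cohen's kappa definition, and the counts are hypothetical, chosen only to mirror a test set of 100 poor and 100 non-poor counties; they are not the values in Table 6.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total ** 2
    return observed, (observed - expected) / (1 - expected)


# Hypothetical counts: rows = (poor, non-poor) reference, columns = predicted.
hypothetical = [[95, 5], [5, 95]]
overall, kappa = accuracy_and_kappa(hypothetical)
```

Kappa discounts the agreement expected by chance, so for a balanced two-class test set it is lower than the overall accuracy, which is consistent with accuracies above 95% appearing alongside Kappa values above 0.85.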
By comparing the distribution of AEY and poor counties in 2000 and 2010, it can be seen that on the spatial scale, AEY in poor counties is relatively low, while AEY in non-poor counties is relatively high, and the two show a significant negative correlation. On the time scale, the number of poor counties has gradually decreased from 2000 to 2010, at the same time, the education levels of poor counties also gradually improved.

Analysis of Influencing Factors of Education levels
This research selects seven influencing factors from the four categories of population distribution, economic development, funding support, and teachers: the natural growth rate factor $X_1$, the population urban-rural ratio factor $X_2$, the population sex ratio factor $X_3$, the industry population structure factor $X_4$, the resident comprehensive income factor $X_5$, the compulsory education funding factor $X_6$, and the teacher-per-student ratio factor $X_7$. Since these seven factors may be correlated, PCA is used to determine the influencing factors and weights of education development levels. Based on the data in Table 4, the scree plot and scatter plot of the seven characteristic values are drawn. Figure 8 shows that the characteristic values level off from the third point onward, so two factors are extracted for calculation. Among the seven characteristic factors, only the natural population growth rate ($\lambda_1 = 4.3150$) and the urban-rural proportion of the population ($\lambda_2 = 1.3151$) meet the requirements, and the other characteristic factors cannot be taken as principal component factors.

After the principal component transformation, the contribution rate of the first principal component is 62.68% and that of the second principal component is 18.79%; that is, the cumulative contribution rate of the first two principal components is 81.47%. The first two principal components are selected for factor rotation to obtain the rotated component matrix, and the formulas for the two principal components are given in Formula (17). It can be seen from Formula (17) that, excluding the variables whose standardized coefficients do not meet the condition (those less than 0.3), the first principal component is mainly determined by the natural population growth rate factor X1, the population urban-rural ratio factor X2, the industry population structure factor X4, the resident comprehensive income factor X5, the compulsory education funding factor X6, and the teacher-per-student ratio factor X7. The second principal component is mainly determined by the natural population growth rate factor X1 and the population sex ratio factor X3. In the first principal component, X1, X2, and X4 are positively correlated, while X5 and X6 play a negative role; in the second principal component, X1 and X3 are positively correlated.
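The relationship between characteristic values and contribution rates used above can be illustrated with a minimal sketch. The eigenvalues below are hypothetical placeholders rather than the values in Table 4, and the "eigenvalue greater than 1" retention rule is an assumption about what "meet the requirement" means in this analysis.

```python
def contribution_rates(eigenvalues):
    """Return (rates, cumulative rates) for eigenvalues sorted in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates = [value / total for value in ordered]
    cumulative = []
    running = 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def components_to_keep(eigenvalues, threshold=1.0):
    """Count eigenvalues above the threshold (assumed eigenvalue-greater-than-one rule)."""
    return sum(1 for value in eigenvalues if value > threshold)


# Hypothetical eigenvalues for seven standardized variables (they sum to 7).
# These are NOT the values reported in Table 4.
example_eigenvalues = [3.5, 1.4, 0.9, 0.5, 0.4, 0.2, 0.1]
rates, cumulative = contribution_rates(example_eigenvalues)
kept = components_to_keep(example_eigenvalues)
print(kept, [round(r, 4) for r in rates[:kept]], round(cumulative[kept - 1], 4))
```

For standardized variables the eigenvalues sum to the number of variables, so each contribution rate is simply the eigenvalue divided by seven.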
Since the correlation also depends on the contribution rate for each of the seven influencing factors, a regression model is constructed for factor analysis, and the main indicators are screened out by comparative analysis among the seven indicators.
Based on the seven characteristic variables extracted for the provinces of China in 2010, the PCC matrix is calculated according to Formula (2), the correlation between the seven characteristic variables related to AEY is analyzed, and the heat map of PCC is obtained, as shown in Figure 9. In the figure, ZRZZ is the natural population growth rate factor X1, and CXBL, XBBL, HXRK, JMSR, JYJF, and SZLL are the factors X2 to X7, respectively.
The KMO measure and the Bartlett test of sphericity are performed on the seven characteristic variables, and the results are shown in Table 7. The KMO value is greater than 0.8, indicating that the data are well suited to factor analysis. The significance of the Bartlett test of sphericity is less than 0.05, so the hypothesis that the correlation matrix is an identity matrix is rejected: the variables are sufficiently correlated with one another, and the data are suitable for factor analysis.
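The PCC matrix behind the heat map can be sketched as follows, assuming that PCC denotes the standard Pearson correlation coefficient. The factor codes are taken from Figure 9, but the numbers are toy values chosen for illustration only, and the helper names `pearson` and `pcc_matrix` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def pcc_matrix(columns):
    """Build a nested dict of pairwise correlations from a {name: values} mapping."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy values only; not the provincial data used in the paper.
toy = {
    "ZRZZ": [1.0, 2.0, 3.0, 4.0, 5.0],
    "CXBL": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises exactly with ZRZZ
    "JYJF": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls exactly as ZRZZ rises
}
matrix = pcc_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```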
Taking the seven characteristic parameters as independent variables and AEY in the provinces of China in 2010 as the dependent variable, the variables are sorted by contribution rate, bidirectional (forward and backward) selection is carried out, and a stepwise regression analysis model is established. The parameter table is shown in Figure 10. The results show that the industry population structure factor X4, the natural population growth rate factor X1, and the compulsory education funding factor X6 are the main factors affecting regional education development levels. Table 8 shows that the p values of the significance tests for the regression coefficients of the three selected influencing factors are all below the 0.05 significance level; therefore, the selected regression variables are all statistically significant.
From this, the regression equation of the regression model is obtained. It can be seen from the established stepwise regression model that the greater the value of the industry population structure factor, the greater the share of the industry population with low education levels in a region, and the lower the education development level. The higher the natural population growth rate, the faster the natural population growth in a region, and the lower the education development level. The greater the compulsory education funding factor, the greater the investment still required for popularizing compulsory education in a region, and the lower the education development level.

Prediction of AEY in Provinces
To overcome the shortcomings of using a single linear model to predict time-series data, this research predicts AEY in the provinces of China with the LRIM model constructed in Section 3.4 and compares the results with the predictions of two traditional models.
Based on the historical AEY data from 1989 to 2019 (Table 3), the ARIMA model, the BP neural network model, and the LRIM model constructed in this research are used to predict the time series of the national AEY. The data from 1989 to 2017 are used as the modeling sample, and the data from 2018 to 2019 are used as the test sample.
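The division into a modeling sample and a test sample described above can be sketched as a simple split by year. The series values here are placeholders standing in for the Table 3 data, and `split_by_year` is a hypothetical helper name.

```python
def split_by_year(series, last_training_year):
    """Split a {year: value} mapping into modeling and test samples by year."""
    train = {year: v for year, v in series.items() if year <= last_training_year}
    test = {year: v for year, v in series.items() if year > last_training_year}
    return train, test


# Placeholder values only; the real AEY series is the one listed in Table 3.
series = {year: 6.0 + 0.1 * (year - 1989) for year in range(1989, 2020)}
train, test = split_by_year(series, last_training_year=2017)
print(len(train), len(test))  # 29 modeling years and 2 test years
```

Splitting by year rather than at random keeps the test sample strictly later than the modeling sample, which is what a forecast evaluation needs.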

Prediction Results of ARIMA Model
To apply the ARIMA model, stationarity and white-noise tests should be performed on the data first. The augmented Dickey-Fuller (ADF) test result is shown in Figure 11, and the parameter d in the ARIMA (p, d, q) model is determined to be 1.

Figure 11. ADF test result of AEY. "Prob.*" is the abbreviation for probability used by the software, indicating the probability that a unit root exists.
According to the autocorrelation and partial autocorrelation coefficients of the data series in Figure 12, both show an obvious cut-off at lag 1. The model orders are therefore set to p = 1 and q = 1, and the final model is ARIMA (1, 1, 1).
Figure 12. Autocorrelation and partial correlation for study sequence.
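As a rough illustration of the two steps above, the sketch below applies first-order differencing (the d = 1 step) and then computes sample autocorrelations of the differenced series. The input series is hypothetical, and this is not the software procedure that produced Figures 11 and 12; the function names are illustrative.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times (the d in ARIMA (p, d, q))."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((value - mean) ** 2 for value in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Hypothetical upward-trending series, not the AEY data from Table 3.
trend = [6.0, 6.2, 6.5, 6.6, 6.9, 7.1, 7.2, 7.5]
diffed = difference(trend, order=1)
acf = [autocorrelation(diffed, lag) for lag in range(4)]
print([round(value, 3) for value in diffed])
print([round(value, 3) for value in acf])
```

A cut-off at lag 1 would appear here as a sizeable coefficient at lag 1 followed by values near zero at higher lags.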
According to the regression result shown in Figure 13, the t values of AR(1) and MA(1) are 18.61856 and −1.853016, and the p values are 0 and 0.0757, respectively; therefore, it is reasonable to use the ARIMA (1, 1, 1) model for prediction.
The prediction results of the ARIMA (1, 1, 1) model are shown in Figure 14, and the final predicted value of the national AEY in 2020 is 9.360730 years.

Prediction Results of BP Model
The time-series data of the national AEY from 1989 to 2019 form a one-dimensional sequence. The designed BP neural network structure is shown in Figure 15 and includes three input nodes and 10 hidden-layer nodes. Of the data, 70% are selected as the training data set, 15% as the test data set, and 15% as the validation data set. The prediction results after training the model are shown in Figure 16. The predicted value of the national AEY in 2020 is 9.5654 years, the Root Mean Square Error (RMSE) is 0.2795, and the Mean Absolute Percentage Error (MAPE) is 2.1293.
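One common way to feed a one-dimensional sequence to a network with three input nodes is a sliding window in which three consecutive values predict the next one. The sketch below assumes that construction and a chronological 70/15/15 division. Neither detail is confirmed by the text, the software used may divide the samples differently (for example at random), and the function names and data are hypothetical.

```python
def make_windows(series, n_inputs=3):
    """Turn a 1-D series into (inputs, target) pairs using a sliding window."""
    samples = []
    for t in range(n_inputs, len(series)):
        samples.append((series[t - n_inputs:t], series[t]))
    return samples


def split_samples(samples, train_frac=0.70, val_frac=0.15):
    """Split samples in order into training, validation and test subsets."""
    n = len(samples)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


# Placeholder series with 31 points, matching the length of 1989 to 2019.
series = [float(i) for i in range(31)]
samples = make_windows(series, n_inputs=3)
train, val, test = split_samples(samples)
print(len(samples), len(train), len(val), len(test))
```

With 31 yearly values and three inputs, only 28 input-target pairs are available, so each of the 15% subsets contains just a handful of samples.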

Prediction Results of LRIM Model
The LRIM model constructed in Section 3.4 is used to predict the national AEY in 2020, and the final prediction result is 9.4741 years.
To evaluate the prediction performance of the three models on the national AEY, this research uses two measurement indicators, MAPE and RMSE, which in their standard form are:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples. The smaller the values of RMSE and MAPE, the higher the prediction accuracy of the model and the closer the prediction result is to the true value. The prediction performance of the three models is shown in Table 9. Compared with the ARIMA model, the RMSE of the LRIM model decreases by 0.2303 and its MAPE decreases by 2.074. Compared with the BP neural network model, the RMSE of the LRIM model decreases by 0.0928 and its MAPE decreases by 0.5053. This shows that the LRIM model predicts better than the single ARIMA model or BP model.
The LRIM model is used to predict AEY in the provinces in 2020; the results are shown in Table 10 and plotted as a line graph in Figure 17. It can be seen in Table 10 that the RMSE of the LRIM predictions is less than 0.3, with the smallest value, 0.1522, in Beijing and the largest, 0.2299, in Shanghai. The prediction effect of the model is good, and the predicted AEY of each province is close to the theoretical standard level. According to the predictions for 2020, AEY in all provinces except the Tibet Autonomous Region, Fujian Province, and Guizhou Province can reach more than 9 years. Beijing has the highest AEY, reaching 13.17149 years, and Tibet has the lowest, only 5.456407 years.
The predicted AEY values for 2020 in Tibet, Guizhou, Yunnan, Fujian, and other regions do not meet the compulsory education standard of 9 years. These regions will be identified as key regions for TPAE in the future. Government departments should formulate strategies for TPAE according to the leading factors affecting education in each region.
The predicted values of AEY in the provinces can provide a decision-making basis for the relevant government departments to accurately identify the regions needing poverty alleviation and to take corresponding measures, ultimately helping to win the battle against poverty.
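The two accuracy indicators used in this subsection, RMSE and MAPE, can be computed as in the following minimal sketch. It uses the standard definitions, with MAPE expressed in percent (an assumption consistent with the magnitudes reported), and the sample numbers are illustrative rather than values from Table 9 or Table 10.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between true and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; true values must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Illustrative numbers only, not results from the paper.
actual = [10.0, 20.0]
predicted = [9.0, 22.0]
print(round(rmse(actual, predicted), 4), round(mape(actual, predicted), 4))
```

RMSE is in the same unit as AEY (years) and penalizes large errors more heavily, while MAPE is scale-free, which is why the two are often reported together.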

Conclusions
TPAE is an important part of poverty elimination, giving full play to the key role of education in TPA. This research analyzes the correlation between education levels and poverty, and the influencing factors of education levels. Moreover, on the basis of the above analysis conclusions, it predicts education levels in provinces of China to determine key regions for poverty alleviation.
(1) There is a significant positive correlation between AEY and GDP/C. The higher the AEY, the higher the local GDP, showing that AEY can be used as an indicator of PAE to a certain extent. Increasing AEY can help local residents improve their economic level and eliminate poverty.
(2) There is a negative correlation between AEY and the distribution of poor counties. This indicates that a low level of local education is, to some extent, a factor causing poverty in a region. Moreover, increasing AEY in a region can improve the local poverty situation; therefore, PAE is an important channel for TPA in China.
(3) The industry population structure, the natural population growth rate, and the compulsory education funding are the main factors affecting the level of regional education development. Continually improving these three factors can enhance the overall AEY in the provinces, ultimately helping the regions eliminate poverty.
(4) The LRIM model constructed in this research is used to predict and analyze AEY in the provinces of China based on historical data. In 2020, AEY in most regions of China basically reaches the level of the national nine-year compulsory education; however, there are still some regions, such as Tibet, Guizhou, Yunnan, and Fujian, where AEY is very low. These regions are identified as key regions for TPAE in the future. Only by improving education levels in these regions can the goal of TPAE in China be achieved.
(5) Finally, on the basis of the above research, we have designed and developed a WeChat mini-program, "Through Train for TPAE", which serves as a one-to-one assistance platform between TPAE volunteers and poor students to promote the implementation of TPAE.