Spatial Characteristics of Life Expectancy and Geographical Detection of Its Influencing Factors in China

Life expectancy (LE) is a comprehensive and important index for measuring population health. Research on LE and its influencing factors is helpful for health improvement. Previous studies have neither considered the spatial stratified heterogeneity of LE nor explored the interactions between its influencing factors. Our study was based on the latest available data on LE and social and environmental factors for the 31 provinces of China in 2010. Descriptive and spatial autocorrelation analyses were performed to explore the spatial characteristics of LE. Furthermore, the Geographical Detector (GeoDetector) technique was used to reveal the impact of social and environmental factors and their interactions on LE, as well as the range of each factor associated with the maximum LE level. The results show obvious spatial stratified heterogeneity of LE, and LE mainly presented two clustering types (high–high and low–low) with positive autocorrelation. The GeoDetector results showed that the number of college students per 100,000 persons (NOCS) could mainly explain the spatial stratified heterogeneity of LE (Power of Determinant (PD) = 0.89, p < 0.001). After discretizing the social and environmental factors, we found that LE reached its highest level when birth rate, total dependency ratio, number of residents per household and water resource per capita were in their minimum range; conversely, LE reached its highest level when consumption level, GDP per capita, number of college students per 100,000 persons, medical care expenditure and urbanization rate were in their maximum range. In addition, the interaction of any two factors on LE was stronger than the effect of a single factor. Our study suggests that there was obvious spatial stratified heterogeneity of LE in China, which could mainly be explained by NOCS.


Introduction
Life expectancy (LE) is a comprehensive and important index for measuring population health, which is vital for policy development and health improvement [1]. In recent years, LE in China has continued to rise with economic development and improved living standards. From 2000 to 2017, it increased from 71.40 to 76.47 years [2]. However, LE showed obvious spatial differences in China. According to one study, the difference in LE between the east and the west of China was larger than 10 years in
In order to explore the impact of social and environmental factors on LE in China, we selected 11 representative social and environmental factors (Table 1) classified into 7 categories. We performed the analysis at the provincial level, including 31 provinces in mainland China. The LE data for 2010 came from the China Statistical Yearbooks [25]. The TDR (total dependency ratio) data for 2010 were obtained from the China Population & Employment Statistics Yearbooks [26], and the MCEOUR (medical care expenditure of urban residents) data for 2010 came from the China Health Statistics Yearbooks [27]. The 2010 data for the other social and environmental factors were all from the China Statistics Yearbooks [28,29].

Descriptive Methods
In order to describe the spatial distribution of LE and the social and environmental factors, LE and the main social and environmental factors of each province were mapped on a provincial-scale map of mainland China, with values represented by color depth.

Spatial Autocorrelation Analysis
In order to analyze the spatial clustering characteristics of LE in China, we conducted global and local spatial autocorrelation analyses using Moran's index (Moran's I).

Global Spatial Autocorrelation Analysis
Global Moran's I is used to judge whether spatial autocorrelation exists in LE across the whole study area. Moran's I ranges from −1 to 1. If it is greater than 0, positive spatial autocorrelation (clustering) exists in LE; if it equals 0, no spatial autocorrelation exists in LE; if it is less than 0, spatial dispersion exists in LE. Global Moran's I can be expressed as follows [30]:

$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (1)$$

where $n$ represents the number of spatial units (provincial administrative units in this study); $x_i$ and $x_j$ are the LE of provinces $i$ and $j$, respectively; $\bar{x}$ is the average LE; and $W_{ij}$ is the spatial weight matrix based on inverse distance:

$$W_{ij} = \frac{1}{d_{ij}^{\alpha}}$$

where $d_{ij}$ refers to the distance between two provinces and $\alpha$ is an appropriate constant (such as 1 or 2).
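To make the global Moran's I with inverse-distance weights concrete, the following is a minimal sketch in plain Python. The function names, the toy coordinates and the toy values are illustrative assumptions and are not taken from the study; the study itself used ArcGIS, which also reports a significance test that is omitted here.

```python
import math

def inverse_distance_weights(coords, alpha=1.0):
    """Spatial weights W[i][j] = 1 / d_ij**alpha, with zeros on the diagonal.

    Assumes no two units share the same coordinates (d_ij > 0 for i != j).
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j]) ** alpha
    return w

def global_morans_i(values, w):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Weighted cross-products of deviations over all ordered pairs (i, j).
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    w_sum = sum(sum(row) for row in w)
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)

# Hypothetical toy data: two tight pairs of units, far apart from each other.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
w = inverse_distance_weights(coords, alpha=1.0)

i_clustered = global_morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar values sit together
i_alternating = global_morans_i([5.0, 1.0, 5.0, 1.0], w)  # neighbors differ
```

In this toy layout, similar values placed next to each other give a positive index, while alternating values give a negative one, matching the interpretation of the sign described in the text.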

Local Spatial Autocorrelation Analysis
We used local Moran's I (LISA) to further detect the spatial cluster types of LE. There are 4 types: high-high (a high value surrounded by high values), low-low (a low value surrounded by low values), high-low (a high value surrounded by low values) and low-high (a low value surrounded by high values) [31]. The local Moran's I is calculated as follows [32]:

$$I_i = \frac{n (x_i - \bar{x}) \sum_{j \neq i} W_{ij} (x_j - \bar{x})}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$$

The symbols in the above formula are the same as in Equation (1). In addition, we showed the spatial cluster types of LE by creating a LISA cluster map.
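The four cluster types can be illustrated with a short sketch that computes a local Moran's I per unit and labels each unit by the sign of its own deviation and of its weighted neighborhood deviation. This is a simplified illustration under stated assumptions: the function names, the binary adjacency matrix and the LE-like values are hypothetical, and a real LISA cluster map additionally tests each unit for significance (usually by permutation), which is not done here.

```python
def local_morans_i(values, w):
    """Local Moran's I for each unit: deviation times weighted neighbor deviation,
    scaled by the mean squared deviation."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    result = []
    for i in range(n):
        lag = sum(w[i][j] * dev[j] for j in range(n) if j != i)
        result.append(dev[i] * lag / m2)
    return result

def lisa_quadrants(values, w):
    """Label each unit high-high, low-low, high-low or low-high.

    No significance filtering is applied; deviations of exactly zero
    are treated as 'high' for simplicity.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    labels = []
    for i in range(n):
        lag = sum(w[i][j] * dev[j] for j in range(n) if j != i)
        own = "high" if dev[i] >= 0 else "low"
        around = "high" if lag >= 0 else "low"
        labels.append(own + "-" + around)
    return labels

# Hypothetical example: four units in a row, each adjacent to its neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
le_values = [78.0, 77.0, 69.0, 68.0]  # illustrative values, not study data

local_i = local_morans_i(le_values, adjacency)
labels = lisa_quadrants(le_values, adjacency)
```

Here the first two units are labeled high-high and the last two low-low, and every local index is positive because each unit resembles its neighborhood.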

Geographical Detector
Based on spatial variance analysis, the Geographical Detector is a new tool for detecting environmental risk factors for health. By comparing the within-strata variance with the total variance of the dependent variable over the whole area, the Geographical Detector can detect whether a factor causes the spatial stratified heterogeneity of the dependent variable [24]. The Geographical Detector consists of 4 parts: the factor detector, ecological detector, risk detector and interaction detector. Each component is described in detail as follows.

Factor Detector
The factor detector can be used to detect the importance of a given factor for LE, and is commonly measured by the Power of Determinant (PD) as follows [33]:

$$PD = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2}$$

where $h = 1, \ldots, L$ refers to a stratum of the factor ($L$ is the number of strata of the factor); $\sigma^2$ and $\sigma_h^2$ represent the variance of LE in the whole area and in stratum $h$, respectively; and $N$ and $N_h$ represent the corresponding sample sizes. PD ranges from 0 to 1. The closer PD is to 1, the greater the effect of the factor on LE. The factor detector also reports the p-value of PD.
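The PD of the factor detector can be sketched in a few lines, assuming population variances so that each term N_h·σ_h² equals the sum of squared deviations inside stratum h. The function name and the toy data are illustrative assumptions; the p-value reported by the GeoDetector software is not reproduced here.

```python
from collections import defaultdict

def power_of_determinant(y, strata):
    """PD = 1 - (sum over strata of N_h * var_h) / (N * var), population variances."""
    n = len(y)
    mean = sum(y) / n
    total_ss = sum((v - mean) ** 2 for v in y)          # equals N * sigma^2
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    within_ss = 0.0
    for members in groups.values():
        m = sum(members) / len(members)
        within_ss += sum((v - m) ** 2 for v in members)  # equals N_h * sigma_h^2
    return 1.0 - within_ss / total_ss

# Hypothetical toy data: four units with LE-like values.
y = [70.0, 70.0, 80.0, 80.0]
pd_perfect = power_of_determinant(y, [1, 1, 2, 2])  # strata separate the values fully
pd_none = power_of_determinant(y, [1, 2, 1, 2])     # strata explain nothing
```

When the strata separate the values completely, the within-strata variance is zero and PD equals 1; when every stratum has the same spread as the whole sample, PD equals 0.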

Risk Detector
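The risk detector is used to judge whether there is a significant difference between the average LE of different strata of each factor. A t-test is used for the hypothesis test [16]:

$$t = \frac{\bar{Y}_{h=1} - \bar{Y}_{h=2}}{\sqrt{\dfrac{\mathrm{Var}(Y_{h=1})}{n_{h=1}} + \dfrac{\mathrm{Var}(Y_{h=2})}{n_{h=2}}}}$$

The degrees of freedom were:

$$df = \frac{\left( \dfrac{\mathrm{Var}(Y_{h=1})}{n_{h=1}} + \dfrac{\mathrm{Var}(Y_{h=2})}{n_{h=2}} \right)^2}{\dfrac{1}{n_{h=1} - 1} \left( \dfrac{\mathrm{Var}(Y_{h=1})}{n_{h=1}} \right)^2 + \dfrac{1}{n_{h=2} - 1} \left( \dfrac{\mathrm{Var}(Y_{h=2})}{n_{h=2}} \right)^2}$$

where $\bar{Y}_h$ represents the average LE of stratum $h$; $n_h$ is the sample size of stratum $h$; Var represents the sample variance; and $t$ follows Student's t distribution. The null hypothesis is that LE is equal in the two strata. If the null hypothesis is rejected at significance level α, there is considered to be a significant difference in LE between the two strata.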
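As an illustration of the comparison the risk detector makes between two strata, the sketch below computes a two-sample t statistic with unequal variances together with Welch-Satterthwaite degrees of freedom. Using the Welch-Satterthwaite form is an assumption about the exact formula intended; the function name and the data are hypothetical, and the p-value lookup is left out because the Python standard library has no t distribution.

```python
import math
from statistics import mean, variance

def risk_detector_t(y1, y2):
    """t statistic and approximate degrees of freedom for two strata.

    Uses sample variances (n - 1 denominator); each stratum needs at least
    two observations and the strata must not both have zero variance.
    """
    n1, n2 = len(y1), len(y2)
    a = variance(y1) / n1
    b = variance(y2) / n2
    t = (mean(y1) - mean(y2)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical LE values for a high stratum and a low stratum of one factor.
high_stratum = [77.0, 78.0, 79.0]
low_stratum = [71.0, 72.0, 73.0]
t_stat, dof = risk_detector_t(high_stratum, low_stratum)
```

The resulting statistic would then be compared with the critical value of the t distribution at the chosen significance level to decide whether the two strata differ.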

Ecological Detector
The ecological detector detects whether there is a significant difference between the effects of two factors on LE. It is measured by the following formula:

$$F = \frac{n_{x_1} (n_{x_2} - 1) \, SSW_{x_1}}{n_{x_2} (n_{x_1} - 1) \, SSW_{x_2}}, \qquad SSW_{x_1} = \sum_{h=1}^{L_1} N_h \sigma_h^2, \qquad SSW_{x_2} = \sum_{h=1}^{L_2} N_h \sigma_h^2$$

where $n_{x_1}$ and $n_{x_2}$ represent the sample sizes; $SSW_{x_1}$ and $SSW_{x_2}$ represent the sums of the within-strata variances of the strata formed by $x_1$ and $x_2$ (any two social and environmental factors in the study), respectively; and $L_1$ and $L_2$ represent the number of strata of $x_1$ and $x_2$, respectively. The null hypothesis is that $SSW_{x_1}$ and $SSW_{x_2}$ are equal. If the null hypothesis is rejected at significance level α, there is a significant difference between the effects of the two factors on LE.
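The ecological detector statistic can be sketched as a ratio of within-strata sums of squares for two stratifications of the same units. This is a minimal illustration, assuming both factors are observed on the same sample, so the sample-size terms cancel; the function names and data are hypothetical, and the comparison with an F distribution is omitted.

```python
from collections import defaultdict

def within_sum_of_squares(y, strata):
    """Sum over strata of N_h * var_h, i.e. squared deviations from stratum means."""
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    total = 0.0
    for members in groups.values():
        m = sum(members) / len(members)
        total += sum((v - m) ** 2 for v in members)
    return total

def ecological_f(y, strata_x1, strata_x2):
    """F = n1 (n2 - 1) SSW_x1 / (n2 (n1 - 1) SSW_x2).

    Undefined when the second stratification has zero within-strata variance.
    """
    n1 = len(strata_x1)
    n2 = len(strata_x2)
    ssw1 = within_sum_of_squares(y, strata_x1)
    ssw2 = within_sum_of_squares(y, strata_x2)
    return (n1 * (n2 - 1) * ssw1) / (n2 * (n1 - 1) * ssw2)

# Hypothetical toy data: factor x2 stratifies the values far better than x1.
y = [70.0, 71.0, 80.0, 81.0]
f_stat = ecological_f(y, [1, 2, 1, 2], [1, 1, 2, 2])
```

A ratio far from 1 suggests that one factor leaves much more unexplained within-strata variance than the other.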

Interaction Detector
The interaction detector is used to identify the interaction between different factors. It can evaluate whether the interaction of factors $x_1$ and $x_2$ increases or decreases their influence on LE. Specifically, it calculates the PD value of the interaction of $x_1$ and $x_2$, $PD(x_1 \cap x_2)$, and then compares it with $PD(x_1)$ and $PD(x_2)$ to judge the interaction type. As shown in line 3 of Table 2, when the PD value of $x_1$ and $x_2$ after interaction is greater than the maximum of the original PD values of the two factors, the interaction type is bivariate-enhanced.

Table 2. Types of interaction between two covariates [34].
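The comparison made by the interaction detector can be sketched as a small classifier. The five type names and thresholds below follow the classification commonly used with the Geographical Detector and are an assumption about the contents of Table 2, of which the text only confirms the bivariate-enhanced case. The function names are hypothetical, and the value 0.50 used for the second factor is a placeholder, not a result from the study.

```python
def overlay_strata(strata_x1, strata_x2):
    """Strata of the interaction x1 ∩ x2: each unit gets the pair of its two labels."""
    return list(zip(strata_x1, strata_x2))

def interaction_type(pd_x1, pd_x2, pd_both, tol=1e-9):
    """Classify the interaction by comparing PD(x1 ∩ x2) with PD(x1) and PD(x2)."""
    low, high = min(pd_x1, pd_x2), max(pd_x1, pd_x2)
    additive = pd_x1 + pd_x2
    if abs(pd_both - additive) <= tol:
        return "independent"
    if pd_both > additive:
        return "nonlinear-enhanced"
    if pd_both > high:
        return "bivariate-enhanced"
    if pd_both > low:
        return "univariate-weakened"
    return "nonlinear-weakened"

# PD(NOCS) = 0.89 and PD(NOCS ∩ NORPH) = 0.98 are reported in the text;
# 0.50 for the second factor is a hypothetical placeholder.
example = interaction_type(0.89, 0.50, 0.98)
```

With these inputs the joint PD exceeds the larger single-factor PD but not the sum of the two, so the sketch returns the bivariate-enhanced type.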

Data Preprocessing
The Geographical Detector analysis generally requires the independent variables to be categorical, so the 11 social and environmental factors had to be discretized before the analysis. Different discretization schemes yield different PD values. Generally, the discretization scheme with the largest PD value is preferred in a Geographical Detector analysis [35]. After comparing various discretization schemes, we used the quantile method to classify the 11 social and environmental variables; the classification intervals are shown in Table 3. The dependent variable can be either continuous or discrete, so we did not preprocess it.
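Quantile discretization can be sketched with the Python standard library as follows. This is an illustration only: the study performed the classification in ArcGIS, whose break values may differ slightly from the interpolation used by `statistics.quantiles`, and the function name, the choice of four strata and the sample values are assumptions.

```python
from bisect import bisect_left
from statistics import quantiles

def quantile_strata(values, k=4):
    """Assign each value to one of k strata (1 = lowest) using quantile cut points.

    A value equal to a cut point is placed in the lower stratum.
    """
    cuts = quantiles(values, n=k, method="inclusive")  # k - 1 cut points
    return [bisect_left(cuts, v) + 1 for v in values]

# Hypothetical factor values for eight units.
factor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
strata = quantile_strata(factor, k=4)
```

Each stratum receives roughly the same number of units, which is the defining property of the quantile method; the resulting labels can then be used as the categorical input of the detectors.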

Software
The discretization of the social and environmental factors and the spatial autocorrelation analysis were performed in ArcGIS 10.2, and the analysis of the spatial stratified heterogeneity of LE was carried out with the GeoDetector software.

Spatial Distribution Characteristics of LE in China
In 2010, the average LE in China was 74.83 years, and the average LE of women was higher than that of men (77.37 vs. 72.38 years). LE showed a clear downward trend from the east to the west of China in 2010, which reveals obvious spatial stratified heterogeneity of LE (Figure 1). The average LE of the eastern areas was over 75 years, and was especially high in Shanghai and Beijing. The average LE of the central areas was slightly lower than that of the eastern areas, ranging from 73.39 to 75.11 years, while the average LE of the western areas was the lowest in China. Tibet had the lowest LE (68.17 years) among all 31 provinces.
In the global autocorrelation analysis, the global Moran's I was 0.266 (p < 0.001), indicating obvious spatial clustering of LE in China. In order to further explore the cluster types of LE, we conducted a local autocorrelation analysis and obtained a LISA cluster map (Figure 2). Figure 2 shows that LE mainly presented two cluster types, high-high and low-low. The high-high type was mainly distributed in the eastern coastal areas, including Beijing, Tianjin, Shanghai and Zhejiang provinces. The low-low type was mainly located in western inland regions, such as Xinjiang, Tibet and Qinghai provinces.

Spatial Distribution of Main Social and Environmental Factors
We also mapped the social and environmental factors to reveal their spatial characteristics (Figure 3). There were certain spatial differences in BR, NORPH and TDR. These three indicators showed an upward trend from the east to the west, which differed from the spatial distribution of LE (Figure 3A-C). There were great spatial differences in WRPC. For instance, WRPC reached its highest level in Tibet (153,681.9 m³), while in Tianjin it was only 72.8 m³. In addition, WRPC increased from the north to the south of China (Figure 3D). NOHB and NOD showed a declining trend from the north to the south (Figure 3I-J). There were large differences in CLOUR, GPC, NOCS, MCEOUR and UR among the provinces of China, with a downward trend from the east to the west (Figure 3E-H,K).

The Influence of Social and Environmental Factors on LE Based on Factor Detector
Based on the factor detector, we analyzed the importance of the social and environmental factors for LE. Table 4 lists the PD value and the p-value of each factor. The PD values of the medical resource factors (NOHB, NOD) were not statistically significant (p > 0.05), which indicates that medical resources had no statistically significant effect on LE at the provincial scale. However, the PD of most variables ranged from 0.45 to 0.65. In particular, the PD value of NOCS was 0.89, which indicates that NOCS could mainly explain the spatial stratified heterogeneity of LE in our study. Moreover, the PD values of WRPC and MCEOUR were less than 0.4, revealing that they had less influence on LE than the other factors.

The Optimal Range of Factors for the Maximum LE Based on Risk Detector
The risk detector showed the average LE of each stratum of all factors and analyzed whether there were significant differences between strata. Taking GPC as an example, the relationship between GPC and LE is shown in Figure 4. LE increased with GPC. When GPC ranged from 13,119 to 21,253 RMB, the average LE was 72.54 years; when GPC ranged from 42,355 to 76,074 RMB, the average LE rose to 77.80 years. The significance of the differences in average LE between the strata of GPC is shown in Table 5 (Y: significant, N: not significant). The fourth layer corresponds to the maximum range in Figure 4 (42,355 to 76,074 RMB), while the first layer corresponds to the minimum range in Figure 4 (13,119 to 21,253 RMB). Table 5 shows that the difference in average LE between the fourth layer and each of the other layers of GPC was statistically significant. Furthermore, the risk detector was able to analyze the quantitative relationship between the social and environmental factors and LE. As can be seen in Table 6, the highest ranges of CLOUR, GPC, NOCS, MCEOUR and UR corresponded to the highest level of LE, while the lowest ranges of BR, TDR, NORPH and WRPC corresponded to the highest level of LE. The optimal range of a factor corresponds to the maximum value of LE [36]. Therefore, we displayed the optimal range of each factor in Table 6.
It is worth noting that the areas with the highest LE value could be identified as the main influencing area for each significant social and environmental factor [34]. The results are visualized in Figure 5.

Differences between the Impact of Different Factors on LE Based on the Ecological Detector
The ecological detector was used to test the significance of the differences in PD values between pairs of factors (Table 7; Y: significant, N: not significant). The differences in PD values between most pairs of factors were not significant. The differences between the PD value of NOCS and those of the other factors were statistically significant, suggesting that NOCS had a great impact on LE. The PD value of WRPC was significantly different from those of BR and UR. Similarly, the difference between the PD values of MCEOUR and UR was statistically significant. Combined with the results of the factor detector, this shows that the impact of WRPC and MCEOUR on LE was weak.

Table 7. Statistical significance of the differences in PD values among different factors.

Interaction between Different Factors Based on the Interaction Detector
We used the interaction detector to reveal the interaction effects and interaction types among the factors. As shown in Table 8, the PD value was ≥0.9 when NOCS interacted with any other factor. Notably, the PD value of the interaction between NOCS and NORPH reached 0.98, which was close to 1. The PD values of the interactions between UR and other factors were also high; for example, the PD value of the interaction between UR and WRPC was 0.92. However, the PD value of the interaction between WRPC and MCEOUR was only about 0.5. We found that the interaction effect of any two factors on LE was greater than the individual effect of either factor. Even for factors with a lower PD value, the PD value increased after the interaction. Moreover, the results show that all interaction types were bivariate-enhanced.

Discussion
LE is an important indicator for measuring health status [37]. To our knowledge, this is the first time that the relationship between LE and social and environmental factors has been explored in China from a spatial perspective using the Geographical Detector technique.
Many previous studies have shown that LE was mainly affected by economic factors [36,38,39]. For example, one study found that the main reason for the spatial distribution pattern of LE in China was the economy [40]. In contrast, combining the results of the factor detector and the ecological detector, we found that NOCS could mainly explain the spatial stratified heterogeneity of LE at the provincial scale; that is, the effect of NOCS on LE was significantly greater than that of the other factors. There are several possible explanations. Firstly, higher education is related to better health awareness and more timely access to health care [15,41]; secondly, a more highly educated population may be better able to resist the adverse effects of negative events because of better psychological quality [42]. At the same time, our study found that WRPC had little effect on LE. Some previous studies also found that this effect showed an upward trend from 2000 to 2010 [15]. Therefore, the government, especially in the developed eastern areas, should pay attention to protecting the ecological environment while improving the social economy. In addition, the impact of medical resources (NOHB, NOD) on LE was not statistically significant. Even though medical resources were abundant in some areas, their actual efficiency might be very low due to poor infrastructure and a low economic level. Therefore, they might not fully play their role, or might even have no effect on LE. The impact of MCEOUR on LE was also quite small, which further reveals the importance of improving the utilization of medical resources [43].
The results of the risk detector show that when the economic factors (GPC and CLOUR) and UR were in their maximum range, the average LE was also close to its highest level. Because economic status acts through its effects on people's daily life, such as education and medical care, the average LE reached its maximum level when GPC was in its maximum range. The consumption level (CLOUR) was closely related to the economic situation. When the consumption level reached its maximum range, people could purchase enough food to improve their health [44,45]. People who lived in the areas with the highest UR had the longest average LE because a high UR corresponded to better economic conditions and more medical and educational opportunities [10]; meanwhile, residents of rural areas reported much higher rates of disability, injury and high blood pressure than urban residents, due to inequalities in education, health care and poverty [46]. Therefore, the Chinese government, especially in the central and western areas, should focus on poverty alleviation and urbanization to improve local LE. At the same time, when BR and the family living standard factors (TDR, NORPH) were in their minimum range, the average LE reached its maximum level. In general, the areas with the lowest BR were economically developed, such as Shanghai and Beijing. Social welfare in these areas was also higher than in other areas [39]. Moreover, areas with the lowest BR tended to have the fewest residents per household (NORPH) and the lowest total dependency ratio (TDR). Therefore, they had the longest average LE.
Based on the interaction detector, we found that the PD value of the interaction between NOCS and any other social or environmental factor was ≥0.9, indicating that education combined with other factors could significantly improve the LE level. Therefore, the government, especially in Western China, should focus on improving education as well as economic and medical conditions.
Our research shows the impact of social and environmental factors on life expectancy and their interactions [47]. We display the optimal range of each factor for maximum LE and the main influencing areas, which is meaningful for health policy development. Moreover, the selected variables covered multiple dimensions, and the data were authoritative, coming from the National Bureau of Statistics. However, our study still has some limitations. Firstly, this study only focused on social and environmental factors, so some factors influencing LE might not have been included, such as the air pollutants PM2.5 and PM10; we could not obtain air pollution data for each province in 2010. In addition, the data on LE and the social and environmental factors used in this study were all from 2010, so they were insufficient for inferring a causal relationship. The relationship between the dependent and independent variables was statistical rather than causal, but the Geographical Detector can filter out likely influencing factors of LE for further confirmation, for example in longitudinal studies [16]. In addition, this study was performed at the provincial level, and the relationship needs to be studied at a finer scale in the future. Finally, the Geographical Detector could only explore the interaction effect between two factors and could not reveal the impact of interactions among multiple factors on LE, which is also a key problem to be solved in the future.

Conclusions
In conclusion, there exists obvious spatial stratified heterogeneity of LE in China. Among the many social and environmental factors, NOCS could mainly explain the spatial stratified heterogeneity of LE. BR, TDR, NORPH, CLOUR, GPC and UR had less influence on LE, while WRPC and MCEOUR had the lowest influence on LE. Further study is needed to discover the actual causality between LE and these factors. When BR, TDR, NORPH and WRPC were in their minimum range, LE reached its highest level; conversely, LE reached its highest level when CLOUR, GPC, NOCS, MCEOUR and UR were in their maximum range. In addition, the interaction of any two social and environmental factors on LE was stronger than the effect of a single factor. Our results provide a policy basis for the government to formulate policies on economic and educational development, utilization of medical resources, environmental protection and population management in order to address the regional inequality of LE in China.