Identifying Inﬂuencing Factors of Agricultural Soil Heavy Metals Using a Geographical Detector: A Case Study in Shunyi District, China

: Identifying inﬂuencing factors of heavy metals is essential for soil evaluation and protection. This study investigates the use of a geographical detector to identify inﬂuencing factors of agricultural soil heavy metals from natural and anthropogenic aspects. We focused on six variables of soil heavy metals, i.e., As, Cd, Hg, Cu, Pb, Zn, and four inﬂuencing factors, i.e., soil properties (soil type and soil texture), digital elevation model (DEM), land use, and annual deposition ﬂuxes. Experiments were conducted in Shunyi District, China. We studied the spatial correlations between variables of soil heavy metals and inﬂuencing factors at both single-object and multi-object levels. A geographical detector was directly used at the single-object level, while principal component analysis (PCA) and geographical detector were sequentially integrated at the multi-object level to identify inﬂuencing factors of heavy metals. Results showed that the concentrations of Cd, Cu, and Zn were mainly inﬂuenced by DEM ( p = 0.008) and land use ( p = 0.033) factors, while annual deposition ﬂuxes were the main factors of the concentrations of Hg, Cd, and Pb ( p = 0.000). Moreover, the concentration of As was primarily inﬂuenced by soil properties ( p = 0.026), DEM ( p = 0.000), and annual deposition ﬂux ( p = 0.000). The multi-object identiﬁcation results between heavy metals and inﬂuencing factors included single object identiﬁcation in this study. Compared with the results using the PCA and correlation analysis (CA) methods, the identiﬁcation method developed at different levels can identify much more inﬂuencing factors of heavy metals. Due to its promising performance, identiﬁcation at different levels can be widely employed for soil protection and pollution restoration.


Introduction
Heavy metals in agricultural soil can reduce the soil quality, affect the health of crops, and even threaten human health [1,2]. Along with the progress of modern agriculture and industry, soil environmental problems such as soil pollution, soil erosion, and degradation gradually emerged in many areas [3,4]. To address these problems, it is essential to identify the influencing factors of heavy metals in agricultural soil for soil evaluation and protection. The influencing factors of heavy metals are usually grouped into two main types, corresponding to natural and anthropogenic factors. Natural factors mainly refer to soil properties (e.g., soil type, soil texture) [5], digital elevation model (DEM) or terrain [6], and distance from water body [7], while anthropogenic factors usually refer to land use [8,9], water irrigation, traffic emission [10] and atmospheric deposition [11]. The concentrations

Study Area
The study area is located at Shunyi District, Beijing, China, at longitudes ranging from 116°28′ E to 116°58′ E and latitudes ranging from 40°00′ N to 40°18′ N ( Figure 1). This area covers 1020 km 2 and has a warm, temperate semi-humid continental monsoon climate [36], with an average annual temperature of 11.5 °C and average precipitation of 625 mm. The DEM changes from 24 m to 637 m in the area, and the average elevation is about 35 m. The primary agricultural lands in the area are irrigable land, orchard, vegetable field, grassland, and wasteland. Heavy metal concentrations in Shunyi District were slightly higher than the background values of Beijing topsoils except for Pb, suggesting slight contamination of heavy metals in this area [17].

Sample Collection and Analysis
To investigate the pollution status of heavy metals in Shunyi District, a large-scale soil sampling project was conducted after the crop harvest in the autumn of 2007. A total of 329 surface soil samples (0-20 cm depth) were collected from the agricultural areas, as shown in Figure 1. For each sampling site, five sub-samples were collected from the four vertexes and the center of a 10 m × 10 m grid. We then mixed these sub-samples to select 1 kg soil as the representative sample of this site. A global positioning system (GPS) was used to precisely locate each sampling location and record the corresponding information regarding vegetation and soil types.

Data Sources 2.2.1. Sample Collection and Analysis
To investigate the pollution status of heavy metals in Shunyi District, a large-scale soil sampling project was conducted after the crop harvest in the autumn of 2007. A total of 329 surface soil samples (0-20 cm depth) were collected from the agricultural areas, as shown in Figure 1. For each sampling site, five sub-samples were collected from the four vertexes and the center of a 10 m × 10 m grid. We then mixed these sub-samples to select 1 kg soil as the representative sample of this site. A global positioning system (GPS) was used to precisely locate each sampling location and record the corresponding information regarding vegetation and soil types. The collected soil samples were air-dried, crushed in an agate mortar, and then passed through a 2.0 mm sieve. The concentrations of six heavy metals, i.e., Cd, Cu, Zn, Pb, As, and Hg, were analyzed in the soil samples. The soil samples were digested in triplicate with the mixture of HNO 3 , HCI, and H 2 O 2 using the method 3050B recommended by the United States Environmental Protection Agency [37]. Concentrations of Cd, Cu, Pb, and Zn in the digestion solution were determined by inductively coupled plasma optical emission spectroscopy (ICP-AES, Thermo iCAP 6300, Washington, USA). In contrast, concentrations of As and Hg in the soils were determined by an atomic fluorescence spectrometry (Titan, AFS 830, Beijing, China) after digestion of the soil samples. Standard reference materials (GSS-1 and GSS-4) obtained from the Center of National Standard Reference Material of China were used for quality assurance and quality control [17]. Figure 2 shows the spatial distribution of each heavy metal of Shunyi District, China. In the distribution maps, the concentrations of each heavy metal were divided into 3 classes using the quantile method, which placed an equal number of units into each class [38] and seemed to be one of the effective methods for facilitating comparison [39].

Influencing Factors and Factors Stratification
This study examined four factors, including two natural and two anthropogenic factors. These factors were known to influence heavy metal concentrations in Shunyi District, China. The natural factors referred to soil properties, i.e., soil type and soil texture, and DEM. The two anthropogenic factors referred to land use and annual deposition fluxes of dry and wet atmospheric deposition. Soil properties data at a scale of 1:1,000,000 and DEM with a 30 m resolution were provided by the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. The land use data at a scale of 1:50,000 were acquired from the National Administration of Surveying, Mapping, and Geoinformation. Annual deposition fluxes data of Cd, Cu, Zn, Pb, As, and Hg came from the inverse distance weight (IDW) [40] interpolation results of 39 samples of dry and wet atmospheric deposition, which were collected in the plain area of Beijing from November 2005 to November 2006 [41].
Factors stratification for corresponding influencing factors of 329 soil samples was used in this study. Factors stratification ensures that the spatial heterogeneity of soil pollutants can be identified after stratification [42]. The principle of factors stratification was that every stratum had sampling sites, and each area of different strata was similar. For category variables, e.g., soil properties and land use types, each category can be considered a stratum. For numerical variables, e.g., DEM and annual deposition fluxes, the variable was divided into three grades referring to the high, medium, and low levels using the quantile method, and each grade can be considered as a stratum. In this study, agricultural lands in Shunyi District were divided into four strata, corresponding to four land use types: irrigable land, orchard, vegetable field, and grassland and wasteland. Soil properties (soil type and soil texture) were divided into six strata, i.e., cinnamon soils and light loam, cinnamon soils and medium loam, cinnamon soils and sandy loam, fluvo-aquic soils and light loam, fluvo-aquic soils and medium loam, and fluvo-aquic soils and sandy loam. DEM and annual deposition fluxes were divided into three strata using the quantile method, i.e., low, medium, and high grades, respectively, and the stratification results are shown in Table 1. rescence spectrometry (Titan, AFS 830, Beijing, China) after digestion of the soil samples. Standard reference materials (GSS-1 and GSS-4) obtained from the Center of National Standard Reference Material of China were used for quality assurance and quality control [17]. Figure 2 shows the spatial distribution of each heavy metal of Shunyi District, China. In the distribution maps, the concentrations of each heavy metal were divided into 3 classes using the quantile method, which placed an equal number of units into each class [38] and seemed to be one of the effective methods for facilitating comparison [39].

Influencing Factors and Factors Stratification
This study examined four factors, including two natural and two anthropogenic factors. These factors were known to influence heavy metal concentrations in Shunyi District, China. The natural factors referred to soil properties, i.e., soil type and soil texture, and

Identification Method
In this study, we developed a method based upon geographical detector and PCA to identify natural and anthropogenic influencing factors of agricultural soil heavy metals, i.e., As, Cd, Hg, Cu, Pb, and Zn, at both single-object and multi-object levels, as depicted in Figure 3. We proposed a systematic and comprehensive solution at different levels to identify natural and anthropogenic influencing factors of agricultural soil heavy metals. At the single-object level, we directly used a geographical detector to identify natural factors or anthropogenic factors of each heavy metal, i.e., As, Cd, Hg, Cu, Pb, or Zn. At the multi-object level, we first used PCA to represent the concentrations of heavy metals including As, Cd, Hg, Cu, Pb, and Zn in agricultural soils. After PCA processing of As, Cd, Hg, Cu, Pb, and Zn, we selected principal components with eigenvalues greater than 1. Then, we used a geographical detector to identify natural factors or anthropogenic factors of each selected principal component. Furthermore, we obtained identification results at single-object and multi-object levels by q values, with the significance at p < 0.05 level.

Identification Method
In this study, we developed a method based upon geographical detector and PCA to identify natural and anthropogenic influencing factors of agricultural soil heavy metals, i.e., As, Cd, Hg, Cu, Pb, and Zn, at both single-object and multi-object levels, as depicted in Figure 3. We proposed a systematic and comprehensive solution at different levels to identify natural and anthropogenic influencing factors of agricultural soil heavy metals. At the single-object level, we directly used a geographical detector to identify natural factors or anthropogenic factors of each heavy metal, i.e., As, Cd, Hg, Cu, Pb, or Zn. At the multi-object level, we first used PCA to represent the concentrations of heavy metals including As, Cd, Hg, Cu, Pb, and Zn in agricultural soils. After PCA processing of As, Cd, Hg, Cu, Pb, and Zn, we selected principal components with eigenvalues greater than 1. Then, we used a geographical detector to identify natural factors or anthropogenic factors of each selected principal component. Furthermore, we obtained identification results at single-object and multi-object levels by q values, with the significance at p < 0.05 level. The geographical detector is a spatial statistical method to test the relationships between geographical phenomena and their potential influencing factors [22,23]. The geographical detector quantifies the relative importance of an influencing factor on the heavy metal concentrations by power determinant (q). The q is defined as follows: where h = 1, 2, …, L indicates a stratum; N is the number of samples; Nh is the number of strata. The parameter σ 2 refers to the variance of a heavy metal concentration for the whole study area, and 2 σ is the corresponding variance of a stratum. In this study, we used The geographical detector is a spatial statistical method to test the relationships between geographical phenomena and their potential influencing factors [22,23]. The geographical detector quantifies the relative importance of an influencing factor on the heavy metal concentrations by power determinant (q). The q is defined as follows: where h = 1, 2, . . . , L indicates a stratum; N is the number of samples; N h is the number of strata. The parameter σ 2 refers to the variance of a heavy metal concentration for the whole study area, and σ 2 h is the corresponding variance of a stratum. In this study, we used the significance of q value (p) as the key index to identify natural and anthropogenic influencing factors of heavy metals. The significance of the q value was defined at p < 0.05 level, and the corresponding q value indicated the influencing degree of this factor on heavy metals. For each natural and anthropogenic influencing factor of heavy metals, the p of the corresponding q value is less than the target significance (0.05), which indicates this influencing factor can be identified; otherwise, it cannot be identified.
In this study, we implemented the geographical detector algorithm using the software accessed from http://www.geodetector.cn/ (accessed on 15 October 2020). The CA, PCA, and data processing were carried out by SPSS 16.0 software. The fuzzy clustering was performed by MATLAB 7.0 software. The quantile method and spatial mapping were performed by ArcGIS 10.5 software.

The Identification of Influencing Factors of Heavy Metals at a Single-Object Level
We directly identified natural and anthropogenic factors of each heavy metal, i.e., As, Cd, Hg, Cu, Pb, or Zn, in Shunyi District, China. According to Equation (1), the obtained q values and corresponding significance p of influencing factors on each heavy metal concentration at a single-object level are shown in Table 2. The q values with the significance at p < 0.05 level are illustrated in Figure 4. This figure showed that As, Cd, and Cu concentrations were mainly derived from DEM, while annual deposition fluxes primarily influenced Cd, Hg, and Pb concentrations in Shunyi District. Meanwhile, annual deposition flux had a much higher association than DEM for Cd concentrations, and Zn concentrations had no association with all influencing factors, as shown in Table 2 and Figure 4.

The Identification of Influencing Factors of Heavy Metals at the Multi-Object Level
At the multi-object level, we first used PCA to show the concentrations of six heavy metals, i.e., As, Cd, Hg, Cu, Pb, and Zn, in agricultural soils for Shunyi District, China, and the PCA results are presented in Table 3. The eigenvalues of the first three extracted components were all greater than 1. The heavy metals in the area can be represented by the first three primary components that account for 64.496% of all the data variation. The component matrix showed that Cd, Cu, and Zn were strongly associated with the first component because of selecting the maximum principal component for each heavy metal. Moreover, As was the only element represented in the second component, while Hg and Pb mainly dominated the third component. In the second stage, we identified influencing factors including soil properties, DEM, land use, and annual deposition fluxes of each of the first three principal components using geographical detector software, and the identification results are shown in Table 4. Through the q values with the significance at p < 0.05 level, the results in Table 4 and Figure  5 suggested that the first component represented Cd, Cu, and Zn concentrations, which were mainly derived from DEM, land use, and annual deposition fluxes of As, Cd, Hg, and Pb. The second component represented As concentrations, which were primarily influenced by soil properties, DEM, and the annual deposition flux of Cd. The third component represented Hg and Pb concentrations, which were mainly influenced by annual deposition fluxes of Cd and Hg.

The Identification of Influencing Factors of Heavy Metals at the Multi-Object Level
At the multi-object level, we first used PCA to show the concentrations of six heavy metals, i.e., As, Cd, Hg, Cu, Pb, and Zn, in agricultural soils for Shunyi District, China, and the PCA results are presented in Table 3. The eigenvalues of the first three extracted components were all greater than 1. The heavy metals in the area can be represented by the first three primary components that account for 64.496% of all the data variation. The component matrix showed that Cd, Cu, and Zn were strongly associated with the first component because of selecting the maximum principal component for each heavy metal. Moreover, As was the only element represented in the second component, while Hg and Pb mainly dominated the third component. In the second stage, we identified influencing factors including soil properties, DEM, land use, and annual deposition fluxes of each of the first three principal components using geographical detector software, and the identification results are shown in Table 4. Through the q values with the significance at p < 0.05 level, the results in Table 4 and Figure 5 suggested that the first component represented Cd, Cu, and Zn concentrations, which were mainly derived from DEM, land use, and annual deposition fluxes of As, Cd, Hg, and Pb. The second component represented As concentrations, which were primarily influenced by soil properties, DEM, and the annual deposition flux of Cd. The third component represented Hg and Pb concentrations, which were mainly influenced by annual deposition fluxes of Cd and Hg.

Comparative Analysis of Cluster Analysis and Correlation Analysis Methods
We performed a comparative analysis between the proposed method and existing methods using cluster and correlation analysis. First, a fuzzy clustering method was used to represent the concentrations of heavy metals including As, Cd, Hg, Cu, Pb, and Zn in agricultural soils. Figure 6 shows the results of the fuzzy clustering for six heavy metals in Shunyi District. Compared with the PCA results of heavy metals, we selected three strata of the cluster results for comparison. Six heavy metals were divided into three strata: Cd, Zn, and Cu (first stratum), Pb and As (second stratum), Hg (third stratum) using fuzzy clustering, while three principal components represented Cd, Cu, and Zn (first component), As (second component), Hg and Pb (third component) using PCA, respectively. The results of the fuzzy clustering method and PCA method were generally consistent except for Pb.

Comparative Analysis of Cluster Analysis and Correlation Analysis Methods
We performed a comparative analysis between the proposed method and existing methods using cluster and correlation analysis. First, a fuzzy clustering method was used to represent the concentrations of heavy metals including As, Cd, Hg, Cu, Pb, and Zn in agricultural soils. Figure 6 shows the results of the fuzzy clustering for six heavy metals in Shunyi District. Compared with the PCA results of heavy metals, we selected three strata of the cluster results for comparison. Six heavy metals were divided into three strata: Cd, Zn, and Cu (first stratum), Pb and As (second stratum), Hg (third stratum) using fuzzy clustering, while three principal components represented Cd, Cu, and Zn (first component), As (second component), Hg and Pb (third component) using PCA, respectively. The results of the fuzzy clustering method and PCA method were generally consistent except for Pb.
Second, we also conducted a correlation analysis between heavy metal concentrations and influencing factors (Table 5). DEM was correlated with As, Cd, Hg, and Cu, while land use was correlated with Cd, Cu, and Zn. Moreover, annual deposition fluxes were correlated with Cd and Hg. Figure 7 shows the identification results for multi-object identification using the proposed method and the CA method. According to Figure 7, the multi-object identification results between heavy metals and influencing factors derived using the proposed method contained those derived from the CA method, except for the variable of Hg. Second, we also conducted a correlation analysis between heavy metal concentrations and influencing factors (Table 5). DEM was correlated with As, Cd, Hg, and Cu, while land use was correlated with Cd, Cu, and Zn. Moreover, annual deposition fluxes were correlated with Cd and Hg. Figure 7 shows the identification results for multi-object identification using the proposed method and the CA method. According to Figure 7, the multi-object identification results between heavy metals and influencing factors derived using the proposed method contained those derived from the CA method, except for the variable of Hg.

Comparisons with Related Studies
For six kinds of heavy metals in agricultural soils, the influencing factors of their con- Figure 7. The identification results for multi-object identification and CA methods.

Comparisons with Related Studies
For six kinds of heavy metals in agricultural soils, the influencing factors of their concentrations were mainly related to land use, annual deposition flux, DEM, soil type, and soil texture. This finding reflects that the concentrations of heavy metals in agricultural soil were attributed to the combination of natural and anthropogenic factors. Based upon existing research [17], local contamination from agricultural practices, particularly with the application of fertilizer and manure, was the main source of Cd, Cu, and Zn entering the agricultural soils. Their results support our finding that land use was the main influencing factor for Cd, Cu, and Zn in this study. The concentration of As was primarily influenced by soil properties, which was proved by the existing contributions [35,43]. Moreover, DEM was the secondary influencing factor for Cd and As [42], and atmospheric deposition played an important role in Cd, Hg, and Pb accumulations in the soils [17,32]. The identification results of heavy metal sources in agricultural soils are essential for protecting soil quality to promote the healthy development of agriculture.
To demonstrate the performance of the identification method at different levels in this study, we made two comparisons, i.e., between the single-object identification method and multi-object identification method, and between the identification method developed in this study and related studies [2,17,[43][44][45]. The results of the comparative analysis of this study and existing contributions are shown in Table 6. Table 6 shows that the identification results of different studies were generally consistent in terms of the concentrations of heavy metals influenced by the combination of natural and anthropogenic factors. In contrast, the influencing factors of each heavy metal were different between different studies. Based on the same location, the examined studies included two aspects, named related study I [17] and related study II [43], respectively. The results of the comparative analysis are shown in Table 7. Table 7 shows that the multi-object identification results between heavy metals and influencing factors included that of single-object identification in this study. Compared with related studies, the differences in identification results might be caused by different sampling times, locations, and identification methods for this study and related studies, as depicted in Table 7. Furthermore, we compared our study with the related study I over the same study area and heavy metals, as shown in Figure 8. This figure shows that the results derived at the single-object level using the geographical detector are significantly different from that of the existing method using PCA. In general, the proposed method at the multi-object level identified more influencing factors than the method based on the single-object level and the PCA method. Another interesting finding is that the derived results at the multi-object level refer to the combination of the results from the single-object method and the PCA, except for the variable of Pb. More specifically, (1) for variables of As and Zn, more influencing factors identified by the multi-object method than either the single-object identification or the PCA method; (2) for variables of Cd, Hg, and Cu, the influencing factors identified by the multi-object method were equal to the sum of results of the single object method and the PCA method. According to PCA results for six heavy metals in Table 3 and identification results in Table 4, the potential reason was that Pb was secondarily represented in the second component, and the second component was associated with soil properties, due to selecting only the maximum principal component for each heavy metal in this study. Therefore, the identification method developed at different levels in this study can identify much more influencing factors of heavy metals than frequently used methods, particularly for the multi-object identification method.

Uncertainty and Proper Use of Identification Method Developed in This Study
Although the proposed method achieved satisfactory results, further improvement can be conducted by interpolating the results of the atmospheric deposition samples and the stratification number of influencing factors, particularly for numerical variables. More

Uncertainty and Proper Use of Identification Method Developed in This Study
Although the proposed method achieved satisfactory results, further improvement can be conducted by interpolating the results of the atmospheric deposition samples and the stratification number of influencing factors, particularly for numerical variables. More specifically, the IDW interpolation results of dry and wet atmospheric deposition may lead to additional uncertainty to some extent in this study. An ideal IDW interpolation result is to obtain each point's actual value, making it challenging to describe high concentration when overall concentration is low [7]. Regarding influencing factors of heavy metals, the number of stratifications was determined according to prior knowledge in this study. By doing so, the number of stratifications should be determined in detail to express the spatial variability of heavy metals when using a geographical detector [42]. Furthermore, to select a geographical detector for identifying influencing factors, the significance level of q values needs to be specified, e.g., the significance at p < 0.05 level was used in this study.
We also provided suggestions for the proper use of the identification method developed in this study. Variation in data sources, study areas, and research targets should be considered for future implementation when applying the proposed method at different levels. For the cases with fewer variables, we recommend using the single-object method for identifying influencing factors of heavy metals; otherwise, we suggest using the proposed multi-object identification method.

Conclusions
This paper presented an identification method based on a geographical detector to identify natural and anthropogenic factors of six heavy metals (i.e., As, Cd, Hg, Cu, Pb, Zn) at single-object and multi-object levels in Shunyi District, China. The results suggested that Cd, Cu, and Zn concentrations were mainly influenced by DEM and land use factors, while annual deposition fluxes were the main factors of Hg, Cd, and Pb concentrations. Moreover, the concentration of As was primarily influenced by soil properties, DEM, and annual deposition flux. The multi-object identification results between heavy metals and influencing factors included that of single-object identification in this study. Compared with the frequently used PCA and CA methods, the identification method developed at different levels can identify much more influencing factors of heavy metals, particularly using the multi-object identification method. Due to its promising performance, the method developed at different levels can be widely employed for soil protection and pollution restoration.
Nevertheless, we expect that future work can be improved from the following two aspects: (1) to analyze whether the influencing factors of agricultural soil heavy metals operate independently or interconnect, and to quantify the interaction effect; (2) to identify and monitor high pollution risk areas in combination with the background values of heavy metals in agricultural soils.