Application of Integrated Land Use Regression and Geographic Information Systems for Modeling the Spatial Distribution of Chromium in Agricultural Topsoil

: Chromium (Cr) contamination is widely distributed in agricultural soil and poses a threat to agricultural sustainability. Developing integrated models based on soil survey data can be an effective measure to accurately predict the spatial distribution of Cr. Focused on an agriculturally dominated area, this study presents a novel hybrid mapping model that combines land use regression (LUR) and geostatistical methods to predict Cr distribution in topsoil and examines the influence of various influencing factors on Cr content. The LUR model was first adopted to screen the influencing factors for Cr predictions. Then LUR, was combined with ordinary Kriging (OK_LUR) and geographically weighted regression Kriging (GWRK_LUR) to describe the spatial distribution of Cr. Results showed that Cr distribution was profoundly influenced by soil Cu and Zn content, the distance between the soil sampling and livestock farm, orchard areas within 100 m, and population density within 1000 m. The developed GWRK_LUR model significantly improved the prediction accuracy of the OK_LUR and LUR models (by 9% and 16%, respectively). This model provides a novel route to account for the spatial distribution of Cr in agricultural topsoil at a regional scale, which has potential application in pollution remediation.


Introduction
Heavy metal pollution of agricultural soil has received significant attention due to its great influence on food safety and human health [1,2].Chromium (Cr) is widely distributed in agricultural soil, while hexavalent Cr is one of the most toxic heavy metals [3] and other valence states can be translated into hexavalent.Cr(VI) is not only more easily taken up by plants and affects the crop yield and quality, but also has harmful effects on humans and animals due to its carcinogenicity, mutagenicity, and genotoxicity.Human exposure to Cr is mainly derived from soil-crop systems because Cr is easily integrated into the food chain.Considering these potential risks to humans and plants, Cr contamination in soil has attracted worldwide attention [4].
It also has been reported that its distribution is influenced by a range of physicochemical properties of soil (e.g., soil type, geology, and pH) [5] and anthropogenic processes (e.g., industrialization, land use, traffic) [6].Studies have confirmed that Cr pollution in agriculturally dominated areas (i.e., without the influence of industry) is mainly derived from the soil parent material [7].Components within the soil parent material, including clay mineral colloids and functional groups, potentially exert a significant influence on the adsorption, desorption, migration, and transformation of Cr.Specifically, the mechanisms may differ considerably, depending on the type and inherent properties of the soil parent material, environmental conditions, and the chemical character of the Cr present [8].Both soil parent material and human activity can affect the distribution of Cr [9,10], making it challenging to evaluate Cr contamination in soils.
In soil Cr investigation, traditional surface sampling remains a reliable method for data collection [11].However, given the high cost associated with soil surveys, various analytical techniques have been developed for Cr distribution mapping, such as multivariate statistical analysis and geostatistical methods [12,13].Nevertheless, these methods have limitations.They potentially overlook fine-scale spatial changes in soil properties, and struggle to explain the specific factors that influence the spatial distribution of soil heavy metal contamination at a regional scale [14].
It is widely established that using correlated influencing factors as auxiliary variables can enhance prediction accuracy [14,15].However, these models often fail to cover different variable types within a single model framework.Land use regression (LUR) employs ground observations and multiple geographic covariates, including land use type, to build a regression model for estimating concentrations in areas lacking monitoring data [16].LUR has demonstrated its effectiveness in predicting the distribution of heavy metals [17].Nevertheless, the relationship between these factors and heavy metal contents is rarely linear in nature [18], which may undermine the simulation performance of LUR models.Fortunately, geographically weighted regression (GWR) has a good ability to solve this problem.GWR can depict nonstationary spatial effects in the relationship between the dependent and independent variables through allowing model parameters to vary across space [19].Geographically weighted regression Kriging (GWRK) integrates ordinary Kriging (OK) with GWR, enabling spatial prediction of unrecognized secondary factors contained in the GWR model.
Up to now, although researchers have adopted various methods to explore soil Cr distribution and affect factors, it is still unclear how Cr is distributed in agriculture dominated areas and which factors determine its distribution.Reasonable and high-precision mapping models for Cr spatial distribution are necessary but deficient.The objectives of this study were to identify the key factors influencing Cr distribution in agricultural topsoil in addition to the soil parent material, and to develop a spatial distribution model for Cr accounting for these influencing factors.We aimed to (1) develop a predictive mapping model for Cr concentrations in surface soil, (2) investigate the spatially variable relationship between the influencing factors and Cr concentrations, and (3) evaluate the overall distribution of Cr in topsoil in a typical agriculturally dominated area.The proposed model holds the potential not only to reveal the distribution of topsoil Cr in traditional agricultural areas but also to predict the spatial distribution of other soil pollutants at regional scales, thus enhancing our understanding and management of agricultural sustainability.

Soil Sampling and Chemical Analysis
The study area (116 1) is situated in Pinggu district, Beijing, China.As a traditionally agriculture dominated region, it has a large proportion of agricultural land (66.26%) which is mainly cultivated as orchard, vegetable fields, and cropland.Fruit sales and crop production are major components of the local economy; meanwhile, the rapid development of the agricultural economy in this area may have concomitantly increased soil pollution levels.The mountainous terrain is dominant in this district, and the main soil parent material is alluvial-fluvial matter.The main soil texture types are cinnamon soil and fluvo-aquic soil.
The sampling design also considered the distribution of Cr among various areas, soil types, and topographical positions; 208 sampling sites were ultimately collected in agricultural soil (Figure 1a).The geographic coordinates of these sampling sites were recorded using a global positioning system.
The sampling design also considered the distribution of Cr among various areas, soil types, and topographical positions; 208 sampling sites were ultimately collected in agricultural soil (Figure 1a).The geographic coordinates of these sampling sites were recorded using a global positioning system.At a depth of 0-25 cm, five subsamples surrounding the designated sampling location were collected from a 10 m × 10 m grid using a stainless-steel spade and were well mixed.After removing the stones and other debris, approximately 1 kg of soil was packaged in plastic bags and taken to the laboratory, then air-dried at ambient temperature and passed through a 100-mesh sieve.Cr contents were measured using graphite furnace atomic absorption spectrometry (PinAAcle 900 T, PerkinElmer, Shelton, CT, USA) after acid digestion (HCl, HNO3, HF, and HClO4).

Calculation of Potential Predictors
In addition to soil and terrain factors (e.g., pH, texture) that have commonly been used to predict the spatial distribution of heavy metals in previous studies [18], this study also incorporated auxiliary environmental variables in the development of the LUR model.Considering soil properties and auxiliary variables, we employed 84 unique covariates as the potential predictors (Table 1), including physicochemical characteristics of the soil (X1~X18), the shortest distance to a road/river/livestock farm (X19~X21), and the areas of land use types, road/river length, and population density within the circular buffer zones (X22~X84).

Categories
NO.

Description Factors (Units)
General categories

Specific categories
X22~X28 Vegetable field area in buffer V20, V50, V100, V200, V350, V500, V1000 (m 2 ).X29~X35 Cropland area in buffer C20, C50, C100, C200, C350, C500, C1000 (m 2 ).At a depth of 0-25 cm, five subsamples surrounding the designated sampling location were collected from a 10 m × 10 m grid using a stainless-steel spade and were well mixed.After removing the stones and other debris, approximately 1 kg of soil was packaged in plastic bags and taken to the laboratory, then air-dried at ambient temperature and passed through a 100-mesh sieve.Cr contents were measured using graphite furnace atomic absorption spectrometry (PinAAcle 900 T, PerkinElmer, Shelton, CT, USA) after acid digestion (HCl, HNO 3 , HF, and HClO 4 ).

Calculation of Potential Predictors
In addition to soil and terrain factors (e.g., pH, texture) that have commonly been used to predict the spatial distribution of heavy metals in previous studies [18], this study also incorporated auxiliary environmental variables in the development of the LUR model.Considering soil properties and auxiliary variables, we employed 84 unique covariates as the potential predictors (Table 1), including physicochemical characteristics of the soil (X1~X18), the shortest distance to a road/river/livestock farm (X19~X21), and the areas of land use types, road/river length, and population density within the circular buffer zones (X22~X84).
All land use maps, river and road network datasets, and data for the independent livestock farms in the study area were obtained from the National Science and Technology Infrastructure of China, National Earth System Science Data Sharing Infrastructure (http: //www.geodata.cn/,accessed on 20 April 2020).Soil type maps and soil property were acquired from the regional Digital Soil System.Population grid data for 2015 with a 1 km resolution were provided by the Resource and Environmental Science Data Center of the Chinese Academy of Sciences (http://www.resdc.cn/,accessed on 20 April 2020).The shortest distance to the nearest livestock farm/river /road was accordingly calculated.These auxiliary variables were obtained via GIS based on the basic geographic data.The acreage of each land use type, the total road/river length, and population density were also calculated based on specific circular buffer zones; the buffer radius was 20, 50, 100, 200, 350, 500 and 1000 m.

LUR Model
The first step of the modeling was to screen the factors that exhibited significant correlations with soil Cr concentration, and then remove the partial variables that may have had collinearity (r > 0.60), to ensure the interpretability of the model parameters.The relationship between Cr content and environmental variables was analyzed using the nonparametric Spearman correlation test [20].Secondly, based on the selection of variables via correlation analysis, a multivariate linear regression analysis with backward selections was performed to identify the most significant variables and build the LUR model [21].Descriptive, relationship, and regression analyses were all conducted using SPSS.22.0 software (SPSS Institute).The LUR model can be expressed as Equation ( 1): where y LUR is the Cr content in topsoil, x 1 , x 2 , . . .and x m refer to the relevant influence factors, β 0 is the intercept (constant value), β 1 , β 2 , . . .and β m are the regression coefficients, and ε is the residual (accidental error).Thirdly, based on the assumption of independent residuals in linear regression, a standard diagnostic test was conducted to assess the spatial autocorrelation of the residuals using global Moran's I in ArcGIS 10.5 [22].The two previous steps were iteratively repeated until the final LUR models encompassed only significant variables and satisfied all regression assumptions, including normally distributed residuals with low spatial autocorrelation.

Ordinary Kriging Combined with LUR Model (OK_LUR)
OK_LUR can consider the auxiliary variables at specific points for interpolating the outputs.This approach is grounded in the principle that the deterministic component of the target variable is explained through the regression methods.The predicted value is formed by the LUR model prediction and the OK prediction of the residual at a given point.The process of OK_LUR can be summarized as Equation ( 2): where y i represents the predicted value of the Cr content in topsoil at the ith location according to OK_LUR, β i is the intercept, x ij is the jth predictors' value of the ith observation, β j is the jth predictors' coefficient, ε i is the residual of the regression via semi-variogram and OK, and y pi is the predicted value of the Cr contents at the ith location according to OK_LUR without accidental error.Geostatistical analysis and OK interpolation were conducted with ArcGIS 10.5 (ESRI, Redlands, CA, USA).

Geographical Weighted Regression Kriging Combined with LUR Model (GWRK_LUR)
In this phase, we constructed the GWRK_LUR model based on predictors identified in the LUR model.Specifically, this model represents a local LUR model based on the GWR method.While the predictors remained static, the variable coefficients instead of the constant coefficient varied over the study domain.The local coefficient estimates were obtained through weighting neighboring observations using a distance-adjusted kernel function [23].Therein, the GWR tool in ArcGIS 10.5 was used, with the kernel type and bandwidth method parameters configured as 'adaptive' and 'AICc'.Then, the OK method was used to interpolate residuals from the GWR.The GWRK_LUR model is specified according to Equation (3): where y * i represents the predicted value of the Cr topsoil content at the ith location as given by GWRK_LUR, β * i is the spatially varying intercept at the ith observation, x ij is the jth predictors' value of the ith observation, (u i ,v i ) is the location in geographical space of the ith observation, β j(u i v i ) is the jth predictors' spatially varying coefficient at the ith location, and ε * i is the residual of the GWR according to OK.

Model Validations
The sensitivity of the models was examined via k-fold cross-validation (k-fold CV).Sampling sites were randomly divided into k parts, with k − 1 parts serving as the training set for model fitting, and the remaining part used for testing.The above procedure was repeated k times (k = 10) [24,25].Then, the coefficient of determination (R 2 ), mean bias error (MBE), root mean squared error (RMSE), and normalized root mean squared error (NRMSE) were calculated to describe the accuracy of the models [26].Generally, a higher CV-R 2 value, the coefficient of determination based on the k-fold CV, indicated better predictive ability of the Cr models; a positive value of MBE indicated that the observed value overestimated the simulated value and vice versa; lower RMSE and NRMSE values (closer to zero) meant more stable and accurate models.
where y i is the observed value of Cr contents at the ith location, ŷi is the predicted value, y i is the mean of the observed values, and N is the number of paired observed simulated values.

Descriptive Statistics for Topsoil Cr Concentrations
Cr concentrations in 208 soil samples ranged from 50.10 to 202.74 mg/kg with a standard deviation of 19.38, which was noted to be below the world average [27].The mean concentration (68.59 mg/kg) was found to be lower than the upper threshold (150 mg/kg) according to the soil environmental quality standards of both China and the USA.Furthermore, none of the sampled data exceeded the limits (800 mg/kg) that could generate risk to the soil environment.
Moreover, Cr concentrations within the study area were not much different from soil background values, which were 50.60-163.00mg/kg with a mean concentration of 68.10 mg/kg [28].Agricultural activities in the past 30 years had not significantly impacted Cr concentrations in the topsoil within the study area.
The topsoil Cr concentrations displayed medium spatial variability with 28.25% coefficient of variations, indicating that Cr concentration was likely to have been influenced by natural factors such as soil genesis and parent material [9].The measurements of all sampling points were highly skewed to the right (skewness = 3.97); therefore, they required transformation using the natural logarithm (Figure A1 in Appendix A).In addition, the topsoil Cr concentrations presented significant spatial autocorrelation with positive global Moran's I (0.15) and p < 0.01.

Evaluation of Environmental Factors Based on the Correlation between Predictors and Cr Concentrations
Based on the results of the Spearman correlation test, correlations between factors and Cr contents showed that the Cu factor had the largest correlation with Cr, followed by Zn, Mo, pH, population density within 1000 m (P1000), an orchard within 100 m (O100), cropland within 350 m (C350), and the shortest distance to a livestock farm (near_lf).Significant positive correlations between factors (Cu, near_lf, Zn, Mo, O100, C350) and Cr concentrations (Figure 2) indicated these factors could potentially impact the physicochemical process of Cr in soil.
The pH values of 208 soil samples ranged from 4.30 to 8.21 with a standard deviation of 0.76.The average value was 7.22, and most of the values were concentrated between 6.80 to 8.20.Soil pH can affect heavy metal leachability and bioavailability; it was utilized for a Cr-dependent criterion.The observed negative correlation between pH and Cr in our study was consistent with the results in paddy soil [29].
The soil organic matter (SOM) content of the sampling points ranged from 3.25 to 112.34 g/kg, with an average of 21.89 g/kg.The majority of these samples' content was mainly concentrated between 10 and 36 g/kg.Previous research has established that SOM significantly affects the distribution of Cr, primarily via complexation processes, redox reactions, and modulating the mobility of Cr [30].However, in our study, no significant correlation was observed between SOM and Cr.This lack of correlation may be attributed to the interference of other factors at a regional scale level.
Based on the correlations between 84 factors (Table 1) and Cr concentrations, we determined 8 factors (Cu, near_lf, Zn, Mo, pH, O100, C350, P1000) that demonstrated significant correlation as potential predictors of Cr LUR models.The selected predictors within the buffer zones (O100, C350, and P1000) exhibited significant spatial heterogeneity (Figure 3) and effectively depicted the differences across the entire study area [31].These predictors provided more information compared with commonly utilized variables in regression models.

Comparison of the LUR, OK_LUR and GWRK_LUR Models
In this study, the OK method was employed to interpolate the residuals derived from the LUR and GWR models.Based on the Kolmogorov-Smirnov (K-S) test, through natural logarithm transformation, it was foundthat the residuals of both the OK_LUR and GWRK_LUR models followed normal distribution.The Gaussian model offered the most optimal fit for the semivariogram (Table A1) of the OK_LUR residuals, while the exponential model proved the best fit for the GWRK_LUR residuals.termined 8 factors (Cu, near_lf, Zn, Mo, pH, O100, C350, P1000) that demonstrated significant correlation as potential predictors of Cr LUR models.The selected predictors within the buffer zones (O100, C350, and P1000) exhibited significant spatial heterogeneity (Figure 3) and effectively depicted the differences across the entire study area [31].These predictors provided more information compared with commonly utilized variables in regression models.Table 2 compares the performances of the LUR, OK_LUR, and GWRK_LUR models in modeling the spatial distribution of Cr.These models considered Zn, Cu, near_lf, O100, and P1000 factors as predictor variables.The results indicated that the LUR, OK_LUR, and GWRK_LUR models captured 28%, 37%, and 44% of the variation in Cr, respectively.All models exhibited a certain degree of overestimation of the Cr concentration according to the MBE.In addition, the Moran's I value of the LUR model residuals was 0.03, which indicated that the LUR model did adequately reduce the spatial autocorrelation in the Cr data.
The OK_LUR and GWRK_LUR models improved the LUR model's explanation by 9% and 16%, respectively.This phenomenon was attributed to the information contained in the LUR model residuals [32].The GWRK_LUR model showed better performance than OK_LUR.It benefited from the capacity to capture the spatial non-stationarity in the relationships through allowing coefficients to change in space [31].In contrast, the OK_LUR model was unable to adequately describe local relationships.The coefficients of factors and Cr in the GWRK_LUR model further supported this observation (Table 2).The GWRK_LUR model not only had the highest R 2 , but also obtained optimal MBE, RMSE, and NRMSE compared with the LUR and OK_LUR models.However, the highest local R 2 (0.58) of the DWRK_LUR model appeared in the north area, and local R 2 in the central region was consistently lower than 0.3 (Figure A2).It can be concluded that the quality of the GWRK_LUR model performance was non-uniform in space.In particular regions, the predictive capability of land-use factors may have been overestimated when using global regression (LUR and OK_LUR models).The coefficient between Cu and Cr was the highest (r = 6.118) according to the regression of the normalized parameters with the Z-score.This finding is consistent with reports that Cu and Cr have a major common origin in agricultural soil [33].The Cr models incorporated only a single land-use factor (O100), consistent with previous findings that only a limited number of land-use factors could be included in the final LUR models [16].
The accuracy of the geostatistical models was greatly dependent on the quality of sampling [5].Our models were based on 208 observations, exceeding the typical 40-80 samples recommended by Hoek and coworkers [22].Although more observations could obtain influencing factors that might be neglected through modeling with fewer points, we also sacrificed the prediction accuracy presented by the model (R 2 ).Additionally, the barely satisfactory R 2 of these models also proved that the distribution of Cr in agricultural soil was significantly influenced by the soil parent material [33].

Spatial Distribution of Predicted Cr Concentrations
The ranges of the Cr concentrations' predicted values in agricultural topsoil with the DWRK_LUR and OK_LUR models were 46.36-162.67 mg/kg (Figure 4) and 48.35-113.99 mg/kg (Figure A3), respectively.The DWRK_LUR model yielded predictions that were closer to the real observations (50.10-202.74mg/kg), indicating its superior capability in reflecting the spatial structure of the original regionalized variables.A Kriged surface is well known to be smoother than the regionalized variable predicts.Their strong smoothing and centralizing effects can lead to predicted values converging towards a narrow range [5].
The spatial distribution mapping results showed the continuous variety of characteristics of topsoil Cr across the area (Figure 4).Lower Cr concentrations were observed in the southern region, while elevated Cr concentrations emerged in the west, north, and northeast (Figure 4a).The Cr distribution exhibited a steeper gradient in the central region, coinciding with the distribution of Zn, Cu, O100, and P1000 (Figure 3).Higher Cr concentrations were predominantly found in the mountainous regions.This was chiefly because the soil from these regions was less sandy and retained the original soil state with fewer anthropogenic effects, including smaller populations and predominantly traditional patterns of agriculture [34].The weathering and mineralization processes in the mountains provided a high quantity of Cr to the soil [33,35], while anthropogenic effects in the mountainous regions may have contributed to a relatively lower transformation/transfer of Cr compared with more developed regions [19].This result indicated the interactions of different sources of artificial and natural impacts on the distribution of Cr, aligning with previous research conducted in the Yellow River Delta [8].Soil parent materials may have affected the 64.5% Cr concentration in soil, although was not considered for quantitative analysis in this study and should be improved in the future.
we also sacrificed the prediction accuracy presented by the model (R ).Additionally, the barely satisfactory R 2 of these models also proved that the distribution of Cr in agricultural soil was significantly influenced by the soil parent material [33].

Spatial Distribution of Predicted Cr Concentrations
The ranges of the Cr concentrations' predicted values in agricultural topsoil with the DWRK_LUR and OK_LUR models were 46.36-162.67 mg/kg (Figure 4) and 48.35-113.99 mg/kg (Figure A3), respectively.The DWRK_LUR model yielded predictions that were closer to the real observations (50.10-202.74mg/kg), indicating its superior capability in reflecting the spatial structure of the original regionalized variables.A Kriged surface is well known to be smoother than the regionalized variable predicts.Their strong smoothing and centralizing effects can lead to predicted values converging towards a narrow range [5].The spatial distribution mapping results showed the continuous variety of characteristics of topsoil Cr across the area (Figure 4).Lower Cr concentrations were observed in the southern region, while elevated Cr concentrations emerged in the west, north, and northeast (Figure 4a).The Cr distribution exhibited a steeper gradient in the central region, coinciding with the distribution of Zn, Cu, O100, and P1000 (Figure 3).Higher Cr concentrations were predominantly found in the mountainous regions.This was chiefly because the soil from these regions was less sandy and retained the original soil state with fewer anthropogenic effects, including smaller populations and predominantly traditional patterns of agriculture [34].The weathering and mineralization processes in the mountains provided a high quantity of Cr to the soil [33,35], while anthropogenic effects in the mountainous regions may have contributed to a relatively lower transformation/transfer of Cr compared with more developed regions [19].This result indicated the interactions of different sources of artificial and natural impacts on the distribution of The predicted Cr distribution exhibited a cold-hot-spot pattern, with high values clustered in the north area and low values clustered in the south (Figure 4b).Some areas had no significant cold-hot spots, indicating a lack of continuous high or low values.These localized variations in Cr distribution were attributed to a range of factors, including terrain, crop types, irrigation methods, and the distribution of livestock farms [7].The Cr pattern was very similar to the Cu distribution (Figure 4b), and this result agreed with previous research on Chinese arable soil [36].
The modeling results showed the Cr contents were below the threshold (800 mg/kg) that could generate risk to the soil environment.They suggested that agricultural soil in the study area is not contaminated with Cr.While Cr from natural sources tends to have low bioavailability in its highly stable form [35], agricultural inputs represent other primary sources that could potentially cause risk to soil.Therefore, it is imperative to develop management practices and policies aimed at reducing pollutant inputs, to ensure the sustainable utilization of land resources and protect farmland soil from heavy metal contamination.

Effect of Predictors on Cr Concentrations Based on Coefficients
A significant positive coefficient between Zn and Cr was found in the southwest (Figure 5b), while the influence of Cu on Cr was negative in this region (Figure 5c).The Zn and Cu factors exhibited similar positively correlation with Cr content in their high-value region and were negatively correlated in their low-value region (Figure 3a,b).This observation in the high-value region suggests a potential common source of Zn, Cu, and Cr [33].Agricultural practices using water irrigation, animal manure, and chemical fertilizers are contributory factors to the increased presence of not only Cu and Zn but also potentially Cr [36,37].The observation in the low-value region indicated that the environment was conducive to Cr enrichment in topsoil.In the central area, the influence of Zn and Cu on Cr content was non-significant (Figure A4a,b), which could be attributed to their complex spatial heterogeneity (Figure 3) [6].[33].Agricultural practices using water irrigation, animal manure, and chemical fertilizers are contributory factors to the increased presence of not only Cu and Zn but also potentially Cr [36,37].The observation in the low-value region indicated that the environment was conducive to Cr enrichment in topsoil.In the central area, the influence of Zn and Cu on Cr content was non-significant (Figure A4a,b), which could be attributed to their complex spatial heterogeneity (Figure 3) [6].The shortest distance to a livestock farm (near_lf), orchard area within 100 m (O100), and population density within 1000 m (P1000) exhibited notably positive effects on Cr in the western region, whereas they displayed more negative effects in the eastern area (Figure 5d-f).The near_lf factor significantly influenced soil Cr in most areas (Figure A4), and the largest effect was observed in the southwest (Figure 5).This might be the dominant livestock farm existing in this region (Figure 3).The O100 factor had no statistically significant effect on Cr in a small area of the eastern region (Figure A4).The impact of P1000 factor on Cr varied spatially with irregular coefficients.As an ecological conservation development area, the eastern region has less agricultural land and fewer livestock farms (Figure 3).Less environmental impact could possibly explain the non-significant correlation between predictors and Cr concentration in this area [7,13].
The effect of predictors on Cr displayed a trend from low to high in the southeast (Figure 5).This weak prediction effect was attributed to the fact that this area is the administrative center of the study area and has the largest population density (Figure 3e), indicating that agricultural land was not the primary land-use type [6].This phenomenon, namely the non-significant effect on Cr in the southeastern region, also caused a lower local R 2 in this area.
Overall, the GWRK_LUR model provided superior performance compared with the other models.The geographically varying coefficients of these factors provided evidence of the spatial non-stationarity in the relationship between the predictors and Cr contents [31].Furthermore, the spatially discrepant effect of the predictors on Cr was also presented.

Conclusions
A hybrid mapping model integrating LUR and geostatistical methods was developed to predict the spatial distribution of topsoil Cr at a regional scale.The proposed GWRK_LUR model significantly improved the accuracy of prediction achieved with the The shortest distance to a livestock farm (near_lf), orchard area within 100 m (O100), and population density within 1000 m (P1000) exhibited notably positive effects on Cr in the western region, whereas they displayed more negative effects in the eastern area (Figure 5d-f).The near_lf factor significantly influenced soil Cr in most areas (Figure A4), and the largest effect was observed in the southwest (Figure 5).This might be the dominant livestock farm existing in this region (Figure 3).The O100 factor had no statistically significant effect on Cr in a small area of the eastern region (Figure A4).The impact of P1000 factor on Cr varied spatially with irregular coefficients.As an ecological conservation development area, the eastern region has less agricultural land and fewer livestock farms (Figure 3).Less environmental impact could possibly explain the non-significant correlation between predictors and Cr concentration in this area [7,13].
The effect of predictors on Cr displayed a trend from low to high in the southeast (Figure 5).This weak prediction effect was attributed to the fact that this area is the administrative center of the study area and has the largest population density (Figure 3e), indicating that agricultural land was not the primary land-use type [6].This phenomenon, namely the non-significant effect on Cr in the southeastern region, also caused a lower local R 2 in this area.
Overall, the GWRK_LUR model provided superior performance compared with the other models.The geographically varying coefficients of these factors provided evidence of the spatial non-stationarity in the relationship between the predictors and Cr contents [31].Furthermore, the spatially discrepant effect of the predictors on Cr was also presented.

Figure 1 .
Figure 1.Location of soil sampling sites (a) and geo-environmental elements (b) in the study area.

Figure 1 .
Figure 1.Location of soil sampling sites (a) and geo-environmental elements (b) in the study area.

Figure 2 .
Figure 2. Correlations between predictors and Cr concentrations.Only the significant factors are presented in this figure.* p-value (i.e., significant level) less than 0.05; ** p-value below 0.01; *** pvalue 0.00.Cr, Zn, Cu, and Mo are the contents (mg/kg) of Cr, Zn, Cu, and Mo in soil; near_lf is the shortest distance (m) between sampling site and a livestock farm; O100 is an orchard area (m 2 ) with a circular buffer size of 100 m; C350 is the cropland area (m 2 ) within 350 m; P1000 is population density (per km 2 ) within 1000 m.

Figure 2 .
Figure 2. Correlations between predictors and Cr concentrations.Only the significant factors are presented in this figure.* p-value (i.e., significant level) less than 0.05; ** p-value below 0.01; *** p-value 0.00.Cr, Zn, Cu, and Mo are the contents (mg/kg) of Cr, Zn, Cu, and Mo in soil; near_lf is the shortest distance (m) between sampling site and a livestock farm; O100 is an orchard area (m 2 ) with a circular buffer size of 100 m; C350 is the cropland area (m 2 ) within 350 m; P1000 is population density (per km 2 ) within 1000 m.

Figure 3 .
Figure 3.The distribution of predictors across the study area.(a) It isthe contents (mg/kg) of Zn in the soil; (b) It is the contents (mg/kg) of Cu in the soil; (c) It is the shortest distance (m) between the sampling site and a livestock farm; (d) It is an orchard area (m 2 ) with a circular buffer size of 100 m; (e) It is population density (per km 2 ) within 1000 m.The color ranging from purple to red represents the value from small to large.

Figure 3 .
Figure 3.The distribution of predictors across the study area.(a) It isthe contents (mg/kg) of Zn in the soil; (b) It is the contents (mg/kg) of Cu in the soil; (c) It is the shortest distance (m) between the sampling site and a livestock farm; (d) It is an orchard area (m 2 ) with a circular buffer size of 100 m; (e) It is population density (per km 2 ) within 1000 m.The color ranging from purple to red represents the value from small to large.

Figure 4 .
Figure 4. Spatial distribution (a) and cold-hot-spot pattern (b) of Cr content predicted via by land use regression combined with geographically weighted regression Kriging (DWRK_LUR).

Figure 4 .
Figure 4. Spatial distribution (a) and cold-hot-spot pattern (b) of Cr content predicted via by land use regression combined with geographically weighted regression Kriging (DWRK_LUR).

Figure 5 .
Figure 5. Effects of predictors on Cr across the study area.(a) It is the intercept of the GWRK_LUR model; (b) It is the contents (mg/kg) of Zn in the soil; (c) It is the contents (mg/kg) of Cu in the soil; (d) It is the shortest distance (m) between sampling sites and a livestock farm; (e) O100 is orchard area (m 2 ) with a circular buffer size of 100 m; (f) P1000 is population density (per km 2 ) within 1000 m.The color ranging from blue to red represents the value from small to large.

Figure 5 .
Figure 5. Effects of predictors on Cr across the study area.(a) It is the intercept of the GWRK_LUR model; (b) It is the contents (mg/kg) of Zn in the soil; (c) It is the contents (mg/kg) of Cu in the soil; (d) It is the shortest distance (m) between sampling sites and a livestock farm; (e) O100 is orchard area (m 2 ) with a circular buffer size of 100 m; (f) P1000 is population density (per km 2 ) within 1000 m.The color ranging from blue to red represents the value from small to large.

Figure A2 .
Figure A2.The distribution of local R2 for the Cr DWRK_LUR model.

Figure A2 . 16 Figure A3 .
Figure A2.The distribution of local R2 for the Cr DWRK_LUR model.Sustainability 2024, 16, x FOR PEER REVIEW 14 of 16

Figure A4 .
Figure A4.Effects of predictors with Z-score normalized on Cr across the study area.(a) It is the Zscore normalized of Zn in the soil; (b) It is the Z-score normalized of Cu in the soil; (c) It is Z-score normalized of the shortest distance between sampling site and a livestock farm; (d) Itis the Z-score normalized of orchard area with a circular buffer size of 100 m; (e) It is the Z-score normalized of population density within 1000 m.Different colors in the images represent different ranges of numerical values.

Figure A3 . 16 Figure A3 .
Figure A3.Spatial distribution (a) and cold-hot-spot pattern (b) of Cr concentrations predicted through ordinary Kriging combined with the LUR model (OK_LUR).

Figure A4 .
Figure A4.Effects of predictors with Z-score normalized on Cr across the study area.(a) It is the Zscore normalized of Zn in the soil; (b) It is the Z-score normalized of Cu in the soil; (c) It is Z-score normalized of the shortest distance between sampling site and a livestock farm; (d) Itis the Z-score normalized of orchard area with a circular buffer size of 100 m; (e) It is the Z-score normalized of population density within 1000 m.Different colors in the images represent different ranges of numerical values.

Figure A4 .
Figure A4.Effects of predictors with Z-score normalized on Cr across the study area.(a) It is the Z-score normalized of Zn in the soil; (b) It is the Z-score normalized of Cu in the soil; (c) It is Z-score normalized of the shortest distance between sampling site and a livestock farm; (d) Itis the Z-score normalized of orchard area with a circular buffer size of 100 m; (e) It is the Z-score normalized of population density within 1000 m.Different colors in the images represent different ranges of numerical values.

Table 1 .
Descriptions of the potential predictors.

Table 2 .
Descriptive statistics for the Cr LUR, OK_LUR, and GWRK_LUR models.

Table A1 .
The results of geostatistical analysis of the residuals.