Spatial Distribution of Soil Nutrients in Farmland in a Hilly Region of the Pearl River Delta in China Based on Geostatistics and the Inverse Distance Weighting Method

: Soil nutrients are essential factors that reﬂect farmland quality. Nitrogen, phosphorus, and potassium are essential elements for plants, while silicon is considered a “quasi-essential” element. This study investigated the spatial distribution of plant nutrients in soil in a hilly region of the Pearl River Delta in China. A total of 201 soil samples were collected from farmland topsoil (0–20 cm) for the analysis of total nitrogen (TN), available phosphorus (AP), available potassium (AK), and available silicon (ASi). The coefﬁcients of variation ranged from 47.88% to 76.91%. The NSRs of TN, AP, AK, and ASi were 0.15, 0. 07, 0.12, and 0.13, respectively. The NSRs varied from 0.02 to 0.20. All variables exhibited weak spatial dependence (R 2 < 0.5), except for TN (R 2 = 0.701). After comparing the prediction accuracy of the different methods, we used the inverse distance weighting method to analyze the spatial distribution of plant nutrients in soil. The uniform spatial distribution of AK, TN overall showed a trend of increasing from northeast to southwest, and the overall spatial distribution of AP and ASi showed that the northeast was higher than the southwest. This study provides support for the delimitation of basic farmland protection areas, the formulation of land use spatial planning, and the formulation of accurate farmland protection policies.


Introduction
Soil nutrients are important indicators of cultivated land quality, which is determined by two aspects: soil fertility and spatial location [1]. Soil fertility, which is closely related to the concentrations of soil nutrient elements, is the foundation of soil productivity [2]. Soil nutrient elements can be classified into essential, beneficial, and toxic elements. Essential elements are critical for all plants under all growth conditions, and can be divided into two categories on the basis of their essentiality: (1) macro elements that are required in high amounts, e.g., nitrogen (N), phosphorus (P), and potassium (K), and (2) micro elements that are required in lower amounts, e.g., zinc, manganese, copper, and nickel [3]. Beneficial elements, which usually include sodium, silicon (Si), and cobalt, are vital for some specific plant species growing under certain environmental conditions [4]. A sufficient supply of soil nutrients allows plants to grow and develop normally, and the abundance and chemical form of nutrients directly determines whether the soil is fertile [5]. Therefore, it is essential to measure the concentrations of soil nutrient elements to estimate farmland quality.
Geostatistical methods as a predictive tool have been extensively utilized [6]. There have been many studies of soil nutrient properties in recent years. For example, some studies have assessed soil nutrient conditions in the northeast region of India [2], and predicted their spatial distribution in farmland in Croatia [7]. Similar studies have been conducted in China in Yunnan Province [8], Beijing [9,10], and Shanxi Province [11], as well as in hilly areas of Guangdong [12], Sichuan [13], and Hebei Province [14]. Some researchers have attempted to analyze the spatial variability of soil nutrient properties in farmland in New Zealand [15] and India [16], or in a sandy loam soil in Croatia [17]. In these studies, although different factors were chosen for the evaluations to address different research purposes, the indexes generally included soil organic matter, mineral nutrient elements, and other soil properties, such as pH. Among the mineral nutrient elements, macronutrients have been studied more than micronutrients, especially the essential elements N, P, and K, which are among the most important indicators commonly used to analyze soil fertility.
Silicon is the second most plentiful element in the Earth's crust, accounting for nearly 29% of the total content of the crust [18]. Generally, Si plays an important role in soil. In aquatic ecosystems, Si along with N and P are the main biogenic elements that maintain net primary productivity (NPP) [19]. Silicon also has a great impact on the growth and development of plants in natural ecosystems due to its unique function of alleviating the deleterious effects of abiotic and biotic stresses in plants [3]. In modern agriculture, Si is recognized as a functional nutrient for rice, sugar cane, etc., and plays an especially important role in the growth and development of Gramineae crops [20]. Within the Poaceae family, rice is the staple food for more than half of the world's population. Rice plants are Si accumulators and are a high Si demanding crop, usually absorbing Si in greater quantity than essential nutrients such as N, P, K, and Ca [21]. The relationship between increasing rice yield and Si depletion has attracted increasing attention from agricultural scientists [18]. The area studied in the present research has a rich rice yield but is relatively lacking in soil nutrients, partially due to the effects of acid rain, and the absence of Si and various macronutrients (N, P, and K) may limit the rice yield [22].
Soil nutrient properties can vary spatially due to many factors, such as pedogenic processes, climate, parent material, topography, and human influences [15]. When the soil nutrient content exhibits high spatial variability, it is difficult to estimate soil nutrient status among regions based directly on the different geographical features [23]. This study tries to use geostatistics and inverse distance weighting (IDW) methods to explore the problem of soil high spatial. Geostatistics have been widely used to explore spatial uncertainty in recent decades [24]. They can be used to describe the spatial distribution of a variable and to predict its value between sampling points [25]. The IDW method is also commonly used to predict the spatial distribution of soil nutrients. It is a moving weighted-average method that has its optimum effect under conditions in which the sampling points are evenly distributed and not clustered in the sampling area.
In this study, we analyzed the spatial distribution of soil N, P, K, and Si in farmland in Conghua District, Guangzhou, China, using geostatistics and the IDW method. The main objectives of the study were to (i) measure the total nitrogen (TN), available phosphorus (AP), available potassium (AK), and available silicon (ASi); (ii) determine the optimal model fitted and its parameters; and (iii) assess the spatial variability of the selected soil nutrient elements and analyze their spatial distribution.

Study Area, Soil Sampling, and Analysis
The study area was located in the northeast of Guangzhou, in the transition zone from the Pearl River Delta to the mountainous area of northern Guangdong Province, South China ( Figure 1). The terrain is tilted from north to south, and the total area of the study area is 1985.3 km 2 . The climate of the study area is Subtropical Monsoon Climate. The mean annual temperature ranges from −1.6 • C in winter to 36.7 • C in summer, and mean Agriculture 2021, 11, 50 3 of 12 annual precipitation is 1930.8 mm. Generally, the soil types can be divided into sandy soil, clayey soil and loam. The United States Department of Agriculture (USDA) Textural Triangle has 12 classes according to the proportion of sand content, viscosity and loam. The study area is located in the plain area of the Pearl River Delta, where the soil has less sediment content, fine particles, slow water seepage rate and general aeration performance. Therefore, the soil types are mainly clay and loam. Due to the cropping system of three crops a year, frequent tillage and high intensity development and utilization activities, the overall situation of soil nutrients in this region is not optimistic. A total of 201 topsoil samples (0-20 cm) were collected from farmland in the study area for the analysis of their soil nutrient properties. In the sampling: (i) select a large area of farmland; (ii) the sampling point is more than 100 m away from the highway and railway; (iii) avoid the composting edge, irrigation mouth and other places that affect the soil properties [26]. The sample locations were selected according to topography and land use. The sampling density was approximately one sample per 1.07 km 2 cultivated land. After the weeds and plant roots in the soil had been removed, the topsoil was collected with a small wooden shovel. At each site, five small sub-samples were evenly mixed to create one large composite sample. Finally, the composite sample was placed into a plastic bag, and the condition of the surrounding environment, vegetation, and geographical position was recorded. All specimens were transported to a laboratory for the analysis of soil properties.
The samples were air dried and sieved through a 2-mm sieve for the analysis of soil nutrient elements. The indicators tested included TN, AP, AK, and ASi. TN was determined using the semi-micro Kjeldahl method, AP was extracted with the sodium bicarbonate extraction/molybdenum antimony colorimetric method (Olsen), AK was estimated via ammonium acetate extraction/flame photometry, and ASi was analyzed using a colorimetric method [10].

Statistical Analysis
Descriptive statistics, including the mean, minimum, maximum, coefficient of variation (CV), and standard deviation, were calculated using SPSS 20.0 [27]. The distribution of the data was tested for normality using the Kolmogorov-Smirnov test, kurtosis, skewness, and percentile-percentile plots. The correlations among the variables were determined based on Pearson's coefficients, and the CV was used to describe the degree of variation of soil nutrients. A CV of less than 10% indicated weak variability, a CV between 10% and 100% indicated moderate variability, and a CV of more than 100% indicated strong variability [28].

Geostatistics
Geostatistics were used to analyze the spatial variability of plant nutrients in soil. The approach consists of two stages: the calculation of an experimental variogram from the data and model fitting, and predictions at unsampled locations [24]. A semivariogram is used to measure the spatial variation of regionalized variables [25,29], and is expressed as: In Equation (1), γ(h) is the semivariance at a given distance h; Z(xi) is the value of the variable Z at location xi, and N(h) is the number of pairs of sample points separated by the distance h. The variogram plot was fitted with a theorical model, i.e., spherical, exponential, linear, or Gaussian models. The fitted model was selected based on it having both the largest R 2 and the smallest residual [30], and it could provide information about the spatial structure and the input parameters for the kriging interpolation.
The kriging interpolation method is widely used to research spatial structure and provide estimates of variation at unsampled locations [31]. It includes the concepts of ordinary, simple, universal, probability, indicator, and disjunctive kriging, but the ordinary kriging (OK) method is regarded as the best technique and is known as the linear unbiased estimator.

The Inverse Distance Weighting Method
The IDW method assumes that a value of an attribute at an unsampled location is a weighted average of known data points, within a local neighborhood surrounding the unsampled location [32]. It is expressed as: In Equation (2),Ẑ is estimate value of x 0 , which is the estimation point and xi are the data points within a chosen neighborhood. The weights (r) are related to distance by dij, which is the distance between the estimation point and the data points. Higher weights indicate more influence points close to x 0 .

Accuracy Assessment
Cross-verify the accuracy of kriging interpolation using the hold-out method [33] and to help determine the best parameters for the IDW test [31]. The indexes included root-mean-square error (RMSE) and mean-absolute error (MAE), which are be calculated as follows [11]: where y(xi) is the measured value, y * (xi) is the predicted value, and N is the number of samples. The best prediction model was obtained when the RMSE and MAE had the smallest value.

Descriptive Statistics
As shown in Table 1, the average values of TN, AP, AK, and ASi were 842.48 ± 403.38, 43.41 ± 26.48, 77.80 ± 59.84, and 63.85 ± 33.61 mg kg −1 , respectively. The CV of TN, AP, AK, and ASi ranged from 47.88% to 76.91%, which indicated a moderate variability for all variables in the study area. The CV values followed the order AK > AP > ASi > TN. TN, AP, and AK displayed moderate variability similar to that found in Hebei [14], Yunnan Province [8], and India [17]. The gradually decreasing trend in the CV among TN, AP, and AK in the study area was similar to that reported in farmland of India [17]. The skewness values of TN, AP, AK, and ASi were 1.28, 1.20, 1.64, and 0.90, respectively. The percentile-percentile plots showed that ASi(D1) was normally distributed, whereas TN(A1), AP(B1), and AK(C1) were non-normally distributed. The Kolmogorov-Smirnov tests demonstrated that all variables followed a normal distribution after logarithmic transformation ( Figure 2). There were no significant correlations among TN, AP, AK, and ASi based on the Pearson coefficients.
where y(xi) is the measured value, y*(xi) is the predicted value, and N is the number of samples. The best prediction model was obtained when the RMSE and MAE had the smallest value.

Descriptive Statistics
As shown in Table 1, the average values of TN, AP, AK, and ASi were 842.48 ± 403.38, 43.41 ± 26.48, 77.80 ± 59.84, and 63.85 ± 33.61 mg kg −1 , respectively. The CV of TN, AP, AK, and ASi ranged from 47.88% to 76.91%, which indicated a moderate variability for all variables in the study area. The CV values followed the order AK > AP > ASi > TN. TN, AP, and AK displayed moderate variability similar to that found in Hebei [14], Yunnan Province [8], and India [17]. The gradually decreasing trend in the CV among TN, AP, and AK in the study area was similar to that reported in farmland of India [17].
The skewness values of TN, AP, AK, and ASi were 1.28, 1.20, 1.64, and 0.90, respectively. The percentile-percentile plots showed that ASi(D1) was normally distributed, whereas TN(A1), AP(B1), and AK(C1) were non-normally distributed. The Kolmogorov-Smirnov tests demonstrated that all variables followed a normal distribution after logarithmic transformation (Figure 2). There were no significant correlations among TN, AP, AK, and ASi based on the Pearson coefficients.

Ordinary Kriging Results
As shown in Table 2, the optimal models were selected based on having the smallest residual and largest R 2 values. For TN, AK, and ASi, the R 2 values are 0.701, 0.115 and 0.446, they are both larger than other models, so an exponential model was considered the optimal model, whereas a spherical model was selected as the optimal model for AP (Figure 3).  [13,37]. * represents the best fitted model.

Inverse Distance Weighting Results
In the inverse distance weighting method, root mean squared error (RMSE) indicates that the smaller the values, the better the model prediction. The minimum RMSE was observed when p = 1, and the maximum when p = 3 (Table 3)

Evaluation of Prediction Methods
Cross-validation was used to validate the accuracy of spatial interpolation. The best prediction models for the variables were selected based having the lowest RMSE and The ranges of the variograms for AK, ASi, TN, and AP were 5530, 2400, 3300, and 2190 m respectively, which were lower than those reported previously on a similar regional scale [10,11]. Based on the notion that the sampling interval should be less than half the range of the variogram [34] and the principle that sample numbers are usually a compromise between the accuracy required and the resources available for investigation, these results indicated that the sampling design was adequate for determining the spatial variability of soil nutrients in the study area [30].
The nugget (C 0 ) and sill (C 0 + C) values were used to reflect spatial heterogeneity. C 0 ranged from 59 to 24,900 mg kg −1 , whereas C 0 + C varied from 724 to 171,200 mg kg −1 . All semivariograms were generally well structured with a small nugget effect, except for TN, indicating that the sampling density was suitable for revealing spatial structures [35]. High C 0 values tend to hide the spatial variation of soil structure [36], but the high nugget effect for TN revealed an irregular distribution of spatial variability.  [13,37]. * represents the best fitted model.
The nugget-to-sill ratio (NSR) describes the degree of spatial correlation [37]. An NSR less than or equal to 0.25 indicates a weak spatial correlation; an NSR between 0.25 and 0.75 indicates a moderate spatial correlation; and an NSR greater than 0.75 indicates a Agriculture 2021, 11, 50 7 of 12 strong spatial correlation. However, a variable was still considered to have a weak spatial correlation when NSR ≤ 0.25. As shown in Table 2, the NSRs of TN, AP, AK, and ASi were 0.15, 0. 07, 0.12, and 0.13, respectively. All variables had an R 2 < 0.5, except for TN (R 2 = 0.701), therefore, they were considered to have weak spatial dependence. These results were similar to previous findings. For example, the spatial correlations of TN and AP in this study area were similar to those reported in a wastewater irrigation area of Beijing and a hilly area of Santai County [10,13].
Spatial variation is generally divided into intrinsic variation and extrinsic variation. Internal changes are caused by structural factors, while external changes are caused by random factors [6,10,33]. The structural factors of internal change include soil texture, soil particle, soil bulk density, etc. The spatial heterogeneity of soil organic matter, total nitrogen, total phosphorus and total potassium was studied by geostatistics method, and it was found that soil structural factors had great influence on soil organic matter, total nitrogen and total potassium, and were simultaneously affected by structural and random factors [13,38]. In general, variables with strong variability are mainly affected by structural factors, while variables with weak variability are affected by random factors. The driving factors of randomness are mainly human factors, such as agricultural irrigation, fertilizer use, tillage measures, etc. In studying the influence of land use change on spatial heterogeneity of soil nutrients in forests in southern Appalachia, Fraterrigo et al. found that land use had a persistent influence on spatial heterogeneity of soil nutrients [33]. All variables in this study belong to moderate variation. Spatial variation of soil nutrients is influenced by both structural and random factors, and has different characteristics in different regions.

Inverse Distance Weighting Results
In the inverse distance weighting method, root mean squared error (RMSE) indicates that the smaller the values, the better the model prediction. The minimum RMSE was observed when p = 1, and the maximum when p = 3 (Table 3). When p = 1, the RMSEs of TN, AP, AK, and ASi were 404.51, 26.85, 58.63 and 31.50, and when p = 3 the RMSEs of TN, AP, AK, and ASi ranged from 28.63 to 440.41, respectively. The RMSE values followed the order TN > AK > ASi > AP.

Evaluation of Prediction Methods
Cross-validation was used to validate the accuracy of spatial interpolation. The best prediction models for the variables were selected based having the lowest RMSE and MAE values (Table 4). Thus, the exponential model was selected for TN, AK, and ASi, whereas the spherical model was selected for AP. The mean errors of TN, AP, AK, and ASi were 0.57, 0.23, −1.19, and −0.57, respectively; the negative values indicated that the predicted values of AK and ASi were less than the experimental values. Based on their having the lowest RMSE and MAE, the best prediction models were selected when p = 1 using IDW. The mean errors of TN, AP, AK, and ASi were −0.68, 0, 0.01, and 0.15, respectively; the negative value indicated that the predicted value of TN was less than the experimental value. The estimation of soil properties is one of the most challenging problems in geoscience [39]. Studies have found that using the RMSE as an indicator can have disadvantages, especially in the process of verifying the models generated by different interpolation methods [40]. In this study, we selected the RMSE and MAE as indicators to assess the interpolation accuracy when the same method was applied to the data, e.g., OK or IDW, and used the indicators of optimal accuracy (IOA) and imprecision (IP) to evaluate the predicted accuracy when different interpolation methods were applied to the same data [41]; the formulas are as follows: As shown in Table 5, the IOAs of TN, AP, AK, and ASi obtained from IDW were 6.07, 0.11, 0.26, and 0.13, respectively. The IOAs of TN, AP, AK, and ASi obtained from IDW were less than those obtained from OK, indicating that the predicted accuracies of IDW were higher than those of OK. The IPs of TN, AP, AK, and ASi obtained from OK were 214.13, 0.96, 2.54, and 1.29, respectively. The IP values of TN AP, AK, and ASi obtained from OK were larger than those obtained from IDW, indicating that the predicted accuracies of IDW were higher than those of OK. Therefore, we concluded that the prediction accuracy of TN, AP, AK, and ASi obtained from IDW was higher than that obtained from OK. Therefore, we used IDW to analyze the spatial distribution of soil nutrient elements.

Spatial Distribution of Soil Nutrient Elements
The spatial distributions of TN, AP, AK and ASi were plotted using the ArcGIS 10.2 software, and the results are shown in Figure 4. TN displayed an overall increasing trend from the northeast to southwest in the range of 493.28 to 1529.56 mg kg −1 . The TN content was generally low (<700 mg kg −1 ) to moderate (700-1100 mg kg −1 ) in most areas, and it was only high (>1100 mg kg −1 ) in some scattered locations that represented 6.65% of the total area. It may be due to the influence of topographic factors. In northeast China, most areas are mountainous with high altitude and steep slope, while in southwest China, there are plain areas with low altitude. The AP content was low (25-40 mg kg −1 ) to moderate (40-55 mg kg −1 ) in most regions, and was generally higher in the northeast than the southwest. This is because areas with high total nitrogen content are located around urban areas, and the use of human factors on land is significantly higher than other areas. The AK content exhibited an even distribution across the study area, with no significant regional features. The AK content varied from 21.30 to 244.26 mg kg −1 , and was very low (<108 mg kg −1 ) in 99.66% of the total area. ASi exhibited a similar spatial distribution as AP. The ASi content was moderate (60-85 mg kg −1 ) to low (35-60 mg kg −1 ) in most regions, and most soils in the study area showed ASi deficiency. The overall low content of AK and ASi may be due to the fact that frequent human land use activities have destroyed the soil cultivation layer, resulting in the lack of soil nutrients, especially in the urbanized areas of the southwest, where the more intense land development and utilization activities, the more obvious.

Discussion
The purpose of this paper was mainly to discuss the spatial distribution features of soil nutrients at the county scale, so that to understand the spatial variation of soil nutrients and master the influencing factors in the evolution process, to provide a scientific

Discussion
The purpose of this paper was mainly to discuss the spatial distribution features of soil nutrients at the county scale, so that to understand the spatial variation of soil nutrients and master the influencing factors in the evolution process, to provide a scientific proof for formulating reasonable plan of land use management and a technical support for precision agriculture.
In this paper, kriging and inverse distance weight interpolation methods were used to analyze the spatial distribution features of plant nutrients in soil, and the optimal model and the optimal parameters of different methods were obtained in the study area. Based on these, a series of quantitative indicators were used to discuss the forecast results of different models. That was an innovative attempt in the thinking of plant nutrients in soil analysis because of many studies focus only on one method in the past [6,9,10,13]. The results showed that TN is moderately enriched and AP is deficient in the study area, which was consistent with the geochemical survey results. Secondly, in the selection of research objects, there are few studies on discussing ASi together with AN, AP and AK. In this study [42], the spatial distribution features of these soil nutrients were discussed according to the characteristics of regional natural geography and crop planting in the studied area, which was helpful to fully understand the soil nutrients and make a measure for precision agriculture [3,6].
Plant nutrients content in soil is caused by both natural and human factors. Most researches showed that parent material has the most significant effect on soil formation and soil nutrient content, and soil nutrient content is also closely related to climate, rainfall, vegetation, land use types and agricultural production measures [43]. It is well known that soil parent materials are the material carriers and there are differences in soil nutrients for the difference of parent materials [2,19,43]. It commonly existed a significant correlation between parent material and soil nutrients, that is, the higher the mineral element content of soil parent material, the higher the content of plant nutrients in soil in the corresponding area. Soil types are mainly caused by different parent materials. The influences of human activities on soil nutrients are mainly showed in agricultural production, the spatial distribution of soil nutrients will be affected by different fertilization methods and land use types.
The study area is located in the hilly area of Southeast China, granite is the main soil parent material, and the soil types are mainly clay and loam [44]. Many paddy fields are planted with rice, and dry lands are planted with dry crops such as peanuts. Located in the edge of Pearl River Delta urban agglomeration, land value is very high and land use is frequent, so that the spatial variability of plant nutrients in soil were simultaneously affected by structural and random factors [13,37]. In this paper, the soil sampling was selected according to topography, land use types and vegetation conditions, and the sampling points covered the main soil types. The analyzed conclusions were consistent with the geochemical survey results.

Conclusions
Results showed that the overall content of the spatial distribution of TN, AP, AK, ASi in the study area is relatively low. In addition to the uniform spatial distribution of AK, TN overall showed a trend of increasing from northeast to southwest, and the overall spatial distribution of AP and ASi showed that the northeast was higher than the southwest. Therefore, measures should be taken to increase the content of TN, AP, AK, ASi in the soil of the study area. According to the requirements of different crops for the content of TN, AP, AK, and ASi, it can be planted in districts according to local conditions. Thus, understanding of the spatial distribution characteristics of soil nutrient content in the study area can not only adapt to local conditions and promote the development of agricultural production, but also provide guidance for the delineation of basic farmland protection areas, land use spatial planning and the formulation of accurate farmland protection policies.