Storage of Soil Organic Carbon and Its Spatial Variability in an Agro-Pastoral Ecotone of Northern China

Spatial distribution of soil organic carbon (SOC) is important for the development of ecosystem carbon cycle models and assessment of soil quality. In this study, a total of 732 soil samples from 122 soil profiles (0–10, 10–20, 20–40, 40–60, 60–80, and 80–100 cm) were collected by a combination of fixed-point sampling and route surveys in an agro-pastoral ecotone of northern China and the spatial variation of the SOC in the samples was analyzed through classical statistical and geostatistical approaches. The results showed that the SOC contents decreased from 4.31 g/kg in the 0–10 cm to 1.57 g/kg in the 80–100 cm soil layer. The spatial heterogeneity of the SOC exhibited moderate and strong dependence for all the soil layers owing to random and structural factors including soil texture, topography, and human activities. The spatial distributions of the SOC increased gradually from northeast to southwest in the 0–40 cm soil layers, but there was no general trend in deep soil layers and different interpolation methods resulted in the inconsistent spatial distribution of SOC. The storage of SOC was expected to be 25 Tg in the 0–100 cm soil depths for the whole area of 7692 km2. The SOC stocks estimated by two interpolation approaches were very close (25.65 vs. 25.86 Tg), but the inverse distance weighting (IDW) interpolation generated a more detailed map of SOC and with higher determination coefficient (R2); therefore, the IDW was recognized as an appropriate method to investigate the spatial variability of SOC in this region.


Introduction
Soil organic carbon (SOC) plays an important role in global C cycling, which is sensitive to climate change and anthropogenic disturbance; SOC is roughly comparable to the sum of the atmospheric (750 Pg) and biotic (600 Pg) pools of carbon [1,2], and thus small changes in SOC can greatly alter atmospheric CO 2 concentrations and impact climate change. Understanding the spatial variability and storage of SOC is fundamental for assessing soil fertility, productivity, and C sequestration or emission potential, and establishing land management measures [3,4]. However, accurate estimation of SOC storage is not easy because it is determined by various factors including climatic variables [5],

Soil Sampling and Analysis
From July to September 2017, based on information on terrain conditions, land use types, and accessibility, a total of 122 representative sampling sites were determined in the major landscape units (Figure 1), including agricultural land, forestland, grassland, and shrubland. In each sampling site, one 10 m × 10 m plot was established. Soil samples were randomly collected from five sampling points in each plot with a soil auger (5 cm in diameter). The soil profile was divided into six different layers (0-10, 10-20, 20-40, 40-60, 60-80, and 80-100 cm) in each plot, and the soil samples from the same depth were mixed to obtain one composite sample. In addition, one undisturbed soil sample was collected using a soil auger equipped with a stainless-steel cylinder (5.5 cm in diameter and 4.2 cm in height) and the bulk density was determined by drying at 105 °C to a constant mass [33]. The geographic information (elevation, longitude, and latitude) of each sample site was simultaneously recorded with a portable global positioning system (GPS). All the disturbed soil samples were airdried, ground, and passed through a 0.25 mm sieve for measurement of the SOC content. The SOC content was measured using the K2Cr2O7-H2SO4 wet oxidation method [13]. The SOCD and SOCS in each soil layer were calculated using the following equations (Equations (1) and (2)): where SOCD is the soil organic carbon density (kg/m 2 ), SOCS represents the storage of soil organic carbon (Tg), SOC refers to the soil organic carbon concentration (g/kg) for a certain depth, BD represents the soil bulk density (g/cm 3 ), D is the thickness of the soil depth (cm), θ represents the

Soil Sampling and Analysis
From July to September 2017, based on information on terrain conditions, land use types, and accessibility, a total of 122 representative sampling sites were determined in the major landscape units (Figure 1), including agricultural land, forestland, grassland, and shrubland. In each sampling site, one 10 m × 10 m plot was established. Soil samples were randomly collected from five sampling points in each plot with a soil auger (5 cm in diameter). The soil profile was divided into six different layers (0-10, 10-20, 20-40, 40-60, 60-80, and 80-100 cm) in each plot, and the soil samples from the same depth were mixed to obtain one composite sample. In addition, one undisturbed soil sample was collected using a soil auger equipped with a stainless-steel cylinder (5.5 cm in diameter and 4.2 cm in height) and the bulk density was determined by drying at 105 • C to a constant mass [33]. The geographic information (elevation, longitude, and latitude) of each sample site was simultaneously recorded with a portable global positioning system (GPS). All the disturbed soil samples were air-dried, ground, and passed through a 0.25 mm sieve for measurement of the SOC content. The SOC content was measured using the K 2 Cr 2 O 7 -H 2 SO 4 wet oxidation method [13]. The SOCD and SOCS in each soil layer were calculated using the following equations (Equations (1) and (2)): where SOCD is the soil organic carbon density (kg/m 2 ), SOCS represents the storage of soil organic carbon (Tg), SOC refers to the soil organic carbon concentration (g/kg) for a certain depth, BD represents the soil bulk density (g/cm 3 ), D is the thickness of the soil depth (cm), θ represents the volumetric percentage of the coarse gravel, 0.01 is the unit conversation factor, S represents the area of pixels in the interpolated SOC map (km 2 ), and n and i are the soil layer.

Data Analysis
The descriptive statistical analysis for the SOC content was performed using SPSS ver. 20.0 (IBM Corp., Armonk, NY, USA), and the mean, minimum, maximum, standard deviation (SD), coefficient of variation (CV), kurtosis, and skewness values of the SOC content were calculated for each depth. Pearson's correlation analysis was performed to determine the relationships among the soil properties and sample geographic information.
Geostatistical methods were applied to describe and predict the spatial heterogeneity of the soil properties. The original data set was analyzed with a Kolmogorov-Smirnov test (p < 0.05) to determine whether the soil property data met the assumption of normality for the geostatistical analysis. If the data did not fit a normal distribution (p > 0.05), they were log transformed and then tested for normality. In this study, a semi-variogram was constructed to describe the variability in the SOC concentration by geostatistical software (GS+, ver. 9.0, Tetoc, Beijing, China), including four models (linear, exponential, spherical, and Gaussian) [34]. The model with the smallest residual sum of squares (RSS) and the largest coefficient of determination (R 2 ) had the best performance [35]. The parameters of the semi-variogram model (nugget variance (C 0 ), structural variance (C), sill variance (C 0 + C) and the spatial autocorrelation range) were used to determine the spatial structure of the SOC at a given scale. The ratio of nugget to sill variance (C 0 /(C 0 + C)) was calculated to evaluate the degree of spatial dependence; here, a ratio of <0.25 represents a strong degree of spatial autocorrelation; 0.25-0.75, medium autocorrelation; and >0.75, weak autocorrelation [34,36]. The semi-variogram model was calculated as follows (Equation (3)) [37]: where γ(h) is the experimental semi-variogram, h represents the lag distance, N(h) is the number of sampling point pairs with a distance of h, and z (xi) and z (xi + h) are the paired data values with a lag distance of h. In this study, inverse distance weighting (IDW) and ordinary kriging (OK) interpolation methods were used to describe the spatial distribution of the SOC concentrations. The interpolation performance was tested by a leave-one-out cross-validation approach. The root mean square error (RMSE), mean error (ME), absolute mean error (AME), and determination coefficient (R 2 ) were used to determine the prediction accuracy of the spatial interpolation as follows (Equations (4)-(7)) [35,38]: where Pi and Mi represent the predicted measured values, respectively; M and P are the average of measured and predicted values; and n is the number of sampling points. All the selected interpolation methods and the creation of the distribution maps were performed in ArcMap ver. 10.2 (ESRI Inc., Redlands, CA, USA) software.

Classical Statistical Analysis of Soil Organic Carbon
The mean values of SOC across the entire area significantly decreased with increasing soil depth, ranging from 4.31 g/kg in the 0-10 cm layer to 1.85 g/kg in the 80-100 cm layer (Table 1). This vertical distribution pattern of the SOC was consistent with the results of previous studies from a mountain-basin system in Xinjiang [39] and a catchment in the Loess Plateau, China [40], which resulted from a strong accumulation of organic materials from litter, roots, and microbes at the surface. Compared with the mean value of SOC in China as a whole, the soil organic carbon content was relatively low [7,9], which was attributed to the low productivity, serious land degradation, and desertification in the desert steppe transition zone with a semi-arid climate [41,42]. The coefficients of variation (CV) in SOC varied from 39% to 46%, showing moderate variability [43], which depended largely on the changes in climate, soil texture, pedogenic processes, and land use types [44] ( Table 1). The SOC content in loess soil and Pisha sandstone soil was higher than that in Aeolian sandy soil, and it was also higher in grassland and cropland than in forestland due to the great biomass productivity of grassland and the fertilization activities on agricultural land (Supplementary Materials, Figure S1). The Kolmogorov-Smirnov (K-S) test, skewness, and kurtosis values showed that the SOC contents for all the soil layers were consistent with a normal distribution (K-S, p > 0.05; Table 1); therefore, the geostatistical analysis was performed. Table 1. Classical statistical analyses of soil organic carbon concentrations (g/kg) in 0-10, 10-20, 20-40, 40-60, 60-80, and 80-100 cm soil layers.

Semi-Variogram Analysis of Soil Organic Carbon
Although the classical statistical analyses adequately reveal the fundamental characteristics of the variation in SOC content, they cannot describe the spatial variation in SOC. Therefore, we performed experimental semi-variogram analyses of the SOC using GS+ software. The results showed that the semi-variograms were best fitted by a spherical model, which had the lowest root square error (RSE) values and the highest determination coefficient (R 2 ). The exception was the 20-40 cm soil layer, for which a Gaussian model provided the best fit ( Figure 2, Table 2) [45]. The ratios of the nugget to the sill were 36% and 40% at depths of 0-10 and 10-20 cm, respectively, while this value was less than 25% in the other soil layers, implying that the SOC had moderate spatial dependence at depths of 0-10 and 10-20 cm but strong spatial dependence in the other soil layers [46]. Previous studies showed that the strong spatial dependence of the soil variables (C 0 /(C 0 + C) < 25%) was controlled by intrinsic factors, such as topography, soil texture, and parent material, while the moderate spatial dependence (25% < C 0 /(C 0 + C) < 75%) was synchronously affected by intrinsic and extrinsic factors, including not only soil texture, parent material, and topography but also land use management, fertilization, and cultivation practices [34,35,47,48]. For example, the SOC content showed positive correlation with silt, clay, and  Figure S2). Furthermore, because the significant differences in SOC among vegetation and soil types mainly occurred in the surface soil (Supplementary Materials, Figure S1), it is concluded that the spatial heterogeneity of the SOC in the 0-10 and 10-20 cm layers in our study mainly resulted from the coarse soil texture, severe desertification, and cultivation activities, while in the other soil layers, it resulted from the soil parent material and texture. The nugget to sill ratios showed a decreasing trend with increasing soil depth, which indicated that the spatial heterogeneity of the SOC derived from human activities may be gradually reduced across the soil profile. The spatial range reflects the maximum distance of the spatial autocorrelation for the soil characteristics influenced by environmental factors [49]. In our study, the ranges of the SOC (13.8-20.9 km) were all much larger than the mean distance between the sampling sites (8 km), which indicated that our sampling strategy was suitable for exploring the spatial patterns of SOC in this region [35]. However, the distance between the sampling points was quite widely separated in the southern Loess area with hills and gullies due to poor transportation, which might result in an increase in the spatial range ( Figure 1). In addition, the spatial range of the SOC content in the 0-10 cm layer was significantly smaller than those in the other soil layers, indicating that the spatial heterogeneity of the SOC content in the 0-10 cm layer was stronger than those in the other soil layers. Therefore, more sampling points in the 0-10 cm layers should be established to assess the sampling strategy, especially in areas with hills and gullies. soil profile. The spatial range reflects the maximum distance of the spatial autocorrelation for the soil characteristics influenced by environmental factors [49]. In our study, the ranges of the SOC (13.8-20.9 km) were all much larger than the mean distance between the sampling sites (8 km), which indicated that our sampling strategy was suitable for exploring the spatial patterns of SOC in this region [35]. However, the distance between the sampling points was quite widely separated in the southern Loess area with hills and gullies due to poor transportation, which might result in an increase in the spatial range ( Figure 1). In addition, the spatial range of the SOC content in the 0-10 cm layer was significantly smaller than those in the other soil layers, indicating that the spatial heterogeneity of the SOC content in the 0-10 cm layer was stronger than those in the other soil layers. Therefore, more sampling points in the 0-10 cm layers should be established to assess the sampling strategy, especially in areas with hills and gullies.

Spatial Distribution of Soil Organic Carbon Obtained by Interpolation
To further reveal the spatial variability in the SOC, spatial distribution maps of the SOC content were drawn with inverse distance weighting (IDW) and ordinary kriging (OK) interpolation approaches. Overall, the SOC contents throughout the study area in the 0-10, 10-20, and 20-40 cm layers showed similar spatial patterns, which increased gradually from the northern desertified district to the southern area with hills and gullies, i.e., they increased with latitude decrease (Supplementary Materials, Figure S2). Generally, SOC is affected by soil texture, topography, elevation, and climatic factors [50]. In the present study, there was no obvious difference in precipitation throughout the study area; however, the water-holding capacity in the southern areas with hills and gullies dominated by loess soil was higher than that in the northern desertified areas dominated by sandy soil [51]. Considering the positive response of plant biomass to soil moisture [52] and vegetation restoration, the accumulation of SOC generated by litter and roots was more significant in Loess gully region with higher elevation gradient, which could be explained by the positive correlation between SOC and elevation (Supplementary Materials, Figure S2). Moreover, the decomposition of the SOC was associated with soil texture, namely, the physical protection by aggregates of SOC mainly occurred in loess dominated by fine fractions [53]. In summary, the different output and input of organic matter determined by both vegetation and texture could well explain the spatial distribution pattern of the SOC. It is noteworthy that the eastern populous and wealthy region also presented high SOC, which was attributable to the early adoption of vegetation restoration in this region. For the other soil layers, the spatial distributions of the SOC differed significantly among the soil depths, especially between interpolation approaches, but the threshold of SOC content was small.
In particular, the interpolation maps generated by IDW showed detailed divisions, with numerous circular patches around the sampling points for all the soil layers, whereas those generated by OK were characterized by several large modules connected by a continuous curve, especially at 40-60, 60-80, and 80-100 cm soil depths, and the SOC contents were concentrated in a very small range (Figure 3). These differences could be explained by the nature of the interpolation approaches; specifically, OK, which uses a smoothing effect, disregarded those points with the highest or lowest SOC content, while IDW generated patches with a pattern similar to a "bull's eyes" in some of those points [54]. These spatial distribution patterns supported the previous viewpoint that more samples should be allocated in the surface layer and in the regions with hills and gullies.
To assess the performance of the interpolation methods, cross-validation indices: mean error (ME), root mean square error (RMSE), and absolute mean error (AME) were calculated (Table 3). In general, the values of ME, RMSE, and AME were closer to zero, and the predicted values were closer to the measured values [55]. In the present study, the ME values of the SOC content ranged from 0.01 to 0.10, which indicated that the interpolations of the SOC contents using IDW and OK were relatively unbiased. Although there were no significant differences in the RMSE and AME values between the IDW and OK for each soil layer, the R 2 values of the IDW method were slightly higher than those of the OK method, indicating that the IDW method was superior to the OK method for the spatial interpolation of SOC ( Figure 4). The result was inconsistent with those of previous studies in Masson pine forests [13] and Moso bamboo forests [56] but echoed the result of Gotway et al. [57]. These contradictory results may be associated with sampling allocations and the autocorrelation of the soil properties. It has been reported that the R 2 values of the predicted and measured values were affected by sample density, distribution, and representativeness, as well as validation method [13,56,58]. Therefore, we could attribute the relatively low R 2 (10%-33%) values in this study to the limited and unevenly distributed sampling points and, especially, to the nature of the leave-one-out cross-validation method. Furthermore, we also found that both interpolation methods resulted in the underestimation of high measured values, and overestimation of low measured values (Figure 4). To determine the uncertainty source of this estimate, we established the regression curve between measured values and predicted values from IDW and OK interpolation across 0-20 and 20-100 cm, respectively (Supplementary Materials, Figure S3). For both interpolations in 0-20 cm soil layers, the data points of forestlands existed more on the left side of Supplementary Materials, Figure S3a,b, but the data points of croplands and grassland were evenly scattered. In 20-100 cm soil layers, the data points of grassland from both interpolations were clustered near the intersection of fit lines. These results indicated that the accuracy of predicted SOC were related to land use types, i.e., the data points of forestlands would lead to an overestimation of SOC in 0-20 cm soil layer, and the data points of grasslands and forestlands were well predicted by IDW and OK methods in 20-100 cm soil layer. So, the area of different land use types should be considered to improve the estimation accuracy of SOC in the future study.   (g,h), in the 60-80 cm (k,l), and in the 80-100 cm (m, n) soil layers estimated by the inverse distance weighting (IDW) and ordinary kriging (OK) interpolation methods.
To assess the performance of the interpolation methods, cross-validation indices: mean error (ME), root mean square error (RMSE), and absolute mean error (AME) were calculated (Table 3). In general, the values of ME, RMSE, and AME were closer to zero, and the predicted values were closer to the measured values [55]. In the present study, the ME values of the SOC content ranged from 0.01 to 0.10, which indicated that the interpolations of the SOC contents using IDW and OK were relatively unbiased. Although there were no significant differences in the RMSE and AME values between the IDW and OK for each soil layer, the R 2 values of the IDW method were slightly higher than those of the OK method, indicating that the IDW method was superior to the OK method for the spatial  points of forestlands would lead to an overestimation of SOC in 0-20 cm soil layer, and the data points of grasslands and forestlands were well predicted by IDW and OK methods in 20-100 cm soil layer. So, the area of different land use types should be considered to improve the estimation accuracy of SOC in the future study.

Estimation of SOC Storage
The storage of SOC estimated by the IDW and OK methods is presented in Table 4. The estimations of SOC storage did not differ between the IDW and OK methods in any soil layer, but they decreased with increasing soil depth. These findings were supported by observations from Masson pine forests in subtropical China [13], which were predominantly determined by the concentrations of SOC and the soil layer interval [13,58,59]. The study area covered 12% of the land area in the Loess Plateau, but the accumulation of stored SOC predicted by the IDW (ca 25.65 Tg) and

Estimation of SOC Storage
The storage of SOC estimated by the IDW and OK methods is presented in Table 4. The estimations of SOC storage did not differ between the IDW and OK methods in any soil layer, but they decreased with increasing soil depth. These findings were supported by observations from Masson pine forests in subtropical China [13], which were predominantly determined by the concentrations of SOC and the soil layer interval [13,58,59]. The study area covered 12% of the land area in the Loess Plateau, but the accumulation of stored SOC predicted by the IDW (ca 25.65 Tg) and OK (ca 25.86 Tg) methods accounted for only approximately 0.54% of the total SOC storage in the 0-100 cm layers in the Loess Plateau [60], indicating that both interpolation methods were suitable to estimate the storage of SOC and that the study area had considerable potential for carbon sequestration. In our study, approximately 70% of the SOC was stored in the 0-60 cm soil layers, which was consistent with a study conducted in the Horqin Grassland of northern China [28]. Furthermore, topography can indirectly affect SOC content by changing temperature and precipitation patterns [47,59], which may result in the overestimation or underestimation of SOC storage in the southern hills and gullies area with low sampling density. Considering the high organic carbon content and its sensitivity to climate change and human disturbance in surface soils [28,40,60], we should allocate more sampling sites in the 0-60 cm soil layers while appropriately reducing sampling sites in the other soil layers, thus improving the estimation accuracy of SOC in similar ecosystems with relatively little funding and human resources.

Conclusions
The current status and spatial variability of SOC were investigated using a combination of geostatistical and classical statistical methods in an agro-pastoral ecotone of northern China. The region had low SOC content compared with the mean SOC content in China. The contents of SOC for all the soil layers showed moderate or strong spatial dependence, which could be well fitted by exponential or spherical models. The spatial distributions of SOC derived from IDW and OK had similar patterns, with a declining trend from the southwest to the northeast, but the interpolation maps derived from IDW were characterized by small patches, and those from OK were smoother. The storage of SOC across the entire soil profile (0-100 cm) throughout the study area calculated by both interpolation approaches was close to 25 Tg; the 0-60 cm soil layers accounted for approximately 70% of the total storage. These results enriched our knowledge of the current status and distribution of SOC at the regional scale and provided valuable information for verifying how SOC will respond to future climate change and human disturbance. The influential factors, including vegetation index, topography, and meteorological factors should be taken into account to improve the prediction accuracy of SOC in further studies.
Supplementary Materials: The following are available online at http://www.mdpi.com/2071-1050/12/6/2259/s1, Figure S1: The distribution of SOC content under different soil and vegetation types: Significant differences among soil and vegetation types within the same soil layer are marked with lowercase letters (p < 0.05), Figure S2: The Pearson's correlation analysis among soil properties and sample geographic information, Figure S3: Cross-validation of inverse distance weighting (IDW) and ordinary kriging (OK) in the 0-20 cm (a,b); in the 20-100 cm (c,d) soil layer.
Author Contributions: Y.Z. and Q.Z. designed the study, analyzed the data, wrote and edited the original draft. J.X., Z.W., and Y.Y. investigated and collected soil samples. P.L. and Y.C. reviewed the draft. X.Z. revised and improved the manuscript. All authors have read and agreed to the published version of the manuscript. Y.Z. and Q.Z. contributed equally to this work.