Application of Ordinary Kriging and Regression Kriging Method for Soil Properties Mapping in Hilly Region of Central Vietnam

Soil property maps are essential resources for agricultural land use. However, soil properties mapping is costly and time-consuming, especially in the regions with complicated topographic conditions. This study was conducted in a hilly region of Central Vietnam with the following objectives: (i) to evaluate the best environmental variables to estimate soil organic carbon (SOC), total nitrogen (TN), and soil reaction (pH) with a regression kriging (RK) model, and (ii) to compare the accuracy of the ordinary kriging (OK) and RK methods. SOC, TN, and soil pH data were measured at 155 locations within the research area with a sampling grid of 2 km× 2 km for a soil layer from 0 to 30 cm depth. From these samples, 117 were used for interpolation, and the 38 randomly remaining samples were used for evaluating accuracy. The chosen environmental variables are land use type (LUT), topographic wetness index (TWI), and transformed soil adjusted vegetation index (TSAVI). The results indicate that the LUT variable is more effective than TWI and TSAVI for determining TN and pH when using the RK method, with a variance of 7.00% and 18.40%, respectively. In contrast, a combination of the LUT and TWI variables is the best for SOC mapping with the RK method, with a variance of 14.98%. The OK method seemed more accurate than the RK method for SOC mapping by 3.33% and for TN mapping by 10% but the RK method was found more precise than the OK method for soil pH mapping by 1.81%. Further selection of auxiliary variables and higher sampling density should be considered to improve the accuracy of the RK method.


Introduction
Soil quality information plays a vital role in land use planning, resource management and site investigation [1].The most popular indicators to assess soil quality are soil organic carbon (SOC), total nitrogen (TN) and soil reaction (pH) [2,3].The SOC is one of the most important indicators of soil quality for agricultural land use due to its impact on physical, chemical and biological indicators of soil quality, such as soil texture, nutrient availability in soil or electrical conductivity [4].Moreover, TN and pH have an impact on growth of plants [5,6].Reliable information on the spatial distribution of these soil quality indicators is required for sustainable land management and agricultural production [7,8].
There are various methods for interpolation of spatial distribution of SOC, TN and soil pH, such as inverse distance weighting (IDW) and ordinary kriging (OK) [9][10][11][12][13].In recent years, researchers have suggested a combination of regression and spatial interpolation, called regression kriging (RK), to determine the spatial distribution of soil characteristics [14][15][16][17][18][19][20][21].For this method, the selection of auxiliary variables is essential and remotely sensed images are typically the first choice [22].The OK has been widely used in interpolation techniques due to its simplicity as well as availability in many geographical information systems (GISs) [23,24].In recent years, RK has become an acceptable method for soil mapping due to its lower cost, and its accuracy often outperforms other methods [25].However, the accuracy of the RK method is not clear in all of the case studies because it depends on actual soil and environmental variable relationships [26].
For soil characteristic mapping based on environmental variables, researchers [27] often use terrain characteristics as independent variables [19,27,28].Some researchers use the topographic wetness index (TWI), a local scale index to quantitatively indicate the balance between water accumulation and drainage conditions, as an environmental variable for SOC mapping [29][30][31] or TN mapping [16,27].
Most researchers [17,18,[31][32][33], however, use normalized difference vegetation index (NDVI) as an auxiliary variable for the regression process.NDVI has been applied to many different aspects of rangeland ecology, but it has limitations.Huete and Jackson [34] found that the soil surface impact on NDVI value was the most significant in areas with a vegetation cover between 45% and 70%.Moreover, NDVI is an estimate of above ground biomass, so if the vegetation is sparse with bare soil present, the soil color may significantly influence the spectral signal.Xue and Su [35] stated that when background brightness is increased, the NDVI also increases.To cope with these inconveniences, Baret et al. [36] suggested an index, the TSAVI, to minimize the soil brightness effect.
LUT is also considered an environmental variable for SOC and TN mapping [37], as well as soil pH mapping [38].At the current research site, Pham et al. [39] stated that different land use systems significantly influence the SOC, TN, and pH.
Mountains and hills cover eighty percent of Vietnam's territory with a complicated terrain.However, eighty-five percent of it are low mountains and hills.The landscape of Central Vietnam is a narrow shape with the hills in the West and small plains along the coast.Among the three macro-regions of Vietnam, the Central region is the least developed [40].The agricultural and forested land areas of the Central region account for seventy-eight percent of the total area [41], and the agricultural production is the main livelihood of local inhabitants.The lack of soil properties information is one of the main obstacles for agricultural cultivation in Central Vietnam [42].Therefore, this study conducted in the A Luoi district of Vietnam, aimed (i) to evaluate the best environmental variables to estimate SOC, TN, and soil pH with RK model, and (ii) compare the accuracy of the OK and RK method for soil property mapping.

Research Area
The study area is located between 107 • E to 107 • 30 E and 16 • N to 16 • 30 N, approximately 60 km west of Hue City, in the low mountainous and hilly region of Central Vietnam.Agricultural production, the collection of forest products, and social subsidies provide the main livelihood for the local people.The lack of primary resources, such as finance and market access, present challenges to these local communities.The climate at the research site exhibits tropical monsoon characteristics with an annual rainy season from September to December.According to statistics from 2005 to 2015, the average yearly precipitation is approximately 3180 mm.The average temperature is the highest in May at 25°C, while the lowest is in January at 17°C [43].The research site has a mountainous topography, with an elevation between 60 and 1760 m above sea level, which decreases from west to east.The slope of the terrain is complex and steep with an average slope of ten degrees.There are four soil types within the research site: Ferralic Acrisols (75%), Arenic Acrisols (14%), Humic Acrisols (6%), and Hyperdystrict Acrisols (5%).Regarding the soil texture, loam is major (71%), followed by silt loam (24%) and clay loam (5%) [44].

Remote Sensing Data
In this research, the near-infrared (NIR) and red bands were extracted from Landsat 8 data, which were downloaded from the United States Geological Survey (USGS) (acquired on 24 January 2015 with cloud cover less than 10%).The data were atmospherically corrected and converted from digital numbers to reflectance values by dark object subtraction (DOS1) algorithm and top of atmosphere reflectance (TOA) tool in QGIS software version 3.2 before calculation of the soil line [46].The digital elevation model (DEM) was downloaded from USGS and used for calculating the TWI.The data were stored in raster format with a spatial resolution of 30 m.The climate at the research site exhibits tropical monsoon characteristics with an annual rainy season from September to December.According to statistics from 2005 to 2015, the average yearly precipitation is approximately 3180 mm.The average temperature is the highest in May at 25 • C, while the lowest is in January at 17 • C [43].The research site has a mountainous topography, with an elevation between 60 and 1760 m above sea level, which decreases from west to east.The slope of the terrain is complex and steep with an average slope of ten degrees.There are four soil types within the research site: Ferralic Acrisols (75%), Arenic Acrisols (14%), Humic Acrisols (6%), and Hyperdystrict Acrisols (5%).Regarding the soil texture, loam is major (71%), followed by silt loam (24%) and clay loam (5%) [44].

Field Survey and Soil Quality Analysis
The total area of the A Luoi district is 122,415 hectares (ha), comprising 61,105 ha (49%) of protected forests, 57,492 ha (47%) of agricultural land, 2318 ha (2%) of water bodies and 2500 ha (2%) of residential and commercial area [45].

Remote Sensing Data
In this research, the near-infrared (NIR) and red bands were extracted from Landsat 8 data, which were downloaded from the United States Geological Survey (USGS) (acquired on 24 January 2015 with cloud cover less than 10%).The data were atmospherically corrected and converted from digital numbers to reflectance values by dark object subtraction (DOS1) algorithm and top of atmosphere reflectance (TOA) tool in QGIS software version 3.2 before calculation of the soil line [46].The digital elevation model (DEM) was downloaded from USGS and used for calculating the TWI.The data were stored in raster format with a spatial resolution of 30 m.

Field Survey and Soil Quality Analysis
The samples were collected in December 2015 relying on soil unit maps at scale 1:100000 and the grid sampling method.Soil unit maps display the soil type, land use, and slope [44,45].In total, 78 soil units are present at the research site.A grid sampling of 2 km × 2 km size for general cases and 4 km × 4 km for large areas and highly homogeneous areas was carried out.The guideline for sampling follows two basic principles: (1) if only one soil unit exists in the grid cell, the sample will be taken at the center of the cell, or (2) if more than one soil unit exists, the sample will be taken at the center for each unit that covers an area bigger than 30 ha in that grid.For each sample, soil material in the layer at 0-30 cm was collected from five points (North, South, East, West, and Center) inside a circle with a radius of 25 m then mixed together as a soil sample.A total of 155 soil samples were collected, air-dried and passed through a 2 mm sieve to remove stones, grass, forest litter and any other material on the soil surface.Out of these, 117 samples were used for spatial interpolation, and the 38 remaining samples (25% of total number samples) were used for validation of the model [47][48][49].Figure 1 shows the locations of the soil samples.SOC was determined with the Walkley-Black method [50], TN was determined via Kjeldahl's digestion [51,52], and pH was measured using a portable pH meter and 1M KCl [53].The samples were analyzed at the laboratory of the Soil Science Department at the Hue University of Agriculture and Forestry, in Vietnam.

Transformed Soil Adjusted Vegetation Index (TSAVI)
Baret et al. [36] proposed the TSAVI to minimize the effect of the soil background [54].Baret and Guyot [55] defined TSAVI with the following equation: The soil line represents the relationship between the red (R) and near-infrared (NIR) reflectances of bare soil that was proposed by Richardson and Wiegand [56], modeled with the following equation: In Equation ( 1) and ( 2), α and β are the slope and intercept of the soil line, respectively, NIR is the near-infrared, R is the red reflectance value, and X is soil background adjustment factor (almost in all cases X is 0.08).TSAVI equals zero for bare soil and is close to one for very high leaf area indices.
The soil line extends from an upper value of bright soil with high reflectance in both the R and NIR bands to lower values for darker soils [57].In this study, the soil line was identified with the quantile regression method.The quantile was set at a number close to zero, for example, 0.00001 [54].

Topographic Wetness Index (TWI)
TWI proposed by Beven and Kirkby [58], also called the compound topographic index, is based on two parameters: upstream contributing area and slope.TWI is represented as: where λ is the contributing area and δ is the local slope of the terrain.High values of TWI indicate a high potential for runoff generation.TWI is unitless.In this study, TWI was calculated with the aid of the raster calculator tool in ArcGIS 10.5.

Ordinary Kriging
OK is one of the most commonly used kriging techniques.The spatial prediction of the unmeasured point x o is given by predicting the value Z * (x o ), which equals the line sum of the known measured values (i.e., observed values).Isaaks and Srivastava [59], Cressie [60] and many other researchers provide an elegant and simple description of OK as the following formula: where Z * (x o ) is the predicted value at the unmeasured position x o , Z(x i ) is the measured value at position x i , λ i is the weighting coefficient from the measured position to x o and n is the number of positions within the neighborhood searching [61].A fitted model based on the input data distribution is needed to describe the spatial continuity of the data and show the spatial relationship between the pairs of points.In this study, the OK method was calculated using R software with a framework introduced by Hengl [61] and Omuto and Vargas [62].

Regression Kriging
RK is a spatial interpolation technique that combines a regression of dependent variables on predictors with kriging of the prediction residuals [14,63].The following equation calculates the RK interpolation: where m(x o ) is the fitted deterministic part, ê(x o ) is the interpolated residual, βk are the estimated deterministic model coefficients, λ i are the kriging weights determined by the spatial dependence structure of the residual and e(x i ) is the residual at position x i .Thus, the first part of the right-hand side of Equation ( 5) represents the regression and the second part represents the kriging of the residual.Hengl et al. [14] introduced the process of using the RK method for spatial prediction of soil variables.In this study, RK was conducted using the R software [64][65][66] with a framework introduced by Hengl [61] and Omuto and Vargas [62].

Validation
Thirty-eight of the 155 soil samples were randomly extracted from the dataset to test the predictive accuracy of the model.This was evaluated by comparing the observed and predicted SOC, TN, and pH values at validation point locations.In this study, mean error (ME) and root mean square error (RMSE) were selected as validation indices.We used R I to compare the OK and RK methods and to improve the prediction accuracy index.If R I is positive, the accuracy prediction of RK is higher than that of OK and vice-versa [16].
In Equations ( 6)-(8), ME is mean error, RMSE is root mean square error, n is the number of testing points, Z oi is the observed value at the ith position, and Z pi is the predicted value at the ith position.

Soil Samples Data Descriptions
The percent SOC of the topsoil layer (0-30 cm depth) varies from 0.42% to 3.02%.Meanwhile, TN ranges from 0.05% to 0.21%, and pH ranges from 3.60 to 4.68 (Table 1).High standard deviation values of SOC and TN (compared to the mean of each soil quality indicator value) imply that these values are widely distributed, while low standard deviation values indicate most of the values are close to an average value.The differences in SOC and TN between the samples were substantial, and vice versa for the soil pH.The distributions of all variables were only slightly skewed (with a skewness value less than 1.0), and their median values were very close to the mean values.Therefore, the soil property values of the sampling points follow a normal distribution (Figure 2).The percent SOC of the topsoil layer (0-30 cm depth) varies from 0.42% to 3.02%.Meanwhile, TN ranges from 0.05% to 0.21%, and pH ranges from 3.60 to 4.68 (Table 1).High standard deviation values of SOC and TN (compared to the mean of each soil quality indicator value) imply that these values are widely distributed, while low standard deviation values indicate most of the values are close to an average value.The differences in SOC and TN between the samples were substantial, and vice versa for the soil pH.The distributions of all variables were only slightly skewed (with a skewness value less than 1.0), and their median values were very close to the mean values.Therefore, the soil property values of the sampling points follow a normal distribution (Figure 2).The units for SOC and TN are the percentage of soil weight.

Environmental Variables Calculation
LUT, TSAVI, and TWI are predictor variables (independent variables) in this research.Figure 3 shows the spatial distribution of these variables.Based on the 2015 land use map [45], scale 1:50000 and the field survey results, we determined four LUTs belonging to the agricultural land use categories were arable land (AL), grassland (GL), natural forest (NF) and plantation forest (PF).The slope and intercept value of the soil line (α = 1.026, β = 0.00003) was determined for the research site.Using Equation ( 4

Model for Regression Kriging
Adjusted R squared values are critical to explaining the influence of the independent variables on the dependent variables of the model.The results from Table 2 show that the LUT affects TN and pH (7.00% for TN, 18.40% for pH), whereas a combination of the LUT and the TWI has the most robust impact on the SOC with a variance of 14.98%.Environmental variables affect SOC and soil pH more than they affect TN.Therefore, y = f(LUT) was used for TN and pH mapping with the RK method, while y = f(TWI, LUT) was used for SOC interpolation.

Model for Regression Kriging
Adjusted R squared values are critical to explaining the influence of the independent variables on the dependent variables of the model.The results from Table 2 show that the LUT affects TN and pH (7.00% for TN, 18.40% for pH), whereas a combination of the LUT and the TWI has the most robust impact on the SOC with a variance of 14.98%.Environmental variables affect SOC and soil pH more than they affect TN.Therefore, y = f(LUT) was used for TN and pH mapping with the RK method, while y = f(TWI, LUT) was used for SOC interpolation.The nugget parameter of the SOC semivariogram is very high, meaning that the unexplained of this soil indicator is caused by measurement error rather than the short sampling distance.Moreover, the nugget/sill ratio of SOC and pH is 0.56 and 0.86, respectively, indicating that the sampled spatial dependence is weak.This ratio indicates that the SOC value errors are related to sampling distances.The nugget/sill ratio of TN is lower than the ratio of other soil indicators.Since the nugget value is higher than zero, the separated observation by the smallest distances is dissimilar.

Spatial Interpolation
The spatial prediction maps by OK and RK method are presented in Figure 5 for the SOC, TN, and soil pH indicators.The SOC at the research site ranges from 0.89% to 1.78% when interpolated with the OK method.When interpolated with the RK method, the SOC content is somewhat more detailed than the OK method, ranging from 0.62% to 2.10%.The pH varies from 3.94 to 4.25 for the OK method and from 3.86 to 4.40 for the RK method.The TN for both methods does not differ much, 0.051 for the lowest threshold and nearly 0.189 for the highest value.The OK prediction map shows the gradual transition of the detailed level is lower than the transition with the RK method.The influence of auxiliary variables is shown clearly on the maps, which was interpolated with the RK method.For the TN and pH maps, transitions are evident at the boundaries between land use types, however, these changes are recorded at different TWI value locations and boundaries between different land use types on the SOC map as well.
According to SOC classification by Le and Ton, cited in Nguyen and Klinnert [67], SOC in upland soil in Vietnam was divided into three groups (less than 0.58% is low, from 0.58 to 1.16 is medium, and more than 1.74% is high).Regarding the TN, Do and Nguyen [68] suggested three groups for Vietnamese soil (less than 0.1 is low, from 0.1 to 0.2% is medium, and more than 0.2% is high).The results (Figure 6) show that there are small differences between the maps obtained by the OK and RK methods.The percentage of area where SOC ranges from 1.16% to 1.74% (medium level) by the OK method is 9% higher than by the RK method, and vice-versa for the remaining SOC contents.Meanwhile, there are more TN values at a low level when interpolated by the RK method compared to the OK method (58% and 53% of the total area, respectively).The difference in the percentage of area for pH values between the RK and the OK method is not significant.In general, the distributions of soil property classes in maps created by both methods are the same.
For an accurate prediction model, the absolute values of RMSE and ME should be as small as possible.Negative ME values indicate that the actual value recorded is higher than the predicted value.For the SOC and TN mapping, the absolute values of both ME and RMSE produced with the OK method are smaller than those generated with the RK method (Table 4).This statistical value indicates that the prediction accuracy of OK is higher than RK.The values of R I show that OK is more accurate than RK (for SOC and TN mapping) by 3.33% and 10.00%, respectively.
Regarding soil pH mapping, the absolute value of ME for the OK method is less than for the RK method, and vice versa for RMSE values.These values show that the sum of the mean errors for the OK method is smaller than for the RK method.However, the mean errors are unevenly distributed, and large errors are more frequent for the OK method.Therefore, the RK method is more accurate than the OK method (by 1.81%) for soil pH mapping.Figure 7 shows the degree of dispersion of RMSE values for observed and predicted values at 38 validation points.These small values indicate that the predicted values are close to the observed values and vice versa.The method with the larger number of smaller RMSE values is more accurate and vice versa.For SOC mapping, the value of RMSE for 25 of the 38 validation samples with the OK method is smaller than with the RK method.Meanwhile, there are ten samples which have RMSE values that are smaller with the OK method, and only one sample that has a smaller RMSE value with the RK method.There is no difference between the OK and RK RMSE values for the remaining 27 samples.Again, these data confirmed that for SOC and TN mapping, the OK method is more accurate than the RK method.Regarding soil pH interpolation, the number of validation samples that have small RMSE values is equal between the OK and RK methods.Therefore, although the RK method has higher accuracy, it is not significant for soil pH mapping.Figure 7 shows the degree of dispersion of RMSE values for observed and predicted values at 38 validation points.These small values indicate that the predicted values are close to the observed values and vice versa.The method with the larger number of smaller RMSE values is more accurate and vice versa.For SOC mapping, the value of RMSE for 25 of the 38 validation samples with the OK method is smaller than with the RK method.Meanwhile, there are ten samples which have RMSE values that are smaller with the OK method, and only one sample that has a smaller RMSE value with the RK method.There is no difference between the OK and RK RMSE values for the remaining 27 samples.Again, these data confirmed that for SOC and TN mapping, the OK method is more accurate than the RK method.Regarding soil pH interpolation, the number of validation samples that have small RMSE values is equal between the OK and RK methods.Therefore, although the RK method has higher accuracy, it is not significant for soil pH mapping.Figure 7 shows the degree of dispersion of RMSE values for observed and predicted values at 38 validation points.These small values indicate that the predicted values are close to the observed values and vice versa.The method with the larger number of smaller RMSE values is more accurate and vice versa.For SOC mapping, the value of RMSE for 25 of the 38 validation samples with the OK method is smaller than with the RK method.Meanwhile, there are ten samples which have RMSE values that are smaller with the OK method, and only one sample that has a smaller RMSE value with the RK method.There is no difference between the OK and RK RMSE values for the remaining 27 samples.Again, these data confirmed that for SOC and TN mapping, the OK method is more accurate than the RK method.Regarding soil pH interpolation, the number of validation samples that have small RMSE values is equal between the OK and RK methods.Therefore, although the RK method has higher accuracy, it is not significant for soil pH mapping.

The Impact of Environmental Variables on SOC, TN, and Soil pH
Land use has a strong impact on soil properties.Different land use types are managed using different practices, for instance, with the amount and frequency of fertilizer applied [69].Moreover, our previous research [39] indicated that at the study site, the SOC content of plantation forest and arable land is higher than for other land use types.The highest TN average occurred for arable land and the lowest for natural forest.The pH for arable land was significantly higher than for the other land use types.The impact of land use type on SOC has been analyzed in many studies [19,70,71].Our results are consistent with the findings of Liu et al. [72], who stated that LUT has the most influence on SOC when compared to other environmental variables.Quantitative studies of the influence of LUT on TN and soil pH spatial interpolation models have not yet been conducted, however, many researchers [16,27,73,74] have indicated that LUT has an effect on TN and soil pH.The influence of the LUT on SOC, TN, and pH is a consequence of farming systems and fertilizer application.
Ließ et al. [75] indicated that TSAVI does not belong to a group of 13 environmental variables that have the most effect on SOC content.Other authors often use NDVI as a predictor for SOC mapping.Ranjan et al. [76], Kumar et al. [17] and Wu et al. [77] found that the correlation of NDVI with SOC is 0.56, 0.66 and 0.67, respectively, for the soil layer between 0 and 15 cm.Our research, however, analyzed SOC in the soil layer between 0 and 30 cm.This depth difference could explain the weak correlation between SOC and TSAVI.With an increase of soil depth, the correlation of SOC and the density of vegetation decreased gradually [77].This could also explain the correlation between TN and TSAVI, especially at our research site, as the TN content is low.So far, no study has mentioned the relationship between TSAVI and soil pH.West et al. [78] reported that there was no correlation between NDVI calculated from Landsat 8 and soil moisture.Meanwhile, soil moisture has a strong influence on soil pH [79], so the density of vegetation does not show any correlation to soil pH.On the other hand, our research site has a very complex terrain and high annual precipitation.These natural conditions lead to a strong flow rate since the water concentrates in streams instead of dispersing over a wider area.TSAVI has a weak correlation with SOC, TN, and soil pH because of the soil depths and the very steep terrain, as well as the heavy rainfall.
Our results are similar to other authors [30], who also state that there is no significant correlation between TWI and TN content.The spatial TN distribution is more influenced by anthropogenic activity than by topographic features [80].Regarding the SOC concentration, our results indicate that TWI accounted for only 7.19% of the SOC content.This finding coincided with Pei et al. [29], in the case of using the single flow direction algorithm method to calculate TWI, like in our research.Our results are in agreement with Kumar et al. [81] who reported that the correlation between TWI and SOC in a tropical region (India) was 7%.She et al. [82] also stated that TWI is positively correlated with SOC content.In contrast, Wiesmeier et al. [83] and Yang et al. [84] stated that TWI has a very weak negative correlation with SOC.This difference might be attributed to terrain by slowing down SOC decomposition.Obu et al. [30] found a strong correlation between TWI and SOC at Herschel Island, where the maximum elevation is 180 m above sea level.Gamble et al. [20] found that the spatial distribution of SOC corresponds closely with TWI in a region with an elevation difference of only 6 m.Thus, we assume that there is no correlation between TWI and SOC at 30-m spatial resolution in complex terrain.So far, not much research has studied the influence of the TWI on soil pH, but Seibert et al. [85] found that soil pH increased with TWI.Our results indicate that TWI explained the soil pH variance of 4.59%, which means that this effect is very weak.This finding coincides with Huang et al. [86] who reported that the correlation between TWI and soil pH is not significant, only around 12%.Hjerdt et al. [87] stated that the difference in water flow movement in the area is considered a reason for the inconsistency of TWI.This reason may reduce TWI control on the distribution of soil moisture and soil organic matter [88].

Comparison between OK and RK
Zhu and Lin [89] stated that the RK was more accurate for soil property interpolation when a strong relationship existed between predicted soil properties and auxiliary variables, e.g., a coefficient of determination of more than 0.6, indicating that auxiliary variables explain more than 60% of the variance of the predicted variable.In all other cases, the OK was more suitable.Herbst et al. [90] found that the RK was more suitable than the OK for soil mapping when the correlation between soil properties and auxiliary was between 0.2 and 0.55.In our study, the auxiliary variables influenced soil properties by only 14.98% and 7.00% for the SOC and TN, respectively.Therefore, the OK interpolation method is more accurate than the RK method with LUT, TSAVI, and TWI auxiliary variables for SOC and TN mapping.Our results show that with 18.40% variance, the LUT variable improved the soil pH mapping with the RK method.The selection of more auxiliary variables (e.g., elevation, slope, soil moisture) is a possible option to improve the accuracy of the RK method.Wang et al. [91] also stated that RK may be more suitable for spatial predictions in relatively uniform environments, especially those that are suitable for gathering strongly autocorrelated data via regular grid sampling.Table 3 shows that the distance between our samples is too large for this criterion.Increasing sampling density is a solution to increase accuracy via the establishment of an appropriate semivariogram.To improve the accuracy of the RK method, Omuto and Vargas [62] suggested mixed-effects modeling, to avoid the failure of the RK model, recognizing that natural soils occur in groups with unique response characteristics depending on soil formation factors.

Conclusions
In the RK method, LUT is an auxiliary variable that most affects the interpolation model.For soil pH and TN mapping, a single regression of LUT and predicted variables were established for interpolation, whereas multiple regressions of LUT and TWI variables were used for the SOC mapping model.
The interpolated SOC and TN maps show that the OK is more accurate than RK because of the weak correlation between the auxiliary variables and the predicted variables.However, the RK method is better than the OK method for soil pH mapping.The LUT, TSAVI, and TWI at 30-m spatial resolution are not suitable auxiliary variables in the RK method for SOC and TN mapping in this hilly region of Central Vietnam, but the LUT should be considered an auxiliary variable for soil pH mapping with the RK method.

Figure 1 .
Figure 1.Land Use Map and Soil sampling positions.

Figure 1 .
Figure 1.Land Use Map and Soil sampling positions.

3. 2 . 1 .
Environmental Variables Calculation LUT, TSAVI, and TWI are predictor variables (independent variables) in this research.Figure3shows the spatial distribution of these variables.Based on the 2015 land use map[45], scale 1:50000 and the field survey results, we determined four LUTs belonging to the agricultural land use categories were arable land (AL), grassland (GL), natural forest (NF) and plantation forest (PF).The slope and intercept value of the soil line (α = 1.026, β = 0.00003) was determined for the research site.Using Equation (4), the TSAVI ranges from 0 to 0.57.The TSAVI values of AL are lower than for other land use types.The TWI value of the research site changes from 5.4 to 19.87.Most of the area of GL, PF, and NF have TWI values of approximately 10, while the TWI values of AL are higher than those of other land use types.
), the TSAVI ranges from 0 to 0.57.The TSAVI values of AL are lower than for other land use types.The TWI value of the research site changes from 5.4 to 19.87.Most of the area of GL, PF, and NF have TWI values of approximately 10, while the TWI values of AL are higher than those of other land use types.

Figure 3 .
Figure 3. Map of environmental variables.

Figure 3 .
Figure 3. Map of environmental variables.

Figure 4 and
Table 3 show the semivariogram and residual semivariogram of SOC, TN, and pH data.Both semivariogram models (initial variables and residuals) have approximately the same form, but the residuals semivariogram model has a small difference in sill, nugget, and range.

Figure 5 .
Figure 5. Maps of SOC, TN, pH by OK method (left) and by RK method (right).

Figure 5 .
Figure 5. Maps of SOC, TN, pH by OK method (left) and by RK method (right).

Figure 6 .
Figure 6.Percentage of area for SOC, TN, and soil pH by the OK and the RK method.

Figure 7 .
Figure 7. Root mean square error (RMSE) values of validation samples.

Figure 6 .
Figure 6.Percentage of area for SOC, TN, and soil pH by the OK and the RK method.

18 Figure 6 .
Figure 6.Percentage of area for SOC, TN, and soil pH by the OK and the RK method.

Figure 7 .
Figure 7. Root mean square error (RMSE) values of validation samples.

Figure 7 .
Figure 7. Root mean square error (RMSE) values of validation samples.

Table 1 .
Description of soil samples.The units for SOC and TN are the percentage of soil weight.

Table 1 .
Description of soil samples.

Table 2 .
Variance explanation for models.The semivariogram depicts the spatial autocorrelation of the measured sample points.Figure4and Table3show the semivariogram and residual semivariogram of SOC, TN, and pH data.Both semivariogram models (initial variables and residuals) have approximately the same form, but the residuals semivariogram model has a small difference in sill, nugget, and range.

Table 2 .
Variance explanation for models.The semivariogram depicts the spatial autocorrelation of the measured sample points.

Table 4 .
Accuracy assessment of ordinary kriging (OK) and regression kriging (RK) method for SOC, TN, and pH mapping.