A Comparison of Markov Chain Random Field and Ordinary Kriging Methods for Calculating Soil Texture in a Mountainous Watershed , Northwest China

Accurate mapping the spatial distribution of different soil textures is important for eco-hydrological studies and water resource management. However, it is quite a challenge to map the soil texture in data scarce, hard to access mountainous watersheds. This paper compares a nonlinear method, the Markov chain random field (MCRF) with a classical linear method, ordinary kriging (OK) for calculating the soil texture at different search radiuses in the upstream region of the Heihe River Watershed. Results show that soil texture values that were calculated by the OK method tends to predict soil texture values within a certain range (sand (12.098~40.317), silt (47.847~71.231), and clay (12.781~19.420)) because of the smoothing effect, thus leading to greater accuracy in predicting the major soil texture type (silt loam). Nonetheless, the MCRF method considers the interclass relationships between sampling points, leading to greater accuracy in predicting minor types (loam and sandy loam). Meanwhile, the OK method performed best for all the types at the radius of 65 km influenced by the densities of all the sampling points, while the best performance of the MCRF method differs with radiuses as the largest densities varying for different soil types. For loam and sandy loam, the OK method ignored them, thus the MCRF method is more suitable in mountainous areas with high soil heterogeneity.

Field surveys provide the most accurate soil texture data [4,11].However, it is difficult and expensive to map the spatial distribution of soil texture by field sampling at point scale [12].In recent years, remote sensing has become an effective way to obtain soil texture data over large scales [13,14].
Soil texture types are estimated from field samples and satellite images.However, results that were obtained from remote sensing may contain inaccuracies, thus leading to uncertainties in the estimates based on such techniques [15].To date, scale expansion methods based on limited field sampling data are the most widely used approaches to map the spatial distribution of soil texture at large scales [1,[16][17][18][19].
Over the past decades, numerous scale expansion methods have been used to infer the spatial distribution of soil texture types at large scales.These methods can be divided into four groups: statistical methods, data mining methods, kriging methods, and Markov chain models.Statistical methods, such as multiple linear regression and multiple stepwise regression (MSR) [20], are easy to apply.However, the accuracy of these methods is limited by the reliability of the sampling data [20].Data mining methods, such as the random forest model [21][22][23] and the artificial neural network models [1], often perform well in mapping soil texture [1,20,23].However, such methods lack physical mechanisms, which limits their applications [24].
The kriging methods have been frequently applied in soil science to estimate the spatial variation in soil types [25].The ordinary kriging (OK) method is the basis of all the kriging methods.A number of kriging methods extend the OK method by adding auxiliary information, aiming to improve the interpolation accuracy [13,[26][27][28] To date, the OK method is still considered to be the best linear, unbiased estimator and it remains the most popular geostatistical interpolation method [29,30] because of fewer parameter requirements and relatively consistent results [31].However, a major limitation of the kriging methods is its assumption that the spatial distributions and changes of the interpolated properties are continuous [1].To meet this requirement, a large number of sampling points are needed to generate an accurate high-resolution soil map, which is quite difficult or even impossible in high elevation, topographically complex mountainous areas with high spatial variability of soil texture, especially at large scales [1,[32][33][34].Additionally, the kriging methods do not consider interclass relationships between soil texture values, thus making it less applicable to areas with great soil heterogeneity.
The Markov chain (MC) theory was first proposed by a Russian mathematician Andrey Markov in 1906 [35].It describes stochastic processes with discrete state space or discrete index set [36].Over the past century, the MC has been widely applied to simulate and predict stochastic processes in both physical and social-economic sciences [37,38].Aiming for areal soil mapping from irregularly distributed point samples, the Markov chain random field (MCRF) method was proposed by Li (2007) by extending Markov chains into practical nonlinear geostatistical method.The MCRF method considers the interactions of the unknown location being estimated with its nearest known neighbors in different directions [39,40].It also can incorporate interclass relationships (i.e., cross-correlations, neighboring relationships, and directional asymmetry of class patterns) through cross transition probabilities.Additionally, it has also been demonstrated the capabilities of addressing the underestimation of small-class problem of the coupled Markov chain model [41].The MCRF method has been demonstrated to predict the soil texture types from limited soil samples, which makes it more suitable for estimating soil texture distribution in areas with high soil heterogeneity over large scales [41].However, the MCRF method has only been applied in a small flat, square area with 5.9 km by 5.9 km.The performance and influencing factors of the MCRF method at large scale are still unknown, especially in high elevation, mountainous areas with considerable variation in soil texture.
This study compares the MCRF method with the OK method at the watershed scale.The study area is the upstream of the Heihe River Watershed in Northwest China, a data scarce, mountainous area with high elevation and complex topography.This paper first applies both the MCRF and OK methods in order to calculate soil texture data based on a total of 178 field sampling points, and then evaluates the performances of the both methods by cross validation.Subsequently, the paper discusses the applicability and the influence factors of both methods in the mountainous watershed.

The Study Area
The Heihe River Watershed, which is located in Northwest China with a drainage area of 128,000 km 2 , is the second largest inland river (or terminal lake) in China (97.494 • -101.548• E, 37.731 • -39.655 • N) (Figure 1).The upstream of the Heihe River Watershed is located in the Qilian mountain ranges and it produces almost all of the water resource for the entire watershed [42,43].The elevation of the upstream of the Heihe River Watershed ranges from 1674 to 5584 m above sea level (m a.s.l.) [44,45].
Sustainability 2018, 10, x FOR PEER REVIEW 3 of 17 N) (Figure 1).The upstream of the Heihe River Watershed is located in the Qilian mountain ranges and it produces almost all of the water resource for the entire watershed [42,43].The elevation of the upstream of the Heihe River Watershed ranges from 1674 to 5584 m above sea level (m a.s.l.) [44,45].Soil maps available for this watershed include the global soil texture map (1:1,000,000 scale) [46] and the 1:1,000,000 soil map of China [47,48].Recently, two more soil texture maps have become available, for the study area [8,15].Gao et al. [15] developed a soil texture map for the watershed, which is based on the soil classification by the Chinese occurrence taxonomy or diagnosis taxonomy classification criteria with 53 sampling points for the entire Heihe River Watershed.Lu et al. [8] used fuzzy logic and data fusion to generate a soil texture map from 220 samples, for the entire Heihe River watershed with most of the samples for the upstream region taken from the valley area.To date, no detailed digital soil texture mapping for the entire Heihe River Watershed has been conducted [8].
These two coarse maps show seven soil texture types in the study area: loam, loamy sand, sandy clay loam, sandy loam, silt, silt loam, and silty clay loam [8,15].Silt loam is the main soil texture type in the upstream area of the Heihe River Watershed.Lu et al. [8] also report that the proportions of silt loam, loam, and sandy loam are 42%, 25%, and 26%, respectively, and the remaining types (loamy sand, sand, sandy clay loam, silty clay sand, and silty sand) account for less than 8% of the soil types in the entire watershed [8].
However, the numbers of sampling points used to generate the existing maps are quite small, thus limiting the quality and applications of these maps [49,50].As the upstream region of the Heihe River Watershed is the main runoff generation area for the entire watershed, high-quality soil texture maps are urgently needed to support eco-hydrological modeling and water resource management of the watershed.Soil maps available for this watershed include the global soil texture map (1:1,000,000 scale) [46] and the 1:1,000,000 soil map of China [47,48].Recently, two more soil texture maps have become available, for the study area [8,15].Gao et al. [15] developed a soil texture map for the watershed, which is based on the soil classification by the Chinese occurrence taxonomy or diagnosis taxonomy classification criteria with 53 sampling points for the entire Heihe River Watershed.Lu et al. [8] used fuzzy logic and data fusion to generate a soil texture map from 220 samples, for the entire Heihe River watershed with most of the samples for the upstream region taken from the valley area.To date, no detailed digital soil texture mapping for the entire Heihe River Watershed has been conducted [8].
These two coarse maps show seven soil texture types in the study area: loam, loamy sand, sandy clay loam, sandy loam, silt, silt loam, and silty clay loam [8,15].Silt loam is the main soil texture type in the upstream area of the Heihe River Watershed.Lu et al. [8] also report that the proportions of silt loam, loam, and sandy loam are 42%, 25%, and 26%, respectively, and the remaining types (loamy sand, sand, sandy clay loam, silty clay sand, and silty sand) account for less than 8% of the soil types in the entire watershed [8].
However, the numbers of sampling points used to generate the existing maps are quite small, thus limiting the quality and applications of these maps [49,50].As the upstream region of the Heihe River Watershed is the main runoff generation area for the entire watershed, high-quality soil texture maps are urgently needed to support eco-hydrological modeling and water resource management of the watershed.

Soil Sampling
We used the following formula to determine the minimum number of soil samples that would be needed to meet the sampling requirement of a robust statistical analysis [51].
where B denotes the percentiles, about α k of the Chi-square distribution with 1 degree of freedom; k is the total number of soil type in the study area and α is the confidence level, Π i denotes the proportion of type with proportion closest to 50% among all the types, and b i denotes the desired precision of the type i.In this study, soil type i (saturated frigid frozen soils under dense grassland between 3500 and 4000 m a.s.l.) occupies roughly 9.33% of the soil types and this proportion is closest to 50%.We set b i = 0.05 and α = 0.85, yielding a value of 170 for N.This indicates that if soil samples are collected randomly from 170 points, the field samples are to be statistically analyzed with a confidence level 85% and precision of 95%.
A total of 178 soil samples of various soil texture types were collected from the surface layer (0-10 cm) in the upstream region of the Heihe River Watershed between July 2012 and August 2015 (Figure 2).We used a Mastersizer 2000 (Malvern Instruments Ltd., Malvern, UK; the measurement range is 0.02-2000 µm) laser diffraction particle size analyzer at the Key Laboratory of West China's Environmental System (Ministry of Education) in the College of Earth and Environmental Sciences at Lanzhou University to analyze the proportions of sand, silt, and clay in the samples.We used the United States Department of Agriculture (USDA) soil texture classification scheme [52,53] to determine the soil textures of the 178 soil samples.Five soil texture types were identified: silt loam, clay loam, loam, sandy loam, and loamy sand.The percentage of each type is shown in Table 1 and Figure 3. Environmental System (Ministry of Education) in the College of Earth and Environmental Sciences at Lanzhou University to analyze the proportions of sand, silt, and clay in the samples.We used the United States Department of Agriculture (USDA) soil texture classification scheme [52,53] to determine the soil textures of the 178 soil samples.Five soil texture types were identified: silt loam, clay loam, loam, sandy loam, and loamy sand.The percentage of each type is shown in Table 1 and Figure 3.

The MCRF Method
The MCRF is defined as the single spatial Markov chain that moves (or jumps) within a given space, where its conditional probability distribution (CPD) depends on its nearest known neighbors within a circle of a particular radius [54].When using the MCRF method, the first step is to fit the CPD function that describes the relationship between transition probability values and distance [40].The transition probability value between two types can be calculated by Equation (2): where, F ts (d) represents the frequency of transitions from type t to type s at distance d, and the texture type is identified by the percentage of sand, silt, and clay of the point by the USDA classification, n is the total number of soil texture types present among the sampling points at different radius.
The number of neighbors is determined by radius; changing the radius therefore changes the transition probability values.We constructed all of the CPD functions for all pairs of soil texture types for different values of radius.
The next step is to calculate the transition probability values between the prediction point and the known sampling points.For each selected prediction point, the number of sample points that are used in the calculation is different at each radius.The CPD value Pr(u i ) between sample point u i and the prediction point u j is calculated by Equation (3) (Figure 4): where, p u k u j (h k ) represents the transition probability value from sample point u k to prediction point u j separated by distance h k between points u k and u j , p u j u i (h i ) represents the transition probability value from the sample point u i to prediction point u j separated by distance h i between points u i and u j , m represents the number of sample points lying within a circle of the selected radius whose center is the prediction point, n represents the number of known locations in the whole space, and u f represents the f th point in this space [54,55].The initial (or reference) soil texture types of the prediction point and calculation point determine which CPD function will be used.Then, this function will be used to calculate the transition probability value in the Equation (3).In this study, the initial (or reference) soil texture type of the prediction point is determined from the existing coarse resolution soil texture map [18].At last, for a prediction point u j , its predicted soil texture value (composition of sand, silt, and clay) Z * u j , is calculated by Formula (4).
where, Z(u i ) is the percentage values of each sand, silt, and clay of sample point u i .Then, the soil texture type of u j is identified by the composition of percentage values of sand, silt, and clay.* Pr where, is the percentage values of each sand, silt, and clay of sample point ui.Then, the soil texture type of is identified by the composition of percentage values of sand, silt, and clay.

The Ordinary Kriging Method
The Ordinary Kriging (OK) method is a well-known geostatistical method.The OK estimate the value of a random variable * (proportion of sand, silt, and clay) at the point and the estimator is described by Equations ( 5) and ( 6) below [56]: * (5)

The Ordinary Kriging Method
The Ordinary Kriging (OK) method is a well-known geostatistical method.The OK estimate the value of a random variable Z * OK u j (proportion of sand, silt, and clay) at the point u j and the estimator is described by Equations ( 5) and ( 6) below [56]: where, Z(u i ) is the measured soil texture value of sample point u i , Z OK u j is the predicted value at point u j , and λ OK i is the weighted index of point u i calculated using Equation ( 7) below.(The meaning of m and Z(u i ) are the same as described in the MCRF method).
The weights λ OK i are found by solving the following system of linear equations: where, C R u i − u j is the covariance function for the residual component of the variable calculated by the semivariogram functions between u j and u i , and µ OK is the Lagrange parameter to minimize the kriging variance [26].In this study, types of semivariograms are from the Gaussian semivariogram model because the Gaussian semivariogram model generates a relatively smooth estimated surface [57].
It is widely used in the field of approximate optimization [58].

Evaluation Criteria
In this study, we used the mean absolute error (MAE) and the relative mean absolute error (RMAE) to evaluate the soil texture type predictions that are derived using the MCRF and OK methods [43,59].MAE and RMAE are computed while using the following formulas: where, M is the number of field sample points used for validation, Z * u j is the prediction value of point u j , and Z u j is the measured value of u j .
Both MAE and RMAE are used to quantify the error between the predicted and observed values.By definition, both measures are greater than or equal to zero, with equality when the predicted values are identical to the observed values [43,59,60].The MAE value reflects the error between the predicted value and measured value.The RMAE value represents the ratio of the error between the measured value and the predicted data set to the measured value of all the points.
In this study, the soil texture value of each sample point was predicted from the measured values of other sample points by applying the OK and the MCRF methods.Then, through cross validation, the predictive accuracy of the two methods was evaluated while using MAE and RMAE.

Results
In the study area, around 90% of field samples are the silt loam type (Table 1).Thus, we refer to silt loam as the 'major' soil texture type and all the other types, sandy loam, clay loam, and loam, as 'minor' types.Because only one sample is the loamy sand type, it was discarded in the comparison in this study.

The Search Radiuses
Almost 90% of field samples are of the silt loam type, thus any sample point is likely to have at least one neighboring sample point of this type, irrespective of the search radius.Given that predicted results are influenced by sample points having the same type [61], we specified our search radiuses on the basis of distances between the samples having the same minor soil texture type.The shortest distances between the samples of the same minor soil texture type: clay loam to clay loam, sandy loam to sandy loam, and loam to loam were 33 km, 64 km, and 82 km, respectively; the second shortest distances were 82 km, 65 km, and 83 km, respectively; and, the furthest distances were 290 km, 290 km, and 159 km, respectively.We also included the search radius of 500 km, the furthest distance between two sample points, which has the effect of considering all of the sample points when applying the OK and MCRF methods.In summary, search radiuses of 33, 64, 65, 82, 83, 159, 290, and 500 km have been applied in this study.

The Performance Comparison of the MCRF and OK Methods in Calculating Individual Sand, Silt, and Clay Component
We compared the accuracy of the MCRF and OK methods in terms of their respective abilities to predict the quantities of sand, silt, and clay in a sample.As shown in Figure 5, for silt and clay, the MAE and RMAE values for the OK method are lower than those for the MCRF method, suggesting that the OK method is more accurate than the MCRF method in calculating silt and clay, irrespective of the search radius.For sand, although the RMAE values for both methods are close at a radius of 159 km, the OK method is more accurate than the MCRF method for all of the search radiuses.
We compared the accuracy of the MCRF and OK methods in terms of their respective abilities to predict the quantities of sand, silt, and clay in a sample.As shown in Figure 5, for silt and clay, the MAE and RMAE values for the OK method are lower than those for the MCRF method, suggesting that the OK method is more accurate than the MCRF method in calculating silt and clay, irrespective of the search radius.For sand, although the RMAE values for both methods are close at a radius of 159 km, the OK method is more accurate than the MCRF method for all of the search radiuses.As the search radius increases, the performance of the OK method in predicting the amount of sand and silt first increases and then decreases, with the greatest accuracy being obtained when the radius is 65 km (Figure 5).The accuracy of the OK method in predicting the amount of clay shows little variation.The accuracy of the MCRF method in predicting the quantities of sand and silt also increases first and then decreases, with the greatest accuracy being obtained when the search radius is 159 km.The accuracy of the MCRF method in predicting quantities of clay shows little variation, like that of the OK method.Overall, in predicting quantities of sand, clay, and silt, the accuracy of the MCRF method shows larger variations than the OK method as the search radius increases, and both methods show the greatest accuracy in predicting quantities of clay, followed by silt, and then sand.

The Performance of Both the MCRF and OK Methods in Calculating Different Soil Texture Types
In this study, we also evaluated the accuracy of the MCRF and OK methods in predicting different soil texture types.As shown in Figure 6, the accuracy of the MCRF method exhibits little variation when predicting quantities of sand, clay, and silt for silt loam.For clay loam, the accuracy of the MCRF method in predicting quantities of sand, clay, and silt decreases until the radius reaches 64 km, and then remains stable.For loam, the accuracy of the MCRF method varies little up to search radiuses of 65 km, sharply increases at the radius of 82 km, and it subsequently remains stable.For sandy loam, its accuracy improves as the radius increases up to 159 km and subsequently decreases.Overall, the accuracy of the MCRF method is similar for all of the search radiuses for the major type (silt loam), but it varies significantly for the minor soil texture types as the radius changes.
shows larger variations than the OK method as the search radius increases, and both methods show the greatest accuracy in predicting quantities of clay, followed by silt, and then sand.

The Performance of Both the MCRF and OK Methods in Calculating Different Soil Texture Types
In this study, we also evaluated the accuracy of the MCRF and OK methods in predicting different soil texture types.As shown in Figure 6, the accuracy of the MCRF method exhibits little variation when predicting quantities of sand, clay, and silt for silt loam.For clay loam, the accuracy of the MCRF method in predicting quantities of sand, clay, and silt decreases until the radius reaches 64 km, and then remains stable.For loam, the accuracy of the MCRF method varies little up to search radiuses of 65 km, sharply increases at the radius of 82 km, and it subsequently remains stable.For sandy loam, its accuracy improves as the radius increases up to 159 km and subsequently decreases.Overall, the accuracy of the MCRF method is similar for all of the search radiuses for the major type (silt loam), but it varies significantly for the minor soil texture types as the radius changes.The accuracy of the OK method in predicting quantities of sand, clay, and silt is shown in Figure 7.For silt loam, the accuracy of the OK method remains fairly constant.For clay loam, the OK method achieves its greatest accuracy at a radius of 65 km, varies slightly in predicting sand and silt, and it remains fairly constant in predicting clay.For loam and sandy loam, the OK method is most accurate at a radius of 65 km and then decreases in predicting sand and silt, while remaining stable in predicting clay.Overall, the OK method shows little variation in accuracy in predicting soil texture types, irrespective of the search radius.The accuracy of the OK method in predicting quantities of sand, clay, and silt is shown in Figure 7.For silt loam, the accuracy of the OK method remains fairly constant.For clay loam, the OK method achieves its greatest accuracy at a radius of 65 km, varies slightly in predicting sand and silt, and it remains fairly constant in predicting clay.For loam and sandy loam, the OK method is most accurate at a radius of 65 km and then decreases in predicting sand and silt, while remaining stable in predicting clay.Overall, the OK method shows little variation in accuracy in predicting soil texture types, irrespective of the search radius.
The accuracy of the OK method in predicting quantities of sand, clay, and silt is shown in Figure 7.For silt loam, the accuracy of the OK method remains fairly constant.For clay loam, the OK method achieves its greatest accuracy at a radius of 65 km, varies slightly in predicting sand and silt, and it remains fairly constant in predicting clay.For loam and sandy loam, the OK method is most accurate at a radius of 65 km and then decreases in predicting sand and silt, while remaining stable in predicting clay.Overall, the OK method shows little variation in accuracy in predicting soil texture types, irrespective of the search radius.Both of the methods were compared for all of the soil texture types at their best radius.As shown in Table 2, for the clay loam and major type-silt loam, the OK method performed significantly better than the MCRF method in calculating the sand, clay, and silt.For calculating sand, clay, and silt of the loam, the MCRF method performed significantly better than the OK method.Both methods show similar performance in calculating sand, clay, and silt of the sandy loam.Both of the methods were compared for all of the soil texture types at their best radius.As shown in Table 2, for the clay loam and major type-silt loam, the OK method performed significantly better than the MCRF method in calculating the sand, clay, and silt.For calculating sand, clay, and silt of the loam, the MCRF method performed significantly better than the OK method.Both methods show similar performance in calculating sand, clay, and silt of the sandy loam.The OK method achieves its greatest accuracy for a search radius of 65 km, irrespective of soil texture type.In contrast, the accuracy of the MCRF method varies according to soil texture type and search radius, achieving its greatest accuracy in predicting silt loam, clay loam, loam, and sandy loam at radius of 159 km, 33 km, 82 km, and 159 km, respectively.When comparing the accuracy of both methods at all radiuses, the OK method is slightly better than the MCRF method for silt loam, the major soil texture and is worse than the MCRF method for loam and sandy loam.For clay loam, the accuracy of the OK method is slightly better than the MCRF method, except at the 33 km radius.

The Impacts of Different Computation Mechanisms
As the search radius increases, the accuracy of the OK method remains fairly stable, while the accuracy of the MCRF method shows larger variations for all soil texture types.This discrepancy is mainly caused by the different computation mechanisms that are used by each method.When computing the predicted value for a point by the OK method, the weights of the sampling points are determined by a semivariogram function.For each such prediction point, the semivariogram function is derived to describe the relationships between distances and variance values of the sampling points within a specific radius around the prediction point.As the radius increases, new sampling points are incorporated into the calculation.Although the farthest distance varies significantly, most variance values also remain in the same range and thus show little variation.Thus, the sill values show smaller variations as the radius increases, yielding smaller variations in the accuracy of the OK method.Moreover, the OK method only considers the distribution of the sampling points, which also results in the relatively consistent accuracy of the OK method in calculating all of the soil texture types for all search radiuses.On the other hand, the MCRF method identifies the soil texture type of the prediction point.Then, the different transition probabilities will be applied in the calculation, which means that its results are likely to reflect the influence of different soil texture types in the sampling points.This leads to larger variations in the accuracy of the MCRF method as the radius increases.
The OK method is more accurate than the MCRF method for the major soil texture type, silt loam.This is because the OK method is a classical linear method, thus has a smoothing effect with overestimations of small values and underestimations of high values [18,62].Thus, we would expect the results produced by the OK method to occupy a relatively small range of values (excluding extreme values) [18].In this study, because silt loam accounts for around 90% of the sampling points, the predicted values are concentrated in the ranges that are associated with silt loam, thus leading to the greater accuracy of the OK method.
The MCRF method is more accurate than the OK method for loam and sandy loam.This is mainly because the MCRF method incorporates interclass relationships as a result of having different CPD functions for different situations [54].The MCRF method applies the CPD function to calculate the transition probabilities between the soil texture types in the study area.The transition probabilities between the points of the same soil texture type are larger than those between points having different soil texture types.When the first sampling point with the same soil texture type of the prediction point is included in the calculation, the larger transition probability of this sampling point causes the predicted values tending towards the soil texture type of the sampling point.In other words, the interclass relationships enable MCRF to account for complex categorical variables from the limited samples [54], leading to its greater accuracy when predicting the minor soil texture types.
For the classification of soil texture type, only main soil texture type (silt loam) is identified by the OK method in this study in all radiuses.To the opposite, the MCRF method could predict loam and sandy loam with radiuses 82 km and 159 km, respectively.With all sample points being added to the calculation, the MCRF could predict 75% points belonging to all minor types that were correctly in this study.The OK method, however, failed to predict any point belonging to all minor types.

The Impacts of Sampling Point Distribution
Density of sampling points affects the performance of interpolation methods [18,63].For example, Delabri et al. reported that the performance of OK method, for the silt, the root mean square error was 3.35 for the study area of 0.25 km 2 with 136 sample points [18]; the performance became worse, the root mean square error was 16.09 with the study area increasing to 144 km 2 with 164 sample points [63].Thus, it appears the density of sample points affects the performance of the OK interpolation.
In this study, the OK method achieves its greatest accuracy at a radius of 65 km for all soil texture types, largely because of the sampling point density at this radius.The sampling point density of a point is determined by Equation ( 10): where m is the number of points within a circle of radius r centered at the sampling point, and π is 3.1415.Intuitively, the accuracy of the OK method will be improved as the number of sample points increases, and the number of sampling points is related to the sampling point density.As shown in Table 3, the average sampling point density decreases as the radius increases.Although the radius of 33 km has the highest average density, only two sampling points has been used in the calculation, which leads to significantly worse performance.At the radius of 65 km, however, the average number of sampling points used for each prediction point is 30, and the minimum number is 7. Therefore, the OK method achieves greatest accuracy at the radius of 65 km because of the combination of higher sampling density within the search radius.Density((km 2 ) −1 ) 0.0025 0.0022 0.0023 0.0020 0.0020 0.0015 0.0006 0.0002 For the MCRF method, the greatest accuracy is achieved at the radius of 33 km for clay loam, at 82 km for loam, and at 159 km for sandy loam (Figure 5), which may be attributed to the sampling point density of these soil texture types at different radiuses.As shown in Table 4, the sampling point density of clay loam is highest with a value of 14.6 × 10 −5 (km 2 ) −1 at 33 km, while the highest density of the loam is 9.5 × 10 −5 (km 2 ) −1 at 82 km, and that of sandy loam is 4.7 × 10 −5 (km 2 ) −1 at 159 km.In other words, for each soil texture type, the performance of the MCRF method is related to its sampling point density of the same type at all radiuses (Table 4).The accuracy of the MCRF method is highest for silt loam and sandy loam at the search radius of 159 km.Given that over 95% of the soil in the study area is of these two types (Table 2), the MCRF performs best at the search radius of 159 km.

The Impacts of Sampling Point Variability
Both the OK and MCRF methods exhibit reasonably consistent accuracy in predicting the major soil texture type, silt loam, for all of the search radiuses.The OK method only considers the distribution of the sampling points, thus exhibiting consistent accuracy for all soil texture types [64].For the MCRF method, an interclass relationship exists with the major type at all radiuses because around 90% of the samples are silt loam.Therefore, there is no significant variation in the accuracy of the MCRF method as the search radius varies.
Both the OK and MCRF methods are most accurate when predicting the quantity of clay, then silt and sand.This is due to the statistical characteristics of the sampling points.The average proportions of the clay, silt, and sand in the study area are 15.858%, 63.006%, and 21.136%, with standard deviations 4.577, 9.850, and 12.433, respectively.The low standard deviation indicates that the data tend to be close to a specific value (usually the mean value or expected value), while the high standard deviation indicates that data spread out over a wider range of the values [65].Because scale expansion methods tend to reflect the main trend of the sampling data, they perform poorly at handling large fluctuations in the data.Therefore, both of the methods perform best in predicting the proportions of clay, because it has the lowest standard deviation of the three components.
In this study, both of the methods show similar accuracy for clay loam, for which the average proportions of sand, clay, and silt are 9.484%, 30.468%, and 60.048%, respectively.The corresponding values are 18.977%, 15.991%, and 65.033% for silt loam, 43.230%, 11.098%, and 45.672% for loam, and 61.171%, 7.064%, and 31.765% for sandy loam.In other words, the composition of clay loam is closest to that of silt loam when comparing with other minor types.The accuracy of the OK method is consistent for all soil texture types.For the MCRF method, the similarity between clay loam and silt loam leads to a little variation when a new clay loam sampling point is used in the calculation.Thus, both of the methods show similar performance for clay loam.

Conclusions
This study compares a relatively nonlinear method, Markov chain random field, with a classical linear method, ordinary kriging, for computing soil texture types at the watershed scale in the upstream region of the Heihe River Watershed.Using soil texture data obtained from 178 sampling points, both of the methods were applied at different search radiuses to calculate soil texture types by cross validation.Results show that the predicted soil texture values computed by the OK method tend to concentrate within a small range (sand (12.098~45.440),silt (41.964~71.231),and clay (12.781~19.419)),thus leading to greater accuracy in predicting the major soil texture type, silt loam (sand (12.098~40.317),silt (47.847~71.231),and clay (12.781~19.420)).In contrast, the MCRF method considers the interclass relationships between sampling points, which effectively captures the complex categorical variables from the limited soil samples, leading to greater accuracy in predicting minor types, loam, and sandy loam.

Figure 1 .
Figure 1.The Heihe River Watershed: the black line represents the boundary of the upstream region of the Heihe River Watershed; the blue line represents the river in the Heihe River Watershed; and, the red line represents the boundary of the study area.

Figure 1 .
Figure 1.The Heihe River Watershed: the black line represents the boundary of the upstream region of the Heihe River Watershed; the blue line represents the river in the Heihe River Watershed; and, the red line represents the boundary of the study area.

Figure 2 .
Figure 2. Locations of sampling points (shown in black) in the upstream region of the Heihe River Watershed.Figure 2. Locations of sampling points (shown in black) in the upstream region of the Heihe River Watershed.

Figure 2 . 17 Figure 3 .
Figure 2. Locations of sampling points (shown in black) in the upstream region of the Heihe River Watershed.Figure 2. Locations of sampling points (shown in black) in the upstream region of the Heihe River Watershed.Sustainability 2018, 10, x FOR PEER REVIEW 5 of 17

Figure 3 .
Figure 3.The components of 178 sampling points in the in the United States Department of Agriculture (USDA) triangular.

Figure 4 .
Figure 4. Illustration of a Markov chain random field.Arrows represent directional interactions in the spatial Markov chain.

Figure 4 .
Figure 4. Illustration of a Markov chain random field.Arrows represent directional interactions in the spatial Markov chain.

Figure 5 .
Figure 5.Comparison of mean absolute error (MAE) and relative mean absolute error (RMAE) for the ordinary kriging (OK) and Markov chain random field (MCRF) methods by soil component; (a): the MAE result of sand; (b): the RMAE result of sand; (c): the MAE result of clay; (d): the RMAE value of clay; (e): the MAE value of silt; (f): the RMAE value of silt.

Figure 5 .
Figure 5.Comparison of mean absolute error (MAE) and relative mean absolute error (RMAE) for the ordinary kriging (OK) and Markov chain random field (MCRF) methods by soil component; (a): the MAE result of sand; (b): the RMAE result of sand; (c): the MAE result of clay; (d): the RMAE value of clay; (e): the MAE value of silt; (f): the RMAE value of silt.

Figure 6 .
Figure 6.The MAE and RMAE for the MCRF method for different soil texture types by soil component; (a): the MAE value of sand by the MCRF method; (b): the RMAE value of sand by MCRF; (c): the MAE value of clay by MCRF; (d): the RMAE value of clay by MCRF; (e): the MAE value of silt by MCRF; and, (f): the RMAE value of silt by MCRF.

Figure 6 .
Figure 6.The MAE and RMAE for the MCRF method for different soil texture types by soil component; (a): the MAE value of sand by the MCRF method; (b): the RMAE value of sand by MCRF; (c): the MAE value of clay by MCRF; (d): the RMAE value of clay by MCRF; (e): the MAE value of silt by MCRF; and, (f): the RMAE value of silt by MCRF.

Figure 7 .
Figure 7.The MAE and RMAE for the OK method for different soil texture types by soil component; (a): the MAE value of sand by OK; (b): the RMAE value of sand by OK; (c): the MAE value of clay by OK; (d): the RMAE value of clay by OK; (e): the MAE value of silt by OK; and, (f): the RMAE value of silt by OK.

Figure 7 .
Figure 7.The MAE and RMAE for the OK method for different soil texture types by soil component; (a): the MAE value of sand by OK; (b): the RMAE value of sand by OK; (c): the MAE value of clay by OK; (d): the RMAE value of clay by OK; (e): the MAE value of silt by OK; and, (f): the RMAE value of silt by OK.

Table 1 .
Soil texture types used in this study based on the USDA soil classification scheme.

Table 2 .
Performance of the OK and MCRF methods for the best radius for each soil texture type.

Table 2 .
Performance of the OK and MCRF methods for the best radius for each soil texture type.

Table 3 .
Average sampling point density for different radiuses.

Table 4 .
Average sampling point density for each minor soil texture type.Note: Values are stated in units of ×10 −5 (km 2 ) −1 .