1. Introduction
Direct measurements of soil hydraulic properties in the field and laboratory can be tedious, laborious, and often expensive because of the significant inherent spatial variability of these properties. Therefore, pedotransfer functions (PTFs) are often developed and used to estimate these properties indirectly by establishing empirical relationships with readily available soil properties such as soil texture, bulk density (BD), and soil organic matter content (SOM) [1]. The primary soil hydraulic properties include the soil water retention and hydraulic conductivity curves (SWRC and SHCC), which define the nonlinear relationships of the volumetric water content with the soil tension and the soil hydraulic conductivity, respectively. The hydraulic conductivity decreases as the volumetric water content decreases because of a reduction in the cross-sectional area of water flow and increased tortuosity and drag forces [2,3].
The experimental determination of the SHCC is more complicated than that of the SWRC. Therefore, the SHCC is often derived from the SWRC and saturated hydraulic conductivity (Ks) information. A popular four-parameter expression developed by van Genuchten [4] is widely used for SWRC parametrization; coupled with the Mualem–van Genuchten model [4,5], it is often used for SHCC parametrization with Ks as a scaling factor. The SHCC can also be described by Gardner’s empirical expression [6], which in some cases performs similarly to or even better than the Mualem–van Genuchten model [6].
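The van Genuchten retention expression and the Mualem–van Genuchten conductivity model mentioned above can be sketched as follows. This is a minimal illustration in their commonly cited closed forms, assuming m = 1 − 1/n and the classical pore-connectivity value L = 0.5; the function names and the parameter values in the usage lines are hypothetical and are not taken from this study.

```python
def vg_effective_saturation(h, alpha, n):
    """Effective saturation Se at soil tension h (h >= 0, in units of 1/alpha)."""
    m = 1.0 - 1.0 / n  # common Mualem constraint (assumption)
    return (1.0 + (alpha * h) ** n) ** (-m)


def vg_water_content(h, theta_r, theta_s, alpha, n):
    """Four-parameter van Genuchten SWRC: volumetric water content at tension h."""
    return theta_r + (theta_s - theta_r) * vg_effective_saturation(h, alpha, n)


def mvg_conductivity(h, ks, alpha, n, L=0.5):
    """Mualem-van Genuchten SHCC, scaled by the saturated conductivity ks."""
    m = 1.0 - 1.0 / n
    se = vg_effective_saturation(h, alpha, n)
    return ks * se ** L * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


# Illustrative (not measured) parameters: alpha in 1/cm, ks in cm/day.
params = dict(alpha=0.02, n=1.5)
k_sat = mvg_conductivity(0.0, ks=10.0, **params)
k_dry = mvg_conductivity(1000.0, ks=10.0, **params)
theta_sat = vg_water_content(0.0, theta_r=0.05, theta_s=0.45, **params)
```

At zero tension the sketch returns Ks and the saturated water content, and the conductivity falls steeply as tension increases, which is the scaling role of Ks described in the text.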
PTFs are mainly developed to estimate only Ks (point PTF) and the parameters of the van Genuchten water retention model (parametric PTF), which are subsequently used for estimating the SHCC using the abovementioned approach [7,8,9,10]. For example, Schaap and Leij [11] used PTF-based SWRC parameters of the van Genuchten equation to estimate the SHCC. They observed that the best results were obtained when the parameters Ks and L (a term for the interaction between pore size and tortuosity) were treated as flexible rather than being fixed as in the classical Mualem–van Genuchten model. PTFs estimating unsaturated hydraulic conductivity at specific moisture tensions also exist (e.g., Moosavi and Sepaskhah [12]). However, little is known about the development and application of PTFs to directly estimate the SHCC [13].
Multiple sources of error interact in a complicated manner when PTF-derived SWRC and Ks are used to estimate the SHCC. The first type of error is associated with estimating the parameters of the van Genuchten model and Ks using parametric and point PTFs, respectively. The second type of error is related to the Mualem–van Genuchten parameterization of the SWRC and SHCC, which is often fitted using only a few water retention data pairs measured by equilibrium approaches. The SHCC estimations via the Mualem–van Genuchten model can perform poorly near saturation because the model cannot account for water flow through macropores [10,14,15]. Furthermore, Ks is a highly variable soil hydraulic property that depends on the pore geometry at the scale of interest [14] and varies seasonally [15]. Significant variability in Ks estimations might occur when using different PTF modeling approaches [16] and measurement techniques [17], and this variability is ultimately reflected in the SHCC estimations.
The pseudo-continuous PTF (PC-PTF) was introduced by Haghverdi et al. [18] as a PTF development strategy for continuous estimation of the SWRC using machine learning approaches such as artificial neural networks (NN) and support vector machines [19]. Using high-resolution measured data is recommended for developing robust PC-PTFs because a PC-PTF learns the shape of the SWRC directly from the measured water retention data [19].
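The pseudo-continuous idea can be illustrated by how the training table is laid out: each measured point on a curve becomes one row, with the soil tension (pF) supplied as an input next to the basic soil properties, so the learner sees the curve shape directly. The sketch below is only an assumed data layout, not the authors' implementation; the field names and all numbers are hypothetical.

```python
def build_pc_ptf_rows(samples):
    """Flatten per-sample curves into one (features, target) row per measured point."""
    features, targets = [], []
    for sample in samples:
        # Basic soil properties are repeated for every point of the same sample.
        base = [sample["sand"], sample["silt"], sample["clay"]]
        for pf, value in sample["curve"]:
            features.append(base + [pf])  # pF is an input, not a fixed target level
            targets.append(value)         # water content (SWRC) or log(K) (SHCC)
    return features, targets


# Hypothetical samples with illustrative (not measured) values.
samples = [
    {"sand": 20.0, "silt": 50.0, "clay": 30.0,
     "curve": [(1.5, 0.40), (2.5, 0.31), (3.5, 0.22)]},
    {"sand": 70.0, "silt": 20.0, "clay": 10.0,
     "curve": [(1.5, 0.28), (2.5, 0.15)]},
]
X, y = build_pc_ptf_rows(samples)
```

Any regression model, such as a neural network, can then be fitted to `X` and `y`; querying the fitted model at a chosen pF value yields an estimate at that tension, which is why denser measured curves give the model more shape information.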
Schindler and Müller [20] published an international soil hydraulic dataset measured using the evaporation method and the HYPROP™ (Hydraulic Property Analyzer, Meter Group Inc., USA) system. The HYPROP system (Figure 1) is an automated evaporation-based benchtop laboratory system that works based on the extended evaporation method [21,22]. The HYPROP has a relatively fast measurement cycle and provides reliable, high-resolution, simultaneous measurements of soil water content and unsaturated hydraulic conductivity within a few days or weeks [23,24,25]. Haghverdi et al. [26] used a HYPROP-measured Turkish soil data set to develop water retention PCNN-PTFs and reported promising results. In a companion paper (Singh et al. [27]), we utilized the Schindler and Müller [20] dataset to develop water retention PCNN-PTFs. However, no PCNN-PTF has been developed to estimate the SHCC using high-resolution data measured via the evaporation method. Consequently, this study was carried out to (I) develop PCNN-PTFs for SHCC estimations by utilizing the abovementioned international [20] and Turkish [26] data sets measured via the evaporation method, (II) determine the accuracy and reliability of the PCNN-PTFs, and (III) assess the performance of the developed models across soil textures and different ranges of soil tension.
3. Results
3.1. Importance of the Input Predictors
Figure 3 shows the scatterplots of measured versus estimated log(K) values for the PCNN-PTFs developed in this study using different combinations of input predictors. All models showed acceptable performance, as demonstrated by the data being well scattered around the 1:1 reference line, except for the Ks estimations in scenario 3 (training: the international data set, test: the Turkish data set).
Table 4 summarizes the performance statistics of the models for all the scenarios. For scenario 1 (training: international data set, test: international data set), Model 2 (inputs: SSC, pF) resulted in the best performance with an RMSE of 0.520 and MAE of 0.406, followed by Model 4 (inputs: SSC, SOM, pF), where RMSE was 0.529 and MAE was 0.417. The lowest performance was observed for Model 1, where RMSE was 0.571 and MAE was 0.428. MBE varied from 0.013 to 0.033, demonstrating no substantial under- or overestimation of log(K) for all models. The R values varied from 0.855 to 0.881, showing a high agreement between measured and estimated log(K) in all models.
For scenario 2 (training: Turkish data set, test: Turkish data set), Model 2 (inputs: SSC, pF) resulted in the best performance with RMSE of 0.317 and MAE of 0.217, followed by Model 3 (inputs: SSC, BD), where RMSE was 0.336 and MAE was 0.219. MBE varied from 0.011 for Model 2 to 0.043 for Model 4, demonstrating no considerable under- or overestimation of log(K). The R values were high (between 0.958 and 0.965) and similar among all models.
For scenario 3 (training: the international data set, test: the Turkish data set), Model 1 performed the best, with RMSE of 1.097 and MAE of 0.971. Model 2, with RMSE of 1.317 and MAE of 1.254, showed lower accuracy than the other models. The estimated log(K) values were highly correlated with the measured data (R: 0.935–0.954), yet all models showed an underestimation tendency (MBE ranging from −1.249 to −0.959). This is evident in Figure 3 as well, where data points are well scattered but located mainly below the 1:1 line.
For scenario 4 (training: combined international and Turkish data sets, test: the Turkish data set), the best performance was observed for Model 3, with RMSE of 0.453 and MAE of 0.308. Model 1 performed similarly. Slight underestimation of log(K) was observed, with MBE ranging from −0.335 for Model 2 to −0.139 for Model 1. Correlation between observed and estimated log(K) was high and similar among all models, with R values ranging from 0.906 to 0.947.
No distinct relationship was observed between RMSE and either BD or SOM, except for the Turkish clay soils, where RMSE declined as BD increased (Figure 4).
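The statistics reported in this section can be computed as in the sketch below. It assumes the standard definitions of RMSE, MAE, MBE, and the Pearson correlation coefficient R, with the error taken as estimated minus measured so that a negative MBE indicates underestimation, consistent with the interpretation used in the text. The function name and the example numbers are hypothetical.

```python
import math


def performance_stats(measured, estimated):
    """Return RMSE, MAE, MBE, and Pearson R for paired log(K) values."""
    n = len(measured)
    errors = [e - m for m, e in zip(measured, estimated)]  # estimated minus measured
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return {
        "RMSE": math.sqrt(sum(d * d for d in errors) / n),
        "MAE": sum(abs(d) for d in errors) / n,
        "MBE": sum(errors) / n,
        "R": cov / math.sqrt(var_m * var_e),
    }


# Illustrative values only: every estimate is one log unit too low.
stats = performance_stats([-1.0, 0.0, 1.0, 2.0], [-2.0, -1.0, 0.0, 1.0])
```

In this toy case R equals 1 while MBE equals −1, which mirrors the pattern described for scenario 3: a high correlation can coexist with a systematic underestimation, so R alone does not show bias.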
3.2. Performance across Soil Textures
The following analysis was conducted only for Model 1, the best-performing PTF in the test phase.
Table 5 shows the performance of the PCNN-PTF models for the dominant soil textures, representing about 89% and 92% of the international and Turkish data sets, respectively. When the international data set was used as the training set (scenario 1), clay loam had higher RMSE and MAE values than other soil textures. RMSE values ranged from 0.517 to 1.124, MAE values ranged from 0.342 to 0.748, and MBE values ranged from 0.026 to 0.288 for all textures. Furthermore, the model showed a tendency to overestimate log(K) for all soil textures, except loam, where underestimation of log(K) was observed. The correlation coefficient (R) values varied from 0.603 for clay loam to 0.881 for silt loam.
When only Turkish data were used for training (scenario 2), RMSE and MAE values varied from 0.206 to 0.395 and 0.146 to 0.312, respectively. MBE values ranged from −0.096 for sandy loam to 0.018 for clay loam, and no substantial underestimation or overestimation of log(K) was observed. The agreement between the measured and estimated log(K) values was very high, indicated by high and similar R values (between 0.926 and 0.982) for all the models within each soil texture.
When the international data set was used for training and the Turkish data set for validation (scenario 3), RMSE and MAE values varied from 0.964 to 1.444 and 0.863 to 1.377, respectively. Underestimation of log(K) was observed for all the soil textures with MBE values ranging from −1.370 to −0.860. Loam had the highest error relative to other soil textures, while clay had the lowest. High and similar correlation coefficient values (between 0.917 and 0.978) were observed for all the models and soil textures.
When the Turkish data set was used as the test set and a combination of the international and Turkish data sets was used for training (scenario 4), the RMSE and MAE values varied from 0.230 to 0.745 and 0.173 to 0.683, respectively. Loam and sandy loam had higher RMSE and MAE values, while the errors for clay loam and clay were similar to those obtained when only the Turkish data were used for training (scenario 2). The MBE values ranged from −0.614 to −0.013, showing slight underestimation of log(K) for most soil textures except clay and clay loam, where underestimation was not substantial. The agreement between the estimated and observed log(K) was high, as depicted by the high R values (ranging from 0.915 to 0.985) for all the models.
3.3. Performance at the Wet, Intermediate and Dry Parts of the SHCC
Table 6 shows the performance of different PCNN-PTFs over three moisture ranges of the SHCC for the four data partitioning scenarios evaluated in this study. When the international data set was used for the training and testing of the models (scenario 1), the RMSE of Model 1 varied from 0.548 in the wet range to 0.603 in the dry range. The MAE values varied from 0.420 in the wet range to 0.440 in the intermediate range of the SHCC. The MBE values varied from −0.060 in the wet range to 0.140 in the dry range. The R values ranged from 0.509 in the wet range to 0.640 in the intermediate range.
When only Turkish data were used for the training and testing of the PCNN-PTF models (scenario 2), the lowest error was observed in the intermediate range (RMSE = 0.317, MAE = 0.206, and MBE = 0.016) and the highest error belonged to the wet range (RMSE = 0.588, MAE = 0.471, and MBE = 0.031). The agreement between the observed and estimated log(K) was comparable among models with R ranging from a minimum of 0.809 in the wet range to a maximum of 0.860 in the intermediate range.
When the Turkish data were used as the validation data set (scenario 3), the PTFs showed their highest performance in the dry range (RMSE = 0.466 and MAE = 0.342) and their lowest performance in the wet range (RMSE = 2.285 and MAE = 2.158). Despite high R values (0.768–0.860), a tendency to underestimate log(K) was observed in all regions, as indicated by negative MBE ranging from −0.292 to −2.158.
When both international and Turkish data sets were used for the training of the models (scenario 4), the lowest error values were observed in the dry range (RMSE = 0.396 and MAE = 0.322) and the highest error in the wet range (RMSE = 0.757 and MAE = 0.602). Negative MBE values of −0.52 and −0.134 indicated a tendency to underestimate log(K) in the wet and intermediate ranges, respectively. The R values were high across all soil tension ranges.
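A range-wise evaluation of this kind can be sketched by grouping the paired values by pF before computing the error. The cutoffs between the wet, intermediate, and dry ranges are not stated in this section, so the `bounds` used below are purely hypothetical placeholders, as are the function name and the example numbers.

```python
import math


def rmse_by_range(pf, measured, estimated, bounds=(2.0, 3.0)):
    """RMSE of log(K) per moisture range; bounds are hypothetical pF cutoffs."""
    wet_limit, dry_limit = bounds
    squared = {"wet": [], "intermediate": [], "dry": []}
    for p, m, e in zip(pf, measured, estimated):
        if p < wet_limit:
            key = "wet"
        elif p < dry_limit:
            key = "intermediate"
        else:
            key = "dry"
        squared[key].append((e - m) ** 2)
    # Ranges without data return None instead of raising a division error.
    return {k: math.sqrt(sum(v) / len(v)) if v else None for k, v in squared.items()}


# Illustrative values only, one point per range.
by_range = rmse_by_range([1.5, 2.5, 3.2], [0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
```

Reporting the error per range, rather than one pooled value, is what allows the contrast between the wet and dry ends of the SHCC described above to be seen.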
5. Conclusions
We developed and evaluated PCNN-PTFs to estimate the SHCC measured in evaporation experiments, mainly via the HYPROP system. The PCNN-PTF approach showed promising performance for continuous hydraulic conductivity estimation over a wide range of soil tensions. The HYPROP system offers the advantage of producing high-resolution soil hydraulic conductivity data over a wide range of soil tensions (pF = 1.5 to 3.5), which is critical for developing robust PCNN-PTF models since this approach learns the shape of the SHCC directly from measured data. The KSAT instrument can be employed to measure the saturated hydraulic conductivity (Ks), which can be used along with HYPROP data. The water retention PCNN-PTFs developed and validated in the first part of this study (Singh et al. [27]) also performed very well. Consequently, we recommend the PCNN-PTF approach to derive the next generation of water retention and hydraulic conductivity models using high-resolution data measured via the HYPROP system.