Estimating Soil Available Phosphorus Content through Coupled Wavelet–Data-Driven Models

Soil phosphorus (P) is a vital but limited element that is usually leached from the soil during drainage. As a soluble substance, soil phosphorus can be delivered from agricultural fields by runoff or soil loss. It is one of the most essential nutrients affecting the sustainability of crops as well as energy transfer for living organisms. At elevated contents, however, it acts as a point source pollutant, so its accurate simulation is required. Addressing this crucial issue for sustainable soil and water management, the current research conducted an effective soil phosphorus assessment with the aim of examining the capability of five different wavelet-based data-driven models, namely, gene expression programming (GEP), neural networks (NN), random forest (RF), multivariate adaptive regression splines (MARS), and support vector machines (SVM), in modeling soil phosphorus (P). To achieve this goal, several parameters, including soil pH, organic carbon (OC), clay content, and soil P data, were collected from different regions of the Neyshabur plain, Khorasan-e-Razavi Province (Northeast Iran). First, a discrete wavelet transform (DWT) was applied to the pH, OC, and clay inputs, and their subcomponents were utilized in the applied data-driven techniques. The statistical Gamma test was also used to identify which soil parameters effectively influence soil P. The applied methods were assessed through 10-fold cross-validation scenarios. Our results demonstrated that the wavelet-GEP (WGEP) model outperformed the other models with respect to various validation criteria, such as the correlation coefficient (R), scatter index (SI), and Nash-Sutcliffe coefficient (NS).
Comparing SI values, the GEP model improved on the MARS, RF, SVM, and NN models by 35%, 30%, 40%, and 50%, respectively; comparing NS values, it outperformed them by 3.1%, 2.3%, 4.3%, and 7.6%, respectively.


Introduction
Controlling point source pollutants is a crucial issue in soil and water resource management plans [1,2]. To achieve this goal, related to the sustainability of natural ecosystems and human activities, managing runoff and soil losses from agricultural lands is crucial [3,4]. Agricultural drainage systems usually remove excess overland flow from the land through surface/subsurface networks to maintain soil moisture at the levels required for crop production [5,6]. Nevertheless, water flowing through these systems can deliver soluble elements, which can be stored by the soil and modify soil quality [7,8]. During the drainage process, those soluble elements are leached from the soil and can pollute the drainage effluent [9], which may increase the risk of environmental problems. Thus, nature-based solutions are needed to balance disposing of and retaining excess water, in such a way that neither waterlogging nor environmental side effects take place [10].
Although soil phosphorus (P) is an essential nutrient for sustained crop production, it is also considered a source of water pollution. In other words, it affects the growth of crops and terrestrial systems, in addition to the living conditions of microorganisms in the soil [11]. Interestingly, the water drained from agricultural lands can generally contain significant amounts of this element, with high spatial and temporal variation. The reduction of water quality due to eutrophication is another environmentally damaging result of this delivery from lands to water resources or artificial drainage systems [12,13]. Generally, a considerable amount of soil P can be found in most of the calcareous soils of the arid/semiarid regions of Iran, as a result of the absorption reactions and illuviation of carbonate minerals [14,15]. Given these diverse detrimental effects, stringent management of phosphorus in surface/subsurface water bodies is clearly essential to improve water quality.
To better manage soil phosphorus, gaining precise knowledge of the available soil P content and planning an efficient fertilization process are strongly needed [16,17]. There are also other dynamics, namely, transport and source factors, which should be considered to conserve water quality. In several soils, applying P fertilizer in excess of crop needs is considered desirable, as it is thought to enhance the optimal crop yield [18]. However, soils which contain a significant amount of amorphous clay, as well as highly calcareous ones, should be considered exceptions, because increasing P levels there could be problematic. There are also different parameters, including soil pH and texture, organic matter, and the presence of iron and aluminium oxides, which modify the extraction of P by plants [19,20]. Therefore, accurate knowledge of the dynamics of the studied soils, as well as of the complex system of soil P availability to plants, is key for agricultural lands and soil health [21,22]. Some soil parameters can usually be estimated by modeling approaches using easily measured soil parameters, such as soil separates. In this context, regression-based models can be applied to describe the correlation between easily measured pedological parameters (as input variables) and target dependent variables; such models are referred to as pedotransfer functions (PTFs). Nonetheless, artificial intelligence (AI) techniques (e.g., gene expression programming (GEP), neural networks (NN), random forest (RF), multivariate adaptive regression splines (MARS), and support vector machines (SVM)) have proven more capable of mapping the interrelations among soil parameters. The wavelet transform approach might be applied in this context to further improve the models' overall performance accuracy.
It is interesting to note that the use of AI techniques in this regard is very limited, at least, in semiarid areas and larger scales (e.g., [23,24]) despite their wider applications in various soil and water analysis issues, such as modeling soil cation exchange capacity [8,25], modeling soil bulk density [26], predicting groundwater level fluctuations [27], estimating reference evapotranspiration [28], simulating watershed sediment amount [29], and estimating terrestrial parameters such as solar radiation [30], as well as groundwater pollution studies [31].
In recent decades, the wavelet transform (WT) has become a reliable technique for analyzing periodicities and variations in time series [32][33][34][35]. Wavelet analysis is a promising tool for modeling hydrological issues because of its multiresolution character in the frequency and time domains, which improves the accuracy of forecasting models by providing subseries of the time series. These subseries give more detailed information about the behavior of the physical process to be estimated/simulated (e.g., [36][37][38]). In other words, WT constructs both low- and high-frequency components at various levels of resolution in a time series, which improves the accuracy of predictive models. It has also been demonstrated to be a powerful technique in some spatial simulation issues [39,40]. The conjunction of this type of analysis with AI models has been applied to estimation issues in water resources and hydrological research [41][42][43][44]. It is evident from these studies that WT fairly advances the accuracy of AI methods in estimation.
Nevertheless, there is a lack of information about the development of models based on limited measured soil variables for estimating soil P and about their application in semiarid catchments. Therefore, the current study aimed to evaluate the GEP, MARS, RF, SVM, and NN methodologies for estimating soil P content using easily measured soil variables. To the best of the authors' knowledge, apart from the use of NN in the previously mentioned studies, the current research is the first application of these AI techniques, as well as of their coupling with the wavelet transform, for modeling soil P. Using a single data-partitioning method to feed the applied models with the input-target matrices is common in soil parameter modeling. Despite the simplifications involved in such data-partitioning modes, models developed using only a part of the data for training might be overfitted, which might be linked to the attributes selected for the training stage. The current research therefore demonstrates the application of the more robust k-fold cross-validation testing mode, where all the available input-target pairs are involved in developing and testing the applied models.

Study Region
The studied area of the Neyshabur plain (Northeast Iran) is located between latitudes 35°40′ N and 36°40′ N and longitudes 58°12′ E and 59°31′ E, with an average altitude of 1256 m a.s.l., and covers approximately 90 km² [45] (Figure 1). This flat plain is characterized by gentle hillslopes ranging from 5 to 20 degrees (78% of the area), and the rest of the land shows moderate inclination [46]. Based on an earlier study [47], most of the soils in this plain can be classified in the Aridisols and Entisols orders [48], and irrigated farming is the main land use of this area. The general physiographic trend of the plain extends in a NW-SE direction. Land units and land use maps are included in Figure 2 according to [49]. Sustainability 2020, 12, x FOR PEER REVIEW 4 of 23

Soil Sampling Procedures and Soil Analysis
ArcGIS 10.2 (ESRI, USA) software was used to apply a fishnet sampling design as a reliable strategy with 300 grid points (Figure 1). Additionally, the influence of the spatial variation of the soil parameters affecting P values in the Neyshabur plain was considered with a grid interval of 500 × 500 m. A portable Global Positioning System (GPS) was used to precisely locate the sampling points. Of the roughly 300 grid points, 12 corresponded to urbanized areas (mostly enclosed by fences) and were therefore not sampled. Thus, a total of 288 samples were collected from 0 to 30 cm of soil depth. The disturbed samples were air-dried, crushed, and sieved through a 2 mm sieve. After separating and discarding large plant residues and pebbles, the samples were transported for laboratory analyses. Soil P was determined using the method developed by Olsen [50], and the soil organic carbon (OC) content was estimated using the Walkley-Black approach with dichromate extraction and titrimetric quantification [51]. Particle size distribution (sand: 0.05-2 mm, silt: 0.002-0.05 mm, and clay: <0.002 mm) was determined using the Bouyoucos hydrometer method [52]. Schoeneberger's methodology [53] was followed to classify soil texture. Soil pH was measured with a digital pH-meter in saturated paste extract [54]. The calcium carbonate equivalent (CCE) was obtained using the back-titration technique [55]. The statistical characteristics of the soil data are shown in Table 1. The maximum values of the standard deviation (SD) and coefficient of variation (CV) correspond to soil P, which can be explained by the parent material, terrain attributes, and agricultural practices [24,56].
Kurtosis and skewness coefficients reached their highest values for soil OC. They reveal the degree of deviation of OC from the Gaussian distribution, which might be explained by the application of fertilizers, as discussed by Fard et al. [57]. Some differences were also observed in soil properties, which might be effective parameters impacting soil P content. In other words, its sorption and desorption may be altered under a certain set of environmental conditions [24,58].

Discrete Wavelet Transform (DWT)
Wavelet analysis, which is driven by the compact support of its basis function, was developed from Fourier analysis. Whereas the Fourier transform cannot provide a valid time-frequency analysis, wavelet transform analysis appears to be an effective tool in this area [59]. WT is therefore generally similar to the windowed Fourier transform, although their merit functions are completely different: the Fourier transform decomposes the signal into sines and cosines (localization in Fourier space), whereas the WT utilizes functions localized in both real and Fourier spaces. These wavelets are waveforms of effectively limited duration and zero mean. WT also provides a time-scale localization of processes, which improves the accuracy of forecasting models by providing time subseries. Translating signals from the time domain to the time/frequency domain can be done with the discrete wavelet transform (DWT). During this process, the original signal is decomposed into different frequency components, which can be expressed through father and mother wavelets [60]. The wavelet function, called the mother wavelet, satisfies ∫_{−∞}^{+∞} ψ(t) dt = 0. The family ψ_{a,b}(t) is acquired by compressing and expanding ψ(t) according to Equation (1):

ψ_{a,b}(t) = |a|^{−1/2} ψ((t − b)/a),  a, b ∈ R, a ≠ 0  (1)

where a represents a frequency (scale) factor, b is a time (translation) factor, and R denotes the set of real numbers. If ψ_{a,b}(t) satisfies Equation (1), then for a finite-energy signal f(t) ∈ L²(R), the successive wavelet transform of f(t) is given by Equation (2):

W_ψ f(a, b) = |a|^{−1/2} ∫_{−∞}^{+∞} f(t) ψ*((t − b)/a) dt  (2)

where ψ*(t) denotes the complex conjugate of ψ(t). WT filters f(t) with various filters; in real applications, the successive wavelet is frequently discretized.
To obtain a discrete rather than continuous wavelet, let a = a₀^j and b = k b₀ a₀^j, with a₀ > 1, b₀ ∈ R, and j, k integers; the DWT of f(t) can then be written as Equation (3):

W_ψ f(j, k) = a₀^{−j/2} ∫_{−∞}^{+∞} f(t) ψ*(a₀^{−j} t − k b₀) dt  (3)

The best selection for a₀ and b₀ is 2 and 1, in one-time steps. This power-of-two logarithmic scaling is recognized as the most effective case for practical issues and is known as the dyadic grid arrangement [61]. With a₀ = 2 and b₀ = 1, Equation (3) becomes the binary (dyadic) wavelet transform (Equation (4)):

W_ψ f(j, k) = 2^{−j/2} ∫_{−∞}^{+∞} f(t) ψ*(2^{−j} t − k) dt  (4)

The properties of the original data sets in terms of the frequency and time domains (a or j, and b or k, respectively) can be described simultaneously by W_ψ f(a, b) or W_ψ f(j, k). Low frequency levels in WT together with high time-domain resolution correspond to a reduction of a or j, and vice versa [36]. Using integer time steps for a discrete-time series f(t), the DWT can be written as Equation (5):

W_ψ f(j, k) = 2^{−j/2} Σ_{t=0}^{N−1} f(t) ψ*(2^{−j} t − k)  (5)
where W_ψ f(j, k) is the wavelet coefficient of the DWT at scale a = 2^j, b = 2^j k. DWT operates two types of filters, namely, high-pass and low-pass filters. The original time series is passed through these filters to separate it at a variety of levels. The detail coefficients (Ds) and the approximation subcomponent series (A) are calculated through Equation (5), where the Ds represent the low-scale, high-frequency components of a signal, and A represents the opposite [61]. Hybrid wavelet-artificial intelligence modeling approaches can provide more precise outputs of the variations, periodicities, and trends in time series, making the models more viable. The coupled wavelet-AI models were obtained by combining the DWT and AI models. For example, WGEP is a GEP-based model that utilizes the subseries components extracted by applying the discrete wavelet transform to the original patterns, so its inputs consist of the decompositions of the original patterns (Ds) obtained by the Mallat DWT algorithm [61]. Figures 3-5 show the decomposed wavelet subcomponents of the applied input parameters. It should be noted that the sum of the detail and approximation subcomponents gives the original input data. The attributes of these subcomponents differ, which provides useful information at various resolution levels [62]. Thus, adding this detailed information to the inputs improves the prediction accuracy of the AI models. However, it should be noted that D1 indicates the detail subcomponent with the highest frequency.
Generally, this component usually involves noisy data, while D3 shows the detail subcomponent with the lowest frequency. A represents the approximation of the original signal, ignoring the detail components. In fact, all these subcomponents carry different information, but sometimes removing the highest-frequency component with its noisy data may improve the models' accuracy.
As an example, Figure 6 reveals the flowchart and structure of the WGEP model. The selection of effective details and A subcomponents was made based on the correlation analysis, which is explained in the Modeling Protocol subsection.
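The splitting into approximation (A) and detail (D) subseries described above can be sketched with a one-level Haar DWT in plain numpy; the Haar filter and the synthetic signal below are illustrative choices, not necessarily the mother wavelet or data used in the study:

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: low-pass -> approximation, high-pass -> detail."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # approximation subseries
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # detail subseries
    return a, d

def haar_idwt(a, d):
    """Inverse one-level Haar DWT: perfect reconstruction of the signal."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
A1, D1 = haar_dwt(signal)                      # first-level subcomponents
assert np.allclose(haar_idwt(A1, D1), signal)  # no information is lost
```

Repeating `haar_dwt` on `A1` yields the coarser A2/D2 pair, mirroring the three-level decomposition (D1, D2, D3, A) extracted for each input series.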

Gene Expression Programming (GEP)
GEP is a genetic-based algorithm which employs elements such as chromosomes and expression trees as programs. Every chromosome is composed of genes, each of which encodes a smaller subexpression tree. In this model, the architectural organization of the linear chromosomes allows it to operate notable genetic operators such as mutation, transposition, and recombination without any limitation. Selecting the data sets to be fitted in order to establish the relationship between variables is notable at this step. Meanwhile, GEP shows several advantages which make it a stronger approach compared to other learning algorithms. By the nature of GEP, the creation of diversity is quite simple [63]. In other words, the notable robustness of GEP lies in how easily it produces genetic variation, as well as in its unique nature that allows the evolution of complex programs consisting of numerous subprograms [64]. Furthermore, this procedure makes it capable of expressing several relationships between input and output variables. This unique feature makes GEP stronger than other methods [65,66].
The following plan for GEP-based simulation of soil P (the target variable) using the mentioned input variables was followed in this study [67]. The first step was to select a fitness function; although various absolute- and relative-error-based fitness functions might be used for modeling soil P, the root relative squared error (RRSE) was applied, as advised by the literature (e.g., [25]). Second, it is essential to choose the predictor parameters and function sets. The utilized input variables were obtained using the Gamma test, as mentioned before. Various GEP function sets were evaluated (not presented here) to select the proper set. It was found that the following function set gave the optimum results.
Moreover, it is necessary to select a chromosomal architecture where a specific head size (= 8) and the number of genes (= 3) are utilized as advised by previous authors (e.g., [26]). Finally, we selected genetic operators. GeneXpro default operators were also chosen following previous research [65].
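For illustration, the gene-to-tree decoding at the heart of GEP (Karva notation, read breadth-first) can be sketched in a few lines; this toy decoder and the example gene are hypothetical and not taken from GeneXpro, and the protected division is a common genetic-programming convention assumed here:

```python
import operator

# Function set with arities; protected division avoids divide-by-zero
FUNCS = {'+': (2, operator.add), '-': (2, operator.sub),
         '*': (2, operator.mul), '/': (2, lambda a, b: a / b if b else 1.0)}

def eval_karva(gene, terminals):
    """Decode a Karva-notation gene breadth-first and evaluate its tree."""
    nodes = [{'sym': s, 'kids': []} for s in gene]
    parent, child = 0, 1
    while parent < len(nodes) and child < len(nodes):
        arity = FUNCS[nodes[parent]['sym']][0] if nodes[parent]['sym'] in FUNCS else 0
        for _ in range(arity):
            if child < len(nodes):
                nodes[parent]['kids'].append(nodes[child])
                child += 1
        parent += 1

    def evaluate(node):
        if node['sym'] in FUNCS:
            return FUNCS[node['sym']][1](*(evaluate(k) for k in node['kids']))
        return terminals[node['sym']]          # terminal: look up its value

    return evaluate(nodes[0])

# Gene '+a*bc' encodes a + (b * c) once expanded breadth-first
print(eval_karva('+a*bc', {'a': 2.0, 'b': 3.0, 'c': 4.0}))  # 14.0
```

Mutation, transposition, and recombination then act directly on the linear gene string, which is why GEP can vary programs so freely.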

Multivariate Adaptive Regression Spline (MARS)
MARS is a nonparametric regression approach that might be regarded as an extension of linear models which automatically captures nonlinearities and interactions between parameters. This method excels at searching for and finding optimal nonparametric regression models and also handles the data easily [68]. The algorithm has striking advantages, such as flexibility, speed, and accuracy, which make it usable for predicting continuous and binary outputs using a combination of linear and nonlinear methods. Identifying the underlying functional relationships between input and output variables without any assumptions is another capability of MARS [69]. Basis functions are generated by the MARS model in the next step, called "searching". The algorithm includes a forward and a backward stepwise stage. The exhaustive forward stepwise selection adds functions and finds candidate knots in order [70]; after a number of splits, it produces a complex, overtrained model with lower performance accuracy. Thus, the backward step removes the least effective terms, eliminating the unnecessary variables among the previously chosen set. The resulting splines, which assume a threshold and deviations of linear functions, provide more accuracy as well as flexibility to the MARS technique [71].
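The spline basis MARS builds on can be illustrated with the standard hinge (truncated linear) function; the knot location and coefficients below are arbitrary illustrative values, not fitted to the study's data:

```python
import numpy as np

def hinge(x, knot, sign=1):
    """MARS basis function: max(0, sign * (x - knot))."""
    return np.maximum(0.0, sign * (x - knot))

x = np.array([1.0, 3.0, 5.0, 7.0])
# A piecewise-linear fit built from a mirrored pair of hinges at knot t = 4;
# the intercept and coefficients stand in for values the backward step keeps.
y_hat = 2.0 + 0.5 * hinge(x, 4.0, +1) - 0.3 * hinge(x, 4.0, -1)
```

The forward pass proposes many such knot/hinge pairs; the backward pruning then discards those whose removal least degrades the fit.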

Random Forest (RF)
The RF algorithm was first presented by Breiman [72], who assembled a collection of trees for spatial and time-series simulations. This ensemble learning algorithm addresses high-dimensional regression and classification issues and has been widely applied to forecasting problems [69]. One of the advantages of this approach is its capability to capture interactions among different factors. It is a tree-based ensemble method where all trees depend on a collection of random variables, and a forest is grown from several regression trees put together, forming a group [72]. There are two key parameters: the number of variables and the number of trees [73]. Dealing with random binary trees enables the RF to make a final decision, achieved by averaging the outputs after fitting single trees into the ensemble (the bagging procedure). The bias of the bagged trees is the same as that of the single trees, while the variance is reduced by reducing the correlation between trees [74].
Here, after examining various numbers of trees, a forest of 150 trees was selected, as it produced the lowest error magnitudes, while 10 cycles were found to be best for calculating the mean error iteratively. Regarding the training error process, the percentage decrease was found by trial and error to be 5%; based on that, the minimum child node size to stop and the maximum number of levels were selected as 5 and 10, respectively.
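A minimal sketch of this configuration, using scikit-learn's RandomForestRegressor on synthetic stand-ins for the pH, OC, and clay inputs (the study itself used STATISTICA), might look like:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.random((100, 3))                  # synthetic stand-ins for pH, OC, clay
y = 2.0 * X[:, 0] + X[:, 1] - X[:, 2]     # synthetic target in place of soil P

# Hyperparameters mirror the text: 150 trees, max depth 10, min node size 5
rf = RandomForestRegressor(n_estimators=150, max_depth=10,
                           min_samples_leaf=5, random_state=0)
rf.fit(X, y)
pred = rf.predict(X)
```

Averaging over the 150 bootstrapped trees is what reduces the variance relative to a single regression tree.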

Support Vector Machine (SVM)
SVM is one of the best machine-learning approaches and has been broadly implemented in different scientific fields in recent years. The technique was first developed by Vapnik [75] and is framed on the concept of decision boundaries in linear data categorization. It minimizes an upper bound on the generalization error rather than the local training error, giving this algorithm a greater ability to generalize, which is the key aim in statistical learning for solving complicated problems [57,76]. In the current research, regression-SVM type 1 was employed, as it has shown higher performance accuracy in previous research [77]. Through a trial and error procedure, the SVM constants were chosen as 10 (capacity) and 0.15 (epsilon). Linear, sigmoid, polynomial, and radial basis functions were evaluated as SVM kernel functions; among them, the radial basis function kernel (Gamma = 0.25) gave the best results. Finally, the maximum number of iterations was set to 1000, and the models were stopped at an error value of 0.005.
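The two ingredients named above, the radial basis function kernel (Gamma = 0.25) and the epsilon-insensitive band (epsilon = 0.15) of regression-SVM type 1, can be sketched directly; these are illustrative numpy helpers, not the STATISTICA implementation used in the study:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=0.25):
    """Radial basis function kernel, with Gamma = 0.25 as in the text."""
    return np.exp(-gamma * np.sum((np.asarray(x1) - np.asarray(x2)) ** 2))

def epsilon_loss(y_true, y_pred, eps=0.15):
    """Epsilon-insensitive loss: deviations within +/-eps cost nothing."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)
```

The kernel measures similarity between input vectors in the lifted feature space, while the epsilon band is what keeps the regression tube from penalizing small residuals.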

Artificial Neural Networks (NN)
NN is a computing method whose basic concept is inspired by the behavior of biological neural networks. The nodes are the main processors in this approach, similar to neurons, and have weighted connections as in the biological system [78]. An advantage of this method is its ability to exploit the relationships between dependent and independent factors, which enables it to model nonlinear behaviors. This characteristic makes it efficient for noisy data systems. Briefly, this computer-based model is arranged in several layers that are responsible for recognizing complex patterns. In this study, a multilayer perceptron feed-forward network was applied to the data sets, with different transfer functions in the hidden and output layers. In the function selection process, 100 networks were examined during each training stage, and the conjugate gradient training algorithm with 180 cycles presented the maximum accuracy. The hyperbolic tangent (Tanh) and identity functions were found to be the best activation functions for the hidden and output layers, respectively. Through an iterative process, the number of hidden layer nodes was found to be 12. The SVM, NN, RF, and MARS analyses were carried out using STATISTICA software (StatSoft, Hamburg, Germany).
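A forward pass through the configuration described above (3 inputs, 12 tanh hidden nodes, identity output) can be sketched as follows; the random weights are placeholders for the ones STATISTICA would learn with conjugate gradient training, and the sample values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 12                 # 3 inputs (pH, OC, clay), 12 hidden nodes
W1 = 0.1 * rng.normal(size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.normal(size=(n_hidden, 1))
b2 = np.zeros(1)

def mlp_forward(x):
    """Forward pass: Tanh hidden activation, identity output activation."""
    h = np.tanh(x @ W1 + b1)           # hidden layer
    return h @ W2 + b2                 # linear (identity) output layer

sample = np.array([[7.5, 0.6, 32.0]])  # hypothetical pH, OC (%), clay (%) values
out = mlp_forward(sample)
```

Training would iteratively adjust W1, b1, W2, and b2 to minimize the prediction error on the soil P targets.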

Modeling Protocol
In this research, the independent variables influencing soil P were identified for selecting the most important input parameters using the statistical Gamma test. Constructing models based on this calibration test can be considered inflexible, though it may be advantageous in some situations where model selection is integrated with the variable selection approach [26]. The outputs obtained using this test revealed that the best input composition would be a combination of pH, OC, and clay content, which registered the maximum influence on soil P with the minimum Gamma statistic (0.0193). Extracting the wavelet decompositions of the selected input variables was the next step, and computing the correlations of these subcomponents with the target parameter (soil P) was the following stage. All the results of this process are summarized in Table 2. Finally, the criterion for choosing the effective subcomponents as predictor variables for the wavelet-AI models was a correlation value >0.09. Even though the linear correlations seem low for each component, all of them may be useful for the learning process of the implemented nonlinear models. A k-fold procedure was applied to the data with the aim of dividing them into training and testing blocks. In this method, all the available data are split into k blocks; we used 10 independent folds in our research. The considered models were trained with a portion of the data each time and tested using the rest. In this process, a total of 50 train-test phases (5 models × 10 k-folds = 50) were applied for each family of single and wavelet-based AI models. This effective approach provided reliable simulated values, prevented model overfitting during the simulation, and allowed all the data to participate in both the training and testing stages.
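The 10-fold partitioning can be sketched with a small numpy helper (`kfold_blocks` is a hypothetical name; the shuffling seed and block sizes are illustrative):

```python
import numpy as np

def kfold_blocks(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal test blocks."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return [np.sort(block) for block in np.array_split(idx, k)]

# 288 samples, 10 folds: every sample is tested exactly once and trained on 9 times
folds = kfold_blocks(288, 10)
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(288), test_idx)
    # fit the model on train_idx and evaluate on test_idx (model fitting omitted)
```

Because every pattern appears in exactly one test block, pooling the 10 test-phase predictions covers the full data set without any pattern being evaluated by a model that saw it during training.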
For an inclusive assessment of the models' performance, indices must be applied to evaluate the validity of the constructed models. Here, three statistical indicators (Equations (6) to (8)), namely, the correlation coefficient (R), scatter index (SI), and Nash-Sutcliffe coefficient (NS), were utilized. Additionally, a t-test analysis was performed to support the results.
R = Σ_{i=1}^{n} (P_io − P̄_o)(P_im − P̄_m) / √( Σ_{i=1}^{n} (P_io − P̄_o)² · Σ_{i=1}^{n} (P_im − P̄_m)² )  (6)

SI = (1/P̄_o) √( (1/n) Σ_{i=1}^{n} (P_im − P_io)² )  (7)

NS = 1 − Σ_{i=1}^{n} (P_im − P_io)² / Σ_{i=1}^{n} (P_io − P̄_o)²  (8)

In Equations (6)-(8), P_io denotes the measured soil P value at the i-th observation, P_im represents the corresponding simulated P magnitude, P̄_o indicates the mean observed P magnitude, P̄_m is the mean simulated magnitude, and n stands for the number of patterns. Overall, R cannot be used as a fitness measure alone; thus, complementary indices such as SI and NS are essential for model validation. The best fits for R, SI, and NS correspond to values of 1, 0, and 1, respectively. In calculating these indices, the full series of applied patterns was used, both pooling the simulations of each data set and dividing them for each test phase (1/10 of the pattern matrix).
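Under the standard definitions of these indices (R as the Pearson correlation, SI as the RMSE normalized by the observed mean, and NS as one minus the error sum of squares over the observed variance), the three criteria can be computed as follows; the observation and simulation vectors are illustrative:

```python
import numpy as np

def r_si_ns(obs, sim):
    """Correlation coefficient (R), scatter index (SI), Nash-Sutcliffe (NS)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    rmse = np.sqrt(np.mean((sim - obs) ** 2))
    si = rmse / obs.mean()                   # RMSE normalized by the observed mean
    ns = 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)
    return r, si, ns

obs = [10.0, 12.0, 9.0, 14.0]
sim = [10.5, 11.5, 9.5, 13.5]
r, si, ns = r_si_ns(obs, sim)   # best possible values would be 1, 0, 1
```

A perfect simulation reproduces the observations exactly, yielding R = 1, SI = 0, and NS = 1.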

Results and Discussion
The coupled wavelet-AI models, including WGEP, WMARS, WRF, WSVM, and WNN, were obtained by combining the DWT and AI models. To determine the decomposition level, we used the log(N) formula, where N is the number of data points; with the 288 data used in our study, this yielded three decomposition levels (log(288) ≈ 3) in the DWT applications. Moreover, one approximation (A) and three details (D1, D2, and D3) were obtained as subcomponents for each input, each of which was utilized as an input in the wavelet-based AI models. Table 3 presents the statistical indices of the employed single and wavelet-based AI models. It should be noted that the values are not significant with respect to the t-test results; however, the main aim here was to compare the different wavelet-based models. The highest significance level with the lowest t-statistic indicated the most robust model. The single models could not simulate the soil P content with reasonable accuracy, which might be linked to the complicated chemistry of P in soils. This situation could be due to the reaction of inorganic P with other elements such as calcium, iron, and aluminium, which can convert it to phosphates [15]. In our study area, organic P can also be found in different forms; it can be resistant to microbial degradation in soil and is highly correlated with the variations of OC (as can be seen in Table 1) and soil texture. Specifically, lime might bring a discrepancy to the amount of P, which can make extrapolations difficult [24]. On the other hand, clay contents and Fe and Al oxides could improve P sorption [79][80][81], while soil OC presents an adverse effect [82]. Nevertheless, Demaria et al. [83] stated that soil pH and metal ions show a significant influence on soil P contents, so the variations in soil P contents might be due to the variations in soil properties, as soil OC and clay distribution were considerably different in our studied area.
Based on Table 3, the lowest SI (0.127) and the highest R (0.990) and NS (0.978) values belong to the WGEP model, which surpasses the other models, namely WRF, WMARS, WSVM, and WNN (ranked successively). By contrast, WNN yields the worst results, with the highest SI (0.256) and the lowest R (0.960) and NS (0.911). Meanwhile, the discrepancies among the models' performances are not considerable (SI differences of 0.067, 0.054, 0.086, and 0.129 between WGEP and WRF, WMARS, WSVM, and WNN, respectively), with the exception of WNN. Furthermore, the WGEP model improves the accuracy of the WMARS, WRF, WSVM, and WNN models with respect to the R-NS measures by 1.5%-3.1%, 1.2%-2.3%, 2.1%-4.3%, and 3.1%-7.4%, respectively. Moreover, a t-test was set at the 95% significance level with the aim of comparing the degree of difference between the measured and simulated soil P values; the results are presented in Table 3. According to the t-test statistics, the WGEP model revealed the highest performance among the applied wavelet-based AI models.
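The percentage increments quoted in this section follow directly from the Table 3 statistics. A minimal sketch of the computation (the function name is ours) shows, for example, how the roughly 50% SI gain and 7.4% NS gain of WGEP over WNN arise from the values 0.127/0.256 (SI) and 0.978/0.911 (NS):

```python
def rel_improvement(other, best, lower_is_better=True):
    """Percentage gain of the best model over another model for one criterion."""
    if lower_is_better:                       # error-type indices, e.g. SI
        return 100.0 * (other - best) / other
    return 100.0 * (best - other) / other     # goodness indices, e.g. R or NS

# WGEP vs. WNN, using the Table 3 values quoted above
si_gain = rel_improvement(0.256, 0.127)                         # about 50% in SI
ns_gain = rel_improvement(0.911, 0.978, lower_is_better=False)  # about 7.4% in NS
```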
Our findings demonstrated that the GEP technique applied all the introduced input variables (selected primarily using the Gamma test, as well as the correlations between the wavelet decompositions and the target variable). Thus, the outcomes of the Gamma test are confirmed by the GEP. Shiri et al. [26] argued that differences between the input selections of the GEP and the Gamma test might be linked to their functional structures: a major assumption of the Gamma test is that the interrelation underlying the studied problem consists of a smooth function plus a random variable, while GEP develops the structure and the constants of the formulas simultaneously. The extent to which site-specific constants are embedded in the formulation, and whether those constants are similar to or different from those at another station, would dictate the transferability of the function between stations. Therefore, in the current case, obtaining similar input selection results might be linked to the nature of the studied problem as well as to the use of the wavelet transform for transferring the original data into their subcomponents. The maximum weight given by the GEP to the introduced input variables belongs to the subcomponents of the soil OC and the soil clay content. The MARS model, however, gives the highest weight to D12 (pH) and D21 (OC), while with the RF model the highest weights correspond to D22 (OC), D21 (OC), D12 (pH), and A31 (clay). Nonetheless, analyzing the obtained equation reveals that soil clay and OC content have a direct, positive effect on the soil P amount, while pH shows an adverse relation, which confirms the results of previous studies (e.g., [18,84]).
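The Gamma test mentioned above can be sketched compactly. In its usual near-neighbour formulation, the mean squared distances to the k-th nearest neighbours in input space (delta) are regressed against half the corresponding mean squared output gaps (gamma); the regression intercept, the Gamma statistic, estimates the noise variance of the smooth-function-plus-noise model. The following is a minimal illustration of that idea, not the exact implementation used in the study:

```python
import numpy as np

def gamma_test(X, y, p=10):
    """Near-neighbour Gamma test: the intercept of the gamma-vs-delta
    regression estimates the noise variance in y = f(X) + r."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # pairwise squared distances in input space, self-distances excluded
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)
    order = np.argsort(d2, axis=1)[:, :p]   # p nearest neighbours per point
    delta = np.empty(p)
    gamma = np.empty(p)
    for k in range(p):
        nk = order[:, k]
        delta[k] = np.mean(d2[np.arange(n), nk])      # mean k-th neighbour distance^2
        gamma[k] = np.mean((y[nk] - y) ** 2) / 2.0    # half mean squared output gap
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept                                  # Gamma statistic
```

A Gamma statistic close to zero for a candidate input set indicates that the output is largely explainable as a smooth function of those inputs, which is how the test ranks input combinations.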
The split-up of SI and NS values of the utilized techniques per test stage can be found in Figure 7, where the lowest SI and the highest NS values in 6 of the 10 test stages (1st, 2nd, 4th, 6th, 8th, and 10th) are allocated to the WGEP, while it shows similar accuracy to the WRF model in the 3rd and 5th test stages. Figure 7 also shows considerable differences among the models' performance accuracy across test stages. The difference between the maximum and minimum SI values among the test stages is 0.104 for WGEP and 0.247 for WSVM. The same trend is observed for the WGEP model for the NS indicator (the lowest NS fluctuations), whereas WMARS presents the highest NS fluctuations. Taking into consideration that the current research applied the k-fold testing approach, holding back portions of the data to feed the train-test partitions, such variations dictate the necessity of using this method to evaluate the methodologies. Since a strong validation must hold for all available data, as well as for other data within the same statistical ranges, results obtained from a single train-test split would not satisfy these criteria. On the other hand, the patterns used in this research were chosen from various spatial points, which makes k-fold testing even more indispensable. The application of k-fold testing is especially significant in checking the models' generalizability via external applications, for instance, models trained with data from some places and tested using data from another place, as the test patterns include data from different points [66].
The measured versus simulated magnitudes of soil P (using the applied methods) are plotted in Figure 8. It is apparent from the scatterplots that the WGEP model has less scattered estimates than the other applied methods, and its estimates fall into the 0.95 prediction interval except for 1 or 2 values. The WRF model gives the second most accurate estimates, while WNN obtains the worst accuracy and the most scattered soil P estimations. The WGEP model adequately estimates high soil P values, while the other models generally tend to overestimate them. All these graphs confirm the statistics presented in Table 3.
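The k-fold hold-back scheme discussed above can be sketched as follows (a minimal illustration with placeholder fit/predict callables, not the study's actual models): each fold is held back once for testing while the remaining nine tenths train the model, and the pooled out-of-sample predictions are then scored with R, SI, and NS.

```python
import numpy as np

def kfold_indices(n, k=10, seed=42):
    """Split n sample indices into k disjoint test folds after shuffling,
    so each fold mixes samples from different spatial points."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def cross_validate(model_fit, model_predict, X, y, k=10):
    """Hold each fold back once for testing; train on the remaining folds."""
    folds = kfold_indices(len(y), k)
    preds = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        m = model_fit(X[train_idx], y[train_idx])
        preds[test_idx] = model_predict(m, X[test_idx])
    return preds   # pooled out-of-sample predictions for R/SI/NS scoring
```

Scoring per fold (1/10 of the pattern matrix) instead of on the pooled predictions yields the per-test-stage SI and NS split-up shown in Figure 7.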
Overall, the wavelet-based GEP model excelled among all the applied models, as it provided the most precise results in modeling soil P values in the present study. The results also showed that each of the subcomponents obtained using the DWT played a distinct role in the original data and exhibited a different effect on the original output (here, the soil P data). This study corroborates previous applications [85][86][87] and indicates that the discrete wavelet transform can be considered a useful technique for data preprocessing as well.
Finally, it should be noted that land use may affect the relationships among soil variables, as discussed by other authors (e.g., [88][89][90][91]). Nonetheless, topography affects soil properties as well due to the local redistribution of water, solar radiation, and parent material through erosion and deposition processes [92][93][94]. While the developed models might not be directly transferable to other regions (since the influential parameters on soil P may differ in distinct locations), the proposed approaches can be useful for a suitable assessment of soil P magnitudes in different regions. Therefore, the current study presents a practitioner with a fully described blueprint for applying these techniques.

Conclusions and Challenges
The current research demonstrated the necessity of applying k-fold cross-validation to assess five different wavelet-based AI models, namely, WGEP, WMARS, WRF, WSVM, and WNN, which were applied to modeling soil P. To achieve this goal, soil pH, OC, and clay content were used as inputs; these soil properties were the most adequate considering our study area conditions. Although applications based on a single data set assignment are commonly used, k-fold testing is advisable for a thorough evaluation of a modeling procedure on a given data set, as stated by Marti et al. [95]. The Gamma test was also utilized for selecting the optimal inputs to the applied models, with the aim of preventing the AI-based models from overfitting, in addition to the 10-fold cross-validation. Applying the DWT with three decomposition levels to the input data was another part of the process, where each subcomponent was used as an input. Finally, our findings can be summarized in the following five main points: (i) the wavelet transform can be considered an effective tool for preprocessing the input data of AI methods in modeling soil P; (ii) among the applied AI models, WGEP provided the best estimations, followed by WRF, while WNN showed the worst results; (iii) the models' results were also compared according to the t-test, and WGEP obtained the most robust results in predicting soil P; (iv) the SI-NS accuracy increments of the WGEP model over the WMARS, WRF, WSVM, and WNN models were found to be 35%-3.1%, 30%-2.3%, 40%-4.3%, and 50%-7.4%, respectively; and (v) the k-fold application can be considered an essential method in modeling soil P, as the data demonstrated high spatial variation, and it prevented the applied models from overtraining.
Applying hybrid wavelet-data-driven models to other water- and soil-related analyses, including groundwater quality modeling and soil moisture simulation, could be an interesting topic for future investigations. Furthermore, coupling different wavelet transforms, ranging from the discrete (DWT) to the continuous (CWT) transform, with other data-driven models for the same issues would be another suggestion for further research in this field.