Prediction of Soil Organic Carbon at Field Scale by Regression Kriging and Multivariate Adaptive Regression Splines Using Geophysical Covariates

: Knowledge of the spatial distribution of soil organic carbon (SOC) is of crucial importance for improving crop productivity and assessing the effect of agronomic management strategies on crop response and soil quality. Incorporating secondary variables correlated to SOC allows using information often available at ﬁner spatial resolution, such as proximal and remote sensing data, and improving prediction accuracy. In this study, two nonstationary interpolation methods were used to predict SOC, namely, regression kriging (RK) and multivariate adaptive regression splines (MARS), using as secondary variables electromagnetic induction (EMI) and ground-penetrating radar (GPR) data. Two GPR covariates, representing two soil layers at different depths, and X geographical coordinates were selected by both methods with similar variable importance. Unlike the linear model of RK, the MARS model also selected one EMI covariate. This result can be attributed to the intrinsic capability of MARS to intercept the interactions among variables and highlight nonlinear features underlying the data. The results indicated a larger contribution of GPR than of EMI data due to the different resolution of EMI from that of GPR. Thus, MARS coupled with geophysical data is recommended for prediction of SOC, pointing out the need to improve soil management to guarantee agricultural land sustainability.


Introduction
Soil organic carbon (SOC) is one of the most important indicators for assessing soil quality and overall soil health [1]. SOC plays a key role in unveiling soil structure development, nutrient turnover and stability, soil water retention, regulation of greenhouse gases, and susceptibility or resilience to land degradation [2]. SOC stock is thus a main factor in soil health, fertility, quality, and productivity [3] and supports important soil-derived ecosystem services (ESs) including water filtration and erosion control, soil strength and stability, nutrient conservation, and climate change adaptation and mitigation by sequestration of atmospheric CO 2 [4]. By selecting key soil indicators under different land use and management practices, Shukla et al. [5] concluded that SOC was the main soil quality indicator and suggested using SOC to monitor soil quality changes [6].
These conditions are not always verified. Another way of taking into account the secondary variable is by checking for a spatial trend in the primary variable with respect to the secondary variable(s) and combining the deterministic part and the stochastic component, as in "hybrid methods" [29], or by adopting complex multivariate nonlinear approaches. In recent times, a number of hybrid interpolation techniques, which combine kriging with methods that use auxiliary information (covariates), have been developed and applied. Several authors have compared some of the techniques to incorporate trends and account for nonstationarity [31,32]. Two possible methods of nonstationary interpolation are regression kriging (RK) [28,33] and multivariate adaptive regression splines (MARS) [34]. In many cases, these techniques have been proven superior to common geostatistical methods, yielding more detailed results and higher accuracy of prediction, because they take advantage of being linear hybrid (RK) or nonlinear (MARS) [35]. MARS is a nonparametric predictive method that intrinsically models nonlinearities and interactions between variables, suitably managing local nonstationarity [34]. This method has been successfully applied in various fields, such as estimating the collapse potential for compacted soil, underground gas storage in bedded salt formations, and lateral spreading induced by earthquakes [36,37].
In this study, we compared the performance of RK and MARS to achieve the following objectives: (i) to prove that there are preferential nonlinear relationships between SOC and geophysical measurements, and (ii) to compare the performance of two nonstationary interpolation methods to effectively model SOC at the field scale. Machine learning techniques may open new perspectives to modelling SOC spatial distribution at the field and regional scales. The study was performed on a dataset deriving from a field experiment in which water of different qualities was used for irrigation.
To the best of authors' knowledge, no comparison between these methods has been presented before; therefore, it can be considered a novelty.

Study Area
Soil data were derived from a field experiment carried out in an olive grove located in Fasano (Apulia region, Southern Italy). The climate of the study area is "accentuated thermo-Mediterranean", as classified by UNESCO FAO [41,42], characterized by rather mild and rainy winters and warm and dry summer months. The soil of the experimental site is classified as loam (USDA classification), with an average content of silt, clay, and sand fractions of 35.28%, 21.74%, and 42.98%, respectively.
Olive trees were irrigated with treated municipal wastewater (TWW), and the following treatments were applied: irrigation with fresh water and full fertilization supply (FW); irrigation with TWW and full fertilization supply (R1); and irrigation with TWW and fertilizer supply reduced by the amount provided by TWW (R2) [10]. Treatments were arranged in a randomized complete block design (RCBD) with four replicates (Figure 1). Unit plot size was 108 m 2 , with 3 plants per plot and a plant spacing of 6 m × 6 m; field size was 1296 m 2 (whole experimental area was 1728 m 2 ).

Soil Sampling and Soil Analysis
Soil samples with absolute coordinates were collected on a regular grid (April 2017) at 6 locations (subreplicates) per plot at a 0-0.20 m depth for a total of 72 observations ( Figure 1); only 71 were used in this study. Soil organic carbon (SOC) was quantified on air-dried and sieved samples through dry combustion [43]. Further details about the experimental trial were reported by Barca et al. [44] and Stellacci et al. [10].

Acquisition and Preprocessing of Auxiliary Information
A geophysical survey was carried out using an EMI sensor (EM38DD, Geonics Limited, Mississauga, ON, Canada) and a Georadar (RIS 2k-MF Multifrequency Array Radar-System, manufactured by IDS SpA, Italy) connected to the DGPS along 6 parallel transects by sliding the sensors on the surface (Figure 1) on the same day as soil sampling.
EMI soil survey is based on the principle that a transmitter coil in contact with the soil surface produces a time-varying primary magnetic field in the subsoil. The eddy currents induced in the soil generate a secondary magnetic field, which is recorded by a receiver coil in the EM unit. The apparent conductivity near the receiver is determined by the ratio of the magnitude of the secondary magnetic field to that of the primary magnetic field [22]. The EMI sensor used herein consisted of two perpendicularly superposed EM38 sensors that simultaneously measured apparent electrical conductivity (EC a , expressed in mSm −1 ) near the soil surface (0-0.75 m depth) with the horizontal mode (EC a -H) and up to 1.5 m depth with the vertical mode (EC a -V) [22]. Before operation, the instrument was set to zero at a height of 1.5 m, according to the manufacturer's instructions, and at the end of the survey, the zeroing was checked to detect possible drift. The survey was performed using a nonmetallic platform with wood cover, and the sensor was towed behind a tractor/The EC a was recorded every second, with spatial resolution of 0.5 m, on average, along each transect.
Immediately after the EMI survey, the GPR survey was carried out by sliding the sensor along the surface. GPR data were collected with the common offset reflection method, using a monostatic system (the transmitting and receiving antenna placed in the same box) with two central frequencies of 600 and 1600 MHz (IDS Ing-manufactured, RIS 2k-MF Multifrequency Array Radar-System). The GPR worked with a time window of 60 ns and a temporal sampling interval of 0.05 ns; successive traces were collected every 0.024 m. GPR used electromagnetic pulse energy in the frequency range of 10 MHz to 1000 MHz. The transmitter component of the GPR system allowed the passage of generated pulse energy, which propagated through the subsurface materials, and the interactions with the material were sensed by the receiver component. Traditional surveys employ reflections of electromagnetic waves from boundaries between environments of different electromagnetic properties [45]. Theoretical aspects and working principles of radar components can be found in detail in Davis and Annan [46].
Both the data quality check and cleaning procedure characterized the preliminary data analysis. For EMI data, the points at which the instrument was stationary and any negative values were removed.
Processing the raw GPR data consisted of extracting quantifiable variables, such as attenuation, and displaying GPR data in horizontal maps at a specified time (or depth), called amplitude maps or time slices. The preprocessing of GPR signal amplitude data included the application of a set of filters [47] and the extraction of quantifiable variables.
The enveloped amplitude maps (time slices) were built by averaging the amplitude (or the square amplitude) of the radar signal, expressed in digital number (DN), within overlapping time windows of width ∆t equal to the order of the dominant period of each antenna (2 and 1 ns for the 600 and 1600 MHz antennas, respectively). The total time interval was of 10 ns for the 600 MHz antenna because this time was comparable with the depth of the soil, and it was 6 ns for 1600 MHz because of the attenuation of radar signal. The time slices were then transformed in depth slices using the velocity of the radar waves determined through the analysis of hyperbolae [48]. Data preprocessing was performed with ReflexW Software [49].
In order to estimate the geophysical covariates at the same locations as the SOC measurements, geostatistical procedures were separately applied to EMI and GPR data by using a multivariate approach and fitting a linear model of coregionalization (LMC) to the experimental variograms. Each group of geophysical data was interpolated with ordinary cokriging (ck) on a 0.5 m × 0.5 m grid. The estimated covariates, migrated at the sample locations, were: the EC a in horizontal (EC a H) and vertical (EC a V) modes; the amplitude for the 600 MHz antenna at ten depths from 0.05 m to 0. Finally, 25 covariates were considered, namely, the 23 geophysical covariates plus the (two) geographical coordinates expressed in the WGS84 coordinate system.

Regression Kriging (Residual Kriging)
In the present paper, kriging combined with linear regression (RK), a hybrid interpolation technique, was applied [35,39] (see Figure 2). In mathematical terms, RK can be described as the sum of a deterministic (regression) component and kriging as shown in the following equation: where s 0 is the spatial location associated with the desired prediction,m(s 0 ) is the trend, e(s 0 ) is the interpolated residual,β k are the estimated regressive coefficients, q k are the covariates, p is the number of coviariates, λ i are kriging weights, N is the number of observations, and e(s i ) is the residual (i.e., the difference between the regression estimation minus the observation) at the generic observational location s i . From a practical standpoint, once the trend component has been estimated, the residual can be interpolated with kriging and then added to the previously estimated component. The prediction of the residual is a very critical step, because in principle, only the autocorrelated components should be estimated, neglecting the purely random component. Unfortunately, it is very difficult to separate the overall residual into the autocorrelated and the noncorrelated components. There are many different opinions about the best way to accomplish this issue [50,51]. In the present paper, the variography directly performed on the residuals provided results that did not depart much from those obtained with more sophisticated statistical methods; in other words, this approach did not significantly bias the final predictions. Therefore, the more straightforward approach, which brutally separates observations from trend values to obtain residuals, was preferred [29,52]. The validation of the RK method is usually carried out by means of the cross-validation procedure, and specifically the leave-one-out method [53]. Cross-validation is structured as a two-stage procedure. In the first stage, a leave-one-out method is applied, which consists of dropping an observation from the dataset and predicting this omitted value using the remaining data. Leave-one-out is iterated for each value in the dataset, and each time, a residual is computed as the difference between the observed and predicted values. The second stage of the cross-validation consists of making inferences about the residuals' distribution [54,55]. The R library [56] used to perform the aforementioned analysis was {Automap version 1.0-14}.

Multivariate Adaptive Regression Splines (MARS)
MARS is a nonparametric and nonlinear predictive method that automatically models nonlinearities and interactions between variables managing suitably local nonstationarity [34]. Datasets are split into piecewise curves (splines) of differing slopes. Splines consist of two branches, i.e., left-sided (Equation (2)) and right-sided (Equation (3)) truncated functions, separated by a point called the knot [57].
are splines describing the regions on the right and left sides of the knot (t), respectively, and q is the degree of the polynomial. The subscript "+" indicates that the result of the function is 0 outside the local definition domain. For each of the covariate variables, MARS selects the couple of splines and the knot location more in accordance with the response variable. In a next stage, the different splines are added up in a single multivariate model, which describes the response as a function of the covariates. The result is a nonlinear model assuming the form: whereŷ is the prediction of the response variable; a 0 is the known term; M is the number of basic splines; and B m and a m are the m-th basic spline and its coefficient, respectively [58].
Overall, a MARS analysis consists of three stages. Specifically, (i) the variable that best describes the response by means of the splines in terms of R 2 is selected. Afterwards, (ii) other covariates are added stepwise, always using splines, to build a multivariate model (i.e., the global MARS model). The aim of this addition is the improvement of model in terms of performance (R 2 ). The performance is computed on the training set. Since the global MARS model is usually affected by overfitting, it needs to be "pruned" in a further stage, for which iterations of the generalized cross-validations (GCV) alternated with 10-fold cross-validation are used [59]. The GCV index is a sum of squared errors (observations minus predictions) adjusted by embodying a penalty for reducing the model complexity. This criterion is used to prevent overfitting derived from an excessively accurate model with respect to the training set: where C(M) is a parameter that penalizes models involving a large number of splines, defined as follows: where M is the number of nonconstant splines (i.e., all terms of Equation (4) except a 0 ) in the MARS model and d is a user-defined penalty value for each spline optimization. Increases in the cost d cause the exclusion of splines. Substantially, d is increased during the pruning step in order to obtain smaller models. Besides its use during the pruning phase, GCV index is essential to rank covariates based on their importance in the model. The definition of the final model is reached in a third phase. This phase (iii) is performed by cross-validation or a new independent test set. The R library used to perform the aforementioned analysis herein is {earth} [59].

Exploratory Data Analysis
Descriptive statistics showed that SOC data were normally distributed as confirmed by skewness and kurtosis values (Table 1) and by Shapiro-Wilk test (p = 0.656); for this reason, they were not subjected to a normal transform. The reported bubble plot (Figure 3) shows the spatial distribution of the SOC observations, evidencing some clusters of similar values.  The global Moran index provided an assessment of the spatial autocorrelation strength over the study area and is reported in Table 2. The result (I = 0.42) indicated a significant spatial autocorrelation (p = 0.00034). In addition to the global Moran index, the peak of the Moran index (local Moran index) was estimated by means of the computation of the mean of nearest neighbours. Afterwards, a lagged scatterplot provided the Moran computation at such distance lag. For the considered case, the mean of nearest neighbours was 2.63 m, and Figure 4 shows the Moran value corresponding to that distance, indicating a greater spatial correlation at short range (r = 0.75).

Linear Model Outcomes
The correlation matrix between SOC and the 25 covariates (23 geophysical variables plus the geographical coordinates) was first computed, and different sets of highly correlated covariates were derived and used to fit SOC data.
The following equation shows the first attempt to model SOC with the most correlated variables: SOC~ckAmp0.05m_600MHz+ ckAmp0.1m_600MHz + ckAmp0.4m_600MHz The five-point summary statistics and the coefficients of the linear model are reported in Tables 3 and 4. The outcomes seemed to indicate a larger contribution of the GPR data than of the EMI sensor data. The covariates related to the higher frequency antennae (1600MHz frequency) were therefore excluded. In particular, the GPR data representations for both frequencies showed a first discontinuity in the radar signal at 0.1 m depth, a high level of spatial continuity along the soil profile at least to 0.30 m, and a second discontinuity after 0.30 m depth. Therefore, the selected covariates were representative of information derived by two different layers. Signif. codes: 0.01, "*"; 0.05, ".".
The model was significant (F-statistic: 4.80 on 3 and 67 DF, p-value: 0.004) and showed a residual standard error of 0.26 with 67 degrees of freedom; multiple R-squared and adjusted R-squared were 0.177 and 0.14, respectively. Analysing Table 4, it was evident that there was a unique significant covariate, ckAmp0.1m_600MHz. The result showed the distribution of SOC to be significantly affected by the shallower layer, probably because it was comparable with the portion of sampled soil.
After many other attempts (not reported), a model was developed with the following optimal arrangement of the covariates: This model included the geographical coordinates and a unique geophysical covariate, ckAmp0.35m_600MHz (see Tables 5 and 6). This model was better that the aforementioned one, with all the covariates significant, a better value of R-squared (multiple R-squared: 0.26, adjusted R-squared: 0.22), and a more significant F-statistic p-value (F = 7.9 on 3 and 67 DF, p-value: 0.00018). Residual standard error was 0.24 with 67 degrees of freedom.  The model's residuals were then analysed. The Shapiro-Wilk Gaussianity test showed a nonsignificant departure from the normal distribution (W = 0.98567, p-value = 0.598); as a consequence, the Gaussian hypothesis was accepted. Afterwards, spatial autocorrelation analysis was performed to check at what extent the linear model filtered out the autocorrelation present in the raw data. From Table 7, it was evident that in the linear model's residuals, there was still a significant quantity of spatial autocorrelation (p-value = 0.0012). Therefore, it made sense to apply regression kriging (RK) to exploit the residual autocorrelation with the aim of improving the goodness of fit.

Regression Kriging (RK)
Geostatistical analysis was then applied to the linear model's residuals with the aim of finding in them a structure that could represent their spatial variability.
The goodness of fit between the selected variogram model and the empirical variogram was evaluated by means of the SSErr index, which provides a value that helps user to judge the quality of the final model. For the case at hand, the value was SSErr = 0.00050, which appeared to be a satisfactory result. Moreover, by analysing the variogram parameters, reported in Table 8, it was possible to figure out the strength of the model by computing the nugget-to-sill ratio index [60], also called the spatial dependence index (SDI; [61]). For the case at hand, the observed value was 0.075, indicating high descriptive capability for the variogram model. In Figure 5, the experimental variogram and the fitted nested model (nugget + spherical) are reported. Cross-validation statistics showed an MAE to RMSE ratio of 0.76, indicating a very good outcome. Mathematically, RMSE is always larger than MAE, because large errors are magnified by the square contained in the formula; therefore, the ratio between MAE and RMSE is always less than 1. However, the closer to 1 the ratio is, the fewer large errors made are by the model. This positive result was confirmed by a MAPE value far lower than 10% (Table 9). Computing the Lin coefficient (CCC) between observations and predictions, the outcomes were 0.65 for overall CCC, 0.68, for overall precision, and 0.95 for overall accuracy. The scatterplot of predicted versus observed values qualitatively showed the adequacy between the two data series (Figure 6).  The spatial distribution of SOC obtained through RK is reported in Figure 7.

MARS Model Assessment
The original dataset was split into two complementary subsets, namely, training and test, corresponding to 80% and 20% of the original data, respectively.
Since the model is calibrated by means of the training dataset with the aim to predict the test data, the two subsets should be (statistically) similar at some extent. For this reason, after the splitting, subsets were subjected to the t-test for mean homogeneity and the Levene test for variance homogeneity. In addition, a univariate cluster analysis, carried out to assess the presence of clusters among data, showed that observations could be split into four groups. This represents another constraint about the splitting that has to be taken into account, i.e., the training and test subsets should be formed by a balanced quantity of elements extracted from all the clusters. Subsets were both checked for Gaussianity by means of Shapiro-Wilk test; results showed for both subsets a nonsignificant departure from normal distribution (W = 0.99, p-value = 0.90, for the training set; W = 0.97, p-value = 0.81, for the test set).
A Welch two-sample t-test showed that the means of the two subsets were not statistically different (t = −0.25, df = 20.36, p-value = 0.81). In addition, a boxplot confirmed the equality of the two means of the SOC variable subsets (Figure 8).

Figure 8. Boxplot for SOC comparison between training and test sets.
A Levene test, based on the absolute deviations from the median with a modified structural zero removal method and correction factor, showed the homogeneity of the group variances (test statistic = 0.059, p-value = 0.81). In Figure 9, the placement of the observations for the training (red points) and the test (green points) sets is reported. In summary, the two subsets could be considered similar according to the distribution, mean value, and variance comparisons. Therefore, the training set seemed to be appropriate to calibrate the model and the test set to check for overfitting.
The model included the main GPR covariates selected previously. Regarding EMI data, only the apparent electrical conductivity measured in vertical polarization was selected because the two electrical conductivity variables were strongly correlated and therefore redundant. In addition, the sensor in vertical polarization had a maximum sensitivity approximately at a depth of 0.40 m, which was comparable with the time slices of GPR repeatedly selected (0.35 m).
From Table 10, it can be drawn that the MARS model was formed by four terms; apart from the intercept, the first was linear, and the remaining two were interactions between couples of covariates. After importance analysis was applied, by using the GCV and raw residual sum of squares (Rss) indices, the selected predictors were ranked accordingly (Table 11).  As first step, the Gaussianity of the residuals after the training was tested using the Shapiro-Wilk test; the residuals distribution could be considered Gaussian with a distribution~N(0.0, 0.036) (W = 0.98, p-value = 0.50).
By applying a blind cross-validation with k-fold = 10, the resulting R 2 was 0.51, but it should be borne in mind that this was a pessimistic result, as the extractions of blocks of 10 elements (k-fold with k = 10) from the original dataset was performed 200 times in a purely random fashion, neglecting similar subsets. Moreover, the original dataset was relatively small and represents a not-very-homogeneous reality. Finally, the results in terms of goodness of fit were averaged.
The first step consisted of checking the correlation between predicted and observed values for the training set; the results showed a certain agreement (r = 0.72, p-value ≈ 0.0). In addition, correlation between residuals and predicted values of training subset was checked and was close to zero, as expected.
Afterwards, the MARS model calibrated on the training set was applied to predict SOC data from the test set, which was independent from the model calibration (training) set.
As a first step, the correlation between observations and (test set) predictions was analysed. This resulted in a highly significant correlation (r = 0.87, p-value ≈ 0.0). The value gained after the validation step surprisingly outperformed that of the training set, which is a rare event. The correlation between residuals and (test-set) predicted was not significant.
Computing the Lin coefficient (CCC) between observations and predictions, the outcomes showed very good agreement (overall CCC, 0.81; overall precision, 0.88; overall accuracy, 0.93).
Since the observations were available, it was possible to compute the error metrics, which are reported in Table 12. The error indices were good overall; in particular, MAPE was below 10%, which value has been indicated in literature as a critical threshold. Another very interesting result concerned the ratio between MAE and RMSE, which was larger than that obtained with regression kriging (0.8 vs. 0.76). In conclusion, the MARS model could be considered effective whenever the coefficients of the covariates were not constant over the study domain and the covariates were intertwined in more complex ways than additively.
By comparing the error indices and Lin's coefficients of both methods, it became evident that MARS performed better than RK. The two methods were linear (RK) and nonlinear (MARS), respectively. The main difference concerns the interaction terms, since the MARS model has one linear term and two multiplicative terms (interactions) that represent the added value that allowed improving the predictive capability of MARS with respect to that of RK.
In Figure 10, the map of SOC predictions obtained with the MARS model is reported. Comparing the RK and MARS maps, they showed overall agreement, with a cluster of lower values in the northern part of the study area, a central part with the lowest values, and finally, a southern part with two clusters of larger values and a cluster of lower values. Finally, to quantitatively compare the maps obtained by the two methods, a crosscorrelogram was computed. The result was a value of 0.67 at the distance 0. Therefore, the map gained from RK can be considered a first approximation of that from MARS. This result underlines the reliability of the SOC spatial distribution predicted by the MARS model.

Discussion
Spatial prediction of SOC is critical for assessing the effect of agronomic management strategies on soil quality and crop productivity. In this scope, the sample size is a value that plays a key role in SOC prediction. Thus, it needs to be balanced between economic and predictive constraints. In fact, increasing the sample size may allow the application of statistical methods that take residual autocorrelation into account and thereby reduce the probability of inflation of the type I error rate [12], but at large cost. Regression kriging and MARS, incorporating covariate information often available at a finer resolution than primary data, such as proximal and remote sensor data, may improve the quality of SOC estimation without increasing the sampling size of the primary variable [62,63].
Outcomes obtained from linear models seem to highlight a larger informative contribution of GPR than of EMI data. From a physical standpoint, this result can be explained by the different nature of sensors' outcomes. In fact, GPR information results are more sensitive to near-surface effects than EMI data, which are integrated values over all soil layers [15]. However, unexpectedly [64], the covariates related to the higher-frequency antenna (1600 MHz frequency) were excluded, probably because they did not add further information or were redundant in this study case.
Two GPR covariates, namely, ckAmp0.35m_600MHz and ckAmp0.1m_600MHz, were selected by the MARS model. The same variables were also chosen by the final RK model (ckAmp0.35m_600MHz) and the preliminary RK model (ckAmp0.1m_600MHz). Similar importance was also assigned to the selected variables by both statistical methods, as shown by the ranking defined by GCV and Rss in MARS model, suggesting that their significance was physically based. In fact, the selected covariates were representative of information derived by two soil layers with different physical properties influencing radar signal and soil organic carbon distribution. The two methods also had the X geographical coordinate in common, indicating a larger continuity along this direction.
The main difference between the two approaches concerned the selection of the EMI covariate in vertical polarization performed only by the MARS model, indicating the different explanatory power of information brought by the two sensors. This result was tied to the intrinsic capability of the MARS model to intercept the interactions among variables and highlight nonlinear features underlying the data [34]. In addition, the coefficients of the MARS model were not constant but piecewise linear (splines), and therefore, their gradient varied over the studied domain [57]. This explains the larger descriptive capability of the MARS model and its ability to select hidden features with respect to regression kriging. Although MARS is not explicitly a spatial method, its capability of modelling covariate coefficients by means of flexible functions allows, when the geographic variables are included in the analysis, filtering out the spatial autocorrelation contained in the data, which makes it substantially a spatial method [65]. A confirmation of this was the statistical nonsignificance of the Moran I index obtained from the MARS residuals.
Studies on the spatial variability of SOC in agricultural soils remain a central theme in assessing the environmental sustainability of agricultural systems [66], because agronomic inputs could be rationalized in order to not impoverish the soil's fertility. Therefore, our results represent a knowledge contribution for future studies aimed at detecting the spatial distribution of soil organic carbon at the field scale. Geophysical methods show new applicative potentialities for environmental sciences (see, among others, [67]) and can represent support for research in this field. However, because of the complex interactions with soil properties, the use of geophysical measurements as covariates needs to be investigated in more detail to draw more precise conclusions. A limit of the present work could be its potential site-specificity, which could not be quantified in advance. Therefore, further experiments in different study areas and agroenvironments should be performed to test the performance of the methods under different conditions.

Conclusions
The results of our investigation showed that MARS outperformed RK in predicting SOC spatial distribution. The nonlinearity of MARS evidenced the contribution of EMI variables neglected by linear approaches. That result would have to be deepened in future works in consideration of the fact that EMI measures are more easily achievable than GPR ones.
The accuracy reached in mapping SOC with the support of MARS was remarkable and opens interesting perspectives in applying other, more powerful machine learning methods (e.g., deep learning) to even better exploit proximally sensed data. In the future, it is hoped that these machine learning methods will be successfully associated with mapping procedures and then applied at the regional and national level.
The use of relatively easy, accurate, and inexpensive geophysical methods for SOC estimation, together with application of advanced statistical techniques for SOC spatialization, can represent a viable solution to investigate agroecosystem sustainability.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board.