remote sensing Digital Mapping of Soil Properties Using Multivariate Statistical Analysis and ASTER Data in an Arid Region

Abstract: Modeling and mapping of soil properties has been identified as key for effective land degradation management and mitigation. The ability to model and map soil properties at sufficient accuracy for a large agricultural area is demonstrated using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) imagery. Soil samples were collected in the El-Tina Plain, Sinai, Egypt, concurrently with the acquisition of ASTER imagery, and measured for soil electrical conductivity (ECe), clay content and soil organic matter (OM). An ASTER image covering the study area was preprocessed, and two predictive models, multivariate adaptive regression splines (MARS) and partial least squares regression (PLSR), were constructed based on the ASTER spectra. For all three soil properties, the results of the MARS models were better than those of the respective PLSR models, with cross-validation estimated R² of 0.85 and 0.80 for ECe, 0.94 and 0.90 for clay content and 0.79 and 0.73 for OM, respectively. Independent validation of the ECe, clay content and OM maps with 32 soil samples showed the better performance of the MARS models, with R² = 0.81, 0.89 and 0.73, respectively, compared to R² = 0.78, 0.87 and 0.71 for the PLSR models. The results indicated that MARS is a more suitable modeling technique than PLSR for the estimation and mapping of soil salinity (ECe), clay content and OM. The method developed in this paper was found to be reliable and accurate for digital soil mapping in arid and semi-arid environments.


Introduction
Modeling and mapping soils for their physical and chemical properties typically involves extensive field work and laboratory analysis, which are expensive and time consuming [1]. Digital soil mapping (DSM) relies on field observations, laboratory measurements and remote sensing data, integrated with quantitative methods to map spatial patterns of soil properties at different spatial and temporal scales [2], to provide up-to-date soil information [3,4]. As soil salinity, texture and organic matter (OM) all play a vital role in assessing topsoil characteristics and soil quality [5], remote sensing can be explored as an effective method for studying these soil properties [6]. As a consequence, it would be an advantage to be able to map soil properties from one or more datasets of satellite imagery.
Remotely-sensed imagery in combination with field measurements has been used intensively for the last two decades in modeling and mapping of soil properties in a cost-effective manner at various scales [7,8]. Satellite sensors, such as the Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), Operational Land Imager (OLI) and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), have been used to map soil properties [9]. While the feasibility of remote sensing data for soil mapping could be restricted by factors such as image spectral and spatial resolution and the presence of vegetation cover [10,11], these sensors were found to be effective for soil mapping [12][13][14][15].
The ASTER imager covers a wide spectral region extending from the visible to the thermal infrared with 14 bands with high spatial and radiometric resolutions [16]. ASTER data were successfully applied to map and monitor soil salinity [17,18], soil texture [19], soil minerals [4] and soil organic matter (OM) [20]. Several researchers successfully used ASTER spectra coupled with predictive models for mapping soil properties [20][21][22][23]. For example, Luleva [20] used ASTER spectra coupled with the PLSR method for mapping different soil properties, including soil salinity and organic matter, in southeast Tunisia. She achieved R² of 0.88 and a ratio of performance to deviation (RPD) of 3.0 for soil salinity (electrical conductivity, EC) and R² of 0.80 and RPD of 2.35 for OM and concluded that the accuracy of the atmospheric correction and high correlation between image spectra and soil properties were the most important factors in accurate mapping of soil properties. Rivero et al. [22] used predictive models, such as stepwise linear multivariate regression (SLMR), non-linear regression (NLR) and multiple regression, for mapping the surface soil total phosphorus (TP) with ASTER data in Florida, USA, and found ASTER data useful for mapping this soil property. Piekarczyk et al. [23] used ASTER spectra to estimate soil total exchangeable bases (TEB) at abandoned fields covered with vegetation in Poland, and they concluded that ASTER data may serve to assess soil physical and chemical properties. As low availability, high complexity and costs associated with hyperspectral imagery still limit its application in widespread monitoring and soil mapping of agricultural regions over the world [15], the ASTER data have the ability to bridge the gap between the demands of regional soil mapping for spectral data with large area coverage and the potential of the hyperspectral data to map soils with sufficiently high accuracy.
Several authors have studied the relationships between soil properties and spectral reflectance data, for example soil salinity [24][25][26][27][28], OM [29][30][31][32][33] and clay content [34][35][36][37][38]. According to Ben-Dor et al. [39], changes in the spectral response occur due to changes in soil albedo and soil chromophores. Many researchers have identified clear relationships between the soil salinity, OM and iron content, which are also strongly correlated with the soil albedo. For instance, Baumgardner et al. [40] stated that the differences in moisture content, OM, soil texture, iron content, as well as salinity determine most of the spectral responses in the visible to shortwave infrared spectral range. Furthermore, Palacios-Orueta and Ustin [41] found that soil texture, soil OM and total iron content were the main factors affecting the spectral curve of soil.
Deriving information from spectral data typically relies on methods such as (un)supervised classification [42,43], spectral unmixing [44], band ratios [45], absorption features analysis [46] and partial least squares regression (PLSR) [47]. The PLSR approach has inference capabilities useful to model probable linear relationships between the reflectance spectra and soil properties [26] and enables the modeling of several response variables simultaneously, at the same time effectively addressing strongly collinear and noisy predictor variables [48]. PLSR has been used successfully for mapping soil properties (e.g., [26][49][50][51]), including also soil variables characterized by high spatial variability and complex nonlinear relationships [26,52]. In this context, a non-linear predictive model for soil mapping may offer a better solution. Multivariate adaptive regression splines (MARS) is a powerful nonparametric modeling method that estimates complex nonlinear relationships among independent and dependent variables [53]. It has been effectively applied in different fields [54][55][56][57] and generally exhibits very high performance compared with other linear and non-parametric regression models, such as principal component regression (PCR) and artificial neural networks (ANN). Nawar et al. [58] used PLSR and MARS with soil spectroscopy and Landsat TM/ETM+ data for mapping soil salinity in El-Tina Plain in Egypt and reported that MARS provided better estimations for the soil salinity, yielding better cross-validation R² and ratio of performance to deviation (RPD) values than the generally used PLSR method did. The results of this study indicated a potential to extend the modeling capability of PLSR to model and map other soil properties and the need to test MARS models with other than Landsat multispectral data in order to prove their robustness.
The aim of this study was therefore to extend the capabilities of the models elaborated in [58] to map soil salinity, clay content and organic matter and to test the performance of ASTER data in mapping these soil properties using a combination of field sampling and spectral reflectance derived from ASTER imagery. The specific objectives of the study were to: (I) evaluate the potential of ASTER data for the modeling of soil properties; (II) develop predictive models based on ASTER spectra for assessing soil salinity, clay content and organic matter using MARS and to compare its performance to the results acquired with the PLSR method; and (III) map the spatial distribution of soil properties based on ASTER imagery across a large agricultural area.

Study Area
The study area is the El-Tina Plain, located on the northwestern Sinai Peninsula in Egypt, approximately 380 km² in area (Figure 1), described in detail in [59,60]. In brief, it is an arid area, with annual rainfall below 80 mm and daily evaporation rates ranging from 3.6 mm to 7.3 mm [61,62]. The land surface is nearly flat, ranging in elevation from below sea level to 5 m above sea level, and the water table is persistently high in the northern part of the plain, from 50 to 75 cm below the ground surface, due to the subsurface seepage of Mediterranean seawater. The area is characterized by high spatial variability of soil salinity, clay content and organic matter. The soil texture varies from loamy sand to clay, and the soil salinity varies from non-saline to highly saline. Soils of the El-Tina Plain were classified into two orders, Entisols and Aridisols, which include eight subgroups: Typic Aquisalids, Typic Haplosalids, Aquic Torriorthents, Typic Torriorthents, Aquic Torripsamments, Typic Torripsamments, Gypsic Aquisalids and Gypsic Haplosalids [60]. The El-Tina Plain is an important part of the El-Salam Canal project, yet unfortunately, this area does not have accurate soil maps and databases to support sustainable agricultural production. In this case, DSM based on ASTER satellite data offers potential to produce accurate soil maps with up-to-date information that is sufficient to evaluate soil quality within the region.

Soil Sampling and Analysis
A total of 118 topsoil (0-20 cm) samples were collected in the study area between 26 May and 8 June 2006 [60]. A stratified random sampling strategy was used, with strata represented by soil units and land use types. The positions of sample points, land use and vegetation types were recorded in the field. The field spectroscopy measurements were collected in 2012 from 20 locations, grouped into six main physiographic features. As the field work was carried out six years after the acquisition of satellite data, only stable features and land cover types unlikely to change were selected. They included different soil types, wet and dry sabkhas (salt flats), sand, different crops and water bodies. Spectroscopy measurements were collected using a high-resolution field portable spectroradiometer (Spectral Evolution PSR-3500, Spectral Evolution, Inc., Lawrence, MA, USA), which measures the reflectance over the range of 350-2500 nm.
The collected soil samples were air-dried, crushed and passed through a 2-mm sieve. The fine earth was used for soil physical and chemical analysis. The particle size distribution was measured with the pipette method [63]. The soil reaction (pH) was measured in a 1:2.5 soil-water suspension, and the ECe was measured in a soil paste extract according to the method outlined in [64]. Soil OM content was determined using the modified Walkley and Black method [65].

ASTER Data and Pre-Processing
ASTER is an advanced multispectral imager flying on the TERRA satellite with 14 spectral bands ranging from the visible to thermal infrared region, with high spatial, spectral and radiometric resolution [16]. The spatial resolution varies with wavelength: 15 m in the visible and near infrared (VNIR) bands, 30 m in the short-wave infrared (SWIR) bands and 90 m in the thermal infrared (TIR) bands [66]. The cloud-free ASTER image used in the study was downloaded from the National Aeronautics and Space Administration (NASA) Land Processes Distributed Active Archive Center (LP DAAC; http://reverb.echo.nasa.gov/reverb/datasets/). The image acquisition date (31 May 2006) was selected to be as close to the soil sampling campaigns as possible and falls in the peak of the dry season in Egypt (summer season), with scarce vegetation and dry soils [62,67]. Because the soil is bare and dry, information about the surface constituents can be collected by the satellite sensors [68].
We tested Level 2 data (AST07, surface reflectance), since it suits the purpose of this study; however, it showed inconsistency in Band 9, which may be attributed to the wrong aerosol type in the atmospheric correction. Consequently, as the absorption by atmospheric gases and aerosols may affect the naturally weak, narrow and mixed spectral features of the soil chromophores and, hence, their response registered by a sensor [69], Level 1B (radiance at the sensor) data were used instead. They were atmospherically corrected and then converted to surface reflectance before pixels were matched to any reference spectra. First, the ASTER SWIR bands with 30-m spatial resolution were resampled to VNIR spatial resolution (15 m). A nearest neighbor resampling procedure was used in order not to alter the digital numbers (DNs) representing the brightness values of the pixels for each band. Next, the ASTER image was atmospherically corrected using the Atmospheric Correction and Haze Reduction (ATCOR 2) model, which is based on the Moderate Resolution Atmospheric Transmission (MODTRAN 5) code [70]. The following sensor and physical environmental parameters were used to run the atmospheric correction model: Earth-Sun distance = 1.00934 astronomical units (AU), average surface altitude = 0.002 km, atmospheric model = dry (26 °C) and aerosol type = rural.
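The nearest neighbor resampling step described above can be illustrated with a minimal sketch. It assumes a SWIR band stored as a nested list of digital numbers and an integer scale factor of 2 (30 m to 15 m); the function name `upsample_nearest` and the sample values are hypothetical and are not taken from the processing chain actually used in the study.

```python
def upsample_nearest(grid, factor=2):
    """Replicate every pixel factor x factor times (nearest neighbor resampling).

    No interpolation is performed, so the original digital numbers are preserved.
    """
    out = []
    for row in grid:
        # Repeat each pixel horizontally.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2 x 2 block of 30-m SWIR digital numbers.
swir_30m = [[10, 20],
            [30, 40]]
swir_15m = upsample_nearest(swir_30m, factor=2)
for row in swir_15m:
    print(row)
```

Because each output pixel is a copy of an input pixel, the set of DN values in the 15-m grid is identical to that of the 30-m grid, which is the property the resampling choice is meant to guarantee.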
The resulting image spectra were then validated with field spectroscopy measurements that were resampled to ASTER spectral resolution [71], for various land cover types (sand, sabkhas, water, crops).
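As an illustration of how field spectra can be brought to ASTER spectral resolution, the sketch below averages a 350-2500 nm spectrum within each of the nine VNIR and SWIR bands. Two assumptions should be noted: the band edges are the nominal ASTER band limits as commonly quoted in the instrument documentation, and a simple boxcar average is used in place of the sensor's spectral response functions, which a rigorous resampling would use. The function name and the synthetic spectrum are hypothetical.

```python
# Nominal ASTER VNIR/SWIR band limits in nm (assumed values, boxcar approximation).
ASTER_BANDS_NM = {
    "B1": (520, 600), "B2": (630, 690), "B3N": (780, 860),
    "B4": (1600, 1700), "B5": (2145, 2185), "B6": (2185, 2225),
    "B7": (2235, 2285), "B8": (2295, 2365), "B9": (2360, 2430),
}


def resample_to_aster(wavelengths_nm, reflectance):
    """Average the reflectance samples falling inside each band (boxcar filter)."""
    band_values = {}
    for band, (low, high) in ASTER_BANDS_NM.items():
        inside = [r for w, r in zip(wavelengths_nm, reflectance) if low <= w <= high]
        # None marks a band without any field measurement.
        band_values[band] = sum(inside) / len(inside) if inside else None
    return band_values


# Synthetic field spectrum sampled every 1 nm (illustrative values only).
wavelengths = list(range(350, 2501))
spectrum = [0.1 + 0.0001 * w for w in wavelengths]
aster_like = resample_to_aster(wavelengths, spectrum)
print(aster_like)
```

The resulting nine values per location can then be compared band by band with the reflectance extracted from the corrected image.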
Further analysis was limited to bare soils. Features other than soils (mainly vegetation and water that may interfere with soil) were masked using the supervised classification trained by classes carefully selected from the image. In this way, 92% of the study area remained for analysis. Finally, the image spatial subsets were created, and various stacks of color composites were produced to aid in the interpretation of the results.

Data Analysis
Two models, MARS and PLSR, were constructed based on the measured ASTER spectral reflectance and soil ECe, clay content and OM values of 86 of the 118 soil samples collected in 2006. Thirty-two samples were retained for model validation. For constructing the models, ASTER reflectance spectra were utilized as predictor variables (X variables) against measured soil property values (Y, response variables). The resulting MARS and PLSR models were assessed using R², RMSE and RPD parameters.

Multivariate Adaptive Regression Splines
Developed by Friedman [53], MARS is a nonparametric data mining method. Recently, MARS has been applied as a regression method in several disciplines [54,59,[72][73][74]] and was generally reported to perform better than conventional statistical methods. MARS uses basis functions to model the predictor and response variables by utilizing adaptive piecewise linear regressions; i.e., the non-linearity of a model is simulated through the use of separate linear regression slopes in definite intervals of the predictor variable space [75]. To improve predictions and avoid over-fitting, MARS uses a modified form of the generalized cross-validation method (GCV; [76]). The GCV is the mean squared residual error divided by a penalty dependent on the model complexity and is expressed as follows:

$$\mathrm{GCV}(m) = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{f}_m(x_i)\right)^2}{\left(1 - \frac{C(m)}{n}\right)^2}$$

where n is the number of observations, y_i is the observed response, \hat{f}_m(x_i) is the response predicted by a model containing m basis functions and C(m) is the cost-complexity measure of that model, used to penalize the model complexity to avoid over-fitting by introducing a cost for the added basis functions in the model. Development of MARS models for soil salinity assessment was described in detail in Nawar et al. [58,59]. In this study, the MARS analysis was performed using the ARESLab toolbox [77] with selected adaptations in MATLAB 8.0.
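To make the two ingredients of MARS described above more concrete, the following sketch shows a hinge (piecewise linear) basis function and the GCV criterion. It is an illustration under stated assumptions, not the ARESLab implementation: a single predictor is used for readability, the coefficients and knots are invented, and the cost-complexity value C(m) is passed in directly because its exact form depends on the toolbox settings.

```python
def hinge(x, knot, direction=1):
    """Piecewise linear basis function: max(0, x - knot) or max(0, knot - x)."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-type model; terms holds (coef, knot, direction)."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


def gcv(observed, predicted, c_m):
    """Mean squared residual divided by the complexity penalty (1 - C(m)/n)^2."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return mse / (1.0 - c_m / n) ** 2


# Hypothetical model: the slope changes at a reflectance of 0.2.
terms = [(50.0, 0.2, 1), (-10.0, 0.2, -1)]
reflectance = [0.10, 0.15, 0.25, 0.30]
observed = [4.0, 4.6, 7.4, 10.2]
predicted = [mars_predict(x, 5.0, terms) for x in reflectance]
print(predicted, gcv(observed, predicted, c_m=2.0))
```

During backward pruning, basis functions would be removed as long as doing so lowers the GCV, which trades residual error against the number of terms.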

Partial Least-Squares Regression
PLSR is a popular modeling technique applied in chemometrics and commonly used for quantitative spectral analysis. The PLSR algorithm relates two data matrices, X and Y, via a linear multivariate model to select successive orthogonal factors (latent variables) that maximize the covariance between the predictor X (e.g., spectra) and response variables Y (e.g., measured soil property). A detailed description of the PLSR method can be found in Geladi and Kowalski [78] and Wold et al. [48]. The PLSR model with leave-one-out cross-validation (LOOCV) used in this study was described in detail in Nawar et al. [58,59]. The number of significant latent variables for the PLSR algorithm was defined using the cross-validation method. The PLSR process was performed using MATLAB (Version 8.0, The MathWorks).
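The combination of PLSR and LOOCV can be sketched as follows. The code is a compact NIPALS-style PLS1 (single response) written with the Python standard library; it is offered only as an illustration of the technique and is not the MATLAB routine used in this study. All function names and the toy data are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive extraction of latent variables with deflation."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance between X and y.
        w = [_dot([row[j] for row in Xc], yc) for j in range(p)]
        norm = _dot(w, w) ** 0.5
        if norm < 1e-12:
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in Xc]          # scores
        tt = _dot(t, t)
        if tt < 1e-12:
            break
        load = [_dot([row[j] for row in Xc], t) / tt for j in range(p)]
        q = _dot(yc, t) / tt
        # Deflate X and y before extracting the next latent variable.
        Xc = [[row[j] - t[i] * load[j] for j in range(p)] for i, row in enumerate(Xc)]
        yc = [yc[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    x_mean, y_mean, components = model
    xc = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = _dot(xc, w)
        y_hat += q * t
        xc = [a - t * b for a, b in zip(xc, load)]
    return y_hat


def loocv_rmse(X, y, n_components):
    """Leave each sample out once, refit, and predict the held-out sample."""
    squared = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared += (y[i] - pls1_predict(model, X[i])) ** 2
    return (squared / len(X)) ** 0.5


# Toy data: two predictors, response exactly linear in both.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [2 * a + 3 * b + 1 for a, b in X]
for k in (1, 2):
    print(k, loocv_rmse(X, y, k))
```

In practice, the number of latent variables would be taken as the value of `n_components` giving the lowest LOOCV error, which mirrors the selection rule described in the text.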

Prediction Accuracy
The values of R², RMSE and RPD were used to assess the performance of the soil property prediction models. RPD was classified into three classes by Chang et al. [79]: Category A (RPD > 2) comprises models that accurately predict a given property; Category B (1.4 < RPD < 2) comprises models with limited prediction ability; and Category C (RPD < 1.4) comprises models with no prediction ability.
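The three accuracy measures can be computed as in the sketch below. Several choices are assumptions rather than statements about the original analysis: R² is taken as the coefficient of determination (1 minus the ratio of residual to total sum of squares), RPD as the sample standard deviation of the observed values divided by the RMSE, and values falling exactly on a class boundary (1.4 or 2) are assigned to Category B, since the published classes leave the boundaries undefined.

```python
def rmse(observed, predicted):
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations divided by RMSE."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sd = (sum((o - mean_obs) ** 2 for o in observed) / (n - 1)) ** 0.5
    return sd / rmse(observed, predicted)


def rpd_category(value):
    """Classes after Chang et al.; boundary values go to B (assumption)."""
    if value > 2.0:
        return "A"
    if value >= 1.4:
        return "B"
    return "C"


# Illustrative values only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(r_squared(observed, predicted), rmse(observed, predicted),
      rpd_category(rpd(observed, predicted)))
```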

Mapping Soil Properties Using ASTER Data
The resulting models were used to estimate the image-scale soil salinity (ECe), clay content and OM and to map the spatial variability of soil properties. As validation of the models and mapping results is an important step to ensure model quality [80], the resulting soil maps were validated using 32 soil samples (used neither in the calibration of the models nor in the cross-validation) (Figure 1). The validation set sufficiently covered the variation in soil properties in the study area (Table 1) to be used in assessing maps created with the data mining techniques (PLSR and MARS). The corresponding soil property values were extracted from the resulting maps computed using the MARS and PLSR models. To assess the resulting maps, the predicted values were then compared with the true (laboratory) values using R² and RMSE parameters.
* Very slightly saline (<4 dS m⁻¹), slightly saline (4-8 dS m⁻¹), moderately saline (8-16 dS m⁻¹), strongly saline (16-32 dS m⁻¹) and very strongly saline (>32 dS m⁻¹).

Soil Properties
The descriptive statistical analyses of soil properties are shown in Table 2 and Figure 2. The soil salinity (ECe) of the 86 soil samples used to build the models ranged from 0.4 to 164 dS m⁻¹. The mean soil salinity was 75.8 dS m⁻¹, indicating that most soils in the study area are salt-affected. The pH values ranged between 7.2 and 9.3 with a mean value of 8.3. The OM content of the samples ranged between 0.0% and 2.3%, and values > 1.5% were noted for almost half of all samples. Clay content ranged between 0.3% and 54.3%, with a mean value of 36.3%, and samples with a clay content > 40% comprised 60% of all soil samples. All three variables are correlated (OM and clay content, r = 0.85; ECe and clay content, r = 0.68; ECe and OM, r = 0.65).

Evaluation of ASTER Data
To evaluate the results of ASTER pre-processing, the collected reference field spectra were compared to reflectance values of nine VNIR and SWIR bands extracted from the ASTER image at corresponding locations, for the main land cover types of the study area (Figure 3). The results demonstrated that spectral reflectances derived from the ASTER image exceeded the measured reflectance for water over clay, dry sabkha, sand and agricultural soil (Figure 3a-e) and were lower for crop (maize) and wet sabkha (Figure 3c,f). The R² values were in all cases high and ranged between 0.85 for dry sabkha and 0.96 for maize and sand, indicating a high correlation between ASTER and field-measured spectral reflectance.
PLSR analysis allowed for the assessment of the contribution of specific ASTER bands to estimates of selected soil variables (Figure 4). For soil salinity, SWIR Bands 9 and 8 had the highest contribution, followed by SWIR Band 7, VNIR Band 3, VNIR Band 1 and SWIR Band 4, whereas SWIR Band 6, SWIR Band 5 and VNIR Band 2 made the lowest contribution. For clay content, SWIR Band 5 had the highest contribution, followed by SWIR Band 8 and SWIR Band 7. These results, which indicate that SWIR bands contributed more to clay content estimation than VNIR bands, are in agreement with [31]. VNIR Bands 2 and 3 had the highest contribution for estimating OM, followed by SWIR Bands 7 and 8 [20].

Prediction of Soil Properties
The MARS models used eight, five and eight basis functions for ECe, clay content and OM, respectively. The PLSR models used five latent variables for ECe and four latent variables for both clay content and OM. The LOOCV results showed that the MARS models performed better than the PLSR models, with higher R² and RPD values and lower RMSE (Figure 5, Table 3). For both techniques, the results of OM modeling were inferior to the ECe and clay content models (Figure 5c). These results indicate that MARS may be more suitable to model soil properties than the widely-used PLSR.

Mapping of Soil Properties
MARS and PLSR calibration models were used to map the distribution of ECe, clay content and OM with the ASTER image data (Figure 6). The quantitative ECe maps in Figure 6a indicated that the maximum estimated ECe values were 104.0 dS m⁻¹ and 103.6 dS m⁻¹; the minimum values were −3.6 and −3.4 dS m⁻¹; the mean values were 24.0 and 23.3 dS m⁻¹; and the standard deviations were 15.3 and 17.4 dS m⁻¹, for the PLSR and MARS maps, respectively. Figure 6b shows the quantitative maps of clay content generated based on the PLSR and MARS models. The maximum estimated clay content values were 78.3% and 65.0%; the minimum values were −3.0% and −2.0%; the mean values were 28.7% and 25.2%; and the standard deviations were 19.6% and 19.7%, for the PLSR and MARS maps, respectively. In the case of the ECe and clay content maps, the negative values are not reasonable, yet they accounted for less than 4% of all pixels in both maps. They may be explained by a number of different factors. The first is the uncertainty of the atmospheric correction. Second, the PLSR and MARS models were established based on ASTER reflectance, which may be influenced by spectral mixture. The results from the quantitative OM maps in Figure 6c revealed that the maximum estimated OM values were 2.83% and 2.64%; the minimum values were both 0.0%; the mean values were 1.1% and 0.95%; and the standard deviations were 0.67% and 0.59%, for the maps based on PLSR and MARS, respectively. The predicted soil salinity, clay content and OM values were grouped into classes to present the spatial distribution of soil properties (Figure 7). The predicted soil salinity values were grouped into five classes (<4, 4-8, 8-16, 16-32 and >32 dS m⁻¹) according to Nawar et al. [58]. The high salinity occupied mostly the northern and western parts of the study area, while the southeastern part of the study area had low salinity levels (Figure 7a). The soil salinity increased from south and east to north and west in the direction of the El-Salam Canal and the Mediterranean Sea, which confirmed the trend effect caused by the seepage from the canal and the sea [58]. Furthermore, the soil salinity maps had a similar spatial pattern for both MARS and PLSR models with small differences in class areas (Table 4). The soil salinity distribution in Figure 7a is consistent with our prior knowledge of the study area and the results obtained so far with other sensors [58]. To interpret and understand the spatial distribution of clay content, its values predicted across the study area were grouped into five classes: very low clay content (<10%), low clay content (10%-20%), medium clay content (20%-30%), high clay content (30%-40%) and very high clay content (>40%). Although the spatial distribution of soil clay content showed significant similarity for maps derived from PLSR and MARS models, some small areal differences of classes 10%-20%, 20%-30% and 30%-40% (Table 4) were noted. On the other hand, the areas of the lowest clay content class (<10%) and the highest clay content class (>40%) were quite similar. The clay content tended to increase from south to north towards the Mediterranean Sea. The area with high clay content was located mostly in the northern and western parts of the study area, and the area with low clay content occupied the southern and eastern parts of the study area. The soil clay content distribution in Figure 7b is consistent with our prior knowledge of the area.
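The grouping of predicted pixel values into classes, and the derivation of class areas such as those reported in Table 4, can be sketched as follows. The class breaks are those given in the text; the treatment of values lying exactly on a break (assigned to the higher class), the 15-m pixel size used for the area conversion and all function names are assumptions made for this illustration.

```python
from bisect import bisect_right

# Class breaks taken from the text (ECe in dS/m, clay content in %).
ECE_BREAKS = [4, 8, 16, 32]
ECE_LABELS = ["<4", "4-8", "8-16", "16-32", ">32"]
CLAY_BREAKS = [10, 20, 30, 40]
CLAY_LABELS = ["<10", "10-20", "20-30", "30-40", ">40"]


def classify(value, breaks, labels):
    """Return the class label; a value equal to a break falls in the higher class."""
    return labels[bisect_right(breaks, value)]


def class_areas_km2(values, breaks, labels, pixel_size_m=15.0):
    """Sum the pixel area per class and convert from square metres to km²."""
    pixel_km2 = pixel_size_m ** 2 / 1e6
    areas = {label: 0.0 for label in labels}
    for value in values:
        areas[classify(value, breaks, labels)] += pixel_km2
    return areas


# Illustrative predicted ECe values, including one negative prediction.
predicted_ece = [-3.6, 2.0, 4.0, 10.0, 40.0]
print([classify(v, ECE_BREAKS, ECE_LABELS) for v in predicted_ece])
print(class_areas_km2(predicted_ece, ECE_BREAKS, ECE_LABELS))
```

Negative predictions fall into the lowest class under this scheme, so they do not need separate handling for the classified maps.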
OM content values predicted across the study area were grouped into five classes, taking into account OM variability under the arid conditions in Egypt [81]: very low OM content (<0.5%), low OM content (0.5%-1.0%), medium OM content (1.0%-1.5%), high OM content (1.5%-2.0%) and very high OM content (2.0%-2.8%). The spatial distribution of OM is shown in Figure 7c. It is apparent that low OM classes are predominant over the study area. Although the spatial distribution of OM had a similar pattern for both maps based on PLSR and MARS models, differences in the area of the very high OM class (2.0%-2.8%) and the very low OM class (<0.5%) were noted (Table 4).

Validation
The quantitative maps for soil salinity, clay content and OM were validated with 32 independent soil samples (Figure 8). For soil salinity, the results showed a satisfactory relationship between the measured data and predicted ECe values (R² = 0.81 and 0.78, RMSE = 7.7 and 8.1, for maps produced using the MARS and PLSR models, respectively; Figure 8a). Similarly, we found an excellent relationship between the measured data and predicted clay content values (R² = 0.89 and 0.87, RMSE = 6.9 and 7.6, for the MARS and PLSR models, respectively; Figure 8b). OM maps showed only a satisfactory relationship between the measured and predicted OM values (R² = 0.73 and 0.71, RMSE = 0.34 and 0.36, for the MARS and PLSR models, respectively; Figure 8c).
It is apparent from Figure 8 that modeling ECe, clay content and OM implies nonlinear relationships between the measured values at their higher range and ASTER spectral data. MARS showed the capability to model these relationships. The results show that the MARS models performed better than the PLSR models, indicating that MARS combined with ASTER reflectance data is a better-suited technique for estimating and mapping soil salinity, clay content and organic matter.

Evaluation of ASTER Data
The appropriateness of satellite data and the adequacy of their atmospheric correction are important pre-requisites for modeling soil properties. The ASTER Level 1B (AST L1B) data were converted to reflectance using ATCOR 2 [70] and then tested by comparing the resulting image reflectance spectra with field-measured reflectances for six selected stable land cover types (wet and dry sabkhas, crops (maize), agricultural soil, sand and water over clay). The results showed high correspondence between the measured spectra and those extracted from the calibrated and atmospherically corrected AST L1B dataset. Due to inconsistency in Band 9 of the AST07 product, the image spectra could be unreliable to estimate and map surface soil properties with reasonable accuracy [82,83]. On the other hand, AST L1B atmospherically corrected with ATCOR software was successfully used in various mapping applications [82,84,85]; therefore, the atmospherically corrected Level 1B dataset was also applied in this study to model selected soil properties.
The contributions of the ASTER bands to soil property estimation are illustrated in Figure 4. For soil salinity, the SWIR bands (i.e., Bands 9, 8 and 7) and NIR (i.e., Band 3) had the highest contribution to estimating soil salinity (Figure 4a). These results were consistent with the results of many studies on spectral wavebands and their relationships with soil salinity [26,74,86]. For instance, Farifteh et al. [26] reported that the best performing bands for different scales (field, image and experimental) were found in the NIR and SWIR regions of the spectrum. They defined the wavelength at 2257 nm of the raw spectrum as the most effective band for estimating the soil EC for dry soils using a linear predictive model. The reflectance spectra within the NIR and SWIR regions were viewed as the best spectral region for evaluating the EC [26,52,86]. The results of the present study suggest that ASTER data could be sufficiently sensitive to model and map soil salinity in arid and semi-arid environments. Soil clay content had a strong correlation with the SWIR bands (i.e., Bands 5, 8 and 7) and the red band (i.e., Band 2) (Figure 4b). Lagacherie et al. [87] indicated that the waveband between 2150 nm and 2250 nm is effective for predicting clay content. Viscarra Rossel and McBratney [88] reported that clay was best predicted at 2100 nm, while the least accuracy was observed at 1600 nm for soil samples collected from a site in New South Wales, Australia. The results of OM indicated that the red band (i.e., Band 2) and NIR band (i.e., Band 3N) were the most significant wavelengths for OM prediction (Figure 4c). In addition, SWIR bands (i.e., Bands 4, 7 and 8) were found suitable for predicting OM. These results are close to the results that were reported by Barnes et al. [89]. In their study, they found wavebands between 425 nm and 695 nm to have a strong correlation with OM content if soils had the same parent material, whereas the middle infrared region should be examined if the soils were from different parent material. It is necessary to note that this relationship is valid within certain wavelengths and without white salt crusts that are typically highly reflective and dramatically increase the albedo.

Modeling Soil Properties Using MARS and PLSR Methods
Two predictive models, MARS and PLSR, based on reflectance spectra derived from ASTER imagery were applied to estimate and map soil salinity, clay content and OM in El-Tina Plain, Egypt. The results confirmed that MARS models outperformed PLSR models and showed the ability to improve soil property estimation if coupled with ASTER spectra. These results support the findings by Shepherd and Walsh [72], Bilgili et al. [54] and Nawar et al. [59], which showed that MARS predictive models provided more robust predictions of soil properties than PLSR models. This is because MARS is a non-linear and flexible modeling method, capable of fitting complex and non-linear relationships and specifying the interaction effects, as well as the linear combinations of variables [53,54,90]. Several studies showed that the prediction of soil properties implies some non-linear relationship between the measured soil values and soil reflectance spectra [26,50,52]. MARS makes it possible to reduce the effects of the multistep process and any other unknown nonlinearity and, therefore, produces superior and more effective models compared to the PLSR method [90]. It should be noted, however, that although the PLSR method assumes a linear relationship between the ECe and soil reflectance spectra, a small deviation from linearity is acceptable and can be suppressed by including additional modeling factors [91].
For soil salinity, MARS achieved high performance (R² = 0.85), better than the PLSR output (R² = 0.80), exceeding also the results acquired by Bilgili et al. [54,74] based on laboratory spectra (R² = 0.39 and 0.77, respectively), Nawar et al. [58], who used laboratory spectra and Landsat data (R² = 0.73), Shamsi et al. [92] based on MODIS data (R² = 0.39) and Allbed et al. [93], who tested IKONOS data (R² = 0.65). Excellent predictions for clay content were obtained in this study by both the MARS and PLSR models. Using laboratory spectra to estimate clay content with MARS and PLSR, Bilgili et al. [54] reported R² of 0.89 and 0.82, respectively. Furthermore, Selige et al. [34] and Gomez et al. [38] in their studies using HyMap data obtained R² values of 0.64 and 0.67, respectively. The high performance of clay content estimation in the current research with both models could be attributed to the wide range of variation of clay content, as shown in Table 2. Only the OM models reached a lower accuracy (R² = 0.79 for MARS and 0.73 for PLSR), less than those for soil salinity and clay content. This might be attributed to the relatively narrow range of data values (0.0% to 2.3%) and the complex and non-linear reflectance behavior of soil organic matter. However, OM in this study was modeled with accuracy comparable to that obtained in previous studies (e.g., [31,54]). For example, Bilgili et al. [54] obtained R² of 0.79 for OM models using the MARS or PLSR technique.
One reason for the high performance of the predictive models for clay content and soil salinity is the high correlation between these two soil properties [54,59]. Furthermore, increasing soil salinity causes a strong decrease in the soil albedo [26] registered by the sensor, and the high variation in clay content results in differences in the absorbance characteristics and, hence, in the albedo [94]. Other factors contributing to the high performance of the models were the negligible influence of moisture content on the image spectra, because the image was acquired in early summer, when the soil was mostly dry, and a relatively uniform composition of clay minerals in the study area, dominated by montmorillonite [95,96], which decreased the influence of clay mineral type on spectral variability. The good performance and accuracy of the predictive models in this study may also be attributed to the effectiveness of the atmospheric correction in removing radiometric distortions and retrieving true reflectance values [50,97]. However, the occurrence of negative predicted values for soil salinity and clay content indicates the need to improve the data processing and modeling approach for the lower range of values of these two properties.

Soil Mapping and Assessment
In this research, the predictive models coupled with ASTER spectra proved to be promising for mapping soil salinity, clay content and OM, as R² ranged from 0.73 to 0.94 and RPD from 1.93 to 4.12. These results confirmed the soil mapping potential of ASTER data indicated in several other studies [20–23]. To assess the computed soil maps, we carried out both an internal validation using LOOCV and a validation with independent samples collected close to the date of image acquisition. This follows the recommendation of Brus et al. [98], who reported that internal validation methods, such as LOOCV and n-fold cross-validation, do not provide unbiased estimates of map accuracy and proposed that independent samples be collected using a design-based sampling scheme to validate any developed maps. The results of the independent validation revealed that the maps based on the MARS models achieved very good agreement (R² = 0.89, 0.81 and 0.73 for clay content, soil salinity and OM, respectively), slightly surpassing the results of the PLSR method. While the MARS and PLSR models provided high accuracy maps of the soil properties, future studies should focus on reducing modeling errors in digital soil maps. Recently, Brodský et al. [99] outlined an approach to trace the propagation of the uncertainty associated with PLSR models to digital soil mapping (DSM) products related to soil properties at the farm scale. They concluded that the uncertainties associated with the PLSR predictions were lower than the uncertainties originating from the geostatistical prediction method. This was further demonstrated by Nelson et al. [100], who proposed a detailed analysis of the DSM error budget based on a clay content mapping example. The errors from predictive models, such as PLSR and MARS, are much lower than the errors associated with spatial prediction methods, like geostatistical models [80,100]. It is essential to obtain reliable estimates of surface soil properties and their spatial variability, as the outputs generated from the modeling process are used to understand soil quality. Subsequently, the outcomes of digital soil mapping can help regional managers to select the most suitable land management strategies to improve sustainability in the studied area.
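The validation statistics discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: R² is taken as the coefficient of determination (1 − SSres/SStot), RPD as the sample standard deviation of the observed values divided by the RMSE, and a one-band straight-line regression stands in for the MARS and PLSR models. The sample values and all function names are hypothetical.

```python
import math
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares straight line, returned as (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_predictions(xs, ys):
    """Leave-one-out cross-validation: predict each sample from all the others."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predictions.append(intercept + slope * xs[i])
    return predictions


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations over RMSE (assumed definition)."""
    return statistics.stdev(observed) / rmse(observed, predicted)


# Hypothetical samples: reflectance in one band and measured clay content (%).
band_reflectance = [0.12, 0.18, 0.22, 0.27, 0.31, 0.36, 0.40, 0.45]
clay_percent = [52.0, 47.5, 41.0, 38.5, 30.0, 27.5, 21.0, 16.5]

cv_pred = loocv_predictions(band_reflectance, clay_percent)
print(f"LOOCV R2  = {r_squared(clay_percent, cv_pred):.2f}")
print(f"LOOCV RPD = {rpd(clay_percent, cv_pred):.2f}")
```

The same `r_squared`, `rmse` and `rpd` functions can be applied unchanged to an independent validation set by passing the measured values and the values read from the map at the sample locations.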

Conclusions
This research explored a digital soil mapping methodology based on the MARS and PLSR models through a case study from the El-Tina Plain, Sinai, Egypt, for which ASTER imagery and concurrent field measurements of soil salinity, clay content and OM were available. The models of soil clay content were found to be the best, with R² = 0.94 and 0.90, RPD = 4.38 and 3.22, for MARS and PLSR, respectively. Soil salinity models had a good performance (R² = 0.85 and 0.80, RPD = 2.60 and 2.16, for MARS and PLSR, respectively). The results of OM modeling were significantly worse, but still acceptable for MARS (R² = 0.79 and RPD = 2.2). Soil maps computed using ASTER data and the elaborated models were validated with independent soil samples, providing very good results for clay content (R² = 0.89 and 0.87 for MARS and PLSR models, respectively) and soil salinity (R² = 0.81 and 0.78 for MARS and PLSR, respectively). The accuracies of the OM maps were acceptable, with R² of 0.73 for MARS and 0.71 for PLSR. Of the two modeling approaches used to map soil properties in the study area, MARS, as a non-linear model, provided better estimates than PLSR for all mapped soil properties, in particular reducing the deviations between predicted and measured soil values at higher ranges. The results of the current research indicated that MARS and PLSR models coupled with ASTER spectra are promising tools for estimating and mapping soil properties. However, factors influencing modeling and mapping accuracy, for example types of salt minerals, types of clay minerals and soil moisture, should be investigated further in future research.
This study demonstrated how predictive models coupled with ASTER spectra could successfully map soil properties over a large agricultural area, thus helping decision makers in arid regions to better manage land resources. The main strengths of this method, besides the high accuracy of modeling and mapping of the selected soil properties, are the lower costs compared to conventional field-based approaches and the ease of transferring models to image pixels for regional mapping. Further, because it relies on image spectra, the approach is robust enough to be used with various sensors for the mapping and monitoring of agricultural arid areas, for example the newly launched Landsat 8 or very high resolution multispectral images, such as those provided by the WorldView-3 satellite.
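The transfer of a fitted model to image pixels mentioned above can be sketched for the linear (PLSR-type) case, where the model reduces to an intercept plus one regression coefficient per band. The sketch below is illustrative only: the coefficients, the three-band toy image (ASTER VNIR and SWIR provide nine bands) and the function names are hypothetical and are not taken from the models in this study.

```python
def predict_pixel(spectrum, intercept, coefficients):
    """Linear prediction for one pixel from its per-band reflectances."""
    if len(spectrum) != len(coefficients):
        raise ValueError("one coefficient per band is required")
    return intercept + sum(b * r for b, r in zip(coefficients, spectrum))


def predict_map(image, intercept, coefficients):
    """Apply the model to every pixel.

    image is a nested list indexed as image[row][col][band];
    the result is indexed as soil_map[row][col].
    """
    return [
        [predict_pixel(pixel, intercept, coefficients) for pixel in row]
        for row in image
    ]


# Hypothetical model: intercept and one coefficient for each of three bands.
intercept = 10.0
coefficients = [20.0, -5.0, 40.0]

# Toy 2 x 2 reflectance image with three bands per pixel.
image = [
    [[0.10, 0.20, 0.30], [0.15, 0.20, 0.25]],
    [[0.30, 0.10, 0.20], [0.05, 0.40, 0.10]],
]

soil_map = predict_map(image, intercept, coefficients)
for row in soil_map:
    print([round(value, 2) for value in row])
```

A MARS model would be applied in the same pixel-by-pixel way, with the weighted band sum replaced by the sum of its fitted basis functions.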

Figure 1. Study area shown in false color composite (RGB 132) of an ASTER image in 2006, as well as soil sample locations.

Figure 3. Comparison of field-measured and ASTER VNIR and SWIR reflectances at (a–f) corresponding localities and the linear regression models between field and ASTER spectra.

Figure 4. PLSR coefficients of the nine bands of ASTER data for: (a) soil salinity; (b) clay content; and (c) OM.

Figure 6. Quantitative maps of (a) ECe, (b) clay content and (c) OM based on the PLSR (left) and MARS (right) models.

Table 1. Distribution of soil samples in main soil units and land use types.

Table 2. Descriptive statistics of the soil parameters. EC, electrical conductivity; OM, organic matter.

Table 3. Cross-validation results of PLSR and MARS models for the estimation of soil properties based on ASTER spectra. Nl, number of latent variables; Nb, number of basis functions.

Table 4. Class areas for ECe, clay content and OM for the PLSR and MARS maps.