Soil Salinity Retrieval from Advanced Multi-Spectral Sensor with Partial Least Square Regression

Improper use of land resources may result in severe soil salinization. Timely monitoring and early warning of soil salinity are urgently needed for sustainable development. This paper addresses the possibility and potential of the Advanced Land Imager (ALI) for mapping soil salinity. In situ field spectra and soil salinity data were collected in the Yellow River Delta, China. Statistical analysis demonstrated the importance of the ALI blue and near-infrared (NIR) bands for soil salinity. A partial least square regression (PLSR) model was established between soil salinity and ALI-convolved field spectra. The model estimated soil salinity with an R² (coefficient of determination), RPD (ratio of prediction to deviation), bias, standard deviation (SD) and root mean square error (RMSE) of 0.749, 3.584, 0.036 g·kg⁻¹, 0.778 g·kg⁻¹ and 0.779 g·kg⁻¹, respectively. The model was then applied to atmospherically corrected ALI data. Soil salinity was underestimated for moderately saline (soil salinity within 2–4 g·kg⁻¹) and highly saline (soil salinity > 4 g·kg⁻¹) soils. The underestimation increased with the degree of soil salinization, with a maximum value of ~4 g·kg⁻¹. The major contribution to the underestimation (>80%) may result from data inaccuracy rather than model ineffectiveness. Uncertainty analysis confirmed that improper atmospheric correction contributed a very conservative uncertainty of 1.3 g·kg⁻¹. Field sampling within remote sensing pixels was probably the major source of the underestimation. Our study demonstrates the effectiveness of the PLSR model in retrieving soil salinity from new-generation multi-spectral sensors. This is very valuable for achieving worldwide soil salinity mapping with low cost and considerable accuracy.

Open Access, Remote Sens. 2015, 7


Introduction
Rapid population growth and economic development demand effective use of land resources. Excessive or improper land use may result in severe soil degradation [1]. Soil salinization is one of the major problems occurring in irrigated drylands [2,3]. It may trigger other problems such as soil dispersion, erosion or desertification [4,5], which are harmful to crop yields [6,7] and human health [8]. As a result, monitoring of soil salinity is urgently needed for salinization control [9,10].
Remote sensing possesses unique advantages over conventional proximal approaches in monitoring regional soil salinity [11–13]. It provides an inexpensive means of mapping large-scale soil salinity and a synoptic overview of soil salinization in remote regions. To date, multi-spectral remote sensing data have been widely used in soil salinity studies. Since the 1970s, the Landsat Multi-Spectral Scanner (MSS)/Thematic Mapper (TM)/Enhanced Thematic Mapper Plus (ETM+) have acquired over four decades of global data, invaluable for mapping salt-contaminated areas [14,15]. The Earth Observing-1 (EO-1) Advanced Land Imager (ALI) was launched to validate and demonstrate new techniques for the Landsat 8 Operational Land Imager (OLI) [16,17]. These advanced multi-spectral sensors provide new opportunities for soil salinity mapping.
Surface reflectivity may vary with the level of soil salinization [18], which allows image interpretation [19,20] or classification [21,22]. Statistical analysis can also be used to quantify soil salinity with multi-spectral remote sensing data. In most cases, multiple bands are transformed into an index that is more sensitive to soil salinity than a single band [23,24]. The most frequently used indices include the Normalized Difference Vegetation Index (NDVI) and various salinity indices [25–31]. For single bands, Shrestha [25] compared six Landsat ETM+ bands and concluded that the short-wave infrared (SWIR) band was most closely correlated with soil salinity. Soil salinity was positively correlated with satellite signals at ETM+ band 7 (Table 1). The correlation coefficient (r) and coefficient of determination (R²) between predicted and measured soil salinity were only 0.484 and 0.234. In general, the predictive accuracy was poor for these models with only one independent variable. As a result, a large number of studies selected band combinations to improve salinity detection. Douaoui et al. [26] demonstrated that soil salinity was moderately (r = 0.43–0.50) correlated with a combination of red and green bands, but poorly (r = 0.00–0.09) correlated with vegetation indices. The poor performance of vegetation indices (VIs) was also confirmed by Aldakheel [27]. Soil salinity was estimated with individual relationships for subareas (R² = 0.20–0.50), whereas there was only a very weak correlation between soil salinity and NDVI for the whole study area [27]. In contrast, Allbed et al. [28] reported high correlations between soil salinity and VIs (R² = 0.51–0.78).
In general, soil salinity is negatively related to VIs. However, the accuracy of VIs may be case and site dependent. Besides a variety of VIs, Madani [29] also found a strong relationship between soil salinity and the difference between the TM SWIR band and the near-infrared (NIR) band. In addition, Bannari [30] reported that the index composed of SWIR bands was the best indicator of soil salinity. It is likely that soil salinity is also sensitive to SWIR bands, in addition to the red and NIR bands that compose VIs. By inference, soil salinity could potentially be estimated from a combination of visible, NIR and SWIR bands. Odeh and Onus [31] also estimated soil salinity with a linear regression of two indicators. However, the R² value was only 0.32. In general, soil salinity can be quantitatively estimated from a variety of indices generated from multi-spectral data; however, the accuracy has not yet been well documented.
Since the 1980s, imaging spectroscopy has been applied to the retrieval of soil salinity [32–38]. Salinized soil has distinct spectral features in the reflective solar bands, related to water in hydrated minerals. Absorption features were detected at 505 nm, 920 nm, 1415 nm, 1915 nm and 2205 nm for saline soils, in addition to 680 nm, 920 nm and 1780 nm for soil scalds and highly salinized soils [12,33]. With statistical models, the dependence of soil salinity on diagnostic spectral features can be established. Ben-Dor et al. [34] performed a multiple regression analysis to predict soil salinity using 38 selected hyperspectral bands. Farifteh et al. [35] compared partial least square regression (PLSR) and artificial neural network (ANN) models for retrieving soil salinity from hyperspectral images, and concluded that the two models were comparable. Weng et al. [36] also applied a PLSR model to EO-1 Hyperion images on a pixel-by-pixel basis. The PLSR model is effective in dealing with strong collinearity between independent variables [39], and has been successfully used in soil salinity retrievals from hyperspectral data [35,36,40]. Moreover, new technologies have been developed, for example, the use of sub-surface-to-surface correlation, frequency domain electromagnetic induction (FDEM) and ground penetrating radar (GPR) [41,42]. However, complicated atmospheric correction and high cost may preclude the use of hyperspectral remote sensing for mapping large-scale soil salinity [37,43]. Therefore, existing studies are generally confined to experimental analysis [44,45].
Hick and Russell [46] stated that multi-spectral remote sensing may not be optimal for soil salinity retrieval. To the best of our knowledge, only a few studies have correlated soil salinity with multi-spectral bands. For instance, Fan et al. [47] used Landsat TM and auxiliary elevation data to retrieve soil salinity. However, their models were established with moderate correlation coefficients of 0.504–0.736. Notably, recently launched sensors have several advantages over Landsat sensors, especially in spectral resolution and signal-to-noise ratio (Table 1). ALI has more bands in the blue, NIR and SWIR ranges of the electromagnetic spectrum. The improved quality may benefit high-accuracy soil salinity mapping. In our recent study, soil salinity was successfully retrieved from ALI data with a generalized regression neural network (GRNN) model [48]. In addition to this nonlinear model, a linear model is required to disclose explicitly the relationship between soil salinity and ALI bands.
This study investigates the possibility and potential of retrieving soil salinity from advanced multi-spectral sensor data with a linear regression model. For this purpose, an ALI image was acquired over the Yellow River Delta (YRD), China, where long-standing soil salinization has been reported due to both primary and secondary salinization [21,49,50]. First, in situ data of soil spectra and salinity were statistically analyzed to find optimal spectral bands for soil salinity retrieval. Second, a PLSR model was established based on soil salinity measurements and ALI-convolved field spectra. The model was then applied to atmospherically corrected ALI data on a pixel-by-pixel basis. Finally, errors associated with salinity retrieval were evaluated with a quantitative sensitivity analysis. Our study may provide an inexpensive but reliable approach for quantitative monitoring of soil salinization at a large scale. The paper is organized as follows: Section 2 describes the study materials and methodology; Section 3 reports the main results; Section 4 discusses our findings; and Section 5 concludes.

Study Area
The Yellow River Delta is located in the northeast of Shandong Province, China (Figure 1). The average annual precipitation is 530–630 mm, of which 70% occurs in the summer season (May–July). The average potential evaporation is 1900–2400 mm, approximately three times the precipitation, which favors natural soil salinization. In addition, the groundwater is shallow, with an average depth of 1.14 m [36]. According to the Food and Agriculture Organization (FAO) classification, dominant soils include Gleyic Solonchaks, Calcaric Fluvisols and Salic Fluvisols [51]. Major mineral salts are NaCl and MgCl₂ [33]. Natural vegetation consists of salt-tolerant herbs, grasses and shrubs. A part of the YRD was selected as our study area, corresponding to an ALI scene (Figure 1). It covers a spatial extent within 37.2°N–38.2°N and 118.7°E–119.3°E, with a total area of about 3200 km². The western part is primarily agricultural cropland, especially along the Yellow River, and the eastern part is the offshore area of the Bohai Sea. There are some oil fields in the north and some salt pans in the south.

Remote Sensing Data
An EO-1 ALI Level 1B multi-spectral image was used for soil salinity retrieval. It was acquired on 14 April 2005 and obtained through the Data Acquisition Requests (DAR) System at [52]. The sensor has nine multi-spectral bands (30 m) in the solar reflective range. Table 1 shows the spectral bands of the Landsat 7 ETM+, EO-1 ALI and Landsat 8 OLI sensors [16,17] to highlight the unique spectral advantages of the ALI. The ALI bands are centered at wavelengths of 0.443 μm (Band 1p), 0.483 μm (Band 1), 0.565 μm (Band 2), 0.662 μm (Band 3), 0.790 μm (Band 4), 0.868 μm (Band 4p), 1.250 μm (Band 5p), 1.650 μm (Band 5) and 2.215 μm (Band 7). It has three more spectral bands (Bands 1p, 4p and 5p) and narrower spectral bandwidths than the ETM+. Compared to the OLI, the ALI has slightly broader spectral bandwidths, with an additional NIR band (Band 4). ALI L1B data are provided as digital numbers (DN). The DN values were converted to top-of-atmosphere (TOA) reflectance with calibration coefficients provided in [53] for the nine spectral bands. The conversion can be described as follows:

$$\rho_{TOA} = G \cdot DN + B \tag{1}$$

where ρTOA denotes the TOA reflectance, DN denotes the ALI digital number, and G and B denote the gain and offset coefficients for sensor calibration. TOA reflectance was an input for atmospheric correction. The TOA NDVI can be obtained from the following equation:

$$NDVI = \frac{\rho_{TOA,4} - \rho_{TOA,3}}{\rho_{TOA,4} + \rho_{TOA,3}} \tag{2}$$

where NDVI denotes the normalized difference vegetation index, and ρTOA,3 and ρTOA,4 denote the TOA reflectance for ALI bands 3 and 4. Non-soil ALI pixels were removed to ensure a reasonable soil salinity retrieval. Water surfaces, vegetation cover and artificial objects were discarded with an NDVI-based decision tree method [54].
In most cases, water surfaces had a negative NDVI value, and they were removed with NDVI < 0.03. The threshold was set positive so that shallow water areas were also removed. Most artificial objects were spectrally flat with NDVI values close to zero, and they were removed with a conservative threshold of NDVI < 0.05. Vegetation had a high NDVI value, and it was removed with NDVI > 0.14. The remaining pixels mainly comprised bare soils at different saline levels. The average TOA reflectance was 16.83%, 15.54%, 15.71%, 16.58%, 19.84%, 19.89%, 20.63%, 19.76% and 15.79% for the nine bands, respectively. These values were used for the simulation described in Section 2.3.4.
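The preprocessing chain described above (linear DN-to-reflectance calibration, TOA NDVI from bands 3 and 4, and threshold-based masking) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not necessarily the implementation used in this study: the function names are hypothetical, the thresholds are applied in the order in which the text lists them, and a zero denominator is mapped to NDVI = 0 for convenience. Only the three threshold values and the mean band reflectances in the usage lines come from the text.

```python
def dn_to_toa_reflectance(dn, gain, offset):
    """Linear calibration per band: TOA reflectance = gain * DN + offset."""
    return gain * dn + offset


def ndvi(red, nir):
    """NDVI from red (ALI band 3) and NIR (ALI band 4) TOA reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat an undefined ratio as NDVI = 0
    return (nir - red) / total


def classify_pixel(ndvi_value):
    """Threshold tree using the NDVI limits quoted in the text."""
    if ndvi_value < 0.03:
        return "water"
    if ndvi_value < 0.05:
        return "artificial"
    if ndvi_value > 0.14:
        return "vegetation"
    return "bare_soil"


# Usage with the mean TOA reflectance of bands 3 and 4 reported in the text.
mean_ndvi = ndvi(red=0.1658, nir=0.1984)
label = classify_pixel(mean_ndvi)
```

With the reported mean reflectances, the NDVI is about 0.09, which falls inside the 0.05–0.14 window retained as bare soil.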
Auxiliary remote sensing data include the TERRA Moderate Resolution Imaging Spectroradiometer (MODIS) atmosphere products. TERRA passed over the study area nearly simultaneously with EO-1, allowing use of the products for atmospheric correction of the ALI image. The data include the aerosol (MOD04) and atmosphere profile (MOD07) products, available at [55]. MOD04 was used to extract aerosol optical thickness (AOT) at 550 nm (AOT550), and MOD07 to extract total ozone concentration (TOC) and total precipitable water vapor content (TPW). To match the ALI spatial resolution, these data were converted to vector form and interpolated to 30 m using the inverse distance weighting (IDW) algorithm. For each 30 m × 30 m grid cell, the four surrounding MODIS product values were used for the interpolation. The interpolated values and the MODIS product values were compared using R² and root mean square error (RMSE), as defined in Equations (6) and (10). Soil salinity uncertainties related to these errors are discussed in Section 4.3.
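The interpolation step can be illustrated with a short inverse distance weighting routine that uses the four nearest product values for each target location. This is a sketch only: the distance exponent of 2 is an assumed default (the text does not state the exponent), and the function name and sample coordinates are hypothetical.

```python
import math


def idw_four_nearest(x, y, samples, power=2.0):
    """Estimate a value at (x, y) from the four nearest samples.

    samples: list of (x_s, y_s, value) tuples.
    power: distance exponent (assumed to be 2 here).
    """
    ranked = sorted(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))[:4]
    weighted_sum = 0.0
    weight_total = 0.0
    for xs, ys, value in ranked:
        dist = math.hypot(xs - x, ys - y)
        if dist == 0.0:
            return value  # target coincides with a sample point
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical AOT values at the corners of a unit cell.
corners = [(0, 0, 0.4), (1, 0, 0.5), (0, 1, 0.6), (1, 1, 0.7)]
aot_center = idw_four_nearest(0.5, 0.5, corners)
```

At the cell center all four distances are equal, so the estimate reduces to the plain average of the four values.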

Field Campaign and Laboratory Analysis
A total of 68 topsoil (depth < 5 cm) samples were collected during a field campaign between 11 and 17 April 2005. Meteorological conditions were favorable during the campaign, with low atmospheric humidity, low wind speed and long hours of sunshine (Table 2). Sampling sites were determined with a false color (R = band 4, G = band 3, B = band 2) Landsat TM image acquired on 5 March 2005 (Figure 1). According to our knowledge and previous studies, the strip-like transect covered soil salinity varying from higher levels in the south and north to lower levels in between. The number of soil samples in each salinization level was balanced. At each sampling site, spectral reflectance was measured with an ASD FieldSpec Pro FR spectroradiometer [56]. The spectroradiometer's probe was set to a field-of-view angle within 15° and held vertically above the ground surface. Before each measurement, a Spectralon® reference panel was measured. Soil reflectance was defined as the ratio of the radiation reflected from the soil surface to that from the reference panel. Within each 90 m × 90 m square (3 × 3 ALI pixels), surface reflectance was measured at four homogeneous targets and averaged to represent the spectrum of each sampling site. The homogeneous targets were determined through a cluster analysis of the TM image and in situ visual verification. Correspondingly, topsoil samples were taken and sealed in plastic bags. At the same time, a handheld GPS was used to record the coordinates of each sampling site.
Field spectra data were provided at 1 nm intervals within 350–2500 nm. Due to strong atmospheric absorption, data quality was affected at spectral bands within 1355–1410 nm, 1810–1940 nm and 2451–2500 nm. The reflectance values in these ranges deviated from the normal range of 0%–100% and were omitted. As a result, a total of 1914 bands remained for further analysis. Since the ALI spectral bands covered none of the discarded wavelengths, the omission had no impact on ALI-based salinity modelling. To reduce high-frequency noise caused by either atmospheric effects or instrument response, a five-point smoothing algorithm was applied to the spectra data. Subsequently, the smoothed data were resampled according to the ALI spectral band responses. The resampled spectra were used as x-variables for soil salinity modelling. A standard laboratory method [57] was used to measure soil salinity in the soil samples. The concentrations of Cl⁻, SO₄²⁻, CO₃²⁻, HCO₃⁻, K⁺, Na⁺, Ca²⁺ and Mg²⁺ were measured using 1:5 soil-water mixtures. Soil salinity content (SSC) was defined as the total amount of the above eight ions. It was used as the y-variable for modelling the salinity-reflectance relationship. In terms of SSC value, the soil salinization level was classified as non-saline, slightly saline, moderately saline or highly saline. The ranges were 0 < SSC < 1 g·kg⁻¹, 1 < SSC < 2 g·kg⁻¹, 2 < SSC < 4 g·kg⁻¹ and SSC > 4 g·kg⁻¹, respectively, according to the coastal saline soil classification in China [58]. Field spectra were averaged within each saline level to find relations between soil salinity and soil reflectance.
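The spectral preprocessing and the salinity classes described above can be sketched as follows. Several details are assumptions made for illustration: the five-point smoothing is taken to be a centered moving average whose window shrinks at the spectrum edges, the band response is a hypothetical boxcar (the actual ALI relative spectral responses would be used in practice), and samples lying exactly on a class boundary are assigned to the higher class. The excluded wavelength ranges and class limits are those given in the text.

```python
def retained_wavelengths():
    """Wavelengths (nm) kept after dropping the three absorption ranges."""
    return [wl for wl in range(350, 2501)
            if not (1355 <= wl <= 1410
                    or 1810 <= wl <= 1940
                    or 2451 <= wl <= 2500)]


def smooth_five_point(values):
    """Centered 5-point moving average; the window shrinks at the edges."""
    n = len(values)
    smoothed = []
    for i in range(n):
        window = values[max(0, i - 2):min(n, i + 3)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def boxcar(lower_nm, upper_nm):
    """Hypothetical flat band response between two wavelengths."""
    return lambda wl: 1.0 if lower_nm <= wl <= upper_nm else 0.0


def convolve_to_band(wavelengths, reflectance, response):
    """Response-weighted mean reflectance for one sensor band."""
    weights = [response(wl) for wl in wavelengths]
    return sum(w * r for w, r in zip(weights, reflectance)) / sum(weights)


def salinity_class(ssc):
    """Salinization level from SSC in g/kg (boundary handling assumed)."""
    if ssc < 1:
        return "non-saline"
    if ssc < 2:
        return "slightly saline"
    if ssc < 4:
        return "moderately saline"
    return "highly saline"


n_bands = len(retained_wavelengths())
```

Counting the retained wavelengths this way gives 2151 − 56 − 131 − 50 = 1914, which agrees with the number of bands stated in the text.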

Atmospheric Correction
Remote sensing data are generally affected by the atmosphere along the path between the ground target and the satellite sensor. For a solar reflective band, the main atmospheric effects include atmospheric absorption and scattering. Solar radiation is partially absorbed by ozone in the visible bands, and by water vapor in the NIR bands. Aerosols affect the entire solar reflective range, with the strongest impact at short wavelengths. To retrieve surface reflectance, TOA reflectance data need to be corrected for ozone, water vapor and aerosol effects. A general atmospheric correction for a Lambertian and uniform surface can be written as follows [60]:

$$\rho_{TOA} = T_g \left[ \rho_a + \frac{T_s T_v \, \rho}{1 - S \rho} \right] \tag{3}$$

where ρTOA denotes the TOA reflectance, ρa denotes the atmospheric path radiance due to molecular scattering and aerosol scattering, ρ denotes the surface reflectance, Tg denotes the atmospheric absorption due to ozone and water vapor, Ts and Tv respectively denote the downward and upward atmospheric scattering due to atmospheric molecules and aerosols, and S denotes the atmospheric spherical albedo. Variables in Equation (3) can be related to sensor calibration and atmospheric parameters. For their detailed descriptions, see [58]. Sensor calibration coefficients are available from [53] and atmospheric parameters from MODIS atmosphere products [61]. With these known parameters, surface reflectance can be obtained by solving Equation (3). A mathematical form of the surface reflectance is as follows:

$$\rho = \frac{A}{1 + S A}, \qquad A = \frac{\rho_{TOA} / T_g - \rho_a}{T_s T_v} \tag{4}$$
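The inversion from TOA to surface reflectance can be checked numerically with the sketch below. It assumes the standard Lambertian form ρTOA = Tg[ρa + TsTvρ/(1 − Sρ)], using the symbols defined in the text; the function names and the atmospheric values in the usage lines are hypothetical and chosen only to show that the forward and inverse relations are mutually consistent.

```python
def toa_from_surface(rho, t_g, rho_a, t_s, t_v, s):
    """Forward model: TOA reflectance from surface reflectance."""
    return t_g * (rho_a + t_s * t_v * rho / (1.0 - s * rho))


def surface_from_toa(rho_toa, t_g, rho_a, t_s, t_v, s):
    """Inverse model: surface reflectance from TOA reflectance."""
    a = (rho_toa / t_g - rho_a) / (t_s * t_v)
    return a / (1.0 + s * a)


# Illustrative (hypothetical) atmospheric terms for one band.
atm = dict(t_g=0.95, rho_a=0.04, t_s=0.90, t_v=0.92, s=0.10)
rho_toa = toa_from_surface(0.20, **atm)
rho_back = surface_from_toa(rho_toa, **atm)  # recovers the input 0.20
```

In practice the five atmospheric terms would be computed per band and per pixel from the interpolated ozone, water vapor and aerosol fields rather than fixed as constants.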

Partial Least Square Regression (PLSR) Model and Performance Assessment
The PLSR model has been widely used for soil salinity retrieval owing to its ability to handle highly collinear data [35,36,39,40]. To identify significant x-variables (wavelengths) for soil salinity retrieval, a PLSR model was developed based on Martens' Uncertainty Test [62]. The test is a significance testing method implemented in The Unscrambler 9.7 software [63]. It can identify dominant sources of instability in regression modelling and quantify the stability of the regression results. It also allows identification of perturbing variables and selection of significant variables. In this study, all 68 samples were used for the uncertainty test, with soil salinity as the y-variable and the corresponding 1914-band spectra data as x-variables. The output was the set of important wavelengths for soil salinity retrieval.
The PLSR model can be established for soil salinity retrieval based on ALI-convolved field spectra data. The model takes the following form:

$$SSC = w_0 + \sum_{i=1}^{N} w_i \, \rho_i \tag{5}$$

where SSC denotes soil salinity content, ρi denotes the ALI-convolved spectral reflectance of the ith band, wi denotes the model coefficient for the ith band, w0 denotes the constant term, and N denotes the number of spectral bands. In this study, two-thirds (45) of the 68 samples were used for modelling, and the remaining 23 for validation. The x-variables were the nine-band ALI-convolved spectra data, and the y-variable was soil salinity. Model performance can be evaluated with R², ratio of prediction to deviation (RPD), Bias, standard deviation (SD) and RMSE. The R² and RPD indicate the strength of the statistical correlation between measured and predicted values. The model can be considered accurate for R² > 0.91/RPD > 2.5, good for 0.82 < R² < 0.90/RPD > 2, moderate for 0.66 < R² < 0.81/RPD > 1.5, and poor for 0.5 < R² < 0.65 [35]. The Bias measures the mean difference between predicted and measured values, and the SD represents the random component of the total uncertainty (RMSE) [64]. These metrics are defined in Equations (6)–(10), in which y denotes the measured values with a mean value of ȳ, ŷ denotes the predicted values, and N denotes the number of samples.
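To make the model form concrete, the following sketch fits a single-response PLS regression with the classical NIPALS algorithm and returns the constant term and per-band coefficients of a linear model. It is not the software used in this study (The Unscrambler); the function names and the toy data are hypothetical, and the number of latent components is left as a user choice because the text does not report it.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """NIPALS PLS1. Returns (w0, coefficients) of y = w0 + sum_i w_i * x_i."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xk = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    yk = [v - y_mean for v in y]                                # centered y
    loadings, rotations, y_weights = [], [], []
    for _ in range(n_components):
        w = [_dot([row[j] for row in Xk], yk) for j in range(m)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:
            break  # nothing left in y to explain
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in Xk]            # scores
        tt = _dot(t, t)
        p = [_dot([row[j] for row in Xk], t) / tt for j in range(m)]
        c = _dot(yk, t) / tt
        # Deflate X and y by the component just extracted.
        Xk = [[row[j] - t[i] * p[j] for j in range(m)]
              for i, row in enumerate(Xk)]
        yk = [yk[i] - c * t[i] for i in range(n)]
        # Rotation vector maps the original centered X directly to scores.
        r = list(w)
        for p_prev, r_prev in zip(loadings, rotations):
            proj = _dot(p_prev, w)
            r = [r[j] - proj * r_prev[j] for j in range(m)]
        loadings.append(p)
        rotations.append(r)
        y_weights.append(c)
    coef = [sum(c * r[j] for c, r in zip(y_weights, rotations))
            for j in range(m)]
    w0 = y_mean - _dot(x_mean, coef)
    return w0, coef


def pls1_predict(X, w0, coef):
    return [w0 + _dot(row, coef) for row in X]


# Toy data: three hypothetical "bands" and an exactly linear response.
X_toy = [[1, 2, 0], [2, 0, 1], [0, 1, 3], [3, 1, 1], [1, 3, 2]]
y_toy = [1 + 2 * a - b + 0.5 * c for a, b, c in X_toy]
w0_toy, coef_toy = pls1_fit(X_toy, y_toy, n_components=3)
```

When the number of components equals the number of x-variables, as in the toy call, PLS reproduces the ordinary least-squares solution; with fewer components it trades some fit for stability under collinearity, which is the reason for using it on strongly correlated spectral bands.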

Soil Salinity Mapping and Error Analysis
Soil salinity can be mapped by applying Equation (5) and substituting the x-variables with the corresponding ALI-band surface reflectance data. Retrieved soil salinity can be extracted at the GPS coordinates of the sampling sites, and compared with measured and modelled values. For convenience, the retrieved soil salinity was labeled as SSCimage. Similarly, the measured and modelled values were labeled as SSCmeasure and SSCmodel. The total SSC error was subsequently determined as the difference between the retrieved and measured values (SSCimage − SSCmeasure). The total error can be separated into model- and data-related errors. The model-related error is the difference between the modelled and the measured values (SSCmodel − SSCmeasure), which indicates the performance of the PLSR model. The data-related error is the difference between the retrieved and the modelled values (SSCimage − SSCmodel), which indicates the discrepancy between the field measurements and the ALI-retrieved values. The total errors can be plotted against the measured soil salinity to detect major contributors to the retrieval error in soil salinity.
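The error decomposition defined above is additive by construction, as the following small sketch shows. The function name and the sample values are hypothetical; only the three definitions come from the text.

```python
def decompose_errors(ssc_measure, ssc_model, ssc_image):
    """Return (total, model_related, data_related) SSC errors in g/kg."""
    total = ssc_image - ssc_measure
    model_related = ssc_model - ssc_measure
    data_related = ssc_image - ssc_model
    return total, model_related, data_related


# Hypothetical highly saline site: measured 4.0, modelled 3.6, retrieved 2.1.
total, model_related, data_related = decompose_errors(4.0, 3.6, 2.1)
# total equals model_related + data_related (up to floating-point rounding)
```

Because the two components always sum to the total error, their relative sizes indicate whether the regression model or the image data dominate the retrieval error at a given site.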

Uncertainty Analysis
According to Equations (1) and (4), ALI surface reflectance is an analytic function of sensor calibration (CAL), TOC, TPW and AOT. Incorporating Equation (5), the ALI-derived soil salinity is also an analytic function of CAL, TOC, TPW and AOT, and can be expressed as follows:

$$SSC = f(CAL, TOC, TPW, AOT) \tag{11}$$

This indicates that the uncertainty associated with soil salinity retrieval is related to the uncertainties in sensor calibration and in the ozone, water vapor and aerosol products. The uncertainty due to each component is obtained from the derivative of Equation (11) with respect to CAL, TOC, TPW and AOT. Assuming that the components are independent, the total uncertainty is given by Equation (12), in which USSC denotes the soil salinity uncertainty, reCAL, reTOC, reTPW and reAOT denote the relative uncertainties of CAL, TOC, TPW and AOT, and f denotes Equation (11). Equation (12) can be divided into four components: CAL-, TOC-, TPW- and AOT-related uncertainty. For the uncertainty analysis, TOA reflectance was set to the mean TOA soil reflectance. TOC, TPW and AOT were set to the mean values of the corresponding MODIS atmosphere products. For the sensitivity analysis, TOA reflectance was varied by ±5% (in absolute amount) for each band. This range was determined from the variation range of the field spectra data. Low reflectance stands for slightly saline soil, and high reflectance for highly saline soil. TOC was varied by ±0.1 cm·atm⁻¹, TPW by ±0.5 g·cm⁻², and AOT by ±0.5. High TPW and AOT values represent humid and turbid air, and low TPW and AOT values represent dry and clear air. The relative uncertainty of each component was set within 0%–20% to represent a low-to-high degree of uncertainty.
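One common way to combine independent error sources is first-order propagation, in which each component contributes |∂f/∂x| · x · re_x and the contributions are added in quadrature. The sketch below implements that scheme with central finite differences. It is offered as an assumed reading of the uncertainty model, not as the exact expression used in this study; the retrieval function `toy_retrieval`, its coefficients and the calibration mean are hypothetical placeholders for the full chain of calibration, atmospheric correction and regression.

```python
import math


def propagate_uncertainty(f, means, rel_unc, step=1e-4):
    """First-order uncertainty of f under independent inputs.

    means:   dict of input name -> mean value
    rel_unc: dict of input name -> relative uncertainty (e.g. 0.1 for 10%)
    Returns (total uncertainty, dict of per-input contributions).
    """
    contributions = {}
    for name, x0 in means.items():
        h = step * (abs(x0) if x0 != 0 else 1.0)
        upper = dict(means)
        upper[name] = x0 + h
        lower = dict(means)
        lower[name] = x0 - h
        dfdx = (f(**upper) - f(**lower)) / (2.0 * h)  # central difference
        contributions[name] = abs(dfdx) * abs(x0) * rel_unc[name]
    total = math.sqrt(sum(c * c for c in contributions.values()))
    return total, contributions


def toy_retrieval(cal, toc, tpw, aot):
    """Hypothetical stand-in for SSC = f(CAL, TOC, TPW, AOT)."""
    return 3.0 * cal + 4.0 * aot


# Mean TOC, TPW and AOT as reported for the interpolated MODIS products;
# the calibration mean of 1.0 is an illustrative placeholder.
means = {"cal": 1.0, "toc": 0.43, "tpw": 1.08, "aot": 0.58}
rel_unc = {"cal": 0.10, "toc": 0.10, "tpw": 0.10, "aot": 0.10}
u_total, u_parts = propagate_uncertainty(toy_retrieval, means, rel_unc)
```

Sweeping each relative uncertainty from 0 to 0.2 while holding the others fixed would give one curve per component, which is the kind of low-to-high comparison described in the text.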

Statistical Descriptions of Field Spectra Data
Figure 2 shows the relationship between soil salinity and field spectra data. The correlation coefficient ranged between −0.534 and 0.322, with the maximum positive r value at 350 nm and the maximum negative r value at 1354 nm. Soil salinity was positively correlated with spectral reflectance for wavelengths between 350 nm and 523 nm, and the correlation coefficient decreased with wavelength. For longer wavelengths, soil salinity was negatively correlated with spectral reflectance. The correlation coefficient changed from almost zero at 523 nm to about −0.5 at approximately 1000 nm. It remained stable for wavelengths between 1000 nm and 2450 nm. Figure 2 also illustrates the spectral positions of the ALI spectral bands. ALI bands 1p and 1 were located in the blue region, and soil spectral reflectance at these bands was positively correlated with soil salinity. The other bands were located at longer wavelengths, where soil reflectance was negatively correlated with soil salinity. Figure 3 shows the mean spectral reflectance for soils at different saline levels. There were 19, 14, 21 and 14 samples for SSC within 0–1 g·kg⁻¹, 1–2 g·kg⁻¹, 2–4 g·kg⁻¹ and SSC > 4 g·kg⁻¹, respectively. In general, soil reflectance decreased with the degree of soil salinization. The trend was more pronounced for infrared bands than for visible bands. The spectral reflectance of non-saline and slightly saline soils was similar over a wide range of wavelengths, indicating that it is generally difficult to discriminate between them. Notably, their differences became larger at wavelengths longer than approximately 1900 nm. Highly saline soil had higher spectral reflectance than moderately saline soil at shorter wavelengths, and its reflectance was comparable with that of non-saline and slightly saline soils. These spectral behaviors provided a basis for quantitative retrieval of soil salinity.
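A correlation spectrum of this kind can be computed by evaluating the Pearson correlation coefficient between soil salinity and reflectance separately at each wavelength. The sketch below assumes Pearson's r is the coefficient meant in the text; the function names and the three-sample toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_spectrum(ssc, spectra):
    """r between SSC and reflectance at each band.

    spectra: one reflectance list per sample, all in the same band order.
    """
    n_bands = len(spectra[0])
    return [pearson_r([s[j] for s in spectra], ssc) for j in range(n_bands)]


# Toy example: reflectance rises with salinity in band 0, falls in band 1.
ssc_toy = [1.0, 2.0, 3.0]
spectra_toy = [[0.10, 0.30], [0.12, 0.25], [0.14, 0.20]]
r_toy = correlation_spectrum(ssc_toy, spectra_toy)
```

In the toy data the first band yields r close to +1 and the second r close to −1, mirroring the sign change between the blue bands and the longer wavelengths described above.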

Statistical Descriptions of ALI-Convolved Field Spectra Data
Soil salinity ranged between 0.39 and 6.25 g·kg⁻¹, with a mean and SD of 2.5 g·kg⁻¹ and 1.6 g·kg⁻¹. Table 3 shows descriptive statistics of soil salinity and ALI-convolved field spectra. For SSC < 4 g·kg⁻¹, spectral reflectance decreased with soil salinity for all nine bands. This may be caused by increased soil moisture, as the hygroscopic MgCl₂ was a major mineral salt in the soils [33]. For highly saline soils, spectral reflectance increased at bands 1p–3. A possible explanation is the existence of a soil crust that is bright in the visible bands.
Table 4 shows correlation coefficients between soil salinity and ALI-convolved field spectra, and coefficients between ALI-convolved field spectral bands. Soil salinity had a low positive correlation with reflectance at bands 1p and 1 (r = 0.144 and 0.060), whereas it had a weak negative correlation at the ALI green and red bands (r = −0.134 and −0.264). Soil salinity was moderately correlated with the ALI NIR and SWIR bands (r = −0.477, −0.516, −0.582, −0.541 and −0.469). It is worth noting that band 1p was more sensitive to soil salinity than band 1 (r = 0.144 vs. 0.060). Similarly, band 4p also had increased sensitivity relative to band 4 (r = −0.516 vs. −0.477). Among all bands, the reflectance at band 5p had the largest negative correlation coefficient (r = −0.582) with soil salinity. Moreover, it had relatively lower r values with other bands. This implied that band 5p could provide complementary information for soil salinity retrieval.

Important Spectral Wavelengths for Soil Salinity Retrieval
Figure 4 shows regression coefficients for the PLSR model obtained from all soil samples and field spectra data. The coefficients were positive for blue and green bands (350–566 nm), and negative for red and NIR bands (567–1034 nm). The upper limit of 1034 nm was chosen because the regression coefficient at this wavelength was less than 0.01. The regression coefficient was very close to zero for wavelengths within 1034–1451 nm, and positive for wavelengths within 1452–1810 nm. At longer wavelengths, the regression coefficient was negative for wavelengths shorter than 1988 nm, positive within 1988–2116 nm, and negative for wavelengths longer than 2116 nm. The regression coefficient varied only slightly for bands without water vapor absorption. For wavelengths near 1400 nm, 1900 nm and thereafter, the coefficient varied greatly even between neighboring wavelengths.

Partial Least Square Regression (PLSR) Model for Soil Salinity Retrieval
Measured soil salinity and ALI-convolved field spectra data yielded a linear model, Equation (13), in which SSC denotes the modelled soil salinity, R denotes the ALI-convolved field spectra data, and subscripts 1p–7 denote the sequential order of the ALI spectral bands. Equation (13) highlighted the importance of bands 1p and 1 for soil salinity retrieval. Figure 5 compares measured and modelled soil salinity for the collected samples. For the modelling samples, soil salinity values had an R² and RMSE of 0.797 and 0.779 g·kg⁻¹. For the validation samples, they were correlated with an R² and RMSE of 0.689 and 0.940 g·kg⁻¹. In general, the model estimated soil salinity with considerable accuracy; however, it may underestimate soil salinity, as indicated by the less-than-unity slope values (0.748 and 0.757).
Table 5 provides statistics for evaluating the constructed salinity-reflectance relationship. For the modelling samples, R² and RPD values were 0.749 and 3.584, indicating a moderate prediction ability as defined in [35]. Bias, SD and RMSE values were 0.036, 0.778 and 0.779 g·kg⁻¹, indicating considerable accuracy. For the validation samples, R² and RPD values were 0.689 and 2.838. The Bias, SD and RMSE values were −0.211, 0.937 and 0.940 g·kg⁻¹. These values were comparable with those for the modelling samples. For the total samples, all these metric values lay between those for the modelling and validation samples. R² and RPD values were 0.724 and 3.155. The Bias, SD and RMSE values were −0.048, 0.842 and 0.837 g·kg⁻¹. The slope values were 0.748, 0.757 and 0.772 for the modelling, validation and total samples, which may indicate underestimation of soil salinity.
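For readers who wish to compute comparable statistics, the sketch below evaluates the five metrics from paired measured and predicted values. The formulas are common textbook definitions adopted here as assumptions: R² as one minus the ratio of residual to total sum of squares, SD as the sample standard deviation of the residuals, and RPD as the standard deviation of the measured values divided by the RMSE. They may differ from the definitions behind Table 5 (the RPD in particular), so the sketch is not expected to reproduce the tabulated values exactly. The function name and toy numbers are hypothetical.

```python
import math


def regression_metrics(measured, predicted):
    """R2, RPD, bias, SD and RMSE under the assumed definitions above."""
    n = len(measured)
    mean_y = sum(measured) / n
    residuals = [p - m for m, p in zip(measured, predicted)]
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((m - mean_y) ** 2 for m in measured)
    bias = sum(residuals) / n
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    sd_measured = math.sqrt(ss_tot / (n - 1))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rpd": sd_measured / rmse,
        "bias": bias,
        "sd": sd,
        "rmse": rmse,
    }


# Toy example: every prediction is 0.5 g/kg too high.
stats = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

Under these definitions the metrics are linked by RMSE² = Bias² + (N − 1)/N · SD², which separates the systematic and random parts of the error.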

Salinity Mapping Using Partial Least Square Regression (PLSR) Model
Figure 6 shows the distribution of TOC, TPW and AOT interpolated from MODIS atmosphere products. TOC varied from 0.38 to 0.48 cm·atm⁻¹, TPW from 0.74 to 1.49 g·cm⁻², and AOT from 0.48 to 0.77. The mean values were 0.43 cm·atm⁻¹, 1.08 g·cm⁻² and 0.58, respectively. Comparison of the interpolated values with the MODIS product values showed that R² and RMSE values were 0.733 and 0.01 cm·atm⁻¹ for TOC, 0.840 and 0.07 g·cm⁻² for TPW, and 0.951 and 0.02 for AOT. The relative RMSE values were 2.87%, 6.41% and 3.58% for TOC, TPW and AOT, indicating a high accuracy of the interpolations. Figure 6a shows that ozone concentration was higher in the northern part. Figure 6b,c illustrates that the atmosphere was more humid and turbid in the southern part of the study area. The southern part was close to an urban area, which is the major cause of the increased AOT values.
Figure 7 shows a synoptic soil salinity map of the study area and detailed salinity distributions over three typical agricultural lands. For the whole study area, soil salinity ranged between 0.00 and 9.95 g·kg⁻¹, with a mean and SD of 1.98 g·kg⁻¹ and 0.69 g·kg⁻¹. In general, salinization intensified from the inland toward the coastal area. The cultivated area along the Yellow River was slightly affected by salinization, whereas the northern part (oil fields), the southern part (salt pans) and the eastern part (estuary) were moderately saline regions. Statistics showed that 10.5%, 36.4%, 53.0% and 0.1% of the entire area were covered with non-saline, slightly saline, moderately saline and highly saline soils, respectively. Three agricultural lands were enlarged for a detailed examination of the soil salinity mapping. Land (a) was located in the northern part, with slightly to moderately saline soils; land (b) was located along the Yellow River, with non-saline and slightly saline soils; and land (c) was located in the southern part, with moderately to highly saline soils. The detailed maps showed that soil salinity was largely consistent across the 30 m × 30 m pixels within each cropland. Moreover, salinity remained close to that of neighboring agricultural lands. Since cultivation practice was similar across these agricultural lands, the retrievals were logical and reasonable.

Quantification of Error Related to Soil Salinity Retrieval
Figure 8a shows a scatterplot of measured soil salinity versus total retrieval errors. The total errors were highly correlated (R² = 0.848) with the measured soil salinity. The slope and intercept values were −0.701 and 0.823, indicating an overall underestimation of soil salinity. For non-saline (SSC < 1 g·kg⁻¹) and slightly saline (1 < SSC < 2 g·kg⁻¹) soils, the errors were generally well within 1 g·kg⁻¹. For moderately saline (2 < SSC < 4 g·kg⁻¹) and highly saline (SSC > 4 g·kg⁻¹) soils, the errors increased linearly with soil salinity. The maximum error could be up to −4 g·kg⁻¹. Figure 8b shows a weak correlation between measured soil salinity and model-related error. The error ranged between −2 and 2 g·kg⁻¹, with a fitting slope and intercept of −0.213 and 0.483. This implied that the model might overestimate low salinity and underestimate high salinity. Figure 8c shows a moderate correlation (R² = 0.467) between measured soil salinity and data-related error, with a slope and intercept of −0.487 and 0.340. The magnitude of the error increased with soil salinity, and the maximum underestimation could exceed 3 g·kg⁻¹. Overall, both the model- and the data-related errors contributed to the underestimation of high salinity. For 2 < SSC < 4 g·kg⁻¹, the mean model- and data-related errors were −0.21 g·kg⁻¹ and −1.06 g·kg⁻¹. For SSC > 4 g·kg⁻¹, the errors were −0.48 g·kg⁻¹ and −2.10 g·kg⁻¹. Thus, the data-related error accounted for 84% and 81% of the total error for moderately and highly saline soils.
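As a worked check of these shares, assume that the share is the mean data-related error divided by the sum of the mean model- and data-related errors (an interpretation of the text, not a formula stated in it). With the rounded means quoted above:

$$
\frac{1.06}{0.21 + 1.06} = \frac{1.06}{1.27} \approx 0.83,
\qquad
\frac{2.10}{0.48 + 2.10} = \frac{2.10}{2.58} \approx 0.81
$$

The second ratio matches the stated 81%. The first differs from the stated 84% by about one percentage point, which is within what rounding of the two mean errors to two decimals can produce.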

Potential Use of Advanced Multi-Spectral Sensor for Soil Salinity Retrieval
Advanced multi-spectral sensors offer more opportunities for soil salinity quantification over wide areas of the terrestrial surface. For the ALI sensor, band 1p is more sensitive to soil salinity than the blue band of the Landsat series (Figure 2). Band 1p is important to the retrieval because of its higher regression coefficient in the salinity-reflectance relationship. Figure 4 also demonstrates the importance of band 4p to soil salinity retrieval. Moreover, band 5p provides additional information (Table 4). These features constitute the basis for soil salinity retrieval from the ALI multi-spectral data [30,48]. In general, multi-spectral remote sensing data have a spatial resolution comparable to that of satellite hyperspectral data (e.g., Hyperion), yet cover a wider area of the terrestrial surface. The platforms enable repeated observations over a target area with a short revisit time. Nevertheless, insufficient spectral resolution (number of spectral bands and spectral width) may be a major obstacle to their use in quantitative soil remote sensing. As demonstrated in this work, the key spectral bands of the new-generation satellite sensors may greatly improve large-scale soil salinity retrieval from multi-spectral remote sensing data.
The ALI was developed to demonstrate and validate new technologies and strategies for improving Earth observations. It provides a prototype for the Landsat 8 OLI. Relative to the ALI, the OLI retains bands 1p, 4p and 5p, but drops band 4 (Table 1). Since spectral reflectance is highly correlated between bands 4 and 4p (0.992 as in Table 4), the absence of band 4 may have only a minor impact on soil salinity retrieval. In addition, the OLI bands are spectrally comparable to, and narrower than, the corresponding ALI bands (Table 1). With a narrower spectral band pass, atmospheric absorption can be further reduced, and the OLI sensor has potential for worldwide soil salinity mapping.

Soil Salinity Retrieval from Multi-Spectral Sensor Data Based on Regression Models
Instead of using all spectral bands, the majority of existing studies perform statistical analysis based on selected bands [25] and/or indices generated from band combinations. The commonly used indices can be generated from green/red [26], red/NIR [28], NIR/SWIR [29] and SWIR/SWIR bands [30]. This demonstrates from a different angle that the process of soil salinization may change soil surface reflectance at most spectral bands. Soil salinity variation can be detected with a single spectral band or with a dual-band or tri-band combination [26]. Nevertheless, these indices are generally only weakly or moderately correlated with soil salinity. In most cases, the correlation coefficient may be less than 0.50 [25,26,30]. This paper incorporated all spectral bands in the statistical analysis, and demonstrated the usefulness of the PLSR model for soil salinity retrieval with the ALI-convolved field spectra (Figure 6). The salinity-reflectance relationship was established with R2 > 0.7 and RPD > 3 (Table 4). The bias was very close to zero, and the SD and the RMSE were less than 1.0 g·kg−1. These statistics demonstrated much improved retrieval accuracy with the linear regression models based on all-band satellite data.
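The accuracy statistics quoted above can be computed from paired measured and predicted salinity values. The sketch below uses one common set of definitions, which are assumptions rather than the paper's confirmed formulas: bias as the mean error, SD as the standard deviation of the errors, RMSE as the root mean square error, R2 as the squared Pearson correlation, and RPD as the standard deviation of the measured values divided by the RMSE. Function names, variable names and sample values are hypothetical. Under these definitions RMSE² = bias² + SD², which agrees with the reported values (0.036² + 0.778² ≈ 0.779²).

```python
import math

def accuracy_stats(measured, predicted):
    """Return bias, SD, RMSE, R2 and RPD for paired salinity values (g/kg)."""
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    bias = sum(errors) / n
    # Population standard deviation of the errors (divide by n), so that
    # rmse**2 == bias**2 + sd**2 holds exactly.
    sd = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)

    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_m * var_p)  # squared Pearson correlation (assumed)

    rpd = math.sqrt(var_m / n) / rmse  # SD of measured values / RMSE (assumed)
    return {"bias": bias, "sd": sd, "rmse": rmse, "r2": r2, "rpd": rpd}

# Hypothetical measured and predicted salinity values (g/kg).
measured = [0.5, 1.4, 2.3, 3.8, 5.1, 6.7]
predicted = [0.7, 1.3, 2.0, 3.5, 4.4, 5.6]
stats = accuracy_stats(measured, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```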

Physical Explanations for Index Based Soil Salinity Estimation
Equation (13) not only provides a relationship for soil salinity estimation at the YRD, but also implies physical explanations for index-based soil salinity studies. The regression coefficients are −3.8, −16.4, −14.9, 11.3 and −11.7 for the green, red, NIR and two SWIR bands. As inferred from the coefficients, soil salinity may be insensitive to the green band but sensitive to the red band (coefficient of −3.8 vs. −16.4). This may be the reason why a combination of green and red bands has been extensively used for soil salinity mapping [26,28,31]. However, soil salinity is similarly sensitive to the red and NIR bands (−16.4 vs. −14.9). This may partly account for why VIs have been used successfully in some studies but not in others [26–28]. The generally weak negative correlation between soil salinity and VIs may be attributable to the presence of vegetation in low saline soils. The coefficient is negative for the NIR band yet positive for SWIR band 5 (−14.9 vs. 11.3). Moreover, the coefficient is positive for SWIR band 5 yet negative for SWIR band 7 (11.3 vs. −11.7). The contrasting coefficients may demonstrate the effectiveness of indices composed of NIR and SWIR bands [29] as well as of two SWIR bands [30]. The use of dual SWIR bands has also been demonstrated with hyperspectral data. Using Hyperion surface reflectance data, Weng et al. [33] found a strong correlation between soil salinity and an index constructed from reflectances at 2052 nm and 2203 nm.
In general, a dual-band index can be composed of sensitive/insensitive bands (e.g., green and red bands), sensitive yet not contrasting bands (e.g., red and NIR bands) and contrasting bands (e.g., NIR and SWIR bands, dual SWIR bands). The purpose of an index is to highlight the information related to soil salinity while suppressing noise and background information. In this sense, our relationship can largely suppress unrelated information and thereby yield high accuracy.
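The three kinds of band pair can be read off the regression coefficients quoted above for Equation (13). The following sketch labels each pair by comparing coefficient signs and magnitudes. The 0.5 magnitude-ratio threshold, the band labels and the function name are illustrative assumptions, not part of the paper's method:

```python
# Regression coefficients as quoted in the text for Equation (13).
COEFFS = {"green": -3.8, "red": -16.4, "NIR": -14.9, "SWIR5": 11.3, "SWIR7": -11.7}

def pair_type(band_a, band_b, ratio_threshold=0.5):
    """Label a band pair from the signs and magnitudes of its coefficients.

    ratio_threshold is a hypothetical cut-off separating a weak
    coefficient from a strong one; it is not taken from the paper.
    """
    a, b = COEFFS[band_a], COEFFS[band_b]
    if a * b < 0:
        return "contrasting"
    ratio = min(abs(a), abs(b)) / max(abs(a), abs(b))
    if ratio < ratio_threshold:
        return "sensitive/insensitive"
    return "similar sensitivity"

pairs = [("green", "red"), ("red", "NIR"), ("NIR", "SWIR5"), ("SWIR5", "SWIR7")]
labels = {pair: pair_type(*pair) for pair in pairs}
for pair, label in labels.items():
    print(f"{pair[0]}/{pair[1]}: {label}")
```

With the quoted coefficients, green/red falls in the sensitive/insensitive group, red/NIR in the similar-sensitivity group, and both NIR/SWIR5 and SWIR5/SWIR7 in the contrasting group, matching the grouping described in the text.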

Primary Uncertainty Associated with Soil Salinity Retrieval
Figure 8a shows a strong negative relationship between measured soil salinity and total retrieval error. It indicates that soil salinity was notably underestimated for moderately and highly saline soils. Such underestimation is common in existing remote sensing studies. For example, Douaoui et al. [26] reported decreased correlation coefficients between soil salinity and several indices for highly saline soils. Farifteh et al. [35] also underestimated soil salinity for highly saline soils. The reason can be inferred from Figure 4, in which spectral reflectance varies differently for highly saline soils. Figure 8b reveals negative model-related errors for moderately and highly saline soils. However, the model-related error accounted for only a small proportion (<20%) of the total retrieval error. The major part of the error may result from data-related error, as shown in Figure 8c.
Ben-Dor et al. [38] suggested that reliable atmospheric correction should be mandatory for imaging spectroscopy, whereas in multi-spectral remote sensing atmospheric effects are minor. However, quantification of the atmospheric effects is needed to verify this point. For multi-spectral remote sensing, atmospheric correction requires data on sensor calibration, ozone concentration, water vapor content and aerosol optical depth. Figure 9 quantifies the soil salinity uncertainty resulting from these parameters. The horizontal axes denote the parameter and its uncertainty, and the vertical axis denotes the soil salinity uncertainty at different levels of the parameter and its uncertainty. According to the simulation results, sensor calibration, ozone, water vapor and aerosol may induce retrieval uncertainties of 0.7–0.9 g·kg−1, <0.1 g·kg−1, <0.2 g·kg−1 and <1.0 g·kg−1, respectively. Neglecting the sensor calibration uncertainty, the total retrieval error may not exceed 1.3 g·kg−1. The uncertainty increases sharply at high TPW and/or AOT values on humid and turbid days. In most cases, remote sensing imagery acquired on dry and clear days is preferred over imagery acquired on humid and turbid days. As a result, the total uncertainty of 1.3 g·kg−1 is a conservative estimate. This also demonstrates that atmospheric correction is not the major source of retrieval error.
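The 1.3 g·kg−1 figure corresponds to adding the upper bounds quoted above for ozone, water vapor and aerosol, i.e., a worst case in which all three act in the same direction. A brief sketch of that arithmetic follows. The root-sum-square value is shown only for comparison and assumes independent error sources, which the paper does not state; the variable names are hypothetical:

```python
import math

# Upper bounds on retrieval uncertainty (g/kg) from the simulations quoted
# in the text; sensor calibration is neglected, as in the text.
upper_bounds = {"ozone": 0.1, "water_vapor": 0.2, "aerosol": 1.0}

# Worst case: all sources add linearly.
linear_total = sum(upper_bounds.values())

# For comparison only: root-sum-square under an independence assumption.
rss_total = math.sqrt(sum(v ** 2 for v in upper_bounds.values()))

print(f"linear sum: {linear_total:.1f} g/kg")
print(f"root-sum-square: {rss_total:.2f} g/kg")
```

The linear sum is the more pessimistic of the two, which is in line with the text's description of 1.3 g·kg−1 as a conservative estimate.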
Data-related error may arise from the scale difference between point sampling and the remote sensing pixel. A high degree of surface heterogeneity may weaken the representativeness of point sampling for a 30 m × 30 m ALI pixel. This problem is one of the limitations of soil salinity mapping based on imaging spectroscopy [38]. As shown in Figure 8c, data-related error is moderately correlated with measured soil salinity. It was within 1 g·kg−1 for non-saline and slightly saline soils, whereas it increased with the degree of soil salinization for moderately and highly saline soils. Non-saline and slightly saline soils were generally distributed in agricultural lands (Figure 7). Uniform agricultural practice led to similar spectral characteristics in neighboring agricultural lands. Thus, the sampling sites should be well representative within several hundred meters. However, moderately and highly saline soils were unlikely to be homogeneously distributed within a large area. Salt crust and salt-tolerant vegetation may produce heterogeneous surfaces. As a result, a comprehensive field campaign and validation are necessary for quantitative remote sensing.

Conclusions
Soil salinization has been a global concern, triggering secondary soil degradation and posing great threats to sustainable development. Timely detection and early warning of soil salinization are urgently needed. In this context, multi-spectral remote sensing data have been extensively used for soil salinity mapping. However, the coarse spectral resolution poses a major obstacle to quantitative retrieval. By virtue of the improved spectral resolution of the ALI sensor, this paper applied a PLSR model to construct a relationship between measured soil salinity and ALI-convolved field spectra, with an examination in the Yellow River Delta, China. The model estimated soil salinity with an R2 and RPD of 0.749 and 3.584. The estimates were almost unbiased, with SD and RMSE less than 1.0 g·kg−1. The results demonstrated the usefulness of ALI bands 1p and 4p, and potentially band 5p, in mapping soil salinity. Landsat 8 OLI would be an important candidate for soil salinity mapping at a large scale. The PLSR model was applied to map soil salinity from atmospherically corrected ALI data. Error analysis showed that the retrieved soil salinity was relatively accurate for low saline soils, and underestimated for moderately and highly saline soils. The underestimation may result from the poor representativeness of soil sampling rather than from model ineffectiveness or atmospheric correction.
Our study provides a comprehensive spectral analysis based on ALI-convolved field spectra. It confirms the effectiveness of ALI bands in soil salinity retrieval. We also demonstrate the potential use of regression models for retrieving salinity from advanced multi-spectral sensors, and explain the implications of our study for index-based soil salinity estimation. The results and implications would be very valuable for inexpensive but accurate soil salinity mapping at a large scale.

Figure 1. Location of study area. The study area is a part of the Yellow River Delta in Shandong province, China. The black frame confines our target area for soil salinity retrieval, corresponding to an ALI scene. The red line delineates the boundary of the Delta, and the red solid circles denote sampling sites.

Figure 2. Correlation between soil salinity and field spectra data (red lines). ALI relative spectral functions are superimposed to illustrate the positions of ALI spectral bands (blue lines).

Figure 4. Regression coefficients for the PLSR model obtained from total soil samples and field spectra data. Important wavelengths for soil salinity retrieval are marked in red. Central wavelengths of ALI bands are denoted with arrows and band numbers.

Figure 5. Comparison of measured and modelled soil salinity with modelling samples (a) and validation samples (b). The horizontal axis denotes measured soil salinity, and the vertical axis denotes modelled soil salinity. Slope and intercept, R2, RMSE and sample numbers are provided for the comparison.

Figure 6. Spatial distribution of ozone concentration (a), water vapor content (b) and aerosol optical thickness at 550 nm (c) within the Yellow River Delta. The 30 m-resolution values were interpolated from MODIS atmosphere products.

Figure 7. Soil salinity map over the study area and detailed soil salinity distribution at three agricultural lands: in the north with slightly to moderately saline soils (a), along the Yellow River with non-saline and slightly saline soils (b) and in the south with moderately to highly saline soils (c).

Figure 8. Scatterplots of measured soil salinity versus total retrieval error (a), model-related error (b) and data-related error (c). Slope and intercept values for linear regression and the R2 value are given for each scatterplot.

Figure 9. Simulated uncertainty in soil salinity retrievals due to ALI sensor calibration (a), ozone data (b), water vapor data (c) and aerosol data (d). The horizontal axes denote the parameter and its uncertainty, and the vertical axis denotes the retrieval error at a given parameter and uncertainty.

Table 1. Comparison of Landsat 7 ETM+, EO-1 ALI and Landsat 8 OLI reflective spectral bands. The band names follow the ALI naming conventions. NIR = near-infrared band, SWIR = short-wave infrared band.

Table 2. Meteorological conditions for the study area during the field campaign.
The data are available at [59] from a meteorological station at Dongying city, close to the sampling sites.

Table 3. Descriptive statistics of soil salinity and ALI-convolved field spectra for soils at different saline levels.

Table 4. Correlation coefficients between soil salinity and ALI-convolved field spectra, and among the ALI-convolved field spectra.