A WFS-SVM Model for Soil Salinity Mapping in Keriya Oasis, Northwestern China Using Polarimetric Decomposition and Fully PolSAR Data

Timely monitoring and mapping of salt-affected areas are essential for the prevention of land degradation and sustainable soil management in arid and semi-arid regions. The main objective of this study was to develop Synthetic Aperture Radar (SAR) polarimetry techniques for improved soil salinity mapping in the Keriya Oasis in the Xinjiang Uyghur Autonomous Region (Xinjiang), China, where salinized soil appears to be a major threat to local agricultural productivity. Multiple polarimetric target decomposition, optimal feature subset selection (wrapper feature selector, WFS), and support vector machine (SVM) algorithms were used for optimal soil salinization classification using quad-polarized PALSAR-2 data. A threefold exercise was conducted. First, 16 polarimetric decomposition methods were implemented and a wide range of polarimetric parameters and SAR discriminators were derived in order to mine hidden information in PolSAR data. Second, an optimal polarimetric feature subset comprising 19 polarimetric elements was selected using the WFS approach; optimum classification parameters were identified, and the optimal SVM classification model was obtained by employing a cross-validation method. Third, the WFS-SVM classification model was constructed, optimized, and implemented based on the optimal match of polarimetric features and optimum classification parameters. Soils with different salinization degrees (i.e., highly, moderately, and slightly salinized soils) were extracted. Finally, classification results were compared with the Wishart supervised classification and conventional SVM classification to examine the performance of the proposed method for salinity mapping. Detailed field investigations and ground data were used for the validation of the adopted methods.
The overall accuracy and kappa coefficient of the proposed WFS-SVM model were 87.57% and 0.85, respectively, which were much higher than those obtained by the Wishart supervised classification (73.87% and 0.68) and by the commonly applied SVM classification (83.61% and 0.80). The mapping accuracy for the individual salinization classes was also enhanced with the proposed methodology. The results showed that the proposed method outperformed the Wishart and SVM classifications, and demonstrated the advantages offered by the WFS-SVM classification and the potential of PolSAR data for monitoring soil salinization.


Introduction
Soil salinization is one of the prevalent land degradation processes and a major global environmental hazard, particularly in arid and semi-arid areas around the world [1][2][3][4]. The worldwide extent of primary salinized soils is about 955 M ha, which indicates that approximately 7% of the earth's continental extent is affected, while secondary soil salinization as a result of unreasonable human activities affects some 77 M ha, with 58% of these being situated in irrigated areas [1,5]. Spatio-temporal mapping, detection, prediction, and monitoring of soil salinization dynamics must therefore be urgently implemented in order to halt land degradation [6], such as soil erosion and desertification, and to secure sustainable land use and management in developing countries like China, where a rapidly increasing population poses a significant threat to the ecology and environment [7,8].
Mapping soil salinity is relatively difficult owing to its large spatial and temporal variability; thus, remote sensing is widely adopted to lower survey costs and time [8,9]. Remote sensing and geographic information system (GIS) techniques are well suited to detecting salinity problems in near real time and eventually bringing them under control, especially for salt-affected soils in arid and semi-arid environments, where vegetation cover is sparse [5,10].
A number of techniques, which heavily rely on remote sensing data, such as multispectral and hyper-spectral, are used for monitoring and evaluating the evolution of soil salinity [11][12][13]. However, the Synthetic Aperture Radar (SAR) imagery is likely to be the most promising technique and has much potential for the detection of soil salinity due to the sensitivity of radar systems to the electrical conductivity (EC) [14][15][16][17] and roughness of the soil surface [7].
Furthermore, traditional optical remote sensing is hampered by weather conditions for timely mapping of soil salinity information in most regions of the world where cloud cover and overcast cloud shadows are frequent [18,19]. SAR remote sensing is able to operate day and night through cloud cover and is capable of penetrating the near-surface soil profile, providing an effective tool for land observation and extracting timely saline soil information in such regions [18][19][20][21][22]. Roughness and dielectric properties of the soil surface captured by SAR data [23,24] provide a unique advantage for detecting and mapping salinized soils in arid and semi-arid areas [7,25].
Polarimetric Synthetic Aperture Radar (PolSAR) images provide a rich set of information about ground objects and offer a significant improvement in the quality of data analysis compared to conventional single-channel SAR [26][27][28]. Classification of PolSAR data and effective extraction of ground information is arguably one of the most important applications in remote sensing. The polarimetric parameters extracted using different polarimetric target decomposition methods are of great importance due to their connection to the physical properties of ground objects [18], and may therefore be applied to classify and map soil salinization.
The overall goal of this study is to assess the potentials of fully PolSAR data for soil salinity monitoring in arid and semi-arid zones, within the environmental context of the Keriya Oasis, Northwestern China, where soil salinization appears to be one of the biggest threats to the eco-environment, to local agricultural production, and to people's daily life [29].

Study Site
The study area, the Keriya Oasis, is located at the northern foot of the Kunlun Mountains along the southern edge of the Taklimakan Desert in the Xinjiang Uyghur Autonomous Region (Xinjiang), in the northwest of China (Figure 1). The oasis is situated on a fluvial plain with relatively low-lying flat terrain, loose soil, poor permeability, and high salt concentrations, and is a fragile eco-environment, suffering from heavy soil salinization and desertification [30,31]. Salt-affected soil is widespread in Xinjiang: the total area of salinized cultivated land is about 1.47 M ha, which accounts for 31.1% of the total cultivated land [7,32,33].
The region has a warm continental arid climate with an average temperature of 11.6 °C, a minimum average temperature of −5.8 °C in January, and a maximum average temperature of 25 °C in July. The total annual radiation is 6.117 × 10⁵ J/cm² and the annual sunshine duration is 2.7346 × 10³ h. Multi-year average evaporation is 2498 mm, which far exceeds the average annual precipitation of 44.7 mm; the evaporation to rainfall ratio is approximately 55:1, and the frost-free period is about 200 days [7]. The Keriya River is a seasonal river that originates from the Kunlun piedmont, flows through the Keriya Oasis, which depends entirely on its water resources, and vanishes in the sand dunes of the Taklimakan Desert [34]. The river is mainly supplied by meltwater from glaciers and snow on the Kunlun Mountains, and it feeds the oasis town Keriya with around 250,000 residents (more than 90% farmers) [30]. Oasis agriculture is the main land-use type of Keriya County, providing the major source of economic income [31]. Major crop types include cotton, wheat, corn, rice, and grapes [30]. The total irrigation area of Keriya County is approximately 448 km², consuming about 6.38 × 10⁸ m³ of water per year. The annual groundwater extraction amounts to about 5.04 × 10⁶ m³, and most of the irrigation system employs the flood irrigation method, which is very water consuming [29,30]. Since the 1950s, the Keriya River basin has been intensively exploited [35]. The area is now suffering from severe water scarcity owing to excessive pressure from overpopulation, agricultural expansion, and most importantly, from the steady expansion of salinization and desertification [35,36]. Depending on the availability of water, vegetation type varies and includes sparse brush vegetation and halophytic plants [37,38].
The main soil types in the study area are the meadow soil and the brown desert soil that are characterized by fine grains, coarse in texture, acidic, low in nutrients, have a low permeability, and exhibit a high water table and mineralization [7,38]. Types of saline soil in the study area are mainly sulfate and sulfite saline soil [39]. Natural and human factors have jointly caused the salinization of soil and its severity in the oasis [29,30]. Local agricultural productivity is restricted by the highly salinized soil. Poor land and water management, together with the expansion of the agricultural frontier into marginal ecotones, has raised the salt concentration in the soil and, as a result, has had a strong negative impact on crop yields and agricultural production in the Keriya Oasis [29]. Improving soil quality where salinization is prevalent is required to achieve sustainable food production in this area, which calls for the monitoring and mapping of soil salinization at an early stage for an effective soil reclamation program.

Remote Sensing Data
Advanced Land Observing Satellite-2 (ALOS-2) is a Japanese earth observation satellite that was launched on 24 May 2014; it features two optical cameras in addition to a Phased Array L-band Synthetic Aperture Radar 2 (PALSAR-2) sensor that is mainly used for cartography, regional observation, disaster monitoring, and environmental monitoring [40,41]. It has a number of advantages, such as observation in both right and left directions, a wide bandwidth, wide-swath observation, and high-quality imaging at different spatial resolutions [42]. The PALSAR-2 data used in this study were collected over the study area on 23 April 2015; they were acquired in high-sensitive mode with quad polarization (HH, HV, VH, and VV), in an ascending orbit, and with an incidence angle of 30.4° (specific parameters are listed in Table 1). The Landsat 8 OLI (Operational Land Imager) image (Figure 1C) acquired on 30 September 2014 was also selected as reference optical data, because it is useful for the selection of training and validation plots and for better visual interpretation. PALSAR-2 data of the study area in CEOS format at Level 1.1 were processed using the SARscape 5.2.1® modules of the ENVI 5.3® image processing and analysis software from Harris Geospatial Solutions. SAR data processing steps included: (1) generation of a single look complex (SLC) image; (2) multi-looking (using 4 × 2 looks in range and azimuth, respectively) to suppress speckle noise and generate the power image; (3) speckle filtering using the adaptive Lee filter (3 × 3 window size) for noise reduction [43]; (4) geocoding and radiometric calibration [44,45], using a Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) and radiometric normalization (i.e., modified cosine model) [46]; and (5) image resizing to a resolution of 15 m × 15 m.
These preprocessing steps generated an orthorectified, geocoded, and radiometrically calibrated backscattering coefficient (σ°) image in dB, and the statistical characteristics of the backscattering coefficient of each polarization band are given in Table 2. Meanwhile, Pauli decomposition was applied to the fully polarimetric PALSAR-2 data, and the Pauli standard RGB composition image was generated (Figure 1D,E) for visual interpretation.
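The multi-looking and dB conversion steps can be illustrated with a minimal NumPy sketch. It is not the SARscape implementation: the function names `multilook` and `to_db` are hypothetical, the array layout (rows as azimuth, columns as range) is an assumption, and calibration, filtering, and geocoding are left out.

```python
import numpy as np

def multilook(slc, looks_az, looks_rg):
    """Average the power |SLC|^2 over non-overlapping looks_az x looks_rg windows.

    Assumes rows run along azimuth and columns along range.
    """
    power = np.abs(slc) ** 2
    rows = (power.shape[0] // looks_az) * looks_az   # drop incomplete windows
    cols = (power.shape[1] // looks_rg) * looks_rg
    blocks = power[:rows, :cols].reshape(
        rows // looks_az, looks_az, cols // looks_rg, looks_rg
    )
    return blocks.mean(axis=(1, 3))

def to_db(sigma0, floor=1e-10):
    """Convert a linear backscattering coefficient to decibels."""
    return 10.0 * np.log10(np.maximum(sigma0, floor))

# Synthetic single-look complex patch, 4 looks in range and 2 in azimuth
slc = np.full((4, 8), 1 + 1j)
power = multilook(slc, looks_az=2, looks_rg=4)
print(power.shape, to_db(power)[0, 0])
```

Averaging power rather than amplitude keeps the mean backscatter unbiased, which is why the sketch squares the magnitude before forming the looks.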

Field Data
Field campaigns were conducted from 22 April to 7 May 2015. A total of 95 field survey sites were selected for soil sampling and investigation (Figure 1C), covering different ranges of land-cover and land-use (LCLU) and soil characteristics. Among them, 65 sampling points were within the PALSAR-2 image of our study area. At every site of a regular grid of 15 m × 15 m plots, five samples were taken from the topsoil (0-20 cm). The field investigation focused on collecting ground reference data (GRD) for training and validation samples and on recording the LCLU class at each sampling location; a corresponding photo library of the different LCLU classes was established over the study area (Figure 2). The GRD were recorded for the patches of salinized soils, along with their environmental contexts, e.g., co-existing vegetation, LCLU, vegetation type, fraction of vegetation, and average ground water depth in each spot. The soil samples were air-dried and sieved through a 2 mm sieve, the soil salt concentration (i.e., total soluble salt) was measured in a laboratory, and the corresponding analysis was subsequently carried out. Different degrees of salinization (i.e., highly, moderately, and slightly salinized soil) and corresponding land cover types were defined according to topsoil salt concentration along with other features such as vegetation coverage, vegetation type, and ground water table (Table 3).
Based on the field investigation and the collected data, a total of 550 sample plots were delineated, including 285 training sample plots and 265 validation plots for image classification and accuracy assessment. It was ensured that these sampling plots were distributed evenly across the study area, that each land cover class contained at least 33 samples, and that each sample contained 55 to 671 pixels (Table 3). Finally, the digital datasets, including PolSAR imagery and field GPS collections, were geo-referenced to the Universal Transverse Mercator (UTM) coordinate system, Zone 44 North, with the World Geodetic System Datum of 1984 (WGS84).

Methodology
This study proposes the WFS-SVM classification model through integrating a wide range of polarimetric features derived from several polarimetric decomposition methods and SAR discriminators with optimal subset selection obtained by the WFS algorithm and image classification by the SVM classifier. The framework of the classification workflow is given in Figure 3. The main steps are described below. Part of the procedure can also be found in detail in our previous study [47].

Polarimetric Decomposition
It is a practical approach to decompose the PolSAR data and mine hidden information for the purpose of analyzing and understanding the scattering mechanism of ground objects [48,49]. Polarimetric target decomposition theorems were initially formalized by Huynen [50]. Since the late 1980s, many other decomposition methods have been developed by other researchers, and gained improvement over decades [51,52]. An essential objective of PolSAR data decomposition is to extract physical information from the observed microwave scattering by both surface and volume structures.
Moreover, to improve the quality of classification, it is of great importance to identify the average or dominant scattering mechanism and to understand how the scattering responses of simple objects, each associated with a physical interpretation, combine [18,53].
The Pauli decomposition is a common decomposition method widely used for PolSAR imagery [53]. It expresses the scattering matrix S as a complex-weighted sum of the Pauli matrices:
$$ S = \begin{bmatrix} S_{hh} & S_{hv} \\ S_{vh} & S_{vv} \end{bmatrix} = \frac{a}{\sqrt{2}} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{b}{\sqrt{2}} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} + \frac{c}{\sqrt{2}} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \frac{d}{\sqrt{2}} \begin{bmatrix} 0 & -j \\ j & 0 \end{bmatrix} $$
where S_hh and S_vv denote the co-polarized complex scattering amplitudes, S_hv and S_vh denote the cross-polarized components, and a, b, c, and d are complex coefficients given by:
$$ a = \frac{S_{hh} + S_{vv}}{\sqrt{2}}, \quad b = \frac{S_{hh} - S_{vv}}{\sqrt{2}}, \quad c = \frac{S_{hv} + S_{vh}}{\sqrt{2}}, \quad d = j\,\frac{S_{hv} - S_{vh}}{\sqrt{2}} $$
In the monostatic case, where the transmit and receive antennas coincide, the backscattering matrix is symmetric for reciprocal targets, with S_hv = S_vh, and the Pauli matrix basis can be reduced to the first three matrices, which leads to d = 0 [54]. It follows that the Span value is given by:
$$ \mathrm{Span} = |S_{hh}|^2 + 2|S_{hv}|^2 + |S_{vv}|^2 = |a|^2 + |b|^2 + |c|^2 $$
Therefore, the Pauli decomposition of the backscattering matrix is frequently adopted to represent all the polarimetric information in PolSAR data. As shown in Figure 1E, a Pauli RGB composite image consists of intensities |HH + VV| (blue), |HH − VV| (red), and |HV + VH| (green), which correspond to clear physical scattering mechanisms. The Pauli RGB composition image has thus become the standard for PolSAR image display and has often been implemented for visual interpretation [18].
The scattering matrix can also be vectorized on the basis of the Pauli decomposition, so that the scattering vector is obtained as:
$$ K = \frac{1}{\sqrt{2}} \begin{bmatrix} S_{hh} + S_{vv} & S_{hh} - S_{vv} & 2 S_{hv} \end{bmatrix}^{T} $$
The 3 × 3 coherency matrix T3 is defined as the expected value of K K^H [54]:
$$ T_3 = \langle K K^{H} \rangle = \frac{1}{2} \left\langle \begin{bmatrix} |S_{hh}+S_{vv}|^2 & (S_{hh}+S_{vv})(S_{hh}-S_{vv})^{*} & 2(S_{hh}+S_{vv})S_{hv}^{*} \\ (S_{hh}-S_{vv})(S_{hh}+S_{vv})^{*} & |S_{hh}-S_{vv}|^2 & 2(S_{hh}-S_{vv})S_{hv}^{*} \\ 2S_{hv}(S_{hh}+S_{vv})^{*} & 2S_{hv}(S_{hh}-S_{vv})^{*} & 4|S_{hv}|^2 \end{bmatrix} \right\rangle $$
where the superscript H stands for the conjugate transpose, * denotes the conjugate, | | denotes the modulus, and ⟨ ⟩ denotes averaging. The coherency matrix T3 is a close relative of the covariance matrix C3; the two contain the same information in different forms [54]. The coherency matrix T3 can better explain the physical and geometric characteristics of the target, and thus has broader application in radar polarimetry. In this study, the scattering matrix S, coherency matrix T3, and covariance matrix C3 of the PolSAR data were calculated. In addition to the Pauli decomposition, a variety of polarimetric decomposition methods have been proposed, and the corresponding polarimetric information was extracted in order to take full advantage of the PolSAR data. The decomposition methods explored in this study include Barnes [55], Huynen [50], Cloude [56], Holm [57], H/A/Alpha [58], Freeman Two Components [59], Freeman Three Components [60], Van Zyl [61], Neumann [62], Krogager [63], Yamaguchi [64], and Touzi [49]; detailed calculations and physical interpretations of these polarimetric parameters can be found in [54].
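As an illustration of the Pauli vectorization and the coherency matrix described above, the following NumPy sketch builds the monostatic scattering vector from the three complex channels and averages its outer product over a set of pixels. The function names are hypothetical and the input is synthetic; in the study itself these matrices were produced by dedicated software.

```python
import numpy as np

def pauli_vector(s_hh, s_hv, s_vv):
    """Monostatic Pauli scattering vector per pixel, shape (3, n_pixels).

    Components: (S_hh + S_vv, S_hh - S_vv, 2 * S_hv) / sqrt(2).
    """
    return np.stack([s_hh + s_vv, s_hh - s_vv, 2.0 * s_hv]) / np.sqrt(2.0)

def coherency_matrix(k):
    """Sample estimate of T3: the average of k k^H over the n_pixels columns."""
    return (k @ k.conj().T) / k.shape[1]

# Synthetic complex channels standing in for a small averaging window
rng = np.random.default_rng(0)
s_hh, s_hv, s_vv = (rng.normal(size=64) + 1j * rng.normal(size=64) for _ in range(3))

k = pauli_vector(s_hh, s_hv, s_vv)
T3 = coherency_matrix(k)

# The trace of T3 equals the averaged Span |S_hh|^2 + 2|S_hv|^2 + |S_vv|^2
span = np.mean(np.abs(s_hh) ** 2 + 2 * np.abs(s_hv) ** 2 + np.abs(s_vv) ** 2)
print(np.isclose(np.trace(T3).real, span))
```

The trace check is a convenient sanity test because the Pauli basis preserves total power.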

Wrapper Feature Selector (WFS)
PolSAR data contains a great variety of SAR features that can be extracted and used for image classification. It is very important to make the most of the discriminative power offered by these features [65]. However, the wide-range of PolSAR parameters and discriminators cannot all be used simultaneously in most classification problems due to the redundancy of information and the inherent speckle noise in some of the extracted parameters [65]. In order to fully utilize the PolSAR parameters, a vital step is to find the most discriminative and informative features and suppress speckle noise inherent in SAR data. Optimal feature subset selection often improves classification accuracy, enhances the efficiency of the classifier, and saves the time spent on the training and classification process [20].
In recent decades, Feature Subset Selection (FSS) has been a hot topic in machine learning. FSS is defined as a process of selecting and returning the best and minimally sized subset of relevant features from a larger set of original features, while retaining the physical meanings of the original features without transformation [66]. The main objective of feature selection is to find a subset of highly discriminant features related to the classification problem. Feature selection methods can be broadly categorized into embedded approaches, filter approaches, and wrapper approaches [67,68]. In general, the wrapper model tends to yield better results because each candidate subset is evaluated with the classification algorithm itself, so the outcome depends on the chosen classifier. It is well suited to cases with a smaller data size and a predetermined classifier [69,70]. For that reason, we adopted the Wrapper Feature Selector (WFS) for the optimal subset selection in our study. Feature selection in the WFS model is mainly composed of two key elements, namely the search algorithm and the evaluation function. For the search algorithm, the Genetic Algorithm [71] was adopted, and for the evaluation function, we used the cross-validation approach as our accuracy assessment tool [72].
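The interplay between the search algorithm and the evaluation function can be sketched as follows. This is a deliberately small genetic algorithm written with scikit-learn as a stand-in, not the configuration used in the study; the function names, population size, mutation rate, 3-fold cross-validation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(mask, X, y, cv=3):
    """Evaluation function: mean CV accuracy of an RBF SVM on the masked features."""
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), X[:, mask], y, cv=cv).mean()

def ga_wrapper_select(X, y, pop_size=8, generations=5, p_mut=0.1, seed=0):
    """Search algorithm: a tiny GA over boolean feature masks."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5            # random initial masks
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        i = int(scores.argmax())
        if scores[i] > best_score:                   # remember the best mask seen
            best_mask, best_score = pop[i].copy(), float(scores[i])
        children = []
        while len(children) < pop_size:
            parents = []
            for _ in range(2):                       # binary tournament selection
                a, b = rng.integers(pop_size, size=2)
                parents.append(pop[a] if scores[a] >= scores[b] else pop[b])
            cut = rng.integers(1, n)                 # one-point crossover
            child = np.concatenate([parents[0][:cut], parents[1][cut:]])
            child ^= rng.random(n) < p_mut           # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return best_mask, best_score

X, y = make_classification(n_samples=120, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)
mask, score = ga_wrapper_select(X, y)
print(np.flatnonzero(mask), round(score, 3))
```

Because every candidate subset is scored by training the classifier, the cost grows with population size, number of generations, and number of folds, which is the main practical drawback of wrapper methods.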

Support Vector Machine (SVM) Classification
The SVM approach is a non-parametric machine learning methodology based on statistical learning theory [73] that can be adopted for classification and is particularly promising in the remote sensing field owing to its capability of generalizing well even with a limited number of training samples [74]. For given training samples of two different classes, the SVM training algorithm aims to derive a hyperplane that maximizes the separation between the closest points of the two classes and minimizes misclassifications. For a two-class pattern recognition problem in which the classes are linearly separable, the SVM selects, from among the infinite number of linear decision boundaries, the one that minimizes the generalization error. The selected decision boundary is therefore the one that leaves the greatest margin between the two classes, where the margin is defined as the sum of the distances to the hyperplane from the closest points of the two classes [73].
Typically, a multi-class SVM is implemented by combining several two-class SVMs. In our study, the "one-against-one" (OAO) approach and the Gaussian Radial Basis Function (RBF) kernel were applied for multi-class SVM. When the RBF kernel is employed, obtaining the optimal SVM classification model requires finding the best pair of penalty parameter C and kernel parameter γ for the specific training dataset. For this purpose, the cross-validation (CV) search algorithm was adopted [75], which uses a multiclass SVM OAO method [76]. The pair of penalty parameter C and kernel parameter γ that achieved the highest CV accuracy was used to classify different land cover types and salinized soils. Final results were sieved and clumped to eliminate spurious pixels from the classification results.
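For reference, the maximum-margin idea can be written in its standard textbook form. The notation below (weight vector w, bias b, labels y_i in {−1, +1}, slack variables ξ_i) is introduced here for illustration and does not appear in the original text.

```latex
% Linearly separable case: maximize the margin 2/||w||
\min_{\mathbf{w},\, b} \ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{s.t.} \quad y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1,
\qquad i = 1, \dots, n

% Each closest point lies at distance 1/||w|| from the hyperplane,
% so the sum of the two distances is
\text{margin} = \frac{2}{\lVert \mathbf{w} \rVert}

% Non-separable case: slack variables penalized by the parameter C
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
+ C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1 - \xi_i,
\quad \xi_i \ge 0
```

A larger C penalizes training errors more heavily and narrows the margin, while a smaller C tolerates more violations in exchange for a wider margin.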
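A cross-validated search over C and γ of the kind described above might look like the following scikit-learn sketch. The grid of powers of two, the 5-fold setting, and the iris dataset are placeholders chosen for illustration, not the values or data used in the study.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                # stand-in multi-class dataset

# Candidate values on a log2 grid (assumed ranges)
C_grid = 2.0 ** np.arange(-1, 12, 2)
gamma_grid = 2.0 ** np.arange(-7, 2, 2)

# SVC trains multi-class problems with pairwise (one-vs-one) classifiers;
# "ovo" also exposes the pairwise decision values.
model = make_pipeline(
    MinMaxScaler(feature_range=(-1, 1)),
    SVC(kernel="rbf", decision_function_shape="ovo"),
)

search = GridSearchCV(
    model,
    param_grid={"svc__C": C_grid, "svc__gamma": gamma_grid},
    cv=5,                                        # 5-fold cross-validation
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Placing the scaler inside the pipeline means it is refit on each training fold, so the cross-validation accuracy is not inflated by information from the held-out fold.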

Polarimetric Decomposition of Fully PolSAR Data
Firstly, the scattering matrix S, coherency matrix T 3 , and covariance matrix C 3 were calculated and the corresponding original scattering matrix elements were extracted. Subsequently, polarimetric parameters of the fully polarized PALSAR-2 image of the study area were extracted through different polarimetric decomposition methods to promote an optimal classification by using PolSARpro-v5.0.4 ® software [77], and all the descriptors used in PolSARPro-v5.0.4 ® for these polarimetric parameters were adopted [18,78]. Apart from these features derived using decomposition methods, several SAR polarimetric discriminators [79] including SPAN, polarization fraction, pedestal height, Radar vegetation index (RVI), single bounce eigenvalue relative difference (SERD), and double bounce eigenvalue relative difference (DERD) [80,81], were also retrieved and taken into account in our study in order to make full use of the discriminative power offered by all these features. Finally, a total of 81 polarimetric features were acquired from the PolSAR image. The features are given in Table 4 and the standard RGB composition images that demonstrate some of these polarimetric decompositions and SAR discriminators are displayed in Figure 4.
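Several of the discriminators named above have compact definitions in terms of the eigenvalues of the coherency matrix or the channel powers. The sketch below uses commonly cited forms: entropy and anisotropy as in the H/A/Alpha decomposition, pedestal height as the ratio of the smallest to the largest eigenvalue, and RVI as 8σ_hv / (σ_hh + σ_vv + 2σ_hv). Whether the software used in the study applies exactly these forms is an assumption, and the function names are hypothetical.

```python
import numpy as np

def eigen_discriminators(T3):
    """SPAN, entropy, anisotropy and pedestal height from a 3x3 Hermitian T3."""
    lam = np.linalg.eigvalsh(T3)[::-1]            # lam1 >= lam2 >= lam3
    lam = np.clip(lam, 1e-12, None)               # guard against log(0) and 0/0
    p = lam / lam.sum()                           # pseudo-probabilities
    return {
        "span": float(np.trace(T3).real),
        "entropy": float(-(p * np.log(p)).sum() / np.log(3.0)),   # log base 3
        "anisotropy": float((lam[1] - lam[2]) / (lam[1] + lam[2])),
        "pedestal_height": float(lam[2] / lam[0]),                # assumed form
    }

def rvi(sigma_hh, sigma_hv, sigma_vv):
    """Radar vegetation index from linear (not dB) channel powers."""
    return 8.0 * sigma_hv / (sigma_hh + sigma_vv + 2.0 * sigma_hv)

# Fully random scattering: equal eigenvalues give entropy 1 and anisotropy 0
print(eigen_discriminators(np.eye(3)))
print(rvi(1.0, 0.25, 1.0))
```

High entropy and a high pedestal both indicate depolarized, volume-like scattering, which is why such discriminators are useful for separating vegetation from bare or salt-crusted surfaces.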

Construction of WFS-SVM Classification Model
It is imperative to select the most discriminative and informative polarimetric features and to construct the optimal classification scheme in order to acquire a better classification accuracy and extract ground object information more precisely. Classification results obtained by using a single feature may not be sufficient. In our study, multiple polarimetric parameters and discriminators were extracted for optimal classification. However, it is not reasonable to adopt all the polarimetric features for classifying and monitoring soil salinization, because some of the features are redundant or noisy. Although the PolSAR data were denoised through multi-looking and filtering, the image still retained a visible amount of speckle noise; some of the decomposed elements that contain much noise might hamper and even reduce the classification accuracy. It is also worth noting that employing all these polarimetric parameters and elements is time-consuming, especially because the SVM classifier requires significant computing resources to run [20]. For these reasons, a WFS-SVM model was constructed and implemented in order to make full use of the discriminative power offered by all these polarimetric features of PALSAR-2 data, to reduce redundant information, increase classification accuracy, and minimize classification time.
These polarimetric decomposed elements and derived parameters need to be normalized to a common range because their value ranges differ widely. Because the data contain negative values, they were normalized to the range [−1, 1], and the corresponding data format conversion was carried out. Then, the wrapper feature selection method based on a genetic algorithm (the accuracy was tested by 10 runs of a 10-fold cross-validation algorithm) was employed using the Weka 3.6.9® software [82] to select the best subset of polarimetric features, and the optimal subset suited to SVM classification was obtained. This optimal feature subset consisted of 19 polarimetric feature elements, namely T33, C11, Pauli_a, Cloude_T11, Pauli_c, Free3_Vol, Free3_Dbl, Yam3_Odd, Yam4_Vol, Huy_T11, VZ3_Odd, Touzi_alpha_s, Krog_Kd, Neu2_delta_mod, H/A/a_T22, H/A/a_SE, PH, RVI, and SPAN, and it provided an effective data source for the construction of the WFS-SVM classification model.
In order to obtain the optimal SVM and WFS-SVM classifiers, the cross-validation (CV) search procedure provided by the LIBSVM-3.20® software [76] was adopted to search for the optimum pair of penalty factor C and kernel parameter γ. The CV search yielded C_SVM = 2048 and γ_SVM = 0.5 for the optimal SVM classifier, and C_WFS-SVM = 256 and γ_WFS-SVM = 0.25 for the WFS-SVM classification model (the CV accuracy was higher than 90% in both cases). The WFS-SVM classification model proposed in this study was thus constructed.
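The normalization to [−1, 1] can be illustrated with a per-feature min-max rescaling. This is a minimal sketch assuming a simple linear mapping; the exact transformation applied in the study is not specified beyond the target range, and the function name is hypothetical.

```python
import numpy as np

def scale_to_unit_range(features):
    """Linearly rescale each column of an (n_samples, n_features) array to [-1, 1]."""
    features = np.asarray(features, dtype=float)
    lo = features.min(axis=0)
    hi = features.max(axis=0)
    # Constant columns would divide by zero; give them unit width (they map to -1)
    width = np.where(hi > lo, hi - lo, 1.0)
    return 2.0 * (features - lo) / width - 1.0

# Two features with very different ranges, e.g. a dB value and a ratio
raw = np.array([[-20.0, 0.1],
                [-10.0, 0.5],
                [  0.0, 0.9]])
print(scale_to_unit_range(raw))
```

Bringing all features to the same range matters for an RBF kernel, because the kernel depends on Euclidean distances and would otherwise be dominated by the features with the largest numeric spread.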

Soil Salinity Mapping
On the basis of multiple polarimetric decomposition and optimal feature subset, the whole training sample plots were used to train the classifier, and the PALSAR-2 image was classified by adopting the WFS-SVM classification model with optimum parameters. A comparison between the proposed WFS-SVM classification method, traditional SVM method with optimum parameters, and the Wishart supervised classification (which is based on the coherency matrix) was made to test the performance of the proposed method. The three classification methods were implemented and the classification results are shown in Figure 5.
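For context on the Wishart baseline, a coherency-matrix pixel is commonly assigned to the class whose mean matrix V minimizes the distance d(T, V) = ln|V| + Tr(V⁻¹T). The sketch below implements that rule with NumPy; the class centers are made-up examples, the function names are hypothetical, and the iterative refinement of class centers used in practice is omitted.

```python
import numpy as np

def wishart_distance(T, V):
    """Wishart distance ln|V| + Tr(V^-1 T) between a pixel matrix T and a class center V."""
    _, logdet = np.linalg.slogdet(V)                 # numerically stable log-determinant
    return logdet + np.trace(np.linalg.solve(V, T)).real

def wishart_classify(T, class_centers):
    """Index of the class center with the smallest Wishart distance to T."""
    return int(np.argmin([wishart_distance(T, V) for V in class_centers]))

# Two illustrative class centers: weak and strong isotropic scattering
centers = [np.eye(3), 4.0 * np.eye(3)]
print(wishart_classify(np.eye(3), centers))          # weak pixel
print(wishart_classify(4.0 * np.eye(3), centers))    # strong pixel
```

Because this rule uses only the coherency matrix, it cannot draw on the decomposition parameters and discriminators that feed the SVM-based classifiers.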
From the proposed WFS-SVM classification results (Figure 5C), it can be seen that highly salinized soils are mainly distributed in the lower reaches of the Keriya River, in the northeastern part of the study area, and in the southwest part of the Oasis periphery. Moderately salinized soils are mainly distributed in the periphery of the Oasis and in the ecotone between the Oasis and the Taklimakan Desert. Slightly salinized areas are mainly distributed in the transition zone between vegetation and moderately salinized soil. Figure 5A-C shows a comparison among the three classification results. The WFS-SVM model produced the most accurate classification outcome, while the result of the Wishart supervised classification was the worst, with significant confusion between different degrees of salinized soil and barren land. The SVM classifier produced a classification outcome broadly similar to that of WFS-SVM; however, salt-and-pepper noise was lower in the WFS-SVM classification. Compared with the other classification methods, the WFS-SVM classification was superior, especially in extracting different degrees of soil salinization information.

Accuracy Assessment
In order to quantitatively evaluate the effectiveness of the proposed WFS-SVM model for mapping salinized soil, the classification accuracy of the above three classifiers was summarized statistically and with the corresponding confusion matrices. The confusion matrices were generated from the ground truth obtained in the field investigation (see Tables 5-7, respectively). It is evident from the confusion matrices that the proposed WFS-SVM model yielded the highest overall accuracy of 87.57% with a Kappa Coefficient of 0.85; the overall classification accuracies of Wishart and conventional SVM were 73.87% and 83.61%, and their corresponding Kappa Coefficients were 0.68 and 0.80, respectively. The overall classification accuracy of WFS-SVM was thus 13.7 and 3.97 percentage points higher than that of the Wishart supervised and SVM classifications, respectively. In the WFS-SVM classification, there was an obvious increase in the extraction accuracy (producer's accuracy) of the different (i.e., highly, moderately, and slightly) salinized soil classes compared with Wishart and SVM: the extraction accuracy of highly salinized soil rose from 85.8% and 85.73% to 87.79%, that of moderately salinized soil from 84.82% and 81.53% to 90.43%, and the increase was even more significant for slightly salinized soil, rising from 75.51% and 86.89% to 93.96%. Additionally, the confusion between barren land and salinized soil types was remarkably reduced in the WFS-SVM classification model.
Note to the WFS-SVM confusion matrix: Overall Accuracy = 87.572%, Kappa Coefficient = 0.8488. WB, BL, VG, HS, MS, and SS stand for water body, barren land, vegetation, highly salinized soil, moderately salinized soil, and slightly salinized soil. Prod Acc. is the producer's accuracy and User Acc. is the user's accuracy.
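The accuracy measures reported here can all be derived from a confusion matrix. The sketch below shows the standard definitions on a small made-up two-class matrix; the numbers are illustrative only and are not taken from Tables 5-7, and the convention that rows hold the reference classes is an assumption.

```python
import numpy as np

def accuracy_report(cm):
    """Overall accuracy, kappa, producer's and user's accuracy.

    cm[i, j] counts pixels of reference class i that were mapped to class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    diag = np.diag(cm)
    ref_totals = cm.sum(axis=1)          # pixels per reference class
    map_totals = cm.sum(axis=0)          # pixels per mapped class
    overall = diag.sum() / total
    expected = (ref_totals * map_totals).sum() / total ** 2   # chance agreement
    kappa = (overall - expected) / (1.0 - expected)
    return {
        "overall_accuracy": overall,
        "kappa": kappa,
        "producers_accuracy": diag / ref_totals,   # 1 - omission error
        "users_accuracy": diag / map_totals,       # 1 - commission error
    }

example = [[50, 10],
           [5, 35]]
report = accuracy_report(example)
print(round(report["overall_accuracy"], 4), round(report["kappa"], 4))
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.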

Discussion
In this study, soil salinization information was extracted by Wishart classification, conventional SVM classification, and WFS-SVM classification. A comparison among these classification methods showed that Wishart supervised classification of the PALSAR-2 coherency matrix, which was not subject to polarization decomposition, produced a low accuracy for the extraction of salinized soil information. The reason for this might be that the polarimetric characteristics of the PolSAR data were not fully exploited, as well as being partly due to the deficiency in generalizing and optimizing ability of the Wishart classification that is based on the maximum likelihood criterion compared with the SVM classification.
With regard to the extraction accuracy of soil salinization and overall classification accuracy, the proposed WFS-SVM model was obviously superior to the SVM classification method for the reason that the optimal polarimetric feature subset of WFS-SVM was obtained with the WFS method, allowing for the classifier trained with the optimal subset. In this way, the redundancy of polarization information was reduced and the rich polarization information of PolSAR data was comprehensively integrated. In addition, the inherent speckle noise in the SAR data was suppressed further. As can be observed from Figure 4, noise in RGB composite images of some polarization decomposition including Krogager, Cloude, H/A/Alpha, Holm2, and Huynen is more evident than that of other decomposition methods due to the retained speckle noise in some of the polarized feature elements of those images.
Through constructing and implementing the WFS-SVM classification model, the polarized feature elements with severe speckle noise were eliminated (e.g., Krog_Kh, Cloude_T33, H/A/a_T33, Neu2_delta_pha, Bar2_T11, Huy_T33, Hol2_T33, Touzi_tau_m3, Touzi_alpha_s3, Touzi_phi_s3, etc.). Figure 6 shows that the noise was very obvious in some of the polarized features. As a result, the inherent speckle noise problem in SAR data was overcome to some extent. Furthermore, some of the feature elements of the optimal feature subset selected through the WFS process (Krog_Kd, Pauli_a, Pauli_c, Free3_Vol, Free3_Dbl, Yam3_Odd, Yam4_Vol, VZ3_Odd, H/A/a_SE, etc.) have a clear physical meaning and directly or indirectly reflect the differences in scattering mechanism among different land objects. They are helpful for distinguishing different ground objects, and particularly useful for discriminating between high, moderate, and slight salinization. Additionally, SAR discriminators included in the classification, such as RVI and pedestal height, help improve the extraction accuracy of vegetation. As can be seen from the confusion matrices (Tables 5-7), the extraction accuracy of vegetation was improved significantly in the SVM and WFS-SVM classifications, leading to a reduction of the confusion between vegetation and slightly salinized soil. Finally, the mixing of highly salinized soil with barren land, which arises from their similar land surface characteristics (similar soil roughness and soil moisture, with almost no vegetation coverage), was greatly reduced with the WFS-SVM approach, which contributed to the improvement of the overall classification accuracy. The WFS-SVM classification model based on polarization decomposition improved the mapping and monitoring accuracy of soil salinization in arid areas.
Although the WFS-SVM methodology proposed in this study demonstrated great potential, WFS feature subset selection preserves only part of the polarimetric features of the PolSAR data, at the cost of unavoidably losing some other useful polarization information. Therefore, care should be taken in selecting the most suitable feature subset for each study site. In addition, the physical mechanisms behind the polarimetric feature parameters obtained by different polarimetric decompositions of PALSAR-2 data, and their quantitative relationship with the degree of soil salinity, require further study. Lastly, despite the advantages demonstrated under these specific arid environmental conditions, the potential of this methodology needs further investigation and validation in other areas.

Conclusions
This study developed a method that combines the strengths of polarimetric decomposition, optimal feature subset selection (wrapper feature selector, WFS), and support vector machine (SVM) algorithms for the classification and extraction of salinized soils using quad-polarized PALSAR-2 images. Multiple polarimetric decomposition methods (Pauli, Freeman, Barnes, Holm, Cloude, Huynen, Yamaguchi, VanZyl, Krogager, Touzi, Neumann, H/A/Alpha) were integrated to extract polarimetric parameters and SAR discriminators related to the physical scattering mechanisms of salinized soil. A WFS based on genetic algorithms was adopted to select the best feature subset for SVM classification; a cross-validation method was employed to identify the optimum classification parameters and obtain an optimal SVM classification model. A WFS-SVM classification model was then developed and further optimized based on the optimal match of polarimetric features and optimum classification parameters.
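The wrapper idea summarized above can be sketched as a small genetic algorithm whose fitness is the cross-validated accuracy of a classifier trained on each candidate feature subset. This is a dependency-free illustration, not the implementation used in the study: a nearest-centroid classifier stands in for the SVM, and all function names, the population size, the mutation rate, and the fold count are hypothetical choices.

```python
import random

def subset_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Accuracy of a nearest-centroid classifier (stand-in for the SVM)
    that only sees the features switched on in `mask`."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    sums, counts = {}, {}
    for row, label in zip(X_tr, y_tr):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    cents = {c: [s / counts[c] for s in sums[c]] for c in sums}
    def dist(row, c):
        return sum((row[j] - cents[c][k]) ** 2 for k, j in enumerate(cols))
    hits = sum(min(cents, key=lambda c: dist(row, c)) == label
               for row, label in zip(X_te, y_te))
    return hits / len(y_te)

def cv_fitness(X, y, mask, folds=3):
    """Wrapper fitness: mean accuracy over interleaved k-fold splits."""
    n = len(y)
    scores = []
    for f in range(folds):
        test = set(range(f, n, folds))
        tr = [i for i in range(n) if i not in test]
        te = sorted(test)
        scores.append(subset_accuracy([X[i] for i in tr], [y[i] for i in tr],
                                      [X[i] for i in te], [y[i] for i in te],
                                      mask))
    return sum(scores) / folds

def ga_select(X, y, generations=20, pop_size=12, mutation_rate=0.1, seed=0):
    """Genetic search over boolean feature masks (requires >= 2 features)."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    # The full feature set is included as a baseline individual.
    pop = [[True] * n_feat]
    pop += [[rng.random() < 0.5 for _ in range(n_feat)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: cv_fitness(X, y, m), reverse=True)
        parents = ranked[:pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability.
            children.append([(not g) if rng.random() < mutation_rate else g
                             for g in child])
        pop = parents + children
    return max(pop, key=lambda m: cv_fitness(X, y, m))
```

Because the full feature set is seeded into the initial population and the best individuals are always retained, the returned subset never scores worse in cross-validation than using all features. In a real workflow, the fitness function would wrap an SVM whose kernel and penalty parameters are tuned by the same cross-validation.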
The major conclusions of our work are as follows: (1) the optimal feature subset selected from the PALSAR-2 data over the study area by the WFS algorithm consisted of 19 polarimetric feature elements: T33, C11, Pauli_a, Cloude_T11, Pauli_c, Free3_Vol, Free3_Dbl, Yam3_Odd, Yam4_Vol, Huy_T11, VZ3_Odd, Touzi_alpha_s, Krog_Kd, Neu2_delta_mod, H/A/a_T22, H/A/a_SE, SPAN, Pedestal Height (PH), and Radar Vegetation Index (RVI); (2) compared with the Wishart supervised and conventional SVM classifications, the proposed WFS-SVM classification approach achieved a significant improvement, with the best overall accuracy of 87.57% and a kappa coefficient of 0.85, while those of the Wishart and SVM classifications were 73.87% (kappa: 0.68) and 83.61% (kappa: 0.80), respectively; (3) more importantly, the extraction accuracy for the different degrees of soil salinization was also increased with the proposed methodology: the extraction accuracy of highly salinized soil reached 87.79%, compared with 85.8% and 85.73% for the Wishart and SVM classifications, respectively; that of moderately salinized soil increased from 84.82% (Wishart) and 81.53% (SVM) to 90.43%; and the improvement was even more evident for slightly salinized soil, which increased from 75.51% (Wishart) and 86.89% (SVM) to 93.96%. Additionally, the confusion of barren land with salinized soil types was reduced with the WFS-SVM classification approach.
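For readers who wish to reproduce the accuracy measures reported above, overall accuracy and the kappa coefficient can both be computed from a confusion matrix. The following is a minimal sketch using the standard definitions; the function name is hypothetical and the example matrix in the usage note is illustrative only, not taken from Tables 5-7.

```python
def overall_accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion
    matrix given as a list of rows (one axis reference, the other
    classified; kappa is symmetric in the two)."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(n_classes)) / total
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(cm[i][j] for i in range(n_classes))
               for j in range(n_classes)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For a hypothetical two-class matrix `[[45, 5], [10, 40]]`, the observed agreement is 0.85 and the chance agreement is 0.50, giving a kappa of 0.70. Kappa is lower than overall accuracy because it discounts the agreement expected by chance.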
The WFS-SVM approach developed in this work not only made good use of the rich polarization information of PolSAR data, but also minimized its redundancy and suppressed the speckle noise in the SAR data. In particular, this method significantly improved the accuracy of mapping the extent and degree of soil salinization. The classification results demonstrated the potential of the proposed WFS-SVM approach for monitoring and mapping soil salinization using PolSAR data in arid and semi-arid areas.