Lithology Discrimination Using Sentinel-1 Dual-Pol Data and SRTM Data

: Compared to various optical remote sensing data, studies on the performance of dual-pol Synthetic aperture radar (SAR) on lithology discrimination are scarce. This study aimed at using Sentinel-1 data to distinguish dolomite, andesite, limestone, sandstone, and granite rock types. The backscatter coefﬁcients VV and VH, the ratio VV–VH; the decomposition parameters Entropy, Anisotropy, and Alpha were ﬁrstly derived and the Kruskal–Wallis rank sum test was then applied to these polarimetric derived matrices to assess the signiﬁcance of statistical differences among different rocks. Further, the corresponding gray-level co-occurrence matrices (GLCM) features were calculated. To reduce the redundancy and data dimension, the principal component analysis (PCA) was carried out on the GLCM features. Due to the limited rock samples, before the lithology discrimination, the input variables were selected. Several classiﬁers were then used for lithology discrimination. The discrimination models were evaluated by overall accuracy, confusion matrices, and the area under the curve-receiver operating characteristics (AUC-ROC). Results show that (1) the statistical differences of the polarimetric derived matrices (backscatter coefﬁcients, ratio, and decomposition parameters) among different rocks was insigniﬁcant; (2) texture information derived from Sentinel-1 had great potential for lithology discrimination; (3) partial least square discrimination analysis (PLSDA) had the highest overall accuracy (0.444) among the classiﬁcation models; (4) though the overall accuracy is unsatisfactory, according to the AUC-ROC and confusion matrices, the predictive ability of PLSDA model for limestone is high with an AUC value of 0.8017, followed by dolomite with an AUC value of 0.7204. From the results, we suggest that the dual-pol Sentinel-1 data are able to correctly distinguish speciﬁc rocks and has the potential to capture the variation of different rocks.


Introduction
Lithology identification is essential for geological investigation and mineral resource exploration. Accurate information on lithology provides a profound basis to trace the process of planetary formation [1]. The traditional way to discriminate lithology is usually based on field investigation and laboratory experiments, which is expensive and timeconsuming. Remote sensing technology makes it possible to identify lithology on a large scale efficiently and timely, especially when the terrain is rugged and the topography is complex [2,3].
The performance of remote sensing on lithology classification has been widely explored and most of the studies concentrated on the use of spectral features. The spectra of rocks and minerals usually have special absorption features and these features can be due to various physical and chemical properties like the size of grains in rock, the composition, texture, element content, etc. [4,5]. Lyon [6] found that the depth of the spectra could be affected by the grain size in igneous rock and the spectral features caused by Si-O bond vibration were influenced by the silica content. Kahle [7] utilized the multispectral thermal infrared (TIR) data obtained by airborne NASA's (National Aeronautics and Space between cost and spatial-temporal resolution offers dual polarization (VV and VH) data. The potential of dual-pol Sentinel-1 data for lithology mapping or discrimination still needs further study.
Among the discrimination or classification methods, pixel-based methods are a common technique used for mapping or classifying lithological units [36]. The pixel-based classifiers consist of parametric (maximum likelihood) and non-parametric (SVM, neural networks, and tree-based) classifiers. Contrary to the parametric classifiers, the normally distributed data are not required by the non-parametric classifiers. Due to the fact that remote sensing data cannot provide data with normal distribution in most cases, the nonparametric classifiers are more preferred [22]. Partial least square discriminant analysis (PLSDA), as a non-parametric machine learning tool, combines the PLS regression and classification techniques. Unlike the neural network classifiers, partial least square (PLS)can still show robust and effective results with a small sample size [22,37]. Whereas PLSDA has shown great potential in many fields like bioinformatics, agriculture, and chemometrics, the usage of PLSDA in lithology discrimination is rarely mentioned.

Study Area and Field Samples
The study area is located in Huludao, a southwest coastal city with many hills in Liaoning Province, China. This region is on the West Block of North China Craton, which is one of the oldest Archean cratons in the world, and experienced widespread tectonothermal activity since the late Mesozoic [38,39]. The various lithological rock outcrops with different characteristics can be found in this area.
Fifty-four rock samples were randomly distributed in Huludao ( Figure 1). GPS positions and physical properties like color, grain size, and texture were recorded. In general, the sample was grouped into five kinds of rock as follows: andesite (n = 17), dolomite (n = 7), granite (n = 9), limestone (n = 15), and sandstone (n = 6). To reduce the effect from the dense vegetation cover with plenty of forest and cropland, most of the samples were located in open and unobstructed places.
Remote Sens. 2021, 13, x FOR PEER REVIEW 3 of 17 advantages for lithology mapping. However, most of the studies used high-cost commercial data, and the SAR data are often full-polarization data (HH, HV, VV, VH). Sentinel-1 as a good trade-off between cost and spatial-temporal resolution offers dual polarization (VV and VH) data. The potential of dual-pol Sentinel-1 data for lithology mapping or discrimination still needs further study. Among the discrimination or classification methods, pixel-based methods are a common technique used for mapping or classifying lithological units [36]. The pixel-based classifiers consist of parametric (maximum likelihood) and non-parametric (SVM, neural networks, and tree-based) classifiers. Contrary to the parametric classifiers, the normally distributed data are not required by the non-parametric classifiers. Due to the fact that remote sensing data cannot provide data with normal distribution in most cases, the nonparametric classifiers are more preferred [22]. Partial least square discriminant analysis (PLSDA), as a non-parametric machine learning tool, combines the PLS regression and classification techniques. Unlike the neural network classifiers, partial least square (PLS) can still show robust and effective results with a small sample size [22,37]. Whereas PLSDA has shown great potential in many fields like bioinformatics, agriculture, and chemometrics, the usage of PLSDA in lithology discrimination is rarely mentioned.

Study Area and Field Samples
The study area is located in Huludao, a southwest coastal city with many hills in Liaoning Province, China. This region is on the West Block of North China Craton, which is one of the oldest Archean cratons in the world, and experienced widespread tectonothermal activity since the late Mesozoic [38,39]. The various lithological rock outcrops with different characteristics can be found in this area.
Fifty-four rock samples were randomly distributed in Huludao ( Figure 1). GPS positions and physical properties like color, grain size, and texture were recorded. In general, the sample was grouped into five kinds of rock as follows: andesite (n = 17), dolomite (n = 7), granite (n = 9), limestone (n = 15), and sandstone (n = 6). To reduce the effect from the dense vegetation cover with plenty of forest and cropland, most of the samples were located in open and unobstructed places.

Sentinel-1 Products and Elevation Data
The Sentinel-1 mission is part of the Copernicus Programme developed by the European Agency, containing Sentinel-1A and Sentinel-1B constellations. Level-1 single-look complex (SLC) and Ground range detected (GRD) Sentinel-1 products providing dualpolarization (VV and VH) data, in Interferometric Wide Swath (IW) mode were utilized in this study. SLC products consist of focused SAR data geo-referenced using orbit and attitude data from satellite and provide in zero-Doppler slant-range geometry. GRD products consist of focused SAR data that have been detected, multi-looked, and projected to ground range using the WGS-84 Earth ellipsoid model. To reduce the effect of vegetation and snow cover, data acquired on 8 October 2020 were selected.
Due to the close relationship between topography, tectonic process, and lithology [40], elevation data were also involved in this study. SRTM (The Shuttle Radar Topography Mission) 1 Arc-Second Global elevation data generated from C-band Spaceborne Imaging Radar and the X-band SAR instrument onboard the space shuttle were used here.

Methodology
The methodology ( Figure 2) mainly consists of 4 parts: (1) preprocessing of Sentinel-1 GRD and SLC data and the extraction of corresponding polarimetric parameters of samples; (2) statistical analysis of polarimetric parameters; (3) discriminant analysis on the sample and cross-validation; (4) accuracy assessment.

Sentinel-1 Products and Elevation data.
The Sentinel-1 mission is part of the Copernicus Programme developed by the European Agency, containing Sentinel-1A and Sentinel-1B constellations. Level-1 single-look complex (SLC) and Ground range detected (GRD) Sentinel-1 products providing dualpolarization (VV and VH) data, in Interferometric Wide Swath (IW) mode were utilized in this study. SLC products consist of focused SAR data geo-referenced using orbit and attitude data from satellite and provide in zero-Doppler slant-range geometry. GRD products consist of focused SAR data that have been detected, multi-looked, and projected to ground range using the WGS-84 Earth ellipsoid model. To reduce the effect of vegetation and snow cover, data acquired on 8 October 2020 were selected.
Due to the close relationship between topography, tectonic process, and lithology [40], elevation data were also involved in this study. SRTM (The Shuttle Radar Topography Mission) 1 Arc-Second Global elevation data generated from C-band Spaceborne Imaging Radar and the X-band SAR instrument onboard the space shuttle were used here.

Methodology
The methodology ( Figure 2) mainly consists of 4 parts: (1) preprocessing of Sentinel-1 GRD and SLC data and the extraction of corresponding polarimetric parameters of samples; (2) statistical analysis of polarimetric parameters; (3) discriminant analysis on the sample and cross-validation; (4) accuracy assessment.

Sentinel-1 Preprocessing
The preprocessing for Sentinel-GRD data was similar as described in [41]. As such, the orbit information was firstly updated to obtain the precise position of the satellite. Radio-Remote Sens. 2021, 13, 1280 5 of 17 metric calibration was then done to convert the digital value to radiometrically calibrated SAR backscatter. The images were cropped to reduce image size and processing time. The speckle noise generated from the interference of waves from elementary scatterers was filtered by the refined Lee filter, as it can preserve point target and texture information [42]. Range Doppler terrain correction compensates the geometry distortion caused by sidelooking working mode and topographical variations. The pixel values were converted to a log scale in decibels (dB). To obtain more information, the ratio (represented as VV-VH) between the polarization VV and VH was also calculated.
The preprocessing for SLC data was also started from orbit file correction. The three subswaths (IW1, IW2, and IW3) acquired by the TOPSAR technique in IW mode were split. Each subswath can be processed independently and therefore only the slice that contains samples was processed. Then radiometric correction was applied followed by a deburst operation which removes the black-fill demarcations between the nine burst in the azimuth direction to get a seamless image. Speckle filters, terrain correction, and subset come next as the operation is done for the GRD data. H/ /A polarimetric decomposition helps extract scattering mechanisms in this study. Entropy (H), scattering angle (α), and anisotropy (A) were calculated based on the eigenvalues and eigenvectors of the polarimetric coherency matrix. The H plane was linearly separated into nine zones to determine the nine different scattering mechanisms [43]. The H ranges from 0 to 1, indicating the randomness of the scattering. The α represents the scattering-type such as surface scatter, dipole scatter, and multiple scatter. The anisotropy (A), as a complementary parameter to entropy, can provide information about surface roughness [44].
Meanwhile, textural information and textural characteristics obtained from SAR images have also been proved to be useful in the recognition of lithological units [27,34]. The backscatter coefficients VV and VH were chosen for grey level co-occurrence matrix (GLCM) [45] analysis. The contrast features (contrast, dissimilarity, and homogeneity), orderliness features (angular second moment, maximum probability, energy, and entropy), and statistics features (mean, variance, and correlation) were calculated respectively for VV and VH. Thus, there were 20 features in total that can be achieved based on the GRD data. Compared with the number of available samples, a principal component analysis (PCA) was applied to these features to reduce the feature number and redundancy.
All the preprocessing procedures for both GRD and SLC products were conducted by the Sentinel's Application Platform (SNAP) toolbox from ESA (European Space Agency). After the preprocessing, the pixel values of the backscatter coefficient VV, VH, ratio (VV-VH), entropy, alpha, anisotropy, the corresponding elevation values from SRTM, and GLCM features were extracted using the position of samples. The features used in this study were illustrated in Table 1. Table 1. Summary of features used for lithology discrimination.

Statistical Analysis of Polarimetric Parameters and Backscatter Coefficients
Before the discriminant analysis, the statistical analysis of the polarimetric derived matrices of different kinds of rock was performed in Matlab 2020a. A Kruskal-Wallis rank sum test [46] was firstly carried out on the rock samples to see the difference between the polarimetric features and whether the difference can differentiate various rock groups significantly. The Kruskal-Wallis rank sum test is a nonparametric test that does not assume a normal distribution of the data. The test helps a lot when comparing the difference Remote Sens. 2021, 13, 1280 6 of 17 between more than two groups by assessing the statistical differences of the sample means among the rock groups. Moreover, box-plots of different polarimetric parameters derived from Sentinel-1 data were also plotted for further analysis.

Discriminant Analysis and Cross-Validation
The 12 variables (Table 1) used for lithology discrimination were measured in different scales and units, a centroid centering and scaling step was conducted for each variable. Due to the complexity (dense vegetation cover and structure) of the study area, the sample size is limited. To avoid overfitting of the model, the input variables with useful information and little redundancy were attempted to be selected for further model building. It is worth noting that the discriminative information of different rocks can be subtle and be removed as noise. We carried out the variable selection via the toolbox PLS v8.9 in Matlab from Eigenvector Research, Incorporated [47]. The importance of all the variables was first measured in the XGBoost module. The gain of each variable was summed up on each node when we made an XGBoost analysis and the gain refers to the reduction in the loss function being optimized. Simultaneously, the variable importance in projection (VIP) which was widely used for variable selection in a PLS model was also conducted. Normally, the bigger the VIP score, the more important the variable across all the components further used for the model. The union of the variables selected by these two ways was used as the final input variable set.
PLS aims to find a hyper-plane in multi-dimensional space to divide space into multiregions [48]. Similarly, like the PCA, PLSDA also attempts to transform the data to a lower-dimensional space with a small error while dealing with colinearity in between the input variables. One difference between PLSDA and PCA is that PLSDA achieves dimensionality-reduction with full awareness of class labels. The response matrix y is recorded as a dummy block matrix that records the membership of each sample. Both the response matrix and the independent block x are involved to find the latent variables (LV) which can best describe y matrix [49]. Given the limited sample size, PLSDA implemented via the toolbox PLS v8.9 was selected to discriminate the rock in this study. In theory, the number of predictor variables for the model is fewer than the number of LV to reduce the chance of overfitting [50]. The predictor variables of the model were optimized using cross-validation iteratively and the LVs were adding to the model one by one until the cross-validation error rate stopped decreasing [51]. In this study, the sample was first sorted according to the rock types and a 10-fold cross-validation was utilized to guarantee the rock type proportion is preserved in both the training and the validation data. Note that 10% of the samples were treated as validation data and the other 90% of the data were treated as training data each time.
To find a suitable discrimination model, in addition to XGBoost analysis and PLSDA, classification learners implemented in Matlab 2020a were also used for the lithology discrimination. Cross-validation was set to be 10-fold and other parameters use default values. The classification learners included discriminant analysis classifiers, Naïve Bayes classifiers, support vector machines, nearest neighbor classifiers, decision trees classifiers, and ensemble classifiers.

Accuracy Assessment
The accuracy of the model was assessed by the confusion matrix and the AUC-ROC (area under the receiver operating characteristics curve) curve. The confusion matrix is a specific table that contains the performance information of the model [52]. The confusion between different rock classes was demonstrated and the accuracy from different viewpoints was calculated such as the overall accuracy, kappa coefficient, and F1-score.
AUC-ROC curve is another way to score the performance of the binary classification. ROC is the probability curve and AUC shows the separability between the classes. The ROC curve is plotted with the true positive rate (TPR, also called sensitivity) as x-axis and false positive rate (FPR, also represented as 1-specificity) as y-axis. AUC takes values from Remote Sens. 2021, 13, 1280 7 of 17 0 to 1, where a value of 0 means a totally inaccurate test and a value of 1 reflects a perfectly accurate test [53]. When AUC is 0.5, it means the model has no class separation capacity. The higher the AUC usually means the better the model can be classed from other classes. AUC-ROC offers a more reasonable measurement of the model performance when the class is imbalanced than the overall accuracy [54].

Results
After the extraction of corresponding features, the statistical boxplots were demonstrated in Figure 3, which presents the variation in backscatter coefficients (VV, VH, VV-VH) and decomposition parameters (H/α/A) derived from different rock types. In general, the mean value of backscatter coefficient VH for dolomite, andesite, and granite varies around −17 db. Sandstone has a relatively lower mean VH value (−18.3 db) and limestone has a slightly higher mean VH around −16 db. The mean VV of limestone (−9.2 db) shows an obviously higher value than that of andesite (−10.7 db), sandstone (−11.1 db), and granite (−11.2 db), whilst the mean VV of dolomite locates around −10 db. From Figure 3a, we can also notice that granite has a lower mean (VV-VH) value at around 5.6db than that of the other 4 lithological classes. The differences of the mean VV-VH among dolomite, andesite, limestone, and sandstone are unobvious. Whereas for the decomposition parameters, the variations of mean H and A for dolomite, andesite, limestone, granite, and sandstone are small, and H and A values are around 0.65 and 0.6, respectively. As for the Alpha, limestone has a substantial-high mean value at around 21 degrees compared to that of andesite and sandstone (around 19 degrees). Dolomite and granite almost have a same mean Alpha (around 20 degrees). According to Figure 3b and [43], the majority of the rock samples have a medium entropy surface scattering with the mean alpha values clustered around 20 degrees and entropy values clustered around 0.7. 0 to 1, where a value of 0 means a totally inaccurate test and a value of 1 reflects a perfec accurate test [53]. When AUC is 0.5, it means the model has no class separation capaci The higher the AUC usually means the better the model can be classed from other class AUC-ROC offers a more reasonable measurement of the model performance when class is imbalanced than the overall accuracy [54].

Results
After the extraction of corresponding features, the statistical boxplots were demo strated in Figure 3, which presents the variation in backscatter coefficients (VV, VH, V VH) and decomposition parameters (H/α/A) derived from different rock types. In g eral, the mean value of backscatter coefficient VH for dolomite, andesite, and granite v ies around -17db. Sandstone has a relatively lower mean VH value (-18.3db) and limesto has a slightly higher mean VH around -16db. The mean VV of limestone (-9.2db) sho an obviously higher value than that of andesite (-10.7db), sandstone (-11.1db), and gran (-11.2db), whilst the mean VV of dolomite locates around -10db. From Figure3(a), we c also notice that granite has a lower mean (VV-VH) value at around 5.6db than that of other 4 lithological classes. The differences of the mean VV-VH among dolomite, andes limestone, and sandstone are unobvious. Whereas for the decomposition parameters, variations of mean H and A for dolomite, andesite, limestone, granite, and sandstone small, and H and A values are around 0.65 and 0.6, respectively. As for the Alpha, lim stone has a substantial-high mean value at around 21 degrees compared to that of andes and sandstone (around 19 degrees). Dolomite and granite almost have a same mean Alp (around 20 degrees). According to Figure3(b) and [43], the majority of the rock samp have a medium entropy surface scattering with the mean alpha values clustered arou 20 degrees and entropy values clustered around 0.7.   Though differences can be seen from the boxplots, the result of Kruskal-Wallis r sum test (Table 2) reveals a statistical insignificance difference among the rock sam based on the backscatter coefficients and decomposition parameters, with all the p va larger than 0.05 at 95% confidence level. Thus, a single backscatter coefficient (VV/VH/ VH) or decomposition parameter (H/A/Alpha) cannot be treated as a solid discrimin among the 5 lithological classes. Multiple sources of information should be conside together to discriminate lithology.  Figure 4a, it can be seen that the GLCM component4 (11), GLCM component1 and the elevation data (7) contribute most of the information, followed by GLCM com nent3 (10), the GLCM component5 (12), and Alpha (3). While from Figure 4(b), except A (2), all the other variables provide more or less information for discrimination w using XGBoost analysis. The elevation data (7), VV-VH (6), and the GLCM compone (12) therein show great importance, followed by VV (5), H (1). To heed both, 10 varia (1, 3, 4, 5, 6, 7, 8, 10, 11, and 12) were selected. These variables were used as input for l discrimination analysis. Though differences can be seen from the boxplots, the result of Kruskal-Wallis rank sum test (Table 2) reveals a statistical insignificance difference among the rock samples based on the backscatter coefficients and decomposition parameters, with all the p values larger than 0.05 at 95% confidence level. Thus, a single backscatter coefficient (VV/VH/VV-VH) or decomposition parameter (H/A/Alpha) cannot be treated as a solid discriminator among the 5 lithological classes. Multiple sources of information should be considered together to discriminate lithology. In total, there were 12 variables (Table 1) considered for lithology discrimination in this study. To avoid redundancy and colinearity, the variable selection was carried out. From Figure 4a, it can be seen that the GLCM component4 (11), GLCM component1 (8), and the elevation data (7) contribute most of the information, followed by GLCM component3 (10), the GLCM component5 (12), and Alpha (3). While from Figure 4b, except for A (2), all the other variables provide more or less information for discrimination when using XGBoost analysis. The elevation data (7), VV-VH (6), and the GLCM component5 (12) therein show great importance, followed by VV (5), H (1). To heed both, 10 variables (1, 3, 4, 5, 6, 7, 8, 10, 11, and 12) were selected. These variables were used as input for later discrimination analysis.   Table  1).
The overall accuracy of different classification learners is shown in Figure 5. In light of the variable selection methods used here are embedded type selection, which works well with a particular learning process, the total variables before selection were also used as a comparison. From Figure 5, except for fine Gaussian SVM, cubic SVM, bagged tree, and subspace discriminant, the other classification learners showed higher or comparable results using selected variables compared to using total variables. Only results related to  Table 1).
The overall accuracy of different classification learners is shown in Figure 5. In light of the variable selection methods used here are embedded type selection, which works well with a particular learning process, the total variables before selection were also used as a comparison. From Figure 5, except for fine Gaussian SVM, cubic SVM, bagged tree, and subspace discriminant, the other classification learners showed higher or comparable results using selected variables compared to using total variables. Only results related to PLSDA with selected variables, which yields the highest overall accuracy (0.444), were discussed in this section. PLSDA with selected variables, which yields the highest overall accuracy (0.444), were discussed in this section. Both estimated and cross-validated ROC curves used for assessing the specificity and sensitivity for different rock classes using PLSDA are illustrated in Figure 6. Plots on the left and right describe the same thing in different ways. Each plot on the right shows the 1-specificity versus the sensitivity as a function of the selected threshold, while the plot on the left demonstrates the change of specificity and sensitivity with various thresholds. The corresponding AUC values for each rock class are listed in the ROC graphs. According to [53], AUC with a value of 0.7 to 0.8 indicates the predictive ability of the models is acceptable, AUC with a value of 0.8 to 0.9 indicates the predictive ability is excellent and AUC bigger than 0.9 means the predictive ability is outstanding. As can be seen from Figure 6, the predictive ability of the model for limestone is excellent, with an AUC larger than 0.8. The predictive capability for dolomite is acceptable, with an AUC larger than 0.7. As for the andesite, sandstone, and granite, models are a little overfitting and models are with poor predictive capability (AUC ranges from 0.5 to 0.7). Both estimated and cross-validated ROC curves used for assessing the specificity and sensitivity for different rock classes using PLSDA are illustrated in Figure 6. Plots on the left and right describe the same thing in different ways. Each plot on the right shows the 1-specificity versus the sensitivity as a function of the selected threshold, while the plot on the left demonstrates the change of specificity and sensitivity with various thresholds. The corresponding AUC values for each rock class are listed in the ROC graphs. According to [53], AUC with a value of 0.7 to 0.8 indicates the predictive ability of the models is acceptable, AUC with a value of 0.8 to 0.9 indicates the predictive ability is excellent and AUC bigger than 0.9 means the predictive ability is outstanding. As can be seen from Figure 6, the predictive ability of the model for limestone is excellent, with an AUC larger than 0.8. The predictive capability for dolomite is acceptable, with an AUC larger than 0.7. As for the andesite, sandstone, and granite, models are a little overfitting and models are with poor predictive capability (AUC ranges from 0.5 to 0.7).
With respect to the confusion matrix (Table 3), it is clear that the model shows a 'fair' agreement (kappa = 0.249) among all the rock classes [55]. F1 scores were calculated as a harmonic mean of user's accuracy and producer's accuracy (not listed here). F1 score ranges from 0.125 for granite to 0.629 for limestone. Accordingly, limestone was reliably discriminated, which is consistent with the result from Figure 6. Andesite and granite were severely mixed with other rock classes. While some degree of moderate mixing exists in dolomite and sandstone classes. Remote Sens. 2021, 13, x FOR PEER REVIEW 11 of 17 With respect to the confusion matrix (Table 3), it is clear that the model shows a 'fair' agreement (kappa = 0.249) among all the rock classes [55]. F1 scores were calculated as a harmonic mean of user's accuracy and producer's accuracy (not listed here). F1 score ranges from 0.125 for granite to 0.629 for limestone. Accordingly, limestone was reliably discriminated, which is consistent with the result from Figure 6. Andesite and granite were severely mixed with other rock classes. While some degree of moderate mixing exists in dolomite and sandstone classes.

Discussion
The main aim of this study was to assess the capacity of dual-pol Sentinel-1 SAR data for lithology discrimination in dolomite, andesite, limestone, sandstone, and granite rock classes. The backscatter coefficients, decomposition parameters, and GLCM information of all rock classes were first derived from SAR data. These SAR derived features were then analyzed via Kruskal-Wallis rank sum test to investigate the significance of the difference among rock classes. Furthermore, the derived features together with elevation data were used as input data in different discrimination models for classifying lithology. The important findings are discussed below.
Backscatter coefficients have been widely used in agriculture, environment, and geology, due to the sensitivity to ground surface physical properties such as surface roughness, dielectric constant, soil moisture, and grain size [56,57]. However, publication investigating specific rock class instead of rock assemblage (lithology unit) classification based on SAR data is scant, and the lithology unit classification usually refers to special stratigraphic relations. While the main landcover in our study area is vegetation and city, it is hard to find large continuous outcrops. Only uniform outcrops with relatively large areas are considered here and the collected samples were assigned to the dominant specific rock class. Thus, it is difficult to establish a discussion with our results. According to the field observations, one possible reason for the low accuracy may be the various grain sizes which can directly lead to different surface roughness [58] and texture. After checking the physical properties of the rock samples, it can be found that the grain size of granite varies from medium fine-grained to coarse-grained and the grain size of sandstone varies from silt to medium-grained. Various grain sizes then lead to diverse backscatter behaviors and the given C-band Sentinel-1 data may show different sensitivities to different particle sizes [59]. Different grain sizes also contribute to different surface textures and further affect the dielectric constant [60], which may explain why the GLCM components show great importance in lithology discrimination (Figure 4).
Apart from the grain size, other physical properties (Figure 7) like dielectric constant and magnetic susceptibility were also measured in the lab to help interpret the results. In general, the dielectric constant deviations of all rock classes are at a similar level (Figure 7a), while the deviation of magnetic susceptibility among different rock classes varies a lot. As can be seen from Figure 7b, compared to dolomite, limestone, and sandstone, the magnetic susceptibility of andesite and granite have bigger standard deviations, which may be caused by the various concentration of ferro-oxides. This might be another reason for the low discrimination accuracy for andesite and granite. According to [61], radar signal is strongly dependent on the dielectric constant and magnetic susceptibility, and significant scattering attenuation can be caused by higher dielectric constant and magnetic susceptibility. Our results show that the granite and andesite samples with slightly higher dielectric constant and magnetic susceptibility have smaller backscatter values compared to limestone with lower dielectric constant and magnetic susceptibility (Figure 3a). From our results (Figure 4), backscatter coefficients and decomposition parameters (variables 1-6) can actually provide useful information for lithology discrimination. However, compared to this information, elevation and texture information shows more importance in lithology discrimination. Contrary to the first three components often used in classification, the other bands obtained from PCA offer much more help. The result from [62] also indicates that the higher-order PCA components contain subtle information about the occurrence of minerals and rocks and enable to enhance lithological unit information. Further, Jakob et al. [25], Amin Beiranvnd Pour, and Hashim [63] also found that lower-level PCA components derived from spectral data can help extract different minerals and rocks. Thus, it is recommended to consider more the components derived from PCA when classifying lithology. The close relationship between lithology and topography is reflected in the fact that on the one hand rock types can influence both ecosystems and erosion rates, and on the other hand ecosystem and erosion rates can also influence how the rock is expressed in the topography. Moreover, the bedding, jointing, and tectonic deformation in the crust varies with different lithologies [64,65]. Hence, elevation data also play an important role in lithology discrimination [21].
The overall accuracy listed in Figure 5 indicates that PLSDA had the best performance when discriminating lithology, followed by SVM. Several previous studies already explored the performance of SVM in lithology mapping [21,22,25,66,67]. The requirement From our results (Figure 4), backscatter coefficients and decomposition parameters (variables 1-6) can actually provide useful information for lithology discrimination. However, compared to this information, elevation and texture information shows more importance in lithology discrimination. Contrary to the first three components often used in classification, the other bands obtained from PCA offer much more help. The result from [62] also indicates that the higher-order PCA components contain subtle information about the occurrence of minerals and rocks and enable to enhance lithological unit information. Further, Jakob et al. [25], Amin Beiranvnd Pour, and Hashim [63] also found that lower-level PCA components derived from spectral data can help extract different minerals and rocks. Thus, it is recommended to consider more the components derived from PCA when classifying lithology. The close relationship between lithology and topography is reflected in the fact that on the one hand rock types can influence both ecosystems and erosion rates, and on the other hand ecosystem and erosion rates can also influence how the rock is expressed in the topography. Moreover, the bedding, jointing, and tectonic deformation in the crust varies with different lithologies [64,65]. Hence, elevation data also play an important role in lithology discrimination [21].
The overall accuracy listed in Figure 5 indicates that PLSDA had the best performance when discriminating lithology, followed by SVM. Several previous studies already explored the performance of SVM in lithology mapping [21,22,25,66,67]. The requirement for input data of SVM is lower in data distribution and data size than that for other methods like maximum likelihood and artificial neural network. In contrast, the investigation of PLSDA in lithology discrimination is scant. After considering the limited and skewed rock samples with different physical or chemical properties (grain size, age, and concentration of metal, etc.), we thought the result from PLSDA is very promising. Our results ( Figure 5) indicate that PLSDA has a better performance than other classifiers. Though the overall accuracy from the confusion matrix (Table 3) is not high, AUC-ROC values in Figure 6 show the discriminability between different rock types of the model, especially the limestone and dolomite types.
Variable selection was carried out to reduce the data redundancy in this study. The comparison between the overall accuracies of models using all variables and selected variables ( Figure 5) indicates that the variable selection does not work for all classifiers. One reason is that the methods used for selection are embedded-type selection and may only work well with a particular learning process. Another reason is the difference among the mechanisms of different classifiers. However, variable selection is still recommended to serve as a baseline for choosing the final input variables when the sample size is limited.
More efforts are needed to put into improving the discrimination accuracy in the future. The balance between spatial resolution and outcrop area needs to be considered carefully. Though we already chose the samples with an outcrop area as big as possible, the backscatter signals can still be affected by the border effects from the surrounding environments. Further studies can be conducted in arid or semi-arid places with plenty of outcrops to exclude the surrounding influence. Also, more samples with various properties should be included to ensure adequate variation and explore the SAR scattering mechanism. The multi-sensor approach can also improve the model accuracy. Thurmond et al. [68] mentioned that merging optical and radar data is an effective way of identifying different lithologies on the basis of their spectral features and surface roughness.
Overall, this study proves that dual-pol Sentinel-1 SAR data actually have potential for lithology discrimination, which is useful in geological investigation and mineral resource exploration.

Conclusions
Remote sensing data have made lithology mapping cost-effective. The use of optical satellite data in lithology mapping has been well explored, while studies investigating SAR data, especially the dual-pol SAR data, on lithology mapping is few. Microwave data can provide structural, surface roughness, and dielectric information particularly. In this study, we attempted to use only dual-pol Sentinel-1 data together with SRTM data to discriminate different rock types.
We combined the backscatter coefficients, decomposition parameters, and texture features derived from Sentinel-1 products with cross-validation discrimination models to distinguish different rock types. The performance of models was then evaluated using confusion matrices and AUC-ROC. We found that PLSDA yields the best performance with an overall accuracy of 0.44. Meanwhile, the results show that the model works well for limestone with the highest F1 score and AUC value (0.629 and 0.80, respectively), followed by dolomite with a F1 score of 0.571 and an AUC of 0.720. Sandstone, andesite, and granite cannot be well classified due to the complex physical and chemical properties. The variable selection can somehow improve the overall accuracy of this study. These results are meaningful for proving the great potential of Sentinel-1 in lithology discrimination and to open up a new way for lithology mapping using remote sensing data.