1. Introduction
Synthetic aperture radar (SAR) can provide high-resolution images of large areas in all weather conditions, day and night. It is therefore widely used in geological exploration, environmental monitoring, and natural disaster assessment [1]. Moreover, compared with traditional single-polarized SAR images, fully polarimetric SAR (Pol-SAR) images, which contain four polarization channels (hh, hv, vh, and vv), provide richer information on land cover characteristics. Pol-SAR provides not only the radar cross section (RCS) but also the phase information of the four channels, which can reveal the scattering mechanism of targets from various aspects, such as structure and surface undulations. Hence, the classification of different lithologies based on Pol-SAR images is becoming an important tool.
In the last few decades, a variety of land cover classification algorithms have been proposed [2]. They can be broadly categorized into supervised, semi-supervised, and unsupervised approaches according to whether manual annotations are used. Kong proposed a maximum-likelihood classification method for Pol-SAR images based on a complex Gaussian distribution [3]. This classifier was applied to single-look Pol-SAR images and was extended to the multi-look case by Lee [4]. In other works, many features that describe targets from various aspects have been extracted from Pol-SAR images, and land cover classification based on such polarimetric features has become an important field. For instance, Cloude and Pottier defined three features, entropy (H), anisotropy (A), and alpha (α), by eigen-decomposition; H and α are widely used for classification [5]. Freeman and Durden [6] proposed a three-component scattering decomposition based on physical models. In this method, the double-bounce, volume, and surface components correspond to scattering from a dihedral corner reflector, scattering from randomly oriented thin cylindrical dipoles, and first-order Bragg scattering, respectively. The Freeman–Durden decomposition is useful for decomposing the scattering mechanism of naturally occurring incoherent scatterers. In Reference [7], Qi integrated polarimetric decomposition and proposed an object-based Pol-SAR image classification method based on a decision tree. Through mathematical and physical analysis of Pol-SAR data, Mahdianpari presented a modified coherency matrix; combining it with a random forest algorithm improved the classification accuracy of wetlands to 92.17% [8].
Recently, deep learning has proven very effective in classification and recognition because it can extract features automatically [9,10]. Pol-SAR image classification based on deep learning has therefore attracted considerable attention [11]. Hansch proposed using complex-valued neural networks to classify land cover in Pol-SAR images, which confirmed that deep learning can be applied to Pol-SAR image classification [12]. Lv et al. used a deep belief network to classify different urban land covers in a Pol-SAR image [13]. In Reference [14], Zhou introduced a convolutional neural network into Pol-SAR image classification. Experimental results with Pol-SAR data indicated that deep learning methods could provide competitive results. Xie demonstrated the suitability of a stacked sparse autoencoder (SSAE) for the classification of Pol-SAR images [15]. However, this method is pixel-based and does not consider local spatial information. To solve this problem, Zhang et al. took all the neighborhood pixels around the cell under test as the input data and then classified the Pol-SAR images with an SSAE [16]. This method integrates the contextual information of the neighborhood, but it has shortcomings in practical applications, especially when the target area is complex. In References [17,18], Hou and Geng proposed improved land cover classification methods based on superpixels to make better use of the local spatial information. In this study, SSAE is selected because it has several advantages over other methods. First, unlike the convolutional neural network, SSAE is pixel-based and can therefore classify any pixel, including boundary pixels. Second, SSAE can easily handle large, multi-dimensional remote sensing datasets, which makes it especially suitable for this study. Finally, it has a flexible and straightforward structure and has shown good results in various remote sensing applications.
When classifying different vegetation covers or urban land covers using Pol-SAR, effective feature extraction is the foundation for good results. However, it is difficult to obtain satisfactory classification features for lithology from single-band Pol-SAR data alone, because lithologies are affected by surface weathering over long periods of time, and the surface characteristics of some classes thus become similar [1]. Dual-frequency Pol-SAR can help to classify different lithologies because the two frequencies differ in wavelength and in their ability to penetrate vegetation cover; the L-frequency has the longer wavelength and the better penetrability. SSAE can extract some features according to its own rules, but it cannot extract all possible features, especially those with specific physical meanings. Combining effective handcrafted features with the raw data may improve the lithology classification performance of SSAE. In this paper, new polarimetric features that describe the surface fluctuations of different lithologies are proposed based on dual-frequency Pol-SAR data, and SSAE is then used to extract deep features and classify the lithologies. To the best of our knowledge, this is the first attempt to classify lithology by combining SSAE and dual-frequency Pol-SAR data.
The paper is organized as follows. Section 2 explains the polarimetric feature extraction and the SSAE, presents the new lithology classification method, and describes the dataset used in the experiments. In Section 3, the proposed method is verified, analyzed, and compared with other methods. A discussion is given in Section 4, and conclusions are drawn in Section 5.
3. Results
In this section, the performance of the proposed method is verified and analyzed.
3.1. Experimental Parameters
The classification performance of our algorithm depends on several parameters: the number of hidden layers and nodes in the SSAE, the number of superpixels, and the number of training pixels. The number of hidden layers and the number of nodes in each layer influence the performance of the SSAE. The superpixels segmented from the Pauli RGB image also affect the classification results. A smaller number of superpixels means that each superpixel contains more pixels, so more contextual information from the neighborhood is considered. However, the classification of boundaries becomes poor when the number of superpixels is too small. Furthermore, the number of training pixels influences the classification accuracy. To analyze the impact of these parameters, experiments are carried out with the Qinghe dataset.
To determine the optimal number of layers, SSAE topologies of increasing depth were trained with the same number of superpixels and training pixels. In this experiment, the SIR-C image was divided into 2200 superpixels, and 10% of the labeled pixels were randomly selected as the training set. The overall accuracy (OA), defined in Equation (24), is calculated to compare the different architectures. The plot in Figure 5 shows that an SSAE with three hidden layers is the most suitable for the dataset, with 60, 80, and 100 nodes in these hidden layers. This configuration was therefore applied in the following experiments. The OA is:
$$\mathrm{OA} = \frac{1}{N}\sum_{i=1}^{r} x_{ii} \qquad (24)$$
where $N$ is the total number of samples, $r$ denotes the number of classes, and $x_{ii}$ are the diagonal elements of the confusion matrix.
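As a minimal sketch of the overall accuracy described above, the following Python function sums the diagonal of a confusion matrix and divides by the total number of samples. The function name `overall_accuracy` and the 3-class matrix are hypothetical and serve only as an illustration; they are not taken from the experiments.

```python
def overall_accuracy(confusion):
    """Return the fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class confusion matrix (rows: reference, columns: predicted).
example_confusion = [
    [50, 2, 3],
    [4, 40, 1],
    [0, 5, 45],
]
example_oa = overall_accuracy(example_confusion)  # 135 correct out of 150 samples
```

In this made-up example, 135 of the 150 samples lie on the diagonal, so the OA is 0.9.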
Next, the optimal number of superpixels for the SIR-C image is determined. The SIR-C image is divided into 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800, 3000, and 3200 superpixels, and the classification performance is analyzed for each setting. As shown in Figure 6, 1800 superpixels is the most favorable setting.
Finally, we validate the performance of our network architecture with different numbers of training pixels. We choose 0.1%, 0.2%, 0.5%, 1%, 2%, 3%, 5%, 10%, 15%, and 20% of the labeled pixels in turn as the training set. Figure 7 shows that, as the proportion of training samples increases from 0.1% to 5%, the OA increases sharply at first and then settles at a stable, high value. Using 5% of the labeled pixels is enough to train the network and can prevent overfitting. Thus, as shown in Table 2, 5% of the labeled pixels are randomly selected as the training dataset, and the remainder are used as the testing dataset.
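The random 5% split can be sketched as follows. This is an illustration under stated assumptions: the function name `split_labeled_pixels`, the fixed seed, and the use of plain (non-stratified) random sampling over pixel indices are all hypothetical choices, since the text only states that the training pixels are selected randomly.

```python
import random


def split_labeled_pixels(labeled_indices, train_fraction=0.05, seed=0):
    """Randomly split labeled pixel indices into a training and a testing set.

    Assumption: simple random sampling without stratification by class.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(labeled_indices)
    rng.shuffle(shuffled)
    n_train = max(1, round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical example: 10000 labeled pixels identified by their index.
train_idx, test_idx = split_labeled_pixels(range(10000), train_fraction=0.05)
```

With 10000 labeled pixels and a fraction of 5%, 500 indices go to the training set and the remaining 9500 to the testing set.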
3.2. Classification Results
In this experiment, the Pol-SAR images are classified using the proposed method.
First, the data are pre-processed. A refined Lee filter with a 9 × 9 window is used to reduce speckle noise. Superpixel segmentation then yields 1800 superpixels; the result is shown in Figure 8. Most pixels belonging to the same superpixel have nearly the same color, which means that they have nearly the same scattering mechanism and can be classified as one group.
Then, polarimetric features are extracted from the Pol-SAR data; the feature maps are shown in Figure 9. With the original features alone, some classes cannot be distinguished, whereas the feature maps of the new features show significant differences between those classes.
Finally, the extracted features are fed into the SSAE, which has three hidden layers with 60, 80, and 100 nodes, respectively. The classified map is shown in Figure 10.
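To make the layer sizes concrete, the following sketch passes one feature vector through a stack of three sigmoid encoder layers with 60, 80, and 100 nodes and evaluates a Kullback–Leibler sparsity penalty of the kind commonly used in sparse autoencoders. It is only an illustration: the input dimension of 18, the random weights, the sigmoid activation, the target sparsity of 0.05, and all function names are assumptions, and neither the training procedure nor the final classifier layer is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Create random weights (n_out x n_in) and zero biases for one encoder layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def encode(x, layers):
    """Pass a feature vector through the stacked encoder layers in order."""
    activation = list(x)
    for weights, biases in layers:
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


def kl_sparsity(mean_activations, rho=0.05):
    """KL divergence between target sparsity rho and each mean hidden activation."""
    return sum(
        rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
        for r in mean_activations
    )


rng = random.Random(0)
input_dim = 18                       # hypothetical number of input features per pixel
sizes = [input_dim, 60, 80, 100]     # three hidden layers, as in the experiments
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
deep_features = encode([rng.random() for _ in range(input_dim)], layers)
```

The output of the last hidden layer, `deep_features`, has 100 entries; in a complete pipeline these deep features would be passed to a classifier.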
To verify the advantages of the new features over the original features (
,
,
, and
(
i = L, C)), comparison experiments and analysis are conducted. In this experiment, we use
, which is defined as follows, to describe the difference between class
i and
j:
where
and
represent the average of class
i and
j with feature
f, respectively.
is a number greater than 1, and it is proportional to the difference between two classes.
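As an illustration only, one measure that matches this description (never smaller than 1, and growing as the class means move apart) is the ratio of the larger class mean to the smaller one. This choice is an assumption and is not necessarily the exact definition used above; the function name `mean_ratio_separability` and the sample values are hypothetical, and the sketch assumes strictly positive feature means.

```python
def mean_ratio_separability(values_i, values_j):
    """Ratio of the larger to the smaller class mean of one feature.

    Assumption: this is an illustrative stand-in for a class-difference
    measure; both class means are required to be strictly positive.
    """
    mean_i = sum(values_i) / len(values_i)
    mean_j = sum(values_j) / len(values_j)
    if mean_i <= 0 or mean_j <= 0:
        raise ValueError("class means must be strictly positive")
    return max(mean_i, mean_j) / min(mean_i, mean_j)


# Hypothetical feature values sampled from two classes.
class_i_values = [2.0, 2.0, 2.0]
class_j_values = [3.0, 3.0, 3.0]
ratio = mean_ratio_separability(class_i_values, class_j_values)
```

The measure is symmetric in the two classes, and identical class means give exactly 1.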
From Figure 3 and Figure 4, we can directly see that the Beitashan Formation (class 2) and the Upper Aermantie Formation (class 5) are difficult to distinguish, and that Diluvium (class 3) is easily confused with Alluvium–Diluvium (class 6). Therefore,
and
are calculated in the regions marked in
Figure 9a with original features and new features.
As can be seen in
Table 3,
and
for the new features are greater than those for the original features, which indicates that the new features are more effective in increasing the class separability than the original features. For example, for the situation that just uses
,
is 1.06. However,
is improved to 1.31 when
is used.
As shown in Figure 10, the proposed method produces a classification map with good connectivity and smooth borders that compares well with the ground truth. Thus, the method classifies lithology well.
Although the proposed method gives favorable lithology classification results, several lithology classes are still confused with one another. The confusion matrix is shown in Table 4.
Diluvium is mostly confused with Alluvium–Diluvium, which lowers the classification accuracy of these classes. This is primarily because both Diluvium and Alluvium–Diluvium have low backscattering power.
3.3. Comparison with Other Classifiers
To validate and test the performance of the proposed method, the SIR-C data of Qinghe, Xinjiang, China, were used. The experimental verification covers two aspects: classification accuracy and computational burden.
3.3.1. Classification Accuracy
To validate the lithology classification accuracy, the proposed method is compared with other methods in this section: three SSAE methods with different inputs, the support vector machine (SVM) classifier, and Hou’s method from Reference [18]. The three SSAE methods are labeled M1, M2, and M3; their inputs are the coherency matrices of the L-frequency, the C-frequency, and the dual-frequency Pol-SAR data, respectively. Both the SVM classifier and Hou’s method use the same features as the proposed method, and superpixel segmentation is also applied in these comparison methods. The SVM is a typical classification method, and Hou’s method is a state-of-the-art method in Pol-SAR image classification. In the SVM experiment, a radial basis function ('rbf') kernel is used with C = 100 and gamma = 0.01. The classification accuracy is reported in Table 5 and the results are plotted in
Figure 11. The OA and the kappa coefficient (Kappa) are calculated to evaluate the classification performance. Kappa is defined as:
$$\mathrm{Kappa} = \frac{N\sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}{N^{2} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}$$
where $N$ is the total number of samples, $r$ denotes the number of classes, $x_{ii}$ are the diagonal elements of the confusion matrix, and $x_{i+}$ and $x_{+i}$ are the numbers of samples in the $i$th row and the $i$th column of the confusion matrix, respectively.
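The kappa coefficient can be computed from a confusion matrix with a few sums, as in the sketch below. It uses the standard Cohen's kappa form built from the quantities named above (total samples, diagonal elements, row totals, and column totals). The function name `kappa_coefficient` and the 3-class matrix are hypothetical and are not taken from Table 4 or Table 5.

```python
def kappa_coefficient(confusion):
    """Compute Cohen's kappa from a square confusion matrix."""
    r = len(confusion)
    n = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(r))
    row_totals = [sum(confusion[i]) for i in range(r)]
    col_totals = [sum(confusion[k][i] for k in range(r)) for i in range(r)]
    # Sum over classes of (row total * column total): the chance-agreement term.
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))
    denominator = n * n - chance
    if denominator == 0:
        raise ValueError("kappa is undefined for this confusion matrix")
    return (n * diagonal - chance) / denominator


# Hypothetical 3-class confusion matrix (rows: reference, columns: predicted).
example_confusion = [
    [50, 2, 3],
    [4, 40, 1],
    [0, 5, 45],
]
example_kappa = kappa_coefficient(example_confusion)
```

For this made-up matrix the OA is 0.9 while Kappa is about 0.85, because Kappa discounts the agreement expected by chance.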
It can be seen directly from Figure 11 that the proposed method improves the classification performance significantly compared with the other five methods. As shown in Table 5, the proposed method reaches a lithology classification accuracy of 98.90% in this experiment and outperforms M1, M2, M3, SVM, and Hou’s method, which have accuracies of 72.99%, 80.55%, 94.70%, 86.69%, and 80.05%, respectively. The improvement is especially clear for classes with high similarity. The results of M1, M2, and M3 show that dual-frequency Pol-SAR data provide more information than single-frequency Pol-SAR data: the accuracy of M3 is higher than that of M1 and M2. The gap between the proposed method and M3 shows that the new features extracted from dual-frequency Pol-SAR data are effective and improve the classification accuracy of lithology. SSAE is more suitable for lithology classification than SVM because it can extract deep features of the input data and classify the lithology with them; the proposed method therefore gives a better result than SVM. The comparison with Hou’s method likewise shows that the algorithm in this paper is more effective for lithology classification. Overall, the proposed method outperforms the reference methods because the new features extracted from dual-frequency Pol-SAR data increase the class separability of the input data and because SSAE is suitable for lithology classification.
3.3.2. Computational Burden
The computational burden of the proposed method is also an important issue, and the execution times of all methods are given in Table 6. Note that the time shown in Table 6 is for the training step, which is the most time-consuming of all the steps. The computational burden of the SVM is low when the number and dimension of samples are small. In this experiment, however, the number of samples is large, so the computational burden of the SVM is heavier than that of the other methods. For the methods using SSAE, the time consumed depends mainly on the input dimension because they share the same number of samples and the same SSAE architecture.
Table 6 shows that the proposed method has an acceptable computational burden and that the methods using SSAE take significantly less time than the method using SVM.
4. Discussion
The objective of this study is to classify lithology using dual-frequency Pol-SAR data. New features are extracted from dual-frequency Pol-SAR data to increase the class separability of the input data, and the SSAE is selected to obtain deep features of the input data for lithology classification. Figure 10 shows that lithology is classified successfully with the proposed method. To verify its performance, several methods, namely M1, M2, M3, SVM, and Hou’s method, were applied to the same dataset. The accuracies of M1 and M2 are 72.99% and 80.55%, respectively; they are low because these methods use only single-frequency Pol-SAR data. The accuracy of Hou’s method is 80.80%. The accuracy of SVM is 86.69%, and SVM consumes more time than the other methods. Both M3 and the proposed method achieve more than 90% classification accuracy. A practical classifier needs both high accuracy and an acceptable running time; the proposed method combines the two and therefore gives the best lithology classification result.
Classification of lithology is important in geological exploration, so the method proposed in this paper could be widely used in the future. If a few labels spread over a specific region can be obtained by manual investigation, this knowledge can be propagated over the entire region. In addition, the network can be trained on one region and applied to another, provided that the data of both regions are acquired under the same radar imaging conditions.
The current paper used SSAE to extract deep features of the input data and classify lithology. In the future, new classifiers can be applied within the proposed method to improve the classification accuracy. Recently, combining the results of different classifiers has been proposed [30]; our future work will investigate this approach, which may improve the results obtained in this paper.