1. Introduction
As an essential component of the Arctic environment and of the global marine environment, sea ice plays a critical role in the weather and global climate system [1]. It not only affects the dynamic conditions and heat exchanges between the ocean and atmosphere but also plays an important role in the climate and marine ecosystem [2,3,4,5]. Over the past three decades, the reduction in sea ice cover has not only had a profound impact on the climate, hydrology, and ecology of the Arctic region [6,7,8,9] but has also, to some extent, lengthened the navigation windows of the Arctic shipping routes, which offer advantages in navigation and time costs, thus leading to an increase in maritime transport in the Arctic region [10,11,12]. However, even in summer, the presence of sea ice increases navigation risks. In this regard, rapid acquisition of marine meteorological information, including sea ice information, is crucial for ensuring the safety of navigation in polar regions. To this end, the Polar Code of the International Maritime Organization (IMO) entered into force on 1 January 2017, under which ships passing through the polar regions must receive the latest ice information, mainly including the type, thickness, and concentration of sea ice [13]. The sea ice type can be defined according to the stage of sea ice development, from smooth nilas ice to deformed and rough new ice, and multi-year ice that has survived through the entire summer.
Remote sensing has become an important technological means for large-scale sea ice monitoring in the Arctic region due to its wide detection range and rapid data acquisition. In particular, synthetic-aperture radar (SAR) has become an indispensable observation system in polar sea ice monitoring because it operates in all weather conditions, day and night. Moreover, as SAR signals of different frequencies differ in their abilities to penetrate into sea ice, multi-band SAR helps to capture complementary information about sea ice [14]. Specifically, L-band (1~2 GHz) SAR has higher penetration into wet snow and sea ice and can provide internal structural information of sea ice, such as thickness, salinity, and the distribution of bubbles [15,16]. As a result, L-band SAR has advantages in identifying the types of melting sea ice but tends to confuse new ice with open water [17]. In contrast, X-band (8~12 GHz) SAR has a small penetration depth and is more sensitive to the increase in sea ice thickness during the early stage of sea ice growth. Thus, X-band SAR can distinguish new ice from multi-year ice but has a poor ability to distinguish gray ice from gray-white ice [18]. Up to now, most of the SAR sensors in service operate at the C-band (4~8 GHz), which lies between the X-band and the L-band. Because of this moderate frequency, the backscattering coefficients of different types of sea ice differ significantly in the C-band. This is why C-band SAR has proved to be the most suitable sensor for polar sea ice type identification, especially for distinguishing ice from open water [19].
In recent decades, many representative semi-automatic and automatic algorithms have emerged and been applied in practice for sea ice classification of SAR images. These approaches include simple backscatter thresholding [20], clustering algorithms [21,22], expert systems [23,24,25], iterative region growing using semantics (IRGS) [26], machine learning (support vector machines, neural networks) [27,28,29,30], and deep learning (CNN) [31,32,33]. Tan [26] proposed a semi-automatic sea ice classification algorithm for Sentinel-1 SAR images, which incorporated feature selection via random forest and iterative region growing using a semantics model to achieve multi-category sea ice classification in the Labrador Sea. Huiying Liu [28] proposed a method for sea ice classification based on the texture features and sea ice concentration of dual-polarization Radarsat-2 ScanSAR images. Six types of sea ice were classified, including open water (OW), new ice (NI), leveled gray ice (LGI), deformed gray ice (DGI), second-year ice (SYI), and multi-year ice (MYI). Bogdanov et al. [30] compared neural networks with other supervised learning algorithms based on linear discriminant analysis (LDA) and used these algorithms to identify six sea ice types from RADARSAT and ERS SAR images of the Kara Sea. In addition, with the successful application of deep learning models in image processing, preliminary explorations of these models have also been conducted in the classification of sea ice [31,32,33]. For instance, Hugo Boulze et al. [31] utilized a convolutional neural network (CNN) to recognize new ice, first-year ice, and multi-year ice based on 255 Sentinel-1 sea ice images interpreted by experts. The recognition performance was better than that of the random forest algorithm, with the overall classification accuracy reaching 91.6%.
The above-mentioned classifiers can improve the classification accuracy of sea ice to a certain extent. However, the whole classification process mainly relies upon one single classifier, rather than combining the advantages of multiple different classifiers. To fully integrate the advantages of different classifiers, ensemble learning has been introduced into remote sensing image classification [34,35,36]. The most prominent characteristic of ensemble learning is the complementarity among the base classifiers. That is, when one classifier misclassifies some samples, other classifiers may correct the categorization of these samples. Therefore, the ensemble learning approach has great potential for improving the accuracy of image classification. However, it is challenging to design an ensemble learning model with excellent classification performance. To enhance the robustness of ensemble learning models, the voting strategy they adopt deserves careful consideration.
Against this background, the concept of ensemble learning is introduced into sea ice classification for the first time. This paper proposes an ensemble learning method based on a two-round weight voting strategy (TRWV) for the effective classification of sea ice using multi-temporal Sentinel-1 SAR images. Compared with the traditional ensemble methods, the proposed method has the following main characteristics. During the first round of the voting stage, the weights of six base classifiers are optimized by using a genetic algorithm. After obtaining the first coarse classification result, each pixel is identified as fuzzy or explicit. The fuzzy pixels are further rectified based on the local similarity of the neighboring explicit pixels. The final precise classification result indicates that the proposed two-round weight voting strategy can significantly reduce the impact of speckle noise in SAR images. Finally, experiments are carried out on 18 Sentinel-1 SAR scenes from the Northeast Passage in the Arctic region. In addition, six base classifiers and four different voting strategies are employed for comparison, which validates the effectiveness and superiority of the proposed method. The rest of this paper is organized as follows. Section 2 describes the proposed method, including data preprocessing and the detailed algorithm framework. In Section 3, the experimental results are presented and compared with other methods. Section 4 is devoted to the discussions and limitations. Finally, Section 5 concludes this study.
2. Methods
The overall architecture of the proposed TRWV method for deriving sea ice categories from S1 EW images in HV and HH polarization is depicted in Figure 1. Firstly, the S1 EW images are preprocessed, which includes applying an orbit file, denoising, radiometric calibration, incidence angle correction, and converting to the decibel scale. Secondly, the preferable features of sea ice are selected via random forest from polarization features (HH, HV, HH/HV) and GLCM-derived texture features. Then, the classifier weights optimized by a genetic algorithm are adopted during the first round of the weight voting stage. Meanwhile, all pixels are divided into fuzzy pixels or explicit pixels (whose definitions can be found in Equation (7) in Section 2.3). Finally, the fuzzy pixels are rectified based on the local similarity of the neighboring explicit pixels during the second weight voting stage.
2.1. Preferable Features Selection
In this paper, Sentinel-1 EW dual-polarization (HV and HH) data were employed to verify the proposed algorithm. Some preliminary preprocessing was completed before the release of the S1 EW dual-polarized SAR data; however, further preprocessing is still indispensable for the proposed method. It consists of a series of standard corrections, namely the application of a precise orbit file, thermal noise removal, image cropping, speckle filtering, incidence angle correction, range Doppler and terrain correction, etc. All these corrections were performed mainly with the Sentinel Application Platform (SNAP) [37] developed by the European Space Agency (ESA). The detailed procedures of the further preprocessing work are shown in Figure 2.
Numerous studies have shown that SAR sea ice classification performance is improved by using image texture features. The texture features describe spatial variations of the backscattering coefficients of a group of adjacent pixels in the SAR image. In sea ice classification, the most common and classic texture feature extraction method is based on the gray level co-occurrence matrix (GLCM). Since the GLCM is constructed according to the distance and direction of each pixel pair, it reflects both the fine-scale and the large-scale textures of sea ice.
The GLCM represents the probabilities of all pairwise combinations of gray levels within the window of interest. Normally, the GLCM textures are determined by four parameters: gray levels, the sliding window size, inter-pixel distance, and orientation. For each SAR sub-image constrained by a constant window size, the GLCM is calculated as follows [38,39,40]:

$$P(i,j) = \frac{P(i,j,d,\theta)}{\sum_{i=1}^{N_g}\sum_{j=1}^{N_g} P(i,j,d,\theta)} \tag{1}$$

where $P(i,j)$ is the GLCM value of a pixel pair; $P(i,j,d,\theta)$ represents the number of times that the pair of gray levels $i$ and $j$ appears simultaneously within the sliding window; $\theta$ is the observation angle, taking the values 0°, 45°, 90° and 135°, which correspond to horizontal, northeast–southwest, vertical, and northwest–southeast, respectively; $d$ represents the distance between pixels, namely, the step size; $N_g$ denotes the number of gray levels.
In this study, GLCMs were calculated for the HH, HV, and HH/HV polarimetric SAR images. Multiple window sizes and step sizes were employed: window size 5 with step size 1, window sizes 7 and 9 with step sizes 1 and 3, and window size 11 with step sizes 1, 3, and 5. To reduce the computational load, the gray levels of the image were compressed from 256 to 32. Furthermore, the extracted texture features were obtained by averaging the GLCM over the four angles. Here, we calculated ten texture measurements, which are the angular second moment (ASM), contrast, dissimilarity, energy, entropy, correlation, mean, variance, homogeneity, and maximum. With eight window and step combinations, ten measurements, and three images, this results in a total of 240 candidate GLCM features. The detailed formulas of these features can be found in [38,39]. These texture features were produced by the texture analysis module of SNAP. In addition, the extracted texture features, together with the foregoing 3 polarization features, were all normalized to the interval [0, 1] for the convenience of subsequent experiments.
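As a rough illustration of Equation (1) and of the feature count above, the following pure-Python sketch builds a normalized GLCM for one offset, averages it over the four angles, and evaluates the contrast measure. It is only a sketch under stated assumptions: the function names (`glcm`, `mean_glcm`, `contrast`) are hypothetical, the pixel pairs are counted in one direction only, and the SNAP texture module used in the paper may differ in such details.

```python
from collections import Counter

def glcm(window, dx, dy, levels):
    """Normalized GLCM (Equation (1)) of a 2-D list of gray levels for offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(window[r][c], window[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:  # offset larger than the window: no pixel pairs
        return [[0.0] * levels for _ in range(levels)]
    return [[counts[(i, j)] / total for j in range(levels)] for i in range(levels)]

def offsets(d):
    """Offsets for 0, 45, 90 and 135 degrees at distance d (row index grows downward)."""
    return [(d, 0), (d, -d), (0, -d), (-d, -d)]

def mean_glcm(window, d, levels):
    """Average of the normalized GLCMs over the four angles."""
    mats = [glcm(window, dx, dy, levels) for dx, dy in offsets(d)]
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(levels)]
            for i in range(levels)]

def contrast(p):
    """Contrast texture measure: sum of (i - j)^2 * P(i, j)."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))

# Window/step combinations listed in the text, ten measures, three images.
configs = [(5, 1), (7, 1), (7, 3), (9, 1), (9, 3), (11, 1), (11, 3), (11, 5)]
n_candidates = len(configs) * 10 * 3

window = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 2, 2, 2],
          [2, 2, 3, 3]]
horizontal = glcm(window, 1, 0, levels=4)
```

For real imagery the window would first be quantized to 32 gray levels, as described above; the 4 × 4 toy window with four levels is used here only to keep the example readable.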
Due to the information redundancy among the extracted texture features, feature reduction is an essential technique for capturing the important features or feature combinations. Random forest is a widely adopted feature selection method because of its simple principle, easy implementation, and low computational cost. Its main idea is to combine a number of decision trees built from bootstrapped training samples using a random subset of features. During this process, the random forest provides the corresponding importance measurement for each input feature $X_j$ by the following Equation (2):

$$VI(X_j) = \frac{1}{|T|}\sum_{t \in T} VI^{(t)}(X_j) \tag{2}$$

where $VI^{(t)}(X_j)$ represents the importance of feature $X_j$ in decision tree $t$, and $T$ is the set of all decision trees, $t \in T$. The importance of the random forest is described by the variation in the classification accuracy of the out-of-bag (OOB) sample, known as the OOB error, which is caused by random transformation of features in the OOB sample. The function $VI^{(t)}(X_j)$ in Equation (2) is given as follows:

$$VI^{(t)}(X_j) = \frac{\sum_{i \in \bar{B}^{(t)}} I\left(y_i = \hat{y}_i^{(t)}\right)}{\left|\bar{B}^{(t)}\right|} - \frac{\sum_{i \in \bar{B}^{(t)}} I\left(y_i = \hat{y}_{i,\pi_j}^{(t)}\right)}{\left|\bar{B}^{(t)}\right|} \tag{3}$$

where $\bar{B}^{(t)}$ is the OOB sample set; $y_i$ represents the true classification label of pixel $i$; $\hat{y}_i^{(t)}$ is the category label of $i$ predicted by the decision tree based on the OOB dataset; $\hat{y}_{i,\pi_j}^{(t)}$ represents the predicted category label of pixel $i$ after random transformation of feature $X_j$; $I(\cdot)$ is the indicator function, so that each sum counts the number of correctly classified samples.
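The OOB importance described above can be sketched in a few lines of pure Python. This is a minimal sketch, not the implementation used in the paper: the names `oob_permutation_importance` and `forest_importance` are hypothetical, each tree is represented simply by a prediction function together with its own OOB samples, and the "random transformation" of a feature is assumed to be a random shuffle of that feature's column.

```python
import random

def oob_permutation_importance(predict, X_oob, y_oob, feature, rng):
    """Accuracy drop of one tree on its OOB samples after shuffling one feature."""
    n = len(X_oob)
    correct = sum(predict(x) == y for x, y in zip(X_oob, y_oob))
    column = [x[feature] for x in X_oob]
    rng.shuffle(column)  # random transformation of the chosen feature
    permuted = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X_oob, column)]
    correct_perm = sum(predict(x) == y for x, y in zip(permuted, y_oob))
    return (correct - correct_perm) / n

def forest_importance(trees, feature, rng):
    """Average the per-tree importance; `trees` is a list of (predict, X_oob, y_oob)."""
    total = sum(oob_permutation_importance(p, X, y, feature, rng) for p, X, y in trees)
    return total / len(trees)

# Toy tree (an assumption for illustration) that only looks at feature 0.
toy_tree = lambda x: int(x[0] > 0.5)
X = [[0.1, 0.9], [0.9, 0.1], [0.2, 0.8], [0.8, 0.3]]
y = [0, 1, 0, 1]
rng = random.Random(0)
imp_unused = forest_importance([(toy_tree, X, y)], feature=1, rng=rng)
imp_used = forest_importance([(toy_tree, X, y)], feature=0, rng=rng)
```

Because the toy tree never reads feature 1, shuffling that column cannot change its predictions and the importance is exactly zero, whereas shuffling feature 0 can only lower the accuracy.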
The feature selection experiment based on the random forest was carried out for the 240 GLCM texture features on 16,124 manually interpreted samples. The computation speed of the experiment and the accuracy of the feature importance are mainly affected by two parameters: the number of decision trees and the number of iterations. According to a previous study [41], these were set to 20 and 50, respectively. The final importance of each feature was therefore obtained by averaging its importance over 50 runs of the above experiment. After ranking the features by importance, the top six features were picked out as the representative features, presented in Table 1.
Moreover, to retain as much of the SAR polarization information as possible, the original 3 polarization features were also included, giving 9 preferable sea ice features in total. The flow chart of acquiring the preferable features of sea ice is shown in Figure 3.
2.2. The First Round Voting Stage—Coarse Classification
In order to fully integrate the advantages of different classifiers, ensemble learning has been introduced into remote sensing image classification [34,35,36]. In this paper, six frequently used classifiers, that is, naive Bayes (NB), decision tree (DT), k-nearest neighbor (KNN), logistic regression (LR), artificial neural network (ANN), and support vector machine (SVM), were employed as the base classifiers to generate the initial classification maps. Since the voting strategy plays a critical role in ensemble learning models, optimizing the voting strategy helps to improve the classification ability of ensemble learning. Here, the voting strategy was improved by tuning the voting weights of the base classifiers with a genetic algorithm. The first round voting stage was then conducted on the initial classification maps to obtain the category score matrix and, further, the first coarse classification of sea ice. Figure 4 illustrates the detailed process of the first round voting stage.
Specific descriptions of the six base classifiers (NB, DT, KNN, LR, ANN, and SVM) can be found in the literature [42,43,44,45,46,47]. The ensemble classification method operates by voting on the initial classification results of different base classifiers according to a certain voting strategy. At present, ensemble learning models are mainly implemented through the mechanisms of bagging [48], boosting [49], and stacking [50]. Here, the bagging mechanism is utilized because it inherently relies on majority voting: it improves the final classification by combining the classifications of base classifiers trained on randomly selected subsets of the training data. However, the selection of voting strategies has a significant impact upon the classification performance of the bagging mechanism. Here, the weighted voting strategy was employed, which assigns different weights to the classification results of different base classifiers to achieve the optimal classification. The weights of the above six base classifiers were optimized by a genetic algorithm (GA), whose algorithm flow is shown in Figure 5.
The specific steps of the GA are summarized as follows:
- (1) Initialization: A population of individuals is randomly generated, and each individual represents a set of weights, one for each classifier.
- (2) Fitness: The GA computes the fitness of each individual based on an objective evaluation function, thereby determining the survival probability of the individual in the next generation.
- (3) Selection: A certain number of individuals with higher fitness are selected through random or specific population rules to participate in crossover and mutation. Individuals of high fitness are those for which better classification performance can be expected if they are used as the weights of base classifiers.
- (4) Crossover: New individuals are generated by exchanging and combining the chromosomes of selected individuals.
- (5) Mutation: The diversity of the population is increased through genetic mutation by randomly selecting some individuals with a certain probability.
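Suppose $w = (w_1, w_2, \ldots, w_6)$ represents the weights of base classifiers, whose optimization process, as mentioned above, can essentially be formulated [51,52,53,54] as follows:

$$w^{*} = \arg\min_{w} \sum_{i} L\left(\hat{y}_i, y_i\right) \tag{4}$$

where $\hat{y}_i$ is the predicted label of the sample $x_i$ obtained by weighted voting with the weights $w$. The loss function $L$ represents the difference between the predicted label $\hat{y}_i$ and the true label $y_i$.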
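The five GA steps and the weight optimization above can be sketched as follows. This is a minimal sketch under assumptions that the paper does not state: weights are kept positive and normalized to sum to one, half of the population is retained as parents, single-point crossover and Gaussian mutation are used, and the best individual is carried over unchanged (elitism). The function name `optimize_weights` and all default parameter values are hypothetical. The `fitness` argument stands for the objective evaluation function, for example the weighted-vote accuracy on validation samples (the negative of the loss in the formulation above).

```python
import random

def normalize(w):
    """Scale a weight vector so that it sums to one."""
    s = sum(w)
    return [x / s for x in w]

def optimize_weights(fitness, k, pop_size=20, generations=30, p_mut=0.2, seed=0):
    """Search for k classifier weights that maximize fitness(weights)."""
    rng = random.Random(seed)
    # (1) Initialization: random individuals plus the uniform-weight baseline.
    pop = [normalize([rng.random() + 1e-9 for _ in range(k)])
           for _ in range(pop_size - 1)]
    pop.append([1.0 / k] * k)
    for _ in range(generations):
        # (2) Fitness and (3) selection: keep the fitter half as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = [pop[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k)
            child = a[:cut] + b[cut:]          # (4) single-point crossover
            if rng.random() < p_mut:           # (5) mutation of one gene
                i = rng.randrange(k)
                child[i] = max(1e-9, child[i] + rng.gauss(0.0, 0.1))
            children.append(normalize(child))
        pop = children
    return max(pop, key=fitness)

# Toy objective (an assumption for illustration): reward weight on classifier 0.
best = optimize_weights(lambda w: w[0], k=6, seed=1)
```

Because the uniform weights are part of the initial population and the best individual is always kept, the returned weights are never worse than plain majority-style equal weighting under the chosen fitness.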
According to the weights $w$ optimized by the GA and the classification results of the base classifiers, the category score matrix of each pixel $x$ can be calculated as follows:

$$S_c(x) = \sum_{k=1}^{6} w_k \, I\left(h_k(x) = c\right) \tag{5}$$

where $h_k(x)$ represents the category label of the pixel $x$ predicted by the $k$th base classifier, $I(\cdot)$ is the indicator function, and $S_c(x)$ represents the category score value of this pixel assigned to category $c$, $c = 1, 2, \ldots, C$ (the total number of categories).
Therefore, the maximum index of the category score can be calculated by the argmax function to obtain the rough classification label:

$$\hat{y}(x) = \arg\max_{c} S_c(x) \tag{6}$$

In other words, coarse classification of sea ice is achieved after the first round voting stage.
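The first round of weighted voting for a single pixel can be sketched as below. The function names and the example weights are hypothetical and chosen only for illustration; categories are indexed from 0 for convenience.

```python
def category_scores(labels, weights, n_classes):
    """Category score vector of one pixel: sum of the weights of the classifiers voting for each class."""
    scores = [0.0] * n_classes
    for label, w in zip(labels, weights):
        scores[label] += w
    return scores

def coarse_label(scores):
    """Index of the largest category score (argmax)."""
    return max(range(len(scores)), key=scores.__getitem__)

# Labels predicted by six base classifiers for one pixel, with illustrative weights.
labels = [0, 1, 1, 2, 1, 0]
weights = [0.3, 0.1, 0.1, 0.2, 0.1, 0.2]
scores = category_scores(labels, weights, n_classes=3)
label = coarse_label(scores)
```

In this toy case three of the six classifiers vote for class 1, so simple majority voting would choose class 1, whereas the weighted scores are about 0.5, 0.3, and 0.2 and the weighted vote selects class 0.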
2.3. The Second Round Voting Stage—Precise Classification
After the first round voting stage, the score values of some pixels assigned to different categories may be very close in the first coarse classification results. By introducing an empirical score threshold, each pixel can thus be identified as a fuzzy or an explicit pixel. Fuzzy pixels, whose top category scores are close, are prone to misclassification. To cope with this issue, the second round of voting is conducted to further determine the category attribution of the fuzzy pixels based on the local similarity of the neighboring explicit pixels, thereby yielding the final precise classification result. Figure 6 shows the process of the second round voting stage, with the specific implementation steps described as follows.
Firstly, by using the category score matrix $S$ and the predefined threshold parameter $\tau$, each pixel can be identified as a fuzzy or an explicit pixel according to the following rules:

$$M(x) = \begin{cases} 0 \ (\text{fuzzy}), & S_{\max}(x) - S_{\mathrm{sec}}(x) < \tau \\ 1 \ (\text{explicit}), & \text{otherwise} \end{cases} \tag{7}$$

where $x$ represents the current pixel under consideration, and $S_{\max}(x)$ and $S_{\mathrm{sec}}(x)$ represent the maximum and the secondary maximum of the category score vector corresponding to the pixel $x$, respectively. Therefore, a logical identification matrix $M$ is generated, indicating that each pixel is either fuzzy or explicit.
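The fuzzy/explicit rule can be sketched as follows. This is a sketch under the assumption that a pixel is fuzzy when the gap between its two largest category scores is strictly below the threshold; the names `is_fuzzy` and `fuzzy_mask` are hypothetical.

```python
def is_fuzzy(scores, tau):
    """True if the two largest category scores differ by less than tau."""
    top, second = sorted(scores, reverse=True)[:2]
    return (top - second) < tau

def fuzzy_mask(score_map, tau):
    """Apply the rule to a 2-D grid whose cells hold per-pixel score vectors."""
    return [[is_fuzzy(scores, tau) for scores in row] for row in score_map]

score_map = [[[0.50, 0.45, 0.05], [0.70, 0.20, 0.10]],
             [[0.40, 0.35, 0.25], [0.90, 0.05, 0.05]]]
mask = fuzzy_mask(score_map, tau=0.15)
```

With a threshold of 0.15, the two pixels whose leading scores differ by only 0.05 are flagged as fuzzy, and the other two are treated as explicit.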
Then, for each fuzzy pixel, a corresponding matrix is created depicting the similarities between the fuzzy pixel and its neighboring explicit pixels. These explicit pixels are all selected from a square neighborhood centered on this fuzzy pixel. Specifically, the correlation coefficient $\rho(x_f, x_e)$ is introduced for depicting the similarity of the fuzzy pixel $x_f$ and the explicit pixel $x_e$ in its $n \times n$ neighborhood ($n$ represents the size of the neighborhood), which is calculated as follows:

$$\rho(x_f, x_e) = \frac{\mathrm{Cov}(x_f, x_e)}{\sqrt{\mathrm{Var}(x_f)\,\mathrm{Var}(x_e)}} \tag{8}$$

where $\rho(x_f, x_e)$ constitutes the similarity matrix $R$, and $\mathrm{Var}$ and $\mathrm{Cov}$ denote the variance and covariance of the feature vectors.
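Next, the category attribution of the fuzzy pixel can be determined according to its similarity with the neighboring explicit pixels. That is, the correlation coefficients in the similarity matrix $R$ are summed cumulatively for each category, thereby obtaining the score vector $V = (V_1, V_2, \ldots, V_C)$:

$$V_c = \sum_{x_e} M(x_e)\, \rho(x_f, x_e)\, I\left(\hat{y}(x_e) = c\right), \quad c = 1, 2, \ldots, C \tag{9}$$

where the sum runs over the pixels $x_e$ in the neighborhood of the fuzzy pixel $x_f$; $M$ is the logical identification matrix ($M(x_e) = 1$ for explicit pixels and 0 for fuzzy pixels); $\hat{y}(x_e)$ is the coarse label of $x_e$ from the first round; $V_c$ denotes the cumulative summation of the correlation coefficients corresponding to category $c$; $C$ is the total number of sea ice categories. Thus, the maximum index of the score vector $V$ is the assigned category of the fuzzy pixel under consideration, which is formulated as follows:

$$c^{*} = \arg\max_{c} V_c \tag{10}$$

where $c^{*}$ is the assigned category label of the fuzzy pixel.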
Therefore, the final precise classification of sea ice is completed after the second round voting stage.
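The second round can be sketched for a single fuzzy pixel as follows. This is a minimal sketch with hypothetical names (`pearson`, `rectify_pixel`); it assumes that each pixel carries a feature vector (for example, the 9 preferable features), that the similarity is the Pearson correlation between two such vectors, that the neighborhood is clipped at the image border, and that a fuzzy pixel with no explicit neighbor keeps its coarse label. None of these boundary choices are specified in the text.

```python
import math

def pearson(u, v):
    """Correlation coefficient between two feature vectors (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)

def rectify_pixel(r, c, features, labels, fuzzy, n, n_classes):
    """Re-label the fuzzy pixel (r, c) from explicit pixels in its n x n neighborhood."""
    half = n // 2
    votes = [0.0] * n_classes
    found = False
    for rr in range(max(0, r - half), min(len(labels), r + half + 1)):
        for cc in range(max(0, c - half), min(len(labels[0]), c + half + 1)):
            if fuzzy[rr][cc]:
                continue  # fuzzy pixels, including the centre, do not vote
            votes[labels[rr][cc]] += pearson(features[r][c], features[rr][cc])
            found = True
    if not found:
        return labels[r][c]  # no explicit neighbor: keep the coarse label
    return max(range(n_classes), key=votes.__getitem__)

ice = [0.2, 0.6, 1.0]    # feature vector shared by the class-1 neighbors
water = [0.9, 0.5, 0.1]  # feature vector of the single class-0 neighbor
features = [[ice, ice, ice], [ice, [0.1, 0.5, 0.9], ice], [water, ice, ice]]
labels = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]
fuzzy = [[False, False, False], [False, True, False], [False, False, False]]
new_label = rectify_pixel(1, 1, features, labels, fuzzy, n=3, n_classes=2)
```

In this toy scene the central pixel was coarsely labelled as class 0, but its features correlate positively with the seven class-1 neighbors and negatively with the single class-0 neighbor, so the second round reassigns it to class 1.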
4. Discussion
The evaluation metrics shown in Table 6 demonstrate that the proposed ensemble learning method, TRWV, improved on the classification accuracy of the base classifiers. TRWV is also superior to the ensemble classifiers with the current mainstream voting strategies in terms of the OA and kappa coefficient. To expand the application scope of the adopted two-round weight voting strategy, a parametric sensitivity analysis is carried out below for the two important parameters involved in TRWV, which are the category score threshold $\tau$ and the neighborhood window size $n$. By measuring the gap between the maximum and the secondary maximum of the category score, the threshold $\tau$ acts as the criterion for identifying each pixel as either a fuzzy or an explicit pixel. Additionally, the neighborhood window size $n$ determines the spatial scale of the local similarity, that is, how large a region of explicit pixels can be used to rectify the central fuzzy pixel during the second weight voting stage.
As shown in Figure 11a, when $n$ is fixed at 3 and $\tau$ gradually increases from 0.05 to 0.15, the overall accuracy rises from 0.9603 to the highest value of 0.9626, and the kappa coefficient increases from 0.9468 to 0.9498. However, when $\tau$ continues to increase from 0.15 to 0.35, the overall accuracy and kappa coefficient show a gradual downward trend. Thus, TRWV achieves the optimal classification accuracy when $\tau$ is taken as 0.15 with $n$ equal to 3. On the other hand, the effect of the neighborhood window size $n$ on the classification accuracy is also worth discussing when the threshold $\tau$ is fixed at 0.15. When $n$ gradually increases from 3 to 11 with a step size of 2, the overall accuracy of the classification results grows significantly from 0.9626 to the highest value of 0.9760. Meanwhile, the kappa coefficient increases from 0.9498 to 0.9678. This is mainly because a smaller window size $n$ gives rise to a narrower spatial neighborhood, which limits the spatial context information captured during the second round of the weight voting stage. As a result, the classifier cannot effectively suppress image noise and correct mislabeled pixels. As $n$ increases, the suppression of image noise and the final classification accuracy both improve markedly because more spatial context information is utilized. However, the OA and kappa coefficients remain almost unchanged after reaching the maximum value, even though the parameter $n$ continues to increase. Therefore, according to the above discussions, the category score threshold $\tau$ was set to 0.15 and the neighborhood window size $n$ was set to 11 to obtain the highest accuracy of sea ice classification.
Although several experiments have shown that the proposed TRWV method has a clear advantage over the current mainstream voting strategies in classification accuracy, it is not dominant in terms of computational cost. In addition to the computational cost of the first round voting stage, which is almost equivalent to that of the traditional ensemble learning method, additional computations are required to rectify the fuzzy pixels based on their local similarity during the second round voting stage. Moreover, the accuracy evaluation results for Image II in Table 6 show that the values of the OA and kappa of all classifiers are very close. There is only a very slight increase of 0.02% in the overall accuracy for TRWV compared with the KNN base classifier and the ensemble classifier with the GA strategy, which both perform best among the comparison methods. Based on the previous analysis, the main reasons accounting for this can be summarized as follows:
- (1) The selection of training samples and test samples may not be objective enough. Moreover, the sea ice category is generally difficult to interpret from SAR images due to the influence of speckle noise, and the samples in this experiment were interpreted manually. In other words, some incorrect interpretations of the pixel category are largely inevitable, which has a negative impact on the performance of the classifiers.
- (2) Compared with the conventional ensemble learning methods, what makes TRWV different is that it further corrects the fuzzy pixels based on the local similarity of the neighboring explicit pixels. Therefore, if some explicit pixels in the neighborhood are incorrectly classified, the central fuzzy pixel may also be misclassified.
Through the above experiments and discussions, the effectiveness of the proposed method has been fully verified. However, the limitation of the method is that the classification performance of the ensemble learning method is mainly dependent on its base classifiers. That is, the selected base classifiers determine the classification ability of the ensemble classifier to some extent. Thus, the performance of the ensemble classifier can be further improved by introducing new base classifiers such as object-oriented methods, segmentation algorithms, or a CNN in the follow-up studies.
5. Conclusions
In this paper, a two-round weight voting strategy-based ensemble learning method was proposed for refining sea ice classification. The effectiveness of the proposed method was verified by using 18 Sentinel-1 EW dual-polarized SAR images of the Northeast Passage. In TRWV, a random forest was adopted to select among the extracted polarization features and texture features to construct the preferable features of sea ice. Then, the weight corresponding to each classifier was optimized by a genetic algorithm to achieve a coarse classification result in the first round voting stage. On this basis, each pixel was identified as a fuzzy pixel or an explicit pixel by introducing a predefined score threshold. Finally, the fuzzy pixels were further rectified based on the local similarity of the neighboring explicit pixels in the second round voting stage, thereby yielding the final precise classification result. The main contributions of this study can be summarized as follows:
- (1) An ensemble learning method based on a two-round weight voting strategy was proposed and applied to Sentinel-1 sea ice data for the first time, achieving highly competitive classification results. The performance and effectiveness of the proposed TRWV method were investigated with Arctic sea ice scenarios from different sea areas and with different ice types. The classification results based on the multiple image scenes demonstrate the superiority of the proposed approach in terms of visual performance and quantitative accuracies compared with the traditional majority voting strategy and weighted voting strategy.
- (2) In this study, we evaluated the performance of six base classifiers (NB, DT, KNN, LR, ANN, and SVM) for polar sea ice classification. The experimental results show that the classification performance of logistic regression is better than that of the other base classifiers. By using appropriate voting strategies and integrating the advantages of different base classifiers, ensemble learning has considerable potential for sea ice classification based on Sentinel-1 images.
- (3) In this study, the idea of a two-round voting strategy was adopted for the first time to refine the classification results of the original ensemble learning, in order to improve the classification of sea ice. The experimental results indicate that the proposed strategy can preserve the edge contour of sea ice well, mainly because the pixels have a high correlation with their neighbors in the image spatial domain. In addition, both the extraction of texture information from the SAR data and the calculation of the similarity matrix among pixels in the neighborhood take the spatial context information into account, which supports a more accurate ice classification map.
The proposed TRWV method in this paper showed a satisfactory performance of sea ice classification on Sentinel-1 images of the Northeast Passage in the winter Arctic region. However, there are still some limitations manifested in the following aspects. (1) The classification performance of the TRWV method is excessively dependent on its base classifiers. (2) Compared with the traditional voting strategies, TRWV has a higher computational cost. As a response to the above issues, the most worthwhile follow-up work of this study can be summarized as follows: (1) explore new base classifiers such as object-oriented methods, segmentation algorithms (IRGS), and CNNs; (2) adopt more efficient strategies to rectify the fuzzy pixels; and (3) evaluate the classification performance and seasonal robustness of TRWV by expanding the sea ice dataset, collecting it both in winter and summer.