This research proposes a machine learning algorithm based on multi-feature-level fusion to automatically classify single-scene images and generate the first ice presence map products. On the basis of these products, a cloud-motion-based filtering and weighted fusion method is proposed that can automatically generate daily/weekly composite ice presence maps and weekly fused optical images. The methodological framework is shown in Figure 1 and described in the following three subsections: pre-processing (Section 2.2.1), classification using the MFLFRF algorithm (Section 2.2.2), and composite ice presence maps (Section 2.2.3).
2.2.1. Pre-Processing
After downloading the data of the research area from the NASA website, the required bands of each image were geometrically corrected and reprojected to the WGS84 polar stereographic projection. This process can be completed in batches by calling the HDF-EOS To GeoTIFF Conversion Tool (HEG) (https://wiki.earthdata.nasa.gov/display/DAS/Downloads), a dedicated processing tool for MODIS images. Then, cropping, nearest-neighbor resampling, radiometric calibration, solar zenith angle correction, and land masking were performed on the images of the study area. The radiometric calibration and solar zenith angle correction formulas are as follows.
The formula for the apparent reflectance calculation is [41,42,43]:

\[ \rho^{*}_{B,T,FS} = R_{s,B}\left( DN_{B,T,FS} - R_{o,B} \right) \]

In this equation, \(\rho^{*}_{B,T,FS}\) is the reflectance of each pixel in the corresponding band, where \(B\) is the corresponding band, \(T\) is the track, and \(FS\) is the frame_and_sample; \(DN_{B,T,FS}\) is the count value of a single pixel in the corresponding band; \(R_{s,B}\) is the reflectance scaling ratio; and \(R_{o,B}\) is the reflectance offset. After absolute radiometric calibration, the unit of radiance is W/(m²·μm·sr), and these parameters can be read from the attribute domain of the scientific data set of the corresponding band.
The solar zenith angle data obtained from the MOD021KM product of the MODIS data are interpolated to a 250 m spatial resolution using the nearest-neighbor method. Then, the resampled solar zenith angle data are applied in the correction formula of the solar zenith angle, which is as follows [41,42,43]:

\[ \rho_{B,T,FS} = \frac{\rho^{*}_{B,T,FS}}{\cos\theta}, \qquad \theta = d_{s} \cdot SZ \]

Then, we combine the two formulas into:

\[ \rho_{B,T,FS} = \frac{R_{s,B}\left( DN_{B,T,FS} - R_{o,B} \right)}{\cos\theta} \]

where \(\theta\) is the solar zenith angle (radians) corresponding to each pixel; \(SZ\) is the detection value of the solar zenith angle of each pixel; \(d_{s}\) is the scaling ratio between the probe value and the real value, which is 0.01 by default; and \(\rho_{B,T,FS}\) is the apparent reflectance of a single pixel in the corresponding band after the solar zenith angle correction.
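As a minimal sketch, the two corrections above can be applied per band with NumPy. The function name and argument layout are illustrative, the calibration coefficients stand for the per-band scale and offset attributes read from the scientific data sets, and the zenith counts are assumed to scale to degrees before conversion to radians:

```python
import numpy as np

def calibrate_reflectance(dn, scale, offset, sz_counts, sz_scale=0.01):
    """Radiometric calibration followed by solar zenith angle correction.

    dn        -- raw count values of one band (2-D array)
    scale     -- reflectance scaling ratio R_s for the band
    offset    -- reflectance offset R_o for the band
    sz_counts -- solar zenith angle counts resampled to the band grid
    sz_scale  -- ratio between stored counts and real values (0.01 default)
    """
    rho_star = scale * (dn - offset)           # apparent (TOA) reflectance
    theta = np.deg2rad(sz_counts * sz_scale)   # zenith angle in radians
    return rho_star / np.cos(theta)            # SZA-corrected reflectance
```

In practice this is run band by band over the cropped, resampled study-area images before the land mask is applied.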
Snow and ice have a strong reflectance in the visible (VIS) bands and strong absorption in the near infrared (NIR) and shortwave infrared (SWIR) bands. This combination of bands makes the distinction between ice, clouds, and sea water more obvious. Thick ice and snow appear bright sky blue, while clouds composed of small water droplets scatter visible and shortwave infrared light similarly and therefore appear white. These water clouds usually lie low, near the ground, and have higher temperatures, whereas high, cold clouds are mostly composed of small ice crystals and appear blue. Therefore, the false-color combination of bands 7, 2, and 1 was selected to identify ice and clouds in the Arctic region in this study.
The NDSI utilizes the contrasting spectral behavior of the visible (green, band 4: 0.55 μm) and shortwave infrared (SWIR, band 6: 1.64 μm) parts of the spectrum [31,44]. Since the reflectance of snow in the green and shortwave infrared bands shows a strong contrast, these two bands can be used to extract ice and snow well. The NDSI is therefore a classic index for distinguishing sea ice from other surface features [31,45].
The NDSII-1 [34] and the NDSII-2 [35] follow the same principle as the NDSI, exploiting the difference in the reflectance of ice and snow, but use different spectral bands to express it. To improve the accuracy of the overall sea ice map, the NDSII-2 index was selected in this study, as it can identify sea ice more accurately than the other indices [31,35].
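All three indices are normalized-difference ratios, so a single helper covers them. The sketch below is illustrative: only the NDSI band pair (green band 4 and SWIR band 6) is taken from the text, while the NDSII-1/NDSII-2 band pairs are left to references [34] and [35]; the reflectance values are synthetic examples, not real data.

```python
import numpy as np

def normalized_difference(a, b, eps=1e-12):
    """Generic normalized-difference index (a - b) / (a + b)."""
    return (a - b) / (a + b + eps)

# NDSI from MODIS green (band 4, 0.55 um) and SWIR (band 6, 1.64 um)
# reflectances; illustrative pixels: bright snow vs. open water.
green = np.array([0.8, 0.10])
swir = np.array([0.1, 0.05])
ndsi = normalized_difference(green, swir)   # high for snow/ice pixels
```

NDSII-1 and NDSII-2 would reuse `normalized_difference` with their respective band pairs.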
Sea ice and clouds have similar spectral characteristics, and it is very difficult to distinguish them by spectral features alone. Texture features can describe the spatial distribution of spectral information [38,46], and some scholars have used differences in texture features in the visible bands for cloud detection [38,45]. The traditional Local Binary Pattern (LBP) operator describes the local spatial structure of an image [38,47]. It encodes the differences between the center pixel \(g_c\) and its neighboring pixels \(g_p\) into a binary pattern, and uses this pattern to label the center pixel. The neighborhood is circular with radius \(R\):

\[ \mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\,2^{p}, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \]

where \(P\) is the number of neighborhood pixels on the circumference of the circle and \(s(\cdot)\) is a step function.
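The LBP encoding can be sketched for the common P = 8, R = 1 case; the function name is illustrative, and the circular neighborhood is approximated by the eight surrounding pixels:

```python
import numpy as np

def lbp_3x3(image):
    """LBP with P = 8, R = 1 for every interior pixel.

    Each neighbor g_p is compared with the center g_c via the step
    function s(x) = 1 if x >= 0 else 0, and the resulting bits are
    weighted by 2**p to form the pattern code.
    """
    h, w = image.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # neighbor offsets in a fixed circular order around the center
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[1:-1, 1:-1]
    for p, (dy, dx) in enumerate(offsets):
        neighbor = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.uint8) << p
    return codes
```

A flat region yields the all-ones pattern (code 255), while an isolated bright center pixel yields code 0.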
To improve the robustness of the operator to noise, Liu et al. [48] proposed the Robust Extended Local Binary Pattern (RELBP) texture descriptor. This operator takes into account both the intensity of the center pixel and the filter response of the image. RELBP contains three descriptors: RELBP_CI, based on the intensity difference of the center pixel; RELBP_NI, based on the intensity difference of neighboring pixels; and RELBP_RD, based on the radial pixel intensity difference. The RELBP_CI descriptor was selected in this study, and its formula is as follows:

\[ \mathrm{RELBP\_CI}(x_c) = s\left( \phi(X_{c,w}) - \mu_w \right) \]

where \(X_{c,w}\) is a local patch of size \(w \times w\) whose center is at location \(x_c\); \(\phi\) is the filter applied to the patch (the median filter is selected in this paper); \(s(\cdot)\) is the step function; and \(\mu_w\) is the mean of \(\phi(X_{c,w})\) over the whole image [38,48].
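A simple sketch of RELBP_CI with the median filter as φ; the function name and patch handling are illustrative:

```python
import numpy as np

def relbp_ci(image, w=3):
    """RELBP_CI: binarize median-filtered center responses.

    For each interior pixel, phi(X_{c,w}) is the median of the w x w
    patch around it; the pixel is coded 1 if that response is >= mu_w,
    the mean of the filtered responses over the whole image, else 0.
    """
    h, wd = image.shape
    r = w // 2
    filtered = np.empty((h - 2 * r, wd - 2 * r))
    for i in range(r, h - r):
        for j in range(r, wd - r):
            filtered[i - r, j - r] = np.median(
                image[i - r:i + r + 1, j - r:j + r + 1])
    mu_w = filtered.mean()                      # global mean of responses
    return (filtered >= mu_w).astype(np.uint8)  # step function s(.)
```

The median filter makes the response robust to isolated noisy pixels inside the patch.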
2.2.2. Classification Using the MFLFRF Algorithm
Combining the spectral and textural features of ice and clouds in the polar regions, a Multi-Feature Level Fusion Random Forest (MFLFRF) classification algorithm was constructed to classify the ground objects into three types of targets: cloud, ice, and water. The Random Forest (RF) classification algorithm based on ensemble learning has become one of the most widely used algorithms in remote sensing and other application fields [
49,
50,
51]. The RF classification algorithm uses the decision tree as its basic unit: through ensemble learning, multiple decision-tree weak classifiers are combined into a strong classifier, and the final decision category is determined by voting on the classification results of all weak classifiers. RF is a general term for ensemble methods using tree-type classifiers that learn and predict independently [51]. The base classifier in the RF algorithm is the classification and regression tree (CART) decision tree. Each decision tree is trained on a bootstrap sample of the original training data, and only a subset of the input variables, randomly selected at each node, is searched to determine the split, which makes its training time shorter than that of other ensemble methods [50].
In the traditional RF classification method, training samples must be selected individually for each image to be classified. However, this would be very complicated for the batch production of cloudless sea ice products. The types of surface features in the Arctic region are relatively simple, comprising clouds, open water, and sea ice. The regional characteristics of these surface features are relatively stable and less affected by seasons and other factors, so it is possible to reuse the samples. Therefore, this study developed a training sample library to classify surface features in the Arctic through model training and prediction. The training samples were selected from the whole Arctic region by uniform sampling, with the categories sea ice, water, and clouds. The texture of clouds is random and depends on the cloud type: thick clouds tend to be massive with relatively rough texture, while thin clouds such as cirrus have a smoother texture that differs markedly from that of ice and snow. According to their texture and color characteristics, the cloud samples were divided into three subclasses (cloud 1, cloud 2, and cloud 3), which were finally merged into a single cloud class. Eighty percent of the samples were used for training, and the remaining twenty percent were used for validation. The details of the training sample library are shown in
Table 2.
The three common spectral bands (bands 7, 2, and 1), as well as the four calculated features (the NDSII-2 index and the texture features of bands 7, 2, and 1, respectively), were used as input features for the RF classification process. The pixel-based classification model was trained using the unified training samples. Two important parameters should be controlled to acquire a better classification result in the RF function: the number of trees to grow (ntree) and the number of variables randomly sampled as candidates at each split (mtry). Since the RF classifier is computationally efficient and does not overfit, the number of trees can be as large as possible [49]. However, when the number of trees is increased above a threshold, the classification is no longer improved, as studies on the sensitivity of RF classifiers to the number of trees have shown [51,52]. The default value of ntree for remote sensing image prediction is 500. A higher mtry results in stronger individual decision trees, but the increased correlation between trees reduces the accuracy of the model [50]. In classification tasks, mtry is usually set to the square root of the total number of input variables [38]. The two parameters ntree and mtry were determined with a test based on the modified out-of-bag (M-OOB) accuracy in this study, and we found that a value of 500 for ntree and four for mtry worked well for the classification task.
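As an illustration of this configuration, scikit-learn's RandomForestClassifier can be set up with the chosen parameter values; the seven-feature matrix below is synthetic stand-in data, not the actual training sample library:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in: one row per pixel with the seven input features
# (reflectances of bands 7, 2, 1; NDSII-2; texture of bands 7, 2, 1)
# and labels 0 = water, 1 = ice, 2 = cloud.
rng = np.random.default_rng(0)
X = rng.random((300, 7))
y = rng.integers(0, 3, 300)

# ntree = 500 and mtry = 4, the values selected via the M-OOB test.
rf = RandomForestClassifier(n_estimators=500, max_features=4,
                            oob_score=True, random_state=0)
rf.fit(X, y)
labels = rf.predict(X)
```

With real samples, the 80/20 training/validation split described above would be applied before fitting.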
2.2.3. Composite Ice Presence Maps
By synthesizing the classification results of the Aqua and Terra satellite images taken on the same day in the Arctic, a daily ice presence map was obtained; on this basis, a weekly ice presence map and a weekly fused optical image were produced. The following rules were formulated for compositing the ice maps:
(1) Calculate the number of times N that each pixel is assigned a non-cloud category among all image classification results from the Terra or Aqua satellites in a day.
(2) Ice extraction: judge whether N is greater than the threshold T1, the threshold on the number of ice occurrences for each pixel per day, defined as 5 in this study. If N is greater than T1, the pixel is assigned the category corresponding to the mode of the non-cloud sequence, and ice extraction is performed over the entire Arctic region; otherwise, the pixel is judged to be cloud. Water extraction: judge whether N is greater than the threshold T2, the threshold on the number of water occurrences for each pixel per day, defined as 2 in this study. If N is greater than T2, the pixel is assigned the category corresponding to the mode of the non-cloud sequence, and water extraction is performed over the entire Arctic region; otherwise, the pixel is judged to be cloud.
(3) Synthesize the results extracted in step (2) to obtain the daily synthetic ice map.
(4) Repeat steps (1) to (3) for seven consecutive days and synthesize the final ice map for the week.
(5) Use all daily synthetic ice maps of the seven consecutive days to correct the classification results of the MFLFRF algorithm. According to the corrected classification results, the pre-processed images are fused by assigning weights to obtain the weekly fused image.
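The daily compositing rules can be sketched as follows, assuming class codes 0 = water, 1 = ice, 2 = cloud. This is a simplified illustration: the mode of the non-cloud sequence is replaced by separate per-class occurrence counts, and the function name is hypothetical.

```python
import numpy as np

def daily_composite(classes, t1=5, t2=2):
    """Composite one day of per-pass classifications into an ice map.

    classes -- stacked classification results, shape (n_passes, h, w),
               with 0 = water, 1 = ice, 2 = cloud
    t1, t2  -- occurrence thresholds for ice and water (5 and 2 here)
    """
    n_ice = (classes == 1).sum(axis=0)      # ice occurrences per pixel
    n_water = (classes == 0).sum(axis=0)    # water occurrences per pixel
    out = np.full(classes.shape[1:], 2, dtype=np.int8)  # default: cloud
    out[n_water > t2] = 0                   # water where count exceeds T2
    out[n_ice > t1] = 1                     # ice where count exceeds T1
    return out
```

Running this for seven consecutive days and compositing the results gives the weekly map of step (4).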
The specific processing flow is shown in
Figure 2.
In the classification results of the MFLFRF algorithm, the extracted sea ice is mixed with a small amount of cloud and cloud shadow, while the accuracy of the water extraction is higher. Therefore, when performing ice map synthesis, ice and water are extracted separately. First, the number of times N that the current pixel is assigned a non-cloud category among all image classification results obtained from the Terra or Aqua satellites in a day is calculated; then, the category of the current pixel is determined by comparing N with the thresholds to extract the ice and water regions. In this way, the area covered by the ice map can be maximized under the given cloud cover.
Since the shortwave infrared band can penetrate thin clouds but not thick clouds, the thin cloud areas can be extracted by threshold segmentation using band 7. After normalizing the thin cloud areas, a power function is used to calculate the weight of the thin cloud areas. Here, \(x\) represents the normalized pixel value of the thin cloud, and \(w\) is the assigned weight value. The weight value ranges between 0 and 1, and the smaller the thin cloud value \(x\), the larger the assigned weight \(w\). Then, the weighted average of each pixel is calculated to obtain the weekly fused image.
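The fusion step can be sketched as below. The exact power function is not restated in the text, so w = (1 − x)² is used purely for illustration; it satisfies the stated property that smaller normalized thin-cloud values x receive larger weights w in [0, 1]. The function name is hypothetical.

```python
import numpy as np

def fuse(images, thin_cloud):
    """Weighted average of co-registered pre-processed images.

    images     -- image stack, shape (n, h, w)
    thin_cloud -- normalized thin-cloud values per image, shape (n, h, w)
    """
    w = (1.0 - thin_cloud) ** 2             # hypothetical power function
    # per-pixel weighted mean; eps guards all-cloud pixels
    return (w * images).sum(axis=0) / (w.sum(axis=0) + 1e-12)
```

Pixels that are cloud-free in one pass (x near 0) dominate the weekly fused image, while heavily contaminated pixels contribute little.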