1. Introduction
Sea ice is integral to the regulation of the global climate system. Its high albedo reflects a substantial portion of solar radiation back into space, which helps maintain Earth’s thermal balance [1]. In addition, sea ice exerts considerable influence on human and economic activities, including global fisheries and maritime shipping, especially in high-latitude regions [2]. Therefore, research on the variability and distribution of global sea ice is critically important.
Traditional sea ice remote sensing is generally categorized into two domains: optical remote sensing and microwave remote sensing [3]. Optical-based methods are susceptible to weather constraints such as cloud cover and dense fog, rendering them incapable of achieving all-weather sea ice monitoring. Microwave remote sensing of sea ice primarily utilizes passive microwave radiometers, Synthetic Aperture Radar (SAR), scatterometry, and radar altimeters. SAR-based techniques have become prominent in sea ice classification research due to their higher spatial resolution [4]. However, challenges such as infrequent revisit times, large data volumes, and high deployment costs restrict their broad application in large-scale sea ice remote sensing.
Global Navigation Satellite System Reflectometry (GNSS-R) is an emerging remote sensing technology that offers a novel approach to polar monitoring due to its low cost, high spatiotemporal resolution, and all-weather observation capabilities. GNSS-R has been widely applied in areas such as wind speed measurement [5,6,7], soil moisture retrieval [8], and inland water monitoring [9]. In particular, in the field of sea ice remote sensing, the continuous iteration of GNSS-R data sources has driven rapid advances in related research. The TechDemoSat-1 (TDS-1) satellite, launched in 2014, provided foundational data for early sea ice remote sensing [10]. Following TDS-1, a new generation of GNSS-R satellites with high-latitude coverage capabilities has been successively launched, such as the Fengyun-3E (FY-3E) satellite and the Tianmu-1 constellation. Currently, multiple studies [11,12,13,14] have demonstrated the effectiveness of data from these satellites for sea ice remote sensing. Building on these observational data, Yan and Huang pioneered the use of TDS-1 data to assess the feasibility of GNSS-R signals for sea ice detection [15], later improving detection performance with neural networks [16] and convolutional neural networks (CNNs) [17]. Research has since expanded to retrieving additional sea ice parameters, such as concentration [18,19] and thickness [20,21], further validating the utility of GNSS-R signals in sea ice remote sensing.
Compared to the retrieval of other sea ice parameters, research on sea ice classification based on GNSS-R is still in its nascent stage. In 2019, Rodriguez-Alvarez et al. [22] conducted the first study on GNSS-R sea ice classification. They used a multi-step classification strategy based on observables extracted from Delay-Doppler Maps (DDMs). The approach began with histogram thresholding for sea ice detection, followed by a Classification and Regression Tree (CART) model for sea ice type classification. GNSS-R data from the TDS-1 satellite and ground-truth data from the U.S. National Ice Center (USNIC) sea ice charts were used. The CART model achieved recall rates of 54.58% for First-Year Ice (FYI), 94.09% for Multi-Year Ice (MYI), and 69.74% for Young Ice. After applying a spatio-temporal dominant class test, the recall rates changed to 70.03%, 82.34%, and 80.75%, respectively. In 2021, Zhu et al. [23] introduced machine learning algorithms, including Random Forest (RF) and Support Vector Machine (SVM), and extracted novel waveform features from DDMs. Using the Ocean and Sea Ice Satellite Application Facility (OSI SAF) dataset as ground truth and TDS-1 as the data source, the study found that the RF model achieved superior performance, yielding an overall accuracy of 84.82% and a Kappa coefficient of 0.39. Subsequently, Cheng et al. [24] introduced a CNN to classify sea ice directly from DDMs, using OSI SAF ground-truth data and GNSS-R data from the FY-3E satellite. This method achieved recall rates of 71.23% for FYI and 82.79% for MYI, slightly above the post-test rates reported in [22]. More recently, Chen and Huang [25] proposed a spatiotemporal compensation feature set. They employed the Z-score normalized DDM average (ZDDMA) to correct errors from different TDS-1 observation tracks and incorporated temporal data along with temperature data from the European Centre for Medium-Range Weather Forecasts Reanalysis v5 (ERA5) dataset to characterize the seasonal variability in sea ice. The study used an RF classifier and TDS-1 data, with ground truth derived from the National Snow and Ice Data Center (NSIDC) sea ice concentration dataset by assigning sea ice types based on concentration thresholds. This feature set achieved an overall accuracy of 84.86% and a Kappa coefficient of 0.6931, with F1-scores of 90.11%, 64.20%, and 81.59% for FYI, MYI, and Thin Ice, respectively. The study also evaluated performance in the Beaufort-Chukchi Sea, a region with a balanced distribution of sea ice types, where overall accuracy reached 85.55%, the Kappa coefficient was 0.7370, and F1-scores for the three ice types were 89.35%, 80.84%, and 78.59%, respectively.
In summary, there remains significant room for improvement in the accuracy of existing GNSS-R sea ice classification methods, particularly regarding the suboptimal identification of MYI. To address this issue, this study introduces a novel set of observables for GNSS-R sea ice classification. For scattering dynamic properties, the Spectral Entropy (SE) of the Normalized Integrated Delay Waveform (NIDW) First-order Differential Curve is designed based on DDMs to quantify the oscillation complexity of the trailing edge power decay process, thereby capturing the distinct power decay patterns of FYI and MYI within the trailing edge interval. For scattering intensity attenuation properties, the Root Mean Square height (RMS) is calculated based on a coherent scattering model to quantify the attenuation magnitude of reflection intensity caused by surface roughness and related factors. Additionally, sea ice salinity (S) and L-band brightness temperature (TB) data from the SMOS satellite are incorporated to complement the characterization of dielectric and radiative properties, enabling L-band active-passive synergistic observation. The primary aim of this study is to construct a multi-dimensional feature set to improve GNSS-R sea ice classification, with particular emphasis on resolving classification ambiguity between FYI and MYI and significantly enhancing MYI identification.
The remainder of this paper is organized as follows:
Section 2 describes the study area and datasets, including GNSS-R observations, SMOS data, and OSI SAF sea ice ground-truth data, along with the strategies for data selection, filtering and dataset partitioning.
Section 3 establishes the theoretical framework for sea ice classification based on the multi-dimensional feature set, detailing the principles of GNSS-R-based sea ice classification, feature extraction mechanisms, and the experimental workflow.
Section 4 presents experimental results and analysis. It details the selection of the optimal feature set through an ablation study combining traditional and novel features.
Section 5 presents a discussion, focusing on an analysis of misclassified samples to investigate the spatial distribution patterns and physical causes of classification errors.
Finally, Section 6 concludes the paper and suggests directions for future research.
3. Methodology
To address the challenge of ambiguity in identifying FYI and MYI under complex Arctic sea conditions, this study proposes a sea ice classification method based on a multi-dimensional feature set, integrating GNSS-R scattering mechanisms with SMOS multi-source observations. The section is organized into four components: the theoretical foundation of GNSS-R-based sea ice classification, feature extraction principles, classifier methodology, and the implementation workflow.
Section 3.1 explains the physical mechanisms of GNSS-R sea ice scattering using the Zavorotny–Voronovich (Z-V) model.
Section 3.2 details the extraction principles and physical significance of the four novel features (SE, RMS, S, TB) introduced in this study.
Section 3.3 describes the construction of the Random Forest (RF) classifier.
Section 3.4 synthesizes these theories and methods to provide the complete workflow of the sea ice classification experiment.
3.1. Sea Ice Classification Theory Based on GNSS-R
GNSS-R uses L-band signals from multiple GNSS constellations as signal sources to perform remote sensing by receiving signals reflected from the sea ice surface. The observation follows a bistatic scattering geometry. The scattering characteristics of the sea ice surface can be described by the Z-V model [29], which formulates the scattered signal power as a two-dimensional function of delay and Doppler, as presented in Equation (1):
$$\left\langle \left| Y(\tau, f) \right|^{2} \right\rangle = \frac{\lambda^{2} T_{i}^{2} P_{t}}{(4\pi)^{3}} \iint_{A} \frac{G_{t} G_{r} \sigma^{0}}{R_{t}^{2} R_{r}^{2}} \Lambda^{2}(\tau) S^{2}(f) \, \mathrm{d}A \tag{1}$$
where $\tau$ and $f$ denote the time delay and Doppler frequency shift, respectively; $P_{t}$ is the satellite transmit signal power; $G_{t}$ is the gain of the transmitter antenna; $G_{r}$ is the gain of the receiver antenna; $T_{i}$ is the coherent integration time in the signal; $R_{t}$ is the distance from the transmitter to the surface point; $R_{r}$ is the distance from the receiver to the surface point; $\sigma^{0}$ is the term related to the parameters of the reflecting surface, including the reflection coefficient and roughness; $S$ is the Doppler shift function; $\Lambda$ is the GNSS code correlation function; $A$ is the scattering effective area; and $\lambda$ is the wavelength of the L-band signal emitted by GNSS.
The fundamental observable derived from this model is DDM, which depicts the distribution of reflected signal power across the two-dimensional space of time delay and Doppler frequency shift. The delay dimension captures variations in signal propagation paths, encapsulating coupled information regarding surface roughness and surface scattering. Meanwhile, the Doppler dimension primarily reflects surface geometric effects and kinematic properties.
Sea ice is classified into FYI and MYI based on whether it has undergone a complete freeze–thaw cycle [30]. There exist fundamental differences between the two in terms of internal physical structure, dielectric properties, and surface geometric morphology. These differences determine the distinct scattering response patterns of L-band signals during interaction. The structures, scattering mechanisms, and corresponding DDMs for different sea ice types are illustrated in Figure 2.
Specifically, existing studies [31,32] indicate that in terms of physical properties, FYI forms rapidly within a single winter. Its interior is relatively homogeneous, and the brine drainage process remains insufficient, resulting in consistently high overall salinity. The surface, which has not undergone severe dynamic compression or prolonged weathering, maintains a relatively level morphology with lower roughness. In contrast, MYI has experienced at least one complete freeze–thaw cycle. During the summer melting phase, meltwater flushing drives significant brine exclusion, thereby decreasing overall salinity. The repeated freeze–thaw process causes the voids originally occupied by brine to be filled with air, forming a dense porous microstructure with distinct dielectric properties. Additionally, multi-year dynamic compression, uneven refreezing, and long-term weathering collectively result in significantly higher surface roughness for MYI than for FYI. Together, these differences in salinity, microstructure, and roughness fundamentally govern the distinct L-band scattering responses of the two ice types.
These physical property differences produce distinct scattering mechanisms and DDM characteristics for FYI and MYI. The relatively smooth surface of FYI results in dominant coherent specular reflection, concentrating scattering power near the specular point (SP). High salinity increases the dielectric constant, enhancing surface reflectivity [33]. Consequently, the DDM exhibits high peak power and a narrow waveform. High dielectric loss suppresses internal signal penetration, and the homogeneous structure simplifies the signal propagation path, leading to a rapid decay trend in power along the DDM trailing edge interval as the delay increases. Conversely, the scattering mechanism of MYI is more complex. On one hand, the rough surface induces partial incoherent diffuse scattering, dispersing power across a broader range of delays and Doppler shifts. On the other hand, the low dielectric constant caused by low salinity reduces surface reflectivity. A significant portion of the L-band signal penetrates into the sea ice interior, where its heterogeneous structure induces multipath and volume scattering [34]. This process prolongs the signal propagation path. Overall, this manifests in the DDM as lower peak power and significant waveform broadening.
The DDMs from the BDS-R dataset have a dimension of 20 Doppler bins × 122 delay bins. The delay axis offers higher resolution (1/8 chip) near the specular point (SP) compared to 1/4 chip in other regions.
Figure 3 displays the high-resolution intervals of the DDMs for both FYI and MYI. The high-resolution interval of BDS-R enables a more refined characterization of the scattering power decay process near the SP, capturing subtle variations in sea ice scattering and improving data quality for feature extraction. Therefore, this study concentrates on the high-resolution interval from −2.875 to 2.875 chips, encompassing 47 delay sampling points, for subsequent analyses.
3.2. Sea Ice Type Features Extraction
The scattering characteristics and intensity of reflected signals have served as the critical basis for sea ice classification in previous studies. Based on the distinct scattering mechanisms of GNSS-R signals interacting with different ice types, existing research has extracted multiple features from DDMs for sea ice type discrimination [22,23,25], such as DDMA and various Delay Waveform parameters. To construct a comprehensive multi-dimensional candidate feature set, the present study retains traditional features while incorporating four novel features: SE, RMS, S, and TB. These additional features characterize the differences between FYI and MYI across four dimensions: scattering dynamic properties, scattering intensity attenuation properties, dielectric properties, and radiative properties. This section focuses on the calculation principles and physical significance of these newly introduced features.
3.2.1. Spectral Entropy of the Normalized Integrated Delay Waveform First-Order Differential Curve
The Integrated Delay Waveform (IDW) is defined as the integration of the DDM along the Doppler dimension, calculated by summing the delay waveforms across 20 distinct Doppler frequency shifts. The normalized IDW is denoted as NIDW.
Figure 4 illustrates the typical differences in the NIDW of FYI and MYI within the trailing edge interval. In contrast to the rapid monotonic power decay observed in FYI, MYI exhibits significant oscillatory characteristics and a slower rate of decay. The trailing edge slope feature describes the macroscopic average power decay rate difference between the two ice types within the trailing edge interval by computing the slope over a selected trailing edge window. However, this waveform slope feature is insensitive to the fluctuation of the waveform within the calculation interval, and is therefore unable to capture the complex power variation characteristics of MYI within the trailing edge interval.
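As an illustration of the IDW and NIDW definitions above, the following minimal sketch sums a DDM over its Doppler axis and normalizes the result. The array layout (Doppler bins along the first axis) and the use of peak normalization are assumptions made for this example; the text does not state which normalization the study applies, and `normalized_idw` is a hypothetical helper name.

```python
import numpy as np


def normalized_idw(ddm):
    """Integrate a DDM along Doppler and normalize the resulting waveform.

    ddm: 2-D array shaped (n_doppler, n_delay). This axis order is an
    assumption for the sketch, e.g. (20, 47) for the high-resolution interval.
    """
    ddm = np.asarray(ddm, dtype=float)
    if ddm.ndim != 2:
        raise ValueError("expected a 2-D delay-Doppler map")
    idw = ddm.sum(axis=0)  # sum the delay waveforms over all Doppler bins
    peak = idw.max()
    if peak <= 0:
        raise ValueError("IDW has no positive power to normalize by")
    return idw / peak  # peak normalization (assumed, not from the paper)


# Synthetic example: a flat map with one stronger delay bin.
ddm = np.ones((20, 47))
ddm[:, 23] = 2.0
nidw = normalized_idw(ddm)
print(nidw.shape, nidw.max())
```

With peak normalization, the trailing-edge samples are expressed as fractions of the waveform maximum, which keeps waveforms from different tracks on a common scale before the slope or differential features are computed.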
To quantify this complex power decay process and capture the instantaneous rate of change in power, this study calculates the first-order difference of the NIDW, yielding a First-order Differential Curve representing power variation:
$$D(\tau) = \mathrm{NIDW}(\tau + \Delta\tau) - \mathrm{NIDW}(\tau)$$
where $\tau$ represents the time delay, and $\Delta\tau$ denotes the unit delay sampling interval (1/8 chip).
Figure 5 compares the First-order Differential Curves of the NIDW for FYI and MYI. The First-order Differential Curve of FYI remains consistently negative and monotonically converges within the trailing edge interval, approaching zero after approximately 0 to 8 delay sampling intervals. In contrast, the First-order Differential Curve of MYI exhibits overall alternating positive and negative oscillations throughout the trailing edge interval, with the signal power failing to decay to zero across the entire high-resolution trailing edge delay interval. Furthermore, the power decay rate of MYI fluctuates considerably within the 0 to 8 delay sampling interval, and the curve oscillation within the 8 to 23 delay sampling interval is significantly greater than that of FYI, resulting in a higher overall oscillation complexity of the waveform.
To quantify the oscillation complexity of the aforementioned NIDW trailing edge first-order differential curve, this study extracts the SE of the curve over the entire trailing edge interval. SE characterizes the degree of disorder in the energy distribution of the power spectrum of the curve. By applying a Fast Fourier Transform (FFT) to the trailing edge differential curve of the NIDW, the entropy of the frequency-domain power spectrum is computed. The more concentrated the energy in a small number of frequency components, the lower the entropy value; the more uniformly the energy is dispersed across multiple frequency components, the higher the entropy value. The specific calculation procedure is as follows:
$$X(k) = \sum_{n=0}^{N-1} D(n) \, e^{-j 2\pi k n / N}$$
$$P(k) = \left| X(k) \right|^{2}$$
$$p(k) = \frac{P(k)}{\sum_{k'} P(k')}$$
$$\mathrm{SE} = -\sum_{k} p(k) \log_{2} p(k)$$
where $N$ denotes the number of delay sampling points of the First-order Differential Curve (in this study, the complete trailing edge interval is used, with a value of 23); $n$ denotes the delay index; $k$ is the discrete frequency index; $P(k)$ represents the power spectral value corresponding to the $k$-th frequency component; and $p(k)$ denotes the normalized probability distribution of the power spectrum, reflecting the proportion of each frequency component in the total energy.
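The procedure described above (first-order difference, FFT, normalized power spectrum, entropy in bits) can be sketched as follows. This is an illustrative sketch rather than the study's implementation: the function name `spectral_entropy_bits` is hypothetical, and the use of the one-sided spectrum from `numpy.fft.rfft` is an assumption. Using the full two-sided spectrum would change the absolute SE values but not the ordering between a concentrated and a dispersed spectrum.

```python
import numpy as np


def spectral_entropy_bits(nidw_trailing):
    """Spectral entropy (bits) of the first-order difference of a waveform.

    nidw_trailing: 1-D NIDW samples over the trailing edge. Passing 24
    samples yields a 23-point differential curve.
    """
    x = np.asarray(nidw_trailing, dtype=float)
    if x.ndim != 1 or x.size < 2:
        raise ValueError("need a 1-D waveform with at least two samples")
    diff = np.diff(x)                        # first-order differential curve
    power = np.abs(np.fft.rfft(diff)) ** 2   # one-sided power spectrum (assumed)
    total = power.sum()
    if total == 0:
        return 0.0                           # flat waveform: no spectral energy
    p = power / total                        # proportion of energy per component
    p = p[p > 0]                             # treat 0 * log2(0) as 0
    return float(-(p * np.log2(p)).sum())


# A linear decay has a constant difference, so all energy sits at zero frequency.
se_ramp = spectral_entropy_bits(np.arange(24, 0, -1.0))

# A single abrupt jump has a flat spectrum, the most dispersed case.
step = np.zeros(24)
step[1:] = 1.0
se_step = spectral_entropy_bits(step)
print(se_ramp, se_step)
```

The two synthetic inputs bracket the behaviour described in the text: energy concentrated in one frequency component gives an entropy near zero, while energy spread evenly over all components gives the maximum for the chosen spectrum length.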
From the perspective of physical scattering mechanisms, FYI is dominated by coherent surface reflection, resulting in a nearly monotonic trailing edge First-order Differential Curve. Following the FFT, the energy is highly concentrated in low frequency components, yielding a highly non-uniform distribution and a correspondingly low SE value. In contrast, MYI exhibits higher surface roughness, and due to volume scattering caused by its internal structure as well as multipath effects, the power values are dispersed during the decay process. Following the FFT of the MYI First-order Differential Curve, the energy is dispersed across multiple frequency components, producing a more uniform distribution and a correspondingly higher SE value.
Figure 6 illustrates the distribution of SE values for FYI and MYI within the dataset. The two ice types exhibit distinct distribution patterns. For FYI, SE values are overall lower, presenting a narrower top and wider bottom distribution, with a median of 2.085 bits and 50% of samples concentrated within the range of 1.867 to 2.479 bits. In contrast, MYI SE values are overall higher, presenting a distribution concentrated in the middle, with a median of 2.455 bits, which is 0.370 bits higher than the median of FYI, and 50% of samples are concentrated within the range of 2.110 to 2.694 bits. Overall, the SE value distributions of FYI and MYI exhibit clear differences that are consistent with theoretical expectations, demonstrating the effectiveness of this feature for sea ice type classification.
3.2.2. Root Mean Square Height
Root Mean Square height is defined as the standard deviation of surface heights deviating from the mean surface level. It serves as a reliable parameter for characterizing geometric texture and quantifying surface roughness [35]. FYI typically exhibits a relatively smooth and flat surface with stronger coherent reflectance, whereas MYI, having undergone multiple freeze–thaw cycles and weathering processes, presents an uneven surface with higher roughness, resulting in a greater attenuation magnitude of signal reflectance intensity. According to coherent scattering theory [36], when electromagnetic waves are incident on a rough surface, the surface reflectivity is modulated by the RMS, exhibiting an exponential decay.
Based on this coherent scattering theory, this study utilizes the BDS-R observational data to compute RMS:
$$\mathrm{RMS} = \frac{\lambda}{4\pi \cos\theta} \sqrt{\ln\left( \frac{\left| \Re(\theta) \right|^{2}}{\Gamma} \right)}$$
where $\Gamma$ is the reflectivity at the SP, $\theta$ is the incidence angle, $\lambda$ represents the wavelength of the L-band signal emitted by GNSS, which is 19.2 cm, and $\Re(\theta)$ is the Fresnel reflection coefficient. These parameters can be directly obtained from the BDS-R data products.
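To make the inversion concrete, the sketch below assumes the commonly used coherent attenuation model, in which the measured reflectivity equals the smooth-surface power reflectivity multiplied by exp(−(4π·RMS·cos θ/λ)²), and solves it for RMS. The model form, the function name `rms_height_cm`, and the argument `fresnel_power` (the squared magnitude of the Fresnel coefficient) are assumptions for illustration; the exact quantities stored in the BDS-R product may differ.

```python
import math


def rms_height_cm(reflectivity, fresnel_power, incidence_deg, wavelength_cm=19.2):
    """Invert an exponential roughness attenuation model for RMS height.

    Assumed model: reflectivity = fresnel_power * exp(-(4*pi*s*cos(theta)/lam)**2).
    Returns s in the same unit as wavelength_cm, or NaN when the measured
    reflectivity exceeds the smooth-surface reference.
    """
    if reflectivity <= 0 or fresnel_power <= 0:
        raise ValueError("reflectivity and fresnel_power must be positive")
    ratio = fresnel_power / reflectivity
    if ratio < 1.0:
        return math.nan  # no attenuation relative to the reference
    cos_theta = math.cos(math.radians(incidence_deg))
    return wavelength_cm / (4.0 * math.pi * cos_theta) * math.sqrt(math.log(ratio))


# Round trip on synthetic values: attenuate with a known RMS, then recover it.
true_rms, theta, r2 = 3.0, 30.0, 0.6
k = 4.0 * math.pi * true_rms * math.cos(math.radians(theta)) / 19.2
gamma = r2 * math.exp(-k * k)
recovered = rms_height_cm(gamma, r2, theta)
print(round(recovered, 6))
```

Dividing by cos θ is what makes observations at different incidence angles comparable: the same attenuation ratio maps to a larger RMS at a larger incidence angle.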
It should be noted that, as confirmed by the FY-3E GNOS-II data team, the Fresnel reflection coefficient provided in the BDS-R data and used in this formula represents a theoretical value for a smooth ocean surface, rather than the Fresnel reflection coefficient of the sea ice surface. Specifically, it is a theoretical value computed at the air–ocean interface based on the Klein–Swift dielectric model, using climatological static seawater parameters and a smooth ocean surface as the coherent reflection reference. The RMS value derived from this formula therefore carries specific mathematical and physical implications that warrant clarification. In a mathematical sense, the formula applies a geometric normalization correction through the incidence angle to eliminate the systematic influence of varying observation geometries. It thereby quantifies the attenuation magnitude of the measured sea ice reflectance relative to the theoretical Fresnel coefficient of a smooth ocean surface reference and renders observations acquired at different incidence angles mutually comparable. In a physical sense, this attenuation magnitude reflects the combined influence of multiple factors affecting the reflection intensity of the sea ice surface, including surface roughness and internal structural heterogeneity. Accordingly, RMS possesses an inherent physical capacity to discriminate between FYI and MYI based on their distinct scattering intensity attenuation characteristics.
Figure 7 illustrates the statistical distribution of the computed RMS values across different sea ice types. FYI exhibits a median of 3.47, with 50% of samples concentrated within the range of 3.06 to 3.78. In contrast, MYI exhibits a median of 4.01, with 50% of samples concentrated within the range of 3.62 to 4.31, representing a median difference of 0.54 relative to FYI. The statistical results demonstrate that RMS effectively captures the differences between FYI and MYI in terms of scattering intensity attenuation characteristics, providing physically grounded dimensional information for sea ice type discrimination.
3.2.3. Sea Ice Salinity
Salinity is a critical parameter controlling the microwave dielectric properties of sea ice. During the freezing process, FYI entraps a significant amount of brine. In contrast, MYI undergoes summer melting, during which meltwater flushing drains brine, lowering its salinity. Consequently, this study introduces SMOS salinity data for sea ice classification.
According to the Ulaby Model [37] and the Vant Model [33], there exists a nonlinear mapping relationship between sea ice salinity and its dielectric constant:
$$V_{b} = V_{b}(S, T), \qquad \varepsilon = a + b \, V_{b}$$
where $S$ and $T$ represent the salinity and temperature of the sea ice, $V_{b}$ denotes the brine volume, and $a$ and $b$ are calculation coefficients. The aforementioned physical formula indicates that salinity determines the brine volume, which in turn dominates the variations in the dielectric constant.
Figure 8 presents the ground-truth distribution of Arctic FYI and MYI alongside the spatial distribution of salinity for November 2023. This month was chosen because Arctic sea ice is in a rapid growth phase during this period, and the ice type distribution is relatively stable, minimizing drastic salinity changes and ensuring representativeness. As shown in Figure 8a,c, the spatial distribution of salinity exhibits high consistency with the OSI SAF ground-truth sea ice type distribution, suggesting that introducing the salinity feature can enhance the capability to capture the fundamental differences between FYI and MYI from the dimension of dielectric properties. Specifically, low-salinity regions (<6 psu) are concentrated in the Canadian Basin and its surrounding areas, the Greenland Sea, and the central Arctic Basin, corresponding to the primary distribution areas of MYI; in contrast, high-salinity regions (>12 psu) are distributed in the Arctic marginal seas, such as Baffin Bay, the Beaufort Sea, and the Laptev Sea, corresponding to the primary distribution areas of FYI. The transition zones (6–7 psu) are mainly located in the eastern Greenland Sea and the Canadian Basin adjacent to the Beaufort Sea, which may represent potential challenging areas for sea ice classification.
3.2.4. L-Band Brightness Temperature
Brightness Temperature represents the intensity of microwave radiation emitted from an object’s surface. This study introduces L-band TB from SMOS into GNSS-R sea ice classification. Previous studies [38] have utilized brightness temperatures at high-frequency bands for sea ice classification. At high frequencies (e.g., 37 GHz), the shorter signal wavelength results in a shallow penetration depth, where the brightness temperatures of both FYI and MYI originate primarily from the radiant energy of the sea ice itself. In this frequency band, compared to FYI, the porous structure present in the upper layers of MYI attenuates the radiant energy emitted from the interior to the surface, causing its brightness temperature to be lower than that of FYI.
In contrast, the L-band TB introduced in this study is determined jointly by sea ice thickness and dielectric properties. Existing research [39] indicates that TB is sensitive to variations in sea ice thickness up to a saturation thickness (1.4 m). Within this threshold, sea ice thickness is significantly positively correlated with both emissivity and surface brightness temperature; that is, as sea ice thickness decreases, both its emissivity and surface brightness temperature decrease accordingly. Specifically, the high salinity of FYI increases surface reflectivity, thereby reducing its microwave emissivity. Furthermore, because the majority of FYI is relatively thin, L-band signals penetrate the ice layer, so the observed TB includes radiant energy from both the sea ice and the underlying seawater, which has low emissivity. Consequently, the superposition of the thickness and dielectric property mechanisms results in FYI exhibiting low brightness temperatures in the L-band. Conversely, for MYI, the thicker ice layer effectively blocks the radiant energy from the underlying seawater. Simultaneously, its low salinity characteristic results in a higher emissivity. Unlike in high-frequency bands, where energy is attenuated by bubble structures, the longer L-band signal wavelength renders this attenuation negligible. Thus, the L-band TB of MYI is composed primarily of the radiant energy from the sea ice itself, presenting a higher TB than FYI in the L-band.
Figure 8b shows the spatial distribution of L-band TB for FYI and MYI in November 2023. It can be observed that the surface brightness temperature also exhibits consistency with the actual sea ice distribution. Simultaneously, it shows a significant negative correlation with the salinity distribution, confirming that high salinity leads to greater dielectric loss. The central Arctic Basin and other regions where MYI is primarily distributed appear as high TB zones (>245 K), while marginal seas where FYI is primarily distributed appear as low TB zones (<220 K). The Greenland Sea, Canadian Basin, and surrounding areas exhibit TB transition zones (220–245 K). In these regions, some FYI exhibits anomalously high TB, highlighting the classification challenges in these areas.
3.3. Random Forest
The method proposed in this study employs the Random Forest (RF) classification algorithm. The fundamental philosophy of RF is that “an ensemble of multiple weak learners can constitute a strong learner.” It operates by constructing multiple decision trees, where each decision node randomly selects a subset of features to search for the optimal split point. The experiments were implemented using the scikit-learn Python library (version 1.1.2). The key hyperparameters involved include the number of estimators, the maximum depth of the decision trees, and the maximum number of features considered in each training subspace. The primary advantages of this algorithm lie in its ability to mitigate the risk of overfitting and effectively capture non-linear relationships within the data. Furthermore, the ability to adjust class weights makes it suitable for addressing class imbalance in sea ice type datasets.
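As a rough illustration of how the hyperparameters named above map onto scikit-learn, the sketch below configures a `RandomForestClassifier` and fits it to synthetic, imbalanced two-class data. The specific values (200 trees, depth 12, `"sqrt"` features), the use of `class_weight="balanced"`, and the synthetic data are placeholders chosen for this example, not the settings or data used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def make_rf(n_estimators=200, max_depth=12, max_features="sqrt", seed=0):
    """Build an RF classifier; all default values here are illustrative."""
    return RandomForestClassifier(
        n_estimators=n_estimators,   # number of decision trees
        max_depth=max_depth,         # maximum depth of each tree
        max_features=max_features,   # features considered at each split
        class_weight="balanced",     # reweight classes inversely to frequency
        random_state=seed,
    )


# Synthetic stand-in for a feature table: 8 features, roughly 80/20 class split.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
clf = make_rf()
clf.fit(X[:450], y[:450])
print(clf.score(X[450:], y[450:]))
print(clf.feature_importances_.round(3))
```

The `feature_importances_` attribute gives a first view of how much each feature contributes to the splits, which can complement an ablation study but does not replace it.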
3.4. Process of Sea Ice Classification
This study proposes an Arctic sea ice classification workflow based on a multi-dimensional feature set.
Figure 9 illustrates the specific classification process, which primarily comprises the following three stages:
(1) Stage 1: Data Collection. The BDS-R data from FY-3E, the SMOS L3 data, and the OSI SAF sea ice ground-truth data are spatiotemporally collocated. The spatial and temporal scope is restricted to data with latitudes greater than 70°N from 15 October 2023 to 15 April 2024. Subsequently, data samples with SNR < 3 dB, an incidence angle > 40°, or SIC < 60% are excluded.
(2) Stage 2: Optimal Feature Set Selection. Based on the DDMs and observables provided by BDS-R, the scattering dynamic properties feature SE, the scattering intensity attenuation properties feature RMS, and other traditional features are calculated. These are combined with the S and TB data provided by SMOS to construct a candidate multi-dimensional feature set. Following the “Spatial Block Isolation” partitioning strategy described in Section 2.4, feature selection is restricted to the training set, and the optimal feature set is determined via an ablation study.
(3) Stage 3: Sea Ice Classification Experiments and Analysis. Using the selected optimal feature set, the overall classification performance is evaluated on an independent test set based on the RF classifier. Comparative experiments with other feature sets are conducted, and the contribution of the newly introduced features to classification results is evaluated separately. Furthermore, a specific cause analysis is performed for misclassified samples to interpret the impact of geographic factors and other variables on model performance.
Figure 9.
Flowchart of the proposed sea ice classification method. The framework utilizes FY-3E BDS-R and SMOS datasets to extract and integrate features, constructing a multi-dimensional candidate feature set for experimental validation and analysis.
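The Stage 1 screening rules can be expressed as a small predicate, sketched below. The `Sample` record and its field names are hypothetical, and treating both date bounds as inclusive is an assumption; only the thresholds themselves (latitude above 70°N, SNR of at least 3 dB, incidence angle of at most 40°, SIC of at least 60%) come from the workflow description.

```python
from dataclasses import dataclass
from datetime import date

START, END = date(2023, 10, 15), date(2024, 4, 15)  # study period


@dataclass
class Sample:
    """Hypothetical collocated record; field names are illustrative."""
    lat_deg: float
    day: date
    snr_db: float
    incidence_deg: float
    sic_percent: float


def keep(s: Sample) -> bool:
    """Return True if the sample passes the Stage 1 scope and quality rules."""
    in_scope = s.lat_deg > 70.0 and START <= s.day <= END  # inclusive dates assumed
    good_quality = (
        s.snr_db >= 3.0               # samples with SNR < 3 dB are excluded
        and s.incidence_deg <= 40.0   # samples with incidence > 40 deg are excluded
        and s.sic_percent >= 60.0     # samples with SIC < 60% are excluded
    )
    return in_scope and good_quality


samples = [
    Sample(78.2, date(2023, 11, 20), 6.5, 25.0, 95.0),  # kept
    Sample(78.2, date(2023, 11, 20), 2.1, 25.0, 95.0),  # low SNR
    Sample(68.0, date(2024, 1, 5), 8.0, 10.0, 99.0),    # south of 70 N
    Sample(80.0, date(2024, 4, 16), 8.0, 10.0, 99.0),   # after the study period
]
kept = [s for s in samples if keep(s)]
print(len(kept))
```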
5. Discussion
This section analyzes the spatial distribution patterns and physical causes of the misclassified samples. The overall misclassified samples account for 6.14% of the total dataset, amounting to 11,151 samples. These error samples are not randomly distributed but exhibit significant spatial clustering.
Figure 13 illustrates the spatial distribution of the misclassification results in the Arctic. The misclassified samples are notably concentrated in two key regions: the Greenland Sea (20°W–10°E, 70–82°N) with 3500 error samples, and the Canadian Basin (122–155°W, 72–82°N) with 4314 error samples. The remaining 3337 samples are almost evenly distributed across other Arctic regions, indicating that regional geographic environments influence classification performance. In addition, to investigate the temporal evolution of misclassification, this study divided the dataset’s time range (15 October 2023 to 15 April 2024) into three seasons: Autumn (October–November), Winter (December–February), and Spring (March–April).
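The regional and seasonal grouping described above could be reproduced along the following lines. The bounding boxes and month-to-season mapping are taken from the text; the function names, the convention of longitudes in degrees east within −180 to 180, and the inclusive box edges are assumptions made for this sketch.

```python
from datetime import date

# Month-to-season mapping for the study period (15 Oct 2023 to 15 Apr 2024).
SEASON_BY_MONTH = {
    10: "Autumn", 11: "Autumn",
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring",
}


def season_of(day: date):
    """Return the season label, or None for months outside the study period."""
    return SEASON_BY_MONTH.get(day.month)


def region_of(lat_deg: float, lon_deg: float) -> str:
    """Assign a sample to one of the two key regions or to 'Other'.

    lon_deg is in degrees east, in [-180, 180] (assumed convention).
    """
    if 70.0 <= lat_deg <= 82.0 and -20.0 <= lon_deg <= 10.0:
        return "Greenland Sea"      # 20W-10E, 70-82N
    if 72.0 <= lat_deg <= 82.0 and -155.0 <= lon_deg <= -122.0:
        return "Canadian Basin"     # 122-155W, 72-82N
    return "Other"


print(region_of(78.0, -5.0), season_of(date(2024, 1, 15)))
print(region_of(75.0, -140.0), season_of(date(2023, 10, 20)))
```

Counting misclassified samples per (region, season) pair with these two labels yields the kind of breakdown discussed for the Greenland Sea and the Canadian Basin.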
Figure 14 reveals a pronounced spatiotemporal clustering of misclassification in the Greenland Sea. The errors in this region are predominantly FYI misclassified as MYI, totaling 2326 samples and accounting for 66.46% of the regional errors. The number of misclassified FYI samples peaked at 1253 during winter, substantially higher than in other seasons. Furthermore, the latitude distribution analysis confirms the spatial concentration of misclassification, identifying the 76–80°N latitude zone as the primary area of error. This zone corresponds to the Fram Strait and the path of the downstream East Greenland Current. The spatiotemporal clustering pattern is strongly consistent with known regional sea ice dynamic processes. Existing studies indicate that the Fram Strait is the primary gateway for sea ice export from the Arctic Ocean to the Atlantic and that export is most active in winter [40]. The high drift speeds and strong shear in this region [41] lead to mechanical deformation processes such as ridging and rafting. The elevated RMS of FYI samples misclassified as MYI, relative to correctly classified FYI, indicates a greater attenuation magnitude of reflection intensity, which is primarily associated with the increased surface roughness produced by such deformation. Therefore, this study infers that ridging and rafting in this region alter the geometric structure of FYI, causing its scattering characteristics to resemble those of MYI and inducing classification confusion.
The Canadian Basin exhibits a misclassification pattern that contrasts starkly with that of the Greenland Sea. Figure 15 shows that MYI misclassified as FYI is the dominant error type in this region, totaling 2797 samples and accounting for 64.88% of all regional errors, with minimal seasonal variation. TB analysis indicates that misclassified MYI consistently exhibits lower brightness temperatures than correctly classified MYI across all seasons. According to the L-band thickness-emissivity-brightness temperature mechanism, as sea ice thins, the observed brightness temperature gradually decreases and may even include radiation from the underlying low-emissivity seawater. This observation aligns with the known thinning trend of MYI in the Canadian Basin [42]. The reduction in thickness likely shifts the radiative properties of MYI towards those of thinner ice, resembling FYI. Furthermore, RMS analysis corroborates a trend of surface smoothing: the lower RMS of misclassified MYI samples, relative to correctly classified MYI, indicates a reduced attenuation magnitude of reflection intensity, consistent with decreased surface roughness. This phenomenon is generally associated with long-term freeze–thaw cycles, during which summer melt ponds smooth out surface roughness [43]. In summary, the misclassification in the Canadian Basin is likely attributable to the combined effects of thinning and surface smoothing. These two physical processes tend to cause the scattering intensity attenuation and radiative characteristics of degraded MYI to converge with those of FYI, thereby inducing classification confusion.
The analysis of the spatial distribution of misclassification reveals that the classifier’s limitations are primarily concentrated in two regions, each with a distinct cause of misclassification. In the Greenland Sea, the misclassification pattern is highly consistent with sea ice dynamic processes. Mechanically deformed FYI is misclassified as MYI because increased surface roughness raises the scattering intensity attenuation magnitude, causing its scattering characteristics to resemble those of MYI. Conversely, the misclassification in the Canadian Basin reflects a long-term trend of thermodynamic degradation in MYI. Specifically, thinning and surface smoothing reduce both the radiative and scattering intensity attenuation characteristics of MYI. This regional heterogeneity reveals the dual challenges facing L-band GNSS-R sea ice classification: (1) in dynamic mixing zones, sea ice undergoes rapid collision and compression, altering its surface features; and (2) in stable MYI zones, sea ice degradation driven by climate warming leads to a transformation towards thinner ice resembling FYI.
6. Conclusions
To address GNSS-R sea ice classification under complex sea conditions, this study proposes a classification feature set that integrates multidimensional sea ice information. The approach introduces SE, which captures the oscillatory complexity of the trailing edge power decay process as a scattering dynamic property; RMS, which quantifies the attenuation magnitude of reflection intensity arising from surface roughness and related factors as a scattering intensity attenuation property; and SMOS salinity and L-band brightness temperature data, which characterize dielectric and radiative properties. Together with 5 traditional features (Sp-ref, REWI, Wd, ZDDMA3, and DDMA7), these constitute a comprehensive GNSS-R sea ice classification feature set of 9 features, spanning scattering dynamic, scattering intensity attenuation, dielectric, and radiative dimensions. By leveraging the sensitivity of GNSS-R to surface structure and the observational advantages of SMOS regarding physical properties, this method achieves complementary advantages within the L-band signal domain.
Experimental results confirm the effectiveness of the multi-dimensional feature set. The final model achieved a classification accuracy of 93.86% and a Kappa coefficient of 0.8061. Notably, the model significantly improved the identification capability for hard-to-distinguish MYI, with the F1-Score for MYI reaching 84.43% and Recall reaching 85.11%. Incremental feature analysis demonstrates that RMS is the primary feature for distinguishing sea ice types, indicating that scattering intensity attenuation arising from surface roughness and related factors is the dominant factor influencing L-band signal scattering. Meanwhile, SE provided the scattering dynamic dimension by capturing the oscillatory complexity of the trailing edge power decay process, and SMOS salinity and brightness temperature data supplemented the dielectric and radiative dimensions, further resolving classification ambiguity between FYI and MYI. Overall, the multi-dimensional feature set proposed in this study demonstrates superior sea ice classification performance compared to previous studies, with significant improvements in MYI classification.
Several limitations of this study should be acknowledged. At the data level, the effective spatial resolution of SMOS substantially exceeds the footprint scale of GNSS-R specular reflection points, causing all GNSS-R samples within the same aggregation grid to share identical SMOS observations. Although the “Spatial Block Isolation” partitioning strategy eliminates data leakage between training and testing sets, this resolution mismatch exists at the data fusion level. Furthermore, SMOS L3 observations are retrieved based on climatological static parameters and cannot capture sea ice dynamic changes in real time, potentially limiting their applicability during periods of rapid sea ice change. At the level of research scope, this study is temporally limited to the freezing season, as neither OSI SAF nor SMOS L3 data products provide reliable sea ice type labels during the melt season. Additionally, this study addresses only the binary classification of FYI and MYI, constrained by the category definitions of the OSI SAF product. Future research will focus on the following: (1) incorporating higher-resolution passive microwave data to address the spatial resolution mismatch at the data fusion level; (2) exploring GNSS-R-based sea ice classification during the melt season, pending the availability of reliable sea ice type reference data for that period; (3) combining deep learning algorithms to extend this method to multi-class sea ice classification; and (4) conducting misclassification correction studies addressing regional heterogeneity in the Greenland Sea and the Canadian Basin revealed in this study.