Spaceborne GNSS-R for Sea Ice Classiﬁcation Using Machine Learning Classiﬁers

: The knowledge of Arctic Sea ice coverage is of particular importance in studies of climate change. This study develops a new sea ice classiﬁcation approach based on machine learning (ML) classiﬁers through analyzing spaceborne GNSS-R features derived from the TechDemoSat-1 (TDS-1) data collected over open water (OW), ﬁrst-year ice (FYI), and multi-year ice (MYI). A total of eight features extracted from GNSS-R observables collected in ﬁve months are applied to classify OW, FYI, and MYI using the ML classiﬁers of random forest (RF) and support vector machine (SVM) in a two-step strategy. Firstly, randomly selected 30% of samples of the whole dataset are used as a training set to build classiﬁers for discriminating OW from sea ice. The performance is evaluated using the remaining 70% of samples through validating with the sea ice type from the Special Sensor Microwave Imager Sounder (SSMIS) data provided by the Ocean and Sea Ice Satellite Application Facility (OSISAF). The overall accuracy of RF and SVM classiﬁers are 98.83% and 98.60% respectively for distinguishing OW from sea ice. Then, samples of sea ice, including FYI and MYI, are randomly split into training and test dataset. The features of the training set are used as input variables to train the FYI-MYI classiﬁers, which achieve an overall accuracy of 84.82% and 71.71% respectively by RF and SVM classiﬁers. Finally, the features in every month are used as training and testing set in turn to cross-validate the performance of the proposed classiﬁer. The results indicate the strong sensitivity of GNSS signals to sea ice types and the great potential of ML classiﬁers for GNSS-R applications. used for classifying FYI and MYI using the spaceborne GNSS-R data. The results indicate that RF and SVM-based GNSS-R have great potential in sea ice classification. This study demonstrates that the great potential of GNSS-R for classifying sea ice types, which can be an effective and com- plementary approach for remotely sensing sea ice. In future studies, more GNSS-R fea- tures, ML algorithms, and environmental effects (such as ocean wind) should be investi- gated to improve the accuracy of classifying FYI and MYI. and


Introduction
Arctic sea ice is one of the most significant components in studies of climate change [1]. The knowledge of sea ice information is useful for shipping route planning and offshore oil/gas exploration. As one of the most important sea ice parameters, sea ice type is of particular interest since the characteristics of first-year ice (FYI) and multi-year ice (MYI) are different [2]. Compared to FYI, MYI has greater thickness and higher albedo, which is critical for energy exchange in the air-sea interface. Some previous studies indicated that the Arctic sea ice has reduced in extent and a part of ice cover is becoming thinner, changing from thicker MYI to thinner FYI [3]. The surface roughness and dielectric constant of different sea ice types change at different stages of ice growth. It is well known that the thinner, changing from thicker MYI to thinner FYI [3]. The surface roughness and dielectric constant of different sea ice types change at different stages of ice growth. It is well known that the surface of FYI is usually smoother than that of MYI. In addition, the salinity of FYI is higher than that of MYI. The FYI around the floe edges tends to undergo deformation when it collides with thicker ice. In general, the ice that survives at least one summer is regarded as MYI, which retains low salinity values and an undulating surface. These characteristics of different sea ice types are the basis for classification.
A wide variety of techniques has been applied to characterize changes in sea ice. Sea ice can be monitored from different platforms, such as buoys [4], ships [5], aircraft [6], and satellites [7]. Among them, satellite-based microwave remote sensing has been regarded as the most effective tool for monitoring sea ice [7].
In recent years, Global Navigation Satellite System (GNSS) Reflectometry (GNSS-R) has emerged as a powerful tool for sensing bio-geophysical features using L-band signals scattered from the Earth's surface [8]. GNSS-R was initially proposed for ocean altimetry in 1993 [9] after the concept of GNSS-R was proposed in 1988 [10]. Subsequently, the scope of the applications of GNSS-R has been extended to various fields, such as wind speed retrieval [11], snow depth estimation [12], soil moisture sensing [13], ocean altimetry [14], and sea ice detection [15]. Most GNSS-R studies were carried out using reflected L-band data collected on ground-based, airborne, and spaceborne platforms, and the latter one is regarded as the future trend due to its global coverage and high mobility [16]. The United-Kingdom (UK) TechDemoSat-1 [16] and NASA Cyclone GNSS (CYGNSS) [17], launched in 2014 and 2016 respectively, have promoted the research of spaceborne GNSS-R since their data are publicly available. Particularly, the TDS-1 data can be used for polar research as its global coverage, while the CYGNSS can only be applied in middle and lower latitude regions as its coverage of interest is the oceans within the latitude of ±38°. Recently, some other spaceborne GNSS-R missions have been successfully carried out one after another, such as the Chinese BuFeng-1 A/B [18], and Fengyun 3E [19].
Many applications of GNSS-R make use of the specular scattering geometry since GNSS signals reflected from the Earth's surface have the greatest amplitude at specular scattering. The Delay-Doppler Map (DDM), which is one of the most important GNSS-R observables, demonstrates the power scattering from the reflected surface as a function of time delays and Doppler shifts ( Figure 1). The reflection over rough surfaces, such as open water (OW), is usually regarded as incoherent, which results in a "horseshoe" shape of DDM presented in Figure 1a. The specular scattering is considered coherent when the reflected surface is relatively smooth. The typical coherent DDM is shown in Figure 1b. The TDS-1 data have been successfully used to sense sea ice parameters over the past few years. Yan et al. [20] firstly explored the sensitivity of TDS-1 Delay-Doppler Map (DDM) to sea ice presence using the pixel number of DDM with a signal-to-noise ratio (SNR) above a threshold. Similarly, Zhu et al. [21] proposed a differential DDM observable, which was used to recognize ice-water, water-water, water-ice, and ice-ice transitions. Schiavulli et al. [22] reconstructed a radar image based on DDM to distinguish sea ice from water. Alonso-Arroyo et al. [15] applied a matched filter ice detection approach with a probability of detection of 98.5%, a probability of false alarm of 3.6%, and a probability of error of 2.5%. Afterward, another study has proven that spaceborne GNSS-R can be effective for sea ice discrimination with a success rate of up to 98.22% compared to collocated passive microwave sea ice data [23]. Cartwright et al. [24] combined two features extracted from DDMs to detect sea ice with an agreement of 98.4% and 96.6% in the Antarctic and Arctic regions by comparing with the European Space Agency Climate Change Initiative sea ice concentration product. On the other hand, TDS-1 data were expanded to altimetry applications [25,26]. The ice sheet melt was also investigated in [27] using TDS-1 data. Furthermore, it has been shown that spaceborne GNSS-R is also useful for retrieving sea ice thickness [28], sea ice concentration [29], and sea ice type [30]. The GNSS-R observables derived from DDMs were used to classify Arctic sea ice in [30], where the TDS-1 sea ice types were compared with the sea ice type maps derived from Synthetic Aperture Radar (SAR) measurements.
In recent years, machine learning (ML) based methods have been widely used in geosciences and remote sensing applications [31]. ML has been proven powerful for applications in various remote sensing fields, such as classification [32,33], object detection [34], and parameter estimates [35]. Yan et al. [36] firstly adopted the neural network (NN) method to detect sea ice using spaceborne GNSS-R DDMs from TDS-1. This study demonstrated the potential of an NN-based approach for sea ice detection and sea ice concentration (SIC) estimation, which was further explored through the convolutional neural network (CNN) algorithm [37]. The DDMs were directly used as input variables in the CNN-based approach. Compared to NN and CNN, support vector machine (SVM) achieved the best performance in sea ice detection [38]. These three studies utilized the original DDM and the values derived from DDMs as input features for sensing sea ice. Zhu et al. [39] employed feature sequences that depict the characteristics of DDMs as input parameters to monitor sea ice using the decision tree (DT) and random forest (RF) algorithms. The RF aided method can discriminate sea ice from water with a success rate up to 98.03% validated with the collocated sea ice edge maps from the special sensor microwave imager sounder (SSMIS) data provided by the Ocean and Sea Ice Satellite Application Facility (OSISAF). Llaveria et al. [40] applied the NN algorithm for sea ice concentration and sea ice extent sensing using GNSS-R data from the FFSCat mission [40]. Rodriguez-Alvarez et al. [30] initially exploited the implementation of the classification and regression tree (CART) algorithm for sea ice classification using GNSS-R observables derived from GNSS-R DDM. The results showed that the FYI and MYI can be classified with an accuracy of 70% and 82.34% respectively. In order to illustrate the ML for sea ice sensing based on GNSS-R, relevant information about the above-mentioned studies is presented in Table 1. As one of the most powerful ML algorithms, RF has been widely applied for remote sensing image classification [33,42,43]. However, RF has not been considered for classifying FYI and MYI using spaceborne GNSS-R data. In addition, the SVM-based method showed great potential in sea ice detection and classification in some previous studies [33,38]. Although SVM has been applied to sea ice classification using SAR images [33], the application of SVM to GNSS-R sea ice classification has not been investigated. Therefore, RF and SVM classifiers are adopted in this study to develop algorithms for sea ice classification. The purpose of this research is to demonstrate the feasibility of spaceborne GNSS-R to classify sea ice types using ML classifiers.
The spaceborne GNSS-R dataset and reference sea ice type data from the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) OSISAF [44] used in this study is firstly described in Section 2. Then, the theoretical basis and the proposed method for sea ice classification are described with details in Section 3. The sea ice classification results are presented in Section 4 and discussed in Section 5. The conclusions are finally addressed in Section 6.

TDS-1 Mission and Dataset
The TDS-1 satellite began its data acquisition in September 2014 after its launch in July 2014. As one of eight instruments placed on the TDS-1 satellite, the Space GNSS Receiver-Remote Sensing Instrument (SGR-ReSI) was turned on only two days of an eight-day cycle until January 2018 [45]. The SGR-ReSI started its full-time operation in February 2018, and it came to the end in December 2018. The TDS-1 has provided a large amount of data at a global scale as the satellite runs on a quasi-Sun synchronous orbit with an inclination of 98.4 • . Currently, the TDS-1 data are freely available on the Measurement of Earth Reflected Radio-navigation Signals by Satellite (MERRBYS) website (www.merrbys.co.uk, accessed on 11 September 2021). The TDS-1 data are processed into three levels, including Level 0 (L0), Level 1 (L1), and Level 2 (L2). Among them, L1 data are usually adopted in scientific research. L0 refers to the raw data, which are not accessible except for a small amount of sample data. L2 includes wind speed, mean square slope, and sea ice products. One of the most important GNSS-R observables is DDM, which is generated by the SGR-ReSI through the cross-correlation between scattered signals and locally generated code replicas with different delays and Doppler shifts. The orbit and instrument specifications of the TDS-1 mission are presented in Table 2 [45].

Reference Sea Ice Data
The sea ice type (SIT) provided by the OSISAF [44] is adopted as the reference data to train the sea ice classification model and validate the results. The OSISAF provides daily SIT maps with a spatial resolution of 10 km in the polar stereographic projection. The SIT products discriminate OW, FYI, and MYI through analyzing multi-sensor data. The data with a confidence level above 3 are adopted to avoid using low-quality data [44]. In addition, the SIC products generated from the SSMIS measurements provided by OSISAF are also used as reference data to analyze the impacts of SIC on sea ice classification. The reference SIC products are mapped on a grid with a size of 10 km × 10 km in the polar stereographic projection. The TDS-1 data is matched with the SIC maps through the SP location and data collection date, which are available in the TDS-1 dataset. Figure 2a presents the GNSS-R ground tracks over OW, FYI, and MYI on 26 November 2015. The TDS-1 DDMs over OW-FYI and FYI-MYI transitions are shown in the figure. Meanwhile, the GNSS-R ground tracks are mapped against the SIC map in Figure 2b. In general, the SIC of MYI is higher than that of FYI. replicas with different delays and Doppler shifts. The orbit and instrument specifications of the TDS-1 mission are presented in Table 2 [45].

Reference Sea Ice Data
The sea ice type (SIT) provided by the OSISAF [44] is adopted as the reference data to train the sea ice classification model and validate the results. The OSISAF provides daily SIT maps with a spatial resolution of 10 km in the polar stereographic projection. The SIT products discriminate OW, FYI, and MYI through analyzing multi-sensor data. The data with a confidence level above 3 are adopted to avoid using low-quality data [44]. In addition, the SIC products generated from the SSMIS measurements provided by OSISAF are also used as reference data to analyze the impacts of SIC on sea ice classification. The reference SIC products are mapped on a grid with a size of 10 km × 10 km in the polar stereographic projection. The TDS-1 data is matched with the SIC maps through the SP location and data collection date, which are available in the TDS-1 dataset. Figure 2a presents the GNSS-R ground tracks over OW, FYI, and MYI on 26 November 2015. The TDS-1 DDMs over OW-FYI and FYI-MYI transitions are shown in the figure. Meanwhile, the GNSS-R ground tracks are mapped against the SIC map in Figure 2b. In general, the SIC of MYI is higher than that of FYI.

Surface Reflectivity
The GNSS-R instrument (e.g., SGR-ReSI) receives signals directly transmitted from GNSS constellations and scattered from the Earth's surface. For the TDS-1 mission, the signals observed are the L1 C/A codes with a center frequency of 1.575 GHz. The L-band signal is useful for remote sensing applications due to its insensitivity to the precipitation and atmosphere. The specular reflections are dominant when the surface is relatively flat and smooth. As demonstrated in [15], the GNSS-R signal received over sea ice is usually Remote Sens. 2021, 13, 4577 6 of 27 coherent due to its smooth surface. Thus, the coherent surface reflectivity SR coh at the specular point (SP) can be modeled as [46,47], where P coh is the coherently received power reflected by the surface, R r and R t are the distances from the receiver and transmitter to the SP respectively, G r and G t are the antenna gain of the receiver and transmitter, λ is the GNSS signal wavelength, and P t is the power transmitted by GNSS satellites.
Most of the variables in (1) can be obtained from the TDS-1 L1b data. R r and R t can be easily calculated according to the positions of the transmitter, receiver, and SP, which are stored in the metadata. G r at SP can be directly extracted from the metadata. The noise floor is determined using the average value of the first four rows of DDM [23]. The transmitted signal power can be derived using [48], where P t G t is also termed as effective isotropic radiated power (EIRP). P d is the direct power, R d is the distance from the transmitter to receiver, and G d is the zenith antenna gain, which is set as 4 dB in this study according to the Merrbys documentation [45,49]. As demonstrated in [28], the received power originates from a region surrounding the SP, which is usually the first Fresnel zone. As demonstrated in [50], the Fresnel reflectivity SR coh can also be modeled as, where θ i stands for the incidence angle and σ is the surface root mean square (RMS) height. The second term in Equation (3) represents the surface roughness. The R represents the Fresnel reflection coefficient, which can be derived through Equations (4)-(6): where θ i is the incidence angle over the sea ice, and i is the permittivity of sea ice, which is related to sea ice types. In this study, the data with an incidence angle below 45 • are applied. Data marked by the quality flags eclipse or direct signal in DDM [51] have also been removed to deal with reliable data. The TDS-1 surface reflectivity and corresponding incidence angle distribution in Arctic regions in 5 days (from 11 to 15 February 2018) is presented in Figure 3.

Features Derived from DDM
Besides the surface reflectivity, the other seven GNSS-R observables are chosen for sea ice classification. GNSS-R observables are defined as characteristics derived from the DDMs that describe their power and shape.
The DDM average (DDMA), which represents the mean signal-to-noise ratio (SNR) values over the point of peak SNR (Figure 4a), is applied in this study to classify sea ice types. As shown in Figure 4a, the DDMA expands 3 delay bins and 3 Doppler shifts centered at the peak SNR point. The distribution of DDMA in 5 days (from 11 to 15 February 2018) is presented in Figure 4b compared with the OSISAF SIT map. As illustrated in [23,39], the integrated delay waveform (IDW) is defined as the summation of 20 delay waveforms, which are the cross-sections of DDM at 20 different Doppler shifts. The cross-section of zero Doppler shift is defined as the central delay waveform (CDW) of the DDM. The degree of difference between IDW and CDW is described by differential delay waveform (DDW), which is depicted as follows,

Features Derived from DDM
Besides the surface reflectivity, the other seven GNSS-R observables are chosen for sea ice classification. GNSS-R observables are defined as characteristics derived from the DDMs that describe their power and shape.
The DDM average (DDMA), which represents the mean signal-to-noise ratio (SNR) values over the point of peak SNR (Figure 4a), is applied in this study to classify sea ice types. As shown in Figure 4a, the DDMA expands 3 delay bins and 3 Doppler shifts centered at the peak SNR point. The distribution of DDMA in 5 days (from 11 to 15 February 2018) is presented in Figure 4b compared with the OSISAF SIT map.

Features Derived from DDM
Besides the surface reflectivity, the other seven GNSS-R observables are chosen for sea ice classification. GNSS-R observables are defined as characteristics derived from the DDMs that describe their power and shape.
The DDM average (DDMA), which represents the mean signal-to-noise ratio (SNR) values over the point of peak SNR (Figure 4a), is applied in this study to classify sea ice types. As shown in Figure 4a, the DDMA expands 3 delay bins and 3 Doppler shifts centered at the peak SNR point. The distribution of DDMA in 5 days (from 11 to 15 February 2018) is presented in Figure 4b compared with the OSISAF SIT map. As illustrated in [23,39], the integrated delay waveform (IDW) is defined as the summation of 20 delay waveforms, which are the cross-sections of DDM at 20 different Doppler shifts. The cross-section of zero Doppler shift is defined as the central delay waveform (CDW) of the DDM. The degree of difference between IDW and CDW is described by differential delay waveform (DDW), which is depicted as follows, As illustrated in [23,39], the integrated delay waveform (IDW) is defined as the summation of 20 delay waveforms, which are the cross-sections of DDM at 20 different Doppler shifts. The cross-section of zero Doppler shift is defined as the central delay waveform (CDW) of the DDM. The degree of difference between IDW and CDW is described by differential delay waveform (DDW), which is depicted as follows, where NIDW and NCDW represent the normalized IDW and CDW respectively. The NIDW is related to the power spreading characteristics caused by surface roughness. The delay waveforms from delay bins −5 to 20 over OW, FYI, and MYI surfaces are shown in Figure 5.
Remote Sens. 2021, 13, x FOR PEER REVIEW 8 of 28 where NIDW and NCDW represent the normalized IDW and CDW respectively. The NIDW is related to the power spreading characteristics caused by surface roughness. The delay waveforms from delay bins -5 to 20 over OW, FYI, and MYI surfaces are shown in Figure 5. Features extracted from delay waveforms are described as follows: • RESC (Right Edge Slope of CDW). The fitting slope of NCDW with 5 delay bins starting from the zero delay one is defined as RESC, which is depicted by the slope of the blue line in Figure 6a. • RESI (Right Edge Slope of IDW). The fitting slope of NIDW with 5 delay bins starting from the zero delay one is defined as RESI, which is depicted by the slope of the green line in Figure 6a. • RESD (Right Edge Slope of DDW). The fitting slope of DDW with 5 delay bins starting from the zero delay one is defined as RESD, which is depicted by the slope of the magenta line in Figure 6a.  Features extracted from delay waveforms are described as follows: • RESC (Right Edge Slope of CDW). The fitting slope of NCDW with 5 delay bins starting from the zero delay one is defined as RESC, which is depicted by the slope of the blue line in Figure 6a. • RESI (Right Edge Slope of IDW). The fitting slope of NIDW with 5 delay bins starting from the zero delay one is defined as RESI, which is depicted by the slope of the green line in Figure where NIDW and NCDW represent the normalized IDW and CDW respectively. The NIDW is related to the power spreading characteristics caused by surface roughness. The delay waveforms from delay bins -5 to 20 over OW, FYI, and MYI surfaces are shown in Figure 5. Features extracted from delay waveforms are described as follows: • RESC (Right Edge Slope of CDW). The fitting slope of NCDW with 5 delay bins starting from the zero delay one is defined as RESC, which is depicted by the slope of the blue line in Figure 6a. • RESI (Right Edge Slope of IDW). The fitting slope of NIDW with 5 delay bins starting from the zero delay one is defined as RESI, which is depicted by the slope of the green line in Figure 6a. • RESD (Right Edge Slope of DDW). The fitting slope of DDW with 5 delay bins starting from the zero delay one is defined as RESD, which is depicted by the slope of the magenta line in Figure 6a.  The distribution of these six features versus the OSISAF SIT maps is illustrated in Figure 7. The distribution of these six features versus the OSISAF SIT maps is illustrated in Figure 7.

Sea Ice Classification Method
Many previous studies indicate that RF and SVM show great potential in classifications [33,38,39,42], while they have not been used for classifying FYI and MYI using GNSS-R features. Thus, these two classifiers are used in this study to classify sea ice types.

Random Forest (RF)
The RF is one of the ensemble methods that have received increasing interest since they are more robust and accurate than single classifiers [52,53]. RF is composed of a set of classifiers where each classifier casts a vote for the allocation of the most frequent type to the input vectors. The principle of ensemble classifiers is based on the basic precondition that a variety of classifiers usually outperform an individual classifier. Some of the advantages of RF for remote sensing applications include (1) It has high efficiency on large datasets; (2) It can deal with a large number of input variables without variable deletion;

Sea Ice Classification Method
Many previous studies indicate that RF and SVM show great potential in classifications [33,38,39,42], while they have not been used for classifying FYI and MYI using GNSS-R features. Thus, these two classifiers are used in this study to classify sea ice types.

Random Forest (RF)
The RF is one of the ensemble methods that have received increasing interest since they are more robust and accurate than single classifiers [52,53]. RF is composed of a set of classifiers where each classifier casts a vote for the allocation of the most frequent type to the input vectors. The principle of ensemble classifiers is based on the basic precondition that a variety of classifiers usually outperform an individual classifier. Some of the advantages of RF for remote sensing applications include (1) It has high efficiency on large datasets; (2) It can deal with a large number of input variables without variable deletion; (3) The variable importance can be estimated in the classification; (4) Relatively strong robustness to noise and outliers.
The design of a decision tree needs to choose an attribute selection measure and a pruning approach. The Gini index [54] is usually used to measure the impurity of training samples in RF. For a given training dataset T, the Gini index (G index ) can be expressed as, where m is the number of categories and p i is the proportion of samples T that belongs to class i. In the case of binary classification problems, the most appropriate characteristics at each node can be identified by the Gini index, given by: The RF can also evaluate the relative importance of different features in the classification process. It is helpful to select the best features and know-how each feature influences the classification results [55,56].

Support Vector Machine (SVM)
SVM is a powerful machine learning algorithm based on statistical learning theory and has the purpose of determining the location of decision boundaries that produce the optimal separation of classes [57]. It is a supervised classification approach, which can deal with linear, nonlinear, high-dimensional samples and result in good generalization. SVM is primarily used for binary classification problems, in which the sample data can be expressed as, where x i represents the input samples that consist of features for classification, y i is the class label of x i For the linear classification, the SVM classifier satisfies the following rule, where w T x + b = 0 represents a hyperplane, with parameters w and b being the coefficient and bias, respectively. The maximum classification interval algorithm can be formulated as, In order to obtain good predictions in nonlinear classification, the slack variables are introduced to express the soft margin optimal problem, where ξ i represents the ith slack variable, C is a penalty parameter. φ is a high-dimensional feature projection function related to the kernel function [58].

Sea Ice Classification Based on RF and SVM
In this study, the sea surface is divided into three categories, including OW, FYI, and MYI. As a powerful ML classifier, RF can deal with both binary and multi-classification problems. The strategy of one-against-one and one-against-all can be chosen to address the multi-classification problem. The RF classifier can be split into several binary classifiers using a one-against-all strategy (Anand et al., 1995) [59]. Although the standard RF algorithm can deal with multi-type problems, the one-against-all binarization to the RF can achieve better accuracy with smaller forest sizes than the standard RF [60]. In addition, the SVM is usually used for binary classification. Thus, the one-against-all binarization strategy is used in this study to classify OW, FYI, and MYI in two steps. In the first stage, the FYI and MYI are regarded as one category (sea ice), whereas the OW is regarded as another category. Then, the classification of sea ice types (FYI and MYI) will be carried out in the second stage using a similar method. The python software package from the Scikit-Learn is adopted in this study [61]. The process flow of sea ice classification is shown in Figure 8.
the FYI and MYI are regarded as one category (sea ice), whereas the OW is regarded as another category. Then, the classification of sea ice types (FYI and MYI) will be carried out in the second stage using a similar method. The python software package from the Scikit-Learn is adopted in this study [61]. The process flow of sea ice classification is shown in Figure 8. The sea ice classification method can be generally categorized into four parts: • TDS-1 data preprocessing and features extraction. Firstly, the TDS-1 data with a latitude above 55°N and peak SNR above -3 dB is adopted to extract delay waveforms, which are further normalized to extract features. A total of eight features, namely SR, DDMA, RESC, RESI, RESD, REWC, REWI, and REWD, are extracted from the TDS-1 data. Then the TDS-1 features are matched with OSISAF SIT maps based on the data collection date and specular point position through a bilinear interpolation approach. • OW-sea ice classification. In this step, the FYI and MYI are regarded as one category (i.e., sea ice). 30% of samples are randomly selected as training set and the rest 70% of samples are used to test the OW-sea ice classification results. • FYI-MYI classification. The sea ice datasets are applied in this step to classify FYI and MYI. As with the process of OW-sea ice classification, sea ice samples are randomly selected as training and test sets to classify FYI and MYI.

•
Performance evaluation of sea ice classification. The classification performance is firstly evaluated using the confusion matrix and some evaluation metrics, which are defined in Figure 9 and Table 3, respectively. The sea ice classification method can be generally categorized into four parts: • TDS-1 data preprocessing and features extraction. Firstly, the TDS-1 data with a latitude above 55 • N and peak SNR above −3 dB is adopted to extract delay waveforms, which are further normalized to extract features. A total of eight features, namely SR, DDMA, RESC, RESI, RESD, REWC, REWI, and REWD, are extracted from the TDS-1 data. Then the TDS-1 features are matched with OSISAF SIT maps based on the data collection date and specular point position through a bilinear interpolation approach. • OW-sea ice classification. In this step, the FYI and MYI are regarded as one category (i.e., sea ice). 30% of samples are randomly selected as training set and the rest 70% of samples are used to test the OW-sea ice classification results. • FYI-MYI classification. The sea ice datasets are applied in this step to classify FYI and MYI. As with the process of OW-sea ice classification, sea ice samples are randomly selected as training and test sets to classify FYI and MYI. • Performance evaluation of sea ice classification. The classification performance is firstly evaluated using the confusion matrix and some evaluation metrics, which are defined in Figure 9 and Table 3, respectively.
The row and column represent the actual and predicted types, respectively. As shown in Table 3, TP stands for the number of positive samples that are classified correctly. FN represents the number of positive samples that are classified into a negative type incorrectly. FP is the number of negative samples that are classified into a positive type incorrectly. TN is the number of negative samples that are classified correctly. The evaluation metrics that are used to evaluate the classification performance are defined in Table 3. Accuracy represents the ratio of the correct classification number in all the samples. Precision depicts the precision of predicted results that positive samples are classified correctly. Recall means the completeness of the samples that are classified correctly in all positive samples. F-value is the harmonic mean of classification performance for positive samples, where β is usually set as 1. Then the F-value is the F1 score, which is the weighted average of Precision and Recall. Therefore, this score takes both false positives and false negatives into account. G-mean can be used to evaluate the overall classification performance. G-mean measures the balance between classification performances on both the majority and minority classes. In addition, the kappa coefficient is also used to measure inter-rater reliability for classifiers. The kappa coefficient measures the agreement between two raters who each classify N items into C mutually exclusive categories. The row and column represent the actual and predicted types, respectively. As shown in Table 3, TP stands for the number of positive samples that are classified correctly. FN represents the number of positive samples that are classified into a negative type incorrectly. FP is the number of negative samples that are classified into a positive type incorrectly. TN is the number of negative samples that are classified correctly. The evaluation metrics that are used to evaluate the classification performance are defined in Table 3. Accuracy represents the ratio of the correct classification number in all the samples. Precision depicts the precision of predicted results that positive samples are classified correctly. Recall means the completeness of the samples that are classified correctly in all positive samples. F-value is the harmonic mean of classification performance for positive samples, where is usually set as 1. Then the F-value is the F1 score, which is the weighted average of Precision and Recall. Therefore, this score takes both false positives and false negatives into account. G-mean can be used to evaluate the overall classification performance. G-mean measures the balance between classification performances on both the majority and minority classes. In addition, the kappa coefficient is also used to measure inter-rater reliability for classifiers. The kappa coefficient measures the agreement between two raters who each classify N items into C mutually exclusive categories.

Evaluation Metrics Equation
Accuracy Figure 9. The definition of the confusion matrix. Table 3. The definition of evaluation metrics.

Results
Although the OSISAF SIT products are provided continuously, there is a lack of MYI information from May to October most of the time in the OSISAF SIT products. In addition, the TDS-1 data is very sparse from 2015 to 2017 as the GNSS-R receiver was working only two days in an eight-day cycle. Therefore, the TDS-1 datasets collected in 2018 are selected according to the availability of MYI information of OSISAF SIT products. Figure 10a presents the data availability status of both OSISAF SIT products and TDS-1 data that are used in this study. The number of OW, FYI, and MYI samples every month are shown in Figure 10b. According to the statistics, a total number of 460,685 samples, including 165,434 OW samples, 266,680 FYI samples, and 28,571 MYI samples, are used in this study. tion, the TDS-1 data is very sparse from 2015 to 2017 as the GNSS-R receiver was working only two days in an eight-day cycle. Therefore, the TDS-1 datasets collected in 2018 are selected according to the availability of MYI information of OSISAF SIT products. Figure  10a presents the data availability status of both OSISAF SIT products and TDS-1 data that are used in this study. The number of OW, FYI, and MYI samples every month are shown in Figure 10b. According to the statistics, a total number of 460,685 samples, including 165,434 OW samples, 266,680 FYI samples, and 28,571 MYI samples, are used in this study .   1  2  3  4  5  6  7  8  9  10 11 12 13 14  16  15   1  2  3  4  5  6  7  8  9  10 11 12 13 14  16  15  17 18 19 20 21 22 23 24 25 26 27 28 29 30   3  4  5  6  7  8  9  10 11 12  18 19 20 21  23 24  27 28  Feb. Mar. Apr. Oct.

2018
Day of Month As illustrated in Section 3, the sea ice classification is implemented using a two-step method. The first step aims to discriminate OW from sea ice using the RF and SVM classifiers. During the RF model training, the number of trees is set to different values ranging from 10 to 200, in order to find the best classifier. The number of estimators is finally set as 70 in this study. After identifying the OW, the sea ice samples are employed to classify FYI and MYI in the second step using the RF classifier. It is worth noting that the number of FYI samples is about nine times that of MYI samples. When randomly selecting samples from the whole FYI and MYI dataset as the training set, only a small amount of MYI samples are selected out. The proportion gap between FYI and MYI samples in the training set is too large and the training model may not grasp the features of MYI. Therefore, 30% of MYI samples are randomly selected as training set and the number of FYI samples is controlled to be three times that of MYI.
The confusion matrix of RF and SVM classifiers are presented in Figure 11, which presents the classification results of each class. Table 4 presents the evaluation metrics which are computed according to equations listed in Table 3. Table 4 demonstrates the evaluation metrics through validating with the OSISAF SIT maps using the test dataset. As shown in Figure 11a,b, a total of 322,480 samples, which include 109,559 OW and 212,921 sea ice samples, are applied for testing. The overall As illustrated in Section 3, the sea ice classification is implemented using a two-step method. The first step aims to discriminate OW from sea ice using the RF and SVM classifiers. During the RF model training, the number of trees is set to different values ranging from 10 to 200, in order to find the best classifier. The number of estimators is finally set as 70 in this study. After identifying the OW, the sea ice samples are employed to classify FYI and MYI in the second step using the RF classifier. It is worth noting that the number of FYI samples is about nine times that of MYI samples. When randomly selecting samples from the whole FYI and MYI dataset as the training set, only a small amount of MYI samples are selected out. The proportion gap between FYI and MYI samples in the training set is too large and the training model may not grasp the features of MYI. Therefore, 30% of MYI samples are randomly selected as training set and the number of FYI samples is controlled to be three times that of MYI.
The confusion matrix of RF and SVM classifiers are presented in Figure 11, which presents the classification results of each class. Table 4 presents the evaluation metrics which are computed according to equations listed in Table 3.  [15,38,40]. Similarly, the FYI-MYI classification results are evaluated using 20,000 FYI samples and 240,967 MYI samples (Figure 11c,d), which results in an overall accuracy of 84.82% for RF, whereas 71.71% for SVM. The performance of RF and SVM for OW-sea ice classification is comparable, whereas RF outperforms SVM significantly in the FYI-MYI classification. The overall space distribution of classification results is shown in Figure 12, which depicts the distribution of predicted and reference types. For illustration purposes, parts of classifications are shown in Figure 13, which demonstrates the predicted, reference, and their comparison results. Firstly, sea ice and OW are labeled as -1 and 1, respectively, in the OW-sea ice classification, whereas FYI and MYI are labeled as -1 and 1, respectively, in the FYI-MYI classification. The comparison results between predicted and reference types are computed as follows, where Ref represents the value of reference types, Pre is the value of predicted types, Diff is the comparison results. In the OW-sea ice classification, -2 represents the sea ice is misclassified as OW (ice-OW), 2 represents the OW is misclassified as sea ice (OW-ice), and 0 means the predicted and reference type is consistent. In the FYI-MYI classification, -2 represents the FYI is misidentified as MYI (FYI-MYI), 2 represents the MYI is misidentified as FYI (MYI-FYI), and 0 means the predicted and reference type is consistent. As shown in Figure 12b, c, most OW and sea ice misclassifications of RF and SVM classifiers appear similarly at the points of OW-sea ice transitions. This can also be seen in Figure 13a,b. The confusion matrix in Figure 11c,d presents that the number of FYI misclassifications is much larger than that of MYI misclassifications. This may be due to the large gap in the number of FYI and MYI samples. The overall accuracy is affected less by the MYI misclassifications since the number of FYI samples is more than 10 times that of MYI samples. As shown in Figure 12e,f, both RF and SVM classifiers have lots of misclassifications, which are widely distributed. In general, the number of misclassifications of SVM is more than that of RF, which can also be seen in Figure 13c,d. In addition, the kappa coefficient of OW-sea ice classification is 0.97 both for RF and SVM, whereas that of FYI-MYI is only 0.39 and 0.23 respectively for RF and SVM. In addition, although SVM can achieve comparable accuracy as RF in OW-sea ice classification, the time consumption of SVM is about nine times that of RF for the same test dataset.    Table 4 demonstrates the evaluation metrics through validating with the OSISAF SIT maps using the test dataset. As shown in Figure 11a,b, a total of 322,480 samples, which include 109,559 OW and 212,921 sea ice samples, are applied for testing. The overall accuracy of RF and SVM classifiers for OW-sea ice classification is 98.83% and 98.60% respectively, which are comparable to the results in previous studies [15,38,40]. Similarly, the FYI-MYI classification results are evaluated using 20,000 FYI samples and 240,967 MYI samples (Figure 11c,d), which results in an overall accuracy of 84.82% for RF, whereas 71.71% for SVM. The performance of RF and SVM for OW-sea ice classification is comparable, whereas RF outperforms SVM significantly in the FYI-MYI classification. The overall space distribution of classification results is shown in Figure 12, which depicts the distribution of predicted and reference types. For illustration purposes, parts of classifications are shown in Figure 13, which demonstrates the predicted, reference, and their comparison results. Firstly, sea ice and OW are labeled as −1 and 1, respectively, in the OW-sea ice classification, whereas FYI and MYI are labeled as −1 and 1, respectively, in the FYI-MYI classification. The comparison results between predicted and reference types are computed as follows, where Ref represents the value of reference types, Pre is the value of predicted types, Diff is the comparison results. In the OW-sea ice classification, −2 represents the sea ice is misclassified as OW (ice-OW), 2 represents the OW is misclassified as sea ice (OW-ice), and 0 means the predicted and reference type is consistent. In the FYI-MYI classification, −2 represents the FYI is misidentified as MYI (FYI-MYI), 2 represents the MYI is misidentified as FYI (MYI-FYI), and 0 means the predicted and reference type is consistent. As shown in Figure 12b, c, most OW and sea ice misclassifications of RF and SVM classifiers appear similarly at the points of OW-sea ice transitions. This can also be seen in Figure 13a,b. The confusion matrix in Figure 11c,d presents that the number of FYI misclassifications is much larger than that of MYI misclassifications. This may be due to the large gap in the number of FYI and MYI samples. The overall accuracy is affected less by the MYI misclassifications since the number of FYI samples is more than 10 times that of MYI samples. As shown in Figure 12e,f, both RF and SVM classifiers have lots of misclassifications, which are widely distributed. In general, the number of misclassifications of SVM is more than that of RF, which can also be seen in Figure 13c,d. In addition, the kappa coefficient of OW-sea ice classification is 0.97 both for RF and SVM, whereas that of FYI-MYI is only 0.39 and 0.23 respectively for RF and SVM. In addition, although SVM can achieve comparable accuracy as RF in OW-sea ice classification, the time consumption of SVM is about nine times that of RF for the same test dataset.    In each figure of (a) to (d), the top plot represents the predicted types of the classifier, the middle one is the reference types from OSISAF SIT and the lowest one is the comparison results between predicted and reference types. OW-ice means the OW is misidentified as sea ice, and ice-OW means the sea ice is misclassified as OW. FYI-MYI means the FYI is misidentified as MYI, and MYI-FYI means the MYI is misclassified as FYI.

Discussion
In order to analyze the influence of sea ice growth and melt on sea ice classification and the robustness of the sea ice classification method, another experiment is implemented. Each month of data is used as a training set individually, and the remaining four months of data are used as a test set. The strategy is similar to k-fold cross-validation [62] usually applied in ML. However, the k-fold cross-validation is to split the dataset into k parts randomly, which would eliminate the seasonal effects. Firstly, the data in February are used as the training set, and the data in March, April, November, and December are used as the testing set in turn. Hereafter, the data in Mar, April, November, and December are used as training set successively. The space distribution of classification results is presented in Appendixes A ( Figures A1 to Figures A10) and B (Figures A11 to Figures A20). The accuracy and kappa coefficients of OW-sea ice and FYI-MYI classification are shown in Figures 14 and 15, which indicate that the overall accuracy of OW-sea ice changes less In each figure of (a) to (d), the top plot represents the predicted types of the classifier, the middle one is the reference types from OSISAF SIT and the lowest one is the comparison results between predicted and reference types. OW-ice means the OW is misidentified as sea ice, and ice-OW means the sea ice is misclassified as OW. FYI-MYI means the FYI is misidentified as MYI, and MYI-FYI means the MYI is misclassified as FYI.

Discussion
In order to analyze the influence of sea ice growth and melt on sea ice classification and the robustness of the sea ice classification method, another experiment is implemented. Each month of data is used as a training set individually, and the remaining four months of data are used as a test set. The strategy is similar to k-fold cross-validation [62] usually applied in ML. However, the k-fold cross-validation is to split the dataset into k parts randomly, which would eliminate the seasonal effects. Firstly, the data in February are used as the training set, and the data in March, April, November, and December are used as the testing set in turn. Hereafter, the data in Mar, April, November, and December are used as training set successively. The space distribution of classification results is presented in Appendix A (Figures A1-A10) and Appendix B (Figures A11-A20). The accuracy and kappa coefficients of OW-sea ice and FYI-MYI classification are shown in Figures 14 and 15, which indicate that the overall accuracy of OW-sea ice changes less than that of FYI-MYI classification. This may be caused by the change of samples distribution as the proportion of OW and sea ice is relatively stable, while that of FYI and MYI changes more easily with ice growth and melt. ens. 2021, 13, x FOR PEER REVIEW 18 of 28 the ice growth from November to December, the successful classification rate increases relatively obviously, which is consistent with the sea ice extent trend shown in Figure 16.   As shown in Figure 14, the accuracy of OW-sea ice classification changes slightly from February to April and then drops from April to November, followed by a larger increase from November to December. The accuracy in November reaches the lowest point among all tested months. Figure 15 demonstrates that the general changing trend of accuracy for FYI-MYI classification is similar to that for OW-sea ice classification. This can be explained in combination with the sea ice extent trend in 2018, shown in Figure 16. As shown in Figure 16a, the sea ice extent reached the maximum and minimum on 14 March 2018 and 21 September 2018, respectively. The sea ice extent increases firstly to the peak sea ice extent and then decreases slightly from March to April, while it increases continuously from November to December. Moreover, among five months of use, the ice extent in November is the smallest (Figure 16b). The increase and decrease of sea ice extent can be regarded as ice growth and melt process, respectively. The ice extent during February to April changes less than that during November to December since November is a period of the early stage of winter. The sea ice extent during February to April is relatively stable. From February to March, the FYI grows and the presence of ocean water in FYI becomes less, which results in less misclassification for FYI. During the melting process from March to April, the surface of FYI can be more easily affected by ocean winds due to the decrease of sea ice concentration of FYI. This may lead to more misclassification of FYI to MYI. The overall accuracy for OW-sea ice and FYI-MYI classification in November is the smallest. The explanation is that the sea ice extent is relatively low in November when newly added sea ice is mostly surrounded by ocean water, which leads to more misclassifications. With the ice growth from November to December, the successful classification rate increases relatively obviously, which is consistent with the sea ice extent trend shown in Figure 16. Figure 14. The (a) accuracy and (b) kappa coefficient of OW-sea ice classification using data of each month as the training set in turn. The top and middle plots in (a) represent the accuracy of RF and SVM respectively, while the lowest plot is the accuracy of RF minus that of SVM. The top and middle plots in (b) represent the kappa coefficient of RF and SVM respectively, while the lowest plot is the kappa coefficient of RF minus that of SVM.

Conclusion
This study investigates the RF and SVM-based classifiers for Arctic Sea ice classification using the one-against-all binarization, which converts a multi-classification into several binary classification problems. Thus, the classification is implemented in a two-step way. The first step aims to discriminate OW from sea ice (FYI and MYI), which is further classified in the second step. The selected data periods include February to April and November to December in 2018, during which the information of different surface types (OW, FYI, and MYI) is available and can be used for comparison. Through validating against OSISAF SIT maps, the overall accuracy of the RF and SVM for OW-sea ice classification is up to 98.83% and 98.60%, which is comparable to results from some previous studies using TDS-1 data [15,30,[38][39][40]. Hereafter, the FYI-MYI classifier is modeled in the second using the samples of sea ice that include FYI and MYI. The overall accuracy of RF

Conclusion
This study investigates the RF and SVM-based classifiers for Arctic Sea ice classification using the one-against-all binarization, which converts a multi-classification into several binary classification problems. Thus, the classification is implemented in a two-step way. The first step aims to discriminate OW from sea ice (FYI and MYI), which is further classified in the second step. The selected data periods include February to April and November to December in 2018, during which the information of different surface types (OW, FYI, and MYI) is available and can be used for comparison. Through validating against OSISAF SIT maps, the overall accuracy of the RF and SVM for OW-sea ice classification is up to 98.83% and 98.60%, which is comparable to results from some previous studies using TDS-1 data [15,30,[38][39][40]. Hereafter, the FYI-MYI classifier is modeled in the second using the samples of sea ice that include FYI and MYI. The overall accuracy of RF and SVM for classifying FYI and MYI is 84.82% and 71.71%, respectively, which are lower than those of the OW-sea ice classifier. Moreover, the influence of ice growth and melt for sea ice classification is evaluated through a cross-validating strategy, which applies each month of data as the training set and the remaining four months of data as the test set. To the best of the author's knowledge, the RF and SVM are firstly used for classifying FYI and MYI using the spaceborne GNSS-R data. The results indicate that RF and SVM-based GNSS-R have great potential in sea ice classification. This study demonstrates that the great potential of GNSS-R for classifying sea ice types, which can be an effective and complementary approach for remotely sensing sea ice. In future studies, more GNSS-R features, ML algorithms, and environmental effects (such as ocean wind) should be investigated to improve the accuracy of classifying FYI and MYI. Acknowledgments: The authors would like to thank the TechDemoSat-1 team at Surrey Satellite Technology Ltd. (SSTL) for providing the spaceborne GNSS-R data. Our gratitude also to Ocean and Sea Ice Satellite Application Facility for the sea ice edge product used in comparisons.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The

Appendix A
Appendix A presents the results of RF and SVM for OW-sea ice classification using each month of data as the training set in turn. The RF classifier results are shown in Figures A1-A5, and the SVM classifier results are presented in Figures A6-A10. In each figure title, the RF and SVM represent method, then followed by the test data of month, such Feb, Mar, Apr, Nov, Dec, and OWseaice represents OW-sea ice classification.

Appendix B
Appendix B presents the results of RF and SVM for FYI-MYI classification using each month of data as the training set in turn. The RF classifier results are shown in Figures  A11-A15, and the SVM classifier results are presented in Figures A16-A20. In each figure title, the RF and SVM represent method, then followed by the test data of month, such Feb, Mar, Apr, Nov, Dec, and FYIMYI represents FYI-MYI classification.

Appendix B
Appendix B presents the results of RF and SVM for FYI-MYI classification using each month of data as the training set in turn. The RF classifier results are shown in Figures  A11-A15, and the SVM classifier results are presented in Figures A16-A20. In each figure title, the RF and SVM represent method, then followed by the test data of month, such Feb, Mar, Apr, Nov, Dec, and FYIMYI represents FYI-MYI classification.

Appendix B
Appendix B presents the results of RF and SVM for FYI-MYI classification using each month of data as the training set in turn. The RF classifier results are shown in Figures 13, A11, A12, A14 and A15, and the SVM classifier results are presented in Figures A16-A20. In each figure title, the RF and SVM represent method, then followed by the test data of month, such February, March, April, November, December, and FYIMYI represents FYI-MYI classification. Figure A10. SVM for OW-sea ice classification using the data in December as the training set. The data in (a) February, (b) March, (c) April, and (d) November are used as the test set in turn.

Appendix B
Appendix B presents the results of RF and SVM for FYI-MYI classification using each month of data as the training set in turn. The RF classifier results are shown in Figures  A11-A15, and the SVM classifier results are presented in Figures A16-A20. In each figure title, the RF and SVM represent method, then followed by the test data of month, such Feb, Mar, Apr, Nov, Dec, and FYIMYI represents FYI-MYI classification.