Integration of Sentinel-1 and Sentinel-2 Data with the G-SMOTE Technique for Boosting Land Cover Classification Accuracy

Abstract: The importance of Land Cover (LC) classification is recognized by an increasing number of scholars who employ LC information in various applications (e.g., addressing global climate change and achieving sustainable development). However, the roles of data balancing, image integration, and the performance of different machine learning algorithms in various landscapes have received less attention. Therefore, the present study investigates the performance of three frequently used Machine Learning (ML) algorithms, namely Extreme Learning Machines (ELM), Support Vector Machines (SVM), and Random Forest (RF), in LC mapping of six different landscapes. The Geometric Synthetic Minority Over-sampling Technique (G-SMOTE) was adopted to deal with the class imbalance problem. Time series of Sentinel-1 and Sentinel-2 data were integrated to improve LC mapping accuracy by taking advantage of both data sources, and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) was implemented to identify the most informative features. RF integrated with G-SMOTE showed the best result for four landscapes (coastal, cropland, desert, and semi-arid), while SVM integrated with G-SMOTE had the highest accuracy in the remaining two landscapes (plain and mountain). The applied ML algorithms performed well across landscapes, with Overall Accuracy (OA) ranging from 85% to 93% for RF, 83% to 94% for SVM, and 84% to 92% for ELM. The outcomes show that although applying G-SMOTE may slightly decrease OA values, it generally improves LC classification accuracy in various landscapes, particularly for minority classes.


Introduction
Land cover (LC) data are of great importance for different disciplines, such as biodiversity patterns [1], natural hazard studies (e.g., landslides [2] and wildfire [3]), and CO₂ emissions [4]. Additionally, there is a considerable need for current and precise information on LC and its changes for sustainable development and global warming studies [5]. The importance of these issues and the progress of Remote Sensing (RS) technologies toward providing data with better temporal and spatial resolutions have motivated scholars to study LC mapping widely. Despite the tremendous effort devoted to LC mapping, the roles of data balancing, image integration, and the performance of different machine learning algorithms in various landscapes have not yet received much attention.
The advent of Sentinel-1 and Sentinel-2, which provide freely accessible images with high spatial resolution and global coverage, brings excellent opportunities for LC mapping. As a result, many studies have been conducted using these images. For example, Abdi [6] integrated these images for LC mapping in complex boreal landscapes. In another study, these images were integrated for LC mapping in Colombia [7]. Integrating radar and optical RS data can deliver complementary information that improves LC mapping accuracy by taking advantage of both data sources [8]. More precisely, the geometrical characteristics of the classes are mainly captured by the C-band data of Sentinel-1, whereas the Sentinel-2 Multi-Spectral Instrument (MSI) is sensitive to the manifest content of the LC classes [9]. It has been reported that incorporating time series of these images can lead to more accurate and reliable LC maps than using them individually [10]. However, integrating these datasets in different landscapes for LC mapping has not yet been well documented.
To boost LC mapping accuracy, adding supplementary information (e.g., textural information and spectral indices) to the classification procedure has been endorsed as an efficient and practical approach [11,12]. For example, the impact of complementary information (e.g., topographic data and spectral indices) has been investigated for LC mapping in mountainous areas [12]. Texture information provides a continuous measure of the distribution of digital numbers of a satellite image within predefined local windows [13]. Using texture information together with spectral bands can create high separation capability among different LC types, particularly in heterogeneous landscapes [13,14]. Moreover, it has been reported that spectral indices can also improve LC mapping accuracy [15]. Since using a large set of features has disadvantages, such as being time-consuming and computationally complex [16], selecting the most critical features with an appropriate feature selection method can lead to a more efficient and reliable LC classification procedure [17]. In this regard, the RS community has widely employed Feature Selection (FS) methods to select the most appropriate features from a pool of available features. Among the different FS methods, Support Vector Machine-Recursive Feature Elimination (SVM-RFE) is a powerful method that has been successfully applied in different RS studies to eliminate redundant features [16].
It is generally accepted that Machine Learning (ML) algorithms can effectively improve LC classification accuracy. Although standard ML algorithms can, in most cases, obtain reasonable accuracies for majority classes, they usually show poor accuracies for rare (minority) classes, mainly owing to the class imbalance problem [18]. Since LC classes differ in distribution and extent, obtaining equal numbers of samples for all LC classes is very difficult [19], which leads to the class imbalance problem and unacceptable accuracies for minority classes. To address this issue, several data balancing methods have been presented. However, the proposed methodologies have primarily been examined in specific landscapes, and their performance in different landscapes has not been investigated. For example, Naboureh et al. [20] proposed a hybrid data balancing method for mountainous regions. In another study, Waldner et al. [21] investigated the impact of different data balancing techniques for mapping crops. The recently proposed Geometric Synthetic Minority Over-sampling Technique (G-SMOTE) by Douzas and Bacao (2019) [22] has been introduced as a robust method to address the class imbalance problem. However, there is still a lack of research that thoroughly assesses the performance of G-SMOTE in different landscapes with different ML algorithms.
Given the importance of the issues mentioned above, the present study investigates the performance of G-SMOTE in handling the class imbalance problem in LC classification of six different landscapes using three frequently used ML algorithms, namely RF, SVM, and ELM. Furthermore, the SVM-RFE method was applied for each landscape to obtain the optimal feature subset from radar and optical bands, spectral indices, and texture information, and the selected features were used as classification inputs. Specifically, this study addresses the following questions: (1) What are the most informative features from Sentinel-1, Sentinel-2, spectral indices, and textural information for LC mapping using three well-known ML algorithms in different landscapes? (2) What is the performance of the G-SMOTE algorithm in LC classification in different circumstances? (3) Which ML classifier has higher accuracy in LC mapping of diverse landscapes?

Materials

Overview of the Experiment Sites
In this study, six different landscapes (Figure 1) with different numbers of samples, LC types, elevation ranges, climate conditions, and areas were selected to assess the roles of G-SMOTE, integration of radar and optical data, and different ML algorithms in improving LC classification accuracy. Site-1, a coastal landscape, covers an area of about 2266 km² located in Istanbul province, Turkey. Forest is the dominant LC type in this study area. Site-2, a plain landscape, covers an area of about 2509 km² located in East Azerbaijan province, Iran. Bare land classes mainly cover this study area. Site-3, a semi-arid landscape, covers an area of about 1309 km² located in Tehran province, Iran. Cropland is the dominant class in this study area. Site-4, a desert landscape, covers an area of about 1966 km² located in South Turkmenistan. Bare land and artificial surface cover most parts of this study area. Site-5, a cropland landscape, covers an area of about 2966 km² located in Western Uzbekistan. The LC of Site-5 is largely cropland. Site-6, a mountainous landscape, covers an area of about 3255 km² located in Urumqi province, China. Forest and snow classes mainly cover this study area.

Image and Reference Data
In the present study, time series of Sentinel-1 C-band products with 10 m spatial resolution and Sentinel-2A MSI Level-2A products (Image Collection ID: COPERNICUS/S2_SR) with less than 15% cloud coverage between January 2019 and January 2020 were utilized (Table 1). Several preprocessing steps (orbit file correction, radiometric calibration, terrain correction, and speckle noise reduction) were initially applied to the SAR data. Then, the multi-looked vertical-vertical (VV) and vertical-horizontal (VH) polarizations were obtained from the Sentinel-1A images. Moreover, ten spectral bands of Sentinel-2 (Band 2 to Band 8A, Band 11, and Band 12) were utilized in this study. Using the nearest neighbor method, Bands 5, 6, 7, 8A, 11, and 12 were resampled from 20 m to 10 m pixel size to match the resolution of the other bands.
A reliable training dataset of sufficient quantity and accuracy is needed for training any supervised ML classifier [3]. In this study, very-high-resolution images available in Google Earth™ and ArcGIS, together with raw Sentinel-2 data, were visually interpreted to produce the reference dataset. Based on the extent of LC types, 840, 759, 846, 818, 703, and 1077 samples were produced for Site-1, Site-2, Site-3, Site-4, Site-5, and Site-6, respectively (Table 2). The obtained datasets were then divided into two parts: one was used to train the ML classifiers (training dataset), and the other was used to assess the accuracy of the LC maps (validation dataset). More precisely, 60% of the original GCPs were used for training, and the remaining 40% were used for accuracy assessment.

Methodology
The present research methodology comprises six main steps: (1) obtaining and preprocessing Sentinel-1 and Sentinel-2 images for each landscape; (2) calculating spectral indices and texture information; (3) adopting the SVM-RFE method to find the most valuable features (among the spectral bands, spectral indices, and texture information); (4) implementing the G-SMOTE method to rebalance the acquired reference datasets; (5) employing ML methods to generate LC maps with the features selected in step 3; and (6) analyzing the results and recommending the most helpful method and features for every landscape.

Spectral and Textural Features
Spectral and textural features are two main feature types widely applied in image interpretation and classification [15,23]. Three frequently used spectral indices were employed to improve classification accuracy: the Normalized Difference Built-up Index (NDBI), the Normalized Difference Vegetation Index (NDVI), and the Normalized Difference Water Index (NDWI) (Table 3). In addition, the second-order statistics of the grey-level co-occurrence matrix (GLCM) obtained from the VV band of Sentinel-1 SAR data were used to improve the LC accuracy. It should be noted that we selected the VV band for GLCM calculation after some preliminary analyses and based on its high-value distribution. The texture measures of mean, contrast, variance, dissimilarity, homogeneity, second moment, energy, and entropy were derived from the VV band by applying a 3 × 3 window.
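As an illustration of how the three normalized difference indices can be computed, the following minimal sketch uses NumPy. The band assignment (green = B3, red = B4, NIR = B8, SWIR = B11) and the green/NIR form of NDWI are assumptions made here for illustration; the formulas actually used in this study are those listed in Table 3, and the function names are hypothetical.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b), with 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def spectral_indices(green, red, nir, swir1):
    """Compute NDVI, NDBI, and NDWI from reflectance arrays.

    Assumed Sentinel-2 bands: green = B3, red = B4, nir = B8, swir1 = B11.
    """
    return {
        "NDVI": normalized_difference(nir, red),
        "NDBI": normalized_difference(swir1, nir),
        # Green/NIR form of NDWI (an assumption; other definitions exist).
        "NDWI": normalized_difference(green, nir),
    }


# Hypothetical single-pixel reflectances, for illustration only.
example = spectral_indices(
    green=np.array([0.1]), red=np.array([0.1]),
    nir=np.array([0.3]), swir1=np.array([0.2]),
)
```

All three indices share the same normalized difference form, so a single helper that guards against a zero denominator is sufficient.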

Feature Selection
In this study, the SVM-RFE method was employed to select the most critical features for LC classification. SVM-RFE seeks to rank the original features for subsequent analyses [24]. At each iteration, an SVM classifier is trained on the features that remain, the weights of all features are calculated by analyzing the change in the cost function, and the feature with the lowest rank is eliminated [16]. This process continues until no features remain to be removed, which yields the feature-ranking list. Using the caret and e1071 packages in the R environment, the SVM-RFE method was applied to identify the best possible combination of features for each of the six sites.
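The elimination loop described above can be sketched as follows. This is not the caret/e1071 workflow used in the study but a stand-in based on scikit-learn's `RFE` with a linear-kernel `SVC`; the synthetic data, the number of features, and the value of `C` are hypothetical and serve only to show how a ranking is obtained.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: 300 samples, 8 features; only features 0 and 1
# carry class information.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Scale the features so that the SVM weights are comparable across features.
X_scaled = StandardScaler().fit_transform(X)

# Remove one feature per iteration until a single feature remains.
# A linear kernel is needed so that feature weights (coef_) are available.
selector = RFE(SVC(kernel="linear", C=1.0), n_features_to_select=1, step=1)
selector.fit(X_scaled, y)

# ranking_ == 1 marks the feature eliminated last (most informative).
ranking = selector.ranking_
order = np.argsort(ranking)  # feature indices from most to least informative
```

In practice, the ranked list would be cut at the subset size that gives the best cross-validated accuracy, which is the role the resampling step plays in the feature selection described above.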

G-SMOTE
Imbalanced data can arise in a deliberate sampling design, and it is also likely to arise in random sampling. As an extension of the SMOTE method, G-SMOTE forms combinations of neighboring samples and creates new instances of the minority classes instead of replicating the available instances. Unlike SMOTE, which randomly generates synthetic samples along the line connecting a minority instance to one of its k nearest neighbors, G-SMOTE specifies a flexible geometric area around each rare sample in which synthetic samples are generated (Figure 2). Generally speaking, G-SMOTE modifies the SMOTE algorithm to avoid generating noisy samples [22,25].
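The difference between the two generation mechanisms can be sketched with NumPy as follows. This is a deliberately simplified illustration, not the full algorithm of [22]: the geometric area is taken to be a plain hypersphere around the selected minority sample, and the truncation, deformation, and selection-strategy options of G-SMOTE are omitted. All function names are hypothetical.

```python
import numpy as np


def smote_point(center, neighbor, rng):
    """SMOTE: random point on the segment from center to neighbor."""
    return center + rng.uniform(0.0, 1.0) * (neighbor - center)


def gsmote_point(center, neighbor, rng):
    """Simplified G-SMOTE: random point inside the hypersphere centered on
    `center` whose radius is the distance to `neighbor`."""
    radius = np.linalg.norm(neighbor - center)
    direction = rng.normal(size=center.shape)
    direction /= np.linalg.norm(direction)
    # u ** (1 / d) yields a uniform density inside a d-dimensional ball.
    scale = rng.uniform(0.0, 1.0) ** (1.0 / center.size)
    return center + radius * scale * direction


def oversample_minority(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic samples for one minority class.

    X_min must contain at least two samples.
    """
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(dists)[1:k + 1]  # skip the sample itself
        j = rng.choice(neighbors)
        synthetic.append(gsmote_point(X_min[i], X_min[j], rng))
    return np.array(synthetic)
```

Replacing `gsmote_point` with `smote_point` inside `oversample_minority` recovers the line-based behavior of SMOTE, which makes the effect of the enlarged generation area easy to compare on the same data.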

ML Classifiers and Accuracy Assessment
ML classifiers, which are considered accurate and efficient approaches for LC mapping, have attracted many scholars. In this work, three frequently used and well-known ML algorithms, namely RF, ELM, and SVM, were assessed for LC classification in six different landscapes (for more detailed information about the ML classifiers, see [26–31]). In the RF model, the number of trees was set to 1000, selected from a range of 500 to 3000 using cross-validation, while the number of variables for each split was set to the default value of 25. To get the best performance, the cutoff fraction in this model and the number of resampling repetitions were set to 0.01 and 500, respectively. For the SVM, we used a kernel width (γ) of 0.95 and a regularization parameter (C) of 0.8. Of note, we used a tenfold cross-validation strategy to find the most suitable values for the parameters of the RF, ELM, and SVM algorithms. Three accuracy evaluation metrics, namely User's Accuracy (UA), Producer's Accuracy (PA), and Overall Accuracy (OA), were computed with the validation datasets to evaluate the accuracy of the produced LC maps. We did not use the Kappa statistic because of increasing criticism in the literature [15,32,33].
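For reference, the three metrics can be derived from a confusion matrix as in the following minimal sketch. The matrix orientation (rows as reference classes, columns as mapped classes) and the two-class example values are assumptions chosen for illustration; they are not results of this study.

```python
import numpy as np


def accuracy_metrics(confusion):
    """Return (OA, PA, UA) from a square confusion matrix.

    confusion[i, j] is the number of validation samples whose reference
    class is i and whose mapped class is j.
    """
    confusion = np.asarray(confusion, dtype=float)
    correct = np.diag(confusion)
    overall = correct.sum() / confusion.sum()
    # PA: share of each reference class that was mapped correctly.
    producers = correct / confusion.sum(axis=1)
    # UA: share of each mapped class that agrees with the reference.
    users = correct / confusion.sum(axis=0)
    return overall, producers, users


# Hypothetical two-class confusion matrix (100 validation samples).
oa, pa, ua = accuracy_metrics([[50, 5], [10, 35]])
```

PA and UA are reported per class, which is why they expose poor performance on minority classes that a single OA value can hide.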

Results
This study applied the SVM-RFE method to obtain the ranked list of features before classification. The original features comprised the Sentinel-1 bands (VV and VH), the Sentinel-2 bands (B2, B3, B4, B8, B5, B6, B7, B8A, B11, and B12), the spectral indices (NDVI, NDWI, and NDBI), and eight texture measures derived from the VV band (mean, dissimilarity, homogeneity, second moment, contrast, variance, entropy, and correlation). Table 4 summarizes the most critical features after adopting SVM-RFE for the different landscapes. The RF, SVM, and ELM approaches were then employed with the selected features (Table 4) to generate LC maps in the six landscapes. To investigate the impact of class imbalance on LC classification accuracy, the LC maps were generated with two sets of reference datasets: the first without balancing the samples, and the second after adopting G-SMOTE to balance the samples of the classes. As shown in Tables 5 and 6, all three ML algorithms performed well in LC mapping. All generated maps obtained an OA of at least 0.83, ranging from 0.85 to 0.93 for RF, 0.83 to 0.94 for SVM, and 0.84 to 0.92 for ELM. Our analysis also showed that adopting the G-SMOTE method to rebalance the reference datasets substantially improved the UA and PA of minority classes. As illustrated in Figure 3, RF-G-SMOTE showed the best performance in four landscapes, namely coastal, cropland, desert, and semi-arid. In comparison, SVM-G-SMOTE obtained higher accuracies for the remaining two landscapes, plain and mountain.

Most Informative Feature
Analysis of the SVM-RFE results for each landscape revealed that NDVI, VV, and B12 were selected as prominent features in all six landscapes, which confirms their importance in LC classification [16]. In contrast, NDBI, which is considered an essential index for extracting built-up areas [23], was identified as a critical feature in only three landscapes (plain, semi-arid, and coastal). NDWI was in a similar situation; it was selected as a main feature only for the coastal landscape. This is potentially related to the fact that separating water bodies is much simpler than separating the other classes, which makes NDWI unnecessary in most cases. The SAR data were selected as informative features in all of the landscapes, which is plausible given the excellent applicability of SAR data in LC mapping for broad land covers. Such data have already been applied to different broad land cover classes such as forest [34], cropland [9], and built-up areas [35]. Among the texture measures, only three (mean, variance, and homogeneity) were selected in all experiments except the coastal site. The results illustrated that the texture information was relatively unimportant in comparison with the other features.

Comparison of ML Classifiers
In general, our experiments showed that the ML classifiers integrated with the G-SMOTE method yielded substantially better results in different landscapes. With balanced datasets, the coastal and desert sites showed the highest accuracies; in contrast, the lowest classification accuracy belonged to the semi-arid landscape. Comparing the classifiers revealed that different classifiers perform differently in diverse landscapes. For example, although the highest accuracy (OA and G-mean) belonged to the SVM-G-SMOTE classifier in three landscapes (coastal, cropland, and mountain), RF-G-SMOTE provided better results in the three remaining landscapes (plain, semi-arid, and desert). On the other hand, ELM-G-SMOTE had the worst outcome in five sites (tied with SVM-G-SMOTE in two cases), whereas it shared the top position with RF-G-SMOTE in the plain landscape.
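The G-mean mentioned above is not defined in the text. A common definition in the imbalanced-learning literature, assumed here, is the geometric mean of the per-class recalls, which correspond to the PA values. The sketch below follows that assumption; the function name and the example values are hypothetical.

```python
import numpy as np


def g_mean(producers_accuracy):
    """Geometric mean of per-class PA (recall) values."""
    pa = np.asarray(producers_accuracy, dtype=float)
    return float(np.prod(pa) ** (1.0 / pa.size))


# Hypothetical PA values: one poorly mapped minority class lowers the
# G-mean far more than it would lower an arithmetic mean.
balanced = g_mean([0.9, 0.9, 0.9])
imbalanced = g_mean([0.95, 0.90, 0.40])
```

Because a single low PA drags the product down, the G-mean rewards classifiers that perform evenly across majority and minority classes.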
Since our methodological approach to image classification was the same in all experiments, this diversity in results might be attributed to the landscape circumstances and the LC class distribution of the reference datasets. The diversity in results among classifiers agrees with previous studies in which scholars tried to find the best classifier for LC classification but reached contradictory conclusions. For instance, Clerici et al. [7], fusing Sentinel-1 and Sentinel-2 data, identified the SVM algorithm as the most accurate one (OA = 88.75%), whereas Adam et al. [36] claimed that RF shows higher accuracy than SVM in a heterogeneous coastal landscape. Our results suggest comparable performance by the SVM-G-SMOTE and RF-G-SMOTE methods, although the time and experimentation required to select the user-defined parameters of RF are small compared with SVM. The SVM implementation involves choosing a suitable kernel and some kernel-specific parameters such as cost and gamma [26]. In contrast, RF only requires the selection of two parameters, ntree and mtry. A value of 500 for ntree and the square root of the number of input variables for mtry have already been shown to be valid values [28].
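The difference in tuning effort can be illustrated with scikit-learn, used here as a stand-in for the software of this study. The synthetic data and the candidate grid for cost and gamma are hypothetical; the RF settings follow the values quoted above from [28] rather than the settings reported in the methodology.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical data: 200 samples with 9 features and two classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

# RF: two parameters, set directly to the commonly recommended values
# (ntree = 500, mtry = square root of the number of features).
rf = RandomForestClassifier(n_estimators=500, max_features="sqrt",
                            random_state=0)
rf.fit(X, y)

# SVM: the kernel is fixed to RBF, and cost (C) and gamma are searched
# over a hypothetical grid with tenfold cross-validation.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
svm_search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)
svm_search.fit(X, y)
best_svm_params = svm_search.best_params_
```

Even with this small grid, the SVM search fits the model 90 times (nine parameter combinations times ten folds), whereas the RF is fitted once with fixed settings.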

Effect of G-SMOTE on LC Classification Accuracy
In addition to the quality and quantity of the training dataset, the class imbalance problem also has a significant impact on the performance of ML classifiers [19]. Based on the results (Tables 5 and 6), applying the G-SMOTE method improved the LC classification accuracies. This finding agrees with previous research [22]. However, analyzing the impact of G-SMOTE on OA showed that balancing the data can slightly decrease OA values. Among the landscapes, coastal and semi-arid experienced the greatest improvement from integrating G-SMOTE with the ML classifiers, mainly because of the higher degree of complexity in these experiment sites. The desert landscape, which can be considered a less complex experiment site, showed the lowest impact; in this landscape, SVM even showed a better OA than SVM-G-SMOTE.

Conclusions
This study analyzed the potential of RF, SVM, and ELM in LC classification of six different landscapes by integrating Sentinel-1 and Sentinel-2 images. We also used SVM-RFE to select the most informative features from Sentinel-1, Sentinel-2, spectral indices, and textural information. Furthermore, we discussed the impact of G-SMOTE on the classification accuracy of the ML algorithms. The results showed that NDVI, VV, and B12 could contribute as main features to improve LC classification accuracy. Our findings also indicated that all three ML algorithms, especially RF and SVM, are robust approaches for LC classification in different landscapes.
Moreover, the results confirmed that applying G-SMOTE has a significant impact on the accuracy of minority classes. After applying G-SMOTE with the ML algorithms, the differences in UA and PA between minority and majority classes decreased, whereas these differences were significant when the class imbalance problem was not addressed. Further study could investigate the performance of other data balancing algorithms and the effect of sample size.

Figure 1. (A) The location of the six different landscapes. (B) The RGB color composites of the study areas.

Figure 2. An example of oversampling by the G-SMOTE algorithm.

Figure 3. Impact of G-SMOTE on the overall accuracy of the generated LC maps.

Table 3. Formulas of the spectral indices.

Table 4. Most informative features of each experiment site.

Table 5. The results of the accuracy assessment in six different landscapes (with original GCPs). Minority classes for each landscape are highlighted.

Table 6. The results of the accuracy assessment in six different landscapes (after adopting G-SMOTE). Minority classes for each landscape are highlighted.