Integration of Time Series Sentinel-1 and Sentinel-2 Imagery for Crop Type Mapping over Oasis Agricultural Areas

Abstract: Timely and accurate crop type mapping is a critical prerequisite for the estimation of water availability and environmental carrying capacity. This research proposed a method to integrate time series Sentinel-1 (S1) and Sentinel-2 (S2) data for crop type mapping over oasis agricultural areas through a case study in Northwest China. Previous studies using synthetic aperture radar (SAR) data alone often yield quite limited accuracy in crop type identification due to speckle noise. To improve the quality of SAR features, we adopted a statistically homogeneous pixel (SHP) distributed scatterer interferometry (DSI) algorithm, originally proposed in the interferometric SAR (InSAR) community for distributed scatterer (DS) extraction, to identify SHP subsets. On the basis of this algorithm, the SAR backscatter intensity was de-speckled, and the bias of coherence was mitigated. In addition to backscatter intensity, several InSAR products were extracted for crop type classification, including the interferometric coherence, master versus slave intensity ratio, and amplitude dispersion derived from SAR data. To explore the role of red-edge wavelengths in oasis crop type discrimination, we derived 11 red-edge indices and three red-edge bands from Sentinel-2 images, together with the conventional optical features, to serve as input features for classification. To deal with the high dimension of combined SAR and optical features, an automated feature selection method, i.e., recursive feature increment, was developed to obtain the optimal combination of S1 and S2 features that achieves the highest mapping accuracy. Using a random forest classifier, a distribution map of five major crop types was produced with an overall accuracy of 83.22% and kappa coefficient of 0.77. The contributions of SAR and optical features were investigated.
SAR intensity in VH polarization proved to be the most important for crop type identification among all the microwave and optical features employed in this study. Some of the InSAR products, i.e., the amplitude dispersion, master versus slave intensity ratio, and coherence, were found to be beneficial for oasis crop type mapping. It was shown that the inclusion of red-edge wavelengths improved the overall accuracy (OA) of crop type mapping by 1.84% compared with using conventional optical features only. Overall, it was demonstrated that the synergistic use of time series Sentinel-1 and Sentinel-2 data achieved the best performance in oasis crop type discrimination.

Nevertheless, it has been reported that the classification accuracy can be considerably increased by removing redundant features [20]. Feature selection is a crucial step to improve the performance of a classifier. From an operational perspective, the manual selection of such high-dimensional features is not desirable. Many approaches used separability criteria or hypothesis tests to select features based on assumptions about the sample distribution. In some cases, these assumptions were not satisfied, particularly when using SAR features. Moreover, crop samples can even break the assumption of a unimodal distribution. For example, the sowing and harvesting dates are usually farmer-customized, and crop growth stages are affected by local weather and soil conditions; therefore, they are site-specific. Thus, we prefer not to use separability criteria or hypothesis tests to select features. In the literature, several methods based on machine learning algorithms have been proposed for feature selection [21,22]. Some studies indicated that a built-in attribute of random forest, the feature importance score, can be utilized as a ranking criterion to aid feature selection.
The objective of this research is to develop a method to integrate time series Sentinel-1 and Sentinel-2 features for the mapping of typical oasis crop distribution in heterogeneous smallholder farming areas. Firstly, in addition to SAR backscatter intensity, a number of InSAR products were extracted from time series Sentinel-1 data, such as the interferometric coherence, amplitude dispersion, and master versus slave intensity ratio. A statistically homogeneous pixel (SHP) distributed scatterer interferometry (DSI) [23,24] algorithm, originally proposed in the interferometric SAR (InSAR) community to identify distributed scatterers (DSs), was adopted for the de-speckling of backscatter intensity and the bias mitigation of the coherence coefficient, so as to improve the quality of SAR features. To the best of our knowledge, this is the first time the use of amplitude dispersion and bias-mitigated coherence is explored in crop type discrimination. Secondly, optical features were extracted from multi-temporal Sentinel-2 images. In particular, red-edge spectral bands and 11 red-edge indices were derived and included as input features for the oasis crop classification. Thirdly, a recursive feature increment (RFI) approach, based on random forest feature importance, was proposed to obtain the optimal combination of S1 and S2 features for crop type discrimination. Finally, a random forest classifier was applied to the optimal feature set to produce a crop type distribution map. This study aims to answer the following questions: (1) Does the integration of Sentinel-1 and Sentinel-2 features achieve better performance than using SAR or optical features alone in oasis crop type mapping? (2) If yes, which SAR feature has the most significant contribution? Are there any InSAR products that are capable of distinguishing oasis crop types? (3) To what extent can the inclusion of red-edge spectral bands and indices improve the accuracy of oasis crop type identification?
Which red-edge band or indices contribute most?

Study Area
The study area is located in oasis agricultural areas in Bayingolin Mongol Autonomous Prefecture, Xinjiang Uyghur Autonomous Region, China, encompassing the area 40.61°-42.44° N, 85.82°-87.14° E (Figure 1). This area is situated in the southern foothills of the Tianshan Mountains, on the north-eastern edge of the Tarim Basin, adjacent to the west bank of Bosten Lake, and to the north of the Taklimakan Desert. This region features a warm temperate continental climate, with evaporation far exceeding precipitation. It has a planting pattern representative of the arid and semi-arid regions of Northwest China. The major crop types cultivated from spring to autumn include cotton, spring corn, summer corn, pear, chili pepper, tomato, etc. Cotton is sown from early April to early May and harvested by the end of September. Spring corn is sown from mid-April to early June and harvested from mid-August to mid-September. Summer corn is sown from mid-June to mid-July and harvested by early October. Pear starts budding in late March, flowers from late March to the end of April, fruits from May to early September, and is harvested in September. Chili pepper and tomato seedlings are nurtured in greenhouses from February to March and transplanted to open fields from mid-April to early May. Tomato is harvested in August, and chili pepper is harvested in October. Phenological calendars of the five crop types are summarized in Table 1. Ground samples of six land cover types (cropland, forest, urban area, desert, waterbody, and wetlands) were visually identified as polygons from Google Earth high-resolution images. These sample polygons were resampled to generate randomly placed points for each cover type (Table 2). Individual fields of five crop species were collected in a field campaign in July 2018. Crop sample points were randomly extracted from each field under the condition that the minimum distance between any two points must be no less than 20 m.
This was done in the ArcGIS Data Management toolbox. Stratified k-fold splitting was applied in units of crop fields; that is, sample points from the same field can only be used for either training or testing. Thus, the training samples are not spatially correlated with the samples used in validation. In the crop type classification, five folds were used to verify the accuracy, and the per-crop, per-fold sample counts are shown in Table 3.

Satellite Data
To monitor the growth stages of major crop types cultivated from spring to autumn in the study area, we examined Sentinel-1 and Sentinel-2 data acquired in the time period from late March to early October. After discarding Sentinel-2 acquisitions with too much cloud cover (more than 20%), there were in total eight Sentinel-2 acquisitions left. The available number of the Sentinel-1 and Sentinel-2 acquisitions for each month is summarized in Table 4.
The Sentinel-1 data employed are the Level-1 interferometric wide swath (IW) single look complex (SLC) product. In total, 18 acquisitions were used (Table 5). Table 5. Metadata of the data stack of Sentinel-1 (S1) interferometric wide swath (IW) single look complex (SLC) data using the parameters from the first image. These values remain very close for all subsequent acquisitions. The Sentinel-2 data used in this study are the MSI Level-1C product (Table 6). In total, eight acquisitions were used, spanning the time from 9 May 2018 to 11 October 2018.

Methods
The workflow used in this research is provided in Figure 2. Firstly, a number of SAR features were extracted from time series Sentinel-1 data. It has been reported that backscatter intensity and derived ratios (the VV versus VH ratio and the normalized ratio procedure between bands (NRPB)) can be used to monitor vegetation changes. Some studies indicated that SAR interferometric coherence can potentially improve the discrimination of different land cover types [25][26][27][28]. Thus, the backscatter intensity, interferometric coherence, and their derivative products were derived as SAR features for crop mapping. A by-product of the interferometric pairs was also included; that is, the master versus slave intensity ratio, computed respectively for the VH and VV polarization modes. In addition, we derived a statistic of the time series SAR intensity, i.e., the amplitude dispersion, computed respectively for VH and VV polarization. The amplitude dispersion was originally used in multi-temporal interferometric SAR (InSAR) for the initial selection of persistent scatterers, and it is a good indicator for distinguishing vegetated surfaces from man-made structures, bare rock, etc.
Secondly, optical features were derived from Sentinel-2 images. In addition to the conventional features, such as the visible, NIR (near-infrared), and SWIR (short-wave infrared) bands, the normalized difference vegetation index (NDVI), and the normalized difference water index (NDWI), the red-edge spectral bands were also extracted. A total of 11 red-edge indices (detailed in Table 7) were calculated for each S2 acquisition, in order to thoroughly explore the contribution of red-edge wavelengths to crop type discrimination.
To deal with the high dimensional input features for the classifier, we proposed a recursive feature increment approach to select the optimal combination of SAR and optical features (detailed in Section 2.3.2) based on the random forest feature importance ranking.
A two-step hierarchical cotton mapping strategy (Figure 2) was implemented in this research. In step 1, the entire field site was classified into six land cover types, i.e., cropland, forest, desert, urban area, water body, and wetlands, so as to create a cropland mask. In the second step, the cropland was mapped into five different crop types, including chili pepper, corn, cotton, pear, and tomato. The automated feature selection method and RF classification were used in both step 1 and step 2, as illustrated in Figure 2.

Sentinel-1 Data Processing and Feature Extraction
All Sentinel-1 IW SLC images were pre-processed by the SNAP-ESA Sentinel Application Platform v7.0.0 (http://step.esa.int). Processing steps include co-registration, de-burst, and subset. To derive InSAR products, in total, 17 interferometric pairs were formed for each polarization mode using the acquisition on 26 March 2018 as the common master image and all the subsequent acquisitions as the slave images. The next step is the de-speckling of intensity and accurate estimation of coherence. As pointed out by [29], conventional de-speckling methods using regular-shaped windows average the values of neighboring pixels indiscriminately, leading to degraded image resolution and blurred edges between objects of different scattering characteristics. Furthermore, the coherence is conventionally calculated by a regular-shaped sliding window, which unavoidably averages pixel values from different distributions (e.g., belonging to different land cover types), resulting in a biased estimation. In this study, a SHP DSI algorithm [23,24] was applied to the single look complex SAR data prior to SAR feature extraction. Here, we briefly summarize the principles and processing steps of the SHP DSI algorithm: In SAR images, there are large numbers of distributed scatterers (DSs), which exist in resolution cells over cropland, forest, desert, etc.
No single DS dominates the backscatter characteristics of a resolution cell, so the pixel value behaves as a random variable along the time dimension. When the SAR time series is sufficiently long, according to the central limit theorem, the vector sum of all the distributed scatterers in a pixel can be modeled as a complex Gaussian random variable. In a SAR image, whether two pixels share the same statistical distribution can then be tested using a confidence interval. In this way, pixels can be divided into different statistically homogeneous pixel (SHP) subsets. The SHP DSI algorithm applies a modified Lee filtering method to the diagonal elements of the complex covariance matrix of an arbitrary pixel, on the basis of the SHP subset that the pixel belongs to, to obtain the filtered time series intensity.
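A minimal sketch of the SHP identification idea follows, using a two-sample Kolmogorov-Smirnov test as a stand-in for the confidence-interval test of the SHP DSI algorithm; the window size, significance level, and toy amplitude stack are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np
from scipy.stats import ks_2samp

def shp_mask(amp_stack, row, col, half_win=(3, 3), alpha=0.05):
    """Boolean mask of statistically homogeneous pixels (SHPs) around
    (row, col), judged by a two-sample KS test on the amplitude time
    series (a stand-in for the SHP DSI confidence-interval test).

    amp_stack: (n_images, n_rows, n_cols) amplitude time series.
    """
    n_img, n_rows, n_cols = amp_stack.shape
    r0, r1 = max(0, row - half_win[0]), min(n_rows, row + half_win[0] + 1)
    c0, c1 = max(0, col - half_win[1]), min(n_cols, col + half_win[1] + 1)
    center = amp_stack[:, row, col]
    mask = np.zeros((n_rows, n_cols), dtype=bool)
    for r in range(r0, r1):
        for c in range(c0, c1):
            # A neighbor joins the SHP subset if its temporal distribution
            # is statistically indistinguishable from the center pixel's.
            _, p = ks_2samp(center, amp_stack[:, r, c])
            mask[r, c] = p > alpha
    mask[row, col] = True
    return mask

# Toy stack: a homogeneous field except a brighter block in one corner.
rng = np.random.default_rng(1)
stack = rng.rayleigh(1.0, size=(30, 10, 12))
stack[:, :3, :3] *= 4.0                 # a different scattering regime
m = shp_mask(stack, 3, 3)               # center pixel in the dark regime
```

In this toy example, the bright corner pixels are rejected from the SHP subset of the center pixel, so subsequent averaging (for de-speckling or coherence estimation) does not mix the two scattering regimes.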
As for the interferometric coherence, for an arbitrary pixel $p$, the coherence coefficient is calculated as

$\gamma = \dfrac{\left|\sum_{i=1}^{N} M_i S_i^{*}\right|}{\sqrt{\sum_{i=1}^{N} \left|M_i\right|^{2} \cdot \sum_{i=1}^{N} \left|S_i\right|^{2}}}$  (1)

where $M_i$ and $S_i$ represent the corresponding complex values of the $i$-th pixel in the master and slave images, $S_i^{*}$ denotes the complex conjugate of $S_i$, and $N$ is the number of pixels in the SHP subset where pixel $p$ is located. Firstly, the coherence coefficient is calculated on the basis of each SHP subset for the purpose of removing the errors caused by the image texture. Secondly, a maximum-likelihood fringe rate algorithm [24] is applied to compensate for the interferometric fringe pattern. Finally, the bias in the coherence estimation is further mitigated using the second kind statistical estimator proposed by [30].
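The sample coherence of Equation (1), evaluated over an SHP subset but without the fringe compensation and second-kind bias correction of the full SHP DSI chain, can be sketched as:

```python
import numpy as np

def shp_coherence(master, slave, shp):
    """Sample coherence of one pixel, estimated over its SHP subset
    (Equation (1)); fringe compensation and the second-kind bias
    correction of the full SHP DSI chain are omitted in this sketch.

    master, slave: complex SLC images; shp: boolean SHP-subset mask.
    """
    m = master[shp]
    s = slave[shp]
    num = np.abs(np.sum(m * np.conj(s)))
    den = np.sqrt(np.sum(np.abs(m) ** 2) * np.sum(np.abs(s) ** 2))
    return num / den

rng = np.random.default_rng(2)
shape = (8, 8)
shp = np.ones(shape, dtype=bool)
master = rng.normal(size=shape) + 1j * rng.normal(size=shape)
# A slave that is a scaled copy of the master is fully coherent.
assert np.isclose(shp_coherence(master, 0.5 * master, shp), 1.0)
```

By the Cauchy-Schwarz inequality, the estimate is bounded by 1; averaging only over the SHP subset is what avoids the cross-distribution bias of a regular sliding window.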
After the implementation of the SHP DSI algorithm, we obtained the de-speckled backscatter intensity and bias-corrected coherence coefficient for each acquisition date. The units of VH and VV intensity were both converted to decibels (dB). On this basis, a number of other Sentinel-1 features were generated as follows.
As one of the InSAR products, the master versus slave intensity ratio of the VH and VV polarization modes was calculated as

$R_{pol} = I_{m,pol} - I_{s,pol}, \quad pol \in \{VH, VV\}$  (2)

where $I_{m,pol}$ and $I_{s,pol}$ are respectively the backscatter intensity (in the unit of dB) of the master and slave images of each interferometric pair; in the dB domain, the intensity ratio reduces to this difference. A statistic previously used in multi-temporal InSAR, i.e., the amplitude dispersion, is calculated as

$D_A = \dfrac{\sigma_A}{\mu_A}$  (3)

where $\sigma_A$ and $\mu_A$ represent the temporal standard deviation and temporal mean of the amplitude of the Sentinel-1 time series, computed respectively for the VH and VV polarization modes. A by-product of the backscatter intensity, i.e., the VV versus VH intensity ratio, is calculated as

$R_{VV/VH} = I_{VV} - I_{VH}$  (4)

where $I_{VV}$ and $I_{VH}$ are respectively the backscatter intensity (in the unit of dB) of each Sentinel-1 acquisition. Another intensity-based ratio, the normalized ratio procedure between bands (NRPB) [31], is calculated as

$\mathrm{NRPB} = \dfrac{I_{VH} - I_{VV}}{I_{VH} + I_{VV}}$  (5)

In summary, we derived 10 subcategories of features from Sentinel-1 data. Adding features of different acquisition dates together, 142 Sentinel-1 features were produced, geocoded, and resampled to 10 m resolution.
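The Sentinel-1 feature computations of Equations (2)-(5) might be sketched as follows; the toy intensity stacks and the linear-versus-dB conventions are assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np

def db(intensity):
    """Convert linear backscatter intensity to decibels."""
    return 10.0 * np.log10(intensity)

rng = np.random.default_rng(3)
# Toy linear-power stacks: (n_dates, rows, cols) per polarization.
i_vh = rng.gamma(2.0, 0.02, size=(18, 4, 4))
i_vv = rng.gamma(2.0, 0.05, size=(18, 4, 4))

# (2) Master vs. slave intensity ratio per interferometric pair:
# a ratio of linear intensities is a difference in dB.
master_db = db(i_vh[0])                     # first date = common master
ms_ratio_vh = master_db - db(i_vh[1:])      # one layer per pair (17 pairs)

# (3) Amplitude dispersion: temporal std over temporal mean of amplitude.
amp_vh = np.sqrt(i_vh)
d_a_vh = amp_vh.std(axis=0) / amp_vh.mean(axis=0)

# (4) VV vs. VH intensity ratio per acquisition (difference in dB).
vv_vh_ratio = db(i_vv) - db(i_vh)

# (5) NRPB, computed here on dB intensities (an assumed convention).
nrpb = (db(i_vh) - db(i_vv)) / (db(i_vh) + db(i_vv))
```

Each quantity is computed per pixel and per date (or per pair), matching the per-acquisition feature layers described above; the amplitude dispersion collapses the time dimension into a single layer per polarization.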

Sentinel-2 Data Processing and Feature Extraction
Sentinel-2 data were firstly processed from Level-1C to Level-2A surface reflectance using the Sen2Cor-ESA Sentinel-2 Level 2A processor (https://step.esa.int/main/third-party-plugins-2/sen2cor/, accessed on 30 October 2019). Cirrus cloud correction was done during the Level-1C to Level-2A processing by Sen2Cor. Pixels covered by thick clouds were masked out using the opaque clouds band in the Level-2A product metadata. All spectral bands of 10 m and 20 m spatial resolution, including the red-edge bands (Band 5, Band 6, and Band 7), were included in the analysis. The 20 m resolution bands were resampled to 10 m resolution for further processing. Apart from the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI), 11 red-edge indices (Table 7) were generated for each Sentinel-2 acquisition.
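As a sketch, the normalized-difference indices can be computed from the resampled bands as follows; the NDVI and NDWI formulas are the standard ones, while the red-edge index shown is an assumed representative of the Table 7 indices, not a verbatim reproduction:

```python
import numpy as np

def normalized_diff(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)

rng = np.random.default_rng(4)
# Toy Sentinel-2 surface reflectance bands, resampled to a common grid.
b3, b4, b5, b8 = (rng.uniform(0.05, 0.5, size=(6, 6)) for _ in range(4))

ndvi = normalized_diff(b8, b4)    # (B8 - B4) / (B8 + B4)
ndwi = normalized_diff(b3, b8)    # McFeeters NDWI: (B3 - B8) / (B3 + B8)
ndre1 = normalized_diff(b8, b5)   # red-edge analogue of NDVI (assumed form)
```

All three indices are bounded in [-1, 1] for positive reflectances, which keeps the optical features on a comparable scale for the classifier.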

Random Forest Classification
The random forest (RF) classifier [36] was deployed in this study for both cropland extraction and crop type classification. Random forest is a machine learning algorithm for classification consisting of an ensemble of decision trees, where each tree casts a unit vote for the most popular class. RF has been widely used for remote sensing classification, as it runs efficiently on large databases and performs well with high-dimensional features.
In this study, we used the RF classifier from the Scikit-learn Python module [37] for classification. The two key parameters, i.e., the number of trees and the number of features in each node, were chosen by analyzing the out-of-bag (OOB) errors. As a result, the number of trees was set as 450 in step 1 and 550 in step 2, and the number of maximum features in each tree was set as the square root of the total number of input features in both steps.
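The OOB-based parameter choice can be reproduced in outline with scikit-learn; the candidate tree counts and the synthetic dataset below are illustrative stand-ins for the paper's feature stack:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data standing in for the stacked S1/S2 feature samples.
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)

# Scan candidate tree counts and record the out-of-bag (OOB) error,
# i.e., the error on samples each tree did not see during bootstrapping.
oob_err = {}
for n_trees in (50, 150, 450):
    rf = RandomForestClassifier(n_estimators=n_trees,
                                max_features="sqrt",  # sqrt(n_features)/split
                                oob_score=True,
                                random_state=0)
    rf.fit(X, y)
    oob_err[n_trees] = 1.0 - rf.oob_score_

best_n = min(oob_err, key=oob_err.get)   # tree count with lowest OOB error
```

Because the OOB error is computed from the bootstrap, no separate validation split is consumed in this tuning step, which is why the paper could reserve all folds for accuracy assessment.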
Optimal S1 and S2 Input Variable Selection Using the Recursive Feature Increment (RFI) Method
Inspired by the backward feature elimination method [22], we proposed a forward feature selection approach to obtain the optimal combination of S1 and S2 features, which is referred to as the recursive feature increment (RFI) method.
Firstly, a feature importance ranking was obtained by the permutation importance of random forest calculated for each feature. The permutation importance assesses the impact of each feature on the accuracy of the random forest model. The general idea is to randomly permute the feature values and measure the decrease of the accuracy due to the permutation. In this study, we utilized the permutation importance function from the ELI5 package to estimate the feature importance.
Secondly, using the feature ranking with the most important feature placed at the top, RF classification was implemented by recursively considering progressively larger feature sets, starting from one feature and adding one at a time. For the feature set at each iteration, an RF classifier is parameterized and assessed using stratified k-fold cross-validation, with the mean overall accuracy (OA), mean kappa coefficient, and mean F1 score of all classes, as well as the F1 score of each individual class, recorded.
Finally, by analyzing the accuracy metrics across iterations, for example, the kappa coefficient, the feature set of the iteration achieving the highest accuracy is selected as the optimal combination of S1 and S2 features.
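The three RFI steps above can be sketched as follows; scikit-learn's permutation_importance stands in for the ELI5 function used in the paper, and the data are synthetic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=300, n_features=15, n_informative=6,
                           random_state=0)

# Step 1: importance ranking via permutation importance.
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
imp = permutation_importance(rf, X, y, n_repeats=5, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]   # best feature first

# Step 2: grow the feature set one feature at a time, scoring each set
# with stratified k-fold cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for k in range(1, len(ranking) + 1):
    subset = ranking[:k]
    acc = cross_val_score(
        RandomForestClassifier(n_estimators=50, random_state=0),
        X[:, subset], y, cv=cv).mean()
    scores.append(acc)

# Step 3: the iteration with the highest score gives the optimal set.
best_k = int(np.argmax(scores)) + 1
optimal_features = ranking[:best_k]
```

Unlike backward elimination, this forward scheme needs only one model fit per added feature, and the recorded score curve directly shows where extra features stop paying off.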

Accuracy Assessment
The performance of classification was assessed using stratified k-fold cross-validation. The entire sample dataset was split into k stratified "folds". The folds were made by preserving the percentage of samples for each class. In the case of unbalanced samples, this is to ensure that each fold is a good representative of the whole. For each fold in the sample dataset, the classification model was trained on k-1 folds and tested on the kth fold. This was repeated until each fold had served as the test set. Three accuracy metrics were used in this study, i.e., the overall accuracy (OA), Cohen's kappa coefficient, and F1 score of each class. All these metrics were averaged over the k folds.
The OA is calculated from the confusion matrix by adding the number of correctly classified sites together and dividing it by the total number of reference sites; it measures what proportion of all reference sites was mapped correctly. From the confusion matrix, the producer's accuracy (PA, also referred to as recall) and the user's accuracy (UA, also referred to as precision) can also be calculated. UA (precision) is the ratio of correctly predicted positive observations to the total predicted positive observations; PA (recall) is the ratio of correctly predicted positive observations to all the observations in the actual class. The kappa coefficient evaluates whether the classification performs better than randomly assigning values, by taking into account the possibility of the agreement occurring by chance. In the case of uneven class distribution, the F1 score is a more useful metric than OA, as it is the harmonic mean of precision and recall. The F1 score for each class is calculated as follows:

$F_1 = \dfrac{2 \times UA \times PA}{UA + PA}$
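These metrics can be computed from a confusion matrix as follows (the labels below are toy values for illustration):

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, cohen_kappa_score,
                             precision_recall_fscore_support)

y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 1, 2, 2, 0, 2])

cm = confusion_matrix(y_true, y_pred)
oa = np.trace(cm) / cm.sum()               # overall accuracy
kappa = cohen_kappa_score(y_true, y_pred)  # chance-corrected agreement
# UA (precision), PA (recall), and per-class F1 = 2*UA*PA / (UA + PA).
ua, pa, f1, _ = precision_recall_fscore_support(y_true, y_pred)
```

With these toy labels, 8 of 10 sites are correct (OA = 0.8), and the per-class F1 array returned by scikit-learn matches the harmonic-mean formula above.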

Step 1: Cropland Extraction
A random forest (RF) classifier was applied to the Sentinel-1 and Sentinel-2 features for land cover classification, so as to generate a cropland mask. As the purpose of this step is to extract a cropland mask, we used the F1 score of cropland (Figure 3) as the accuracy metric for the RFI feature selection (Section 2.3.2). The top 114 features in the importance ranking were selected. The feature importance scores of the top six features are displayed in Figure 4, with corresponding boxplots of different land cover types shown in Figure 5. In Figure 5, the top six features exhibit a good ability to distinguish different cover types, which reflects the reliability of the RFI feature selection method.
The random forest classifier was parameterized using the selected 114 features to generate a land cover map (Figure 6a). Then, a cropland mask (Figure 6b) was produced from the land cover classification results. The land cover classification accuracy was assessed using 10-fold cross-validation (Table 8).

Step 2: Crop Type Classification
Using the RFI feature selection method, we obtained the optimal combination of Sentinel-1 and Sentinel-2 features for crop type discrimination. In this case, the kappa coefficient was used as the main accuracy metric to determine the final feature set. The mean OA achieved its maximum at the same time as the kappa coefficient, at the 113th iteration. Thus, in total, 113 features were selected. The mean OA and mean kappa coefficient averaged over the k-fold cross-validation are plotted in Figure 7. The feature importance scores of the top six features selected for crop type classification are shown in Figure 8. Based on the optimal combination of S1 and S2 features, a random forest classification model was trained using the ground samples and applied to the whole study area to produce a crop distribution map (Figure 10). The classification accuracy was assessed by fivefold cross-validation (Table 9). In addition, the accuracy was verified by a "one sample per field" method; that is, only one sample was extracted from each individual validation field (as listed in Table 3). In this method, the validation samples are independent of each other. The corresponding accuracy metrics are listed in Table 10. From Figure 10, we can see that pears are mainly planted in Korla County; the four counties near Bosten Lake (Bohu, Yanqi, Hejing, and Hoxud) are the main production areas of chili peppers and tomatoes; cotton is mostly cultivated in Yuli County on the edge of the Tarim Basin. Corn cultivation spreads across the whole study area.
Table 9. Accuracy of step 2 crop type classification using the optimal combination of S1 and S2 features, assessed by stratified fivefold cross-validation using the ground samples.

Performance Comparison of SAR Features Filtered by Different Methods in Crop Classification
In this section, we compare the crop classification results obtained using SAR features processed by conventional de-speckling methods and by the SHP DSI method. The input variables include all the SAR features mentioned in Section 2.3.1. In the conventional procedure, the intensity and intensity-derived features were de-speckled using a 7 × 7 refined Lee filter, and the coherence was estimated using a 7 × 7 sliding window. In the SHP DSI method, statistically homogeneous pixels (SHPs) were identified based on a 5 × 9 window using the intensity stack formed by 18 acquisitions. Then, the de-speckled intensity and accurately estimated coherence were obtained by the procedure described in Section 2.3.1. Example intensity images of a subset area extracted from the original SAR intensity, the intensity filtered by the refined Lee method, and the intensity filtered by the SHP DSI method are compared in Figure 11. Figure 12 shows example coherence of the same subset area, estimated respectively using a 7 × 7 sliding window and the SHP DSI method. The crop classification accuracy obtained using SAR features processed by the different filtering algorithms is listed in Table 11. In Figure 11, it is evident that the speckles in the original SAR intensity are reduced by both the refined Lee filter and the SHP DSI filter, but the sharpness and details of the image are better preserved in (c). In Figure 12, it is found that the coherence over a water body is overestimated in (a). This bias is reduced in (b) by using the SHP DSI method. It is clear that the coherence in (b) exhibits less salt-and-pepper-like noise.
In Table 11, it is demonstrated that the SAR features filtered by the SHP DSI method yield the best accuracy in crop type classification. Compared with the results of using the refined Lee algorithm, the overall accuracy is increased by 6.25%, the kappa coefficient is improved by 0.08, and the F1 scores of each crop type are all improved.

Performance Comparison of Features from Different Sources in Crop Type Mapping
A comparison was conducted on the performance of crop type classification using four groups of features. The first group includes only Sentinel-1 features; the second group contains only conventional optical features, exclusive of red-edge features; the third group comprises all Sentinel-2 features, inclusive of the red-edge contribution; the last group consists of both SAR and optical features. The same feature importance ranking obtained in Section 3.2.1 was used for all groups. For each feature group, the optimal feature set was individually determined by the RFI feature selection method, using the kappa coefficient of each group as the accuracy metric. The mean OA and kappa coefficient reached their maximum at the same iteration index for each test, as summarized in Table 12. The corresponding F1 scores of different crop types are compared in Figure 13. Table 12. Accuracy assessment of crop type discrimination using different groups of features. The mean overall accuracy (OA) and kappa coefficient were averaged over fivefold cross-validation in both multi-sample per field validation and one sample per field validation. In both Table 12 and Figure 13, it is evident that the combination of SAR and optical features achieved the best accuracy in crop type discrimination. The inclusion of red-edge bands and indices improved the mean OA and kappa coefficient respectively by 1.84% and 0.03 (3.06% and 0.04 in multi-sample per field validation), compared with the results using the other Sentinel-2 features. By integrating the optical and SAR features, the mean OA and kappa coefficient were improved by 1.58% (1.55% in multi-sample per field validation) and 0.02 compared with the results using Sentinel-2 features only. In Figure 13, among the five crop types, the F1 scores of chili pepper and corn were significantly improved by the inclusion of red-edge features.

Importance of Features from Different Sources
The accumulated importance scores were calculated by subcategory (Figure 14), by data source (Figures 15a and 16a), and by month of acquisition (Figures 15b and 16b). Among the features selected for cropland extraction (Figure 14a), it is found that the SWIR 2 band (Band 12) contributed the most to the cropland extraction, followed by the Sentinel-1 VH intensity and Band 11 (SWIR 1), and then the red-edge 1 band (Band 5) and NDWI.
In the features selected for crop type discrimination (Figure 14b), the VH intensity exhibited the highest accumulated score for crop type classification, followed by the NDVI, Band 8 (NIR), Band 6 (red-edge 2), Band 4 (red), and Band 5 (red-edge 1). The optimal feature set comprises six subcategories of SAR features, 13 subcategories of red-edge features, and nine subcategories of conventional optical features. The contribution of features from different data sources will be explored in detail later.
It should be noted that the VH intensity held a significant share in both land cover classification and crop discrimination. The between-band intensity ratios, i.e., the S1 NRPB and the VV versus VH intensity ratio, were only used in land cover classification. As for the InSAR products, it is found that the VV coherence was selected in step 2, while no coherence was selected in step 1. The coherence feature with the highest score in its subcategory, i.e., the VH coherence on 17 August 2018, ranked 118th in the RF permutation importance results in step 1. Its importance score was 0.27%, which is comparable to the score of 0.29% of the last chosen feature, the NDre2 (red-edge) index on 23 July 2018. The VH amplitude dispersion and the master versus slave intensity ratio in both polarization modes were used in both step 1 and step 2.
All conventional optical features and the three red-edge spectral bands contributed to both the land cover classification and the crop type discrimination. Regarding the red-edge indices, "NDVIre3" was only used in step 1; all the other 10 indices were selected for both cropland extraction and crop type identification. In Figures 15 and 16, feature scores in the three groups are calculated for each month; the VH amplitude dispersion, as a single-phase feature, is plotted on the rightmost bar, and each importance score was normalized and converted to a percentage. As shown in Figures 15a and 16a, the three groups of features held similar proportions in both steps. The conventional optical features had the largest proportion in step 1 and step 2; the SAR features accounted for ~15% in step 1 and ~16% in step 2; the red-edge features accounted for ~24% in step 1 and ~22% in step 2.

Comparing Figures 15b and 16b, the features selected in step 1 and step 2 had significantly different temporal distributions. For land cover classification, the features in May had the largest proportion, followed by September, October, and June. Red-edge features contributed more in September, June, and July. In crop type classification, it is clear that features in August held the biggest share, of which red-edge features had a significant proportion. Compared with other months, the red-edge wavelengths performed best in August, when most crops were ripening. This is likely associated with the sensitivity of the red-edge wavelengths to differences in the leaf structure and chlorophyll content of crops. Conventional optical features (visible, NIR, and SWIR wavelengths) were dominant from May to July and in October, and held a significant share in August and September.
In both steps, no optical features were available in March and April, but some SAR features showed good separability in the early season. SAR features in June and July were not selected for either step. The proportion of SAR features increased again in August and at the end of the harvesting season in October. The amplitude dispersion, used in both steps as a single-phase feature, is plotted on the rightmost bar in both Figures 15b and 16b.
In addition, we examined the correlation between the features selected in step 1 and step 2, as shown in Figure 17. In both steps, most of the selected features showed low mutual correlation, as revealed by the histograms in Figure 17c,d.
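A correlation histogram of this kind can be sketched as follows: compute the pairwise Pearson correlations among the selected feature columns and bin the unique off-diagonal values. The function name and the random example data are assumptions for illustration:

```python
import numpy as np

def correlation_histogram(features, bins=10):
    """Histogram of pairwise absolute Pearson correlations among feature
    columns, as in the panels of Figure 17c,d.

    features: (n_samples, n_features) array of selected feature values.
    Returns (counts, bin_edges) over [0, 1].
    """
    corr = np.corrcoef(features, rowvar=False)   # (n_feat, n_feat) matrix
    iu = np.triu_indices_from(corr, k=1)         # unique off-diagonal pairs
    return np.histogram(np.abs(corr[iu]), bins=bins, range=(0.0, 1.0))

# Example with independent random features, hence mostly low correlations:
rng = np.random.default_rng(0)
counts, edges = correlation_histogram(rng.normal(size=(500, 20)))
```

With 20 features there are 190 unique pairs, and for independent columns nearly all of them fall in the lowest bins, mirroring the left-skewed histograms in Figure 17.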

The Contribution of SAR Features in Crop Type Discrimination
The electromagnetic radiation of the C-band cannot fully penetrate vegetation cover. Therefore, the radiation received by the Sentinel-1 C-band SAR sensor reflects the interaction between the radar signal and the ground surface cover, which explains the sensitivity of Sentinel-1 SAR to vegetation cover.
It has been reported that VH polarization is more sensitive to the structural and geometrical arrangements of plants [11,38]. This is in accordance with the result that the VH intensity bands held the biggest proportion of the selected SAR features (and indeed of all features) for crop type discrimination (Figure 14b). We examined the selected SAR features and found that 13 out of 22 were in VH polarization mode, including six features of the VH intensity (Figure 18) and six features of the master versus slave VH intensity ratio (Figure 19), as well as the VH amplitude dispersion (Figure 20a). Pear is the most distinguishable crop in most of these features. This is highly likely because SAR intensity is sensitive to target structure, and the planting density and vegetation height in pear fields are significantly different from those in other crop fields. It is interesting to note that the VV coherence on 7 April 2018 revealed a good ability to differentiate cotton from other crop types (Figure 20b), where the VV coherence is significantly low in the cotton fields. According to the local crop phenological calendars (Section 2.1), cotton is the crop most likely to be sown in early April, whereas other annual crops are usually sown or transplanted in mid-April or later. Thus, the low coherence is possibly related to the change of the cotton field surface in the sowing season (early April) from bare soil to plastic-mulched land, which leads to a fast decrease of coherence within 12 days, while the other crop fields remain largely unchanged. In addition, chili pepper can be recognized more easily from the VH intensity on 16 October 2018 (Figure 18f).
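The two InSAR-derived features discussed here have simple per-pixel definitions: the amplitude dispersion is the temporal standard deviation of the SAR amplitude divided by its temporal mean, and the master versus slave intensity ratio compares the two acquisitions of an interferometric pair. A minimal sketch of both, with array names chosen for illustration:

```python
import numpy as np

def amplitude_dispersion(amp_stack):
    """Temporal amplitude dispersion D_A = sigma_A / mu_A per pixel.

    amp_stack: (n_dates, rows, cols) SAR amplitude time series.
    Low D_A indicates temporally stable scattering behavior.
    """
    mu = amp_stack.mean(axis=0)
    return amp_stack.std(axis=0) / np.maximum(mu, 1e-12)

def intensity_ratio_db(master, slave):
    """Master-to-slave intensity ratio of one interferometric pair, in dB."""
    return 10.0 * np.log10(np.maximum(master, 1e-12) / np.maximum(slave, 1e-12))
```

A temporally constant amplitude stack yields zero dispersion, while a surface change between master and slave acquisitions (e.g., bare soil to plastic mulch) shifts the ratio away from 0 dB.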

The Sensitivity of Red-Edge Wavelengths to Different Crop Types
Red-edge refers to the region of the sharp rise in vegetation reflectance between the red and NIR parts of the electromagnetic spectrum. It is an important wavelength range that is sensitive to vegetation conditions. This research demonstrated the advantage of adding red-edge spectral bands and indices to the crop type discrimination, which increased the overall accuracy by 3.06% compared with the results using only conventional optical features. In Figure 14, among the selected red-edge features, Band 6 (red-edge 2), Band 5 (red-edge 1), and NDVIre2n (the red-edge index based on Band 6 and Band 8A) exhibited greater importance. As for the red-edge indices, 10 out of 11 were selected for crop type discrimination (NDVIre3n was discarded by the RFI method), with a number of indices showing a good capability to differentiate corn from other crops (Figure 21). This explains why the inclusion of red-edge features significantly improved the mapping accuracy of corn (Figure 13). Apart from corn, some red-edge indices also revealed good separability of tomato (Figure 21c-f,h) and pear (Figure 22), and several red-edge features showed a good capability to identify chili pepper (Figure 23). These findings reinforce the benefit of using red-edge wavelengths in crop type mapping. From the top six scoring features for crop type mapping (Figure 8) and the accumulated importance score of each subcategory (Figure 14b), it can be inferred that the most important red-edge band is Band 6, and the most useful red-edge index is NDVIre2n (calculated using Band 6 and Band 8A (NIR narrow)). The top six features emphasized the significance of three red-edge features, i.e., Band 6 (red-edge 2), Band 7 (red-edge 3), and NDVIre2n, while the accumulated importance ranking (Figure 14b) indicated the contribution of Band 6, Band 5, and NDVIre2n.
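NDVIre2n, identified above as the most useful red-edge index, follows the standard normalized-difference pattern applied to the band pairing described in the text (Band 8A, NIR narrow, against Band 6, red-edge 2). A minimal sketch, with function names chosen for illustration:

```python
import numpy as np

def norm_diff(a, b):
    """Generic normalized difference (a - b) / (a + b), guarded against
    division by zero for masked or zero-reflectance pixels."""
    return (a - b) / np.maximum(a + b, 1e-12)

def ndvi_re2n(b8a, b6):
    """NDVIre2n as described in the text: Sentinel-2 Band 8A (NIR narrow)
    paired with Band 6 (red-edge 2). Inputs are surface reflectances."""
    return norm_diff(b8a, b6)
```

The other red-edge indices discussed in the paper differ only in which red-edge band (Band 5, 6, or 7) and which NIR band (Band 8 or 8A) enter the same formula.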
It has been reported that the red-edge close to the red wavelengths (Band 5) is mainly related to differences in chlorophyll content, while the red-edge close to the NIR (Band 7) is usually correlated with variations in leaf structure [6]. The above results suggest that the separability offered by red-edge features lies in both the leaf structure and the chlorophyll content of different crop species.
In a nutshell, SAR features distinguish crops based on the structural and geometrical arrangements of plants or changes of the crop field surface, while optical features rely on multi-spectral information to differentiate crops. In our experiments, the red-edge wavelengths exhibited good separability between different crop types, owing to their sensitivity to variations in chlorophyll content and leaf structure. Therefore, the combination of SAR and optical features integrated the physical and spectral characteristics of crops, improving the performance of crop classification.
With the development of deep learning technology, powerful convolutional networks, such as U-net [39], can be explored for crop type mapping in future work. U-net has the potential to support multi-dimensional input variables. By implementing U-net with multi-temporal, multi-sensor features, it is expected that a higher level of accuracy can be achieved in crop type identification.

Conclusions
This research proposed a method for the synergistic use of Sentinel-1 and Sentinel-2 features for oasis crop type mapping through a case study in a smallholder farming area in Northwest China. First, an SHP DSI algorithm was introduced for the de-speckling of SAR intensity and the accurate estimation of interferometric coherence, improving the quality of the SAR features. It was demonstrated that the use of the SHP DSI method improved the crop classification accuracy by 6.25% when using SAR features alone. A variety of SAR and optical features were derived from multi-temporal Sentinel-1 and Sentinel-2 images, including several InSAR products and the red-edge spectral bands and indices. Secondly, based on the permutation importance of the random forest classifier, a recursive feature increment feature selection method was proposed to obtain the optimal combination of Sentinel-1 and Sentinel-2 features for cropland extraction and crop type classification. Finally, a crop distribution map was generated with an overall accuracy of 83.22% and a kappa coefficient of 0.77. The contributions of the SAR and optical features were explored thoroughly. Among all the Sentinel-1 features, the VH intensity held the biggest proportion, indicating the better sensitivity of VH polarization to vegetation changes. It was also noted that some of the InSAR products, such as the VH amplitude dispersion, the master versus slave intensity ratio, and the VV coherence in early April, revealed good separability of certain crop types. As for the Sentinel-2 features, we demonstrated the merits of using red-edge spectral bands and indices in oasis crop type mapping: the inclusion of red-edge features improved the crop classification OA by 1.84% compared with using only conventional optical features. This proves the superiority of Sentinel-2 data due to its increased spectral resolution. A comparison was conducted on the performance of oasis crop classification using four combinations of features.
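The recursive feature increment idea summarized here can be sketched as follows: rank all features by RF permutation importance, then grow the feature set in rank order and keep the subset yielding the highest held-out accuracy. The function name, ranking details, and stopping rule below are assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

def recursive_feature_increment(X, y, step=1, random_state=0):
    """Sketch of recursive feature increment (RFI) feature selection.

    Rank features by RF permutation importance on a held-out split,
    then add features in rank order and retain the subset with the
    highest held-out accuracy.
    """
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=0.3, random_state=random_state)
    rf = RandomForestClassifier(
        n_estimators=200, random_state=random_state).fit(Xtr, ytr)
    imp = permutation_importance(
        rf, Xte, yte, n_repeats=5, random_state=random_state)
    order = np.argsort(imp.importances_mean)[::-1]   # most important first

    best_acc, best_subset = -1.0, order[:1]
    for k in range(step, len(order) + 1, step):
        subset = order[:k]                           # top-k feature indices
        acc = RandomForestClassifier(
            n_estimators=200, random_state=random_state
        ).fit(Xtr[:, subset], ytr).score(Xte[:, subset], yte)
        if acc > best_acc:
            best_acc, best_subset = acc, subset
    return best_subset, best_acc
```

In the study, this procedure was run separately for the cropland-extraction step and the crop-type-classification step, which is why the two steps ended up with different optimal feature subsets.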
The results indicated that the integration of SAR and optical features achieved the best performance. We concluded that the integration of time series S1 and S2 imagery is advantageous, and thanks to the free, full, and open data policy, it can be further explored in the vast majority of regions for the monitoring of crop status.