1. Introduction
Thanks to the development of Earth Observation (EO) technologies, remotely sensed data have become accessible to a broad range of users in both the public and private sectors and cover many important application domains [1], such as protecting fragile ecosystems, managing climate risks, and enhancing food security [2]. Data derived from EO are therefore becoming indispensable in support of many sectors of society, especially for agronomic applications. Indeed, remote sensing data derived from EO have already proven their potential and effectiveness in spatiotemporal vegetation monitoring [3,4]; monitoring agricultural resources using remote sensing thus offers the opportunity to estimate crop areas [5], predict crop yield [6,7,8], evaluate water demand [9,10], and determine the total cultivated surface and the precise distribution of crops [11]. Accordingly, in order to establish the most effective management strategy and adapt agricultural practices correspondingly, regular and precise information is required to detect variations in the field, so that policymakers, stakeholders, farmers, and researchers can be informed about the state of agricultural land. Crop classification is one of the most widely used methods of information extraction to manage and plan many agricultural activities.
However, the above-mentioned applications are still mostly based on optical remote sensing [12]. Commonly, the optical remote sensing methods used to assess crop status rely on combinations of different bands that are used to build relationships with crop biophysical parameters of the canopy [13]. Unfortunately, according to [14], two-thirds of the EO data provided by optical remote sensing sources are often covered by clouds throughout the year. Hence, obtaining optical remote sensing data of acceptable quality under such weather conditions can be a challenge. For this reason, [12] listed the advantages that synthetic aperture radar (SAR) data have over optical data, which can be summarized in three main characteristics. The first concerns the ability of SAR sensors to acquire data independently of weather conditions and at night [15]. The second important property is the sensitivity of SAR data to canopy structure [16,17]. The third characteristic concerns the SAR sensitivity to moisture or the water content of the land surface [18,19,20,21,22,23]. Nevertheless, dealing with radar data for any land application is a challenging task, and many considerations must be taken into account, such as removing the speckle noise effect from radar images [24,25], the difficulty in interpreting the information [26], and the distortion caused by changes in topography [27].
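As an illustration of the kind of preprocessing such considerations imply, the following is a minimal sketch of a Lee-type speckle filter applied to a single backscatter intensity image; the window size, noise model, and synthetic data are assumptions for illustration and do not describe the preprocessing chain used in this study.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(intensity, window=5, n_looks=4):
    """Basic Lee speckle filter for a SAR intensity image (illustrative sketch).

    intensity : 2D array of backscatter intensity (linear scale)
    window    : side length of the local averaging window
    n_looks   : equivalent number of looks, used to estimate speckle variance
    """
    local_mean = uniform_filter(intensity, size=window)
    local_sq_mean = uniform_filter(intensity ** 2, size=window)
    local_var = local_sq_mean - local_mean ** 2

    # Multiplicative speckle noise variance for an n-look intensity image
    noise_var = (local_mean ** 2) / n_looks

    # Weight tends to 1 in heterogeneous areas (edges preserved)
    # and to 0 in homogeneous areas (strong smoothing)
    weight = np.clip((local_var - noise_var) / np.maximum(local_var, 1e-12), 0.0, 1.0)
    return local_mean + weight * (intensity - local_mean)

# Example on a synthetic speckled image
rng = np.random.default_rng(0)
clean = np.ones((100, 100))
speckled = clean * rng.gamma(shape=4, scale=1 / 4, size=clean.shape)
filtered = lee_filter(speckled, window=5, n_looks=4)
```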
To make the most of the aforementioned advantages of SAR data, several authors considered using them for phenological monitoring of numerous crop types and obtained very promising results. One study [28] found that the synergistic integration of SAR and optical time series offers an unprecedented opportunity in vegetation phenology monitoring for mountain agriculture management. The central idea of that work was to derive the main phenological features from time series of Sentinel-1 and Sentinel-2 images. Results showed that Sentinel-1 cross-polarized VH backscattering coefficients have a strong vegetation contribution and are well correlated with the normalized difference vegetation index (NDVI) values retrieved from optical sensors, thus allowing the extraction of meadow phenological phases. Likewise, another study [29] analyzed the temporal trajectory of SAR and optical remote sensing data for a variety of winter and summer crops widely cultivated around the world (wheat, rapeseed, maize, soybean, and sunflower). Supported by in situ measurements (Green Area Index (GAI) and fresh biomass) as well as rainfall and temperature data, the time series of optical NDVI and SAR backscatter (VH, VV, and VH/VV) of fields with various management practices and environmental conditions were analyzed and physically interpreted. As a result, this study pointed out that dense time series allowed the capture of short phenological stages and, thus, precise descriptions of various crop developments.
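To make this kind of correlation analysis concrete, the following is a minimal sketch of how the agreement between a VH backscatter time series and an NDVI time series could be quantified after resampling to common dates; the dates and values are synthetic and are not taken from [28] or [29].

```python
import numpy as np

# Illustrative, synthetic time series (day of year, value) for one field
ndvi_doy = np.array([10, 40, 70, 100, 130, 160, 190, 220])
ndvi = np.array([0.20, 0.25, 0.40, 0.65, 0.80, 0.75, 0.50, 0.30])

vh_doy = np.arange(5, 225, 12)                         # denser SAR revisit
vh_db = -18 + 8 * np.interp(vh_doy, ndvi_doy, ndvi)    # fake VH (dB) tracking NDVI

# Resample the denser SAR series onto the optical acquisition dates
vh_on_ndvi_dates = np.interp(ndvi_doy, vh_doy, vh_db)

# Pearson correlation between the two temporal signatures
r = np.corrcoef(ndvi, vh_on_ndvi_dates)[0, 1]
print(f"Pearson r between NDVI and VH: {r:.2f}")
```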
Therefore, SAR data may offer valuable information that can reinforce optical remote sensing data and can be especially advantageous for crop classification applications. That is the reason why several classification studies used both SAR and optical remote sensing products [30,31,32] in order to assess the potential of their complementary use. For instance, a study was carried out with the objective of combining Sentinel-1 radar and Sentinel-2 optical imagery to create a crop map for Belgium [33]. The obtained results showed that the combination of radar and optical imagery outperformed classifications based on single-sensor inputs. These results were obtained following a methodology that highlighted the role of each remote sensing component. This procedure relied on the use of 18 incremental classification schemes, and the classification was performed by a random forest (RF) classifier. Another work [34] used 9 Sentinel-1 SAR images and 11 optical Landsat-8 images (used as a surrogate for Sentinel-2). Classification was again performed with an RF classifier, and the methodology was designed to highlight the impact of SAR image time series when used as a complement to optical imagery. In addition, this work evaluated the most relevant SAR image features and the use of temporal gap-filling of the optical image time series. The study presented two main conclusions. First, SAR image time series allowed significant improvements in the classification process, and second, they allowed the use of optical data without a gap-filling process, because a methodology was used to replace the missing values that were eliminated by a cloud screening filter [35]. In agreement with the previous studies, it was revealed in [36] that the synergistic use of Sentinel-1 and Landsat-8 data enhanced the accuracy of classifications compared to those performed with optical or radar images alone; the classification in that study was also performed by RF. Furthermore, a series of studies was conducted in [37] to improve classification efficiency in cloudy and rainy regions using Sentinel-1 and Landsat-8, building a recurrent neural network (RNN)-based classifier suitable for remote sensing images at the geo-parcel scale. They succeeded in designing an improved crop planting structure map of their specific study area.
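As a hedged illustration of the single-sensor versus combined-sensor comparisons reported in these studies, the sketch below trains a random forest on optical-only, SAR-only, and combined feature sets; the feature matrices and labels are random placeholders, so the printed scores do not reproduce any published result.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_plots = 300

# Placeholder per-plot features: NDVI on 22 dates, VH and VV on 39 dates each
ndvi_feats = rng.normal(0.5, 0.2, size=(n_plots, 22))
sar_feats = rng.normal(-15.0, 3.0, size=(n_plots, 39 * 2))
labels = rng.integers(0, 12, size=n_plots)  # 12 placeholder crop classes

scenarios = {
    "optical only": ndvi_feats,
    "SAR only": sar_feats,
    "optical + SAR": np.hstack([ndvi_feats, sar_feats]),
}

for name, X in scenarios.items():
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    score = cross_val_score(rf, X, labels, cv=5, scoring="f1_weighted").mean()
    print(f"{name:>14s}: weighted F1 = {score:.2f}")
```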
The current work is based on the results reported in [38], the main objective of which was to assess the contribution of Sentinel-2A and Landsat-8 information to crop classification. In that work, 22 classification algorithms were evaluated to determine which was the most robust. The use of combined Sentinel-2A and Landsat-8 information did not contribute much to improving crop classification accuracy compared with using only Sentinel-2A information. Further, large differences in accuracy were found depending on the machine learning algorithm used, which in turn depended on the type of information used. Consequently, the interest of the present work lies in integrating multitemporal SAR data from Sentinel-1 with optical data obtained with Sentinel-2A, together with determining the best machine learning algorithm to perform accurate crop classification in a semiarid region. The following are the main objectives:
- Establish a simple and efficient methodology that allows the incorporation of SAR data with optical data to perform classification over a large area with dense time series of Sentinel-2 (22 acquisition dates) and Sentinel-1 (39 acquisition dates) data, so that the phenological temporal dynamics of the studied crops can be detected completely, in order to provide the maximum amount of information that allows differentiation between the crops (a minimal sketch of this feature assembly is given after this list).
- Select the SAR feature that allows the best classification results.
- Evaluate the performance of the 22 nonparametric classifiers that were tested in our previous work with only optical data, now adding SAR data to the optical data. The novelty of this paper is that a large number of these algorithms have not previously been tested with SAR data; depending on their performance, we assess them and select the best one.
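The following is a minimal sketch, under assumed array shapes and placeholder values, of how the per-plot input features for several of the classification scenarios evaluated later could be assembled from the Sentinel-2 NDVI time series (22 dates) and the Sentinel-1 backscatter time series (39 dates); none of the numbers are data from the study area.

```python
import numpy as np

rng = np.random.default_rng(1)
n_plots = 500

# Placeholder plot-level temporal signatures (one row per plot)
ndvi  = rng.uniform(0.1, 0.9, size=(n_plots, 22))   # Sentinel-2 NDVI, 22 dates
vv_db = rng.normal(-12.0, 2.0, size=(n_plots, 39))  # Sentinel-1 VV (dB), 39 dates
vh_db = rng.normal(-18.0, 2.0, size=(n_plots, 39))  # Sentinel-1 VH (dB), 39 dates
ratio = vh_db - vv_db                               # VH/VV ratio expressed in dB

# Input feature stacks for a few of the classification scenarios
scenarios = {
    "S1: NDVI only": ndvi,
    "S5: NDVI + VV": np.hstack([ndvi, vv_db]),
    "S6: NDVI + VH": np.hstack([ndvi, vh_db]),
    "S7: NDVI + VH + VV": np.hstack([ndvi, vh_db, vv_db]),
    "S9: NDVI + VH + VV + VH/VV": np.hstack([ndvi, vh_db, vv_db, ratio]),
}
for name, X in scenarios.items():
    print(f"{name:<28s} feature matrix shape: {X.shape}")
```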
3. Results
3.1. Analysis of Temporal Signatures of Crops
Figure 5 and Figure 6 represent the NDVI profiles and temporal signatures of six of the studied crops for the three backscatter channels (VH, VV, and VH/VV) of the descending and ascending orbits. We briefly comment on the general tendency of the selected crops.
As a general observation for all crop types, the curves of both ascending and descending orbits show the same general tendency.
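For reference, the sketch below shows one way such plot-level temporal signatures could be computed for a single acquisition date from a calibrated backscatter image and a raster of plot identifiers; the variable names, the gamma-distributed placeholder data, and the choice of averaging in linear power before converting to dB are assumptions for illustration, not a description of the processing chain used in this study.

```python
import numpy as np

def plot_mean_db(sigma0_linear, plot_ids):
    """Mean backscatter in dB for each plot in a labelled raster (illustrative).

    sigma0_linear : 2D array of calibrated backscatter (linear power units)
    plot_ids      : 2D integer array, same shape, 0 = background
    """
    means = {}
    for pid in np.unique(plot_ids):
        if pid == 0:
            continue
        mean_linear = sigma0_linear[plot_ids == pid].mean()
        means[pid] = 10.0 * np.log10(mean_linear)  # convert mean power to dB
    return means

# Synthetic example for a single acquisition date
rng = np.random.default_rng(3)
vh = rng.gamma(shape=4.0, scale=0.004, size=(60, 60))  # fake VH sigma0 (linear)
vv = rng.gamma(shape=4.0, scale=0.02, size=(60, 60))   # fake VV sigma0 (linear)
plots = np.zeros((60, 60), dtype=int)
plots[5:25, 5:25] = 1
plots[30:55, 30:55] = 2

vh_db = plot_mean_db(vh, plots)
vv_db = plot_mean_db(vv, plots)
ratio_db = {pid: vh_db[pid] - vv_db[pid] for pid in vh_db}  # VH/VV in dB
print(vh_db, vv_db, ratio_db)
```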
When considering the maize crop (Figure 5, left column), generally in the study area it is sown in April and the harvest period can extend from the end of August to the end of September (Figure 7). From 15 May to the start of July, the VH backscatter and VH/VV steadily increased. According to [61], this observation can be explained by an increase in volume scattering when newly formed leaves unfold, and subsequently by the accumulation of biomass. VV also rises constantly during the vegetation development phase. This increase may be due to the augmentation of double-bounce scattering [62]. During the ripening phase, from mid-July to the end of September, it can be noticed that VH/VV, VH, and VV remain stable, indicating that maize has reached its maximum height and the fruit is in the development phase [61].
In the case of barley (Figure 5, center column), in this study area it is seeded at the start of January and harvested at the end of June or beginning of July (Figure 7). It is important to mention that barley and wheat have very similar growing seasons, phenology, and crop structure [63]. At the early and late growth stages, when the crop emerges and after the harvest, the backscatter is essentially determined by the condition of the soil, but between the start and end of the growth cycle, vegetation scattering becomes significant and the relationship between radar backscatter and vegetation biophysical parameters is considerably influenced by the dynamics of the canopy structure, including the orientation, size, and density of the stems, and the dielectric constant of the crop elements, which depends on the phenological stage [29,64]. Vegetation starts to increase at the beginning of March, which corresponds to the tillering stage, until the beginning of April, which corresponds to the stem elongation period. As a consequence of this vegetation development, VH and VV increase, while from 10 to 15 April both backscatter signals decrease due to the rising attenuation from the predominantly vertical structure of the barley stems [29,63]. At the beginning of May, we again observed an increase in VH and VV polarization, which is related to the heading stage [65,66] and can be explained by the increase in fresh biomass. VH/VV starts to increase at the tillering phase (beginning of March) and continues to rise during the stem extension phase (corresponding to the period 20 March to 10 April). A decrease is then noticed from 10 April until the start of May, related, as with VH and VV, to the vertically rising structure of the crop, as commented previously. Around the start of May, the VH/VV ratio starts to increase again, indicating the start of the heading and flowering phase. As heading takes place, the flag leaves become less dominant within the canopy, and as a consequence the crop develops a more open vertical structure. During the end of the phenological cycle, which starts at the beginning of June, VH/VV, VV, and VH are characterized by a steady decrease because the canopy dries out, which allows deeper penetration of the backscattered signal [62] and increases the influence of the soil.
Looking at the grapevines (Figure 5, right column), observing the VV and VH curves, we can detect the start of vegetative development between the end of April and the start of May, coinciding with the increase of VH. However, we cannot detect any other important indicator marking the transition between phenological stages. Generally, the VV, VH, and especially VH/VV backscatter did not show a clear increasing or decreasing pattern, because in the case of woody crops there is a predominance of soil backscatter due to the open spaces between the vines [67].
Purple garlic in our study area is planted in January and harvested in mid-June (Figure 7). According to the left column of Figure 6, the NDVI and VH/VV are similarly sensitive to garlic phenology. The NDVI and VH/VV increase in the same way until the harvest, and after that, they decrease. Generally, VV and VH behave mostly the same, and this observation can be explained by the fact that the garlic crop does not totally cover the soil even at its maximum phenological development. So, it seems that the effect of ground scattering is not attenuated during the entire crop cycle.
Poppy is cultivated from mid-January to the start of July (Figure 7). It can be seen that the general tendency of VH/VV corresponds to the NDVI behavior (Figure 6, center column). VH starts to increase at the beginning of April, decreases slightly at the beginning of June, increases again, and then finally decreases by the end of the cycle. Contrary to VH, VV decreases from the beginning of April until June, after which it increases slightly until mid-June, to finally decrease.
Peas are cultivated from the start of January until mid-June (Figure 7). According to the right-hand column of Figure 6, VH and VV have the same general behavior: they start to increase from March to April, decrease slightly by mid-April, increase until mid-June, and finally decrease by the end of the crop cycle. VH/VV is relatively more stable than VH and VV, but the important phases of the cycle during the study period can still be easily distinguished.
3.2. Evaluation of Classification Methods with Only SAR Data
The results of the classification scenarios using only Sentinel-1 information were obtained for the evaluated classification methods and are shown in Figure 8. In this section, we compare the F1 score between the scenarios of the first approach and the reference one (only Sentinel-2 NDVI). In general, VH/VV underperforms compared with NDVI, VV, and VH. Regarding SAR-derived information, VH in general performs better than VV. Further, when comparing NDVI and VH information, depending on the classification algorithm, one performs better than the other. Decision tree algorithms perform better with NDVI values than with SAR-derived information. The same conclusion is obtained with nearest neighbor algorithms. However, the use of support vector machines with VH information in general improves the results of the classification process, leading to the best results for the M8 algorithm. Regarding ensemble classifiers, the results differ depending on the selected algorithm, but the best performing algorithm, M21, slightly improves the classification performance compared to NDVI. Thus, according to our results, SAR-derived information can outperform optical information for classification if adequate classification algorithms are selected.
The classification algorithm that obtained the best F1 score in any scenario is the M21 ensemble classifier of the subspace KNN type (Table 4). Using SAR data slightly improves the classification results (from 0.87 to 0.88), mainly when VH polarization is used. It can thus be concluded that, for M21, crop classification with only optical data or only SAR information performs similarly. This is relevant because the Sentinel-2A mission started in 2015, while Sentinel-1 started in 2014, which makes it possible to perform crop classification from earlier dates while taking advantage of the 10 m resolution of this mission.
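The "subspace KNN" ensemble referred to here is a random subspace ensemble with k-nearest-neighbour base learners; the sketch below is a minimal scikit-learn approximation under assumed hyperparameters (number of learners, subspace fraction, k) and placeholder data, not the exact configuration used for M21.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)
X = rng.normal(size=(400, 39))     # placeholder VH temporal signatures
y = rng.integers(0, 12, size=400)  # placeholder crop labels

# Random subspace ensemble: each KNN learner sees a random subset of features
subspace_knn = BaggingClassifier(
    KNeighborsClassifier(n_neighbors=5),
    n_estimators=30,
    max_features=0.5,   # fraction of features drawn for each learner
    bootstrap=False,    # keep all samples, randomize features only
    random_state=0,
)
f1 = cross_val_score(subspace_knn, X, y, cv=5, scoring="f1_weighted").mean()
print(f"Subspace KNN weighted F1 (placeholder data): {f1:.2f}")
```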
3.3. Evaluation of Classification Methods with SAR and Optical Data
The results of the classification scenarios incorporating Sentinel-1 and Sentinel-2 information were obtained for the 22 algorithms, as shown in Figure 9. In this section, we compare the different scenarios of the second approach with the reference one in order to evaluate the added value that SAR features can bring to the classical classification methodology (based on only optical data).
When combining NDVI and VV information, we noticed that for the majority of classifiers this incorporation improved the F1 score results compared to using only NDVI information, except for M3 and M9. The same observation can be made when comparing the scenario using NDVI combined with VH with the one using only NDVI information. Concerning the eighth scenario (VH/VV incorporated with NDVI), for the majority of the classifiers it presented only a slight improvement in F1 score compared to using only NDVI information, except for M1 and M2, which had the same F1 score in both scenarios. Further, the eighth scenario registered the lowest F1 score among all scenarios. Thus, the VH/VV ratio did not contribute to improving the classification results, and for some classifiers it deteriorated the accuracy compared to scenarios incorporating VV or VH with NDVI.
According to our results, it can be inferred that the fifth (NDVI+VV) and sixth (NDVI+VH) scenarios provided, for almost all classifiers, equal or very similar results. Thus, VV and VH contribute equally to improving the classification process. Furthermore, it can be noticed that generally the difference between the F1 scores of the fifth (NDVI+VV), sixth (NDVI+VH), seventh (NDVI+VH+VV), and ninth (NDVI+VH+VV+VH/VV) scenarios is very small, but it is important to mention that the majority of classifiers presented their best F1 score with the seventh scenario. Therefore, we conclude that the seventh scenario (NDVI+VH+VV), integrating both polarization channels, VH and VV, with NDVI, may offer the best option to improve classification accuracy.
Regarding the algorithm with the best performance for the crop classification task, Table 5 shows that when integrating optical and SAR information, the M8 (cubic SVM) algorithm returns better results than M21. Furthermore, combining optical and SAR information increases the F1 score from 0.87 to 0.93, highlighting the improvement of crop classification with the proposed approach. The classification results obtained with the best classifier, M8, when applying the seventh scenario (NDVI+VH+VV) are shown in Figure 10.
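For reference, a cubic SVM is a support vector machine with a third-degree polynomial kernel; the sketch below shows a minimal scikit-learn equivalent trained on an NDVI+VH+VV feature stack, with placeholder data and assumed hyperparameters rather than those of M8.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(5)
n_plots = 400
ndvi  = rng.uniform(0.1, 0.9, size=(n_plots, 22))
vh_db = rng.normal(-18.0, 2.0, size=(n_plots, 39))
vv_db = rng.normal(-12.0, 2.0, size=(n_plots, 39))
X = np.hstack([ndvi, vh_db, vv_db])    # seventh scenario: NDVI + VH + VV
y = rng.integers(0, 12, size=n_plots)  # placeholder crop labels

# Cubic SVM: polynomial kernel of degree 3, with feature standardization
cubic_svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="poly", degree=3, C=1.0, gamma="scale"),
)
f1 = cross_val_score(cubic_svm, X, y, cv=5, scoring="f1_weighted").mean()
print(f"Cubic SVM weighted F1 (placeholder data): {f1:.2f}")
```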
3.4. Evaluation of the Best Crop Classification Method with the First and Seventh Scenarios
This section presents the confusion matrices of M8, the best performing classifier, for the seventh scenario (Table 6) and the first one (Table 7), in order to highlight how the incorporation of VH and VV with NDVI improved the user's accuracy (UA) and producer's accuracy (PA), and to evaluate the misclassified cases.
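As a reminder of how UA and PA relate to the confusion matrix, the sketch below computes both metrics per class from a small placeholder matrix; it assumes rows correspond to reference classes and columns to predicted classes, which may differ from the layout of Table 6 and Table 7.

```python
import numpy as np

# Placeholder 3-class confusion matrix: rows = reference, columns = predicted
cm = np.array([
    [50,  3,  2],
    [ 4, 40,  6],
    [ 1,  5, 44],
])

diag = np.diag(cm).astype(float)
pa = diag / cm.sum(axis=1)  # producer's accuracy: correct / reference totals
ua = diag / cm.sum(axis=0)  # user's accuracy: correct / predicted totals

for k in range(cm.shape[0]):
    print(f"class {k}: PA = {pa[k]:.2f}, UA = {ua[k]:.2f}")
```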
Our analysis is based on the distribution of crops into groups according to the improvement or deterioration brought by the seventh scenario compared to the first one:
First group: barley C1, maize C2, wheat C3, onion C4, purple garlic C5, grapevines C11, and peas C12. For these crops, the seventh scenario improved both UA and PA.
Second group: sunflower C8. The seventh scenario improved only UA.
Third group: alfalfa C9 and almond tree C10. The seventh scenario improved only PA.
Analyzing the results of the first group, we noticed that the introduction of VH and VV significantly improved UA and PA for peas, barley, wheat, and purple garlic. Indeed, the improvement between the reference and seventh scenarios for UA ranged between 6.1% and 36.7% and for PA ranged between 5.1% and 22.5%.
The second group contained only sunflower C8. UA was improved by 33.4%, while PA decreased to 13.3%. The sunflower also had the lowest PA value with the best performing classifier because of the small number of visited plots, which led to worse calibration and validation of the algorithm; this can explain the result obtained for this specific crop type.
Focusing on the third group, concerning alfalfa C9, PA was enhanced by 2.4% while UA deteriorated by 4.5%. Indeed, it was wrongly classified as wheat C3 four times, and once each as maize C2 and sunflower C8. This misclassification may be explained by the fact that these crops have coinciding phenological phases. In the case of the permanent crop, almond tree C10, the classification results were good even though permanent crops have special temporal signatures, as they are dominated essentially by soil backscatter due to the open spaces between trees.
4. Discussion
In this work, we assessed the potential of integrating Sentinel-1 information (VV and VH backscatter and their ratio VH/VV) with Sentinel-2A data (NDVI) to improve crop classification and to define which input data provide the most accurate classification results. Furthermore, we evaluated the performance of 22 nonparametric classifiers, most of which had not previously been tested with SAR data.
As a general tendency when using only Sentinel-1 information, the majority of classifiers achieved very similar accuracy with the VH and VV polarization channels, while classification using the VH/VV ratio as input obtained lower accuracy. Similar results were obtained in [36], in which the principal objective was to evaluate the performance of a supervised crop classification approach based only on crop temporal signatures obtained from Sentinel-1 time series, without optical data.
According to our classification results, the seventh scenario (NDVI+VH+VV) is considered the best scenario. The integration of both polarization channels, VH and VV, with NDVI improved the classification accuracy, which can be explained by the fact that radar signals interact differently with land cover components [68]; it is therefore important to choose adequate polarization channels that represent the most important backscatter mechanisms. Indeed, according to [29,69], the VH backscatter is dominated by attenuated double-bounce scattering (when targets and the ground surface are perpendicular, they can act as corner reflectors, providing a “double-bounce” scattering effect that sends the radar signals back in the direction they came from) and volume scattering mechanisms (volume scattering occurs when the radar signal is subjected to multiple reflections within three-dimensional matter; at the shorter C-band wavelength, it can take place within the canopy of lower or sparse vegetation types), while the VV backscatter is dominated by direct contributions from the ground and the canopy. Consequently, the integration of both VH and VV allowed the capture of all the important backscatter mechanisms during the phenological cycle of the observed crops. However, looking at the literature, we did not find a consensus about the most pertinent features derived from SAR imagery. In [34] it was found that Haralick texture features (entropy and inertia), the VH/VV ratio, and the local mean together with the VV imagery contain most of the information needed for accurate classification. However, in [36] it was found that the most important features in the classification scheme are VH/VV and Haralick texture. Another study [33] did not find any clear difference in importance between the two polarization channels (VV and VH).
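One way such feature relevance could be probed for a given classifier is permutation importance, sketched below on placeholder NDVI, VH, and VV feature blocks; this analysis is offered purely as an illustration and was not part of the present study or the cited works.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(13)
n_plots = 400
blocks = {
    "NDVI": rng.uniform(0.1, 0.9, size=(n_plots, 22)),
    "VH":   rng.normal(-18.0, 2.0, size=(n_plots, 39)),
    "VV":   rng.normal(-12.0, 2.0, size=(n_plots, 39)),
}
X = np.hstack(list(blocks.values()))
y = rng.integers(0, 12, size=n_plots)  # placeholder crop labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
result = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)

# Aggregate per-date importances into one value per feature block
start = 0
for name, block in blocks.items():
    width = block.shape[1]
    mean_imp = result.importances_mean[start:start + width].mean()
    print(f"{name:>4s}: mean permutation importance = {mean_imp:.4f}")
    start += width
```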
As mentioned previously in the results section, all the results of the tested scenarios in the second approach (integrating optical and SAR information) show that M8 (cubic SVM) was the best performing classifier, with better results than M21. In [38] it was concluded that M21 provided the best classification result using optical data (NDVI derived from Sentinel-2 and Landsat-8), and in the present work this classifier showed a slight improvement with the seventh scenario, rising from 0.87 to 0.90. Therefore, incorporating SAR information (VV and VH polarization) can only improve the performance of the classifiers; with a non-robust classifier (M4) it generates important improvements, and with robust classifiers (M21 and M12) it generates slight improvements. However, even with the best scenario, we notice that M9 and M10 presented the worst results. M9 presented an F1 score of 0.35 in all scenarios that included SAR data, but a relatively high F1 score of 0.77 in the first scenario (only NDVI). Based on this observation, we can conclude that the M9 classifier, medium Gaussian SVM, is probably not adequate for classification with SAR data. This conclusion can also be applied to M3.
Additionally, this study compared the performance of the best classifier, M8, between the reference (only NDVI as input) and seventh (NDVI+VH+VV) scenarios. This comparison revealed that the introduction of VH and VV significantly improved both UA and PA for peas, barley, maize, wheat, and purple and white garlic. Indeed, the improvement between the reference and seventh scenarios ranged between 5.9% and 23.4% for UA and between 7.3% and 12.3% for PA. The resulting enhancement was noted especially for cereals (barley and wheat). This observation can be explained by the fact that cereals have a quite distinctive temporal signature, as found in [67] and as shown in Figure 4. Furthermore, the presence of more clouds in spring could lead to SAR supplying more information for classification. Corn C2 achieved good classification results; both UA and PA were only slightly improved despite the temporal signatures of VV and VH being distinguishable from those of other crops. In fact, the patterns of VV and VH of corn in our study area were very similar to the pattern described in [70]: VH increased because of volume scattering during the vegetative period until the maximum height was reached and stayed constant afterwards, and the VV curve follows the general tendency of VH, but its main characteristic is that it is rather higher than the VV curves of the other crops. Moreover, for peas C12, the improvements of UA and PA in the seventh scenario were very notable; they increased by 23.4% and 9.6%, respectively. This may be due to the volume backscatter produced by the heterogeneous shrub-like structure of legume canopies, as found in [66].
5. Conclusions
In this paper, we evaluated the performance of 22 nonparametric classifiers with SAR Sentinel-1 and optical Sentinel-2A data, employing a simple and efficient methodology that allows the incorporation of dense time series of datasets (22 Sentinel-2A and 39 Sentinel-1 acquisition dates), so that the phenological temporal dynamics of the studied crops can be completely detected. The simple methodology consisted of computing the mean of the optical feature (NDVI) and the SAR features (VV, VH, and VH/VV) at the plot level (plot-based approach) for each available date, without recourse to the calculation of other features such as Haralick texture features (entropy and inertia). Nine classification scenarios based on the selection of input features were then applied.
The results of the first approach, based on the use of only SAR data as input features, revealed two important conclusions. The first is that the classification results, in general, presented better performance with VH than with VV or VH/VV. The second is that SAR-derived information (VH) can outperform optical information (NDVI) if adequate classification algorithms are selected (the case of M5, M7, M8, M10, M12, M20, and M21).
The results of the second approach, based on integrating SAR with optical data, showed that the best tested scenario was the integration of VH and VV with NDVI. Consequently, we recommend the use of both VH and VV as input features to take advantage of the different information that these two polarization channels can provide. Concerning the tested classifiers, we found that the best performing one was the cubic support vector machine (SVM) (F1 score = 0.93); indeed, with this classifier the F1 score was improved by 6% compared to the results provided by the best performing classifier, M21 (F1 score = 0.87), in the reference scenario (the first one). Furthermore, we concluded that some classifiers (M3 and M9) are not adequate for dealing with SAR data; apart from these two classifiers, incorporating SAR information (VV and VH polarization) can only improve the performance of the classifiers: with a non-robust classifier (for example, M4) it generated important improvements, and with robust classifiers (M21 and M12) it generated slight improvements.