Detection of Irrigated and Rainfed Crops in Temperate Areas Using Sentinel-1 and Sentinel-2 Time Series

The detection of irrigated areas by means of remote sensing is essential to improve agricultural water resource management. Currently, data from the Sentinel constellation offer new possibilities for mapping irrigated areas at the plot scale. Until now, few studies have used Sentinel-1 (S1) and Sentinel-2 (S2) data to map irrigated plots in temperate areas. This study proposes a method for detecting irrigated and rainfed plots in a temperate area (southwestern France) by jointly using optical (Sentinel-2), radar (Sentinel-1) and meteorological (SAFRAN) time series in a classification algorithm. Monthly cumulative indices calculated from these satellite data were used in a Random Forest classifier. Two years with different meteorological characteristics were used, allowing the performance of the method to be analysed under different climatic conditions. The combined use of all the cumulative data (radar, optical and weather) improves the irrigated crop classifications (Overall Accuracy (OA) ≈ 0.7) compared to the classifications obtained using each data source separately (OA < 0.5). The use of monthly cumulative rainfall significantly improves the Fscore of the irrigated and rainfed classes. Our study also reveals that using cumulative monthly indices yields performances similar to those obtained with 10-day images while considerably reducing computational resources.


Introduction
Human activities have an impact on the different components of the hydrosphere, and 80% of the world's population now faces water shortages that will worsen with global warming [1]. Faced with this emergency, it is necessary to develop adaptation strategies to allow access to water resources for the entire population and to maintain agricultural activity. One of the favoured adaptation strategies is irrigation. The FAO estimates that 80% of food needs in 2025 will be covered by irrigated agriculture [2], and more than 324 million hectares are equipped for irrigation in the world [3]. However, the use of irrigation has led to conflicts over water use that are likely to worsen in a context of climate change. Rational and collective management of water resources has therefore become crucial. To achieve this objective, explicit information on agricultural practices and on the amount of water needed by crops over large areas is required [4]. In this context, remote sensing observation could play an essential role. Many studies have proposed methodologies based on remote sensing images to derive useful indicators for water management [4][5][6]. Many of them have used multispectral and multi-temporal remote sensing images to map irrigated areas [7][8][9][10][11][12][13][14], demonstrating the high potential of remote sensing data. However, most of these studies were conducted in semi-arid areas [15][16][17][18][19][20][21][22][23]. Very little research has been carried out in temperate areas, which also suffer from water scarcity. In France, for example, about twenty departments (Nomenclature of Territorial Units for Statistics level 3) apply water restrictions each year. This number increases in the driest years: 73 departments (out of 101) in the summer of 2019.
In temperate areas, the detection of irrigated plots is difficult because the differences in observed phenology between irrigated and rainfed crops are smaller than those observed in semi-arid zones [24,25]. These smaller differences are related to local agricultural practices and pedoclimatic conditions.
The launch of the Sentinel constellation offers a new opportunity for mapping land cover and agricultural practices [26,27]. Numerous studies have shown that temporally dense optical time series can improve crop mapping across different climates [28][29][30][31][32]. Vogels et al. [33] used a new approach, based on a Sentinel-2 time series, to map irrigated areas in the Horn of Africa. However, as optical imagery is affected by cloud cover, the performance of crop mapping with such data can be reduced in some cases, particularly in temperate or tropical areas [33]. Other recent studies [34][35][36][37][38][39] have shown that the joint use of optical and radar data improves the robustness of mapping methods [21,35,37]. Nevertheless, all the studies mentioned above deal with the mapping of irrigated areas in arid or semi-arid climates. There are few works on mapping irrigated crops in temperate areas at high spatial and temporal resolution. Demarez et al. [36] demonstrated that combining high spatial and temporal resolution optical imagery (Landsat-8) with SAR (Synthetic Aperture Radar) imagery (Sentinel-1) improves the detection of irrigated areas in southwestern France. However, in that study, the main limiting factor was the temporal resolution of the Landsat-8 images.
In this work, we propose a new methodology for distinguishing irrigated and rainfed crops at the plot scale in temperate areas, based on the joint use of optical (Sentinel-2), radar (Sentinel-1) and meteorological (SAFRAN) data. The novelty comes from the combined use of vegetation, polarisation and meteorological indices. The study area is located in southwestern France, and the summer crops to be classified are maize (both irrigated and rainfed), soybean (both irrigated and rainfed) and sunflower (rainfed). As only part of the summer crops was irrigated in the study area, one challenging point was to distinguish between irrigated and rainfed plots of the same species. To make this discrimination, we relied on the phenological development of the vegetation cover as an explanatory variable. Different scenarios, varying the types and number of features, were evaluated using ground truth data collected in 2017 and 2018. These two years were characterised by contrasting meteorological conditions, which allowed the performance of the method to be analysed under various climatic conditions.

Study Site and Dataset
The study site is a watershed (Adour Amont) located in southwestern France, west of Toulouse. It covers 1500 km², as shown in Figure 1. The climate is temperate with continental and oceanic influences. It is characterised by a hot summer (mean temperature = 22 °C) and a wet spring (mean rainfall = 108 mm) [40]. In this region, the minimum rainfall occurs between July and October, which corresponds to the maximum development of summer crops, as shown in Table 1, making irrigation mandatory. Maize, soybean and sunflower represent, respectively, 82%, 9% and 8% of the summer crops, i.e., 36% of all cultivated crops, according to the agricultural Land Parcel Information System "Registre Parcellaire Graphique" (RPG) provided by the French Services and Payments Agency [41]. The RPG contains information on crop types but no information on agricultural practices. Only maize and soybean are irrigated, and the irrigation water needs represent 30 million m³ over the season [42]. The two years studied are characterised by different weather conditions over the cropping period. The year 2017 can be defined as a dry year, with a total rainfall of 424 mm compared to 573 mm in 2018.

The Reference Dataset
The reference dataset was established from field campaigns carried out at two stages of the growing season: after sowing (May) and at flowering (July). Irrigation practices were identified based on the presence of irrigation equipment or through declarative information provided by the farmer. The reference dataset consisted of 832 plots in 2017 (557 irrigated and 275 rainfed) and 942 plots in 2018 (680 irrigated and 262 rainfed), as shown in Table 2. Within the dataset, irrigated maize is the most represented crop, with 60% of the plots sampled in 2017 and 48% in 2018. Irrigated soybean is the least represented crop, with 3% of the plots sampled in 2017 and 5.6% in 2018. This low sampling of irrigated and rainfed soybean is explained by the low representativeness of this crop in the territory (<10%), contrary to maize, which represented 82% of the agricultural plots cultivated with summer crops in 2017 and 84% in 2018. According to the RPG, maize and soybean areas increased between the two years, while sunflower areas decreased. The plots are located on gentle slopes (less than 5%) and on alluvial soil [43]. The dataset represented 21% of the total area of summer crops in the territory.

Sentinel-2
The Sentinel-2 (S2A-S2B) images were processed to level 2A with the MAJA processing chain [44,45], providing top-of-canopy (TOC) surface reflectances. This process, presented in Hagolle et al. and Baetens et al. [44,45], detects clouds and their shadows and corrects the atmospheric effects of the images from an estimation of the aerosol optical thickness (AOT) and water vapour. It can be applied to time series of images. In total, 100 images were processed in 2017 and 124 in 2018, corresponding to tiles T30TYN and T30TYP, available on the Theia website [https://theia.cnes.fr/]. The average proportion of cloudy pixels was 31% in 2017 and 37% in 2018 over the whole time series, with cloudiness reaching over 85% in June (spring). The images were temporally aggregated to regular 10-day composite datasets using the gap-filling module of Orfeo Toolbox 6.7 (OTB) [46]. This temporal resampling relies on a distance-weighted linear interpolation of the clear acquisition dates of the optical satellites, taking into account cloud and cloud shadow masks. It is necessary when studying very large areas, as it limits the impact of satellite tracks, clouds and their shadows [28,47].
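The temporal resampling can be sketched for a single pixel as follows. This is a simplified illustration of linear interpolation between the nearest clear dates, not the actual OTB gap-filling implementation; the function name `gapfill_10day` and the toy values are ours.

```python
import numpy as np

def gapfill_10day(dates, values, mask, out_dates):
    """Linearly interpolate a pixel's index series onto a regular
    10-day grid, using only cloud-free acquisitions.

    dates     : acquisition dates (day-of-year integers)
    values    : index values at those dates
    mask      : True where the pixel is clear (no cloud / shadow)
    out_dates : target 10-day grid (day-of-year integers)
    """
    dates = np.asarray(dates, dtype=float)
    values = np.asarray(values, dtype=float)
    clear = np.asarray(mask, dtype=bool)
    # np.interp performs the distance-weighted linear interpolation
    # between the two nearest clear dates around each target date.
    return np.interp(out_dates, dates[clear], values[clear])

# Example: an NDVI series with two cloudy acquisitions masked out
dates = [5, 15, 25, 35, 45]
ndvi  = [0.20, -0.1, 0.40, 9.99, 0.60]   # -0.1 and 9.99 are cloud artefacts
clear = [True, False, True, False, True]
grid  = np.arange(5, 46, 10)             # regular 10-day grid
print(gapfill_10day(dates, ndvi, clear, grid))  # [0.2 0.3 0.4 0.5 0.6]
```

The cloudy values are simply ignored: the masked dates are filled by interpolating between their clear neighbours.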

Sentinel-1
The Sentinel-1 images come from the C-SAR instrument onboard the two Sentinel-1 satellites (S1A-S1B), in Interferometric Wide Swath mode (IWS-GRD [26]). This mode provides dual-polarisation (VV and VH) imagery at a resolution of 20 m, with a swath of 250 km. The images were radiometrically (2 × 2 multilooking filter) and geometrically corrected using the OTB software [46]. Only Sentinel-1 images acquired in ascending mode (acquisition time: 6.00 p.m.) between April and November were used (i.e., one image every 6 days, giving 79 images in 2017 and 100 in 2018). The ascending mode (orbit no. 30) was used to limit the impact of morning dew and freezing dew, which can lead to artefacts in the SAR signal. Like the Sentinel-2 images, the SAR images were orthorectified on the S2 grid with the same pixel size. In addition, the SAR images were linearly interpolated every 10 days to maintain the same temporal sampling as the Sentinel-2 data.

Meteorological Data
Rainfall data from the SAFRAN database [40], expressed in mm/day, were used in the classification process. SAFRAN is an atmospheric analysis system for surface meteorological variables, based on the use of homogeneous climatic zones and capable of taking altitudinal variations into account. These data are spatially interpolated to cover all of mainland France on a regular 8 × 8 km grid.

Optical Features
As shown in Table 3, the optical features involved in the classification process are the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Red-Edge (NDRE) and the Normalized Difference Water Index (NDWI). These indices were selected because they are sensitive to different characteristics of the plants: the fraction of green vegetation cover [48,49], the foliar pigments [50,51] and the plant water content [52], respectively. Figure 3 illustrates the dynamics of vegetation cover growth for irrigated and rainfed crops. A significant difference is observed at full growth (NDVI > 0.8 and NDWI < −0.40). This difference is stronger in 2017 and decreases when entering the senescent phase. The growing cycle is quite synchronous for all crops, as shown in Table 1. For sunflower, the growth cycle differs from that of the other crops, with lower growth peaks.
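As a minimal sketch, the three indices can be computed from the Sentinel-2 surface reflectances as normalised differences. The band pairings below, and the McFeeters form of the NDWI (green/NIR, negative over dense vegetation, consistent with the NDWI < −0.40 values above), are our assumptions; the exact definitions are those of Table 3.

```python
import numpy as np

def nd_index(a, b):
    """Generic normalised difference (a - b) / (a + b), guarded against a + b = 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    s = a + b
    return np.where(s != 0, (a - b) / np.where(s != 0, s, 1.0), np.nan)

# Sentinel-2 L2A reflectances; band choices are assumptions here:
# B3 = green, B4 = red, B5 = red edge, B8 = NIR
def ndvi(b8, b4):
    return nd_index(b8, b4)

def ndre(b8, b5):
    return nd_index(b8, b5)

def ndwi(b3, b8):
    return nd_index(b3, b8)   # McFeeters form: negative over vegetation
```

For a dense canopy (high NIR, low red and green reflectance), NDVI is close to 1 and NDWI is strongly negative, matching the full-growth thresholds quoted above.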

Radar Features
The VV and VH polarisations were used in the classification. They are, respectively, sensitive to soil moisture [53][54][55] and to the volume scattering of vegetation [53,56]. As the SAR backscatter time profiles in both polarisations are affected by noise from environmental factors, we also used the VV/VH ratio, which partially compensates for these effects. The dynamics of the SAR polarisations differ between rainfed and irrigated crops, as shown in Figure 4. This difference is more marked throughout the season than for the optical indices. Moreover, there is a clear difference in the amplitude of the SAR signal between the two crop types (soybean and maize) at full growth. Figure 4. Dynamics of SAR polarisations (VV (a,c) and VH (b,d)) for maize, soybean and sunflower crops in 2017 and 2018. The solid curves represent maize and the dotted curves soybean, in blue for irrigated and red for rainfed. The black curve represents sunflower. The envelope around the curves corresponds to the 95% confidence interval.

Cumulative Indices
The use of multispectral (optical and radar) and multi-temporal image products increases the number of features to be processed, and consequently the redundancy of the spectral information and the computation time of the classification process. To avoid these effects, we computed monthly cumulative indices, corresponding to the sum of the spectral information of each feature (rainfall, optical and radar). Gap-filled data were used for simplicity. The use of cumulative indices is pertinent because they are related to plant functioning. Indeed, many authors have demonstrated the link between plant development and cumulative spectral indices derived from remote sensing data [57][58][59]. For this study, we assumed that the impact of several irrigation events (4 to 5 per season) on crop development could be captured by the cumulative indices over the growing season. Indeed, we assume that the speed and amplitude of crop development differ between irrigated and non-irrigated crops at the end of the season.
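The construction of the monthly cumulative features can be sketched as follows; this is a toy example with hypothetical rainfall values, and the same aggregation would apply to every optical and radar feature:

```python
import numpy as np

def monthly_cumulative(months, values):
    """Aggregate a 10-day feature series into monthly cumulative features.

    months : month number (1-12) of each 10-day composite
    values : feature value (e.g. NDVI, VV, rainfall) of each composite
    Returns the months present and one cumulative value per month.
    """
    months = np.asarray(months)
    values = np.asarray(values, dtype=float)
    uniq = np.unique(months)
    return uniq, np.array([values[months == m].sum() for m in uniq])

# Example: hypothetical rainfall (mm) per 10-day composite, April-June
months   = [4, 4, 4, 5, 5, 5, 6, 6, 6]
rainfall = [10, 0, 25, 5, 0, 0, 30, 12, 8]
m, cum = monthly_cumulative(months, rainfall)
print(dict(zip(m.tolist(), cum.tolist())))  # {4: 35.0, 5: 5.0, 6: 50.0}
```

Each feature is thus reduced from roughly three values per month to one cumulative value per month, which is what shrinks the 10-day feature set (385 features) down to the cumulative one (56 features).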

Classification
The algorithm used for the classification process is the Random Forest (RF) [60]. It was selected for its robustness, ease of parametrisation and good performance [29,30,[61][62][63][64]. The Random Forest library used is the one provided by Shark [65], available through the supervised classification framework of Orfeo Toolbox 6.7 [46]. The number of trees was set to 100, the maximum depth to 25 and max-features to the number of features.
The pixel classification procedure was fully automated with the Iota2 processing chain developed by CESBIO and available as free software [https://framagit.org/iota2-project/iota2.git]. Each classification was evaluated using the reference dataset, which was randomly divided into two parts: 50% of the plots were used for the training phase and 50% for validation. This division avoids an optimistic estimate of classification performance, as it ensures that correlated pixels from the same plot are not used for both training and validation. It was performed anew for each run. RF classifiers can perform poorly when the number of training pixels per class is unbalanced. To address this problem, the maximum number of training pixels per class was limited to 10,000, which also reduces the training time of the classifier. For the classes with fewer pixels available, all the pixels in the training set were used. In our case, the soybean classes (irrigated and rainfed) had fewer than 10,000 training pixels because this crop is poorly represented in the territory.
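The split-and-cap training protocol can be sketched as follows. scikit-learn's `RandomForestClassifier` is used here only as a stand-in for the Shark RF called through OTB/iota2 (with the same tree count and depth), and the function name `train_irrigation_rf` is ours.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_irrigation_rf(X, y, plot_ids, seed=0, max_pixels_per_class=10_000):
    """Split 50/50 at the *plot* level (so pixels of one plot never appear
    in both sets), cap each class at 10,000 training pixels, then fit a
    100-tree Random Forest with maximum depth 25."""
    rng = np.random.default_rng(seed)
    plots = np.unique(plot_ids)
    rng.shuffle(plots)
    train_plots = plots[: len(plots) // 2]
    train = np.isin(plot_ids, train_plots)

    # Cap the number of training pixels per class at max_pixels_per_class.
    idx = np.flatnonzero(train)
    keep = []
    for cls in np.unique(y[idx]):
        cls_idx = idx[y[idx] == cls]
        if len(cls_idx) > max_pixels_per_class:
            cls_idx = rng.choice(cls_idx, max_pixels_per_class, replace=False)
        keep.append(cls_idx)
    keep = np.concatenate(keep)

    rf = RandomForestClassifier(n_estimators=100, max_depth=25)
    rf.fit(X[keep], y[keep])
    return rf, ~train   # the classifier and the validation-pixel mask
```

Splitting by plot rather than by pixel is the key design choice: neighbouring pixels of one field are highly correlated, so a pixel-level split would leak training information into the validation set.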

Scenarios
Several scenarios were evaluated. The total number of features over the entire study period of each scenario is given in Table 4:
- Scenario 1 (cumulative): SAR only
- Scenario 2 (cumulative): Optical only, 24 features
- Scenario 3 (cumulative): Optical and SAR, 48 features
- Scenario 4 (cumulative): Optical, SAR and rainfall data, 56 features
- Scenario 5 (not cumulative): 10-day Optical and SAR, 385 features

Validation
Each metric was averaged over the five runs to obtain a mean value and a confidence interval per scenario and to measure robustness against sample selection noise. The global performance was evaluated using the Kappa coefficient (1) and the Overall Accuracy (OA). The Kappa coefficient expresses the relative difference between the observed agreement P_o and the agreement P_e expected from a random classifier. The Overall Accuracy (OA) is the total number of correctly classified pixels divided by the total number of validation pixels.

Kappa = (P_o − P_e) / (1 − P_e), (1)

with P_o = (1/n) Σ_{i=1}^{r} n_ii and P_e = (1/n²) Σ_{i=1}^{r} n_i+ n_+i, where n is the total number of pixels, r the number of classes, n_ii the diagonal entries of the confusion matrix, and n_i+ and n_+i its row and column totals. The performance of the different scenarios was evaluated using the reference datasets. Table 5 shows the number of pixels used for learning and validation of the classification model for each class. In view of the strong imbalance between the classes, the overall accuracy may be biased by the majority classes. To limit this bias, we decided to use another metric at the class level.
The performance of each class was evaluated using the Fscore (2):

Fscore = 2 × (Accuracy × Recall) / (Accuracy + Recall), (2)

where Accuracy is the ratio between the correctly classified pixels and the sum of all pixels classified as this class, and Recall is the ratio between the correctly classified pixels and the total number of reference pixels of that class.
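All three metrics can be computed directly from the confusion matrix; a minimal sketch (the function name is ours):

```python
import numpy as np

def classification_metrics(cm):
    """OA, Kappa and per-class Fscore from a confusion matrix where
    cm[i, j] = number of pixels of reference class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    po = np.trace(cm) / n                        # observed agreement = OA
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
    kappa = (po - pe) / (1 - pe)
    accuracy = np.diag(cm) / cm.sum(axis=0)      # per-class "Accuracy" (precision)
    recall = np.diag(cm) / cm.sum(axis=1)
    fscore = 2 * accuracy * recall / (accuracy + recall)
    return po, kappa, fscore

# Toy two-class example: 80 correct pixels out of 100
oa, kappa, f = classification_metrics([[40, 10], [10, 40]])
print(oa, kappa, f)   # OA = 0.8, Kappa = 0.6, Fscore = [0.8, 0.8]
```

Note that with balanced classes Kappa and OA move together, but under the strong class imbalance described above Kappa drops faster, which is why both are reported.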
The central processing unit (CPU) time and allocated Random Access Memory (RAM) were also analysed in order to define a trade-off between classifier performance and use of computational resources. This evaluation was carried out for the two steps of the classification process: the model learning and the classification steps.

Confidence Map
The map of irrigated areas contains for each pixel the label selected by the classifier, which is a majority vote on the labels selected by all the decision trees in the forest. The probability of each class can be estimated as the proportion of trees in the forest that chose that label. It is therefore possible to associate a confidence to the decision using the probability of the majority class. Although this confidence estimate may be valuable to the user, it should be kept in mind that it is an estimate of the classifier itself, and may therefore be erroneous. The correlation between confidence and classification quality was assessed by Inglada et al. [47].
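With a tree ensemble, this confidence can be read from the vote distribution. A sketch using scikit-learn as a stand-in for the OTB classifier: its `predict_proba` averages the per-tree class probabilities, which matches the vote fraction when the trees are grown to pure leaves, and only approximates it otherwise.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def confidence_map(rf, X):
    """Per-pixel label and confidence: the confidence is the (approximate)
    proportion of trees agreeing with the majority-vote label."""
    proba = rf.predict_proba(X)                 # shape (n_pixels, n_classes)
    labels = rf.classes_[proba.argmax(axis=1)]  # majority-vote label
    confidence = proba.max(axis=1)              # fraction of agreeing trees
    return labels, confidence
```

A pixel where the forest splits evenly between the irrigated and rainfed labels thus gets a confidence close to 0.5, flagging it as unreliable on the map.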

Postprocessing
A regularisation was applied to the final classification in order to remove isolated pixels. This procedure filters the input labelled image using majority vote in a 3 × 3 neighbourhood. The majority vote takes the most representative value of all identified pixels and then sets the centre pixel to this majority label value [46]. The winter crops were also masked using the RPG of the classified year.
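The 3 × 3 majority-vote regularisation can be sketched with SciPy; this is a stand-in for the OTB filter actually used, and ties are broken here by taking the smallest label:

```python
import numpy as np
from scipy import ndimage

def majority_filter(labels):
    """3x3 majority-vote regularisation of a label image: each pixel is
    replaced by the most frequent label in its 3x3 neighbourhood."""
    def vote(window):
        values, counts = np.unique(window.astype(int), return_counts=True)
        return values[counts.argmax()]   # tie -> smallest label
    return ndimage.generic_filter(labels, vote, size=3, mode="nearest")

# A single isolated pixel (label 9) is removed:
img = np.zeros((5, 5), dtype=int)
img[2, 2] = 9
print(majority_filter(img))   # all zeros
```

Homogeneous regions are left untouched; only isolated pixels (speckle-like classification noise) are relabelled to match their neighbourhood.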

Performance of Each Scenario
The global performances of the classifications are shown in Figure 5. The best results are obtained with the 10-day scenario (Scenario 5). The classifications with cumulative indices lead to performances slightly inferior to the 10-day classifications (OA = 0.78 in 2017), while significantly reducing the RAM usage and CPU time, as shown in Table 6. RAM usage is reduced by a factor of 2 for the learning phase. The CPU time is reduced by a factor of 2 for the learning step and by a factor of 4 for the classification step. Therefore, Scenario 4 may be considered a good trade-off between accuracy and computational resources. However, we should note that the Kappa values (≈0.6) are quite low even for the best configuration. For the further analysis, we retained the scenarios combining optical and radar features (Scenarios 3, 4 and 5), as they were expected to be more robust to varying meteorological conditions. Figure 6 shows the Fscores for these three scenarios. The best performances are observed for sunflower for both years (Fscore > 0.9), irrigated maize in 2017 (Fscore > 0.85) and irrigated soybean in 2018 (Fscore > 0.68). The lowest scores are observed for irrigated soybean in 2017 (0.28 < Fscore < 0.4). Intermediate Fscores (0.4 to 0.8) are observed for rainfed maize and rainfed soybean. The same trends are observed for all scenarios.

Fscore Results
Except for sunflower, adding rainfall to the cumulative method (Scenario 4 compared to Scenario 3) improves the Fscore and brings performance comparable to that of the 10-day scenario (Scenario 5).

Analysis of Confusion Between Classes for Irrigated Crops
The confusion between classes was analysed for the three scenarios. Figure 7 illustrates the percentage of confusion between classes for both years, i.e., the percentage of prediction error.
For both years and regardless of the scenario, confusions are observed between practices (irrigated and rainfed) for a given crop, but little confusion occurs between crops. The strongest confusions (30% to 50%) are observed in 2017 between irrigated and rainfed soybean (Figure 7b,d), consistent with the Fscore values (Figure 6). The lowest confusions are observed between irrigated and rainfed maize for both years, with values varying from 10% to 30% on average (Figure 7a,c).
Moreover, the use of rainfall data reduces the confusions for all classes. Indeed, the percentages of confusion are almost systematically lower for Scenario 4 than for Scenario 3. Figure 8 shows the confidence index for each pixel for Scenarios 3 (Figure 8a) and 4 (Figure 8b) in 2017. The best confidence for all classes combined is obtained with Scenario 4 for both years. Indeed, the average confidence is 69.5% ± 8% for Scenario 4 compared to 63% ± 10% for Scenario 3 in 2017, and 72.10% ± 7% versus 63% ± 10% in 2018. Figure 8 also shows that some areas, such as the northeast, still have low confidence despite the addition of rainfall; these areas remain difficult for the model to classify. Table 7 shows the average percentage of confidence for each class. Irrigated and rainfed maize and sunflower have the highest confidence for both years, while irrigated and rainfed soybean have the lowest, also for both years. Table 8 shows the total areas by crop provided by Scenarios 4 and 5 compared to the RPG for the entire study area. The estimated areas were similar to the RPG, with average errors per class not exceeding ±8% for Scenario 5; the errors were higher for Scenario 4. Maize is the crop whose area is best estimated by both scenarios, with an average error of ±2%. In contrast, soybean has an average error of ±15%, with the largest gap observed in 2017 with Scenario 4. Scenario 5 produced better performances.

Optical or/and Radar Features
Scenarios 1 and 2, based on a single source of radiometric information, show the worst performance in terms of OA and Kappa coefficient (between 0.3 and 0.49). The low Fscore obtained with the radar data only is due to the large variations in the SAR signal and the lack of spectral information, which do not allow good discrimination between classes. Indeed, the SAR polarisations result from multiple contributions (rain, vegetation and tillage), leading to classification errors. The results obtained by this scenario agree with those obtained by Ferrant et al. in the Telangana province in South India, with a Kappa coefficient between 0.3 and 0.49 depending on the year [35]. The poor performance of Scenario 2 is due to the low number of features, caused by cloud cover (37% in 2017 and 31% in 2018) and the absence of the S2B sensor in 2017, which can make crop growth detection difficult, as frequently observed [9,10,33,35,36].
Thus, such "single-source" scenarios are highly dependent on the number and quality of the available data. Figure 5 reveals that the synergy of optical and radar data limits the impact of cloud cover, with a gain in all metrics for both years. Similar results were found in the literature, with a significant gain in performance for the detection of irrigated areas in India (more than 74% gain in Fscore and 0.20 in Kappa coefficient when the synergy of the two sensors is used) [35] and in Northern Spain, with a 5% increase in overall accuracy compared to using Sentinel-1 data alone [37].

Impact of Cumulative Indices
Dealing with the huge number of available images requires optimised computing methods. To relax this constraint, we evaluated the performance of classifications with a reduced number of indices. The results show that cumulative indices lead to performances for maize (irrigated/rainfed) similar to those of the full 10-day features (Figure 5), meaning that there is redundant information when all spectral bands and dates are used. The lower Fscores observed for the minority classes (soybean and rainfed maize) are due to a deficit of learning samples. Indeed, the low number of reference data for these classes does not allow the whole spatial and temporal variability of these classes to be characterised, and consequently leads to a high degree of confusion, as shown in Figure 7. However, Scenario 5 seems less sensitive to the size of the reference dataset. With the addition of learning data for soybean and rainfed maize in 2018, the results improved, with lower confusion (Figure 7) and higher Fscores. This link between the performance of classification methods and the number of reference data has been raised by Pelletier et al. [64]. For sunflower, there is a decrease in Fscore and confidence between 2017 and 2018, which is explained by a smaller sample area in 2018 (only 40 ha, compared to 120 ha in 2017). Nevertheless, this small sample area does not have a significant impact on the results, as the phenology of this crop is very different from that of the other crops studied, as shown in Figure 3.
Far fewer computational resources were needed for the cumulative approach, as shown in Table 6. Classifications based on cumulative indices (56 features) lead to performances similar to those using the 10-day features (385 features), while reducing the consumption of computing resources by a factor of 4.

Contribution of Rainfall Features
The addition of rainfall data to the process slightly enhances the performance of the cumulative method, as illustrated by the Fscore (Figure 6) and the confidence (Section 4.4 and Table 7).
Indeed, adding these data improves the separability of the classes (Figure 7) and mitigates the possible labelling noise present in the reference dataset. Nevertheless, rainfall data seem to lose importance during rainy years, as illustrated by the results on irrigated maize, which show a decrease in the Fscore between 2017 and 2018. This loss of efficiency can be explained by similar canopy growth dynamics between irrigated and rainfed crops during rainy years, when rainfed crops are not subject to the water stress that limits their phenological development during drought periods. These results also suggest that rainfall data could be used to discriminate rainfed and irrigated crops in near-real-time approaches [37]. Moreover, we note that Scenario 4 is slightly faster than Scenario 3 during both the learning and the application (classification) phases of the model. This speed-up is explained by the addition of rainfall data in the classification process, which seems to simplify the classifier's choice when assigning a class to a given pixel.
The analysis of the confidence map illustrated in Section 4.4 confirms these results, as the best confidence values are observed for Scenario 4 (Figure 8) for both years. Nevertheless, the contribution of meteorological data does not significantly increase the confidence for the minority classes, as shown in Table 7. There are still areas where uncertainty remains high, as illustrated in the inserts of Figure 8. As rainfall exhibits spatial heterogeneity, all the ambiguities between irrigated and rainfed plots might not be removed. Indeed, the rainfall data used in our study have a low spatial resolution (8 km), which can increase confusion in the distinction between irrigated and rainfed plots.

Conclusions
The objective of this study was to establish a methodology for detecting irrigated and rainfed crops in temperate areas, using monthly cumulative optical vegetation indices and SAR polarisations together. The use of these cumulative indices makes it possible to take into account discrepancies in canopy development (speed and amplitude) between the various crops and practices (irrigated and rainfed), while retaining the spectral information. Classifications with radar only, or optical only, show poor performances (Kappa and OA < 0.5) caused by the lack of spectral information, which does not allow good discrimination of the classes. The combined use of optical and radar features gives excellent results for irrigated maize (Fscore > 0.80), which represents 80% of the summer crops in the area. The results were worse for soybean (Fscore < 0.60), especially in 2017, which is explained by the lack of in situ reference data, partly due to the low representativeness of this crop in this territory (9%), making in situ collection difficult. In order to overcome this data constraint and to be able to extend the study area, the use of an eco-climatic spatial stratification could be envisaged, as illustrated by Inglada et al. [47]. Spatial stratification would allow an improvement in performance as well as better sampling of the minority classes.
The contribution of rainfall data was also evaluated in the course of this work. The addition of these data allows a significant improvement of the Fscore for the irrigated and rainfed classes and a reduction of the confusion between classes during dry years. However, the low spatial resolution of the rainfall data used in our study (8 km) can lead to high uncertainties, especially in areas with strong rainfall heterogeneity. To limit these uncertainties, the use of data with a finer spatial resolution, such as the AROME [66] or COMEPHORE [67] data distributed by Météo-France, could be considered in future work. Our study also reveals that the use of monthly cumulative indices leads to performances similar to those of gap-filled images every 10 days while reducing the need for computing resources by a factor of 4. The cumulative method would therefore seem to be the best choice for an operational application, maintaining the best classification performances while reducing the need for computational resources. The approach developed is valuable for cereal crops in temperate climates, and it might also be valuable for semi-arid areas, where the contrasts between irrigated and rainfed crops are larger; however, this needs to be confirmed by further studies. The study could also be extended to in-season detection, as proposed in Demarez et al. [36], who revealed that the use of Landsat-8 and Sentinel-1 images allowed early detection of irrigated crops.