Monitoring Watering Practices on European Rice Paddy Fields With Sentinel-1: Characterizing European Rice in Multitemporal SAR Sequences for mapping and traceability purposes.

Abstract: Whereas a vast literature exists on satellite-based mapping of rice paddy fields in Asia, where most of the global production takes place, little has been produced so far that focuses on the European context. Detection and mapping methods that work well in the Asian context will not offer the same performance in Europe, where different seasonal cycles, environmental contexts, and rice varieties produce distinctive features that differ from the Asian case. In this context, water management is a key clue; watering practices are distinctive for rice with respect to other crops, and within rice there exist diverse cultivation practices including organic and non-organic approaches. In this paper, we focus on satellite-observed water management to identify rice paddy fields cultivated with a traditional agricultural approach. Building on established research results, and guided by the output of experiments on real-world cases, a new method for analysing time series of Sentinel-1 data has been developed, which can identify traditional rice fields with a high degree of reliability. This work is part of a broader initiative to build space-based tools for collecting additional pieces of evidence to support food chain traceability; the whole system will consider various parameters, whose analysis procedures are still at an early stage of development.


Introduction
At the global level, rice paddy fields account for about 12% of global cropland area and provide staple food to roughly half of the Earth's population [1]. In fact, rice provides more calories for human consumption than any other cereal crop [2]. Moreover, water demand and sequestration [3] plus methane emissions [4] [5] from such fields have a remarkable impact on the overall environmental balance. The fact that most rice production takes place in Asia [1] has probably contributed to the wealth of scientific results on space-based mapping of rice paddy fields in Asian contexts, covering lowlands [6], highlands [7] and mixed areas [8], in warmer [9] [10] and colder climates [11]. Spaceborne remote sensing is even used for estimating the transplantation period [12], time trends [11] and crop height [13], and for phenology monitoring [14]. Significantly fewer research results are reported for the European context, although more papers have started to appear in recent years, addressing e.g. phenology-based mapping [15], time-series-based classification [16], and growth monitoring [17]. Given the different weights and contexts, however, the priorities are different. Asian rice production has a significant role in the overall energy, environmental and food budgets, so large-scale mapping and monitoring is the prevailing topic; European rice lacks such impact, and the emphasis is placed on a more detailed assessment of rice characteristics. In this context, a theme worth developing is water management monitoring based on spaceborne radar data. The availability of such data under free and open terms, including Sentinel-1 since 2014 and Radarsat Constellation Mission (RCM) [18] since 2019, is on an increasing trend.
Earth Observation (EO) radar is particularly suitable for detecting and outlining water bodies and water cover [19], so dense series of Synthetic Aperture Radar (SAR) acquisitions can lead to detailed monitoring of flooding and drying of rice paddy fields. Water management practices, of which a broad variety exists, convey significant information about the rice crops they refer to [20]. Greenhouse gas emission [21], water demand [22], and even pollutant concentration [23] are just examples of parameters that can be significantly affected by the choice of water management approach. Considering the increasing trend of organic rice growing [24] in Europe, however, there exists another possible use for water management information: a clue supporting the claimed organic status of crops, as organic practices require specific water management criteria [25]. More generally, monitoring of water management in rice paddy fields can generate additional traceability information, which is a valuable component of organic food appeal as it contributes to reassuring the consumer about the reliability of the organic status declaration. In addition to water management, a number of parameters exist that are visible from space and can help support organic claims; several have already been identified [26,27]. Apart from enhancing traceability, space-based remote sensing can provide precious support to organic farmers; organic crops are far more vulnerable than traditional ones to the emergence of weeds and pests, and to various other types of risk. Keeping crops constantly monitored from space, thanks also to the Copernicus open-data policy, possibly complemented by in-situ sensors [28], can effectively help farmers keep their delicate crops healthy and their fields productive.
To the best of our knowledge, no specific investigation has been carried out yet on the matter, before our early publications [26,27]; research on the use of satellite-based Earth Observation (EO) for cross-checking and supporting organic cultivation practices appears to be in its infancy at the moment. Albeit complex, the problem may be opening up a new application sector for research on spaceborne EO. This paper is primarily intended to start a thread of investigation on satellite-based time series of radar acquisitions for purposes of rice paddy field mapping and enhanced traceability generation, in a European context. This paper will focus on mapping of European rice using dense sequences of SAR data. The paper is organized as follows: the next chapter introduces the specific, example issue of monitoring water management in organic rice paddy fields; chapter 3 outlines the reference state of the art in detection of water from radar acquisitions on areas partly covered with vegetation, and justifies some of the choices made in the later development; chapter 4 describes the study area and its features. Next, chapter 5 reports on the development of the method, steered by the partial results obtained in preliminary tests, whereas chapter 6 offers some results and a related discussion. Finally, chapter 7 draws some preliminary conclusions on the work done so far and outlines a way forward.

Focus on water
As previously mentioned, water management is an important piece of information in terms of food traceability. The moments in time when inflow and outflow of water from the paddy field chamber take place may vary significantly for different cultivation practices; hence, by analysing the presence of water as a function of time in a field, a clue can be generated about the practice being implemented. This task is made complex by the existence of a wide range of different cultivation practices, especially when organic rice is concerned; such practices are practically all weather-dependent in one way or another, which results in complex patterns of flooding and drying. The problem can be simplified by focusing on standard (i.e. non-organic) cultivation practices and consequently on standard watering practice. The authors have previously made a preliminary investigation [29] using publicly available databases and a statistical approach; in the present paper, ground truth data was sourced from volunteer farmers or from direct inspection, and the investigation zooms into the details of the temporal trend. In terms of satellite data analysis, this is in essence a multi-temporal water mapping problem, and it should be tackled as such. An investigation of the state of the art in flooded vegetation detection with radar data has consequently been carried out. Radar sensing is the natural choice for mapping inland water bodies in general, as the latter usually generate distinctive features in radar reflectivity maps due to the low backscatter level of mirror-reflecting, calm water surfaces. Moreover, for this specific application, radar sensors offer an additional advantage over optical sensors thanks to their insensitivity to weather conditions [30]; this means that cloud cover and haze, both frequent phenomena in rice-producing areas including Northern Italy, will not interfere with data acquisition.
However, the presence of emerging rice, and possibly also of undesired weeds, complicates the matter and calls for more specific analysis. In this paper we describe how we started from the scientific state of the art in flooded vegetation monitoring to develop a method for extracting and analysing radar time series on land parcels, which allows identifying rice paddy fields managed in a traditional manner as opposed to organic. Ground truth data was offered by volunteer farmers in North-Western Italy.

State of the art in flooded vegetation detection
A first round of review on the state of the art in space-based flooded vegetation detection for purposes of rice crop monitoring and mapping was initially carried out to assess the existence of possible ready-to-use solutions. The review revealed that the majority of scientific papers published so far focus on South-Eastern Asia, where most of global rice production takes place; however, cultivation practices in that region differ from European ones for cultural, environmental and climatic reasons, and simple reuse of methods is not viable [31][32][33]. Still, interesting clues were collected, and a starting point for designing a method suitable for European domestic rice could be defined as outlined in the following. Remote sensing data can help monitor the ground surface of crops at a large scale by providing precise and timely information on the phenological status and development of vegetation [34][35][36][37]. In particular, several studies have been conducted on the use of remote sensing data for monitoring rice paddy fields [38,39]. The data used by these studies fall into three broad categories: optical-based, Synthetic Aperture Radar (SAR)-based and data-fusion-based (optical plus radar and/or ancillary data sources such as weather stations and other sensors).
Table 1 summarizes the most common radar-based state-of-the-art techniques used in rice mapping and monitoring applications, together with the corresponding literature references. A common method to map rice fields takes advantage of decision trees and/or random forest classifiers with different input features [40][41][42][43][44], relying only on SAR backscatter time series. In this type of methodology, classification is performed by a simple phenology-based decision tree. Another common approach for mapping rice fields relies on histograms [45][46][47]. In particular, histogram modes are used to identify surface water by selecting a radiometric threshold. In a post-flood event image, as in any scene with flooded and non-flooded areas next to each other, the distribution of values tends to be bi-modal. This makes water detection easier, as the two histogram modes generally represent water and non-water pixels respectively. Normally, threshold values are set at the local minimum between the two modes of a polynomial curve fitted to the histogram. Despite their effectiveness in binary mapping, especially where radar acquisition is carried out at the right time of year, these methods output a single-date rice map, lacking the temporal information which is needed in our case. Another widely used rice mapping approach is based on polarimetry. Polarimetric data may provide a reference on the actual scattering mechanism taking place in the fields. Thanks to fully-polarimetric SAR data, in fact, it is possible to analyse the double-bounce enhancement due to still water in flooded agricultural fields [32][48][49][50]. Once the coherence and covariance matrices are derived from the polarimetric dataset, decompositions such as the Freeman and eigenvector decompositions are applied in order to derive meaningful information regarding the physics behind the scattering process.

Table 1. List of state-of-the-art methods in rice mapping applications using radar data.

Study area and data

Study area and ground reference data
Among all EU countries, Italy is the biggest rice producer, covering more than 53% of the entire rice-cultivated European area; also, Italy exports more than 45% in weight of its domestic output, thus playing a primary role in the European rice market. Most of the Italian rice production takes place in North-Western Italy, with the province of Pavia (see Figure 1) alone providing just above one third of the total domestic rice production [51] thanks to its 82,000 hectares of rice paddy fields. In this area, thanks to our local collaborating farmers, we were able to identify 20 rice paddy fields and define GIS polygons marking the boundaries of each field. Ten more polygons were used to define non-rice fields and build counterexamples. The corresponding 30 GIS polygons were used for isolating responses from each single field, spatially averaging them within each polygon and composing the related time series for each field. This dataset could potentially be expanded in the future by merging in crowdsourced multitemporal information from volunteer collectors [52]. Regarding the small size of the sample, it should be remarked that obtaining reliable ground truth on the type of crop is a time-consuming task, which effectively limits the size of the final dataset. Possible approaches include in-situ inspection and direct contact with farmers. In a previous experiment [29] we used the geographic database named DUSAF 6.0 ("Destinazione d'uso dei suoli agricoli e forestali", 6th version), referring to year 2018 and developed by Lombardy region using AGEA orthophotos and SPOT 6/7 satellite images, publicly available on the web Geoportal of Regione Lombardia [53]. This database is extensive, but comparison with in-situ inspection results and visual interpretation of high-resolution multispectral satellite images raised doubts about the parcel-by-parcel correspondence between the stated and actual crop type.
The DUSAF database is still suitable for investigating on a statistical basis as in [29] but not as suitable for detailed analysis of time series. It must be noted that, unlike Southern Asia, where yearly harvests may be multiple, the agrarian calendar for this temperate climate is characterized by a single rice cropping pattern a year.

Satellite data
Reviewing the scientific literature, we found that a wide variety of SAR sensors have been employed in rice mapping applications [41,48,49][54][55][56][57][58], such as COSMO-SkyMed (CSK), Sentinel-1, Radarsat-2, TerraSAR-X, PALSAR-2, etc. Notwithstanding its inability to provide fully polarimetric data, in this work we decided to take advantage of Sentinel-1 radar data. The free and open Sentinel data policy set up under the Copernicus umbrella encourages EO data users thanks to easy access and use of the data anywhere and anytime. In particular, Sentinel-1 provides freely accessible data at both temporal and spatial resolutions fully compatible with the application we intended to develop. The Sentinel-1 SAR sensor operates in C band with a central frequency of 5.405 GHz and a right-looking antenna capable of providing a radiometric accuracy within 1 dB. The acquisition incidence angle can range from 20° to 47°. Regarding the polarization modes, Sentinel-1 can provide images acquired with VV (Vertical transmit, Vertical receive) and VH (Vertical transmit, Horizontal receive) polarization in different acquisition modes: Stripmap (SM), Interferometric Wide Swath (IW), Extra-Wide Swath (EW) and Wave (WV). In this work, we used VH-polarized images, as this polarization appears to be more sensitive to the features of rice paddy fields in comparison with VV [42,58,59]. Regarding the acquisition mode, IW Ground Range Detected (GRD) images were used, as backscatter intensity is the main source of information for the proposed application. Once the multitemporal dataset covering the entire rice growing season was acquired through the ESA Copernicus Hub, SAR backscatter time series were extracted for each single considered field. As mentioned above, each sample was computed as the average value of the Normalized Radar Cross Section σ⁰ over the entire field at the given date.
The features of the Sentinel-1 orbit and sensors at the selected acquisition mode result in yearly time series composed of 121 samples. It is worth noting that such a high number of acquisitions is related to the peculiar geographic location of the study site, lying in an area where two descending-orbit swaths of Sentinel-1A and B overlap with each other (Figure 2). The 6-day revisit on a single relative orbit would by itself result in roughly 60 samples per year; the overlapping swaths double this figure, although this comes with some caveats. Overlapping swaths have their own pros and cons. The main pro is the increased temporal frequency, whereas the main con is related to the inhomogeneities introduced into the time series. The two overlapping orbits present significantly different incidence angles over the region of interest: 33° and 43° for orbit number 66 and 168 respectively. Therefore, in order to correctly use all the available measurements, a normalization of the incidence angle must be performed prior to classification. This can be accomplished using different techniques, such as the popular cosine squared normalization [60]. Moreover, time lags between adjacent samples are not evenly spaced 3-day intervals; rather, they alternate between a 1-day and a 5-day interval, which adds to the complexity of the analysis. The time-series extraction procedure is depicted in Figure 3. Note that, for the sake of simplicity, Figure 3 does not show all the pre-processing steps applied to the downloaded images, but only those characterizing the proposed method. The classification system, described in section 5, is entirely based on the extracted SAR time series. All samples used in this work have been identified and selected both by in-situ inspections and by visual interpretation of interactive online maps. Some other ground truth data were also provided by experts of the rice supply chain.
These pieces of information have been used to assess the goodness of the classification procedure and to provide an Overall Accuracy (OA) measure.
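As an illustration of the per-field sample extraction and of the cosine squared normalization mentioned above, the following minimal Python sketch averages the pixels of one field at one date and rescales the result to a common incidence angle. It is not the implementation used in this work: the function names, the choice of averaging in linear power before converting to dB, and the 38° reference angle (roughly midway between the two orbits) are assumptions made only for the example.

```python
import math

def mean_sigma0_db(pixels_linear):
    """Spatially average sigma0 over one field (linear power) and convert to dB.

    Averaging in linear power rather than in dB is an assumption of this sketch.
    """
    mean_linear = sum(pixels_linear) / len(pixels_linear)
    return 10.0 * math.log10(mean_linear)

def normalize_incidence_db(sigma0_db, theta_deg, theta_ref_deg=38.0):
    """Cosine squared normalization of a dB sample to a reference incidence angle.

    In linear power: sigma0_ref = sigma0 * cos^2(theta_ref) / cos^2(theta),
    which becomes an additive term of 20*log10(cos(theta_ref)/cos(theta)) in dB.
    theta_ref_deg = 38 is a hypothetical reference value.
    """
    ratio = math.cos(math.radians(theta_ref_deg)) / math.cos(math.radians(theta_deg))
    return sigma0_db + 20.0 * math.log10(ratio)

# Example: the same field observed from the two overlapping orbits (33 and 43 degrees)
sample_db = mean_sigma0_db([0.08, 0.11, 0.10, 0.09])
print(normalize_incidence_db(sample_db, 33.0))
print(normalize_incidence_db(sample_db, 43.0))
```

With this convention, samples acquired at 33° are shifted slightly downwards and samples acquired at 43° slightly upwards, so that the two interleaved orbits can be merged into a single, more homogeneous series.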

The proposed method
As emerged from the state of the art in Section 3, three different approaches to rice mapping using space-borne radar data can be exploited: investigation of the scattering mechanism using polarimetric SAR datasets, decision trees based on the analysis of SAR time series, and flooded vegetation detection based on the analysis of histograms. Regarding the polarimetry-based approach, strong limitations to the envisaged practical use of the system would be posed by the high cost of fully-polarimetric datasets and the generally scarce coverage offered by this type of data. In the case of histogram-based methodologies, issues are related to the small amount of information provided by a single SAR image. By the nature of the analysis we intend to perform, single-date mapping is not a solution, as the information relevant to us is contained in time series of water floodings. For the above-mentioned reasons, we decided to develop a rule-based classification model which leverages techniques based on the investigation of SAR time series of rice fields. As already written in chapter 3, practically all papers on EO-based mapping and monitoring of rice paddy fields focus on South-East Asian countries, where cultivation practices are significantly different; for example, the majority of rice fields in Indonesia have a double rice cropping pattern: planting takes place in the wet season (October to March) and also in the dry season (April to September) [61,62]. Moreover, as irrigation water is available at any time during the year, farmers can grow rice whenever convenient [59]. Due to these differences in rice farming practices, it is not possible to simply re-use an existing rice mapping algorithm developed for Asian countries; additional research is therefore needed in order to develop a rice mapping system suitable for European rice cultivation.
A preliminary classification approach consists of comparing a SAR time-series sample with a rice "reference" time series. Such reference signal was artificially built by averaging a number of rice field radar responses, and it is assumed to represent a "prototype" response for a "typical" rice paddy field under traditional agricultural practices in Italy. In order to make the sample representative, 15 different time series of SAR backscatter on traditional rice paddy fields have been used to create the reference signal. Then, simple tools such as the Root Mean Squared Error (RMSE) and correlation coefficient (ρ) were hypothesised as comparison tools to classify rice fields by setting a threshold on a similarity measure.
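The comparison against a reference signal can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the experiments; the function names and the toy values are hypothetical, and all series are assumed to be sampled at the same dates.

```python
import math

def reference_series(series_list):
    """Point-wise average of several equally sampled time series (the 'prototype')."""
    n = len(series_list)
    return [sum(values) / n for values in zip(*series_list)]

def rmse(a, b):
    """Root Mean Squared Error between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def pearson(a, b):
    """Pearson correlation coefficient; undefined (division by zero) for a constant series."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy example: a test series with the same shape as the reference, shifted by 2 dB
ref = reference_series([[-12.0, -16.0, -10.0, -9.0], [-14.0, -18.0, -12.0, -11.0]])
test = [v + 2.0 for v in ref]
print(rmse(ref, test), pearson(ref, test))
```

In this toy case the constant 2 dB offset yields an RMSE of 2 dB while the correlation coefficient stays at 1, which shows how differently the two measures react to a pure vertical displacement.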
SAR time series on rice fields may have, for example, different mean backscatter intensity values for several reasons, such as how and when the field is prepared to accommodate rice seeds, the sowed variety, the length of the growth cycle and many other environmental conditions. Even if rice field samples preserve their typical radar response, vertical displacements between two compared time series, caused by the above-mentioned reasons, could lead to classification errors if RMSE alone is used as a similarity metric. An RMSE metric, indeed, only accounts for point-wise displacements between two time series and not for the "overall similarity" of the trends. Despite the good stability of the reference trend (Figure 4a), differences on each single sample tend to bear little connection with the crop type; non-relevant features such as different mean values shifting the overall time series upwards or downwards, even by a small amount, can generate enough accumulated RMSE to misclassify a genuine conventional rice paddy field. Moreover, considering the situation in Figure 4, the reader can note that in this case the rice and non-rice field time series feature very similar mean values (around -10 dB). Notwithstanding the obvious differences between the samples, this translates into a low RMSE, which leads a simple, threshold-based classifier to mistakenly identify the non-rice sample as an actual rice field. For these reasons, the RMSE indicator may not be the best choice as a comparison tool, given its sensitivity to amplitude variations between signals, even with similar overall behaviour. Regarding the use of the correlation coefficient, the major advantage is that it normalizes the variance of the compared signals to 1.
In fact, contrary to the RMSE indicator, the correlation coefficient can actually evaluate the "shape along time" of the SAR time series and will disregard changes in amplitude values that are due to non-relevant factors (e.g., weather, sowed variety, soil conditions, etc.). Even the correlation coefficient, however, suffers from limitations linked to the noisy nature of Sentinel-1 data. Both the reference and test field signals present residual high-frequency noise caused by speckle and by incidence angle variations [60,63,64], surviving spatial averaging (plus, in the case of the reference time series, inter-series averaging). This frequently leads to a situation where the two signals have a substantial number of corresponding samples with similar (or contrary) off-average displacements by pure effect of noise, translating into a substantially increased likelihood of wrong classifications. Experiments with correlation have indeed reported disappointing levels of Overall Accuracy (OA), i.e., around 80%. Such poor classification performance, largely caused by high-frequency noise, can be improved through low-pass filtering. This step was implemented, as shown in Figure 6, resulting in a visible increase in OA. The cutoff frequency of the filter was determined by analyzing the magnitude of the time series frequency spectrum, reported in Figure 5. In particular, the observation of a quick decay in amplitude between 0 and 0.02 [×π rad/sample], followed by a plateau and another significant decay after 0.1 [×π rad/sample], suggested that a simple 5th-order low-pass Butterworth filter with normalized cutoff frequency f_cut = 0.1 [×π rad/sample] could be suitable to suppress noise. This cutoff frequency was confirmed suitable by experiment, and it is considered a good solution because it suppresses high-frequency noise components while still preserving the relevant traces of the rice plant phenology.
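A filter with these parameters can be sketched with SciPy as shown below. The sketch assumes NumPy and SciPy are available; the zero-phase, forward-backward application through filtfilt is an assumption of this example (the text does not state how the filter was applied), and the synthetic 121-sample series is purely illustrative.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_series(sigma0_db, order=5, cutoff=0.1):
    """Low-pass a backscatter time series with a Butterworth filter.

    `cutoff` is normalized to the Nyquist frequency, i.e. expressed in
    units of pi rad/sample, which matches the 0.1 value quoted in the text.
    filtfilt runs the filter forwards and backwards (zero phase delay);
    this choice is an assumption of the sketch.
    """
    b, a = butter(order, cutoff, btype="low")
    return filtfilt(b, a, np.asarray(sigma0_db, dtype=float))

# Synthetic yearly series: slow seasonal trend plus sample-to-sample fluctuation
n = np.arange(121)
slow = -12.0 + 3.0 * np.sin(2.0 * np.pi * n / 121)
noisy = slow + 0.3 * (-1.0) ** n
smooth = lowpass_series(noisy)
print(float(np.max(np.abs(smooth[40:80] - slow[40:80]))))
```

Note that the sketch treats the samples as evenly spaced, whereas the actual acquisitions alternate 1-day and 5-day intervals; resampling to a regular grid before filtering would be one possible way to handle this.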
Thanks to this type of processing, which allowed extracting the envelope of the time series with the high-frequency noise suppressed, the classification accuracy increased. On the other hand, the number of false positive occurrences also increased to a barely acceptable level. In order to make results actionable, a reduction in the number of false positives is necessary. This can be achieved by leveraging a feature of traditional rice crops that had been little used up to that point: traditional rice paddy fields are flooded only once, in a specific time window occurring between April and May. Figure 7 shows a typical conventional rice field SAR time series which has been low-pass filtered. After an in-depth investigation of rice field SAR responses, we found that a local minimum between April and May in the filtered radar reflectivity sequence was a discriminating feature for conventional rice fields. We then designed an algorithm able to detect the presence of such a minimum, together with the neighbouring maxima associated with the stages of ploughing (before flooding) and plant emergence (late stages of flooding). Both ploughing and emergence locally increase the apparent roughness of the observed surface at C-band wavelengths, shifting the reflection type from mirror to diffuse, and thus increasing backscatter [65]. An assessment of mutual distances among such salient points against the operational calendar for traditional rice allows identifying compatible behaviours and suppressing most false positives generated by the previous classification step. It is also worth noting that the two ripples adjacent to the local minimum in Figure 7 are not due to the overshooting effect of the filter. Rather, such ripples represent two physical events occurring on the rice paddy field which translate into an increase of the field roughness: the tillage and plant emergence phases.
The rule designed for classifying a rice paddy field is described in the following. Referring to the simplified scheme of the classification methodology reported in Figure 8, if one of the two computed distances is not within the correct range of values, the classification result is uncertain and the output of the algorithm is a maybe label. If both distances are non-compliant, the output of the system is a non-rice label. Therefore, in order for a field to be fully classified as rice, all the decision nodes must provide a positive flag. Note that the output of the proposed system is not simply based on a score of positive flags. In fact, more important features are prioritized with respect to others, and the result is made less simplistic by including a maybe classification result.
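One possible reading of this rule is sketched below in plain Python. Several points are assumptions made only for the example: the two "distances" are taken to be the backscatter differences in dB between the April-May minimum and the neighbouring maxima before and after it, a distance is considered compliant when it reaches a threshold d_thr_db, and a series without any local minimum in the window is labelled non-rice. The function name and the toy series are hypothetical, and the prioritization of decision nodes shown in Figure 8 is not reproduced.

```python
from datetime import date

def classify_field(dates, sigma0_db, d_thr_db=4.0):
    """Label a low-pass-filtered series as 'rice', 'maybe' or 'non-rice'."""
    year = dates[0].year
    start, end = date(year, 4, 1), date(year, 5, 31)
    # Local minima falling inside the April-May flooding window
    candidates = [
        i for i in range(1, len(sigma0_db) - 1)
        if start <= dates[i] <= end
        and sigma0_db[i] < sigma0_db[i - 1]
        and sigma0_db[i] <= sigma0_db[i + 1]
    ]
    if not candidates:
        return "non-rice"  # assumption: no flooding signature at all
    i_min = min(candidates, key=lambda i: sigma0_db[i])
    # Walk uphill on both sides to reach the neighbouring maxima
    left = i_min
    while left > 0 and sigma0_db[left - 1] >= sigma0_db[left]:
        left -= 1
    right = i_min
    while right < len(sigma0_db) - 1 and sigma0_db[right + 1] >= sigma0_db[right]:
        right += 1
    d_before = sigma0_db[left] - sigma0_db[i_min]   # ploughing peak to flooding minimum
    d_after = sigma0_db[right] - sigma0_db[i_min]   # flooding minimum to emergence peak
    compliant = (d_before >= d_thr_db) + (d_after >= d_thr_db)
    return {2: "rice", 1: "maybe", 0: "non-rice"}[compliant]

days = [date(2020, 3, 1), date(2020, 3, 20), date(2020, 4, 10), date(2020, 4, 30),
        date(2020, 5, 20), date(2020, 6, 10), date(2020, 7, 1)]
print(classify_field(days, [-12.0, -9.0, -13.0, -17.0, -14.0, -10.0, -11.0]))
```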

Some results
The method was initially tuned using a training ground truth dataset composed of 30 samples, of which 20 were classical rice paddy fields and 10 were land parcels with various agricultural land cover classes including organic rice. Although the training set size may appear scarce, we should remember that obtaining reliable ground truth information about cultivation practices is not easy as this information is generally not publicly available. A set of 30 samples was fixed as a reasonable compromise between reliability of the information, and time and effort needed to collect it. Examples of optical satellite images on fields from the training set are visible in Figure 9. A first test was run on the same data set to confirm that the parameter set led to sensible results. The observed 92% accuracy score proved that the settings were sensible, but obviously, given the limited size of the ground truth and the re-use of the same set for training and testing, it was not sufficient to confirm usability of the method in the general case. A new test data set, with no overlap with the previously used data set, was thus generated to cross-check that the developed criterion and the parameters set could be profitably used elsewhere. The new test dataset was composed of 15 more classical rice paddy field parcels plus 10 parcels with other agricultural classes; none of these 25 parcels appeared in the training dataset. As a first approximation, the min-to-max distance threshold has been chosen by visual interpretation, after analyzing the radar responses of a number of rice fields. Therefore, in order to provide a more rigorous definition of such value, the overall accuracy has been evaluated as a function of the distance threshold. To do so, this parameter has been swept across the range 1-15 dB. The algorithm automatically classified all samples for each threshold value, and the OA value was recorded in each case. 
Such accuracy results were then plotted in a graph, visible in Figure 10.
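The threshold sweep described above can be sketched as follows. For brevity, each field is represented here only by its two min-to-max distances and its ground truth label; these tuples are invented for illustration and are not the data used in this work. The sketch also assumes that a field counts as detected rice only when both distances reach the threshold.

```python
def overall_accuracy(truth, predicted):
    """Fraction of samples whose predicted label matches the ground truth."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def sweep_threshold(fields, thresholds):
    """Return {threshold: OA} for fields given as (d_before_db, d_after_db, is_rice)."""
    truth = [is_rice for _, _, is_rice in fields]
    results = {}
    for thr in thresholds:
        predicted = [d1 >= thr and d2 >= thr for d1, d2, _ in fields]
        results[thr] = overall_accuracy(truth, predicted)
    return results

# Hypothetical distances (dB) for three rice and three non-rice parcels
toy_fields = [(8.0, 7.0, True), (6.0, 5.0, True), (5.0, 9.0, True),
              (2.0, 3.0, False), (3.5, 1.0, False), (0.5, 0.5, False)]
oa_by_threshold = sweep_threshold(toy_fields, range(1, 16))
for thr, oa in oa_by_threshold.items():
    print(thr, round(oa, 2))
```

Too low a threshold lets non-rice parcels through, whereas too high a threshold rejects genuine rice fields, so the OA curve is expected to peak at intermediate values.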
From Figure 10 it can be observed that the highest classification accuracy is reached when the min-to-max threshold value is around d_thrs = 4 dB. Setting the threshold to this value increases the OA on the training set from 92% to 100%. As a cross-check, using the same settings on the test data set again yielded 100% OA, and consequently no false positives were reported. Although they were run on test sets of limited size, our experiments appear to support that the developed method is promising in terms of spotting classical rice cultivation practices on rice paddy fields. The method needs further assessment on larger samples; this however requires a substantial amount of additional work, which could not be performed in the timeframe of this first pilot project. It is planned for the future stages of the service development.

Figure 10. Overall Accuracy as function of the distance threshold. A connecting line is added between points to increase visibility.

Conclusions
In this paper, we described a simple SAR-based methodology for mapping rice paddy fields managed with a conventional (i.e. non-organic) approach, in a European context. This is done as part of a broader action with the objective of setting up a system capable of automatically collecting information that can be used to enhance traceability of agricultural crops with the help of satellite monitoring. In the experiments conducted with volunteer farmers, we were able to spot classical rice fields thanks to precise detection of the flooding period occurring in conventional rice fields only, which translates into a typical pattern of low backscatter values between April and May, typically preceded and followed by higher backscatter values due to ploughing and emergence. Experimental results involving simple tools such as RMSE and correlation coefficient indicators showed that it is actually possible to build a classifier based on the comparison between a sample SAR time series and a reference signal without having to resort to particularly complex approaches. On a small-sized dataset, we obtained 88% and 84% OA for the RMSE-based and correlation coefficient-based algorithms respectively. Finally, our flood-based thresholding method achieved very promising results. Using the same dataset previously used for the RMSE-based and correlation coefficient methods, we achieved 92% overall accuracy. We also estimated the overall accuracy while letting the min-to-max distance vary in order to determine the optimal threshold. After such tuning, the accuracy reached 100% for a threshold value of d_thrs = 4 dB; as the most important consequence, the number of false positives dropped to zero. Unfortunately, assembling extensive ground truth is not straightforward due to the absence of accessible, up-to-date maps of rice-cultivated areas in Europe.
Our ground truth was laboriously collected through multiple interactions with different farmers, and scaling up the size of the collected body of information will represent a challenge in itself. Still, this work represents a good starting point for further investigation.