Targeted Grassland Monitoring at Parcel Level Using Sentinels, Street-Level Images and Field Observations

The introduction of high-resolution Sentinels combined with the use of high-quality digital agricultural parcel registration systems is driving the move towards at-parcel agricultural monitoring. The European Union’s Common Agricultural Policy (CAP) has introduced the concept of CAP monitoring to help simplify the management and control of farmers’ parcel declarations for area support measures. This study proposes a proof of concept of this monitoring approach, introducing and applying the concept of ‘markers’. Using Sentinel-1- and -2-derived (S1 and S2) markers, we evaluate parcels declared as grassland in the Gelderse Vallei in the Netherlands, an area covering more than 15,000 parcels. The satellite markers are based, respectively, on crop-type deep learning classification using S1 backscattering and coherence data and on detecting bare soil with S2 during the growing season. They aim to identify grassland-declared parcels (1) for which the marker suggests another crop type or (2) which appear to have been ploughed during the year. Subsequently, a field survey was carried out in October 2017 to target the parcels identified and to build a relevant ground-truth sample of the area. For the latter purpose, we used a high-definition camera mounted on the roof of a car to continuously sample geo-tagged digital imagery, as well as an app-based approach to identify the targeted fields. Depending on which satellite-based marker or combination of markers is used, the number of parcels identified ranged from 2.57% (marked by both the S1 and S2 markers) to 17.12% of the total of 11,773 parcels declared as grassland. After confirming with the ground-truth, parcels flagged by the combined S1 and S2 marker were robustly detected as non-grassland parcels (F-score = 0.9). In addition, the study demonstrated that street-level imagery collection could improve collection efficiency by a factor of seven compared to field visits (1411 parcels/day vs. 217 parcels/day) while keeping an overall accuracy of about 90% compared to the ground-truth. This proposed way of collecting in situ data is suitable for training and validating high-resolution remote sensing approaches for agricultural monitoring. Timely country-wide wall-to-wall parcel-level monitoring and targeted in-season parcel surveying will increase the efficiency and effectiveness of monitoring and implementing agricultural policies.


Introduction
Grasslands are not only an intrinsic part of European agriculture, but also essential for global bio-geochemical cycling, biodiversity and climate change mitigation (e.g., soil carbon stocks), providing a range of economic (e.g., feed) and environmental services [1]. Therefore, following the 2013 reform of the European Union's (EU) Common Agricultural Policy (CAP) to make the direct payment system more environmentally friendly with the so-called 'greening' rules, farmers can receive direct payments for maintaining permanent grasslands. For the administration and control of these farmer support requests, compliance monitoring of grassland management is a mandatory task of the EU Member States' (MS) CAP paying agencies.
New opportunities for grassland monitoring are provided by the EU Copernicus program and the observations by the Sentinel-1 and -2 satellites. The Synthetic Aperture Radar (SAR) Sentinel-1 and multi-spectral Sentinel-2 sensors are acquiring high resolution (10 m) observations with high revisit frequencies. The combination of the Sentinels' observations, which are freely accessible, and the increasingly accessible crop-type data at the parcel level enables new possibilities for EU crop monitoring. In the remainder of this Introduction, we summarize the importance of grasslands; discuss the state of the art of high-resolution remote sensing, specifically with respect to the Copernicus Sentinels and grasslands; describe the availability of open-access parcel-level crop type declarations in the EU; highlight the continued need for in situ data and field observations; and discuss recent opportunities provided by street-level imagery.

The Importance of Grasslands
Grasslands are key biotopes in climate change mitigation as they store approximately 34% of the global stock of carbon in terrestrial ecosystems [2]. Indeed, enhancing soil carbon sequestration is seen as having the highest greenhouse gas mitigation potential in the agriculture sector [3], and the Intergovernmental Panel on Climate Change (IPCC) provides methods for greenhouse gas inventories for grassland [4]. Unlike forests, where vegetation is the primary source of carbon storage, most of the grassland carbon stocks are in the soil. Cultivation and the effect of urbanization on grasslands, as well as other modifications of grasslands such as desertification and livestock grazing, can result in significant carbon emissions (e.g., [5]). Grasslands provide habitats for plants and animals, from soil microfauna to large mammals, and also support large numbers of wild herbivores that depend on the biotope for breeding, migratory and wintering habitats, sharing the land with domesticated herds. Grasslands also support large numbers of domesticated animals such as cattle, sheep and goat herds, horses and water buffalo, which are the sources of meat, dairy products, wool and leather products. Grasslands in Europe have been declining since the 1960s. Given the importance of grasslands, policies were developed to counteract this decline. The permanent pasture ratio was introduced as part of cross compliance in 2004. This required MS to keep the decline of the ratio of permanent grassland area to total agricultural area below 10%. The maintenance of permanent grassland, as part of the greening, strengthened this requirement to 5% and required identifying the most environmentally-sensitive grasslands and protecting them from ploughing [6].

High-Resolution Remote Sensing Grasslands Monitoring
Grasslands vary greatly in their degree and intensity of management, from extensively-managed rangelands to intensively-managed pasture and hay land that are fertilized and irrigated. The huge variety in grassland typology provides a challenge for any remotely-sensed-based classification scheme.
Fortunately, the Copernicus program has reached operational status in the provision of full, free and open access to Sentinel satellite products for any user. In constellation, i.e., when combining the identical A and B platforms, Sentinel-1 (S1) and Sentinel-2 (S2) have a revisit capacity of six days and five days, respectively. The data have a finest spatial resolution of 10 m, which is well suited to monitoring fields at the parcel level.
Crop type mapping with Sentinels has already been demonstrated in various recent studies such as Belgiu and Csillik [7], Immitzer et al. [8] and Defourny et al. [9] using vegetation indices or reflectance. To make sense of the large Sentinel data volumes, machine learning algorithms can improve the accuracy of crop mapping. In this context, deep learning was demonstrated to be useful for many remote sensing applications (see [10] for a review). Kussul et al. [11] demonstrated the potential of deep learning for crop type classification with S1 time series in Ukraine.
In addition to grassland mapping per se, several studies demonstrate the potential for detailed grassland characterization with Sentinel observations, especially in the context of activity monitoring.
For example, recent experience has evidenced the possibility of using coherence between two S1 SAR scenes for detecting mowing events in Estonia [12], as well as for grassland cutting dates in Germany [13]. High coherence values due to backscattering from the ground were linked to ploughed bare fields and low vegetation height [14,15]. S2 has also recently demonstrated its potential ability to monitor grassland mowing in Germany [16]. Veloso et al. [17] analyzed the temporal trajectory of remote sensing data for a variety of winter and summer crops and pointed out the interest of S1 data and particularly the VH/VV ratio for crop monitoring.
To conclude, Sentinels allow the timely monitoring of grasslands using high-resolution remote sensing, but methods still have to be developed and adapted for this purpose.

CAP Policy Context
In the context of the EU's CAP direct payment management and control measures, discussions are ongoing on simplification, in particular the substitution of the On The Spot Checks (OTSC) by a system of monitoring. OTSC (for OTSC guidelines, see [18]) are based on a limited sample (5%) of farmer declarations that are checked with either a pre-defined selection of several high resolution images or via field inspection and measurement. Monitoring refers to 'a procedure based on regular and systematic observation, tracking and assessment of the fulfillment of eligibility conditions and agricultural activities over a period of time, which involves, where and when necessary, appropriate follow-up action' [19]. To this end, an implementing regulation (Regulation (EU) No. 2018/746 [20]) was recently adopted (May 2018), allowing MS to switch from OTSC to monitoring for their controls starting from 2018. Several MS are considering switching from OTSC to monitoring, but no operational official monitoring has yet been carried out. Therefore, providing scientific and technical guidance for the implementation of the monitoring and marker approach is essential.
Monitoring applies to the full population of farmer declarations and relies primarily on continuous S1 and S2 time series. In essence, the monitoring approach determines whether the time series characteristics (i.e., behavior) of an agricultural parcel are in line with the crop and activities declared under a particular scheme for which aid is requested. For example, basic area payments will require evidence of an agricultural activity; crop diversification requires determination of the crop types; and greening payments may require detection of crop management activities (e.g., mowing). These concepts are captured in the definition of a marker, which is 'a unique collection (combination) of property values of a data signal that evidence the presence of a particular continuous state or a change of state of the land phenomenon' [19], where land phenomena reflect the characteristic phenology of a crop or evidence a mechanical practice (e.g., ploughing, mowing, harvesting). Where appropriate, the absence of a marker, or the assignment of a marker relating to a crop or activity other than the one declared, will trigger follow-up action by the management and control authorities.

Open Access Parcel-Level Crop Type Declarations
In the EU, the Member States' Land Parcel Identification Systems (LPIS), which provide detailed digital geometries of agricultural reference parcels to aid the management of the CAP, are being released as open access data in an increasing number of regions. LPIS contain up to several million parcel boundary vectors, depending on the Member State, delineating agricultural land eligible for CAP support. Farmers use the LPIS to declare their cropping practices, including specific environmental measures where relevant, in an annual aid application. Member State administrations maintain LPIS and carry out management and control activities on the basis of the LPIS, often with the use of remote sensing [21]. The level of detail available in the various open access LPIS implementations varies across Member States. For instance, in the Netherlands [22], Austria [23] and Denmark [24], the actual parcel declarations are made public annually, whereas, for example, France and the Czech Republic publish only the LPIS reference vectors. In some Member States, LPIS are managed at a regional level (e.g., Spain, Germany, Italy). The trend to release LPIS as open access data follows the realization that they have significant potential in supporting a range of agronomic and environmental use cases in scientific, public and private use contexts.

In Situ Data Needs
The combined use of detailed reference parcel databases with deep Sentinel data stacks facilitates near-real-time information gathering of markers for large sets of agricultural parcels. A subsequent challenge is to collect in situ data to train and validate the information extraction process. Declared parcel data may not be available until late in the growing season and may not, a priori, be considered as the ground-truth. However, traditional in situ ground-truth collection lacks the scale and possibility for automated integration into big data analyses and is prone to sampling errors [25]. Moreover, it requires a huge organizational effort, making it difficult to achieve periodic resampling to assess changes in dynamic agricultural phenomena over time.

Street-Level Imagery
Citizens without professional expertise in remote sensing have become actively involved in the creation and analysis of large datasets, which is known as crowdsourcing. The rise of geospatial user-created information has been of great benefit to the collection of large quantities of reference data for land cover classification. Crowdsourcing applications such as Geo-Wiki [26] have demonstrated the potential to collect massive and high-quality in situ data.
Different platforms and business models coexist to host these growing archives. The best-known repository is the collection of street-level pictures acquired by Google Street View (GSV) through imaging devices mounted on its dedicated cars, bikes and other means of transport. Its coverage is extensive, and in many cases, each location has been captured multiple times over the years. Despite the global coverage of GSV, limitations in terms of access, contribution and date filtering through the API make it difficult to use this vast dataset for dynamic agricultural monitoring. Other platforms such as Open Street Cam, Tencent, Baidu, Mapillary, Here, Bing Streetside, and Apple provide the same type of repositories, each with their specificities. Mapillary [36], a European platform, is the first to provide open access and free-of-charge detailed street photos based on crowdsourcing [37]. In addition, it is possible to filter the data via queries through the site's API, allowing the harvesting of images for a defined time window and geographical area. Geo-tagged street-level imagery has already been tested for collecting agricultural statistics. Wu and Li [38], for example, developed their own device combining a GPS, a video camera and a GIS analysis system to collect crop type information in China, which was also used by [39] to validate paddy rice maps. GSV was combined with remote sensing in the Global Croplands project, where the Street View Application [40] permitted the collection of information on agriculture that was then used to train and validate a 30-m Africa cropland map [41]. Thus, recent improvements in street-level imagery platforms offer great potential to exploit this type of data for agricultural monitoring.

Objectives
Given the importance of grasslands and their policy relevance, along with the developments sketched above, the overall aim of this study is to evaluate the feasibility of integrating the in-season availability of parcel-level crop type information with S1 and S2 markers. These markers are designed to identify, across a large area, grassland parcels with deviating spectral characteristics that are then targeted for inspection during a one-day field survey.
The specific objectives are to evaluate: (1) whether efficient markers for grassland monitoring can be defined from a combination of Sentinel-1 SAR and Sentinel-2 multi-spectral observations; (2) the usefulness of combining cloud processing using Google Earth Engine (GEE) and deep learning (TensorFlow) to perform large-scale parcel-level marker evaluation tasks; (3) whether the targeted monitoring approach is appropriate and can scale efficiently across the whole area of more than 15,000 parcels, along with its implications for CAP monitoring; and finally, (4) whether street-level imagery can contribute to the efficiency and effectiveness of a field survey designed for the evaluation of our markers.
Computationally intensive data-driven science such as reported here needs to be completely transparent and reproducible. Therefore, a final objective of this manuscript is to make all scripts and code needed to reproduce the results, which span a variety of software packages, available to the public on the GitHub repository https://github.com/rdandrimont/AGREE (AGRicultural Enhanced Evidence). The code is summarized in Figure A1, Appendix A, and is available in an online document as Supplementary Material.

Study Area
For this study, an area in the Netherlands was chosen as a large amount of high quality geo-information is available under open access license there, including very detailed annual agricultural parcel sets. Secondly, grassland and maize are the two most important crops in terms of cultivated area in the Netherlands (with 52.0% and 13.3%, respectively). However, crop occurrence has a distinct regional pattern which is primarily linked to varying soil types. Most grassland and (silage) maize are cultivated on sandy soils, peat and poorly-drained heavy clay soils, while most other arable crops (primarily potatoes, winter wheat, sugar beet, vegetables and onions) are cultivated on well-drained clay-rich alluvial and loamy soils. The study site is the Gelderse Vallei, which is part of the province of Gelderland, and includes 7 municipalities: Ede, Wageningen, Renkum, Barneveld, Arnhem, Putten and Nijkerk (Figure 1). The region's area is 220,171 ha with 17,215 parcels totaling 28,116 ha, where the 4 main crops for 2017 are permanent grassland (62%), maize (16.2%), temporary grassland (10%) and natural grassland (1.4%). This range of grassland types represents a wide variation in management intensities and pressures resulting from other land use conversions such as urbanization, new road infrastructure, as well as nature conservation.
Figure 1. The study site is the Gelderse Vallei including 7 municipalities: Ede, Wageningen, Renkum, Barneveld, Arnhem, Putten and Nijkerk. The ground-truth points were collected during a field visit; geo-tagged street-level imagery was acquired simultaneously at 1 frame per second along the survey path.

Agricultural Parcel Database
The Netherlands' open geo-data infrastructure exposes to the public a large number of very high quality reference datasets, using various protocols (e.g., WMS, WFS, atom downloads). For this study, the "Basisregistratie Gewaspercelen" (BRP) datasets are the most relevant [22]. BRP is derived from the Land Parcel Identification System (LPIS) [21]. The LPIS itself is maintained and updated on the basis of the digital topographic map at scale 1:10,000 and up-to-date aerial orthophotos. Applicants use the LPIS parcels to indicate the location and total area under cultivation of a particular crop and, where relevant, specific environmental or management practices eligible under support schemes. The applications are then digitized by the relevant authorities for subsequent administration and control of support payments. At the end of the season, a consolidated set is released in the public domain. BRP datasets comprise around 770,000 parcels with a total area of approximately 1.5 million hectares and have been available since 2009. In this study, we work with the BRP concept version for 2017, which was released on 15 July 2017 (BRP2017).

Sentinel-1
The S1 constellation consists of 2 identical C-band active microwave Synthetic Aperture Radar (SAR) low-Earth orbiting platforms operating at 5.6 GHz (5.4 cm wavelength, [42]), with an effective revisit of 6 days. For the global land surface, the dual polarization (VV, VH) Interferometric Wide swath (IW) mode is the default operating configuration. Level 1 products are produced both as Single Look Complex (SLC) and Ground Range-Detected (GRD) outputs and are downloadable, usually within a few hours after sensing, from the Copernicus Open Access Hub data access points [43]. GRD products need to be processed to calibrated, geo-coded backscattering coefficients (σ⁰), which can be done with the European Space Agency's SNAP Sentinel-1 toolbox, which is released as open source software [44]. Each scene is subjected to the following processing steps: thermal noise removal, radiometric calibration and terrain correction. The Google Earth Engine [45] platform already collects all GRD data, runs them through the toolbox and then ingests the geo-coded output, log-scaled and quantized to 16-bit unsigned integers, into the COPERNICUS/S1 catalog. In the study area, and for the considered period from 1 January 2017 to 1 August 2017, 36 descending and 34 ascending orbits were acquired.

Sentinel-2
The Sentinel-2 MultiSpectral Instrument (MSI) Level-1C data are loaded for the period of interest, from 1 January 2017 to 1 August 2017, and converted to top-of-atmosphere reflectance. Over this period and the study area, which is fully contained in the Sentinel-2 reference grid tile 31UFT, 44 S2 image acquisitions are available, including cloudy observations. For the study area, S2-A was mostly used, because S2-B data were only available starting from 30 June 2017.

Methods
A key objective of this study is to use satellite markers to identify parcels declared as grassland that are potentially not grasslands. Thanks to the ability of S1 to retrieve data independently of atmospheric conditions, a consistent, gap-free time series is available at the parcel level, making these data particularly suitable for deep learning classification. For S2, an approach based on a normalized index for bare soil event detection provides discrete evidence of grassland conversion. Hereafter, the methodology applied to the Copernicus data is described. We have implemented our methods in code that can be deployed in Google Earth Engine (GEE), run in Python or executed as recipes in the open source Sentinel-1 toolbox. All relevant code is openly available to the public on a GitHub repository (see Appendix A for details). The processing steps for S1 backscattering are described in Section 2.3.1 (Figure 2a), the methodology applied for S1 coherence in Section 2.3.2 (Figure 2b) and the methodology applied for S2 in Section 2.3.4 (Figure 2c).

S1 Backscattering
The processing chain for S1 backscattering is summarized in Figure 2a and starts by loading the 10-m resolution S1 VV and VH GRD time series. After edge masking and conversion from dB to natural (linear) units, 7-day average σ⁰ temporal signatures are extracted for each parcel, internally buffered by 10 m to avoid boundary pixels. In this way, we generate a consistently-timed, equally-sized set of temporal features to use as input for the subsequent machine learning runs, independent of the number of actual S1 coverages of each individual parcel. We extract a CSV formatted table from GEE with the weekly averages for both the VV and VH polarized bands for each parcel.
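The weekly averaging step can be illustrated outside GEE with a minimal plain-Python sketch. It assumes that the parcel-level backscatter values have already been extracted as (date, dB value) pairs for one polarization; the function names `db_to_natural` and `weekly_means` are hypothetical and are not taken from the published scripts.

```python
from collections import defaultdict
from datetime import date


def db_to_natural(value_db):
    """Convert a backscatter value from decibels to natural (linear) units."""
    return 10.0 ** (value_db / 10.0)


def weekly_means(observations, start, n_weeks):
    """Average natural-unit backscatter into fixed 7-day bins.

    observations: list of (date, sigma0_db) tuples for one parcel and polarization.
    start: first day of the first bin.
    Returns a list of length n_weeks; bins without observations are None.
    """
    bins = defaultdict(list)
    for day, value_db in observations:
        week = (day - start).days // 7
        if 0 <= week < n_weeks:
            bins[week].append(db_to_natural(value_db))
    return [
        sum(bins[w]) / len(bins[w]) if w in bins else None
        for w in range(n_weeks)
    ]


# Illustrative values only: three observations falling into two weekly bins.
example = [
    (date(2017, 1, 2), -10.0),
    (date(2017, 1, 5), -20.0),
    (date(2017, 1, 10), -10.0),
]
signature = weekly_means(example, date(2017, 1, 1), 3)
```

Averaging in natural units rather than in dB avoids biasing the mean towards low values; the fixed bins give every parcel a feature vector of the same length regardless of how many scenes cover it.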

S1 Coherence
For our study, we also consider S1 coherence as a potential marker, and the required processing is shown in Figure 2b. Starting with the operational status of Sentinel 1B in late September 2016, it is now possible to generate coherence with a 6-day temporal baseline over Europe. Coherence generation requires Level 1 SLC as input and can be derived by using the SNAP Sentinel-1 toolbox with the Graph Processing Tool batch procedure. The processing recipe (available as Supplementary Materials) includes debursting (i.e., merging of the azimuth bursts) of individual SLC IW subswaths (3 per scene), followed by co-registration of each pair of subswaths from successive scenes and finally the calculation of coherence by averaging 4 × 1 range and azimuth samples. The three subswaths are merged and terrain corrected to a 20-m pixel output product. Downloading SLC data and post-processing to geocoded coherence is a scripted process, which can be scheduled to run synchronized to S1 acquisitions. We generate coherence for the descending relative orbit 37 and ascending orbit 88 only. These orbits cover our study area completely and are approximately 3 days apart (scene acquisition time for Orbit 37 at 05:58 UTC on T0 and for Orbit 88 at 17:18 UTC on T0 + 3 days).
We upload the geocoded SAR coherence as assets to the GEE user account so that they can be shared with a range of use cases (e.g., [46]). Our marker is based on the occurrence of high coherence events, i.e., detecting the presence of bare soil or (very) low vegetation cover. In a similar manner as for σ⁰, temporal profiles are generated for each parcel by taking the maximum over a fixed period, in this case 15 days, i.e., with 4-5 coherence observations in each period. Extracts per parcel are also exported to a CSV formatted table, for both VV and VH polarizations.
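The fixed-period maximum can be sketched in plain Python as follows. This is an illustration under the assumption that per-parcel coherence values are available as (date, coherence) pairs; `period_maxima` is a hypothetical helper and not part of the published code.

```python
from datetime import date


def period_maxima(observations, start, period_days, n_periods):
    """Keep the maximum coherence observed in each fixed-length period.

    observations: list of (date, coherence) tuples for one parcel and polarization.
    Returns a list of length n_periods; periods without observations are None.
    """
    maxima = [None] * n_periods
    for day, coherence in observations:
        index = (day - start).days // period_days
        if 0 <= index < n_periods:
            if maxima[index] is None or coherence > maxima[index]:
                maxima[index] = coherence
    return maxima


# Illustrative values only: two observations in the first 15-day period, one in the second.
profile = period_maxima(
    [(date(2017, 1, 1), 0.2), (date(2017, 1, 7), 0.6), (date(2017, 1, 16), 0.3)],
    start=date(2017, 1, 1),
    period_days=15,
    n_periods=2,
)
```

Taking the maximum rather than the mean preserves short-lived high-coherence events, which are the signal of interest for bare soil.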

Deep Learning S1 Backscatter and Coherence Classification
For both S1 backscattering and S1 coherence, the same deep neural network machine learning routines are used, but applied separately (Figure 2). The choice of TensorFlow [47] for deep learning is practical and based on its growing reputation as a versatile open-source toolkit for a wide range of machine learning problems. However, results reported in this study are likely to be reproducible in other (Python-based) open source machine learning libraries (Theano, scikit-learn, etc.).
The CSV records exported from GEE are first filtered to remove fields smaller than 0.1 ha, to avoid noisy time series at the training stage. Secondly, the crops are aggregated to 5 main classes: Grassland (GRA), Maize (MAI), Cereal (CER), Potatoes (POT) and Other (OTH), which represent >90% of the crop area in the study site. The specific BRP2017 crop categories aggregated into the 5 classes are described in Table A1 in Appendix B. We also remove from the records ancillary parcel attributes (e.g., area, perimeter, crop labels) that do not play a role in the machine learning steps.
For the training step, the labeled records (N_total) are split into a record set for training (N_training) and one for testing (N_test). The choice of N_training can be a fixed number of records or a percentage of all records. In this study, we found that using between 100 and 300 randomly selected parcels per class for the training set performed well.
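A minimal sketch of such a per-class random split is given below. It assumes that each record is a (parcel identifier, class label) pair; the function name `split_per_class` and the fixed seed are illustrative choices rather than the authors' implementation.

```python
import random
from collections import defaultdict


def split_per_class(records, n_per_class, seed=0):
    """Randomly draw up to n_per_class training records from every class.

    records: list of (parcel_id, label) tuples.
    Returns (training, testing); all records not drawn end up in testing.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[record[1]].append(record)

    training, testing = [], []
    for label in sorted(by_class):
        members = list(by_class[label])
        rng.shuffle(members)
        training.extend(members[:n_per_class])
        testing.extend(members[n_per_class:])
    return training, testing


# Illustrative values only: 10 grassland and 3 maize parcels, 2 training parcels per class.
records = [(i, "GRA") for i in range(10)] + [(i, "MAI") for i in range(10, 13)]
training, testing = split_per_class(records, n_per_class=2)
```

Drawing the same number of parcels per class keeps the dominant grassland class from swamping the training set.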
The TFLearn module provides a model wrapper for a deep neural network, which in our case consists of 2 fully-connected layers with 32 nodes and a softmax activation function. The shape of the input depends on the record set: the S1 backscatter record set has 34 features (17 weekly averages for each of the VV and VH polarizations), and the S1 coherence record set has 18 features (9 two-weekly values for each of VV and VH). The model wrapper automatically performs classifier tasks, such as training, prediction and production of accuracy metrics. We train the model for 80 epochs. Training accuracy levels off well before 80 epochs and does not significantly increase with a higher number of epochs.
The testing set provides the "marked parcels": for each parcel in this set, the trained model predicts the most probable class given its time series. To improve the prediction reliability, we run the training and prediction twice with different random training samples, and we keep only results for which the class probability is higher than 70%. The choice of 70% ensures that the probability is higher than the sum of the probabilities for the other classes (unlike, for instance, with a 50% threshold), but is not so high that it eliminates potential candidate outliers. After the 2 runs, we select the identified class with the highest probability in both cases. If the class is different from its original label, the parcel is marked as an outlier. Due to the non-uniform distribution of the crop classes, with a dominance of grassland and maize, the results do not improve significantly with the use of more iterations. The validation of the markers is done using independent ground-truth data and is described in Section 2.4.3.
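One possible reading of this decision rule is sketched below. It assumes that each run returns, per parcel, a dictionary of class probabilities; predictions at or below the threshold are discarded, and the most confident remaining prediction is compared with the declared class. The helper names `top_class` and `is_outlier` are hypothetical, and the exact rule used by the authors may differ in detail.

```python
def top_class(probabilities):
    """Return (label, probability) for the most probable class."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


def is_outlier(declared, run_probabilities, threshold=0.7):
    """Flag a parcel whose confident prediction differs from its declared class.

    declared: class label from the parcel declaration (e.g., "GRA").
    run_probabilities: one {label: probability} dict per training/prediction run.
    """
    predictions = [top_class(p) for p in run_probabilities]
    confident = [(label, prob) for label, prob in predictions if prob > threshold]
    if not confident:
        return False  # no run was confident enough to contradict the declaration
    predicted, _ = max(confident, key=lambda item: item[1])
    return predicted != declared


# Illustrative values only: both runs confidently predict maize for a declared grassland.
flag = is_outlier("GRA", [{"GRA": 0.1, "MAI": 0.9}, {"GRA": 0.2, "MAI": 0.8}])
```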

S2 Bare Soil Index
Similarly to S1, S2 is processed as described in Figure 2c. Clouds and cirrus are removed using the 'QA60' flag provided in the metadata by the European Space Agency (ESA). The shadows of clouds are then removed using solar geometry and height estimation. Although atmospherically-corrected S2 images are better for reliable spatial and temporal comparison, Level-2A products were not yet available in GEE at the time of this study. However, atmospheric correction is not always a prerequisite for classification and change detection [48], especially when working with normalized indices. Therefore, as a trade-off between quality and availability, the Level-1C data were preferred for this study.
In order to detect ploughing or other grassland conversion measures that lead to a temporary loss of the vegetation cover, we adapt the Bare Soil Index (BSI), proposed by [49] for LANDSAT TM, to Sentinel-2 (Equation (1)):
$$\mathrm{BSI} = \frac{(B11 + B4) - (B8 + B2)}{(B11 + B4) + (B8 + B2)} \qquad (1)$$
The SWIR1 (Short-Wave InfraRed) and the red bands allow one to quantify the soil mineral composition, while the NIR (Near-InfraRed) and the blue bands are sensitive to the presence of vegetation. The blue (B2, range 490 ± 32.5 nm), red (B4, range 665 ± 15 nm) and NIR (B8, range 842 ± 57.5 nm) bands all have a 10-m resolution, while the SWIR1 band (B11, range 1610 ± 45 nm) has a 20-m resolution and is resampled to 10 m.
For each parcel, the BSI is calculated for each available cloud-free Sentinel-2 acquisition. The number of acquisitions for which the BSI is positive, i.e., for which the parcel is potentially bare, is calculated using Equation (2). This provides an indication that the parcel was ploughed and possibly converted.
Parcels are marked with the S2-BSI when the soil was detected as bare more than once in the period 1 April to 1 August (i.e., Equation (2) > 1).
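The S2 marker logic can be sketched in plain Python. The sketch assumes the commonly used form of the BSI, (SWIR1 + red − NIR − blue) / (SWIR1 + red + NIR + blue), applied to parcel-mean reflectances of the cloud-free acquisitions between 1 April and 1 August; the function names are hypothetical and the reflectance values are purely illustrative.

```python
def bare_soil_index(blue, red, nir, swir1):
    """Bare Soil Index from parcel-mean reflectances (B2, B4, B8, B11).

    Assumes the four reflectances do not all equal zero.
    """
    soil = swir1 + red
    vegetation = nir + blue
    return (soil - vegetation) / (soil + vegetation)


def count_bare_dates(acquisitions):
    """Count the acquisitions for which the BSI is positive."""
    return sum(1 for bands in acquisitions if bare_soil_index(**bands) > 0)


def s2_bsi_marker(acquisitions):
    """Mark the parcel when bare soil is detected more than once."""
    return count_bare_dates(acquisitions) > 1


# Illustrative reflectances only: one vegetated and two bare-looking acquisitions.
vegetated = {"blue": 0.05, "red": 0.04, "nir": 0.40, "swir1": 0.20}
bare = {"blue": 0.08, "red": 0.15, "nir": 0.22, "swir1": 0.30}
marked = s2_bsi_marker([vegetated, bare, bare])
```

Requiring more than one positive date makes the marker less sensitive to a single spurious observation, for example one affected by undetected cloud shadow.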

Combining Satellite Markers
After processing S1 and S2, we obtain a binary marker for each of the three workflows (Figure 2) indicating whether the parcel was marked as an outlier: S1 backscatter, S1 coherence and S2 BSI. In addition, we also combine the three markers with a logical AND and with a logical OR (Table 1). S1 AND S2 indicates that the parcel should be marked by S1 (either S1 backscatter or S1 coherence) and S2 at the same time. S1 OR S2 indicates that the parcel should be marked by at least one of the three markers, i.e., S1 backscatter or S1 coherence or S2 BSI.
Table 1. Satellite markers and their combination. S1 AND S2 corresponds to the logical AND where the parcel should be marked by S1 (either S1 backscatter or S1 coherence) and S2 at the same time. S1 OR S2 corresponds to the logical OR where the parcel should be marked by at least one of the three markers, i.e., S1 backscatter or S1 coherence or S2 BSI.
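The combinations described above reduce to simple Boolean logic, as the following sketch shows. The function name `combine_markers` and the dictionary keys are illustrative and do not come from the published scripts.

```python
def combine_markers(s1_backscatter, s1_coherence, s2_bsi):
    """Combine the three binary parcel markers.

    S1 is set when either S1 workflow marks the parcel.
    """
    s1 = s1_backscatter or s1_coherence
    return {
        "S1": s1,
        "S2": s2_bsi,
        "S1 AND S2": s1 and s2_bsi,
        "S1 OR S2": s1 or s2_bsi,
    }


# A parcel marked by S1 backscatter and by the S2 BSI, but not by S1 coherence.
combined = combine_markers(s1_backscatter=True, s1_coherence=False, s2_bsi=True)
```

The AND combination yields the smallest and most conservative set of parcels, whereas the OR combination yields the largest.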

Ground-Truth Collection and Accuracy Assessments
To assess the performance of these markers and their combinations, ground-truth collection was carried out on 11 October 2017 using 2 approaches: (i) a field visit (described in Section 2.4.1) and (ii) the simultaneous collection of street-level geo-tagged pictures using a rooftop camera (described in Section 2.4.2). The accuracy assessment methodology is then described in Section 2.4.3.

Ground-Truth from Field Visits
Crop type was recorded during the field visits as points in a mobile map app (Avenza Maps® [50]). This app permits downloading of maps for offline use on a smartphone or tablet, using the device's built-in GPS to track the location. Two background layers were generated from very high resolution hybrid layers of Google Map Aerial and Google Street Map, with the boundaries of the marked parcels colored differently for the S1 and S2 markers.
The app then allows one to put a pin on a located field and to specify attributes, which include the position (longitude, latitude), the time, the crop class as aggregated in the 5 groups: cereals (CER), grassland (GRA), maize (MAI), potatoes (POT) and other (OTH) (see Table A1 in Appendix B). In addition, status information was collected such as: bare soil, catch crop, meadow, renewed and cut, renewed recently (for GRA), standing or stubble (for MAI). Additionally, geo-tagged still camera photos were taken. A total of 241 fields were surveyed by field visits. Amongst these, 5 fields were absent from BRP2017 and 5 surveyed positions were wrongly located outside the parcel boundaries and, therefore, excluded.

Geo-Tagged Street-Level Images
Street-level digital photography represents a valuable source of in situ data with an important potential as it provides advantages in terms of logistics and objectivity. This section describes how the pictures were collected and then visually interpreted to provide another source of ground-truth.
An action camera (SONY HDR-AS300R) with a forward-looking Zeiss® Tessar wide-angle lens was mounted on the roof of the car with a suction cup. The choice of the device and mount was advised by Mapillary [51]. In particular, we followed the set-up described by [52]. The camera was connected to an external battery to collect data for up to 8 h continuously on a 128-GB SD card and was operated via a remote control screen inside the car. As suggested by [51], the camera was programmed to record at 1-s intervals. Each image was acquired at 4K resolution, i.e., 3840 × 2160 pixels. The exposure is adjusted for each shot, allowing one to smoothly follow changes in brightness along the route.
A script available on GitHub and developed by Mapillary [53] for preprocessing and uploading images on their platform applies the following steps: skip images that are potential duplicates (the minimum distance between consecutive acquisitions was set to 0.5 m); group images into sequences based on GPS and time (a cutoff time of 25 s and a cutoff distance of 100 m); interpolate compass angles for each sequence; add Mapillary tags to the images; and finally, upload the images.
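The first two preprocessing steps can be illustrated with a short sketch. This is not the Mapillary script itself; it is an independent, simplified illustration that assumes each frame carries a timestamp in seconds and WGS84 coordinates, and all names (`Frame`, `drop_near_duplicates`, `group_into_sequences`) are hypothetical. Only the thresholds (0.5 m, 25 s, 100 m) come from the text.

```python
from math import asin, cos, radians, sin, sqrt
from typing import List, NamedTuple


class Frame(NamedTuple):
    time_s: float  # acquisition time in seconds
    lat: float     # latitude in decimal degrees
    lon: float     # longitude in decimal degrees


def haversine_m(a: Frame, b: Frame) -> float:
    """Great-circle distance in metres between two frames."""
    phi1, phi2 = radians(a.lat), radians(b.lat)
    dphi, dlmb = phi2 - phi1, radians(b.lon - a.lon)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def drop_near_duplicates(frames: List[Frame], min_distance_m: float = 0.5) -> List[Frame]:
    """Keep a frame only if it lies at least min_distance_m from the last kept frame."""
    kept: List[Frame] = []
    for frame in frames:
        if not kept or haversine_m(kept[-1], frame) >= min_distance_m:
            kept.append(frame)
    return kept


def group_into_sequences(frames: List[Frame], cutoff_time_s: float = 25.0,
                         cutoff_distance_m: float = 100.0) -> List[List[Frame]]:
    """Start a new sequence whenever the time or distance gap exceeds its cutoff."""
    sequences: List[List[Frame]] = []
    for frame in frames:
        previous = sequences[-1][-1] if sequences else None
        if (previous is None
                or frame.time_s - previous.time_s > cutoff_time_s
                or haversine_m(previous, frame) > cutoff_distance_m):
            sequences.append([frame])
        else:
            sequences[-1].append(frame)
    return sequences
```

With 1-s acquisition intervals, a stationary car produces frames that the first function discards, while a stop longer than 25 s splits the drive into separate sequences.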
After the images had been uploaded onto Mapillary, the location of each image, the sequence of the driven path and the images themselves could be accessed through the API (see [54] for a technical description). To select one image per field, we selected the street-level picture located closest to the centroid of the intersection of the considered field and a 35-m buffer around the driven road. The street-level pictures were visually photo-interpreted with the help of an aerial overview (Figure 3a) and the image itself (Figure 3b). In addition to this image, the photo-interpreter was provided with a link to the field on the Mapillary platform, where another picture of the same field could be selected if the provided picture was not suitable. For each parcel, the photo-interpreter had to choose a tag (i.e., CER, GRA, MAI, POT or OTH) and to indicate whether the provided street-level picture was sufficient for the photo-interpretation or whether an additional picture from Mapillary was needed (see the results in Section 3.3.1).
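The automatic picture selection reduces to a nearest-neighbour search once the target point is known. The sketch below assumes that the centroid of the field and road-buffer intersection has already been computed with a GIS library and that all coordinates are in a projected system in metres; the `Image` type and `closest_image` function are hypothetical names.

```python
from math import hypot
from typing import NamedTuple, Optional, Sequence, Tuple


class Image(NamedTuple):
    key: str  # hypothetical image identifier
    x: float  # easting in metres (projected coordinates assumed)
    y: float  # northing in metres


def closest_image(target_xy: Tuple[float, float],
                  images: Sequence[Image]) -> Optional[Image]:
    """Return the image nearest to the target point, or None if there are no images."""
    if not images:
        return None
    tx, ty = target_xy
    return min(images, key=lambda im: hypot(im.x - tx, im.y - ty))
```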

Figure 3. The photo-interpreters were given a high-resolution aerial image with the parcel and the point where the image was taken (a) along with the image to photo-interpret (b). This example shows a maize parcel on the left side of the road, which is not yet harvested.

Accuracy Assessment
In this study, four accuracy assessments were carried out: (1) the BRP2017 assessment using the ground-truth; (2) the assessment of the street-level pictures' photo-interpretation method; (3) the S1 and S2 markers' assessment using the ground-truth; and (4) the S1 and S2 markers' assessment using street-level pictures. To carry out these assessments, we used the following metrics described below, first for assessing multiple classes (>2) and then for binary cases.
The most common way to express classification accuracy for multiple classes is the preparation of a so-called error matrix, also known as a confusion matrix or contingency matrix (Table 2). Such matrices show the cross tabulation of the classified land cover and the actual land cover revealed by field observation results. The confusion matrices compare each class of the map (n_i) to the reference sample classes (n_j) for q classes.
Each confusion matrix reports the Overall Accuracy (OA), the User's Accuracy (UA), the Producer's Accuracy (PA) [55] and the F-score, which is used for map assessment. OA represents the proportion of all cases correctly classified (Equation (3)) with n being the total number of samples and q the total number of classes. UA, also defined as recall, corresponds to the probability that a randomly-selected sample from the map is classified as correct in the reference sample. PA, also defined as precision, corresponds to the probability that a reference sample is correctly classified in the map. Therefore, UA is related to the commission error, while PA indicates the omission error. They are calculated following Equations (4) and (5) and weighted using the same method as for the global overall accuracy. The F-score (Equation (6)) represents, for a class k, the harmonic mean of the user and producer accuracies and ranges between 0 and 1.
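The metrics above can be computed directly from a square confusion matrix. The sketch below is unweighted and assumes that rows hold the map classes and columns the reference classes. Because the layout of Table 2 is not reproduced here, the per-class ratios are named by matrix axis rather than as UA or PA, and the weighting mentioned in the text is not implemented. The function name and the example matrix are hypothetical.

```python
from typing import Dict, List, Sequence


def confusion_metrics(matrix: Sequence[Sequence[int]]) -> Dict[str, object]:
    """Overall accuracy and per-class ratios from a q x q confusion matrix."""
    q = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = [matrix[k][k] for k in range(q)]
    row_totals = [sum(matrix[k]) for k in range(q)]
    col_totals = [sum(matrix[i][k] for i in range(q)) for k in range(q)]

    def ratio(a: float, b: float) -> float:
        return a / b if b else 0.0

    # Correct samples divided by the row total, then by the column total.
    row_acc: List[float] = [ratio(diagonal[k], row_totals[k]) for k in range(q)]
    col_acc: List[float] = [ratio(diagonal[k], col_totals[k]) for k in range(q)]
    # The F-score is the harmonic mean of the two per-class ratios.
    f_score = [ratio(2 * row_acc[k] * col_acc[k], row_acc[k] + col_acc[k])
               for k in range(q)]
    return {
        "overall_accuracy": ratio(sum(diagonal), n),
        "row_accuracy": row_acc,
        "column_accuracy": col_acc,
        "f_score": f_score,
    }


# Illustrative three-class matrix (made-up counts, not from the study).
example = [[50, 2, 1], [3, 30, 0], [2, 1, 11]]
metrics = confusion_metrics(example)
```

Because the F-score is symmetric in the two per-class ratios, it does not depend on which axis is called the user's and which the producer's accuracy.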

Results
The first part of the Results section (Section 3.1) presents the satellite-based marker results at the parcel level and then at the study area level. The second part presents the street-level imagery acquisition (Section 3.2). Finally, the four assessments are presented, demonstrating how the different proposed approaches performed (Section 3.3).

Satellite-Based Markers
First, a time series of a marked parcel is shown to illustrate the results in light of the signal measured by the satellite. This parcel was declared as permanent grassland in BRP2017, but was marked by all three markers. Figure 4 shows the S1 backscatter intensity time series along with coherence for both the VV and VH polarizations. For the parcel location and the period 1 January to 1 August, there were 36 descending and 34 ascending orbits acquired. From the time series, we observe a drop in the backscattering for both the VV and VH polarizations at the end of March, indicating a vegetation decrease. The coherences for both polarizations have a low value when ploughing occurs, but for the late April-early May period, high values of coherence indicate a bare stable surface state. Figure 5 shows the S2 BSI and NDVI. In the study period, 44 Sentinel-2 observations were available (OBSERVATION in Figure 5) for this parcel. However, only 10 of those were cloud-and-shadow free (BSI and NDVI in Figure 5). At the end of April, the BSI was above zero, indicating that the soil was bare. The fact that two consecutive bare soil states were detected explains why the parcel was marked by the S2 BSI marker. Figure 6 provides the BSI sum for the parcel at the pixel level along with true color S2 overviews of selected dates. The BSI sum along the growing season is then averaged for the parcel, which is marked when this average is greater than zero. In this example, ploughing can be observed during the second half of March on the RGB composite (Figure 6), which corresponds exactly to the BSI time series (Figure 5). Interestingly, in this example, half of the parcel was observed to be bare on 14 March, and the remaining part was bare on 24 March. Vegetation regrowth was then observed on the 26 May acquisition (Figure 6), for which the BSI goes back below zero, as indicated in Figure 5.
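The detection of two consecutive bare soil states can be sketched as a run-length check on the parcel's BSI time series. This is one possible reading of the rule illustrated in Figure 5, not the published implementation. The function name is hypothetical, and skipping cloudy observations without breaking a run is an assumption.

```python
from typing import Optional, Sequence


def has_consecutive_bare_soil(bsi_series: Sequence[Optional[float]],
                              threshold: float = 0.0,
                              min_run: int = 2) -> bool:
    """True if at least min_run consecutive valid observations have BSI > threshold."""
    run = 0
    for value in bsi_series:
        if value is None:
            # Cloudy or shadowed observation: ignored (assumption), run is kept.
            continue
        run = run + 1 if value > threshold else 0
        if run >= min_run:
            return True
    return False
```

For the example parcel, only 10 of the 44 observations would carry a value; the remaining 34 would be passed as `None`.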
The study site contained 15,395 parcels for which area size and class distribution are described in Table 4 together with the training sample composition used for the S1 markers. In total, less than five percent (4.6%) of the parcels were used for training in three classes (GRA, MAI, CER). Potatoes (POT) are not used for training as they represent less than 150 parcels (116). The other classes (OTH) are not used, as they are by definition too heterogeneous to be used for the training.

Figure 4. S1 backscattering and coherence signal for a parcel declared as permanent grassland and marked with S1.

The machine learning classification of the S1 weekly backscattering signatures has led to a total of 1002 (8.51%) marked outliers, while the S1 coherence marked 345 (2.93%) outliers (Table 5). The S2 BSI method led to 1064 (9.03%) marked outliers. When combining markers as described in Table 1, we obtained five different marker combinations. Both the logical AND and OR of the S1 backscatter and S2 BSI markers show that both marker approaches were highly complementary, with only 303 (2.57%) parcels marked as outliers by both. In Figure 7c, the spatial pattern of marked outliers for the S1 AND S2 combination is shown for the study area.

Figure 7. Maps showing the distribution of the non-marked (grey) and marked (red) declared grasslands with the different marker combinations: (a) S1 backscatter, (b) S1 coherence, (c) S2 BSI, (d) S1 AND S2 and (e) S1 OR S2.

Table 5. Number and distribution of declared grassland parcels marked with S1 AND S2, S1 OR S2.

Markers           Marked N (Percent)
S1 backscatter    1002 (8.51%)
S1 coherence      345 (2.93%)
S2 BSI            1064 (9.03%)
S1 AND S2         303 (2.57%)
S1 OR S2          2015 (17.12%)

For the current study, about 35,000 street-level images were collected (see Figure 1 for a map of the route covered). After pre-processing, the pictures were uploaded to the Mapillary platform. In the study area, 1411 fields were observed with street-level imagery, i.e., within a 35-m buffer of the driven road (Table 6). These fields were mainly declared as grassland (82%) and maize (13.61%). After image collection, the subsequent step was the visual photo-interpretation tagging, which will be described in the following section.

Table 7 presents the confusion matrix between the declared parcels' BRP2017 and the ground-truth collected. Of the surveyed parcels, 92% were found to be of the same class as declared in the BRP2017 (OA = 0.92). For grassland, which is our main target, UA was 0.99 and PA was 0.90 (Table 7). Indeed, nine parcels declared as grassland in the BRP2017 were in fact maize parcels, and five were Others (OTH).

Table 7. Confusion matrix of the BRP2017 parcels with the ground-truth collected during the field survey along with the per-class user accuracy, producer accuracy and F-score.

Assessment of Street-Level Pictures Tagging with the Ground-Truth
In the second step, we evaluate the fitness for purpose of the proposed labeling method for the street-level pictures with automated picture selection. Three different photo-interpreters tagged the pictures for each of the 214 parcels for which we collected field observations. For this study, the photo-interpretation rate was about 100 parcels per hour. The parcel labeling by the interpreters was fast and straightforward on the automatically preselected pictures for 83 to 97% of the parcels depending on the interpreter (Table 8). For the remaining parcels, the interpreter had to search for a more suitable image first, which required more time. Only 2 to 7% of the parcels were not visible from any of the collected pictures and, thus, were not labeled by the interpreters (Table 8).
The confusion matrix comparing the interpreted photo tags to the field survey results shows an OA that varies between 88.79% and 92.06% according to the interpreter (Table 9). For the three photo-interpreters and the two main classes of interest (i.e., grassland and maize), the UA, PA and F-score were always above 0.9 (Table 10).

Table 8. Suitability of the collected street-level imagery for the photo-interpretation for each of the interpreters. Either the picture automatically selected is suitable for photo-interpretation, or the interpreter needs to select another picture of the same parcel, or the parcel is not visible from any of the collected pictures.

Markers' Assessment Using the Ground-Truth
We now proceed with the performance assessment of the proposed satellite markers, in particular those that detect declared grassland parcels as not conforming. The field survey contained 144 parcels declared as grasslands. The comparison of the marked parcels with the ground-truth is reported in Table 11, and the results are discussed in Section 4.1.2.

Table 11. Declared grassland fields marked with S1 AND S2 and with S1 OR S2 among the surveyed parcels (TP: True Positive, FP: False Positive, FN: False Negative, TN: True Negative).

Assessment of the Markers with Geo-Tagged Street-Level Acquisitions
The interpreted and tagged street-level pictures provide a significantly larger sample of parcels for which the satellite markers can be compared. For the study area, 1411 parcels were observed by street-level imagery (Table 6). For the marked parcels (Table 11), between 34 and 219 parcels, depending on the marker, were observed with street-level pictures. For each of the five markers, we sampled the same number of fields amongst the non-marked fields to derive more robust metrics. This results in a number of parcels ranging from 64 to 386 with corresponding tagged street-level imagery used for the subsequent assessment (see Table 12 for the detailed samples' distributions). As an example, the S1 backscatter marker selects 1002 parcels, of which 104 were observed by street-level pictures. Among those 104 parcels, 92 were identifiable on the street-level pictures and thus tagged. In addition, 92 non-marked parcels were randomly selected and tagged, resulting in a total of 184 samples used to assess the S1 backscatter marker.

Table 12. Street-level imagery samples' distribution, which were tagged to assess the markers. The same number of parcels are sampled in marked and non-marked categories.

Table 13 lists the performance matrix derived from Table 12. All the markers except S1 coherence have an FP count of zero, indicating that these markers do not miss any converted grasslands. This is reflected in the specificity, ranging from 0.71 to 1.00, which measures the marker's ability to correctly identify declared grassland parcels that are not grasslands. The number of FN, corresponding to grassland parcels marked by the satellite that are identified as not converted on street-level imagery, is substantial and drives the differences in accuracy, which is highest for the combined S1 AND S2 marker.
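The balanced sampling used to build Table 12 can be sketched as follows. This is a minimal illustration under the assumption that parcels are referenced by identifiers held in plain lists; the function name, the identifiers and the fixed random seed are hypothetical and are not taken from the published scripts.

```python
import random
from typing import List, Sequence


def balanced_sample(marked_tagged: Sequence[str],
                    non_marked_pool: Sequence[str],
                    seed: int = 0) -> List[str]:
    """Pair the tagged marked parcels with an equally sized random draw of non-marked parcels."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    drawn = rng.sample(list(non_marked_pool), k=len(marked_tagged))
    return list(marked_tagged) + drawn


# S1 backscatter example from the text: 92 tagged marked parcels,
# matched with 92 non-marked parcels, giving 184 samples in total.
marked = [f"marked_{i}" for i in range(92)]
pool = [f"non_marked_{i}" for i in range(1000)]  # hypothetical pool size
sample = balanced_sample(marked, pool)
```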

Discussion
The overall objective of this paper is to evaluate the feasibility of integrating the in-season availability of parcel-level crop type information with S1 and S2 markers. The markers aim to identify, across a large area, grassland parcels with deviating temporal-radiometric characteristics, which are then targeted for inspection during a one-day field survey. This objective was achieved. Nevertheless, following the five specific objectives described in Section 1.7, we now discuss specific but important details of the methodology that will need further investigation and improvement in future studies.

Efficiency of the S1 and S2 Markers for Grassland Monitoring
The first specific objective of this study was to evaluate whether efficient markers for grassland monitoring can be defined from a combination of Sentinel-1 SAR and Sentinel-2 multi-spectral observations. Indeed, we demonstrate that when combining both sensors, we were able to detect parcels wrongly declared as grasslands with a precision of 98%.

Marker Concept
Nevertheless, the marker concept as described in Section 1.3 is challenging to translate to a precise remote sensing definition as the interpretation is not only linked to a remotely-sensed signal, but also to agricultural practices. The markers tested are a temporal crop type classification using S1 and the detection of bare soil with S2. The latter marker suggests ploughing of the parcel, and this indicates possible grassland conversion. In the Netherlands, grassland can be declared permanent if it is maintained continuously for five years. However, ploughing the parcel is allowed (if not in Natura 2000), as long as it is directly renewed to grassland thereafter. Therefore, while the S2 BSI marker correctly identified some parcels as being bare during the growing season, renewed grassland was observed on these parcels during the field survey in October. These parcels are included in the FN category. Similarly, the field survey also revealed that some parcels in this category flagged with the S1 backscatter and S2 BSI markers actually were heavily grazed meadows, e.g., by horses, leading to bare soil appearance, especially during wet periods in the spring.

Evaluating Markers with Targeted Field Visits and Street-Level Images
The field observations were collected according to a routing designed to maximize the inspection rate of the marked parcels, but without taking into account a proper distribution with respect to non-marked fields. This limitation was partly overcome by using the street-level images to build a more complete ground-truth sample. Following this approach, information about the efficiency of the methodology can be deduced. Using the satellite marker methodology, a high level of TN and a low level of FP were obtained as indicated by the evaluation with the field observations and the street-level imagery (respectively Tables 11 and 13). This is a requirement for methodologies focusing on parcel monitoring. Regarding the results obtained with street-level pictures (Table 13), compared to the assessment of the markers using the targeted field visits (Table 11), it is interesting to see that the FP number is low, ranging from zero to two for street-level pictures and ranging from one to seven for the targeted field visits.
Generally, the accuracies obtained with street-level pictures (Table 13) are lower than the ones obtained with the ground-truth (Table 11) since the proportions of FN are higher. However, the accuracy for S1 OR S2 in Table 11, at 0.23, is much lower than the corresponding value in Table 13. This is because the number of TP is proportionally much lower in Table 11, since these parcels (grassland detected as grassland) were not targeted for field observations.

Combination of S1 and S2
The evaluation of the performance of different single and combined markers shows that parcels marked by both sensors yield the highest sensitivity (0.84), precision (0.98), accuracy (0.84) and F-score (0.9). In this case, we observe two FP and 12 TN when compared to the field visits. In fact, only one of the FP is detected by S1 backscatter, but not by S2 BSI (hence, S1 OR S2 has one FP). In conclusion, only one parcel is not flagged by any of the marker combinations, which is a good operational result in an inspection context.
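The reported metrics can be related to the four cell counts with the standard binary definitions, assumed here to match those used in the study. Following the convention stated in the text, a TP is a grassland detected as grassland, a FP is a non-grassland parcel that was not marked, a FN is a grassland that was marked and a TN is a non-grassland parcel that was marked. In the example, FP = 2 and TN = 12 come from the text, whereas the TP and FN counts are illustrative values chosen to be consistent with the 144 surveyed grassland declarations and the reported metrics; they are not read from Table 11.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard binary classification metrics from the four confusion counts."""

    def ratio(a: float, b: float) -> float:
        return a / b if b else 0.0

    sensitivity = ratio(tp, tp + fn)   # share of true grasslands left unmarked
    specificity = ratio(tn, tn + fp)   # share of non-grasslands that were marked
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + fp + fn + tn)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# FP and TN from the text; TP and FN are illustrative (assumption).
s1_and_s2_metrics = binary_metrics(tp=109, fp=2, fn=21, tn=12)
```

With these counts, the rounded sensitivity, precision, accuracy and F-score are 0.84, 0.98, 0.84 and 0.90.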
In this study, S1 backscatter and S2 BSI demonstrate their ability to detect outlier parcels. S1 coherence was less efficient with more false positives than the other markers. This can be linked to the time window (two weeks) selected to create the maximum composite as this may not be optimal to grasp short-term change events within the parcel. Furthermore, S1 coherence is generated at 20-m resolution, which may lead to a loss of quality in extracting parcel means, given the relatively small average parcel size of 1.75 ha. S1 coherence is probably more useful as a discrete type of signal (like S2 BSI), rather than in a machine learning approach as used in this study. More work is needed to explore how coherence should be exploited for such applications.

Processing Methodology
The second objective is to assess the usefulness of combining cloud processing using Google Earth Engine (GEE) and deep learning (TensorFlow) to perform large-scale marker evaluation tasks at the parcel level.

Tools, Coding Languages and Platforms
A core characteristic of this study is the ability to analyze diverse data streams, including deep time series stacks from Sentinel-1 and -2 against large parcel datasets, and to validate results with field visit observations and tags derived from interpreted geo-tagged street-level images. For remote sensing processing, GEE was selected because it already hosts S1 and S2 in "analysis ready" formats and provides the sophisticated processing functionality and capacity to rapidly generate arbitrarily large signature compositions and extracts. One limitation of GEE is that it does not host S1 coherence outputs yet. Thus, coherence processing needs to be done outside GEE, starting from S1 SLC images and with the use of the SNAP Sentinel-1 toolbox. A recent study by Zebker [56] proposes a new method to resample S1 SLC data to map projected SLC products that will then allow coherence (and interferometric phase) generation on cloud platforms such as GEE. This would likely lead to further popularizing the use of such outputs in crop monitoring and other applications.
Another GEE platform limitation is the lack of S2 atmospherically-corrected surface reflectance, even though that was not an essential requirement in this study. The reason why these datasets are not yet ingested in the GEE catalog is primarily a lack of consensus in the scientific community on the preferred correction algorithm [57].
GEE integration with TensorFlow is a work in progress, which required us to run the latter on a stand-alone workstation (an eight-core Intel Xeon E3-1505M v6 @3.00 GHz, with 64 GB RAM and Quadro M2200 GPU) using the Python implementation. The Mapillary API was used to support the tagging of the geo-tagged street-level images. Overall, Python provides the "glue" to interlink the various inputs and outputs to processing modules, for format conversion (typically using GDAL [58]) and occasional spatial data handling (in PostgreSQL/PostGIS).
The tools developed and published along with this study can be scaled to work for much larger areas than our study area. We have already applied equivalent GEE extractions and TensorFlow classifications at the country scale using the full BRP2017 (770,000 parcels). TensorFlow run times for the analysis proposed in this study are in the order of minutes, so the uptake of the methodology does not depend on complicated cloud solutions or on dedicated hardware such as GPUs.

Need for Open-Access Parcel Identification
A prerequisite of this study is the availability of open-access digital agricultural parcel features with their actual crop type. For technical, administrative and political reasons, such datasets are not always available (on time). Segmentation methods such as Simple Non-Iterative Clustering (SNIC) [59] can now be applied efficiently on the cloud [60] and thus bridge the gap for countries or regions where vector parcel data are not available. However, this remains challenging as the segmentation performs differently according to the landscape structure, input data and parameter settings. Existing large-scale (e.g., 1:10,000) topographic vector data and high quality land use/land cover maps would help to enhance segmentation results. An alternative would be to use timely high-resolution imagery to generate the parcel boundary data on demand, in a similar manner to how MS LPIS are created and maintained. Systematic collection of open spectro-temporal crop type libraries, much in the fashion of the USGS Spectral Library [61], but dedicated to agriculture, would be an essential contribution to crop recognition tasks.

TensorFlow Training Improvement
One of the limitations in the S1 machine learning approach is that the study area contains about 15,000 parcels with unevenly-distributed crop types. This results in limitations when designing a proper statistically-distributed training sampling strategy. Enlarging the sample with parcels stratified by class from neighboring areas will probably improve the method's robustness and should be investigated in the future. Such an enlarged set would facilitate multiple training set selection, allowing majority voting with more runs and stricter class probability thresholds per run to improve the results' robustness. So far, the structure of the neural network itself was not properly assessed as the results obtained with two hidden layers with 32 nodes each were satisfying, both in achievable model accuracy and computational time. However, further investigation and hyper-parameterization (number of layers, number of nodes, number of epochs) could lead to additional improvements. Other interesting perspectives include the use of transfer learning methods such as [62] with trained models from different years or geographical locations.
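Majority voting over several training runs with a class probability threshold can be sketched as below. This illustrates the general idea only: the threshold value, the strict-majority rule and the function names are assumptions, and the exact procedure used for the S1 markers is not described in this section.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# One (predicted class, class probability) pair per training run.
RunPrediction = Tuple[str, float]


def majority_vote(runs: Sequence[RunPrediction],
                  min_probability: float = 0.8) -> Optional[str]:
    """Return the class predicted confidently by more than half of the runs, else None."""
    votes = Counter(label for label, prob in runs if prob >= min_probability)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count > len(runs) / 2 else None


def is_marked(declared: str, runs: Sequence[RunPrediction]) -> bool:
    """Mark a parcel when the confident majority class differs from the declared class."""
    voted = majority_vote(runs)
    return voted is not None and voted != declared
```

Raising `min_probability` or requiring more agreeing runs makes the marker stricter, at the cost of leaving more parcels without a decision.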

Implications for CAP Monitoring
The third objective is to evaluate whether the targeted monitoring approach is appropriate and able to efficiently scale across the whole area covering 15,000 parcels. The question for this study was to evaluate if the proposed markers could detect parcels for which the predicted crop class was not the same as the declared one, in particular for declared grassland. Table 7 comparing the BRP2017 with the targeted field visit shows that 18 parcels are not compliant, i.e., their declared label is not what was observed in the field. Among the 18 diverging parcels, nine parcels were declared as grasslands, but categorized as maize. Of these, six were confirmed as maize in both the field visit and the tagged street-level pictures (the other three parcels were too far from the road and, therefore, not observed). As mentioned in Section 2.2.1, this study used the early available BRP2017 (available in July). After conducting this study, a definitive version of the BRP2017 was made available. We therefore cross-checked the 18 diverging parcels in the early version of BRP2017 and in the definitive one. Among the six parcels declared as grasslands, but found as maize, one was changed to maize in the definitive BRP version.
The nine non-compliant grasslands among the 1411 parcels observed in the street-level images (0.64%) suggest that undeclared grassland conversion in the study area is very limited. Since we have targeted our field visit and street-level image collection based on the S1 and S2 markers, we may assume that the actual grassland conversion for the whole area is even below the rate derived from the street-level sample. However, permanent grassland conversion risks being a gradual process, and corrective action is required to ensure that non-compliant parcels are resown as grassland. Our method can be an important contribution to systematically follow this process and understand its temporal and geographical dynamics.

Street-Level Imagery
The fourth objective is to evaluate whether street-level imagery can contribute to the efficiency and effectiveness of a field survey designed for the evaluation of our markers.
In less than three hours, the interpreters were able to tag more than 200 parcels with an overall accuracy close to 90% (Table 9). The performance for the distinction between grassland and maize was good, as UA and PA were always above 90% for the three interpreters (Table 10). These results are encouraging and demonstrate that street-level photo-interpretation is a valuable approach. However, the study focused on five crop groups that are rather distinct, even in sub-optimal quality photos. A full evaluation with a wider range of crop types and crop stages would be necessary to appreciate street-level imagery in more generalized crop recognition contexts. The three interpreters in our study were researchers with some expertise in agronomy, which may have biased the interpretation results towards higher accuracies. However, as discussed by [63], proper training and relevant feedback can improve recognition skills over time.
This study demonstrates the potential to improve in situ data collection efficiency. During this study, in one day, 214 parcels were visited to collect crop type, while at the same time, 1411 parcels were observed with street-level imagery. The collection efficiency thus improved by a factor of seven, while still reaching an interpretation accuracy of about 90% (Table 9). This efficiency factor would probably be somewhat higher if the survey were to deploy only the street-level imagery collection (no stops needed for field visit observations) and would double if side-looking cameras were mounted on both sides of the car.
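The efficiency figures follow from simple ratios of the numbers reported above, as the short calculation below shows. The variable names are hypothetical; the inputs are the parcel counts and the photo-interpretation rate of about 100 parcels per hour given in the text.

```python
field_visit_parcels = 214       # parcels visited in one day
street_level_parcels = 1411     # parcels observed with street-level imagery
tagging_rate_per_hour = 100     # approximate photo-interpretation rate

# Ratio of parcels observed per day by the two collection methods.
efficiency_factor = street_level_parcels / field_visit_parcels

# Time one interpreter needs to tag the parcels that also had a field visit.
tagging_hours = field_visit_parcels / tagging_rate_per_hour
```

The ratio is about 6.6, which the text rounds to a factor of seven, and the tagging time of about 2.1 h matches the statement that the interpretation took less than three hours.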
The usefulness of collecting ground-truth data with such an approach depends on landscape structure including field size and landscape elements obscuring the field of view. Clearly, when the parcel is not adjacent to the road or hidden by anthropic or natural elements such as trees or hedges, it limits the usefulness of the approach. Automating the image tagging with the use of computer vision methods opens a new avenue for further raising the efficiency of street-level image use. The potential of these methods has already been demonstrated in other user domains [27][28][29][30][31][32][33][34][35]. In order to use the full potential of deep learning approaches for crop type recognition, abundant and representative training data are a pre-requisite. Generating dedicated street-level crop-type hierarchical datasets following the approach of ImageNet [64] will be a worthwhile effort for the scientific community.
In this study, street-level pictures were collected specifically to assess the markers, and these were then uploaded on an open-access platform. Mapillary currently (as of 6 July 2018) provides more than 315 million crowd-sourced images. A huge future, and so far unexplored, opportunity exists in the use of such images to collect and generate useful in situ data. It is important to note that collecting data with street-level cameras should comply with privacy regulations such as the European General Data Protection Regulation (GDPR) [65].

Code Sharing
Following the general reporting requirements that come with scientific research and following the final objective embedded in our manuscript, all scripts and code used across a variety of software packages and needed to reproduce the results presented here are made available on the GitHub repository https://github.com/rdandrimont/AGREE (AGRicultural Enhanced Evidence), as summarized in Figure A1, Appendix A, and are available online as Supplementary Material. In this way, the computationally-intensive data-driven science carried out in this study, and the results obtained, can be reproduced transparently and completely.

Conclusions
Taking advantage of four socio-technological developments, (1) frequent and high-resolution observations made by Copernicus Sentinels, (2) in-season availability of parcel-level crop-type declarations, (3) cloud computation and deep learning and (4) the availability of street-level imagery, a case-study to monitor grasslands in the Netherlands was carried out in 2017. Using Sentinel satellite observations during the growing season allowed us to flag grassland parcels with deviating temporal-radiometric characteristics. The best marker was obtained by combining S1 and S2. A subsequent one-day field survey to inspect the parcels and build a relevant ground-truth sample confirmed the usefulness and scalability of the methodology, which detected parcels wrongly declared as grasslands with a precision of 98%, when combining both sensors. Additionally, this study demonstrated that collecting street-level imagery permits efficiently collecting in situ data on crop type with an accuracy of about 90%. This opens avenues to improve high accuracy crop monitoring at the parcel level.