1. Introduction
Grasslands cover about one-third of the global ice-free land surface. In Europe, they account for approximately one-third of the utilized agricultural area. Grasslands are a key element of our agricultural systems as they provide nearly half of the feed requirements for global livestock production [1,2]. They also play an essential role in regulating soil erosion and carbon, water, and nitrogen fluxes, among other processes [3,4,5], and are habitats for a broad range of plant and animal species [6,7].
In land-cover maps, grasslands and other open biotopes are often embedded in one broad land-cover class (e.g., [8,9,10]). However, the quality and quantity of their provisioning and regulating ecosystem services, as well as the synergies and trade-offs between them, vary significantly depending on the grassland type, grassland-use intensity, and environmental context [11,12,13]. More specifically, grassland-use intensity has a significant impact on the ecological value of grasslands as habitats [13,14,15,16,17,18]. Furthermore, the spatial and temporal distribution of grassland-use intensity affects biodiversity at the landscape scale [19,20]. It is therefore crucial to map and monitor grassland-use intensity in order to study its links with the ecosystem services of grasslands and their ecological value as habitats, and to balance the trade-offs between them.
Most grasslands in temperate areas are managed through grazing, mechanical mowing, or a combination of both for forage production. The stocking rate of grazing animals and the timing and frequency of mowing events are major factors of grassland-use intensity. The type and quantity of management practices are commonly used as indicators to classify grasslands, both from an agricultural and an ecological perspective [21,22,23].
In recent years, with the emergence of new satellites combining a high spatial and temporal resolution, such as the Sentinel missions, an increasing number of studies have shown the potential of remote sensing for grassland mapping and monitoring [24,25]. Optical and radar image time series are increasingly used to discriminate grassland habitats [26,27,28] or management practices [29]. A large number of studies focus on mapping grassland-use intensity either through image classification methods [30] or by retrieving different factors of grassland-use intensity (e.g., mowing frequency, grazing intensity, and/or biomass production) from image time series and auxiliary data [29,31,32,33,34]. In this context, the most frequently used index seems to be the Normalized Difference Vegetation Index (NDVI), followed by biophysical indices, such as the Leaf Area Index (LAI).
Lately, particular attention has been given to the detection of mowing events based on satellite image time series. The timing and frequency of mowing events have a great influence on the provisioning and regulating ecosystem services of grasslands and on biodiversity [35,36]. Several types of approaches have been explored. A majority of studies use optical imagery (e.g., Sentinel-2, Landsat, MODIS, and RapidEye), detecting mowing events through significant decreases in the NDVI, LAI, or other vegetation indices [33,37,38]. In radar remote sensing (e.g., Sentinel-1), mowing events can be detected through sudden increases in interferometric coherence time series [39,40]. Optical imagery detects mowing events with more precision (i.e., fewer false positives) but with a lower detection rate than radar because persistent cloud cover prevents the detection of some events. A few studies, therefore, combined both optical and radar imagery, increasing the detection rate while limiting the number of false positives. Studies combining Sentinel-1 and Sentinel-2 time series for either rule-based or deep learning mowing detection methods achieved high performance, with F1-scores between 79% and 84% [41,42].
Many mowing detection methods described in the literature are object based, relying on existing delineations of parcel boundaries or a preliminary segmentation step [32,39,40,41,42,43]. This allows a spatial smoothing of the satellite signal and avoids the salt-and-pepper effect that is inherent to pixel-based approaches [37,38,44]. One of the major drawbacks of object-based approaches is, however, the potential heterogeneity of practices inside declared or delineated parcels. When only one part of a parcel is mown at a given time and the other is grazed or mown at a different time, the signal change can be smoothed out, causing omissions.
While many studies focused on mowing detection, few have investigated the possibility of monitoring grazing activities [30,32,45] and even fewer discriminate grazed and mown grasslands [29]. Grazing has been identified as a major confounding factor to mowing detection in several studies, as many false mowing detections occur in pastures [40,44] because grazing and mowing both result in biomass removal. However, both from an agricultural and an ecological point of view, grazing and mowing cannot be considered equivalent management practices because the first is selective (depending on the type of livestock), while the second is not.
The objective of this study is to differentiate grasslands in terms of management practices at the sub-parcel level, corresponding to homogeneous management units. Grasslands are differentiated in two hierarchical steps, combining a pixel-based and an object-based method to account for the variability in the practices inside declared parcels.
The first step consists of detecting the main management practice at the pixel level to differentiate pastures, managed exclusively through grazing, from hay meadows, which are mown mechanically. This preliminary pixel-based classification approach tackles two main issues in grassland monitoring, namely (i) the heterogeneity of practices inside declared parcels and (ii) grazing as a confounding factor for mowing detection.
Then, an object-based mowing detection method based on the Sentinel-1 (S1) and Sentinel-2 (S2) time series is applied to the management units classified as hay meadows to further differentiate them and to produce an exhaustive grassland management practice classification.
3. Methodology
Previous studies have shown the great potential of combining Sentinel-1 and Sentinel-2 for grassland mowing detection. Because of the speckle inherent to SAR imagery, pixel-based approaches are challenging. Therefore, studies using SAR data for mowing detection rely on object-based approaches, averaging the signal per parcel.
However, grassland parcels, as declared by farmers in the Land Parcel Identification System (LPIS, i.e., a vector dataset based on legal declarations by farmers in each EU country, including parcel boundaries and crop types), often include several management units that are not all exploited at the same time or in the same way. To illustrate, Figure 2 shows the Sentinel-2-derived LAI time series of a grassland parcel in our study area. The parcel’s average time series (in gray) is relatively constant, while about half of the pixel time series (in green) significantly decrease in the middle of June. This decrease in LAI is due to a mowing event that occurred on one part of the declared parcel, while the other part was not mown but grazed. The two management units are also visible on the orthophoto (dashed red line in Figure 2).
This is an issue for object-based grassland monitoring methods, such as the one developed in Sentinels for Common Agricultural Policy (Sen4CAP) [42]. Therefore, we develop a hierarchical grassland characterization approach combining a pixel-based classification method and an object-based mowing detection method.
3.1. Pixel-Based Supervised Classification
In this phase, the aim is to differentiate grasslands in terms of main management practice (grazing or mowing) and retrieve homogeneously managed parcels for further characterization. A pixel-based supervised classification method is used to discriminate exclusively grazed pastures from mown hay meadows. The hay meadow class includes mixed practices, i.e., grasslands that are alternately mown and grazed. Field observations were used to build a reference dataset to train and validate a random forest classifier based on S2 and S1 time series.
3.1.1. Input Features
Vegetation indices derived from specific spectral bands are commonly used to emphasize certain properties of a vegetation cover, such as biomass. We made the hypothesis that grasslands that are grazed throughout the season should have relatively constant and stable vegetation index time series compared to grasslands with at least one mowing event causing a sudden change in biomass [38,44]. To test the sensitivity of the classification to the choice of vegetation index, two spectral vegetation indices and one biophysical index derived from Sentinel-2 were considered for the classification, namely the NDVI, the red-edge chlorophyll index (CIre), and the LAI. The NDVI is computed as the normalized difference between the near-infrared (band 8) and the red (band 4) reflectance (Equation (1)) and is widely used for vegetation monitoring and more specifically for grassland mapping and mowing detection [33,37,44]. The CIre is related to the increase in reflectance between the red and near-infrared (i.e., the red edge), which is linked to the biomass and chlorophyll content of vegetation. It is calculated from the ratio between lower (band 5) and upper (band 7) red-edge reflectance (Equation (2)). The CIre was used by Hardy et al. [34] to retrieve grassland biomass.
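As an illustration of the two spectral indices described above, the following minimal sketch computes them from single-pixel reflectances. The function names are hypothetical. The NDVI follows the normalized difference stated in the text; the exact form of the CIre is an assumption here (the commonly used upper-to-lower red-edge ratio minus one), since the corresponding equation is not reproduced in this section.

```python
def ndvi(nir, red):
    """Normalized difference between near-infrared (S2 band 8)
    and red (S2 band 4) reflectance."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined for zero reflectance in both bands
    return (nir - red) / total


def ci_red_edge(upper_re, lower_re):
    """Red-edge chlorophyll index.

    ASSUMPTION: computed as (band 7 / band 5) - 1, a commonly used
    form; the article's own Equation (2) may differ in detail.
    """
    if lower_re == 0:
        return float("nan")
    return upper_re / lower_re - 1.0


# Example: a dense green canopy has high NIR and low red reflectance.
print(ndvi(0.40, 0.10))
print(ci_red_edge(0.30, 0.10))
```

Applied to every acquisition date of a pixel, these functions yield the index time series that serve as classifier inputs.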
The LAI was retrieved from Sentinel-2 reflectances through the calibrated artificial neural network from the BV-NET tool [50], which is implemented in several European Space Agency (ESA) agricultural monitoring toolboxes (e.g., Sen2Agri, Sen4CAP).
To fill gaps due to cloud cover, the S2 vegetation index time series were temporally interpolated using the Image Time-Series Gap Filling tool [51] available in Orfeo Toolbox [52]. As we intend to apply this classification to a larger area, the time series were temporally resampled to a 5-day grid, starting at the first acquisition date, to overcome the multiple-day offset between adjacent satellite tracks [53]. Both linear and cubic spline interpolations were tested for the three indices. A total of 6 different S2 feature sets were thereby tested as input features for the random forest classifier (Table 1).
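The linear variant of this temporal resampling can be sketched as follows. This is not the Orfeo Toolbox implementation, only a minimal pure-Python illustration; the function name and the use of integer day numbers are assumptions made for the example.

```python
from bisect import bisect_left


def resample_linear(days, values, step=5):
    """Linearly interpolate cloud-free observations onto a regular grid.

    days   : sorted integer acquisition days of the valid observations
    values : vegetation index values observed on those days
    step   : grid spacing in days (5 in the text)

    The grid starts at the first acquisition date and stops at or
    before the last one, so no extrapolation is performed.
    """
    if len(days) != len(values) or len(days) < 2:
        raise ValueError("need at least two matching observations")
    grid = list(range(days[0], days[-1] + 1, step))
    resampled = []
    for day in grid:
        j = bisect_left(days, day)  # first observation on or after `day`
        if days[j] == day:
            resampled.append(values[j])
        else:
            d0, d1 = days[j - 1], days[j]
            weight = (day - d0) / (d1 - d0)
            resampled.append(values[j - 1] + weight * (values[j] - values[j - 1]))
    return grid, resampled


grid, lai = resample_linear([100, 103, 118], [2.0, 2.3, 3.8])
print(grid, lai)
```

A cubic spline would replace the two-point weighting with a smooth curve through all observations, at the cost of possible overshoot around abrupt drops such as mowing events.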
Microwave data guarantee regular temporal coverage and can provide complementary information to optical data. The complementarity of Sentinel-1 and Sentinel-2 was shown in the context of grassland mowing detection [41,42]. Therefore, we tested the classification with microwave time series alone and in combination with Sentinel-2 data. Sentinel-1 backscattering amplitudes in VV and VH polarization and the ratio VV/VH were used as input features. Ascending and descending pass acquisitions were made at different times of the day and with different look angles. Because the radar signal is strongly impacted by water content, morning acquisitions are significantly affected by dew and vegetation water content. Each polarization was therefore tested in ascending and descending pass separately. A total of 6 different S1 feature sets were thereby tested as input for the classification (Table 1).
In addition, to assess the complementarity of S1 and S2 for differentiating pastures and hay meadows, the best-performing S2 feature set was tested in combination with the best-performing S1 feature as well as the respective time-series minimum, maximum, mean, and median values and all statistics together. A total of 6 different S1 and S2 feature combinations were thereby tested.
3.1.2. Classification Mask
The classification mask was built by combining and resampling a grassland mask and a shadow mask (Figure 3).
The grassland mask was obtained by reclassifying the 2 m resolution land-cover product of LifeWatch [54]. Two grassland classes, namely “Monospecific grassland with graminoids” and “Diversified grassland and shrubland”, were taken into account.
The shadow mask was based on a digital surface model (DSM) (Figure 4). The DSM of Wallonia is a product of the orthophoto acquisition campaign of 2019 (Service Public de Wallonie, SPW). Shadow projections were computed at 2 m resolution based on the object heights from the DSM and a sun azimuth and elevation of 146° and 38°, respectively.
The combined grassland and shadow mask was resampled to 10 m to match Sentinel-2 pixels. A minimum rule was applied for the resampling to take only pure pixels into account for the classification, with 100% grassland and no shadow.
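The minimum rule can be illustrated with a small block aggregation. The sketch below assumes a binary 2 m mask (1 = grassland without shadow, 0 = anything else) stored as a list of rows and an aggregation factor of 5 (2 m to 10 m); the function name is hypothetical, and the alignment of the 2 m grid with the Sentinel-2 grid is taken for granted.

```python
def resample_min(mask, factor=5):
    """Aggregate a fine binary mask to a coarser grid with a minimum rule.

    A coarse pixel is valid (1) only if every fine pixel inside it is
    valid, i.e. 100% grassland and no shadow. Incomplete blocks at the
    right and bottom edges are dropped.
    """
    rows, cols = len(mask), len(mask[0])
    coarse = []
    for r in range(0, rows - rows % factor, factor):
        coarse_row = []
        for c in range(0, cols - cols % factor, factor):
            block_min = min(
                mask[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            )
            coarse_row.append(block_min)
        coarse.append(coarse_row)
    return coarse


# Two 10 m pixels side by side; one 2 m cell of the second is shaded.
fine = [[1] * 10 for _ in range(5)]
fine[2][7] = 0
print(resample_min(fine))
```

A single invalid 2 m cell is therefore enough to exclude the whole 10 m pixel, which is what restricts the classification to pure pixels.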
This mask makes it possible to classify only pure grassland pixels and to discard pixels that are influenced by shadows or trees (Figure 5). The LAI time series of masked pixels (in gray in Figure 5) consistently differ from those of the valid pixels (in green in Figure 5). Most masked pixels in this example are influenced by shadow, which manifests as lower LAI values throughout the season. A few masked pixels are influenced mostly by trees and shrubs and have higher LAI values than valid grassland pixels.
3.1.3. Reference Data
Based on the field observations, the observed parcels could be classified into two categories: pastures, which were exclusively grazed during the study period, and hay meadows, on which at least one mowing event was observed. The reference parcels were redrawn manually, based on the LPIS, the grassland mask, and the Walloon orthophoto of 2019 (SPW) to obtain homogeneous reference parcels. When two or more management units could be differentiated inside one declared parcel, only the management units closest to the road were considered to be matching the field observation.
During the redrawing, 5 parcels were discarded because they contained no valid pixels due to shadow. In total, the reference dataset contained 412 parcels (194 in one class and 218 in the other). They were equally partitioned into a training and a validation dataset through stratified random sampling. The training dataset was used to train, calibrate, and compare the classification methods through cross-validation. The validation dataset was used to validate the final product.
3.1.4. Cross-Validation
Different classifiers were evaluated and compared through a 4-fold cross-validation scheme. The training dataset was split into 4 subsets to keep a reasonable number of samples for the validation at each iteration. The classifiers were compared based on the mean overall accuracy (OA) and its standard deviation. The best-performing classifier was then trained using the whole training dataset. The resulting classification was then validated with the validation dataset. The user and producer accuracy (UA and PA) of both classes were also computed in addition to the overall accuracy.
Both during the cross-validation and the final validation, we applied a per-pixel wall-to-wall validation, assessing each pixel inside each redrawn homogeneous reference parcel.
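The accuracy metrics used in this validation can be sketched from per-pixel reference and predicted labels. The class labels and the function name below are hypothetical; the definitions are the usual ones (UA as the share of pixels mapped to a class that truly belong to it, PA as the share of reference pixels of a class that are mapped correctly).

```python
def accuracy_metrics(reference, predicted, classes=("pasture", "hay_meadow")):
    """Return overall accuracy plus user and producer accuracy per class."""
    counts = {(r, p): 0 for r in classes for p in classes}
    for r, p in zip(reference, predicted):
        counts[(r, p)] += 1  # confusion matrix cell: (reference, predicted)
    total = sum(counts.values())
    oa = sum(counts[(c, c)] for c in classes) / total
    ua, pa = {}, {}
    for c in classes:
        n_mapped = sum(counts[(r, c)] for r in classes)  # pixels mapped as c
        n_ref = sum(counts[(c, p)] for p in classes)     # reference pixels of c
        ua[c] = counts[(c, c)] / n_mapped if n_mapped else float("nan")
        pa[c] = counts[(c, c)] / n_ref if n_ref else float("nan")
    return oa, ua, pa


ref = ["pasture", "pasture", "pasture", "hay_meadow", "hay_meadow"]
pred = ["pasture", "pasture", "hay_meadow", "hay_meadow", "hay_meadow"]
print(accuracy_metrics(ref, pred))
```

In a 4-fold scheme, this function would be evaluated once per fold and the four OA values summarized by their mean and standard deviation.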
3.2. Object-Based Mowing Detection
The pixel-based classification obtained in the first step was used in combination with the LPIS to obtain homogeneous parcels for an object-based mowing detection using the Sen4CAP toolbox v3.0 [55].
3.2.1. Classification Post-Processing
The following steps were applied to obtain homogeneous management unit polygons based on the classification and the LPIS.
The classification is filtered to remove isolated pixels: a pixel value is changed to the other class if there are fewer than 4 pixels of the same class in a window around the pixel.
All parcels declared as grasslands (temporary or permanent) are extracted from the LPIS.
The LPIS grassland polygons are rasterized at 10 m resolution using the parcel unique feature IDs (between 1 and 999,999) as raster values.
The binary classification is multiplied by a constant larger than the largest parcel ID and added to the rasterized LPIS values. This combined raster carries information about both the management class (pasture vs. hay meadow) and the LPIS parcel delineation.
The combined raster is polygonized.
No-data polygons (i.e., covering masked areas) and polygons with an area smaller than 1000 m² (10 pixels) are discarded.
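The combination of the binary classification with the rasterized parcel IDs can be sketched as a simple integer encoding. The multiplier of 1,000,000 used below is an assumption, chosen only because it exceeds the largest parcel ID (999,999) so that class and ID can be recovered unambiguously; the class coding (0 and 1), the value 0 for pixels outside any parcel, and the function names are also hypothetical.

```python
CLASS_FACTOR = 1_000_000  # ASSUMPTION: any constant above the largest parcel ID


def combine(class_raster, id_raster):
    """Encode management class and parcel ID in a single integer raster."""
    return [
        [cls * CLASS_FACTOR + pid for cls, pid in zip(class_row, id_row)]
        for class_row, id_row in zip(class_raster, id_raster)
    ]


def decode(value):
    """Recover (management class, parcel ID) from a combined value."""
    return divmod(value, CLASS_FACTOR)


# One parcel (ID 42) split into two management units by the classification.
classes = [[0, 0, 1, 1]]
parcel_ids = [[42, 42, 42, 42]]
combined = combine(classes, parcel_ids)
print(combined)
print(decode(combined[0][3]))
```

Polygonizing such a raster yields one polygon per connected group of pixels sharing both the parcel ID and the management class, which corresponds to a management unit.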
3.2.2. Mowing Detection Method
The mowing detection method of Sen4CAP is based on two separate algorithms detecting changes in Sentinel-2 and Sentinel-1 time series extracted per parcel (Figure 6). The detailed method is described in De Vroey et al. [42].
The S1 algorithm detects significant increases in VH interferometric coherence by comparing each value to the value expected from a linear fit of the six previous values. The detection is based on a Constant False Alarm Rate (CFAR) adaptive threshold (a constant factor × the standard deviation of the residual fitting errors). The S2 algorithm detects a mowing event when the decrease in NDVI between two consecutive cloud-free acquisitions is larger than a given threshold, fixed at 0.12 for this region [42].
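The two detection rules can be sketched as follows. This is not the Sen4CAP code, only a minimal illustration under stated assumptions: the CFAR factor `k` is a hypothetical value, the fitted line is extrapolated one step ahead to obtain the expected coherence, the population standard deviation of the residuals is used, and all function names are invented for the example. The NDVI threshold of 0.12 and the six-value fitting window come from the text.

```python
from statistics import pstdev


def detect_s2(dates, ndvi, threshold=0.12):
    """Intervals where NDVI drops by more than `threshold` between two
    consecutive cloud-free acquisitions."""
    return [
        (dates[i - 1], dates[i])
        for i in range(1, len(ndvi))
        if ndvi[i - 1] - ndvi[i] > threshold
    ]


def linear_fit(ys):
    """Ordinary least-squares line through (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    mean_x, mean_y = (n - 1) / 2, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys)) / sxx
    return slope, mean_y - slope * mean_x


def detect_s1(coherence, window=6, k=3.0):
    """Indices where coherence exceeds the extrapolated linear fit of the
    `window` previous values by more than k times the residual spread.
    ASSUMPTION: k = 3.0 is illustrative, not the operational setting."""
    hits = []
    for t in range(window, len(coherence)):
        past = coherence[t - window:t]
        slope, intercept = linear_fit(past)
        residuals = [y - (intercept + slope * x) for x, y in enumerate(past)]
        expected = intercept + slope * window
        if coherence[t] - expected > k * pstdev(residuals):
            hits.append(t)
    return hits


print(detect_s2([100, 105, 110, 115], [0.80, 0.82, 0.60, 0.65]))
print(detect_s1([0.30, 0.32, 0.29, 0.31, 0.30, 0.32, 0.31, 0.70]))
```

Because the S1 threshold scales with the recent variability of the coherence, a noisy parcel needs a larger jump to trigger a detection than a stable one, which is the idea behind a constant false alarm rate.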
A confidence level is computed for each detection, with lower values for S1 than for S2 to compensate for the lower precision of S1 mowing detection. For each parcel, the four most confident detections are retained. For each detection, the detection interval is given along with the confidence level and the data source (S1, S2, or both). The confidence levels of the detections range from 0 to 1 and are well correlated to the precision of the detections [42].
3.2.3. Validation
The mowing detection performances were assessed at two levels. First, the actual detection of each single mowing event was validated. Second, the further classification of hay meadows, based on the mowing detections, was validated.
In the first case, the mowing detections were validated by crossing the detection intervals with reference intervals. Reference mowing intervals were retrieved from the observations made during the field campaign. A reference mowing interval consists of the time interval between an observation of short grass and the previous observation of high grass. When a detection interval intersects a reference mowing interval, it is considered as a true positive (TP). If no reference mowing interval overlaps a detection, it is a false positive (FP), and if no detection overlaps a reference mowing interval, it is counted as a false negative (FN). The remaining intervals are true negatives (TN).
Two quality metrics, namely the precision and the recall, are calculated using Equations (3) and (4).
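The interval-based matching described above can be sketched as follows, assuming the two metrics are the usual precision, TP / (TP + FP), and recall, TP / (TP + FN). Intervals are represented as (start, end) pairs of day numbers; this representation and the function names are assumptions made for the example.

```python
def intersects(a, b):
    """True if the closed intervals a and b share at least one day."""
    return a[0] <= b[1] and b[0] <= a[1]


def validate(detections, references):
    """Count TP, FP, and FN by interval overlap; derive precision and recall.

    A detection overlapping any reference interval is a true positive,
    a detection without overlap is a false positive, and a reference
    interval without any overlapping detection is a false negative.
    """
    tp = sum(1 for d in detections if any(intersects(d, r) for r in references))
    fp = len(detections) - tp
    fn = sum(1 for r in references if not any(intersects(r, d) for d in detections))
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return tp, fp, fn, precision, recall


detections = [(100, 105), (150, 155)]
references = [(98, 110), (200, 210)]
print(validate(detections, references))
```

In this example, the first detection matches the first reference interval, the second detection has no counterpart, and the second reference interval is missed.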
In the second case, the accuracy of the differentiation between hay meadows with an early first mowing event and a late first mowing event was validated through a confusion matrix and related quality metrics (UA, PA, and OA).
The calibration reference dataset was used to define the optimal confidence level thresholds and maximize the accuracy of the management practice classification. The validation dataset was then used to assess the result.
Here as well, to stay consistent with the previous classification validation, a per-pixel wall-to-wall validation was applied.
5. Discussion
5.1. Classification and Mowing Detection Performances
One of the main motivations behind the binary classification developed in this study was to be able to exclude pastures for the subsequent mowing detection. In previous studies, grazed parcels were either not taken into account [41] or shown to be a confounding factor for mowing detection [38,42,44]. Precise information on the management practice of grasslands (i.e., mowing or grazing) is however rarely available. Using a large field dataset, we showed that this information could be retrieved with high accuracy from Sentinel-2 vegetation index time series. This corroborates the hypothesis that grazed grasslands can be distinguished from mown grasslands based on their relatively constant temporal vegetation index profiles. The LAI had already been identified as a relevant variable to discriminate grazed and mown grasslands in a study using three SPOT images [56]. The LAI retrieved from S2 with the BV-NET tool [50] was shown to be fit for the purpose of this study. It would however still need to be validated for the absolute retrieval of the LAI in temperate agricultural grasslands.
In this study, the three tested vegetation indices derived from Sentinel-2 (the NDVI, CIre, and LAI) performed similarly, and the random forest classifiers all reached a high overall accuracy. The performances obtained with the Sentinel-1 backscattering time series were much lower. This can mainly be explained by the speckle effect inherent to SAR imagery that makes a pixel-based analysis challenging without any spatial or temporal smoothing. The addition of the Sentinel-1 backscattering temporal statistics to the Sentinel-2 input features did not significantly improve the classification results. Sentinel-1 was therefore discarded for the classification step. The LAI time series with cubic spline interpolation was retained for further analysis because it performed slightly better, but the NDVI and the CIre could be used as well because the differences in performance were not statistically significant.
Another related aim of this pixel-based classification was to tackle the issue of grassland parcel delineation, raised in previous studies [41,42] and illustrated in Figure 2. In datasets such as the LPIS, parcel delineations often include several management units that are managed differently or at different times. The binary classification and the post-processing, including a filtering step to remove isolated pixels, made it possible to retrieve more homogeneously managed grassland patches at the management-unit level.
Next to the heterogeneity of practices, delineated grassland parcels can also include hedges, trees, and buildings with different spectral signatures that can hinder the classification. Thanks to the 2 m resolution land-cover product that was used to build the grassland mask, the 10 m pixels with less than 100% grassland cover could be masked out. In optical remote sensing, shadows can also be a significant issue. A shadow mask, estimated through a DSM, was therefore added to the grassland mask to further optimize the classification performances. Overall, the availability of very high resolution products such as the land-cover map, the orthophoto, and the DSM was a great asset. Very high resolution data and products are increasingly available and could be used to build similar grassland masks and reproduce the classification over larger areas.
The operational object-based mowing detection method of the Sen4CAP toolbox was applied to the homogeneous patches of the hay meadows retrieved from the classification. According to the validation reference dataset, the method reached a precision of 93% and a recall of 82%. These detection performances are much higher than those obtained on the same grasslands without the preliminary classification, especially in terms of precision. The precision was only 44% when the pastures were taken into account due to false mowing detections on grazed grasslands [42]. The exclusion of pastures thanks to the classification was of course a major factor in this increased performance. However, even compared to the precision we obtained in De Vroey et al. [42] on hay meadows alone (73%), the present results show a significant improvement. This implies that the homogeneity of practices and the absence of trees and shadows inside the reshaped grassland parcels also contributed to the high mowing detection accuracy. In addition, the wall-to-wall pixel-based validation applied in this study could also explain the higher performance metrics because the size of the parcels was not taken into account in the validation in the previous study [42]. The mowing detection performances obtained here are also slightly higher than those obtained with a deep learning approach combining Sentinel-1, Sentinel-2, and Landsat-8 in a convolutional neural network with a maximum precision of 86% and a recall of 82% [41].
While this method showed high performances in our study area in the 2019 growing season, it should be further tested in more extended areas and other seasons. For example, the effects of drought on vegetation could significantly alter the vegetation index time series and thereby represent a challenge for classification and mowing detection.
5.2. Grassland Typology and Perspectives
Previously, a few studies have considered the classification of grassland management practices and intensities through remote sensing, showing promising results but often lacking sufficient representative ground truth data for validation. Using a supervised classification algorithm on RapidEye imagery and a rule-based method to estimate the first mowing date, Franke et al. [30] classified four types of grassland (semi-natural, extensive, intensive, and tilled) with high accuracy on a small study area in Germany. The red-edge vegetation index derived from five RapidEye images was used by Gómez Giménez et al. [32] to retrieve a grassland-use intensity index based on the individual estimation of three factors (mowing, grazing, and fertilization intensity). They obtained promising results for the estimation of grazing and mowing intensities but lacked actual ground truth data for validation.
In this study, the hierarchical categorization based on the classification and the mowing detection made it possible to differentiate five types of grassland based on the main management practice (grazing or mowing), the date of the first mowing event, and the mowing frequency. Thanks to the large and regionally representative field dataset, we showed that three classes (pastures, meadows with an early first mowing event, and meadows with a late first mowing event) could be differentiated with 79% overall accuracy. The mowing frequency estimation could not be validated because the field campaign was only carried out between April 9th and July 19th, while mowing events occur until the end of October. However, given the high detection accuracy obtained during the study period, we make the hypothesis that the detections remain relatively accurate throughout the season.
While hay meadows could be further differentiated through the mowing detection, pastures were not further categorized. In a recent study with a similar hierarchical categorization approach, pastures and mown grasslands were differentiated based on biomass productivity and both classes were then subdivided into three management levels based on the exploitation (i.e., harvest) frequency [29]. Both the biomass productivity and the exploitation frequency were retrieved through the detection of significant drops in Landsat NDVI time series, considering the cumulative change and the count of drops, respectively. While this approach showed consistent results with regional statistics and georeferenced land-use data, the land-use intensity levels of both classes could not be validated due to a lack of ground truth data. Moreover, the timing of the first exploitation activity should be considered in addition to the exploitation frequency, as it is a major factor of grassland-use intensity and has an influence on the ecological value of grasslands [57,58].
We showed that the retrieval of homogeneously managed grassland patches and the identification of pastures greatly improved the precision of the mowing detection and made it possible to classify five grassland types with high accuracy. These management units could further serve as a baseline to retrieve other grassland characteristics and study their relationships with biodiversity and ecology. The method developed in this study should be further tested in different conditions so that it can be extended over larger areas and transferred to other seasons to classify grasslands at the landscape level [58,59] and study inter-annual variations [20], thereby contributing to ecological habitat monitoring.