1. Introduction
Grasslands are semi-natural elements that represent a significant source of biodiversity in farmed landscapes [
1,
2,
3,
4]. They provide many ecosystem services such as carbon storage, erosion regulation, food production, crop pollination and biological regulation of pests [
5], which are linked to their plant and animal composition.
Different factors impact grassland biodiversity conservation. Among them, the age of a grassland (i.e., the time since last ploughing/sowing) is directly related to its plant and animal composition. Old “permanent” grasslands, often called semi-natural grasslands, hold a richer biodiversity than temporary grasslands [
2,
6,
7,
8]. Indeed, they have had time to establish and stabilize their vegetation cover, unlike temporary grasslands, which are part of a crop rotation. Additionally, agricultural management of grasslands (e.g., mowing, grazing, fertilizing, reseeding) influences their structure and composition [
9,
10,
11,
12]. Management is essential for their biodiversity conservation because it prevents the establishment of woody vegetation. Conversely, intensive use constitutes a threat to this biodiversity [
12,
13]. Therefore, it is important to know the age of a grassland and to identify the management practices in order to monitor their effect on biodiversity and related services. However, these factors are defined at different temporal scales: over the years for the age of a grassland and during a vegetation season (i.e., a year) for the management practice.
Usually, ecologists and agronomists characterize grasslands at the parcel scale through field surveys. However, these surveys require substantial human and material resources, an experienced assessor and a sampling strategy, which makes them expensive and time consuming [
14]. They are thus limited in spatial extent and in temporal frequency, limiting grassland characterization to a local scale and over a short period of time.
Conversely, remote sensing offers the possibility to provide information on landscapes over large extents, thanks to the broad spatial coverage and regular revisit frequency of satellite sensors [
15]. In this context, satellite images have already proven to be an appropriate tool to monitor vegetation over large areas with a high temporal resolution.
In the remote sensing literature, grasslands have received relatively little attention compared to other land covers like crops or forest [
16]. Most of the studies focusing on grasslands have agronomic applications, such as estimating biomass productivity and growth rate [
17,
18,
19] or deriving biophysical parameters like the Leaf Area Index (LAI), the Fraction of Photosynthetically Active Radiation (fPAR) and the chlorophyll content [
20,
21,
22,
23,
24]. Studies with biodiversity conservation objectives, such as assessing plant diversity and plant community composition in a grassland, are usually based on ground spectral measurements or airborne acquisitions at a very high spatial resolution [
25,
26,
27,
28,
29,
30,
31]. However, such acquisitions are time consuming and expensive, and thus, they do not allow for continuous monitoring of grasslands over the years.
Using satellite remote sensing images, grasslands have been extensively studied at a regional scale with medium spatial resolution sensors (e.g., MODIS, 250 m/pixel [
17,
18,
32]), where the Minimum Mapping Unit (MMU) is at least several hundred meters. This scale is suitable for large, extensive, homogeneous and contiguous regions like steppes [
33], but not for fragmented landscapes, which are usually found in Europe and in France particularly [
34,
35]. These fragmented landscapes are made of a patchwork of different land covers, which have a small area [
35]. In these types of landscapes, grasslands can be smaller (less than 10,000 m²) than the pixel resolution [
36] (see
Figure 1 for a graphical example). As a consequence, pixels containing grasslands are usually a mixture of other contributions, which can limit the analysis [
37,
38]. As examples, Poças et al. [
39] had to select large contiguous areas of semi-natural grasslands in a mountain region of Portugal to be able to use SPOT-VEGETATION data (1-km resolution). Halabuk et al. [
40] also had to select only one MODIS pixel per homogeneous sample site in Slovakia to detect cutting in hay meadows. A 30-m pixel resolution is still not sufficient for grassland characterization. Indeed, Lucas et al. [
41] and Toivonen and Luoto [
42] showed that it was more difficult to classify fragmented and complex elements [
43], like semi-natural grasslands, than homogeneous habitats, using Landsat imagery. Price et al. [
44] classified six grassland management types in Kansas using six Landsat images, but the classification accuracy was not satisfactory (less than 70%). Therefore, to detect small grasslands in fragmented landscapes, high spatial resolution images are required [
36,
45,
46].
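The scale argument above can be checked with simple arithmetic. The sketch below compares a 10,000 m² grassland with the pixel footprints of the sensors cited in this paragraph. The function name is hypothetical, and the area ratio ignores parcel shape and pixel alignment, so it is only an upper bound on the number of pure pixels.

```python
# Pixel side lengths (in metres) of the sensors cited in the text.
PIXEL_SIZE_M = {
    "SPOT-VEGETATION": 1000,
    "MODIS": 250,
    "Landsat": 30,
    "Sentinel-2": 10,
}

def pixels_per_parcel(parcel_area_m2: float, pixel_size_m: float) -> float:
    """Area ratio between a parcel and one square pixel (hypothetical helper).

    This ignores parcel shape and grid alignment, so it is an optimistic
    bound on how many grassland-only pixels a parcel can contain.
    """
    return parcel_area_m2 / (pixel_size_m ** 2)

parcel_area = 10_000.0  # the 10,000 m² grassland size mentioned in the text
for sensor, size in PIXEL_SIZE_M.items():
    print(f"{sensor:16s} {pixels_per_parcel(parcel_area, size):8.2f} pixels")
# MODIS      -> 0.16 pixel: the grassland covers a fraction of one pixel
# Landsat    -> 11.11 pixels
# Sentinel-2 -> 100.00 pixels
```

Under these assumptions, a grassland of this size is always mixed with its neighbours at MODIS resolution, whereas a 10 m grid leaves room for several grassland-only pixels per parcel.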
For high spatial resolution images (about 10 m/pixel), few intra-annual images are usually available for a given location [
47]. However, Buck et al. [
48] concluded that three RapidEye images per year were not enough to detect the mowing practices in grasslands. This was confirmed by Franke et al. [
49] who classified grassland use intensity into four categories: semi-natural grassland, extensively-used grassland, intensively-used grassland and tilled grassland. Classification accuracy improved when the number of RapidEye images was increased from three to five scenes. Additionally, Schmidt et al. [
50] concluded that about seven to ten images, depending on the vegetation index used, are a good tradeoff between the amount of satellite data and classification accuracy of grassland use intensity. Some works report results with few images per year, such as Dusseux et al. [
51], but they worked on LAI. In their study for mapping grassland habitat using intra-annual RapidEye imagery, Schuster et al. [
52] concluded that the more acquisition dates were used, the better the mapping quality.
Given the heterogeneity of grasslands in fragmented landscapes, their phenological cycle and the punctual nature of anthropogenic events (e.g., mowing), dense high spatial resolution intra-annual time series are necessary to identify the grassland management types [
36,
52,
53,
54]. Moreover, to discriminate semi-natural grasslands from temporary grasslands, inter-annual time series are necessary. Until recently, satellite missions offering high revisit frequency (1–16 days) had coarse spatial resolution (e.g., NOAA AVHRR, 1 km; MODIS, 250/500 m). Conversely, high spatial resolution missions did not provide dense time series and/or were costly (e.g., QuickBird, RapidEye). For these reasons and compared to crops, grasslands’ differentiation through Earth observations is still considered a challenge [
52]. However, new missions like Sentinel-2 [
55], with a very high revisit frequency (five days) and high spatial resolution (10 m in four spectral channels, 20 m in six channels), provide new opportunities for grasslands’ monitoring over the years in fragmented landscapes [
54] at no cost, thanks to the ESA free data access policy. For instance, the high spatial resolution is assumed to make possible the identification of grassland-only pixels in the image, and several pixels can belong to the same grassland plot. Hence, the analysis can be done at the object level, not at the pixel level, which is suitable for landscape ecologists and agronomists who usually study grasslands at the parcel scale [
56]. Thus, object-oriented approaches are more likely to characterize grasslands ecologically [
57,
58]. Yet, many works consider pixel-based approaches without any spatial constraints [
17,
42,
44,
48,
49,
52,
59].
At the object level, grasslands are commonly represented by their mean NDVI [
18]. However, such a representation might be too simple since it does not account for the heterogeneity in a grassland. Sometimes, distributions of pixels as individual observations are still better than the mean value to represent grasslands, as in [
54]. Lucas et al. [
41] used a rule-based method on segmented areas for habitat mapping, but it did not work well on complex and heterogeneous land covers. Esch et al. [
60] also used an object-oriented method on segmented elements then represented by their mean NDVI. These methods based on mean modeling do not capture grasslands’ heterogeneity well. Other representations can be found in the literature, taking the standard deviation and object texture features as variables [
61], but they were not applied to time series. To our knowledge, these methods do not use the high spatial and the high temporal resolutions jointly. Moreover, all of these studies used vegetation indices as a variable, although it has been shown that classification results are better when using more spectral information [
35,
62].
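The limitation of a mean-only representation can be illustrated with a minimal sketch. The reflectance values below are invented for illustration and the helper names are hypothetical. The sketch computes NDVI = (NIR − Red) / (NIR + Red) per pixel and then summarizes each object by its mean and standard deviation.

```python
import numpy as np

def ndvi(red, nir) -> np.ndarray:
    """Per-pixel NDVI; assumes nir + red is never zero."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (nir - red) / (nir + red)

def object_summary(red, nir) -> tuple[float, float]:
    """Return (mean NDVI, standard deviation of NDVI) over one object's pixels."""
    values = ndvi(red, nir)
    return float(values.mean()), float(values.std())

# Two hypothetical grasslands of four pixels each (made-up reflectances).
homogeneous = object_summary(
    red=[0.10, 0.10, 0.10, 0.10],
    nir=[0.30, 0.30, 0.30, 0.30],
)
heterogeneous = object_summary(
    red=[0.20, 0.05, 0.20, 0.05],
    nir=[0.30, 0.45, 0.30, 0.45],
)

print("homogeneous   mean=%.2f std=%.2f" % homogeneous)
print("heterogeneous mean=%.2f std=%.2f" % heterogeneous)
```

Both objects have the same mean NDVI, so a mean-based representation cannot separate them. Only the spread of the pixel values reveals the difference.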
To deal with the high spatio-spectro-temporal resolutions that new satellite sensors now offer, dimension reduction is usually performed through the use of a vegetation index such as NDVI [
50,
52,
63,
64], PCA [
65] or spectro-temporal metrics [
35,
66]. However, a large amount of spectro-temporal information is lost with these solutions. Franke et al. [
49] developed an indicator of the spectral variability of a pixel over the time series, the mean absolute spectral dynamics, but its efficiency was assessed using a decision tree algorithm. Decision trees are usually not recommended because they tend to over-fit the data [
67]. Therefore, the high spatio-spectro-temporal resolutions have not really been addressed in the literature of remote sensing classification. Indeed, such time series bring new methodological and statistical constraints given the high dimension of data (i.e., number of pixels and number of spectral and temporal measurements). Dealing with more variables increases the number of parameters to estimate, increasing the computation time and making the computation unstable (i.e., ill-conditioned covariance matrices, etc.) [
68,
69]. Hence, conventional models are not appropriate if one wants to use all of the spectro-temporal information of time series with high spatial and temporal resolutions. Thus, classifying grasslands with this type of data is still considered a challenge [
52].
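The conditioning problem mentioned above can be reproduced on synthetic data. In the sketch below, the numbers of bands, dates and pixels are arbitrary assumptions rather than values from this study, and random draws stand in for real reflectances. When a parcel has fewer pixels than spectro-temporal variables, its empirical covariance matrix cannot be full rank and therefore cannot be inverted without regularization.

```python
import numpy as np

def gaussian_parameter_count(n_variables: int) -> int:
    """Free parameters of a full Gaussian: mean vector plus symmetric covariance."""
    return n_variables + n_variables * (n_variables + 1) // 2

# Hypothetical setting: 10 spectral bands, 12 acquisition dates, 100 pixels per parcel.
n_bands, n_dates, n_pixels = 10, 12, 100
d = n_bands * n_dates  # 120 spectro-temporal variables per pixel

rng = np.random.default_rng(0)
pixels = rng.normal(size=(n_pixels, d))  # synthetic stand-in for reflectances
cov = np.cov(pixels, rowvar=False)       # (d, d) empirical covariance matrix
rank = np.linalg.matrix_rank(cov)

print("variables per pixel:", d)
print("parameters to estimate:", gaussian_parameter_count(d))  # 7380
print("covariance rank:", rank, "out of", d)  # at most n_pixels - 1 = 99
```

With 100 pixels, the parcel provides far fewer observations than the 7380 parameters of a full Gaussian model, which is one way to see why conventional estimators become unstable in this regime.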
In the present study, we introduce a model suitable for the classification of grasslands using Satellite Image Time Series (SITS) with a high number of spectro-temporal variables (e.g., Sentinel-2 data). Two temporal scales are considered in this work: (i) an inter-annual time series of three years to discriminate old grasslands from young grasslands and (ii) an intra-annual time series to identify the management practices. Note that in this work, the objects are not found from segmentation [
38], but from the existing dataset in a polygon form.
The first contribution of this study is to model a grassland at the object level while accounting for the spectral variability within a grassland. We consider that the distribution of the pixel spectral reflectance in a given grassland can be modeled by a Gaussian distribution. The second contribution is to propose a measure of similarity between two Gaussian distributions that is robust to the high dimension of the data. This method is based on the use of covariance through mean maps. The last contribution is the application of the method to the discrimination of old and young grasslands and to the classification of management practices, which are uncommon applications in remote sensing. Moreover, to our knowledge, mean maps have not yet been used on Gaussian distributions for supervised classification of SITS at the object level.
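The Gaussian assumption of the first contribution can be written compactly. The notation below is introduced here for illustration only and may differ from the notation used in the methods section: grassland i contains n_i pixels, and each pixel x_ik is a vector of d spectro-temporal measurements. The estimators shown are the standard empirical mean and covariance.

```latex
\mathbf{x}_{ik} \sim \mathcal{N}\!\left(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i\right),
\qquad k = 1, \dots, n_i, \quad \mathbf{x}_{ik} \in \mathbb{R}^{d},

\hat{\boldsymbol{\mu}}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} \mathbf{x}_{ik},
\qquad
\hat{\boldsymbol{\Sigma}}_i = \frac{1}{n_i - 1} \sum_{k=1}^{n_i}
\left(\mathbf{x}_{ik} - \hat{\boldsymbol{\mu}}_i\right)
\left(\mathbf{x}_{ik} - \hat{\boldsymbol{\mu}}_i\right)^{\top}
```

Under this model, each grassland is summarized by the pair formed by its mean vector and covariance matrix, and comparing two grasslands amounts to comparing two such Gaussian distributions.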
In the next section, the materials used for the experimental part of this study are presented. Then, the methods, including the different types of grassland modeling and the measures of similarity between distributions, are introduced in
Section 3. Following that, we experiment with the proposed methods on the classification of a real dataset in
Section 4. Finally, conclusions and prospects are given in
Section 5.