1. Introduction
Information on the spatially explicit distribution of ecosystem services (ES) is important for their conservation and management, policy design, implementation and monitoring, fulfillment of reporting processes to national and international mechanisms, as well as for communicating complex information to the public in order to increase awareness and engagement in biodiversity conservation and natural capital protection [
1,
2,
3,
4]. Land-Use/Land-Cover (LULC) maps and thematic products are often used as proxies of ecosystem types, which determine the delivery of ES, and serve as a key input for spatially explicit models of ES supply and demand [
1,
5]. Up-to-date LULC maps with increased thematic resolution and accuracy are highly important for mapping ecosystems and providing proxies for actual ES [
6]. Furthermore, LULC information, when coupled with additional socioeconomic datasets, can also facilitate the transfer of relevant ES information across different spatial and temporal scales [
7,
8,
9].
Remote sensing, providing synoptic, spatially continuous, and consistent observations at multiple scales, can provide timely and accurate spatially explicit information on Earth’s physical surface cover (i.e., land cover) [
10], which is essential for ES related research and practice [
11]. Global open-access land cover products generated from Earth Observation (EO) data, e.g., GlobeLand30 [
12], S2GLC [
13], Copernicus Global Land Cover [
14], as well as continental/regional products provided through the Copernicus land monitoring service, such as the Corine Land Cover [
15], provide information that is potentially relevant to ES mapping and assessment processes at global, national, regional, and local scales. However, such operational land-cover efforts might present limitations related to the use of diverse input data, accuracy variability over different areas and different classes, coarse update intervals and outdatedness in comparison to the real world [
16], coarse thematic resolution and/or class diversity, and coarse minimum mapping units (MMU). The above shortcomings raise the need for timely and on-demand land-cover datasets that could replace or complement available, freely distributed products, providing timely information to the end-users and addressing some, if not all, of the above shortcomings [
17].
The recent availability of medium-high spatial resolution satellite imagery, such as the open-access Landsat-8 and Sentinel-2 (S2) data, can facilitate the development of finer scale, cost-effective land cover mapping datasets at national and supra-national scales [
16,
18,
19], not only from public agencies and official organizations, but also from individual research groups [
20]. This shift, leading to the “democratization” of a broad area mapping process, can also provide land cover information adhering to project-specific nomenclature needs. The improved temporal observation frequency of the aforementioned satellites can also provide enhanced information, related to the phenological differences and consequently intra-annual spectral variability. Such information has been proven to play an important part in land cover discrimination, especially for areas dominated by vegetation [
21]. Freely available Synthetic Aperture Radar (SAR) data, which record physical properties of terrestrial objects, can also be used to complement optical remote-sensing data and enhance land-cover class separation [
22]. SAR can penetrate clouds, so broad-area land-cover mapping can be achieved in areas frequently covered by clouds [
23]. Multi-seasonal analysis is also facilitated with SAR data, especially over LULC domains where the period of observation is crucial (e.g., cropland mapping) [
24]. Accordingly, earlier studies demonstrate that the integration of Sentinel-1 (S1) and Sentinel-2 data is an efficient approach for improving LULC map accuracies in many domains [
25,
26,
27].
Beyond improved data availability during the last decade, the remote sensing community has also witnessed advances in satellite data-processing methods along with rapid developments in computer hardware and software. In terms of data volume handling and computational power, products of national and supra-national scale can be more easily produced with the aid of cloud-computing platforms, which provide access to numerous datasets and algorithms, great computing power, and cloud storage, free of charge for non-commercial use [
28].
Information extraction about Earth’s surface is also facilitated by the availability of analysis-ready data with minimum pre-processing requirements, novel and robust classification algorithms that accommodate complex feature-space relationships [
20], as well as paradigm shifts in remote sensing analysis for the generation of geographic information. In regard to the latter, Object-Based Image Analysis (OBIA), involving delineation and analysis of image-objects instead of individual pixels [
29], is a relatively new analysis paradigm that can help overcome the limitations of traditional per-pixel approaches [
30]. Per-object analysis can minimize within-class spectral variability and classification inaccuracies from pixel artifacts due to spectral differences in atmospheric correction, incorrect cloud masking and misregistration problems [
13]. In addition, regarding broad-area mapping, OBIA can reduce computational complexity [
22], despite the potentially labor-intensive segmentation task. Although OBIA is rarely used in broad-area mapping, it has been used effectively in combination with Sentinel imagery, mostly in small-scale studies, for crop [
31] and urban land-cover mapping [
22]. The upcoming “2nd generation CORINE Land Cover”, whose production has recently started, will be the first object-based pan-European LULC product relying on a high spatial resolution vector product (layer representing objects) called the “CLC+ Backbone”. This layer will be produced by incorporating Sentinel-1,2 and very-high resolution (VHR) imagery segmentation, as well as ancillary data (such as Open Street Map, EU-hydro, etc.) [
32].
Nonetheless, despite the aforementioned advances, many challenges and open questions remain when it comes to the development of end-to-end automated or semi-automated workflows for land-cover information retrieval and wall-to-wall mapping over broad areas [
17,
18,
33]. The synthesis of substantially large multi-sensor and multi-temporal EO datasets, in the form of large data cubes, along with the plethora of processing algorithms and approaches implies great challenges and bottlenecks for broad scale land cover mapping spanning from the selection of adequate, reliable reference data, to the proper synthesis of temporal EO information (i.e., compositing), and appropriate data fusion and feature extraction processes.
The main aim of the study is to develop a classification workflow for the production of a fine-scale land-cover product that will be employed for terrestrial ecosystem condition and ES mapping in Greece, according to the provisions of the European Union (EU) initiative for Mapping and Assessment of Ecosystems and their Services (MAES) as described in the EU Biodiversity Strategy to 2020 [
34,
35]. The complex classification nomenclature includes 21 classes linked to the 3rd level of the MAES ecosystem typology, classified through an object-based approach using 10 m spatial resolution Sentinel-1 and Sentinel-2 data along with additional geospatial information from national and EU’s Copernicus land cover products.
Additional research objectives of the study are:
The evaluation of the relevance of different training data extraction strategies (i.e., manual and automated) to increase the classification accuracy.
The evaluation of seasonal and monthly EO information to increase the classification accuracy.
The assessment of the relative importance of the different features in the classification process, facilitating data dimensionality reduction and computational time and resources optimization.
2. Study Area
The study area is the terrestrial territory of Greece, including nearby coastal areas (2 km buffer). Greece covers an approximate area of 131,957 km², with one of the longest coastlines in Europe (15,021 km) [
19]. Geomorphologically, 78% of Greece’s area is covered by mountains [
36], and the country has a highly diverse landscape. The climate, according to the Köppen–Geiger climate classification, is mainly temperate Mediterranean, with large areas in northern Greece classified as semi-arid and fewer areas, mostly at higher altitudes, classified as humid continental [
37]. The combination of landscape and climate conditions together with Greece’s biological assets has led to a large variety of ecosystem types, with forest habitats being the predominant habitat group, covering 37% of the natural habitats in Greece [
38].
The only currently-available, nation-wide, LULC information is the Corine Land Cover (CLC) dataset with a spatial resolution of 25 ha, while, in regard to ecosystem types, the new enhanced Ecosystem Type Map of Europe (v. 3.1) [
39] with a spatial resolution of 1 ha is now available. The new version of the ecosystem map represents the terrestrial European Nature Information System (EUNIS) level 2 habitat classes, for the European Environment Agency (EEA) EEA-39 members. It was based on the 2012 CLC dataset, with further refinement with other products such as the High Resolution Layers (HRLs) (i.e., area of soil sealing, forest cover, grassland and wetness index), Urban Atlas, riparian zones, Natura 2000 (N2K), and the latest version of OpenStreetMap (OSM) [
40].
4. Results
4.1. Classification Schemes
With respect to the achieved accuracy levels under different training data sampling approaches (
Table 4), the manual sampling procedures presented the highest overall accuracy (77.33%–79.55%) for all feature sets evaluated.
On the other hand, the automation of the training sample extraction procedure resulted in the lowest overall accuracies (74.89%–75.64%) for all three feature/composite sets.
Regarding the different feature sets evaluated, the highest accuracies were achieved when the monthly features were employed, with both the manual (M-M-R) and automated (A-M-R) training sample approaches (overall accuracy 79.55% and 75.64%, respectively). On the other hand, the use of all features, including the original S2 bands, through seasonal composites presented the least satisfactory results for both manual (M-S-F) and automated (A-S-F) training (overall accuracy 77.33% and 74.89%, respectively).
The highest accuracy was noted in the classification scheme considering the monthly indices set and the manually identified training samples (M-M-R) (
Figure 2).
The classification accuracies of individual classes (PAs and UAs) are provided in
Table 4. The manual training sample extraction provides a better balance of commission and omission errors, presenting a lower mean absolute difference between producer’s and user’s accuracy across all three feature sets (7.84%) than the automated approach (13.85%). Again, the classification experiment relying on the monthly indices feature set and visually identified training samples (M-M-R) presented the lowest mean absolute difference (7.43%) between PA and UA.
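The accuracy measures discussed above can be illustrated with a minimal sketch. The function names and the 3-class matrix below are hypothetical and are not taken from Table 4; the sketch assumes that rows hold reference classes and columns hold predicted classes.

```python
def accuracy_measures(matrix):
    """Return overall accuracy and per-class producer's/user's accuracy.

    Assumes matrix[i][j] counts objects of reference class i
    that were predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(row) for row in matrix]  # reference totals
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # predicted totals
    # Producer's accuracy relates to omission error, user's accuracy to commission error
    pa = [matrix[i][i] / row_sums[i] if row_sums[i] else 0.0 for i in range(n)]
    ua = [matrix[i][i] / col_sums[i] if col_sums[i] else 0.0 for i in range(n)]
    return correct / total, pa, ua


def mean_abs_pa_ua_difference(pa, ua):
    """Mean absolute difference between PA and UA (lower = better balanced errors)."""
    return sum(abs(p - u) for p, u in zip(pa, ua)) / len(pa)


# Hypothetical 3-class confusion matrix, for illustration only
toy = [
    [40, 5, 5],
    [10, 30, 10],
    [0, 5, 45],
]
oa, pa, ua = accuracy_measures(toy)
balance = mean_abs_pa_ua_difference(pa, ua)
```

A low value of `balance` indicates that omission and commission errors are of similar magnitude across classes, which is the property compared between the manual and automated sampling approaches.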
The manual training sample extraction also addressed the large differences in omission/commission errors evident in the discrimination of the objects covered by urban fabric (1.1.1 and 1.1.2), agricultural areas (2.1.1 and 2.2.1), bare rocks (6.3.1), and peat bogs (7.2.1). However, a negative effect on error balance was evident for the Mediterranean sclerophyllous forest (3.4.1), mixed forest (3.5.1), and moors and heathland (5.1.1) classes.
The ICSI also indicated that the manual training sample extraction, compared to the automated one, lowered the classification result in 10 out of the 21 classes, predominantly in the Mediterranean sclerophyllous forest class (3.4.1) by 19% and inland freshwater and saline marshes (7.1.1) by 15%, when relying on the full feature set (M-S-F). However, when the monthly feature set was employed (M-M-R), the manual training sample extraction improved the ICSI in 13 out of the 21 classes.
The use of the monthly instead of the seasonal indices feature set, when considering the manual training samples (i.e., M-M-R instead of M-S-R), also improved the ICSI in 13 out of the 21 classes while the improvement was even higher when compared to the full seasonal feature (M-S-F) set (16 out of the 21 classes).
Nevertheless, in all classification schemes, moderate to low accuracies in terms of the ICSI (lower than 30%) were attained for the sclerophyllous vegetation (5.2.1), moors and heathland (5.1.1), and Mediterranean sclerophyllous forest (3.4.1).
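As a reminder of how such per-class values arise, the sketch below assumes the common definition of the Individual Classification Success Index, ICSI = PA + UA − 100 (with both accuracies in percent), i.e., 100 minus the sum of omission and commission errors. The definition is an assumption here, since it is not restated in this section, and the class names and accuracy values are hypothetical.

```python
def icsi(pa_percent, ua_percent):
    """ICSI in percent, assuming ICSI = PA + UA - 100.

    Equivalent to 100 - (omission error + commission error).
    """
    return pa_percent + ua_percent - 100.0


# Hypothetical (PA, UA) pairs in percent, for illustration only
classes = {
    "class A": (85.0, 90.0),
    "class B": (60.0, 65.0),
}
scores = {name: icsi(pa, ua) for name, (pa, ua) in classes.items()}

# Flag classes falling under the 30% level mentioned in the text
low = [name for name, score in scores.items() if score < 30.0]
```

Under this definition, a class needs both high PA and high UA to score well, so a class with moderate accuracies of around 60–65% already falls below the 30% level.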
Classes exhibiting the highest accuracies, while remaining stable between the two sampling techniques, were temperate mountainous coniferous forests (3.3.1), Mediterranean coniferous forests (3.3.2), grasslands (4.1.1), marine areas (7.3.1), and rivers and lakes (8.1.1).
Table 5 and
Table 6 depict the confusion matrices of the best (M-M-R) and worst (A-S-F) performing classifications, respectively. In the A-S-F classification experiment (
Table 6), low density urban fabric (1.1.2) had high confusion with dense urban fabric (1.1.1), bare area (6.3.1) was frequently confused with dense urban fabric (1.1.1), and peat bogs (7.2.1) exhibited high confusion with inland freshwater and saline marshes (7.1.1). All these misclassifications were minimized with the manual sampling and monthly feature approach (M-M-R) (
Table 5).
In addition, one can observe the confusion between the classes which had low accuracies in all classification experiments. Mediterranean sclerophyllous forest (3.4.1) is often classified as sclerophyllous vegetation (5.2.1) and vice versa, mixed forest (3.5.1) as temperate deciduous forests (3.1.1) or temperate mountainous coniferous forests (3.3.1), and last, moors and heathland (5.1.1) is confused with grasslands (4.1.1) and sparsely vegetated areas (6.1.1).
4.2. Visual Assessment of the Results
Landscape complexity in large-scale LULC mapping generates errors that are not always evident in quantitative classification assessments based on the confusion matrix and derived standard accuracy measures. Visual assessments are therefore essential for gaining insight into the final product accuracy and for revealing errors that were not highlighted by the accuracy measures [
33].
In general, the visual comparison of classifications revealed minor differences between different feature sets, but considerable differences for classifications of different sampling techniques.
In particular, the automated sampling classifications resulted in an overestimation of mixed forests (3.5.1) and Mediterranean coniferous forests (3.3.2) (
Figure 3a–c). On the contrary, low density urban fabric (1.1.2) was underestimated, an observation also consistent with the PA and UA of the automated sampling classification (
Table 4). As can be seen in the image subsets included in the second row of
Figure 3, areas surrounding the dense urban fabric are classified as permanent crops (2.2.1) in the A-M-R experiment but as low density urban fabric in the M-M-R approach. This effect is mainly caused by the automated LPIS samples containing greenhouses, which mix the crop spectra with artificial materials.
Manual sampling classifications, although reaching higher accuracy values, also demonstrated problems in specific areas. In particular, arable land (2.1.1) was frequently classified as floodplain forests (3.2.1), an effect noticed mostly in high-moisture agricultural areas such as river estuaries. It can be explained by the additional manual floodplain samples collected to resolve the spectral confusion with riverbanks observed in the automated sampling (
Figure 3g–i).
Identification of riverbank areas was challenging in both automated and manual sampling approaches, with these areas being classified as either dense urban fabric (1.1.1) or floodplain forests (3.2.1). In
Figure 3a–c the riverbank of the river crossing the forest in the south is mistakenly classified as floodplain forests in the manual sampling classification and as dense urban fabric in the automated sampling classification. Conversely, in
Figure 3d–f one can see the riverbank in the west, mistakenly classified as dense urban fabric in the manual sampling classification and as floodplain forests in the automated sampling classification. These errors can be explained by the large temporal variations of moisture in riverbanks, throughout seasons and between years, caused by differences in precipitation.
Regarding the visual inspection between the different feature set classifications, a noticeable effect was the classification of arable land areas (2.1.1) as inland freshwater and saline marshes (7.1.1), evident in several high-moisture agriculture areas (
Figure 4). This effect was minimized with the use of seasonal features, excluding the S2 L2A bands (M-S-R and A-S-R classifications).
Finally, besides the errors produced by the classifiers, artefacts created by the preprocessing steps were also observed. These errors had a negligible effect on the overall accuracy of the different classification schemes; however, they are noticeable at the local level. The cloud masking using the Cloud Probability and Scene Classification of S2 L2A images resulted in no-data values over artificial surfaces, due to the constant misclassification of artificial areas in the Scene Classification product and therefore constant masking in the compositing process. In addition, some mountain areas, which were probably covered by clouds and snow most of the time, also had no-data values due to constant masking in the compositing process.
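The mechanism behind these no-data artefacts can be illustrated with a simplified sketch. It does not use the GEE API; it assumes a per-pixel median composite over a time series in which masked observations are skipped, and all names and values are hypothetical.

```python
from statistics import median

NO_DATA = None  # placeholder for pixels with no valid observation


def masked_median_composite(series, masks):
    """Per-pixel median over dates, ignoring masked observations.

    series: list of images, each a list of pixel values (one image per date)
    masks:  same shape; True means the pixel was masked (e.g., flagged as cloud)
    """
    n_pixels = len(series[0])
    composite = []
    for p in range(n_pixels):
        valid = [img[p] for img, m in zip(series, masks) if not m[p]]
        # A pixel masked on every date has nothing left to composite
        composite.append(median(valid) if valid else NO_DATA)
    return composite


# Three dates, three pixels; the last pixel is masked on every date,
# mimicking a bright artificial surface constantly flagged as cloud
series = [
    [0.10, 0.30, 0.80],
    [0.12, 0.90, 0.82],
    [0.14, 0.32, 0.78],
]
masks = [
    [False, False, True],
    [False, True, True],
    [False, False, True],
]
composite = masked_median_composite(series, masks)
```

In this toy example the occasionally masked second pixel still receives a composite value, whereas the constantly masked third pixel ends up as no-data, regardless of the length of the time series.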
4.3. Variable Importance
Variable importance was measured in GEE by the OOB error increase and the Gini Index decrease, when permuting one of the input random variables while keeping the rest constant. The 20 most important predictor features are presented in
Figure 5, averaged over the two sampling methods, for each feature set. In all feature sets, the per-object feature mean values, rather than the standard deviations, were among the most important features. Moreover, in all cases, the two most important variables were topographic features, namely the mean values of elevation and slope, followed by TRASP in the third and fourth position. Object properties, specifically area, form factor, and fractal dimension, were also among the 20 most important variables in all cases. The SAR VV/VH ratio was among the top 10 features in all feature combinations, unlike the SAR bands, which ranked much lower in the full feature lists and therefore do not appear in the figure. Regarding the original S2 spectral bands, only the first Short-Wave InfraRed (SWIR) band (B11) was found to be an important feature, while with respect to texture features, only the PANTEX index was found important, though only for seasonal composites. According to the variable importance output, the most informative spectral indices in all classifications were BCI, GRVI, MSI, the TC components, reNDVI, and NDRBI. Finally, features from all the available months and seasons were selected as optimal in all six classification experiments.
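The permutation idea underlying this type of importance measure can be sketched as follows. This is a simplified illustration and not the GEE implementation: it measures the accuracy drop on a fixed sample set rather than on the out-of-bag samples of a random forest, and the toy model, feature values, and function names are all hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of samples for which the model predicts the correct label."""
    hits = sum(1 for x, y in zip(rows, labels) if model(x) == y)
    return hits / len(labels)


def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Mean accuracy drop after shuffling one feature while keeping the rest fixed."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for f in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [x[f] for x in rows]
            rng.shuffle(column)  # break the link between feature f and the labels
            permuted = [x[:f] + [v] + x[f + 1:] for x, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical objects: [mean elevation (m), uninformative feature]
rows = [[100, 5], [150, 1], [200, 7], [250, 3],
        [900, 6], [1100, 2], [1300, 8], [1500, 4]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def toy_model(x):
    """Stand-in for a trained classifier; it only looks at elevation."""
    return 1 if x[0] > 500 else 0


importances = permutation_importance(toy_model, rows, labels)
```

Because the toy model ignores the second feature, shuffling it leaves the predictions unchanged and its importance is exactly zero, whereas shuffling elevation can only lower the accuracy. Ranking features by such scores is what allows uninformative features to be dropped to reduce data dimensionality.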
6. Conclusions
In this paper, different classification schemes were evaluated for fine-scale land-cover mapping at a national scale, following a detailed (21 classes) classification nomenclature.
Our approach integrates cloud-based analysis, a machine learning classification algorithm and freely available multi-sensor and multi-seasonal earth observation data, in order to develop a cost-effective, country-level, land-cover map that will be subsequently used for ecosystem condition and ES mapping and assessment, i.e., for the MAES implementation in Greece, at the national scale.
We have analyzed different approaches considering different feature sets and training sample extraction methods. The random forest classification algorithm is assumed to be relatively robust to reference data inaccuracies; however, our results also support earlier research findings that the refinement of the automatically selected training samples leads to higher classification accuracies.
The processing chain of the different classification schemes evaluated adopts an object-based perspective instead of the traditional per-pixel classification paradigm. This approach not only facilitates integration of earth observation data from sensors with different characteristics, such as the active Sentinel-1 and passive Sentinel-2 satellite images used in our study, but also addresses challenges related to excessive computational and time demands, an issue that must be addressed even in cloud-computing environments.
Assessment of variable importance in all classification schemes highlighted that EO data alone are not adequate for predicting and mapping the complex Mediterranean landscape, especially when it comes to mapping broad, diverse areas. Incorporation of auxiliary data related to ecological factors seemed to have the highest impact on the classification process, indicating that this information is necessary for discriminating ambiguous habitat patches.
Seasonal and monthly spectral information features were also very important, quantifying phenological differences among map classes. In the future, additional spectral-temporal features, capturing habitat phenology, should be evaluated for further improving classification accuracy.
The present study acts as a roadmap for the further development of an end-to-end, automated workflow, for annual land-cover mapping in Greece, providing up-to-date estimates of ecosystem extent, condition, ES supply (or potential supply), and ES bundles, nationwide. The resulting map will be available to the public through the webGIS portal of the LIFE-IP 4 Natura project (currently under development).