1. Introduction
Population growth, urban sprawl and changes in land cover and land use have increased the need for systematic and accurate land cover and land use information [
1]. Highly accurate information on land cover and land use is essential for decision makers, urban planners [
2], mapping of ecosystem services [
3], deforestation analysis [
4], detection of land cover changes [
5], and many others. Satellite imagery is recognized as one of the most important data sources for land cover mapping [
6,
7] and for monitoring the dynamics of land cover changes at local, regional, national and global scales [
8,
9,
10,
11].
Since the launch of the Landsat satellite mission, image processing and analysis have developed rapidly [
12]. This has driven the development of land cover classification methods, which have evolved substantially over the last four decades [
13]. Several studies successfully applied a set of Landsat data for land cover analysis in urban environments [
13], forestry [
14,
15,
16,
17], agriculture [
18] or wetlands [
13].
Sentinel-2 data are now increasingly used for land cover classification [
19]. This is due to the shorter revisit time, higher spatial resolution (10 m compared to 30 m for Landsat) and wider swath (290 km compared to 185 km for Landsat) [
20]. Additionally, Sentinel-2 offers three red edge bands that are particularly useful for vegetation classification [
21,
22]. Red edge bands of Sentinel-2 data have been recognized as important variables, for example, for forest type mapping using a random forest classifier [
15,
16,
23]. Forkuor et al. [
23] demonstrated that adding the red edge bands of Sentinel-2 to the Landsat-8 bands increased the overall accuracy (OA) of land cover classification from 0.90 to 0.92 and the Kappa coefficient from 0.88 to 0.90.
A review of the application of Sentinel-2 in land cover mapping by Phiri et al. [
24] confirmed the advantage of Sentinel-2 data over Landsat data for land cover mapping, especially in the urban domain, crop field mapping, and forest and water resources monitoring. The authors stressed, in particular, the advantage of machine learning algorithms applied in the classification process. They concluded, based on a literature review, that land cover classification of Sentinel-2 data using machine learning provides high accuracies, above 0.80. Furthermore, the high temporal and spatial resolution and wide swath of the Sentinel-2 mission allow rapid changes in ecosystems and human activities to be observed and monitored at the global scale in high detail. WorldCover is one of the latest global land cover products, derived by the European Space Agency (ESA) from a synergy of Sentinel-1 and Sentinel-2 data acquired in 2020 [
25]. It consists of eleven land cover classes mapped at a 10 m spatial resolution. The OA of this global product is equal to 0.74. The highest user’s accuracy (UA) and producer’s accuracy (PA) were obtained for the tree cover, snow/ice, permanent water bodies, and bare/sparse vegetation classes, reaching more than 0.80. The most problematic classes, with the lowest UA and PA values of around 0.50, are the wetlands, shrubs, and moss/lichen classes [
25]. Another example of the global land cover map derived based on Sentinel-2 data is the ESRI 2020 Land Cover map [
26]. The OA of this map was equal to 0.86. The map was produced using a deep learning model trained with over 5 billion hand-labelled Sentinel-2 pixels from 20,000 sites all over the world. Out of 10 land cover classes, the highest UA and PA, above 0.82, were achieved for the classes representing water, trees and built-up areas. Classes such as grass, flooded vegetation, shrubs and bare ground achieved the lowest accuracy, between 0.38 and 0.73, indicating that these classes are the most difficult to delineate [
26].
Sentinel-2 data have also been used to derive pan-European land cover maps. One example is the Land Cover Map for Europe 2017 developed by the Space Research Centre of the Polish Academy of Sciences as a result of the S2 Global Land Cover project [
27]. The authors used the Random Forest method to classify 13 land cover classes with an OA of 0.86. The water bodies, coniferous and broadleaved tree cover classes were the most accurately classified, with PA and UA of 0.95–0.97. The lowest UA and PA values (0.10 to 0.50) were obtained for the permanent snow, marshes, moors and heathland classes [
27]. A Sentinel-2 land cover product over Europe will shortly be delivered as part of the Copernicus Land Monitoring Service (CLMS). The European Environment Agency (EEA), responsible for the implementation of CLMS, has started to develop a new series of land cover products called CLC+, supported by the EAGLE data model, which is also hierarchical [
28]. The new generation of the CLC+ product will contain the CLC+ Backbone, CLC+ Core and CLC+ instances components [
29]. The CLC+ Backbone component will provide a detailed wall-to-wall vector and raster land cover layer, with 12 basic land cover classes for the raster products derived from the classification of Sentinel-2 data.
In the last decade, Earth Observation technologies have developed rapidly, both in terms of new sensors and in terms of advanced methods of data processing and analysis. The development of artificial intelligence and machine learning algorithms, together with access to high-performance computing facilities, has reduced processing time and allows more efficient processing and analysis of Big Data. The most popular machine learning algorithms applied for land cover analyses are Random Forest (RF) [
30], Support Vector Machines (SVM) [
31] and Convolutional Neural Networks (CNN) [
32]. Several studies have focused on selection of the best algorithms for land cover classification. Jamali [
33] evaluated and compared eight machine learning models: RF, Decision Table, DTNB, J48, Lazy IBK, Multilayer Perceptron (MLPC), Non-Nested Generalized Exemplars (NNge), and Simple Logistic for land use/land cover mapping over the Sari region (Iran) using Landsat 8 OLI. The author identified NNge, followed by Lazy IBK, RF and MLPC, as the best algorithms for accurate land cover mapping [
33]. Noi et al. [
34] compared RF, SVM and k-nearest neighbour (kNN) algorithms for land cover classification based on Sentinel-2 data in the north of the Red River Delta in Vietnam. All three algorithms gave comparable OA values: kNN 0.94, RF 0.94 and SVM 0.95 [
34]. Amani et al. [
35] reported that the RF classifier reached higher accuracy than SVM, kNN, Decision Tree (DT), Maximum Likelihood (ML) in the classification of wetland areas in Canada, based on Landsat-8, Sentinel-1 and elevation data. By contrast, a study by Dabija et al. [
36] showed that SVM reached a higher OA than RF (0.86 and 0.80, respectively) in the land cover classification of the Braila, Catalonia and Warsaw regions. Inglada et al. [
37] compared RF, SVM, decision tree and stochastic gradient boosting (SGB) algorithms in crop mapping in 12 regions located in 12 countries. They demonstrated that RF reached higher accuracy than SVM for eight regions (a difference of 5 to 20 percentage points); in the remaining four regions, the accuracy was at the same level. The superiority of the RF algorithm over other classifiers for land cover classification was also confirmed by Adam et al. [
38]. The classification accuracy depends on many factors, such as the number of reference samples [
39], number of land cover classes [
36] or the satellite data used. The RF classifier is less sensitive to tuning parameters than other algorithms [
22]. It allows combination of different variable types, for example categorical and continuous data [
40]. Rodriguez-Galiano et al. [
41] highlighted the following advantages of an RF classifier in land cover mapping: (a) it works well on large datasets, (b) can handle thousands of inputs, (c) estimates the variable importance in the classification process, and (d) generates an internal objective generalization error. Additionally, the RF is relatively robust to noise and outliers, and it is computationally lighter and less time consuming than other machine learning approaches.
Mapping land cover with Sentinel-2 data using an RF algorithm is popular across different scales and locations. Usually, the classification is carried out in a standard flat approach, where all land cover classes are classified together at the same time. The OA is provided as one value for the final product and, even when it is relatively high, the final product is not always of high quality given the spatial variability of different land cover classes. Despite the long heritage of land cover classification using remote sensing techniques, there are still classes that are challenging to separate [
42]. Nguyen et al. [
39] classified Sentinel-2 images into 11 land cover classes in Dak Nong Province (Vietnam) and identified the classes representing plantations, croplands and residential areas as the most problematic and difficult to separate (PA ranged from 0.33 to 0.46) [
39]. A study by Ghorbanian et al. [
43] stressed that the wetland class is the most challenging class to delineate using Sentinel-2 imagery in Iran. Among the 13 land cover classes, wetlands reached the lowest PA and UA values of 0.89 and 0.86, respectively [
43]. Similar results for wetland classes were obtained in the study by Whyte et al. [
44] conducted in South Africa applying the RF method to Sentinel-2 data. The authors classified 15 land cover classes. Three wetland classes achieved the lowest values of UA and PA of 0.60 and 0.66, respectively. Of interest, the lowest UA and PA were also obtained for bare soil (0.62–0.68) and shrub (0.65–0.73) classes [
44].
Given the limitations of the flat classification approach and the challenges in accurately mapping problematic classes, several studies have tested the hierarchical approach. Hierarchical classification divides classes into groups and classifies them in a tree structure. The hierarchical approach was tested by Bobáľová et al. [
45] in urban land cover mapping in six European cities (Zakopane, Bratislava, Nitra, Žilina, Kaposvár and Orosháza) using an RF classifier and Sentinel-2 data. Firstly, the authors classified seven land cover classes: forests, scattered trees, shrublands, grasslands, croplands, urban fabric and water. Secondly, the classification of detailed urban land cover classes (green grass, dry grass, trees/shrubs, non-vegetated and shadow) within the urban fabric mask from the first level was performed. The OA reached values of 0.78–0.90 in each tested area at the first hierarchical level, and 0.76–0.89 at the second level. They concluded that classes such as green grass and trees/shrubs, as well as dry grass and non-vegetated, proved to be the most susceptible to errors due to unclear texture and similar spectral reflectance. Rahdari et al. [
46] carried out a hybrid land cover classification with a hierarchical approach based on Landsat TM and OLI data for the years 1998 and 2016. The authors performed the classification in three steps: (1) dense and sparse vegetation, (2) vegetation located on slopes higher or lower, and (3) other land cover classes (drainage agriculture, rain fed agriculture, dense rangeland, sparse rangeland, forest and rangeland, water body and residential area) using the Fisher method [
47]. Finally, all individual classes were combined into one land cover map reaching the OA value of 0.84 for the year 1998 and 0.91 for 2016. Avci et al. [
48] performed a land cover analysis in Istanbul (Turkey) based on Landsat TM and showed that the hierarchical approach gave better results and higher accuracy than the flat approach. By applying the hierarchical classification, the OA increased from 0.47 to 0.91 compared to the flat classification. The lowest accuracy in the flat approach was achieved for the grasslands, roads and coniferous forest classes. The hierarchical approach improved the accuracy of these classes by 60, 65 and 90 percentage points, respectively. The hierarchical approach was also applied by Demirkan et al. [
49] for RF classification of land cover over regions in Ankara and Izmir (Turkey), based on Sentinel-2 data. The authors found that, by applying the hierarchical method, the classification accuracy increased by between 4 and 10 percentage points compared to the non-hierarchical method, reaching an OA of 0.82–0.85. Pena et al. [
50] also compared flat and hierarchical approaches for crop type classification in California based on ASTER images and found that the hierarchical SVM model performed better than the standard flat classification. The results showed that the OA increased from 0.72 for the flat classification to 0.86 for the hierarchical approach. The superiority of the hierarchical method over the flat one was also observed by Haest et al. [
51] in habitat mapping in Belgium and by Hoscilo and Lewandowska [
52] in forest type mapping and delineation of dominant tree species.
The main aims of this study are (a) to examine whether the hierarchical classification of land cover types can give more accurate results than the standard flat classification approach, (b) to study what the advantages and disadvantages of both approaches are, and (c) to assess the stability of the different Random Forest classification models run on a set of Sentinel-2 data.
3. Results
3.1. Flat Classification Accuracy
The flat classification for individual granules reached OA between 0.89 and 0.93, the Kappa coefficient 0.81–0.86, and the F1 score 0.74–0.81. The UA and PA of the individual land cover classes are shown in
Table 4.
The highest UA values were achieved for woodland coniferous (0.94 to 0.99) and periodically herbaceous (0.94 to 0.96). The PA for these classes ranged from 0.92 to 0.98 and from 0.90 to 0.94, respectively. The lowest accuracy was obtained for the shrubs, mosses and non-vegetated (bare soil) classes. Interestingly, the shrubs class had the most varied accuracy values, which is likely related to the spatial distribution of orchards that were classified as shrubs. A large area of orchards is located in the north-eastern part of the study area. A large spread of UA and PA values was also observed for the mosses class (0.32–0.67 and 0.55–0.79, respectively) and for the non-vegetated (bare soil) class (0.18–0.76 and 0.50–0.89, respectively).
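As a reminder of how the accuracy measures reported here relate to one another, the following minimal sketch derives OA, Kappa, UA, PA and F1 from a confusion matrix. The matrix values, the function name `accuracy_metrics` and the layout (rows are mapped classes, columns are reference classes) are illustrative assumptions, not the study's data or code. F1 is averaged over classes (macro F1), which may differ from the averaging used in this study.

```python
from typing import Dict, List


def accuracy_metrics(matrix: List[List[int]]) -> Dict[str, object]:
    """Compute OA, Kappa, per-class UA/PA and macro F1 from a confusion matrix.

    Assumed layout: matrix[i][j] counts samples mapped as class i whose
    reference label is class j (rows = map, columns = reference).
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)]
    diagonal = [matrix[k][k] for k in range(n_classes)]

    # Overall accuracy: share of correctly classified samples.
    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    # User's accuracy divides by the mapped totals, producer's accuracy
    # by the reference totals.
    ua = [d / r if r else 0.0 for d, r in zip(diagonal, row_sums)]
    pa = [d / c if c else 0.0 for d, c in zip(diagonal, col_sums)]
    f1 = [2 * u * p / (u + p) if (u + p) else 0.0 for u, p in zip(ua, pa)]

    return {"OA": overall, "Kappa": kappa, "UA": ua, "PA": pa,
            "F1": sum(f1) / n_classes}


# Hypothetical three-class confusion matrix, for illustration only.
example = [[48, 6, 1],
           [2, 40, 3],
           [0, 4, 46]]
metrics = accuracy_metrics(example)
print(metrics)
```

Under this layout, a class can have a high PA and a lower UA at the same time: the map captures most of the reference samples of that class but also assigns other classes to it.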
Figure 3 presents the result of the model stability of the flat classification performed for one Sentinel-2 granule.
The median OA was 0.90, the median Kappa coefficient 0.81 and the median F1 score 0.78. The variability of these values is rather low; however, outliers were observed for the OA and Kappa values.
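A stability summary of this kind can be sketched as follows. The OA values are invented for illustration, the function name `stability_summary` is hypothetical, and the outlier criterion (1.5 times the interquartile range beyond the quartiles, as in a standard boxplot) is an assumption, since the rule used for Figure 3 is not stated here.

```python
import statistics
from typing import Dict, List


def stability_summary(scores: List[float]) -> Dict[str, object]:
    """Summarise accuracy scores from repeated model runs.

    Returns the median, the interquartile range (IQR) and the values that
    fall outside the usual boxplot whiskers (1.5 * IQR beyond the quartiles).
    """
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [s for s in scores if s < low or s > high]
    return {"median": median, "iqr": iqr, "outliers": outliers}


# Hypothetical OA values from ten repeated runs of one model.
oa_runs = [0.90, 0.90, 0.91, 0.89, 0.90, 0.91, 0.90, 0.89, 0.90, 0.84]
summary = stability_summary(oa_runs)
print(summary)
```

A small IQR together with a few flagged values corresponds to the pattern described above: low variability overall, with occasional outlying runs.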
3.2. Hierarchical Classification Accuracy
The accuracy of the hierarchical classification was assessed at Levels 1 and 2. The results of the accuracy assessment are presented in
Table 5. The OA values for all land cover classes were greater than 0.92. The separation of land cover classes at Level 1 reached an OA of above 0.95, a Kappa coefficient above 0.70 and an F1 larger than 0.85. The lowest F1 score (0.85–0.90) was achieved for the delineation of vegetation/non-vegetated classes, whereas the highest F1 score was above 0.96 for the non-water/water classes.
Slightly lower accuracies were achieved for the classification of vegetation/non-vegetated, where the OA reached values of 0.97–0.98, the Kappa 0.70–0.79 and F1 0.85–0.90. The classification of woody/non-woody cover showed the largest spread of OA values, up to four percentage points among the six granules.
At Level 2, the highest accuracy was achieved for the classification of coniferous woodland, broadleaved woodland and shrubs carried out within the mask of woody cover from Level 1. The OA of this classification ranged between 0.94 and 0.99, Kappa 0.86–0.97 and F1 0.88–0.99. The classification of sealed surfaces/non-vegetated (bare soil) classes within the non-vegetated mask from Level 1 achieved the lowest OA values of 0.92–0.97, with Kappa ranging from 0.56 to 0.85 and F1 from 0.78 to 0.92. The accuracy of the classification of the non-woody cover class (Level 1) into the permanent herbaceous, periodically herbaceous and mosses classes at Level 2 was comparable to the results of the sealed surfaces/non-vegetated (bare soil) classification. However, the spread of values was lower for the non-woody classes (6–11 percentage points) than for the sealed surfaces/non-vegetated (bare soil) classes (5–29 percentage points).
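The structure of the two-level scheme described above can be sketched as a small routing table in which each classification step hands its output to the next step within the corresponding mask. This is only an illustration of the hierarchy: the step names, the feature names and the prototype values are invented, and the toy single-feature models are stand-ins for the trained RF models, not the method used in this study.

```python
from typing import Callable, Dict, List, Optional, Tuple

Sample = Dict[str, float]
Model = Callable[[Sample], str]

# Each key is one classification step. Each value maps that step's output
# labels to the next step run inside that label's mask, or to None when the
# label is a final land cover class. Step names are hypothetical.
TREE: Dict[str, Dict[str, Optional[str]]] = {
    # Level 1
    "water_mask": {"water": None, "non-water": "vegetation_mask"},
    "vegetation_mask": {"vegetation": "woody_mask",
                        "non-vegetated": "non_vegetated_types"},
    "woody_mask": {"woody": "woody_types", "non-woody": "non_woody_types"},
    # Level 2
    "woody_types": {"woodland coniferous": None,
                    "woodland broadleaved": None, "shrubs": None},
    "non_woody_types": {"permanent herbaceous": None,
                        "periodically herbaceous": None, "mosses": None},
    "non_vegetated_types": {"sealed surfaces": None,
                            "non-vegetated (bare soil)": None},
}


def make_toy_model(feature: str, prototypes: Dict[str, float]) -> Model:
    """Toy stand-in for a trained classifier: return the label whose
    prototype value is closest to the sample's value of one feature."""
    def model(sample: Sample) -> str:
        value = sample[feature]
        return min(prototypes, key=lambda label: abs(value - prototypes[label]))
    return model


def classify(sample: Sample, models: Dict[str, Model],
             step: Optional[str] = "water_mask") -> Tuple[str, List[str]]:
    """Walk the tree from the first step; return (final class, label path)."""
    path: List[str] = []
    while step is not None:
        label = models[step](sample)
        if label not in TREE[step]:
            raise ValueError(f"step {step!r} returned unknown label {label!r}")
        path.append(label)
        step = TREE[step][label]
    return path[-1], path


# Invented feature names and prototype values, for illustration only.
MODELS: Dict[str, Model] = {
    "water_mask": make_toy_model("ndwi", {"water": 0.5, "non-water": -0.3}),
    "vegetation_mask": make_toy_model(
        "ndvi", {"vegetation": 0.7, "non-vegetated": 0.1}),
    "woody_mask": make_toy_model("texture", {"woody": 0.8, "non-woody": 0.2}),
    "woody_types": make_toy_model(
        "red_edge", {"woodland coniferous": 0.2,
                     "woodland broadleaved": 0.5, "shrubs": 0.8}),
    "non_woody_types": make_toy_model(
        "ndvi_range", {"permanent herbaceous": 0.2,
                       "periodically herbaceous": 0.7, "mosses": 0.4}),
    "non_vegetated_types": make_toy_model(
        "brightness", {"sealed surfaces": 0.3,
                       "non-vegetated (bare soil)": 0.6}),
}

forest_pixel = {"ndwi": -0.4, "ndvi": 0.8, "texture": 0.9, "red_edge": 0.15}
print(classify(forest_pixel, MODELS))
```

The table contains six steps, three at Level 1 and three at Level 2, and every sample passes through at most four of them. An error made at an early step cannot be corrected later, because the sample is never presented to the models in the other branch.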
Figure 4 presents the variability of the OA, Kappa and F1 values for the two levels of hierarchical classification. At Level 1, the OA reached values above 0.97 for three classifications. The highest values of OA, Kappa and F1 score (0.99, 0.98 and 0.98, respectively) were obtained for the classification of non-water/water bodies. This classification model was shown to be the most stable and its classes the most spectrally homogeneous. The classification of vegetation/non-vegetated classes within the mask of the non-water class showed high values of OA (0.97), Kappa (0.75) and F1 (0.87). Similarly, a high OA value (0.94) was achieved in the classification of woody/non-woody classes within the mask of the vegetated class. Here, the Kappa and F1 reached values of 0.92 and 0.96, respectively. Slightly lower accuracy was achieved for the classifications at Level 2.
The classification of coniferous and broadleaved woodland and shrubs, within the mask of the woody class from Level 1, proved to be the most stable and the most accurate compared to other classes. The classification accuracy for these classes achieved values of OA 0.96, Kappa 0.90 and F1 0.92. Slightly worse results were obtained for the classification of the permanent herbaceous, periodically herbaceous and mosses classes, reaching an OA of 0.95, Kappa of 0.78 and F1 of 0.86. These classes, due to their spectral characteristics, agricultural activities and phenological cycle, are difficult to delineate accurately. However, the classification models were shown to be quite stable. The largest variability of accuracy was observed for the classification of sealed surfaces/non-vegetated (bare soil) classes, where the Kappa values ranged from 0.70 to 0.85 and the F1 values from 0.85 to 0.92.
The final land cover map over the entire study area as the result of the hierarchical classification is presented in
Figure 5.
3.3. Independent Verification of the Results of the Flat and Hierarchical Classification
The results of the independent verification of the land cover maps over the entire study area derived using flat and hierarchical approaches are presented in
Table 6.
The results confirmed that the hierarchical approach provided more accurate land cover maps than the flat classification. The UA of the hierarchical classification was higher for all the land cover classes compared to the flat classification, except the sealed surface and woodland broadleaved classes, where the values were slightly lower. The highest UA was achieved for the following classes: mosses (1.00), water (0.96), non-vegetated (0.92), and woodland coniferous (0.91). For the water bodies class, the UA was equal to 0.96 for both classifications. The largest differences in UA values between the hierarchical and flat classifications were obtained for the mosses (18 percentage points), permanent herbaceous (18 percentage points) and woodland coniferous (15 percentage points) classes.
Interestingly, the lowest UA values (below 0.8) were obtained in both classifications for the shrubs, sealed surfaces and woodland broadleaved classes. Shrubs were partially misclassified as broadleaved woodland, mosses and permanent herbaceous. Conversely, broadleaved woodlands were mixed up with shrubs. Periodically herbaceous areas were sometimes misclassified as permanent herbaceous. Furthermore, sealed surfaces were mixed up with bare soil and vice versa. Around 20% of non-vegetated (bare soil) sampling polygons were misclassified as sealed surfaces.
The PA values of the hierarchical classification were higher for six land cover classes compared to the flat approach. The PA of the flat classification was higher for two classes: permanent herbaceous (by 8 percentage points) and periodically herbaceous (by 10 percentage points). The largest differences in PA between the two classifications were observed for sealed surfaces (18 percentage points), woodland broadleaved (14 percentage points) and woodland coniferous (12 percentage points). The lowest PA values for the flat classification were achieved for the mosses (0.64), non-vegetated (0.66) and shrubs (0.70) classes. For comparison, in the hierarchical approach, the lowest PA was achieved for the non-vegetated (0.68), periodically herbaceous (0.70) and mosses (0.74) classes. For the water bodies class, the PA was the same (0.98) in both classifications. The lowest PA values in both classifications (below 0.8) were achieved for the shrubs, periodically herbaceous, mosses and non-vegetated (bare soil) classes.
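Because the comparisons above are expressed in percentage points, the following short sketch makes the arithmetic explicit and contrasts it with a relative change in percent. It uses only the PA values quoted in the text for the mosses, non-vegetated and water classes. The function names are hypothetical.

```python
def percentage_point_difference(hierarchical: float, flat: float) -> int:
    """Absolute difference between two accuracies (fractions), in
    percentage points, rounded to the nearest whole point."""
    return round((hierarchical - flat) * 100)


def relative_change_percent(hierarchical: float, flat: float) -> float:
    """Change relative to the flat value, in percent (not percentage points)."""
    return (hierarchical - flat) / flat * 100


# PA values quoted in the text: class -> (flat, hierarchical).
PA_VALUES = {
    "mosses": (0.64, 0.74),
    "non-vegetated": (0.66, 0.68),
    "water": (0.98, 0.98),
}

for name, (flat, hier) in PA_VALUES.items():
    points = percentage_point_difference(hier, flat)
    relative = round(relative_change_percent(hier, flat), 1)
    print(f"{name}: {points} percentage points, {relative}% relative change")
```

For mosses, the increase of 10 percentage points corresponds to a relative improvement of roughly 16%, so the two ways of reporting a difference should not be used interchangeably.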
Additionally, to assess the quality of the final land cover maps, we performed a visual comparison of the classification results against the Sentinel-2 data and aerial orthophotos.
Figure 5 presents the final hierarchical land cover map for the entire study area and a few examples of more detailed results of both classifications. The hierarchical classification shows fairly good agreement with the reference data. There is a visible misclassification of areas along the river or water bodies, which were assigned as sealed surfaces in the flat classification (
Figure 5a). By contrast, in the hierarchical classification, these areas are assigned as water or mosses, which is the real land cover type (
Figure 5b). The example in
Figure 5a,c presents the large overestimation of sealed surfaces along the edges of ponds and forest clearcuts, especially on the forest edges. These areas are correctly classified as woodlands, water or mosses in the hierarchical approach (
Figure 5b,d). The flat approach also underestimated the mosses areas, which were classified partially as water bodies. The densely built-up areas with high buildings were better classified in the flat approach (
Figure 5e,f), because shadows of high buildings were misclassified as mosses in the hierarchical classification. On the other hand, the flat approach did not capture the urban greenery structure very well (
Figure 5e) and in general is less detailed compared to the hierarchical one.
4. Discussion
In this study, we examined whether land cover classification carried out using the hierarchical approach can provide more accurate and reliable results than the standard flat method. It has to be stressed that the number of studies on hierarchical classification of land cover types is rather limited. Only a few studies have conducted pixel-based hierarchical classification using multispectral satellite images. Most other studies focus on the hierarchical approach in object-based (OBIA) classification of hyperspectral data or a fusion of multispectral and hyperspectral data. Compared to the results of other studies on land cover mapping using hierarchical and flat methods, our results are comparable or more accurate. We showed that the stratified, hierarchical approach to land cover classification gave more accurate results than the standard flat method. The OA and F1 increased on average by 5 and 12 percentage points, respectively, by applying the hierarchical classification. The difference in accuracy between the two approaches was more pronounced in complex classes such as mosses, shrubs, sealed surfaces and non-vegetated. The hierarchical approach increased the separability of complex classes. Jiao et al. [
59] examined the hierarchical approach for mapping coastal wetlands in China using Landsat data and obtained more accurate results than ours. The authors achieved OA values of 0.93–0.96, UA around 0.99 and PA 0.97. Much lower UA and PA values for the mosses class were achieved in our study: in the flat classification, UA and PA ranged from 0.32 to 0.73 and 0.55 to 0.79, respectively, whereas in the hierarchical classification the UA and PA for this class reached values of 1.00 and 0.74, respectively. The differences in the obtained results can be related to the fact that Jiao et al. [
59] adopted expert rules based on spectral indices at the first level and then used a machine learning approach (SVM classifier) at the second level. In addition, the study was rather local.
Myint et al. [
58] compared a pixel-based flat classification and object-based hierarchical classification of urban land cover in the city of Phoenix (USA) based on very high resolution QuickBird images. The authors used the nearest neighbour classifier. They confirmed the advantage of the hierarchical classification over the standard flat classification. The OA of the flat classification result reached a value of 0.63, whereas in the hierarchical approach the OA varied between 0.80 and 0.99. This is in line with the results for land cover mapping obtained in our study. The overall accuracy of the hierarchical classification of land cover carried out by Demirkan et al. [
49] in Turkey using Sentinel-2 data and the RF method is lower than that obtained in our study. The authors performed classification at two levels: first, general classes, and second, more detailed land cover classes. They achieved an OA of 0.84–0.85 for Level 1 and 0.72–0.82 for Level 2. By comparison, in our study, the OA values were much higher and reached 0.97–0.99 at Level 1 and 0.92–0.97 at Level 2. The difference may be caused by their use of the NDVI and NDWI spectral indices with thresholds at Level 1 instead of a classification. In addition, they used a single image instead of the time series of Sentinel-2 used in our study.
Interestingly, Clark [
42] stressed the importance of the reference sampling strategy in the classification process. The author applied the RF method to classify twelve land cover classes using Sentinel-2 data in the San Francisco Bay area and achieved a higher OA with reference sampling polygons (0.84) than with the reference sampling point strategy (0.80). In our study, we applied reference sampling points and obtained a higher OA for land cover classification.
Our results confirmed that classes such as mosses, shrubs and non-vegetated bare soil are the most difficult to delineate and separate using the standard flat approach. These classes also showed a higher level of variability. The independent verification of the land cover maps performed over the entire study area confirmed the superiority of the hierarchical approach. The lowest PA values in both classifications (below 0.8) were achieved for the shrubs, periodically herbaceous, mosses and non-vegetated classes. Around 20% of non-vegetated sampling polygons were misclassified as sealed surfaces. In addition, the shrubs class was shown to have the highest variability of accuracy values, which is probably related to the spatial pattern of orchards that were classified as shrubs. The orchards are characterised by a heterogeneous structure and mixed pixels, which affects the accurate delineation of this class. In the flat classification, the lowest PA values were achieved for the mosses (0.64), non-vegetated (0.66) and shrubs (0.70) classes. By comparison, in the hierarchical approach the lowest PA was achieved for the non-vegetated (0.68), periodically herbaceous (0.70) and mosses (0.74) classes. The mosses/wetland class was pointed out by Ghorbanian et al. [
43] as the most challenging class for delineation using Sentinel-2 imagery. Among the 13 land cover classes, the wetlands reached the lowest PA and UA values of 0.89 and 0.86, respectively. Difficulties in the delineation of wetland classes were also reported by Whyte et al. [
44] in South Africa. The authors classified 15 land cover classes using an RF classifier and Sentinel-2 data and achieved the lowest values of UA and PA of 0.60 and 0.66, respectively, for three wetland classes. Interestingly, they also reported the lowest UA and PA for the bare soil (0.62–0.68) and shrubs (0.65–0.73) classes. These three classes were also recognised as the most challenging to classify in our study. The sealed surfaces were sometimes mixed up with the bare soil and vice versa. This may be related to the spectral similarity of some roofs and bare ground, mixed pixels and the difficulty of detecting narrow roads or smaller buildings. The analysis of the model stability also confirmed that the classification of sealed surfaces/non-vegetated classes is less stable, showing high variability.
The largest increase in the PA of the hierarchical approach compared to the flat one was observed for sealed surfaces, broadleaved woodland, coniferous woodland and mosses, by 18, 14, 12 and 10 percentage points, respectively. This confirms the superiority of the hierarchical classification over the flat classification. For the two herbaceous classes, the PA was higher in the flat classification, by 10 and 8 percentage points for periodically and permanent herbaceous, respectively. The independent verification showed that the periodically herbaceous areas were sometimes misclassified as permanent herbaceous. This was also observed by Bobáľová et al. [
45]. The authors tested the hierarchical OBIA approach for classification of six land cover classes in selected cities in Central Europe using Sentinel-2 data and concluded that the most problematic classes are dry grass and cropland classes. They achieved the lowest UA of 0.40–0.86 for dry grass and 0.32–0.93 for cropland classes, and the highest OA value above 0.90 for the forest class.
As part of this study, we analysed the advantages and disadvantages of both classification approaches. The main advantages of the flat classification are (i) the simplicity of the classification process, because all classes are classified together, and (ii) a very short execution time. The main disadvantage is the lower accuracy of individual classes; less represented classes are especially underestimated. In comparison, the main advantages of the hierarchical classification are (i) more accurate and reliable results, and (ii) additional intermediate products as outputs of the hierarchical stratification of land cover classes. The High-Resolution Layers for 2018 (HRL) provided by the Copernicus Land Monitoring Service (CLMS) are examples of individual land cover classes derived from the automatic classification of Sentinel-2 and Sentinel-1 images. There are four HRL2018 products: imperviousness, forest, grasslands, water and wetness available at CLMS [
60]. In general, the HRLs correspond to the groups of land cover classes in the hierarchical classification. The biggest disadvantage of hierarchical classification is the complexity of the classification process: it requires several classifications instead of one. In this study, we performed six classifications, which prolongs the processing time. In addition, the stratification of land cover classes is based on many experiments. There is no universal rule for how to divide land cover classes in the hierarchy structure. In our study, the classification process, as in other studies, was preceded by testing different features and parameters using the trial-and-error method [
45]. Jiao et al. [
59], mapping wetlands in China, came to a similar conclusion: it is hard to define one universal class hierarchy for different study areas.
The analysis of the stability of the RF classification models performed in this study confirmed that the hierarchical approach provides, on the one hand, more accurate results but, on the other, is characterised by a greater variability of accuracy values than the flat classification. In the hierarchical approach, the most stable results were achieved for the non-water/water and woody/non-woody cover classifications at Level 1 and the classification of woodland coniferous, broadleaved and shrubs at Level 2. The values of OA, Kappa and F1 for these classifications varied by 2–4 percentage points. The least stable results and the lowest classification accuracy were obtained for the vegetation/non-vegetated cover at Level 1 and for the sealed surfaces/non-vegetated (bare soil) classification at Level 2. The Kappa coefficient for these two classifications varied from 0.72 to 0.78 at Level 1 and from 0.70 to 0.85 at Level 2. This was related to the misclassification of shadows of high buildings as the mosses class. Myint et al. [
58] also found that shadows around high buildings may cause misclassification in urban areas. In the flat classification, the model was more stable but gave lower classification accuracy. The results of our study showed that some classes, such as woody cover and water bodies, are easy to separate with high accuracy using both approaches, while others, such as non-vegetated (bare soil), mosses and shrubs, are more difficult. The most complex classes were also recognised as problematic to delineate in other studies [
43,
44].