Comparison of Random Forest, Support Vector Machines, and Neural Networks for Post-Disaster Forest Species Mapping of the Krkonoše/Karkonosze Transboundary Biosphere Reserve

Zagajewski, Bogdan; Kluczek, Marcin; Raczko, Edwin; Njegovec, Ajda; Dabija, Anca; Kycko, Marlena

doi:10.3390/rs13132581

by Bogdan Zagajewski ¹,*, Marcin Kluczek ¹, Edwin Raczko ¹, Ajda Njegovec ², Anca Dabija ¹ and Marlena Kycko ¹

¹ Department of Geoinformatics, Cartography and Remote Sensing, Chair of Geomatics and Information Systems, Faculty of Geography and Regional Studies, University of Warsaw, 00-927 Warszawa, Poland

² Geodetski Zavod Celje d.o.o., 3000 Celje, Slovenia

* Author to whom correspondence should be addressed.

Remote Sens. 2021, 13(13), 2581; https://doi.org/10.3390/rs13132581

Submission received: 11 May 2021 / Revised: 23 June 2021 / Accepted: 28 June 2021 / Published: 1 July 2021

(This article belongs to the Special Issue Remote Sensing for Biodiversity Mapping and Monitoring)

Abstract:

Mountain forests are exposed to extreme conditions (e.g., strong winds and intense solar radiation) and various types of damage by insects such as bark beetles, which makes them very sensitive to climatic changes. Therefore, continuous monitoring is crucial, and remote-sensing techniques allow the monitoring of transboundary areas where a common policy is needed to protect and monitor the environment. In this study, we used Sentinel-2 and Landsat 8 open data to assess the forest stands classification of the UNESCO Krkonoše/Karkonosze Transboundary Biosphere Reserve, which is undergoing dynamic changes in recovering woodland vegetation due to an ecological disaster that led to damage and death of a large portion of the forests. Currently, in this protected area, dry big trunks and branches coexist with naturally occurring young forests. This heterogeneity generates mixes, which hinders the automation of classification. Thus, we used three machine learning algorithms—Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN)—to classify dominant tree species (birch, beech, larch and spruce). The best results were obtained for the SVM RBF classifier, which offered an average median F1-score that oscillated around 67.2–91.5% depending on the species. The obtained maps, which were based on multispectral satellite images, were also compared with classifications made for the same area on the basis of hyperspectral APEX imagery (288 spectral bands with three-meter resolution), indicating high convergence in the recognition of woody species.

Keywords:

ecological disaster; conservation; biodiversity; forest mapping; species diversity; Sentinel-2; Landsat; SVM; Random Forest; machine learning

1. Introduction

Human activities have changed the environment for thousands of years. The significant increase in the population has resulted in increased socioeconomic activities associated with the production and consumption of environmental components. The pressure on ecosystems, natural habitats, and biodiversity loss are among the most intense impacts on the natural environment, and these effects translate into changes in mountain forested areas [1]. Therefore, an updated quality thematic mapping system is necessary to allow better analysis and decision making in forest management. Changes in forests have become a significant driver of climate change, but at the same time, climate changes affect habitats; therefore, it is essential to continuously acquire long-term observation series. Although costs of flight campaigns have decreased in recent years and their use has become more popular, the cost may be still too high for permanent monitoring. For this reason, attention is often paid to the use of free-of-charge satellite data, which has become a key element of environmental monitoring of large areas and for the protection of biodiversity [2,3]. One of the most popular types of data comes from the Landsat and Sentinel series satellites, which are commonly used to monitor different forest types and allow the identification of up to a dozen tree species [4,5,6,7]. Moreover, such techniques reduce the cost of intensive field work and, in some cases, are a good substitute for airborne images. Additionally, multitemporal data distinguishes species diversity in different periods of plant phenological development [8,9], allowing less frequent or more expensive data to be simulated, which can be used to monitor the condition of plants [10,11,12,13], including the early detection of bark beetle outbreaks in trees (starting from preliminary stages (green phase) up to dry trunks), which is a serious challenge for European managed forests [14]. 
The results obtained from Landsat and Sentinel-2 images differ from each other based on differences in the purpose of the work (identification of forest types or individual species), the additional data sources available (e.g., digital surface model [15], microwave data), as well as the specificity of the research area, the combination of scenes from different growing seasons [16,17,18], remote sensing vegetation indices [19,20,21], and adopted algorithms [22,23].

Currently, the most commonly used classifiers are based on nonparametric methods [24], e.g., artificial neural networks, which require significant computing resources, but offer good results [25]; the Support Vector Machine (SVM) [26,27,28]; and the Random Forest (RF) [29,30]. However, in large and homogeneous areas, parametric algorithms allow interesting results to be obtained [31]. Simple methods, e.g., the Maximum Likelihood, can be used to obtain the intended results. For example, Das and Singh [32] used Landsat TM (Thematic Mapper) data to identify four forest types with an overall accuracy of 85.1%. Noviar and Kartika [33] determined three tree classes with Landsat OLI (Operational Land Imager) images with an overall accuracy of 97%. Elhag [34] distinguished three tree species and two forest types with an accuracy of 98.1% using OLI images and showed that the most informative OLI channels were 3, 4, 5, 6, and 7. The most informative channel was 6, which is the short-wave infrared band (SWIR, 1570–1650 nm). Many authors have stated that, as classifiers, the SVM and RF algorithms offer high classification accuracy with a short computing time [35,36]. The Random Forest and Support Vector Machine are usually used for the classification of woody species, especially in multispectral data, due to the abovementioned reasons [37,38]. Banskota et al. [39] confirmed that the RF and SVM algorithms can be used to generate detailed maps, and the algorithms are characterized by easy processing.

A significant factor that contributes to accurate classification results is enrichment with multitemporal images and vegetation indices, which can raise the overall accuracy from 71% to 97% [22]. Using Landsat TM spring and fall scenes, Shao et al. [40] distinguished two tree species and two forest types with an accuracy of 89%. Similar observations were made by Zhu and Liu [16], who distinguished three forest classes with an overall accuracy of 90.5% using the Support Vector Machine algorithm. Pasquarella et al. [41] used the Random Forest algorithm to identify eight forest types by testing different multitemporal datasets based on numerous images from different months. They achieved a producer accuracy (PA) ranging from 51% to 90%. Pena and Brenning [42] added the NDVI and NDWI to the standard Landsat OLI bands from spring, summer, and autumn. Then, based on the Random Forest algorithm, they identified three tree species with an overall accuracy of 94% and a producer accuracy in the range of 86% to 95%. Pimple et al. [43] tested the classification accuracy of three mountain forest types. The obtained results ranged from 78.4% to 82.3% (Landsat TM images) and from 81.0% to 82.3% (Landsat OLI). Using Decision Trees, Random Forest, and Support Vector Machine, Li et al. [31] separated three forest types on Landsat TM images, achieving an accuracy ranging from 79.1% to 88.2%, with Random Forest being the best algorithm and Decision Trees being the worst. Another example, based on Sentinel-2 images and the Random Forest algorithm, is the classification of six tree species found in the southern part of Germany [44]. The authors compared the accuracy of two classification methods: pixel-based and object-based. The overall accuracy was 66% for the object-based method and 63% for the pixel-based method.

The above examples show that woody forest species are characterized by a set of unique spectral features that can be identified using satellite images [45]. Additionally, the use of remote sensing vegetation indices and derivatives of altitude models allows both species and types of forests occurring in different biogeographic zones to be accurately identified. Of course, the best results are obtained for managed forests that cover dense, homogeneous areas, where potential sources of disturbance are quickly eliminated, creating optimal conditions for individual trees. This article focuses on semi-natural forests in mountain national parks and a UNESCO Biosphere Reserve, an area that was affected fifty years ago by major changes caused by acid rain, which damaged not only leaves but also the soil, leading to destructive changes in the rhizosphere. The damaged trees largely remain in this protected area, which, according to legislation, is now a living laboratory of natural processes. A new generation of trees grows intensively between the dry trunks, making it difficult to recognize the objects present there. Hence, the purpose of this article is to evaluate the methods that can be used to select representative pattern polygons for classification and verification of the obtained results. Then, we evaluate commonly available machine learning algorithms implemented in the R programming language. For this, we use remote sensing methods that allow the entire park area to be analyzed. In terms of remote sensing, a detailed classification of six dominant tree species was carried out by Raczko and Zagajewski [46] using APEX aerial hyperspectral images and artificial neural networks, achieving an overall accuracy of 87%, while a detailed analysis of non-forest areas was carried out by Marcinkowska-Ochtyra et al. [47], who achieved an overall accuracy of 84%, with fourteen out of twenty-four plant communities achieving a producer accuracy (PA) of more than 80% and sixteen out of twenty-four achieving an acceptable user accuracy (UA).

The motivation and goal of the study was to assess remote sensing tools for mapping a diverse transboundary mountain area that was affected by an ecological disaster 40 years ago. First, rains that were more acidic than lemon juice weakened the photosynthetic apparatus and then the condition of plants and soil, which led to insect outbreaks that damaged the biosphere and rhizosphere. The possibilities for plant recovery were very limited because of the status of the area (strictly protected as national parks and as a UNESCO Biosphere Reserve). As a consequence, both dead tree trunks and spontaneously appearing trees of different species and ages occur next to bare rocks and exposed soil. This creates a complex mosaic of objects that affects the reflected electromagnetic signal. The intention of this manuscript was therefore to use three well-known non-parametric classifiers, which different authors have shown to give good results. Maps derived from APEX airborne hyperspectral images, Airborne Lidar Scanning (ALS), and numerous field mapping campaigns were used as reference data. These made it possible to obtain appropriate patterns for training and validating the Landsat 8 and Sentinel-2 classifications. The innovative element of the study is the verification of image classification algorithms in a heterogeneous mountain environment. The proposed solution supports the development of a monitoring system for a cross-border area that is managed according to national environmental protection concepts (different statuses of individual protection zones). Additionally, we assessed the impact of the size of the training and verification patterns on the classification accuracy, which has practical implications for the planning of field campaigns.
To conclude this section, open, objective, and regularly repeated satellite images, processed with the open-source R programming language, are a good source of information for large areas. The aim of the work is to assess different machine learning methods for classifying the dominant species composition of mountain forest stands, which are characterized by intensive dynamic growth due to a previous ecological disaster that caused mass dieback of stands over an area of 15,000 ha [48].

Research Area

The Giant Mountains area was affected by an ecological disaster in the 1970s and 1980s [49,50]. Due to the synergistic actions of heavy air pollution, acid rains, strong winds, drought, and tree pest outbreaks, massive forest dieback, especially of spruce trees, and soil degradation occurred (Figure 1) [51]. However, the actions taken to regenerate the forest considered the proper species composition in consistency with potential habitats when trees were planted [52]. For this purpose, nest planting of originally occurring tree species was conducted, enabling the rebuilding of the species composition of the forest [53]. Numerous activities related to the reconstruction of forest ecosystems necessitated constant and objective monitoring of vegetation [54]. These efforts brought about expected results, which, in 1992, resulted in the establishment of the Transboundary UNESCO Biosphere Reserve.

The research area covers the Giant Mountains, which extend over a length of approximately 38 km and a width of 8–20 km and cover approximately 650 km², of which approximately 28.5% belongs to Poland and the rest to the Czech Republic (Figure 2). The main part of the mountains is located at the Czech–Polish border; the northern part lies in the Polish Karkonoski National Park (KNP), and the southern part lies in the Czech Krkonoše Mountains National Park (KRNAP).

The Karkonosze Mountains have a rich and extensive hydrographic network [55]. Gusty and dry foehn winds, which form when air crosses a mountain barrier, are common and often cause damage to forest stands. The harsh conditions of the Karkonosze Mountains have shaped the characteristics of the existing plants. The vegetation belts occur at lower elevations than in other Central European mountains [55]. In the primeval forest, the species with the largest shares in the composition are spruce (53%), beech (23%), and fir (11%) [56,57]. Currently, the dominant woody species is spruce (Picea abies (L.) Karst.), and the remaining species with significant shares are birch (Betula pendula Roth), beech (Fagus sylvatica L.), and larch (Larix decidua Mill.).

2. Materials and Methods

A classification of woody species based on Sentinel-2 MSI and Landsat 8 OLI satellite images and field-verified polygons was conducted to identify representative patterns for the optimization of classifier parameters of Random Forest and Support Vector Machines. These algorithms were selected due to the ease of open-source implementation and the high accuracy of classification achieved (Figure 3) [58].

2.1. Satellite Input Data

Landsat satellite images for the following months in 2018 were obtained from the USGS Earth Explorer service (Path: 191 Row: 25) in GeoTIFF format: 20 April, 15 June, 30 October. Sentinel-2 images (level 2A) were acquired using the Copernicus Open Access Hub (acquired on 19 April, 7 August, 17 November in 2018; Table 1). Only scenes with less than 10% cloudiness were selected for further processing. The studied area is mountainous, so the choice of scenes was limited due to frequent cloud cover. Acquired Sentinel-2 and Landsat images were corrected atmospherically by ESA (level 2A) and USGS (Level-2). Nevertheless, due to the use of images from different periods and areas, the authors verified the atmospheric correction models based on Warsaw’s polygons and ASD FieldSpec 4 (with the ASD ContactProbe; Analytical Spectral Devices, Inc., Longmont, CO, USA). For this purpose, spectrometric measurements of 59 large, homogeneous, and dominant calibration targets (asphalt, concrete, gravel, and paving stones) in Warsaw and surrounding areas were made. The comparison between field measurements, six Sentinel-2 MSI, and three Landsat 8 OLI images indicated that the average Root Mean Square Error (RMSE) oscillated around 0.04−0.07 for the Sentinel-2 images and around 0.05−0.07 for Landsat images [58]. APEX (Airborne Prism EXperiment) airborne hyperspectral data acquisition with 288 bands (413−2440 nm) and a spatial resolution of 3.35 m was executed on 10 September 2012. APEX images allowed us to prepare a high-resolution woody species map, which was carefully field verified on many research transects [59], creating a reliable reference material for this study.

2.2. Field Data Collection

Due to its large size and difficult accessibility, the Karkonosze National Park manages stands based on a network of circular plots, which do not comprehensively cover the entire area of the Polish Park (Figure 4). The updated field patterns were used by Edwin Raczko [59] to generate a reference vector shapefile. This allowed the identification of homogeneous polygons using the Leica ZENO 10 GNSS receiver with an external antenna for geolocation. The measurement was based on the RTK (Real Time Kinematic) technique using the European Position Determination System (ASG-EUPOS). The Position Dilution of Precision (PDOP) values were less than 2, and the positioning error was below 1 m. Field campaigns were conducted in 2014, 2015, and 2016. More than 1000 field measurements were collected during all campaigns, including within the park network. Then, these polygons were verified on the Normalized Digital Surface Model (nDSM), generated from the Airborne Lidar Scanning (ALS) point cloud obtained from the KNP. The data were processed in the LAStools program to derive the nDSM, which allowed the selection of homogeneous polygons representing tree species on reference plots (trees over 2.5 m high). This threshold was selected empirically to remove areas covered with low vegetation from the nDSM. The polygons belong to the network of circular plots run by the Polish Karkonosze National Park. Given the pixel sizes of the Landsat 8 OLI and Sentinel-2 data, the field-acquired data represent homogeneous polygons of tree species on Sentinel-2 and Landsat 8 images [24,46]. In this study, we focused on four tree species that have dominant shares in the park’s stand structure: birch (Betula pendula Roth), beech (Fagus sylvatica L.), larch (Larix decidua Mill.), and spruce (Picea abies (L.) Karst.). Due to the pixel sizes of Sentinel-2 (100 m²) and Landsat (900 m²) data, it is impossible to identify rare species, which constitute 0.94% of the area [24]: alder (Alnus Mill.), Norway maple (Acer platanoides L.), and fir (Abies Mill.).

2.3. Satellite Data Processing

We used all Sentinel-2 spectral bands (10, 20 and 60 m), because we did not want to assume in advance that they might not contribute any information (and this hypothesis was confirmed). In the following step, pixels of 20 and 60 m bands were resampled to 10 m resolution to unify their size, and Landsat 8 images were kept at a resolution of 30 m. In order to distinguish between deciduous and coniferous species more precisely, satellite images from three vegetation periods (spring, summer and autumn) were used. In the next step, multi-temporal compositions were made separately for Sentinel-2 and Landsat 8 data as a layer stack. Based on the images remote-sensing vegetation indices were calculated (Table 2). Firstly, the Normalized Difference Vegetation Index (NDVI) [60] was used to show the general quantity and vigor of green plants. After the ecological disaster, the area was characterized by a large number of remaining dry trunks and branches, which significantly modified the spectral reflection, masking young trees that were intensively growing around dead trees. Thus, the Normalized Difference Water Index (NDWI) [61] was used to determine the canopy water content, which allowed us to identify healthy, deciduous, and coniferous trees by their water contents [20]. Both indices were calculated for all scenes to increase the classification accuracy of damaged plants [62]. Additionally, in the following stage, the indices were used to determine the masking of non-forest areas. Then, spectral bands and indices were stacked into separate GeoTiFF files for Sentinel-2 and Landsat 8 images. Then, pixel values were extracted from images based on polygons to obtain samples for classification (Table 3).
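The index calculation described above can be illustrated with a minimal Python sketch (the study itself was carried out in R). Table 2 is not reproduced here, so the formulas are assumptions: NDVI is taken as (NIR − Red)/(NIR + Red), and NDWI as the canopy-water formulation (NIR − SWIR)/(NIR + SWIR). The function names and reflectance values are hypothetical.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) per pixel; None where the sum is zero."""
    result = []
    for a, b in zip(band_a, band_b):
        denominator = a + b
        result.append((a - b) / denominator if denominator != 0 else None)
    return result


def ndvi(nir, red):
    # Assumed formula: NDVI = (NIR - Red) / (NIR + Red)
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    # Assumed formula (canopy water content): NDWI = (NIR - SWIR) / (NIR + SWIR)
    return normalized_difference(nir, swir)


# Hypothetical surface-reflectance values for three pixels
red = [0.04, 0.10, 0.20]
nir = [0.40, 0.30, 0.20]
swir = [0.15, 0.25, 0.30]

ndvi_values = ndvi(nir, red)
ndwi_values = ndwi(nir, swir)
```

In a multitemporal workflow, the same two functions would be applied to each acquisition date, and the resulting index layers would be appended to the band stack.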

2.4. Classification and Accuracy Assessment

The classification was carried out using the R 4.0.0 programming language in RStudio [63,64,65]. This environment was chosen due to the wide availability of software packages and the ease of implementing classifier training and displaying the classification results. We used the Random Forest algorithm [29] from the randomForest package [66]. The ntree parameter was set to 500, because the OOB (out-of-bag) error usually stabilizes above this number of trees [37]. Tuning was applied to the mtry parameter, and the mtry value with the lowest OOB error was selected. Random Forest was also used to assess the impact of each variable on the classification accuracy by means of the Mean Decrease Accuracy (MDA). This was computed for individual bands and indices from different time periods and indicates how much accuracy the model loses when each variable is excluded. The greater the loss of accuracy, the more important the variable is for the classification.
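The idea behind the Mean Decrease Accuracy can be sketched in a few lines of Python. The randomForest package estimates the loss by permuting a variable's values in the out-of-bag samples rather than by refitting the model without that variable. The sketch below mimics this idea on a single fixed dataset, with a hypothetical stand-in model (`toy_model`) and synthetic data, so it is an illustration of the principle and not the package's implementation.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model returns the reference label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def mean_decrease_accuracy(model, rows, labels, seed=0):
    """Accuracy lost when each variable is shuffled, one variable at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between variable j and the labels
        permuted = [row[:j] + [value] + row[j + 1:]
                    for row, value in zip(rows, column)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return drops


# Hypothetical stand-in for a trained classifier: it uses only variable 0.
def toy_model(row):
    return "deciduous" if row[0] > 0.5 else "coniferous"


# Synthetic data: two variables per pixel, labels generated by the toy model.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [toy_model(row) for row in rows]

drops = mean_decrease_accuracy(toy_model, rows, labels)
```

Shuffling the variable that the model relies on lowers the accuracy, while shuffling the ignored variable leaves it unchanged, which is the ranking logic behind an importance plot.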

To implement artificial neural networks, a multi-layer perceptron (MLP) was used from the nnet package [67]. With the grid search methods, hyperparameters (decay, hidden units) were optimized for a set of samples from Sentinel-2 and Landsat 8.

In the case of the Support Vector Machine algorithm [26], we used the e1071 package [68], whose svm function standardizes the spectral features; we also used the tune function from the same package, which performs cross-validation during tuning. The learning parameters were optimized using the grid search method, in which every combination from the pool of parameter values was checked. Tuning was performed for the linear, polynomial, radial basis function (RBF), and sigmoid kernels (Table 4). The selected tuning parameters were gamma = 0.01 and cost = 100.

The first step was to extract each class’s patterns from the image pixels using the raster and rgdal packages [69,70]. Then, the training parameters of each algorithm were optimized on the obtained datasets, and each classifier was tuned on its entire set of patterns. The parameter optimization procedures were performed on the references derived from the Sentinel-2 Multispectral Instrument (MSI) and Landsat 8 OLI images.

The influence of the number of pixels on the classification accuracy was also checked. The following pixel thresholds were investigated for each class: 50, 100, 150, 200, and 300. Due to the heterogeneity of the analyzed targets and based on previous experience [58], we focused on 300 pixels for each class in the training set and carried out an iterative accuracy assessment, which means that all classifications were repeated 100 times, according to the following steps:

The random selection of reference pixels for the training and testing datasets in a ratio of 50:50 to meet the condition of independence [71]. The rngtools and doRNG packages were used to generate random seeds and optimize the execution time of the iterative accuracy assessment. Multiple processor cores were used through the doParallel package with a PSOCK (Parallel Socket Cluster) cluster [72,73];
The training of Random Forest and Support Vector Machine classifiers;
The accuracy assessment of each classification;
The classification accuracy results and a random seed were saved.

This approach allowed us to obtain more objective results, because a single split of the samples between training and testing can be biased. The iterative accuracy assessment, which was repeated one hundred times, showed the range of accuracy values that each class can achieve.
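The iterative procedure listed above can be sketched in Python as follows (the study used R with doRNG and doParallel; this sketch is sequential). The per-class 50:50 split, the seed scheme, the synthetic two-band samples, and the nearest-centroid classifier are all assumptions made for illustration; the nearest-centroid rule is only a stand-in for RF, SVM, or ANN.

```python
import random
import statistics


def stratified_split(samples, seed):
    """Split (features, label) samples 50:50 within each class."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append((features, label))
    train, test = [], []
    for label in sorted(by_class):
        group = by_class[label][:]
        rng.shuffle(group)
        half = len(group) // 2
        train.extend(group[:half])
        test.extend(group[half:])
    return train, test


def fit_centroids(train):
    """Stand-in classifier: mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def predict(centroids, features):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


def iterative_assessment(samples, n_iterations=100, base_seed=2018):
    """Repeat split, training, and assessment; keep the seed with each result."""
    results = []
    for i in range(n_iterations):
        seed = base_seed + i
        train, test = stratified_split(samples, seed)
        centroids = fit_centroids(train)
        correct = sum(1 for f, y in test if predict(centroids, f) == y)
        results.append({"seed": seed, "overall_accuracy": correct / len(test)})
    return results


# Synthetic two-band samples for two classes (hypothetical values)
data_rng = random.Random(0)
samples = []
for label, centre in (("spruce", (0.2, 0.3)), ("beech", (0.5, 0.6))):
    for _ in range(40):
        features = [data_rng.gauss(centre[0], 0.05), data_rng.gauss(centre[1], 0.05)]
        samples.append((features, label))

results = iterative_assessment(samples)
accuracies = [r["overall_accuracy"] for r in results]
q1, median, q3 = statistics.quantiles(accuracies, n=4)
```

Storing the seed with every result makes each individual split reproducible, and the quartiles summarize the spread that a box plot would show.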

Then, the classification accuracy for the remaining areas was assessed using the selected parameters of each algorithm, which were found to be satisfactory in all cases. The following metrics were used to assess the classification accuracy [74]:

Overall accuracy (OA): the ratio of all correctly classified pixels to the total number of pixels [75];
User accuracy (UA): the ratio of correctly classified pixels in a given class to all pixels classified as belonging to this class [76];
Producer accuracy (PA): the ratio of correctly classified pixels of a given class to all reference pixels of that class;
Kappa coefficient: used to present the final results, although this index is highly correlated with the overall accuracy and therefore adds largely redundant information [77];
F1-score: the harmonic mean of the user and producer accuracies [78,79].
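The metrics listed above can all be derived from a confusion matrix, as in the following minimal Python sketch. It assumes that rows hold the reference classes and columns hold the predicted classes, and that every class occurs at least once in both. The pixel counts are hypothetical and are not results from this study.

```python
def accuracy_metrics(matrix, classes):
    """matrix[i][j] = number of pixels of reference class i assigned to class j."""
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(matrix[i]) for i in range(n)]                        # reference totals
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]   # predicted totals
    correct = sum(matrix[i][i] for i in range(n))

    overall = correct / total
    # Chance agreement expected from the row and column totals
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    per_class = {}
    for i, name in enumerate(classes):
        pa = matrix[i][i] / row_sums[i]   # producer accuracy
        ua = matrix[i][i] / col_sums[i]   # user accuracy
        f1 = 2 * ua * pa / (ua + pa)      # harmonic mean of UA and PA
        per_class[name] = {"PA": pa, "UA": ua, "F1": f1}
    return {"OA": overall, "kappa": kappa, "per_class": per_class}


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted)
classes = ["spruce", "beech"]
matrix = [[40, 10],
          [5, 45]]

metrics = accuracy_metrics(matrix, classes)
```

For this example the overall accuracy is 0.85 and the Kappa coefficient is 0.70; for spruce, PA is 0.80, UA is about 0.89, and the F1-score is about 0.84.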

A confusion matrix, which presents correctly classified pixels on its diagonal [74], and box plots were also used. Then, the average accuracy levels shown by the F1-score values of the different classes were compared. For production of the final maps, the models with the highest average F1-score across all classes in each scenario were selected.

3. Results

The results of the abovementioned activities included a map of dominant woody species present in 2018, classification accuracy measures in the form of error matrices, and box plots obtained with an iterative classification method using Landsat 8 and Sentinel-2 images. It should be emphasized that all classification approaches obtained similar results (Figure 5). The highest median from 100 iterations was obtained with Sentinel-2 images and the SVM algorithm (86%). A slightly lower median value was obtained for the Landsat 8 images and the Random Forest algorithm (85%). The Random Forest classifier gave a slightly lower value for Sentinel-2 images (84%).

On the Sentinel-2 images, spruce was classified with the highest accuracy (regardless of the classifier, the F1-score oscillated around 90%; Figure 6). Slightly lower results were obtained for the beech class, especially with the SVM classifier (85%). For RF and ANN, the accuracy was about 82% (however, the spread of the results of individual iterations of ANN ranged from 64% to 89%). Similar outcomes occurred for the birch and larch.

Landsat images allowed dominant woody species to be classified with a good overall accuracy (85% RF, 83% SVM RBF; Figure 7), while for Sentinel-2 images the SVM RBF classifier offered an overall accuracy of 87%, RF gave an accuracy of 83%, and ANN gave an accuracy of 84%, thus confirming their suitability for large-scale monitoring of the stand species composition. The Random Forest classifier offered the highest accuracy for tree species classification, followed by the SVM. Straightforward implementations of Artificial Neural Networks (such as the multi-layer perceptron) seem to be insufficient for mapping tree species at the resolutions offered by the Landsat and Sentinel-2 datasets. Further work should focus on the use of more sophisticated ANNs (e.g., deep artificial neural networks). For Landsat and Sentinel-2 images, the ranges between the first and third quartiles (Q1–Q3) were similar; however, the median values were about 5–9% higher for Sentinel-2 images (Figure 7). The detailed analysis of the impact of the number of pixels used in the training samples of all classifiers confirmed the high effectiveness of both SVM and RF. Furthermore, a median F1-score of 75–78% (Figure 7) was obtained with a relatively small sample (50 pixels), and adding more pixels to the training patterns raised the F1-score to about 80%. In the case of SVM, when there were 100–300 pixels in the pattern, the F1-score exceeded this value. Similar results were observed for analyses carried out on Landsat 8 data, but the median F1-score was lower by a few percentage points. In the case of Landsat 8, the best classifier was the Random Forest. In addition, the scatter of the results from individual iterations was significantly higher: the best individual classifications achieved a score close to 100%, while the lowest scores were below 50% (Figure 7).
The results of the tree species classification performed on the multitemporal Sentinel-2 dataset are presented in Table 5.

The Karkonosze forests are mainly dominated by spruce, which also occupies the largest area on the map obtained (Figure 8). Deciduous species were also well distinguished. Beech was found to be the second most abundant species after spruce in the Czech Krkonoše forest stands, as can be seen on the map. Birch was found to be much less abundant and occurs sporadically, which can also be seen on the map. There was a slight overestimation of larch, because it was often planted as a protective belt for other species, so its spectral reflectance is mixed with theirs. The overall accuracy can be regarded as satisfactory (OA 86.5%). The analysis of the results showed that the broadleaved species had a smaller accuracy gap (F1-score: 80–86%) than the coniferous species (70–92%). The best level of accuracy was obtained for spruce (92%) and beech (86%). Slightly weaker but still good accuracy was obtained for birch (80%), while larch (70%) performed worse than the other species, especially in terms of omission (PA 67%). Misclassification most often occurred due to the mixing of classes within the coniferous or deciduous communities, but the results should be considered satisfactory (Table 5).

During training on the Sentinel-2 images, the impact of each variable on the classification accuracy was tested using the mean decrease accuracy indicator. The results (Figure 9) showed that B6 (740.5 nm), NDVI, B4 (664.6 nm) and B5 (704.1 nm) were the most important variables.

4. Discussion

In our opinion, there is no alternative to satellite research, because the traditional research methods that are key elements of the monitoring are based on a network of transects on which regular field observations are made (Figure 4). This is a very subjective method, because visual observations are made by different employees and are therefore burdened with considerable uncertainty. In addition, research patterns are limited to selected locations rather than the entire area of the parks. The second important limitation is the cost of data acquisition and processing in airborne campaigns; airborne surveys are therefore relatively rare and are performed at times that make it difficult to capture the phenological changes taking place. In the case of our study, legal regulations are an additional problem, because the Reserve is administered by legally and financially independent entities that comply with different national guidelines. Satellite images, especially Sentinel-2, are acquired every few days free of charge. The high temporal resolution makes it possible to create masks that eliminate clouds and shadows, which are common in mountainous areas. Cloud-free images with a high acquisition frequency capture environmental changes. A key element is that the size of the smallest pixel (10 m) is comparable to the size of tree crowns, which allows for detailed analyses.

To compare our results with those of other researchers (Table 6), we limited the discussion to papers in which the authors used Sentinel-2 or Landsat imagery, machine learning algorithms, and additional data that improve classification results (e.g., vegetation indices and derivatives of terrain models), and identified the same species as we did. Due to the pixel size, much of that work focused on classifying the dominant tree species or the forest types in which these species predominate. In this study, when Sentinel-2 imagery and the SVM-RBF algorithm were used, the following producer accuracies were achieved: spruce (93%), birch (85%), beech (83%), and larch (67%). In a study by Hościło and Lewandowska [80], in which forest stands of the Tatra Mountains were classified using Sentinel-2 imagery, the Digital Terrain Model (DTM), and the Random Forest classifier, the following producer accuracies were obtained: spruce (71%), larch (87%), birch (77%), and beech (91%). The current study thus achieved significantly better results for spruce and birch, and poorer results for larch and beech. The differences could be due to the different characteristics of the studied stands, the use of a different method of increasing the information content of the images, and the use of data from other acquisition periods. A similar classification was performed by Persson et al. [81], who obtained the following producer accuracies using MSI data and the Random Forest: spruce (88%), larch (95%), and birch (81%). Our larch classification was much weaker than in that work (95.5%), but the results for spruce (88.2%) and birch (80.8%) were quite comparable. The test polygons in that study comprised only 10% of the dataset; hence, the differences in the classification accuracies may have resulted from insufficient validation. Another example is the classification performed with Sentinel-2 data and the Random Forest classifier by Immitzer et al. [44], which achieved the following producer accuracies: spruce (85%), larch (44%), and beech (49%). Only spruce was classified at a level similar to ours; the other species achieved worse results. Comparing all these results with those of Hościło and Lewandowska [80], we can speculate that the reason is the compactness of the tree crowns: in the Tatras these species form homogeneous patches, and the results obtained there are therefore the best, whereas in individual parts of the Karkonosze there are small but compact habitats, hence classification results that are high but lower than in the Tatras. Classification using the Maximum Entropy algorithm and Landsat 8 data enhanced with DTM was carried out in a mountainous area by Chiang et al. [15] and yielded producer accuracies of 77% for larch and 55% for birch. For both species, the results obtained in the current study were lower, and this discrepancy can be explained by the classifier used and the additional data.

When the additional data used to increase the informativeness of the image are considered, the best accuracy levels are obtained by studies with multi-temporal compositions. Pena and Brenning [42] obtained the highest overall accuracy among comparable studies, 94%, using the NDVI and NDWI indices, a multi-temporal composition, and the Random Forest algorithm on OLI scanner data. A high value (88%) was also obtained by Persson et al. [81] using only a multi-temporal composition, the same classification algorithm, and MSI scanner data. The overall accuracy (85%) obtained in the current study was slightly lower but not significantly different from that obtained in the above works.

Due to the relatively small elevation differences between the montane and foothill zones (the mountains were formed during the Hercynian orogeny), we chose not to use DEM data or other topographic attributes, which do not always improve confidence in the classification accuracy; both the Pimple et al. [43] and Li et al. [31] studies showed increases in classification accuracy of only 1 to 3%. Despite the relatively low F1-scores for larch, the results are very valuable because larch does not naturally cover large homogeneous areas. It is often planted linearly to protect stands from strong winds, which cause significant damage due to windbreaks. Additionally, the nature of the Krkonoše forest stands means that most species, apart from spruce, are distributed at similar altitudes, which justifies the lack of use of DEM data in the study.

The overall accuracy results (77–87%) are quite comparable to those obtained by other authors. Better results may be achieved for forest ecosystems in which the studied species occur in denser groups. The reported accuracy may also depend on the number of validation polygons adopted, i.e., whether they make up a substantial share of the data (more than 50%). Other authors have mostly used 30% of the data for verification, or even 10%, as in Persson et al. [81], which may significantly distort the assessment of classification accuracy. Where no additional data are attached to the image, lower overall classification accuracies are obtained, which confirms the need to make multispectral data more informative using a multitemporal composition. The MSI scanner has more channels and a better spatial resolution than the OLI scanner, and significant differences in classification quality were observed in favor of the MSI sensor. It achieved higher overall accuracies (by 4–7 percentage points) for all classifiers except the Random Forest, for which the OLI instrument provided better results (by 2 percentage points). A comparable example is the classification by Soleimannejad et al. [82], in which the difference in overall accuracy between the MSI and OLI instruments was 1%. Puletti et al. [23] distinguished three forest classes based on multitemporal compositions and the following vegetation indices: NDVI, the Plant Senescence Reflectance Index (PSRI) [83], the Red Edge Normalized Difference Vegetation Index (RENDVI) [84], and the Anthocyanin Reflectance Index (ARI) [85]. A total of 42 channels were obtained and classified with Random Forest. The producer accuracy ranged from 83% to 91%.
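The sensor comparison in the preceding paragraph can be reproduced from the overall accuracies listed for this study in Table 6. The short Python sketch below does that arithmetic; the dictionary layout and the function name `msi_minus_oli` are our own illustrative choices.

```python
# Overall accuracies (%) reported for this study in Table 6.
OVERALL_ACCURACY = {
    "SVM RBF": {"Landsat 8 (OLI)": 83, "Sentinel-2 (MSI)": 87},
    "RF": {"Landsat 8 (OLI)": 85, "Sentinel-2 (MSI)": 83},
    "ANN": {"Landsat 8 (OLI)": 77, "Sentinel-2 (MSI)": 84},
}


def msi_minus_oli(table):
    """Difference in overall accuracy (percentage points), MSI minus OLI."""
    return {
        classifier: oa["Sentinel-2 (MSI)"] - oa["Landsat 8 (OLI)"]
        for classifier, oa in table.items()
    }


differences = msi_minus_oli(OVERALL_ACCURACY)
for classifier, diff in differences.items():
    better = "MSI" if diff > 0 else "OLI" if diff < 0 else "neither"
    print(f"{classifier:8s} {diff:+d} percentage points ({better} higher)")
```

The differences are +4 (SVM RBF), -2 (RF), and +7 (ANN) percentage points, which matches the 4–7 point advantage of MSI and the 2 point advantage of OLI for Random Forest stated in the text.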

Table 6. Comparison of the obtained results with those reported in the literature.

Author	Birch PA (%)	Beech PA (%)	Larch PA (%)	Spruce PA (%)	OA (%)	Algorithm	Satellite	Number of Classes
[86]	27.2	98.0	74.7	53.3	90	RF	S-2	9
[80]	83.7	91.5	86.5	70.5	82	RF	S-2	8
[44]	-	48.8	44.0	85.3	63	RF	S-2	6
[81]	80.8	-	95.5	88.2	87	RF	S-2	5
[87]	-	-	88.1	75.8	87	RF	S-2	5
[88]	80.0	-	-	91.9	87	Bayesian inference	S-2	4
[15]	55.0	-	96.0	-	81	Maximum Entropy	L 8	4
[23]	-	95.5	-	88.2	86	RF	S-2	4
[12]	98.8	-	-	-	95	SVM-RBF	S-2	14
[89]	-	63.0	-	73.0	63	RF	S-2	6
[90]	97.7	-	-	-	97	RF	S-2	4
[36]	-	79.0	-	-	85	RF	S-2	5
[91]	-	-	-	74.2	79	RF	S-2	4
Our results	72.5	71.6	68.3	90.8	83	SVM RBF	L 8	4
Our results	73.9	75.8	71.0	92.7	85	RF	L 8	4
Our results	59.7	67.0	56.4	91.3	77	ANN	L 8	4
Our results	84.8	82.7	67.0	92.6	87	SVM RBF	S-2	4
Our results	74.5	81.0	71.0	89.8	83	RF	S-2	4
Our results	72.9	82.4	62.6	93.8	84	ANN	S-2	4

Classification and accuracy assessment require a sufficient number of up-to-date and reliable training and verification patterns. The optimal solution is field research, which allows representative patterns to be identified and determined in detail, but the procedure is time consuming and expensive, especially in mountainous areas. We therefore tested the influence of the number of pixels on the classification accuracy in order to determine how many training pixels are needed to obtain satisfactory results. The highest results were achieved with 300 pixels, which was considered the optimal number of samples per class. Similar conclusions were reached by Sabat-Tomala et al. [92], who also tested the Random Forest and Support Vector Machine classifiers using the same threshold values on plant species. Nevertheless, using 100–150 pixels/class in the training patterns lowered the results by only 2–3 percentage points (Figure 7). This has a practical dimension, as it allows field research to be shortened without a significant loss of accuracy. In general, studies that tested the effects of classifier choice, reference sample size, and reference class distribution on per-pixel classification accuracy have reached various conclusions [93,94]. Noi and Kappas [93] concluded that the performance of the RF classifier on Sentinel-2 image data differs with the training sampling strategy (balanced or unbalanced). They observed that when the training sample sizes for land cover classes were large (greater than or equal to 500 pixels/class), the performance of the kNN, RF, and SVM on balanced and unbalanced datasets did not differ significantly.
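Testing the effect of training set size requires drawing the same number of pixels from every class. A minimal Python sketch of such balanced sampling is given below. The reference set, its class sizes, and the function name `balanced_sample` are hypothetical and serve only to illustrate the procedure; they do not reproduce the study's sampling code.

```python
import random


def balanced_sample(pixels_by_class, n_per_class, seed=0):
    """Draw up to n_per_class training pixels from every class without replacement."""
    rng = random.Random(seed)
    sample = {}
    for name, pixels in pixels_by_class.items():
        k = min(n_per_class, len(pixels))  # small classes contribute all their pixels
        sample[name] = rng.sample(pixels, k)
    return sample


# Hypothetical reference set: pixel IDs per class with an imbalanced distribution.
reference = {
    "birch": list(range(0, 400)),
    "beech": list(range(1000, 1900)),
    "larch": list(range(2000, 2350)),
    "spruce": list(range(3000, 8000)),
}

for n in (50, 100, 150, 200, 300):  # sample sizes per class tested in the study
    subset = balanced_sample(reference, n, seed=n)
    sizes = {name: len(ids) for name, ids in subset.items()}
    print(n, sizes)
```

Each subset would then be used to train the classifiers, with accuracy evaluated on the same independent verification set so that results for different sample sizes remain comparable.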

Across the airborne APEX hyperspectral (9 m² pixel size, 288 spectral bands), multitemporal Sentinel-2 (100 m², 12 bands), and Landsat 8 (900 m², 10 bands) images, the best classified species was spruce (around 90% for all classifiers), which dominates in the parks. However, the large area that spruce covers produced relatively homogeneous patterns in both the satellite and the APEX airborne images (Figure 10). The worst results were obtained for larch in both the Landsat and Sentinel-2 images (71% based on Random Forest). These seemingly worse results are nevertheless very satisfying, because larch does not form dense polygons but is often planted in linear strips. Despite the spectral and spatial differences between the images, visual interpretation of the APEX-based and Sentinel-2-based maps confirms the usefulness of satellite images for identifying woody species (Figure 10). In each of the presented study areas, particular species are located correctly on the Sentinel-2 images; however, difficulties arise for the deciduous species (birch and beech) on sites where they overlap, where classes tend to be either overestimated or underestimated. When a species forms a compact stand, its identification is much more accurate than in communities consisting of small groups of trees. Because spruce is the dominant community, there were no difficulties in mapping it. The biggest challenge was larch, which rarely grows in dense groups because it is most often planted in strips to protect other trees from strong winds. Maps prepared on the basis of Sentinel-2 data should be considered valuable and satisfactory: they show the characteristics of tree stands, allow quick, straightforward, and cheap analyses, and are especially useful in mountain areas where many tree communities are difficult to access.

In this study, four classes were used for classification; this number is similar to that used in other studies, but, as mentioned before, the nature of the stand and the spatial resolution made it impossible to identify other species. In some cases more classes were designated, with seven [44] or eight [80] species distinguished. However, the number of species distinguished depends on the tree stand species structure and the proportion of individual species. The results can be considered highly accurate and relevant because the reference sites are distributed regularly over the park, giving results that are closer to reality. Despite using more data in the main classification, the study shows that a fairly good overall classification accuracy (87%) can be achieved by combining three images from spring, summer, and autumn. However, the studied stands are located in a mountainous area, where the development of vegetation starts late, so better results could be obtained with a scene from later in spring, but such analyses were prevented by the cloud cover over the Park area [95].
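Combining images from several dates amounts to concatenating the band values of each pixel into one longer feature vector. The Python sketch below illustrates this stacking under simple assumptions: the reflectance values and pixel identifiers are invented, only two bands per date are shown, and the function name `stack_dates` is hypothetical.

```python
def stack_dates(scenes, date_order):
    """Concatenate per-date band vectors into one feature vector per pixel.

    scenes maps a date label to {pixel_id: [band values]}.
    Only pixels present (e.g., cloud-free) on every date are kept.
    """
    common = set.intersection(*(set(scenes[d]) for d in date_order))
    stacked = {}
    for pixel in sorted(common):
        features = []
        for date in date_order:
            features.extend(scenes[date][pixel])
        stacked[pixel] = features
    return stacked


# Hypothetical reflectances for two bands on three dates; pixel "p3" is
# missing in autumn (for example, masked as cloud) and is therefore dropped.
scenes = {
    "spring": {"p1": [0.05, 0.30], "p2": [0.04, 0.25], "p3": [0.06, 0.28]},
    "summer": {"p1": [0.03, 0.45], "p2": [0.03, 0.35], "p3": [0.04, 0.40]},
    "autumn": {"p1": [0.07, 0.33], "p2": [0.04, 0.30]},
}

composite = stack_dates(scenes, ["spring", "summer", "autumn"])
for pixel, features in composite.items():
    print(pixel, len(features), features)
```

With three dates the classifier sees three times as many variables per pixel, which lets it exploit phenological differences between species, at the cost of losing pixels that are cloud-covered on any one date.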

Comparing the classifiers used, it can be concluded that Random Forest is the one most commonly used by researchers to obtain satisfactory results, and SVM has a comparable level of usage [12,16]. A meta-analysis of peer-reviewed studies on the RF and SVM classifiers was conducted by Sheykhmousa et al. [96]. For low spatial resolution images, the RF method consistently provides better results than the SVM, but a comparison of the average accuracies of the two methods suggests the superiority of the SVM when classifying data containing significantly more features. Other classifiers are used much less often, such as Maximum Entropy, which was used in the study by Chiang et al. [15], or deep neural networks, which are applied to very complex image structures, e.g., to identify individual tree species in cities. In large and homogeneous forest areas, optimizing the classifier architecture is time consuming, and the results obtained do not compensate for the workload associated with preparing the classifier [97,98]. Given the prevalence of studies that use the Random Forest algorithm and the high classification accuracy it achieves, it can be considered one of the most suitable algorithms for this kind of research. However, in our opinion, it is also necessary to test other classifiers, as our study showed higher accuracy for SVM.

5. Conclusions

In this paper, the usefulness of Sentinel-2 and Landsat 8 images for identifying forest species was verified by applying the RF, SVM, and ANN algorithms. An essential element of the work was the research area (a biosphere reserve), where forest stands grow on heterogeneous sites, vary widely in species and age (physical parameters), and co-occur with other objects (dead trees, rocky surfaces, trails). Proper classification was thus significantly more difficult than in managed forests, because it took place in a highly protected area where traditional forest management is not allowed. The overall accuracy and producer accuracy values obtained are comparable to those from similar studies conducted by other authors in standard commercial forests, where dying trees are successively removed and monocultures are preferred for ease of treatment. The outcomes confirmed that the investigated species have sufficiently specific spectral features to be recognized, and that applying nonparametric classifiers with their optimization procedures eliminated noise generated by co-occurring objects.

Classifications carried out with all algorithms on both the Landsat and Sentinel-2 images confirmed that spruce can be identified well (over 90%). This is important in a time of climate change, because spruce roots lie relatively shallow beneath the soil surface and are exposed to overdrying, which leads to water stress for the tree and puts it at risk of insect attack. Monitoring the occurrence and water stress of spruce therefore supports the assessment of climate change. Other woody species were classified with good overall accuracies (Sentinel-2 images with the SVM RBF classifier offered an overall accuracy of 87%, Landsat 8 with RF achieved 85%, and Sentinel-2 with ANN 84%), confirming their suitability for large-scale monitoring of stand species composition.

Low-resolution multispectral data can only be used as a source of coarse data on the forest composition. Detailed information on forest cover can be obtained more reliably using airborne hyperspectral or multispectral imaging. Future studies using high-resolution orthophoto maps and advanced ML techniques for tree species identification are required.

While the use of remote sensing data cuts operating costs compared with standard forest management methods, it relies heavily on accurate and up-to-date reference data. Moreover, the classifications confirmed that the training patterns can be reduced to 100–150 pixels/class, with results only two to four percentage points worse than those of maps based on 300-pixel sets. This is valuable information, because field verification in mountain areas, and especially in protected areas, is very difficult: the terrain is challenging to explore (no roads, large elevation differences, and a large research area).
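The effect of a smaller training set can be inspected class by class using the mean F1-scores for Sentinel-2 listed in Table A1. The Python sketch below computes the difference between the 300-pixel and 100-pixel sets. It covers only per-class F1 for Sentinel-2, whereas the statement above may refer to another measure such as overall accuracy, so the numbers need not coincide exactly; the variable and function names are our own.

```python
CLASSES = ["birch", "beech", "larch", "spruce"]

# Mean F1-scores (%) for Sentinel-2 from Table A1, in the order of CLASSES.
F1_300 = {
    "RF": [71.8, 81.7, 65.8, 91.9],
    "SVM-RBF": [76.3, 84.9, 66.5, 91.5],
    "ANN": [76.1, 83.7, 67.0, 90.3],
}
F1_100 = {
    "RF": [70.1, 80.9, 62.2, 90.5],
    "SVM-RBF": [75.4, 83.5, 64.3, 90.6],
    "ANN": [74.4, 82.1, 64.2, 88.8],
}


def f1_drop(full, reduced):
    """Per-class F1 loss (percentage points) when moving to the reduced set."""
    return {
        clf: [round(a - b, 1) for a, b in zip(full[clf], reduced[clf])]
        for clf in full
    }


drops = f1_drop(F1_300, F1_100)
for clf, values in drops.items():
    summary = ", ".join(f"{name} {v:.1f}" for name, v in zip(CLASSES, values))
    print(f"{clf:8s} {summary}")
```

In these figures larch shows the largest loss for every classifier, which suggests that any reduction of field sampling should be checked most carefully for that class.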

The B1 and B9 bands, which have an original pixel size of 60 m and are normally used for atmospheric correction, showed an MDA of 10–11%, which is only 50% less than that of the most informative variables (B6, NDVI, or B4); their informativeness is comparable to that of other bands, e.g., B7 and B8, so these bands should not be omitted.

The proposed research methodology yielded results comparable to classifications of commercially used forest areas, where forest stands that are homogeneous in species and age prevail. This means that both types of satellite images contain sufficient spectral features and that tree crowns are of an appropriate size relative to the pixel size, which allows proper classification with non-parametric classifiers. The measurable and documented result of the work is a map of the entire area spanning two neighboring countries, which is valuable comparative material for traditional forest research.

Author Contributions

B.Z., M.K. (Marcin Kluczek), and E.R. were responsible for conceptualization, methodology, validation, and formal analysis. M.K. (Marcin Kluczek) and E.R. were responsible for the software; M.K. (Marcin Kluczek) acquired and processed the satellite data. B.Z. was responsible for the supervision, project administration, and funding acquisition. B.Z., M.K. (Marcin Kluczek), E.R., A.N., A.D., and M.K. (Marlena Kycko) prepared and edited the text. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding (including publishing costs) from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 734687 (H2020-MSCA-RISE-2016: innoVation in geospatial and 3D data—VOLTA) and the Polish Ministry of Science and Higher Education (Ministerstwo Nauki i Szkolnictwa Wyższego—MNiSW) in the framework of H2020 co-financed projects No. 3934/H2020/2018/2 and 379067/PnH/2017 for the period 2017–2021.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Satellite data are publicly available online: Sentinel-2 images were acquired from the Copernicus Open Access Hub (https://scihub.copernicus.eu (accessed on 18 February 2021)) and Landsat 8 data from the EarthExplorer (https://earthexplorer.usgs.gov/ (accessed on 18 February 2021)). Reference polygons (field data) were prepared and owned by Dr. Edwin Raczko, who is a co-author of the manuscript.

Acknowledgments

The authors express their thanks to the European Union’s H2020 MSCA RISE and Polish Ministry of Science and Higher Education programs for their financial support, which allowed us to conduct research and publish the outcomes. The authors express their gratitude to the editors and anonymous reviewers who contributed to the improvement of the article through their experience, work, and comments. The authors are also very grateful to the Karkonosze National Park for providing reference data and permits to conduct field research in the Park.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Accuracy levels of Sentinel-2 images (hyperparameter values: RF: ntree = 500, mtry = 3; SVM: cost = 10, gamma = 0.1; ANN: decay = 1, hidden units = 18).

Training pixels per class	Class	RF mean UA (%)	RF mean PA (%)	RF mean F1 (%)	SVM-RBF mean UA (%)	SVM-RBF mean PA (%)	SVM-RBF mean F1 (%)	ANN mean UA (%)	ANN mean PA (%)	ANN mean F1 (%)
50	birch	64.3	73.9	68.4	69.1	77.6	72.7	80.5	67.9	73.3
	beech	81.8	77.9	79.4	83.7	79.8	81.3	83.4	80.3	81.6
	larch	49.8	72.0	58.4	53.0	70.3	59.8	72.8	52.6	60.7
	spruce	95.4	83.4	88.9	94.0	84.7	89.0	81.0	94.4	87.1
100	birch	67.1	74.0	70.1	72.3	79.6	75.4	80.6	69.8	74.4
	beech	82.4	79.9	80.9	85.6	81.8	83.5	83.8	80.8	82.1
	larch	54.7	72.9	62.2	58.5	72.2	64.3	73.3	57.7	64.2
	spruce	95.2	86.3	90.5	93.9	87.7	90.6	84.1	94.2	88.8
150	birch	68.1	75.0	71.1	73.2	79.7	76.1	80.2	71.3	75.2
	beech	83.1	80.2	81.3	86.6	82.2	84.2	84.3	81.9	82.9
	larch	57.2	72.5	63.6	61.4	71.5	65.7	73.8	58.5	64.9
	spruce	94.8	87.5	90.9	93.2	89.0	91.0	85.1	94.1	89.3
200	birch	69.2	75.2	71.8	73.7	79.3	76.1	81.4	72.6	76.5
	beech	83.1	80.4	81.5	86.9	82.3	84.4	84.5	82.1	83.1
	larch	59.3	72.5	64.9	63.0	70.3	66.1	73.0	59.8	65.4
	spruce	94.6	88.5	91.4	92.8	89.8	91.2	85.8	93.9	89.6
300	birch	69.8	74.5	71.8	74.6	78.7	76.3	80.4	72.9	76.1
	beech	82.7	81.0	81.7	87.3	82.9	84.9	85.3	82.4	83.7
	larch	61.9	71.0	65.8	64.9	69.0	66.5	72.7	62.6	67.0
	spruce	94.2	89.8	91.9	92.2	90.8	91.5	87.1	93.8	90.3
imbalanced	birch	71.5	68.9	69.8	77.5	74.3	75.5	77.2	76.3	76.4
	beech	83.8	81.8	82.6	88.3	82.5	85.1	84.2	84.0	83.9
	larch	66.7	62.1	63.9	69.2	66.0	67.2	64.9	68.4	66.2
	spruce	91.0	93.1	92.0	90.2	93.0	91.5	91.5	91.1	91.2

Table A2. Accuracy levels of Landsat 8 images (hyperparameter values: RF: ntree = 500, mtry = 3; SVM: cost = 10, gamma = 0.1; ANN: decay = 1, hidden units = 18).

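Training pixels per class	Class	RF mean UA (%)	RF mean PA (%)	RF mean F1 (%)	SVM-RBF mean UA (%)	SVM-RBF mean PA (%)	SVM-RBF mean F1 (%)	ANN mean UA (%)	ANN mean PA (%)	ANN mean F1 (%)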
50	birch	59.4	70.5	64.1	58.6	72.3	64.3	68.9	54.6	61.1
	beech	69.6	68.7	68.7	64.8	67.3	65.6	65.3	63.4	63.8
	larch	53.1	66.9	58.7	53.9	68.3	59.8	64.6	49.5	55.5
	spruce	93.5	83.4	88.1	94.2	81.7	87.4	78.3	91.8	84.3
100	birch	64.0	73.7	68.2	62.4	73.2	67.1	70.4	58.4	63.5
	beech	73.1	72.6	72.5	67.9	69.8	68.4	67.8	65.3	66.2
	larch	60.9	69.4	64.5	59.4	69.4	63.7	67.4	54.2	59.8
	spruce	94.0	87.4	90.5	94.1	85.5	89.5	81.0	92.0	86.0
150	birch	66.2	74.4	69.8	64.2	73.6	68.2	70.6	59.3	64.2
	beech	74.9	74.4	74.3	69.7	70.8	69.9	67.8	65.8	66.4
	larch	65.9	71.3	68.1	62.6	69.7	65.6	66.5	55.8	60.4
	spruce	94.1	89.4	91.7	94.1	87.7	90.7	82.8	92.1	87.1
200	birch	67.9	74.4	70.7	66.3	73.3	69.3	69.7	59.1	63.5
	beech	75.7	75.7	75.4	70.5	71.1	70.4	67.3	66.1	66.3
	larch	68.7	70.9	69.5	64.2	68.9	66.1	66.6	55.5	60.1
	spruce	93.9	90.9	92.3	93.7	89.3	91.4	82.4	91.6	86.6
300	birch	70.3	73.9	71.7	67.3	72.5	69.4	70.3	59.7	64.2
	beech	76.2	75.8	75.7	71.7	71.6	71.3	66.5	67.0	66.3
	larch	72.6	71.0	71.4	66.1	68.3	66.9	64.4	56.4	59.7
	spruce	93.5	92.7	93.1	93.5	90.8	92.1	84.3	91.3	87.6
imbalanced	birch	76.4	73.0	74.4	74.5	72.1	73.0	64.2	65.6	64.6
	beech	78.4	76.3	77.0	75.2	71.8	73.1	67.8	69.7	68.5
	larch	81.3	71.3	75.7	71.0	67.0	68.6	61.4	67.0	63.7
	spruce	93.8	97.8	95.8	92.2	94.8	93.5	91.7	89.3	90.5

References

Elsen, P.R.; Monahan, W.B.; Merenlender, A.M. Topography and human pressure in mountain ranges alter expected species responses to climate change. Nat. Commun. 2020, 11, 1–10.
Turner, W.; Rondinini, C.; Pettorelli, N.; Mora, B.; Leidner, A.; Szantoi, Z.; Buchanan, G.; Dech, S.; Dwyer, J.; Herold, M.; et al. Free and open-access satellite data are key to biodiversity conservation. Biol. Conserv. 2015, 182, 173–176.
Zagajewski, B.; Kycko, M.; Tømmervik, H.; Bochenek, Z.; Wojtuń, B.; Bjerke, J.W.; Kłos, A. Feasibility of hyperspectral vegetation indices for the detection of chlorophyll concentration in three high Arctic plants: Salix polaris, Bistorta vivipara, and Dryas octopetala. Acta Soc. Bot. Pol. 2018, 87, 87.
Schultz, M.; Clevers, J.G.P.W.; Carter, S.; Verbesselt, J.; Avitabile, V.; Quang, H.V.; Herold, M. Performance of vegetation indices from Landsat time series in deforestation monitoring. Int. J. Appl. Earth Obs. Geoinf. 2016, 52, 318–327.
Modica, G.; Solano, F.; Merlino, A.; Di Fazio, S.; Barreca, F.; Laudari, L.; Fichera, C.R. Using Landsat 8 imagery in detecting cork oak (Quercus suber L.) woodlands: A case study in Calabria (Italy). J. Agric. Eng. 2016, 47, 205–215.
Bolyn, C.; Michez, A.; Gaucher, P.; Lejeune, P.; Bonnet, S. Forest mapping and species composition using supervised per pixel classification of sentinel-2 imagery. Biotechnol. Agron. Soc. Environ. 2018, 22, 172–187.
Waśniewski, A.; Hościło, A.; Zagajewski, B.; Moukétou-Tarazewicz, D. Assessment of Sentinel-2 Satellite Images and Random Forest Classifier for Rainforest Mapping in Gabon. Forests 2020, 11, 941.
Cunningham, S.C.; Mac Nally, R.; Read, J.; Baker, P.J.; White, M.; Thomson, J.R.; Griffioen, P. A Robust Technique for Mapping Vegetation Condition Across a Major River System. Ecosystems 2008, 12, 207–219.
Lange, M.; DeChant, B.; Rebmann, C.; Vohland, M.; Cuntz, M.; Doktor, D. Validating MODIS and Sentinel-2 NDVI Products at a Temperate Deciduous Forest Site Using Two Independent Ground-Based Sensors. Sensors 2017, 17, 1855.
Nolè, A.; Rita, A.; Ferrara, A.M.S.; Borghetti, M. Effects of a large-scale late spring frost on a beech (Fagus sylvatica L.) dominated Mediterranean mountain forest derived from the spatio-temporal variations of NDVI. Ann. For. Sci. 2018, 75, 83.
Ochtyra, A. Forest Disturbances in Polish Tatra Mountains for 1985–2016 in Relation to Topography, Stand Features, and Protection Zone. Forests 2020, 11, 579.
Karasiak, N.; Fauvel, M.; Dejoux, J.-F.; Monteil, C.; Sheeren, D. Optimal Dates for Deciduous Tree Species Mapping Using Full Years Sentinel-2 Time Series in South West France. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, V-3-2020, 469–476.
Zagajewski, B.; Tømmervik, H.; Bjerke, J.W.; Raczko, E.; Bochenek, Z.; Kłos, A.; Jarocińska, A.; Lavender, S.; Ziółkowski, D. Intraspecific Differences in Spectral Reflectance Curves as Indicators of Reduced Vitality in High-Arctic Plants. Remote Sens. 2017, 9, 1289.
Abdullah, H.; Skidmore, A.K.; Darvishzadeh, R.; Heurich, M. Sentinel-2 accurately maps green-attack stage of European spruce bark beetle (Ips typographus, L.) compared with Landsat-8. Remote Sens. Ecol. Conserv. 2019, 5, 87–106.
Chiang, S.H.; Valdez, M.; Chen, C.-F. Forest Tree Species Distribution Mapping Using Landsat Satellite Imagery and Topographic Variables with the Maximum Entropy Method in Mongolia. ISPRS Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, XLI-B8, 593–596.
Zhu, X.; Liu, D. Accurate mapping of forest types using dense seasonal Landsat time-series. ISPRS J. Photogramm. Remote Sens. 2014, 96, 1–11.
Tran, A.T.; Nguyen, K.A.; Liou, Y.A.; Le, M.H.; Vu, V.T.; Nguyen, D.D. Classification and Observed Seasonal Phenology of Broadleaf Deciduous Forests in a Tropical Region by Using Multitemporal Sentinel-1A and Landsat 8 Data. Forests 2021, 12, 235.
Lamb, B.T.; Tzortziou, M.A.; McDonald, K.C. Evaluation of Approaches for Mapping Tidal Wetlands of the Chesapeake and Delaware Bays. Remote Sens. 2019, 11, 2366.
Nguyen, T.H.; Jones, S.; Soto-Berelov, M.; Haywood, A.; Hislop, S. Landsat Time-Series for Estimating Forest Aboveground Biomass and Its Dynamics across Space and Time: A Review. Remote Sens. 2019, 12, 98.
Satir, O.; Berberoglu, S.; Akca, E.; Yeler, O. Mapping the dominant forest tree distribution using a combined image classification approach in a complex Eastern Mediterranean basin. J. Spat. Sci. 2016, 62, 1–15.
Wang, Q.; Ni-Meister, W. Forest canopy height and gaps from multiangular BRDF, assessed with Airborne LiDAR Data (Short Title: Vegetation Structure from LiDAR and Multiangular Data). Remote Sens. 2019, 11, 2566.
Hill, R.A.; Wilson, A.; George, M.; Hinsley, S. Mapping tree species in temperate deciduous woodland using time-series multi-spectral data. Appl. Veg. Sci. 2010, 13, 86–99.
Puletti, N.; Chianucci, F.; Castaldi, C. Use of Sentinel-2 for forest classification in Mediterranean environments. Ann. Silvic. Res. 2018, 42, 32–38.
Raczko, E.; Zagajewski, B. Comparison of support vector machine, random forest and neural network classifiers for tree species classification on airborne hyperspectral APEX images. Eur. J. Remote Sens. 2017, 50, 144–154.
Quang, N.; Quinn, C.; Stringer, L.; Carrie, R.; Hackney, C.; Hue, L.; Tan, D.; Nga, P. Multi-Decadal Changes in Mangrove Extent, Age and Species in the Red River Estuaries of Viet Nam. Remote Sens. 2020, 12, 2289.
Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995; p. 314. ISBN 978-1-4757-3264-1.
Vapnik, V.N. An overview of statistical learning theory. IEEE Trans. Neural Netw. 1999, 10, 988–999.
Arenas-Castro, S.; Fernández-Haeger, J.; Jordano, D. Evaluation and Comparison of QuickBird and ADS40-SH52 Multispectral Imagery for Mapping Iberian Wild Pear Trees (Pyrus bourgaeana, Decne) in a Mediterranean Mixed Forest. Forests 2014, 5, 1304–1330.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Boateng, E.Y.; Otoo, J.; Abaye, D.A. Basic Tenets of Classification Algorithms K-Nearest-Neighbor, Support Vector Machine, Random Forest and Neural Network: A Review. J. Data Anal. Inf. Process. 2020, 8, 341–357.
Li, M.; Im, J.; Beier, C. Machine learning approaches for forest classification and change analysis using multi-temporal Landsat TM images over Huntington Wildlife Forest. GIScience Remote Sens. 2013, 50, 361–384.
Das, S.; Singh, T.P. Mapping Vegetation and Forest Types using Landsat TM in the Western Ghat Region of Maharashtra, India. Int. J. Comput. Appl. 2013, 76, 33–37.
Noviar, H.; Kartika, T. Identification and Classification of Forest Types Using Data Landsat 8 in Karo, Dairi, and Samosir Districts, North Sumatra. Int. J. Remote Sens. Earth Sci. (IJReSES) 2017, 13, 139–150.
Elhag, M. Consideration of Landsat-8 Spectral Band Combination in Typical Mediterranean Forest Classification in Halkidiki, Greece. Open Geosci. 2017, 9, 468–479.
Hauglin, M.; Ørka, H.O. Discriminating between Native Norway Spruce and Invasive Sitka Spruce—A Comparison of Multitemporal Landsat 8 Imagery, Aerial Images and Airborne Laser Scanner Data. Remote Sens. 2016, 8, 363.
Wessel, M.; Brandmeier, M.; Tiede, D. Evaluation of Different Machine Learning Algorithms for Scalable Classification of Tree Types and Tree Species Based on Sentinel-2 Data. Remote Sens. 2018, 10, 1419.
Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
Cohen, W.B.; Healey, S.P.; Yang, Z.; Zhu, Z.; Gorelick, N. Diversity of Algorithm and Spectral Band Inputs Improves Landsat Monitoring of Forest Disturbance. Remote Sens. 2020, 12, 1673.
Banskota, A.; Kayastha, N.; Falkowski, M.J.; Wulder, M.A.; Froese, R.E.; White, J. Forest Monitoring Using Landsat Time Series Data: A Review. Can. J. Remote Sens. 2014, 40, 362–384.
Shao, G.; Pauli, B.P.; Haulton, G.S.; Zollner, P.A.; Shao, G. Mapping hardwood forests through a two-stage unsupervised classification by integrating Landsat Thematic Mapper and forest inventory data. J. Appl. Remote Sens. 2014, 8, 083546.
Pasquarella, V.J.; Holden, C.E.; Woodcock, C.E. Improved mapping of forest type using spectral-temporal Landsat features. Remote Sens. Environ. 2018, 210, 193–207.
Pena, M.A.; Brenning, A. Assessing fruit-tree crop classification from Landsat-8 time series for the Maipo Valley, Chile. Remote Sens. Environ. 2015, 171, 234–244.
Pimple, U.; Sitthi, A.; Simonetti, D.; Pungkul, S.; Leadprathom, K.; Chidthaisong, A. Topographic Correction of Landsat TM-5 and Landsat OLI-8 Imagery to Improve the Performance of Forest Classification in the Mountainous Terrain of Northeast Thailand. Sustainability 2017, 9, 258.
Immitzer, M.; Vuolo, F.; Atzberger, C. First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe. Remote Sens. 2016, 8, 166.
Kycko, M.; Zagajewski, B.; Lavender, S.; Romanowska, E.; Zwijacz-Kozica, M. The Impact of Tourist Traffic on the Condition and Cell Structures of Alpine Swards. Remote Sens. 2018, 10, 220.
Raczko, E.; Zagajewski, B. Tree Species Classification of the UNESCO Man and the Biosphere Karkonoski National Park (Poland) Using Artificial Neural Networks and APEX Hyperspectral Images. Remote Sens. 2018, 10, 1111.
Marcinkowska-Ochtyra, A.; Zagajewski, B.; Raczko, E.; Ochtyra, A.; Jarocińska, A. Classification of High-Mountain Vegetation Communities within a Diverse Giant Mountains Ecosystem Using Airborne APEX Hyperspectral Imagery. Remote Sens. 2018, 10, 570.
Sobik, M.; Błaś, M. Natural and Human Impact on Pollutant Deposition in Mountain Ecosystems with the Sudetes as an Example. In Proceedings of the 3rd IASME/WSEAS International Conference on Energy, Environment, Ecosystems & Sustainable Development; University of Cambridge: Cambridge, UK, 2008; pp. 355–359. ISBN 978-960-6766-43-5.
Dąbrowska-Prot, E. Environmental characteristics of the Karkonosze Mts. Region and the problems of spruce forest decline. Polish J. Ecol. 1999, 47, 365–371. [Google Scholar]
Dobrowolska, D. Growth and development of silver fir (Abies alba Mill.) regeneration and restoration of the species in the Karkonosze Mountains. J. For. Sci. 2008, 54, 398–408. [Google Scholar] [CrossRef] [Green Version]
Fabiszewski, J.; Wojtuń, B. Contemporary floristic changes in the Karkonosze Mts. Acta Soc. Bot. Pol. 2014, 70, 237–245. [Google Scholar] [CrossRef] [Green Version]
Pusz, W.; Kroczek, M.; Kaczmarek, A. Colonization of rare and endangered seeds of plant species cultivated in maintenance breeding at The Living Gene Bank in Jagniątków by microscopic fungi. Prog. Plant Prot. 2015, 56, 34–41. [Google Scholar] [CrossRef]
Pusz, W. Plants’ healthiness assessment as part of the environmental monitoring of protected mountainous area in the example of Karkonosze (Giant) Mts. (SW Poland). Environ. Monit. Assess. 2016, 188, 544. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Raj, A. Przemiany krajobrazu leśnego Karkonoskiego Parku Narodowego w Okresie Ostatnich kilkudziesięciu Lat (in Polish, Changes in the Forest Landscape of the Karkonosze National Park over the Last Several Dozen Years); Karkonosze National Park: Jelenia Góra, Poland, 2014; p. 51. ISBN 978-83-64528-16-3. [Google Scholar]
Raj, A.; Knapik, R. Karkonoski Park Narodowy, 2nd ed.; Karkonosze National Park: Jelenia Góra, Poland, 2014; p. 104. ISBN 978-83-64528-13-2. [Google Scholar]
Simurda, J. Historia Lasu–Dziewięć Stuleci Puszczy Karkonoskich (in Polish, History of the Forest-Nine Centuries of the Karkonosze Primeval Forest); Správa Krkonošského Národního Parku (KRNAP): Vrchlabí, Czech Republic, 2012; p. 36. ISBN 978-80-86418-96-4. [Google Scholar]
Staniaszek-Kik, M.; Żarnowiec, J.; Chmura, D. The effect of forest management practices on deadwood resources and structure in protected and managed montane forests during tree-stand reconstruction after dieback of Norway spruce. Balt. For. 2019, 25, 249–256. [Google Scholar] [CrossRef]
Dabija, A.; Kluczek, M.; Zagajewski, B.; Raczko, E.; Kycko, M.; Al-Sulttani, A.; Tardà, A.; Pineda, L.; Corbera, J. Comparison of Support Vector Machines and Random Forests for Corine Land Cover Mapping. Remote Sens. 2021, 13, 777. [Google Scholar] [CrossRef]
Raczko, E. Application of Hyperspectral Data and Artificial Neural Networks for Tree Species Classification of Karkonoski National Park (in Polish). Master’s Thesis, University of Warsaw, Faculty of Geography and Regional Studies, Warsaw, Poland, 2017; p. 114. [Google Scholar]
Rouse, W.; Haas, R.H.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with ERTS, NASA SP-351. Third ERTS-1 Symp. In NASA Special Publication; NASA: Washington, DC, USA, 1974; Volume 1, pp. 309–317. [Google Scholar]
Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1995, 58, 257–266. [Google Scholar] [CrossRef]
Gudex-Cross, D.; Pontius, J.; Adams, A. Enhanced forest cover mapping using spectral unmixing and object-based classification of multi-temporal Landsat imagery. Remote Sens. Environ. 2017, 196, 193–204. [Google Scholar] [CrossRef]
Caret: Classification and Regression Training; R Package Version 6.0-86. Available online: https://rdrr.io/cran/caret/ (accessed on 25 April 2020).
Gaujoux, R. Rngtools: Utility Functions for Working with Random Number Generators; R Package Version 1.5. 2020. Available online: https://rdrr.io/rforge/rngtools/ (accessed on 25 April 2020).
Wickham, H.; François, R.; Henry, L.; Müller, K. Dplyr: A Grammar of Data Manipulation; R Package Version 1.0.0. 2020. Available online: https://rdrr.io/cran/dplyr/ (accessed on 25 April 2020).
Liaw, A.; Wiener, M. RandomForest: Classification and Regression by randomForest. R News 2002, 2, 18–22. Available online: https://rdrr.io/cran/randomForest/ (accessed on 25 April 2020).
Venables, W.N.; Ripley, B.D. nnet, Modern Applied Statistics with S, 4th ed.; Springer: New York, NY, USA, 2002; R Package Version: 7.3.14; Available online: https://rdrr.io/cran/intubate/man/nnet (accessed on 25 April 2020).
Meyer, D.; Dimitriadou, E.; Hornik, K.; Weingessel, A.; Leisch, F. e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071); R Package Version 1.7-3; TU: Wien, Austria, 2019; Available online: https://rdrr.io/rforge/e1071/ (accessed on 25 April 2020).
Hijmans, R.J. Raster: Geographic Data Analysis and Modeling; R Package Version 3.3-13. 2020. Available online: https://rdrr.io/cran/raster/ (accessed on 25 April 2020).
Bivand, R.; Keitt, T.; Rowlingson, B. Rgdal: Bindings for the ‘Geospatial’ Data Abstraction Library; R Package Version 1.5-12. 2020. Available online: https://rdrr.io/cran/rgdal/ (accessed on 25 April 2020).
Stehman, S.V.; Foody, G.M. Key issues in rigorous accuracy assessment of land cover products. Remote Sens. Environ. 2019, 231, 111199. [Google Scholar] [CrossRef]
Microsoft; Weston, S. doParallel: Foreach Parallel Adaptor for the ‘parallel’ Package; R Package Version 1.0.15. 2019. Available online: https://rdrr.io/rforge/doParallel/ (accessed on 25 April 2020).
Microsoft; Weston, S. foreach: Provides Foreach Looping Construct; R Package Version 1.5.0. 2020. Available online: https://rdrr.io/github/lepennec/foreach/ (accessed on 25 April 2020).
Congalton, R.G. A review of assessing the accuracy of classifications of remotely sensed data. Remote Sens. Environ. 1991, 37, 35–46. [Google Scholar] [CrossRef]
Aronoff, S. Classification accuracy: A user approach. Photogramm. Eng. Remote Sens. 1982, 48, 1299–1307. [Google Scholar]
Story, M.; Congalton, R. Accuracy assessment: A user’s perspective. Photogramm. Eng. Remote Sens. 1986, 52, 397–399. [Google Scholar]
Foody, G.M. Explaining the unsuitability of the kappa coefficient in the assessment and comparison of the accuracy of thematic maps obtained by image classification. Remote Sens. Environ. 2020, 239, 111630. [Google Scholar] [CrossRef]
Blair, D.C. Information Retrieval, 2nd ed. C.J. Van Rijsbergen. London: Butterworths; 1979: 208 pp. Price: $32.50. J. Am. Soc. Inf. Sci. 1979, 30, 374–375. [Google Scholar] [CrossRef]
Hand, D.; Christen, P. A note on using the F-measure for evaluating record linkage algorithms. Stat. Comput. 2018, 28, 539–547. [Google Scholar] [CrossRef] [Green Version]
Hościło, A.; Lewandowska, A. Mapping Forest Type and Tree Species on a Regional Scale Using Multi-Temporal Sentinel-2 Data. Remote Sens. 2019, 11, 929. [Google Scholar] [CrossRef] [Green Version]
Persson, M.; Lindberg, E.; Reese, H. Tree Species Classification with Multi-Temporal Sentinel-2 Data. Remote Sens. 2018, 10, 1794. [Google Scholar] [CrossRef] [Green Version]
Soleimannejad, L.; Ullah, S.; Abedi, R.; Dees, M.; Koch, B. Evaluating the potential of Sentinel-2, Landsat-8, and IRS satellite images in tree species classification of Hyrcanian forest of Iran using random forest. J. Sustain. For. 2019, 38, 615–628. [Google Scholar] [CrossRef]
Merzlyak, M.N.; Gitelson, A.A.; Chivkunova, O.B.; Rakitin, V.Y. Non-destructive optical detection of pigment changes during leaf senescence and fruit ripening. Physiol. Plant. 1999, 106, 135–141. [Google Scholar] [CrossRef] [Green Version]
Gitelson, A.; Merzlyak, M.N. Quantitative estimation of chlorophyll-a using reflectance spectra: Experiments with autumn chestnut and maple leaves. J. Photochem. Photobiol. B Biol. 1994, 22, 247–252. [Google Scholar] [CrossRef]
Gislason, P.O.; Benediktsson, J.A.; Sveinsson, J.R. Random Forests for land cover classification. Pattern Recognit. Lett. 2006, 27, 294–300. [Google Scholar] [CrossRef]
Grabska, E.; Hostert, P.; Pflugmacher, D.; Ostapowicz, K. Forest Stand Species Mapping Using the Sentinel-2 Time Series. Remote Sens. 2019, 11, 1197. [Google Scholar] [CrossRef] [Green Version]
Kollert, A.; Bremer, M.; Löw, M.; Rutzinger, M. Exploring the potential of land surface phenology and seasonal cloud free composites of one year of Sentinel-2 imagery for tree species mapping in a mountainous region. Int. J. Appl. Earth Obs. Geoinf. 2021, 94, 102208. [Google Scholar] [CrossRef]
Axelsson, A.; Lindberg, E.; Reese, H.; Olsson, H. Tree species classification using Sentinel-2 imagery and Bayesian inference. Int. J. Appl. Earth Obs. Geoinf. 2021, 100, 102318. [Google Scholar] [CrossRef]
Bjerreskov, K.; Nord-Larsen, T.; Fensholt, R. Classification of Nemoral Forests with Fusion of Multi-Temporal Sentinel-1 and 2 Data. Remote Sens. 2021, 13, 950. [Google Scholar] [CrossRef]
Kutia, M.; Myroniuk, V. Evaluation of Sentinel-2 Composited Mosaics and Random Forest Method for Tree Species Distribution Mapping in Suburban Areas of Kyiv City, Ukraine. In Proceedings of the International Workshop on Environmental Management, Science and Engineering, Xiamen, China, 16–17 June 2018; SCITEPRESS-Science and Technology Publications. pp. 597–604. [Google Scholar]
Breidenbach, J.; Waser, L.T.; Debella-Gilo, M.; Schumacher, J.; Rahlf, J.; Hauglin, M.; Puliti, S.; Astrup, R. National mapping and estimation of forest area by dominant tree species using Sentinel-2 data. Can. J. For. Res. 2021, 51, 365–379. [Google Scholar] [CrossRef]
Sabat-Tomala, A.; Raczko, E.; Zagajewski, B. Comparison of Support Vector Machine and Random Forest Algorithms for Invasive and Expansive Species Classification Using Airborne Hyperspectral Data. Remote Sens. 2020, 12, 516. [Google Scholar] [CrossRef] [Green Version]
Noi, P.T.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2017, 18, 18. [Google Scholar] [CrossRef] [Green Version]
Heydari, S.S.; Mountrakis, G. Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites. Remote Sens. Environ. 2018, 204, 648–658. [Google Scholar] [CrossRef]
Cierniewski, J.; Kazmierowski, C.; Krolewicz, S.; Piekarczyk, J.; Wróbel, M.; Zagajewski, B. Effects of Different Illumination and Observation Techniques of Cultivated Soils on Their Hyperspectral Bidirectional Measurements Under Field and Laboratory Conditions. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2525–2530. [Google Scholar] [CrossRef]
Sheykhmousa, M.; Mahdianpari, M.; Ghanbari, H.; Mohammadimanesh, F.; Ghamisi, P.; Homayouni, S. Support Vector Machine Versus Random Forest for Remote Sensing Image Classification: A Meta-Analysis and Systematic Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 6308–6325. [Google Scholar] [CrossRef]
Wang, Z.; Fan, C.; Xian, M. Application and Evaluation of a Deep Learning Architecture to Urban Tree Canopy Mapping. Remote Sens. 2021, 13, 1749. [Google Scholar] [CrossRef]
Liu, X.; Fatoyinbo, T.E.; Thomas, N.M.; Guan, W.W.; Zhan, Y.; Mondal, P.; Lagomasino, D.; Simard, M.; Trettin, C.C.; Deo, R.; et al. Large-Scale High-Resolution Coastal Mangrove Forests Mapping Across West Africa With Machine Learning Ensemble and Satellite Big Data. Front. Earth Sci. 2021, 8, 560933. [Google Scholar] [CrossRef]

Figure 1. A view of the Karkonosze forests. Old and dry tree trunks standing next to young and healthy trees form a mix, making it difficult to identify individual species (photo: B. Zagajewski).

Figure 2. Research area: the Czech Krkonoše and Polish Karkonosze National Parks, which together form the Krkonoše/Karkonosze Transboundary UNESCO Biosphere Reserve, established in 1992. Source: Sentinel-2 image (9 August 2018), Copernicus Open Access Hub.

Figure 3. Research scheme. Landsat 8 and Sentinel-2 spectral bands, along with indices representing vegetation vigor and canopy water content, were classified 100 times, each time with a newly randomized set of training and verification patterns. These patterns were obtained from the forest species map filtered by nDSM (in order to retain stands whose height exceeded 2.5 m). The gray box on the left represents the procedure of obtaining high-resolution reference data from APEX airborne hyperspectral images.

Figure 4. Locations of the research polygons. Points marked with different colors represent permanent circular monitoring plots of the Polish Karkonosze National Park. These data were filtered by nDSM to acquire verification patterns.

Figure 5. Overall accuracy of classifications based on Sentinel-2 and Landsat 8 images. RF: Random Forest; SVM: Support Vector Machine; ANN: Artificial Neural Network. Each box presents the median with its 95% confidence interval (the notch next to the median) and the first and third quartiles (Q1, Q3), which bound the interquartile range (IQR); the minimum and maximum values represent Q1 − 1.5 × IQR and Q3 + 1.5 × IQR, respectively.
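
The box-plot quantities named in the Figure 5 caption can be illustrated with a minimal sketch. The accuracy values below are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption; the plotting software used for the figure may compute quartiles slightly differently.

```python
import statistics


def box_stats(values):
    """Return the median, quartiles, IQR and 1.5 x IQR limits of a sample."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower_limit": q1 - 1.5 * iqr,  # Q1 - 1.5 x IQR
        "upper_limit": q3 + 1.5 * iqr,  # Q3 + 1.5 x IQR
    }


# Hypothetical overall accuracies (%) from nine classification runs.
example_runs = [80.0, 81.0, 82.0, 83.0, 84.0, 85.0, 86.0, 87.0, 88.0]
example_stats = box_stats(example_runs)
```

Values falling outside the two limits would normally be drawn as individual outlier points.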

Figure 6. F1-scores of Sentinel-2 images. RF: Random Forest; SVM: Support Vector Machine; ANN: Artificial Neural Network.

Figure 7. F1-score of Landsat 8 and Sentinel-2 images (details are presented in Appendix A; Table A1 and Table A2).

Figure 8. Spatially classified occurrence of dominant woody species. Background source: Sentinel-2 image: 9 August 2018; Copernicus Open Access Hub.

Figure 9. The relationship between the Sentinel-2 imaging acquisition period and the variable importance. Spring (19 April 2018); summer (7 August 2018); autumn (17 November 2018).

Figure 10. Comparison of the obtained results with the classification based on airborne APEX hyperspectral images [59].

Table 1. Characteristics of satellite data used for classification.

Satellite	Sentinel-2 (MSI)	Landsat 8 (OLI)
processing level	Level 2A	Level-2
scene location	granule: 33UWS	path: 191, row 25
resolution	10 m	30 m
dates	19 April 2018	20 April 2018
	7 August 2018	15 June 2018
	17 November 2018	30 October 2018

Table 2. Formulas used to calculate the indices.

	NDVI	NDWI
General equation	(NIR − RED)/(NIR + RED)	(NIR − SWIR)/(NIR + SWIR)
Sentinel-2	(B8 − B4)/(B8 + B4)	(B3 − B8)/(B3 + B8)
Landsat 8	(B5 − B4)/(B5 + B4)	(B5 − B6)/(B5 + B6)
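
As an illustration of the general equations in Table 2, the following minimal sketch computes both indices per pixel. The function names and reflectance values are hypothetical, and the choice of sensor bands supplied as NIR, RED, and SWIR is left to the caller.

```python
def normalized_difference(band_a, band_b):
    """Compute (a - b) / (a + b) per pixel; None where the sum is zero."""
    return [
        (a - b) / (a + b) if (a + b) != 0 else None
        for a, b in zip(band_a, band_b)
    ]


def ndvi(nir, red):
    """General NDVI equation: (NIR - RED) / (NIR + RED)."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """General NDWI equation: (NIR - SWIR) / (NIR + SWIR)."""
    return normalized_difference(nir, swir)


# Hypothetical surface reflectance values for three pixels.
nir = [0.45, 0.30, 0.0]
red = [0.05, 0.10, 0.0]
swir = [0.15, 0.30, 0.0]

ndvi_values = ndvi(nir, red)
ndwi_values = ndwi(nir, swir)
```

Both indices range from −1 to 1; the third pixel shows how a zero denominator (for example, a no-data pixel) is handled in this sketch.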

Table 3. Size of the set used for classification.

Class	Number of Polygons	Number of Pixels S-2	Number of Pixels L8
Birch	108	252	105
Beech	109	243	128
Larch	144	324	124
Spruce	370	867	505
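
The repeated random selection of training and verification patterns described in Figure 3 could be sketched as follows, using the polygon counts from Table 3. The 50% training fraction, the polygon-level (rather than pixel-level) split, and the fixed seed are assumptions made for illustration only.

```python
import random

# Number of reference polygons per class, taken from Table 3.
POLYGONS_PER_CLASS = {"birch": 108, "beech": 109, "larch": 144, "spruce": 370}


def stratified_split(counts, train_fraction, rng):
    """Randomly split polygon indices of each class into two disjoint sets."""
    train, verify = {}, {}
    for species, n_polygons in counts.items():
        ids = list(range(n_polygons))
        rng.shuffle(ids)
        cut = int(n_polygons * train_fraction)
        train[species] = ids[:cut]
        verify[species] = ids[cut:]
    return train, verify


rng = random.Random(42)  # fixed seed so the sketch is reproducible
N_ITERATIONS = 100       # number of repetitions stated in Figure 3
splits = [
    stratified_split(POLYGONS_PER_CLASS, 0.5, rng) for _ in range(N_ITERATIONS)
]
```

Splitting whole polygons keeps pixels from the same polygon out of both sets at once, which is one common way to limit optimistic bias in the verification scores.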

Table 4. Tested SVM hyperparameters.

Parameters	Kernel Function	Min.	Max.	Step	Scale
cost	linear, RBF, polynomial, sigmoid	0.001	10,000	10	logarithmic
gamma	RBF, polynomial, sigmoid	0.001	10,000	10	logarithmic
coef0	polynomial, sigmoid	0 (kept the default value)
degree	polynomial	3 (kept the default value)
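
The search space in Table 4 can be enumerated with a short sketch. It assumes that "step 10" on a logarithmic scale means each tested value is ten times the previous one, giving eight values from 0.001 to 10,000; the names used here are hypothetical.

```python
from itertools import product

# 0.001, 0.01, ..., 10000: eight values, each ten times the previous one.
LOG_GRID = [10.0 ** exponent for exponent in range(-3, 5)]

# Parameters tuned for each kernel according to Table 4
# (coef0 and degree stay at their default values of 0 and 3).
KERNEL_PARAMS = {
    "linear": ["cost"],
    "RBF": ["cost", "gamma"],
    "polynomial": ["cost", "gamma"],
    "sigmoid": ["cost", "gamma"],
}


def candidate_grid(kernel):
    """List every parameter combination to be tested for one kernel."""
    names = KERNEL_PARAMS[kernel]
    return [
        dict(zip(names, combo))
        for combo in product(LOG_GRID, repeat=len(names))
    ]


rbf_candidates = candidate_grid("RBF")
linear_candidates = candidate_grid("linear")
```

Under this reading, the linear kernel has 8 candidate settings and each of the other kernels has 8 × 8 = 64.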

Table 5. Error matrix of the best results acquired based on Sentinel-2, Support Vector Machine classifier, and field verified polygons (Figure 8; OA = 86.5%, Kappa coefficient = 0.77; UA—User Accuracy, PA—Producer Accuracy, F1-score).

	Birch	Beech	Larch	Spruce	UA (%)	F1 (%)
birch	67	10	3	8	76.1	80.2
beech	7	86	1	2	89.6	86.0
larch	0	4	67	20	73.6	70.2
spruce	5	4	29	377	90.8	91.7
PA (%)	84.8	82.7	67.0	92.6
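
The accuracy measures in Table 5 can be recomputed from the error matrix itself. The sketch below assumes that rows hold the classified labels and columns hold the reference labels, which is consistent with the UA column and PA row of the table; function and variable names are illustrative.

```python
CLASSES = ["birch", "beech", "larch", "spruce"]

# Error matrix from Table 5: rows = classified, columns = reference.
MATRIX = [
    [67, 10, 3, 8],
    [7, 86, 1, 2],
    [0, 4, 67, 20],
    [5, 4, 29, 377],
]


def accuracy_metrics(matrix):
    """Compute UA, PA, F1 per class plus overall accuracy and kappa."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]

    ua = [diagonal[i] / row_sums[i] for i in range(n)]  # user accuracy
    pa = [diagonal[i] / col_sums[i] for i in range(n)]  # producer accuracy
    f1 = [2 * ua[i] * pa[i] / (ua[i] + pa[i]) for i in range(n)]

    oa = sum(diagonal) / total  # overall accuracy
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return {"ua": ua, "pa": pa, "f1": f1, "oa": oa, "kappa": kappa}


metrics = accuracy_metrics(MATRIX)
f1_percent = [round(100 * value, 1) for value in metrics["f1"]]
```

With these 690 verification pixels, the sketch reproduces the values reported in the table and its caption (OA = 86.5%, Kappa = 0.77, and F1-scores of 80.2, 86.0, 70.2, and 91.7%).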

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

